
Data Engineer

Location:
San Francisco, CA
Posted:
February 08, 2024


Resume:

ad3hm3@r.postjobfree.com

650-***-****

github.com/pycoder2000

parthdesai.site

medium.com/@desaiparth2000

linkedin.com/in/desaiparth2000

EDUCATION

Master of Science in Data Science (Specialization in Big Data) Aug 2023 – Jul 2025 (Expected)

San Francisco State University, San Francisco, CA GPA: 4.0

Bachelor of Technology in Computer Science Aug 2018 – Jul 2022

Nirma University, Gujarat, India GPA: 3.6

SKILLS

Languages: Python, Java, Scala, SQL

Frameworks: Apache Spark, Apache Airflow, Apache Hadoop, Redshift, Snowflake, Iceberg, Apache Hive

Platform/Tools: AWS, GCP, BigQuery, DBeaver, NoSQL, Docker, Kubernetes, Tableau, Git

Soft Skills: SCRUM, Agile, Requirement gathering, Data Management, Critical Thinking, Leadership

PROFESSIONAL EXPERIENCE

Data Engineer - Accenture Jun 2022 – Jul 2023

• Spearheaded the migration of 39 AWS-backed Tableau Dashboards to GCP, requiring complex SQL query replication, enhancing dashboard performance by 30% and cutting operational costs by 25%.

• Scripted in Python for RDS to Redshift migration, automating SQL tasks and slashing migration time by 98.33%.

• Led a 3-person team to develop a web app that automates Python Airflow DAGs, enhancing workflow automation.

• Engineered and deployed a robust solution on EKS using Helm for executing CRUD operations on DAGs across diverse environments such as on-premise, Amazon MWAA, and Google Cloud Composer, streamlining deployment processes.

Data Engineer Intern - Accenture Jan 2022 – Jun 2022

• Developed AWS Lambda functions and integrated them with SNS Queue, reducing manual tasks and cutting migration time, which improved overall efficiency by 40%.

• Crafted an Encryption & Hashing module using Scala and Spark for our ETL platform which bolstered data security in transit.

• Automated ETL pipelines for clients, aligning with client KPIs, which improved data accuracy and workflow efficiency and led to a 20% increase in client satisfaction.

Data Science Intern - HOPS Healthcare Mar 2021 – Jun 2021

• Developed a pipeline for extracting critical healthcare information from patient-doctor conversations.

• Led the launch of a Django-based web application for secure analysis and storage of patient reports, improving data security and access efficiency by 40%.

• Played a pivotal role in the MongoDB database design & architecture during the initial phase, establishing foundational models.

• Engineered a Bio-BERT and Regex-powered parsing bot to automate key data extraction from reports.

RESEARCH & LEADERSHIP

Data Engineering Research Assistant - College of Business, SFSU Sep 2023 – Present

• Migrated Python scripts to Spark, cutting app processing times by 40%, boosting performance and operational efficiency.

• Automated Algolia search index creation for drug data, achieving sub-100ms search result delivery, enhancing user experience.

• Implemented a chatbot powered by OpenAI and trained on FDA drug data, shaped by insights from user surveys, to provide safe drug suggestions, boosting customer engagement.

Graduate Assistant - School of Nursing, SFSU Sep 2023 – Present

• Streamlined Qualtrics survey data processing, employing advanced cleaning and transformation techniques, enhancing data quality by 30% and accelerating analysis readiness.

• Generated actionable insights from survey data through efficient analytics and reporting, increasing data-driven decisions.

Lead ROS Developer - AUV (Autonomous Underwater Vehicles) Team, Nirma University Jan 2019 – Apr 2019

• Built the framework for communication between various controllers and sensors using C++ and ROS on Nvidia Jetson TX2.

PROJECTS

InstantMD - Python, Pandas, Matplotlib, TensorFlow, Keras - GitHub

• Used NLP in InstantMD for patient story analysis, achieving 95% accuracy in symptom detection, optimizing diagnostic procedures and improving patient care.

• Championed an automation initiative using Bio-BERT and BERN, securing 1st place at the Mined Hackathon.

Socrata API ELT Pipeline - Python, Airflow, Google Cloud Platform, BigQuery, SQL, dbt - GitHub

• Constructed an ELT pipeline for San Francisco Eviction Notice data, loading it from the SF Open Data website into GCS and transforming it in BigQuery with dbt into Fact and Dimension tables.

• Automated monthly pipeline execution with Airflow, ensuring consistent data updates and accessibility for analysis.

CERTIFICATIONS

• AWS Certified Cloud Practitioner, Amazon Web Services Jan 2023

• AWS Fundamentals: Going Cloud-Native, Coursera Sep 2020

• AWS Fundamentals: Migrating to the Cloud, Coursera Sep 2020

ACHIEVEMENTS

• Winner of the Economic Times Campus Star competition (5th edition), securing first place out of 52,000 participants.

• Selected for the Facebook School of Innovation Spark AR program from among more than 10,000 participants and completed 3 months of industrial training in Augmented Reality.

• Awarded first place amongst 600+ students from over 20 universities in the HealthCare Track at Mined Hackathon, a Nationwide Hackathon organized by Nirma University x Binghamton University.

• Attained 1st position in a National Scholarship Quiz on Python, earning 3-month industrial training in Python.

Parth Desai


