ad3hm3@r.postjobfree.com
github.com/pycoder2000
parthdesai.site
medium.com/@desaiparth2000
linkedin.com/in/desaiparth2000
EDUCATION
Master of Science in Data Science (Specialization in Big Data) Aug 2023 – Jul 2025 (Expected)
San Francisco State University, San Francisco, CA GPA: 4.0
Bachelor of Technology in Computer Science Aug 2018 – Jul 2022
Nirma University, Gujarat, India GPA: 3.6
SKILLS
Languages: Python, Java, Scala, SQL
Frameworks: Apache Spark, Apache Airflow, Apache Hadoop, Redshift, Snowflake, Iceberg, Apache Hive
Platform/Tools: AWS, GCP, BigQuery, DBeaver, NoSQL, Docker, Kubernetes, Tableau, Git
Soft Skills: SCRUM, Agile, Requirement gathering, Data Management, Critical Thinking, Leadership
PROFESSIONAL EXPERIENCE
Data Engineer - Accenture Jun 2022 – Jul 2023
• Spearheaded the migration of 39 AWS-backed Tableau Dashboards to GCP, requiring complex SQL query replication, enhancing dashboard performance by 30% and cutting operational costs by 25%.
• Scripted in Python for RDS to Redshift migration, automating SQL tasks and slashing migration time by 98.33%.
• Led a 3-person team to develop a web app that automates Python Airflow DAGs, enhancing workflow automation.
• Engineered and deployed a robust solution on EKS using Helm for executing CRUD operations on DAGs across diverse environments such as on-premise, Amazon MWAA, and Google Cloud Composer, streamlining deployment processes.
Data Engineer Intern - Accenture Jan 2022 – Jun 2022
• Developed AWS Lambda functions and integrated them with an SNS Queue, reducing manual tasks and cutting migration time, enhancing overall efficiency by 40%.
• Crafted an Encryption & Hashing module using Scala and Spark for our ETL platform, which bolstered the security of data in transit.
• Automated ETL pipelines for clients in line with their KPIs, which enhanced data accuracy and workflow efficiency, leading to a 20% increase in client satisfaction.
Data Science Intern - HOPS Healthcare Mar 2021 – Jun 2021
• Developed a pipeline for extracting critical healthcare information from patient-doctor conversations.
• Led the launch of a Django-based web application for secure analysis and storage of patient reports, improving data security and access efficiency by 40%.
• Played a pivotal role in the MongoDB database design & architecture during the initial phase, establishing foundational models.
• Engineered a Bio-BERT and Regex-powered parsing bot to automate key data extraction from reports.
RESEARCH & LEADERSHIP
Data Engineering Research Assistant - College of Business, SFSU Sep 2023 – Present
• Migrated Python scripts to Spark, cutting app processing times by 40%, boosting performance and operational efficiency.
• Automated Algolia search index creation for drug data, achieving sub-100ms search result delivery, enhancing user experience.
• Implemented a chatbot powered by OpenAI and trained on FDA drug data, shaped by insights from user surveys, to provide safe drug suggestions, boosting customer engagement.
Graduate Assistant - School of Nursing, SFSU Sep 2023 – Present
• Streamlined Qualtrics survey data processing, employing advanced cleaning and transformation techniques, enhancing data quality by 30% and accelerating analysis readiness.
• Generated actionable insights from survey data through efficient analytics and reporting, increasing data-driven decisions.
Lead ROS Developer - AUV (Autonomous Underwater Vehicles) Team, Nirma University Jan 2019 – Apr 2019
• Built the framework for communication between various controllers and sensors using C++ and ROS on Nvidia Jetson TX2.
PROJECTS
InstantMD - Python, Pandas, Matplotlib, TensorFlow, Keras - GitHub
• Utilized NLP in InstantMD for patient story analysis, achieving 95% accuracy in symptom detection, thus optimizing diagnostic procedures and improving patient care.
• Championed an automation initiative using Bio-BERT and BERN, securing 1st place at the Mined Hackathon.
Socrata API ELT Pipeline - Python, Airflow, Google Cloud Platform, BigQuery, SQL, dbt - GitHub
• Constructed an ELT pipeline that loads San Francisco Eviction Notice data from the SF Open Data website into GCS and transforms it in BigQuery using dbt into Fact and Dimension tables.
• Automated monthly pipeline execution with Airflow, ensuring consistent data updates and accessibility for analysis.
CERTIFICATIONS
• AWS Certified Cloud Practitioner, Amazon Web Services Jan 2023
• AWS Fundamentals: Going Cloud-Native, Coursera Sep 2020
• AWS Fundamentals: Migrating to the Cloud, Coursera Sep 2020
ACHIEVEMENTS
• Winner of the Economic Times Campus Star competition (5th edition), securing first place out of 52,000 participants.
• Selected for the Facebook School of Innovation Spark AR program from among more than 10,000 participants and completed 3 months of industrial training in Augmented Reality.
• Awarded first place amongst 600+ students from over 20 universities in the HealthCare Track at Mined Hackathon, a Nationwide Hackathon organized by Nirma University x Binghamton University.
• Attained 1st position in a National Scholarship Quiz on Python, earning a 3-month industrial training in Python.
Parth Desai