Priyanshu Gupta
College Park, MD *****
240-***-**** | *******@***.*** | priyanshu-gupta | github.com/priyanshu124
Education
University of Maryland Sep 2024 – May 2026
Master's in Information Management, Track - Data Science and Analytics College Park, MD
Delhi Technological University Aug 2016 – Jul 2020
Bachelor of Technology Delhi, India
Experience
Intuit Sep 2023 – Sep 2024
Business Systems Analyst 2 Remote, US
• Partnered with Program Managers to gather requirements, define KPIs, and integrate data from various sources, including Coupa, Oracle ERP, Concur, Salesforce, and third-party APIs.
• Collaborated with the data engineering team to build data marts using Dimensional Modelling (Star Schema) on the Databricks platform with Spark SQL, meeting the required 5-star data quality metrics.
• Designed and automated ad-hoc business intelligence reporting workflows using Power BI and Qlik Sense, delivering the required SLAs with 98% efficiency.
• Identified a $2.4M discrepancy in diverse spend classification by resolving an inconsistent Slowly Changing Dimension (SCD) implementation, improving data mapping accuracy by 25%.
• Integrated procurement and supply chain data into AWS QuickSight’s GenAI model, enhancing Text-to-SQL response accuracy with a comprehensive data dictionary and continuous feedback loops.
Polestar Aug 2020 – Sep 2023
Senior Analyst Noida, India
• Developed and deployed incremental ETL pipelines using Spark (AWS EMR and Glue) for Parquet, CSV, and JSON formats, reducing data freshness latency from daily to hourly.
• Automated data pipeline orchestration for multiple data sources using Airflow and shell scripting running on AWS ECS, improving monitoring and backfilling capabilities.
• Optimized SQL queries in AWS Redshift by analyzing query execution plans, implementing efficient table design, and utilizing caching, reducing query execution time by 65%.
• Led a team of 2 analysts in developing an end-to-end data pipeline for the Salesperson Performance Dashboard on Qlik Sense and Salesforce, boosting salesperson compliance with quarterly targets by 22%.
• Developed near-real-time customer segmentation workflows for a beauty brand using RFM analysis and K-means clustering to identify high-value customer cohorts.
• Implemented a fine-grained RBAC framework by defining column-level, row-level, and tag-based security rules using AWS S3 and Lake Formation for secure data governance.
Technical Skills
Languages: SQL, Python, Java
Databases/Data Warehouses: SQL Server, Snowflake, Redshift, Athena, DynamoDB, Hive
Big Data/Cloud: Databricks, PySpark, EMR, Glue, Kafka, Airflow, Terraform, GitLab CI/CD, Docker
BI Tools: Qlik Sense, Qlik NPrinting, Power BI, QuickSight, MS Excel
Certifications
• AWS Data Engineer – Associate
• Databricks Certified Developer for Apache Spark
• Astronomer DAG Authoring for Apache Airflow
Academic Projects
Incremental Data Pipeline for FDA Adverse Events Analysis
• Built an incremental ETL pipeline using Cloud Composer (Airflow), Dataproc (PySpark), and BigQuery for the openFDA API, providing insights into adverse events related to drug usage.
• Deployed cloud resources using Terraform (IaC) and GitLab CI/CD.
Multiclassification of Disaster Tweets using NLP and Machine Learning
• Developed a supervised NLP pipeline to classify disaster-related tweets into 8 actionable categories, enhancing data usability for emergency response.
• Explored semantic word embeddings like Word2Vec, GloVe, and BERT to achieve 18% higher accuracy than TF-IDF representations. Achieved an overall accuracy of 82% by fine-tuning the RoBERTa transformer model.