Priyanka Pimpalekar
Data Analyst
945-***-**** ********.*@**********.*** Plano, TX
SUMMARY
5+ years of experience in the design, development, deployment, and maintenance of cloud-based data assets and ETL pipelines.
Passionate about utilizing data to inform strategic initiatives and deliver impactful results.
Skilled in leveraging querying engines such as Google BigQuery and AWS Redshift, writing advanced SQL to extract actionable insights from large datasets.
Expert in conducting big-data analytics on Snowflake data warehouses and applying data modeling concepts in DBT.
Passionate about analyzing clickstream data using Adobe Analytics to understand user behavior and optimize digital experiences.
Proficient in developing reports in SSRS and dashboards using a range of Tableau and Looker visualizations.
Experienced in using Google Analytics to monitor and improve website performance, resulting in data-driven decision-making and measurable business growth.
Proficient in using version control systems such as Git.
SKILLS
Methodologies:
SDLC, Agile, Waterfall
Programming Languages:
Python, SQL, PySpark, R
IDEs:
PyCharm, Jupyter Notebook, Databricks Notebook, VS Code
Analytics Tools:
Google Analytics, Adobe Analytics, Amplitude, Heap
Big Data Ecosystem:
Hadoop, MapReduce, Hive, Apache Spark, Pig, Flink
ETL Tools:
AWS Glue, SSIS, DBT
Cloud Services:
AWS Redshift, Lambda, Athena, QuickSight, Textract, Google BigQuery
Orchestration:
Apache Airflow, AWS Step Functions, IBM DataStage, Google Dataflow
Frameworks:
Kafka, Airflow, Snowflake, Docker
Packages:
NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow
Reporting Tools:
Tableau, Power BI, SSRS
Databases:
MongoDB, MySQL, PostgreSQL
Other Tools:
Git, MS Office, Atlassian Jira, Confluence, Jenkins, PeopleSoft, ALM, Postman
Other Skills:
Data Cleaning, Data Wrangling, Critical Thinking, Communication Skills, Presentation Skills, Problem-Solving, Generative AI, Cross-Functional Collaboration
Operating Systems:
Windows, Linux
IaC Tools:
CloudFormation, Terraform
EDUCATION
Master's in Business Analytics | The University of Texas at Dallas | Aug 2021 - May 2023
Bachelor's in Computer Science Engineering | University of Pune, India
Certifications: AWS Cloud Practitioner, Azure Fundamentals
EXPERIENCE
MetLife, USA | Sr. Data Engineer | Aug 2023 - Present
Followed Agile methodology across all phases of the software development life cycle, with regular code reviews
Engineered and deployed efficient ETL data-processing pipelines with AWS services, reducing data ingestion time by 20% and improving data availability by 30%
Streamlined an OLAP solution for historical data from the data lake, implementing Kimball data modeling methodologies in DBT
Managed data batches of up to 100 TB in the Snowflake data lake, applying data warehousing and ETL/modeling principles
Wrote high-level SQL queries to analyze big data, implemented statistical analysis techniques in R, and used Tableau to visualize data with multiple layers of abstraction
Built and maintained scalable data infrastructure on cloud technologies such as AWS, lowering infrastructure costs and increasing data processing capacity by 50%
Designed and implemented Bronze, Silver, and Gold data-quality layers using PySpark and Spark SQL in Databricks, improving overall data quality by 15% and reducing data-related errors by 20%
Implemented Change Data Capture (CDC) mechanism using triggers and event handlers in AWS Lambda and SNS to enable real-time data synchronization and event-driven processing
Conducted comprehensive data analysis using SQL, PySpark, MongoDB, and RESTful APIs, driving data-driven decision-making and delivering actionable insights that improved operational efficiency by 10%
Harnessed advanced Tableau features to connect and blend semi-structured sensor data from multiple sources and visualize critical metrics, enhancing operational efficiency and boosting revenue growth
Vanguard Group | Data & Analytics Intern | May 2022 - Aug 2022
Uncovered and fixed a bug in the notification system by ingesting real-time AWS SNS notifications into an AWS SQS messaging queue
Optimized CloudWatch log output by streamlining the Python code in AWS Lambda, eliminating redundant functions
Delivered a feature enhancement that improved data pipeline health by 5%: analyzed a sample of 700 system fault logs and implemented preventive measures that reduced service interruptions by 18%
Trained AWS Textract to detect keywords in input documents and classify various financial document types, potentially increasing prediction accuracy by 25%
Magna Infotech, India | Sr. Data Analyst | Dec 2019 - Jul 2021
Managed a 1 PB data lake, ensuring data availability and reliability for business intelligence and reporting purposes.
Boosted Cloud product adoption by 20% by effectively communicating product vision and PLG strategy to stakeholders, influencing product pricing and positioning
Increased feature alignment with business goals by 30% through effective management of the product backlog, conducting A/B testing, prioritizing strategic objectives and enabling cross-functional team collaboration
Analyzed and interpreted Adobe Clickstream data to gain insights into user behavior and website performance, leading to actionable recommendations that increased user engagement and improved conversion rates by 20%
Conducted market research using primary and secondary methods, including BI platforms such as Tableau and Adobe Analytics for trend analysis and Heap to surface customer behavior and market dynamics
Enabled reusability, code modularity, and high availability by introducing appropriate cloud services following comprehensive what-if, gap, and MoSCoW analyses, leading to significant process improvements and new features
Implemented A/B testing strategies, optimizing email campaigns and landing pages in Adobe Target, leading to a 15% improvement in CTR within a 3-month timeframe
Leveraged Google BigQuery to perform complex data analysis and querying of large datasets, resulting in the identification of key business insights and a 25% improvement in data processing efficiency
Improved customer satisfaction scores by 10% by harnessing Mixpanel's funneling and user workflow analysis, leading to optimized customer experiences and resource allocation
Identified and implemented key performance metrics that improved business performance tracking by 40%, enabling data-driven decision-making and strategic planning
Orchestrated an automated custom workflow to streamline the ETL process using Apache Airflow, reducing manual labor costs by 30%; designed efficient customized data models in DBT after thorough requirements elicitation
Groovy Web, India | Data Engineer | Jan 2018 - Nov 2019
Worked within an Agile project execution methodology and mentored new team members to maximize team performance
Developed complex queries and designed SSIS packages to extract, transform, and load (ETL) data from heterogeneous sources into data warehouses and data marts
Queried the Snowflake data warehouse and utilized SPSS for statistical analysis of panel data
Streamlined an automated QA solution in Jenkins and UiPath as part of a process improvement initiative, saving 30 person-hours
Conducted comprehensive exploratory data analysis on a financial dataset using Snowflake to uncover hidden patterns and insights; visualized data pipeline health on a self-built interactive dashboard in Visual Basic
Maintained multi-cloud ETL pipelines, enabling reusability through IaC tools such as CloudFormation templates (CFT) and Airflow DAG templates
Cut data processing and query times by 15% by conducting database tuning and optimization activities in MS SQL Server
Designed SSIS packages with Python transformation scripts, implementing schema modeling best practices in SSAS and DBT
Developed 5+ dashboards in Tableau with compelling stratified visualizations to drive business decisions, visualize KPIs, and validate stakeholders' requirements
Created and maintained detailed project documentation, including Project Charters, Scope Documents, and Stakeholder Analysis Reports; also assisted in writing RFPs, BRDs, and FRDs with 100% client satisfaction
Optimized pipeline architecture by rewriting ETL job scripts, ensuring deduplication and normalization