Gaurav Thorat Data Analyst
Chicago IL Email: ***************@*****.*** Phone: +1-401-***-**** LinkedIn:
PROFESSIONAL SUMMARY:
* ***** ** ********** ** a Data Analyst designing and implementing end-to-end data analytics solutions in modern, cloud-native environments. Proven expertise in architecting and developing scalable ETL/ELT pipelines using Python, SQL, Airflow, dbt, and cloud services (AWS S3, Redshift, Lambda; Azure Data Factory, Synapse; GCP BigQuery, GCS). Adept at transforming complex data into actionable insights through interactive BI dashboards (Power BI, Tableau), advanced analytics, and predictive models (ARIMA, LSTM, Random Forest, NLP) that drive business decisions across Finance, Supply Chain, and Marketing domains. Skilled in cloud data engineering, real-time data processing, and ML-powered forecasting pipelines, optimizing analytics workflows, reducing latency, and supporting enterprise-wide decision-making. Passionate about modernizing legacy systems, enabling data-driven cultures, and delivering measurable impact through innovative analytics solutions.
TECHNICAL SKILLS:
Primary Tools: Informatica Power Center, Ab Initio, Teradata SQL, Teradata Tools and Utilities, Oracle 10g/9i, MS SQL Server
Programming Languages: Python, R, Java, Bash, SQL, Teradata SQL, PL/SQL, C/C++
Data Visualization: MATLAB, Power BI, Tableau
Databases: Teradata, Oracle, DB2/UDB, MS SQL Server, MS Access, MySQL, PostgreSQL, MongoDB, Hive, Presto, AWS RDS, AWS Redshift, Redis, BigQuery
Cloud / Web Services: Amazon Web Services, Google Cloud, Microsoft Azure Cloud
Operating Systems: Windows, UNIX, Linux, NCR MP-RAS UNIX
Data Modeling: Erwin, ER Studio
DevOps & Automation: GitHub Actions, Terraform, RESTful APIs, CI/CD Pipelines – automating reporting and RCA workflows
Tools: Tableau, Hadoop, Hive, Apache Airflow, Apache Spark, Flask, Apache Kafka, Jupyter Notebook, Excel, Jira, Git, Docker, Kubernetes
Data Warehousing: Informatica (Repository Manager, Designer, Workflow Manager, and Workflow Monitor), SSIS, DataStage 8.x
Scheduling tools: Control M, Autosys
Tableau: Tableau Desktop, Tableau Server, Tableau Online, Tableau Public, Tableau Reader.
Cross-Functional Collaboration & Platforms: UX, Legal, Ops, Data Science; Salesforce, Mixpanel, Amplitude, Box, SharePoint, SAP S/4HANA
Servers: Windows, Microsoft SQL Server.
EDUCATION QUALIFICATION:
University of Texas at Arlington, MS in Data Science — CGPA: 3.5 Dec 2023
University of Mumbai, BE in Computer Engineering — CGPA: 3.6 July 2019
PROFESSIONAL EXPERIENCE:
Client: Bank of America, Chicago IL (July 2024 – Present)
Role: Data Analyst
1. Enterprise Financial KPI Automation & Insights / Automated Financial Health Scoring Engine
2. Credit Risk & Marketing Campaign Synergy Platform / Intelligence Dashboard
3. Predictive Forecasting Model for Business Planning
4. Customer Segmentation and Revenue Growth Optimization Engine
Responsibilities:
•Provided business strategy recommendations through data analytics and process automation.
•Delivered business insights by automating reports and building interactive dashboards in Power BI, Tableau, and Excel, boosting operational efficiency by 30%.
•Extracted, transformed, and cleansed data from Oracle, flat files, mainframes, and SAP transactional/master data using PL/SQL, SQL, SSIS, Azure Data Factory, and Python, ensuring high data quality and accuracy.
•Developed complex stored procedures, optimized SQL queries, and translated SAS scripts to Snowflake and Teradata, supporting efficient data retrieval and analytics across multiple platforms.
•Built and published Power BI dashboards featuring calculated columns, DAX measures, Power Query, dynamic filters, role-based security, and automated refresh scheduling via Power BI Service.
•Created ARIMA-based Python forecasting models integrated into Power BI for predictive analytics on warehouse capacity and inventory management (an illustrative sketch follows this list).
•Automated ETL pipelines with Python, Airflow, AWS Glue, and containerized workflows deployed on AWS ECS, reducing report generation time by 40% and enhancing pipeline reliability.
•Designed and maintained data models using Erwin, reverse-engineered from databases and ODS systems, and published models in Model Mart for enterprise reuse.
•Migrated dashboards and applications to Azure, managing data validation, UAT testing, deployment, and ongoing production support.
•Extracted and transformed S/4HANA data into analytics-ready datasets, ensured reconciliation across OLAP/OLTP systems, and enabled real-time reporting on critical financial modules.
•Enabled real-time data access and reporting by integrating SAP S/4HANA’s in-memory capabilities with modern BI tools, significantly improving operational decision-making and financial KPI visibility.
•Developed KPIs, TDE extracts, and ad-hoc Tableau visualizations supporting legal, billing, and strategic planning functions.
•Centralized structured and unstructured data into AWS S3 and Snowflake, enabling scalable analytics and quality checks with SQL and validation scripts.
•Built supply chain risk analytics by leveraging Amazon Redshift and integrating diverse data sources, enhancing query performance and decision support.
•Implemented Python-based anomaly detection to flag inventory discrepancies, reducing manual checks by 30%, and created custom scripts to improve data quality from heterogeneous sources.
•Coordinated with DevOps and offshore teams to containerize, version, and deploy ETL workflows; supported system modifications, UAT, and business validation across multiple environments.
•Collaborated with 50+ stakeholders and 15+ cross-functional teams, translating business needs into dashboards in Tableau and Power BI, improving strategic decision-making accuracy by 25% and achieving 95% on-time project delivery.
•Migrated legacy SQL systems to AWS RDS, preserving 100% data accuracy and eliminating $2M in annual maintenance overhead from legacy infrastructure.
•Consolidated 20+ raw data formats into a single schema using Python and Athena; automated retrieval from 100+ sources, cutting 200+ monthly analyst hours and reducing infrastructure strain by 25%.
•Built Power BI dashboards to monitor post-migration KPIs, improving stakeholder visibility into remote access, performance, and security — helping justify $10M+ in projected annual savings.
•Standardized data pipelines in partnership with ops and analytics teams; ensured governance compliance and created re-usable documentation to support future scaling efforts.
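Illustrative sketch (not the production implementation) of the ARIMA-based forecasting approach noted above, using Python and statsmodels; the file name, column names, model order, and six-month horizon are assumptions for illustration, with the output written to CSV so a Power BI dataset can pick it up on refresh.

# Minimal ARIMA forecasting sketch; inputs and model order are assumed, not the production model.
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical monthly warehouse-capacity history with columns 'month' and 'capacity_used'.
history = pd.read_csv("warehouse_capacity.csv", parse_dates=["month"], index_col="month")
series = history["capacity_used"].asfreq("MS")

# Fit a simple ARIMA(1, 1, 1); in practice the order would be chosen via AIC or a grid search.
fitted = ARIMA(series, order=(1, 1, 1)).fit()

# Forecast six months ahead and export for consumption by a Power BI refresh.
forecast = fitted.get_forecast(steps=6).summary_frame()
forecast.to_csv("capacity_forecast.csv")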
Client: Denodo, Chicago IL (Jan 2024 - July 2024)
Role: Data Analyst
Responsibilities:
•Partnered with Agile cross-functional teams, including delivery managers, PMs, BAs, Scrum Masters, DBAs, architects, and engineers, to elicit requirements, write Jira user stories, document workflows, and deliver reports, dashboards, and data solutions aligned with strategic business goals.
•Conducted advanced data analysis using SQL, Python, and SAS to drive KPIs, financial reporting, and operational insights across complex datasets supporting $50M+ business units.
•Architected and deployed dynamic, user-centric dashboards and reports with Power BI, Tableau, and SAS Enterprise Guide, leveraging DAX, LOD expressions, advanced calculations, role-based security, and scheduled refreshes for marketing, sales, procurement, and executive leadership.
•Integrated and transformed heterogeneous data from Oracle, SQL Server, Excel, flat files, SAP S/4HANA ERP, and APIs into unified analytics pipelines using SSIS, AWS Glue, Azure Data Factory, Apache Airflow, and Snowflake.
•Designed, developed, and optimized scalable ETL pipelines, data models, and schema architectures (Star, Snowflake) with ERWIN and SQL DDL scripting, leveraging cloud storage (AWS S3) to reduce data latency and enhance performance.
•Automated complex financial data workflows and anomaly detection models using Python, ARIMA, LSTM, and AI techniques, reducing manual reconciliation by 35% and accelerating budgeting accuracy.
•Collaborated closely with data science teams to build NLP-based classification models, streamlining document tagging processes and enhancing workflow efficiency.
•Maintained robust version control and documentation for ETL scripts, dashboards, and Airflow DAGs using GitLab, driving continuous integration and deployment in partnership with DevOps teams (a minimal DAG sketch follows this list).
•Executed comprehensive data profiling, cleansing, validation, and migration across OLTP, ODS, and cloud databases (BigQuery, Cloud SQL), ensuring high data quality and consistency for enterprise reporting.
•Tuned SQL queries and optimized schema designs within cloud platforms (Snowflake, BigQuery, Cloud SQL) to maximize query performance and support scalable analytics solutions.
•Integrated SAP ERP and external API data securely into analytics ecosystems, employing authentication tokens and cloud-native services to enable enterprise-wide KPI dashboards on Azure and AWS.
•Delivered source-to-target mappings, automated financial reporting, and predictive analytics solutions to modernize legacy systems and align analytics architecture with evolving business strategies.
•Developed and deployed time-series forecasting and machine learning models via Jupyter Notebooks and Airflow, supporting proactive financial planning and real-time KPI monitoring.
•Provided mentorship to junior analysts, enforced rigorous data governance and CMS compliance standards, and championed a data-driven culture across global teams to enhance decision-making agility.
•Developed an LLM-based text detection pipeline (Python, SQL, Snowflake, DBT) that helped prevent misclassification in CTI reports and reduced analyst review burden by 40%.
•Cleaned and automated ingestion of 1M+ CTI reports, cutting manual data prep by 40+ hours weekly and ensuring faster threat intelligence cycles.
•Streamlined detection workflows and prioritized expert-written content, extending patch cycles from 6 to 12 months and saving $2M+ annually in cybersecurity operations.
•Built strategic dashboards (Tableau) to benchmark model trustworthiness and drive adoption decisions across teams, directly accelerating AI integration in threat triage systems.
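A minimal Airflow DAG sketch illustrating how a version-controlled ETL workflow of the kind described above can be structured; the DAG id, schedule, and task callables are hypothetical placeholders rather than the actual GitLab-managed DAGs.

# Minimal Airflow 2.x DAG sketch; names, schedule, and task bodies are illustrative assumptions.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    pass  # pull raw data from source systems (e.g., an API or S3 landing bucket)

def transform():
    pass  # cleanse and reshape the data into analytics-ready tables

def load():
    pass  # write curated tables to the warehouse (e.g., Snowflake)

with DAG(
    dag_id="financial_kpi_etl",      # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # 'schedule' requires Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> transform_task >> load_task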
Client: Phillips 66, Houston TX (July 2023 - Dec 2023)
Role: Data Analyst
Responsibilities:
•Empowered executives with actionable insights by delivering Power BI dashboards and scorecards for KPIs, MoM, YoY, and QoQ trends, and financial metrics, using DAX, R scripts, and SQL.
•Leveraged Python (pandas, NumPy, matplotlib) for exploratory data analysis (EDA) and preprocessing of financial data, integrating R/Python scripts within Power BI for advanced trend analysis.
•Built Python modules for calculating financial ratios, moving averages, and volatility metrics across equity and bond portfolios, enhancing investment analytics (see the sketch after this list).
•Partnered with data engineering teams to implement Python-based streaming data ingestion workflows using Azure Stream Analytics for real-time reporting.
•Designed complex SQL queries, stored procedures, CTEs, views, and PL/SQL triggers to support advanced data modeling and BI needs.
•Developed robust BI solutions by gathering business requirements from stakeholders, SMEs, and BAs, and aligning deliverables with SDLC phases.
•Integrated and transformed data from OLTP systems using SSIS, Azure Data Factory, and Azure Stream Analytics, enabling real-time and batch analytics.
•Published and maintained Power BI reports with row-level security, automated refresh schedules, and managed role-based access for secure stakeholder insights.
•Built Power BI data models with dynamic features such as calculated tables, computed columns, pivot/unpivot transformations, and Azure Blob Storage integration.
•Authored technical design documents, unit test cases, user guides, and collaborated in Agile stand-ups for continuous sprint delivery.
•Conducted ETL optimization, real-time data streaming, and performance monitoring to ensure timely, accurate reporting with enhanced data reliability.
•Resolved data integrity and duplication issues through profiling, cleansing, and reconciliation; coordinated across departments for system-wide alignment.
•Developed Tableau dashboards, managed users and permissions on Tableau Server, and centralized access for enterprise-wide reporting.
•Created SharePoint-integrated workflows with internal banking systems to automate approval processes, reducing manual intervention and improving operational efficiency.
•Contributed to a scalable SaaS solution by deploying and optimizing 15+ services on GCP App Engine and Kubernetes Engine (GKE); partnered with project managers for business cases, solution designs, and Agile delivery.
•Used SQL to analyze 5M+ transaction logs and identify recurring fraud patterns, enabling policy updates that improved case resolution time by 18% and supported $700K in loss prevention per year.
•Developed interactive Tableau dashboards to track fraud types by channel, region, and risk level; increased case triage efficiency and enabled faster decision-making by risk leads.
•Consolidated siloed financial and user data from 6+ sources into a normalized SQL schema, reducing inconsistencies by 80% and ensuring a single resource for fraud analytics.
•Segmented users based on risk behavior using historical data and business rules; recommendations from the analysis improved detection accuracy and reduced false positives by 21%.
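A brief sketch of the kind of Python module described above for portfolio metrics; the window size, risk-free rate, and trading-day count are illustrative defaults, not the actual production parameters.

# Minimal portfolio-analytics sketch; parameters and defaults are assumptions for illustration.
import numpy as np
import pandas as pd

def moving_average(prices: pd.Series, window: int = 20) -> pd.Series:
    """Rolling mean of daily closing prices."""
    return prices.rolling(window).mean()

def annualized_volatility(prices: pd.Series, trading_days: int = 252) -> float:
    """Standard deviation of daily log returns, scaled to an annual figure."""
    log_returns = np.log(prices / prices.shift(1)).dropna()
    return float(log_returns.std() * np.sqrt(trading_days))

def sharpe_ratio(prices: pd.Series, risk_free_rate: float = 0.02, trading_days: int = 252) -> float:
    """Annualized excess return over annualized volatility (simple illustrative form)."""
    daily_returns = prices.pct_change().dropna()
    annual_return = (1 + daily_returns.mean()) ** trading_days - 1
    return (annual_return - risk_free_rate) / annualized_volatility(prices, trading_days)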
Client: Deloitte, India (Aug 2020 - Dec 2021)
Role: Data Analyst
Responsibilities:
•Managed client data using Salesforce, created personalized financial plans, and conducted mutual fund performance tracking and financial research using Excel and Bloomberg tools.
•Defined business transformation rules, data mappings, and source-to-target interfaces for sales and service data across ODS, OLTP, and OLAP systems.
•Developed and maintained Power BI dashboards using normalized/de-normalized data, DAX, calculated columns, and configured row-level security to protect sensitive information.
•Designed and optimized SharePoint sites to facilitate data-driven collaboration, enabling seamless document sharing, version control, and communication across business and IT teams.
•Gathered business requirements and authored BRDs, leading to a 40% faster Power BI development cycle; performed gap analysis and supported legacy system migrations.
•Created and optimized SQL and PL/SQL scripts for data extraction, transformation, and database indexing; conducted ETL using Base SAS and custom scripts.
•Performed data profiling, cleansing, and validation using Talend, Informatica, and SQL, ensuring data accuracy, quality, and integrity (an illustrative validation sketch follows this list).
•Participated in metadata management, data governance programs, and data quality initiatives including compliance with internal controls and audit requirements.
•Built source-to-target mapping documents, transformation rules, and data dictionaries; used Erwin for logical/physical modeling and reverse engineering.
•Migrated data from Oracle to Teradata, designed reconciliation reports, reviewed migration scripts, and supported batch job scheduling.
•Delivered training, documentation, and compliance reviews; supported test case execution, QA validation, and smooth transition to production environments.
•Authored and maintained product requirement documents (PRDs) and safety workflows, ensuring clarity in execution and alignment across international teams.
•Captured and communicated customer feedback to product and engineering teams, aiding in continuous improvement.
•Delivered high-visibility product presentations to leadership, clearly communicating performance metrics and roadmaps.
•Deployed PostgreSQL and MySQL databases to store and process student data, and used object-oriented languages (Java, C#) to optimize the website and automate student reporting and communication with high code quality.
•Resolved complex technical challenges with innovative solutions, resulting in a 25% improvement in system reliability.
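A simple pandas-based analogue of the profiling and validation checks referenced above; the production work used Talend, Informatica, and SQL, so the rule names, columns, and thresholds below are hypothetical.

# Minimal data-validation sketch (pandas analogue of rules implemented in Talend/Informatica/SQL).
import pandas as pd

def validate(df: pd.DataFrame) -> dict:
    """Return a small data-quality profile: row counts, nulls, duplicates, and rule violations."""
    report = {
        "row_count": len(df),
        "null_counts": df.isna().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }
    # Hypothetical business rule: account balances must be non-negative.
    if "balance" in df.columns:
        report["negative_balances"] = int((df["balance"] < 0).sum())
    return report

# Example usage on a tiny illustrative frame.
sample = pd.DataFrame({"client_id": [1, 2, 2], "balance": [1500.0, -20.0, None]})
print(validate(sample))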
Client: eClerx, India (Jul 2019 - Jul 2020)
Role: Jr. Data Analyst
Responsibilities:
•Analyzed generated logs and forecasted future outcomes using Python libraries; created and modified worksheets and data visualization dashboards in Tableau.
•Developed data mapping, transformation, and cleansing rules for the Master Data Management architecture spanning OLTP, ODS, and OLAP systems.
•Performed data analysis, primarily identifying data sets, source data, source metadata, data definitions, and data formats.
•Worked with the ETL team to document the Transformation Rules for Data Migration from OLTP to Warehouse Environment for reporting purposes.
•Performed data testing, tested ETL mappings (Transformation logic), tested stored procedures, and tested the XML messages.
•Built interactive Power BI dashboards with parameters, calculated fields, and table calculations.
•Created use cases, activity reports, and logical components to capture business process flows and workflows using Rational Rose, UML, and Microsoft Visio.
•Involved in development and implementation of SSIS, SSRS and SSAS application solutions for various business units across the organization.
•Wrote test cases, developed Test scripts using SQL and PL/SQL for UAT.
•Created and modified T-SQL queries per business requirements; built role-playing dimensions, factless fact tables, and snowflake and star schemas.
•Involved in data analysis and creating data mapping documents to capture source to target transformation rules.
•Extensively used SQL, T-SQL and PL/SQL to write stored procedures, functions, packages and triggers.
•Prepared weekly, biweekly, and monthly data analysis reports using MS Excel and SQL.
CERTIFICATIONS:
Data Science: Probability in R (Harvard)
Data Science (Orientation and Tools) (IBM)
Data Analysis Using Python (IBM)
SQL - Advanced for Data Professionals
Advanced SQL: MySQL Data Analysis & Business Intelligence
Google Analytics Individual Qualification (GAIQ)
AWS Certified Developer – Associate & Cloud Practitioner
Azure AZ-900 Cloud Fundamentals
GCP Certified Data Engineer & Certified Data Analyst
Oracle Cloud Certified – Associate
Power BI Data Analytics for All Levels 2.0
KEY ACHIEVEMENTS
•Built and deployed modern data pipelines using Python, SQL, Airflow, and Snowflake across AWS, Azure, and GCP, improving data processing efficiency by 50%.
•Migrated legacy systems to cloud-based architectures (Snowflake, BigQuery, Redshift), reducing infrastructure costs by 30% and enabling real-time analytics.
•Delivered 40+ BI dashboards (Power BI, Tableau) with advanced DAX, predictive modeling, and cloud data sources, supporting $200M+ business decisions.
•Automated anomaly detection pipelines using Python and deployed on cloud-native stacks (AWS ECS, Lambda), reducing data quality issues by 40%.
ACADEMIC PROJECTS:
Retail Sales & Forecasting Automation in Excel
• Built an Excel-based sales dashboard with PivotTables, Power Query, and macros to track SKU-level trends
across 80+ stores, saving up to 12 hours/week in manual reporting if adopted.
• Deployed seasonal forecasting using Excel formulas and custom VBA scripts, which could reduce stockouts by
18% and generate $250K in additional sales over 3 months based on simulated restocking strategies.
Customer Churn Segmentation Using SQL
• Queried and segmented 1M+ telecom users into churn-risk bands using SQL (CTEs, window functions),
identifying high-risk segments that could reduce churn-induced losses by $180K/month if integrated into outreach (see the illustrative sketch below).
• Built dynamic SQL views for real-time churn monitoring, projected to improve campaign targeting efficiency by
32% based on churn simulator results.
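A minimal, self-contained sketch of the CTE-plus-window-function banding logic described in the project above, run here against an in-memory SQLite database purely for illustration; the schema, sample rows, and risk thresholds are assumptions, not the original telecom data.

# Churn-risk banding sketch: CTE + LAG() window function over an assumed usage table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usage (user_id INTEGER, month TEXT, minutes_used REAL);
INSERT INTO usage VALUES
  (1, '2023-01', 300), (1, '2023-02', 120), (1, '2023-03', 40),
  (2, '2023-01', 200), (2, '2023-02', 210), (2, '2023-03', 220);
""")

query = """
WITH usage_trend AS (                       -- CTE: month-over-month change per user
    SELECT user_id,
           minutes_used - LAG(minutes_used) OVER (
               PARTITION BY user_id ORDER BY month) AS delta
    FROM usage
)
SELECT user_id,
       AVG(delta) AS avg_monthly_change,
       CASE WHEN AVG(delta) < -50 THEN 'high_risk'
            WHEN AVG(delta) < 0   THEN 'medium_risk'
            ELSE 'low_risk' END   AS churn_band
FROM usage_trend
WHERE delta IS NOT NULL
GROUP BY user_id;
"""
for row in conn.execute(query):
    print(row)  # e.g., (1, -130.0, 'high_risk')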