Onkar Khanolkar
TX ****************@*****.*** 510-***-**** LinkedIn GitHub
Summary
• Data Analyst with over 2 years of experience in data pipeline development, ETL processes, predictive modeling, and data visualization across finance, technology, and consulting sectors
• Proficient in Python, SQL, R, and Java for building data solutions, with expertise in AWS (Glue, Lambda, Redshift) and Azure Data Lake for scalable data management
• Skilled in creating interactive dashboards in Tableau and Power BI to deliver real-time insights, enabling data-driven decision-making for business stakeholders
• Experienced in data warehousing with Snowflake and Amazon Redshift and ETL automation using Informatica, Talend, dbt, and Alteryx, improving data accuracy and efficiency
• Statistical analysis and predictive modeling experience with Scikit-Learn and R to analyze customer behavior and forecast inventory needs, driving strategic initiatives
• Strong background in data governance and compliance, with experience in data validation checks and implementing data security protocols via AWS IAM and Azure AD
• Collaborative and skilled in Agile methodologies, with hands-on experience in project management tools like JIRA and Confluence to facilitate cross-functional teamwork and efficient project delivery
Technical Skills
Programming Languages: Python, R, SQL, NoSQL, Java, MATLAB, SAS
Data Technologies & Databases: Oracle, MS SQL Server, MySQL, PostgreSQL, MongoDB, Cassandra
Data Analytics & Visualization Tools: MS Excel (Pivot Tables, VLOOKUP, VBA), Tableau, Power BI, RStudio
Big Data & Distributed Systems: Hadoop, Spark, Kafka, AWS (Redshift, S3, Glue, Athena), Google BigQuery
Machine Learning Libraries: NumPy, Pandas, Scikit-Learn, Matplotlib, Seaborn, PyTorch
Data Processing & ETL Tools: Informatica, Alteryx, Talend, Apache NiFi, dbt, Apache Airflow
Statistical Analysis & Modeling: Regression Analysis, Hypothesis Testing, A/B Testing, Predictive Modeling
Data Engineering & Warehousing: Snowflake, Amazon Redshift, Azure SQL Data Warehouse
Project Management: JIRA, Confluence, Asana, MS Project, Agile, Scrum
ERP & Business Systems: SAP ERP, Oracle ERP
Professional Experience
Data Analyst, JPMorgan Chase & Co. 09/2024 – Current TX, United States
• Engineered scalable data pipelines with AWS Glue and Apache Airflow, processing over 10 million records daily, enhancing data accessibility and operational efficiency across departments
• Developed data models on Amazon Redshift and Azure Data Lake, reducing query times by 40% and facilitating faster access to analytics for business teams
• Implemented ETL workflows with Informatica and Talend, ensuring 99% data accuracy and synchronized data from five different sources for unified reporting
• Enhanced data governance practices with AWS IAM and Azure AD, ensuring 100% compliance with data security standards for sensitive information
• Automated data validation processes with Python and AWS Lambda, reducing data inconsistencies by 30% and improving report reliability
• Built interactive Power BI dashboards, monitoring KPIs in real time and supporting business stakeholders in data-driven decision-making
Recovery Analyst, Samsung SDS America 03/2024 – 08/2024 TX, United States
• Analyzed over 300 claims monthly, leveraging data validation techniques to ensure data accuracy and compliance with company policies, reducing claim errors and improving claims integrity
• Collaborated with cross-functional teams to execute data analytics projects using advanced Excel functions (Pivot Tables, VLOOKUP), streamlining claim processing workflows
• Recovered $3 million in lost funds by conducting data-driven root cause analyses, identifying patterns to accelerate claim resolution for improved financial outcomes
• Created real-time visualizations of claims data in Excel, offering performance metrics and actionable insights that enhanced overall claim-handling processes
• Executed trend analysis to forecast claim volumes, supporting resource allocation and reducing future risk exposure
• Generated comprehensive reports on claim activities, spearheading improvement in data-driven decision-making
Data Analyst, Epsilon 09/2023 – 02/2024 TX, United States
• Developed SQL-based data extraction and analysis workflows, optimizing reporting efficiency by structuring and automating customer trend reports for stakeholders
• Analyzed customer segmentation and developed ML models with Scikit-Learn for targeted marketing, boosting campaign performance by 15%
• Created Power BI dashboards for sales metrics, improving data accessibility for non-technical teams and supporting real-time operational insights
• Cleaned and validated data using Alteryx and SQL, achieving 99.5% data accuracy in monthly reports and reducing manual entry errors by 40%
• Orchestrated ETL workflows with Azure Data Factory, improving data integrity across multiple sources and cutting data transfer times by 35%
• Supported senior engineers in creating metric tables and end-user views on Snowflake to improve data access for analysts in Power BI
• Migrated data from Teradata SQL to Snowflake with custom SQL scripts, enhancing data integration speed by 25% for secure analytics
Data Analyst, KPMG 07/2020 – 06/2021 India
• Conducted SQL and Python-based data analysis to identify customer behavior, supporting marketing strategies that increased client retention by 13%
• Standardized data governance practices across cross-functional teams, ensuring 100% consistency in data definitions across reports
• Automated ETL workflows with Informatica and dbt, reducing reporting time by 40% and maintaining consistency across datasets for KPIs
• Executed A/B testing and segmentation analyses, resulting in 15% more effective targeting for resource allocation in marketing efforts
• Developed predictive models with R to support marketing initiatives, providing insights for swift and efficient decision-making
• Deployed Amazon Redshift for data warehousing, improving data retrieval speed by 50% and facilitating faster business unit responses
• Built interactive Tableau dashboards for marketing teams, generating actionable insights that influenced decision-making
Education
The University of Texas at Dallas 08/2021 – 05/2023
Master of Science, Business Analytics
Dean’s Excellence Scholar
University of Mumbai 08/2014 – 05/2020
Bachelor of Engineering, Electrical Engineering
Projects
Credit Card Fraud Detection 01/2023 – 06/2023
• Identified, analyzed, and interpreted trends leading to credit card fraud by balancing a 99% skewed dataset
• Trained four classifiers (Logistic Regression, Decision Trees, KNN, SVM) to achieve a best AUC score of 0.97
• Obtained an accuracy of 0.98 with Logistic Regression after oversampling the minority class with SMOTE
• Applied neural network models (Keras) on both datasets to improve fraud detection accuracy
Small Business Loan Default Analysis 07/2022 – 12/2022
• Employed machine learning techniques to predict if an approved loan will default with an accuracy of 94%
• Achieved a maximum precision and recall of 0.884 and 0.864, respectively, utilizing an XGBoost classifier
• Examined feature importance to determine that the length of loan affected default rates by about 50%
• Located and defined process improvement opportunities to reduce the risk of default in loans by 95%
Car Price Prediction 01/2022 – 06/2022
• Implemented predictive modeling to forecast car prices with a Mean Absolute Error (price difference) of $3,000
• Applied Linear Regression, K-Nearest Neighbors Regressor, and Decision Tree Regressor on a dataset of 11,000 entries
• Produced easily interpretable dashboards via 40 visualizations in the form of scatter plots and histograms
• Improved decision-making by providing actionable business recommendations for price optimization