Sagar Kulkarni
***** ********* *****, *****, ** 813-***-**** *****************@*****.*** GitHub LinkedIn
PROFESSIONAL SUMMARY
Over 5 years of industry experience in Business Intelligence and Data Engineering, with expertise in building, integrating, and deploying data solutions for real-time analytics and reporting.
EDUCATION
Master of Science in Business Analytics and Information Systems, University of South Florida Expected May 2021
Bachelor of Engineering in Electronics and Telecommunication Engineering, University of Mumbai May 2015
SKILLS
Programming languages: Python, R, C#, HTML
Databases: SQL, MySQL, PostgreSQL, Sybase IQ, Oracle, Microsoft SQL Server, NoSQL
Big Data Technologies: HDFS, Hive, Spark, Hadoop, MapReduce, HBase, Azure, Cloudera, Databricks, Docker
Tools: MS Office, Informatica PowerCenter, SAP PowerDesigner, Microsoft Visio, Jupyter Notebook, SAS Enterprise Miner, Azure ML, Tableau, AWS, Git, Microsoft Visual Studio, Alteryx, Microsoft Azure, Google Analytics, JIRA
Machine Learning: Regression, Classification, Clustering, Decision Trees, KNN, Random Forest, SVM, NLP, K-Means, Ensembles, PCA
CERTIFICATIONS
1. Tableau Certified Desktop Specialist
2. Advanced Google Analytics Certified
3. CCA Spark and Hadoop Developer (CCA175)
4. AWS Certified Cloud Practitioner
PROFESSIONAL EXPERIENCE
Data Analytics Intern, AvMed Health Plans, Gainesville, Florida Aug 2020 – Present
Analyzed health insurance and financial claims data to create ad-hoc business reports using Python and MarkLogic NoSQL database.
Automated data pipelines and reporting by integrating data into MarkLogic Server according to the required architecture and technical specifications, and scheduled reports at the required frequency.
Created User Acceptance Test Cases and test summary documents as part of QA Process to track defects and maintain data integrity.
Data Analyst, Center for Urban Transportation Research (CUTR), Tampa, Florida Dec 2019 – Aug 2020
Cleaned and analyzed field traffic data using Python and R, and created interactive reports using Power BI.
Applied exploratory analysis, clustering, and classification to traffic crash data and its features, and built visual dashboards to detect trends and patterns in the factors affecting traffic crashes.
Performed root cause analysis and reduced defects by 20% by developing metrics to identify inefficiencies within the system.
Business Intelligence Analyst, Tech Mahindra Limited, Mumbai, India Aug 2015 – June 2019
Researched and analyzed data from sources such as flat files, SAP HANA, and relational databases, and performed system design and data modeling for efficient data integration.
Managed data for over 300 million users in databases such as PostgreSQL and SQL Server using ETL workflows and stored procedures.
Translated client requirements and drafted business requirement documents, software requirement specifications and user stories.
Cleaned and normalized data to meet business standards using SQL queries, transformations, conditional splits, data conversions, and logically derived columns, resulting in a 20% increase in performance.
Developed ETL workflows and mappings for creating pipelines to load data from various sources into a centralized data mart.
Analyzed data and gathered insights using Python, SQL queries, VBA scripts, and Excel functions such as pivot tables and VLOOKUP to produce monthly and quarterly business reports on consumer behavior and total sales for several products, increasing revenue by 15%.
Streamlined processes and automated report scheduling, which improved performance, reduced data loading time by 10 hours per week, and helped achieve close to 90% reporting accuracy in final business reports.
Developed KPI dashboards and visualizations using Tableau for analyzing and identifying key metrics to improve business quality.
ACADEMIC PROJECTS
Automated ETL Pipeline using Python (SQL, ETL, PostgreSQL) Fall 2020
Used Python to build an ETL Pipeline to transfer data from multiple JSON files in different directories into a PostgreSQL database.
Defined separate fact and dimension tables in a star schema to support analytical operations on the incoming data.
Genome Drug Response Regression Analysis (Spark, Databricks, AWS) Fall 2020
Built a regression model and performed correlation analysis using the MLlib library in PySpark to predict drug response from patient genome data, integrating and transforming the data with Spark DataFrames and Spark SQL.
Employee Satisfaction Sentiment Analysis using NLP (Python, NLTK, Gensim, Tableau) Spring 2020
Scraped employee text reviews from sites like Glassdoor, performed sentiment analysis to show the average employee sentiment score for top companies, and applied topic modeling to identify the top features affecting employee sentiment.
Built an interactive dashboard using Tableau comparing the average sentiment scores for all companies over the years.
Retail Sales Analysis using Tableau Spring 2021
Created dashboards to track core business KPIs such as sales revenue, gross profit, and returns for a global retail company.
Built a relational model to blend data and created calculated columns to design an interactive dashboard.