
Prakriti Jha

*** ********** **** ***, ***, Stamford CT advawx@r.postjobfree.com 475-***-**** https://linkedin.com/in/prakriti-jha

SUMMARY

7+ years of total IT experience, including 5+ years in data analyst and data scientist roles, with strong technical skills, solid business acumen, and client-facing experience. Proven ability to manage multiple projects and meet critical deadlines.

TECHNICAL AND BUSINESS SKILLS

Programming Languages: R, Python, SAS, SQL, HiveQL, Excel macros with VBA

Tools: R-Studio, Spyder, Jupyter, Tableau, Qlik Sense, Azure Databricks, Data Factory and Storage Explorer, Azure DevOps, SAS time-series forecasting, SAS Enterprise Miner, Sqoop, Microsoft Visual Studio Code, Oracle Developer, JMP-Pro 13, Excel Solver, JIRA, HP Quality Center (HPQC), SVN version control

Software: Microsoft Office (Excel, Word, PowerPoint), Microsoft Project, Microsoft Visual Studio, Microsoft Visio

PROFESSIONAL EXPERIENCE

Insights Global

Designation: Data Analyst/Scientist Client: Charter Communications Nov 2022 – Present

Performed data quality and integrity checks, using SQL and Python, on new data going into the production models.

Periodically checked the distribution of features in new data to make sure it aligned with the production data.

Monitored the performance of the machine learning models in production.

Calculated sample sizes and test durations for A/B testing of new changes to business decisions.
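
As an illustration only, a minimal Python sketch of this kind of sample-size and duration calculation using statsmodels, assuming a two-proportion test; the baseline rate, lift, and traffic figures are hypothetical placeholders, not values from the actual engagement:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Hypothetical inputs (not actual figures from this engagement)
baseline_rate = 0.10            # current conversion rate
mde = 0.01                      # minimum detectable absolute lift
alpha, power = 0.05, 0.80
daily_visitors_per_arm = 5_000

# Cohen's h effect size for a two-proportion z-test
effect = proportion_effectsize(baseline_rate + mde, baseline_rate)

# Required sample size per test arm
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=alpha, power=power, alternative="two-sided"
)

# Convert the sample size into an expected test duration
days = n_per_arm / daily_visitors_per_arm
print(f"~{n_per_arm:,.0f} users per arm, ~{days:.1f} days at current traffic")
```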

Cyma Systems Inc

Designation: R-Programmer Role: Data Analyst Client: Department of Public Health, CT Jul 2021 – Nov 2021

FTP and reporting of COVID-19 vaccine data:

The objective of this project was to study public health data on COVID-19 vaccination and identify the drivers for increasing the vaccination rate in Connecticut.

Held daily meetings with the client to understand requirements and log them in Azure DevOps.

Built R code to prepare charts and reports from SQL source tables; identified and presented the reasons behind lower vaccination rates in different areas of Connecticut.

Used Git in Azure DevOps to maintain code versions.

Automated the generation of several Excel and Word reports using openxlsx, R Markdown, and knitr.
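
The report automation above was done in R with openxlsx, R Markdown, and knitr; purely as an illustrative analogue in Python with pandas, and with hypothetical table, column, and file names, a recurring Excel report could be generated like this:

```python
import sqlite3  # stand-in for the actual SQL source
import pandas as pd

# Pull vaccination counts from a SQL table (connection and schema are hypothetical)
conn = sqlite3.connect("vaccination.db")
df = pd.read_sql_query(
    "SELECT town, age_group, doses_administered FROM vaccine_doses", conn
)

# Summarize doses by town and age group
summary = df.pivot_table(
    index="town", columns="age_group",
    values="doses_administered", aggfunc="sum", fill_value=0
)

# Write the recurring report to Excel; re-running the script refreshes it
with pd.ExcelWriter("weekly_vaccination_report.xlsx") as writer:
    summary.to_excel(writer, sheet_name="Doses by Town")
```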

General Reinsurance Corporation

Designation: Business Analytics Specialist Role: Data Analyst/Scientist Domain: P&C Insurance Feb 2019 – June 2021

Creation of a data lake, pipelines, and data transformations on the Azure platform:

The objective of this project was to understand profitability based on underwriting characteristics.

Gathered data from the organization's different legacy data sources into Azure storage, managed through Azure Storage Explorer.

Created pipelines in Azure Data Factory to refresh the data on a monthly basis.

Transformed data using Spark SQL on Azure Databricks to join Gen Re's underwriting data with the claims data (see the sketch below).

Prepared and presented visualization reports using Qlik Sense.
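
A minimal sketch of what such a Spark SQL join on Databricks could look like; the storage paths, table names, and columns (policy_id, written_premium, incurred_loss) are hypothetical, not the actual Gen Re schema:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

# Register the source tables (paths and names are hypothetical)
spark.read.parquet("/mnt/datalake/underwriting").createOrReplaceTempView("underwriting")
spark.read.parquet("/mnt/datalake/claims").createOrReplaceTempView("claims")

# Join underwriting characteristics to claims to study profitability
uw_claims = spark.sql("""
    SELECT u.policy_id,
           u.line_of_business,
           u.written_premium,
           SUM(c.incurred_loss) AS total_incurred_loss
    FROM underwriting u
    LEFT JOIN claims c
      ON u.policy_id = c.policy_id
    GROUP BY u.policy_id, u.line_of_business, u.written_premium
""")

# Persist the joined table for downstream reporting (e.g., Qlik Sense)
uw_claims.write.mode("overwrite").parquet("/mnt/datalake/uw_claims_joined")
```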

Frequency-severity analysis of catastrophe claims:

The objective of this study was to understand the drivers of frequency and severity of catastrophe claims for one of Gen Re's clients.

Transformed and cleaned data using dplyr in R.

Prepared frequency and severity plots by cause of loss for all features of the dataset using ggplot in R.

Generated plots and tables in easy-to-read Word documents using R Markdown and knitr.

Built glm and glmnet models to determine the important drivers of frequency and severity (sketched below).

Prepared actual-vs-expected ratios (AtoEs) for all levels of each feature to deep dive into the impact of each driver on frequency and severity.

Prepared an Excel calculator based on the model equation to help the client easily see the expected frequency and severity by entering raw input values.
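
The models above were built in R with glm and glmnet; as an illustration of the same idea only, a Python sketch with statsmodels might look like the following (the dataset and column names such as claim_count, exposure, and cause_of_loss are hypothetical):

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("cat_claims.csv")  # hypothetical cleaned dataset

# Frequency: Poisson GLM on claim counts, with exposure as an offset
freq_model = smf.glm(
    "claim_count ~ cause_of_loss + region + construction_type",
    data=df,
    family=sm.families.Poisson(),
    exposure=df["exposure"],
).fit()

# Severity: Gamma GLM (log link) on positive claim amounts
sev_df = df[df["claim_amount"] > 0]
sev_model = smf.glm(
    "claim_amount ~ cause_of_loss + region + construction_type",
    data=sev_df,
    family=sm.families.Gamma(link=sm.families.links.Log()),
).fit()
print(sev_model.summary())

# Actual-vs-expected (AtoE) by level of one feature
df["expected_count"] = freq_model.predict(df, exposure=df["exposure"])
atoe = (
    df.groupby("cause_of_loss")[["claim_count", "expected_count"]].sum()
    .assign(AtoE=lambda g: g["claim_count"] / g["expected_count"])
)
print(atoe)
```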

Loss Ratio analysis for Personal Umbrella Claims:

The objective of this study was to analyze Gen Re's personal umbrella claims data and generate data-driven insights through a cause-of-loss study to support the business in future underwriting decisions.

Prepared and cleaned data in R by imputing missing values and capping and flooring outliers based on each variable's distribution.

Built an XGBoost model in R, as it performed better than glm and random forest (see the sketch below).

Prepared contingency matrices and AtoEs for all features of the dataset.

Presented and shared key observations with the business.
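
The model above was built in R; purely as an illustrative analogue using the Python xgboost package, with a hypothetical loss_ratio target and hypothetical column names, the outlier capping and model fit could look like this:

```python
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("umbrella_claims.csv")  # hypothetical cleaned dataset

# Cap/floor outliers at the 1st and 99th percentiles, mirroring the data prep above
for col in ["policy_limit", "prior_claims"]:
    lo, hi = df[col].quantile([0.01, 0.99])
    df[col] = df[col].clip(lower=lo, upper=hi)

X = pd.get_dummies(df.drop(columns=["loss_ratio"]))
y = df["loss_ratio"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Gradient-boosted trees; the hyperparameters shown are placeholders
model = xgb.XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)

print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
print(pd.Series(model.feature_importances_, index=X.columns).nlargest(10))
```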

Bind Rate Analysis using Logistic Regression and Decision Tree in Python:

The objective of this study was to identify the drivers associated with binding general liability certificates, so the business could develop a plan to grow these lines of business.

Analyzed and designed mapping table and data dictionary for all fields.

Assembled, cleansed, and transformed data using Python, SAS, SQL and Excel.

Performed correlation testing and variable reduction using the Pearson correlation test, Cramer's V test, Lasso regression, and Information Value in Python.

Built a logistic regression model in Python to determine the significance of the independent variables (sketched below).

Performed hyperparameter tuning to identify the best hyperparameters for the decision tree.

Built a classification decision tree to visually display the significance and classification of the independent variables.

Prepared a story in a Qlik Sense app and presented significant findings to business team members.
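
This project was done in Python; a minimal sketch of the logistic regression significance check and the decision tree tuning, using a hypothetical dataset with a made-up bound target and file name, could look like this:

```python
import pandas as pd
import statsmodels.api as sm
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("gl_certificates.csv")  # hypothetical prepared dataset
X = pd.get_dummies(df.drop(columns=["bound"]), drop_first=True).astype(float)
y = df["bound"]  # 1 if the certificate was bound, else 0

# Logistic regression with statsmodels to inspect coefficient significance
logit = sm.Logit(y, sm.add_constant(X)).fit()
print(logit.summary())  # p-values show which drivers are significant

# Cross-validated grid search over decision tree hyperparameters
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_depth": [3, 4, 5, 6], "min_samples_leaf": [50, 100, 200]},
    scoring="roc_auc",
    cv=5,
)
grid.fit(X, y)
print("Best params:", grid.best_params_)
```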

Loss analysis using clustering:

Assembled and cleaned data using Python and SQL.

Determined the optimal number of clusters using the elbow method.

Performed K-Means clustering using the optimal K (see the sketch below).

Performed F-statistic tests to rank the variables by their contribution to the formation of the clusters.

Prepared a Qlik Sense app to present the results to the business team.
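
A minimal sketch of this clustering workflow in Python with scikit-learn, using a hypothetical dataset and an assumed choice of k:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import f_classif

df = pd.read_csv("loss_features.csv")  # hypothetical cleaned dataset
X = StandardScaler().fit_transform(df)

# Elbow method: inspect inertia across a range of k values
inertias = {k: KMeans(n_clusters=k, random_state=42, n_init=10).fit(X).inertia_
            for k in range(2, 11)}
print(inertias)  # look for the "elbow" where the drop in inertia flattens

# Fit K-Means with the chosen k (assume the elbow lands at k = 4)
labels = KMeans(n_clusters=4, random_state=42, n_init=10).fit_predict(X)

# Rank variables by F-statistic: how strongly each one separates the clusters
f_stats, _ = f_classif(df, labels)
print(pd.Series(f_stats, index=df.columns).sort_values(ascending=False))
```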

Designation: Business Analytics Intern Role: Data Analyst Intern Domain: P&C Insurance May 2018 – Dec 2018

Data lake in Hadoop:

Different systems in the organization used different data sources, such as Oracle, Sybase, and mainframe DB2. Worked on bringing the data together in Hadoop to enable broad data exploration and data discovery in the future. Main responsibilities included:

Interacted with different teams across the organization to identify the true data source for each system; this sometimes required understanding legacy data sources.

Traced views and materialized views back to their underlying data sources.

Interacted with the business team to gain an overall understanding of the data fields.

Imported datasets into HDFS using Sqoop, created Hive and ORC tables, and performed the required transformations using HiveQL.

Monitored and maintained data lineage using Apache Atlas.

Computer Sciences Corporation

Designation: Product Developer Role: Data Analyst Client: MetLife Life Insurance Dec 2016 - Jul 2017

Mortality Risk Analysis:

Performed data cleaning and transformation using dplyr in R.

Performed supervised and unsupervised learning on the data to analyze mortality risk based on customers' geographic and medical data. Prepared Tableau dashboards and stories.

Built SQL queries for data mining and for on-demand analysis. Produced and distributed standard reports.

Created and presented team metrics report and business case to improve customer experience.

Capgemini Ltd.

Designation: Senior Software Engineer Role: Data Analyst Client: MetLife Health Insurance Sep 2015 - Dec 2016

Created several business reports by joining multiple tables from multiple databases using SQL queries.

Created several Tableau dashboards and presented insights and stories to the clients.

Received appreciation for optimizing job execution in the QA and UAT phases using Excel VBA, which saved the client a considerable cost of $4,080 and saved associates 2 to 3 hours a day.

Tata Consultancy Services

Designation: System Engineer Role: Developer Oct 2012 - Sep 2015

Interacted directly with the client to gather business requirements; provided data analysis reports and pertinent technical solutions; developed code modules in COBOL and JCL based on the requirements; prepared WSRs and logged defects and discussions in JIRA.

Developed and implemented project modules of Financial Management System.

Executed code modules following Software Development Life Cycle.

Optimized repetitive processes by creating Excel macros, saving associates considerable work time.

EDUCATION

University of Connecticut School of Business, Stamford, CT, USA Aug 2017 - Dec 2018

Master of Science in Business Analytics and Project Management GPA: 4.0

Rajiv Gandhi Proudyogiki Vishwavidyalaya Jun 2008 - Jun 2012

Bachelor of Engineering in Electrical and Electronics Engineering


