P.Praveen Kumar
(Data Scientist)
Professional Summary
• Software Developer with 4.8 years of experience in Data Science and Java.
• Genpact-certified Data Scientist with 3.8 years of experience in Machine Learning.
• Achievements include creating data regression models that predict company stock prices with 25% more accuracy than the historical average.
• Highly skilled in machine learning, data visualization, and data analytics.
• Good knowledge of Core Java, J2EE, Spring, and MuleSoft (API).
• Worked on both supervised and unsupervised learning models such as Linear Regression, Logistic Regression, Random Forest, and KNN using Python.
• Experienced in statistical modelling techniques such as hypothesis testing, Z-tests, and ANOVA.
• Performed exploratory data analysis on complex data and files.
• Good knowledge of Deep Learning and Natural Language Processing.
• Completed a Pro-degree in Data Science.
Work experience
TCS – Data Scientist 09/2020 – 09/2021
Project: Hartford Insurance
Built a customized Python package (automata) that extracts, transforms, and validates the company’s internal data and stores model metadata in Oracle and Snowflake tables using PySpark.
Worked on data filtering, the EDA process, and the CI/CD pipeline.
Visualized insights from customers’ policy and insurance data every quarter.
CI Engine: analyses and compares companies’ quote data in order to give customers better policy rates.
Automated scripts with AutoSys that collect the latest data from Hive and generate reports.
Actively engaged in quantitative analysis and sophisticated modelling to address business needs.
Worked with virtual environments to create packages and customized kernels.
Selected the better-performing model using MLflow and Model Registry.
TCS – Data Scientist 07/2018 – 07/2019
Project: SPRINT DIGITAL TELECOM (SERVICE PLATFORM)
Analysed storage space usage for Snowflake services every quarter.
Applied time series analysis to identify trends in our compute habits.
Performed trend analysis to find the days and times when resources are used infrequently, and limited the compute nodes during slow periods.
Drilled down into databases, schemas, and query types to track user adoption across the organization.
Analysed the monthly average and usage over time for each individual project that falls under the business’s cost limits.
Tuned a predictive churn model on past user activity to detect which customers are likely to cancel their subscription to a service based on their usage. Models used: two-class logistic regression and boosted decision tree.
Geetham Software Pvt Ltd – Junior Data Scientist 07/2017 – 06/2018
Project: TIME Dataset / Aadhaar Dataset
Collected, studied, and interpreted large datasets from different sources for the company and cleansed them for further analysis.
Analysed the company’s sales trends and helped predict its requirements for upcoming sales. Reported the company’s healthy and unhealthy months in order to improve its revenue.
Suggested actions to take for targeting the right customer segments in the Aadhaar dataset.
Packages used: NumPy, pandas, Matplotlib, scikit-learn, Keras.
Developed and implemented new forecasting models which increased company productivity and efficiency.
Implemented sentiment analysis for Aadhaar using NLP (NLTK).
Visualized the data using Tableau and received customer appreciation for the reports.
Artificial Neural Networks
Developed a POC model using an ANN.
Tracked randomly picked customers over six months and collected data on those who left and those who stayed with the bank.
The predictive model identifies which customers are at the highest risk of leaving the bank.
Geetham Software Pvt Ltd – MuleSoft Developer 01/2016 – 06/2017
Project: Transport Administration (TIME: Transport Income Made Easy)
Developed RAML 0.8 API specifications using Anypoint Exchange.
Integrated the layer for the module application using MuleSoft.
Developed a flow to convert a set of messages into the canonical format.
Used Mule Expression Language (MEL) for development.
Continuous integration and continuous delivery (CI/CD) using Jenkins.
Implemented an error-handling framework for the service layer.
Worked with various connectors, transformers, and listeners.
Debugged the developed modules using MUnit test cases for the integrations.
Worked extensively with the DataWeave expression language.
Skills
Programming Language
Java, Spring, MuleSoft
Data Analytics
Python, R, Data Science, Machine Learning, Deep Learning, PySpark
Databases
SQL, MSQL, Oracle, Snowflake, Hive
IDE
Eclipse, Anypoint Studio, Anaconda, Jupyter Notebook, Spyder, Colab
Frameworks
Collections
Repository
Git, Bitbucket
Operating Systems
Windows, Unix, PuTTY
Visualization Tools
Tableau, Excel
Design Patterns
Singleton
Project Tracking Tools
JIRA, Rally
Web Services
RESTful, Postman
Education
St. Joseph’s College of Engineering, Chennai
Electronics and Communication Engineering
09/2011 – 06/2015
Graduated with an aggregate of 56%
Vidhya Vikas Higher Sec School
12th – State Board
09/2009 – 06/2011
Graduated with an aggregate of 70%
Petit Seminaire, Pondicherry
10th – State Board
09/1997 – 06/2009
Graduated with an aggregate of 77%
Profile
Date of birth:
01/06/1994
Nationality:
Indian
Address:
No: 130, Mahaveer Nagar, Lawspet, Pondicherry 600508, India
Phone number:
Email address:
*******************@*****.***
LinkedIn:
linkedin.com/in/praveen-kumar-a1b949bb
GitHub:
https://github.com/Praveen8/datascience
Date: P.Praveen Kumar