Prateek Sharma data scientist

Location:
Hyderabad, Telangana, India
Salary:
$65
Posted:
February 19, 2020

PRATEEK SHARMA adbvdr@r.postjobfree.com +1-469-***-****

PROFESSIONAL SUMMARY

I have been working in data science and business intelligence for more than 8 years and hold a master's in Data Science from Michigan Technological University. Around 7 years of industry experience as a Data Analyst, Data Engineer, and Statistical Analyst (4 years in India and 3 years in the Netherlands). Around one and a half years of academic experience as a research assistant under Dr. Jinshan Tang, working with Machine Learning, Deep Learning, AWS, and GCP. Interested in working and growing as a Data Scientist / Sr. Data Engineer in a dynamic firm.

Around 8 years of rich experience in Machine Learning algorithms, Data Mining techniques, and Natural Language Processing.

Experienced Scrum Master and certified Agile professional.

Worked end to end: gathering business requirements, pulling data from different sources, data wrangling, implementing machine learning algorithms, deploying models, and presenting results to clients.

Mined and analyzed huge datasets using Python and R. Created an automated data-cleansing module in Python using a supervised learning model.

Worked with data manipulation and modeling packages such as Pandas, NumPy, SciPy, and Keras.

Implemented statistical tests such as ANOVA, A/B testing, Z-tests, and t-tests for various business cases.
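A/B tests like these usually come down to a two-sample comparison; a minimal sketch of Welch's t statistic in plain Python (the conversion-rate samples are made up for illustration):

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical conversion rates for control and variant groups
control = [0.12, 0.10, 0.11, 0.13, 0.12]
variant = [0.15, 0.14, 0.16, 0.15, 0.14]
t, df = welch_t(variant, control)
```

In practice `scipy.stats.ttest_ind(variant, control, equal_var=False)` computes the same statistic and also returns a p-value.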

Worked with word-embedding models such as Word2Vec and GloVe.

Knowledge of Seq2Seq models, Bag of Words, Beam Search, and other natural language processing (NLP) concepts.
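Of these, Bag of Words is the simplest to show concretely; a toy vectorizer assuming whitespace tokenization (real pipelines would use something like scikit-learn's CountVectorizer):

```python
from collections import Counter

def bag_of_words(docs):
    """Build a shared vocabulary and one count vector per document."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vectors.append([counts.get(word, 0) for word in vocab])
    return vocab, vectors

vocab, vectors = bag_of_words(["data science", "data engineering"])
```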

Experienced with hyperparameter tuning techniques such as Grid Search and Random Search.
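The difference between the two is easy to sketch: grid search enumerates every combination, while random search samples a fixed budget. The `score` function below is a made-up stand-in for a model's validation score:

```python
import itertools
import random

def score(params):
    # Hypothetical validation score, peaking at lr=0.1, depth=4
    return -abs(params["lr"] - 0.1) - 0.05 * abs(params["depth"] - 4)

grid = {"lr": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}

# Grid search: evaluate all 9 combinations exhaustively
combos = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
best_grid = max(combos, key=score)

# Random search: evaluate a fixed budget of sampled combinations
random.seed(0)
samples = [{k: random.choice(v) for k, v in grid.items()} for _ in range(5)]
best_random = max(samples, key=score)
```

scikit-learn's GridSearchCV and RandomizedSearchCV wrap this same idea around cross-validated model fits.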

Performed outlier analysis with methods such as Z-score analysis, linear regression, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and Isolation Forest.
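The Z-score variant is the simplest of these; a sketch in plain Python on made-up sensor readings. Note the threshold: with a single extreme value in a small sample, the outlier itself inflates the standard deviation, so the classic cutoff of 3 can be unreachable and 2 is used here:

```python
from statistics import mean, stdev

def zscore_outliers(xs, threshold=2.0):
    """Return values whose absolute z-score exceeds the threshold."""
    m, s = mean(xs), stdev(xs)
    return [x for x in xs if abs(x - m) / s > threshold]

readings = [10, 11, 9, 10, 12, 10, 11, 95]  # 95 is an injected anomaly
anomalies = zscore_outliers(readings)
```

DBSCAN and Isolation Forest (both available in scikit-learn) handle the multivariate cases this one-dimensional rule cannot.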

Knowledge of PostgreSQL and Unix shell scripting. Designed and developed a wide variety of PostgreSQL modules and shell scripts, optimized for performance.

Strong knowledge of SQL and relational databases (Oracle, SQL Server, Greenplum).

Created business reports with key performance indicators (KPIs) using MicroStrategy visualizations.

Experienced with DevOps tools such as Docker containers and Jenkins.

Worked with DevOps teams on deployments, writing Python code for custom logic to implement Infrastructure as Code.

SKILLS

●Python

●TensorFlow

●Machine learning

●OpenCV

●Linux/Unix Scripting

●Cloud Computing (GCP)

●MicroStrategy (Reporting tool)

●R

●Informatica

●SQL (DB2), Big Query

●Agile Methodology

●Android Development

●Statistics

●Hadoop Cluster Setup

WORK HISTORY

Michigan Technological University Data Scientist / Research Assistant: Houghton, United States Aug 2019 – Present

●Created an Automated Ticket Routing algorithm for the support team using Natural Language processing and other machine learning algorithms.

●Analyzed and significantly reduced customer churn, using machine learning to streamline risk prediction and intervention models.

●Worked with K-Means, K-Means++, and hierarchical clustering algorithms for customer segmentation.

●Performed outlier analysis with methods such as Z-score analysis, linear regression, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and Isolation Forest.

●Used cross-validation to test the models with different batches of data to optimize the models and prevent overfitting.

●Worked with PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), and other dimensionality reduction techniques on classification problems with various linear models.

●Worked with sales forecast and campaign sales forecast models such as ARIMA, Holt-Winters, Vector Autoregression (VAR), and autoregressive neural networks (NNAR).

●Experimented with predictive models including Logistic Regression, Support Vector Machines (SVM), and reinforcement learning to prevent retail fraud.

●Worked with ETL developers to improve the quality of inbound data using various preprocessing methods.

●Applied survival analysis to customer dormancy rates and periods, and to inventory management.

●Created a customer-service upgrade: an automated chatbot that better assists online customers using text classification and a knowledge base.

●Responsible for the design and development of advanced R/Python programs to transform and harmonize data sets in preparation for modeling.

●Deep knowledge of scripting and statistical programming in Python, advanced SQL for working efficiently with very large datasets, and the ability to deal with non-standard machine learning datasets.

●Built visualizations to facilitate research into Human Connectome Project data and to identify anatomical and functional connectivity within the healthy human brain, as well as in brain disorders such as dyslexia, autism, Alzheimer's disease, and schizophrenia.

●Performed Exploratory Data Analysis (EDA) to maximize insight into the dataset, detect the outliers and extract important variables.

●Developed data preprocessing pipelines using Python, R, and Linux scripts on an on-premises high-performance cluster and on AWS and GCP cloud VMs.

●Developed and evaluated models on MRI data using machine learning methods such as KNN and deep learning.
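The clustering work above (K-Means and its variants) reduces to Lloyd's algorithm; a minimal one-dimensional sketch with made-up data (production code would use scikit-learn's KMeans, which adds K-Means++ seeding):

```python
from statistics import mean

def kmeans_1d(xs, centers, iters=10):
    """Lloyd's algorithm on 1-D data: assign each point to the nearest
    center, then move each center to the mean of its cluster."""
    for _ in range(iters):
        clusters = {c: [] for c in centers}
        for x in xs:
            nearest = min(centers, key=lambda c: abs(x - c))
            clusters[nearest].append(x)
        centers = [mean(members) if members else c
                   for c, members in clusters.items()]
    return sorted(centers)

# Two well-separated groups; the centers converge to their means
final_centers = kmeans_1d([1, 2, 3, 10, 11, 12], [0.0, 5.0])
```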

TATA CONSULTANCY SERVICES Data Science Engineer & Informatica Admin:

Amsterdam, Netherlands and Mumbai, India Jan 2014 – Aug 2018

●Assisted Business Analysts and Data Scientists with data preprocessing: data preparation, cleaning, masking, analysis, and profiling.

●Created a classification model that reduced the false alerts generated by the existing anti-money-laundering and fraud detection system by 35%.

●Successfully upgraded Informatica from version 9.6.1 HF2 to 10.1.2

●Worked with claim classification models to reduce the different workloads for the Core Operations team.

●Implemented a CNN model that scans documents arriving from downstream systems to identify sets of images for the claims department.

●Explored and created new data sets and implemented data science workflow platforms for future applications.

●Designed and implemented workflow methodologies for the claim-prediction API and was involved in creating ETL components to pull the required data.

●Created models such as SVM with an RBF kernel, multilayer perceptron neural networks, KNN, and Lasso, Ridge, and Elastic Net regression.

●Worked with K-fold cross-validation and other model evaluation techniques across different projects.

●Worked with text extraction modules such as Tesseract to extract text from various documents and process the text with NLTK.

●Set up a data archival process that freed up 24 TB of space in the production environment and improved performance.

●Refined the data quality design, leading to 20% faster batches, and added new data quality checks.

●Managed the backlog in JIRA as Scrum Master and handled capacity management for the EM team.

●Worked with Client Management to define and design the department's cloud migration strategy.

●Groomed 14 data engineers in Informatica and led the enterprise memory team.
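Of the regression models listed above, Ridge is the easiest to demystify: in one dimension, with a centered feature and no intercept, the coefficient has a closed form. A sketch with made-up numbers:

```python
def ridge_1d(xs, ys, lam=1.0):
    """Closed-form ridge coefficient for one feature, no intercept:
    w = sum(x*y) / (sum(x*x) + lam)."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + lam
    return num / den

# lam=0 recovers ordinary least squares; larger lam shrinks w toward 0
w_ols = ridge_1d([1, 2, 3], [2, 4, 6], lam=0.0)     # 2.0
w_shrunk = ridge_1d([1, 2, 3], [2, 4, 6], lam=14.0) # 1.0
```

Lasso and Elastic Net swap in L1 and mixed penalties, which lose this closed form and require iterative solvers.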

Informatica Developer and MicroStrategy Report Developer Mumbai, India Jan 2012 – Dec 2013

●Formulated requirements and created functional and technical specifications from business requirements

●Developed data pipelines to ingest data of varying variety, velocity, and volume within accepted performance limits

●Developed transformations and mappings to achieve business goals per the specifications and the IBM BDWM data model

●Migrated existing BusinessObjects reports to MicroStrategy and performed ad hoc reporting and testing on the migrated reports

●Involved in the creation of metrics, attributes, filters, reports, and dashboards; created advanced chart types, visualizations, and complex calculations to manipulate the data.

●Built and maintained SQL scripts, indexes, and complex queries for data analysis and extraction.

●Scheduled data extracts and monitored them on a daily basis.

●Installed MicroStrategy Desktop and created MicroStrategy projects.

●Worked with all levels of SDLC from analysis through implementation and support.

●Worked on Data Analysis, Data profiling, Data quality, mapping & transformation

●Created checklists and review tools to ensure error free generation of ETL components

●Participated in test planning and test script definition.

●Created Report Service documents with multiple layouts formatted as per the business requirements.

●Resolved end user reporting problems through collaboration with IT and Operations.

●Developed a KPI dashboard package to depict the success of the MD Live program for the client.

●Acquired expertise in performance analysis and tuning of PowerCenter workflows

●Developed Mainframe and DB2 scheduling components to run batches per the requirements

EDUCATION

MS, Data Science, Michigan Technological University, Houghton, Michigan, Aug 2018 – Dec 2019, GPA: 3.76

BTech, Computer Science, IEC College of Engineering and Technology, Greater Noida, India, 2007–2011, GPA: 3.2

Individual Projects

Created an Android app applying Eulerian Video Magnification (from a 2012 MIT research paper) with the help of serverless computing and OpenCV

Created clustering and classification models for the forest cover type data set and evaluated them on different performance metrics

Created a georeferenced 3D Google SketchUp model of the college and produced animations, 3D PDFs, and Counter-Strike maps from it

CERTIFICATIONS

●Serverless Machine Learning with Tensorflow on Google Cloud Platform

●Building Resilient Streaming Systems on Google Cloud Platform

●Serverless Data Analysis with Google BigQuery and Cloud Dataflow

●Leveraging Unstructured Data with Cloud Dataproc on Google Cloud Platform

●Data Engineering, Big Data, and Machine Learning on GCP Specialization

●Convolutional Neural Networks in TensorFlow

●Natural Language Processing in TensorFlow

●Sequences, Time Series and Prediction

●Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning

ACCOMPLISHMENTS

Created automations for the bank worth savings of €80k per year

First rank in the Topolis SOA Architecture competition


