NEHA KUMARI
*** ***** ** ******** ** ***** 201-***-**** ac2euv@r.postjobfree.com LinkedIn GitHub RPubs
EDUCATION
Saint Peter’s University, Jersey City, NJ Dec’ 2017
M.S. Data Science, Concentration in Business Analytics, 3.97/4.00 GPA
Rajasthan Technical University, Rajasthan, India June 2010
B.Tech in Computer Engineering
TECHNICAL SKILLS
Programming Languages : R, Python, SAS, SQL, Scala, ABAP, JavaScript
Database Technologies : SQL Server 2008, MongoDB, MySQL
Tools : Git, Tableau, Zeppelin, Databricks
Other Skills : Apache Spark, Amazon EC2, EMR, S3
Relevant Courses : Database & Data Warehousing, Decision Modeling, Statistical Programming, Machine Learning, Data Mining, Big Data Analytics, Marketing Analytics & Operations Research
PROFESSIONAL EXPERIENCE
Technical Consultant, KPIT Technologies, India June 2015-Jan’ 2016
Customized and configured high-volume marketing campaigns.
Performed data cleansing, data manipulation and exploratory analysis using Talend and Tableau
Extracted, transformed and loaded (ETL) campaign data to provide actionable information about the potential target groups
Technology Consultant, Exa-ag India Pvt Ltd, India Oct’ 2013-May 2015
Created relational schema in SQL that captured the master and transactional data for a global steel company
Provided BI data to analysts using BI tools and data visualizations built in Tableau
Extensively used XML Web Services for transferring/retrieving data between different providers
Software Engineer, Hinduja Tech Ltd, India April 2012-Oct’ 2013
Wrote queries and complex stored procedures using PL/SQL for a SQL database
Developed user interface and login screens with validations using JavaScript.
ACADEMIC EXPERIENCE
Reddit User Comment Classification (SparkML, Watson)
Applied K-means clustering to comments based on their important words.
Derived personality insights of Reddit users using the Watson API
Time Series Forecast of Stocks (R, dplyr, forecast, caret, ggplot)
The objective of this study was to create an optimal portfolio of stocks.
Implemented an ARIMA model to forecast the regression model parameters.
Predicted the price returns using the forecasted parameters.
Analyzing Genomics Data and BDG Project (Scala, PySpark, lightning-viz)
Analyzed data from the 1000 genomes project.
Used Hadoop-friendly serialization and file formats (Avro and Parquet) for compact binary representation and cross-language compatibility.
Implemented K-means clustering to predict population
NYC Traffic Collisions (Python, MySQL Workbench, Tableau)
Explored factors contributing to traffic collisions in each borough and the number of people injured.
Determined the road safety index for a given route
Movie Recommender System (Python)
Implemented matrix factorization techniques such as Singular Value Decomposition and Alternating Least Squares, combined with biases and collaborative filtering
Achieved RMSE of 0.89