
Data Civil Engineering

Bronx County, New York, United States
November 27, 2017




*** ***** ******, ******* 848-***-****



To obtain a data analytics position in a challenging environment.


Master of Science in Business Intelligence and Analytics (August 2014 - Dec 2015) Stevens Institute of Technology, New Jersey

B.E., Civil Engineering (2010 – 2014)

Birla Institute of Technology and Science, Pilani, India

Languages and Tools

•Statistical modeling and analysis: R, Python, SAS, Hadoop (Hive, Pig, Spark), SAP

•Languages: Java, Scala, Qlik

•Databases & cloud: MySQL, AWS (EC2, Redshift), MongoDB, HBase, GCP

•Data Visualization: D3, R, Tableau, Gephi

•Others: SQL (Active Record), Git commands, Linux, Solr, Jira

Professional Experience

Data Scientist, Seven Networks, Marshall, Texas (Mar 2016 – Nov 2017)

Predicted customer retention and built machine learning models to increase traffic for the AdClear app across all mobile platforms, using big data tools such as Hive and Python

Built data pipelines, created data visualizations, and automated ad hoc scripts and dashboards for retention and user growth analysis using SQL, Tableau, and R Shiny

Aggregated data from multiple cloud sources such as GCP (BigQuery, Bigtable) using pandas and SQL to check data quality; performed statistical analysis and A/B tests (t-tests, p-values) on user data to gain insights into the installation flow.

Data Science Intern, Gravity4, San Francisco (May 2015 – Sep 2015)

Designed a model for programmatic ad buying on a real-time bidding platform, based on CTR and audience prediction through behavioral targeting, geo-targeting, and dimensionality reduction techniques, and deployed the model to the Kafka reporter.

Built a data/ML pipeline for MemSQL to replace Redshift and wrote hourly/daily jobs using Active Record and Ruby on Rails

Wrote MapReduce jobs for customer insights and for finding influencers in the network to improve retargeting performance, and pushed the results to MongoDB

Data Science Research Assistant, Stevens Institute of Technology (Aug 2015 – Dec 2015)

Extracted GitHub pull request comments through the API and applied topic modeling for classification using LDAvis, R Shiny, and Python

Built recommendation systems for Kaggle jobs for students based on their skills, using collaborative filtering and bagging and boosting techniques

Analyst, National Stock Exchange of India Ltd, India (Dec 2013 – July 2014)

Designed a new database and built ETL pipelines for the analytics department to keep track of investors’ complaints and automatically advance each case to the next stage.

Assisted the analytics team in profiling investment data to find patterns in investments across different segments. The results were used to design NSE campaigns and to establish new regional offices.

Academic Projects

Keyword extraction engine for online social networking posts: Multi-label classification

Built a tag prediction engine from Stack Overflow posts using classification algorithms in Python and deployed it on a web server using Flask

Recommendation engine using collaborative filtering techniques in Spark, Python

Built a personalized movie recommendation system using collaborative filtering and content-based filtering techniques in Spark and Python

Interactive Weather Outlier Visualization with D3.js, MongoDB

Data is loaded from MongoDB into R and R Shiny for outlier analysis, and the server is built in Python. D3.js is mainly used to build charts and a heat map with a timeline slider for the outliers

Prediction of heart disease risk score using classification and subset feature techniques

Additional attributes such as geolocation, diet, and physical condition for each user are merged into the data set

Data mining, feature selection (PCA), logistic regression, Naïve Bayes, decision trees, and k-nearest neighbors techniques are used in R and SAS. The best prediction model is chosen based on cross-validation, confusion matrix, ROC curve, and accuracy rate

Information Extraction & Sentiment Analysis using Text Mining and NLTK techniques

Scraped users, reviews, and ratings for different restaurants from TripAdvisor and OpenTable using Python to train the model

N-grams and a Naïve Bayes classifier (NLTK, Python) are trained on the data to classify the reviews and to predict restaurant traffic based on popularity on review websites.

Social Network Analysis to predict content strategy & brand of travel sites (TripAdvisor, Expedia, etc.)

Tweets about 6 travel websites, including profile and location, are collected through the Twitter API using R and Python.

Word clouds, n-gram associations, NLTK, and content mining are used, and the social network of the data is plotted to study influence

Analysis of Variance, A/B testing

•Designed an experiment to test people’s mood based on factors such as work, weather, and time of week. The data was collected by survey from different parts of the world and sampled. Experimental designs such as CRD, RBD, CRAC, and CRF were used to run ANOVA and covariance analysis between the factors.

Detecting Fraudulent Transactions

Extracted anomalous transaction reports that indicated fraud attempts by salespeople using data mining methods (KNN, decision trees, random forests)

Leadership & Achievements

Runner-up, 6sense Machine Learning Hackathon 2015

Worked on the census data set to predict people’s future occupations using SVM and GBM, and built the prediction engine
