Union City, CA
USD 80,000
May 16, 2020

PARESH GUPTA 510-***-****

Detail-oriented, focused, and data-driven professional with a demonstrated ability to deliver valuable insights via Data Analytics, Data Manipulation, and Predictive Modeling. Involved in Research and Development using Applied Statistics and Machine Learning.

Professional Skills:

Programming Languages: R, Python, SQL

Supervised Learning: Decision Tree, Naïve Bayes, Support Vector Machines, Nearest Neighbors, Ensemble Methods (Bagging, Boosting, Random Forest)

Unsupervised Learning: Hierarchical Clustering, K-means, DBSCAN

Statistical Techniques: Regression Analysis (Linear and Non-linear methods), ARIMA (Time Series Forecasting), Generalized Linear Models (GLMs), ANOVA

Software Applications/Tools: Microsoft SQL Server, RStudio, Jupyter, Tableau, Hadoop MapReduce, PostgreSQL

BI Tools: Tableau, Excel, Google Analytics, MS Access, MS Visio

Research Assistant at Syracuse University: January 2019 - Present

Sentiment Analysis:

Analyzing how emotions vary across individuals, evolve over time, and influence friendships made online, using unstructured datasets comprising 115+ million tweets and posts mined from Twitter and LiveJournal

Tasks:

- Identifying the proportion of positive and negative users; determining the influence of sentiment on friendships

- Developed a regression-based network model that predicted the distribution of positive and negative users of a real-world social network using multi-tier analysis at the user, ego, community, and network levels

Identifying social media bots:

Identifying profiles of bots and real users on Twitter; determining the proportion of fake followers of influencers and celebrities on the platform


- Built an SVM-based classifier to identify profiles of social media bots; achieved a 96% true positive rate

- Scraped profile information of 100+ million Twitter users using Twitter’s Streaming API

Data Science Projects:

Marketing Analytics: Customer Segmentation

- Applied the RFM (recency, frequency, monetary) modelling technique with the k-means algorithm to identify high-value and low-value customers of an online retail store with 541,000 customers

Text Analytics: Location Prediction using tweets

- Built a content-based classifier to predict the geo-location of tweets; evaluated performance using accuracy, precision, and recall; achieved 74% accuracy

- Identified words that are used more commonly in specific geographical locations

Topic Modelling (NLP): Amazon Fine Foods Reviews

- Summarized a dataset consisting of 568,000 reviews by grouping them into 10 clusters/topics

- Identified the 5 most frequently used words in each topic

SQL: HealthCare Center Database

- Designed an efficient database for a healthcare center using SQL Server, covering the center’s employees and patients and including an integrated billing system; designed E-R diagrams; implemented tables, columns, and relationships in third normal form (3NF); addressed issues through various test cases (views, triggers, functions, scripts)

Education:

Syracuse University

Master of Science, Engineering Management; Graduated: May 2019

Guru Gobind Singh Indraprastha University, New Delhi

Bachelor of Technology, Electronics and Communications; Graduated: June 2017

Part-Time Work Experience:

Student Supervisor at Syracuse University: September 2017 - May 2019

- Led teams of 7-10 students during various sporting events held at the Carrier Dome

