Social Media Data

Jamaica Plain, MA
May 20, 2020


AYUSH JAIN

Boston *** -891-2001

EDUCATION

Master’s in Data Analytics (GPA: 3.95/4.0), Northeastern University, Sep. 2019 – Apr. 2021 (Expected)

Bachelor of Technology, Computer Science, Uttar Pradesh Technical University, Jul. 2014

SKILLS

Certifications: Python, R, Machine Learning, Data Science, Advanced Excel, Tableau, MySQL, Power BI, Storytelling, Azure

Tools & Frameworks: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, TensorFlow, Hadoop, Azure, Agile Scrum, Databricks, Apache Spark, Amazon S3, Jupyter Notebook, Google Colab, Git, Gephi, Hadoop Ecosystem, MLlib, RStudio

Techniques: Data Analysis, Data Visualization, Data Mining, Regression, Classification, Time Series Forecasting, Random Forest, Bayesian & Naïve Bayes, Hypothesis Testing, Ensemble, ANN, CNN, PCA, MapReduce, Monte Carlo Simulation

WORK EXPERIENCE

ALARIK KINESIS, New Delhi, India, Feb. 2017 – Aug. 2019

HIGH-FREQUENCY TRADING PLATFORM, Senior Research Analyst

• Directed a team of 7 in gathering software requirements; developed and implemented an automated trading strategy

• Parsed, cleaned and transformed time-series data, stored in Cassandra via Kafka and extracted through a REST API, for in-depth analysis

• Designed a predictive model for prices using Linear Regression and Random Forest Regression, and identified the factors affecting prices

• Identified instability caused by certain factors; optimized the model using selection methods and feature engineering, improving accuracy by 3.2%

• Used quantitative tools to support real-time research insights through extensive analysis of intraday trade data logs

• Prepared time-series forecasts and an ARIMA model to find market trends and anomalies in stock data and resolve business problems

• Anticipated market trends through the volatility index and stock trend indicators, improving profitability by 4%

ADYAH INDUSTRIES, New Delhi, India, Aug. 2014 – Jan. 2017

RESEARCH AND DEVELOPMENT, Research Analyst

• Compiled intraday trade order books using technical indicators including MACD, Bollinger Bands, Stochastics and RSI

• Examined oil well data in Excel and performed statistical analysis to forecast the cost of oil extraction

• Managed and coached the creation of marketing content for the social media marketing program

• Created content for social media channels such as Twitter, Facebook, Instagram, Google+, LinkedIn and Pinterest

• Prioritized SEO keywords and integrated them into the content used for marketing plans

PROJECTS

Airline Market Analysis, May 2020

• Managed 27 GB of historical data from BTS, covering the last 20 years and more than 128 million rows, by loading it into an Amazon S3 bucket

• Extracted the files from the S3 bucket through a REST API, transformed them into tables, and merged all 240 files into a single variable

• Optimized Spark queries by repartitioning the DataFrame and pulling datasets into a cluster-wide in-memory cache

• Implemented Apache Spark and managed Databricks for data integration, interactive and advanced analytics, and real-time processing

• Performed hypothesis-driven EDA and identified factors that impacted the US airline industry, including season, bookings, delays, the 2008 global recession, COVID-19, and anticipated fraud

• Resolved challenges caused by limited storage space, limited CPU compute power, and a single-node cluster

Dog Breed Image Classification, April 2020

• Gathered, imported and preprocessed a dataset of 10,000 dog images across 120 breeds in Google Colab

• Converted image labels into Boolean arrays and then into integer values, and turned the images into tensors (arrays as lists of lists)

• Normalized the tensor data and designed the model using the Keras API; split the dataset into batches due to memory limitations

• Integrated callbacks to track training progress and early stopping to halt training once the model stops improving, preventing overfitting

• Set the number of epochs (how many passes the model makes over the data) as a hyperparameter, trained the model, and applied it to the whole dataset

• Developed a deep learning model that identifies and classifies dog breed images using TensorFlow Hub, TensorBoard, and Keras

• Evaluated the model by making predictions and comparing them with the ground-truth labels, obtaining insightful outcomes

Melbourne Housing Market, Jan. 2020

• Performed EDA, found correlation patterns between attributes of the housing data, and designed a predictive model

• Formulated a regression model with an accuracy of 46.86%; applied feature engineering and selection methods that raised accuracy to 62.13%

• Used the Random Forest Regressor algorithm, which increased model accuracy to 74.06%; hyperparameter tuning with cross-validated RandomizedSearchCV increased accuracy to 80.06%