
Data Analyst Assistant

Location:
Boston, MA
Salary:
95000
Posted:
July 09, 2020


Resume:

SIDDHANT SAPTE

857-***-**** Boston, MA *****.*@************.*** LinkedIn: https://www.linkedin.com/in/siddhant-sapte/

EDUCATION

Master of Science in Analytics GPA: 3.7
Northeastern University Boston, MA, USA Apr 2020

Bachelor of Engineering in Electronics and Telecommunication
University of Mumbai Mumbai, India Jul 2015

SKILLS

Programming Languages : Python (pandas, numpy, seaborn, scikit-learn, scipy, tensorflow), R (shiny, dplyr, tidyr), SQL, NoSQL
Data Tools and Databases : Tableau, Power BI, QuickSight, Excel, Hadoop, Spark, Hive, ELK, MySQL, Oracle 11g, Redshift
Technologies : AWS, GCP, Databricks, Docker, Cloudera, Google Analytics, GitHub
Techniques/OS : Web Scraping, ETL, Agile (Scrum), Windows, Linux, Ubuntu

WORK EXPERIENCE

Research Assistant AI Skunkworks Northeastern University Boston, MA Apr 2020 – Present

• Built a dashboard with Dash (a Python framework for web applications) to visualize COVID-19 spread on a public website, and researched the doubling rate of the virus (https://covid19-visual-app.herokuapp.com/)

• Integrated different data sources with SSIS to dump COVID data into SSMS and created analytical reports in Tableau

Graduate Teaching Assistant (ML in R and Python) Northeastern University Boston, MA Jan 2020 – Apr 2020

• Mentored students on ML algorithms and code debugging, conducted lab sessions, and assisted the professor with assignments

System Analyst Larsen and Toubro Infotech Mumbai, India Mar 2017 – Jul 2018

• Conducted log analysis and event correlation using ELK (Elasticsearch, Logstash, Kibana) to reduce server errors by 30%

• Improved server efficiency by 10% by developing a Python script to scrape errors from server log and HTTP access log files

• Created Tableau dashboards showing server error reports to ensure security and performance of various applications

• Developed SQL queries in Oracle DB to bulk upload transaction data for SOA applications, reducing latency by 25%

Data Engineer Larsen and Toubro Infotech Mumbai, India Sep 2015 – Mar 2017

• Helped architect IoT solutions to analyze live streaming data and take action based on live events

• Transformed unstructured sensor data into a structured dataframe by performing ETL operations in Python

• Analyzed the transformed sensor data to carry out trend analysis and detect irregularities in sensor values

• Automated business intelligence IoT dashboards with Tableau depicting performance of sensors in applications

FREELANCE EXPERIENCE : Northeastern University

Data Analyst RedCrow Boston, MA Jan 2020 – Mar 2020

• Architected a pipeline to Dockerize a Flask application containing a machine learning model on an AWS EC2 instance

• Predicted startup success with a classification model, chosen by comparing the F-beta score, accuracy, and AUC of Logistic Regression, XGBoost, and SVM

Data Analyst Pluralpoint Inc Boston, MA Sep 2019 – Dec 2019

• Designed a pipeline to extract information from images into a CSV file using AWS Textract, with an accuracy of 90%

• Classified images based on appearance by building Convolutional Neural Networks (CNN) using Keras

• Preprocessed images with OpenCV to enhance image features important for text extraction, increasing clarity by 70%

Data Analyst VIACOM Inc Boston, MA Apr 2019 – Jun 2019

• Imputed CPM values for ad campaigns on different pages by applying the Random Forest algorithm, with an accuracy of 95%

• Identified the set of Facebook pages and posts giving the maximum conversion rate depending on demographics and ad placement, using K-Means clustering

• Pitched a pricing model to increase conversion rate depending on demographics and ad placement by a factor of delta

PROJECTS : Northeastern University GitHub: https://github.com/siddsapte12

OList E-commerce Data Analysis (AWS, PySpark, Python)

• Built an AWS Redshift data warehouse by dumping data from S3 data lakes and AWS RDS

• Created AWS data pipelines to sync incremental data by writing AWS Glue jobs for ETL and batch processing

• Used Lambda functions to trigger and automate the ETL and data syncing process

• Used AWS Athena for ad-hoc analysis and AWS QuickSight to make dashboards and KPI reports

Airbnb Analysis (PySpark, Databricks, AWS, S3)

• Leveraged Databricks to analyze data stored in AWS S3 and used Spark SQL to derive insights from the data

Serverless File Upload Web Application (Python, AWS, MySQL)

• Deployed a serverless web application to upload files through an optimized pipeline to an S3 bucket, reducing cost by 90%

• Integrated a Lambda function with API Gateway to upload images to S3 and store the user metadata in a MySQL database


