Data Engineer

Location:
Syracuse, NY
Posted:
January 24, 2021

Resume:

EDUCATION

Syracuse University, School of Information Studies, Syracuse, NY Jan 2019 - Dec 2020

Master of Science in Applied Data Science, CGPA: 3.6

Coursework: Scripting for Data Analysis, Financial Analytics, Big Data Analytics, Time Series and Forecasting, Data Visualization, NLP

Mumbai University, Mumbai, India Jun 2013 - May 2017

Bachelor of Engineering in Information Technology, CGPA: 3.3

PROFESSIONAL EXPERIENCE

Graduate Research Assistant - Syracuse University, Syracuse, NY May 2020 – Aug 2020

• Assisted Prof Minjung Kwon in research by performing data mining and pre-processing of hospital information and medical data

• Created and maintained a high-quality data set to serve as the source of truth for the fundamental data captured, and implemented data preparation solutions for analysis and machine learning

• Designed a data blueprint to integrate, centralize and sustain data sources

Associate Software Engineer - ITIVITI Group AB, Mumbai, India July 2017 - Jan 2018

• Worked with customers, alongside Software Development staff, Tech Support staff, and end users, to devise quick, effective solutions to software problems

• Developed robust test cases and improved overall software quality to verify the accurate and timely exchange of financial information on securities trades over the FIX (Financial Information Exchange) protocol

TECHNICAL SKILLS

Certifications: AWS Certified Solutions Architect – Associate

Data Science Toolkit: R, Python, Java, C++, SQL, PySpark, Hadoop, MapReduce, Apache Spark, R Shiny

Libraries: Scikit-learn, Pandas, Seaborn, ggplot2, Caret, NumPy, Keras, Plotly, NLTK, Matplotlib

Machine Learning: Regression/Logistic Regression, Classification, Clustering, Decision Trees, Random Forest, SVM

Functional Skills: Requirements Gathering, Technical Design Specifications, Agile and Waterfall Methodologies

Tools: Tableau, PowerBI, AWS (S3, EC2, RDS), Google Analytics, Minitab, Adobe Illustrator, MS Excel, JIRA, Git

ACADEMIC PROJECTS

Data Warehouse for Order Fulfillment: - SQL Server SSIS SSAS PowerBI July 2020 - Dec 2020

• Created a star-schema Data Warehouse for Fudge Corporation in MS SQL Server, following the Kimball architecture

• Developed ETL pipelines using SSIS to stage and load data from OLTP systems into the data warehouse, created cubes using SSAS to enable efficient analytics for the business process, and built a PowerBI dashboard as the BI layer for visualization

Pneumonia Detection Using Chest X-Ray Imaging: - Deep Learning CNN SVM Shiny Keras Jan 2020 - May 2020

• Built a convolutional neural network-based solution in Python using Keras (TensorFlow) and SVM to predict whether a patient has pneumonia, achieving a 78% classification accuracy rate

• Deployed an interactive front-end solution using Shiny and demonstrated cross-environment connectivity between R and Python

TV News Commercial Detection: - PySpark Logistic Regression Random Forest GBT Aug 2019 - Dec 2019

• Built a classification model to detect TV commercials from audio and visual features using PySpark pipelines

• Used PySpark ML to train logistic regression with elastic net regularization, Random Forest, and gradient-boosted tree models, performed hyper-parameter tuning, and assessed generalization performance, with a model reaching a 95% testing AUC score

Fake News Classifier: - Text Mining NLP SVM Naïve Bayes May 2019 - June 2019

• Implemented a classification model using Naive Bayes and SVM algorithms to classify news articles

• Used techniques such as TF-IDF vectorization and lemmatization to improve feature engineering on unstructured data, boosting model accuracy to 76%

Improving Customer Satisfaction for Airlines: - R Caret SVM Random Forest Statistics Jan 2019 - May 2019

• Analyzed a large customer survey dataset from airline companies to improve customer ratings through data-driven decision making

• Generated actionable insights by pre-processing and cleaning the data, then applying linear regression and association rule mining to discover business rules affecting customer satisfaction

• Built predictive models and a linear regression to predict customer satisfaction, achieving an 86% accuracy rate

Event Management System: - MySQL SQL Server MS Access MS Visio Aug 2014 - Dec 2014

• Managed customer records and catalog of events by building a Database Management System with MySQL

• Applied OLTP database concepts such as the ER model, EER model, ACID properties and SQL queries to the description, creation, storage and optimization of the database, using MS Visio and MS Access

• Wrote SQL scripts and complex ETL queries to analyze key customer-record information in the database

YASH SHAH 315-***-****

adjn7l@r.postjobfree.com

https://www.linkedin.com/in/yash-h-shah/


