Resume

Data Engineer

Location:

Tampa, FL

Salary:

90000

Posted:

March 28, 2020

Resume:

Ankur Srivastava

***** ***** * ***** ****, Tampa, Florida, ZIP 336**-***-*** 3476 adcice@r.postjobfree.com LinkedIn

SUMMARY

Data Scientist and Data Engineer with more than 3 years of industry experience at Fortune 100 companies, skilled in Big Data, Data Engineering, Data Analysis, Data Manipulation, Data Pipelining, Data Visualization, Predictive Modelling and BI reporting. Seeking a full-time position.

EDUCATION

• University of South Florida, USA Master of Science, Business Analytics and Information Systems Aug 2018 – May 2020

Coursework: Data Mining, Statistical Analysis, Analytical Methods, Data Visualization, Data Warehousing, Advanced Database Management

• SRM University, India Bachelor of Technology, Information Technology Aug 2011 – May 2015

SKILLS

• Analytical Tools/Languages: T-SQL, Python, R, Tableau, Excel, Azure ML, Google Analytics, Snowflake, Databricks

• Databases: MS SQL Server, Oracle

• NoSQL: Cassandra

• Big Data: AWS EC2, Redshift, RDS, Apache Spark, Hadoop

• Python Libraries: pandas, NumPy, scikit-learn, Matplotlib, Seaborn, Plotly, NLTK, Statsmodels.

• ML Techniques: Regression, Classification, Clustering, Anomaly Detection, Dimensionality Reduction.

• ML Algorithms: Linear and Logistic Regression, Decision Tree, Neural Networks, SVM, KNN, Gradient Boosting.

PROFESSIONAL EXPERIENCE

Tech Data Corporation, Clearwater, Florida Data Science Intern September 2019 – March 2020

• Developed a predictive machine learning model in Python that selects the best carrier for each shipment to maximize company profit, then performed A/B testing, which showed a net saving of $800k annually.

• Used Snowflake as the data warehouse and performed ETL on millions of records using Apache Spark.

• Connected Snowflake to Tableau for visualization of multiple case scenarios for comparative analysis and report presentation to senior leaders.

Tech Data Corporation, Clearwater, Florida Data Engineering Intern May 2019 – August 2019

• Developed a pipeline to feed data from the SAP web portal into Azure, then used this data for predictive and descriptive modeling.

• Used Spark SQL to fetch and transform data from multiple large files. Created a data pipeline to study summary statistics using mapper and reducer programs on Databricks. Extracted data from multiple sources to flat files and loaded the data into the target database.

Cognizant Technology Solutions, Pune, India Programmer Analyst December 2015 – June 2018

• Worked as an ETL developer, responsible for analysis, design, development, implementation and unit testing of data warehousing using Informatica PowerCenter. Developed and modified ETL mappings using Informatica, and created the corresponding BRD and FRD documents.

• Ingested data, performed indexing, query optimization, and performance tuning, and built SQL queries to analyze KPI metrics.

• Developed Unix shell scripts for job automation, and developed Tableau visualizations of the time taken by each job for reporting.

Startup, New Delhi, India Market Analyst January 2013 – November 2015

• Performed keyword research and analysis using Google AdWords, competitor analysis, A/B testing of traffic, and website management with Google Analytics. Reached 30k+ organic visitors per month and reduced the bounce rate from 80% to 35%.

RESEARCH WORK / PROJECTS

Built a Deep Neural Network - Data Mining (Python)

• Built a generic L-layer neural network for classification from scratch and experimented with various hyper-parameters.

• Tested on the Wine, Iris, Abalone, Automobile, and Digits datasets; provided a comparative study of the accuracies obtained by the Deep Neural Network, SVM, Decision Tree, Naive Bayes, and K-Nearest Neighbors.

Sentiment Analysis of Medical Drugs – Machine Learning - Python, LR, Decision Trees, Natural Language Processing (NLP)

• Calculated sentiment scores for the drugs using text mining techniques (such as tokenization and stop word removal) in Python.

• Applied ML models such as Decision Trees and Linear Regression, and selected the best drug for different conditions.

Customer Churn Dataset – Big Data Analytics using Databricks - Spark SQL and MLlib

• Built a classification model on a customer churn dataset to predict future customer behavior, using the Azure Databricks platform.

• Used correlation analysis for feature engineering and variance-based principal component analysis to select the best columns for prediction.

• Created a pipeline that includes StringIndexing, LabelEncoding and StandardScaler to normalize the data for classification algorithms.

Crime Rate Analysis in the US (Tableau)

• Performed data cleansing and exploratory analysis of US crime data covering 20 years, and created visualizations of the results.
