Data Engineer

Location: Boston, MA
Posted: April 03, 2020


Resume:

SHUBHANGI CHANORE

https://github.com/schanore

adcl2e@r.postjobfree.com 617-***-**** https://www.linkedin.com/in/shubhangi-chanore/

EDUCATION

Master of Science in Information Systems, Northeastern University, Boston, MA GPA 3.30 Sep 2018-Apr 2020

Bachelor of Science in Electronics Engineering, G.H. Raisoni College of Engineering, India GPA 3.92 Aug 2010-Jun 2014

TECHNICAL SKILLS

Programming Languages: SQL, Python, C, T-SQL, UNIX Bash Scripting

Python Libraries: NumPy, Pandas, Scikit-learn, Seaborn, Matplotlib, H2O, tkinter

Database Development: SQL Server, MySQL, Oracle, PostgreSQL, TOAD, HDFS, Hive, Impala, Sqoop

ETL and Analytical: Power BI, Tableau, Talend, Alteryx, Microsoft Excel, SSIS

Testing: TOSCA Automation tool, HP ALM QC, QTP

Tools: GitHub, Bitbucket, Jira, Microsoft Azure, AWS, EC2, S3, Amazon SageMaker

EXPERIENCE

Granite Telecommunications Aug 2019-Dec 2019

Data Engineer Intern

• Migrated data from SQL Server and Oracle to Impala on a Hadoop cluster using Sqoop, increasing execution speed by 35%

• Owned an automation task for the Sqoop statement creation process using Python, reducing manual effort by 80%

• Wrote stored procedures on Impala, which increased the execution speed of that process by 40%

• Integrated Hive with Power BI to create interactive dashboards using a Cloudera ODBC connection on the Hue platform

• Implemented ETL (SSIS) to create jobs for extracting, cleaning, transforming and loading data into a data warehouse

• Created a data pipeline for a classification model using the PySpark library, which saved 25% processing time

HSBC Software Development

Data Analyst/Engineer Mar 2017-Jul 2018

• Created and maintained Informatica jobs to load data from Oracle and SQL Server databases into a SQL Server data warehouse

• Designed Cognos reports, sent at end of day and on an ad hoc basis, for further analysis by stakeholders

• Performed database development, testing, business analysis, and production support as part of a cross-functional POD

• Investigated business-impacting production issues using Unix and SQL and worked on the implementation of multiple releases

• Developed sub-queries, complex stored procedures, triggers, and views on Oracle and MySQL

Test Analyst Jul 2014-Mar 2017

• Oversaw a team of 5 on a tier 0 application and worked on performance, system integration, user acceptance, and regression testing, managing the complete testing cycle in Agile

• Coordinated with business stakeholders to understand business requirements and convert them into functional requirements

• Led a team automating the regression pack, saving around 80% of manual testing effort by leveraging the TOSCA tool

ACADEMIC PROJECTS

Data Warehousing and Business Intelligence for Retail Sales (Talend, Data Modeling, Tableau, Power BI) Sep 2019-Dec 2019

• Designed a master job in Talend to populate a retail data warehouse, loading 40 million rows in 20 minutes

• Implemented Source to Target Mappings, Data Profiling, ETL flows, Slowly Changing Dimensions, Reject Codes, Currency Conversion and Performance Tuning on data sourced from SORs (MySQL, SQL Server, Postgres, Oracle and Excel)

• Generated interactive dashboards to convey stories of retail sales using Tableau and Power BI

Crimes in Boston Data Analysis and Visualization (Python, NoSQL Datastore - CouchDB, Power BI) Jul 2019-Aug 2019

• Devised a data pipeline using Python to insert data from an Excel source into CouchDB in JSON format, reducing manual insertion effort by 60%

• Built interactive visualizations in Power BI by establishing API connectivity with CouchDB

Weather App (Python, Tkinter) Mar 2019-Jun 2019

• Created an application that fetches weather data based on a location entered by the user

• Integrated a weather API into the application and built the user interface with Python's tkinter library

Bank Marketing Analysis Using Python (NumPy, S3, Amazon SageMaker, Pandas, Matplotlib, Seaborn) Jan 2019-Feb 2019

• Performed data cleaning, data quality checks and data pre-processing on the raw data and applied logistic regression and classification techniques to predict whether a customer would subscribe to a term deposit

• Processed data using NumPy and Pandas and visualized results using the Matplotlib and Seaborn libraries


