Data Engineer

Santa Clara, California, United States
February 17, 2018

Neha Pawar

**** **** *, ******* ******, Santa Clara, CA 95051, 408-***-****

Graduate student with 5 years of experience in Data Warehousing, Analytics and Big Data Programming. As a data-savvy storyteller, I am looking for opportunities to apply my previous work experience and graduate coursework to build end-to-end data pipelines and discover insights from data.

EDUCATION

Santa Clara University, Leavey School of Business, Santa Clara, CA

Master of Science in Information Systems (MSIS), June 2018

Veermata Jijabai Technological Institute, Mumbai, India

Bachelor of Technology, Mechanical Engineering, May 2011

TECHNICAL SKILLS

• Data Mining tools: Python (NumPy, pandas, scikit-learn)

• Visualization: Tableau, Seaborn

• Distributed Computing: Hadoop, Apache Spark (PySpark)

• Tools & Platforms: AWS Redshift, EMR, Spark, Jupyter Notebook, Hive, Athena, Presto, ETL, ELT, Informatica PowerCenter, Docker, Git, MS Office

• Databases: Oracle, Netezza, SQL Server, MongoDB

• Operating Systems: Windows, Unix (Linux), macOS

• Other skills: Data Wrangling, SQL, Requirements Gathering & Analysis, Dimensional and Fact Data Modeling, Data Warehousing and Analysis, SDLC - Agile/Scrum and Waterfall, Machine Learning (Supervised Learning: Regression, Logistic Regression, Decision Tree)

RELEVANT COURSEWORK

DBMS, Cloud Computing, Big Data Modeling & Analytics, Object Oriented Programming, Data Science with Python, Dashboards Scorecards and Visualizations, Telecomm and Business Networks, Computer Based Decision Models, Information Systems Analysis and Design, Machine Learning, Software Project Management

WORK EXPERIENCE

Bank of America Mumbai, India

Senior Data Engineer 2014-2016

• Worked as an architect on data warehousing projects: gathered requirements with business analysts, designed, implemented and maintained ETL, supported existing production processes, communicated with stakeholders and collaborated with cross-functional teams and vendor partners

• Spearheaded data management activities for onboarding a legacy application onto an internal CRM tool for multiple lines of business to track sales-activity KPIs and report sales metrics

• Analyzed around 200 jobs and revised the standard operational stored procedures that load the Netezza schema, improving performance and scalability across 10 functional areas

• As part of a Master Data Management (MDM) initiative, replaced a set of scattered, manually executed scripts with an automated process that removed bottlenecks, improved data quality and ensured data consistency, resulting in significant cost savings for the line of business

• Took an active part in production support and troubleshooting: fixed highly critical issues (50+ high-priority and 350+ user issues), monitored performance, debugged and enhanced code, and worked with the data infrastructure team to triage infrastructure issues and drive them to resolution

• Mentored new college graduates and new team members on the adoption of data warehouse standards and best practices, reducing onboarding time by 30%; worked with the technical panel to evaluate software analyst candidates

Bank of America Mumbai, India

Data Engineer 2011-2014

• Worked as an ETL developer across Data Analytics and Data Warehousing solutions, including database issues, data modeling, end-to-end data pipeline development, metadata management, data migration, impact analysis, reverse engineering, debugging and managing deployments

• Implemented enhancements and performance tuning that decreased job cycle time by 67%; converted Type 2 dimensions to Type 1 dimensions, which reduced the daily data load time by 40%

• Performed data mining on large datasets to support business recommendations, forecasting and impact analysis; created visual displays and delivered them to stakeholders

• Played a key role in creating a centralized data model that lets users self-serve through an automated reporting process, reducing their need to go to the reporting and analytics team; analyzed data to identify deliverables, gaps and inconsistencies

• Engineered fast, real-time data pipelines to process data from across the financial markets and internal event streams

• Planned, designed and executed a Proof of Concept (POC) for moving from Oracle to Netezza using ETL for data extraction and transformation

ACADEMIC PROJECTS

• Used Machine Learning techniques such as Classification, Regression and Clustering to derive interesting patterns from datasets on Flights and Weather, Mental Health Patient Services, and the effect of Student Alcohol Consumption on Grades.

• Analyzed White House staff salaries for Obama and Trump administrations, performed data wrangling in Python and created dashboards and advanced graphical visualizations by using techniques for guided analytics, interactive dashboard design, and visual best practices in Tableau.

• Analyzed movie ratings using MapReduce and Spark to find unique users, unique movies, users with the highest average rating, the number of users who rated within a given range of movies and users who rated identical movies.

• Created Tableau visualizations on “Which City is better to buy House in”; performed data gathering, cleaning and wrangling in Python; derived metrics and KPIs to compare and contrast data; critiqued and redesigned multiple visualizations to gain a clearer understanding of visualization aesthetics.

• Compared cloud vendors AWS and Microsoft Azure by deploying a scalable, reliable and highly available WordPress site on each; explored IaaS, SaaS and PaaS offerings such as compute, storage, database, security and application services; performed a comparative analysis of the two vendors' offerings and pricing.
