
Data Engineering Lake

Location:
Santa Clara, CA
Posted:
December 15, 2023


Experienced professional with extensive expertise in orchestrating large-scale deliveries, engineering, technical solutions, and architecture, as well as business and risk management. My background includes talent development and resource management.

• Successfully led global cloud enablement initiatives, with a deep understanding of cloud data platforms such as Snowflake and of data-warehouse principles.

• Orchestrated numerous data programs spanning various technologies, including the development of applications and large-scale systems for clients such as Apple, HSBC, and Cummins.

• Proficient in crafting data-driven solutions across diverse domains including Technology, Retail, Manufacturing, and Finance.

• Directed a global technical team dispersed across different geographical locations, comprising Implementation Managers and Delivery Managers.

Certifications:

Snowflake SnowPro Core

AWS Certified Data Analytics – Specialty

Technologies and Languages

● Skills: Data Integration, Data Modeling, Data Engineering, Apache Kafka, Data Governance, Amazon Web Services

● Technologies: AWS Redshift, Snowflake, Microsoft Analytics Platform System, Cassandra, Python, Kafka, Pentaho, Kubernetes, Spark, Erwin Data Modeler, ER Studio 9.5, Informatica, BOXI, OBIEE, mBIDS, Oracle, PL/SQL, IDQ, Data Lakes

● Others: Program Planning and Management, Business & IT Liaison, Business Case Development, Release Management, Agile Delivery

Work Experience

Data Engineering Manager, Nielsen, Feb 2022–Present

Nielsen – Audience Measurement, Santa Clara, California

Description: The Nielsen Audience Measurement team collects, integrates, and delivers media metadata to the Media Data Lake. This reference data is used in products such as viewership analytics, along with Nielsen business insights delivered to various media clients.

Leading the Data Engineering team, responsible for:

• Spark Pipelines for Reference Data Transformation: I oversaw the development and deployment of Spark pipelines that transform and integrate reference data, improving the quality and usability of our data assets.

• Media Data Lake Migration and Gracenote Integration: I orchestrated the migration of media data to our Data Lake and its integration with Gracenote, strengthening our data ecosystem and enabling richer data insights.

• Data Governance: To ensure the integrity and accuracy of our data, I led the migration of historical data into the Data Lake, establishing a comprehensive repository for past and present data.

• Orchestration with AWS Data Pipelines, Airflow, SNS, and SQS: To streamline our data processing, I orchestrated workflows spanning AWS Data Pipelines, Airflow for workflow management, and SNS/SQS for communication between systems, ensuring timely and efficient data delivery.

Mangesh Kulkarni | ad1z8n@r.postjobfree.com | linkedin.com/in/mangeshkulkarnidwbi | Phone: 669-***-****

• Ad-Exposure and Business Insights Delivery: I played a pivotal role in orchestrating the delivery of critical ad-exposure data and business insights to our Media clients. This involved optimizing data delivery mechanisms for efficiency and accuracy.

• Machine Learning Service Orchestration: Leveraging Machine Learning, I orchestrated smart matching services for metadata, enhancing the accuracy and relevance of our data.

• External Geo-Distributed Stakeholder Management: I actively engaged with external stakeholders spread across geographical locations to manage reference data relationships effectively, ensuring alignment with their needs and requirements.

Solution Environment:

Airflow for orchestrating workflows.

Data Lake architecture for data management.

Python for data processing and manipulation.

AWS cloud services for scalability and flexibility.

Spark for efficient data transformation.

S3 Data Lake storage for robust data repositories.

Cassandra for effective data management.

Kubernetes microservices for scalable and adaptable solutions.

Data governance principles to ensure data quality and compliance with industry standards.

Data Engineering Manager, Apple (Contract), Sep 2020–Feb 2022

Global Business Intelligence, Sunnyvale, California

Description: Global Business Intelligence (GBI) is responsible for managing all data warehouses and analytics, developing new reports to meet changing business needs, and managing data induction, transformation, and loading based on business requirements. The ETL Framework is an in-house tool for integrating data across the complex Apple ecosystem.

Leading a team of 8 members:

• ETL framework migration from on-premises to internal and AWS cloud (9,000 data pipelines)

• Implementation of the GBI data bus (Kafka/S3) for approximately 80 TB of datasets

• Metadata data model conversion from Oracle to Cassandra

• Automation design and deployment for ETL pipeline migration to the cloud

• Framework Performance improvement

• Worked with the Apple cloud infrastructure team on infrastructure readiness for Kubernetes (K8s), Spark, Kafka, and Cassandra

• Roadmap creation and implementation

Solution Environment:

Airflow, Data Lake, Python, AWS, Spark, S3 Data Lake, Cassandra, Kubernetes microservices, Data Governance, Kafka

Data Integration Manager, Apple (Contract), Sep 2018–Sep 2020

Battery Data Analytics, Sunnyvale, CA

Description: The Battery Analytics team analyzes the correlation between in-field and manufacturing data and their impact on battery performance. The team integrated vendor datasets from manufacturing plants across China and correlated them with telemetry data and AppleCare repair datasets to create a data mart for a machine-learning platform supporting ANOVA and multivariate time-series analysis.

Solution Environment: ETL, PL/SQL, Python, Oracle, file-based data sources

Data Engineering Manager, HSBC (Contract), Dec 2016–Sep 2017

Strategic Analytics, New York, NY

Description: Semantic layer implementation using AWS Redshift and Pentaho Data Integration

Solution Environment: Amazon Web Services (S3, EC2, and Redshift), Pentaho Data Integration, Aginity, QlikView, ER/Studio

Highlights

Data Integration Manager, Silicon Valley Bank, Sep 2015–Nov 2016

CCAR, Santa Clara, CA / Pune, India

Description: SVB must comply with CCAR submissions to the Federal Reserve Board. This project was initiated for end-to-end CCAR delivery.

Solution Environment: Oracle 11g, SAP BODS, SAP Information Steward

Company: JetBlue Airways (TCS), Apr 2014–Aug 2015

Role: ETL and Data Architect

Responsibility: Data profiling, data modeling, ETL design

Solution Environment: Informatica 9.1, TIBCO BW, ER Studio Professional 10.0, Microsoft APS

Company: Cummins (TCS), Jan 2008–Mar 2014

Project: Cummins Global Logistics

Role: ETL and Data Architect & Project Management

Responsibility: Project and delivery management, data modeling, requirement analysis

Solution Environment: Informatica PowerCenter 8.6, OBIEE, Oracle 10g, mBIDS data model

Company: Morgan Stanley (TCS), Jan 2004–Dec 2007

Role: Business Intelligence Managed Services – Informatica Developer

Responsibility: ETL development and testing, performance tuning

Solution Environment: Informatica PowerCenter 7.1.3, Sybase, DB2, Unix shell scripting

Education and Certifications

● Bachelor of Engineering, Information Technology, Shivaji University, India 1999–2003

● Snowflake SnowPro Core, Nov 2021

● AWS Certified Data Analytics – Specialty, Feb 2022

● AWS Certified Solutions Architect – Associate, Jan 2018

● Oracle Certified Associate (Oracle 9i PL/SQL), Apr 2006

● NCFM – Financial Markets, Oct 2007

● Project Management Professional (expired), Aug 2011


