KARTHIK V SRINIVASAN
USA - I*** Approved +1-917-***-**** *******.************@*******.*** LinkedIn https://github.com/ksriniv2
PROFESSIONAL SUMMARY
Senior Data Engineering Manager with 21 years of experience, skilled in devising efficient and innovative solutions for high-volume data environments.
Specialties: Application Data Architecture, Design & Development; Business Intelligence and Data Warehouse Design; High-Volume Data Ingestion Pipelines for Financial Data; Game Telemetry Data Design; Unstructured Data Handling from Virtual Assistants; Game Data Marketing; ML Engineering.
Domain: Banking and Finance. Specialization: Credit Risk, Market Risk, Investment Management CRM, Compliance Solutions, Game Telemetry and Marketing, VaR Calculation.
PRIMARY SKILLS
● Data processing /ETL: Spark, AWS Glue
● Languages: Shell Scripting, SQL, Python
● Data lake storage: AWS S3
● Databases: Netezza, Oracle 11.x, SAP IQ, Exadata, BigQuery, Redshift
● Data warehouse / Analytics: Dremio, Kinetica, AWS Athena
● Orchestration: Airflow, SageMaker
● Data Lineage: Manta tools
● Real time: Kafka, Pub/Sub, AWS Kinesis
● Distributed data store: Hive, Impala, HDFS
● ML Engineering: SageMaker, Fargate-based Docker solutions
SUPPORTING SKILLS
● Scheduler: Autosys, Control-M
● CRM: Salesforce
● Reporting: Cognos, Business Objects, Spotfire, Tableau, AWS QuickSight
● Automated Deployment: Jules / Jenkins
● Microservices: Kubernetes, Docker, Skaffold
WORK HISTORY
MARCH 2021 - CURRENT
Sr. Data Engineering Manager - Amazon Games Studio, Alexa VA
● Managing the data engineering team and leading the effort to maintain and build the data infrastructure and storage solutions for games engagement data.
● Games telemetry data modeling and platform architecture for game scalability.
● Leading the data and storage re-architecture of the Alexa experience platform to keep the system future-ready.
● Designing architecture for burst data with very high throughput.
● Building automated tools for superior operational excellence on petabyte scale data storage solutions.
DECEMBER 2012 - APRIL 2021
Senior Data Engineering Lead - JPMorgan Chase, NY
● Designed and built a data lake / data warehouse analytics solution with Dremio / S3, combining nearline and cold-line data in S3 using Spark, HDFS, and the Hive metastore.
● Designed a Kafka-based event-triggering mechanism to run the Spark pipeline for various feeds.
● Designed a CI/CD tool for continuous integration of Dremio objects using its REST API and Python, deployed as a microservice on Kubernetes with Docker and Skaffold.
● Designed and developed a Kafka-based data egress mechanism for consumers / downstream systems.
Other projects:
● Designed and developed 80 TB data ingestion pipeline between SAP IQ and Kinetica using PySpark and HDFS.
● Designed and developed an ETL layer between Salesforce CRM and Greenplum using Informatica.
● Worked on data modeling, design, and requirement analysis of various ETL pipelines.
● Architected a lineage solution spanning multiple technology platforms on a decade-old system, enabling efficient impact analysis and reducing in-person dependency.
● Designed and architected data governance mechanisms for risk data.
Data Migration:
● Designed archival solution to unload data from SAP IQ into S3
● Designed and developed archival pipelines between AWS S3 and Oracle, SAP IQ, and the Hadoop file system.
● Built an in-house ETL tool to load close to 100 TB across SAP IQ environments with metadata management and incremental loads, saving six months of manual effort.
● Built an in-house ETL tool to load and transform data across Netezza systems.
OCTOBER 2004 - DECEMBER 2012
Data Engineering Lead - Infosys Technologies Ltd
● Built ETL and ELT pipelines across databases and flat files using tools such as Informatica, shell scripts, and PL/SQL.
● Helped clients including Mellon, Bank of America, and Merrill Lynch set up data warehouses, modeling data to persist close to 100 TB.
● Performed analysis, design, and development of various ETL processes and mentored teams across multiple countries for seamless delivery.
● Led the data and analytics team for a multi-year market risk data delivery for Bank of America.
● Managed a data team of 8 across the USA, Mexico, India, China, and Singapore.
EDUCATION
Master of Science: Software Systems
Birla Institute of Technology and Science, Pilani, India
Bachelor of Engineering: Computer Science
Madras University
ACCOMPLISHMENTS
● Hackathon winner: designed a solution to determine data lineage and answer business users' questions using AWS NLP libraries.
● Built an in-house ETL tool to load and transform PB-scale data across SAP IQ systems.
● Built a fuzzy comparison system for Salesforce and Oracle object metadata, reducing dependency on upstream changes and enabling the analysis team to automate metadata analysis.
● Built an in-house ETL tool to load and transform data across the Netezza system.
● Delivered Spark / big data training across JPMorgan technology centers.
● Built data ingestion tool from Quip for low volume data transfer.
● Built chatbot solution from Slack for redshift cluster management.
● Using Claude Gen AI to build a lineage solution integrating analytics with the backend system.
CERTIFICATIONS:
Machine Learning: Stanford Online
Google Cloud: Data Engineering
Tensorflow
Generative AI