
KARTHIK V SRINIVASAN

Ashburn, VA | +1-917-***-**** | ad2fr2@r.postjobfree.com | LinkedIn | https://github.com/ksriniv2

Work Authorization: Canadian PR; USA I-140 approved

PROFESSIONAL SUMMARY

Senior data engineering manager with 19 years of experience, skilled in devising efficient, innovative, and unconventional solutions for high-volume data environments.

Specialties: application data architecture, design, and development; business intelligence and data warehouse design; high-volume data ingestion pipelines for financial data; game telemetry data design; unstructured data handling from virtual assistants; game data marketing; ML engineering.

Domain: Banking and Finance. Specialization: credit risk, market risk, investment management CRM, compliance solutions, game telemetry and marketing, VaR calculation.

Primary Skills

●Data processing / ETL: Spark, AWS Glue

●Languages: Shell scripting, SQL, Python

●Data lake storage: AWS S3

●Databases: Netezza, Oracle 11.x, SAP IQ, Exadata, BigQuery, Redshift

●Data warehouse / analytics: Dremio, Kinetica, AWS Athena

●Orchestration: Airflow, SageMaker

●Data lineage: Manta

●Real-time: Kafka, Pub/Sub, AWS Kinesis

●Distributed data store: Hive, Impala, HDFS

●ML engineering: SageMaker, Fargate-based Docker solutions

Supporting Skills

●Scheduler: Autosys, Control-M

●CRM: Salesforce

●Reporting: Cognos, Business Objects, Spotfire, Tableau, AWS QuickSight

●Automated deployment: Jules / Jenkins

●Microservices: Kubernetes, Docker, Skaffold

WORK HISTORY

March 2021 - Current

Sr. Data Engineering Manager, Amazon Games Studio / Alexa VA

●Managing the data engineering team and leading the effort to build and maintain the data infrastructure and storage solutions for games engagement data.

●Modeling games telemetry data and architecting the platform for games scalability.

●Leading the data and storage re-architecture of the Alexa experience platform to keep the system future-ready.

●Designing architecture for bursty data with very high throughput.

●Building automated tooling for operational excellence on petabyte-scale data storage solutions.

December 2012 - April 2021

Senior Data Engineering Lead, JPMorgan Chase, NY

●Designed and built a data lake / data warehouse analytics solution with Dremio and S3, combining nearline and coldline data in S3 using Spark, HDFS, and the Hive metastore.

●Designed a Kafka-based event-triggering mechanism to run the Spark pipelines for various feeds.

●Designed a CI/CD tool for continuous integration of Dremio objects using its REST API and Python, running as a microservice on Kubernetes with a Docker and Skaffold setup.

●Designed and developed a Kafka-based data egress mechanism for downstream consumers.

Other projects:

●Designed and developed an 80 TB data ingestion pipeline between SAP IQ and Kinetica using PySpark and HDFS.

●Designed and developed an ETL layer between Salesforce CRM and Greenplum using Informatica.

●Worked on data modeling, design, and requirements analysis for various ETL pipelines.

●Architected a lineage solution across technology platforms on a decade-old system, enabling efficient impact analysis and reducing dependence on individual experts.

●Designed and architected data governance mechanisms for risk data.

Data Migration:

●Designed an archival solution to unload data from SAP IQ into S3.

●Designed and developed archival pipelines between AWS S3 and Oracle, SAP IQ, and the Hadoop file system.

●Built an in-house ETL tool to load close to 100 TB across SAP IQ environments with metadata management and incremental loads, saving six months of effort.

●Built an in-house ETL tool to load and transform data across Netezza systems.

October 2004 - December 2012

Data Engineering Manager and Lead, Infosys Technologies Ltd

●Built ETL and ELT pipelines across databases and flat files using tools such as Informatica, shell scripts, and PL/SQL.

●Helped clients such as Mellon, Bank of America, and Merrill Lynch set up data warehouses and model data to persist close to 100 TB.

●Performed analysis, design, and development of various ETL processes and mentored teams across multiple countries for seamless delivery.

●Led the data and analytics team for a multi-year market risk data delivery for Bank of America.

●Managed a data team of eight across the USA, Mexico, India, China, and Singapore.

EDUCATION

Master of Science: Software Systems

Birla Institute of Technology and Science, Pilani, India

Bachelor of Engineering: Computer Science

Madras University

ACCOMPLISHMENTS

●Winner of a hackathon: designed a solution to determine data lineage and provide useful answers to business users using AWS NLP libraries.

●Built an in-house ETL tool to load and transform PB-scale data across SAP IQ systems.

●Built a fuzzy comparison system for Salesforce and Oracle object metadata, reducing dependency on upstream changes and enabling the analysis team to automate metadata analysis.

●Delivered Spark / big data training across JPMorgan technology centers.

●Built a data ingestion tool for low-volume data transfer from Quip.

●Built a Slack chatbot solution for Redshift cluster management.

CERTIFICATIONS

Machine Learning: Stanford Online

Google Cloud: Data Engineering

TensorFlow

Generative AI: in progress


