KARTHIK V SRINIVASAN
USA - I*** Approved +1-917-***-**** *******.************@*******.*** LinkedIn https://github.com/ksriniv2
PROFESSIONAL SUMMARY
Senior Data Engineering Manager with 21 years of experience, skilled in devising efficient and innovative solutions for high-volume data environments.
Specialties: Application Data Architecture, Design & Development; Business Intelligence and Data Warehouse Design; High-Volume Data Ingestion Pipelines for Financial Data; Game Telemetry Data Design; Unstructured Data Handling from Virtual Assistants; Game Data Marketing; ML Engineering.
Domain: Banking and Finance. Specialization: Credit Risk, Market Risk, Investment Management CRM, Compliance Solutions, Game Telemetry and Marketing, VaR Calculation.
PRIMARY SKILLS
● Data processing /ETL: Spark, AWS Glue
● Languages: Shell Scripting, SQL, Python
● Data lake storage: AWS S3
● Databases: Netezza, Oracle 11.x, SAP IQ, Exadata, BigQuery, Redshift
● Data warehouse / Analytics: Dremio, Kinetica, AWS Athena
● Orchestration: Airflow, SageMaker
● Data Lineage: Manta tools
● Real time: Kafka, Pub/Sub, AWS Kinesis
● Distributed data store: Hive, Impala, HDFS
● ML Engineering: SageMaker, Fargate-based Docker solutions
SUPPORTING SKILLS
● Scheduler: Autosys, Control-M
● CRM: Salesforce
● Reporting: Cognos, Business Objects, Spotfire, Tableau, AWS QuickSight
● Automated Deployment: Jules / Jenkins
● Microservices: Kubernetes, Docker, Skaffold
WORK HISTORY
MARCH 2021 - CURRENT
Sr. Data Engineering Manager - Amazon Games Studio, Alexa VA
● Managing the data engineering team and leading the effort to maintain and build the data infrastructure and storage solutions for games engagement data.
● Games telemetry data modeling and platform architecture for game scalability.
● Leading the data and storage re-architecture of the Alexa experience platform to keep the system future-ready.
● Designing architecture for burst data with very high throughput.
● Building automated tools for superior operational excellence on petabyte scale data storage solutions.
DECEMBER 2012 - APRIL 2021
Senior Data Engineering Lead - JPMorgan Chase, NY
● Designed and built a data lake / data warehouse analytics solution with Dremio / S3, combining nearline and cold-line data in S3 using Spark, HDFS, and the Hive metastore.
● Designed a Kafka-based event-triggering mechanism to run the Spark pipeline for various feeds.
● Designed a CI/CD tool for continuous integration of Dremio objects using its REST API and Python, deployed as a microservice on Kubernetes with Docker and Skaffold.
● Designed and developed a Kafka-based data egress mechanism for consumers / downstream systems.
Other projects:
● Designed and developed 80 TB data ingestion pipeline between SAP IQ and Kinetica using PySpark and HDFS.
● Designed and developed an ETL layer between Salesforce CRM and Greenplum using Informatica.
● Worked on data modeling, design, and requirement analysis of various ETL pipelines.
● Architected a lineage solution spanning multiple technology platforms on a decade-old system, enabling efficient impact analysis and reducing in-person dependency.
● Designed and architected data governance mechanisms for risk data.
Data Migration:
● Designed archival solution to unload data from SAP IQ into S3
● Designed and developed archival pipelines between AWS S3 and Oracle, SAP IQ, and the Hadoop file system.
● Built an in-house ETL tool to load close to 100 TB across SAP IQ environments with metadata management and incremental loads, saving six months of manual effort.
● Built an in-house ETL tool to load and transform data across Netezza systems.
OCTOBER 2004 - DECEMBER 2012
Data Engineering Lead - Infosys Technologies Ltd
● Built ETL and ELT pipelines across databases and flat files using tools such as Informatica, shell scripts, and PL/SQL.
● Helped clients including Mellon, Bank of America, and Merrill Lynch set up data warehouses, modeling data to persist close to 100 TB.
● Performed analysis, design, and development of various ETL processes and mentored teams across multiple countries for seamless delivery.
● Led the data and analytics team for a multi-year market risk data delivery for Bank of America.
● Managed a data team of 8 across the USA, Mexico, India, China, and Singapore.
EDUCATION
Master of Science: Software Systems
Birla Institute of Technology and Science, Pilani, India
Bachelor of Engineering: Computer Science
Madras University
ACCOMPLISHMENTS
● Hackathon winner: designed a solution to determine data lineage and answer business users' questions using AWS NLP libraries.
● Built an in-house ETL tool to load and transform PB-scale data across SAP IQ systems.
● Built a fuzzy comparison system for Salesforce and Oracle object metadata, reducing dependency on upstream changes and enabling the analysis team to automate metadata analysis.
● Built an in-house ETL tool to load and transform data across the Netezza system.
● Delivered Spark / big data training across JPMorgan technology centers.
● Built data ingestion tool from Quip for low volume data transfer.
● Built chatbot solution from Slack for redshift cluster management.
● Using Claude Gen AI to build a lineage solution integrating analytics with the backend system.
CERTIFICATIONS:
Machine Learning: Stanford Online
Google Cloud: Data Engineering
Tensorflow
Generative AI