SRI SAI CHETAN SEERAM
Seattle, WA • +1-347-***-**** • ******************@*****.*** • linkedin.com/in/chetansai282001/
Data Engineer with 4 years of experience in database development, data warehousing, and big data technologies. Skilled in Hadoop, Spark, and Azure cloud services, with optimized data pipelines that delivered a 35% efficiency gain. Proficient in ETL processes, data modeling, and visualization using SQL, Power BI, and Tableau. Strong hands-on experience with CI/CD (Terraform, Jenkins), RDBMS (SQL Server, MySQL), NoSQL (MongoDB), and cloud platforms such as Azure Data Lake, Databricks, and Synapse Analytics. Certified in AI, Data Science, and Programming. Passionate about applying data engineering to impactful projects in healthcare and environmental conservation.
SKILLS
Programming Languages: Scala, Python, SQL, Java
Scripting Languages: PowerShell, Unix Shell
Databases: MS SQL Server, PostgreSQL, MongoDB, MySQL
Big Data Ecosystem: Hadoop, MapReduce, Hive, Redshift, DynamoDB, BigQuery, Kubernetes
ETL Tools: SSIS, Informatica
Cloud Technologies: GCP, AWS, Azure
Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow, Kafka
Reporting Tools: Tableau, Power BI, SSRS, Microsoft Excel
Methodologies: SDLC, Agile, Waterfall
IDEs: PyCharm, Jupyter Notebook, Visual Studio Code
PROFESSIONAL EXPERIENCE
Mastercard, TX, USA Aug 2023 – Present
Data Engineer
●Engineered a data ingestion framework using Spark, Python, and Azure Data Factory to process structured and semi-structured data.
●Designed ETL workflows for Azure Blob Storage, Data Lake, and APIs, loading data into Synapse and Cosmos DB while ensuring accuracy.
●Integrated REST APIs in Spark to fetch real-time financial data using JSON parsing, OAuth 2.0, and error handling for reliability.
●Established optimized data pipelines in Databricks with Delta Lake for ACID transactions, versioned storage, and schema evolution.
●Built batch and streaming data solutions via Kafka, Spark Streaming, and Event Hubs for real-time fraud detection and monitoring.
●Optimized queries and indexing in Azure SQL and Synapse using partitioning, materialized views, and adaptive query optimization.
●Automated ETL pipelines using Airflow and Azure Data Factory, reducing manual effort and ensuring seamless data workflows.
●Implemented data validation and anomaly detection with PySpark and Great Expectations to ensure financial transaction integrity.
●Enforced RBAC and encryption and used Azure Purview for data governance, supporting compliance with PCI DSS and GDPR.
●Tuned Spark jobs by optimizing shuffles, joins, caching, and auto-scaling clusters, reducing cloud costs and improving performance.
Vegesna Securities Private Limited June 2020 – May 2022
Data Engineer
●Built batch and real-time pipelines with Kafka, Pub/Sub, and Dataflow, ensuring low-latency financial data ingestion and processing.
●Automated event-driven workflows via Cloud Functions and Cloud Run, streamlining financial transaction processing.
●Optimized BigQuery partitioning and clustering, cutting query execution time and storage costs for multi-terabyte financial datasets.
●Implemented data validation and anomaly detection with Great Expectations and BigQuery SQL, ensuring accuracy in datasets.
●Designed a scalable financial data lake with GCS and BigQuery, supporting structured and unstructured data for predictive analytics.
●Engineered CDC and incremental ingestion via Debezium and BigQuery Streaming API for real-time sync of financial transactions.
●Leveraged Terraform and Cloud Composer (Airflow) to automate provisioning and orchestration, increasing deployment efficiency.
Robo Couplers Pvt. Ltd May 2021 – Sep 2021
Data Engineer Intern
●Engineered ETL pipelines using Python, SQL, and BigQuery, optimizing data transformations and analytics for large-scale healthcare datasets.
●Integrated AI pipelines with Vertex AI and Cloud Functions, enabling biometric data processing and facial recognition workflows.
●Developed real-time IoT data processing with Pub/Sub, Cloud Run, and BigQuery, reducing latency and improving responsiveness.
●Built API ingestion pipelines (RESTful APIs, Cloud Functions, MQTT), ensuring seamless biometric device-to-cloud connectivity.
●Automated monitoring and logging with Cloud Logging, Stackdriver, and Terraform, ensuring data pipeline traceability and scalability.
EDUCATION
George Mason University, VA, USA Aug 2022 – May 2024
Master’s in Information Systems, GPA: 3.71/4.00
The Institute of Aeronautical Engineering, India Aug 2018 – June 2022
Bachelor’s in Information Technology, GPA: 3.20/4.00