Seelam Sai Vardhan Reddy
New Haven, CT ***** | ****************@*****.*** | 636-***-**** | LinkedIn
Professional Summary:
●Data & Network Engineer with 5 years of experience designing and managing cloud-native, data-centric, and high-performance network environments. Proficient in architecting ETL pipelines, managing hybrid cloud infrastructure (AWS, Azure, GCP), and implementing Spine & Leaf architectures, Cisco ACI, MPLS, BGP, and VXLAN fabrics. Adept at deploying secure, scalable data center networks and automating workflows with Python, Shell scripting, and CI/CD tools. Skilled in cross-team collaboration and in delivering robust solutions for real-time analytics, 5G systems, and enterprise workloads.
●Expertise in designing and building end-to-end ETL/ELT pipelines using tools such as Informatica, Talend, Matillion, Apache NiFi, and Apache Airflow.
●Skilled in real-time and batch data processing using Spark, PySpark, Scala, and Python on platforms like Databricks, HDInsight, and Hadoop YARN.
●Proficient in working with modern data warehouses/lakehouses including Snowflake, Redshift, BigQuery, and Synapse Analytics.
●Strong background in data migration from on-premises to cloud, ensuring data integrity, security, and minimal business disruption.
●Hands-on experience with AWS S3, Glue, EMR, Athena, and Redshift Spectrum, as well as Azure Functions, Blob Storage, and Cosmos DB.
●Implemented robust data governance, metadata management, and compliance using GCP Data Catalog, IAM roles, and secure VPC/VPN configurations.
●Created interactive dashboards and reports with Tableau, QlikView, DataStudio, and Kibana to deliver business insights.
●Developed CI/CD pipelines using Jenkins, Docker, and deployed scalable containers in Kubernetes environments.
●Experienced in writing complex SQL scripts, UDFs, and performing data validation, cleansing, and enrichment for analytics readiness.
●Automated large-scale data lake and data warehouse solutions using Azure Data Factory, Azure Data Lake, and Stream Analytics.
●Collaborated with cross-functional teams to support mission-critical data systems, troubleshoot issues, and ensure high system availability.
●Familiar with version control and issue tracking using GitHub, JIRA, and ServiceNow.
Technical Skills:
Networking: Cisco ACI, BGP, MPLS, Nexus 7k/9k, VXLAN, FEX, Spine & Leaf Architecture, Network Automation
Programming Languages: Python, Java, Scala, PL/SQL, Shell Scripting, C#, JavaScript, TypeScript, R, Go, Ruby, SQL, HTML, CSS
Cloud Platforms: AWS (S3, Redshift, EC2, Lambda, Glue, EMR, Athena, QuickSight, Lake Formation, RDS, SQS), Azure (Blob Storage, Synapse, Data Factory, Azure Functions, Azure Data Lake, Cosmos DB, Power BI), GCP (BigQuery, Cloud Storage, Cloud SQL, Pub/Sub, Data Catalog)
Data Processing Tools: Apache Spark, PySpark, Spark Streaming, Kafka, Flume, dbt, Hadoop YARN, MapReduce, Sqoop, Azure Stream Analytics, Databricks
Data Warehousing: AWS Redshift, Azure Synapse Analytics, Snowflake, Google BigQuery, Teradata
Database Systems: Oracle, MS SQL Server, MySQL, Teradata, DynamoDB, ElastiCache, Azure SQL Database, Azure Cosmos DB, Cassandra, HDFS, Hive, PostgreSQL, T-SQL
Real-Time Data Tools: Kafka, Spark Streaming, Azure Event Hubs, GCP Pub/Sub, AWS Glue Streaming, BQ-ML
ETL Tools: AWS Glue, Azure Data Factory, Talend, Informatica
Data Visualization: Amazon QuickSight, Microsoft Power BI, Tableau, Looker
Version Control: Git, GitHub, GitLab
CI/CD Tools: Jenkins, GitLab CI/CD, Azure DevOps, AWS CodePipeline
Containerization: Docker, Kubernetes, OpenShift
Load Balancers: AWS Elastic Load Balancer, Azure Load Balancer
Educational Qualifications:
University of Bridgeport Bridgeport, CT
●Master of Science Aug 2023 – Apr 2025
Lakireddy Balireddy College of Engineering Vijayawada, India
●Bachelor's in Electronics and Communications Engineering Dec 2021
Certifications:
●AWS Certified Data Engineer
Professional Experience:
Citibank Minneapolis, MN
AWS Data Engineer Sept 2024 – Present
Roles & Responsibilities:
●Designed and implemented robust AWS Data Pipelines to automate data transfer and transformation processes, enhancing efficiency in data management using AWS S3, AWS Glue, and Snowflake.
●Managed large-scale data warehouses using AWS Redshift and SQL Server, ensuring optimal data storage, retrieval, and scalability for enterprise applications.
●As part of the data migration effort, wrote SQL scripts to identify and reconcile data mismatches and loaded historical data from Teradata into Snowflake.
●Engineered and executed data processing scripts using PySpark, Scala, and Python, significantly improving data manipulation and batch processing in a Hadoop YARN environment (a representative PySpark sketch follows this section).
●Developed and managed ETL workflows using Informatica, enabling seamless data integration across multiple sources, and improving data accuracy and consistency for business reporting.
●Utilized Python scripting for data cleansing, transformation, and enrichment, ensuring high-quality data availability for analytical applications.
●Implemented comprehensive data backup and recovery solutions using AWS S3 and RDS, safeguarding critical business data against potential loss or corruption.
●Developed interactive dashboards and reports in Tableau and QlikView over data stored in Cassandra, providing actionable insights into business operations and customer behavior.
●Leveraged Sqoop to efficiently transfer bulk data between Hadoop and relational databases, enhancing data integration and consistency across platforms.
●Orchestrated the migration of enterprise data to cloud platforms, utilizing AWS Glue, Informatica, and Talend to ensure seamless data integration and consistency.
●Programmed complex ETL processes using Talend, Fivetran, and AWS Glue, facilitating the consolidation of data from multiple sources into a centralized repository.
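A minimal PySpark sketch of the batch cleansing pattern described above, assuming raw CSV files landed in S3 and a curated Parquet layer for downstream loads; the bucket names, paths, and column names are illustrative placeholders rather than actual project values.

```python
# Illustrative PySpark batch job: read raw CSV from S3, cleanse, write curated Parquet.
# Bucket names, paths, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("s3-batch-cleanse").getOrCreate()

# Read raw data landed in S3 (assumes the cluster already has S3 access configured)
raw = spark.read.option("header", True).csv("s3a://example-raw-bucket/transactions/")

# Basic cleansing and enrichment: trim strings, cast types, drop duplicates, tag load date
curated = (
    raw.withColumn("account_id", F.trim(F.col("account_id")))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropDuplicates(["transaction_id"])
       .withColumn("load_date", F.current_date())
)

# Write curated output partitioned by load date for downstream Redshift/Snowflake loads
curated.write.mode("overwrite").partitionBy("load_date").parquet(
    "s3a://example-curated-bucket/transactions/"
)
```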
Travelers Insurance Bridgeport, CT
Data Engineer Dec 2023 – Aug 2024
Roles & Responsibilities:
●Architected a scalable data warehouse solution using Azure Synapse Analytics and executed advanced analytical queries within the Azure environment.
●Participated in building and integrating a data lake on Azure Data Lake Storage to support a wide range of application and development needs.
●Enhanced operational workflows by scripting automation solutions with Azure Functions in Python (a minimal sketch follows this section).
●Utilized Azure HDInsight for big data processing across Hadoop clusters, efficiently using Azure Virtual Machines and Blob Storage.
●Developed and executed Spark jobs in HDInsight via Azure Notebooks to streamline large-scale data processing tasks.
●Built high-performance Spark applications in Python to run on HDInsight clusters, improving data handling efficiency.
●Deployed the ELK stack (Elasticsearch, Logstash, Kibana) on Azure to collect, analyze, and visualize website logs.
●Designed and deployed robust ETL processes using tools such as Apache NiFi, Talend, and Informatica to ingest data from APIs, flat files, and relational databases.
●Applied testing best practices by writing thorough unit tests with PyTest to ensure code reliability and maintainability.
●Created serverless architectures incorporating Azure API Management, Azure Functions, Azure Blob Storage, and Cosmos DB, with auto-scaling features for enhanced performance.
●Leveraged Azure Stream Analytics and Synapse Analytics to populate and manage data warehousing solutions efficiently.
●Programmed User Defined Functions (UDFs) in Scala to encapsulate and automate business logic within data applications.
●Built end-to-end Azure Data Factory pipelines to ingest, transform, and store data, seamlessly integrating with various Azure services.
●Ran Hadoop and Spark jobs on HDInsight using data stored in Azure Blob Storage to support distributed big data processing.
●Designed custom infrastructure using Azure Resource Manager (ARM) templates to deploy and manage pipelines effectively.
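A minimal sketch of the Azure Functions automation pattern referenced above, using the Python v1 programming model with a blob trigger; the container path and cleansing step are hypothetical, and the binding itself would be declared in function.json.

```python
# Illustrative Azure Function (Python v1 programming model) triggered by a new blob.
# The blob container and the cleansing logic are hypothetical placeholders.
import logging
import azure.functions as func


def main(myblob: func.InputStream):
    # Log basic metadata about the incoming blob
    logging.info("Processing blob: %s (%s bytes)", myblob.name, myblob.length)

    # Read the payload and apply a lightweight cleansing step (placeholder logic)
    text = myblob.read().decode("utf-8")
    cleaned_lines = [line.strip() for line in text.splitlines() if line.strip()]

    logging.info("Cleaned %d non-empty lines from %s", len(cleaned_lines), myblob.name)
```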
TATA Consultancy Services Hyderabad, India
Software Engineer Dec 2021 – Aug 2023
Roles & Responsibilities:
●Designed and implemented secure, scalable data storage and processing solutions using GCP Cloud Storage, BigQuery, Cloud SQL, and GCS Buckets.
●Built and automated data pipelines with Python, Shell scripts, and Matillion, enabling real-time ingestion, transformation, and reporting for healthcare data.
●Supported a database version upgrade by repointing all automation scripts from the legacy database to the new version, ensuring a smooth migration with minimal downtime.
●Developed federated queries across BigQuery and Snowflake to integrate multiple data sources for real-time analytics and business insights.
●Led end-to-end migration projects from on-premises systems to GCP Cloud, ensuring data integrity, compliance, and minimal business disruption.
●Administered IAM roles and VPC/VPN configurations to enforce strict healthcare security standards and network protection.
●Optimized Databricks clusters and monitored system performance to support high-performance distributed data processing and analytics.
●Utilized BQ-ML for predictive modeling and machine learning to enhance operational efficiency and decision-making.
●Established robust data governance using GCP Data Catalog for metadata organization, discoverability, and compliance.
●Created dynamic reports and dashboards with DataStudio to deliver actionable insights to stakeholders.
●Developed event-driven architectures using Pub/Sub for real-time alerting and system integration (a minimal publisher sketch follows this section).
●Managed cross-platform data transfers between Snowflake, MySQL, and other sources using GCP data transfer services.
●Collaborated across teams to ensure optimal Cloud SQL configuration, performance, and high availability.
●Resolved pipeline issues and supported critical operations by ensuring high system resilience during peak workloads.
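A minimal sketch of the Pub/Sub-based alerting pattern referenced above, using the google-cloud-pubsub client; the project ID, topic name, and event payload are hypothetical placeholders, not actual project values.

```python
# Illustrative Pub/Sub publisher for event-driven pipeline alerting.
# Requires the google-cloud-pubsub package and application default credentials.
import json
from google.cloud import pubsub_v1

PROJECT_ID = "example-healthcare-project"   # assumption, not a real project
TOPIC_ID = "pipeline-alerts"                # assumption, not a real topic

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)


def publish_alert(pipeline: str, status: str) -> str:
    """Publish a small JSON alert event and return the Pub/Sub message ID."""
    payload = json.dumps({"pipeline": pipeline, "status": status}).encode("utf-8")
    future = publisher.publish(topic_path, data=payload)
    return future.result()  # blocks until the service acknowledges the message


if __name__ == "__main__":
    print(publish_alert("daily-claims-load", "FAILED"))
```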
Zepto Hyderabad, India
Big Data Engineer Dec 2020 – Nov 2021
Roles & Responsibilities:
●Built and optimized data pipelines using Spark on AWS EMR to ingest and transform data from S3 and deliver curated datasets to Snowflake.
●Developed Spark SQL and PySpark scripts in Databricks for data extraction, transformation, and real-time processing using Spark Streaming.
●Designed and deployed end-to-end ETL/ELT workflows using Apache Airflow, integrating Snowflake, Snowpark, and AWS services (a minimal DAG sketch follows this section).
●Authored Python parsers to extract insights from unstructured data and automated content updates to databases.
●Created and maintained complex Snowflake SQL queries, defined roles, and optimized warehouse configurations for cost-efficient performance.
●Engineered Hadoop-based data workflows using HDFS, Sqoop, Hive, MapReduce, and Spark; supported Teradata-to-Hive incremental imports using Sqoop.
●Developed Java and Talend ETL jobs for data ingestion into Hadoop and Redshift, leveraging Talend Big Data and cloud components.
●Implemented disaster recovery plans and security controls for Snowflake, ensuring business continuity and compliance.
●Built CI/CD pipelines with Jenkins and Docker, and deployed containers to Kubernetes for scalable runtime environments.
●Migrated legacy on-premises applications to AWS, using EC2, S3, and CloudWatch for monitoring, logging, and alerting.
●Utilized AWS Athena, Redshift Spectrum, and S3 to enable serverless querying and virtual data lake architecture without traditional ETL.
●Collaborated using GitHub for version control and managed issues and change tickets through JIRA and ServiceNow.
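A minimal Airflow DAG sketch of the extract-transform-load orchestration referenced above; the task bodies are stubs, and the DAG name, schedule, and task boundaries are assumptions rather than the actual production workflow.

```python
# Illustrative Airflow DAG outlining a daily S3 -> transform -> Snowflake-load flow.
# Task bodies are stubs; connection IDs, bucket names, and schedule are assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_from_s3(**context):
    # Placeholder: pull the day's raw files from S3 (e.g., with boto3 or an S3 hook)
    print("extracting raw files for", context["ds"])


def transform_with_spark(**context):
    # Placeholder: submit a Spark/EMR job that curates the raw data
    print("transforming data for", context["ds"])


def load_to_snowflake(**context):
    # Placeholder: copy the curated data into Snowflake (e.g., via a Snowflake hook)
    print("loading curated data for", context["ds"])


with DAG(
    dag_id="daily_sales_elt",            # hypothetical DAG name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_from_s3", python_callable=extract_from_s3)
    transform = PythonOperator(task_id="transform_with_spark", python_callable=transform_with_spark)
    load = PythonOperator(task_id="load_to_snowflake", python_callable=load_to_snowflake)

    extract >> transform >> load
```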