
Real-Time Big Data

Location:
Cincinnati, OH
Posted:
February 14, 2025


Resume:

Sai Sravanthi Ponigate
Mobile: +1-220-***-****

LinkedIn: https://www.linkedin.com/in/sravanthi-ponigate/
Github: https://github.com/pssravanthi
Email: *********@*****.***

Education

Master of Engineering (MEng) in Computer Science | August 2023 – May 2025
University of Cincinnati, Cincinnati, USA

Technical Skills

Programming Languages: Python, Java, Scala, SQL.

Big Data & Streaming Technologies: Apache Spark, Hadoop (HDFS, Hive, HBase, Oozie, Sqoop), Kafka (Streams, Zookeeper), Airflow, PySpark, Pandas, NumPy, Hue, Flume, Cloudera.

Cloud Platforms: Azure (ADF, Synapse, Data Lake, Databricks, Blob Storage, Event Hubs, Functions), AWS (IAM, S3, EC2, Lambda, Glue, Redshift, EMR), GCP (Cloud Storage, Dataproc, Composer, BigQuery).

DevOps & CI/CD: Docker, Kubernetes, Terraform, Ansible, Airflow, Jenkins, Git, ServiceNow, SonarQube.

Monitoring & BI Tools: Grafana, Prometheus, Tableau, Looker, Power BI.

Databases & Big Data Technologies: PostgreSQL, MySQL, MongoDB, Oracle, Snowflake, SQLite, DynamoDB, Cassandra.

Work Experience

Tata Consultancy Services | September 2021 – July 2023
Software Engineer - Data | Hyderabad, India

Tech Stack: Apache Spark, Hadoop, HDFS, Azure Data Factory, Databricks, Synapse, Azure SQL, Snowflake, Kafka, Kubernetes (AKS), Docker, Tableau, Oozie, Hive, Cosmos DB, Oracle DB.

• Designed and optimized high-performance ETL pipelines using ADF, Databricks (PySpark), and Snowflake, improving data processing efficiency by 30% while reducing storage costs by 20% through schema optimization.

• Built large-scale data processing solutions with Hadoop, Apache Spark, and HDFS, optimizing query performance by 40% and enabling real-time analytics with Hive, Oozie, and Azure Synapse.

• Developed scalable, microservices-based data workflows using Docker, Kubernetes (AKS), and Azure DevOps, cutting ETL deployment time by 50% and streamlining CI/CD pipelines.

• Engineered real-time data streaming pipelines with Apache Kafka, Spark Streaming, and Zookeeper, exposing processed data via REST APIs and reducing incident response time by 35%.

• Orchestrated complex data workflows using Apache Airflow, automating data ingestion, transformation, and pipeline monitoring to ensure high availability and fault tolerance.

• Automated infrastructure provisioning with Terraform and Ansible, cutting manual provisioning efforts by 60%, and integrated Jenkins CI/CD automation for seamless deployments.

• Optimized data ingestion workflows using Sqoop and Flume, ensuring smooth migration of on-premise Hadoop workloads to Azure Data Lake and Snowflake.

• Led technical PoCs and cross-functional collaboration, improving Hadoop-based distributed data processing, enhancing BI capabilities with Tableau, and presenting architecture improvements to stakeholders.

Projects

Real-Time Data Streaming Pipeline | Apache Kafka, Spark, Airflow, PostgreSQL

Tech Stack: Apache Kafka, Apache Spark (Streaming), Airflow, PostgreSQL, Zookeeper, Docker, Cassandra.

• Designed and deployed a real-time data streaming pipeline leveraging Kafka, Spark Streaming, and Airflow, reducing processing latency by 50% and improving event-driven analytics.

• Automated ETL workflows using Apache Airflow DAGs, enabling seamless data ingestion into PostgreSQL and real-time processing in Cassandra.

• Containerized and automated deployments with Docker & Docker Compose, improving pipeline scalability and reducing infrastructure setup time by 40%.
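The windowed aggregation at the heart of a pipeline like this can be sketched in plain Python. This is an illustrative stand-in only: in the project above, equivalent logic would run inside Spark Streaming with Kafka as the source before results land in PostgreSQL/Cassandra, and the event schema (`ts`, `type` fields) and 60-second window here are assumptions, not the project's actual code.

```python
import json
from collections import defaultdict

def tumbling_window_counts(raw_events, window_seconds=60):
    """Group JSON-encoded events into fixed (tumbling) windows by event
    timestamp and count events per (window_start, event_type) pair --
    the same shape of aggregate a streaming job would emit downstream.
    Event schema and window size are illustrative assumptions."""
    counts = defaultdict(int)
    for raw in raw_events:
        event = json.loads(raw)  # Kafka delivers serialized messages
        # Align the timestamp to the start of its window bucket.
        window_start = event["ts"] - (event["ts"] % window_seconds)
        counts[(window_start, event["type"])] += 1
    return dict(counts)

# Three events: two fall in the [0, 60) window, one in [60, 120).
messages = [
    '{"ts": 10, "type": "click"}',
    '{"ts": 45, "type": "click"}',
    '{"ts": 70, "type": "view"}',
]
print(tumbling_window_counts(messages))
# {(0, 'click'): 2, (60, 'view'): 1}
```

In a real Spark Structured Streaming job the same grouping would be expressed with a `window()` aggregation over a Kafka source, with watermarking to bound late data.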

Real-Time Log Processing Pipeline | Apache Kafka, Spark, Snowflake, FastAPI
Github: Real-Time Log Processing

Tech Stack: Apache Kafka, Apache Spark, FastAPI, Snowflake, Docker, Kubernetes, Prometheus, Grafana.

• Developed a real-time fraud detection system using Kafka & Spark, enabling sub-second anomaly detection and reducing financial fraud risk by 40%.

• Engineered a log processing pipeline integrating FastAPI and Snowflake, ensuring structured log storage & efficient retrieval for business intelligence teams.

• Implemented real-time monitoring using Prometheus & Grafana, providing live dashboard insights and anomaly detection in log data.
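The anomaly-detection idea behind this project can be illustrated with a dependency-free rolling z-score check on per-interval event counts. This is a simplified, single-process stand-in for what would run on the Spark side of such a pipeline; the window size, threshold, and sample counts below are illustrative assumptions, not values from the project.

```python
import math
from collections import deque

class RateAnomalyDetector:
    """Flag a per-interval event count as anomalous when it deviates
    from the rolling mean of recent counts by more than `threshold`
    standard deviations. Simplified sketch: window size and threshold
    are illustrative assumptions."""

    def __init__(self, window=20, threshold=3.0):
        self.history = deque(maxlen=window)  # recent per-interval counts
        self.threshold = threshold

    def observe(self, count):
        is_anomaly = False
        if len(self.history) >= 2:
            mean = sum(self.history) / len(self.history)
            var = sum((x - mean) ** 2 for x in self.history) / len(self.history)
            std = math.sqrt(var)
            if std > 0 and abs(count - mean) / std > self.threshold:
                is_anomaly = True
        self.history.append(count)
        return is_anomaly

detector = RateAnomalyDetector(window=10, threshold=3.0)
baseline = [100, 102, 98, 101, 99, 100, 103, 97]   # steady traffic
flags = [detector.observe(c) for c in baseline]     # all False
spike = detector.observe(500)                       # sudden burst: True
print(flags, spike)
```

In the deployed pipeline, an alert like this would surface through the Prometheus/Grafana dashboards mentioned above rather than a return value.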

Certifications

• Microsoft Certified: Azure Fundamentals

• Microsoft Certified: Azure Data Engineer Associate


