VAIBHAV SAMUDRALA
Data Engineer
Texas, USA ***** 469-***-**** *****************@*****.*** linkedin.com/in/vaibhav-s-01303017a/
PROFESSIONAL SUMMARY
Experienced Data Engineer with over 6 years of expertise designing, building, and optimizing robust data pipelines and ETL workflows for large-scale enterprise environments. Proficient in leveraging cloud platforms such as AWS and Azure, along with big data technologies like Apache Spark and Kafka, to support real-time and batch data processing. Skilled in automating data ingestion, transformation, and validation processes to ensure high data quality and availability for analytics and AI applications. Adept at collaborating with cross-functional teams, driving scalable data solutions, and maintaining compliance with data governance standards. Strong background in Agile methodologies, CI/CD pipelines, and documentation best practices to deliver efficient and reliable data infrastructure.
TECHNICAL SKILLS
Programming Languages: Python (Pandas, NumPy, SQLAlchemy), SQL, Scala, Bash/Shell Scripting
Databases & Data Warehousing: SQL Server, PostgreSQL, MySQL, MongoDB, DynamoDB, Snowflake, Redshift, Azure Synapse Analytics, Schema Design, Indexing, Partitioning, Query Optimization
Cloud Platforms & Infrastructure: AWS (S3, Redshift, Glue, Lambda, EMR, EC2, Route53, Elastic Beanstalk, Kinesis), Azure (Data Factory, Databricks, Blob Storage, Azure Data Lake Storage), IaC, Terraform
Big Data Technologies: Spark (PySpark), Kafka, Airflow, Hive, NiFi, Flink, Hadoop (HDFS, YARN, MapReduce)
ETL / ELT & Data Pipelines: Talend, Informatica
Data Modeling & Architecture: Dimensional Modeling, Star Schema, OLAP, Relational Modeling
DevOps & Infrastructure: CI/CD (Jenkins, GitHub Actions), Git, GitHub, Bitbucket, Prometheus, Grafana
Data Visualization & Analytics: Tableau, Power BI, QuickSight
Environments: SDLC, Agile, Kanban, Scrum, Waterfall, Linux/Unix, MacOS
PROFESSIONAL EXPERIENCE
Data Engineer Jan 2024 – Present
PICKET USA
Design, develop, and optimize scalable data pipelines and ETL workflows for processing large volumes of transactional and behavioral data.
Collaborate with data scientists and analysts to build reliable data platforms supporting real-time analytics and machine learning initiatives.
Utilize cloud platforms such as AWS (Redshift, S3, Glue) and Azure to manage data storage, processing, and orchestration.
Implement data validation, quality checks, and error handling mechanisms to ensure data accuracy and consistency.
Automate data ingestion processes from multiple sources including APIs, databases, and streaming platforms like Kafka.
Monitor pipeline performance using tools like CloudWatch and Grafana, proactively identifying and resolving bottlenecks.
Participate in architecture design reviews to enhance data platform scalability, security, and compliance.
Develop documentation and standard operating procedures for data engineering workflows and best practices.
Collaborate within Agile teams, contributing to sprint planning, code reviews, and continuous integration/deployment (CI/CD) pipelines.
Data Engineer Jul 2017 – Dec 2021
CGI India
Built and maintained ETL processes to aggregate and transform data from diverse sources including SQL/NoSQL databases and flat files.
Developed data pipelines using Python, SQL, and Apache Spark to support reporting and analytics platforms.
Assisted in migrating legacy batch processing workflows to modern cloud-based architectures on AWS and Azure.
Performed data profiling and cleansing activities to improve data quality and reliability for downstream consumption.
Worked closely with business intelligence and analytics teams to deliver timely and accurate data insights.
Automated repetitive data processing tasks using scripting and orchestration tools such as Airflow and Luigi.
Monitored data pipeline health, troubleshot issues, and implemented fixes to reduce downtime and data delays.
Participated in code reviews and adhered to best practices for version control using Git.
Engaged in knowledge sharing sessions and documentation to support team learning and onboarding.
Collaborated with security teams to implement data governance policies ensuring compliance with regulatory standards.
EDUCATION & CERTIFICATIONS
Master's: Computer Science Dec 2023
East Texas A&M University Commerce, TX