Vamshidhar Reddy — Python Data Engineer
+1-636-***-**** ***************@*****.*** LinkedIn
PROFESSIONAL SUMMARY
Results-driven Python Developer and Data Engineer with 6+ years of experience building scalable data platforms, ETL pipelines, and cloud-based analytics solutions across insurance, healthcare, and e-commerce domains. Strong expertise in Python, PySpark, Apache Spark, Kafka, and distributed data processing within Hadoop ecosystems. Experienced in designing high-performance data pipelines, real-time streaming solutions, and cloud-native architectures using AWS, Azure, and GCP. Proven ability to process large-scale datasets, optimize data workflows, and develop REST APIs that support enterprise analytics and business intelligence. Adept at collaborating in Agile environments to deliver reliable, scalable, and production-ready data engineering solutions.
PROFESSIONAL EXPERIENCE
Orthomed Anesthesia — Dallas, Texas, Python Automation Engineer, Jan 2025 – Present
Architected Python automation frameworks to streamline healthcare billing and clinical reporting workflows, eliminating manual intervention across 40+ operational processes and reducing processing time by 65%.
Developed scalable ETL pipelines using Python, SQL, and Apache Airflow to ingest and transform clinical datasets exceeding 5M records weekly for operational analytics.
Implemented automated data validation and reconciliation scripts for anesthesia procedure data, improving data accuracy by 30% and reducing claim processing discrepancies.
Integrated RESTful APIs between EHR systems, analytics platforms, and internal services, enabling real-time synchronization across six enterprise healthcare applications.
Built monitoring and alerting mechanisms for production pipelines using logging and observability tools, decreasing pipeline failures by 45% and accelerating incident response time.
Collaborated with DevOps teams to deploy containerized automation services using Docker and cloud infrastructure, improving deployment efficiency and scalability by 50%.
USAA — San Antonio, Texas, Python Developer / Data Engineer, Nov 2023 – Dec 2024
Engineered high-volume data pipelines using Python, PySpark, and AWS services (EMR, S3, Glue) to process 10M+ financial transactions daily for fraud detection analytics.
Designed optimized ETL and data ingestion frameworks that reduced pipeline execution time by 40% while improving reliability and fault tolerance.
Built scalable microservices and REST APIs to deliver insurance risk analytics powering 20+ enterprise reporting dashboards used by risk and compliance teams.
Tuned complex SQL queries and warehouse schemas in Redshift and cloud data platforms, improving query performance across large financial datasets by 35%.
Developed automated data quality monitoring frameworks that detected anomalies early and increased downstream data reliability by 25%.
Partnered with data scientists and DevOps engineers to deploy cloud-native data solutions, enabling scalable analytics for enterprise fraud and compliance systems.
State Farm Insurance — Bloomington, Illinois, Python Developer / Data Engineer, Jan 2023 – Oct 2023
Developed automated PySpark ETL pipelines to ingest and process insurance claims and policy datasets exceeding 8M records, enabling faster actuarial and risk analytics.
Built modular Python data processing components that improved data preparation efficiency by 45% for underwriting and analytics teams.
Implemented scalable workflow orchestration using Apache Airflow and AWS services, improving pipeline scheduling reliability by 30%.
Designed secure REST APIs and data services to enable seamless data access between enterprise analytics platforms and underwriting systems.
Optimized large-scale warehouse queries and indexing strategies, reducing report generation latency by 40% across high-volume datasets.
Introduced automated testing and validation frameworks for ETL pipelines, reducing production data errors by 28% and improving data reliability.
Dish Network — Arizona, USA, Python Engineer, Jan 2022 – Dec 2022
Built Python-based data processing systems to analyze 12M+ marketplace product records, improving catalog data accuracy and product listing quality.
Designed scalable data ingestion pipelines supporting near real-time product analytics and recommendation systems used by marketplace teams.
Implemented advanced data cleansing and transformation workflows, reducing duplicate listings by 35% across global product catalogs.
Developed internal RESTful APIs for analytics platforms, enabling marketing and product teams to access real-time sales insights.
Optimized backend Python processing jobs to improve data processing performance and reduce pipeline execution time by 30%.
Collaborated with distributed engineering teams to deploy data-driven solutions supporting global e-commerce operations.
Quest Software — India, Python Data Engineer, May 2020 – Jul 2021
Designed scalable data ingestion pipelines to process enterprise telemetry and log data exceeding 5TB per month, enabling advanced system monitoring.
Developed Python automation scripts for log analysis and anomaly detection, accelerating incident identification and reducing troubleshooting time by 40%.
Built ETL pipelines integrating multiple enterprise databases, reducing manual data consolidation efforts by 50%.
Optimized backend processing workflows using Python and SQL, improving data availability for analytics teams by 35%.
Implemented automated data validation and quality monitoring frameworks, ensuring consistent data across distributed environments.
Assisted DevOps teams in deploying data services using Docker-based containerization and cloud infrastructure.
Info Edge Solutions — Hyderabad, India, Software Engineer (Intern), May 2019 – Apr 2020
Assisted in building backend modules using Python and SQL for job-search analytics platforms handling 1M+ daily user interactions.
Developed data processing scripts to cleanse and structure recruitment datasets, improving data usability for analytics teams by 25%.
Contributed to automation workflows that reduced manual reporting tasks by 40% across recruitment operations.
Implemented REST API endpoints enabling internal applications to access candidate and job market data efficiently.
Participated in debugging and optimizing backend services, improving application response time by 20%.
Collaborated with senior engineers in Agile development cycles, delivering feature enhancements and platform improvements.
TECHNICAL SKILLS
Programming Languages: Python, SQL, Java, C, C++, Bash/Shell Scripting
Big Data & Data Processing: Apache Spark, PySpark, Spark SQL, Spark Streaming, Hadoop, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Apache Beam
Data Engineering & ETL: ETL/ELT Pipelines, Data Modeling, Data Transformation, Batch Processing, Streaming Pipelines, Data Integration
Workflow Orchestration: Apache Airflow, Azure Data Factory, Oozie, Autosys, Control-M, Cron
Messaging & Streaming: Apache Kafka, Kafka Streams, AWS Kinesis, Azure Event Hubs, Google Pub/Sub
Cloud Platforms: AWS (EC2, EMR, S3, Redshift, Glue, Lambda, Step Functions, SNS, SQS, CloudWatch), Azure (Synapse, Data Factory, Azure Functions, Event Hubs, Stream Analytics), GCP (Dataproc, Dataflow, BigQuery, Cloud Functions, Composer)
Databases & Lakehouse: PostgreSQL, MySQL, Oracle, Teradata, Microsoft SQL Server, MongoDB, Cassandra, HBase, DynamoDB, Azure Cosmos DB, Databricks, Delta Lake, Unity Catalog
APIs & Backend Development: REST APIs, GraphQL, Flask, Django, Node.js
DevOps & Tools: Docker, Kubernetes, CI/CD Pipelines, Git, Jenkins
Data Visualization & Governance: Tableau, Informatica Enterprise Data Catalog
Data Analysis & Libraries: Pandas, NumPy, Excel Macros
Development Methodologies: Agile, Scrum, Waterfall, SDLC, Test Automation
EDUCATION
Master of Science (M.S.) in Computer Science, University of Central Missouri — Warrensburg, Missouri, USA, 2021 – 2022
Bachelor of Technology (B.Tech.) in Electronics and Communication Engineering, Aurora Technological and Research Institute — Hyderabad, India, 2016 – 2020