Vinitha Mallapally
Data Engineer
****************@*****.*** +1-805-***-**** LinkedIn: www.linkedin.com/in/vinithareddym
PROFESSIONAL SUMMARY
Data engineer with 3+ years of experience in data pipeline optimization and architecture for companies like Capital One and Apollo Hospitals. Spearheaded the construction of enterprise data warehouses using Redshift and Snowflake, accelerating BI query performance by 50%. Automated ETL workflows with AWS Glue and Lambda, eliminating 15 hours of weekly manual tasks. Ensured data governance frameworks maintained 99.9% accuracy and full SOX compliance across 50+ data sources.
TECHNICAL SKILLS
• Programming Languages: Python, SQL, Scala, Java, R, Shell Scripting
• Big Data Technologies: Apache Spark, Hadoop, Hive, Kafka, HBase, Cassandra
• Cloud Platforms: AWS (S3, Redshift, Glue, EMR, Lambda, Athena), Azure (Data Factory, Databricks, Synapse Analytics), GCP (BigQuery, Dataflow)
• Databases: PostgreSQL, MySQL, Oracle, SQL Server, MongoDB, DynamoDB
• Data Warehousing: Snowflake, Redshift, BigQuery, Teradata
• ETL/ELT Tools: Informatica, Talend, Apache NiFi, AWS Glue, Azure Data Factory
• Data Modeling: ERwin, ER/Studio, Dimensional Modeling, Star/Snowflake Schemas
• DevOps & Monitoring: Docker, Kubernetes, Jenkins, GitLab CI, Datadog, Splunk, CloudWatch
• Methodologies: Agile, Scrum, DevOps, DataOps
WORK EXPERIENCE
Data Engineer Capital One McLean, VA Aug 2024 – Present
• Architected scalable AWS data pipelines processing 10TB+ daily financial transaction data, boosting processing efficiency by 40%.
• Engineered Spark applications in Scala and Python for batch and real-time processing, cutting data latency from 45 to 30 minutes.
• Constructed enterprise data warehouses using Redshift and Snowflake, accelerating BI query performance by 50%.
• Automated ETL workflows with AWS Glue and Lambda, eliminating 15 hours of weekly manual tasks and achieving 99.5% pipeline reliability.
• Collaborated with ML teams to productionize fraud detection models, enabling real-time risk assessment for 2M+ daily transactions.
• Established data governance frameworks ensuring 99.9% accuracy and full SOX compliance across 50+ data sources.
• Optimized storage architecture using S3, Parquet, and ORC formats, reducing monthly storage costs by $50K (25% reduction).
• Developed comprehensive data catalogs increasing data asset discoverability by 60% across 8 business units.
• Translated business requirements into technical specifications for 12 cross-functional projects, delivering all of them on time.
• Monitored production systems using CloudWatch and Datadog, maintaining 99.8% uptime across critical data pipelines.
Technologies: AWS (S3, Redshift, Glue, Lambda, EMR), Apache Spark, Scala, Python, Snowflake, SQL, Docker, Jenkins, Datadog
Data Engineer Apollo Hospitals India Mar 2021 – Dec 2022
• Streamlined healthcare data integration from 15+ hospital systems, reducing patient data retrieval time by 35%.
• Built real-time patient monitoring dashboards processing 500K+ records daily, improving clinical decision-making speed by 25%.
• Implemented HIPAA-compliant data pipelines ensuring 100% regulatory adherence while maintaining data accessibility.
• Migrated legacy systems to cloud infrastructure, reducing operational costs by 30% and improving system performance by 45%.
• Created automated reporting solutions for clinical teams, eliminating 20 hours of manual report generation weekly.
• Standardized data quality processes across 12 departments, achieving a 98% improvement in data accuracy.
• Collaborated with medical professionals to design patient outcome prediction models, supporting evidence-based care decisions.
Technologies: Python, SQL Server, Azure Data Factory, Power BI, Apache Kafka, MongoDB, Tableau
EDUCATION
• Master of Science in Information Technology St. Francis College New York, NY Aug 2023 – May 2025
• Bachelor of Engineering in Computer Science and Engineering Aurora’s Technological and Research Institute Hyderabad, India June 2018 – May 2022