• Data Engineer with *+ years of experience designing scalable ETL pipelines, building data lakes, and managing end-to-end data workflows across AWS, Azure, and GCP environments.
• Expertise in big data technologies including Apache Spark, Hadoop, Kafka, Hive, Sqoop, and Airflow to support real-time and batch data processing.
• Proficient in Python, SQL, Scala, R, and Shell scripting for data ingestion, transformation, and pipeline automation across structured and semi-structured sources.
• Skilled in data modelling using Star and Snowflake schemas, optimizing data warehouse performance on platforms like Snowflake, Redshift, Synapse, and Databricks.
• Strong experience in building interactive dashboards and reports using Tableau, Power BI, and Looker to deliver actionable business insights.
• Adept at integrating DevOps practices into data engineering workflows using Docker, Terraform, GitHub Actions, and Jenkins within Agile/Scrum teams.
TECHNICAL SKILLS:
• Programming Languages: Python, R, Scala, SQL, Shell Scripting
• Data Engineering & ETL: ETL Development, Data Pipelines, Data Ingestion & Transformation, ETL Automation, Apache Airflow, Informatica PowerCenter, Talend
• Cloud Platforms & Technologies: AWS (S3, Redshift, Lambda, Glue), GCP (BigQuery, Dataflow, Cloud Storage), Azure (Data Lake, ADF, Synapse), Snowflake, Databricks
• Big Data Technologies: Apache Hadoop, Apache Spark, Apache Kafka, Apache Hive, Apache Pig, Apache Sqoop, Apache Oozie
• Databases & Storage: PostgreSQL, MySQL, SQL Server, Oracle, MongoDB, Cassandra, DynamoDB, Amazon Redshift, HBase
• Data Warehousing & Data Management: Data Warehousing, Data Modelling (Star and Snowflake Schemas), Data Migration, Data Governance
• Machine Learning & AI: Feature Engineering, Predictive Modelling, Time Series Analysis, Regression Analysis (Linear/Logistic)
• Data Visualization & Reporting: Tableau, Power BI, Looker, Matplotlib, Seaborn, Plotly
• DevOps & CI/CD: Git, GitHub, Jenkins, GitLab CI/CD, Docker, Terraform, Agile/Scrum, JIRA, Confluence
• Advanced Analytics & Statistical Tools: NumPy, pandas, scikit-learn, Spark SQL, Statistical Analysis, A/B Testing
EDUCATION:
• Master of Science
Concordia University Wisconsin
• Bachelor of Technology in Electronics and Communication Engineering
RVR&JC College of Engineering
WORK EXPERIENCE:
Ameri Source INC Data Engineer Feb 2023 - Current
• Architected and deployed 30+ ETL pipelines using Apache Airflow, AWS Glue, and Snowflake, boosting ingestion efficiency by 45%.
• Spearheaded real-time data processing solutions via Apache Spark, Kafka, and Hadoop, cutting query latency for sales analytics.
• Automated ingestion of 50+ external sources into AWS through Dataflow, decreasing manual workload by 80%.
• Designed and optimized dimensional models (Star and Snowflake schemas), accelerating executive dashboard performance in Power BI.
• Established CI/CD workflows with GitHub Actions, Jenkins, and Terraform, achieving deployment automation for data platform resources.
• Developed predictive churn models in Python and Spark MLlib, improving retention forecasting accuracy by 22%.
• Rolled out a metadata-driven governance model with Informatica PowerCenter, enhancing data lineage visibility for 100+ datasets.
• Led bi-weekly sprint ceremonies (backlog grooming, planning) using Jira and Confluence, increasing sprint deliverable success rates.
Coforge Data Engineer Aug 2019 - Jul 2022
• Engineered 15+ real-time streaming pipelines with Apache Kafka, Spark Structured Streaming, and Azure, reducing data delivery time.
• Developed batch ETL frameworks with Talend and Hive, minimizing transformation errors by 40% and enhancing SLA compliance.
• Implemented Azure Data Factory workflows linking 5+ enterprise data lakes to Synapse Analytics, enabling seamless multi-source reporting.
• Conducted 25+ A/B testing experiments with R, pandas, and NumPy, boosting marketing campaign ROI by 18%.
• Migrated 20+ legacy SQL Server and Oracle databases to Snowflake and Databricks, improving query performance by 50%.
• Created 40+ dynamic dashboards in Looker and Tableau, enhancing stakeholder access to real-time financial metrics.
• Standardized cloud infrastructure with Docker and Terraform, cutting environment provisioning time across Azure.
• Designed feature engineering pipelines supporting 10+ time series models for credit risk predictions, enhancing loan default detection.
CERTIFICATION:
• AWS Solutions Architect
Manikanta Papasani
414-***-**** ******************@*****.***
Data Engineer