Post Job Free

Data Engineer Machine Learning

Location:
Columbus, OH
Salary:
90 per hour
Posted:
May 09, 2025


Resume:

Srinivas Rao Perala

SR. DATA ENGINEER

Summary:

• Seasoned Software Engineer with 13+ years of professional experience, including over 7 years specializing in Data Engineering. Proven expertise in building data-intensive applications and robust data pipelines using Python, Shell scripting, and a range of cloud and big data technologies across AWS and Azure platforms.

• Strong background in data analytics, data mining, and statistical modeling, with hands-on experience handling large-scale structured and unstructured datasets for predictive analytics, data validation, and visualization.

• Proficient in the full software development lifecycle (SDLC) and Agile methodologies, from requirements gathering through deployment and support.

• Expert in implementing complex ETL pipelines using tools such as Azure Data Factory, AWS Glue, and SSIS, and managing scalable big data environments with Apache Spark, Kafka, Azure Synapse, and Snowflake.

• Applied machine learning and advanced analytics using Azure Machine Learning Studio and AWS SageMaker to support data-driven decision-making.

• Experience managing and scheduling workflows with Apache Airflow on Hadoop clusters.

• Strong DevOps skills, including CI/CD automation with Azure DevOps and AWS tools (CodePipeline, CodeDeploy, CodeCommit), leading to improved deployment speed and system stability.

• Extensive hands-on experience with AWS services such as EC2, S3, Lambda, API Gateway, DynamoDB, EMR, RDS, IAM, CloudFront, and CloudTrail. Also familiar with network and security components like DNS, VPC, Security Groups, and Firewalls.

• Led a major healthcare data engineering project using Azure, integrating real-time data processing with ML pipelines and establishing robust analytics workflows.

• Streamlined data asset discovery and cataloging through Azure Data Factory, enhancing metadata governance and reusability.

• Designed and implemented fault-tolerant pipelines with Azure Functions and Azure Monitor to ensure data integrity and reliability.

• Skilled in real-time streaming using Azure Stream Analytics and Apache Kafka (HDInsight), supporting event-driven architectures.

• Successfully led the migration of mission-critical infrastructure from AWS to Azure, achieving performance gains and operational continuity.

• Implemented data security and compliance protocols using Azure Key Vault and Azure Active Directory, ensuring governance and regulatory adherence.

• Expertise in disaster recovery planning using Azure Backup and Blob Storage, with strong proficiency in schema management on Azure Data Lake Storage.

• Developed scalable data workflows using Azure Logic Apps and created custom connectors with Azure SDKs/APIs to integrate third-party data sources.

• Committed to continuous learning, staying current with emerging technologies such as Azure Synapse Analytics to drive innovation and operational efficiency.

Technical Skills:

Programming & Big Data: Python, Java, Apache PySpark, Apache Kafka

Cloud Platforms:

Azure: Azure Data Factory, Azure Event Hub, Azure Databricks, Azure Synapse, Azure Data Lake Gen2, Blob Storage

AWS: EC2, Glue, S3, RDS, Athena, Lambda, SageMaker, EMR, Kinesis, IAM, Step Functions

Data Warehousing & Databases: Snowflake, AWS Redshift, Google BigQuery, Oracle, MySQL, PostgreSQL

ETL & Reporting Tools: Informatica PowerCenter, Fivetran, Apache Airflow, Power BI, Tableau

Version Control & CI/CD: Git, GitLab, Jenkins, Docker

Development Tools & Methodologies: Anaconda, Jupyter, VS Code, PyCharm, Agile/Scrum, JIRA

Education:

Master's in Computer Applications, Jawaharlal Nehru Technological University Hyderabad - 2012

Professional Experience:

Tech Mahindra – Senior Data Engineer

Mar 2023 – Present, India

• Designed and optimized scalable ELT/ETL pipelines using SQL, Python, Azure Data Factory, Apache Airflow, and Databricks.

• Automated complex data transformations in Databricks, improving processing efficiency by 30%.

• Architected and maintained data workflows using Azure Synapse Analytics, Logic Apps, and Spark SQL for large-scale analytics.

• Implemented advanced query optimization and partitioning in Synapse SQL for performance gains.

• Strengthened data security using Azure Key Vault, Active Directory, and compliance-driven best practices.

• Automated validation and testing with Python across Azure Data Lake and Data Factory, ensuring data reliability.

• Integrated Power BI with Synapse to deliver real-time healthcare dashboards to stakeholders.

• Developed custom APIs and connectors with Azure SDKs for third-party healthcare systems.

• Implemented CI/CD with Azure DevOps, streamlining deployments and reducing release time.

• Used Apache Kafka for high-throughput, fault-tolerant data pipelines across distributed systems.

Environment: Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Python, PySpark, Apache Airflow, Apache Kafka, SQL, T-SQL, U-SQL, Power BI, Azure Functions, Azure Key Vault, AAD, Azure DevOps, Git, Snowflake, CI/CD, Agile, NumPy, Matplotlib, Scala

Born Commerce – Cloud Data Engineer

May 2021 – Feb 2023 India

• Led data infrastructure migration from AWS to Azure, ensuring minimal downtime.

• Built cross-cloud ETL workflows with Azure Data Factory and AWS Data Pipeline.

• Engineered real-time ingestion pipelines with Kafka and Cassandra for high-volume streaming data.

• Developed data APIs using Flask/Django for internal and third-party integration.

• Optimized ETL with PySpark, Snowflake, Talend, and Informatica; enhanced query performance and data profiling.

• Built CI/CD pipelines using Jenkins, Docker, GitHub Actions, integrated REST APIs for real-time analytics.

• Implemented cloud security with IAM, VPC, and compliance controls.

• Utilized AWS Athena and Redshift for ad-hoc querying and warehousing of large datasets.

Environment: Python, Scala, Dataflow, MapReduce, Machine Learning, Power BI, Airflow, Data Factory, Spark, Kafka, CI/CD, AWS Lambda, S3, IAM, DynamoDB, Git, GitHub, Docker, Jenkins, SQL, Snowflake, Tableau, Informatica, PostgreSQL

VCarve Technologies – Data Engineer

Sep 2019 – Apr 2021 India

• Built Spark Streaming pipelines to process healthcare data from Kafka in near-real time.

• Used AWS EMR and Glue to handle large-scale processing and unified data lakes.

• Developed centralized AWS S3 data lake and optimized workflows for 25% faster performance.

• Created and deployed services using AWS EKS, Lambda, and CLI automation tools.

• Designed ETL jobs in Informatica Cloud, improving transformation speeds by 30%.

• Achieved a 25% reduction in data processing times by optimizing SQL queries and ETL workflows, utilizing performance-tuning techniques, and enhancing Spark job configurations to maximize resource efficiency.

Environment: Python, ETL, Spark, Kafka, CI/CD, CloudWatch, DynamoDB, PySpark, MySQL, Tableau, Informatica, Agile, Linux, AWS Lambda, AWS Glue, EC2, S3, IAM, CLI, EKS

Zoylo Digihealth – Data Analyst / Data Engineer

Sep 2018 – Sep 2019 India

• Transitioned from e-commerce to healthcare data engineering and analytics.

• Built enterprise-level data warehouses and lakes on AWS, automated pipelines.

• Developed ML models for fraud detection using Python, Spark, and AWS SageMaker.

• Created dashboards with Power BI and AWS QuickSight; improved time-to-insight by 25%.

• Conducted statistical analysis and A/B testing for executive reports using Python and R.

• Developed REST APIs in Flask for cross-system integration.

• Developed Python scripts to find SQL-query vulnerabilities through SQL injection testing, permission checks, and analysis.

Environment: AWS, Python, AWS Lambda, EC2, S3, IAM, CloudWatch, MySQL, Snowflake, Tableau, RedShift, Agile, Linux, ETL, AWS QuickSight, Docker, Jenkins, Kubernetes

Tech Mahindra – Magento / PHP Lead Developer

Oct 2016 – Sep 2018 India

• Led data migration for 11 Magento sites with 55 stores under one core system.

• Developed custom payment, shipping, and inventory middleware integrations.

• Delivered large-scale B2B Magento solutions, including multi-store, multi-currency, and custom approval workflows.

• Improved application performance using caching (Redis, Varnish) and server tuning (PHP-FPM, Nginx).

• Integrated analytics, CRM, and GTM to boost online sales.

• Translated project requirements into detailed technical designs for implementation.

• Handled customer-support scenarios in the Magento systems.

• Implemented multi-website, multi-store, and multi-currency support.

• Delivered large-scale Magento B2B implementations, including custom Company Accounts, Shared Catalogues, Request for Quote (RFQ) flows, and Purchase Orders with custom approval workflows.

• Optimized Magento application performance through advanced caching (Varnish, Redis), database tuning, and server-side optimizations (PHP-FPM, Nginx).

Environment: Magento 1 & 2, PHP, Redis, RabbitMQ, Cloudflare, Jenkins, Docker, GTM, REST APIs

Dotcomweavers Solutions – Magento / PHP Lead Developer

May 2015 – Sep 2016 India

• Led e-commerce development and customization for various Magento-based platforms.

• Integrated custom modules (e.g., shipping, YMM search, marketplace) and REST APIs.

• Performed theme design, SEO optimization, and catalog imports for large datasets.

• Improved performance by applying the optimization techniques recommended by Magento.

• Delivered custom theme and REST API implementations.

• Built incremental scripts for daily product imports.

Environment: Magento, PHP, Redis, REST APIs, Cloudflare, GitHub, Jira

CopperLabs Pvt Ltd & Mobiware Pvt Ltd – Software Engineer

May 2012 – Apr 2015 India

• Developed Magento-based e-commerce platforms for multiple industry verticals.

• The objective of these projects was to replace each client's existing e-shop with a new Magento-based e-shop. Each e-shop provided an online e-commerce portal selling products such as medical equipment (AMD); measurement, control, recording, and calibration devices (Instrumentation); pool and spa products (Hottubspasupplies); and eco-friendly goods (Youchange Earth911).

• Enhanced themes, built REST APIs, and optimized backend performance.

• Integrated CRM, analytics tools, and payment systems into Magento platforms.

• Performed unit and integration testing.

• Oversaw quality procedures related to the project.

Environment: Magento, PHP, REST APIs, Redis, Varnish, AWS EC2, RDS, Git, Jira


