NAVEEN SIDDINENI
Naveennagsiddineni@gmail.com | (813) 934-2526 | https://www.linkedin.com/in/siddineninaveen/
PROFESSIONAL SUMMARY
Results-driven Data Engineer with 5+ years of experience designing and managing scalable data pipelines, distributed data architectures, and cloud-based data solutions. Proficient in Python, SQL, and Databricks, with a strong track record of optimizing data processing systems and schema modeling to drive business insights. Hands-on expertise in AWS, GCP, and Azure ecosystems, including database design, query optimization, and cloud-native data services. Experienced in infrastructure as code (Terraform), automation, and performance tuning to enhance data reliability and scalability. Adept at working in hybrid environments, collaborating cross-functionally to implement high-impact data strategies.
PROFESSIONAL EXPERIENCE
AZURE DATA ENGINEER Feb 2024 – Present
STRYKER, Kalamazoo, MI
• Designed and deployed real-time data pipelines with AWS infrastructure and improved data accessibility for 5 data scientists, accelerating model development by 25%.
• Developed Power BI dashboards with DAX and custom visuals to support business intelligence and reporting.
• Optimized SQL Server queries, resulting in a 25% improvement in query performance.
• Built real-time streaming solutions utilizing Kafka and Spark Streaming for monitoring medical device performance.
• Collaborated with cross-functional teams to ensure compliance with FDA, ISO 13485, and GMP regulations.
• Constructed fully automated CI/CD pipelines with Jenkins, reducing deployment time by 30% and enabling faster iteration cycles for the engineering team.
AWS DATA ENGINEER March 2023 – Jan 2024
AUTO-OWNERS INSURANCE, Lansing, MI
• Developed real-time data pipelines using AWS Glue, Redshift, and Kinesis for risk scoring and underwriting models.
• Built an AWS data lake for claims processing, reducing data retrieval time by 40% and improving fraud detection accuracy by 30%.
• Configured Spark Streaming to process real-time data from Kafka for fraud detection and claims monitoring.
• Automated infrastructure provisioning using Terraform and set up CI/CD pipelines with Jenkins and GitHub.
• Created datasets from Amazon S3 using AWS Athena and generated visual insights with AWS QuickSight.
GCP DATA ENGINEER April 2021 – August 2022
MORGAN STANLEY, Bangalore, India
• Designed and optimized financial data pipelines using GCP BigQuery, DataProc, and Airflow for risk analytics and regulatory reporting.
• Built ETL workflows for loan processing, credit risk analysis, and fraud detection, ensuring compliance with Basel III and GDPR.
• Orchestrated financial data transformations and reporting pipelines with PySpark and Hive, processing 500+ GB of data daily while improving data accuracy for regulatory filings by 15%.
• Implemented machine learning techniques (scikit-learn, TensorFlow) for predictive analytics and anomaly detection.
AWS DATA ENGINEER June 2019 – March 2021
PHILIPS HEALTHCARE, Bangalore, India
• Developed AWS Data Pipeline to extract, transform, and load (ETL) medical data from S3 into Redshift for healthcare reporting.
• Built real-time dashboards using Power BI and Tableau, providing KPI insights for operational and supply chain data.
• Optimized over 150 SQL queries within healthcare data systems through advanced index management and query tuning techniques; achieved a 40% reduction in query execution time, enhancing overall system performance for end users.
EDUCATION
Lewis University, Romeoville, Illinois, USA
Master’s in Data Science, Graduation Date: May 2024
VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India
Bachelor of Technology, Mechanical Engineering, Graduation Date: Sep 2020
TECHNICAL SKILLS
Cloud Technologies: AWS (S3, Glue, Redshift, Lambda, Kinesis), Azure (Data Factory, Databricks, Synapse), GCP (BigQuery, Dataflow)
Programming Languages: Python, Scala, Java, SQL, PySpark
Python Libraries: Pandas, NumPy, PySpark, Polars
Databases: Oracle, MySQL, SQL Server, PostgreSQL, Snowflake, HBase, MongoDB
Big Data Tools: Hadoop (HDFS, Hive, HBase), Spark (Core, SQL, Streaming), Kafka, Airflow
ETL Tools: Azure Data Factory, SSIS, Talend, Informatica
Visualization Tools: Power BI, Tableau, Grafana
Data Formats: Parquet, Avro, JSON, CSV, Protobuf
DevOps Tools: Jenkins, Terraform, Docker, Kubernetes, Git, Maven
PROJECTS
Liver Disease Prediction Using Machine Learning Algorithms
• Led a team of 3 in building an application to predict the occurrence of liver disease, achieving 94% accuracy with the selected machine learning algorithm. Emphasized ongoing maintenance and performance tuning to sustain accuracy levels.
• Improved diagnostic classification accuracy by 25% by implementing machine learning models and scripting for efficient troubleshooting. Managed databases to ensure accurate and efficient data handling.
Development of a Petrol Flow Authenticity Check Device Using Arduino
• Designed and developed an Arduino-based IoT device with advanced sensors to monitor petrol flow in real time, achieving 98% accuracy in detecting anomalies and tampering. Enabled real-time alerting on tampering events, ensuring fair fuel dispensing and improving operational transparency.
CERTIFICATIONS
• AWS Certified Data Analytics - Specialty
• Microsoft Azure Data Engineer Associate
• Google Cloud Professional Data Engineer