Data Engineer SQL Server

Location:

Houston, TX

Posted:

September 10, 2025

Resume:

YASWANTH AMARESH BOGGAVARAPU

****.*************@*****.*** +1-475-***-**** LinkedIn: linkedin.com/in/yaswanth-amaresh-dataengineer/

Professional Summary

Full-stack Data Engineer with 4+ years of experience designing and optimizing ETL/ELT pipelines, real-time streaming systems, and data warehouses across AWS, Apache Spark, and Python ecosystems.

Proficient in data modeling (Star Schema, Snowflake Schema), data quality validation, and pipeline automation using Airflow, NiFi, and Python.

Skilled in building and scaling distributed data platforms with Kafka, Hadoop, Redshift, and EMR.

Strong experience with SQL (PostgreSQL, MySQL, HiveQL, SparkSQL) and NoSQL (MongoDB).

Adept at integrating backend services, APIs, and cloud-native storage (S3, Redshift, Lambda, Athena).

Hands-on with BI/visualization tools (Tableau, QuickSight) to support executive decision-making.

Experience collaborating with Data Science teams to deliver curated datasets for ML models (TensorFlow, Scikit-learn).

Strong communicator with Agile/Scrum experience and proven ability to reduce costs and improve efficiency through scalable solutions.

Technical Skills

Languages: Python, Java, Bash, SQL, SparkSQL, HiveQL, PL/SQL, T-SQL

Databases & Warehousing: Snowflake, Azure Synapse, Amazon Redshift, BigQuery, SQL Server, PostgreSQL, MySQL, MongoDB, Hive

ETL/ELT Tools: Apache Airflow, Apache NiFi, Informatica PowerCenter, Talend, SSIS, dbt, Fivetran

Big Data & Analytics: Hadoop (HDFS, MapReduce, YARN), Spark (PySpark, SparkSQL), Databricks, Delta Lake, Hive, Pig

Cloud Platforms: Azure (Data Factory, Synapse, Databricks, Cosmos DB, ADLS Gen2, Logic Apps), AWS (S3, Glue, Redshift, EMR, Lambda, Athena, Kinesis), GCP (BigQuery, Pub/Sub, Dataflow)

Data Modeling: Star Schema, Snowflake Schema, Data Vault, Medallion Architecture

Visualization: Tableau, Power BI, QuickSight

DevOps & Tools: Git, GitLab, Jenkins, Docker, Kubernetes, Terraform, Jira, Agile/Scrum

Professional Experience

Data Engineer, Beanbag AI, Remote, Jun 2023 – Present

Built and automated real-time ETL/ELT pipelines with Kafka, Airflow, dbt, Snowflake, and Redshift, reducing data latency by 30%.

Migrated an on-premises data warehouse to Azure Synapse and Amazon Redshift, cutting infrastructure costs by 20% while enabling cloud scalability.

Developed SSIS packages and Informatica mappings for integration with SQL Server legacy workloads.

Implemented Fivetran connectors and dbt models to support SaaS ingestion and schema evolution.

Delivered Databricks pipelines with PySpark and Delta Lake, providing ML teams with curated datasets.

Optimized SQL Server stored procedures, T-SQL queries, and Snowflake SQL scripts, improving execution times by 40%.

Enhanced observability with Azure Monitor and Amazon CloudWatch alerts, reducing downtime.

Designed Data Vault and Star Schema models, improving reporting accuracy.

Research Assistant – Data Engineer, University of Bridgeport, Mar 2022 – Jan 2023

Built a PostgreSQL and Azure Synapse data warehouse, automating ETL pipelines with Airflow and SSIS, cutting manual reporting by 20%.

Designed Star Schema and Snowflake Schema models, reducing report generation time by 25%.

Created Tableau & Power BI dashboards for faculty reporting, improving adoption by 15%.

Developed Python ETL scripts for ingestion, cleansing, and reconciliation.

Assisted faculty in building predictive models in Databricks (PySpark and Scikit-learn) for analytics.

Integrated Informatica PowerCenter mappings for secure on-premises-to-cloud migration.

Data Engineer Intern, Comviva Technologies, Apr 2021 – Oct 2021

Designed and implemented Apache NiFi pipelines and Informatica workflows for telecom data ingestion, improving data quality by 10%.

Built SSIS ETL jobs for SQL Server to automate billing reconciliation and reporting.

Integrated NiFi with Kafka for near-real-time streaming of telecom call detail record (CDR) datasets.

Developed HiveQL scripts and SparkSQL queries on Hadoop to process petabyte-scale datasets.

Created incremental load strategies to optimize daily ingestion of 2TB+ CDR data.

Projects

Real-Time Talent Acquisition Analytics (2022)

Built a serverless data pipeline using AWS Kinesis, Lambda, and Redshift for HR analytics.

Implemented Python validation scripts to ensure pipeline stability.

Designed Tableau dashboards for hiring trend analysis, empowering HR leaders with actionable insights.

Certifications

AWS Certified Data Analytics – Specialty

Microsoft Certified: Azure Data Engineer Associate

Education

M.S. in Computer Science – University of Bridgeport, CT – Apr 2023

B.Tech in Electronics & Communication Engineering – JNTU Kakinada, India – Aug 2021
