Harshitha Manam
Email: ******************@*****.***
Contact No: 336-***-****
Data Engineer
Professional Summary:
4+ years of professional IT experience in the analysis, design, development, deployment, and maintenance of critical software and big data applications.
Capable of processing large sets of structured and semi-structured data and supporting systems application architecture.
Strong experience in Python and shell scripting.
Experience in designing, implementing, and optimizing data pipelines on Google Cloud Platform.
Designed and implemented a real-time analytics platform using GCP services and Kafka.
Experience in retrieving data from databases such as MySQL, Teradata, and Oracle into HDFS using Sqoop and ingesting it into BigQuery.
Proficient in designing, implementing, and managing cloud infrastructure solutions on Azure.
Experienced in deploying virtual machines, storage, and networking components in Azure.
Strong experience in modeling data solutions for reporting and data warehouse purposes.
Good understanding of and experience with software development methodologies such as Agile and Waterfall.
Effective leadership qualities with strong skills in strategy, business development, client management, and project management.
Technical Skills:
Monitoring and Reporting: Tableau, Custom Shell Scripts, Splunk, Grafana
Build and Deployment Tools: Maven, Git, SVN, Jenkins
Programming and Scripting: SQL, JavaScript, Shell Scripting, Python, HiveQL
Databases: Oracle, MySQL, MS SQL Server, Teradata, PostgreSQL
ETL Tools: Informatica PowerCenter, Infoworks
Operating Systems: Linux, Unix, Windows 8, Windows 7, Windows Server 2008/2003
AWS Services: S3, Redshift, EMR, Lambda
Kafka: Setup, configuration, data streaming, and integration
Professional Experience:
Sr. Data Engineer Nov 2023 – Present
Client: Brightspeed, Charlotte, NC
Responsibilities:
●Designed and developed end-to-end data pipelines on Google Cloud Platform, incorporating Kafka for real-time data streaming and Python for data ingestion.
●Designed and implemented Azure Virtual Networks for secure and efficient communication.
●Created custom Kafka producers and consumers to integrate external data sources with internal systems for real-time analytics (see the sketch following this list).
●Developed Python scripts for data extraction, transformation, and loading (ETL) from TXT and CSV files into Google BigQuery.
●Utilized Azure Log Analytics and Application Insights for diagnostics and performance optimization.
●Implemented efficient partitioning and clustering strategies in BigQuery for optimized query performance.
●Leveraged Airflow to design and orchestrate complex ETL workflows, including data extraction from MySQL databases and loading into BigQuery.
●Deployed and configured Azure Virtual Machines to host various applications and services.
●Developed Cloud Functions to trigger Airflow workflows when files are delivered to Cloud Storage (see the sketch after the Environment line below).
●Participated in designing Airflow DAGs for scheduling and orchestrating data workflows.
●Orchestrated containerized applications using Azure Kubernetes Service.
●Implemented Azure Monitor and Azure Security Center for proactive monitoring.
●Implemented high-availability solutions using Azure VM scale sets.
●Collaborated with cross-functional teams to gather requirements, analyze data needs, and design effective data models.
●Integrated Azure DevOps for continuous integration and continuous deployment (CI/CD).
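A minimal sketch of the Kafka producer/consumer pattern described in the list above, assuming the kafka-python client; the broker address, topic name, and payload are illustrative assumptions rather than the client's actual configuration.

import json
from kafka import KafkaProducer, KafkaConsumer

BROKERS = ["localhost:9092"]   # assumed broker address
TOPIC = "external-events"      # assumed topic name

# Producer: push records from an external source into Kafka as JSON.
producer = KafkaProducer(
    bootstrap_servers=BROKERS,
    value_serializer=lambda record: json.dumps(record).encode("utf-8"),
)
producer.send(TOPIC, {"source": "external-api", "value": 42})
producer.flush()

# Consumer: read the same topic and hand records to the analytics layer.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKERS,
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)
for message in consumer:
    print(message.value)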
Environment: Python, BigQuery, Cloud Composer/Airflow, MySQL, Cloud Storage, GitHub.
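A minimal sketch of the Cloud Function trigger described above, assuming a 1st-gen Python function on a Cloud Storage finalize event calling the Cloud Composer 2 Airflow REST API; the web server URL, DAG id, and entry-point name are illustrative assumptions.

import google.auth
from google.auth.transport.requests import AuthorizedSession

# Assumed Composer Airflow web server URL and DAG id (illustrative only).
WEB_SERVER_URL = "https://example-web-server.composer.googleusercontent.com"
DAG_ID = "ingest_gcs_files"

def on_file_arrival(event, context):
    """Entry point: fires when a file is finalized in the watched bucket."""
    credentials, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    session = AuthorizedSession(credentials)
    # Pass the new file's bucket and object name to the DAG run config.
    payload = {"conf": {"bucket": event["bucket"], "object": event["name"]}}
    response = session.post(
        f"{WEB_SERVER_URL}/api/v1/dags/{DAG_ID}/dagRuns", json=payload
    )
    response.raise_for_status()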
Sr. Data Engineer May 2023 – Nov 2023
General Motors – Detroit, MI.
Responsibilities:
●Worked on automating ETL scripts using Python.
●Designed and built data solutions to migrate existing on-prem Hadoop jobs to Google Cloud Platform.
●Performed exploratory data analysis on datasets to check assumptions required for model fitting and hypothesis testing, handle missing values, and apply transformations.
●Transformed data from formats such as XML, JSON, and DSV to Parquet using PySpark (see the sketch following this list).
●Wrote shell scripts to automate pipelines and handle data errors.
●Scheduled jobs in Cloud Composer/Airflow (a minimal DAG sketch follows the Environment line below).
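A minimal sketch of the format conversions described in the list above, assuming pipe-delimited DSV files and GCS paths (both illustrative); XML would additionally require the external spark-xml package.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-conversion").getOrCreate()

# DSV: read with an explicit delimiter and header row, write as Parquet.
dsv_df = (spark.read.option("header", "true").option("delimiter", "|")
          .csv("gs://raw-bucket/input/*.dsv"))
dsv_df.write.mode("overwrite").parquet("gs://curated-bucket/dsv_parquet/")

# JSON: multiLine handles pretty-printed files; the schema is inferred.
json_df = spark.read.option("multiLine", "true").json("gs://raw-bucket/input/*.json")
json_df.write.mode("overwrite").parquet("gs://curated-bucket/json_parquet/")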
Environment: GCP, Sqoop, SQL, PySpark, Spark SQL, Astronomer/Airflow, Hive, GitHub.
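A minimal sketch of the kind of DAG scheduled in Composer/Airflow above; the dag_id, schedule, and task callable are illustrative assumptions.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_etl():
    # Placeholder for the PySpark/shell ETL step this DAG would launch.
    print("running ETL step")

with DAG(
    dag_id="daily_etl",
    start_date=datetime(2023, 5, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="run_etl", python_callable=run_etl)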
Sr. Data Engineer Jan 2020 – Aug 2022
Infor Global Solutions – IN
Responsibilities:
●Worked on automating ETL scripts using Python.
●Designed and built data solutions to migrate existing source data in Teradata and DB2 to BigQuery (Google Cloud Platform).
●Integrated Azure Cognitive Services for AI-driven capabilities in applications.
●Designed, developed, and automated pipelines to ingest data from Teradata and Oracle into Cloud Storage on GCP.
●Integrated Azure Functions with other Azure services for event-driven architectures.
●Transformed, cleansed, and backfilled data and created models in BigQuery to support business reporting for the EHRs.
●Deployed and managed web applications using Azure App Service.
●Utilized Azure Storage solutions, including Blob Storage, Azure Files, and Azure Disk Storage.
●Transformed data from formats such as XML, JSON, and DSV to Parquet using PySpark.
●Wrote shell scripts to automate pipelines and handle data errors.
●Scheduled jobs in Cloud Composer/Airflow and Tidal.
●Integrated Azure DevOps for continuous integration and continuous deployment (CI/CD).
●Implemented a one-time data migration of multi-state data from SQL Server to Snowflake using Python and SnowSQL (see the sketch at the end of this section).
Environment: GCP, Sqoop, SQL, PySpark, Logstash, Teradata, GitHub.
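A minimal sketch of the one-time SQL Server-to-Snowflake migration pattern described above, assuming a pandas/pyodbc extract staged and loaded with the Snowflake Python connector; the DSN, credentials, and table names are illustrative assumptions.

import pandas as pd
import pyodbc
import snowflake.connector

# Extract the source rows from SQL Server (assumed DSN and table).
src = pyodbc.connect("DSN=sqlserver_src")
df = pd.read_sql("SELECT * FROM dbo.state_metrics", src)
df.to_csv("/tmp/state_metrics.csv", index=False)

# Stage the extract and load it with COPY INTO; assumes the target
# Snowflake table STATE_METRICS already exists.
conn = snowflake.connector.connect(
    account="my_account", user="loader", password="***",
    warehouse="LOAD_WH", database="ANALYTICS", schema="PUBLIC",
)
cur = conn.cursor()
cur.execute("PUT file:///tmp/state_metrics.csv @%STATE_METRICS")
cur.execute(
    "COPY INTO STATE_METRICS FROM @%STATE_METRICS "
    "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
)
conn.close()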
Education:
M.S in Computer Science, University of North Carolina Greensboro Aug 2022 – May 2024
Bachelor of Technology in Electronics and Communication Engineering
Sri Indu College of Engineering and Technology Aug 2017 – July 2021