Data Engineer Computer Science

Location: Frisco, TX
Posted: March 05, 2024


Revanth Kumar Manepally

linkedin.com/in/revanth-kumar-42b262188
ad34rk@r.postjobfree.com
Telephone: +1-913-***-****
New York, United States

EDUCATION:

University of Central Missouri (Master’s in Computer Science) August 2021 - July 2023

Jawaharlal Nehru Technological University (Bachelor’s in Computer Science) August 2016 - October 2020

SKILLS:

Languages: Python, Java, JavaScript, Scala, Shell Scripting

Cloud Services: Azure, AWS

Big Data Ecosystem: Hadoop, Pig, Spark, MapReduce, Hive, HBase, HDFS, Kafka, Sqoop, Flume, Oozie, Airflow, NiFi

Tools: Power BI, Tableau, Git, GitHub, Docker, Kubernetes, JIRA, Jenkins

Databases: MySQL, SQL Server, MongoDB, DynamoDB, Oracle, Snowflake

Web Technologies: HTML, XML, CSS

Operating Systems: MacOS, Windows Server, Linux

EXPERIENCE:

T-Mobile, Bellevue, WA January 2023 – Present

Role: Data Engineer

Developed and maintained data pipelines to extract, transform, and load (ETL) data from various sources into the company's data warehouse.

Created Python and Spark data pipelines for ETL parsing and analytics, resulting in a structured data model.
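
A minimal sketch of what such a PySpark ETL job can look like; the file paths, schema fields, and output layout below are illustrative assumptions rather than the actual production pipeline:

# Minimal PySpark ETL sketch: parse raw JSON events into a structured, partitioned table.
# Paths, schema fields, and table locations are placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("etl_parse_events").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("customer_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_ts", TimestampType()),
])

# Extract: read raw files from the landing zone.
raw = spark.read.schema(event_schema).json("/data/landing/events/")

# Transform: drop malformed rows, derive a partition column, deduplicate.
clean = (
    raw.dropna(subset=["event_id", "event_ts"])
       .withColumn("event_date", F.to_date("event_ts"))
       .dropDuplicates(["event_id"])
)

# Load: write a partitioned table into the warehouse layer.
clean.write.mode("overwrite").partitionBy("event_date").parquet("/data/warehouse/events/")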

Conducted comprehensive performance tuning and optimization of complex SQL queries, improving database response times.

Deployed the Cloudera Hadoop framework for distributed computing, increasing data processing throughput over a cluster of up to seventeen nodes.

Involved in creating Hive tables, loading them with data, and writing Hive queries on top of data stored in HDFS.

Created data visualizations using Tableau and Power BI to support data analysis and decision-making.

Developed CI/CD pipelines to automate deployment processes and ensure data solutions are continuously integrated and delivered.

Participated in transforming Hive/SQL queries into Python-based transformations.
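
As a hedged illustration of this kind of rewrite, the sketch below converts a simple Hive aggregate into an equivalent PySpark DataFrame transform; the database, table, and column names are assumed for the example:

# Hive/SQL query rewritten as a PySpark DataFrame transform (illustrative names).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("sql_to_dataframe").enableHiveSupport().getOrCreate()

# Original HiveQL:
#   SELECT customer_id, COUNT(*) AS orders, SUM(amount) AS revenue
#   FROM sales.orders
#   WHERE order_date >= '2023-01-01'
#   GROUP BY customer_id;
orders = spark.table("sales.orders")

summary = (
    orders.filter(F.col("order_date") >= "2023-01-01")
          .groupBy("customer_id")
          .agg(F.count("*").alias("orders"), F.sum("amount").alias("revenue"))
)

summary.write.mode("overwrite").saveAsTable("sales.customer_order_summary")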

Created pipelines in ADF using linked services, datasets, and pipeline activities to extract, transform, and load data between sources such as Azure SQL Database, Blob Storage, and Azure SQL Data Warehouse, including write-back to the source systems.

Extracted, transformed, and loaded data from source systems to Azure Data Storage services using a combination of Azure Data Factory and Data Lake Analytics.

Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, and Azure SQL) and processed the data in Azure Databricks.
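
A minimal sketch of such an ingestion step, assuming a Databricks notebook (where the spark session is pre-created) and placeholder storage account, container, and table names:

# Read raw CSV files from Azure Data Lake Storage Gen2 and persist a curated Delta table.
# Assumes the Databricks-provided `spark` session; account and container names are illustrative.
from pyspark.sql import functions as F

raw_path = "abfss://raw@examplestorageacct.dfs.core.windows.net/sales/2024/"

df = (
    spark.read.option("header", "true").csv(raw_path)
         .withColumn("ingest_ts", F.current_timestamp())
)

# Persist to the curated zone as a Delta table for downstream reporting loads.
df.write.format("delta").mode("append").saveAsTable("curated.sales_raw")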

Managed version control of ETL/ELT scripts and data science models within CI/CD workflows.

Worked on migrating datasets and ETL workloads from On-Prem to AWS Cloud Services.

Wrote Spark-Streaming applications to consume the data from Kafka topics and write the processed streams to HBase.
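
A hedged sketch of one way such a streaming job can be structured, using Spark Structured Streaming with a foreachBatch sink and the Thrift-based happybase client for HBase; the broker, topic, table, and column-family names are assumptions:

# Consume a Kafka topic with Spark Structured Streaming and write each micro-batch to HBase.
# Broker, topic, HBase host/table, and column family are placeholders; assumes an HBase Thrift server.
import happybase
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("kafka_to_hbase").getOrCreate()

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "usage-events")
         .load()
         .select(F.col("key").cast("string"), F.col("value").cast("string"))
)

def write_batch(batch_df, batch_id):
    # Write one micro-batch to the 'events' HBase table (small batches assumed).
    conn = happybase.Connection("hbase-thrift-host")
    table = conn.table("events")
    for row in batch_df.collect():
        row_key = (row["key"] or str(batch_id)).encode()
        table.put(row_key, {b"d:payload": row["value"].encode()})
    conn.close()

query = events.writeStream.foreachBatch(write_batch).start()
query.awaitTermination()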

Configured, built, and deployed new data pipelines in production using Apache Spark.

Performed Sqoop jobs to load data into HDFS and conducted validations.

Designed an Apache Airflow data pipeline to automate data ingestion and retrieval.

Analogics Tech India Ltd, Hyderabad, India August 2019 – July 2021

Role: Data Engineer

Orchestrated collaboration across cross-functional teams to architect, develop, and maintain end-to-end data pipelines leveraging Azure Data Factory, reducing data processing time and enabling real-time analytics.

Designed a Python-based hourly ETL for incremental Snowflake updates from a Postgres database.
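
A minimal sketch of how an hourly incremental load like this can be wired up, driven by an updated_at high-water mark; the connection parameters, tables, and columns below are illustrative assumptions:

# Incremental Postgres -> Snowflake load driven by an updated_at high-water mark.
# DSNs, credentials, and table/column names are placeholders.
import pandas as pd
import psycopg2
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

def load_increment(last_run_ts: str) -> int:
    # Extract only rows changed since the previous hourly run.
    with psycopg2.connect("dbname=app host=pg-host user=etl password=***") as pg:
        df = pd.read_sql("SELECT * FROM public.orders WHERE updated_at > %s",
                         pg, params=[last_run_ts])
    if df.empty:
        return 0

    # Stage the delta in Snowflake, then MERGE it into the target table.
    sf = snowflake.connector.connect(
        account="acme-xy12345", user="ETL_USER", password="***",
        warehouse="LOAD_WH", database="ANALYTICS", schema="STAGING",
    )
    try:
        write_pandas(sf, df, "ORDERS_DELTA", auto_create_table=True)
        sf.cursor().execute(
            "MERGE INTO ANALYTICS.CORE.ORDERS t USING ANALYTICS.STAGING.ORDERS_DELTA s "
            "ON t.order_id = s.order_id "
            "WHEN MATCHED THEN UPDATE SET t.status = s.status, t.updated_at = s.updated_at "
            "WHEN NOT MATCHED THEN INSERT (order_id, status, updated_at) "
            "VALUES (s.order_id, s.status, s.updated_at)"
        )
    finally:
        sf.close()
    return len(df)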

Streamlined data processing by integrating Spark, Hadoop, and Hive, enabling efficient handling of large datasets while improving processing speed and resource utilization.

Initiated a Python-driven data automation system, reducing manual data processing time and improving overall data accuracy and integrity.

Integrated automated testing into CI/CD pipelines to ensure data quality and reliability.

Developed solutions for detecting lost history records, hard deletes, and missing data in ETL.
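
One such check, sketched below, compares primary keys between a source extract and the target table to flag hard deletes and records that never landed; the key values shown are illustrative:

# Compare source and target primary keys to detect hard deletes and missing records.
import pandas as pd

def find_key_gaps(source_keys: pd.Series, target_keys: pd.Series) -> dict:
    src, tgt = set(source_keys), set(target_keys)
    return {
        "hard_deleted_in_source": sorted(tgt - src),  # still in target, gone from source
        "missing_in_target": sorted(src - tgt),       # extracted but never loaded
    }

# Example with two small key extracts:
print(find_key_gaps(pd.Series([1, 2, 3, 5]), pd.Series([1, 2, 3, 4])))
# {'hard_deleted_in_source': [4], 'missing_in_target': [5]}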

Streamlined and fortified Azure Data Lake Storage through implementation of data security protocols and access controls, safeguarding sensitive information while enabling seamless data accessibility; reduced potential security breaches by 40%.

Built data validation checks to verify data integrity and consistency between the source and target databases.
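
A minimal sketch of such a check, assuming both sides can be reached with a Postgres-compatible driver; the DSNs, table, and column names are placeholders:

# Row-count and checksum comparison between a source and a target table.
import psycopg2

CHECKS = {
    "row_count": "SELECT COUNT(*) FROM {table}",
    "amount_sum": "SELECT COALESCE(SUM(amount), 0) FROM {table}",
}

def _scalar(conn, query):
    cur = conn.cursor()
    cur.execute(query)
    return cur.fetchone()[0]

def compare_tables(source_dsn: str, target_dsn: str, table: str) -> dict:
    results = {}
    with psycopg2.connect(source_dsn) as src, psycopg2.connect(target_dsn) as tgt:
        for name, sql in CHECKS.items():
            query = sql.format(table=table)
            src_val, tgt_val = _scalar(src, query), _scalar(tgt, query)
            results[name] = {"source": src_val, "target": tgt_val, "match": src_val == tgt_val}
    return results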

Played an active role in identifying and resolving software bugs.

Created a Python-based data analysis tool for extracting relevant insights from huge datasets.

Utilized SQL for querying and analyzing large datasets.

Developed a Databricks script to load data from Redshift into Snowflake and perform data quality tests.
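
A hedged sketch of this kind of Databricks job, assuming the spark-redshift and Spark Snowflake connectors are installed on the cluster and the notebook-provided spark session; connection options and object names are placeholders:

# Pull a table from Redshift, run simple data quality assertions, then load it into Snowflake.
# Connector short names, URLs, and credentials below are illustrative.
from pyspark.sql import functions as F

redshift_df = (
    spark.read.format("redshift")
         .option("url", "jdbc:redshift://redshift-host:5439/dev?user=etl&password=***")
         .option("dbtable", "public.orders")
         .option("tempdir", "s3a://etl-temp/redshift/")
         .option("forward_spark_s3_credentials", "true")
         .load()
)

# Basic data quality tests before loading downstream.
assert redshift_df.count() > 0, "Redshift extract is empty"
assert redshift_df.filter(F.col("order_id").isNull()).count() == 0, "Null primary keys found"

snowflake_options = {
    "sfURL": "acme-xy12345.snowflakecomputing.com",
    "sfUser": "ETL_USER", "sfPassword": "***",
    "sfDatabase": "ANALYTICS", "sfSchema": "CORE", "sfWarehouse": "LOAD_WH",
}

(redshift_df.write.format("snowflake")
    .options(**snowflake_options)
    .option("dbtable", "ORDERS")
    .mode("overwrite")
    .save())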

Gained practical expertise in Elasticsearch, enhancing data indexing and search capabilities.

Created data pipelines for generating training data and importing it into the Snowflake data warehouse for model training.

CERTIFICATIONS:

DP-203: Azure Data Engineer Associate

AWS Certified Solutions Architect - Associate


