Associate Data Engineer

Location:

The Colony, TX

Posted:

May 10, 2023


Resume:

DHEEKSHA PANJAM

The Colony, Texas 75056 • 913-***-**** • **************@*****.***

PROFESSIONAL SUMMARY

Highly motivated and detail-oriented data engineering professional, eager to apply skills in data extraction, transformation, and loading to deliver robust and scalable data solutions. Experienced in SQL, Python, and ETL tools, with a solid understanding of data warehousing concepts. Strongly focused on quality and accuracy, and committed to delivering data-driven insights that help drive business decisions.

SKILLS

Languages: Python, Scala, Spark, UNIX

Big Data Technologies: Hadoop, Spark, Hive, Kafka

Databases: MySQL, SQL Server, Snowflake

Tools: PySpark, AWS, GitHub, Jenkins

Operating System: Windows, Linux, Mac

Methodologies: Agile, SDLC, Waterfall

Cloud Computing: AWS EC2, S3, RDS, DynamoDB, SNS, SQS, Step Functions, Lambda functions

Container Platforms: Docker, Kubernetes, CI/CD

API Design and Development

WORK HISTORY

01/2023 - Current

Data Engineer

Vantage ERP-Staples - Houston, TX

Completed assigned tasks around the development of ETL pipelines, metadata definitions and models, queries, and reports; scheduled query jobs using Databricks.

Able to analyze data sets to support data quality reviews, with strong SQL skills and the ability to create stored procedures and views.

Extensively used AWS S3, EC2 and EMR instances to deploy and test applications in various environments.

Good understanding of data ingestion, Airflow operators for data orchestration, and related Python libraries.

Used Python collections to manipulate and loop through different user-defined objects.

Used MongoDB as a NoSQL database to store data, and used Hadoop's HDFS for large-scale distributed storage.

Used Apache Spark for real-time data processing, analyzing data as it arrives and generating real-time analytics.

Worked on data that combined unstructured and structured data from multiple sources, and automated cleaning using Python scripts.

Worked on Python API calls and landed data in S3 from external sources.

Implemented AWS Lambda functions to run scripts in response to events in an Amazon DynamoDB table.

Worked on task automation and task conversion with a focus on risk reduction and process migration from on-premises solutions to cloud solutions.

Worked with data engineers and data scientists to help understand gaps between Product Integrity Datasets and Digital Manufacturing by leveraging advanced analytics.

Involved in performance tuning of applications at various levels, including Hive and Spark.

Able to write and optimize diverse SQL queries; working knowledge of RDBMSs such as SQL Server and MySQL.

07/2020 - 12/2021

Associate Data Engineer

Vision Tree Software Services - Hyderabad, India

Built a CI/CD pipeline to automate processes using Python.

Worked on multiple applications alongside other engineers to design, build, and deliver data solutions.

Experience building and optimizing AWS data pipelines, architectures, and data sets.

Collaborated with data engineers and the operations team to implement ETL processes; wrote and optimized SQL queries to perform data extraction to fit analytical requirements.

Developed Python and SQL code used in the transformation process.

Familiar with ETL processes and tools such as AWS Glue, Data Pipeline, and Apache Airflow for moving data in and out of databases.

Experience querying external data sources such as S3 and Glue.

Involved in automating vulnerability management patching and CI/CD using Chef and other tools such as Jenkins.

Used Spark Streaming APIs to perform transformations and actions on the fly for building common.

EDUCATION

01/2022

Master of Science: Computer Science

University of Central Missouri - Warrensburg, MO

07/2021

Bachelor of Technology: Electronics and Telematics Engineering

G. Narayanamma Institute of Technology and Science - Hyderabad, India
