Data Engineer Big

Location:

Dallas, TX, 75398

Salary:

50000

Posted:

December 05, 2023


Resume:

Abilash Reddy Vedeer

DATA ENGINEER

EDUCATION

Master of Science in Computer and Information

Southern Arkansas University, Arkansas

Bachelor's in Mechanical Engineering

Sree Nidhi Institute of Science and Technology, India

SKILLS

Programming languages: Python, SQL

Packages: NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn, TensorFlow

Databases: MySQL, PostgreSQL, MongoDB

Web Technologies: HTML, CSS

Cloud Technologies: AWS, S3, EMR, EC2, Lambda Function, Redshift, AWS Glue

Big Data Technologies: Hadoop, Kafka, YARN, Apache Spark, Apache Tomcat

Build/Other tools: UML, MS Visio, Maven, Gradle

IDEs: Visual Studio Code, PyCharm, Jupyter Notebook

Operating Systems: Windows, Linux

Other Technical Skills: Machine Learning, Data Management, Marketing Analytics, Financial Decision Making, Information Technology Strategy, Jira, Digital Innovation, ETL/ELT Process Innovation & Management, SSIS, SSRS, SSAS, Kubernetes, Informatica

Other skills: Critical analysis, requirements analysis, user requirements analysis, problem-solving, unit testing

Version Control Tool: Git, GitHub

Certification: AWS Cloud Practitioner

SUMMARY

Accomplished data engineer with 4+ years of hands-on experience using Python, SQL, and modern data technologies to deliver impactful data-driven solutions.

Solid foundation in big data technologies encompassing Hadoop, Kafka, and Apache Spark, enabling efficient data processing and real-time analytics.

Expertise in database management, spanning MySQL, PostgreSQL, and MongoDB, proficiently handling data storage, retrieval, and optimization.

Skilled in data visualization using Matplotlib and Seaborn, enabling stakeholders to access insights and make informed decisions.

In-depth knowledge of ETL/ELT process innovation and management, garnered through hands-on experience with Informatica, SSIS, and SSRS.

Adept problem solver with a knack for critical analysis, requirements assessment, and user requirements analysis, facilitating efficient project execution.

Proficient in Git and GitHub for version control, facilitating efficient collaboration and tracking of project changes.

Strong focus on data security and privacy, incorporating data encryption techniques and role-based access control, achieving security compliance during internal audits.

EXPERIENCE

Data Engineer JPMorgan Chase, TX Jul 2022 - Present.

Spearheaded a real-time data pipeline initiative at JPMorgan Chase, using Apache Spark to cut data latency by 40%.

Engineered robust ETL scripts in Python and SQL, effectively managing the processing of a substantial 2TB data volume daily, leading to a commendable 25% surge in overall data throughput.

Implemented a comprehensive testing framework to meticulously validate data accuracy, elevating precision levels to an impressive 99%.

Employed advanced data cleansing techniques within Apache Spark, ensuring impeccable data quality standards throughout transformations.

Used Python and SQL for scripting and data manipulation, enabling precise handling of large-scale raw datasets.

Successfully addressed data latency concerns, leading to heightened system responsiveness and an enhanced user experience.

Thrived in an Agile environment, adapting swiftly to evolving project requirements and leveraging continuous integration and continuous deployment (CI/CD) practices for efficient software delivery.

Data Engineer Deloitte, India Oct 2018 - Aug 2021.

Led a project at Deloitte, bringing together customer data using Talend, which reduced time spent on combining information by 40%.

Used Hadoop and Hive to quickly manage and analyze millions of customer transactions, making it 30% faster to get answers from the data.

Improved data accuracy by 95% using Informatica Data Quality, ensuring reliable information for decision-making.

Built product recommendations for customers in Apache Spark, increasing their likelihood of purchase by 20%.

Set up Amazon Redshift for fast data storage and used Tableau for visualization, improving decision-making by 25%.

Spearheaded the creation of refined customer profiles through a unified data hub, enabling granular segmentation and empowering precise targeting.

Demonstrated mastery over Talend, Informatica, and Apache Spark, expertly wielding these tools to deliver tangible project outcomes.

Elevated operational efficiency for the retail client, actualizing data-enabled decisions.

203-***-****

************.*******@*****.***

TX (Open to Relocate)
