Rudranu Palit

Dallas, TX 214-***-**** ad0ycc@r.postjobfree.com https://www.linkedin.com/in/rudranu/

SUMMARY

Data Engineer with 5+ years of experience building and optimizing data pipelines, data warehouses, and data models for enterprise customers, using AWS and Azure services and big data tools to achieve faster time to market. Proficient in Python, SQL, and Spark. Skilled at applying machine learning and analytics techniques to generate faster insights and improve data quality and security.

TECHNICAL SKILLS

Programming: Python, Java, R, SQL, PySpark, Shell scripting, Scala, JavaScript

Cloud & ETL: DynamoDB, Glue, Athena, Elastic MapReduce (EMR), Lake Formation, Step Functions, RDS, Lambda, IAM, EC2, S3, SNS, Kinesis, Azure Data Factory, Databricks, Cosmos DB, Azure Synapse, Azure Functions, Event Hubs, ADLS Gen2

Certifications: Azure Data Fundamentals, AWS Solutions Architect Associate, Oracle Cloud Foundations, Oracle Business Intelligence Specialist.

Big Data technologies: Hadoop, MapReduce, HDFS, Kafka, YARN, Apache Spark, ZooKeeper, Apache Hudi

PROFESSIONAL EXPERIENCE

Data Engineer, RumbleOn, Dallas, TX May 2023 – Present

• Worked with retail enterprise customers to build enterprise data platform (EDP), data management, and big data solutions. Used Python to create and maintain a scalable data pipeline that consumes Eleads API opportunity data, integrating AWS services such as S3, SQS, Lambda, Redshift, Glue, and CloudWatch; the opportunity data helped the marketing team reach a wider customer base, contributing to a 16.7% QoQ increase in sales revenue.
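A minimal sketch of the kind of SQS-triggered Lambda handler such a pipeline might use to land raw Eleads payloads in S3; the bucket name, prefix, and payload handling are illustrative assumptions, not the actual project code.

```python
import json
import os
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

# Illustrative settings; the real bucket and prefix names are assumptions.
RAW_BUCKET = os.environ.get("RAW_BUCKET", "example-eleads-raw")
RAW_PREFIX = "eleads/opportunities"


def handler(event, context):
    """SQS-triggered Lambda: land each Eleads API payload as a raw JSON object in S3."""
    records = event.get("Records", [])
    for record in records:
        payload = json.loads(record["body"])
        key = "{}/{}/{}.json".format(
            RAW_PREFIX,
            datetime.now(timezone.utc).strftime("%Y/%m/%d"),
            record["messageId"],
        )
        s3.put_object(
            Bucket=RAW_BUCKET,
            Key=key,
            Body=json.dumps(payload).encode("utf-8"),
        )
    # Downstream Glue/Redshift loads would pick up objects under RAW_PREFIX.
    return {"processed": len(records)}
```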

• Wrote dbt SQL scripts to transform data in the AWS Redshift layer, building models and scheduling daily runs that feed RumbleOn financial reports in Power BI.

• Built RumbleOn financial reports in Power BI and scheduled incremental refreshes.

• Built custom Python scripts and performed data extraction and ingestion into data warehouses using ETL tools such as AWS Glue and EMR, working with sources including DynamoDB (NoSQL), JDBC-based relational databases, Redshift, and Snowflake, in both structured and unstructured formats.
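As a hedged illustration of this kind of Glue ETL job, the sketch below reads a catalogued source into a DynamicFrame, applies light cleanup, and writes curated Parquet to S3; the catalog database, table, field names, and output path are placeholders.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a catalogued DynamoDB/JDBC source; database and table names are placeholders.
source = glue_context.create_dynamic_frame.from_catalog(
    database="example_catalog_db", table_name="example_source_table"
)

# Light cleanup before loading to the warehouse layer; field names are illustrative.
cleaned = source.drop_fields(["_metadata"]).resolveChoice(
    specs=[("amount", "cast:double")]
)

# Write curated Parquet to S3 for downstream Redshift/Snowflake loads.
glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={"path": "s3://example-curated-bucket/opportunities/"},
    format="parquet",
)

job.commit()
```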

• Designed and implemented efficient, robust ETL/ELT workflows to automate data lake creation using AWS Step Functions, driven largely by Python.
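One hedged sketch of how Python can drive such a workflow: starting a Step Functions execution that runs the lake-creation steps. The state machine ARN, input fields, and helper name are hypothetical.

```python
import json

import boto3

sfn = boto3.client("stepfunctions")

# Hypothetical state machine that chains crawling, cataloguing, and permission
# grants for a new data lake zone.
STATE_MACHINE_ARN = (
    "arn:aws:states:us-east-1:123456789012:stateMachine:example-lake-creation"
)


def start_lake_creation(dataset: str, source_path: str) -> str:
    """Start one ETL/ELT run of the lake-creation workflow and return its execution ARN."""
    response = sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=json.dumps({"dataset": dataset, "source_path": source_path}),
    )
    return response["executionArn"]


if __name__ == "__main__":
    print(start_lake_creation("opportunities", "s3://example-raw-bucket/eleads/"))
```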

• Leveraged Python to build real-time data streaming applications using Spark and AWS Kinesis streams, enabling efficient reporting and data access patterns.
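A hedged Structured Streaming sketch in that spirit, assuming a Kinesis source connector (for example the Databricks/Qubole "kinesis" format) is available on the cluster; the stream name, record schema, and output paths are assumptions.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kinesis-streaming-sketch").getOrCreate()

# Assumed record schema; the real payload fields were not given.
schema = StructType([
    StructField("opportunity_id", StringType()),
    StructField("status", StringType()),
    StructField("updated_at", TimestampType()),
])

# Requires a Kinesis source connector that registers the "kinesis" format.
raw = (
    spark.readStream.format("kinesis")
    .option("streamName", "example-opportunity-stream")
    .option("region", "us-east-1")
    .option("initialPosition", "latest")
    .load()
)

# The Kinesis source exposes the record body as a binary "data" column.
events = raw.select(
    from_json(col("data").cast("string"), schema).alias("event")
).select("event.*")

query = (
    events.writeStream.format("parquet")
    .option("path", "s3://example-stream-bucket/opportunities/")
    .option("checkpointLocation", "s3://example-stream-bucket/_checkpoints/opportunities/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```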

Data Engineering Intern, PRA Group, Norfolk, VA May 2022 – Dec 2022

• Implemented Python scripts and libraries in AWS Glue to optimize SQL queries, resulting in improved data retrieval and analysis.

• Wrote efficient, optimized SQL queries to fetch, transform, and analyze data from data warehouses in AWS Redshift.

• Resolved performance bottlenecks and provided tuning recommendations for customers using Extract, Transform and Load (ETL) pipelines on Hadoop, Spark, and Hive, resulting in 35% shorter downtimes.

• Identified and fixed customer issues related to data ingestion, processing, storage, and analysis using AWS tools such as Lambda, S3, Glue, and Redshift.

• Troubleshot and optimized queries using methods such as statistics, joins, indexes, and code changes.

• Enhanced an internal troubleshooting tool with a new feature that automated data entry and cut case resolution time by 20%.

Data Engineer, TCS, Kolkata, India Jan 2016 – Aug 2021

• Acquired expertise in AWS Glue, EMR, Athena, Virtual Private Cloud (VPC), Elastic Compute Cloud (EC2), Lambda, Simple Storage Service (S3), CloudFormation, Simple Queue Service (SQS), Identity and Access Management (IAM), Azure Functions, Azure Data Factory (ADF), Cosmos DB, ADLS Gen2, Databricks, Event Hubs, Active Directory, and Azure Synapse Analytics.

• Collaborated with other AWS and Azure support engineers and developers to ensure timely and effective solutions for complex big data problems.

• Designed and built an intelligent marketing system that captures user activity on BT’s wholesale broadband website and serves tailored content using ML models. Built on Azure services including Azure Functions, Azure Event Hubs, Azure Stream Analytics, ADLS Gen2, Cosmos DB, Intelligent Recommendations, Azure Personalizer, Web Apps, and Power BI, the system increased the marketing conversion rate to 31.7% QoQ.

• Created BI reports using Power BI for British Telecom’s Wholesale Broadband Customers.

• Utilized Azure analytics capabilities such as Power BI to architect and deliver reusable packaged solutions that provided clients with instant results and reduced time to insights from weeks to days.

• Used the Azure Databricks platform to ingest customer information, cleaned and aggregated the data with PySpark, and helped data scientists train ML models on the processed data.
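A minimal PySpark sketch of this cleaning-and-aggregation step; the ADLS Gen2 paths, column names, and aggregations are illustrative assumptions rather than the actual Databricks notebooks.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("customer-cleaning-sketch").getOrCreate()

# Placeholder ADLS Gen2 paths; the real containers and columns are assumptions.
customers = spark.read.parquet(
    "abfss://raw@exampleaccount.dfs.core.windows.net/customers/"
)

# Deduplicate, drop null keys, and backfill a categorical default.
cleaned = (
    customers.dropDuplicates(["customer_id"])
    .filter(F.col("customer_id").isNotNull())
    .fillna({"segment": "unknown"})
)

# Aggregate per-segment features that downstream ML models could train on.
features = cleaned.groupBy("segment").agg(
    F.count("*").alias("customer_count"),
    F.avg("monthly_spend").alias("avg_monthly_spend"),
)

features.write.mode("overwrite").parquet(
    "abfss://curated@exampleaccount.dfs.core.windows.net/customer_features/"
)
```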

• Created and delivered data-driven assets using big data workflows and visualization tools that enhanced customer engagement by 45% for internal and external stakeholders.

• Built a near-real-time data pipeline on Azure to capture customer reactions and feedback for BT’s new product launch, using Azure Synapse Analytics, Cognitive Services, Azure ML, ADLS Gen2, Power BI, and Azure Web Apps.

• Performed ETL by building data pipelines on the AWS stack and migrated data from Oracle Cloud to Spark, cutting processing time by 6 hours.

• Implemented a REST API to make metadata queryable, enabling self-service and managed data access.
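A small Flask sketch of what such a metadata API could look like; the endpoints, dataset names, and in-memory store are hypothetical stand-ins for the real catalog backend.

```python
from flask import Flask, abort, jsonify

app = Flask(__name__)

# Hypothetical in-memory metadata store standing in for the real catalog backend.
METADATA = {
    "opportunities": {
        "owner": "sales-analytics",
        "location": "s3://example-curated-bucket/opportunities/",
        "format": "parquet",
    },
    "customers": {
        "owner": "crm",
        "location": "s3://example-curated-bucket/customers/",
        "format": "parquet",
    },
}


@app.route("/datasets", methods=["GET"])
def list_datasets():
    """Self-service listing of registered datasets."""
    return jsonify(sorted(METADATA.keys()))


@app.route("/datasets/<name>", methods=["GET"])
def get_dataset(name):
    """Return metadata for one dataset, supporting discovery and access checks."""
    if name not in METADATA:
        abort(404)
    return jsonify(METADATA[name])


if __name__ == "__main__":
    app.run(port=8080)
```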

• Onboarded data backups to AWS S3 using Kafka Connect and implemented a scalable backfill framework.
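A hedged sketch of registering such an S3 sink through the standard Kafka Connect REST API, assuming the Confluent S3 sink connector; the Connect host, topic, bucket, and connector settings are placeholders.

```python
import requests

# Standard Kafka Connect REST endpoint; host and connector settings are placeholders.
CONNECT_URL = "http://localhost:8083/connectors"

s3_sink_config = {
    "name": "example-s3-backup-sink",
    "config": {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "topics": "customer-events",
        "s3.bucket.name": "example-backup-bucket",
        "s3.region": "us-east-1",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "flush.size": "1000",
        "tasks.max": "2",
    },
}

# Register the sink connector that continuously backs topic data up to S3.
response = requests.post(CONNECT_URL, json=s3_sink_config, timeout=30)
response.raise_for_status()
print(response.json())
```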

• Built, deployed, and maintained applications using CI/CD pipelines in Jenkins and Octopus.

• Contributed to the migration of data from AWS Redshift to Snowflake, which saved £120,000 in annual costs.

• Tuned Spark applications and Hive scripts to achieve optimal performance, and used the Spark DataFrame and Spark APIs to implement batch and stream processing jobs.

EDUCATION

The University of Texas at Dallas May 2023

Master of Science, Business Analytics

Veer Surendra Sai University of Technology, India May 2015

Bachelor of Science, Electrical Engineering
