Dinesh
Data Engineer
Email: ******.**.****@*****.*** Mobile: +1-401-***-**** Location: United States
SUMMARY
Experienced Data Engineer with expertise in designing and optimizing data pipelines. Developed and deployed robust ETL frameworks, improving data processing efficiency by 35%. Proficient in cloud services, Snowflake, Power BI, and QuickSight, delivering actionable insights and supporting advanced analytics. Skilled at translating complex data requirements into efficient technical solutions.
TECHNICAL SKILLS
● Programming Languages: Python, Java, Scala, SQL
● Big Data Technologies: Hadoop, HDFS, Spark, Hive, Kafka, PySpark
● Data Storage and Databases: MySQL, Oracle SQL, PL/SQL, PostgreSQL, T-SQL, NoSQL
● Data Workflow and Orchestration: Airflow
● Data Visualization: Tableau, Power BI, Looker, QuickSight
● Version Control: Git, GitHub
● Operating Systems: Windows, Mac OS, Linux, Unix
● Scripting: Shell scripting
● Methodologies: Agile
● CI/CD Tools: Jenkins
● Data File Formats: Parquet, Avro
PROFESSIONAL EXPERIENCE
AMD | Data Engineer
March 2024 – Present | Boston, MA, USA
● Optimized data pipelines utilizing Apache Airflow to streamline ETL processes for data integration, enhancing data accessibility and reducing processing time.
● Created complex data transformations in PySpark, processing over 500 million records per month. Reduced data transformation time by 40% through code optimization and efficient Spark operations.
● Designed and implemented end-to-end data workflows in Azure Databricks, integrating with Azure Data Lake and Power BI. Delivered data visualizations to stakeholders within 24 hours, reducing reporting turnaround by 50%.
● Employed a range of Python packages, including Pandas and NumPy, to optimize data extraction, processing, and analysis, yielding a 13% reduction in processing time.
● Enhanced business intelligence and data warehousing through the implementation of SQL and Tableau dashboards in an Azure DevOps architecture, resulting in a 19% improvement in efficiency and productivity on the data platform.
● Applied Azure DevOps for CI/CD pipelines, version control, and automated deployments of data solutions.
● Designed and optimized SQL Server Reporting Services (SSRS) reports, delivering actionable insights to stakeholders with real-time dashboards.
● Developed and maintained SQL Server Integration Services (SSIS) packages for seamless data migration and transformation.
● Facilitated data transfer from Oracle EBS to Azure Synapse Analytics, using Azure Blob Storage as a staging and validation layer with Azure Data Factory and Azure Databricks, resulting in a 21% reduction in data processing time and a 50% improvement in data accuracy.
● Collaborated with a cross-functional team of three to design and implement automated procedures triggered by external system events, leveraging Azure Logic Apps and Azure Functions, achieving a 50% reduction in manual workload and a 60% improvement in real-time synchronization efficiency.
● Engineered and deployed resilient ETL pipelines leveraging Azure Data Factory and Azure Databricks, optimizing data extraction, transformation, and loading processes. This initiative led to enhanced data consistency and a 25% reduction in time-to-insight.
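For illustration only, the vectorized Pandas/NumPy optimization pattern referenced in the bullets above (replacing row-wise `apply` with whole-column array operations) can be sketched as follows; the DataFrame contents and column names are hypothetical, not taken from an actual project:

```python
import numpy as np
import pandas as pd

# Illustrative sample data; a real pipeline would read from a lake or warehouse.
df = pd.DataFrame({
    "qty": [2, 5, 3, 8],
    "unit_price": [10.0, 4.5, 7.25, 3.0],
    "region": ["NA", "EU", "NA", "APAC"],
})

# Slow row-wise alternative (one Python call per row):
# df["revenue"] = df.apply(lambda r: r["qty"] * r["unit_price"], axis=1)

# Vectorized NumPy arithmetic instead: one pass over contiguous arrays.
df["revenue"] = df["qty"].to_numpy() * df["unit_price"].to_numpy()

# Conditional column via np.where rather than per-row branching.
df["high_value"] = np.where(df["revenue"] > 20, "yes", "no")

# Aggregate for downstream reporting.
summary = df.groupby("region", as_index=False)["revenue"].sum()
```

On large frames this style keeps the work inside NumPy's compiled loops, which is the usual source of the order-of-magnitude speedups such bullets describe.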
Vodafone | Data Engineer
August 2019 – August 2022 | Pune, India
● Designed and implemented data processing pipelines using Spark SQL to transform raw data into usable formats for downstream analytics and reporting tools.
● Built and maintained ETL workflows using Airflow to automate the scheduling and execution of data processing jobs on a regular basis.
● Deployed and managed AWS EMR clusters to process large-scale datasets using Apache Spark, reducing data processing times by 50% in batch workflows.
● Implemented end-to-end data analytics pipelines, including data ingestion, processing, analysis, and visualization, leveraging tools such as Apache Spark and Hadoop.
● Created Snowpipes for continuous data loading in Snowflake.
● Designed ETL processes using AWS Glue to prepare datasets for training machine learning models, ensuring high data quality and accelerating model deployment.
● Designed and developed RESTful APIs for data retrieval, transformation, and integration, ensuring secure and efficient data access.
● Designed and implemented efficient ETL processes to extract, transform, and load data from various operational systems into the EDW, ensuring data consistency and accuracy.
● Utilized Pandas and NumPy for feature engineering, preparing datasets for machine learning models and improving model performance.
● Optimized complex SQL queries in Redshift using distribution keys, sort keys, and compression techniques, reducing query times by 40%.
● Leveraged Docker to integrate various data sources and tools, enabling seamless data flow and processing.
● Optimized data processing workflows using Apache Spark to reduce processing time.
● Implemented comprehensive data governance frameworks to ensure data quality, consistency, and compliance across the organization.
● Conducted daily monitoring of ETL pipelines to ensure operational integrity and performance consistency.
● Built PySpark code for ETL pipelines that convert, clean, and aggregate data in Databricks clusters.
CERTIFICATIONS
AWS Data Engineer Associate
EDUCATION
Master of Science in Computer Science
Bachelor of Technology in Computer Science and Engineering (Hons)