Data Engineer Analyst

Location: Fayetteville, AR

Posted: October 25, 2024

Resume:

MANOJ KUMAR BORRA

+1-430-***-**** USA

************@*****.*** linkedin.com/in/manoj-kumar-borra

PROFESSIONAL SUMMARY

Experienced Data Engineer with 3+ years in the Big Data ecosystem and expertise in SQL Server, Azure IaaS, and data architecture. Proven success in designing scalable data applications, optimizing database performance, and reducing cloud infrastructure costs by 50%. Adept at leading cross-functional teams to deliver high-quality solutions through data pipelines, ETL processes, and CI/CD pipelines. Strong in data modeling, data security, and governance across cloud platforms like Azure and AWS. Proficient in Power BI and Databricks for creating data-driven insights, and a dedicated problem-solver known for achieving substantial cost savings and efficiency improvements.

SKILLS

Technical Skills: ETL, Apache Spark, PySpark, Scala

Databases: MSSQL, PostgreSQL, MongoDB, Oracle, MySQL, Snowflake, RDBMS

Big Data Technologies: HDFS, Hive, MapReduce, YARN

Programming Languages: Python, Shell Scripting, SQL, PL/SQL

Libraries: Pandas, NumPy

Visualization Tools: Excel, Power BI, Tableau

Cloud Platforms: Azure (SQL Database, Data Lake Storage, Blob, Data Factory, Databricks, Synapse Analytics, Functions, Event Hub, Power BI); AWS (S3, EC2, Lambda, Redshift, Glue)

DevOps Tools: Git, GitHub, Bitbucket, Azure DevOps

Soft Skills: Confident, Dedicated, Self-Learner, Team Player, Good Listener

EDUCATION

Master’s in Computer and Information Science, Southern Arkansas University

B.Tech in Electronics and Communication Engineering, Lovely Professional University

WORK HISTORY

Data Engineer Nov 2023 - Present

LinkedIn Sunnyvale, CA, USA

• Engaged with product owners to define key performance indicators (KPIs), leading data processing and analytics to align with business goals.

• Collaborated with cross-functional teams to build high-quality data solutions using Azure Databricks and Azure Data Factory, reducing job run times by 30% through Spark code optimization.

• Collaborated on a project to integrate Google BigQuery with our data pipelines, resulting in a 20% increase in data processing efficiency.

• Developed Python scripts and leveraged Pytest for ETL pipelines, achieving a 60% reduction in failures through robust testing frameworks.

• Migrated custom microservices to Kafka, enabling real-time data processing, which cut platform costs by 25%.

• Led CI/CD pipeline improvements for Databricks, resulting in seamless updates and faster deployment cycles.

• Reduced Azure infrastructure expenses by 50% through effective cost optimization strategies.

• Designed scalable data architecture solutions leveraging Azure IaaS and data modeling principles, ensuring system reliability and scalability.

• Delivered Power BI dashboards for actionable insights based on defined KPIs, supporting data-driven decision-making.

• Implemented data governance frameworks to enhance data quality, security, and compliance across data pipelines.

• Serving as the Scrum Master, facilitating Agile project management practices and ensuring effective team collaboration.

• Mentoring and providing guidance to interns and freshers, fostering their professional growth and development.

Data Engineer Jan 2023 - Nov 2023

Client: Walmart Dallas, TX, USA

Employer: Advithri Technologies

• Developed Sqoop jobs for Oracle-to-Hive data migration and optimized ETL processes in Azure.

• Performed ETL from source systems to Azure Data Storage using Azure Data Factory, Spark SQL, T-SQL, and Azure Data Lake Analytics; ingested data into Azure services and processed it with Azure Databricks.

• Enhanced Python and Spark code for efficiency, and worked with JSON, ORC, and Parquet data formats on HDFS using PySpark.

• Optimized ETL processes, achieving a 20% reduction in processing time, and implemented SSIS, SSAS, Spark (PySpark, Spark SQL), and Scala for data processing and analysis.

• Analyzed structured data with Spark SQL, tuned Spark applications, and used Spark RDD transformations for business analysis.

• Migrated MapReduce jobs to Spark RDD transformations and worked in Agile sprints for task organization.

• Deployed scripts with Jenkins following CI/CD processes, created Hive partitions and buckets for performance optimization, and generated reports with ETL (Informatica) jobs from MySQL databases.

• Optimized T-SQL queries for performance, improving the speed of data processing by 30% and ensuring high system availability.

• Tuned SQL queries and performed performance optimization for data-intensive applications, ensuring quick and reliable reporting.

• Ensured data security and governance, aligning with industry standards and compliance regulations.

Data Engineer Aug 2020 - Dec 2021

Client: Best Buy Austin, TX

Employer: Gaman Software Solutions Pvt. Ltd Hyderabad, India

• Supported Microsoft Agile BI operations using Azure Data Factory, Azure Databricks, SQL Server, SSAS, Power BI, and Azure Synapse Analytics.

• Developed ADF pipelines for data loading, with incremental and delta loads, and automated triggers.

• Created and managed Power Apps solutions with SQL and SharePoint, and automated data processing with Azure Data Factory, Spark SQL, and T-SQL.

• Ingested and processed data in Azure Data Lake, Azure Storage, Azure SQL, and Azure SQL Data Warehouse using Azure Databricks.

• Migrated on-premises databases to Snowflake and handled production issues including job failures and Azure pipeline errors.

• Performed ETL from source systems to Azure Data Storage services using Azure Data Factory, T-SQL, Spark SQL, and U-SQL in Azure Data Lake Analytics.

• Managed pipeline jobs, scheduled triggers, and mapped data flows using Azure Data Factory (V2), and utilized Key Vaults for credential storage.

• Designed data models and visualizations for data warehousing and reporting using Power BI.

CERTIFICATIONS

Python Crash Course Certification, Google

SQL for Data Science

Microsoft Azure for Data Science by Coursera
