
Data Engineer Solutions

Location:
Santa Clara, CA
Posted:
September 23, 2024

Resume:

Manikumar Siddireddy

Data Engineer

Cupertino, CA | +1-618-***-**** | *********@*********.*** | LinkedIn

SUMMARY

Data Engineer with 3 years of experience building scalable data pipelines and optimizing ETL workflows. Proficient in using cloud platforms such as AWS, Azure, and GCP to design robust data solutions, with a focus on data warehousing and real-time analytics. Skilled in SQL, Python, and data modeling, with hands-on experience in big data tools such as Hadoop and Spark and visualization platforms such as Tableau. Proven track record of reducing data processing times by 30% through pipeline optimization and delivering high-performance data solutions that drive business insights.

EXPERIENCE

Data Engineer, Berkshire Hathaway, USA Jan 2024 - Present

• Developed data warehousing solutions using SQL Server and Azure Data Lake, centralizing insurance policy data for improved accessibility and analysis.

• Designed predictive models with Python and Scikit-learn, achieving a 15% improvement in risk assessment for underwriting processes.

• Coordinated cross-team collaboration to develop scalable ETL workflows using Databricks, ensuring efficient data flow and timely availability.

• Implemented advanced analytics and reporting in Power BI, delivering insights that optimized claims processing and reduced fraud detection times.

• Engineered big data solutions with Spark and Hadoop, enabling the analysis of large volumes of customer data to identify trends and patterns.

• Orchestrated data migrations to AWS, enhancing data security and scalability for critical insurance datasets.

Data Engineer (Full Time, Remote), Cognizant Pvt Limited, Chennai, India Aug 2021 - Aug 2022

• Optimized SQL queries and PostgreSQL database performance, achieving a 30% reduction in query response time.

• Automated data workflows with Apache Airflow, increasing pipeline efficiency and reducing manual intervention by 40%.

• Configured and maintained CI/CD pipelines using Git and GitHub, streamlining deployment processes for data applications.

• Designed and deployed scalable data solutions on AWS and Azure, managing over 2 million records monthly to enhance data accessibility and system performance.

• Developed ETL pipelines utilizing Python, SQL, and Talend, processing over 1 TB of data daily to support comprehensive business analytics and reporting.

• Implemented data visualization strategies using Tableau and Power BI, facilitating data-driven decision-making across 5 departments.

• Engineered machine learning models with Scikit-learn and TensorFlow, leading to a 20% improvement in predictive accuracy for customer segmentation.

• Designed and deployed infrastructure as code using Terraform for scalable, automated cloud resource provisioning.

Data Engineer, SeaGate, India Jun 2020 - Jul 2021

• Crafted advanced data visualization dashboards in Tableau and Power BI, delivering actionable insights that resulted in a 25% increase in operational efficiency.

• Guided the setup and maintenance of Git-based version control systems, enhancing code management and team collaboration.

• Spearheaded the implementation of big data technologies like Hadoop and Hive, facilitating efficient processing of large-scale datasets and boosting query performance.

• Transformed raw data into meaningful metrics using SQL and DAX, aiding key business strategies and enhancing reporting accuracy.

• Instituted rigorous data cleaning processes with Python and Pandas, cutting data errors by 30% and improving data accuracy.

• Crafted data integration solutions with SSIS and Informatica, merging data from different sources and enhancing data quality for over 500,000 records.

EDUCATION

Master of Science in Computer Science May 2024

Southern Illinois University, Carbondale, United States

Bachelor of Technology in Computer Science and Engineering May 2021

Amrita Vishwa Vidyapeetham, Coimbatore, India

SKILLS

Programming Languages: R, Python, SQL, SAS, Spark, PySpark, Java, C++

IDEs: PyCharm, Jupyter Notebook, R Shiny

Cloud Technologies and Services: AWS, Azure Cloud, Azure Data Factory, Azure Synapse Analytics, Azure SQL Database, PaaS, IaaS, SaaS, Azure Databricks, ETL, Data Cleaning, Pipelines, Snowflake

Visualization Tools: Tableau, Power BI, SSRS, DAX

ETL Tools: SSIS, Apache NiFi, Apache Kafka, Talend, Apache Airflow, Informatica

Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow

Machine Learning: Linear and Logistic Regression, Decision Trees, SVM, Random Forests, Naive Bayes, K-Means

Databases and Big Data: SQL Server, MongoDB, MySQL, PostgreSQL, Hadoop, Apache Spark

Version Control: Git, CI/CD, GitHub

Methodologies: SDLC, Agile, Scrum, Waterfall


