SAIKUMAR K
********.*****@*****.*** (***) -*** **64 https://www.linkedin.com/in/saikumar2613/
PROFESSIONAL SUMMARY
Data Science Engineer with nearly five years of experience developing and deploying machine learning models. Proficient in Python, R, SQL, and TensorFlow for data analysis and model building, with expertise in data preprocessing, feature engineering, and model evaluation. Strong background in statistical analysis, predictive modeling, and natural language processing (NLP). Proven ability to work with large datasets and optimize algorithms for performance. Experienced in collaborating with cross-functional teams to deliver data-driven solutions, and adept at presenting insights with visualization tools such as Tableau and Matplotlib.
TECHNICAL SKILLS
Languages: Python, R, SQL, Java, C++
Data Manipulation and Analysis: Pandas, NumPy, dplyr, tidyverse
Machine Learning and AI: Scikit-learn, TensorFlow, Keras, PyTorch, XGBoost, LightGBM, Deep Learning, Natural Language Processing (NLP), Computer Vision
Data Visualization: Matplotlib, Seaborn, Plotly, Tableau, Power BI, ggplot2
Big Data Technologies: Hadoop, Spark, Hive, Pig
Data Engineering: ETL Pipelines, Apache Kafka, Airflow, AWS Glue
Cloud Platforms: Amazon Web Services (AWS), Google Cloud Platform (GCP), Microsoft Azure
Version Control and Collaboration: Git, GitHub, GitLab, Bitbucket
Development Tools: Jupyter Notebooks, RStudio, VSCode, PyCharm
DevOps: Docker, Kubernetes, Jenkins, CI/CD Pipelines
Databases: MySQL, PostgreSQL, MongoDB, Cassandra, Redis
Testing Tools: Selenium, JUnit, Mockito
EXPERIENCE
CLIENT: Fannie Mae LOCATION: Reston, VA
ROLE: AWS Data Engineer Aug 2023 – Present
Project Responsibilities:
Engineered and optimized scalable ETL pipelines using AWS Glue, AWS Lambda, and Amazon Kinesis, successfully processing and transforming terabytes of structured and unstructured data daily.
Designed and implemented a robust data lake architecture on AWS S3, leveraging partitioning, compression, and lifecycle policies to enhance data retrieval efficiency and cost management.
Developed complex data transformation workflows with Apache Spark and Python, and loaded the processed data into Amazon Redshift, improving query performance and reducing latency for analytics.
Automated infrastructure provisioning and deployment processes with AWS CloudFormation and Terraform, ensuring consistent, reliable, and reproducible environments for data engineering tasks.
Implemented advanced data security measures using AWS IAM, AWS KMS, and encryption best practices, ensuring compliance with industry standards and protecting sensitive financial data.
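The partitioned S3 data-lake layout described above can be illustrated with a small helper that builds Hive-style partition keys, the layout AWS Glue crawlers and query engines can prune on. This is an illustrative sketch, not project code; the bucket, dataset, and file names are hypothetical:

```python
from datetime import date

def partition_key(bucket: str, dataset: str, d: date, filename: str) -> str:
    """Build a Hive-style partitioned S3 key (year=/month=/day=),
    so downstream queries can skip irrelevant partitions."""
    return (
        f"s3://{bucket}/{dataset}/"
        f"year={d.year}/month={d.month:02d}/day={d.day:02d}/{filename}"
    )

# Hypothetical example: one Parquet part file for Jan 15, 2024
key = partition_key("finance-lake", "loan_events", date(2024, 1, 15), "part-0000.parquet")
print(key)
```

Zero-padding the month and day keeps partition values lexicographically sortable, which simplifies range filters and lifecycle rules.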
CLIENT: CVS Health LOCATION: Hartford, CT
ROLE: Data Science Engineer Jan 2023 – July 2023
Project Responsibilities:
Developed predictive analytics models using Python and TensorFlow to optimize inventory management, and integrated these models with internal applications via APIs to enhance patient care.
Engineered data preprocessing workflows with Pandas and NumPy for cleaning and standardizing healthcare datasets, and leveraged Hadoop and Spark for distributed processing of large prescription and patient data sets.
Performed statistical analysis and A/B testing with R and SQL to evaluate health interventions, and created interactive dashboards and data visualizations using Tableau and Power BI for business decision-making.
Designed and implemented ETL pipelines with Apache Airflow and AWS Glue to integrate data from multiple sources, and utilized AWS services (EC2, S3, Redshift) for scalable data storage and computation.
Optimized database performance and managed data warehousing solutions using MySQL, PostgreSQL, and MongoDB, while maintaining CI/CD pipelines with Jenkins and ensuring compliance with healthcare regulations and data privacy standards.
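The cleaning and standardization step described above can be sketched in plain Python (the production workflows used Pandas and NumPy; the field names and sample records here are hypothetical):

```python
from datetime import datetime

def clean_records(rows, required=("patient_id", "drug", "filled_on")):
    """Standardize raw prescription rows: normalize field names,
    trim strings, lowercase drug names, parse dates, and drop
    rows missing any required field."""
    cleaned = []
    for row in rows:
        rec = {k.strip().lower(): (v.strip() if isinstance(v, str) else v)
               for k, v in row.items()}
        if any(not rec.get(field) for field in required):
            continue  # incomplete record: exclude from the clean set
        rec["drug"] = rec["drug"].lower()
        rec["filled_on"] = datetime.strptime(rec["filled_on"], "%Y-%m-%d").date()
        cleaned.append(rec)
    return cleaned

raw = [
    {"Patient_ID": " P001 ", "Drug": " Atorvastatin ", "Filled_On": "2023-03-01"},
    {"Patient_ID": "P002", "Drug": "", "Filled_On": "2023-03-02"},  # dropped: empty drug
]
print(clean_records(raw))
```

The same normalize-validate-parse sequence scales up directly to vectorized Pandas operations or distributed Spark jobs.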
CLIENT: Knowx Innovation Pvt. Ltd LOCATION: Bangalore, India
ROLE: Data Engineer Sep 2020 – July 2022
Project Responsibilities:
Developed and deployed predictive models using machine learning algorithms in Python on AWS SageMaker to optimize business processes and provide scalable predictions.
Conducted data preprocessing and feature engineering on diverse datasets using SQL, Pandas, and Apache Spark to ensure high data quality and enhance model performance, and designed ETL pipelines with AWS Glue and Lambda for seamless, accurate data integration.
Performed statistical analysis and hypothesis testing with tools such as R and Scikit-learn to derive actionable insights and support strategic decision-making. Presented complex analytical findings to stakeholders using Tableau, Matplotlib, and AWS QuickSight, enabling data-driven business decisions.
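The hypothesis-testing work mentioned above typically compares outcome rates between two groups. As an illustrative stdlib-only sketch (the actual analyses used R and Scikit-learn; the sample counts below are hypothetical), a pooled two-proportion z-test looks like:

```python
from math import sqrt, erf

def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion z-test with a pooled variance estimate,
    for comparing success rates between groups A and B."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal CDF (via erf)
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical experiment: 120/1000 vs 150/1000 successes
z, p = two_proportion_ztest(120, 1000, 150, 1000)
print(round(z, 3), round(p, 4))
```

In practice a library routine (e.g. a stats package's proportions test) would be used, but the pooled standard error and normal-CDF p-value above are the core of the calculation.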
CLIENT: CES Ltd LOCATION: Hyderabad, India
ROLE: Associate Software Engineer June 2019 – Aug 2020
Project Responsibilities:
Designed and developed scalable data-driven applications using Java and Spring Boot, ensuring high performance and reliability in processing large datasets, and created RESTful APIs and microservices for seamless data access and integration.
Implemented robust data access layers utilizing Hibernate and JPA for efficient interaction with relational databases such as MySQL and PostgreSQL, and developed and optimized ETL processes using Apache Kafka and Apache NiFi to ensure data consistency and accuracy.
Utilized Apache Spark and Hadoop to process and analyze large-scale datasets, enhancing data processing capabilities and reducing computation time.
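The large-scale aggregation pattern above follows Spark's map/reduce-by-key model. A stdlib sketch of the same logic on toy data (the event names are hypothetical; the real jobs ran as Spark transformations over HDFS-resident datasets):

```python
from collections import defaultdict
from functools import reduce

# Toy event log standing in for a large distributed dataset
events = [("orders", 3), ("returns", 1), ("orders", 2), ("returns", 4)]

def seq_op(acc, pair):
    """Fold one (key, value) pair into the running per-key totals,
    mirroring the per-partition combine step of a reduceByKey."""
    key, value = pair
    acc[key] += value
    return acc

totals = reduce(seq_op, events, defaultdict(int))
print(dict(totals))
```

Because the combine operation is associative, Spark can apply it independently on each partition and then merge partial results, which is what makes the aggregation scale.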
EDUCATION
Master of Science - Computer and Information Systems Security and Assurance
University of Central Missouri Aug 2022 – May 2024 GPA: 3.2/4.0
Bachelor of Science - Computer Science and Engineering
Lovely Professional University July 2018 – May 2022 CGPA: 7.01/10.0