SAIKUMAR K
********.*****@*****.*** (***) -*** **64 https://www.linkedin.com/in/saikumar2613/
PROFESSIONAL SUMMARY
Data Science Engineer with nearly five years of experience developing and deploying machine learning models. Proficient in Python, R, SQL, and TensorFlow for data analysis and model building, with expertise in data preprocessing, feature engineering, and model evaluation. Strong background in statistical analysis, predictive modeling, and natural language processing (NLP). Proven ability to work with large datasets and optimize algorithms for performance. Experienced in collaborating with cross-functional teams to deliver data-driven solutions, and adept at presenting insights with visualization tools such as Tableau and Matplotlib.
TECHNICAL SKILLS
Languages: Python, R, SQL, Java, C++
Data Manipulation and Analysis: Pandas, NumPy, dplyr, tidyverse
Machine Learning and AI: Scikit-learn, TensorFlow, Keras, PyTorch, XGBoost, LightGBM, Deep Learning, Natural Language Processing (NLP), Computer Vision
Data Visualization: Matplotlib, Seaborn, Plotly, Tableau, Power BI, ggplot2
Big Data Technologies: Hadoop, Spark, Hive, Pig
Data Engineering: ETL Pipelines, Apache Kafka, Airflow, AWS Glue
Cloud Platforms: Amazon Web Services (AWS), Google Cloud Platform (GCP), Microsoft Azure
Version Control and Collaboration: Git, GitHub, GitLab, Bitbucket
Development Tools: Jupyter Notebooks, RStudio, VSCode, PyCharm
DevOps: Docker, Kubernetes, Jenkins, CI/CD Pipelines
Databases: MySQL, PostgreSQL, MongoDB, Cassandra, Redis
Testing Tools: Selenium, JUnit, Mockito
EXPERIENCE
CLIENT: Fannie Mae LOCATION: Reston, VA
ROLE: AWS Data Engineer Aug 2023 – Present
Project Responsibilities:
Engineered and optimized scalable ETL pipelines using AWS Glue, AWS Lambda, and Amazon Kinesis, successfully processing and transforming terabytes of structured and unstructured data daily.
Designed and implemented a robust data lake architecture on AWS S3, leveraging partitioning, compression, and lifecycle policies to enhance data retrieval efficiency and cost management.
Developed complex data transformation workflows with Apache Spark and Python, and loaded the processed data into Amazon Redshift, improving query performance and reducing latency for analytics.
Automated infrastructure provisioning and deployment processes with AWS CloudFormation and Terraform, ensuring consistent, reliable, and reproducible environments for data engineering tasks.
Implemented advanced data security measures using AWS IAM, AWS KMS, and encryption best practices, ensuring compliance with industry standards and protecting sensitive financial data.
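The partitioned S3 data-lake layout described above can be illustrated with a small helper that builds Hive-style partition keys, the layout AWS Glue crawlers and query engines can prune on. This is an illustrative sketch, not project code; the bucket, dataset, and file names are hypothetical:

```python
from datetime import date

def partition_key(bucket: str, dataset: str, d: date, filename: str) -> str:
    """Build a Hive-style partitioned S3 key (year=/month=/day=),
    so downstream queries can skip irrelevant partitions."""
    return (
        f"s3://{bucket}/{dataset}/"
        f"year={d.year}/month={d.month:02d}/day={d.day:02d}/{filename}"
    )

# Hypothetical example: one Parquet part file for Jan 15, 2024
key = partition_key("finance-lake", "loan_events", date(2024, 1, 15), "part-0000.parquet")
print(key)
```

Zero-padding the month and day keeps partition values lexicographically sortable, which simplifies range filters and lifecycle rules.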
CLIENT: CVS Health LOCATION: Hartford, CT
ROLE: Data Science Engineer Jan 2023 – July 2023
Project Responsibilities:
Developed predictive analytics models using Python and TensorFlow to optimize inventory management, and integrated these models with internal applications via APIs to enhance patient care.
Engineered data preprocessing workflows with Pandas and NumPy for cleaning and standardizing healthcare datasets, and leveraged Hadoop and Spark for distributed processing of large prescription and patient data sets.
Performed statistical analysis and A/B testing with R and SQL to evaluate health interventions, and created interactive dashboards and data visualizations using Tableau and Power BI for business decision-making.
Designed and implemented ETL pipelines with Apache Airflow and AWS Glue to integrate data from multiple sources, and utilized AWS services (EC2, S3, Redshift) for scalable data storage and computation.
Optimized database performance and managed data warehousing solutions using MySQL, PostgreSQL, and MongoDB, while maintaining CI/CD pipelines with Jenkins and ensuring compliance with healthcare regulations and data privacy standards.
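The cleaning and standardization step described above can be sketched in plain Python (the production workflows used Pandas and NumPy; the field names and sample records here are hypothetical):

```python
from datetime import datetime

def clean_records(rows, required=("patient_id", "drug", "filled_on")):
    """Standardize raw prescription rows: normalize field names,
    trim strings, lowercase drug names, parse dates, and drop
    rows missing any required field."""
    cleaned = []
    for row in rows:
        rec = {k.strip().lower(): (v.strip() if isinstance(v, str) else v)
               for k, v in row.items()}
        if any(not rec.get(field) for field in required):
            continue  # incomplete record: exclude from the clean set
        rec["drug"] = rec["drug"].lower()
        rec["filled_on"] = datetime.strptime(rec["filled_on"], "%Y-%m-%d").date()
        cleaned.append(rec)
    return cleaned

raw = [
    {"Patient_ID": " P001 ", "Drug": " Atorvastatin ", "Filled_On": "2023-03-01"},
    {"Patient_ID": "P002", "Drug": "", "Filled_On": "2023-03-02"},  # dropped: empty drug
]
print(clean_records(raw))
```

The same normalize-validate-parse sequence scales up directly to vectorized Pandas operations or distributed Spark jobs.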
CLIENT: Knowx Innovation Pvt. Ltd LOCATION: Bangalore, India
ROLE: Data Engineer Sep 2020 – July 2022
Project Responsibilities:
Developed and deployed predictive models using machine learning algorithms in Python on AWS SageMaker to optimize business processes and provide scalable predictions.
Conducted data preprocessing and feature engineering on diverse datasets using SQL, Pandas, and Apache Spark to ensure high data quality and enhance model performance, and designed ETL pipelines with AWS Glue and Lambda for seamless, accurate data integration.
Performed statistical analysis and hypothesis testing with tools such as R and Scikit-learn to derive actionable insights and support strategic decision-making. Presented complex analytical findings to stakeholders using Tableau, Matplotlib, and AWS QuickSight, enabling data-driven business decisions.
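The hypothesis-testing work mentioned above typically compares outcome rates between two groups. As an illustrative stdlib-only sketch (the actual analyses used R and Scikit-learn; the sample counts below are hypothetical), a pooled two-proportion z-test looks like:

```python
from math import sqrt, erf

def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion z-test with a pooled variance estimate,
    for comparing success rates between groups A and B."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal CDF (via erf)
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical experiment: 120/1000 vs 150/1000 successes
z, p = two_proportion_ztest(120, 1000, 150, 1000)
print(round(z, 3), round(p, 4))
```

In practice a library routine (e.g. a stats package's proportions test) would be used, but the pooled standard error and normal-CDF p-value above are the core of the calculation.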
CLIENT: CES Ltd LOCATION: Hyderabad, India
ROLE: Associate Software Engineer June 2019 – Aug 2020
Project Responsibilities:
Designed and developed scalable data-driven applications using Java and Spring Boot, ensuring high performance and reliability in processing large datasets, and created RESTful APIs and microservices for seamless data access and integration.
Implemented robust data access layers utilizing Hibernate and JPA for efficient interaction with relational databases such as MySQL and PostgreSQL, and developed and optimized ETL processes using Apache Kafka and Apache NiFi to ensure data consistency and accuracy.
Utilized Apache Spark and Hadoop to process and analyze large-scale datasets, enhancing data processing capabilities and reducing computation time.
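The large-scale aggregation pattern above follows Spark's map/reduce-by-key model. A stdlib sketch of the same logic on toy data (the event names are hypothetical; the real jobs ran as Spark transformations over HDFS-resident datasets):

```python
from collections import defaultdict
from functools import reduce

# Toy event log standing in for a large distributed dataset
events = [("orders", 3), ("returns", 1), ("orders", 2), ("returns", 4)]

def seq_op(acc, pair):
    """Fold one (key, value) pair into the running per-key totals,
    mirroring the per-partition combine step of a reduceByKey."""
    key, value = pair
    acc[key] += value
    return acc

totals = reduce(seq_op, events, defaultdict(int))
print(dict(totals))
```

Because the combine operation is associative, Spark can apply it independently on each partition and then merge partial results, which is what makes the aggregation scale.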
EDUCATION
Master of Science - Computer and Information Systems Security and Assurance
University of Central Missouri Aug 2022 – May 2024 GPA: 3.2/4.0
Bachelor of Science - Computer Science and Engineering
Lovely Professional University July 2018 – May 2022 CGPA: 7.01/10.0