Machine Learning Data Science

Location:
Dallas, TX
Posted:
June 18, 2025

KAREEM BABA SHAIK

+1-940-***-**** ***************@*****.*** KareemShaik.com github/heiskareem linkedin/heiskareem

PROFESSIONAL SUMMARY

Motivated and hands-on professional with strong experience in machine learning, data science, and AI engineering through academic research, internships, and independent projects. Skilled in building end-to-end ML pipelines, deploying deep learning models, and performing advanced data analysis across cloud-native and real-time environments. Adept at applying tools like Python, TensorFlow, PyTorch, and AWS to solve practical problems in healthcare, scientific research, and education. Actively seeking entry-level roles as a Machine Learning Engineer, AI Engineer, or Data Scientist to apply technical expertise and contribute to impactful, data-driven solutions.

TECHNICAL SKILLS

• Programming Languages: Python, C++, JavaScript, TypeScript, SQL

• Machine Learning & AI Frameworks: PyTorch, TensorFlow, Keras, Scikit-learn, Hugging Face Transformers, Caffe

• Deep Learning & NLP: Transformers (BERT, RoBERTa, LLaMA, GPT-3.5, Falcon), LSTM, GRU, Text Classification, Sentiment Analysis, Named Entity Recognition, LLM Fine-tuning

• Data Science & Analytics: Predictive Modeling, Time-Series Forecasting (ARIMA), Statistical Analysis, Document Parsing, Data Imputation

• Computer Vision: Image Processing, Object Detection, Segmentation, Tracking

• Big Data & Cloud Platforms: AWS (S3, Lambda, Glue, SageMaker), Azure (ADF, Databricks, ADLS Gen2), Apache Spark, PySpark, Kafka, Snowflake

• MLOps & DevOps: Docker, Jenkins, Git, CI/CD Pipelines, Model Deployment, Experiment Tracking

• Data Visualization: Power BI, Tableau, Matplotlib, Seaborn, Plotly

• Web & APIs: React, Angular, RESTful APIs

• Other Concepts: Multimodal AI, CUDA Programming, NVIDIA APIs, Probabilistic Methods, Model Interpretability, SDLC Documentation

PROFESSIONAL EXPERIENCE

AI Researcher

Responsible AI Lab Mar 2024 – Jan 2025 Denton, TX

• Published and presented original research at the ICCS 2024 conference in Spain, exploring LLM-driven software analysis. [https://arxiv.org/abs/2403.10588]

• Collaborated with scientists from Argonne and Oak Ridge National Labs to analyze scientific codebases, with a focus on the E3SM climate model.

• Built LLM-based pipelines using LLaMA, GPT-3.5, MPT, Falcon, and Claude to extract and summarize complex documentation and metadata.

• Designed Transformer-based systems (BERT, RoBERTa) for entity recognition, citation tracking, and summarization of research articles.

• Trained sequential models (LSTM, GRU) using up to 8 NVIDIA GPUs, leveraging CUDA programming and NVIDIA APIs for high-performance optimization.

• Created Power BI dashboards to visualize simulation trends, performance metrics, and model evaluations.

• Built ETL workflows to handle large volumes of unstructured research and simulation data, enabling smoother experimentation and reproducibility.

• Guided graduate students on topics like model fine-tuning, data pipelines, and documentation practices within cloud-native workflows using AWS SageMaker, Docker, and CI/CD.

Environment: Python (NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, NLTK, SpaCy), TensorFlow, PyTorch, Keras, CUDA, NVIDIA APIs, LangChain, Hugging Face Transformers (BERT, RoBERTa, LLaMA, MPT, GPT-3.5, Claude, Falcon), Spark MLlib, SQL, AWS SageMaker, Power BI, ARIMA, Deep Neural Networks (CNN, RNN, LSTM, GRU), Time Series Forecasting, MLOps (Model Deployment, CI/CD, Experiment Tracking), ETL Pipelines, Cloud Platforms (AWS, Azure), SDLC and documentation tools.

Machine Learning Engineer

Medcords Dec 2021 – Nov 2022 Hyderabad, IN

• Designed and implemented a machine learning system to predict cancer outcomes, helping doctors improve diagnosis and treatment planning.

• Applied models like regression, KNN, and ensemble methods using Python (Scikit-learn, Pandas, NumPy) to improve prediction accuracy and reduce clinical decision lag.

• Built interactive dashboards and tools using React and Angular, allowing medical staff to visualize patient insights clearly and in real time.

• Created and integrated RESTful APIs to connect front-end interfaces with backend analytics engines, enabling seamless access to predictive results and raw data.

• Automated ETL workflows with AWS Glue and Lambda to pull data from multiple sources, store it in S3, and push it into Oracle DB for use in real-time dashboards.

• Worked with PySpark and Spark SQL on Databricks to process large healthcare datasets; optimized cluster performance for speed and cost efficiency.

• Led the migration of SQL Server workloads to Azure cloud and connected Azure Machine Learning with MS R Server for cloud-based analysis.

• Collaborated with statisticians and business analysts, writing complex SQL scripts and building Tableau dashboards to support clinical and business insights.

Environment: Machine Learning (KNN, Clustering, Regression, Random Forest, SVM, Ensemble Methods), Python (Scikit-learn, NumPy, Pandas, Matplotlib, Seaborn, PyTorch), SQL (Oracle 11g, SQL Server 2012, PL/SQL), Big Data (PySpark, Hadoop, Hive, HBase, Sqoop, Flume), AWS (S3, Lambda, Glue, SageMaker), Azure (ADF, Databricks, ADLS Gen2), Kafka, Snowflake, Tableau (Desktop 8.x / Server 8.x), Ubuntu, Linux, Web Technologies (React, Angular, RESTful APIs).

Founder (Instructor, Bootcamp Trainer)

Bootcamp Startup ( learnbaba.in ) Jul 2020 – Oct 2021 Hyderabad, IN

• Founded and ran a coding bootcamp that operated in-person across Hyderabad, Nalgonda, and Miryalaguda, earning over 50,000/month in revenue.

• Taught more than 1,000 hours of live, hands-on classes in Python, Java, C/C++, Data Structures & Algorithms (DSA), full-stack web development, Android development, and introductory data science and machine learning with math.

• Built and launched custom student registration portals, leading to hundreds of student sign-ups from schools and engineering colleges.

• Organized and led weekend bootcamps and two-day technical workshops, focusing on practical skills and real-world projects.

• Personally guided students aged 12 to 22 through project-based learning, helping them overcome their fear of programming and build confidence in problem-solving.

• Designed a project-based curriculum and supported learners in building industry-level applications to strengthen their resumes.

• Recruited, trained, and mentored a small team of instructors to ensure quality teaching and student support across multiple batches.

EDUCATION

Master’s in Artificial Intelligence

University of North Texas Denton, TX Jan 2023 – Dec 2024

Bachelor of Technology (B.Tech) in Computer Science

Malla Reddy Engineering College Hyderabad, India Aug 2018 – Jun 2022


