TEJASWI RUPA NEELAPU
MACHINE LEARNING ENGINEER
Seattle, WA 206-***-**** LinkedIn GitHub Portfolio ***************@*****.***
PROFESSIONAL SUMMARY
Machine Learning Engineer with 2+ years of experience designing and deploying ML models in healthcare, retail, and tech domains. Proficient in Python, Scikit-learn, TensorFlow, and NLP, with a record of improving model accuracy by up to 92%, handling datasets of 250K+ records, and developing scalable ML workflows on cloud platforms. Experienced in fine-tuning models, automating feature pipelines, and collaborating across data science, engineering, and product teams to deploy ML solutions that drive business value.
EDUCATION
Master's in Data Science Seattle University, Seattle, WA September 2023 – June 2025 (Expected)
Bachelor's in Mechanical Engineering Amrita Vishwa Vidyapeetham, Kerala, India July 2018 – June 2022
SKILLS
Languages: Python, R, SQL
Frameworks & Libraries: Scikit-learn, TensorFlow, Keras, NLTK, XGBoost, LightGBM
Modeling Techniques: Classification, Regression, Ensemble Methods, Hyperparameter Tuning, Deep Learning, NLP
MLOps & Deployment: AWS (S3, EC2, QuickSight), Git, Docker, Streamlit, FastAPI
Data Tools: Pandas, NumPy, Matplotlib, Seaborn, Spark
Other Tools: dbt, Snowflake, Tableau, Jupyter
EXPERIENCE
Data Scientist - AI/ML (Capstone) Costco Wholesale Seattle, WA January 2025 – Present
• Cleaned and preprocessed 200K+ customer reviews from 5 platforms using NLP techniques: tokenization, stopword removal, lemmatization, spell correction, and noise filtering.
• Performed topic modeling using LDA to uncover latent themes in feedback and inform product and service improvement strategies.
• Conducted multi-layer sentiment analysis, using VADER for short-form sentiment tagging and BERT models fine-tuned via Hugging Face for deeper contextual understanding.
• Built both extractive and abstractive summarization pipelines using Hugging Face Transformers, enabling scalable summarization of customer reviews.
• Developed an interactive chatbot (Chainlit frontend + Llama 2 backend) to answer natural-language queries on sentiment trends, and a Power BI dashboard to visualize volume, polarity, and keyword trends across all channels.
Data Analyst – ML Support Genmab Inc. Princeton, NJ June 2024 – December 2024
• Designed and deployed an AI-driven heatmap interface using Tableau and dbt to visualize potential trial participants, enabling precision recruitment strategies and reducing timelines by 15%.
• Built automated DataOps pipelines using Snowflake and dbt for data modeling and real-time integration, improving transformation efficiency by 25% and enabling model-driven insights.
• Analyzed large datasets in Snowflake, improving business intelligence workflows and data-driven decision-making.
• Contributed to AI/ML initiatives that generated advanced recruitment insights, increasing forecast accuracy by 20%.
Data Engineer TATA Consultancy Services Bangalore, India May 2022 – August 2023
• Built and streamlined data pipelines using AWS and Snowflake, facilitating efficient ETL processes for 500,000+ records, reducing processing time and ensuring seamless data flow into downstream systems.
• Implemented data validation techniques using SQL, improving data accuracy and consistency, resulting in a 15% reduction in data errors.
• Engineered data models and schemas for enhanced data storage and retrieval, ensuring efficient database performance for large datasets.
• Designed interactive dashboards using AWS QuickSight, translating complex insights into actionable strategies and improving operational efficiency for clients in finance and retail.
PROJECTS
Bird Species Identification using Deep Learning Python, Keras, Neural Networks, Spectrogram Analysis Link
• Developed a deep learning model to classify 12 bird species from audio calls using spectrograms derived from Xeno-Canto datasets.
• Achieved 95.45% accuracy for binary classification and 67.24% accuracy for multiclass prediction by tuning neural network hyperparameters (epochs, batch size).
• Engineered models using Keras Sequential API with Dense layers, Dropout, and Softmax/Sigmoid activations for robust training and generalization.
• Processed and visualized real-world spectrograms for audio recognition; applied model to predict unknown bird calls from raw test data.
Youth Drug Use Prediction with Decision Trees R, Classification & Regression, Random Forest Link
• Modeled marijuana usage among youth using decision trees, random forests, and boosting techniques across binary, multiclass, and regression tasks.
• Achieved 89.3% accuracy for binary classification and 89.6% accuracy for multiclass classification using cross-validation and bagging.
• Conducted feature importance analysis revealing peer influence and social perception as top predictors of drug use behavior.
• Preprocessed and engineered demographic and behavioral data from the NSDUH national youth survey (3,000+ records) for clean and interpretable modeling.