Goutham Thota
******@***.*** Houghton, MI https://www.linkedin.com/in/goutham-thota/ https://github.com/Goutham-19
OBJECTIVE
An enthusiastic and inventive individual pursuing a Master's degree in Data Science, with a Bachelor's degree in Computer Science. Keen to learn new technologies and seeking full-time positions in data science and related fields.
EDUCATION
•Master's in Data Science 04-2024
•Michigan Technological University GPA 3.7/4.0
•B.E. in Computer Science 03-2022
•Sri Chandrasekharendra Saraswathi Viswa Mahavidyalaya GPA 9.0/10.0
SKILLS
Programming & Web development: Python, R, C, HTML & CSS
Database & Big Data: SQL, T-SQL, Hadoop, Spark
Machine learning & Data mining: ML algorithms, data visualization tools, data analytics techniques, Scikit-learn, Pandas, NumPy, Matplotlib, Seaborn, OpenCV
Deep learning: CNN, Keras, PyTorch, TensorFlow, image segmentation
CERTIFICATES
Python
Querying Microsoft SQL Server with Transact-SQL
Machine Learning & Data Science
PROFESSIONAL EXPERIENCE
DATA SCIENCE INTERN 09-2021
EXPOSYS DATA LABS
•Developed a machine learning model to predict diabetes from the attributes of the Pima Indian Diabetes Dataset.
•Handled missing values through imputation and normalized the data for preprocessing.
•Compared seven classification algorithms (Logistic Regression, K-Nearest Neighbor, Support Vector Classifier, Decision Tree, Random Forest, Gaussian Naïve Bayes, and Gradient Boosting); Logistic Regression performed best, with approximately 77% accuracy.
PROJECTS & PUBLICATION
Machine learning projects (Tabular data) 04-2023
Stroke prediction & Breast cancer diagnosis
•Built ML models for early detection of stroke and breast cancer using XGBoost, K-Nearest Neighbors, Random Forest, Decision Trees, SVM, and Logistic Regression.
•Preprocessed the data with outlier detection and missing-value handling, and addressed class imbalance in the stroke data using SMOTE. Used stratified k-fold cross-validation to check model robustness.
•Tuned hyperparameters with grid search cross-validation, achieving accuracies of 89% for stroke prediction and 98.8% for breast cancer diagnosis.
Deep learning projects (Image data)
Chest cancer detection, Brain tumor detection, & Alzheimer's Disease 04-2023
•Developed AI models for early detection of chest cancer, brain tumors, and Alzheimer's disease from medical images.
•Implemented data augmentation using Keras' ImageDataGenerator class to enhance model robustness and utilized Convolutional Neural Networks (CNN) for effective image data analysis.
•Applied transfer learning with ResNet50V2 for chest cancer detection, achieving 86% accuracy. Compared SVM, CNN, and AdaBoost for brain tumor detection, with CNN performing best at 93% accuracy. Achieved 97.73% accuracy in Alzheimer's disease detection.
Sentiment Analysis
Amazon Fine Food Reviews 04-2023
•Analyzed 568,454 Amazon Fine Food reviews to understand online shopping sentiment, preprocessing the text with lowercase conversion and special character removal.
•Trained Naive Bayes, Random Forest, Logistic Regression, and Decision Tree classifiers for sentiment analysis.
•Applied a TF-IDF vectorizer for feature extraction after splitting the data. Logistic Regression performed best, with 89% accuracy, and provided insights into consumer sentiment trends.
Face Recognition-based Smart Attendance System 03-2022
•Built an IoT-based smart attendance system that uses face recognition to automate manual attendance processes.
•Preprocessed data by capturing facial images, applying Haar Cascade for face detection, and converting images to grayscale using Raspberry Pi, Python, and OpenCV.
•Delivered fast, accurate, real-time attendance marking with automatic reporting to spreadsheets, automating classroom attendance and addressing the shortcomings of traditional attendance systems.