Karthik Reddy
Bergen Ave, Kearny, NJ *****
***********@*******.***
https://www.linkedin.com/in/karthik-reddy-588614177/
+1-201-***-****
EDUCATION
New Jersey Institute Of Technology, Ying Wu College Of Computing, Newark, NJ
Master of Science, Information Systems Sep. 2018 – May 2020
Coursework And MOOCs: Machine Learning, Deep Learning, Data Analysis, Web Mining, Database Systems, Statistics
Sreenidhi Institute Of Science And Technology, Hyderabad, Telangana, India
Bachelor of Technology in Mechanical Engineering Aug. 2013 – Apr. 2017
WORK EXPERIENCE
UBER, Hyderabad, India
Data Science Analyst Jun. 2017 – Aug. 2018
Used Pandas, NumPy, SciPy, Matplotlib, Scikit-learn, and NLTK in Python to develop machine learning models, including XGBoost, Logistic Regression, Random Forests, and KNN, for predicting employee attrition
Built data pipelines in Python to compare classification models and select the one with the best score
Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization
Created reports and dashboards using Tableau and Excel to communicate employee performance to the Senior Manager
DATA SCIENCE PROJECTS
Credit Card Fraud Detection Using Machine Learning
Project Github Link Nov. 2019 – Jan. 2020
Built a Machine Learning Classifier in Python to detect whether a transaction is a normal payment or fraud
Anomaly Detection: Removed extreme outliers from features that have a high correlation with the target class
Classifiers: Tuned hyperparameters with GridSearchCV to obtain the best score for each predictive model
Among the predictive models, the Logistic Regression classifier achieved the best score on both the training and validation sets
Python - (Keras, NumPy, Pandas, Scikit-learn, Seaborn), Algorithms - (Decision Tree, Support Vector Classifier, KNN, Neural Networks)
Predicting Term Deposit Subscription Of Banks
Project Github Link Aug. 2019 – Oct. 2019
Predicted whether a potential client will subscribe to a term deposit
Built data pipelines to preprocess the data and used cross-validation to guard against overfitting
Compared several classifiers using ROC curves and found that the Gradient Boosting classifier was the best model for predicting term deposit subscription
Python - (Pandas, Scikit-Learn, Matplotlib, Seaborn), Algorithms - (Gradient Boosting, Decision Tree, Random Forest, KNeighbors Classifier)
Who’s Tweeting? Trump or Trudeau? (Tweet Classification Using NLP)
Project Github Link May 2019 – Jul. 2019
Built a Machine Learning Classifier that identifies whether President Trump or Prime Minister Justin Trudeau is tweeting!
Used the CountVectorizer and TfidfVectorizer classes to create vectorized representations of the tweets by Trump and Trudeau
Trained a Linear SVC model on TF-IDF vectorized tweets and observed an increase in the accuracy of the model
Python - (NumPy, Pandas, Scikit-Learn), Algorithms - (Naive Bayes, Linear Support Vector Classifier)
Airbnb: The Amsterdam Story With Interactive Maps
Project Github Link Mar. 2019 – Apr. 2019
Obtained Inside Airbnb listing data for Amsterdam, analysed popular trends, and predicted Amsterdam’s listing prices. The peak average listing price was 240 Euros
De Baarsjes (3300) and South Of De Pijp (2400) were the top 2 neighbourhoods with the most listings
Centrum West (172 Euros) and Centrum Oost (154 Euros) were the most expensive neighbourhoods
Python - (NLTK, NumPy, Pandas, GeoPandas, Matplotlib, Plotly, Folium)
IMDB Movie Recommender Engine
Project Github Link Jan. 2019 – Feb. 2019
Used the IMDB weighted average rating formula as a metric to scale the ratings and plotted the best movies based on the scaled metric
Built a movie recommender system that suggests movies with similar plot summaries using TF-IDF
Computed the sigmoid kernel to obtain a numerical score for the similarity between two movies
Python - (NumPy, Pandas, Scikit-Learn, Matplotlib, Seaborn)
SKILLS & OTHERS
Machine Learning: Algorithms - Decision Tree, Random Forest, Naive Bayes, Gradient Boosting, Support Vector Classifier, KNN, Linear Regression, Logistic Regression, Neural Networks, Recommender Systems and A/B testing
Programming Languages: Python - (NumPy, Pandas, Scikit-Learn, Matplotlib, Seaborn), SQL
Data Visualization Tools: Tableau, Matplotlib, Seaborn, Excel
Database Software: Microsoft SQL Server, MySQL, MongoDB