Data science

Location: Hyderabad, Telangana, India

Posted: July 07, 2019

Resume:

Lakshmipathi. N

*****************@*****.*** Mobile 998-***-****

https://www.linkedin.com/in/lakshmipathi-n-1b367a102/ https://github.com/lakshmi25npathi

Objective

Build machine learning models that deliver insights and implement action-oriented solutions to complex business problems.

Technical Skills

Python: Pandas, Numpy, Scikit-Learn, Scipy, Keras, Jupyter, NLTK

R: caret, dplyr, tidyverse, rpart, glmnet, mlr, DMwR, DataExplorer

Visualization Tools: Matplotlib, Seaborn, ggplot2, Tableau

Machine Learning: Linear and Logistic Regression, Decision Trees, Random Forest, Support Vector Machines, KNN, Anomaly detection, KMeans Clustering, PCA, LightGBM, Naïve Bayes

Deep Learning: Neural Networks, Convolutional and Recurrent Neural Networks, Text mining

Databases: SQL, MongoDB

Other: Analysis using Excel, Microsoft Office

Training and Certifications

Completed Data Scientist training on the Edwisor e-learning platform.

Certified in the course “Programming with Python for everybody” from the Coursera online platform.

Certified in the course “Learning Python for Data Analysis and Visualization” from the Udemy online platform.

Certified in the course “Python for Data Science and Machine Learning” from the Udemy online platform.

Certified in the course “Data Science and Deep learning with Python” from the Udemy online platform.

Certified in the course “Intro to SQL for Data Science” from the DataCamp online platform.

Certified in the course “Introduction to R” from the DataCamp online platform.

Projects

Santander Customer Transaction Prediction June 2019

The objective of this project is to identify which customers will make a specific transaction in the future, irrespective of the amount transacted, given an imbalanced dataset.

Logistic Regression, Random Forest, and LightGBM models are used for transaction prediction. Resampling techniques are also used to balance the data.

Confusion matrix and AUC (ROC curve) metrics are used for model evaluation.

LightGBM outperformed the logistic regression model, with an AUC score of 0.889.

Python libraries: Numpy, Pandas, sklearn, matplotlib, seaborn, imblearn, LightGBM, eli5, pdpbox, scikitplot.

R libraries: caret, moments, DataExplorer, tidyverse, pdp, glmnet, lightgbm, pROC, ROSE, DMwR, yardstick, randomForest.

(https://github.com/lakshmi25npathi/Machine-learning-projects/tree/Santander-Customer-Transaction-Prediction)

Bike Rental Count Prediction May 2019

The objective of this project is to predict the daily bike rental count based on environmental and seasonal settings.

Linear Regression, Decision Tree, and Random Forest algorithms are used for prediction. RMSE and MAE metrics are used for model evaluation.

The Random Forest algorithm performed well compared to the other models, with an RMSE value of 632.57.

Python libraries: Numpy, Pandas, sklearn, matplotlib, seaborn, graphviz, scipy.

R libraries: tidyverse, corrgram, DMwR, caret, rpart, randomForest.

(https://github.com/lakshmi25npathi/Machine-learning-projects/tree/Bike-Rental-Count-Prediction)

Credit Card Fraud Detection March 2019

The aim of this project is to identify fraudulent credit card transactions in a class-imbalanced dataset.

Logistic Regression and LightGBM models are used for fraud detection. Resampling techniques are also used to balance the data.

Confusion matrix and AUC (ROC curve) metrics are used for model evaluation.

LightGBM outperformed the logistic regression model, with an AUC score of 0.93.

Python libraries: Numpy, Pandas, imblearn, LightGBM, sklearn, matplotlib, seaborn, scikitplot.

(https://www.kaggle.com/lakshmi25npathi/credit-card-fraud-detection)

Sentiment Analysis of IMDB Movie Reviews April 2019

The task is to predict the number of positive and negative reviews based on sentiment, using different classification algorithms.

Logistic Regression, Support Vector Machine, and Multinomial Naïve Bayes models are used for classification. The confusion matrix is used for model evaluation.

Both Logistic Regression and Multinomial Naïve Bayes performed well compared to support vector machines, with an F1 score of 0.75.

Python libraries: Numpy, Pandas, sklearn, nltk, matplotlib, seaborn, wordCloud.

(https://www.kaggle.com/lakshmi25npathi/sentiment-analysis-of-imdb-movie-reviews)

MOOC Courses

Introduction to Python and Data Structures

Introduction to R

Data Mining and Visualization

Statistical Modelling

Machine Learning

Deep Learning

Academic Details

Year  Degree  Institute               CGPA/Percentage
2018  M.Tech  IIT Bombay              7.34
2016  B.E     VTU Belgaum, Karnataka  69.29
2012  12th    Karnataka State Board   74.66
2010  10th                            81.76

Positions of Responsibility

Teaching assistant at IIT Bombay

Student volunteer at Abhyuday Social Club of IIT Bombay
