
Logistic Scientist

Location:
Hyderabad, Telangana, India
Posted:
May 10, 2021


Resume:


Anish Joel A

Data Scientist

Email ID : admabg@r.postjobfree.com

Summary

I am a data scientist with a demonstrated ability to deliver valuable insights using analytics and advanced machine learning methods. I completed the "Data Science Program" at Digital Nest (November 2019 - February 2020) and hold a bachelor’s degree in computer science. Key areas I have worked on are machine learning models, NLP, deep learning, ensembles, and computer vision.

Skills: Data Science

• Machine Learning: Linear Regression using Gradient Descent, Logistic Regression using the Sigmoid Function, Decision Trees, Support Vector Machine (SVM), Random Forests, AdaBoost, XGBoost, Clustering, Dimensionality Reduction with Principal Component Analysis (PCA), and knowledge of Deep Learning architectures (CNN and RNN).

• Optimization Algorithms: Gradient Descent (Stochastic, Batch, Mini-Batch).

• Natural Language Processing: Corpus, Bag of Words, Lemmatization, Stemming, Text Classification, TF-IDF Vectorizer.

• Programming Languages: Python, R, C++, SQL.

• Visualization Tools: Jupyter Notebook, PyCharm, GitHub (username: Bastian023), Tableau, SQL, MySQL.

• Time Series Analysis: Autocorrelation and Partial Autocorrelation Functions, ARIMA.

• Statistical Analysis: Measures of Central Tendency (Mean, Median, Mode), Measures of Spread (Standard Deviation, Variance).

Certifications

Digital Nest

November 2019 – February 2020; “Data Science Program”

Projects

Problem Statement: To predict whether a server machine will soon be hit with malware, based on the server’s configuration, such as OS version, System Value Capacity, etc.

• Checked for missing values and their percentage, and filled the missing values using measures of central tendency.

• Performed train-test split

• Converted string values into numeric form using label encoding.

• This is a classification problem. Models were built using Logistic Regression, a Naïve Bayes classifier, a Gradient Boosting classifier, and an Adaptive Boosting (AdaBoost) classifier.

• Performed GridSearchCV and RandomizedSearchCV on the models for hyperparameter tuning.

• Built a confusion matrix and checked precision, recall, and F1-score for each model.

• Evaluated the trained models with performance metrics such as F1-score.

Problem Statement: To predict the time a car takes to pass testing, in order to reduce the time that cars spend on the test bench.

• Checked for missing values and their percentage. Missing values were filled using measures of central tendency.

• Performed train-test split.

• Converted string values into numeric form using label encoding.

• Applied PCA (Principal Component Analysis) to reduce dimensionality.

• This is a regression problem. The Variance Inflation Factor (VIF) was computed to measure the impact of collinearity.

• Applied different models such as Linear Regression, Gradient Boosting Regressor, and Adaptive Boosting (AdaBoost) Regressor.

• Performed GridSearchCV and RandomizedSearchCV on the models for hyperparameter tuning.

• Applied Lasso and Ridge regularization to address overfitting.

• Evaluated the trained models with performance metrics such as R2-score.

Problem Statement: To classify user reviews using text classification.

• Performed train-test split.

• Performed tokenization to convert sentences into words and removed stop words to get clean tokens.

• Applied stemming or lemmatization to the clean tokens.

• Applied TF-IDF Vectorizer to obtain a numeric dataset.

• Split the corpus into train and test datasets.

• This is a classification problem. Naïve Bayes and Logistic Regression classifiers were built and compared to get better performance.

• Evaluated the trained models with performance metrics such as F1-score.

Problem Statement: To predict the probability that a driver will initiate an auto insurance claim in the next year.

• The dataset is imbalanced.

• After preprocessing and the train-test split, imported SMOTE from imblearn.

• SMOTE balances the dataset by selecting similar records and altering a record one column at a time by a random amount within the difference to its neighboring records.

• Built a logistic regression model because the output variable is categorical.

• Checked the performance of the model using the confusion matrix, F1-score, and AUC-ROC curve.

• Built additional models such as Decision Tree, Random Forest, XGBoost, and KNN to compare accuracy.
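The SMOTE step described above can be illustrated with a minimal sketch in plain Python. This is not the imblearn implementation and not the project's actual code; the function name `smote_like_oversample`, its parameters, and the toy data are assumptions made for illustration only. It picks a minority-class record, finds its nearest minority neighbors, and shifts each column by a random fraction of the difference to one neighbor.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic rows built from a list of minority-class rows.

    Hypothetical helper for illustration; needs at least two minority rows.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        # Move each column a random fraction of the way toward the neighbour,
        # so every new value lies between the two original values.
        synthetic.append(
            [a + rng.random() * (b - a) for a, b in zip(base, neighbour)]
        )
    return synthetic


# Toy minority-class rows (made-up numbers, two features each).
minority_rows = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote_like_oversample(minority_rows, n_new=5, k=2, seed=42)
```

The sketch draws a separate random fraction per column to match the description above; library implementations may choose the fraction differently. In practice, oversampling is applied to the training split only, so that the test split keeps the original class distribution.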

Education :

B.TECH – Bachelor of Technology, 2019

Computer Science and Engineering from Mallareddy College of Engineering, JNTUH.

Intermediate – AP Board of Intermediate Education, 2014

Mathematics and Applied Science from Narayana Jr. College, IPE.

10th – SSC Board, 2012

State Board Syllabus from St. Patrick’s High School.

DECLARATION:

I hereby declare that the above-mentioned facts are true to the best of my knowledge and belief. I hope that my particulars will be considered as per your requirements and, if given a chance, I will work with full sincerity and devotion.

Anish Joel

Mobile number : 998-***-****


