Professional Summary:
I am an aspiring data scientist and chatbot enthusiast with expertise in creating regression models using predictive data modelling. I have deep knowledge of statistical methods, data analytics, database creation, and managing coupled data, along with strong communication and coordination skills. I have acquired Data Science skills (Machine Learning, Exploratory Data Analysis, programming with Python libraries) through both classroom training and online courses.
Education:
Graduation:
B.Tech (Computer Science and Engineering) - Anurag College of Engineering - 2011 to 2015
Intermediate:
Narayana Jr. College - 2009 to 2011
Work Experience:
Company : Data Jango
Role : Data Science Trainee
Duration : 27 January 2019 – Present
Roles and Responsibilities : Collected data for stock market prediction by studying the data and its attributes, their distributions, and their relationships with other attributes.
Built an end-to-end model for a multi-class classification problem (water plant working condition), achieving an accuracy of 82% with Random Forest.
Working on POCs for various projects.
Worked on the IMDB dataset: preprocessed the data, lemmatized the corpus, and used it to build count vectorizer and Term Frequency-Inverse Document Frequency (TF-IDF) models, achieving an accuracy of 92% on the training set.
Applied hashing to the corpus and achieved an accuracy of 89% (the model was slightly overfitted and is currently being fine-tuned).
Built a chatbot using Google’s Dialogflow.
Built a customized Named Entity Recognizer (NER) with spaCy.
DATA SCIENCE TOOLKIT:
Machine Learning:
Supervised learning:
Classification: Logistic Regression, K-Nearest Neighbors, SVM, Decision Trees and Naive Bayes; Ensemble Classifiers: Bagging, Random Forest, Gradient Boosting, AdaBoost and Voting Classifiers
Regression: Linear Regression, Ridge, Lasso and Elastic Net Regression, SVR, Ensemble Regressors: Gradient Boosting, Random Forest
Anomaly Detection: Gaussian distribution, Multivariate normal distribution and Oversampling anomalous records using SMOTE
Unsupervised learning:
Clustering: K-Means clustering, Hierarchical clustering
Dimensionality Reduction: Principal Component Analysis (PCA), Locally Linear Embedding (LLE)
Solvers: Stochastic Gradient Descent
Natural Language Processing:
Text Processing: Cleaning text, Tokenizing, Removing special characters & stop words, Expanding contractions, Case conversions, Correcting words, Stemming, Lemmatization, POS Tagging and Parsing
Text Classification, Summarization: Feature Extraction, Topic modeling, Automated document summarization, Text Classification, Clustering, Semantic Analysis & Sentiment Analysis
Statistical Analysis:
Data Processing : Data transformation, Data quality check
Exploratory Data Analysis: Handling missing values, Feature extraction, Feature transformation and Feature selection.
Correlation, Segmentation, Linear regression and Logistic regression
Tools And Technologies:
Data Science Tools: Python, scikit-learn, TensorFlow, Keras, statsmodels, Pandas, NumPy, SciPy.
Statistical Concepts: Central Tendency (mean, median, mode), Variance, Standard Deviation, Z-Score, Covariance, Correlation, Central Limit Theorem, Statistical Significance, Confidence Interval, P-Value, Chi-Square test, ANOVA.
Deep Learning: DNN, CNN (Image Processing, Object Detection)
NLP: Beautiful Soup, spaCy, NLTK, WordNet and Gensim
Data Visualization: Matplotlib and Seaborn.
Database: MySQL
Web Technologies: HTML, CSS, Bootstrap, JavaScript
Dialogflow: Experience in building basic chatbots using Google’s Dialogflow.
Projects:
Data Jango Technologies Duration: Apr’19 to Present
Title: IMDB Data Set
Team Size: 4 Members
Worked on the IMDB reviews data set and handled all pre-processing.
Built stemmed and lemmatized corpora, which were then used to build count vectorizer and TF-IDF models.
Helped fine-tune the model, achieving an accuracy of 84% on unseen data.
Data Jango Technologies Duration: Jan’19 to Mar’19
Title: Stock Market Data Set
Team Size: 4 Members
Analyzed the provided data to build best-fit models and delivered appropriate insights into the business problem.
Selected the best features and extracted data from the database by identifying relationships within the data provided.
Built a linear regression model for the data and obtained an accuracy of 86.7% on unseen data.
Displayed the predictions on a web page using web services and the sample pickle file.
Tools used: MySQL, Workbench, Python, Jupyter Notebook.
Company : Amazon Development Centre (India) Pvt. Ltd.
Role : Seller Support Associate
Duration : 29 September 2015 – 28 November 2018
Accomplishments:
ISTQB Certified Tester, Foundation Level.
Participated in Code Debugging at Anurag College of Engineering in 2015.
Coordinated the Technical Fest (2015) at Anurag College of Engineering.