Machine Learning Data Science

Location:

Leonia, NJ, 07605

Posted:

May 21, 2024


Resume:

Vijaya Chowta

Email: ad5vbf@r.postjobfree.com, Mobile: +1-201-***-****

SUMMARY:

● 4+ years of hands-on experience in Deep Learning, Data Mining and Machine Learning with large datasets of structured and unstructured data.

● Actively involved in all phases of the data science project life cycle, including data extraction, data preprocessing, statistical modeling and data visualization with large datasets of structured and unstructured data.

● Skilled in using statistical methods including exploratory data analysis (EDA), regression analysis, regularized linear models, time-series analysis, and cluster analysis.

● Experienced with machine learning algorithms such as logistic regression, random forest, KNN, SVM, neural networks, linear regression, lasso regression and k-means.

● Strong skills in statistical methodologies such as hypothesis testing, Principal Component Analysis (PCA), sampling distributions, chi-square tests, time-series analysis, discriminant analysis, Bayesian inference, and multivariate analysis.

● Worked extensively with Python for data engineering and modeling.

● Proficient in data preprocessing, including data cleaning, correlation analysis, imputation, visualization, feature scaling and dimensionality reduction, using Python data science packages.

● Experience in implementing data analysis with various analytic tools, such as Anaconda 4.0, Jupyter Notebook 4.X, and Excel.

● Experience with visualization tools such as Tableau 9.X and 10.X for creating dashboards.

● Ability to maintain a fun, casual, professional, and productive team atmosphere

● Skilled in Advanced Regression Modeling, Correlation, Multivariate Analysis, Model Building, Business Intelligence tools and application of Statistical Concepts.

● Strong experience in hypothesis testing, the normal distribution, and other advanced statistical techniques.

● Developed predictive models using Decision Tree, Random Forest, Naïve Bayes, Logistic Regression, Cluster Analysis, and Neural Networks.

● Experienced in using Python to load, extract and manipulate data, working with Python libraries such as Matplotlib, NumPy, SciPy and Pandas for data analysis.

● Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data.

● Experience in building and deploying production-quality, large-scale applications related to natural language processing and machine learning algorithms.

● Exposure to AI and Deep learning platforms such as TensorFlow, Keras, AWS ML.

● Proficient in Tableau data visualization tools to analyze and obtain insights into large datasets and to create visually powerful, actionable interactive reports and dashboards.

● Generated data visualizations using tools such as Tableau, Python Matplotlib, Python Seaborn.

● Extensive experience in text analytics, developing statistical machine learning and data mining solutions to various business problems, including sentiment analysis.

● Proven ability to manage all stages of project development.

● Strong problem-solving and analytical skills, with the ability to make balanced and independent decisions.

TECHNICAL SKILLS:

Statistical Methods: Hypothesis Testing, ANOVA, Principal Component Analysis (PCA), Correlation (Chi-square test, covariance), Multivariate Analysis, Bayes' Law.

Machine Learning: Linear Regression, Logistic Regression, Naive Bayes, Decision Trees, Random Forest, Support Vector Machines (SVM), K-Means Clustering, K-Nearest Neighbors (KNN), Gradient Boosting Trees, AdaBoost, PCA, LDA, Sentiment Analysis, Natural Language Processing, Time Series (ARIMA, SARIMAX).

Deep Learning: Artificial Neural Networks, RBM, DBN, Convolutional Neural Networks, RNN, Deep Learning on AWS, Keras API.

Data Visualization: Tableau, Python (Matplotlib, Seaborn)

Languages: Python (NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, NLP), SQL, Java

Operating Systems: UNIX Shell Scripting (via PuTTY client), Linux, Windows, Mac OS

Other tools and technologies: TensorFlow, Keras, AWS ML, MS Office Suite, GitHub, AWS

EDUCATION DETAILS:

2004 – 2007: Bachelor's in Statistics, Computers, Nagarjuna University

2007 – 2010: Master of Computer Applications, Nagarjuna University

PROFESSIONAL EXPERIENCE:

Client: LeadXpression, Montgomeryville, PA. Date: Feb 2019 – present. Role: Data Scientist/Machine Learning

Responsibilities:

● Identified risk level and eligibility of new insurance applicants with Machine Learning algorithms.

● Predicted the claim severity to understand future loss and ranked importance of features

● Used Python 2.X and 3.X to implement machine learning algorithms including Generalized Linear Models, SVM, Random Forest, Boosting and Neural Networks

● Evaluated models with K-fold cross-validation to optimize performance and tune hyperparameters

● Identified process improvements that significantly reduce workloads or improve quality

● Handled large datasets and used BigQuery ML to access data on cloud-based computing platforms such as Google Cloud Platform and execute machine learning models.

● Worked on data cleaning, data preparation and feature engineering with Python 3.X, including NumPy, pandas, Matplotlib, Seaborn and Scikit-learn

● Used PCA for dimensionality reduction and K-Means clustering to identify outliers and classify unlabeled data

● Collaborated with product management and other departments to gather the requirements

● Improved model performance using K-fold cross-validation and tested the model on sample data before finalizing it. Used a confusion matrix and an AUC-ROC chart to evaluate the classification model

● Applied various machine learning and statistical modeling techniques, including decision trees, text analytics, supervised and unsupervised learning, regression models, social network analysis, neural networks, deep learning, SVM and clustering, to identify volume using the scikit-learn package in Python

● Ran MapReduce jobs and performed analysis using Python for machine learning and predictive analytics models.

● Used pandas, NumPy, Seaborn, Matplotlib and Scikit-learn in Python to develop various machine learning algorithms

● Checked whether time-series data was stationary or non-stationary using the ADF and KPSS tests.

● Hands-on experience in forecasting with ARIMA and SARIMAX to estimate future house prices.

● Hands-on experience with natural language processing techniques including sentiment analysis and word embeddings.

● Experience in crawling websites to acquire the required data.

● Performed text classification on large volumes of text data using NLP techniques, including sentiment analysis and TF-IDF vectors.

● Worked with data formats such as JSON and XML and implemented machine learning algorithms in Python.

● Worked on cloud-based computing platforms such as Google Cloud Platform to automate data migration tasks

● Set up GCP firewall rules to allow or deny traffic from VM instances based on the specified configuration.

● Designed dashboards with Tableau 9.2 and provided complex reports, including summaries, charts, and graphs, to interpret findings for the team and stakeholders

Environment: Machine Learning, AWS, Python (Scikit-learn, NumPy, pandas, Matplotlib, Seaborn), Linux, Tableau.

Client: WebClickMedia, Montgomeryville, PA. Date: Dec 2016 – Jan 2019. Role: Data Analyst

Responsibilities:

● Extensively involved in all phases of data acquisition, data collection, data cleaning, model development, model validation and visualization to deliver data science solutions

● Built machine learning models that predict product demand across multiple classes based on historical sales data for multiple products. The aim was to improve profit by keeping the right stock of high-demand products while avoiding the cost of holding unnecessary products

● Gathered business requirements from the client and formulated the approach and design methodology to match them

● Built and tested different ensemble models, such as Bootstrap Aggregating (bagging), Bagged Decision Trees, Random Forest and Gradient Boosting, to improve accuracy, reduce variance and bias, and improve model stability

● Generated Heat maps to identify the risk and flaws in the business.

● Developed dashboards, user stories and reports using Tableau to analyze data associated with product performance, customer feedback and strategic decision making.

● Performed data cleaning, including transforming variables and dealing with missing values, and ensured data quality, consistency and integrity using pandas and NumPy

● Used Python 2.x/3.x to develop other machine learning models such as CNNs and RNNs, using Keras, TensorFlow and Scikit-learn.

● Tackled a highly imbalanced fraud dataset using sampling techniques such as undersampling and oversampling with SMOTE (Synthetic Minority Over-Sampling Technique) using Python Scikit-learn

● Utilized PCA and other feature engineering techniques to reduce high-dimensional data, applied feature scaling, and handled categorical attributes using the one-hot encoder from the scikit-learn library

● Developed various machine learning models such as Logistic regression, KNN, and Gradient Boosting with Pandas, NumPy, Seaborn, Matplotlib, Scikit-learn in Python

● Worked on cloud services such as Amazon Web Services (AWS) to do machine learning on big data

● Used cross-validation to test the model on different batches of data and find the best parameters, which improved model performance

● Created and maintained Tableau reports to display the status and performance of deployed models and algorithms

Environment: Python 3.x, Linux, TensorFlow, Tableau, SQL Server 2012, Microsoft Excel, SQL, Scikit-learn, Pandas, AWS(S3), XML
