+1-925-***-**** email@example.com https://www.linkedin.com/in/gopinath21/
Southern Arkansas University, Arkansas
M.S., Computer and Information Science
Passionate Data Scientist with around 7 years of professional experience in statistical modeling, machine learning, data visualization, and data mining on large sets of both structured and unstructured data.
Experience in feature extraction, creating Regression models, Classification, Predictive data modeling and Cluster analysis.
Strong experience in implementing Supervised Machine Learning Algorithms like Linear Regression, Logistic Regression, Linear Discriminant Analysis (LDA), Decision Tree, Random Forest, Support Vector Machines (SVM), Naive Bayes, K-Nearest Neighbor.
Extensive experience in providing Machine Learning and Data Mining solutions to various business problems, based on requirements, using Python.
Strong expertise in implementing Unsupervised Machine Learning Algorithms like Hierarchical clustering, K-means clustering, Probabilistic Clustering, and Density-Based Clustering.
Proficient in using Python libraries like Pandas, NumPy, Scikit-learn, Seaborn, and SciPy for developing various machine learning models.
Around 3 years of experience in developing Deep Learning models like Convolutional Neural Networks (CNN), Artificial Neural Networks, Multilayer Perceptrons (MLPs), and Recurrent Neural Networks (RNN) for recommender systems.
Actively involved as a Data Scientist in all phases of the project life cycle, including Data Extraction, Data Cleaning, Data Visualization, and Model Building.
Strong experience in Software Development Life Cycle (SDLC) including Requirement Analysis, Design Specification and Testing in both Waterfall and Agile methodologies.
Implementation experience in Machine Learning and Deep Learning, including Regression, Classification, Neural Networks, object tracking, and Natural Language Processing (NLP), using packages like TensorFlow, Keras, NLTK, and spaCy.
Strong mathematical knowledge of Matrix Operations, Statistics, Probability, Linear Algebra, Differentiation, Integration, and Geometry.
Worked with various Python data visualization tools such as Matplotlib, Seaborn, ggplot, and pygal, as well as Tableau.
Hands-on experience with the Git version control system.
Strong initiative and innovative thinking, with the ability to guide teammates in breaking down large, complex issues into simpler pieces for easy execution.
Python, SQL, Java
Statistics and Probability, Linear Algebra, Matrix Operations, Calculus
Machine Learning Algorithms
Linear Regression, Logistic Regression, Linear Discriminant Analysis (LDA), Decision Trees, Random Forests, AdaBoost and Gradient Boosting, Support Vector Machines (SVM), Naive Bayes, K-Nearest Neighbor, Hierarchical clustering, K-means clustering, Probabilistic Clustering, Density-Based Clustering.
Machine Learning Techniques
Principal Component Analysis, Data Standardization Techniques, L1 and L2 Regularization, Hyperparameter Tuning, Resampling Techniques like SMOTE and Cluster Centroid Methods, Feature Selection and Feature Engineering, Cross-Validation Methods (K-fold).
Whole Foods Market, Austin Oct 2018 - Present
Performed customer segmentation based on customer behavior, demographics, and transactions, using customer-specific details such as age and income, and created multiple customer classes.
Analyzed customer purchase data and product trends to recommend product types to customers based on behavior tracked through their accounts.
Explored and created new data sets and implemented several data science workflow platforms for future applications.
Constructed customer classes with historical, demographic, and behavioral data as features, using Random Forest Classifier and Logistic Regression, to help the marketing team understand customer purchase patterns.
Predicted sales and profits using machine learning and deep learning strategies.
Assisted the marketing team in devising a business strategy to target customers with discount coupons, deals, and offers, improving customer purchases and helping maintain stock at stores.
Communicated insights obtained from data to management, assisted in business decision-making, and reduced customer churn by 15% within a few months of implementation.
Applied clustering algorithms such as partitioning clustering, fuzzy clustering, and density-based clustering to group data by similar behavior patterns.
Identified distinct patterns in how customers respond to offers, clustered their actions using K-means, K-means++, and Hierarchical Clustering, and segmented them into groups to help the marketing team further analyze customer behavior.
Estimated Customer Lifetime Value (CLV) from customer data using multiple linear regression, identified high- and low-value segments, and helped the organization understand customers and improve customer service to increase retention.
Performed predictive modeling of personal and food sales using decision trees and regression, assigning individual scores to customers to assess the risk involved.
Proposed marketing strategies to target potential customers using their first three months of data, and evaluated CLV for every new customer using the regression model.
Investigated large datasets to handle missing values, cleaned messy datasets, and applied feature scaling to standardize the range of independent variables.
Researched predictive models including Logistic Regression, Support Vector Machines (SVM), and reinforcement learning to prevent retail fraud.
Improved model performance by tuning hyperparameters using optimization techniques like Grid Search, Random Search, and Bayesian Optimization, and increased model efficiency using XGBoost.
Validated models using cross-validation and loss functions to measure model performance, and created Confusion Matrix, Receiver Operating Characteristic (ROC), and Cumulative Accuracy Profile (CAP) curves. Addressed over-fitting and under-fitting by tuning hyperparameters and applying L1 and L2 Regularization.
Applied dimensionality reduction techniques such as Principal Component Analysis (PCA) to extract the most relevant features from high-dimensional data.
Visualized results using the Matplotlib and Seaborn libraries in Python, and used Tableau to present results on dashboards for team members, management, and other relevant departments in the company.
Forecasted the company’s short-term and long-term growth in terms of revenue, number of customers, various costs, stock changes, etc., using machine learning algorithms.
Adidas, Oregon Jul 2017 – Sept 2018
Developed predictive solutions to support online shopping using machine learning algorithms such as Linear Regression, Logistic Regression, Naive Bayes, Decision Trees, Random Forest, Support Vector Machine in Python.
Worked on data cleaning, data preparation and feature engineering with Python, including NumPy, SciPy, Matplotlib, Seaborn, Pandas, and Scikit-learn.
Responsible for data identification, collection, exploration, and cleaning in preparation for modeling.
Used the NLTK library in Python for sentiment analysis of customer product reviews, including reviews collected from third-party websites via web scraping.
Performed sentiment analysis of customer reviews and classified each review as good, bad, or neutral to gauge customer sentiment toward the business.
Performed time series analysis on sales data to identify measures for improving sales.
Used MySQL to create tables, load data, and write SQL UDFs.
Conducted analysis in assessing customer behaviors with clustering algorithms such as K-Means Clustering and Hierarchical Clustering.
Tuned model parameters using K-Fold Cross-Validation and Grid Search to optimize model performance.
Followed Agile methodologies, including Scrum stories and sprints, in a Python-based environment alongside data analytics and Excel data extracts.
Worked with CSV, JSON, and Excel files for data cleaning and data analysis.
Performed time series analysis on animal medicine and vaccine product sales data to extract meaningful statistics and other characteristics, and to predict future values from previously observed values.
Created daily, weekly, and monthly reports in Tableau Desktop and published them to the server.
Worked in Excel using pivot tables, conditional formatting, large record sets, data manipulation, and cleaning.
Used GitHub for version control to manage source code and track changes to files.
First Data, Georgia Feb 2016 – Jul 2017
Analyzed data using various machine learning algorithms to segment customer transactions by amount and total number of transactions.
Extracted terabytes of both structured and unstructured data using SQL queries and performed data mining tasks including handling missing data, data wrangling, and feature scaling.
Wrote easy-to-use documentation for the frameworks and tools developed, for adoption by other teams.
Implemented the Porter stemmer (NLTK) with a bag-of-words model using the CountVectorizer class to process text data.
Created a predictive model using LSTM recurrent neural networks (RNNs) to analyze reviews and customer service feedback, helping the employer reduce customer churn.
Experimented with other classification models like Random Forests, Logistic Regression, and Naïve Bayes to classify customer reviews.
Extracted data from the web using web scraping and text mining, and processed it into a tab-separated file of reviews.
Cleaned dirty data and prepared it for feature extraction using CountVectorizer from scikit-learn's feature extraction module.
Automated customer service by creating a chatbot that responds to customer queries using deep learning and text processing with NLTK.
Evaluated model performance by creating a confusion matrix, classification report, and accuracy score. Improved model performance with k-fold cross-validation and XGBoost, achieving model accuracy of 92%.
Developed recommender systems using the Apriori algorithm to mine frequent itemsets and relevant association rules from a database containing a large number of transactions.
Built machine learning models to forecast the company’s short-term and long-term growth in terms of revenue, number of customers, stock changes, and other metrics.
Demonstrated experience in design and implementation of statistical models, predictive models, enterprise data models, metadata solutions, and data life-cycle management in both RDBMS and Big Data environments.
Presented simple visualizations of results using the Seaborn visualization library in Python.
Used Python for statistical operations on the data, and Seaborn and ggplot to visualize sales and customer data.
Karvy Capital Ltd, Hyderabad May 2013 – Dec 2015
Acquired data from primary and secondary data sources and maintained databases/data systems.
Set up new client data and prepared it for entry into the new platform.
Loaded data by converting CSV files into corresponding database tables.
Worked with management team to create prioritized list of needs for each business segment.
Monitored and resolved data flow issues on a daily basis. Also created views for the reporting team to use for daily marketing numbers.
Collaborated with the reporting team to resolve data discrepancies and make logical data corrections across reports.
Generated ad-hoc Tableau reports using Excel sheets, flat files, and CSV files.
Designed, built, and implemented relational databases.
Used data mining techniques for outlier detection and created an algorithm to identify patterns across customer trends.
Created software solutions following the Software Development Life Cycle (SDLC) in an Agile environment.
Performed computational tasks on data by creating Pig, Hive, and MapReduce scripts to access and transform data in HDFS.
Developed and implemented metadata models for reporting functionalities and developed automated process for data corrections.
Developed SQL, NoSQL, and PL/SQL scripts to extract data from databases and for testing purposes.
Reviewed logical model with application developers, ETL team, DBAs, and testing team to provide information about data model and business requirements.
Identified and logged defects when tests failed, using SQL to narrow down the root cause for efficient investigation by the development team.
Used advanced Excel functions to generate spreadsheets and pivot tables.