
ALEJANDRO GARCIA

Contact: (***) *******; Email: adxw9s@r.postjobfree.com

SENIOR DATA SCIENTIST / DATA & ANALYTICS MANAGER

EXECUTIVE SUMMARY

Senior Data Scientist and Principal Consultant for Artificial Intelligence Transformation with 18+ years of experience in Machine Learning, Deep Learning, Big Data Architecture, and DevOps Engineering for IT, Business & Operations.

Expertise in providing end-to-end analytical, business, and technological solutions to organizational problems across industries. Strong analytical skills with a background in NLP, computer vision, statistical machine learning, big data, cloud computing, predictive analytics, and machine learning deployment and maintenance.

Extensive experience in:

o 3rd-party cloud resources: AWS, Google Cloud, and Azure

o Statistics and probability, including statistical modeling and statistical hypothesis testing; sound performance executing machine learning projects

o Developing different computer vision models for object classification and image recognition

o Working with and querying large data sets from big data stores using Hadoop data lakes, data warehouses, Amazon AWS, Cassandra, Redshift, Aurora, and NoSQL

o Ensemble algorithm techniques, including bagging, boosting, and stacking; knowledge of Natural Language Processing (NLP) methods, in particular BERT, ELMo, word2vec, sentiment analysis, Named Entity Recognition, and topic modelling

o Time series analysis with ARIMA, SARIMA, LSTM, RNN, and Prophet

o Big data architecture for ETL, data pipelines, and CI/CD pipelines on the cloud (AWS CI/CD pipelines for ML)

Proficient in implementing Business Solutions with:

o All supervised machine learning methods – Linear Regression, Logistic Regression, Support Vector Machines, Random Forests, XGBoost, and Survival Modelling, using libraries such as NumPy, SciPy, Pandas, Matplotlib, and scikit-learn

o TensorFlow and PyTorch for building, validating, testing, and deploying reliable deep learning algorithms for specific business challenges

An assertive team leader with a strong aptitude for developing, leading, hiring, and training highly effective work teams; strong analytical skills with a proven ability to work well in a multi-disciplinary team environment, and adept at learning new tools and processes with ease.

Performance Milestones:

Successfully led the creation and implementation of Machine Learning and Deep Learning models to increase sales, improve operational efficiency, and optimize processes across IT, Marketing, and Operations in several industries

Built end-to-end Machine Learning & Deep Learning models, from data extraction (ETL) through modeling to deployment (Docker and Kubernetes), including delivering results in dashboards such as Tableau, Power BI, and Qlik

Developed neural network architectures from scratch, such as Convolutional Neural Networks (CNNs), LSTMs, and Transformers. Also built unsupervised approaches such as k-means, Gaussian mixture models, and autoencoders.

PROFESSIONAL EXPERIENCE

Since July 2022 with CVS HEALTH, Remote from Atlanta, Georgia

As a Senior Artificial Intelligence Architect

In charge of translating business use cases into solutions with the latest AI & ML techniques, developing Proofs of Concept (PoCs) and MVPs from concept to development and deployment on the Azure and AWS cloud platforms. Projects are based on Computer Vision, NLP, Neural Networks, and "traditional" Machine Learning (XGBoost, trees, SVM, regression, clustering, etc.), plus OpenAI and Hugging Face developments for AWS and Azure implementations.

Guided and led a team of 5 Jr. Data Scientists and ML engineers

Scanned handwritten prescriptions and laboratory documents for customers through the OCR tools AWS Textract and Azure Form Recognizer, using containers for scalability of the application
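
For illustration, a minimal sketch of the Textract portion of this OCR step via boto3; the file name is a placeholder assumption, and the container packaging and Form Recognizer path are not shown:

# Hypothetical sketch: extract text lines from a scanned prescription with AWS Textract.
# "prescription_scan.png" is a placeholder; requires AWS credentials with Textract access.
import boto3

textract = boto3.client("textract")

with open("prescription_scan.png", "rb") as f:
    image_bytes = f.read()

# Detect raw text; analyze_document with FeatureTypes=["FORMS"] would return key-value pairs instead.
response = textract.detect_document_text(Document={"Bytes": image_bytes})

lines = [block["Text"] for block in response["Blocks"] if block["BlockType"] == "LINE"]
print("\n".join(lines))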

Implemented different NLP models and technologies, such as OpenAI GPT-3 (fine-tuned), GPT-3.5, GPT-4, and GPT-4 32k.

Created a process to transcribe customer calls with OpenAI Whisper on AWS SageMaker with real-time endpoints
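
A minimal local sketch of the Whisper transcription step using the open-source whisper package; the model size and audio file name are illustrative assumptions, and the SageMaker real-time endpoint packaging is not shown:

# Hypothetical sketch: transcribe a recorded customer call with OpenAI Whisper (open-source package).
import whisper

model = whisper.load_model("base")            # "base" chosen only for illustration
result = model.transcribe("customer_call.wav")  # placeholder audio file
print(result["text"])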

Implemented a Q&A chatbot with the Bio_ClinicalBERT model for use by customer service employees.

Developed from scratch an analysis application for the customer service area: took customer interactions (chats), processed them with different feature engineering techniques using Azure Cognitive Services and the OpenAI module, extracted insights from conversations, and used those inputs for customer service analysis

Improved a recommender system using LightGBM in Python, then deployed it in Golang (Go) on EC2, increasing performance by more than 10% on key metrics

Improved ETLs and customer management information with AWS Glue, DynamoDB, Lambda, and Step Functions

Reviewed architecture on AWS and Azure for the AI projects

Used AWS services such as SageMaker, EC2, DynamoDB, Aurora, S3, EKS, ECR, Textract, etc.

Used Azure services such as Machine Learning Studio, Cognitive Services, Blob Storage, Cosmos DB, Logic Apps, Functions, etc.

Docker, Kubernetes, PowerBroker (BeyondTrust), and APIs for model deployment and operation/serving

Agile/Jira, Slack, and GitHub for collaboration and project management

Model monitoring and explainability by detecting data and model drift and using LIME and SHAP values
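
A minimal sketch of the SHAP side of this explainability work, assuming a tree-based model like the XGBoost/LightGBM models mentioned above; the data and model here are synthetic placeholders:

# Hypothetical sketch: global explainability with SHAP for a tree-based classifier.
import shap
import xgboost
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = xgboost.XGBClassifier(n_estimators=100).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Summary plot of feature impact across the dataset.
shap.summary_plot(shap_values, X)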

Technologies used:

AWS: SageMaker, Redshift, S3, Aurora, DynamoDB, Kubernetes, containers, Textract, Transcribe, Data Pipeline, managed services, API, SQS, Step Functions, Lambda functions, Key Management Service (KMS), Secrets Manager, API Gateway, Amplify, CodePipeline, Model Registry, CodeCommit

Azure: Azure Machine Learning, Cognitive Services, Form Recognizer, Virtual Networks, Blob Storage, Cosmos DB, Logic Apps, Key Vault, API Management, Azure Repos, Azure Model Registry

Models: Computer Vision (CV), Natural Language Processing (NLP), RNNs, CNNs, Boosting methods, etc.

Feb 2021 – Jul 2022 with HSBC BANK USA Buffalo, NY (Remote from Atlanta)

As a Senior Data Scientist / Leader Consultant in AI Adoption & Analytics

Senior Data Scientist and AI Transformational Architect for HSBC Operations and Marketing for Retail and Business Banking. Technical lead for the Data Science, Big Data, and DevOps Engineering teams. Created an Optical Character Recognition (OCR) process to automate handwritten document scanning and analysis with different computer vision and Natural Language Processing techniques. Solved several use cases for the marketing and sales areas with different Machine Learning techniques, including a computer vision image classification solution for space optimization during Covid. Used the AWS cloud environment with different Big Data tools. Managed and trained several consultants in each discipline mentioned, and was responsible for a team of 6 Jr. Consultants in Data Science, Data Engineering, and Cloud Computing.

Worked with OCR libraries: extracted semantic data from internal documents and implemented a deep learning-based OCR method using TensorFlow.

Developed algorithms using NLP techniques based on OCR libraries to help the preprocessing step for claims

Created a hybrid Recommendation system algorithm with NLP techniques to propose financial products and services

Used different clustering techniques with DBSCAN and K-Means to group customers and channels

Developed and implemented a machine learning model to successfully match financial products and services with supporting documentation, reducing the manual labor involved in operational review.

Developed analytical models and used cross-validation to evaluate the models.

Measured performance with the log loss function and evaluated feature selection with ROC curves and AUC.
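
A minimal scikit-learn sketch of this evaluation approach (cross-validated log loss plus ROC/AUC on held-out data); the estimator and data below are synthetic placeholders, not the production model:

# Hypothetical sketch: cross-validated log loss and ROC/AUC evaluation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import roc_auc_score, roc_curve

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validated log loss (scikit-learn returns it negated).
log_loss_cv = -cross_val_score(model, X_train, y_train, cv=5, scoring="neg_log_loss")
print("Mean CV log loss:", log_loss_cv.mean())

# ROC curve and AUC on held-out data.
model.fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, probs)
print("AUC:", roc_auc_score(y_test, probs))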

Used Tableau dashboards to communicate results to team members and stakeholders.

Data was stored on AWS Redshift and Snowflake and transformed with Spark

Used a VGG16 transfer learning model with several convolutional layers for image classification problems
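
A minimal Keras sketch of VGG16 transfer learning with a frozen convolutional base; the binary head, input size, and training data are assumptions for illustration, not the production classifier:

# Hypothetical sketch: VGG16 transfer learning for image classification (Keras).
import tensorflow as tf

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained convolutional layers

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary classification head (assumed)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_dataset, validation_data=val_dataset, epochs=5)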

Used AWS SageMaker, Redshift, S3, ECR, EKS, and other services

Created and implemented Python modules for filtering images using image processing libraries such as Pillow, scikit-image, OpenCV, SciPy, Pycairo, and SimpleITK to extract text and background.

Implemented a clustering mechanism to group various classes of data, such as customer location, revenue, products, channels, risks, and marketing

Implemented NLP (Natural Language Processing) based classification to categorize various documents.

Created multiple custom SQL queries to parse and extract data for analysis.

Worked with data compliance and data governance teams to maintain data models, metadata, and data dictionaries, and define source fields and their definitions.

Delivered various complex OLAP databases/cubes, scorecards, dashboards, and reports.

Created various types of data visualizations using Matplotlib, Seaborn, and Tableau.

Worked with pytesseract and several scikit-image sub-packages

Jun 2019 - Jan 2021 with JOHNSON & JOHNSON CONSUMER HEALTH Fort Washington, PA

As a Lead Consultant for Business Analytics & AI

The J&J Consumer Health Division manages over-the-counter (OTC) healthcare products. I oversaw the development and implementation of customer analytics and customer service Artificial Intelligence initiatives. As Lead Consultant, one of my projects was to create and implement a chatbot to increase customer satisfaction and revenue using Marketing Analytics for the Essential Health department (brands: Listerine, Band-Aid, Aveeno, etc.). As a result, this chatbot increased customer satisfaction and reduced operational expenses, achieving annual savings of 30% in the Customer Service area. Implemented and developed clustering techniques to find the proper product marketing and distribution. Forecasted product demand with different time series methods, which helped the area take logistics actions when needed. Models were deployed on Google Cloud Platform (GCP). I was the senior technical and analytical lead for a team of 2 Machine Learning Engineers, 2 Big Data Engineers, 1 DevOps specialist, and 3 Data Scientists.

Used different time series methods (ARIMA, LSTM, RNN, and Facebook Prophet) to forecast sales and demand
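
A minimal sketch of the Prophet piece of such a forecast; the synthetic daily demand series and 90-day horizon are placeholder assumptions:

# Hypothetical sketch: 90-day demand forecast with Facebook Prophet.
import pandas as pd
from prophet import Prophet  # package formerly published as fbprophet

# Prophet expects columns "ds" (date) and "y" (value); synthetic weekly-seasonal demand as a placeholder.
history = pd.DataFrame({
    "ds": pd.date_range("2020-01-01", periods=365, freq="D"),
    "y": [100 + (i % 7) * 5 for i in range(365)],
})

m = Prophet(weekly_seasonality=True)
m.fit(history)

future = m.make_future_dataframe(periods=90)
forecast = m.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())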

Deployed a chatbot on GCP with several services (BigQuery, Dialogflow, Kubernetes and Container Engine, Colab, etc.) to improve customer service

Used different clustering techniques (K-means, DBSCAN, and hierarchical clustering) to group customers & channels

Used different embedders, such as the Google Universal Sentence Encoder, Doc2Vec, TF-IDF, BERT, and ELMo, to identify the embedder that yields the best-performing result

Applied the cosine similarity method to match the user input to the most similar trained question, and to match the trained question to the corresponding department
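
A minimal sketch of this matching step using TF-IDF vectors and cosine similarity; the trained questions, departments, and user utterance are invented examples, and the production system used the embedders listed above rather than plain TF-IDF:

# Hypothetical sketch: route a user utterance to the most similar trained question and its department.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

trained_questions = ["How do I track my order?", "What are the product ingredients?"]
departments = ["Logistics", "Product Information"]

vectorizer = TfidfVectorizer()
question_vecs = vectorizer.fit_transform(trained_questions)

user_input = "where is my package"
user_vec = vectorizer.transform([user_input])

scores = cosine_similarity(user_vec, question_vecs)[0]
best = scores.argmax()
print(trained_questions[best], "->", departments[best])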

Cleaned the text data using different techniques and then performed EDA on it

My Time series models improved the MAPE metric by 10% per quarter on average

Data was stored on Hadoop and Google BigQuery

Performed data wrangling and cleaning according to EDA per business problem and model

Web-scraped data with the Beautiful Soup library to analyze information and measured customer satisfaction with sentiment analysis algorithms

Created an XGBoost model to obtain feature importances so the sales force could tune incentives, contributing to a 5% increase in sales through properly targeted incentives
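
A minimal sketch of extracting feature importances from an XGBoost classifier; the data and the incentive-related feature names are placeholder assumptions:

# Hypothetical sketch: XGBoost feature importance for incentive tuning.
import pandas as pd
import xgboost
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
feature_names = [f"incentive_factor_{i}" for i in range(8)]  # placeholder names

model = xgboost.XGBClassifier(n_estimators=200, max_depth=4)
model.fit(X, y)

importances = pd.Series(model.feature_importances_, index=feature_names)
print(importances.sort_values(ascending=False))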

Models were deployed using images from Docker Hub into Google Kubernetes Engine (Google Container Engine)

Jun 2017 - Jun 2019 with TRACFONE WIRELESS (later acquired by Verizon) Miami, FL

As a Machine Learning & Big Data Lead

Tracfone was the largest mobile virtual network operator (MVNO) in the US, acquired by Verizon in 2022. I was the acting head of Machine Learning and Big Data technologies for the business organization and IT. I created several models to detect consumer behavior patterns using Call Detail Records and geolocation data. This allowed the organization to optimize marketing campaigns and improve operational efficiency. My team was composed of Jr. Data Scientists, ML Engineers, and Big Data Engineers.

Developed and implemented a churn prediction model to avoid customer attrition

Identified several clusters according to customer patterns to optimize their retention

Used data from Hadoop and GCP BigQuery for clustering to understand the consumer types in the markets.

Reports were generated using the Google Cloud Platform, and insights were obtained about the consumer trends in each cluster with different techniques (K-Means, hierarchical, DBSCAN, and HDBSCAN clustering algorithms)

Calculated CLV and product performance across occasions using cosine similarity, pointwise mutual information, and lift score as metrics of association.

Ran Gensim Word2Vec and BERT models (NLP) for topic modeling to extract topics from consumers' open-text responses.
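
A minimal sketch of the Word2Vec half of this step with Gensim; the tokenized consumer texts are invented placeholders, and the BERT topic-modeling piece is not shown:

# Hypothetical sketch: train Word2Vec embeddings on tokenized consumer open-text responses.
from gensim.models import Word2Vec

# Placeholder corpus: each document is a list of tokens after cleaning.
tokenized_texts = [
    ["signal", "dropped", "during", "call"],
    ["great", "data", "plan", "price"],
]

w2v = Word2Vec(sentences=tokenized_texts, vector_size=100, window=5,
               min_count=1, workers=4)

# Nearest terms to a vocabulary word (illustrative only on this tiny corpus).
print(w2v.wv.most_similar("signal", topn=3))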

Ran several correlation analysis techniques, such as Pearson and Cramér's V, to understand the correlation between topics and occasions.

Ran a regression analysis to understand the impact of occasions on emotional and functional needs.

Developed and implemented a Cox Proportional Hazards model to predict and prevent failures in telecommunication towers
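
A minimal sketch of a Cox Proportional Hazards fit with the lifelines library; the column names, covariates, and data values are invented assumptions for illustration:

# Hypothetical sketch: Cox Proportional Hazards model for time-to-failure of telecom towers.
import pandas as pd
from lifelines import CoxPHFitter

# Placeholder data: observation time in days, failure indicator, and tower covariates.
df = pd.DataFrame({
    "duration_days": [120, 340, 90, 400, 210, 160, 380, 60],
    "failed":        [1,   0,   1,  0,   1,   0,   1,   0],
    "age_years":     [8,   2,   10, 1,   6,   7,   3,   9],
    "load_index":    [0.9, 0.4, 1.1, 0.3, 0.7, 0.5, 0.8, 1.0],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration_days", event_col="failed")
cph.print_summary()  # hazard ratios for each covariate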

Deployed the models in Google Container Engine and Google Kubernetes Engine

Led the team to implement Hadoop Data Lake and Spark tools

Dec 2015 - Oct 2017 with ACCENTURE PLC Chicago, IL

As a Senior Data Scientist for Financial, Retail & Telco

Accenture Plc is a professional services company providing management and technology consulting services; its segments include Communications, Media & Technology and Financial Services. At Accenture, I performed multiple roles within the machine learning/data science scope for the Operations, Business, and IT areas, from data preparation through modeling and deployment. My clients were in the Telecommunications, Financial Services, and Retail (e-commerce) industries.

Created a fraud detection algorithm with an Artificial Neural Network that detected fraudulent operations with 98% accuracy.

Created and implemented a Survival Analysis model (Cox Proportional Hazards) to predict when vandalism and attacks were going to occur for a major telecommunications company in the Latam market

Identified consumer patterns on an e-commerce website with an anomaly detection algorithm based on Support Vector Machines
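
A minimal sketch of SVM-based anomaly detection; One-Class SVM is assumed here as the specific variant, and the behavioral feature matrix is a synthetic placeholder:

# Hypothetical sketch: SVM-based anomaly detection (One-Class SVM assumed) on e-commerce behavior features.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Placeholder feature matrix: e.g., session length, pages viewed, cart value.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))

X_scaled = StandardScaler().fit_transform(X)

detector = OneClassSVM(kernel="rbf", nu=0.05)  # nu ~ expected anomaly fraction
labels = detector.fit_predict(X_scaled)        # -1 marks anomalous sessions

print("Anomalous sessions:", (labels == -1).sum())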

Developed a hierarchical clustering model for customer segmentation for a retail website

Planned, developed, and managed the end-to-end schedule and worked with stakeholders to ensure each release's success.

Drove continuous improvement in product development and processes to boost efficiency and standardization.

Led a team of developers/analysts to perform a large data cleansing and ETL procedure to feed the data to a new database system.

Guaranteed data consistency and integrity through rigorous oversight and cross-checking of random samples from the generated data.

Held periodic meetings with inter-departmental operatives to ensure a smooth transition from the old services to the new ones.

Feb 2005 - Nov 2015 with HSBC BANK Hong Kong & Latin America (Hong Kong and Latam)

As Analytics Manager for CRM (Data Science/AI)

At HSBC, I was responsible (as Principal Data Scientist & Head of Analytical Channel Infrastructure) for different offices across Asia Pacific (Hong Kong and Indonesia) and the Latin American markets (Mexico, Brazil, and Argentina). I developed and implemented several models that increased customer revenue, implemented the analytical infrastructure, and optimized channel operations across the organization. I led the Analytical and Infrastructure teams for online banking, branches, ATMs, and call centers, and implemented and managed a real-time marketing analysis system for all channels based on Bayesian models.

Achieved annual cost savings of 20% through operational efficiencies across Product, Technology, and Process in the implementation of cross-selling and channel projects (e.g., leading Channel & Customer Analytics and directing IT spending toward more efficient projects)

Improved Operational and Cost efficiency of the channels and increased cross-selling through them

Responsible for the IT budget and for overseeing the alignment of IT project resources with business strategy, risk, regulatory authorities, and controls

Delivered customer and channel insight analysis, business intelligence and analytics for corporate customers, and big data & loyalty tools

Implemented improvements for branches, Internet banking, call centers, and processes (e.g., manual payment reductions), achieving record reductions and efficiencies in the HSBC Group by leading the management of IT & customer retention programs

I was responsible for the CRM & CK knowledge team for HSBC Hong Kong & Indonesia, managing a team of 6 people.

I improved the IT & Business implementation of the selling systems of the bank

I established several of the new leading metrics of the office to detect and improve Direct Channel Selling

I analyzed and proposed several CRM & Technological initiatives that were adopted by the country (Multi-Channel Cross-selling)

Successfully adapted some of the best Latin American CRM practices for Indonesia (e.g., inbound marketing strategies)

ACADEMIC CREDENTIALS

Postgraduate Program in Artificial Intelligence and Machine Learning

University of Texas at Austin, Austin, Texas

Machine Learning (Supervised and Unsupervised)

Deep Learning (CV, NLP)

Big Data Tools (Hadoop, Spark, Scala)

MLOps (deployment on Docker and Kubernetes)

Pursuing Certification for Data Science (MLOps) in Cloud Computing

Great Learning Remote/Atlanta, GA

Azure, AWS

Model deployment, monitoring and maintenance

Machine Learning and Deep learning techniques

MBA in Analytics and MIS

University of Texas at Austin, Austin, Texas

Analytics and MIS

Diploma in Big Data as Business Strategy

ITESM

R, Hadoop, Tableau, Hive, Spark, Qlik Dashboards, etc.

Diploma in Advanced Data Mining Tools & Strategies

ITAM

Statistical analysis and machine learning techniques with R and SAS (Miner)

BS in BA and Management Information Systems

ITAM & Sydney University in Australia (exchange student)

MIS and Statistics

Awards

Awarded the Gartner Award for CRM Excellence (HSBC Mexico)

Martha Rogers & Don Peppers Award for Breakthrough CRM implementation (HSBC Mexico)

TECHNICAL SKILLS

Machine Learning Methods: Classification, regression, prediction, dimensionality reduction, and clustering applied to problems that arise in retail, manufacturing, and marketing science.

Deep Learning Methods: Machine Vision, Natural Language (NLP, NLU), Machine Learning Algorithms, Multi-Layer Perceptrons, Shallow Sequential Neural Nets, Recurrent Neural Networks (RNNs), and LSTMs

Artificial Intelligence: Text understanding, classification, pattern recognition, recommendation systems, targeting systems, and ranking systems.

Data Analytics: Research, analysis, forecasting, and optimization to improve the quality of user-facing products, Predictive Analytics, Probabilistic Modelling, Approximate Inference

Areas of Interest and Experience: Deep Learning, Reinforcement Learning, Recommender Systems, Machine Learning for Marketing and Operations, Strategic Planning

Development Environment and Tools: Jupyter Notebooks, Sublime Text, IntelliJ, Eclipse, Visual Studio, Git, SVN, Jenkins CI, Hudson CI, Confluence, Jira, Trello, Slack, TFS, Agile, Scrum

Analytic Languages: HiveQL, Pig, Spark, Scala, R, Python, Matlab

Machine Learning and Data Analysis Frameworks and Libraries: TensorFlow, Keras, MLLib, OpenCV, NLTK, StanfordCoreNLP, NeuralTalk, Neuron, SpaCy, PyTorch

Parallel Processing and Virtualization: Python Numba, CUDA, Dask, Matlab Parfor, and Worker processing

Visualization: R, Plotly, TensorBoard, Tableau, ggplot2, Power BI, AWS QuickSight, and Qlik

Data Stores: Hadoop Data Lake, Data Warehouse, Amazon S3, Amazon Athena, Amazon Redshift, Cassandra, MongoDB, Google BigQuery, Dataflow, and Snowflake

Cloud and ML-Ops: GCP, Azure, AWS SageMaker, ML pipeline implementation using Terraform and Jenkins, AWS Batch, Docker, Kubernetes, EKS, TensorFlow Extended (TFX), Vertex AI

Programming Languages: Python, SQL, R, SAS, Spark, Bash, Git
