Resume


DATA SCIENTIST

Location:
Ashburn, VA
Posted:
November 28, 2022


Resume:

Profile Summary

*+ years’ hands-on working experience in the Data Science/Machine Learning space.

Skilled in multiple areas of prescriptive analytics / prescriptive modeling, including Machine Learning, Natural Language Processing, Applied Statistics, Operations Research, and a variety of optimization techniques.

Versed in the latest trends and advances within the fields of Machine Learning and Artificial Intelligence.

Proficiency in the application of statistical predictive modeling, machine learning, classification techniques, and econometric forecasting techniques.

Proficiency in various types of optimization, Market Mix modeling, Segmentation, Time Series, Price Promo models, Customer Retention models, Elasticity models, and Net Lift models.

Extensive experience in Text Analytics, developing different Statistical Machine Learning Models, Data Mining solutions to various business problems, and generating data visualizations using R, Python, and Tableau.

Experience with a variety of NLP methods for information extraction, topic modeling, parsing, and relationship extraction in Python.

Adept at discovering patterns in data using algorithms, visual representation, and intuition.

Hands-on application of machine learning techniques such as Naïve Bayes, Linear Regression and Logistic Regression Analysis, Neural Networks, RNN, CNN, Transfer Learning, Time-Series Analysis, Trees, and Random Forests.

Experience designing visualizations in Tableau and publishing and presenting dashboards and storylines on web and desktop platforms.

Hands-on experience in business understanding, data understanding, and preparation of large databases.

Worked on Natural Language Processing with NLTK, SpaCy, and other modules for application development for automated customer response.

Programmed automation processes using Python and the AWS Lambda service.

Skilled in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data.

Experience applying Neural Networks, Support Vector Machines (SVM), and Ensemble Models.

Worked on ML tools and Model Deployment in Cloud platforms like AWS, GCP, and Azure.

Table of Technical Skills

Programming Languages: Python, SQL, R, SAS, C#, Command Line

Python packages: Matplotlib, Seaborn, Numpy, Pandas, Scikit-Learn, TensorFlow, SciPy, Bokeh, Numba, NLTK

Machine Learning: Natural Language Processing & Understanding, Machine Intelligence, Machine Learning algorithms, Statistical Modeling, Computer Vision, Time Series, Survival Analysis, Accelerated Failure Time models, Anomaly Detection

Deep Learning: Machine perception, Machine Learning algorithms, Neural Networks, TensorFlow, Keras, Data Mining

Artificial Intelligence: text understanding, classification, pattern recognition, recommendation systems, targeting systems, ranking systems.

Analysis: Advanced Data Modeling, Forecasting, Regression, Predictive, Statistical, Sentiment, Exploratory, Stochastic

Data Modeling: Bayesian Analysis, Inference, Predictive Modeling, Stochastic Modeling, Linear Modeling, Behavioral Modeling, Probabilistic Modeling

Communication: Reporting, Documentation, Presentation, Collaboration. Clear, effective with a wide variety of colleagues, audiences.

Infrastructure: Cloud environments, including Amazon Web Services (AWS) and Google Cloud Platform (GCP)

Professional Work Experience

February 2020 – Current

Senior Data Scientist/Machine Learning Engineer - DXC Technology (Ashburn, VA)

DXC Technology is a Fortune 500 global IT services leader, where I am part of the Data & Analytics team. As a Senior Data Scientist and Machine Learning Engineer, I led a team solving business problems in the healthcare, financial, manufacturing, and retail industries. I developed and deployed a Computer Vision model, built with transfer learning and Convolutional Neural Networks, that classifies X-ray images by diagnosis for a major healthcare provider coping with rapidly rising COVID-19 cases. In another project, I built marketing analytics models for a major US retail chain that wanted to increase its online KPIs (e.g., number of operations and Click Through Rate (CTR)) and predict customer churn. I also built a hybrid recommender engine to offer relevant suggestions to visitors. Relevant KPIs increased after the engine was deployed, and the company realized expanded profits. I developed and deployed my models in the AWS cloud environment.

Engaged with the company’s sales department, data engineering team, and software development team.

Applied transfer learning with several pretrained architectures (VGG16, VGG19, AlexNet, ResNet50, EfficientNet) for computer vision models.

Used the YOLO algorithm for an object recognition use case.

Assessed model performance using Click Through Rate (CTR) and Mean Average Precision (MAP).

Applied a K-Nearest Neighbors (KNN) algorithm with cosine similarity for collaborative filtering and recommender systems.

Used NumPy, SciPy, Scikit-Learn, PySpark, MLlib, Pandas, Matplotlib, Seaborn, and Flask.

Built recommender engines on Big Data using PySpark’s MLlib.

Used XGBoost to predict customer churn.

Used generative adversarial networks (GANs) to improve the accuracy of my models.

Queried and pulled data from Amazon S3 and a MemSQL database into Pandas DataFrames in Python using SQLAlchemy.

Implemented a Singular Value Decomposition (SVD) collaborative filtering algorithm to recommend items to users.

Created an OCR model with OpenCV and Google Tesseract to extract text from PDFs.

Built a Flask API that returns software-agnostic JSON responses for software developers to integrate into the site.

Used Scikit-Learn for creating and training collaborative filtering algorithms.

Accessed AWS resources, including S3 and the Redshift data warehouse, from Python.

Designed and deployed an end-to-end CI/CD pipeline for model deployment.

Worked with AWS QuickSight, Lambda, SageMaker, Athena, and others.

Defined different metrics and indicators for item similarity in the content-based approach.

Coordinated with the UI/UX team to plan the implementation of recommendations.

September 2018 – February 2020

Data Scientist (AI/NLP) - Deloitte Consulting (Atlanta, GA)

Deloitte has a regional Analytics team for the Southwest region. I worked on a team that created an alert automation system for internal messages and logs for an Atlanta-based hospital, leveraging cutting-edge NLP techniques. A hand-labeled internal dataset combined with tweets from Twitter’s API was used to train a model for importance, relevance, and priority, along with a sentiment analysis matrix. Results were then classified by priority and urgency. The final production model used a neural network based on medical BERT and let users decide which types of messages to let through the filter by means of an adjustable threshold and re-training. The business projected an 18.8% increase in user productivity. I also created a chatbot, using BERT and PyTorch, to address customer requests for a major healthcare company.

Accessed the Twitter API using a Python wrapper to extract pseudo-labeled data based on hashtags.

Cleaned and prepared text-based data through normalization, tokenization, stemming, and lemmatization using BERT and NLTK.

Coded customized solutions using Python and the TensorFlow and NumPy libraries.

Tested a variety of text representations and embedders, including bag-of-words, TF-IDF, Word2Vec, and ELMo.

Utilized statistical classifiers, random forests, and logistic regression to perform sentiment analysis.

Used PyTorch and a BERT encoder to create a chatbot.

Used RNNs and LSTMs for different NLP problems.

Constructed an artificial neural network solution for natural language processing.

Implemented a model utilizing BERT for embedding and classification and fine-tuned to specific data.

Became proficient in Natural Language Processing, SQL queries, and web scraping for collecting literature using BeautifulSoup.

Productionized the final model by hosting a web API and a user-friendly intranet app powered by Flask.

Used Google Cloud Platform (GCP), Colab, Vertex AI, BigQuery, AutoML, and others.

November 2015 – September 2018

Data Scientist - Hewlett Packard Enterprise (Houston, Texas)

Hewlett-Packard has several divisions for enterprise solutions. I served as a Data Scientist on a sales forecasting project that first used a recurrent neural network and later used Facebook’s Prophet model as its base. I utilized Python for data cleaning on a large dataset that included multiple years’ worth of data across different regional departments in dozens of stores. I produced highly accurate forecasts for each regional store and department. I also created a model to predict server maintenance, in collaboration with the R&D engineering team, using Accelerated Failure Time models.

Prepared data for exploratory analysis.

Built a model using Facebook Prophet to produce highly accurate predictions of weekly sales.

The deployed model produced highly accurate forecasts up to 6 months in advance for every store and department.

Tested several survival analysis methods, including Accelerated Failure Time models, proportional hazards models, and the Cox Proportional Hazards (CPH) model, to estimate default probability and default time, and chose the best-performing model.

Used several time series models, including ARIMA, SARIMA, Prophet, and LSTMs.

Assessed model performance on large datasets.

Pulled data from the Hadoop cluster (HDFS Cloudera).

Utilized Python, Pandas, SciPy, and NumPy for exploratory data analysis, data wrangling, and feature engineering.

Applied Kernel Density estimation in lower dimensional space as a feature to predict fraud.

Tested anomaly detection models such as Expectation Maximization, Isolation Forest, and Elliptic Envelope.

Completed hypothesis testing and statistical analysis to determine statistically significant changes in claims after participating in the safety program.

Utilized Tableau and TabPy for visualization of analyses.

Consulted with various departments within the company, including SIU and Safety.

January 2014 – November 2015

Jr. Data Scientist - Entech Biomedical (Chandler, Arizona) (Remote)

Entech Biomedical in Chandler focused on providing medical equipment service to private practice physician offices, surgery centers, freestanding clinics, and major medical centers throughout the Southwest. My role involved determining which medical equipment sold with the highest profit margin and then producing a model to predict the quantity sold in a particular quarter. I also worked with a team to improve the site’s recommender system. We grouped the site users into two types of expected customers and determined which group was more influenced by our current recommender system and segmentation models.

Modeled quantity of the part with the highest profit margin sold per quarter using Theano in Python.

Modeled long-run average cost (LRAC) of various electronic medical components to determine which products could be ordered in higher volume to maximize profit margins.

Applied Natural Language Processing (NLP) to classify reviews as being from end customers of medical equipment service.

Applied K-Means clustering to group customer types from sales data.

Optimized the recommender system so that online customers see more suitable medical equipment services based on their business unit and customer type.

Structured a time-series model to determine the time-dependence and seasonality of medical equipment services using SARIMA in Python’s statsmodels library.

Implemented Gaussian radial basis functions in the model to account for the seasonality of medical equipment services.

Education

Universidad de Guadalajara (CUCEI) – Master's degree in Bioengineering and Smart Computing

Specialized in artificial intelligence, machine learning, electrophysiology, and electrophysiological signal processing

ITESM – BSc Degree in Biomedical Engineering

Specialized in Bioanalytics

Certifications

Project Management Certificate (In Analytical projects)

Research & Publications

Sensors Journal, special issue Advanced Sensing and Image Processing (Computer Vision) Techniques for Healthcare Applications. Work accepted and published in a leading international, open access, peer-reviewed journal whose rank is JCR-Q1/CiteScore-Q1. Research article: Effect of Auditory Discrimination Therapy on Attentional Processes of Tinnitus Patients. https://www.mdpi.com/1424-8220/22/3/937

IEEE Signal Processing in Medicine and Biology Symposium in Pennsylvania, USA. Work was selected, presented at an international symposium, published in the IEEE Xplore digital library, and invited to be published as a book chapter in an ebook produced by Springer. Research article: Monitoring of auditory discrimination therapy for tinnitus treatment based on event-related (de-) synchronization maps. https://ieeexplore.ieee.org/abstract/document/9672290

XXXVIII National Congress of Biomedical Engineering in Mazatlan, MX. Work accepted at the most important national event in the area of Biomedical Engineering. Research article: Algorithm to identify potential cases of diabetic macular edema in fundus images using dynamic segmentation techniques and classifiers based on neural networks and Computer Vision. https://www.researchgate.net/publication/283489959_Identificacion_de_casos_potenciales_de_edema_macular_diabetico_en_imagenes_de_fondo_de_ojo_utilizando_tecnicas_de_segmentacion_dinamica_y_clasificadores_basados_en_redes_neuronales

Languages

English (Fluent)

Spanish (Native)


