Data Scientist Analyst

Location:
Dallas, TX
Salary:
$180,000
Posted:
April 03, 2023

Resume:

Senior Data Scientist (Health Care) & ML Engineer (TIME SERIES, NLP, CV)

Summary of Relevant Experience

7+ years’ experience in advanced analytics, data visualization, machine learning, AWS, and Redshift.

Senior Data Scientist (Enterprise Metrics): Full time - Change Healthcare

Consultant Data Informatics Analyst (Part time) - Impellam Group

Expert-level experience with AWS products such as EC2 and S3 buckets, and with databases such as Redshift. Analytics model pipeline: Redshift to Coginity Workbench, or Redshift through a direct connection to Jupyter Notebook (redshift connector library), with analytics on Pandas dataframes. Full-stack exploratory data analysis with Pandas, NumPy, Seaborn, and Matplotlib, and machine learning model development. Recent project: forecasting model on corporate time series data (the data resides in AWS Redshift).

Lead Data & Analytics Core, UAB with a background in multi-domain (Machine learning and Predictive analytics, Data Visualization, Biomedical Science, Healthcare).

Established an NLP pipeline for market research.

Extensive knowledge of convolutional neural networks (Keras, TensorFlow) for image classification and computer vision algorithms.

Collaborative research with the Department of Radiology, UAB, to detect cardiovascular anomalies from MRI scans.

Development of an easy access topic modeling and search platform for COVID researchers with NLP, and Dash/Plotly platform (ongoing)

Excellent understanding of User Stories in AGILE development, ability to convert story documents into functional test cases for Acceptance Testing and Functional Testing.

Extensive knowledge of the AWS interface for ML and DL model building, and of model deployment through AWS SageMaker.

Built data and visual analytics pipelines for the healthcare industry, from data scraping and data mining through building machine learning models in Python, R, and/or SQL, to data visualization with Tableau.

Well versed in the Machine learning algorithms (supervised and unsupervised learning), and Deep learning. Use cases: Market basket analysis, natural language processing, social network analysis, computer vision, genome prediction.

Expertise in statistical background for model selection (Forward, backward, Stepwise, Lasso and Ridge Regression), model evaluation (R2, Adjusted R2, AIC, BIC, Cp), feature compression (PCA, SVD) etc.

Knowledgeable in big data query languages (Apache Pig, Apache Hive, Scala).

Experience analyzing data on high-performance cloud computing platforms through a Linux interface.

Developed a novel visual analytics platform for the leadership recruitment and interview process.

Built a visual analytics platform for live monitoring of clinical trial performance and surface-level data analysis.

Developed and designed ETL test cases, scenarios, and scripts to ensure the quality of data warehouse / BI applications.

Strong SQL scripting and ETL testing skills, and web application testing experience. Ensured data integrity and verified all data modifications and calculations during database migration.

Tested REST services using REST Assured, covering both XML and JSON formats.

Experience in RESTful web service testing using the REST Assured framework (Java). Validated JSON-formatted data and HTTP status codes such as 200, 201, 400, 415, and 500.

Created automated scripts for REST API testing using the REST Assured framework. Experience in ETL data warehousing and database testing.

Led a team for computer vision project: X-ray image detection and analysis with convolutional neural network (Keras, Tensorflow) and deployed with the flask API.

Automated financial analytics with data scraping, processing, and visual deployment.

Experience in multi omics data integration and analysis (NGS data analysis, Genomics, Microbiome analysis, Proteomics, Transcriptomic)

Version control in GitHub.

Excellent understanding of electronic medical records (EMR).

Experience with HIPAA EDI 834 and 837/835 transactions according to test scenarios, and verification of the data with different FACETS modules, i.e., Providers, Claims, and Membership.

Worked on mapping ICD-10 codes. Mapped data elements related to the Bloodhound tool (a clinical editing tool) to the internal XML elements.

Led biomedical research projects in the field of Cardiovascular disease models, Sepsis and bacteriology. Published several peer reviewed articles, presented my research work in several national and international conferences. Discovered a novel inhibitor for Sepsis pathway.

Adept at data presentation, demonstrated at several national and international conferences.

Published several research articles in peer reviewed Journals

Served as Reviewer of Nitric Oxide Journal from 2009 to 2017.

Technical Skills

Programming Language

Python • R • Linux

Machine Learning Algorithms

SVM • K-NN • K-means Clustering • Linear and Logistic Regression • Model Evaluation

Decision Tree • Random Forest • Deep learning (CNN, RNN) • Computer vision • Natural language processing.

Data visualization

Tableau • D3.js • Seaborn • matplotlib • ggplot • Plotly

Data mining

Pandas • Numpy • text mining in R and Python (NLTK) • Web scraping

Query language

SQL

Big data query

Apache Pig • Apache Hive

Cloud

AWS Solution Architecture • Amazon SageMaker • Azure ML

Work History

Senior Data Scientist & Data Informatics Analyst 2021 July -Present

CHC & Impellam Group

Managed the analytics team for metrics anomaly detection.

Guided the data engineering team in enhancing the enterprise metrics.

Managed ML-related projects and coordinated with the business operations team on data science products.

Full-stack exploratory data analysis and machine learning model development with Pandas, NumPy, Seaborn, Matplotlib, and scikit-learn. Recent project: developing a forecasting model on corporate time series data (the data resides in AWS Redshift and in S3 buckets as Parquet files).

NLP model and CNN for the marketing research

Academic Program within Company: Building NLP model to extract the review and recommendation of the in-house courses

Automated anomaly detection and Root cause analysis for the enterprise time series data (RNN, LSTM, Python Automation)

Python integration into data modelling and massaging to automate the Clinical Data Visualization project

Data Visualization and coordination with Medical Data Reviewer team.

Lead Data Scientist, Machine learning and Data Visualization 2021 Jan -November

Leading Analytics Core Team at Department of Medicine, UAB

Management of Analytics operation for more than 12 divisions at UAB hospital (namely Cardiovascular division, Rheumatology divisions etc).

Interacted with the End users, Designers, and Developers, Project Manager to get a better understanding of the Business Processes.

Facilitated JAD sessions to identify which functional areas of the Tableau-based analytics end product were affected during the upgrade process, and worked with developers to find solutions.

Communicated with vendors regarding new product updates and solutions for their existing systems.

Collaborated with the Project Manager in Tracking and Managing Project Development Process.

Innovation and enhancement of analytics operations for UAB.

Generated $96,000 in revenue from analytics operations in the last fiscal year.

Increased overall revenue from IT operations by approximately half a million dollars ($500,000).

Development of a new machine learning user friendly platform with Python Dash and Plotly

Development of a NLP based topic modeling and entity search platform for Biomedical researchers

Data Analyst, Department of Medicine, UAB 2019 Dec–2021 Jan

Clinical Trial Monitoring Platform Building • Financial Data Analysis (Business and Visual Analytics) • Automation of IT Data Management Platform

Responsibilities:

Led a team on a computer vision project: X-ray image detection and analysis with a convolutional neural network (Keras, TensorFlow). The ChestNet image repository was used as the X-ray dataset. Applied transfer learning with MobileNet and DenseNet and compared the models to select the best one. The final model was deployed with a Flask API for public use.

Developed an NLP analytics pipeline for sentiment analysis of reviews and comments on COVID first responders (APPs, nurses, clinicians) to rank the best performers.

NLP for COVID research: Topic modeling and Entity extraction from the CORD-19 Dataset.

Involved in the full software development life cycle (SDLC) starting from initial requirement gathering to design, testing, documentation, and implementation.

Acted as a liaison between Business Area Subject Matter Experts (SMEs) & development team throughout all phases of SDLC.

Participated in Sprint Planning, Daily Scrum Stand Up, and Sprint Retrospective meetings.

Developed custom reports, matrix reports, and other report types, and distributed them in multiple formats using Tableau Server.

Prepared Product backlogs, Sprint backlog and managed User stories.

Attended Scrum meetings, which included Sprint Planning, Sprint Check-In, Sprint Review & Retrospective.

Performed data analysis and created SQL queries involving joins, functions, and stored procedures.

Performed Data mapping, logical data modeling, created class diagrams and ER diagrams and used SQL queries to filter data.

Created new and modified existing SQL queries for use in the data integrity testing of the data warehouse application's ETL process.

Wrote SQL queries to validate data after testing the changes.

Documented the User Acceptance Testing (UAT) Plan for the project.

Analyzed system requirements and developed a detailed test plan for system testing.

Facilitated requirement changes and fixes along with release management.

Automated the financial data processing (general ledger) of the UAB Department of Medicine: The project was intended to replace tedious manual data management and analytics. I delivered a fully automated system that cleans and processes the data with Python, and created a user-friendly business analytics dashboard with Tableau.

Created a unified platform for the IT service, solution, and billing management system: The project involved data mining from multiple complex application platforms such as SCCM, Casper, Kayako, REDCap, VMware, etc. The data scraping, data mining, and API connections were done with multiple languages such as Python, PowerShell, and SQL. The data visualization was done in Tableau.

Developed a novel interview management system for the top leadership of UAB. Data processing and scraping were done in Python; visualization and analytics were performed using Tableau.

Developed an interactive dashboard for UAB NIH funding with automated data injection.

Built a clinical trial visual monitoring platform for UAB.

Environment: Python, R, scikit-learn, SciPy, NumPy, Pandas, TensorFlow, Keras, Agile Scrum, MS PowerPoint, MS SQL Server, Tableau, MS Access, MS Excel, SQL, HTML, UAT, Windows.

Postdoctoral Scientist, Department of Pathology, UAB 2016 Feb –2019 Nov

Genomic Data Analysis • Predictive Modeling in QIIME • Genome Function Prediction with PICRUSt • Data Visualization in Seaborn (Python) and ggplot2 (R)

Responsibilities

Complete genome sequence analysis, data quality control, species prediction, phylogeny analysis, and principal component analysis of the oral microbiome from infants at UAB children’s hospital. Most of the analytics and plots were generated in R or Python. Reference: Samuel et al., Redox Biol. 2021 Jan;38:101782. doi: 10.1016/j.redox.2020.101782. Epub 2020.

Major contribution to the microbial genomics analytics of the oral microbiome across different age groups. Reference: Khandaker et al., Nitric Oxide, 2021 Mar 1;108:1-7.

Major contribution (2nd author) to the data acquisition and statistical analysis in exercise medicine. Reference: Acute beetroot juice supplementation improves exercise tolerance and cycling efficiency in adults with obesity. Christian E. Behrens Jr et al. https://doi.org/10.14814/phy2.14574

Assistant Professor, Department of Medicine, IRMC research Institute, Dammam University, Saudi Arabia 2014 Feb–2016 Feb

Major Accomplishments:

Served as full-time faculty and guided students in their basic research projects.

Acquired research funding from the Govt. of Saudi Arabia to work on a novel drug mechanism in a mammalian cell line (Principal Investigator).

Postdoctoral Scientist, Department of Microbiology, Kumamoto University, Japan 2011 Oct –2013 Sep

Major Projects:

Drug discovery research against sepsis. Data analytics was done with the statistical software GraphPad Prism.

Omics analysis of proteomics data collected from mitochondrial proteins.

Peer Reviewed Research articles publication

Served as Reviewer in Nitric Oxide Journal.

Education

Master of Science in Analytics, Georgia Institute of Technology, Atlanta, GA, USA ~Dec 2021

Major Courses: Analytical Modeling, Computing for Data Analysis, Business Fundamentals in Analytics, Data Analytics in Business, Advanced Regression Analysis, Big Data, Machine Learning, Natural Language Processing, Deep Learning

Doctor of Philosophy (Ph.D.) in Biomedical Sciences (Major in Pathophysiology/Biochemistry) Mar 2011

Kumamoto University, Department of Microbiology, Kumamoto, Japan

Major Courses: Data Acquisition, Statistical Modeling, Analytical Chemistry, Data Analysis, Kinetics Analysis, statistical software handling (e.g., GraphPad Prism), Data Presentation (both oral and poster presentations at scientific conferences)

Award

- Graduate award from Japanese Govt. (PhD Program) -2006

- Postdoctoral award (JSPS) -2011

- Research Grant-KACST (Saudi Research and Science Grant) – 2014

- NIH research grant (DART T-90 fellowship) – 2018

Publications

Google Scholar Profile: https://scholar.google.com/citations?hl=en&user=3wRoPV4AAAAJ&view_op=list_works&gmla=AJsN-F71HHGVCrz2r_Wg3DlrgKf_nYp3W7oYPklNy53b1JTEDsNMtZp5Du9bhff8rd4dygOG9oNfc365baUuBq4rUxOBXW3sQA


