Data Sas

Location:
Worcester, MA
Posted:
September 12, 2020

Resume:

MARVIN AKUFFO

Phone #: 857-***-**** Email add: ***********@*****.***

SUMMARY

Over 7 years of experience in Quantitative Research and Analysis, with hands-on expertise in Data Science, Machine Learning Algorithms, Stochastic Processes and Modeling, and Model Development, Validation and Scoring/Projections in R, Python and SAS environments.

Professional experience in Base SAS programming (Data Step, SAS Proc, SAS Macros, Proc FCMP), SAS Proc SQL, Proc STATS etc.

Domain experience and expertise in medical claims billing covering ICD-9/10 Diagnosis and Procedure Codes, HCPCS, DRG codes, CPT, professional and facility claims, and Aetna and CMS Medicare & Medicaid data.

Domain experience and expertise in developing and executing standard SQL queries and T-SQL stored procedures. Also have a good understanding of data architecture and databases.

Deep domain experience in advanced statistical methodology and inference, regularization techniques, cross-validation, machine learning algorithms such as Random Forest, Gradient Boosting, Elastic Net, Support Vector Machine (SVM), clustering algorithms, K-NN, Principal Component Analysis (PCA), logistic regression etc., and statistical regression: OLS, GLM, GLS, GEE, Survival Analysis, Mixed Effect Models.

Advanced knowledge and experience in time-series/panel regression modeling encompassing Exponential Smoothing, AR(p), MA(q), ARCH(q) and GARCH(p, q), co-integration, tests of stationarity etc.; Bayesian statistical inference and modeling encompassing parameter estimation and MCMC simulation techniques such as Gibbs Sampling and Metropolis-Hastings.

Domain knowledge working on structured and unstructured data with Hadoop (HDFS and MapReduce) and Hadoop ecosystem encompassing Hive, Spark etc.

Experience with cloud computing infrastructure (e.g. Amazon Web Services (AWS) EC2, Elastic MapReduce) and consideration for scalable, distributed Systems.

Draws on experience in all aspects of analytics/data warehousing solutions (database issues, data modeling, data mapping, ETL development, metadata management and data migration).

EDUCATION

BOSTON UNIVERSITY Boston, MA Graduate School of Management

MSc. Mathematical Finance Aug. 2011 – Dec. 2012

UNIVERSITY OF GHANA Accra, Ghana

Department of Mathematics

BA Combined Major Mathematics & Statistics Aug. 2006 - May 2010

WORK EXPERIENCE

1199SEIU Pensions & Benefit Funds New York, NY

Sr. SAS Business Intelligence Analyst/Data Scientist May 2018 – Present

Responsibilities

Builds an attribution model that assigns a patient or episode to the provider with the highest percentage of services or total cost for the base and performance attribution periods in the SAS EG environment.

Builds a program to analyze ACO Shared Savings, quarterly program reports, determining the ACO’s financial and quality performance, and determining whether an ACO is eligible to share in the savings or losses.

Builds patient medication adherence models with logistic regression, GLM and GEE, and performs survival analysis (Cox proportional hazards regression) and longitudinal analysis of patient claims using PROC LOGISTIC, PROC GLM, PROC GENMOD, PROC PHREG etc.

Develop and maintain program evaluation/financial budget reconciliations of Accountable Care Organizations (ACOs) and other value-based payment arrangements.

Develop and maintain advanced methodology related to primary care attribution and reimbursement of medical, hospital, and pharmacy services.

Participate in the evaluation of hospital and physician performance, employing industry-standard reimbursement and risk-adjustment methods (Medicare Severity-Diagnosis Related Group (MSDRG), Enhanced Ambulatory Patient Grouping (EAPG) System, Resource-Based Relative Value Scale (RBRVS), Diagnostic Cost Group (DxCG), etc.).

Produce various other analyses and reports that aid in the business and strategic decision making of Benefit & Pension Funds.

Implement machine learning models across random forest, SVM, logistic regression, Deep Belief Net (DBN) etc. in R environment.

Environment: R/RStudio, SAS Enterprise Guide (SAS EG), SAS Visual Analytics (VA), MS SQL

KPMG / KeyBank Cleveland, OH

Model Validation Consultant – CCAR Wholesale Credit Risk November 2017 – February 2018

Responsibilities

Second line of defense for independent validation of CRE portfolios’ Probability of Default (PD) models (estimated with sequential or multi-stage logistic regression models), Loss Given Default (LGD, estimated with Zero-Inflated Negative Binomial Regression) and EAD models, through checks of model data appropriateness, conceptual soundness, technical analysis, outcome analysis etc.

Review and analyze documentation on model design, development, generation of reports on model validation and analysis and proper procedural steps of execution of SAS codes.

Check the completeness, accuracy and appropriateness of the data used in the model development. Validate the model’s input data against primary sources and verify that all relevant drivers have been gathered and collected to meet the model’s core purpose.

Evaluate whether the selection and structure of variables is consistent with similar industry models and modeling objective.

Assess and test assumptions underlying Probability of Default (PD) and regression models covering autocorrelation, multi-collinearity, heteroskedasticity, stationarity, normality, linearity/model specification etc.

Perform independent conceptual and theoretical review, benchmarking, replication of the models employed and quantifying model risk and reporting of the findings.

Close interaction and collaboration with the model developers in the First Line-of-Defense.

Evaluate robustness and stability of models; perform variable selection for macroeconomic variables for the model development, sensitivity analysis, scenario analysis, replicating and benchmarking of PD, LGD and EL models.

Perform diagnostic checks covering statistical significance of estimated model parameters, AIC, appropriateness of parameter signs with regard to Net Charge Off (NCO) rates, convergence status etc., to assess models’ performance and model uncertainty.

Environment: SAS Enterprise Guide, PROC SQL, R\Rstudio, Python

Healthcare Markets and Regulation (HMR) Lab Boston, MA

Department of Healthcare Policy, Harvard University Medical School (HMS)

Statistician/SAS Analyst Consultant May 2016 – October 2017

Responsibilities

Design and develop statistical and data management programs/applications that support and analyze complex healthcare delivery (e.g., Aetna healthcare claims, CMS claims and other claims-based or electronic medical records databases) systems, healthcare payment systems as well as associated administrative systems in SAS environment.

Management and analysis of large administrative data from health insurers. Ensure integrity of data collection, data review, data compilation, and analysis techniques.

Create analytical files to be used for analysis related to cost, utilization rates and quality of administrative data.

Work with Program Managers and other team members to analyze and interpret research data to help create appropriate algorithms and programs and work independently to implement appropriate analysis and provide accurate documentation.

Responsible for successful completion of all analytical and data related duties and maintain timelines and determine proper summary statistics, report formats and all other analysis considerations.

Develop complex multi-level or nested mixed effect models and generalized logit models, and perform partitioning and analysis of variance components at different levels of nested random effects in SAS environment.

Re-purpose and execute AHRQ SAS programs on Hospital Discharge or Inpatient Claims data to generate quality measures or indicators at different levels of stratification: Tax Identification Numbers (TINs), Accountable Care Organizations (ACOs) etc.

Environment: SAS (SAS/BASE, SAS/MACRO, SAS/STAT, SAS/ETS, SAS/ODS, PROC SQL), MS SQL, Unix, SAS Enterprise Guide, SAS Enterprise Miner, R/RStudio, Python

Syntel Inc. / Credit Suisse New York City, NY

Data Scientist Consultant Feb 2015 – Feb 2016

Responsibilities

Technical documentation and validation of Credit Risk/Loss models and Pre-Provision Net Revenue (PPNR) models as part of Comprehensive Capital Analysis and Review (CCAR) implementation. This technical documentation and validation covers the wholesale Probability of Default (PD), Loss Given Default (LGD), Expected Loss (EL) and other regression models for revenue forecasts or projections.

Review of white papers on constructed PPNR regression-based models for projections and wholesale credit risk modeling to aid in productionizing these models.

Implementation of models to comply with CCAR architecture and Basel I, II, III regulations

Implement predictive models in Python and R environments as atoms (executable units of code that read data from a table and output a model object to be stored in a table) for onward deployment in Model Manager for execution. Leveraged advanced econometric modeling techniques (panel and time-series regression etc.) to build the atoms and test model assumptions (including Gauss-Markov assumptions) to ensure robustness.

Build and implement the production-ready codes or atoms with APIs or R packages created for internal or in-house use for model development.

Ensure the model implementation is aligned with the model’s purpose and its underlying analytical approach including a thorough analysis of input data, review of the underlying code used in implementation and model performance.

Performs independent model validations. This involves assessing the model’s overall suitability to its intended purpose, evaluating the model’s mathematical and statistical theory

Environment: R/RStudio, Python/PyCharm, Hadoop, SQL Developer, Tortoise SVN, Linux

Omni Claims Inc., Woburn MA

Data Scientist Contractor April 2014 – Dec. 2014

Responsibilities

Developed complex SQL queries and stored procedures based on requirements of medical edits with high potential dollar savings for the company’s clients like HIP and United Health Service (UHS). The requirements in the edits specify procedure and diagnosis codes and other NCCI regulations for which administered claims are denied or billed. With the appropriate input and output parameters specified, the stored procedures are run against company databases like COSMOS and HIP to obtain the claims with possible dollar savings.

Built a scalable fraud/anomaly detection model (classifier) to determine whether DRG claims should be audited and to identify whether a given provider facility is paid for some or all of its healthcare insurance claims according to DRG, case rate or per diem contract terms. These models were built leveraging supervised machine learning algorithms like Random Forest (RF), Neural Networks (NN), Elastic Net (EN), Stochastic Gradient Boosting and regularized logistic regression in an R environment hosted on an AWS EC2 instance.

Tuned and optimized model hyperparameters through grid search, cross-validation etc., and evaluated model performance as part of the modeling process.

Created and managed EC2 instances on Amazon Web Services (AWS), and hosted, maintained and scaled up the memory capacity of RStudio on AWS so that RStudio can run advanced models on big data without memory limitation issues.

Environment: R, Python, SQL Server, Amazon Web Services (AWS), MS Excel.

Cognizant Technology Solutions Canton MA

R Statistical Consultant Nov. 2013 – March 2014

Responsibilities:

Applied data mining techniques in R and Oracle to generate association rules among a variety of Dunkin brands for the project. This helps the company with product placement and cross-sell marketing campaigns to boost revenue and profitability.

Coded the apriori algorithm of Market Basket Analysis using PL/SQL in Oracle environment.

Manipulated POS data and used scripts run on Oracle to extract transaction data on Dunkin brands, then cleaned and transformed the data for sample market basket analysis using the arules package in R to generate support, confidence, lift and association rules among brands.

Interfaced with the client and other team members to understand the functional requirements of the project and develop technical design.

Implemented the technical design and developed and tested code in the Oracle Advanced Analytics environment.

Environment: RStudio, Oracle PL/SQL, Oracle OBIEE, Oracle DB, Oracle Client tools

Dish Network Englewood, CO

Data Scientist /R Programmer Consultant April 2013 – October 2013

Responsibilities

Built a response model for a direct marketing campaign using PROC LOGISTIC in SAS, with both forward and backward stepwise regression for variable selection (across hundreds of variables) at SLE = 0.001 and SLS = 0.001. The model was built on a training set (70% of the data) and validated on a validation/test set (30% of the data), and reported an Area Under the Curve (AUC) of 78.6% for the Receiver Operating Characteristic (ROC) curve, against the random baseline of 50%. The same model was replicated in the R environment for comparison with other models using advanced methods such as Elastic Net with a binomial family and support vector machines.

Scored the model on new data and used PROC RANK or a SAS macro block to partition the predicted responses into deciles for effective targeting in direct marketing campaigns.

Environment: RStudio, Base SAS, SAS Enterprise Guide, Python, Teradata, MS SQL Server

TECHNICAL SKILLS

Hadoop Ecosystem

Hadoop (HDFS and MapReduce), Hive, Spark, HUE

Analytical/Modeling Software

R/RStudio, Python (PyCharm/IPython), SAS, SPSS, MATLAB, Excel

Cloud Computing Infrastructure

Amazon Web Services (AWS) EC2

Programming language and operating systems

Python, C++, Windows, Linux

BI Tools & Data Visualization

Spotfire Cloud, D3.js, R Shiny

Relational Database Management System

MS SQL Server, Oracle

Version Control

Tortoise SVN, Git

Web Development

HTML, CSS, JSON, JavaScript


