SAS Programmer/ Data Analyst Contract

Location: Baltimore, MD
Posted: October 15, 2014


Resume:

Jian Sun

SAS Programmer/ Data Analyst

Summary:

Over 7 years of experience with statistical software packages: SAS (BASE/STAT/MACRO/SQL/ODS), R, Python, and SPSS; solid skills in MS Access and Excel.

Proficient in Data Analysis/Data Management/Database Design/Business Analysis.

Certified in SAS Advanced and Base programming.

Proficient in SAS 9.2/9.3, the SAS Business Intelligence (BI) platform with Enterprise Guide, and the BI platform with Ab Initio.

Experience working with SAS on different platforms such as UNIX and Windows.

Proficient in developing SAS programs for data requirement analysis and data mapping for ETL (Extract, Transform, Load) processes.

Strong skills in developing SAS programs for Analysis of Variance (ANOVA), Linear Regression, Logistic Regression, Multivariate Analysis, and Statistical Modeling.

Experience pulling data from data warehouses (Teradata, Oracle, and SQL Server).

Proven skills in data cleaning, data validation, data integrity, ad hoc reporting, and coding with SAS in various environments.

Experience with Hadoop (MapReduce and HDFS) and Online Analytical Processing (OLAP: SQL constructs with CUBE and ROLLUP).

Solid knowledge of Stochastic Processes, Time Series, Game Theory, and Operations Research.

Expertise in Machine Learning (Supervised and Unsupervised) and Data Mining.

Experience with Microsoft Office tools such as MS Access, MS SharePoint, MS Word, MS PowerPoint, and MS Excel.

Proficient in quantitative and qualitative research.

Knowledge of Medicaid and Medicare, public health, and clinical trials (good understanding of CDISC, SDTM, and ADaM standards).

Responsible for database recovery, leading software testing, and troubleshooting.

Proven ability to establish effective task priorities as a team player with a results-oriented attitude.

Excellent communication and interpersonal skills.

Education:

University of California, Berkeley

B.A. in Statistics

B.S. in Mathematics

Certifications:

SAS Certified Advanced Programmer

SAS Certified Base Programmer

Technical Skills:

R, SAS (BASE/STAT/GRAPH/SQL/MACRO/ODS/ETL/ACCESS), SAS Enterprise Guide, Python, SPSS, MATLAB, VBA, UNIX, Outlook, Prezi, Teradata, SQL Server, Oracle, MS Office, Data Mining, Machine Learning, Predictive Modeling, Stochastic Processes, Intermediate Accounting, Operations Research, Game Theory, Time Series, Mathematical Analysis, Financial Analysis

Professional Accomplishments

InVentiv Health Clinical, Princeton, NJ Sep 2013 to Present

SAS Programmer

Responsibilities:

Used SQL Assistant to query and analyze clinical data from Teradata DB

Performed data extraction by converting MS Word documents and MS Excel tables into SAS data sets

Utilized SAS/Base (Proc Means, Proc Freq, Proc Sort) to clean and validate large datasets, and used Proc SQL to extract columns and join tables

Used the Proc SQL Pass-Through Facility to create and update Teradata tables

Conducted logistic regression analysis using Proc Logistic in SAS to identify the factors that affect mortality

Generated residual diagnostics to calculate p-values for outlier testing, and produced t-tests based on classification variables

Ran Proc ANOVA to test the equality of mean systolic and diastolic values across the categories

Wrote a SAS macro to create a table displaying the median systolic and diastolic measurements for each quartile of women’s body mass index (BMI)

Used the Output Delivery System (ODS) to write an analytical report, directing SAS output to an HTML file that includes statistical result tables, an analysis summary, and data interpretation

Presented the HTML report to clients by using Prezi

Environment and Skill: SAS/BASE, SAS/MACRO, SAS/SQL, SAS/STAT, SAS/GRAPH, SAS/ODS, MS Excel, Prezi, Teradata DB.

Merkle, Columbia, MD Mar 2012 to Aug 2013

SAS Programmer for Clinical Trial Statistical Learning

Responsibilities:

Used Proc SQL to extract clinical data from Teradata DB, and used SQL Assistant to query and analyze the data

Performed data cleansing during the extraction and loading phases of ETL using Ab Initio

Performed data validation using Proc Freq, Proc Means, and Proc Report

Performed Principal Component Analysis (PCA) in SPSS to reduce the number of predictor variables

Performed cluster analysis with K-means and K-medoids to minimize within-cluster dissimilarity based on data attributes

Used the pseudo-F statistic and the Cubic Clustering Criterion (CCC) to assess the tightness of clusters and to measure cluster deviation

Built a sparse logistic-regression classifier (Lasso and Ridge Regression) in SAS to separate significant from non-significant factors for patients

Compared three classification methods (CART, bagging, and random forests) with the sparse logistic-regression classifier to determine the optimal method

Applied One-vs-All (OVA) and All-vs-All (AVA) classification with Support Vector Machine (SVM) and Generalized Boosted Method (GBM), and compared them to make the final selection

Designed and developed code with SAS/EG to support ad-hoc requests

Used SAS/ODS to produce HTML reports and presented data visualizations and analysis results in a poster

Developed a rule-based model and collaborated with pharmacists to discuss risk factors

Environment and Skill: SAS/BASE, SAS/SQL, SAS/STAT, SAS/ODS, R, SPSS, MS Excel, Machine Learning, Teradata DB, Teradata SQL Assistant, Clinical Trial

Keas, San Francisco, CA Oct 2010 to Feb 2012

Data Analyst/ R Programmer

Responsibilities:

Wrote SQL queries to pull clinical data from Oracle using Oracle SQL Developer

Converted Oracle data tables into MS Excel format and read them into R

Set up a UNIX environment to clean the data according to the specifications

Estimated by maximum likelihood (MLE) the parameters governing whether a family dies out, and computed the maximum number of generations of a family

Generated Poisson simulations and probability distributions using the Monte Carlo method in R

Wrote a UNIX shell script to manipulate XML-formatted data and extract family-name information

Created a data frame and family tree to compare summary statistics across parameters

Used R to characterize the data and test whether it exhibits a time series trend

Extracted and manipulated data and fitted a regression model for the number of generations

Created tables to report frequencies of peaks within a calendar year, and fitted harmonic trends to find the cycle of the maximum number of generations of a family

Built a GARCH model to predict the conditional standard deviation of the future number of generations of a family

Fitted an autoregressive integrated moving average (ARIMA) model to forecast the future generation trend

Presented results through graphs and a formatted HTML table report to conclude the simulation

Environment and Skill: R, UNIX, XML, HTML, MS Excel, Oracle, Oracle SQL Developer, Mathematical Statistics Concepts, Time Series, Clinical Trial

K.L.D. ltd, Yantai, China Apr 2009 to Sep 2010

SAS/SQL Programmer

Responsibilities:

Wrote SQL to extract data from different relational database management systems, used SQL Assistant to query and analyze Oracle data, and wrote Excel macros to automate repetitive tasks

Collected data and built an Enhanced Entity-Relationship (EER) diagram for an animal shelter database

Built a relational database model, wrote structured queries in SQL, and designed the database in Microsoft Access

Developed queries on the existing Oracle databases to fulfill ad hoc requests

Performed normalization analysis to identify the primary and non-primary keys among all attributes

Composed ad hoc reports and documented SQL code files for the supervisor’s review

Environment and Skill: SQL, MS Access, MS Visio, MS Excel, Data Modeling, EER diagram, Operations Research, Oracle

HOYN Capital, Beijing, China Aug 2007 to Mar 2009

Financial Project Coordinator/SAS Analyst

Responsibilities:

Extracted data from a financial database using SQL Assistant and cleaned it in Excel

Loaded the dataset into SAS and used Proc Means and Proc Univariate to validate the data

Estimated a product’s future sales, profits, and cash flows over its five-year life cycle, and analyzed 10-K reports by calculating financial ratios in Excel and MATLAB

Calculated internal rate of return and net present value to determine how many shares of common stock must be issued

Applied financial analysis to show how large a mortgage the prospective company can afford and how much total interest will be paid

Performed financial calculations to determine the effective annual yield to maturity on each bond

Determined the true economic break-even point based on sales volume

Helped the company decide whether the product was reasonable to purchase and introduce

Environment and Skill: MS Excel, MS Word, SAS, SQL Assistant, MATLAB, Financial Analysis, Intermediate Accounting


