Data Scientist

Location:
Ridgefield, NJ
Salary:
80K per year
Posted:
June 23, 2017

Resume:

SEOWON JEONG

ac0zjl@r.postjobfree.com https://www.linkedin.com/in/seowonjeong 929-***-****

I am currently working as a Data Scientist intern at MATRiX ANALYTiCS CORPORATION, based in New Jersey. I conducted research on artificial intelligence and machine learning algorithms applied to time series data for a client company's credit risk control, including a Sequence Pattern Forecasting Algorithm using k-means clustering and k-NN, and neural networks (recurrent Elman NN, radial basis, MLP).

EDUCATION

Cornell University, Faculty of Computing and Information Science, Ithaca, NY May 2017

Master of Professional Studies in Applied Statistics, 4.03/4.30

• Graduate Student Representative, Cornell Statistics Graduate Society

Hong Kong University of Science and Technology, School of Science, Kowloon, Hong Kong June 2015

Bachelor of Science in Mathematics (Statistics track), First Class Honors, University Scholarship

SKILLS & COURSES

Programming: R, SAS, Oracle SQL; experience with Python (MapReduce in HDFS), Java programming, HTML

Tools: Excel, R, Tableau, Spotfire, SPSS, Hadoop platform (Hive, Pig)

Certificates: Advanced SAS Programming, Using Databases with Python (Coursera)

Courses: Machine Learning & Data Mining, Big Data Management and Analytics, Database & SAS HPC with DBMS

PROJECTS/CASE STUDIES

Identifying tenant network of shopping centers in the United States Jan–May 2017

• Identified the dependency network of shopping center tenants using a modern unsupervised machine learning technique, an undirected graphical model (Ising model), in R

• Extracted new insights through a top-down approach by segmenting data based on shopping center and tenant features

• Detected and corrected inconsistencies in tenant naming conventions across different columns in R

• Applied different treatments (imputation/removal/applying a concept from ecosystem) to missing values depending on the likely reason they were missing (data entry error/an actual indication of disappearance from the shopping center)

• Reported findings in the client-requested format with an actionable narrative using visualizations (Tableau)

• Modified the network visualization and generated an interactive graph in R, adding five more dimensions to the graph to deliver findings more intuitively

Investigation of the influential factors for successful speed dating using machine learning techniques Dec 2016

• Performed variable selection (cluster analysis, stepwise method on logistic regression, and lasso regression) on more than 8,000 survey responses, using the validation set approach, to make suggestions for future speed dating participants

Recommendation to the Mayor of New York City for improvement in 311 call service Dec 2016

• Sourced and cleaned over 2 million records of 311 call service request data from 2016 utilizing R

• Identified the key factors affecting the number of service requests and studied the potential inherent bias

• Modelled the flow of service requests after experimenting with different clusters and intervals

Customer analytics hands-on exercises using SPSS (Targeted vs Untargeted mailing campaign) Feb–Mar 2017

• Analyzed the effect of a “targeted” catalog mailing campaign to Tuscan Lifestyles customers using data on 96,551 customers, including individuals’ past purchase history and profiles, and assessed profitability and return on marketing

• Drew business insights by modelling customers’ response to Intuit’s upsell campaign: used logistic regression to decide the second-wave recipients of a direct mail campaign to maximize profit and return on marketing

RELEVANT EXPERIENCE

Data Analyst, Korea Engineering Consultants Corp, Korea May-Aug 2016

• Automated data cleaning, integration, and transformation, and checked data integrity using VBA in Excel

• Created self-updating, interactive graphs with customized options using Excel (macros, pivot tables, VLOOKUP, etc.)

• Provided instructions on data management with implemented functions to the director for future report generation

The Big Data System Development and Analysis Program Trainee, KITRI, Korea Aug 2015–Feb 2016

• Developed cosmetics and ingredient dictionaries by integrating data from disparate sources (different brands)

• Corrected data integration issues, such as inconsistencies in cosmetic products’ category labelling by brands

• Designed and implemented web visualization of data using the Spring MVC pattern with JDBC

• Analyzed click and web server log data with Python (Beautiful Soup)

• Generated word clouds, scatter plots, and heatmaps in R to summarize visitors’ click patterns

Research Intern, Big Data Institute at Seoul National University, Korea Jan-May 2015

• Visualized 4 TB of government open data with Tableau and Spotfire

• Assessed the applicability of the MapReduce algorithm to the Cox proportional hazards model


