
Machine Learning Data Science

Location:
Queens, NY
Posted:
June 10, 2024


Resume:

Henry (Jiajun) Du

+1-408-***-**** *****.********@*****.***

Apt. 635W, 11-12 30th Drive, Astoria, NY, 11102

EDUCATIONAL BACKGROUND

Columbia University Sep, 2022 - Dec, 2023

Master of Arts in Statistics (Data Science & Finance) GPA: 3.167 / 4.0

University of California, Santa Cruz Sep, 2019 - June, 2022

Bachelor of Arts in Mathematics (Computational Mathematics) GPA: 3.72 / 4.0

Dean's Honors:

2020 Spring Quarter, 2021 Winter Quarter, 2021 Fall Quarter, 2022 Winter Quarter, 2022 Spring Quarter

PERSONALITY

Self-Initiative, Drive, Communication Style, Social Style, Learning Style, Self-Perception, Passionate, Helpful

SKILLS

Programming languages and tools: Python, SQL, RStudio, MATLAB, Tableau, Spark, Excel

Relevant Coursework: Probability and Statistics, Regression Analysis, Machine Learning, Sample Surveys, A/B Testing, Hypothesis Testing, Inference for Independence, Regularization, Model Evaluation

WORK EXPERIENCE

YINGDA CHANG’AN Insurance Brokers CO., LTD. Dec, 2020 - Feb, 2021

Intern in the Analysis Department

Utilized Python (Pandas, NumPy) for detailed analysis of insurance datasets, identifying key trends in payouts and loss conditions.

Created dynamic and interactive dashboards in Tableau for visualizing complex data insights, enhancing stakeholder decision-making.

Leveraged Apache Spark for efficient processing of large-scale datasets, significantly reducing data processing times and improving analysis accuracy.

Developed a customer churn prediction model using Scikit-Learn and logistic regression, integrated with Spark for data processing and Tableau for visualization, aiding in the formulation of targeted customer retention strategies.

Automated financial calculations and data processing workflows, boosting efficiency and accuracy in loan and policy assessments.

ByteDance Jul, 2021 - Aug, 2021

Part-time Assistant in International Payment

Gathered information and coordinated with international payment platforms for detailed content.

Communicated via email with international clients (e.g., India, Pakistan, Brazil) to confirm transaction details when matching payment information was not found.

Used Excel's VLOOKUP for account reconciliation based on cloud storage and billing data.

Verified transaction numbers and amounts in channel account statements.

Conducted root cause analysis through text mining and activity pattern mining using advanced Python and Excel techniques, identifying and resolving operational issues from multiple channels and bank statements spanning 3-5 months.

Integrated all past data provided by the team into a single MySQL relational database to facilitate future searches and queries.

Washington Institute for Health Sciences Jul, 2022 - Sep, 2022

Part-time Research Analyst Assistant to Specialist Bin Li, MD.

Spearheaded a gene expression study on gastric cancer, employing the GEO database and R (Biobase, GEOquery, limma, umap) for analysis, uncovering geographical expression differences.

Utilized GEO2R and DAVID for differential expression and pathway analysis, highlighting the need for personalized medicine in diverse populations.

Advocated for region-specific cancer treatments, contributing to global cancer care innovation.

PROJECTS

Similarity Analysis of Philosophers’ Thoughts (Python)

Loaded and scrutinized the dataset to understand its composition, including the enumeration of rows and columns.

Identified the most referenced philosophers and their works, with a spotlight on luminaries such as Aristotle, Plato, and Hegel.

Generated word clouds for these paramount philosophers to illustrate the most prevalent terms within their writings.

Executed TF-IDF analysis for the juxtaposition of textual resemblances amongst diverse titles and authors, underlining both commonalities and differences.

Deduced the distinct and intersecting philosophical notions of the analyzed authors, with a special emphasis on identifying common threads within Hegel's discourses.

Maximizing Fairness Under Accuracy Constraints (Python)

Led the project, focusing on ethical AI by implementing Zafar et al.'s methods for fair machine learning.

Coordinated team efforts, integrating members' contributions for the ethical AI project. Scheduled regular meetings to ensure timely submission. Presented the project to classmates, showcasing its excellence.

Enhanced data integrity through one-hot encoding, missing value management, and sensitive variable isolation; evaluated models on accuracy and fairness.

Applied Local Massaging and Local Preferential Sampling algorithms for bias reduction in datasets.

Developed and analyzed a logistic regression baseline model, assessing the impact of fairness adjustments on model performance.

Explored the balance between model accuracy and fairness, employing both in-processing and pre-processing strategies in machine learning.

HOBBIES

Bilibili Personal Channel Uploader

Analyzed Bilibili audience preferences with MySQL, identifying sports content as highly favored, leading to the creation of targeted videos.

Achieved a peak view count of 500k on a single video, showcasing content's popularity.

Cultivated a sports enthusiast community through engaging and informative content, leveraging data insights for content strategy.

Basketball

Engaged in weekly street basketball games to relieve stress, honing skills and fostering community with fellow enthusiasts.

LEGO

Fueled by enthusiasm for Lego discovered in my junior college year, engaged in comparing and bidding on rare, out-of-production sets via online platforms such as eBay.

Ignited my interest in utilizing statistical methods for real-life challenges, specifically in price analysis and determining the best bidding times.


