Data Engineer

Location: Berkeley, CA

Posted: October 16, 2012

Resume:

Yun (Helen) He

Computer Systems Engineer III

Scientific Computing Group

NERSC Division, MS 50F-1645

Lawrence Berkeley National Laboratory

Berkeley, CA 94720

Tel: 510-***-****

Fax: 510-***-****

email: ***@***.***

EDUCATION

Ph.D. February 1993 - September 1998

Marine Studies, University of Delaware, Newark, Delaware.

Major Area: Numerical Model Simulation, Satellite Data Analysis.

Advisor: Dr. Xiao-Hai Yan

GPA: 4.0

Dissertation: A Study of the Upper Ocean Processes in the Tropical Pacific Ocean Using Satellite Data, In-Situ Measurements, and Numerical Models.

Won Best Dissertation Award.

M.S. September 1996 - December 1999

Computer Information Science, University of Delaware, Newark, Delaware.

M.S. September 1988 - July 1991

Physical Oceanography, First Institute of Oceanography, SOA, Qingdao, P.R. China.

Advisor: Dr. Yeli Yuan

Thesis: Quasi-geostrophic Generation and Characteristic Analysis of Meso-scale Eddies.

B.S. September 1984 - July 1988

Applied Mechanics, Fudan University, Shanghai, P.R. China.

WORK EXPERIENCE

Climate Algorithms Developer and Computer Systems Engineer, January 2001 - Present

Postdoc Fellow, September 1998 - December 2000

Scientific Computing Group, National Energy Research Scientific Computing Division, Lawrence Berkeley National Laboratory, Berkeley, CA.

- Participate in a multi-institutional ACPI and SciDAC climate project, "Collaborative Design and Development of the Community Climate System Model for Terascale Computers", working with a group of scientists from NCAR and other DOE laboratories to merge the Climate System Model (CSM) and Parallel Climate Model (PCM), with the goal of achieving significant improvements in simulation quality. Participate in the coupler developers' group for the next-generation coupler; optimize the PCM coupler communication pattern, achieving significant improvement; develop an object-oriented, highly modular utility library for global handshaking and communication among different component models running as either a single executable or multiple executables (widely recognized in the community and adopted in CCSM2.0); produce the standard PROTEX manual; propose a ghost cell expansion method for reducing communications in solving PDEs; investigate remapping multi-dimensional arrays on cluster-of-SMP architectures under OpenMP, MPI, and hybrid paradigms.

- Work primarily on a joint LDRD project with Earth Science Division scientists to apply a Coupled Climate-Land Surface Regional Model to Deduce Trends in Soil Moisture from Air Temperature Data. Port the original PSU/NCAR Mesoscale Modeling System (MM5) onto IBM seaborg; create the job submission script, which automatically reads and writes input from NERSC HPSS; write an automatic script to obtain preliminary input data from NNRP data stored in NCAR mass storage; integrate the MM5V35 model with an isotope LSM model; run the integrated model sequentially and in parallel with very different configurations; optimize the code and reach a reasonable speedup; analyze the model results with NCAR Graphics software.

- Work on a joint climate research project between NERSC and GFDL on Modular Ocean Model (MOM3) development for running effectively on massively parallel processing supercomputers, typically the Cray T3E. Develop an efficient and scalable parallel I/O strategy for writing out snapshot files in netCDF format. The remap-and-write scheme resolves the indexing order difference and memory limitation problems. Carry out several critical optimizations of memory management and file access, including an in-core computing mode that eliminates the memory window and ramdisks, achieving significant speedup. Perform the scaling analysis of the baroclinic and barotropic components.

- Help port the NCAR Climate System Model (CSM) onto the NERSC Cray J90 and SV1 clusters. Run all the model test cases as a general user. Identify necessary modifications in the setup and model scripts. Understand the time and memory usage differences between the J90 and SV1. Install visualization software (Ferret) and analyze the results with it.

- Port the Parallel Ocean Program (POP) onto the NERSC J90, T3E, and IBM SP. Understand the basic POP structure, input/output format, and run options. Analyze the POP run timing results under 1D and 2D decomposition, with small and large model sizes. Write a complete stand-alone program to perform the snapshot test; find and correct the problems in the original POP distribution in writing out history files on the T3E. Analyze the POP input/output algorithm, testing with different assign commands and running in the /usr/tmp directory. Compare with the MOM3 I/O scheme and efficiency regarding gather rate, read rate, and write rate.

- Systematically study numerical reproducibility and stability issues in parallel applications from an accurate arithmetics approach. Identify two simple and effective methods, and design a unified MPI operator that works with MPI collective operations to ensure a scalable implementation on large numbers of processors.

- Revise a paper submitted to Nature on marine turtle navigation and sea surface temperature gradient patterns during normal and El Nino years. Reprocess the data at higher temporal resolution and use statistical analysis to study the relationship.

- Collaborate with JPL scientists on a MOM3 application model: ocean responses under high

frequency wind forcing. Help to configure the model with a new grid setup. Carefully

choose proper grid size by running the model benchmarks.

- Maintain the "Parallel Ocean Model Development at NERSC" web homepage.

- Apply for LDRD, DOE and NSF research funds as PI or Co-PI.

Research Assistant, January 1993 - September 1998

College of Marine Studies, University of Delaware, Newark, Delaware.

- Studied a 1-D mixed layer model. Estimated ocean surface heat fluxes by using bulk

parameterization and an inverse mixed layer model.

- Researched different El Nino events, including the most recent one in 1997; studied the western Pacific warm pool in terms of its area, centroid, and volume to better understand the mechanism of El Nino and to predict the event.

- Estimated surface net heat flux in the western tropical Pacific using TOPEX/Poseidon

altimeter data.

- Became familiar with 3-D ocean general circulation models such as MOM, SPEM, and POM, as well as reduced gravity and coupled models. Ran the numerical models on College Sun Sparc Stations, University Mini Cray Machines, and NCAR Cray Supercomputers. Used the models to study the upper ocean dynamics in the western equatorial Pacific, compared the results with in situ data and satellite data, and improved our overall understanding of air-sea interactions, tropical dynamics, and global climate change.

- Extracted oceanographic information from a wide range of in situ and satellite data, such as TOGA-TAO, AVHRR, TOPEX/Poseidon, ERS-1,2, ECMWF, and rainfall data. Familiar with different data formats: ASCII, unformatted, binary, HDF, netCDF, etc. Proficient in Fortran, C, Unix, Cray systems, scientific libraries, and related software.

- Assisted advisor in writing and reviewing journal articles, research proposals, etc.

Teaching Assistant, September 1994 - February 1998

College of Marine Studies, University of Delaware, Newark, Delaware.

- Supervised junior graduate students in their Satellite Oceanography course research.

- Led senior undergraduate German students majoring in Computer Information Science, Electronic Engineering, and Mathematics in research for their theses.

Research Associate, July 1991 - January 1993

Laboratory of Geophysical Fluid Dynamics, First Institute of Oceanography, SOA, Qingdao, P. R. China.

- Developed a theoretical quasi-geostrophic vorticity model. Studied the meso-scale eddies in the ocean with numerical simulation, and statistically analyzed the results to obtain the characteristics of the phenomenon.

- Took part in a variety of research projects on wind, waves, and currents funded by the National Science Foundation, the Young Scientist Foundation, oil companies, etc.

- Assisted our group in editing and writing scientific papers and reports.

Teaching Assistant, September 1991 - July 1992

Qingdao University, Qingdao, P. R. China.

- Advised senior undergraduates majoring in Mathematics and Fluid Mechanics on research for their theses.

AWARDS

- Outstanding Performance Award, Lawrence Berkeley National Laboratory, June 2000.

- Best Dissertation Award, University of Delaware, May 1999.

- Best Publication Award, University of Delaware, May 1998.

- Student Travel Award, University of Delaware, May 1998.

- CMS Program Fellowship, University of Delaware, 1997-1998.

- Marian R. Okie Fellowship, University of Delaware, 1996-1997.

- Travel Award for NASA Summer School for Earth Sciences, California Institute of

Technology, 1993.

- First Level Scholarship, Fudan University, 1984-1988.

- Excellent Student Award, Fudan University, 1984-1988.

PROFESSIONAL ORGANIZATIONS

- The IEEE Computer Society

- American Geophysical Union

- American Meteorological Society

- Oceanography Society

- Chinese Computational Physics Association

TRAINING

- Workshop on OpenMP Applications and Tools (WOMPAT 2002), Arctic Region Supercomputing Center, University of Alaska Fairbanks, August 5-7, 2002.

- Seventh Annual CCSM Workshop, Breckenridge, CO, June 25-27, 2002.

- PSU/NCAR Mesoscale Modeling System (MM5) Users' Workshop, NCAR Foothills Laboratory,

Boulder, CO, June 24-25, 2002.

- A Workshop on the ACTS Toolkit: How can ACTS work for you? Lawrence Berkeley

National Laboratory, Berkeley, CA, September 28-30, 2000.

- WOMPAT2000: Workshop on OpenMP Applications and Tools, San Diego Supercomputer

Center, San Diego, CA, July 6-7, 2000.

- A Workshop on the IBM SP System, Berkeley, CA, April 4-6, 2000.

- Workshop on Programming Techniques for the IBM SP2, Berkeley, CA, October 18-20, 1999.

- NERSC User Training Classes: Various Topics on Cray J90, T3E, NERSC File Systems,

Debugging Tools, Scientific Libraries, Visualization, Parallel I/O, MPI-IO, Berkeley, CA,

March 16-17, 1999.

- Training Workshop for NERSC Users of the Tera Computer Corporation's MTA System at

SDSC, Berkeley, CA, January 12-13, 1999.

- NERSC Teleconference Lectures: C90 to J90 Conversion, J90 vs. T3E Code Design, Fortran 77 vs. Fortran 90 Programming Model, Berkeley, CA, April 1999.

- NERSC Teleconference Lectures: Introduction and Parallel Programming on T3E, Berkeley,

CA, November 18, 1998.

- NASA Summer School for Earth Sciences: Processes of Global Change, California Institute of Technology, Pasadena, CA, August 1993.

- Computational Data Compressing Technology, Data and Information Service, SOA, Tianjin,

P. R. China, January 1990.

TECHNICAL SKILLS

Parallel Computing:

- In-depth experience with distributed memory high performance parallel computers

such as CRAY T3E, IBM SP, Compaq, SGI Origin, PC clusters, and workstation clusters.

- Hands-on experience with shared memory Parallel Vector Processor (PVP) computers

such as Cray SV1, J90, and C90 clusters.

- Familiar with different parallel programming models and languages such as Message

Passing Interface (MPI), SHMEM, OpenMP, Solaris threads, HPF, and Titanium.

Other Computer Platforms:

DOS, Windows 95/NT, UNIX, VMS, Solaris, Sun-OS, X-Windows, Linux.

Other Programming Languages and Models:

Basic, FORTRAN, FORTRAN 90, C, C++, Visual C++, JAVA, HTML, SQL, Python,

Assembly, CORBA, LISP.

Data and Image Processing Tools:

MATLAB, netCDF, NCAR, Ferret, CDAT, PVWAVE, Xview, Ximage, PPlus, Gnuplot, Erdas,

IDL, MAPLE, MATHEMATICA, IBM Visualization Data Explorer.

Others: MS Office 97, Adobe Photoshop, Exceed, LaTeX, StarOffice, PPP, TCP/IP.

PUBLICATIONS

- Y. He and C. H.Q. Ding, 'An Evaluation of MPI and OpenMP Paradigms for Multi-Dimensional Data Remapping', Lecture Notes in Computer Science, accepted in February 2003.

- C. H.Q. Ding and Y. He, 'Effective Methods in Reducing Communication Overheads in Solving PDE Problems on Distributed-Memory Computer Architectures', Grace Hopper Celebration of Women in Computing 2002, October 2002.

- C. H.Q. Ding and Y. He, 'Ghost Cell Expansion Method for Reducing Communications in Solving PDE Problems', Lawrence Berkeley National Laboratory Technical Report, Number LBNL-47929, May 2001. Also Proceedings of Supercomputing 2001 Conference, November 2001.

- Y. He and C. H.Q. Ding, 'Multi Program-Components Handshaking (MPH) Utility Version 2

User's Manual', Lawrence Berkeley National Laboratory Technical Report, Number

LBNL-50778, November 2001.

- C. H.Q. Ding and Y. He, 'MPH: a Library for Distributed Multi-Component Environment', Lawrence Berkeley National Laboratory Technical Report, Number LBNL-47930, May 2001. Also Proceedings of the Tenth Workshop on the Use of High Performance Computing in Meteorology, November 2002.

- Y. He and C. H.Q. Ding, 'Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications', Lawrence Berkeley National Laboratory Technical Report, Number LBNL-45040, January 2000. Also Proceedings of International Conference on Supercomputing (ICS'00), 225-234, May 2000. Also Proceedings of the Ninth Workshop on the Use of High Performance Computing in Meteorology: Developments in Teracomputing, 296-317, November 2000. Also Journal of Supercomputing, vol. 18, 259-277, March 2001.

- C. H.Q. Ding and Y. He, 'Data Organization and I/O in a Parallel Ocean Circulation Model', Lawrence Berkeley National Laboratory Technical Report, Number LBNL-43384, May 1999.

PRESENTATIONS

- W.J. Riley, H.S. Cooley, Y. He, and M.S. Torn, 'Coupling MM5 with ISOLSM: Development, Testing, and Applications', Thirteenth PSU/NCAR Mesoscale Modeling System Users' Workshop, NCAR, Boulder, CO, June 2003.

- Y. He, 'Hybrid MPI and OpenMP Programming on the SP', NERSC User Group (NUG) Meeting,

Argonne National Lab, May 2003.

- Y. He, 'Hybrid OpenMP and MPI Programming on the SP: Successes, Failures, and Results',

NERSC User Training, Lawrence Berkeley National Laboratory, March 2003.

- Y. He and C. H.Q. Ding, 'Hybrid OpenMP and MPI Programming and Tuning', 2004 NERSC User Group (NUG) Meeting, Berkeley, CA, June 2004.

- Y. He and C. H.Q. Ding, 'MPI and OpenMP Paradigms on Cluster of SMP Architectures: the Vacancy Tracking Algorithm for Multi-Dimensional Array Transpose', WOMPAT 2002: Workshop on OpenMP Applications and Tools, University of Alaska, Fairbanks, Alaska, August 2002; SuperComputing 2002, Baltimore, MD, November 2002.

- Y. He and C. H.Q. Ding, 'Climate Modeling: Coupling Multiple Component Models by MPH', Berkeley Atmospheric Sciences Center, Third Annual Symposium, October 2003.

- C. H.Q. Ding and Y. He, 'Integrating Program Components on Distributed Memory Architectures Via MPH', 11th Conference on Parallel Processing for Scientific Computing, sponsored by the Society for Industrial and Applied Mathematics (SIAM), San Francisco, February 2004.

- C. H.Q. Ding and Y. He, 'MPH: a Library for Coupling Climate Component Models in Distributed Memory Architecture', International Parallel and Distributed Processing Symposium, Santa Fe, NM, April 2004. Also SuperComputing Conference 2003, Phoenix, AZ, November 2003.

- C. H.Q. Ding and Y. He, 'MPH: a Library for Distributed Multi-Component Environment',

SuperComputing 2002, Baltimore, MD, November 2002.

- C. H.Q. Ding and Y. He, 'Effective Methods in Reducing Communication Overheads in Solving PDE Problems on Distributed-Memory Computer Architectures', Grace Hopper Celebration of Women in Computing 2002, Vancouver, Canada, October 2002.

- Y. He and C. H.Q. Ding, 'Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications', Bay Area Scientific Computing Day, Berkeley, CA, February 26, 2000. Also presented at International Conference on Supercomputing (ICS'00) in Santa Fe, New Mexico, May 8-11, 2000. Also presented as Research GEM at SC2000, Dallas, TX, November 4-10, 2000. Also presented at the Ninth Workshop on the Use of High Performance Computing in Meteorology: Developments in Teracomputing, European Centre for Medium-Range Weather Forecasts, Reading, U.K., November 2000.

- Y. He and C. H.Q. Ding, 'Numerical Reproducibility on Distributed Platforms: An Accurate Arithmetics Approach', Research GEM at SC2000, Dallas, TX, November 2000.

- Y. He, B. Liblit, and C.J. Lin, 'Ti-Petsc: Integrating Titanium with PETSc', invited talk at A Workshop on the ACTS Toolkit: How can ACTS work for you? Lawrence Berkeley National Laboratory, Berkeley, CA, September 2000.

- Y. He and C. H.Q. Ding, 'Computational Ocean Modeling', Computer Science Graduate Fellow (CSGF) Workshop, invited talk at Lawrence Berkeley National Laboratory, Berkeley, CA, July 2000.

- C. H.Q. Ding and Y. He, 'Data Organization and I/O in a Parallel Ocean Circulation Model', Supercomputing 99 Conference, Portland, OR, November 13-19, 1999. Also presented at Lawrence Berkeley National Laboratory, Berkeley, CA, March 2000.

- Y. He and C. H.Q. Ding, 'Computational Aspects of Modular Ocean Model Development',

invited talk at Jet Propulsion Laboratory, Pasadena, CA, April 1999.

REFERENCES

Dr. Chris H.Q. Ding

Staff Scientist

Scientific Computing Group

NERSC Division, MS 50F

Lawrence Berkeley National Laboratory

Berkeley, CA 94720

Tel: 510-***-****

Fax: 510-***-****

email: *******@***.***

Dr. Esmond Ng

Senior Scientist and Group Leader

Scientific Computing Group

NERSC Division, MS 50F

Lawrence Berkeley National Laboratory

Berkeley, CA 94720

Tel: 510-***-****

Fax: 510-***-****

email: ****@***.***

Dr. Xiao-Hai Yan

Professor and Associate Director

Center for Remote Sensing

College of Marine Studies

University of Delaware

Newark, DE 19716

Tel: 302-***-****

Fax: 302-***-****

email: *******@****.***


