Computer Science Electrical Engineering

Location:

Knoxville, TN

Posted:

December 31, 2012

Resume:

Jakub Kurzak

Innovative Computing Laboratory

Electrical Engineering and Computer Science Department

University of Tennessee

**** ********* ****

Ste 203 Claxton

Knoxville, TN 37996-3450

OFFICE: 318 Claxton

PHONE: 865-***-****

FAX: 865-***-****

EMAIL: ******@****.***.***

WWW: http://web.eecs.utk.edu/~kurzak

EDUCATION

PhD

Computer Science Department ISBN: 978-0542512018

University of Houston

Houston, Texas

2005

Electrical Engineering Department

Wrocław University of Technology

Wrocław, Poland

2000

EXPERIENCE

Research Director September 2010 - present

Research Scientist March 2009 - August 2010

Senior Research Associate January 2006 - February 2009

Innovative Computing Laboratory

Electrical Engineering and Computer Science Department

University of Tennessee

Knoxville, Tennessee

Research Assistant January 2001 - December 2005

Institute for Molecular Design

Department of Chemistry

University of Houston

Houston, Texas

EXPERTISE

All aspects of utilizing silicon to the fullest, from exploiting instruction-level parallelism with SIMD vectorization,

through multithreading on multi-core processors, to message passing on large-scale distributed-memory systems.

Experience with hardware accelerators / co-processors: Cell B. E. (RIP), GPUs, MIC.

Extensive knowledge of numerical algorithms for scientific computing, linear algebra in particular.

BOOK EDITOR

[1] J. Kurzak, D. Bader, J. Dongarra (editors)

Scientific Computing with Multicore and Accelerators

Computational Science series, Chapman & Hall/CRC, 2010

ISBN: 978-1439825365

BOOK CHAPTERS

[9] J. Dongarra, J. Kurzak, P. Luszczek, S. Tomov

Dense Linear Algebra on Accelerated Multicore Hardware

In High-Performance Scientific Computing: Algorithms and Applications

Springer-Verlag, 2012

ISBN: 978-1447124368

[8] J. Kurzak, P. Luszczek, A. YarKhan, M. Faverge, J. Langou, H. Bouwmeester, J. Dongarra

Multithreading in the PLASMA Library

In Handbook of Multi and Many-Core Processing: Architecture, Algorithms, Programming, and Applications

Computer & Information Science Series, Chapman & Hall/CRC, 2012

ISBN: 978-1447124368

[7] P. Luszczek, J. Kurzak, J. Dongarra

Changes in Dense Linear Algebra Kernels: Decades-Long Perspective

In Solving the Schrödinger Equation: Has Everything Been Tried?

Imperial College Press

ISBN: 978-1848167247

[6] W. Alvaro, J. Kurzak, J. Dongarra

Implementing Matrix Multiplication on the Cell B. E.

In Scientific Computing with Multicore and Accelerators

Computational Science series, Chapman & Hall/CRC, 2010

ISBN: 978-1439825365

[5] J. Kurzak, J. Dongarra

Implementing Matrix Factorizations on the Cell B. E.

In Scientific Computing with Multicore and Accelerators

Computational Science series, Chapman & Hall/CRC, 2010

ISBN: 978-1439825365

[4] J. Kurzak, H. Ltaief, J. Dongarra, R. Badia

Scheduling for Numerical Linear Algebra Library at Scale

In High Speed and Large Scale Scientific Computing

Advances in Parallel Computing series, IOS Press, 2010

ISBN: 978-1607500735

[3] A. Buttari, J. J. Dongarra, J. Kurzak, J. Langou

Parallel Dense Linear Algebra Software in the Multicore Era

In Cyberinfrastructure Technologies and Applications

Nova Science Publishers, Inc., 2009

ISBN: 978-1606920633

[2] A. Buttari, J. Dongarra, J. Kurzak, P. Luszczek, S. Tomov

Using Mixed Precision in Solving Linear Systems of Equations

In High Performance Computing and Grids in Action

Advances in Parallel Computing series, IOS Press, 2008

ISBN: 978-1586038397

[1] J. Demmel, B. Parlett, W. Kahan, M. Gu, D. Bindel, Y. Hida, E. J. Riedy, C. Voemel,

J. Kurzak, A. Buttari, J. Langou, S. Tomov, J. Dongarra, X. Li, O. Marques, J. Langou, P. Luszczek

Prospectus for a Dense Linear Algebra Software Library

In Handbook of Parallel Computing: Models, Algorithms and Applications

Computer and Information Science series, Chapman & Hall/CRC, 2008

ISBN: 978-1584886235

JOURNAL PUBLICATIONS

[19] J. Kurzak, P. Luszczek, J. Dongarra

LU Factorization with Partial Pivoting for a Multicore System with Accelerators

IEEE Transactions on Parallel and Distributed Systems (submitted)

http://www.computer.org/portal/web/tpds

[18] J. Kurzak, S. Tomov, J. Dongarra

Autotuning GEMM Kernels for the Fermi GPU

IEEE Transactions on Parallel and Distributed Systems (accepted)

http://www.computer.org/portal/web/tpds

[17] J. Kurzak, H. Ltaief, J. Dongarra, Rosa M. Badia

Scheduling Dense Linear Algebra Operations on Multicore Processors

Concurrency and Computation: Practice and Experience 22(1):15-44, 2010

DOI: 10.1002/cpe.1467

[16] H. Ltaief, J. Kurzak, J. Dongarra

Scheduling Two-Sided Transformations Using Tile Algorithms on Multicore Architectures

Scientific Programming 18(1):35-50, 2010

DOI: 10.3233/SPR-2010-0297

[15] H. Ltaief, J. Kurzak, J. Dongarra

Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures

IEEE Transactions on Parallel and Distributed Systems 21(4):417-423, 2010

DOI: 10.1109/TPDS.2009.79

[14] M. Baboulin, A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, S. Tomov

Accelerating Scientific Computations with Mixed Precision Algorithms

Computer Physics Communications, 40th Anniversary Issue 180(12):2526-2533, 2009

DOI: 10.1016/j.cpc.2008.11.005

[13] J. Kurzak, J. Dongarra

QR Factorization for the CELL Processor

Scientific Programming, Special Issue: High Performance Computing

with the Cell Broadband Engine 17(1-2):31-42, 2009

DOI: 10.3233/SPR-2009-0268

[12] J. Kurzak, W. Alvaro, J. Dongarra

Optimizing Matrix Multiplication for a Short-Vector SIMD Architecture CELL Processor

Parallel Computing: Systems & Applications, Special Issue: Revolutionary Technologies for Acceleration of

Emerging Petascale Applications 35(3):138-150, 2009

DOI: 10.1016/j.parco.2008.12.010

[11] A. Buttari, J. Langou, J. Kurzak, J. Dongarra

A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures

Parallel Computing: Systems and Applications 35:38-53, 2009

DOI: 10.1016/j.parco.2008.10.002

[10] A. Buttari, J. Langou, J. Kurzak, J. Dongarra

Parallel Tiled QR Factorization for Multicore Architectures

Concurrency and Computation: Practice and Experience 20(13):1573-1590, 2008

DOI: 10.1002/cpe.1301

[9] A. Buttari, J. Dongarra, J. Kurzak, P. Luszczek, S. Tomov

Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance

While Achieving 64-bit Accuracy

ACM Transactions on Mathematical Software 34(4), article 17, 22 pages, 2008

DOI: 10.1145/1377596.1377597

[8] A. Buttari, J. Dongarra, J. Langou, J. Langou, P. Luszczek, J. Kurzak

Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems

International Journal of High Performance Computing Applications 21(4):457-466, 2007

DOI: 10.1177/1094342007084026

[7] J. Kurzak, A. Buttari, J. Dongarra

Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization

IEEE Transactions on Parallel and Distributed Systems 19(9):1175-1186, 2008

DOI: 10.1109/TPDS.2007.70813

[6] J. Kurzak, J. Dongarra

Implementation of Mixed Precision in Solving Systems of Linear Equations on the CELL Processor

Concurrency and Computation: Practice and Experience 19(10):1371-1385, 2007

DOI: 10.1002/cpe.1164

[5] J. Kurzak, B. M. Pettitt

Message-Passing Implementation of the Data Diffusion Communication Model

in Fast Multipole Methods: Large Scale Biomolecular Simulations

Journal of Algorithms & Computational Technology 2(4):557-579, 2008

DOI: 10.1260/174830108786231722

[4] J. Kurzak, D. Mirkovic, B. M. Pettitt, S. L. Johnsson

Automatic Generation of FFTs for Translations of Multipole Expansions in Spherical Harmonics

International Journal of High Performance Computing Applications 22(2):219-230, 2008

DOI: 10.1177/1094342008090915

[3] J. Kurzak, B. M. Pettitt

Fast Multipole Methods for Particle Dynamics

Molecular Simulation 32(10/11):775-790, 2006

DOI: 10.1080/08927020600991161

[2] J. Kurzak, B. M. Pettitt

Massively Parallel Implementation of a Fast Multipole Method for Distributed Memory Machines

Journal of Parallel and Distributed Computing 65(7):870-881, 2005

DOI: 10.1016/j.jpdc.2005.02.001

[1] J. Kurzak, B. M. Pettitt

Communications Overlapping in Fast Multipole Particle Dynamics Methods

Journal of Computational Physics 203(2):731-743, 2005

DOI: 10.1016/j.jcp.2004.09.012

CONFERENCE PUBLICATIONS

[12] J. Kurzak, P. Luszczek, J. Dongarra

Programming the LU Factorization for a Multicore System with Accelerators

VECPAR'12: International Meeting on High-Performance Computing for Computational Science, Kobe, Japan, 2012

Lecture Notes in Computer Science XXXX:xxx-xxx, Springer, 201x

http://nkl.cc.u-tokyo.ac.jp/VECPAR2012/ (accepted)

[11] G. Bosilca, A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak,

J. Langou, P. Lemarinier, H. Ltaief, P. Luszczek, A. YarKhan, J. Dongarra

Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA

IPDPSW'11: International Parallel and Distributed Processing Symposium, Workshops and PhD Forum, Anchorage, AK, 2011

DOI: 10.1109/IPDPS.2011.299

[10] J. Kurzak, R. Nath, P. Du, J. Dongarra

An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs

PARA'10: State of the Art in Scientific and Parallel Computing, Reykjavík, Iceland, 2010

Lecture Notes in Computer Science 7134:248-257, Springer, 2012

DOI: 10.1007/978-3-642-28145-7

[9] E. Agullo, H. Bouwmeester, J. Dongarra, J. Kurzak, J. Langou, L. Rosenberg

Towards an Efficient Tile Matrix Inversion of Symmetric Positive Definite Matrices on Multicore Architectures

VECPAR'10: High Performance Computing for Computational Science, Berkeley, California, 2010

Lecture Notes in Computer Science 6449:129-138, Springer, 2011

DOI: 10.1007/978-3-642-19328-6_14

[8] E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, S. Tomov

Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects

SciDAC'09: Scientific Discovery through Advanced Computing, San Diego, California, 2009

Journal of Physics: Conference Series 180:012037, IOP Publishing, 2009

DOI: 10.1088/1742-6596/180/1/012037

[7] W. Alvaro, J. Kurzak, J. Dongarra

Fast and Small Short Vector SIMD Matrix Multiplication Kernels

for the Synergistic Processing Element of the CELL Processor

ICCS'08: International Conference on Computational Science, Kraków, Poland, 2008

Lecture Notes in Computer Science 5101:935-944, Springer, 2008

DOI: 10.1007/978-3-540-69384-0_98

[6] A. Buttari, J. Langou, J. Kurzak, J. Dongarra

Parallel Tiled QR Factorization for Multicore Architectures

PPAM'07: International Conference on Parallel Processing and Applied Mathematics, Gdańsk, Poland, 2007

Lecture Notes in Computer Science 4967:639-648, Springer, 2007

DOI: 10.1007/978-3-540-68111-3_67

[5] A. Buttari, J. Dongarra, P. Husbands, J. Kurzak, K. Yelick

Multithreading for Synchronization Tolerance in Matrix Factorization

SciDAC'07: Scientific Discovery through Advanced Computing, Boston, Massachusetts, 2007

Journal of Physics: Conference Series 78:012028, IOP Publishing, 2007

DOI: 10.1088/1742-6596/78/1/012028

[4] J. Langou, J. Langou, P. Luszczek, J. Kurzak, A. Buttari, J. Dongarra

Exploiting the Performance of 32 Bit Floating Point Arithmetic in Obtaining 64 Bit Accuracy

(Revisiting Iterative Refinement for Linear Systems)

SC'06: ACM/IEEE Conference on Supercomputing, Tampa, Florida, 2006

DOI: 10.1145/1188455.1188573

[3] J. Kurzak, J. Dongarra

Implementing Linear Algebra Routines on Multi-Core Processors with Pipelining and a Look Ahead

PARA'06: State of the Art in Scientific and Parallel Computing, Umeå, Sweden, 2006

Lecture Notes in Computer Science 4699:147-156, Springer, 2007

DOI: 10.1007/978-3-540-75755-9_18

[2] J. Demmel, J. Dongarra, B. Parlett, W. Kahan, M. Gu, D. Bindel, Y. Hida, X. Li, O. Marques,

E. J. Riedy, C. Voemel, J. Langou, P. Luszczek, J. Kurzak, A. Buttari, J. Langou, S. Tomov

Prospectus for the Next LAPACK and ScaLAPACK Libraries

PARA'06: State of the Art in Scientific and Parallel Computing, Umeå, Sweden, 2006

Lecture Notes in Computer Science 4699:11-23, Springer, 2007

DOI: 10.1007/978-3-540-75755-9_2

[1] A. Buttari, J. Dongarra, J. Kurzak, J. Langou, P. Luszczek, S. Tomov

Impact of Multicore on Math Software

PARA'06: State of the Art in Scientific and Parallel Computing, Umeå, Sweden, 2006

Lecture Notes in Computer Science 4699:1-10, Springer, 2007

DOI: 10.1007/978-3-540-75755-9_1

TECHNICAL REPORTS (not published elsewhere)

[5] J. Kurzak, P. Luszczek, S. Tomov, J. Dongarra

LAPACK Working Note 267:

Preliminary Results of Autotuning GEMM Kernels for the NVIDIA Kepler Architecture GeForce GTX 680

Technical Report UT-CS-12-XXX, Department of Computer Science, University of Tennessee, 2012

http://www.netlib.org/lapack/lawnspdf/lawn267.pdf

[4] J. Kurzak, J. Dongarra

LAPACK Working Note 220:

Fully Dynamic Scheduler for Numerical Computing on Multicore Processors

Technical Report UT-CS-09-643, Department of Computer Science, University of Tennessee, 2009

http://www.netlib.org/lapack/lawnspdf/lawn220.pdf

[3] H. Ltaief, J. Kurzak, J. Dongarra

LAPACK Working Note 208:

Parallel Block Hessenberg Reduction Using Algorithms-by-Tiles for Multi-core Architectures Revisited

Technical Report UT-CS-08-624, Department of Computer Science, University of Tennessee, 2009

http://www.netlib.org/lapack/lawnspdf/lawn208.pdf

[2] A. Buttari, J. Dongarra, J. Kurzak

LAPACK Working Note 185:

Limitations of the PlayStation 3 for High Performance Cluster Computing

Technical Report UT-CS-07-597, Department of Computer Science, University of Tennessee, 2007

http://www.netlib.org/lapack/lawnspdf/lawn186.pdf

[1] A. Buttari, P. Luszczek, J. Kurzak, J. Dongarra, G. Bosilca

SCOP3: A Rough Guide to Scientific Computing On the PlayStation 3

Technical Report UT-CS-07-595, Department of Computer Science, University of Tennessee, 2007

www.netlib.org/utk/people/JackDongarra/PAPERS/scop3.pdf

POPULAR SCIENCE

J. Kurzak, A. Buttari, P. Luszczek, J. Dongarra

The PlayStation 3 for High Performance Scientific Computing

Computing in Science and Engineering 10(3):84-87, 2008

ISSN: 1521-9615

TUTORIALS

J. Dongarra, J. Kurzak

LINPACK on Future Manycore & GPU Based Systems

ISC 2010'11'12: International Supercomputing Conference, Hamburg, Germany, 2010'11'12

J. Dongarra, J. Demmel, M. Heroux, J. Kurzak

Linear Algebra Libraries for High-Performance Computing: Scientific Computing with Multicore and Accelerators

SC 2011: ACM/IEEE Conference on Supercomputing, Seattle, WA, 2011

D. Göddeke, J. Kurzak, J. P. Weiß

Scientific Computing on GPUs

PPAM 2011: Parallel Processing and Applied Mathematics, Toruń, Poland, 2011

J. Kurzak

Cell Broadband Engine Programming to the Metal

AFRL / Griffis Institute, Rome, NY, 2009

J. Kurzak, A. Buttari

Introduction to Programming High Performance Applications on the CELL Broadband Engine

HOTI 2007: 15th Annual IEEE Symposium on High-Performance Interconnects, Stanford, CA, 2007

REVIEWER

Transactions on Parallel and Distributed Systems (IEEE)

Journal of Parallel and Distributed Computing (Elsevier)

Parallel Computing: Systems and Applications (Elsevier)

Concurrency and Computation: Practice and Experience (Wiley)

International Journal of High Performance Computing Applications (SAGE)

Journal of Computer and System Sciences (Elsevier)

IBM Journal of Research and Development (IBM)

Transactions on Mathematical Software (ACM)

Parallel Processing Letters (World Scientific)

Journal of Computational Science (Elsevier)

Embedded Systems Letters (IEEE)

Computing in Science & Engineering (IEEE)

International Conference on Supercomputing (ICS)

International Parallel & Distributed Processing Symposium (IPDPS)

Workshop on State-of-the-Art in Scientific and Parallel Computing (PARA)

International Conference on Parallel Processing and Applied Mathematics (PPAM)

International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC)

Springer

Taylor & Francis

U.S. Department of Energy, Office of Science

Natural Sciences and Engineering Research Council of Canada

PROGRAM COMMITTEE

CCGrid: IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing ('10'11'12)

SAAHPC: Symposium on Application Accelerators in High Performance Computing ('11'12)

PPAM: International Conference on Parallel Processing and Applied Mathematics ('09)

PPAC: Workshop on Parallel Programming on Accelerator Clusters ('09'10'11)

Euro-Par: European Conference on Parallel Processing ('10)

GRANTS

J. Dongarra, J. Kurzak, P. Luszczek

PULSAR: Parallel Unified Linear Algebra with Systolic Arrays

National Science Foundation

J. Dongarra, J. Kurzak, J. Langou

PLASMA: Parallel Linear Algebra Software for Multiprocessor Architectures

National Science Foundation

COLLABORATORS

E. Agullo J. Demmel L. Johnsson D. Mirkovic

W. Alvaro J. Dongarra W. Kahan B. Parlett

M. Baboulin M. Faverge J. Langou (Julie) M. Pettitt

R. Badia M. Gu J. Langou (Julien) J. Riedy

D. Bindel B. Hadri P. Lemarinier L. Rosenberg

G. Bosilca A. Haidar X. Li S. Tomov

A. Bouteiller T. Herault H. Ltaief C. Voemel

H. Bouwmeester Y. Hida P. Luszczek A. YarKhan

A. Buttari P. Husbands O. Marques K. Yelick

A. Danalis
