Abhinav Vishnu
Email: *******.******@*****.***
Contact Info.: 614-***-****
Education:
Ph.D., 2007, Computer Science, The Ohio State University
B.Tech., 2002, Computer Science, Institute of Technology, BHU
Appointments:
01/2009–Present, Senior Research Scientist, Pacific Northwest National Laboratory
02/2008–11/2008, Advisory Software Engineer, IBM
Summary: Over the last 12 years, I have designed scalable, fault-tolerant, and energy-efficient
programming models for massively parallel and distributed systems on high-speed communication networks
such as Cray Gemini/Aries, IBM Blue Gene/Q, and InfiniBand, and on commodity networks such as Ethernet. My
current research interest is leveraging these programming models to design scalable data mining and machine
learning algorithms as part of the Extreme Scale Data Analysis Library (xDAL). This research is integrated into
open source software: MVAPICH (2000+ organizations, 70+ countries) and Global Arrays (600+ organizations,
30+ countries).
Project Contributions:
• Massively Parallel Data Mining and Machine Learning: Led design and development of Extreme Data
Analysis Library (xDAL). Scaled clustering algorithms (K-means, Spectral) to 256 nodes (4K processes),
classification (Support Vector Machines) to 128 nodes (2048 processes), and recommender systems (FP-
Growth) to 64 nodes (1024 processes). An alpha release is under preparation.
• Global Arrays: Led design and implementation of Communication Runtime for Exascale (ComEx).
Achieved 99% communication efficiency on Cray Gemini, Cray Aries, IBM Blue Gene/Q,
and InfiniBand systems. Led design and development of fault-tolerant and energy-efficient Global Arrays,
reduced the MTBF inverse square root and achieved a 10% energy efficiency improvement without performance loss
on applications. Recent work on soft errors reduced their impact by 7x.
• MVAPICH: Led design of a multi-rail-enabled and congestion-aware communication runtime system. Achieved a 2x
improvement on benchmarks and a 1.4x improvement on applications.
Software:
• Programming Languages: C (Expert), C++ (Advanced), Java (Prior Experience).
• Systems Software: MPI (Expert), PGAS (Expert), Pthreads (Intermediate)
• Hardware: Interconnects (Expert), Intel MIC (Intermediate)
Awards and Accomplishments:
• Best Paper Finalist. Hot-Spot Avoidance with Multi-Pathing Over InfiniBand: An MPI Perspective, IEEE
International Symposium on Cluster Computing and Grid (CCGrid), Rio de Janeiro, Brazil, 2007.
• IBM PhD Fellowship Award. Academic Year, 2006.
Research Grants:
• PI: Scalable ARMCI on Portals4, Intel. Period - (04/13 - 11/13). Total Funding - $155,000
• PI: Scalable Knowledge Extraction on Large Scale Systems, Laboratory Directed Research and Development -
Extreme Scale Computing Initiative. Period - (01/13 - 09/13). Total Funding - $125,000
• PI: A Scalable Fault Tolerance Infrastructure and Algorithms with Programming Models and Scientific
Applications, Laboratory Directed Research and Development - Extreme Scale Computing Initiative. Period -
(10/09 - 09/12). Total Funding - $900,000
Patents:
• Flow Control For Reliable Message Passing (with Tsai Yang Jea, Hung Thai, Hanhong Xue, Chulho Kim,
Uman Chan and Zen Piatek), IBM, US 8452888 B2.
Select Recent Publications:
• H. V. Dam, A. Vishnu, and W. D. Jong. A Case of Soft Error Detection and Correction in Computational
Chemistry. Journal of Chemical Theory and Computation (JCTC), Aug., 2013.
• H. V. Dam, A. Vishnu, and W. D. Jong. Designing A Scalable Fault Tolerance Model for Computational
Chemistry: A Case Study with Coupled Cluster Perturbative Triples. Journal of Chemical Theory and
Computation (JCTC), Jan., 2011.
• A. Vishnu, J. Daily, and B. Palmer. Scalable PGAS Communication Subsystems on Cray Gemini
Interconnect. International Conference on High Performance Computing (HiPC), Dec., 2012.
• D. Chavarria, S. Krishnamoorthy, and A. Vishnu. Global Futures: A Multi-threaded Execution Model
for Global Arrays Based Applications. International Conference on Cluster, Cloud and Grid Computing
(CCGrid), May, 2012.
• D. Kerbyson, A. Vishnu, and K. Barker. Energy Templates: Exploiting Application Information for Saving
Energy. International Conference on Cluster Computing (Cluster), Sep., 2011.
• A. Vishnu, R. Olson, and M. Bruggencate. Evaluating the Potential of Cray Gemini Interconnect for PGAS
Models. International Symposium on High-Performance Interconnects (HotI), Aug., 2011.
• C. Y. Su, S. Song, R. Ge, K. Cameron, and A. Vishnu. Iso-energy-efficiency: An Approach to Power-
Constrained Parallel Computation. International Parallel and Distributed Processing Symposium (IPDPS),
May, 2011.
• A. Vishnu, S. Song, A. Marquez, K. Barker, D. Kerbyson and P. Balaji. Designing Energy Efficient
Communication Runtime Systems for Data Centric Programming Models. International Conference on
Green Computing and Communications (GreenCom), Dec., 2010.
• A. Vishnu, H. V. Dam, W. D. Jong, P. Balaji, and S. Song. Fault Tolerant Communication Runtime Support for
Data Centric Programming Models. International Conference on High Performance Computing (HiPC),
India, 2010.
• A. Vishnu and M. Krishnan. Efficient On-demand Connection Management Mechanisms with PGAS
Models on InfiniBand. International Symposium on Cluster Computing and Grid Computing (CCGrid),
May 2010.
Selected Presentations: Several talks on Programming Models and Communication subsystems with cross-
cutting issues of performance, power and reliability: The Ohio State University (2013), ACS Workshop (2013),
Intel (2011), Lawrence Berkeley National Lab (2011), and several conferences including IPDPS’13, Cluster’11,
HotI’11, HiPC’10, and Cluster’10.
Professional Activities:
• Chairmanships and Editorships: International Workshop on Programming Models and Systems
Software (2013, 2012, 2011, 2010, 2009), International Workshop on Parallel and Distributed Computing for
Large Scale Machine Learning and Big Data Analytics (2014).
• Technical Program Committees: International Conference on Parallel Processing (ICPP): 2012;
International Conference on Network and Parallel Computing (NPC): 2012; Workshop on Power Aware Systems
and Architecture (PASA): 2013, 2012; Workshop on Communication Architecture for Scalable Systems
(CASS): 2013, 2012; Partitioned Global Address Space Conference (PGAS): 2011; International
Conference on High Performance Computing (HiPC): 2013, 2012, 2011, 2010; International Conference on
Cluster Computing (Cluster): 2012, 2010; International Conference on Cluster, Cloud and Grid Computing
(CCGrid): 2014, 2012, 2011; International Conference on High Performance Computing and
Communications (HPCC): 2010.
• Review Panels: DOE Small Business Innovation Review Panel, Nov. 2012, 2011.