Computing / Concurrent computing / Parallel computing / Computer programming / OpenMP / Roofline model / Multi-core processor / Manycore processor / Thread / Benchmark / CUDA / Data parallelism
Date: 2014-11-13 12:51:32

Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis Yu Jung Lo, Samuel Williams, Brian Van Straalen, Terry J. Ligocki, Matthew J. Cordery, Nicholas J. Wright, Mary W. Hall, and Leonid Oliker U

Source URL: www.dcs.warwick.ac.uk

File Size: 339,96 KB

Similar Documents

BOPS, Not FLOPS! A New Metric, Measuring Tool, and Roofline Performance Model For Datacenter Computing Chen Zheng ICT,CAS

DocID: 1xVt0

Cache-aware Roofline model: Upgrading the loft Aleksandar Ilic, Frederico Pratas, and Leonel Sousa INESC-ID/IST, Technical University of Lisbon, Portugal {ilic,fcpp,las}@inesc-id.pt

DocID: 1rBXE

Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis Yu Jung Lo, Samuel Williams, Brian Van Straalen, Terry J. Ligocki, Matthew J. Cordery, Nicholas J. Wright, Mary W. Hall, and Leonid Oliker U

DocID: 1rrNN

Design of Parallel and High Performance Computing HS 2013 Markus Püschel, Torsten Hoefler Department of Computer Science ETH Zurich

DocID: 1rlc8

Auto-tuning the 27-point Stencil for Multicore Kaushik Datta², Samuel Williams¹, Vasily Volkov², Jonathan Carter¹, Leonid Oliker¹, John Shalf¹, and Katherine Yelick¹ ¹CRD/NERSC, Lawrence Berkeley National Laborat

DocID: 1r4gA