Numerical linear algebra / Video cards / GPGPU / Nvidia / Graphics hardware / Graphics processing unit / Gaussian elimination / Cache-oblivious algorithm / LU decomposition / Computer hardware / Algebra / Computing


LU-GPU: Efficient Algorithms for Solving Dense Linear Systems on Graphics Hardware
Nico Galoppo, Naga K. Govindaraju, Michael Henson

Document Date: 2005-07-22 02:58:32


File Size: 472,24 KB

City

Seattle / /

Company

GPU / IBM / NVIDIA Corporation / Solving Dense Linear Systems / Intel / on developing GPU-based general linear equation solvers / /

Country

Jordan / United States / /

Currency

USD / /

Event

Force Majeure / Reorganization / /

Facility

ATLAS library / Nico Galoppo Naga K. Govindaraju Michael Henson University of North Carolina / Intel Math Kernel Library / GPU pipeline / Chapel Hill Dinesh Manocha / SSE-enabled Intel Math Kernel Library / Hall et al. / /

IndustryTerm

parallel dense linear algebra algorithms / numerical signal-processing applications / scientific applications / parallel algorithms / dense linear systems using graphics processors / vector-parallel systems / parallel fragment processors / distributed-memory systems / stream processors / graphics applications / pass algorithm / graphics processors / few scientific applications / parallel using fragment processors / fragment processor / programmable vertex processors / multi-pass algorithm / overall algorithm / vector dot products / shared memory algorithms / bus technologies / cache-aware algorithms / divide-and-conquer algorithm / fragment processors / latter general-purpose parallel systems / performance computing systems / cache technology / non-iterative algorithms / parallel vector fragment processors / well known numerical and scientific algorithms / dense linear systems / search algorithm / cache-aware blocking algorithm / matrix-matrix multiplication algorithms / partial pivoting algorithm / decomposition algorithms / factorization algorithm / texture mapping hardware / sparse linear systems / graphics hardware / optimized linear algebra algorithms / matrixmatrix multiplication algorithms / handheld devices / auxiliary algorithm / achievable computing power / linear algebra algorithms / linear systems / co-processor / cache-oblivious algorithm / programmable fragment processors / stream processor / decomposition algorithm / /

MarketIndex

LINPACK / /

Organization

University of North Carolina / /

Person

Lin / Kim / Hiroshige Goto / Nico Galoppo Naga / /

Product

Samsung NV40 Digital Camera / GeForce 6800 GT / Ultra GPU / Ultra / GeForce 6800 Ultra / /

ProgrammingLanguage

MATLAB / php / /

ProvinceOrState

Washington / /

Technology

16 programmable fragment processors / blocked algorithm / well known numerical and scientific algorithms / partial pivoting algorithm / search algorithm / fragment processors / GPUbased algorithms / shared memory algorithms / fragment processor / 12 parallel fragment processors / 6 programmable vertex processors / GPGP algorithms / LAPACK algorithms / normal LU decomposition algorithm / parallel using fragment processors / optimized linear algebra algorithms / multi-pass algorithm / rightlooking algorithm / linear algebra algorithms / Blocked LU decomposition algorithms / 16 parallel fragment processors / technology of graphics processors / cache-oblivious algorithm / bus technologies / cache-aware algorithms / RAW chip / cache technology / matrixmatrix multiplication algorithms / LU decomposition algorithms / LU factorization algorithm / alpha / parallel dense linear algebra algorithms / stream processors / graphics processors / LU decomposition algorithm / 4 fragment processors / 24 parallel fragment processors / 1 Introduction Commodity graphics processors / 4.1 Algorithm / divide-and-conquer algorithm / parallel fragment processors / parallel algorithms / pass algorithm / dense linear systems using graphics processors / programmable fragment processors / SIMD stream processor / computational kernel The fragment processors / LU algorithm / cache-aware blocking algorithm / 16 parallel vector fragment processors / non-iterative algorithms / simulation / matrix-matrix multiplication algorithms / 3.3 Parallel Algorithms / /
