Reliability engineering / Systems engineering / Fault-tolerant computer systems / Engineering statistics / Materials science / Failure rate / Weibull distribution / Exponential distribution / Computer cluster / Survival analysis / Statistics / Failure


Modeling and Tolerating Heterogeneous Failures in Large Parallel Systems
Eric Heien, Derrick Kondo

Document Date: 2011-05-31 04:15:11


File Size: 1,05 MB

City

Bucharest / INRIA / Monte Carlo

Company

Checkpoint / GPU

Event

Product Issues

Facility

National Center / Checkpoint Variable Definitions / University Politehnica

IndustryTerm

parallel computing jobs / bioinformatic inspired algorithm / storage devices / scientific applications / large-scale systems / large-scale parallel computing systems / parallel computing / large parallel systems / networks / high-performance computing systems / large-scale parallel systems / model hardware failures involving processors / applicable software / faster processors / software updates / fault-tolerant algorithms / to/from computing / desktop computing systems / real-world applications / hierarchical log analysis tool / parallel applications / scratch devices / comparable tools / purpose processors / bank / scratch storage devices / node failure law / automatic identification algorithm / bag-of-task applications / manufacturing imperfections

Organization

National Center for Supercomputing Applications / FJ MJ

Person

Franck Cappello / Job Failure / Job Failures / Dan LaPine / Bill Kramer

Position

system administrator / cluster system administrator / author / representative

Product

Hierarchical Event Log Organizer / Weibull / memory module / factor / drive / component generating hundreds

Technology

model hardware failures involving processors / FPGA / two Itanium processors / bioinformatic inspired algorithm / RAM / errors affecting processor / fault-tolerant algorithms / automatic identification algorithm / fluid dynamics / network file system / RAID / Ethernet / gene expression / purpose processors / SAN / Gigabit / simulation / SCSI / Gigabit Ethernet

SocialTag