Publication date:
2001
Abstract:
This paper describes research on exploiting loop-level parallelism on a simultaneous multithreading (SMT) processor. We discuss general and ad hoc loop-parallelization techniques that proved effective with SMT, and how they were tuned for it. These techniques were tested on the well-known Livermore loops, chosen for their variety of behaviors. The set of optimizations produced a significant overall improvement: average IPC rose from 2.72 to 3.97, and we obtained an average speedup of 1.39 over optimized single-thread code, using up to eight threads. We also describe a simple but effective method for determining the best number of threads to use for parallel loops on a multithreaded processor. The model uses compile-time information to predict the most efficient thread count.
CRIS type:
04.03 Poster in conference proceedings
Keywords:
Simultaneous multithreading; Loop-parallelization; Compiling; Processor architectures
Author list:
Puppin, Diego
Link to full record: