Publication date:
2012
Abstract:
We describe a parallel implementation of a compressible Lattice Boltzmann code on a multi-GPU cluster based on Nvidia Fermi processors. We analyze how to optimize the algorithm for GP-GPU architectures, describe the implementation choices we adopted, and compare our performance results with an implementation optimized for latest-generation multi-core CPUs. Our program runs at approximately 30% of the double-precision peak performance of one GPU and shows almost linear scaling when run on the multi-GPU cluster.
CRIS type:
04.01 Contribution in conference proceedings
Keywords:
Computational fluid dynamics; Lattice Boltzmann methods; GP-GPU computing; Performance
Author list:
Toschi, Federico; Scagliarini, Andrea
Book title:
PARALLEL PROCESSING AND APPLIED MATHEMATICS, PT I