Understanding Application Scaling: NAS Parallel Benchmarks on the NOW and SGI Origin 2000

Frederick Wong


We present a study of the architectural requirements and scalability of the NAS Parallel Benchmarks. We find that the local processor and memory systems, rather than the communication architecture, dominate both the overall performance and the scalability of the benchmarks. As the machine scales, computational efficiency improves because the global cache size increases; this improvement often offsets the rising costs of communication and synchronization, yielding linear or even super-linear speedups. We also find that communication performance is often much lower than micro-benchmark results would suggest.
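To make the terms "linear" and "super-linear" speedup concrete, here is a minimal sketch using the standard definitions of speedup (single-processor time divided by P-processor time) and parallel efficiency (speedup divided by P). The function names and the timings in the usage lines are hypothetical illustrations and are not taken from the study.

```python
def speedup(t1, tp):
    """Speedup of a parallel run: single-processor time over P-processor time."""
    return t1 / tp


def efficiency(t1, tp, p):
    """Parallel efficiency: speedup divided by the processor count P."""
    return speedup(t1, tp) / p


def classify(t1, tp, p, tol=1e-9):
    """Label a run as super-linear, linear, or sub-linear relative to P."""
    s = speedup(t1, tp)
    if s > p + tol:
        return "super-linear"
    if abs(s - p) <= tol:
        return "linear"
    return "sub-linear"


# Hypothetical timings (seconds), chosen only for illustration.
t1 = 100.0  # assumed single-processor run time
tp = 20.0   # assumed run time on 4 processors
print(speedup(t1, tp), efficiency(t1, tp, 4), classify(t1, tp, 4))
```

An efficiency above 1.0 is the signature of super-linear speedup: each processor does its share of the work faster than the single processor did, which is consistent with the abstract's explanation that a larger global cache improves computational efficiency.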


For more information, see this paper.

Presentation: PowerPoint slides, postscript gzipped, postscript 2up gzipped.
