This performance study includes the following parallel architectures (and software):
  IBM SP1 at ANL (MPI/F90)
  TMC CM5 at NCSA (CM-Fortran, MPI/F77)
  CRAY T3D at PSC (HPF subset)
  CRAY C90 at PSC (F90)
  SGI Power Challenge at NCSA (F90 subset)
Problem Size: nx = ny = nz = 32
SGI Power Challenge
no. processors   cpu time   speedup   ~ Mflops
             1      240.6       1.0       63.4
             2      118.9      2.02      128.3
             4       58.4      4.12      261.0
             8       31.9      7.54      477.8
            15       22.5     10.69      677.7
            16      *****   only 15 processors configured, due to a failed processor
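The speedup column in the table above can be reproduced from the cpu times. The following is a minimal sketch, assuming speedup is defined as the one-processor time divided by the p-processor time. Parallel efficiency (speedup divided by processor count) is not reported in the tables and is added here only for illustration; the function names are hypothetical.

```python
# Timings copied from the table above: processor count -> cpu time.
CPU_TIME = {1: 240.6, 2: 118.9, 4: 58.4, 8: 31.9, 15: 22.5}


def speedup(t1, tp):
    """Speedup relative to the one-processor run (assumed definition: T1 / Tp)."""
    return t1 / tp


def efficiency(t1, tp, p):
    """Parallel efficiency: speedup divided by processor count (not in the source tables)."""
    return speedup(t1, tp) / p


if __name__ == "__main__":
    t1 = CPU_TIME[1]
    for p, tp in sorted(CPU_TIME.items()):
        print(f"{p:3d}  {speedup(t1, tp):6.2f}  {efficiency(t1, tp, p):5.2f}")
```

Under this definition the 15-processor run has an efficiency of about 0.71.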
TMC CM5
no. processors   cpu time   speedup   ~ Gflops
            32       23.6         -        0.6
            64       14.8         -        1.0
           128        9.9         -        1.5
           256        7.7         -        2.0
           512        6.9         -        2.2
Cray T3D
no. processors   cpu time   speedup   ~ Gflops
             2      754.1         -       0.02
             4      382.0         -       0.04
             8      190.1         -       0.08
            16       98.9         -       0.15
            32       49.8         -       0.31
            64       25.9         -       0.59
           128       16.6         -       0.92
           256        7.8         -       1.96
           512       57.1         -       0.27
IBM SP1
no. processors   cpu time   speedup   ~ Gflops
            64       13.9         -       1.09
Problem Size: nx = ny = nz = 64
SGI Power Challenge
no. processors   cpu time   speedup   ~ Mflops
             1     2067.1      1.00       59.0
             2     1057.1      1.96      115.3
             4      537.3      3.85      227.0
             8      282.1      7.33      432.1
            15      175.2     11.80      695.8
            16      *****   only 15 processors configured, due to a failed processor
TMC CM5
no. processors   cpu time   speedup   ~ Gflops
            32      150.8         -        0.8
            64       79.2         -        1.5
           128       41.9         -        2.9
           256       24.1         -        5.0
           512       15.4         -        7.9
Cray T3D
no. processors   cpu time   speedup   ~ Gflops
             8     1520.5         -       0.08
            16      769.3         -       0.16
            32      389.3         -       0.31
            64      194.3         -       0.63
           128      100.3         -       1.20
           256       50.2         -       2.40
           512       26.6         -       4.60
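Where no one-processor time is listed, scaling can still be judged relative to the smallest run. The following is a rough sketch, assuming the 8-processor time is used as the baseline; the relative speedup and relative efficiency computed here are not quantities reported in the source, and the function names are hypothetical.

```python
# Cray T3D timings for nx = ny = nz = 64, copied from the table above:
# processor count -> cpu time.
T3D_64 = {8: 1520.5, 16: 769.3, 32: 389.3, 64: 194.3,
          128: 100.3, 256: 50.2, 512: 26.6}


def relative_speedup(times, base_p, p):
    """Speedup of the p-processor run relative to the base_p-processor run."""
    return times[base_p] / times[p]


def relative_efficiency(times, base_p, p):
    """Relative speedup divided by the ideal factor p / base_p."""
    return relative_speedup(times, base_p, p) / (p / base_p)


if __name__ == "__main__":
    for p in sorted(T3D_64):
        s = relative_speedup(T3D_64, 8, p)
        e = relative_efficiency(T3D_64, 8, p)
        print(f"{p:4d}  {s:6.2f}  {e:5.2f}")
```

Going from 8 to 512 processors (an ideal factor of 64), the time drops by a factor of about 57, a relative efficiency of roughly 0.89.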
IBM SP1
no. processors   cpu time   speedup   ~ Gflops
            64       68.6         -        1.8
Problem Size: nx = ny = nz = 128
SGI Power Challenge
no. processors   cpu time   speedup   ~ Mflops
            15     1370.3         -      711.6
TMC CM5
no. processors   cpu time   speedup   ~ Gflops
            64      574.1         -        1.7
           128      294.0         -        3.3
           256      154.0         -        6.3
           512       81.9         -       11.9
Cray T3D
no. processors   cpu time   speedup   ~ Gflops
            64     1536.0         -       0.63
           128      774.8         -       1.30
           256      391.1         -       2.50
           512      197.3         -       4.90
Peak Architecture Performance for Problem Size n = 32^3
Machine    Processors   CPU Time   Gflops
CRAY T3D          512        7.8      2.0
TMC CM5           512        6.9      2.2
CRAY C90           16       19.0      0.5
IBM SP1            64      13.94      1.1
SGI PC             15      22.49      0.7
Peak Architecture Performance for Problem Size n = 64^3
Machine    Processors   CPU Time   Gflops
CRAY T3D          512       26.6      4.6
TMC CM5           512       15.4      7.9
CRAY C90           16       23.1      5.3
IBM SP1            64       68.6      1.8
SGI PC             15      175.1      0.7
Peak Gflop Performance
Machine    Processors     n   Gflops
CRAY T3D          512   256      5.1
TMC CM5           512   128     11.9
CRAY C90           16   128      7.4
IBM SP1            64    64      1.8
SGI PC             15   128      0.7
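The peak figures above are aggregate rates, so the machines are easier to compare per processor. The following is a small sketch, assuming 1 Gflops = 1000 Mflops and dividing each peak rate evenly over the listed processor count; the per-processor figures are derived here and do not appear in the source.

```python
# Peak Gflop table copied from above: machine -> (processors, Gflops).
PEAK = {
    "CRAY T3D": (512, 5.1),
    "TMC CM5": (512, 11.9),
    "CRAY C90": (16, 7.4),
    "IBM SP1": (64, 1.8),
    "SGI PC": (15, 0.7),
}


def mflops_per_processor(processors, gflops):
    """Average rate per processor in Mflops (assumes 1 Gflops = 1000 Mflops)."""
    return gflops * 1000.0 / processors


if __name__ == "__main__":
    for machine, (procs, gflops) in PEAK.items():
        print(f"{machine:9s} {mflops_per_processor(procs, gflops):7.1f} Mflops/processor")
```

On these numbers the CRAY C90 delivers about 462 Mflops per processor, while the 512-processor machines deliver roughly 10 (CRAY T3D) and 23 (TMC CM5) Mflops per processor.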