Difference between revisions of "Performance"

From Gw-qcd-wiki

Revision as of 13:37, 14 July 2010

Carver

* Tester: Ben Gamari
* Test date: 14 Jul 2010
* Commit: e3e4ffafd158abd004c483694a27f4f6bc7d2185
* Hardware: 
* CUDA version 3.0
{|
! Kernel !! Configuration !! Bandwidth !! FLOPs
|-
| rowspan="2" | Dslash_cuda
| Dslash (24^4) || 73 GB/s || 32 GFLOP/s
|-
| hopping (24^4) || 74 GB/s || 34 GFLOP/s
|-
| rowspan="3" | Dslash_multi_gpu (double)
| 1 node, 24^4 Dslash || 79 GB/s || 35 GFLOP/s
|-
| 2 nodes, 24^4 Dslash || 145 GB/s || 64 GFLOP/s
|-
| 4 nodes, 24^4 Dslash || 256 GB/s || 114 GFLOP/s
|-
| rowspan="3" | Dslash_multi_gpu (double)
| 1 node, 24^4 Dslash || 79 GB/s || 76 GFLOP/s
|-
| 2 nodes, 24^4 Dslash || 156 GB/s || 140 GFLOP/s
|-
| 4 nodes, 24^4 Dslash || 283 GB/s || 252 GFLOP/s
|-
| rowspan="3" | Vector utilities
| Addition || 82 GB/s || 3.4 GFLOP/s
|-
| Dot product || 88 GB/s || N/A
|-
| Copy || 84 GB/s || N/A
|}
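
One way to read the multi-node rows is as scaling efficiency: the measured bandwidth divided by the single-node figure times the node count. The sketch below is only an illustration and rests on assumptions not stated on this page: that the multi-node bandwidths are aggregates over all nodes, and that the 1-node row is the right baseline. The names `multi_gpu_rows_1` and `multi_gpu_rows_2` are hypothetical labels for the first and second Dslash_multi_gpu row groups.

```python
# Bandwidth (GB/s) by node count, copied from the results table.
# Assumption: these are aggregate figures over all participating nodes.
# The dictionary keys are hypothetical labels, not names used by the benchmark.
BANDWIDTH_GBS = {
    "multi_gpu_rows_1": {1: 79, 2: 145, 4: 256},
    "multi_gpu_rows_2": {1: 79, 2: 156, 4: 283},
}


def scaling_efficiency(by_nodes):
    """Return {nodes: measured / (nodes * single-node value)}."""
    base = by_nodes[1]
    return {n: value / (n * base) for n, value in by_nodes.items()}


EFFICIENCY = {
    name: scaling_efficiency(rows) for name, rows in BANDWIDTH_GBS.items()
}

if __name__ == "__main__":
    for name, eff in EFFICIENCY.items():
        for nodes, fraction in sorted(eff.items()):
            print(f"{name}: {nodes} node(s) -> {fraction:.1%} of ideal")
```

A value of 1.0 would mean perfect linear scaling; values below 1.0 show how much bandwidth is lost as nodes are added, under the assumptions above.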