benchmarks

Clod Patry
Volunteer moderator
Volunteer developer
Volunteer tester

Joined: 2 May 11
Posts: 65
Credit: 242,754,987
RAC: 0
Message 180 - Posted: 9 May 2011, 12:24:52 UTC
Last modified: 10 May 2011, 22:10:08 UTC

I'm posting some outputs here for anyone curious about benchmarks.

Feel free to post your results too.

ATI 5970 on Windows 7, d.net 5.18 client:
[May 09 16:20:08 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 09 16:20:14 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:02.63 [1,660,901,015 keys/sec]
[May 09 16:20:14 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 09 16:20:21 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:04.66 [943,566,153 keys/sec]
[May 09 16:20:21 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 09 16:20:40 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:16.31 [245,040,371 keys/sec]
[May 09 16:20:40 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 09 16:20:45 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.66 [1,667,364,932 keys/sec]
[May 09 16:20:45 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #3 (IL 4-pipe cs-1)
[May 09 16:20:45 UTC] Core #3 is marginally faster than the default core.
Testing variability might lead to pick one or the other.


----

ATI 6950 on Windows 7, d.net 5.18 client:
[May 09 12:45:18 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 09 12:45:23 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:02.55 [1,706,678,259 keys/sec]
[May 09 12:45:23 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 09 12:45:30 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:04.49 [961,409,319 keys/sec]
[May 09 12:45:30 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 09 12:45:35 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:03.18 [1,378,464,000 keys/sec]
[May 09 12:45:35 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 09 12:45:41 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.74 [1,583,620,043 keys/sec]
[May 09 12:45:41 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #0 (IL 4-pipe c)


ATI 5870 on Linux Ubuntu 10.04.2 AMD64, d.net 5.18 client:

[May 10 22:11:02 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 10 22:11:07 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:02.18 [2,019,129,367 keys/sec]
[May 10 22:11:07 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 10 22:11:13 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:03.70 [1,173,478,453 keys/sec]
[May 10 22:11:13 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 10 22:11:20 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:04.25 [1,029,996,069 keys/sec]
[May 10 22:11:20 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 10 22:11:24 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.21 [1,977,099,981 keys/sec]
[May 10 22:11:24 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #0 (IL 4-pipe c)
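
For anyone who wants to compare pasted results without reading every line, here is a minimal Python sketch that pulls the keys/sec figure for each core out of benchmark output like the runs above. It assumes only the two-line layout visible in these posts (a "Benchmark for core #N" line followed by a line with "[... keys/sec]"); the function names `parse_benchmark` and `fastest_core` are made up for illustration and are not part of the d.net client.

```python
import re

# A "Benchmark for core #N" line announces the core; the next line that
# contains "[... keys/sec]" carries its measured rate.
CORE_RE = re.compile(r"Benchmark for core #(\d+)")
RATE_RE = re.compile(r"\[([\d,]+) keys/sec\]")


def parse_benchmark(log_text):
    """Return {core_number: keys_per_sec} from pasted -bench output."""
    rates = {}
    current_core = None
    for line in log_text.splitlines():
        core_match = CORE_RE.search(line)
        if core_match:
            current_core = int(core_match.group(1))
            continue
        rate_match = RATE_RE.search(line)
        if rate_match and current_core is not None:
            rates[current_core] = int(rate_match.group(1).replace(",", ""))
            current_core = None
    return rates


def fastest_core(rates):
    """Return the core number with the highest keys/sec."""
    return max(rates, key=rates.get)


# Benchmark lines copied from the Linux 5870 run above.
sample = """\
[May 10 22:11:07 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:02.18 [2,019,129,367 keys/sec]
[May 10 22:11:13 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:03.70 [1,173,478,453 keys/sec]
[May 10 22:11:20 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:04.25 [1,029,996,069 keys/sec]
[May 10 22:11:24 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.21 [1,977,099,981 keys/sec]
"""

rates = parse_benchmark(sample)
print(rates)
print("fastest core:", fastest_core(rates))
```

For this sample the sketch agrees with the client's own summary (core #0). It ignores the "using core" and summary lines, and a core whose result line is missing from a paste is simply left out.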

kashi
Joined: 5 May 11
Posts: 7
Credit: 13,680,807
RAC: 0
Message 227 - Posted: 11 May 2011, 1:43:36 UTC

"C:\Users\All Users\BOINC\projects\moowrap.net\dnetc518-win32-x86-stream.exe" -bench

HD 5870 @ 900/500, Win 7 64, CPU 75% load, 6 core VM Linux 64 DNA@Home:

[May 11 01:26:47 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 11 01:26:53 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:03.36 [1,286,084,096 keys/sec]
[May 11 01:26:53 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 11 01:26:59 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:03.65 [1,194,784,280 keys/sec]
[May 11 01:26:59 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 11 01:27:05 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:03.30 [1,314,060,341 keys/sec]
[May 11 01:27:05 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 11 01:27:09 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.18 [2,019,961,462 keys/sec]
[May 11 01:27:09 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #3 (IL 4-pipe cs-1)
[May 11 01:27:09 UTC] Core #3 is significantly faster than the default core.
The GPU core selection has been made as a tradeoff between core speed
and responsiveness of the graphical desktop.

HD 5870 @ 900/500, Win 7 64, CPU idle:

[May 11 01:31:20 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 11 01:31:25 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:03.30 [1,319,918,108 keys/sec]
[May 11 01:31:25 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 11 01:31:32 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:03.58 [1,212,948,686 keys/sec]
[May 11 01:31:32 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 11 01:31:37 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:02.63 [1,683,673,944 keys/sec]
[May 11 01:31:37 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 11 01:31:41 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.10 [2,102,272,000 keys/sec]
[May 11 01:31:41 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #3 (IL 4-pipe cs-1)
[May 11 01:31:41 UTC] Core #3 is significantly faster than the default core.
The GPU core selection has been made as a tradeoff between core speed
and responsiveness of the graphical desktop.

Sitarow
Joined: 3 May 11
Posts: 8
Credit: 73,794,244
RAC: 0
Message 231 - Posted: 11 May 2011, 17:04:13 UTC - in response to Message 227.  
Last modified: 11 May 2011, 17:14:49 UTC

"C:\Users\All Users\BOINC\projects\moowrap.net\dnetc518-win32-x86-stream.exe" -bench

Computer ID 405 CPU IDLE

http://www.moowrap.net/results.php?hostid=405&offset=0&show_names=0&state=3&appid=

Intel I7 920@2.66 HT enabled Windows 7 Enterprise 64
ATI 5850 1.4.1353 Drivers 11.4 Boinc Client 6.10.60

[May 11 17:00:50 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 11 17:00:58 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:05.80 [759,575,045 keys/sec]
[May 11 17:00:58 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 11 17:01:06 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:05.07 [872,034,219 keys/sec]
[May 11 17:01:06 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 11 17:01:12 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:04.01 [1,107,936,912 keys/sec]
[May 11 17:01:12 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 11 17:01:18 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.93 [1,514,639,503 keys/sec]
[May 11 17:01:18 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #3 (IL 4-pipe cs-1)
[May 11 17:01:18 UTC] Core #3 is significantly faster than the default core.
The GPU core selection has been made as a tradeoff be ...
and responsiveness of the graphical desktop.


I speculate that this may be a primary cause of the low points granted. The BOINC-side client does its benchmark on core 0. After I set the core to 3 via the project settings, the task completes faster, and the project therefore awards points as if less work had been done.
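
To put a number on the gap described here, a small Python sketch using the core 0 and core 3 figures from the run above. It only illustrates the speed ratio; how the project actually computes credit is not stated in this thread, so nothing here should be read as the credit formula.

```python
def speedup(default_rate, chosen_rate):
    """How many times faster the chosen core is than the default core."""
    return chosen_rate / default_rate


# keys/sec from the ATI 5850 run above (HT enabled)
core0 = 759_575_045
core3 = 1_514_639_503

ratio = speedup(core0, core3)
print(f"core 3 runs at {ratio:.2f}x the speed of core 0")
print(f"the same work takes about {1 / ratio:.0%} of the time")
```

With these figures core 3 is close to twice as fast, so a task sized from the core 0 benchmark would finish in roughly half the expected time.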

[AF>EDLS] Polynesia
Joined: 1 May 11
Posts: 23
Credit: 1,574,433
RAC: 0
Message 232 - Posted: 11 May 2011, 18:37:42 UTC
Last modified: 11 May 2011, 18:39:38 UTC

How do I run this benchmark?

I don't have this file: dnetc518-win32-x86-stream.exe

Config: i7 860 2.8 GHz, 8 GB RAM, BOINC 6.12.26, GPU: GTX 470 Zotac AMP Edition 1280 MB DDR5

cedricdd
Joined: 2 May 11
Posts: 3
Credit: 256,516,099
RAC: 0
Message 233 - Posted: 11 May 2011, 19:45:13 UTC - in response to Message 232.  

It's "dnetc518-win32-x86-cuda31.exe" for you.

Clod Patry
Volunteer moderator
Volunteer developer
Volunteer tester

Joined: 2 May 11
Posts: 65
Credit: 242,754,987
RAC: 0
Message 234 - Posted: 11 May 2011, 19:51:44 UTC - in response to Message 231.  
Last modified: 11 May 2011, 19:52:01 UTC

Sitarow,
Exactly. As you can see, core #3 is MUCH faster than core #0.
Sadly, this is a bug in the d.net client, not in the wrapper.

Sitarow
Joined: 3 May 11
Posts: 8
Credit: 73,794,244
RAC: 0
Message 235 - Posted: 11 May 2011, 20:02:12 UTC - in response to Message 234.  

Thanks.

The next question would be: how can I set the performance for the system based on core 3?

[AF>EDLS] Polynesia
Joined: 1 May 11
Posts: 23
Credit: 1,574,433
RAC: 0
Message 237 - Posted: 11 May 2011, 20:25:52 UTC - in response to Message 233.  

How do I run this benchmark? I did not understand.

Sitarow
Joined: 3 May 11
Posts: 8
Credit: 73,794,244
RAC: 0
Message 238 - Posted: 11 May 2011, 20:52:06 UTC - in response to Message 237.  
Last modified: 11 May 2011, 21:29:57 UTC

If you're running Windows 7 or Vista, do this:

Hold down the Windows key and press R. You should get a popup window.

Now paste the following line, including the quotation marks:

"C:\Users\All Users\BOINC\projects\moowrap.net\dnetc518-win32-x86-cuda31.exe" -bench

The file you're running is located in the following default directory:

C:\Users\All Users\BOINC\projects\moowrap.net
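
If you would rather script this than use the Run dialog, the same command line can be built in Python. This is only a sketch: the directory and the -bench flag are the ones given above, the helper names `bench_command` and `run_benchmark` are made up for illustration, and it assumes the client writes its benchmark report to standard output, which this thread does not confirm.

```python
import subprocess
from pathlib import PureWindowsPath

# Default project directory quoted above; adjust it if BOINC lives elsewhere.
PROJECT_DIR = PureWindowsPath(r"C:\Users\All Users\BOINC\projects\moowrap.net")


def bench_command(exe_name, project_dir=PROJECT_DIR):
    """Build the argument list for running the client's benchmark."""
    return [str(project_dir / exe_name), "-bench"]


def run_benchmark(exe_name, project_dir=PROJECT_DIR):
    """Run the benchmark and return whatever the client printed.

    Assumption: the report goes to standard output.
    """
    result = subprocess.run(
        bench_command(exe_name, project_dir),
        capture_output=True,
        text=True,
    )
    return result.stdout


# Only print the command here; call run_benchmark(...) on the machine
# that actually has the client installed.
print(bench_command("dnetc518-win32-x86-cuda31.exe"))
```

Passing the command as a list means the space in "All Users" needs no extra quoting, which is what the quotation marks take care of in the Run dialog.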



Something similar to this should come up in the window:

Computer ID 405 CPU IDLE

http://www.moowrap.net/results.php?hostid=405&offset=0&show_names=0&state=3&appid=

Intel I7 920@2.66 HT Disabled Windows 7 Enterprise 64
ATI 5850 1.4.1353 Drivers 11.4 Boinc Client 6.10.60



[May 11 20:45:39 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 11 20:45:54 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:11.46 [381,036,501 keys/sec]
[May 11 20:45:54 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 11 20:46:01 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:04.77 [926,568,859 keys/sec]
[May 11 20:46:01 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 11 20:46:07 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:03.51 [1,296,465,069 keys/sec]
[May 11 20:46:07 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 11 20:46:12 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.77 [1,617,546,052 keys/sec]
[May 11 20:46:13 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #3 (IL 4-pipe cs-1)
[May 11 20:46:13 UTC] Core #3 is significantly faster than the default core.
The GPU core selection has been made as a tradeoff be ...
and responsiveness of the graphical desktop.


Disabling HT on the CPU did not help core 0's speed; however, it did help core 3's speed.
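
The per-core difference between the two runs on this machine can be worked out with a short Python sketch. The keys/sec figures are copied from the HT-enabled and HT-disabled outputs posted in this thread; `percent_change` is a hypothetical helper name. Each figure comes from a single run, so some of the difference may be ordinary run-to-run variation.

```python
# keys/sec per core, copied from the two ATI 5850 runs in this thread
ht_enabled = {0: 759_575_045, 1: 872_034_219, 2: 1_107_936_912, 3: 1_514_639_503}
ht_disabled = {0: 381_036_501, 1: 926_568_859, 2: 1_296_465_069, 3: 1_617_546_052}


def percent_change(before, after):
    """Percent change per core going from `before` to `after`."""
    return {
        core: (after[core] - before[core]) / before[core] * 100.0
        for core in before
        if core in after
    }


changes = percent_change(ht_enabled, ht_disabled)
for core in sorted(changes):
    print(f"core #{core}: {changes[core]:+.1f}%")
```

On these numbers core 0 drops by about half, while cores 1 to 3 all come out faster with HT disabled.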

[AF>EDLS] Polynesia
Joined: 1 May 11
Posts: 23
Credit: 1,574,433
RAC: 0
Message 239 - Posted: 11 May 2011, 22:39:16 UTC
Last modified: 11 May 2011, 22:41:32 UTC

Here are my results (what do the 64, 128 and 256 mean, and what are the last two cores?):

dnetc v2.9109-518-GTR-10092921 for CUDA 3.1 on Win32 (WindowsNT 6.1).
Please provide the *entire* version descriptor when submitting bug reports.
The distributed.net bug report pages are at http://bugs.distributed.net/

[May 11 22:35:07 UTC] nvcuda.dll Version: 8.17.12.7061
[May 11 22:35:07 UTC] RC5-72: using core #0 (CUDA 1-pipe 64-thd).
[May 11 22:35:18 UTC] RC5-72: Benchmark for core #0 (CUDA 1-pipe 64-thd)
0.00:00:08.34 [520,611,087 keys/sec]
[May 11 22:35:18 UTC] RC5-72: using core #1 (CUDA 1-pipe 128-thd).
[May 11 22:35:36 UTC] RC5-72: Benchmark for core #1 (CUDA 1-pipe 128-thd)
0.00:00:15.47 [286,025,348 keys/sec]
[May 11 22:35:36 UTC] RC5-72: using core #2 (CUDA 1-pipe 256-thd).
[May 11 22:35:54 UTC] RC5-72: Benchmark for core #2 (CUDA 1-pipe 256-thd)
0.00:00:15.64 [282,044,236 keys/sec]
[May 11 22:35:54 UTC] RC5-72: using core #3 (CUDA 2-pipe 64-thd).
[May 11 22:36:12 UTC] RC5-72: Benchmark for core #3 (CUDA 2-pipe 64-thd)
0.00:00:15.42 [285,615,619 keys/sec]
[May 11 22:36:12 UTC] RC5-72: using core #4 (CUDA 2-pipe 128-thd).
[May 11 22:36:29 UTC] RC5-72: Benchmark for core #4 (CUDA 2-pipe 128-thd)
0.00:00:15.41 [286,203,344 keys/sec]
[May 11 22:37:06 UTC] RC5-72: Benchmark for core #6 (CUDA 4-pipe 64-thd)
0.00:00:15.69 [277,522,484 keys/sec]
[May 11 22:37:06 UTC] RC5-72: using core #7 (CUDA 4-pipe 128-thd).
[May 11 22:37:22 UTC] RC5-72: Benchmark for core #7 (CUDA 4-pipe 128-thd)
0.00:00:13.47 [327,522,461 keys/sec]
[May 11 22:37:22 UTC] RC5-72: using core #8 (CUDA 4-pipe 256-thd).
[May 11 22:37:41 UTC] RC5-72: Benchmark for core #8 (CUDA 4-pipe 256-thd)
0.00:00:15.36 [298,577,264 keys/sec]
[May 11 22:37:41 UTC] RC5-72: using core #9 (CUDA 1-pipe 64-thd busy wait).
[May 11 22:37:51 UTC] RC5-72: Benchmark for core #9 (CUDA 1-pipe 64-thd bus ...
0.00:00:08.31 [521,166,124 keys/sec]
[May 11 22:37:51 UTC] RC5-72: using core #10 (CUDA 1-pipe 64-thd sleep 100us).
[May 11 22:38:09 UTC] RC5-72: Benchmark for core #10 (CUDA 1-pipe 64-thd sl ...
0.00:00:15.45 [279,218,148 keys/sec]
[May 11 22:38:09 UTC] RC5-72: using core #11 (CUDA 1-pipe 64-thd sleep dyna ...
[May 11 22:38:20 UTC] RC5-72: Benchmark for core #11 (CUDA 1-pipe 64-thd sl ...
0.00:00:08.31 [523,504,575 keys/sec]
[May 11 22:38:20 UTC] RC5-72 benchmark summary :
Default core : #0 (CUDA 1-pipe 64-thd)
Fastest core : #11 (CUDA 1-pipe 64-thd sleep dynamic)
[May 11 22:38:20 UTC] Core #11 is marginally faster than the default core.
Testing variability might lead to pick one or the other.

Config: i7 860 2.8 GHz, 8 GB RAM, BOINC 6.12.26, GPU: GTX 470 Zotac AMP Edition 1280 MB DDR5

Teemu Mannermaa
Project administrator
Project developer
Project tester

Joined: 20 Apr 11
Posts: 388
Credit: 822,356,221
RAC: 0
Message 292 - Posted: 13 May 2011, 9:00:28 UTC

Hi,

Here are my benchmarks for reference.

Win7 64-bit w/ATI 5870 (actually one half of an ATI 5970):
dnetc v2.9109-518-GTR-10092921 for ATI Stream on Win32 (WindowsNT 6.1).

[May 12 14:40:46 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 12 14:40:51 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
                      0.00:00:02.70 [1,633,389,548 keys/sec]
[May 12 14:40:51 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 12 14:41:03 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
                      0.00:00:06.51 [945,057,548 keys/sec]
[May 12 14:41:03 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 12 14:41:23 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
                      0.00:00:17.15 [248,470,365 keys/sec]
[May 12 14:41:23 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 12 14:41:28 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
                      0.00:00:03.00 [1,468,730,893 keys/sec]
[May 12 14:41:28 UTC] RC5-72 benchmark summary :
                      Default core : #0 (IL 4-pipe c)
                      Fastest core : #0 (IL 4-pipe c)


Interesting that this time it was core 0 that was faster, even though core 3 came close. It might be due to my recent ATI driver (11.3), which seems to be better at core 0 than core 3 (like it seems to be for older driver versions).

Win7 64-bit w/nVidia GTX285:
dnetc v2.9109-518-GTR-10092921 for CUDA 3.1 on Win32 (WindowsNT 6.1).

[May 13 08:47:54 UTC] nvcuda.dll Version: 8.17.12.6099
[May 13 08:47:54 UTC] RC5-72: using core #0 (CUDA 1-pipe 64-thd).
[May 13 08:48:10 UTC] RC5-72: Benchmark for core #0 (CUDA 1-pipe 64-thd)
                      0.00:00:14.30 [306,603,420 keys/sec]
[May 13 08:48:10 UTC] RC5-72: using core #1 (CUDA 1-pipe 128-thd).
[May 13 08:48:29 UTC] RC5-72: Benchmark for core #1 (CUDA 1-pipe 128-thd)
                      0.00:00:16.30 [269,062,363 keys/sec]
[May 13 08:48:29 UTC] RC5-72: using core #2 (CUDA 1-pipe 256-thd).
[May 13 08:48:49 UTC] RC5-72: Benchmark for core #2 (CUDA 1-pipe 256-thd)
                      0.00:00:17.30 [182,221,893 keys/sec]
[May 13 08:48:49 UTC] RC5-72: using core #3 (CUDA 2-pipe 64-thd).
[May 13 08:49:08 UTC] RC5-72: Benchmark for core #3 (CUDA 2-pipe 64-thd)
                      0.00:00:16.30 [269,518,510 keys/sec]
[May 13 08:49:08 UTC] RC5-72: using core #4 (CUDA 2-pipe 128-thd).
[May 13 08:49:28 UTC] RC5-72: Benchmark for core #4 (CUDA 2-pipe 128-thd)
                      0.00:00:17.30 [182,068,249 keys/sec]
[May 13 08:49:28 UTC] RC5-72: using core #5 (CUDA 2-pipe 256-thd).
[May 13 08:49:48 UTC] RC5-72: Benchmark for core #5 (CUDA 2-pipe 256-thd)
                      0.00:00:17.25 [174,901,896 keys/sec]
[May 13 08:49:48 UTC] RC5-72: using core #6 (CUDA 4-pipe 64-thd).
[May 13 08:50:08 UTC] RC5-72: Benchmark for core #6 (CUDA 4-pipe 64-thd)
                      0.00:00:17.28 [184,558,071 keys/sec]
[May 13 08:50:08 UTC] RC5-72: using core #7 (CUDA 4-pipe 128-thd).
[May 13 08:50:29 UTC] RC5-72: Benchmark for core #7 (CUDA 4-pipe 128-thd)
                      0.00:00:17.25 [174,901,896 keys/sec]
[May 13 08:50:29 UTC] RC5-72: using core #8 (CUDA 4-pipe 256-thd).
[May 13 08:50:49 UTC] RC5-72: Benchmark for core #8 (CUDA 4-pipe 256-thd)
                      0.00:00:16.91 [178,304,613 keys/sec]
[May 13 08:50:49 UTC] RC5-72: using core #9 (CUDA 1-pipe 64-thd busy wait).
[May 13 08:51:07 UTC] RC5-72: Benchmark for core #9 (CUDA 1-pipe 64-thd busy wait)
                      0.00:00:14.28 [310,016,813 keys/sec]
[May 13 08:51:07 UTC] RC5-72: using core #10 (CUDA 1-pipe 64-thd sleep 100us).
[May 13 08:51:26 UTC] RC5-72: Benchmark for core #10 (CUDA 1-pipe 64-thd sleep 100us)
                      0.00:00:16.58 [264,484,526 keys/sec]
[May 13 08:51:26 UTC] RC5-72: using core #11 (CUDA 1-pipe 64-thd sleep dynamic).
[May 13 08:51:44 UTC] RC5-72: Benchmark for core #11 (CUDA 1-pipe 64-thd sleep dynamic)
                      0.00:00:16.30 [269,310,985 keys/sec]
[May 13 08:51:44 UTC] RC5-72 benchmark summary :
                      Default core : #0 (CUDA 1-pipe 64-thd)
                      Fastest core : #9 (CUDA 1-pipe 64-thd busy wait)
[May 13 08:51:44 UTC] Core #9 is marginally faster than the default core.
                      Testing variability might lead to pick one or the other.


Looks like core 9 would be best, but I remember that it caused CPU usage to rise, so I've been using core 10 recently. Core 11 should be an even better choice, though. I know the default core 0 is what causes high CPU usage.
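
The trade-off described here (take the fastest core, but only among those that do not drive up CPU usage) can be sketched in a few lines of Python. The rates are the GTX285 figures above for the cores mentioned; treating cores 10 and 11 as the acceptable set is an assumption for illustration, and `fastest_among` is a made-up helper name.

```python
# keys/sec from the GTX285 run above (only the cores discussed here)
rates = {0: 306_603_420, 9: 310_016_813, 10: 264_484_526, 11: 269_310_985}


def fastest_among(rates, allowed):
    """Return the fastest core number, considering only cores in `allowed`."""
    candidates = {core: rate for core, rate in rates.items() if core in allowed}
    if not candidates:
        raise ValueError("no benchmarked core is in the allowed set")
    return max(candidates, key=candidates.get)


# Assumption for illustration: only the two "sleep" cores are acceptable.
sleep_cores = {10, 11}

pick = fastest_among(rates, sleep_cores)
cost = 1 - rates[pick] / max(rates.values())
print(f"pick core #{pick}, giving up {cost:.1%} versus the fastest core")
```

With these figures the pick is core 11, at a cost of roughly 13% of the keys/sec that core 9 reached.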

Gonna try switching my core selections and see what happens. :)

-w

aad
Joined: 6 May 11
Posts: 15
Credit: 692,725,672
RAC: 27
Message 296 - Posted: 13 May 2011, 15:38:00 UTC
Last modified: 13 May 2011, 15:43:07 UTC

Finally got a benchmark done! (BOINC was not in the default directory!)
HD 6970 on AMD 1090T, Win 7 64


dnetc v2.9109-518-GTR-10092921 for ATI Stream on Win32 (WindowsNT 6.1).

[May 13 15:32:26 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 13 15:32:30 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:02.24 [2,008,184,378 keys/sec]
[May 13 15:32:30 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 13 15:32:37 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:03.97 [1,113,817,347 keys/sec]
[May 13 15:32:37 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 13 15:32:42 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:03.29 [1,341,547,886 keys/sec]
[May 13 15:32:42 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 13 15:32:48 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:02.73 [1,629,170,450 keys/sec]
[May 13 15:32:48 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #0 (IL 4-pipe c)

And while BOINC was still active:

[May 13 15:40:47 UTC] RC5-72: using core #0 (IL 4-pipe c).
[May 13 15:41:04 UTC] RC5-72: Benchmark for core #0 (IL 4-pipe c)
0.00:00:14.08 [304,135,429 keys/sec]
[May 13 15:41:04 UTC] RC5-72: using core #1 (IL 4-pipe c alt).
[May 13 15:41:26 UTC] RC5-72: Benchmark for core #1 (IL 4-pipe c alt)
0.00:00:19.00 [187,338,963 keys/sec]
[May 13 15:41:26 UTC] RC5-72: using core #2 (IL 4-pipe 2 threads).
[May 13 15:41:32 UTC] RC5-72: Benchmark for core #2 (IL 4-pipe 2 threads)
0.00:00:03.72 [1,197,281,781 keys/sec]
[May 13 15:41:32 UTC] RC5-72: using core #3 (IL 4-pipe cs-1).
[May 13 15:41:52 UTC] RC5-72: Benchmark for core #3 (IL 4-pipe cs-1)
0.00:00:17.08 [97,016,339 keys/sec]
[May 13 15:41:52 UTC] RC5-72 benchmark summary :
Default core : #0 (IL 4-pipe c)
Fastest core : #2 (IL 4-pipe 2 threads)
[May 13 15:41:52 UTC] Core #2 is significantly faster than the default core.
The GPU core selection has been made as a tradeoff be ...
and responsiveness of the graphical desktop.
Please file a bug report along with the output of -cp ...
only if the the faster core selection does not degrad ...

Microcruncher*
Joined: 8 May 11
Posts: 11
Credit: 1,075,941
RAC: 0
Message 364 - Posted: 18 May 2011, 15:29:03 UTC
Last modified: 18 May 2011, 15:36:37 UTC

I've posted the Windows Vista SP2 values (dubious results, to say the least) for my GTX 470 @ 607 MHz stock and at 700 MHz in another thread. The Linux results are less "jumpy" and the differences are within a range of a few percent:

Linux x86_64 / GTX 470 / Driver version: 270.41.06 / Stock clocks:

distributed.net client for CUDA 3.1 on Linux Copyright 1997-2009, distributed.net
Please visit http://www.distributed.net/ for up-to-date contest information.
Start the client with '-help' for a list of valid command line options.


dnetc v2.9108-517-CTR-10070313 for CUDA 3.1 on Linux (Linux 2.6.32-5-amd64).
Please provide the *entire* version descriptor when submitting bug reports.
The distributed.net bug report pages are at http://bugs.distributed.net/

[May 18 14:04:31 UTC] RC5-72: using core #0 (CUDA 1-pipe 64-thd).
[May 18 14:04:42 UTC] RC5-72: Benchmark for core #0 (CUDA 1-pipe 64-thd)                                                           
                      0.00:00:08.79 [491,117,547 keys/sec]
[May 18 14:04:42 UTC] RC5-72: using core #1 (CUDA 1-pipe 128-thd).
[May 18 14:04:52 UTC] RC5-72: Benchmark for core #1 (CUDA 1-pipe 128-thd)                                                          
                      0.00:00:08.51 [507,606,843 keys/sec]
[May 18 14:04:52 UTC] RC5-72: using core #2 (CUDA 1-pipe 256-thd).
[May 18 14:05:03 UTC] RC5-72: Benchmark for core #2 (CUDA 1-pipe 256-thd)                                                          
                      0.00:00:08.59 [502,783,808 keys/sec]
[May 18 14:05:03 UTC] RC5-72: using core #3 (CUDA 2-pipe 64-thd).
[May 18 14:05:14 UTC] RC5-72: Benchmark for core #3 (CUDA 2-pipe 64-thd)                                                           
                      0.00:00:08.65 [498,891,716 keys/sec]
[May 18 14:05:14 UTC] RC5-72: using core #4 (CUDA 2-pipe 128-thd).
[May 18 14:05:25 UTC] RC5-72: Benchmark for core #4 (CUDA 2-pipe 128-thd)                                                          
                      0.00:00:08.40 [514,268,438 keys/sec]
[May 18 14:05:25 UTC] RC5-72: using core #5 (CUDA 2-pipe 256-thd).
[May 18 14:05:35 UTC] RC5-72: Benchmark for core #5 (CUDA 2-pipe 256-thd)                                                          
                      0.00:00:08.48 [509,478,872 keys/sec]
[May 18 14:05:35 UTC] RC5-72: using core #6 (CUDA 4-pipe 64-thd).
[May 18 14:05:46 UTC] RC5-72: Benchmark for core #6 (CUDA 4-pipe 64-thd)                                                           
                      0.00:00:08.59 [502,289,441 keys/sec]
[May 18 14:05:46 UTC] RC5-72: using core #7 (CUDA 4-pipe 128-thd).
[May 18 14:05:56 UTC] RC5-72: Benchmark for core #7 (CUDA 4-pipe 128-thd)                                                          
                      0.00:00:08.36 [517,607,292 keys/sec]
[May 18 14:05:56 UTC] RC5-72: using core #8 (CUDA 4-pipe 256-thd).
[May 18 14:06:07 UTC] RC5-72: Benchmark for core #8 (CUDA 4-pipe 256-thd)                                                          
                      0.00:00:08.43 [512,468,703 keys/sec]
[May 18 14:06:07 UTC] RC5-72: using core #9 (CUDA 1-pipe 64-thd busy wait).
[May 18 14:06:19 UTC] RC5-72: Benchmark for core #9 (CUDA 1-pipe 64-thd busy wait)                                                 
                      0.00:00:08.78 [491,695,519 keys/sec]
[May 18 14:06:19 UTC] RC5-72: using core #10 (CUDA 1-pipe 64-thd sleep 100us).
[May 18 14:06:30 UTC] RC5-72: Benchmark for core #10 (CUDA 1-pipe 64-thd sleep 100us)                                              
                      0.00:00:08.82 [489,091,617 keys/sec]
[May 18 14:06:30 UTC] RC5-72: using core #11 (CUDA 1-pipe 64-thd sleep dynamic).
[May 18 14:06:41 UTC] RC5-72: Benchmark for core #11 (CUDA 1-pipe 64-thd sleep dynamic)                                            
                      0.00:00:08.78 [491,651,933 keys/sec]
[May 18 14:06:41 UTC] RC5-72 benchmark summary :
                      Default core : #0 (CUDA 1-pipe 64-thd)
                      Fastest core : #7 (CUDA 4-pipe 128-thd)
[May 18 14:06:41 UTC] Core #7 is significantly faster than the default core.
                      The GPU core selection has been made as a tradeoff between core speed
                      and responsiveness of the graphical desktop.
                      Please file a bug report along with the output of -cpuinfo
                      only if the the faster core selection does not degrade graphics performance.

The GTX 470 is barely able to keep up with a tiny HD 4770. Bit fiddling is the strong point of ATI cards.
Copyright © 2011-2024 Moo! Wrapper Project