Client core values
Author | Message |
---|---|
Joined: 2 May 11 Posts: 47 Credit: 319,540,306 RAC: 99 |
Can someone please define what each of the values means? From reading the various threads, it looks like the range is -1 through 10, integer values only. Is this correct? I know -1 means auto-select, but what do all the rest of the values mean? Can someone please post the definitions? -1 = auto-select, 0 = ?, 1 = ?, 2 = ?, 3 = ?, 4 = ?, 5 = ?, 6 = ?, 7 = ?, 8 = ?, 9 = ?, 10 = ? Reno, NV Team SETI.USA |
Joined: 2 May 11 Posts: 65 Credit: 242,754,987 RAC: 0 |
The different cores are simply different algorithms the distributed.net client can use to get the most computation out of your current hardware, i.e. to crunch more keys per second. If you look at a task, you'll see how fast it was completed; the higher the speed, the better, because the work unit takes less time to finish. Example: http://moowrap.net/result.php?resultid=75750 shows: [May 10 08:49:57 UTC] RC5-72: Summary: 7 packets (448.00 stats units) 0.00:11:09.51 - [2,873.96 Mkeys/s] which means that task ran at 2,873.96 Mkeys/s, i.e. roughly 2.87 billion keys per second. By default, leaving it at -1 should already be close to the maximum, but some users have noticed that changing it to another value improved their throughput. The names themselves aren't really important. |
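In case the units are confusing, here is a rough, unofficial sketch (plain Python, nothing to do with the wrapper or the d.net client themselves) that pulls the Mkeys/s figure out of a summary line like the one above and converts it to plain keys per second; the regex and variable names are just for illustration:

```python
# Hypothetical helper: extract the "Mkeys/s" figure from a task summary line
# and convert it to keys per second (1 Mkey/s = one million keys per second).
import re

line = ("[May 10 08:49:57 UTC] RC5-72: Summary: 7 packets (448.00 stats units) "
        "0.00:11:09.51 - [2,873.96 Mkeys/s]")

match = re.search(r"\[([\d,.]+) Mkeys/s\]", line)
if match:
    mkeys_per_sec = float(match.group(1).replace(",", ""))
    keys_per_sec = mkeys_per_sec * 1_000_000
    print(f"{mkeys_per_sec} Mkeys/s ~= {keys_per_sec:,.0f} keys/sec")
    # -> 2873.96 Mkeys/s ~= 2,873,960,000 keys/sec
```

Nothing here is required for crunching; it just makes the speed numbers easier to compare between cores.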
Joined: 2 May 11 Posts: 47 Credit: 319,540,306 RAC: 99 |
Thanks for the answer, but I am not sure what to do with it: -1 should be best, but maybe not. Yes, I can look at the performance of a particular task, but that does not tell me how "good" that performance is, or how it would have compared to the exact same task run with a different value. How do I know this? Reno, NV Team SETI.USA |
Joined: 2 May 11 Posts: 65 Credit: 242,754,987 RAC: 0 |
By looking at the task name, you'll notice something like dnetc_r72_1304995456_8_477_0. The interesting part is _477_: it means the work unit counts for 477 stats units on the distributed.net stats (normally it's around 448). If you change the core, you'll see a different speed between two such tasks. "Good" simply means faster: given two tasks worth 477 stats units each, the faster core will complete them in less time. It would be great if the BOINC benchmark (in the BOINC manager) also benchmarked the installed GPUs. |
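If you want to compare runs from different cores without eyeballing it, here is a small sketch (assuming the task-name format shown above; the second task name and both run times are made up purely for illustration) that pulls the stats-unit count out of a task name and rates each finished task in stats units per hour, so tasks of slightly different sizes (448 vs. 477 units) compare fairly:

```python
# Sketch only: rate finished tasks by stats units per hour.
def stats_units(task_name: str) -> float:
    # dnetc_r72_1304995456_8_477_0 -> ["dnetc", "r72", "1304995456", "8", "477", "0"]
    return float(task_name.split("_")[4])

def units_per_hour(task_name: str, run_seconds: float) -> float:
    return stats_units(task_name) / (run_seconds / 3600.0)

# Run times come from the task pages; these names and numbers are hypothetical.
print(units_per_hour("dnetc_r72_1304995456_8_477_0", run_seconds=700.0))  # core A
print(units_per_hour("dnetc_r72_1304995461_3_448_0", run_seconds=690.0))  # core B
```

Whichever core gives the higher number is the one that pays off.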
Joined: 2 May 11 Posts: 47 Credit: 319,540,306 RAC: 99 |
So the answer is "trial and error"? Mess around until one value rises to the top? Reno, NV Team SETI.USA |
Joined: 2 May 11 Posts: 65 Credit: 242,754,987 RAC: 0 |
I would say yes, you have to go with trial and error. The goal of auto-detect (-1) is to avoid that trial-and-error method; it works fine on some systems but, sadly, not on all of them. If you find something better than auto-detect, feel free to publish it for other users with the same setup as you. |
Joined: 2 May 11 Posts: 47 Credit: 319,540,306 RAC: 99 |
Heh. I am not good at the math or the programming. I am just trying to understand how things work. So can the different values/formulas be posted? Also, do they end at 10, or are there more? What is the range? Reno, NV Team SETI.USA |
Joined: 2 May 11 Posts: 27 Credit: 1,151,788 RAC: 0 |
Can someone please define what each of the values means? From reading the various threads, it looks like the range is -1 through 10, integer values only. Is this correct? I know -1 means auto-select, but what do all the rest of the values mean? Can someone please post the definitions?
You can look for yourself by starting the app from a command prompt with "--config", then selecting 3 and then 1. For CUDA it's:
RC5-72:
-1) Auto select
0) CUDA 1-pipe 64-thd
1) CUDA 1-pipe 128-thd
2) CUDA 1-pipe 256-thd
3) CUDA 2-pipe 64-thd
4) CUDA 2-pipe 128-thd
5) CUDA 2-pipe 256-thd
6) CUDA 4-pipe 64-thd
7) CUDA 4-pipe 128-thd
8) CUDA 4-pipe 256-thd
9) CUDA 1-pipe 64-thd busy wait
10) CUDA 1-pipe 64-thd sleep 100us
11) CUDA 1-pipe 64-thd sleep dynamic |
Joined: 4 May 11 Posts: 7 Credit: 224,776,119 RAC: 0 |
So I did the bench test and found core 3 was significantly faster, so I configured the app to use core 3. From the results since I made that configuration change, the core setting seems to be ignored and the app continues to use core 0. |
Joined: 2 May 11 Posts: 47 Credit: 319,540,306 RAC: 99 |
So I did the bench test and found core 3 was significantly faster, so I configured the app to use core 3. From the results since I made that configuration change, the core setting seems to be ignored and the app continues to use core 0. You have to configure it on your project preferences page, here at the project. If you configure it from the command line, it gets overwritten by the project preferences with the next task. Reno, NV Team SETI.USA |
Joined: 8 May 11 Posts: 11 Credit: 1,075,941 RAC: 0 |
So I did the bench test and found core 3 was significantly faster, so I configured the app to use core 3. From the results since I made that configuration change, the core setting seems to be ignored and the app continues to use core 0. I can only speak for myself, but selecting any core other than the default -1 didn't work with the Windows 1.01 wrapper/app (the app exits and does nothing; after ten retries it's a calculation error). The Linux 1.01 wrapper/app happily ignores the setting (core #7 is the fastest here). Note: the Linux BOINC client just downloaded the new 1.02 wrapper/app. I have aborted all 1.01 tasks; 1.02 works as expected. |
Joined: 8 May 11 Posts: 11 Credit: 1,075,941 RAC: 0 |
So the answer is "trial and error"? Mess around until one value rises to the top?
Copy the executable (from DNETC) to another directory and start it with --bench (pause BOINC in the meantime). You can log the output with -l somefilename. Here is an example run (Linux/GTX 470) with some useless info removed; a small script for ranking the cores from such a log is sketched after this post.

[May 18 14:04:31 UTC] RC5-72: using core #0 (CUDA 1-pipe 64-thd).
[May 18 14:04:42 UTC] RC5-72: Benchmark for core #0 (CUDA 1-pipe 64-thd) 0.00:00:08.79 [491,117,547 keys/sec]
[May 18 14:04:42 UTC] RC5-72: using core #1 (CUDA 1-pipe 128-thd).
[May 18 14:04:52 UTC] RC5-72: Benchmark for core #1 (CUDA 1-pipe 128-thd) 0.00:00:08.51 [507,606,843 keys/sec]
[May 18 14:04:52 UTC] RC5-72: using core #2 (CUDA 1-pipe 256-thd).
[May 18 14:05:03 UTC] RC5-72: Benchmark for core #2 (CUDA 1-pipe 256-thd) 0.00:00:08.59 [502,783,808 keys/sec]
[May 18 14:05:03 UTC] RC5-72: using core #3 (CUDA 2-pipe 64-thd).
[May 18 14:05:14 UTC] RC5-72: Benchmark for core #3 (CUDA 2-pipe 64-thd) 0.00:00:08.65 [498,891,716 keys/sec]
[May 18 14:05:14 UTC] RC5-72: using core #4 (CUDA 2-pipe 128-thd).
[May 18 14:05:25 UTC] RC5-72: Benchmark for core #4 (CUDA 2-pipe 128-thd) 0.00:00:08.40 [514,268,438 keys/sec]
[May 18 14:05:25 UTC] RC5-72: using core #5 (CUDA 2-pipe 256-thd).
[May 18 14:05:35 UTC] RC5-72: Benchmark for core #5 (CUDA 2-pipe 256-thd) 0.00:00:08.48 [509,478,872 keys/sec]
[May 18 14:05:35 UTC] RC5-72: using core #6 (CUDA 4-pipe 64-thd).
[May 18 14:05:46 UTC] RC5-72: Benchmark for core #6 (CUDA 4-pipe 64-thd) 0.00:00:08.59 [502,289,441 keys/sec]
[May 18 14:05:46 UTC] RC5-72: using core #7 (CUDA 4-pipe 128-thd).
[May 18 14:05:56 UTC] RC5-72: Benchmark for core #7 (CUDA 4-pipe 128-thd) 0.00:00:08.36 [517,607,292 keys/sec]
[May 18 14:05:56 UTC] RC5-72: using core #8 (CUDA 4-pipe 256-thd).
[May 18 14:06:07 UTC] RC5-72: Benchmark for core #8 (CUDA 4-pipe 256-thd) 0.00:00:08.43 [512,468,703 keys/sec]
[May 18 14:06:07 UTC] RC5-72: using core #9 (CUDA 1-pipe 64-thd busy wait).
[May 18 14:06:19 UTC] RC5-72: Benchmark for core #9 (CUDA 1-pipe 64-thd busy wait) 0.00:00:08.78 [491,695,519 keys/sec]
[May 18 14:06:19 UTC] RC5-72: using core #10 (CUDA 1-pipe 64-thd sleep 100us).
[May 18 14:06:30 UTC] RC5-72: Benchmark for core #10 (CUDA 1-pipe 64-thd sleep 100us) 0.00:00:08.82 [489,091,617 keys/sec]
[May 18 14:06:30 UTC] RC5-72: using core #11 (CUDA 1-pipe 64-thd sleep dynamic).
[May 18 14:06:41 UTC] RC5-72: Benchmark for core #11 (CUDA 1-pipe 64-thd sleep dynamic) 0.00:00:08.78 [491,651,933 keys/sec]
[May 18 14:06:41 UTC] RC5-72 benchmark summary :
Default core : #0 (CUDA 1-pipe 64-thd)
Fastest core : #7 (CUDA 4-pipe 128-thd)
[May 18 14:06:41 UTC] Core #7 is significantly faster than the default core. The GPU core selection has been made as a tradeoff between core speed and responsiveness of the graphical desktop. Please file a bug report along with the output of -cpuinfo only if the faster core selection does not degrade graphics performance.

The key info is: Fastest core : #7 (CUDA 4-pipe 128-thd). Your mileage may vary. This info is also important: "Core #7 is significantly faster than the default core. The GPU core selection has been made as a tradeoff between core speed and responsiveness of the graphical desktop. Please file a bug report along with the output of -cpuinfo only if the faster core selection does not degrade graphics performance." If the selected core causes problems (laggy screen updates, too much CPU load, GPUs glowing red), you can try another core.

Cores 9, 10 and 11 use the same code as core number 1 but coordinate the CPU and the GPU by different methods. The other cores are also variations of core #1, differing in how they divide the work that is sent to the multiprocessing units of the GPU. |
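To save squinting at the log, here is an unofficial sketch (assuming the log format shown above; "bench.log" stands in for whatever file name you passed to -l) that reads the benchmark output and ranks the cores by keys/sec:

```python
# Sketch: parse a dnetc --bench log and list the cores from fastest to slowest.
import re

CORE_RE = re.compile(
    r"Benchmark for core #(\d+) \((.+?)\)\s+[\d.:]+\s+\[([\d,]+) keys/sec\]"
)

results = []
with open("bench.log") as log:          # the file you wrote with "-l bench.log"
    for line in log:
        m = CORE_RE.search(line)
        if m:
            keys = int(m.group(3).replace(",", ""))
            results.append((keys, int(m.group(1)), m.group(2)))

for keys, core, name in sorted(results, reverse=True):
    print(f"core #{core:2d} ({name}): {keys:,} keys/sec")
```

On the log above this would put core #7 at the top, matching the app's own "Fastest core" summary.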
Joined: 4 May 11 Posts: 7 Credit: 224,776,119 RAC: 0 |
So I did the bench test and found core 3 was significantly faster, so I configured the app to use core 3. From the results since I made that configuration change, the core setting seems to be ignored and the app continues to use core 0. So that doesn't make any sense. What if you have more than one system running different cards, and core 1 is best on one card, core 2 on another, and so on? Shouldn't that be set per card, not at the project level? |
Joined: 2 May 11 Posts: 47 Credit: 319,540,306 RAC: 99 |
What if you have more than one system running different cards, and core 1 is best on one card, core 2 on another, and so on? Shouldn't that be set per card, not at the project level? That is what the different locations are for. For example, you can have Home (set to "3") for the 5870s and Work (set to "1") for the 4870s, then assign the machines to the locations accordingly. Reno, NV Team SETI.USA |
Joined: 4 May 11 Posts: 7 Credit: 224,776,119 RAC: 0 |
You can't do that if the cards that require the different cores are in the same machine. |
Joined: 2 May 11 Posts: 65 Credit: 242,754,987 RAC: 0 |
Mr. Hankey, what types of cards do you have in that machine? |
Joined: 19 May 11 Posts: 7 Credit: 12,852,260 RAC: 0 |
@Clod, I suspect something like ATI 5770 + ATI 5850, where the 5770 does well with 0 and the 5850 with 3. Or e.g. NVIDIA 8800 GTX + ATI 5850. I have one cruncher configured like that, but not in this project. Maybe the wrapper could read a best-plan file (provided through the project) and set the right core instead; a rough sketch of the idea follows below. I'm sure the community would quickly provide best-core values for the different GPUs. Add something like "Use best-core plan" to the preferences page. I sorely miss this future "best-plan" feature. :) |
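To make the idea concrete, here is a purely hypothetical sketch; nothing like this exists in the wrapper or the project today, and the GPU names and core numbers are only the examples from this post, not measured recommendations. The wrapper could ship (or download) a community-maintained table mapping GPU models to known-good cores and fall back to auto-select for anything unknown:

```python
# Hypothetical "best-plan" lookup - illustrative values only, not part of the project.
BEST_PLAN = {
    "ATI Radeon HD 5770": 0,   # example value from the post above
    "ATI Radeon HD 5850": 3,   # example value from the post above
}

def pick_core(gpu_name: str) -> int:
    # Unknown cards fall back to -1, i.e. let the client auto-select.
    return BEST_PLAN.get(gpu_name, -1)

print(pick_core("ATI Radeon HD 5850"))   # 3
print(pick_core("GeForce 8800 GTX"))     # -1 (not in the plan -> auto)
```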
Joined: 2 May 11 Posts: 65 Credit: 242,754,987 RAC: 0 |
I'm curious about your "best-plan" feature. What would you do on the technical side? |
Joined: 18 May 11 Posts: 46 Credit: 1,254,302,893 RAC: 0 |
Based on the information in this thread I tried core 3 on both an HD5850 and an HD5770. Core 0 (auto-detected by selecting -1) was faster here on both: much faster than 3 on the HD5770 and a bit faster than 3 on the HD5850. This was determined by running several complete WUs and comparing both the keys/sec and the completion times for the same credit values. YMMV. |