Dual+ GPUs (Problems and possible solutions)

valterc (Joined: 10 May 11, Posts: 17, Credit: 10,320,726,071, RAC: 913,425)
Message 6141 - Posted: 1 Sep 2014, 14:24:08 UTC

Hi all.

There are at least three annoying problems with running Moo! on a dual+ GPU setup. I have experienced them on a box with two ATI/AMD 6900 series (Cayman) cards:

- With BOINC v7+ the application won't start unless you use an app_config.xml (something normal users should not have to do).
- The dnetc application tries to use all the GPUs, simply ignoring any BOINC setup, and so conflicts with other GPU applications.
- If the application uses all the GPUs, there are 'waiting for other thread' problems when the number of input packets is not even (in the case of a power-of-two number of GPUs) or when the crunching speeds of the GPUs differ.

What I propose is a very simple change: drop the multi-GPU capability of the dnetc application altogether. The application itself uses a configuration file (dnetc-gpu-1.3.ini), which is copied into the slot directory before the application runs. Its current content is:

    [buffers]
    allow-update-from-altbuffer=no
    buffer-file-basename=in
    output-file-basename=out
    checkpoint-filename=chkpoint
    [misc]
    run-work-limit=-1
    [triggers]
    restart-on-config-file-change=no
    exit-flag-filename=exit
    pause-flag-filename=pause
    [display]
    progress-indicator=off
    [processor-usage]
    priority=9
    [networking]
    disabled=yes
    [logging]
    log-file-type="no limit"
    log-file=stderr.txt
    [rc5-72]
    random-subspace=1339

If we add a "max-threads=1" line to the [processor-usage] section, we tell the application to use just one thread (which means one GPU). With this addition alone, however, it will always use device 0. But if it is also started with the -devicenum <n> command line argument (run on device <n> only), all the problems related to this issue will be solved. This should be just a minor change to the wrapper code.
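As a rough illustration of the proposal above, the following Python sketch adds max-threads=1 to the [processor-usage] section of an ini file and builds a command line that pins the client to one device. It is not the project's wrapper code: the function names, the use of Python's configparser, and the executable name "dnetc" are assumptions made for the example. Only the max-threads=1 setting and the -devicenum <n> argument come from the post.

```python
import configparser
import io
from typing import List


def add_single_thread_limit(ini_text: str) -> str:
    """Return ini_text with max-threads=1 set in [processor-usage]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section("processor-usage"):
        parser.add_section("processor-usage")
    parser.set("processor-usage", "max-threads", "1")
    out = io.StringIO()
    # Write key=value without spaces, matching the style of the original file.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


def build_command(executable: str, device: int) -> List[str]:
    """Build an argument list that restricts the client to one device.

    `executable` is a placeholder; the real binary name is not given here.
    """
    return [executable, "-devicenum", str(device)]
```

For example, build_command("dnetc", 1) yields an argument list that asks the client to run on device 1 only, while the edited ini keeps it to a single thread.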

mikey (Joined: 22 Jun 11, Posts: 2080, Credit: 1,854,430,696, RAC: 22)
Message 6143 - Posted: 2 Sep 2014, 10:19:40 UTC - in response to Message 6141.

You might send this to the BOINC mailing list, as the problems you are seeing with dual Nvidia GPUs in a single machine have been a thorn for a LONG time!! For some people at some projects it works just fine; for other people at the same project it does not work at all. You fill up the cache of one GPU and the other GPU sits idle. The ONLY solution so far is to put the two GPUs on different projects; then they both work fine.

valterc (Joined: 10 May 11, Posts: 17, Credit: 10,320,726,071, RAC: 913,425)
Message 6145 - Posted: 2 Sep 2014, 10:53:39 UTC - in response to Message 6143. Last modified: 2 Sep 2014, 11:06:27 UTC

> You might send this to the BOINC mailing list, as the problems you are seeing with dual Nvidia GPUs in a single machine have been a thorn for a LONG time!! For some people at some projects it works just fine; for other people at the same project it does not work at all. You fill up the cache of one GPU and the other GPU sits idle. The ONLY solution so far is to put the two GPUs on different projects; then they both work fine.

Well, this issue is not related to Nvidia at all. The problem here is that the DNETC application will, by default, try to use ALL the GPUs it sees. Since it is NOT a BOINC application (it needs a wrapper in order to run under BOINC), it simply does not care about any BOINC directive or setup. With BOINC v7+ the app will request 2 GPUs but will not start at all unless you specify <gpu_usage>1</gpu_usage> inside an app_config.xml. Even then the application will use ALL the GPUs, thus conflicting with other GPU applications BOINC may want to run.

To be more specific, suppose that I attach to Milkyway and Moo! on a dual-GPU machine with BOINC v7.

- If I do nothing: BOINC downloads both MOO and MW WUs; MOO requests 2 GPUs but will not start at all (this may be a BOINC issue). Result: MOO will eventually time out and both GPUs run MW.
- If I add the app_config.xml as described above: MOO now requests just one GPU and the application starts, say on the first GPU, but actually uses ALL GPUs (two). BOINC thinks that one GPU is free and starts another MOO or an MW task on the second GPU. So you may end up with one GPU running two different tasks at the same time....
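For readers who have not written an app_config.xml before, here is a minimal Python sketch that generates one with <gpu_usage>1</gpu_usage>, the setting mentioned above. The surrounding element layout (app_config, app, name, gpu_versions, cpu_usage) follows the general BOINC client configuration format as I understand it; the app name "dnetc" and the cpu_usage value are assumptions for illustration, so check the project's actual app name before using anything like this.

```python
import xml.etree.ElementTree as ET


def make_app_config(app_name: str, gpu_usage: float = 1.0,
                    cpu_usage: float = 1.0) -> str:
    """Return app_config.xml text that requests `gpu_usage` GPUs per task.

    app_name and cpu_usage are illustrative assumptions, not values
    taken from the thread.
    """
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # ":g" formatting renders 1.0 as "1" and 0.5 as "0.5".
    ET.SubElement(gpu, "gpu_usage").text = f"{gpu_usage:g}"
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_usage:g}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(make_app_config("dnetc"))
```

As the post explains, this only changes what BOINC schedules. It does not stop the dnetc client from using every GPU it sees.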

Teemu Mannermaa (Project administrator, Project developer, Project tester; Joined: 20 Apr 11, Posts: 389, Credit: 822,556,349, RAC: 0)
Message 6151 - Posted: 5 Sep 2014, 12:40:04 UTC - in response to Message 6141.

> [...] this addition, it will always use device 0. But if it is started with the -devicenum <n> command line argument (run on device <n> only), all the problems related to this issue will be solved. This should be just a minor change to the wrapper code.

It does seem a minor change, but it's complicated by the fact that the BOINC Client gives apps 3 different ways to find out which device number to use. And most of the time it gives us the wrong GPU information (not for the GPU the app should use, but for the other one). This is sad, especially since the client knows the info; it just doesn't pass it via its API.

I've also seen the multi-GPU scheduling hang. I think that's a BOINC Client bug where they no longer schedule multiple GPUs correctly. :(

However, your three problems are valid ones and summarize the situation nicely. I'm happy to report that the upcoming OpenCL app will make use of -devicenum to run one instance per device. It also fetches the necessary GPU information directly from the BOINC Client by hooking into the process-based detection feature in recent versions.

That app version will hit beta any minute now. :)

-w
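To make the device-number problem concrete, here is a hedged Python sketch of how a wrapper might pick a device index and translate it into -devicenum. The post does not name the three mechanisms. The two used here (a gpu_device_num value from the client's init data, with -1 meaning "not provided", and a legacy --device N command-line argument) are assumptions based on general BOINC application documentation, not on the Moo! wrapper. As noted above, the values the client supplies may be wrong, so this shows only the lookup order, not a fix.

```python
from typing import List, Optional


def resolve_device(init_gpu_device_num: int, argv: List[str]) -> Optional[int]:
    """Pick a device index, or None if the client supplied nothing usable.

    Assumed order: init-data value first, then a legacy `--device N` argument.
    """
    if init_gpu_device_num >= 0:
        return init_gpu_device_num
    for i, arg in enumerate(argv[:-1]):
        if arg == "--device":
            try:
                return int(argv[i + 1])
            except ValueError:
                return None
    return None


def dnetc_args(device: Optional[int]) -> List[str]:
    """Extra arguments for the client; empty if no device is known."""
    return [] if device is None else ["-devicenum", str(device)]
```

With no usable information the sketch passes no extra argument, which leaves the client at its default behaviour instead of guessing a device.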

valterc (Joined: 10 May 11, Posts: 17, Credit: 10,320,726,071, RAC: 913,425)
Message 6152 - Posted: 5 Sep 2014, 15:35:55 UTC - in response to Message 6151. Last modified: 5 Sep 2014, 15:38:29 UTC

That's great! Any improvement will help keep this project alive, and will surely attract more users.

Just one thing: do you have any benchmarks of the OpenCL application running on pre-Tahiti AMD/ATI GPUs (5xxx, 6xxx)? If there is a significant drop in performance, it may be worthwhile to also keep the ATI/CAL applications alive (the ones based on dnetc518-win32-x86-stream, maybe adding -devicenum to their runtime flags).

Thank you

Teemu Mannermaa (Project administrator, Project developer, Project tester; Joined: 20 Apr 11, Posts: 389, Credit: 822,556,349, RAC: 0)
Message 6154 - Posted: 5 Sep 2014, 16:00:17 UTC - in response to Message 6152.

> Just one thing: do you have any benchmarks of the OpenCL application running on pre-Tahiti AMD/ATI GPUs (5xxx, 6xxx)? If there is a significant drop in performance, it may be worthwhile to also keep the ATI/CAL applications alive (the ones based on dnetc518-win32-x86-stream, maybe adding -devicenum to their runtime flags).

I don't have anything conclusive about performance differences at the moment. OpenCL is supposed to be slightly slower, but the Dnet Client devs (and, I'm sure, the card driver devs) have been improving OpenCL performance steadily. My own 5870 tests about the same on both OpenCL and Stream, but it might not be able to use its best core, since those cores are disabled on my test system (they would crash because of my 7970).

For now I'm keeping the Stream apps alive, but we'll see what happens in the long run. Deploying them with a newer wrapper version (and a newer Dnet Client, since build 519 is the first to support the -devicenum option) is something I'd want to do so that they get the benefits of the new versions.

-w