Dual+ GPUs (Problems and possible solutions)


Message boards : Number crunching : Dual+ GPUs (Problems and possible solutions)
valterc

Send message
Joined: 10 May 11
Posts: 17
Credit: 6,249,120,024
RAC: 13,744,086
Message 6141 - Posted: 1 Sep 2014, 14:24:08 UTC

Hi all. There are at least three annoying problems when running Moo! on a dual+ GPU setup. I have experienced this situation on a box with two ATI/AMD 6900 series (Cayman) cards.
- Using BOINC v7+ the application won't start unless you use an app_config.xml (something normal users should not have to deal with).
- The dnetc application will try to use all the GPUs, simply ignoring any BOINC setup, thus creating conflicts with other GPU applications.
- If the application uses all the GPUs there are 'waiting for other thread' problems when the number of input packets is not even (in the case of a power-of-two number of GPUs) or when the crunching speed of the GPUs differs.

What I propose is a very simple change, i.e. dropping the multi-GPU capability of the dnetc application altogether. The application itself uses a configuration file (dnetc-gpu-1.3.ini) which is copied into the slot directory before it is run. The current content is:
[buffers]
allow-update-from-altbuffer=no
buffer-file-basename=in
output-file-basename=out
checkpoint-filename=chkpoint

[misc]
run-work-limit=-1

[triggers]
restart-on-config-file-change=no
exit-flag-filename=exit
pause-flag-filename=pause

[display]
progress-indicator=off

[processor-usage]
priority=9

[networking]
disabled=yes

[logging]
log-file-type="no limit"
log-file=stderr.txt

[rc5-72]
random-subspace=1339

If we add a "max-threads=1" line to the [processor-usage] section we tell the application to use *just* one thread (which means one GPU). However, with this addition alone, it will *always* use device 0. But if it is started with the -devicenum <n> command line argument (run on device <n> only), all the problems related to this issue will be solved. This should be just a minor change to the wrapper code.
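As a sketch, the only change to the shipped .ini would be one extra line in the [processor-usage] section (the rest of the file stays exactly as above); the comment syntax here is illustrative:

```ini
[processor-usage]
priority=9
; proposed addition: limit dnetc to a single crunching thread,
; i.e. a single GPU; the wrapper then picks which one via -devicenum
max-threads=1
```

The wrapper would then start each task with something like "-devicenum 1" appended for the second GPU (exact invocation depending on how the wrapper builds the dnetc command line).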
ID: 6141
mikey

Send message
Joined: 22 Jun 11
Posts: 2080
Credit: 1,826,336,240
RAC: 3,658
Message 6143 - Posted: 2 Sep 2014, 10:19:40 UTC - in response to Message 6141.  

Hi all. There are at least three annoying problems when running Moo! on a dual+ GPU setup. [...]

What I propose is a very simple change, i.e. dropping the multi-GPU capability of the dnetc application altogether. [...] If we add a "max-threads=1" line to the [processor-usage] section we tell the application to use *just* one thread (which means one GPU). [...] This should be just a minor change to the wrapper code.


You might send this to the BOINC mailing list, as the problems you are seeing with dual Nvidia GPUs in a single machine have been a thorn for a LONG time!! For some people at some projects it works just fine; for other people at the same project it does not work at all. You fill up the cache of one GPU and the other GPU sits idle. The ONLY solution so far is to put the two GPUs on different projects, then they both work fine.
ID: 6143
valterc

Send message
Joined: 10 May 11
Posts: 17
Credit: 6,249,120,024
RAC: 13,744,086
Message 6145 - Posted: 2 Sep 2014, 10:53:39 UTC - in response to Message 6143.  
Last modified: 2 Sep 2014, 11:06:27 UTC

You might send this to the BOINC mailing list, as the problems you are seeing with dual Nvidia GPUs in a single machine have been a thorn for a LONG time!! For some people at some projects it works just fine; for other people at the same project it does not work at all. You fill up the cache of one GPU and the other GPU sits idle. The ONLY solution so far is to put the two GPUs on different projects, then they both work fine.

Well, this issue is not related to Nvidia at all. The problem here is that the DNETC application will try, by default, to use ALL the GPUs it sees. Not being a BOINC application (it needs a wrapper in order to run under BOINC), it simply does not care about any BOINC directive/setup.

Using BOINC v7+ the app will request 2 GPUs but will not start at all, unless you specify <gpu_usage>1</gpu_usage> inside an app_config.xml. Even in this case the application will use ALL the GPUs, thus conflicting with other GPU applications BOINC may want to run.
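For reference, the app_config.xml workaround looks roughly like this (a sketch: the <name> value is assumed here to be "dnetc" and the <cpu_usage> figure is arbitrary; check the actual app name in your client_state.xml):

```xml
<app_config>
  <app>
    <name>dnetc</name>
    <gpu_versions>
      <!-- tell BOINC each task needs one GPU, so it actually starts -->
      <gpu_usage>1</gpu_usage>
      <cpu_usage>0.05</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```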

To be more specific, suppose that I attach to Milkyway and Moo! on a dual GPU machine, with boinc v7.

- If I do nothing: BOINC downloads both Moo! and MW WUs; Moo! requests 2 GPUs but will not start at all (this may be a BOINC issue). Result: the Moo! tasks eventually time out and both GPUs run MW.
- If I use an app_config.xml as described above: Moo! now requests just one GPU, and the application starts, say, on the first GPU but actually uses *ALL* the GPUs (both). BOINC thinks one GPU is free and starts another Moo! or an MW task on the second GPU. So you may end up with one GPU running two different tasks at the same time....
ID: 6145
Teemu Mannermaa
Project administrator
Project developer
Project tester

Send message
Joined: 20 Apr 11
Posts: 388
Credit: 822,356,221
RAC: 0
Message 6151 - Posted: 5 Sep 2014, 12:40:04 UTC - in response to Message 6141.  

this addition, it will *always* use device 0. But if it is started with the -devicenum <n> command line argument (run on device <n> only), all the problems related to this issue will be solved. This should be just a minor change to the wrapper code.


It does seem a minor change, but it's complicated by the fact that the BOINC Client gives apps the device number to use in 3 different ways. And most of the time it gives us the wrong GPU information (not for the GPU the app should use, but for the other one). This is sad especially since the client knows the info; it just doesn't pass it via its API.

I've also seen the multi-GPU scheduling hang. I think that's a BOINC Client bug where it no longer schedules multiple GPUs correctly. :(

However, your three problems are valid ones and summarize the situation nicely. I'm happy to report that the upcoming OpenCL app will make use of -devicenum to run one instance per device. It also fetches the necessary GPU information directly from the BOINC Client by hooking into the process-based detection feature in recent versions.

That app version will hit beta any minute now. :)

-w
ID: 6151
valterc

Send message
Joined: 10 May 11
Posts: 17
Credit: 6,249,120,024
RAC: 13,744,086
Message 6152 - Posted: 5 Sep 2014, 15:35:55 UTC - in response to Message 6151.  
Last modified: 5 Sep 2014, 15:38:29 UTC

That's great! Any improvement will help keep this project alive, and surely attract more users.

Just one thing: do you have any benchmarks of the OpenCL application running on pre-Tahiti AMD/ATI GPUs (5xxx, 6xxx)? If there is a significant drop in performance it may be worthwhile to also keep the ATI/CAL applications alive (the one based on dnetc518-win32-x86-stream, maybe adding -devicenum to its runtime flags).

Thank you
ID: 6152
Teemu Mannermaa
Project administrator
Project developer
Project tester

Send message
Joined: 20 Apr 11
Posts: 388
Credit: 822,356,221
RAC: 0
Message 6154 - Posted: 5 Sep 2014, 16:00:17 UTC - in response to Message 6152.  

Just one thing: do you have any benchmarks of the OpenCL application running on pre-Tahiti AMD/ATI GPUs (5xxx, 6xxx)? If there is a significant drop in performance it may be worthwhile to also keep the ATI/CAL applications alive (the one based on dnetc518-win32-x86-stream, maybe adding -devicenum to its runtime flags).


I don't have anything conclusive about the performance differences at the moment. OpenCL is supposed to be slightly slower, but the Dnet Client devs (and I'm sure the card driver devs) have been improving OpenCL performance steadily.

My own 5870 tests about the same on both OpenCL and Stream, but it might not be able to use its best core due to those cores getting disabled on my test system (because they would crash on account of my 7970).

For now I'm keeping the Stream apps alive, but we'll see what happens in the long run. Deploying them with a newer wrapper version (and a newer Dnet Client, since build 519 is the first one to support the -devicenum option) is something I'd want to do so they get the new version's benefits.

-w
ID: 6154



 
Copyright © 2011-2024 Moo! Wrapper Project