CUDA Failure
Author | Message |
---|---|
Send message Joined: 2 May 11 Posts: 5 Credit: 600,520,646 RAC: 0 |
I have a Windows XP 64, AMD 5400+ running a 9800 GTX cuda GPU. I am seeing consistent failures with no actual valid work units completed. |
Send message Joined: 2 May 11 Posts: 27 Credit: 1,151,788 RAC: 0 |
same over here - i bet they crash because maximum runtime is too low.. |
Send message Joined: 1 May 11 Posts: 23 Credit: 1,574,433 RAC: 0 |
At least you receive work units; I don't get any ... Config : i7 860 2.8ghz, 8g ram, boinc : 6.12.26, GPU : GTX 470 Zotac Amp Edition 1280 MB DDR5 |
Send message Joined: 2 May 11 Posts: 65 Credit: 242,754,987 RAC: 0 |
I have a Windows XP 64, AMD 5400+ running a 9800 GTX cuda GPU. I am seeing consistent failures with no actual valid work units completed. What's the error in your Messages? Does it stop right away after it started? What type of card do you have? Which driver version? I can see: dnetc v2.9109-518-CTR-10092921 for CUDA 3.1 on Win32 (WindowsNT 5.1). Using email address (distributed.net ID) 'EMAIL@yahoo.com' [May 02 17:55:20 UTC] nvcuda.dll Version: 6.14.11.9621 [May 02 17:55:20 UTC] Unable to create CUDA stream [May 02 17:55:20 UTC] Unable to initialize CUDA. [May 02 17:55:20 UTC] *Break* Shutting down... 13:55:21 (6096): input buffer 0 packets (1074790400 bytes), checkpoint file 0 packets (1082589184 bytes), output buffer 1952257862 packets (-1077459503 bytes) 13:55:21 (6096): premature exit detected, app exit status: 0xfffffffd 13:55:21 (6096): wrapper: running dnetc518-win32-x86-cuda31.exe (-ini dnetc.ini -runoffline -multiok=1 -e EMAIL@yahoo.com) Based on: [May 02 17:55:20 UTC] Unable to create CUDA stream [May 02 17:55:20 UTC] Unable to initialize CUDA. Are you sure your CUDA is properly installed? |
Send message Joined: 2 May 11 Posts: 5 Credit: 600,520,646 RAC: 0 |
@Clod Patry - The work unit finishes then while waiting for validation it results in "Error while computing". No other messages are available. I am running the project PrimeGrid with no problems. |
Send message Joined: 2 May 11 Posts: 65 Credit: 242,754,987 RAC: 0 |
We'll have to dig a little into why this is causing a problem. Which driver version? |
Send message Joined: 20 Apr 11 Posts: 388 Credit: 822,356,221 RAC: 0 |
Which driver version? Looking at the output, it says [May 02 17:55:20 UTC] nvcuda.dll Version: 6.14.11.9621 which means it's driver v196.21. (Sure, it's also in http://moowrap.net/show_host_detail.php?hostid=115 but that's too easy. :) ) Unfortunately, this is too old and could explain the CUDA stream creation errors in the log. Looks like v256 is the minimum needed for CUDA 3.1 (Dnet Client was compiled for that version). -w |
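The version mapping used in the post above can be sketched in Python. This is a minimal sketch, assuming the usual NVIDIA convention on Windows that the driver version is encoded in the last five digits of the DLL file version (the last digit of the third field followed by the fourth field). The function names are made up for illustration, and the 256 threshold is simply the figure quoted in the post, not an independently checked value.

```python
def driver_version_from_dll(dll_version: str) -> str:
    """Convert an nvcuda.dll file version such as '6.14.11.9621'
    into an NVIDIA driver version string such as '196.21'.

    Assumption: the driver version is the last five digits of the file
    version (last digit of field 3 + field 4 zero-padded to 4 digits).
    """
    fields = dll_version.strip().split(".")
    if (len(fields) != 4
            or not all(f.isdigit() for f in fields)
            or len(fields[3]) > 4):
        raise ValueError(f"unexpected version format: {dll_version!r}")
    digits = fields[2][-1] + fields[3].zfill(4)
    return f"{digits[:3]}.{digits[3:]}"


# Minimum driver version for CUDA 3.1, as stated in the post above.
MIN_DRIVER_FOR_CUDA_31 = 256.0


def is_new_enough(dll_version: str) -> bool:
    """True if the driver behind this DLL version meets the minimum."""
    return float(driver_version_from_dll(dll_version)) >= MIN_DRIVER_FOR_CUDA_31


if __name__ == "__main__":
    version = "6.14.11.9621"  # value from the log quoted in this thread
    print(driver_version_from_dll(version), is_new_enough(version))
```

For the version string in the log this yields 196.21, which is below the stated minimum and matches the diagnosis in the post.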
Send message Joined: 2 May 11 Posts: 3 Credit: 271,855 RAC: 0 |
I'm getting the same thing, even though I have the latest drivers. Take a look at this result: http://moowrap.net/result.php?resultid=8422 It actually crunched through 3 of the packets OK before it hit a maximum disk usage error. I have it set to use 99% of disk space, so I'm unsure why it gets that message. Bok |
Send message Joined: 20 Apr 11 Posts: 388 Credit: 822,356,221 RAC: 0 |
Take a look at this result: http://moowrap.net/result.php?resultid=8422 That's a setting in the WU and is set at server side. It was originally set too low for these systems that generate so much stderr output due to a wrapper bug and missing Dnet ID. I've already bumped the value for any newly generated work so these should go away. -w |
Send message Joined: 2 May 11 Posts: 27 Credit: 1,151,788 RAC: 0 |
ok, seems to work now. next thing needed is a way to get rid of automatic core selection which costs a lot of performance for many of us. |
Send message Joined: 2 May 11 Posts: 4 Credit: 15,774,182 RAC: 677 |
That's a setting in the WU and is set at server side. It was originally set too low for these systems that generate so much stderr output due to a wrapper bug and missing Dnet ID. The problem seems to persist. One of the WUs this morning got the following error logged - 2011/5/5 AM 03:12:02 Moo! Wrapper Aborting task dnetc_r72_1304316070_0: exceeded disk limit: 0.48MB > 0.48MB 2011/5/5 AM 03:12:09 Moo! Wrapper Computation for task dnetc_r72_1304316070_0 finished 2011/5/5 AM 03:12:11 Moo! Wrapper Started upload of dnetc_r72_1304316070_0_0 2011/5/5 AM 03:12:18 Moo! Wrapper Finished upload of dnetc_r72_1304316070_0_0 Philip |
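The odd-looking `0.48MB > 0.48MB` in that log line is most likely a rounding effect: both values are printed with two decimals, so a usage only slightly above the limit looks identical to the limit. A minimal Python sketch follows. The two byte counts and the 1024*1024 divisor are assumptions chosen for illustration, not values taken from this task.

```python
MB = 1024 * 1024  # assumed divisor for the "MB" shown in the message


def fmt_mb(n_bytes: int) -> str:
    """Format a byte count the way the log line appears to: two decimals."""
    return f"{n_bytes / MB:.2f}MB"


# Hypothetical numbers: usage exceeds the bound by only 100 bytes.
disk_bound = 500_000
disk_usage = 500_100

if __name__ == "__main__":
    if disk_usage > disk_bound:
        print(f"exceeded disk limit: {fmt_mb(disk_usage)} > {fmt_mb(disk_bound)}")
```

Under these assumptions the check fails on the exact byte counts, even though the rounded figures in the message are equal.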
Send message Joined: 2 May 11 Posts: 3 Credit: 271,855 RAC: 0 |
Looking at that wu name, it looks like one of the old ones prior to the change the admin made to correct it. |
Send message Joined: 2 May 11 Posts: 4 Credit: 15,774,182 RAC: 677 |
I did a reset to clear the old WUs prior to downloading some new ones. Perhaps there are still some old WUs in the queue and I should wait a little longer before downloading WUs again. I shall try again when I return home this evening. |
Send message Joined: 2 May 11 Posts: 4 Credit: 15,774,182 RAC: 677 |
I did a reset to clear the old WUs prior to downloading some new ones. Perhaps there are still some old WUs in the queue and I should wait a little longer before downloading WUs again. I shall try again when I return home this evening. Downloaded three WUs, two look like old ones and one seems new. Crunched the new one successfully. I think I will abort the two old ones and wait a while before trying again. |
Send message Joined: 20 Apr 11 Posts: 388 Credit: 822,356,221 RAC: 0 |
Downloaded three WUs, two look like old ones and one seems new. Crunched the new one successfully. Unfortunately, the two workunits (4599 and 4871) that you have "in progress" are most likely going to end in error. You should try to get one of the new ones (that have longer names), or indeed wait either for the next application version, which should fix the long stderr that's giving problems, or for somebody else to crunch the older ones. -w |
Send message Joined: 2 May 11 Posts: 4 Credit: 15,774,182 RAC: 677 |
Downloaded three WUs, two look like old ones and one seems new. Crunched the new one successfully. I tried downloading again this morning but still got two WUs (client 1.00) with short names. So I think I will wait until the next application version is released. Thanks. Philip |
Send message Joined: 1 May 11 Posts: 23 Credit: 1,574,433 RAC: 0 |
I don't get the impression that changing the line <command_line>-ini dnetc.ini -runoffline -multiok=1</command_line> to <command_line>-ini dnetc.ini -runoffline -multiok=1 -c 10</command_line> reduces the CUDA time per unit, because the last two took about 8000 sec each ... unless these units are much longer than the previous ones .... Config : i7 860 2.8ghz, 8g ram, boinc : 6.12.26, GPU : GTX 470 Zotac Amp Edition 1280 MB DDR5 |
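The command-line change described in the post above could be scripted roughly as follows. This is a minimal sketch, assuming the wrapper's job file contains a `<command_line>` element as quoted in the post. The surrounding `<job_desc>`/`<task>` layout of the sample and the function name `add_core_option` are assumptions for illustration, and `-c 10` is simply the value the poster tried.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a wrapper job file; only <command_line> is
# taken from the thread, the rest of the layout is assumed.
SAMPLE_JOB_XML = """<job_desc>
  <task>
    <application>dnetc518-win32-x86-cuda31.exe</application>
    <command_line>-ini dnetc.ini -runoffline -multiok=1</command_line>
  </task>
</job_desc>"""


def add_core_option(job_xml: str, core: int) -> str:
    """Append '-c <core>' to every <command_line> that lacks a -c option."""
    root = ET.fromstring(job_xml)
    for cmd in root.iter("command_line"):
        args = (cmd.text or "").split()
        if "-c" not in args:  # leave an existing core selection alone
            args += ["-c", str(core)]
        cmd.text = " ".join(args)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_core_option(SAMPLE_JOB_XML, 10))
```

The function leaves an existing `-c` setting untouched, so running it twice does not stack options. Whether a fixed core actually helps depends on the card, as the differing results in this thread suggest.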
Send message Joined: 2 May 11 Posts: 27 Credit: 1,151,788 RAC: 0 |
reduces the CUDA time per unit, because the last two took about 8000 sec each ... unless these units are much longer than the previous ones .... check what's in stderr! there are packets of different sizes. for my hosts it's about 20% faster.. |