Can't get BOINC to download work
Author | Message |
---|---|
Send message Joined: 2 May 11 Posts: 53 Credit: 255,380,797 RAC: 7,430 |
Over the past many months, every now and then I have found that BOINC manager/client refuses to request new work even when it runs out of work to process, and just sits there idle. I usually have to reboot the machine to get it to ask for more work. This has been mainly a GPU issue; the CPUs keep chugging away and request new work when needed. Starting yesterday, both my Windows computers that are running Moo stopped asking for new work at about the same time and both later ran out of work. I waited for BOINC to request more, but if it did it just said "Not Requesting New Work" or "No Work Available", "Got Zero new tasks". If I manually update I get the same message. If I reset the computer (either one) I still get the same message. It keeps saying "Not requesting Work" when it has none and plenty is available. CPU work keeps requesting work and getting it, just not the GPU. Mostly, stopping and restarting BOINC Manager/Client gets things rolling again: work downloads and it then keeps asking for work and getting it for a week or more before doing it again. However, I sometimes have to reboot to get work flowing again. So yesterday I upgraded one of the computers from 6.10.58 (which all of my computers have) to 6.12.34 to see if that would fix the problem. I got work and it was running fine. It lasted less than one day and then the new BOINC client started doing the same thing again. It ran out of work again and refused to download more. I reset the project but nothing changed. I detached (removed) the project and reattached (added) Moo back again, and after at first not downloading, it then downloaded 100 work units, which it is now processing. But even after all this it has not improved, with both computers still not downloading any new work at all, so as my cache drops down no more is downloaded to replace it. I am running out of work yet again.
This all seems to have started with the server upgrade the other day, as I hardly had any real problems until about yesterday (when I noticed it; it could have started before this, as I had a full cache and only noticed when it did not replenish). I have checked all settings on both the computers and the project, fixing anything I thought may help, but it all seems normal, just not working. I am going away this Friday for a week and it looks like I may have to switch to another project in order to maintain work levels, as Moo does not work any more. Conan |
Send message Joined: 2 May 11 Posts: 53 Credit: 255,380,797 RAC: 7,430 |
Well 6.10.58 just refuses to download any work. Can't recall it being a problem before, as the client has been running for many many months without a major problem. But now it runs out of work and won't refill the queue. I still believe it has something to do with the project's BOINC server code update of a few days ago, as this started a few days ago as well. Anyway, I have also now updated the BOINC client to 6.12.34 on the other Windows machine and now have some work. I will monitor for the next day or so. Linux is still on 6.10.58 with no problems, so I will leave it there, as the new features in 6.12.34 are a bit odd. I would prefer to not have to open a separate box just to look at the messages, but other than that it seems OK. Conan |
Send message Joined: 27 Jul 11 Posts: 342 Credit: 252,653,488 RAC: 0 |
I've used 6.12.33 from a month after it was released without problems. I hope it stays that way. |
Send message Joined: 5 May 11 Posts: 233 Credit: 351,414,150 RAC: 0 |
The other thing to be wary of is the dreaded OpenCL bug. It's affecting both AMD and NVidia cards, although AMD seem the worst hit. There is a Linux fix going through final testing at present at AMD (at last, this has been going on for 3 months or more); it's not yet known when the Windows fix will appear. It's complicated as it's an OpenCL bug, not a card vendor bug. The bug hits cards with multiple GPUs; single GPU cards are not affected. As a result, on multi-GPU cards, OpenCL applications grab one CPU core per GPU, like it or not. It's more than possessive as well, it almost fights off other apps trying to use that core :) It does frequently cause problems for those with multiple cards in the same machine. The bottom line is be wary of CPU overloading: take into account the bug behaviour of grabbing a full core per GPU. You can easily overload the CPU cores feeding a dual GPU card if you don't, and problems are *likely* to start. If you don't get issues, great, crunch on. But if strange issues appear, keep this bug in the back of your mind and look at CPU cores in use: write off one core per GPU, and see if that resolves the issue. A single GPU card in a box is NOT affected by this bug. |
Send message Joined: 22 Jun 11 Posts: 2080 Credit: 1,844,407,912 RAC: 3,236 |
Well 6.10.58 just refuses to download any work. The newer versions of Boinc ARE supposed to be more gpu friendly but are mostly helpful only if you crunch using BOTH the cpu and the gpu on the SAME project. If you do that, switch one of them away and see if it works again. It seems the older versions of Boinc, the 6.10.xx versions, did not handle the cpu and gpu work fetches for the same project. I personally crunch using all the cpu's in the machine, but never for the same project as the gpu. ps I too dislike having to open a different window to see the messages! I run 6.10.58 or 6.10.60 on most of my machines because of it. They also moved things around in the Tasks tab, and sorting things the way I like them is now harder for me. I like sorting the units by % complete, so the ones done are at the top; it is easier to see if I need to manually update or not that way. But they moved the Name of the unit to the far right so I have to keep lots of other columns tiny in order to see it, this is in 6.12.26. I guess they swapped Status and Name and I don't like that! |
Send message Joined: 2 May 11 Posts: 53 Credit: 255,380,797 RAC: 7,430 |
Well I think I found out what was causing the problem. Albert@Home was also running on those computers and they are testing an ATI OpenCL WU type. I was running the CPU versions but had ATI selected in my preferences, and this is where the problem arises. When I stopped doing Albert everything started to work again and now no more problems. It would appear that Boinc thought the WU was a GPU type when in fact it was a CPU type, and was getting rid of the Moo work to start the Albert work that did not exist. Anyway, working now. Conan. |
Send message Joined: 22 Jun 11 Posts: 2080 Credit: 1,844,407,912 RAC: 3,236 |
Well I think I found out what was causing the problem. That makes sense, glad you found the problem and are now back up and crunching at full speed! Now go enjoy your trip and hopefully the pc's will just crunch along by themselves!! |
Send message Joined: 2 May 11 Posts: 2 Credit: 40,286,729 RAC: 0 |
11/26/2011 11:47:41 PM Moo! Wrapper Message from server: Tasks for CPU are available, but your preferences are set to not accept them I keep getting the above message. I have only one guy running now, but when I had two the one would get so many tasks it was impossible to finish by deadline, while this guy kept getting this message at the same time the other was being constantly sent new ones! This guy is running (2) 5970's AND (1) 5870 and runs well (1.3mil+/day) when he gets constant work, so why does the server discriminate? |
Send message Joined: 22 Jun 11 Posts: 2080 Credit: 1,844,407,912 RAC: 3,236 |
11/26/2011 11:47:41 PM Moo! Wrapper Message from server: Tasks for CPU are available, but your preferences are set to not accept them Check your cache settings, also if you run other projects at the same time you could have too much work from them right now. I only run one gpu project and one cpu project, NEVER the same one, on each of my pc's. You are running an older version of Boinc, 6.10.60, and they have problems with scheduling when running BOTH the cpu and gpu's on the same project. |
Send message Joined: 2 May 11 Posts: 2 Credit: 40,286,729 RAC: 0 |
Thanks for the input Mikey, This guy was running Milkyway but as you probably know they have been MIA for some time now! I have NO CPU running on either project, and now this morning I have gotten some more work but fell behind one spot while waiting. The thing that I want to know is WHY, with identical working settings, will one computer get an overwhelming amount of work (one of my guys running 3 6970's gets swamped!) while this STRONGER guy is left sitting with the server claiming it has no work? That's obviously false when it keeps sending work to the other guy! These guys (with my others helping too) have won the last two dnetc and moo challenges, and during those there was never any of this behaviour, so I am just curious why. Thanks again |
Send message Joined: 22 Jun 11 Posts: 2080 Credit: 1,844,407,912 RAC: 3,236 |
Thanks for the input Mikey, I am guessing because of this: State: All (1918) | In progress (82) | Pending (1) | Valid (1395) | Invalid (10) | Error (430) It is the details of your tasks on the pc with the 5 gpu's in it. For every task that is returned 'invalid', or in 'error', you reduce the number available to you until you can produce valid tasks again. Your 'invalid' and 'error' status units all seem to be recent, causing the server to not continue to send units to a pc that is not crunching properly. Don't take it personally, a simple restart may fix the problem, or it could be more complicated; it is hard to tell at this point as they have a status report of 'aborted by user'. If you did this just because of deadline issues then just reduce your cache size and you should be fine in no time, as in a week or so. You WILL continue to get units but at a very reduced rate until you start returning valid units again, at which point it goes up 2 for every 1 you return that is valid. This was done a while back so that pc's that just return bad units don't continue to keep getting them, as all bad units must be resent to other pc's, causing an extra load on the project's servers. Some pc's can go thru HUNDREDS AND HUNDREDS of units in a couple of hours just getting them and returning them because of problems. Of course this could go on forever if some controls weren't in place. |
Send message Joined: 2 May 11 Posts: 4 Credit: 103,649,424 RAC: 0 |
Hello. When I try to get new WUs I get messages like this: 2011-12-03 19:28:51 Moo! Wrapper Requesting new tasks for GPU 2011-12-03 19:28:54 Moo! Wrapper Scheduler request completed: got 0 new tasks 2011-12-03 19:28:54 Moo! Wrapper Message from server: No tasks sent 2011-12-03 19:28:54 Moo! Wrapper Message from server: Tasks for CPU are available, but your preferences are set to not accept them 2011-12-03 19:28:54 Moo! Wrapper Message from server: This computer has finished a daily quota of 1041 tasks. I've checked with BOINC manager 6.10.58 and 6.12.34, all the same. Any ideas? Thanks |
Send message Joined: 26 May 11 Posts: 568 Credit: 121,524,886 RAC: 0 |
Hello. Do you have a lot of tasks that you have aborted? Please unhide your PCs so we can see your computers. |
Send message Joined: 2 May 11 Posts: 4 Credit: 103,649,424 RAC: 0 |
My pc: http://moowrap.net/show_host_detail.php?hostid=141 I think you are right, so many tasks aborted. |
Send message Joined: 26 May 11 Posts: 568 Credit: 121,524,886 RAC: 0 |
my pc Well, I'm not sure, but maybe you should look at your preferences for the number of days that you want to have work for. Set the amount to 1 day and see what happens. As stated here, if you don't process any good WUs, that will have an impact on how many WUs the server sends you. I see on your computers page that you have 3 rigs. Maybe you should focus on 1 pc for the time being and try to process some WUs that are accepted, and in that way get back on the right side with accepted work. Don't give up! |
Send message Joined: 2 May 11 Posts: 4 Credit: 103,649,424 RAC: 0 |
Today everything went well, I got new tasks. I've changed the additional work supply to 2 days (it was 10 days before). Cheers |
Send message Joined: 26 May 11 Posts: 568 Credit: 121,524,886 RAC: 0 |
today everything gone well, i got new tasks, i've changed supply additional WU on 2 days (10 days before) Good decision to slow down the job queue a little. For Moo there is no need to have so many days in a queue. Happy that your problems are solved. Greetings! |
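
For anyone who wants to see why a batch of aborted or errored tasks can starve a host of work, here is a minimal Python sketch of the quota behaviour as it was described in this thread: each invalid or errored result lowers the host's daily quota, and each valid result raises it by 2. This is only an illustration of that description, not the actual BOINC scheduler code. The `HostQuota` class, the one-per-failure decrement, the floor of 1, and the cap are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class HostQuota:
    """Hypothetical model of a per-host daily task quota (not real BOINC code)."""
    max_per_day: int  # assumed project-wide cap on tasks per day
    current: int      # tasks the host may still be sent per day

    def record(self, outcome: str) -> None:
        """Adjust the quota after one returned task."""
        if outcome == "valid":
            # Thread description: "goes up 2 for every 1 you return that is valid".
            self.current = min(self.max_per_day, self.current + 2)
        elif outcome in ("invalid", "error"):
            # Assumed penalty of 1 per bad result, never dropping below 1,
            # so the host can still fetch a task and prove it works again.
            self.current = max(1, self.current - 1)
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")


quota = HostQuota(max_per_day=100, current=100)
for _ in range(10):
    quota.record("error")   # ten aborted/errored tasks
errors_only = quota.current
for _ in range(3):
    quota.record("valid")   # three good results start the recovery
print(errors_only, quota.current)
```

Under these assumptions a host that aborts a large cache of tasks takes a while to climb back, which fits the advice above to keep the work buffer small so that tasks are finished rather than aborted.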