Big WU´s


Message boards : Number crunching : Big WU´s
Bernt
Joined: 26 May 11
Posts: 568
Credit: 121,524,886
RAC: 0
Message 2670 - Posted: 15 Feb 2012, 15:53:17 UTC

Does anybody have any idea why we have been receiving these small WUs for such a long period?
I think it is a bit of a waste of GPU capacity. Maybe Teemu has an idea about this?

Teemu Mannermaa
Project administrator
Project developer
Project tester
Joined: 20 Apr 11
Posts: 388
Credit: 822,356,221
RAC: 0
Message 2701 - Posted: 18 Feb 2012, 16:05:40 UTC - in response to Message 2670.  

The scheduler seems to think you are slower these days, which is most likely due to changes I've made. I recently lowered the GFLOPS limit for getting Huge WUs from 500 to 200, and I'm seeing a lot more people getting bigger work now.

I do agree that fast cards should get bigger units, and I intend to optimize these sizes further sometime in the future. Hopefully this quick fix helps in the meantime. (That includes server load, since the clients don't need to fetch work so often. Not that we are having big problems load-wise, but still..)

-w
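
The threshold change described in the post above could look roughly like the sketch below. This is only an illustration and not the project's actual scheduler code: the function name `pick_wu_size`, the two size labels, and the `>=` comparison are assumptions. Only the 500 and 200 GFLOPS figures come from the post.

```python
# Hypothetical sketch of a GFLOPS threshold for work unit sizes.
# Only the limits 500 and 200 come from the thread; the function,
# the size labels and the ">=" comparison are assumptions.

def pick_wu_size(estimated_gflops: float, huge_limit: float = 200.0) -> str:
    """Return 'huge' when the scheduler's speed estimate reaches the limit."""
    return "huge" if estimated_gflops >= huge_limit else "small"


# A card the scheduler estimates at 350 GFLOPS, before and after the change.
before = pick_wu_size(350.0, huge_limit=500.0)
after = pick_wu_size(350.0, huge_limit=200.0)
print(before, after)
```

With the old limit the 350 GFLOPS card falls below the cut-off and gets small work; with the lowered limit the same estimate qualifies for Huge WUs.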

mikey
Joined: 22 Jun 11
Posts: 2080
Credit: 1,844,401,480
RAC: 3,144
Message 2712 - Posted: 19 Feb 2012, 12:06:33 UTC - in response to Message 2701.  

I was thinking about the problem of the software not figuring out which unit to send to which GPU... Can you set up a small list, so that when work is requested the server checks whether YOUR GPU is on it and sends work based on that? After the initial break-in period of small units, say 7 days, you would have one list that decides which size of workunit you get, depending on whether YOUR individual GPU is on it: on it, get a big one; not on it, get the smaller ones. I was thinking that you, as an admin, could then move GPUs on or off the list based on what you see on your end.
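
The list idea in this post could be sketched as follows. It is a minimal illustration under stated assumptions, not anything the project is known to run: the names `pick_size`, `BREAK_IN` and `big_list`, and the host identifiers, are all hypothetical. Only the 7-day break-in period and the "on the list gets big, off the list gets small" rule come from the post.

```python
from datetime import date, timedelta

# "say 7 days" from the post; everything else here is hypothetical.
BREAK_IN = timedelta(days=7)


def pick_size(gpu_id: str, first_seen: date, today: date, big_list: set) -> str:
    """Choose a work unit size from an admin-maintained list of GPUs."""
    # During the break-in period every GPU gets small units.
    if today - first_seen < BREAK_IN:
        return "small"
    # Afterwards, membership of the admin's list decides the size.
    return "big" if gpu_id in big_list else "small"


big_list = {"host-9291"}  # hypothetical identifier the admin has added
today = date(2012, 2, 19)

print(pick_size("host-9291", date(2012, 2, 1), today, big_list))   # on the list
print(pick_size("host-1234", date(2012, 2, 1), today, big_list))   # not on the list
print(pick_size("host-9291", date(2012, 2, 15), today, big_list))  # still breaking in
```

Moving a GPU on or off the list is then just adding or removing its identifier from `big_list`.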

Bernt
Joined: 26 May 11
Posts: 568
Credit: 121,524,886
RAC: 0
Message 2715 - Posted: 19 Feb 2012, 20:33:21 UTC - in response to Message 2701.  

It is strange: just after I made the post I found that I was getting bigger WUs.
Maybe you waved your magic wand, Teemu. Anyhow, it is OK now. Thanks.


HGW
Joined: 3 May 11
Posts: 5
Credit: 143,962,508
RAC: 0
Message 2919 - Posted: 19 Mar 2012, 10:42:56 UTC - in response to Message 2715.  
Last modified: 19 Mar 2012, 10:43:39 UTC

???
19 Mar 2012 | 8:52:16 UTC Aborted by user 49,056.50 24.20 --- Distributed.net Client v1.03 (ati14)

SLAYER OF DEATH
Joined: 12 Jul 11
Posts: 112
Credit: 229,191,777
RAC: 0
Message 2979 - Posted: 1 Apr 2012, 19:35:02 UTC

Are fragmented WUs rearing their ugly heads again??
Rig 9291, my fastest rig, is getting hammered by them.
It's not too bad on the others, but they are still getting them, as I am sure everyone is. This has been going on for a while with sporadic WUs, but only recently have I been seeing more of them.
THIS IS NO APRIL FOOLS JOKE...OR IS IT...

HGW
Joined: 3 May 11
Posts: 5
Credit: 143,962,508
RAC: 0
Message 2980 - Posted: 2 Apr 2012, 8:49:04 UTC - in response to Message 2979.  

???

1 Apr 2012 | 23:55:14 UTC Error while computing 51,922.05 20.30 --- Distributed.net Client v1.03 (ati14)

mikey
Joined: 22 Jun 11
Posts: 2080
Credit: 1,844,401,480
RAC: 3,144
Message 2981 - Posted: 2 Apr 2012, 11:35:12 UTC - in response to Message 2980.  
Last modified: 2 Apr 2012, 11:36:28 UTC

I don't understand.

SLAYER OF DEATH
Joined: 12 Jul 11
Posts: 112
Credit: 229,191,777
RAC: 0
Message 2982 - Posted: 2 Apr 2012, 12:23:42 UTC
Last modified: 2 Apr 2012, 12:24:38 UTC

Something was going on?? GPU reset? Every second? That's just a little bit of the log; it went on for a while. I don't know..
That 58xx is screaming though! 1520 sec avg.
He had 2 WUs like that and aborted quite a bit.
<stderr_txt>
after a pause...
[Apr 01 23:41:51 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:52 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:53 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:54 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:55 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:56 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:57 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:58 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:59 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:42:00 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:42:01 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
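
To check the "every second?" impression against a longer log, the gaps between restart messages can be measured with a short script. This is a sketch only: the function name `restart_gaps` is made up for illustration, and since the log lines carry no year, 2012 is assumed from the date of the post.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Apr 01 23:41:51 UTC] Thread 0 was terminated in unexpected way. ...
STAMP = re.compile(
    r"^\[(\w{3} \d{2} \d{2}:\d{2}:\d{2}) UTC\] Thread \d+ was terminated"
)


def restart_gaps(lines, year=2012):
    """Return the seconds between consecutive 'terminated' messages.

    The year is an assumption; the log itself does not record one.
    """
    times = []
    for line in lines:
        match = STAMP.match(line)
        if match:
            stamp = f"{year} {match.group(1)}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


log = [
    "[Apr 01 23:41:51 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...",
    "[Apr 01 23:41:52 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...",
    "[Apr 01 23:41:53 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...",
]
print(restart_gaps(log))  # [1.0, 1.0]
```

For the excerpt above every gap is one second, so the thread was being restarted once per second for as long as the messages continued.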

mikey
Joined: 22 Jun 11
Posts: 2080
Credit: 1,844,401,480
RAC: 3,144
Message 2985 - Posted: 3 Apr 2012, 9:54:56 UTC - in response to Message 2982.  

I wish my 5870s would do that!! I just put 2 in the same machine and my RAC is going DOWN on that machine!!! It used to have one 5870 in it, now it has 2! I am hoping it is short term!!

DigitalDingus
Joined: 2 May 11
Posts: 54
Credit: 117,821,513
RAC: 0
Message 2989 - Posted: 4 Apr 2012, 1:29:41 UTC - in response to Message 2979.  

Too bad we can't give the small WUs to people who just joined. Then, after 3 months or so, they would pass the "initiation phase".

SLAYER OF DEATH
Joined: 12 Jul 11
Posts: 112
Credit: 229,191,777
RAC: 0
Message 2990 - Posted: 4 Apr 2012, 2:43:45 UTC - in response to Message 2989.  

The WUs with (triple digit) XXX_XXX do not have the same effect we were seeing when they first came out. They still finish in approx. 28-30 min, and GPU work still drops around 88%. It's all good.
On a side note, summer is on here in North Georgia, and there is no way I can crunch and use some A/C. So for right now I'll crunch 2 rigs at night, 14 hours on and 10 off, and try leaving my fastest rig up 24/7. That should make the ole energy bill look good.

HGW
Joined: 3 May 11
Posts: 5
Credit: 143,962,508
RAC: 0
Message 2996 - Posted: 4 Apr 2012, 12:35:16 UTC - in response to Message 2981.  

Here is more info:
<core_client_version>7.0.20</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
<stderr_txt>
after a pause...
[Apr 01 23:41:51 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:41:52 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
...
[Apr 01 23:53:12 UTC] Thread 0 was terminated in unexpected way. Restarting after a pause...
[Apr 01 23:53:13 UTC] *Break* (found exit flag file)
[Apr 01 23:53:13 UTC] RC5-72: Saved CE:E2A3B000:00000000:64*2^32 (36.50% done)
0.03:23:30.73 - [8,125,583 keys/s]
[Apr 01 23:53:13 UTC] RC5-72: 1 packet (64.00 stats units) is in in.r72
[Apr 01 23:53:13 UTC] RC5-72: 12 packets (731.00 stats units) are in out.r72
[Apr 01 23:53:13 UTC]

Sorry, I don't speak English.

mikey
Joined: 22 Jun 11
Posts: 2080
Credit: 1,844,401,480
RAC: 3,144
Message 3000 - Posted: 4 Apr 2012, 22:53:53 UTC - in response to Message 2996.  

Your English is a WHOLE lot better than my German!!!!!!!!!!!

One of the error messages is "Maximum disk usage exceeded"; to me that means you should increase the amount of disk space available to BOINC. To do that, open the BOINC Manager (its icon is down by the clock), go to the 'disk and memory usage' tab, and change the values in the 'use at most', 'leave at least' and 'use at most' boxes.

HGW
Joined: 3 May 11
Posts: 5
Credit: 143,962,508
RAC: 0
Message 3005 - Posted: 5 Apr 2012, 18:08:35 UTC - in response to Message 3000.  

Thanks for this information.

mikey
Joined: 22 Jun 11
Posts: 2080
Credit: 1,844,401,480
RAC: 3,144
Message 3006 - Posted: 5 Apr 2012, 19:26:41 UTC - in response to Message 3005.  

I just hope it works!!


 
Copyright © 2011-2024 Moo! Wrapper Project