Posts by AMDave

1) Message boards : Number crunching : CPU Benchmark appeared to cause WU error (Linux_AMD64) (Message 804)
Posted 16 Jul 2011 by AMDave
Post:
That's ok, thanks.
I am a long time BOINCer and I am familiar with the forums.
The note is more for the admins (who can see hidden hosts).

I don't think there is a real issue with the Moo wrapper or the dnetc client app. I also can't see how this would be a BOINC Manager issue; if it were, we would see a lot more of this, and at this time we don't.

Point to note - this is a very VERY busy host so an issue from time to time is expected.

I'm posting FTR in case this happens to anyone else so it might be correlated.

The oddity is the orphaned process. In the scheme of things the wrapper should have shut it down at some point, but all relevant tasks in the time-frame were completed successfully, so I can't correlate it.

Anyway it is gone now. I'm watching to see if it happens again. I may write a bash script to tell me immediately when more than one dnetc task occurs on the host so that I can interrogate it and nail it down. But given the frequency I don't expect that to occur again for another week or two.
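
A minimal sketch of that kind of check, written in Python rather than bash. It assumes the client app's process name contains "dnetc" and that /proc/<pid>/comm is readable on the host; the function names and the limit of one process are hypothetical, chosen only for illustration.

```python
import os


def list_process_names(proc_root="/proc"):
    """Return the short command name of every running process (Linux only)."""
    names = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return names  # no /proc available on this system
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are process IDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                names.append(f.read().strip())
        except OSError:
            continue  # the process exited while we were scanning
    return names


def count_matching(names, needle="dnetc"):
    """Count process names that contain the needle (assumed client name)."""
    return sum(1 for name in names if needle in name)


def exceeds_limit(names, limit=1, needle="dnetc"):
    """Return (alert, count); alert is True when count is above the limit."""
    count = count_matching(names, needle)
    return count > limit, count


if __name__ == "__main__":
    alert, count = exceeds_limit(list_process_names())
    if alert:
        print("WARNING: %d dnetc processes running" % count)
```

Run from cron every minute or so, this would flag the moment a second dnetc process appears, which is when the orphan could be inspected.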

Thanks for your reply, though.
2) Message boards : Number crunching : CPU Benchmark appeared to cause WU error (Linux_AMD64) (Message 800)
Posted 15 Jul 2011 by AMDave
Post:
I have observed a couple of intermittent failed WUs on a Linux_AMD64 machine.

These failed WUs occurred very far apart.

I was unable to work out what exactly was causing the problem until one occurred while I was watching.

Normally I can manually suspend and resume a WU without any problems.

On this occasion the BOINC Manager suspended all projects to execute the scheduled CPU Benchmark cycle.

At the start of that event the WU failed.

It was approximately 80% through.

The box is host#769
OS: Ubuntu desktop 10.10
Kernel: Linux 2.6.35-28-generic
Arch: AMD64
RAM: 16GB
CPU: 6 core AMD Phenom(tm) II X6 1055T Processor [Family 16 Model 10 Stepping 0]
GPU: CAL ATI Radeon HD5700 series (Juniper) (1024MB) driver: 1.4.1332
S/W: (of note)
ia32libs
ati-stream-sdk-v2.2-lnx64
BOINC 6.10.58

Task: 1403959
WU: 1082519

I'm really not that concerned about the issue myself as it is so infrequent, but thought I should post the details of the event, in case it matters to someone.

[EDIT]
the event may also coincide with an orphaned client app process:
I just found an extra instance of the client app that had been running for 11+ hours.
It was unaccounted for by the BOINC Manager.
It is also not the process belonging to the WU that failed.
Also, this host shows no failed WU from approx 10 hours ago, when that orphaned process should have returned or failed (nor in the whole week prior).
Mysterious.
I killed the process and will resume monitoring.
[/EDIT]
3) Message boards : News : Different workunit sizes added (Message 525)
Posted 28 May 2011 by AMDave
Post:
I got the new WUs and they run 4 times longer on HD5700.
But it's not good.
The credit is averaging waaaay lower than the small wu's.
My completed and 'valid' stats are showing 400% more work but only 95% of the credit that I was getting for the small WUs.
That's so disappointing that I have to vote with my GPUs.
I'm ditching until that is sorted out.
I'll try again later.
4) Questions and Answers : Unix/Linux : Pretty Please :) (Message 294)
Posted 13 May 2011 by AMDave
Post:
Thank you +1 :)





 
Copyright © 2011-2024 Moo! Wrapper Project