Posts by Zydor

61) Message boards : Number crunching : DNET Fragmentation - When is it likely to End? (Message 2227)
Posted 14 Jan 2012 by Profile Zydor
Post:
Teemu - a question born out of frustration .... when does it seem likely DNET will get through this fragmented rubbish and get back to sending proper work units? I understand well enough that Moo can rattle through these far, far quicker than the DOS-based DNET CPU client, but there's a limit as to how long we should be used to clear up the rubbish.

If that's an unfair perception on my part, okie doke, but this fragmented garbage gets in the way of actual productivity. My comment is aimed at DNET, not you. I well recognise you have done much in the background to alleviate issues - the credit adjustment being just one - and mucho gracias for all that :)

But DNET are gaining a growing number of pins in my DNET voodoo doll for this fragmented rubbish rofl :)

Regards
Zy
62) Message boards : Number crunching : 7970s Dont Run (Message 2226)
Posted 14 Jan 2012 by Profile Zydor
Post:
It's the support given by 7970s (or more precisely zero support) to CAL apps. It's (at the moment) OpenCL or nothing - I can't see them going back on that as CAL is now a dead donkey and dropped.

The performance of the hardware though is phenomenal - the 7970 runs the MW NVidia OpenCL app faster than a 6970 runs the normal MW AMD CAL app! For the gaming side of the equation, sprint, don't walk, and get one! The hardware power is incredible.

As always, it's software that rules the day - there will be more of this as time goes on and the AMD transition to OpenCL takes hold.

Regards
Zy
63) Message boards : Number crunching : ATI 12.1 - Preview (Message 2217)
Posted 13 Jan 2012 by Profile Zydor
Post:
No, definitely stay on 12.1, just try an app_info to control it.

That one looks ok, but it's a very complete one; it's laying down just about every parameter an app_info can. Some of them I don't claim to be the world's expert on, and if wrong they could tip it over.

Suggest you try it with an empty queue, and preferences set to zero buffer; that way if it does fall over you don't zap dozens in your queue. You then just look at the parameters again (post the issue, no doubt others will pitch in to help - especially those running app_infos), change an errant one and try again.

Usually doesn't take long to sort it and get the app_info running, just keep the buffer at zero so errors don't take out your queue whilst sorting it out.

Mikey's suggestion of a separate app_info thread is a good one; you might like to post your first crack at the app_info as the first post in a new thread, "Moo app_info". As life proceeds it becomes a consolidated thread for app_info issues.

Don't forget to use Notepad to put the app_info together, that stops the erroneous characters appearing. You may initially get them if you use that one as a start point (open it in Notepad and if you see "-" (no quotes) at the start of lines, get rid of them: just navigate the cursor to the dash and delete it). The file name to use is app_info.xml, and you should save it to the Moo project directory inside the Program Data directory.

All sounds a bit of a mouthful I know (!), but in reality it's not. Just post issues as you go, Mikey and I will certainly help until you are running with it - no doubt others will as well as they come onto the forum.
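
For anyone who would rather not delete the dashes by hand, the clean-up described above could be sketched roughly as below. This is only an illustration: the function name and the sample text are made up, and it assumes the stray "-" always sits directly in front of an XML tag at the start of a line.

```python
def strip_stray_dashes(text):
    """Remove a stray leading '-' that sits directly in front of an XML tag."""
    cleaned = []
    for line in text.splitlines():
        stripped = line.lstrip()
        # Only touch lines of the form "- <tag>" or "-<tag>"; leave the rest alone.
        if stripped.startswith("-") and stripped[1:].lstrip().startswith("<"):
            indent = line[: len(line) - len(stripped)]
            line = indent + stripped[1:].lstrip()
        cleaned.append(line)
    return "\n".join(cleaned)


# Hypothetical pasted text, not a real Moo app_info.
sample = "- <app_info>\n  - <app>\n      <name>example_app</name>\n    </app>\n  </app_info>"
print(strip_stray_dashes(sample))
```

Check the result by eye before saving it as app_info.xml, since a dash that legitimately belongs in the file would need different handling.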

Regards
Zy
64) Message boards : Number crunching : ATI 12.1 - Preview (Message 2211)
Posted 13 Jan 2012 by Profile Zydor
Post:
I would try an app_info and specifically limit the CPU usage; that would keep it under your control, not the AMD driver or the core DNET app. AMD have largely nailed the excess CPU usage in 12.1, but perceptions are tainted by the default behaviour of the core DNET app, which is to use all available CPU cycles.

It's a little unclear in absolute terms what effect that default DNET behaviour has on us from the core app; however, the app_info will retain control at your end whatever it is.

In the app_info, set the average CPU usage to 0.05 and the max CPU usage to 1, and it should be ok.
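
As a rough illustration of where those two values go, the sketch below builds the relevant part of an app_info.xml. It assumes the usual BOINC anonymous-platform tag names `avg_ncpus` and `max_ncpus` inside `app_version`; the app name `example_app` and version number 100 are placeholders, not the real Moo values, and a working app_info also needs the file entries for the actual executable, which are left out here.

```python
import xml.etree.ElementTree as ET


def build_app_info(app_name, version_num, avg_ncpus, max_ncpus):
    """Build a minimal app_info fragment carrying the two CPU usage settings."""
    root = ET.Element("app_info")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "version_num").text = str(version_num)
    # Average and maximum CPU usage, as suggested in the post.
    ET.SubElement(version, "avg_ncpus").text = str(avg_ncpus)
    ET.SubElement(version, "max_ncpus").text = str(max_ncpus)
    return ET.tostring(root, encoding="unicode")


# "example_app" and 100 are hypothetical placeholders.
xml_text = build_app_info("example_app", 100, 0.05, 1)
print(xml_text)
```

Copy the two CPU lines into an app_info that already works rather than using this output as a complete file.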

Regards
Zy
65) Message boards : Number crunching : 7970s Dont Run (Message 2205)
Posted 12 Jan 2012 by Profile Zydor
Post:
7970 up and running at MW if anyone has one.

See the "7970 dont run" thread (ironically).

For the moment it's running an NVidia OpenCL app on the 7970s; instructions are at the end of that thread.

Regards
Zy
66) Message boards : Number crunching : 7970s Dont Run (Message 2186)
Posted 11 Jan 2012 by Profile Zydor
Post:
It's not really a surprise given the different (long term) architecture, GCN. It'll pan out in a couple of weeks as the points of failure are identified, especially as most BOINC AMD apps are maths based.

POEM runs ok on it, if you fancy a medical based model. Utilisation is low, unlike here where it's high (usually), so running an app_info is essential at POEM with a 7970 (POEM utilisation is 18% without an app_info using the 7970s there). It's pretty well the only one running for a 7970 at present.

Regards
Zy
67) Message boards : Number crunching : 7970s Dont Run (Message 2182)
Posted 10 Jan 2012 by Profile Zydor
Post:
The 7970s are not running at Moo.

Tried with 2 x 7970 in CrossFire on my 1090T box. They always get to 22 secs and fail with a communications error. I have tried detach and re-attach, still the same.

If a tester is needed for any recompiles etc, yell and I'll do it.

Regards
Zy
68) Message boards : Number crunching : CPU app only - minor issue with checkpointing (Message 2181)
Posted 10 Jan 2012 by Profile Zydor
Post:
One aspect of this ..... if you are going to close down BAM for any reason, suspend Moo first (make sure you have "suspended in memory" enabled), then shut down BAM.

It will avoid a failed WU on restart of BAM.

There are some issues around with checkpointing, Teemu is working on another Moo client version, but meanwhile if you suspend to memory before closing BAM it will avoid a big part of the checkpoint failures.

Regards
Zy
69) Message boards : Number crunching : Moo tasks ignore BOINC CPU preferences (Message 2169)
Posted 10 Jan 2012 by Profile Zydor
Post:
It's not a Moo issue, it's the way BOINC is designed, and the way a facility is expressed.

The "use at most x% of a cpu" setting is not an exact science. That switch is there to limit the number of cores used in processing. It will therefore default to (for a quad) 25%, 50%, 75% or 100% (one, two, three or four cores): whatever value you set, it chooses what it considers to be the closest number of cores to the percentage you entered.

In no way does the facility give the precise percentage option, that's a coding nightmare to end coding nightmares. The text painted to screen could be a little clearer, that's for sure, but that doesn't change the underlying use of that switch. So set a percentage of (say) 60% and you'll get two out of four cores on a quad enabled for BOINC, 27% will get one, etc etc.

It is in effect a crude switch to limit the number of cores used in BOINC processing. We can all think up other preferences or wish lists for it based on the text painted to screen - hence it was somewhat unfortunate BOINC chose that form of text explanation painting to screen, but that text, or our personal wish list, will not alter what it's coded to do - limit the cores used, at a granularity of whole cores.
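
The mapping described above can be sketched as below. This is a minimal illustration, assuming the "closest number of cores" wording means rounding to the nearest whole core with a floor of one; the function name is made up, and the real BOINC client may round differently (for example, always down).

```python
def cores_enabled(percent, total_cores):
    """Map a percentage setting to a whole number of cores (assumed: nearest, minimum 1)."""
    nearest = int(percent / 100 * total_cores + 0.5)
    # Never go below one core or above what the machine has.
    return max(1, min(total_cores, nearest))


# The quad-core examples from the post.
for percent in (27, 60, 100):
    print(percent, "->", cores_enabled(percent, 4), "core(s)")
```

On a quad this gives one core for 27% and two for 60%, matching the examples in the post; whatever the exact rounding rule, the setting only ever moves in steps of whole cores.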

Regards
Zy
70) Message boards : Number crunching : strange complete times being seen (Message 2153)
Posted 8 Jan 2012 by Profile Zydor
Post:
I'd think about getting an SSD for the C: drive on the 1090T. The Corsair Force 3 GT 180GB or especially the Corsair Performance Pro 256GB. Good drives with an excellent balance between performance and cost. It's a well worthwhile longer term investment, transferable to any machine as you progress through rebuilds.

I started using an SSD on my main box a month ago - brilliant, it's like being in a new world - the PC and applications think their birthday came early. I would recommend investing in Diskeeper Pro Premier with Hyperfast ( http://www.diskeeper.com ) if you go that route, as it will very significantly cut down on disk I/O, which is the great enemy of SSDs - disk I/O being the yardstick to which the life of an SSD is held hostage.

Still need the existing mechanical drives for data storage, but putting windows and programs on an SSD is like being sent to heaven rofl :)

Regards
Zy
71) Message boards : Number crunching : Credits (Message 2148)
Posted 8 Jan 2012 by Profile Zydor
Post:
I blame global warming ....
72) Message boards : Number crunching : strange complete times being seen (Message 2140)
Posted 7 Jan 2012 by Profile Zydor
Post:
Of course I'm running a 2600k and not a 1090T so it's not exactly helpful

Thanks for that - my main machine is a 3960X so that was a useful comparator.

Regards
Zy
73) Message boards : Number crunching : Sapphire 6970 Poor gpu usage. (Message 2134)
Posted 7 Jan 2012 by Profile Zydor
Post:
Moo is usually good on utilisation, circa 96-99%, depending on what else is going on with the machine.

However, at present we are going through a large batch of fragmented WUs sent down by DNET Upstream. It means that the Stat Units inside the WU are split into a larger number of groups, with fewer Stat Units per group. This has the effect of not fully utilising all cores all the time, and it pays to lower the GPU clocks a tad to account for the wide variation in loading and Stat Unit numbers.

The resultant slowdown from the norm is around 15%, so Teemu has added 15% credits to each WU as compensation whilst we get through them - which will be a while yet.

So you will see wild swings in utilisation, and an overall slowdown from Moo norms, but the credits are compensated accordingly. Just look carefully at temperatures; you may have to reduce the GPU clocks slightly so that it copes with mega fragmentation, as well as the few normal WUs flowing through.

Or to put it another way, configure your GPU to run the best it can against a normal WU, that way it copes, albeit slower, with a fragmented one. If you force high utilisation on fragmented ones, and a normal one comes along you will likely overheat the gpu crunching the normal one - hence size everything for big fragmentation at slower clocks. The 15% increase in credits for fragmentation compensates.

Regards
Zy
74) Message boards : Number crunching : Credits (Message 2131)
Posted 7 Jan 2012 by Profile Zydor
Post:
It changed today ....

It's now (for a 768 WU) 768 Stat Units x 8 per Stat Unit = 6144

If it's a fragmented WU, add 15%, which for a 768 WU makes it 7065.6

(Substitute the Stat Units in the WU you are looking at, instead of 768, to compare)
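
For substituting other WU sizes, the sum above can be written as a small function. This is just a sketch of the arithmetic in this post; the names are made up, and the 8 per Stat Unit and 15% figures are the ones quoted here, which may change again.

```python
CREDIT_PER_STAT_UNIT = 8   # figure quoted in the post
FRAGMENT_BONUS = 0.15      # 15% extra for a fragmented WU


def wu_credit(stat_units, fragmented=False):
    """Credit for a WU of the given size, with the bonus applied if it is fragmented."""
    base = stat_units * CREDIT_PER_STAT_UNIT
    if fragmented:
        return base * (1 + FRAGMENT_BONUS)
    return base


print(wu_credit(768))                  # normal 768 WU
print(round(wu_credit(768, True), 1))  # fragmented 768 WU
```

Swap 768 for the Stat Unit count of the WU you are looking at.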

Regards
Zy
75) Message boards : Number crunching : strange complete times being seen (Message 2121)
Posted 6 Jan 2012 by Profile Zydor
Post:
I'd be interested to know - once all is settled down for you re the new box - how well behaved those WCG Projects you listed are on the 1090T in terms of their effect on the Moo WU, if running concurrently with Moo on all six cores. Not run those yet, but will do at some point, so it would be nice to know the effect on 1090T/Moo.

Regards
Zy
76) Message boards : Number crunching : Credits (Message 2114)
Posted 6 Jan 2012 by Profile Zydor
Post:
It's about right now, depends on Teemu's final goal. He was going to reduce credit per stat unit to 8 from 8.5. The 20% bonus to help with the fragmented WUs would remain in either case. So for a default Huge sized WU - 768 - it's either going to end up:

768 x 8 plus 20% = 7372.8

or

768 x 8.5 plus 20% = 7833.6

I suspect 8 per is still the intention, so it likely will end up 6144 plus 20% = 7372.8
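
The two scenarios can be checked side by side with a few lines of code. A minimal sketch, with a made-up function name, using only the figures quoted in this post (768 Stat Units, 8 or 8.5 per unit, 20% bonus):

```python
def credit_with_bonus(stat_units, per_unit, bonus_percent):
    """Base credit plus a percentage bonus on top."""
    base = stat_units * per_unit
    return base + base * bonus_percent / 100


# Compare the two candidate rates for a Huge (768) WU.
for per_unit in (8, 8.5):
    total = credit_with_bonus(768, per_unit, 20)
    print(f"768 x {per_unit} plus 20% = {round(total, 1)}")
```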

Regards
Zy
77) Message boards : Number crunching : strange complete times being seen (Message 2112)
Posted 6 Jan 2012 by Profile Zydor
Post:
The clue is in the last number of the WU ... _110_193

110 is the temporary norm indicator of fragmented WUs, and the 193 the usual Stat Unit size indicator. The huge WUs are 768 compared to 193.

The server will settle and uprate the WU size once it's got used to what the beast can do.

I found with the 1090T that it pays to leave a core "free" if using one of the CPU hungry driver versions. I use PRPNet or the Clean Energy Project Phase 2 at WCG for that on the 1090T on my second box, as you get precise per CPU core control over what is loaded and crunched.

Worth looking at, especially CEP2, as it has an excellent control mechanism inside its preferences for precise control over the number of cores it uses. PRPNet effectively does a similar control service, but by accident rather than design as you use DOS boxes - it does take circa 2 weeks for the PRPNet credits to flow through to BOINC, but they do flow even if delayed sometimes on conversion. CEP2 has no such issues as it's a standard BOINC level Project in WU terms. CEP2 needs 1.5GB memory free to service 5 WUs, so be careful on that in relation to the overall call on main system memory - other than that CEP2 does not tax the 1090T particularly hard, so it's a nice one to run on a 1090T.
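
Pulling those last two numbers off a WU name could be sketched like this. The WU name used is hypothetical and only the `_110_193` tail comes from the post; the sketch assumes the name always ends in two underscore-separated numbers.

```python
def parse_wu_suffix(wu_name):
    """Return (indicator, stat_units) taken from the last two '_' fields of a WU name."""
    parts = wu_name.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected WU name format: {wu_name!r}")
    return int(parts[1]), int(parts[2])


# "example_wu" is a made-up prefix; only the tail matches the post.
indicator, stat_units = parse_wu_suffix("example_wu_110_193")
print(indicator, stat_units)
```

The second number is the one to plug into any credit sums for that WU.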

Regards
Zy
78) Message boards : Number crunching : Project Overload (Message 2111)
Posted 5 Jan 2012 by Profile Zydor
Post:
Teemu provides a good service as admin, particularly noteworthy as there is only one physical server. He always jumps on the truly significant issues where possible, leaving the not important / cosmetic / epeen for later updates. It is one of the most stable projects in BOINC with rare downtime. There are a couple of significant issues he cannot do anything about (namely fragmentation from Upstream, incorrect driver timing reporting, and CPU loading in later 11.XX drivers; the latter two AMD have resolved in 12.XX).

Recently the incoming wave of credit chasers resulting from the Santa bonus stretched the server to its limit - 50% more than its normal load, and it took the hit well, until the last few days when it filled up to capacity. It's eased off 20% and is still reducing now that Santa's tail lights are a dim memory, so I anticipate it will be back to normal good service as the chasers move on.

There is always light traffic on the boards in non credit chaser times as there is usually little of genuine Project related significance to yack about - a good sign :)

Regards
Zy
79) Message boards : Number crunching : Low CPU utilization with differing GPU models (Message 2083)
Posted 3 Jan 2012 by Profile Zydor
Post:
More observation than insight I guess. Both boxes were ok (or seemed to be) after the rebuilds, but then seemed to decline and fall over.

I switched my No2 box from 12.1 to 11.9, no change or difference, so left it there on 11.9. It has a single 5850 in it, and the driver hassles are mainly multi-GPU. With this box, it turns out the main problem was the "non-obvious" system drivers, the background ones. I ran a Driver Updater through it and it picked up a dozen candidates, and I let it go for the lot. Worked a treat. Been stable ever since.

My main box (twin 5970) was on 12.1, switched to 11.9, and if anything it was worse. I also lost the multi-GPU improvements in 12.1 doing that (fixed CPU usage bug, correct elapsed time reporting on WUs etc). So I switched back to 12.1 and got it going ok in the end by doing a real clean hard sweep with Driver Cleaner and a Registry Cleaner; it's stable now.

So.... was it 12.1 all along? In truth, I don't think so. Main box turned out to be a fragmented graphics driver that needed cleaning out (again), and No2 box needed some system driver updates. It's easy to point fingers at the driver saying "not me guv", but in truth this time I think it was my fault. See how it goes, but both boxes have been stable now for hours, so I think all is well again.

Regards
Zy
80) Message boards : Number crunching : 2 WU on cuda cards? (Message 2079)
Posted 3 Jan 2012 by Profile Zydor
Post:
GPUs can only run one thread, that's all that's physically designed into them, so they can only do one task at a time. If two are present, it shares time between both WUs; it physically can't do anything else.

I used to run NVidia in 9800GTX+ days and going back to when they first started, and it didn't work then either - I moved to AMD after the final Fermi farce - it may be something in current NVidia architecture that gives some room, but it will not be crunching both, it can only crunch one at a time.

The other major factor is that, as pointed out above, Moo is a multi-thread GPU app designed to use up all GPU space as it becomes available. There is an issue with part of that at present as Upstream are feeding us fragmented WUs, and the usual allocation tailored to card type is hard to impossible until "clean" WUs start again.

That's why Teemu gave an extra 20% as a temporary measure whilst we battled through the fragged ones. That special case may give some temporary space for gain ... don't know until you try I guess, but you will need to use a special app_info to prevent the Moo WU grabbing all GPUs as it's designed to do.

Bear in mind, if this is going for NVidia cards, that NVidia works sloooow here .... not the best of ideas to run them at Moo, unless there is a particular personal reason.

Regards
Zy


Copyright © 2011-2024 Moo! Wrapper Project