Posts by Teemu Mannermaa

41) Message boards : Number crunching : Doing 2 WU's instead of 1 (Message 6870)
Posted 4 Dec 2015 by Teemu Mannermaa
Post:
For Seti we were able to modify the programming so it does multiple WU's instead of 1 by switching settings from 1 to .5. Is that possible here?


Not really, unless one uses BOINC's app_info.xml/app_config.xml features to change the resource estimates. However, a single Dnet Client WU should already be using a CPU and GPU almost fully, so running additional WUs at the same time wouldn't make things any faster.
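For anyone who wants to experiment anyway, here is a rough sketch of what the "1 to .5" setting looks like in BOINC's app_config.xml, written as a small Python script that generates the file contents. The app name "example_app" is only a placeholder (the real short name of the app has to be looked up in the client's own files), and the cpu_usage value is an arbitrary assumption.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Return app_config.xml text telling the BOINC Client how much of a GPU
    and how many CPUs each task of the given app is assumed to use."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # gpu_usage 0.5 means "half a GPU per task", so two tasks share one GPU.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):  # pretty-printing exists in Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# "example_app" is a placeholder, not the project's real app name.
config_text = build_app_config("example_app", gpu_usage=0.5, cpu_usage=1.0)
print(config_text)
```

The output would go into app_config.xml in the project's directory under the BOINC data directory. As noted above, this only changes how many tasks BOINC schedules at once; it does not make a fully loaded GPU any faster.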

-w
42) Questions and Answers : Unix/Linux : 195 (0xc3) EXIT_CHILD_FAILED (Message 6854)
Posted 25 Nov 2015 by Teemu Mannermaa
Post:
However, all I get is the task runs for 16/17 seconds then fails:

195 (0xc3) EXIT_CHILD_FAILED

It's not a new problem. See: http://moowrap.net/forum_thread.php?id=200&postid=2225#2225


That particular problem has been solved by using OpenCL apps for newer cards. Not sure why BOINC Client/Server are not preferring OpenCL apps for your host.

However, that error code might have some other cause. Looking at your recent error results, the error seems to be "Unable to initialize problem manager. Quitting...", which is a slightly different problem. The Dnet Client is also not able to detect any ATI cards on the system.

In any case, switching to the OpenCL app would be the solution, but we'd have to find a way to make BOINC Client/Server prefer that, and only that. Currently I'm not sure how to do that.. :(

-w
43) Questions and Answers : Windows : Is there a 64 bit WU for my GPU? (Message 6853)
Posted 25 Nov 2015 by Teemu Mannermaa
Post:
The server only offers Distributed.net Client 1.04 windows_intelx86 (ati14)
I'll setup an app_info.xml to force the 64 bit app if there is one.


There's no real benefit in having the wrapper/client logic parts as 64-bit for GPU clients. This is because the actual work happens inside the GPU, which is unaffected by the bitness of the executable. That's why only 32-bit apps are available for GPU crunchers.

Even if I wanted to provide a 64-bit wrapper for GPUs, upstream only provides 32-bit GPU clients anyway. Therefore, I'd recommend not worrying about this.

Is there some particular reason you'd want a 64-bit wrapper/client?

-w
44) Questions and Answers : Web site : Cannot upload avatar, nor create profile (Message 6852)
Posted 25 Nov 2015 by Teemu Mannermaa
Post:
Getting this error message when uploading an avatar:
Fatal error: Call to undefined function gd_info() in /home/boincadm/projects/moo/html/inc/image.inc on line 23

And this when creating a profile:
Fatal error: Call to undefined function imageCreateFromJPEG() in /home/boincadm/projects/moo/html/user/create_profile.php on line 128


Interesting.. I wonder if there's something wrong with JPEG avatars. Can anybody else confirm they have managed to create an avatar/profile lately and, if so, whether the picture was in JPEG format or something else?

Also, I would be interested to hear the results of retrying with a PNG image.

-w
45) Message boards : News : Short outage (Message 6719)
Posted 18 Sep 2015 by Teemu Mannermaa
Post:
We had a few hours of outage today due to network connectivity issues at our server hosting provider that affected our DB server. Things should be stabilizing again barring any pending network maintenance.
46) Message boards : News : Fresh work available! Come get yours! (Message 6662)
Posted 22 Jul 2015 by Teemu Mannermaa
Post:
Keymaster is back online, at least temporarily pending more hardware replacements, so we are now generating fresh work for your hungry computers. Come get some and get your crunch on!

Thanks for your patience and understanding during this extended work outage caused by the upstream keymaster hardware failure.
47) Message boards : News : Out of work (Message 6660)
Posted 22 Jul 2015 by Teemu Mannermaa
Post:
It's been a week, can we get an update?


Dnet volunteer staff was working on the keymaster hardware issue and they managed to get it back alive, at least temporarily pending further hardware replacements. Sounds like there was at least a fried memory module that brought keymaster down.

I'm just starting the work generators to get fresh work out there. Also I'm bumping our buffer levels a bit. :)

Looks like others answered the other questions raised here, thanks for those!

-w
48) Message boards : Cafe : Last person to post # 19 (Message 6651)
Posted 19 Jul 2015 by Teemu Mannermaa
Post:
Hey, I like to join in whining! I love to whine. Not so much winning, though.

-w
49) Message boards : News : Out of work (Message 6648)
Posted 12 Jul 2015 by Teemu Mannermaa
Post:
There's no more work available at the moment due to the distributed.net keymaster hardware failure. Our local cache, which lasted for about two days, has now also been depleted. So let's finish what we have and then move on to backup projects (yes, they are always a good idea to have) while we wait for the distributed.net staff to repair the keymaster. Rest assured, they are working hard to get the keymaster back up. As soon as that happens, we'll get fresh work out for our hungry systems to crunch.

For more details and latest developments, please read http://blogs.distributed.net/. Thanks for your patience!
50) Message boards : Number crunching : Moo use my intel gpu, not my amd gpu (Message 6646)
Posted 11 Jul 2015 by Teemu Mannermaa
Post:
Hello!

I explained what's going on in post http://moowrap.net/forum_thread.php?id=414&postid=6645. At least the problem is clear (miscommunication between BOINC and the Dnet Client we use for actual crunching), but fixing it might get tricky.

-w
51) Message boards : Number crunching : Nvidia GPU workunits won't start (Message 6645)
Posted 11 Jul 2015 by Teemu Mannermaa
Post:
The application is trying to use my Intel GPU instead of NVidia GPUs.


Aha! Thanks for this information, at least now we know what the problem is. The fix for this might be a bit involved, though. Meanwhile, I'll probably try to get newer CUDA builds with our new app for Windows built and deployed, as those will use one device at a time and won't suffer from the scheduling problem.

This stems from the two different views on the OpenCL devices on the system. BOINC only sees the nVidia devices while D.net Client sees all the OpenCL capable devices and thus device 0 is different. :(
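To illustrate the mismatch, here is a minimal sketch. The device list and its ordering are made up for the example (the real enumeration order depends on the drivers), and the two functions only mimic the two views described above; they are not actual BOINC or D.net Client code.

```python
# Hypothetical machine: one Intel integrated GPU plus two nVidia cards.
# The ordering is an assumption chosen to reproduce the reported symptom.
ALL_OPENCL_DEVICES = [
    {"vendor": "Intel", "name": "integrated GPU"},
    {"vendor": "NVIDIA", "name": "discrete GPU A"},
    {"vendor": "NVIDIA", "name": "discrete GPU B"},
]


def boinc_view(devices):
    """BOINC-style view: only the nVidia devices, numbered from 0."""
    return [d for d in devices if d["vendor"] == "NVIDIA"]


def dnet_view(devices):
    """D.net-style view: every OpenCL-capable device, numbered from 0."""
    return list(devices)


# BOINC assigns the task to "device 0" in its own numbering...
assigned_index = 0
intended = boinc_view(ALL_OPENCL_DEVICES)[assigned_index]
# ...but the client looks that index up in its longer list.
actual = dnet_view(ALL_OPENCL_DEVICES)[assigned_index]

print("BOINC meant:      ", intended["vendor"], intended["name"])
print("D.net Client used:", actual["vendor"], actual["name"])
```

With this ordering the same index 0 points at an nVidia card in one list and at the Intel GPU in the other, which matches the symptom reported above.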

-w
52) Message boards : Number crunching : Nvidia GPU workunits won't start (Message 6643)
Posted 10 Jul 2015 by Teemu Mannermaa
Post:
After a few seconds of running the task progress bar stops at 17.567% and no further progress is made..


Oh no. :( Can you find the stderr.log file in the slot directory from a task that's stuck and post it either here or PM it to me? (Note that the log file might contain your Distributed.net ID which you might not want to disclose publicly.)

-w
53) Message boards : News : Disk maintenance and work generation woes (Message 6642)
Posted 10 Jul 2015 by Teemu Mannermaa
Post:
The project was down today between 18:00 and 21:30 EEST (that's from 15:00 UTC/7:00 PST to 18:30 UTC/11:30 PST) for about three and a half hours while the previously failing disk was swapped with a backup disk and the data was copied over. Now the project is running from a disk that's not showing signs of collapsing due to read errors. The backup disk is as old as the failed disk but hasn't had that much use, so it should last until the project is migrated to a new server with SSD disks later this year.

This happened a bit unannounced as I took advantage of the D.net keymaster problems, which seem to slow down our work generation for some reason. That's also why we ran out of work before the maintenance and why we still don't have full work buffers. Hopefully that will fix itself once the keymaster is back in action. In any case, our local proxy will eventually run out of work unless the keymaster is resurrected.

For more information about the keymaster failure, please read http://blogs.distributed.net/2015/07/10/04/28/bovine/. Thanks and happy crunching once the dust settles!
54) Message boards : Number crunching : Nvidia GPU workunits won't start (Message 6639)
Posted 10 Jul 2015 by Teemu Mannermaa
Post:
Here is a link to my computer ...
http://moowrap.net/show_host_detail.php?hostid=185032


Right, thanks for the host #. I've blacklisted the nVidia CUDA for that host for now so you should get the nVidia OpenCL work units now. Hope they work better. Please reset the project and fetch new work to test, thanks!

The problem with getting nVidia OpenCL work in the first place is that BOINC Server thinks that the OpenCL app is much slower than the CUDA one for your host. It obviously doesn't take into account that CUDA doesn't start at all now. :(

Specifically, the scheduler says: "Comparing AV#38 (43.01 GFLOPS) against AV#26 (178.36 GFLOPS)" where AV#38 is the OpenCL and AV#26 is the CUDA. The CUDA speed comes from the measured time of successfully returned work, while the OpenCL speed is a guess by the server. I've been trying to make it guess better but have not fully succeeded yet.
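Roughly speaking, the choice works like the sketch below. This is a simplified illustration and not the actual BOINC Server code: the class and function names are made up, and only the two GFLOPS figures and AV numbers come from the scheduler log line quoted above.

```python
from dataclasses import dataclass


@dataclass
class AppVersion:
    av_id: int
    label: str
    projected_gflops: float


def pick_app_version(candidates, blacklisted_ids=frozenset()):
    """Pick the usable app version with the highest projected speed.

    Whether a version actually starts on the host is not considered at all,
    which is why a broken but "fast" version can keep winning.
    """
    usable = [av for av in candidates if av.av_id not in blacklisted_ids]
    if not usable:
        return None
    return max(usable, key=lambda av: av.projected_gflops)


candidates = [
    AppVersion(38, "OpenCL", 43.01),  # server-side guess
    AppVersion(26, "CUDA", 178.36),   # measured from returned work
]

default_choice = pick_app_version(candidates)
after_blacklist = pick_app_version(candidates, blacklisted_ids={26})

print("default pick:        AV#%d (%s)" % (default_choice.av_id, default_choice.label))
print("with CUDA blacklisted: AV#%d (%s)" % (after_blacklist.av_id, after_blacklist.label))
```

Blacklisting the CUDA version for one host simply removes it from the candidate list, so the OpenCL version wins despite its lower estimate.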

-w
55) Questions and Answers : Android : Client doesn't get enough tasks (Message 6634)
Posted 6 Jul 2015 by Teemu Mannermaa
Post:
On my Nexus 4, job cache is around 24 tasks, but only 12 tasks on (more powerful) Nexus 7. The log says "Not requesting tasks: don't need (job cache is full)".


I tried to find the setting myself too and failed. I guess BOINC devs think it's too advanced for Android users or something. :(

It's true that the normal computing prefs are not used by Android devices. They only listen to locally configured settings, which are quite limited (one can't even set up remote management). NativeBOINC does have more of the normal configuration options but its BOINC Client version is getting old.

It's the BOINC Client that says "job cache is full" and it's purely a client decision to fetch or not fetch work. BOINC Server or the stats stored on the server side don't have much (or maybe anything) to do with the decision.

Also, I noticed that Nexus 7 has smaller Gflops rating (application details->average processing rate) compared to Nexus 4 computer, which is ridiculous, and maybe that is the reason behind job cache size difference. How do I update that number?


These are calculated based on the results received from the device. I don't think BOINC Client even gets to see these values, as they are used by the BOINC Server to send work and to calculate granted credits. (Except we don't use that crediting option since it's unreliable; we grant a fixed amount instead.)

The only way to change these is to.. complete more work units successfully and the stats will change. Slowly but surely, if the new measurements are different. However, the measured GFLOPS might be affected by our app's CPU usage calculation bug, where only the CPU usage from the last run is returned. This especially affects your Nexus 4 device as it pauses and restarts a lot. However, that shouldn't affect the average return time, which is also quite different.
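As a rough illustration of how that bug could skew the numbers, here is a small sketch. It assumes the rate is simply work done divided by reported CPU time, which is a simplification of whatever the server really does, and all the figures are invented.

```python
def measured_gflops(flops_done, run_cpu_seconds, only_last_run=False):
    """Estimate a processing rate from the CPU time of each run segment.

    run_cpu_seconds holds one entry per run segment (a task that was paused
    and restarted twice has three entries). With only_last_run=True only the
    final segment is counted, mimicking the bug described above.
    """
    counted = run_cpu_seconds[-1] if only_last_run else sum(run_cpu_seconds)
    return flops_done / counted / 1e9


FLOPS_DONE = 3.6e12              # hypothetical amount of work in one task
segments = [400.0, 500.0, 300.0]  # hypothetical CPU seconds per run segment

correct = measured_gflops(FLOPS_DONE, segments)
buggy = measured_gflops(FLOPS_DONE, segments, only_last_run=True)

print("all segments counted: %.1f GFLOPS" % correct)
print("last segment only:    %.1f GFLOPS" % buggy)
```

Under these assumptions a device that restarts often reports less CPU time per task and therefore looks faster than it is, which would fit a Nexus 4 showing a higher rating than a Nexus 7.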

Do you still have this problem or did you find a way to fix it?

-w
56) Message boards : Number crunching : Stream & OpenCL selectable for ATI? (Message 6633)
Posted 6 Jul 2015 by Teemu Mannermaa
Post:
Is the only solution an app_info.xml?


There's also app_config.xml support (please, read http://boinc.berkeley.edu/wiki/client_configuration#Application_configuration for more details) that does go into app version level but might or might not do what you need.

Otherwise, I need to look into how to configure the BOINC Server feature that lets users set in their prefs which apps (maybe even versions) they want to allow.

-w
57) Message boards : Number crunching : Nvidia GPU workunits won't start (Message 6632)
Posted 6 Jul 2015 by Teemu Mannermaa
Post:
None of my Moo! Wrapper Nvidia GPU workunits will start running. They all say "Ready to Start", but even when there are no other GPU workunits running from other projects, the Moo! Wrapper workunits won't start running.


I noticed this too on my test nVidia system that has dual GPUs. BOINC Client has major problems scheduling work units that require multiple GPUs (like all our old app work units do). I've tested all major versions since 7.2.33 and all seem to be affected, at least on my test system.

I did disable sending old app versions to any multi-GPU systems using BOINC Client v7.2.x or later, but it seems some users have managed to get our old app working, so this prevented them from getting any work. I have no idea why, or how to detect those systems on the server side. So for now only multi-GPU systems with BOINC Client v7.2.33 won't get old apps (this is the version that's available for CentOS 6 in their repo).

There was a fix committed to BOINC Client for this problem recently (see https://github.com/BOINC/boinc/commit/8c7aef5b997c028e007dc158d76eab3a5502e3c4) but I don't think there's a release version of it yet. I've tested it with a test build and it does seem to fix the scheduling problem.

However, that won't help old BOINC Client users. So the best course is probably to have everybody use our new app. For that we need a new enough Dnet Client that supports selecting the GPU to be used, and there are no CUDA builds available for that. I've gotten one compiled for Linux and thus was able to release cuda60/cuda70 builds with the new app. I'll try to get Windows ones built too.

In the meantime, nVidia OpenCL apps are the thing to get. I've fixed some problems in the BOINC Server code that could have prevented people from getting that app. (There have been various problems with sending apps when there's more than one possible app for a platform.) Additionally, the more stats the server has for an app version, the better it knows its true speed relative to other apps.

So, could you test whether you can get OpenCL apps now? If not, tell me the host # that's affected and I'll try to debug the app sending problem further. Thanks!

-w
58) Message boards : Number crunching : PC's(4) stuck on Uploading.?? (Message 6629)
Posted 5 Jul 2015 by Teemu Mannermaa
Post:
Have 4 pc's stuck on uploading.?
...
Is this is due to bcserv02 not running on the Server status page.?


Yeah, most likely it was due to disk I/O errors that brought the main server down. That's now fixed (at least temporarily) and uploads should work again. Other services will also be enabled any moment now as the backlog is worked through.

-w
59) Message boards : News : Unscheduled downtime due to disk problems (Message 6628)
Posted 5 Jul 2015 by Teemu Mannermaa
Post:
There was an unscheduled downtime on 5th of July from about 13:00 EEST+3 (that's 10:00 UTC or 3:00 PDT-7) until about 1:30 EEST (22:30 UTC or 15:30 PDT) for a total of 12 hours 30 mins. There might have been problems with services for 2 hours or even longer before the start of downtime.
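For anyone who wants to double-check the conversions, here is a quick sketch using fixed UTC offsets (+3 for EEST, -7 for PDT). The year 2015 and the assumption that 1:30 EEST falls on the following day (6th of July) are inferred from the post date and the stated total.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets as written in the announcement.
EEST = timezone(timedelta(hours=3), "EEST")
PDT = timezone(timedelta(hours=-7), "PDT")

start = datetime(2015, 7, 5, 13, 0, tzinfo=EEST)
end = datetime(2015, 7, 6, 1, 30, tzinfo=EEST)  # 1:30 is on the next calendar day
duration = end - start

for label, moment in (("start", start), ("end", end)):
    utc = moment.astimezone(timezone.utc)
    pdt = moment.astimezone(PDT)
    print(f"{label}: {moment:%H:%M} EEST = {utc:%H:%M} UTC = {pdt:%H:%M} PDT")

print("duration:", duration)
```

This agrees with the times given above: 10:00 to 22:30 UTC, 3:00 to 15:30 PDT, 12 hours 30 minutes in total.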

This was due to disk I/O failures on our main server that meant it had to be brought down for a disk check and temporary repairs. There was no critical data affected as the only file blocks permanently lost due to bad sectors were parts of log files. All other files were repaired successfully.

The affected disk will need to be replaced as it will most likely fail completely soon. This will require additional downtime in coming days.

For now, I'm bringing services back up slowly while the backlog gets processed.
60) Message boards : News : New v1.4 apps for nVidia OpenCL and Android (Message 6582)
Posted 12 Jun 2015 by Teemu Mannermaa
Post:
... except that there's no preference setting for beta work as there is on other projects.


That's because they are no longer beta and are available for all. And when there are no beta apps, the beta setting is automatically hidden.

-w




 
Copyright © 2011-2024 Moo! Wrapper Project