Computation Error (output file missing)
Author | Message |
---|---|
Joined: 8 Aug 14 Posts: 2 Credit: 259,080 RAC: 0 |
I am seeing Computation Error (output file missing) for many work units. I just had about 5 WUs fail with 'output file missing'.
uname -r: 3.14-9.dmz.1-liquorix-amd64
GPU: Nvidia GTX 750
Stdout has:
11-Aug-2014 00:01:45 [Moo! Wrapper] Computation for task dnetc_r72_1407505306_9_9_0 finished
11-Aug-2014 00:01:45 [Moo! Wrapper] Output file dnetc_r72_1407505306_9_9_0_0 for task dnetc_r72_1407505306_9_9_0 absent
11-Aug-2014 00:01:45 [Moo! Wrapper] Starting task dnetc_r72_1407540914_9_9_0
11-Aug-2014 00:01:54 [Moo! Wrapper] Computation for task dnetc_r72_1407540914_9_9_0 finished
11-Aug-2014 00:01:54 [Moo! Wrapper] Output file dnetc_r72_1407540914_9_9_0_0 for task dnetc_r72_1407540914_9_9_0 absent
Stderr is full of this line, repeated over and over:
mv: cannot stat ‘slots/1/out.r72’: No such file or directory |
Joined: 20 Apr 11 Posts: 388 Credit: 822,356,221 RAC: 0 |
I am seeing Computation Error (output file missing) for many work units.
Looking at your host logs, this seems to be the reason no output files get generated: dnetc: Unable to initialize timers. So this turns out to be a Dnet Client bug, http://bugs.distributed.net/show_bug.cgi?id=4570, that's hopefully fixed in a newer build. I believe the cause is kernel versions with only two numeric components (your problem host has v3.14), while the code expected three (like v2.6.32), so initialization failed. Did you upgrade your host OS recently? Basically, the fix is for me to deploy newer versions that support newer kernels. -w |
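To illustrate the suspected failure mode, here is a minimal Python sketch. It is not the dnetc client's actual code, and the function names `parse_strict` and `parse_kernel_release` are hypothetical. The first parser insists on three numeric components and rejects a release string like the one reported above; the second treats missing components as 0.

```python
import re


def parse_strict(release):
    """Hypothetical strict parser: requires major.minor.patch."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        # A release such as "3.14-9..." has no third numeric component,
        # so a parser written this way gives up here.
        raise ValueError(f"expected three version components: {release!r}")
    return tuple(int(part) for part in match.groups())


def parse_kernel_release(release):
    """Hypothetical tolerant parser: missing minor/patch default to 0."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = (int(part) if part else 0 for part in match.groups())
    return major, minor, patch


if __name__ == "__main__":
    for release in ("2.6.32", "3.14-9.dmz.1-liquorix-amd64"):
        print(release, "->", parse_kernel_release(release))
        try:
            print("  strict:", parse_strict(release))
        except ValueError as err:
            print("  strict parser failed:", err)
```

With the `uname -r` string from the first post, the strict variant raises an error while the tolerant one still yields a usable version tuple, which matches the kind of behavior change a newer client build would need.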
Joined: 8 Aug 14 Posts: 2 Credit: 259,080 RAC: 0 |
That would explain it. I recently moved to a newer kernel to solve some issues with unsupported hardware. |