Multi-GPU task takes longer than single-gpu task


zombie67 [MM]
Joined: 2 May 11
Posts: 47
Credit: 319,540,306
RAC: 220,260
Message 153 - Posted: 7 May 2011, 18:25:45 UTC

Yeah. The frustrating part is the wasted time, and lack of actual production. It should be doing ~2x the work of the single card machine. In reality, it is doing 1/6th or less.
Reno, NV
Team SETI.USA
ID: 153 · Rating: 0
3327
Joined: 5 May 11
Posts: 3
Credit: 585,243
RAC: 0
Message 154 - Posted: 7 May 2011, 18:29:07 UTC - in response to Message 151.  

Yep. All tasks on my dual 5870 machine take ~1-2 hours. On my single 5870, they take ~15 minutes.


really strange!

but DA's bright new credit-system has decided to compensate for that and is granting a freaky load of credits for those WU's now..


uh, really?

Task   Work unit  Sent                       Time reported              Status                   Run time (sec)  CPU time (sec)  Credit     Application
49194  43088      7 May 2011 | 13:29:48 UTC  7 May 2011 | 14:33:03 UTC  Completed and validated  3,786.47        3,484.83        49,620.57  Distributed.net Client v1.01 (ati14)
49132  43026      7 May 2011 | 13:29:48 UTC  7 May 2011 | 15:37:19 UTC  Completed and validated  3,847.45        3,457.91        50,410.69  Distributed.net Client v1.01 (ati14)
48153  42215      7 May 2011 | 11:10:49 UTC  7 May 2011 | 12:24:54 UTC  Completed and validated  4,433.80        1,535.66        57,499.16  Distributed.net Client v1.01 (ati14)
48069  42142      7 May 2011 | 11:10:49 UTC  7 May 2011 | 13:34:59 UTC  Completed and validated  3,895.79        2,959.86        51,030.92  Distributed.net Client v1.01 (ati14)
46009  40162      7 May 2011 | 7:01:43 UTC   7 May 2011 | 9:47:34 UTC   Completed and validated  4,835.84        1,024.91        62,713.01  Distributed.net Client v1.01 (ati14)

I think DA is on to something. :) I'll see if a WU completes on this end.

thanks.
ID: 154 · Rating: 0
3327
Joined: 5 May 11
Posts: 3
Credit: 585,243
RAC: 0
Message 155 - Posted: 7 May 2011, 20:10:11 UTC

Task   Work unit  Computer  Sent                       Time reported              Status                   Run time (sec)  CPU time (sec)  Credit    Application
51458  45169      537       7 May 2011 | 18:31:17 UTC  7 May 2011 | 20:03:06 UTC  Completed and validated  10,709.45       10,709.45       7,552.65  Distributed.net Client v1.01 (ati14)

...but it did finish. lol.
ID: 155 · Rating: 0
frankhagen
Joined: 2 May 11
Posts: 27
Credit: 1,151,788
RAC: 0
Message 156 - Posted: 7 May 2011, 20:31:25 UTC - in response to Message 153.  

Yeah. The frustrating part is the wasted time, and lack of actual production. It should be doing ~2x the work of the single card machine. In reality, it is doing 1/6th or less.


now you do not ask me to tell you who had a bright idea again.. ;)

of course it's a problem with the app or the wrapper, but it's as silly as it can get..
ID: 156 · Rating: 0
kashi
Joined: 5 May 11
Posts: 7
Credit: 13,680,807
RAC: 0
Message 157 - Posted: 7 May 2011, 20:35:24 UTC

Credit can scale both ways, way up first and then way down.

Currently granting me only 2943 credits for a 7 packet, 448.00 stats units task. DNETC fixed credit of 8.05 per stats unit would have granted 3606 credits for the same task. Only a little way to go down and it will be granting a lower credit rate than PrimeGrid PPS Sieve which runs much cooler and uses far less power. Who knows, maybe it will go up again, perhaps I inadvertently triggered the anti-cherrypicker algorithm by aborting a batch of tasks to get the new version. I've been granted a range of 2400 to 28000 credits for similar size 7 packet tasks.
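The comparison above is a straight multiplication, and can be checked with a minimal sketch like the one below. The function name is made up for illustration; the numbers (448.00 stats units, 8.05 credits per stats unit, 2943 credits granted) are the ones quoted in this post.

```python
def fixed_credit(stats_units: float, credit_per_unit: float = 8.05) -> float:
    """Credit under a flat per-stats-unit rate (8.05 is the DNETC rate quoted above)."""
    return stats_units * credit_per_unit


old_scheme = fixed_credit(448.00)  # the 7 packet task from the post
granted = 2943                     # credit actually granted, per the post
shortfall = 1 - granted / old_scheme

print(f"fixed rate: {old_scheme:.0f} credits, granted: {granted}, "
      f"shortfall: {shortfall:.1%}")
```

On these figures the task is being granted a bit under a fifth less than the old fixed rate would have paid.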

I swapped my 5970 for a 5870. Very little CPU usage but still slow. At first took about the same time as my 5970 (over an hour) but after a few tasks, it suddenly got into gear and VDDC current jumped from 34.8A to over 70A. That task took only 19.4 minutes. Had 2 others that took 40 minutes and one that took 47 minutes. Then new version was introduced and all since then took 67-73 minutes.

I knew the GPU was bottlenecked as key/s speed didn't change when varying GPU core speed between 725MHz and 900MHz. Eventually I changed to core 3 in preferences and it came alive, processing speed became about 4.5 times faster. A 7 packet task now takes about 16 minutes @ 900/500 core/memory, processing speed is about 2,015.23 Mkeys/s. Had to increase the fan speed as VDDC is now drawing 70-74A.
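As a rough sanity check on those numbers, the run time can be estimated from the key rate. This is only a sketch under a stated assumption: that one stats unit corresponds to one 2^32-key RC5-72 work unit, which is not confirmed anywhere in this thread. The 448.00 stats units figure is the 7 packet task mentioned earlier in this post; the helper name is hypothetical.

```python
# Assumption (not from the thread): one stats unit = one 2**32-key RC5-72 work unit.
KEYS_PER_STATS_UNIT = 2**32


def expected_minutes(stats_units: float, mkeys_per_sec: float) -> float:
    """Estimated run time in minutes for a task at a sustained key rate."""
    total_keys = stats_units * KEYS_PER_STATS_UNIT
    return total_keys / (mkeys_per_sec * 1e6) / 60


fast = expected_minutes(448.00, 2015.23)        # rate reported with core 3
slow = expected_minutes(448.00, 2015.23 / 4.5)  # same task at 1/4.5 of that rate

print(f"core 3: {fast:.1f} min, before the change: {slow:.1f} min")
```

If the assumption holds, the estimate lands close to the "about 16 minutes" reported with core 3, and the 4.5 times slower case falls inside the 67-73 minute range seen before the change.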

The dnetc518 stream application appears to default to core 0, which is designed to minimise GUI lag and perhaps stop people from burning out their cards. Unfortunately on some configurations it hobbles the performance a great deal.

Will put my 5970 back in tomorrow and see if setting core 3 works on that too. Hopefully some others may benefit from trying this, but watch your core and VRM temps carefully if it works as you may have been lulled into a false sense of security from when it was going slow.

My thanks to frankhagen and Bijek for posting about core selection.
ID: 157 · Rating: 0
Shadow.SETI.USA [TopGun]
Joined: 2 May 11
Posts: 8
Credit: 157,289,433
RAC: 0
Message 158 - Posted: 7 May 2011, 23:04:45 UTC

I haven't had the long work units like others have had. My dual 6970's are doing them in 8 minutes and 14 seconds. It's just weird how some duals are working fine and others are taking hours to do a work unit.
ID: 158 · Rating: 0
zombie67 [MM]
Joined: 2 May 11
Posts: 47
Credit: 319,540,306
RAC: 220,260
Message 202 - Posted: 10 May 2011, 6:08:12 UTC

It's been a week now, and I am not sure the original question has been answered.

Why is the dual GPU machine slower than the single GPU machine? This was not the case with DNETC. So what needs to be done to fix this problem?

Reno, NV
Team SETI.USA
ID: 202 · Rating: 0
Conan
Joined: 2 May 11
Posts: 52
Credit: 254,366,005
RAC: 7,504
Message 206 - Posted: 10 May 2011, 9:17:14 UTC - in response to Message 202.  

It's been a week now, and I am not sure the original question has been answered.

Why is the dual GPU machine slower than the single GPU machine? This was not the case with DNETC. So what needs to be done to fix this problem?



G'Day Zombie67,

I have the exact opposite to yourself.
My dual 5870 cards take 8 to 10 minutes per WU but my single 5870 card takes over 2 hours.

I no longer do Moo on the single card.

I am not sure if this helps but I had to install the SDK and ATI APP versions of the ATI drivers on my machine due to running PrimeGrid. This may be making my cards run normally?
I think I am using Driver Version 11.1.

Don't know about the single card, as it may be a motherboard issue: one PCI-e slot is now going faulty. The other is working on Collatz, Milkyway and PrimeGrid with no issues, just not on the Moo!Wrapper project.
I can not work out why.

The machine with dual cards is an AMD Phenom II 955 and the machine with a single card is an AMD Opteron 285.

The Opteron is a lot slower than the Phenom; this may be an issue as well.

I have ordered a new motherboard and processor (MSI 890FXA-GD70 and an AMD Phenom II 1100T) which should alleviate the Opteron's slowness issue and hopefully the graphic card issue as well.

Conan
ID: 206 · Rating: 0
zombie67 [MM]
Joined: 2 May 11
Posts: 47
Credit: 319,540,306
RAC: 220,260
Message 209 - Posted: 10 May 2011, 16:28:57 UTC

Yes, I have SDK 2.4 installed. I also have 11.3 with APP. This is win7 64.
Reno, NV
Team SETI.USA
ID: 209 · Rating: 0
frankhagen
Joined: 2 May 11
Posts: 27
Credit: 1,151,788
RAC: 0
Message 210 - Posted: 10 May 2011, 16:39:07 UTC - in response to Message 209.  

Yes, I have SDK 2.4 installed. I also have 11.3 with APP. This is win7 64.


i suggest you run the benchmarks while boinc is shut down.

"dnetc518xxx.exe -- C 1 --benchmark"
"dnetc518xxx.exe -- C 2 --benchmark"

and so on.

oh, and btw.:

you might find something when running

"dnetc518xxx.exe --help"

i can not test this, because i got only NVIDIA's...

ID: 210 · Rating: 0
cedricdd
Joined: 2 May 11
Posts: 3
Credit: 256,516,099
RAC: 0
Message 212 - Posted: 10 May 2011, 16:48:08 UTC - in response to Message 210.  

Yes, I have SDK 2.4 installed. I also have 11.3 with APP. This is win7 64.


i suggest you run the benchmarks while boinc is shut down.

"dnetc518xxx.exe -- C 1 --benchmark"
"dnetc518xxx.exe -- C 2 --benchmark"

and so on.

oh, and btw.:

you might find something when running

"dnetc518xxx.exe --help"

i can not test this, because i got only NVIDIA's...



use "dnetc518xxx.exe -bench", it will test each possible case for the core and then give you the best one.
ID: 212 · Rating: 0
J
Joined: 8 May 11
Posts: 1
Credit: 26,271,560
RAC: 0
Message 214 - Posted: 10 May 2011, 19:54:22 UTC

FYI:

ATI HD5870, task time averaging 15 minutes.
I've even had to clock the card back from 850 MHz/1250 to 750/1000 to keep the thing cool. It runs HOT HOT HOT on Moo!

Other projects such as collatz, MW, used to run at 72C. Moo whacks it over 80C and the fan is on full. Even at the lower clock speeds the card is running at 74C.

ID: 214 · Rating: 0
vaio [The Lone Gunman]
Joined: 3 May 11
Posts: 41
Credit: 165,019,076
RAC: 0
Message 215 - Posted: 10 May 2011, 19:56:20 UTC

Lower memory clocks, it may help.
Mine runs happily at 900/900.....mid 70's....70% fan.
Team Renegades
Forum
ID: 215 · Rating: 0
Dan
Joined: 5 May 11
Posts: 17
Credit: 103,092,604
RAC: 0
Message 218 - Posted: 10 May 2011, 21:13:18 UTC

Did anyone figure out the Dual GPU problem? I've had to abandon this project because the credits are so low for my two 5870s.
ID: 218 · Rating: 0
Clod Patry
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 2 May 11
Posts: 65
Credit: 242,754,987
RAC: 0
Message 219 - Posted: 10 May 2011, 21:39:08 UTC - in response to Message 218.  

dan: it should be fine with app 1.01.
Do you still have this problem with version 1.01?
ID: 219 · Rating: 0
Dan
Joined: 5 May 11
Posts: 17
Credit: 103,092,604
RAC: 0
Message 221 - Posted: 10 May 2011, 21:50:49 UTC - in response to Message 219.  
Last modified: 10 May 2011, 21:52:52 UTC

Ya. I'm still taking an hour+ on dual 5870s.

P.S. I am running 1.01
ID: 221 · Rating: 0
Sitarow
Joined: 3 May 11
Posts: 8
Credit: 73,794,244
RAC: 0
Message 222 - Posted: 10 May 2011, 21:58:17 UTC - in response to Message 219.  
Last modified: 10 May 2011, 22:00:54 UTC

I have done a bit of testing on some of the computers that I have setup for this project.

I took the advice posted here on running a benchmark test on each computer I am running this project on.

What I found was that the computers that performed at a normal rate were using their best core, core 0 of the 4 options.

After taking the suggestions in the above posts, I found that some would perform great on core 0 and others would perform best on core 3.

I set those computers into different groups and it helped bring the 1.2hr job down to 22mins on an ATI 5850 computer running stock speeds on both CPU and GPU.

I found that with an ATI 5870 we would have to speed up the CPU clock speeds in order to get the added performance.

The drawback was that the normal reward for the task at 1.2hrs would be about 9k points but because of the auto balancing of points the next tasks that finished in 22mins would only be awarded 2400 points.
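To put those two rewards on the same footing, it helps to look at credit per second rather than credit per task. The sketch below uses only the rough figures from this post (about 9k points for 1.2 hours, 2400 points for 22 minutes); the function name is made up for illustration.

```python
def credits_per_second(credit: float, run_time_sec: float) -> float:
    """Credit rate for a single task."""
    return credit / run_time_sec


slow = credits_per_second(9000, 1.2 * 3600)  # ~9k points for a 1.2 hour task
fast = credits_per_second(2400, 22 * 60)     # 2400 points for a 22 minute task

print(f"slow: {slow:.2f} credits/s, fast: {fast:.2f} credits/s")
```

On these rough numbers the per-task reward drops by almost three quarters, while the per-second rate drops by only around an eighth, so most of the lower reward simply reflects the shorter run time.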

My next question is how long will it take before it "normalizes" on the points rewarded?

Oh, and to run the benchmark: on a Windows 7/Vista box the application is found at the following path.

default install

"C:\Users\All Users\BOINC\projects\moowrap.net\dnetc518-win32-x86-stream.exe" -bench
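For anyone who would rather script this across several machines, here is a minimal sketch that wraps the same command. The path and the -bench flag are taken from this thread; the function names are hypothetical, and nothing is assumed about what the benchmark prints. As suggested earlier in the thread, shut BOINC down before running it.

```python
import subprocess
from pathlib import Path

# Path as quoted in the post (default install); adjust to your own BOINC data directory.
DNETC_EXE = Path(r"C:\Users\All Users\BOINC\projects\moowrap.net"
                 r"\dnetc518-win32-x86-stream.exe")


def bench_command(exe: Path = DNETC_EXE) -> list[str]:
    """Build the benchmark command line; -bench is the flag quoted in this thread."""
    return [str(exe), "-bench"]


def run_bench(exe: Path = DNETC_EXE) -> str:
    """Run the benchmark and return whatever it prints (output format not assumed)."""
    result = subprocess.run(bench_command(exe), capture_output=True,
                            text=True, check=False)
    return result.stdout + result.stderr


if __name__ == "__main__":
    if DNETC_EXE.exists():
        print(run_bench())
    else:
        print(f"not found: {DNETC_EXE}")
```

The output is returned as raw text so the best core can be read off by eye before it is set in the project preferences.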

P.S.

I ran this bench on an AMD 6-core and an Intel 6-core, and both would only do cores 0-3; neither would do the final 2.
ID: 222 · Rating: 0
Dan
Joined: 5 May 11
Posts: 17
Credit: 103,092,604
RAC: 0
Message 223 - Posted: 10 May 2011, 22:06:36 UTC - in response to Message 222.  
Last modified: 10 May 2011, 22:11:39 UTC

I set those computers into different groups and it helped bring the 1.2hr job down to 22mins on an ATI 5850 computer running stock speeds on both CPU and GPU.


Single GPUs are taking about 10 to 15 minutes. 5870s in CF should be taking about 6 to 8 minutes. See times at the beginning of this thread.

Dan

P.S. I did update the CF profile (11.4) from ATI and saw my GPU go from ~5000 seconds to 3,000 seconds and my CPU drop to 1,600 seconds.
ID: 223 · Rating: 0
Sitarow
Joined: 3 May 11
Posts: 8
Credit: 73,794,244
RAC: 0
Message 224 - Posted: 10 May 2011, 22:17:03 UTC - in response to Message 223.  
Last modified: 10 May 2011, 22:32:06 UTC

P.S. I did update the CF profile (11.4) from ATI and saw my GPU go from ~5000 seconds to 3,000 seconds and my CPU drop to 1,600 seconds.


I will try that and see if it gives me a similar difference.

The only problem is that now the project will not give the proper reward for work done because of the reduced time it takes to complete the task.

I may have to reinstall the boinc client on that computer and get a new ID for the computer on this project to get that issue resolved.


Edit: These are the computers that you can check to see about the reduced time and lower points granted.

Computer ID 255
http://www.moowrap.net/results.php?hostid=255&offset=0&show_names=0&state=3&appid=

Computer ID 252
http://www.moowrap.net/results.php?hostid=252&offset=0&show_names=0&state=3&appid=

Computer ID 405
http://www.moowrap.net/results.php?hostid=405&offset=0&show_names=0&state=3&appid=
ID: 224 · Rating: 0
Dan
Joined: 5 May 11
Posts: 17
Credit: 103,092,604
RAC: 0
Message 226 - Posted: 11 May 2011, 0:00:47 UTC - in response to Message 224.  

I tried running the benchmark and setting the core to the best in the benchmark. It made no difference in my time.

I'm getting 1 to 1.5 credits per second per GPU. Milkyway and Collatz get about 2.3 credits per second per GPU. I forget what dnet got but I know that it was much more than Milkyway or Collatz.

Dan
ID: 226 · Rating: 0

Copyright © 2011-2024 Moo! Wrapper Project