Announcement

**Rattledagger** · 18 November 2004, 08:53

Currently, all wu under BOINC should be the same size, but later on more detailed models will be distributed.
They've also tested a high-resolution model that they may release for bleeding-edge systems; among other things, it uses 800 MB of memory...

**J1NG** · 18 November 2004, 10:45

800MB? Bah...

J1NG

**VJ** · 19 November 2004, 01:17

So how come some of my workunits are larger...?

I mean, 1600 hours in comparison to 600 hours is a lot more...

Jörg

**Rattledagger** · 19 November 2004, 07:37

Well, from the little info I can gather from your trickle-info, you're running one instance on the work computer and 4 on the Xeon-HT. Since HT isn't a "real" cpu, being somewhat slower is expected. But still the Xeon seems unreasonably slow.

A couple of other reasons for the slow Xeon could be: slower memory than the work computer, probably with ECC slowing it down even more; 4 instances using all the memory bandwidth; slower write access to the hd with 4 instances.

Some more possible reasons: too little memory so it uses the pagefile; the virus scanner constantly re-checking and slowing things down; the cpu running too hot so it is slowed down; some power saving wrongly configured so it slows down the cpu; cache memory turned off.

**VJ** · 19 November 2004, 08:16

Originally posted by Rattledagger
Well, from the little info I can gather from your trickle-info, you're running one instance on the work computer and 4 on the Xeon-HT. Since HT isn't a "real" cpu, being somewhat slower is expected. But still the Xeon seems unreasonably slow.

I've only uploaded trickles once for the Xeon.
Actually, I can verify this. The elapsed time increases with real time. The same goes for the cpu time spent on the processes (as shown in the taskmanager). If it were slowed down, this would imply the processes being paused alternately, so I would expect esp. the cpu-time in taskmanager to go slower than real time (i.e. after 10 hours of computation, a CPDN process would only have used 6 hours of CPU). But this isn't the case: all 4 processes say they have had 10 hours of cpu time.
(The CPU time in taskmanager does work like this: if I run a program which uses all 4 CPUs, its CPU time goes 4 times faster than real time.)

A couple of other reasons for the slow Xeon could be: slower memory than the work computer, probably with ECC slowing it down even more; 4 instances using all the memory bandwidth; slower write access to the hd with 4 instances.

Some more possible reasons: too little memory so it uses the pagefile; the virus scanner constantly re-checking and slowing things down; the cpu running too hot so it is slowed down; some power saving wrongly configured so it slows down the cpu; cache memory turned off.

The weird part is: if I move those workunits to my P4 (which now has a workunit of 600 hours), it first does the benchmarks, and then also predicts this workunit to take 1600 hours (it runs 1 of the four units, the other 3 are paused automatically).

So I would think if the problem were with the Xeon, the P4 would show an estimated time of 600 hours.

At work, I have the same virusscanner, and it isn't even disabled.

(the specs of the Xeon: 2x 2.4 GHz, 1 GB ECC DDR SDRAM, 2x 10K U320 harddrives; I disabled the virusscanner while crunching, powersaving is turned off, cpu temperature stays around 50°C)

Jörg

**Rattledagger** · 19 November 2004, 09:08

Not having HT, I don't know the exact difference, but let's say crunching one seti "classic" wu on a HT gives an average cpu-time & run-time of 4 hours.
If it instead runs 2 instances, the run-time for crunching both is for example 6 hours, and the total cpu-time is now 12 hours.

Even for a "normal" multi-cpu, there's an smp-penalty in seti if you haven't got at least 1 MB cache; for AMD this is around 5-10%, for Pentium AFAIK 10% with 512 KB, 20% with 256 KB & 40% with 128 KB. Also, for small caches there seems to be a penalty even if only running one instance, and this is probably around 10% with 256 KB and 20% with 128 KB. You'll get both of these penalties in dual-systems, so 128 KB is probably around 80% penalty compared to a dual 1 MB-cache system.

I don't know CPDN too closely, but there's probably something similar here, so running one instance on a HT takes for example 600 hours, but if running 2 instances you'll use for example 900 hours for each.
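The two made-up examples in this post (seti: 4 hours for one instance versus 6 hours for two; CPDN: 600 hours versus 900 hours) can be compared by dividing run-time by the number of instances. This is only a rough sketch of that arithmetic; the function name `hours_per_wu` is hypothetical, and the figures are the illustrative ones from the post, not measurements.

```python
def hours_per_wu(instances, run_time_hours):
    """Effective wall-clock hours per finished wu when `instances` run side by side."""
    return run_time_hours / instances


# seti example from the post: 1 instance in 4 h, or 2 instances in 6 h
seti_single = hours_per_wu(1, 4)
seti_paired = hours_per_wu(2, 6)

# CPDN example from the post: 1 instance in 600 h, or 2 instances in 900 h each
cpdn_single = hours_per_wu(1, 600)
cpdn_paired = hours_per_wu(2, 900)

print(seti_single, seti_paired)   # 4.0 3.0
print(cpdn_single, cpdn_paired)   # 600.0 450.0

# Total cpu-time charged in the seti example: both instances count the full 6 h
print(2 * 6)                      # 12
```

With these example numbers each wu takes longer on its own, yet two instances still finish more work per hour than one.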

BTW, is the 1600 hours the expected run-time for the paused wu when running on the P4, or for the wu currently being crunched?

**VJ** · 19 November 2004, 09:26

Originally posted by Rattledagger
I don't know CPDN too closely, but there's probably something similar here, so running one instance on a HT takes for example 600 hours, but if running 2 instances you'll use for example 900 hours for each.

I doubt it: if this were the case, after 10 hours of crunching, the CPU time spent on the units would be less than 10 hours (i.e. 7 hours) as shown on taskmanager. This isn't the case: every process has the same amount of cpu-time, and this is equal to the real time spent on it.

BTW, is the 1600 hours the expected run-time for the paused wu when running on the P4, or for the wu currently being crunched?

It gives this value for both.
Even if I let BOINC recalculate the benchmarks.

Jörg

**Rattledagger** · 19 November 2004, 09:52

Well, after having a dual-p3-600 for some years, I know that running one seti-wu gives run-time & cpu-time of 9 hours something, while running 2 seti-wu gives run-time & cpu-time of 10h20m...

You could just try changing the cpu-preference on your Xeon to only 1, and see if your TS suddenly becomes smaller.

I don't know exactly how a cpu counts cpu-time, but it looks like it's also counting the time "lost" waiting for cache-misses and similar, and of course there's a bigger chance of a cache-miss and of needing to access slow main memory on a dual.
Even when waiting on a cache-miss, it can't do anything else, so maybe that's a reason for counting this as 100%...

BTW, looking more closely, it seems only wu that haven't been started show the "expected" crunch-time based on the benchmark; this means all started wu will show the expected crunch-time left. If you for example have crunched 160 hours to reach 10% done, it will display 1440 hours left regardless of machine speed. But of course, if you then crunch 10 hours on a faster machine you may have only 1400 hours to go.
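The 160 hours / 10% / 1440 hours example is consistent with a simple linear extrapolation from progress so far. The sketch below assumes that is how the figure comes about; the actual BOINC or CPDN estimator is not shown in this thread, and `hours_left` is a hypothetical name.

```python
def hours_left(hours_so_far, percent_done):
    """Extrapolate remaining hours, assuming progress is linear in time."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    estimated_total = hours_so_far * 100 / percent_done
    return estimated_total - hours_so_far


# 160 hours crunched to reach 10% done
print(hours_left(160, 10))   # 1440.0
```

Because the estimate depends only on hours spent and percent done, moving a started wu to a faster machine does not change the displayed figure straight away; it only shrinks faster from then on.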

**VJ** · 19 November 2004, 13:54

Hmm, I noticed something weird earlier this evening. I started my PC around 9.00 and checked around 18.00, so it had been running for 9 hours straight (no interaction by me).
According to the taskmanager, every workunit had gotten 4 hours and 40 minutes of CPU time. This indeed indicates that they have to wait for each other, probably because the different logical processors have to wait for physical units to become available. They really seem to lose a lot of time like this: adding them up results in 18 hours 40 minutes (4x 4h40min) of actually used CPU time, whereas the computer logically 'thinks' it has given 36 hours (4x 9 hours). The hyperthreading doesn't seem to be providing any benefit.
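As a rough check of the figures in this post, the sketch below adds up the per-process CPU time and compares it with what four logical processors could have delivered between 9.00 and 18.00. The variable names are made up for illustration; only the times come from the post.

```python
from datetime import timedelta

LOGICAL_CPUS = 4  # two hyperthreaded Xeons appear as four logical processors

wall_clock = timedelta(hours=18) - timedelta(hours=9)   # 9.00 to 18.00
cpu_per_unit = timedelta(hours=4, minutes=40)           # per process, from taskmanager

cpu_used = LOGICAL_CPUS * cpu_per_unit       # what the four processes were charged
cpu_offered = LOGICAL_CPUS * wall_clock      # what four always-busy CPUs would give
share = cpu_used / cpu_offered               # fraction of the offered time used

print(cpu_used)          # 18:40:00
print(cpu_offered)       # 1 day, 12:00:00
print(f"{share:.0%}")    # 52%
```

On these numbers the four processes were charged roughly half of the time the logical processors nominally had available.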

Weird though that this wasn't the case when I previously did the same check. Also on the P4, but perhaps its prediction was influenced by the Xeon's prediction.

I will upload my data on Sunday, and will change the settings to use only 2 CPUs (when it is crunching, I'm usually away). (As the machine is not connected to the internet, I can't make that change now.) It would also make it a bit more responsive if I happen to want to do some small stuff (html editing, ...).

If there isn't a huge difference, I'll see if disabling hyperthreading is beneficial. But judging by what I noticed today, it could well be that this caused it...

Would the predictions of estimated time left be altered when you force the client to run a benchmark?
(hmm, something to try out)

Thanks for your insights!

Jörg

**Rattledagger** · 19 November 2004, 17:47

You can manually edit the preferences, just stop BOINC, and use Notepad to edit global_preferences.xml
[max_cpus]2[/max_cpus] under your venue.
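For anyone who would rather script the change than use Notepad, here is a minimal sketch. It assumes the square brackets above stand for ordinary XML angle brackets and that the file has a `venue` element with a `name` attribute containing a `max_cpus` tag; the sample layout, the venue name "home" and the helper `set_max_cpus` are assumptions for illustration, not taken from BOINC itself. Stop BOINC first, as described above, and keep a backup of the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents; a real preferences file may contain many more tags.
SAMPLE = """<global_preferences>
    <max_cpus>4</max_cpus>
    <venue name="home">
        <max_cpus>4</max_cpus>
    </venue>
</global_preferences>"""


def set_max_cpus(xml_text, venue, cpus):
    """Return xml_text with <max_cpus> under the named venue set to `cpus`."""
    root = ET.fromstring(xml_text)
    for node in root.iter("venue"):
        if node.get("name") == venue:
            tag = node.find("max_cpus")
            if tag is None:
                # The venue has no max_cpus entry yet, so add one.
                tag = ET.SubElement(node, "max_cpus")
            tag.text = str(cpus)
            return ET.tostring(root, encoding="unicode")
    raise KeyError(f"no venue named {venue!r}")


print(set_max_cpus(SAMPLE, "home", 2))
```

The sketch works on a string so it can be tried without touching a real installation; reading and writing the actual file is left out on purpose.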

Not having HT, I don't know if this is normal behaviour, but are you sure the system idle process hasn't got some hours, or that something else hasn't been eating resources?
Another reason for not showing full cpu-time is if you haven't selected "leave apps in memory" and the app has been stopped for a benchmark or something. Another reason is if CPDN changed phase, which happens at 33.3% and 66.6%.
Or for that matter the computer crashed and re-booted...

The expected time left is influenced by the benchmark in other BOINC-projects, but I'm not sure about CPDN for already started wu...

**VJ** · 20 November 2004, 06:56

Originally posted by Rattledagger
You can manually edit the preferences, just stop BOINC, and use Notepad to edit global_preferences.xml
[max_cpus]2[/max_cpus] under your venue.

Good to know!

Not having HT, I don't know if this is normal behaviour, but are you sure the system idle process hasn't got some hours, or that something else hasn't been eating resources?
Another reason for not showing full cpu-time is if you haven't selected "leave apps in memory" and the app has been stopped for a benchmark or something. Another reason is if CPDN changed phase, which happens at 33.3% and 66.6%.
Or for that matter the computer crashed and re-booted...

Well, with the hyperthreading: some CPU units (IIRC the integer unit) are present twice and can be used simultaneously. Other units aren't present twice (e.g. memory access), so if both processes need these, they'll only be allowed to use them one at a time. And away goes your parallelism.

The system idle process was virtually 0%, leave apps in memory is selected, the project was at 10% yesterday evening. The computer is set to remain off upon a power failure, and has no automatic login (so I would see if there was a power failure or a spontaneous reboot).

The expected time left is influenced by the benchmark in other BOINC-projects, but I'm not sure about CPDN for already started wu...

No problem... I can hardly wait to see what influence changing these settings will have (I still have to wait till tomorrow).

Jörg

**VJ** · 22 November 2004, 03:42

Well, it was the hyperthreading...

When setting it to use 2 CPUs, it gives an estimated time of 600 hours for the paused units. The estimated time for the ones that are running is the same, but it decreases a lot faster. So I'm guessing it is also about 600 hours.

So instead of finishing 4 units in 1600 hours, I'll finish 2 in 600 hours. Add the next two, and I'll have finished all four in about 1200 hours!
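The comparison above can be written out as a small calculation. This is only a sketch using the estimates quoted in this post (1600 hours per batch with 4 units at once, 600 hours per batch with 2 at once); `total_hours` is a hypothetical helper, and it assumes every batch takes the same time.

```python
import math


def total_hours(units, units_at_once, hours_per_batch):
    """Wall-clock hours to finish `units` when `units_at_once` run side by side."""
    batches = math.ceil(units / units_at_once)
    return batches * hours_per_batch


with_ht = total_hours(4, 4, 1600)      # all four logical CPUs busy
without_ht = total_hours(4, 2, 600)    # two at a time, in two batches

print(with_ht)                 # 1600
print(without_ht)              # 1200
print(with_ht - without_ht)    # 400
```

On these estimates, running two units at a time finishes the same four units about 400 hours sooner.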

Weird though that initially the CPU time spent on all 4 units was equal to the running time.

Now, these CPU time readings seem to illustrate the performance hit with hyperthreading.

Thanks!

Jörg

work unit size question