hyperthreading and rendering speed loss

  • hyperthreading and rendering speed loss

    I've done some searching on hyperthreading but didn't come up with satisfying answers.
    I'll be upgrading shortly to a 3 GHz P4 Prescott config, and I read somewhere that turning on HT should decrease rendering performance, because HT supposedly reserves CPU power for other apps running simultaneously with my NLE software.

    Before I assemble the kit I want to figure out whether I'll have to turn off HT in the BIOS (HT ON is the default), because I understand that you have to make HT settings before installing the OS.

    I want to install a P750 or a Parhelia (probably stuck to Matrox for life!). Am I correct in the assumption that earlier Parhelias only support AGP4x, where later Parhelias and Millennium P series support AGP8x?
    -Off the beaten path I reign-

    At Home:

    Asus P4P800-E Deluxe / P4-E 3.0Ghz
    2 GB PC3200 DDR RAM
    Matrox Parhelia 128
    Terratec Cynergy 600 TV/Radio
    Maxtor 80GB OS and Apps
    Maxtor 300 GB for video
    Plextor PX-755a DVD-R/W DL
    Win XP Pro

    At work:
    Avid Newscutter Adrenaline.
    Avid Unity Media Network.

  • #2
    What I would suggest is that you create two profiles in XP, one with HT on and one without. Do your initial install with HT off in the BIOS. Create a copy of that profile, reboot, and change the BIOS to enable HT. Then you can run your own independent tests to see if you notice any speed drops. I think your understanding of how HT works is a bit skewed. The processor does not RESERVE CPU cycles for another app if there isn't one running. It's just that if the application is multi-thread aware, it (the app) will take advantage of HT.
    Go Bunny GO!


    Titan:
    MSI NEO2-FISR | Intel P4-3.0C | 1024MB Corsair TWINX1024 3200LLPT RAM | ATI AIW 9700 Pro | Dell P780 @ 1024x768x32 | Turtle Beach Santa Cruz | Sony DRU-500A DVD-R/-RW/+R/+RW | WDC 100GB [C:] | WDC 100GB [D:] | Logitech MX-700

    Mini:
    Shuttle SB51G XPC | Intel P4 2.4Ghz | Matrox G400MAX | 512 MB Crucial DDR333 RAM | CD-RW/DVD-ROM | Seagate 80GB [C:] | Logitech Cordless Elite Duo

    Server:
    Abit BE6-II | Intel PIII 450Mhz | Matrox Millennium II PCI | 256 MB Crucial PC133 RAM | WDC 6GB [C:] | WDC 200GB [E:] | WDC 160GB [F:] | WDC 250GB [G:]


    • #3
      My understanding of how HT works is none whatsoever. It's just that I read these things about people having disappointing results in terms of rendering speed while CPU load would only reach 56% or so. As you can see I'm still on P!!! so I'm not familiar with hyperthreading (yet!).
      Just purchased a Parhelia 128 though, and I expect my new mainboard and CPU next week, so in a few weeks I should be less ignorant...
      -Off the beaten path I reign-

      At Home:

      Asus P4P800-E Deluxe / P4-E 3.0Ghz
      2 GB PC3200 DDR RAM
      Matrox Parhelia 128
      Terratec Cynergy 600 TV/Radio
      Maxtor 80GB OS and Apps
      Maxtor 300 GB for video
      Plextor PX-755a DVD-R/W DL
      Win XP Pro

      At work:
      Avid Newscutter Adrenaline.
      Avid Unity Media Network.


      • #4
        Ok, hyperthreading ...

        In the CPU, some units are present more than once, and there are different units for different purposes (e.g. integer, floating point, ...).
        Without hyperthreading, one process is running and uses the units it needs. This leaves some other units unused (e.g. when the program is doing floating point calculations, the integer unit is idle).
        (It is a bit simplified, but it comes down to this.)
        Any parallelism is achieved by processes taking turns; they don't actually run in parallel.

        Hyperthreading allows the unused units to be used by another process, so if one process is using the floating point unit, another (possibly unrelated) process can use the integer unit.
        Hyperthreading can yield a drop in performance if both processes do a lot of memory access (although IIRC, the latest Intel implementations try to detect this). It turns out that with hyperthreading in its current form, you gain a maximum of 20% in performance.

        Now to how Windows (W2K, W XP; which have knowledge of hyperthreading) sees this...

        Windows sees 2 logical CPUs, with no preference among them. However, as mentioned above, these CPUs are not independent (the units available to one logical CPU depend on what is running on the other one).
        In Task Manager, Windows cannot visualize this, so it just counts each logical CPU as 50% of the total. If one logical CPU is running a process, it shows 50% usage, as it considers one CPU in use. If no other processes are running, though, that process runs at the same speed as it would without hyperthreading (where Task Manager would show 100%).
        Just to illustrate the skewed behaviour of Task Manager, leave the processor idle for some time (with hyperthreading enabled). You'll see that the time spent 'idle' will be twice the real time.

        Hyperthreading does certainly yield slowdowns on operating systems that aren't hyperthreading aware (e.g. NT): the scheduler in XP or 2K knows the limitations of running the logical units in parallel, but NT doesn't. As a result, its scheduling will be off.


        I hope this makes some sense...
        The best way to test the speed is to just measure it; don't go by what Task Manager says.
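
        For anyone who wants to measure rather than read Task Manager, here is a minimal timing sketch in Python. The `burn` function is only a hypothetical stand-in for a real render job, and `time_workload` is a helper name made up for this example. The idea is to take the wall-clock time of the same job once with HT enabled and once with it disabled, then compare the two numbers.

```python
import statistics
import time


def burn(n):
    """Stand-in CPU-bound workload (hypothetical; replace with a real render)."""
    total = 0.0
    for i in range(1, n + 1):
        total += (i * i) % 7 / i
    return total


def time_workload(func, *args, repeats=3):
    """Return the median wall-clock time in seconds over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    seconds = time_workload(burn, 2_000_000)
    print(f"median wall-clock time: {seconds:.3f} s")
```

        Taking the median of a few runs is just one way to damp out background activity; a real test would use the actual NLE render and the same project each time.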


        Jörg
        pixar
        Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)


        • #5
          oh, I do believe that hyperthreading doesn't help much in video processing. At most, it can result in less swapping needed (for other running processes), but the effect is limited.


          Jörg
          pixar
          Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)


          • #6
            I think there is a way to force WXP to run specific apps on the main logical CPU (Set Affinity to zero I think). Assuming you'll be getting a Prescott core and use dual channel mem, I would be surprised if you met any memory bottlenecks (like VJ does).

            I think you should not suffer a loss in rendering once that is done. Any resources left would most likely first be allocated to run on the second logical CPU, which AFAIK should not halt rendering on LCPU 0 at all (provided MEM is OK, which I think it will be).

            Just my ign. $0.02

            edit: The idea being that you'd be using the lesser CPU for secondary tasks like browsing or running CPDN and what not, which should actually *help* rendering speeds as it gets interrupted less.
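
            As a rough illustration of what "setting affinity" means underneath: affinity is usually expressed as a bitmask with one bit per logical CPU, so selecting only logical CPU 0 gives a mask value of 1, not 0. The sketch below is an assumption-laden example, not a Windows recipe. `affinity_mask` and `pin_current_process` are names invented here, and `os.sched_setaffinity` is only available on Linux; on XP the same choice is made through Task Manager's Set Affinity dialog.

```python
import os


def affinity_mask(cpus):
    """Build a bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


def pin_current_process(cpus):
    """Pin this process to `cpus` where the OS exposes the call.

    os.sched_setaffinity exists on Linux only; returns False elsewhere.
    """
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cpus))
        return True
    return False


if __name__ == "__main__":
    print(hex(affinity_mask([0])))      # 0x1: logical CPU 0 only
    print(hex(affinity_mask([0, 1])))   # 0x3: both logical CPUs
```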
            Join MURCs Distributed Computing effort for Rosetta@Home and help fight Alzheimers, Cancer, Mad Cow disease and rising oil prices.
            [...]the pervading principle and abiding test of good breeding is the requirement of a substantial and patent waste of time. - Veblen


            • #7
              Originally posted by Umfriend
              I think there is a way to force WXP to run specific apps on the main logical CPU (Set Affinity to zero I think).
              There is no main logical CPU: all are treated as equal. If 2 processes are running on the logical CPUs, and both of them need a specific internal unit, there is no priority between the 2 logical CPUs (yes, both CPUs Windows sees are logical).

              Assuming you'll be getting a Prescott core and use dual channel mem, I would be surprised if you met any memory bottlenecks (like VJ does).
              I'm still not sure the slower performance is due to a memory bottleneck: even one process (nothing in parallel) seems to run slower than on my P4.
              Landrover: my system is a dual Xeon 2.4 GHz, with hyperthreading (so I see 4 logical CPUs in Windows).

              I think you should not suffer a loss in rendering once that is done. Any resources left would most likely first be allocated to run on the second logical CPU, which AFAIK should not halt rendering on LCPU 0 at all (provided MEM is OK, which I think it will be).
              Not quite. The scheduler allocates the units that are shared between LCPU0 and LCPU1 alternately (swapping), instead of having LCPU0 use them and preventing LCPU1 from using them. This latter scheme would open the door to deadlocks, and would violate some basic behaviour of parallel systems.

              edit: The idea being that you'd be using the lesser CPU for secondary tasks like browsing or running CPDN and what not, which should actually *help* rendering speeds as it gets interrupted less.
              Again, the process running on one logical CPU will be interrupted if a process on the other logical CPU needs the same unit.
              The gain of hyperthreading is that some of the operations can be done in parallel, so the execution of both processes with hyperthreading is faster than the execution of both processes in serial; but the execution of one of these processes can take longer than its execution as a single process. With a true SMP (dual CPU) system, the execution time of each process should be about the same as if it ran exclusively on a single CPU system.

              But as those secondary CPU tasks will be present anyway, without hyperthreading *everything* will be swapped, whereas with hyperthreading only the units that cannot be run in parallel will be swapped. And there is the true gain: no computer today runs a single process.


              Jörg
              Last edited by VJ; 10 January 2005, 03:58.
              pixar
              Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)


              • #8
                OK, The practicalities, from experience:

                HT on, non-HT-enabled software, typical processing speed 95 - 100%

                HT on, HT-enabled software, typical processing speed 110-120%

                HT off, non-HT-enabled software, typical processing speed 100%

                HT off, HT-enabled software, typical processing speed 100%

                i.e. don't expect miracles in rendering speed, cos you won't get 'em

                Remember the process is called HYPErthreading.
                Brian (the devil incarnate)


                • #9
                  Ah, uhm, OK. Thx VJ, Sry Landrover.
                  Join MURCs Distributed Computing effort for Rosetta@Home and help fight Alzheimers, Cancer, Mad Cow disease and rising oil prices.
                  [...]the pervading principle and abiding test of good breeding is the requirement of a substantial and patent waste of time. - Veblen


                  • #10
                    Cheers everybody. It's all relatively clear now.
                    Brian, that's your best phrase since the hilarious "codswallop".
                    B.t.w. have you moved to AMD?
                    I don't expect miracles, but I'll be running a Prescott CPU indeed, so it'll certainly outperform my current, though precious, rig.
                    Got myself a genuine Parhelia yesterday, so I'm happy like a child.
                    -Off the beaten path I reign-

                    At Home:

                    Asus P4P800-E Deluxe / P4-E 3.0Ghz
                    2 GB PC3200 DDR RAM
                    Matrox Parhelia 128
                    Terratec Cynergy 600 TV/Radio
                    Maxtor 80GB OS and Apps
                    Maxtor 300 GB for video
                    Plextor PX-755a DVD-R/W DL
                    Win XP Pro

                    At work:
                    Avid Newscutter Adrenaline.
                    Avid Unity Media Network.


                    • #11
                      Originally posted by Brian Ellis
                      OK, The practicalities, from experience:
                      Yup... Just to add to it:

                      HT on, 2 independent single threads: 95-120%
                      (i.e. possibly slightly faster than running them sequentially (which would be 100%), but also possibly slower)
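
                      To put your own timings on the same scale as the percentages above, something like the following sketch could be used. `relative_speed` is a name made up for this example, and the numbers in the demo are invented for illustration, not measurements. It assumes you have timed the two jobs run one after the other (the 100% baseline) and then run together with HT on.

```python
def relative_speed(sequential_seconds, concurrent_seconds):
    """Speed of a concurrent run relative to the sequential baseline, in percent.

    100 means no change; above 100 means the concurrent run finished sooner.
    """
    if sequential_seconds <= 0 or concurrent_seconds <= 0:
        raise ValueError("timings must be positive")
    return 100.0 * sequential_seconds / concurrent_seconds


if __name__ == "__main__":
    # Made-up timings for illustration only, not measurements:
    # two jobs take 120 s back to back and 100 s when run together.
    print(relative_speed(120.0, 100.0))  # 120.0
```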


                      Jörg
                      pixar
                      Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)


                      • #12
                        56% processor usage with HT on means that one processor, let's call it the physical one, is 100% loaded. The logical processor is 12% loaded. That adds up to 112% processor loading. The best you can do with non-HT systems is 100% processor load.
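
                        The arithmetic behind that reading can be written out as a small sketch. The function names are made up for this example, and `split_load` simply assumes that one logical CPU fills up completely before the other gets any work, which is the interpretation used above rather than something Task Manager reports.

```python
def per_logical_cpu_load(overall_percent, logical_cpus=2):
    """Convert Task Manager's overall CPU figure into single-logical-CPU units."""
    if logical_cpus < 1:
        raise ValueError("need at least one logical CPU")
    return overall_percent * logical_cpus


def split_load(overall_percent, logical_cpus=2):
    """Assume one logical CPU saturates first; return (busy, remainder) in percent."""
    total = per_logical_cpu_load(overall_percent, logical_cpus)
    busy = min(total, 100)
    return busy, total - busy


if __name__ == "__main__":
    print(per_logical_cpu_load(56))  # 112
    print(split_load(56))            # (100, 12)
```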

                        I have not noticed a drop in performance in any video applications I tested. MSP shows a small but significant increase in performance, as does TMPGEnc.

                        Unless you can demonstrably show it slowing your application of choice, or causing instability-related problems, I would leave HT enabled.

                        - Mark

                        Core 2 Duo E6400 o/c 3.2GHz - Asus P5B Deluxe - 2048MB Corsair Twinx 6400C4 - ATI AIW X1900 - Seagate 7200.10 SATA 320GB primary - Western Digital SE16 SATA 320GB secondary - Samsung SATA Lightscribe DVD/CDRW- Midiland 4100 Speakers - Presonus Firepod - Dell FP2001 20" LCD - Windows XP Home
