Hot from the oven: Athlon XP2800 333MHz FSB


  • #16
    So I guess it's time for me to chime in. Some of the things talked about here I know the answers to, but they aren't public yet. Sorry if I seem to ignore some things because of that.

    Compilers: Yes, IA-64 depends a lot on its compilers. But it isn't stupid without them. For example, branch prediction is very good in McKinley. One stat I remember is that McKinley predicts and fetches indirect branches about 10% of the time, while everything else out there is closer to 1%. That's huge, since it means McKinley prefetches that code from main memory way ahead of time. That's a pretty good job of "turning corners."
    Predication is also really cool. It's kind of like tagging assembly code with (if A then B else C). It sounds kind of like branch prediction, but predication performs a lot like conditional branching without the fetching overhead. McKinley can start executing B & C, leaving things incomplete where it has to, and as soon as it can figure out A, it discards B or C accordingly. IA-64 does it, and a good compiler will do a better job of tuning the predicates to boost performance.
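To make the "(if A then B else C)" idea above concrete, here is a toy Python model of predication. It is only a sketch of the semantics, not of McKinley's hardware: the function names and the example arithmetic are made up for illustration. The predicated version evaluates both arms and lets the predicate decide which result survives, so there is no branch to predict.

```python
def branchy(a, x):
    """Conventional control flow: only one arm is ever evaluated."""
    if a:
        return x + 1   # arm B
    return x * 2       # arm C


def predicated(a, x):
    """If-converted form: both arms run, predicates pick the survivor."""
    p_true = bool(a)       # predicate produced by the compare on A
    p_false = not p_true   # complementary predicate
    b = x + 1              # arm B, executed regardless of A
    c = x * 2              # arm C, executed regardless of A
    # Only the result guarded by a true predicate contributes; the other
    # is multiplied by False (0) and so is effectively discarded.
    return p_true * b + p_false * c
```

Both functions return the same value for any integer `x`; the difference is that `predicated` has no data-dependent jump, at the cost of doing the work for both arms.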
    Also, IA-64 has loop flattening (parallelism). So if you wrote for (i = 0; i < 10; i++) etc., most processors will have to branch back 10 times and iterate over the loop 10 times. IA-64 will run the cases in parallel, even though it might be a recursive loop, and merge the answer at the end. That might take as long as 2-3 iterations of the loop would, instead of 10. When you think about how much of your processor's intensive time is spent in loops, that's a big plus.
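The "run the cases in parallel and merge the answer at the end" idea can be sketched roughly as follows. This is a toy Python model under loud assumptions: the function names and the `lanes` parameter are hypothetical, and plain Python still executes everything sequentially. What it shows is the restructuring. A running total that would otherwise form one long dependency chain is split into independent partial sums, which a wide machine could advance side by side, followed by a single merge step.

```python
def sum_serial(values):
    """One dependency chain: each add must wait for the previous one."""
    total = 0
    for v in values:
        total += v
    return total


def sum_interleaved(values, lanes=4):
    """Split the work into `lanes` independent chains, then merge.

    The partial sums do not depend on each other, so a machine with
    enough execution units could update them in the same cycle.
    """
    partial = [0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v   # each lane only touches its own slot
    return sum(partial)           # merge the answer at the end
```

For example, `sum_interleaved(range(10))` gives the same result as `sum_serial(range(10))`. On real IA-64 hardware this kind of overlap is scheduled by the compiler rather than discovered at run time, which is part of why the compiler matters so much.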
    IA-64 is really built with speculative processing in mind, but I can't get into what the future holds there too much. IA-64 does have ALAT instructions though, which help a lot if the compiler can use them well. ALATs are advanced loads, and when they're in the code, it's like saying "get this if you have time." So, if McKinley is doing a bunch of loads and stores for the code it's dealing with at the moment, it may see the ALAT and ignore it, but when it's got some free memory bandwidth, it will remember the ALAT entry, and pre-fetch something from memory before it is needed. You can throw a bunch of ALATs into the code, and they can be invalidated easily, so that if you don't need them, no harm, no foul.
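For readers who want a feel for how advanced loads and their invalidation fit together, here is a toy Python model. In Intel's public Itanium documentation, ALAT stands for Advanced Load Address Table: an advanced load records an entry, a later store to the same address removes it, and a check either keeps the early value or reloads. The class and method names below are hypothetical, and the model ignores details such as access sizes, partial overlaps, and table capacity.

```python
class ToyALAT:
    """Very simplified model of advanced loads tracked by an ALAT."""

    def __init__(self, memory):
        self.memory = memory   # dict: address -> value
        self.entries = {}      # register name -> address being tracked
        self.registers = {}    # register name -> speculatively loaded value

    def advanced_load(self, reg, addr):
        """Load early, and remember which address the value came from."""
        self.registers[reg] = self.memory[addr]
        self.entries[reg] = addr

    def store(self, addr, value):
        """A store invalidates any advanced load from the same address."""
        self.memory[addr] = value
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def check_load(self, reg, addr):
        """Keep the early value if its entry survived, else reload."""
        if self.entries.get(reg) != addr:
            self.registers[reg] = self.memory[addr]
        return self.registers[reg]
```

A short usage example: after `alat.advanced_load("r1", 0x10)`, a store to a different address leaves the entry alone and `check_load` returns the early value for free. A store to `0x10` drops the entry, so `check_load` quietly reloads the fresh value, which is the "no harm, no foul" case.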

    A lot of this stuff is covered lightly here: http://cpus.hp.com/technical_referen...64_arch_wp.pdf

    McKinley isn't that much hotter because of things like parallel execution. It's hotter because it's frigging huge, and on a 180nm process. 3MB of single-cycle cache burns a hell of a lot of power, and having the world's fastest, most aggressive FPU doesn't help that either. On the other hand, it does mean that the McKinley SETI@Home client is ****loads faster than anybody else's. But Madison is coming, and later IA-64 implementations. Power should get better, while you'll see everyone else's power consumption rise over time.
    Gigabyte P35-DS3L with a Q6600, 2GB Kingston HyperX (after *3* bad pairs of Crucial Ballistix 1066), Galaxy 8800GT 512MB, SB X-Fi, some drives, and a Dell 2005fpw. Running WinXP.



    • #17
      I was under the impression that branch prediction had been thrown out of the window completely and replaced with predication. That probably explains most of what I wrote.

      Otherwise, a fair bit to get my head around there. Thanks for all that. I'll go away and read the PDF (tomorrow, it's late now and I have to get up early), and perhaps come back with more questions. Which you probably won't be allowed to answer.
      Blah blah blah nick blah blah confusion, blah blah blah blah frog.



      • #18
        Thanks for the information, wombat.

        As hopefully pointed out in my first post, I am much more interested in the IA-64 CPU than the AMD Hammer CPU.

        Ribbit, you're wrong about compilers not knowing about the code they're running. Compilers know MORE about the code than the CPU does, because the compilers have to generate it. Compilers see the original source code of the programs and can optimize structures used in that source. They also have a much larger window into the code, whereas a CPU would only have at most a dozen or so instructions ready for parallelization at a time. A compiler should be able to do a better job at parallelization than CPUs.
        80% of people think I should be in a Mental Institute



        • #19
          I don't care if the CPU comes from AMD or from Intel, I don't even care for 64Bit. I will just choose what delivers the most performance for a still reasonable price.
          Last edited by Indiana; 3 October 2002, 06:20.
          But we named the *dog* Indiana...
          My System
          2nd System (not for Windows lovers )
          German ATI-forum



          • #20
            Originally posted by Indiana
            I don't care if the CPU comes from AMD or from Intel, I don't even care for 64Bit. I will just choose what delivers the most performance for a still reasonable price.
            That too,

            But the IA-64 does interest me from a computer science aspect more than the x86-64 does, in the same way that the PowerPC is more interesting than the 32-bit x86 CPUs.

            I will still buy what gives me good bang for my buck (which will probably be the x86-64 CPUs for a few years yet).
            80% of people think I should be in a Mental Institute

