Supercomputer VS. Home computer

  • Supercomputer VS. Home computer

    I've been trying to search for information about this, but so far I've failed to find everything I need.

    The questions are:

    1. How many FLOPS did older supercomputers have (the Cray-1, for example)?
    2. What's the difference between SPECfp95 and SPECfp2000?
    3. Are home computers and supercomputers measured by the exact same benchmarks (see question 2)?
    4. How many MFLOPS do current mass-production CPUs (Athlon XP 1600-2600, P4 1.6-3.2 GHz, latest SPARC) have?
    5. How much (if at all) does memory architecture affect the outcome of those tests?

  • #2
    Found it!

    Computing Performance in FLOPS
    <table width="400" border="1" style="font-size:10pt">
    <tr><td><b>Performance Tier</b></td><td><b>FLOPS equivalent</b></td><td><b>Key Platforms</b></td></tr>
    <tr><td>kiloflops (KFLOPS)</td><td>1,000 FLOPS</td><td>IBM 701 (1953)<br>IBM 704 (1955)<br>Apple II (1977)</td></tr>
    <tr><td>megaflops (MFLOPS)</td><td>1,000,000 FLOPS</td><td>CDC 6600 (1966)<br>Cray 1 (1976)<br>Intel Pentium (1993)</td></tr>
    <tr><td>gigaflops (GFLOPS)</td><td>1,000,000,000 FLOPS</td><td>Cray 2 (1985)<br>Thinking Machines CM-2 (1987)<br>Microsoft Xbox (2001)</td></tr>
    <tr><td>teraflops (TFLOPS)</td><td>1,000,000,000,000 FLOPS</td><td>Intel ASCI Red (1996)<br>IBM ASCI Blue Pacific (1998)<br>IBM ASCI White (2000)<br>NEC Earth Simulator (2002)</td></tr>
    <tr><td>petaflops (PFLOPS)</td><td>1,000,000,000,000,000 FLOPS</td><td>IBM Blue Gene (2005-2010?)</td></tr></table>
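
    The tiers in the table step up by powers of 1,000. A minimal sketch of mapping a raw FLOPS figure onto these tier names (the `flops_tier` helper is hypothetical, just to illustrate the scale):

```python
# Tier thresholds from the table above, largest first.
TIERS = [
    (1e15, "petaflops"),
    (1e12, "teraflops"),
    (1e9, "gigaflops"),
    (1e6, "megaflops"),
    (1e3, "kiloflops"),
]

def flops_tier(flops):
    """Return the name of the largest tier whose threshold the figure reaches."""
    for threshold, name in TIERS:
        if flops >= threshold:
            return name
    return "flops"

print(flops_tier(160e6))   # Cray-1 class (~160 MFLOPS peak) -> megaflops
print(flops_tier(1.3e12))  # ASCI Red class -> teraflops
```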

    Comment


    • #3
      An interesting note: check the difference in years between the supercomputers and the home computers in each tier.

      1st block: 22-24 years

      2nd block: 17-27 years

      3rd block: 14-16 years

      It's getting shorter every time.
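
      The gaps can be recomputed directly from the years in the table in post #2; a quick sketch comparing the first supercomputer in each tier against that tier's home/consumer platform:

```python
# Years from the table in post #2: (first supercomputer, home/consumer platform) per tier.
tier_years = {
    "KFLOPS": (1953, 1977),  # IBM 701 -> Apple II
    "MFLOPS": (1966, 1993),  # CDC 6600 -> Intel Pentium
    "GFLOPS": (1985, 2001),  # Cray 2 -> Microsoft Xbox
}

for tier, (super_year, home_year) in tier_years.items():
    print(tier, home_year - super_year)  # gap in years: 24, 27, 16
```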

      Comment


      • #4
        Last note before going home (writing this at work):

        Supercomputer graphics usually take their power from the CPU, unlike home computers. So for 3D output, our home computers with T&L-enabled graphics accelerators should be at least as strong as supercomputers from 7-8 years ago.

        Comment


        • #5
          Rendering farms still don't use any 3D chips, because precision is higher with CPUs.
          I don't think that has changed in the meantime, or has it?
          no matrox, no matroxusers.

          Comment


          • #6
            The ATI 9700 supports rendering at up to 128-bit FP precision (although it's not a full 128-bit pipeline). The GeForce FX has a full 128-bit pipeline as well. This, I believe, either equals or exceeds what most movies use.

            Render farms will start being consolidated down to a handful of dedicated graphics cards within the next few years...
            "And yet, after spending 20+ years trying to evolve the user interface into something better, what's the most powerful improvement Apple was able to make? They finally put a god damned shell back in." -jwz

            Comment


            • #7
              Re: Supercomputer VS. Home computer

              Originally posted by Dogbert
              4. How many MFLOPS do current mass-production CPUs (Athlon XP 1600-2600, P4 1.6-3.2 GHz, latest SPARC) have?
              5. How much (if at all) does memory architecture affect the outcome of those tests?
              Although the Athlon XP has 3 FPU pipelines, I believe its IPC for FPU operations is closer to 2 (someone correct me, please...). This means a basic XP 1900+ (1600 MHz) rates from around 3.2 to 4.8 GFLOPS.

              The P4 is harder to place on the GFLOPS scale. Basically it should be able to execute 1-2 FPU operations per clock cycle, but it is probably closest to 1. If SSE2 is applicable to your application, those SIMD instructions should push the IPC much closer to 2. So a basic 2.4 GHz P4 should rate around 2.4 to 4.8 GFLOPS.
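
              These peak figures all come from the same back-of-the-envelope formula: clock rate times FP operations retired per cycle. A minimal sketch (the per-cycle IPC values are the post's guesses, not vendor specifications):

```python
def peak_gflops(clock_mhz, flops_per_cycle):
    """Peak GFLOPS = clock (cycles/s) * FP operations per cycle."""
    return clock_mhz * 1e6 * flops_per_cycle / 1e9

# Athlon XP 1900+ (1600 MHz), assuming 2-3 FP ops/cycle as above:
print(peak_gflops(1600, 2), peak_gflops(1600, 3))  # 3.2 4.8
# P4 at 2.4 GHz, 1-2 FP ops/cycle (closer to 2 with SSE2):
print(peak_gflops(2400, 1), peak_gflops(2400, 2))  # 2.4 4.8
```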

              The UltraSPARC is a generation behind the Athlon/P4 in FPU performance, but it does have 64-bit as standard.



              Memory architecture does not affect peak GFLOPS, but it definitely affects SPEC scores.

              Regards,
              lurqa

              Comment


              • #8
                I know there are a number of commonly used legacy FPU instructions that take 3+ cycles to execute on the P4 but used to take 1-2 cycles on the P5/P6 architecture. One of the trade-offs of reaching those high clock speeds... The P4 really does shine when using SSE2, though.
                "And yet, after spending 20+ years trying to evolve the user interface into something better, what's the most powerful improvement Apple was able to make? They finally put a god damned shell back in." -jwz

                Comment


                • #9
                  I remember reading about the Cray 2, and from what I read it wasn't as fast as a GFLOP... maybe half that. I remember finding that its performance was roughly equal to a 400 MHz Pentium II in GFLOPS. When we saw the Cray 2 back in the mid-80s and thought what an awesome machine it was, I wonder if anyone would have imagined it being outperformed by a gaming console 15 years later.

                  Comment


                  • #10
                    Originally posted by DGhost
                    The ATI 9700 supports rendering at up to 128-bit FP precision (although it's not a full 128-bit pipeline). The GeForce FX has a full 128-bit pipeline as well. This, I believe, either equals or exceeds what most movies use.

                    Render farms will start being consolidated down to a handful of dedicated graphics cards within the next few years...
                    A colleague of mine recently went to a conference on computing (it covered caches, optimisations, parallelism...) and told me there was a presentation there on exactly this topic. A researcher had written a program to perform matrix calculations on a video card. Whereas execution time rose quasi-linearly for 10x10, 100x100, ... matrices on a P4, it remained almost constant for the same calculations on the video card. The main issue was the delay in putting the data on the video card and retrieving it afterwards. Of course, the calculations were limited, both due to the instruction set (lack of certain control instructions that would be useful, ...) and the limited program size, but the performance was awesome.
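
                    That transfer delay dominating for small matrices can be sketched with a back-of-the-envelope split between bus time and compute time. The bandwidth and compute-rate figures below are illustrative assumptions, not measurements:

```python
def offload_times(n, bus_bytes_per_s, gpu_flops_per_s):
    """Rough time split for offloading an n x n float32 matrix multiply:
    move two inputs and one result over the bus, then ~2*n^3 FLOPs on the card."""
    bytes_moved = 3 * n * n * 4          # two input matrices + one result, 4 bytes each
    flops = 2 * n ** 3                   # multiply-add count for a dense matmul
    return bytes_moved / bus_bytes_per_s, flops / gpu_flops_per_s

# Illustrative numbers: ~1 GB/s bus (AGP-class), ~10 GFLOPS card.
for n in (10, 100, 1000):
    t_bus, t_calc = offload_times(n, 1e9, 10e9)
    print(n, t_bus, t_calc)  # transfer dominates at n=10; compute dominates at n=1000
```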


                    Jörg
                    pixar
                    Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)

                    Comment


                    • #11
                      One of the problems is that none of the video cards are designed with moving that data back in mind. AGP allows for fast reads and writes, but the video cards are built only to grab information; their output back over the bus is painfully slow.
                      Gigabyte P35-DS3L with a Q6600, 2GB Kingston HyperX (after *3* bad pairs of Crucial Ballistix 1066), Galaxy 8800GT 512MB, SB X-Fi, some drives, and a Dell 2005fpw. Running WinXP.

                      Comment
