
I think I found a real reason for Q3demo1 slowdowns


  • I think I found a real reason for Q3demo1 slowdowns

    Ok, so I got some free time on my hands this morning and decided to apply my engineering senses to the performance issues in the new 5.21 drivers for G400s. First the setup:

    Abit BX6 rev 2.0, Celeron 300a/464Mhz, 192MB 125Mhz SDRAM, 100Mhz FSB, 66Mhz AGP (2/3), 64MB AGP Aperture Size, Matrox Millennium G400MAX 32MB DualHead 157.5Mhz/210Mhz, 7200rpm UDMA Quantum Fireball+ KA 18.2GB.

    The drivers were installed on a clean setup. G400 Tweak was used to make sure VSync was ON or OFF in the following tests.

    The registry key: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Class\Display\0000\Settings\DirectX
    was verified to change depending on the settings in G400 Tweak.
    There were no entries 0001, 0002, and so on in this subtree.

    Quake3 v.1.08 was used for the benchmarking.

    Only the 1024x768 resolution was targeted in this research.

    First thing I did was to run both demo1 and demo2 at 1024x768 @ 85Hz for all possible permutations of the {color depth, texture quality, texture filter} settings. The following were the obtained results:

    Color Depth; Texture Depth; Texture Filter; Demo1 (fps); Demo2 (fps)
    16; 16; bilinear; 38.2; 38.6
    16; 16; trilinear; 34.0; 32.1
    32; 16; bilinear; 35.5; 36.4
    32; 16; trilinear; 29.9; 29.6
    16; 32; bilinear; 30.0; 37.7
    16; 32; trilinear; 21.7; 30.1
    32; 32; bilinear; 19.0; 35.4
    32; 32; trilinear; _13.9_; 27.4

    timedemo 1
    r_swapinterval 0
    demo q3demo1.dm3
    demo q3demo2.dm3

    was used in the above tests. Texture detail was always at maximum (4/4)
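
    One way to read the table above is to look at how far demo1 falls behind demo2 for each combination of settings. The snippet below is only a minimal sketch of that comparison (not part of the original benchmarking); the figures are copied from the table and the helper name `demo_gap` is made up for illustration. With 16-bit textures the two demos stay within about 2 fps of each other, while with 32-bit textures demo1 trails demo2 by roughly 8 to 16 fps.

```python
# Results copied from the table above (5.21 drivers, 1024x768 @ 85Hz).
# Tuple layout: (color depth, texture depth, filter, demo1 fps, demo2 fps)
RESULTS = [
    (16, 16, "bilinear", 38.2, 38.6),
    (16, 16, "trilinear", 34.0, 32.1),
    (32, 16, "bilinear", 35.5, 36.4),
    (32, 16, "trilinear", 29.9, 29.6),
    (16, 32, "bilinear", 30.0, 37.7),
    (16, 32, "trilinear", 21.7, 30.1),
    (32, 32, "bilinear", 19.0, 35.4),
    (32, 32, "trilinear", 13.9, 27.4),
]


def demo_gap(row):
    """Hypothetical helper: how far demo1 falls behind demo2, in fps."""
    return round(row[4] - row[3], 1)


# Map each (color depth, texture depth, filter) setting to its demo1 shortfall.
gaps = {(r[0], r[1], r[2]): demo_gap(r) for r in RESULTS}
worst = max(gaps, key=gaps.get)

for config, gap in gaps.items():
    print(config, gap)
print("largest demo1 shortfall:", worst, gaps[worst])
```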


    Look at the 13.9 figure! Under 5.13 the same settings produce 26.0fps! Interestingly enough, for demo2 the 5.13's 27.0 is replaced with 27.4. Something is really wrong here... Since 5.21 has VSync ON and 5.13 has it OFF, this seems like the first possible reason. If the problem is in VSync we would expect to see different FPS when switching refresh rates... Additionally, we might expect to see substantially different results when switching r_swapinterval between 0 and 1. So I ran the 1024x768x32 test from above at various refresh rates and with r_swapinterval at 0 and at 1. The results were as follows:

    Refresh Rate (Hz); FPS (r_swapinterval 1); FPS (r_swapinterval 0)
    60; 12.4; 14.0
    70; 12.6; 13.9
    75; 12.6; 13.9
    85; 12.7; 13.9
    90; 12.7; 13.8
    100; 12.7; 13.7
    120; 12.7; 13.5

    So with VSync ON the FPS is not a function of the refresh rate! What could this mean? Probably that these numbers are indeed the true performance of the card (or should I say drivers) at these settings. And besides, if the performance were hurt by VSync, the above FPS would divide the refresh rates by an integer number! 60/14.0=4.29; 70/13.9=5.04; 75/13.9=5.40; 85/13.9=6.12; 90/13.8=6.52; 100/13.7=7.30; 120/13.5=8.89. The only refresh rate that seems to divide nicely is 70, but this is definitely just a coincidence when compared to the 6 other results. So the conclusion is that r_swapinterval does NOT change VSync state. Or perhaps worse yet, 5.21 OpenGL applications cannot turn VSync ON and OFF? It might be just Quake3 though...
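
    The integer-divisor argument above can be checked mechanically. The following is a minimal sketch, using the r_swapinterval 0 figures from the refresh rate table; the 0.05 tolerance for "close enough to a whole number" is an arbitrary assumption, and `divisor_check` is a made-up helper name. Only the 70Hz run comes out integer-like, which matches the reading above that it is a coincidence.

```python
# (refresh rate in Hz, fps with r_swapinterval 0), copied from the table above.
MEASUREMENTS = [
    (60, 14.0), (70, 13.9), (75, 13.9), (85, 13.9),
    (90, 13.8), (100, 13.7), (120, 13.5),
]

TOLERANCE = 0.05  # assumed cut-off for "close enough to an integer"


def divisor_check(refresh_hz, fps):
    """Return (ratio, True if the ratio is within TOLERANCE of a whole number)."""
    ratio = refresh_hz / fps
    return ratio, abs(ratio - round(ratio)) <= TOLERANCE


# Refresh rates whose fps looks like refresh/N for some whole number N.
locked = []
for hz, fps in MEASUREMENTS:
    ratio, is_integer = divisor_check(hz, fps)
    print(f"{hz:>3} Hz / {fps} fps = {ratio:.2f}  integer-like: {is_integer}")
    if is_integer:
        locked.append(hz)

print("integer-like refresh rates:", locked)
```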

    Also note that r_swapinterval 1 doesn't differ from 0 by much... The difference is probably due to the additional code executing in the inner loop of the engine trying to wait for the VSync, whereas with r_swapinterval 0 it never waits... The VSync is probably forced later on by the driver's underlying layer (OpenGL or not, to write to video memory they still use the drivers with their own settings...)

    So now that we know that r_swapinterval doesn't change VSync, we take a look at the other way to change VSync in Q3: Sync Every Frame in the Game Options. In addition we take a look at another possible performance culprit: Texture Details. Just to make sure, we also run all of these tests for both r_swapinterval 1 and 0. This is what I got (85Hz refresh rate again):


    Sync Every Frame OFF

    r_swapinterval; Texture Detail 1; Texture Detail 2; Texture Detail 3; Texture Detail 4
    1; 32.9; 31.8; 27.2; 12.7
    0; 36.6; 36.1; 32.3; 13.8


    Sync Every Frame ON

    r_swapinterval; Texture Detail 1; Texture Detail 2; Texture Detail 3; Texture Detail 4
    1; 30.4; 29.3; 26.1; 13.1
    0; 35.2; 34.3; 30.6; 14.2

    Bingo! It seems we found the real reason for the frame drops! The Texture Details. Notice the HUGE drop between 3 and 4. This might explain why most of the people in this forum didn't see any difference in FPS between 5.13 and 5.21 in Q3 demo1! Unless you run it at ALL ON, which means 4/4 texture details as well as 32bit color depth, 32bit textures, trilinear, you will indeed get results similar to 5.13 in this test.
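
    To put a number on that drop, here is a minimal sketch that computes the percentage of fps lost at each step up in texture detail, using the figures from the two Sync Every Frame tables above. The dictionary layout and the helper name `step_drops` are made up for illustration. The first two steps cost under 15% each, while the step from 3 to 4 costs roughly 50 to 57% in every run.

```python
# fps by texture detail level (1..4), copied from the tables above.
# Keys: (Sync Every Frame setting, r_swapinterval value)
FPS = {
    ("off", 1): [32.9, 31.8, 27.2, 12.7],
    ("off", 0): [36.6, 36.1, 32.3, 13.8],
    ("on", 1): [30.4, 29.3, 26.1, 13.1],
    ("on", 0): [35.2, 34.3, 30.6, 14.2],
}


def step_drops(fps_by_level):
    """Percentage of fps lost at each step up in texture detail."""
    return [
        round(100 * (a - b) / a, 1)
        for a, b in zip(fps_by_level, fps_by_level[1:])
    ]


# Each entry holds the drops for steps 1->2, 2->3 and 3->4.
drops = {key: step_drops(values) for key, values in FPS.items()}
for key, values in drops.items():
    print(key, values)
```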

    We can again see that neither r_swapinterval nor Sync Every Frame does its job. The frame rate difference is 4-5fps for 1-3 texture details in all cases... this can be attributed to the engine trying to wait (or not) for the sync in the inner loop. The FPS never divide 85Hz either... so these are all real FPS output by these drivers. Refresh rate is high enough not to influence the FPS noticeably.

    Finally we try the G400 Tweak, or the reghack to disable the VSync for the drivers themselves in HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Class\Display\0000\Settings\DirectX. Running the troubled setting once again (q3demo1.dm3 at ALL ON) we still get 13.8, but this time with tearing! So this VSync switch DOES work. But it produces no difference in Q3 q3demo1.dm3! Again this points to the fact that this particular setup with 5.21 drivers does produce only 13.8 fps. VSync does NOT influence it.

    Now what about the performance increase from 5.13 to 5.21? Well, I also ran Quake2. And there you do see the difference. Just as Ant mentioned the FPS goes way up. At 1024x768x16bit @ 85Hz VSync ON (in the registry) produces:

    demo1 = 42.6; demo2 = 41.6

    with VSync OFF I got:

    demo1 = 62.0; demo2 = 60.3

    That's about the 30% increase "promised", isn't it? Too bad that after playing with VSync ON, the VSync OFF tearing is just TOO noticeable... oh well, I don't play Q2 anyways ;0)
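
    For reference, the size of the gap between the two Quake2 runs depends on which run is taken as the baseline. A minimal sketch of the arithmetic, using only the four figures above (the helper name `relative_change` is made up for illustration): measured against VSync ON, the VSync OFF runs are about 45% faster; measured against VSync OFF, the VSync ON runs are about 31% slower, which is the figure closest to 30%.

```python
# Quake2 fps at 1024x768x16 @ 85Hz, copied from the figures above.
VSYNC_ON = {"demo1": 42.6, "demo2": 41.6}
VSYNC_OFF = {"demo1": 62.0, "demo2": 60.3}


def relative_change(on_fps, off_fps):
    """Return (gain of OFF over ON, loss of ON versus OFF), both in percent."""
    gain = 100 * (off_fps - on_fps) / on_fps
    loss = 100 * (off_fps - on_fps) / off_fps
    return round(gain, 1), round(loss, 1)


changes = {
    demo: relative_change(VSYNC_ON[demo], VSYNC_OFF[demo])
    for demo in VSYNC_ON
}
for demo, (gain, loss) in changes.items():
    print(f"{demo}: OFF is {gain}% faster than ON; ON is {loss}% slower than OFF")
```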


    In summary, the reason why people don't see a difference between VSync ON and VSync OFF in Q3 1.08 q3demo1.dm3 is that VSync is not the reason for the slowdown. Texture Details set at maximum mixed with the new 5.21 drivers is. With 5.13 there are no problems in this test. The slowdown is also not constant... it moves fairly well, then drops down to <10fps. Specifically at index times:

    60:10 - 60:13
    60:17 - 60:20
    60:33 - 60:35
    60:43 - 60:45

    In every single instance there is a lot of blood flying around. So perhaps the API call (or some conditional branch inside) that gets called in 4/4 texture details for blood doesn't get called for lower detail settings... or maybe a bandwidth bottleneck was introduced in the new drivers... whatever the reason, now it's far easier to look into it... knowing these results and where to look. Hopefully, driver developers will look into it sooner rather than later. I'm not really a Q3 player, but I am eagerly waiting for some of the Q3 engine based games (Star Trekkie


    So I think I've covered all of it (correct me if I was wrong somewhere... tried to do thorough research here, but you always miss something).

    Michael


    ------------------
    P2c-300a/450, 192MB PC125 SDRAM, Quantum Fireball Plus KA 18.2GB 7200rpm, Panasonic 7502B x4/x8 Ultra SCSI CD-R, Tekram DC-390U2W Ultra2Wide SCSI controller, Diamond MX300 (Vortex2), Creative Labs AWE64 Gold Sound Blaster, A-Trend Voodoo II 12MB, Matrox Millennium G400Max, 19" Hitachi SuperScan 752 and some other fancy stuff


    [This message has been edited by MM (edited 09-04-1999).]

    [This message has been edited by MM (edited 09-06-1999).]
    P2c-300a/450, 256MB PC125 SDRAM, Quantum Fireball Plus KA 18.2GB 7200rpm, Panasonic 7502B x4/x8 Ultra SCSI CD-R, Tekram DC-390U2W Ultra2Wide SCSI controller, Diamond MX300 (Vortex2), Matrox Millennium G400Max, 19" Hitachi SuperScan 752, Logitech Cordless MouseMan Wheel and some other fancy stuff

  • #2
    yeah i posted the same issue today in "answer to lockups". At first max texture quality caused lockups, but now after decreasing the AGP speed to x1 it only causes slowdowns... and I too didn't have that problem with 5.13



    • #3
      I'd really like to see results from people who say their FPS did NOT change from 5.13 to 5.21 in Q3 1.08 demo1. If they would run the 1024x768x32bit with 32bit textures at max quality and trilinear, lightmap, high geometry detail... I'd be surprised if they didn't get the ~13-14 fps others do.

      One thing I'm sure of: it has nothing to do with VSync in this specific case.

      Michael

      ------------------
      P2c-300a/450, 192MB PC125 SDRAM, Quantum Fireball Plus KA 18.2GB 7200rpm, Panasonic 7502B x4/x8 Ultra SCSI CD-R, Tekram DC-390U2W Ultra2Wide SCSI controller, Diamond MX300 (Vortex2), Creative Labs AWE64 Gold Sound Blaster, A-Trend Voodoo II 12MB, Matrox Millennium G400Max, 19" Hitachi SuperScan 752 and some other fancy stuff
      P2c-300a/450, 256MB PC125 SDRAM, Quantum Fireball Plus KA 18.2GB 7200rpm, Panasonic 7502B x4/x8 Ultra SCSI CD-R, Tekram DC-390U2W Ultra2Wide SCSI controller, Diamond MX300 (Vortex2), Matrox Millennium G400Max, 19" Hitachi SuperScan 752, Logitech Cordless MouseMan Wheel and some other fancy stuff



      • #4
        I am experiencing the same thing. With texture detail at max I get around 13fps at 1024x768 and everything else on. When I turn the texture detail down one level my fps jumps to 28 or so. I thought it was something specific to my machine that I was doing wrong.

        This is on a 366 oc'd to 550 and a G400MAX, 5.21 drivers and reg hacks for forcing agp2x and vsync off.



        • #5
          MM, nice piece of work.

          ------------------
          'scuse me while I kiss the sky


          • #6
            Good job if it's all true MM. Kidding! Great post, you take the cake for POST of the WEEK! What is PC125 RAM anyway?

            ------------------
            PIII-450, 128 HDSRAM, Asus P3BF, G400/32, SBLive!, Nokia 447Xi 17",



            • #7
              Thanks. I put ~4 hours into it but it was well worth it... I was beginning to get an impression that 5.21 sucked deeply and I needed to go back. Now I know I don't. There is this small issue with 4/4 texture details and the System Shock 2's doors which I'm sure they'll fix in the next driver release since they know what they changed (it did work before).

              PC125 is just SDRAM that was designed to run at 125Mhz in CAS2. Just like PC100 or PC133... PC133 was way too expensive when I was building this rig so 125 it was, for extra stability (I still run at 100Mhz since my Celery 300a won't do anything more than 450Mhz... once I get a P3 something I'll go higher).

              ------------------
              P2c-300a/450, 192MB PC125 SDRAM, Quantum Fireball Plus KA 18.2GB 7200rpm, Panasonic 7502B x4/x8 Ultra SCSI CD-R, Tekram DC-390U2W Ultra2Wide SCSI controller, Diamond MX300 (Vortex2), Creative Labs AWE64 Gold Sound Blaster, A-Trend Voodoo II 12MB, Matrox Millennium G400Max, 19" Hitachi SuperScan 752 and some other fancy stuff
              P2c-300a/450, 256MB PC125 SDRAM, Quantum Fireball Plus KA 18.2GB 7200rpm, Panasonic 7502B x4/x8 Ultra SCSI CD-R, Tekram DC-390U2W Ultra2Wide SCSI controller, Diamond MX300 (Vortex2), Matrox Millennium G400Max, 19" Hitachi SuperScan 752, Logitech Cordless MouseMan Wheel and some other fancy stuff



              • #8
                I just tried Quake III timedemo 1 with a non-max at 1024X768 everything 32-bit and got a score of 28.6FPS.

                System Spec:
                PIII 600
                MSI 6130PRO Mobo
                128MB PC133 Ram
                G400 non-max
                SBLive! Retail with 2.1 drivers.

                I am getting my Max on Tuesday and will re-run the benchmark again and report scores back to this thread when I'm done.

                Woops.... forgot to set texture detail all the way up. I am also getting 13.3FPS, so it appears that high textures is what kills G400's. Since, however, everyone I know who plays Quake III plays at 640X480 or 800X600 at 16-bit with normal texture detail, I don't really see why anyone who wants to stay competitive would run that high. I just don't think what's available can run Quake III like that. I'm probably wrong, but it's my guess.

                [This message has been edited by cpld005 (edited 09-05-1999).]
                PIII 700@960, Asus CUSL2, Adaptec 29160, 2x Seagate Barracuda 18.2GB, SB Live!, 3COM 3C905TX, 256MB Mushkin Rev. 2 PC133 at 2-2-2, G400MAX soon to be replaced with ?.



                • #9
                  CPLD,

                  It's just that the 5.13 drivers did just fine with that setting, and so does a TNT2. So obviously something broke in 5.21, however now that we know what it is we can just bug Matrox until they fix it and leave that last texture setting off (I think that last slider notch is just large textures anyway - 512x512'ers).

                  - Gurm

                  ------------------
                  G. U. R. M. It's not hard to spell, is it? Then don't screw it up!
                  The word "Gurm" is in no way Copyright 1999 Jorden van der Elst.
                  The Internet - where men are men, women are men, and teenage girls are FBI agents!

                  I'm the least you could do
                  If only life were as easy as you
                  I'm the least you could do, oh yeah
                  If only life were as easy as you
                  I would still get screwed



                  • #10
                    Great job MM. I kinda suspected something similar myself but didn't do all the nice checking you did.

                    Now, if you or someone could just send along your handiwork to Matrox, perhaps we'll be able to play Q3Test at that resolution someday.



                    • #11
                      Sure I can do that. What would be the email to send it to? I posted it here because I thought Matrox people were monitoring this forum (even if silently). If this is not the case, by all means it should get into their hands via email.
                      P2c-300a/450, 256MB PC125 SDRAM, Quantum Fireball Plus KA 18.2GB 7200rpm, Panasonic 7502B x4/x8 Ultra SCSI CD-R, Tekram DC-390U2W Ultra2Wide SCSI controller, Diamond MX300 (Vortex2), Matrox Millennium G400Max, 19" Hitachi SuperScan 752, Logitech Cordless MouseMan Wheel and some other fancy stuff



                      • #12
                        Oh, Matrox has seen this thread, the thread on HL and textures, and all the rest
                        Core2 Duo E7500 2.93, Asus P5Q Pro Turbo, 4gig 1066 DDR2, 1gig Asus ENGTS250, SB X-Fi Gamer ,WD Caviar Black 1tb, Plextor PX-880SA, Dual Samsung 2494s



                        • #13
                          Holy Crap!!
                          Up until now, I doubted the existence of a divine entity. But I have seen the light.
                          All praise MM. I'd sacrifice a lamb(leg of)
                          but I need it for dinner.

                          ------------------
                          AssuP2P??,300a at 337,128megs of the good stuff,G400reg32megSH,cheapy yamaha PCI sound,'couple o' little HD's,42XCDrom and 98SE w/shutdown patch
                          AMD XP2100+, 512megs DDR333, ATI Radeon 8500, some other stuff.



                          • #14
                            Yep, I got the same thing. 13-14fps at 1024x768x32-bit texture slide bar maxxed.

                            Where is this 20% improvement in OGL games that we were all led to believe?

                            specs:
                            PIII-581 (129fsb)
                            Singlehead G400 32meg oc-150/200



                            • #15
                              Who led who to believe in the Uma "OpenGL"?
