Ok, so I got some free time on my hands this morning and decided to apply my engineering senses to the performance issues in the new 5.21 drivers for G400s. First the setup:
Abit BX6 rev 2.0, Celeron 300a/464MHz, 192MB 125MHz SDRAM, 100MHz FSB, 66MHz AGP (2/3), 64MB AGP Aperture Size, Matrox Millennium G400MAX 32MB DualHead 157.5MHz/210MHz, 7200rpm UDMA Quantum Fireball+ KA 18.2GB.
The drivers were installed on a clean setup. G400 Tweak was used to make sure VSync was ON or OFF in the following tests.
The registry key: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Class\Display\0000\Settings\DirectX
was verified to change depending on the settings in G400 Tweak.
There were no entries 0001, 0002, and so on in this subtree.
Quake3 v.1.08 was used for the benchmarking.
Only the 1024x768 resolution was tested in this research.
First thing I did was to run both demo1 and demo2 at 1024x768 @ 85Hz for all possible combinations of the {color depth, texture quality, texture filter} settings. These were the results:
Color Depth; Texture Depth; Texture Filter; Demo1 (fps); Demo2 (fps)
16; 16; bilinear; 38.2; 38.6
16; 16; trilinear; 34.0; 32.1
32; 16; bilinear; 35.5; 36.4
32; 16; trilinear; 29.9; 29.6
16; 32; bilinear; 30.0; 37.7
16; 32; trilinear; 21.7; 30.1
32; 32; bilinear; 19.0; 35.4
32; 32; trilinear; _13.9_; 27.4
timedemo 1
r_swapinterval 0
demo q3demo1.dm3
demo q3demo2.dm3
were used in the above tests. Texture detail was always at maximum (4/4).
Look at the 13.9 figure! Under 5.13 the same settings produce 26.0fps! Interestingly enough, for demo2 the 5.13's 27.0 is replaced with 27.4. Something is really wrong here... since 5.21 has VSync ON and 5.13 has it OFF, this seems like the first possible reason. If the problem is in VSync, we would expect to see different FPS when switching refresh rates... additionally, we might expect to see substantially different results when switching r_swapinterval between 0 and 1. So I ran the 1024x768x32 test from above at various refresh rates and with r_swapinterval at 0 and at 1. The results were as follows:
r_swapinterval___1___________________0
________________FPS___Refresh Rate___FPS
_______________12.4_______60________14.0
_______________12.6_______70________13.9
_______________12.6_______75________13.9
_______________12.7_______85________13.9
_______________12.7_______90________13.8
_______________12.7______100________13.7
_______________12.7______120________13.5
So with VSync ON the FPS is not a function of the refresh rate! What could this mean? Probably that these numbers are indeed the true performance of the card (or should I say drivers) at these settings. And besides, if the performance were hurt by VSync, the refresh rate divided by the FPS would come out as an integer! 60/14.0=4.29; 70/13.9=5.04; 75/13.9=5.40; 85/13.9=6.12; 90/13.8=6.52; 100/13.7=7.30; 120/13.5=8.89. The only refresh rate that seems to divide nicely is 70, but this is definitely just a coincidence when compared to the 6 other results. So the conclusion is that r_swapinterval does NOT change the VSync state. Or perhaps worse yet, with 5.21 OpenGL applications cannot turn VSync ON and OFF? It might be just Quake3 though...
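For anyone who wants to redo the division check on their own numbers, here is a minimal Python sketch. The figures are the r_swapinterval 0 column from the table above; the function names and the 0.05 tolerance are my own picks for illustration, not anything from Q3 or the drivers.

```python
# q3demo1 at 1024x768x32, r_swapinterval 0: refresh rate (Hz) -> measured fps.
measurements = {60: 14.0, 70: 13.9, 75: 13.9, 85: 13.9, 90: 13.8, 100: 13.7, 120: 13.5}


def vsync_ratio(refresh_hz, fps):
    """Number of refresh intervals per rendered frame."""
    return refresh_hz / fps


def looks_vsync_locked(refresh_hz, fps, tolerance=0.05):
    """True if the ratio is within `tolerance` of a whole number.

    The tolerance is an arbitrary threshold chosen for this sketch.
    """
    ratio = vsync_ratio(refresh_hz, fps)
    return abs(ratio - round(ratio)) <= tolerance


for hz, fps in measurements.items():
    ratio = vsync_ratio(hz, fps)
    print(f"{hz:>3} Hz: ratio {ratio:.2f}, near integer: {looks_vsync_locked(hz, fps)}")
```

With these inputs only the 70Hz row lands close to a whole number, which matches the hand calculation. Keep in mind a timedemo reports an average, so this is only a rough sanity check.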
Also note that r_swapinterval 1 doesn't differ from 0 by much... the difference is probably due to the additional code executing in the inner loop of the engine trying to wait for the VSync, whereas with r_swapinterval 0 it never waits... the VSync is probably forced later on by the driver's underlying layer (OpenGL or not, to write to video memory they still use the drivers with their own settings...)
So now that we know that r_swapinterval doesn't change VSync, we take a look at the other way to change VSync in Q3: Sync Every Frame in the Game Options. In addition we take a look at another possible performance culprit: Texture Details. Just to make sure, we also run all of these tests for both r_swapinterval 1 and 0. This is what I got (85Hz refresh rate again):
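Sync Every Frame OFF (columns = Texture Details 1-4, rows = r_swapinterval, values in fps)
r_swap___1______2______3______4
1______32.9___31.8___27.2___12.7
0______36.6___36.1___32.3___13.8
Sync Every Frame ON (columns = Texture Details 1-4, rows = r_swapinterval, values in fps)
r_swap___1______2______3______4
1______30.4___29.3___26.1___13.1
0______35.2___34.3___30.6___14.2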
Bingo! It seems we found the real reason for the frame drops! The Texture Details. Notice the HUGE drop between 3 and 4. This might explain why most of the people in this forum didn't see any difference in FPS between 5.13 and 5.21 in Q3 demo1! Unless you run it at ALL ON, which means 4/4 texture details as well as 32bit color depth, 32bit textures and trilinear, you will indeed get results similar to 5.13 in this test.
We can again see that neither r_swapinterval nor Sync Every Frame does its job. The frame rate difference between r_swapinterval 1 and 0 is 4-5fps for texture details 1-3 in all cases... this can be attributed to the engine trying to wait (or not) for the sync in the inner loop. The FPS never divides 85Hz evenly either... so these are all real FPS figures output by these drivers. The refresh rate is high enough not to influence the FPS noticeably.
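To put numbers on the drop, here is a small Python sketch over the two tables above. The dictionary layout and the helper name drop_percent are just my own choices for illustration.

```python
# fps per Texture Details level 1..4, keyed by (Sync Every Frame, r_swapinterval).
results = {
    ("off", 1): [32.9, 31.8, 27.2, 12.7],
    ("off", 0): [36.6, 36.1, 32.3, 13.8],
    ("on", 1): [30.4, 29.3, 26.1, 13.1],
    ("on", 0): [35.2, 34.3, 30.6, 14.2],
}


def drop_percent(before, after):
    """Relative fps loss going from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Loss when going from Texture Details 3 to 4, for every configuration.
for (sync, swap), fps in results.items():
    print(f"sync={sync} r_swapinterval={swap}: 3 -> 4 drops {drop_percent(fps[2], fps[3]):.1f}%")

# fps gap between r_swapinterval 0 and 1 at each Texture Details level.
for sync in ("off", "on"):
    gaps = [round(a - b, 1) for a, b in zip(results[(sync, 0)], results[(sync, 1)])]
    print(f"sync={sync}: r_swapinterval 0 minus 1 = {gaps}")
```

In every configuration roughly half of the frame rate is lost between levels 3 and 4, while the r_swapinterval gap stays within a few fps.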
Finally we try G400 Tweak, or the reghack, to disable VSync for the drivers themselves in HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Class\Display\0000\Settings\DirectX. Running the troubled setting once again (q3demo1.dm3 at ALL ON) we still get 13.8, but this time with tearing! So this VSync switch DOES work. But it makes no difference in Q3 q3demo1.dm3! Again this points to the fact that this particular setup with the 5.21 drivers really does produce only 13.8 fps. VSync does NOT influence it.
Now what about the performance increase from 5.13 to 5.21? Well, I also ran Quake2. And there you do see the difference. Just as Ant mentioned the FPS goes way up. At 1024x768x16bit @ 85Hz VSync ON (in the registry) produces:
demo1 = 42.6; demo2 = 41.6
with VSync OFF I got:
demo1 = 62.0; demo2 = 60.3
That's about the 30% increase "promised", isn't it? Too bad that after playing with VSync ON, the tearing with VSync OFF is just TOO noticeable... oh well, I don't play Q2 anyway ;0)
In summary, the reason why people don't see a difference between VSync ON and VSync OFF in Q3 1.08 q3demo1.dm3 is that VSync is not the reason for the slowdown. Texture Details set at maximum combined with the new 5.21 drivers is. With 5.13 there are no problems in this test. The slowdown is also not constant... it moves fairly well, then drops down to <10fps. Specifically at index times:
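A quick note on the percentage: it depends on which figure you take as the base. Here is a minimal Python sketch using the Quake2 numbers above; the helper names are mine, just for illustration.

```python
# Quake2 at 1024x768x16 @ 85Hz: demo -> (fps with VSync ON, fps with VSync OFF).
q2 = {"demo1": (42.6, 62.0), "demo2": (41.6, 60.3)}


def gain_percent(vsync_on, vsync_off):
    """How much faster VSync OFF is, relative to the VSync ON figure."""
    return (vsync_off - vsync_on) / vsync_on * 100.0


def loss_percent(vsync_on, vsync_off):
    """How much slower VSync ON is, relative to the VSync OFF figure."""
    return (vsync_off - vsync_on) / vsync_off * 100.0


for demo, (on, off) in q2.items():
    print(f"{demo}: OFF is {gain_percent(on, off):.1f}% faster, "
          f"ON is {loss_percent(on, off):.1f}% slower")
```

Measured against the VSync ON figure the gain is around 45%; measured against the VSync OFF figure the loss is a bit over 30%.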
60:10 - 60:13
60:17 - 60:20
60:33 - 60:35
60:43 - 60:45
In every single instance there is a lot of blood flying around. So perhaps the API call (or some conditional branch inside it) that gets called at 4/4 texture details for blood doesn't get called at lower detail settings... or maybe a bandwidth bottleneck was introduced in the new drivers... whatever the reason, now it's far easier to look into it... knowing these results and where to look. Hopefully, the driver developers will look into it sooner rather than later. I'm not really a Q3 player, but I am eagerly waiting for some of the Q3 engine based games (Star Trekkie
So I think I've covered all of it (correct me if I was wrong somewhere... I tried to do thorough research here, but you always miss something).
Michael
------------------
P2c-300a/450, 192MB PC125 SDRAM, Quantum Fireball Plus KA 18.2GB 7200rpm, Panasonic 7502B x4/x8 Ultra SCSI CD-R, Tekram DC-390U2W Ultra2Wide SCSI controller, Diamond MX300 (Vortex2), Creative Labs AWE64 Gold Sound Blaster, A-Trend Voodoo II 12MB, Matrox Millennium G400Max, 19" Hitachi SuperScan 752 and some other fancy stuff
[This message has been edited by MM (edited 09-04-1999).]
[This message has been edited by MM (edited 09-06-1999).]