Any review of Radeon LE with HyperZ on/off?

  • Any review of Radeon LE with HyperZ on/off?

    I've been trying to find a review on the net that has benchmarks of the ATI Radeon LE with and without HyperZ enabled, to see what sort of difference it makes in the real world.

    So far all the reviews also seem to clock the card up to the same speed as the Radeon 32DDR, which isn't really useful.

    What I want to find out is whether it really makes all that much difference.

    Everybody is giving the Parhelia a hard time over its lack of hidden surface removal hardware, but we don't really know for sure how much difference it makes.

    I do have a Radeon LE myself, but it's in the wife's computer, and she won't let me play with it (she's writing her thesis, so fair enough).

    If anybody has a Radeon LE in Windows 9x and has a tweaker like Radedit or whatever, I seem to remember there was a way to enable/disable certain parts of HyperZ.

    The Parhelia has fast Z clear, so what we need is a benchmark between:

    Standard Radeon LE with only fast Z clear enabled.
    Radeon LE with full HyperZ enabled.

    The difference between these should show, percentage-wise, what sort of difference HSR makes in the real world.

    Anybody keen?


    Links: I'll update as I find more.




    PS: found a little bit on the Xbit site.

    Merc Truck Racing, 1024x768x16
    LE: 51.9
    LE+HZ: 51.5

    UT, 1024x768x16
    LE: 42.39
    LE+HZ: 41.78

    And the interesting thing:
    3DMark 2000
    Game 1 low:
    LE: 59.2
    LE+HZ: 77

    Game 1 high:
    LE: 22.6
    LE+HZ: 24.8

    Game 2 low:
    LE: 54.5
    LE+HZ: 69.1

    Game 2 high:
    LE: 33.1
    LE+HZ: 36.2


    My conclusions:

    From the Xbit numbers, it looks like HyperZ makes no difference in real games, a big difference in fake games (3DMark) at low settings, and a little difference at high settings.
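To put rough percentages on the Xbit figures quoted above, here is a minimal sketch. It assumes every score is in frames per second with higher being better (the post does not state the units), and the helper name `pct_change` is made up for this illustration.

```python
# Scores quoted above from the Xbit review: (without HyperZ, with HyperZ).
# Assumption: all values are frames per second, higher is better.
scores = {
    "Merc Truck Racing 1024x768x16": (51.9, 51.5),
    "UT 1024x768x16": (42.39, 41.78),
    "3DMark 2000 Game 1 low": (59.2, 77.0),
    "3DMark 2000 Game 1 high": (22.6, 24.8),
    "3DMark 2000 Game 2 low": (54.5, 69.1),
    "3DMark 2000 Game 2 high": (33.1, 36.2),
}


def pct_change(base, new):
    """Percentage change going from base to new."""
    return (new - base) / base * 100.0


changes = {name: pct_change(le, hz) for name, (le, hz) in scores.items()}

for name, change in changes.items():
    print(f"{name:32s} {change:+6.1f}%")
```

On these numbers the two real games move by less than 2% (slightly down with HyperZ on), the low-detail 3DMark tests gain roughly 27 to 30%, and the high-detail ones gain a little under 10%.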

    This still isn't accurate, as we don't know how much of the difference comes from fast Z clearing.

    I would say that all real games have already done most of the HSR themselves in their engines, and only badly written games and synthetic benchmarks are going to see any major difference.

    I would also say the Parhelia would suck at Village Mark, but that was deliberately written to render the scene in an inefficient way, which we shouldn't see in any real games.

    Please feel free to add anything to this/correct my conclusions.

    Ali
    Last edited by Ali; 18 May 2002, 16:48.

  • #2
    I found this also:


    Looks like fast z-clear is quite important.



    • #3
      I haven't done any formal testing, but Hierarchical Z on my 8500 makes the difference between 50 and 90 fps in RTCW in some cases. That's with older drivers; ATi may have improved their algorithms since I last compared.

      But it doesn't make sense to judge the usefulness of hidden surface removal by looking only at ATi's scores.

      If you can totally eliminate overdraw when there is an overdraw of (for example) 3, you will in fact reduce the fill-rate requirement by a factor of 3. Simple. Now the question that should be asked is how effectively hidden surface removal is implemented, not whether the concept of hidden surface removal itself is effective.
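The overdraw argument above can be sketched as a toy model. This is only an illustration under simple assumptions: every screen pixel is covered `overdraw` times, and `removal_efficiency` is a hypothetical knob for the fraction of hidden fragments the hardware rejects before rendering them. It is not a description of any real chip.

```python
def effective_fill_needed(visible_pixels, overdraw, removal_efficiency):
    """Pixels that must actually be rendered per frame.

    overdraw: average number of times each screen pixel gets drawn.
    removal_efficiency: fraction of hidden fragments rejected before
    rendering (0.0 = none, 1.0 = perfect hidden surface removal).
    """
    hidden = visible_pixels * (overdraw - 1)
    return visible_pixels + hidden * (1.0 - removal_efficiency)


pixels = 1024 * 768
for eff in (0.0, 0.5, 1.0):
    needed = effective_fill_needed(pixels, overdraw=3, removal_efficiency=eff)
    print(f"efficiency {eff:.0%}: {needed / pixels:.1f}x the visible pixels")
```

With perfect removal the requirement drops from 3x to 1x the visible pixels, which is the factor of 3 from the post. Anything less than perfect lands somewhere in between, which is why the efficiency of the implementation is the real question.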
      Last edited by EvilDonnyboy; 18 May 2002, 17:32.
      Primary system specs:
      Asus A7V266-E | AthlonXP 1700+ | Alpha Pal8045T | Radeon 8500 | 256mb Crucial DDR | Maxtor D740X 40gb | Ricoh 8/8/32 | Toshiba 16X DVD | 3Com 905C TX NIC | Hercules Fortissimo II | Antec SX635 | Win2k Pro



      • #4
        Nice link Tomasz, that was exactly what I was looking for.

        According to the graph:
        Hierarchical Z only makes up 2% of the benefit of HyperZ. That's the bit that the Parhelia doesn't have.

        Assuming the Parhelia's fast Z clear is as efficient as the ATI one, that's 29% of the benefit.

        Then if the Parhelia's burst mode Z makes as much difference as ATI's Z compression (which I doubt), that's another 22%.

        I would think the burst mode might make about 1/3 of the difference of the Z compression, so let's say 7%.

        So if HyperZ makes 53 units of difference, the Parhelia will have 36 units.
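The arithmetic behind the 53 and 36 units can be written out as a short sketch. The three shares are the figures quoted in the post above, and the one-third factor for burst mode Z is the poster's own guess, not a measurement.

```python
# Shares of the HyperZ benefit as read off the graph in the post above.
hyperz = {"hierarchical_z": 2, "fast_z_clear": 29, "z_compression": 22}

# Guess from the post: burst mode Z is worth about 1/3 of Z compression.
burst_mode_z = round(hyperz["z_compression"] / 3)  # 22 / 3 is about 7

hyperz_total = sum(hyperz.values())                      # 2 + 29 + 22
parhelia_total = hyperz["fast_z_clear"] + burst_mode_z   # 29 + 7

print(hyperz_total, parhelia_total)
print(f"{parhelia_total / hyperz_total:.0%} of the HyperZ benefit")
```

Under those assumptions the Parhelia would keep about two thirds of what full HyperZ gives the Radeon.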

        Really, we don't know until we see benchmarks, but it's interesting to see how pointless Hierarchical Z is.

        EvilDonnyboy, I assume you are talking about HyperZ II as a whole for your 50-90 fps difference, not just Hierarchical Z?

        Ali



        • #5
          No, I disabled just Hierarchical Z with ATi's 6043 drivers. I did those benches a while ago, so don't pay too much attention to the "50-90" number.

          Reducing overdraw is absolutely not useless. It is just another way to increase efficiency. I would be surprised if Matrox didn't make an effort to reduce overdraw in their product after the Parhelia. It's possible that with the smaller 0.13-micron die Matrox might even include some kind of anti-overdraw logic in an upcoming Parhelia "refresh".



          • #6
            Traditional z-buffer based rendering is totally wasteful in terms of its memory and bandwidth requirements. While hidden surface removal seems like a no-brainer, its practical implementation is not exactly easy. To do an early Z test, you have to read the z-buffer right after triangle setup, then use those z-buffer values to work out which objects are occluded, then discard the occluded objects, then check the z-buffer again to see that nothing has changed, and only then draw to the frame buffer. It goes without saying that all this has to be done very fast and in parallel with all other processes. Furthermore, it may be very difficult to predict which objects are in front of each other, especially in complex scenes in which the application submits polygons back to front or in random order. According to PowerVR's white papers, early Z tests are on average only about 10% efficient at removing hidden surfaces. It seems to be too much effort for a relatively marginal performance benefit.
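The dependence on draw order can be seen in a toy model of a single pixel. This is only a sketch of the general idea of rejecting fragments against the z-buffer before shading them; the function name and depth values are made up, and it is not how ATI's or anyone else's hardware is actually built.

```python
def shaded_fragments(depths):
    """Count fragments that pass an early Z test at one pixel.

    depths: fragment depths in submission order (smaller = closer).
    A fragment is rejected only if something closer is already in
    the z-buffer, so the result depends on the draw order.
    """
    z_buffer = float("inf")  # cleared to the far plane
    shaded = 0
    for z in depths:
        if z < z_buffer:  # passes the depth test: shade and write
            z_buffer = z
            shaded += 1
    return shaded


layers = [0.2, 0.5, 0.9]  # three surfaces covering the same pixel

print(shaded_fragments(sorted(layers)))                # front to back
print(shaded_fragments(sorted(layers, reverse=True)))  # back to front
```

Drawn front to back, only the nearest surface gets shaded. Drawn back to front, all three do and the early test saves nothing, which is the ordering problem described above.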

            Here's a good article on a method of hierarchical z-buffer analysis (pretty much what ATI uses) by Ned Greene:


            In my mind, HSR makes much more sense for deferred texturing solutions such as the PowerVR series. In that case, you pretty much have certainty as to what is occluded and don't have to deal with the Z-buffer. The Z-buffer approach is designed to be a brute-force approach. It needs lots of memory, lots of bandwidth, and an efficient memory subsystem. The Parhelia looks like it will provide all three. In the end, no z-buffer tricks can substitute for true bandwidth, so I am not concerned at all and I trust the Matrox engineers. It seems to me, from reading the specs, that the fill rate and memory bandwidth of the Parhelia are matched quite well, so neither should be a limiting factor (unlike some other cards... cough nVidia cough... which had ridiculous fill rates without the memory bandwidth to match). Perhaps internal testing at Matrox showed that z-buffer tricks were not necessary, or not worth it performance-wise, and hence were not included in the chip.

            I will make my final conclusions when I buy a Parhelia and test it out (hopefully soon).

            Tomasz



            • #7
              Nappe has done some recent tests on Beyond3d.com and it seems that the performance increase is never greater than 18% (and that's the exception, using low quality). Although it would be nice for the Parhelia to do this type of culling, with 20 GB/s of bandwidth I don't see there being the problem that people like Anand are complaining about.

              Regards, MD
              Last edited by mdhome; 19 May 2002, 03:46.
              Interests include:
              Computing, Reading, Pubs, Restaurants, Pubs, Curries, More Pubs and more Curries



              • #8
                I'm not sure how effective the occlusion culling tech is in current cards (beyond the tiling architecture used in the Kyro cards). In fact, despite using much less effective tech, the GF3/4 cards do much better for the available bandwidth, not by relying on the occlusion culling and z-buffer tricks (for the most part), but instead relying on their advanced memory controllers (the nVidia crossbar controllers make a BIG difference).

                Parhelia seems to be well-endowed in that regard, if the interpretation of the chip specs and the layout diagram is on the mark.

                (Not saying occlusion culling has no benefits, just that the memory management/controller scheme seems to be more important and yield greater benefits overall)
                "..so much for subtlety.."

                System specs:
                Gainward Ti4600
                AMD Athlon XP2100+ (o.c. to 1845MHz)



                • #9
                  Don't forget that the Radeon LE you're comparing against has a fill rate of ~300 Mpixels/s and ~900 Mtexels/s, whereas the Parhelia's fill rate will be around ~1,300 Mpixels/s and ~5,200 Mtexels/s.
                  Although, now that I think about it, the increase in fill rate is about the same as the increase in raw bandwidth: the Radeon LE at 150 MHz with 128-bit DDR has ~4,800 MB/s, and the Parhelia has ~20,800 MB/s (325 MHz, 256-bit DDR).

                  Both the fill rate and the bandwidth have gone up by a factor of roughly 4 to 6. Interesting.
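The bandwidth figures in this post follow from the clock and bus width, as a quick sketch shows. It assumes DDR memory moving two transfers per clock and counts 1 MB as 10^6 bytes; the function name is made up for this example, and the Parhelia clock is the pre-release figure quoted in the post.

```python
def ddr_bandwidth_mb_s(clock_mhz, bus_bits):
    """Peak bandwidth in MB/s for DDR memory (two transfers per clock)."""
    return clock_mhz * 2 * bus_bits // 8


radeon_le = ddr_bandwidth_mb_s(150, 128)
parhelia = ddr_bandwidth_mb_s(325, 256)
print(radeon_le, parhelia, round(parhelia / radeon_le, 2))

# Fill rates quoted in the post (approximate): pixel and texel ratios.
print(round(1300 / 300, 2), round(5200 / 900, 2))
```

That gives 4,800 and 20,800 MB/s, a ratio of about 4.3, which matches the pixel fill-rate ratio; the texel fill rate grows by a bit under 6 times.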
                  no harm, no foul.
