Polish Parhelia confirmation
  • #31
    I completely disagree. You just have a very short memory. People have been predicting the imminent failure of Moore's Law for over a decade, and it hasn't failed yet.

    We just:
    - keep finding smaller lambdas
    - improve the fab process, redefining "reasonable size"
    - build repairable circuits
    - find better dielectrics
    - start working with more accurate models than before

    Et cetera, et cetera. I can see what's coming down the pipe for the next 5 years or so, and there's nothing for iconoclasts such as yourself to be yelling about, but that hasn't stopped your kind yet, even after decades of being wrong.

    Here's a quick review of the last decade: http://www.icknowledge.com/history/1990s.html

    Oh, and McKinley is 464mm², 3.3x the P3's size, and since die errors are roughly O(n^2), McKinley yield should be about 1/11 of the P3's, if we operate in your world where the fab will kill us. I can assure you that such a calculation is foolish.
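    For reference, this kind of yield arithmetic is usually sketched with the standard Poisson defect model rather than naive area scaling; the defect density below is an assumed illustrative figure, not a real fab number:

```python
import math

# Poisson yield model: Y = exp(-D * A), where D is the defect density and
# A is the die area. D = 0.5 defects/cm^2 is an assumed illustrative value.
D = 0.5                   # defects per cm^2 (assumption)
A_P3 = 140 / 100.0        # P3 die area: 140 mm^2 -> cm^2
A_MCKINLEY = 464 / 100.0  # McKinley die area: 464 mm^2 -> cm^2

y_p3 = math.exp(-D * A_P3)
y_mckinley = math.exp(-D * A_MCKINLEY)

print(f"P3 yield:       {y_p3:.1%}")
print(f"McKinley yield: {y_mckinley:.1%}")
print(f"Good-die ratio: {y_p3 / y_mckinley:.1f}x in the P3's favor")
```

    Under this assumed defect density the ratio comes out nearer 5x than 11x, which illustrates the point: simple "errors go as area squared" scaling overstates the penalty, even before redundancy and repairable circuits are counted.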


    For the most part, video display doesn't parallelize very well. That means that any multi-chip solution would have to have a whole lot of communication between the different chips. Now, would you care to guess how many orders of magnitude slower it is to talk to another chip than it is to communicate with another part of the same die?
    Gigabyte P35-DS3L with a Q6600, 2GB Kingston HyperX (after *3* bad pairs of Crucial Ballistix 1066), Galaxy 8800GT 512MB, SB X-Fi, some drives, and a Dell 2005fpw. Running WinXP.



    • #32
      You guys have it the wrong way round....

      It's not multi-chip, it's multi-core!

      Now, that wouldn't be too expensive or hard to implement....



      • #33
        I sure hope you're joking.

        Multi-core is nice, in some places, but it costs you. It's MORE expensive to design, test, yield, and debug than multi-chip (at least you can get a logic analyzer between two chips). But when you get it working, yeah, it's sweet. Fast, expensive, but sweet.



        • #34
          Just to be anal: there are a few mistakes in that history.

          CutnPaste:
          1997 - Intel Pentium II™

          The Pentium II introduced a single in-line cartridge housing the processor chip and standard cache chips running at ½ the processor speed. The Pentium II was manufactured in a silicon-gate CMOS process with 0.35µm linewidths, required 16 mask layers, and had 1 polysilicon layer and 4 metal layers; the Pentium II had 7.5 million transistors, a 233 to 300MHz clock speed, and a 209mm² die size.




          The P2 had a clock speed of 233 to 450MHz, not 233 to 300.

          CutnPaste:
          1999 - Intel Pentium III™

          The Pentium III returned to a more standard PGA package and integrated the cache on-chip. The Pentium III was manufactured in a silicon-gate CMOS process with 0.18µm linewidths, required 21 mask layers, and had 1 polysilicon layer and 6 metal layers; the Pentium III had 28 million transistors, a 500 to 733MHz clock speed, and a 140mm² die size.




          This one gets harder, as Intel changed the P3 at 600MHz: the original P3 went from 450MHz to 600MHz, then the newer one went from 450MHz (again) up to 1.13GHz (although I think that one got recalled), and now there is an even newer P3 that goes from (I think) 933MHz to over 1.2GHz.

          Ali (the anal retentive)



          • #35
            Oops, I forgot to mention:

            Not only is it multi-core, it's multi-GPU.

            Although, not to get you guys too excited..... I could be imagining this.



            • #36
              Originally posted by Wombat
              I completely disagree. You just have a very short memory. People have been predicting the imminent failure of Moore's Law for over a decade, and it hasn't failed yet.

              We just:
              - keep finding smaller lambdas
              - improve the fab process, redefining "reasonable size"
              - build repairable circuits
              - find better dielectrics
              - start working with more accurate models than before

              Et cetera, et cetera. I can see what's coming down the pipe for the next 5 years or so, and there's nothing for iconoclasts such as yourself to be yelling about, but that hasn't stopped your kind yet, even after decades of being wrong.

              Here's a quick review of the last decade: http://www.icknowledge.com/history/1990s.html

              Oh, and McKinley is 464mm², 3.3x the P3's size, and since die errors are roughly O(n^2), McKinley yield should be about 1/11 of the P3's, if we operate in your world where the fab will kill us. I can assure you that such a calculation is foolish.


              For the most part, video display doesn't parallelize very well. That means that any multi-chip solution would have to have a whole lot of communication between the different chips. Now, would you care to guess how many orders of magnitude slower it is to talk to another chip than it is to communicate with another part of the same die?

              While the progress accomplished so far is nothing short of impressive, there's no denying that sooner or later fab processes are bound to hit a barrier... I'd say no more than 10 years, tops.

              Lithography technology is being pushed to the point where pretty soon they'll be operating very close to X-ray frequencies.

              Then there's the problem of how transistors will react to operating at extremely high speeds (think 10GHz+) when they're fast approaching the atomic scale as far as gate length/height goes....
              note to self...

              Assumption is the mother of all f***ups....

              Primary system :
              P4 2.8 ghz,1 gig DDR pc 2700(kingston),Radeon 9700(stock clock),audigy platinum and scsi all the way...



              • #37
                Ali:

                Their P2 history isn't that far off. I think they're just sticking to initial offerings. The initial core (Klamath?) stopped at 300, maybe 333. Everything over that was a different fab job (Deschutes?).



                • #38
                  Wombat,

                  Graphics parallelizes better than anything else in the PC. Graphics has always benefited immensely from parallel processing, right from the days of the first dual-texturing Voodoo2 and SLI. Sure, multi-chip configurations are hard to design in many respects and also cost a heap, but they give near-linear returns in speed for each extra chip added. CPUs cannot come close in this respect. In graphics, the speed you get is directly proportional to the number of extra transistors you put in, since we are accelerating a particular task (3D) in hardware. Extra transistors in CPUs are used mostly as overhead, solving problems for the CPU rather than dealing with any problem directly.

                  The reason why scalable graphics has been rather unsuccessful recently is mainly architectural. Current implementations of SLI and AFR are too inefficient (texture duplication) and lack geometry scaling. A deferred renderer is much easier to implement as a multi-chip configuration. When eDRAM matures, it should also make designing scalable IMRs much easier.
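A minimal sketch of why fill work splits so cleanly across chips, assuming Voodoo2-style scanline interleaving and a uniform, made-up fill cost per scanline:

```python
# Voodoo2 SLI-style scanline interleaving: each chip renders every Nth
# scanline. Scanlines are independent, so fill work divides almost
# perfectly. The uniform per-line cost is an assumption for illustration.
HEIGHT = 480          # scanlines per frame
COST_PER_LINE = 1.0   # arbitrary fill-cost units (assumption)

def frame_time(num_chips):
    # Chip c handles the scanlines y where y % num_chips == c.
    per_chip = [
        sum(COST_PER_LINE for y in range(HEIGHT) if y % num_chips == c)
        for c in range(num_chips)
    ]
    # The frame is done when the slowest chip finishes.
    return max(per_chip)

for n in (1, 2, 4):
    print(f"{n} chip(s): frame time {frame_time(n):.0f} units")
```

Note that in this scheme every chip still processes all the geometry, which is exactly the missing geometry scaling that texture-duplicating SLI-style setups suffer from.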



                  • #39
                    That's the thing. The SLI setup worked because a single Voodoo2 was fillrate limited. That problem, although it exists today, is not nearly what it was. The ATI card, with AFR, suffered its own miserable fate.

                    The returns from multi-chip graphics setups diminish quickly, as the communication costs of a multi-chip solution continually rise to counter the benefit of the second processor. eDRAM may very well make this problem more difficult, as the graphics community would have to tackle issues such as cache coherency. Not to mention the die real estate (and power) lost to all of the extra pins for inter-chip communication.

                    In CPU land, the return from multiple processors *can* be linear; it all depends on the job you're doing. If it's not something that can be done in parallel very well, then the return isn't very big. I can't think of any reason that video processors don't fit the same profile.
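The dependence on how well the job parallelizes is just Amdahl's law; the parallel fractions below are illustrative assumptions:

```python
# Amdahl's law: with a fraction p of the work parallelizable across n
# processors, speedup = 1 / ((1 - p) + p / n). Only p = 1 gives linear
# scaling; any serial remainder caps the return.
def speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.5, 0.9, 0.99):  # assumed parallel fractions
    print(f"p = {p:.2f}: 2 procs -> {speedup(p, 2):.2f}x, "
          f"4 procs -> {speedup(p, 4):.2f}x")
```

Even at 90% parallel work, a second processor buys only about 1.8x, and the gap widens as more processors (or chips) are added.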

                    I don't understand what you're trying to say about transistors in CPUs.



                    • #40
                      I think that Moore's law, in its more generalised form, is likely to stay in force pretty much indefinitely - at least as long as people need more processing power. AFAIK the technologies for the next 10 years of silicon manufacturing are already mapped and planned out, and that's without touching things like focused X-ray (lobster-eye lens) and electron-beam etching. And even when silicon stops being useful, there are plenty of other technologies that could carry on the spirit, if not the letter, of Moore's law.

                      BTW, PowerVR have a highly successful multi-chip solution in use right now - their Naomi arcade box is a 5-chip solution (1+4). In fact, their arcade chipsets have always been designed to be scalable.

                      LEM



                      • #41
                        Wasn't the 'original' G800 supposed to get multi-chip support? I thought I had read something about a G800 "Max" coming out with two chips instead of one. Of course, we never got to see the G800 in all its glory, let alone the dual G800, but it's not such a bad idea: two Parhelia GPUs on one card, or FOUR.
                        Main Machine: Intel Q6600@3.33, Abit IP-35 E, 4 x Geil 2048MB PC2-6400-CL4, Asus Geforce 8800GTS 512MB@700/2100, 150GB WD Raptor, Highpoint RR2640, 3x Seagate LP 1.5TB (RAID5), NEC-3500 DVD+/-R(W), Antec SLK3700BQE case, BeQuiet! DarkPower Pro 530W



                        • #42
                          Well, I believe that people heard "fusion" and thought, "that must be a multi-chip solution!"



                          • #43
                            Everyone saw what happened to 3dfx when they decided to go multi-chip.....

                            Too much power drain, and the cards were huge to say the least.

                            Single core, single chip, but with a new set of tricks up their sleeve seems to be the best bet yet.

                            Core architecture is the key.

                            As I understand it, just as an example, the P4 core performs worse than a PIII (latest model) at the same clock speed.
                            The PIII core was brilliant; the P4 core is there to get higher clock speeds (and integrate new functions etc.). There is probably more to it than that, but that is what I have come to hear on the many forums and web pages I have read...

                            64bit rendering is a tad overkill.
                            40-bit internal rendering sounded good, and then outputting at 32-bit, or even 40-bit.
                            But there will be more to it than that, and those in the know can't tell us what.
                            Even if we have speculated correctly, those under NDA cannot even hint at a good guess.....

                            Long live the bunny.
                            PC-1 Fractal Design Arc Mini R2, 3800X, Asus B450M-PRO mATX, 2x8GB B-die@3800C16, AMD Vega64, Seasonic 850W Gold, Black Ice Nemesis/Laing DDC/EKWB 240 Loop (VRM>CPU>GPU), Noctua Fans.
                            Nas : i3/itx/2x4GB/8x4TB BTRFS/Raid6 (7 + Hotspare) Xpenology
                            +++ : FSP Nano 800VA (Pi's+switch) + 1600VA (PC-1+Nas)



                            • #44
                              Sorry for the late answer. Yesterday evening the forum was not accessible from my place.

                              Originally posted by Wombat
                              No, listen, it WON'T pay in the end. It's just the wrong way to do this kind of thing. It's not more expensive "to start," it's more expensive all-around, and performance would suck.
                              Why on earth would somebody then buy a 3dfx Alchemy (8 or 16 VSA-100s, if my memory serves correctly) if performance sucked?

                              Why on earth was the Voodoo2 (and SLI) so successful... because its performance sucked at that time?

                              NOOO...

                              It may NOT pay for YOU in the end. But it will for everybody else, trust me...

                              Originally posted by Wombat

                              jwb, both Greebe and I are under NDAs.

                              And... what should that tell me?

                              !! Stop replying if you are under NDA then !!

                              And BTW, if somebody replies so harshly to my personal opinion and has to tell me, as if I were interested, that he/she is under NDA, then something is deeply hidden here.

                              I KNOW... (that I'm not under NDA)



                              • #45
                                And who said it has to be distinct chips on the same PCB?

                                Remember the PPro:
                                same cartridge, two dies - one for the core, one for the memory (cache).

                                Why not 3 dies: 1 core, 2 memory?
                                If one memory die dies, instead of a 256-bit bus you get a 128-bit bus -
                                you don't have to throw the whole thing away. Still fine today.

                                Much shorter traces, tighter integration, and more bandwidth in return.
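The bandwidth trade being described is simple arithmetic; the memory clock below is an assumed illustrative figure, not a real spec:

```python
# Bandwidth lost when one of two memory dies fails, halving the bus width.
# The 300 MHz memory clock is an assumed figure for illustration only.
CLOCK_HZ = 300e6  # assumed memory clock

def bandwidth_gb_s(bus_bits):
    # bits -> bytes per transfer, times transfers per second, in GB/s
    return bus_bits / 8 * CLOCK_HZ / 1e9

full = bandwidth_gb_s(256)      # both memory dies working
degraded = bandwidth_gb_s(128)  # one memory die failed
print(f"256-bit bus: {full:.1f} GB/s, 128-bit fallback: {degraded:.1f} GB/s")
```

So the degraded part still delivers half the bandwidth instead of being scrap, which is the salvage argument being made here.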

                                Who said this can't be a multi-core solution...

                                Patience is the key...

