AMD Fusion, Bobcat, Bulldozer

Discussion about AMD's upcoming CPUs and APUs

Bulldozer NXT uarch

Postby wuttz » Fri Jan 13, 2012 3:17 am

NXT, i.e. NeXT

Objective: Fix and vastly improve single-thread performance.
How:
Make the INT scheduler in modules dynamically switch between single/dual cores,
just like how it is with the FP scheduler dynamically allocating the FP unit between two INT clusters.

So, for "heavy" thread profiles, INT scheduler assigns the resources of two INT clusters in the module to that one thread. For light threads, continue as it is today, assigning it to that single INT cluster.
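
A toy sketch of the allocation policy described above, in Python. Everything here is hypothetical: the function name `assign_clusters`, the 0..1 "demand" metric and the `heavy_threshold` cutoff are made up for illustration, and nothing reflects how a real Bulldozer scheduler works.

```python
def assign_clusters(threads, heavy_threshold=0.75):
    """Return {thread_name: clusters_granted} for one two-cluster module.

    threads: dict mapping thread name -> estimated demand in [0, 1]
             (a hypothetical "how heavy is this thread" metric).
    """
    if len(threads) > 2:
        raise ValueError("a module exposes at most two hardware threads")
    if len(threads) == 1:
        # A lone thread gets both INT clusters only if it is "heavy".
        ((name, demand),) = threads.items()
        return {name: 2 if demand >= heavy_threshold else 1}
    # Two threads (or none): behave as today, one cluster per thread.
    return {name: 1 for name in threads}


if __name__ == "__main__":
    print(assign_clusters({"render": 0.9}))
    print(assign_clusters({"idle_poll": 0.1}))
    print(assign_clusters({"a": 0.9, "b": 0.9}))
```

The sketch only covers the policy decision; the hard part (feeding one thread's instructions to two clusters and keeping their register state coherent) is not modeled at all.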

This way you keep the core that is capable of higher clocks/turbo, but have a wider execution path = more available IPC for bentmarks. :mrgreen:


* This may have been covered previously in depth, but as far as I can tell, that scenario was for the 2nd INT cluster to do speculative/run-ahead execution. This, of course, is different.


Whatcha think? You want some of what I'm smokin? :mrgreen: :mrgreen: :mrgreen:
wuttz
K8 Athlon 64 X2 (Manchester) Elite Boarder
Posts: 2992
Joined: Sat Aug 08, 2009 6:48 pm
Location: Pearland, Texas

Re: Bulldozer NXT uarch

Postby seronx » Fri Jan 13, 2012 3:43 am

wuttz wrote:Whatcha think? You want some of what I'm smokin? :mrgreen: :mrgreen: :mrgreen:


Not really going to fix anything and I don't think it is possible

Front-end VMT -> SMT with a 64B iFetch
Floating Point Unit -> 8x64B FMAC. 8x64B MAC. with Front-end of FPU VMT -> SMT
Continue adding EX pipe instructions to AGLU pipe instructions
Bigger WCC 8KB instead of 4KB

But it would seem they are already doing some of the stuff I am saying in later gens :twisted:
seronx
K6-III Fresh Boarder
Posts: 286
Joined: Sun Jul 17, 2011 6:03 pm

Re: Bulldozer NXT uarch

Postby duby229 » Fri Jan 13, 2012 5:13 am

What you're describing is more or less reverse hyper-threading. AMD has already said it isn't ever going to happen.

EDIT: It is entirely possible; AMD owns dozens of patents on it. But for some unexplained reason, they absolutely refuse to do it.
MB: Gigabyte GA785GMT-UDH2
CPU: AMD Phenom II 955 BE @ 3820mhz
MEM: G. Skill DDR3-1600 2x2GB @ DDR3-1866
GFX: ATi Radeon HD 6850
OS: Gentoo Linux
duby229
K8 Athlon 64 (San Diego) Expert Boarder
Posts: 1863
Joined: Wed Mar 10, 2004 8:45 pm

Re: Bulldozer NXT uarch

Postby wuttz » Fri Jan 13, 2012 5:35 am

So that's what they were talking about .. :oops: :oops:
Thanks, duby. :wink: :cry: :cry:
wuttz
K8 Athlon 64 X2 (Manchester) Elite Boarder
Posts: 2992
Joined: Sat Aug 08, 2009 6:48 pm
Location: Pearland, Texas

Re: Bulldozer NXT uarch

Postby mmarq » Fri Jan 13, 2012 6:03 am

Oh boy! Speculating... the exercise I like the most :mrgreen:

To massively increase single-thread performance, it seems to me the best way is to apply forms of speculative multithreading (spMT). Why? Just consider that spMT of sorts is a way of dramatically increasing the ILP of a single thread by way of more threads (speculative ones, that is).
(here have plenty of fun http://groups.google.com/group/comp.arc ... este+grupo )

a)
Eager execution (both sides of problematic branches simultaneously, one on a hardware thread)
Ahead "access" computation (access computation on another hardware thread)
Run-ahead "speculative execution" for some number of instructions (it can be large) after a cache miss (this doesn't imply the use of another hardware thread)

These are forms of "batched" execution with small threads, or no threads at all if you consider the possibility of very large "instruction windows" by way of checkpointing.
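
To put rough numbers on the eager-execution idea in the list above, here is a deliberately simple cost model in Python. It is only a sketch under stated assumptions: the `fork_overhead` term, the function names and all the example numbers are hypothetical, and real pipelines are far messier than one expected-value formula.

```python
def predicted_branch_cost(path_len, miss_rate, flush_penalty):
    """Expected cycles when the branch is predicted and one path is followed.

    A misprediction (probability miss_rate) costs flush_penalty extra cycles.
    """
    return path_len + miss_rate * flush_penalty


def eager_branch_cost(path_len, fork_overhead):
    """Cycles when both sides run at once, one side on a second hardware thread.

    No flush is ever needed, but starting the second context is assumed
    to cost fork_overhead cycles, and twice the instructions get executed.
    """
    return path_len + fork_overhead


def eager_wins(path_len, miss_rate, flush_penalty, fork_overhead):
    """True when eager execution has the lower expected cycle count."""
    return eager_branch_cost(path_len, fork_overhead) < predicted_branch_cost(
        path_len, miss_rate, flush_penalty
    )


if __name__ == "__main__":
    # A hard-to-predict ("problematic") branch versus a well-predicted one.
    print(eager_wins(path_len=20, miss_rate=0.30, flush_penalty=20, fork_overhead=2))
    print(eager_wins(path_len=20, miss_rate=0.02, flush_penalty=20, fork_overhead=2))
```

In this model eager execution only pays off when `miss_rate * flush_penalty` exceeds the fork overhead, which is why the list above restricts it to problematic branches. The doubled instruction count is the power cost.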

b)
Other more sophisticated ones include things like

CALL induced thread forking,
LOOP forking
Method Follower..

etc etc ... and last but not least,

Compiler-induced or "binary translation machine"-induced thread-level data speculation forking by way of transactions (must include support for Transactional Memory, which at AMD includes the provisions for ASF or lightweight locks)
-------------------

And this last one seems a fit for FSA/FSAIL (which has at its core a JIT binary translator agnostic to both GPU and CPU), which might include that spMT option as another form of on-the-fly code translation and optimization/transformation.

a) must include a form of data speculation to function well, which by itself can already give a core up to a 20% boost... and I believe a rudimentary form/first approach (elaborate enough) will be in Steamroller.

Excavator might employ plenty of support for binary translation on top of a Steamroller base (FSA will blossom), including a form of on-the-fly, fully hardware-supported "fusing" of x86 INT instructions into XOP. So where a BD core is now kind of a false 4-way-wide issue, Excavator will be 6-way-wide issue (counting 2 MMX pipes per core)...

Perhaps over 50% more single-thread performance than now, because it will have 50% more pipes (4 in BD against 6 in Exc -> more ILP), and on top of that it will also have data speculation and other tweaks (-> a clever way to substantially augment the ILP.. eh!... without really augmenting the number of execution pipes per thread :mrgreen: )... and, funny thing, it's the same basic u-arch as BD, with 4 INT pipes and 2 MMX pipes (FPU), so the FO4 and clock ability can be maintained or even improved 8)

Funny thing, this matches the rumor of Theo Valich about a 128bit monster, he even talks about 128bit x86 http://www.brightsideofnews.com/news/20 ... x?pageid=1 ... and it also matches another comment on CPU-World about a X128 support in Excavator on a PD Trinity thread http://www.cpu-world.com/news_2012/2012 ... ssors.html

Posted by: Anton Markov

This is native Next Generation Bulldozer based on new FM2 socket/chipset platform, with a big changes of work with RAM; new instructions included; new faster L cache memories of every levels and peak up tu 1.5+TFLOPs for oldest model when turbo core 3.0 is enabled...After that in 2014 will have AMD Excavator with X128 support and quad-channel DDR4...


OK, I'm an addicted speculator too :lol: ... but sorry wuttz... about your generous offer (You want some of what I'm smokin?)... the stuff these guys are smokin' is much better than yours... feewwww! :lol: :mrgreen:

Isn't this ENOUGH for ya ?.. how much do ya think Intel 'll get from Haswell ? ... i think its not really needed more to write letters and calculate spreadsheets, more will be code target specific and it will be "heterogeneous computing" (fusing x86 into XOP is already a form of forced heterogeneity)... and so after Excavator, i will not go that far... after all the world is going to end in 2012(this year), and i already got a front seat ticket with snacks and drinks at disposal, a bag of good stuff and a hot chick to pamper me up... 8) (rest is private :mrgreen: )
mmarq
K8 Athlon 64 (Orleans) Expert Boarder
Posts: 2337
Joined: Sat Jul 14, 2007 4:31 am

Re: Bulldozer NXT uarch

Postby mmarq » Fri Jan 13, 2012 7:01 am

wuttz wrote:NXT , i.e. NeXT

Objective: Fix and vastly improve single-thread performance.
How:
Make the INT scheduler in modules dynamically switch between single/dual cores,
just like how it is with the FP scheduler dynamically allocating the FP unit between two INT clusters.

So, for "heavy" thread profiles, INT scheduler assigns the resources of two INT clusters in the module to that one thread. For light threads, continue as it is today, assigning it to that single INT cluster.

This way you keep the core that is capable of higher clocks/turbo, but have a wider execution path = more available IPC for bentmarks. :mrgreen:


Now seriously!... To do that, you don't really need to touch the scheduler; it already supports speculative execution.. let's see if I can explain it... you can kind of serialize things by way of a switch-on-event hardware thread not visible to the OS, or even an internal SMT context on a single core, instead of parallelizing by making the 2 cores/clusters work on a single thread...

You can have forms of eager execution and run-ahead (they complement each other) without trashing the other thread context... otherwise bentmarks will find ways to massively target the 2 cores for a single thread, and your 8-core chip will be a 4-core most of the time, and since thread counts will surely go up for most code (but not that much), even on the desktop... chances are you'll be losing overall performance.

So... Excavator can have some very rudimentary forms of eager execution and run-ahead, facilitated by a larger potential checkpointing power than Steamroller (as a perspective example, BD already has a very rudimentary form of checkpointing -> no real traditional ROB structures)..

..it all complements data speculation (with load value prediction) very well, which I suspect Steamroller will have (along with a decoded-instruction loop detection buffer [like Intel] to alleviate front-end sharing), and it complements x86 fusing into XOP well (it enlarges the potential ILP, i.e. the number of pipes per thread, so a hardware thread can have room without trashing either of the 2 thread contexts of a module)

I don't think Excavator will have more than 2 x86 cores per thread, but at 20nm it might very well have 6 modules for desktop chips and up to 8 modules per server chip... Steamroller will be the same as now with PD: 4 and 5 modules and 2 threads per module.

Only after Excavator, and at 14nm, will we see 4 thread contexts per module, and it can be an elaborate form of SMT (many resources duplicated) on top of CMT, with support for those hardware threads in and out of any of those OS-visible thread contexts.
mmarq
K8 Athlon 64 (Orleans) Expert Boarder
Posts: 2337
Joined: Sat Jul 14, 2007 4:31 am

Re: Bulldozer NXT uarch

Postby wuttz » Fri Jan 13, 2012 10:10 am

mmarq;

i think i'd agree with your ideas;
1. "fusing x86 INT to XOP" making use of XOP pipes
2. trace cache for I$ (L0 for SB) bypassing decode when in loops

but not with;
1. spMT, eager execution
2. JIT/ FSAIL

The latter group will always have power/performance budget constraints ..
We need methods that will give better results 100% of the time.
spMT is not 100%, and JIT translation has performance overhead = no go.

Better to invest instead in;
1. higher clocks/ turbo

Why is AMD allergic to reverse-HTT?
wuttz
K8 Athlon 64 X2 (Manchester) Elite Boarder
Posts: 2992
Joined: Sat Aug 08, 2009 6:48 pm
Location: Pearland, Texas

Re: Bulldozer NXT uarch

Postby pTmd » Fri Jan 13, 2012 11:07 am

mmarq wrote:at 20nm it might very well have 6 modules for desktop chips and up to 8 modules per server chip...

It seems they will follow in Intel's steps and not fully enable cores, but clock higher with fewer cores on desktop variants. Okay... only if the "Vishera is up to 8 cores" rumor is true.
pTmd
K6-2 Fresh Boarder
Posts: 204
Joined: Sat Dec 24, 2011 5:57 pm

Re: Bulldozer NXT uarch

Postby mmarq » Sat Jan 14, 2012 7:55 am

wuttz wrote:mmarq;

i think i'd agree with your ideas;
1. "fusing x86 INT to XOP" making use of XOP pipes
2. trace cache for I$ (L0 for SB) bypassing decode when in loops


Yeah!... but a compiler, JIT or not, can accomplish that "binary translation" (fusing) much more extensively than a pure hardware method... and optimize the code at the same time. So a JIT can complement very well hardware fusing for x86 to XOP.

But recent AMD patents about trace caches specify a "redirect recovery cache" where traces are formed around branch targets, with possible extensive reuse of instructions ->

wuttz wrote:
but not with;
1. spMT, eager execution
2. JIT/ FSAIL

latter group will always have power/performance budget constraints ..
we need methods that will give better results 100% of the time.
spMT is not 100%, JIT txlation has performance overhead = no go.


SpMT is only a generic term that can encompass plenty of different techniques.

Eager execution can be like run-ahead: sequential (like IBM), using checkpointing instead of a thread context. And since you'll have a "redirect recovery cache", 80% of the difficulty of implementing eager execution is already done... how come you don't like it?

Eager execution from a trace cache, reusing decoded instructions, can have quite a low additional power cost and will surely bring good performance, because it's a form of run-ahead or ahead execution, orienting prefetch and warming caches if nothing else...

Nevertheless, I think an internal SMT context is better for this, more so because it's an even easier target for checkpointing, and also because larger traces can be employed (up to dozens of instructions, if not hundreds). And since you have vertical multithreading, you probably proceed from the "dispatch domain", where the trace cache will be(?), and vertically multithread it with the o-o-o "exec domain"... I believe it's facilitated this way...

A JIT? You already have a JIT, wuttz :?: ... if you have a GPU you already have a JIT wasting plenty of cycles on your CPU, wuttz! :roll:

FSAIL you will also have, unless you have nothing but a pure x86 CPU and no AMD GPGPU... because in the future, if you choose an AMD heterogeneous CPU (APU?), even one with no AMD GPU or graphics in it, but with the likes of compression, encryption, managed code, and large-vector support as different co-processor blocks in the modules [even for Opteron server offerings]... chances are you'll be forced to load FSAIL to make it function properly. FSA/FSAIL, agnostic to CPU and GPU, is not only for GPGPUs; it's for heterogeneous computing in general.

So with a JIT... and FSAIL, which is like a low-level VM (LLVM) target... the missing piece is compiler-oriented speculative multithreading... nothing new really... it's not easy, but not new, and one way of mitigating the issues of this "compiler speculative" tech is to orient it to break sequential code into various threads by data speculation (a level above and beyond the hardware forms of data speculation which I believe Steamroller will have) by way of transactions with protected memory regions... and transactional memory with AMD's ASF will provide exactly that...

This is the only form of spMT that has fewer power constraints, because it is more software than hardware and because it's done mostly by a compiler (JIT), with the added bonus of having a proper target ISA.. no matter if it's a "virtual ISA" as in FSAIL... so it can more easily have other code optimizations/transformations complementing this form of spMT. As a matter of fact, on-the-fly TLDS (thread-level data speculation) can be one of those optimizations/transformations, all complementing the above hardware fusing of x86 to XOP.
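
For anyone who wants the transaction-style data speculation above in concrete terms, here is a minimal software-only sketch in Python. It is an illustration under loud assumptions: `run_speculative`, `step` and `predict` are hypothetical names, the "threads" run one after another rather than in parallel, and nothing here models ASF, FSAIL or any real hardware. It only shows the speculate / validate / commit-or-redo pattern.

```python
def run_speculative(chunks, step, initial, predict):
    """Split sequential work into chunks that each start from a predicted input.

    chunks : list of work items, in program order
    step   : function(state_in, chunk) -> state_out (the real sequential work)
    predict: function(chunk_index) -> guessed state_in for that chunk
    Returns (final_state, number_of_chunks_that_had_to_be_redone).
    """
    # Phase 1: every chunk runs from its guessed input. On hardware these
    # would be separate speculative threads, each inside a transaction.
    speculative = []
    for index, chunk in enumerate(chunks):
        guess = predict(index)
        speculative.append((guess, step(guess, chunk)))

    # Phase 2: commit in program order. A chunk commits only if its guessed
    # input matches the committed state; otherwise it "aborts" and is redone.
    committed = initial
    redone = 0
    for (guess, result), chunk in zip(speculative, chunks):
        if guess == committed:
            committed = result
        else:
            committed = step(committed, chunk)
            redone += 1
    return committed, redone


if __name__ == "__main__":
    chunks = [[1, 2], [3], [4, 5]]
    add_chunk = lambda total, chunk: total + sum(chunk)
    # A perfect value predictor versus one that always guesses 0.
    print(run_speculative(chunks, add_chunk, 0, lambda i: [0, 3, 6][i]))
    print(run_speculative(chunks, add_chunk, 0, lambda i: 0))
```

The result is the same either way; only the amount of redone work changes. That is the whole bet: speculation is safe because of the validate step, and worthwhile only when the predictions are usually right.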

Transactional Memory will go ahead for sure; AMD and Intel are even cooperating on this, and so will forms of TLDS, because this transactional memory feature is "speculative" in nature, even if for now it is only specifically encoded by developers... a charm!... now AMD only has to have a way to do it on-the-fly without developer intervention, or to complement the developer's coding (THE BEST)...

Only I don't think this will be by the time of Excavator... but probably 1 or 2 iterations after.

Intel will also have a form of TLDS with HTM.. of sorts... in the form of pre-computation with internal thread contexts.

better to invest rather in;
1. higher clocks/ turbo


Quite orthogonal: even an internal SMT context on each core will not constrain this, and most probably will not imply the need for a larger FO4... OTOH, your idea of making 2 clusters/cores work on a single thread might (and not a little).

Also an internal SMT context might come very handy(unavoidable) if you want to implement reliable or redundant execution... like IBM has for the big metal... and AMD seems serious about this.

It can be BIOS triggered for those internal SMT contexts:

eager execution/run-ahead for client and small server...

redundant execution for big server jobs

all from the same exact chips...

why is amd allergic to reverse-HTT?


Reverse-HTT is spMT... 1 real thread(context) several cores (achievable by speculating)... (inverse)... HTT is 1 real core several contexts...

So!... will AMD ever implement a form of spMT ?... beyond something like eager execution that is ?... i think yes.

At least AMD, like Intel, seems serious about transactional memory support, and since you'll have a JIT and FSAIL for heterogeneous computing "things" (APUs, HCUs? :lol: ), which all metal from AMD may very well be in the future, including server chips.. why not a good form of data-speculative multithreading?
mmarq
K8 Athlon 64 (Orleans) Expert Boarder
Posts: 2337
Joined: Sat Jul 14, 2007 4:31 am

Re: Bulldozer NXT uarch

Postby mmarq » Sat Jan 14, 2012 8:04 am

pTmd wrote:
mmarq wrote:at 20nm it might very well have 6 modules for desktop chips and up to 8 modules per server chip...

It seems they will follow Intel's step and not fully enable cores but clock higher with less cores on desktop variants. Okay... only if the "Vishera is up to 8 cores" is true.


Not really! The 6-module chip will be the equivalent of Zambezi/Vishera; it will not have any cores disabled.

The 8-module chips will be a different chip... server oriented...

At 20nm, as I guess for Excavator... 8 modules will be 16 cores (REAL cores in the full sense of the word) in a single chip, so an MCM variant will have 32 (8 channels of DDR4 might suffice).

Of course those will be server variants, like Terramar (2012, 32nm), which will have the same 8 channels but "only" 20 cores, and DDR3.

Vishera 8 cores, if true? Zambezi is already 8 cores! Vishera is Piledriver for 2012.
mmarq
K8 Athlon 64 (Orleans) Expert Boarder
Posts: 2337
Joined: Sat Jul 14, 2007 4:31 am

Re: Bulldozer NXT uarch

Postby pTmd » Sat Jan 14, 2012 10:14 am

mmarq wrote:Not really! the 6 modules will be the equivalent of Zambezi/Vishera, it will not have any core disabled.

The 8 module chips will be a different chip...server oriented...

The question is whether they will make such a CPU-only chip, while:
1. the MCM strategy still remains;
2. the AMD Family of APUs will be expanded.
:P

Vishera 8 cores if true ? Zambezi is already 8 cores! Vishera is PileDriver for 2012.

I am talking about IF Vishera got eight cores then it seems blahblahblah. It's none of the business of Zambezi.
pTmd
K6-2 Fresh Boarder
Posts: 204
Joined: Sat Dec 24, 2011 5:57 pm

Re: Bulldozer NXT uarch

Postby seronx » Sat Jan 14, 2012 10:24 am

pTmd wrote:I am talking about IF Vishera got eight cores then it seems blahblahblah. It's none of the business of Zambezi.


We will find out on February 2nd, at least we won't have to wait for H2 2012
seronx
K6-III Fresh Boarder
Posts: 286
Joined: Sun Jul 17, 2011 6:03 pm

Re: Bulldozer NXT uarch

Postby mmarq » Sat Jan 14, 2012 5:44 pm

pTmd wrote:
mmarq wrote:Not really! the 6 modules will be the equivalent of Zambezi/Vishera, it will not have any core disabled.

The 8 module chips will be a different chip...server oriented...

The question is that will they make such a CPU-only chip, while:
1. the MCM strategy still remains;
2. the AMD Family of APUs will be expanded.
:P


Yes, "one thing that they don't want you to know" lol :mrgreen: .. is that a C2012 socket will have provisions for PCIe.. so it's not hard to imagine that AMD will kiss Intel's butt on the PCIe issue, instead of trashing it with HTX, and sort out a "monster APU" for those sockets that originally were supposed to be only for the likes of professional workstations and servers (with ECC DDR -> a replacement for C32)... perhaps even at 140W!...

pTmd wrote:
Vishera 8 cores if true ? Zambezi is already 8 cores! Vishera is PileDriver for 2012.

I am talking about IF Vishera got eight cores then it seems blahblahblah. It's none of the business of Zambezi.


8 cores is more than fine... Trinity has only 4 and nobody seems to complain... and IF AMD can put those cores at over 4 GHz for this PD, then awesome.
mmarq
K8 Athlon 64 (Orleans) Expert Boarder
Posts: 2337
Joined: Sat Jul 14, 2007 4:31 am

Re: Bulldozer NXT uarch

Postby Pietro sk » Sat Jan 14, 2012 11:50 pm

wuttz wrote:Whatcha think? You want some of what I'm smokin? :mrgreen: :mrgreen: :mrgreen:
You should share good stuff with us :lol:
The famous intel lawsuit
Moron is playing VIDEO-game
Not biased Cinebench ? Think again ... They say it's "finetuned" for Intel CPUs ..
Pietro sk
K8 Athlon 64 X2 (Windsor) Elite Boarder
Posts: 3933
Joined: Fri Jan 08, 2010 3:59 pm
Location: Le sarcasm..

Re: Bulldozer NXT uarch

Postby DamnYank » Thu Mar 01, 2012 2:11 am

Saw a post on SA with a link,

http://www.russinoff.com/papers/srt8.pdf

I did not see anything about this posted here on AMD Zone; I was hoping somebody could explain exactly what this will mean performance-wise for Steamroller compared to Bulldozer and Piledriver.
DamnYank
K5 Fresh Boarder
Posts: 119
Joined: Thu Aug 11, 2011 6:03 am

Re: Bulldozer NXT uarch

Postby Lamb0 » Thu Mar 01, 2012 4:17 am

Ooof! :? It looks like a Mathematician's Proof of a computerized version of the way my Father taught me to do a long hand square root in fourth grade. :roll: He's got the "i"s dotted and the "t"s crossed so it "just works" without introducing unmanageable discrepancies for efficient computation on a computer. 8) My Honors Calculus instructor would have referred to it as "elegant". :lol: However, I was paying more attention to the "elegant" way she filled a tight t-shirt at the time! :oops:
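
For reference, the long-hand (digit-by-digit) square root mentioned above looks roughly like this in Python when done in binary on integers. This is the plain schoolbook method, one result bit per step; it is NOT the SRT algorithm or the lookup tables from the linked paper, and the function name is just a label picked for this sketch.

```python
def isqrt_digit_by_digit(n):
    """Integer square root (floor) by the long-hand method, one bit per step."""
    if n < 0:
        raise ValueError("square root of a negative number")
    # Start with the highest power of four that does not exceed n.
    bit = 1
    while bit * 4 <= n:
        bit *= 4
    root = 0
    while bit:
        if n >= root + bit:
            # The trial digit fits: subtract it and record a 1 bit.
            n -= root + bit
            root = (root >> 1) + bit
        else:
            # The trial digit does not fit: record a 0 bit.
            root >>= 1
        bit >>= 2
    return root


if __name__ == "__main__":
    for value in (0, 1, 10, 16, 1_000_000):
        print(value, isqrt_digit_by_digit(value))
```

Each pass of the loop settles exactly one bit of the root, which is the radix-2 case. Higher-radix hardware schemes settle several bits per step, at the price of a harder digit-selection problem.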
Lamb0
K7 Athlon XP (Palomino) Junior Boarder
Posts: 414
Joined: Tue Jan 05, 2010 10:18 am
Location: Faibury, NE, USA

Re: Bulldozer NXT uarch

Postby Montaray Jack » Thu Mar 01, 2012 7:19 am

Be careful about using the following code -- I've only proven that it
works, I haven't tested it.

Donald Knuth
Seems apropos.

The infamous SRT lookup tables. Pentium's FDIV bug caused me to ruin a very expensive piece of steel.

So he proved it correct. Performance-wise, I don't know. It's probably not a change to using the FMA functions, because that would probably use Goldschmidt's algorithm. (Edit: nope, I'm wrong here. HP used SRT with FMA on the PA8000, so it IS possible; Sun used Goldschmidt's on the SuperSPARC but didn't use FMA. Just for curiosity's sake, there are other outliers among historic processors' division algorithms: the DEC 21164 Alpha AXP used adder-coupled SRT, the IBM RS/6000 Power2 used Newton-Raphson, the MIPS R8000 used a multiplicative algorithm, while the MIPS R10000 used multiplier-coupled SRT.)

The big change is from a radix 4 to a radix 8 algorithm. This might shed some light on the matter.
http://www.acsel-lab.com/arithmetic/papers/ARITH09/ARITH09_Fandrianto.pdf
ALGORITHM for HIGH SPEED SHARED RADIX 8 DIVISION and RADIX 8 SQUARE ROOT
And this:
http://gram.eng.uci.edu/~numlab/archive/pub/nl98p02/nl98p-02.html
LOW-POWER RADIX-8 DIVIDER
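
As a back-of-the-envelope illustration of why the radix matters: a radix-r digit-recurrence divider retires log2(r) quotient bits per iteration, so radix 4 gives 2 bits and radix 8 gives 3. The Python sketch below just counts iterations. The function name is made up, and the count ignores guard/rounding bits, pre- and post-processing, and the fact that a higher-radix iteration may itself be slower, so treat it as a rough comparison and not as a latency figure for any real chip.

```python
def srt_iterations(quotient_bits, radix):
    """Iterations needed to produce quotient_bits at log2(radix) bits each."""
    if radix < 2 or radix & (radix - 1):
        raise ValueError("radix must be a power of two")
    bits_per_iteration = radix.bit_length() - 1  # log2(radix)
    # Ceiling division: a partial final iteration still costs a full one.
    return -(-quotient_bits // bits_per_iteration)


if __name__ == "__main__":
    # Significand widths: 24 bits (single precision), 53 bits (double).
    for bits in (24, 53):
        print(bits, srt_iterations(bits, 4), srt_iterations(bits, 8))
```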

One last link, rather old, but where I got the info for the '90s processors:
Pipelining High-Radix SRT Division Algorithms
http://www.csee.umbc.edu/~squire/images/srt1.pdf
No.6: “The whole earth as. . . `The Village'?”
No.2: “That is my hope. What's yours?”
No.6: “I'd like to be the first man on the moon!”
--Chimes of Big Ben
Montaray Jack
K8 Athlon 64 (Clawhammer) Senior Boarder
Posts: 1166
Joined: Sat May 30, 2009 11:29 pm
Location: The Village

Re: Bulldozer NXT uarch

Postby duby229 » Sun Mar 18, 2012 12:00 am

Let me chime in and say that I don't believe AMD is in a position to fight Intel in a manufacturing war....

But I do believe also that they have some real amazing IP to work with that they are developing well. We'll see how Mr. Read handles the situation.
duby229
K8 Athlon 64 (San Diego) Expert Boarder
Posts: 1863
Joined: Wed Mar 10, 2004 8:45 pm

Re: Bulldozer NXT uarch

Postby eaima » Sun Mar 18, 2012 2:06 am

duby229 wrote:Let me chime in and say that I don't believe AMD is in a position to fight Intel in a manufacturing war....

It's obvious that AMD is not in a position to fight on the manufacturing side; they are fabless. :P
The constancy of the universe:
1+1=10
2+2=10
...
π+π=10
eaima
K7 Athlon XP (Thoroughbred) Senior Boarder
Posts: 781
Joined: Fri Jun 26, 2009 2:25 pm
Location: >Shall we play a game?

Re: Bulldozer NXT uarch

Postby eaima » Sun Mar 18, 2012 2:07 am

Anton Markov wrote:
mmarq wrote:NXT , i.e. NeXT
Posted by: Anton Markov

This is native Next Generation Bulldozer based on new FM2 socket/chipset platform, with a big changes of work with RAM; new instructions included; new faster L cache memories of every levels and peak up tu 1.5+TFLOPs for oldest model when turbo core 3.0 is enabled...After that in 2014 will have AMD Excavator with X128 support and quad-channel DDR4...

Anton Markov is here! Yes I make speculation with some specs...I wish also to ignite in you a desire to want more for your own benefit and also to prevent, any monopolization of production and the relevant part of the market (for "amateurish" processors) where our users to lead us where they want tied to the string holding the carrot in front of our eyes ....I wish a real competition, bloody war between manufacturers and superpower processors for a few dollars each :mrgreen:


Hi Anton, welcome to the zone! Yeah, we all dream the same. I am dreaming of a world where all competitors put their efforts into making better products, instead of concocting a tactic where the adversary dies so they get the whole place by default; because it's hard to move when your adversary has his feet on your foot. Now I need to find a name for this planet.
eaima
K7 Athlon XP (Thoroughbred) Senior Boarder
Posts: 781
Joined: Fri Jun 26, 2009 2:25 pm
Location: >Shall we play a game?

Re: Bulldozer NXT uarch

Postby duby229 » Sun Mar 18, 2012 3:18 am

cheiftonia....
duby229
K8 Athlon 64 (San Diego) Expert Boarder
Posts: 1863
Joined: Wed Mar 10, 2004 8:45 pm

Re: Bulldozer NXT uarch

Postby AussieFX » Sun Mar 18, 2012 10:09 am

Republic of PGOMF...

(Please Get Off My Foot)
Sent from my flippy phone thingy using TAPATALK HD_2016.1


Nikon D7000 / Nikon D5000
AussieFX
K8 Opteron (SledgeHammer) Moderator
Posts: 7678
Joined: Fri May 11, 2007 1:50 pm
Location: I wish I knew...

Streamroller discussion

Postby undone1999 » Thu May 03, 2012 12:09 pm

Time to start a thread about the future. :wink:

There were some rumors that the new architecture, SR, would have lots of changes compared to today's BD and PD.
Kaveri is an APU which will be based on SR and a GCN GPU. It's certain that Kaveri will have more SPs in the GPU, and we already know a lot about GCN, so... is it possible that the GPU part would be significantly larger than the SR-based CPU part in Kaveri? :shock: I doubt AMD would sacrifice more area for the GPU.
When I see the heat problem on IVB, which is partly because the core is hotter due to its smaller area, I begin to wonder: would the SR module be significantly bigger than PD? If so, the die area of the CPU and GPU parts would be balanced. Just a thought.
undone1999
 
Posts: 14
Joined: Mon Jan 31, 2011 3:43 pm

Re: Streamroller discussion

Postby Lamb0 » Thu May 03, 2012 2:52 pm

:mrgreen: Um...Can we have an upgrade from 384 SPs to 512 SPs and 3 Modules (6 cores) with the change from 32nm to 28nm? If three CPU modules is 1 too many, perhaps a nice BIG boost in CPU/GPU cache quantity and efficiency for the new APU will be the order of the day. :wink:
Lamb0
K7 Athlon XP (Palomino) Junior Boarder
Posts: 414
Joined: Tue Jan 05, 2010 10:18 am
Location: Faibury, NE, USA

Re: Streamroller discussion

Postby duby229 » Thu May 03, 2012 6:12 pm

Cool. Now we are talking about steamrollers... I happen to know a lot about them. :D
duby229
K8 Athlon 64 (San Diego) Expert Boarder
Posts: 1863
Joined: Wed Mar 10, 2004 8:45 pm
