3 Operand x86 Instructions? What the Hell?

Postby Elhardt » Thu Feb 23, 2012 8:23 am

I'm trying to confirm something shown in an AMD manual.

While AVX adds 3 operand instructions for SSE vector instructions, it still does nothing for the general purpose archaic x86 instruction set.... But I just downloaded the AMD pdf manual: "Software Optimization Guide for AMD Family 15h Processors", and was shocked by what I saw starting on page 163. They're showing 3 operand general purpose instructions like:

sub ebx, eax, 1
or ecx, ebx, eax

I have never heard that 3 operand instructions were added to the x86 instruction set, nor can I find any reference beyond this one manual to such a thing. The questions are, does the Bulldozer architecture have this capability, is this written for the future Piledriver architecture, or is it just a bunch of nonsense written by somebody who isn't aware that x86 instructions aren't non-destructive 3 operand types?

Thanks for any insight.
Elhardt
 
Posts: 3
Joined: Thu Feb 23, 2012 8:07 am

Re: 3 Operand x86 Instructions? What the Hell?

Postby abinstein » Thu Feb 23, 2012 6:44 pm

Elhardt wrote:I'm trying to confirm something shown in an AMD manual.

While AVX adds 3 operand instructions for SSE vector instructions, it still does nothing for the general purpose archaic x86 instruction set.... But I just downloaded the AMD pdf manual: "Software Optimization Guide for AMD Family 15h Processors", and was shocked by what I saw starting on page 163. They're showing 3 operand general purpose instructions like:

sub ebx, eax, 1
or ecx, ebx, eax

I have never heard that 3 operand instructions were added to the x86 instruction set, nor can I find any reference beyond this one manual to such a thing. The questions are, does the Bulldozer architecture have this capability, is this written for the future Piledriver architecture, or is it just a bunch of nonsense written by somebody who isn't aware that x86 instructions aren't non-destructive 3 operand types?

Thanks for any insight.

Nice find! Welcome to the Zone!

The 3-op BMI and TBM instructions were added in Family 15h models 10h-1Fh and 20h-2Fh (which may be Piledriver and Trinity?). They are considered general-purpose instructions.
AFAIK there's no 3-op version of other (original) general-purpose instructions.

The confusing sample code that you found is, well, just sloppy writing, I guess. It seems to mix up pseudo-code operation mnemonics with actual x86 instructions. But nice find and I think someone should tell AMD about this. (Plus the description of that example 1 is wrong.) EDIT: on second thought, perhaps they intended to write the original code in pseudo-code mnemonics, since the target audience is probably compiler writers who will be working with these optimizations at the intermediate instruction level. Still, they should explicitly say so to avoid confusion.
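
To make the pseudo-code point concrete, here is a small Python sketch (my own illustration, not taken from the AMD guide). It models the two pseudo-code lines from the first post, then one possible expansion into real destructive two-operand x86 instructions, and finally the single TBM operation they appear to match. As far as I know BLSFILL computes src | (src - 1), but treat that mapping as an assumption; the function names and the 32-bit mask are made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers such as eax/ebx/ecx


def pseudo_three_operand(eax):
    """Pseudo-code form: sub ebx, eax, 1 ; or ecx, ebx, eax."""
    ebx = (eax - 1) & MASK32      # sub ebx, eax, 1  (eax is left intact)
    ecx = (ebx | eax) & MASK32    # or  ecx, ebx, eax
    return ecx


def real_two_operand(eax):
    """Destructive x86 form: each result overwrites its first operand."""
    ebx = eax                     # mov ebx, eax
    ebx = (ebx - 1) & MASK32      # sub ebx, 1
    ecx = ebx                     # mov ecx, ebx
    ecx = (ecx | eax) & MASK32    # or  ecx, eax
    return ecx


def blsfill(src):
    """Single-instruction form, assuming BLSFILL computes src | (src - 1)."""
    return (src | (src - 1)) & MASK32
```

All three functions return the same value for any 32-bit input. The difference is only in how many instructions (and extra mov copies) it takes to get there, which is what the manual's comparison seems to be about.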
Anandtech -- a site which every visit makes me regret my time spent! It is a living testimony of Einstein's quote:"Only two things are infinite, the universe and human stupidity, and I'm not sure about the former."
abinstein
K8 Opteron (SledgeHammer) Moderator
 
Posts: 7171
Joined: Sat Oct 30, 2004 9:49 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby Elhardt » Fri Feb 24, 2012 12:39 pm

Thanks for the information. It really is deceiving for them to put lots of examples of the general purpose x86 instructions with 3 operands. They're showing people how to reduce instruction count by using the new bit manipulation instructions, but then they use false comparisons with fake instructions. It doesn't make any sense. Intel/AMD seem to only concentrate on modernizing the SSE stuff but keep avoiding the x86 instructions. Well, sort of. They're finally giving us some new bit instructions, but even there it looks like they didn't include a bitfield insert instruction to complement their bitfield extract instruction, to say nothing about auto increment/decrement registers, and 3/4 operand x86 instructions. So they're still not where the 68000 was in 1979.
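
For anyone wondering what the extract/insert pair would look like, here is a rough Python model. The extract follows the usual BEXTR-style definition (shift right by the start bit, mask to the field length). The insert is purely hypothetical, since no such instruction is in the set being discussed; `bitfield_insert` and its argument order are my own invention for illustration.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers


def bextr(src, start, length):
    """Extract `length` bits of src beginning at bit `start` (BEXTR-style)."""
    if start >= 32:
        return 0
    return (src >> start) & ((1 << length) - 1) & MASK32


def bitfield_insert(dest, value, start, length):
    """Hypothetical insert: overwrite the field in dest with the low bits of value."""
    field_mask = (((1 << length) - 1) << start) & MASK32
    cleared = dest & ~field_mask & MASK32          # punch a hole in dest
    return cleared | ((value << start) & field_mask)
```

Without a dedicated instruction, the insert has to be spelled out as a mask, a shift and an or, which is the extra instruction count being complained about here.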

-Elhardt
Elhardt
 
Posts: 3
Joined: Thu Feb 23, 2012 8:07 am

Re: 3 Operand x86 Instructions? What the Hell?

Postby hyc » Sat Feb 25, 2012 1:55 am

Elhardt wrote:So they're still not where the 68000 was in 1979.
-Elhardt


Sigh. QFT...
hyc
K8 Athlon 64 (Winchester) Expert Boarder
 
Posts: 1349
Joined: Mon Jan 16, 2006 4:38 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby abinstein » Sat Feb 25, 2012 5:47 am

I really don't think it is a goal of either AMD or Intel to get where the 68000 was at all. It would be both blind and pompous to assume that. The truth is, x86 and x86-64 have their own lives and evolution, to satisfy the real needs of customers, rather than those of a few idealists and ideologists.

It is foolish to add some rarely used instructions into the data path just for the sake of completeness. That's probably the #1 rule in microarchitecture design.

In fact, TBM and BMI wouldn't have made ANY sense if x86 didn't have only destructive 2-argument general purpose instructions. Keeping that backward compatibility is an absolute must for x86-64. Thus it made some slight sense to have TBM and BMI, in the hope that they would alleviate some register pressure for these operations (still a pretty weak reason IMO to add these instructions, but well...)
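
As a rough illustration of what these operations do, here is a Python sketch of four BMI1 instructions as I understand their definitions. Each one writes to a separate destination, so the source register survives without an extra mov. The 32-bit mask and the function names are just conventions for this sketch.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers


def andn(src1, src2):
    """ANDN: (~src1) & src2, written to a separate destination."""
    return (~src1 & MASK32) & src2


def blsr(src):
    """BLSR: clear the lowest set bit."""
    return src & (src - 1) & MASK32


def blsi(src):
    """BLSI: isolate the lowest set bit."""
    return src & -src & MASK32


def blsmsk(src):
    """BLSMSK: mask up to and including the lowest set bit."""
    return (src ^ (src - 1)) & MASK32
```

With plain two-operand x86, each of these needs a copy of the source plus one or two ALU instructions, so the saving is real but small.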
abinstein
K8 Opteron (SledgeHammer) Moderator
 
Posts: 7171
Joined: Sat Oct 30, 2004 9:49 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby gallier2 » Sat Feb 25, 2012 10:34 am

hyc wrote:
Elhardt wrote:So they're still not where the 68000 was in 1979.
-Elhardt


Sigh. QFT...


Considering that the semantic richness of the 68k was the reason for its downfall in comparison to x86, I wouldn't be so emphatic. The lateness of the 68060 compared to the Pentium and its slow clock were certainly due to the overly CISCy addressing modes. In comparison, x86's addressing modes are pure RISC (there's at most 1 page fault in x86, there could be up to 8 in m680[236]0). I don't want to lecture you considering your history (I registered your name already at the time I had my Ataris) but the move mem, mem and the indirect addressing modes were imho extremely costly design errors.
gallier2
K5 Fresh Boarder
 
Posts: 131
Joined: Thu Oct 07, 2010 9:08 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby hyc » Sun Feb 26, 2012 11:10 am

abinstein wrote:I really don't think it is a goal of either AMD or Intel to get where the 68000 was at all. It would be both blind and pompous to assume that. The truth is, x86 and x86-64 have their own lives and evolution, to satisfy the real needs of customers, rather than those of a few idealists and ideologists.

It is foolish to add some rarely used instructions into the data path just for the sake of completeness. That's probably the #1 rule in microarchitecture design.

In fact, TBM and BMI wouldn't have made ANY sense if x86 didn't have only destructive 2-argument general purpose instructions. Keeping that backward compatibility is an absolute must for x86-64. Thus it made some slight sense to have TBM and BMI, in the hope that they would alleviate some register pressure for these operations (still a pretty weak reason IMO to add these instructions, but well...)


If you look at the evolution of x86, the main reason for any disparity in frequency of instruction use is the non-orthogonal *register use*. Aside from the fact that the programmer's model was register-starved, the fact that certain instructions could only operate on certain registers (and some of those implicitly, at that) forced the frequency of instructions to diverge.

Contrast that with a more orthogonal design and you find that, instead of instruction frequencies being spread across 50+ classes of instructions, they're spread across only a handful of instruction classes. Microarchitectures don't just live for their own purity or simplicity at the hardware level - they must be usable from software. Compilers for x86 are ridiculously complicated compared to other architectures. Optimization is literally an intractable task due to the combinatorial possibilities.

With M68K you could write a few cases and *know* that you had generated optimal code. With x86, no way; too many corner cases.

All the new instructions added to x86/x86-64 over the years are just bandaids on top of an ugly design.
hyc
K8 Athlon 64 (Winchester) Expert Boarder
 
Posts: 1349
Joined: Mon Jan 16, 2006 4:38 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby hyc » Sun Feb 26, 2012 11:14 am

gallier2 wrote:
hyc wrote:
Elhardt wrote:So they're still not where the 68000 was in 1979.
-Elhardt


Sigh. QFT...


Considering that the semantic richness of the 68k was the reason for its downfall in comparison to x86, I wouldn't be so emphatic. The lateness of the 68060 compared to the Pentium and its slow clock were certainly due to the overly CISCy addressing modes. In comparison, x86's addressing modes are pure RISC (there's at most 1 page fault in x86, there could be up to 8 in m680[236]0). I don't want to lecture you considering your history (I registered your name already at the time I had my Ataris) but the move mem, mem and the indirect addressing modes were imho extremely costly design errors.


(ahh, the good ol' Atari days.....)

I suppose those addressing modes were expensive to implement. But at the same time, once they've implemented a mechanism to save state, it's just iteration of a known code path. The complexity isn't that much greater. And also, the number of page faults obviously depends on your actual data - if your access pattern was going to touch those addresses anyway, then you were going to pay for those 8 faults anyway. Better to do it from a single instruction, than from 8 (or more) instructions, with multiple state save/restores incremented at each one.

Especially today, with the prevalence of object oriented languages, all of those modes would see more use, and you'd get far more compact code using them than not.

But I suppose you're right, it may have been too far forward-thinking for the time. Certainly they were struggling to keep the thermals under control. Oh well.
hyc
K8 Athlon 64 (Winchester) Expert Boarder
 
Posts: 1349
Joined: Mon Jan 16, 2006 4:38 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby abinstein » Sun Feb 26, 2012 10:23 pm

hyc wrote:If you look at the evolution of x86, the main reason for any disparity in frequency of instruction use is the non-orthogonal *register use*. Aside from the fact that the programmer's model was register-starved, the fact that certain instructions could only operate on certain registers (and some of those implicitly, at that) forced the frequency of instructions to diverge.

Contrast that with a more orthogonal design and you find that, instead of instruction frequencies being spread across 50+ classes of instructions, they're spread across only a handful of instruction classes. Microarchitectures don't just live for their own purity or simplicity at the hardware level - they must be usable from software. Compilers for x86 are ridiculously complicated compared to other architectures. Optimization is literally an intractable task due to the combinatorial possibilities.

With M68K you could write a few cases and *know* that you had generated optimal code. With x86, no way; too many corner cases.

All the new instructions added to x86/x86-64 over the years are just bandaids on top of an ugly design.


I am not trying to say x86 is a good ISA. I agree with your assessment mostly. But it wouldn't have been good to evolve it toward 68000, either. It is my belief that the CISC days were over a long time ago, ever since the Pentium Pro and pretty much every other high performance CPU were designed internally as RISC.

On the other hand, insisting on orthogonality also does not make an efficient CPU, since in practice, not all instructions need all memory/register addressing modes. From this point of view, the 68000 instruction set is kind of oddly positioned. Its addressing modes are too complex and CISC-like; OTOH, the instructions are made (near) orthogonal and look like a RISC purist design. (Although I must say that I've never worked with 68000 on the assembly level.)

The x86-64 is clearly going its own way. It has its CISC compatibility to maintain. It has its RISC core, which is also getting very complex. While the critical path is no longer the execution pipe, new specialized instructions can be added to utilize additional die area. This is actually not a bad evolution from an initially poorly designed ISA.
abinstein
K8 Opteron (SledgeHammer) Moderator
 
Posts: 7171
Joined: Sat Oct 30, 2004 9:49 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby Elhardt » Mon Feb 27, 2012 6:51 am

Okay, I'm seeing a few comments that I don't understand, for instance "QFT" and some comment about Atari, and other comments regarding the 68K where, as usual, people are taking what I said and warping it into all kinds of unrelated stuff. The 68K was not a failed design, it was the design of choice for any company starting a computer design from scratch, such as Apple, Atari, Amiga, Sun, Silicon Graphics, almost every other workstation, plug in coprocessor boards for PCs, and so on. The 68K chips were always ahead of the x86 in clock rate until the 68040, which slipped backwards. The 68K had 32 bit instructions and registers from day one. Complexity of addressing modes shouldn't have anything to do with how fast a chip can be clocked. In fact, virtually all of those 68K addressing modes were present on Motorola's 88000 RISC CPU.

However, I wasn't asking for more complex addressing modes, just auto inc/dec modes like every other CPU in the world and 3 operand general purpose instructions, not just for SSE. Intel/AMD just seem to keep concentrating on the SSE portion while leaving the main CPU in a primitive state. Well almost, they did finally give us 8 more registers, but then kept them from being used for 32 bit code. Since I'm an assembly language programmer, I like elegant, tight, fast and efficient code, and the 1974/1977 based Intel architecture isn't it. In fact, the x86 is such a lousy instruction set while the PowerPC for example is such a good one, that the whole "CISC uses fewer instructions, RISC uses more" theory turns out to be the opposite most of the time, and sometimes by quite a bit.

Since Intel and AMD don't seem to have a problem adding more features and gobs more SSE instructions every couple of years, making code written for those new features incompatible with older CPUs, then there's no reason for not doing some of the simple improvements to the basic x86 portion. In fact, since programmers need to recompile code for 64 bit operation, that would have been a good time for Intel/AMD to wipe the slate clean and go with an even 32 bit instruction word and modern instruction set when running in 64 bit mode. They did give us those much needed extra registers, but we should have gotten more than just those.
Elhardt
 
Posts: 3
Joined: Thu Feb 23, 2012 8:07 am

Re: 3 Operand x86 Instructions? What the Hell?

Postby hyc » Mon Feb 27, 2012 10:00 am

abinstein wrote:I am not trying to say x86 is a good ISA. I agree with your assessment mostly. But it wouldn't have been good to evolve it toward 68000, either. It is my belief that the CISC days were over a long time ago, ever since the Pentium Pro and pretty much every other high performance CPU were designed internally as RISC.

On the other hand, insisting on orthogonality also does not make an efficient CPU, since in practice, not all instructions need all memory/register addressing modes. From this point of view, the 68000 instruction set is kind of oddly positioned. Its addressing modes are too complex and CISC-like; OTOH, the instructions are made (near) orthogonal and look like a RISC purist design. (Although I must say that I've never worked with 68000 on the assembly level.)

The x86-64 is clearly going its own way. It has its CISC compatibility to maintain. It has its RISC core, which is also getting very complex. While the critical path is no longer the execution pipe, new specialized instructions can be added to utilize additional die area. This is actually not a bad evolution from an initially poorly designed ISA.


Even the M68K family was a RISC core at heart, by the time of the 68040. I'll agree with you that x86 couldn't practically become a 68000. But both 68K and x86 are proof that having a CISC programming model doesn't conflict with having a RISC core. The fact that macro-op fusion is a *good thing* in x86 designs also proves that there was room to go even more CISC. So while you make a lot of good points, I'm going to disagree with you that CISC days are over. (And again, I'm talking about the exposed programming model, not necessarily the core microarchitecture.)
hyc
K8 Athlon 64 (Winchester) Expert Boarder
 
Posts: 1349
Joined: Mon Jan 16, 2006 4:38 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby abinstein » Mon Feb 27, 2012 7:32 pm

hyc wrote:
abinstein wrote:I am not trying to say x86 is a good ISA. I agree with your assessment mostly. But it wouldn't have been good to evolve it toward 68000, either. It is my belief that the CISC days were over a long time ago, ever since the Pentium Pro and pretty much every other high performance CPU were designed internally as RISC.

On the other hand, insisting on orthogonality also does not make an efficient CPU, since in practice, not all instructions need all memory/register addressing modes. From this point of view, the 68000 instruction set is kind of oddly positioned. Its addressing modes are too complex and CISC-like; OTOH, the instructions are made (near) orthogonal and look like a RISC purist design. (Although I must say that I've never worked with 68000 on the assembly level.)

The x86-64 is clearly going its own way. It has its CISC compatibility to maintain. It has its RISC core, which is also getting very complex. While the critical path is no longer the execution pipe, new specialized instructions can be added to utilize additional die area. This is actually not a bad evolution from an initially poorly designed ISA.


Even the M68K family was a RISC core at heart, by the time of the 68040. I'll agree with you that x86 couldn't practically become a 68000. But both 68K and x86 are proof that having a CISC programming model doesn't conflict with having a RISC core. The fact that macro-op fusion is a *good thing* in x86 designs also proves that there was room to go even more CISC. So while you make a lot of good points, I'm going to disagree with you that CISC days are over. (And again, I'm talking about the exposed programming model, not necessarily the core microarchitecture.)

I think from a programmer's point of view, it really doesn't matter whether the CPU is CISC or RISC. We all program conceptually in CISC. Any algorithm starts at high level as complex operations. Still, these operations are ultimately transformed into simple steps, before they could be understood by machines, or even another human being.

Many people view the advent of SSE as the comeback of CISC. But really SIMD and CISC are two very different things. CISC is to provide comprehensive and specialized service to high-level programs. SIMD is to utilize available chip area for data parallel processing. SSE is actually a poorly designed SIMD, particularly due to its somewhat-CISC nature. Anyone who has programmed with SSE should know how ugly and irregular its instructions are. (Somehow Intel just cannot come up with a clean ISA in the first attempt.) Even so, we still don't find all SSE instructions to access memory in any way. So really SSE is only 1/2 a CISC, and IMO the good part of it is purely RISC (/SIMD).

Some SSE instructions are quite specialized and perform some "nice" side effects, e.g. SSE4. They have little use outside of some benchmarks. I don't understand the rationale behind adding such junk into an ISA, but I guess it's just usual business (for Intel).
abinstein
K8 Opteron (SledgeHammer) Moderator
 
Posts: 7171
Joined: Sat Oct 30, 2004 9:49 pm

Re: 3 Operand x86 Instructions? What the Hell?

Postby Montaray Jack » Tue Feb 28, 2012 6:04 am

QFT is an acronym that means "quoted for truth"
Given the context, I think hyc is in agreement with your evaluation.
It's probably best to read "QFT Sigh..." as "sad but true"
No.6: “The whole earth as. . . `The Village'?”
No.2: “That is my hope. What's yours?”
No.6: “I'd like to be the first man on the moon!”
--Chimes of Big Ben
Montaray Jack
K8 Athlon 64 (Clawhammer) Senior Boarder
 
Posts: 1132
Joined: Sat May 30, 2009 11:29 pm
Location: The Village