An excerpt:
There is an almost invisible war going on between Intel and AMD. It is a contest over who gets to define the new additions to the x86 instruction set. This war has been going on behind the scenes for years without being noticed by the majority of IT professionals. Most programmers don't care what is going on at the machine code level, so they don't see all the ridiculous consequences that this war has. Those working with virtualization may have noticed that Intel and AMD processors are incompatible when it comes to virtualization software, but this is only one of the more visible consequences of the conflict.
Some important battles
Traditionally, Intel has been the market leader, defining the instruction set for each new generation of microprocessors: 8086, 80186, 80286, 80386, etc. Each new instruction set is a superset of the previous one, so that backwards compatibility is maintained.
Intel's main competitor, AMD, has tried several times to gain the lead by defining their own extensions to the x86 instruction set. In 1998, AMD was the first to introduce floating-point Single-Instruction-Multiple-Data (SIMD) instructions in their so-called 3DNow instruction set. Intel never supported the 3DNow instructions. Instead, they introduced the SSE instruction set the following year. SSE does essentially the same thing as 3DNow, but with a larger register size (128 bits rather than 64). Clearly, Intel had won, and AMD had to support SSE because it was the better of the two.
In 2001, Intel launched their first 64-bit processor, named Itanium, with a new parallel instruction set. Instead of accepting the new Itanium instruction set, AMD developed their own 64-bit instruction set which, unlike the Itanium, was backwards compatible with the x86 instruction set. The market favored backwards compatibility, so AMD won this time, and Intel had to support the AMD64, or x86-64, instruction set in their next processor.
The next important battle is going on right now. It's about instructions with more than two operands. The industry has recognized a need for fused multiply-and-add instructions (e.g. D = A*B + C) and several other instructions with more than two operands. The current coding scheme supports only instructions with two operands, so a new coding scheme has to be invented in order to support instructions with more than two operands.
AMD came first with a proposal. In August 2007, AMD announced a future instruction set called SSE5 with a new coding scheme. The early disclosure of AMD's intentions was a break with the previous policy, where both companies had kept their intentions secret as long as possible. Intel's reply came in April 2008 with an early (probably premature) disclosure of their planned AVX instruction set.
Intel's AVX coding scheme was much more flexible and future-oriented than AMD's SSE5 scheme, as I argued in a public discussion forum. Most importantly, the AVX scheme has room for future extensions of the size of the SIMD vector registers, while the SSE5 scheme has little room for any future extensions. It was pretty obvious that Intel had won this time, and thanks to the early disclosure of Intel's AVX instructions, it was not too late for AMD to change their plans. In May 2009, AMD published a revision of their plans in which they modified the coding scheme for better compatibility with AVX. In addition to full support for AVX, the revised AMD plan contains most of the original SSE5 instructions under the new name XOP and with the new coding scheme.
Unfortunately, Intel had changed their plans in the meantime! In December 2008, Intel published a revision of their plans which involved a change in the coding of the fused multiply-and-add (FMA) instructions. By then it was too late for AMD to change their design once more, so the first AMD processors with FMA will follow the premature Intel specification rather than Intel's later revision.
It is difficult to obtain compatibility when you are following a moving target.
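To make the fused multiply-and-add operation mentioned above (D = A*B + C) a little more concrete, here is a minimal sketch in Python of why a fused instruction is more than a convenience. It is a software model only, not the hardware instruction: the function names are made up for this illustration, and the "fused" version is emulated with exact rational arithmetic so that the result is rounded once instead of twice. Special values such as infinities and NaN are not handled.

```python
from fractions import Fraction


def mul_add_unfused(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded, then the sum is rounded."""
    product = a * b      # first rounding, to the nearest double
    return product + c   # second rounding


def mul_add_fused(a: float, b: float, c: float) -> float:
    """Software model of a fused multiply-add: exact a*b + c, rounded once.

    Illustrative only. Finite inputs are assumed; Fraction() rejects inf and NaN.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no rounding here
    return float(exact)  # single rounding, to the nearest double


# Inputs chosen so that the low-order bits of the product matter.
a = 1.0 + 2.0**-30
b = a
c = -(1.0 + 2.0**-29)

# The exact product is 1 + 2**-29 + 2**-60. The 2**-60 term does not fit
# in a double, so the unfused version loses it before the addition.
print(mul_add_unfused(a, b, c))              # 0.0
print(mul_add_fused(a, b, c) == 2.0**-60)    # True
```

The three inputs and the separate destination are also what make such an instruction awkward to express in a scheme where each instruction names only two operands and overwrites one of them.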