bulldozer AVX/FMA4 tests - why no f64 ymm dot product ???

Re: bulldozer AVX/FMA4 tests - why no f64 ymm dot product ??

Postby anubis » Wed Mar 07, 2012 9:29 am

bootstrap wrote:
The question is this. Why is there no 256-bit (ymm register) version of the dot product for f64 variables? There is a 256-bit (ymm register) dot product for f32 variables. Seems very strange. I recall no other case where that difference exists. Just for fun I tried putting these 4 lines in a test function and sure enough, the assembler generated an error for only the last of these 4 lines:

vdpps $0x77, %xmm2, %xmm1, %xmm0 # 128-bit xmm register dot product for f32 variables
vdpps $0x77, %ymm2, %ymm1, %ymm0 # 256-bit ymm register dot product for f32 variables
vdppd $0x77, %xmm2, %xmm1, %xmm0 # 128-bit xmm register dot product for f64 variables
vdppd $0x77, %ymm2, %ymm1, %ymm0 # 256-bit ymm register dot product for f64 variables



Intel's documentation does note that the 256-bit (ymm) form of vdppd is not present in AVX; only the 128-bit form is defined. http://software.intel.com/file/41604 (pg 538)


Looking at the documentation for vdpps http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011/compiler_c/intref_cls/common/intref_avx_dp_ps.htm, it seems each dot product always covers 4 floats: the 128-bit form computes one 4-float dot product, and the 256-bit form computes a separate 4-float dot product in each 128-bit half rather than one 8-float dot product.
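To make the vdpps behavior concrete, here is a small scalar model in C. It is only a sketch of my reading of the instruction reference, not real AVX code: the function names `dpps_lane` and `vdpps256_model` are made up for illustration, and the summation order (and therefore rounding) of the real instruction may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of one 128-bit lane of (V)DPPS.
 * imm8[7:4] selects which products enter the sum;
 * imm8[3:0] selects which destination elements receive the sum.
 * Unselected destination elements are set to 0. */
static void dpps_lane(float dst[4], const float a[4], const float b[4],
                      unsigned imm8)
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++)
        if (imm8 & (0x10u << i))
            sum += a[i] * b[i];
    for (int i = 0; i < 4; i++)
        dst[i] = (imm8 & (1u << i)) ? sum : 0.0f;
}

/* Model of the 256-bit form: the same operation is applied to each
 * 128-bit lane separately; nothing is summed across the two lanes. */
void vdpps256_model(float dst[8], const float a[8], const float b[8],
                    unsigned imm8)
{
    assert(dst != NULL && a != NULL && b != NULL);
    float lo[4], hi[4];          /* temporaries so dst may alias a or b */
    dpps_lane(lo, a, b, imm8);
    dpps_lane(hi, a + 4, b + 4, imm8);
    for (int i = 0; i < 4; i++) {
        dst[i]     = lo[i];
        dst[i + 4] = hi[i];
    }
}
```

With the $0x77 immediate from the quoted test lines, each lane sums the products of its elements 0 to 2 and writes that sum to elements 0 to 2, leaving element 3 zero. The two lanes never mix, so an 8-float dot product still needs an extra add of the two lane results.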

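I am throwing darts in the dark here:
For f64, a 256-bit form would have meant choosing between keeping the same number of elements per dot product as the 128-bit form (2 doubles, computed separately in each 128-bit half), or changing the f64 behavior so that the 256-bit form operates on all 4 doubles at once. They did not choose either.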
anubis
 
Posts: 29
Joined: Tue Jan 18, 2005 4:33 pm

Re: bulldozer AVX/FMA4 tests - why no f64 ymm dot product ??

Postby Montaray Jack » Thu Mar 08, 2012 12:59 am

See pp. 117-122, BTW:
http://support.amd.com/us/Processor_TechDocs/26568_APM_v4.pdf
DPPD is an SSE4.1 instruction and VDPPD is an AVX instruction. Support for these instructions is
indicated by CPUID Fn0000_0001_ECX[SSE41] and Fn0000_0001_ECX[AVX] (see the CPUID
Specification, order# 25481).
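For anyone who wants to check those two CPUID bits, a minimal sketch in C follows. The bit positions (SSE4.1 is ECX bit 19, AVX is ECX bit 28 of function 0000_0001h) are from the CPUID documentation; the helper names are made up for illustration. `read_cpuid1_ecx` assumes GCC or Clang on x86 and uses `__get_cpuid` from `<cpuid.h>`. Note that the AVX bit alone is not a full usability check: the OS must also have enabled ymm state (OSXSAVE plus XGETBV), which this sketch does not test.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in ECX returned by CPUID function 0000_0001h. */
enum { CPUID1_ECX_SSE41 = 19, CPUID1_ECX_AVX = 28 };

/* Returns 1 if the given bit of reg is set, else 0. */
static int feature_bit(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return (int)((reg >> bit) & 1u);
}

/* Decode the feature flags from an ECX value (hypothetical helpers). */
int ecx_has_sse41(uint32_t ecx) { return feature_bit(ecx, CPUID1_ECX_SSE41); }
int ecx_has_avx(uint32_t ecx)   { return feature_bit(ecx, CPUID1_ECX_AVX); }

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>

/* Reads ECX of CPUID function 1; returns 0 if that function is unsupported. */
uint32_t read_cpuid1_ecx(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx;
}
#endif
```

On an x86 build, `ecx_has_avx(read_cpuid1_ecx())` tells you whether the CPU reports AVX, and therefore the 128-bit vdppd and both forms of vdpps.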
No.6: “The whole earth as. . . `The Village'?”
No.2: “That is my hope. What's yours?”
No.6: “I'd like to be the first man on the moon!”
--Chimes of Big Ben
Montaray Jack
K8 Athlon 64 (Clawhammer) Senior Boarder
 
Posts: 1166
Joined: Sat May 30, 2009 11:29 pm
Location: The Village

Re: bulldozer AVX/FMA4 tests - why no f64 ymm dot product ??

Postby Montaray Jack » Thu Mar 08, 2012 2:17 am

I don't think there is a good answer. We need clarification from both Intel and AMD.
H.J. Lu was confused about the mixing of ymm and xmm registers, and when one of the leads for binutils gets confused, I get worried.
http://www.x86-64.org/pipermail/discuss ... 10700.html
Montaray Jack
K8 Athlon 64 (Clawhammer) Senior Boarder
 
Posts: 1166
Joined: Sat May 30, 2009 11:29 pm
Location: The Village

