Ok, I see. These are not FFT routines themselves, but routines that complex FFTs use to run faster.
Quote from lilboy716:
ftp://download.intel.com/technology...01/art01_microarchitecture/vol8iss1_art01.pdf
Yeah, I am aware of these packages.
http://www.ffte.jp/
Some packages are available to take advantage of it. However, I have no experience with this at all, so I can't help you any further.
The very article (pdf) you gave me above gives the reason!
Why do you need FFT?
The code sequence above shows how to implement a double-precision complex multiplication using SSE2 only or with the new SSE3 instructions, where mem_X contains one complex operand and mem_Y the other; mem_Z is used to store the complex result and xmm7 is a constant used to change the sign of one data element.
Quote from nitro:
The very article (pdf) you gave me above gives the reason!
nitro
Quote from EdgeHunter:
The code sequence above shows how to implement a double-precision complex multiplication using SSE2 only or with the new SSE3 instructions, where mem_X contains one complex operand and mem_Y the other; mem_Z is used to store the complex result and xmm7 is a constant used to change the sign of one data element.
Since the main speed limiter of this code is the number of execution uops (7 for SSE2, 4 for SSE3), the new instructions can improve complex multiplications by up to 75%.
On SPEC CPU2000, the compiler is able to use SSE3 to improve 168.wupwise by 10-15%.
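For anyone without the pdf open, here is a rough sketch in plain C of the data flow behind the SSE3 version of that complex multiply. It is not the article's listing: the `pd2` struct and the helper names are made up for illustration, and each helper only emulates in scalar code what one instruction (MOVDDUP, SHUFPD, MULPD, ADDSUBPD) does to a pair of doubles. The point is that ADDSUBPD subtracts in the low lane and adds in the high lane, so no separate sign-flip constant like xmm7 is needed.

```c
/* Two packed doubles, standing in for one 128-bit XMM register.
   lo holds the real part, hi the imaginary part.
   (Hypothetical type, used only for this illustration.) */
typedef struct { double lo, hi; } pd2;

/* Emulates MOVDDUP: broadcast one double into both lanes. */
static pd2 dup_pd(double v) { pd2 r = { v, v }; return r; }

/* Emulates SHUFPD reg, reg, 1: swap the two lanes. */
static pd2 swap_pd(pd2 x) { pd2 r = { x.hi, x.lo }; return r; }

/* Emulates MULPD: lane-wise multiply. */
static pd2 mul_pd(pd2 a, pd2 b) { pd2 r = { a.lo * b.lo, a.hi * b.hi }; return r; }

/* Emulates ADDSUBPD: subtract in the low lane, add in the high lane. */
static pd2 addsub_pd(pd2 a, pd2 b) { pd2 r = { a.lo - b.lo, a.hi + b.hi }; return r; }

/* z = x * y for complex x = a + bi and y = c + di. */
static pd2 complex_mul(pd2 x, pd2 y)
{
    pd2 t1 = mul_pd(x, dup_pd(y.lo));          /* (a*c, b*c) */
    pd2 t2 = mul_pd(swap_pd(x), dup_pd(y.hi)); /* (b*d, a*d) */
    return addsub_pd(t1, t2);                  /* (a*c - b*d, b*c + a*d) */
}
```

Calling `complex_mul` with x = 1 + 2i and y = 3 + 4i gives -5 + 10i. A real SSE3 build would replace each helper with the matching instruction or compiler intrinsic.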
Nitro...
Is this an 'across the board' improvement in CPU usage for ALL custom indicators that require a lot of math, or am I out in left field about the reasons to implement this?
cj...