Hi,
I've recently been porting a convolution library that uses Apple's vDSP over to IPP, to make it cross-platform.
In general I got it working and the results seem fine, but I'm somewhat stunned that it is significantly slower than vDSP on the same machine (compared over a range of block sizes).
The platform I'm currently working on is OS X El Capitan, with clang as the compiler, on an Intel i7 (Broadwell) processor.
I'm currently using the real-valued in-place FFT. Calling ippInit() or not doesn't seem to change anything.
I was wondering whether I'm making a mistake somewhere, or whether there's anything I've overlooked?
Best,
n