Channel: Intel® Integrated Performance Primitives

ippsResamplePolyphase data type question

Hello,

I was rewriting a Matlab DSP algorithm in C++ and managed to get almost the same result. However, Matlab uses the double data type by default, and thanks to that its result is slightly better. The small difference between the Matlab code and my code comes from the resampling function (I have checked every step, and the difference appears only after resampling). Since ippsResamplePolyphase is available only in 16s and 32f variants, I am forced to convert my double data to float, which loses precision. I wouldn't mind buying the newest IPP, but I see that it also provides only the 32f ippsResamplePolyphase function.

My question is: is there any method that would allow me to run ippsResamplePolyphase on 64f (double) data? (Matlab uses a polyphase implementation too, which is why my result is almost perfect, but I would like to make it perfect.)

What I am using currently:

  •  IPP 7.1 in C++ code:

(link to documentation of IPP 7.1 - https://software.intel.com/sites/products/documentation/doclib/iss/2013/ipp/ipp_manual/IPPS/ipps_ch6/functn_ResamplePolyphase.htm)

 

IppStatus ippsResamplePolyphaseFixed_32f(const Ipp32f* pSrc, int len, Ipp32f* pDst, Ipp32f norm, Ipp64f* pTime, int* pOutlen, const IppsResamplingPolyphaseFixed_32f* pSpec);

  • Matlab R2014a uses this:

(link to documentation - http://www.mathworks.com/help/signal/ref/resample.html)

 

resample - Change the sampling rate of a signal.

Y = resample(X,P,Q) resamples the sequence in vector X at P/Q times
the original sample rate using a polyphase implementation.  Y is P/Q
times the length of X (or the ceiling of this if P/Q is not an integer).  
P and Q must be positive integers.
 
resample applies an anti-aliasing (lowpass) FIR filter to X during the
resampling process, and compensates for the filter's delay.  The filter
is designed using FIRLS.  resample provides an easy-to-use alternative
to UPFIRDN, relieving the user of the need to supply a filter or
compensate for the signal delay introduced by filtering.
 
In its filtering process, resample assumes the samples at times before
and after the given samples in X are equal to zero. Thus large
deviations from zero at the end points of the sequence X can cause
inaccuracies in Y at its end points.
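
For comparison purposes, the polyphase core that the Matlab help text refers to (UPFIRDN) can be written in plain C++ with double precision throughout. The sketch below is not an IPP function and not Matlab's implementation: `upfirdn64` is a hypothetical helper name, the caller has to supply the FIR taps `h`, and no filter design or delay compensation is done. It only shows that the upsample-filter-downsample step itself is easy to keep in 64-bit floats.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an IPP or Matlab API): upsample x by p, filter
// with the FIR taps h, then downsample by q, all in double precision.
//   y[m] = sum over k of h[m*q - k*p] * x[k]
std::vector<double> upfirdn64(const std::vector<double>& x,
                              const std::vector<double>& h,
                              int p, int q)
{
    assert(p > 0 && q > 0 && !h.empty());
    if (x.empty()) return {};

    const long long xLen = static_cast<long long>(x.size());
    const long long hLen = static_cast<long long>(h.size());

    // Length of the zero-stuffed, filtered signal before decimation.
    const long long fullLen = (xLen - 1) * p + hLen;
    const long long outLen  = (fullLen + q - 1) / q;   // ceil(fullLen / q)

    std::vector<double> y(static_cast<std::size_t>(outLen), 0.0);
    for (long long m = 0; m < outLen; ++m) {
        const long long n = m * q;   // position in the upsampled domain
        long long k   = n / p;       // newest input sample that can contribute
        long long tap = n - k * p;   // polyphase branch: only every p-th tap is used
        double acc = 0.0;
        for (; k >= 0 && tap < hLen; --k, tap += p) {
            if (k < xLen) acc += h[static_cast<std::size_t>(tap)] * x[static_cast<std::size_t>(k)];
        }
        y[static_cast<std::size_t>(m)] = acc;
    }
    return y;
}
```

With `h = {1, 1}`, `p = 2` and `q = 1` this repeats every input sample twice, which is a quick sanity check before plugging in a real anti-aliasing filter.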

 


API for external threading

Hello IPP Team 

In the new release notes you write:

Intel IPP 8.2 provides a lot of new APIs for one-dimensional (signal processing) and two-dimensional (image processing) functions that support external threading (processing by independent tiles). Several Intel IPP functions combined into a single pipeline that is threaded externally at the application level are significantly more efficient from the performance and power consumption point of view than the sequential call of internally threaded variant of the same functions.

Do these APIs only include the described TBB external multithreading with image tiles (for image processing)? Or is this feature something more advanced? Can you please say a couple of words about these APIs? What options do we have now for external threading?

 

Best
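
To make "processing by independent tiles" concrete, here is a minimal sketch of the application-side part: splitting an image into horizontal bands that can each be handed to a separate thread or task. None of this is IPP API. `RowBand`, `splitIntoBands` and `invertBand` are hypothetical names, and `invertBand` merely stands in for whatever per-tile primitive is called. It is a point-wise operation, so bands need no border handling; neighbourhood filters would additionally need border rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one horizontal tile of an image.
struct RowBand {
    int firstRow;
    int rowCount;
};

// Split `height` rows into at most `bandCount` bands of near-equal size.
std::vector<RowBand> splitIntoBands(int height, int bandCount)
{
    assert(height >= 0 && bandCount > 0);
    std::vector<RowBand> bands;
    const int base  = height / bandCount;
    const int extra = height % bandCount;   // the first `extra` bands get one more row
    int row = 0;
    for (int i = 0; i < bandCount; ++i) {
        const int rows = base + (i < extra ? 1 : 0);
        if (rows == 0) continue;             // more bands requested than rows
        bands.push_back({row, rows});
        row += rows;
    }
    return bands;
}

// Stand-in for a per-tile primitive call: invert the pixels of one band.
// Each band touches only its own rows, so bands can run concurrently.
void invertBand(std::uint8_t* image, int stepBytes, int width, const RowBand& band)
{
    for (int y = band.firstRow; y < band.firstRow + band.rowCount; ++y) {
        std::uint8_t* row = image + static_cast<std::ptrdiff_t>(y) * stepBytes;
        for (int x = 0; x < width; ++x) {
            row[x] = static_cast<std::uint8_t>(255 - row[x]);
        }
    }
}
```

In an externally threaded pipeline, the loop over the bands would be the part given to TBB, OpenMP or plain threads, with all functions of the pipeline applied to one band before moving on.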

Intel® IPP 9.0 Beta is available

Intel® IPP 9.0 Beta is now available. The release adds new Intel® AVX-512 optimizations for the computer vision and image processing functions, extends optimization for Intel® Atom™ and Intel® Quark™ processors, adds new APIs to support external threading, and provides a custom dynamic library building tool, which lets users build a dynamic library containing only selected Intel® IPP functions.

We also provide some options for the deprecated IPP functions. Please find more information in the Intel IPP 9.0 Beta release notes.

The Intel IPP 9.0 Beta release is available now as part of the Intel Parallel Studio XE 2016 Beta. To sign up for the Intel Parallel Studio XE 2016 Beta, visit this page.

Your feedback and questions are welcome during your evaluation.

ippsThreshold_GT_64f fails for NAN values in 64-bit architecture

Hi,

My software started crashing, and I eventually traced the problem to ippsThreshold_GT_64f: unlike a normal comparison or ippsThreshold_GT_32f, it does not handle NaNs. It just leaves them as NaN.
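
One possible workaround, independent of what any particular IPP version does, is a small reference routine that treats NaN explicitly. Under IEEE 754 every ordered comparison with NaN is false, so a test of the form `x > level` alone never replaces a NaN. The sketch below is not the IPP implementation: `thresholdGtNanAware` is a hypothetical name, and clamping NaN to the threshold level is an assumed policy that may not suit every application.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical reference routine (not IPP): in-place "greater than" threshold
// that also replaces NaN by the threshold level.
void thresholdGtNanAware(double* data, std::size_t len, double level)
{
    assert(data != nullptr || len == 0);
    for (std::size_t i = 0; i < len; ++i) {
        // (data[i] > level) is false for NaN, so NaN needs its own test.
        if (std::isnan(data[i]) || data[i] > level) {
            data[i] = level;
        }
    }
}
```

Running such a pass once over the input (or only over data that can contain NaN) keeps NaNs out of the later stages, at the cost of one extra sweep over the array.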

Problems with IPP dispatcher in C++ project

We've found that pure C programs linked with the standard IPP library names (e.g. ipps) can be built on a machine with one hardware profile and then run successfully on another.

However, binaries compiled with g++ only work on machines with the same processor type as the machine they were built on. On other machines they start, but then just hang with 100% CPU usage.

If, instead, we explicitly specify the IPP library for the target architecture (e.g. ippsvmx) when building with g++, the binary runs successfully on the target machine. Of course, in this case, it has problems on any other architecture.

Is this a known issue? Is there some way around it, or is IPP simply not portable for C++ programs?

LZO functions license

I have a question about the LZO compression functions in IPP.
According to the IPP reference manual, the IPP LZO functions are based on the original library code from http://www.oberhumer.com/ .
I obtained the LZO code (version 2.09, open-source version) and read its documentation.

I found the following sentences in the description of LZO:

> The LZO algorithms and implementations are copyrighted OpenSource
> distributed under the GNU General Public License.

If I use the LZO functions of IPP in my product, it is not clear to me whether I must publish my source code.
The text says that the GPL applies to the "LZO algorithms".

My understanding is that even if I use the LZO feature in IPP, I do not need to publish my source code or library.

Is that right?

Apply Filter on Image Using "In Place"

Hello,

I know that, as a policy, you removed all the "In Place" functions from Intel IPP.
I'm talking about the Image Processing domain functions.

I was wondering, though: what would happen if I pass the same pointer as both source and destination?
Should the current functions work? Which ones might fail?

I'm asking specifically about the Column / Row Filter.
Should it work?
Should others work?
Could you publish a list of functions that should perform correctly in this case?

Thank You.
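
As background to why neighbourhood filters are risky with aliased pointers, the following plain C++ sketch (no IPP calls; `rowSum3` and the two wrappers are hypothetical names) applies a 3-tap row sum once with separate buffers and once with source and destination pointing at the same memory. In the aliased case the left neighbour has already been overwritten when it is read, so the output differs. How a given IPP function behaves depends on its internal buffering, which this sketch does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive 3-tap row sum with zeros assumed outside the row.
// Reads src[i-1], src[i], src[i+1] and writes dst[i], left to right.
void rowSum3(const int* src, int* dst, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i) {
        const int left  = (i > 0)       ? src[i - 1] : 0;
        const int right = (i + 1 < len) ? src[i + 1] : 0;
        dst[i] = left + src[i] + right;
    }
}

// Separate source and destination buffers: every read sees original data.
std::vector<int> rowSum3OutOfPlace(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    rowSum3(in.data(), out.data(), in.size());
    return out;
}

// Same pointer for source and destination: src[i-1] was already overwritten.
std::vector<int> rowSum3InPlace(std::vector<int> data)
{
    rowSum3(data.data(), data.data(), data.size());
    return data;
}
```

For the input `{0, 3, 0, 0}` the out-of-place call yields `{3, 3, 3, 0}`, while the aliased call yields `{3, 6, 6, 6}`.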

Canny Edge Detector

Hi, I have some problems with the Canny edge detector in IPP.

I process the image in three steps:

I have a 3-channel image (byte array) that is ordered as RGBRGB...

The first step applies a grayscale filter.

The second step calculates the gradients dx and dy using the IPP vertical and horizontal Sobel filters with border.

The last step runs ippiCanny.

Code:

unsigned char * RGBToGrayScaleIpp(unsigned char * src, int width, int height, int channels)
{
    IppiSize ROI = {width, height};
    Ipp8u *GrayScaleImg = new Ipp8u[width * height];
    ippiRGBToGray_8u_C3C1R(src, width * channels * sizeof(Ipp8u), GrayScaleImg, width * sizeof(Ipp8u), ROI);
    return GrayScaleImg;
}

extern "C" __declspec(dllexport) unsigned char* __stdcall Canny(unsigned char * img, int channels, int width, int height)
{

    Ipp8u *GrayScaleImg = RGBToGrayScaleIpp(img, width, height, channels);

    IppiSize roiSize = {width - 1, height - 1};

    int horizBufferSize, vertBufferSize;
    
    IppiMaskSize maskSize = ippMskSize3x3;
    ippiFilterSobelVertGetBufferSize_8u16s_C1R(roiSize, maskSize, &vertBufferSize);
    ippiFilterSobelHorizGetBufferSize_8u16s_C1R(roiSize, maskSize, &horizBufferSize);

    Ipp8u *horizBuffer = ippsMalloc_8u(horizBufferSize);
    Ipp8u *vertBuffer = ippsMalloc_8u(vertBufferSize);

    Ipp16s *dx = new Ipp16s[width * height];
    Ipp16s *dy = new Ipp16s[width * height];

    ippiFilterSobelVertBorder_8u16s_C1R(GrayScaleImg, width * sizeof(Ipp8u), dx, (width - 1) * sizeof(Ipp16s), roiSize, maskSize, ippBorderRepl, 0, vertBuffer);
    ippiFilterSobelHorizBorder_8u16s_C1R(GrayScaleImg, width * sizeof(Ipp8u), dy, (width - 1) * sizeof(Ipp16s), roiSize, maskSize, ippBorderRepl, 0, horizBuffer);

    Ipp8u *buffer;
    if (vertBufferSize < horizBufferSize)
    {
        ippiCannyGetSize(roiSize, &horizBufferSize);
        buffer = ippsMalloc_8u(horizBufferSize);
    }
    else
    {
        ippiCannyGetSize(roiSize, &vertBufferSize);
        buffer = ippsMalloc_8u(vertBufferSize);
    }

    
    Ipp32f low=100.0f, high=100.0f;
    Ipp8u* dst = new Ipp8u[width * height];
    ippiCanny_16s8u_C1R(dx, (width - 1) * sizeof(Ipp16s), dy, (width - 1) * sizeof(Ipp16s), dst, width * sizeof(Ipp8u), roiSize, low, high, buffer);

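    // Release all temporary buffers; only dst is handed back to the caller.
    ippsFree(buffer);
    ippsFree(horizBuffer);
    ippsFree(vertBuffer);
    delete[] dx;
    delete[] dy;
    delete[] GrayScaleImg;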

    return dst;

}

I get an incorrect result. The attached files contain three images: the source, the result filtered with .NET AForge, and the result filtered with IPP. Can you see any mistakes in my code? Please help.

 


Attachment    Size
aforge.jpg    28.31 KB
ipp.jpg       12.95 KB
source.jpg    4.74 KB

error LNK1104: libmmt.lib

I received *.h files and a *.lib file from another developer.
I have built a C++/CLI project, and I want to create a C# wrapper for two functions.

I'm getting this error:

    error LNK1104: cannot open file 'libmmt.lib'

I understand that he is using the Intel Compiler, and I know he uses IPP functions.
From here:
https://software.intel.com/en-us/articles/libraries-provided-by-intelr-c...

I understand that he also uses the multi-threaded static library (/MT) version of the math library.

I have Intel Parallel Studio installed. Can you please help me understand how to configure my C++/CLI project?

I have added
$(ICPP_COMPILER15)compiler\lib\intel64 to Additional Library Directories under Linker -> General, and now I get many more errors:

    error LNK2019: unresolved external symbol ippsSum_32f referenced in function ia_cp_robustMest

    error LNK2019: unresolved external symbol ippsSubC_32f referenced in function ia_cp_estimaSigmaRob

and so on...

 

Remap function, OpenCV, speed up

Hello,

I am using the OpenCV remap function.

But I need more speed in my application.

Does OpenCV with IPP support remap acceleration?

I can't see the remap function listed here: http://opencv.org/opencv-3-0-alpha.html

If IPP has a fast remap function like OpenCV's remap(src_mat, dst_mat, map_x, map_y), where can I find examples and try it?

Best regards, Viktor.

missing function overloads in 9.0 Beta?

I am building a project that worked fine with IPP 8.0, but compilation fails with 9.0 Beta

1>d:\apama\src\libs\utils\apfilter.cpp(622): error C3861: 'ippiFilterRow_32f_C3R': identifier not found
1>d:\apama\src\libs\utils\apfilter.cpp(655): error C3861: 'ippiFilterColumn_32f_C3R': identifier not found
1>d:\apama\src\libs\utils\apfilter.cpp(731): error C3861: 'ippiFilter_32f_C1R': identifier not found
1>d:\apama\src\libs\utils\apfilter.cpp(1098): error C3861: 'ippiWarpAffine_32f_C1R': identifier not found
1>d:\apama\src\libs\utils\apfilter.cpp(1338): error C3861: 'ippiWarpAffine_32f_C1R': identifier not found
1>d:\apama\src\libs\utils\apfilter.cpp(1375): error C3861: 'ippiRotateCenter_8u_C1R': identifier not found
1>d:\apama\src\libs\utils\apfilter.cpp(1426): error C3861: 'ippiRotateCenter_32f_C1R': identifier not found

When I look in ippi.h, these functions are missing - but they are there in 8.0.

For example, in 8.0 there were 15 overloads of ippiFilterRow_xxx_xxx; now there is only one:

IPPAPI( IppStatus, ippiFilterRow_64f_C1R, ( const Ipp64f* pSrc, int srcStep,
        Ipp64f* pDst, int dstStep, IppiSize dstRoiSize, const Ipp64f* pKernel,
        int kernelSize, int xAnchor ))

Were they deprecated?

 

 

 

Problem with FilterBilateralBorder

Hi,

I tried to use ippiFilterBilateralBorder_32f_C1R and got incorrect results.
Then I ran the example from the manual for ippiFilterBilateralBorder_8u_C1R (copy-pasted) and got different, incorrect results there too (below).

 1   2   3 123 123 125  54  54
 3   4   5 128 130 130  61  62
 4   5   6 131 132 133  64  65
 6   6   6 132 133 133  66  66
 7   7   6 133 133 132  66  66
 7   6   5 133 132 131  66  65
 6   5   4 132 131 130  65  64

I am using IPP 8.1.0 on 64-bit Windows 7 Pro SP1 with MS Visual Studio Pro 2013.

Integrated Performance Primitives samples

Hi,

 

I've noticed that the speech-codec samples are no longer available with the latest version of the IPP libraries.

Are these samples still available? Or are they no longer supported with the 8.0+ versions?

Is there a Matrix Set?

I'm looking at converting from a different library to the Intel ones, so I'm looking for function analogues.

There are vector set functions (ippsSet_???), but I was wondering whether there is an analogue for matrices, preferably for complex data.
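
For a dense row-major matrix the question largely reduces to the vector case: if the rows are stored back to back, the matrix is one vector of rows * cols elements, and if there is padding between rows, one vector set per row does the job. The plain C++ sketch below shows that logic for complex float data. `setMatrix` is a hypothetical helper, not an IPP function, and `std::fill_n` is only a placeholder for whichever vector set routine is actually used.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>

// Hypothetical helper (not IPP): set every element of a rows x cols
// row-major matrix to `value`. `stride` is the distance between the
// starts of consecutive rows, counted in elements (stride >= cols).
void setMatrix(std::complex<float>* m,
               std::size_t rows, std::size_t cols, std::size_t stride,
               std::complex<float> value)
{
    assert(stride >= cols);
    if (stride == cols) {
        // Contiguous storage: the whole matrix is a single vector.
        std::fill_n(m, rows * cols, value);
        return;
    }
    // Padded rows: one vector set per row, leaving the padding untouched.
    for (std::size_t r = 0; r < rows; ++r) {
        std::fill_n(m + r * stride, cols, value);
    }
}
```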


ippsDotProd_32f Performance on Haswell CPU

Hi,

At the moment I'm using ippsDotProd_32f from IPP 7.0 quite extensively in one of my projects. I have now tested IPP 8.2 on a Haswell CPU (Xeon E5-2650 v3 in an HP Z640 workstation) with this project because I expected it to be significantly faster (see below). In fact, the code was about 10% slower with IPP 8.2, which I found quite disturbing.

I created a test program (see below) to verify this and found that ippsDotProd_32f (as well as some other functions) seems to be slower in IPP 8.2 than in IPP 7.0 when it is called many times on rather small arrays of about 100 entries. For larger arrays the speed seems to be equal.

Unfortunately, this is exactly what I have to do in my project. Two questions arise:

1. What can I do to make my code run at least as fast as with IPP 7.0, even if I use IPP 8.2?

2. Why is ippsDotProd_32f not significantly faster on a Haswell CPU? My assumptions are based on this article (section 3.1):

https://software.intel.com/en-us/articles/intel-xeon-processor-e5-2600-v...

It states that Haswell CPUs have two FMA units and should therefore be much faster at calculating dot products. Furthermore, https://software.intel.com/en-us/articles/haswell-support-in-intel-ipp states that ippsDotProd_32f should actually profit from this, at least in IPP versions later than 7.0.

I would be very thankful for assistance here. Perhaps I have misunderstood something? Here is my test code. It was compiled with Visual Studio 2012 on a non-Haswell computer, but the tests were run on the Haswell system mentioned above:

 

#include "stdafx.h"
#include "windows.h"
#include "ipp.h"
#include "ipps.h"
#include "ippcore.h"



int main(int argc, _TCHAR* argv[])
{

	IppStatus IPP_Init_status;
	IPP_Init_status=ippInit();
	printf("%s\n", ippGetStatusString(IPP_Init_status) );
	const IppLibraryVersion *lib;
	lib = ippsGetLibVersion();
	printf("%s %s\n", lib->Name, lib->Version);
	//ippSetNumThreads(1);

	//generate two vectors
	float* vec1;
	float* vec2;
	vec1=new float[1000]();
	vec2=new float[1000]();

	//fill vectors with values
	for (int i=0;i<1000;i++){
		vec1[i]=(float)i;
		vec2[i]=(float)(1000-i);
	}


	//result variable
	float dotprod_result=0.f;


	//start timing
	int dotprod_time=0;
	LARGE_INTEGER StartingTime, EndingTime, ElapsedMicroseconds;
	LARGE_INTEGER Frequency;
	QueryPerformanceFrequency(&Frequency);
	QueryPerformanceCounter(&StartingTime);


	//run ippsDotProd
	for (int i=0; i<500000000; i++){
		//ippsSum_32f(vec1,1000, &dotprod_result,ippAlgHintFast);
		ippsDotProd_32f(vec1, vec1, 100, &dotprod_result);
	}


	//stop timing
	QueryPerformanceCounter(&EndingTime);
	ElapsedMicroseconds.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;
	ElapsedMicroseconds.QuadPart *= 1000000;
	ElapsedMicroseconds.QuadPart /= Frequency.QuadPart;
	dotprod_time=(int)(ElapsedMicroseconds.QuadPart/1000);

	printf("Total time [ms]:  %d\n", dotprod_time);



	delete[] vec1;
	delete[] vec2;

	return 0;
}

 

The result for IPP 7.0:

ippStsNoErr: No errors, it's OK.
ippse9-7.0.dll 7.0 build 205.105
Total time [ms]:  7558

 

The result for IPP 8.2:

ippStsNoErr: No errors.
ippSP AVX2 (l9) 8.2.1 (r44077)
Total time [ms]:  8141
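
When vectors are as short as 100 entries, the fixed cost of each call may account for a noticeable share of the runtime, so it can help to time a plain scalar loop with the same harness as a baseline. The sketch below is portable C++ using `std::chrono` instead of QueryPerformanceCounter. `dotProdScalar` and `averageNanosPerCall` are hypothetical names, no IPP call is made, and no timings are claimed here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Scalar baseline: straightforward single-precision dot product.
float dotProdScalar(const float* a, const float* b, std::size_t len)
{
    float acc = 0.0f;
    for (std::size_t i = 0; i < len; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// Calls the baseline `iterations` times and returns the average time per
// call in nanoseconds. The last result is written to *sink through a
// volatile so the compiler cannot drop the calls.
double averageNanosPerCall(const float* a, const float* b, std::size_t len,
                           long iterations, float* sink)
{
    assert(iterations > 0 && sink != nullptr);
    volatile float guard = 0.0f;
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        guard = dotProdScalar(a, b, len);
    }
    const auto stop = std::chrono::steady_clock::now();
    *sink = guard;
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Replacing the call inside the loop with the library routine under test, and repeating the measurement for several vector lengths, shows at which length the library call starts to pay off on a given machine.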

 

 

 

 

h264 library license

Dear all,

I would like to use "h264_dec_filter.dll" and "h264_enc_filter.dll" (which can be registered with regsvr32) to create a DirectShow project for a commercial application. I do not know whether they are free to use. If I have to pay a license fee to use them, how much is it?

Thanks and Best Regards.

ChuyenLuong

FilterLaplace generates different results when running the 32-Bit vs. 64-Bit

We have observed different results when running the FilterLaplace function between the 32-bit and the 64-bit version of the IPP libraries.
Specifically we are calling: ippiFilterLaplace_8u_AC4R with both 3X3 and 5X5 kernels.
The binary results differ between the 32 and 64 bit libraries.  Is this expected?  We are currently running IPP 7.0.

Many thanks,

Brian

Software Evaluation question

I would like to evaluate the speech coding samples of the Intel IPP libraries.

I have downloaded the 30-day evaluation of Parallel Studio XE Composer Edition for Fortran and C++ Linux*. Is this the correct package?

I understand that the speech coding sample is deprecated, however I am confused as there is an entry called Speech Coding Functions (https://software.intel.com/en-us/node/502412) in the  Reference Manual for Intel® Integrated Performance Primitives 8.2 Update 2.

It even gives a link to the  Intel IPP Speech Coding Samples:

The use of the Intel IPP speech coding functions is demonstrated in Intel IPP Samples. See Intel IPP Speech Coding Samples downloadable from http://www.intel.com/cd/software/products/asmo-na/eng/220046.htm.

However, this link does not seem to work.

 

Is it possible to try the speech coding samples  with an evaluation copy of the Intel® Integrated Performance Primitives?

 

Thank you,

Louis.

Converting from pixel order YCbCr411 to BGR

Hi all,

I'm new to the IPP library, and I'm having trouble converting pixel-order YCbCr411 data to BGR format. From reading the documentation, I understand that I need to convert the pixel-order data to planar format before calling the library function ippiYCbCr411ToBGR_8u_P3C4R, which converts planar YCbCr411 data to BGR. I see that I can probably use ippiCopy_8u_C3P3R to convert my image buffer from pixel order to planar order.

Is the following the correct approach?

int pDstStep[3];
Ipp8u* pFirst = ippiMalloc_8u_C1(srcCols, srcRows, &pDstStep[0]);
Ipp8u* pSecond = ippiMalloc_8u_C1(srcCols/4, srcRows, &pDstStep[1]);
Ipp8u* pThird = ippiMalloc_8u_C1(srcCols/4, srcRows, &pDstStep[2]);

Ipp8u* pDst[3] = { pFirst, pSecond, pThird };

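// Split the pixel-order buffer into three planes.
// NOTE: ippiCopy_8u_C3P3R takes a single dstStep (in bytes) shared by all three planes.
ippiCopy_8u_C3P3R((Ipp8u*)pImageBuffer,
            pImageBufferStride,
            pDst,
            srcCols,
            imgSize);

// Convert from 3 plane YCbCr411 source image to BGR
ippiYCbCr411ToBGR_8u_P3C4R((const unsigned char**)pDst,
            pDstStep,
            pDestBuffer,
            destStride,
            imgSize,
            0);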

Thanks!
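
One point that may matter here: in 4:1:1 sampling, four horizontally adjacent pixels share one Cb and one Cr sample, so a pixel-order buffer carries 6 bytes per 4 pixels rather than 3 bytes per pixel, and a per-pixel three-channel deinterleave may not match that layout. The plain C++ sketch below unpacks such a buffer into three planes by hand. The byte order Cb Y0 Y1 Cr Y2 Y3 is an assumption (check the camera or codec documentation for the real order), and `Planes411` and `unpack411` are hypothetical names, not IPP API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Tightly packed planes: y is width x height, cb and cr are (width/4) x height.
struct Planes411 {
    std::vector<std::uint8_t> y, cb, cr;
};

// ASSUMPTION: every group of 4 pixels is stored as Cb Y0 Y1 Cr Y2 Y3
// (6 bytes) and width is a multiple of 4. Other byte orders exist.
Planes411 unpack411(const std::uint8_t* packed, int srcStepBytes,
                    int width, int height)
{
    assert(packed != nullptr && width > 0 && height > 0 && width % 4 == 0);
    const int groups = width / 4;
    assert(srcStepBytes >= groups * 6);

    Planes411 p;
    p.y.resize(static_cast<std::size_t>(width) * height);
    p.cb.resize(static_cast<std::size_t>(groups) * height);
    p.cr.resize(static_cast<std::size_t>(groups) * height);

    for (int row = 0; row < height; ++row) {
        const std::uint8_t* src = packed + static_cast<std::ptrdiff_t>(row) * srcStepBytes;
        std::uint8_t* y  = p.y.data()  + static_cast<std::size_t>(row) * width;
        std::uint8_t* cb = p.cb.data() + static_cast<std::size_t>(row) * groups;
        std::uint8_t* cr = p.cr.data() + static_cast<std::size_t>(row) * groups;
        for (int g = 0; g < groups; ++g) {
            const std::uint8_t* grp = src + g * 6;
            cb[g]        = grp[0];
            y[4 * g]     = grp[1];
            y[4 * g + 1] = grp[2];
            cr[g]        = grp[3];
            y[4 * g + 2] = grp[4];
            y[4 * g + 3] = grp[5];
        }
    }
    return p;
}
```

The resulting planes have steps of width, width/4 and width/4 bytes, which is the kind of three-plane input with per-plane steps that the planar conversion call in the question takes.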
