Integral images (also called summed-area tables) are a fast way to compute the sum of a rectangular region of an image. The main advantage of this algorithm is that once the integral image is computed, we can evaluate the sum of any rectangular region in constant time (the size of the region does not affect the time to compute it), because we only have to add and subtract four values of the integral image. This allows us to create fast filters to blur the image, find edges or detect many other features. Averaging over a rectangle in this way is equivalent to a convolution with a uniform rectangular kernel, and it can be used as a rough and fast approximation to a Gaussian blur. The first application of this idea in computer vision was in the Viola-Jones (2004) face detector.
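As a rough illustration of the idea, here is a minimal sketch in plain Python. The function names (integral_image, rect_sum) and the extra row and column of zeros are my own choices for the example, not taken from any particular library. The table is built in one pass over the image, and afterwards any rectangle sum needs only four lookups.

```python
def integral_image(img):
    """Return a summed-area table padded with a leading row and column of zeros.

    ii[y][x] holds the sum of all pixels above and to the left of (x, y),
    so the padding removes the need for special cases at the borders.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
print(rect_sum(ii, 0, 0, 3, 3))  # whole image = 45
```

Dividing the result of rect_sum by the area of the rectangle gives the mean, which is what the Average filter below uses.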
Usage:
Using the Mean statistical measure, we get different effects depending on the shape of the filter:
The Average is a rectangle that blurs the image: texture and noise disappear, but so do the details of the image. This is similar to the Gaussian blur that you can find in Photoshop or GIMP.
The Center-Surround is an approximation to the Difference of Gaussians (DoG), which is a useful operation to find edges and binarize the image.
If you use a small surround and a pixel for the center, you find the edges:
Binarization of the image (converting it to black and white) is an important step prior to blob analysis or shape description (see my Master's Thesis). Usually one can use a threshold to decide if a pixel is white or black, but this threshold is very sensitive to lighting. In the following example you would need one threshold for the center and a different one for the corners, because the letters in the center are lighter than the background in the corners:
To overcome this, we can “subtract” the surround (using an on-center off-surround filter) and then we can binarize without problems:
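The following is only a sketch of how such a filter might be written, under several assumptions of mine: the function names are hypothetical, the surround is a plain box that also includes the center pixel (rather than a true ring), the boxes are clipped at the image borders, and the offset of 10 is an arbitrary value chosen for this toy image. The test row has a brightness ramp with two bright marks (50 and 90); no single global threshold separates them from the background, which also reaches 50, 70 and 80.

```python
def integral_image(img):
    """Summed-area table padded with a leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_mean(ii, cx, cy, r, w, h):
    """Mean of the square of radius r centred on (cx, cy), clipped at the borders."""
    x0, y0 = max(cx - r, 0), max(cy - r, 0)
    x1, y1 = min(cx + r + 1, w), min(cy + r + 1, h)
    total = ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]
    return total / ((x1 - x0) * (y1 - y0))


def center_surround_binarize(img, center_r=0, surround_r=1, offset=10):
    """Mark a pixel as 1 when its center mean exceeds the surround mean by more than offset."""
    h, w = len(img), len(img[0])
    ii = integral_image(img)
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            center = box_mean(ii, x, y, center_r, w, h)
            surround = box_mean(ii, x, y, surround_r, w, h)
            out_row.append(1 if center - surround > offset else 0)
        out.append(out_row)
    return out


# Brightness ramp (0, 10, 20, ...) with bright marks at x = 2 and x = 6.
row = [0, 10, 50, 30, 40, 50, 90, 70, 80]
img = [row[:] for _ in range(3)]
result = center_surround_binarize(img)
print(result[1])  # [0, 0, 1, 0, 0, 0, 1, 0, 0]
```

Because each pixel is compared with its own neighborhood, the slow change in lighting cancels out and only the two marks survive.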
The neurons in our retina perform a similar operation called lateral inhibition (a neuron's response is inhibited by the responses of its neighbors). You can experience this effect in several optical illusions. Hermann grid illusion: if you look at the grid you can see dark spots at the crossings of the white lines (even though they are not there). Mach bands: you can see darker and lighter lines at the edges between the vertical bands (called overshoot), even though each band is flat:
Both effects can be simulated with the Center-Surround filter:
Hybrid images are another optical illusion that can be explained by lateral inhibition. These images change depending on the viewing distance. Using center-surround filters of different sizes we get the same effect:
Other filter shapes are useful to find edges or lines in the image:
We can use the Standard deviation or StDev instead of the Mean. This statistical descriptor is useful to assess how much texture a region of the image has:
Comments
Really nice article. Congratulations.
Thanks for the informative article. I'm currently working on human detection and tracking. I was hoping you could help me.
Hi Jabran,
If you are interested in the Viola-Jones face detector mentioned above, OpenCV has a good implementation that you can try. The function is cvHaarDetectObjects and you can find an example here http://opencv.willowgarage.com/wiki/FaceDetection
I tested it some time ago and it took more than a second to detect the faces (depending on the resolution and other parameters), and it only detected faces that looked directly at the camera. It has problems if the face is tilted, is smiling too much, has a beard or glasses and so on.
I would recommend that you start with general motion detection and tracking, and then decide if it is a person or not by looking at the size of the object and the speed of motion. The book “Learning OpenCV” is the one that I would recommend.
Great explanation you have there. Basically, I understand how integral image works. But do you know how to compute the mean, std dev etc? Hope to hear from you soon.
Hi Mizuki,
The algorithm explained on Wikipedia (http://en.wikipedia.org/wiki/Summed_area_table#The_algorithm) gives you the sum over a rectangle. If you need the mean (E[i]), you just have to divide by the area of the rectangle (width multiplied by height).
The standard deviation (stdev) is slightly more complicated. First of all, you need another integral image, but instead of using the intensity of the pixel, you use the intensity of the pixel squared. If you compute the mean using this integral image you will get the expected value of the squared intensity (E[i^2]). Then you compute the standard deviation using the mean computed before (E[i]) and the new value (E[i^2]): sqrt(E[i^2] - E[i]^2)
See http://en.wikipedia.org/wiki/Standard_deviation#Definition_of_population_values
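The steps above could be sketched roughly like this in plain Python. The helper names (integral_image, rect_sum, region_stats) are made up for the example, and clamping the variance at zero is my own precaution against small negative values from floating-point rounding. Both integral images are built once; every region query afterwards costs the same regardless of its size.

```python
import math


def integral_image(img):
    """Summed-area table padded with a leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def region_stats(ii, ii_sq, x, y, w, h):
    """Return (mean, stdev) of a rectangle from the two integral images."""
    area = w * h
    mean = rect_sum(ii, x, y, w, h) / area        # E[i]
    mean_sq = rect_sum(ii_sq, x, y, w, h) / area  # E[i^2]
    variance = max(mean_sq - mean * mean, 0.0)    # clamp rounding noise
    return mean, math.sqrt(variance)


img = [[1, 3, 5, 5],
       [1, 3, 5, 5]]
ii = integral_image(img)
ii_sq = integral_image([[p * p for p in row] for row in img])

print(region_stats(ii, ii_sq, 0, 0, 2, 2))  # textured region: (2.0, 1.0)
print(region_stats(ii, ii_sq, 2, 0, 2, 2))  # flat region: (5.0, 0.0)
```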
Similarly, using E[i^3] and E[i^4] you could compute the skewness and the kurtosis, but the integral image could easily overflow even for small images.
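To get a feeling for the numbers, here is a small worst-case calculation. It assumes 8-bit pixels (maximum intensity 255) and an unsigned 32-bit accumulator; both are assumptions for the sake of the example, and max_pixels is a hypothetical helper name. It counts how many fully white pixels can be summed before the accumulator overflows.

```python
def max_pixels(power, bits=32, max_intensity=255):
    """Number of saturated pixels whose summed i**power still fits in an unsigned accumulator."""
    return (2 ** bits - 1) // (max_intensity ** power)


for power in (1, 2, 3, 4):
    print(power, max_pixels(power))
# 1 16843009
# 2 66051
# 3 259
# 4 1
```

Under these assumptions, the squared image is only safe up to about 257 x 257 white pixels, the cubed image up to about 16 x 16, and the fourth power overflows after a single white pixel, so a wider accumulator (for example 64-bit integers or floating point) would be needed.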
I hope it helps.