Log Gabor filter

In signal processing it is useful to simultaneously analyze the space and frequency characteristics of a signal. While the Fourier transform gives the frequency information of the signal, it is not localized. This means that we cannot determine which part of a (perhaps long) signal produced a particular frequency. It is possible to use a short time Fourier transform for this purpose, however the short time Fourier transform limits the basis functions to be sinusoidal. To provide a more flexible space-frequency signal decomposition several filters (including wavelets) have been proposed. The Log-Gabor[1] filter is one such filter that is an improvement upon the original Gabor filter.[2] The advantage of this filter over the many alternatives is that it better fits the statistics of natural images compared with Gabor filters and other wavelet filters.

Applications

The Log-Gabor filter is able to describe a signal in terms of the local frequency responses. Because this is a fundamental signal analysis technique, it has many applications in signal processing. Indeed, any application that uses Gabor filters, or other wavelet basis functions may benefit from the Log-Gabor filter. However, there may not be any benefit depending on the particulars of the design problem. Nevertheless, the Log-Gabor filter has been shown to be particularly useful in image processing applications, because it has been shown to better capture the statistics of natural images.

In image processing, there are a few low-level examples of the use of Log-Gabor filters. Edge detection is one such primitive operation, where the edges of the image are labeled. Because edges appear in the frequency domain as high frequencies, it is natural to use a filter such as the Log-Gabor to pick out these edges.[3][4] These detected edges can be used as the input to a segmentation algorithm or a recognition algorithm. A related problem is corner detection. In corner detection the goal is to find points in the image that are corners. Corners are useful to find because they represent stable locations that can be used for image matching problems. The corner can be described in terms of localized frequency information by using a Log-Gabor filter.[5]

In pattern recognition, the input image must be transformed into a feature representation that is easier for a classification algorithm to separate classes. Features formed from the response of Log-Gabor filters may form a good set of features for some applications because it can locally represent frequency information. For example, the filter has been successfully used in face expression classification.[6] There is some evidence that the human visual system processes visual information in a similar way.[7]

There are a host of other applications that require localized frequency information. The Log-Gabor filter has been used in applications such as image enhancement,[8] speech analysis,[9] contour detection,[10] texture synthesis [11] and image denoising [12] among others.

Existing Approaches

There are several existing approaches for computing localized frequency information. These approaches are advantageous because unlike the Fourier transform, these filters can more easily represent discontinuities in the signal. For example, the Fourier transform can represent an edge, but only by using an infinite number of sine waves.

Gabor filters

When considering filters that extract local frequency information, there is a relationship between the frequency resolution and the time/space resolution. When more samples are taken the resolution of the frequency information is higher, however the time/space resolution will be lower. Likewise taking only a few samples means a higher spatial/temporal resolution, but this is at the cost of less frequency resolution. A good filter should be able to obtain the maximum frequency resolution given a set time/space resolution, and vice versa. The Gabor filter achieves this bound.[2] Because of this, the Gabor filter is a good method for simultaneously localizing spatial/temporal and frequency information. A Gabor filter in the space (or time) domain is formulated as a Gaussian envelope multiplied by a complex exponential. It was found that the cortical responses in the human visual system can be modeled by the Gabor filter.[7][13] The Gabor filter was modified by Morlet to form an orthonormal continuous wavelet transform.[14]

Although the Gabor filter achieves a sense of optimality in terms of the space-frequency tradeoff, in certain applications it might not be an ideal filter. At certain bandwidths, the Gabor filter has a non-zero DC component. This means that the response of the filter depends on the mean value of the signal. If the output of the filter is to be used for an application such as pattern recognition, this DC component is undesirable because it gives a feature that changes with the average value. As we will soon see, the Log-Gabor filter does not exhibit this problem. Also the original Gabor filter has an infinite length impulse response. Finally, the original Gabor filter, while optimum in the sense of uncertainty, does not properly fit the statistics of natural images. As shown in,[1] it is better to choose a filter with a longer sloping tail in an image coding task.

In certain applications, other decompositions have advantages. Although there are many such decompositions possible, here we briefly present two popular methods: Mexican hat wavelets and the steerable pyramid.

Mexican Hat Wavelet

The Ricker wavelet, commonly called the mexican hat wavelet is another type of filter that is used to model data. In multiple dimensions this becomes the Laplacian of a Gaussian function. For reasons of computational complexity, the Laplacian of a Gaussian function is often simplified as a difference of Gaussians. This difference of Gaussian function has found use in several computer vision applications such as keypoint detection.[15] The disadvantage of the Mexican hat wavelet is that it exhibits some aliasing and does not represent oblique orientations well.

Steerable pyramid

The steerable pyramid decomposition [16] was presented as an alternative to the Morlet (Gabor) and Ricker wavelets. This decomposition ignores the orthogonality constraint of the wavelet formulation, and by doing this is able to construct a set of filters which are both translation and rotation independent. The disadvantage of the steerable pyramid decomposition is that it is overcomplete. This means that more filters than truly necessary are used to describe the signal.

Definition

Field introduced the Log-Gabor filter and showed that it is able to better encode natural images compared with the original Gabor filter.[1] Additionally, the Log-Gabor filter does not have the same DC problem as the original Gabor filter. A one dimensional Log-Gabor function has the frequency response:

where and are the parameters of the filter. will give the center frequency of the filter. affects the bandwidth of the filter. It is useful to maintain the same shape while the frequency parameter is varied. To do this, the ratio should remain constant. The following figure shows the frequency response of the Gabor compared with the Log-Gabor:

Difference in frequency domain between Gabor and Log-Gabor filters. The Gabor filter has a non-zero response at DC frequency, whereas the Log-Gabor always is zero. Because of this, the Gabor filter tends to over-represents lower frequencies. This is particularly evident in the log domain.

Another definition of the Log-Gabor filter is to consider it as a probability distribution function, with a normal distribution, but considering the logarithm of frequencies. This makes sense in contexts where the Weber–Fechner law applies, such as in visual or auditive perception. Following the change of variable rule, a one dimensional Log-Gabor function has thus the modified frequency response:

Note that this extends to the origin and that we still have .

In both definitions, because of the zero at the DC value, it is not possible to derive an analytic expression for the filter in the space domain. In practice the filter is first designed in the frequency domain, and then an inverse Fourier transform gives the time domain impulse response.

Bi-dimensional Log-Gabor filter

Multiscale decomposition of a natural image using log-Gabor filters. To represent the edges of the image at different levels, the correlation of log-Gabor filters was computed at different scales (in a clockwise fashion), see this page for an implementation.

Like the Gabor filter, the log-Gabor filter has seen great popularity in image processing.[4] Because of this it is useful to consider the 2-dimensional extension of the log-Gabor filter. With this added dimension the filter is not only designed for a particular frequency, but also is designed for a particular orientation. The orientation component is a Gaussian distance function according to the angle in polar coordinates (see or ):

where here there are now four parameters: the center frequency, the width parameter for the frequency, the center orientation, and the width parameter of the orientation. An example of this filter is shown below.

Construction of two-dimensional Log Gabor filter. The two dimensional filter consists of a component based on frequency (a) and a component based on orientation (b). The two components are combined to form the final component (c).
Difference in spatial domain between Gabor and Log-Gabor filters. In the spatial domain the response of Gabor and Log-Gabor filters are nearly identical. On the left is the real part and on the right is the imaginary part of the impulse response.

The bandwidth in the frequency is given by:

Note that the resulting bandwidth is in units of octaves.

The angular bandwidth is given by:

In many practical applications, a set of filters are designed to form a filter bank. Because the filters do not form a set of orthogonal basis, the design of the filter bank is somewhat of an art and may depend upon the particular task at hand. The necessary parameters that must be chosen are: the minimum and maximum frequencies, the filter bandwidth, the number of orientations, the angular bandwidth, the filter scaling and the number of scales.

See also

References

  1. D. J. Field. Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. A, 1987, pp. 2379-2394.
  2. D. Gabor. Theory of communication. J. Inst. Electr. Eng. 93, 1946.
  3. Z. Xiao, C. Guo, Y. Ming, and L. Qiang. Research on log Gabor wavelet and its application in image edge detection. In International Conference on Signal Processing volume 1, pages 592–595 Aug 2002.
  4. Sylvain Fischer, Filip Sroubek, Laurent U. Perrinet, Rafael Redondo, Gabriel Cristobal. Self-invertible 2D log-Gabor wavelets. Int. Journal of Computational Vision, 2007
  5. X. Gao, F. Sattar, and R. Venkateswarlu. Multiscale corner detection of gray level images based on log-Gabor wavelet transform. IEEE Transactions on Circuits and Systems for Video Technology, 17(7):868–875, July 2007.
  6. N. Rose. Facial expression classification using Gabor and log-Gabor filters. In International Conference on Automatic Face and Gesture Recognition (FGR), pages 346–350, April 2006.
  7. J. G. Daugman. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America, 1985, pp. 1160-9.
  8. W. Wang, J. Li, F. Huang, and H. Feng. Design and implementation of log-Gabor filter in fingerprint image enhancement. Pattern Recognition Letters, 2008. pp. 301–308.
  9. L. He, M. Lech, N. Maddage, and N. Allen. Stress and emotion recognition using log-Gabor filter analysis of speech spectrograms. Affective Computing and Intelligent Interaction, 2009, pp. 1-6
  10. Sylvain Fischer, Rafael Redondo, Laurent Perrinet, Gabriel Cristobal. Sparse approximation of images inspired from the functional architecture of the primary visual areas. EURASIP Journal on Advances in Signal Processing, special issue on Image Perception, 2007
  11. Paula S. Leon, Ivo Vanzetta, Guillaume S. Masson, Laurent U. Perrinet. Motion Clouds: Model-based stimulus synthesis of natural-like random textures for the study of motion perception. Journal of Neurophysiology, 107(11):3217--3226, 2012
  12. P. Kovesi. Phase preserving denoising of images. The Australian Pattern Recognition Society Conference: DICTA’99, 1999, pp. 212-217.
  13. Andrew B. Watson. The cortex transform: rapid computation of simulated neural images. Journal of Computer Vision, Graphics, and Image Processing. 1987. pp. 311-327.
  14. A. Grossmann and J. Morlet. Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM Journal on Mathe- matical Analysis, 1984, pp. 723-736.
  15. D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004, pp. 91-110.
  16. E. P. Simoncelli and W. T. Freeman. The steerable pyramid: A flexible architecture for multi-scale derivative computation. IEEE Int’l Conf on Image Processing, 1995. pp. 444 - 447
  • (obsolete to this date)
  • A python implementation with examples for vision:
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.