Image Pre-processing Techniques I - Value, Space and Frequency Domains
The aim of pre-processing an image is to reduce the amount of information in it by removing what is irrelevant, such as noise.
Sometimes the captured image is not ideal, which can lead to poor computer vision results. Some techniques let us modify (filter) the input image so that it is better "seen" by a computer.
Other times, we just need our computer vision system to focus on something specific. For example, suppose we are trying to authenticate someone based on their face. It would be counter-productive for the computer to focus on the background while checking whether the face is recognizable. It would be much better if we could segment the image and run the face recognition on the person rather than on the background, wouldn't it?
When pre-processing an image, we can work in three different domains:
- Value
- Spatial
- Frequency
The Value Domain
In the Value Domain, we disregard any meaning in the image. We treat it only as a huge array of numbers, so each pixel is looked at individually, without comparing it with the pixels around it.
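As a rough illustration of what "looking at each pixel individually" can mean, here is a minimal sketch using NumPy. The function names `negative` and `threshold`, the tiny 2x2 example image and the 8-bit value range are assumptions made for this example, not something prescribed by the text. In both functions the output for a pixel depends only on that pixel's own value.

```python
import numpy as np

def negative(image, max_value=255):
    """Invert every pixel independently (assumes values in 0..max_value)."""
    return max_value - image

def threshold(image, level):
    """Return 1 where a pixel is brighter than `level`, 0 elsewhere."""
    return (image > level).astype(np.uint8)

# A hypothetical 2x2 grayscale image, just for demonstration.
image = np.array([[50, 100],
                  [150, 200]], dtype=np.uint8)

inverted = negative(image)       # each value v becomes 255 - v
binary = threshold(image, 120)   # only the bottom row exceeds 120
```

Neither function ever reads a neighboring pixel, which is what separates this kind of operation from the spatial ones.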
The Spatial Domain
In the Spatial Domain, every pixel is analyzed based on its position in the matrix of pixels that composes the image.
In this class of algorithms, we still look at a pixel as a source of information, but now the pixels around it matter too. Maybe some statistic of its neighbors is important to the analysis, or maybe a group of pixels shares some similarity and that group is relevant. Or maybe it is the difference between neighboring pixels that matters.
Depending on what I want to do when manipulating the image, the Spatial Domain can provide information about its structure, such as similar regions and abrupt changes between neighbors.
It's the most common working area in classical image processing.
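To make the two ideas above concrete (a statistic of the neighbors, and the difference between neighbors), here is a small NumPy sketch. The function names, the 3x3 window and the edge-replicating padding are choices made for this example only; real pipelines often rely on a library routine instead.

```python
import numpy as np

def mean_filter_3x3(image):
    """Replace each pixel with the mean of its 3x3 neighborhood."""
    image = image.astype(np.float64)
    # Repeat the border pixels so that edge pixels also have 9 neighbors.
    padded = np.pad(image, 1, mode="edge")
    height, width = image.shape
    total = np.zeros_like(image)
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + height, dx:dx + width]
    return total / 9.0

def horizontal_difference(image):
    """Difference between each pixel and its left-hand neighbor."""
    image = image.astype(np.float64)
    return image[:, 1:] - image[:, :-1]

# A single bright pixel gets spread over its 3x3 neighborhood.
spot = np.zeros((5, 5))
spot[2, 2] = 9.0
smoothed = mean_filter_3x3(spot)

# A step from dark to bright shows up as one large difference.
row = np.array([[0, 0, 10, 10]])
steps = horizontal_difference(row)
```

The mean filter is an example of a neighborhood statistic (it smooths noise), while the difference is large only where neighboring values change abruptly, which is the basic idea behind edge detection.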
The Frequency Domain
In this domain, we take into consideration how quickly the pixel values change across the image.
In the spatial domain, a pixel is represented by its value and its position in the image's matrix: the value is an amplitude, and the address (x, y) is the position.
In the frequency domain, the image is instead represented as a sum of sine and cosine waves of different frequencies, each described by a magnitude and a phase.
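One way to see this representation in practice is with the 2D discrete Fourier transform. The sketch below uses NumPy's `np.fft.fft2`; the 4x4 example image is made up for illustration. It splits the image into a magnitude and a phase per frequency, then rebuilds the image from those two pieces alone.

```python
import numpy as np

# A hypothetical 4x4 grayscale image with a smooth diagonal ramp.
image = np.array([[1.0, 2.0, 3.0, 4.0],
                  [2.0, 3.0, 4.0, 5.0],
                  [3.0, 4.0, 5.0, 6.0],
                  [4.0, 5.0, 6.0, 7.0]])

spectrum = np.fft.fft2(image)   # one complex coefficient per frequency
magnitude = np.abs(spectrum)    # how strong each wave is
phase = np.angle(spectrum)      # how each wave is shifted

# Recombine magnitude and phase into complex coefficients and invert.
rebuilt = np.fft.ifft2(magnitude * np.exp(1j * phase)).real
```

Up to floating-point error, `rebuilt` matches `image`, so no information is lost by moving between the two domains.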
It's a very useful domain for applying filters to images. For example, we could eliminate all the high frequencies to obtain a blurry version of the image, or eliminate the low frequencies to obtain an edge-highlighted version.
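The two filters just described can be sketched with NumPy's FFT routines. The function name `frequency_filter`, its `cutoff` and `keep` parameters, and the hard circular mask are assumptions made for this example. A sharp cutoff like this one is the simplest to write but tends to produce ringing artifacts, so smoother masks are usually preferred in practice.

```python
import numpy as np

def frequency_filter(image, cutoff, keep="low"):
    """Keep frequencies within `cutoff` of zero ("low") or beyond it ("high")."""
    height, width = image.shape
    # Move the zero-frequency term to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    y, x = np.ogrid[:height, :width]
    distance = np.hypot(y - height // 2, x - width // 2)
    if keep == "low":
        mask = distance <= cutoff
    else:
        mask = distance > cutoff
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real

# A hypothetical 8x8 image: dark left half, bright right half.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

blurry = frequency_filter(image, cutoff=2, keep="low")   # smooth version
edges = frequency_filter(image, cutoff=2, keep="high")   # detail only
```

Because the two masks are complementary, `blurry + edges` adds back up to the original image: the low-pass output carries the overall brightness and slow variations, and the high-pass output carries the abrupt changes.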