Classical Computer Vision: The Spatial Domain I

In image processing, the Spatial Domain is where a pixel is treated as a member of a group of pixels, in contrast to the Value Domain, where each pixel was treated as an isolated entity.

Therefore, when we work in the spatial domain, we try to extract geometrical information from the image. We are concerned with edges, segmentation, morphology, filters, and so on.

The Convolution Operation

What is a convolution?

Mathematically speaking, a convolution is an operation on two matrices, generally two-dimensional: one is the image and the other is the convolution matrix, also known as the kernel, filter, or mask. (The term "structuring element" belongs to morphological operations, where it plays a similar role.)
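For reference, the usual discrete two-dimensional definition can be written as below. The symbols are assumptions chosen for this note: I stands for the image, K for the convolution matrix, and (i, j) for the position of the output pixel.

```latex
(I * K)(i, j) = \sum_{m} \sum_{n} I(i - m,\; j - n)\, K(m, n)
```

The minus signs mean the convolution matrix is flipped in both directions before it is laid over the image. For a symmetric matrix the flip changes nothing, which is why it is often ignored in informal descriptions.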

The goal here is to transform the first matrix using the second one as a tool for changing its values.

We could think of the convolution matrix as a piece of code and the image as the input for that piece of code. That "piece of code" is going to run for every group of pixels in the input image, producing some value. By changing the "piece of code", we change the value produced.
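The "piece of code" analogy can be taken almost literally. The sketch below is only an illustration under stated assumptions: `apply_to_neighborhoods` is a hypothetical helper written for this post, not a library function, and it visits only the groups of pixels that fit entirely inside the image.

```python
import numpy as np

def apply_to_neighborhoods(image, size, func):
    """Run func on every size x size group of pixels that fits inside image."""
    rows, cols = image.shape
    out = np.empty((rows - size + 1, cols - size + 1), dtype=float)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = image[i:i + size, j:j + size]  # one group of pixels
            out[i, j] = func(window)                # the "piece of code" runs here
    return out

# A 4x4 test image holding the values 0..15, row by row.
image = np.arange(16, dtype=float).reshape(4, 4)

local_max = apply_to_neighborhoods(image, 3, np.max)
local_mean = apply_to_neighborhoods(image, 3, np.mean)

print(local_max)   # [[10. 11.] [14. 15.]]
print(local_mean)  # [[ 5.  6.] [ 9. 10.]]
```

Swapping `np.max` for `np.mean` changes the value produced for each group while the sliding logic stays the same.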

By choosing different convolution matrices, we achieve different image transformations, much as if we were modifying a program to turn our input into a desired output.

Basically, the convolution matrix is our "kernel". We place this kernel over a portion of the image and do some calculations with it. In a convolution proper, we multiply each kernel value by the pixel beneath it and sum the products; taking the mean is the special case where all kernel values are equal. Other calculations over the same window are possible too, but strictly speaking they are neighborhood operations rather than convolutions.

So, what we have then is a new resulting matrix, in which each value comes from one interaction between our kernel and the region of the original input image it was placed over.
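A minimal sketch of the multiply-and-sum case, assuming NumPy and a "valid" output (no padding, so the result is smaller than the input). The function name `convolve2d` and the two example kernels are chosen here for illustration and are not taken from any particular library.

```python
import numpy as np

def convolve2d(image, kernel):
    """'Valid' 2D convolution: flip the kernel, slide it, multiply and sum."""
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel in both axes
    kr, kc = kernel.shape
    out = np.empty((image.shape[0] - kr + 1, image.shape[1] - kc + 1), dtype=float)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            region = image[i:i + kr, j:j + kc]
            out[i, j] = np.sum(region * flipped)
    return out

# A 4x4 image that is black except for one bright pixel.
image = np.zeros((4, 4))
image[1, 1] = 9.0

identity = np.array([[0, 0, 0],
                     [0, 1, 0],
                     [0, 0, 0]], dtype=float)  # copies the centre pixel
box_blur = np.full((3, 3), 1 / 9)              # mean of the 3x3 neighborhood

print(convolve2d(image, identity))  # [[9. 0.] [0. 0.]]
print(convolve2d(image, box_blur))  # every entry is approximately 1.0
```

The same function yields two different transformations only because the kernel changed: the identity kernel leaves the bright pixel where it was, while the box blur spreads it evenly over every window that contains it.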

Nice links

Computer Vision Notes

Classical Computer Vision: The Spatial Domain I - The Convolution Operation
