What is a Point Operation?
Value-domain transformations can be seen as point operations. A point operation modifies a pixel using only that pixel's own value, without reading or affecting its neighbors. It doesn't change the size, structure or geometry of the image, and it doesn't depend on any particular aspect of the image's content: the same rule is applied blindly to every pixel.
Point Operations with Scalars
By using point operations between pixels and scalars, we can make several simple changes to the image, such as adjusting its brightness or contrast.
As an example, let's check the code below:
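from skimage import io
import matplotlib.pyplot as plt

def trunc(px):
    # Clamp a channel value to the valid 8-bit range [0, 255].
    if px > 255:
        return 255
    if px < 0:
        return 0
    return px

def point_op(image, fn):
    # Apply fn to every pixel of a multi-channel image, modifying it in place.
    for row in range(image.shape[0]):
        for col in range(image.shape[1]):
            # Convert to plain ints so the arithmetic in fn cannot wrap around as uint8.
            res = fn([int(ch) for ch in image[row][col]])
            # Truncate every channel of the result, including the last one.
            image[row][col] = [trunc(ch) for ch in res]
    return image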
This code defines point_op, which applies a function 'fn' to every pixel of an image, one by one. It also automatically truncates each resulting channel value to the valid 0-255 range.
Now we can use it to apply any transformation to the color channels of an image. As an example, let's reduce the brightness of channel R (by subtracting 70) and increase the contrast of channels G and B by 10% and 20% respectively (by multiplying them by 1.1 and 1.2) on the original "Lena" image.
if __name__ == '__main__':
    image = io.imread('lena.png')
    plt.figure()
    plt.imshow(image)
    plt.figure()
    # int() keeps the subtraction from wrapping around on uint8 pixel values.
    plt.imshow(point_op(image.copy(), lambda px: [int(px[0]) - 70, 1.1 * px[1], 1.2 * px[2]]))
    plt.show()
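The same per-channel transformation can also be sketched without explicit loops. This is only an illustration under some assumptions: the function name point_op_vectorized and its offsets/gains parameters are made up for this example, NumPy is used directly, and a tiny synthetic array stands in for the real photo.

```python
import numpy as np

def point_op_vectorized(image, offsets=(0, 0, 0), gains=(1.0, 1.0, 1.0)):
    """Compute gain * pixel + offset per channel, then clamp to [0, 255].

    Hypothetical helper: the name and parameters are assumptions for this sketch.
    """
    # Work in float so intermediate values can leave the 8-bit range safely.
    work = image[..., :3].astype(np.float64)
    work = work * np.asarray(gains) + np.asarray(offsets)
    # np.clip plays the role of trunc(); the cast drops the fractional part.
    return np.clip(work, 0, 255).astype(np.uint8)

# Synthetic 2x2 RGB image standing in for a real picture.
demo = np.array([[[50, 100, 250], [200, 240, 10]],
                 [[0, 0, 0], [255, 255, 255]]], dtype=np.uint8)

# Same idea as the lambda above: R minus 70, G times 1.1, B times 1.2.
result = point_op_vectorized(demo, offsets=(-70, 0, 0), gains=(1.0, 1.1, 1.2))
print(result)
```

Because the whole array is processed at once, this version should be far faster than the pixel-by-pixel loop, at the cost of being limited to transformations that can be written as array arithmetic.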
Image Math / Logic Operations between Images
We can also apply math or logic operations between images. In that case, every pixel of image 1 is combined with its counterpart in image 2, provided image 2 has a pixel at that (row, col) position. Otherwise, the pixel is left as it is (no operation is performed).
The algorithm above could be implemented as:
from skimage import io
import matplotlib.pyplot as plt

def trunc(px):
    # Clamp a channel value to the valid 8-bit range [0, 255].
    if px > 255:
        return 255
    if px < 0:
        return 0
    return px

def image_merge_op(image1, image2, fn):
    max_row_img2 = image2.shape[0]
    max_col_img2 = image2.shape[1]
    res_img = image1.copy()
    for row in range(image1.shape[0]):
        for col in range(image1.shape[1]):
            # Only combine pixels that exist in both images; the rest stay unchanged.
            if row < max_row_img2 and col < max_col_img2:
                res_pixel = fn(image1[row][col], image2[row][col])
                # Truncate every channel of the result, including the last one.
                res_img[row][col] = [trunc(ch) for ch in res_pixel]
    return res_img
Now, using our code, we can apply a mask to cut out part of an image, as an example of a bitwise operation between images. For the same "Lena" example, let's remove everything except her face:
def bitwise_and(img1_px, img2_px):
    # AND each RGB channel of the image pixel with the matching mask channel.
    return [img1_px[0] & img2_px[0], img1_px[1] & img2_px[1], img1_px[2] & img2_px[2]]

if __name__ == '__main__':
    image = io.imread('lena.png')
    mask_image = io.imread('lena_cut_mask.png')
    fig, axes = plt.subplots(nrows=1, ncols=3)
    ax = axes.ravel()
    ax[0].imshow(image)
    ax[0].set_title("Original")
    ax[1].imshow(mask_image)
    ax[1].set_title("Mask")
    ax[2].imshow(image_merge_op(image, mask_image, bitwise_and))
    ax[2].set_title("Result")
    plt.tight_layout()
    plt.show()
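For comparison, here is a loop-free sketch of the same masking idea. The name apply_mask is hypothetical, and the sketch assumes both arrays are uint8 with the same number of channels; small synthetic arrays replace the image files.

```python
import numpy as np

def apply_mask(image, mask):
    """Bitwise-AND image with mask over the region where both exist.

    Hypothetical helper for illustration; assumes matching dtype and channels.
    """
    result = image.copy()
    rows = min(image.shape[0], mask.shape[0])
    cols = min(image.shape[1], mask.shape[1])
    # Pixels outside the overlap are left as they are, as in image_merge_op.
    result[:rows, :cols] = np.bitwise_and(image[:rows, :cols], mask[:rows, :cols])
    return result

# Synthetic data: a 3x3 image and a smaller 2x2 mask that keeps only one pixel.
image = np.full((3, 3, 3), 200, dtype=np.uint8)
mask = np.zeros((2, 2, 3), dtype=np.uint8)
mask[0, 0] = 255  # white in the mask means "keep this pixel"

masked = apply_mask(image, mask)
print(masked[:, :, 0])
```

A white mask pixel (255, all bits set) keeps the image value unchanged, while a black one (0) clears it, which is why a plain AND is enough to cut a region out.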
A nice example of a math operation between images is movement detection. Every second, for example, we compare one image with the next one by subtracting them pixel by pixel, obtaining a result that contains only what changed.
If enough pixels changed, we could alert a human operator that something happened: a simple and efficient intrusion detector.
import matplotlib.pyplot as plt
import av

def change_detector(base_img, target_img, px_tolerance, total_tolerance):
    # Count pixels whose value changed by more than px_tolerance and
    # report a change once more than total_tolerance pixels differ.
    max_row_img2 = target_img.shape[0]
    max_col_img2 = target_img.shape[1]
    diff = 0
    for row in range(base_img.shape[0]):
        for col in range(base_img.shape[1]):
            if row < max_row_img2 and col < max_col_img2:
                # abs() catches pixels that got darker as well as brighter.
                if abs(int(target_img[row][col]) - int(base_img[row][col])) > px_tolerance:
                    diff = diff + 1
                    if diff > total_tolerance:
                        return True
    return False

if __name__ == '__main__':
    video = av.open('20211126_180531.mp4')
    first = None
    # Decode only the video stream; enumerate() numbers the frames.
    for i, frame in enumerate(video.decode(video=0)):
        if i <= 300:
            continue  # ignore the beginning of the video
        # Single-channel (grayscale) array, so each pixel is one integer.
        img = frame.to_ndarray(format='gray')
        if first is None:
            # The first frame after the skip becomes the reference image.
            first = img
            plt.imshow(img, cmap='gray')
            plt.show()
        elif change_detector(first, img, 100, 100):
            print("detected movement on frame #" + str(i))
            plt.imshow(img, cmap='gray')
            plt.show()
            break
Obviously, all of this code is conceptual and no optimization is involved. It is meant for learning only and is very slow, so please keep that in mind.
Figure: reference image
Figure: detected movement on frame #371
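Since the loops above are slow, the change test itself can be sketched with NumPy array operations. This is an assumption-laden illustration: the name change_detector_vectorized is made up, the frames are assumed to be single-channel uint8 arrays, the absolute difference is used so that darker and brighter changes both count, and synthetic frames replace the video.

```python
import numpy as np

def change_detector_vectorized(base_img, target_img, px_tolerance, total_tolerance):
    """Return True when more than total_tolerance pixels changed by more than px_tolerance.

    Hypothetical helper; assumes single-channel (grayscale) uint8 frames.
    """
    rows = min(base_img.shape[0], target_img.shape[0])
    cols = min(base_img.shape[1], target_img.shape[1])
    # Signed integers avoid uint8 wrap-around when subtracting.
    base = base_img[:rows, :cols].astype(np.int16)
    target = target_img[:rows, :cols].astype(np.int16)
    changed = np.abs(target - base) > px_tolerance
    return int(np.count_nonzero(changed)) > total_tolerance

# Synthetic frames: an empty scene, then a bright 15x15 object appears.
reference = np.zeros((20, 20), dtype=np.uint8)
intruder = reference.copy()
intruder[2:17, 2:17] = 255

# A much smaller 5x5 change that stays under the pixel-count tolerance.
small_change = reference.copy()
small_change[0:5, 0:5] = 255

print(change_detector_vectorized(reference, intruder, 100, 100))
print(change_detector_vectorized(reference, small_change, 100, 100))
```

The two tolerances play different roles: px_tolerance filters out sensor noise on individual pixels, while total_tolerance decides how large a changed area must be before it counts as movement.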
Operations using Statistics
Histograms are one of the most used tools for pre-processing images. For example, if we want to apply a threshold to an image, we can use its histogram to choose the threshold value automatically, based on the distribution of the pixel values. The code below plots the histogram of each color channel:
import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt

if __name__ == '__main__':
    image = io.imread('lena.png')
    # Split the image into its R, G and B channels.
    image_red, image_green, image_blue = image[:,:,0], image[:,:,1], image[:,:,2]
    fig, ax = plt.subplots(2,3)
    # Top row: each channel shown as a grayscale image.
    ax[0,0].imshow(image_red, cmap='gray')
    ax[0,1].imshow(image_green, cmap='gray')
    ax[0,2].imshow(image_blue, cmap='gray')
    # One bin per intensity level, centered on the integers 0..255.
    bins = np.arange(-0.5, 255+1, 1)
    # Bottom row: histogram of each channel.
    ax[1,0].hist(image_red.flatten(), bins=bins, color='r')
    ax[1,1].hist(image_green.flatten(), bins=bins, color='g')
    ax[1,2].hist(image_blue.flatten(), bins=bins, color='b')
    plt.show()
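The text does not say which rule picks the threshold from the histogram. One common choice is Otsu's method, which selects the level that best separates the histogram into two classes. The sketch below is a plain NumPy illustration of that method, not necessarily what the author had in mind; the function name otsu_threshold is an assumption and a synthetic channel replaces the image.

```python
import numpy as np

def otsu_threshold(channel):
    """Return the level t that maximizes the between-class variance.

    Classes are "pixels <= t" and "pixels > t". Assumes a uint8 channel.
    """
    hist = np.bincount(channel.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    weight_low = np.cumsum(prob)         # fraction of pixels <= t
    weight_high = 1.0 - weight_low       # fraction of pixels > t
    cum_mean = np.cumsum(prob * levels)  # partial mean up to t
    total_mean = cum_mean[-1]
    # Between-class variance, defined only where both classes are non-empty.
    denom = weight_low * weight_high
    valid = denom > 1e-12
    between = np.zeros(256)
    between[valid] = (total_mean * weight_low[valid] - cum_mean[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

# Synthetic bimodal channel: half dark pixels (50), half bright pixels (200).
channel = np.array([50] * 8 + [200] * 8, dtype=np.uint8).reshape(4, 4)
t = otsu_threshold(channel)
binary = channel > t  # True for the bright class
print(t, int(binary.sum()))
```

On a real photo the same function could be applied to any of the channels plotted above, for example image_red.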
Combined Operations
We can combine any of these operations to pre-process images. For example, a histogram can be used to choose a threshold, the threshold is then applied to obtain a binary image, and that binary image can finally be used as a mask over the original to get the resulting image.
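The chain described above can be sketched in a few lines. Everything here is an assumption made for illustration: the function names are hypothetical, the grayscale conversion is a crude channel average, and the threshold is simply the mean intensity computed from the histogram, used as a stand-in for whatever rule is actually preferred.

```python
import numpy as np

def mean_threshold_from_histogram(gray):
    """Mean intensity computed from the histogram (a deliberately simple rule)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    levels = np.arange(256)
    return float((hist * levels).sum() / hist.sum())

def threshold_and_mask(image):
    """Histogram -> threshold -> binary mask -> masked image (hypothetical pipeline)."""
    # Crude grayscale: average of the first three channels.
    gray = image[..., :3].mean(axis=2).astype(np.uint8)
    t = mean_threshold_from_histogram(gray)
    # Threshold: bright pixels become 255 (keep), dark ones 0 (discard).
    mask = np.where(gray > t, 255, 0).astype(np.uint8)
    # Apply the mask to every channel with a bitwise AND.
    return np.bitwise_and(image, mask[..., np.newaxis])

# Synthetic 4x4 RGB image: dark left half, bright right half.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, :2] = 20
image[:, 2:] = 220

combined = threshold_and_mask(image)
print(combined[:, :, 0])
```

Each stage is itself a point operation or a histogram statistic, so the stages can be swapped independently, for instance by replacing the mean rule with a more robust threshold.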