Raw Image Processing in Python (Data Science)

  • tirthankarghosh5
  • Jun 19, 2022
  • 4 min read

A Post By: Tirthankar Ghosh

Preprocessing raw images for machine learning pipelines

Almost all modern cameras capture images in a raw format and then process them into a format commonly known as sRGB, which is suitable for human viewing. One might wonder what techniques are used to convert raw images into the sRGB format, and why this conversion is necessary. One might also wonder how to use raw images, or process them in a particular way, to get better performance on some machine learning tasks. This article attempts to answer these questions, with step-by-step Python code for each process.

  • Most of the filters on social media apps such as Snapchat and Instagram use machine learning. The algorithms behind those filters operate on raw images to produce their effects in real time. So it becomes increasingly important to know what a raw image is, and how it is processed by the camera, when designing an algorithm that uses raw images.

What exactly is a raw image?

A raw image can be defined as the minimally processed image captured by a camera. It has not yet been processed by software to correct background noise, contrast, black level, etc. In most cases a raw image is unpleasant to the human eye and needs to be processed before it is pleasant to look at. So how is a raw image captured by a camera, and how does the camera sensor work?

How does an image sensor work?

The image sensor can be thought of as a circuit whose surface captures incoming electromagnetic waves when the camera's shutter opens and the sensor is exposed to light. The surface records the intensity of the electromagnetic waves, i.e. the light, incident on it at the time of capture. It can be modeled as a 2D array in which each element stores the intensity of the incident light. But by storing only intensities, the sensor cannot distinguish the colors in the light. So how does the sensor detect the colors in the scene?

Various techniques are used to detect color in a sensor; one of the most common and most widely used is the Bayer filter, which is discussed here.


A Bayer filter maps the incoming electromagnetic signal into the RGB space using a filtering technique. The incident light is filtered into red, green, and blue components by a wavelength filter before it hits the sensor, so the intensity of each of these colors can be recorded. The red, green, and blue intensities are stored in an alternating pattern in the Bayer filter, as shown in the figure. Other filter patterns are used on some cameras, but the Bayer pattern is the most widely used.
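To make the layout concrete, here is a small NumPy sketch (an illustration of my own, not from the camera's actual hardware) that simulates an RGGB Bayer filter by keeping a single color sample per pixel of a full RGB image. The exact 2x2 arrangement varies by camera (RGGB, BGGR, etc.); RGGB is assumed here.

```python
import numpy as np

def make_bayer_mosaic(rgb):
    """Simulate an RGGB Bayer filter: keep one color sample per pixel."""
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red at even rows, even cols
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green at even rows, odd cols
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green at odd rows, even cols
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue at odd rows, odd cols
    return mosaic

rgb = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)
mosaic = make_bayer_mosaic(rgb)
print(mosaic.shape)  # (4, 4): one intensity value per pixel, colors interleaved
```

Notice that the mosaic has the same spatial resolution as the scene but only one intensity per location, which is exactly what the sensor records.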

How to obtain the color channels from this image?

A raw image is a 2D array that contains information about the light intensities at various wavelengths/colors. To obtain a color channel, we need to separate the pixels of each color and combine them into an image. Note that there are twice as many green pixels as red or blue pixels; the values of the two adjacent green pixels are therefore averaged to obtain a single value. Hence, for a raw image of size H x W, the resulting RGB image is H/2 x W/2 x 3.

import numpy as np

def get_channels(bayer_image):
    # Assumes a Bayer layout with blue at the top-left and red at the
    # bottom-right of each 2x2 block; the two green samples are averaged.
    red = bayer_image[1::2, 1::2]
    blue = bayer_image[0::2, 0::2]
    green1 = bayer_image[1::2, 0::2]
    green2 = bayer_image[0::2, 1::2]
    green = (green1.astype(np.float64) + green2) / 2  # avoid integer overflow
    return red, green, blue
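A quick sanity check of this slicing (self-contained here for illustration, with the same index convention: blue at the top-left of each 2x2 block) confirms the half-resolution result:

```python
import numpy as np

bayer_image = np.arange(16, dtype=np.float64).reshape(4, 4)

red = bayer_image[1::2, 1::2]
blue = bayer_image[0::2, 0::2]
green = (bayer_image[1::2, 0::2] + bayer_image[0::2, 1::2]) / 2

# Stack the three channels into a single half-resolution RGB image.
rgb = np.stack([red, green, blue], axis=-1)
print(rgb.shape)  # (2, 2, 3): half the resolution in each spatial dimension
```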

What does a raw image contain?

The raw image file generally contains the image as the 2D array recorded by the image sensor after the light has passed through the Bayer filter. The file also contains a large amount of metadata about the camera, aperture, lighting conditions, etc., which helps during the image's postprocessing. Some common pieces of metadata are the black level, white level, orientation, and color space transform, which are discussed in this article. All these steps need to be applied to the image to convert it into the required format while maintaining quality.

Now, we are going to discuss some of these steps in detail, along with the Python code:

Black Level:

The black level is defined as the intensity of the darkest part of the image. It is necessary to calibrate the image's black level during postprocessing to obtain truly black pixels, which are not present in the original raw data. Various algorithms are used to correct the black level in images; their details are beyond the scope of this article.
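One common correction, sketched here as an illustration (the black and white levels are assumed to come from the file's metadata; the values below are made up), subtracts the black level and rescales the usable range to [0, 1]:

```python
import numpy as np

def correct_black_level(raw, black_level, white_level):
    """Map [black_level, white_level] to [0, 1], clipping noise below black."""
    raw = raw.astype(np.float64)
    normalized = (raw - black_level) / (white_level - black_level)
    return np.clip(normalized, 0.0, 1.0)

# Hypothetical 10-bit sensor values with a black level of 64.
raw = np.array([[64, 64], [1023, 512]], dtype=np.uint16)
out = correct_black_level(raw, black_level=64, white_level=1023)
print(out[0, 0], out[1, 0])  # 0.0 1.0: black-level pixels map to 0, saturated to 1
```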

import numpy as np
import cv2

def adjust_blacklevel(image, gamma=1.0):
    # Apply a gamma lookup table to each 8-bit value to lift or deepen darks.
    invGamma = 1.0 / gamma
    table = np.array([((i / 255.0) ** invGamma) * 255
                      for i in np.arange(0, 256)]).astype("uint8")
    return cv2.LUT(image, table)

Orientation:

In some cameras the image is stored vertically inverted, and the orientation information in the metadata helps correct the image in such cases. The camera's lens projects the image onto the sensor in inverted form, and sometimes it is also flipped left to right. The lens's orientation effect is generally corrected internally within the camera and does not need to be corrected during postprocessing.

import cv2

def fix_orientation(image, orientation):
    # Orientation codes follow the EXIF convention stored in the metadata.
    if type(orientation) is list:
        orientation = orientation[0]
    if orientation == 1:
        pass
    elif orientation == 2:
        image = cv2.flip(image, 0)
    elif orientation == 3:
        image = cv2.rotate(image, cv2.ROTATE_180)
    elif orientation == 4:
        image = cv2.flip(image, 1)
    elif orientation == 5:
        image = cv2.flip(image, 0)
        image = cv2.rotate(image, cv2.ROTATE_90_COUNTERCLOCKWISE)
    elif orientation == 6:
        image = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)
    elif orientation == 7:
        image = cv2.flip(image, 0)
        image = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)
    elif orientation == 8:
        image = cv2.rotate(image, cv2.ROTATE_90_COUNTERCLOCKWISE)
    return image

Color Space Transform:

This is usually the last step in an image processing pipeline. The processed image is transformed into the required color space, such as sRGB, YCrCb, or grayscale, before being stored on disk. The most commonly used color space is sRGB. After the color space transformation, the images are stored on disk in standard image formats such as .png and .jpeg.

import cv2

path = "some_path"
src = cv2.imread(path)  # note: OpenCV loads images in BGR channel order

# Converting the BGR image to grayscale
image = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)

# Converting the BGR image to YCrCb
image = cv2.cvtColor(src, cv2.COLOR_BGR2YCrCb)
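Putting the pieces together, here is a minimal end-to-end sketch (NumPy only; the Bayer layout, black/white levels, and gamma value are illustrative assumptions, and the simple gamma curve only approximates the true sRGB transfer function) that goes from a Bayer mosaic to a gamma-encoded RGB image:

```python
import numpy as np

def simple_pipeline(bayer, black_level=64, white_level=1023, gamma=2.2):
    """Bayer mosaic -> half-resolution RGB -> black-level correction -> gamma."""
    # Demosaic by subsampling (blue at top-left, red at bottom-right, as above).
    red = bayer[1::2, 1::2].astype(np.float64)
    blue = bayer[0::2, 0::2].astype(np.float64)
    green = (bayer[1::2, 0::2].astype(np.float64) + bayer[0::2, 1::2]) / 2
    rgb = np.stack([red, green, blue], axis=-1)
    # Black-level correction: map the usable range to [0, 1].
    rgb = np.clip((rgb - black_level) / (white_level - black_level), 0.0, 1.0)
    # Gamma encoding approximates the sRGB transfer curve.
    return rgb ** (1.0 / gamma)

bayer = np.random.randint(64, 1024, (8, 8), dtype=np.uint16)
out = simple_pipeline(bayer)
print(out.shape)  # (4, 4, 3)
```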

Camera pipelines are much more complex than the one discussed here, but the details covered in this article are more than sufficient to start using raw image data in a machine learning pipeline.


Feel free to ask questions in the comment section.
