Digital image representation


2025-08-08

1. Digital Image Representation

Understanding how digital images are represented is fundamental to all image processing and computer vision tasks. This lecture covers pixel representation, common image formats, data types, and the key differences between popular image processing libraries.

1.1 What Is a Digital Image?

A digital image is essentially a matrix (or array) of pixels.
Each pixel represents a tiny part of the image and stores color or intensity information.
Images can be grayscale (single channel) or color (multiple channels).

1.2 Pixel Representation

Grayscale images are represented as a 2D matrix of intensity values.
Color images are represented as a 3D array with dimensions (height × width × channels).
Each channel corresponds to one component of the color (e.g., Red, Green, Blue).

For example:

A grayscale image of size 256×256 pixels is represented as a 2D array with shape (256, 256).
An RGB color image of size 256×256 pixels has shape (256, 256, 3).
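
The shapes above can be checked directly in NumPy. The following is a minimal sketch that uses synthetic all-black arrays (an assumption made so that no image file is needed); the variable names are illustrative only.

```python
import numpy as np

# Hypothetical blank images; a real image would normally come from a file.
gray = np.zeros((256, 256), dtype=np.uint8)      # (height, width)
rgb = np.zeros((256, 256, 3), dtype=np.uint8)    # (height, width, channels)

print(gray.shape, gray.ndim)   # (256, 256) 2
print(rgb.shape, rgb.ndim)     # (256, 256, 3) 3

# Pixels are indexed [row, column], i.e. [y, x], not [x, y].
rgb[10, 20] = (255, 0, 0)      # set the pixel at row 10, column 20 to pure red
print(rgb[10, 20].tolist())    # [255, 0, 0]
```

Note that the first axis is the row (vertical position), so `rgb[10, 20]` and `rgb[20, 10]` address different pixels.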

1.3 Common Image Formats and Color Channels

| Format | Number of Channels | Description |
|---|---|---|
| Grayscale | 1 | Single channel representing brightness (intensity). |
| RGB | 3 | Red, Green, Blue channels. Standard color representation. |
| RGBA | 4 | RGB + Alpha (transparency channel). |
| BGR | 3 | Blue, Green, Red channels. Used as the default in OpenCV. |
| HSV / Lab | 3 | Alternative color spaces useful for certain tasks. |

Important: The order of channels matters! OpenCV reads images in BGR order, whereas libraries like PIL and matplotlib use RGB.
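
To see why the order matters, consider one pure-red pixel stored both ways. This is a small NumPy-only sketch; the 1×1 array and its names are made up for illustration.

```python
import numpy as np

# A 1x1 image holding a single pure-red pixel, stored in RGB order.
pixel_rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels (RGB <-> BGR).
pixel_bgr = pixel_rgb[..., ::-1]

print(pixel_rgb[0, 0].tolist())  # [255, 0, 0]  red when read as RGB
print(pixel_bgr[0, 0].tolist())  # [0, 0, 255]  the same red, stored as BGR

# If the BGR array is mistakenly interpreted as RGB, [0, 0, 255] is shown as blue.
```

The data is the same in both arrays; only the interpretation of the channel positions changes, which is why a BGR image shown by an RGB-based tool appears with red and blue swapped.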

1.4 Data Types in Images

Images are stored with different numerical data types depending on the application:

| Data Type | Description | Common Usage |
|---|---|---|
| `uint8` | 8-bit unsigned integers, values 0–255 | Most common; standard image files and OpenCV default |
| `float32` | 32-bit floating point, typically normalized to 0.0–1.0 (sometimes kept in the 0–255 range) | Used in deep learning and image processing algorithms |
| `bool` | Binary images (True/False per pixel) | Masks, segmentation outputs |
| `int16`, `int32` | Signed integers | Depth maps, displacement fields |
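
The choice of data type affects arithmetic. The sketch below, using two made-up pixel values, shows that `uint8` addition in NumPy wraps around modulo 256, and one common way to avoid this by working in normalized `float32` and clipping before converting back.

```python
import numpy as np

pixels = np.array([200, 250], dtype=np.uint8)

# uint8 arithmetic wraps around modulo 256 instead of saturating at 255.
wrapped = pixels + np.uint8(100)
print(wrapped.tolist())          # [44, 94]

# Convert to float32 in [0.0, 1.0], do the arithmetic, then clip and convert back.
as_float = pixels.astype(np.float32) / 255.0
brightened = np.clip(as_float + 100 / 255.0, 0.0, 1.0)
back = np.round(brightened * 255.0).astype(np.uint8)
print(back.tolist())             # [255, 255]
```

Clipping before the cast back to `uint8` matters: casting out-of-range floats directly does not saturate reliably.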

1.5 Differences Between OpenCV and PIL Image Handling

| Operation | OpenCV | PIL (Python Imaging Library) |
|---|---|---|
| Default color channel order | BGR | RGB |
| Image representation | NumPy ndarray | PIL Image object |
| Reading an image | `cv2.imread()` returns BGR ndarray | `Image.open()` returns PIL Image |
| Displaying images | `cv2.imshow()` | `Image.show()` or matplotlib |
| Using the image with RGB-based tools (e.g., matplotlib) | Requires an explicit BGR-to-RGB channel swap | Already RGB; no channel swap needed |

Converting OpenCV BGR to RGB:

img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)

Converting PIL Image to NumPy array:

np_img = np.array(pil_img)
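
A slightly fuller round trip is sketched below. It assumes Pillow is installed and builds a synthetic red image with `Image.new` so that no file is needed; the variable names are illustrative. It also shows a detail that is easy to miss: PIL reports size as (width, height), while the NumPy array has shape (height, width, channels).

```python
import numpy as np
from PIL import Image

# A synthetic image 4 pixels wide and 2 pixels high, filled with red.
pil_img = Image.new("RGB", (4, 2), color=(255, 0, 0))

np_img = np.array(pil_img)
print(pil_img.size)    # (4, 2)     PIL reports (width, height)
print(np_img.shape)    # (2, 4, 3)  NumPy reports (height, width, channels)
print(np_img.dtype)    # uint8

# Reorder channels to BGR before passing the array to OpenCV functions;
# .copy() produces a contiguous array rather than a reversed view.
np_bgr = np_img[..., ::-1].copy()
print(np_bgr[0, 0].tolist())   # [0, 0, 255]

# Going back: Image.fromarray treats a 3-channel uint8 array as RGB.
pil_again = Image.fromarray(np_img)
print(pil_again.mode, pil_again.size)   # RGB (4, 2)
```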

1.6 Practical Exercises and Code Examples

Example 1: Load and Display an Image Correctly Using OpenCV and Matplotlib

import cv2
from matplotlib import pyplot as plt

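# Load image in BGR format (OpenCV default)
img_bgr = cv2.imread('your_image.jpg')

# cv2.imread() returns None instead of raising an error if the file cannot be read
if img_bgr is None:
    raise FileNotFoundError("Could not read 'your_image.jpg'")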

# Convert BGR to RGB for correct color display in matplotlib
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)

# Display image
plt.imshow(img_rgb)
plt.title("Image in RGB format")
plt.axis('off')
plt.show()

Example 2: Convert Color Image to Grayscale and Split/Merge Channels

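# Continues from Example 1: img_bgr is the BGR array loaded with cv2.imread()

# Convert to grayscale; the result is a 2D array of shape (height, width)
img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)

# Split into three 2D single-channel arrays, in B, G, R order
b, g, r = cv2.split(img_bgr)

# Merge channels back into a (height, width, 3) BGR image
merged_img = cv2.merge([b, g, r])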
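
Because OpenCV images are NumPy arrays, splitting and merging can also be expressed with plain indexing. The following NumPy-only sketch uses a tiny synthetic array in place of a loaded image (an assumption made so that it runs without a file).

```python
import numpy as np

# Hypothetical 2x2 BGR image with distinct values in every channel.
img_bgr = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# Indexing the last axis gives one 2D array per channel.
b, g, r = img_bgr[..., 0], img_bgr[..., 1], img_bgr[..., 2]
print(b.shape)  # (2, 2)

# Stacking the 2D arrays along a new last axis rebuilds the 3-channel image.
merged_img = np.stack([b, g, r], axis=-1)
print(np.array_equal(merged_img, img_bgr))  # True

# Stacking in the opposite order yields an RGB image instead.
img_rgb = np.stack([r, g, b], axis=-1)
print(img_rgb[0, 0].tolist())  # [2, 1, 0]
```

One difference to keep in mind: the slices `b`, `g`, and `r` here are views into `img_bgr`, so modifying them in place also modifies the original array.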

1.7 Summary

Digital images are arrays of pixel values, which can be single-channel (grayscale) or multi-channel (color).
The color channel order differs between libraries (OpenCV uses BGR, PIL uses RGB).
Images use different data types; uint8 is the most common.
Properly converting between formats and understanding these basics is critical for image processing.

1.8 Suggested Exercises

Load both grayscale and color images using OpenCV. Print their shapes and data types.
Display a BGR image directly in matplotlib without conversion and observe color distortion.
Implement a function that manually converts an RGB image to grayscale using a weighted average method:
$Gray = 0.299 \times R + 0.587 \times G + 0.114 \times B$
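
As a reference to compare your own solution against, here is one possible sketch of the weighted average in NumPy. The function name `rgb_to_gray` and the rounding and clipping choices are assumptions, not part of the exercise statement.

```python
import numpy as np

def rgb_to_gray(img_rgb: np.ndarray) -> np.ndarray:
    """Weighted-average grayscale conversion for an (H, W, 3) uint8 RGB image."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Dot product over the channel axis: (H, W, 3) @ (3,) -> (H, W)
    gray = img_rgb.astype(np.float32) @ weights
    # Round to the nearest integer and keep the result in the uint8 range.
    return np.clip(np.round(gray), 0, 255).astype(np.uint8)

# Pure red, green, blue, and white pixels in a 1x4 image.
test_img = np.array(
    [[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]], dtype=np.uint8
)
print(rgb_to_gray(test_img).tolist())  # [[76, 150, 29, 255]]
```

The three weights sum to 1, so a white pixel stays at 255, and green contributes the most to the perceived brightness. If the input comes from `cv2.imread()`, remember that it is in BGR order and the weights must be applied to the matching channels.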
