Digital image representation
2025-08-08
1. Digital Image Representation
Understanding how digital images are represented is fundamental to all image processing and computer vision tasks. This lecture covers the core concepts of pixel representation, common image formats, data types, and key differences in popular image processing libraries.
1.1 What Is a Digital Image?
- A digital image is essentially a matrix (or array) of pixels.
- Each pixel represents a tiny part of the image and stores color or intensity information.
- Images can be grayscale (single channel) or color (multiple channels).
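To make the matrix view concrete, here is a minimal sketch that assumes NumPy as the array library. The 3×4 image and its values are made up for illustration.

```python
import numpy as np

# A hypothetical 3x4 grayscale image: 3 rows (height), 4 columns (width).
# Each entry is one pixel's intensity, 0 = black, 255 = white.
tiny_gray = np.array(
    [
        [0, 64, 128, 255],
        [32, 96, 160, 224],
        [16, 80, 144, 208],
    ],
    dtype=np.uint8,
)

# Pixels are indexed as [row, column], i.e. [y, x], not [x, y].
pixel = tiny_gray[1, 2]

print(tiny_gray.shape)  # (3, 4)
print(int(pixel))       # 160
```

Note that the row index comes first, so the shape is (height, width) rather than (width, height).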
1.2 Pixel Representation
- Grayscale images are represented as a 2D matrix of intensity values.
- Color images are represented as a 3D array with dimensions (height × width × channels).
- Each channel corresponds to one component of the color (e.g., Red, Green, Blue).
For example:
- A grayscale image of size 256×256 pixels is represented as a 2D array with shape (256, 256).
- An RGB color image of size 256×256 pixels has shape (256, 256, 3).
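The shapes above can be checked with a short sketch. It assumes NumPy and builds blank images instead of loading files; the variable names are illustrative.

```python
import numpy as np

height, width = 256, 256

# Grayscale: one intensity value per pixel -> 2D array.
gray = np.zeros((height, width), dtype=np.uint8)

# RGB: three values per pixel -> 3D array (height, width, channels).
rgb = np.zeros((height, width, 3), dtype=np.uint8)

# Paint the top-left pixel pure red: channel 0 is R in RGB order.
rgb[0, 0] = (255, 0, 0)

# Selecting a single channel gives back a 2D array.
red_channel = rgb[:, :, 0]

print(gray.shape)         # (256, 256)
print(rgb.shape)          # (256, 256, 3)
print(red_channel.shape)  # (256, 256)
```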
1.3 Common Image Formats and Color Channels
Format | Number of Channels | Description |
---|---|---|
Grayscale | 1 | Single channel representing brightness (intensity). |
RGB | 3 | Red, Green, Blue channels. Standard color representation. |
RGBA | 4 | RGB + Alpha (transparency channel). |
BGR | 3 | Blue, Green, Red channels. Used as the default in OpenCV. |
HSV / Lab | 3 | Alternative color spaces useful for certain tasks. |
Important: The order of channels matters! OpenCV reads images in BGR order, whereas libraries like PIL and matplotlib use RGB.
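The effect of channel order can be seen without loading a file. The sketch below assumes NumPy and a hand-made 1×2 image. Reversing the last axis is one way to swap BGR and RGB, and for 3-channel images it yields the same values as a dedicated color conversion.

```python
import numpy as np

# A hypothetical 1x2 image stored in BGR order:
# the first pixel is pure blue, the second is pure red.
img_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reverse the last axis to reorder channels B,G,R -> R,G,B.
img_rgb = img_bgr[..., ::-1]

print(img_bgr[0, 0].tolist())  # [255, 0, 0]  (blue in BGR)
print(img_rgb[0, 0].tolist())  # [0, 0, 255]  (blue in RGB)
```

The sliced result is a view with a negative stride. If a function requires contiguous memory, wrapping it in `np.ascontiguousarray` makes a contiguous copy.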
1.4 Data Types in Images
- Images are stored with different numerical data types depending on the application:
Data Type | Description | Common Usage |
---|---|---|
uint8 | 8-bit unsigned integers, values 0–255 | Most common; standard image files and OpenCV default |
float32 | 32-bit floating point, typically normalized to 0.0–1.0 (sometimes kept in the 0–255 range) | Used in deep learning and image processing algorithms |
bool | Boolean values (True/False) for binary images | Masks, segmentation outputs |
int16, int32 | Signed integers | Depth maps, displacement fields |
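The choice of data type affects arithmetic. The sketch below assumes NumPy and made-up pixel values. It shows uint8 wraparound, a saturating alternative, and a round trip through normalized float32.

```python
import numpy as np

pixels_u8 = np.array([250, 100, 0], dtype=np.uint8)
ten = np.full_like(pixels_u8, 10)

# uint8 arithmetic wraps around modulo 256: 250 + 10 -> 4, not 260.
wrapped = pixels_u8 + ten

# Widening to int16 first, then clipping, saturates at 255 instead.
saturated = np.clip(pixels_u8.astype(np.int16) + 10, 0, 255).astype(np.uint8)

# Normalize to float32 in [0.0, 1.0], a common convention in deep learning.
pixels_f32 = pixels_u8.astype(np.float32) / 255.0

# Convert back to uint8, rounding to the nearest integer.
restored = np.round(pixels_f32 * 255.0).astype(np.uint8)

print(wrapped.tolist())    # [4, 110, 10]
print(saturated.tolist())  # [255, 110, 10]
print(restored.tolist())   # [250, 100, 0]
```

Silent wraparound is a frequent source of bugs when brightening or adding images, which is one reason intermediate results are often computed in a wider or floating-point type.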
1.5 Differences Between OpenCV and PIL Image Handling
Operation | OpenCV | PIL (Python Imaging Library) |
---|---|---|
Default color channel order | BGR | RGB |
Image representation | NumPy ndarray | PIL Image object |
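Reading an image | cv2.imread() returns a BGR ndarray (or None if the file cannot be read) | Image.open() returns a PIL Image |
Displaying images | cv2.imshow() followed by cv2.waitKey() | Image.show() or matplotlib |
Conversion to RGB for other libraries | Requires explicit channel swapping (BGR to RGB) | Not needed for images already in RGB mode |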
Converting OpenCV BGR to RGB:
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
Converting PIL Image to NumPy array:
np_img = np.array(pil_img)
1.6 Practical Exercises and Code Examples
Example 1: Load and Display an Image Correctly Using OpenCV and Matplotlib
import cv2
from matplotlib import pyplot as plt
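# Load image in BGR format (OpenCV default)
img_bgr = cv2.imread('your_image.jpg')
# cv2.imread returns None instead of raising if the file cannot be read
if img_bgr is None:
    raise FileNotFoundError("Could not read 'your_image.jpg'")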
# Convert BGR to RGB for correct color display in matplotlib
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
# Display image
plt.imshow(img_rgb)
plt.title("Image in RGB format")
plt.axis('off')
plt.show()
Example 2: Convert Color Image to Grayscale and Split/Merge Channels
# Reuses cv2 and img_bgr from Example 1
# Convert to grayscale (the result is a 2D array)
img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
# Split BGR channels
b, g, r = cv2.split(img_bgr)
# Merge channels back
merged_img = cv2.merge([b, g, r])
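Because OpenCV images are NumPy arrays, splitting and merging can also be expressed with plain indexing. The sketch below assumes NumPy only and uses a small random array as a stand-in for a loaded image.

```python
import numpy as np

# Hypothetical 2x2 BGR image with random values, standing in for a loaded image.
rng = np.random.default_rng(0)
img_bgr = rng.integers(0, 256, size=(2, 2, 3), dtype=np.uint8)

# "Split": index the last axis. Each result is a 2D (height, width) view.
b, g, r = img_bgr[:, :, 0], img_bgr[:, :, 1], img_bgr[:, :, 2]

# "Merge": stack the 2D planes along a new last axis.
merged_img = np.stack([b, g, r], axis=-1)

print(b.shape)                              # (2, 2)
print(np.array_equal(merged_img, img_bgr))  # True
```

Indexing returns views into the original array, so modifying `b` in place would also modify `img_bgr`. Call `.copy()` on a channel when an independent array is needed.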
1.7 Summary
- Digital images are arrays of pixel values, which can be single-channel (grayscale) or multi-channel (color).
- The color channel order differs between libraries (OpenCV uses BGR, PIL uses RGB).
- Images use different data types; uint8 is the most common.
- Properly converting between formats and understanding these basics is critical for image processing.
1.8 Suggested Exercises
- Load both grayscale and color images using OpenCV. Print their shapes and data types.
- Display a BGR image directly in matplotlib without conversion and observe the color distortion.
- Implement a function that manually converts an RGB image to grayscale using a weighted average method: Gray = 0.299 × R + 0.587 × G + 0.114 × B
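As a starting point, here is a minimal sketch of the weighted-average formula above. It assumes NumPy and an input in RGB channel order with uint8 values. The function name `rgb_to_gray` and the rounding behavior are illustrative choices, not a fixed convention.

```python
import numpy as np


def rgb_to_gray(img_rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) RGB uint8 image to an (H, W) uint8 grayscale image."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float64)
    # Weighted sum over the channel axis: 0.299*R + 0.587*G + 0.114*B.
    gray = img_rgb.astype(np.float64) @ weights
    # Round to the nearest integer and keep values in the valid uint8 range.
    return np.clip(np.round(gray), 0, 255).astype(np.uint8)


# Hypothetical 1x3 test image: pure red, pure green, white.
sample = np.array([[[255, 0, 0], [0, 255, 0], [255, 255, 255]]], dtype=np.uint8)
print(rgb_to_gray(sample).tolist())  # [[76, 150, 255]]
```

If the image was loaded with OpenCV, it is in BGR order, so the channels must be reordered (or the weights reversed) before applying this function.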