Module 3: I/O and GUI
Reading, writing, and displaying images and videos with OpenCV’s I/O and GUI facilities.
Topics Covered
- Reading and writing images
- Video capture and writing
- Display windows and keyboard handling
- Trackbars and mouse events
- Drawing functions
Algorithm Explanations
1. Image Reading (imread)
What it does: Loads an image from disk into memory as a NumPy array.
Read Modes:
| Flag | Value | Description |
|---|---|---|
| IMREAD_COLOR | 1 | Load as 3-channel BGR (default) |
| IMREAD_GRAYSCALE | 0 | Load as single channel |
| IMREAD_UNCHANGED | -1 | Load as stored, keeping the alpha channel if present |
Decoding Process:
┌─────────────────────────────────────────────────────────────────────┐
│ Image Reading Pipeline │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ ┌──────────┐ ┌────────────┐ ┌──────────┐ │
│ │ File │────▶│ Decoder │────▶│ Raw Pixels │────▶│ NumPy │ │
│ │ (disk) │ │(JPEG/PNG)│ │ (BGR) │ │ Array │ │
│ └─────────┘ └──────────┘ └────────────┘ └──────────┘ │
│ │
│ photo.jpg Decompress [B,G,R,B,G,R, shape: (H,W,3) │
│ image.png & Decode B,G,R,B,G,R] dtype: uint8 │
│ │
└─────────────────────────────────────────────────────────────────────┘
Read Mode Comparison:
Original Image (RGB with Alpha)
┌─────────────────────────────────────┐
│ R G B A R G B A ... │ 4 channels
└─────────────────────────────────────┘
│
▼
┌──────────┬──────────────┬──────────────────────┐
│ Mode │ Result │ Shape │
├──────────┼──────────────┼──────────────────────┤
│ COLOR │ B,G,R,B,G,R │ (H, W, 3) - No alpha │
│ GRAY │ L,L,L,L,L,L │ (H, W) - Luminance │
│ UNCHANGED│ B,G,R,A,... │ (H, W, 4) - All │
└──────────┴──────────────┴──────────────────────┘
Important Notes:
- Returns None if the file cannot be read
- OpenCV supports: JPEG, PNG, BMP, TIFF, WebP, PBM/PGM/PPM
2. Image Writing (imwrite)
What it does: Saves an image to disk; the output format is chosen from the file extension.
Encoding Process:
┌─────────────────────────────────────────────────────────────────────┐
│ Image Writing Pipeline │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌──────────┐ ┌────────────┐ ┌─────────┐ │
│ │ NumPy │────▶│ Encoder │────▶│ Compressed │────▶│ File │ │
│ │ Array │ │ │ │ Data │ │ (disk) │ │
│ └──────────┘ └──────────┘ └────────────┘ └─────────┘ │
│ │
│ (H, W, 3) JPEG/PNG/ Binary blob output.jpg │
│ uint8 BGR WebP codec image.png │
│ │
└─────────────────────────────────────────────────────────────────────┘
Quality vs File Size Trade-off:
JPEG Quality Parameter (0-100)
File Size Quality
▲ ▲
│ ╱ │ ╱
│ ╱ │ ╱
│ ╱ │ ╱
│ ╱ │ ╱
│╱ │ ╱
└────────────────▶ └────────────────▶
0 50 100 0 50 100
Quality Quality
Low quality = Small file Low quality = Artifacts
High quality = Large file High quality = Sharp
Format-Specific Parameters:
| Format | Parameter | Range | Description |
|---|---|---|---|
| JPEG | IMWRITE_JPEG_QUALITY | 0-100 | Quality (higher = better, larger) |
| PNG | IMWRITE_PNG_COMPRESSION | 0-9 | Compression level |
| WebP | IMWRITE_WEBP_QUALITY | 1-100 | Quality factor |
Example:
cv2.imwrite("output.jpg", img, [cv2.IMWRITE_JPEG_QUALITY, 95])
3. In-Memory Encoding/Decoding
Encoding to bytes (for network transmission):
success, buffer = cv2.imencode('.jpg', image, params)
# buffer is a NumPy array of bytes
Decoding from bytes:
image = cv2.imdecode(buffer, cv2.IMREAD_COLOR)
4. Video Capture
VideoCapture handles reading from:
- Video files (MP4, AVI, etc.)
- Camera devices (webcam)
- Network streams (RTSP, HTTP)
Video Source Types:
┌─────────────────────────────────────────────────────────────────────┐
│ Video Capture Sources │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────┐ │
│ │ Video File │──────┐ │
│ │ "video.mp4" │ │ │
│ └───────────────┘ │ ┌─────────────────┐ │
│ ├─────▶│ VideoCapture │────▶ Frames │
│ ┌───────────────┐ │ │ │ │
│ │ Camera │──────┤ └─────────────────┘ │
│ │ device: 0 │ │ │
│ └───────────────┘ │ │
│ │ │
│ ┌───────────────┐ │ │
│ │ Network Stream│──────┘ │
│ │"rtsp://..." │ │
│ └───────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
Frame Reading Loop:
┌───────────────────────────────────────────────────────────────┐
│ Video Capture Loop │
├───────────────────────────────────────────────────────────────┤
│ │
│ cap = VideoCapture(source) │
│ │ │
│ ▼ │
│ ┌────────────────────┐ │
│ │ cap.isOpened()? │◀────────────────────────┐ │
│ └────────┬───────────┘ │ │
│ Yes │ No │ │
│ ▼ ▼ │ │
│ ┌────────────┐ Exit │ │
│ │ ret, frame │ │ │
│ │ = cap.read │ │ │
│ └────────┬───┘ │ │
│ ▼ │ │
│ ┌─────────────┐ │ │
│ │ ret True? │ │ │
│ └──────┬──────┘ │ │
│ Yes │ No │ │
│ ▼ ▼ │ │
│ ┌──────────┐ Break │ │
│ │ Process │ │ │
│ │ Frame │───────────────────────────────────┘ │
│ └──────────┘ │
│ │
│ cap.release() │
│ │
└───────────────────────────────────────────────────────────────┘
cap = cv2.VideoCapture(source)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Process frame
cap.release()
Video Properties:
| Property | ID | Description |
|---|---|---|
| CAP_PROP_FRAME_WIDTH | 3 | Frame width |
| CAP_PROP_FRAME_HEIGHT | 4 | Frame height |
| CAP_PROP_FPS | 5 | Frames per second |
| CAP_PROP_FRAME_COUNT | 7 | Total frames |
| CAP_PROP_POS_FRAMES | 1 | Current frame position |
5. Video Writing
VideoWriter encodes frames with a codec and saves them to a video file.
Video Writing Pipeline:
┌─────────────────────────────────────────────────────────────────────┐
│ Video Writing Pipeline │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌──────────┐ ┌────────────┐ ┌─────────┐ │
│ │ Frames │────▶│ Codec │────▶│ Container │────▶│ File │ │
│ │ (BGR) │ │ (XVID) │ │ (AVI) │ │ (disk) │ │
│ └──────────┘ └──────────┘ └────────────┘ └─────────┘ │
│ │
│ NumPy arrays FourCC code Video format output.avi │
│ (H, W, 3) compression + metadata │
│ │
└─────────────────────────────────────────────────────────────────────┘
FourCC (Four Character Code):
FourCC = 4 ASCII characters identifying the codec
┌───┬───┬───┬───┐
│ X │ V │ I │ D │ = XVID codec (MPEG-4)
└───┴───┴───┴───┘
┌───┬───┬───┬───┐
│ m │ p │ 4 │ v │ = MPEG-4 Part 2
└───┴───┴───┴───┘
┌───┬───┬───┬───┐
│ M │ J │ P │ G │ = Motion JPEG
└───┴───┴───┴───┘
FourCC Codec Codes:
| Code | Format | Description |
|---|---|---|
| XVID | AVI | MPEG-4 codec |
| mp4v | MP4 | MPEG-4 Part 2 |
| MJPG | AVI | Motion JPEG |
| X264 | MP4 | H.264 codec |
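A FourCC is stored as one 32-bit integer with the first character in the lowest byte. The pure-Python sketch below illustrates that packing; `pack_fourcc` and `unpack_fourcc` are hypothetical helper names, not OpenCV functions, and are meant to mirror what `cv2.VideoWriter_fourcc(*'XVID')` computes.

```python
def pack_fourcc(code: str) -> int:
    """Pack four ASCII characters into one integer, first character in the lowest byte."""
    if len(code) != 4:
        raise ValueError("a FourCC needs exactly four characters")
    value = 0
    for position, char in enumerate(code):
        value |= (ord(char) & 0xFF) << (8 * position)
    return value


def unpack_fourcc(value: int) -> str:
    """Recover the four characters from a packed FourCC integer."""
    return "".join(chr((value >> (8 * position)) & 0xFF) for position in range(4))


xvid = pack_fourcc("XVID")
print(xvid, hex(xvid), unpack_fourcc(xvid))
```

The unpacking direction is handy for inspecting an existing file: the value returned by `cap.get(cv2.CAP_PROP_FOURCC)`, converted to int, can be turned back into its four-character name this way.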
Example:
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 30.0, (640, 480))
out.write(frame)
out.release()
Frame Rate Timing:
30 FPS Video Playback
Frame 1 Frame 2 Frame 3 Frame 4
│ │ │ │
▼ ▼ ▼ ▼
───┬──────────┬──────────┬──────────┬───────▶ Time
│ 33ms │ 33ms │ 33ms │
│◀────────▶│◀────────▶│◀────────▶│
Wait time = 1000 / FPS = 1000 / 30 ≈ 33 ms
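The wait-time formula can be wrapped in a small helper. `frame_delay_ms` is a hypothetical name, and the fallback for a zero or negative FPS is an assumption added here because some sources report 0 when the rate is unknown.

```python
def frame_delay_ms(fps: float, fallback_fps: float = 30.0) -> int:
    """Return the per-frame wait time in whole milliseconds (at least 1)."""
    if fps <= 0:            # assumption: treat a missing or zero FPS as the fallback
        fps = fallback_fps
    return max(1, round(1000 / fps))


for fps in (24, 30, 60):
    print(fps, "FPS ->", frame_delay_ms(fps), "ms")
```

The result is clamped to at least 1 because a delay of 0 passed to cv2.waitKey means "wait forever". Actual playback will run somewhat slower than the nominal rate, since decoding and processing time is added on top of the wait.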
6. GUI Windows
Window Creation:
cv2.namedWindow('window', cv2.WINDOW_NORMAL) # Resizable
cv2.namedWindow('window', cv2.WINDOW_AUTOSIZE) # Fixed size
Keyboard Input:
key = cv2.waitKey(delay) & 0xFF
# delay: 0 = wait forever, >0 = wait milliseconds
# & 0xFF: keeps only the low 8 bits, since some platforms set higher bits in the return value
Common Key Codes:
| Key | ASCII | Usage |
|---|---|---|
| ESC | 27 | Exit |
| Space | 32 | Pause |
| Enter | 13 | Confirm |
| 'q' | 113 | Quit |
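The key codes above can be mapped to actions with a small lookup. This sketch keeps the decoding separate from the GUI call so it can run without a display; `KEY_ACTIONS` and `decode_key` are hypothetical names, and the action strings simply follow the table.

```python
KEY_ACTIONS = {27: "exit", 32: "pause", 13: "confirm", ord("q"): "quit"}


def decode_key(raw_key: int):
    """Map a raw cv2.waitKey() return value to an action name, or None."""
    if raw_key < 0:    # waitKey returns -1 when the delay expires with no key pressed
        return None
    return KEY_ACTIONS.get(raw_key & 0xFF)   # keep only the low 8 bits


# 0x100071 simulates a platform that sets extra high bits on the 'q' key.
print(decode_key(-1), decode_key(27), decode_key(0x100071), decode_key(ord("x")))
```

In a display loop this would be used as `action = decode_key(cv2.waitKey(30))`. Checking for a negative value before masking matters, because -1 & 0xFF evaluates to 255 and would otherwise look like a key press.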
7. Trackbars
What it does: Creates interactive sliders for parameter adjustment.
Trackbar Visualization:
┌─────────────────────────────────────────────────────────────────────┐
│ Window: "Controls" │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Brightness ├────────────────●───────────────────┤ 150 / 255 │
│ ▲ │
│ │ │
│ Slider position │
│ │
│ Contrast ├───●────────────────────────────────┤ 30 / 100 │
│ │
│ Threshold ├──────────────────────────────●─────┤ 200 / 255 │
│ │
├─────────────────────────────────────────────────────────────────────┤
│ [Image Display Area] │
│ │
└─────────────────────────────────────────────────────────────────────┘
Callback Pattern:
User moves slider
│
▼
┌───────────────────────┐
│ Callback Function │
│ on_change(value) │
└───────────┬───────────┘
│
▼
┌───────────────────────┐
│ Update image based │
│ on new value │
└───────────────────────┘
def on_change(value):
    # Called when slider moves
    pass
cv2.createTrackbar('name', 'window', initial, max_val, on_change)
current = cv2.getTrackbarPos('name', 'window')
8. Mouse Events
Mouse Interaction Flow:
┌─────────────────────────────────────────────────────────────────────┐
│ Mouse Event Flow │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ User Action Event Generated │
│ ─────────── ─────────────── │
│ │
│ 🖱️ Move mouse ───────▶ EVENT_MOUSEMOVE │
│ │
│ 🖱️ Left click ───────▶ EVENT_LBUTTONDOWN │
│ EVENT_LBUTTONUP │
│ │
│ 🖱️ Double-click ───────▶ EVENT_LBUTTONDBLCLK │
│ │
│ 🖱️ Right click ───────▶ EVENT_RBUTTONDOWN │
│ │
│ 🖱️ Scroll wheel ───────▶ EVENT_MOUSEWHEEL │
│ │
└─────────────────────────────────────────────────────────────────────┘
Event Types:
| Event | Description |
|---|---|
| EVENT_MOUSEMOVE | Mouse moved |
| EVENT_LBUTTONDOWN | Left button pressed |
| EVENT_LBUTTONUP | Left button released |
| EVENT_RBUTTONDOWN | Right button pressed |
| EVENT_LBUTTONDBLCLK | Left double-click |
| EVENT_MOUSEWHEEL | Scroll wheel |
Callback Receives Coordinates:
┌───────────────────────────────────────────┐
│ Window │
│ (0,0)───────────────────────────────▶ x │
│ │ │
│ │ ┌─────────┐ │
│ │ │ Click │ │
│ │ │ (125,80)│ │
│ │ └─────────┘ │
│ │ │
│ ▼ │
│ y │
└───────────────────────────────────────────┘
def mouse_callback(event, x, y, flags, param):
│ │
│ └── y coordinate
└───── x coordinate
Callback Pattern:
def mouse_callback(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        print(f"Left click at ({x}, {y})")
cv2.setMouseCallback('window', mouse_callback)
9. Drawing Functions
Coordinate System:
(0,0) ────────────→ x
│
│
│
↓
y
Drawing Primitives:
┌─────────────────────────────────────────────────────────────────────┐
│ Drawing Functions │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ LINE RECTANGLE CIRCLE │
│ │
│ pt1 ● pt1 ●─────────┐ ┌───────┐ │
│ ╲ │ │ ╱ ● ╲ │
│ ╲ │ │ │ center │ │
│ ╲ │ │ │ ●───────│radius │
│ ● pt2 └─────────● pt2 ╲ ╱ │
│ └───────┘ │
│ │
│ ELLIPSE POLYLINES TEXT │
│ │
│ ╭──────╮ ●─────● ┌─────────────┐ │
│ ╱ ╲ ╲ ╱ │ Hello World │ │
│ │ axes │ ╲ ╱ └─────────────┘ │
│ ╲ a × b ╱ ● ▲ │
│ ╰──────╯ ╲ │ org (x,y) │
│ ● └────────── │
│ │
└─────────────────────────────────────────────────────────────────────┘
Line Drawing (Bresenham’s Algorithm):
cv2.line(img, pt1, pt2, color, thickness, lineType)
Line Types Comparison:
LINE_4 (4-connected) LINE_8 (8-connected) LINE_AA (Anti-aliased)
■ ■ ░▒
■ ■ ░▓█
■■ ■ ░▓█
■ ■ ░▓█
■■ ■ ░▓█
■ ■ ▒▓█
Blocky, strict Diagonal allowed Smooth edges
horizontal/vertical (default) (slower)
| Type | Description |
|---|---|
| LINE_8 | 8-connected (default) |
| LINE_4 | 4-connected |
| LINE_AA | Anti-aliased |
Circle Drawing (Midpoint Algorithm):
cv2.circle(img, center, radius, color, thickness)
# thickness = -1 for filled
Thickness Parameter:
thickness = 1 thickness = 3 thickness = -1
┌───────┐ ┌───────┐ ┌───────┐
│ ○ │ │ ◉ │ │ ● │
│ │ │ │ │ │
└───────┘ └───────┘ └───────┘
Outline Thick outline Filled
Rectangle:
cv2.rectangle(img, pt1, pt2, color, thickness)
pt1 (x1, y1) ────────────┐
│ │
│ │
│ │
└───────────────────● pt2 (x2, y2)
Ellipse (parametric form):
x = center_x + a × cos(θ)
y = center_y + b × sin(θ)
a (semi-major axis)
◀────────▶
╭──────────╮ ▲
╱ ╲ │ b (semi-minor axis)
│ ● │▼
╲ center ╱
╰──────────╯
Text Rendering:
cv2.putText(img, text, org, fontFace, fontScale, color, thickness)
Available Fonts:
- FONT_HERSHEY_SIMPLEX
- FONT_HERSHEY_DUPLEX
- FONT_HERSHEY_COMPLEX
- FONT_HERSHEY_TRIPLEX
- FONT_HERSHEY_SCRIPT_*
Tutorial Files
| File | Description |
|---|---|
| 01_image_io.py | imread, imwrite, formats, encoding |
| 02_video_io.py | VideoCapture, VideoWriter, camera input |
| 03_gui_basics.py | Windows, keyboard, trackbars, mouse events, drawing |
Key Functions Reference
| Function | Description |
|---|---|
| cv2.imread(path, flags) | Load image |
| cv2.imwrite(path, img, params) | Save image |
| cv2.imencode(ext, img) | Encode to buffer |
| cv2.imdecode(buf, flags) | Decode from buffer |
| cv2.VideoCapture(src) | Open video/camera |
| cv2.VideoWriter(...) | Create video writer |
| cv2.namedWindow(name, flags) | Create window |
| cv2.imshow(name, img) | Display image |
| cv2.waitKey(delay) | Wait for key |
| cv2.destroyAllWindows() | Close all windows |
| cv2.createTrackbar(...) | Create slider |
| cv2.setMouseCallback(...) | Set mouse handler |
| cv2.line(...) | Draw line |
| cv2.rectangle(...) | Draw rectangle |
| cv2.circle(...) | Draw circle |
| cv2.putText(...) | Draw text |