Module 3: I/O and GUI

Reading, writing, and displaying images and videos with OpenCV’s I/O and GUI facilities.

Topics Covered

  • Reading and writing images
  • Video capture and writing
  • Display windows and keyboard handling
  • Trackbars and mouse events
  • Drawing functions

Algorithm Explanations

1. Image Reading (imread)

What it does: Loads an image from disk into memory as a NumPy array.

Read Modes:

| Flag | Value | Description |
|------|-------|-------------|
| IMREAD_COLOR | 1 | Load as 3-channel BGR (default) |
| IMREAD_GRAYSCALE | 0 | Load as a single channel |
| IMREAD_UNCHANGED | -1 | Load as stored, keeping the alpha channel if present |

Decoding Process:

┌─────────────────────────────────────────────────────────────────────┐
│                     Image Reading Pipeline                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌─────────┐     ┌──────────┐     ┌────────────┐     ┌──────────┐ │
│   │  File   │────▶│ Decoder  │────▶│ Raw Pixels │────▶│  NumPy   │ │
│   │ (disk)  │     │(JPEG/PNG)│     │  (BGR)     │     │  Array   │ │
│   └─────────┘     └──────────┘     └────────────┘     └──────────┘ │
│                                                                     │
│   photo.jpg        Decompress      [B,G,R,B,G,R,    shape: (H,W,3) │
│   image.png        & Decode         B,G,R,B,G,R]    dtype: uint8   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Read Mode Comparison:

Original Image (RGB with Alpha)
┌─────────────────────────────────────┐
│  R   G   B   A   R   G   B   A  ... │  4 channels
└─────────────────────────────────────┘
           │
           ▼
┌──────────┬──────────────┬──────────────────────┐
│ Mode     │ Result       │ Shape                │
├──────────┼──────────────┼──────────────────────┤
│ COLOR    │ B,G,R,B,G,R  │ (H, W, 3) - No alpha │
│ GRAY     │ L,L,L,L,L,L  │ (H, W) - Luminance   │
│ UNCHANGED│ B,G,R,A,...  │ (H, W, 4) - All      │
└──────────┴──────────────┴──────────────────────┘

Important Notes:

  • Returns None if the file cannot be read; no exception is raised, so check the result before using it
  • OpenCV supports: JPEG, PNG, BMP, TIFF, WebP, PBM/PGM/PPM

2. Image Writing (imwrite)

What it does: Saves an image to disk. The output format is chosen from the file extension.

Encoding Process:

┌─────────────────────────────────────────────────────────────────────┐
│                     Image Writing Pipeline                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌──────────┐     ┌──────────┐     ┌────────────┐     ┌─────────┐ │
│   │  NumPy   │────▶│ Encoder  │────▶│ Compressed │────▶│  File   │ │
│   │  Array   │     │          │     │    Data    │     │ (disk)  │ │
│   └──────────┘     └──────────┘     └────────────┘     └─────────┘ │
│                                                                     │
│   (H, W, 3)         JPEG/PNG/        Binary blob     output.jpg    │
│   uint8 BGR         WebP codec                        image.png    │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Quality vs File Size Trade-off:

JPEG Quality Parameter (0-100)

    File Size                    Quality
       ▲                           ▲
       │    ╱                      │          ╱
       │   ╱                       │        ╱
       │  ╱                        │      ╱
       │ ╱                         │    ╱
       │╱                          │  ╱
       └────────────────▶          └────────────────▶
        0    50    100             0    50    100
          Quality                     Quality

    Low quality = Small file      Low quality = Artifacts
    High quality = Large file     High quality = Sharp

Format-Specific Parameters:

| Format | Parameter | Range | Description |
|--------|-----------|-------|-------------|
| JPEG | IMWRITE_JPEG_QUALITY | 0-100 | Quality (higher = better, larger) |
| PNG | IMWRITE_PNG_COMPRESSION | 0-9 | Compression level |
| WebP | IMWRITE_WEBP_QUALITY | 1-100 | Quality factor |

Example:

cv2.imwrite("output.jpg", img, [cv2.IMWRITE_JPEG_QUALITY, 95])

3. In-Memory Encoding/Decoding

Encoding to bytes (for network transmission):

success, buffer = cv2.imencode('.jpg', image, params)
# params is an optional list, e.g. [cv2.IMWRITE_JPEG_QUALITY, 90]
# buffer is a NumPy uint8 array holding the encoded bytes

Decoding from bytes:

image = cv2.imdecode(buffer, cv2.IMREAD_COLOR)

4. Video Capture

VideoCapture handles reading from:

  • Video files (MP4, AVI, etc.)
  • Camera devices (webcam)
  • Network streams (RTSP, HTTP)

Video Source Types:

┌─────────────────────────────────────────────────────────────────────┐
│                     Video Capture Sources                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌───────────────┐                                                 │
│   │  Video File   │──────┐                                          │
│   │ "video.mp4"   │      │                                          │
│   └───────────────┘      │      ┌─────────────────┐                 │
│                          ├─────▶│  VideoCapture   │────▶ Frames    │
│   ┌───────────────┐      │      │                 │                 │
│   │    Camera     │──────┤      └─────────────────┘                 │
│   │   device: 0   │      │                                          │
│   └───────────────┘      │                                          │
│                          │                                          │
│   ┌───────────────┐      │                                          │
│   │ Network Stream│──────┘                                          │
│   │"rtsp://..."   │                                                 │
│   └───────────────┘                                                 │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Frame Reading Loop:

┌───────────────────────────────────────────────────────────────┐
│                    Video Capture Loop                         │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│    cap = VideoCapture(source)                                 │
│             │                                                 │
│             ▼                                                 │
│    ┌────────────────────┐                                     │
│    │   cap.isOpened()?  │◀────────────────────────┐          │
│    └────────┬───────────┘                         │          │
│         Yes │       No                            │          │
│             ▼        ▼                            │          │
│    ┌────────────┐  Exit                           │          │
│    │ ret, frame │                                 │          │
│    │ = cap.read │                                 │          │
│    └────────┬───┘                                 │          │
│             ▼                                     │          │
│    ┌─────────────┐                                │          │
│    │  ret True?  │                                │          │
│    └──────┬──────┘                                │          │
│       Yes │    No                                 │          │
│           ▼     ▼                                 │          │
│    ┌──────────┐  Break                            │          │
│    │ Process  │                                   │          │
│    │  Frame   │───────────────────────────────────┘          │
│    └──────────┘                                              │
│                                                               │
│    cap.release()                                              │
│                                                               │
└───────────────────────────────────────────────────────────────┘

import cv2

cap = cv2.VideoCapture(source)  # source: file path, camera index, or stream URL
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:  # end of stream or read failure
        break
    # Process frame
cap.release()

Video Properties:

| Property | ID | Description |
|----------|----|-------------|
| CAP_PROP_FRAME_WIDTH | 3 | Frame width |
| CAP_PROP_FRAME_HEIGHT | 4 | Frame height |
| CAP_PROP_FPS | 5 | Frames per second |
| CAP_PROP_FRAME_COUNT | 7 | Total frames |
| CAP_PROP_POS_FRAMES | 1 | Current frame position |


5. Video Writing

VideoWriter encodes frames and saves them to a video file.

Video Writing Pipeline:

┌─────────────────────────────────────────────────────────────────────┐
│                     Video Writing Pipeline                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌──────────┐     ┌──────────┐     ┌────────────┐     ┌─────────┐ │
│   │  Frames  │────▶│  Codec   │────▶│ Container  │────▶│  File   │ │
│   │ (BGR)    │     │ (XVID)   │     │  (AVI)     │     │ (disk)  │ │
│   └──────────┘     └──────────┘     └────────────┘     └─────────┘ │
│                                                                     │
│   NumPy arrays     FourCC code      Video format     output.avi    │
│   (H, W, 3)        compression      + metadata                      │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

FourCC (Four Character Code):

FourCC = 4 ASCII characters identifying the codec

  ┌───┬───┬───┬───┐
  │ X │ V │ I │ D │  = XVID codec (MPEG-4)
  └───┴───┴───┴───┘

  ┌───┬───┬───┬───┐
  │ m │ p │ 4 │ v │  = MPEG-4 Part 2
  └───┴───┴───┴───┘

  ┌───┬───┬───┬───┐
  │ M │ J │ P │ G │  = Motion JPEG
  └───┴───┴───┴───┘

FourCC Codec Codes:

| Code | Format | Description |
|------|--------|-------------|
| XVID | AVI | MPEG-4 codec |
| mp4v | MP4 | MPEG-4 Part 2 |
| MJPG | AVI | Motion JPEG |
| X264 | MP4 | H.264 codec |

Example:

fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 30.0, (640, 480))  # fps, then (width, height)
out.write(frame)  # frame should be BGR and match the (640, 480) size
out.release()

Frame Rate Timing:

30 FPS Video Playback

    Frame 1    Frame 2    Frame 3    Frame 4
       │          │          │          │
       ▼          ▼          ▼          ▼
    ───┬──────────┬──────────┬──────────┬───────▶ Time
       │   33ms   │   33ms   │   33ms   │
       │◀────────▶│◀────────▶│◀────────▶│

    Wait time = 1000 / FPS = 1000 / 30 ≈ 33 ms
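
The wait-time formula translates directly into a small helper whose result can be passed to `cv2.waitKey`. The function name `frame_delay_ms` and the 30 FPS fallback are illustrative assumptions; the fallback covers sources that report a frame rate of 0.

```python
def frame_delay_ms(fps: float, fallback_fps: float = 30.0) -> int:
    """Return the per-frame wait in milliseconds for a given frame rate."""
    if fps <= 0:  # some sources report 0 when the rate is unknown
        fps = fallback_fps
    # Never return 0: cv2.waitKey(0) would block until a key is pressed.
    return max(1, round(1000.0 / fps))


print(frame_delay_ms(30))  # 33
print(frame_delay_ms(25))  # 40
```

This ignores the time spent decoding and processing each frame, so real playback will run slightly slower than the nominal rate.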

6. GUI Windows

Window Creation:

cv2.namedWindow('window', cv2.WINDOW_NORMAL)    # User can resize the window
cv2.namedWindow('window', cv2.WINDOW_AUTOSIZE)  # Window fits the image; not user-resizable

Keyboard Input:

key = cv2.waitKey(delay) & 0xFF
# delay: 0 (or negative) = wait indefinitely, >0 = wait that many milliseconds
# & 0xFF: keep only the low 8 bits, since some platforms set higher bits in the return value

Common Key Codes:

| Key | ASCII | Usage |
|-----|-------|-------|
| ESC | 27 | Exit |
| Space | 32 | Pause |
| Enter | 13 | Confirm |
| 'q' | 113 | Quit |


7. Trackbars

What it does: Creates interactive sliders for parameter adjustment.

Trackbar Visualization:

┌─────────────────────────────────────────────────────────────────────┐
│  Window: "Controls"                                                 │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Brightness ├────────────────●───────────────────┤ 150 / 255      │
│                              ▲                                      │
│                              │                                      │
│                         Slider position                             │
│                                                                     │
│   Contrast   ├───●────────────────────────────────┤  30 / 100      │
│                                                                     │
│   Threshold  ├──────────────────────────────●─────┤ 200 / 255      │
│                                                                     │
├─────────────────────────────────────────────────────────────────────┤
│                     [Image Display Area]                            │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Callback Pattern:

                    User moves slider
                          │
                          ▼
              ┌───────────────────────┐
              │   Callback Function   │
              │   on_change(value)    │
              └───────────┬───────────┘
                          │
                          ▼
              ┌───────────────────────┐
              │  Update image based   │
              │  on new value         │
              └───────────────────────┘

def on_change(value):
    # Called when slider moves
    pass

cv2.createTrackbar('name', 'window', initial, max_val, on_change)
current = cv2.getTrackbarPos('name', 'window')

8. Mouse Events

Mouse Interaction Flow:

┌─────────────────────────────────────────────────────────────────────┐
│                      Mouse Event Flow                               │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   User Action                     Event Generated                   │
│   ───────────                     ───────────────                   │
│                                                                     │
│   🖱️ Move mouse         ───────▶  EVENT_MOUSEMOVE                   │
│                                                                     │
│   🖱️ Left click         ───────▶  EVENT_LBUTTONDOWN                 │
│                                   EVENT_LBUTTONUP                   │
│                                                                     │
│   🖱️ Double-click       ───────▶  EVENT_LBUTTONDBLCLK               │
│                                                                     │
│   🖱️ Right click        ───────▶  EVENT_RBUTTONDOWN                 │
│                                                                     │
│   🖱️ Scroll wheel       ───────▶  EVENT_MOUSEWHEEL                  │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Event Types:

| Event | Description |
|-------|-------------|
| EVENT_MOUSEMOVE | Mouse moved |
| EVENT_LBUTTONDOWN | Left button pressed |
| EVENT_LBUTTONUP | Left button released |
| EVENT_RBUTTONDOWN | Right button pressed |
| EVENT_LBUTTONDBLCLK | Left double-click |
| EVENT_MOUSEWHEEL | Scroll wheel |

Callback Receives Coordinates:

┌───────────────────────────────────────────┐
│ Window                                    │
│  (0,0)───────────────────────────────▶ x  │
│    │                                      │
│    │         ┌─────────┐                  │
│    │         │  Click  │                  │
│    │         │ (125,80)│                  │
│    │         └─────────┘                  │
│    │                                      │
│    ▼                                      │
│    y                                      │
└───────────────────────────────────────────┘

def mouse_callback(event, x, y, flags, param):
                        │  │
                        │  └── y coordinate
                        └───── x coordinate

Callback Pattern:

def mouse_callback(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        print(f"Left click at ({x}, {y})")

cv2.setMouseCallback('window', mouse_callback)

9. Drawing Functions

Coordinate System:

(0,0) ────────────→ x
  │
  │
  │
  ↓
  y

Drawing Primitives:

┌─────────────────────────────────────────────────────────────────────┐
│                     Drawing Functions                               │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   LINE                   RECTANGLE              CIRCLE              │
│                                                                     │
│   pt1 ●                  pt1 ●─────────┐        ┌───────┐          │
│        ╲                     │         │       ╱    ●    ╲         │
│         ╲                    │         │      │  center   │         │
│          ╲                   │         │      │  ●───────│radius    │
│           ● pt2              └─────────● pt2   ╲         ╱         │
│                                                  └───────┘          │
│                                                                     │
│   ELLIPSE                POLYLINES              TEXT                │
│                                                                     │
│       ╭──────╮           ●─────●               ┌─────────────┐     │
│      ╱        ╲           ╲   ╱                │ Hello World │     │
│     │   axes   │           ╲ ╱                 └─────────────┘     │
│      ╲  a × b ╱             ●                    ▲                  │
│       ╰──────╯               ╲                   │ org (x,y)        │
│                               ●                  └──────────        │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Line Drawing (Bresenham’s Algorithm):

cv2.line(img, pt1, pt2, color, thickness, lineType)

Line Types Comparison:

LINE_4 (4-connected)      LINE_8 (8-connected)      LINE_AA (Anti-aliased)

  ■                         ■                         ░▒
  ■                          ■                        ░▓█
  ■■                          ■                      ░▓█
   ■                           ■                    ░▓█
   ■■                           ■                  ░▓█
    ■                            ■                ▒▓█

  Blocky, strict           Diagonal allowed       Smooth edges
  horizontal/vertical      (default)              (slower)

| Type | Description |
|------|-------------|
| LINE_8 | 8-connected (default) |
| LINE_4 | 4-connected |
| LINE_AA | Anti-aliased |

Circle Drawing (Midpoint Algorithm):

cv2.circle(img, center, radius, color, thickness)
# thickness = -1 for filled

Thickness Parameter:

thickness = 1              thickness = 3              thickness = -1
  ┌───────┐                  ┌───────┐                  ┌───────┐
  │ ○     │                  │ ◉     │                  │ ●     │
  │       │                  │       │                  │       │
  └───────┘                  └───────┘                  └───────┘
   Outline                   Thick outline              Filled

Rectangle:

cv2.rectangle(img, pt1, pt2, color, thickness)

pt1 (x1, y1) ────────────┐
     │                   │
     │                   │
     │                   │
     └───────────────────● pt2 (x2, y2)

Ellipse (parametric form):

x = center_x + a × cos(θ)
y = center_y + b × sin(θ)

        a (semi-major axis)
       ◀────────▶
      ╭──────────╮  ▲
     ╱            ╲ │ b (semi-minor axis)
    │      ●       │▼
     ╲   center   ╱
      ╰──────────╯

Text Rendering:

cv2.putText(img, text, org, fontFace, fontScale, color, thickness)

Available Fonts:

  • FONT_HERSHEY_SIMPLEX
  • FONT_HERSHEY_DUPLEX
  • FONT_HERSHEY_COMPLEX
  • FONT_HERSHEY_TRIPLEX
  • FONT_HERSHEY_SCRIPT_*

Tutorial Files

| File | Description |
|------|-------------|
| 01_image_io.py | imread, imwrite, formats, encoding |
| 02_video_io.py | VideoCapture, VideoWriter, camera input |
| 03_gui_basics.py | Windows, keyboard, trackbars, mouse events, drawing |

Key Functions Reference

| Function | Description |
|----------|-------------|
| cv2.imread(path, flags) | Load image |
| cv2.imwrite(path, img, params) | Save image |
| cv2.imencode(ext, img) | Encode to buffer |
| cv2.imdecode(buf, flags) | Decode from buffer |
| cv2.VideoCapture(src) | Open video/camera |
| cv2.VideoWriter(...) | Create video writer |
| cv2.namedWindow(name, flags) | Create window |
| cv2.imshow(name, img) | Display image |
| cv2.waitKey(delay) | Wait for key |
| cv2.destroyAllWindows() | Close all windows |
| cv2.createTrackbar(...) | Create slider |
| cv2.setMouseCallback(...) | Set mouse handler |
| cv2.line(...) | Draw line |
| cv2.rectangle(...) | Draw rectangle |
| cv2.circle(...) | Draw circle |
| cv2.putText(...) | Draw text |

Further Reading