Document Scanner

Scan documents using edge detection and perspective transformation.

Overview

Transform photos of documents into clean, flat scans, similar to the output of a mobile scanning app.

Key Techniques:

Canny edge detection
Contour detection and approximation
Perspective transformation
Adaptive thresholding

How It Works

Input Image → Edge Detection → Find Document → Perspective Warp → Output
     ↓              ↓               ↓                ↓
 [Photo of    [Canny edges]  [4-corner        [Flattened
  document]                   contour]         document]

Pipeline Steps

Preprocessing: Convert to grayscale, apply Gaussian blur
Edge Detection: Run the Canny edge detector on the blurred image
Contour Finding: Find largest 4-sided contour (the document)
Corner Ordering: Sort corners to top-left, top-right, bottom-right, bottom-left
Perspective Transform: Warp to bird’s-eye view
Enhancement: Adaptive threshold for clean text
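
The corner-ordering step is the one part of the pipeline that has no dedicated OpenCV call, so a minimal sketch may help. It uses a common sum/difference heuristic, which is an assumption here and not necessarily what the application does; the helper names `order_corners` and `output_size` are hypothetical. The heuristic can mislabel corners when the document is rotated close to 45 degrees.

```python
import numpy as np

def order_corners(pts):
    """Return corners as [top-left, top-right, bottom-right, bottom-left].

    Hypothetical helper: assumes image coordinates (x right, y down)
    and a document that is not rotated close to 45 degrees.
    """
    pts = np.asarray(pts, dtype=np.float32).reshape(4, 2)
    s = pts.sum(axis=1)          # x + y: min at top-left, max at bottom-right
    d = pts[:, 1] - pts[:, 0]    # y - x: min at top-right, max at bottom-left
    return np.array(
        [pts[np.argmin(s)], pts[np.argmin(d)],
         pts[np.argmax(s)], pts[np.argmax(d)]],
        dtype=np.float32,
    )

def output_size(ordered):
    """Pick the output width/height from the longer of each pair of opposite sides."""
    tl, tr, br, bl = ordered
    width = int(max(np.linalg.norm(tr - tl), np.linalg.norm(br - bl)))
    height = int(max(np.linalg.norm(bl - tl), np.linalg.norm(br - tr)))
    return width, height

# Corners given in arbitrary order, as a contour approximation might return them
ordered = order_corners([[90, 20], [10, 10], [15, 80], [95, 95]])
width, height = output_size(ordered)
```

The ordered points can serve as the source points of the perspective transform, with the destination points being the corners of a `width` by `height` rectangle in the same order.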

Key OpenCV Functions

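import cv2

# Edge detection on the blurred grayscale image (low and high hysteresis thresholds)
edges = cv2.Canny(blur, 50, 200)

# Find contours (OpenCV 4.x returns contours and hierarchy)
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

# Approximate to polygon; epsilon is the maximum allowed deviation in pixels.
# 2% of the perimeter is a commonly used value, assumed here.
epsilon = 0.02 * cv2.arcLength(contour, True)
approx = cv2.approxPolyDP(contour, epsilon, True)

# Perspective transform (src_pts and dst_pts hold four float32 points each)
M = cv2.getPerspectiveTransform(src_pts, dst_pts)
warped = cv2.warpPerspective(image, M, (width, height))

# Enhance text; adaptiveThreshold needs a single-channel 8-bit image,
# so convert first (assumes warped is a BGR color image)
gray = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
result = cv2.adaptiveThreshold(gray, 255,
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)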

Controls

Key	Action
`c`	Capture and process document
`t`	Toggle threshold enhancement
`s`	Save scanned document
`r`	Reset
`q`	Quit
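
As a rough illustration of how such key bindings are often wired up, the sketch below maps a key code to a state change. The `handle_key` function and the fields of the state dictionary are hypothetical and are not taken from the application; in an OpenCV window loop the key code would typically come from `cv2.waitKey(1)`, which returns -1 when no key was pressed.

```python
def handle_key(key, state):
    """Apply one key press to the state dict; return False when the loop should stop."""
    ch = chr(key & 0xFF)  # mask to the low byte, as is usual with cv2.waitKey
    if ch == "c":
        state["captured"] = True                        # capture and process
    elif ch == "t":
        state["threshold"] = not state["threshold"]     # toggle enhancement
    elif ch == "s":
        state["save_requested"] = state["captured"]     # only save an existing scan
    elif ch == "r":
        state.update(captured=False, save_requested=False)  # reset
    elif ch == "q":
        return False                                    # quit
    return True

# Simulated key presses instead of a live window
state = {"captured": False, "threshold": False, "save_requested": False}
results = [handle_key(ord(k), state) for k in "cts"]
```

Keeping the key handling in a plain function like this makes it possible to test the controls without opening a camera or a window.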

Running the Application

python curriculum/applications/01_document_scanner.py
