Document Scanner
Scan documents using edge detection and perspective transformation.
Overview
Transform photos of documents into clean, flat scans, much as a mobile scanning app does.
Key Techniques:
- Canny edge detection
- Contour detection and approximation
- Perspective transformation
- Adaptive thresholding
How It Works
Input Image → Edge Detection → Find Document → Perspective Warp → Output
     ↓              ↓                ↓                ↓
[Photo of     [Canny edges]     [4-corner        [Flattened
 document]                       contour]         document]
Pipeline Steps
- Preprocessing: Convert to grayscale, apply Gaussian blur
- Edge Detection: Canny edge detector finds edges
- Contour Finding: Find largest 4-sided contour (the document)
- Corner Ordering: Sort corners to top-left, top-right, bottom-right, bottom-left
- Perspective Transform: Warp to bird’s-eye view
- Enhancement: Adaptive threshold for clean text
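The corner-ordering and perspective-transform steps both depend on a consistent corner order and a target size for the warped image. The following is a minimal sketch of one common approach in pure Python. The helper names `order_corners` and `output_size` are hypothetical and not taken from the application's source, and the sketch assumes the document is rotated by less than about 45 degrees, so that the top-left corner has the smallest x + y and the top-right corner has the largest x - y.

```python
import math

def order_corners(pts):
    """Return corners as [top-left, top-right, bottom-right, bottom-left].

    pts: four (x, y) pairs in image coordinates (y grows downward).
    """
    if len(pts) != 4:
        raise ValueError("expected exactly four corner points")
    top_left = min(pts, key=lambda p: p[0] + p[1])      # smallest x + y
    bottom_right = max(pts, key=lambda p: p[0] + p[1])  # largest x + y
    top_right = max(pts, key=lambda p: p[0] - p[1])     # largest x - y
    bottom_left = min(pts, key=lambda p: p[0] - p[1])   # smallest x - y
    return [top_left, top_right, bottom_right, bottom_left]

def _dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def output_size(ordered):
    """Choose the warped size from the longer of each pair of opposite edges."""
    tl, tr, br, bl = ordered
    width = max(_dist(tl, tr), _dist(bl, br))
    height = max(_dist(tl, bl), _dist(tr, br))
    return int(round(width)), int(round(height))

# Example corners in arbitrary order (illustrative values only)
corners = order_corners([(310, 40), (20, 300), (30, 50), (330, 280)])
width, height = output_size(corners)
# Destination rectangle, listed in the same corner order as the source points
dst_pts = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
```

Taking the longer of each pair of opposite edges avoids shrinking the side of the page that was farther from the camera. The ordered corners and `dst_pts` would then be converted to float32 arrays before being passed to `cv2.getPerspectiveTransform`.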
Key OpenCV Functions
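import cv2

# Preprocess: grayscale and Gaussian blur reduce noise before edge detection
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
# Edge detection (hysteresis thresholds 50 and 200)
edges = cv2.Canny(blur, 50, 200)
# Find contours (OpenCV 4.x returns contours and hierarchy)
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
# Approximate to polygon; epsilon is the maximum allowed deviation from the
# contour (2% of the perimeter is a common choice, assumed here)
epsilon = 0.02 * cv2.arcLength(contour, True)
approx = cv2.approxPolyDP(contour, epsilon, True)
# Perspective transform (src_pts and dst_pts are 4x2 float32 arrays)
M = cv2.getPerspectiveTransform(src_pts, dst_pts)
warped = cv2.warpPerspective(image, M, (width, height))
# Enhance text (adaptiveThreshold needs a single-channel 8-bit image,
# so convert first, assuming warped is BGR)
warped_gray = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
result = cv2.adaptiveThreshold(warped_gray, 255,
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)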
Controls
| Key | Action |
|---|---|
| c | Capture and process document |
| t | Toggle threshold enhancement |
| s | Save scanned document |
| r | Reset |
| q | Quit |
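A key-dispatch routine for the controls above might look like the following sketch. The `ScannerState` fields and the `handle_key` function are hypothetical, and the exact effect of Reset (here, discarding the captured document) is an assumption. In an OpenCV application the key code would typically come from `cv2.waitKey(1) & 0xFF`; a scripted sequence stands in for it here so the logic runs without a camera.

```python
from dataclasses import dataclass

@dataclass
class ScannerState:
    captured: bool = False       # a document has been captured and processed
    use_threshold: bool = False  # adaptive threshold enhancement enabled
    saved: int = 0               # number of scans saved so far
    running: bool = True         # main loop keeps going while True

def handle_key(key, state):
    """Update state for one key code (an int, as cv2.waitKey would return)."""
    ch = chr(key) if 0 <= key < 256 else ""
    if ch == "c":
        state.captured = True
    elif ch == "t":
        state.use_threshold = not state.use_threshold
    elif ch == "s" and state.captured:
        state.saved += 1         # a real app would write the image to disk here
    elif ch == "r":
        state.captured = False   # assumed meaning of Reset: discard the capture
    elif ch == "q":
        state.running = False
    return state

state = ScannerState()
for key in map(ord, "sctsq"):    # scripted key presses standing in for cv2.waitKey
    handle_key(key, state)
    if not state.running:
        break
```

Keeping the dispatch separate from the display loop makes the key handling testable on its own; the first `s` in the scripted sequence is ignored because nothing has been captured yet.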
Running the Application
python curriculum/applications/01_document_scanner.py