# Module 6: Video Analysis
Motion analysis and background modeling for video processing.
## Topics Covered
- Lucas-Kanade optical flow (sparse)
- Farneback optical flow (dense)
- Background subtraction (MOG2, KNN)
- Motion detection
## Algorithm Explanations
### 1. Optical Flow Concept
What it does: Estimates motion between consecutive frames.
Optical Flow Visualization:
```
Frame t                     Frame t+1
┌───────────────┐         ┌───────────────┐
│  ●            │   ──▶   │            ●  │
│ (ball)        │         │       (ball)  │
└───────────────┘         └───────────────┘

Optical flow = motion vector:

   ●───────────────▶●
 (x,y)          (x+dx, y+dy)

Flow vector: (dx, dy) = displacement per frame
```
Definition: For each pixel (x, y), find the displacement (dx, dy) such that:

```
I(x, y, t) = I(x + dx, y + dy, t + dt)
```

This is the brightness constancy assumption: a moving point keeps its intensity. Taylor-expanding the right-hand side:

```
I(x + dx, y + dy, t + dt) ≈ I + Iₓdx + Iᵧdy + Iₜdt
```

Substituting back and dividing by dt gives the optical flow constraint equation:

```
Iₓu + Iᵧv + Iₜ = 0        (equivalently  ∇I · v + Iₜ = 0)
```
Where:
- Iₓ, Iᵧ: spatial derivatives
- Iₜ: temporal derivative
- u = dx/dt, v = dy/dt: flow velocities
Problem: One equation, two unknowns (the aperture problem) → additional constraints are needed.
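The constraint can be checked numerically. Below is a minimal NumPy sketch (not from the tutorial files): it builds a smooth synthetic frame, shifts it one pixel to the right, and verifies that Iₓu + Iᵧv + Iₜ ≈ 0 for (u, v) = (1, 0):

```python
import numpy as np

# Smooth synthetic image: brightness varies slowly, so the Taylor
# expansion (and hence the constraint equation) holds well.
X, Y = np.meshgrid(np.arange(64, dtype=float), np.arange(64, dtype=float))
frame1 = np.sin(X / 10.0) + np.cos(Y / 12.0)

# Frame 2: the same pattern shifted right by 1 pixel, i.e. (u, v) = (1, 0)
frame2 = np.sin((X - 1.0) / 10.0) + np.cos(Y / 12.0)

# Finite-difference derivatives (central in space, forward in time)
Ix = np.gradient(frame1, axis=1)
Iy = np.gradient(frame1, axis=0)
It = frame2 - frame1

# Optical flow constraint: Ix*u + Iy*v + It ≈ 0 with (u, v) = (1, 0)
residual = Ix * 1.0 + Iy * 0.0 + It
print(np.abs(residual[5:-5, 5:-5]).max())  # small residual, well below 0.01
```

The residual is not exactly zero because both the brightness-constancy Taylor expansion and the finite differences are first-order approximations.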
### 2. Lucas-Kanade Optical Flow (Sparse)
What it does: Tracks sparse feature points between frames.
Sparse vs Dense Flow:
```
SPARSE (Lucas-Kanade)          DENSE (Farneback)
Track selected points          Compute flow for ALL pixels

┌─────────────────┐            ┌─────────────────┐
│  ●→        ●→   │            │ →→→→→→→→→→→→→→→ │
│                 │            │ →→→→→→→→→→→→→→→ │
│  ●→        ●→   │            │ →→→→→→→→→→→→→→→ │
│                 │            │ →→→→→→→→→→→→→→→ │
│  ●→        ●→   │            │ →→→→→→→→→→→→→→→ │
└─────────────────┘            └─────────────────┘

Fast, for tracking             Slow, full motion field
specific features              for visualization
```
Additional Constraint: Assume constant flow in local neighborhood.
For a window of n pixels:

```
[Iₓ₁  Iᵧ₁]           [Iₜ₁]
[Iₓ₂  Iᵧ₂]  [u]      [Iₜ₂]
[ ⋮    ⋮ ]  [v]  = - [ ⋮ ]
[Iₓₙ  Iᵧₙ]           [Iₜₙ]

    A        v   =    b
```

Least Squares Solution:

```
v = (AᵀA)⁻¹Aᵀb

[u]   [Σ Iₓ²    Σ IₓIᵧ]⁻¹ [-Σ IₓIₜ]
[v] = [Σ IₓIᵧ   Σ Iᵧ² ]   [-Σ IᵧIₜ]
```
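The least-squares step can be sketched directly in NumPy (a toy example, not OpenCV's implementation; the synthetic frames and the true shift of (1.0, 0.5) px are made up for the demo). Derivatives over one 15×15 window are stacked into A and b, and `np.linalg.lstsq` recovers the motion:

```python
import numpy as np

# Synthetic frame pair: the pattern moves by (u, v) = (1.0, 0.5) pixels
X, Y = np.meshgrid(np.arange(64, dtype=float), np.arange(64, dtype=float))
frame1 = np.sin(X / 8.0) + np.cos(Y / 9.0)
frame2 = np.sin((X - 1.0) / 8.0) + np.cos((Y - 0.5) / 9.0)

# Derivatives
Ix = np.gradient(frame1, axis=1)
Iy = np.gradient(frame1, axis=0)
It = frame2 - frame1

# 15x15 window around the point we want to track
r, c, hw = 32, 32, 7
win = np.s_[r - hw:r + hw + 1, c - hw:c + hw + 1]
A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)  # n x 2
b = -It[win].ravel()

# Least-squares solution v = (A^T A)^-1 A^T b
(u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
print(u, v)  # close to (1.0, 0.5)
```

Note that the window needs intensity gradients in both directions; on a uniform or purely 1-D pattern, AᵀA is (near-)singular and the flow is not recoverable.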
Pyramidal Extension (for large motions):
```
Problem:  large motion exceeds the window size
Solution: compute flow at a coarse scale, then refine at finer scales

Level 2 (coarsest)   ┌─────┐            Compute initial flow
                     └─────┘            (motion appears small)
                        │
                        ▼
Level 1              ┌─────────┐        Propagate & refine
                     └─────────┘
                        │
                        ▼
Level 0 (finest)     ┌───────────────┐  Final accurate flow
                     └───────────────┘

At each level the motion appears smaller, so it is easier to track.
```
OpenCV:
```python
next_pts, status, error = cv2.calcOpticalFlowPyrLK(
    prev_gray, next_gray, prev_pts, None,
    winSize=(15, 15),  # Window size
    maxLevel=2,        # Pyramid levels
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03)
)
```
Returns:
- next_pts: new positions of the tracked points
- status: 1 if flow was found for a point, 0 otherwise
- error: tracking error for each point
### 3. Farneback Optical Flow (Dense)
What it does: Computes flow for every pixel.
Polynomial Expansion: Approximates each pixel's neighborhood with a quadratic polynomial:

```
f(x) ≈ xᵀAx + bᵀx + c
```

Where A is a symmetric matrix, b is a vector, c is a scalar.

Displacement Estimation: For two frames with polynomial approximations:

```
f₁(x) ≈ xᵀA₁x + b₁ᵀx + c₁
f₂(x) ≈ xᵀA₂x + b₂ᵀx + c₂
```

Assuming f₂(x) = f₁(x - d), equating coefficients gives b₂ = b₁ - 2A₁d; with the averaged A = (A₁ + A₂)/2 this yields:

```
d = -(A₁ + A₂)⁻¹(b₂ - b₁)
```
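Since an exact shift gives A₂ = A₁ and b₂ = b₁ − 2A₁d, the displacement formula d = −(A₁ + A₂)⁻¹(b₂ − b₁) can be verified with small matrices (the values of A, b₁, and d below are arbitrary choices for the check):

```python
import numpy as np

# Arbitrary symmetric A and linear term b1 for f1(x) = xᵀAx + b1ᵀx + c1
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
b1 = np.array([1.0, -2.0])
d_true = np.array([0.7, -0.3])

# An exact shift f2(x) = f1(x - d) gives A2 = A1 and b2 = b1 - 2 A d
b2 = b1 - 2.0 * A @ d_true

# Recover the displacement: d = -(A1 + A2)^-1 (b2 - b1)
d_est = -np.linalg.solve(A + A, b2 - b1)
print(d_est)  # ≈ [0.7, -0.3]
```

In the real algorithm the expansions are local estimates that differ between frames, so the formula is applied per-neighborhood and iterated over a pyramid rather than solved once exactly.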
OpenCV:
```python
flow = cv2.calcOpticalFlowFarneback(
    prev_gray, next_gray, None,
    pyr_scale=0.5,   # Pyramid scale
    levels=3,        # Pyramid levels
    winsize=15,      # Averaging window
    iterations=3,    # Iterations per level
    poly_n=5,        # Polynomial neighborhood (5 or 7)
    poly_sigma=1.2,  # Gaussian smoothing
    flags=0
)
# Returns (H, W, 2) array: flow[y, x] = [dx, dy]
```
Flow Visualization:
```
Direction → Hue (color)             Magnitude → Value (brightness)

            0° (Red)
       315°  │  45°                  Slow            Fast
          ╲  │  ╱                   (dark)         (bright)
           ╲ │ ╱
270° ────────┼──────── 90°          ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
(Blue)     ╱ │ ╲     (Yellow)       ░░░░░░░░░░░░████████
          ╱  │  ╲                   ░░░░░░░██████████████
       225°  │  135°
             │
         180° (Cyan)

Example: moving right (0°) and fast = bright red
         moving up (270°) and slow  = dark blue
```
```python
# Convert flow to polar coordinates
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])

# HSV representation
h, w = flow.shape[:2]
hsv = np.zeros((h, w, 3), dtype=np.uint8)
hsv[..., 0] = angle * 180 / np.pi / 2                                  # Hue = direction
hsv[..., 1] = 255                                                      # Saturation = max
hsv[..., 2] = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX)  # Value = magnitude

# Back to BGR for display
bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
```
### 4. Background Subtraction
What it does: Separates foreground (moving objects) from background (static).
Background Subtraction Concept:
```
 Input Frame          Background Model       Foreground Mask
┌─────────────┐      ┌─────────────┐       ┌─────────────┐
│    ┌───┐    │      │             │       │    ┌───┐    │
│    │🚶 │    │  -   │   (empty    │   =   │    │███│    │
│    │   │    │      │    scene)   │       │    └───┘    │
│    └───┘    │      │             │       │             │
└─────────────┘      └─────────────┘       └─────────────┘

Current frame        Learned over time     White = foreground
with person          (no person)           Black = background
```
#### MOG2 (Mixture of Gaussians)
Model: Each pixel is modeled as a mixture of K Gaussians:

```
P(xₜ) = Σₖ wₖ × N(xₜ; μₖ, Σₖ)
```
Gaussian Mixture Model per Pixel:
```
Pixel intensity histogram over time:

     ▲
     │      Gaussian 1           Gaussian 2
     │      (shadow:             (background:
     │       darker)              bright sky)
     │          ╱╲                   ╱╲
     │         ╱  ╲                 ╱  ╲
freq │        ╱    ╲               ╱    ╲
     │       ╱      ╲             ╱      ╲
     │──────────────────────────────────────────▶ intensity
     0          100       150        200

New pixel value:
- matches a Gaussian → background
- no match           → foreground (moving object)
```
Algorithm:
1. For each new pixel value:
   - Check which Gaussian matches (within 2.5σ)
   - If match: update that Gaussian's parameters
   - If no match: replace the weakest Gaussian
2. Background: Gaussians with the highest weights
3. Foreground: pixels not matching any background Gaussian
Update Rules:
```
wₖ  ← (1 - α)wₖ  + α × Mₖ
μₖ  ← (1 - ρ)μₖ  + ρ × xₜ
σₖ² ← (1 - ρ)σₖ² + ρ × (xₜ - μₖ)²
```
Where:
- α: learning rate
- ρ = α / wₖ
- Mₖ: 1 if matched, 0 otherwise
Shadow Detection:
Shadow if: 0.5 < I/B < 1.0 and similar chromaticity
OpenCV:
```python
mog2 = cv2.createBackgroundSubtractorMOG2(
    history=500,         # Frames used for background
    varThreshold=16,     # Squared Mahalanobis distance threshold
    detectShadows=True   # Enable shadow detection
)
fg_mask = mog2.apply(frame)
# Returns: 0 = background, 127 = shadow, 255 = foreground
```
#### KNN Background Subtractor
Model: Uses K nearest neighbors in sample history.
Background if: pixel is close to K samples in history
OpenCV:
```python
knn = cv2.createBackgroundSubtractorKNN(
    history=500,
    dist2Threshold=400,
    detectShadows=True
)
fg_mask = knn.apply(frame)
```
### 5. Motion Detection Pipeline
Typical Workflow:
```
1. Input Frame        2. BG Subtract        3. Morphology
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│   ┌───┐     │      │   ┌───┐  ·  │      │   ┌───┐     │
│   │🚗 │     │ ──▶  │   │███│ · · │ ──▶  │   │███│     │
│   └───┘     │      │   └───┘     │      │   └───┘     │
└─────────────┘      └─────────────┘      └─────────────┘
                        (noise)            (noise removed)

4. Find Contours      5. Filter by Area     6. Draw Result
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│   ┌───┐     │      │   ┌───┐     │      │   ┏━━━┓     │
│   │ ▢ │  ·  │ ──▶  │   │ ▢ │     │ ──▶  │   ┃🚗 ┃     │
│   └───┘     │      │   └───┘     │      │   ┗━━━┛     │
└─────────────┘      └─────────────┘      └─────────────┘
                      (ignore tiny)        (bounding box)
```
1. Apply background subtractor
2. Threshold/clean mask
3. Morphological opening (remove noise)
4. Morphological closing (fill holes)
5. Find contours
6. Filter by area
7. Draw bounding boxes
## Comparison
| Method | Type | Speed | Use Case |
|---|---|---|---|
| Lucas-Kanade | Sparse | Fast | Feature tracking |
| Farneback | Dense | Medium | Full motion field |
| MOG2 | Background | Fast | Surveillance |
| KNN | Background | Fast | Complex backgrounds |
## Tutorial Files
| File | Description |
|---|---|
| `01_optical_flow.py` | Lucas-Kanade, Farneback, motion visualization |
| `02_background_subtraction.py` | MOG2, KNN, foreground detection |
## Key Functions Reference
| Function | Description |
|---|---|
| `cv2.calcOpticalFlowPyrLK()` | Lucas-Kanade sparse flow |
| `cv2.calcOpticalFlowFarneback()` | Farneback dense flow |
| `cv2.createBackgroundSubtractorMOG2()` | Create MOG2 subtractor |
| `cv2.createBackgroundSubtractorKNN()` | Create KNN subtractor |
| `subtractor.apply(frame)` | Get foreground mask |
| `subtractor.getBackgroundImage()` | Get background model |