Module 6: Video Analysis

Motion analysis and background modeling for video processing.

Topics Covered

  • Lucas-Kanade optical flow (sparse)
  • Farneback optical flow (dense)
  • Background subtraction (MOG2, KNN)
  • Motion detection

Algorithm Explanations

1. Optical Flow Concept

What it does: Estimates motion between consecutive frames.

Optical Flow Visualization:

┌─────────────────────────────────────────────────────────────────────┐
│                    Optical Flow Concept                             │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Frame t                        Frame t+1                         │
│   ┌───────────────────┐          ┌───────────────────┐             │
│   │                   │          │                   │             │
│   │      ●            │          │           ●       │             │
│   │    (ball)         │  ──▶     │         (ball)    │             │
│   │                   │          │                   │             │
│   └───────────────────┘          └───────────────────┘             │
│                                                                     │
│   Optical Flow = Motion Vector                                      │
│                                                                     │
│         ●───────────────▶●                                          │
│        (x,y)            (x+dx, y+dy)                                │
│                                                                     │
│   Flow vector: (dx, dy) = displacement per frame                   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Definition: For each pixel (x, y), find the displacement (dx, dy) over a time step dt such that:

I(x + dx, y + dy, t + dt) = I(x, y, t)

Brightness Constancy Assumption: this equation states that a pixel keeps its intensity as it moves between frames.

Taylor Expansion:

I(x + dx, y + dy, t + dt) ≈ I + Iₓdx + Iᵧdy + Iₜdt

Optical Flow Constraint Equation:

Iₓu + Iᵧv + Iₜ = 0

Or: ∇I · v + Iₜ = 0

Where:

  • Iₓ, Iᵧ: Spatial derivatives
  • Iₜ: Temporal derivative
  • u = dx/dt, v = dy/dt: Flow velocities

Problem: One equation, two unknowns → need additional constraints.
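A short Python sketch makes the under-determinacy concrete; the gradient values below are made up for illustration. Every (u, v) on a whole line satisfies the single constraint:

```python
# Made-up gradients at one pixel (illustrative values only)
Ix, Iy, It = 0.5, 0.2, -0.3

# The constraint Ix*u + Iy*v + It = 0 defines a LINE in (u, v) space:
# infinitely many flows are consistent with this one pixel.
for u in (0.0, 0.6, 1.2):
    v = -(Ix * u + It) / Iy        # pick any u, solve for the matching v
    assert abs(Ix * u + Iy * v + It) < 1e-9
    print(f"(u, v) = ({u:.1f}, {v:.2f}) satisfies the constraint")
```

This is the aperture problem: extra assumptions (a local window, smoothness) are needed to pick one solution.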


2. Lucas-Kanade Optical Flow (Sparse)

What it does: Tracks sparse feature points between frames.

Sparse vs Dense Flow:

┌─────────────────────────────────────────────────────────────────────┐
│                  Sparse vs Dense Optical Flow                       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   SPARSE (Lucas-Kanade)              DENSE (Farneback)             │
│   Track selected points              Compute flow for ALL pixels   │
│                                                                     │
│   ┌───────────────────┐              ┌───────────────────┐         │
│   │ ●→    ●→          │              │→→→→→→→→→→→→→→→→→ │         │
│   │                   │              │→→→→→→→→→→→→→→→→→ │         │
│   │    ●→       ●→    │              │→→→→→→→→→→→→→→→→→ │         │
│   │                   │              │→→→→→→→→→→→→→→→→→ │         │
│   │ ●→    ●→          │              │→→→→→→→→→→→→→→→→→ │         │
│   └───────────────────┘              └───────────────────┘         │
│                                                                     │
│   Fast, for tracking                 Slow, full motion field       │
│   specific features                  for visualization             │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Additional Constraint: Assume constant flow in local neighborhood.

For a window of n pixels:

[Iₓ₁  Iᵧ₁]       [Iₜ₁]
[Iₓ₂  Iᵧ₂] [u]   [Iₜ₂]
[  ⋮    ⋮ ] [v] = -[ ⋮ ]
[Iₓₙ  Iᵧₙ]       [Iₜₙ]

    A      v   =   b

Least Squares Solution:

v = (AᵀA)⁻¹Aᵀb

[u]   [Σ Iₓ²    Σ IₓIᵧ]⁻¹ [-Σ IₓIₜ]
[v] = [Σ IₓIᵧ   Σ Iᵧ² ]   [-Σ IᵧIₜ]

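The normal-equations solve above can be sketched in NumPy; the gradients for the 3-pixel window are made up for illustration:

```python
import numpy as np

# Made-up spatial/temporal gradients for a 3-pixel window
Ix = np.array([0.9, 0.8, 1.0])
Iy = np.array([0.1, 0.4, 0.2])
It = np.array([-0.5, -0.5, -0.4])

A = np.stack([Ix, Iy], axis=1)        # n x 2 gradient matrix
b = -It                               # right-hand side

# v = (A^T A)^{-1} A^T b  -- solve the 2x2 system rather than invert
u, v = np.linalg.solve(A.T @ A, A.T @ b)
print(f"flow estimate: u = {u:.3f}, v = {v:.3f}")
```

Solving the 2×2 normal equations is cheap, which is why Lucas-Kanade scales to hundreds of tracked points per frame.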
Pyramidal Extension (for large motions):

┌─────────────────────────────────────────────────────────────────────┐
│                    Image Pyramid for Large Motion                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Problem: Large motion exceeds window size                         │
│   Solution: Compute at coarse scale, refine at fine scale          │
│                                                                     │
│   Level 2 (coarsest)      ┌─────┐                                   │
│   Large motion → small    │     │  Compute initial flow            │
│                           └─────┘                                   │
│                              │                                      │
│                              ▼                                      │
│   Level 1                ┌─────────┐                                │
│   Propagate & refine     │         │  Refine flow estimate         │
│                          └─────────┘                                │
│                              │                                      │
│                              ▼                                      │
│   Level 0 (finest)    ┌───────────────┐                             │
│   Final refinement    │               │  Final accurate flow       │
│                       └───────────────┘                             │
│                                                                     │
│   At each level: motion appears smaller (easier to track)          │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

OpenCV:

next_pts, status, error = cv2.calcOpticalFlowPyrLK(
    prev_gray, next_gray, prev_pts, None,
    winSize=(15, 15),    # Window size
    maxLevel=2,          # Pyramid levels
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03)
)

Returns:

  • next_pts: New positions of tracked points
  • status: 1 if flow found, 0 otherwise
  • error: Tracking error

3. Farneback Optical Flow (Dense)

What it does: Computes flow for every pixel.

Polynomial Expansion: Approximates neighborhood with quadratic polynomial:

f(x) ≈ xᵀAx + bᵀx + c

Where A is a symmetric matrix, b is a vector, c is a scalar.

Displacement Estimation: For two frames with polynomial approximations:

f₁(x) ≈ xᵀA₁x + b₁ᵀx + c₁
f₂(x) ≈ xᵀA₂x + b₂ᵀx + c₂

Assuming f₂(x) = f₁(x - d), matching the linear terms gives b₂ = b₁ - 2A₁d. In practice A₁ ≈ A₂, so with the average A = (A₁ + A₂)/2:

d = -(A₁ + A₂)⁻¹(b₂ - b₁) = -½ A⁻¹(b₂ - b₁)

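The displacement recovery can be checked numerically. A minimal sketch with made-up coefficients, using a single shared A so the relation b₂ = b₁ - 2Ad holds exactly:

```python
import numpy as np

# A toy quadratic patch f1 and its shifted copy f2(x) = f1(x - d)
A = np.array([[2.0, 0.3],
              [0.3, 1.5]])           # symmetric quadratic term
b1 = np.array([0.4, -0.2])
d_true = np.array([1.0, -0.5])

# Shifting f1 by d changes only the linear term: b2 = b1 - 2 A d
b2 = b1 - 2 * A @ d_true

# Recover the displacement from the two polynomial expansions
d_est = -0.5 * np.linalg.solve(A, b2 - b1)
assert np.allclose(d_est, d_true)
```

In the real algorithm A and b come from per-pixel polynomial expansion, and the estimate is averaged over a neighborhood and iterated over pyramid levels.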
OpenCV:

flow = cv2.calcOpticalFlowFarneback(
    prev_gray, next_gray, None,
    pyr_scale=0.5,      # Pyramid scale
    levels=3,           # Pyramid levels
    winsize=15,         # Averaging window
    iterations=3,       # Iterations per level
    poly_n=5,           # Polynomial neighborhood (5 or 7)
    poly_sigma=1.2,     # Gaussian smoothing
    flags=0
)
# Returns (H, W, 2) array: flow[y, x] = [dx, dy]

Flow Visualization:

┌─────────────────────────────────────────────────────────────────────┐
│                    Flow Color Coding (HSV)                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Direction → Hue (color)           Magnitude → Value (brightness) │
│                                                                     │
│              0° (Red)                                               │
│                 │                                                   │
│        315°    │    45°             Slow        Fast                │
│           ╲    │    ╱               (dark)      (bright)            │
│            ╲   │   ╱                                                │
│   270° ─────────────── 90°          ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓              │
│   (Blue)    ╱   │   ╲    (Yellow)   ░░░░░░░░░░░░████████            │
│            ╱    │    ╲              ░░░░░░░██████████████           │
│        225°     │     135°                                          │
│                 │                                                   │
│              180° (Cyan)                                            │
│                                                                     │
│   Example: Moving right (0°) and fast = bright red                 │
│            Moving up (270°) and slow = dark blue                   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
# Convert flow vectors to polar coordinates
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])

# HSV representation
h, w = flow.shape[:2]
hsv = np.zeros((h, w, 3), dtype=np.uint8)
hsv[..., 0] = angle * 180 / np.pi / 2  # Hue = direction (OpenCV hue range 0-179)
hsv[..., 1] = 255                      # Saturation = max
hsv[..., 2] = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX)  # Value = magnitude

# Back to BGR for display
flow_bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

4. Background Subtraction

What it does: Separates foreground (moving objects) from background (static).

Background Subtraction Concept:

┌─────────────────────────────────────────────────────────────────────┐
│                    Background Subtraction                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Input Frame          Background Model       Foreground Mask      │
│   ┌───────────────┐    ┌───────────────┐     ┌───────────────┐    │
│   │   ┌───┐       │    │               │     │   ┌───┐       │    │
│   │   │   │       │    │               │     │   │███│       │    │
│   │   │🚶‍♂️│       │ -  │   (empty      │  =  │   │███│       │    │
│   │   │   │       │    │    scene)     │     │   └───┘       │    │
│   │   └───┘       │    │               │     │               │    │
│   └───────────────┘    └───────────────┘     └───────────────┘    │
│                                                                     │
│   Current frame      Learned over time       White = foreground    │
│   with person        (no person)             Black = background    │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

MOG2 (Mixture of Gaussians)

Model: Each pixel modeled as mixture of K Gaussians:

P(xₜ) = Σₖ wₖ × N(xₜ; μₖ, Σₖ)

Gaussian Mixture Model per Pixel:

┌─────────────────────────────────────────────────────────────────────┐
│               Pixel Modeled by Multiple Gaussians                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Pixel intensity histogram over time:                              │
│                                                                     │
│       ▲                                                             │
│       │    Gaussian 1         Gaussian 2                           │
│       │    (background:       (shadow:                              │
│       │     bright sky)        darker)                              │
│       │        ╱╲                ╱╲                                 │
│       │       ╱  ╲              ╱  ╲                                │
│   freq│      ╱    ╲            ╱    ╲                               │
│       │     ╱      ╲          ╱      ╲                              │
│       │    ╱        ╲        ╱        ╲                             │
│       │───────────────────────────────────────▶ intensity           │
│       0          100        150        200                          │
│                                                                     │
│   New pixel value:                                                  │
│   - Matches Gaussian → Background                                   │
│   - No match → Foreground (moving object)                          │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Algorithm:

1. For each new pixel value:
   a. Check which Gaussian matches (within 2.5σ)
   b. If match: update that Gaussian's parameters
   c. If no match: replace weakest Gaussian

2. Background: Gaussians with highest weights
3. Foreground: Pixels not matching background

Update Rules:

wₖ ← (1 - α)wₖ + α × Mₖ
μₖ ← (1 - ρ)μₖ + ρ × xₜ
σₖ² ← (1 - ρ)σₖ² + ρ × (xₜ - μₖ)²

Where:
  α = learning rate
  ρ = α / wₖ
  Mₖ = 1 if matched, 0 otherwise
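One update step with concrete numbers (the learning rate, Gaussian parameters, and pixel value are illustrative):

```python
# One update of a matched Gaussian (M_k = 1), per the rules above
alpha = 0.01                         # learning rate
w, mu, var = 0.6, 120.0, 25.0        # current weight, mean, variance
x = 130.0                            # new pixel value that matched this Gaussian

rho = alpha / w                      # rho computed from the current weight
w = (1 - alpha) * w + alpha * 1      # matched weight grows
mu = (1 - rho) * mu + rho * x        # mean drifts toward the new value
var = (1 - rho) * var + rho * (x - mu) ** 2   # variance tracks the spread
print(f"w={w:.3f}, mu={mu:.2f}, var={var:.2f}")
```

Because α is small, the model adapts slowly: a parked car eventually "melts" into the background, while a briefly passing object does not.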

Shadow Detection:

Shadow if: 0.5 < I/B < 1.0 and similar chromaticity

OpenCV:

mog2 = cv2.createBackgroundSubtractorMOG2(
    history=500,        # Frames used for background
    varThreshold=16,    # Squared Mahalanobis distance
    detectShadows=True  # Enable shadow detection
)
fg_mask = mog2.apply(frame)
# Returns: 0=background, 127=shadow, 255=foreground

KNN Background Subtractor

Model: Uses K nearest neighbors in sample history.

Background if: pixel is close to K samples in history

OpenCV:

knn = cv2.createBackgroundSubtractorKNN(
    history=500,
    dist2Threshold=400,
    detectShadows=True
)
fg_mask = knn.apply(frame)

5. Motion Detection Pipeline

Typical Workflow:

┌─────────────────────────────────────────────────────────────────────┐
│                    Motion Detection Pipeline                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   1. Input Frame         2. BG Subtract        3. Morphology       │
│   ┌───────────────┐      ┌───────────────┐     ┌───────────────┐  │
│   │   ┌───┐       │      │   ┌───┐ noise │     │   ┌───┐       │  │
│   │   │🚗 │       │  ──▶ │   │███│ ·· ·  │ ──▶ │   │███│       │  │
│   │   └───┘       │      │   └───┘  ·    │     │   └───┘       │  │
│   └───────────────┘      └───────────────┘     └───────────────┘  │
│                                                  (noise removed)   │
│                                                                     │
│   4. Find Contours       5. Filter by Area     6. Draw Result     │
│   ┌───────────────┐      ┌───────────────┐     ┌───────────────┐  │
│   │   ┌───┐       │      │   ┌───┐       │     │   ┌───┐       │  │
│   │   │ ▢ │ small │  ──▶ │   │ ▢ │       │ ──▶ │   │🚗 │       │  │
│   │   └───┘  ·    │      │   └───┘       │     │   └───┘       │  │
│   └───────────────┘      └───────────────┘     └───────────────┘  │
│                           (ignore tiny)        (bounding box)      │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
1. Apply background subtractor
2. Threshold/clean mask
3. Morphological opening (remove noise)
4. Morphological closing (fill holes)
5. Find contours
6. Filter by area
7. Draw bounding boxes

Comparison

Method         Type         Speed    Use Case
Lucas-Kanade   Sparse       Fast     Feature tracking
Farneback      Dense        Medium   Full motion field
MOG2           Background   Fast     Surveillance
KNN            Background   Fast     Complex backgrounds

Tutorial Files

File                          Description
01_optical_flow.py            Lucas-Kanade, Farneback, motion visualization
02_background_subtraction.py  MOG2, KNN, foreground detection

Key Functions Reference

Function                              Description
cv2.calcOpticalFlowPyrLK()            Lucas-Kanade sparse flow
cv2.calcOpticalFlowFarneback()        Farneback dense flow
cv2.createBackgroundSubtractorMOG2()  Create MOG2 subtractor
cv2.createBackgroundSubtractorKNN()   Create KNN subtractor
subtractor.apply(frame)               Get foreground mask
subtractor.getBackgroundImage()       Get background model

Further Reading