1
Text Detection in Video
  • Min Cai
  • 2002.3.13

2
Background
  • Video OCR: text detection, extraction and
    recognition
  • Detection target: artificial text
  • Text detection
  • Detect the region from a single frame
  • Refine the region by combining consecutive frames

3
Existing Work
Feature extraction | Text detection based on feature
Color              | Connected-component
Texture            | Texture segmentation
Edge               | Top-down, bottom-up

4
Connected-component-based methods
  • Basic idea
  • Treat text as having a uniform color (color level)
    and classify each pixel as text or non-text
    according to its color value.
  • Combine connected text-pixels into connected
    components.
  • Group collinear connected components into a text
    string.
  • Advantage
  • Can detect text of arbitrary orientation, provided
    it has similar color and a simple background.
  • Disadvantage
  • Sensitive to color variance
  • Lossy compression of video introduces color
    bleeding
  • Complex background
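
The first two steps of the basic idea above (pixel classification and grouping into connected components) can be sketched as follows. This is a minimal illustration, not the method used in the cited work: it assumes a grayscale image stored as a list of rows, and the names `classify_pixels`, `text_level` and `tolerance` are hypothetical. Grouping collinear components into text strings is left out.

```python
from collections import deque


def classify_pixels(image, text_level, tolerance):
    """Mark a pixel as text when its value is close to the assumed text color."""
    return [[abs(p - text_level) <= tolerance for p in row] for row in image]


def connected_components(mask):
    """Return bounding boxes (top, left, bottom, right), inclusive, of the
    4-connected components of truthy cells in a 2D mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

The fixed `tolerance` makes the listed disadvantage concrete: color bleeding from lossy compression pushes text pixels outside the tolerance band, and a complex background pushes non-text pixels inside it.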

5
Texture Segmentation method
  • Basic idea
  • Treat text as a type of texture
  • Use texture segmentation algorithms to detect
    text
  • Gabor Filter
  • Gaussian derivatives
  • Advantage
  • Can efficiently separate text areas from graphic
    areas in a simple background. It is usually used
    in document analysis.
  • Disadvantage
  • Time-consuming
  • Cannot handle text embedded in varied
    backgrounds well.

6
Bottom-Up method
  • Basic idea
  • A seed region is defined as a small region with
    high edge density.
  • Grow each seed region into successively larger
    components until all seed regions in the image
    have been reached.
  • Advantage
  • It is a generic method for detecting homogeneous
    objects of various shapes. That is, it can detect
    not only rectangular objects, but also other
    shapes.
  • Disadvantage
  • Sensitive to noise.
  • Cannot handle a large range of font sizes.
  • Sensitive to the stroke density (which differs
    between languages).
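
A simplified reading of the bottom-up idea can be sketched as below. It assumes a binary edge map given as a list of rows of 0/1 values, treats fixed-size blocks as candidate seeds, and interprets "growing" as merging 8-adjacent seed blocks. The names `seed_blocks`, `grow_regions`, `block` and `min_density` are hypothetical, and the slides do not specify the actual growing rule.

```python
def seed_blocks(edge_mask, block=4, min_density=0.5):
    """Return block coordinates (block_row, block_col) whose edge density
    is at least min_density."""
    rows, cols = len(edge_mask), len(edge_mask[0])
    seeds = set()
    for br in range(0, rows, block):
        for bc in range(0, cols, block):
            cells = [edge_mask[r][c]
                     for r in range(br, min(rows, br + block))
                     for c in range(bc, min(cols, bc + block))]
            if sum(cells) / len(cells) >= min_density:
                seeds.add((br // block, bc // block))
    return seeds


def grow_regions(seeds):
    """Merge 8-adjacent seed blocks into regions (sets of block coordinates)."""
    remaining = set(seeds)
    regions = []
    while remaining:
        start = remaining.pop()
        region, stack = {start}, [start]
        while stack:
            r, c = stack.pop()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbor = (r + dr, c + dc)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        region.add(neighbor)
                        stack.append(neighbor)
        regions.append(region)
    return regions
```

The fixed `block` size and `min_density` show where the listed disadvantages come from: a single block size cannot suit both small and large fonts, and a single density cutoff cannot suit scripts with very different stroke densities.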

7
Top-Down method
  • Basic idea
  • Based on the run-length smoothing algorithm
  • Analyze horizontal and vertical projection
    profiles
  • Advantage
  • Can detect the boundaries of horizontally aligned
    text strings quickly and correctly
  • Noise insensitive
  • Disadvantage
  • Cannot handle diagonally aligned text.
  • One pass of horizontal and vertical projection
    cannot handle a complex layout.
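
The two building blocks named above, run-length smoothing and projection profiles, can be sketched as follows. This is an illustrative version under stated assumptions: the input is a binary mask (list of rows of 0/1), only background gaps bounded by foreground on both sides are filled, and the names `rlsa_row`, `projection`, `runs_above`, `max_gap` and `min_count` are hypothetical.

```python
def rlsa_row(row, max_gap):
    """Run-length smoothing of one row: fill background runs of length
    <= max_gap that lie between two foreground pixels."""
    out = list(row)
    last_one = None
    for i, v in enumerate(row):
        if v:
            if last_one is not None and 0 < i - last_one - 1 <= max_gap:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out


def projection(mask, axis):
    """Projection profile: foreground count per row ("rows") or per column."""
    if axis == "rows":
        return [sum(row) for row in mask]
    return [sum(col) for col in zip(*mask)]


def runs_above(profile, min_count):
    """Inclusive (start, end) index ranges where profile >= min_count."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v >= min_count and start is None:
            start = i
        elif v < min_count and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs
```

Applying `runs_above` to the row profile gives candidate text lines, and applying it to the column profile inside each line gives candidate strings. Because the profiles are summed along image axes, diagonal text smears across many rows and columns, which is the first disadvantage listed.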

8
Analysis (1)
  • A certain contrast against background
  • Artificial text strings are designed to be read
    easily
  • A certain stroke density
  • Text strings always appear horizontally
  • Spatial cohesion
  • Characters of the same text string are of similar
    heights, orientation and spacing
  • Size constraint
  • Text strings have certain size restrictions
  • A text string appears in multiple consecutive
    frames at a similar position.

9
Analysis (2)
Problems                                       | Resolutions
How to extract more useful edges?              | Local thresholding
How to highlight text areas?                   | Text area recovery
How to detect text regions fast and correctly? | Coarse-to-fine detection

10
Single Threshold
11
Local threshold (1)
  • Use a small kernel (red) to scan the whole image.
  • In a bigger window (gray) surrounding the kernel,
    calculate the local threshold corresponding to
    its local histogram.

(Figure panels: a. Window move; b. Local threshold selection)
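
The kernel-and-window scheme can be sketched as below. The slides say only that the threshold is derived from the local histogram, so the use of Otsu's method here is an assumption, as are the names `local_threshold`, `kernel`, `margin` and `min_contrast`. The input is taken to be an edge-strength or grayscale map with integer values in 0..255.

```python
def otsu_threshold(values):
    """Otsu's threshold for integers in 0..255: the level that maximizes
    the between-class variance of the histogram."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def local_threshold(image, kernel=2, margin=2, min_contrast=20):
    """Binarize each kernel-sized tile with a threshold computed from the
    larger window (tile plus margin) around it."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, kernel):
        for c0 in range(0, cols, kernel):
            window = [image[r][c]
                      for r in range(max(0, r0 - margin), min(rows, r0 + kernel + margin))
                      for c in range(max(0, c0 - margin), min(cols, c0 + kernel + margin))]
            if max(window) - min(window) < min_contrast:
                continue  # flat window: assume no edge, leave the tile at 0
            t = otsu_threshold(window)
            for r in range(r0, min(rows, r0 + kernel)):
                for c in range(c0, min(cols, c0 + kernel)):
                    out[r][c] = 1 if image[r][c] > t else 0
    return out
```

The `min_contrast` guard is a design choice of this sketch: without it, a nearly uniform window would still be split into two classes and produce spurious edges.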
12
Local threshold (2)
13
Text-like area recovery (1)
(Figure panels: before recovery, after recovery)
14
Text-like area recovery (2)
(Figure panels: before recovery, after recovery)
15
High pass filter
16
Coarse-to-Fine detection
  • Use a top-down scheme to detect text-like areas

17
Detect text-like areas
(Figure: detection steps 1) to 4); panel b. Coarse vertical projection)
18
Refinement
  • Combine the neighboring text areas with similar
    height
  • Use size constraints to remove areas that do not
    satisfy them
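
The two refinement steps can be sketched as follows, assuming text areas are bounding boxes `(left, top, right, bottom)` with inclusive pixel coordinates. The function names and every numeric default (`max_gap`, `min_height_ratio`, `min_width`, `min_height`, `max_height`) are hypothetical placeholders, not values from the slides.

```python
def box_height(box):
    return box[3] - box[1] + 1


def are_similar_neighbors(a, b, max_gap, min_height_ratio):
    """True when two boxes have similar heights, overlap vertically, and
    are separated horizontally by at most max_gap pixels."""
    ha, hb = box_height(a), box_height(b)
    if min(ha, hb) / max(ha, hb) < min_height_ratio:
        return False
    overlap_rows = min(a[3], b[3]) - max(a[1], b[1]) + 1
    gap = max(a[0], b[0]) - min(a[2], b[2]) - 1
    return overlap_rows > 0 and gap <= max_gap


def merge_boxes(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def refine(boxes, max_gap=5, min_height_ratio=0.7,
           min_width=8, min_height=6, max_height=60):
    """Single left-to-right pass: merge similar neighbors, then drop boxes
    that violate the size constraints."""
    merged = []
    for box in sorted(boxes):
        for i, m in enumerate(merged):
            if are_similar_neighbors(m, box, max_gap, min_height_ratio):
                merged[i] = merge_boxes(m, box)
                break
        else:
            merged.append(box)
    return [b for b in merged
            if b[2] - b[0] + 1 >= min_width
            and min_height <= box_height(b) <= max_height]
```

A single pass keeps the sketch short; boxes that only become neighbors after an earlier merge would need the pass to be repeated until nothing changes.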

19
Multi-frame analysis
  • Text region matching
  • Find all the regions corresponding to the same
    text
  • Text region enhancement
  • Enhance the text image quality by multi-frame
    integration
  • Repetitive text elimination
  • Only record the text at its first appearance.
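
The three multi-frame steps can be sketched as below. Matching by intersection-over-union between consecutive frames and enhancement by pixel-wise averaging are assumptions chosen for illustration, since the slides do not state the matching criterion or the integration operator. The names `iou`, `track_regions`, `integrate` and `min_iou` are hypothetical; boxes are `(left, top, right, bottom)`, inclusive.

```python
def iou(a, b):
    """Intersection-over-union of two inclusive boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def track_regions(frames, min_iou=0.5):
    """Text region matching. frames is a list (one entry per frame) of box
    lists. Each returned track records first_frame once, so repeated
    appearances of the same text are not reported again."""
    tracks, active = [], []
    for t, boxes in enumerate(frames):
        next_active = []
        for box in boxes:
            match = next((tr for tr in active
                          if iou(tr["boxes"][-1], box) >= min_iou), None)
            if match is None:
                match = {"first_frame": t, "boxes": []}
                tracks.append(match)
            else:
                active = [tr for tr in active if tr is not match]
            match["boxes"].append(box)
            next_active.append(match)
        active = next_active
    return tracks


def integrate(crops):
    """Text region enhancement: pixel-wise mean of same-sized crops."""
    n = len(crops)
    height, width = len(crops[0]), len(crops[0][0])
    return [[sum(c[y][x] for c in crops) / n for x in range(width)]
            for y in range(height)]
```

Averaging suppresses a moving background behind static text because the text pixels stay constant across frames while the background varies; other operators such as a per-pixel minimum or maximum could be substituted depending on text polarity.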

20
Thank you!
End