Newshub - NUS' News Portal

Automated video text detection allows navigation

12 June 2013

The new image processing technique automates video scene text detection, allowing locations to be navigated via the tracking of road, shop and bus stop signs
Photo: The Straits Times © Singapore Press Holdings

Members of the team: (from left) Dr Lu, Dr Shivakumara, Prof Tan, Mr Tian, Dr Su and Mr Phan

Finding and recognising text in videos has always been a challenge for developers, with existing solutions less than ideal in accuracy and capability. Now, NUS researchers have come up with a new image processing technique, a multistep method that automates the detection of scene text in video.

The fresh approach will allow locations to be navigated via the tracking of road, shop and bus stop signs. In addition, the machine translation capability enables road sign identification, a useful feature for travellers and tourists. If coupled with voice rendering, the solution can also help the visually impaired.

Currently, it is difficult for a computer to tell where to find text in a video frame, said Professor Tan Chew Lim from the NUS Department of Computer Science, who helmed the study. There is also interference from the underlying background in the text region.

"We therefore devised an image processing technique based on mathematical functions to detect areas in a video image that exhibit rapid and repetitive changes in the image signal. At the same time, another mathematical function - masking - is deployed to suppress interfering signal that behaves in a less regular and weaker pattern, enhancing the contrast between text and its background. Another process - clustering - will then segregate text from non-text regions," explained Prof Tan.

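The three steps Prof Tan describes (finding rapid, repetitive signal changes, masking weaker interference, and clustering text from non-text) can be illustrated with a small Python sketch. This is not the NUS team's method, whose mathematical functions are not specified here. It is a simplified stand-in that assumes a greyscale frame held in a NumPy array, uses local gradient energy as the measure of "rapid change", a plain threshold as the mask, and a two-cluster k-means as the clustering step. The function names and the parameters window, mask_ratio and iters are hypothetical.

```python
import numpy as np


def box_mean(a, k):
    """Mean over a k x k window (k odd, edge-padded), via an integral image."""
    pad = k // 2
    p = np.pad(a, pad, mode="edge")
    c = np.pad(p, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = a.shape
    total = c[k:k + h, k:k + w] - c[:h, k:k + w] - c[k:k + h, :w] + c[:h, :w]
    return total / (k * k)


def detect_text_mask(gray, window=5, mask_ratio=0.2, iters=20):
    """Return a boolean mask of text-like pixels (illustrative assumptions only)."""
    gray = gray.astype(float)

    # Step 1: measure rapid changes as local absolute-gradient energy.
    gx = np.abs(np.diff(gray, axis=1, prepend=gray[:, :1]))
    gy = np.abs(np.diff(gray, axis=0, prepend=gray[:1, :]))
    energy = box_mean(gx + gy, window)

    # Step 2: "masking" stand-in, zeroing responses much weaker than the peak.
    energy[energy < mask_ratio * energy.max()] = 0.0

    # Step 3: two-cluster k-means on the energy values (text vs non-text).
    lo, hi = energy.min(), energy.max()
    is_text = np.zeros(energy.shape, dtype=bool)
    for _ in range(iters):
        is_text = np.abs(energy - hi) < np.abs(energy - lo)
        if is_text.all() or not is_text.any():
            break
        lo, hi = energy[~is_text].mean(), energy[is_text].mean()
    return is_text
```

Calling detect_text_mask(frame) on a frame containing a high-contrast striped patch over a nearly flat background should mark the patch as text and leave the background unmarked. Real video frames would need far more robust features than this sketch provides.
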
The NUS team consists of Computer Science Research Fellows Dr Palaiahnakote Shivakumara and Dr Su Bolan, and PhD students Mr Tian Shangxuan and Mr Phan Quy Trung. Dr Lu Shijian from the Institute for Infocomm Research is also a collaborator in the project.

Looking ahead, the researchers plan to overcome the problems that hamper text detection in video images, such as low resolution, reduced sharpness, visual obstruction, and perspective distortion arising from skewed viewing angles and non-horizontal text alignment.

Prof Tan, whose research interests include document image analysis, text mining and natural language processing, shared that work on enhancing recognition after text detection is also under way. This could eventually lead to an efficient implementation of the method in detection software, which may be installed on mobile devices.