VISER Multimedia Event Detection System
The Raytheon BBN VISER Multimedia Event Detection System detects events of interest in a video collection by drawing on the visual, audio, and text content of each video. VISER identifies visual features such as appearance, color, and motion. It also identifies high-level semantic information obtained from detecting scene and object concepts (e.g., “child”, “umbrella”), as well as action concepts (e.g., “swimming”).
Using BBN’s state-of-the-art automatic speech recognition and optical character recognition engines, VISER combines spoken and videotext content with visual features, along with the audio-visual co-occurrence patterns in the video.
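One common way to combine evidence from several modalities is late fusion, where each modality produces its own event score and the scores are then merged. The sketch below is illustrative only and is not a description of VISER's actual method: the function name `fuse_scores`, the modality names, the weights, and the scores are all hypothetical.

```python
from typing import Dict


def fuse_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-modality event scores (hypothetical late fusion).

    Modalities without an entry in `weights` get weight 0 and are ignored.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for modality, score in scores.items():
        weight = weights.get(modality, 0.0)
        weighted_sum += weight * score
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("no modality has a non-zero weight")
    return weighted_sum / total_weight


# Made-up scores for a single video and a single event of interest.
scores = {"visual": 0.8, "speech": 0.6, "videotext": 0.4}
# Made-up weights; in practice these would be tuned on held-out data.
weights = {"visual": 2.0, "speech": 1.0, "videotext": 1.0}

fused = fuse_scores(scores, weights)
print(f"fused event score: {fused:.2f}")
```

A weighted average keeps the fused score on the same scale as the inputs, and a modality with no usable signal (for example, a video with no speech) can simply be left out of `scores`.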
Beyond these human language technologies, VISER lets users annotate videos to provide positive examples of events. These examples give the system additional, event-specific information that helps it identify events within the content.