Technology Today

2012 Issue 2

Raytheon's Multimedia Monitoring System enables the automated collection, translation and analysis of open source multilingual data.


Language is at the heart of human communications and language can save lives. Words have infuriated populations, created misunderstandings, fueled hatred, and built barriers across cultures and civilizations. At the same time, words have also opened doors, reconciled groups, gained peace and created laws for justice, order, democracy and freedom. Accuracy and, more importantly, the meaning of a word in context, matter. The U.S. Department of Defense (DoD) recognizes that purely manual approaches to understanding the discussions happening around the globe would render us deaf to important messages and themes, and leave us without a voice in the ongoing discussion.

Advances in automated human language processing technologies such as speech recognition and machine translation have opened up new capabilities in processing traditional media forms such as broadcast television and Web content. The GALE program (Global Autonomous Language Exploitation), funded by the DoD's Defense Advanced Research Projects Agency (DARPA), focused on improving the state of the art of automatic machine translation. Numerous human language technology development programs at the Combating Terrorism Technical Support Office/Technical Support Working Group (CTTSO/TSWG) have transitioned these advances to operational use. In 2003, the first operational, fully automated media monitoring systems were deployed.

Multimedia Monitoring System

Raytheon BBN Technologies' (BBN) commercial off-the-shelf Multimedia Monitoring System (M3S) delivers an end-to-end capability for monitoring, translating, storing and searching a wide variety of open source media across a range of languages. M3S provides users with real-time understanding of news, events, and perceptions around the world. The system can be configured to process any combination of inputs (e.g., analog broadcast, Web text, file-based media) and support any number of channels and users, giving English speakers direct access to foreign language multimedia from a single browser-based user interface.

Distributed Architecture

M3S is designed as a distributed Web-based system that allows its components and services to be optimally located in proximity to its media sources and human users. The distributed design, shown in Figure 1, also supports easy expansion, robust failure recovery and inter-agency data sharing.

Figure 1

M3S is delivered as a turnkey system, with all hardware and software, and is composed of the networked components described below, which are also available as separate turnkey offerings. Additionally, the core analytics can be unbundled and integrated into other third-party systems either at the operational level or at the data level.

BBN Broadcast Monitoring System™ (BMS)

BMS processes television and radio sources to create a continuous, searchable, one-year archive of audio/video and its associated machine transcription and translation metadata for every channel. An audio stream from ingested video is automatically processed through the BBN Audio Monitoring Component (AMC), which is the core software solution. AMC includes an integrated pipeline of advanced natural language processing (NLP) technologies such as speech recognition, speaker tracking, sentence boundary detection, named-entity recognition, and machine translation. BMS also includes BBN's state-of-the-art optical character recognition (OCR) component for the automatic detection and transcription of on-screen text in video images.
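The staged processing described above can be pictured as a chain of functions that each add metadata to a segment of audio. The following Python sketch is purely illustrative: the Segment class, the stage functions and the toy French glossary are hypothetical stand-ins and do not represent the actual AMC software or its interfaces. It shows only the ordering of the stages named in the text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Segment:
    """One stretch of audio plus the metadata accumulated for it (hypothetical)."""
    audio_id: str
    transcript: str = ""
    speaker: str = ""
    sentences: List[str] = field(default_factory=list)
    entities: List[str] = field(default_factory=list)
    translation: List[str] = field(default_factory=list)


Stage = Callable[[Segment], Segment]


def recognize_speech(seg: Segment) -> Segment:
    # Placeholder: a real recognizer would decode the audio itself.
    seg.transcript = "le president arrive a paris. il parle a la presse."
    return seg


def track_speaker(seg: Segment) -> Segment:
    # Placeholder: assign a single speaker label to the whole segment.
    seg.speaker = "speaker_1"
    return seg


def detect_sentences(seg: Segment) -> Segment:
    # Toy boundary detection: split on periods and drop empty pieces.
    seg.sentences = [s.strip() for s in seg.transcript.split(".") if s.strip()]
    return seg


def tag_entities(seg: Segment) -> Segment:
    # Toy named-entity recognition: look words up in a tiny known-name set.
    known_names = {"paris"}
    seg.entities = [w for s in seg.sentences for w in s.split() if w in known_names]
    return seg


def translate(seg: Segment) -> Segment:
    # Toy word-for-word glossary, standing in for machine translation.
    glossary = {"le": "the", "arrive": "arrives", "a": "at", "il": "he",
                "parle": "speaks", "la": "the", "presse": "press"}
    seg.translation = [" ".join(glossary.get(w, w) for w in s.split())
                       for s in seg.sentences]
    return seg


def run_pipeline(seg: Segment, stages: List[Stage]) -> Segment:
    # Each stage receives the output of the previous one.
    for stage in stages:
        seg = stage(seg)
    return seg


PIPELINE = [recognize_speech, track_speaker, detect_sentences,
            tag_entities, translate]
result = run_pipeline(Segment("clip-001"), PIPELINE)
```

Because every stage takes and returns the same record, stages can be reordered, replaced or extended without changing the rest of the chain.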

Figure 2

Figure 2 shows the home screen of the BMS, the Channel Overview. This screen contains a search box, a list of persistent queries (the watchlist), and a thumbnail overview of the available channels. The overall application operates in real time, producing content that is available for search no more than three minutes after it is aired. BMS is available in 15 languages, including Arabic, Mandarin Chinese, Farsi, Russian, French, Hindi, Urdu, Cantonese and Pashto. With this revolutionary system, English speakers with little or no foreign language skills can get the essence of a foreign TV/radio broadcast and sift through enormous volumes of media in other languages quickly and efficiently.

BBN Web Monitoring System™ (WMS)

The WMS processes website sources, capturing content from user-selected websites on a schedule specified by the user organization (Figure 3). The captured sites are translated, archived and versioned for later use, so there is always a local copy available, even when the page has disappeared from the active Web. Internal links are preserved in the harvested Web pages so users can navigate within the archived sites. English speakers can obtain a basic understanding of an ingested Web page by reading the English translation automatically produced by the Language Weaver MT software (available in over 30 languages), then reach through the network for human translation support when a higher-quality translation is desired.
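The idea of a versioned local archive can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the WMS design: the VersionedArchive class, its method names and the choice to skip a capture whose content hash matches the previous version are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """One captured version of a page (hypothetical record layout)."""
    url: str
    captured_at: str
    content: str
    digest: str


class VersionedArchive:
    """Keeps every distinct version of each captured URL, oldest first."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Snapshot]] = {}

    def capture(self, url: str, content: str, captured_at: str) -> bool:
        """Store a capture; return False if it matches the latest version."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        history = self._versions.setdefault(url, [])
        if history and history[-1].digest == digest:
            return False  # assumption: unchanged pages do not create a version
        history.append(Snapshot(url, captured_at, content, digest))
        return True

    def latest(self, url: str) -> Optional[Snapshot]:
        """Return the newest stored version, or None if never captured."""
        history = self._versions.get(url, [])
        return history[-1] if history else None

    def versions(self, url: str) -> List[Snapshot]:
        """Return a copy of the full version history for a URL."""
        return list(self._versions.get(url, []))


# Example with made-up captures of a placeholder URL.
archive = VersionedArchive()
archive.capture("http://example.org/news", "first edition", "2012-01-01T06:00")
archive.capture("http://example.org/news", "first edition", "2012-01-01T12:00")
archive.capture("http://example.org/news", "second edition", "2012-01-02T06:00")
```

In this example the archive holds two versions of the page, and the latest one remains retrievable locally whether or not the live page still exists.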

Operational Use

BBN's Multimedia Monitoring Systems are widely deployed across a diverse set of user groups and are used to produce a rich set of information products. Although all customers have essentially the same M3S capabilities, their applications are very diverse. Furthermore, the simplicity of the user interface has allowed fielded systems to be used in ways not envisioned when the system was developed. Intelligence and learning applications are described below.


The J2 Open Source Intelligence (OSINT) group at the United States Central Command (USCENTCOM) in Tampa, Fla., has been using both broadcast monitoring and Web monitoring capabilities since 2004. Initially staffed solely with monolingual English-speaking analysts, this group was tasked with reporting how foreign media portrayed U.S. actions in the Middle East. Since BBN multimedia monitoring systems were deployed, the group has increased its reporting from three short products per week to more than a thousand requests for information (RFIs) per year, drawing information from open sources, using linguists to enrich the translation and to provide cultural background and insight. Based on these successes, BBN's media monitoring solutions have spread to OSINT groups at other combatant commands (COCOMs) and the broader DoD.

Language Learning

The DoD devotes considerable resources to teaching people to speak other languages, and soldiers who have successfully mastered a second language are incentivized to sustain their fluency. The creation of current, authentic learning content for the classroom is a laborious task, and instructors may spend hours or days preparing a short video clip for a single lesson. As part of an effort funded by the Combating Terrorism Technical Support Office (CTTSO), BBN enhanced its broadcast and Web monitoring systems to act as a platform for rapid content creation for language instructors. Teachers can locate a clip about a particular topic or theme, use a built-in editor to transform the media into lessons such as fill-in-the-blank, matching or a flashcard set, and distribute the lessons to students on the desktop or a handheld device. These capabilities have been deployed at the Defense Language Institute and other DoD and intelligence community learning centers.

Future Capabilities

The media monitoring suite of technologies is now a platform for deploying new human language analytics and technologies to operational users, as well as for applying them to new and varied media types.

Automatic Topic Detection and Tracking allows processed media assets to be automatically labeled with human-readable tags, and then grouped with other, similar media assets. This allows analysts to rapidly find content using a "more like this" style of search.
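A "more like this" search can be approximated by comparing word-count vectors with cosine similarity. The Python sketch below uses that common, generic technique for illustration only; the article does not say which method BBN's topic detection uses, and the function names and sample assets are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def more_like_this(query_id: str, assets: Dict[str, str],
                   top_n: int = 2) -> List[str]:
    """Rank the other assets by similarity to the chosen one."""
    query = vectorize(assets[query_id])
    scored = [(cosine(query, vectorize(text)), asset_id)
              for asset_id, text in assets.items() if asset_id != query_id]
    # Highest score first; ties broken by asset id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [asset_id for score, asset_id in scored[:top_n] if score > 0]


# Made-up English summaries standing in for processed media assets.
assets = {
    "a1": "election results announced in the capital",
    "a2": "capital city awaits election results",
    "a3": "football team wins the championship",
}
similar = more_like_this("a1", assets)
```

Here the asset that shares the words "election", "results" and "capital" with the query ranks ahead of the sports story, which overlaps only on a common function word. A practical system would weight or remove such words.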

Entity Profiles automatically build biographies of a person or organization from open source information. Biographies might include statements made by or about the specified person, links to other entities and past/present affiliations. The profiles will enable users to quickly discover new specific facts, rather than dredging through unrelated content.

LocalISR™ allows thematic geotagging of news articles, deconflicting common place names (e.g., Georgia, United States, and Georgia, the former Soviet republic) and assigning the most relevant location to a news story.
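One simple way to resolve an ambiguous place name is to look for context words that point to one candidate location over another. The Python sketch below illustrates this with a tiny hand-built gazetteer. The GAZETTEER table, its cue words and the resolve_place function are hypothetical and are not drawn from LocalISR.

```python
import re
from typing import Optional

# Hypothetical gazetteer: each ambiguous name maps to candidate locations,
# each with a few context words that suggest it.
GAZETTEER = {
    "georgia": [
        {"name": "Georgia, United States",
         "cues": {"atlanta", "savannah", "governor"}},
        {"name": "Georgia (country)",
         "cues": {"tbilisi", "caucasus", "batumi"}},
    ],
}


def resolve_place(place: str, text: str) -> Optional[str]:
    """Pick the candidate whose cue words appear most often in the text.

    Returns None when the name is unknown or no cue word is present,
    rather than guessing.
    """
    candidates = GAZETTEER.get(place.lower(), [])
    if not candidates:
        return None
    words = set(re.findall(r"[a-z]+", text.lower()))
    # max() keeps the first candidate on a tie; a real system needs a better rule.
    best = max(candidates, key=lambda c: len(c["cues"] & words))
    if not best["cues"] & words:
        return None
    return best["name"]


country = resolve_place("Georgia", "Protesters gathered in Tbilisi on Tuesday.")
state = resolve_place("Georgia", "The governor spoke in Atlanta.")
unknown = resolve_place("Georgia", "Officials declined to comment.")
```

Returning no answer when the context gives no evidence is a deliberate choice in this sketch, since a wrong location tag is likely to mislead more than a missing one.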

Automated Sentiment Analysis determines whether a block of text expresses opinions for or against a particular entity or topic, and identifies who is expressing that opinion.

The capabilities and advances described here will enable more effective use of open source data for collecting information from emerging Internet and public news media sources worldwide. A few years ago, English was by far the dominant lingua franca of the online world. Today, the majority of new online content is in languages other than English. As a result, machine translation technology is poised to play a key role in the years to come.

Premkumar Natarajan and Amit Srivastava
