Raytheon develops, matures and exploits machine learning to improve designs, enhance performance and raise the quality of its solutions in support of customer missions. Raytheon also uses these advanced cognitive and analytical methods to continually enhance its business operations. Machine learning-powered approaches are now emerging across the company in all disciplines, including Engineering, Operations, Information Technology, Supply Chain and Business Development, helping to improve decision-making and maintain a competitive advantage. 


Raytheon generates a vast amount of data throughout the design, production and deployment of its complex defense systems. At each step, data is generated and retained, capturing details, decisions, measurements and transaction logs associated with products, processes and people. Some examples of data types include design simulations, supplier purchase orders, assembly steps, component tests, final quality assessments and field testing. One area that is particularly data-rich and filled with opportunity to apply analytics and machine learning is the factory floor, where the production, integration and test validation of products are completed. 

Machine learning can provide predictive power and actionable insight across nearly all aspects of manufacturing on the factory floor. Unlike descriptive and diagnostic analytics, which summarize the past and present, predictive analytics enabled by machine learning forecasts future outcomes based on patterns discovered in historical data (Figure 1). Insight into the future enables a business to be less reactive and more proactive in its decision-making. For example, a probabilistic estimate of when a product will fail permits proactive planning before the failure ever occurs. This technical approach can be applied across multiple areas of manufacturing to improve product yields, decrease cycle time, and eliminate redundant testing and processes. It also improves product quality through a lower probability of defects, reduced scrap and decreased rework costs. All of these benefits increase competitive advantage and ensure that high-quality products are delivered quickly and efficiently to customers. 

Figure 1: Analytics capability trade-space

Machine learning tasks fall into three broad categories: supervised, unsupervised and reinforcement learning. Supervised learning is applied to understand the relationship between various inputs and specific outputs. For example, supervised learning can evaluate attribute data about a component (input) to predict the probability of a future defect (output), and ultimately use that information to reduce the likelihood of occurrence or eliminate it entirely. Unsupervised learning is applied to discover patterns or hidden structure in input data where the output is unknown; for instance, it can be used to cluster or detect anomalies within product quality data. Lastly, reinforcement learning is applied to find the ideal behavior within a given situation in order to maximize a reward; for example, it can be used to maximize physical space within a factory layout. Supervised and unsupervised learning have the most direct applicability to the manufacturing environment, although reinforcement learning may have utility for certain applications. Both supervised and unsupervised machine learning use iterative algorithms designed to learn continually and seek optimized outcomes as more data is incorporated. These algorithms can detect patterns across intricate datasets in seconds, where an analyst may require weeks to make the same determination. Further, the specific algorithms employed are chosen for their higher interpretability, or explainability (the understanding of why an algorithm makes a specific prediction), as shown in Figure 2. This understanding provides more actionable insights for daily operations.
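The distinction between the first two categories can be illustrated with a short sketch. The data, column meanings and pass/fail rule below are invented for illustration only; they do not come from any Raytheon system.

```python
# Illustrative sketch of the supervised vs. unsupervised distinction on
# hypothetical component-test data. All values and thresholds are synthetic.
import numpy as np
from sklearn.cluster import KMeans                # unsupervised: no labels
from sklearn.tree import DecisionTreeClassifier   # supervised: labeled outcomes

rng = np.random.default_rng(0)
# Two hypothetical test measurements per component.
measurements = rng.normal(loc=[0.0, 5.0], scale=1.0, size=(200, 2))

# Unsupervised: discover structure (groups of similar test values)
# without any pass/fail labels at all.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(measurements)

# Supervised: learn a mapping from measurements (input) to a known
# pass/fail outcome (output); here the outcome follows a synthetic rule.
labels = (measurements[:, 0] + measurements[:, 1] > 5.0).astype(int)
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(measurements, labels)

new_part = [[0.5, 5.5]]                 # a hypothetical new component
defect_prediction = model.predict(new_part)
```

The key difference is visible in the calls themselves: the clustering step receives only the measurements, while the classifier also receives the known outcomes it must learn to reproduce.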

Figure 2: Tradeoffs between accuracy and explainability of techniques


At Raytheon, as each part is manufactured or purchased, each component assembled, and each subsystem integrated and tested on the factory floor, data is generated and stored in business and enterprise information systems. On a high-rate production program with multiple domestic and international customers, slices of the data are regularly used to construct reports and metrics, ensuring that the manufactured products meet their quality and on-time delivery requirements. Large-scale integration and analysis of this data is typically difficult due to historically siloed information systems, expensive storage costs, and discipline-specific nomenclature across data sources. However, with the application of machine learning in projects such as DREAMachine (Defect/Test Reduction Empowered by Analytics and Machine Learning), the cost-benefit tradeoff shifts due to the ease of applying machine learning techniques and the broader insight they provide. Machine learning makes it possible to quickly sift through vast amounts of information, recognize complex patterns and predict future outcomes to support data-driven decision making. 
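The nomenclature problem described above can be made concrete with a small sketch: two siloed systems refer to the same part under different field names, so the sources must be mapped onto a common schema before they can be joined. All table and column names here are hypothetical stand-ins, not actual Raytheon system fields.

```python
# Hedged sketch: harmonizing discipline-specific nomenclature and joining
# siloed sources on a shared key. All names and values are hypothetical.
import pandas as pd

# A test system and a quality system identify the same unit differently.
test_records = pd.DataFrame({
    "serial_no": ["SN-001", "SN-002", "SN-003"],
    "gain_db": [12.1, 11.8, 12.4],
})
quality_records = pd.DataFrame({
    "unit_serial": ["SN-001", "SN-002", "SN-004"],
    "fault_code": [None, "F-17", "F-03"],
})

# Map each source's naming onto a common schema before joining.
quality_records = quality_records.rename(columns={"unit_serial": "serial_no"})

# An outer join preserves records that exist in only one system; the
# indicator column flags exactly those gaps between the siloed sources.
merged = test_records.merge(
    quality_records, on="serial_no", how="outer", indicator=True
)
```

In this toy example, SN-003 appears only in the test system and SN-004 only in the quality system; the `_merge` indicator surfaces both gaps automatically, which is itself useful diagnostic information when reconciling silos.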

Overview of the Technology
DREAMachine applies machine learning to traditionally siloed data sources to achieve a whole-systems view focused on reducing testing and predicting future defects (Figure 3). The execution of testing and rework of defects account for a large portion of the cost of manufacturing products; therefore, any opportunity to reduce these costs can have a sizable impact on a business. DREAMachine extracts data from business warehouses and enterprise systems, automates the data integration across sources, and builds upon open-source analytics and machine learning software libraries. It employs a modular architecture, where additional data sources are easily integrated as new information emerges, to achieve greater predictive accuracy and deeper systems-level understanding.

Figure 3: Whole system view versus traditional single component view

First, DREAMachine imports production process data from multiple information systems, such as enterprise resource planning (ERP) systems, databases and servers. This data can represent parametric test data from components, quality fault codes, work operation orders, supplier data and other production-related information. Next, the data is filtered and joined to relate information across disciplines and through all levels of components and systems in the product structure. Finally, exploratory data analyses, including unsupervised machine learning approaches (e.g., k-means and hierarchical clustering, principal components analysis, linear discriminant analysis), are performed. These methods and algorithms identify meaningful new groupings that point to potential opportunities to improve testing procedures and operational processes. For example, k-means clustering can be applied to component and system test values to identify clusters of similar values and highlight areas of redundancy. Then, supervised machine learning methods (e.g., decision trees, gradient boosting, random forests, support vector machines, naïve Bayes) are applied to predict future defects at both the component and system levels (Figure 4). Specifically, parameters such as test values, time of day the operation occurred, location of the test chamber and quality attributes are used to probabilistically predict if and when a failure will occur. In other words, historical data about the tests and operations is used as input to a model, for example an ensemble model such as gradient boosting, which learns patterns across the data from which future pass/fail predictions can be made.
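The two analysis stages described above can be sketched in miniature: k-means clustering over test measurements to surface candidate redundancy, followed by a gradient-boosted classifier producing per-unit failure probabilities. The data, failure rule and feature choices are synthetic assumptions for illustration; they are not taken from DREAMachine itself.

```python
# Sketch of the two-stage analysis: (1) cluster test operations to surface
# redundancy, (2) predict pass/fail with gradient boosting. All synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500

# Stage 1: two test operations that measure nearly the same quantity will
# cluster together, hinting that the second test may be redundant.
test_a = rng.normal(10.0, 1.0, n)
test_b = test_a + rng.normal(0.0, 0.05, n)   # near-duplicate of test_a
test_c = rng.normal(3.0, 1.0, n)             # independent measurement
# Cluster the tests themselves (rows = tests, standardized values).
tests = np.vstack([
    (t - t.mean()) / t.std() for t in (test_a, test_b, test_c)
])
test_clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(tests)
# test_a and test_b land in one cluster; test_c stands alone.

# Stage 2: predict pass/fail from production parameters, including a
# hypothetical "hour of day" feature like the ones mentioned above.
features = np.column_stack([test_a, test_c, rng.integers(0, 24, n)])
fail = (test_c < 2.0).astype(int)            # synthetic failure rule
X_train, X_test, y_train, y_test = train_test_split(
    features, fail, random_state=0
)
clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
fail_probability = clf.predict_proba(X_test)[:, 1]  # per-unit failure probability
```

Note that the classifier outputs a probability rather than a bare pass/fail label, which is what enables the probabilistic "if and when" planning discussed earlier.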

Figure 4: Overview of RIC DREAMachine generalizable framework

In the initial stage of DREAMachine development, the project team’s analysts partnered with a production program to create the use case and ensure that the implementation added value to key decision makers on the shop floor. The supervised learning methods applied to predict failures at the component and system levels achieved accuracies of up to 99%. More significantly, the unsupervised learning methods identified areas of redundancy in the test flow, highlighting opportunities for process optimization. 

Early on, test and reliability engineers suspected that the elimination of certain testing operations could speed up the production line while maintaining strong quality standards. DREAMachine integrated traditionally siloed historical data and analyzed the dataset, validating that a lengthy series of test operations provided redundant information without additional benefit or insight, and could be reduced or eliminated. Further, the analysis determined that the suggested process optimizations could increase production capacity by as much as 40%, potentially saving millions of dollars once enacted on the production line. 

Next Steps
In close collaboration with a Raytheon program, the DREAMachine team developed and applied machine learning methods that reduce redundant testing and predict future defects, enhancing critical production metrics. The next steps include continued testing and validation across other programs. The reusable DREAMachine framework will be applied to accommodate lower-volume production programs, as well as programs with markedly different product features. Further evolution of DREAMachine will improve the algorithms and fine-tune overall application performance. Feature enhancements and additional program data will improve prediction accuracies and provide greater insight into the testing and quality of Raytheon’s products. Aside from simply adding more historical records, the ability to combine data across multiple programs is a powerful advantage. For instance, patterns related to a particular component may be undetectable on a single program, but when data from the same or similar components is combined across multiple programs, the patterns become stronger and more readily detected, enabling more accurate predictions. The ultimate integration and analysis of data across all Raytheon programs and businesses will provide new and actionable insight at the enterprise level for enhancing the production of complex defense systems. 


As the manufacturing space continues to experience an unprecedented increase in available data, the opportunities for applying and gaining value from machine learning will grow. Known as “Industry 4.0” or “Smart Manufacturing,” the current industrial movement aims to introduce more automation and data generation into manufacturing systems by exploiting the Internet of Things (IoT), cyber-physical systems, cloud infrastructure and cognitive analytics. This data encompasses a variety of formats, semantics, quality levels and sources. For example, it could include sensor data from a robotics line, environmental data for the factory building, tooling calibration and maintenance records, operation timing or assembler training history. This increase in data creates opportunities for machine learning analysis to gain efficiencies throughout the factory. 

Some machine learning applications that are already under development across Raytheon manufacturing include production schedule optimization, factory floor layout planning via reinforcement learning, predictive maintenance of machines, and automated quality inspections using image processing empowered by convolutional neural networks. In conjunction with additional data sources, supervised machine learning models can be used to analyze and improve key business metrics, such as on-time delivery performance (probability of delivering products on-time across specific scenarios), cost forecasting (probability that the budget will be met given new constraints), and contract wins (probability of winning a specific contract and the associated risks).

In conclusion, machine learning-powered approaches have the potential to improve all aspects of the factory and business. As a leader in the defense industry, Raytheon continues to apply advanced analysis methods to understand the past, present, and future state of its factories, products, and businesses — making optimal, data-driven decisions and improving competitive advantage.

– Kimberly Kukurba, Ph.D.