Antonin Durieux - Data Scientist
Identifying factors impacting industrial performance: the case of batch production
For process manufacturing companies and factories that produce by batch, optimising production performance is one of the strategic challenges to ensure their competitiveness. However, the production data reported by the field teams are not always easy to analyse and are sometimes difficult to integrate into a continuous improvement process. As a result, identifying the tangible factors that influence industrial performance is often complex, for lack of appropriate solutions to identify them formally.
In the food, pharmaceutical or chemical industries, industrial processes are organised in batch production. The objective at the end of the chain is to obtain products that comply with previously defined standards, for example in terms of quality. In reality, industrial performance can vary significantly from one batch to another for a similar product. Diagnosing these variations is often difficult because of the sheer number of potential parameters, especially when analysis tools are lacking.
Various influencing factors can be taken into account:
Faced with this diversity of potential factors, connecting the equipment on the production line is essential in order to faithfully describe the production operations thanks to the data collected from the machines. The data is then processed in order to clearly identify the influencing factors responsible for the differences recorded between batches. These indicators will then be made available to those involved in the process, so that they can intervene in the field (for operators or line managers) as well as at management level (for managers or production directors). Where necessary, operational decisions are made to adjust scheduling, batch size or team composition.
The large number of potential factors influencing performance calls for an approach that goes beyond an analysis based solely on business expertise. Machine Learning algorithms are relevant in this context because of their capacity to analyse complex data sets. Before using them, two essential prerequisites must be considered.
Firstly, in order to initiate its learning process, the system must have exhaustive data covering the entire production process. In this case, we collected in a database one year of data from a production line, comprising more than a hundred separate variables from machines as well as shop floor applications. Secondly, a target variable that best illustrates the expected performance must be chosen. In this case study, the indicator chosen is the production rate, expressed in litres produced per minute.
In order to identify the influencing factors, we structured our analysis around two main steps:
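As a minimal sketch of how such a target variable could be derived, the snippet below computes a production rate in litres per minute for each batch. The `Batch` structure, its field names and the sample values are hypothetical illustrations, not data from the case study.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    batch_id: str           # hypothetical batch identifier
    litres_produced: float  # total volume produced by the batch
    duration_min: float     # production time of the batch, in minutes


def production_rate(batch: Batch) -> float:
    """Target variable: litres produced per minute for one batch."""
    if batch.duration_min <= 0:
        raise ValueError(f"batch {batch.batch_id} has a non-positive duration")
    return batch.litres_produced / batch.duration_min


# Illustrative values only.
batches = [
    Batch("B001", litres_produced=12000.0, duration_min=240.0),
    Batch("B002", litres_produced=9000.0, duration_min=200.0),
]
rates = {b.batch_id: production_rate(b) for b in batches}
```

Each batch then contributes one row to the learning data set: its input variables on one side, its production rate on the other.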
Step 1: Ensure the quality of the model prediction by comparing the predicted production rate with actual data.
We first evaluate the performance of the model, i.e. its capacity to correctly predict the target variable from the input variables. This quality is measured by the coefficient of determination, R-squared (R²). In our case, this coefficient reaches 0.67: the model explains about two-thirds of the variance in the production rate, which is sufficiently accurate to give us indications of the influencing factors.
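For reference, the coefficient of determination can be computed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes two plain lists of actual and predicted production rates; the numbers are made up for illustration and are not those of the case study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: variability of the target around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for a constant target")
    return 1.0 - ss_res / ss_tot


# Illustrative production rates in litres per minute.
actual_rates = [48.0, 52.0, 45.0, 55.0, 50.0]
predicted_rates = [49.0, 50.0, 47.0, 53.0, 51.0]
score = r_squared(actual_rates, predicted_rates)
```

A score of 1 corresponds to perfect predictions, while a score of 0 is what a model that always predicts the average rate would obtain. In practice the score should be computed on batches that were held out from training.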
Step 2: Identify the influencing factors
The data are then analysed using a method known as permutation importance, which identifies the influencing factors among the input variables. The analysis is guided by one principle: comparing the predictive performance of the model with and without the information carried by the input variable under consideration, which in practice is removed by randomly shuffling the values of that variable. The larger the resulting drop in performance, the more influential the variable.
When this protocol is followed, the algorithms are able to clearly identify the influencing factors and generate insights that can be used to monitor industrial performance.
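The principle described above can be sketched in a few lines of plain Python. The version below shuffles one input column at a time and records the resulting drop in R-squared. The `predict` function is a hand-written stand-in for a trained model, and the feature names and synthetic data are assumptions made for illustration; none of them come from the case study.

```python
import random


def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, y_true, n_repeats=10, seed=0):
    """Average drop in R-squared when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = r_squared(y_true, [predict(row) for row in rows])
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between column j and the target
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            score = r_squared(y_true, [predict(row) for row in shuffled])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical feature names, for illustration only.
FEATURES = ["batch_volume_l", "changeover_min", "unrelated_sensor"]


def predict(row):
    # Stand-in for a trained model; it ignores the third input on purpose.
    volume, changeover, _unused = row
    return 40.0 + 0.002 * volume - 0.5 * changeover


# Synthetic batches: the target is the stand-in model plus measurement noise.
data_rng = random.Random(42)
rows = [
    [data_rng.uniform(5000, 15000), data_rng.uniform(0, 60), data_rng.uniform(0, 1)]
    for _ in range(200)
]
y_true = [predict(row) + data_rng.gauss(0, 0.5) for row in rows]

importances = permutation_importance(predict, rows, y_true)
ranking = sorted(zip(FEATURES, importances), key=lambda kv: kv[1], reverse=True)
```

A variable the model does not rely on gets an importance of zero, because shuffling it leaves the predictions unchanged. Repeating the shuffle several times and averaging reduces the influence of any single random permutation.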
In our case study, the collection and analysis of the data allowed us to clearly identify the following 5 influencing factors:
In other words, the scheduling of batches, the seasonality of production and the volume produced per batch are the most important factors in explaining differences in industrial performance. These results provide the production workforce with valuable insights for better performance monitoring and execution in the future.
Beyond analyses dedicated to industrial performance, this type of methodology can be applied to other challenges specific to the process industry, such as identifying the factors that influence product compliance at the end of the production chain. Alternative model explainability methods such as Shapley values offer a sound way not only to identify these factors but also to determine the nature of their impact, whether positive or negative. This approach offers new perspectives for identifying production optima.
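To illustrate how Shapley values expose the direction of an effect, here is a minimal sketch that computes them exactly for a single batch by enumerating every subset of input variables. Variables outside a subset are set to a reference value, which is one common convention among several. The `predict` function, the two variables and all numbers are hypothetical, and the enumeration grows exponentially with the number of variables, so with a hundred inputs an approximation from a dedicated library would be needed.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of one prediction relative to a baseline input."""
    n = len(instance)

    def value(subset):
        # Variables in the subset take the batch's values, the others the baseline's.
        mixed = [instance[k] if k in subset else baseline[k] for k in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [k for k in range(n) if k != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of variable i when added to the subset.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def predict(row):
    # Hypothetical stand-in model: rate rises with volume, falls with changeover time.
    volume, changeover = row
    return 40.0 + 0.002 * volume - 0.5 * changeover


batch = [12000.0, 30.0]      # batch to explain: [volume in litres, changeover in minutes]
reference = [10000.0, 20.0]  # reference batch, e.g. average conditions
phi = shapley_values(predict, batch, reference)
```

Here the contribution of the volume is positive and that of the changeover time is negative, and the two contributions add up to the difference between the prediction for this batch and the prediction for the reference batch.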
Identification of optimal performance conditions for a given variable
More generally, the combined contribution of the Internet of Things and Machine Learning in this type of context lies in their capacity to ground in facts the measurement and analysis steps of a continuous improvement process such as DMAIC. First, their combination makes data measurement concrete and continuous. Then, the results from the algorithms complete the analysis phase and the identification of problems, both of which precede the action plan.
For the production workforce, these insights make it easier to make decisions and adjust production operations according to their objectives. These new technologies contribute to the digitalisation and modernisation of continuous improvement methods such as Lean management or Kaizen.
Quantitatively, the average performance gain observed as a result of this type of analysis is significant for process manufacturing companies. We generally observe an improvement in OEE (Overall Equipment Effectiveness) of about 5%. From a qualitative point of view, once this type of analysis is deployed, the industrial performance observed and measured proves to be more stable over time. By shedding new light on production, this method represents a substantial gain for the productivity of industrial companies producing by batch.