Multi‑Condition Identification of Thermal Process Data Based on Mixed Constraints Semi‑Supervised Clustering
Abstract
The multi-model method is used in complex system modeling and industrial data monitoring; its goal is to establish sub-models corresponding to different operating conditions. How to divide the modeling data into datasets corresponding to different conditions when prior knowledge of the process is insufficient is an important problem to be solved. Machine learning provides many excellent algorithms for condition identification; however, many of them are easily affected by abnormal data, and making full use of prior knowledge can effectively mitigate this problem. Based on semi-supervised clustering, mixed constraints comprising a composite distance and pairwise constraints are introduced to distinguish strongly and weakly dependent data and to clarify boundary data, realizing a multi-model condition division of thermal process data. A Radial Basis Function neural network (RBFNN) is used to realize feature learning, and an online condition recognizer is constructed. The influence of the network structure and parameters on the generalization ability of the recognizer is analyzed. Without significantly increasing the amount of calculation, the generalization ability is improved by adjusting the weight coefficients in the composite distance, and the weight coefficients achieving a low error rate are found by particle swarm optimization. Compared with classic methods such as Sliding Windows, Bottom-Up and Top-Down, the proposed method achieves better segmentation results.
Keywords Semi-supervised clustering · Mixed constraints · Composite distance · Polynomial fit · Online condition
recognizer · RBFNN
* Yue Zhang, [email protected] | 1Department of Automation, North China Electric Power University,
Lianchi District, Baoding 071003, China. 2Hebei Technology Innovation Center of Simulation and Optimized Control for Power Generation,
North China Electric Power University, Baoding 071003, Hebei Province, China.
Research Article SN Applied Sciences (2022) 4:194 | https://doi.org/10.1007/s42452-022-05076-y
learning has been most widely used; its main advantage is that it does not rely on prior knowledge, but it also has certain shortcomings, especially in the case of working-condition or modal boundary data.

This paper considers the time-sequence relationship of the object under study from the perspective of semi-supervised clustering, and proposes a hybrid constraint that combines pairwise constraints and time constraints to improve the identification accuracy of working conditions or modes in the boundary area. The simulation results show that semi-supervised clustering based on mixed constraints has higher accuracy in condition identification, especially for boundary data.

This paper is organized into five major sections. Firstly, the background knowledge of the paper is introduced, including the cost of condition division and the traditional methods. Secondly, the semi-supervised clustering with mixed constraints is introduced in detail. Thirdly, an online recognizer based on RBFNN is designed. Fourthly, the method in this paper is compared with the traditional methods by simulation. The final section presents the key conclusions and limitations of this work, while offering future directions for research that could advance the current body of knowledge on this subject.

2 Background knowledge

2.1 Cost of condition division

Process data condition partition is equivalent to the problem of multivariate time series segmentation. In essence, it means that for a given k-dimensional time series X = {x_1, x_2, …, x_T}, with x_t = (x_{1t}, x_{2t}, …, x_{kt})^T, the time domain is divided according to the change law of the data and the correlation between earlier and later samples.

Assuming that the time series is divided into N segments, the boundary time labels of the segmentation result are defined as t = {t_1, t_2, …, t_N}; the segmentation result t satisfies 0 < t_1 < t_2 < ⋯ < t_N = T.

In the problem of time series segmentation, t_1, t_2, …, t_N are called segmentation boundaries or mutation points; [t_1, t_2], [t_2 + 1, t_3], [t_3 + 1, t_4], …, [t_{N−1} + 1, t_N] are called segments; and the number of segments N is called the segmentation order [17].

Thermal data condition partition, or time series segmentation, can be described as an optimization problem. The overall cost of segmentation is J(t), with

J(t) = Σ_{i=0}^{N−1} d_{t_i + 1, t_{i+1}}   (1)

where d_{s,t} (0 ≤ s < t ≤ T) is the segmentation error of the segment [s, t]. It is a local error, determined by the data in the time series segment {x_s, x_{s+1}, …, x_t},

d_{s,t} = Σ_{τ=s}^{t} (x_τ − x̂_τ)^T (x_τ − x̂_τ)   (2)

where x̂_τ is the estimated value.

Condition partition or segmentation is therefore not a simple single-objective optimization problem: on the basis of ensuring the overall segmentation cost, it is necessary to make the local segmentation costs as close to each other as possible and keep them at a low level. As mentioned above, the multivariate time series segmentation problem is thus transformed into a constrained optimization problem or a multi-step optimization problem.

2.2 The traditional methods for multi-condition identification

The traditional methods for multi-condition identification include artificial judgment based on prior knowledge and machine assistance based on recursive methods. Among them, the better-known methods are Sliding Windows (SW), Top-Down, Bottom-Up, Sliding Window and Bottom-Up (SWAB), and Feasible Space Window (FSW).

Sliding Windows (SW) [5]: the algorithm determines the width of a potential segment recursively. It anchors the left point at the first data point, then attempts to approximate the data to the right with increasingly longer segments. When, at some point i, the error grows greater than the user-specified threshold, the subsequence from the anchor to i − 1 is turned into a segment. The anchor is then moved to location i, and the process repeats until the entire time series has been transformed into a piecewise linear approximation.

Top-Down [5]: the algorithm considers every possible partitioning of the series and splits it at the best location. Both subsections are then tested to see whether their approximation error is below a user-specified threshold. If not, the algorithm recursively continues to split the subsequences until all segments have approximation errors below the threshold.

Bottom-Up [5]: the algorithm first creates the finest possible approximation of the time series, so that n/2 segments are used to approximate an n-length time series. Next, the cost of merging each pair of adjacent segments is calculated, and the algorithm iteratively merges the lowest-cost pair until a stopping criterion is met.

Sliding Window and Bottom-Up (SWAB) [5]: the algorithm keeps a small buffer. Bottom-Up is applied to the data in the buffer, and the leftmost segment is reported. The data corresponding to the reported segment is removed from the buffer and more data points are read in. These points are incorporated into the buffer and Bottom-Up is applied again; this process of applying Bottom-Up to the buffer and reporting the leftmost segment is repeated until the whole series has been processed.

Feasible Space Window (FSW) [6]: the algorithm introduces a point called a Candidate Segmenting Point (CSP), which may be chosen as the next eligible segmenting point: the distances of all the points lying between the last segmenting point and the newly chosen one must all be within the maximum error tolerance. The key idea of FSW is to search for the farthest CSP, so as to make the current segment as long as possible under the given maximum error tolerance.

The comparison of the above algorithms is shown in Table 1.

2.3 Semi-supervised clustering method

Unlike traditional unsupervised clustering algorithms, such as the K-means algorithm and the expectation–maximization (EM) algorithm, semi-supervised clustering combines clustering and semi-supervised learning to improve the clustering performance using a small amount of labeled data and prior knowledge within massive data. Semi-supervised clustering algorithms can be divided into constraint-based, distance-based, and constraint- and distance-based semi-supervised clustering algorithms.

2.3.1 Semi-supervised clustering based on seed sets

Given a labeled sample set L with label space Y, let S be a subset of the data such that |S| is the size of S and, for each x_i ∈ S, there exists y_i ∈ Y with (x_i, y_i) ∈ L; S is then called the seed set. In particular, when the number of categories to which the samples in S belong equals k, S can be expressed as S = ∪_{i=1}^{k} S_i, where S_i is the non-empty sample set of class i.

Although the semi-supervised clustering algorithm based on seed sets can effectively improve the clustering performance, it depends heavily on the scale and quality of the seed set. Guo Maozu and Deng Chao [20] introduced a semi-supervised clustering algorithm based on tri-training and data editing, and combined it with the depuration data-editing technology to correct and purify the mislabeled samples in the seed set while expanding its size, thereby improving its quality.

The Cop-K-means algorithm introduces the idea of pairwise constraints into the traditional K-means algorithm. During the data assignment process, data objects must satisfy the Must-link (ML) constraints and Cannot-link (CL) constraints: under ML, two selected points must belong to the same class, and under CL, two selected points must not belong to the same class. The constraints have symmetry and transitivity characteristics; symmetry is expressed as follows:

(x_i, x_j) ∈ ML ⇒ (x_j, x_i) ∈ ML

(x_i, x_j) ∈ CL ⇒ (x_j, x_i) ∈ CL
clustering accuracy and better results can be obtained using a smaller amount of pairwise constraint information.

2.3.2 Semi-supervised clustering combining multiple methods

Chen [23] and Chang [24] have suggested that a semi-supervised clustering algorithm can simultaneously use two types of supervision information, such as class labels and pairwise constraints, for clustering; especially when active learning is added to actively label samples, higher-quality supervision information and better clustering results can be obtained.

Wei et al. [25] proposed a semi-supervised clustering method based on pairwise constraints and measures. For data marked by pairwise constraints, the semi-supervised clustering method based on constraints and measures is used to generate different basic clustering partitions, and the target clustering is then obtained by integrating them.

Compared with a single clustering method, the combination of multiple clustering methods can make the best use of the given supervision information, improving the algorithm performance. For the identification of multi-model working conditions, this article uses the initial seed set to provide the number of working conditions and the center references of the datasets under the same working condition. It focuses on solving the problem of fuzzy boundaries between different working conditions: by introducing mixed constraints, the accuracy of the boundary division of working conditions is improved.

3 Multi-model condition identification of thermal process data based on mixed-constraint semi-supervised clustering

3.1 Characteristics of thermal process data

Different from batch processes and random processes, thermal process data has its own unique characteristics, such as strong coupling, non-linearity, and slow time-varying behavior. Reflected in the data, the main manifestations are that many factors affect the data change, the data is difficult to predict, the correlation between different variables is strong, and the data has obvious time series characteristics.

A thermal process is usually considered as a transition from one steady state to a new steady state. Therefore, thermal process data can be divided into steady-state condition data and transition condition data. The characteristic of steady-state data is that it fluctuates within a small range near the steady-state value. The transition condition data appears disorderly over ultra-short periods, while the overall trend is to increase or decrease in one direction. There is no instantaneous jump of amplitude in either steady-state data or transition data.

According to the above characteristics of thermal process data, the key to dividing steady-state conditions and transition conditions is to solve three problems: first, determining the central point of each working condition; second, screening the data with an obviously strong connection to the working-condition central point; third, determining the boundary data of adjacent working conditions.

3.2 The flowchart of the condition identification

In this paper, the semi-supervised clustering with mixed constraints is used to realize condition identification, as shown in Fig. 1.

3.3 Initial seed set establishment

After data preprocessing, the dataset is subjected to a whitening process following the idea of the PCA algorithm, to reduce the linear correlation under the premise that the data remains as faithful as possible. The goal is to minimize the variance between the original data and the preprocessed data,

J = (1/N) Σ_{n=1}^{N} ‖x_n − x̃_n‖²   (3)

x̃_n = Σ_{i=1}^{M} a_{ni} u_i + Σ_{i=M+1}^{D} b_i u_i   (4)

where x_n is the original data, x̃_n is the preprocessed data, N is the number of data points, M is the principal component dimension, and {u_i} is a D-dimensional unit orthogonal set.

The correlation coefficients between different dimensions are guaranteed to be as small as possible. The correlation coefficient ρ_ij is defined as follows:

ρ_ij = (1/N) Σ_{n=1}^{N} (x_ni − x̃_i)(x_nj − x̃_j) / (σ_i σ_j)   (5)

where σ_i and σ_j are the standard deviations.

Next, the data is processed by the density-based clustering method [26], and the initial seed sets are established according to the distance from the data centre.
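The whitening step of Sect. 3.3 can be sketched as follows. This is a minimal illustration assuming plain PCA whitening, as suggested by Eqs. 3–5; the paper's exact preprocessing may differ, and the helper name `whiten` is ours. After whitening, the sample correlation between different dimensions is driven toward zero, which is the goal stated for ρ_ij.

```python
# Sketch (assumption: plain PCA whitening standing in for the paper's
# preprocessing). Projects centred data onto the orthonormal basis u_i
# and rescales each component to unit variance.
import numpy as np

def whiten(X, eps=1e-8):
    """PCA-whiten rows of X (samples x dims): zero mean, ~identity covariance."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)          # u_i: orthonormal basis
    return Xc @ eigvec / np.sqrt(eigval + eps)    # rescale each component

rng = np.random.default_rng(0)
# correlated 2-D data: the second dimension is a noisy copy of the first
x = rng.normal(size=(500, 1))
X = np.hstack([x, x + 0.1 * rng.normal(size=(500, 1))])

Z = whiten(X)
rho_before = np.corrcoef(X, rowvar=False)[0, 1]
rho_after = np.corrcoef(Z, rowvar=False)[0, 1]
print(abs(rho_before) > 0.9, abs(rho_after) < 1e-6)  # → True True
```

The small `eps` term only guards against division by zero for near-degenerate components; it does not change the decorrelation property.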
The clustering based on the distance D_ij can be used to divide the data with strong affiliation into the datasets corresponding to different working conditions.

3.5 Clarification of class boundaries based on mixed constraints

The boundary data between adjacent working conditions refers to the data with a weaker relationship to the center point of a working condition. From the perspective of modeling, such data may belong to two adjacent working conditions at the same time; specifically, the closer the data is to the boundary point, the more obvious its multi-condition attribute. However, jitter and jumping of the condition attribute need to be avoided, and the standard K-means clustering algorithm cannot overcome this problem. For illustration, a set of typical thermal data, namely the bed temperature data of a fluidized bed boiler, is used. It includes 3000 sampling points with a sampling interval of 5 s. The standard K-means clustering is used to classify the normalized data from the value dimension only. The results are shown in Fig. 2.

In Fig. 2, different colors represent different categories; the red line represents the category assignment of the corresponding sampling time points. The standard K-means clustering considers only the value, so the category lines are irregular and frequently jitter, and such a classification is of little significance.

In view of this, mixed constraints are designed to achieve clearer working condition boundaries and avoid jitter problems. According to the category-continuity characteristics of adjacent points, the pairwise constraint is improved by superimposing it on the distance constraint, giving the following objective:

J = Σ_{i=1}^{N} Σ_{j=1}^{M} D_{ij}² + Σ_{(x_i, x_j) ∈ M, l_i ≠ l_j} w_{ij} + Σ_{(x_i, x_j) ∈ C, l_i = l_j} w̄_{ij}   (8)

where M and C are the given Must-link set and Cannot-link set, respectively, and w_{ij} and w̄_{ij} are the penalty weights for violating the Must-link and Cannot-link constraint rules, respectively. In semi-supervised clustering with pairwise constraints, the constraint set satisfying l_i = l_j is the Must-link set, and the constraint set satisfying l_i ≠ l_j is the Cannot-link set.

In Eq. 8, D_ij is the composite distance mentioned above, composed of the change characteristics of the data and the time span from the central point of the category; it makes the category of the data clearer. The pairwise constraints reduce the frequent jumps of the categories of boundary data.

Moreover, the boundary-area data may have multi-category attributes, which can be judged by the change rate of the time series data. For example, when the change rate of the sampling points is less than a certain fixed value within a time span, the data within that time span can be approximately considered to be in a steady state. If the time span is just within the boundary interval, the data can
have multi-category labels and belong to two adjacent categories.

3.6 Hyperparameter optimization

In the working condition identification process, there are three hyperparameters: the first is the time span (time distance) of data with a strong relationship to the cluster center point; the second is the time span of multi-category labels in the boundary area; and the third is the time span t_s used to calculate the change in the data increment over a certain time series. There are three common methods for determining the hyperparameters: the manual method based on experience, the machine-assisted method, and the algorithm-based method. In this work, the machine-assisted method has been selected. Therefore, there are two problems to be solved: an optimization objective function and an optimization method.

The model performance is measured by polynomial fitting under the same conditions for all partitioned working condition data, and the accumulated error of the multi-model is taken as the objective function of the hyperparameter optimization,

E = Σ_{i=1}^{m} [ (1/2) Σ_{j=1}^{n} ( y(x_ij, ω) − y_ij )² ]   (9)

The conventional grid-search method has been selected as the optimization method.

4 Online condition recognizer based on RBFNN

RBFNN is considered to be one of the most promising algorithms. Compared with other artificial neural networks, RBFNN is popular because of its simple structure, fast learning process and good approximation ability [27, 28]. Many factors affect the performance of an RBFNN, such as the network structure, the hidden layer activation function, the connections between nodes, the training method, and so on. Many researchers have given good suggestions on these aspects. Mosavi et al. analyzed the hidden layer structure of the RBF neural network and proposed an efficient training method that uses the Stochastic Fractal Search Algorithm (SFSA) for training the RBFNN [29]. By choosing a more reasonable radial basis function, the problem of slow classification speed of RBFNN can be solved [28, 30, 31].

The connections between nodes determine the behavior of a neural network. There are many connection methods; the most common is full connection, in which all nodes in a layer are connected to all nodes in the next layer. Other methods include sparse connection networks [32] and direct connections between input and output nodes [33]. If the selection of the radial basis function is reasonable, the hidden layer nodes in full connection mode can filter the input data well.

Improving efficiency while ensuring quality is the direction of training-method improvement. The growing RBFNN improves training efficiency by increasing the number of RBF units at each learning step [32]. k-fold cross validation over training and validation samples is an effective way to ensure the training quality [33].

In summary, the design of an RBFNN needs to settle the number of nodes, the type of hidden layer activation function, the connection mode between nodes, and the weight adjustment method.

4.1 Structure design of RBF neural network

An RBFNN has three layers: an input layer, a hidden layer and an output layer. The numbers of input and output layer nodes are easy to determine, being set according to the characteristic variables of the model. In this paper, the number of input nodes N_input is determined by the composite distance D_ij, including the data value and change rate at the current sampling time and the data values and change rates at the adjacent sampling times,

N_input = N_range × N_sample   (10)

where N_range is the range of adjacent data and N_sample is the number of reference information items at each sampling time. The number of hidden layer nodes N_hidden is determined by the dimension of the input data N_input and the number of conditions N_cond; the number of working conditions can be determined by a machine learning method [12],

N_hidden = N_input × N_cond   (11)

Considering the nonlinearity of the thermal process data, the radial basis function can describe the information of the conditions (the parameter C of the radial basis function represents the center of a condition, and the closer to the center, the greater the output value). Therefore, the standard RBF (SRBF), Cauchy RBF (CRBF), inverse multiquadric RBF and generalized inverse multiquadric functions can all meet the basic requirements of the algorithm.
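The layer sizing of Eqs. 10–11 and the candidate hidden-layer activations can be sketched as follows. The width parameter `sigma` is illustrative (the paper does not fix its value), and the function names are ours; each activation peaks at the condition center (r = 0) and decays with distance, which is the property the text relies on.

```python
# Sketch of the layer sizing in Eqs. 10-11 and of candidate hidden-layer
# activations (standard/Gaussian, Cauchy, inverse multiquadric RBFs).
import numpy as np

def layer_sizes(n_range, n_sample, n_cond):
    n_input = n_range * n_sample          # Eq. 10
    n_hidden = n_input * n_cond           # Eq. 11
    return n_input, n_hidden

def srbf(r, sigma=1.0):                   # standard (Gaussian) RBF
    return np.exp(-(r / sigma) ** 2)

def crbf(r, sigma=1.0):                   # Cauchy RBF
    return 1.0 / (1.0 + (r / sigma) ** 2)

def imq(r, sigma=1.0):                    # inverse multiquadric RBF
    return 1.0 / np.sqrt(r ** 2 + sigma ** 2)

# each activation is maximal at the condition center (r = 0)
# and decreases monotonically as the input moves away from it
r = np.array([0.0, 1.0, 3.0])
for f in (srbf, crbf, imq):
    out = f(r)
    print(f.__name__, out[0] > out[1] > out[2])  # → monotone decay: True
```

With, for example, N_range = 3 adjacent samples and N_sample = 2 reference items per sample, Eq. 10 gives N_input = 6, and N_cond = 5 conditions would give N_hidden = 30 by Eq. 11.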
5 Simulation analysis
5.1.2 Comparison of different numbers of initial seed sets

In addition to the hyperparameters mentioned in Sect. 3.6, the information of the initial seed sets also affects the segmentation results; it includes the locations and the number of seed sets. If the number of initial seed sets is given, the locations can be determined by the density-based clustering method. The influence of different numbers of initial seed sets on the segmentation results is shown in Figs. 6 and 7.
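For a given number of seed sets, the locations can be picked from the data itself. The following is a simple density-peak-style sketch standing in for the density-based clustering of [26] (this specific procedure and its function name are our assumption, not the paper's method): it ranks points by local density and greedily keeps high-density points that are mutually distant.

```python
# Sketch (assumption: a density-peak style selection standing in for the
# density-based clustering of [26]): given a chosen number of seed sets,
# pick high-density points that are mutually distant as seed locations.
import numpy as np

def seed_locations(X, n_seeds, radius):
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    density = (d < radius).sum(axis=1)        # neighbours within radius
    order = np.argsort(-density)              # densest points first
    chosen = [order[0]]
    for idx in order[1:]:                     # greedily keep distant peaks
        if len(chosen) == n_seeds:
            break
        if all(d[idx, c] >= radius for c in chosen):
            chosen.append(idx)
    return X[chosen]

# two well-separated blobs -> two seeds, one near each blob centre
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
seeds = seed_locations(X, n_seeds=2, radius=1.0)
print(sorted(round(s[0]) for s in seeds))  # → [0, 5]
```

The O(n²) distance matrix keeps the sketch short; a production version would use a spatial index for large datasets.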
Comparing Figs. 5, 6 and 7, setting a smaller number of initial seed sets can make the segmentation of the original data clearer, but the segmentation effect is evaluated by the model polynomial fitting on the segmented data. On this basis, the data of each working condition is taken as the output, four groups of data (bed pressure, primary air flow, secondary air flow and fuel flow) are selected as the input, and the model polynomial fitting is performed. The results of model polynomial fitting on the segmented data are shown in Table 2. The segmentation effect of setting five initial seed sets is better than that of setting four or six initial seed sets.
Table 2 Comparison results for different numbers of initial seeds. Each row gives: number of initial seeds; segmenting (sample interval); polynomial fitting results (regression coefficients, intercept); fitting score (R²); overall error (MSE)
4 (0, 858) [− 0.26, − 0.52, 1.14, 1.27, − 1.14, − 0.48] 0.44 0.78 0.023
(859, 1468) [− 0.23, 0.52, − 0.13] 0.33 0.76
(1469, 2270) [− 0.28, 0.51, 0.003] 0.30 0.39
(2230, 3028) [− 0.29, − 0.65, 1.22, 1.44, − 1.25, − 0.47] 0.45 0.74
5 (0, 367) [− 0.32, 0.34, − 0.51, 1.91] 0.35 0.71 0.011
(316, 858) [− 0.23, − 0.07, − 2.28, 0.34] 2.69 0.65
(859, 1468) [− 0.19, 0.42, 0.003, 0.46] 0.16 0.75
(1469, 2270) [− 0.006, 0.33, − 0.003, − 0.32] 0.46 0.55
(2253, 3028) [− 0.25, 0.16, 0.06, 0.04] 0.48 0.69
6 (0, 367) [− 0.47, 0.52, − 1.49, − 2.11, 1.51, 0.99] 1.45 0.69 0.037
(316, 858) [− 0.28, − 0.09, − 2.26] 2.84 0.59
(859, 1075) [0.008, 0.46, 2.36, 0.05, − 2.59, − 0.36] − 0.02 0.52
(1076, 1646) [1.26, 0.41, − 1.45, − 1.89, 0.15, 1.50] 0.50 0.48
(1482, 2270) [1.38, 0.33, − 1.45, − 1.94, 0.04, 1.60] 0.50 0.57
(2230, 3028) [− 0.26, 0.18, 0.05] 0.50 0.72
5.1.3 Comparison of semi-supervised clustering with mixed constraints against Sliding Windows, Bottom-Up and Top-Down

The data of each working condition is taken as the output, four groups of data (bed pressure, primary air flow, secondary air flow and fuel flow) are selected as the input, and the model polynomial fitting is performed. The comparison between the clustering results obtained using mixed constraints, Sliding Windows, Bottom-Up and Top-Down is shown in Fig. 8; the specific comparative data is shown in Table 3. The fitting score is

R² = 1 − u/v,

u = Σ_{i=1}^{n} (y_i − ŷ_i)²,  v = Σ_{i=1}^{n} (y_i − ȳ)²,  ȳ = (1/n) Σ_{i=1}^{n} y_i   (17)

In Table 3, the fitting score is calculated by formula 17. Different methods produce different segmentation results: Sliding Windows, Bottom-Up and Top-Down achieve good results in the linear fitting of the local sub-models, but their results in different segments deviate considerably, and the overall effect is not as good as that of the method in this paper. The polynomial fitting results given in Table 3 show that the semi-supervised clustering with mixed constraints can achieve the condition identification, and compared with Sliding Windows, Bottom-Up and Top-Down, the mixed constraint is better for the division of working conditions.

5.2 Online condition recognizer

The online condition recognizer realizes the mapping from continuous data to discrete data. By increasing the number of hidden layer nodes, the network can perform perfectly on the training data, but it then has poor generalization ability, as shown in Fig. 9. Here, the weight coefficients α_i of the composite distance D_ij are equal. Increasing the amount of input data information, such as N_range and N_sample, can improve the generalization ability, but the amount of computation increases significantly. In contrast, using PSO (particle swarm optimization) to optimize the weight coefficients α_i of the composite distance D_ij can also improve the network generalization ability without increasing the amount of calculation, as shown in Fig. 10. The optimization process of the weight coefficients α_i is shown in Fig. 11. Therefore, other parameters, such as N_range, N_sample and the characteristic
parameters of the radial basis functions in the hidden layer, can be further optimized by PSO.

On the test data set, a low error rate is obtained at α_1 = 0.96, α_2 = 3.60, α_3 = 1. The ratio of the weight coefficients shows that increasing the proportion of the numerical component of the input data while weakening the proportion of the sequence information improves the generalization ability of the network; conversely, the proportion of sequence information must be enhanced to improve the ability of condition identification.
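The PSO step above can be sketched with a generic particle swarm loop. The paper's fitness is the recognizer's error rate on the test set, which is not reproducible here, so a toy quadratic objective with a known optimum (placed, for illustration, at the reported α values) stands in for it; the inertia and acceleration constants are conventional choices, not values from the paper.

```python
# Sketch (assumption: a generic PSO loop; the toy objective stands in
# for the recognizer's error rate, which is not reproduced here).
import numpy as np

def pso(objective, dim, n_particles=30, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 4.0, (n_particles, dim))   # positions (alpha_i)
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), np.array([objective(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # inertia + cognitive + social terms (conventional constants)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest

# toy stand-in for the error rate, minimised at alpha = (0.96, 3.60, 1.0)
target = np.array([0.96, 3.60, 1.0])
best = pso(lambda a: ((a - target) ** 2).sum(), dim=3)
print(np.allclose(best, target, atol=1e-2))  # → True
```

In the real setting, `objective` would retrain or re-score the RBFNN recognizer for each candidate α vector, so each fitness evaluation is far more expensive than this toy.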
Fig. 8 (continued): (c) Bottom-Up; (d) Top-Down
Table 3 Comparison of segmentation results by method. Each row gives: method; segment number; segmenting (sample interval); regression coefficients; intercept; fitting score (R²); overall error (MSE)
Sliding Windows 1 (0, 863) [− 0.19, 1.15, − 1.18, 0.77] 0.94 0.71 0.013
2 (864, 1515) [− 0.23, 0.50, − 0.08, 0.33] 0.23 0.82
3 (1516, 2080) [− 0.33, 0.39, 0.18, − 0.28] 0.40 0.66
4 (2081, 2845) [0.04, 0.26, 0.03, 0.24] 0.07 0.74
5 (2846, 3028) [− 0.13, − 0.11, − 0.02, − 0.08] 0.75 0.14
Bottom-Up 1 (0, 858) [− 0.17, 0.90, − 1.09, 0.06] 1.27 0.76 0.013
2 (859, 1463) [− 0.25, 0.53, − 0.07, 0.66] 0.17 0.76
3 (1464, 1970) [− 0.38, 0.43, 0.06, − 2.01] 1.13 0.67
4 (1971, 2429) [0.16, 0.57, − 0.61, 0.10] 0.35 0.22
5 (2430, 3028) [− 0.14, 0.21, 0.09, − 0.13] 0.49 0.78
Top-Down 1 (0, 67) [− 0.49, − 0.33, 0.70, − 0.66] 0.98 0.27 0.015
2 (68, 1077) [− 0.38, 1.64, − 0.96, 1.43] 0.26 0.88
3 (1078, 1970) [− 0.18, 0.78, − 0.34, − 0.16] 0.35 0.82
4 (1971, 2429) [0.16, 0.42, − 0.17, 1.11] − 0.57 0.50
5 (2430, 3028) [− 0.14, 0.21, 0.09, − 0.14] 0.49 0.78
Semi-supervised clustering 1 (0, 367) [− 0.32, 0.34, − 0.51, 1.91] 0.35 0.71 0.011
with mixed constraints 2 (316, 858) [− 0.23, − 0.07, − 2.28, 0.34] 2.69 0.65
3 (859, 1468) [− 0.19, 0.42, 0.003, 0.46] 0.16 0.75
4 (1469, 2270) [− 0.006, 0.33, − 0.003, − 0.32] 0.46 0.55
5 (2253, 3028) [− 0.25, 0.16, 0.06, 0.04] 0.48 0.69
6 Conclusion

The multi-model approach has been proved to be very effective in describing complex processes. The overall model precision is affected by the sub-model form and by the division of the multiple-model sub-windows. Once the time span of a sub-window is established, the modeling process within the sub-window is the same as that of a single model. Therefore, the division of sub-windows greatly affects the overall precision of the multi-model method.

In this paper, using machine learning and combining the characteristics of thermal process data, semi-supervised clustering with mixed constraints is used to realize condition segmentation and sharpen the division of the time spans of the sub-windows. An online condition identifier is designed, and its generalization ability is improved by optimizing the weight coefficients of the input information. The simulation results show that the proposed method is feasible for dividing the sub-windows, and the overall error of the established sub-models is improved.

One limitation of this study is that the presented method is based on historical data and cannot perform the segmentation online. Although the online recognizer is designed, the segmentation itself is still offline in essence, which limits the application of the method. Extending it into a fully online method is the next research topic.
Funding This work is supported by the Central Universities Fundamental Research Fund under Grant 2019MS098.

Availability of data and material All the data pertaining to this study are available upon request.

Declarations

Conflict of interest The authors declare that they have no conflict of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Hwang DH, Han C (1999) Real-time monitoring for a process with multiple operating models. Control Eng Pract 7(7):891–902
2. Zhao SJ, Zhang J, Xu YM (2006) Performance monitoring of processes with multiple operating modes through multiple PLS models. J Process Control 16:763–772
3. Natarajan S, Srinivasan R (2010) Multi-model based process condition monitoring of offshore oil and gas production process. Chem Eng Res Des 88:572–591
4. Choi SW, Martin EB, Morris AJ (2005) Fault detection based on a maximum-likelihood principal component analysis (PCA) mixture. Ind Eng Chem Res 44:2316–2327
5. Keogh E, Chu S, Hart D, Pazzani M (2002) An online algorithm for segmenting time series. In: Proceedings 2001 IEEE international conference on data mining, IEEE
6. Liu X, Lin Z, Wang H (2008) Novel online methods for time series segmentation. IEEE Trans Knowl Data Eng 20(12):1616–1626
7. Ge ZQ, Song ZH (2009) Multimode process monitoring based on Bayesian method. J Chemom 23(12):636–650
8. Zhu ZB, Song ZH, Palazoglu A (2012) Process pattern construction and multi-mode monitoring. J Process Control 22:247–262
9. Lu NY, Gao FR, Wang FL (2004) A sub-PCA modelling and online monitoring strategy for batch processes. AIChE J 50(1):255–259
10. Zhao CH, Wang FL, Lu NY, Jia MX (2007) Stage-based soft-transition multiple PCA modelling and on-line monitoring strategy for batch processes. J Process Control 17(9):728–741
11. Ling W, Hui Z (2021) Fuzzy segmentation of multivariate time series with KPCA and G-G clustering. Control Decis 36(1):115–124
12. Zhang Y, Zhang B, Zheng Wu (2019) Multi-model modeling of CFB boiler bed temperature system based on principal component analysis. IEEE Access 8:389–399
13. Song B, Tan S, Shi HB (2016) Key principal components with recursive local outlier factor for multimode chemical process monitoring. J Process Control 47:136–149
14. Ye Lv, Zhong YH (2014) A multi-model approach for soft sensor development based on feature extraction using weighted kernel fisher criterion. Chin J Chem Eng 22(2):146–152
15. Wei L, Yu-pu Y, Na W (2008) Multi-model LSSVM regression modeling based on kernel fuzzy clustering. Control Decis 23(5):560–562, 566
16. Shu-Mei Z, Fu-Li W, Shuai T, Shu W (2016) A fully automatic offline mode identification method for multi-mode processes. Acta Autom Sin 42(1):60–80
17. Hongyue G (2017) Multivariate time series segmentation and prediction approach and application research. Dalian University of Technology, Dalian
18. Basu S, Banerjee A, Mooney R (2002) Semi-supervised clustering by seeding. In: Proceedings of the nineteenth international conference on machine learning (ICML 2002), pp 19–26
19. Wagstaff K, Cardie C (2000) Clustering with instance-level constraints. In: Proceedings of the 17th international conference on machine learning (ICML 2000), pp 1103–1110
20. Deng C, Guo MZ (2008) Tri-training and data editing based semi-supervised clustering algorithm. J Softw 19(3):663–673
21. Zhu Y, Qian J, Ji Z (2015) An improved COP-Kmeans algorithm based on BFS. Beijing: China science and technology paper online. http://www.paper.edu.cn/releasepaper/content/201507-93
22. Li CM, Xu SB, Hao ZF (2017) Cross-entropy semi-supervised clustering based on pairwise constraints. Pattern Recogn Artif Intell 30(7):598–608
23. Chen ZY, Wang HJ, Hu M et al (2017) An active semi-supervised clustering algorithm based on seeds set and pairwise constraints. J Jilin Univ Sci Edn 55(3):664–672
24. Chang Yu, Ji-Ye L, Jia-Wei G, Jing Y (2012) A semi-supervised clustering algorithm based on seeds and pair-wise constraints. J Nanjing Univ Natl Sci 48(4):405–411
25. Wei S, Li Z, Zhang C (2018) Combined constraint-based with metric-based in semi-supervised clustering ensemble. Int J Mach Learn Cybern 9(7):1085–1100
26. Rodriguez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344:1492–1496
27. Laleh MS, Razaghi M, Bevrani H (2020) Modeling optical filters based on serially coupled microring resonators using radial basis function neural network. Soft Comput 25(1):585–598
28. Dash CSK, Behera AK, Dehuri S et al (2016) Radial basis function neural networks: a topical state-of-the-art survey. Open Comput Sci 6(1):33–63
29. Mosavi MR, Khishe M, Hatam Khani Y, Shabani M (2017) Training radial basis function neural network using stochastic fractal search algorithm to classify sonar dataset. Iran J Electr Electron Eng 13(1):100–111
30. Montazer GA, Giveki D (2015) An improved radial basis function neural network for object image retrieval. Neurocomputing 168:221–223
31. Thandar AM, Khine MK (2012) Radial basis function (RBF) neural network classification based on consistency evaluation measure. Int J Comput Appl 54(15):20–23
32. Vachkov G, Stoyanov V, Christova N (2015) Growing RBF network models for solving nonlinear approximation and classification problems. In: Proceedings 29th European conference on modelling and simulation
33. Duliba KA (1991) Contrasting neural nets with regression in predicting performance in the transportation industry. In: Proceedings of the twenty-fourth Hawaii international conference on system sciences. IEEE
34. Diaconiţa I, Leon F (2011) A learning model for intelligent agents using radial basis function neural networks with adaptive training methods. Bul Inst Politeh Iaşi Autom Control Comput Sci Sect 57(61)(2):9–20

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.