Version of Record: https://www.sciencedirect.com/science/article/pii/S0925400522003744
Manuscript_fcc96cadd651ff70959c04bc78b31702

Application of Neural Network Based Regression Model to Gas Concentration Analysis of TiO2 Nanotube-Type Gas Sensors

Kazuki Iwata (a,*), Hiroyuki Abe (b), Teng Ma (c), Daisuke Tadaki (d), Ayumi Hirano-Iwata (c,d),

Yasuo Kimura (e), Shigeaki Suda (f), and Michio Niwano (a,d,*)

(a) Tohoku Fukushi University, Sendai 989-3201, Japan

(b) Industrial Technology Institute, Miyagi Prefectural Government, Sendai 981-3206, Japan

(c) Advanced Institute for Materials Research (AIMR), Tohoku University, Sendai 980-8577, Japan

(d) Research Institute of Electrical Communication, Tohoku University, Sendai 980-8577, Japan

(e) Tokyo University of Technology, Hachioji City, Tokyo 192-0983, Japan

(f) Miyagi Factory, CHEST M.I., INC., Miyagi, Japan

_________________

*Corresponding Authors
Kazuki Iwata – Tohoku Fukushi University, Sendai 989-3201, Japan;
E-mail: [email protected]
Michio Niwano – Tohoku Fukushi University, Sendai 989-3201, Japan;
E-mail: [email protected]; [email protected]

© 2022 published by Elsevier. This manuscript is made available under the Elsevier user license
https://www.elsevier.com/open-access/userlicense/1.0/
Abstract

We performed a gas analysis of TiO2 nanotube (NT)-type integrated gas sensors using a

machine learning (ML) algorithm and neural network-based regression. We fabricated a

TiO2-NT integrated gas sensor with multiple sensing elements with different response char-

acteristics, and we measured the output signals of each sensing element exposed to a gas

mixture, where the main components were nitrogen and oxygen gas with a small amount of

carbon monoxide. We analyzed the output signals of the sensor elements using the ML

technique to predict the concentrations of CO and O2, to which the TiO2-NT gas sensors

were sensitive. Sensor output data were collected for seven sets of mixed gas concentra-

tions with different concentrations of each component gas. Four or five of the seven da-

tasets were used as ML training data for the neural network method, and the concentrations

of CO and O2 in the remaining three or two datasets were predicted. Consequently, we con-

firmed that increasing the number of sensor elements significantly improved the prediction

accuracy of the gas concentration. When the output signals from 10 sensor elements were

used, the gas concentration could be predicted with an error of less than 0.001% for a

carbon monoxide concentration of 0.02%. This accuracy is sufficient for practical

application.

Keywords: gas sensor, concentration analysis, machine learning, neural networks, titanium

oxide nanotube

1. Introduction

Gas sensors are widely used both in gas detection in manufacturing sites where there is

a risk of suffocation or explosion, and in breath tests for early disease diagnosis. Therefore,

numerous studies have been performed on gas sensors in the fields of medicine and health

[1,2]. It is well known that components in exhaled air can provide useful information re-

garding health status and disease [2,3]. The gas sensor in a breath analyzer should be able

to detect a specific gas from a mixture of several different species of gas in exhaled air in a

short time and with high sensitivity and accuracy. To date, different types of gas sensors

with different detection principles have been proposed and developed, such as contact

combustion-type, gas-thermal-conduction-type, and solid-electrolyte-type sensors [4].

Among them, oxide semiconductor-based sensors are the most actively studied, because of

their simple structure and high sensitivity in the low gas concentration region compared to

other types of sensors [5–8]. When an oxide semiconductor comes into contact with an

oxidizing or reducing gas, its resistance changes, and the change in resistance is used to detect

the gas. SnO2, ZnO, In2O3, and ZrO2 are typical oxide semiconductor materials used for gas

sensors, which have been widely used in flammable gas and oxygen gas sensors.

Titanium dioxide (TiO2) has recently attracted attention as a promising material

for gas sensors [9]. Zwilling et al. first reported that TiO2 has a unique nanotube (NT)

structure [10]. This tubular structure is highly favorable for gas sensors, owing to signifi-

cantly increased net surface area in contact with the gas to be detected. Furthermore, the

ordered tubular structure of TiO2 can be readily formed using a simple anodic oxidation

method (anodization) [11,12]. Several gas sensors using TiO2 NT films have been proposed

so far [13–19]. Previously, we used anodization to form TiO2 NT films on glass or silicon

substrates [20–22]. We fabricated a microscale hydrogen gas sensor on a Si substrate (Si

wafer) using a hybrid process that combines local anodization (bottom-up process) and

photolithography technology (top-down process) [23]. The sensor had the structure

schematically presented in Fig. 1 and exhibited a wide detection range of 1 to 10^5 ppm (10%)

and high sensitivity (detection limit of 1 ppm). However, for the proposed sensor to be

effectively applied to breath analyzer devices, it should be capable of detecting a specific gas

in a gas mixture, and both the time required for gas detection (response time) and the time for the

sensor output to return to its pre-detection level (recovery time) should be short. To improve

the sensitivity and shorten the response time, we decorated the inside walls of the TiO2 NTs

with Pt catalyst nanoparticles using the atomic layer deposition (ALD) method [24]. We

confirmed that decoration with catalytic metals improves the sensitivity by more than five

orders of magnitude for hydrogen gas detection [24]. However, the response of the semi-

conductor-type gas sensor becomes complicated when mixtures of reducing and oxidizing

gases are detected. Therefore, new approaches are required to analyze the concentration of

each gas component in a gas mixture containing reducing and oxidizing gases.

Recently, there has been an increase in research on analyzing the output of sensors us-

ing machine learning (ML) techniques [25–41]. These studies attempted to predict the

component concentration of mixed gases by analyzing the output of sensors with different

response characteristics using the ML method. Most of these studies used several commercially

available semiconductor-type gas sensors. In this study, several

TiO2 NT-based gas sensor elements with different response characteristics were integrated

on a Si substrate and simultaneously exposed to a gas mixture. We used a gas

mixture containing four different gas components (carbon monoxide (CO), oxygen, helium,

and nitrogen) as the model gas. CO is frequently used to diagnose respiratory function, and

we focused on the detection of trace amounts of CO gas in exhaled air. The target value for

the detection accuracy was 0.001% at a CO concentration of 0.02%. However, CO is a

reducing gas, while oxygen contained in exhaled air is an oxidizing gas; that is, the two types

of gases act on the TiO2 surface in opposite directions. When these two gas types are

mixed, the response of semiconductor-type sensors becomes complicated, and it is expected

that the concentration of each gas component cannot be derived from the sensor output val-

ue based on the conventional linear regression method. Therefore, we analyzed the sensor

outputs using ML and attempted to predict the concentration of CO gas in the mixed gas.

The aims of this study are to determine whether the ML method is effective in predicting

the concentration of each gas component in a gas mixture using the developed TiO2 NT-type

gas sensors, and how many sensor elements are required to obtain the desired prediction

accuracy. To this end, we examined how the accuracy of the concentration prediction varies with

the number of sensor elements.

2. Experimental and analytical methods

2.1. Sensor Fabrication

The method of fabricating the TiO2 NT-based sensor was the same as that previously

reported [24]. Figure 2 is a schematic of the integrated gas sensor substrate used in this

study, which is the same as that used in our previous work [24]. Three sets of sensors with

TiO2 NT films with two different line widths (100 µm and 1000 µm) were arranged on a

silicon substrate. We expected that sensors with different widths would have marginally

different sensor response characteristics. Furthermore, by arranging the sensors on the same

board, all the sensors could be simultaneously exposed to the mixed gas. The titanium film

in the region enclosed by the dotted circle in Fig. 2 was anodized locally. Catalytic metal

nanoparticles that promote dissociative adsorption of gas molecules to be detected were

decorated on the top surface of the TiO2 NT film and the inner wall of the NTs to improve

their sensitivity, response time, and gas selectivity. The ALD method was used

for metal decoration [24]. The size of the Pt particles loaded on the NT film surface was

changed by varying the number of ALD deposition cycles, to prepare TiO2 NT sensors with

different gas response characteristics. We prepared two types of Pt particles with diameters

of approximately 5 nm and 10 nm that were synthesized by 15 and 45 cycles of ALD depo-

sition, respectively, as shown in Fig. 3. The different sizes of the catalytic nanoparticles

were expected to result in different sensor characteristics because of the different degrees of

dissociative adsorption of the detected gas molecules on the Pt-decorated TiO2 nanotube

surface.

2.2. Response measurements

In this study, we predicted the gas concentrations based on the sensor’s resistance val-

ues. The resistance value was assessed by monitoring the output current measured using a

semiconductor parameter analyzer (Agilent 4156C), with a constant voltage of 1 V applied

between the Ti electrodes, as shown in Fig. 1. The system for measurements was the same

as that used in our previous study [24]. The sensor mounted in the measuring vessel was

heated to approximately 300 °C, and the vessel was evacuated to approximately 10 Pa us-

ing an oil rotary pump. Although the backflow of oil from the oil rotary pump may be a

problem when dealing with ultra-high vacuum, it was not a major problem in this study be-

cause no significant characteristic change was observed in the sensor outputs, even after

several months of operation. After evacuating for 30 min, dry air was introduced into the

vessel until the output current of the sensor stabilized. Subsequently, the gas in the vessel

was changed from dry air to the target gas (the four-component gas mixture). After 100 s, the target gas

was turned off, and dry air was introduced into the vessel. Figure 4 shows a typical sensor

response curve for the gas introduction sequence described above. To evaluate

the gas concentration, we obtained the resistance of the sensor from the measured

output current. We then derived a ratio of R0/RG, where R0 and RG are the resistances of the

sensor exposed to dry air and the target gas, respectively, namely the response ratio. Our

previous work [24] indicated that the maximum of the first time derivative, d(R0/RG)/dt,

reflects the gas concentration, and can be used as a measure of the concentration. Hereafter,

we refer to this value as the “response derivative.” In this study, we used the response de-

rivative as the input feature value for the ML procedure discussed in the following section

and predicted the concentrations of CO and O2 in the mixed gases. In the learning phase of

the ML procedure, the learning time is only a few hours for measurements of approximately

seven gas mixtures, as used in this study. In the test phase, it is sufficient to expose the sensor

several times to a gas mixture of unknown concentration and record the average sensor

output values. This process took only a few minutes in the verification experiment we

conducted. This processing time is estimated to be equal to or shorter than those reported in

previous studies [36,39], and is not a problem in practical use.
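
The feature extraction described in this section can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the finite-difference derivative, and the current trace are assumptions introduced here. It relies only on the fact that, at a constant bias voltage, R = V/I, so the response ratio R0/RG equals the ratio of the current under the target gas to the current under dry air.

```python
def response_ratio(currents, baseline_current):
    """R0/RG at each sample. With a fixed bias voltage V, R = V / I,
    so R0/RG = I_G / I_0 and the voltage cancels out."""
    return [i / baseline_current for i in currents]


def response_derivative(times, currents, baseline_current):
    """Maximum of d(R0/RG)/dt, estimated with forward differences
    (the differencing scheme is an assumption of this sketch)."""
    ratio = response_ratio(currents, baseline_current)
    slopes = [
        (ratio[k + 1] - ratio[k]) / (times[k + 1] - times[k])
        for k in range(len(ratio) - 1)
    ]
    return max(slopes)


# Synthetic current trace in amperes, for illustration only (not measured data).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
currents = [1.0e-6, 1.0e-6, 3.0e-6, 4.0e-6, 4.2e-6]
peak = response_derivative(times, currents, baseline_current=1.0e-6)
```

In practice a measured trace would likely need smoothing before differentiation, since a raw finite difference amplifies noise.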

2.3. Neural Network Architecture and Training

To predict the concentrations of CO and O2 gases in the gas mixture, we developed

neural network-based models using Python with TensorFlow, a widely used deep learning

framework developed by Google. The neural network-based models

employed in this study included an input layer, hidden layers composed of five fully

connected layers, and an output layer with two nodes that corresponded to the concentrations

of the two gas components (CO and O2). He and N2 contained in the model gas were not

included in the present model because our sensors were not sensitive to these gases. Figure

5 schematically illustrates the neural network architecture used in this study. The network

has an input layer of multiple nodes, five fully connected layers with n = 1,024, 512, 256,

128, and 64 nodes, and two output nodes that correspond to the concentrations of CO and

O2. The number of nodes in the input layer corresponded to the number of sensors. Figure 5

shows an example where the values of the response derivatives from the four sensors are

input into the network. We employed the logarithm of the response derivative as the feature

value because the proposed sensors were extremely sensitive, making the raw value too

large (approximately 10 s^-1 for a mixed gas with 0.30% CO) to be used directly as a feature

value of the model. We used the Adam (adaptive moment estimation) optimizer [42]. The

optimizer parameters were as follows: the exponential decay rates for the first- and second-moment

estimates were 0.9 and 0.999, respectively, and epsilon was 10^-7. The model was

trained for 100,000 epochs, with a batch size of 64. We did not use the cross-validation pro-

cedure, but applied the learning rate decay technique to stabilize the learning result and

changed the learning rate from 0.0001 to zero with a step of zero after 100,000 epochs. One

reason we did not employ the cross-validation method in this study is as follows. The pur-

pose of this study was to verify the predictability of the CO concentration, a reducing gas,

in a gas mixture, that is, in the presence of oxidizing oxygen. For this purpose, we adopted

a method for predicting unknown gas concentrations from gas mixture data with known

component gas concentrations. This is similar to the method for measuring the gas concen-

tration calibration curve and predicting the unknown concentration based on the derived

calibration curve. Therefore, we adopted a method that completely separates the target vari-

able (concentration) between training and test; the sensor outputs of the known gas concen-

trations were used as the training dataset, and the sensor outputs of the unknown gas con-

centrations were used as the test dataset. We chose this method for real-world applications.

In a typical ML training setup, all data are randomly divided into training and test; there-

fore, optimization for training data using the cross-validation method can be expected to be

highly generalizable to test data. However, in our case, the training data did not contain any

concentrations used in the test; therefore, even if the training is optimized, we cannot guar-

antee that it is fit for the test. The learning-rate decay method was used to improve the fit.

In recent years, this method has been widely used with neural networks (NNs) [43-45]. Another reason

we did not employ the cross-validation method in this study is that one of our goals was to

reduce learning time. The model adopted in this study requires several hours

for one training session. Five-fold cross-validation, for example, would therefore have

taken approximately a day to train, which would be impractical for the purpose of our

sensor application. Despite this difficult setting, as described below, we were able to achieve a

high prediction accuracy for the test data, which is an important achievement of this study.
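
To give a sense of the size of the network described above, the following sketch counts its trainable parameters for a given number of sensor inputs and shows the logarithmic feature transform. It assumes that every fully connected layer carries a bias term and that the logarithm is base 10; neither detail is stated in the text, and the function names are introduced here for illustration only.

```python
import math

HIDDEN_LAYERS = [1024, 512, 256, 128, 64]  # node counts given in the text
N_OUTPUTS = 2                              # CO and O2 concentrations


def count_parameters(n_inputs, hidden=HIDDEN_LAYERS, n_outputs=N_OUTPUTS):
    """Weights plus biases of a fully connected network.
    The presence of a bias in every layer is an assumption."""
    sizes = [n_inputs] + list(hidden) + [n_outputs]
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


def log_feature(response_derivative):
    """Logarithm of the response derivative used as the input feature.
    The base (10) is an assumption; the text only says 'logarithm'."""
    return math.log10(response_derivative)


# Four inputs (the example of Fig. 5) and ten inputs (all sensors).
param_counts = {n: count_parameters(n) for n in (4, 10)}
```

Under these assumptions the network has roughly 0.7 million parameters, almost all of them in the hidden layers, so the number of sensor inputs has only a minor effect on the model size.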

3. Results and discussion

3.1. Response characteristics of gas sensor for mixed gases

A four-component gas mixture was prepared to measure the response characteristics.

The component ratio of the gas mixture was adjusted by mixing a commercially available

four-component gas mixture (CO: 0.30%, He: 10.20%, O2: 20.30%, and N2: 69.20%) with

dry air (O2: 20.95%; N2: 78.08%, and others: 0.97%). The application of the method devel-

oped in this study is breath analysis, that is, the concentration analysis of CO in the exhaled

air. In breath analysis, a test gas, i.e., a mixture of four gases (CO, He, O2, and N2) is artifi-

cially inhaled into the lungs, and then the components of the exhaled air are analyzed. In

this process, CO is partially taken into the human body through the alveoli and it is diluted

by the air originally present in the lungs before the test gas is inhaled, leading to a decrease

in the CO concentration in the exhaled air. Helium gas is used for another test of lung func-

tion and is included in the gas mixture to test the sensor performance under near-realistic

conditions. It is not taken into the human body and its concentration is reduced because it is

diluted by the air originally present in the lungs. On the other hand, O2 is taken into the

human body in a certain amount, but its concentration in the exhaled breath increases be-

cause of the air originally in the lungs; similarly, N2 is not taken into the human body, and

its concentration slightly increases because of the air originally in the lungs. Since the

amount of CO taken into the human body and the amount of air originally present in the

lungs differ from person to person, the ratio of CO to O2 is not likely to be inversely

proportional, but the concentration of CO and He in the exhaled breath tends to decrease while

the concentration of O2 and N2 increases. In this study, we attempted to reproduce this

trend. Therefore, we first considered the gas components contained in the exhaled air, and

the four-component gas mixture was diluted with dry air to change the concentration of a

small amount of CO gas used for pulmonary function assessment. Helium gas, which was

used together for another test of lung function, was included in the gas mixture to test the

sensor performance under near-realistic conditions. Table 1 lists the concentrations of each

component of the seven gas mixtures used for the response measurements. Notably, the

component concentrations presented in Table 1 were prepared in this study because we fo-

cused on the differences in response characteristics to changes in CO gas concentration,

which is frequently used in breath tests. Figures 6 and 7 show the response characteristics

of the TiO2 NT sensors with line widths of 1,000 μm and 100 μm, respectively. The left

panels of Figs. 6 and 7 show the response characteristics obtained when the

size of the decorated Pt particles was approximately 5 nm. The response characteristics

obtained for a Pt particle size of approximately 10 nm are shown in the right panels

of Figs. 6 and 7. The blue and orange curves indicate R0/RG and d(R0/RG)/dt, respectively.

As indicated in Figs. 6 and 7, the response derivative, that is, the maximum value of

d(R0/RG)/dt, increases with increasing CO concentration. Figure 8(a) displays the CO

concentration dependence of the response derivative obtained for 10 sensors with different

sizes of decorating Pt nanoparticles and different line widths of NT films. As shown in Fig.

8(a), the response derivative increases with the CO concentration, which indicates that the

maximum derivative value reflects the CO concentration. However, the response derivative

did not vary linearly with CO concentration. The values of the response

derivative differed by approximately a factor of ten in the high CO concentration regions. In

addition, the output values and concentration dependencies were different for each sensor.

This difference is probably due to the subtle differences in the surface chemical state of the

TiO2 NT film and the coating state of the catalyst metal surface. Our idea is to see if such

subtle differences can be utilized for mixed gas analysis by machine learning. As shown in

Fig. 8(b), a similar concentration dependence of the response derivative was observed for

the O2 gas. For each gas mixture, the mean and standard deviation of the response deriva-

tive were examined to predict the concentration. For all gas mixtures, the standard devia-

tion was approximately 5% of the mean value, and even when it was large, it was less than

approximately 10%. Therefore, the gas mixtures of different concentrations were judged to

be sufficiently distinguishable, and the response derivatives were of sufficient quality to be

used as explanatory variables for the concentration prediction. In this study, the output sig-

nals of up to ten TiO2 NT sensors with different response characteristics were used to pre-

dict the concentrations of CO and O2 gases in the gas mixture using neural-network-based

regression. The difference in the response characteristics of the fabricated sensors may be

due to the inevitable subtle heterogeneity caused by the anodic oxidation and catalytic met-

al decoration performed in the process of sensor fabrication, in addition to the difference in

the Pt nanoparticle sizes and NT thin film width. Notably, such subtle differences in re-

sponse characteristics may be beneficial for gas analysis using the ML technique. In gen-

eral, a certain amount of variation in response characteristics is unavoidable when manufac-

turing gas sensors. Of course, in some cases, it is necessary to make an effort to minimize

variation, but how much uniformity is required depends on the application and how the

sensor is used. In the actual use of such sensors with a certain degree of variability, a

calibration curve is obtained for each sensor, and the unknown gas concentration is

determined based on the calibration curve. The ML used in this study is similar to obtaining

this calibration curve. We speculated that there is no need to suppress the variation in

sensor characteristics; rather, the fact that the response characteristics of each sensor are

different is advantageous for predicting the concentration of the component gases. In other

words, it is better to prepare a large number of sensors with different characteristics. Each

sensor output would be slightly different for gas mixtures with slightly different gas com-

ponent ratios, and the analysis of these outputs using the ML method would lead to a pre-

diction of each gas component’s concentration. The key point of this study was to verify the

prediction accuracy using this method. As described above, we intentionally changed the

size of the sensor and the amount of the catalytic metal coating used to change the charac-

teristics. However, if the characteristics change drastically over time, ML will not be effec-

tive. To date, no problematic changes have occurred for approximately three months. How-

ever, it is probable that, when a sensor is used over a long period, it will change to some

extent over time. In this case, the model for the sensor must be periodically retrained. A relearning

method called fine-tuning (i.e., transfer learning) can be used to shorten the learning time.
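
The dilution of the commercial four-component gas with dry air, described at the beginning of this section, can be worked through numerically. The sketch below assumes ideal volumetric mixing and assumes that dry air contains no CO or He (the "others" fraction of dry air is ignored); the function name and dictionary layout are illustrative only.

```python
# Compositions in volume percent, as quoted in the text.
SOURCE_GAS = {"CO": 0.30, "He": 10.20, "O2": 20.30, "N2": 69.20}
# Assumption: CO and He in dry air are taken as zero; "others" (0.97%) is ignored.
DRY_AIR = {"CO": 0.0, "He": 0.0, "O2": 20.95, "N2": 78.08}


def dilute(target_co_percent):
    """Composition obtained by mixing the source gas with dry air so that
    the CO concentration equals target_co_percent (ideal mixing assumed)."""
    fraction = target_co_percent / SOURCE_GAS["CO"]  # share of source gas
    return {
        gas: fraction * SOURCE_GAS[gas] + (1.0 - fraction) * DRY_AIR[gas]
        for gas in SOURCE_GAS
    }


mix_004 = dilute(0.04)  # target CO: 0.04%
mix_002 = dilute(0.02)  # target CO: 0.02%
```

Under these assumptions the O2 concentration comes out at about 20.86% for 0.04% CO and about 20.91% for 0.02% CO, close to the values quoted for the corresponding test mixtures. It also shows why the CO and O2 concentrations move in opposite directions as the dilution changes.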

3.2. Prediction of gas concentration by neural network regression

In ML, the number of features input to a neural-network-based regression model has a

significant influence on the prediction accuracy. We regarded the differences in response

characteristics among sensors as different features in predicting the component concentra-

tions of the mixed gases. In this study, we developed five different neural network-based

regression models corresponding to five different inputs and used the output data from up

to ten sensors with different response characteristics. The number of inputs (number of sen-

sors) that we examined in this study was two, four, six, eight, and ten. The model schemati-

cally shown in Fig. 5 is for the case of four inputs (four sensors). As shown in Fig. 5, the

network we used has five fully connected layers with n = 1,024, 512, 256, 128, and 64

nodes. In addition to this set of node numbers (1024-512-256-128-64), we have also

checked the prediction accuracy for smaller sized multilayer NNs: (512-256-128-64-32),

(256-128-64-32-16), (128-64-32-16-8), and (64-32-16-8-4). No overlearning (overfitting) was observed

in any of the networks, but prediction accuracy was not stable for the smaller sized NNs

and was degraded compared to the (1024-512-256-128-64) case. Since, as will be described

below, the target prediction accuracy was achieved with the (1024-512-256-128-64) net-

work, multi-layer NNs of larger size were not used. Therefore, we determined that the net-

work size we chose is not too large and is appropriate. In this study, we investigated the in-

fluence of the number of inputs on the accuracy of gas concentration prediction.

A neural network-based regression model, as described above, was developed to predict

the concentrations of CO and O2 in a four-component gas mixture (CO, He, O2, and N2). As

mentioned above, our sensors were not sensitive to He and N2; therefore, we did not predict

their concentrations because these gases should behave as spectators. To optimize the pa-

rameters of the model, four or five of the seven concentration sets listed in Table 1 were

used as training datasets, and the remaining three or two concentration sets were used as

test datasets for evaluation. As summarized in Table 1, we used three combinations of

training and test datasets: Comb-A, Comb-B, and Comb-C. Comb-A and Comb-B each had 128

training and 64 test data points, while Comb-C had 96 of each. Note that the test data

concentrations were within those of the training data for all combinations. In other words,

interpolation was assumed and extrapolation was not considered. In general, if no special

assumptions such as linearity can be made between the explanatory and target variables,

extrapolation is very difficult or impossible regardless of the interpolation accuracy [46].

For our sensors, as shown in Figs. 8 and 9, no linearity between the explanatory and target

variables was observed for any of the sensors, and we did not consider extrapolation. The

number of data points (64, 96, and 128) was selected as follows. We chose 128 and 64 as

the numbers of data points because setting them to powers of two makes efficient use of

computational resources (memory) and increases the computational

efficiency. In summary, we adopted these numbers to reduce the time required for

training and testing. When calculating each of the combinations listed in Table 1, we randomly

and evenly selected data from the output data of each gas mixture to obtain this number of

data. For example, for Comb-A shown in Table 1, we selected data almost equally from

each of A, B, C, E, and G as the training dataset, making a total of 128 data points, and

selected 32 data points each from test datasets D and F, resulting in a total of 64 data points. The reason

we did not set the number of training data sets to 256 or 512, which are even larger powers

of two, is that the maximum number of explanatory variables (number of sensors) is 10,

and we chose the number closest to 10 times the number of explanatory variables. This

method for selecting the number of data points follows the literature [47]. The number of

data points for the test was set to 64 to ensure computational efficiency. In Comb-C, the

number of concentrations for the test increased by one; therefore, to match the number of

data points per concentration with Comb-A and Comb-B, we used 32 data points each from

D, E, and F, for a total of 96 data points. To match the total number of training and test data

points with Comb-A and Comb-B, the number of training data points for Comb-C was set

to 96. The number of data points for training was not 10 times larger than the number of

explanatory variables. Therefore, the memory efficiency was expected to be reduced com-

pared to Comb-A and Comb-B, but the training time did not change. In addition, compared

to Comb-A and Comb-B, the reduction in the amount of training data could work against

the prediction accuracy. However, the prediction accuracy of Comb-C for the test data was

approximately the same as that of Comb-A and Comb-B. This is an important result when

considering practical applications.
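
The data selection for Comb-A described above can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: the helper names, the random seed, and the placeholder measurement pool are assumptions. The key property is that training and test points come from different gas mixtures, so no test concentration appears in the training set.

```python
import random


def even_counts(total, n_groups):
    """Split `total` into n_groups nearly equal integer counts."""
    base, extra = divmod(total, n_groups)
    return [base + 1 if k < extra else base for k in range(n_groups)]


def sample_split(measurements, train_labels, test_labels,
                 n_train, n_test_each, seed=0):
    """Draw training points almost equally from the training mixtures and a
    fixed number of points from each test mixture, without replacement."""
    rng = random.Random(seed)
    train = []
    for label, count in zip(train_labels,
                            even_counts(n_train, len(train_labels))):
        train += [(label, x) for x in rng.sample(measurements[label], count)]
    test = []
    for label in test_labels:
        test += [(label, x) for x in rng.sample(measurements[label], n_test_each)]
    return train, test


# Placeholder pool: 40 dummy values per gas mixture A to G (not measured data).
measurements = {label: [float(k) for k in range(40)] for label in "ABCDEFG"}

# Comb-A: train on A, B, C, E, G (128 points); test on D and F (32 points each).
train, test = sample_split(measurements, list("ABCEG"), list("DF"),
                           n_train=128, n_test_each=32)
```

With five training mixtures, 128 points cannot be divided exactly, so this sketch takes 26 points from three mixtures and 25 from the other two, which is one way to read "almost equally".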

For the loss function, the mean squared error (MSE) between the true and predicted

concentrations is calculated as follows:

$$
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \sum_{j \in \{\mathrm{CO},\, \mathrm{O_2}\}} \left( Y_{i,j} - \hat{Y}_{i,j} \right)^2, \tag{1}
$$

where N is the mini-batch size, $Y_{i,j}$ is the true concentration of the four-component mixed

gases, and $\hat{Y}_{i,j}$ is the prediction value of the gas concentration of each component. To

improve the predictions for CO and O2 gas concentrations, we trained the model. In the

training phase, we optimized the parameters of the model by minimizing the MSE loss function.

It is important to note here how the training and test data were established. In general, the

training and test data are randomly selected from a dataset containing a set of input and

true values for the model. However, in this study, we separated the training data from the

test data. This separation method closely resembled actual concentration measurements.

That is, we trained on the data of a mixture of gases with several known concentration rati-

os and then determined the concentration of each component of the mixture of gases with

unknown concentration ratios. In particular, as mentioned above, we selected four or five of

the seven datasets of concentrations listed in Table 1 for training and then predicted the

concentrations of CO and O2 in the remaining datasets, which we assumed to be unknown

concentrations. Therefore, during the evaluation, we predicted values for concentrations

that were not included in the training, which resulted in poorer prediction accuracy than in

the case of general ML prediction.
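
As a small worked example of the loss in Eq. (1), the sketch below evaluates the MSE for a mini-batch of two samples. It assumes that the squared errors of the two outputs (CO and O2) are summed for each sample and then averaged over the N samples of the mini-batch; the exact normalization is an assumption, and the predicted values are made up for illustration.

```python
def mse_loss(y_true, y_pred):
    """Mean squared error over a mini-batch.
    Each row holds [CO, O2] concentrations in percent. Squared errors are
    summed over the two outputs and averaged over the N rows (assumed form)."""
    n = len(y_true)
    total = 0.0
    for true_row, pred_row in zip(y_true, y_pred):
        total += sum((t - p) ** 2 for t, p in zip(true_row, pred_row))
    return total / n


# True values are the test concentrations quoted in the text (D and F);
# the predictions are illustrative placeholders, not results from the study.
y_true = [[0.04, 20.86], [0.02, 20.90]]
y_pred = [[0.05, 20.86], [0.02, 20.80]]
loss = mse_loss(y_true, y_pred)
```

Because the O2 concentration is numerically far larger than the CO concentration, an unscaled loss of this form is dominated by O2 errors of a given relative size.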

Another aspect to consider in the case of ML prediction is overlearning. We derived the

learning curve of the evaluation data for each dataset combination to check for overlearn-

ing. In this study, the MSE given by Eq. (1) was used to obtain the learning curve. The

learning curve reflects the speed of convergence of the model and its closeness to the true

value. The faster the convergence and the smaller the loss value, the higher the prediction

accuracy. Figure 9 shows the learning curve for each dataset combination when the number

of sensors was set to 10. As is clear from Fig. 9, for Comb-A and Comb-C the value of MSE

gradually decreases, indicating that overlearning does not occur for these two combina-

tions. However, as for Comb-B, the value of MSE initially decreases with an increase in the

number of epochs and then increases slightly with the increase in the number of epochs,

indicating that prediction does not work well for this combination. The reason for this poor

prediction is discussed later.
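
One simple way to flag the behavior seen for Comb-B is to check whether the evaluation MSE rises again after reaching its minimum. The sketch below is an illustration under that assumption, not the procedure used in this study; `overlearning_onset` is a hypothetical helper and both curves are made-up values.

```python
def overlearning_onset(val_mse, tolerance=0.0):
    """Return the epoch index of the minimum evaluation MSE if the curve
    ends more than `tolerance` above that minimum, otherwise None."""
    if not val_mse:
        raise ValueError("learning curve is empty")
    best_epoch = min(range(len(val_mse)), key=val_mse.__getitem__)
    rise = val_mse[-1] - val_mse[best_epoch]
    return best_epoch if rise > tolerance else None


# Hypothetical learning curves (evaluation MSE per epoch), not measured data.
steadily_decreasing = [0.50, 0.20, 0.10, 0.06, 0.05]
decrease_then_rise = [0.50, 0.20, 0.10, 0.12, 0.15]

onset_a = overlearning_onset(steadily_decreasing)
onset_b = overlearning_onset(decrease_then_rise)
```

A nonzero `tolerance` can be used to ignore small fluctuations in a noisy curve.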

Figures 10 and 11 show the dependence of the predicted concentrations of CO and O2

gases on the number of sensors, respectively. In the dataset combination Comb-A, we predicted

the concentrations for dataset D (CO: 0.04%, O2: 20.86%) and dataset F (CO: 0.02%, O2: 20.90%).

The upper portions of Figs. 10(a) and 11(a) show the predicted concentration

expressed as a percentage; the dashed lines indicate the true values. The horizontal

axis of the figure represents the number of input sensors, which ranged from two to ten.

The error bars indicate the standard deviations of the predicted values. In the lower portion

of Figs. 10(a) and 11(a), we plotted the mean absolute error (MAE) expressed as a percent-

age point (%p), as a function of the number of inputs. For the prediction of CO concentra-

tion, the MAE decreased monotonically with an increasing number of inputs, as shown in

Fig. 10(a). This suggests that as the number of sensors used in the prediction increases, the

accuracy of the concentration prediction increases significantly. When the output values

from 10 sensors were used, the minimum MAE value for CO concentration prediction was

0.0016%p for a concentration of 0.04% and 0.0006%p for a concentration of 0.02%. Notably,

the prediction error was equal to or smaller than that of commercially available

electrochemical CO sensors. In contrast, for O2 concentration prediction, increasing the

number of inputs did not notably improve the error, as shown in Fig. 11(a). As indicated in

Table 1, the O2 gas concentration to be predicted was more than ten times larger than

that of CO, and its changes between datasets were considerably smaller, suggesting that the

concentration prediction for O2 was more difficult than that for CO. However, the prediction

was accurate to the smallest digit (second decimal place, 0.01) of the concentration value.

Accordingly, we predicted the gas concentrations of CO and O2 with sufficiently high accu-

racy.
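
For reference, the MAE in percentage points (%p) is the mean absolute difference between predicted and true concentrations, both expressed in percent. The sketch below uses hypothetical repeated predictions for a true CO concentration of 0.04%; the function name and the values are illustrative assumptions, not measured results.

```python
from statistics import mean, stdev


def mae_percentage_points(true_percent, predicted_percent):
    """Mean absolute error between predictions and a true value, in %p."""
    return mean(abs(p - true_percent) for p in predicted_percent)


# Hypothetical repeated predictions (percent) for a true CO value of 0.04%.
true_co = 0.04
predictions = [0.038, 0.041, 0.042, 0.039]

mae = mae_percentage_points(true_co, predictions)  # absolute error in %p
spread = stdev(predictions)                        # basis for the error bars
relative_error_percent = 100 * mae / true_co       # error relative to the true value
```

The distinction matters when comparing gases: the same absolute error in %p corresponds to a much smaller relative error for O2 (about 21%) than for CO (0.01% to 0.3%).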

To investigate the effectiveness and validity of predicting gas concentrations using neu-

ral network regression, we attempted to evaluate the concentrations using the other sets of

training and test data, that is, dataset combinations of Comb-B and Comb-C. In Comb-B,

we selected A, B, D, F, and G out of the seven sets of concentrations summarized in Table 1

for training, and we predicted the concentrations of CO and O2 in the two sets of C and E,

which were assumed to be unknown concentrations. In Figs. 10(b) and 11(b), we plotted the

dependence of the predicted concentrations of CO and O2 gases on the number of sensors,

respectively. In the lower portion of Figs. 10(b) and 11(b), we plotted the MAE as a func-

tion of the number of inputs. In predicting CO concentration, the MAE decreased with an

increase in the number of inputs, although there was some variation, as shown in Fig. 10(b).

This is similar to the trend observed in Fig. 10(a). With 10 sensors, the MAE for

CO concentration prediction was 0.0038%p for a concentration of 0.05% and 0.0017%p for

a concentration of 0.03%. The prediction error for a CO concentration of 0.05%

was larger than that of the other predictions. For O2 concentration prediction, the prediction

error was below 0.01%, which is comparable to that for datasets D and F. The

poor prediction for a CO concentration of 0.05% is likely related to the behavior of the learning

curve shown in Fig. 9(b), in which the MSE increased slightly with the number of epochs.

This poor prediction can be attributed to the dataset used for training, in which the

nearest CO concentration above 0.05% was 0.1%. We suppose that the rather large interval

between 0.05% and 0.1% reduced the accuracy of the concentration prediction. When predicting

a CO concentration of 0.05%, the parameters learned from the data of the neighboring concentrations

of 0.04% and 0.1% were used. Consequently, as the number of epochs increased, the

predictions were expected to gravitate toward 0.04%. Indeed, as shown in the upper portion

of Fig. 10(b), the predicted value was smaller than the true value when the number of sen-

sors was 10. This result suggests that to improve the prediction accuracy, the gas concentra-

tions to be used as training data, and those to be predicted should be distributed as evenly as

possible. Basically, when predicting a concentration of 0.05%, the prediction accuracy

would be improved if data for concentrations between 0.05% and 0.1%, such as 0.075%,

were added as the training data in the ML analysis.
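
The uneven spacing discussed above can be made explicit by listing, for each test concentration, the nearest training concentrations on either side. The sketch below uses the CO values of the Comb-B training sets from Table 1; `bracketing_neighbors` is a hypothetical helper introduced for illustration.

```python
import bisect


def bracketing_neighbors(train_values, target):
    """Return the nearest training values below and above `target`
    (None on a side where no training value exists)."""
    ordered = sorted(train_values)
    i = bisect.bisect_left(ordered, target)
    lower = ordered[i - 1] if i > 0 else None
    upper = ordered[i] if i < len(ordered) else None
    return lower, upper


# CO concentrations (percent) of the Comb-B training sets A, B, D, F, and G.
comb_b_train_co = [0.30, 0.10, 0.04, 0.02, 0.01]

neighbors_005 = bracketing_neighbors(comb_b_train_co, 0.05)  # test set C
neighbors_003 = bracketing_neighbors(comb_b_train_co, 0.03)  # test set E
gap_005 = neighbors_005[1] - neighbors_005[0]
gap_003 = neighbors_003[1] - neighbors_003[0]
```

For test set C the bracketing interval is three times wider than for test set E, which is consistent with the larger error observed at 0.05%.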

For Comb-C, another combination of datasets, we selected A, B, C, and G out of the

seven concentration sets for training, and predicted CO and O2 concentrations in three sets,

D, E, and F. In Figs. 10(c) and 11(c), we plotted the dependence of the predicted concentra-

tions of CO and O2 gases on the number of sensors, respectively. In the lower portion of

Figs. 10(c) and 11(c), we plotted the MAE as a function of the number of inputs. For the

prediction of CO concentration, the MAE decreased with an increasing number of inputs,

which is the same trend as observed in Figs. 10(a) and 10(b). This implies that, as the num-

ber of sensors used in the prediction increases, the accuracy of the concentration prediction

increases. For the 10 sensors, the minimum MAE value for CO concentration prediction

was 0.0018%p for a concentration of 0.04%, 0.0015%p for a concentration of 0.03%, and

0.0014%p for a concentration of 0.02%. For O2 concentration prediction, the prediction was

accurate to the smallest digit (second decimal place, 0.01) of the concentration value. The

accuracy of the concentration prediction of CO and O2 was comparable to that of Comb-A.

For CO concentration prediction, the MAE was less than 0.002%p for all

three concentrations when the number of sensors was 10.

The purpose of this study was to determine whether ML techniques are effective in pre-

dicting concentrations with our developed sensors, and to determine the number of sensor

units required to obtain the desired prediction accuracy. The above results evidently demon-

strate that the ML technique is effective in predicting the concentration using our sensors,

and that it is necessary to prepare approximately 10 sensor elements to obtain the desired

prediction accuracy.

3.3. Residual error analysis

We discussed the appropriateness of the neural-network-based regression model using

residual error analysis. In the regression analysis, the regression equation is as follows:

$$y = f(x) + \epsilon, \qquad (2)$$

where $y$ is the predicted value, $f$ is the regression function, $x$ is the input of the feature

value, and $\epsilon$ is the residual error, which is the difference between the predicted and

observed true values and follows a normal distribution. We can evaluate the validity of the

obtained regression model by determining if $\epsilon$ is normally distributed [48].

In this study, we predicted the concentrations of two species of gases (CO and O2), and

the residuals were as follows:

$$\epsilon_{i,j} = Y_{i,j} - \hat{Y}_{i,j}, \quad j \in \{\mathrm{CO}, \mathrm{O_2}\}, \qquad (3)$$

where $Y_{i,j}$ is the true value for each gas concentration. To analyze the error distribution, we

normalized the residual errors to obtain the normalized residual errors (NRE) as follows:

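$$\mathrm{NRE}_{i,j} = \frac{\epsilon_{i,j} - \bar{\epsilon}_{j}}{\sigma_{\epsilon_j}}, \quad j \in \{\mathrm{CO}, \mathrm{O_2}\}, \qquad (4)$$

where $\bar{\epsilon}_{j}$ is the mean of the residual error and $\sigma_{\epsilon_j}$ is the standard deviation of the residual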

error for each target gas. Fig. 12 shows the quantile-quantile (Q-Q) plot of the NRE ob-

tained for three different dataset combinations, where the NRE is plotted as a function of

the theoretical quantile [48]. If the residual errors follow a normal distribution, the points in

the Q-Q plot should lie on a straight line, as indicated by the solid lines in Fig. 12. Clearly

from Fig. 12, the Q-Q plot is close to a straight line for CO concentration prediction,

suggesting that the residual errors follow a normal distribution. Furthermore, based on the

normality test [49,50], the residual errors for CO concentration prediction were consistent

with a normal distribution at the 5% significance level. Therefore, we can conclude that the obtained regression model is adequate,

and the regression model based on the neural network can predict the concentration of CO

gas components in the four-component gas mixtures. However, for O2 concentration prediction,

the Q-Q plot deviates slightly from a straight line, particularly for the dataset

combination Comb-B. This is because the relative change in the concentration of O2 gas was

smaller than that of CO gas, as mentioned above.
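
The residual analysis above can be sketched as follows: residuals are normalized by their mean and standard deviation, sorted, and paired with standard normal quantiles to give the points of a Q-Q plot. The use of the sample standard deviation and of the plotting positions (k - 0.5)/n are assumptions made for this illustration, and the concentration values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normalized_residuals(y_true, y_pred):
    """Normalized residual errors: (residual - mean) / standard deviation."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mu = mean(residuals)
    sigma = stdev(residuals)  # sample standard deviation (assumed)
    return [(r - mu) / sigma for r in residuals]


def qq_points(nre):
    """Pair sorted NRE values with standard normal theoretical quantiles."""
    n = len(nre)
    std_normal = NormalDist()
    # Plotting positions (k - 0.5) / n are one common choice (assumed here).
    theoretical = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theoretical, sorted(nre)))


# Hypothetical predictions (percent) for a true CO concentration of 0.04%.
y_true = [0.04] * 5
y_pred = [0.038, 0.039, 0.040, 0.041, 0.042]

nre = normalized_residuals(y_true, y_pred)
points = qq_points(nre)
```

If the residuals are normally distributed, the pairs in `points` fall close to the line through the origin with unit slope.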

3.4. Comparison with linear regression analysis

To further indicate the effectiveness of the NN-based regression analysis, we compared

the results with those obtained by linear regression (LG) analysis, which is commonly used

in gas analysis. In linear regression analysis, for each of the ten sensors shown in Fig. 8, the

seven datasets shown in Table 1 were divided into two groups: one for determining model

parameters and the other for predicting. This division was the same as in the NN-based re-

gression analysis. Fig. 13 shows a comparison of the results obtained by LG analysis with

those obtained by NN-based analysis. The results of the LG analysis shown in Fig. 13 are

the mean values of the MAE for the predicted concentrations obtained by LG analysis of

the output values from the 10 sensors. As shown in Fig. 13, the prediction accuracy of the

NN-based analysis was superior to that of the linear regression analysis. The reason for the

poor prediction accuracy of LG analysis is twofold: First, the response value did not have

a linear relationship with the gas concentration, as shown in Fig. 8. Second, the response

characteristics of each sensor are different from each other. As mentioned above, when

there is a mixture of oxidizing and reducing gases in the gas to be detected,

semiconductor gas sensors such as our TiO2 sensor most probably exhibit complex re-

sponse characteristics. Additionally, it is not necessarily easy to fabricate semiconductor

gas sensors with the same response characteristics, and it is inevitable that the response

characteristics will differ from sensor to sensor. The nonlinearity in the response character-

istics and variability in the response characteristics most likely led to a decrease in the pre-

diction accuracy. Contrary to this, in NN-based regression analysis, nonlinearities in re-

sponse characteristics do not adversely affect the analysis, and differences in response char-

acteristics among sensors are advantageous for improving prediction accuracy in the case of

multi-component analysis. Accordingly, we can conclude that it is effective to use NN-

based regression to analyze the output signals from multiple sensors with different response

characteristics. We also tried to predict concentrations using a polynomial regression model with

a quadratic function, but the model overfitted the training data, and the prediction accuracy was

inferior to that of LG analysis (results not shown).
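
A per-sensor linear regression baseline of the kind compared above can be sketched as follows. The sketch assumes that the response is fitted against the logarithm of the concentration (the horizontal axis of Fig. 8 is logarithmic) and that the fitted line is inverted to estimate an unknown concentration; both points are assumptions, and the response values are synthetic. The helpers `fit_line` and `predict_concentration` are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_concentration(response, slope, intercept):
    """Invert the fitted line to estimate a concentration (percent)."""
    return 10 ** ((response - intercept) / slope)


# CO concentrations (percent) of the Comb-A training sets A, B, C, E, and G.
train_co = [0.30, 0.10, 0.05, 0.03, 0.01]
# Synthetic responses that are exactly linear in log10(concentration).
train_response = [5.0 + 1.5 * math.log10(c) for c in train_co]

slope, intercept = fit_line([math.log10(c) for c in train_co], train_response)
test_response = 5.0 + 1.5 * math.log10(0.04)  # synthetic response for set D
estimate = predict_concentration(test_response, slope, intercept)
```

With an exactly linear synthetic response the estimate recovers 0.04%. When the real response deviates from this line, or differs from sensor to sensor, the inverted estimate is biased, which is the limitation described in the text.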

4. Conclusion

We used the ML technique to perform a gas concentration analysis of the TiO2-NT-

based integrated gas sensor that we had developed. In this study, we fabricated TiO2-NT

integrated gas sensors equipped with sensing elements with different response characteris-

tics, and predicted the concentration of each component gas of unknown concentration by

analyzing the output signal when the sensor was exposed to a gas mixture containing nitro-

gen and oxygen as the main components, and a small amount of CO. We confirmed that the

concentrations of CO and O2 gas components were predicted simultaneously, and that the

accuracy of gas concentration prediction could be improved by increasing the number of

sensors. When the number of sensors was set to 10, the gas concentration could be predicted

with an error of less than 0.001%p for a CO concentration of 0.02% in the gas mixture.

This accuracy satisfies the analysis accuracy required for the application of

our TiO2 NT-type sensor for breath analysis. Notably, in the integrated gas sensor devel-

oped in this study, 10 sensor elements could easily be integrated on a Si chip using micro-

fabrication techniques such as photolithography and self-organizing processes (anodization).

Furthermore, one of the features of the sensor we developed is that the response character-

istics of sensor elements arrayed on a Si chip can be arbitrarily changed by varying the sur-

face coverage of the catalytic metal (size of the metal particles) and the size (line width) of

each sensor element. The ability to prepare sensor elements with different characteristics is

advantageous for improving the prediction accuracy of ML.

The high accuracy of concentration prediction presented in this study demonstrates the

usefulness of ML in calibrating sensors with nonlinear and complex responses, such as

semiconductor-based gas sensors. Neural network-based learning is a form of end-to-end

learning, in which the model directly learns the relationship between the features and

the target (supervised) values. Because the ML method used in this study is one of the most

versatile in ML, it can be applied to the highly accurate prediction of gas concentrations,

and a variety of tasks based on gas concentrations, such as odor identification. We plan to

use our TiO2 NT-type gas sensor as an odor sensor (electronic nose). In practical

applications, the durability and reliability of sensors are important evaluation items.

Regarding durability, repeated measurements showed no significant change in the sensor

characteristics for at least three months, confirming that the sensor is sufficiently

durable for the practical use targeted in this research. Regarding reliability, the concentration

prediction in this study met the practical target, and the sensor reliability was judged to

be high. However, when the sensor is applied to more advanced measurements in the future,

such as odor sensing, a detailed study of these evaluation items and a comprehensive study

of various component concentrations will be necessary.

Acknowledgment

The experiments in this study were conducted primarily at the Laboratory for Nanoelec-

tronics and Spintronics, Research Institute of Electrical Communication, Tohoku Universi-

ty, and the Microsystem Integration Research and Development Center, Tohoku University.

This work was supported by the Japan Science and Technology Agency (JST) under the

Grant-in-Aid for Scientific Research on Innovative Areas (A-STEP), Functional Verifica-

tion Phase. We would like to thank Editage (www.editage.com) for English language edit-

ing.

References

[1] C.D. Natale, R. Paolesse, E. Martinelli, R. Capuano, Solid-state gas sensors for breath
analysis: A review, Analytica Chimica Acta 824 (2014) 1-17.
https://doi.org/10.1016/j.aca.2014.03.014.
[2] M. Righettoni, A. Amann, S.E. Pratsinis, Breath analysis by nanostructured metal oxides
as chemo-resistive gas sensors, Mater. Today 18 (2015) 163-171.
https://doi.org/10.1016/j.mattod.2014.08.017.
[3] H. Haick, Y. Broza, P. Mochalski, V. Ruszanyi, A. Amann, Assessment, origin, and
implementation of breath volatile cancer markers, Chem. Soc. Rev. 43 (2014) 1423-1449.
https://doi.org/10.1039/C3CS60329F.
[4] A.M. Azad, S.A. Akbar, S.G. Mhaisalkar, L.D. Birkefeld, K.S. Goto, Solid-state gas
sensors: a review, J. Electrochem. Soc. 139 (1992) 3690.
https://doi.org/10.1149/1.2069145.
[5] N. Miura, T. Raisen, G. Lu, N. Yamazoe, Highly selective CO sensor using stabilized
zirconia and a couple of oxide electrodes, Sensors and Actuators B: Chemical 47
(1998) 84-91. https://doi.org/10.1016/S0925-4005(98)00053-7.
[6] N.O. Savage, S.A. Akbar, P.K. Dutta, Titanium dioxide based high temperature carbon
monoxide selective sensor, Sensors and Actuators B: Chemical 72 (2001) 239-248.
https://doi.org/10.1016/S0925-4005(00)00676-6.
[7] C.S. Moon, H.-R. Kim, G. Auchterlonie, J. Drennan, J.-H. Lee, Highly sensitive and
fast responding CO sensor using SnO2 nanosheets, Sensors and Actuators B: Chemical
131 (2008) 556-564. https://doi.org/10.1016/j.snb.2007.12.040.
[8] S. Hong, Y. Hong, Y. Jeong, G. Jung, W. Shin, J. Park, J.-K. Lee, D. Jang, J.-H. Bae,
J.-H. Lee, Improved CO gas detection of Si MOSFET gas sensor with catalytic Pt
decoration and pre-bias effect, Sensors and Actuators B: Chemical 300 (2019) 127040.
https://doi.org/10.1016/j.snb.2019.127040.
[9] Z. Li, Z. Yao, A.A. Haidry, T. Plecenik, L. Xie, L. Sun, Q. Fatima, Resistive-type
hydrogen gas sensor based on TiO2: a review, Int. J. Hydrogen Energy 43 (2018)
21114-21132. https://doi.org/10.1016/j.ijhydene.2018.09.051.
[10] V. Zwilling, E.D.-Ceretti, A.B.-Forveille, D. David, M.Y. Perrin, M. Aucouturier,
Structure and physicochemistry of anodic oxide films on titanium and TA6V alloy,
Surf. Interface Anal. 27 (1999) 629-637.
https://doi.org/10.1002/(SICI)1096-9918(199907)27:7<629::AID-SIA551>3.0.CO;2-0.
[11] J.M. Macak, H. Tsuchiya, A. Ghicov, K. Yasuda, R. Hahn, S. Bauer, P. Schmuki, TiO2
nanotubes: self-organized electrochemical formation, properties and applications,
Curr. Opin. Solid State Mater. Sci. 11 (2007) 3-18.
https://doi.org/10.1016/j.cossms.2007.08.004.
[12] A. Ghicov, P. Schmuki, Self-ordering electrochemistry: a review on growth and
functionality of TiO2 nanotubes and other self-aligned MOx structures, Chem. Commun.
20 (2009) 2791-2808. https://doi.org/10.1039/B822726H.
[13] O.K. Varghese, D. Gong, M. Paulose, K.G. Ong, C.A. Grimes, Hydrogen sensing
using titania nanotubes, Sensors and Actuators B: Chemical 93 (2003) 338-344.
https://doi.org/10.1016/S0925-4005(03)00222-3.
[14] G.K. Mor, M.A. Cavalho, O.K. Varghese, M.V. Pishko, C.A. Grimes,
Photoelectrochemical properties of titania nanotubes, J. Mater. Res. 19 (2004) 2989-2996.
https://doi.org/10.1557/JMR.2004.0370.
[15] J. Sungwook, I. Muto, N. Hara, Hydrogen Gas Sensor Using Pt- and Pd-Added
Anodic TiO2 Nanotube Films, J. Electrochem. Soc. 157 (2010) J221-J226.
https://doi.org/10.1149/1.3374643.
[16] H. Liu, D. Ding, C. Ning, Z. Li, Wide-range hydrogen sensing with Nb-doped TiO2
nanotubes, Nanotechnology 23 (2012) 015502.
https://doi.org/10.1088/0957-4484/23/1/015502.
[17] Z. Li, D. Ding, Q. Liu, C. Ning, X. Wang, Ni-doped TiO2 nanotubes for wide-range
hydrogen sensing, Nanoscale Res. Lett. 9 (2014) 118.
https://doi.org/10.1186/1556-276X-9-118.
[18] R. Zazpe, M. Knaut, H. Sopha, L. Hromadko, M. Albert, J. Prikryl, V. Gärtnerová,
J.W. Bartha, J.M. Macak, Atomic Layer Deposition for Coating of High Aspect Ratio
TiO2 Nanotube Layers, Langmuir 32 (2016) 10551-10558.
https://doi.org/10.1021/acs.langmuir.6b03119.
[19] A. Yu, H. Xun, J. Yi, Improving hydrogen sensing performance of TiO2 nanotube
arrays by ZnO modification, Front. Mater. 6 (2019).
https://doi.org/10.3389/fmats.2019.00070.
[20] K. Ishibashi, R. Yamaguchi, Y. Kimura, M. Niwano, Fabrication of titanium oxide
nanotubes by rapid and homogeneous anodization in a mixture of perchloric acid and
ethanol, J. Electrochem. Soc. 155 (2008) K10-K14.
https://doi.org/10.1149/1.2801975.
[21] R. Kojima, Y. Kimura, M. Bitoh, M. Abe, M. Niwano, Investigation of influence of
electrolyte composition on formation of anodic titanium oxide nanotube films, J.
Electrochem. Soc. 159 (2012) D629-D636. https://doi.org/10.1149/2.003211jes.
[22] R. Kojima, Y. Kimura, T. Ma, K. Ishibashi, D. Tadaki, R.A. Rosenberg, A. Hirano-Iwata,
M. Niwano, Fabrication and characterization of front-illuminated dye-sensitized
solar cells with anodic titanium oxide nanotubes, J. Electrochem. Soc. 164
(2017) H78-H84. https://doi.org/10.1149/2.1031702jes.
[23] Y. Kimura, S. Kimura, R. Kojima, M. Bitoh, M. Abe, M. Niwano, Micro-scaled
hydrogen gas sensors with patterned anodic titanium oxide nanotube film, Sensors and
Actuators B: Chemical 177 (2013) 1156-1160.
https://doi.org/10.1016/j.snb.2012.12.016.
[24] H. Abe, Y. Kimura, T. Ma, D. Tadaki, A. Hirano-Iwata, M. Niwano, Response
characteristics of a highly sensitive gas sensor using a titanium oxide nanotube film
decorated with platinum nanoparticles, Sensors and Actuators B: Chemical 321 (2020)
128525. https://doi.org/10.1016/j.snb.2020.128525.
[25] W. Khalaf, C. Pace, M. Gaudioso, Gas detection via machine learning, Int. Scholarly
and Sci. Res. Innovation 2 (2008) 61-65. https://doi.org/10.5281/zenodo.1075909.
[26] Lei Zhang, Fengchun Tian, Chaibou Kadri, Guangshu Pei, Hongjuan Li, Lina Pan,
Gases concentration estimation using heuristics and bio-inspired optimization models
for experimental chemical electronic nose, Sensors and Actuators B: Chemical 160
(2011) 760-770. https://doi.org/10.1016/j.snb.2011.08.060.
[27] I. Rodriguez-Lujan, J. Fonollosa, A. Vergara, M. Homer, R. Huerta, On the
calibration of sensor arrays for pattern recognition using the minimal number of
experiments, Chemom. Intell. Lab. Syst. 130 (2014) 123-134.
https://doi.org/10.1016/j.chemolab.2013.10.012.
[28] L. Zhang and F. Tian, Performance Study of Multilayer Perceptrons in a Low-Cost
Electronic Nose, IEEE Transactions on Instrumentation and Measurement, 63 (2014)
2014. https://doi.org/10.1109/TIM.2014.2298691.
[29] G. Imamura, K. Shiba, G. Yoshikawa, Smell identification of spices using
nanomechanical membrane-type surface stress sensors, Jpn. J. Appl. Phys. 55 (2016) 1102B3.
https://doi.org/10.7567/jjap.55.1102b3.
[30] S. De Vito, E. Esposito, M. Salvato, O. Popoola, F. Formisano, R. Jones, G. Di Francia,
Calibrating chemical multisensory devices for real world applications: An in-depth
comparison of quantitative machine learning approaches, Sensors and Actuators
B: Chemical 255 (2018) 1191-1210. https://doi.org/10.1016/j.snb.2017.07.155.
[31] M. Tatarko, E.S. Muckley, V. Subjakova, M. Goswami, B. Sumpter, T. Hianik, I.N.
Ivanov, Machine learning enabled acoustic detection of sub-nanomolar concentration
of trypsin and plasmin in solution, Sensors and Actuators B: Chemical 272 (2018)
282-288. https://doi.org/10.1016/j.snb.2018.05.100.
[32] L. Han, C. Yu, K. Xiao, and X. Zhao, A New Method of Mixed Gas Identification
Based on a Convolutional Neural Network for Time Series Classification, Sensors
(Basel) 19(9) (2019) 1960. https://doi.org/10.3390/s19091960.
[33] I. Essiet, Y. Sun, Z. Wang, Big data analysis for gas sensor using convolutional neural
network and ensemble of evolutionary algorithms, Procedia Manufacturing 35 (2019)
629-634. https://doi.org/10.1016/j.promfg.2019.06.005.
[34] J.R.R. Kumar, R.K. Pandey, B.K. Sarkar, Pollutant Gases Detection using the
Machine learning on Benchmark Research Datasets, Procedia Computer Science 152
(2019) 360-366. https://doi.org/10.1016/j.procs.2019.05.005.
[35] J. Thorson, A. Collier-Oxandale and M. Hannigan, Using A Low-Cost Sensor Array
and Machine Learning Techniques to Detect Complex Pollutant Mixtures and Identify
Likely Sources, Sensors 19 (2019) 3723. https://doi.org/10.3390/s19173723.
[36] S. Acharyya, B. Jana, S. Nag, G. Saha, P.K. Guha, Single resistive sensor for
selective detection of multiple VOCs employing SnO2 hollowspheres and machine
learning algorithm: A proof of concept, Sensors and Actuators B: Chemical 321 (2020)
128484. https://doi.org/10.1016/j.snb.2020.128484.
[37] M. Aliramezani, A. Norouzi, C.R. Koch, A grey-box machine learning based model
of an electrochemical gas sensor, Sensors and Actuators B: Chemical 321 (2020)
128414. https://doi.org/10.1016/j.snb.2020.128414.
[38] Soo-Yeon Cho, Youhan Lee, Sangwon Lee, Hohyung Kang, Jaehoon Kim, Junghoon
Choi, Jin Ryu, Heeeun Joo, Hee-Tae Jung, and Jihan Kim, Finding Hidden Signals in
Chemical Sensors Using Deep Learning, Anal. Chem. 92 (2020) 6529–6537.
https://doi.org/10.1021/acs.analchem.0c00137.
[39] N.X. Thai, M. Tonezzer, L. Masera, H. Nguyen, N.V. Duy, N.D. Hoa, Multi gas
sensors using one nanomaterial, temperature gradient, and machine learning algorithms
for discrimination of gases and their concentration, Analytica Chimica Acta 1124
(2020) 85-93. https://doi.org/10.1016/j.aca.2020.05.015.
[40] C.G. Viejo, S. Fuentes, A. Godbole, B. Widdicombe, R.R. Unnithan, Development of
a low-cost e-nose to assess aroma profiles: An artificial intelligence application to
assess beer quality, Sensors and Actuators B: Chemical, 308 (2020) 127688.
https://doi.org/10.1016/j.snb.2020.127688.
[41] W. Zhang, L. Wang, J. Chen, W. Xiao, and X. Bi, A Novel Gas Recognition and
Concentration Detection Algorithm for Artificial Olfaction, IEEE Transactions on
Instrumentation and Measurement, 70 (2021) 2509514.
https://doi.org/10.1109/TIM.2021.3071313.
[42] D.P. Kingma, J. Ba, Adam: A Method for Stochastic Optimization, Conference paper
at the 3rd International Conference for Learning Representations, San Diego, (2015).
[43] Y. Bengio, Practical Recommendations for Gradient-Based Training of Deep
Architectures, Neural Networks: Tricks of the Trade, (2012) 437-478.
https://doi.org/10.1007/978-3-642-35289-8_26.
[44] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition,
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016)
770–778. https://doi.ieeecomputersociety.org/10.1109/CVPR.2016.90.
[45] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, Chapter 8: Optimization
for Training Deep Models, The MIT Press, Cambridge, 2016, pp. 271-325.
[46] H. Lohninger, Teach/Me Data Analysis, Springer-Verlag, Heidelberg, 1999.
[47] P. Peduzzi, J. Concato, A.R. Feinstein, and T.R. Holford, Importance of events per
independent variable in proportional hazards regression analysis II. Accuracy and
precision of regression estimates, Journal of Clinical Epidemiology, 48(12), (1995)
1503-1510. https://doi.org/10.1016/0895-4356(95)00048-8.
[48] M.B. Wilk, R. Gnanadesikan, Probability plotting methods for the analysis of data,
Biometrika 55 (1968) 1-17. https://doi.org/10.1093/biomet/55.1.1.
[49] R.B. D’Agostino, An omnibus test of normality for moderate and large sample size,
Biometrika, 58 (1971) 341-348. https://doi.org/10.2307/2334522.
[50] R. D’Agostino, E.S. Pearson, Tests for departure from normality, Biometrika 60
(1973) 613-622. https://doi.org/10.1093/biomet/60.3.613.

Table 1. Concentrations of each component gas in the mixed gases.

Dataset   CO      He       O2       N2       Comb-A   Comb-B   Comb-C
A         0.30%   10.00%   20.00%   69.70%   ○        ○        ○
B         0.10%   3.40%    20.73%   75.12%   ○        ○        ○
C         0.05%   1.70%    20.84%   76.60%   ○        ※        ○
D         0.04%   1.36%    20.86%   76.90%   ※        ○        ※
E         0.03%   1.02%    20.88%   77.20%   ○        ※        ※
F         0.02%   0.68%    20.90%   77.49%   ※        ○        ※
G         0.01%   0.34%    20.93%   77.79%   ○        ○        ○

※ Datasets for testing, ○ Datasets for training.

Figure captions:

Fig. 1. Schematic of fabricated gas sensor. The gas detection medium is a film of regular-
ly arranged titanium dioxide (TiO2) nanotubes with a diameter of approximately
100 nm.

Fig. 2. (a) Schematic and (b) photograph of sensor elements fabricated on Si substrate (Si
wafer). Six sensor elements with different line widths were arranged on the silicon
substrate.

Fig. 3. Scanning electron microscope (SEM) images of Pt nanoparticles with diameters of
(a) 5 nm and (b) 10 nm loaded onto the TiO2 nanotube film.

Fig. 4. Response characteristics of the sensor exposed to a gas mixture of four different
gases (CO, He, O2, and N2).

Fig. 5. Neural network model used in this study. The neural network is composed of an
input layer, a hidden layer consisting of five fully connected layers, and an output layer. In the
hidden layer, the five fully connected layers have n = 1024, 512, 256, 128, and 64
nodes.

Fig. 6. Response characteristics of Pt-decorated TiO2 gas sensors with a line width of
1,000 μm and Pt nanoparticle sizes of (a) 5 nm and (b) 10 nm, when exposed to
four-component gas mixtures with different mixing ratios.

Fig. 7. Response characteristics of Pt-decorated TiO2 gas sensors with a line width of 100
μm and Pt nanoparticle sizes of (a) 5 nm and (b) 10 nm, when exposed to four-
component gas mixtures with different component ratios.

Fig. 8. Dependence of the maximum value of d(R0/RG)/dt on (a) CO and (b) O2 gas con-
centrations for Pt-decorated TiO2 gas sensors with different line widths (100 and
1,000 μm) and Pt particle sizes (5 and 10 nm). The horizontal axis is logarithmic.
Dotted lines represent the regression lines obtained by linear regression analysis.

Fig. 9. Learning curves for test data in different training and test datasets. (a) Datasets A,
B, C, E, and G in Table 1 were used for training, and D and F were used for test-
ing. (b) Datasets A, B, D, F, and G were used for training, and C and E were used
for testing. (c) Datasets A, B, C, and G were used for training, and D, E, and F
were used for testing. The dotted lines are the learning curves magnified by a fac-
tor of 10 in the vertical direction.

Fig. 10. Dependence of the predicted concentration of CO (upper panel) and MAE (lower
panel) on the number of sensor elements in different training and test datasets.
From (a) to (c), the test datasets are D and F; C and E; and D, E, and F,
respectively. Error bars indicate the standard deviation of the predicted
concentration.

Fig. 11. Dependence of the predicted concentration of O2 (upper panel) and MAE (lower
panel) on the number of sensor elements in different training and test datasets.
From (a) to (c), the test datasets are D and F; C and E; and D, E, and F,
respectively. Error bars indicate the standard deviation of the predicted
concentration.

Fig. 12. Q-Q plot of residual error for test data in different training and test datasets corre-
sponding to Figs. 10 and 11.

Fig. 13. Comparison of neural network (NN) regression analysis with linear regression
analysis.

