CFC-GAN: Road Crack Forecasting Model
Authorized licensed use limited to: Indian Institute of Technology - BHUBANESWAR. Downloaded on November 24,2024 at [Link] UTC from IEEE Xplore. Restrictions apply.
SEKAR AND PERUMAL: CFC-GAN: FORECASTING ROAD SURFACE CRACK USING FORECASTED CRACK GAN 21379
Based on the investigation mentioned earlier, the contributions of the present work are summarized as follows:
• We have proposed a conditional Forecasted Generative Adversarial Network model, namely CFC-GAN, which is used to generate the forecasted crack images based on various conditional factors like temperature, precipitation, road aging, and the number of vehicles traveled (month wise).
• We have conducted wide experiments to verify the performance of our proposed model. To our knowledge, this is the first approach to propose a crack forecast using GAN. The results demonstrate that the proposed model achieves higher performance in terms of quantitative and qualitative analysis.
• We introduce a new road crack dataset, the road surface Crack ForeCast (CFC) dataset, with mapped image pairs at different time intervals.
The structure of the paper is organised as follows. Section II reviews the most relevant research papers that address the various approaches used to detect cracks on the road surface, generate crack images using GAN, and synthesise images using conditional GAN for various fields. Section III describes the proposed method to forecast the road crack images using CFC-GAN based on various conditions like the number of vehicles passed, temperature, precipitation, and road aging. Section IV demonstrates the experimental work carried out for forecasting the crack images using CFC-GAN. Section V discusses the results obtained for the proposed approach and compares them with other existing state-of-the-art approaches. Section VI concludes our work.

II. RELATED WORKS

Various models were developed for automatic road crack detection, segmentation, and classification, whereas for road surface crack forecasting, no articles have been proposed to the best of our knowledge. However, a few researchers have developed proposals for forecasting scenery images and sensory data [21]. This section will discuss the study of image synthesis using the basic GAN model, the condition-based GAN model, and how paired image-to-image translations were implemented by incorporating conditional information.
Various traditional image processing-based research papers were proposed at the initial stage for road crack detection. Later, much research was carried out for road crack detection based on machine learning and deep learning models. Recently, the CrackGAN model was implemented for synthesising crack ground truth images in [19]. They introduced a Crack-Patch-Only (CPO) supervised GAN and an asymmetric U-shape generator network to segment crack regions for the given set of input crack images.
In recent years, various GAN-based applications have been developed, such as image-to-image translation, text-to-image translation, and the generation of high-resolution images. GAN plays a major role in image synthesis tasks like face aging, natural images, dehazing, and art images based on conditions [22]–[29] and [30]. GAN [31] has two convolution models: the generator model, represented as G, and the discriminator model, represented as D. The generator model generates images from the given inputs (noise vector). The discriminator model discriminates whether the generated images are real or fake.
The AlignGAN model [32] was designed using conditional GAN. In AlignGAN, the generator has an input layer without condition-based domain vectors, whereas the discriminator has an input layer with condition-based domain vectors for generating cross-domain images. A Contextual Generative Adversarial Network model was proposed in [33] to synthesise face aging images. It collectively has three networks: an age discriminator, a conditional transformation network, and a transition pattern discriminator, which help to develop face aging from baby to teenager, aging from 30 to 50, and also wrinkles. The Age-cGAN approach [34] was designed to synthesise face aging across various age ranges (0-18, 19-29, 30-39, 40-49, 50-59, 60+) while preserving the original identity of a person. The Conditional Adversarial AutoEncoder (CAAE) [22] is used to synthesise face aging images based on low-dimensional features.
The approach in [23] achieved synthesis of face aging images based on identity-preserved information using the AlexNet model, whereas [24] uses three different face patches of the person along with the conditional input vector. [25] developed a model to generate face aging images based on a semantic and attention mechanism by deploying the Wavelet Packet Transform with the discriminator. The quality of the generated images was measured using the face++ tool. [26] explored a GAN to generate face aging images by incorporating age-related features into the discriminator. The discriminator uses a pyramid structure to differentiate between real and fake images. The PFA-GAN model [35] proposed several sub-networks which are used to mimic and generate face aging images based on end-to-end training. The Pearson correlation coefficient was used as a metric to evaluate the synthesised face images. Triple-GAN [36] implemented a triple translation loss in the generator while generating face aging images, and progressive mappings are also applied based on age domains. ID-CGAN [37] introduces a multi-scale discriminator for identifying real or fake images using global and local information to generate single-image de-raining results. The quality of the generated de-raining images was measured using various metrics like the Universal Quality Index, Peak Signal to Noise Ratio (PSNR), Structural SIMilarity (SSIM), and Visual Information Fidelity (VIF). Lifelong GAN [38] generates images under two different circumstances: image-conditioned and label-conditioned image generation. Knowledge distillation is used for both image-conditioned and label-conditioned generation tasks to address the main issue of catastrophic forgetting. The generated image quality was measured using Acc (Accuracy), trained with real images and evaluated over the generated images [39].
[27] proposed a fuseGAN model for fusing multi-focus image detection and accurately classifying the focused regions in the PASCAL VOC 2012 dataset using a Siamese network. [28] generated natural and artistic images using CResNetBlocks by incorporating a gradient loss function from the categorical discriminator. The forecast model proposed in [21] uses a probabilistic GAN on the Lorenz dataset to learn future values from the condition window.
21380 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 11, NOVEMBER 2022
Fig. 3. Proposed architecture using conditional forecasted crack-generative adversarial network (CFC-GAN) for synthesising the crack forecast images.
While comparing the synthesized crack images x̂ with the targeted crack images C_tar, the discriminator D is iteratively trained until the synthesized images are never classified as fake. For real crack images, the probability of belonging to the real class, D(x | C_tar), will be high. D is also applicable for adjusting the input label C_t for the synthesised crack images. The condition-based function V_c(D, G) is obtained using equation (2).

\min_G \max_D V_c(D, G) = \mathbb{E}_{x \sim p_x(x)}[\log D(x \mid C_{tar})] + \mathbb{E}_{y \sim p_y(y)}[\log(1 - D(G(y \mid C_{tar})))]. \quad (2)

The conditional GAN uses equation (2) for optimization. In [49], a least-squares objective is introduced to generate high-quality images through a more stable learning process using the Least Squares GAN (LSGAN). We introduce a conditional forecasted crack loss function for synthesising the appropriate forecasted crack images. Mathematically, the conditional loss is formulated for both the generator and discriminator using equations (3) and (4).

\min L_G = \frac{1}{2}\,\mathbb{E}_{y \sim p_y(y)}\big[(D(G(y) \mid C_{tar}) - 1)^2\big]. \quad (3)

\min L_D = \frac{1}{2}\,\mathbb{E}_{x \sim p_x(x)}\big[(D(x \mid C_{tar}) - 1)^2\big] + \frac{1}{2}\,\mathbb{E}_{y \sim p_y(y)}\big[D(G(y \mid C_{tar}))^2\big]. \quad (4)

where E_x and E_y represent the expectations over real and forecasted crack images, respectively.
The CFC-GAN model consists of various sub-networks to synthesise the forecasted crack images based on the conditional factors. Every sub-network extracts the features to synthesise the forecasted crack images based on various factors between six adjacent road aging groups. We have incorporated the GAN model to optimize each sub-network based on the paired crack age dataset. Figure 3 and Table I show the framework of our proposed Conditional Forecasted Crack-GAN model, which consists of a generator and a discriminator associated with conditional factors and loss functions.

C. Generator

The sub-generator is fed with the input of originally captured first-month crack images of size 256 × 256 × 3 and 256 × 256 × 4 conditional feature maps. Our network model is structured as a residual encoder-decoder. The encoder has four convolutional layers with 16, 32, 64, and 128 filters, respectively, each with a kernel size of 4 × 4. The convolutional layers are followed by six residual layers with a kernel size of 3 × 3. The decoder consists of four deconvolutional layers, mirroring the encoder, with 128, 64, 32, and 16 filters of size 4 × 4. Except for the final convolutional layer, the remaining layers use batch normalization (BN) along with the Leaky Rectified Linear Unit (LReLU) activation function.

D. Discriminator

The discriminator takes the generated forecasted crack images along with the real images, associated with the conditional feature maps, to discriminate the results for the image-to-image translation task. We deployed various conditional factors into the discriminator in order to verify that the generated synthesised images are consistent with the provided conditions. The discriminator network has a series of 7 convolution layers with 4 × 4 filters. We have used the following naming convention: Con_k indicates a 4 × 4 convolution layer, batch normalization (BN), and Leaky Rectified Linear Unit (LReLU) layer with stride two and k output channels.
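The least-squares objectives in equations (3) and (4) are straightforward to express in code. The sketch below is a framework-agnostic NumPy illustration (the paper's implementation is in PyTorch); `d_real` and `d_fake` stand for the discriminator's scores D(x | C_tar) and D(G(y | C_tar)), and all names here are illustrative, not from the paper.

```python
import numpy as np

def generator_loss(d_fake):
    """Eq. (3): the generator pushes D's scores on generated
    forecasted crack images toward 1 (i.e. toward "real")."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

def discriminator_loss(d_real, d_fake):
    """Eq. (4): D pushes scores on real crack images toward 1
    and scores on generated images toward 0."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

# Illustrative discriminator scores for a batch of four images
# under the target condition C_tar.
d_real = np.array([0.90, 0.80, 0.95, 0.85])   # D(x | C_tar)
d_fake = np.array([0.20, 0.10, 0.30, 0.15])   # D(G(y | C_tar))
g_loss = generator_loss(d_fake)
d_loss = discriminator_loss(d_real, d_fake)
```

A perfect discriminator (scores of 1 on real images and 0 on fakes) drives L_D to zero, while the generator's loss shrinks as its outputs push D(G(y | C_tar)) toward 1.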
TABLE I
CFC-GAN MODEL — DISCRIMINATOR AND GENERATOR NETWORK ARCHITECTURE

generate the forecasted crack images from G, which are more realistic and similar to the originally captured ground truth images. The adversarial loss function is obtained using equations (3) and (4).

F. L1 Regularization Loss

The L1 loss is introduced to find the sum of the absolute deviations between the ground truth images (gt_i) and the generated forecasted crack images (gf_i). The L1 loss function is obtained using equation (5).

L_1 = \sum_{i=1}^{N} |gt_i - gf_i|. \quad (5)
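Equation (5) is a plain sum of absolute pixel deviations. A minimal NumPy sketch (the variable names are illustrative):

```python
import numpy as np

def l1_regularization_loss(gt, gf):
    """Eq. (5): sum of absolute deviations between a ground-truth
    image gt and a generated forecasted crack image gf."""
    return float(np.sum(np.abs(gt.astype(np.float64) - gf.astype(np.float64))))

# Two illustrative 2 x 2 single-channel "images".
gt = np.array([[10, 20], [30, 40]])
gf = np.array([[12, 18], [30, 35]])
loss = l1_regularization_loss(gt, gf)  # |10-12| + |20-18| + |30-30| + |40-35| = 9
```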
H. Overall Loss

The overall loss function is used to optimize the error that occurs in both the generator and the discriminator. L_overall is represented in equation (13).

\min L_{overall} = L_D + L_{1reg} + L_{SSIM(I_{gt}, I_{gen})}. \quad (13)

The generator and discriminator are alternately updated until they converge during training. End-to-end training was used to avoid accumulative errors in our CFC-GAN. We have incorporated the conditional factors for both the generator and discriminator, so that the model learns and adapts to different conditions like road aging, the number of vehicles traveled, precipitation, and temperature. Besides, the error from the generator gets back-propagated since we have trained our CFC-GAN model in an end-to-end manner.

IV. EXPERIMENTS

In this section, we discuss the experimental setup carried out for our proposed method. We describe the dataset collected for forecasting along with its conditional information. We have also generated forecasted crack images for the existing benchmark datasets. The various quantitative and qualitative metrics used to evaluate our proposed model are highlighted along with its implementation details.
The proposed CFC-GAN model for generating the forecasted crack images was implemented in the PyTorch environment. We have used a dataset split of 70% of images for training, 10% for validation, and 20% for testing. The CFC-GAN model is trained to forecast the growth of cracks from the 1st month to the 2nd, 3rd, and 6th months, from the 2nd to the 3rd and 6th months, and from the 3rd month to the 6th month. The stride is set to 2 for the generator and discriminator. Batch Normalization (BN) is used for regularizing the generator and discriminator networks. The LeakyReLU activation function is used for both the generator and discriminator of the proposed model.
Crack ForeCast (CFC) Dataset: The CFC dataset consists of three different kinds of data: crack images (month wise), the number of vehicles traveled, and climatic conditions. The crack image dataset was collected from the Inner Ring Road located at Chennai, Tamilnadu, India. For generating the forecasted crack image dataset, we collected the crack images at certain time intervals (1st month, 2nd month, 3rd month and 6th month). We collected images from 10 different locations. We collected the location information like latitude, longitude, elevation, and accuracy using a GPS device. A maximum of 98-113 images are captured by changing the elevation and accuracy for a single location. From the collected images, we filtered around 100 images for each location. In order to train the adversarial network, image augmentation was performed. We collected the vehicle information from Vanagaram Toll Plaza, Chennai, Tamilnadu, India. The data provided by the Toll Plaza covers the same duration. Figure 4 shows the number and type of vehicles traveled every month. We collected the climate information from the public website [Link] [Link]. The average maximum temperature (°F), average minimum temperature (°F), and average precipitation (mm) are derived from it. Figure 5 shows the climate condition information.

Fig. 4. Number and type of vehicles travelled month wise data.
Fig. 5. Climatic conditions data.

We have evaluated our CFC-GAN model on various benchmark road crack datasets. We treated those crack images as 1st month images, and the conditional information remains the same.
• CrackForest Dataset (CFD): In [50], a road crack dataset was generated, comprising 118 images of size 480 × 320.
• Cracktree 200 Dataset: In [51], road pavement images with cracks were collected, comprising 206 images of size 800 × 600.
• Crack 500 Dataset: In [13] and [52], a road pavement crack dataset was collected from the Temple University campus through cell phones. The dataset comprises 500 images of pixel size 2,000 × 1,500.
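The generator consumes 256 × 256 × 4 conditional feature maps alongside the 256 × 256 × 3 crack image (Section III). The paper does not spell out how the scalar conditions (road aging, vehicles traveled, temperature, precipitation) are turned into maps, so the broadcast encoding below is only a plausible sketch under that assumption, with illustrative names:

```python
import numpy as np

def make_condition_maps(road_age, vehicles, temperature, precipitation,
                        size=256):
    """Broadcast four normalized scalar conditions (assumed in [0, 1])
    into constant-valued planes of shape (size, size, 4). This encoding
    is an assumption for illustration, not the paper's stated method."""
    conditions = np.array([road_age, vehicles, temperature, precipitation],
                          dtype=np.float64)
    return np.ones((size, size, 1), dtype=np.float64) * conditions

maps = make_condition_maps(road_age=0.5, vehicles=0.8,
                           temperature=0.6, precipitation=0.2)
# Concatenated with a 256 x 256 x 3 crack image along the channel axis,
# this yields a 256 x 256 x 7 generator input.
```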
We have resized the images of the above-mentioned benchmark datasets to 256 × 256 pixels, since our model is designed to accept input images of 256 × 256 pixels.
Our proposed model is evaluated with metrics like Structural SIMilarity (SSIM), Peak Signal to Noise Ratio (PSNR), Inception Score (IS), Frechet Inception Distance (FID), and Intersection over Union (IoU), which are discussed below.
• Structural SIMilarity (SSIM_CFC-GAN): SSIM [29] is used to measure how similar the generated forecasted crack images are to the actual images. It is measured based on luminance, contrast, and structure [53].
• Peak Signal to Noise Ratio (PSNR_CFC-GAN): PSNR is the ratio between the generated forecasted crack images and the original images. PSNR is used to measure the quality between the originally captured and the generated forecasted crack images [53]. A higher PSNR score represents a better forecasted crack image. The PSNR score is computed using equation (14).

PSNR_{CFC-GAN} = 10 \log_{10} \frac{MAX_{fluc}}{\sqrt{MSE}}. \quad (14)

where the Mean Square Error (MSE) is computed using equation (15).

MSE_{CFC-GAN} = \frac{1}{xy} \sum_{i=0}^{x-1} \sum_{j=0}^{y-1} \|I_o(i, j) - I_{gen}(i, j)\|^2. \quad (15)

where I_o represents the originally captured image, I_gen represents the generated image, x indicates the number of pixels in rows and i indicates the index of that row, y indicates the number of pixels in columns and j indicates the index of that column. MAX_fluc denotes the maximum fluctuation in the originally captured crack images.
• Inception Score (IS_CFC-GAN): The Inception score [54], [55] is used to evaluate the quality of the generated (forecasted) images [20]. The Inception score is computed using equation (16).

IS_{CFC-GAN} = \exp\big(\mathbb{E}_{x \sim p_f}\, D_{KL}(p(y|x) \,\|\, p(y))\big). \quad (16)

where x ∼ p_f represents that x is an image sample from p_f, D_KL represents the KL divergence between the distributions of any two samples i and j, p(y|x) indicates the conditional label distribution, and p(y) indicates the marginal label distribution. A higher Inception score represents a better result.
• Frechet Inception Distance (FID_CFC-GAN): FID [56] is mainly used to measure the quality of the forecasted image [20]. It is computed by taking the difference between the originally captured crack images and the generated forecasted crack images. FID is computed using equation (17).

FID_{CFC-GAN} = \|\mu_{actual} - \mu_{forecast}\|^2 + \mathrm{Tr}\big(\Sigma_{actual} + \Sigma_{forecast} - 2(\Sigma_{actual} \Sigma_{forecast})^{1/2}\big). \quad (17)

where μ_actual and μ_forecast represent the means of the actually captured images and the generated forecasted images, Tr represents the trace operator, and Σ_actual and Σ_forecast represent the covariances of the actually captured images and the generated forecasted images, respectively.
• Intersection over Union (IoU_CFC-GAN): IoU is mainly used to compute the accuracy of the forecasted crack image. Edge detection is applied to the crack regions of both the generated and original images. Then IoU is computed between the generated and original images using equation (18).

IoU_{CFC-GAN} = \frac{generated\ image \,\cap\, original\ image}{generated\ image \,\cup\, original\ image}. \quad (18)

The Conditional GAN+Adversarial model, the Conditional GAN+Adversarial+L1 model, CFC-GAN with ReLU as activation function (CFC-GAN+ReLU), and CFC-GAN with Leaky ReLU as activation function (CFC-GAN) were trained for 1000, 600, 600, and 500 epochs, respectively, before convergence. The batch size is set to 16 for all the Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, CFC-GAN+ReLU, and CFC-GAN models. The learning rate of the generator is set to 0.001, and that of the discriminator is set to 0.0005. We used an ADAM optimizer with beta1=0.9 and beta2=0.999 to train the models. Figures 6a, 6b, 6c, and 6d show the generator and discriminator losses of the Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, CFC-GAN+ReLU, and CFC-GAN models, respectively.

V. RESULTS

In this section, the performance of the CFC-GAN model is analyzed both quantitatively and qualitatively. The results are elaborated for the CFC-GAN model using various metrics for the forecasted crack images. The performance of the proposed Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, CFC-GAN+ReLU, and CFC-GAN models is evaluated on the proposed and other existing benchmark crack datasets.

A. Ablation Study

Tables II and III show the results obtained for the various evaluation metrics of the proposed Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, CFC-GAN+ReLU, and CFC-GAN models for generated forecasted crack images based on road aging, number of vehicles traveled, and climatic conditions on the proposed CFC dataset and other existing datasets.
Based on the results shown in Table II, the following observations are made:
• From the results shown in Table II, the proposed CFC-GAN model performs better than the Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, and CFC-GAN+ReLU models used for generating forecasted crack images. This is because of the use of paired translation-based mapping conditional information along with the L1 regularization loss and SSIM loss
Fig. 6. Generator and discriminator—training loss. (a) Conditional GAN+adversarial. (b) Conditional GAN+adversarial+L1. (c) Conditional forecast
crack-GAN with ReLU. (d) Conditional forecast crack-GAN.
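The PSNR of equations (14)-(15) and the IoU of equation (18) can be sketched with NumPy as below; the PSNR follows equation (14) as printed. MAX_fluc is taken as 255 for 8-bit images, and boolean masks stand in for the edge-detected crack regions — both are assumptions for illustration.

```python
import numpy as np

def psnr(original, generated, max_fluc=255.0):
    """Eqs. (14)-(15), as printed: 10 * log10(MAX_fluc / sqrt(MSE))
    between the captured image I_o and the forecasted image I_gen."""
    mse = np.mean((original.astype(np.float64)
                   - generated.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_fluc / np.sqrt(mse))

def iou(generated_mask, original_mask):
    """Eq. (18): intersection over union of boolean crack-region
    masks (e.g. obtained by edge detection and thresholding)."""
    intersection = np.logical_and(generated_mask, original_mask).sum()
    union = np.logical_or(generated_mask, original_mask).sum()
    return intersection / union
```

Higher PSNR and IoU both indicate a forecasted image closer to the captured ground truth.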
TABLE II
GENERATION OF ROAD CRACK FORECASTED IMAGES OF SIZE 256 × 256 USING CFC-GAN FOR THE PROPOSED DATASET

in addition to the adversarial loss used in the Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, and CFC-GAN+ReLU models. When we use the ReLU activation function, it provides a derivative of one for positive values and zero for negative values. Whenever the value is negative, there is no learning during back-propagation. To overcome this, Leaky ReLU was implemented while training the model. Leaky ReLU includes a small fraction as a derivative even for negative values and also helps to speed up the training process. Table II shows that the CFC-GAN model with the LeakyReLU activation function performs better than the CFC-GAN+ReLU model trained with the ReLU activation function.
• The SSIM loss ensures that the generated images are structurally similar to the respective originally captured images. So the forecasted crack images are highly similar and more realistic when using the CFC-GAN model, and it also generates images matching the appropriate ground-truth images available in the CFC dataset.
Table III shows the results of the generated forecasted crack images for the existing benchmark datasets: Crack-500, CrackTree-200, and CFD. In this case, we have only a single instance of crack images and do not have ground truth images for the forecasted cracks. Hence, for evaluating the models, we compared the generated forecasted images with the given input images to ensure the consistency of the proposed model.
The results obtained from Table III show that the CFC-GAN model performs better than the Conditional GAN+Adversarial and Conditional GAN+Adversarial+L1 models over most of the benchmark datasets, as we have used a better loss function for CFC-GAN compared
Fig. 7. A few sample forecasted images for the proposed dataset using the proposed CFC-GAN model.
to the Conditional GAN+Adversarial and Conditional GAN+Adversarial+L1 models.

B. Analysis of Generated Forecasted Crack Images

This subsection discusses the qualitative results obtained from the generated forecasted crack images using the proposed datasets. Figure 7 shows a few generated samples of forecasted crack images for the proposed dataset. It shows four sample images from the CFC dataset for the forecast images from the 1st month to the 2nd month, 1st month to 3rd month, 1st month to 6th month, 2nd month to 3rd month, 2nd month to 6th month, and 3rd month to 6th month. Here we can see the growth of cracks from the 1st month to the succeeding months. The size
Fig. 8. Comparison of forecasted crack images using proposed CFC-GAN model along with the conditional parameters.
of the crack increases and the quality of the road decreases as the number of months between samplings increases. The images are generated using the CFC-GAN model under various conditions along with the adversarial loss, L1 regularisation loss, and SSIM loss.
Figure 7 shows the forecast images generated for various sizes and types of crack images (linear and alligator). It can be seen from the forecasted images and SSIM scores that the model is able to forecast all types of cracks with higher accuracy. We can also see that the forecasted images are similar to the ground-truth images of the corresponding months.

C. Effect of Conditional Information for Forecasted Crack Images Using CFC-GAN

CFC-GAN consists of crack images along with conditional factors. Conditional information like the number of vehicles traveling on the road, temperature, and precipitation plays a major role in the quality of the road surface. We included these conditions while generating the forecasted crack images in our proposed model. Figure 8 shows the results of the forecasted crack images generated by the Conditional GAN+Adversarial, Conditional
Fig. 9. Forecasted images for existing benchmark datasets using proposed CFC-GAN model.
TABLE III
GENERATION OF FORECASTED CRACK IMAGES OF SIZE 256 × 256 USING CFC-GAN MODEL FOR THE EXISTING BENCHMARK DATASETS
GAN+Adversarial+L1, CFC-GAN+ReLU, and CFC-GAN models for the proposed CFC dataset. It also shows the original ground truth images as the given input images. It can be visualized from all the images that the generated images by
Fig. 10. Comparison of forecasted crack images by varying conditional information for the existing dataset using proposed CFC-GAN model.
the CFC-GAN model are better forecasted when compared to the images generated by the Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, and CFC-GAN+ReLU models. The Conditional GAN+Adversarial model generates the forecasted images with conditional factors and adversarial loss, whereas the Conditional GAN+Adversarial+L1 model uses the conditional factors, adversarial loss, and L1 loss. The CFC-GAN+ReLU model uses conditional factors with adversarial, L1 regularisation, and SSIM losses. We can see in Figure 8 that the CFC-GAN+ReLU model generates better forecasted crack images compared to the Conditional GAN+Adversarial and Conditional GAN+Adversarial+L1 models. The Conditional GAN+Adversarial+L1 model generates better-forecasted crack images compared to the Conditional GAN+Adversarial model.
The forecasted crack images produced by the CFC-GAN model, which associates the adversarial loss, L1 regularisation loss, and SSIM loss with the Leaky ReLU activation function, are better when compared to the Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, and CFC-GAN+ReLU models. From Figure 8, considering the first sample image, we can see that the images for the forecast from the 1st month to the 6th month have a large number of vehicles traveled and high precipitation. Hence we can see that the quality of the road crack deteriorates to a larger extent. Even in this case, the CFC-GAN can accurately generate the appropriate forecasted crack images, very similar to the ground truth images, when compared to the Conditional GAN+Adversarial, Conditional GAN+Adversarial+L1, and CFC-GAN+ReLU models. From the third sample image, we can see that the forecasted image from the 1st month to the 2nd month has a lower number of vehicles traveled, temperature, and precipitation. Here we can also see that the CFC-GAN forecasted more accurate results than the other three models. Thus, the proposed CFC-GAN model can capture all the conditions and generate more accurate and appropriate forecasted crack images.
D. Result of Forecasted Crack Images for Other Benchmark Datasets

Figure 9 shows the forecasted crack images generated for the Crack Forest Dataset (CFD), the Crack Tree-200 dataset, and the Crack-500 dataset. We generated the forecasted crack images from the 1st month to the 2nd, 3rd, and 6th months with various conditional factors like the number of vehicles traveled, road aging, temperature, and precipitation, similar to the CFC dataset. It can be seen that the crack images forecasted for the 6th month have a larger increase in crack size compared to the 2nd month forecasted crack images. These existing datasets contain more linear cracks, for which the proposed model can forecast the growth of crack regions more accurately. Thus, the CFC-GAN model can generate the forecasted crack images not only for the proposed CFC dataset but also for the existing datasets.
Figure 10 shows the effect of the number of vehicles traveled, temperature, and precipitation on the forecasted crack images over the CFC dataset. It can be seen that as the number of vehicles traveled, temperature, or precipitation increases, the forecasted crack size also increases. Thus, the CFC-GAN model can work efficiently while varying the conditional factors like temperature, precipitation, and the number of vehicles traveled for generating the forecasted crack images.

VI. CONCLUSION

We have proposed a progressive Conditional Forecasted Crack image generation-Generative Adversarial Network (CFC-GAN) in the present work. We have developed a new Crack ForeCast (CFC) dataset. In doing so, CFC-GAN takes as input the first-month captured images with conditional factors in a progressive manner to mimic appropriate forecasted crack images. In addition to the adversarial loss, we have associated the L1 regularization and SSIM loss functions to generate appropriate crack images, similar to the originally captured crack images. CFC-GAN is trained in an end-to-end manner to optimize the error. The CFC-GAN model is also tested

[5] H. Oh, N. Garrick, and L. Achenie, "Segmentation algorithm using iterative clipping for processing noisy pavement images," in Proc. 2nd Int. Conf. Imag. Tech.: Techn. Appl. Civil Eng., Davos, Switzerland, 1998, pp. 138–147.
[6] M. Petrou, J. Kittler, and K. Y. Song, "Automatic surface crack detection on textured materials," J. Mater. Process. Tech., vol. 56, nos. 1–4, pp. 158–167, Jan. 1996, doi: 10.1016/0924-0136(95)01831-X.
[7] Y. Huang and B. Xu, "Automatic inspection of pavement cracking distress," J. Electron. Imag., vol. 15, no. 1, Jan. 2006, Art. no. 013017, doi: 10.1117/1.2177650.
[8] S. Cafiso, A. D. Graziano, and S. Battiato, "Evaluation of pavement surface distress using digital image collection and analysis," in Proc. 7th Int. Congr. Adv. Civil Eng., Istanbul, Turkey: Yildiz Technical Univ., 2006, pp. 1–10.
[9] M. S. Kaseko and S. G. Ritchie, "A neural network-based methodology for pavement crack detection and classification," Transp. Res. C, Emerg. Technol., vol. 1, no. 1, pp. 275–291, 1993, doi: 10.1016/0968-090X(93)90002-W.
[10] Q. Li and X. Liu, "Novel approach to pavement image segmentation based on neighboring difference histogram method," in Proc. Congr. Image Signal Process., Sanya, China, 2008, pp. 792–796.
[11] K. Chen, A. Yadav, A. Khan, Y. Meng, and K. Zhu, "Improved crack detection and recognition based on convolutional neural network," Model. Simul. Eng., vol. 2019, Oct. 2019, Art. no. 8796743, doi: 10.1155/2019/8796743.
[12] Z. Liu, Y. Cao, Y. Wang, and W. Wang, "Computer vision-based concrete crack detection using U-Net fully convolutional networks," Autom. Construct., vol. 104, pp. 129–139, Dec. 2019, doi: 10.1016/[Link].2019.04.005.
[13] L. Zhang, F. Yang, Y. Daniel Zhang, and Y. J. Zhu, "Road crack detection using deep convolutional neural network," in Proc. IEEE Int. Conf. Image Process. (ICIP), Phoenix, AZ, USA, Sep. 2016, pp. 3708–3712.
[14] W. Song, G. Jia, H. Zhu, D. Jia, and L. Gao, "Automated pavement crack damage detection using deep multiscale convolutional features," J. Adv. Trans., vol. 2020, Jan. 2020, Art. no. 6412562, doi: 10.1155/2020/6412562.
[15] H. T. Nguyen, G. H. Yu, S. Y. Na, J. Y. Kim, and S. M. Seo, "Pavement crack detection and segmentation based on deep neural network," The J. Korean Inst. Inf. Technol., vol. 17, no. 9, pp. 99–112, Sep. 2019.
[16] C. V. Dung and L. D. Anh, "Autonomous concrete crack detection using deep fully convolutional neural network," Autom. Construct., vol. 99, pp. 52–58, Mar. 2019, doi: 10.1016/[Link].2018.11.028.
[17] H. Li, D. Song, Y. Liu, and B. Li, "Automatic pavement crack detection by multi-scale image fusion," IEEE Trans. Intell. Transp. Syst., vol. 20, no. 6, pp. 2025–2036, Jun. 2019, doi: 10.1109/TITS.2018.2856928.
[18] X. Miao, J. Wang, Z. Wang, Q. Sui, Y. Gao, and P. Jiang, "Automatic recognition of highway tunnel defects based on an improved U-Net model," IEEE Sensors J., vol. 19, no. 23, pp. 11413–11423, Dec. 2019, doi: 10.1109/JSEN.2019.2934897.
[19] K. Zhang, Y. Zhang, and H. D. Cheng, "CrackGAN: Pavement crack
detection using partially accurate ground truths based on generative
with the forecasted crack images for three benchmark datasets adversarial learning,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 2,
to ensure the consistency of the model. With this proposed pp. 1306–1319, Feb. 2021, doi: 10.1109/TITS.2020.2990703.
model, the plan for the next road lay down and cost estimation [20] L. Pei, Z. Sun, L. Xiao, W. Li, J. Sun, and H. Zhang, “Virtual generation
can be done to improve road safety. Limitation of our proposal of pavement crack images based on improved deep convolutional gener-
ative adversarial network,” Eng. Appl. Artif. Intell., vol. 104, Sep. 2021,
is to use the structured (end to end paired) data for crack Art. no. 104376, doi: 10.1016/[Link].2021.104376.
images while training the network models. However, once the [21] A. Koochali, P. Schichtel, A. Dengel, and S. Ahmed, “Probabilistic
model is trained, forecasted crack images will be predicted forecasting of sensory data with generative adversarial networks—
ForGAN,” IEEE Access, vol. 7, pp. 63868–63880, 2019, doi: 10.1109/
with better performance quickly. ACCESS.2019.2915544.
[22] Z. Zhang, Y. Song, and H. Qi, “Age progression/regression by
R EFERENCES conditional adversarial autoencoder,” in Proc. IEEE Conf. Com-
put. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, Jul. 2017,
[1] J. Canny, “A computational approach to edge detection,” IEEE Trans. pp. 4352–4360.
Pattern Anal. Mach. Intell., vol. PAMI-8, no. 6, pp. 679–698, Nov. 1986, [23] X. Tang, Z. Wang, W. Luo, and S. Gao, “Face aging with
doi: 10.1109/TPAMI.1986.4767851. identity-preserved conditional generative adversarial networks,” in Proc.
[2] S. Kabir, “Imaging-based detection of AAR induced map-crack dam- IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Salt Lake City, UT,
age in concrete structure,” NDT E Int., vol. 43, no. 6, pp. 461–469, USA, Jun. 2018, pp. 7939–7947.
Sep. 2010, doi: 10.1016/[Link].2010.04.007. [24] P. Li, Y. Hu, Q. Li, R. He, and Z. Sun, “Global and local consistent
[3] H. Oliveira and P. L. Correia, “Road surface crack detection: Improved age generative adversarial networks,” in Proc. 24th Int. Conf. Pattern
segmentation with pixel-based refinement,” in Proc. 25th Euro. Sign. Recognit. (ICPR), Beijing, China, 2018, pp. 1073–1078.
Proc. Conf. (EUSIPCO), San Jose, CA, Aug. 2017, pp. 2026–2030. [25] Y. Liu, Q. Li, Z. Sun, and T. Tan, “A3GAN: An attribute-aware
[4] W. Wang et al., “Pavement crack image acquisition methods and crack attentive generative adversarial network for face aging,” IEEE Trans.
extraction algorithms: A review,” J. Traffic Transp. Eng., vol. 6, no. 6, Inf. Forensics Security, vol. 16, pp. 2776–2790, Mar. 2021, doi:
pp. 535–556, Dec. 2019, doi: 10.1016/[Link].2019.10.001. 10.1109/TIFS.2021.3065499.
Authorized licensed use limited to: Indian Institute of Technology - BHUBANESWAR. Downloaded on November 24,2024 at [Link] UTC from IEEE Xplore. Restrictions apply.
SEKAR AND PERUMAL: CFC-GAN: FORECASTING ROAD SURFACE CRACK USING FORECASTED CRACK GAN 21391
[26] H. Yang, D. Huang, Y. Wang, and A. K. Jain, “Learning face age pro- [47] Q. T. M. Pham, J. Yang, and J. Shin, “Semi-supervised FaceGAN for
gression: A pyramid architecture of GANs,” in Proc. IEEE/CVF Conf. face-age progression and regression with synthesized paired images,”
Comput. Vis. Pattern Recognit., Salt Lake City, UT, USA, Jun. 2018, Electronics, vol. 9, no. 603, pp. 1–16, Apr. 2020, doi: 10.3390/
pp. 31–39. electronics9040603.
[27] X. Guo, R. Nie, J. Cao, D. Zhou, L. Mei, and K. He, “FuseGAN: [48] M. Mirza and S. Osindero, “Conditional generative adversarial nets,”
Learning to fuse multi-focus image via conditional generative adversarial 2014, arXiv:1411.1784.
network,” IEEE Trans. Multimedia, vol. 21, no. 8, pp. 1982–1996, [49] X. Mao, Q. Li, H. Xie, R. Y. K. Lau, Z. Wang, and S. P. Smolley,
Aug. 2019, doi: 10.1109/TMM.2019.2895292. “Least squares generative adversarial networks,” in Proc. IEEE Int. Conf.
[28] W. R. Tan, C. S. Chan, H. E. Aguirre, and K. Tanaka, “Improved Comput. Vis., Venice, Italy, Oct. 2017, pp. 2813–2821.
ArtGAN for conditional synthesis of natural image and artwork,” IEEE [50] Y. Shi, L. Cui, Z. Qi, F. Meng, and Z. Chen, “Automatic road
Trans. Image Process., vol. 28, no. 1, pp. 394–409, Jan. 2019, doi: crack detection using random structured forests,” IEEE Trans. Intell.
10.1109/TIP.2018.2866698. Transp. Syst., vol. 17, no. 12, pp. 3434–3445, Dec. 2016, doi: 10.1109/
[29] R. Li, J. Pan, Z. Li, and J. Tang, “Single image dehazing via conditional TITS.2016.2552248.
generative adversarial network,” in Proc. IEEE/CVF Conf. Comput. Vis. [51] Q. Zou, Y. Cao, Q. Li, Q. Mao, and S. Wang, “CrackTree: Automatic
Pattern Recognit., Salt Lake City, UT, USA, Jun. 2018, pp. 8202–8211. crack detection from pavement images,” Pattern Recognit. Lett., vol. 33,
[30] J. Park, D. K. Han, and H. Ko, “Fusion of heterogeneous adversarial no. 3, pp. 227–238, Feb. 2012, doi: 10.1016/[Link].2011.11.004.
networks for single image dehazing,” IEEE Trans. Image Process., [52] F. Yang, L. Zhang, S. Yu, D. V. Prokhorov, X. Mei, and H. Ling, “Feature
vol. 29, pp. 4721–4732, 2020, doi: 10.1109/TIP.2020.2975986. pyramid and hierarchical boosting network for pavement crack detec-
[31] I. Goodfellow et al., “Generative adversarial networks,” in Proc. Adv. tion,” IEEE Trans. Intell. Transp. Syst., vol. 21, no. 4, pp. 1525–1535,
Neural Inf. Process. Syst., Montreal, QC, Canada: Curran Associates, Apr. 2020, doi: 10.1109/TITS.2019.2910595.
Dec. 2014, pp. 4721–4732. [53] R. Li, J. Pan, Z. Li, and J. Tang, “Single image dehazing via conditional
[32] X. Mao, Q. Li, and H. Xie, “AlignGAN: Learning to align cross- generative adversarial network,” in Proc. IEEE/CVF Conf. Comput. Vis.
domain images with conditional generative adversarial networks,” in Pattern Recognit., Salt Lake City, UT, USA, Jun. 2018, pp. 8202–8211.
Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Honolulu, HI, [54] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter,
USA, Jul. 2017, pp. 1–6. “GANs trained by a two time-scale update rule converge to a local nash
[33] S. Liu et al., “Face aging with contextual generative adversarial nets,” equilibrium,” in Proc. 31st Int. Conf. Neural Inf. Process. Syst., Red
in Proc. 25th ACM Int. Conf. Multimedia Assoc. Comput. Machinery, Hook, NY, USA, 2017, pp. 6629–6640.
New York, NY, USA, Oct. 2017, pp. 82–90. [55] T. Salimans et al., “Improved techniques for training GANs,” in Proc.
[34] G. Antipov, M. Baccouche, and J. Dugelay, “Face aging with condi- NIPS, 2016, pp. 2234–2242.
tional generative adversarial networks,” in Proc. IEEE Int. Conf. Image [56] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter,
Process. (ICIP), Beijing, China, Sep. 2017, pp. 2089–2093. “GANs trained by a two time-scale update rule converge to a local nash
[35] Z. Huang, S. Chen, J. Zhang, and H. Shan, “PFA-GAN: Progres- equilibrium,” in Proc. 31st Int. Conf. Neural Inf. Process. Syst., Red
Hook, NY, USA: Curran Associates, 2017, pp. 6629–6640.
sive face aging with generative adversarial network,” IEEE Trans.
Inf. Forensics Security, vol. 16, pp. 2031–2045, Dec. 2021, doi:
10.1109/TIFS.2020.3047753.
[36] H. Fang, W. Deng, Y. Zhong, and J. Hu, “Triple-GAN: Progressive
face aging with triple translation loss,” in Proc. IEEE/CVF Conf.
Comput. Vis. Pattern Recognit. Workshops, Seattle, WA, USA, Jun. 2020,
pp. 3500–3509.
[37] H. Zhang, V. Sindagi, and V. M. Patel, “Image de-raining using
a conditional generative adversarial network,” IEEE Trans. Circuits
Syst. Video Technol., vol. 30, no. 11, pp. 3943–3956, Nov. 2020, doi:
10.1109/TCSVT.2019.2920407.
Aravindkumar Sekar received the master’s degree
[38] M. Zhai, L. Chen, F. Tung, J. He, M. Nawhal, and G. Mori, “Life- in computer science and engineering from Anna
long GAN: Continual learning for conditional image generation,” in University, Chennai, in 2012. He is currently doing
Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Seoul South Korea, his research with the Department of Computer
Oct. 2019, pp. 2759–2768. Technology, Anna University, and also working as
[39] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The an Assistant Professor with the Information Tech-
unreasonable effectiveness of deep features as a perceptual metric,” in nology Department, Rajalakshmi Engineering Col-
Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Salt Lake City, lege, Chennai. His research interests include image
UT, USA, Jun. 2018, pp. 586–595. processing, machine learning, and deep learning.
[40] J. Lin, Y. Xia, T. Qin, Z. Chen, and T. Liu, “Conditional image-to-image
translation,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit.,
Salt Lake City, UT, USA, Jun. 2018, pp. 5524–5532.
[41] D. Bhattacharjee, S. Kim, G. Vizier, and M. Salzmann, “DUNIT:
Detection-based unsupervised image-to-image translation,” in Proc.
IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Seattle, WA,
USA, Jun. 2020, pp. 4786–4795.
[42] A. A. Efros and T. K. Leung, “Texture synthesis by non-parametric
sampling,” in Proc. 7th IEEE Int. Conf. Comput. Vis., Kerkyra, Greece,
Sep. 1999, pp. 1033–1038.
[43] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks
for semantic segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Varalakshmi Perumal received the B.E.
Recognit. (CVPR), Boston, MA, USA, Jun. 2015, pp. 3431–3440. degree (CSE) from the Government College of
[44] P. Isola, J. Zhu, T. Zhou, and A. A. Efros, “Image-to-image trans- Technology in 1991, the [Link]. degree (CSE)
lation with conditional adversarial networks,” in Proc. IEEE Conf. from Pondicherry University in 1999, and the Ph.D.
Comput. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, Jul. 2017, degree from Anna University in 2009. She has
pp. 5967–5976. around 25 years of teaching experience and currently
[45] P. Sangkloy, J. Lu, C. Fang, F. Yu, and J. Hays, “Scribbler: Controlling working as a Professor with the Department of
deep image synthesis with sketch and color,” in Proc. IEEE Conf. Computer Technology, Anna University. She is
Comput. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, Jul. 2017, currently serving as the Director for the AU-KBC
pp. 6836–6845. Research Centre for Emerging Technologies, MIT,
[46] L. Karacan, Z. Akata, A. Erdem, and E. Erdem, “Learning to generate Anna University. She has 95 reputed journals and
images of outdoor scenes from attributes and semantic layouts,” in Proc. 115 international conference publications to her credit. Her research interests
IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, include data analytics, machine learning, deep learning, the Internet of
USA, Dec. 2016, pp. 1–15. Things, blockchain, image processing, cloud computing, and security.
Authorized licensed use limited to: Indian Institute of Technology - BHUBANESWAR. Downloaded on November 24,2024 at [Link] UTC from IEEE Xplore. Restrictions apply.
Integrating conditional information in the CFC-GAN model significantly enhances the realism and accuracy of the forecasted crack images by factoring in actual conditions such as vehicle load, climate, and road aging. This allows the model to produce images that are not only visually similar to the original images but also consistent with the real-world parameters affecting road condition.
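As a sketch of this idea (not the paper's exact architecture, and `add_conditions` is a hypothetical helper name), scalar conditional factors can be injected by tiling each value into a constant feature map and stacking it with the image channels, a common conditioning pattern in image-to-image GANs:

```python
import numpy as np

def add_conditions(image, conditions):
    """Tile each scalar condition (e.g., normalized temperature,
    precipitation, vehicles traveled) into a constant map and stack
    it with the image channels to form the generator input."""
    c, h, w = image.shape
    cond_maps = [np.full((1, h, w), v, dtype=image.dtype) for v in conditions]
    return np.concatenate([image] + cond_maps, axis=0)

# A 3-channel 64x64 crack image plus three normalized conditions
# yields a 6-channel conditioned input.
x = np.zeros((3, 64, 64), dtype=np.float32)
y = add_conditions(x, [0.7, 0.2, 0.5])  # temperature, precipitation, traffic
```

The discriminator can be conditioned the same way, so that both networks see which real-world factors a given crack image corresponds to.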
The CFC-GAN model tackles the lack of ground-truth future images by using SSIM scores and comparisons against the input images to evaluate consistency. By ensuring high structural similarity to the original images and integrating real-world conditions, the model can generate realistic crack predictions even without direct ground truths.
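A minimal sketch of such a consistency check follows; it uses a simplified whole-image SSIM rather than the usual Gaussian-windowed formulation, and the function name `global_ssim` is illustrative only:

```python
import numpy as np

def global_ssim(x, y, c1=0.01**2, c2=0.03**2):
    """Simplified SSIM computed over the whole image instead of local
    windows; inputs are expected to lie in [0, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2*mx*my + c1) * (2*cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
noisy = np.clip(img + 0.3 * rng.random((32, 32)), 0, 1)
score = global_ssim(img, noisy)  # < 1 for a structurally degraded image
```

An identical pair scores exactly 1, so the metric gives a bounded consistency signal even when no future ground-truth image exists.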
The CFC-GAN model outperforms the Conditional GAN+Adversarial and Conditional GAN+Adversarial+L1 models on benchmark datasets such as Crack-500 and CrackTree-200. This is attributed to its comprehensive loss function, which combines adversarial, L1, and SSIM losses; the generated images are more accurate and closer to the crack patterns observed in these datasets.
The application of CFC-GAN in road maintenance can improve predictive strategies by accurately forecasting crack development under variable conditions such as traffic and climate. This enables timely interventions, optimizing maintenance schedules and resource allocation. By helping to maintain road quality and address early-stage crack progression, the model enhances efficiency and safety in transportation infrastructure management.
The CFC-GAN model achieves superior performance by integrating multiple loss components (adversarial, L1, and SSIM) that enhance image quality and structural similarity. In addition, its use of conditional information relevant to real-world scenarios, such as traffic and weather conditions, allows more precise forecasting than traditional conditional GAN models, which may not incorporate such factors.
The CFC-GAN model uses paired translation-based mapping of conditional information with an L1 regularization loss and an SSIM loss in addition to the adversarial loss, resulting in higher performance. This combination ensures structural similarity of the generated images to the original images and provides more realistic and accurate forecasts than models such as Conditional GAN+Adversarial and Conditional GAN+Adversarial+L1.
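Assuming the standard forms of each term (the paper's exact loss weights are not restated here; `lambda_l1` and `lambda_ssim` are hypothetical hyperparameters), the combined generator objective can be sketched as:

```python
import numpy as np

def ssim(x, y, c1=0.01**2, c2=0.03**2):
    # Simplified whole-image SSIM, used only for this sketch.
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return ((2*mx*my + c1) * (2*cov + c2)) / ((mx**2 + my**2 + c1) * (x.var() + y.var() + c2))

def generator_loss(d_fake, fake, real, lambda_l1=100.0, lambda_ssim=10.0):
    """Adversarial + L1 + SSIM objective, in the style of pix2pix-like
    paired translation models extended with a structural term."""
    adv = -np.log(d_fake + 1e-8).mean()   # fool the discriminator
    l1 = np.abs(fake - real).mean()       # pixel-level fidelity
    structural = 1.0 - ssim(fake, real)   # structural similarity penalty
    return adv + lambda_l1 * l1 + lambda_ssim * structural

real = np.full((8, 8), 0.5)
```

A perfect reconstruction that fully fools the discriminator drives the loss toward zero, while pixel or structural mismatches are penalized by the L1 and SSIM terms respectively.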
The SSIM loss in CFC-GAN ensures that the generated forecasted crack images are structurally similar to the originally captured images, which results in more realistic and consistent crack forecasts.
Leaky ReLU is preferred because it retains a small non-zero derivative for negative inputs, allowing some learning during back-propagation even when activations are negative. This contrasts with ReLU, whose derivative is zero for negative values, which can halt learning for those units. Leaky ReLU also speeds up the training process, enhancing model performance.
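The gradient behavior described above can be checked numerically; the slope 0.01 for negative inputs is a common default and an assumption here, not a value taken from the paper:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    # Non-zero derivative for negative inputs keeps gradients flowing.
    return np.where(x > 0, 1.0, alpha)

def relu_grad(x):
    # Plain ReLU: zero derivative for negative inputs ("dying ReLU").
    return np.where(x > 0, 1.0, 0.0)

x = np.array([-2.0, -0.5, 0.5, 2.0])
# ReLU passes no gradient for the two negative entries,
# while Leaky ReLU still passes a small one (alpha).
```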
Conditional factors such as the number of vehicles traveled, temperature, and precipitation significantly affect the forecasted crack images. These factors are integrated into the CFC-GAN model so that the generated images accurately reflect how such conditions influence road surface degradation; for instance, increased traffic and high precipitation levels contribute to the faster deterioration illustrated in the model's forecasts.
Adversarial loss in CFC-GAN promotes the generation of high-quality images by setting up a competition between the generator and discriminator networks. The generator aims to produce realistic crack images that the discriminator attempts to distinguish from real images, iteratively encouraging the generator to improve the realism of its outputs.
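A minimal numerical sketch of this two-player objective follows, using the binary cross-entropy formulation with a non-saturating generator term; whether the paper uses this exact variant or a least-squares form is not restated here:

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-8):
    """Discriminator: push real scores toward 1 and fake scores toward 0."""
    return -(np.log(d_real + eps) + np.log(1.0 - d_fake + eps)).mean()

def generator_adv_loss(d_fake, eps=1e-8):
    """Generator (non-saturating): push fake scores toward 1."""
    return -np.log(d_fake + eps).mean()

# A discriminator that separates real (0.9) from fake (0.1) pays little...
good_d = discriminator_loss(np.array([0.9]), np.array([0.1]))
# ...while one fooled into scoring fakes at 0.9 pays much more.
fooled_d = discriminator_loss(np.array([0.9]), np.array([0.9]))
```

Because the generator's loss falls exactly when the discriminator's rises, minimizing both in alternation drives the generated crack images toward the real-image distribution.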