ARTICLE INFO

Keywords:
Box–Jenkins models
Empirical mode decomposition
Multi-layer perceptron
Statistical learning theory
Stochastic and deterministic components

ABSTRACT

Livestock production efficiency is essential to improve the world food chain in terms of making meat available to more people and reducing producer costs, while supporting environmentally sustainable solutions. In this context, predicting cattle weights supports the decision-making process to optimize the beef cattle supply chain and improve feed efficiency. Current body weight analyses are typically performed using predetermined models based on a set of differential equations (e.g. the Davis Growth Model); however, they are not easily adaptable to accept new influencing variables made available in the current technological scenario. This study proposes two fully adaptable approaches to build up models and forecast cattle body weights while considering related variables (e.g. temperature, atmospheric pressure, global radiation, wind speed, air humidity and dry matter intake (DMI)). Our approaches explore two complementary scientific branches: (i) Stochastic Processes, where we employ the Autoregressive Integrated Moving Average (ARIMA) and Seasonal Autoregressive Integrated Moving Average (SARIMA) models only on the variable weight; and (ii) Deterministic Dynamical Systems, where we reconstruct multidimensional spaces representing the relationships among daily body weights under the influence of climatic, management and diet variables. Takens' embedding theorem was used to represent phase spaces, which work as input for a weight regression model based on a Multi-Layer Perceptron (MLP) Artificial Neural Network (ANN). A dataset comprising 71 Nelore (Bos indicus) cattle was used in this study, and leave-one-out was adopted as the cross-validation strategy. Models were evaluated using the Mean-Distance from the Diagonal Line (MDDL) technique. MDDL results for 14, 21 and 28 days of prediction horizon were, respectively: 0.2216, 0.3947 and 0.2960 for the MLP (with 5 hidden-layer neurons); 0.8763, 0.9494 and 0.8299 for ARIMA; and 0.5912, 0.5614 and 0.4884 for SARIMA. This study demonstrates that, by integrating different data sources in a deterministic model, one can predict meat production, surpassing the ARIMA and SARIMA models. Further studies on decomposition analyses to support the individual modeling of animals based on stochastic and deterministic influences are warranted.
1. Introduction

A thirty percent increase in the human population is expected by 2050; consequently, agriculture and livestock must be significantly expanded to meet mankind's nutritional needs (FAO, 2009). In addition to this increase in production, there is the need for providing stricter health guarantees in order to prevent the spread and emergence of new diseases that could impact our society. These requirements became even more evident after the Covid-19 pandemic, which started in a market in Wuhan, China, in December 2019 (Andersen et al., 2020; Cui et al., 2019), due to the lack of sanitary criteria related to the commerce of bushmeat (e.g. pangolin and bat). The viral potential of those wild animals worries researchers about possible and accidental new health crises (Cui et al., 2019; Andersen et al., 2020).

In addition to health aspects, modern livestock production also involves individual historical monitoring along
the supply chain, the mechanization of the industry and production at scale, together with the need to reduce production costs and make food more accessible to everyone, especially to economically vulnerable populations (FAO, 2020). In this context, it is expected to democratize the consumption of monitored animals, which would certainly reduce the chances of a new world-scale crisis arising from infectious diseases and the transmission of zoonotic viruses from animal hosts to humans, as in the case of Ebola, HIV and several species of Coronavirus, including SARS-CoV-2 (Karesh and Noble, 2009; Cui et al., 2019; Kurpiers et al., 2016; Friant et al., 2020; Andersen et al., 2020).

The access to monitored animals should also assist in the preservation of wildlife species and the maintenance of biodiversity, as wild animals reproduce more slowly and many places have already lost their natural predators (Ripple et al., 2016). Thus, the replacement of bushmeat by protein sources resulting from highly reproductive herds, with the use of technology to monitor and ensure food security standards (e.g. health protocols, nutritional supervision, screening, and the monitoring of animal health through ultrasound and circadian-rhythm analysis (FAPESP, 2020)), becomes essential for life in our modern society.

These positive aspects of modern livestock still depend on a list of factors, involving improvements in the animal and in the pasture management system, in the control of morbidity and mortality involved in welfare studies, in the collection of data for monitoring physical and chemical body aspects, as well as in the monitoring of nutritional, economic and environmental variables. All of these factors are favorable to the detailed management of production and enable greater control over the administrative routine (Tullo et al., 2019b). By unifying all this information, in what is referred to as precision livestock, it is possible to optimize the supply chain and offer the expected guarantees to the market.

In addition to data collection, precision livestock involves the construction of mathematical, statistical and/or Machine Learning models in order to support decision making by producers, aiming at finding behavioral patterns, minimizing errors and reducing losses (Tullo et al., 2019b). One of the variables with strong modeling potential is the individual weight of animals (Tullo et al., 2019a), which was previously measured only at the beginning and at the end of the feedlot process, failing to follow, in detail, the intermediate steps between the fattening period and slaughter.

Currently, the dynamic daily monitoring of individual animal weights with the aid of technology, e.g. with more than one weighing per day without causing damage to animal performance, natural behavior or stress, enriches the tracking information available in databases. Monitoring the weight, as well as the growth of animals, is highly correlated to their health and the meat quality (Albertini et al., 2016a). In this context, the collection of individual data is a valuable source of information for decision making at the level of a single animal (e.g. biometrics, health, meat quality, welfare and slaughter) and not in terms of the herd or lot average.

The availability of information on livestock production systems has substantially increased over the years, allowing the development of more detailed forecasting models of individual growth. Analyses within the Cooperative Research Center (CRC) phenotypic prediction program (McPhee et al., 2007; McPhee et al., 2014) are typically performed using predetermined models in a System of Differential Equations (SDE), such as the Davis Growth Model (DGM), proposed by Oltjen et al. (1986) and extended by Oltjen et al. (2000) to estimate the body composition of British Bos taurus cattle. DGM reparametrizations were carried out for other productive systems, such as in Brazil by Sainz et al. (2006) and Biase (2017) for Nelore and crossbred cattle. Likewise, there are other proposals for models to represent animal growth, such as in France et al. (1987), Williams and Bennett (1995), Williams and Jenkins (2003), Hoch and Agabriel (2004) and Tedeschi et al. (2004). A study by Mayer et al. (2013) integrated stochasticity into the objective function to avoid Monte Carlo computation in the optimization of beef feedlots. The simulated results of Mayer et al. (2013) indicated that increases in profitability can be obtained by altering days on feed and by better allocating animals to pens.

Due to the differences in growth patterns among cattle breeds, as well as to the great diversity of animal production systems in the world, in addition to nutritional information, body biometrics, genetics and meteorology, those weight models based on differential equations are not easily adaptable to accept new variables, thus limiting their practical application. Among other disadvantages, there is the need for complex variables to be collected, as in the case of the DGM, which requires information on the protein, DNA and body fat mass of individuals. This information is obtained only after the animal is slaughtered, thus impairing dynamic monitoring both in feedlot (intensive) and pasture (extensive) systems.

In order to aggregate these variables and provide better intermediate monitoring of the final fattening stage of livestock in intensive and extensive systems, some researchers proposed the use of ultrasound equipment to assess animal body chemistry (fat thickness, mass and viscera area – muscles). However, there are limitations in the application of such technology due to the inherent costs of the image collection process (e.g. the time of the specialist who performs the evaluation) as well as the eventual animal stress.
Studies such as McPhee et al. (2014) and FAPESP (2020a) aim at replacing the ultrasound with a system of cameras and intelligent algorithms to assess the animals' body composition during the fattening phase, avoiding any subjective judgment of the visual technical inspection. In nutritional terms, other studies have sought to use cameras to monitor the ingestive behavior of animals, the current filling of the trough and its relationship with climatic data, in order to adjust the diet supply and improve livestock management (FAPESP, 2019).

The emerging fields of artificial intelligence and ML provide key tools to perform data analysis and provide new large-scale forecasting strategies to monitor animal dynamics (Aiken, 2020). By projecting future trends, we can optimize resource allocation, such as changing or choosing the animals' diet, and predict the emission of Greenhouse Gases (GHG), the water intake and the manure per kg of produced meat, bringing environmental, economic and genetic-selection benefits with the identification of animals at the optimal economical point, as well as minimizing pollutants per kg of meat, thus making the entire supply chain more efficient and sustainable (Albertini et al., 2016a; Biase, 2016b; Albertini et al., 2016b; Biase et al., 2016a).

In an attempt to overcome the disadvantages of growth models based on SDE, due to their complex parametrization and difficulty in considering new variables, this paper addresses this problem by comparing a weight-based approach, using stochastic process tools, against a fully adaptive one, based on dynamical systems, to build up bovine growth models. While the first approach takes advantage of the Autoregressive Integrated Moving Average (ARIMA) and Seasonal Autoregressive Integrated Moving Average (SARIMA) models (Box and Jenkins, 1976; Box et al., 2015) to represent the individual animal weights over time, the latter employs Takens' embedding theorem (Takens, 1981), along with a Multi-Layer Perceptron Artificial Neural Network (Hastie et al., 2009; Mello and Ponti, 2018), to learn the influences of relevant variables (e.g. temperature, atmospheric pressure, global radiation, wind speed, air humidity, and Dry Matter Intake (DMI)) on weight forecasting.

Experimental results confirm that both approaches produce promising forecasting results. While the stochastic-based approach helps fitting a weight model, the fully adaptive one has the advantage of building up bovine growth models from scratch by aggregating phenotypic, environmental and nutritional information in a multivariate way, so farmers can represent their own local scenarios in an attempt to improve the overall prediction performance.

This paper is organized as follows: Section 2 presents the related work; Section 3 describes data streams and the proposed approaches to model the prediction of cattle weights; Section 4 details the results based on the experimental dataset; Section 5 brings our discussion on both approaches; Section 6 draws concluding remarks; Section 7 presents our acknowledgments; and, finally, Appendices A, B and C work as supplementary material to support the theoretical foundation.

2. Related work

Meatpacking companies have been considering a list of animal characteristics to decide on paying bonuses or penalizing the final prices paid to producers. Among those features are the final weight, sex, degree of finish (fat growth rate), lean mass, bone, carcass yield, breed and its crossings (Williams and Bennett, 1995; Biase, 2017), which have called the attention of producers who look for proper ways of monitoring their cattle while maximizing profit. In this scenario, the supply chain actors, i.e., both reproduction and fattening (including the feedlots) producers and the meatpacking companies, are interested in maximizing their revenues while assuring food safety standards, meat quality, and the certification of all this process throughout the supply chain.

Aiming at optimizing this chain, several models of cattle growth have been proposed in the last decades to help producers and slaughterhouses in this decision-making process. Those models are here organized into the three categories that compose the most relevant study branches: SDE, Time Series Analysis (TSA) and ML.

2.1. System of Differential Equations (SDE)

These models are used by researchers to represent cattle growth rates as a function of the available variables. As a central example, the projection of growth rates has been based on the average body condition and frame size of beef cattle, as widely accepted by institutions such as the National Research Council (NRC) (NRC, 1976; NRC, 2000; NRC, 2011; NRC, 2016), which take into account feed requirements. However, a number of factors affecting growth performance are not included in the NRC system, namely both the genetic and the nutritional background.

In the search for reducing the excess of fat, the meatpacking market has been transitioning to individually manage and commercialize cattle, thus leading it to a second generation of non-linear models (Oltjen et al., 1986; Bywater et al., 1988; Di Marco et al., 1989; Oltjen et al., 2000; Sainz et al., 2006; Biase, 2017). Those models attempt to express more complex characteristics such as metabolism, nutrient availability, body weight, hormonal treatments and other initial conditions. Those models presumably output the chemical composition layers (e.g. some DNA features, protein synthesis and degradation, as well as body fat indices). In addition, the efficiency of animals is unfolded through the interaction of the model input parameters, leading to biological interpretations on the rate of protein increase and the maintenance energy, thus allowing the selection of more productive individuals within the lot.

In a third generation, non-linear models have aggregated further details from biochemical pathways and physiological mechanisms in an attempt to contemplate the growth of the following components: viscera, muscles and fat reserves (adipose tissue) of an animal under different nutritional strategies (Oddy et al., 1997; Soboleva et al., 1999; Keogh et al., 2021; Zhang et al., 2021). Those models have been confirmed to be consistent with compensatory growth in sheep and cattle over a range of energy intake levels (Soboleva et al., 1999), and with significant effects of nutrition on muscle, bone and adipose tissue development, such as in Keogh et al. (2021) and Zhang et al. (2021).

Other models based on fundamental and metabolic biological principles have been proposed by Gill (1984), Baldwin (1995) and Tedeschi et al. (2004), which operate at the levels of tissue or underwoven aggregation. All those SDE models are primarily devoted to supporting research studies; therefore, practical aspects such as the operational cost, complexity and availability of adequate input data under feedlot and grazing conditions are not taken into account, thus restricting their use in routine management as predictors of growth at the individual level.

2.2. Time Series Analysis (TSA)

The way errors are handled in the modeling process is fundamentally important. The structures normally used for the residuals are heteroscedasticity or homoscedasticity with independent or autoregressive errors. Datasets depicting animal growth are inherently ordered along time and autocorrelated, once the current features of an individual impact its development over time. Box and Jenkins (1976) proposed a methodology to represent this sort of correlation using what is referred to as the ARIMA model. Their idea described the observation at a given time as a linear combination of past values of the time series itself, as well as previous modeling errors. When a series is non-stationary, an additional decomposition step is required to transform it into a stationary series by means of first-order differences (Metcalfe and Cowpertwait, 2009).

For example, Roseiro (2017) modeled and forecasted the body weight of 231 Senepol heifers (Bos taurus genotype), with an initial weighted average and standard deviation (SD) of 400 ± 56.7 kg, at 7, 14 and 21 horizon days. Although the model was a good fit with significant parameters (P < 0.005), some of the limitations were associated with the historical data and, in general, results for longer prediction horizons were poor. Hence, there is an opportunity to improve the
results reported by Roseiro (2017) by implementing stochastic processes into a model.

2.3. Machine Learning (ML)

The use of ML techniques to predict beef cattle traits is not new; Decision Trees and Artificial Neural Networks (ANN) have been employed to address such a task. For example, Alonso et al. (2013) used such approaches to predict beef cattle conformity scores and growth using data from 91 animals. Additionally, Alonso et al. (2015) applied Support Vector Machines (SVMs) to predict carcass weight in advance of slaughter for the Asturiana de los Valles cattle breed, based on zoometric measurement features from 144 animals. In this last study, experimental results confirm that SVMs help predict the carcass weight 150 days before slaughter.

A comparison of the efficacy of Linear Regression, Generalized Linear Regression, Random Forests and ANNs to predict beef carcass weight, age when finished, fat deposition and carcass quality was performed by Aiken (2020). The analyzed data contained information on more than 4 million beef cattle from 5,204 farms in Brazil. The animal category, the nutritional plan, the cattle sales price, the participation in a technical advising program, and the climate and the soil in which animals were raised were deemed important for forecasting meat production and quality using Random Forests.

The development of forecasting strategies through ML methods is promising for livestock production using large-scale data. Despite complexities in beef cattle production, influenced by a farmer's personal interest, meat market regulators, and sanitary issues (such as the spread of diseases), the use of ML integrates the different data sources, making it possible to forecast meat production and quality at moderate-to-high accuracy coefficients (Aiken, 2020).

In this sense, the approach using an ANN in this study, based on the deterministic components and the construction of phase spaces, substantiates the prediction horizon with influences of external variables, as discussed by Alonso et al. (2015) and Aiken (2020).

3. Materials and methods

This section is divided into two parts: Section 3.1 describes data streams and Section 3.2 describes the proposed methods to model the prediction of cattle weights.

3.1. Data streams

The data were obtained from a commercial feedlot trial, located in Brasilândia, Mato Grosso do Sul (MS) State, Brazil (21°15′21″ S, 52°02′1″ W, altitude of 343 m), in 2019. The monitoring of weights was carried out by a system of automatic weighing platforms (Intergado model VW 1000)¹ that were installed in front of the water tank in the feedlot bay. When the animal had access to the tank, it passed through the weighing platform, where it was identified via an electronic ear tag using FDX technology and thus weighed. The animals were adapted to the system for a period of 14 days, and then they were monitored for another 64 days in the intensive system, following a single diet. The diet consisted of sorghum silage 62.7%, milled corn 30.9%, soybean meal 5.2%, concentrates² 0.8%, and urea 0.4%. The Total Digestible Nutrient (TDN) of the lot was 68.37%. The data collected at the time of weighing were wirelessly transmitted to a central computer. The number (average ± SD) of weights taken per animal per day was 10 ± 3.4. All weights used in the analysis resulted from averages over a 24 h period, after outliers were removed.

The monitoring of intake was also carried out by automatic weighing devices installed in the trough, obtaining the total DMI per animal per day. As for the animals' body characteristics, after the adaptation period, all animals were measured in terms of frame and Body Condition Scoring (BCS). The frame score, the BCS, the starting weight, the final weight and the DMI were (μ ± σ): 5.6 ± 0.26, 5.31 ± 0.42, 471.25 ± 34.12 kg, 545.5 ± 46.09 kg and 11.95 ± 1.91 kg/day, respectively, for a pen of 71 male Nelore (Bos indicus) animals. Meteorological data were obtained through the National Institute of Meteorology, INMET (2021), based on information from automatic stations for the temperature, precipitation, humidity, atmospheric pressure, dew point, global radiation, wind speed and wind direction, whose measurements were, respectively: 24.53 ± 2.13 °C, 7.00 ± 4.00 mm, 79.53 ± 8.68%, 975.23 ± 2.06 hPa, 20.67 ± 2.24 °C, 1052.90 ± 335.69 kJ/m², 3.08 ± 1.19 m/s and 135.09 ± 65.58 degrees.

¹ https://www.intergado.com.br/.
² Per kg: Ca (min) 98.00 g/kg; Ca (max) 113.00 g/kg; P (min) 45.00 g/kg; S (min) 40.00 g/kg; Mg (min) 44.00 g/kg; K (min) 61.50 g/kg; Na (min) 114.50 g/kg; Co (min) 48.50 mg/kg; Cu (min) 516.00 mg/kg; I (min) 30.00 mg/kg; Mn (min) 760.00 mg/kg; Se (min) 9.00 mg/kg; Zn (min) 2,516.50 mg/kg; sodium monensin 2,000.00 mg/kg; F (max) 450.00 mg/kg.

3.2. Our approaches on body weight modeling

Rural producers' data on their herds, supported by genetic, environment, management, pasture, breeding, weaning, rearing and body weight information, must be considered to optimize negotiation decisions and improve profit. By storing and assessing all of the historical information from past lots, the models in this study were improved and, consequently, so was the body weight prediction of new cattle populations.

Two approaches are proposed in this study and evaluated as a way to compare observed and predicted body weights. The first uses the deterministic component, resulting from Empirical Mode Decomposition (EMD), to build up a model that explores past lot information to improve the representation of current data observations. The second approach, used as a baseline, only analyzes body weights, as addressed in Roseiro (2017), from a stochastic process point of view. The results of these two approaches are reported and discussed. The following sections describe each one of these approaches in detail.

3.2.1. Setup

This section introduces the setup parameters used in both approaches, i.e., a dynamical system in combination with a Multi-Layer Perceptron (MLP), and stochastic processes through the usage of ARIMA and SARIMA models.

Common Data and Analysis. Seventy-one animals with a background report of 64 days in a feedlot were considered, disregarding 14 days of adaptation. The animal batch was kept in the same climatic environment, fed the same nutritional diet and kept under the same management conditions. Leave-one-out was the cross-validation strategy, in which all information from 70 animals was used to tune modeling parameters. This was followed by the prediction of the weights of a single animal used as a test, with the remaining seventy used for training the model. This procedure was repeated 71 times, until each animal had been used once as a test. It is worth noting that the animal used in the test is excluded from the training set.

We assume that all 70 animals considered in the training stage compose historical information, as previous batches produced on the same farm, to forecast the body weights of a single individual. In this manner, the statistical inferences produced from these reports will reflect reliable predictions for a new batch (representing the test) on the same farm and under the same conditions.
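As a rough illustration of this protocol, the sketch below loops over the 71 series, holding one animal out at a time. The language (Python) and the callables `fit_model` and `forecast` are our assumptions, hypothetical placeholders for any of the models described next; the paper does not state its tooling.

```python
import numpy as np

def leave_one_out(animal_series, fit_model, forecast, horizon=14):
    """Leave-one-out over animals: train on 70 series, test on the held-out one.

    animal_series: list of 1-D numpy arrays (one body-weight series per animal).
    fit_model/forecast: hypothetical callables standing in for ARIMA/SARIMA/MLP.
    """
    errors = []
    for i, test_series in enumerate(animal_series):
        # the tested animal is excluded from the training set, as in the text
        train_series = [s for j, s in enumerate(animal_series) if j != i]
        model = fit_model(train_series)
        predicted = forecast(model, test_series[:-horizon], horizon)
        expected = test_series[-horizon:]
        errors.append(np.mean(np.abs((expected - predicted) / expected)) * 100)  # MAPE
    return np.mean(errors), np.std(errors)
```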
Dynamical Systems and Intelligence-based. In this approach, we propose a combination of methods to support the forecasting of cattle weights, which starts with the EMD method (details in Appendices A and A.1) to decompose the body weight series, among others, into stochastic and deterministic components, as discussed in Rios and De Mello (2016). This method consists of comparing the mutual information contained in the phase spectra (complex Fourier coefficients) of consecutive Intrinsic Mode Functions (IMFs) to measure their similarities, as expanded in Appendix A.2. When comparing the phase spectra of IMFs, a subset of them contains stochastic influences, while the remaining ones are predominantly deterministic. Therefore, it is possible to segment such behaviors in an attempt to improve data modeling and forecasting.

Training was carried out using real-world data (original) with the addition of the respective deterministic components, as in data augmentation strategies (Gemmeke et al., 2010), in order to improve data sampling and thus model convergence. In the case of phase-space reconstructions (kernel function parameterizations, Appendices A.3, A.4, A.5), we chose the best regression setting, in other words, the one with the greatest generalization capacity (Appendix A.4). Then, a recurrent forecast was employed using the best kernel parameterization; specifically, the series observations were recursively used as input feedback to forecast the next ones, allowing us to quantify the accumulated error and compare it against the expected values. Finally, we evaluated the generalization capacity of the regression adjustment (Appendix C).

Environmental and nutritional variables having a higher degree of relevance to represent the body weight were selected by calculating the mutual information and analyzing the principal components. As body weight is the most relevant variable in the model, being the response variable we intend to predict, we used principal component analysis to examine the components with the highest importance scores for representing the weight variable. Going into each component whose scores were above 0.25 and analyzing the most important variables for its representation, we observed that some variables appeared more frequently and with a greater degree of importance. Mutual information was also calculated in order to verify the principal component analysis. Mutual information establishes the degree of dependence between two variables, capturing both linear and non-linear structures and without restriction to monotonic functions. It is expected that the results of the principal components and of the mutual information converge and define the same set of important variables to model the predictions of the variable body weight.

In short, the following variables were decomposed into IMFs: environmental, nutritional and animal weights. In the case of the environmental and nutritional variables, the first monocomponents were discarded, as they were considered to be predominantly stochastic. The other monocomponents, along with the residue, were added over the days to form the deterministic component. It is important to mention that, when analyzing the variable body weight, the decomposition identified the first two monocomponents as stochastic, with the remaining ones and the residue being considered part of the deterministic component.

Next, the selected series were reconstructed using Takens' embedding theorem (Appendix A.3), in which the parameter time delay, d, and the embedding dimension or number of spatial axes, m, were estimated using Auto-Mutual Information (AMI) and False Nearest Neighbors (FNN), respectively. The overall structure for all phase spaces was built upon the highest frequencies of d and m, considering the environmental, nutritional and body weight variables, leading to d = 4 and m = 3.

The architecture of the ANN (MLP) is described in Appendix A.6 and is composed of three consecutive neuron layers: the first, called the input layer, with k = [(m × v) − 1] = [(3 × 4) − 1] = 11 neurons, in which m = 3 is the dimension of the new space and v = 4 is the number of variables selected to compose the model, namely: body weight, DMI, instantaneous atmospheric pressure and global radiation; the second layer consisted of 1 to 30 neurons, which were tested using the Mean-Distance from the Diagonal Line (MDDL); and the output layer had a single neuron representing the variable of interest, i.e., body weight. The maximum number of iterations, maxit, was set at 5,000; the constant weight decay (decay), to avoid network overfitting, was set at 5 × 10⁻⁴; and the range of values for random weight initialization (rang) was defined as 0.01. Fig. 1 presents an outline with the steps of our approach, depicting its update/recursive nature.

Fig. 1. Outlining the update/recursive nature of our dynamical system and intelligence-based approach.
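The maxit, decay and rang parameters above follow the conventions of R's nnet package, suggesting the original experiments were run in R; the sketch below is only an approximate scikit-learn counterpart, where alpha plays a role loosely analogous to the weight decay and the training data are synthetic stand-ins.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Approximate counterpart of the architecture above: 11 phase-space inputs
# (m = 3 embedding coordinates x v = 4 variables, minus one), a single hidden
# layer of 1..30 units, and one output neuron (body weight).
def build_mlp(hidden_units=5, seed=0):
    return MLPRegressor(
        hidden_layer_sizes=(hidden_units,),
        alpha=5e-4,       # L2 penalty, loosely analogous to nnet's weight decay
        max_iter=5000,    # matches maxit = 5,000 in the text
        random_state=seed,
    )

# X: one row per phase-space state (11 features); y: next-day body weight.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 11)), rng.normal(size=200)  # synthetic stand-ins
model = build_mlp(hidden_units=5).fit(X, y)
print(model.predict(X[:3]))
```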
Stochastic Parameters. Our stochastic approach (theoretical concepts included in Appendix B) considered the body weights of animals to devise ARIMA and SARIMA models composing a linear ensemble learning strategy, in an attempt to obtain more robust, consistent and less noise-susceptible results. The body-weight forecast of each animal was produced via the average weighting of the ARIMA and SARIMA model parameters adjusted on 70 animals (training) with the complete feedlot cycle (64 days), as well as taking into account the predictions for the best fit of the model using past information on the animal held for testing.

When the prediction horizon is excessively long, the short report of past information on the tested animal itself tends to produce inaccurate weights. In this case, the historical information of previous feedlots is considered in an attempt to correct prediction results for both the ARIMA and SARIMA models. Finally, we use the evaluation concepts described in Appendix C to evaluate the ARIMA and SARIMA models compared with the deterministic approach (Dynamical Systems and Intelligence-based models).

4. Experimental results

This section reports the results of our approaches: the first based on Dynamical Systems in conjunction with an Artificial Intelligence-based approach, then the Stochastic approach as the baseline. A parameter comparison is also provided to support the reader.

4.1. Dynamical systems and artificial intelligence-based approach

At first, we decompose the following variables into deterministic and stochastic components using EMD: body weight, DMI, instantaneous atmospheric pressure and global radiation. Fig. 2 (A-C) illustrates the phases of the Fourier complex coefficients measured from each IMF seen in Fig. 3 (A-C). The first two figures, i.e., Fig. 2 (A) and (B), correspond to the phases of the first two IMFs, whose behavior is clearly stochastic, as discussed in Rios and De Mello (2013), once phases tend to be random. The third IMF, seen in Fig. 2 (C), starts presenting a congruent phase behavior, which allows us to classify it as deterministic, a conclusion also drawn from Rios et al. (2015). From this conclusion, we sum up the first two IMFs, i.e., Fig. 3 (A) and (B), to form the stochastic component, while the third IMF, seen in Fig. 3 (C), is summed up with the series residue, seen in Fig. 3 (D), to form the deterministic component.

The same variable, i.e., body weight, is again used to illustrate the phase-space reconstruction process after Takens' embedding theorem (the other variables previously mentioned were also submitted to this process), based on the deterministic component extracted using EMD. Such stage was performed based on the AMI and FNN methods to estimate the time delay d and the embedding dimension m for each animal, as illustrated in Fig. 4 (A) and (B), respectively. The first local minimum in Fig. 4 (A) shows the typical estimate for d, as discussed in Fraser and Swinney (1986), while the embedding dimension m is estimated from a fraction of false nearest neighbors smaller than 20%, see Fig. 4 (B), as discussed in Kennel et al. (1992). The phase spaces for all animals in the feedlot were built upon the highest frequencies of the estimations for d and m, also considering the remaining variables previously discussed: body weight, DMI, instantaneous atmospheric pressure and global radiation. The whole process was analysed with d = 4 and m = 3.
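A minimal sketch of this reconstruction (Eq. (6) in Appendix A), using the d = 4 and m = 3 estimated above; the input series here is a synthetic stand-in, not the experimental data.

```python
import numpy as np

def takens_embedding(x, d=4, m=3):
    """Reconstruct the phase space of series x with time delay d and dimension m.

    Returns one row per phase-space state: [x(t), x(t + d), ..., x(t + (m - 1) d)].
    """
    n_states = len(x) - (m - 1) * d
    if n_states <= 0:
        raise ValueError("series too short for this (d, m) pair")
    return np.column_stack([x[i * d : i * d + n_states] for i in range(m)])

x = np.sin(np.linspace(0, 12 * np.pi, 200))   # synthetic stand-in series
phase_space = takens_embedding(x, d=4, m=3)
print(phase_space.shape)                       # (192, 3)
```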
Fig. 2. Phase spectra of all Fourier Coefficients (FCs) given the IMFs extracted from the original time series of body weights for a single animal. Each plot corresponds
to an IMF from Fig. 3 (A-C).
Fig. 3. Plots from (A) to (C) show all IMFs h_n(t) extracted at each iteration from the original time series of body weights X(t), using the EMD method. Plot (D) shows the residue r(t).

Fig. 4. (A) Average Mutual Information (AMI), where the first local minimum shows the typical estimate for the time delay d; and (B) False Nearest Neighbours (FNN), where the value below 20% is taken as the estimate of m. Both plots were produced using the deterministic component of the variable body weight obtained with the EMD method.
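For reference, d and m can be estimated as described above with straightforward implementations of AMI (first local minimum of the lagged mutual information) and FNN (fraction of neighbors that separate when the dimension grows). This is a simplified sketch with a fixed tolerance and binned mutual information, not the exact procedures of Fraser and Swinney (1986) or Kennel et al. (1992).

```python
import numpy as np

def ami(x, max_lag=20, bins=16):
    """Average mutual information between x(t) and x(t + lag), for each lag."""
    scores = []
    for lag in range(1, max_lag + 1):
        a, b = x[:-lag], x[lag:]
        joint, _, _ = np.histogram2d(a, b, bins=bins)
        p = joint / joint.sum()
        px, py = p.sum(axis=1), p.sum(axis=0)
        nz = p > 0
        scores.append(np.sum(p[nz] * np.log(p[nz] / np.outer(px, py)[nz])))
    return np.array(scores)

def first_local_minimum(scores):
    for i in range(1, len(scores) - 1):
        if scores[i] < scores[i - 1] and scores[i] < scores[i + 1]:
            return i + 1  # lags start at 1
    return len(scores)

def fnn_fraction(x, d, m, tol=10.0):
    """Fraction of false nearest neighbors for embedding dimension m."""
    def embed(k):
        n = len(x) - (k - 1) * d
        return np.column_stack([x[i * d : i * d + n] for i in range(k)])
    em, em1 = embed(m), embed(m + 1)
    em = em[: len(em1)]          # align state counts between dimensions m and m+1
    false = 0
    for i in range(len(em)):
        dist = np.linalg.norm(em - em[i], axis=1)
        dist[i] = np.inf
        j = int(np.argmin(dist))  # nearest neighbor in dimension m
        if abs(em1[i, -1] - em1[j, -1]) / max(dist[j], 1e-12) > tol:
            false += 1            # neighbor separates in dimension m + 1
    return false / len(em)

x = np.cumsum(np.random.default_rng(2).normal(size=300))  # synthetic series
d = first_local_minimum(ami(x))
print(d, fnn_fraction(x, d=d, m=3))
```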
The number of hidden neurons varied from 1 to 30, confirming the best results with 5 units even when different prediction horizons are used, H = 14, 21 and 28 days, as listed in Table 1.

The results may still suggest that the neural network saturation was not yet reached, meaning that we could attempt to employ more complex networks (more layers and nodes per layer) to improve results (Aiken, 2020). However, for the dataset used in this study, the effort made in the overall analysis has already exceeded 500 h of computations, which has significantly saturated our available environment (accessing more than 80% of memory). Therefore, it would be infeasible to perform more tests for greater complexities given the computational resources at hand.

Table 1
Leave-one-out cross-validation of predictions using the Dynamical System and Artificial Intelligence-based approach under different prediction horizons (values listed as H = 14/21/28).

Units in hidden layer | MDDL                 | RSS                  | MAPE (%)
1                     | 0.2836/0.4871/0.3653 | 0.0006/0.0013/0.0010 | 1.5827/3.4882/2.6161
2                     | 0.2499/0.4211/0.3158 | 0.0003/0.0014/0.0010 | 1.3769/2.9179/2.1885
3                     | 0.2266/0.4227/0.3170 | 0.0007/0.0045/0.0034 | 1.7885/4.0464/3.0348
4                     | 0.2259/0.4208/0.3156 | 0.0009/0.0043/0.0032 | 1.7355/4.4890/3.3667
5                     | 0.2216/0.3947/0.2960 | 0.0010/0.0034/0.0025 | 1.9555/3.9007/2.9255
6                     | 0.2413/0.4449/0.6349 | 0.0006/0.0063/0.0045 | 1.7366/5.7776/6.0055
7                     | 0.2453/0.4532/0.5710 | 0.0015/0.0074/0.0046 | 2.2441/6.2637/6.0309
8                     | 0.2572/0.4617/0.5761 | 0.0011/0.0078/0.0064 | 2.2478/6.2452/6.7733
9                     | 0.2400/0.4124/0.6500 | 0.0010/0.0046/0.0076 | 1.9895/5.0149/7.5366
10                    | 0.2688/0.4828/0.3621 | 0.0014/0.0043/0.0032 | 2.3098/5.4205/4.0654
11                    | 0.2663/0.4311/0.3234 | 0.0009/0.0038/0.0028 | 2.3302/4.9649/3.7237
12                    | 0.2423/0.4964/0.3723 | 0.0014/0.0075/0.0056 | 2.4949/6.7581/5.0686
13                    | 0.2461/0.4191/0.3143 | 0.0019/0.0028/0.0021 | 2.8573/4.6860/3.5145
14                    | 0.2548/0.4567/0.3425 | 0.0014/0.0041/0.0031 | 2.6400/5.4899/4.1174
15                    | 0.2522/0.4183/0.3137 | 0.0017/0.0035/0.0027 | 2.8712/5.2356/3.9267
20                    | 0.2833/0.5012/0.3759 | 0.0026/0.0061/0.0046 | 3.5686/6.5926/4.9444
25                    | 0.2930/0.4749/0.3562 | 0.0014/0.0046/0.0035 | 2.8510/6.1190/4.5892
30                    | 0.2898/0.5133/0.3849 | 0.0030/0.0064/0.0048 | 3.8614/7.0756/5.3067

MDDL: Mean-Distance from the Diagonal Line; RSS: Residual Sum of Squares; MAPE: Mean Absolute Percentage Error.

After the weight predictions performed here with the MLP network, the average prediction horizon Ĥ was estimated for the different architectures covered. As expected, the longer the feedlot duration used for training, the greater the prediction horizon and the smaller the SDs (Table 2). Observing the Lyapunov exponents, the greater the number of iterations (N), the more stable the behavior of the time series (periodic orbits), characterizing systems with asymptotic stability. This type of behavior probably refers to the weight behaviors in the first month of the feedlot, in which the variation in weights was smaller. As animals grew and the number of iterations was reduced, N = 14, the series presented chaotic and unstable orbits, most likely resulting from the greater variation in weights in the last days of the feedlot. If one decided to keep animals longer in a feedlot, only for the sake of collecting more data points, one could better explore the behavior of those orbits. The slaughter of animals in unmonitored feedlots is based on fixed dates (pre-determined fattening time) as well as on visual assessment. More sophisticated feedlots monitor animal diet and body weights daily, so that the optimal fattening point is defined by a relationship between DMI and the reduction in the body weight rate. In other words, when the cost of maintaining the animal alive starts being greater than or equivalent to the profit, the animal is taken to slaughter – not necessarily considering that the weight converges over time. By exploring the Hurst exponent, Appendix C.2, also listed in Table 2, we conclude that its value lies in the range 0.5 < κ < 1, thus suggesting a persistent behavior with a correlation between past and future elements.

Table 2
Prediction horizon calculated by the Lyapunov exponent using the forecasting results from MLP.

N = 14 (Lyapunov exponent 0.0138 ± 0.0063; Hurst 0.7823 ± 0.0056):
Hidden layer: 1 | 2 | 3 | 4 | 5 | 10 | 20
Prediction horizon: 158 ± 260 | 160 ± 279 | 142 ± 221 | 152 ± 292 | 137 ± 305 | 106 ± 288 | 118 ± 276

N = 21 (Lyapunov exponent 0.0060 ± 0.0069; Hurst 0.7665 ± 0.0073):
Hidden layer: 1 | 2 | 3 | 4 | 5 | 10 | 20
Prediction horizon: 113 ± 541 | 118 ± 646 | 117 ± 582 | 100 ± 574 | 101 ± 572 | 60 ± 415 | 32 ± 307

N = 28 (Lyapunov exponent 0.0016 ± 0.0077; Hurst 0.7434 ± 0.0128):
Hidden layer: 1 | 2 | 3 | 4 | 5 | 10 | 20
Prediction horizon: 15 ± 836 | 3 ± 1232 | 13 ± 1632 | 5 ± 1736 | 4 ± 1440 | 5 ± 845 | 17 ± 1393

N: number of iterations; all values are mean ± standard deviation.
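The paper computes the Hurst exponent as detailed in Appendix C.2, which is not reproduced here; the sketch below shows only a generic rescaled-range (R/S) estimate of the same quantity, whose value above 0.5 indicates the persistent behavior reported in Table 2.

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """Estimate the Hurst exponent of series x via rescaled-range (R/S) analysis."""
    n = len(x)
    sizes = np.unique(np.logspace(np.log10(min_chunk), np.log10(n // 2), 10).astype(int))
    log_size, log_rs = [], []
    for size in sizes:
        rs_values = []
        for start in range(0, n - size + 1, size):
            chunk = x[start : start + size]
            dev = np.cumsum(chunk - chunk.mean())    # cumulative deviation from the mean
            r, s = dev.max() - dev.min(), chunk.std()
            if s > 0:
                rs_values.append(r / s)
        if rs_values:
            log_size.append(np.log(size))
            log_rs.append(np.log(np.mean(rs_values)))
    return np.polyfit(log_size, log_rs, 1)[0]        # slope of log(R/S) vs log(size)

weights = np.cumsum(np.random.default_rng(3).normal(0.8, 0.5, size=64)) + 470
print(hurst_rs(weights))  # > 0.5 suggests persistence (correlated past and future)
```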
4.2. Stochastic-based approach

Considering the stochastic models analyzed in this study, the SARIMA models performed better for every prediction horizon studied, H = 14, 21 and 28, in comparison to the ARIMA models, as observed in Table 3. This result may be explained by the weekly animal handling routine, which produces seasonal effects on their weights. As an example, the reduction of employees on the weekends may change the animal feeding behavior, thus affecting their weights. The ARIMA models were obtained with the following parametrizations: ARIMA(1, 1, 0) and ARIMA(0, 1, 1); the SARIMA models were SARIMA(0, 1, 1)(0, 0, 1) and SARIMA(0, 1, 1)(0, 0, 2). Results similar to those obtained by the MLP regarding the average horizon estimate are repeated for the ARIMA and SARIMA models, i.e., the shorter the feedlot report used for training (when the number of iterations N is greater), the greater the decline of the prediction horizon and the greater the SDs, as discussed in the previous section (Tables 2 and 4).

Table 3
Leave-one-out cross-validation of predictions using the Stochastic-based approach under different prediction horizons (values listed as H = 14/21/28).

Model  | MDDL                 | RSS                  | MAPE (%)
ARIMA  | 0.8763/0.9494/0.8299 | 0.8081/1.9986/1.5365 | 1.0363/0.6892/0.6082
SARIMA | 0.5912/0.5614/0.4884 | 0.3350/0.3499/0.7026 | 0.6605/0.9852/0.4263

MDDL: Mean-Distance from the Diagonal Line; RSS: Residual Sum of Squares; MAPE: Mean Absolute Percentage Error.

Table 4
Prediction horizon calculated by the Lyapunov exponent using the forecasting results from the Stochastic-based approach.

Model  | N  | Lyapunov exponent | Prediction horizon | Hurst
ARIMA  | 14 | 0.0138 ± 0.0062   | 96 ± 222           | 0.7823 ± 0.0056
ARIMA  | 21 | 0.0060 ± 0.0069   | 86 ± 564           | 0.7665 ± 0.0073
ARIMA  | 28 | 0.0016 ± 0.0077   | 6 ± 1801           | 0.7434 ± 0.0128
SARIMA | 14 | 0.0138 ± 0.0063   | 100 ± 217          | 0.7823 ± 0.0056
SARIMA | 21 | 0.0060 ± 0.0069   | 86 ± 667           | 0.7665 ± 0.0073
SARIMA | 28 | 0.0016 ± 0.0077   | 3 ± 1811           | 0.7434 ± 0.0128

N: number of iterations; all values are mean ± standard deviation.
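A minimal sketch of the stochastic ensemble using statsmodels: the orders follow the parametrizations reported above, while the weekly seasonal period s = 7 is our assumption, motivated by the weekly handling routine just discussed; the input series is synthetic.

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic 64-day body-weight series (stand-in for one animal's feedlot cycle).
weights = np.cumsum(np.random.default_rng(1).normal(0.8, 0.5, size=64)) + 470

# ARIMA(1, 1, 0): one autoregressive term on the first-differenced weights.
arima = SARIMAX(weights, order=(1, 1, 0)).fit(disp=False)

# SARIMA(0, 1, 1)(0, 0, 1): adds a seasonal MA term; s = 7 is an assumption.
sarima = SARIMAX(weights, order=(0, 1, 1), seasonal_order=(0, 0, 1, 7)).fit(disp=False)

# Simple linear ensemble: average the two 14-day-ahead forecasts.
h = 14
ensemble = (arima.forecast(h) + sarima.forecast(h)) / 2.0
print(ensemble[:3])
```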
The stationarity test was applied before and after a first-order difference. Estimates of p-values suggested the presence of a trend in body weight for all prediction horizons used, as represented in Fig. 5 (A). To eliminate such a trend, a first-order difference and the Augmented Dickey-Fuller test were applied to check the stationarity [Fig. 5 (B)]. The histograms confirm that one difference was enough to eliminate the trend, rejecting the null hypothesis H0 of non-stationarity (unit root). Fig. 6 (A) illustrates the representation of all 71 animals: at the top, it is possible to find the behavior of the weights and, at the bottom, the illustration of stationarity after performing a first-order difference.

Fig. 5. (A) Augmented Dickey-Fuller test to assess the trend of the original variable, considering 14, 21 and 28 days of prediction horizons, respectively; (B) Augmented Dickey-Fuller test to assess trends after a first-order difference, considering 14, 21 and 28 days of prediction horizons, respectively.

The histograms in Fig. 6 (B) illustrate Fisher's Exact G test for the presence of seasonality, in the upper, middle and lower parts, respectively, for the following prediction horizons: 14, 21 and 28 days. These results show the presence of the seasonality component in the model for about 50% of the animals. After adjusting the combinations of the ARIMA and SARIMA models, the normality of the residuals was tested, Fig. 7 (A) and (B), respectively. The histograms referring to p-values, at the top, middle and bottom, correspond to the prediction horizons of 14, 21 and 28 days, respectively, in both parts (A) and (B) of Fig. 7, suggesting the normality of the residuals.

Fig. 6. (A), top: the time series of body weights considering all animals; (A), bottom: the first-order differences for all time series of body weights; (B) Fisher's Exact G test to assess seasonality, considering 14, 21 and 28 days of prediction horizons, respectively.

Fig. 7. (A) Normality test – ARIMA models for the prediction horizons of 14, 21 and 28 days, respectively; (B) Normality test – SARIMA models for the prediction horizons of 14, 21 and 28 days, respectively.
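A minimal sketch of this stationarity check with statsmodels, on a synthetic weight series; under the ADF test, a small p-value after the first-order difference rejects the unit root.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

weights = np.cumsum(np.random.default_rng(4).normal(0.8, 0.5, size=64)) + 470

# ADF on the raw series: the null hypothesis is a unit root (non-stationarity).
p_raw = adfuller(weights)[1]

# First-order difference, as used above to remove the trend in body weights.
p_diff = adfuller(np.diff(weights))[1]

print(f"p-value raw: {p_raw:.3f}, after first difference: {p_diff:.3f}")
# A small p-value after differencing rejects the unit root, i.e., the
# differenced series can be treated as stationary.
```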
4.3. Common parameters

The assessment of models is essential for policy makers and researchers to provide ways to express scientific knowledge. This study assesses model accuracy, robustness and sustainability while addressing a specific target. In this paper, we present and discuss various strategies to evaluate our models while forecasting body weights, including the Residual Sum of Squares (RSS), the Mean Absolute Percentage Error (MAPE) and the MDDL. The MDDL is seen as the best measure to evaluate time series forecasting when compared to other commonly adopted measures that simply compute the differences between expected and predicted observations, such as the MAPE and the RSS (Rios and De Mello, 2016). The MDDL analysis determines the necessary time shifts/warps to obtain the best synchronization between two series (verifying the global behavior, which may have similar trends and seasonality) and later assesses their similarities (Rios and De Mello, 2013). Another important feature of the MDDL is related to the stability of its results, which are not influenced by the number of serial observations. In the MDDL, scores are more influenced by the general behavior than by the series length. Due to this stability and the ability to assess general similarities between series, the MDDL is here seen as the main approach to analyze the overall results.
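The authoritative MDDL definition is given by Rios and De Mello (2013) and is not reproduced in this paper; the sketch below illustrates one straightforward reading of the idea described above: compute the Dynamic Time Warping path between the expected and predicted series and average the path's distance to the diagonal line.

```python
import numpy as np

def dtw_path(a, b):
    """Dynamic Time Warping: optimal warping path between series a and b."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = abs(a[i - 1] - b[j - 1]) + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    path, i, j = [], n, m
    while i > 0 and j > 0:                 # backtrack from (n, m) to (1, 1)
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def mddl(expected, predicted):
    """Mean distance of the DTW warping path from the diagonal line."""
    path = dtw_path(expected, predicted)
    return float(np.mean([abs(i - j) for i, j in path]))

t = np.arange(28.0)
print(mddl(470 + t, 470 + t + np.random.default_rng(6).normal(0, 1, 28)))
```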
It is possible to notice that all analyses based on deterministic modeling indicated superior results, more accurate and precise, showing that even by integrating different data sources it is possible to predict meat production, surpassing the ARIMA and SARIMA models. This conclusion confirms the greater influence of deterministic behaviors on the weights, observing the probability densities of the MDDL for different prediction horizons (H = 14, 21 and 28), as seen in Fig. 8. With the MLP, the high density concentration is clear when the MDDL values are smaller.

Fig. 8. The probability density of the MDDL involving the prediction horizons of 14, 21 and 28 days, respectively in (A), (B) and (C). With the deterministic approach, five neurons were used at the hidden layer of the MLP network.

The simple analyses of linear regressions based on weight adjustments, considering observed versus predicted observations, by the ARIMA, SARIMA and MLP methods, are illustrated in Figs. 9–11, respectively, following the prediction horizons of 14, 21 and 28 days. As observed, adjustments are best up to 21 days. With 28 days, stochastic strategies overestimated the animal weights, while the MLP underestimated them.

Fig. 9. The ARIMA model: adjustments of observed and predicted observations for the prediction horizons of 14, 21 and 28 days, respectively.

Fig. 10. The SARIMA model: adjustments of observed and predicted observations for the prediction horizons of 14, 21 and 28 days, respectively.

Fig. 11. The MLP model: adjustments of observed and predicted observations for the prediction horizons of 14, 21 and 28 days, respectively.

5. Discussion

The ANNs have been applied to a large number of fields in recent decades; however, their applications are still marginal in animal sciences. Even with few contributions, the ANN results in animal sciences are more powerful than those of most traditional statistical forecasting methods, such as Multivariate Linear Regression (MLR), Logistic Regression, Principal Component Analysis, Discriminant Analysis and k-Nearest Neighbor classification, among others. Unlike traditional statistical methods, the ANNs attempt to solve problems through explicit learning; thus, fitting a model requires more computing resources than a traditional MLR model, but the benefits of its accuracy far outweigh the underlying computing overheads (Salawu et al., 2014). Moreover, ANNs are adaptive and data-driven in nature, meaning that their models may be modified to better represent the features of time series data (Büyükşahin and Ertekin, 2019).

In the context of this study, ANNs support the prediction of the growth rates and of the optimal point of animal negotiation, but also of the drop in weights when it happens, thus allowing the detection of eventual issues related to animal health, injuries, immobilization, welfare, social interactions or even a technical failure within a feedlot. When this type of problem is observed, intervention measures may be taken in an attempt to minimize losses and mortality (Sarout et al., 2018). Previous studies have revealed that management, climate and environmental factors can significantly affect the spatial distribution when it comes to disease prediction (Selemetas et al., 2015). Disease surveillance using information from previous time series is also central to predicting disease activity, in which the spatial extent of epidemiological data is a particularly important factor to investigate the annual and interannual patterns of any sort of disease in order to have a more accurate forecast (Myers et al., 2000).

In recent decades, considerable effort has been devoted to developing and improving time series prediction models.
In the literature, several forecasting methods have been proposed, which use linear and non-linear models separately or a combination of both, in a hybrid way. For example, Maheswari et al. (2021) report that statistical and linear models provide better results than ANNs. On the other hand, Alonso et al. (2015) and Aiken (2020) report that ANNs outperform linear models when data have high volatility and multicollinearity. As already discussed in the literature, the integration of linear and non-linear strategies, such as the ARIMA model (linear) with the ANNs (non-linear), in conjunction with the EMD method, tends to improve the overall predictions (Rios and De Mello, 2013; Büyükşahin and Ertekin, 2019; Mayilsamy et al., 2021).

Our experimental results using an ANN in combination with the EMD method show good agreement (with 5 units in the hidden layer and MDDL of 0.2216, 0.3947 and 0.2960 for 14, 21 and 28 days of prediction horizon, respectively) when compared to the baseline approach. In future studies, other time series analysis methods may be assessed and combined, as a way to capture additional structures of linear and non-linear components in a hybrid model.

It is worth mentioning that image analysis techniques have also been used to estimate the body weight of animals based on spatial features (Cominotte et al., 2020; Qiao and Kong, 2021; Rahagiyanto and Adhyatma, 2021). This approach may be combined with our approaches, on modern farms, to support an even more precise way of ensuring investment returns.

6. Concluding remarks

Predicting the future of livestock production using large-scale data is the central motivating aspect of this study. This study estimates future trends and optimizes resource allocation at all levels of the supply chain, making animal production more sustainable. Although beef cattle production is a complex system, the analysis based on deterministic modeling confirmed moderate-to-high precision. This study demonstrates that, by integrating different data sources in a deterministic model, one can predict meat production, surpassing the ARIMA and SARIMA models. Further studies on decomposition analyses to support the individual modeling of animals based on stochastic and deterministic influences are warranted.

CRediT authorship contribution statement

Adriele Giaretta Biase: Conceptualization, Methodology, Validation, Formal analysis, Writing – original draft. Tiago Zanett Albertini: Resources, Data curation, Investigation. Rodrigo Fernandes de Mello:
Appendix A

Data collected from natural phenomena tend to present a mixture of deterministic and stochastic influences, which motivated more recent studies (Box et al., 2015; Rios and De Mello, 2013; Rios and De Mello, 2016) to decompose and model them individually. When dealing with predominantly deterministic data flows, Ravindra and Hagedorn (1998) show that Takens' embedding theorem produces better results when analyzing non-linear deterministic signals. Among the most current approaches, the proposal by Rios and De Mello (2016) stands out, which employs EMD (Huang et al., 2003) in conjunction with Mutual Information (MI) (Darbellay and Vajda, 1999) to segment such components, and it was the one adopted in this study. Further details necessary for the understanding of the decomposition step are presented next.
To perform the decomposition of a time series, the first process is the sifting (Rios and De Mello, 2016), which initially analyzes a signal x(t) and identifies local maxima and minima observations over time. A cubic spline model is then applied to compose the upper u(t) and lower l(t) envelopes. The average envelope m(t) is obtained using the values of the approximations with cubic splines fitted inferiorly, l(t), and superiorly, u(t) (Huang et al., 2003).

After this, m(t) is subtracted from the original signal x(t), producing the first candidate component h_{1,1} = x(t) − m(t), in which the first index corresponds to the IMF identifier (as this is the first IMF to be extracted, the index is one) and the second to the candidate identifier (as this is the first candidate, the index is one as well). This candidate is used in place of the original data and the entire sifting process is repeated until the
candidate fulfills the IMF definition, which must satisfy the following requirements: (i) the number of extrema and the number of zero crossings must be equal or differ by at most one; and (ii) at each point, the mean envelope m(t) is equal to zero. After obtaining the candidate that fulfills the IMF definition, the first IMF is defined as h_1(t) = h_{1,k}(t), assuming that k candidates are produced until the IMF definition is reached. Then, this first IMF is removed from the data, in other words, x(t) − h_1(t), and the resulting signal is analyzed by the whole process once again, producing more IMFs until a stop criterion is achieved. This criterion usually holds when the last IMF becomes a monotonic function, avoiding the extraction of other components. Therefore, this last component is called the final residue, r(t) (Rios and De Mello, 2016). In short, according to EMD, a signal x(t) is composed of a set of IMFs along with the residue, as shown in the following equation:

$$x(t) = \sum_{j=1}^{N} h_j(t) + r(t). \tag{1}$$

As an advantage, the EMD method supports the decomposition of signals into IMFs, regardless of their linearity, stationarity and stochasticity (Huang et al., 2003).
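A deliberately simplified sifting sketch of the procedure above; a real analysis would rely on a complete EMD implementation (e.g. the PyEMD package) with proper convergence criteria rather than this fixed number of siftings.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def mean_envelope(x):
    """Average m(t) of the upper u(t) and lower l(t) cubic-spline envelopes."""
    t = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 4 or len(minima) < 4:
        return None  # too few extrema: x is (close to) monotonic
    upper = CubicSpline(maxima, x[maxima])(t)
    lower = CubicSpline(minima, x[minima])(t)
    return (upper + lower) / 2.0

def emd(x, max_imfs=8, n_siftings=10):
    """Toy EMD: extracts IMFs by repeated sifting; the leftover is the residue r(t)."""
    imfs, residue = [], x.astype(float).copy()
    for _ in range(max_imfs):
        h = residue.copy()
        for _ in range(n_siftings):       # fixed number of siftings as a crude stop rule
            m = mean_envelope(h)
            if m is None:
                break
            h = h - m                     # candidate component h = signal - mean envelope
        if mean_envelope(h) is None:      # no more oscillatory content: stop
            break
        imfs.append(h)
        residue = residue - h
    return imfs, residue

t = np.linspace(0, 1, 512)
x = np.sin(30 * np.pi * t) + 0.5 * np.sin(6 * np.pi * t) + t   # synthetic signal
imfs, r = emd(x)
print(len(imfs), r[:3])
```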
In order to understand the IMF extraction process, we define a dataset x(t) = x(1), x(2), …, x(T), which represents the combination of a deterministic function with an additive noise following a Normal distribution with mean μ = 0 and SD σ = 1. After applying the EMD to the variable x(t), a set of IMFs (h_j(t), 1 ≤ j ≤ N) and a residue r(t) are extracted. While analyzing EMD results, frequency bandwidths tend to decrease as new IMFs are extracted (Rios and De Mello, 2016). The Fourier transform F(·) is applied to each IMF (Eq. (2)), and also to the residue (the last IMF). For each IMF h_j(t), C_j(t) is calculated:

$$C_j(t) = F(h_j(t)), \tag{2}$$

then a set of complex coefficients C_j(t) = {c_{j,1}, c_{j,2}, …, c_{j,k}, …, c_{j,T}} is obtained in the frequency space, where each c_{j,k} is calculated using the following equation:

$$c_{j,k} = \sum_{t=1}^{T} h_j(t)\, e^{-i 2\pi (k/T) t}, \quad \forall k \in \{1, 2, 3, \dots, T\}. \tag{3}$$
After obtaining the Fourier coefficients of each IMF, the phase spectrum of each component is calculated. For this purpose, the arc tangent function is applied to the ratio between the imaginary part I(·) and the real part R(·), as presented in the following equation:

$$\theta(h_j(t)) = \arctan\left(\frac{I(C_j(t))}{R(C_j(t))}\right), \quad \forall j \in \{1, 2, 3, \dots, N\}. \tag{4}$$
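Eqs. (2)–(4) map directly to a discrete Fourier transform followed by the angle of each coefficient; note that numpy's angle uses the quadrant-aware arctan2 rather than the plain arc tangent of Eq. (4).

```python
import numpy as np

def phase_spectrum(imf):
    """Phase spectrum of an IMF: the angle of its complex Fourier coefficients,
    i.e., arctan(imaginary/real) as in Eq. (4) (np.angle uses arctan2)."""
    coeffs = np.fft.fft(imf)   # the c_{j,k} of Eq. (3)
    return np.angle(coeffs)

t = np.linspace(0, 1, 256, endpoint=False)
imf = np.sin(20 * np.pi * t)   # synthetic IMF stand-in
print(phase_spectrum(imf)[:5])
```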
It is expected that the last phase spectra behave in a more deterministic way than the first ones (Rios and De Mello, 2016). In addition to this visual inspection, a complementary approach is to quantify the MI between the phase spectra of consecutive IMFs (Rios and De Mello, 2016; Rios et al., 2015).
The concept of MI was initially introduced by Shannon (Shannon, 2001) to quantify the information shared by two variables. The MI between the
continuous variables X and Y is formally defined as:
I(X; Y) = \int_Y \int_X f_{X,Y}(x, y) \log\left( \frac{f_{X,Y}(x, y)}{f_X(x) f_Y(y)} \right) dx \, dy,  (5)
in which f_{X,Y}(x, y) is the joint probability density function (PDF) of the variables X and Y, and f_X(x) and f_Y(y) are the marginal PDFs of X and Y, respectively (Papana and Kugiumtzis, 2008). In practical terms, if X and Y are strictly independent random variables, their MI is zero, because one variable cannot provide any information about the other. If, on the other hand, X and Y are dependent, their MI is large (tending to infinity for fully dependent continuous variables). The advantage of MI is that it captures both linear and nonlinear dependencies between the variables X and Y (Kraskov et al., 2004).
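As an illustration, the snippet below contrasts independent and nonlinearly dependent variables using the k-nearest-neighbor MI estimator of Kraskov et al. (2004) shipped with scikit-learn; note this is a stand-in for the DV estimator discussed next, not the estimator used in this study.

```python
# MI behaves as expected: near zero for independent variables, large for a
# nonlinear dependence that a linear correlation would miss.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(42)
x = rng.normal(size=2000)
y_dep = x ** 2 + 0.1 * rng.normal(size=2000)  # nonlinear dependence on x
y_ind = rng.normal(size=2000)                 # independent of x

print(mutual_info_regression(x.reshape(-1, 1), y_dep)[0])  # large
print(mutual_info_regression(x.reshape(-1, 1), y_ind)[0])  # close to zero
```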
The most accurate technique for calculating the MI between continuous random variables is the one proposed by Darbellay and Vajda (DV) (Darbellay and Vajda, 1999), which partitions the data space (the axes formed by both variables) into a finite number of non-overlapping rectangular cells, following a space-division criterion (a Chi-square test) designed to achieve functional independence between cells. In this way, the MI estimated on these partitions is closer to the theoretical value (Darbellay and Tichavský, 2000). Applying MI to the phase spectra allows the detection of behavior changes between consecutive IMFs, making it possible to observe whether a congruent behavior between phases exists (Rios and De Mello, 2016). Higher-frequency IMFs tend to have lower MI among themselves, which characterizes the presence of stochastic behavior, whereas MI increases as the next (always lower-frequency) IMF is obtained (Rios and De Mello, 2016). Deterministic IMFs present phase spectra carrying more similar information, as there is a strong entropy dependence between them. The difference between the frequency bandwidths of IMFs is explained by the successive subtractions between the signal and the mean envelope, which is calculated from the local maximum and minimum values of the signal (Rios and De Mello, 2016).
As a consequence, IMFs can be organized into two classes: one with lower-MI IMFs and another containing higher-MI IMFs. Thus, Rios and De Mello (2016) assume that the first class corresponds to stochastic influences, while the second represents deterministic ones. The theoretical basis for EMD acting as a filter bank lies in the Nyquist-Shannon sampling theorem, as demonstrated and detailed by Rios and De Mello (2016).
Takens' embedding theorem reconstructs each observation x(t) of a time series X, for all t = 0, …, T − (m − 1)d, into phase-space coordinates Φ in the form:
ϕt = (x(t), x(t + d), …, x(t + (m − 1)d)), (6)
in which m is the embedding dimension (the number of spatial axes), d is known as the time delay, and ϕ_t corresponds to a point or state in the phase space Φ, i.e., ϕ_t ∈ Φ. To estimate the parameters m and d, one can use strategies such as FNN and AMI, respectively (Fraser and Swinney, 1986; Kennel et al., 1992).
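A minimal numpy sketch of Eq. (6) follows; embed() is our own helper name, and the parameters m and d would in practice come from FNN and AMI estimates.

```python
# Build the phase space of Eq. (6): one row per state phi_t.
import numpy as np

def embed(x, m, d):
    """Return phi_t = (x(t), x(t + d), ..., x(t + (m - 1)d)) for every valid t."""
    n_states = len(x) - (m - 1) * d
    if n_states <= 0:
        raise ValueError("series too short for the chosen m and d")
    return np.column_stack([x[i * d : i * d + n_states] for i in range(m)])

x = np.sin(np.linspace(0.0, 20.0 * np.pi, 500))
phi = embed(x, m=3, d=2)
print(phi.shape)  # (496, 3)
```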
The SLT provides the theoretical basis to ensure limits for supervised learning (Von Luxburg and Schölkopf, 2011), as long as five assumptions hold: (i) the examples (x_i, y_i) ∈ X × Y must be sampled independently and identically distributed (i.i.d.) according to the joint probability distribution P(X × Y); (ii) no assumptions are made about P(X × Y), so any function can be represented from data; (iii) labels can assume non-deterministic values due to the presence of noise in the collected data or errors in the supervision carried out by specialists; (iv) the distribution P(X × Y) is fixed, in order to support a study of uniform convergence according to the Law of Large Numbers; and (v) P(X × Y) is unknown at training time and must be learned. The independence assumption (i) imposes a strong limitation to guarantee learning in time-dependent scenarios, as the SLT relies on the Law of Large Numbers to guarantee learning limits. In order to circumvent this problem, some researchers relax this assumption, at the expense of invalidating all the theoretical results it provides.
The structure of the SLT considers an input space X and an output space Y, in which each x_i ∈ X corresponds to an example (or feature vector) and y_i ∈ Y is the expected class or label. In this context, learning is defined as the process of converging to the best classifier f : X → Y, which provides the least possible error or loss as the training sample size n → ∞. The best classifier is, in fact, given by the function whose regression best represents the joint probability distribution P(X × Y).
Vapnik (2013) used the assumptions (theoretical requirements) described above to employ the Law of Large Numbers and define the Empirical Risk Minimization Principle (ERMP), which guarantees:

P\left( |R(f) - R_{emp}(f)| > \epsilon \right) \to 0, \quad n \to \infty,  (7)
in other words, the empirical risk R_emp(f) of a classifier converges in probability to the risk R(f) (also known as real or expected risk) as the sample size n tends to infinity, considering ε ∈ ℝ₊. The empirical risk corresponds to the error on a sample, defined as follows:

R_{emp}(f) = \frac{1}{n} \sum_{i=1}^{n} \ell(x_i, y_i, f(x_i)),  (8)

in which ℓ(·) is a loss function, i.e., capable of measuring the error between the result of f(x_i) and its expected value y_i; the expected risk R(f), in turn, corresponds to the expected value E(ℓ(X, Y, f(X))) taken over P(X × Y).
Eq. (7) formalizes the concept of generalization, |R(f) − R_emp(f)|, which measures whether a classifier f produces on unseen examples an error similar to the one observed on sample (typically training) data. That is, a classifier with good generalization is not necessarily one providing low risk, but one whose empirical risk is a good estimator of the expected risk (Pagliosa and Mello, 2017).
Vapnik (2013) defined an upper bound for P( \sup_{f \in F} |R(f) - R_{emp}(f)| > \epsilon ), in order to ensure learning for any supervised algorithm, as long as it operates on a sufficiently restricted space of admissible functions F. It is then proved under which conditions the following inequality converges uniformly:

P\left( \sup_{f \in F} |R(f) - R_{emp}(f)| > \epsilon \right) \leqslant 2 P\left( \sup_{f \in F} |R_{emp}(f) - R'_{emp}(f)| > \epsilon \right) \leqslant 2 N(F, 2n) e^{-n\epsilon^2/4},  (10)
in which two empirical risks R_emp(f) and R'_emp(f) are considered, computed on two different samples, each of size n, drawn from the same joint distribution P(X × Y). Uniform convergence of the worst classifier f ∈ F contained in the learning bias is then obtained, as the sample size n → ∞, if and only if the subspace F is characterized by a polynomial shattering coefficient N(F, 2n), which is the same as saying that the subspace is sufficiently restricted to solve the target problem. This coefficient (or function) must grow polynomially, otherwise the ERMP becomes inconsistent.
This is one of the most important formal foundations (if not the most important) of ML, as it assures the learning conditions for supervised algorithms (Vapnik, 2013; Mello and Ponti, 2018).
From the following definition:
\delta = 2 N(F, 2n) e^{-n\epsilon^2/4},  (11)
ε can be isolated as:

\epsilon = \sqrt{ \frac{4}{n} \left( \log(2 N(F, 2n)) - \log \delta \right) },  (12)
which allows us to assess the absolute divergence between sampling errors calculated on different samples, in the form:

|R_{emp}(f) - R'_{emp}(f)| \leqslant \sqrt{ \frac{4}{n} \left( \log(2 N(F, 2n)) - \log \delta \right) }.  (13)
Therefore, it is possible to induce classifiers or regressors for supervised learning problems and apply them on k test samples, in order to obtain a set of empirical errors {R_emp^1(f_1), R_emp^2(f_2), …, R_emp^k(f_k)}, in which each f_i is the classifier or regressor converged on the i-th training set and tested on the corresponding i-th test set containing unseen examples. From the calculation of all divergences |R_emp^i(f_i) − R_emp^j(f_j)|, for every i ≠ j, a set of measures is obtained which allows estimating a probability distribution characterizing the ε values; in other words, it estimates the divergence of these empirical risks from the expected risk R(f).
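The following sketch illustrates this divergence analysis, assuming the k empirical risks were already measured over k train/test splits; the values below are hypothetical.

```python
# Summarize all pairwise divergences |R_emp^i(f_i) - R_emp^j(f_j)|, i != j,
# which characterize the distribution of epsilon values.
import numpy as np
from itertools import combinations

def risk_divergences(empirical_risks):
    """All pairwise absolute divergences among k empirical risks."""
    return np.array([abs(ri - rj) for ri, rj in combinations(empirical_risks, 2)])

risks = [0.12, 0.15, 0.11, 0.14, 0.13]  # hypothetical empirical risks (k = 5)
eps = risk_divergences(risks)
print(eps.mean(), eps.max())            # summary of the estimated distribution
```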
Aiming to satisfy the first SLT assumption, Pagliosa and Mello (2017) considered a kernel function to produce new multidimensional spaces (also known as phase spaces), in an attempt to make the examples independent from each other and to allow the definition of theoretical learning limits with the SLT. A properly reconstructed phase space assumes that the points in this target space are no longer dependent on each other, and can therefore be treated as identically distributed samples. As already addressed by those authors, Takens' embedding theorem is the best candidate for the kernel function in order to assure supervised learning.
Once Pagliosa and Mello (2017) defined the ideal phase space according to a supervised learning algorithm, in their case the Distance-Weighted Nearest Neighbors (DWNN), they produced the least empirical risks for multiple training and test samples. This study considers the same method to find the proper embedding for our problem of interest, using another supervised learning algorithm though, in our case the MLP, in order to assure learning limits, i.e., so that the inferred regressors produce proper estimators of the expected risk R(f) (Eq. (10)). In other words, it is possible to select the parameters of the kernel function, that is, the time delay d and the embedding dimension m, as well as the number of neurons, minimizing both the empirical risk and ε so as to assure that R_emp(f) is a good estimator of R(f). After defining the embedding, it is verified whether this space is in fact capable of generalizing to data never seen before, which allows the verification of both underfitting and overfitting influences.
The MLP is an ANN typically composed of three consecutive layers of units, a.k.a. neurons, thus forming the architecture:
1. the first is the input layer, responsible for receiving the input data, which contains k = [(m × v) − 1] neurons, having m as the embedding dimension and v as the number of problem variables (e.g. weight, temperature, wind speed);
2. the hidden layer is fully connected to the previous layer and is responsible for building up hyperplanes on the input space to induce some classifier or regression function. The MDDL (Rios and De Mello, 2013) was considered to estimate the sufficient number of neurons for this particular layer;
3. and the output layer, which combines the results from the other layers to produce meaningful classification or regression outputs.
Each MLP architecture is then trained using some loss function and tested on unseen examples that follow the same joint probability distribution of
the problem under consideration.
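A sketch of this regression setup is given below, assuming scikit-learn's MLPRegressor as the MLP implementation; the 5-neuron hidden layer mirrors the configuration reported in this study, while the embedded states used here are random placeholders.

```python
# Train an MLP regressor on phase-space states; each state predicts the
# weight observed `horizon` steps ahead.
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_supervised(phi, horizon=1):
    """Pair each state with the first coordinate of the state `horizon` steps ahead."""
    return phi[:-horizon], phi[horizon:, 0]

phi = np.random.default_rng(0).normal(size=(200, 6))  # placeholder for embed() output
X, y = make_supervised(phi)
mlp = MLPRegressor(hidden_layer_sizes=(5,), max_iter=5000, random_state=0)
mlp.fit(X[:150], y[:150])            # train on the first 150 states
print(mlp.score(X[150:], y[150:]))   # R^2 on unseen states (meaningless for random data)
```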
As a comparison baseline to the approach defined in the previous section, we decided to analyze the problem variables under a stochastic bias, based on the ARIMA and SARIMA models (Box and Jenkins, 1976). These models are characterized by being simple, parsimonious and useful for forecasting with sufficient accuracy. In the following, they are organized into stationary and non-stationary linear models in order to better describe them.
There are three particular cases of stationary linear processes: the autoregressive process of order p, represented by AR(p); the moving-average process of order q, represented by MA(q); and the autoregressive moving-average process of orders p and q, respectively, represented by ARMA(p, q).
1) Autoregressive models - AR(p). An autoregressive model of order p, AR(p), describes a series in terms of its past values plus a random component a(t), also called residue, being denoted by X̃(t) = ϕ_1 X̃(t − 1) + ϕ_2 X̃(t − 2) + … + ϕ_p X̃(t − p) + a(t), in which X̃(t) = X(t) − μ and μ corresponds to the average that determines the series level. The AR(p) equation can also be written as ϕ(B) X̃(t) = a(t), in which ϕ(B) = 1 − ϕ_1 B − ϕ_2 B² − … − ϕ_p B^p is the autoregressive polynomial of order p and B is the backshift operator, e.g. B^m X(t) = X(t − m), which displaces the series m periods into the past.
2) Moving-average models - MA(q). The moving-average model results from a linear combination of white-noise terms occurring in the current and past periods, so it can be described in the form X̃(t) = a(t) − θ_1 a(t − 1) − θ_2 a(t − 2) − … − θ_q a(t − q), or X̃(t) = (1 − θ_1 B − θ_2 B² − … − θ_q B^q) a(t) = θ(B) a(t), having θ(B) as the stationary moving-average operator of order q.
3) Autoregressive and moving-average models - ARMA(p, q). Obtained by combining the autoregressive (AR) and the moving-average (MA) models, in which p corresponds to the autoregressive order and q is associated with the moving-average order. This model can be described in
the form X̃(t) = ϕ_1 X̃(t − 1) + … + ϕ_p X̃(t − p) + a(t) − θ_1 a(t − 1) − … − θ_q a(t − q), or ϕ(B) X̃(t) = θ(B) a(t), in which ϕ(B) and θ(B) are the autoregressive and moving-average polynomials, respectively. A fitting sketch for the three stationary cases is given below.
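The sketch fits the three cases with statsmodels, expressing ARMA(p, q) as ARIMA with d = 0; the AR(1) series is simulated for illustration only.

```python
# Fit AR, MA and ARMA models to a simulated AR(1) series.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.7 * y[t - 1] + rng.normal()    # AR(1) with phi_1 = 0.7

ar_fit = ARIMA(y, order=(1, 0, 0)).fit()    # AR(p), p = 1
ma_fit = ARIMA(y, order=(0, 0, 1)).fit()    # MA(q), q = 1
arma_fit = ARIMA(y, order=(1, 0, 1)).fit()  # ARMA(p, q)
print(ar_fit.params)                        # constant, phi_1, sigma^2
```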
B.2. Non-stationary models
Most real-world series present some sort of non-stationary behavior. Since forecasting procedures assume that observations are stationary, it is necessary to transform original data containing non-stationary terms. The most commonly applied transformation consists of taking successive differences of the original series until a stationary series is obtained. The number of differences d required to make a series stationary is named the integration order (Box et al., 2015). Among the models proposed for the adjustment of this type of data is the autoregressive integrated moving-average model, ARIMA(p, d, q).
1) Autoregressive Integrated Moving Average models - ARIMA(p, d, q). ARIMA(p, d, q) models are defined according to Eq. (14),

\phi(B) \Delta^d X(t) = \theta(B) a(t) \quad \text{or} \quad \phi(B) (1 - B)^d X(t) = \theta(B) a(t),  (14)

in which ϕ(B) = 1 − ϕ_1 B − ϕ_2 B² − … − ϕ_p B^p is the autoregressive operator of order p; Δ^d is the simple difference operator, with d defining the number of differences; θ(B) = 1 − θ_1 B − θ_2 B² − … − θ_q B^q is the moving-average operator of order q; and a(t) is the residue, which can eventually be represented as white noise.
2) Seasonal Autoregressive Integrated Moving Average models - SARIMA(p, d, q)(P, D, Q). When a time series presents a periodic component, it is necessary to add a component representing such seasonality to the model. Two kinds of seasonal models are possible: deterministic and stochastic ones. A seasonal series is stochastic when it presents significant correlations at seasonal lags, in other words, at multiples of the period s; and it is deterministic when it becomes stationary after taking D seasonal differences of the series (Box et al., 2015). There are still other scenarios in which the series presents both characteristics. SARIMA(p, d, q)(P, D, Q) models are described according to Eq. (15),
\phi(B) \Phi(B^s) \Delta^d \Delta_s^D X(t) = \theta(B) \Theta(B^s) a(t),  (15)

in which ϕ(B) = 1 − ϕ_1 B − … − ϕ_p B^p is the autoregressive polynomial of order p; Φ(B^s) = 1 − Φ_1 B^s − … − Φ_P B^{Ps} is the seasonal autoregressive polynomial of order P; Δ^d = (1 − B)^d is the difference operator, where d defines the number of differences needed to remove the trend from the series; Δ_s^D = (1 − B^s)^D is the generalized difference operator, applied to pairs of observations showing sufficient similarities s time steps apart, where D is the number of seasonal differences needed to remove seasonality from the series; θ(B) = 1 − θ_1 B − … − θ_q B^q is the moving-average polynomial of order q; and, finally, Θ(B^s) = 1 − Θ_1 B^s − … − Θ_Q B^{Qs} is the seasonal moving-average polynomial of order Q.
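Both model families can be fitted, for instance, with statsmodels' SARIMAX class; the series and the period s = 7 below are hypothetical, chosen only to exercise the seasonal terms.

```python
# Fit ARIMA(p, d, q) and SARIMA(p, d, q)(P, D, Q)_s models and forecast ahead.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(2)
t = np.arange(300)
y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 7) + rng.normal(size=300)

arima_fit = SARIMAX(y, order=(1, 1, 1)).fit(disp=False)
sarima_fit = SARIMAX(y, order=(1, 1, 1),
                     seasonal_order=(1, 1, 1, 7)).fit(disp=False)
print(sarima_fit.forecast(steps=14)[:3])  # 14-step prediction horizon (first values)
```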
B.3. Autocorrelation and partial autocorrelation functions

In addition to allowing the verification of series stationarity, the ACF and PACF are also used to determine time dependencies. The ACF calculates the correlation of a time series with itself, according to lags (or time delays). It also assists in the estimation of seasonality and of the order q of a moving-average (MA) model. The PACF applies the ACF to X(t) and X(t + k), however removing the linear dependencies found in the intermediate elements X(t + 1), …, X(t + k − 1). The PACF assists in estimating the order p of the AR component.
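As an illustration, both functions can be estimated with statsmodels: for a simulated AR(1) series, the ACF decays slowly while the PACF cuts off after lag 1, suggesting p = 1 and q = 0.

```python
# Estimate the ACF and PACF of a simulated AR(1) series.
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(3)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.7 * y[t - 1] + rng.normal()

print(np.round(acf(y, nlags=5), 2))    # slow geometric decay
print(np.round(pacf(y, nlags=5), 2))   # spike at lag 1, then near zero
```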
B.4. Trend
The adjustment of ARIMA and SARIMA models requires the series to be stationary. A series is considered stationary when its average, variance and autocovariance are invariant over time. As the ARIMA and SARIMA models use previous series lags to represent their behavior, modeling stable series with consistent properties involves less uncertainty. The Augmented Dickey-Fuller (ADF) test is a formal statistical test to check the stationarity of a series, also referred to as a unit root test. The null hypothesis assumes the series is non-stationary. The ADF tests whether the change in X(t) can be explained by a lagged value or by a linear trend. If the contribution of the lagged value to the variation of X(t) is not significant and a trend component is present, the series is non-stationary and the null hypothesis is not rejected.
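A short sketch of the ADF test using statsmodels follows; the random walk below is a textbook non-stationary example, so H0 should not be rejected.

```python
# Run the ADF unit-root test on a simulated random walk.
import numpy as np
from statsmodels.tsa.stattools import adfuller

walk = np.cumsum(np.random.default_rng(4).normal(size=500))
stat, pvalue = adfuller(walk, regression="ct")[:2]  # "ct": constant plus trend
print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")
if pvalue > 0.05:
    print("H0 not rejected: difference the series before fitting ARIMA/SARIMA")
```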
B.5. Seasonality
As it happens with the trend, seasonality or periodicity constitutes another form of non-stationarity and must be estimated and removed from the series, with Ŝ(t) denoting the estimate of the seasonal component S(t). The seasonally adjusted series, considering an additive model, is then defined as:

X^{SA}(t) = X(t) - \hat{S}(t).  (16)
Fisher (1929) proposed a test to check the presence of deterministic seasonality, based on the analysis of a number of observations through a periodogram, which describes the observed values of a series as an overlap of sine waves at different frequencies. Its most common practical application aims to identify cyclical or periodic components. According to Priestley (1989), the periodogram is given by:

I_p(f_i) = \frac{2}{n} \left[ \left( \sum_{t=1}^{n} a_t \cos(2\pi f_i t) \right)^2 + \left( \sum_{t=1}^{n} a_t \sin(2\pi f_i t) \right)^2 \right],  (17)

in which 0 < f_i < 1/2; t = 1, 2, …, n; and I_p(f_i) is the intensity or magnitude at the frequency f_i = i/n. A periodicity of period 1/f_i can be observed through the existence of a peak at the frequency f_i.
Fisher Test. To test the null hypothesis (H0), which states the non-existence of seasonality, the following statistic is used:
g = \frac{\max_p I_p}{\sum_{p=1}^{N/2} I_p},  (18)
in which I_p is the value of the periodogram, Eq. (17), at period p, and N is the number of observations in the series. The critical value of Fisher's test, z_α, is given by:
z_\alpha = 1 - \left( \frac{\alpha}{n} \right)^{1/(n-1)},  (19)

in which n = N/2 and α is the significance level of the test. If g > z_α, H0 is rejected; in other words, the series presents periodicity p.
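The sketch below computes the test directly from an FFT-based periodogram, following our reading of Eqs. (17)-(19); the period-7 sinusoid is synthetic, used only for illustration.

```python
# Fisher's test for deterministic seasonality from the FFT periodogram.
import numpy as np

def fisher_test(x, alpha=0.05):
    x = np.asarray(x, dtype=float) - np.mean(x)
    N = len(x)
    spectrum = np.abs(np.fft.fft(x)) ** 2 / N       # periodogram ordinates
    Ip = spectrum[1 : N // 2]                       # frequencies 0 < f_i < 1/2
    g = Ip.max() / Ip.sum()                         # Eq. (18)
    n = len(Ip)                                     # approximately N/2
    z_alpha = 1.0 - (alpha / n) ** (1.0 / (n - 1))  # Eq. (19)
    period = N / (np.argmax(Ip) + 1)                # period 1/f_i of the peak
    return g, z_alpha, period

t = np.arange(364)
x = np.sin(2 * np.pi * t / 7) + np.random.default_rng(5).normal(0, 0.5, 364)
g, z, period = fisher_test(x)
print(g > z, round(period, 1))  # True, ~7.0: H0 of no seasonality is rejected
```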
The Box-Pierce test checks whether the residue is white noise, based on the first k estimates r̂_k of the noise autocorrelations (Box and Pierce, 1970). The test statistic is given by:

Q(k) = n (n + 2) \sum_{j=1}^{k} \frac{\hat{r}_j^2}{n - j}.  (20)

If the adjusted model is appropriate, Q(k) ∼ χ²_{k−p−q}, in which k is the number of lags, p is the order of the autoregressive part of the model, and q is the order of the moving-average part; the white-noise hypothesis is then accepted if Q(k) is smaller than the critical value of the χ²_{k−p−q} distribution.
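In practice this check is available in statsmodels, which reports the Box-Pierce statistic alongside its Ljung-Box refinement.

```python
# Whiteness check on residuals at lags 10 and 20.
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

residuals = np.random.default_rng(6).normal(size=300)  # ideally white noise
table = acorr_ljungbox(residuals, lags=[10, 20], boxpierce=True)
print(table)  # columns lb_stat, lb_pvalue, bp_stat, bp_pvalue
```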
Among the methods that can be used for model selection, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) stand out. According to McElreath (2020), the identification of the model is carried out by means of the differenced series. The AIC and BIC can be respectively calculated using Eqs. (21) and (22):

AIC(p, q) = \ln \hat{\sigma}^2_{p,q} + \frac{2 (p + q)}{N}  (21)

and

BIC(p, q) = \ln \hat{\sigma}^2_{p,q} + \frac{(p + q) \ln N}{N},  (22)

in which \hat{\sigma}^2_{p,q} corresponds to the variance of the model estimated by the maximum likelihood method, and p + q is the number of parameters of the estimated model (p is the order of the autoregressive part and q is the order of the moving-average part).
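A sketch of order selection via a small grid search follows, keeping the (p, q) pair with the lowest AIC; fitted statsmodels models expose .aic and .bic directly.

```python
# Grid search over (p, q) orders, selecting the model with minimum AIC.
import itertools
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(7)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.7 * y[t - 1] + rng.normal()

best = None
for p, q in itertools.product(range(3), range(3)):
    fit = SARIMAX(y, order=(p, 0, q)).fit(disp=False)
    if best is None or fit.aic < best[0]:
        best = (fit.aic, p, q)
print(f"best (p, q) by AIC: ({best[1]}, {best[2]}), AIC = {best[0]:.1f}")
```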
In this section, we present several techniques for the mathematical comparison of the models. These techniques are necessary to support decisions and to demonstrate the success of a model, providing evidence to promote its acceptance and use for certain purposes. For both approaches employed in this paper, we decided to use the prediction horizon as a basis to define the minimum number of observations to be predicted and analyzed in the experimental stages.
C.1. Prediction horizon

The phase space Φ of a dynamical system is studied according to its sensitivity to initial conditions, meaning that two infinitely close points, for example ϕ_i, ϕ_{i+1} ∈ Φ, eventually lead to significantly different orbits over time (Kathleen et al., 1997). In this context, the separation rate between the theoretical trajectories produced for both points after k iterations is usually found using the Jacobian matrix J^k(ϕ_i) for any point ϕ_i. More precisely,
with λ being the greatest eigenvalue of J^k(ϕ_i) J^k(ϕ_i)^T, the Lyapunov exponent Λ is defined as:

\Lambda = \lim_{k \to \infty} \frac{1}{k} \ln \lambda,  (23)
These nearby points converge to a fixed attractor when Λ < 0, behave conservatively if Λ = 0, or diverge if Λ > 0. In this context, the Lyapunov exponent represents how the trajectories in the phase space behave over time, allowing the estimation of the prediction horizon H in the form:

H = - \frac{\ln( R_{emp}(\Phi) )}{\Lambda},  (24)
according to the number of iterations that can be recursively predicted under a minimum confidence level.
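The sketch below estimates the largest Lyapunov exponent from an embedded series by tracking the average log-divergence of nearest neighbors, a Rosenstein-style shortcut which is our assumption here (the specific estimator used in this study is not restated at this point); Eq. (24) then gives the prediction horizon.

```python
# Estimate the largest Lyapunov exponent as the slope of the mean
# log-divergence between nearest-neighbor trajectories.
import numpy as np
from scipy.spatial.distance import cdist

def largest_lyapunov(phi, k_max=20, theiler=10):
    dists = cdist(phi, phi)
    n = len(phi)
    for i in range(n):  # Theiler window: exclude temporally close neighbors
        dists[i, max(0, i - theiler) : min(n, i + theiler + 1)] = np.inf
    nn = np.argmin(dists, axis=1)
    logdiv = []
    for k in range(1, k_max):
        d = [np.linalg.norm(phi[i + k] - phi[j + k])
             for i, j in enumerate(nn) if i + k < n and j + k < n]
        logdiv.append(np.mean(np.log(np.maximum(d, 1e-12))))
    return np.polyfit(np.arange(1, k_max), logdiv, 1)[0]  # slope = Lambda

phi = np.column_stack([np.sin(np.linspace(0, 30, 400)),
                       np.cos(np.linspace(0, 30, 400))])
lam = largest_lyapunov(phi)
print(lam)  # near zero for this periodic orbit
# H = -np.log(R_emp) / lam would then give the horizon of Eq. (24)
```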
C.2. Hurst exponent

The Hurst exponent (κ) is a measure of the long-term memory of a time series, quantifying the amount by which the series deviates from a random walk. This scalar value shows the relative tendency of a time series either to cluster in a particular direction (a trending pattern, positive or negative) or to regress strongly to the mean (a mean-reverting pattern). The Hurst exponent (Black, 1965) is obtained by isolating κ in Eq. (25):
E\left[ \frac{R(N)}{S(N)} \right] = c N^{\kappa},  (25)

where R(N) is the range of the first N cumulative deviations from the mean; S(N) is the SD of the first N observations; E(·) is the expected value; N is the time span of the observation (number of data points in the time series); and c is a constant. The Hurst exponent always ranges between 0 and 1 and, based on the value of κ, the time series can be classified as follows (an estimation sketch is given after the list):
• κ < 0.5 – Mean-reverting (anti-persistent) series. A value closer to 0 means a stronger mean-reversion process. In practical terms, a high value tends to be followed by a low value, and vice versa.
• κ = 0.5 – Geometric random walk. The series can go either way, and no clear deduction is possible from this value.
• κ > 0.5 – Trending (persistent) series. A value closer to 1 means a stronger trending pattern, with the trend likely to continue. Generally, a high value tends to be followed by a higher value.
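The sketch estimates κ by rescaled-range (R/S) analysis, fitting Eq. (25) in log-log space; the window sizes below are arbitrary choices.

```python
# Rescaled-range (R/S) estimation of the Hurst exponent kappa.
import numpy as np

def hurst_rs(x, window_sizes=(16, 32, 64, 128, 256)):
    x = np.asarray(x, dtype=float)
    log_n, log_rs = [], []
    for w in window_sizes:
        rs = []
        for start in range(0, len(x) - w + 1, w):  # non-overlapping windows
            seg = x[start : start + w]
            dev = np.cumsum(seg - seg.mean())      # cumulative deviations
            r, s = dev.max() - dev.min(), seg.std()
            if s > 0:
                rs.append(r / s)
        log_n.append(np.log(w))
        log_rs.append(np.log(np.mean(rs)))
    return np.polyfit(log_n, log_rs, 1)[0]         # slope = kappa

noise = np.random.default_rng(8).normal(size=4096)
print(round(hurst_rs(noise), 2))  # near 0.5 for uncorrelated noise
```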
C.3. MDDL

As discussed throughout this study, the risk of a classifier f is defined by using a loss function ℓ(x_i, y_i, f(x_i)), so that the best classifier reaches the lowest expected risk R(f). Nevertheless, the loss function must be defined according to the objective of the problem, its space and restrictions. For example, the 0−1 loss function (Eq. (26)) and the squared-error function (Eq. (27))
\ell(x_i, y_i, f(x_i)) = \begin{cases} 1 & \text{if } f(x_i) \neq y_i, \\ 0 & \text{otherwise}, \end{cases}  (26)

\ell(x_i, y_i, f(x_i)) = (y_i - f(x_i))^2  (27)
may be inadequate to compare time-dependent data, mainly due to non-stationarities and trends.
Therefore, all the experimental results presented in this paper were evaluated based on the MDDL (Rios and De Mello, 2016), which aligns two time series according to their similarities using the Dynamic Time Warping (DTW), producing a path in a matrix of Euclidean distances between observations. Then, the MDDL calculates the average distance of this path to the diagonal line, providing a single dissimilarity measure between any two time series. In this particular case, the MDDL is used to measure the dissimilarity between the observations predicted by the proposed approaches and the expected observations. As an advantage, the MDDL summarizes the divergence between these series considering their best matching over time, instead of a traditional Euclidean distance or the DTW cost itself, as discussed in Rios and De Mello (2016).
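The sketch below reflects our reading of the MDDL: a textbook DTW alignment is computed and the mean distance of the warping path to the diagonal is returned; the reference implementation is the one by Rios and De Mello (2016).

```python
# DTW alignment followed by the mean path-to-diagonal distance (MDDL reading).
import numpy as np

def mddl(a, b):
    n, m = len(a), len(b)
    dist = np.abs(np.subtract.outer(a, b))  # pairwise distances
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):               # DTW dynamic programming
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    i, j, path = n, m, []                   # backtrack the warping path
    while i > 1 or j > 1:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.append((0, 0))
    # distance of each path point (i, j) to the diagonal j = i (equal-length series)
    return float(np.mean([abs(i - j) / np.sqrt(2.0) for i, j in path]))

predicted = np.sin(np.linspace(0, 6, 50))
expected = np.sin(np.linspace(0, 6, 50) + 0.3)  # slightly shifted series
print(round(mddl(predicted, expected), 3))
```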
C.4. MAPE
MAPE expresses accuracy as a percentage of the error and is calculated using Eq. (28):

MAPE = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{f(x_i) - y_i}{y_i} \right|,  (28)

in which y_i is the observed value, f(x_i) is the forecast, and n is the number of observations. Note that, as f(x_i) approaches y_i, the MAPE approaches zero, i.e., the smaller the MAPE, the better the fit.
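A direct implementation of Eq. (28), assuming nonzero observed values:

```python
# Mean Absolute Percentage Error.
import numpy as np

def mape(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 / len(y_true) * np.sum(np.abs((y_pred - y_true) / y_true))

print(mape([100, 110, 120], [98, 113, 119]))  # ~1.85: a close fit
```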
References

Aiken, V.C.F., et al., 2020. Forecasting beef production and quality using large-scale integrated data from Brazil. J. Animal Sci. 98, skaa089. https://doi.org/10.1093/jas/skaa089.
Albertini, T.Z., et al., 2016a. Beeftrader (part i): optimal economical endpoint identification using mixed modeling approach decreases greenhouse gases emission and other pollutants for livestock farmers. In: Second International Symposium on Greenhouse Gases in Agriculture, pp. 182–186.
Albertini, T.Z., et al., 2016b. Beeftrader (part iii): meat industry opportunity to improve its profitability reducing greenhouse gases emissions and pollutants based on optimal economical endpoint identification. In: Second International Symposium on Greenhouse Gases in Agriculture, pp. 191–194.
Alonso, J., et al., 2013. Support vector regression to predict carcass weight in beef cattle in advance of the slaughter. Comput. Electron. Agric. 91, 116–120.
Alonso, J., et al., 2015. Improved estimation of bovine weight trajectories using support vector machine classification. Comput. Electron. Agric. 110, 36–41. https://doi.org/10.1016/j.compag.2012.08.009.
Andersen, K.G., et al., 2020. The proximal origin of SARS-CoV-2. Nature Med. 26, 450–452.
Baldwin, R.L., 1995. Modeling ruminant digestion and metabolism. Springer Science & Business Media.
Biase, A.G., et al., 2016a. Beeftrader: optimal economical endpoint maximization decision support system for feedlots and meat packers. ASAS-CSAS Annual Meeting & Trade Show, p. 307.
Biase, A.G., et al., 2016b. Beeftrader (part ii): optimal economical endpoint identification using nonparametric bootstrapping technique decreases greenhouse gases emission and other pollutants in feedlots. In: Second International Symposium on Greenhouse Gases in Agriculture, pp. 187–191.
Biase, A.G., et al., 2017. Parametrization of the Davis growth model using data of crossbred zebu cattle. Scientia Agricola 74, 8–17. https://doi.org/10.1590/1678-992x-2015-0284.
Black, R., et al., 1965. Long-term storage: an experimental study. Constable.
Box, G., Jenkins, G., 1976. Time Series Analysis: forecasting and control. Holden-Day, San Francisco.
Box, G., Pierce, D., 1970. Distribution of residual auto-correlations in autoregressive-integrated moving average time series models. J. Am. Stat. Assoc. 65, 1509–1529.
Box, G., et al., 2015. Time series analysis: forecasting and control. John Wiley and Sons.
Büyükşahin, Ü.Ç., Ertekin, Ş., 2019. Improving forecasting accuracy of time series data using a new ARIMA-ANN hybrid method and empirical mode decomposition. Neurocomputing 361, 151–163.
Bywater, A., et al., 1988. Modelling animal growth. Math. Comput. Simulat. 30, 165–174.
Cominotte, A., et al., 2020. Automated computer vision system to predict body weight and average daily gain in beef cattle during growing and finishing phases. Livestock Sci. 232, 103904.
Cui, J., et al., 2019. Origin and evolution of pathogenic coronaviruses. Nat. Rev. Microbiol. 17, 181–192. https://doi.org/10.1098/rsos.160498.
Darbellay, G.A., Tichavský, P., 2000. Independent component analysis through direct estimation of the mutual information. In: Proceedings of 2nd International Workshop on ICA and Blind Source Separation, IEEE Signal Processing Society, Finland, pp. 69–74.
Darbellay, G.A., Vajda, I., 1999. Estimation of the information by an adaptive partitioning of the observation space. IEEE Trans. Inf. Theory 45, 1315–1321.
Di Marco, O., et al., 1989. Simulation of DNA, protein and fat accretion in growing steers. Agric. Syst. 29, 21–34.
FAO, 2009. How to feed the world in 2050.
FAO, 2020. Covid-19 impacts driving up acute hunger in countries already in food crisis. URL: http://www.fao.org/news/story/pt/item/1307458/icode/.
FAPESP, 2019. Smarttrato: computer vision and artificial intelligence platform to improve feed management based on animal behaviour. URL: https://bv.fapesp.br/en/auxilios/102976/smarttrato-computer-vision-and-artificial-intelligence-plataform-to-improve-feed-management-based-o/.
FAPESP, 2020a. Smartus: artificial intelligence and machine vision for precision livestock feeding. URL: https://bv.fapesp.br/en/auxilios/106579/smartus-artificial-intelligence-and-machine-vision-for-precision-livestock-feeding/.
FAPESP, 2020b. Virtualvet: intelligence platform for early identification of physiological and environmental disorders in beef cattle. URL: https://bv.fapesp.br/en/auxilios/105967/virtualvet-intelligence-platform-for-early-identification-of-physiological-and-environmental-disord/.
Fisher, R., 1929. Tests of significance in harmonic analysis. Proc. Roy. Soc. 125, 54–59.
France, J., et al., 1987. A model of nutrient utilization and body composition in beef cattle. Animal Sci. 44, 371–385.
Fraser, A.M., Swinney, H.L., 1986. Independent coordinates for strange attractors from mutual information. Phys. Rev. A 33, 1134.
Friant, S., et al., 2020. Eating bushmeat improves food security in a biodiversity and infectious disease "hotspot". EcoHealth 17, 125–138. https://doi.org/10.1007/s10393-020-01473-0.
Gemmeke, J.F., et al., 2010. Compressive sensing for missing data imputation in noise robust speech recognition. IEEE J. Sel. Top. Signal Process. 4, 272–287. https://doi.org/10.1109/JSTSP.2009.2039171.
Gill, M., 1984. Modelling the partition of nutrients for growth, pp. 75–79.
Hastie, T., et al., 2009. The elements of statistical learning: data mining, inference, and prediction. Springer Science & Business Media.
Hoch, T., Agabriel, J., 2004. A mechanistic dynamic model to estimate beef cattle growth and body composition: 1. Model description. Agric. Syst. 81, 1–15.
Huang, N.E., et al., 2003. A confidence limit for the empirical mode decomposition and Hilbert spectral analysis. Proc. Roy. Soc. London, Series A: Math. Phys. Eng. Sci. 459, 2317–2345.
INMET, 2021. Dados históricos anuais. URL: https://portal.inmet.gov.br/.
Karesh, W.B., Noble, E., 2009. The bushmeat trade: increased opportunities for transmission of zoonotic disease. Mount Sinai J. Med.: J. Translat. Personalized Med. 76, 429–434. https://doi.org/10.1002/msj.20139.
Kathleen, T., et al., 1997. Chaos: An introduction to dynamical systems. Phys. Today, 67–68.
Kennel, M.B., et al., 1992. Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys. Rev. A 45, 3403.
Keogh, K., et al., 2021. Effect of plane of nutrition in early life on the transcriptome of visceral adipose tissue in Angus heifer calves. Sci. Rep. 11, 1–12.
Kraskov, A., et al., 2004. Estimating mutual information. Phys. Rev. E 69, 066138.
Kurpiers, L.A., et al., 2016. Bushmeat and emerging infectious diseases: lessons from Africa. In: Problematic Wildlife. Springer, pp. 507–551. https://doi.org/10.1007/978-3-319-22246-2_24.
Maheswari, B.U., et al., 2021. ARIMA versus ANN - a comparative study of predictive modelling techniques to determine stock price. In: Proceedings of the Second International Conference on Information Management and Machine Intelligence, Springer, pp. 315–323.
Mayer, D., et al., 2013. Integrating stochasticity into the objective function avoids Monte Carlo computation in the optimisation of beef feedlots. Comput. Electron. Agric. 91, 30–34. https://doi.org/10.1016/j.compag.2012.11.006.
Mayilsamy, K., et al., 2021. Modeling of a simplified hybrid algorithm for short-term load forecasting in a power system network. COMPEL-Int. J. Comput. Math. Electrical Electronic Eng.
McElreath, R., 2020. Statistical rethinking: A Bayesian course with examples in R and Stan. CRC Press.
McPhee, M., et al., 2007. Parameter estimation of fat deposition models in beef steers. DNA 4, 115–120.
McPhee, M., et al., 2014. Beefspecs fat calculator to assist decision making to increase compliance rates with beef carcass specifications: evaluation of inputs and outputs. Animal Prod. Sci. 54, 2011–2017.
Mello, R.F.d., Ponti, M.A., 2018. Machine learning: a practical approach on the statistical learning theory. Springer. https://doi.org/10.1007/978-3-319-94989-5.
Metcalfe, A.V., Cowpertwait, P.S., 2009. Introductory time series with R. Springer.
Myers, M.F., et al., 2000. Forecasting disease risk for increased epidemic preparedness in public health. Adv. Parasitol. 47, 309–330.
NRC, 1976. Nutrient requirements of beef cattle, 6th ed. National Academies Press.
NRC, 2000. Nutrient requirements of beef cattle, 7th ed. National Academies Press.
NRC, 2011. Nutrient requirements of beef cattle. National Academies Press.
NRC, 2016. Nutrient requirements of beef cattle. National Academies Press.
Oddy, V., et al., 1997. Understanding body composition and efficiency in ruminants: a non-linear approach. Recent Adv. Animal Nutrit. Australia 11, 209–222.
Oltjen, J., et al., 1986. Development of a dynamic model of beef cattle growth and composition. J. Animal Sci. 62, 86–97.
Oltjen, J., et al., 2000. Second-generation dynamic cattle growth and composition models. Modelling nutrient utilization in farm animals, 197.
Pagliosa, L.C., Mello, R.F., 2017. Applying a kernel function on time-dependent data to provide supervised-learning guarantees. Expert Syst. Appl. 71, 216–229. https://doi.org/10.1016/j.eswa.2016.11.028.
Papana, A., Kugiumtzis, D., 2008. Evaluation of mutual information estimators on nonlinear dynamic systems. arXiv preprint arXiv:0809.2149.
Priestley, M., 1989. Spectral analysis and time series. Academic Press, London.
Qiao, Y., et al., 2021. Intelligent perception for cattle monitoring: A review for cattle identification, body condition score evaluation, and weight estimation. Comput. Electron. Agric. 185, 106143.
Rahagiyanto, A., Adhyatma, M., et al., 2021. A review of morphometric measurements techniques on animals using digital image processing. Food Agric. Sci.: Polije Proc. Series 3, 67–72.
Ravindra, B., Hagedorn, P., 1998. Invariants of chaotic attractor in a nonlinearly damped system. J. Appl. Mech. 65, 875–879. https://doi.org/10.1115/1.2791926.
Rios, R.A., De Mello, R.F., 2013. Improving time series modeling by decomposing and analyzing stochastic and deterministic influences. Signal Process. 93, 3001–3013.
Rios, R.A., De Mello, R.F., 2016. Applying empirical mode decomposition and mutual information to separate stochastic and deterministic influences embedded in signals. Signal Process. 118, 159–176. https://doi.org/10.1016/j.sigpro.2015.07.003.
Rios, R.A., et al., 2015. Estimating determinism rates to detect patterns in geospatial datasets. Remote Sens. Environ. 156, 11–20.
Ripple, W., et al., 2016. Bushmeat hunting and extinction risk to the world's mammals. Royal Soc. Open Sci. 3, 1–16. https://doi.org/10.1098/rsos.160498.
Roseiro, G., et al., 2017. Beef cattle body weight prediction using time series. Encontro Mineiro de Estatística (MGEST).
Sainz, R., et al., 2006. Growth patterns of Nellore vs. British beef cattle breeds assessed using a dynamic, mechanistic model of cattle growth and composition. In: Kebreab, E., Dijkstra, J., Bannink, A. (Eds.), pp. 160–170. https://doi.org/10.1079/9781845930059.0160.
Salawu, E.O., et al., 2014. Using artificial neural network to predict body weights of rabbits. Open J. Animal Sci. 2014.
Sarout, B.N.M., et al., 2018. Assessment of circadian rhythm of activity combined with random regression model as a novel approach to monitoring sheep in an extensive system. Appl. Animal Behav. Sci. 207, 26–38.
Selemetas, N., et al., 2015. The effects of farm management practices on liver fluke prevalence and the current internal parasite control measures employed on Irish dairy farms. Veterinary Parasitol. 207, 228–240.
Shannon, C.E., 2001. A mathematical theory of communication. ACM SIGMOBILE Mobile Comput. Commun. Rev. 5, 3–55.
Soboleva, T., et al., 1999. A dynamical model of body composition in sheep. In: Proceedings-New Zealand Society of Animal Production, New Zealand Society of Animal Prod Publ., pp. 275–278.
Takens, F., 1981. Detecting strange attractors in turbulence. In: Dynamical Systems and Turbulence. Springer-Verlag, Lecture Notes in Mathematics, pp. 366–381.
Tedeschi, L.O., et al., 2004. A decision support system to improve individual cattle management. 1. A mechanistic, dynamic model for animal growth. Agric. Syst. 79, 171–204.
Tullo, E., et al., 2019a. Cattle segmentation and contour extraction based on Mask R-CNN for precision livestock farming. Comput. Electron. Agric. 165, 104958.
Tullo, E., et al., 2019b. Environmental impact of livestock farming and precision livestock farming as a mitigation strategy. Sci. Total Environ. 650, 2751–2760.
Vapnik, V., 2013. The nature of statistical learning theory. Springer Science and Business Media.
Von Luxburg, U., Schölkopf, B., 2011. Statistical learning theory: models, concepts, and results. In: Handbook of the History of Logic, pp. 651–706.
Williams, C., Jenkins, T., 2003. A dynamic model of metabolizable energy utilization in growing and mature cattle. II. Metabolizable energy utilization for gain. J. Animal Sci. 81, 1382–1389.
Williams, C.B., Bennett, G.L., 1995. Application of a computer model to predict optimum slaughter end points for different biological types of feeder cattle. J. Anim. Sci. 73, 2903–2915.
Zhang, Y., et al., 2021. Effects of low and high levels of maternal nutrition consumed for the entirety of gestation on the development of muscle, adipose tissue, bone, and the organs of Wagyu cattle fetuses. Animal Sci. J. 92, e13600.