www.nature.com/scientificreports

OPEN

Heuristic based federated learning with adaptive hyperparameter tuning for households energy prediction
Liana Toderean1, Mihai Daian1, Tudor Cioara1, Ionut Anghel1, Vasilis Michalakopoulos2,
Efstathios Sarantinopoulos2 & Elissaios Sarmas2
Federated Learning is transforming electrical load forecasting by enabling Artificial Intelligence
(AI) models to be trained directly on household edge devices. However, the prediction accuracy of
federated learning models tends to diminish when dealing with non-IID data, highlighting the need for
adaptive hyperparameter optimization strategies to improve performance. In this paper, we propose
a novel hierarchical federated learning solution for efficient model aggregation and hyperparameter
tuning, specifically tailored to household energy prediction. The households with similar energy
profiles are clustered at the edge, linked, and aggregated at the fog level, to enable effective and
adaptive hyperparameter tuning. The federated model aggregation is optimized using hierarchical
simulated annealing optimization to prioritize updates from the better-performing models. A genetic
algorithm-based hyperparameter optimization method reduces the computational load on edge
nodes by efficiently exploring different configurations and using only the most promising ones for
edge nodes’ cross-validation. The evaluation results demonstrate a significant improvement in
average prediction accuracy and better capturing of energy patterns compared to the federated
averaging approach. The impact on network traffic among nodes across different layers is kept below
30 KB. Additionally, hyperparameter tuning reduces the size of model updates and the number of
communication rounds by 30%, which is particularly beneficial when network resources are limited.

Keywords Federated learning, Energy prediction, Hyperparameters optimization, Simulated Annealing-based federated aggregation, Genetic algorithm

Energy prediction is significant for modern power grids, ensuring their efficient operation, mitigating instability,
and optimizing resource allocation and renewable energy source integration1. In recent years, progress has been
made in ML forecasting for energy prediction2,3. The accuracy and reliability of energy forecasts have been
improved by leveraging sophisticated models and large datasets to anticipate demand and supply fluctuations
more precisely. However, large amounts of data are utilized in the training process to create effective prediction
models. Since the household’s energy data contains sensitive information about individuals’ behaviours, ensuring
privacy in learning while still achieving good performance is an open research topic4. Even with strong privacy
and security guarantees, the households’ residents are often reluctant to grant access to their energy data for
storage in centralized cloud silos, where it can be further processed and used for model training purposes5.
Recently, Federated Learning (FL) has emerged as a promising approach in the field of energy prediction,
particularly for electrical load forecasting. It enables local prediction model training on data collected
and stored on household devices at the edge and offers advantages for training models on distributed data,
including improved efficiency and enhanced data privacy. Taïk et al.6 conducted one of the first studies on
electrical load forecasting using edge computing and FL. They employed Long Short-Term Memory (LSTM)
in a federated scenario to predict residential load for 200 houses in Texas. Their approach highlighted the
benefits of personalization through re-training, achieving a 5% performance increase in terms of root mean
square error (RMSE) and mean absolute percentage error (MAPE). Similarly, Liu et al.7 introduced an FL
framework for smart grids, integrating power consumption data with weather features from 60 transformer
stations in Zhuhai, China. This study utilized LSTMs and boosting trees, comparing horizontal and vertical FL
models using mean squared error (MSE) as the performance metric. The work emphasized the importance of
securing power traces in collaborative learning environments.

1Distributed Systems Research Laboratory, Computer Science Department, Technical University of Cluj-Napoca,
G. Barițiu 26-28, Cluj-Napoca 400027, Romania. 2Decision Support Systems Laboratory, School of Electrical &
Computer Engineering, National Technical University of Athens, Ir. Politechniou 9, Athens 157 73, Greece. email:
[email protected]; [email protected]

Scientific Reports | (2025) 15:12564 | https://doi.org/10.1038/s41598-025-96443-3

Further research indicated the diminishing performance of FL when dealing with non-independent and
identically distributed (non-IID) data8,9. This prompted several researchers to experiment with clustering
techniques. Savi et al.10 explored short-term load forecasting (STLF) at the edge, using FL and clustering
methodologies. The prediction model was based on LSTMs and incorporated weather data. They compared
FL with clustering learning in terms of accuracy, impact of clustering, scalability, and communication cost,
with the K-means FL model achieving the best performance in most metrics. Briggs et al.11 conducted a similar
study with an LSTM-based model enhanced with weather data. They tested several scenarios, comparing FL,
centralized learning, local learning, and Hierarchical Clustering (HC). The results showed that FL approaches
outperformed centralized learning but underperformed local learning. However, with a personalization step, FL
and its clustered variant (FL + HC) improved performance by up to 5% over localized learning while maintaining
data privacy. Additionally, FL + HC with fine-tuning significantly reduced computational demands, requiring up
to 10 times fewer samples for optimal model performance. He et al.12 tested residential STLF on 250 households
from Australia, using LSTM models and K-means clustering in a federated setting. The study showcased the
importance of clustering and indicated that FL can be particularly useful for collaborative training in cases of
users with missing historical data. More advanced clustering techniques have been used in13,14. Tun et al.13
implemented bi-directional LSTM models with Ordering Points To Identify the Clustering Structure (OPTICS)
for STLF on data from 22
households in British Columbia. Their comparison between clustered and non-clustered approaches revealed
the benefits of clustering in improving forecast accuracy. Gholizadeh et al.14 introduced hyperparameter-
based clustering for electrical load forecasting on 75 households in Edmonton, comparing FL with centralized
and local learning using RMSE. The results revealed that the clustering method significantly reduced the
convergence time and that FL performed worse than local learning and better than centralized learning in
individual load forecasting. Fernández et al.15 focused on privacy-preserving FL for residential STLF, testing
various architectures and scenarios. Their findings suggest that FL performs worse than centralized learning
in terms of accuracy, although the performance of FL increases proportionally with the number of participating clients.
Additionally, clustering methods enhance forecasting accuracy, while complex model architectures involve high
computational costs and pose risks of overfitting. Duttagupta et al.16 explored lightweight FL for distributed load
forecasting using a feedforward neural network model, demonstrating that lightweight models could indeed
achieve comparable performance to more complex architectures. The experiments highlighted the potential of
FL in reducing computational costs while maintaining accuracy.
A limited number of studies have experimented with variations of the federated aggregation algorithms in
energy prediction. Wang et al.17 introduced SecFedAProx-LSTM, an adaptive FL framework for multiparty
wind power forecasting, based on an LSTM model, a variation of the FedProx framework, and secure aggregation.
Their method demonstrated three key advantages. It provided more accurate and reliable forecasts compared
to Multilayer Perceptron, Convolutional Neural Network, Recurrent Neural Network, and Gated recurrent unit
(GRU) models and achieved faster convergence and improved accuracy in the presence of statistical heterogeneity
compared to FedProx, especially as the number of clients increased. Additionally, it ensured privacy without
requiring a third party for key generation, using Decentralized Multi-Client Functional Encryption for secure
aggregation. Fekri et al.18 experimented with two federated aggregation algorithms: FedSGD and FedAVG. Both
achieved higher accuracy than individual and central models for one-hour forecasting, with FedAVG slightly
better. For 24-hour forecasting, FedAVG outperformed all methods, while FedSGD had convergence issues. The
approach maintained high accuracy even when new smart meters joined post-training. Some approaches aim to
ensure a more efficient federated model aggregation. Hu Y. et al.19 propose an aggregation method that considers
the characteristics of individual datasets of the training nodes, enabling participants to make element-wise
contributions to improve the learning performance and convergence speed. Hu Z. et al. propose in20 a multi-
objective optimization approach for FL that converges to Pareto stationary solutions. The aggregation algorithm
considers individual objectives and the overall collaborative objective. Chifu et al.21 introduced FedWOA, a FL
model for predicting renewable energy production using time series data from local prosumer nodes. Utilizing
the Whale Optimization Algorithm (WOA) to aggregate LSTM model weights, FedWOA addresses data
heterogeneity and variations in generation patterns. With K-means clustering for non-IID data management,
FedWOA improved prediction accuracy by 25% for MSE and 16% for mean absolute error (MAE) compared to
FedAVG, demonstrating good convergence and reduced loss. This approach enables precise forecasts for small-
scale energy prosumers through decentralized data and collaborative global model optimization.
Finally, the hyperparameters of local models may significantly impact the performance of FL for energy
prediction. Improving hyperparameter selection such as learning rate, batch size, or number of epochs and
dynamically adjusting them can increase convergence speed and enhance the learning of local models22.
However, communication overhead and convergence speed between the edge devices and the cloud server may
affect the prediction accuracy and training efficiency23. Heuristic-based approaches are often used to find the
optimal hyperparameter settings, as they explore large search spaces efficiently by balancing exploration
and exploitation in finding the optimal configuration22. Kundroo et al.24 highlight the importance of selecting
the appropriate configuration of hyperparameters for both model performance and training efficiency. In their
case, the clients are responsible for hyperparameter optimization, by dynamically adjusting the learning rate and
number of epochs according to the model training loss. Qolomany et al. propose a Particle Swarm Optimization
algorithm for hyperparameter tuning of deep long short-term memory models25. The number of communication
rounds needed to find the best solution is reduced compared to a grid search method. Al-Wesabi et al.26 use the
Pelican Optimization Algorithm to fine-tune the hyperparameters of a belief network for attack detection on
local IoT devices. A heuristic approach for hyperparameter tuning was applied for spiking neural networks
in27. This type of neural network has many hyperparameters, and the Cuckoo Search Algorithm, Grasshopper
Optimization Algorithm, and Polar Bears Algorithm were tested for their optimization. The Orchard meta-heuristic
optimization algorithm is proposed by Bukhari et al. in28 for hyperparameter tuning of an FL model that predicts
photovoltaic power generation. The optimization problem solutions are composed of architectural information
for the proposed Conv-SGRU model, the learning rate, and the dropout rate. Michalakopoulos et al.29 propose a federated
framework for collaborative model training across decentralized prosumer energy data without compromising
sensitive information. They leverage clustering algorithms that utilize the models’ hyperparameters as the input
space and integrate the differential privacy aggregator. The privacy-preserving transfer learning for short-term
building energy consumption predictions is addressed in30. The federated model learns transferable knowledge,
and the hyperparameter fine-tuning is performed during the training phase using a grid search algorithm
to find the optimal configuration regarding model architecture, learning rate, and the used optimizer. The grid
search algorithm is also used for hyperparameter selection in31 in different FL settings for residential energy
consumption prediction.
The paper explores a novel hierarchical FL solution for households’ energy consumption prediction that
incorporates clustering techniques, simulated annealing (SA), and genetic algorithms (GAs) for efficient model
aggregation and hyperparameter tuning. We address the challenge of effective and adaptive hyperparameter
tuning for heterogeneous energy profiles by using a clustering technique. Similar energy profiles are grouped and
linked for aggregation at the fog level. The GA efficiently explores the hyperparameter configurations, selecting
and sending only the most promising ones to the validation nodes for evaluation. Additionally, there is a need
for effective hyperparameter tuning methods that can scale to numerous households and massive datasets. These
methods should be capable of handling the diverse FL deployments and consider the limited computational
resources available at the edge. To address this gap, a hierarchical SA optimization is used as an efficient
aggregation method at the fog and cloud layers. The method improves performance by prioritizing updates from
the better-performing models and enhances training efficiency by focusing on early updates. Finally, the
GA-based hyperparameter optimization process reduces the computational effort of edge nodes by using only one
hyperparameter configuration at a time for training and validation. In this way, we address significant challenges
in FL, such as optimizing the communication between edge devices and the fog/cloud to reduce overhead, while
maintaining the prediction performance of the global model. This is especially relevant for households’
energy consumption prediction, where the energy data is non-IID and a node with a larger dataset and higher
energy profile magnitude should not necessarily have a greater influence on the global model. Additionally, it is
important to consider that, especially in the early stages, some prediction models may perform poorly on edge
nodes but still contribute positively to the global model.
The remainder of the paper is structured as follows: the Methods section introduces the proposed FL solution
for households’ energy prediction, the Results section details the evaluation and validation results, and the
Conclusion section summarizes the paper and highlights future work.

Methods
Figure 1 presents the proposed three-layer FL architecture for energy consumption prediction of a set of
households, $h \in H$. The edge nodes are gateway devices located in buildings, which are used to train
local prediction models on the data stored locally. These devices then send updates of their learned models to
the upper fog layer. Since households have different energy consumption profiles with varying patterns and
amplitudes, their effective grouping into distinct clusters is important for prediction accuracy. For this purpose,
we have used our clustering solution from32, with one change: the extra features related to peak demand hours
are removed, as they play no role in understanding the time series patterns that we are trying to categorize.
Therefore, each fog device is associated with a cluster $c \in C$ of households $H^c$, with
$\bigcup_{c \in C} H^c = H$, enabling them to contribute to a shared prediction model on the fog layer. The top
cloud layer is responsible for efficiently aggregating the fog layer updates into a global prediction model.
A round of communication from the top-layer cloud to the fog clusters, from each cluster to its households,
and back represents an iteration. We denote by $K$ the total number of iterations needed to complete
the training of the global federated model. The top layer is responsible for initializing and storing the global
weights $w(k)$ after each iteration $k \in K$, and a set of hyperparameter configurations of the global model, $\psi$.
Also, $\lambda$ is a cumulative hyperparameter for the cloud model and $\alpha(k)$ is the computed performance of the
global model on iteration $k$. Each fog layer cluster, $c \in C$, has a set of hyperparameter configurations $\psi^c$ from
which it selects the best configuration $\varphi^c_{best}(k)$ and sends it to the edge layer. The cumulative hyperparameter
of the cluster model is denoted as $\lambda^c$ and its performance on iteration $k$ as $\alpha^c(k)$. Additionally, the
cluster-associated vector of weights on iteration $k$, $w^c(k)$, is updated by aggregating the weights received from each
edge node. Finally, the household edge nodes are responsible for the training and validation of the model. They
receive the initial weights and configuration from the fog, then update the model and evaluate its performance
considering the current configuration of the hyperparameters. The performance of the updated model on iteration $k$ is
denoted as $\alpha^c_p(k)$. The computed weights on the prosumer node are $w^c_p(k)$.
For each cluster $c$, the edge nodes $H^c$ are split into training nodes $H^c_t$ and validation nodes $H^c_v$, such that
$H^c = H^c_t \cup H^c_v$. We define the learning of the global federated model as a multi-objective optimization
problem. On the edge layer, for each training node $h^c_p(k) \in H^c_t$ the objective is to minimize the loss on its
training data set $D_{h_p}$, given the weights of the local model $w^c_p(k)$ and the best hyperparameter configuration
$\varphi^c_{best}(k)$ sampled from the set of fog configurations. The objective function is expressed as:

$$f_{obj}^{h^c_p}\left(\varphi^c_i(k)\right) = \min_{w^c_p(k) \in \mathbb{R}^d} Loss_{\varphi^c_{best}(k)}\left(D_{h_p}, w^c_p(k)\right), \quad h_p \in H^c_t \quad (1)$$

Fig. 1. Layered FL architecture for energy prediction.

On the fog layer, the objective at each cluster is to minimize the sum of the losses computed on both training
and validation edge nodes. This involves minimizing the total loss from all household nodes in the cluster
by aggregating the weights from edge nodes within the cluster and selecting the optimal hyperparameter
configuration for training. The objective function is:
$$f^c_{obj} = \min_{\varphi^c_{best}(k) \in \psi^c,\; w^c(k) \in \mathbb{R}^d} \sum_{h^c_p \in H^c} f_{obj}^{h^c_p}\left(\varphi^c_{best}(k)\right) \quad (2)$$

where $c$ is the cluster of edge nodes, $\psi^c$ is the set of hyperparameter configurations for the cluster and
$w^c(k)$ the set of edge models in the cluster.
The cloud layer’s global objective is to minimize the overall loss on all edge nodes by efficiently aggregating
the updates received from the fog nodes:
$$f^g_{obj} = \min_{\cup\, w^c(k) \in \mathbb{R}^d} f^c_{obj}, \quad c \in C \quad (3)$$

In other words, the optimization problem is to efficiently aggregate the model weights both on fog and cloud
layers and to find the best hyperparameter configuration of nodes such that the sum of edge node training and
validation losses is minimized.

Federated learning methodology


Fig. 2 presents the computational and communication steps involved in the FL process. First, the cloud
initializes the weights and sets the current temperature for the simulated annealing (SA) process to the maximum
temperature (1). The fog nodes also initialize the GA population $\psi^c$ with $\psi_{size}$ chromosomes (2).
The following steps are repeated for $K$ iterations to achieve the overall established objective. The current
global weights $w(k)$ and the current temperature $T_{current}$ are broadcast to all the fog nodes $c \in C$ (3).
The model weights at the fog level from the previous round, $w^c(k-1)$, are updated with the weights received
from the cloud (4). Each fog node randomly selects a validation node $h^c_v$ from its connected edge nodes $H^c$ (5).
The hyperparameter tuning process consists of updating the population (6.1) and communicating
with the validation edge node to evaluate the chromosomes. The chromosomes $\varphi^c_i(k)$ are selected for
evaluation with a given probability and are sent to the validation edge node together with the aggregated weights
from the previous round (6.2). The validation edge node evaluates the hyperparameter configuration represented by
the chromosome on the given weights and its validation data (6.3) and sends the fitness score back to the fog
node (6.4). The detailed GA, including the population update process involving offspring generation, removal of
the worst candidates, and fitness score computation, is described in the Hyperparameters tuning section. The
fog nodes select the best chromosome from the population based on the fitness score (6.5). The weights and the
best-selected chromosome are broadcast to all the training edge nodes $h^c_p$ from $H^c_t$ (7). Each edge node
trains the model with the given hyperparameter configuration on its dataset $D_{h_p}$ (8) and sends the updated
weights $w^c_p(k)$ and its performance $\alpha^c_p(k)$ to the fog node (9). Using the SA process, the fog node
aggregates the received updates (10) and sends the aggregated model weights $w^c(k)$ and performance $\alpha^c(k)$
to the cloud (11). Finally, the cloud aggregates the model updates received from the fog nodes (12), and the process
is repeated for the remaining iterations.

Fig. 2. Sequence diagram of the FL process.
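
The control flow of steps (3) to (12) can be sketched in Python as follows. This is only an illustrative outline, not the authors' implementation: the function names (`fog_round`, `cloud_iteration`, `mean_aggregate`), the dictionary layout of a cluster, and the `evaluate`, `train`, and `aggregate` callables are all hypothetical. For brevity the sketch skips the GA population update (6.1), evaluates every chromosome instead of a probabilistic subset, and uses plain averaging as a stand-in where the paper uses SA aggregation.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Weights = List[float]
Config = Tuple[float, ...]        # one chromosome (hyperparameter tuple)
Update = Tuple[Weights, float]    # (model weights, performance)


def fog_round(
    w_global: Weights,
    train_nodes: Sequence[str],
    validation_nodes: Sequence[str],
    population: Sequence[Config],
    evaluate: Callable[[str, Config, Weights], float],
    train: Callable[[str, Config, Weights], Update],
    aggregate: Callable[[List[Update], Weights], Update],
    rng: random.Random,
) -> Update:
    """Steps (4)-(10) for one fog node; every callable is a hypothetical stand-in."""
    w_c = list(w_global)                                   # (4) adopt the cloud weights
    h_v = rng.choice(list(validation_nodes))               # (5) pick a validation node
    # (6.2)-(6.4): simplified, every chromosome is scored by its validation loss.
    losses = [evaluate(h_v, cfg, w_c) for cfg in population]
    best = population[losses.index(min(losses))]           # (6.5) lowest loss wins
    updates = [train(h, best, w_c) for h in train_nodes]   # (7)-(9) local training
    return aggregate(updates, w_c)                         # (10) cluster-level aggregation


def cloud_iteration(
    w_global: Weights,
    clusters: Sequence[Dict[str, list]],
    evaluate: Callable[[str, Config, Weights], float],
    train: Callable[[str, Config, Weights], Update],
    aggregate: Callable[[List[Update], Weights], Update],
    rng: random.Random,
) -> Update:
    """One iteration k: broadcast (3), fog rounds (4)-(11), cloud aggregation (12)."""
    fog_updates = [
        fog_round(w_global, c["train"], c["val"], c["population"],
                  evaluate, train, aggregate, rng)
        for c in clusters
    ]
    return aggregate(fog_updates, w_global)


def mean_aggregate(updates: List[Update], w_prev: Weights) -> Update:
    """Plain averaging, used here only as a placeholder for the SA aggregation."""
    n = len(updates)
    weights = [sum(u[0][j] for u in updates) / n for j in range(len(w_prev))]
    return weights, max(u[1] for u in updates)
```

Passing the aggregation rule as a callable keeps the two-level structure visible: the same rule is applied once per cluster at the fog layer and once more at the cloud layer.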

Prediction models aggregation


We have defined an SA33-based aggregation solution considering the model performance and allowing for a larger
exploration space in the early stages. SA searches for the optimal solution by accepting solutions that are worse
than the current one with a probability that is higher at the beginning of the process and decreases over time,
controlled by a temperature parameter. Therefore, as the federated learning process progresses, the probability of
considering models with lower accuracy in aggregation decreases.
The models are updated based on local performance and previous participation, specifically how early they
provided their solutions, using a cumulative hyperparameter (see Algorithm 1). Nodes that contributed to the
global model more promptly are rewarded with a higher weight in the aggregation process. The algorithm takes as
input a set of weights $w_i(k)$ and performances $\alpha_i(k)$, the cumulative hyperparameter from the previous
round $\lambda$, the aggregated model weights $w_{agg}(k-1)$ and performance $\alpha_{agg}(k-1)$ from the previous
round, as well as the current temperature $T_{current}$ (line 2).


Algorithm 1: SA Aggregation

The method returns a new set of aggregated weights, the performance of the aggregated model, and the
updated cumulative hyperparameter. First, the algorithm computes two factors: $\mu_{prev}$, based on the current
cumulative hyperparameter, and $\mu_{new}$, based on the current temperature (line 5). Afterward, for each set of
weights (line 6), the difference $\nabla E$ between the performances of the previous aggregated model and the current
update is computed, and a random number $\gamma$ is selected between 0 and 1 (lines 7–8). The update is aggregated
if its performance is higher than that of the previous aggregated model or, otherwise, with a probability determined
by $\gamma$, $\nabla E$, $T_{current}$ and a constant $k_B$ (lines 9–11). The weighting factors of the new weights and
of the aggregated model are given by $\mu_{prev}$ and $\mu_{new}$, and the cumulative hyperparameter is updated with
$\mu_{new}$. Finally, the performance of the aggregated weights is computed as the maximum between the previous
performance of the aggregated model and the performances of all the updated models (line 14). The Boltzmann
constant $k_B$ is used to work with the Boltzmann probability distribution: the random value $\gamma$ is compared
against the probability that the system is found in a state with a performance difference $\nabla E$, so that,
depending on the temperature, the search moves toward better states or accepts random ones.
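
A minimal Python sketch of this acceptance-and-blend logic is given below. It follows the textual description only, since the listing of Algorithm 1 is not reproduced here, and it rests on several labeled assumptions: the formulas `mu_prev = lam` and `mu_new = t_current / t_max`, the normalized weighted average used for blending (with `mu_prev` applied to the previous aggregate), the Metropolis-style acceptance test `gamma < exp(-dE / (k_b * T))`, and the parameter `t_max` are all hypothetical choices. The constant `k_b` defaults to 1.0 as a scaling factor, because the physical Boltzmann constant would make the acceptance probability vanish for performance values of ordinary magnitude.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Weights = List[float]


def sa_aggregate(
    updates: Sequence[Tuple[Weights, float]],   # (w_i(k), alpha_i(k)) pairs
    w_agg_prev: Weights,                        # w_agg(k-1)
    alpha_agg_prev: float,                      # alpha_agg(k-1), higher is better
    lam: float,                                 # cumulative hyperparameter
    t_current: float,
    t_max: float,                               # hypothetical: maximum temperature
    k_b: float = 1.0,                           # assumption: unit scaling constant
    rng: Optional[random.Random] = None,
) -> Tuple[Weights, float, float]:
    if t_current <= 0 or t_max <= 0:
        raise ValueError("temperatures must be positive")
    rng = rng or random.Random()

    # Assumed forms of the two factors: the real formulas are in Algorithm 1.
    mu_prev = lam
    mu_new = t_current / t_max

    w_agg = list(w_agg_prev)
    for w_i, alpha_i in updates:
        delta_e = alpha_agg_prev - alpha_i      # positive when the update is worse
        gamma = rng.random()
        # Accept better updates always, worse ones with a temperature-dependent chance.
        accepted = alpha_i > alpha_agg_prev or gamma < math.exp(-delta_e / (k_b * t_current))
        if accepted:
            total = mu_prev + mu_new
            w_agg = [(mu_prev * a + mu_new * b) / total for a, b in zip(w_agg, w_i)]
            lam += mu_new                       # cumulative hyperparameter grows

    # Performance is the best value seen among the previous model and all updates.
    alpha_agg = max([alpha_agg_prev] + [alpha for _, alpha in updates])
    return w_agg, alpha_agg, lam
```

Under these assumptions a high temperature both raises the chance of accepting a weaker update and gives accepted updates more influence, which matches the stated intent of exploring widely in early rounds.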

Hyperparameters tuning
The GA34 is used to find the best configuration of the hyperparameters for each fog node corresponding to a cluster.
The population is initialized with a set of hyperparameter configurations
$\psi^c = \{\varphi^c_1, \varphi^c_2, \varphi^c_3, \ldots, \varphi^c_{\psi_{size}}\}$,
where $\varphi^c_i$ is the $i$th chromosome of the population:

$$\varphi^c_i = \left(\eta_i, batch_i, epoch_i, P_i, N_{ft,i}\right) \quad (4)$$

The genes represent hyperparameters that significantly influence model performance in federated energy
prediction tasks. The learning rate $\eta \in [10^{-4}, 10^{-2}]$ is tuned to find a balance between stable convergence
and faster training; the batch size $batch \in [16, 128]$ allows for exploring different trade-offs between
computational efficiency and capturing complex consumption patterns; the number of epochs $epoch \in [1, 100]$
ensures flexibility in fitting seasonal and varying consumption behaviours without overfitting; the early stopping
patience $P \in [1, 20]$ helps to detect convergence and prevent unnecessary training, accommodating data
irregularities; and the number of fine-tuning layers $N_{ft} \in [1, 10]$ controls how much of the pre-trained model
is adapted to local conditions. For population initialization, $\psi_{size}$ individuals are randomly generated with
each hyperparameter value drawn from its defined range, enabling a broad search space for discovering effective
configurations.
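
The chromosome of Eq. (4) and the random initialization can be illustrated with a short Python sketch. The ranges are the ones stated above; the `Chromosome` type, the function names, and the sampling distributions are assumptions, since the text only says that values are drawn from their ranges. In particular, drawing the learning rate log-uniformly is a common convention and not something the paper specifies.

```python
import math
import random
from typing import List, NamedTuple


class Chromosome(NamedTuple):
    """One hyperparameter configuration (eta, batch, epoch, P, N_ft)."""
    learning_rate: float      # eta in [1e-4, 1e-2]
    batch_size: int           # batch in [16, 128]
    epochs: int               # epoch in [1, 100]
    patience: int             # P in [1, 20]
    n_finetune_layers: int    # N_ft in [1, 10]


def random_chromosome(rng: random.Random) -> Chromosome:
    # Assumption: log-uniform sampling for the learning rate, uniform integers otherwise.
    log_lr = rng.uniform(math.log10(1e-4), math.log10(1e-2))
    lr = min(max(10.0 ** log_lr, 1e-4), 1e-2)   # clamp against rounding at the bounds
    return Chromosome(
        learning_rate=lr,
        batch_size=rng.randint(16, 128),
        epochs=rng.randint(1, 100),
        patience=rng.randint(1, 20),
        n_finetune_layers=rng.randint(1, 10),
    )


def init_population(psi_size: int, rng: random.Random) -> List[Chromosome]:
    """Generate psi_size random individuals for one fog node."""
    return [random_chromosome(rng) for _ in range(psi_size)]
```

For example, `init_population(10, random.Random(42))` yields ten configurations whose fields all lie within the stated ranges.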
The GA-based hyperparameter tuning is defined in Algorithm 2. It receives the current temperature $T_{current}$
from SA_AGG, the population of chromosomes $\psi^c$, the validation node $h^c_v(k)$ and the current cluster-level
aggregated weights $w^c(k)$. First, the candidates for crossover, $\varphi^c_{p1}$ and $\varphi^c_{p2}$, are selected as
the best two hyperparameter configurations in the population (line 6). The new offspring $\varphi_{o1}$ and
$\varphi_{o2}$ are generated by crossover between the selected candidates (line 7) and are added to the survivor
population (line 8). Single-point crossover is used for offspring generation, which involves swapping segments of two
parent chromosomes at a random point. We set the crossover probability to 60%, meaning that for a given
pair of parents there is a 60% chance that crossover is applied to produce offspring. If $\varphi_{p1}$ and
$\varphi_{p2}$ are the parent chromosomes, $\varphi_{o1}$ and $\varphi_{o2}$ are the offspring chromosomes, $r$ is the
crossover point and $n$ is the length of the chromosomes, then the formula is:

$$\varphi_{o1} = \left(\varphi_{p1}[0:r],\ \varphi_{p2}[r:n]\right), \quad \varphi_{o2} = \left(\varphi_{p2}[0:r],\ \varphi_{p1}[r:n]\right) \quad (5)$$
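
Eq. (5) maps directly onto tuple slicing, as the following Python sketch shows. The function names are hypothetical, and two details are assumptions not stated in the text: the crossover point is drawn from 1 to n-1 so that both parents always contribute genes, and when crossover is not applied the parents are returned unchanged as the offspring.

```python
import random
from typing import Tuple

Genes = Tuple[float, ...]


def single_point_crossover(p1: Genes, p2: Genes, r: int) -> Tuple[Genes, Genes]:
    """Eq. (5): swap the tails of two parents at crossover point r."""
    if len(p1) != len(p2):
        raise ValueError("parents must have the same length")
    o1 = p1[:r] + p2[r:]
    o2 = p2[:r] + p1[r:]
    return o1, o2


def crossover(p1: Genes, p2: Genes, rng: random.Random,
              p_cross: float = 0.6) -> Tuple[Genes, Genes]:
    """Apply single-point crossover with probability p_cross (60% in the paper)."""
    if rng.random() >= p_cross:
        return p1, p2                      # assumption: parents are copied unchanged
    r = rng.randint(1, len(p1) - 1)        # assumption: r excludes the two endpoints
    return single_point_crossover(p1, p2, r)
```

With parents `(1, 2, 3, 4, 5)` and `(10, 20, 30, 40, 50)` and `r = 2`, the first offspring keeps the first two genes of the first parent and takes the remaining three from the second.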


Algorithm 2: GA Hyperparameter Tuning Method

For the mutation process, each gene in the offspring has a 3% probability of being changed to a random value
from its domain. After the generation of the offspring, the new population $\psi^c_{new}$ is obtained by replacing the
two chromosomes with the lowest fitness scores with the newly generated offspring (lines 8–9). Only some of
the chromosomes from the population are selected in the current iteration to be evaluated on the validation edge
node $h^c_v(k) \in H^c_v$ (lines 10–15). The probability that a chromosome $\varphi^c_i$ is selected is given by a
randomly generated value (line 11), its current fitness score $f(\varphi^c_i)$, the constant $k_B$ and the temperature
$T_{current}$ (line 12). For each selected chromosome in the new population, the randomly chosen validation edge
node $h^c_v(k) \in H^c_v$ receives the current cluster-level aggregated weights $w^c(k)$ and the hyperparameter
configuration corresponding to the chromosome to compute the fitness score. The fitness score is determined by
computing the loss of fitting the model with the received weights and hyperparameters (line 13). If a chromosome is
not selected for evaluation, its previous fitness is kept. Finally, the algorithm returns the new population (line 16).
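
The mutation and replacement steps can be sketched in Python as below. The names `GENE_SAMPLERS`, `mutate`, and `replace_worst` are hypothetical, and the per-gene samplers reuse the ranges of Eq. (4) with assumed distributions. The sketch treats fitness as "higher is better" (for instance a negated validation loss) so that the lowest scores identify the worst candidates; this orientation is an assumption, because the text defines fitness through the loss. The temperature-dependent selection of chromosomes for evaluation is left out, since its exact probability formula is given only in Algorithm 2.

```python
import random
from typing import Callable, List, Sequence, Tuple

Genes = Tuple[float, ...]

# Assumed per-gene samplers over the domains of Eq. (4):
# learning rate, batch size, epochs, patience, fine-tuning layers.
GENE_SAMPLERS: List[Callable[[random.Random], float]] = [
    lambda rng: 10.0 ** rng.uniform(-4.0, -2.0),
    lambda rng: rng.randint(16, 128),
    lambda rng: rng.randint(1, 100),
    lambda rng: rng.randint(1, 20),
    lambda rng: rng.randint(1, 10),
]


def mutate(chromosome: Genes, rng: random.Random, rate: float = 0.03) -> Genes:
    """Resample each gene from its domain with probability `rate` (3% in the paper)."""
    return tuple(
        sampler(rng) if rng.random() < rate else gene
        for gene, sampler in zip(chromosome, GENE_SAMPLERS)
    )


def replace_worst(population: Sequence, fitness: Sequence[float],
                  offspring: Sequence) -> list:
    """Drop the lowest-fitness individuals and append the offspring.

    Assumption: higher fitness is better, so the lowest scores are the worst.
    """
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    worst = set(order[:len(offspring)])
    survivors = [c for i, c in enumerate(population) if i not in worst]
    return survivors + list(offspring)
```

Because only two individuals are replaced per iteration, the population changes slowly, which keeps the number of configurations that need fresh evaluation on the validation node small.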

Results and discussion


The dataset used for evaluation contains energy consumption readings from over 4000 London households, with
a subset of these households participating in a time-of-use demand response program35. The data is recorded at
30-minute intervals between November 2011 and February 2014 and provides insights into energy consumption
patterns, tariffs, and responses to price signals. Figure 3 illustrates the hourly average energy consumption of
households over the data collection period. Distinct groups of houses can be identified based on their energy
usage levels. Additionally, there are significant peaks in energy consumption during the day, primarily occurring
early in the morning and in the evening. These peaks can be attributed to the unique consumption patterns of
each household.
The monitored data often exhibit imperfections, such as incompleteness, inconsistency, and inaccuracy, as
well as errors, outliers, or missing values. To improve data quality and uncover meaningful relationships within

Fig. 3. Hourly energy consumption data for each household (recorded daily across the dataset).

Scientific Reports | (2025) 15:12564 | https://doi.org/10.1038/s41598-025-96443-3 7



| Type | Feature | Description |
|---|---|---|
| Timestamp | Hour, minute; day, month; weekday | Extracted from the monitoring date and encoded as the sine and cosine of their values |
| Statistical | Rolling mean value; rolling maximum/minimum value | Computed on the consumption values over time windows of sizes 3 and 6 |

Table 1. Input features for federated energy prediction model.

Fig. 4. Households’ energy profiles analysis: (a) Number of households by daily energy consumption range
and (b) Hourly energy consumption.

the dataset, a data cleaning process was undertaken before data analysis. Initially, data points with missing or
erroneous values were removed to ensure data integrity, resulting in a final sample of 4,438 households for this
study.
For solution evaluation, a wide array of features was considered to capture various aspects of energy
consumption patterns. These features are categorized into several groups, each contributing uniquely to the
predictive power of the federated model. We considered temporal features such as the hour of the day, day of the
week, and month of the year to capture daily, weekly, and seasonal patterns in energy consumption. To capture
short-term trends and variability, statistical features such as the rolling mean (moving average) and the rolling
maximum and minimum values were used (see Table 1).
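A minimal sketch of how such features could be derived is given below. The periods chosen for the cyclic encoding and the helper names are assumptions made for illustration; the day-of-month field is omitted because its period varies by month, and the window sizes 3 and 6 follow Table 1.

```python
import math
from datetime import datetime

def cyclic(value, period):
    """Encode a periodic value as a (sine, cosine) pair."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def timestamp_features(ts):
    """Sine/cosine encodings of the calendar fields of one reading."""
    return {
        "hour": cyclic(ts.hour, 24),
        "minute": cyclic(ts.minute, 60),
        "weekday": cyclic(ts.weekday(), 7),
        "month": cyclic(ts.month - 1, 12),
    }

def rolling(values, window, fn):
    """Apply fn to each trailing window; positions without a full window are None."""
    return [
        fn(values[i - window + 1:i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]

def mean(xs):
    return sum(xs) / len(xs)

readings = [0.2, 0.4, 0.9, 0.5, 0.3, 0.1, 0.6]  # illustrative kWh values, not dataset values
features = timestamp_features(datetime(2013, 7, 10, 6, 0))
stats = {
    w: {
        "mean": rolling(readings, w, mean),
        "max": rolling(readings, w, max),
        "min": rolling(readings, w, min),
    }
    for w in (3, 6)
}
```

The sine/cosine pair keeps neighbouring time points close in feature space, so that 23:30 and 00:00 are not treated as far apart.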
Figure 4 presents an overview of the daily energy consumption of households from the dataset. Figure 4 (a)
shows the distribution of households in the dataset by their daily energy consumption range. Most households
in the dataset have an average daily energy consumption that falls within the interval of 0 to 10 or 10 to 20 kWh/
day. The hourly average energy consumption, computed across all households, is illustrated in Fig. 4 (b).
Figure 5 (a) presents the monthly average energy consumption, computed across all households, with the seasons
shown in different colours. The lowest energy consumption occurs during the summer months (yellow) and the
highest during the winter (blue). Figure 5 (b)
presents a heatmap of the average energy consumption for each day of the week and how it varies based on the
month. The colour intensity from the heatmap indicates the value of the energy consumption, from blue (high)
to light yellow (low).
We have clustered the household prosumers based on their energy profile features using the methodology
presented in [31]. In the process, Min-Max normalization was applied, which scales all values to the range 0.0 to
1.0. Specifically, the minimum value of each feature is transformed to 0, the maximum value to 1, and all other
values to a decimal between 0 and 1. This normalization step is crucial for mitigating the impact of varying data
magnitudes on the subsequent clustering analyses, thereby preventing associated biases. The data preparation
process thus makes the clustering analyses more robust and facilitates the use of distance-based metrics in data
exploration. Three clustering algorithms, K-means [36], K-medoids [37], and Hierarchical clustering [38] (listed
as HAC in Table 2), are applied to segment the data based on the features of each load profile. Determining the
optimal number of clusters is challenging, as it typically cannot be known in advance. Therefore, the clustering
algorithms are tested over a predefined range of cluster counts, from 2 to 30. This range is systematically explored
to determine the most appropriate number of clusters using three evaluation metrics: the Silhouette Score (SIL),
the Davies-Bouldin Index (DBI), and the Calinski-Harabasz Index (CHI). Table 2 shows that the optimal number of
clusters for our


Fig. 5. Statistical features analysis: (a) Overall monthly energy consumption and (b) Day of the week energy
consumption by month.

| Clustering algorithm | SIL | DBI | CHI |
|---|---|---|---|
| K-means | 3 | 3 | 3 |
| HAC | 3 | 3 | 3 |
| K-medoids | 2 | 2 | 2 |

Table 2. The optimal number of clusters for the clustering algorithms based on three evaluation metrics.

Fig. 6. SIL scores for every clustering algorithm under the selected range.

case is three. The only exception is K-medoids, for which the optimal number of clusters is two. However, as
discussed in previous studies, K-medoids is not reliable for tasks of this nature, so its results are excluded from
further analysis. The results of K-means and Hierarchical clustering, on the other hand, mostly agree, with only
minor exceptions. Since K-means achieves better scores across all evaluation metrics (SIL, DBI, and CHI), the
labels produced by this algorithm are incorporated into the proposed solution for further assessment.
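The Min-Max normalization applied before clustering can be sketched as follows. The function name and the handling of constant columns are choices made here for illustration, and the sample rows are invented values rather than dataset values.

```python
def min_max_scale(rows):
    """Scale each column of a list of equal-length rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has no spread; mapping it to 0.0 is an assumption.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return scaled

# Two illustrative load-profile features with very different magnitudes.
profiles = [[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]]
normalized = min_max_scale(profiles)
```

After scaling, both features contribute on the same scale to the distance computations used by K-means, K-medoids, and hierarchical clustering.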
Figure 6 illustrates the SIL scores for all clustering algorithms evaluated across the selected range of cluster
numbers, offering a clear comparison of their performance. The figure highlights the consistent superiority of
K-means, as it achieves the highest SIL scores for most of the tested configurations. This trend underscores the
robustness of K-means in identifying well-separated and compact clusters.


Fig. 7. Normalized median values for each cluster during the day (generated using Python 3.12.3 [41] and
Matplotlib [48]).

Fig. 8. Evaluation Setup: Households data and Physical Device Assignments (created with Microsoft
PowerPoint [60]).

Figure 7 presents a visual depiction of the clustering outcomes derived from our methodology, overlaid on
the time-series data. Each cluster is represented by a unique color to enhance visual distinction, with its respective
median trend line displayed in the same color to emphasize the central tendency within the cluster. The clusters
reveal subtle yet meaningful variations in energy consumption patterns, primarily distinguished by the volume
of usage, providing insights into the underlying structure of our dataset. More specifically, Cluster 1 exhibits the
largest magnitude in daytime peaks, reflecting higher activity levels. Cluster 2 shows a moderate level of energy
usage, with peaks smaller than those of Cluster 1, but still pronounced compared to Cluster 0. Despite these
differences, all clusters share a common temporal structure influenced by similar daily cycles across the dataset.
The evaluation setup for each layer in the federated architecture is presented in Fig. 8. The edge devices,
represented by different versions of Raspberry Pi, are mapped to the corresponding households in the dataset.
For each cluster, a fog device (Intel Core i3 with 8 GB RAM) was used for the aggregation and hyperparameter
tuning process. The edge devices are connected to the fog node that represents the cluster to which the consumer
belongs.
We have developed applications for script handling and communication exchanges, each corresponding to
a layer of the federated architecture, using Spring Boot 3.2.5 with Java 17 [40]. Maven is used for dependency
management, and communication among nodes is established through Representational State Transfer
(REST) interfaces. Python 3.12.3 [41] and TensorFlow 2.18 [42] are used for building scripts for data and model
manipulation. The applications and the scripts run in Docker containers deployed on the federated architecture
nodes, providing a virtual environment featuring the following libraries: (i) TensorFlow for creating and
managing models, (ii) Pandas [43] for reading data from comma-separated values (CSV) files and processing it


through feature engineering pipelines, and (iii) Scikit-learn [44] for scaling tasks. Additionally, SciPy [45] was used
for special functions and constants, such as the Boltzmann constant, while Argparse [46] handled argument parsing.
Protobuf [47] was used for building the image, and Matplotlib [48] for generating plots. The GA for hyperparameter
optimization is implemented using the Java library Jenetics 7.0.0 [49] and is deployed in the Docker containers on
the fog nodes. The SA algorithm for model aggregation is implemented from scratch and runs in the Docker containers
on the cloud and fog nodes. For monitoring network traffic and hyperparameters, we used features of the Spring
Framework along with a custom caching mechanism to capture the state of the algorithms across iterations. The
code of our federated solution is available on GitHub [50].
The energy prediction model architecture is designed using the Keras library [51] and is constructed with
sequential layers, the core layer being the LSTM, with ReLU [52] as the activation function. The input consists
of a sequence of 6 features with a sequence length of 48. The first LSTM layer contains 32 units. This value was
determined empirically by comparing the quality of the predictions across repeated trials. A second
LSTM layer with 64 units is then applied. Finally, a Dense layer with 16 units is followed by an output Dense layer
with 1 unit that produces the predicted value. To update the model's weights, we used the Adam optimizer [53] and
Mean Squared Error (MSE) as the loss function.
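As a worked illustration of the model size implied by this architecture, the sketch below counts the trainable parameters layer by layer. It assumes the standard LSTM formulation with four gates and bias terms and no peephole connections, as in the default Keras layer, and Dense layers with bias; the helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """One weight per input-unit pair plus one bias per unit."""
    return input_dim * units + units

layers = [
    ("LSTM(32) on 6 features", lstm_params(6, 32)),
    ("LSTM(64)", lstm_params(32, 64)),
    ("Dense(16)", dense_params(64, 16)),
    ("Dense(1)", dense_params(16, 1)),
]
total = sum(count for _, count in layers)

for name, count in layers:
    print(f"{name}: {count}")
print(f"total trainable parameters: {total}")
```

Under these assumptions the sequence length of 48 does not affect the count, because LSTM weights are shared across time steps.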
Figure 9 reports the prediction accuracy of our FL methodology compared with other state-of-the-art
methods, using the average Mean Squared Error (MSE) for household energy prediction over several iterations
(executed on daily energy profiles from 2013-07-10 to 2013-07-20). Over a series of iterations, we analysed the
performance of the aggregated model at the cloud level, as well as the execution time and the volume of data
transmitted over the network. Compared with FedAVG [39], the hyperparameter tuning method helps the
model converge earlier and, by finding the optimal hyperparameters for training, prevents spikes in the MSE
during iterations.
We have compared the accuracy of our FL energy prediction model for each edge device, representing a
household. The results presented in Table 3 show that, on average, the model outperforms the considered baseline,
the FedAVG algorithm. Our solution effectively captures patterns in household energy profiles
through clustering and hyperparameter tuning. It demonstrates superior performance in scenarios where
FedAVG struggles, such as for the households with device IDs MAC001198 and MAC000321. By introducing
greater variance in the energy prediction data used during training and later in cluster-level cross-validation,
our model generalizes well. For the rest of the households used in testing, it achieves accuracy similar to
FedAVG while minimizing prediction deviations.
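For reference, the three per-household metrics reported in Table 3 can be computed as in the following sketch. The function name and the sample values are illustrative assumptions and are not taken from the dataset.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and the coefficient of determination for one household."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variance of the target
    ss_res = sum(e * e for e in errors)                 # variance left unexplained
    return {"MAE": mae, "RMSE": math.sqrt(mse), "R2": 1.0 - ss_res / ss_tot}

# Illustrative scaled consumption values and predictions.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

RMSE penalizes large individual errors more strongly than MAE, which is why a household with a few badly predicted peaks can show a much larger gap in RMSE than in MAE.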
The execution time and the network traffic are measured over iterations to have an overview of the costs
implied by the integration of the proposed aggregation method and hyperparameter tuning process. Figure 10
shows the execution time for each iteration involving a complete federated energy prediction model update.

Fig. 9. Average prediction accuracy of our federated model compared with state of the art methods.


| Edge Device ID | Our Federated Model: MAE | Our Federated Model: R² | Our Federated Model: RMSE | FedAVG: MAE | FedAVG: R² | FedAVG: RMSE |
|---|---|---|---|---|---|---|
| MAC000434 | 0.044188375 | 0.9792175 | 0.058825875 | 0.01947587 | 0.9962506 | 0.024245411 |
| MAC004505 | 0.088838167 | 0.971611333 | 0.1101665 | 0.01595106 | 0.9980114 | 0.020075945 |
| MAC001441 | 0.099584625 | 0.94392825 | 0.16032175 | 0.018966151 | 0.9900336 | 0.022822501 |
| MAC002451 | 0.094873286 | 0.968307667 | 0.136992286 | 0.020804703 | 0.99762475 | 0.027247075 |
| MAC001326 | 0.0564565 | 0.983215375 | 0.075719875 | 0.013383849 | 0.99841934 | 0.0175527 |
| MAC004290 | 0.056729286 | 0.989875286 | 0.074915429 | 0.015129962 | 0.9926895 | 0.035435524 |
| MAC002163 | 0.111804286 | 0.916483714 | 0.135642 | 0.035224594 | 0.9912712 | 0.10221271 |
| MAC001198 | 0.077307667 | 0.990275167 | 0.104408333 | 0.7507679 | 0.76109415 | 1.3125988 |
| MAC000321 | 0.039661333 | 0.968815 | 0.048596889 | 0.19151235 | 0.8691262 | 0.68717957 |

Table 3. Prediction accuracy of individual households at the edge.

Fig. 10. Execution time for each prediction model update iteration.

Because the genetic heuristic must evaluate many combinations of parameters in the search space, it adds
computational overhead. The time depends on how many chromosomes are selected for validation during the GA
evolution and on how fast the edge nodes respond with the computed performance or the updated model. Additionally,
the increase in the execution time is due to steps that involve additional communication with the household
validation nodes inside the same cluster.
The computational complexity of our solution is influenced by the additional complexity brought by the GA
for hyperparameter optimization and by the simulated annealing (SA) solution for prediction model aggregation.
In the case of the GA, the complexity for each cluster $c$ is directly influenced by the size of the initial population
$|\phi_i^c|$, the number of iterations $I$, and the complexity of the fitness function $O_{f\text{-}GA}$:

$$O\Big(|\phi_i^c| \cdot I \cdot O_{f\text{-}GA}\big(|H^c| + |H_t^c| \cdot (\mathrm{avgsize}(D_{h_p}) + |w_p^c(k)|) \cdot I + |\Phi|\big)\Big), \quad h_p \in H_t^c \qquad (6)$$

where $|H^c|$ is the total number of edge nodes in the cluster, $|H_t^c|$ is the number of training nodes in the cluster,
$\mathrm{avgsize}(D_{h_p})$ is the average dataset size per training node $h_p$, $|w_p^c(k)|$ is the dimensionality of the
model, and $|\Phi|$ is the number of hyperparameter configurations.
The computational cost of SA for model aggregation per cluster $c$ depends on the number of edge devices in
the cluster, $|H^c|$, and on the complexity of the objective function, which is the model aggregation loss:

$$O_{f\text{-}SA}\Big(I \cdot \big(|H^c| \cdot |w_p^c(k)| + |H^c| \cdot \mathrm{avgsize}(D_{h_p})\big)\Big) \qquad (7)$$

Despite the additional complexity brought by the GA and SA algorithms, the execution time for each model
iteration remains within reasonable bounds for solutions requiring day-ahead energy prediction
for energy prosumers. Additionally, the accuracy gains are significant compared to other state-of-the-art
federated models. The complexity could be managed by selecting and sampling only a subset of edge devices or model
parameters to approximate the objective function, reducing the dependence on the number of edge devices per
cluster and on the prediction model dimensionality.
Figure 11 shows the data transmission overhead brought by our federated solution for all the layers. The
quantity of data transmitted at the edge and fog layers is computed as an average across all nodes. The proposed
FL methodology has minimal impact on incoming and outgoing traffic among nodes on different architectural layers,
which is beneficial when network resources are limited, as is the case for edge nodes in smart grids. In our
case, hyperparameter tuning reduces the size of the model updates sent between nodes at the edge and fog layers, as
the GA tunes efficiency-related parameters such as batch size, learning rate, and update frequency. Therefore, the
FL-based solution can scale more effectively across larger energy networks with many households associated with edge
devices without overwhelming the data network infrastructure. Additionally, the low network traffic overhead
of our solution reduces the energy consumption of edge devices, which is particularly important
in households where energy management often overlaps with the integration of smart homes into energy grids.
GA-based hyperparameter optimization minimizes the number of communication rounds required for accurate
household energy prediction. This not only optimizes the use of data network resources but also smooths the
data transmission patterns between nodes, making the data flow in federated prediction model updates more
stable and manageable. Therefore, our federated energy prediction model converges faster, leading to quicker
decision-making on edge and fog devices and contributing to the management of microgrids.
The best fitness score and the diversity from each fog population are represented in Fig. 12. The fitness score
(see Fig. 12a) is computed as the performance of the hyperparameters configuration on the selected validation
node. The fitness score for the best chromosome is more stable in the later iterations, as the algorithm progresses.
This stability reflects a more refined and accurate prediction model as the FL process converges: the federated
model approaches an optimal solution across all household edge nodes, leading to better energy predictions.
The diversity of the population on each fog node (see Fig. 12b) helps prevent premature convergence and ensures
a more robust, globally optimal solution for the federated energy prediction model. The diversity varies based
on local conditions, such as households’ energy data heterogeneity. However, the clustering of households based
on energy profiles and the cross-validation of the model between the edge nodes of the same cluster helps
in exploring a wide solution space. Our federated model explores not only individual households’ patterns
but also broader trends within the cluster widening the solution space, as the model benefits from both local
(individual household) and group (cluster) data patterns. Consequently, different fog nodes can host distinct
local populations of chromosomes, representing local solutions to the energy prediction problem.
To benchmark the energy prediction accuracy of our methodology, we have used the FedAVG,
FedProx, and FedMIME implementations from the TensorFlow Federated framework. FedProx is an extension of
FedAVG that incorporates a regularization term to handle heterogeneous client data and improve stability in
non-IID settings, whilst FedMIME is a personalized federated learning method. The energy consumption values were
scaled using Standard Scaler, the dataset was split into training and testing sets (80%-20%), and the federated

Fig. 11. The volume of network traffic for cloud, fog, and edge: (a) incoming and (b) outgoing.


Fig. 12. Genetic-based hyperparameter tuning: (a) best fitness score and (b) diversity for each fog node over
iterations.

| Metric | FedAVG | FedProx | FedMIME | Our federated model | Relative improvement vs FedAVG (%) | Relative improvement vs FedProx (%) | Relative improvement vs FedMIME (%) |
|---|---|---|---|---|---|---|---|
| MAE | 0.12014 | 0.10431 | 0.50415 | 0.07438 | 38.08 | 28.69 | 85.25 |
| RMSE | 0.24993 | 0.22456 | 0.69997 | 0.10062 | 59.74 | 55.19 | 85.62 |
| R² | 0.95495 | 0.96366 | 0.31188 | 0.96797 | 28.91 | 11.85 | 95.35 |

Table 4. Average prediction accuracy of our solution compared with state-of-the-art aggregation methods.

model was trained over 10 communication rounds. The metrics were computed on the testing set for each client,
using the global model. For FedProx, we set the proximal strength to 0.01 to balance stabilizing updates from
heterogeneous data and allowing local model adaptation, and the Yogi client optimizer was used with a learning
rate of 0.01. For FedMIME, Yogi optimizer was used for both the base and server optimizers, with learning rates
of 0.001 and 0.01, respectively. Table 4 presents the average values of these metrics computed over all
clients, together with the relative improvement of our solution.
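To make the two main baselines concrete, the sketch below shows the sample-weighted averaging step of FedAVG and the proximal penalty that FedProx adds to each client's local loss. It uses flat Python lists as stand-ins for weight tensors and hypothetical function names; it is not the TensorFlow Federated API. The value mu = 0.01 mirrors the proximal strength stated above.

```python
def fedavg(client_weights, client_sizes):
    """Average client weight vectors, weighting each by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]

def fedprox_penalty(local_weights, global_weights, mu=0.01):
    """Proximal term (mu / 2) * ||w - w_global||^2 added to the local loss."""
    squared_distance = sum((l - g) ** 2 for l, g in zip(local_weights, global_weights))
    return 0.5 * mu * squared_distance

# Two illustrative clients holding 1 and 3 samples respectively.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
penalty = fedprox_penalty([1.0, 2.0], [0.0, 0.0])
```

The proximal term grows as a client drifts away from the global model, which is what limits divergence when client data are heterogeneous.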
Our federated model demonstrates consistent performance improvements over FedAVG, FedProx, and
FedMIME across all evaluated metrics. Compared to FedAVG, the MAE decreased on average by 38% and the RMSE
by 59%, while the R² metric improved by 28%. Similarly, the average improvements over FedProx were
28% for MAE, 55% for RMSE, and 11% for R². The prediction performance of FedMIME was worse than that of
FedAVG and FedProx due to its focus on personalization and the relatively small number of training examples
(hourly consumption data over less than one year). Thus, the improvement was higher in this case (over 85%).
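The relative improvements in Table 4 can be re-derived from the reported averages, as in the sketch below. For MAE and RMSE the percentages are consistent with (baseline − ours) / baseline. For R², the reported values appear to match the relative reduction in unexplained variance, 1 − R², rather than the relative change in R² itself; this reading is inferred from the numbers and is not stated in the text. Small differences in the last digit are expected because the inputs are rounded.

```python
def error_reduction(baseline, ours):
    """Relative reduction of an error metric, in percent."""
    return 100.0 * (baseline - ours) / baseline

def r2_improvement(baseline_r2, our_r2):
    """Relative reduction of the unexplained variance (1 - R2), in percent (assumed definition)."""
    return error_reduction(1.0 - baseline_r2, 1.0 - our_r2)

# Averages taken from Table 4.
reported = {
    "MAE": {"FedAVG": 0.12014, "FedProx": 0.10431, "FedMIME": 0.50415, "ours": 0.07438},
    "RMSE": {"FedAVG": 0.24993, "FedProx": 0.22456, "FedMIME": 0.69997, "ours": 0.10062},
    "R2": {"FedAVG": 0.95495, "FedProx": 0.96366, "FedMIME": 0.31188, "ours": 0.96797},
}

improvements = {}
for metric, values in reported.items():
    fn = r2_improvement if metric == "R2" else error_reduction
    improvements[metric] = {
        name: fn(values[name], values["ours"])
        for name in ("FedAVG", "FedProx", "FedMIME")
    }
```

For example, `improvements["RMSE"]["FedAVG"]` evaluates to roughly 59.74, matching the corresponding table entry.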
As a final note, the hierarchical FL methodology and adaptive hyperparameter tuning strategy presented here
are not restricted to energy prediction and can be applied in diverse fields characterized by data decentralization
and privacy concerns. Examples include distributed healthcare analytics (e.g., hospital-level patient data) [54, 55],
language modeling [56, 57], traffic [58], and telecommunications [59] forecasting, among others. In each case, grouping
similar data sources into clusters and adjusting hyperparameters to local conditions enhances performance,
robustness, and scalability. Likewise, the GA-based hyperparameter tuning method is equally domain-agnostic.
It can efficiently search large and complex hyperparameter spaces to identify near-optimal configurations
without requiring explicit assumptions about the underlying data distribution or the nature of the predictive task.
This flexibility makes the proposed approach readily transferable to other fields where FL and hyperparameter
optimization are needed.

Conclusions
The proposed hierarchical federated learning solution for household energy prediction captures household
energy patterns well through clustering and hyperparameter tuning, excelling in scenarios where
FedAVG underperforms, with an average accuracy improvement of about 20%. It ensures good generalization
by introducing greater variance in training and cluster-level cross-validation while achieving accuracy
comparable to FedAVG in scenarios where FedAVG excels (around 4%). Additionally, it outperforms FedProx
and FedMIME, with significant gains in prediction accuracy. The network traffic is kept below 30 KB, and
hyperparameter tuning reduces model update sizes and communication rounds by 30%, making the approach
efficient in resource-constrained networks.


Data availability
All data generated or analysed during this study are included in this published article.

Received: 18 September 2024; Accepted: 28 March 2025

References
1. Sarmas, E. et al. Revving up energy autonomy: A forecast-driven framework for reducing reverse power flow in microgrids.
Sustain. Energy Grids Netw. 38, 101376. https://doi.org/10.1016/j.segan.2024.101376 (Jun. 2024).
2. Aslam, S. et al. A survey on deep learning methods for power load and renewable energy forecasting in smart microgrids.
Renew. Sustain. Energy Rev. 144, 110992. https://doi.org/10.1016/j.rser.2021.110992 (Jul. 2021).
3. Zhu, J. et al. Review and prospect of data-driven techniques for load forecasting in integrated energy systems. Appl. Energy
321, 119269. https://doi.org/10.1016/j.apenergy.2022.119269 (Sep. 2022).
4. Popoola, O. et al. A critical literature review of security and privacy in smart home healthcare schemes adopting IoT &
blockchain: Problems, challenges and solutions. Blockchain: Research and Applications 5 (2), 100178, ISSN 2096-7209.
https://doi.org/10.1016/j.bcra.2023.100178 (2024).
5. Vigurs, C., Maidment, C., Fell, M. & Shipworth, D. Customer privacy concerns as a barrier to sharing data about energy use in
smart local energy systems: A rapid realist review. Energies 14 (5), 1285 (2021).
6. Taïk, A. & Cherkaoui, S. Electrical Load Forecasting Using Edge Computing and Federated Learning. In ICC 2020–2020 IEEE
International Conference on Communications (ICC), pp. 1–6. https://doi.org/10.1109/ICC40277.2020.9148937 (Jun. 2020).
7. Liu, H., Zhang, X., Shen, X. & Sun, H. A federated learning framework for smart grids: Securing power traces in collaborative
learning. arXiv:2103.11870. https://doi.org/10.48550/arXiv.2103.11870 (Nov. 01, 2021).
8. Li, Q., Diao, Y., Chen, Q. & He, B. Federated Learning on Non-IID Data Silos: An Experimental Study. In 2022 IEEE 38th
International Conference on Data Engineering (ICDE), pp. 965–978. https://doi.org/10.1109/ICDE53745.2022.00077 (May 2022).
9. Zhu, H., Xu, J., Liu, S. & Jin, Y. Federated learning on non-IID data: A survey. Neurocomputing 465, 371–390.
https://doi.org/10.1016/j.neucom.2021.07.098 (Nov. 2021).
10. Savi, M. & Olivadese, F. Short-Term energy consumption forecasting at the edge: A federated learning approach. IEEE Access 9,
95949–95969. https://doi.org/10.1109/ACCESS.2021.3094089 (2021).
11. Briggs, C., Fan, Z. & Andras, P. Federated learning for Short-Term residential load forecasting. IEEE Open Access J. Power Energy
9, 573–583. https://doi.org/10.1109/OAJPE.2022.3206220 (2022).
12. He, Y., Luo, F., Ranzi, G. & Kong, W. Short-Term Residential Load Forecasting Based on Federated Learning and Load
Clustering. In 2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids
(SmartGridComm), pp. 77–82. https://doi.org/10.1109/SmartGridComm51999.2021.9632314 (Oct. 2021).
13. Tun, Y. L., Thar, K., Thwal, C. M. & Hong, C. S. Federated Learning based Energy Demand Prediction with Clustered Aggregation.
In IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 164–167.
https://doi.org/10.1109/BigComp51126.2021.00039 (Jan. 2021).
14. Gholizadeh, N. & Musilek, P. Federated learning with hyperparameter-based clustering for electrical load forecasting. Internet
Things 17, 100470. https://doi.org/10.1016/j.iot.2021.100470 (Mar. 2022).
15. Fernández, J. D., Menci, S. P., Lee, C. M., Rieger, A. & Fridgen, G. Privacy-preserving federated learning for residential short-term
load forecasting. Appl. Energy 326, 119915. https://doi.org/10.1016/j.apenergy.2022.119915 (Nov. 2022).
16. Duttagupta, A., Zhao, J. & Shreejith, S. Exploring Lightweight Federated Learning for Distributed Load Forecasting. In 2023 IEEE
International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm),
pp. 1–6. https://doi.org/10.1109/SmartGridComm57358.2023.10333889 (Oct. 2023).
17. Wang, Y. & Guo, Q. Privacy-Preserving and adaptive federated deep learning for multiparty wind power forecasting. IEEE Trans.
Ind. Appl. 1–11. https://doi.org/10.1109/TIA.2024.3430229 (2024).
18. Fekri, M. N., Grolinger, K. & Mir, S. Distributed load forecasting using smart meter data: federated learning with recurrent neural
networks. Int. J. Electr. Power Energy Syst. 137, 107669. https://doi.org/10.1016/j.ijepes.2021.107669 (May 2022).
19. Hu, Y., Ren, H., Hu, C., Deng, J. & Xie, X. An Element-Wise Weights Aggregation Method for Federated Learning. In 2023 IEEE
International Conference on Data Mining Workshops (ICDMW), Shanghai, China, pp. 188–196.
https://doi.org/10.1109/ICDMW60847.2023.00031 (2023).
20. Hu, Z., Shaloudegi, K., Zhang, G. & Yu, Y. Federated Learning Meets Multi-Objective Optimization. IEEE Transactions on
Network Science and Engineering 9 (4), 2039–2051. https://doi.org/10.1109/TNSE.2022.3169117 (July–Aug. 2022).
21. Chifu, V., Cioara, T., Anitiei, C., Pop, C. & Anghel, I. FedWOA: A Federated Learning Model that uses the Whale Optimization
Algorithm for Renewable Energy Prediction. arXiv:2309.10337. https://doi.org/10.48550/arXiv.2309.10337 (Sep. 19, 2023).
22. Raiaan, M. A. K., Sakib, S., Fahad, N. M., Mamun, A. A., Rahman, M. A., Shatabda, S. & Mukta, M. S. H.
A systematic review of hyperparameter optimization techniques in convolutional neural networks. Decis. Analytics J. 11,
ISSN 2772-6622. https://doi.org/10.1016/j.dajour.2024.100470 (2024).
23. Zhou, J., Pal, S., Dong, C. & Wang, K. Enhancing quality of service through federated learning in edge-cloud architecture.
Ad Hoc Netw. 156, ISSN 1570-8705. https://doi.org/10.1016/j.adhoc.2024.103430 (2024).
24. Kundroo, M. & Kim, T. Federated learning with hyper-parameter optimization. J. King Saud University-Computer Inform. Sci. 35
(9), 101740 (2023).
25. Qolomany, B., Ahmad, K., Al-Fuqaha, A. & Qadir, J. Particle Swarm Optimized Federated Learning For Industrial IoT and Smart
City Services. In GLOBECOM 2020–2020 IEEE Global Communications Conference, Taipei, Taiwan, pp. 1–6.
https://doi.org/10.1109/GLOBECOM42002.2020.9322464 (2020).
26. Fahd, N. et al. Pelican optimization algorithm with federated learning driven attack detection model in internet
of things environment. Future Generation Comput. Syst. 148, 118–127, ISSN 0167-739X.
https://doi.org/10.1016/j.future.2023.05.029 (2023).
27. Połap, D. et al. A heuristic approach to the hyperparameters in training spiking neural networks using spike-timing-dependent
plasticity. Neural Comput. Applic. 34, 13187–13200. https://doi.org/10.1007/s00521-021-06824-8 (2022).
28. Bukhari, S. M. S., Moosavi, S. K. R., Zafar, M. H., Mansoor, M., Mohyuddin, H., Ullah, S. S., … Sanfilippo, F. Federated
transfer learning with orchard-optimized Conv-SGRU: A novel approach to secure and accurate photovoltaic power forecasting.
Renewable Energy Focus 48, 100520 (2024).
29. Michalakopoulos, V. et al. A machine learning-based framework for clustering residential electricity load profiles to
enhance demand response programs. Appl. Energy 361, ISSN 0306-2619. https://doi.org/10.1016/j.apenergy.2024.122943 (2024).
30. Li, J. et al. Federated learning-based short-term building energy consumption prediction method for solving the data silos
problem. Build. Simul. 15, 1145–1159. https://doi.org/10.1007/s12273-021-0871-y (2022).
31. Petrangeli, E., Tonellotto, N. & Vallati, C. Performance evaluation of federated learning for residential energy forecasting. IoT 3 (3),
381–397 (2022).
32. Michalakopoulos, V., Sarantinopoulos, E., Sarmas, E. & Marinakis, V. Empowering federated learning techniques for privacy-
preserving PV forecasting. Energy Rep. 12, 2244–2256, ISSN 2352-4847. https://doi.org/10.1016/j.egyr.2024.08.033 (2024).


33. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. Sci. New. Ser., 220, No. 4598. (May 13, 1983),
pp. 671–680 .
34. Man, K. F., Tang, K. S. & Kwong, S. Genetic Algorithms: Concepts and Applications, IEEE Transactions on Industrial Electronics,
Vol. 43, No. 5, 519 (October 1996).
35. UK Power Networks. SmartMeter Energy Consumption Data in London Households, ​h​t​t​p​s​:​​/​/​d​a​t​a​​.​l​o​n​d​o​​n​.​g​o​v​.​​u​k​/​da​ ​​t​a​s​e​t​/​​s​ma​ ​r​t​m​​
e​t​e​r​-​e​​n​e​r​g​y​​-​u​s​e​-​d​​a​t​a​-​i​n​​-​l​o​n​d​o​​n​-​h​o​u​s​e​h​o​l​d​s
36. Arthur, D. & Vassilvitskii, S. K-means++: the advantages of careful seeding. In Proceedings of the Eighteenth Annual
ACM-SIAM Symposium on Discrete Algorithms (SODA '07), 1027–1035 (Society for Industrial and Applied Mathematics, USA, 2007).
37. Kaufman, L. & Rousseeuw, P. J. Finding Groups in Data: An Introduction to Cluster Analysis. ISBN 9780471878766,
Online ISBN 9780470316801. https://doi.org/10.1002/9780470316801 (1990).
38. Murtagh, F. & Contreras, P. Algorithms for hierarchical clustering: an overview. WIREs Data Min. Knowl. Discov. 2, 86–97.
https://doi.org/10.1002/widm.53 (2012).
39. McMahan, H. B., Moore, E., Ramage, D., Hampson, S. & Agüera y Arcas, B. Communication-efficient learning of
deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
(AISTATS) (2017).
40. Spring Boot. https://spring.io/projects/spring-boot
41. Python. https://www.python.org/downloads/release/python-3123/
42. TensorFlow. https://github.com/tensorflow/tensorflow/releases
43. Pandas. https://pandas.pydata.org/
44. scikit-learn. https://scikit-learn.org/stable/
45. SciPy. https://scipy.org/
46. argparse. https://docs.python.org/3/library/argparse.html
47. Protobuf. https://protobuf.dev/
48. Matplotlib. https://matplotlib.org/
49. Jenetics. https://jenetics.io/
50. Heuristic-Based Federated Learning on GitHub. https://github.com/mihaid150/Heuristic-Adaptive-Federated-Learning
51. Keras. https://keras.io/
52. Agarap, A. F. M. Deep learning using rectified linear units (ReLU). https://arxiv.org/abs/1803.08375 (2018).
53. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. https://arxiv.org/abs/1412.6980 (2014).
54. Brisimi, T. S. et al. Federated learning of predictive models from federated electronic health records. Int. J. Med. Inf. 112, 59–67.
https://doi.org/10.1016/j.ijmedinf.2018.01.007 (2018).
55. Choudhury, O. et al. Differential privacy-enabled federated learning for sensitive health data. arXiv:1910.02578.
https://doi.org/10.48550/arXiv.1910.02578 (2020).
56. McMahan, H. B., Ramage, D., Talwar, K. & Zhang, L. Learning differentially private recurrent language models.
arXiv:1710.06963. https://doi.org/10.48550/arXiv.1710.06963 (2018).
57. Wu, X., Liang, Z. & Wang, J. FedMed: a federated learning framework for language modeling. Sensors 20 (14), 4048.
https://doi.org/10.3390/s20144048 (2020).
58. Liu, Y., Yu, J. J. Q., Kang, J., Niyato, D. & Zhang, S. Privacy-preserving traffic flow prediction: a federated learning approach.
IEEE Internet Things J. 7 (8), 7751–7763. https://doi.org/10.1109/JIOT.2020.2991401 (2020).
59. Perifanis, V., Pavlidis, N., Koutsiamanis, R. A. & Efraimidis, P. S. Federated learning for 5G base station traffic forecasting. Comput.
Netw. 235, 109950. https://doi.org/10.1016/j.comnet.2023.109950 (2023).
60. Microsoft PowerPoint. https://www.microsoft.com/ro-ro/microsoft-365/powerpoint

Acknowledgements/Funding
This research received funding from the European Union’s Horizon Europe research and innovation program
under Grant Agreement numbers 101136216 (Hedge-IoT) and 101103998 (DEDALUS). Views and opinions
expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union
or the European Climate, Infrastructure, and Environment Executive Agency. Neither the European Union nor
the granting authority can be held responsible for them.

Author contributions
Conceptualization, T.C., I.A., V.M. and Ef.S.; methodology, T.C., L.T., El.S. and V.M.; writing—original draft
preparation, L.T., M.D., T.C., I.A., V.M., Ef.S. and El.S.; writing—review and editing, L.T., M.D., T.C., I.A., V.M.,
Ef.S. and El.S. All authors read and agreed to the submitted version of the manuscript.

Declarations

Competing interests
The authors declare no competing interests.

Human ethics and consent to participate declarations
Not applicable.

Additional information
Correspondence and requests for materials should be addressed to T.C. or I.A.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide
a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have
permission under this licence to share adapted material derived from this article or parts of it. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence
and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by-nc-nd/4.0/.

© The Author(s) 2025
