Introduction

The United Nations World Tourism Organization (UNWTO) reported that the number of tourist arrivals has risen since 1950 to 1.4 billion per year, and this figure is expected to reach 1.8 billion worldwide by 2030. Tourism plays a vital role in extending economic freedom in developed countries and presents a paradox. To overcome this paradox, companies related to the tourism sector can play a vital role in different sectors, such as business communities and industries. In the past decade, development experts, industry leaders, and policymakers have witnessed a significant improvement in the tourism sectors of various countries [1]. Consequently, tourism has produced positive economic outcomes, especially by boosting GDP (Gross Domestic Product) and providing employment opportunities [2]. Considering the growth of tourism and travelers' necessities, it is pertinent to enhance the services provided to travelers according to their needs and interests [3]. Therefore, exploiting the choices and preferences of users is a hot topic in academia, as it greatly impacts decision-making, decision rules, and choice factors [4]. At the same time, rapid developments in the web, social networks, big data [5], cloud computing, IoV (Internet of Vehicles) [6, 7], and IoT (Internet of Things) technologies provide an enormous amount of information, which causes the information overload problem. Individuals find it difficult to choose relevant information and make decisions. Recommender systems cope with the information overload problem [8] by suggesting relevant information to users, considering their explicit or implicit preferences. Accordingly, computer scientists have contributed to the tourism industry, and plenty of research has been conducted to assist tourists using recommender systems.
Due to dynamic and temporal preferences, the existing approaches are limited in coping with the sparsity [9] and cold-start [10] problems. Extensive research [11,12,13,14,15,16,17,18] has been devoted to this area, focusing on the relationship between users and locations, but it does not address the sparsity problem caused by preference dynamics. In this regard, studies [19,20,21,22] have collected data about the relationships between users and tied them to user location but failed to resolve dynamic and temporal preferences. Besides, research dealing with users' temporal dynamics is still limited in alleviating the sparsity problem, since these models do not exploit auxiliary contextual information, which changes with user preferences over time [10]. For example, if a tourist loves to visit mountains in summer and cities in winter, recommending mountains in winter and cities in summer would be inappropriate or irrelevant. Therefore, the proposed model captures the temporal factor by splitting the dataset into seasons and categories of locations. This approach is intended to enhance user satisfaction with recommendations and achieve higher accuracy.

To overcome the aforesaid issues, there is a need for a unified model that exploits the behavior and preference dynamics of users for more personalized recommendations. The proposed model alleviates such problems and makes the following contributions.

  • The proposed model uses an approach based on a probabilistic weighting strategy using eleven graphs to tackle the sparsity problem.

  • It presents two algorithms that extract users' favorite season(s) and the most visited categories in a particular season from past check-in history.

  • The proposed approach builds on RELINE (Recommendation with Multiple Network Embeddings) [10] and learns graph embeddings to find the heterogeneous preferences of users.

The rest of the paper is organized as follows. The Literature review section presents the contributions devoted to POI recommendations. The Participated networks in the recommendation model section describes the networks that participate in the proposed model, the Proposed next-POI recommender system section explains the proposed work, and the Results and discussion section discusses the obtained results. Finally, the Conclusion and future work section concludes the paper and presents future work.

Literature review

This section discusses the contributions devoted to recommender systems for POIs. In [21], the authors proposed a system using Hadoop technology that consists of four phases: scraping, mapping, de-duplication, and recommendation. The shortcoming of this method is its user-centric approach, whose complexity increases computation time. Likewise, [23] proposed a collaborative filtering approach that performs better due to WSN (Wireless Sensor Network) [24, 25] installations around tourist sites (IoT sensors on the edge). It provides the convenience of uploading tourist information and rating POIs using smartphones. However, this method fails to tackle the cold-start problem because it is not feasible to deploy WSNs at all tourist spots; as a result, some locations may remain unrated or unvisited due to the unavailability of sensors or devices. In this regard, [14] and [25] proposed different approaches to resolving the cold-start problem using the notion of CARS (Context-Aware Recommender System). They used contextual information to achieve better results but ignored the importance of preference dynamics. Sampling on graphs has been used in various flavors, but less attention has been paid to matching a large set of graph properties. To this end, various studies employed network embedding models to exploit semantic relations between network objects and generate their low-dimensional representations. In [26], the authors proposed a graph-based POI recommendation incorporating geographical and temporal influence to tackle the cold-start problem, but they ignored the importance of preference dynamics. Similarly, [27] considered users’ preference dynamics but ignored social influence. Furthermore, [28] recognized the need to provide POI recommendations at an appropriate time rather than only exploiting user, social, and geographical preferences.
Finally, the authors in [15] extended LINE (Large-scale Information Network Embedding) [29] and used large bipartite graphs to cope with the cold-start problem, achieving good accuracy using social, geographical, and temporal influence along with users’ preference dynamics. Table 1 summarizes the factors incorporated by existing POI recommendation methods.

Table 1 Incorporating factors in existing approaches

To exploit users’ or tourists’ behavioral patterns, we highlight the limitations of existing methods. Many models [13, 28, 30,31,32] employ POIs as nodes and do not consider the spatial dimension with distance information. The methods in [13, 19, 28] consider location influence but ignore the preference dynamics, which change over time. The methods in [31,32,33] capture the temporal dynamics elegantly but do not incorporate the spatial dimension. Furthermore, some methods tackle spatial and temporal behavior but fail to include preference evaluation. Finally, methods that capture all factors fail to maintain users’ satisfaction with the recommendations.

Participated networks in the recommendation model

This section presents the proposed model's workflow and discusses the participating networks and the problem definition. Figure 1 depicts all networks incorporated in the proposed model, and Table 2 describes the symbols most commonly used in the text.

Fig. 1 All participating networks in the proposed model

Table 2 Commonly used symbols in the text

As depicted in Fig. 1, eleven graphs (unipartite and bipartite) are used in the proposed model. The model consists of a social layer and a location layer, which together utilize all the incorporated graphs. The social layer represents the friendship relationships between users. Similarly, the location layer describes the physical relationships between locations, such as distance, height, and temperature. Embeddings are generated for each graph and fed into a collective space, where all the graphs are combined into a single vector space. The following sub-sections explain the role of each participating graph.

Point-of-Interest

A Point-of-Interest is a location in which tourists take interest and where they check in most. It can be represented as a tuple \(({s}_{id}, lon, lat)\), where \({s}_{id}\) identifies a location in the dataset, and \(lon\) and \(lat\) refer to the longitude and latitude of that location.

Check-in

A check-in is the presence of a user u at a desired place l at a particular instant of time t, denoted as \(\varvec{c}_{\varvec{i}}=({ }{\varvec{u}},{\varvec{l}},{\varvec{t}})\). A user u can check in at only one place at a time but can record multiple check-ins in their profile \({\varvec{c}}_{\varvec{ui}}=\left\{({\varvec{l}}_{\varvec{i}},{\varvec{t}}_{\varvec{i}}),\dots ,({\varvec{l}}_{\varvec{j}},{\varvec{t}}_{\varvec{j}})\right\}\). For each user, a profile is maintained that stores the locations visited by that user; as the profile grows, the user's preferences become more helpful for recommendation.

Season

The check-ins in the dataset are divided into four seasons (Winter, Summer, Spring, and Autumn). Each season contains the check-ins of all tourists during season ∆S. Some tourists prefer to travel in one particular season, while others travel in a few seasons or even in all seasons. The importance of the season must therefore be taken into account in POI recommendation, and we use the season as the temporal factor.

User-user graph

The social interaction between users can be represented by a User-user graph, which is a unipartite graph. It is denoted by \({G}_{uu}=\left(U\cup V, {W}_{uv}\right)\), where \(U\) and \(V\) represent the sets of users, while \({W}_{uv}\) describes the set of weights among users in the network, which can be computed by Eq. (1) as follows:

$${W}_{uv}= \frac{1}{{\sum }_{i=1}^{n}\left|{v}_{i}\right| }$$
(1)

User-category graph

This bipartite graph shows the connection between users and categories over the entire check-in history. Specifically, it shows the significance of a specific category relative to all categories for a candidate user. The graph is represented by \({G}_{uk}=\left(U\cup K, {W}_{uk}\right)\), in which U and K are the sets of users and categories. \({W}_{uk}\) is a set of weighted edges between U and K, computed using Eq. (2) as the number of check-ins made in the desired category ki divided by the overall number of check-ins made by user ui.

$${W}_{uk}= \frac{\sum_{\forall {c}_{{u}_{i} }\in { k}_{i}}{c}_{{u}_{i},k}}{\sum_{\forall {c}_{{u}_{i} }\in K}{c}_{{u}_{i},K}}$$
(2)
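The ratio in Eq. (2) can be illustrated with a minimal Python sketch. The function name `user_category_weights` and the toy check-in list are hypothetical and only serve to show how each user's check-ins in one category are divided by that user's total check-ins.

```python
from collections import Counter, defaultdict


def user_category_weights(checkins):
    """Return {(user, category): weight} from (user, category) check-in pairs.

    Each weight is the share of the user's check-ins that fall in the category,
    mirroring the numerator/denominator structure of Eq. (2).
    """
    per_user = defaultdict(Counter)
    for user, category in checkins:
        per_user[user][category] += 1

    weights = {}
    for user, counts in per_user.items():
        total = sum(counts.values())  # all check-ins of this user
        for category, n in counts.items():
            weights[(user, category)] = n / total
    return weights


# Hypothetical toy history: u1 checks in three times, u2 once.
history = [("u1", "Mountains"), ("u1", "Mountains"), ("u1", "Lakes"), ("u2", "Cities")]
w = user_category_weights(history)
print(w[("u1", "Mountains")])  # two of u1's three check-ins
```

By construction, the weights of one user sum to 1 over the categories that user has visited.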

User-season graph

This bipartite graph represents the relationship between user ui and season si. Algorithm 1 is used to compute the significance of season si for each user ui. The graph is denoted as \({G}_{us}=\left(U\cup S, {W}_{us}\right)\), in which \(S\) and \(U\) denote the sets of seasons and users. \({W}_{us}\) is a set of weighted links computed by Eq. (3) as the number of the user’s check-ins in season si divided by the overall number of check-ins made by user ui across all seasons.

$${W}_{us}= \frac{\sum_{\forall {c}_{{u}_{i} }\in {s}_{i}}{c}_{{u}_{i},s}}{\sum_{\forall {c}_{{u}_{i} }\in S}{c}_{{u}_{i},S }}$$
(3)

Algorithm 1. Extracting favorite season(s) of each user

Algorithm 1 extracts the list of seasons in which a user checks in most. The check-in history, users, and seasons (winter, summer, spring, autumn) are provided as inputs. Step 1 sorts the check-ins by timestamp/date, and step 2 creates a list L for the most visited season(s). Step 3 iterates over each season, and step 4 checks whether user ui checked in during season Si. If so, step 5 increments the count of that season by one and assigns it to user ui in list L. Finally, step 6 returns L, which contains the users and their favorite season(s).
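The steps described above can be sketched in Python as follows. This is not the authors' Algorithm 1 but an illustration of the same idea; the month-to-season mapping (northern-hemisphere meteorological seasons), the function name `favorite_seasons`, and the toy data are assumptions.

```python
from collections import Counter, defaultdict
from datetime import date

# Assumed mapping from month to season (northern-hemisphere convention).
SEASON_OF_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}


def favorite_seasons(checkins):
    """Return {user: [favorite season(s)]} from (user, location, date) check-ins."""
    counts = defaultdict(Counter)
    # Sort by date first, then count check-ins per user and season.
    for user, _location, day in sorted(checkins, key=lambda c: c[2]):
        counts[user][SEASON_OF_MONTH[day.month]] += 1

    favorites = {}
    for user, per_season in counts.items():
        top = max(per_season.values())
        # Keep every season that ties for the highest count.
        favorites[user] = sorted(s for s, n in per_season.items() if n == top)
    return favorites


history = [
    ("u1", "l1", date(2013, 7, 4)),
    ("u1", "l2", date(2013, 8, 1)),
    ("u1", "l3", date(2013, 1, 15)),
    ("u2", "l1", date(2012, 4, 2)),
    ("u2", "l4", date(2012, 10, 9)),
]
favorites = favorite_seasons(history)
print(favorites)
```

Returning all tied seasons reflects the "favorite season(s)" wording: a user with equally many check-ins in two seasons keeps both.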

User-location graph

This bipartite graph represents the degree of a specific location for a given user in a desired category. The relation is represented by \({G}_{ul}=\left(U\cup L, {W}_{ul}\right)\), where U and L denote the sets of users and locations, respectively. In Eq. (4), the numerator computes the number of times a user visited a particular location in a category, while the denominator calculates the user’s overall visits to distinct categories during all seasons:

$${W}_{ul}= \frac{\sum_{\forall {l}_{{u}_{i,k} }\in {l}_{i}}{c}_{{u}_{i},l}}{\sum_{\forall {l}_{{u}_{i,k} }\in L}{R}_{{u}_{i},L }}$$
(4)

Category

Five categories are being selected for POIs: Mountains, Rivers, Lakes, Restaurants, and Cities. Every location l must belong to one of the above categories. That is, we have divided the entire dataset into different categories and then extracted the desired category corresponding to the interest of each user.

Category-location graph

To represent the relationship between a location and a category, the proposed model uses a directed bipartite graph, which differs from the previous ones in the weighting mechanism adopted. It is represented by \({G}_{kl}=\left(K\cup L, {W}_{kl}\right)\), in which L and K denote the sets of locations and categories, respectively. \({W}_{kl}\) represents weighted edges computed using Eq. (5) as the number of times a place \(l\) is visited in a specific category k divided by all check-ins in that category:

$${W}_{kl}= \frac{\sum_{\forall {c}_{l}\in {k}_{i}}{c}_{l}}{\sum_{\forall {c}_{L}\in {k}_{i}}{c}_{L}}$$
(5)

Category-user graph

This graph emphasizes the importance of a category for each user. The proposed model extracts categories for each POI using Algorithm 2, which is a modified version of an algorithm proposed in [15]. It correlates each user from \(U\) with a category \(k \in K\). The graph is denoted by \({G}_{ku}=\left(K\cup U, {W}_{ku}\right)\), where \({W}_{ku}\) denotes a set of weights between categories and users computed by Eq. (6) as the ratio of the number of times a user visited the desired category to the total number of visits to all categories made by the same user:

$$W_{ku}=\frac{\sum_{\forall l_{u_{i,k}}\in t_i}c_{u_i,t}}{\sum_{\forall l_{u_{i,k}}\in T}c_{u_i,T}}$$
(6)

Category-category graph

It is a bidirectional bipartite graph representing the relationship between pairs of categories. For example, two categories \(k\) and \({k}^{{}^{\prime}}\) are linked if a certain user u checks in to both in the same season s. Using this intuition, we construct the graph as \({G}_{k{k}^{{}^{\prime}}}=\left(K\cup K, {W}_{k{k}^{{}^{\prime}}}\right)\), in which \(K\) denotes the set of categories and \({W}_{k{k}^{{}^{\prime}}}\) is the set of weighted edges between pairs of categories. The weight between categories is calculated using Eq. (7), which represents the number of times a user u has visited the corresponding categories together in season s.

$${W}_{{kk}^{^{\prime}}}= \frac{\sum_{\forall {u}_{{l}_{i,k} }\in {s}_{i}}{c}_{{u}_{i},s}}{\sum_{\forall {u}_{{l}_{i,k} }\in S}{C}_{{u}_{i},S }}$$
(7)

Algorithm 2. Extracting most visited categories in each season for all users

Algorithm 2 extracts the most visited categories of locations in each season, that is, the categories users mostly prefer in the desired season(s). For example, a user ui may love to visit historical places in summer, whereas in spring she/he prefers to hike in the mountains. Taking this into account helps provide the most suitable recommendations in different periods. Algorithm 2 accepts the check-in history and the seasons as inputs. Step 1 sorts the check-ins by date, whereas steps 2 and 3 create lists for time (dividing the dataset into seasons) and categories (such as mountains, rivers, lakes, meadows, cities, and parks). Step 4 iterates over each season, and step 5 declares n to keep track of each user check-in. Step 6 checks whether a check-in falls in the desired season; if so, step 7 adds the check-in to the season list. Step 8 increments n, whereas step 9 initializes q to track the desired category in a season. Step 10 iterates over the check-ins, and step 11 checks whether a check-in belongs to the desired category. If it does, the check-in is added to the category list in step 12. Otherwise, q is incremented in step 14 and the procedure reruns for the next season. Finally, step 15 returns list K.
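The idea behind Algorithm 2 can be illustrated with a short Python sketch. It is a simplification rather than the authors' procedure: it assumes every check-in already carries a season label, and the function name `top_categories_per_season`, the `top_n` parameter, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def top_categories_per_season(checkins, top_n=2):
    """Return {season: [most visited categories]} from (user, category, season) tuples."""
    per_season = defaultdict(Counter)
    for _user, category, season in checkins:
        per_season[season][category] += 1  # count check-ins per category in each season

    return {
        season: [category for category, _count in counts.most_common(top_n)]
        for season, counts in per_season.items()
    }


history = [
    ("u1", "Mountains", "Summer"), ("u2", "Mountains", "Summer"),
    ("u3", "Mountains", "Summer"), ("u1", "Lakes", "Summer"),
    ("u2", "Lakes", "Summer"), ("u3", "Cities", "Summer"),
    ("u1", "Cities", "Winter"), ("u2", "Cities", "Winter"),
    ("u3", "Restaurants", "Winter"),
]
top = top_categories_per_season(history)
print(top)
```

Grouping by season before counting keeps the seasonal preference separate, so a category that dominates in summer does not hide the preferred categories of winter.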

Category-season graph

This directed bipartite graph represents the relations between seasons and categories. The category-season graph is denoted by \({G}_{ks}=\left(K\cup S, {W}_{ks}\right)\), in which K and S denote the sets of categories and seasons, respectively. \({W}_{ks}\) is a set of weights established between categories and seasons. The weighted edges are computed using Eq. (8), where the numerator is the number of check-ins performed in category \(k\) by all users in season \({s}_{i}\), while the denominator is the number of all check-ins performed in all seasons for the same category.

$$W_{ks}=\frac{\sum_{\forall c_U\in s_i}\left|n_{ks}\right|}{\sum_{\forall c_U\in S}\left|n_{ks}\right|}$$
(8)

Location-location graph

This bipartite graph captures the spatial distance between locations. If a user u visits two locations \(l\) and \({l}^{{}^{\prime}}\) at the same time and the distance between them is within a range \({R}_{g}\), then a link is established between them. The graph is denoted as \({G}_{l{l}^{{}^{\prime}}}=\left(L\cup L, {W}_{l{l}^{{}^{\prime}}}\right)\), in which \(L\) is a set of locations and \({W}_{l{l}^{{}^{\prime}}}\) is a set of weights among them computed by Eq. (9) using geographical proximity.

$${W}_{l{l}^{{}^{\prime}}}=1- \frac{{geodist}_{l{l}^{{}^{\prime}}}}{{R}_{g}}$$
(9)
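Eq. (9) can be illustrated with a small Python sketch. The text does not specify how the geographical distance is measured, so the haversine great-circle distance, the Earth radius of 6371 km, and the function names below are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def location_weight(loc_a, loc_b, range_km):
    """Weight of Eq. (9): 1 - geodist / R_g, or None when the pair is out of range."""
    dist = haversine_km(loc_a[0], loc_a[1], loc_b[0], loc_b[1])
    if dist > range_km:
        return None  # no link is established beyond the range R_g
    return 1.0 - dist / range_km


# Two points one degree of longitude apart on the equator (about 111 km).
print(location_weight((0.0, 0.0), (0.0, 1.0), range_km=200.0))
```

The weight is 1 for co-located POIs and decreases linearly to 0 at the edge of the range, so closer pairs receive stronger links.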

Location-user graph

This directed bipartite graph represents the relationship between locations and users. Users' interest in a specific location changes over time. More specifically, this relation is represented as \({G}_{lu}=\left(L\cup U, {W}_{lu}\right)\). The weight \({W}_{lu}\) is calculated using Eq. (10) as the ratio of the number of times a user u visits a place l to the total number of check-ins made by all users at that location:

$$W_{lu}=\frac{\sum_{\forall c_u\in l_i}c_u}{\sum_{\forall c_U\in l_i}c_U}$$
(10)

Location-season graph

To show the significance of a certain location for a user u in a season si, the proposed model uses a location-season graph, which can be represented by \({G}_{ls}=\left(L\cup S, {W}_{ls}\right)\), where \(L\) and \(S\) denote the sets of locations and seasons. \({W}_{ls}\) represents the weighted edges calculated using Eq. (11).

$$W_{ls}=\frac{\sum_{\forall c_U\in l_i}\left|n_{ls}\right|}{\sum_{\forall c_U\in S}\left|n_{ls}\right|}$$
(11)

Problem definition

Some tourists visit a set of locations in a particular season, while others tend to travel in every season. Furthermore, users’ preferences change by location category K, i.e., ui may love to visit li in winter whereas uj prefers it in summer. To address this problem, given a query \(Q (u,l,s)\), we must provide user u with a list of locations L in season s that belong to category ki.

Proposed next-POI recommender system

This section presents the proposed POI recommendation model that jointly learns multiple graph embeddings and encodes them into a low-dimensional embedding space exploiting semantic relations between the nodes of the networks.

Learning embeddings for large information networks

When two nodes m and n are directly connected by an edge, the relation is known as first-order proximity. In contrast, the relation between vertices that share many neighbor nodes but are not directly connected to each other is referred to as second-order proximity. The LINE model [29] learns the relationships in large graphs to capture these kinds of proximity. Building on this concept, we extend our model to learn the embeddings of large information networks.

Consider a bipartite graph \(G=({Q}_{A}\cup {Q}_{B}, W)\), where \({Q}_{A}\) and \({Q}_{B}\) are two disjoint sets of vertices. If vertices in \({Q}_{A}\) share many common neighbors in \({Q}_{B}\) but are not linked to each other, then there is a high probability that their distributions are the same. To compute the conditional probability of vertex \({n}_{j} \in {Q}_{B}\) given node \({m}_{i} \in {Q}_{A}\), the model employs the following equation:

$$p\left({n}_{j}|{m}_{i}\right)=\frac{\exp\left({\overrightarrow{n}}_{j}^{T} \times {\overrightarrow{m}}_{i}\right)}{\sum_{{n}_{h}\in {Q}_{B}}\exp\left({\overrightarrow{n}}_{h}^{T} \times {\overrightarrow{m}}_{i}\right)}$$
(12)
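The conditional probability of Eq. (12) is a softmax over the vertices of \({Q}_{B}\) for a fixed \({m}_{i}\). The following Python sketch illustrates this with toy two-dimensional embeddings; the function names and the example vectors are hypothetical, and subtracting the maximum score is a standard numerical-stability step that does not change the result.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def conditional_probs(m_vec, n_vecs):
    """Return p(n_j | m_i) for every n_j in Q_B as a softmax of inner products."""
    scores = [dot(n_vec, m_vec) for n_vec in n_vecs]
    shift = max(scores)  # stabilizes exp() without changing the ratios
    exps = [math.exp(s - shift) for s in scores]
    normalizer = sum(exps)  # the sum over all vertices of Q_B
    return [e / normalizer for e in exps]


m_i = [1.0, 0.0]
q_b = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
probs = conditional_probs(m_i, q_b)
print(probs)
```

The normalizer loops over every vertex of \({Q}_{B}\), which is the cost that the negative sampling step in the optimization section is meant to avoid.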

The vectors of \({m}_{i}\) and \({n}_{j}\) can be represented as \({\overrightarrow{m}}_{i}\) and \({\overrightarrow{n}}_{j}\). Hence, for each vertex \({m}_{i} \in {Q}_{A}\), Eq. (12) presents conditional distribution \(p\left(\bullet |{m}_{i}\right)\) to all related vertices in the set \({Q}_{B}\). Then, the model uses the conditional distribution to approximate the empirical distribution \(\widehat{p}\left(\cdot |{m}_{i}\right)= \frac{{w}_{i,j}}{\sum {w}_{i,m}}\) employing the following objective function:

$$O= \sum_{{m}_{i} \in {Q}_{A}}{\lambda }_{i}\,d\left(\widehat{p}\left(\cdot |{m}_{i}\right), p\left(\cdot |{m}_{i}\right)\right)$$
(13)

where \(d\left(\bullet ,\bullet \right)\) denotes the Kullback–Leibler divergence between the empirical and conditional distributions. To tune the significance of \({m}_{i}\), we use \({\lambda }_{i}\) as a hyper-parameter, which is set to the out-degree of each node. After omitting constants, minimizing Eq. (13) is equivalent to minimizing the following objective function:

$$O= -\sum_{{e}_{i,j} \in W}{w}_{i,j }\mathit{log}p\left({n}_{j}|{m}_{i}\right)$$
(14)
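The step from Eq. (13) to Eq. (14) can be made explicit. The following is a short derivation sketch, assuming (as in LINE) that \(d\) is the Kullback–Leibler divergence and that the out-degree is the weighted sum \({\lambda }_{i}=\sum_{j}{w}_{i,j}\), so that \({\lambda }_{i}\,\widehat{p}\left({n}_{j}|{m}_{i}\right)={w}_{i,j}\):

```latex
\begin{aligned}
O &= \sum_{m_i \in Q_A} \lambda_i \sum_{n_j \in Q_B}
     \widehat{p}(n_j \mid m_i)\,
     \log \frac{\widehat{p}(n_j \mid m_i)}{p(n_j \mid m_i)} \\
  &= \underbrace{\sum_{e_{i,j} \in W} w_{i,j}\,\log \widehat{p}(n_j \mid m_i)}_{\text{constant w.r.t. the embeddings}}
     \;-\; \sum_{e_{i,j} \in W} w_{i,j}\,\log p(n_j \mid m_i)
\end{aligned}
```

The first term depends only on the edge weights, so dropping it leaves the weighted negative log-likelihood of Eq. (14).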

The \({\left\{{\overrightarrow{\varvec{m}}}_{\varvec{i}}\right\}}_{{\varvec{i}}=1\dots {\varvec{Q}}_{\varvec{A}}}\) and \({\left\{{\overrightarrow{\varvec{n}}}_{\varvec{j}}\right\}}_{{\varvec{j}}=1\dots {\varvec{Q}}_{\varvec{B}}}\) that minimizes Eq. (14) are the low-dimensional nodes representations in \({\mathbb{R}}^{d}\) [15].

Optimization of the model

Computing the conditional probability in Eq. (12) requires a summation over the complete vertex set, which increases computational complexity. To address this problem, we use the negative sampling approach of [27], which samples N negative edges from a noise distribution for every edge (i, j), as defined in the following equation.

$$\mathit{log}\,\sigma \left({\overrightarrow{n}}_{j}^{T} \times {\overrightarrow{m}}_{i}\right)+\sum_{h=1}^{N}{\mathbb{E}}_{{n}_{h}\sim {P}_{n}\left(n\right)}\left[\mathit{log}\,\sigma \left(-{\overrightarrow{n}}_{h}^{T}\times {\overrightarrow{m}}_{i}\right)\right]$$
(15)
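A minimal Python sketch of the negative sampling objective for a single edge is given below. It follows the standard form used in LINE, with one positive term and N sampled negative terms; the function names, the toy embeddings, and the fixed random seed are assumptions, and for simplicity the sketch does not exclude the positive node from the negative draws.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def noise_distribution(out_degrees):
    """Noise distribution proportional to out-degree raised to the power 3/4."""
    powered = [d ** 0.75 for d in out_degrees]
    total = sum(powered)
    return [p / total for p in powered]


def negative_sampling_objective(m_vec, n_pos, n_vecs, noise, num_neg, rng):
    """log sigma(n_j . m_i) plus num_neg terms log sigma(-n_h . m_i), n_h ~ noise."""
    value = math.log(sigmoid(dot(n_pos, m_vec)))
    negatives = rng.choices(range(len(n_vecs)), weights=noise, k=num_neg)
    for h in negatives:
        value += math.log(sigmoid(-dot(n_vecs[h], m_vec)))
    return value


rng = random.Random(0)  # fixed seed, only to make the illustration repeatable
q_b = [[0.4, 0.2], [-0.3, 0.8], [0.1, -0.5]]
noise = noise_distribution([1, 16, 4])
objective = negative_sampling_objective([0.5, 0.1], q_b[0], q_b, noise, num_neg=5, rng=rng)
print(objective)
```

Each evaluation touches only N + 1 vertices instead of the whole of \({Q}_{B}\), which is the source of the computational saving.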

where \(\sigma (x) = 1/(1 + \exp(-x))\) is the sigmoid function, and the noise distribution is \({P}_{n}(n) \propto {d}_{n}^{3/4}\), the same as proposed in [29], where \({d}_{n}\) is the out-degree of node n. Furthermore, we adopt the asynchronous stochastic gradient algorithm [33] to optimize Eq. (15). If an edge (i, j) is sampled, the gradient with respect to the embedding \({\overrightarrow{m}}_{i}\) of node i can be computed as:

$$\frac{\partial O}{\partial {\overrightarrow{m}}_{i}}={w}_{i,j}\times \frac{\partial \mathit{log}p\left({n}_{j}|{m}_{i}\right)}{\partial {\overrightarrow{m}}_{i}}$$
(16)

Note that the gradient in Eq. (16) is multiplied by the weight of the link. This is problematic when link weights vary widely, because it makes the learning rate hard to balance. If the learning rate is selected according to links with low weights, the gradients on links with high weights will explode; conversely, if it is selected according to links with high weights, the gradients on links with low weights will be too small. To avoid this, the model employs the sampling approach adopted in [31] to sample a random edge. Finally, the model draws a sampled edge using an alias table according to [29], which reduces the computational complexity of drawing a sample to \(O(1)\). Table 3 illustrates the complexity of the edge sampling optimization process.

Table 3 Net Complexity of sampling
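The constant-time edge draw can be illustrated with the alias method. The sketch below is a generic implementation of Vose's alias algorithm rather than the code of the cited works; the function names and the toy edge weights are hypothetical. Building the table takes linear time in the number of edges, after which each draw costs \(O(1)\).

```python
import random


def build_alias_table(weights):
    """Build (prob, alias) tables so edges can be drawn proportionally to weight."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]  # average scaled weight is 1
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]

    while small and large:
        s, l = small.pop(), large.pop()
        prob[s] = scaled[s]  # keep s with this probability, otherwise use its alias
        alias[s] = l
        scaled[l] = scaled[l] + scaled[s] - 1.0  # l donates mass to fill slot s
        (small if scaled[l] < 1.0 else large).append(l)

    for i in large + small:  # leftovers are exactly (or numerically) full slots
        prob[i] = 1.0
    return prob, alias


def sample_edge(prob, alias, rng):
    """Draw one edge index in O(1): pick a slot uniformly, then flip a biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


prob, alias = build_alias_table([1.0, 3.0])
rng = random.Random(42)
draws = [sample_edge(prob, alias, rng) for _ in range(20000)]
print(sum(draws) / len(draws))  # share of draws that picked the heavier edge
```

Because edges are drawn in proportion to their weights, every sampled edge can be treated as having unit weight during the gradient update, which avoids the learning-rate imbalance described above.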

Learning graph dynamics

Given the bipartite input graphs, the next step combines them in the model. The input graphs are divided into three groups: the first concerns users (User-Location, User-Season, User-Category, User-User), the second concerns locations (Location-Location, Location-User, Location-Season), and the third concerns categories of places (Category-User, Category-Category, Category-Location, Category-Season). The model collectively integrates the embeddings of the participating graphs, whose individual objectives are defined in Eqs. (18)–(28), by optimizing the objective function defined in Eq. (17) as follows:

$$O= {O}_{ul}+ {O}_{us}+ {O}_{uk}+ {O}_{uu}+{O}_{ll}+{O}_{lu}+{O}_{ls}+{O}_{ku}+{O}_{kk}+{O}_{kl}+ {O}_{ks}$$
(17)

The model computes these objective functions as follows.

$${O}_{ul}=-\sum_{{e}_{i,j} \in {W}_{ul}}{w}_{i,j}\mathit{log}p \left({u}_{i}|{l}_{j}\right)$$
(18)
$${O}_{us}=-\sum_{{e}_{i,j} \in {W}_{us}}{w}_{i,j}\mathit{log}p \left({u}_{i}|{s}_{j}\right)$$
(19)
$${O}_{uk}=-\sum_{{e}_{i,j} \in {W}_{uk}}{w}_{i,j}\mathit{log}p \left({u}_{i}|{k}_{j}\right)$$
(20)
$${O}_{uu}=-\sum_{{e}_{i,j} \in {W}_{uu}}{w}_{i,j}\mathit{log}p \left({u}_{i}|{u}_{j}\right)$$
(21)
$${O}_{ll}=-\sum_{{e}_{i,j} \in {W}_{ll}}{w}_{i,j}\mathit{log}p \left({l}_{i}|{l}_{j}\right)$$
(22)
$${O}_{lu}=-\sum_{{e}_{i,j} \in {W}_{lu}}{w}_{i,j}\mathit{log}p \left({l}_{i}|{u}_{j}\right)$$
(23)
$${O}_{ls}=-\sum_{{e}_{i,j} \in {W}_{ls}}{w}_{i,j}\mathit{log}p \left({l}_{i}|{s}_{j}\right)$$
(24)
$${O}_{ku}=-\sum_{{e}_{i,j} \in {W}_{ku}}{w}_{i,j}\mathit{log}p \left({k}_{i}|{u}_{j}\right)$$
(25)
$${O}_{kk}=-\sum_{{e}_{i,j} \in {W}_{kk}}{w}_{i,j}\mathit{log}p \left({k}_{i}|{k}_{j}\right)$$
(26)
$${O}_{kl}=-\sum_{{e}_{i,j} \in {W}_{kl}}{w}_{i,j}\mathit{log}p \left({k}_{i}|{l}_{j}\right)$$
(27)
$${O}_{ks}=-\sum_{{e}_{i,j} \in {W}_{ks}}{w}_{i,j}\mathit{log}p \left({k}_{i}|{s}_{j}\right)$$
(28)

Optimizing the objective function defined in Eq. (17) requires merging the links of all networks; at every step, the model samples an edge and updates the corresponding embeddings. The probability of sampling a link is proportional to its associated weight. The model is trained jointly and dynamically using the algorithm in [15].

Personalized next-POI recommendation

After exploiting semantic relations between the nodes of the participating graphs and learning their embeddings, the next step is to make recommendations for a user. A query \(Q(u,l,s)\) specifies a user \(u\), a location \(l\), and a season \(s\). The underlying assumption is that a user is willing to visit locations of category ki in specific seasons. Finally, we rank the candidates and return a list of the top-n unvisited locations related to category ki for user ui. The proposed model uses Eq. (29) to score unvisited locations:

$$Q\left(u,l,s\right)=\alpha \times \left({\overrightarrow{{\varvec{u}}}}^{{\varvec{T}}}\times \overrightarrow{{\varvec{l}}}\right)+\beta \times \left({\overrightarrow{{\varvec{k}}}}^{{\varvec{T}}}\times \overrightarrow{{\varvec{l}}}\right)+\gamma \times \left({\overrightarrow{{\varvec{s}}}}^{{\varvec{T}}}\times \overrightarrow{{\varvec{l}}}\right)$$
(29)

where \(\overrightarrow{{\varvec{u}}}\) and \(\overrightarrow{{\varvec{l}}}\) are the vector representations of user \(u\) and location \(l\). Similarly, \(\overrightarrow{{\varvec{k}}}\) and \(\overrightarrow{{\varvec{s}}}\) are the vector representations of the category \(k\) and the season \(s\) of the check-in. The proposed model uses cloud computing to store and process the data and to jointly learn the vector representations from the various information graphs in the same embedding space. This allows for more efficient handling of large amounts of data and the ability to perform complex computations. The cloud-based solution also allows more information networks to be incorporated, which in turn reduces sparsity. The model jointly learns the dynamics of the social influence \(({\overrightarrow{{\varvec{u}}}}^{{\varvec{T}}}\times \overrightarrow{{\varvec{l}}})\), the geographical influence \(({\overrightarrow{{\varvec{k}}}}^{{\varvec{T}}}\times \overrightarrow{{\varvec{l}}})\), and the temporal influence \(({\overrightarrow{{\varvec{s}}}}^{{\varvec{T}}}\times \overrightarrow{{\varvec{l}}})\) to provide more accurate and personalized POI recommendations to users.
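The scoring and ranking step of Eq. (29) can be sketched in Python as follows. The function names, the toy embeddings, and the default values of the weights \(\alpha\), \(\beta\), and \(\gamma\) (all set to 1) are assumptions made for illustration; the text does not state the values used.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def score(u_vec, k_vec, s_vec, l_vec, alpha, beta, gamma):
    """Weighted sum of the three inner products in Eq. (29)."""
    return (alpha * dot(u_vec, l_vec)
            + beta * dot(k_vec, l_vec)
            + gamma * dot(s_vec, l_vec))


def recommend(u_vec, k_vec, s_vec, location_vecs, visited,
              alpha=1.0, beta=1.0, gamma=1.0, top_n=10):
    """Rank unvisited locations by their score and return the top_n identifiers."""
    scores = {
        loc: score(u_vec, k_vec, s_vec, l_vec, alpha, beta, gamma)
        for loc, l_vec in location_vecs.items()
        if loc not in visited  # only unvisited locations are candidates
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical two-dimensional embeddings.
user, category, season = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
locations = {"A": [2.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 2.0], "D": [5.0, 5.0]}
ranking = recommend(user, category, season, locations, visited={"D"}, top_n=2)
print(ranking)
```

Location D would score highest here but is excluded because the user has already visited it, which matches the requirement to recommend only unvisited locations.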

Employment of cloud and edge computing

Edge computing is a distributed computing paradigm that processes data near the source of data generation, thereby reducing latency and bandwidth usage. The proposed work leverages edge computing to process data generated by tourists who use location-based social networks (LBSNs) to share their preferences and interests. LBSNs enable the collection of tourists' location and interest data, which can be processed at the network's edge, facilitating real-time and personalized recommendations. In addition to edge computing, cloud computing can be employed to store and process the large amounts of data collected from LBSNs. As shown in Fig. 2, the architecture consists of two main components. The first component shows tourists using LBSNs to share their preferences and interests, generating data that is collected and processed at the network's edge. The second component is the cloud computing infrastructure used to store and process the large amounts of data collected from LBSNs. This approach enhances the computational capabilities of the proposed model and enables tourism management systems worldwide to access and use it easily.

Fig. 2 Use of edge and cloud computing in the proposed model

Results and discussion

This section evaluates the performance of the proposed model on the Foursquare dataset and compares its results with existing POI models.

Dataset

To analyze the results of the proposed model, we use the publicly available Foursquare dataset (see Footnote 1). The dataset consists of POIs, users’ check-ins, and friendships collected from 2012 to 2014. The distribution of the dataset is shown in Table 4. The seasons are extracted from the check-in periods given in the dataset. Similarly, each POI is associated with a category, such as restaurant, river, lake, or city.

Table 4 Distribution of dataset

Baseline models

In this set of experiments, we compare our results with the following models.

  • RELINE [10]: uses a graph-based approach to learn the relationships between users and POIs from 8 weighted networks in a latent space, and provides location recommendations under a probabilistic strategy that examines the influence of social, geographical, temporal, and preference dynamics.

  • GE [28]: is another graph-based embedding model that exploits geographical influence, sequential effect, temporal cyclic effect, and semantic effect in a unified way and embeds four information graphs into a shared embedding space. Also, a novel time-decay method is proposed that dynamically computes the user’s latest preferences based on the embedding of his/her checked-in POIs learned in the embedding space.

  • WWO [31]: a unified POI recommender system with temporal interval assessment that considers temporal interval distributions and develops a low-rank network model, identifying a set of bi-weighted network bases to learn static and dynamic preferences coherently.

  • PGB [33]: This probabilistic model employs the graph-based Markov chain method to improve recommendation accuracy. The choice of suggesting an item is conditioned by considering recommendations generated in previous steps.

Experiments evaluation

To conduct the experiments, we divide each target user’s check-ins into three partitions: (i) the training set \({\varepsilon }_{C}^{T}\), the known information, which comprises 80% of all check-ins; (ii) the probe set \({\varepsilon }_{C}^{P}\), which contains 10% of the data and is used to test the model; and (iii) the validation set \({\varepsilon }_{C}^{V}\), the remaining 10%, which is used to validate the proposed model. Formally, \({\varepsilon }_{C}= {\varepsilon }_{C}^{T}\cup {\varepsilon }_{C}^{P}\cup {\varepsilon }_{C}^{V}\), and the three sets are mutually disjoint, so that \({\varepsilon }_{C}^{T}\cap {\varepsilon }_{C}^{P}\cap {\varepsilon }_{C}^{V}=\varnothing\). To make recommendations for a target user, the model uses the POIs in \({\varepsilon }_{C}^{T}\).
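The per-user 80/10/10 partition can be sketched as follows. This is a minimal sketch under stated assumptions: check-ins are shuffled randomly with a fixed seed (the paper does not say whether the split is random or chronological), and the function name `split_checkins` and the `(user, poi)` record layout are hypothetical.

```python
import random
from collections import defaultdict


def split_checkins(checkins, train=0.8, probe=0.1, seed=0):
    """Split each user's check-ins into training, probe, and validation sets.

    checkins: iterable of (user, poi) pairs (assumed layout).
    The validation set receives whatever remains after train and probe.
    """
    by_user = defaultdict(list)
    for user, poi in checkins:
        by_user[user].append((user, poi))

    rng = random.Random(seed)  # fixed seed for reproducibility (assumption)
    train_set, probe_set, valid_set = [], [], []
    for user in sorted(by_user):
        items = by_user[user]
        rng.shuffle(items)
        n_train = int(len(items) * train)
        n_probe = int(len(items) * probe)
        train_set += items[:n_train]
        probe_set += items[n_train:n_train + n_probe]
        valid_set += items[n_train + n_probe:]
    return train_set, probe_set, valid_set


# Toy data: two users with ten distinct check-ins each.
toy = ([("u1", f"p{i}") for i in range(10)]
       + [("u2", f"q{i}") for i in range(10)])
train_set, probe_set, valid_set = split_checkins(toy)
```

Because the slices never overlap, the three sets are disjoint by construction and their union recovers all check-ins of each user.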

For the evaluation, we measure \(Accuracy@n\) as proposed in [15]. Each \(l \in {\varepsilon }_{C}^{P}\) is issued as a query \(Q(u, l, s)\), and the prediction score for that location \(l\), along with all unvisited nearby POIs of the target user, is computed based on Eq. (30). The model ranks the predicted scores into a list and selects the top \(n\) POIs. If the ground truth \(l \in {\varepsilon }_{C}^{P}\) appears among the top \(n\) recommendations, the model has correctly recommended that location (i.e., a True Positive); otherwise, it has suggested a wrong location. The overall accuracy of the top-n recommendations is obtained by averaging over all test cases as follows:

$$Accuracy@n=\frac{\#\,\text{True Positive}@n}{\left|{\varepsilon }_{C}^{P}\right|}$$
(30)
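The evaluation protocol described above can be sketched in a few lines. This is an illustrative sketch only: `score` stands in for the model's prediction function, the `(user, true_poi, candidates)` layout of a probe case is an assumption, and ties are broken alphabetically purely to keep the example deterministic.

```python
def accuracy_at_n(probe_cases, score, n):
    """Fraction of probe cases whose ground-truth POI is ranked in the top n.

    probe_cases: list of (user, true_poi, candidates), where candidates are
                 the unvisited nearby POIs of the user (assumed layout).
    score:       callable (user, poi) -> float, a stand-in for the model.
    """
    if not probe_cases:
        return 0.0
    hits = 0
    for user, true_poi, candidates in probe_cases:
        pool = set(candidates) | {true_poi}
        # Highest score first; ties broken by POI id for determinism.
        ranked = sorted(pool, key=lambda poi: (-score(user, poi), poi))
        if true_poi in ranked[:n]:
            hits += 1  # true positive at cut-off n
    return hits / len(probe_cases)


# Toy scores standing in for the model's predictions.
toy_scores = {
    ("u1", "a"): 0.9, ("u1", "b"): 0.5, ("u1", "c"): 0.1,
    ("u2", "a"): 0.2, ("u2", "b"): 0.8, ("u2", "c"): 0.4,
}
cases = [("u1", "a", ["b", "c"]), ("u2", "c", ["a", "b"])]


def toy_score(user, poi):
    return toy_scores[(user, poi)]


print(accuracy_at_n(cases, toy_score, 1))  # 0.5
print(accuracy_at_n(cases, toy_score, 2))  # 1.0
```

In the toy data, the ground truth for `u2` is ranked second, so it counts as a hit only once the cut-off reaches 2.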

Impact of information graphs

In particular, we examine how integrating additional information graphs influences the top-n predictions. We compare NPR-LBN with the four models PGB, GE, WWO, and RELINE described in the Baseline models section. The results in Table 5 reveal that the accuracy increases as the model integrates each additional information network. In this way, the proposed model alleviates the sparsity issue by exploiting rich information about users and POIs. It is also noticeable that the accuracy of all models increases with n, which suggests that the models fit users’ behavior well.

Table 5 Impact of additional information networks on the model

Comparative analysis

This section compares the proposed work with the baselines in terms of accuracy for top@n recommendations \([n=1, 5, 10, 15, 20]\) on a real-world dataset (Foursquare). Specifically, we assess how well all models provide POI recommendations for cold-start users and cold-start locations.

Cold start user

We have studied the efficacy of the proposed work on the cold-start user problem. Producing recommendations for such users is particularly challenging because the required information is unavailable. In this context, we conducted experiments that provide recommendations only to cold-start users and used the accuracy metric to analyze the results of the models, as shown in Fig. 3. Since all the baselines mentioned above support cold-start recommendation, we compare our model with all of them. Our model employs side information about users and locations from eleven information graphs, which yields improved results.

Fig. 3 Accuracy for cold-start users

Cold start locations

Similarly, we examine the cold-start location problem. The aim is to suggest locations that have received no or only a few check-ins; here, a location with fewer than 25 check-ins is treated as a cold-start location. To this end, we investigate how the models behave for inexperienced users or when a new location is introduced into the system. In addition, we analyze whether the new location appears in the top@n recommendations. Again, the proposed model outperforms the baselines in terms of recommendation quality, as shown in Fig. 4.
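Selecting the cold-start locations for this experiment amounts to counting check-ins per POI and keeping those below the threshold of 25. The sketch below assumes check-ins are `(user, poi)` pairs; the function name `cold_start_pois` and the optional `all_pois` argument (used to include POIs with no check-ins at all) are hypothetical.

```python
from collections import Counter


def cold_start_pois(checkins, all_pois=None, threshold=25):
    """Return the set of POIs with fewer than `threshold` check-ins.

    checkins: iterable of (user, poi) pairs (assumed layout).
    all_pois: optional full list of POIs, so that locations with zero
              check-ins are also reported as cold-start.
    """
    counts = Counter(poi for _user, poi in checkins)
    pois = set(all_pois) if all_pois is not None else set(counts)
    # A Counter returns 0 for POIs that never appear in the check-ins.
    return {poi for poi in pois if counts[poi] < threshold}


# Toy data: one popular POI, one rarely visited POI, one brand-new POI.
toy = [(f"u{i}", "popular") for i in range(30)] + [("u1", "rare")] * 3
cold = cold_start_pois(toy, all_pois=["popular", "rare", "new"])
```

With this toy input, `cold` contains `rare` and `new` but not `popular`, which has 30 check-ins.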

Fig. 4 Accuracy for cold-start POIs

Significance of seasons

Here, we analyze the influence of the time interval per season ∆S on the accuracy for different values of n in the top-n recommendations made by the model. ∆S significantly affects the model's results since it is used to build multiple information graphs. Table 6 shows the results of the model on the dataset. The accuracy reaches a maximum at a certain value of ∆S and then decreases gradually. The low accuracy at either extreme is explained by the value of ∆S: if it is very small, little data is available and the accuracy is necessarily lower; if it is large, a substantial number of nodes related to the target user exist, which causes overfitting. Considering these factors, we set ∆S to 15 for this dataset.

Table 6 Impact of time period/seasons \(\Delta {\varvec{S}}\) on accuracy for top@n recommendations

Conclusion and future work

Due to the exponential growth of information on the internet, recommender systems have become prevalent technological assistants to humans. Likewise, LBS have become ubiquitous in different sectors and have therefore gained the attention of numerous scientific disciplines. The emergence and use of communication technologies such as mobile devices allow researchers to devise more elegant solutions to the sparsity and cold-start problems, and the geographical information shared on such networks helps to tackle both. Various POI approaches have been proposed that use social influence, geographical proximity, and preference dynamics. In this work, we have addressed the cold-start and sparsity problems in POI recommendation for the tourism sector. In particular, the proposed model adds features such as user satisfaction with the system and employs a weighted probabilistic approach over eleven information graphs based on relations established among users, seasons, and categories. The model personalizes locations by jointly learning the embeddings of users and POIs in the same embedding space. The incorporation of edge and cloud computing in the proposed model has improved the system's accuracy and scalability, allowing it to be used easily by tourism management systems worldwide. The influence of social, geographical, and temporal factors on accuracy has been examined. The model has been evaluated on cold-start users and cold-start locations, where it outperformed the baselines. In future work, we aim to ensure the security of users’ social information, which is crucial to users and an open problem in POI recommendation.