QoE-Driven Multipath Video Conferencing
Computation-Driven Perspective
CCS Concepts: • General and reference → Surveys and overviews; • Information systems → Comput-
ing platforms; Multimedia streaming; • Networks → In-network processing;
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government
(MSIT) (No. 2021R1G1A1008105). The work of Anh-Tien Tran and Sungrae Cho was supported by the National Research
Foundation of Korea (NRF) under Grant NRF-2019R1A2C1090447 funded by the Korea Government (Ministry of Science
and ICT). The work of Tran Thien Thanh and Vo Nguyen Quoc Bao was funded by the Vietnam National Foundation for
Science and Technology Development (NAFOSTED) under Grant 102.02-2018.320.
Authors’ addresses: N.-N. Dao, Department of Computer Science and Engineering, Sejong University, Neungdong-ro 209,
Gwangjin-gu, Seoul 05006, South Korea; email: nndao@[Link]; A.-T. Tran and S. Cho (corresponding author), School
of Computer Science and Engineering, Chung-Ang University, Heukseok-ro 84, Dongjak-gu, Seoul 06974, South Korea;
emails: attran@[Link], srcho@[Link]; N. H. Tu, Department of Smart Energy Systems, Seoul National University
of Science and Technology, Gongneung-ro 232, Nowon-gu, Seoul 01811, South Korea, and Department of Computer Engi-
neering, Ho Chi Minh City University of Transport, Vo Oanh 2, Binh Thanh District, Ho Chi Minh City 710372, Vietnam;
email: ngohoangtu@[Link]; T. T. Thanh, Department of Computer Engineering, Ho Chi Minh City University of
Transport, Vo Oanh 2, Binh Thanh District, Ho Chi Minh City 710372, Vietnam; email: [Link]@[Link]; V. N. Q. Bao
(corresponding author), Wireless Communications Department, Posts and Telecommunications Institute of Technology,
Nguyen Dinh Chieu 11, District 1, Ho Chi Minh City 710372, Vietnam; email: baovnq@[Link].
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and
the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored.
Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires
prior specific permission and/or a fee. Request permissions from permissions@[Link].
© 2022 Association for Computing Machinery.
0360-0300/2022/11-ART202 $15.00
[Link]
ACM Computing Surveys, Vol. 54, No. 10s, Article 202. Publication date: November 2022.
202:2 N.-N. Dao et al.
Additional Key Words and Phrases: Live video streaming, in-network computing, cloud computing, time-
sensitive services
1 INTRODUCTION
The popularity of multimedia distribution platforms such as YouTube, Netflix, Twitch, and Face-
book Live has led to an exponential increase in emerging social networking paradigms. In addi-
tion, recent user devices equipped with various computational capabilities and display resolutions
adequately accommodate user satisfaction with video quality adaptation on demand. The technical
advancements and convenience of multimedia delivery services have had a significant influence
on market expansion [95]. For instance, personal live streaming content is simply produced
to be posted on Facebook by any person who has basic knowledge of using digital devices. At
home, smart TVs provide us with live entertainment content broadcast through various installable
streaming apps released by content providers as well as third parties. In addition, online learn-
ing and meeting platforms such as Zoom, Cisco Webex, Google Meet, and Microsoft Teams are
playing an essential role in supporting remote collaborations amid the coronavirus disease of 2019
(COVID-19) pandemic by offering live video conferencing services. These analytical observations
imply that live video streaming (LVS) services are expected to retain their dominance in Internet
services in the coming years [52].
From a technological perspective, LVS refers to a video delivery service that simultaneously
records and broadcasts media content to all users in real time. To offer convenient service experi-
ences, LVS is typically implemented on the Internet infrastructure using web transfer protocols to
synchronously distribute video packets via multiple paths [181]. In a modern LVS system, hetero-
geneous user demands and preferences are supported by adaptive bitrate streaming (ABS) services
that enable networks to dynamically adjust the quality level of the videos according to environ-
mental conditions and resource availability. Network elements on the path play essential roles in
optimally delivering, possibly transcoding, automatically editing, and temporarily caching video
content during streaming operations. However, these network elements are typically limited by
computation and storage resource constraints, which may prevent their efforts to perform these
tasks. In addition, the instability and uncertainty of network conditions negatively affect the adap-
tation capability of the networks to dynamically adjust the video bitrate [34, 161]. Hence, although
existing approaches have demonstrated several impressive advantages, LVS studies still attract
considerable attention from both the research community and industry with the aim of improving
the quality of experience (QoE), quality of service (QoS), and performance of LVS. Therefore, a
contemporary review of cutting-edge studies on improving LVS performance is crucial to direct
ongoing and future work in the field.
In the past, the literature has encompassed several surveys conducted on streaming ser-
vices [16, 46, 179]. However, almost all existing studies have focused on streaming services by
considering specific domains, such as delivery protocols, video applications, and performance metrics.
In addition, they have not taxonomized LVS and on-demand video streaming in the general
streaming service category. Table 1 presents a summary of recent streaming service surveys. It is
observed that the related work has its own limitations in responding to the aforementioned re-
search questions in two ways: (i) the scope of existing studies covers streaming services in general
instead of distinguishing the LVS, and (ii) in-network computation capabilities have not yet been
A Contemporary Survey on Live Video Streaming 202:3
considered as a major field of survey. As emerging edge-cloud computing paradigms have recently
been integrated into every Internet service [33], these areas on which existing studies are rare have
inspired us to conduct a contemporary survey on LVS from a computation-driven perspective.
To provide a comprehensive outlook of computation-driven LVS research, our survey was con-
structed as follows.
• First, we provide an overview of state-of-the-art commercial LVS platforms. We examine the
service qualities offered by various LVS providers, such as video dimension, maximum file
size, maximum duration, total storage, and compatible formats. Our observations reveal
emerging trends in LVS services. The details are provided in Section 2.
• Second, we provide an overview of LVS services. In particular, global recommendations and
standards managed by international organizations are described. Adopting the standards, the
LVS system architectures, along with their service components and functions, are clarified.
We then present well-known streaming protocols integrated into LVS systems. The details
are provided in Section 3.
• Third, hierarchical computation-driven LVS models are investigated that are further classi-
fied into cloud-based, edge-based, peer-to-peer (P2P)-based, and hybrid streaming categories.
Here, the exploitation of relevant computing capabilities to assist LVS services at different
locations on the video delivery paths is anatomized. The details are provided in Section 4.
• Fourth, to evaluate the improvements of cutting-edge LVS solutions, we divide these works
into several groups by performance metrics such as service availability (SA), video bitrate,
end-to-end (E2E) latency, network QoS/QoE, system serviceability, hit ratio, resource con-
sumption, and security and privacy. The details are provided in Section 5.
• Fifth, from previous analytical observations, we present open challenges to drive ongoing
and future research toward LVS advancements and popularity. The details are provided in
Section 6.
The main contributions of this study are as follows. This survey provides a reference framework
for interested readers, along with cutting-edge knowledge and studies regarding LVS services.
From a computation-driven perspective, three technical areas constituting the LVS were system-
atically investigated, including standard architectures, computing-assisted models, and metrics of
performance evaluations. The lessons learned are summarized and discussed at the end of each
section. In addition, open challenges are highlighted to support future research.
HTML5 video player, CMS and analytic tools, and transcoding. No coding is required, and website
and app support for mobile and television, monetization, and security options are available [132].
The drawbacks are a complex CMS platform and poor integration. Stream Shark’s benefits include
global multi-CDN services, mobile compatibility, viewer reports, video encoding,
monetization, and privacy options [155]. Note that video analytics and embeddable playlists
are not included in Stream Shark’s service offering. Finally, an API for further integration,
CMS, access and secure portal management, and analytic tools are all supported by
Panopto despite the lack of customizable templates and an image editor [135].
conveyed through the HTTP protocol as a series of segments rather than a bulk file. The clients
measure the current Internet connection speed and level of playback buffer to choose the next seg-
ments with the expectation that the next video segments are always available before the current
video segment expires.
DASH possesses its own shortcomings, especially at peak times, when multiple DASH clients
must compete for shared network resources, such as bandwidth. Specifically, the research com-
munity has thoroughly investigated solutions for the problems of QoE unfairness among clients,
the destructive influence of bitrate switching, screen freezes and initial delay, network resource
under-utilization, outdated information in media presentation description (MPD) after a network
failure or reconfiguration, or undesirable interactions and oscillations among DASH clients com-
peting for the same bandwidth. These problems constitute a serious concern for video content
providers and network operators and become worse in diversified environments. To alleviate
these problems, server and network-assisted DASH (SAND), an extension of the MPEG-DASH
standard finalized in 2017, was proposed with the aim of enhancing the delivery of DASH content [90].
The SAND architecture has three broad categories of elements: (i) DASH
clients, (ii) DASH-aware network elements, and (iii) regular network elements. Correspondingly,
it requires three interfaces that bear diverse types of messages: (i) metrics and status (from clients
to DASH-Aware Network Elements (DANE)), (ii) parameters enhancing delivery (PED) (among
DANEs), and (iii) parameters enhancing reception (PER) (from DANE to clients). All of these mes-
sages are referred to as SAND messages. SAND messages are not necessarily sent simultaneously.
Clients inform the DANEs of their current status via
status messages. For instance, the client apprises the cache server of the specific segments it is
likely to download; the cache server then proactively prefetches them and immediately
serves the segments as soon as the actual request from the client arrives. This process is expected to
enhance the cache hit ratio on the server and the perceived QoE of clients. The cache server informs
associated clients regarding available segments via PER messages. The DASH clients may consider
these messages as a suggestion for the selection of future requests to retain a stable and continuous
streaming experience. Consider a live streaming scenario wherein a large number of DASH clients
expect to watch the same content, for example, sports events/live concerts, and each DASH client
possesses different capabilities in terms of network conditions. The QoE of clients can instantly
deteriorate because the cache server cannot prefetch all requested segments owing to bandwidth
and/or storage shortage. PER messages can help to lift this burden by notifying the clients of
the available segments so that DASH clients can properly modify their requests. The server can
communicate information regarding the streamed video to the network delivery element/node
using a PED message. However, the SAND specification does not provide PED messages in the
primary edition.
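The status/PER exchange described above can be illustrated with a minimal sketch. It is not the SAND message syntax: the `CacheServer` class, the segment labels, and the `choose_request` helper are hypothetical names introduced here, and the cache capacity is an assumed stand-in for the bandwidth and storage limits mentioned in the text.

```python
from dataclasses import dataclass, field


@dataclass
class CacheServer:
    """Hypothetical DANE acting as a cache; capacity counts whole segments."""
    capacity: int
    cached: list = field(default_factory=list)

    def on_status_message(self, anticipated):
        """Status message (client -> DANE): prefetch the anticipated segments."""
        for seg in anticipated:
            if seg not in self.cached and len(self.cached) < self.capacity:
                self.cached.append(seg)  # stands in for fetching from the origin

    def per_message(self):
        """PER message (DANE -> client): advertise the segments already cached."""
        return set(self.cached)


def choose_request(preferred, per_available):
    """Client side: pick the first preferred segment the cache already holds,
    otherwise fall back to the top preference (served from the origin)."""
    for seg in preferred:
        if seg in per_available:
            return seg
    return preferred[0]


# The cache can hold only two of the three anticipated segments.
cache = CacheServer(capacity=2)
cache.on_status_message(["seg5@1400k", "seg5@2200k", "seg5@800k"])
available = cache.per_message()

# A client that would prefer 800 kbps adapts its request to what is cached.
request = choose_request(["seg5@800k", "seg5@1400k"], available)
```

Here the PER message acts only as a hint: the client remains free to ignore it, which matches the description of PER messages as suggestions.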
In addition, the Video Quality Experts Group (VQEG) [165] developed a toolchain called StreamSim
[164] to simulate the streaming environment for research purposes. This toolchain performs
five separate tasks: video encoding, streaming, loss insertion, payload
extraction, and decoding. Each task can be configured individually, and additional features
can be considered. Specifically, packet loss, delay, or jitter can be simulated via predefined
network configurations, and the raw video material can be encoded with different settings and
transmitted. The decoded transmitted videos can then be compared with the original video.
MPD [71]. These media segments comply with the media format the system is associated with and
enable playback either independently or when combined with other segments. Initially, the client
sends an HTTP request to the DASH server and receives the corresponding MPD file. The main
concerns at the server side are optimal encoding, choice of available representations, and segment
length (where selectable). The selected segment length should satisfy two contradictory require-
ments. It should be long enough to maintain a low data overhead and short enough to quickly
react to the oscillating network conditions. The segments can be cached for future requests as
they traverse the base stations (BSs). The segments easily traverse firewalls using
HTTP messages; they then fill the playback buffer and are decoded and played by media players (such as
THEOPlayer [158], [Link] [22], Flowplayer [1], Clappr [29], JWplayer [136], Bitmovin [84], and
VLC media player [137]).
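The two contradictory requirements on segment length can be made concrete with a small calculation. This is a sketch under stated assumptions: the fixed per-request overhead of 800 bytes (HTTP headers and similar) is a hypothetical figure, and the worst-case reaction time is approximated by one segment duration because the client can only switch quality at a segment boundary.

```python
def overhead_ratio(segment_s, bitrate_kbps, request_overhead_bytes=800):
    """Fraction of transferred bytes spent on per-request overhead.

    request_overhead_bytes is an assumed constant, not a measured value.
    """
    payload_bytes = bitrate_kbps * 1000 / 8 * segment_s
    return request_overhead_bytes / (payload_bytes + request_overhead_bytes)


def worst_case_switch_delay_s(segment_s):
    """A quality switch can only take effect at the next segment boundary."""
    return segment_s


# Compare candidate segment lengths at 1400 kbps.
tradeoff = {
    s: (overhead_ratio(s, 1400), worst_case_switch_delay_s(s))
    for s in (1, 2, 4, 10)
}
for seconds, (ratio, delay) in tradeoff.items():
    print(f"{seconds:>2} s segments: overhead {ratio:.4%}, switch delay up to {delay} s")
```

Longer segments shrink the overhead ratio but lengthen the time before the client can react to a throughput change, which is the tension the server-side choice has to balance.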
The viewing process will be interrupted if there are no remaining segments in the playback
buffer, leading to degradation of the user’s experience. To decrease the frequency and duration of
stalling events (screen freezes owing to an empty playback buffer), the adaptation engine always
updates technical parameters, including the buffer status, current playback time, and achievable
throughput to properly determine the bitrate of the next video segment [148]. As shown in Figure 1,
even for the scenario of highly fluctuating network environments, the DASH client is expected to
actively adapt the quality of future video segments. The throughput is initially good enough
to deliver the initialization segments at the highest quality (2200 kbps). It then drops to
a lower level, so a reduced video quality (1400 kbps) is served to avoid emptying the playback buffer.
Subsequently, the bandwidth improves and then abruptly decreases. All of these
abnormal changes in network throughput can be quickly observed at the DASH client side, and
an immediate response can be determined by the adaptation engine. In particular, if any reduction
in bandwidth is detected, the DASH client may downgrade the video quality and size to
prevent buffer emptiness and retain a seamless media consumption experience. Conversely,
if the bandwidth increases, it can request a higher visual quality, thereby achieving better QoE.
The switching among different representations can be monitored during the playback because the
segments corresponding to respective quality can be requested separately and then merged at the
client side. The adaptation engine inside the DASH client updates the MPD file and sends it back
to the DASH server.
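The behavior of the adaptation engine can be sketched as a simple rule that combines achievable throughput and buffer status. This is an illustrative heuristic, not the logic of any particular player: the 2200 kbps and 1400 kbps levels come from the example above, whereas the 800 kbps level, the 5-second low-buffer threshold, and the 0.8 safety factor are assumptions.

```python
# Available representations in kbps; 800 is an assumed extra low-quality level.
BITRATES_KBPS = [800, 1400, 2200]


def next_bitrate(throughput_kbps, buffer_s, low_buffer_s=5.0, safety=0.8):
    """Choose the bitrate of the next segment.

    If the playback buffer is nearly empty, request the lowest bitrate to
    avoid a stall. Otherwise request the highest bitrate that fits within a
    safety fraction of the measured throughput.
    """
    if buffer_s < low_buffer_s:
        return BITRATES_KBPS[0]
    budget = throughput_kbps * safety
    feasible = [b for b in BITRATES_KBPS if b <= budget]
    return max(feasible) if feasible else BITRATES_KBPS[0]


# Good throughput, then a drop, then a drained buffer.
print(next_bitrate(3000, buffer_s=20))  # highest quality
print(next_bitrate(1800, buffer_s=20))  # reduced quality
print(next_bitrate(3000, buffer_s=2))   # buffer protection
```

The safety factor makes the client slightly conservative, which reduces the oscillation between representations when the throughput estimate is noisy.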
As defined in the latest standard on adaptive video streaming, that is, IEEE 1857.7 [71], an MPD
file is an extensible markup language (XML) document containing metadata for accessing seg-
ments and providing streaming media services for users. The metadata of the video segments
include segment durations, video/audio codec, bitrate, and video spatial resolution. The format of
the segment conforms to ISO/IEC 13818-1 [87], GB/T 20090.1 [50], or GB/T 17975.1 [49]. A com-
plete MPD schema and details of the MPD are presented in the IEEE 1857.7 standard [71]. Each
MPD file consists of one or multiple periods (high-level time interval of the media presentation)
and can be fragmented and partially delivered if sudden network impairments occur unexpectedly.
The MPD can be updated proactively by clients during the streaming session. Periods determine
the beginning and end times of each part of the media presentation and can be used to insert adver-
tisements and content segments. Each period contains one or more adaptation sets, each of which
is a set of compatible encoded versions of media presentations. Each adaptation set contains one
or more perceptually equivalent representations and can construct media streams with the same
media content components. Seamless switching across diverse representations is enabled
by equipping the adaptation set and its contained representations with sufficient information. The
adaptation set also specifies the maximum and minimum bandwidths, widths, heights, and frame
rates of their representations. Therefore, DASH can easily support a wide range of devices with
different settings and capacities. Each representation is either a complete set or a subset of media
content components.
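The period, adaptation set, and representation hierarchy can be illustrated by walking a tiny MPD document. The sketch below assumes MPEG-DASH-style element names and the `urn:mpeg:dash:schema:mpd:2011` namespace; the exact schema defined in IEEE 1857.7 may differ, and the sample document, identifiers, and `list_representations` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# A hand-written, minimal MPD used only for illustration.
MPD_XML = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic">
  <Period id="p0" start="PT0S">
    <AdaptationSet mimeType="video/mp4" minBandwidth="1400000" maxBandwidth="2200000">
      <Representation id="v-low" bandwidth="1400000" width="1280" height="720"/>
      <Representation id="v-high" bandwidth="2200000" width="1920" height="1080"/>
    </AdaptationSet>
  </Period>
</MPD>"""

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}


def list_representations(mpd_text):
    """Return (period id, representation id, bandwidth in bit/s) tuples."""
    root = ET.fromstring(mpd_text)
    rows = []
    for period in root.findall("dash:Period", NS):
        for aset in period.findall("dash:AdaptationSet", NS):
            for rep in aset.findall("dash:Representation", NS):
                rows.append(
                    (period.get("id"), rep.get("id"), int(rep.get("bandwidth")))
                )
    return rows


representations = list_representations(MPD_XML)
for row in representations:
    print(row)
```

A client would use such a listing to decide which representation to request for the next segment, given its device capabilities and measured throughput.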
Representations can be encoded with different video codecs, allowing battery-powered devices
to choose older codecs to reduce battery usage. The DASH clients might override the choices of
quality of perceived video to satisfy their own preferences, such as willingness to have possible
video stalls in exchange for higher quality or degradation of video quality for the sake of smooth-
ness. The segments within a representation are optional for decoding or restoring representations.
Moreover, if the segments are perfectly time aligned, smooth switching can be achieved. Note that
stream access points (SAPs) indicate the position in a representation from which clients can be-
gin playback of a media stream utilizing solely the enclosed information in representation data
initiating from that position onward.
latency. WebRTC is a free project supported by Google, Mozilla, and Opera, among others. It aims
to provide browsers and mobile applications with real-time communication capabilities (ultra-low
latency of 0.5 seconds) via simple APIs. (By comparison, the Apple Common Media Application
Format (CMAF), which serves the same purpose, provides a low-latency standard of 3–5 seconds.)
SRT was developed and pioneered by Haivision to optimize streaming performance across fickle
networks with secure streams (empowered by AES [127]) and easy firewall traversal. Although
any data type can be transferred via SRT, the protocol is particularly optimized for audio/video
streaming. Haivision and Wowza founded a consortium (SRT Alliance) dedicated to the continued
development and adoption of the protocol; its current membership numbers more than 170.
The differences between the current streaming protocols are listed in Table 2. The term codec-
agnostic means that the related protocol supports all codecs. Note that the common MPEG encryp-
tion schemes, CENC and CBCS, are mutually exclusive. Specifically, encrypted content according
to CENC cannot be decrypted by a system supporting only the CBCS scheme, and vice versa.
The following properties are of high importance in the context of this survey: data description,
video/audio codec, playback compatibility, and encryption.
were chosen based on the average popularity of views. To prove the efficiency of the proposed sys-
tem, the optimal, Top-N , and FBRS resource allocation algorithms were compared. Specifically, the
QoE was investigated in terms of the computational instances and bandwidth. The results show
that the proposed optimal algorithm yielded the highest values among the three algorithms.
For large-scale live video providers (e.g., [Link], YouTube Live), a geo-distributed cloud
infrastructure is a lower-cost model than that in [19]. In this paradigm, the CLS service can be deployed
by multilevel cloud sites distributed across different global geographical locations. Each cloud site
resides in a data center composed of interconnected and virtualized servers. The server resources
will be provisioned for CLS, for example, computation resources for collective production and
transcoding. As shown in Figure 2(b), the single-source CLS of a crowdsourcer is uploaded to the
highest-level cloud site (cloud site level 1). These servers at cloud site level 1 perform the function
of source video collection and scheduling decision-making. Based on a specific optimal algorithm,
the CLS videos are forwarded to the allocated cloud instances in cloud level 2. Subsequently, the
original source stream is transcoded into a target-quality version and then broadcast to viewers.
Bilal et al. [20] presented a cost-effective QoE-driven video control plane to choose an appropri-
ate transcoding location (cloud site) and video representations to minimize overall system cost
in terms of the viewer’s available bandwidth, average latency between viewers, and transcoding
location, including switching delay, required video quality, and resource availability per cloud site.
Two algorithms are proposed: an optimal algorithm and a heuristic called greedy minimal
cost (GMC). However, the GMC heuristic rarely achieves optimum streaming because
it cannot adapt to changes in load or users’ behaviors.
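To show why a greedy cost heuristic can fall short of the optimum, the following sketch assigns each viewer request to the cheapest cloud site that still has capacity and meets a latency bound. It is a generic greedy assignment written for illustration and is not the GMC algorithm of [20]; the site names, costs, capacities, and latency values are all hypothetical.

```python
def greedy_minimal_cost(requests, sites):
    """Assign each request, in arrival order, to the cheapest feasible site.

    requests: list of dicts with 'id', 'load', 'max_latency_ms', and
              'latency_ms' (mapping site name -> latency to that site).
    sites:    dict mapping site name -> {'capacity': int, 'unit_cost': float}.
    Returns a dict mapping request id -> site name (or None if unserved).
    """
    remaining = {name: s["capacity"] for name, s in sites.items()}
    assignment = {}
    for req in requests:
        feasible = [
            name
            for name in sites
            if remaining[name] >= req["load"]
            and req["latency_ms"][name] <= req["max_latency_ms"]
        ]
        if not feasible:
            assignment[req["id"]] = None
            continue
        best = min(feasible, key=lambda n: sites[n]["unit_cost"] * req["load"])
        remaining[best] -= req["load"]
        assignment[req["id"]] = best
    return assignment


sites = {
    "eu": {"capacity": 2, "unit_cost": 1.0},
    "us": {"capacity": 5, "unit_cost": 2.0},
}
requests = [
    {"id": "r1", "load": 2, "max_latency_ms": 100, "latency_ms": {"eu": 40, "us": 120}},
    {"id": "r2", "load": 1, "max_latency_ms": 100, "latency_ms": {"eu": 30, "us": 90}},
    {"id": "r3", "load": 1, "max_latency_ms": 100, "latency_ms": {"eu": 200, "us": 300}},
]
plan = greedy_minimal_cost(requests, sites)
print(plan)
```

Because each decision is fixed as soon as it is made, the outcome depends on the arrival order and is never revisited when load or viewer behavior changes, which is the kind of limitation the text attributes to greedy decisions.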
Although the system in [20] was focused on viewers’ aspects, the work in [41] provided a joint
solution for reducing operational costs, including video transcoding cost, bandwidth cost, VM
rental cost, and video distributions, for CLS providers in terms of data center selection fitting for
both crowdsourcers and their viewers. An optimal online strategy based on the Lyapunov optimiza-
tion framework was proposed for a geo-distributed cloud platform that can work cost-effectively
while ensuring good QoE for users. The source data center, which is the data center selected for
a crowdsourcer, the targeted data centers that are selected to deal with their viewer requests, and
the interaction delay between them are considered as the input controls to build a specific video
distribution path for each of the CLS services that the crowdsourcer is using at the same time. This
online algorithm can be executed in parallel to serve each crowdsourcer independently.
Applying machine learning (ML) or reinforcement learning (RL) to seek more precise solu-
tions in resource allocation has recently become a popular trend in cloud-based CLS research
[13, 60]. These works presented forecasting models applying ML to minimize the cost to the con-
tent providers while providing a maximum QoE level for users by solving the over-provisioning
of resources. The model in [59] concentrated on assigning storage resources, whereas the model
in [60] focused on computation resources for video transcoding. The total cost was considered,
including the total storage cost, total serving request cost, and total migration cost (i.e., the total
cost of moving a video replica through cloud sites). Based on the metadata shared in [12], Haouari
et al. [59, 60] set up an offline database in which each geo-distributed cloud site has a collection
of near viewers for each incoming live video. To minimize both the start-up delay of the video
transmission and the cost for the content provider, the storage resources are allocated as close as
possible to the viewers. In a more complicated formulation than the optimization problem in [59], the database
in [60] classifies the user collection by video bitrate representation. This database is
used to make decisions to allocate optimal transcoding resources for each near user of a cloud site
to minimize the overall system cost while maximizing the viewer’s QoE. To proactively reserve
the exact transcoding resources for incoming live videos, ML was adopted to build distributed
time-series resource forecasting models. Simulations were conducted to evaluate the performance of the proposed
system, including the optimal cost and average latency in terms of renting hours.
Specifically, five ML algorithms were applied to the predictive models (i.e., long short-term memory,
gated recurrent unit, convolutional neural network, multilayer perceptron, and XGBoost).
Similarly, the authors of [13] applied RL to build an online and proactively predictive model,
called reinforcement learning for online and proactive resource allocation (RL-OPRA), to address
minimum operational cost (i.e., rental cost, dispatching and migration cost, and serving cost)
optimization. This model outputs a database of the popularity of live videos that are based on video
features (e.g., broadcasters, category, creation time, and date) at different geo-located cloud sites.
This predictive model is used to select the relevant data centers located in clouds while offering
the best QoE for live streaming viewers by reducing perceived delays. The proposed RL-OPRA
predictive model is deployed in a centralized master server that orchestrates resource allocation.
The work also showed that the RL methods can give the same result as the optimal solution and
provide a better result than greedy decisions such as the GMC algorithm. Furthermore, the RL
approach was utilized [13] to continue learning to adapt to any system fluctuation.
4.1.2 Multi-party Interactive Live Streaming. Online video conferencing services have been
widely deployed for virtual, face-to-face communication among separate parties, especially in the
ongoing COVID-19 situation. The use of this kind of communication can also reduce travel expen-
diture for not only global companies but also individuals. Other applications for online multime-
dia conferencing services include distance learning, online video meetings, and multimedia mul-
tiplayer online games. Unlike the CLS service, the LVSs in multimedia conferencing are two-way
streams instead of one-way streams. Application users taking part in an online conference send
their LVS and receive the LVSs from all the other participants concurrently. Cloud computing solu-
tion development for providing multimedia conferencing services can be classified into three key
directions based on each part of the service provision: Software-as-a-Service (SaaS), Platform-as-a-
Service (PaaS), and Infrastructure-as-a-Service (IaaS). Each solution development targets specific
users and involved objects as follows.
In cloud-based architecture, a harmonization among SaaS, PaaS, and IaaS is of importance for
optimal operation throughout the whole network. To solve this problem, a joint PaaS and IaaS
architecture is proposed in [152] along with novel APIs at PaaS and a conferencing substrate at IaaS.
This holistic architecture works efficiently, allowing multiple conference application providers to
share one conferencing service at SaaS with the same service characteristic, either audio or video. It
also provides on-the-fly scaling of the running conference features under the required QoS. To this
end, the memory and CPU resources are integrated into the total amount of allocated resources
to fit the needs of all participants. To verify the performance of the proposed system, measure-
ments based on system performance metrics (i.e., resource allocation, scale time, conference start
time, and participant joining time) were conducted under both suboptimal and over-provisioned
conditions. This model can be used by multiple-level application providers, experts as well as non-
experts. Furthermore, this model was investigated in terms of the efficient resource allocation
solution for media handling services, including video mixing, transcoding, and compressing by
solving an integer linear problem and its heuristic in [153].
Fig. 3. Transcoding and forwarding LVS P2P-assisted models: (a) Voluntary peers [40]. (b) Paid peers [192].
all users watching the shared live video in the same bitrate are formed into a specific bitrate user
collection (i.e., cluster), named the Bitrate-region, as shown in Figure 3(a). The bitrate of a cluster
obeys the rule: the higher the index of the Bitrate-region, the lower the bitrate. The participants
(i.e., nodes) engaging in the system are categorized into three primary groups.
• Video source: The mobile user who uploads live video after transcoding it to the highest
requested bitrate.
• A leader: The mobile user who downloads the shared video content sent by the video source
or the next upper-level leader and forwards this video content to peers (i.e., followers)
belonging to its Bitrate-region. A leader that does not belong
to the lowest Bitrate-region also transcodes the video content for the next lower-level
leader.
• A follower: The mobile user who downloads video shared by a leader or another follower.
Because a node enters or leaves a cluster at any time while the video service must remain available,
the authors modeled each collection of viewers as a distributed balanced tree. To obtain the optimal
solution for maximizing the liveness of the video service, a rebalancing algorithm is invoked locally
in the cluster to balance device resources (i.e., bandwidth, energy) (i) when the number of nodes
in this tree changes and (ii) periodically, to achieve fairness. The main disadvantage of the
algorithm in [40] is that it does not provide an optimal solution for the entire D2D network. In
addition, the number of hops needed to forward video content has no upper bound, which can
lead to a significant end-to-end latency that may not meet some required QoE goals.
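The clustering rule described above can be illustrated with a minimal Python sketch. It is not the algorithm of [40]: the function names, the choice of the first viewer in each region as leader, and the sample bitrates are assumptions made purely for illustration. The sketch groups viewers into Bitrate-regions, ordered so that a higher region index means a lower bitrate, and lists the transcoding hops between consecutive leaders.

```python
from collections import defaultdict


def build_bitrate_regions(requests):
    """Group viewers by requested bitrate (kbps).

    Returns a list of (bitrate, members) ordered so that region index 0 has
    the highest bitrate and higher indices have lower bitrates.
    """
    by_bitrate = defaultdict(list)
    for viewer, bitrate in requests:
        by_bitrate[bitrate].append(viewer)
    return [(b, by_bitrate[b]) for b in sorted(by_bitrate, reverse=True)]


def leader_chain(regions):
    """List the transcoding hops between consecutive leaders.

    Assumption: the first member of each region acts as its leader.
    Each hop is (upper_leader, lower_leader, upper_bitrate, lower_bitrate).
    """
    leaders = [(bitrate, members[0]) for bitrate, members in regions]
    hops = []
    for (hi_rate, hi_leader), (lo_rate, lo_leader) in zip(leaders, leaders[1:]):
        hops.append((hi_leader, lo_leader, hi_rate, lo_rate))
    return hops


# Hypothetical viewers and requested bitrates.
requests = [("u1", 1200), ("u2", 2500), ("u3", 1200), ("u4", 600), ("u5", 2500)]
regions = build_bitrate_regions(requests)
chain = leader_chain(regions)
```

In this sketch, the length of `chain` grows with the number of distinct bitrates, which mirrors the concern that the number of forwarding hops is not bounded.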
Another interesting research trend in LVS P2P systems is the formation of a cluster of peers that
optimizes the cluster size. A P2P cluster in which peers view the same live channel
can be formed into an alliance that only contributing members are allowed to join. Free-riders,
who benefit from cooperation with other users in D2D networks without contributing, and
redundant streams can drastically degrade playback quality and network performance.
To reduce the influence of free-riders on a P2P live video system, Zhang et al. [184]
presented a solution based on the distance-driven method for constructing a reciprocal P2P topol-
ogy. Specifically, a group of truthful users (i.e., nodes) who contribute to and receive the assistance
of other users in distributing data chunks within a group is formed gradually by the proposed
distance-driven alliance algorithm. This algorithm is invoked in the following cases: (i) when
a node joins the network and each peer is provided and (ii) when a node closer to the alliance
becomes available during runtime. Under these rules, the farthest member is replaced, and
a peer is also replaced if it does not contribute chunks within the timeout period;
this reduces redundant streams and shrinks the topology. Therefore, in contrast to earlier
findings, this algorithm helps D2D networks operate efficiently with a large proportion of public IP
nodes or in communication environments made vulnerable by traffic fluctuations. In addition, the
A Contemporary Survey on Live Video Streaming 202:15
performance of the proposed algorithm in terms of the continuity ratio (i.e., ratio of the received-
before-played chunk count to the total requested chunk count) for different free-rider percentages
is enhanced in comparison with other alliance algorithms, including random alliance, bandwidth-
likeness alliance, and content-likeness alliance.
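The continuity ratio defined above can be computed as in the following minimal Python sketch. The function name and the representation of chunks as arrival times and playback deadlines are assumptions for illustration; a chunk that never arrives is marked with `None`.

```python
def continuity_ratio(arrival_times, playback_deadlines):
    """Ratio of chunks received before their playback time to all requested chunks.

    arrival_times[i] is the arrival time of chunk i (None if it never arrived);
    playback_deadlines[i] is the time at which chunk i is due to be played.
    """
    if len(arrival_times) != len(playback_deadlines):
        raise ValueError("one deadline is required per requested chunk")
    if not playback_deadlines:
        return 0.0
    on_time = sum(
        1
        for arrived, deadline in zip(arrival_times, playback_deadlines)
        if arrived is not None and arrived <= deadline
    )
    return on_time / len(playback_deadlines)


# Hypothetical trace: chunk 2 arrives late and chunk 3 is lost.
ratio = continuity_ratio([0.8, 2.1, None, 3.9], [1.0, 2.0, 3.0, 4.0])
```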
BSs under budget constraints. Mobile user devices (i.e., generators) connected with a certain BS or
an access point, via a platform supplied by a crowdsourced educational and entertaining applica-
tion provider, upload their video content to upload servers (ULSs) attached to BSs. The ULSs with
the video collector modules will forward the video contents of the received generators to appro-
priate download servers (DLSs) via backhaul links and the core network. DLSs then process these
contents and distribute the processed data to viewers. To address the QoS of the video crowdsourc-
ing platform, Huang et al. considered a group of generators cooperatively producing the same
video content, which will then be forwarded to another group of interested users (i.e., viewers).
Moreover, the system operates under budget constraints. The budget needed for content delivery
from generators to viewers consists of two kinds of costs that the application providers must pay:
(i) network data transmission cost, which is charged per byte; and (ii) server rental cost, which
is assessed per unit time in both ULSs and DLSs. The problem to solve is choosing, from a given
number of ULSs and DLSs, the optimal ULS and DLS for each generator and viewer so as to guarantee
the video crowdsourcing experience under multi-level operational budget constraints. To this
end, a server placement and user association scheme was formulated as an NP-hard optimization
problem. To verify the proposed system, the overall E2E delivery time reduction was investigated
in terms of average video size per generator, the number of involved BSs, the number of users per
collaboration group, and types of algorithms (i.e., brute force). In contrast to [192], three different
practical budget cases classified into high, medium, and low levels were examined in [68].
Furthermore, the solution given in [41] was crowdsourcer driven (i.e., multiple viewers are concerned
about watching the content from one source), whereas the solution in [68] aims at content
delivery from multiple sources to multiple viewers. A limitation of this work is that the influence
of intermediate nodes (e.g., BS controllers and mobile switching centers) between the two selected
ULSs and DLSs on the system performance was outside the scope of [68].
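The two cost components of the delivery budget described above can be sketched as follows. This is only an illustration of the cost structure, not the formulation in [68]; the function names, prices, and rental rates are hypothetical.

```python
def delivery_cost(bytes_sent, price_per_byte, uls_hours, uls_rate, dls_hours, dls_rate):
    """Total cost paid by the application provider.

    (i) Network data transmission cost, charged per byte.
    (ii) Server rental cost, charged per unit time (hours here) for ULSs and DLSs.
    """
    transmission = bytes_sent * price_per_byte
    rental = uls_hours * uls_rate + dls_hours * dls_rate
    return transmission + rental


def within_budget(cost, budget):
    """Check a candidate ULS/DLS choice against an operational budget level."""
    return cost <= budget


# Hypothetical numbers: 2 GB delivered, 3 hours of ULS and DLS rental.
cost = delivery_cost(
    bytes_sent=2_000_000_000,
    price_per_byte=1e-9,
    uls_hours=3,
    uls_rate=0.5,
    dls_hours=3,
    dls_rate=0.25,
)
```

A server placement and user association scheme would evaluate such a cost for each candidate ULS and DLS choice and discard those that exceed the budget level under consideration.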
In [192], a CLS cloud-based system operated using viewers’ phones with massive broadcasters.
This peer-assisted model uses idle end-viewers’ resources to transcode immense video data and thereby
offload computation from the cloud. This solution reduces the leasing cost for content
service providers and improves the stability of low-latency LVS services. The system in [192]
operates in multiple regions with one regional data center (or a regional server) located in each
region. The functions of this data center are as follows: (i) receiving the upload CLSs from broad-
casters, (ii) assigning transcoding tasks to either viewers or cloud, and (iii) recollecting transcoded
video and forwarding the processed streams for further delivery (Figure 3(b)). An algorithm based
on certain criteria, such as viewer stability, is used to select promising candidates who can assist
the cloud and will be paid for their resource contribution (i.e., electrical power and computing). To
deal with qualified viewer selection, the authors presented an auction-based approach that can be
implemented in each region to concurrently implement two jobs: (i) enabling the crowd of viewers
to facilitate the transcoding task assignments and (ii) offering a dynamic viewer-driven payment
for these selected viewers under a given budget constraint. If the transcoding assignment cannot
be deployed successfully (i.e., no satisfiable transcoding viewers can be chosen locally), the un-
matched tasks will be directed to the cloud server. After processing the given transcoding task,
the dedicated cloud server sends the transcoded stream back to the region. A prototype with an
online scheduler was built to demonstrate the feasibility of the design, and three
scheduler strategies (i.e., online, baseline, and comprehensive) were compared in terms of the
percentage of stable candidates, total cost, and total number of reassignments. This model
is a valuable research direction because it feasibly exploits idle resources from peers
in exchange for payment. This policy contrasts with that in [40] and is suitable for constructing a
long-lasting relationship between all involved entities in the network. The model could be further
improved by allowing idle viewers to help with more than one job concurrently.
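The assignment flow described above (select paid viewers under a budget and send unmatched tasks to the cloud) can be sketched as follows. This is a deliberately simple greedy rule that accepts the cheapest bids first; it is not the auction mechanism of [192], and the function name, bid values, and one-task-per-viewer restriction are assumptions for illustration.

```python
def assign_transcoding_tasks(tasks, bids, budget):
    """Assign tasks to viewers in increasing order of asking price.

    tasks:  list of task identifiers.
    bids:   dict mapping viewer -> asking price (each viewer takes one task).
    budget: total payment available in the region.

    Returns (assignments, cloud_tasks, spent); tasks that cannot be matched
    locally are directed to the cloud.
    """
    assignments = {}
    remaining = budget
    pending = list(tasks)
    # Sort by price, then by viewer name to keep the result deterministic.
    for viewer, price in sorted(bids.items(), key=lambda kv: (kv[1], kv[0])):
        if not pending or price > remaining:
            break  # bids are sorted, so no later bid is affordable either
        assignments[pending.pop(0)] = viewer
        remaining -= price
    return assignments, pending, budget - remaining


# Hypothetical region with three tasks and three bidding viewers.
assignments, cloud_tasks, spent = assign_transcoding_tasks(
    tasks=["t1", "t2", "t3"],
    bids={"v1": 4, "v2": 2, "v3": 7},
    budget=7,
)
```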
An edge-assisted crowdcast framework, called DeepCast, was proposed in [167]. For crowdcast
content delivery, DeepCast seamlessly integrates many entities, including the cloud, a CDN, and
non-uniform edge servers. In addition, through DRL, it automatically determines the most relevant
strategies for viewer assignment and transcoding at edges. This proposed framework proved its
effectiveness for better personalized QoE and lower cost for crowdcast systems. In this system, a
broadcaster uploads a raw stream to a platform’s service center (i.e., the cloud). Next, the original
stream is encoded and compressed into multiple-bitrate streams and pushed into the CDN servers.
By using the WebRTC protocol or other proprietary protocols for multimedia streaming, service
providers can provide interactive streaming services with a tight latency demand. The high-quality
versions of streams from CDN servers are then forwarded to the edge servers through HTTP.
These edge servers will possibly transcode the received data streams to low-quality versions in
response to the different bitrate requests of viewers. To fulfill the joint requirements of minimizing
the system cost and optimizing the viewers’ personalized QoE, the regional edge can serve the
viewer itself or ask for help from another edge or the CDN. To achieve low channel switching
latency, the nearer of these two entities is chosen. To this end, the authors proposed
a data-driven DRL-based approach located in an edge system that can automatically learn
from the network and viewer information to make intelligent decisions without any predefined
rules. Specifically, DeepCast applies the state-of-the-art asynchronous advantage actor–critic
model [130] as the learning model. The three QoE metrics used in [167] are streaming delay,
channel switching latency, and bitrate mismatch level (i.e., a function of the difference between the
target version of a viewer and the actual assigned version). Thus, the optimization objective is to
minimize the overall penalty, which combines the QoE penalties and the system cost. Compared with
other learning models, namely a deep Q-learning network (DQN) with its variants 1-step-DQN and
n-step-DQN, and Q-learning, the proposed system achieved a lower overall penalty.
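To make the structure of such an objective concrete, the following Python sketch combines the three QoE metrics and the system cost into a single penalty. The exact penalty function of [167] is not reproduced here: the weighted-sum form, the default weights, and the normalized absolute difference used for the bitrate mismatch are assumptions for illustration.

```python
def bitrate_mismatch(target_kbps, assigned_kbps):
    """Assumed mismatch level: normalized gap between target and assigned versions."""
    return abs(target_kbps - assigned_kbps) / target_kbps


def overall_penalty(
    streaming_delay_s,
    switch_latency_s,
    target_kbps,
    assigned_kbps,
    system_cost,
    weights=(1.0, 1.0, 1.0, 1.0),
):
    """Weighted sum of the QoE penalties and the system cost (lower is better)."""
    w_delay, w_switch, w_mismatch, w_cost = weights
    return (
        w_delay * streaming_delay_s
        + w_switch * switch_latency_s
        + w_mismatch * bitrate_mismatch(target_kbps, assigned_kbps)
        + w_cost * system_cost
    )


# Hypothetical viewer: wants 3000 kbps but is served the 1500 kbps version.
penalty = overall_penalty(2.0, 0.5, 3000, 1500, system_cost=1.0)
```

A learning agent such as the actor-critic model mentioned above would use the negative of this value as its reward signal.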
5 PERFORMANCE METRICS
5.1 Service Availability
According to ITU-T E.860 [91] and X.140 [92] recommendations released by the International
Telecommunication Union (ITU), SA refers to the probability that the system can work over time
to provide services to the satisfaction of users whenever and wherever the services are required.
In the context of LVS systems, SA metrics are alternatively measured by stalling duration over the
total playback periods. For example, Dantas et al. investigated video-on-demand streaming services
hosted in a cloud computing environment, which can readily be extended to LVS services by further
considering the E2E latency [32]; here, hierarchical modeling techniques based on Markov chains were
used to deal with the complexity of representing such a system, focusing on the virtual machine and
specific application components (e.g., web server and database server) required for video playback.
In [32], an SA of 0.9881 was achieved, which corresponds to a downtime of 104.24
hours per year. Meanwhile, Bezerra et al. [17] conducted an experiment to analyze the Eucalyptus
platform for a video-on-demand streaming system under cloud computing support, in which (i)
Eucalyptus is an open-source cloud middleware that is beneficial to the private cloud platform
and (ii) the continuous-time Markov chain with reliability block diagrams was utilized to evaluate
the SA metric as well as potentially demonstrate the extensive capability of LVS. The numerical
results in [17] showed an SA of 0.988571 with an unavailability of 100.11 hours per year. To achieve
a higher SA for the LVS service, in [123], Melo et al. proposed a redundant node architecture, in
which the secondary node controller (NC) has the same software and hardware specifications as
the primary NC and becomes active only when the primary NC fails. The results in [123] confirmed
that the achievable SA and annual downtime with the redundant node architecture were 0.990434
and 83.798 hours, respectively. Extending this work, Melo et al. [122] further investigated the
Eucalyptus cloud platform along with the design of experiments and the use of percentage difference
to identify availability bottlenecks. Numerical results in [122] revealed that the SA
reaches 0.994401 and that the annual downtime falls to only 49.05
hours, that is, 2.04375 days of downtime per year. However, the redundant node
architecture’s utilization has some trade-offs among SA achievement, downtime, cost, and compu-
tational/employable complexity compared with the conventional approach. Furthermore, in [9],
by additionally considering the occurrence of software aging issues in a web browser plug-in for
cloud-based LVS services via two rejuvenation strategies, substantial performance improvement
was achieved, including (i) the time-based rejuvenation strategy with an SA of 0.9999359, corresponding
to 0.561516 h (33.69 min) of annual downtime, and (ii) the prediction-based rejuvenation strategy
with an SA of 0.9999361, corresponding to 0.559764 h (33.59 min) of downtime per year. In the proposed
framework, the continuous-time Markov chain was leveraged to predict the resource utilization
ahead of time, whereas an automated workload simulated the access behaviors of YouTube users.
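The annual downtime figures quoted in this subsection follow directly from the SA values, assuming a 365-day year (8,760 hours). The short Python sketch below makes the conversion explicit; the function names are chosen for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def annual_downtime_hours(sa):
    """Expected downtime per year for a service availability sa in [0, 1]."""
    if not 0.0 <= sa <= 1.0:
        raise ValueError("service availability must lie in [0, 1]")
    return (1.0 - sa) * HOURS_PER_YEAR


def stalling_based_sa(stalling_s, playback_s):
    """SA measured from playback: one minus stalling time over total playback time."""
    return 1.0 - stalling_s / playback_s


# SA values reported in [32], [123], and [122], respectively.
downtimes = [annual_downtime_hours(sa) for sa in (0.9881, 0.990434, 0.994401)]
```

For the three SA values above, the function yields approximately 104.24, 83.80, and 49.05 hours per year, which agrees with the figures reported in the cited studies.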
(e.g., propagation, radio access, queuing, and reordering delay) [4]. Among these factors, (i) the
holding time is the time needed to process or handle video frames on both
the transmit and receive sides, (ii) the transmission and radio access delay is the time taken by the
physical radio interface hardware to map the data from packets to bits, (iii) the propagation delay
comes from the distance between terminals, (iv) the queuing delay refers to packet buffering at the
terminals during transmission, and (v) the reordering delay is caused by LVS on multipath networks.
Consequently, to realize LVS services, the objective is to minimize the E2E delay under
the constraint of a given on-demand video quality, a problem that has recently received
increasing research attention.
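As a simple illustration of how these factors combine, the following Python sketch adds the five delay components and checks the result against a latency target. Treating the E2E delay as a plain sum, as well as the function names and sample values, is an assumption for illustration.

```python
def e2e_delay_ms(holding, transmission, propagation, queuing, reordering=0.0):
    """Sum of the delay components, all in milliseconds.

    holding:      frame processing on the transmit and receive sides
    transmission: transmission and radio access delay
    propagation:  delay due to the distance between terminals
    queuing:      packet buffering at the terminals
    reordering:   packet reordering on multipath networks (0 for a single path)
    """
    return holding + transmission + propagation + queuing + reordering


def meets_latency_target(total_ms, target_ms):
    """Check whether the E2E delay satisfies a given latency target."""
    return total_ms <= target_ms


# Hypothetical single-path session.
total = e2e_delay_ms(holding=40, transmission=10, propagation=5, queuing=20)
```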
Several studies have been conducted from a system model perspective to mitigate the E2E delay
to facilitate LVS systems [69, 105, 108, 151, 156, 171]. For instance, Li et al. adopted an
HTTP/2-based LVS framework to achieve low latency in video streaming, using a model
predictive control frame-dropping algorithm [108]. The results in [108] indicated that the ABS
method not only improves the achievable video quality and smoothness but also reduces the frame
size by 8.06%, leading to a significant reduction in E2E latency. Similarly, the authors of [105]
described two HTTP/2 features, server push and stream termination, that are leveraged in LVS
to lower the packet buffering delay, which was reduced to 2 seconds. Shuai and
Herfet [151] analyzed and obtained a closed-form expression for the average achievable buffering
delay, that is, queuing delay, using the ABS method in the LVS system. Subsequently, Wang et al.
[171] developed the MultiLive ABS algorithm for LVS services, in which the E2E latency was
reduced to approximately 100 ms. DRL approaches have also recently been developed with low
latency in mind for LVS services. The work in [156] developed an ABS algorithm based on DRL, called
DNNStream, which estimates the optimal video bitrate in LVS for ultra-low-latency
purposes. In [69], the quality-aware rate control (QARC) algorithm based on DRL was proposed
for LVS, which not only obtained an 18–25% improvement in the average video quality but also
decreased the average E2E latency by 23–45% compared with Google Hangout [53], Compound TCP [138],
and TCP Vegas [157].
From a transcoding perspective, typical publications that applied ABS based on video coding
standards have significantly reduced the E2E latency for LVS services [98, 103, 142, 145]. The first
attempt was made by the authors of [98]; by applying ABS with SVC to LVS
services, the bitrate was controlled more frequently, resulting in a 38% reduction in coding bitrate
and a reduction in the E2E latency. Kobayashi et al. [103] considered an ABS algorithm for LVS
based on high-efficiency video coding (HEVC), also known as the H.265 video codec,
which provides approximately double the encoding efficiency of SVC, that is, a 56.7%
encoding bitrate improvement. Subsequently, Ryu et al. proposed an extension of HEVC, referred
to as scalable HEVC (SHVC), which is applied to ultra-high-definition LVS [145] with scalability
support and showed a decoding speedup of approximately 20%. Versatile video coding
(VVC) is also a potential approach that supports video resolutions up to 8K (7680 × 4320).
It also conforms to the constraints of LVS applications [142] and is thus suitable for a richer user
experience of LVS services. In [142], the authors’ proposal provided a low initial queuing delay of
approximately 0.21 seconds, which is roughly 10 times lower than that of HTTP/2 in [105].
Utilizing in-network computing capability with a focus on low latency, Bilal and
Erbad [18] employed edge computing for interactive media and video streaming, in which
the latency and response time were minimized while achieving computing,
bandwidth, and energy savings in multimedia applications, transcoding, and video streaming. Similarly,
Yang et al. [177] introduced an end–edge–cloud coordination framework to process LVS frames
from different sources by considering the low-latency constraint as well as LVS analytics
accuracy, LVS quality, and computing resource configuration. In [10], fog architecture was highlighted
for its effectiveness in providing not only low-latency but also ultra-reliable communications for
intelligent transport and video-on-demand scenarios. The authors of [110] leveraged the MEC paradigm
along with flexible transcoding ABS to provide viewers with low-latency video-on-demand
streaming services under limited computing, caching, and bandwidth resources. The
experimental results from [110] showed that the E2E latency is within the low range of 15–75
ms. It is worth noting that these contemporary contributions [10, 110] are promising for extension to
LVS services.
bandwidth utilization than those based on benchmark solutions. In [25], a joint optimization prob-
lem of caching placement, video quality decision, and user association in LVS services under the
dual pricing specification constraint was solved by a convex transformation and a one-step
Lagrangian dual pricing algorithm. The proposed algorithm [25] achieved a remarkable enhancement
of the average QoE per user in MEC-enabled cellular networks. The authors of [37] designed
a hybrid named data networking-based and Internet protocol-based (NDN-IP) prototype via operating
system and networking virtualization techniques for LVS services to utilize network
resources efficiently and achieve a better QoE in terms of APB, BRS, RoB, and
spectrum than conventional baselines.
Recently, ML-based applications have become more powerful artificial intelligence tools to ef-
fectively predict outcomes, in particular for network QoS/QoE measurements, without being ex-
plicitly programmed to do so. Specifically, Tian et al. [159] accelerated the training process of
DRL-based QoE maximization via window completion with historical data and quick-start with a
rate-based algorithm, named Deeplive, for LVS systems, for which QoE measurements were taken
into account in terms of RoB, FVQ, BRS, frame skipping, and latency. According to the experiment
results in [159], Deeplive achieved not only a low training time but also an average QoE
improvement of 15–55% compared with state-of-the-art ABS LVS algorithms. The authors
of [189] studied the user scheduling, transcoding decisions, and computational and wireless spec-
trum resource allocation problems in SDN-based cloud-aided heterogeneous networks, in which
the QoE function, formulated in logarithmic form, was maximized under the constraint
of a time-delay requirement. To tackle the problem of dynamic characteristics of wireless net-
works and the available resources with multidimensional continuous-discrete mixed variables, a
Markov decision model with an online actor-critic learning algorithm was designed, which demon-
strated its superior performance compared with the policy gradient algorithm and deep Q-learning
network. In [116], an ML-based algorithm, ReCLive, was developed to effectively distinguish live
streams from video-on-demand streams using media-request patterns as well as to infer QoE mea-
surements in terms of resolution and RoB for the detected-chunk-attribute LVS. The authors of
[30] introduced an innovative ML-based scheduling solution for omnidirectional LVS systems in
highly dynamic unmanned aerial vehicle (UAV)–based environments. Based on the simulation re-
sults, the proposed methodology [30] has confirmed its effectiveness in terms of QoS provisioning,
packet loss rate, PSNR, and throughput compared with state-of-the-art scheduling benchmarks
(e.g., static prioritization, required activity detection scheduler, and frame-level scheduler).
bits delivered successfully using a unit of resources such as bits-per-Hertz for bandwidth
occupation and bits-per-Joule for energy consumption.
For instance, Jiang et al. proposed the ABS-based fair, efficient, and stable adaptive (FESTIVE)
algorithm for sharing a bottleneck link among multiple streams in [97]; it was shown to improve
the service stability by 50%, fairness by 40%, and efficiency by 10% compared with various real
and competitive commercial players. In [109], Li et al. proposed the ABS-based PANDA algorithm,
which improved the service stability by 75% and was significantly better
in terms of fairness and efficiency than conventional algorithms. However, there are trade-offs
between the service stability, efficiency, and fairness of PANDA when compared with FESTIVE.
An ABS proposal by the authors of [93] provides improved stability, efficiency,
and fairness over the conventional approach by increasing or decreasing the received
bandwidth logarithmically so that it converges to the fair share
bandwidth, that is, the estimated bandwidth. In [133], Shahid Nabi et al. proposed a dynamic
rate-adaptation algorithm, named SHANZ, to balance service stability and efficiency
even under drastic network fluctuations. SHANZ is based on an adaptive step-up function
and a feedback control mechanism. The results of the work in [133] confirmed that the proposed
method achieves a better balance between service stability and efficiency compared
with FESTIVE, PANDA, and another benchmark (i.e., the adaptation algorithm for adaptive
streaming over HTTP, abbreviated as AAASH [128]). To further improve the performance of
both the FESTIVE and PANDA strategies with respect to the stability, efficiency, and fairness
metrics, the authors of [42] and [191] presented the enhanced server and client cooperation (ESTC) and
throughput-friendly DASH (TFDASH) algorithms, respectively. In [42], ESTC allows fast convergence
of different clients’ bandwidth levels to the estimated bandwidth and establishes cooperation
between the server and client sides to appropriately assign the allocated bitrate, where (i) the
client is responsible for making the right bitrate decision to ensure efficiency and service
stability, whereas (ii) the number of connected clients, current download bitrates, and bottleneck
link bandwidth are leveraged at the server side to ensure fairness among competing clients. The
key idea behind TFDASH in [191] is to avoid OFF periods during the downloading process for
all clients by adopting a dual-threshold buffer model, that is, low and high thresholds for
preventing buffer underflow and overflow, respectively, to achieve a good balance among system
serviceability factors.
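The dual-threshold idea can be illustrated with a small hysteresis rule in Python. This is not the TFDASH algorithm of [191]: the rule of stepping the bitrate down below the low threshold and up above the high threshold (so that the client keeps downloading instead of pausing), the threshold values, and the bitrate ladder are all assumptions for illustration.

```python
def next_bitrate_index(buffer_s, current_idx, ladder, low_s=10.0, high_s=30.0):
    """Pick the bitrate index for the next chunk from the buffer level.

    buffer_s < low_s  : risk of underflow, step down one level.
    buffer_s > high_s : risk of overflow, step up one level rather than pause.
    otherwise         : keep the current level for stability.
    """
    if buffer_s < low_s:
        return max(current_idx - 1, 0)
    if buffer_s > high_s:
        return min(current_idx + 1, len(ladder) - 1)
    return current_idx


# Hypothetical bitrate ladder in kbps and a sequence of observed buffer levels.
ladder = [500, 1000, 2000]
idx = 1
trace = []
for level in (20.0, 35.0, 35.0, 5.0):
    idx = next_bitrate_index(level, idx, ladder)
    trace.append(ladder[idx])
```

Keeping the bitrate unchanged between the two thresholds is what provides stability in this sketch, since small buffer fluctuations do not trigger a switch.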
LVS and (ii) the cache value of video chunks was estimated based on its size, popularity, SVC layer,
and FoV existence. The results of [183] confirmed that its achievable hit ratio outperforms LFU,
LRU, and greedy-dual size frequency (GDSF) [28] strategies. Meanwhile, a proposal in [26] that
formulated a caching problem of maximizing the cache hit ratio under storage
capacity constraints revealed significant hit ratio gains over LFU, LRU, and weighted
GDSF [174]. In [187], the authors examined the max–min video utility fairness caching (MUFC)
algorithm, which achieves a better hit ratio than advanced FIFO caching and FairRide caching
[140]. In [139], Poularakis et al. studied the layer-aware cooperative caching (LCC) strategy in
an effort to improve the hit ratio for LVS services compared with independent caching and
Femto caching [150]. The trade-off between the hit ratio and content quality is also a significant
problem, which the authors of [34] addressed with their proposal, referred
to as the hit ratio and content quality balancing algorithm, or HITCOT, in which an edge caching
system is considered for video-based multi-streaming ABS services.
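For reference, the hit ratio used to compare these strategies can be measured as in the following Python sketch, which replays a request trace against an LRU cache, one of the baselines named above. The capacity is counted in chunks for simplicity, and the function name and request trace are hypothetical.

```python
from collections import OrderedDict


def lru_hit_ratio(requests, capacity):
    """Replay a chunk request trace against an LRU cache and return hits / requests."""
    cache = OrderedDict()
    hits = 0
    for chunk in requests:
        if chunk in cache:
            hits += 1
            cache.move_to_end(chunk)  # mark as most recently used
        else:
            cache[chunk] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used chunk
    return hits / len(requests) if requests else 0.0


# Hypothetical request trace of video chunk identifiers.
trace = ["a", "b", "a", "c", "b", "d", "a"]
ratio_small = lru_hit_ratio(trace, capacity=2)
ratio_large = lru_hit_ratio(trace, capacity=3)
```

Replacing the eviction rule (for example, by a value that depends on chunk size and popularity) while keeping the same replay loop gives a like-for-like hit ratio comparison between strategies.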
cooperates with the LVS mCast approach, where mCast has demonstrated a link utilization improvement
of more than 50% and 0% network losses, leading to a reduction in bandwidth
consumption. The authors of [175] further demonstrated the integration
of network function virtualization (NFV), SDN, and mCast in various beneficial
network applications, including online conferencing, LVS, and event monitoring, in which
the network throughput was maximized while minimizing computing and bandwidth resource
consumption. Moreover, SDN and mCast cooperate with scalable ABS to further support
LVS applications and obtain intelligent and dynamic service provisioning, where the equivalent
bandwidth effectiveness was confirmed [176]. The authors of [40] investigated a video transcoding
method for adaptive bitrate LVS, where LVS services are responsible for transcoding a large
number of videos into various bitrate levels to adaptively stream to users. In the proposed work,
an edge-assisted architecture incorporating the LVS ecosystem and mCast distribution was
presented, which not only provides bandwidth and energy resource efficiency
but also ensures fairness and liveness. To further save network resources for LVS, instead of
using unicast or mCast separately, a hybrid architecture was reported in [7]. With the hybrid
architecture deployment, the network not only improved the hit ratio, spectral efficiency, video
quality, frame loss rate, initial buffering time, and number of rebuffering events but also balanced
the unicast and mCast trade-offs, namely (i) the higher network load but lower energy consumption
using unicast and (ii) the lower network load but higher energy consumption using mCast.
Many contemporary studies have recently focused on analyzing the optimization problems of
resource allocation for LVS applications. By invoking the conventional cloud architecture for LVS,
Li et al. [106] proposed a solution for the joint optimization of communication and computa-
tional resource allocation with the aim of maximizing the QoE objective function. Subsequently,
a cloud-based P2P architecture was considered in [66], in which the authors analyzed the optimal
bandwidth allocation problem to provide a high degree of user satisfaction. The edge cloud-based
paradigm with NFV support for LVS was further considered
in [23], in which the QoE objective was maximized under the load-balancing constraints of limited
cloud computing and caching resources, transcoding requirements, throughput, and latency.
As indicated in [110], the coordination of MEC and flexible transcoding ABS
achieves low latency under limited computing, caching, and bandwidth
resources. In the same contribution, the optimization problems were further considered
in terms of (i) joint optimization of access control and resource allocation and (ii) joint
optimization of caching decision and transcoding strategies. In [107], the total expected energy consumption
in an LVS service was minimized via the MEC support along with caching, transcoding, backhaul
retrieving, and ABS platforms. The results obtained from [107] show not only the optimal energy
scheme but also the effectiveness of the cache hit ratio. In [178], an online learning algorithm with-
out training phases was proposed to actively estimate user preferences according to user feedback
based on regression analysis, from which the optimal edge resource allocation strategy regard-
ing computing, caching, and bandwidth parameters for MEC-based LVS services was developed.
Unlike the works in [23, 66, 106, 107, 110, 178], Erfanian et al. [43] recently introduced,
without relying on cloud/edge platforms, a strategy for optimizing available resource utilization
that focuses on the bandwidth resource for LVS based on SDN, NFV, and mCast support, in which the
E2E latency threshold requirement is satisfied.
guaranteeing the security of E2E communications [70, 131]. With a focus on LVS, the blockchain
differs from existing LVS-supporting technologies (e.g., cloud/edge-based, CDN, SDN, and NFV)
in that each stream of information created by the communication between any two devices,
referred to as transaction information, is stored in a block of the chain [6, 15, 100, 112, 129]. All
transactions are visible at any node in the committed chain, which means that all modifications are
tracked publicly. In this way, it helps the system to prevent cybercrimes, which strengthens the
system’s security. The blockchain uses asymmetric cryptography that includes public and private keys;
these keys are randomly generated strings of numbers [6, 15, 100, 112, 129]. With such a large
key space, it is computationally infeasible to fraudulently gain access to or guess the keys of
other users, which strengthens security and privacy. For example, Li et al. [112] proposed MEC-assisted
transcoding for blockchain-based live/on-demand video streaming while adapting the block size
of blockchains, which significantly affects performance. In addition, the alternating direction
method of multipliers and smart contracts are enabled to facilitate the joint optimization of video
transcoding offloading scheduling, block size adaptation, and resource allocation. In [129], the au-
thors leveraged the help of an interplanetary file system (IPFS), HLS, and blockchain-based smart
contracts to provide authentication, authorization, accessibility, and security for the LVS system.
Meanwhile, Allen and Lucchi [6] considered the blockchain-based Red5-Network, which uses
the Red5Coin token for network node transactions and further supports encrypted LVS
streams to ensure that only authorized parties can access content. Khalaf et al. [100] presented a
new algorithm for blockchain-based LVS comprising a block architecture and cryptographic operations,
which was confirmed to be flexible and scalable enough to adapt readily to other platforms, such
as the Internet of Things (IoT), artificial intelligence, ML, and cloud/edge-based technologies.
From the current market perspective, the seven biggest blockchain providers (dlive, livepeer,
Theta, VideoCoin, flixxo, LBRY, and Play2Live) were surveyed in [15]. These companies
offer not only on-demand video streaming but also LVS platforms. Despite the security
and privacy contributions from highly efficient blockchain technologies, these approaches suffer
from several fundamental limitations, including a consensus mechanism that consumes significant
energy, considerable latency from transaction confirmation, and restricted scalability [67].
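To illustrate why modifications to committed transactions are publicly traceable, the following minimal sketch chains blocks by their SHA-256 digests. It is a toy model under stated assumptions, not the design of any system surveyed above: it omits consensus, digital signatures, and networking, and the names `block_hash`, `append_block`, `is_valid`, and the transaction fields are hypothetical.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, transactions, prev_hash):
    """Return the SHA-256 digest (hex string) of a block's contents."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a block that commits to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    block = {
        "index": index,
        "transactions": transactions,
        "prev_hash": prev_hash,
        "hash": block_hash(index, transactions, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every digest and check every link to the previous block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["transactions"], block["prev_hash"])
        if block["hash"] != recomputed:
            return False
    return True


# Hypothetical transaction records for two delivered video segments.
chain = []
append_block(chain, [{"from": "streamer", "to": "viewer-1", "segment": 1}])
append_block(chain, [{"from": "streamer", "to": "viewer-2", "segment": 1}])
```

Any node holding a copy of `chain` can run `is_valid`; altering a stored transaction changes the recomputed digest of its block, so the check fails unless every later block is rewritten as well.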
On the other hand, Varghese et al. [141] presented a data privacy platform for LVS systems based
on hierarchical inner product encryption (HIPE) and a broker with an anonymous pub/sub
architecture. The results in [141] showed that the proposal outperforms a system without HIPE
in terms of security and privacy. In [38], a practical privacy-preserving live streaming scheme,
called P3LS, was first proposed to protect the privacy of multiple streams in P2P LVS; its
evaluation showed not only the privacy contribution but also 30% less bandwidth consumption
than the non-P3LS strategy. Because the energy issue has become crucial on mobile
platforms, Samet et al. [146] compared the energy consumption of the triple
data encryption standard (3DES), advanced encryption standard (AES), and Blowfish algorithms
for video streaming services. Unlike blockchain, which uses asymmetric cryptography, 3DES, AES, and
Blowfish are symmetric-key block ciphers that provide security and privacy for the considered
systems with various key lengths. In addition, a privacy-aware architecture that utilizes a
face recognition framework was reported in [168] to further enhance the security of LVS while
demonstrating safety and high accuracy.
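The role of key length can be made concrete with a short worked calculation of the expected effort of an exhaustive key search. The sketch below is illustrative only: the key lengths are nominal values chosen for the example rather than configurations reported in [146], and the attacker rate `RATE` is a hypothetical assumption.

```python
# Nominal key lengths in bits (illustrative choices, not taken from [146]).
# Note: three-key 3DES has a 168-bit key, but meet-in-the-middle attacks
# reduce its effective strength to roughly 112 bits; Blowfish accepts
# variable-length keys from 32 up to 448 bits.
KEY_BITS = {
    "3DES (three-key, nominal)": 168,
    "AES-128": 128,
    "AES-256": 256,
    "Blowfish (maximum)": 448,
}

SECONDS_PER_YEAR = 365 * 24 * 3600


def expected_bruteforce_years(key_bits, guesses_per_second):
    """Expected years for an exhaustive search, which covers half the key space on average."""
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


RATE = 10 ** 12  # hypothetical attacker: one trillion guesses per second

years = {name: expected_bruteforce_years(bits, RATE) for name, bits in KEY_BITS.items()}
for name, value in years.items():
    print(f"{name}: about {value:.2e} years")
```

Each additional key bit doubles the expected search effort, which is why exhaustive search becomes impractical long before the key space is literally exhausted; the energy cost of encryption itself, the focus of [146], is a separate question that this calculation does not address.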
ACM Computing Surveys, Vol. 54, No. 10s, Article 202. Publication date: November 2022.
A Contemporary Survey on Live Video Streaming 202:27
satisfactory services to users whenever and wherever required, from which several of the most
relevant publications considering architectures with and without redundant nodes and software aging
issues in the web browser plug-in for cloud-based LVS services are surveyed in an attempt to
either increase the SA value or reduce the downtime per year. With a focus on the visual quality of
the LVS experience, Section 5.2 considers video bitrate issues, for which many ABS techniques
have been adapted to achieve the highest video quality possible in the context of bandwidth
fluctuations owing to changing network conditions. Nonetheless, the ABS techniques in Section 5.2
do not thoroughly consider the E2E latency aspect, even though LVS systems are by nature
subject to stringent real-time constraints. Section 5.3 considers further cooperative
research between ABS mechanisms and HTTP/2, DRL, video coding standards, and hierarchical
computing models, which have proved their capability in terms of E2E latency and video quality for LVS;
here, SVC, HEVC, SHVC, VVC, and hybrid architectures can be listed as helpful video coding
mechanisms. In Section 5.4, we discuss the network QoS/QoE aspect, which reflects
the relationship between the technology provisioning and the end-users’ satisfaction; QoS/QoE
indicators such as MOS, RoB, RPoVS, RoF, APB, FVQ, BPS, frame skipping, resolution, latency,
and spectrum are beneficial for measurement and estimation. Since ML-based applications
have recently become very popular because of their powerful characteristics, network QoS/QoE
measurements with ML-based prediction are also investigated in Section 5.4. The serviceability
of the LVS system with respect to service stability, efficiency, and fairness metrics is discussed in
Section 5.5, for which the FESTIVE, PANDA, SHANZ, ESTC, and TFDASH strategies are beneficial.
In Section 5.6, various novel algorithms, including FoV-aware, smart edge caching, SPLF,
MUFC, LCC, and HITCOT, are shown to significantly improve the hit ratio of LVS
systems compared with several conventional benchmarks. Because an LVS service is one of the
most resource-hungry applications, Section 5.7 surveys four interdependent resource
measurements: computing, caching, bandwidth, and energy. In addition to the efficiency
achieved by mCast, NFV, SDN, DNN, and ABS approaches, the optimization problems among one
or several of these four key parameters are analyzed. In Section 5.8, we review the security and
privacy perspectives within the LVS scope, which are based on blockchain technologies as well as
non-blockchain platforms. It is worth noting that integrating the aforementioned algorithms
has resolved only a few performance aspects and still involves trade-offs regarding service provision.
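Several of the metrics named in this summary reduce to simple formulas, which the sketch below spells out. It rests on assumptions that this summary does not itself confirm: the SA value is treated as a steady-state availability fraction in [0, 1], the hit ratio as cache hits over total requests, and fairness is computed with Jain's index [94], which may differ from the fairness metric used by individual schemes. All function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability):
    """Expected yearly downtime for a steady-state availability in [0, 1]."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must lie in [0, 1]")
    return (1.0 - availability) * HOURS_PER_YEAR


def hit_ratio(cache_hits, total_requests):
    """Fraction of requests served from the cache."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return cache_hits / total_requests


def jain_fairness_index(throughputs):
    """Jain's index: (sum x)^2 / (n * sum x^2); 1.0 means perfectly equal shares."""
    n = len(throughputs)
    total = sum(throughputs)
    squares = sum(x * x for x in throughputs)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero throughput")
    return total * total / (n * squares)
```

For example, under these assumptions an availability of 0.999 ("three nines") corresponds to roughly 8.76 hours of downtime per year, and four clients with equal throughput yield a fairness index of 1.0, whereas a single client taking all the bandwidth yields 1/4.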
6 OPEN CHALLENGES
6.1 System Scalability
Massive connectivity has been considered one of the major requirements for realizing future
communication networks, where billions of user devices participate in the Internet to exchange
information [172]. As video traffic is increasingly dominant in 5G ecosystems and beyond, LVS
frameworks should provide scalability to adaptively serve a massive number of user requests with
various streaming flows simultaneously. Because user interests follow spatiotemporal patterns, LVS
capabilities must adapt elastically to fluctuations in the volume and distribution of service
requests in both the time and space domains. For instance, a self-organized model of LVS frameworks
could automatically activate/deactivate LVS-aware functions at several network components within
an optimal design to achieve energy and computation efficiencies while retaining service quality.
In addition, SA and video bitrate can be considered in a trade-off optimization to balance these
conflicting objectives. Clearly, scalability is critical for optimal and efficient LVS systems in the
current and next communication network generations; therefore, this capability deserves the
attention of research communities.
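As a rough illustration of the self-organized activation/deactivation idea, the sketch below decides how many instances of an LVS-aware function should be active for a given request rate. It is a hypothetical threshold rule with hysteresis, not a method taken from the surveyed literature; the function name, its parameters, and the 20% headroom margin are all assumptions made for the example.

```python
import math


def target_instances(request_rate, capacity_per_instance, current,
                     min_instances=1, scale_down_margin=0.2):
    """Return how many function instances should be active.

    Scale up as soon as demand exceeds the current capacity; scale down only
    as far as a deployment that keeps `scale_down_margin` spare headroom,
    which avoids flapping when the request rate hovers near a boundary.
    """
    needed = max(min_instances, math.ceil(request_rate / capacity_per_instance))
    if needed > current:
        return needed  # activate additional instances immediately
    # Instances required if spare headroom must remain after deactivation.
    with_headroom = max(
        min_instances,
        math.ceil(request_rate * (1 + scale_down_margin) / capacity_per_instance),
    )
    return min(current, with_headroom)
```

With a hypothetical capacity of 100 requests per second per instance, a burst of 950 requests per second raises a 5-instance deployment to 10 instances, whereas a later drop to 420 requests per second shrinks it only to 6 instead of 5, leaving headroom in case demand rebounds.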
demographics of users’ gender, age, occupation, movie genre preferences, and time patterns.
Obviously, an intelligent recommendation feature can only be developed if the system has
appropriate knowledge of user behaviors and expectations. This problem is considered more
challenging in this era, where digital content is produced every second and published on the
Internet. Therefore, efficient learning and fusing of multiple aspects of user behaviors should be
a focus of future research on LVS development.
7 CONCLUDING REMARKS
In this article, we have provided a contemporary survey on LVS from a computation-driven per-
spective, where in-network computation capabilities play a key role in assisting LVS operations.
By conducting a thorough investigation of multiple aspects of LVS, we have constructed a refer-
ence framework with state-of-the-art knowledge about LVS systems for interested readers. LVS
commercial platforms, standard architectures, service models, and performance metrics have been
analyzed to obtain and discuss valuable insights. Based on these observations, we have highlighted
open research challenges in LVS for future studies.
REFERENCES
[1] Flowplayer AB. 2022. Flowplayer: The Performance First Online Video Platform. Retrieved January 11, 2022 from
[Link]
[2] Miran Taha Abdullah Abdullah, Jaime Lloret, Alejandro Cánovas Solbes, and Laura García-García. 2017. Survey of
transportation of adaptive multimedia streaming service in Internet. Network Protocols and Algorithms 9, 1-2 (2017),
85–125.
[3] Adobe System Inc. 2021. HTTP Dynamic Streaming. Retrieved March 12, 2022 from [Link]
[Link].
[4] Samira Afzal, Vanessa Testoni, Christian Esteve Rothenberg, Prakash Kolan, and Imed Bouazizi. 2019. A holistic
survey of wireless multipath video streaming. arXiv preprint arXiv:1906.06184 (2019).
[5] Adnan Ahmed, Zubair Shafiq, Harkeerat Bedi, and Amir Khakpour. 2017. Suffering from buffering? Detecting QoE im-
pairments in live video streams. In IEEE 25th International Conference on Network Protocols (ICNP’17). IEEE, Toronto,
ON, 1–10.
[6] Chris Allen and Davide Lucchi. 2019. Red5 network: Decentralized real-time secure video streaming service. In
Proceedings of the 10th ACM Multimedia Systems Conference. Amherst, MA. ACM, 296–299.
[7] Saleh Almowuena, Md Mahfuzur Rahman, Cheng-Hsin Hsu, Ahmad AbdAllah Hassan, and Mohamed Hefeeda. 2016.
Energy-aware and bandwidth-efficient hybrid video streaming over mobile networks. IEEE Transactions on Multime-
dia 18, 1 (Jan. 2016), 102–115.
[8] Apple Inc. 2021. HTTP Live Streaming. Retrieved March 12, 2022 from [Link]
[9] Jean Araujo, Felipe Oliveira, Rubens de S. Matos, Matheus Torquato, Joao Ferreira, and Paulo Romero Martins Maciel.
2016. Software aging issues in streaming video player. Journal of Software 11, 6 (Jun. 2016), 554–568.
[10] Bouchaib Assila, Abdellatif Kobbane, Mohammed El Koutbi, Jalel Ben-Othman, and Lynda Mokdad. 2018. Caching
as a service in 5G networks: Intelligent transport and video on demand scenarios. In IEEE Global Communications
Conference (GLOBECOM’18). IEEE, 1–6.
[11] Ramy Atawia, Hossam S. Hassanein, and Aboelmagd Noureldin. 2017. Energy-efficient predictive video streaming
under demand uncertainties. In IEEE International Conference on Communications (ICC’17). Paris, France. IEEE, 1–6.
[12] Emna Baccour, Aiman Erbad, Kashif Bilal, Amr Mohamed, Mohsen Guizani, and Mounir Hamdi. 2020. FacebookVide-
oLive18: A live video streaming dataset for streams metadata and online viewers locations. In IEEE International
Conference on Informatics, IoT, and Enabling Technologies (ICIoT’20). 476–483.
[13] Emna Baccour, Aiman Erbad, Amr Mohamed, Fatima Haouari, Mohsen Guizani, and Mounir Hamdi. 2020. RL-OPRA:
Reinforcement learning for online and proactive resource allocation of crowdsourced live videos. Future Generation
Computer Systems 112 (2020), 982–995.
[14] Alcardo Alex Barakabitze, Nabajeet Barman, Arslan Ahmad, Saman Zadtootaghaj, Lingfen Sun, Maria G. Martini,
and Luigi Atzori. 2019. QoE management of multimedia streaming services in future networks: A tutorial and survey.
IEEE Communications Surveys & Tutorials 22, 1 (2019), 526–565.
[15] Nabajeet Barman, G. C. Deepak, and Maria G. Martini. 2020. Blockchain for video streaming: Opportunities, chal-
lenges, and open issues. Computer 53, 7 (Jul. 2020), 45–56.
[16] Abdelhak Bentaleb, Bayan Taani, Ali C. Begen, Christian Timmerer, and Roger Zimmermann. 2019. A survey on bi-
trate adaptation schemes for streaming media over HTTP. IEEE Communications Surveys & Tutorials 21, 1 (Firstquar-
ter 2019), 562–585.
[17] Maria Clara Bezerra, Rosangela Melo, Jamilson Dantas, Paulo Maciel, and Francisco Vieira. 2014. Availability mod-
eling and analysis of a VoD service for eucalyptus platform. In IEEE International Conference on Systems, Man, and
Cybernetics (SMC’14). San Diego, CA, 3779–3784.
[18] Kashif Bilal and Aiman Erbad. 2017. Edge computing for interactive media and video streaming. In 2nd International
Conference on Fog and Mobile Edge Computing (FMEC’17). Valencia, Spain. IEEE, 68–73.
[19] K. Bilal, A. Erbad, and M. Hefeeda. 2017. Crowdsourced multi-view live video streaming using cloud computing. IEEE
Access 5 (2017), 12635–12647.
[20] Kashif Bilal, Aiman Erbad, and Mohamed Hefeeda. 2018. QoE-aware distributed cloud-based live streaming of mul-
tisourced multiview videos. Journal of Network and Computer Applications 120 (2018), 130–144.
[21] Brightcove. [n.d.]. Brightcove Inc. Retrieved January 11, 2022 from [Link]
[22] Brightcove Inc. 2022. Video JS. Retrieved January 11, 2022 from [Link]
[23] Utku Bulkan, Muddesar Iqbal, and Tasos Dagiuklas. 2018. Load-balancing for edge QoE-based VNF placement for
OTT video streaming. In IEEE Globecom Workshops (GC Wkshps’18). Abu Dhabi, United Arab Emirates. IEEE, 1–6.
[24] Alexander Bychok. 2020. What is video bitrate: Full guide. Retrieved January 11, 2022 from [Link]
what-is-video-bitrate.
[25] Wei-Yu Chen, Po-Yu Chou, Chih-Yu Wang, Ren-Hung Hwang, and Wen-Tsuen Chen. 2021. Dual pricing optimiza-
tion for live video streaming in mobile edge computing with joint user association and resource management. IEEE
Transactions on Mobile Computing (Jun. 2021). DOI:10.1109/TMC.2021.3089229
[26] Xing Chen, Lijun He, Shang Xu, Shibo Hu, Qingzhou Li, and Guizhong Liu. 2019. Hit ratio driven mobile edge caching
scheme for video on demand services. In IEEE International Conference on Multimedia and Expo (ICME’19). Shanghai,
China. IEEE, 1702–1707.
[27] Xusong Chen, Dong Liu, Zhiwei Xiong, and Zheng-Jun Zha. 2020. Learning and fusing multiple user interest repre-
sentations for micro-video and movie recommendations. IEEE Transactions on Multimedia 23 (2020), 484–496.
[28] Ludmila Cherkasova. 1998. Improving WWW Proxies Performance with Greedy-Dual-Size-Frequency Caching Policy.
Hewlett-Packard Laboratories.
[29] Clappr. [n.d.]. Clappr: An extensible media player for applications. Retrieved January 11, 2022 from [Link]
[30] Ioan-Sorin Comşa, Gabriel-Miro Muntean, and Ramona Trestian. 2021. An innovative machine-learning-based sched-
uling solution for improving live UHD video streaming quality in highly dynamic network environments. IEEE
Transactions on Broadcasting 67, 1 (Mar. 2021), 212–224.
[31] Dacast. [n.d.]. Live Streaming & Video Hosting Platform. Retrieved January 11, 2022 from [Link]
[32] Jamilson Dantas, Rubens Matos, Jean Araujo, Danilo Oliveira, Andre Oliveira, and Paulo Maciel. 2016. Hierarchi-
cal model and sensitivity analysis for a cloud-based VoD streaming service. In 46th Annual IEEE/IFIP International
Conference on Dependable Systems and Networks Workshop (DSN-W’16). Toulouse, France. IEEE, 10–16.
[33] Nhu-Ngoc Dao, Woongsoo Na, and Sungrae Cho. 2020. Mobile cloudization storytelling: Current issues from an
optimization perspective. IEEE Internet Computing 24, 1 (2020), 39–47.
[34] Nhu-Ngoc Dao, Duy Trong Ngo, Ngoc-Thanh Dinh, Trung V. Phan, Nam D. Vo, Sungrae Cho, and Torsten Braun.
2021. Hit ratio and content quality tradeoff for adaptive bitrate streaming in edge caching systems. IEEE Systems
Journal 15, 4 (2021), 5094–5097.
[35] Nhu-Ngoc Dao, Quoc-Viet Pham, Dinh-Thuan Do, and Schahram Dustdar. 2021. The sky is the edge–toward mobile
coverage from the sky. IEEE Internet Computing 25, 2 (2021), 101–108.
[36] Nhu-Ngoc Dao, Quoc-Viet Pham, Ngo Hoang Tu, Tran Thien Thanh, Vo Nguyen Quoc Bao, Demeke Shumeye Lakew,
and Sungrae Cho. 2021. Survey on aerial radio access networks: Toward a comprehensive 6G access infrastructure.
IEEE Communications Surveys & Tutorials 23, 2 (2021), 1193–1225.
[37] Ishita Dasgupta, Susmit Shannigrahi, and Michael Zink. 2021. A hybrid NDN-IP architecture for live video streaming:
A QoE analysis. In 2021 IEEE International Symposium on Multimedia (ISM’21). Naples, Italy, 148–157.
[38] Jérémie Decouchant, Antoine Boutet, Jiangshan Yu, and Paulo Esteves-Verissimo. 2019. P3LS: Plausible deniability
for practical privacy-preserving live streaming. In 38th Symposium on Reliable Distributed Systems (SRDS’19). Lyon,
France. IEEE, 1–109.
[39] Google Developers. 2022. WebRTC: Real-time communication for the web. Retrieved January 11, 2022 from https:
//[Link]/?hl=en.
[40] Pradeep Dogga, Sandip Chakraborty, Subrata Mitra, and Ravi Netravali. 2019. Edge-based transcoding for adaptive
live video streaming. In 2nd USENIX Workshop on Hot Topics in Edge Computing (HotEdge’19). Renton, WA. https:
//[Link]/conference/hotedge19/presentation/dogga.
[41] C. Dong, W. Wen, T. Xu, and X. Yang. 2019. Joint optimization of data-center selection and video-streaming distribu-
tion for crowdsourced live streaming in a geo-distributed cloud platform. IEEE Transactions on Network and Service
Management 16, 2 (2019), 729–742.
[42] Oussama El Marai, Tarik Taleb, Mohamed Menacer, and Mouloud Koudil. 2018. On improving video streaming effi-
ciency, fairness, stability, and convergence time through client–server cooperation. IEEE Transactions on Broadcasting
64, 1 (Mar. 2018), 11–25.
[43] Alireza Erfanian, Farzad Tashtarian, Anatoliy Zabrovskiy, Christian Timmerer, and Hermann Hellwagner. 2021.
OSCAR: On optimizing resource utilization in live video streaming. IEEE Transactions on Network and Service Man-
agement (Mar. 2021).
[44] European Telecommunications Standard Institute (ETSI). 2013. Universal Mobile Telecommunications System
(UMTS); LTE; Transparent end-to-end Packet-switched Streaming Service (PSS); Progressive Download and Dy-
namic Adaptive Streaming over HTTP (3GP-DASH) (3GPP TS 26.247 version 11.1.0 Release 11). Sophia-Antipolis
Cedex, France.
[45] European Telecommunications Standard Institute (ETSI). 2009. Universal Mobile Telecommunication System
(UMTS); LTE; Transparent end-to-end Packet-Switched Streaming Service (PSS); Protocols and Codecs. Sophia-
Antipolis Cedex, France.
[46] Przemysław Falkowski-Gilski and Tadeus Uhl. 2020. Current trends in consumption of multimedia content using
online streaming platforms: A user-centric survey. Computer Science Review 37 (2020), 100268.
[47] Xianglong Feng, Viswanathan Swaminathan, and Sheng Wei. 2019. Viewport prediction for live 360-degree mobile
video streaming using user-content hybrid motion tracking. Proceedings of the ACM on Interactive, Mobile, Wearable
and Ubiquitous Technologies 3, 2 (Jun. 2019), 1–22.
[48] Miguel García-Pineda, Santiago Felici-Castell, and Jaume Segura-García. 2017. Adaptive SDN-based architecture
using QoE metrics in live video streaming on cloud mobile media. In 4th International Conference on Software Defined
Systems (SDS). Valencia, Spain. IEEE, 100–105.
[49] GB/T 17975.1-2010. 2010. Information Technology–Generic Coding of Moving Pictures and Associated Audio
Information–Part 1: Systems.
[50] GB/T 20090.1-2012. 2012. Information technology–Advanced coding of audio and video - Part 1: System.
[51] Chang Ge, Ning Wang, Wei Koong Chai, and Hermann Hellwagner. 2018. QoE-assured 4K HTTP live streaming
via transient segment holding at mobile edge. IEEE Journal on Selected Areas in Communications 36, 8 (Aug. 2018),
1816–1830.
[52] Romeo Giuliano, Franco Mazzenga, and Alessandro Vizzarri. 2020. Integration of broadcaster and Telco access net-
works for real time/live events. IEEE Transactions on Broadcasting 66, 3 (2020), 667–675.
[53] Google. 2022. Google Hangouts. Retrieved January 11, 2022 from [Link]
[54] W3C Working Group. 2016. ISO Common Encryption (‘cenc’) Protection Scheme for ISO Base Media File Format
Stream Format. Retrieved January 11, 2022 from [Link]
[55] Y. Guo, F. R. Yu, J. An, K. Yang, C. Yu, and V. C. M. Leung. 2020. Adaptive bitrate streaming in wireless networks
with transcoding at network edge using deep reinforcement learning. IEEE Transactions on Vehicular Technology 69,
4 (2020), 3879–3892.
[56] Haivision. 2022. Secure Reliable Transport (SRT) Protocol Technical Overview. Retrieved January 11, 2022
from [Link]
Resources+SRT+Tech+Specs+PR.
[57] Sangwook Han, Yunmin Go, Hyunmin Noh, and Hwangjun Song. 2019. Cooperative server-client HTTP adaptive
streaming system for live video streaming. In 2019 International Conference on Information Networking (ICOIN’19).
IEEE, 176–180.
[58] Seungyeop Han, Haichen Shen, Matthai Philipose, Sharad Agarwal, Alec Wolman, and Arvind Krishnamurthy. 2016.
MCDNN: An approximation-based execution framework for deep stream processing under resource constraints.
In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services. Singapore,
123–136.
[59] F. Haouari, E. Baccour, A. Erbad, A. Mohamed, and M. Guizani. 2019. QoE-aware resource allocation for crowd-
sourced live streaming: A machine learning approach. In IEEE International Conference on Communications (ICC’19).
Shanghai, China, 1–6.
[60] F. Haouari, E. Baccour, A. Erbad, A. Mohamed, and M. Guizani. 2019. Transcoding resources forecasting and reserva-
tion for crowdsourced live streaming. In 2019 IEEE Global Communications Conference (GLOBECOM’19). Waikoloa,
HI. IEEE, 1–7.
[61] Muhammad Haris, Greg Shakhnarovich, and Norimichi Ukita. 2020. Space-time-aware multi-resolution video en-
hancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2859–2868.
[62] Gerhard Hasslinger, Juho Heikkinen, Konstantinos Ntougias, Frank Hasslinger, and Oliver Hohlfeld. 2018. Optimum
caching versus LRU and LFU: Comparison and combined limited look-ahead strategies. In 16th International Sym-
posium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt’18). IEEE, Shanghai, China,
1–6.
[63] Marc Helmold. 2021. New work in education and teaching. In New Work, Transformational and Virtual Leadership.
Springer, 143–155.
[64] HootSuite. 2021. 25 YouTube Statistics that May Surprise You: 2021 Edition. Retrieved March 12, 2022 from https:
//[Link]/youtube-stats-marketers.
[65] Mojtaba Hosseini, Dewan Tanvir Ahmed, Shervin Shirmohammadi, and Nicolas D. Georganas. 2007. A survey of
application-layer multicast protocols. IEEE Communications Surveys & Tutorials 9, 3 (Third quarter 2007), 58–74.
[66] Guowei Huang, Lingjing Kong, Keke Wu, and Zhi Chen. 2017. A bandwidth allocation policy for helpers in cloud-
assisted P2P video-on-demand systems. In 5th International Conference on Advanced Cloud and Big Data (CBD’17).
Shanghai, China. IEEE, 7–12.
[67] Junqin Huang, Linghe Kong, Guihai Chen, Min-You Wu, Xue Liu, and Peng Zeng. 2019. Towards secure industrial
IoT: Blockchain system with credit-based consensus mechanism. IEEE Transactions on Industrial Informatics 15, 6
(Jun. 2019), 3680–3689.
[68] S. Huang, X. Huang, and N. Ansari. 2021. Budget-aware video crowdsourcing at the cloud-enhanced mobile edge.
IEEE Transactions on Network and Service Management 18, 2 (2021), 2123–2137.
[69] Tianchi Huang, Rui-Xiao Zhang, Chao Zhou, and Lifeng Sun. 2018. QARC: Video quality aware rate control for real-
time video streaming based on deep reinforcement learning. In Proceedings of the 26th ACM International Conference
on Multimedia. Seoul, Republic of Korea. ACM, 1208–1216.
[70] Tam T. Huynh, Thuc D. Nguyen, and Hanh Tan. 2019. A survey on security and privacy issues of blockchain tech-
nology. In 2019 International Conference on System Science and Engineering (ICSSE’19). Dong Hoi, Vietnam. IEEE,
362–367.
[71] IEEE 1857.7-2018. 2018. IEEE Standard for Adaptive Streaming. [Link]
[72] IETF. 2003. The Base16, Base32, and Base64 Data Encodings. Retrieved January 11, 2022 from [Link]
html/rfc3548.
[73] IETF. 2003. UTF-8, a transformation format of ISO 10646. Retrieved January 11, 2022 from [Link]
rfc3629.
[74] IETF. 2006. RTP Payload for DTMF Digits, Telephony Tones, and Telephony Signals. Retrieved January 11, 2022 from
[Link]
[75] IETF. 2006. SDP: Session Description Protocol. Retrieved January 11, 2022 from [Link]
page-10.
[76] IETF. 2010. Datagram Transport Layer Security (DTLS) Extension to Establish Keys for the Secure Real-time
Transport Protocol (SRTP). Retrieved January 11, 2022 from [Link]
[77] IETF. 2011. RTP Payload Format for MPEG-4 Audio/Visual Streams. Retrieved January 11, 2022 from [Link]
[Link]/html/rfc6416.
[78] IETF. 2012. Definition of the Opus Audio Codec. Retrieved January 11, 2022 from [Link]
[79] IETF. 2016. Real-Time Streaming Protocol Version 2.0. Retrieved January 11, 2022 from [Link]
[Link].
[80] IETF. 2017. HTTP Live Streaming. Retrieved January 11, 2022 from [Link]
[81] IETF. 2021. WebRTC Security Architecture. Retrieved January 11, 2022 from [Link]
[82] Adobe Systems Inc. 2012. Adobe’s Real Time Messaging Protocol. Retrieved January 11, 2022 from [Link]
[Link]/content/dam/acom/en/devnet/rtmp/pdf/rtmp_specification_1.[Link].
[83] Adobe Systems Inc. 2013. Action Message Format - AMF 3. Retrieved January 11, 2022 from [Link]
[Link]/content/dam/acom/en/devnet/pdf/[Link].
[84] Bitmovin Inc. 2022. Bitmovin: Play everywhere. Retrieved January 11, 2022 from [Link]
[85] Adobe Systems Incorporated. 2013. HTTP Dynamic Streaming Specification — Version 3.0 (FINAL). Retrieved Janu-
ary 11, 2022 from [Link]
pdf.
[86] Instagram. 2022. Instagram from Meta. Retrieved January 11, 2022 from [Link]
[87] ISO/IEC 13818-1:2019. 2019. Information technology — Generic coding of moving pictures and associated audio
information – Part 1: Systems.
[88] ISO/IEC 23009-1:2012. 2012. Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 1:
Media presentation description and segment formats.
[89] ISO/IEC 23009-1:2019. 2019. Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 1:
Media presentation description and segment formats.
[90] ISO/IEC Standard 23009-5. 2017. Information Technology — Dynamic Adaptive Streaming Over HTTP (DASH) —
Part 5: Server and Network Assisted DASH (SAND).
[91] ITU-T Recommendation E.860 (06/2002). [n.d.]. Framework of a Service Level Agreement. [Link]
dologin_pub.asp?lang=e&id=T-REC-E.860-200206-I!!PDF-E&type=items.
[92] ITU-T Recommendation X.140 (09/92). [n.d.]. General Quality of Service Parameters for Communication via Public
Data Networks. [Link]
[93] Saba Qasim Jabbar, Dheyaa Jasim Kadhim, and Yu Li. 2018. Proposed an adaptive bitrate algorithm based on measur-
ing bandwidth and video buffer occupancy for providing smoothly video streaming. Technology 9, 2 (2018), 191–195.
[94] Rajendra K. Jain, Dah-Ming W. Chiu, William R. Hawe, et al. 1984. A quantitative measure of fairness and discrimi-
nation. Eastern Research Laboratory, Digital Equipment Corporation, Hudson, MA (1984).
[95] Behrouz Jedari, Gopika Premsankar, Gazi Illahi, Mario Di Francesco, Abbas Mehrabi, and Antti Ylä-Jääski. 2020.
Video caching, analytics and delivery at the wireless edge: A survey and future directions. IEEE Communications
Surveys & Tutorials 23, 1 (2020), 431–471.
[96] Junchen Jiang, Ganesh Ananthanarayanan, Peter Bodik, Siddhartha Sen, and Ion Stoica. 2018. Chameleon: scalable
adaptation of video analytics. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Com-
munication. Budapest, Hungary. ACM, 253–266.
[97] Junchen Jiang, Vyas Sekar, and Hui Zhang. 2012. Improving fairness, efficiency, and stability in HTTP-based adaptive
video streaming with FESTIVE. In Proceedings of the 8th International Conference on Emerging Networking Experiments
and Technologies. 97–108.
[98] Yuxuan Jiang, Bo Sun, and Danny H. K. Tsang. 2021. Not taken for granted: Configuring scalable live video streaming
under throughput fluctuations in mobile edge networks. IEEE Transactions on Vehicular Technology 70, 3 (2021), 2771–
2782.
[99] Kaltura. 2021. Our mission is to power any video experience, for any organization. Retrieved January 11, 2022 from
[Link]
[100] Osamah Ibrahim Khalaf, Ghaida Muttashar Abdulsahib, Hamed Daei Kasmaei, and Kingsley A. Ogudo. 2020. A new
algorithm on application of blockchain technology in live stream video transmissions and telecommunications. In-
ternational Journal of e-Collaboration (IJeC) 16, 1 (2020), 16–32.
[101] Ahmed Khalid, Ahmed H. Zahran, and Cormac J. Sreenan. 2017. mCast: An SDN-based resource-efficient live video
streaming architecture with ISP-CDN collaboration. In IEEE 42nd Conference on Local Computer Networks (LCN’17).
Singapore. IEEE, 95–103.
[102] Ahmed Khalid, Ahmed H. Zahran, and Cormac J. Sreenan. 2019. An SDN-based device-aware live video service for
inter-domain adaptive bitrate streaming. In Proceedings of the 10th ACM Multimedia Systems Conference. Amherst,
MA. ACM, 121–132.
[103] Daisuke Kobayashi, Ken Nakamura, Tatsuya Osawa, Yuya Omori, Takayuki Onishi, and Hiroe Iwasaki. 2019. A real-
time 4K HEVC multi-channel encoding system with content-aware bitrate control. In 2019 IEEE Global Communica-
tions Conference (GLOBECOM’19). Waikoloa, HI. IEEE, 1–6.
[104] Jonathan Kua, Grenville Armitage, and Philip Branch. 2017. A survey of rate adaptation techniques for dynamic
adaptive streaming over HTTP. IEEE Communications Surveys & Tutorials 19, 3 (Third quarter 2017), 1842–1866.
[105] Hung T. Le, Thoa Nguyen, Nam Pham Ngoc, Anh T. Pham, and Truong Cong Thang. 2018. HTTP/2 push-based
low-delay live streaming over mobile networks with stream termination. IEEE Transactions on Circuits and Systems
for Video Technology 28, 9 (Sep. 2018), 2423–2427.
[106] Jie Li, Cong Zhang, Zhi Liu, Wei Sun, and Qiyue Li. 2020. Joint communication and computational resource alloca-
tion for QoE-driven point cloud video streaming. In 2020 IEEE International Conference on Communications (ICC’20).
Dublin, Ireland. IEEE, 1–6.
[107] Liang Li, Dian Shi, Ronghui Hou, Rui Chen, Bin Lin, and Miao Pan. 2020. Energy-efficient proactive caching for
adaptive video streaming via data-driven optimization. IEEE Internet of Things Journal 7, 6 (Jun. 2020), 5549–5561.
[108] Yunlong Li, Shanshe Wang, Xinfeng Zhang, Chao Zhou, and Siwei Ma. 2020. High efficiency live video streaming
with frame dropping. In 2020 IEEE International Conference on Image Processing (ICIP’20). Abu Dhabi, United Arab
Emirates. IEEE, 1226–1230.
[109] Zhi Li, Xiaoqing Zhu, Joshua Gahm, Rong Pan, Hao Hu, Ali C. Begen, and David Oran. 2014. Probe and adapt: Rate
adaptation for HTTP video streaming at scale. IEEE Journal on Selected Areas in Communications 32, 4 (Apr. 2014),
719–733.
[110] Chunyu Liu, Heli Zhang, Hong Ji, and Xi Li. 2021. MEC-assisted flexible transcoding strategy for adaptive bitrate
video streaming in small cell networks. China Communications 18, 2 (Feb. 2021), 200–214.
[111] Junquan Liu, Weizhan Zhang, Shouqin Huang, Haipeng Du, and Qinghua Zheng. 2021. QoE-driven HAS live video
channel placement in the media cloud. IEEE Transactions on Multimedia (2021).
[112] Mengting Liu, Yinglei Teng, F. Richard Yu, Victor C. M. Leung, and Mei Song. 2020. A mobile edge computing (MEC)-
enabled transcoding framework for blockchain-based video streaming. IEEE Wireless Communications 27, 2 (Apr.
2020), 81–87.
[113] Facebook Live. 2022. Meta for Media. Retrieved January 11, 2022 from [Link]
facebook-live.
[114] YouTube Live. 2022. YouTube Live Streaming & Premieres. Retrieved January 11, 2022 from [Link]
com/intl/en_us/howyoutubeworks/product-features/live/#youtube-live.
[115] Vimeo Livestream. 2022. The world’s only all-in-one video solution. Retrieved January 11, 2022 from [Link]
com/vimeolivestream.
[116] Sharat Chandra Madanapalli, Alex Mathai, Hassan Habibi Gharakheili, and Vijay Sivaraman. 2021. ReCLive: Real-
time classification and QoE inference of live video streaming services. In IEEE/ACM 29th International Symposium
on Quality of Service (IWQOS’21). Tokyo, Japan. IEEE, 1–7.
[117] Anahita Mahzari, Afshin Taghavi Nasrabadi, Aliehsan Samiei, and Ravi Prakash. 2018. FoV-aware edge caching for
adaptive 360 video streaming. In Proceedings of the 26th ACM International Conference on Multimedia. ACM, 173–181.
[118] Muhammad Faran Majeed, Syed Hassan Ahmed, Siraj Muhammad, Houbing Song, and Danda B. Rawat. 2017. Mul-
timedia streaming in information-centric networking: A survey and future perspectives. Computer Networks 125
(2017), 103–121.
[119] Pantelis Maniotis and Nikolaos Thomos. 2021. Tile-based edge caching for 360° live video streaming. IEEE
Transactions on Circuits and Systems for Video Technology (Feb. 2021).
[120] IBM Watson Media. 1998–2022. The Future of Video with Watson. Retrieved January 11, 2022 from [Link]
com.
[121] Wowza media systems. 2007–2022. If You Can Dream It, Wowza Can Stream It. Retrieved January 11, 2022 from
[Link]
[122] Rosangela Melo, Maria Clara Bezerra, Jamilson Dantas, Rubens Matos, Ivanildo José de Melo Filho, Aline Santana
Oliveira, Fábio Denilson de Oliveira Feliciano, and Paulo Romero Martins Maciel. 2017. Sensitivity analysis
techniques applied in cloud computing environments. In 12th Iberian Conference on Information Systems and Technologies
(CISTI’17). IEEE, Lisbon, 1–7.
[123] Rosangela Melo, Maria Clara Bezerra, Jamilson Dantas, Rubens Matos, Ivanildo Melo, and Paulo Maciel. 2014.
Sensitivity analysis of availability of video streaming service in cloud computing. In IEEE 33rd International Performance
Computing and Communications Conference (IPCCC’14). Austin, TX. IEEE, 1–2.
[124] Microsoft. 2009. Smooth Streaming Technical Overview. Retrieved March 12, 2022 from [Link]
us/iis/media/on-demand-smooth-streaming/smooth-streaming-technical-overview.
[125] Microsoft. 2020. Protected Interoperable File Format. Retrieved January 11, 2022 from [Link]
us/iis/media/smooth-streaming/protected-interoperable-file-format.
[126] Microsoft. 2020. Smooth Streaming Protocol. Retrieved January 11, 2022 from [Link]
openspecs/windows_protocols/ms-sstr/8383f27f-7efe-4c60-832a-387274457251.
[127] Microsoft. 2021. Protect your content with Media Services dynamic encryption. Retrieved January 11, 2022 from
[Link]
[128] Konstantin Miller, Emanuele Quacchio, Gianluca Gennari, and Adam Wolisz. 2012. Adaptation algorithm for adaptive
streaming over HTTP. In 19th International Packet Video Workshop (PV’12). Munich-Garching, Germany. IEEE, 173–
178.
[129] Anish Mishra, Sagar Ganiga, Meit Maheshwari, Shreya Saha, and Gaurav Kumar. 2019. Secure and decentralized live
streaming using blockchain and IPFS. In 3rd Workshop on Blockchain Technologies and its Applications.
[130] Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver,
and Koray Kavukcuoglu. 2016. Asynchronous methods for deep reinforcement learning. In International Conference
on Machine Learning. PMLR, 1928–1937.
ACM Computing Surveys, Vol. 54, No. 10s, Article 202. Publication date: November 2022.
202:36 N.-N. Dao et al.
[131] Ahmed Afif Monrat, Olov Schelén, and Karl Andersson. 2019. A survey of blockchain from the perspectives of
applications, challenges, and opportunities. IEEE Access 7 (Aug. 2019), 117134–117151.
[132] Muvi. 2022. The global OTT Video & Audio Streaming. Retrieved January 11, 2022 from [Link]
[133] Shahid Nabi, Muhammad Umar Farooq, and Farhan Hussain. 2019. SHANZ algorithm for QoE enhancement of HTTP
based adaptive video streaming. In IEEE 11th International Conference on Communication Software and Networks
(ICCSN’19). Chongqing, China. IEEE, 393–400.
[134] Koichi Nihei, Hiroshi Yoshida, Natsuki Kai, Kozo Satoda, and Keiichi Chono. 2018. Adaptive bitrate control of
scalable video for live video streaming on best-effort network. In 2018 IEEE Global Communications Conference
(GLOBECOM’18). Abu Dhabi, United Arab Emirates. IEEE, 1–7.
[135] Panopto. 2022. Record, Share, and Manage Videos Securely. Retrieved January 11, 2022 from [Link]
com/kr/.
[136] JW Player. 2007–2021. We’re passionate about video innovation. Retrieved January 11, 2022 from [Link]
[Link]/.
[137] VLC Media Player. 2022. VLC Media Player. Retrieved January 11, 2022 from [Link]
en_GB.html.
[138] Shiva Raj Pokhrel and Surjit Singh. 2021. Compound TCP performance for industry 4.0 WiFi: A cognitive federated
learning approach. IEEE Transactions on Industrial Informatics 17, 3 (Mar. 2021), 2143–2151.
[139] Konstantinos Poularakis, George Iosifidis, Antonios Argyriou, Iordanis Koutsopoulos, and Leandros Tassiulas. 2019.
Distributed caching algorithms in the realm of layered video streaming. IEEE Transactions on Mobile Computing 18,
4 (Apr. 2019), 757–770.
[140] Qifan Pu, Haoyuan Li, Matei Zaharia, Ali Ghodsi, and Ion Stoica. 2016. FairRide: Near-optimal, fair cache sharing. In
13th USENIX Symposium on Networked Systems Design and Implementation (NSDI’16). Santa Clara, CA, 393–406.
[141] M. A. Rajan, Ashley Varghese, N. Narendra, Meena Singh, V. L. Shivraj, Girish Chandra, and P. Balamuralidhar.
2016. Security and privacy for real time video streaming using hierarchical inner product encryption based publish-
subscribe architecture. In 30th International Conference on Advanced Information Networking and Applications
Workshops (WAINA’16). Crans-Montana, Switzerland. IEEE, 373–380.
[142] Farhad Raufmehr, Mohammad Reza Salehi, and Ebrahim Abiri. 2020. A frame-level MLP-based bit-rate controller for
real-time video transmission using VVC standard. Journal of Real-Time Image Processing (Sep. 2020), 1–13.
[143] Qingmei Ren, Yong Cui, Wenfei Wu, Changfeng Chen, Yuchi Chen, Jiangchuan Liu, and Hongyi Huang. 2018.
Improving quality of experience for mobile broadcasters in personalized live video streaming. In IEEE/ACM 26th International
Symposium on Quality of Service (IWQoS’18). Banff, AB, Canada. IEEE, 1–6.
[144] Giovanni Rigazzi, Jani-Pekka Kainulainen, Charles Turyagyenda, Alain Mourad, and Jaehyun Ahn. 2019. An edge
and fog computing platform for effective deployment of 360 video applications. In IEEE Wireless Communications
and Networking Conference Workshop (WCNCW’19). Marrakech, Morocco. IEEE, 1–6.
[145] Eun-Seok Ryu and SunJung Ryu. 2017. Robust real-time UHD video streaming system using scalable high efficiency
video coding. Multimedia Tools and Applications 76, 23 (May 2017), 25511–25527.
[146] Nouha Samet, Asma Ben Letaifa, Mohamed Hamdi, and Sami Tabbane. 2017. Energy consumption comparison for
mobile video streaming encryption algorithm. In 13th International Wireless Communications and Mobile Computing
Conference (IWCMC’17). IEEE, Valencia, 1350–1355.
[147] Yusuf Sani, Andreas Mauthe, and Christopher Edwards. 2017. Adaptive bitrate selection: A survey. IEEE
Communications Surveys & Tutorials 19, 4 (2017), 2985–3014.
[148] Michael Seufert, Sebastian Egger, Martin Slanina, Thomas Zinner, Tobias Hoßfeld, and Phuoc Tran-Gia. 2014. A
survey on quality of experience of HTTP adaptive streaming. IEEE Communications Surveys & Tutorials 17, 1 (2014),
469–492.
[149] Wella Edli Shabrina, Dodi Wisaksono Sudiharto, Endro Ariyanto, and Muhammad Al Makky. 2020. The QoS
improvement using CDN for live video streaming with HLS. In 2020 International Conference on Smart Technology and
Applications (ICoSTA’20). Surabaya, Indonesia. IEEE, 1–5.
[150] Karthikeyan Shanmugam, Negin Golrezaei, Alexandros G. Dimakis, Andreas F. Molisch, and Giuseppe Caire. 2013.
FemtoCaching: Wireless content delivery through distributed caching helpers. IEEE Transactions on Information
Theory 59, 12 (Sep. 2013), 8402–8413.
[151] Yongtao Shuai and Thorsten Herfet. 2018. Towards reduced latency in adaptive live streaming. In 15th IEEE Annual
Consumer Communications & Networking Conference (CCNC’18). Las Vegas, NV. IEEE, 1–4.
[152] A. Soltanian, F. Belqasmi, S. Yangui, M. A. Salahuddin, R. Glitho, and H. Elbiaze. 2018. A cloud-based architecture for
multimedia conferencing service provisioning. IEEE Access 6 (Jan. 2018), 9792–9806.
[153] Abbas Soltanian, Diala Naboulsi, Roch Glitho, and Halima Elbiaze. 2019. Resource allocation mechanism for media
handling services in cloud multimedia conferencing. IEEE Journal on Selected Areas in Communications 37, 5 (May
2019), 1167–1181. [Link]
A Contemporary Survey on Live Video Streaming 202:37
[154] SproutSocial. 2021. 20 Facebook Stats to Guide your 2021 Facebook Strategy. Retrieved March 12, 2022 from https:
//[Link]/insights/facebook-stats-for-marketers.
[155] StreamShark. 2022. Live Stream with Confidence. Make your next live stream a success with StreamShark! Retrieved
January 11, 2022 from [Link]
[156] Satish Kumar Suman, Aniket Dhok, and Swapnil Bhole. 2020. DNNStream: Deep-learning based content adaptive real-
time streaming. In 2020 International Conference on Signal Processing and Communications (SPCOM’20). Bangalore,
India. IEEE, 1–5.
[157] Carlos A. Talay, Franco A. Trinidad, Diego R. Rodríguez Herlein, M. Luz Almada, Claudia N. González, and Luis A.
Marrone. 2018. Analysis of the performance of TCP Vegas and its relationship with alpha and beta parameters in
a wireless links network and burst errors. In 2018 Congreso Argentino de Ciencias de la Informática y Desarrollos de
Investigación (CACIDI’18). Buenos Aires, Argentina. IEEE, 1–6.
[158] THEOPlayer. 2022. THEOPlayer: Universal Video Player. Retrieved January 11, 2022 from [Link]
com/.
[159] Zhao Tian, Laiping Zhao, Lihai Nie, Peiqi Chen, and Shuyu Chen. 2019. Deeplive: QoE optimization for live video
streaming through deep reinforcement learning. In IEEE 25th International Conference on Parallel and Distributed
Systems (ICPADS’19). Tianjin, China. IEEE, 827–831.
[160] Tiktok. 2022. Short Live Video Streaming. Retrieved January 11, 2022 from [Link]
[161] Anh-Tien Tran, Nhu-Ngoc Dao, and Sungrae Cho. 2020. Bitrate adaptation for video streaming services in edge
caching systems. IEEE Access 8 (2020), 135844–135852.
[162] Anh-Tien Tran, Demeke Shumeye Lakew, The-Vi Nguyen, Van-Dat Tuong, Thanh Phung Truong, Nhu-Ngoc Dao,
and Sungrae Cho. 2021. Hit ratio and latency optimization for caching systems: A survey. In 2021 International
Conference on Information Networking (ICOIN’21). Jeju Island, Korea (South). IEEE, 577–581.
[163] Twitch. 2022. Esport Live Streaming. Retrieved January 11, 2022 from [Link]
[164] Video Quality Experts Group (VQEG). 2022. StreamSim. Retrieved January 11, 2022 from [Link]
software-tools/encoding/streaming/streamsim/.
[165] Video Quality Experts Group (VQEG). 2022. Video Quality Expert Group: Motivation, Objectives and Rules.
Retrieved January 11, 2022 from [Link]
[166] World Wide Web Consortium (W3C). 2022. WebIDL. Retrieved January 11, 2022 from [Link]
WebIDL/.
[167] F. Wang, C. Zhang, F. Wang, J. Liu, Y. Zhu, H. Pang, and L. Sun. 2020. DeepCast: Towards personalized QoE for
edge-assisted crowdcast with deep reinforcement learning. IEEE/ACM Transactions on Networking 28, 3 (Jun. 2020),
1255–1268.
[168] Junjue Wang, Brandon Amos, Anupam Das, Padmanabhan Pillai, Norman Sadeh, and Mahadev Satyanarayanan.
2018. Enabling live video analytics with a scalable and privacy-aware framework. ACM Transactions on Multimedia
Computing, Communications, and Applications 14, 3s (Jun. 2018), 1–24.
[169] Junjue Wang, Ziqiang Feng, Zhuo Chen, Shilpa George, Mihir Bala, Padmanabhan Pillai, Shao-Wen Yang, and
Mahadev Satyanarayanan. 2018. Bandwidth-efficient live video analytics for drones via edge computing. In 2018
IEEE/ACM Symposium on Edge Computing (SEC’18). Seattle, WA. IEEE, 159–173.
[170] Mu Wang, Changqiao Xu, Shijie Jia, and Gabriel-Miro Muntean. 2018. Video streaming distribution over mobile
Internet: A survey. Frontiers of Computer Science 12, 6 (2018), 1039–1059.
[171] Ziyi Wang, Yong Cui, Xiaoyu Hu, Xin Wang, Wei Tsang Ooi, and Yi Li. 2020. MultiLive: Adaptive bitrate control for
low-delay multi-party interactive live streaming. In IEEE Conference on Computer Communications (INFOCOM’20).
Toronto, ON, Canada. IEEE, 1093–1102.
[172] Yongpeng Wu, Xiqi Gao, Shidong Zhou, Wei Yang, Yury Polyanskiy, and Giuseppe Caire. 2020. Massive access for
future wireless communication systems. IEEE Wireless Communications 27, 4 (2020), 148–156.
[173] S. Xu and G. Liu. 2020. Multi-access edge computing based user experience driven multicast video conference
algorithm. In 2020 IEEE International Conference on Edge Computing (EDGE’20). Beijing, China. IEEE, 99–105.
[174] Xiaodong Xu, Jiaxiang Liu, and Xiaofeng Tao. 2017. Mobile edge computing enhanced adaptive bitrate video delivery
with joint cache and radio resource allocation. IEEE Access 5 (Aug. 2017), 16406–16415.
[175] Zichuan Xu, Weifa Liang, Meitian Huang, Mike Jia, Song Guo, and Alex Galis. 2018. Efficient NFV-enabled
multicasting in SDNs. IEEE Transactions on Communications 67, 3 (Mar. 2018), 2052–2070.
[176] Jian Yang, Enzhong Yang, Yongyi Ran, Yifeng Bi, and Jun Wang. 2018. Controllable multicast for adaptive scalable
video streaming in software-defined networks. IEEE Transactions on Multimedia 20, 5 (May 2018), 1260–1274.
[177] Peng Yang, Feng Lyu, Wen Wu, Ning Zhang, Li Yu, and Xuemin Sherman Shen. 2019. Edge coordinated query
configuration for low-latency and accurate video analytics. IEEE Transactions on Industrial Informatics 16, 7 (Jul. 2019),
4855–4864.
[178] Peng Yang, Ning Zhang, Shan Zhang, Feng Lyu, Li Yu, and Xuemin Shen. 2019. Asymptotic optimal edge resource
allocation for video streaming via user preference prediction. In 2019 IEEE International Conference on Communications
(ICC’19). Shanghai, China. IEEE, 1–6.
[179] Abid Yaqoob, Ting Bi, and Gabriel-Miro Muntean. 2020. A survey on adaptive 360° video streaming: Solutions,
challenges and opportunities. IEEE Communications Surveys & Tutorials 22, 4 (2020), 2801–2838.
[180] Jihyeok Yun, Md Jalil Piran, and Doug Young Suh. 2018. QoE-driven resource allocation for live video streaming over
D2D-underlaid 5G cellular networks. IEEE Access 6 (Dec. 2018), 72563–72580.
[181] Kamran Zahoor, Kashif Bilal, Aiman Erbad, and Amr Mohamed. 2020. Service-less video multicast in 5G: Enablers
and challenges. IEEE Network 34, 3 (2020), 270–276.
[182] Hassan Ibrahim Zawia, Rosilah Hassan, and Dahlila Putri Dahnil. 2018. A survey of medium access mechanisms for
providing robust audio video streaming in IEEE 802.11aa standard. IEEE Access 6 (2018), 27690–27705.
[183] Ju Zhang, Qian Gao, and Guoqiang Zhang. 2020. Edge cache replacement strategy for SVC-encoding tile-based 360-
degree panoramic streaming. In 3rd International Conference on Hot Information-Centric Networking (HotICN’20).
Hefei, China. IEEE, 122–128.
[184] J. Zhang, Y. Zhang, and M. Shen. 2020. A distance-driven alliance for a P2P live video system. IEEE Transactions on
Multimedia 22, 9 (2020), 2409–2419.
[185] Rong Zhang, Wei Li, Peng Wang, Chenye Guan, Jin Fang, Yuhang Song, Jinhui Yu, Baoquan Chen, Weiwei Xu, and
Ruigang Yang. 2020. Autoremover: Automatic object removal for autonomous driving videos. In Proceedings of the
AAAI Conference on Artificial Intelligence, Vol. 34. 12853–12861.
[186] Xuguang Zhang, Huangda Lin, Mingkai Chen, Bin Kang, and Lei Wang. 2020. MEC-enabled video streaming in
device-to-device networks. IET Communications 14, 15 (2020), 2453–2461.
[187] Zhao Zhang, Huadong Ma, Yaohong Xue, and Liang Liu. 2017. Fair video caching for named data networking. In
IEEE International Conference on Communications (ICC’17). Paris, France. IEEE, 1–6.
[188] Zhicai Zhang, Ru Wang, F. Richard Yu, Fang Fu, and Qiao Yan. 2019. QoS aware transcoding for live streaming in
edge-clouds aided HetNets: An enhanced actor-critic approach. IEEE Transactions on Vehicular Technology 68, 11
(Nov. 2019), 11295–11308.
[189] Zhicai Zhang, Ru Wang, F. Richard Yu, Fang Fu, Qiao Yan, and Qi Jiao. 2019. QoE aware transcoding for live streaming
in SDN-based cloud-aided HetNets: An actor-critic approach. In IEEE International Conference on Communications
Workshops (ICC Workshops’19). Shanghai, China. IEEE, 1–6.
[190] Wei Zhao, Wen Qiu, Chuanhua Zhou, Zhi Liu, and Takahiro Hara. 2018. Edge-node assisted live video streaming: A
coalition formation game approach. In IEEE Globecom Workshops (GC Wkshps’18). Abu Dhabi, United Arab Emirates.
IEEE, 1–6.
[191] Chao Zhou, Chia-Wen Lin, Xinggong Zhang, and Zongming Guo. 2019. TFDASH: A fairness, stability, and efficiency
aware rate control approach for multiple clients over DASH. IEEE Transactions on Circuits and Systems for Video
Technology 29, 1 (Jan. 2019), 198–211.
[192] Y. Zhu, Q. He, J. Liu, B. Li, and Y. Hu. 2020. When crowd meets big video data: Cloud-edge collaborative transcoding
for personal livecast. IEEE Transactions on Network Science and Engineering 7, 1 (2020), 42–53.
Combining edge computing with in-network computing is recommended for reducing latency in live video streaming systems. Processing data closer to the source or the user shortens the time that data spends in transit across the network. Experiments have shown that end-to-end (E2E) latency can be kept within 15–75 ms, which is considered low. These strategies minimize response time while making efficient use of resources for multimedia applications.
QoE optimization has advanced the end-user experience in live video streaming. Deeplive, for example, uses deep reinforcement learning to maximize QoE metrics such as buffering rate and video quality; it reduces training time and achieves a 15–55% QoE improvement over traditional algorithms. ML-based solutions such as ReCLive infer QoE from observed stream patterns, covering resolution and buffering.
Redundant node architectures trade off service availability (SA), downtime, cost, and computational and deployment complexity. Although they can improve SA by reducing downtime to approximately 49.05 hours (2.04375 days) per year, they increase computational complexity and may cost more than conventional approaches.
A CDN enhances live video streaming services by distributing content across geographically dispersed servers, which reduces latency, increases throughput, and makes delivery to end-users faster and more reliable. The study reported an 11.58% improvement in average throughput and a 0.25% reduction in packet loss ratio with a CDN, compared with a system without CDN support, which improves video quality and user experience.
The ReCLive algorithm distinguishes live streams from video-on-demand streams using media-request patterns. It measures QoE in real time by estimating metrics such as resolution and buffering rate. Because classification and inference are automated, live content is identified quickly, which supports stream management and subsequent QoE improvements for live streaming services.
ABS techniques improve video streaming quality by adjusting the bitrate of a video stream dynamically in response to network conditions, which helps maintain video quality when bandwidth fluctuates. The video is encoded at multiple bitrates, and an appropriate one is selected according to current network conditions. ABS improves the user experience by keeping playback smooth and limiting buffering interruptions, and it can use cooperative client-server models to tune encoding and quality adjustments.
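The selection step described above can be sketched as a simple throughput-based rule. This is only an illustration under stated assumptions, not the method of any cited work: the bitrate ladder, the `select_bitrate` name, and the 0.8 safety factor are hypothetical.

```python
from typing import Sequence


def select_bitrate(ladder_kbps: Sequence[int],
                   throughput_kbps: float,
                   safety: float = 0.8) -> int:
    """Return the highest encoded bitrate that fits the throughput budget.

    `safety` (hypothetical value) leaves headroom so that a small drop in
    bandwidth does not immediately stall playback.
    """
    if not ladder_kbps:
        raise ValueError("bitrate ladder must not be empty")
    budget = throughput_kbps * safety
    candidates = [b for b in ladder_kbps if b <= budget]
    # If nothing fits, fall back to the lowest rendition instead of stalling.
    return max(candidates) if candidates else min(ladder_kbps)


# Hypothetical ladder of pre-encoded renditions, in kbit/s.
LADDER = [300, 750, 1500, 3000, 6000]

for measured in (5000, 2000, 200):
    print(measured, "->", select_bitrate(LADDER, measured))
```

Practical ABS controllers usually also consider buffer occupancy and smooth their throughput estimates; this sketch shows only the rate-matching idea.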
Quality of Service (QoS) covers technical aspects of network performance, such as jitter, latency, and packet loss, without directly considering the user's perspective. Quality of Experience (QoE), in contrast, measures the quality that end-users actually perceive, using metrics such as buffering rate, playback bitrate, and user satisfaction. QoE therefore indicates whether the network delivers a satisfactory end-user experience.
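To make the distinction concrete, the sketch below scores a playback session from user-facing quantities (playback bitrate and stall time) instead of network-level counters. It assumes a commonly used linear QoE form; the function name and the penalty weights are hypothetical and are not taken from the works discussed here.

```python
from typing import Sequence


def session_qoe(bitrates_mbps: Sequence[float],
                stalls_s: Sequence[float],
                stall_penalty: float = 4.0,
                switch_penalty: float = 1.0) -> float:
    """Linear QoE sketch: reward bitrate, penalize stalls and quality switches.

    Both weights are illustrative assumptions; real deployments calibrate
    them against subjective user ratings.
    """
    if len(bitrates_mbps) != len(stalls_s):
        raise ValueError("one stall duration is required per segment")
    quality = sum(bitrates_mbps)
    stalling = stall_penalty * sum(stalls_s)
    switching = switch_penalty * sum(
        abs(cur - prev) for prev, cur in zip(bitrates_mbps, bitrates_mbps[1:])
    )
    return quality - stalling - switching


# A steady session versus one with a quality dip and a one-second stall.
print(session_qoe([3.0, 3.0, 3.0], [0.0, 0.0, 0.0]))
print(session_qoe([3.0, 1.0, 3.0], [0.0, 1.0, 0.0]))
```

Two sessions with similar network-level statistics can receive very different scores here, which is the sense in which QoE reflects the viewer's perception.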
Fog architecture supports video streaming applications by placing computing and processing close to the data source, which lowers latency and improves communication reliability in latency-sensitive scenarios. It supports ultra-reliable video-on-demand services and intelligent transport systems because processing occurs near the network edge, which avoids the delay of sending data to distant centralized servers.
LVS systems face serviceability challenges, namely service instability, unfairness, and inefficiency, that stem from limited resource capacity. These challenges are evaluated with corresponding metrics: service stability, fairness, and efficiency. Stability is maintained through bitrate management, fairness is ensured by resource allocation strategies, and efficiency is measured by how effectively the available resources are used, which together improve overall system serviceability.
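As one possible way to quantify the fairness metric mentioned above, the sketch below computes Jain's fairness index over per-client bitrate allocations. Using Jain's index is an assumption made for illustration; the text does not say which fairness measure the surveyed systems adopt.

```python
from typing import Iterable


def jain_index(allocations: Iterable[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    The index is 1.0 when all clients receive the same share and 1/n when a
    single client receives everything. Allocations are assumed non-negative.
    """
    xs = list(allocations)
    if not xs:
        raise ValueError("at least one allocation is required")
    total = sum(xs)
    squares = sum(x * x for x in xs)
    if squares == 0:
        return 1.0  # every client gets zero: treated here as trivially equal
    return (total * total) / (len(xs) * squares)


# Hypothetical per-client bitrates in Mbit/s.
print(jain_index([2, 2, 2, 2]))  # equal shares
print(jain_index([4, 0, 0, 0]))  # one client takes all
```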
The time-based rejuvenation strategy achieved a service availability (SA) of 0.9999359, which corresponds to 0.561516 hours (33.69 minutes) of annual downtime. The prediction-based strategy improved SA slightly to 0.9999361, reducing annual downtime to 0.559764 hours (33.59 minutes). Both strategies improve performance by addressing software aging in cloud-based live video streaming services.
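The downtime figures follow from the availability values if a year is taken as 365 × 24 = 8760 hours, an assumption that reproduces the reported numbers. A minimal sketch of the conversion in both directions, with hypothetical function names:

```python
HOURS_PER_YEAR = 365 * 24  # 8760 h; assumed non-leap year


def annual_downtime_hours(sa: float) -> float:
    """Convert service availability (0..1) to expected downtime per year."""
    if not 0.0 <= sa <= 1.0:
        raise ValueError("availability must lie in [0, 1]")
    return (1.0 - sa) * HOURS_PER_YEAR


def availability_from_downtime(hours: float) -> float:
    """Convert annual downtime in hours back to service availability."""
    if not 0.0 <= hours <= HOURS_PER_YEAR:
        raise ValueError("downtime must lie within one year")
    return 1.0 - hours / HOURS_PER_YEAR


for name, sa in (("time-based", 0.9999359), ("prediction-based", 0.9999361)):
    hours = annual_downtime_hours(sa)
    print(f"{name}: {hours:.6f} h = {hours * 60:.2f} min per year")
```

Applying the inverse conversion to the 49.05 hours per year quoted for redundant node architectures gives an SA of roughly 0.9944.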