Intelligent World Adn 2024 en
Autonomous Driving
Network (ADN)
AI for Network, Ushering in New Era
of High Autonomy
02 Trends Insights
Trend 3 The Expansion of ICCs Is Promoted by Large Models, and Confronted by Challenges Posed
by Intelligent Troubleshooting and Operations Efficiency Improvement in All Domains
Trend 5 High-Value Scenarios Are Driving the Improvement of the AN Level, and Large
Models Are Advancing the Evolution Toward AN Level 4
Trend 7 Multi-Agent Collaboration Will Be a Key Technology for Achieving Highly Autonomous
Networks in CSP Networks, and It Is Gaining Attention in Industry Research
01
CSPs in the
Intelligent Era
Autonomous Driving Network
» Augmented Workforce: Each employee has an intelligent assistant that understands
the employee and completes each task efficiently and with high quality.
» All Connected Resources: Enterprise assets, employees, customers, partners, and
ecosystems are fully connected. This integration, along with real-time feedback
and the digitalization of objects, processes, and rules, improves the amount and quality
of information. This helps develop a data flywheel and provides significant
information-based advantages for enterprises.
1. Adaptive User Experience: CSPs deliver an adaptive user experience for mobile
broadband (MBB), home broadband (HBB), enterprise private lines, video services, New
Calling, smart homes, and other scenarios. Let's consider home products and services as
an example.
» Adaptive gym: Based on the workout records of different family members in the past,
the system customizes fitness courses for each member by automatically adjusting
the themes and difficulty levels to ensure that everyone has the most suitable fitness
experience.
» Adaptive home: The lighting, room temperatures, and water temperatures are
automatically adjusted based on the user's preferences, time, and weather to provide
the most comfortable home environment.
» Fun calls: More creative and interesting call backgrounds and features are provided.
» Real-time translation: More languages are supported, and the translation accuracy is
improved.
» Digital avatars: A more diverse range of scenarios, such as intelligent call answering
services and conference speaking, are supported.
3. Autonomous Operations: CSPs deploy the wireless network optimization agent, fault
monitoring and handling agent, change monitoring agent, home broadband experience
assurance agent, and service provisioning agent to implement autonomous network
operations. Let's consider the wireless network optimization agent as an example. It
enables autonomous wireless network optimization, reducing the number of low-rate
cells by 20% and shortening the network optimization period from one day to one hour.
[Figure: workflow of the wireless network optimization agent. Labels: network optimization scope and objectives, network monitoring, anomaly analysis, result evaluation, on-site operation (optional), result evaluation]
4. Augmented Workforce: CSPs offer exclusive smart assistants for their employees by
deploying the home broadband installation and maintenance copilot, field maintenance
copilot, customer service copilot, and development copilot. Let's consider the home
broadband installation and maintenance copilot as an example. It reduces the average
fault rectification time of installation and maintenance engineers from 60–90 minutes to
30 minutes, and the second site visit rate from 10%–15% to 5%.
[Figure: HBB complaint handling with copilots. Stages: tickets including HBB complaints and fault reporting, complaint and fault locating, fault rectification, service verification. To-be: installation and maintenance personnel use copilots to perform pre-diagnosis and obtain check results, fault locations, and handling suggestions; rectify faults based on the copilots' suggestions; and use copilots to perform automatic test and verification.]
» Electrical layer: Includes information about the ODU status, route, latency, bit error,
alarms, and shared risk link groups (SRLGs).
» Optical layer: Includes information about wavelength resources and status, optical
path routes, pre-FEC bit errors, optical power, signal-to-noise ratio (SNR), and alarms.
» Fiber/cable layer: Includes information about fibers, optical cables, fiber distribution
terminals (FDTs), optical distribution frames (ODFs), poles, manholes, optical cable
routes, fiber core connections, optical power, optical attenuation, and health status.
[Figure labels: cross-domain O&M, single-domain O&M, large models, digital twins, intelligent O&M, UPF, core, …, 10 Gigabit Mobile, 10 Gigabit Campus]
[Figure: "6A" CSPs: Adaptive User Experience, Auto-Evolving Products, Autonomous Operations, Augmented Workforce, All-Connected Resources, AI-Native Infrastructure]
Auto-Evolving Products:
· Auto-evolving MBB
· Auto-evolving FBB
· Auto-evolving enterprise private line
· Auto-evolving video service
· Auto-evolving new calling
· Auto-evolving smart home
· ......
Augmented Workforce:
· HBB installation and maintenance copilot
· Field maintenance copilot
· Customer service copilot
· R&D copilot
· ...
02
Trends Insights
Trend 1
Business Insights
AI technologies are transforming the landscape of the live-streaming industry and related
technologies. From e-commerce, gaming, technologies, culture, and tourism to real estate,
the virtual human live-streaming service is witnessing rapid growth and extensive application,
becoming one of the most trending services. On April 19, 2024, iiMedia Research, a third-
party data mining and analysis organization in the new economy industry, released a White
Paper on the Development of China's Virtual Digital Human Industry in 2024, announcing
that the market scale created by virtual humans in China reached CNY333.47 billion and the
core market scale reached CNY20.52 billion in 2023. These numbers are expected to reach
CNY640.27 billion and CNY48.06 billion in 2025, respectively.
[Figure: market scale of virtual humans in China, 2023 and 2025. Labels: CNY333.47 billion and CNY20.52 billion (2023), CNY48.06 billion (2025)]
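As a quick check on the growth these figures imply, the following sketch computes the compound annual growth rate (CAGR) between the 2023 values and the 2025 projections. The function name `cagr` is illustrative, and the inputs are the iiMedia Research figures quoted above; treating the two-year change as a constant annual rate is a simplifying assumption.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures in CNY billion, as quoted from iiMedia Research above.
overall_cagr = cagr(333.47, 640.27, years=2)  # market scale created by virtual humans
core_cagr = cagr(20.52, 48.06, years=2)       # core market scale

print(f"Overall market CAGR, 2023-2025: {overall_cagr:.1%}")
print(f"Core market CAGR, 2023-2025: {core_cagr:.1%}")
```

Under this assumption, the overall market would grow by a little under 40% per year and the core market by a little over 50% per year.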
[Figure: virtual-human live streaming in numbers. Labels: 5000+ digital, 400,000 hours, 100 million, live broadcast, views, 5 million interactions]
A virtual human can reproduce over 80% of the appearance, movements, and voice of
a real person. A virtual human live streamer can help enterprises communicate with
customers 24/7, sell products, and increase revenues.
Impacts
The upcoming 5G era invigorates the development of live streaming. The growing
popularity of virtual human live streaming heralds an age of universal live streaming on
the Internet but also leads to "frustrations" regarding network speed among consumers.
According to iiMedia Research, 45.6% of consumers are dissatisfied with the network
speed of their traffic packages. Network speed has become a major concern when selecting
traffic packages. Live streaming features real-time interaction, frequent communication,
and subjective perception, meaning that any blurry image or occasional frame freeze
can have a significant negative impact or even cause losses. Real-human and virtual-human
live streaming pose new requirements on stable network access, smooth real-time
transmission of high-definition images, sufficient network bandwidth, and low latency.
Differentiated automatic network quality awareness, analysis, and assurance capabilities
have become increasingly vital components in serving high-value network customers.
Virtual-human live streamers can be used on e-commerce platforms when real humans
are absent to increase the live streaming duration, continuously improving brand exposure
and increasing the time for user interaction. An E2E network latency of 70 ms is required
for virtual-human live streamers to produce real-time videos, for AIGC emotional
companion/role-play applications to generate real-time text and images, and for AI
assistants to deliver a "quasi-lifelike" or "lifelike" experience. Robotaxi security monitoring
and remote takeover require an E2E latency of 100 ms and an immersive video experience.
All these scenarios require improved real-time network awareness and interaction
experience. Furthermore, AI technologies will gradually evolve toward intelligent robots
with advanced intelligent interaction capabilities, delivering a more immersive experience
and a stronger sense of companionship and belonging across various fields. A wider range
of network applications will demand differentiated real-time awareness and interactive
experience, creating new business opportunities and key markets for CSPs in the future.
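To make the idea of differentiated real-time awareness more concrete, the sketch below compares measured E2E latency with the per-scenario budgets quoted above (70 ms and 100 ms). The scenario keys, the function name `check_latency`, and the choice of the 95th percentile as the decision statistic are assumptions made for illustration; the source states only the latency figures.

```python
# E2E latency budgets in milliseconds, taken from the figures quoted above.
# The scenario keys are illustrative names, not identifiers from any product.
LATENCY_BUDGET_MS = {
    "virtual_human_live_stream": 70,
    "aigc_companion_or_role_play": 70,
    "ai_assistant": 70,
    "robotaxi_remote_takeover": 100,
}


def check_latency(scenario: str, samples_ms: list[float], percentile: int = 95) -> dict:
    """Compare a high percentile of measured E2E latency with the scenario budget.

    Judging by the 95th percentile rather than the mean is an assumption of
    this sketch, chosen because occasional spikes are what users perceive.
    """
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    budget = LATENCY_BUDGET_MS[scenario]
    ordered = sorted(samples_ms)
    # Nearest-rank percentile using integer ceiling division.
    rank = max(1, -(-percentile * len(ordered) // 100))
    observed = ordered[rank - 1]
    return {
        "scenario": scenario,
        "budget_ms": budget,
        "observed_ms": observed,
        "within_budget": observed <= budget,
    }


# Hypothetical measurements: one spike pushes the stream over its budget.
stream = check_latency(
    "virtual_human_live_stream", [40, 45, 50, 55, 60, 62, 65, 68, 69, 120]
)
taxi = check_latency("robotaxi_remote_takeover", [80.0] * 19 + [150.0])
print(stream)
print(taxi)
```

A check of this kind could feed an assurance workflow that prioritizes scenarios whose observed latency exceeds the budget.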
Virtual-human live streaming eliminates the time and space restrictions of consumption
and creates new growth engines in the consumption market after midnight. It also
leads to a significant increase in network traffic and highlights the value of network
traffic during off-peak hours. Due to the limited profit potential of the traffic-based
charging mode, CSPs are in urgent need of new business models and innovative
business solutions to explore the value of high-value applications and traffic for revenue
growth. For instance, a CSP launched the "full-service acceleration" product in Beijing
to provide VIP network acceleration channels for users, improving the Internet access
rate by over four times on average. A CSP in Zhejiang province deployed the 5G-A
differentiated assurance solution to provide key service assurance for users by layer
and level. This solution enables seamless network access during concert streaming, live
streaming, mobile office services, and other key event scenarios, delivering a next-level
communication experience.
Suggestions
For new AI applications, offer differentiated automatic awareness, analysis, and O&M
capabilities for key quality indicators (KQIs) and customer experience indexes (CEIs)
to ensure that major user experience issues of high-value network customers, including
suboptimal network quality, low image definition, and frame freezing, can be
identified automatically in real time and resolved automatically through
intelligent techniques. Furthermore, congestion points and bottlenecks on networks with
suboptimal KQIs and CEIs need to be resolved through periodic and proactive checks
and rectification. User experience needs to be improved to eliminate the "frustrations"
regarding network speed.
These applications also integrate real-time interaction and awareness technologies, such
as sensors, voice, images, videos, and smell, delivering an innovative user experience.
CSPs need to consider these new AI applications and traffic as opportunities to pioneer
network technology transformation by improving real-time network interaction and
awareness capabilities.
CSPs should fully leverage the new business opportunities deriving from e-commerce
virtual human live streaming and design new business models. For example, CSPs can
use innovative business methods such as new charging modes and VIP packages to
provide differentiated traffic and deliver a differentiated interaction experience for high-
value live-streaming customers and scenarios, while exploring the value of traffic in low-
value periods (off-peak hours). These approaches will enable them to increase revenue.
Trend 2
Business Insights
The accuracy of large models continues to increase while the cost of using large models
continues to decrease. According to the ten major AIGC application layer trends in 2024
released by IDC, over 53% of enterprises have started innovating AI services, marking
the beginning of the digital intelligent transformation era for enterprises. However, the
exponential growth of enterprise service complexity and O&M requirements presents
numerous O&M challenges.
In the journey toward digital transformation, enterprise O&M encounters new challenges
due to the advent of all-wireless offices, cloud-based applications, video-based
applications, collaboration, and intelligent transformation. These developments catalyze
a shift in traditional work models, concurrently giving rise to higher O&M standards and
new requirements. Cloud-based, video-based, collaborative, and intelligent applications
have higher requirements on network bandwidth, latency, and stability. For instance, in
video conferencing and online collaboration scenarios, any network fluctuations may
interrupt conferences or collaboration, causing significant negative impacts on the office
experience.
Offices are evolving from a single headquarters to multiple branches, which are
distributed worldwide. The production mode is evolving into smart manufacturing.
Data centers, as the cornerstone of enterprise information, are growing in scale and
being migrated to the cloud, forming a hybrid cloud and multi-cloud architecture.
These transitions lead to a rapid increase in the enterprise network scale. The increasing
device diversity is also a trend that deserves attention. From traditional servers, routers,
and switches to Internet of Things (IoT) devices such as cloud cameras, smart sensors,
and automated equipment, a diverse range of devices presents a significant O&M
challenge, expanding the scope of routine maintenance and increasing the maintenance
complexity. In addition to ensuring the stable running of all devices, the network must
quickly respond to various emergencies and continuously optimize network performance
to ensure service continuity and growth. Furthermore, the continuous development of
enterprise services and further digital transformation extends service deployment from
a single network environment or geographical location to multiple network domains,
including internal, external, cloud, data center, and IoT networks. Cross-domain
deployment significantly increases the complexity and the time required to complete
the deployment process, from service design and system integration to verification and
provisioning.
Network security, another essential topic of network O&M, becomes more challenging
as malware variants can be created more quickly, thanks to recent technological
advancements. Conventional detection methods based on feature codes can no longer
effectively detect these variants. Virus and malware variants can easily bypass traditional
defense measures and penetrate enterprise intranets, causing data leakage, system
breakdown, and other negative impacts. As more intelligent tools and technologies
become available, attackers can use malicious large models, automated attack tools,
and the like to collect and analyze crucial information about enterprises and formulate
more accurate and covert attacks. Such attacks bypass traditional security monitoring
and defense mechanisms, leading to long-term incubation and continuous penetration,
posing significant threats to enterprise information security.
Impacts
enterprises need to invest more labor, material resources, and financial resources,
including recruiting professional O&M personnel and purchasing advanced O&M tools
and technical services, which increase operational expenditure (OPEX). In a fast-changing
market, enterprises need to flexibly adjust their business models and operation strategies
— a task that is impossible without advanced O&M capabilities.
From the perspective of service agility, enterprises cannot quickly respond to market
changes when service deployment takes a long time. In a fiercely competitive market,
enterprises often want to launch new products or services to seize opportunities quickly.
A long deployment period will undoubtedly erode their competitive edge.
From the perspective of cost-effectiveness, time-consuming deployment increases OPEX,
which includes labor costs, time costs, and potential service losses caused by delayed
deployment in addition to hardware and software costs. From the perspective of user
experience, inefficient deployment may cause service interruption or delay, affecting user
satisfaction and loyalty.
From the perspective of service security, these challenges threaten the information
security and service continuity of enterprises. Once an enterprise's information system
is infected by malware or intelligent attacks, sensitive data may be leaked, and services
may be interrupted, impacting the enterprise's reputation, customer relationships, and
service competitiveness. From the perspective of operations cost, enterprises need to
invest more resources and funds to enhance security protection measures and address
these security challenges. Purchasing advanced security devices, software, and services,
and improving the technical skills and emergency response capability of the security
team significantly increase OPEX. Security challenges may also affect the service
quality and user satisfaction rates of enterprises. For example, excessively stringent
security control measures may cause inconvenience to user operations and affect user
experience. Exposing security vulnerabilities may cause panic and dissatisfaction among
users. These factors may harm the brand image and market position of an enterprise.
Suggestions
Trend 3
Business Insights
Large models are developing rapidly across the globe. They numbered 1,328 by the
first quarter of 2024, with over 80% of them coming from China and North America.
[Figure: regional distribution of large models. Labels: 44%, 36%, 20%]
Take China as an example. On January 11, 2023, China's State Information Center
released the Intelligent Computing Center Innovation and Development Guide. The
report estimates the economic benefits generated by investment in intelligent computing
centers (ICCs). By 2025, 80% of enterprises will use ICCs, and the investment in ICCs for
a city can drive the growth of core AI industries by 2.9 to 3.4 times and related industries
by 36 to 42 times. In the past two years, the development of AI large models has placed
new requirements on computing power, algorithm platforms, and data. According to
the scaling law, traditional CPU-centric cloud computing infrastructures cannot offer
the orders-of-magnitude higher computing power required by large models. With the fierce
competition among myriad large models, the AI industry is witnessing the professional
division of labor, as well as a development trend of intelligent computing centering
on GPUs and NPUs. On October 8, 2023, China's Ministry of Industry and Information
Technology (MIIT) and six other authorities issued the Action Plan for the High-Quality
Development of Computing Power Infrastructure. According to the plan, by 2025,
China aims to achieve over 300 EFLOPS in computing power, increase the percentage of
intelligent computing power to 35%, and build 50 ICCs. Over 100 ICC construction and
operation projects have been launched in China.
Above all, improving the efficiency of cluster O&M (converged and full-stack O&M
for computing power, networks, and storage devices), as well as prediction, minute-
level demarcation and locating, and closed-loop capabilities for faults, are the key to
maximizing the value of ICCs' computing power.
Impacts
First, it is necessary to develop real-time automatic detection and root cause analysis
capabilities for faulty nodes in AI cluster servers. With these capabilities, O&M personnel
can collect the status and incident logs of AI cluster servers in real time to determine
whether the servers and networks are normal. By doing so, O&M personnel can quickly
identify faulty processes, locate fault causes, and rectify faults. A prime example is
the complex process of locating optical link faults on cluster servers, which involves
computing power, networks, and storage devices. O&M personnel need to physically
visit an equipment room with professional tools for detecting optical link faults, and
detecting a single fault takes one hour. For tricky AI software faults like "Notify Wait"
timeout faults that occur frequently, the average time required for manual analysis
is greater than three days. Second, AI cluster servers must be able to resume training
in minutes, which is the key to ensuring that AI training interruption does not lead to
suboptimal availability of computing power. Generally, data is backed up before training
can be resumed, and checkpoints are set to check and update the backup data during
the training. It usually takes several hours to diagnose, isolate, and rectify a fault,
wasting a huge volume of computing power. To address this challenge, checkpoints of the
AI model and the optimizer need to be periodically persisted to reduce the checkpoint
time and minimize the extra AI training time caused by faults. Third, capabilities such as
intelligent fault prediction and fast automatic isolation of fault nodes must be developed
for AI cluster servers to reduce the probability of faults and their impact on AI training.
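The trade-off behind periodic checkpointing can be illustrated with Young's first-order approximation, a classic result that is brought in here for illustration and is not taken from the source. It assumes that the time to write one checkpoint is much shorter than the mean time between failures (MTBF). The function names and the sample figures below (checkpoint costs of 300 s and 30 s, an MTBF of 8 hours) are hypothetical.

```python
import math


def young_interval_s(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's approximation of the checkpoint interval that minimizes lost time."""
    return math.sqrt(2 * checkpoint_cost_s * mtbf_s)


def overhead_fraction(interval_s: float, checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Approximate share of time lost to checkpoint writes plus rework after a fault.

    First term: time spent writing checkpoints.
    Second term: on average, half an interval of training is redone per fault.
    Restart and diagnosis time are ignored in this simplified model.
    """
    return checkpoint_cost_s / interval_s + interval_s / (2 * mtbf_s)


MTBF_S = 8 * 3600  # hypothetical cluster-level mean time between failures

for cost_s in (300.0, 30.0):  # hypothetical slow and fast checkpoint writes
    interval = young_interval_s(cost_s, MTBF_S)
    lost = overhead_fraction(interval, cost_s, MTBF_S)
    print(f"checkpoint cost {cost_s:.0f} s: interval {interval:.0f} s, lost time {lost:.1%}")
```

In this model, a cheaper checkpoint allows a shorter interval and reduces the share of training time lost to faults, which is consistent with the goal of reducing the checkpoint time stated above.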
O&M for AI clusters is not a well-trodden path. In the industry, there
are no widely used methods for checking large-scale AI cluster risks through NPU chip
health checks, HCCL bandwidth tests, intermittent internet connection checks, optical
component fault prediction, and frequent component fault prediction. Additionally,
ICCs involve full-stack integration of software and hardware for computing power,
networks, and storage devices in all domains. The traditional methods are inefficient for
checking risks in clusters with 10,000 GPUs/NPUs and over 100,000 computing power,
network, and storage components. Full software and hardware checks take more than
three days.
Moreover, inefficient nodes and networks cause a loss of over 50% of computing power.
Due to the strong synchronization feature of large model training, a single point of
failure (SPOF) may cause significant performance deterioration in jobs and difficulty
in fault locating due to fault diffusion, leading to tens of thousands of checkpoints. As
a result, manual demarcation and locating can take hours or even days, presenting a
major challenge for ICCs.
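The effect of strong synchronization can be sketched as follows: in synchronous training, every node waits for the slowest one, so a single inefficient node sets the pace of the whole job. The example below flags nodes whose step time is well above the median and estimates the resulting efficiency. The function names, the 1.2x tolerance, and the sample step times are hypothetical, and median-based detection is only one simple heuristic among many.

```python
from statistics import median


def find_slow_nodes(step_time_s: dict[str, float], tolerance: float = 1.2) -> list[str]:
    """Return nodes whose step time exceeds `tolerance` times the median step time."""
    typical = median(step_time_s.values())
    return sorted(node for node, t in step_time_s.items() if t > tolerance * typical)


def sync_efficiency(step_time_s: dict[str, float]) -> float:
    """Ratio of the typical step time to the slowest step time.

    With synchronous training the job advances at the pace of the slowest
    node, so this ratio approximates the share of computing power in use.
    """
    typical = median(step_time_s.values())
    return typical / max(step_time_s.values())


# Hypothetical per-node step times in seconds; node-3 is degraded.
times = {"node-0": 1.00, "node-1": 1.02, "node-2": 0.98, "node-3": 2.10}
slow = find_slow_nodes(times)
efficiency = sync_efficiency(times)
print(f"slow nodes: {slow}, estimated efficiency: {efficiency:.0%}")
```

With these sample values, one node running at roughly half speed drags the estimated efficiency of the whole job below 50%.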
Suggestions
CSPs need to implement automatic and intelligent fault detection, root cause
identification, and quick rectification for AI cluster servers to minimize the training
interruption time caused by faults. Meanwhile, the minute-level resumable training
capability must be developed for AI cluster servers to optimize the checkpoint process,
reduce the checkpoint time, and minimize the extra AI training time. Additionally, CSPs
need to implement intelligent prediction and fast automatic isolation of fault nodes to
reduce the probability and impact of faults and quickly resume the training process.
Before AI training is performed, a full-stack risk check and precise component-level fault
prediction must be implemented for ICCs to ensure good AI training performance in AI
cluster servers, reduce the job failure probability, and prevent computing power loss.
Additionally, the risk check period and prediction time need to be reduced from days to
minutes. When AI training or AI inference is performed, it is critical to quickly detect and
analyze inefficient nodes or networks and improve the computing power efficiency of AI
clusters. To this end, the deterioration detection capability needs to be transformed from
passive mode (reporting in hours through manual tickets) to proactive mode (intelligent
detection in minutes), and the demarcation and locating time needs to be reduced from
dozens of hours (manual handling) to less than 30 minutes (automatic handling).
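A back-of-the-envelope calculation shows what moving risk checks from days to minutes implies. It uses the figures quoted earlier (over 100,000 components, more than three days for a full check) as round numbers and assumes that checks are independent, cost the same, and currently run one after another. The 10-minute target and the function names are assumptions made for illustration.

```python
import math

SECONDS_PER_DAY = 86_400


def per_component_seconds(components: int, sequential_days: float) -> float:
    """Average time per component if all checks run one after another."""
    return sequential_days * SECONDS_PER_DAY / components


def required_parallelism(sequential_days: float, target_minutes: float) -> int:
    """Minimum number of checks that must run concurrently to meet the target window."""
    return math.ceil(sequential_days * SECONDS_PER_DAY / (target_minutes * 60))


avg_s = per_component_seconds(components=100_000, sequential_days=3)
workers = required_parallelism(sequential_days=3, target_minutes=10)
print(f"average check time per component: {avg_s:.3f} s")
print(f"concurrent checks needed for a 10-minute window: {workers}")
```

Under these assumptions, each component takes about 2.6 seconds to check, and several hundred checks would need to run concurrently to fit a 10-minute window.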
Trend 4
Business Insights
The emerging low-altitude economy can open up new business opportunities for CSPs.
(1) Massive new connections will be added. For instance, drones mentioned earlier
fly beyond line of sight (LOS), which means that video stream backhaul is required
to control their operation remotely. This requires ultra-large uplink video connections,
providing new opportunities for CSPs. According to China's National Three-Dimensional
Transportation Network Planning Outline, the market size of China's low-altitude
economy is expected to exceed CNY6 trillion by 2035. Furthermore, the number of
commercial and industrial drones is expected to reach 26 million, and the number of
drone pilots will grow to 630,000. Consequently, the number of connections will expand
to tens or even hundreds of millions. (2) Low-altitude surveillance teams will have
more demanding sensing requirements, as they must be able to detect abnormal
intruders, such as unauthorized flights, in real time to ensure safe passage on low-
altitude routes. However, traditional radar-based low-altitude detection solutions face
multiple challenges, including site re-selection and blocking during deployment, high
costs, and poor feasibility. To address these issues, CSPs can innovate sensing services by
using deployed wireless infrastructure.
Impacts
In terms of specific O&M capabilities, wireless networks are expected to evolve from
providing traditional ground coverage to enabling ground and low-altitude 3D coverage.
Another major challenge is the increasing complexity of wireless environments. Low-
altitude networks are affected by multiple factors, including low-altitude flight, small
objects, complicated electromagnetic environments, and obstacles. Consequently, there
is a need for more network planning and maintenance, simulation technologies with
higher accuracy, and network optimization technologies with higher levels of intelligence
and automation.
Suggestions
1. In the wireless network field, industry stakeholders must seize the opportunities
offered by the low-altitude economy to innovate low-altitude ISAC services, expand new
connection services, and provide sensing data services for low-altitude flight safety.
Trend 5
Industry Insights
The commercial practices on AN have allowed leading CSPs to gain significant benefits,
reinforcing their strategies to advance to AN Level 4. At the Digital Transformation
World (DTW) held in June 2024, China Mobile announced a significant milestone: it
reached AN Level 3.2 in 2023 and Level 3.5 in 2024, with a 95% automation rate of high-
value scenarios. This remarkable achievement, coupled with the development of 3,000 AI
baseline capabilities that allow for minute-level network configuration, fault rectification,
and network optimization, is a testament to the potential of AN. This is expected to save
5,000 labor costs annually. China Mobile's future goals, including reaching AN Level 4 in
2025 and focusing on automating all high-value scenarios, inspire the company to strive
for cost-effective and efficient O&M featuring highly automatic and intelligent network
operations capabilities with E2E automation as the basis and AI+ as its core characteristics.
According to Omdia, GenAI core capabilities are typically used in the communications
industry to generate more accurate responses for chatbots and optimized scripts for
customer service agents, as well as provide effective reasoning for network maintenance.
This helps CSPs reduce manual intervention or even achieve complete automation in
multiple service processes, significantly improving O&M efficiency. Consequently, the
industry proposes that the GenAI-based telecom foundation model is revolutionary and
holds the key to achieving Highly Autonomous Networks. This proposal has been widely
discussed at global industry conferences, including DTW and MWC. In response to this
proposal, TM Forum established a GenAI AN standard working group in December 2023.
Additionally, the China Communications Standards Association (CCSA) initiated the large
model pioneer plan to promote the implementation of AN Level 4 enabling technologies
such as large model/GenAI and agents.
TM Forum has incorporated AN as one of its three core missions, according to its
statement. At the DTW conference in June 2024, TM Forum, in collaboration with
industry partners, including China Mobile, Vodafone, Telefónica, Huawei, Ericsson, and
AsiaInfo, along with Professor Joseph Sifakis, the 2007 Turing Award winner, unveiled
the Autonomous Networks Level 4 industry blueprint: high-value scenarios. This report
introduces the Level 4 industry blueprint, which includes the vision and objectives, high-
value scenarios, architecture, and evolution path. It serves as a valuable reference for
CSPs looking to plan and deploy AN Level 4.
value to maximize the return on investment (ROI). The Level 4 industry blueprint
categorizes implementation scenarios into two phases. Phase 1 (2025–2027) emphasizes
single-domain maintenance/optimization scenarios, while phase 2 (2028–2030) centers
on multi-domain E2E complex scenarios. TM Forum, on the other hand, adopts a
different standard to classify CSP scenarios into three value ranges: high, medium, and
low. It identifies 15 high-value scenarios in operations, maintenance, and optimization,
based on operations value and technology maturity (as shown in the following figure).
[Figure: high-value scenarios. Labels: service marketing, operations, service provisioning, service assurance, full network lifecycle, complaint handling, network optimization, optimization, energy consumption optimization]
In the past few years, AN exploration has focused on point-level use cases, contributing
to the identification of hundreds of capabilities through dozens of O&M tasks. However,
the fragmented implementation process and outcomes cannot be used for unified
deployment. To solve this problem, technical breakthroughs of the telecom foundation
model with GenAI as its core have improved the core capabilities such as content
generation, chain-of-thought analysis, and multi-modal simulation. These capabilities
have enabled the development of new processes and capabilities that are interdependent
and feature large granularity, offering application-level solutions for AN practices. The
Level 4 industry blueprint proposes an architecture with three layers (business, service,
and resource operations) and four closed loops. This architecture introduces full-stack
Impacts
The evolution towards AN Level 4 marks a new stage in the development of AN, where
the "machines assisting humans" approach has transformed into "humans assisting
machines". The Autonomous Networks Level 4 industry blueprint: high-value scenarios
paper serves as a crucial reference for industry development. As AN Level 4 continues to
expand, cutting-edge technologies such as telecom foundation models and digital twins
will be developed to overcome existing limitations, offering the following benefits for the
communications networks:
1. Reshaping the O&M mode: Traditional CLI- and GUI-based O&M has low
efficiency and requires personnel to have high-level skills. For instance, network
maintenance staff need to locate specific operation interfaces across multiple systems
and perform operations on each interface to analyze and summarize the data for
the desired information. To address these issues, the traditional O&M mode has been
upgraded to the intelligent O&M mode. Role-specific copilots provide intelligent O&M
methods such as intelligent Q&A, intelligent query, and intelligent report generation
based on natural language interaction. Additionally, scenario-specific agents can
implement automatic closed-loop management based on the preset targets with
minimal or no human intervention.
Suggestions
Trend 6
Technology Insights
GenAI has become the driving force for the digital transformation of communications
networks. GenAI redefines network management through automatic event response,
network fault and risk identification, and network event handling solution optimization,
changing the network O&M management mode and improving the automatic O&M
management level. TM Forum estimates that GenAI can save CSPs US$20 billion a year
in costs related to network interruption and service degradation.
After a period of exploration, GenAI is now being applied in the communications
industry. For example, Ask AT&T, a ChatGPT-based GenAI platform launched by AT&T,
is used to empower employees. It helps employees finish routine work,
such as Q&A and code conversion, allowing them to focus on more complex and
valuable tasks. It is worth noting that AI as a whole still has vast room for development. GenAI is
one way to solve real-world problems, but other AI technologies,
such as machine learning, graph computing, predictive AI, optimal decision-making, and
heuristic rules, are also crucial for solving scenario-specific problems. GenAI needs to be
combined with other AI and non-AI technologies to meet the accuracy, timeliness, and
explainability requirements of intelligent communication networks.
Gartner's AI prisms provide CSPs with a comprehensive assessment of example GenAI use
cases to aid in use case selection. Gartner's report points out that GenAI models are
better suited to use cases such as content generation, conversational interaction, and
knowledge discovery, and are not well positioned to independently solve problems in
use cases such as prediction, planning, and decision-making.
Use-case family | Use-case examples | Generative models' current usefulness
Prediction/Forecasting | Risk prediction, customer churn prediction, sales/demand forecasting | Low
Recommendation Systems | Recommendation engine, personalized advice, next best action | Medium
A standalone GenAI model for prediction may not be the optimal choice. Some have
tried using GenAI models for prediction tasks, such as trend prediction, traffic prediction,
rate prediction, and network status time series prediction, but such models are not
designed for statistics or autoregressive modeling based on given network status data.
Predictive AI technology is more suitable for these prediction tasks.
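To make the contrast concrete, the following is a minimal sketch of the kind of autoregressive modeling that predictive AI applies to network status time series. It fits a first-order autoregressive model, x[t] = intercept + phi * x[t-1], by ordinary least squares. The function names and the sample traffic values are illustrative assumptions, not part of any product described here, and a production forecaster would normally use a richer model.

```python
from statistics import mean


def fit_ar1(series):
    """Fit x[t] = intercept + phi * x[t-1] by ordinary least squares."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    prev, curr = series[:-1], series[1:]
    mean_prev, mean_curr = mean(prev), mean(curr)
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        # A flat history carries no slope information: predict the mean.
        return mean_curr, 0.0
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = cov / var_prev
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


def forecast(series, steps):
    """Roll the fitted AR(1) model forward for the requested number of steps."""
    intercept, phi = fit_ar1(series)
    predictions, last = [], series[-1]
    for _ in range(steps):
        last = intercept + phi * last
        predictions.append(last)
    return predictions


if __name__ == "__main__":
    # Hypothetical link utilization samples (percent), one per interval.
    utilization = [100.0, 60.0, 40.0, 30.0, 25.0, 22.5]
    print(forecast(utilization, steps=3))
```

The model is cheap to fit, its two parameters are directly interpretable, and it is built around the given status data, which is the property the paragraph above says standalone GenAI models lack.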
Current GenAI models also lack network optimization capabilities. GenAI is not
effective in high-value network scenarios, such as energy saving optimization, path
optimization, multi-objective combinatorial optimization, and configuration policy
optimization, because the optimal solution can only be found through iterative
optimization by delivering optimization policies to the live or simulated network
environment. As such, it is essential to design a dedicated network optimization
algorithm based on reinforcement learning and automatic control technologies. In
practice, network optimization typically requires various technical approaches. By
integrating GenAI models into their network optimization process, CSPs can enhance the
planning and scheduling capabilities of their network optimization system.
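The iterative loop described above (deliver a policy to a simulated environment, observe the result, adjust) can be sketched with an epsilon-greedy bandit, one of the simplest reinforcement learning techniques. Everything specific here is an assumption made for illustration: the policy names, the gain values inside the simulator, and the noise range are invented, and a real system would evaluate policies against a digital twin or the live network.

```python
import random

# Hypothetical "true" energy saving gains, known only to the simulator.
TRUE_GAIN = {"carrier_shutdown": 0.12, "symbol_power_off": 0.08, "deep_sleep": 0.18}


def simulate_energy_saving(policy, rng):
    """Stand-in for a simulated network: returns a noisy observed gain."""
    return TRUE_GAIN[policy] + rng.uniform(-0.02, 0.02)


def optimize_policy(policies, evaluate, rounds=200, epsilon=0.1, seed=0):
    """Epsilon-greedy search for the policy with the highest average gain."""
    if rounds < len(policies):
        raise ValueError("rounds must allow every policy to be tried once")
    rng = random.Random(seed)
    totals = {p: 0.0 for p in policies}
    counts = {p: 0 for p in policies}
    for step in range(rounds):
        if step < len(policies):
            choice = policies[step]  # try every policy once first
        elif rng.random() < epsilon:
            choice = rng.choice(policies)  # explore
        else:
            choice = max(policies, key=lambda p: totals[p] / counts[p])  # exploit
        totals[choice] += evaluate(choice, rng)
        counts[choice] += 1
    averages = {p: totals[p] / counts[p] for p in policies}
    return max(policies, key=lambda p: averages[p]), averages


if __name__ == "__main__":
    best, averages = optimize_policy(list(TRUE_GAIN), simulate_energy_saving)
    print(best, averages)
```

The point of the sketch is that the answer emerges only from repeated trials against the environment, which a text-generating model cannot perform on its own.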
However, the output of current GenAI models still contains hallucinations and lacks
explainability. Decisions that CSPs base on such output may carry critical technical risks.
In a complex network environment, GenAI cannot yet make decisions autonomously.
Key steps, such as awareness, analysis, decision, and execution, still rely on
human experience, control, and guidance.
The size of a model must be assessed based on the specific requirements of a service
scenario. A large model trained with a large amount of text might be very effective.
For example, OpenAI's GPT-4 exhibits great performance on assessment benchmarks
through natural language, but it does not mean that it is suitable for all types of
network service requirements. A larger model usually implies higher training and running
costs. It is estimated that training an LLM with hundreds of billions of parameters may
cost hundreds of millions of US dollars. The performance of a large model depends to
a large extent on the quality of training data. Therefore, to be used in use cases, large
models need to be trained with high-quality domain-specific data.
However, a large model is often impractical in resource-constrained environments because of its
high computing and storage requirements. A lightweight model that balances
performance and resource usage may be the better choice. For example, for a customer service system
that requires real-time feedback, a model that responds quickly and requires only
a moderate amount of resources may be preferred to a large model with slightly higher
accuracy but slow response. Selecting a GenAI large model requires the consideration of
multiple factors, including service requirements, cost-effectiveness, data quality, deployment
environment, and security compliance. CSPs should conduct multi-dimensional analysis to
ensure that the selected large model meets the needs of their service scenarios.
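One simple way to carry out such a multi-dimensional analysis is a weighted score per candidate model. The sketch below assumes each factor has already been normalized to a 0 to 1 scale where higher is better; the candidate names, factor values, and weights are invented for illustration and would need to come from a CSP's own evaluation.

```python
def score_models(candidates, weights):
    """Rank candidates by a weighted average of normalized factor scores."""
    total_weight = sum(weights.values())
    ranked = []
    for name, factors in candidates.items():
        score = sum(weights[f] * factors[f] for f in weights) / total_weight
        ranked.append((name, round(score, 3)))
    # Highest score first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores for a real-time customer service scenario.
    candidates = {
        "large_model": {"accuracy": 0.9, "latency": 0.3, "cost": 0.2},
        "lightweight_model": {"accuracy": 0.8, "latency": 0.9, "cost": 0.8},
    }
    weights = {"latency": 0.4, "accuracy": 0.3, "cost": 0.3}
    print(score_models(candidates, weights))
```

With latency weighted most heavily, the lightweight candidate ranks first despite its lower accuracy, which mirrors the customer service example above. Changing the weights changes the outcome, so the weights themselves should be agreed on per service scenario.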
Impacts
To meet the service requirements for higher efficiency and lower costs in network
maintenance, optimization, and operations, the use of GenAI will help to enable more
convenient, flexible, and diversified solutions.
LLMs are very effective at understanding user intents, accurately generating content,
and distributing tasks downstream. Their natural language interfaces have become de
facto standards for human-machine interaction in the intelligence era. Through tool
learning, large models can call APIs accurately without API-specific training,
driving a paradigm shift in intelligent system integration. Specifically, large
models can call APIs based on user intents and orchestrate APIs based on user objectives.
They also support intent understanding, objective breakdown, task orchestration,
execution, and reflection in the network domain, as well as unprecedentedly smooth
interaction between humans and systems.
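The pattern of calling and orchestrating APIs from a user intent can be sketched as a tool registry plus a planner. In the sketch below the planner is a keyword-based stand-in for an LLM, and the three tools are hypothetical placeholders for real NMS or OSS interfaces; none of the names come from an actual product API.

```python
# Hypothetical network tools; real ones would call NMS or OSS interfaces.
def query_alarms(ctx):
    ctx["alarms"] = ["LINK_DOWN on port 1/0/1"]
    return ctx


def diagnose(ctx):
    alarms = ctx.get("alarms", [])
    ctx["root_cause"] = "fiber cut" if any("LINK_DOWN" in a for a in alarms) else "unknown"
    return ctx


def create_ticket(ctx):
    ctx["ticket"] = f"Dispatch FME: {ctx.get('root_cause', 'unknown')}"
    return ctx


TOOLS = {"query_alarms": query_alarms, "diagnose": diagnose, "create_ticket": create_ticket}


def plan_from_intent(intent):
    """Stand-in for the LLM planner: map an intent to an ordered tool list."""
    text = intent.lower()
    plan = []
    if "fault" in text or "alarm" in text:
        plan += ["query_alarms", "diagnose"]
    if "ticket" in text:
        plan.append("create_ticket")
    return plan


def handle_intent(intent):
    """Execute the planned tools in order, passing a shared context along."""
    ctx = {"intent": intent}
    for name in plan_from_intent(intent):
        if name not in TOOLS:
            # Reject tool names the planner invented instead of guessing.
            raise KeyError(f"unknown tool: {name}")
        ctx = TOOLS[name](ctx)
    return ctx


if __name__ == "__main__":
    print(handle_intent("Find the fault on site A and open a ticket"))
```

Validating every planned step against the registry before execution is one way to limit the effect of hallucinated tool names when a real model replaces the stand-in planner.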
In network maintenance scenarios, such as fault diagnosis and risk identification, GenAI
and specialized network AI algorithm models automatically analyze alarms, indicators,
and logs to detect network fault events, diagnose the root causes of faults, generate
fault analysis reports, quickly match historical cases of the same type, and accurately
recommend fault rectification solutions. In this way, the models help achieve self-
diagnosis of network faults and potential risks, as well as automatic closure of network
O&M tickets, improving network O&M efficiency.
Suggestions
1. Build industry and domain corpora and provide high-quality data for
the applications of GenAI large models.
Domain-specific data governance: Identify and collect available data sources of the
industry or domain, including internal databases, public datasets, and experience and
knowledge documents. Clean data by deleting redundant data and filtering out invalid,
outdated, and inaccurate data to ensure data quality.
Domain-specific data labeling: Determine a proper amount of data, classify and label
the data based on expert experience and knowledge in the industry to improve data
explainability and accuracy, and integrate the domain-specific data into GenAI models to
facilitate model pre-training, post-training, or retrieval augmented generation (RAG).
Data security compliance: Ensure that all collected and used data comply with local
privacy and data protection regulations and laws as well as AI acts. Where necessary,
take security measures, such as data encryption, access control, and auditing, to protect
data from unauthorized access or disclosure.
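The cleaning step under domain-specific data governance can be sketched as a small filter that drops empty, outdated, and duplicate records. The record fields (`text`, `updated`) and the two-year age limit are assumptions made for this example. Detecting inaccurate content is not attempted here, since that typically requires expert review.

```python
from datetime import date


def clean_corpus(records, today, max_age_days=730):
    """Drop empty, outdated, and duplicate records; normalize whitespace."""
    seen = set()
    kept = []
    for rec in records:
        text = " ".join(rec.get("text", "").split())
        if not text:
            continue  # invalid: nothing left after normalization
        if (today - rec["updated"]).days > max_age_days:
            continue  # outdated
        key = text.lower()
        if key in seen:
            continue  # redundant: same text already kept
        seen.add(key)
        kept.append({**rec, "text": text})
    return kept


if __name__ == "__main__":
    # Hypothetical knowledge records.
    records = [
        {"text": "Reset the  ONT", "updated": date(2024, 5, 1)},
        {"text": "reset the ont", "updated": date(2024, 6, 1)},
        {"text": "   ", "updated": date(2024, 6, 1)},
        {"text": "Legacy DSLAM guide", "updated": date(2019, 1, 1)},
    ]
    print(clean_corpus(records, today=date(2024, 7, 1)))
```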
For specific use cases, such as network prediction, network maintenance, network
optimization, and autonomous decision-making, select appropriate GenAI models and
classic AI solutions to solve problems systematically. For these use cases, classic AI
algorithm models are more reliable, controllable, easy to understand, and resource-
saving than a solution that depends only on a GenAI model. For example, a machine
learning model can be used for deterministic classification; a reinforcement learning
model for deterministic optimization; and an inference system based on a logical rule
and a knowledge graph for deterministic demarcation and locating.
GenAI integrates open network APIs and domain-specific AI algorithm models through
language user interfaces (LUI) to build more intelligent and professional network agents
and make up for its limitations, as well as improve the accuracy, real-time performance,
explainability, and security controllability of all AI solutions.
Upgrade AI servers or AI intelligent boards for network management and control units
and NEs to realize efficient and real-time AI computing capabilities for network services,
such as converged awareness of network status, root cause analysis (RCA), and fault
recovery assurance. Provide professional and 24/7 online network O&M assistants
for O&M personnel and FMEs as well as more intelligent and high-quality zero-fault
experience for users.
Trend 7
Technology Insights
Multi-agent technology has sparked significant research interest from both academia
and industry, particularly as networks continue to evolve in complexity. This technology
is key to achieving layered network autonomy and cross-domain E2E closed-loop
management, a trend that is rapidly gaining momentum.
Agent and multi-agent topics featured prominently at major AI conferences such as AAAI, ICML,
ICLR, and IJCAI in 2024. The number of papers on multi-agent collaboration is also on
the rise. The research covers various directions, including multi-agent reinforcement
learning, collaboration mechanisms, communication interaction, competition and
confrontation, and cross-domain application. Researchers are exploring the use of
LLM-based multi-agent systems, which enable multiple agents to collaborate and
leverage their strengths to tackle complex issues. A 2024 paper titled "Large Language
Model-based Multi-Agents: A Survey of Progress and Challenges", based on the
review and analysis of 81 papers in the multi-agent field, defines LLM-based multi-
agent systems. Open-source multi-agent frameworks, such as Microsoft's AutoGen,
MetaGPT ("Meta Programming for A Multi-Agent Collaborative Framework,"
accepted for oral presentation at ICLR 2024), Agents, Camel, and ChatDev, are rapidly
evolving, significantly enhancing our understanding and exploration of multi-agent
communication and collaboration.
endpoint transmission issues caused BS out-of-service faults. This difficulty was further
complicated by the inaccurate association of wireless and transmission resources, leading
to a low success rate of ticket dispatch based on root causes and repeated ticket dispatch
on both the wireless and transmission sides. On average, over 50 tickets were dispatched
daily for BS out-of-service faults caused by transmission issues at the aggregation layer
or higher. The wireless and transmission O&M teams had to collaborate extensively and
manually transfer the list of faulty BSs. Therefore, CSPs are in urgent need of improved
cross-domain fault demarcation and location capabilities and enhanced efficiency.
The TM Forum released the Autonomous Networks Level 4 industry blueprint – high-
value scenarios report in June 2024. This report explains how multi-agent collaboration
enables E2E closed-loop autonomy in complex scenarios, including E2E customer
complaint handling, cross-domain fault demarcation, E2E service assurance, and wireless
network collaborative optimization. China Unicom Research Institute, China Mobile
Information Technology Center, China Academy of Information and Communications
Technology (CAICT), and China Telecom Research Institute collaboratively launched
the Catalyst project "GenAI empowers computing force network." The project aims
to efficiently schedule cloud-edge-device computing network resources based on
customers' personalized service requirements, providing one-stop intelligent service
support. E2E computing network service provisioning involves multi-agent collaboration.
To facilitate the E2E orchestration of computing network services, it is essential to
explore objective breakdown and negotiation between the master and other agents and
to combine the strengths of all the agents involved.
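Objective breakdown and negotiation between a master agent and domain agents can be sketched as a simple propose-and-commit exchange. The sketch assumes each domain agent manages one kind of resource and quotes a cost for a requested amount; the agent names, capacities, and costs are invented, and real negotiation protocols would be considerably richer.

```python
from dataclasses import dataclass


@dataclass
class DomainAgent:
    name: str
    resource: str     # kind of resource this agent manages
    capacity: int     # units still available
    unit_cost: float

    def propose(self, resource, amount):
        """Quote a cost if this agent can serve the sub-objective, else None."""
        if resource != self.resource or amount > self.capacity:
            return None
        return amount * self.unit_cost

    def commit(self, amount):
        self.capacity -= amount


class MasterAgent:
    def __init__(self, agents):
        self.agents = agents

    def break_down(self, request):
        """Split a service request into per-resource sub-objectives."""
        return [(res, amt) for res, amt in request.items() if amt > 0]

    def provision(self, request):
        """Collect offers for every sub-objective, then commit the cheapest."""
        winners = []
        for resource, amount in self.break_down(request):
            offers = [(agent.propose(resource, amount), agent) for agent in self.agents]
            offers = [(cost, agent) for cost, agent in offers if cost is not None]
            if not offers:
                raise RuntimeError(f"no agent can serve {amount} units of {resource}")
            cost, agent = min(offers, key=lambda offer: offer[0])
            winners.append((resource, amount, cost, agent))
        # Commit only after every sub-objective has a winner.
        plan = {}
        for resource, amount, cost, agent in winners:
            agent.commit(amount)
            plan[resource] = (agent.name, cost)
        return plan


if __name__ == "__main__":
    agents = [
        DomainAgent("edge", "compute", capacity=8, unit_cost=2.0),
        DomainAgent("cloud", "compute", capacity=100, unit_cost=1.0),
        DomainAgent("transport", "bandwidth", capacity=50, unit_cost=0.5),
    ]
    print(MasterAgent(agents).provision({"compute": 16, "bandwidth": 20}))
```

Committing only after all sub-objectives have a winner keeps a failed request from leaving resources partially reserved, which is one small instance of the conflict prevention that multi-agent collaboration requires.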
Impacts
By leveraging the strengths of multiple agents, this approach can overcome individual
limitations, accomplish complex tasks in CSP networks, and deliver excellent flexibility
and scalability. It provides key enabling technologies for E2E closed-loop management
AI agents are evolving rapidly, and we can expect to see many more agents in the
future, including intelligent terminals, embodied intelligence (such as robot dogs and
intelligent service robots), virtual intelligent assistants, and digital persons. These agents
will introduce new communication objects and service scenarios into future networks.
Future networks must offer new connection services for agents with varying forms and
capabilities, including digital identity authentication, interconnection and interworking,
and task collaboration.
(1) Currently, there are two technical approaches in the communication industry: multi-
agent collaboration based on reinforcement learning and multi-agent collaboration
based on LLMs. The industry is still exploring the future evolution and potential technical
complementarity of these two approaches.
(2) Multi-agent systems face unique capability requirements and key technical
challenges in communication and collaboration. The industry is still exploring
solutions, and no consistent methodology or framework is currently available. From
a collaboration perspective, the industry must implement objective breakdown,
coordination, and conflict prevention between different agents. Multi-layer collaboration
and message transfer between agents may magnify hallucination layer by layer. From
(3) Research on multi-agent standards is still in its early stages, and no unified standards
or specifications are available yet. Researchers are still working on the requirements for
interoperability, semantic transfer accuracy, and compatibility between different systems.
Suggestions
The evolution from a single agent to multi-agent collaboration aims to enable E2E
closed-loop management in complex scenarios of carrier networks and accelerate the
transformation toward high-level autonomy. This is the prevailing trend and the
ultimate goal. However, this transformation faces unique challenges due to unclear
high-value scenarios, early-stage industry standards, and immature key technologies. In the
first stage toward AN Level 4, all industry stakeholders must collaborate and accelerate
the implementation of agent-based single-domain closed-loop management in high-
value scenarios. As a key technology in the second stage toward AN Level 4, multi-agent
collaboration lags in commercial implementation in the telecom field. The telecom
industry and academia should collaborate to overcome the challenges of multi-agent
collaboration technology and advance its development and application in the telecom
field. Specific suggestions are as follows:
Suggestion 1: CSPs should establish business scenarios and service requirements for
multi-agent collaboration and conduct technical research on high-value scenarios.
Suggestion 2: The telecom industry should start standard planning and research
justification to advance the standardized development of multi-agent collaboration
technologies. This is crucial for the application of multi-agent collaboration technology.
03
Autonomous Driving
Network
AN has become an industry consensus in recent years and is the optimal choice for
CSPs to achieve comprehensive intelligent transformation. ADN is one of the four core
strategies of Huawei's Communications Network 2030 and is also Huawei's solution
for the global AN industry. ADN aims to develop self-fulfilling, self-healing, self-
optimizing, and autonomous networks based on connectivity and intelligence. Based
on the principles of single-domain autonomy, cross-domain collaboration, value-driven
approaches, visualization, and productization, Huawei collaborates with CSPs and
enterprises to develop self-configuration, self-healing, and self-optimizing capabilities,
[Figure: ADN architecture and business value. Recoverable labels:
Business value: Monetization (instant service delivery; revenue, TTM), Customer experience (proactive care and experience management; Wi-Fi experience, churn rate), O&M efficiency (intelligent fault O&M; work order automation rate, MTTR), Resource efficiency (smart green energy saving; multi-objective collaborative optimization, energy saving gain).
Layers: Intelligent service platform (centralized training and cross-domain policy orchestration); Intelligent management and control system (integrated training and inference, single-domain policy generation); Intelligent infrastructure (local inference, real-time awareness and execution) across the wireless, access, transmission, IP, core, and other domains.
Platform and product labels: digital operation platform, third-party capability platform, AUTIN, SmartCare, ADO, MAE, NCE, telecom foundation model.
Interactions between layers: insights and collaboration; data reporting and policy delivery.
Applications: Mate series copilots (5 categories, 10 copilots), including NOCMate, FMEMate, HCEMate, HDEMate, and LinkHomeMate, for roles such as NOC engineer, FME, IME, customer service, and HBB user; Spirit series agents (5 categories, 11 agents): ProvSpirit (service enablement), OperateSpirit (network change), OptimSpirit (network optimization), AssurSpirit (fault management), and CompSpirit (complaint handling).]
ADN delivers critical value to CSPs from the aspects of business, experience, efficiency,
and energy efficiency.
2. Customer experience+: Key indicators such as the service quality fulfillment rate
and complaint handling timeliness need to be improved to advance customer experience
and satisfaction and deliver a superior experience across the entire lifecycle. For instance,
in home broadband experience assurance scenarios, the home visit rate is reduced from
65% to 10%, user complaints are reduced by 60%, and the average revenue per user
(ARPU) is improved through differentiated broadband services.
Huawei has developed the ADN Level 4 solution for high-value scenarios based on
various key technologies, including the telecom foundation model, converged awareness,
and digital twin. The solution offers crucial application capabilities through role-based
copilots and scenario-specific agents to help CSPs and enterprises improve employee
abilities, enhance user experience, and deliver more significant value through digital
intelligent network productivity. The solution enables the evolution toward Highly
Autonomous Networks in numerous typical scenarios.
To handle complex complaints, Huawei has launched CompSpirit, which is based on the
multimodal large model for core network O&M. CompSpirit can quickly comprehend
complaint intents, demarcate complex processes, simplify operation processes, and
expedite the closure of the entire complaint handling process. The proportion of
complaint tickets with the handling process moved forward has increased to more than
20% on average, and the E2E complaint ticket handling duration has decreased from
14.6 hours to 5 hours, improving the efficiency by 64%.
[Figure: Complaint handling process before and after CompSpirit, spanning customers, the Service Dept, the Monitoring Dept, and the Core Network Dept.
Before: complaint handling; manual complaint classification; ticket dispatching; manual query and basic analysis (basic query on the core network workbench); manual complaint analysis (user signaling xDRs on the signaling platform); manual ticket filling; issue fixing and verification; follow-up visit and ticket closure.
After: complaint handling; complaint classification assistant, basic complaint analysis assistant, and complaint signaling analysis expert; auto filling based on results; issue fixing and verification; follow-up visit and ticket closure.]
» Automatic signaling parsing: The unique signaling large model adopts a three-
layer modeling approach to signaling behavior, allowing for a better understanding of
service logic. By merging signaling and semantic convergent coding, the large model
enables natural language-based Q&A signaling analysis, which simplifies the process
of signaling analysis. The model is capable of analyzing signaling issues layer by layer,
just like human experts. Through dialog-based Q&A interactions, even non-experts
can complete signaling analysis within 5 minutes and obtain recommended root causes
and related cases. Such results are comparable to those of core network O&M experts
with more than five years of experience. What's more, the time required for a single
signaling analysis phase has been reduced from 4 hours to just 5 minutes.
The solution has now been implemented in the production process of China Mobile Zhejiang,
effectively adding more than 30 experienced digital employees to the team. This has resulted in a
redefined complaint-handling process and a positive cycle of core network O&M transformation.
As the digital economy rapidly develops, premium service experience has become a core
demand of home broadband users, with more users willing to pay higher prices for a
better broadband experience. CSPs are also keen to enhance user stickiness, increase
revenue, and explore new marketing opportunities by providing premium experience-
based operations. These usher in a new era of experience-based operations for home
broadband services. As home network users increasingly demand a better broadband
access experience, their various STAs and apps require differentiated, user-specific services
from optical access networks. Traditional manual O&M is no longer up to the task.
Huawei's IntelligentFAN solution offers HCEMate for field engineers, LinkHomeMate for home
broadband users, and CompSpirit for experience assurance. This solution helps automatically
identify, intelligently diagnose, and quickly rectify user experience issues, ensuring a premium
home broadband service experience and significantly improving operations and O&M efficiency.
Solution overview
In the case of poor home broadband experience, CompSpirit automatically extracts the
network KPI and KQI data at the moment when the faults occurred with the help of
time-space correlative analysis, identifies the fault root causes within 30 seconds based
on the fault tree-based diagnosis algorithm, and generates feasible solutions. About
30% of these faults can be automatically rectified and closed remotely, while the rest
are handled by the CSP's NOC onsite. This solution helps rectify faults before users file
complaints, reduces onsite visits by 30% and fault diagnosis time by 80%, and increases
the E2E troubleshooting efficiency by 50%.
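A fault tree-based diagnosis of this kind can be sketched as a depth-first walk over a tree whose nodes test KPI values. The tree structure, KPI names, thresholds, and actions below are assumptions chosen for illustration; they are not CompSpirit's actual fault tree.

```python
# Hypothetical fault tree: each node may carry a KPI test; leaves carry an action.
FAULT_TREE = {
    "name": "poor home broadband experience",
    "children": [
        {
            "name": "optical line issue",
            "test": lambda k: k["rx_power_dbm"] < -27,
            "action": "onsite: check fiber and connectors",
        },
        {
            "name": "home Wi-Fi issue",
            "test": lambda k: k["wifi_rate_mbps"] < 50,
            "children": [
                {
                    "name": "co-channel interference",
                    "test": lambda k: k["interference_pct"] > 50,
                    "action": "remote: switch Wi-Fi channel",
                },
                {
                    "name": "weak coverage",
                    "test": lambda k: k["rssi_dbm"] < -75,
                    "action": "onsite: add or relocate an access point",
                },
            ],
        },
    ],
}


def diagnose(node, kpis):
    """Return (cause, action) for every leaf whose test and ancestor tests pass."""
    test = node.get("test")
    if test is not None and not test(kpis):
        return []  # this branch is ruled out
    children = node.get("children", [])
    if not children:
        return [(node["name"], node["action"])]
    causes = []
    for child in children:
        causes.extend(diagnose(child, kpis))
    return causes


if __name__ == "__main__":
    # KPI snapshot taken at the moment the poor experience was detected.
    snapshot = {"rx_power_dbm": -20, "wifi_rate_mbps": 20, "interference_pct": 70, "rssi_dbm": -60}
    print(diagnose(FAULT_TREE, snapshot))
```

Because each leaf carries an action tagged as remote or onsite, the same walk can separate faults that can be closed remotely from those that need a site visit.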
IP Network Troubleshooting: FMEMate and AssurSpirit Help CSPs Improve Quality and Efficiency
IP networks are complex, with multiple layers, sophisticated routing protocols, and
dynamically changing routes. A single network failure can generate multiple alarms.
Traditionally, identifying faults from multiple alarms relied on predefined rules, leading
to repeated ticket dispatch. O&M personnel had to troubleshoot faults based on their
experience and system functions. For faults requiring field work, field maintenance
engineers (FMEs) had to collaborate with NOC personnel remotely, but the entire
troubleshooting process lacked automation, reducing O&M efficiency.
To address this, the ADN solution, based on key technologies such as digital twin and
telecom foundation model, offers a scenario-specific troubleshooting agent called
AssurSpirit, as well as two digital assistants, NOCMate and FMEMate, for NOC personnel
and FMEs. These tools enable automatic fault analysis, significantly improving O&M
personnel's troubleshooting efficiency. AssurSpirit automatically identifies root alarms by
filtering, aggregating, and intelligently correlating all alarms. It uses intelligent optical
modules to collect ms-level optical power data and applies the chain-of-thought analysis
of large models to break down complex issues. By calling NMS functions, AssurSpirit
locates root causes and offers troubleshooting suggestions. By calling tools or interfaces,
it automatically resolves software commissioning issues. NOCMate and FMEMate
facilitate onsite fault rectification. Their intelligent Q&A and auxiliary troubleshooting
capabilities help FMEs quickly rectify faults without the need for remote collaboration with NOC personnel.
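The filter, aggregate, and correlate steps used to identify root alarms can be illustrated with a small rule-based sketch. This is not AssurSpirit's algorithm: it assumes a known upstream topology and treats an alarm as derived whenever any upstream network element is also alarmed. The alarm fields, severity scale, and topology are invented for the example.

```python
def find_root_alarms(alarms, upstream, min_severity=2):
    """Filter, aggregate, and correlate alarms; return the likely root alarms."""
    # Filter: drop cleared and low-severity alarms.
    active = [a for a in alarms if not a["cleared"] and a["severity"] >= min_severity]

    # Aggregate: one entry per (network element, alarm type) with a count.
    grouped = {}
    for alarm in active:
        key = (alarm["ne"], alarm["type"])
        grouped[key] = grouped.get(key, 0) + 1
    alarmed_nes = {ne for ne, _ in grouped}

    # Correlate: an alarm is derived if any upstream element is also alarmed.
    def has_alarmed_ancestor(ne):
        visited = set()
        parent = upstream.get(ne)
        while parent is not None and parent not in visited:
            if parent in alarmed_nes:
                return True
            visited.add(parent)
            parent = upstream.get(parent)
        return False

    return sorted(
        (ne, alarm_type, count)
        for (ne, alarm_type), count in grouped.items()
        if not has_alarmed_ancestor(ne)
    )


if __name__ == "__main__":
    # Hypothetical topology: each element maps to its upstream element.
    upstream = {"bs1": "acc1", "bs2": "acc1", "acc1": "agg1"}
    alarms = [
        {"ne": "agg1", "type": "LINK_DOWN", "severity": 4, "cleared": False},
        {"ne": "agg1", "type": "LINK_DOWN", "severity": 4, "cleared": False},
        {"ne": "acc1", "type": "LOS", "severity": 3, "cleared": False},
        {"ne": "bs1", "type": "OUT_OF_SERVICE", "severity": 4, "cleared": False},
        {"ne": "bs2", "type": "OUT_OF_SERVICE", "severity": 4, "cleared": False},
        {"ne": "bs3", "type": "HIGH_TEMP", "severity": 1, "cleared": False},
    ]
    print(find_root_alarms(alarms, upstream))
```

Dispatching one ticket per root alarm, rather than one per raw alarm, is what avoids the repeated ticket dispatch described earlier in this section.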
This ADN solution has been successfully implemented at China Mobile Guangdong. It
interworks with upper-layer troubleshooting and ticket systems, automating the entire
process from fault identification, locating, and rectification to verification. This has
resulted in a 15% reduction in trouble tickets, an 89% decrease in fault locating time,
and the creation of over 100 digital employees.
Nowadays, CSPs place increasingly high requirements on wireless network performance, and
the traditional network optimization process often faces many challenges. These include
high labor costs, low efficiency, dependence on expert experience, delayed responses, and
support for only single-objective optimization. As the network scale expands, the network
architecture has become increasingly complex, driving up O&M costs. CSPs are in urgent
need of more intelligent approaches to reduce costs and boost efficiency.
[Figure: OptimSpirit closed-loop network optimization. Recoverable labels: 1 Problem awareness (KPI 1 to KPI N compared with "To Be" targets), 2 Root cause analysis, network optimization intention generation, contention rules, 4 Execution closed-loop.]
The construction of campus networks is a major driver for quality education. Wi-Fi
access is used on most campus networks, especially in student dormitories. On these
networks, many services, such as online courses and videos, run concurrently,
making Wi-Fi experience assurance necessary. However, the short distance
between dormitories, numerous obstacles, dense population, and unauthorized
interference sources often cause problems such as Wi-Fi interference, unstable signals,
and slow network speed. These problems are often passively identified and located
based on complaints, resulting in low handling efficiency.
24/7 user experience assurance, optimizing Wi-Fi problems before customers complain
1. Intelligent scenario identification: Intelligent multi-dimensional analysis of internet access features and automatic identification of network scenarios, such as dormitories, classrooms, and offices.
2. Optimization objective recommendation: Scenario-specific automatic recommendation of optimization objectives (balancing, bandwidth-first, etc.) and adjustable optimization objectives.
3. Multi-objective optimization policy generation: Automatic policy generation based on comprehensive consideration of interference, coverage, bandwidth, roaming, and other factors; network-wide multi-objective collaborative optimization.
4. Policy execution time prediction: Optimization policy impact evaluation and prediction, prediction during off-peak and peak hours, and recommendation of the optimal policy execution time.
5. Policy delivery and execution: Execution during off-peak hours, with zero impact on services; real-time policy execution.
This solution has been piloted in a top university in China. It enables the dormitory Wi-Fi
network to automatically resolve non-hardware issues, improving the fault interception
rate to over 80%, significantly reducing the number of fault tickets, and enhancing
network experience and customer satisfaction.
HUAWEI TECHNOLOGIES CO., LTD.
Huawei Industrial Base
Bantian Longgang
Shenzhen 518129, P. R. China
Tel: +86-755-28780808
www.huawei.com
Trademark Notice
, , are trademarks or registered trademarks of Huawei Technologies Co., Ltd.
Other trademarks, product, service, and company names mentioned are the property of their respective owners.
GENERAL DISCLAIMER
THE INFORMATION IN THIS DOCUMENT MAY CONTAIN PREDICTIVE STATEMENTS INCLUDING, WITHOUT LIMITATION, STATEMENTS REGARDING THE
FUTURE FINANCIAL AND OPERATING RESULTS, FUTURE PRODUCT PORTFOLIOS, NEW TECHNOLOGIES, ETC. THERE ARE A NUMBER OF FACTORS
THAT COULD CAUSE ACTUAL RESULTS AND DEVELOPMENTS TO DIFFER MATERIALLY FROM THOSE EXPRESSED OR IMPLIED IN THE PREDICTIVE
STATEMENTS. THEREFORE, SUCH INFORMATION IS PROVIDED FOR REFERENCE PURPOSE ONLY AND CONSTITUTES NEITHER AN OFFER NOR AN
ACCEPTANCE. HUAWEI MAY CHANGE THE INFORMATION AT ANY TIME WITHOUT NOTICE.
Copyright © 2024 HUAWEI TECHNOLOGIES CO., LTD. All Rights Reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co.,Ltd.