
646 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 7, JULY 2003
characteristics of key applications for IP-based video, the currently used protocol infrastructure and its characteristics are introduced.
A. Applications
Before discussing the transmission of video over IP, it is necessary to take a closer look at its intended applications, whose nature determines the constraints and the protocol environment with which the video source coding has to cope.
Using IP as a transport, three major applications can currently
be identified.
• Conversational applications, such as videotelephony and
videoconferencing. Such applications are characterized
by very strict delay constraints—significantly less than
one second end-to-end latency, with less than 100 ms as
the (so far unreachable) goal. They are also limited to
point-to-point or small multipoint transmissions. Finally,
they imply the use of real-time video encoders and de-
coders, which allow the tuning of the coding parameters
in real-time, including the adaptive use of error-resilience
tools appropriate to the actual network conditions, and
often the use of feedback-based source coding tools.
However, the use of real-time encoders also limits the
maximum computational complexity, especially in the
encoder. Low delay constraints further prevent the use
of some coding tools that are optimized for high-latency
applications, such as bipredicted slices.
• The download of complete, pre-coded video streams. Here,
the bit string is transmitted as a whole, using reliable pro-
tocols such as ftp [3] or http [4]. The video coder can op-
timize the bit stream for the highest possible coding effi-
ciency, and does not have to obey restrictions in terms of
delay and error resilience. Furthermore, the video coding
process is normally not a real-time process; hence, com-
putational complexity of the encoder is also a less crit-
ical subject. Most of the traditional video coding research
somewhat implies this type of application.
• IP-based streaming. With respect to its delay characteristics, this technology lies between download and conversational applications.
There is no generally accepted definition for the term
“streaming”. Most people associate it with a transmission
service that allows the start of video playback before the
whole video bit stream has been transmitted, with an
initial delay of only a few seconds, and in a near real-time
fashion. The video stream is either pre-recorded and transmitted on demand, or a live session is compressed in real time (often in more than one representation with different bit rates) and sent over one or more multicast channels to a multitude of users. Due to the relaxed delay
constraints when compared to conversational services,
some high-delay video coding tools, such as bipredicted
slices, can be used. However, under normal conditions,
streaming services use unreliable transmission protocols,
making error control in the source and/or the channel
coding a necessity. The encoder has only limited—if
any—knowledge of the network conditions and has to
adapt the error resilience tools to a level that most users
would find acceptable. Streaming video is sent from a
single server, but may be distributed in a point-to-point,
multipoint, or even broadcast fashion. The group size
determines the possibility of the use of feedback-based
transport and coding tools.
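To give a feel for why the sub-100-ms goal for conversational services has so far been unreachable, the following sketch adds up a hypothetical end-to-end delay budget; every stage and every number is an illustrative assumption, not a measurement from any real system:

```python
# Hypothetical end-to-end latency budget for conversational video.
# All component values are illustrative assumptions.
budget_ms = {
    "camera capture":         33,  # one frame interval at 30 fps
    "encoding":               30,
    "packetization/send":      5,
    "network transit":        50,
    "receiver jitter buffer": 60,
    "decoding":               15,
    "display":                17,
}

total = sum(budget_ms.values())
for stage, ms in budget_ms.items():
    print(f"  {stage:>22}: {ms:3d} ms")
print(f"total end-to-end delay: {total} ms")  # 210 ms, well above 100 ms
```

Even with these optimistic per-stage assumptions, the sum lands around 210 ms, which illustrates why well under one second is achievable but 100 ms is not.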
This paper is mostly concerned with conversational services,
because here techniques from both the source coding and the
channel coding must be employed, and their interaction can
be shown. In addition, most research within JVT with respect
to IP-transport was performed assuming such an application.
Many of the discussions also apply to a streaming environment.
Readers primarily interested in download-type applications
should refer to papers that are concerned with coding efficiency
in this special issue [5].
IP networks can currently be found in two flavors: unmanaged IP networks, with the Internet as the most prominent example, and managed IP networks such as the wide-area networks of some long-distance telephony companies. A third category is emerging: wireless IP networks based on third-generation mobile networks. (Please see [2] in this Special Issue for an in-depth discussion.)
All three network types have somewhat different characteris-
tics in terms of the maximum transfer unit size (MTU size), the
probability for bit errors in packets, and the need to obey the Transmission Control Protocol (TCP) traffic paradigm.
1) MTU Size: The MTU size is the largest size of a packet
that can be transmitted without being split/recombined on the
transport and network layer. It is generally advisable to keep coded slice sizes as close as possible to, but never larger than, the MTU size, because doing so: 1) optimizes the payload/header overhead ratio and 2) minimizes the loss probability of a coded slice. If a slice is fragmented on the network/transport layer, the loss of a single fragment causes those layers to discard all other fragments belonging to that slice. The end-to-end MTU size
of a transmission path between two IP nodes is very difficult
to identify, and may change dynamically during a connection.
However, most research assumes MTU sizes of around 1500
bytes for wireline IP links (because of the maximum size of an
Ethernet packet). In a wireless environment, the MTU size is typically considerably smaller; most research, including JVT's wireless common conditions, assumes an MTU size of around 100 bytes.
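The two arguments above can be made concrete with a small sketch. It assumes a fixed 40-byte IP/UDP/RTP header per packet (a common rule-of-thumb estimate); the slice sizes are illustrative, not taken from any particular codec:

```python
# Sketch: why coded-slice sizes should approach, but never exceed, the MTU.
# Assumes a fixed 40-byte IP/UDP/RTP header per packet (a common estimate).

HEADER_BYTES = 40

def overhead_ratio(payload_bytes: int) -> float:
    """Header overhead as a fraction of the whole packet."""
    return HEADER_BYTES / (HEADER_BYTES + payload_bytes)

def fragments_needed(slice_bytes: int, mtu: int) -> int:
    """Fragments required if a coded slice exceeds the per-packet payload."""
    payload_per_packet = mtu - HEADER_BYTES
    return -(-slice_bytes // payload_per_packet)  # ceiling division

# Wireline case: ~1500-byte MTU; a 1460-byte slice fits in one packet.
print(f"wireline overhead: {overhead_ratio(1460):.1%}")            # ~2.7%
print(f"fragments, 1460 B at MTU 1500: {fragments_needed(1460, 1500)}")  # 1

# Wireless case: ~100-byte MTU; the same slice needs many fragments,
# and losing any single fragment discards the whole slice.
print(f"wireless overhead: {overhead_ratio(60):.1%}")              # 40.0%
print(f"fragments, 1460 B at MTU 100: {fragments_needed(1460, 100)}")    # 25
```

The wireless numbers show both effects at once: per-packet overhead grows from under 3% to 40%, and a slice sized for the wireline MTU would be split into 25 fragments, any one of whose losses destroys the slice.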
2) Bit Errors: Bit-error probabilities of today’s wireline net-
works are so low that, within the scope of this work, they can
be safely ignored. (Please see [2] for a discussion on how the
H.264 test model handles the significantly higher bit error rates
found in wireless networks.)
3) Rate Control and TCP Traffic Paradigm: Since the big
Internet Meltdown of the late 1980s, the transport protocol TCP
[6], which is used to carry most Internet content such as email
and Web traffic, obeys the so-called TCP traffic paradigm [7].
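The essential behavior of that paradigm, halving the sending rate on loss and growing it slowly otherwise, can be sketched as a simplified additive-increase/multiplicative-decrease (AIMD) loop; the constants and loss observations below are illustrative assumptions, not TCP's actual parameters:

```python
# Simplified AIMD sketch of the TCP traffic paradigm: halve the rate when
# the observed loss rate exceeds a threshold, otherwise grow it linearly.
# All constants are illustrative assumptions.

def next_rate(rate_kbps: float, loss_rate: float,
              loss_threshold: float = 0.01,
              increase_kbps: float = 50.0) -> float:
    if loss_rate > loss_threshold:
        return rate_kbps / 2          # multiplicative decrease on loss
    return rate_kbps + increase_kbps  # additive increase otherwise

rate = 1000.0
for loss in [0.0, 0.0, 0.05, 0.0, 0.0]:  # hypothetical loss observations
    rate = next_rate(rate, loss)
    print(f"loss={loss:.2f} -> rate={rate:.0f} kbps")
```

The resulting sawtooth (1050, 1100, then a drop to 550 kbps after the loss event) is exactly the behavior a video sender sharing links with TCP flows is expected to mimic.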
It would be beyond the scope of this paper to discuss it in detail
but, in short, the TCP traffic paradigm mandates that a sender reduce its sending bit rate by half (as a result of an adjustment of the TCP congestion window) as soon as it observes a packet loss rate
above a certain threshold. Once the packet loss rate drops below