The iov_iter interface is used to describe and iterate through buffers in the kernel. David Howells led a combined storage and filesystem session at the 2025 Linux Storage, Filesystem, Memory Management, and BPF Summit (LSFMM+BPF) to discuss ways to improve iov_iter. His topic proposal listed a few different ideas, including replacing some iov_iter types and possibly allowing mixed types in chains of iov_iter entries; he would like to improve both the interface itself and the ways iov_iter is used in the kernel.
Howells began with an overview. An iov_iter is a stateful description of a buffer, which can be used for I/O; it stores a position within the buffer that can be moved around. The API provides a set of operations, including copying data into or out of the buffer, getting a list of the pages that make up the buffer, and getting its length. There are multiple types of iov_iter. The initial ones were for user-space buffers, with ITER_IOVEC for the arguments to readv() and writev() and ITER_UBUF for the special case where the number of iovec entries (iovcnt) is one.
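The idea of a stateful buffer description can be illustrated with a small user-space sketch. It is a toy model only: struct toy_iter and the toy_* functions are hypothetical names invented for this example, and the kernel's real struct iov_iter, along with helpers such as copy_to_iter() and iov_iter_count(), is considerably more involved. The only real interface used here is struct iovec, the segment type that readv() and writev() accept.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec, as passed to readv()/writev() */

/* Toy user-space model only; this is NOT the kernel's struct iov_iter. */
struct toy_iter {
    const struct iovec *iov;  /* current segment */
    size_t nr_segs;           /* segments left, including the current one */
    size_t iov_offset;        /* position inside the current segment */
    size_t count;             /* total bytes remaining in the buffer */
};

/* Describe a buffer made of nr_segs segments, positioned at its start. */
static void toy_iter_init(struct toy_iter *it, const struct iovec *iov,
                          size_t nr_segs)
{
    it->iov = iov;
    it->nr_segs = nr_segs;
    it->iov_offset = 0;
    it->count = 0;
    for (size_t i = 0; i < nr_segs; i++)
        it->count += iov[i].iov_len;
}

/* Bytes remaining between the current position and the end. */
static size_t toy_iter_count(const struct toy_iter *it)
{
    return it->count;
}

/*
 * Copy up to len bytes from src into the described buffer, advancing the
 * stored position. Returns the number of bytes actually copied, which is
 * less than len if the buffer runs out of room.
 */
static size_t toy_copy_to_iter(const void *src, size_t len,
                               struct toy_iter *it)
{
    const char *p = src;
    size_t copied = 0;

    if (len > it->count)
        len = it->count;

    while (copied < len) {
        assert(it->nr_segs > 0);
        size_t room = it->iov->iov_len - it->iov_offset;
        size_t n = len - copied < room ? len - copied : room;

        memcpy((char *)it->iov->iov_base + it->iov_offset, p + copied, n);
        copied += n;
        it->iov_offset += n;
        it->count -= n;

        if (it->iov_offset == it->iov->iov_len) {
            /* Current segment is full; move on to the next one. */
            it->iov++;
            it->nr_segs--;
            it->iov_offset = 0;
        }
    }
    return copied;
}
```

With an iterator initialized over a 4-byte segment followed by an 8-byte segment, copying six bytes fills the first segment, places two bytes in the second, and leaves toy_iter_count() at six. A later copy continues from that position, and the caller never needs to track offsets itself.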
There are also three iov_iter types for describing page fragments: ITER_BVEC, which is a list of page, offset, and length; ITER_FOLIOQ, which describes folios and is used by filesystems; and ITER_XARRAY, which is deprecated and describes pages that are stored in an XArray. The problem with ITER_XARRAY is that it requires taking the read-copy-update (RCU) read lock inside iteration operations, which me