Blame - docs/unsafe_buffers.md - chromium/src.git

blob: 4a4d5e437a85bf1309ac6a2bd868d64ff4abd997 [file] [log] [blame] [view]

Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	1	# The Unsafe Buffers Clang plugin.
danakj	4e625fb	2024-03-06 20:47:46	[diff] [blame]	2
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	3	Our compiler contains a [plugin](../tools/clang/plugins/UnsafeBuffersPlugin.cpp)
				4	which reports places in the code where unsafe buffer operations are present.
				5
				6	[TOC]
				7
				8
				9	## Preventing OOB by removing Unsafe Buffers operations.
				10
				11	Out-of-bounds (OOB) security bugs commonly happen through C-style pointers
				12	which have no bounds information associated with them. We can prevent such
				13	bugs by always using C++ containers. Furthermore, the plugin will warn
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	14	warn about unsafe pointer usage that should be converted to containers.
				15	When an unsafe usage is detected, Clang prints a warning similar to
				16	```
				17	error: unsafe buffer access [-Werror,-Wunsafe-buffer-usage]
				18	```
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	19	and directs developers to this file for more information. Several common
				20	[Techniques](#container-based-ecosystem) for fixing these issues are presented
				21	later in this document.
danakj	4e625fb	2024-03-06 20:47:46	[diff] [blame]	22
danakj	22031fb1	2024-11-08 14:52:30	[diff] [blame]	23	Clang documentation includes a guide to working with unsafe-buffer-usage
				24	warnings here: https://clang.llvm.org/docs/SafeBuffers.html
				25
Tom Sepez	b142846	2025-01-27 21:44:43	[diff] [blame]	26	## Preventing OOB by removing unsafe libc calls.
				27
				28	OOB bugs also commonly happen through C-style library calls such as
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	29	memcpy() and memset() where the programmer is responsible for specifying
				30	an (unchecked) length. In order to encourage safer alternatives, the
				31	plugin can warn about unsafe calls which should be converted to safer
				32	C++ alternatives. When an unsafe libc call is detected, Clang prints
				33	a warning similar to
				34	```
				35	error: function 'memcpy' is unsafe [-Werror,-Wunsafe-buffer-usage-in-libc-call]
				36	```
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	37	## Unsafe buffer warning suppressions
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	38
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	39	Because the Chromium codebase is not yet compliant with these warnings,
				40	there are mechanisms to opt out code on a directory, file, or per-occurence
				41	basis.
				42
				43	By default, all files are checked for unsafe-buffer-usage.
				44
				45	### Opting out entire directories
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	46
Tom Sepez	b142846	2025-01-27 21:44:43	[diff] [blame]	47	Entire directories are opted out of unsafe buffer usage warnings through
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	48	the [`//build/config/unsafe_buffers_paths.txt`](../build/config/unsafe_buffers_paths.txt)
				49	file. As work progresses, directories will be removed from this list, and
				50	non-compliant files marked on a per-file basis as below. Early results
				51	indicate that often 85%+ of files in a directory already happen to be
				52	compliant, so file-by-file suppression allows this code to be subject
				53	to enforcement.
				54
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	55	This mechanism opts directories out of all warning categories (unsafe
				56	buffers and unsafe libc calls).
				57
				58	#### Syntax of Unsafe Buffer Paths file
				59
				60	Note: Paths should be written as relative to the root of the source tree with
				61	unix-style path separators. Directory prefixes should end with `/`, such
				62	as `base/`.
				63
				64	Empty lines are ignored.
				65
				66	The `#` character introduces a comment until the end of the line.
				67
Tom Sepez	b1a8156	2025-02-20 17:28:54	[diff] [blame]	68	Lines starting with `.` declare which checks are to be enforced, as
				69	a comma-separated list of values. Currently allowed values are `buffers`
				70	and `libc`.
				71
				72	All other lines specify which paths are to be included/excluded.
				73
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	74	Lines that begin with `-` are immediately followed by path prefixes that
				75	will not be checked for unsafe-buffer-usage. They are known to do unsafe
				76	things and should be changed to use constructs like base::span or containers
				77	like base::HeapArray and std::vector instead. See https://crbug.com/40285824
				78
				79	Lines that begin with `+` are immediately followed by path prefixes that will
				80	be checked for unsafe-buffer-usage. These have no such usage (or all such
				81	usage is annotated), and are protected against new unsafe pointer behaviour
				82	by the compiler. Generally, `+` lines are used to enable checks for
				83	sub-directories of a path that has previously disabled checks (with a
				84	`-` line).
				85
				86	If a file matches both a `-` and `+` line, the longest matching prefix takes
				87	precedence.
				88
				89	#### Removing directories from Unsafe Buffers Paths file
				90
				91	The recommended process for removing a `-dir/` line from this file is:
				92
				93	1. Remove the `-dir/` line from this paths file. Possibly add some subdirectories
				94	now needed to reduce scope, like `-dir/sub_dir/`.
				95
				96	2. Add `#pragma allow_unsafe_buffers` to every file in the directory that now
				97	has a compilation error (see the next section).
				98
				99	### Opting out individual files
				100
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	101	Individual files are opted out of unsafe pointer usage warnings though
				102	the use of the following snippet, which is to be placed immediately
				103	following the copyright header in a source file.
				104	```
				105	#ifdef UNSAFE_BUFFERS_BUILD
				106	// TODO(crbug.com/ABC): Remove this and convert code to safer constructs.
				107	#pragma allow_unsafe_buffers
				108	#endif
				109	```
				110
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	111	The above mechanism also suppress unsafe libc call warnings in addition
				112	to the unsafe buffer warnings.
				113
				114	To prevent back-sliding on files which have been made safe with respect
				115	to unsafe buffers, there is now a per-file pragma which suppresses the
				116	libc warnings while still enforcing the unsafe buffer warnings.
				117
				118	```
				119	#ifdef UNSAFE_BUFFERS_BUILD
				120	// TODO(crbug.com/ABC): Remove libc calls to fix these warnings.
				121	#pragma allow_unsafe_libc_calls
				122	#endif
				123	```
				124
				125	An initial set of files containing allow\_unsafe\_libc\_calls has been
				126	uploaded; please keep these in place until the pending libc enforcement
				127	is enabled for Chromium.
				128
				129	#### Removing pragmas from individual files.
				130
				131	The recommended process for removing pragmas from individual files is:
				132
				133	1. 1. Remove the pragma from the file.
				134
				135	2. Use the compiler warnings now generated to identify the individual
				136	expressions to suppress (see the next section).
				137
				138	### Opting out individual expressions
				139
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	140	Individual expressions or blocks of code are opted out by using the
Evan Stade	8d390c5	2025-04-22 20:28:49	[diff] [blame]	141	`UNSAFE_BUFFERS()` macro as defined in [`//base/compiler_specific.h`](../base/compiler_specific.h)
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	142	file. These should be rare once a project is fully converted, except
				143	perhaps when working with C-style external APIs. These must
				144	always be accompanied by a `// SAFETY:` comment explaining in detail
				145	how the code has been evaluated to be safe for all possible input.
				146
Evan Stade	8d390c5	2025-04-22 20:28:49	[diff] [blame]	147	Code introducing `UNSAFE_BUFFERS()` macro invocations without corresponding
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	148	`// SAFETY:` comment should be summarily rejected during code review.
				149
Tom Sepez	3072e38	2024-08-14 22:42:44	[diff] [blame]	150	To allow for incremental conversion, code can be temporarily opted out by
				151	using the `UNSAFE_TODO()` macro. This provides the same functionality as
				152	the `UNSAFE_BUFFERS()` macro, but allows easier searching for code in need
				153	of revision. Add TODO() comment, along the lines of
				154	`// TODO(crbug.com/xxxxxx): resolve safety issues`.
				155
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	156	This mechanism opts expressions out of all warning categories (unsafe
				157	buffers and unsafe libc calls).
Tom Sepez	b142846	2025-01-27 21:44:43	[diff] [blame]	158
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	159	#### Removing UNSAFE_TODO() from individual expressions
Tom Sepez	b142846	2025-01-27 21:44:43	[diff] [blame]	160
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	161	We seek to convert code to use owning containers like HeapArray and vector,
				162	as explained in the next section.
Tom Sepez	b142846	2025-01-27 21:44:43	[diff] [blame]	163
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	164	## Container-based ecosystem
				165
				166	Containers may be owning types or view types. The common owning containers that
				167	us contiguous storage are `std::vector`, `std::string`, `base::HeapArray`,
				168	`std::array`. Their common view types are `base::span`, `std::string_view`,
				169	`base::cstring_view`.
				170
				171	Other owning containers include maps, sets, deques, etc. These are not
				172	compatible with `base::span` as they are not contiguous and generally do not
				173	have an associated view type at this time.
				174
				175	We are using `base::span` instead of `std::span` in order to provide a type that
				176	can do more than the standard type. We also have other types and functions to
				177	work with ranges and spans instead of unbounded pointers and iterators.
				178
				179	The common conversions to spans are:
				180	- `base::span<T>` replaces `T* ptr, size_t size`.
				181	- `base::span<T, N>` replaces `T (&ptr)[N]` (a reference to a compile-time-sized
				182	array).
				183	- `base::raw_span<T>` replaces `base::span<T>` (and `T* ptr, size_t size`) for
				184	class fields.
				185
				186	### Span construction
Peter Kasting	e825c6f	2024-12-02 17:35:39	[diff] [blame]	187	- `base::span()` makes a span, deducing the type and size, from any contiguous
				188	range. It can also take explicit begin/end or data/size pairs.
				189	- `base::to_fixed_extent<N>()` makes a fixed-size span from a dynamic one.
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	190	- `base::as_bytes()` and `base::as_chars()` convert a span’s inner type to
				191	`uint8_t` or `char` respectively, making a byte-span or char-span.
				192	- `base::span_from_ref()` and `base::byte_span_from_ref()` make a span, or
				193	byte-span, from a single object.
Peter Kasting	e825c6f	2024-12-02 17:35:39	[diff] [blame]	194	- `base::as_byte_span()` and `base::as_writable_byte_span()` make a
				195	byte-span from any contiguous range.
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	196
				197	#### Padding bytes
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	198	Note that if the type contains padding bytes that were not somehow explicitly
				199	initialized, this can create reads of uninitialized memory. Conversion to a
				200	byte-span is most commonly used for spans of primitive types, such as going from
				201	`char` (such as in `std::string`) or `uint32_t` (in a `std::vector`) to
				202	`unit8_t`.
				203
				204	### Dynamic read/write of a span
				205	- `base::SpanReader` reads heterogeneous values from a (typically, byte-) span
				206	in a dynamic manner.
				207	- `base::SpanWriter` writes heterogeneous values into a (typically, byte-) span
				208	in a dynamic manner.
				209
				210	### Values to/from byte spans
				211	In [`//base/numerics/byte_conversions.h`](../base/numerics/byte_conversions.h)
				212	we have conversions between byte-arrays and big/little endian integers or
				213	floats. For example (and there are many other variations):
				214	- `base::U32FromBigEndian` converts from a big-endian byte-span to an unsigned
				215	32-bit integer.
				216	- `base::U32FromLittleEndian` converts from a little-endian byte-span to an
				217	unsigned
				218	- `base::U32ToBigEndian` converts from an integer to a big-endian-encoded
				219	4-byte-array.
				220	- `base::U32ToLittleEndian` converts from an integer to a little-endian-encoded
				221	4-byte-array.
				222
				223	### Heap-allocated arrays
				224	- `base::HeapArray<T>` replaces `std::unique_ptr<T[]>` and places the bounds of
				225	the array inside the `HeapArray` which makes it a bounds-safe range.
				226
				227	### Copying and filling arrays
				228	- `base::span::copy_from(span)` replaces `memcpy` and `memmove`, and verifies
				229	that the source and destination spans have the same size instead of writing
				230	out of bounds. It lowers to the same code as `memmove` when possible.
				231	- Note `std::ranges::copy` is not bounds-safe (though its name sounds like
				232	it should be).
				233	- `std::ranges::fill` replaces `memset` and works with a range so you don't
				234	need explicit bounds.
				235
				236	### String pointers
				237
				238	A common form of pointer is `const char*` which is used (sometimes) to represent
				239	a NUL-terminated string. The standard library gives us two types to replace
				240	`char*`, which allow us to know the bounds of the character array and work with
				241	the string as a range:
				242
				243	- `std::string` owns a NUL-terminated string.
				244	- `std::string_view` is a view of a non-NUL-terminated string.
				245
				246	What’s missing is a view of a string that is guaranteed to be NUL-terminated so
				247	that you can call `.c_str()` to generate a `const char*` suitable for C APIs.
				248
				249	- `base::cstring_view` is a view of a NUL-terminated string. This avoids the
				250	need to construct a `std::string` in order to ensure a terminating NUL is
				251	present. Use this as a view type whenever your code bottoms out in a C API
				252	that needs NUL-terminated string pointer.
				253
				254	### Use of std::array<T>.
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	255
				256	The clang plugin is very particular about indexing a C-style array (e.g.
				257	`int arr[100]`) with a variable index. Often these issues can be resolved
				258	by replacing this with `std::array<int, 100> arr`, which provides safe
				259	indexed operations. In particular, new code should prefer to use the
				260	`std::array<T, N>` mechanism.
				261
				262	For arrays where the size is determined by the compiler (e.g.
				263	`int arr[] = { 1, 3, 5 };`), the `std::to_array<T>()` helper function
				264	should be used along with the `auto` keyword:
				265	`auto arr = std::to_array<int>({1, 3, 5});`
				266
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	267	### Avoid reinterpret_cast
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	268
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	269	Casts to bytes are common and can be handled as follows.
				270
				271	#### Writing to a byte span
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	272
				273	A common idiom in older code is to write into a byte array by casting
				274	the array into a pointer to a larger type (such as `uint32_t` or `float`)
danakj	cdf43e2	2024-07-02 16:42:16	[diff] [blame]	275	and then writing through that pointer. This can result in Undefined Behaviour
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	276	and violates the rules of the C++ abstract machine.
				277
				278	Instead, keep the byte array as a `base::span<uint8_t>`, and write to it
				279	directly by chunking it up into pieces of the size you want to write.
				280
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	281	Using `take_first()` (good for repeated modifications and loops):
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	282	```cc
				283	void write_floats(base::span<uint8_t> out, float f1, float f2) {
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	284	// Write `f1` into `out`'s prefix, moving `out` forward.
				285	out.take_first<4>().copy_from(base::byte_span_from_ref(f1));
				286	// Write `f2` into `out`'s new prefix (after `f1`).
				287	out.copy_prefix_from(base::byte_span_from_ref(f2));
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	288	}
				289	```
				290
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	291	Using `split_at()` (good when there are exactly two pieces):
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	292	```cc
				293	void write_floats(base::span<uint8_t> out, float f1, float f2) {
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	294	// Split `out` into a prefix to write `f1` into, and a remainder.
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	295	auto [write_f1, rem] = out.split_at<4>();
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	296	// Write `f1` into the prefix portion, `write_f1`.
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	297	write_f1.copy_from(base::byte_span_from_ref(f1));
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	298	// Write `f2` into the beginning of the remainder.
				299	rem.copy_prefix_from(base::byte_span_from_ref(f2));
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	300	}
				301	```
				302
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	303	Using `SpanWriter` and endian-aware `FloatToLittleEndian()` (good when non-fatal
				304	APIs are desired):
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	305	```cc
				306	void write_floats(base::span<uint8_t> out, float f1, float f2) {
				307	auto writer = base::SpanWriter(out);
				308	CHECK(writer.Write(base::FloatToLittleEndian(f1)));
				309	CHECK(writer.Write(base::FloatToLittleEndian(f2)));
				310	}
				311	```
				312
				313	Writing big-endian, with `SpanWriter` and endian-aware `U32ToBigEndian()`:
				314	```cc
				315	void write_values(base::span<uint8_t> out, uint32_t i1, uint32_t i2) {
				316	auto writer = base::SpanWriter(out);
				317	CHECK(writer.Write(base::U32ToBigEndian(i1)));
				318	// SpanWriter has a built-in shortcut to do the same thing.
				319	CHECK(writer.WriteU32BigEndian(i2));
				320	// Verify we wrote to the whole output. We can put a size parameter in the
				321	// `out` span to push this check to compile-time when it's a constant.
				322	CHECK_EQ(writer.remaining(), 0u);
				323	}
				324	```
				325
				326	Writing an array to a byte span with `copy_from()`:
				327	```cc
				328	void write_floats(base::span<uint8_t> out, std::vector<const float> floats) {
				329	base::span<const uint8_t> byte_floats = base::as_byte_span(floats);
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	330	// Or use copy_from() if you want to CHECK at runtime that all of `out` has
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	331	// been written to.
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	332	out.copy_prefix_from(byte_floats);
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	333	}
				334	```
				335
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	336	#### Reading from a byte span
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	337
				338	Instead of turning a `span<const uint8_t>` into a pointer of a larger type,
				339	which can cause Undefined Behaviour, read values out of the byte span and
				340	convert each one as a value (not as a pointer).
				341
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	342	Using `take_first()` and endian-aware conversion `FloatFromLittleEndian`:
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	343	```cc
				344	void read_floats(base::span<const uint8_t> in, float& f1, float& f2) {
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	345	f1 = base::FloatFromLittleEndian(in.take_first<4>());
				346	f2 = base::FloatFromLittleEndian(in.take_first<4>());
danakj	a4c4238	2024-06-18 19:05:44	[diff] [blame]	347	}
				348	```
				349
				350	Using `SpanReader` and endian-aware `U32FromBigEndian()`:
				351	```cc
				352	void read_values(base::span<const uint8_t> in, int& i1, int& i2, int& i3) {
				353	auto reader = base::SpanReader(in);
				354	i1 = base::U32FromBigEndian(*reader.Read<4>());
				355	i2 = base::U32FromBigEndian(*reader.Read<4>());
				356	// SpanReader has a built-in shortcut to do the same thing.
				357	CHECK(reader.ReadU32BigEndian(i3));
				358	// Verify we read the whole input. We can put a size parameter in the `in`
				359	// span to push this check to compile-time when it's a constant.
				360	CHECK_EQ(reader.remaining(), 0u);
				361	}
				362	```
				363
				364	## Patterns for spanification
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	365
				366	Most pointer issues ought to be resolved by converting to containers. In
				367	particular, one common conversion is to replace `T*` pointers with
				368	`base::span<T>` in a process known as spanification, since most pointers
				369	are unowned references into an array (or vector). The appropriate
				370	replacement for the pointer is
danakj	e7db1e3f3	2024-04-16 20:43:24	[diff] [blame]	371	[`base::span`](../base/containers/span.h).
danakj	4e625fb	2024-03-06 20:47:46	[diff] [blame]	372
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	373	### Copying arrays (`memcpy`)
				374
				375	You have:
				376	```cc
				377	uint8_t array1[12];
				378	uint8_t array2[16];
				379	uint64_t array3[2];
				380	memcpy(array1, array2 + 8, 4);
				381	memcpy(array1 + 4, array3, 8);
				382	```
				383
				384	Spanified:
				385	```cc
				386	uint8_t array1[12];
				387	uint8_t array2[16];
				388	uint64_t array3[2];
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	389	base::span<uint8_t> span1(array1);
				390	span1.take_first<4>().copy_from(base::span(array2).subspan<8, 4>());
				391	span1.copy_from(base::as_byte_span(array3).first<8>());
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	392
				393	// Use `split_at()` to ensure `array1` is fully written.
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	394	auto [from2, from3] = base::span(array1).split_at<4>();
				395	from2.copy_from(base::span(array2).subspan<8, 4>());
				396	from3.copy_from(base::as_byte_span(array3).first<8>());
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	397	```
				398
				399	### Zeroing arrays (`memset`)
				400
				401	`std::ranges::fill` works on any range/container and won't write out of
				402	bounds. Converting arbitrary types into a byte array (through
				403	`base::as_writable_byte_span`) is only valid when the type holds trivial
				404	types such as primitives. Unlike `memset`, a constructed object can be
				405	given as the value to use as in `std::ranges_fill(container, Object())`.
				406
				407	You have:
				408	```cc
				409	uint8_t array1[12];
				410	uint64_t array2[2];
				411	Object array3[4];
				412	memset(array1, 0, 12);
				413	memset(array2, 0, 2 * sizeof(uint64_t));
				414	memset(array3, 0, 4 * sizeof(Object));
				415	```
				416
				417	Spanified:
				418	```cc
				419	uint8_t array1[12];
				420	uint64_t array2[2];
				421	Object array3[4];
Peter Kasting	448444f	2024-12-13 01:25:36	[diff] [blame]	422	std::ranges::fill(array1, 0);
				423	std::ranges::fill(array2, 0);
				424	std::ranges::fill(base::as_writable_byte_span(array3), 0);
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	425	```
				426
				427	### Comparing arrays (`memcmp`)
				428
				429	You have:
				430	```cc
				431	uint8_t array1[12] = {};
				432	uint8_t array2[12] = {};
				433	bool eq = memcmp(array1, array2, sizeof(array1)) == 0;
danakj	8d29e18	2024-06-05 16:50:18	[diff] [blame]	434	bool less = memcmp(array1, array2, sizeof(array1)) < 0;
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	435
				436	// In tests.
				437	for (size_t i = 0; i < sizeof(array1); ++i) {
				438	SCOPED_TRACE(i);
				439	EXPECT_EQ(array1[i], array2[i]);
				440	}
				441	```
				442
				443	Spanified:
				444	```cc
				445	uint8_t array1[12] = {};
				446	uint8_t array2[12] = {};
				447	// If one side is a span, the other will convert to span too.
				448	bool eq = base::span(array1) == array2;
danakj	8d29e18	2024-06-05 16:50:18	[diff] [blame]	449	bool less = base::span(array1) < array2;
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	450
				451	// In tests.
				452	EXPECT_EQ(base::span(array1), array2);
				453	```
				454
				455	### Copying array into an integer
				456
				457	You have:
				458	```cc
				459	uint8_t array[44] = {};
				460	uint32_t v1;
				461	memcpy(&v1, array, sizeof(v1)); // Front.
				462	uint64_t v2;
				463	memcpy(&v2, array + 6, sizeof(v2)); // Middle.
				464	```
				465
				466	Spanified:
				467	```cc
				468	#include "base/numerics/byte_conversions.h"
				469	...
				470	uint8_t array[44] = {};
				471	uint32_t v1 = base::U32FromLittleEndian(base::span(array).first<4u>()); // Front.
				472	uint64_t v2 = base::U64FromLittleEndian(base::span(array).subspan<6u, 8u>()); // Middle.
				473	```
				474
				475	### Copy an array into an integer via cast
				476
David Benjamin	0ffdea7	2024-06-12 19:55:22	[diff] [blame]	477	Note: This pattern is prone to more UB than out-of-bounds access. It is UB to
				478	cast pointers if the result is not aligned, so these cases are often buggy or
				479	were only correct due to subtle assumptions on buffer alignment. The spanified
				480	version avoids this pitfalls. It has no alignment requirement.
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	481
				482	You have:
				483	```cc
				484	uint8_t array[44] = {};
				485	uint32_t v1 = reinterpret_cast<const uint32_t>(array); // Front.
David Benjamin	0ffdea7	2024-06-12 19:55:22	[diff] [blame]	486	uint64_t v2 = reinterpret_cast<const uint64_t>(array + 16); // Middle.
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	487	```
				488
				489	Spanified:
				490	```cc
				491	#include "base/numerics/byte_conversions.h"
				492	...
				493	uint8_t array[44] = {};
				494	uint32_t v1 = base::U32FromLittleEndian(base::span(array).first<4u>()); // Front.
David Benjamin	0ffdea7	2024-06-12 19:55:22	[diff] [blame]	495	uint64_t v2 = base::U64FromLittleEndian(base::span(array).subspan<16u, 8u>()); // Middle.
danakj	6a49deaa	2024-06-04 21:31:55	[diff] [blame]	496	```
				497
				498	### Making a byte array (`span<uint8_t>`) from a string (or any array/range)
				499
				500	You have:
				501	```cc
				502	std::string str = "hello world";
				503	func_with_const_ptr_size(reinterpret_cast<const uint8_t*>(str.data()), str.size());
				504	func_with_mut_ptr_size(reinterpret_cast<uint8_t*>(str.data()), str.size());
				505	```
				506
				507	Spanified:
				508	```cc
				509	std::string str = "hello world";
				510	base::span<const uint8_t> bytes = base::as_byte_span(str);
				511	func_with_const_ptr_size(bytes.data(), bytes.size());
				512	base::span<uint8_t> mut_bytes = base::as_writable_byte_span(str);
				513	func_with_mut_ptr_size(mut_bytes.data(), mut_bytes.size());
				514
				515	// Replace pointer and size with a span, though.
				516	func_with_const_span(base::as_byte_span(str));
				517	func_with_mut_span(base::as_writable_byte_span(str));
				518	```
				519
				520	### Making a byte array (`span<uint8_t>`) from an object
				521
				522	You have:
				523	```cc
				524	uint8_t array[8];
				525	uint64_t val;
				526	two_byte_arrays(array, reinterpret_cast<const uint8_t*>(&val));
				527	```
				528
				529	Spanified:
				530	```cc
				531	uint8_t array[8];
				532	uint64_t val;
				533	base::span<uint8_t> val_span = base::byte_span_from_ref(val);
				534	two_byte_arrays(array, val_span.data());
				535
				536	// Replace an unbounded pointer a span, though.
				537	two_byte_spans(base::span(array), base::byte_span_from_ref(val));
				538	```
danakj	e7db1e3f3	2024-04-16 20:43:24	[diff] [blame]	539
danakj	b70695e	2024-07-16 15:20:28	[diff] [blame]	540	### Avoid std::next() for silencing warnings, use ranges
				541
				542	When we convert `pointer + index` to `std::next(pointer, index)` we silence the
				543	`Wunsafe-buffer-usage` warning by pushing the unsafe pointer arithmetic into
				544	the `std::next()` function in a system header, but we have the same unsafety.
				545	`std::next()` does no additional bounds checking.
				546
				547	Instead of using `std::next()`, rewrite away from using pointers (or iterators)
				548	entirely by using ranges. `span()` allows us to take a subset of a contiguous
				549	range without having to use iterators that we move with arithmetic or
				550	`std::next()`.
				551
danakj	e64eb29	2024-10-22 19:02:33	[diff] [blame]	552	Likewise, `std::advance()` can silence the warning but does not add any safety
				553	to the pointer arithmetic and should be avoided.
				554
danakj	b70695e	2024-07-16 15:20:28	[diff] [blame]	555	Instead of using pointer/iterator arithmetic:
				556	```cc
				557	// Unsafe buffers warning on the unchecked arithmetic.
				558	auto it = std::find(vec.begin() + offset, vec.end(), 20);
				559	// No warning... But has the same security risk!
				560	auto it = std::find(std::next(vec.begin(), offset), vec.end(), 20);
				561	```
				562
				563	Use a range, with `span()` providing a view of a subset of the range:
				564	```cc
				565	auto it = std::ranges::find(base::span(vec).subspan(offset), 20);
				566	```
				567
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	568	### Functions with array pointer parameters
danakj	4e625fb	2024-03-06 20:47:46	[diff] [blame]	569
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	570	Functions that receive a pointer argument into an array may read
				571	or write out of bounds of that array if subsequent manual size
				572	calculations are incorrect. Such functions should be avoided if
				573	possible, or marked with the `UNSAFE_BUFFER_USAGE` attribute macro
				574	otherwise. This macro propagates to their callers that they must
				575	be called from inside an `UNSAFE_BUFFERS()` region (along with
				576	a corresponding safety comment explaining how the caller knows
				577	the call will be safe).
danakj	4e625fb	2024-03-06 20:47:46	[diff] [blame]	578
				579	The same is true for functions that accept an iterator instead
danakj	e7db1e3f3	2024-04-16 20:43:24	[diff] [blame]	580	of a range type. Some examples of each are `memcpy()` and
				581	`std::copy()`.
danakj	4e625fb	2024-03-06 20:47:46	[diff] [blame]	582
Tom Sepez	9e7f696	2024-06-13 00:21:59	[diff] [blame]	583	Again, calling such functions is unsafe and should be avoided.
				584	Replace such functions with an API built on base::span
danakj	4e625fb	2024-03-06 20:47:46	[diff] [blame]	585	or other range types which prevents any chance of OOB memory
				586	access. For instance, replace `memcpy()`, `std::copy()` and
				587	`std::ranges::copy()` with `base::span::copy_from()`. And
				588	replace `memset()` with `std::ranges::fill()`.
danakj	b470507	2024-11-06 20:54:09	[diff] [blame]	589
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	590	### Aligned memory
danakj	b470507	2024-11-06 20:54:09	[diff] [blame]	591
				592	An aligned heap allocation can be constructed into a `base::HeapArray` through
				593	the `base::AlignedUninit<T>(size, alignment)` function in
				594	`//base/memory/aligned_memory.h`. It will allocate space for `size` many `T`
				595	objects aligned to `alignment`, and return a `base::AlignedHeapArray<T>` which
				596	is a `base::HeapArray` with an appropriate deleter. Note that the returned
				597	memory is uninitialized.
				598	```cc
				599	base::AlignedHeapArray<float> array = base::AlignedUninit<float>(size, alignment);
				600	```
				601
				602	Some containers are built on top of buffers of `char`s that are aligned for
				603	some other `T` in order to manage the lifetimes of objects in the buffer
				604	through in-place construction (`std::construct_at`) and destruction. While the
				605	memory is allocated and destroyed as `char`, it is accessed as `T`. The
				606	`base::AlignedUninitCharArray<T>(size, alignment)` function in
				607	`//base/memory/aligned_memory.h` handles this by returning both:
				608	- A `base::AlignedHeapArray<char>` that will not call destructors on anything in its
				609	buffer.
				610	- A `base::span<T>` that points to all of the (not-yet-created) objects in the
				611	`AlignedHeapArray`. This span can be used to construct `T` objects in place in the
				612	buffer, and the caller is responsible for destroying them as well.
				613	```cc
				614	auto [a, s] = base::AlignedUninitCharArray<float>(size, alignment);
				615	base::AlignedHeapArray<char> array = std::move(a);
				616	base::span<float> span = s;
				617	```
Tom Sepez	d647d84	2025-02-11 21:11:51	[diff] [blame]	618