blob: 7cf15640294bf059262c72039e0f0d8d68e02a41 [file] [log] [blame] [view]
danakj4e625fb2024-03-06 20:47:461# Preventing OOB through Unsafe Buffers errors (aka Spanification)
2
3Out-of-bounds (OOB) security bugs commonly happen through pointers
danakj6a49deaa2024-06-04 21:31:554which have no bounds information associated with them. We can prevent such
danakj4e625fb2024-03-06 20:47:465bugs by always using containers.
6
danakj6a49deaa2024-06-04 21:31:557Entire directories have been opted out of unsafe pointer usage
8warnings through the
9[`//build/config/unsafe_buffers_paths.txt`](../build/config/unsafe_buffers_paths.txt)
10file. Our [compiler](../tools/clang/plugins/UnsafeBuffersPlugin.cpp) enables
11the `-Wunsafe-buffer-usage` warning in any file that is not
12opted out. As we convert unsafe pointers to safe constructs like
13`base::span`, `base::HeapArray` and `std::vector`, we will
14remove paths from that file to enable the warnings across the
15codebase. To incrementally apply the warnings across a whole directory,
16individual files [can be opted-out](#controlling-warnings-for-a-single-file)
17of the warning.
18
19## Patterns for spanification
20
danakj4e625fb2024-03-06 20:47:4621Most pointers are unowned references into an array (or vector)
22and the most appropriate replacement for the pointer is
danakje7db1e3f32024-04-16 20:43:2423[`base::span`](../base/containers/span.h).
danakj4e625fb2024-03-06 20:47:4624
danakj6a49deaa2024-06-04 21:31:5525### Copying arrays (`memcpy`)
26
27You have:
28```cc
29uint8_t array1[12];
30uint8_t array2[16];
31uint64_t array3[2];
32memcpy(array1, array2 + 8, 4);
33memcpy(array1 + 4, array3, 8);
34```
35
36Spanified:
37```cc
38uint8_t array1[12];
39uint8_t array2[16];
40uint64_t array3[2];
41base::span(array1).first(4u).copy_from(base::span(array2).subspan(8u, 4u));
42base::span(array1).subspan(4u).copy_from(base::as_byte_span(array3).first(8u));
43
44// Use `split_at()` to ensure `array1` is fully written.
45auto [from2, from3] = base::span(array1).split_at(4u);
46from2.copy_from(base::span(array2).subspan(8u, 4u));
47from3.copy_from(base::as_byte_span(array3).first(8u));
48
49// This can even be ensured at compile time (if sizes and offsets are all
50// constants).
51auto [from2, from3] = base::span(array1).split_at<4u>();
52from2.copy_from(base::span(array2).subspan<8u, 4u>());
53from3.copy_from(base::as_byte_span(array3).first<8u>());
54```
55
56### Zeroing arrays (`memset`)
57
58`std::ranges::fill` works on any range/container and won't write out of
59bounds. Converting arbitrary types into a byte array (through
60`base::as_writable_byte_span`) is only valid when the type holds trivial
61types such as primitives. Unlike `memset`, a constructed object can be
62given as the value to use as in `std::ranges_fill(container, Object())`.
63
64You have:
65```cc
66uint8_t array1[12];
67uint64_t array2[2];
68Object array3[4];
69memset(array1, 0, 12);
70memset(array2, 0, 2 * sizeof(uint64_t));
71memset(array3, 0, 4 * sizeof(Object));
72```
73
74Spanified:
75```cc
76uint8_t array1[12];
77uint64_t array2[2];
78Object array3[4];
79std::ranges::fill(array1, 0u);
80std::ranges::fill(array2, 0u);
81std::ranges::fill(base::as_writable_byte_span(array3), 0u);
82```
83
84### Comparing arrays (`memcmp`)
85
86You have:
87```cc
88uint8_t array1[12] = {};
89uint8_t array2[12] = {};
90bool eq = memcmp(array1, array2, sizeof(array1)) == 0;
danakj8d29e182024-06-05 16:50:1891bool less = memcmp(array1, array2, sizeof(array1)) < 0;
danakj6a49deaa2024-06-04 21:31:5592
93// In tests.
94for (size_t i = 0; i < sizeof(array1); ++i) {
95 SCOPED_TRACE(i);
96 EXPECT_EQ(array1[i], array2[i]);
97}
98```
99
100Spanified:
101```cc
102uint8_t array1[12] = {};
103uint8_t array2[12] = {};
104// If one side is a span, the other will convert to span too.
105bool eq = base::span(array1) == array2;
danakj8d29e182024-06-05 16:50:18106bool less = base::span(array1) < array2;
danakj6a49deaa2024-06-04 21:31:55107
108// In tests.
109EXPECT_EQ(base::span(array1), array2);
110```
111
112### Copying array into an integer
113
114You have:
115```cc
116uint8_t array[44] = {};
117uint32_t v1;
118memcpy(&v1, array, sizeof(v1)); // Front.
119uint64_t v2;
120memcpy(&v2, array + 6, sizeof(v2)); // Middle.
121```
122
123Spanified:
124```cc
125#include "base/numerics/byte_conversions.h"
126...
127uint8_t array[44] = {};
128uint32_t v1 = base::U32FromLittleEndian(base::span(array).first<4u>()); // Front.
129uint64_t v2 = base::U64FromLittleEndian(base::span(array).subspan<6u, 8u>()); // Middle.
130```
131
132### Copy an array into an integer via cast
133
David Benjamin0ffdea72024-06-12 19:55:22134Note: This pattern is prone to more UB than out-of-bounds access. It is UB to
135cast pointers if the result is not aligned, so these cases are often buggy or
136were only correct due to subtle assumptions on buffer alignment. The spanified
137version avoids this pitfalls. It has no alignment requirement.
danakj6a49deaa2024-06-04 21:31:55138
139You have:
140```cc
141uint8_t array[44] = {};
142uint32_t v1 = *reinterpret_cast<const uint32_t*>(array); // Front.
David Benjamin0ffdea72024-06-12 19:55:22143uint64_t v2 = *reinterpret_cast<const uint64_t*>(array + 16); // Middle.
danakj6a49deaa2024-06-04 21:31:55144```
145
146Spanified:
147```cc
148#include "base/numerics/byte_conversions.h"
149...
150uint8_t array[44] = {};
151uint32_t v1 = base::U32FromLittleEndian(base::span(array).first<4u>()); // Front.
David Benjamin0ffdea72024-06-12 19:55:22152uint64_t v2 = base::U64FromLittleEndian(base::span(array).subspan<16u, 8u>()); // Middle.
danakj6a49deaa2024-06-04 21:31:55153```
154
155### Making a byte array (`span<uint8_t>`) from a string (or any array/range)
156
157You have:
158```cc
159std::string str = "hello world";
160func_with_const_ptr_size(reinterpret_cast<const uint8_t*>(str.data()), str.size());
161func_with_mut_ptr_size(reinterpret_cast<uint8_t*>(str.data()), str.size());
162```
163
164Spanified:
165```cc
166std::string str = "hello world";
167base::span<const uint8_t> bytes = base::as_byte_span(str);
168func_with_const_ptr_size(bytes.data(), bytes.size());
169base::span<uint8_t> mut_bytes = base::as_writable_byte_span(str);
170func_with_mut_ptr_size(mut_bytes.data(), mut_bytes.size());
171
172// Replace pointer and size with a span, though.
173func_with_const_span(base::as_byte_span(str));
174func_with_mut_span(base::as_writable_byte_span(str));
175```
176
177### Making a byte array (`span<uint8_t>`) from an object
178
179You have:
180```cc
181uint8_t array[8];
182uint64_t val;
183two_byte_arrays(array, reinterpret_cast<const uint8_t*>(&val));
184```
185
186Spanified:
187```cc
188uint8_t array[8];
189uint64_t val;
190base::span<uint8_t> val_span = base::byte_span_from_ref(val);
191two_byte_arrays(array, val_span.data());
192
193// Replace an unbounded pointer a span, though.
194two_byte_spans(base::span(array), base::byte_span_from_ref(val));
195```
danakje7db1e3f32024-04-16 20:43:24196
197## Controlling warnings for a single file
198
199Warnings can be disabled for a single (C++ source or header) file by
200writing `#pragma allow_unsafe_buffers` anywhere in the file. This can
danakj51d26a42024-04-25 14:23:56201be used to mark future work to drive down over time:
202```
203#ifdef UNSAFE_BUFFERS_BUILD
204// TODO(crbug.com/ABC): Remove this and convert code to safer constructs.
205#pragma allow_unsafe_buffers
206#endif
207```
danakje7db1e3f32024-04-16 20:43:24208
danakj51d26a42024-04-25 14:23:56209It's recommended to place the pragma directly copyright header. This can be used
210to work through files in a directory incrementally. To do so, remove the whole
211directory from opt-out in the
danakje7db1e3f32024-04-16 20:43:24212[`//build/config/unsafe_buffers_paths.txt`](../build/config/unsafe_buffers_paths.txt)
danakj51d26a42024-04-25 14:23:56213file, and temporarily mark any files with `#pragma allow_unsafe_buffers` that
214need it.
215
216The warnings can be enabled for a single (C++ source or header) file by writing
217`#pragma check_unsafe_buffers` anywhere in the file. These also need to be
218guarded by `#ifdef UNSAFE_BUFFERS_BUILD` if being checked in.
danakj4e625fb2024-03-06 20:47:46219
220# Functions with array pointer parameters
221
222Functions that receive a pointer into an array may read
223or write out of bounds of the pointer if given a pointer that
224is incorrectly sized. Such functions should be marked with the
danakje7db1e3f32024-04-16 20:43:24225`UNSAFE_BUFFER_USAGE` attribute macro.
danakj4e625fb2024-03-06 20:47:46226
227The same is true for functions that accept an iterator instead
danakje7db1e3f32024-04-16 20:43:24228of a range type. Some examples of each are `memcpy()` and
229`std::copy()`.
danakj4e625fb2024-03-06 20:47:46230
231Calling such functions is unsafe and should generally be avoided.
232Instead, replace such functions with an API built on base::span
233or other range types which prevents any chance of OOB memory
234access. For instance, replace `memcpy()`, `std::copy()` and
235`std::ranges::copy()` with `base::span::copy_from()`. And
236replace `memset()` with `std::ranges::fill()`.
237
238# Writing unsafe data structures with pointers
239
240TODO: Write about `UNSAFE_BUFFERS()` for rare exceptions where
241the correctness of pointer bounds can be fully explained and
danakje7db1e3f32024-04-16 20:43:24242encapsulated, such as within a data structure or when working
243with Operating System and C-like APIs.