blob: 8f32ebaa850f6036ce369c54c9b4811b4b516979 [file] [log] [blame] [view]
danakj4e625fb2024-03-06 20:47:461# Preventing OOB through Unsafe Buffers errors (aka Spanification)
2
3Out-of-bounds (OOB) security bugs commonly happen through pointers
danakj6a49deaa2024-06-04 21:31:554which have no bounds information associated with them. We can prevent such
danakj4e625fb2024-03-06 20:47:465bugs by always using containers.
6
danakj6a49deaa2024-06-04 21:31:557Entire directories have been opted out of unsafe pointer usage
8warnings through the
9[`//build/config/unsafe_buffers_paths.txt`](../build/config/unsafe_buffers_paths.txt)
10file. Our [compiler](../tools/clang/plugins/UnsafeBuffersPlugin.cpp) enables
11the `-Wunsafe-buffer-usage` warning in any file that is not
12opted out. As we convert unsafe pointers to safe constructs like
13`base::span`, `base::HeapArray` and `std::vector`, we will
14remove paths from that file to enable the warnings across the
15codebase. To incrementally apply the warnings across a whole directory,
16individual files [can be opted-out](#controlling-warnings-for-a-single-file)
17of the warning.
18
19## Patterns for spanification
20
danakj4e625fb2024-03-06 20:47:4621Most pointers are unowned references into an array (or vector)
22and the most appropriate replacement for the pointer is
danakje7db1e3f32024-04-16 20:43:2423[`base::span`](../base/containers/span.h).
danakj4e625fb2024-03-06 20:47:4624
danakj6a49deaa2024-06-04 21:31:5525### Copying arrays (`memcpy`)
26
27You have:
28```cc
29uint8_t array1[12];
30uint8_t array2[16];
31uint64_t array3[2];
32memcpy(array1, array2 + 8, 4);
33memcpy(array1 + 4, array3, 8);
34```
35
36Spanified:
37```cc
38uint8_t array1[12];
39uint8_t array2[16];
40uint64_t array3[2];
41base::span(array1).first(4u).copy_from(base::span(array2).subspan(8u, 4u));
42base::span(array1).subspan(4u).copy_from(base::as_byte_span(array3).first(8u));
43
44// Use `split_at()` to ensure `array1` is fully written.
45auto [from2, from3] = base::span(array1).split_at(4u);
46from2.copy_from(base::span(array2).subspan(8u, 4u));
47from3.copy_from(base::as_byte_span(array3).first(8u));
48
49// This can even be ensured at compile time (if sizes and offsets are all
50// constants).
51auto [from2, from3] = base::span(array1).split_at<4u>();
52from2.copy_from(base::span(array2).subspan<8u, 4u>());
53from3.copy_from(base::as_byte_span(array3).first<8u>());
54```
55
56### Zeroing arrays (`memset`)
57
58`std::ranges::fill` works on any range/container and won't write out of
59bounds. Converting arbitrary types into a byte array (through
60`base::as_writable_byte_span`) is only valid when the type holds trivial
61types such as primitives. Unlike `memset`, a constructed object can be
62given as the value to use as in `std::ranges_fill(container, Object())`.
63
64You have:
65```cc
66uint8_t array1[12];
67uint64_t array2[2];
68Object array3[4];
69memset(array1, 0, 12);
70memset(array2, 0, 2 * sizeof(uint64_t));
71memset(array3, 0, 4 * sizeof(Object));
72```
73
74Spanified:
75```cc
76uint8_t array1[12];
77uint64_t array2[2];
78Object array3[4];
79std::ranges::fill(array1, 0u);
80std::ranges::fill(array2, 0u);
81std::ranges::fill(base::as_writable_byte_span(array3), 0u);
82```
83
84### Comparing arrays (`memcmp`)
85
86You have:
87```cc
88uint8_t array1[12] = {};
89uint8_t array2[12] = {};
90bool eq = memcmp(array1, array2, sizeof(array1)) == 0;
danakj8d29e182024-06-05 16:50:1891bool less = memcmp(array1, array2, sizeof(array1)) < 0;
danakj6a49deaa2024-06-04 21:31:5592
93// In tests.
94for (size_t i = 0; i < sizeof(array1); ++i) {
95 SCOPED_TRACE(i);
96 EXPECT_EQ(array1[i], array2[i]);
97}
98```
99
100Spanified:
101```cc
102uint8_t array1[12] = {};
103uint8_t array2[12] = {};
104// If one side is a span, the other will convert to span too.
105bool eq = base::span(array1) == array2;
danakj8d29e182024-06-05 16:50:18106bool less = base::span(array1) < array2;
danakj6a49deaa2024-06-04 21:31:55107
108// In tests.
109EXPECT_EQ(base::span(array1), array2);
110```
111
112### Copying array into an integer
113
114You have:
115```cc
116uint8_t array[44] = {};
117uint32_t v1;
118memcpy(&v1, array, sizeof(v1)); // Front.
119uint64_t v2;
120memcpy(&v2, array + 6, sizeof(v2)); // Middle.
121```
122
123Spanified:
124```cc
125#include "base/numerics/byte_conversions.h"
126...
127uint8_t array[44] = {};
128uint32_t v1 = base::U32FromLittleEndian(base::span(array).first<4u>()); // Front.
129uint64_t v2 = base::U64FromLittleEndian(base::span(array).subspan<6u, 8u>()); // Middle.
130```
131
132### Copy an array into an integer via cast
133
134Note: This pattern was UB-prone in addition to bounds-unsafe, as it can produce
135a pointer with incorrect alignment. reinterpret_cast is very easy to hold wrong
136in C++ and is worth avoiding.
137
138You have:
139```cc
140uint8_t array[44] = {};
141uint32_t v1 = *reinterpret_cast<const uint32_t*>(array); // Front.
142uint64_t v2 = *reinterpret_cast<const uint64_t*>(array + 6); // Middle.
143```
144
145Spanified:
146```cc
147#include "base/numerics/byte_conversions.h"
148...
149uint8_t array[44] = {};
150uint32_t v1 = base::U32FromLittleEndian(base::span(array).first<4u>()); // Front.
151uint64_t v2 = base::U64FromLittleEndian(base::span(array).subspan<6u, 8u>()); // Middle.
152```
153
154### Making a byte array (`span<uint8_t>`) from a string (or any array/range)
155
156You have:
157```cc
158std::string str = "hello world";
159func_with_const_ptr_size(reinterpret_cast<const uint8_t*>(str.data()), str.size());
160func_with_mut_ptr_size(reinterpret_cast<uint8_t*>(str.data()), str.size());
161```
162
163Spanified:
164```cc
165std::string str = "hello world";
166base::span<const uint8_t> bytes = base::as_byte_span(str);
167func_with_const_ptr_size(bytes.data(), bytes.size());
168base::span<uint8_t> mut_bytes = base::as_writable_byte_span(str);
169func_with_mut_ptr_size(mut_bytes.data(), mut_bytes.size());
170
171// Replace pointer and size with a span, though.
172func_with_const_span(base::as_byte_span(str));
173func_with_mut_span(base::as_writable_byte_span(str));
174```
175
176### Making a byte array (`span<uint8_t>`) from an object
177
178You have:
179```cc
180uint8_t array[8];
181uint64_t val;
182two_byte_arrays(array, reinterpret_cast<const uint8_t*>(&val));
183```
184
185Spanified:
186```cc
187uint8_t array[8];
188uint64_t val;
189base::span<uint8_t> val_span = base::byte_span_from_ref(val);
190two_byte_arrays(array, val_span.data());
191
192// Replace an unbounded pointer a span, though.
193two_byte_spans(base::span(array), base::byte_span_from_ref(val));
194```
danakje7db1e3f32024-04-16 20:43:24195
196## Controlling warnings for a single file
197
198Warnings can be disabled for a single (C++ source or header) file by
199writing `#pragma allow_unsafe_buffers` anywhere in the file. This can
danakj51d26a42024-04-25 14:23:56200be used to mark future work to drive down over time:
201```
202#ifdef UNSAFE_BUFFERS_BUILD
203// TODO(crbug.com/ABC): Remove this and convert code to safer constructs.
204#pragma allow_unsafe_buffers
205#endif
206```
danakje7db1e3f32024-04-16 20:43:24207
danakj51d26a42024-04-25 14:23:56208It's recommended to place the pragma directly copyright header. This can be used
209to work through files in a directory incrementally. To do so, remove the whole
210directory from opt-out in the
danakje7db1e3f32024-04-16 20:43:24211[`//build/config/unsafe_buffers_paths.txt`](../build/config/unsafe_buffers_paths.txt)
danakj51d26a42024-04-25 14:23:56212file, and temporarily mark any files with `#pragma allow_unsafe_buffers` that
213need it.
214
215The warnings can be enabled for a single (C++ source or header) file by writing
216`#pragma check_unsafe_buffers` anywhere in the file. These also need to be
217guarded by `#ifdef UNSAFE_BUFFERS_BUILD` if being checked in.
danakj4e625fb2024-03-06 20:47:46218
219# Functions with array pointer parameters
220
221Functions that receive a pointer into an array may read
222or write out of bounds of the pointer if given a pointer that
223is incorrectly sized. Such functions should be marked with the
danakje7db1e3f32024-04-16 20:43:24224`UNSAFE_BUFFER_USAGE` attribute macro.
danakj4e625fb2024-03-06 20:47:46225
226The same is true for functions that accept an iterator instead
danakje7db1e3f32024-04-16 20:43:24227of a range type. Some examples of each are `memcpy()` and
228`std::copy()`.
danakj4e625fb2024-03-06 20:47:46229
230Calling such functions is unsafe and should generally be avoided.
231Instead, replace such functions with an API built on base::span
232or other range types which prevents any chance of OOB memory
233access. For instance, replace `memcpy()`, `std::copy()` and
234`std::ranges::copy()` with `base::span::copy_from()`. And
235replace `memset()` with `std::ranges::fill()`.
236
237# Writing unsafe data structures with pointers
238
239TODO: Write about `UNSAFE_BUFFERS()` for rare exceptions where
240the correctness of pointer bounds can be fully explained and
danakje7db1e3f32024-04-16 20:43:24241encapsulated, such as within a data structure or when working
242with Operating System and C-like APIs.