danakj | 4e625fb | 2024-03-06 20:47:46 | [diff] [blame] | 1 | # Preventing OOB through Unsafe Buffers errors (aka Spanification) |
| 2 | |
danakj | a4c4238 | 2024-06-18 19:05:44 | [diff] [blame] | 3 | Out-of-bounds (OOB) security bugs commonly happen through C-style pointers which |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 4 | have no bounds information associated with them. We can prevent such |
danakj | a4c4238 | 2024-06-18 19:05:44 | [diff] [blame] | 5 | bugs by always using C++ containers. Furthermore, the Clang compiler can |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 6 | warn about unsafe pointer usage that should be converted to containers. |
| 7 | When an unsafe usage is detected, Clang prints a warning similar to |
| 8 | ``` |
| 9 | error: unsafe buffer access [-Werror,-Wunsafe-buffer-usage] |
| 10 | ``` |
| 11 | and directs developers to this file for more information. |
danakj | 4e625fb | 2024-03-06 20:47:46 | [diff] [blame] | 12 | |
danakj | 781691f | 2024-06-28 18:29:19 | [diff] [blame] | 13 | [TOC] |
| 14 | |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 15 | ## Suppressions |
danakj | 6a49deaa | 2024-06-04 21:31:55 | [diff] [blame] | 16 | |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 17 | Our [compiler](../tools/clang/plugins/UnsafeBuffersPlugin.cpp) enables |
| 18 | the `-Wunsafe-buffer-usage` warning on all files by default. Because the |
| 19 | Chromium codebase is not yet compliant with these warnings, there are |
| 20 | mechanisms to opt out code on a directory, file, or per-occurence basis. |
danakj | 6a49deaa | 2024-06-04 21:31:55 | [diff] [blame] | 21 | |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 22 | Entire directories are opted out of unsafe pointer usage warnings through |
| 23 | the [`//build/config/unsafe_buffers_paths.txt`](../build/config/unsafe_buffers_paths.txt) |
| 24 | file. As work progresses, directories will be removed from this list, and |
| 25 | non-compliant files marked on a per-file basis as below. Early results |
| 26 | indicate that often 85%+ of files in a directory already happen to be |
| 27 | compliant, so file-by-file suppression allows this code to be subject |
| 28 | to enforcement. |
| 29 | |
| 30 | Individual files are opted out of unsafe pointer usage warnings though |
| 31 | the use of the following snippet, which is to be placed immediately |
| 32 | following the copyright header in a source file. |
| 33 | ``` |
| 34 | #ifdef UNSAFE_BUFFERS_BUILD |
| 35 | // TODO(crbug.com/ABC): Remove this and convert code to safer constructs. |
| 36 | #pragma allow_unsafe_buffers |
| 37 | #endif |
| 38 | ``` |
| 39 | |
| 40 | Individual expressions or blocks of code are opted out by using the |
| 41 | `UNSAFE_BUFFERS()` macro as defined in [`//base/compiler_specific.h`[(../base/compiler_specific.h) |
| 42 | file. These should be rare once a project is fully converted, except |
| 43 | perhaps when working with C-style external APIs. These must |
| 44 | always be accompanied by a `// SAFETY:` comment explaining in detail |
| 45 | how the code has been evaluated to be safe for all possible input. |
| 46 | |
Adrian Taylor | 77d214a | 2024-06-21 13:09:09 | [diff] [blame] | 47 | Code introducing UNSAFE_BUFFERS() macro invocations without corresponding |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 48 | `// SAFETY:` comment should be summarily rejected during code review. |
| 49 | |
Tom Sepez | 3072e38 | 2024-08-14 22:42:44 | [diff] [blame] | 50 | To allow for incremental conversion, code can be temporarily opted out by |
| 51 | using the `UNSAFE_TODO()` macro. This provides the same functionality as |
| 52 | the `UNSAFE_BUFFERS()` macro, but allows easier searching for code in need |
| 53 | of revision. Add TODO() comment, along the lines of |
| 54 | `// TODO(crbug.com/xxxxxx): resolve safety issues`. |
| 55 | |
danakj | a4c4238 | 2024-06-18 19:05:44 | [diff] [blame] | 56 | ## Container-based ecosystem |
| 57 | |
| 58 | Containers may be owning types or view types. The common owning containers that |
| 59 | us contiguous storage are `std::vector`, `std::string`, `base::HeapArray`, |
| 60 | `std::array`. Their common view types are `base::span`, `std::string_view`, |
| 61 | `base::cstring_view`. |
| 62 | |
| 63 | Other owning containers include maps, sets, deques, etc. These are not |
| 64 | compatible with `base::span` as they are not contiguous and generally do not |
| 65 | have an associated view type at this time. |
| 66 | |
| 67 | We are using `base::span` instead of `std::span` in order to provide a type that |
| 68 | can do more than the standard type. We also have other types and functions to |
| 69 | work with ranges and spans instead of unbounded pointers and iterators. |
| 70 | |
| 71 | The common conversions to spans are: |
| 72 | - `base::span<T>` replaces `T* ptr, size_t size`. |
| 73 | - `base::span<T, N>` replaces `T (&ptr)[N]` (a reference to a compile-time-sized |
| 74 | array). |
| 75 | - `base::raw_span<T>` replaces `base::span<T>` (and `T* ptr, size_t size`) for |
| 76 | class fields. |
| 77 | |
| 78 | ### Span construction |
| 79 | - `base::span()` constructor can make a span, and deduce the type and size, |
| 80 | from: |
| 81 | - a `T[N]` array |
| 82 | - `std::array<T, N>` |
| 83 | - `std::vector` |
| 84 | - `std::string` |
| 85 | - any contiguous range with `begin()` and `end()` methods. |
| 86 | - any type with `T* data()` and `size_t size()` methods. |
| 87 | - `base::make_span<N>()` can make a fixed-size span from any range. |
| 88 | - `base::as_bytes()` and `base::as_chars()` convert a span’s inner type to |
| 89 | `uint8_t` or `char` respectively, making a byte-span or char-span. |
| 90 | - `base::span_from_ref()` and `base::byte_span_from_ref()` make a span, or |
| 91 | byte-span, from a single object. |
| 92 | - `base::as_byte_span()` and `base::as_writable_byte_span()` to make a |
| 93 | byte-span (const or mutable) from any container that can convert to a |
| 94 | `base::span<T>`, such as `std::string` or `std::vector<Stuff>`. |
| 95 | |
| 96 | #### Padding bytes |
| 97 | |
| 98 | Note that if the type contains padding bytes that were not somehow explicitly |
| 99 | initialized, this can create reads of uninitialized memory. Conversion to a |
| 100 | byte-span is most commonly used for spans of primitive types, such as going from |
| 101 | `char` (such as in `std::string`) or `uint32_t` (in a `std::vector`) to |
| 102 | `unit8_t`. |
| 103 | |
| 104 | ### Dynamic read/write of a span |
| 105 | - `base::SpanReader` reads heterogeneous values from a (typically, byte-) span |
| 106 | in a dynamic manner. |
| 107 | - `base::SpanWriter` writes heterogeneous values into a (typically, byte-) span |
| 108 | in a dynamic manner. |
| 109 | |
| 110 | ### Values to/from byte spans |
| 111 | In [`//base/numerics/byte_conversions.h`](../base/numerics/byte_conversions.h) |
| 112 | we have conversions between byte-arrays and big/little endian integers or |
| 113 | floats. For example (and there are many other variations): |
| 114 | - `base::U32FromBigEndian` converts from a big-endian byte-span to an unsigned |
| 115 | 32-bit integer. |
| 116 | - `base::U32FromLittleEndian` converts from a little-endian byte-span to an |
| 117 | unsigned |
| 118 | - `base::U32ToBigEndian` converts from an integer to a big-endian-encoded |
| 119 | 4-byte-array. |
| 120 | - `base::U32ToLittleEndian` converts from an integer to a little-endian-encoded |
| 121 | 4-byte-array. |
| 122 | |
| 123 | ### Heap-allocated arrays |
| 124 | - `base::HeapArray<T>` replaces `std::unique_ptr<T[]>` and places the bounds of |
| 125 | the array inside the `HeapArray` which makes it a bounds-safe range. |
| 126 | |
| 127 | ### Copying and filling arrays |
| 128 | - `base::span::copy_from(span)` replaces `memcpy` and `memmove`, and verifies |
| 129 | that the source and destination spans have the same size instead of writing |
| 130 | out of bounds. It lowers to the same code as `memmove` when possible. |
| 131 | - Note `std::ranges::copy` is not bounds-safe (though its name sounds like |
| 132 | it should be). |
| 133 | - `std::ranges::fill` replaces `memset` and works with a range so you don't |
| 134 | need explicit bounds. |
| 135 | |
| 136 | ### String pointers |
| 137 | |
| 138 | A common form of pointer is `const char*` which is used (sometimes) to represent |
| 139 | a NUL-terminated string. The standard library gives us two types to replace |
| 140 | `char*`, which allow us to know the bounds of the character array and work with |
| 141 | the string as a range: |
| 142 | |
| 143 | - `std::string` owns a NUL-terminated string. |
| 144 | - `std::string_view` is a view of a non-NUL-terminated string. |
| 145 | |
| 146 | What’s missing is a view of a string that is guaranteed to be NUL-terminated so |
| 147 | that you can call `.c_str()` to generate a `const char*` suitable for C APIs. |
| 148 | |
| 149 | - `base::cstring_view` is a view of a NUL-terminated string. This avoids the |
| 150 | need to construct a `std::string` in order to ensure a terminating NUL is |
| 151 | present. Use this as a view type whenever your code bottoms out in a C API |
| 152 | that needs NUL-terminated string pointer. |
| 153 | |
| 154 | ### Use of std::array<T>. |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 155 | |
| 156 | The clang plugin is very particular about indexing a C-style array (e.g. |
| 157 | `int arr[100]`) with a variable index. Often these issues can be resolved |
| 158 | by replacing this with `std::array<int, 100> arr`, which provides safe |
| 159 | indexed operations. In particular, new code should prefer to use the |
| 160 | `std::array<T, N>` mechanism. |
| 161 | |
| 162 | For arrays where the size is determined by the compiler (e.g. |
| 163 | `int arr[] = { 1, 3, 5 };`), the `std::to_array<T>()` helper function |
| 164 | should be used along with the `auto` keyword: |
| 165 | `auto arr = std::to_array<int>({1, 3, 5});` |
| 166 | |
danakj | a4c4238 | 2024-06-18 19:05:44 | [diff] [blame] | 167 | ## Avoid reinterpret_cast |
| 168 | |
| 169 | ### Writing to a byte span |
| 170 | |
| 171 | A common idiom in older code is to write into a byte array by casting |
| 172 | the array into a pointer to a larger type (such as `uint32_t` or `float`) |
danakj | cdf43e2 | 2024-07-02 16:42:16 | [diff] [blame] | 173 | and then writing through that pointer. This can result in Undefined Behaviour |
danakj | a4c4238 | 2024-06-18 19:05:44 | [diff] [blame] | 174 | and violates the rules of the C++ abstract machine. |
| 175 | |
| 176 | Instead, keep the byte array as a `base::span<uint8_t>`, and write to it |
| 177 | directly by chunking it up into pieces of the size you want to write. |
| 178 | |
| 179 | Using `first()`: |
| 180 | ```cc |
| 181 | void write_floats(base::span<uint8_t> out, float f1, float f2) { |
| 182 | out.first<4>().copy_from(base::byte_span_from_ref(f1)); |
| 183 | out = out.subspan(4u); // Advance the span past what we wrote. |
| 184 | out.first<4>().copy_from(base::byte_span_from_ref(f2)); |
| 185 | } |
| 186 | ``` |
| 187 | |
| 188 | Using `split_at()`: |
| 189 | ```cc |
| 190 | void write_floats(base::span<uint8_t> out, float f1, float f2) { |
| 191 | auto [write_f1, rem] = out.split_at<4>(); |
| 192 | auto [write_f2, rem2] = rem.split_at<4>(); |
| 193 | write_f1.copy_from(base::byte_span_from_ref(f1)); |
| 194 | write_f2.copy_from(base::byte_span_from_ref(f2)); |
| 195 | } |
| 196 | ``` |
| 197 | |
| 198 | Using `SpanWriter` and endian-aware `FloatToLittleEndian()`: |
| 199 | ```cc |
| 200 | void write_floats(base::span<uint8_t> out, float f1, float f2) { |
| 201 | auto writer = base::SpanWriter(out); |
| 202 | CHECK(writer.Write(base::FloatToLittleEndian(f1))); |
| 203 | CHECK(writer.Write(base::FloatToLittleEndian(f2))); |
| 204 | } |
| 205 | ``` |
| 206 | |
| 207 | Writing big-endian, with `SpanWriter` and endian-aware `U32ToBigEndian()`: |
| 208 | ```cc |
| 209 | void write_values(base::span<uint8_t> out, uint32_t i1, uint32_t i2) { |
| 210 | auto writer = base::SpanWriter(out); |
| 211 | CHECK(writer.Write(base::U32ToBigEndian(i1))); |
| 212 | // SpanWriter has a built-in shortcut to do the same thing. |
| 213 | CHECK(writer.WriteU32BigEndian(i2)); |
| 214 | // Verify we wrote to the whole output. We can put a size parameter in the |
| 215 | // `out` span to push this check to compile-time when it's a constant. |
| 216 | CHECK_EQ(writer.remaining(), 0u); |
| 217 | } |
| 218 | ``` |
| 219 | |
| 220 | Writing an array to a byte span with `copy_from()`: |
| 221 | ```cc |
| 222 | void write_floats(base::span<uint8_t> out, std::vector<const float> floats) { |
| 223 | base::span<const uint8_t> byte_floats = base::as_byte_span(floats); |
| 224 | // Or skip the first() if you want to CHECK at runtime that all of `out` has |
| 225 | // been written to. |
| 226 | out.first(byte_floats.size()).copy_from(byte_floats); |
| 227 | } |
| 228 | ``` |
| 229 | |
| 230 | ### Reading from a byte span |
| 231 | |
| 232 | Instead of turning a `span<const uint8_t>` into a pointer of a larger type, |
| 233 | which can cause Undefined Behaviour, read values out of the byte span and |
| 234 | convert each one as a value (not as a pointer). |
| 235 | |
| 236 | Using `subspan()` and endian-aware conversion `FloatFromLittleEndian`: |
| 237 | ```cc |
| 238 | void read_floats(base::span<const uint8_t> in, float& f1, float& f2) { |
| 239 | f1 = base::FloatFromLittleEndian(in.subspan<0, 4>()); |
| 240 | f2 = base::FloatFromLittleEndian(in.subspan<4, 4>()); |
| 241 | } |
| 242 | ``` |
| 243 | |
| 244 | Using `SpanReader` and endian-aware `U32FromBigEndian()`: |
| 245 | ```cc |
| 246 | void read_values(base::span<const uint8_t> in, int& i1, int& i2, int& i3) { |
| 247 | auto reader = base::SpanReader(in); |
| 248 | i1 = base::U32FromBigEndian(*reader.Read<4>()); |
| 249 | i2 = base::U32FromBigEndian(*reader.Read<4>()); |
| 250 | // SpanReader has a built-in shortcut to do the same thing. |
| 251 | CHECK(reader.ReadU32BigEndian(i3)); |
| 252 | // Verify we read the whole input. We can put a size parameter in the `in` |
| 253 | // span to push this check to compile-time when it's a constant. |
| 254 | CHECK_EQ(reader.remaining(), 0u); |
| 255 | } |
| 256 | ``` |
| 257 | |
| 258 | ## Patterns for spanification |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 259 | |
| 260 | Most pointer issues ought to be resolved by converting to containers. In |
| 261 | particular, one common conversion is to replace `T*` pointers with |
| 262 | `base::span<T>` in a process known as spanification, since most pointers |
| 263 | are unowned references into an array (or vector). The appropriate |
| 264 | replacement for the pointer is |
danakj | e7db1e3f3 | 2024-04-16 20:43:24 | [diff] [blame] | 265 | [`base::span`](../base/containers/span.h). |
danakj | 4e625fb | 2024-03-06 20:47:46 | [diff] [blame] | 266 | |
danakj | 6a49deaa | 2024-06-04 21:31:55 | [diff] [blame] | 267 | ### Copying arrays (`memcpy`) |
| 268 | |
| 269 | You have: |
| 270 | ```cc |
| 271 | uint8_t array1[12]; |
| 272 | uint8_t array2[16]; |
| 273 | uint64_t array3[2]; |
| 274 | memcpy(array1, array2 + 8, 4); |
| 275 | memcpy(array1 + 4, array3, 8); |
| 276 | ``` |
| 277 | |
| 278 | Spanified: |
| 279 | ```cc |
| 280 | uint8_t array1[12]; |
| 281 | uint8_t array2[16]; |
| 282 | uint64_t array3[2]; |
| 283 | base::span(array1).first(4u).copy_from(base::span(array2).subspan(8u, 4u)); |
| 284 | base::span(array1).subspan(4u).copy_from(base::as_byte_span(array3).first(8u)); |
| 285 | |
| 286 | // Use `split_at()` to ensure `array1` is fully written. |
| 287 | auto [from2, from3] = base::span(array1).split_at(4u); |
| 288 | from2.copy_from(base::span(array2).subspan(8u, 4u)); |
| 289 | from3.copy_from(base::as_byte_span(array3).first(8u)); |
| 290 | |
| 291 | // This can even be ensured at compile time (if sizes and offsets are all |
| 292 | // constants). |
| 293 | auto [from2, from3] = base::span(array1).split_at<4u>(); |
| 294 | from2.copy_from(base::span(array2).subspan<8u, 4u>()); |
| 295 | from3.copy_from(base::as_byte_span(array3).first<8u>()); |
| 296 | ``` |
| 297 | |
| 298 | ### Zeroing arrays (`memset`) |
| 299 | |
| 300 | `std::ranges::fill` works on any range/container and won't write out of |
| 301 | bounds. Converting arbitrary types into a byte array (through |
| 302 | `base::as_writable_byte_span`) is only valid when the type holds trivial |
| 303 | types such as primitives. Unlike `memset`, a constructed object can be |
| 304 | given as the value to use as in `std::ranges_fill(container, Object())`. |
| 305 | |
| 306 | You have: |
| 307 | ```cc |
| 308 | uint8_t array1[12]; |
| 309 | uint64_t array2[2]; |
| 310 | Object array3[4]; |
| 311 | memset(array1, 0, 12); |
| 312 | memset(array2, 0, 2 * sizeof(uint64_t)); |
| 313 | memset(array3, 0, 4 * sizeof(Object)); |
| 314 | ``` |
| 315 | |
| 316 | Spanified: |
| 317 | ```cc |
| 318 | uint8_t array1[12]; |
| 319 | uint64_t array2[2]; |
| 320 | Object array3[4]; |
| 321 | std::ranges::fill(array1, 0u); |
| 322 | std::ranges::fill(array2, 0u); |
| 323 | std::ranges::fill(base::as_writable_byte_span(array3), 0u); |
| 324 | ``` |
| 325 | |
| 326 | ### Comparing arrays (`memcmp`) |
| 327 | |
| 328 | You have: |
| 329 | ```cc |
| 330 | uint8_t array1[12] = {}; |
| 331 | uint8_t array2[12] = {}; |
| 332 | bool eq = memcmp(array1, array2, sizeof(array1)) == 0; |
danakj | 8d29e18 | 2024-06-05 16:50:18 | [diff] [blame] | 333 | bool less = memcmp(array1, array2, sizeof(array1)) < 0; |
danakj | 6a49deaa | 2024-06-04 21:31:55 | [diff] [blame] | 334 | |
| 335 | // In tests. |
| 336 | for (size_t i = 0; i < sizeof(array1); ++i) { |
| 337 | SCOPED_TRACE(i); |
| 338 | EXPECT_EQ(array1[i], array2[i]); |
| 339 | } |
| 340 | ``` |
| 341 | |
| 342 | Spanified: |
| 343 | ```cc |
| 344 | uint8_t array1[12] = {}; |
| 345 | uint8_t array2[12] = {}; |
| 346 | // If one side is a span, the other will convert to span too. |
| 347 | bool eq = base::span(array1) == array2; |
danakj | 8d29e18 | 2024-06-05 16:50:18 | [diff] [blame] | 348 | bool less = base::span(array1) < array2; |
danakj | 6a49deaa | 2024-06-04 21:31:55 | [diff] [blame] | 349 | |
| 350 | // In tests. |
| 351 | EXPECT_EQ(base::span(array1), array2); |
| 352 | ``` |
| 353 | |
| 354 | ### Copying array into an integer |
| 355 | |
| 356 | You have: |
| 357 | ```cc |
| 358 | uint8_t array[44] = {}; |
| 359 | uint32_t v1; |
| 360 | memcpy(&v1, array, sizeof(v1)); // Front. |
| 361 | uint64_t v2; |
| 362 | memcpy(&v2, array + 6, sizeof(v2)); // Middle. |
| 363 | ``` |
| 364 | |
| 365 | Spanified: |
| 366 | ```cc |
| 367 | #include "base/numerics/byte_conversions.h" |
| 368 | ... |
| 369 | uint8_t array[44] = {}; |
| 370 | uint32_t v1 = base::U32FromLittleEndian(base::span(array).first<4u>()); // Front. |
| 371 | uint64_t v2 = base::U64FromLittleEndian(base::span(array).subspan<6u, 8u>()); // Middle. |
| 372 | ``` |
| 373 | |
| 374 | ### Copy an array into an integer via cast |
| 375 | |
David Benjamin | 0ffdea7 | 2024-06-12 19:55:22 | [diff] [blame] | 376 | Note: This pattern is prone to more UB than out-of-bounds access. It is UB to |
| 377 | cast pointers if the result is not aligned, so these cases are often buggy or |
| 378 | were only correct due to subtle assumptions on buffer alignment. The spanified |
| 379 | version avoids this pitfalls. It has no alignment requirement. |
danakj | 6a49deaa | 2024-06-04 21:31:55 | [diff] [blame] | 380 | |
| 381 | You have: |
| 382 | ```cc |
| 383 | uint8_t array[44] = {}; |
| 384 | uint32_t v1 = *reinterpret_cast<const uint32_t*>(array); // Front. |
David Benjamin | 0ffdea7 | 2024-06-12 19:55:22 | [diff] [blame] | 385 | uint64_t v2 = *reinterpret_cast<const uint64_t*>(array + 16); // Middle. |
danakj | 6a49deaa | 2024-06-04 21:31:55 | [diff] [blame] | 386 | ``` |
| 387 | |
| 388 | Spanified: |
| 389 | ```cc |
| 390 | #include "base/numerics/byte_conversions.h" |
| 391 | ... |
| 392 | uint8_t array[44] = {}; |
| 393 | uint32_t v1 = base::U32FromLittleEndian(base::span(array).first<4u>()); // Front. |
David Benjamin | 0ffdea7 | 2024-06-12 19:55:22 | [diff] [blame] | 394 | uint64_t v2 = base::U64FromLittleEndian(base::span(array).subspan<16u, 8u>()); // Middle. |
danakj | 6a49deaa | 2024-06-04 21:31:55 | [diff] [blame] | 395 | ``` |
| 396 | |
| 397 | ### Making a byte array (`span<uint8_t>`) from a string (or any array/range) |
| 398 | |
| 399 | You have: |
| 400 | ```cc |
| 401 | std::string str = "hello world"; |
| 402 | func_with_const_ptr_size(reinterpret_cast<const uint8_t*>(str.data()), str.size()); |
| 403 | func_with_mut_ptr_size(reinterpret_cast<uint8_t*>(str.data()), str.size()); |
| 404 | ``` |
| 405 | |
| 406 | Spanified: |
| 407 | ```cc |
| 408 | std::string str = "hello world"; |
| 409 | base::span<const uint8_t> bytes = base::as_byte_span(str); |
| 410 | func_with_const_ptr_size(bytes.data(), bytes.size()); |
| 411 | base::span<uint8_t> mut_bytes = base::as_writable_byte_span(str); |
| 412 | func_with_mut_ptr_size(mut_bytes.data(), mut_bytes.size()); |
| 413 | |
| 414 | // Replace pointer and size with a span, though. |
| 415 | func_with_const_span(base::as_byte_span(str)); |
| 416 | func_with_mut_span(base::as_writable_byte_span(str)); |
| 417 | ``` |
| 418 | |
| 419 | ### Making a byte array (`span<uint8_t>`) from an object |
| 420 | |
| 421 | You have: |
| 422 | ```cc |
| 423 | uint8_t array[8]; |
| 424 | uint64_t val; |
| 425 | two_byte_arrays(array, reinterpret_cast<const uint8_t*>(&val)); |
| 426 | ``` |
| 427 | |
| 428 | Spanified: |
| 429 | ```cc |
| 430 | uint8_t array[8]; |
| 431 | uint64_t val; |
| 432 | base::span<uint8_t> val_span = base::byte_span_from_ref(val); |
| 433 | two_byte_arrays(array, val_span.data()); |
| 434 | |
| 435 | // Replace an unbounded pointer a span, though. |
| 436 | two_byte_spans(base::span(array), base::byte_span_from_ref(val)); |
| 437 | ``` |
danakj | e7db1e3f3 | 2024-04-16 20:43:24 | [diff] [blame] | 438 | |
danakj | b70695e | 2024-07-16 15:20:28 | [diff] [blame] | 439 | ### Avoid std::next() for silencing warnings, use ranges |
| 440 | |
| 441 | When we convert `pointer + index` to `std::next(pointer, index)` we silence the |
| 442 | `Wunsafe-buffer-usage` warning by pushing the unsafe pointer arithmetic into |
| 443 | the `std::next()` function in a system header, but we have the same unsafety. |
| 444 | `std::next()` does no additional bounds checking. |
| 445 | |
| 446 | Instead of using `std::next()`, rewrite away from using pointers (or iterators) |
| 447 | entirely by using ranges. `span()` allows us to take a subset of a contiguous |
| 448 | range without having to use iterators that we move with arithmetic or |
| 449 | `std::next()`. |
| 450 | |
danakj | e64eb29 | 2024-10-22 19:02:33 | [diff] [blame^] | 451 | Likewise, `std::advance()` can silence the warning but does not add any safety |
| 452 | to the pointer arithmetic and should be avoided. |
| 453 | |
danakj | b70695e | 2024-07-16 15:20:28 | [diff] [blame] | 454 | Instead of using pointer/iterator arithmetic: |
| 455 | ```cc |
| 456 | // Unsafe buffers warning on the unchecked arithmetic. |
| 457 | auto it = std::find(vec.begin() + offset, vec.end(), 20); |
| 458 | // No warning... But has the same security risk! |
| 459 | auto it = std::find(std::next(vec.begin(), offset), vec.end(), 20); |
| 460 | ``` |
| 461 | |
| 462 | Use a range, with `span()` providing a view of a subset of the range: |
| 463 | ```cc |
| 464 | auto it = std::ranges::find(base::span(vec).subspan(offset), 20); |
| 465 | ``` |
| 466 | |
danakj | 4e625fb | 2024-03-06 20:47:46 | [diff] [blame] | 467 | # Functions with array pointer parameters |
| 468 | |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 469 | Functions that receive a pointer argument into an array may read |
| 470 | or write out of bounds of that array if subsequent manual size |
| 471 | calculations are incorrect. Such functions should be avoided if |
| 472 | possible, or marked with the `UNSAFE_BUFFER_USAGE` attribute macro |
| 473 | otherwise. This macro propagates to their callers that they must |
| 474 | be called from inside an `UNSAFE_BUFFERS()` region (along with |
| 475 | a corresponding safety comment explaining how the caller knows |
| 476 | the call will be safe). |
danakj | 4e625fb | 2024-03-06 20:47:46 | [diff] [blame] | 477 | |
| 478 | The same is true for functions that accept an iterator instead |
danakj | e7db1e3f3 | 2024-04-16 20:43:24 | [diff] [blame] | 479 | of a range type. Some examples of each are `memcpy()` and |
| 480 | `std::copy()`. |
danakj | 4e625fb | 2024-03-06 20:47:46 | [diff] [blame] | 481 | |
Tom Sepez | 9e7f696 | 2024-06-13 00:21:59 | [diff] [blame] | 482 | Again, calling such functions is unsafe and should be avoided. |
| 483 | Replace such functions with an API built on base::span |
danakj | 4e625fb | 2024-03-06 20:47:46 | [diff] [blame] | 484 | or other range types which prevents any chance of OOB memory |
| 485 | access. For instance, replace `memcpy()`, `std::copy()` and |
| 486 | `std::ranges::copy()` with `base::span::copy_from()`. And |
| 487 | replace `memset()` with `std::ranges::fill()`. |