# GWP-ASan

GWP-ASan is a debug tool intended to detect heap memory errors in the wild. It
samples allocations to a debug allocator, similar to ElectricFence or Page Heap,
causing memory errors to crash and report additional debugging context about
the error.

It is also known by its recursive backronym, GWP-ASan Will Provide Allocation
Sanity.

To read a more in-depth explanation of GWP-ASan see [this
post](https://www.chromium.org/Home/chromium-security/articles/gwp-asan).

## Allocator

The GuardedPageAllocator returns allocations on pages buffered on both sides by
guard pages. The allocations are either left- or right-aligned to detect buffer
overflows and underflows. When an allocation is freed, the page is marked
inaccessible so use-after-frees cause an exception (until that page is reused
for another allocation.)
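
The placement described above can be sketched as follows. This is a minimal illustration, not the GuardedPageAllocator's actual code: the page size, the alternating guard/slot layout, and the names `slot_page_address` and `place_allocation` are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size; real platforms may differ


def slot_page_address(region_base, slot):
    """Address of the usable page for `slot`.

    Assumed layout: guard, slot 0, guard, slot 1, ..., guard.
    """
    return region_base + (2 * slot + 1) * PAGE_SIZE


def place_allocation(region_base, slot, size, align, right_aligned):
    """Return the address handed out for an allocation in `slot`.

    `align` is assumed to be a power of two no larger than PAGE_SIZE.
    """
    if not 0 < size <= PAGE_SIZE:
        raise ValueError("only allocations of at most one page are placed")
    page = slot_page_address(region_base, slot)
    if not right_aligned:
        # Left-aligned: an underflow immediately hits the preceding guard page.
        return page
    # Right-aligned: push the allocation against the following guard page,
    # then round down so the returned pointer still honors the alignment.
    address = page + PAGE_SIZE - size
    return address - (address % align)
```

In this sketch, a 24-byte allocation with 16-byte alignment ends 8 bytes short of the guard page, so a small overflow into that gap would not fault.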

The allocator saves stack traces on every allocation and deallocation to
preserve debug context if that allocation results in a memory error.

The allocator implements a quarantine mechanism by allocating virtual memory for
more allocations than the total number of physical pages it can return at any
given time. The difference forms a rudimentary quarantine.

Because pages are re-used for allocations, it's possible that a long-lived
use-after-free will cause a crash long after the original allocation has been
replaced. In order to decrease the likelihood of incorrect stack traces being
reported, we allocate a lot of virtual memory but don't store metadata for every
allocation. That way, though we may not be able to report the metadata for an
old allocation, we will not report incorrect stack traces.
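
A toy model may help make the quarantine and metadata trade-off concrete. It is a sketch under stated assumptions rather than the real allocator: the class `ToyGuardedPool` and its methods are hypothetical, and it reuses slots and metadata entries in FIFO order purely to keep the example deterministic.

```python
from collections import deque


class ToyGuardedPool:
    """Toy model: more slots than live allocations, fewer metadata entries than slots."""

    def __init__(self, total_pages, max_allocations, max_metadata):
        assert max_allocations <= max_metadata <= total_pages
        self.max_allocations = max_allocations
        self.live = 0
        # Freed slots go to the back of the queue, so they sit unused
        # (quarantined) while other slots are handed out first.
        self.free_slots = deque(range(total_pages))
        self.free_metadata = deque(range(max_metadata))
        self.metadata = [None] * max_metadata
        self.slot_to_metadata = {}

    def allocate(self, alloc_trace):
        if self.live == self.max_allocations:
            return None  # pool exhausted; the caller uses the normal allocator
        slot = self.free_slots.popleft()
        index = self.free_metadata.popleft()
        stale = self.metadata[index]
        if stale is not None and self.slot_to_metadata.get(stale["slot"]) == index:
            # The entry is being recycled: forget the old slot's mapping so a
            # late use-after-free reports no trace rather than a wrong one.
            del self.slot_to_metadata[stale["slot"]]
        self.metadata[index] = {
            "slot": slot, "alloc_trace": alloc_trace, "free_trace": None}
        self.slot_to_metadata[slot] = index
        self.live += 1
        return slot

    def free(self, slot, free_trace):
        index = self.slot_to_metadata[slot]
        self.metadata[index]["free_trace"] = free_trace
        self.free_slots.append(slot)
        self.free_metadata.append(index)
        self.live -= 1

    def lookup(self, slot):
        """Metadata for `slot`, or None if it has been recycled."""
        index = self.slot_to_metadata.get(slot)
        return None if index is None else self.metadata[index]
```

With a single metadata entry, freeing slot 0 and then allocating again recycles that entry, so `lookup(0)` returns `None`. With two entries, the traces for slot 0 survive one more allocation.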

## Crash handler

The allocator is designed so that memory errors with GWP-ASan allocations
intentionally trigger invalid access exceptions. A hook in the crashpad crash
handler process inspects crashes, determines if they are GWP-ASan exceptions,
and adds additional debug information to the crash minidump if so.

The crash handler hook determines if the exception was related to GWP-ASan by
reading the allocator internals and seeing if the exception address was within
the bounds of the allocator region. If it is, the crash handler hook extracts
debug information about that allocation, such as thread IDs and stack traces
for allocation (and deallocation, if relevant) and writes it to the crash dump.
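
The bounds check described above could look roughly like the following. This is an illustrative sketch only: the function `classify_fault`, the page size, and the alternating guard/slot layout (guard, slot 0, guard, slot 1, ..., guard) are assumptions for the example and are not taken from the crash handler's source.

```python
PAGE_SIZE = 4096  # assumed page size


def classify_fault(fault_address, region_base, total_pages):
    """Classify a faulting address against a hypothetical allocator region.

    Returns None when the address is outside the region, meaning the crash
    is not related to GWP-ASan.
    """
    region_size = (2 * total_pages + 1) * PAGE_SIZE
    offset = fault_address - region_base
    if not 0 <= offset < region_size:
        return None
    page_index = offset // PAGE_SIZE
    if page_index % 2 == 1:
        # Odd pages hold allocations: a fault here suggests a use-after-free
        # on a slot that was freed and made inaccessible.
        return {"kind": "slot", "slot": page_index // 2}
    # Even pages are guard pages: the fault is an overflow from the slot on
    # the left or an underflow from the slot on the right.
    guard = page_index // 2
    return {
        "kind": "guard",
        "left_slot": guard - 1 if guard > 0 else None,
        "right_slot": guard if guard < total_pages else None,
    }
```

Once a slot has been identified, the hook would look up that slot's metadata, which is the step where the stored thread IDs and stack traces come from.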

The crash handler runs with elevated privileges so parsing information from a
lesser-privileged process is security sensitive. The GWP-ASan hook is specially
structured to minimize the amount of allocator logic it relies on and to
validate the allocator internals before reasoning about them.

## Status

GWP-ASan is implemented for malloc and PartitionAlloc. It is enabled by default
on Windows and macOS. The allocator parameters can be manually modified by using
an invocation like the following:

```shell
chrome --enable-features="GwpAsanMalloc<Study" \
       --force-fieldtrials=Study/Group1 \
       --force-fieldtrial-params=Study.Group1:MaxAllocations/128/MaxMetadata/255/TotalPages/4096/AllocationSamplingFrequency/1000/ProcessSamplingProbability/1.0
```

GWP-ASan is tuned more aggressively in canary/dev, to increase the likelihood we
catch newly introduced bugs, and for specific processes depending on the
particular allocator.

A [hotlist of bugs discovered by GWP-ASan](https://bugs.chromium.org/p/chromium/issues/list?can=1&q=Hotlist%3DGWP-ASan)
exists, though GWP-ASan crashes are filed Bug-Security, i.e. without external
visibility, by default.

## Limitations

- GWP-ASan is configured with a small fixed-size amount of memory, so
  long-lived allocations can quickly deplete the page pool and lead the
  allocator to run out of memory. Depending on the sampling frequency and
  distribution of allocation lifetimes this may lead to only allocations early
  in the process lifetime being sampled.
- Allocations over a page in size are not sampled.
- The allocator skips zero-size allocations. Zero-size allocations on some
  platforms return valid pointers and may be subject to lifetime and bounds
  issues.
- GWP-ASan does not intercept allocations for Oilpan or the v8 GC.
- GWP-ASan does not hook PDFium's fork of PartitionAlloc.
- Right-aligned allocations to catch overflows are not perfectly right-aligned,
  so small out-of-bounds accesses may be missed.
- GWP-ASan does not sample some early allocations that occur before field trial
  initialization.
- Depending on the platform, GWP-ASan may or may not hook malloc allocations
  that occur in code not linked directly against Chrome.

## Testing

There is [not yet](https://crbug.com/910751) a way to intentionally trigger a
GWP-ASan exception.

There is [not yet](https://crbug.com/910749) a way to inspect GWP-ASan data in
a minidump (crash report) without access to Google's crash service.

## Appendix: Probabilities

The question "shall we enable GWP-ASan at all in this process?" is
answered by

`base::RandDouble()` < `ProcessSamplingProbability` ×
`ProcessSamplingBoost2`

where

* 0.0 ≤ `ProcessSamplingProbability` ≤ 1.0,

* `ProcessSamplingBoost2` ≥ 1, and

* `base::RandDouble()` has range [0, 1).

The question "on average, how many allocations shall occur before
GWP-ASan takes a sample?" is answered by

`AllocationSamplingMultiplier` × (`AllocationSamplingRange`
∗∗ `base::RandDouble()`)

where

* `AllocationSamplingMultiplier` ≥ 1,

* `AllocationSamplingRange` ≥ 1, and

* the final expression is < `max(size_t)`.

As an example, on Linux, using the default parameters and
`base::RandDouble() == 0.5`, we get

1500 × (16 ∗∗ 0.5) = 6000
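
The two expressions above can be checked with a short script. It is a sketch only: the function names are hypothetical, and Python's `random.random()` stands in for `base::RandDouble()` because it also returns values in [0, 1).

```python
import random


def should_enable_for_process(probability, boost, rand=random.random):
    """Per-process decision: rand() < ProcessSamplingProbability x ProcessSamplingBoost2."""
    return rand() < probability * boost


def allocation_sampling_frequency(multiplier, sampling_range, rand=random.random):
    """Average allocations between samples: multiplier x (range ** rand())."""
    return multiplier * (sampling_range ** rand())


# Reproduce the worked example by fixing the random draw at 0.5.
print(allocation_sampling_frequency(1500, 16, rand=lambda: 0.5))  # 6000.0
```

Because the random draw is in the exponent, the resulting frequency in this example ranges from 1500 (draw of 0) up to, but not including, 1500 × 16 = 24000.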