    0day-RCAs/2021/CVE-2021-39793.md
     1 +# CVE-2022-22706 / CVE-2021-39793: Mali GPU driver makes read-only imported pages host-writable
     2 +*Jann Horn*
     3 + 
     4 +## The Basics
     5 + 
     6 +**Disclosure or Patch Date:** March 7, 2022
     7 + 
     8 +**Product:** Arm Mali GPU driver for Linux/Android
     9 + 
     10 +**Advisory:**
     11 + 
     12 + - from Arm (upstream): https://developer.arm.com/Arm%20Security%20Center/Mali%20GPU%20Driver%20Vulnerabilities
     13 + - from Google Pixel: https://source.android.com/security/bulletin/pixel/2022-03-01#pixel
     14 + 
     15 +**Affected Versions:** see Arm advisory (note that the affected version range
     16 +for the Bifrost version of the related CVE-2021-28664 seems to be off-by-one)
     17 + 
     18 +**First Patched Version:**
     19 + 
     20 + - for Arm: see Arm advisory
     21 + - for Pixel: patch level 2022-03-05
     22 + 
     23 +**Issue/Bug Report:** N/A
     24 + 
     25 +**Patch CL:** https://android.googlesource.com/kernel/google-modules/gpu/+/5381ff7b4106b277ff207396e293ede2bf959f0c%5E%21/
     26 + 
     27 +**Bug-Introducing CL:** N/A, Arm usually only publishes driver versions as tarballs
     28 + 
     29 +**Reporter(s):** unknown
     30 + 
     31 +## The Code
     32 + 
     33 +**Proof-of-concept:**
     34 + 
     35 +**Exploit sample:** N/A
     36 + 
     37 +**Did you have access to the exploit sample when doing the analysis?** no
     38 + 
     39 +## The Vulnerability
     40 + 
     41 +**Bug class:** Broken access control logic
     42 + 
     43 +**Vulnerability details:**
     44 + 
     45 +The out-of-tree Mali driver allows userspace to create GPU memory objects
     46 +from host-virtual memory areas using the memory type
     47 +`KBASE_MEM_TYPE_IMPORTED_USER_BUF`, which grabs page references using
     48 +`pin_user_pages_remote()` (or `get_user_pages_remote()` on older kernels).
     49 +I think this is somewhat frowned upon in upstream GPU drivers nowadays; for
     50 +comparison, the upstream Intel GPU driver `i915` has a similar mechanism under
     51 +the name `userptr`, but the function `i915_gem_userptr_ioctl` implementing this
     52 +interface has the following comment on top of it:
     53 + 
     54 +https://elixir.bootlin.com/linux/v5.18.14/source/drivers/gpu/drm/i915/gem/i915_gem_userptr.c#L477
     55 +```
     56 + * Also note, that the object created here is not currently a "first class"
     57 + * object, in that several ioctls are banned. These are the CPU access
     58 + * ioctls: mmap(), pwrite and pread. In practice, you are expected to use
     59 + * direct access via your pointer rather than use those ioctls. Another
     60 + * restriction is that we do not allow userptr surfaces to be pinned to the
     61 + * hardware and so we reject any attempt to create a framebuffer out of a
     62 + * userptr.
     63 + *
     64 + * If you think this is a good interface to use to pass GPU memory between
     65 + * drivers, please use dma-buf instead. In fact, wherever possible use
     66 + * dma-buf instead.
     67 +```
     68 + 
     69 +Unlike i915, the Mali driver lets host userspace create a GPU memory object
     70 +from a userspace memory area and then also access that object from host
     71 +userspace through a separate mapping.
     72 + 
     73 +The driver uses flags on the GPU memory object to track access permissions:
     74 + 
     75 + - `KBASE_REG_GPU_RD` and `KBASE_REG_GPU_WR` for read / write access from jobs
     76 + running on the GPU through GPU-virtual addresses; this mainly works by
     77 + controlling the `ENTRY_ACCESS_RW` and `ENTRY_ACCESS_RO` bits in the GPU page
     78 + tables
     79 + - `KBASE_REG_CPU_RD` and `KBASE_REG_CPU_WR` for read / write access from
     80 + host kernel code (on behalf of host userspace) and host userspace;
     81 + these flags affect VMA permission flags in the host kernel (which control
     82 + permission bits in host page tables) and are also used for explicit
     83 + permission checks in kernel code
     84 + 
     85 +However, in vulnerable versions of the driver, `kbase_jd_user_buf_pin_pages()`
     86 +only checks the `KBASE_REG_GPU_WR` flag to determine whether
     87 +`pin_user_pages_remote()` should request write access, and wrongly ignores
     88 +the `KBASE_REG_CPU_WR` flag. The fix is essentially (with lots of duplicate
     89 +changes to handle different kernel versions):
     90 +```diff
     91 +@@ -4556,65 +4557,62 @@ int kbase_jd_user_buf_pin_pages(struct kbase_context *kctx,
     92 + struct kbase_va_region *reg)
     93 + {
     94 + struct kbase_mem_phy_alloc *alloc = reg->gpu_alloc;
     95 + struct page **pages = alloc->imported.user_buf.pages;
     96 + unsigned long address = alloc->imported.user_buf.address;
     97 + struct mm_struct *mm = alloc->imported.user_buf.mm;
     98 + long pinned_pages;
     99 + long i;
     100 ++ int write;
     101 +
     102 + if (WARN_ON(alloc->type != KBASE_MEM_TYPE_IMPORTED_USER_BUF))
     103 + return -EINVAL;
     104 +
     105 + if (alloc->nents) {
     106 + if (WARN_ON(alloc->nents != alloc->imported.user_buf.nr_pages))
     107 + return -EINVAL;
     108 + else
     109 + return 0;
     110 + }
     111 +
     112 + if (WARN_ON(reg->gpu_alloc->imported.user_buf.mm != current->mm))
     113 + return -EINVAL;
     114 +
     115 ++ write = reg->flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR);
     116 ++
     117 +[...]
     118 + pinned_pages = pin_user_pages_remote(
     119 + mm, address, alloc->imported.user_buf.nr_pages,
     120 +- reg->flags & KBASE_REG_GPU_WR ? FOLL_WRITE : 0, pages, NULL,
     121 +- NULL);
     122 ++ write ? FOLL_WRITE : 0, pages, NULL, NULL);
     123 +[...]
     124 +```
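
To make the effect of the one-line change concrete, here is a minimal userspace sketch of the two predicates. It is not driver code: the bit values chosen for `KBASE_REG_CPU_WR`, `KBASE_REG_GPU_WR` and `FOLL_WRITE` are placeholders for illustration, and the helper names `gup_flags_vulnerable()`, `gup_flags_fixed()` and `pin_is_too_weak()` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration; the real ones live in the
 * driver and kernel headers. */
#define KBASE_REG_CPU_WR (1ul << 0)
#define KBASE_REG_GPU_WR (1ul << 1)
#define FOLL_WRITE       0x01u

/* Vulnerable logic: only GPU write access leads to a write pin. */
static unsigned int gup_flags_vulnerable(unsigned long reg_flags)
{
	return (reg_flags & KBASE_REG_GPU_WR) ? FOLL_WRITE : 0;
}

/* Fixed logic: write access from either side leads to a write pin. */
static unsigned int gup_flags_fixed(unsigned long reg_flags)
{
	unsigned long write = reg_flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR);

	return write ? FOLL_WRITE : 0;
}

/* True if the object is writable from some side, but its pages would be
 * looked up without requesting write access. */
static bool pin_is_too_weak(unsigned long reg_flags, unsigned int gup_flags)
{
	bool writable = (reg_flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR)) != 0;

	return writable && !(gup_flags & FOLL_WRITE);
}
```

For an object with only `KBASE_REG_CPU_WR` set, the vulnerable predicate yields no `FOLL_WRITE`, so `pin_is_too_weak()` reports a mismatch; with the fixed predicate the mismatch disappears.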
     125 + 
     126 + 
     127 +So in a vulnerable version, an attacker can write into pages that are mapped
     128 +read-only (for example, pages from shared libraries) as follows:
     129 + 
     130 + - Map some page from a shared library as read-only
     131 + - Create a Mali `KBASE_MEM_TYPE_IMPORTED_USER_BUF` with `KBASE_REG_CPU_WR` but
     132 + without `KBASE_REG_GPU_WR` from the victim page mapping; this involves
     133 + creating a host-side VMA for the Mali memory object.
     134 + The buffer has to be created in a way that doesn't set
     135 + `KBASE_REG_SHARE_BOTH`.
     136 + - Trigger `kbase_jd_user_buf_pin_pages()` on this memory object (either via
     137 + `KBASE_IOCTL_KCPU_QUEUE_ENQUEUE` with a `BASE_KCPU_COMMAND_TYPE_MAP_IMPORT`
     138 + command, or by submitting an atom with `BASE_JD_REQ_EXTERNAL_RESOURCES`) to
     139 + execute the incorrect `get_user_pages()` call
     140 + - Write into the Mali memory object from host userspace
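
The logic behind these steps can be modelled in a few lines of plain C. This is a toy model only and does not talk to any driver: every `toy_*` name is hypothetical, and it assumes (as is the case for ordinary GUP lookups without `FOLL_FORCE`) that a write pin on a read-only mapping is refused, while a read pin simply returns the underlying page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CPU_WR (1u << 0)
#define TOY_GPU_WR (1u << 1)

struct toy_page { char data[16]; };

/* A host mapping of a page, e.g. a read-only shared library mapping. */
struct toy_vma { struct toy_page *page; bool writable; };

/* Stand-in for an imported user buffer object. */
struct toy_mem_object { unsigned int flags; struct toy_page *pinned; };

/* Stand-in for a GUP-style lookup: a write pin on a read-only mapping
 * fails, a read pin hands back the page itself. */
static struct toy_page *toy_pin(const struct toy_vma *vma, bool want_write)
{
	if (want_write && !vma->writable)
		return NULL;
	return vma->page;
}

/* Deferred pinning step; `fixed` selects which flags are consulted. */
static int toy_pin_pages(struct toy_mem_object *obj,
			 const struct toy_vma *vma, bool fixed)
{
	unsigned int mask = fixed ? (TOY_CPU_WR | TOY_GPU_WR) : TOY_GPU_WR;

	obj->pinned = toy_pin(vma, (obj->flags & mask) != 0);
	return obj->pinned ? 0 : -1;
}

/* Host-side write through the memory object, gated only on TOY_CPU_WR. */
static int toy_cpu_write(struct toy_mem_object *obj, size_t off, char value)
{
	if (!(obj->flags & TOY_CPU_WR) || !obj->pinned ||
	    off >= sizeof(obj->pinned->data))
		return -1;
	obj->pinned->data[off] = value;
	return 0;
}
```

With `fixed == false`, an object carrying only `TOY_CPU_WR` pins the read-only page successfully and `toy_cpu_write()` then modifies it. With `fixed == true`, the pin is refused up front, so the later write has nothing to act on.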
     141 + 
     142 +**Patch analysis:**
     143 + 
     144 +The patch addresses the remaining site that was missed in
     145 +the CVE-2021-28664 fix (see below).
     146 +At this point, I see no remaining places in the driver that look up page
     147 +pointers with access flags that don't match the corresponding Mali memory
     148 +object.
     149 + 
     150 +**Thoughts on how this vuln might have been found:**
     151 + 
     152 +This vulnerability is a straightforward variant of a previous Mali bug,
     153 +CVE-2021-28664, which was fixed as follows around 10 months earlier
     154 +(from the diff between Mali Bifrost r29p0 and r30p0):
     155 +```diff
     156 + static struct kbase_va_region *kbase_mem_from_user_buffer(
     157 + struct kbase_context *kctx, unsigned long address,
     158 + unsigned long size, u64 *va_pages, u64 *flags)
     159 + {
     160 +[...]
     161 ++ int write;
     162 +[...]
     163 ++ write = reg->flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR);
     164 ++
     165 + #if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
     166 + faulted_pages = get_user_pages(current, current->mm, address, *va_pages,
     167 + #if KERNEL_VERSION(4, 4, 168) <= LINUX_VERSION_CODE && \
     168 + KERNEL_VERSION(4, 5, 0) > LINUX_VERSION_CODE
     169 +- reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0,
     170 +- pages, NULL);
     171 ++ write ? FOLL_WRITE : 0, pages, NULL);
     172 + #else
     173 +- reg->flags & KBASE_REG_CPU_WR, 0, pages, NULL);
     174 ++ write, 0, pages, NULL);
     175 + #endif
     176 + #elif KERNEL_VERSION(4, 9, 0) > LINUX_VERSION_CODE
     177 + faulted_pages = get_user_pages(address, *va_pages,
     178 +- reg->flags & KBASE_REG_CPU_WR, 0, pages, NULL);
     179 ++ write, 0, pages, NULL);
     180 + #else
     181 + faulted_pages = get_user_pages(address, *va_pages,
     182 +- reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0,
     183 +- pages, NULL);
     184 ++ write ? FOLL_WRITE : 0, pages, NULL);
     185 + #endif
     186 +```
     187 + 
     188 +This is very similar to the patch linked above - essentially, this was a bug in
     189 +duplicated code, and only one instance of it was patched.
     190 +Both copies of the code call `get_user_pages()` to grab page references for a
     191 +`KBASE_MEM_TYPE_IMPORTED_USER_BUF` memory object, and both of them wrongly
     192 +ignored `KBASE_REG_CPU_WR`. The only difference between them is that one copy is
     193 +for the case where pages are pinned directly at object creation, while the other
     194 +copy handles the case where pages are pinned at a later point.
     195 +Which one of these codepaths is used depends on the `KBASE_REG_SHARE_BOTH` flag.
     196 + 
     197 +It seems likely that an attacker could have discovered this issue by looking at
     198 +the fix for CVE-2021-28664 and searching for other `get_user_pages()` callers
     199 +in the Mali driver.
     200 + 
     201 +There has also been at least one very similar issue in an upstream graphics
     202 +driver: https://git.kernel.org/linus/cd5297b0855f
     203 + 
     204 +**(Historical/present/future) context of bug:**
     205 + 
     206 +See previous section. Additionally:
     207 + 
     208 +Looking through the list of public Mali bugs for issues described as
     209 +_"Mali GPU Kernel Driver elevates CPU RO pages to writable"_, there is a third
     210 +bug CVE-2021-44828 with this description. This bug doesn't involve
     211 +`get_user_pages()`, but it does again involve a missing check for the
     212 +`KBASE_REG_CPU_WR` flag.
     213 + 
     214 +Various functions across the driver (`kbase_kcpu_jit_allocate_process()`,
     215 +`kbasep_write_soft_event_status()` and `kbase_jit_allocate_process()`) would
     216 +write to Mali memory objects on behalf of the user, but instead of doing this
     217 +by directly writing to corresponding userspace-virtual addresses, they map the
     218 +corresponding page into kernel-virtual memory using `kbase_vmap()`, then write
     219 +to this kernel-virtual address.
     220 +The bug was that there was no check to ensure that the Mali memory object was
     221 +actually marked as writable using `KBASE_REG_CPU_WR`.
     222 +This was addressed by instead using `kbase_vmap_prot()`, which performs the
     223 +necessary access check.
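
The difference between the two mapping helpers can be sketched as follows. This is a simplified userspace model, not the driver's implementation: the `toy_*` names, the flag values and the signatures are all hypothetical, and the real `kbase_vmap_prot()` takes different arguments.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_REG_CPU_RD (1u << 0)
#define TOY_REG_CPU_WR (1u << 1)

struct toy_region { unsigned int flags; char backing[8]; };

/* Unchecked mapping: returns the backing memory regardless of flags. */
static char *toy_vmap(struct toy_region *reg)
{
	return reg->backing;
}

/* Checked mapping: every requested permission bit must be set on the
 * region, otherwise no mapping is handed out. */
static char *toy_vmap_prot(struct toy_region *reg, unsigned int prot_request)
{
	if ((reg->flags & prot_request) != prot_request)
		return NULL;
	return reg->backing;
}

/* Kernel-side write on behalf of the user, using the checked variant. */
static int toy_write_status(struct toy_region *reg, char status)
{
	char *p = toy_vmap_prot(reg, TOY_REG_CPU_WR);

	if (!p)
		return -1;
	p[0] = status;
	return 0;
}
```

A region with only `TOY_REG_CPU_RD` can still be mapped through `toy_vmap()`, which is the shape of the bug; `toy_write_status()` refuses it because the checked variant returns `NULL`.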
     224 + 
     225 +## The Exploit
     226 + 
     227 +(The terms *exploit primitive*, *exploit strategy*, *exploit technique*, and *exploit flow* are [defined here](https://googleprojectzero.blogspot.com/2020/06/a-survey-of-recent-ios-kernel-exploits.html).)
     228 + 
     229 +**Exploit strategy (or strategies):**
     230 + 
     231 +**Exploit flow:**
     232 + 
     233 +**Known cases of the same exploit flow:**
     234 + 
     235 +**Part of an exploit chain?**
     236 + 
     237 +## The Next Steps
     238 + 
     239 +### Variant analysis
     240 + 
     241 +**Areas/approach for variant analysis (and why):**
     242 + 
     243 + - Audit permission flag checks in Mali and other GPU drivers for memory
     244 + imported via `get_user_pages()`. (**TODO**)
     245 + 
     246 +**Found variants:**
     247 + 
     248 +### Structural improvements
     249 + 
     250 +What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?
     251 + 
     252 +**Ideas to kill the bug class:**
     253 + 
     254 + - Maybe get rid of the `get_user_pages`-based interface if it's unnecessary,
     255 + since having `KBASE_MEM_TYPE_IMPORTED_USER_BUF` makes the impact of these
     256 + types of bugs much worse?
     257 + 
     258 +**Ideas to mitigate the exploit flow:**
     259 + 
     260 +**Other potential improvements:**
     261 + 
     262 +### 0-day detection methods
     263 + 
     264 +What are potential detection methods for similar 0-days? Meaning are there any ideas of how this exploit or similar exploits could be detected **as a 0-day**?
     265 + 
     266 +## Other References
     267 + 