mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-22 16:06:04 -05:00
linux/mm
Yang Shi f442fa6141 mm: gup: stop abusing try_grab_folio
A kernel warning was reported when pinning a folio in CMA memory while
launching an SEV virtual machine.  The splat looks like:

[  464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520
[  464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6
[  464.325477] RIP: 0010:__get_user_pages+0x423/0x520
[  464.325515] Call Trace:
[  464.325520]  <TASK>
[  464.325523]  ? __get_user_pages+0x423/0x520
[  464.325528]  ? __warn+0x81/0x130
[  464.325536]  ? __get_user_pages+0x423/0x520
[  464.325541]  ? report_bug+0x171/0x1a0
[  464.325549]  ? handle_bug+0x3c/0x70
[  464.325554]  ? exc_invalid_op+0x17/0x70
[  464.325558]  ? asm_exc_invalid_op+0x1a/0x20
[  464.325567]  ? __get_user_pages+0x423/0x520
[  464.325575]  __gup_longterm_locked+0x212/0x7a0
[  464.325583]  internal_get_user_pages_fast+0xfb/0x190
[  464.325590]  pin_user_pages_fast+0x47/0x60
[  464.325598]  sev_pin_memory+0xca/0x170 [kvm_amd]
[  464.325616]  sev_mem_enc_register_region+0x81/0x130 [kvm_amd]

Per the analysis done by yangge, starting the SEV virtual machine calls
pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the memory.  But the
page is in the CMA area, so fast GUP fails and then falls back to the slow
path due to the long-term pinnable check in try_grab_folio().

The slow path tries to pin the pages and then migrate them out of the CMA
area.  But the slow path also uses try_grab_folio() to pin the page, so it
fails on the same check and the above warning is triggered.
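
The failure sequence described above can be sketched as a small user-space
model.  This is only an illustration, not kernel code: struct toy_page,
TOY_FOLL_LONGTERM, and every toy_* function are hypothetical names, and
"migration" is reduced to clearing a flag.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FOLL_LONGTERM 0x1u	/* hypothetical flag value, not the kernel's */

/* Hypothetical stand-in for a page: only the fields the model needs. */
struct toy_page {
	bool in_cma;
	int refcount;
};

/* Models a grab helper that applies the long-term pinnable check. */
static bool toy_grab_checked(struct toy_page *p, unsigned int flags)
{
	if ((flags & TOY_FOLL_LONGTERM) && p->in_cma)
		return false;	/* CMA pages are refused for long-term pins */
	p->refcount++;
	return true;
}

/* Models a grab helper without that check. */
static void toy_grab_unchecked(struct toy_page *p)
{
	p->refcount++;
}

/*
 * Returns 0 on success, -1 where the warning would fire.
 * slow_path_checks selects whether the slow path reuses the checked helper
 * (the situation the report describes) or the unchecked one.
 */
static int toy_pin(struct toy_page *p, unsigned int flags,
		   bool slow_path_checks)
{
	assert(p->refcount > 0);	/* the model only handles live pages */

	if (toy_grab_checked(p, flags))
		return 0;		/* fast path succeeded */

	if (slow_path_checks)		/* same check, same failure */
		return toy_grab_checked(p, flags) ? 0 : -1;

	toy_grab_unchecked(p);		/* pin first ... */
	p->in_cma = false;		/* ... then "migrate" out of CMA */
	return 0;
}
```

With slow_path_checks set, a CMA page requested with TOY_FOLL_LONGTERM
fails in both paths, which is the case that ends in the warning.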

In addition, try_grab_folio() is supposed to be used in the fast path, and
it elevates the folio refcount with add-ref-unless-zero.  We are guaranteed
to have at least one stable reference in the slow path, so a simple atomic
add can be used.  The performance difference should be trivial, but the
misuse may be confusing and misleading.
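
The difference between the two ways of taking references can be sketched
with C11 atomics in user space.  This is a minimal illustration under
stated assumptions, not the kernel implementation: struct toy_folio,
toy_grab_fast(), and toy_grab_slow() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio: only the refcount is modeled. */
struct toy_folio {
	atomic_int refcount;
};

/*
 * Fast-path style: the caller holds no reference yet, so the count may
 * drop to zero concurrently.  Add refs only if the count is non-zero.
 */
static bool toy_grab_fast(struct toy_folio *f, int refs)
{
	int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&f->refcount, &old,
						 old + refs))
			return true;
	}
	return false;	/* already at zero: must not be revived */
}

/*
 * Slow-path style: the caller already holds a stable reference, so the
 * count cannot reach zero and a plain atomic add is enough.
 */
static void toy_grab_slow(struct toy_folio *f, int refs)
{
	assert(atomic_load(&f->refcount) > 0);	/* precondition of the model */
	atomic_fetch_add(&f->refcount, refs);
}
```

The compare-and-exchange loop is what makes the fast variant safe without a
prior reference; the slow variant skips it because its precondition already
rules out the zero case.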

Rename try_grab_folio() to try_grab_folio_fast(), and try_grab_page() to
try_grab_folio(), and use them in the proper paths.  This solves both the
abuse and the kernel warning.

The proper naming makes their use cases clearer and should prevent abuse
in the future.

peterx said:

: The user will see the pin fail; for gup-slow it further triggers the WARN
: right below that failure (as in the original report):
: 
:         folio = try_grab_folio(page, page_increm - 1,
:                                 foll_flags);
:         if (WARN_ON_ONCE(!folio)) { <------------------------ here
:                 /*
:                  * Release the 1st page ref if the
:                  * folio is problematic, fail hard.
:                  */
:                 gup_put_folio(page_folio(page), 1,
:                                 foll_flags);
:                 ret = -EFAULT;
:                 goto out;
:         }

[1] https://lore.kernel.org/linux-mm/1719478388-31917-1-git-send-email-yangge1116@126.com/

[shy828301@gmail.com: fix implicit declaration of function try_grab_folio_fast]
  Link: https://lkml.kernel.org/r/CAHbLzkowMSso-4Nufc9hcMehQsK9PNz3OSu-+eniU-2Mm-xjhA@mail.gmail.com
Link: https://lkml.kernel.org/r/20240628191458.2605553-1-yang@os.amperecomputing.com
Fixes: 57edfcfd34 ("mm/gup: accelerate thp gup even for "pages != NULL"")
Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
Reported-by: yangge <yangge1116@126.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>	[6.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-06 11:39:51 -07:00
damon mm/damon/core: merge regions aggressively when max_nr_regions is unmet 2024-07-03 22:40:36 -07:00
kasan kasan: fix bad call to unpoison_slab_object 2024-06-24 20:52:09 -07:00
kfence
kmsan kmsan: do not wipe out origin when doing partial unpoisoning 2024-06-05 19:19:25 -07:00
backing-dev.c
balloon_compaction.c
bootmem_info.c
cma.c
cma.h
cma_debug.c
cma_sysfs.c
compaction.c mm: handle profiling for fake memory allocations during compaction 2024-06-24 20:52:09 -07:00
debug.c
debug_page_alloc.c
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick 2024-06-15 10:43:08 -07:00
dmapool.c
dmapool_test.c
early_ioremap.c
execmem.c
fadvise.c
fail_page_alloc.c
failslab.c
filemap.c cachestat: do not flush stats in recency check 2024-07-03 22:40:37 -07:00
folio-compat.c
gup.c mm: gup: stop abusing try_grab_folio 2024-07-06 11:39:51 -07:00
gup_test.c
gup_test.h
highmem.c
hmm.c
huge_memory.c mm: gup: stop abusing try_grab_folio 2024-07-06 11:39:51 -07:00
hugetlb.c mm/hugetlb_vmemmap: fix race with speculative PFN walkers 2024-07-03 22:40:38 -07:00
hugetlb_cgroup.c
hugetlb_vmemmap.c mm/hugetlb_vmemmap: fix race with speculative PFN walkers 2024-07-03 22:40:38 -07:00
hugetlb_vmemmap.h
hwpoison-inject.c
init-mm.c
internal.h mm: gup: stop abusing try_grab_folio 2024-07-06 11:39:51 -07:00
interval_tree.c
io-mapping.c
ioremap.c
Kconfig
Kconfig.debug
khugepaged.c
kmemleak.c
ksm.c mm/ksm: fix ksm_zero_pages accounting 2024-06-05 19:19:26 -07:00
list_lru.c
maccess.c
madvise.c mseal: add mseal syscall 2024-05-23 19:40:26 -07:00
Makefile mseal: add mseal syscall 2024-05-23 19:40:26 -07:00
mapping_dirty_helpers.c
memblock.c memblock: use numa_valid_node() helper to check for invalid node ID 2024-06-16 10:17:57 +03:00
memcontrol.c mm: shmem: fix getting incorrect lruvec when replacing a shmem folio 2024-06-15 10:43:08 -07:00
memfd.c
memory-failure.c mm/memory-failure: fix handling of dissolved but not taken off from buddy pages 2024-05-24 11:55:08 -07:00
memory-tiers.c
memory.c mm/memory: don't require head page for do_set_pmd() 2024-06-24 20:52:11 -07:00
memory_hotplug.c
mempolicy.c
mempool.c mm: fix xyz_noprof functions calling profiled functions 2024-06-05 19:19:26 -07:00
memremap.c
memtest.c
migrate.c mm/migrate: make migrate_pages_batch() stats consistent 2024-06-24 20:52:10 -07:00
migrate_device.c
mincore.c
mlock.c
mm_init.c Revert "mm: init_mlocked_on_free_v3" 2024-06-15 10:43:05 -07:00
mm_slot.h
mmap.c mseal: add mseal syscall 2024-05-23 19:40:26 -07:00
mmap_lock.c
mmu_gather.c
mmu_notifier.c
mmzone.c
mprotect.c mseal: add mseal syscall 2024-05-23 19:40:26 -07:00
mremap.c mseal: add mseal syscall 2024-05-23 19:40:26 -07:00
mseal.c mseal: add mseal syscall 2024-05-23 19:40:26 -07:00
msync.c
nommu.c
oom_kill.c
page-writeback.c mm: avoid overflows in dirty throttling logic 2024-07-03 12:29:24 -07:00
page_alloc.c mm/page_alloc: Separate THP PCP into movable and non-movable categories 2024-06-24 20:52:11 -07:00
page_counter.c
page_ext.c
page_idle.c
page_io.c mm: drop the 'anon_' prefix for swap-out mTHP counters 2024-06-05 19:19:23 -07:00
page_isolation.c
page_owner.c
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c mm/page_table_check: fix crash on ZONE_DEVICE 2024-06-15 10:43:04 -07:00
page_vma_mapped.c
pagewalk.c
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c
pgalloc-track.h
pgtable-generic.c
process_vm_access.c
ptdump.c
readahead.c mm/readahead: limit page cache size in page_cache_ra_order() 2024-07-03 22:40:37 -07:00
rmap.c
rodata_test.c
secretmem.c
shmem.c mm/shmem: disable PMD-sized page cache if needed 2024-07-03 22:40:37 -07:00
shmem_quota.c
show_mem.c
shrinker.c
shrinker_debug.c
shuffle.c
shuffle.h
slab.h
slab_common.c
slub.c mm/slab: fix 'variable obj_exts set but not used' warning 2024-06-24 20:52:09 -07:00
sparse-vmemmap.c
sparse.c
swap.c
swap.h
swap_cgroup.c
swap_slots.c
swap_state.c
swapfile.c
truncate.c
usercopy.c
userfaultfd.c
util.c hardening fixes for v6.10-rc5 2024-06-17 12:00:22 -07:00
vmalloc.c mm: vmalloc: check if a hash-index is in cpu_possible_mask 2024-07-03 22:40:36 -07:00
vmpressure.c
vmscan.c mm: drop the 'anon_' prefix for swap-out mTHP counters 2024-06-05 19:19:23 -07:00
vmstat.c
workingset.c cachestat: do not flush stats in recency check 2024-07-03 22:40:37 -07:00
z3fold.c
zbud.c
zpool.c
zsmalloc.c
zswap.c