1
0
Fork 0
mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-24 01:09:38 -05:00
linux/kernel/Kconfig.kexec
Eric DeVolder a72bbec70d crash: hotplug support for kexec_load()
The hotplug support for kexec_load() requires changes to the userspace
kexec-tools and a little extra help from the kernel.

Given a kdump capture kernel loaded via kexec_load(), and a subsequent
hotplug event, the crash hotplug handler finds the elfcorehdr and rewrites
it to reflect the hotplug change.  That is the desired outcome, however,
at kernel panic time, the purgatory integrity check fails (because the
elfcorehdr changed), and the capture kernel does not boot and no vmcore is
generated.

Therefore, the userspace kexec-tools/kexec must indicate to the kernel
that the elfcorehdr can be modified (because the kexec excluded the
elfcorehdr from the digest, and sized the elfcorehdr memory buffer
appropriately).

To facilitate hotplug support with kexec_load():
 - a new kexec flag KEXEC_UPATE_ELFCOREHDR indicates that it is
   safe for the kernel to modify the kexec_load()'d elfcorehdr
 - the /sys/kernel/crash_elfcorehdr_size node communicates the
   preferred size of the elfcorehdr memory buffer
 - The sysfs crash_hotplug nodes (ie.
   /sys/devices/system/[cpu|memory]/crash_hotplug) dynamically
   take into account kexec_file_load() vs kexec_load() and
   KEXEC_UPDATE_ELFCOREHDR.
   This is critical so that the udev rule processing of crash_hotplug
   is all that is needed to determine if the userspace unload-then-load
   of the kdump image is to be skipped, or not. The proposed udev
   rule change looks like:
   # The kernel updates the crash elfcorehdr for CPU and memory changes
   SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
   SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

The table below indicates the behavior of kexec_load()'d kdump image
updates (with the new udev crash_hotplug rule in place):

 Kernel |Kexec
 -------+-----+----
 Old    |Old  |New
        |  a  | a
 -------+-----+----
 New    |  a  | b
 -------+-----+----

where kexec 'old' and 'new' delineate kexec-tools has the needed
modifications for the crash hotplug feature, and kernel 'old' and 'new'
delineate the kernel supports this crash hotplug feature.

Behavior 'a' indicates the unload-then-reload of the entire kdump image. 
For the kexec 'old' column, the unload-then-reload occurs due to the
missing flag KEXEC_UPDATE_ELFCOREHDR.  An 'old' kernel (with 'new' kexec)
does not present the crash_hotplug sysfs node, which leads to the
unload-then-reload of the kdump image.

Behavior 'b' indicates the desired optimized behavior of the kernel
directly modifying the elfcorehdr and avoiding the unload-then-reload of
the kdump image.

If the udev rule is not updated with crash_hotplug node check, then no
matter any combination of kernel or kexec is new or old, the kdump image
continues to be unload-then-reload on hotplug changes.

To fully support crash hotplug feature, there needs to be a rollout of
kernel, kexec-tools and udev rule changes.  However, the order of the
rollout of these pieces does not matter; kexec_load()'d kdump images still
function for hotplug as-is.

Link: https://lkml.kernel.org/r/20230814214446.6659-7-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Suggested-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:25:14 -07:00

150 lines
5.1 KiB
Text

# SPDX-License-Identifier: GPL-2.0-only
menu "Kexec and crash features"
config CRASH_CORE
bool
config KEXEC_CORE
select CRASH_CORE
bool
config KEXEC_ELF
bool
config HAVE_IMA_KEXEC
bool
config KEXEC
bool "Enable kexec system call"
depends on ARCH_SUPPORTS_KEXEC
select KEXEC_CORE
help
kexec is a system call that implements the ability to shutdown your
current kernel, and to start another kernel. It is like a reboot
but it is independent of the system firmware. And like a reboot
you can start any kernel with it, not just Linux.
The name comes from the similarity to the exec system call.
It is an ongoing process to be certain the hardware in a machine
is properly shutdown, so do not be surprised if this code does not
initially work for you. As of this writing the exact hardware
interface is strongly in flux, so no good recommendation can be
made.
config KEXEC_FILE
bool "Enable kexec file based system call"
depends on ARCH_SUPPORTS_KEXEC_FILE
select KEXEC_CORE
help
This is new version of kexec system call. This system call is
file based and takes file descriptors as system call argument
for kernel and initramfs as opposed to list of segments as
accepted by kexec system call.
config KEXEC_SIG
bool "Verify kernel signature during kexec_file_load() syscall"
depends on ARCH_SUPPORTS_KEXEC_SIG
depends on KEXEC_FILE
help
This option makes the kexec_file_load() syscall check for a valid
signature of the kernel image. The image can still be loaded without
a valid signature unless you also enable KEXEC_SIG_FORCE, though if
there's a signature that we can check, then it must be valid.
In addition to this option, you need to enable signature
verification for the corresponding kernel image type being
loaded in order for this to work.
config KEXEC_SIG_FORCE
bool "Require a valid signature in kexec_file_load() syscall"
depends on ARCH_SUPPORTS_KEXEC_SIG_FORCE
depends on KEXEC_SIG
help
This option makes kernel signature verification mandatory for
the kexec_file_load() syscall.
config KEXEC_IMAGE_VERIFY_SIG
bool "Enable Image signature verification support (ARM)"
default ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG
depends on ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG
depends on KEXEC_SIG
depends on EFI && SIGNED_PE_FILE_VERIFICATION
help
Enable Image signature verification support.
config KEXEC_BZIMAGE_VERIFY_SIG
bool "Enable bzImage signature verification support"
depends on ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG
depends on KEXEC_SIG
depends on SIGNED_PE_FILE_VERIFICATION
select SYSTEM_TRUSTED_KEYRING
help
Enable bzImage signature verification support.
config KEXEC_JUMP
bool "kexec jump"
depends on ARCH_SUPPORTS_KEXEC_JUMP
depends on KEXEC && HIBERNATION
help
Jump between original kernel and kexeced kernel and invoke
code in physical address mode via KEXEC
config CRASH_DUMP
bool "kernel crash dumps"
depends on ARCH_SUPPORTS_CRASH_DUMP
depends on ARCH_SUPPORTS_KEXEC
select CRASH_CORE
select KEXEC_CORE
select KEXEC
help
Generate crash dump after being started by kexec.
This should be normally only set in special crash dump kernels
which are loaded in the main kernel with kexec-tools into
a specially reserved region and then later executed after
a crash by kdump/kexec. The crash dump kernel must be compiled
to a memory address not used by the main kernel or BIOS using
PHYSICAL_START, or it must be built as a relocatable image
(CONFIG_RELOCATABLE=y).
For more details see Documentation/admin-guide/kdump/kdump.rst
For s390, this option also enables zfcpdump.
See also <file:Documentation/s390/zfcpdump.rst>
config CRASH_HOTPLUG
bool "Update the crash elfcorehdr on system configuration changes"
default y
depends on CRASH_DUMP && (HOTPLUG_CPU || MEMORY_HOTPLUG)
depends on ARCH_SUPPORTS_CRASH_HOTPLUG
help
Enable direct update to the crash elfcorehdr (which contains
the list of CPUs and memory regions to be dumped upon a crash)
in response to hot plug/unplug or online/offline of CPUs or
memory. This is a much more advanced approach than userspace
attempting that.
If unsure, say Y.
config CRASH_MAX_MEMORY_RANGES
int "Specify the maximum number of memory regions for the elfcorehdr"
default 8192
depends on CRASH_HOTPLUG
help
For the kexec_file_load() syscall path, specify the maximum number of
memory regions that the elfcorehdr buffer/segment can accommodate.
These regions are obtained via walk_system_ram_res(); eg. the
'System RAM' entries in /proc/iomem.
This value is combined with NR_CPUS_DEFAULT and multiplied by
sizeof(Elf64_Phdr) to determine the final elfcorehdr memory buffer/
segment size.
The value 8192, for example, covers a (sparsely populated) 1TiB system
consisting of 128MiB memblocks, while resulting in an elfcorehdr
memory buffer/segment size under 1MiB. This represents a sane choice
to accommodate both baremetal and virtual machine configurations.
For the kexec_load() syscall path, CRASH_MAX_MEMORY_RANGES is part of
the computation behind the value provided through the
/sys/kernel/crash_elfcorehdr_size attribute.
endmenu