
Introduce the crash_hotplug attribute for memory and CPUs for use by
userspace.  These attributes directly facilitate the udev rule for
managing userspace re-loading of the crash kernel upon hot un/plug
changes.

For memory, expose the crash_hotplug attribute to the
/sys/devices/system/memory directory.  For example:

 # udevadm info --attribute-walk /sys/devices/system/memory/memory81
  looking at device '/devices/system/memory/memory81':
    KERNEL=="memory81"
    SUBSYSTEM=="memory"
    DRIVER==""
    ATTR{online}=="1"
    ATTR{phys_device}=="0"
    ATTR{phys_index}=="00000051"
    ATTR{removable}=="1"
    ATTR{state}=="online"
    ATTR{valid_zones}=="Movable"

  looking at parent device '/devices/system/memory':
    KERNELS=="memory"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{auto_online_blocks}=="offline"
    ATTRS{block_size_bytes}=="8000000"
    ATTRS{crash_hotplug}=="1"

For CPUs, expose the crash_hotplug attribute to the
/sys/devices/system/cpu directory.  For example:

 # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
  looking at device '/devices/system/cpu/cpu0':
    KERNEL=="cpu0"
    SUBSYSTEM=="cpu"
    DRIVER=="processor"
    ATTR{crash_notes}=="277c38600"
    ATTR{crash_notes_size}=="368"
    ATTR{online}=="1"

  looking at parent device '/devices/system/cpu':
    KERNELS=="cpu"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{crash_hotplug}=="1"
    ATTRS{isolated}==""
    ATTRS{kernel_max}=="8191"
    ATTRS{nohz_full}==" (null)"
    ATTRS{offline}=="4-7"
    ATTRS{online}=="0-3"
    ATTRS{possible}=="0-7"
    ATTRS{present}=="0-3"

With these sysfs attributes in place, it is possible to efficiently
instruct the udev rule to skip crash kernel reloading for kernels
configured with crash hotplug support.

For example, the following is the proposed udev rule change for RHEL
system 98-kexec.rules (as the first lines of the rule file):

 # The kernel updates the crash elfcorehdr for CPU and memory changes
 SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
 SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

When examined in the context of 98-kexec.rules, the above rules test if
crash_hotplug is set, and if so, the userspace initiated
unload-then-reload of the crash kernel is skipped.

CPU and memory checks are separated in accordance with
CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options.  If
an architecture supports, for example, memory hotplug but not CPU
hotplug, then the /sys/devices/system/memory/crash_hotplug attribute
file is present, but the /sys/devices/system/cpu/crash_hotplug attribute
file will NOT be present.  Thus the udev rule skips userspace processing
of memory hot un/plug events, but the udev rule will evaluate false for
CPU events, thus allowing userspace to process CPU hot un/plug events
(ie the unload-then-reload of the kdump capture kernel).

Link: https://lkml.kernel.org/r/20230814214446.6659-5-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

What:           /sys/devices/system/memory
Date:           June 2008
Contact:        Badari Pulavarty <pbadari@us.ibm.com>
Description:
                The /sys/devices/system/memory directory contains a
                snapshot of the internal state of the kernel memory
                blocks. Files may be added or removed dynamically to
                represent hot-add/remove operations.
Users:          hotplug memory add/remove tools
                http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils

What:           /sys/devices/system/memory/memoryX/removable
Date:           June 2008
Contact:        Badari Pulavarty <pbadari@us.ibm.com>
Description:
                The file /sys/devices/system/memory/memoryX/removable is a
                legacy interface used to indicate whether a memory block is
                likely to be offlineable or not.  Newer kernel versions
                return "1" if and only if the kernel supports memory
                offlining.
Users:          hotplug memory remove tools
                http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
                lsmem/chmem part of util-linux

What:           /sys/devices/system/memory/memoryX/phys_device
Date:           September 2008
Contact:        Badari Pulavarty <pbadari@us.ibm.com>
Description:
                The file /sys/devices/system/memory/memoryX/phys_device
                is read-only; it is a legacy interface only ever used on
                s390x to expose the covered storage increment.
Users:          Legacy s390-tools lsmem/chmem

What:           /sys/devices/system/memory/memoryX/phys_index
Date:           September 2008
Contact:        Badari Pulavarty <pbadari@us.ibm.com>
Description:
                The file /sys/devices/system/memory/memoryX/phys_index
                is read-only and contains the section ID in hexadecimal
                which is equivalent to decimal X contained in the memory
                section directory name.

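The hexadecimal relationship between phys_index and the directory name can be checked with a short shell sketch. The helper name phys_index_to_block is hypothetical and not part of any existing tool; the sample value in the usage note is the one reported for memory81 in the udevadm output of the commit message.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tool): convert a
# phys_index string, which is hexadecimal without a 0x prefix, to the
# decimal X used in the memoryX directory name.
phys_index_to_block() {
    printf '%d\n' "0x$1"
}
```

Calling `phys_index_to_block 00000051` yields the decimal block number that matches the memory81 directory.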
What:           /sys/devices/system/memory/memoryX/state
Date:           September 2008
Contact:        Badari Pulavarty <pbadari@us.ibm.com>
Description:
                The file /sys/devices/system/memory/memoryX/state
                is read-write.  When read, it returns the online/offline
                state of the memory block.  When written, root can toggle
                the online/offline state of a memory block using the
                following commands::

                  # echo online > /sys/devices/system/memory/memoryX/state
                  # echo offline > /sys/devices/system/memory/memoryX/state

                On newer kernel versions, advanced states can be specified
                when onlining to select a target zone: "online_movable"
                selects the movable zone.  "online_kernel" selects the
                applicable kernel zone (DMA, DMA32, or Normal).  However,
                after successfully setting one of the advanced states,
                reading the file will return "online"; the zone information
                can be obtained via "valid_zones" instead.

                While onlining is unlikely to fail, there are no guarantees
                that offlining will succeed.  Offlining is more likely to
                succeed if "valid_zones" indicates "Movable".
Users:          hotplug memory remove tools
                http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils

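Because offlining is not guaranteed to succeed, a script may want to confirm the result rather than assume it. The following is a minimal sketch under stated assumptions: the helper name try_offline_block and the MEMORY_SYSFS override variable are hypothetical, introduced only so the logic can be exercised against a scratch directory, and on a real system the write requires root.

```shell
#!/bin/sh
# MEMORY_SYSFS is an assumed override (not a kernel or udev variable) so
# the helper can be pointed at a test directory; it defaults to sysfs.
: "${MEMORY_SYSFS:=/sys/devices/system/memory}"

# Hypothetical helper: request offlining of memory block $1 and return
# success only if the block then reports "offline".
try_offline_block() {
    state_file="$MEMORY_SYSFS/memory$1/state"
    # The write fails if the request is rejected or the file is missing;
    # silence the shell's error message and report via the exit status.
    { echo offline > "$state_file"; } 2>/dev/null || return 1
    # Re-read the file to confirm the new state.
    [ "$(cat "$state_file")" = "offline" ]
}
```

A caller could use it as `try_offline_block 81 || echo "memory81 could not be offlined"`.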
What:           /sys/devices/system/memory/memoryX/valid_zones
Date:           July 2014
Contact:        Zhang Zhen <zhenzhang.zhang@huawei.com>
Description:
                The file /sys/devices/system/memory/memoryX/valid_zones is
                read-only.

                For online memory blocks, it returns in which zone memory
                provided by a memory block is managed.  If multiple zones
                apply (not applicable for hotplugged memory), "None" is
                returned and the memory block cannot be offlined.

                For offline memory blocks, it returns by which zone memory
                provided by a memory block can be managed when onlining.
                The first returned zone ("default") will be used when
                setting the state of an offline memory block to "online".
                Only one of the kernel zones (DMA, DMA32, Normal) is
                applicable for a single memory block.

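As a small illustration of the "default" zone rule for offline blocks, the sketch below prints the first zone listed in a block's valid_zones file. The helper name default_zone is hypothetical, and the sketch assumes the zones are separated by whitespace on a single line.

```shell
#!/bin/sh
# Hypothetical helper: print the first zone listed in valid_zones for
# the memory block directory given as $1. For an offline block this is
# the zone that a plain "online" request would select.
default_zone() {
    # read splits the line on whitespace; the remainder is discarded.
    read -r first _rest < "$1/valid_zones" && printf '%s\n' "$first"
}
```

For example, `default_zone /sys/devices/system/memory/memory81` prints the zone that would be used when writing "online" to that block's state file.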
What:           /sys/devices/system/memory/memoryX/nodeY
Date:           October 2009
Contact:        Linux Memory Management list <linux-mm@kvack.org>
Description:
                When CONFIG_NUMA is enabled, a symbolic link that
                points to the corresponding NUMA node directory.

                For example, the following symbolic link is created for
                memory section 9 on node0:

                /sys/devices/system/memory/memory9/node0 -> ../../node/node0

What:           /sys/devices/system/node/nodeX/memoryY
Date:           September 2008
Contact:        Gary Hade <garyhade@us.ibm.com>
Description:
                When CONFIG_NUMA is enabled,
                /sys/devices/system/node/nodeX/memoryY is a symbolic link
                that points to the corresponding
                /sys/devices/system/memory/memoryY memory section
                directory.  For example, the following symbolic link is
                created for memory section 9 on node0:

                /sys/devices/system/node/node0/memory9 -> ../../memory/memory9

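The node symbolic links make it possible to find the NUMA node of a memory block without parsing any file contents. The following sketch is illustrative only: the helper name block_node is hypothetical, and it reports just the first nodeY link it finds in the block directory.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the first nodeY symbolic link
# found in the memory block directory given as $1. Returns non-zero if
# no such link exists (for example when CONFIG_NUMA is disabled).
block_node() {
    for link in "$1"/node*; do
        # Skip the unexpanded pattern and anything that is not a symlink.
        [ -L "$link" ] || continue
        basename "$link"
        return 0
    done
    return 1
}
```

With the example link above, `block_node /sys/devices/system/memory/memory9` would print `node0`.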
What:           /sys/devices/system/memory/crash_hotplug
Date:           Aug 2023
Contact:        Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description:
                (RO) indicates whether or not the kernel directly supports
                modifying the crash elfcorehdr for memory hot un/plug
                and/or on/offline changes.
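The decision the udev rules make (skip the userspace reload when crash_hotplug reads "1", fall through when the attribute is absent) can also be expressed as a shell check. This is a sketch under stated assumptions: the helper name kernel_handles_crash_hotplug and the SYSFS_SYSTEM override variable are hypothetical and exist only for illustration and testing.

```shell
#!/bin/sh
# SYSFS_SYSTEM is an assumed override (not a kernel or udev variable) so
# the check can be run against a scratch directory.
: "${SYSFS_SYSTEM:=/sys/devices/system}"

# Hypothetical helper: succeed if the kernel itself updates the crash
# elfcorehdr for the given subsystem ($1 is "memory" or "cpu").
# A missing attribute file is treated the same as a value other than "1".
kernel_handles_crash_hotplug() {
    attr="$SYSFS_SYSTEM/$1/crash_hotplug"
    [ -r "$attr" ] && [ "$(cat "$attr")" = "1" ]
}
```

A reload script could then guard its work with `kernel_handles_crash_hotplug memory || reload_crash_kernel`, where `reload_crash_kernel` stands in for whatever unload-then-reload step the distribution uses.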