linux-imx/Documentation/ABI/testing/sysfs-devices-memory
Eric DeVolder 88a6f89944 crash: memory and CPU hotplug sysfs attributes
Introduce the crash_hotplug attribute for memory and CPUs for use by
userspace.  These attributes directly facilitate the udev rule for
managing userspace re-loading of the crash kernel upon hot un/plug
changes.

For memory, expose the crash_hotplug attribute to the
/sys/devices/system/memory directory.  For example:

 # udevadm info --attribute-walk /sys/devices/system/memory/memory81
  looking at device '/devices/system/memory/memory81':
    KERNEL=="memory81"
    SUBSYSTEM=="memory"
    DRIVER==""
    ATTR{online}=="1"
    ATTR{phys_device}=="0"
    ATTR{phys_index}=="00000051"
    ATTR{removable}=="1"
    ATTR{state}=="online"
    ATTR{valid_zones}=="Movable"

  looking at parent device '/devices/system/memory':
    KERNELS=="memory"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{auto_online_blocks}=="offline"
    ATTRS{block_size_bytes}=="8000000"
    ATTRS{crash_hotplug}=="1"

For CPUs, expose the crash_hotplug attribute to the
/sys/devices/system/cpu directory. For example:

 # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
  looking at device '/devices/system/cpu/cpu0':
    KERNEL=="cpu0"
    SUBSYSTEM=="cpu"
    DRIVER=="processor"
    ATTR{crash_notes}=="277c38600"
    ATTR{crash_notes_size}=="368"
    ATTR{online}=="1"

  looking at parent device '/devices/system/cpu':
    KERNELS=="cpu"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{crash_hotplug}=="1"
    ATTRS{isolated}==""
    ATTRS{kernel_max}=="8191"
    ATTRS{nohz_full}=="  (null)"
    ATTRS{offline}=="4-7"
    ATTRS{online}=="0-3"
    ATTRS{possible}=="0-7"
    ATTRS{present}=="0-3"

With these sysfs attributes in place, it is possible to efficiently
instruct the udev rule to skip crash kernel reloading for kernels
configured with crash hotplug support.

For example, the following is the proposed udev rule change for RHEL
system 98-kexec.rules (as the first lines of the rule file):

 # The kernel updates the crash elfcorehdr for CPU and memory changes
 SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
 SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

When examined in the context of 98-kexec.rules, the above rules test if
crash_hotplug is set, and if so, the userspace initiated
unload-then-reload of the crash kernel is skipped.

CPU and memory checks are separated in accordance with CONFIG_HOTPLUG_CPU
and CONFIG_MEMORY_HOTPLUG kernel config options.  If an architecture
supports, for example, memory hotplug but not CPU hotplug, then the
/sys/devices/system/memory/crash_hotplug attribute file is present, but
the /sys/devices/system/cpu/crash_hotplug attribute file will NOT be
present.  Thus the udev rule skips userspace processing of memory hot
un/plug events, but evaluates false for CPU events, which leaves
userspace to process CPU hot un/plug events (i.e., the
unload-then-reload of the kdump capture kernel).

Link: https://lkml.kernel.org/r/20230814214446.6659-5-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:25:14 -07:00


What:		/sys/devices/system/memory
Date:		June 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
	The /sys/devices/system/memory contains a snapshot of the
	internal state of the kernel memory blocks. Files could be
	added or removed dynamically to represent hot-add/remove
	operations.
Users:		hotplug memory add/remove tools
	http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils

What:		/sys/devices/system/memory/memoryX/removable
Date:		June 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
	The file /sys/devices/system/memory/memoryX/removable is a
	legacy interface used to indicate whether a memory block is
	likely to be offlineable or not.  Newer kernel versions return
	"1" if and only if the kernel supports memory offlining.
Users:		hotplug memory remove tools
	http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
	lsmem/chmem part of util-linux

What:		/sys/devices/system/memory/memoryX/phys_device
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
	The file /sys/devices/system/memory/memoryX/phys_device
	is read-only; it is a legacy interface only ever used on s390x
	to expose the covered storage increment.
Users:		Legacy s390-tools lsmem/chmem

What:		/sys/devices/system/memory/memoryX/phys_index
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
	The file /sys/devices/system/memory/memoryX/phys_index
	is read-only and contains the section ID in hexadecimal.  The
	value is equal to the decimal X in the memory section directory
	name.
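
The hexadecimal-to-decimal relationship between phys_index and the
directory name can be checked with a few lines of Python.  This is only
an illustrative sketch: the helper names are hypothetical, and the
sample value is the phys_index shown for memory81 in the udevadm
listing of the commit message.

```python
def block_number_from_phys_index(phys_index: str) -> int:
    """Convert the hexadecimal phys_index string to the decimal X of memoryX."""
    return int(phys_index.strip(), 16)


def directory_name_for(phys_index: str) -> str:
    """Build the memoryX directory name that corresponds to a phys_index value."""
    return f"memory{block_number_from_phys_index(phys_index)}"


if __name__ == "__main__":
    # phys_index value reported for memory81 in the udevadm listing.
    print(directory_name_for("00000051"))  # prints "memory81"
```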

What:		/sys/devices/system/memory/memoryX/state
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
	The file /sys/devices/system/memory/memoryX/state
	is read-write.  When read, it returns the online/offline
	state of the memory block.  When written, root can toggle
	the online/offline state of a memory block using the
	following commands::

	  # echo online > /sys/devices/system/memory/memoryX/state
	  # echo offline > /sys/devices/system/memory/memoryX/state

	On newer kernel versions, advanced states can be specified
	when onlining to select a target zone: "online_movable"
	selects the movable zone.  "online_kernel" selects the
	applicable kernel zone (DMA, DMA32, or Normal).  However,
	after successfully setting one of the advanced states,
	reading the file will return "online"; the zone information
	can be obtained via "valid_zones" instead.

	While onlining is unlikely to fail, there are no guarantees
	that offlining will succeed.  Offlining is more likely to
	succeed if "valid_zones" indicates "Movable".

Users:		hotplug memory remove tools
	http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
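
As a rough sketch of how a tool might drive the state file, the Python
below reads the current state and attempts a transition.  The function
names and the WRITABLE_STATES tuple are hypothetical; the accepted
strings are the ones named in the description above.  Writing requires
root, and a failed write is reported rather than treated as fatal,
since offlining is not guaranteed to succeed.

```python
from pathlib import Path

# States that may be written, as named in the description above.
WRITABLE_STATES = ("online", "offline", "online_movable", "online_kernel")


def read_state(block_dir: str) -> str:
    """Return the current contents of <block_dir>/state without the newline."""
    return Path(block_dir, "state").read_text().strip()


def request_state(block_dir: str, new_state: str) -> bool:
    """Write new_state to <block_dir>/state; return False if the write fails."""
    if new_state not in WRITABLE_STATES:
        raise ValueError(f"unsupported state: {new_state!r}")
    try:
        Path(block_dir, "state").write_text(new_state)
    except OSError:
        # The kernel (or a permission check) rejected the request.
        return False
    return True
```

A caller would typically re-read the file afterwards, because a
successful "online_movable" or "online_kernel" write still reads back
as "online".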

What:		/sys/devices/system/memory/memoryX/valid_zones
Date:		July 2014
Contact:	Zhang Zhen <zhenzhang.zhang@huawei.com>
Description:
	The file /sys/devices/system/memory/memoryX/valid_zones is
	read-only.

	For online memory blocks, it returns in which zone memory
	provided by a memory block is managed.  If multiple zones
	apply (not applicable for hotplugged memory), "None" is returned
	and the memory block cannot be offlined.

	For offline memory blocks, it returns by which zone memory
	provided by a memory block can be managed when onlining.
	The first returned zone ("default") will be used when setting
	the state of an offline memory block to "online".  Only one of
	the kernel zones (DMA, DMA32, Normal) is applicable for a single
	memory block.
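
The rules above can be sketched as a small parser.  This is an
illustration only: the function names are hypothetical, and it assumes
that multiple zones are separated by whitespace in the file contents,
which the description above does not state explicitly.

```python
from typing import List, Optional


def parse_valid_zones(text: str) -> List[str]:
    """Split the valid_zones contents into zone names; "None" yields no zones."""
    zones = text.split()  # assumption: zone names are whitespace-separated
    return [] if zones == ["None"] else zones


def default_zone(text: str) -> Optional[str]:
    """Return the first listed zone, which is the default when writing "online"."""
    zones = parse_valid_zones(text)
    return zones[0] if zones else None
```

For an online block the list holds the single managing zone (or is
empty when "None" is reported); for an offline block the first element
is the zone that a plain "online" write would select.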

What:		/sys/devices/system/memory/memoryX/nodeY
Date:		October 2009
Contact:	Linux Memory Management list <linux-mm@kvack.org>
Description:
	When CONFIG_NUMA is enabled, a symbolic link that
	points to the corresponding NUMA node directory.

	For example, the following symbolic link is created for
	memory section 9 on node0:

	/sys/devices/system/memory/memory9/node0 -> ../../node/node0

What:		/sys/devices/system/node/nodeX/memoryY
Date:		September 2008
Contact:	Gary Hade <garyhade@us.ibm.com>
Description:
	When CONFIG_NUMA is enabled
	/sys/devices/system/node/nodeX/memoryY is a symbolic link that
	points to the corresponding /sys/devices/system/memory/memoryY
	memory section directory.  For example, the following symbolic
	link is created for memory section 9 on node0:

	/sys/devices/system/node/node0/memory9 -> ../../memory/memory9
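
The two sets of symbolic links allow a lookup in either direction.  A
minimal sketch, with hypothetical helper names, that relies only on the
nodeY and memoryY entry names described above:

```python
import os
import re
from typing import List

NODE_NAME = re.compile(r"node(\d+)")
MEMORY_NAME = re.compile(r"memory(\d+)")


def _numbered_entries(directory: str, pattern) -> List[int]:
    """Collect the numeric suffix of every entry whose full name matches pattern."""
    numbers = []
    for name in os.listdir(directory):
        match = pattern.fullmatch(name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def nodes_of_block(block_dir: str) -> List[int]:
    """Node numbers linked from a memoryX directory (empty without CONFIG_NUMA)."""
    return _numbered_entries(block_dir, NODE_NAME)


def blocks_of_node(node_dir: str) -> List[int]:
    """Memory section numbers linked from a nodeX directory."""
    return _numbered_entries(node_dir, MEMORY_NAME)
```

For instance, nodes_of_block("/sys/devices/system/memory/memory9")
would be expected to include 0 on a system where the link shown above
exists.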

What:		/sys/devices/system/memory/crash_hotplug
Date:		Aug 2023
Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description:
	(RO) indicates whether or not the kernel directly supports
	modifying the crash elfcorehdr for memory hot un/plug and/or
	on/offline changes.
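
The decision that the udev rules in the commit message encode can also
be expressed in userspace code.  The sketch below uses hypothetical
function names and a configurable sysfs root (so it can be exercised
against a test directory).  It treats a missing crash_hotplug file as
"not supported", matching the case where CONFIG_HOTPLUG_CPU or
CONFIG_MEMORY_HOTPLUG is not enabled and the attribute is absent.

```python
from pathlib import Path

DEFAULT_SYSFS_ROOT = "/sys/devices/system"


def kernel_handles_crash_hotplug(subsystem: str,
                                 sysfs_root: str = DEFAULT_SYSFS_ROOT) -> bool:
    """True if <sysfs_root>/<subsystem>/crash_hotplug exists and reads "1"."""
    attr = Path(sysfs_root, subsystem, "crash_hotplug")
    try:
        return attr.read_text().strip() == "1"
    except FileNotFoundError:
        # Attribute not present: the kernel does not update the elfcorehdr itself.
        return False


def userspace_reload_needed(subsystem: str,
                            sysfs_root: str = DEFAULT_SYSFS_ROOT) -> bool:
    """True if userspace still has to unload and reload the crash kernel."""
    return not kernel_handles_crash_hotplug(subsystem, sysfs_root)


if __name__ == "__main__":
    for name in ("cpu", "memory"):
        print(name, "reload needed:", userspace_reload_needed(name))
```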