Commit Graph

1312 Commits

Author SHA1 Message Date
Masahisa Kojima
6bb3703aa5 efi: expose efivar generic ops register function
This is a preparation for supporting efivar operations provided by other
than efi subsystem.  Both register and unregister functions are exposed
so that non-efi subsystem can revert the efi generic operation.

Acked-by: Sumit Garg <sumit.garg@linaro.org>
Co-developed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-12-11 11:19:18 +01:00
Wang Yao
271f2a4a95 efi/loongarch: Use load address to calculate kernel entry address
The efi_relocate_kernel() may load the PIE kernel to anywhere, the
loaded address may not be equal to link address or
EFI_KIMG_PREFERRED_ADDRESS.

Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Wang Yao <wangyao@lemote.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-12-11 11:18:26 +01:00
Raag Jadav
9e93507da2 efi: dev-path-parser: use acpi_dev_uid_match() for matching _UID
Now that we have _UID matching support for integer types, we can use
acpi_dev_uid_match() for it.

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-06 18:02:37 +01:00
Michael Roth
01b1e3ca0e efi/unaccepted: Fix off-by-one when checking for overlapping ranges
When a task needs to accept memory it will scan the accepting_list
to see if any ranges already being processed by other tasks overlap
with its range. Due to an off-by-one in the range comparisons, a task
might falsely determine that an overlapping range is being accepted,
leading to an unnecessary delay before it begins processing the range.

Fix the off-by-one in the range comparison to prevent this and slightly
improve performance.

Fixes: 50e782a86c ("efi/unaccepted: Fix soft lockups caused by parallel memory acceptance")
Link: https://lore.kernel.org/linux-mm/20231101004523.vseyi5bezgfaht5i@amd.com/T/#me2eceb9906fcae5fe958b3fe88e41f920f8335b6
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-11-28 12:49:21 +01:00
Linus Torvalds
56d428ae1c RISC-V Patches for the 6.7 Merge Window, Part 2
* Support for handling misaligned accesses in S-mode.
 * Probing for misaligned access support is now properly cached and
   handled in parallel.
 * PTDUMP now reflects the SW reserved bits, as well as the PBMT and
   NAPOT extensions.
 * Performance improvements for TLB flushing.
 * Support for many new relocations in the module loader.
 * Various bug fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmVOUCcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYicJ2D/9S+9dnHYHVGTeJfr9Zf2T4r+qHBPyx
 LXbTAbgHN6139MgcRLMRlcUaQ04RVxuBCWhxewJ6mQiHiYNlullgKmJO8oYMS4uZ
 2yQGHKhzKEVluXxe+qT6VW+zsP0cY6pDQ+e59AqZgyWzvATxMU4VtFfCDdjFG03I
 k/8Y3MUKSHAKzIHUsGHiMW5J2YRiM/iVehv2gZfanreulWlK6lyiV4AZ4KChu8Sa
 gix9QkFJw+9+7RHnouHvczt4xTqLPJQcdecLJsbisEI4VaaPtTVzkvXx/kwbMwX0
 qkQnZ7I60fPHrCb9ccuedjDMa1Z0lrfwRldBGz9f9QaW37Eppirn6LA5JiZ1cA47
 wKTwba6gZJCTRXELFTJLcv+Cwdy003E0y3iL5UK2rkbLqcxfvLdq1WAJU2t05Lmh
 aRQN10BtM2DZG+SNPlLoBpXPDw0Q3KOc20zGtuhmk010+X4yOK7WXlu8zNGLLE0+
 yHamiZqAbpIUIEzwDdGbb95jywR1sUhNTbScuhj4Rc79ZqLtPxty1PUhnfqFat1R
 i3ngQtCbeUUYFS2YV9tKkXjLf/xkQNRbt7kQBowuvFuvfksl9UwMdRAWcE/h0M9P
 7uz7cBFhuG0v/XblB7bUhYLkKITvP+ltSMyxaGlfpGqCLAH2KIztdZ2PLWLRdKeU
 +9dtZSQR6oBLqQ==
 =NhdR
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.7-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for handling misaligned accesses in S-mode

 - Probing for misaligned access support is now properly cached and
   handled in parallel

 - PTDUMP now reflects the SW reserved bits, as well as the PBMT and
   NAPOT extensions

 - Performance improvements for TLB flushing

 - Support for many new relocations in the module loader

 - Various bug fixes and cleanups

* tag 'riscv-for-linus-6.7-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  riscv: Optimize bitops with Zbb extension
  riscv: Rearrange hwcap.h and cpufeature.h
  drivers: perf: Do not broadcast to other cpus when starting a counter
  drivers: perf: Check find_first_bit() return value
  of: property: Add fw_devlink support for msi-parent
  RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs
  riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings
  riscv: Don't use PGD entries for the linear mapping
  RISC-V: Probe misaligned access speed in parallel
  RISC-V: Remove __init on unaligned_emulation_finish()
  RISC-V: Show accurate per-hart isa in /proc/cpuinfo
  RISC-V: Don't rely on positional structure initialization
  riscv: Add tests for riscv module loading
  riscv: Add remaining module relocations
  riscv: Avoid unaligned access when relocating modules
  riscv: split cache ops out of dma-noncoherent.c
  riscv: Improve flush_tlb_kernel_range()
  riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
  riscv: Improve flush_tlb_range() for hugetlb pages
  riscv: Improve tlb_flush()
  ...
2023-11-10 09:23:17 -08:00
Xiao Wang
457926b253
riscv: Optimize bitops with Zbb extension
This patch leverages the alternative mechanism to dynamically optimize
bitops (including __ffs, __fls, ffs, fls) with Zbb instructions. When
Zbb ext is not supported by the runtime CPU, legacy implementation is
used. If Zbb is supported, then the optimized variants will be selected
via alternative patching.

The legacy bitops support is taken from the generic C implementation as
fallback.

If the parameter is a build-time constant, we leverage compiler builtin to
calculate the result directly, this approach is inspired by x86 bitops
implementation.

EFI stub runs before the kernel, so alternative mechanism should not be
used there, this patch introduces a macro NO_ALTERNATIVE for this purpose.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231031064553.2319688-3-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-09 10:15:52 -08:00
Linus Torvalds
1f24458a10 TTY/Serial changes for 6.7-rc1
Here is the big set of tty/serial driver changes for 6.7-rc1.  Included
 in here are:
   - console/vgacon cleanups and removals from Arnd
   - tty core and n_tty cleanups from Jiri
   - lots of 8250 driver updates and cleanups
   - sc16is7xx serial driver updates
   - dt binding updates
   - first set of port lock wrapers from Thomas for the printk fixes
     coming in future releases
   - other small serial and tty core cleanups and updates
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZUTbaw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk9+gCeKdoRb8FDwGCO/GaoHwR4EzwQXhQAoKXZRmN5
 LTtw9sbfGIiBdOTtgLPb
 =6PJr
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty and serial updates from Greg KH:
 "Here is the big set of tty/serial driver changes for 6.7-rc1. Included
  in here are:

   - console/vgacon cleanups and removals from Arnd

   - tty core and n_tty cleanups from Jiri

   - lots of 8250 driver updates and cleanups

   - sc16is7xx serial driver updates

   - dt binding updates

   - first set of port lock wrapers from Thomas for the printk fixes
     coming in future releases

   - other small serial and tty core cleanups and updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (193 commits)
  serdev: Replace custom code with device_match_acpi_handle()
  serdev: Simplify devm_serdev_device_open() function
  serdev: Make use of device_set_node()
  tty: n_gsm: add copyright Siemens Mobility GmbH
  tty: n_gsm: fix race condition in status line change on dead connections
  serial: core: Fix runtime PM handling for pending tx
  vgacon: fix mips/sibyte build regression
  dt-bindings: serial: drop unsupported samsung bindings
  tty: serial: samsung: drop earlycon support for unsupported platforms
  tty: 8250: Add note for PX-835
  tty: 8250: Fix IS-200 PCI ID comment
  tty: 8250: Add Brainboxes Oxford Semiconductor-based quirks
  tty: 8250: Add support for Intashield IX cards
  tty: 8250: Add support for additional Brainboxes PX cards
  tty: 8250: Fix up PX-803/PX-857
  tty: 8250: Fix port count of PX-257
  tty: 8250: Add support for Intashield IS-100
  tty: 8250: Add support for Brainboxes UP cards
  tty: 8250: Add support for additional Brainboxes UC cards
  tty: 8250: Remove UC-257 and UC-431
  ...
2023-11-03 15:44:25 -10:00
Linus Torvalds
ecae0bd517 Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
 
 - Kemeng Shi has contributed some compation maintenance work in the
   series "Fixes and cleanups to compaction".
 
 - Joel Fernandes has a patchset ("Optimize mremap during mutual
   alignment within PMD") which fixes an obscure issue with mremap()'s
   pagetable handling during a subsequent exec(), based upon an
   implementation which Linus suggested.
 
 - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the
   following patch series:
 
 	mm/damon: misc fixups for documents, comments and its tracepoint
 	mm/damon: add a tracepoint for damos apply target regions
 	mm/damon: provide pseudo-moving sum based access rate
 	mm/damon: implement DAMOS apply intervals
 	mm/damon/core-test: Fix memory leaks in core-test
 	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
 
 - In the series "Do not try to access unaccepted memory" Adrian Hunter
   provides some fixups for the recently-added "unaccepted memory' feature.
   To increase the feature's checking coverage.  "Plug a few gaps where
   RAM is exposed without checking if it is unaccepted memory".
 
 - In the series "cleanups for lockless slab shrink" Qi Zheng has done
   some maintenance work which is preparation for the lockless slab
   shrinking code.
 
 - Qi Zheng has redone the earlier (and reverted) attempt to make slab
   shrinking lockless in the series "use refcount+RCU method to implement
   lockless slab shrink".
 
 - David Hildenbrand contributes some maintenance work for the rmap code
   in the series "Anon rmap cleanups".
 
 - Kefeng Wang does more folio conversions and some maintenance work in
   the migration code.  Series "mm: migrate: more folio conversion and
   unification".
 
 - Matthew Wilcox has fixed an issue in the buffer_head code which was
   causing long stalls under some heavy memory/IO loads.  Some cleanups
   were added on the way.  Series "Add and use bdev_getblk()".
 
 - In the series "Use nth_page() in place of direct struct page
   manipulation" Zi Yan has fixed a potential issue with the direct
   manipulation of hugetlb page frames.
 
 - In the series "mm: hugetlb: Skip initialization of gigantic tail
   struct pages if freed by HVO" has improved our handling of gigantic
   pages in the hugetlb vmmemmep optimizaton code.  This provides
   significant boot time improvements when significant amounts of gigantic
   pages are in use.
 
 - Matthew Wilcox has sent the series "Small hugetlb cleanups" - code
   rationalization and folio conversions in the hugetlb code.
 
 - Yin Fengwei has improved mlock()'s handling of large folios in the
   series "support large folio for mlock"
 
 - In the series "Expose swapcache stat for memcg v1" Liu Shixin has
   added statistics for memcg v1 users which are available (and useful)
   under memcg v2.
 
 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
   prctl so that userspace may direct the kernel to not automatically
   propagate the denial to child processes.  The series is named "MDWE
   without inheritance".
 
 - Kefeng Wang has provided the series "mm: convert numa balancing
   functions to use a folio" which does what it says.
 
 - In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch
   makes is possible for a process to propagate KSM treatment across
   exec().
 
 - Huang Ying has enhanced memory tiering's calculation of memory
   distances.  This is used to permit the dax/kmem driver to use "high
   bandwidth memory" in addition to Optane Data Center Persistent Memory
   Modules (DCPMM).  The series is named "memory tiering: calculate
   abstract distance based on ACPI HMAT"
 
 - In the series "Smart scanning mode for KSM" Stefan Roesch has
   optimized KSM by teaching it to retain and use some historical
   information from previous scans.
 
 - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the
   series "mm: memcg: fix tracking of pending stats updates values".
 
 - In the series "Implement IOCTL to get and optionally clear info about
   PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits
   us to atomically read-then-clear page softdirty state.  This is mainly
   used by CRIU.
 
 - Hugh Dickins contributed the series "shmem,tmpfs: general maintenance"
   - a bunch of relatively minor maintenance tweaks to this code.
 
 - Matthew Wilcox has increased the use of the VMA lock over file-backed
   page faults in the series "Handle more faults under the VMA lock".  Some
   rationalizations of the fault path became possible as a result.
 
 - In the series "mm/rmap: convert page_move_anon_rmap() to
   folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups
   and folio conversions.
 
 - In the series "various improvements to the GUP interface" Lorenzo
   Stoakes has simplified and improved the GUP interface with an eye to
   providing groundwork for future improvements.
 
 - Andrey Konovalov has sent along the series "kasan: assorted fixes and
   improvements" which does those things.
 
 - Some page allocator maintenance work from Kemeng Shi in the series
   "Two minor cleanups to break_down_buddy_pages".
 
 - In thes series "New selftest for mm" Breno Leitao has developed
   another MM self test which tickles a race we had between madvise() and
   page faults.
 
 - In the series "Add folio_end_read" Matthew Wilcox provides cleanups
   and an optimization to the core pagecache code.
 
 - Nhat Pham has added memcg accounting for hugetlb memory in the series
   "hugetlb memcg accounting".
 
 - Cleanups and rationalizations to the pagemap code from Lorenzo
   Stoakes, in the series "Abstract vma_merge() and split_vma()".
 
 - Audra Mitchell has fixed issues in the procfs page_owner code's new
   timestamping feature which was causing some misbehaviours.  In the
   series "Fix page_owner's use of free timestamps".
 
 - Lorenzo Stoakes has fixed the handling of new mappings of sealed files
   in the series "permit write-sealed memfd read-only shared mappings".
 
 - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
   series "Batch hugetlb vmemmap modification operations".
 
 - Some buffer_head folio conversions and cleanups from Matthew Wilcox in
   the series "Finish the create_empty_buffers() transition".
 
 - As a page allocator performance optimization Huang Ying has added
   automatic tuning to the allocator's per-cpu-pages feature, in the series
   "mm: PCP high auto-tuning".
 
 - Roman Gushchin has contributed the patchset "mm: improve performance
   of accounted kernel memory allocations" which improves their performance
   by ~30% as measured by a micro-benchmark.
 
 - folio conversions from Kefeng Wang in the series "mm: convert page
   cpupid functions to folios".
 
 - Some kmemleak fixups in Liu Shixin's series "Some bugfix about
   kmemleak".
 
 - Qi Zheng has improved our handling of memoryless nodes by keeping them
   off the allocation fallback list.  This is done in the series "handle
   memoryless nodes more appropriately".
 
 - khugepaged conversions from Vishal Moola in the series "Some
   khugepaged folio conversions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA
 jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y
 FgeUPAD1oasg6CP+INZvCj34waNxwAc=
 =E+Y4
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Kemeng Shi has contributed some compation maintenance work in the
     series 'Fixes and cleanups to compaction'

   - Joel Fernandes has a patchset ('Optimize mremap during mutual
     alignment within PMD') which fixes an obscure issue with mremap()'s
     pagetable handling during a subsequent exec(), based upon an
     implementation which Linus suggested

   - More DAMON/DAMOS maintenance and feature work from SeongJae Park i
     the following patch series:

	mm/damon: misc fixups for documents, comments and its tracepoint
	mm/damon: add a tracepoint for damos apply target regions
	mm/damon: provide pseudo-moving sum based access rate
	mm/damon: implement DAMOS apply intervals
	mm/damon/core-test: Fix memory leaks in core-test
	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval

   - In the series 'Do not try to access unaccepted memory' Adrian
     Hunter provides some fixups for the recently-added 'unaccepted
     memory' feature. To increase the feature's checking coverage. 'Plug
     a few gaps where RAM is exposed without checking if it is
     unaccepted memory'

   - In the series 'cleanups for lockless slab shrink' Qi Zheng has done
     some maintenance work which is preparation for the lockless slab
     shrinking code

   - Qi Zheng has redone the earlier (and reverted) attempt to make slab
     shrinking lockless in the series 'use refcount+RCU method to
     implement lockless slab shrink'

   - David Hildenbrand contributes some maintenance work for the rmap
     code in the series 'Anon rmap cleanups'

   - Kefeng Wang does more folio conversions and some maintenance work
     in the migration code. Series 'mm: migrate: more folio conversion
     and unification'

   - Matthew Wilcox has fixed an issue in the buffer_head code which was
     causing long stalls under some heavy memory/IO loads. Some cleanups
     were added on the way. Series 'Add and use bdev_getblk()'

   - In the series 'Use nth_page() in place of direct struct page
     manipulation' Zi Yan has fixed a potential issue with the direct
     manipulation of hugetlb page frames

   - In the series 'mm: hugetlb: Skip initialization of gigantic tail
     struct pages if freed by HVO' has improved our handling of gigantic
     pages in the hugetlb vmmemmep optimizaton code. This provides
     significant boot time improvements when significant amounts of
     gigantic pages are in use

   - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
     rationalization and folio conversions in the hugetlb code

   - Yin Fengwei has improved mlock()'s handling of large folios in the
     series 'support large folio for mlock'

   - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
     added statistics for memcg v1 users which are available (and
     useful) under memcg v2

   - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
     prctl so that userspace may direct the kernel to not automatically
     propagate the denial to child processes. The series is named 'MDWE
     without inheritance'

   - Kefeng Wang has provided the series 'mm: convert numa balancing
     functions to use a folio' which does what it says

   - In the series 'mm/ksm: add fork-exec support for prctl' Stefan
     Roesch makes is possible for a process to propagate KSM treatment
     across exec()

   - Huang Ying has enhanced memory tiering's calculation of memory
     distances. This is used to permit the dax/kmem driver to use 'high
     bandwidth memory' in addition to Optane Data Center Persistent
     Memory Modules (DCPMM). The series is named 'memory tiering:
     calculate abstract distance based on ACPI HMAT'

   - In the series 'Smart scanning mode for KSM' Stefan Roesch has
     optimized KSM by teaching it to retain and use some historical
     information from previous scans

   - Yosry Ahmed has fixed some inconsistencies in memcg statistics in
     the series 'mm: memcg: fix tracking of pending stats updates
     values'

   - In the series 'Implement IOCTL to get and optionally clear info
     about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
     which permits us to atomically read-then-clear page softdirty
     state. This is mainly used by CRIU

   - Hugh Dickins contributed the series 'shmem,tmpfs: general
     maintenance', a bunch of relatively minor maintenance tweaks to
     this code

   - Matthew Wilcox has increased the use of the VMA lock over
     file-backed page faults in the series 'Handle more faults under the
     VMA lock'. Some rationalizations of the fault path became possible
     as a result

   - In the series 'mm/rmap: convert page_move_anon_rmap() to
     folio_move_anon_rmap()' David Hildenbrand has implemented some
     cleanups and folio conversions

   - In the series 'various improvements to the GUP interface' Lorenzo
     Stoakes has simplified and improved the GUP interface with an eye
     to providing groundwork for future improvements

   - Andrey Konovalov has sent along the series 'kasan: assorted fixes
     and improvements' which does those things

   - Some page allocator maintenance work from Kemeng Shi in the series
     'Two minor cleanups to break_down_buddy_pages'

   - In thes series 'New selftest for mm' Breno Leitao has developed
     another MM self test which tickles a race we had between madvise()
     and page faults

   - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
     and an optimization to the core pagecache code

   - Nhat Pham has added memcg accounting for hugetlb memory in the
     series 'hugetlb memcg accounting'

   - Cleanups and rationalizations to the pagemap code from Lorenzo
     Stoakes, in the series 'Abstract vma_merge() and split_vma()'

   - Audra Mitchell has fixed issues in the procfs page_owner code's new
     timestamping feature which was causing some misbehaviours. In the
     series 'Fix page_owner's use of free timestamps'

   - Lorenzo Stoakes has fixed the handling of new mappings of sealed
     files in the series 'permit write-sealed memfd read-only shared
     mappings'

   - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
     series 'Batch hugetlb vmemmap modification operations'

   - Some buffer_head folio conversions and cleanups from Matthew Wilcox
     in the series 'Finish the create_empty_buffers() transition'

   - As a page allocator performance optimization Huang Ying has added
     automatic tuning to the allocator's per-cpu-pages feature, in the
     series 'mm: PCP high auto-tuning'

   - Roman Gushchin has contributed the patchset 'mm: improve
     performance of accounted kernel memory allocations' which improves
     their performance by ~30% as measured by a micro-benchmark

   - folio conversions from Kefeng Wang in the series 'mm: convert page
     cpupid functions to folios'

   - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
     kmemleak'

   - Qi Zheng has improved our handling of memoryless nodes by keeping
     them off the allocation fallback list. This is done in the series
     'handle memoryless nodes more appropriately'

   - khugepaged conversions from Vishal Moola in the series 'Some
     khugepaged folio conversions'"

[ bcachefs conflicts with the dynamically allocated shrinkers have been
  resolved as per Stephen Rothwell in

     https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/

  with help from Qi Zheng.

  The clone3 test filtering conflict was half-arsed by yours truly ]

* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
  mm/damon/sysfs: update monitoring target regions for online input commit
  mm/damon/sysfs: remove requested targets when online-commit inputs
  selftests: add a sanity check for zswap
  Documentation: maple_tree: fix word spelling error
  mm/vmalloc: fix the unchecked dereference warning in vread_iter()
  zswap: export compression failure stats
  Documentation: ubsan: drop "the" from article title
  mempolicy: migration attempt to match interleave nodes
  mempolicy: mmap_lock is not needed while migrating folios
  mempolicy: alloc_pages_mpol() for NUMA policy without vma
  mm: add page_rmappable_folio() wrapper
  mempolicy: remove confusing MPOL_MF_LAZY dead code
  mempolicy: mpol_shared_policy_init() without pseudo-vma
  mempolicy trivia: use pgoff_t in shared mempolicy tree
  mempolicy trivia: slightly more consistent naming
  mempolicy trivia: delete those ancient pr_debug()s
  mempolicy: fix migrate_pages(2) syscall return nr_failed
  kernfs: drop shared NUMA mempolicy hooks
  hugetlbfs: drop shared NUMA mempolicy pretence
  mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
  ...
2023-11-02 19:38:47 -10:00
Linus Torvalds
1e0c505e13 asm-generic updates for v6.7
The ia64 architecture gets its well-earned retirement as planned,
 now that there is one last (mostly) working release that will
 be maintained as an LTS kernel.
 
 The architecture specific system call tables are updated for
 the added map_shadow_stack() syscall and to remove references
 to the long-gone sys_lookup_dcookie() syscall.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmVC40IACgkQYKtH/8kJ
 Uidhmw/9EX+aWSXGoObJ3fngaNSMw+PmrEuP8qEKBHxfKHcCdX3hc451Oh4GlhaQ
 tru91pPwgNvN2/rfoKusxT+V4PemGIzfNni/04rp+P0kvmdw5otQ2yNhsQNsfVmq
 XGWvkxF4P2GO6bkjjfR/1dDq7GtlyXtwwPDKeLbYb6TnJOZjtx+EAN27kkfSn1Ms
 R4Sa3zJ+DfHUmHL5S9g+7UD/CZ5GfKNmIskI4Mz5GsfoUz/0iiU+Bge/9sdcdSJQ
 kmbLy5YnVzfooLZ3TQmBFsO3iAMWb0s/mDdtyhqhTVmTUshLolkPYyKnPFvdupyv
 shXcpEST2XJNeaDRnL2K4zSCdxdbnCZHDpjfl9wfioBg7I8NfhXKpf1jYZHH1de4
 LXq8ndEFEOVQw/zSpYWfQq1sux8Jiqr+UK/ukbVeFWiGGIUs91gEWtPAf8T0AZo9
 ujkJvaWGl98O1g5wmBu0/dAR6QcFJMDfVwbmlIFpU8O+MEaz6X8mM+O5/T0IyTcD
 eMbAUjj4uYcU7ihKzHEv/0SS9Of38kzff67CLN5k8wOP/9NlaGZ78o1bVle9b52A
 BdhrsAefFiWHp1jT6Y9Rg4HOO/TguQ9e6EWSKOYFulsiLH9LEFaB9RwZLeLytV0W
 vlAgY9rUW77g1OJcb7DoNv33nRFuxsKqsnz3DEIXtgozo9CzbYI=
 =H1vH
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull ia64 removal and asm-generic updates from Arnd Bergmann:

 - The ia64 architecture gets its well-earned retirement as planned,
   now that there is one last (mostly) working release that will be
   maintained as an LTS kernel.

 - The architecture specific system call tables are updated for the
   added map_shadow_stack() syscall and to remove references to the
   long-gone sys_lookup_dcookie() syscall.

* tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  hexagon: Remove unusable symbols from the ptrace.h uapi
  asm-generic: Fix spelling of architecture
  arch: Reserve map_shadow_stack() syscall number for all architectures
  syscalls: Cleanup references to sys_lookup_dcookie()
  Documentation: Drop or replace remaining mentions of IA64
  lib/raid6: Drop IA64 support
  Documentation: Drop IA64 from feature descriptions
  kernel: Drop IA64 support from sig_fault handlers
  arch: Remove Itanium (IA-64) architecture
2023-11-01 15:28:33 -10:00
Linus Torvalds
2b95bb0526 Changes to the x86 boot code in v6.7:
- Rework PE header generation, primarily to generate a modern, 4k aligned
    kernel image view with narrower W^X permissions.
 
  - Further refine init-lifetime annotations
 
  - Misc cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU9B6ERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jXOg/+NAOQKhIYK0uFqAM+CEhZX4cqsJ9Ck0ze
 bqQ8pf5iCkbVZ+6ByiMSOszScTgVTSalRfKMYR+Fa9PVkLK4SNAeYPnGYugmLRoj
 U3lZYFpNDEwsZOmFwvqn7p+bGBQcBYKZuVI6bQh5U7Go4v6ujPjK4zTAK8SWDdTp
 DtEzhj9tELcYlm1NSV2OYu/k0IWAFV3Fc++G3WAm85xOK7oXVOYeMIlaVkpOyAXu
 th3yCw+Q0u1tuBS++77FwsEPt1KTzKGcTL7HpPrb4e4e4snOhmri+KAM/Noef7Vm
 lWqo8fTAeYwpYQ80oFsXVDhuI5LsfsuQgQid20sy1cWwswe1o1A73/AeP4pRogWl
 zLJuRcuNg2/VhPvMLdBWn5QdgJjH7CngeH+r/YkZPssPo6tfwa5UW7HOTCQvLsO9
 a+xy098qkk9d+8Za0sYMuv8/4+Ev5II2haP8edLgNWQ8S5qKIUQaY+r6268pIN/F
 0fGP9B3wblBjiNWCnd8UBh6T271g1O4vaMUt2URdcW3QObEq2EGnNiTc5tx9OPnP
 ZxQdAIl6pB0H0HIe9/7PABF40biKn84zmSl+KuXrhvh1f5FjYjJWVNyKlAKdSpSR
 wjvzg1KbhLiAHV05oQSHR7txMHJxfjpxAKmus0Hpqo6qVQ9FgrKiru9VHKocIpKU
 z66g+wEKUuY=
 =sxZJ
 -----END PGP SIGNATURE-----

Merge tag 'x86-boot-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 boot updates from Ingo Molnar:

 - Rework PE header generation, primarily to generate a modern, 4k
   aligned kernel image view with narrower W^X permissions.

 - Further refine init-lifetime annotations

 - Misc cleanups & fixes

* tag 'x86-boot-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  x86/boot: efistub: Assign global boot_params variable
  x86/boot: Rename conflicting 'boot_params' pointer to 'boot_params_ptr'
  x86/head/64: Move the __head definition to <asm/init.h>
  x86/head/64: Add missing __head annotation to startup_64_load_idt()
  x86/head/64: Mark 'startup_gdt[]' and 'startup_gdt_descr' as __initdata
  x86/boot: Harmonize the style of array-type parameter for fixup_pointer() calls
  x86/boot: Fix incorrect startup_gdt_descr.size
  x86/boot: Compile boot code with -std=gnu11 too
  x86/boot: Increase section and file alignment to 4k/512
  x86/boot: Split off PE/COFF .data section
  x86/boot: Drop PE/COFF .reloc section
  x86/boot: Construct PE/COFF .text section from assembler
  x86/boot: Derive file size from _edata symbol
  x86/boot: Define setup size in linker script
  x86/boot: Set EFI handover offset directly in header asm
  x86/boot: Grab kernel_info offset from zoffset header directly
  x86/boot: Drop references to startup_64
  x86/boot: Drop redundant code setting the root device
  x86/boot: Omit compression buffer from PE/COFF image memory footprint
  x86/boot: Remove the 'bugger off' message
  ...
2023-10-30 14:11:57 -10:00
Ard Biesheuvel
c03d21f05e Merge 3rd batch of EFI fixes into efi/urgent 2023-10-20 18:11:06 +02:00
Kirill A. Shutemov
50e782a86c efi/unaccepted: Fix soft lockups caused by parallel memory acceptance
Michael reported soft lockups on a system that has unaccepted memory.
This occurs when a user attempts to allocate and accept memory on
multiple CPUs simultaneously.

The root cause of the issue is that memory acceptance is serialized with
a spinlock, allowing only one CPU to accept memory at a time. The other
CPUs spin and wait for their turn, leading to starvation and soft lockup
reports.

To address this, the code has been modified to release the spinlock
while accepting memory. This allows for parallel memory acceptance on
multiple CPUs.

A newly introduced "accepting_list" keeps track of which memory is
currently being accepted. This is necessary to prevent parallel
acceptance of the same memory block. If a collision occurs, the lock is
released and the process is retried.

Such collisions should rarely occur. The main path for memory acceptance
is the page allocator, which accepts memory in MAX_ORDER chunks. As long
as MAX_ORDER is equal to or larger than the unit_size, collisions will
never occur because the caller fully owns the memory block being
accepted.

Aside from the page allocator, only memblock and deferered_free_range()
accept memory, but this only happens during boot.

The code has been tested with unit_size == 128MiB to trigger collisions
and validate the retry codepath.

Fixes: 2053bc57f3 ("efi: Add unaccepted memory support")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Michael Roth <michael.roth@amd.com
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Michael Roth <michael.roth@amd.com>
[ardb: drop unnecessary cpu_relax() call]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-10-20 18:10:06 +02:00
Ard Biesheuvel
50dcc2e0d6 x86/boot: efistub: Assign global boot_params variable
Now that the x86 EFI stub calls into some APIs exposed by the
decompressor (e.g., kaslr_get_random_long()), it is necessary to ensure
that the global boot_params variable is set correctly before doing so.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
2023-10-18 12:03:04 +02:00
Arnd Bergmann
b8466fe82b efi: move screen_info into efi init code
After the vga console no longer relies on global screen_info, there are
only two remaining use cases:

 - on the x86 architecture, it is used for multiple boot methods
   (bzImage, EFI, Xen, kexec) to commucate the initial VGA or framebuffer
   settings to a number of device drivers.

 - on other architectures, it is only used as part of the EFI stub,
   and only for the three sysfb framebuffers (simpledrm, simplefb, efifb).

Remove the duplicate data structure definitions by moving it into the
efi-init.c file that sets it up initially for the EFI case, leaving x86
as an exception that retains its own definition for non-EFI boots.

The added #ifdefs here are optional, I added them to further limit the
reach of screen_info to configurations that have at least one of the
users enabled.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231017093947.3627976-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-17 16:33:39 +02:00
Ard Biesheuvel
db7724134c x86/boot: efistub: Assign global boot_params variable
Now that the x86 EFI stub calls into some APIs exposed by the
decompressor (e.g., kaslr_get_random_long()), it is necessary to ensure
that the global boot_params variable is set correctly before doing so.

Note that the decompressor and the kernel proper carry conflicting
declarations for the global variable 'boot_params' so refer to it via an
alias to work around this.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-10-17 08:33:13 +02:00
Kuan-Wei Chiu
0d3ad19179 efi: fix memory leak in krealloc failure handling
In the previous code, there was a memory leak issue where the
previously allocated memory was not freed upon a failed krealloc
operation. This patch addresses the problem by releasing the old memory
before setting the pointer to NULL in case of a krealloc failure. This
ensures that memory is properly managed and avoids potential memory
leaks.

Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-10-13 12:32:37 +02:00
Nikolay Borisov
ff07186b4d x86/efistub: Don't try to print after ExitBootService()
setup_e820() is executed after UEFI's ExitBootService has been called.
This causes the firmware to throw an exception because the Console IO
protocol is supposed to work only during boot service environment. As
per UEFI 2.9, section 12.1:

 "This protocol is used to handle input and output of text-based
 information intended for the system user during the operation of code
 in the boot services environment."

So drop the diagnostic warning from this function. We might add back a
warning that is issued later when initializing the kernel itself.

Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-10-13 12:19:37 +02:00
Adrian Hunter
7cd34dd3c9 efi/unaccepted: do not let /proc/vmcore try to access unaccepted memory
Patch series "Do not try to access unaccepted memory", v2.

Support for unaccepted memory was added recently, refer commit
dcdfdd40fa ("mm: Add support for unaccepted memory"), whereby
a virtual machine may need to accept memory before it can be used.

Plug a few gaps where RAM is exposed without checking if it is
unaccepted memory.


This patch (of 2):

Support for unaccepted memory was added recently, refer commit
dcdfdd40fa ("mm: Add support for unaccepted memory"), whereby a virtual
machine may need to accept memory before it can be used.

Do not let /proc/vmcore try to access unaccepted memory because it can
cause the guest to fail.

For /proc/vmcore, which is read-only, this means a read or mmap of
unaccepted memory will return zeros.

Link: https://lkml.kernel.org/r/20230911112114.91323-1-adrian.hunter@intel.com
Link: https://lkml.kernel.org/r/20230911112114.91323-2-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:22 -07:00
Kirill A. Shutemov
8dbe33956d efi/unaccepted: Make sure unaccepted table is mapped
Unaccepted table is now allocated from EFI_ACPI_RECLAIM_MEMORY. It
translates into E820_TYPE_ACPI, which is not added to memblock and
therefore not mapped in the direct mapping.

This causes a crash on the first touch of the table.

Use memblock_add() to make sure that the table is mapped in direct
mapping.

Align the range to the nearest page borders. Ranges smaller than page
size are not mapped.

Fixes: e7761d827e ("efi/unaccepted: Use ACPI reclaim memory for unaccepted memory table")
Reported-by: Hongyu Ning <hongyu.ning@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-19 16:11:36 +00:00
Ard Biesheuvel
7e50262229 x86/efi: Disregard setup header of loaded image
The native EFI entrypoint does not take a struct boot_params from the
loader, but instead, it constructs one from scratch, using the setup
header data placed at the start of the image.

This setup header is placed in a way that permits legacy loaders to
manipulate the contents (i.e., to pass the kernel command line or the
address and size of an initial ramdisk), but EFI boot does not use it in
that way - it only copies the contents that were placed there at build
time, but EFI loaders will not (and should not) manipulate the setup
header to configure the boot. (Commit 63bf28ceb3 "efi: x86: Wipe
setup_data on pure EFI boot" deals with some of the fallout of using
setup_data in a way that breaks EFI boot.)

Given that none of the non-zero values that are copied from the setup
header into the EFI stub's struct boot_params are relevant to the boot
now that the EFI stub no longer enters via the legacy decompressor, the
copy can be omitted altogether.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230912090051.4014114-19-ardb@google.com
2023-09-15 11:18:40 +02:00
Ard Biesheuvel
5f51c5d0e9 x86/efi: Drop EFI stub .bss from .data section
Now that the EFI stub always zero inits its BSS section upon entry,
there is no longer a need to place the BSS symbols carried by the stub
into the .data section.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230912090051.4014114-18-ardb@google.com
2023-09-15 11:18:40 +02:00
Ard Biesheuvel
cf8e865810 arch: Remove Itanium (IA-64) architecture
The Itanium architecture is obsolete, and an informal survey [0] reveals
that any residual use of Itanium hardware in production is mostly HP-UX
or OpenVMS based. The use of Linux on Itanium appears to be limited to
enthusiasts that occasionally boot a fresh Linux kernel to see whether
things are still working as intended, and perhaps to churn out some
distro packages that are rarely used in practice.

None of the original companies behind Itanium still produce or support
any hardware or software for the architecture, and it is listed as
'Orphaned' in the MAINTAINERS file, as apparently, none of the engineers
that contributed on behalf of those companies (nor anyone else, for that
matter) have been willing to support or maintain the architecture
upstream or even be responsible for applying the odd fix. The Intel
firmware team removed all IA-64 support from the Tianocore/EDK2
reference implementation of EFI in 2018. (Itanium is the original
architecture for which EFI was developed, and the way Linux supports it
deviates significantly from other architectures.) Some distros, such as
Debian and Gentoo, still maintain [unofficial] ia64 ports, but many have
dropped support years ago.

While the argument is being made [1] that there is a 'for the common
good' angle to being able to build and run existing projects such as the
Grid Community Toolkit [2] on Itanium for interoperability testing, the
fact remains that none of those projects are known to be deployed on
Linux/ia64, and very few people actually have access to such a system in
the first place. Even if there were ways imaginable in which Linux/ia64
could be put to good use today, what matters is whether anyone is
actually doing that, and this does not appear to be the case.

There are no emulators widely available, and so boot testing Itanium is
generally infeasible for ordinary contributors. GCC still supports IA-64
but its compile farm [3] no longer has any IA-64 machines. GLIBC would
like to get rid of IA-64 [4] too because it would permit some overdue
code cleanups. In summary, the benefits to the ecosystem of having IA-64
be part of it are mostly theoretical, whereas the maintenance overhead
of keeping it supported is real.

So let's rip off the band aid, and remove the IA-64 arch code entirely.
This follows the timeline proposed by the Debian/ia64 maintainer [5],
which removes support in a controlled manner, leaving IA-64 in a known
good state in the most recent LTS release. Other projects will follow
once the kernel support is removed.

[0] https://lore.kernel.org/all/CAMj1kXFCMh_578jniKpUtx_j8ByHnt=s7S+yQ+vGbKt9ud7+kQ@mail.gmail.com/
[1] https://lore.kernel.org/all/0075883c-7c51-00f5-2c2d-5119c1820410@web.de/
[2] https://gridcf.org/gct-docs/latest/index.html
[3] https://cfarm.tetaneutral.net/machines/list/
[4] https://lore.kernel.org/all/87bkiilpc4.fsf@mid.deneb.enyo.de/
[5] https://lore.kernel.org/all/ff58a3e76e5102c94bb5946d99187b358def688a.camel@physik.fu-berlin.de/

Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11 08:13:17 +00:00
Ard Biesheuvel
e7761d827e efi/unaccepted: Use ACPI reclaim memory for unaccepted memory table
Kyril reports that crashkernels fail to work on confidential VMs that
rely on the unaccepted memory table, and this appears to be caused by
the fact that it is not considered part of the set of firmware tables
that the crashkernel needs to map.

This is an oversight, and a result of the use of the EFI_LOADER_DATA
memory type for this table. The correct memory type to use for any
firmware table is EFI_ACPI_RECLAIM_MEMORY (including ones created by the
EFI stub), even though the name suggests that is it specific to ACPI.
ACPI reclaim means that the memory is used by the firmware to expose
information to the operating system, but that the memory region has no
special significance to the firmware itself, and the OS is free to
reclaim the memory and use it as ordinary memory if it is not interested
in the contents, or if it has already consumed them. In Linux, this
memory is never reclaimed, but it is always covered by the kernel direct
map and generally made accessible as ordinary memory.

On x86, ACPI reclaim memory is translated into E820_ACPI, which the
kexec logic already recognizes as memory that the crashkernel may need
to to access, and so it will be mapped and accessible to the booting
crash kernel.

Fixes: 745e3ed85f ("efi/libstub: Implement support for unaccepted memory")
Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11 06:37:51 +00:00
Palmer Dabbelt
f578055558
Merge patch series "riscv: Introduce KASLR"
Alexandre Ghiti <alexghiti@rivosinc.com> says:

The following KASLR implementation allows to randomize the kernel mapping:

- virtually: we expect the bootloader to provide a seed in the device-tree
- physically: only implemented in the EFI stub, it relies on the firmware to
  provide a seed using EFI_RNG_PROTOCOL. arm64 has a similar implementation
  hence the patch 3 factorizes KASLR related functions for riscv to take
  advantage.

The new virtual kernel location is limited by the early page table that only
has one PUD and with the PMD alignment constraint, the kernel can only take
< 512 positions.

* b4-shazam-merge:
  riscv: libstub: Implement KASLR by using generic functions
  libstub: Fix compilation warning for rv32
  arm64: libstub: Move KASLR handling functions to kaslr.c
  riscv: Dump out kernel offset information on panic
  riscv: Introduce virtual kernel mapping KASLR

Link: https://lore.kernel.org/r/20230722123850.634544-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-09-08 11:25:13 -07:00
Alexandre Ghiti
b7ac4b8ee7
riscv: libstub: Implement KASLR by using generic functions
We can now use arm64 functions to handle the move of the kernel physical
mapping: if KASLR is enabled, we will try to get a random seed from the
firmware, if not possible, the kernel will be moved to a location that
suits its alignment constraints.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Song Shuai <songshuaishuai@tinylab.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20230722123850.634544-6-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-09-05 19:49:31 -07:00
Alexandre Ghiti
3c35d1a03c
libstub: Fix compilation warning for rv32
Fix the following warning which appears when compiled for rv32 by using
unsigned long type instead of u64.

../drivers/firmware/efi/libstub/efi-stub-helper.c: In function 'efi_kaslr_relocate_kernel':
../drivers/firmware/efi/libstub/efi-stub-helper.c:846:28: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  846 |                            (u64)_end < EFI_ALLOC_LIMIT) {

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Song Shuai <songshuaishuai@tinylab.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20230722123850.634544-5-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-09-05 19:49:30 -07:00
Alexandre Ghiti
6b56beb5f6
arm64: libstub: Move KASLR handling functions to kaslr.c
This prepares for riscv to use the same functions to handle the pĥysical
kernel move when KASLR is enabled.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Song Shuai <songshuaishuai@tinylab.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20230722123850.634544-4-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-09-05 19:49:29 -07:00
Linus Torvalds
461f35f014 drm for 6.6-rc1
core:
 - fix gfp flags in drmm_kmalloc
 
 gpuva:
 - add new generic GPU VA manager (for nouveau initially)
 
 syncobj:
 - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl
 
 dma-buf:
 - acquire resv lock for mmap() in exporters
 - support dma-buf self import automatically
 - docs fixes
 
 backlight:
 - fix fbdev interactions
 
 atomic:
 - improve logging
 
 prime:
 - remove struct gem_prim_mmap plus driver updates
 
 gem:
 - drm_exec: add locking over multiple GEM objects
 - fix lockdep checking
 
 fbdev:
 - make fbdev userspace interfaces optional
 - use linux device instead of fbdev device
 - use deferred i/o helper macros in various drivers
 - Make FB core selectable without drivers
 - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT
 - Add helper macros and Kconfig tokens for DMA-allocated framebuffer
 
 ttm:
 - support init_on_free
 - swapout fixes
 
 panel:
 - panel-edp: Support AUO B116XAB01.4
 - Support Visionox R66451 plus DT bindings
 - ld9040: Backlight support, magic improved,
           Kconfig fix
 - Convert to of_device_get_match_data()
 - Fix Kconfig dependencies
 - simple: Set bpc value to fix warning; Set connector type for AUO T215HVN01;
   Support Innolux G156HCE-L01 plus DT bindings
 - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings
 - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings
 - sitronix-st7789v: Support Inanbo T28CP45TN89 plus DT bindings;
          Support EDT ET028013DMA plus DT bindings; Various cleanups
 - edp: Add timings for N140HCA-EAC
 - Allow panels and touchscreens to power sequence together
 - Fix Innolux G156HCE-L01 LVDS clock
 
 bridge:
 - debugfs for chains support
 - dw-hdmi: Improve support for YUV420 bus format
            CEC suspend/resume, update EDID on HDMI detect
 - dw-mipi-dsi: Fix enable/disable of DSI controller
 - lt9611uxc: Use MODULE_FIRMWARE()
 - ps8640: Remove broken EDID code
 - samsung-dsim: Fix command transfer
 - tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
 - adv7511: Fix low refresh rate
 - anx7625: Switch to macros instead of hardcoded values
            locking fixes
 - tc358767: fix hardware delays
 - sitronix-st7789v: Support panel orientation; Support rotation
                     property; Add support for Jasonic
                     JT240MHQS-HWT-EK-E3 plus DT bindings
 
 amdgpu:
 - SDMA 6.1.0 support
 - HDP 6.1 support
 - SMUIO 14.0 support
 - PSP 14.0 support
 - IH 6.1 support
 - Lots of checkpatch cleanups
 - GFX 9.4.3 updates
 - Add USB PD and IFWI flashing documentation
 - GPUVM updates
 - RAS fixes
 - DRR fixes
 - FAMS fixes
 - Virtual display fixes
 - Soft IH fixes
 - SMU13 fixes
 - Rework PSP firmware loading for other IPs
 - Kernel doc fixes
 - DCN 3.0.1 fixes
 - LTTPR fixes
 - DP MST fixes
 - DCN 3.1.6 fixes
 - SMU 13.x fixes
 - PSP 13.x fixes
 - SubVP fixes
 - GC 9.4.3 fixes
 - Display bandwidth calculation fixes
 - VCN4 secure submission fixes
 - Allow building DC on RISC-V
 - Add visible FB info to bo_print_info
 - HBR3 fixes
 - GFX9 MCBP fix
 - GMC10 vmhub index fix
 - GMC11 vmhub index fix
 - Create a new doorbell manager
 - SR-IOV fixes
 - initial freesync panel replay support
 - revert zpos properly until igt regression is fixeed
 - use TTM to manage doorbell BAR
 - Expose both current and average power via hwmon if supported
 
 amdkfd:
 - Cleanup CRIU dma-buf handling
 - Use KIQ to unmap HIQ
 - GFX 9.4.3 debugger updates
 - GFX 9.4.2 debugger fixes
 - Enable cooperative groups fof gfx11
 - SVM fixes
 - Convert older APUs to use dGPU path like newer APUs
 - Drop IOMMUv2 path as it is no longer used
 - TBA fix for aldebaran
 
 i915:
 - ICL+ DSI modeset sequence
 - HDCP improvements
 - MTL display fixes and cleanups
 - HSW/BDW PSR1 restored
 - Init DDI ports in VBT order
 - General display refactors
 - Start using plane scale factor for relative data rate
 - Use shmem for dpt objects
 - Expose RPS thresholds in sysfs
 - Apply GuC SLPC min frequency softlimit correctly
 - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL
 - Fix a VMA UAF for multi-gt platform
 - Do not use stolen on MTL due to HW bug
 - Check HuC and GuC version compatibility on MTL
 - avoid infinite GPU waits due to premature release
   of request memory
 - Fixes and updates for GSC memory allocation
 - Display SDVO fixes
 - Take stolen handling out of FBC code
 - Make i915_coherent_map_type GT-centric
 - Simplify shmem_create_from_object map_type
 
 msm:
 - SM6125 MDSS support
 - DPU: SM6125 DPU support
 - DSI: runtime PM support, burst mode support
 - DSI PHY: SM6125 support in 14nm DSI PHY driver
 - GPU: prepare for a7xx
 - fix a690 firmware
 - disable relocs on a6xx and newer
 
 radeon:
 - Lots of checkpatch cleanups
 
 ast:
 - improve device-model detection
 - Represent BMV as virtual connector
 - Report DP connection status
 
 nouveau:
 - add new exec/bind interface to support Vulkan
 - document some getparam ioctls
 - improve VRAM detection
 - various fixes/cleanups
 - workraound DPCD issues
 
 ivpu:
 - MMU updates
 - debugfs support
 - Support vpu4
 
 virtio:
 - add sync object support
 
 atmel-hlcdc:
 - Support inverted pixclock polarity
 
 etnaviv:
 - runtime PM cleanups
 - hang handling fixes
 
 exynos:
 - use fbdev DMA helpers
 - fix possible NULL ptr dereference
 
 komeda:
 - always attach encoder
 
 omapdrm:
 - use fbdev DMA helpers
 ingenic:
 - kconfig regmap fixes
 
 loongson:
 - support display controller
 
 mediatek:
 - Small mtk-dpi cleanups
 - DisplayPort: support eDP and aux-bus
 - Fix coverity issues
 - Fix potential memory leak if vmap() fail
 
 mgag200:
 - minor fixes
 
 mxsfb:
 - support disabling overlay planes
 
 panfrost:
 - fix sync in IRQ handling
 
 ssd130x:
 - Support per-controller default resolution plus DT bindings
 - Reduce memory-allocation overhead
 - Improve intermediate buffer size computation
 - Fix allocation of temporary buffers
 - Fix pitch computation
 - Fix shadow plane allocation
 
 tegra:
 - use fbdev DMA helpers
 - Convert to devm_platform_ioremap_resource()
 - support bridge/connector
 - enable PM
 
 tidss:
 - Support TI AM625 plus DT bindings
 - Implement new connector model plus driver updates
 
 vkms:
 - improve write back support
 - docs fixes
 - support gamma LUT
 
 zynqmp-dpsub:
 - misc fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmTukSYACgkQDHTzWXnE
 hr6vnQ/+J7vBVkBr8JsaEV/twcZwzbNdpivsIagd8U83GQB50nDReVXbNx+Wo0/C
 WiGlrC6Sw3NVOGbkigd5IQ7fb5C/7RnBmzMi/iS7Qnk2uEqLqgV00VxfGwdm6wgr
 0gNB8zuu2xYphHz2K8LzwnmeQRdN+YUQpUa2wNzLO88IEkTvq5vx2rJEn5p9/3hp
 OxbbPBzpDRRPlkNFfVQCN8todbKdsPc4am81Eqgv7BJf21RFgQodPGW5koCYuv0w
 3m+PJh1KkfYAL974EsLr/pkY7yhhiZ6SlFLX8ssg4FyZl/Vthmc9bl14jRq/pqt4
 GBp8yrPq1XjrwXR8wv3MiwNEdANQ+KD9IoGlzLxqVgmEFRE+g4VzZZXeC3AIrTVP
 FPg4iLUrDrmj9RpJmbVqhq9X2jZs+EtRAFkJPrPbq2fItAD2a2dW4X3ISSnnTqDI
 6O2dVwuLCU6OfWnvN4bPW9p8CqRgR8Itqv1SI8qXooDy307YZu1eTUf5JAVwG/SW
 xbDEFVFlMPyFLm+KN5dv1csJKK21vWi9gLg8phK8mTWYWnqMEtJqbxbRzmdBEFmE
 pXKVu01P6ZqgBbaETpCljlOaEDdJnvO4W+o70MgBtpR2IWFMbMNO+iS0EmLZ6Vgj
 9zYZctpL+dMuHV0Of1GMkHFRHTMYEzW4tuctLIQfG13y4WzyczY=
 =CwV9
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "The drm core grew a new generic gpu virtual address manager, and new
  execution locking helpers. These are used by nouveau now to provide
  uAPI support for the userspace Vulkan driver. AMD had a bunch of new
  IP core support, loads of refactoring around fbdev, but mostly just
  the usual amount of stuff across the board.

  core:
   - fix gfp flags in drmm_kmalloc

  gpuva:
   - add new generic GPU VA manager (for nouveau initially)

  syncobj:
   - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl

  dma-buf:
   - acquire resv lock for mmap() in exporters
   - support dma-buf self import automatically
   - docs fixes

  backlight:
   - fix fbdev interactions

  atomic:
   - improve logging

  prime:
   - remove struct gem_prim_mmap plus driver updates

  gem:
   - drm_exec: add locking over multiple GEM objects
   - fix lockdep checking

  fbdev:
   - make fbdev userspace interfaces optional
   - use linux device instead of fbdev device
   - use deferred i/o helper macros in various drivers
   - Make FB core selectable without drivers
   - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT
   - Add helper macros and Kconfig tokens for DMA-allocated framebuffer

  ttm:
   - support init_on_free
   - swapout fixes

  panel:
   - panel-edp: Support AUO B116XAB01.4
   - Support Visionox R66451 plus DT bindings
   - ld9040:
      - Backlight support
      - magic improved
      - Kconfig fix
   - Convert to of_device_get_match_data()
   - Fix Kconfig dependencies
   - simple:
      - Set bpc value to fix warning
      - Set connector type for AUO T215HVN01
      - Support Innolux G156HCE-L01 plus DT bindings
   - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings
   - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings
   - sitronix-st7789v:
      - Support Inanbo T28CP45TN89 plus DT bindings
      - Support EDT ET028013DMA plus DT bindings
      - Various cleanups
   - edp: Add timings for N140HCA-EAC
   - Allow panels and touchscreens to power sequence together
   - Fix Innolux G156HCE-L01 LVDS clock

  bridge:
   - debugfs for chains support
   - dw-hdmi:
      - Improve support for YUV420 bus format
      - CEC suspend/resume
      - update EDID on HDMI detect
   - dw-mipi-dsi: Fix enable/disable of DSI controller
   - lt9611uxc: Use MODULE_FIRMWARE()
   - ps8640: Remove broken EDID code
   - samsung-dsim: Fix command transfer
   - tc358764:
      - Handle HS/VS polarity
      - Use BIT() macro
      - Various cleanups
   - adv7511: Fix low refresh rate
   - anx7625:
      - Switch to macros instead of hardcoded values
      - locking fixes
   - tc358767: fix hardware delays
   - sitronix-st7789v:
      - Support panel orientation
      - Support rotation property
      - Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings

  amdgpu:
   - SDMA 6.1.0 support
   - HDP 6.1 support
   - SMUIO 14.0 support
   - PSP 14.0 support
   - IH 6.1 support
   - Lots of checkpatch cleanups
   - GFX 9.4.3 updates
   - Add USB PD and IFWI flashing documentation
   - GPUVM updates
   - RAS fixes
   - DRR fixes
   - FAMS fixes
   - Virtual display fixes
   - Soft IH fixes
   - SMU13 fixes
   - Rework PSP firmware loading for other IPs
   - Kernel doc fixes
   - DCN 3.0.1 fixes
   - LTTPR fixes
   - DP MST fixes
   - DCN 3.1.6 fixes
   - SMU 13.x fixes
   - PSP 13.x fixes
   - SubVP fixes
   - GC 9.4.3 fixes
   - Display bandwidth calculation fixes
   - VCN4 secure submission fixes
   - Allow building DC on RISC-V
   - Add visible FB info to bo_print_info
   - HBR3 fixes
   - GFX9 MCBP fix
   - GMC10 vmhub index fix
   - GMC11 vmhub index fix
   - Create a new doorbell manager
   - SR-IOV fixes
   - initial freesync panel replay support
   - revert zpos properly until igt regression is fixeed
   - use TTM to manage doorbell BAR
   - Expose both current and average power via hwmon if supported

  amdkfd:
   - Cleanup CRIU dma-buf handling
   - Use KIQ to unmap HIQ
   - GFX 9.4.3 debugger updates
   - GFX 9.4.2 debugger fixes
   - Enable cooperative groups fof gfx11
   - SVM fixes
   - Convert older APUs to use dGPU path like newer APUs
   - Drop IOMMUv2 path as it is no longer used
   - TBA fix for aldebaran

  i915:
   - ICL+ DSI modeset sequence
   - HDCP improvements
   - MTL display fixes and cleanups
   - HSW/BDW PSR1 restored
   - Init DDI ports in VBT order
   - General display refactors
   - Start using plane scale factor for relative data rate
   - Use shmem for dpt objects
   - Expose RPS thresholds in sysfs
   - Apply GuC SLPC min frequency softlimit correctly
   - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL
   - Fix a VMA UAF for multi-gt platform
   - Do not use stolen on MTL due to HW bug
   - Check HuC and GuC version compatibility on MTL
   - avoid infinite GPU waits due to premature release of request memory
   - Fixes and updates for GSC memory allocation
   - Display SDVO fixes
   - Take stolen handling out of FBC code
   - Make i915_coherent_map_type GT-centric
   - Simplify shmem_create_from_object map_type

  msm:
   - SM6125 MDSS support
   - DPU: SM6125 DPU support
   - DSI: runtime PM support, burst mode support
   - DSI PHY: SM6125 support in 14nm DSI PHY driver
   - GPU: prepare for a7xx
   - fix a690 firmware
   - disable relocs on a6xx and newer

  radeon:
   - Lots of checkpatch cleanups

  ast:
   - improve device-model detection
   - Represent BMV as virtual connector
   - Report DP connection status

  nouveau:
   - add new exec/bind interface to support Vulkan
   - document some getparam ioctls
   - improve VRAM detection
   - various fixes/cleanups
   - workraound DPCD issues

  ivpu:
   - MMU updates
   - debugfs support
   - Support vpu4

  virtio:
   - add sync object support

  atmel-hlcdc:
   - Support inverted pixclock polarity

  etnaviv:
   - runtime PM cleanups
   - hang handling fixes

  exynos:
   - use fbdev DMA helpers
   - fix possible NULL ptr dereference

  komeda:
   - always attach encoder

  omapdrm:
   - use fbdev DMA helpers
ingenic:
   - kconfig regmap fixes

  loongson:
   - support display controller

  mediatek:
   - Small mtk-dpi cleanups
   - DisplayPort: support eDP and aux-bus
   - Fix coverity issues
   - Fix potential memory leak if vmap() fail

  mgag200:
   - minor fixes

  mxsfb:
   - support disabling overlay planes

  panfrost:
   - fix sync in IRQ handling

  ssd130x:
   - Support per-controller default resolution plus DT bindings
   - Reduce memory-allocation overhead
   - Improve intermediate buffer size computation
   - Fix allocation of temporary buffers
   - Fix pitch computation
   - Fix shadow plane allocation

  tegra:
   - use fbdev DMA helpers
   - Convert to devm_platform_ioremap_resource()
   - support bridge/connector
   - enable PM

  tidss:
   - Support TI AM625 plus DT bindings
   - Implement new connector model plus driver updates

  vkms:
   - improve write back support
   - docs fixes
   - support gamma LUT

  zynqmp-dpsub:
   - misc fixes"

* tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm: (1327 commits)
  drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map()
  drm/tests/drm_kunit_helpers: Place correct function name in the comment header
  drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitly
  drm/nouveau: uvmm: fix unset region pointer on remap
  drm/nouveau: sched: avoid job races between entities
  drm/i915: Fix HPD polling, reenabling the output poll work as needed
  drm: Add an HPD poll helper to reschedule the poll work
  drm/i915: Fix TLB-Invalidation seqno store
  drm/ttm/tests: Fix type conversion in ttm_pool_test
  drm/msm/a6xx: Bail out early if setting GPU OOB fails
  drm/msm/a6xx: Move LLC accessors to the common header
  drm/msm/a6xx: Introduce a6xx_llc_read
  drm/ttm/tests: Require MMU when testing
  drm/panel: simple: Fix Innolux G156HCE-L01 LVDS clock
  Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0""
  drm/amdgpu: Add memory vendor information
  drm/amd: flush any delayed gfxoff on suspend entry
  drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
  drm/amdgpu: Remove gfxoff check in GFX v9.4.3
  drm/amd/pm: Update pci link speed for smu v13.0.6
  ...
2023-08-30 13:34:34 -07:00
Linus Torvalds
d7dd9b449f EFI updates for v6.6
- one bugfix for x86 mixed mode that did not make it into v6.5
 - first pass of cleanup for the EFI runtime wrappers
 - some cosmetic touchups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZOx+LAAKCRAwbglWLn0t
 XKzLAPwK8wIZ14NlC55NCtdvKEzK3N/muxKRqg2MAHfrbnREFgD9GgUxSbIEU2gz
 BXQM9GLaP86qCXkZlPYktjP6RVfDYAk=
 =e2mx
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:
 "This primarily covers some cleanup work on the EFI runtime wrappers,
  which are shared between all EFI architectures except Itanium, and
  which provide some level of isolation to prevent faults occurring in
  the firmware code (which runs at the same privilege level as the
  kernel) from bringing down the system.

  Beyond that, there is a fix that did not make it into v6.5, and some
  doc fixes and dead code cleanup.

   - one bugfix for x86 mixed mode that did not make it into v6.5

   - first pass of cleanup for the EFI runtime wrappers

   - some cosmetic touchups"

* tag 'efi-next-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efistub: Fix PCI ROM preservation in mixed mode
  efi/runtime-wrappers: Clean up white space and add __init annotation
  acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers
  efi/runtime-wrappers: Don't duplicate setup/teardown code
  efi/runtime-wrappers: Remove duplicated macro for service returning void
  efi/runtime-wrapper: Move workqueue manipulation out of line
  efi/runtime-wrappers: Use type safe encapsulation of call arguments
  efi/riscv: Move EFI runtime call setup/teardown helpers out of line
  efi/arm64: Move EFI runtime call setup/teardown helpers out of line
  efi/riscv: libstub: Fix comment about absolute relocation
  efi: memmap: Remove kernel-doc warnings
  efi: Remove unused extern declaration efi_lookup_mapped_addr()
2023-08-28 16:25:45 -07:00
Ard Biesheuvel
b691118f2c Merge remote-tracking branch 'linux-efi/urgent' into efi/next 2023-08-28 12:57:05 +02:00
Mikel Rychliski
8b94da9255 x86/efistub: Fix PCI ROM preservation in mixed mode
preserve_pci_rom_image() was accessing the romsize field in
efi_pci_io_protocol_t directly instead of using the efi_table_attr()
helper. This prevents the ROM image from being saved correctly during a
mixed mode boot.

Fixes: 2c3625cb9f ("efi/x86: Fold __setup_efi_pci32() and __setup_efi_pci64() into one function")
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-24 11:22:12 +02:00
Ard Biesheuvel
a14198dfe9 efi/runtime-wrappers: Clean up white space and add __init annotation
Some cosmetic changes as well as a missing __init annotation.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-22 10:39:26 +02:00
Ard Biesheuvel
5894cf571e acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers
Instead of bypassing the kernel's adaptation layer for performing EFI
runtime calls, wire up ACPI PRM handling into it. This means these calls
can no longer occur concurrently with EFI runtime calls, and will be
made from the EFI runtime workqueue. It also means any page faults
occurring during PRM handling will be identified correctly as
originating in firmware code.

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-22 10:39:26 +02:00
Ard Biesheuvel
3c17ae4161 efi/runtime-wrappers: Don't duplicate setup/teardown code
Avoid duplicating the EFI arch setup and teardown routine calls numerous
times in efi_call_rts(). Instead, expand the efi_call_virt_pointer()
macro into efi_call_rts(), taking the pre and post parts out of the
switch.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-22 10:39:26 +02:00
Ard Biesheuvel
e38abdab44 efi/runtime-wrappers: Remove duplicated macro for service returning void
__efi_call_virt() exists as an alternative for efi_call_virt() for the
sole reason that ResetSystem() returns void, and so we cannot use a call
to it in the RHS of an assignment.

Given that there is only a single user, let's drop the macro, and expand
it into the caller. That way, the remaining macro can be tightened
somewhat in terms of type safety too.

Note that the use of typeof() on the runtime service invocation does not
result in an actual call being made, but it does require a few pointer
types to be fixed up and converted into the proper function pointer
prototypes.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-22 10:39:26 +02:00
Ard Biesheuvel
c99ba6e546 efi/runtime-wrapper: Move workqueue manipulation out of line
efi_queue_work() is a macro that implements the non-trivial manipulation
of the EFI runtime workqueue and completion data structure, most of
which is generic, and could be shared between all the users of the
macro. So move it out of the macro and into a new helper function.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-21 17:59:54 +02:00
Ard Biesheuvel
c7c7bce093 efi/runtime-wrappers: Use type safe encapsulation of call arguments
The current code that marshalls the EFI runtime call arguments to hand
them off to a async helper does so in a type unsafe and slightly messy
manner - everything is cast to void* except for some integral types that
are passed by reference and dereferenced on the receiver end.

Let's clean this up a bit, and record the arguments of each runtime
service invocation exactly as they are issued, in a manner that permits
the compiler to check the types of the arguments at both ends.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-21 17:59:54 +02:00
Ard Biesheuvel
d8ea2ffd01 efi/riscv: Move EFI runtime call setup/teardown helpers out of line
Only the arch_efi_call_virt() macro that some architectures override
needs to be a macro, given that it is variadic and encapsulates calls
via function pointers that have different prototypes.

The associated setup and teardown code are not special in this regard,
and don't need to be instantiated at each call site. So turn them into
ordinary C functions and move them out of line.

Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-21 17:59:25 +02:00
Ard Biesheuvel
a1b87d54f4 x86/efistub: Avoid legacy decompressor when doing EFI boot
The bare metal decompressor code was never really intended to run in a
hosted environment such as the EFI boot services, and does a few things
that are becoming problematic in the context of EFI boot now that the
logo requirements are getting tighter: EFI executables will no longer be
allowed to consist of a single executable section that is mapped with
read, write and execute permissions if they are intended for use in a
context where Secure Boot is enabled (and where Microsoft's set of
certificates is used, i.e., every x86 PC built to run Windows).

To avoid stepping on reserved memory before having inspected the E820
tables, and to ensure the correct placement when running a kernel build
that is non-relocatable, the bare metal decompressor moves its own
executable image to the end of the allocation that was reserved for it,
in order to perform the decompression in place. This means the region in
question requires both write and execute permissions, which either need
to be given upfront (which EFI will no longer permit), or need to be
applied on demand using the existing page fault handling framework.

However, the physical placement of the kernel is usually randomized
anyway, and even if it isn't, a dedicated decompression output buffer
can be allocated anywhere in memory using EFI APIs when still running in
the boot services, given that EFI support already implies a relocatable
kernel. This means that decompression in place is never necessary, nor
is moving the compressed image from one end to the other.

Since EFI already maps all of memory 1:1, it is also unnecessary to
create new page tables or handle page faults when decompressing the
kernel. That means there is also no need to replace the special
exception handlers for SEV. Generally, there is little need to do
any of the things that the decompressor does beyond

- initialize SEV encryption, if needed,
- perform the 4/5 level paging switch, if needed,
- decompress the kernel
- relocate the kernel

So do all of this from the EFI stub code, and avoid the bare metal
decompressor altogether.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-24-ardb@kernel.org
2023-08-07 21:07:43 +02:00
Ard Biesheuvel
31c77a5099 x86/efistub: Perform SNP feature test while running in the firmware
Before refactoring the EFI stub boot flow to avoid the legacy bare metal
decompressor, duplicate the SNP feature check in the EFI stub before
handing over to the kernel proper.

The SNP feature check can be performed while running under the EFI boot
services, which means it can force the boot to fail gracefully and
return an error to the bootloader if the loaded kernel does not
implement support for all the features that the hypervisor enabled.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-23-ardb@kernel.org
2023-08-07 21:03:53 +02:00
Ard Biesheuvel
bc5ddceff4 efi/libstub: Add limit argument to efi_random_alloc()
x86 will need to limit the kernel memory allocation to the lowest 512
MiB of memory, to match the behavior of the existing bare metal KASLR
physical randomization logic. So in preparation for that, add a limit
parameter to efi_random_alloc() and wire it up.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-22-ardb@kernel.org
2023-08-07 21:01:46 +02:00
Ard Biesheuvel
11078876b7 x86/efistub: Prefer EFI memory attributes protocol over DXE services
Currently, the EFI stub relies on DXE services in some cases to clear
non-execute restrictions from page allocations that need to be
executable. This is dodgy, because DXE services are not specified by
UEFI but by PI, and they are not intended for consumption by OS loaders.
However, no alternative existed at the time.

Now, there is a new UEFI protocol that should be used instead, so if it
exists, prefer it over the DXE services calls.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-18-ardb@kernel.org
2023-08-07 20:54:15 +02:00
Ard Biesheuvel
cb1c9e02b0 x86/efistub: Perform 4/5 level paging switch from the stub
In preparation for updating the EFI stub boot flow to avoid the bare
metal decompressor code altogether, implement the support code for
switching between 4 and 5 levels of paging before jumping to the kernel
proper.

Reuse the newly refactored trampoline that the bare metal decompressor
uses, but relies on EFI APIs to allocate 32-bit addressable memory and
remap it with the appropriate permissions. Given that the bare metal
decompressor will no longer call into the trampoline if the number of
paging levels is already set correctly, it is no longer needed to remove
NX restrictions from the memory range where this trampoline may end up.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20230807162720.545787-17-ardb@kernel.org
2023-08-07 20:52:32 +02:00
Ard Biesheuvel
d7156b986d x86/efistub: Clear BSS in EFI handover protocol entrypoint
The so-called EFI handover protocol is value-add from the distros that
permits a loader to simply copy a PE kernel image into memory and call
an alternative entrypoint that is described by an embedded boot_params
structure.

Most implementations of this protocol do not bother to check the PE
header for minimum alignment, section placement, etc, and therefore also
don't clear the image's BSS, or even allocate enough memory for it.

Allocating more memory on the fly is rather difficult, but at least
clear the BSS region explicitly when entering in this manner, so that
the EFI stub code does not get confused by global variables that were
not zero-initialized correctly.

When booting in mixed mode, this BSS clearing must occur before any
global state is created, so clear it in the 32-bit asm entry point.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-7-ardb@kernel.org
2023-08-07 20:40:36 +02:00
Ard Biesheuvel
df9215f152 x86/efistub: Simplify and clean up handover entry code
Now that the EFI entry code in assembler is only used by the optional
and deprecated EFI handover protocol, and given that the EFI stub C code
no longer returns to it, most of it can simply be dropped.

While at it, clarify the symbol naming, by merging efi_main() and
efi_stub_entry(), making the latter the shared entry point for all
different boot modes that enter via the EFI stub.

The efi32_stub_entry() and efi64_stub_entry() names are referenced
explicitly by the tooling that populates the setup header, so these must
be retained, but can be emitted as aliases of efi_stub_entry() where
appropriate.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-5-ardb@kernel.org
2023-08-07 20:37:09 +02:00
Ard Biesheuvel
d2d7a54f69 x86/efistub: Branch straight to kernel entry point from C code
Instead of returning to the calling code in assembler that does nothing
more than perform an indirect call with the boot_params pointer in
register ESI/RSI, perform the jump directly from the EFI stub C code.
This will allow the asm entrypoint code to be dropped entirely in
subsequent patches.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-4-ardb@kernel.org
2023-08-07 20:36:06 +02:00
Xiao Wang
f6e6e95ce1 efi/riscv: libstub: Fix comment about absolute relocation
We don't want absolute symbols references in the stub, so fix the double
negation in the comment.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-03 15:40:22 +02:00
Daniel Vetter
6c7f27441d drm-misc-next for v6.6:
UAPI Changes:
 
  * fbdev:
    * Make fbdev userspace interfaces optional; only leaves the
      framebuffer console active
 
  * prime:
    * Support dma-buf self-import for all drivers automatically: improves
      support for many userspace compositors
 
 Cross-subsystem Changes:
 
  * backlight:
    * Fix interaction with fbdev in several drivers
 
  * base: Convert struct platform.remove to return void; part of a larger,
    tree-wide effort
 
  * dma-buf: Acquire reservation lock for mmap() in exporters; part
    of an on-going effort to simplify locking around dma-bufs
 
  * fbdev:
    * Use Linux device instead of fbdev device in many places
    * Use deferred-I/O helper macros in various drivers
 
  * i2c: Convert struct i2c from .probe_new to .probe; part of a larger,
    tree-wide effort
 
  * video:
    * Avoid including <linux/screen_info.h>
 
 Core Changes:
 
  * atomic:
    * Improve logging
 
  * prime:
    * Remove struct drm_driver.gem_prime_mmap plus driver updates: all
      drivers now implement this callback with drm_gem_prime_mmap()
 
  * gem:
    * Support execution contexts: provides locking over multiple GEM
      objects
 
  * ttm:
    * Support init_on_free
    * Swapout fixes
 
 Driver Changes:
 
  * accel:
    * ivpu: MMU updates; Support debugfs
 
  * ast:
    * Improve device-model detection
    * Cleanups
 
  * bridge:
    * dw-hdmi: Improve support for YUV420 bus format
    * dw-mipi-dsi: Fix enable/disable of DSI controller
    * lt9611uxc: Use MODULE_FIRMWARE()
    * ps8640: Remove broken EDID code
    * samsung-dsim: Fix command transfer
    * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
    * Cleanups
 
  * ingenic:
    * Kconfig REGMAP fixes
 
  * loongson:
    * Support display controller
 
  * mgag200:
    * Minor fixes
 
  * mxsfb:
    * Support disabling overlay planes
 
  * nouveau:
    * Improve VRAM detection
    * Various fixes and cleanups
 
  * panel:
    * panel-edp: Support AUO B116XAB01.4
    * Support Visionox R66451 plus DT bindings
    * Cleanups
 
  * ssd130x:
    * Support per-controller default resolution plus DT bindings
    * Reduce memory-allocation overhead
    * Cleanups
 
  * tidss:
    * Support TI AM625 plus DT bindings
    * Implement new connector model plus driver updates
 
  * vkms
    * Improve write-back support
    * Documentation fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmSvvRAACgkQaA3BHVML
 eiNpGQgAs8jq1XjN9t8jZsdgXnoCbkZyVUI2NO0HwoVwpRCLgbXp5AX5qq2oRciE
 TBhe4Fceh/ZsYqHTZQahnguxgRKM5JgXwbI4Z0iiOVcqasNbycaKAqipxJJ7kdo1
 qPhGCbgQFVX7oIq2xjfXehh6O0SYX+R9r88X8dMJxMYv/pcLwOHG74kS040WOcQq
 uATgcnobOf/D8ZmlqvfKGAeTUoFo/RSR2Uhlauka58qgeUbicrTELZT2barY9d+k
 as6U5vv4wx2zMklTkjrlkMpAT1ZpbB9d3jGHwL27VEnjlfd3wV2bdH7Dzn9qZRf/
 gn0ALg/b3u5yBWk/k7YBvijXyNcH6Q==
 =bBuG
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v6.6:

UAPI Changes:

 * fbdev:
   * Make fbdev userspace interfaces optional; only leaves the
     framebuffer console active

 * prime:
   * Support dma-buf self-import for all drivers automatically: improves
     support for many userspace compositors

Cross-subsystem Changes:

 * backlight:
   * Fix interaction with fbdev in several drivers

 * base: Convert struct platform.remove to return void; part of a larger,
   tree-wide effort

 * dma-buf: Acquire reservation lock for mmap() in exporters; part
   of an on-going effort to simplify locking around dma-bufs

 * fbdev:
   * Use Linux device instead of fbdev device in many places
   * Use deferred-I/O helper macros in various drivers

 * i2c: Convert struct i2c from .probe_new to .probe; part of a larger,
   tree-wide effort

 * video:
   * Avoid including <linux/screen_info.h>

Core Changes:

 * atomic:
   * Improve logging

 * prime:
   * Remove struct drm_driver.gem_prime_mmap plus driver updates: all
     drivers now implement this callback with drm_gem_prime_mmap()

 * gem:
   * Support execution contexts: provides locking over multiple GEM
     objects

 * ttm:
   * Support init_on_free
   * Swapout fixes

Driver Changes:

 * accel:
   * ivpu: MMU updates; Support debugfs

 * ast:
   * Improve device-model detection
   * Cleanups

 * bridge:
   * dw-hdmi: Improve support for YUV420 bus format
   * dw-mipi-dsi: Fix enable/disable of DSI controller
   * lt9611uxc: Use MODULE_FIRMWARE()
   * ps8640: Remove broken EDID code
   * samsung-dsim: Fix command transfer
   * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
   * Cleanups

 * ingenic:
   * Kconfig REGMAP fixes

 * loongson:
   * Support display controller

 * mgag200:
   * Minor fixes

 * mxsfb:
   * Support disabling overlay planes

 * nouveau:
   * Improve VRAM detection
   * Various fixes and cleanups

 * panel:
   * panel-edp: Support AUO B116XAB01.4
   * Support Visionox R66451 plus DT bindings
   * Cleanups

 * ssd130x:
   * Support per-controller default resolution plus DT bindings
   * Reduce memory-allocation overhead
   * Cleanups

 * tidss:
   * Support TI AM625 plus DT bindings
   * Implement new connector model plus driver updates

 * vkms
   * Improve write-back support
   * Documentation fixes

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230713090830.GA23281@linux-uq9g
2023-07-17 15:37:57 +02:00
Thomas Zimmermann
8b0d13545b efi: Do not include <linux/screen_info.h> from EFI header
The header file <linux/efi.h> does not need anything from
<linux/screen_info.h>. Declare struct screen_info and remove
the include statements. Update a number of source files that
require struct screen_info's definition.

v2:
	* update loongarch (Jingfeng)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230706104852.27451-2-tzimmermann@suse.de
2023-07-08 20:26:36 +02:00
Linus Torvalds
937d96d2d5 EFI updates for v6.5
Although some more stuff is brewing, the EFI changes that are ready for
 mainline are few, so not a lot to pull this cycle:
 
 - improve the PCI DMA paranoia logic in the EFI stub
 - some constification changes
 - add statfs support to efivarfs
 - allow user space to enumerate updatable firmware resources without
   CAP_SYS_ADMIN
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZJ1jIwAKCRAwbglWLn0t
 XDs8AP9PAAWIgukyXkYpoxabaQQK1Pqw6Zv63XAcNYBHa4zjHwD/UTcYviQIlI0B
 Rfj4i8pDQVVfReSI+lKWvhXfRQ5Qbgs=
 =w6zX
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:
 "Although some more stuff is brewing, the EFI changes that are ready
  for mainline are few this cycle:

   - improve the PCI DMA paranoia logic in the EFI stub

   - some constification changes

   - add statfs support to efivarfs

   - allow user space to enumerate updatable firmware resources without
     CAP_SYS_ADMIN"

* tag 'efi-next-for-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi/libstub: Disable PCI DMA before grabbing the EFI memory map
  efi/esrt: Allow ESRT access without CAP_SYS_ADMIN
  efivarfs: expose used and total size
  efi: make kobj_type structure constant
  efi: x86: make kobj_type structure constant
2023-06-30 21:35:52 -07:00
Ard Biesheuvel
2e28a798c3 efi/libstub: Disable PCI DMA before grabbing the EFI memory map
Currently, the EFI stub will disable PCI DMA as the very last thing it
does before calling ExitBootServices(), to avoid interfering with the
firmware's normal operation as much as possible.

However, the stub will invoke DisconnectController() on all endpoints
downstream of the PCI bridges it disables, and this may affect the
layout of the EFI memory map, making it substantially more likely that
ExitBootServices() will fail the first time around, and that the EFI
memory map needs to be reloaded.

This, in turn, increases the likelihood that the slack space we
allocated is insufficient (and we can no longer allocate memory via boot
services after having called ExitBootServices() once), causing the
second call to GetMemoryMap (and therefore the boot) to fail. This makes
the PCI DMA disable feature a bit more fragile than it already is, so
let's make it more robust, by allocating the space for the EFI memory
map after disabling PCI DMA.

Fixes: 4444f8541d ("efi: Allow disabling PCI busmastering on bridges during boot")
Reported-by: Glenn Washburn <development@efficientek.com>
Acked-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-06-27 12:27:06 +02:00
Linus Torvalds
2c96136a3f - Add support for unaccepted memory as specified in the UEFI spec v2.9.
The gist of it all is that Intel TDX and AMD SEV-SNP confidential
   computing guests define the notion of accepting memory before using it
   and thus preventing a whole set of attacks against such guests like
   memory replay and the like.
 
   There are a couple of strategies of how memory should be accepted
   - the current implementation does an on-demand way of accepting.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSZ0f4ACgkQEsHwGGHe
 VUpasw//RKoNW9HSU1csY+XnG9uuaT6QKgji+gIEZWWIGPO9iibvbBj6P5WxJE8T
 fe7yb6CGa6d6thoU0v+mQGVVvCd7OjCFwPD5wAo4mXToD7Ig+4mI6jMkaKifqa2f
 N1Uuy8u/zQnGyWrP5Y//WH5bJYfsmds4UGwXI2nLvKlhE7MG90/ePjt7iqnnwZsy
 waLp6a0Q1VeOvnfRszFLHZw/SoER5RSJ4qeVqttkFNmPPEKMK1Kirrl2poR56OQJ
 nMr6LqVtD7erlSJ36VRXOKzLI443A4iIEIg/wBjIOU6L5ZEWJGNqtCDnIqFJ6+TM
 XatsejfRYkkMZH0qXtX9+M0u+HJHbZPCH5rEcA21P3Nbd7od/ANq91qCGoMjtUZ4
 7pZohMG8M6IDvkLiOb8fQVkR5k/9Jbk8UvdN/8jdPx1ERxYMFO3BDvJpV2gzrW4B
 KYtFTPR7j2nY3eKfDpe3flanqYzKUBsKoTlLnlH7UHaiMZ2idwG8AQjlrhC/erCq
 /Lq1LXt4Mq46FyHABc+PSHytu0WWj1nBUftRt+lviY/Uv7TlkBldOTT7wm7itsfF
 HUCTfLWl0CJXKPq8rbbZhAG/exN6Ay6MO3E3OcNq8A72E5y4cXenuG3ic/0tUuOu
 FfjpiMk35qE2Qb4hnj1YtF3XINtd1MpKcuwzGSzEdv9s3J7hrS0=
 =FS95
 -----END PGP SIGNATURE-----

Merge tag 'x86_cc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 confidential computing update from Borislav Petkov:

 - Add support for unaccepted memory as specified in the UEFI spec v2.9.

   The gist of it all is that Intel TDX and AMD SEV-SNP confidential
   computing guests define the notion of accepting memory before using
   it and thus preventing a whole set of attacks against such guests
   like memory replay and the like.

   There are a couple of strategies of how memory should be accepted -
   the current implementation does an on-demand way of accepting.

* tag 'x86_cc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  virt: sevguest: Add CONFIG_CRYPTO dependency
  x86/efi: Safely enable unaccepted memory in UEFI
  x86/sev: Add SNP-specific unaccepted memory support
  x86/sev: Use large PSC requests if applicable
  x86/sev: Allow for use of the early boot GHCB for PSC requests
  x86/sev: Put PSC struct on the stack in prep for unaccepted memory support
  x86/sev: Fix calculation of end address based on number of pages
  x86/tdx: Add unaccepted memory support
  x86/tdx: Refactor try_accept_one()
  x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub
  efi/unaccepted: Avoid load_unaligned_zeropad() stepping into unaccepted memory
  efi: Add unaccepted memory support
  x86/boot/compressed: Handle unaccepted memory
  efi/libstub: Implement support for unaccepted memory
  efi/x86: Get full memory map in allocate_e820()
  mm: Add support for unaccepted memory
2023-06-26 15:32:39 -07:00
Linus Torvalds
69cbeb61ff Revert "efi: random: refresh non-volatile random seed when RNG is initialized"
This reverts commit e7b813b32a (and the
subsequent fix for it: 41a15855c1 "efi: random: fix NULL-deref when
refreshing seed").

It turns otu to cause non-deterministic boot stalls on at least a HP
6730b laptop.

Reported-and-bisected-by: Sami Korkalainen <sami.korkalainen@proton.me>
Link: https://lore.kernel.org/all/GQUnKz2al3yke5mB2i1kp3SzNHjK8vi6KJEh7rnLrOQ24OrlljeCyeWveLW9pICEmB9Qc8PKdNt3w1t_g3-Uvxq1l8Wj67PpoMeWDoH8PKk=@proton.me/
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-21 10:58:46 -07:00
Dionna Glaze
c0461bd166 x86/efi: Safely enable unaccepted memory in UEFI
The UEFI v2.9 specification includes a new memory type to be used in
environments where the OS must accept memory that is provided from its
host. Before the introduction of this memory type, all memory was
accepted eagerly in the firmware. In order for the firmware to safely
stop accepting memory on the OS's behalf, the OS must affirmatively
indicate support to the firmware. This is only a problem for AMD
SEV-SNP, since Linux has had support for it since 5.19. The other
technology that can make use of unaccepted memory, Intel TDX, does not
yet have Linux support, so it can strictly require unaccepted memory
support as a dependency of CONFIG_TDX and not require communication with
the firmware.

Enabling unaccepted memory requires calling a 0-argument enablement
protocol before ExitBootServices. This call is only made if the kernel
is compiled with UNACCEPTED_MEMORY=y

This protocol will be removed after the end of life of the first LTS
that includes it, in order to give firmware implementations an
expiration date for it. When the protocol is removed, firmware will
strictly infer that a SEV-SNP VM is running an OS that supports the
unaccepted memory type. At the earliest convenience, when unaccepted
memory support is added to Linux, SEV-SNP may take strict dependence in
it. After the firmware removes support for the protocol, this should be
reverted.

  [tl: address some checkscript warnings]

Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/0d5f3d9a20b5cf361945b7ab1263c36586a78a42.1686063086.git.thomas.lendacky@amd.com
2023-06-06 18:32:59 +02:00
Kirill A. Shutemov
c211c19e80 efi/unaccepted: Avoid load_unaligned_zeropad() stepping into unaccepted memory
load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
The unwanted loads are typically harmless. But, they might be made to
totally unrelated or even unmapped memory. load_unaligned_zeropad()
relies on exception fixup (#PF, #GP and now #VE) to recover from these
unwanted loads.

But, this approach does not work for unaccepted memory. For TDX, a load
from unaccepted memory will not lead to a recoverable exception within
the guest. The guest will exit to the VMM where the only recourse is to
terminate the guest.

There are two parts to fix this issue and comprehensively avoid access
to unaccepted memory. Together these ensure that an extra "guard" page
is accepted in addition to the memory that needs to be used.

1. Implicitly extend the range_contains_unaccepted_memory(start, end)
   checks up to end+unit_size if 'end' is aligned on a unit_size
   boundary.
2. Implicitly extend accept_memory(start, end) to end+unit_size if 'end'
   is aligned on a unit_size boundary.

Side note: This leads to something strange. Pages which were accepted
	   at boot, marked by the firmware as accepted and will never
	   _need_ to be accepted might be on unaccepted_pages list
	   This is a cue to ensure that the next page is accepted
	   before 'page' can be used.

This is an actual, real-world problem which was discovered during TDX
testing.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230606142637.5171-7-kirill.shutemov@linux.intel.com
2023-06-06 17:27:08 +02:00
Kirill A. Shutemov
2053bc57f3 efi: Add unaccepted memory support
efi_config_parse_tables() reserves memory that holds unaccepted memory
configuration table so it won't be reused by page allocator.

Core-mm requires few helpers to support unaccepted memory:

 - accept_memory() checks the range of addresses against the bitmap and
   accept memory if needed.

 - range_contains_unaccepted_memory() checks if anything within the
   range requires acceptance.

Architectural code has to provide efi_get_unaccepted_table() that
returns pointer to the unaccepted memory configuration table.

arch_accept_memory() handles arch-specific part of memory acceptance.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230606142637.5171-6-kirill.shutemov@linux.intel.com
2023-06-06 17:22:20 +02:00
Kirill A. Shutemov
745e3ed85f efi/libstub: Implement support for unaccepted memory
UEFI Specification version 2.9 introduces the concept of memory
acceptance: Some Virtual Machine platforms, such as Intel TDX or AMD
SEV-SNP, requiring memory to be accepted before it can be used by the
guest. Accepting happens via a protocol specific for the Virtual
Machine platform.

Accepting memory is costly and it makes VMM allocate memory for the
accepted guest physical address range. It's better to postpone memory
acceptance until memory is needed. It lowers boot time and reduces
memory overhead.

The kernel needs to know what memory has been accepted. Firmware
communicates this information via memory map: a new memory type --
EFI_UNACCEPTED_MEMORY -- indicates such memory.

Range-based tracking works fine for firmware, but it gets bulky for
the kernel: e820 (or whatever the arch uses) has to be modified on every
page acceptance. It leads to table fragmentation and there's a limited
number of entries in the e820 table.

Another option is to mark such memory as usable in e820 and track if the
range has been accepted in a bitmap. One bit in the bitmap represents a
naturally aligned power-2-sized region of address space -- unit.

For x86, unit size is 2MiB: 4k of the bitmap is enough to track 64GiB or
physical address space.

In the worst-case scenario -- a huge hole in the middle of the
address space -- It needs 256MiB to handle 4PiB of the address
space.

Any unaccepted memory that is not aligned to unit_size gets accepted
upfront.

The bitmap is allocated and constructed in the EFI stub and passed down
to the kernel via EFI configuration table. allocate_e820() allocates the
bitmap if unaccepted memory is present, according to the size of
unaccepted region.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230606142637.5171-4-kirill.shutemov@linux.intel.com
2023-06-06 16:58:23 +02:00
Kirill A. Shutemov
2e9f46ee15 efi/x86: Get full memory map in allocate_e820()
Currently allocate_e820() is only interested in the size of map and size
of memory descriptor to determine how many e820 entries the kernel
needs.

UEFI Specification version 2.9 introduces a new memory type --
unaccepted memory. To track unaccepted memory, the kernel needs to
allocate a bitmap. The size of the bitmap is dependent on the maximum
physical address present in the system. A full memory map is required to
find the maximum address.

Modify allocate_e820() to get a full memory map.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230606142637.5171-3-kirill.shutemov@linux.intel.com
2023-06-06 16:45:14 +02:00
Nicholas Bishop
d0a1865cf7 efi/esrt: Allow ESRT access without CAP_SYS_ADMIN
Access to the files in /sys/firmware/efi/esrt has been restricted to
CAP_SYS_ADMIN since support for ESRT was added, but this seems overly
restrictive given that the files are read-only and just provide
information about UEFI firmware updates.

Remove the CAP_SYS_ADMIN restriction so that a non-root process can read
the files, provided a suitably-privileged process changes the file
ownership first. The files are still read-only and still owned by root
by default.

Signed-off-by: Nicholas Bishop <nicholasbishop@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-06-06 15:33:59 +02:00
Arnd Bergmann
fd936fd8ac efi: fix missing prototype warnings
The cper.c file needs to include an extra header, and efi_zboot_entry
needs an extern declaration to avoid these 'make W=1' warnings:

drivers/firmware/efi/libstub/zboot.c:65:1: error: no previous prototype for 'efi_zboot_entry' [-Werror=missing-prototypes]
drivers/firmware/efi/efi.c:176:16: error: no previous prototype for 'efi_attr_is_visible' [-Werror=missing-prototypes]
drivers/firmware/efi/cper.c:626:6: error: no previous prototype for 'cper_estatus_print' [-Werror=missing-prototypes]
drivers/firmware/efi/cper.c:649:5: error: no previous prototype for 'cper_estatus_check_header' [-Werror=missing-prototypes]
drivers/firmware/efi/cper.c:662:5: error: no previous prototype for 'cper_estatus_check' [-Werror=missing-prototypes]

To make this easier, move the cper specific declarations to
include/linux/cper.h.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-25 09:26:19 +02:00
Ard Biesheuvel
095aabe338 efi/libstub: zboot: Avoid eager evaluation of objcopy flags
The Make variable containing the objcopy flags may be constructed from
the output of build tools operating on build artifacts, and these may
not exist when doing a make clean.

So avoid evaluating them eagerly, to prevent spurious build warnings.

Suggested-by: Pedro Falcato <pedro.falcato@gmail.com>
Tested-by: Alan Bartlett <ajb@elrepo.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-25 09:26:00 +02:00
Anisse Astier
d86ff3333c efivarfs: expose used and total size
When writing EFI variables, one might get errors with no other message
on why it fails. Being able to see how much is used by EFI variables
helps analyzing such issues.

Since this is not a conventional filesystem, block size is intentionally
set to 1 instead of PAGE_SIZE.

x86 quirks of reserved size are taken into account; so that available
and free size can be different, further helping debugging space issues.

With this patch, one can see the remaining space in EFI variable storage
via efivarfs, like this:

   $ df -h /sys/firmware/efi/efivars/
   Filesystem      Size  Used Avail Use% Mounted on
   efivarfs        176K  106K   66K  62% /sys/firmware/efi/efivars

Signed-off-by: Anisse Astier <an.astier@criteo.com>
[ardb: - rename efi_reserved_space() to efivar_reserved_space()
       - whitespace/coding style tweaks]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-17 18:21:34 +02:00
Thomas Weißschuh
0153431c85 efi: make kobj_type structure constant
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definition to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-10 19:00:52 +02:00
Linus Torvalds
825a0714d2 EFI updates for v6.4:
- relocate the LoongArch kernel if the preferred address is already
   occupied;
 
 - implement BTI annotations for arm64 EFI stub and zboot images;
 
 - clean up arm64 zboot Kbuild rules for injecting the kernel code size.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZEwUOwAKCRAwbglWLn0t
 XMNzAQChdPim0N+l2G4XLa1g8WCGany/+6/B9GHPJVcmQ25zLQD/UaNvAofkHwjR
 Y3P3ZEY1SPEA+UJBL/BTI0wO9/XgpAA=
 =hGWP
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:

 - relocate the LoongArch kernel if the preferred address is already
   occupied

 - implement BTI annotations for arm64 EFI stub and zboot images

 - clean up arm64 zboot Kbuild rules for injecting the kernel code size

* tag 'efi-next-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi/zboot: arm64: Grab code size from ELF symbol in payload
  efi/zboot: arm64: Inject kernel code size symbol into the zboot payload
  efi/zboot: Set forward edge CFI compat header flag if supported
  efi/zboot: Add BSS padding before compression
  arm64: efi: Enable BTI codegen and add PE/COFF annotation
  efi/pe: Import new BTI/IBT header flags from the spec
  efi/loongarch: Reintroduce efi_relocate_kernel() to relocate kernel
2023-04-29 17:42:33 -07:00
Linus Torvalds
b6a7828502 modules-6.4-rc1
The summary of the changes for this pull requests is:
 
  * Song Liu's new struct module_memory replacement
  * Nick Alcock's MODULE_LICENSE() removal for non-modules
  * My cleanups and enhancements to reduce the areas where we vmalloc
    module memory for duplicates, and the respective debug code which
    proves the remaining vmalloc pressure comes from userspace.
 
 Most of the changes have been in linux-next for quite some time except
 the minor fixes I made to check if a module was already loaded
 prior to allocating the final module memory with vmalloc and the
 respective debug code it introduces to help clarify the issue. Although
 the functional change is small it is rather safe as it can only *help*
 reduce vmalloc space for duplicates and is confirmed to fix a bootup
 issue with over 400 CPUs with KASAN enabled. I don't expect stable
 kernels to pick up that fix as the cleanups would have also had to have
 been picked up. Folks on larger CPU systems with modules will want to
 just upgrade if vmalloc space has been an issue on bootup.
 
 Given the size of this request, here's some more elaborate details
 on this pull request.
 
 The functional change change in this pull request is the very first
 patch from Song Liu which replaces the struct module_layout with a new
 struct module memory. The old data structure tried to put together all
 types of supported module memory types in one data structure, the new
 one abstracts the differences in memory types in a module to allow each
 one to provide their own set of details. This paves the way in the
 future so we can deal with them in a cleaner way. If you look at changes
 they also provide a nice cleanup of how we handle these different memory
 areas in a module. This change has been in linux-next since before the
 merge window opened for v6.3 so to provide more than a full kernel cycle
 of testing. It's a good thing as quite a bit of fixes have been found
 for it.
 
 Jason Baron then made dynamic debug a first class citizen module user by
 using module notifier callbacks to allocate / remove module specific
 dynamic debug information.
 
 Nick Alcock has done quite a bit of work cross-tree to remove module
 license tags from things which cannot possibly be module at my request
 so to:
 
   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area
      is active with no clear solution in sight.
 
   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags
 
 In so far as a) is concerned, although module license tags are a no-op
 for non-modules, tools which would want create a mapping of possible
 modules can only rely on the module license tag after the commit
 8b41fc4454 ("kbuild: create modules.builtin without Makefile.modbuiltin
 or tristate.conf").  Nick has been working on this *for years* and
 AFAICT I was the only one to suggest two alternatives to this approach
 for tooling. The complexity in one of my suggested approaches lies in
 that we'd need a possible-obj-m and a could-be-module which would check
 if the object being built is part of any kconfig build which could ever
 lead to it being part of a module, and if so define a new define
 -DPOSSIBLE_MODULE [0]. A more obvious yet theoretical approach I've
 suggested would be to have a tristate in kconfig imply the same new
 -DPOSSIBLE_MODULE as well but that means getting kconfig symbol names
 mapping to modules always, and I don't think that's the case today. I am
 not aware of Nick or anyone exploring either of these options. Quite
 recently Josh Poimboeuf has pointed out that live patching, kprobes and
 BPF would benefit from resolving some part of the disambiguation as
 well but for other reasons. The function granularity KASLR (fgkaslr)
 patches were mentioned but Joe Lawrence has clarified this effort has
 been dropped with no clear solution in sight [1].
 
 In the meantime removing module license tags from code which could never
 be modules is welcomed for both objectives mentioned above. Some
 developers have also welcomed these changes as it has helped clarify
 when a module was never possible and they forgot to clean this up,
 and so you'll see quite a bit of Nick's patches in other pull
 requests for this merge window. I just picked up the stragglers after
 rc3. LWN has good coverage on the motivation behind this work [2] and
 the typical cross-tree issues he ran into along the way. The only
 concrete blocker issue he ran into was that we should not remove the
 MODULE_LICENSE() tags from files which have no SPDX tags yet, even if
 they can never be modules. Nick ended up giving up on his efforts due
 to having to do this vetting and backlash he ran into from folks who
 really did *not understand* the core of the issue nor were providing
 any alternative / guidance. I've gone through his changes and dropped
 the patches which dropped the module license tags where an SPDX
 license tag was missing, it only consisted of 11 drivers.  To see
 if a pull request deals with a file which lacks SPDX tags you
 can just use:
 
   ./scripts/spdxcheck.py -f \
 	$(git diff --name-only commid-id | xargs echo)
 
 You'll see a core module file in this pull request for the above,
 but that's not related to his changes. WE just need to add the SPDX
 license tag for the kernel/module/kmod.c file in the future but
 it demonstrates the effectiveness of the script.
 
 Most of Nick's changes were spread out through different trees,
 and I just picked up the slack after rc3 for the last kernel was out.
 Those changes have been in linux-next for over two weeks.
 
 The cleanups, debug code I added and final fix I added for modules
 were motivated by David Hildenbrand's report of boot failing on
 a systems with over 400 CPUs when KASAN was enabled due to running
 out of virtual memory space. Although the functional change only
 consists of 3 lines in the patch "module: avoid allocation if module is
 already present and ready", proving that this was the best we can
 do on the modules side took quite a bit of effort and new debug code.
 
 The initial cleanups I did on the modules side of things has been
 in linux-next since around rc3 of the last kernel, the actual final
 fix for and debug code however have only been in linux-next for about a
 week or so but I think it is worth getting that code in for this merge
 window as it does help fix / prove / evaluate the issues reported
 with larger number of CPUs. Userspace is not yet fixed as it is taking
 a bit of time for folks to understand the crux of the issue and find a
 proper resolution. Worst come to worst, I have a kludge-of-concept [3]
 of how to make kernel_read*() calls for modules unique / converge them,
 but I'm currently inclined to just see if userspace can fix this
 instead.
 
 [0] https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/
 [1] https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com
 [2] https://lwn.net/Articles/927569/
 [3] https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRG4m0SHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinQ2oP/0xlvKwJg6Ey8fHZF0qv8VOskE80zoLF
 hMazU3xfqLA+1TQvouW1YBxt3jwS3t1Ehs+NrV+nY9Yzcm0MzRX/n3fASJVe7nRr
 oqWWQU+voYl5Pw1xsfdp6C8IXpBQorpYby3Vp0MAMoZyl2W2YrNo36NV488wM9KC
 jD4HF5Z6xpnPSZTRR7AgW9mo7FdAtxPeKJ76Bch7lH8U6omT7n36WqTw+5B1eAYU
 YTOvrjRs294oqmWE+LeebyiOOXhH/yEYx4JNQgCwPdxwnRiGJWKsk5va0hRApqF/
 WW8dIqdEnjsa84lCuxnmWgbcPK8cgmlO0rT0DyneACCldNlldCW1LJ0HOwLk9pea
 p3JFAsBL7TKue4Tos6I7/4rx1ufyBGGIigqw9/VX5g0Iif+3BhWnqKRfz+p9wiMa
 Fl7cU6u7yC68CHu1HBSisK16cYMCPeOnTSd89upHj8JU/t74O6k/ARvjrQ9qmNUt
 c5U+OY+WpNJ1nXQydhY/yIDhFdYg8SSpNuIO90r4L8/8jRQYXNG80FDd1UtvVDuy
 eq0r2yZ8C0XHSlOT9QHaua/tWV/aaKtyC/c0hDRrigfUrq8UOlGujMXbUnrmrWJI
 tLJLAc7ePWAAoZXGSHrt0U27l029GzLwRdKqJ6kkDANVnTeOdV+mmBg9zGh3/Mp6
 agiwdHUMVN7X
 =56WK
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "The summary of the changes for this pull requests is:

   - Song Liu's new struct module_memory replacement

   - Nick Alcock's MODULE_LICENSE() removal for non-modules

   - My cleanups and enhancements to reduce the areas where we vmalloc
     module memory for duplicates, and the respective debug code which
     proves the remaining vmalloc pressure comes from userspace.

  Most of the changes have been in linux-next for quite some time except
  the minor fixes I made to check if a module was already loaded prior
  to allocating the final module memory with vmalloc and the respective
  debug code it introduces to help clarify the issue. Although the
  functional change is small it is rather safe as it can only *help*
  reduce vmalloc space for duplicates and is confirmed to fix a bootup
  issue with over 400 CPUs with KASAN enabled. I don't expect stable
  kernels to pick up that fix as the cleanups would have also had to
  have been picked up. Folks on larger CPU systems with modules will
  want to just upgrade if vmalloc space has been an issue on bootup.

  Given the size of this request, here's some more elaborate details:

  The functional change change in this pull request is the very first
  patch from Song Liu which replaces the 'struct module_layout' with a
  new 'struct module_memory'. The old data structure tried to put
  together all types of supported module memory types in one data
  structure, the new one abstracts the differences in memory types in a
  module to allow each one to provide their own set of details. This
  paves the way in the future so we can deal with them in a cleaner way.
  If you look at changes they also provide a nice cleanup of how we
  handle these different memory areas in a module. This change has been
  in linux-next since before the merge window opened for v6.3 so to
  provide more than a full kernel cycle of testing. It's a good thing as
  quite a bit of fixes have been found for it.

  Jason Baron then made dynamic debug a first class citizen module user
  by using module notifier callbacks to allocate / remove module
  specific dynamic debug information.

  Nick Alcock has done quite a bit of work cross-tree to remove module
  license tags from things which cannot possibly be module at my request
  so to:

   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area is
      active with no clear solution in sight.

   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags

  In so far as a) is concerned, although module license tags are a no-op
  for non-modules, tools which would want create a mapping of possible
  modules can only rely on the module license tag after the commit
  8b41fc4454 ("kbuild: create modules.builtin without
  Makefile.modbuiltin or tristate.conf").

  Nick has been working on this *for years* and AFAICT I was the only
  one to suggest two alternatives to this approach for tooling. The
  complexity in one of my suggested approaches lies in that we'd need a
  possible-obj-m and a could-be-module which would check if the object
  being built is part of any kconfig build which could ever lead to it
  being part of a module, and if so define a new define
  -DPOSSIBLE_MODULE [0].

  A more obvious yet theoretical approach I've suggested would be to
  have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
  well but that means getting kconfig symbol names mapping to modules
  always, and I don't think that's the case today. I am not aware of
  Nick or anyone exploring either of these options. Quite recently Josh
  Poimboeuf has pointed out that live patching, kprobes and BPF would
  benefit from resolving some part of the disambiguation as well but for
  other reasons. The function granularity KASLR (fgkaslr) patches were
  mentioned but Joe Lawrence has clarified this effort has been dropped
  with no clear solution in sight [1].

  In the meantime removing module license tags from code which could
  never be modules is welcomed for both objectives mentioned above. Some
  developers have also welcomed these changes as it has helped clarify
  when a module was never possible and they forgot to clean this up, and
  so you'll see quite a bit of Nick's patches in other pull requests for
  this merge window. I just picked up the stragglers after rc3. LWN has
  good coverage on the motivation behind this work [2] and the typical
  cross-tree issues he ran into along the way. The only concrete blocker
  issue he ran into was that we should not remove the MODULE_LICENSE()
  tags from files which have no SPDX tags yet, even if they can never be
  modules. Nick ended up giving up on his efforts due to having to do
  this vetting and backlash he ran into from folks who really did *not
  understand* the core of the issue nor were providing any alternative /
  guidance. I've gone through his changes and dropped the patches which
  dropped the module license tags where an SPDX license tag was missing,
  it only consisted of 11 drivers. To see if a pull request deals with a
  file which lacks SPDX tags you can just use:

    ./scripts/spdxcheck.py -f \
	$(git diff --name-only commid-id | xargs echo)

  You'll see a core module file in this pull request for the above, but
  that's not related to his changes. WE just need to add the SPDX
  license tag for the kernel/module/kmod.c file in the future but it
  demonstrates the effectiveness of the script.

  Most of Nick's changes were spread out through different trees, and I
  just picked up the slack after rc3 for the last kernel was out. Those
  changes have been in linux-next for over two weeks.

  The cleanups, debug code I added and final fix I added for modules
  were motivated by David Hildenbrand's report of boot failing on a
  systems with over 400 CPUs when KASAN was enabled due to running out
  of virtual memory space. Although the functional change only consists
  of 3 lines in the patch "module: avoid allocation if module is already
  present and ready", proving that this was the best we can do on the
  modules side took quite a bit of effort and new debug code.

  The initial cleanups I did on the modules side of things has been in
  linux-next since around rc3 of the last kernel, the actual final fix
  for and debug code however have only been in linux-next for about a
  week or so but I think it is worth getting that code in for this merge
  window as it does help fix / prove / evaluate the issues reported with
  larger number of CPUs. Userspace is not yet fixed as it is taking a
  bit of time for folks to understand the crux of the issue and find a
  proper resolution. Worst come to worst, I have a kludge-of-concept [3]
  of how to make kernel_read*() calls for modules unique / converge
  them, but I'm currently inclined to just see if userspace can fix this
  instead"

Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]

* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
  module: add debugging auto-load duplicate module support
  module: stats: fix invalid_mod_bytes typo
  module: remove use of uninitialized variable len
  module: fix building stats for 32-bit targets
  module: stats: include uapi/linux/module.h
  module: avoid allocation if module is already present and ready
  module: add debug stats to help identify memory pressure
  module: extract patient module check into helper
  modules/kmod: replace implementation with a semaphore
  Change DEFINE_SEMAPHORE() to take a number argument
  module: fix kmemleak annotations for non init ELF sections
  module: Ignore L0 and rename is_arm_mapping_symbol()
  module: Move is_arm_mapping_symbol() to module_symbol.h
  module: Sync code of is_arm_mapping_symbol()
  scripts/gdb: use mem instead of core_layout to get the module address
  interconnect: remove module-related code
  interconnect: remove MODULE_LICENSE in non-modules
  zswap: remove MODULE_LICENSE in non-modules
  zpool: remove MODULE_LICENSE in non-modules
  x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
  ...
2023-04-27 16:36:55 -07:00
Linus Torvalds
34b62f186d pci-v6.4-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmRIKooUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxq7A/9G0sInrqvqH2I9/Set/FnmMfCtGDH
 YcEjHYYxL+pztSiXTavDV+ib9iaut83oYtcV9p1bUMhJoZdKNZhrNdIGzRFSemI4
 0/ShtklPzNEu6nPPL24CnEzgbrODBU56ZvzrIE/tShEoOjkKa1triBnOA/JMxYTL
 cUwqDQlDkdpYniCgxy05QfcFZ0mmSOkbl7runGfTMTiUKKC3xSRiaW5YN9KZe3i7
 G5YHu1VVCjeQdQSICHYwyFmkyiqosCoajQNp1IHBkWqSwilzyZMg0NWJobVSA7M/
 mXXnzLtFcC60oT58/9MaggQwDTaSGDE8mG+sWv05bB2u5TQVyZEZqZ4c2FzmZIZT
 WLZYLB6PFRW0zePEuMnVkSLS2npkX+aGaBv28bf88sjorpaYNG01uYijnLEceolQ
 yBPFRN3bsRuOyHvYY/tiZX/BP7z/DS++XXwA8zQWZnYsXSlncJdwCNquV0xIwUt+
 hij4/Yu7o9SgV1LbuwtkMFAn3C9Szc65Eer+IvRRdnMZYphjVHbA5F2msRFyiCeR
 HxECtMQ1jBnVrpQAcBX1Sz+Vu5MrwCqzc2n6tvTQHDvVNjXfkG3NaFhxYPc1IL9Z
 NJMeCKfK1qzw7TtbvWXCluTTIM9N/bNJXrJhQbjNY7V6IaBZY1QNYW0ZFfGgj6Gb
 UUPgndidRy4/hzw=
 =HPXl
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Resource management:

   - Add pci_dev_for_each_resource() and pci_bus_for_each_resource()
     iterators

  PCIe native device hotplug:

   - Fix AB-BA deadlock between reset_lock and device_lock

  Power management:

   - Wait longer for devices to become ready after resume (as we do for
     reset) to accommodate Intel Titan Ridge xHCI devices

   - Extend D3hot delay for NVIDIA HDA controllers to avoid
     unrecoverable devices after a bus reset

  Error handling:

   - Clear PCIe Device Status after EDR since generic error recovery now
     only clears it when AER is native

  ASPM:

   - Work around Chromebook firmware defect that clobbers Capability
     list (including ASPM L1 PM Substates Cap) when returning from
     D3cold to D0

  Freescale i.MX6 PCIe controller driver:

   - Install imprecise external abort handler only when DT indicates
     PCIe support

  Freescale Layerscape PCIe controller driver:

   - Add ls1028a endpoint mode support

  Qualcomm PCIe controller driver:

   - Add SM8550 DT binding and driver support

   - Add SDX55 DT binding and driver support

   - Use bulk APIs for clocks of IP 1.0.0, 2.3.2, 2.3.3

   - Use bulk APIs for reset of IP 2.1.0, 2.3.3, 2.4.0

   - Add DT "mhi" register region for supported SoCs

   - Expose link transition counts via debugfs to help debug low power
     issues

   - Support system suspend and resume; reduce interconnect bandwidth
     and turn off clock and PHY if there are no active devices

   - Enable async probe by default to reduce boot time

  Miscellaneous:

   - Sort controller Kconfig entries by vendor"

* tag 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (56 commits)
  PCI: xilinx: Drop obsolete dependency on COMPILE_TEST
  PCI: mobiveil: Sort Kconfig entries by vendor
  PCI: dwc: Sort Kconfig entries by vendor
  PCI: Sort controller Kconfig entries by vendor
  PCI: Use consistent controller Kconfig menu entry language
  PCI: xilinx-nwl: Add 'Xilinx' to Kconfig prompt
  PCI: hv: Add 'Microsoft' to Kconfig prompt
  PCI: meson: Add 'Amlogic' to Kconfig prompt
  PCI: Use of_property_present() for testing DT property presence
  PCI/PM: Extend D3hot delay for NVIDIA HDA controllers
  dt-bindings: PCI: qcom: Document msi-map and msi-map-mask properties
  PCI: qcom: Add SM8550 PCIe support
  dt-bindings: PCI: qcom: Add SM8550 compatible
  PCI: qcom: Add support for SDX55 SoC
  dt-bindings: PCI: qcom-ep: Fix the unit address used in example
  dt-bindings: PCI: qcom: Add SDX55 SoC
  dt-bindings: PCI: qcom: Update maintainers entry
  PCI: qcom: Enable async probe by default
  PCI: qcom: Add support for system suspend and resume
  PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter
  ...
2023-04-27 10:45:30 -07:00
Ard Biesheuvel
026b85796a efi/zboot: arm64: Grab code size from ELF symbol in payload
Instead of relying on a dodgy dd hack to copy the image code size from
the uncompressed image's PE header to the end of the compressed image,
let's grab the code size from the symbol that is injected into the ELF
object by the Kbuild rules that generate the compressed payload.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
2023-04-26 23:06:48 +02:00
Ard Biesheuvel
45dd403da8 efi/zboot: arm64: Inject kernel code size symbol into the zboot payload
The EFI zboot code is not built as part of the kernel proper, like the
ordinary EFI stub, but still needs access to symbols that are defined
only internally in the kernel, and are left unexposed deliberately to
avoid creating ABI inadvertently that we're stuck with later.

So capture the kernel code size of the kernel image, and inject it as an
ELF symbol into the object that contains the compressed payload, where
it will be accessible to zboot code that needs it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
2023-04-26 18:01:41 +02:00
Ard Biesheuvel
538bc0f40b efi/zboot: Set forward edge CFI compat header flag if supported
Add some plumbing to the zboot EFI header generation to set the newly
introduced DllCharacteristicsEx flag associated with forward edge CFI
enforcement instructions (BTI on arm64, IBT on x86)

x86 does not currently uses the zboot infrastructure, so let's wire it
up only for arm64.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-04-20 15:45:12 +02:00
Ard Biesheuvel
bca2f3a940 efi/zboot: Add BSS padding before compression
We don't really care about the size of the decompressed image - what
matters is how much space needs to be allocated for the image to
execute, and this includes space for BSS that is not part of the
loadable image and so it is not accounted for in the decompressed size.

So let's add some zero padding to the end of the image: this compresses
well, and it ensures that BSS is accounted for, and as a bonus, it will
be zeroed before launching the image.

Since all architectures that implement support for EFI zboot carry this
value in the header in the same location, we can just grab it from the
binary that is being compressed.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-04-20 15:44:35 +02:00
Ard Biesheuvel
8358098b97 arm64: efi: Enable BTI codegen and add PE/COFF annotation
UEFI heavily relies on so-called protocols, which are essentially
tables populated with pointers to executable code, and these are invoked
indirectly using BR or BLR instructions.

This makes the EFI execution context vulnerable to attacks on forward
edge control flow, and so it would help if we could enable hardware
enforcement (BTI) on CPUs that implement it.

So let's no longer disable BTI codegen for the EFI stub, and set the
newly introduced PE/COFF header flag when the kernel is built with BTI
landing pads.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
2023-04-20 15:43:45 +02:00
Peter Zijlstra
48380368de Change DEFINE_SEMAPHORE() to take a number argument
Fundamentally semaphores are a counted primitive, but
DEFINE_SEMAPHORE() does not expose this and explicitly creates a
binary semaphore.

Change DEFINE_SEMAPHORE() to take a number argument and use that in the
few places that open-coded it using __SEMAPHORE_INITIALIZER().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[mcgrof: add some tribal knowledge about why some folks prefer
 binary sempahores over mutexes]
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-18 11:15:24 -07:00
Bjorn Helgaas
72a31ff9d7 efi/cper: Remove unnecessary aer.h include
<linux/aer.h> is unused, so remove it.

Link: https://lore.kernel.org/r/20230307203356.882479-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
2023-04-07 16:42:31 -05:00
Huacai Chen
8364f6d000 efi/loongarch: Reintroduce efi_relocate_kernel() to relocate kernel
Since Linux-6.3, LoongArch supports PIE kernel now, so let's reintroduce
efi_relocate_kernel() to relocate the core kernel.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-04-05 12:05:11 +02:00
Ard Biesheuvel
0b1d9debe3 efi/libstub: randomalloc: Return EFI_OUT_OF_RESOURCES on failure
The logic in efi_random_alloc() will iterate over the memory map twice,
once to count the number of candidate slots, and another time to locate
the chosen slot after randomization.

If there is insufficient memory to do the allocation, the second loop
will run to completion without actually having located a slot, but we
currently return EFI_SUCCESS in this case, as we fail to initialize
status to the appropriate error value of EFI_OUT_OF_RESOURCES.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-23 15:15:45 +01:00
Ard Biesheuvel
fc3608aaa5 efi/libstub: Use relocated version of kernel's struct screen_info
In some cases, we expose the kernel's struct screen_info to the EFI stub
directly, so it gets populated before even entering the kernel.  This
means the early console is available as soon as the early param parsing
happens, which is nice. It also means we need two different ways to pass
this information, as this trick only works if the EFI stub is baked into
the core kernel image, which is not always the case.

Huacai reports that the preparatory refactoring that was needed to
implement this alternative method for zboot resulted in a non-functional
efifb earlycon for other cases as well, due to the reordering of the
kernel image relocation with the population of the screen_info struct,
and the latter now takes place after copying the image to its new
location, which means we copy the old, uninitialized state.

So let's ensure that the same-image version of alloc_screen_info()
produces the correct screen_info pointer, by taking the displacement of
the loaded image into account.

Reported-by: Huacai Chen <chenhuacai@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://lore.kernel.org/linux-efi/20230310021749.921041-1-chenhuacai@loongson.cn/
Fixes: 42c8ea3dca ("efi: libstub: Factor out EFI stub entrypoint into separate file")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-23 12:27:18 +01:00
Ard Biesheuvel
97fd768e50 efi/libstub: zboot: Add compressed image to make targets
Avoid needlessly rebuilding the compressed image by adding the file
'vmlinuz' to the 'targets' Kbuild make variable.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-21 15:20:56 +01:00
Hans de Goede
5ed213dd64 efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L
Another Lenovo convertable which reports a landscape resolution of
1920x1200 with a pitch of (1920 * 4) bytes, while the actual framebuffer
has a resolution of 1200x1920 with a pitch of (1200 * 4) bytes.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-18 11:44:57 +01:00
Hans de Goede
3615c78673 efi: sysfb_efi: Fix DMI quirks not working for simpledrm
Commit 8633ef82f1 ("drivers/firmware: consolidate EFI framebuffer setup
for all arches") moved the sysfb_apply_efi_quirks() call in sysfb_init()
from before the [sysfb_]parse_mode() call to after it.
But sysfb_apply_efi_quirks() modifies the global screen_info struct which
[sysfb_]parse_mode() parses, so doing it later is too late.

This has broken all DMI based quirks for correcting wrong firmware efifb
settings when simpledrm is used.

To fix this move the sysfb_apply_efi_quirks() call back to its old place
and split the new setup of the efifb_fwnode (which requires
the platform_device) into its own function and call that at
the place of the moved sysfb_apply_efi_quirks(pd) calls.

Fixes: 8633ef82f1 ("drivers/firmware: consolidate EFI framebuffer setup for all arches")
Cc: stable@vger.kernel.org
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-18 11:44:57 +01:00
Ard Biesheuvel
f59a7ec1e6 efi/libstub: smbios: Drop unused 'recsize' parameter
We no longer use the recsize argument for locating the string table in
an SMBIOS record, so we can drop it from the internal API.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-18 11:44:57 +01:00
Ard Biesheuvel
eb684408f3 arm64: efi: Use SMBIOS processor version to key off Ampere quirk
Instead of using the SMBIOS type 1 record 'family' field, which is often
modified by OEMs, use the type 4 'processor ID' and 'processor version'
fields, which are set to a small set of probe-able values on all known
Ampere EFI systems in the field.

Fixes: 550b33cfd4 ("arm64: efi: Force the use of ...")
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-18 11:44:56 +01:00
Ard Biesheuvel
34343eb06a efi/libstub: smbios: Use length member instead of record struct size
The type 1 SMBIOS record happens to always be the same size, but there
are other record types which have been augmented over time, and so we
should really use the length field in the header to decide where the
string table starts.

Fixes: 550b33cfd4 ("arm64: efi: Force the use of ...")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-18 11:37:56 +01:00
Ard Biesheuvel
8b3a149db4 efi: earlycon: Reprobe after parsing config tables
Commit 732ea9db9d ("efi: libstub: Move screen_info handling to common
code") reorganized the earlycon handling so that all architectures pass
the screen_info data via a EFI config table instead of populating struct
screen_info directly, as the latter is only possible when the EFI stub
is baked into the kernel (and not into the decompressor).

However, this means that struct screen_info may not have been populated
yet by the time the earlycon probe takes place, and this results in a
non-functional early console.

So let's probe again right after parsing the config tables and
populating struct screen_info. Note that this means that earlycon output
starts a bit later than before, and so it may fail to capture issues
that occur while doing the early EFI initialization.

Fixes: 732ea9db9d ("efi: libstub: Move screen_info handling to common code")
Reported-by: Shawn Guo <shawn.guo@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-13 23:28:43 +01:00
Ard Biesheuvel
3c60f67b4b efi/libstub: arm64: Remap relocated image with strict permissions
After relocating the executable image, use the EFI memory attributes
protocol to remap the code and data regions with the appropriate
permissions.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-10 14:11:39 +01:00
Ard Biesheuvel
c7d9e628b8 efi/libstub: zboot: Mark zboot EFI application as NX compatible
Now that the zboot loader will invoke the EFI memory attributes protocol
to remap the decompressed code and rodata as read-only/executable, we
can set the PE/COFF header flag that indicates to the firmware that the
application does not rely on writable memory being executable at the
same time.

Cc: <stable@vger.kernel.org> # v6.2+
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-10 14:11:39 +01:00
Linus Torvalds
06e1a81c48 A healthy mix of EFI contributions this time:
- Performance tweaks for efifb earlycon by Andy
 
 - Preparatory refactoring and cleanup work in the efivar layer by Johan,
   which is needed to accommodate the Snapdragon arm64 laptops that
   expose their EFI variable store via a TEE secure world API.
 
 - Enhancements to the EFI memory map handling so that Xen dom0 can
   safely access EFI configuration tables (Demi Marie)
 
 - Wire up the newly introduced IBT/BTI flag in the EFI memory attributes
   table, so that firmware that is generated with ENDBR/BTI landing pads
   will be mapped with enforcement enabled.
 
 - Clean up how we check and print the EFI revision exposed by the
   firmware.
 
 - Incorporate EFI memory attributes protocol definition contributed by
   Evgeniy and wire it up in the EFI zboot code. This ensures that these
   images can execute under new and stricter rules regarding the default
   memory permissions for EFI page allocations. (More work is in progress
   here)
 
 - CPER header cleanup by Dan Williams
 
 - Use a raw spinlock to protect the EFI runtime services stack on arm64
   to ensure the correct semantics under -rt. (Pierre)
 
 - EFI framebuffer quirk for Lenovo Ideapad by Darrell.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmPzuwsACgkQw08iOZLZ
 jyS7dwwAm95DlDxFIQi4FmTm2mqJws9PyDrkfaAK1CoyqCgeOLQT2FkVolgr8jne
 pwpwCTXtYP8y0BZvdQEIjpAq/BHKaD3GJSPfl7lo+pnUu68PpsFWaV6EdT33KKfj
 QeF0MnUvrqUeTFI77D+S0ZW2zxdo9eCcahF3TPA52/bEiiDHWBF8Qm9VHeQGklik
 zoXA15ft3mgITybgjEA0ncGrVZiBMZrYoMvbdkeoedfw02GN/eaQn8d2iHBtTDEh
 3XNlo7ONX0v50cjt0yvwFEA0AKo0o7R1cj+ziKH/bc4KjzIiCbINhy7blroSq+5K
 YMlnPHuj8Nhv3I+MBdmn/nxRCQeQsE4RfRru04hfNfdcqjAuqwcBvRXvVnjWKZHl
 CmUYs+p/oqxrQ4BjiHfw0JKbXRsgbFI6o3FeeLH9kzI9IDUPpqu3Ma814FVok9Ai
 zbOCrJf5tEtg5tIavcUESEMBuHjEafqzh8c7j7AAqbaNjlihsqosDy9aYoarEi5M
 f/tLec86
 =+pOz
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:
 "A healthy mix of EFI contributions this time:

   - Performance tweaks for efifb earlycon (Andy)

   - Preparatory refactoring and cleanup work in the efivar layer, which
     is needed to accommodate the Snapdragon arm64 laptops that expose
     their EFI variable store via a TEE secure world API (Johan)

   - Enhancements to the EFI memory map handling so that Xen dom0 can
     safely access EFI configuration tables (Demi Marie)

   - Wire up the newly introduced IBT/BTI flag in the EFI memory
     attributes table, so that firmware that is generated with ENDBR/BTI
     landing pads will be mapped with enforcement enabled

   - Clean up how we check and print the EFI revision exposed by the
     firmware

   - Incorporate EFI memory attributes protocol definition and wire it
     up in the EFI zboot code (Evgeniy)

     This ensures that these images can execute under new and stricter
     rules regarding the default memory permissions for EFI page
     allocations (More work is in progress here)

   - CPER header cleanup (Dan Williams)

   - Use a raw spinlock to protect the EFI runtime services stack on
     arm64 to ensure the correct semantics under -rt (Pierre)

   - EFI framebuffer quirk for Lenovo Ideapad (Darrell)"

* tag 'efi-next-for-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
  firmware/efi sysfb_efi: Add quirk for Lenovo IdeaPad Duet 3
  arm64: efi: Make efi_rt_lock a raw_spinlock
  efi: Add mixed-mode thunk recipe for GetMemoryAttributes
  efi: x86: Wire up IBT annotation in memory attributes table
  efi: arm64: Wire up BTI annotation in memory attributes table
  efi: Discover BTI support in runtime services regions
  efi/cper, cxl: Remove cxl_err.h
  efi: Use standard format for printing the EFI revision
  efi: Drop minimum EFI version check at boot
  efi: zboot: Use EFI protocol to remap code/data with the right attributes
  efi/libstub: Add memory attribute protocol definitions
  efi: efivars: prevent double registration
  efi: verify that variable services are supported
  efivarfs: always register filesystem
  efi: efivars: add efivars printk prefix
  efi: Warn if trying to reserve memory under Xen
  efi: Actually enable the ESRT under Xen
  efi: Apply allowlist to EFI configuration tables when running under Xen
  efi: xen: Implement memory descriptor lookup based on hypercall
  efi: memmap: Disregard bogus entries instead of returning them
  ...
2023-02-23 14:41:48 -08:00
Linus Torvalds
8bf1a529cd arm64 updates for 6.3:
- Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit
   architectural register (ZT0, for the look-up table feature) that Linux
   needs to save/restore.
 
 - Include TPIDR2 in the signal context and add the corresponding
   kselftests.
 
 - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI
   support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG
   (ARM CMN) at probe time.
 
 - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64.
 
 - Permit EFI boot with MMU and caches on. Instead of cleaning the entire
   loaded kernel image to the PoC and disabling the MMU and caches before
   branching to the kernel bare metal entry point, leave the MMU and
   caches enabled and rely on EFI's cacheable 1:1 mapping of all of
   system RAM to populate the initial page tables.
 
 - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64
   kernel (the arm32 kernel only defines the values).
 
 - Harden the arm64 shadow call stack pointer handling: stash the shadow
   stack pointer in the task struct on interrupt, load it directly from
   this structure.
 
 - Signal handling cleanups to remove redundant validation of size
   information and avoid reading the same data from userspace twice.
 
 - Refactor the hwcap macros to make use of the automatically generated
   ID registers. It should make new hwcaps writing less error prone.
 
 - Further arm64 sysreg conversion and some fixes.
 
 - arm64 kselftest fixes and improvements.
 
 - Pointer authentication cleanups: don't sign leaf functions, unify
   asm-arch manipulation.
 
 - Pseudo-NMI code generation optimisations.
 
 - Minor fixes for SME and TPIDR2 handling.
 
 - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable, replace
   strtobool() to kstrtobool() in the cpufeature.c code, apply dynamic
   shadow call stack in two passes, intercept pfn changes in set_pte_at()
   without the required break-before-make sequence, attempt to dump all
   instructions on unhandled kernel faults.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmP0/QsACgkQa9axLQDI
 XvG+gA/+JDVEH9wRzAIZvbp9hSuohPc48xgAmIMP1eiVB0/5qeRjYAJwS33H0rXS
 BPC2kj9IBy/eQeM9ICg0nFd0zYznSVacITqe6NrqeJ1F+ftS4rrHdfxd+J7kIoCs
 V2L8e+BJvmHdhmNV2qMAgJdGlfxfQBA7fv2cy52HKYcouoOh1AUVR/x+yXVXAsCd
 qJP3+dlUKccgm/oc5unEC1eZ49u8O+EoasqOyfG6K5udMgzhEX3K6imT9J3hw0WT
 UjstYkx5uGS/prUrRCQAX96VCHoZmzEDKtQuHkHvQXEYXsYPF3ldbR2CziNJnHe7
 QfSkjJlt8HAtExA+BkwEe9i0MQO/2VF5qsa2e4fA6l7uqGu3LOtS/jJd23C9n9fR
 Id8aBMeN6S8+MjqRA9L2uf4t6e4ISEHoG9ZRdc4WOwloxEEiJoIeun+7bHdOSZLj
 AFdHFCz4NXiiwC0UP0xPDI2YeCLqt5np7HmnrUqwzRpVO8UUagiJD8TIpcBSjBN9
 J68eidenHUW7/SlIeaMKE2lmo8AUEAJs9AorDSugF19/ThJcQdx7vT2UAZjeVB3j
 1dbbwajnlDOk/w8PQC4thFp5/MDlfst0htS3WRwa+vgkweE2EAdTU4hUZ8qEP7FQ
 smhYtlT1xUSTYDTqoaG/U2OWR6/UU79wP0jgcOsHXTuyYrtPI/Q=
 =VmXL
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit
   architectural register (ZT0, for the look-up table feature) that
   Linux needs to save/restore

 - Include TPIDR2 in the signal context and add the corresponding
   kselftests

 - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI
   support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG
   (ARM CMN) at probe time

 - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64

 - Permit EFI boot with MMU and caches on. Instead of cleaning the
   entire loaded kernel image to the PoC and disabling the MMU and
   caches before branching to the kernel bare metal entry point, leave
   the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of
   all of system RAM to populate the initial page tables

 - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64
   kernel (the arm32 kernel only defines the values)

 - Harden the arm64 shadow call stack pointer handling: stash the shadow
   stack pointer in the task struct on interrupt, load it directly from
   this structure

 - Signal handling cleanups to remove redundant validation of size
   information and avoid reading the same data from userspace twice

 - Refactor the hwcap macros to make use of the automatically generated
   ID registers. It should make new hwcaps writing less error prone

 - Further arm64 sysreg conversion and some fixes

 - arm64 kselftest fixes and improvements

 - Pointer authentication cleanups: don't sign leaf functions, unify
   asm-arch manipulation

 - Pseudo-NMI code generation optimisations

 - Minor fixes for SME and TPIDR2 handling

 - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable,
   replace strtobool() to kstrtobool() in the cpufeature.c code, apply
   dynamic shadow call stack in two passes, intercept pfn changes in
   set_pte_at() without the required break-before-make sequence, attempt
   to dump all instructions on unhandled kernel faults

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (130 commits)
  arm64: fix .idmap.text assertion for large kernels
  kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests
  kselftest/arm64: Copy whole EXTRA context
  arm64: kprobes: Drop ID map text from kprobes blacklist
  perf: arm_spe: Print the version of SPE detected
  perf: arm_spe: Add support for SPEv1.2 inverted event filtering
  perf: Add perf_event_attr::config3
  arm64/sme: Fix __finalise_el2 SMEver check
  drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable
  arm64/signal: Only read new data when parsing the ZT context
  arm64/signal: Only read new data when parsing the ZA context
  arm64/signal: Only read new data when parsing the SVE context
  arm64/signal: Avoid rereading context frame sizes
  arm64/signal: Make interface for restore_fpsimd_context() consistent
  arm64/signal: Remove redundant size validation from parse_user_sigframe()
  arm64/signal: Don't redundantly verify FPSIMD magic
  arm64/cpufeature: Use helper macros to specify hwcaps
  arm64/cpufeature: Always use symbolic name for feature value in hwcaps
  arm64/sysreg: Initial unsigned annotations for ID registers
  arm64/sysreg: Initial annotation of signed ID registers
  ...
2023-02-21 15:27:48 -08:00
Darrell Kavanagh
e1d447157f firmware/efi sysfb_efi: Add quirk for Lenovo IdeaPad Duet 3
Another Lenovo convertable which reports a landscape resolution of
1920x1200 with a pitch of (1920 * 4) bytes, while the actual framebuffer
has a resolution of 1200x1920 with a pitch of (1200 * 4) bytes.

Signed-off-by: Darrell Kavanagh <darrell.kavanagh@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-19 14:41:33 +01:00
Darren Hart
190233164c arm64: efi: Force the use of SetVirtualAddressMap() on eMAG and Altra Max machines
Commit 550b33cfd4 ("arm64: efi: Force the use of SetVirtualAddressMap()
on Altra machines") identifies the Altra family via the family field in
the type#1 SMBIOS record. eMAG and Altra Max machines are similarly
affected but not detected with the strict strcmp test.

The type1_family smbios string is not an entirely reliable means of
identifying systems with this issue as OEMs can, and do, use their own
strings for these fields. However, until we have a better solution,
capture the bulk of these systems by adding strcmp matching for "eMAG"
and "Altra Max".

Fixes: 550b33cfd4 ("arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines")
Cc: <stable@vger.kernel.org> # 6.1.x
Cc: Alexandru Elisei <alexandru.elisei@gmail.com>
Signed-off-by: Darren Hart <darren@os.amperecomputing.com>
Tested-by: Justin He <justin.he@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-09 12:38:35 +01:00
Ard Biesheuvel
cf1d2ffcc6 efi: Discover BTI support in runtime services regions
Add the generic plumbing to detect whether or not the runtime code
regions were constructed with BTI/IBT landing pads by the firmware,
permitting the OS to enable enforcement when mapping these regions into
the OS's address space.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
2023-02-04 09:19:02 +01:00
Dan Williams
b0048092f7 efi/cper, cxl: Remove cxl_err.h
While going to create include/linux/cxl.h for some cross-subsystem CXL
definitions I noticed that include/linux/cxl_err.h was already present.
That header has no reason to be global, and it duplicates the RAS
Capability Structure definitions in drivers/cxl/cxl.h. A follow-on patch
can consider unifying the CXL native error tracing with the CPER error
printing.

Also fixed up the spec reference as the latest released spec is v3.0.

Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-03 23:59:58 +01:00
Ard Biesheuvel
1758817e7e efi: Use standard format for printing the EFI revision
The UEFI spec section 4.2.1 describes the way the human readable EFI
revision should be constructed from the 32-bit revision field in the
system table:

  The upper 16 bits of this field contain the major revision value,
  and the lower 16 bits contain the minor revision value. The minor
  revision values are binary coded decimals and are limited to the
  range of 00..99.

  When printed or displayed UEFI spec revision is referred as (Major
  revision).(Minor revision upper decimal).(Minor revision lower
  decimal) or (Major revision).(Minor revision upper decimal) in case
  Minor revision lower decimal is set to 0.

Let's adhere to this when logging the EFI revision to the kernel log.

Note that the bit about binary coded decimals is bogus, and the minor
revision lower decimal is simply the minor revision modulo 10, given the
symbolic definitions provided by the spec itself:

  #define EFI_2_40_SYSTEM_TABLE_REVISION ((2<<16) | (40))
  #define EFI_2_31_SYSTEM_TABLE_REVISION ((2<<16) | (31))
  #define EFI_2_30_SYSTEM_TABLE_REVISION ((2<<16) | (30))

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-03 18:09:00 +01:00
Ard Biesheuvel
234fa51db9 efi: Drop minimum EFI version check at boot
We currently pass a minimum major version to the generic EFI helper that
checks the system table magic and version, and refuse to boot if the
value is lower.

The motivation for this check is unknown, and even the code that uses
major version 2 as the minimum (ARM, arm64 and RISC-V) should make it
past this check without problems, and boot to a point where we have
access to a console or some other means to inform the user that the
firmware's major revision number made us unhappy. (Revision 2.0 of the
UEFI specification was released in January 2006, whereas ARM, arm64 and
RISC-V support where added in 2009, 2013 and 2017, respectively, so
checking for major version 2 or higher is completely arbitrary)

So just drop the check.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-03 18:01:07 +01:00
Ard Biesheuvel
ace013a543 efi: zboot: Use EFI protocol to remap code/data with the right attributes
Use the recently introduced EFI_MEMORY_ATTRIBUTES_PROTOCOL in the zboot
implementation to set the right attributes for the code and data
sections of the decompressed image, i.e., EFI_MEMORY_RO for code and
EFI_MEMORY_XP for data.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-03 16:39:22 +01:00
Anton Gusev
966d47e1f2 efi: fix potential NULL deref in efi_mem_reserve_persistent
When iterating on a linked list, a result of memremap is dereferenced
without checking it for NULL.

This patch adds a check that falls back on allocating a new page in
case memremap doesn't succeed.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 18df7577ad ("efi/memreserve: deal with memreserve entries in unmapped memory")
Signed-off-by: Anton Gusev <aagusev@ispras.ru>
[ardb: return -ENOMEM instead of breaking out of the loop]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-03 14:52:10 +01:00
Ard Biesheuvel
636ab417a7 efi: Accept version 2 of memory attributes table
UEFI v2.10 introduces version 2 of the memory attributes table, which
turns the reserved field into a flags field, but is compatible with
version 1 in all other respects. So let's not complain about version 2
if we encounter it.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-02 18:32:25 +01:00
Evgeniy Baskov
79729f26b0 efi/libstub: Add memory attribute protocol definitions
EFI_MEMORY_ATTRIBUTE_PROTOCOL servers as a better alternative to
DXE services for setting memory attributes in EFI Boot Services
environment. This protocol is better since it is a part of UEFI
specification itself and not UEFI PI specification like DXE
services.

Add EFI_MEMORY_ATTRIBUTE_PROTOCOL definitions.
Support mixed mode properly for its calls.

Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Evgeniy Baskov <baskov@ispras.ru>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-30 13:11:34 +01:00
Johan Hovold
0217a40d7b efi: efivars: prevent double registration
Add the missing sanity check to efivars_register() so that it is no
longer possible to override an already registered set of efivar ops
(without first deregistering them).

This can help debug initialisation ordering issues where drivers have so
far unknowingly been relying on overriding the generic ops.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-26 21:32:01 +01:00
Johan Hovold
bad267f9e1 efi: verify that variable services are supported
Current Qualcomm UEFI firmware does not implement the variable services
but not all revisions clear the corresponding bits in the RT_PROP table
services mask and instead the corresponding calls return
EFI_UNSUPPORTED.

This leads to efi core registering the generic efivar ops even when the
variable services are not supported or when they are accessed through
some other interface (e.g. Google SMI or the upcoming Qualcomm SCM
implementation).

Instead of playing games with init call levels to make sure that the
custom implementations are registered after the generic one, make sure
that get_next_variable() is actually supported before registering the
generic ops.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-26 21:32:01 +01:00
Ard Biesheuvel
6178617038 efi: arm64: enter with MMU and caches enabled
Instead of cleaning the entire loaded kernel image to the PoC and
disabling the MMU and caches before branching to the kernel's bare metal
entry point, we can leave the MMU and caches enabled, and rely on EFI's
cacheable 1:1 mapping of all of system RAM (which is mandated by the
spec) to populate the initial page tables.

This removes the need for managing coherency in software, which is
tedious and error prone.

Note that we still need to clean the executable region of the image to
the PoU if this is required for I/D coherency, but only if we actually
decided to move the image in memory, as otherwise, this will have been
taken care of by the loader.

This change affects both the builtin EFI stub as well as the zboot
decompressor, which now carries the entire EFI stub along with the
decompression code and the compressed image.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-7-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:51:08 +00:00
Johan Hovold
beeb107c5b efi: efivars: add efivars printk prefix
Add an 'efivars: ' printk prefix to make the log entries stand out more,
for example:

	efivars: Registered efivars operations

While at it, change the sole remaining direct printk() call to pr_err().

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-23 12:32:21 +01:00
Demi Marie Obenour
fa7bee867d efi: Warn if trying to reserve memory under Xen
Doing so cannot work and should never happen.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-23 11:33:24 +01:00
Demi Marie Obenour
01de145dc7 efi: Actually enable the ESRT under Xen
The ESRT can be parsed if EFI_PARAVIRT is enabled, even if EFI_MEMMAP is
not.  Also allow the ESRT to be in reclaimable memory, as that is where
future Xen versions will put it.

Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-23 11:33:24 +01:00
Demi Marie Obenour
c0fecaa44d efi: Apply allowlist to EFI configuration tables when running under Xen
As it turns out, Xen does not guarantee that EFI boot services data
regions in memory are preserved, which means that EFI configuration
tables pointing into such memory regions may be corrupted before the
dom0 OS has had a chance to inspect them.

This is causing problems for Qubes OS when it attempts to perform system
firmware updates, which requires that the contents of the EFI System
Resource Table are valid when the fwupd userspace program runs.

However, other configuration tables such as the memory attributes table
or the runtime properties table are equally affected, and so we need a
comprehensive workaround that works for any table type.

So when running under Xen, check the EFI memory descriptor covering the
start of the table, and disregard the table if it does not reside in
memory that is preserved by Xen.

Co-developed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-23 11:33:24 +01:00
Demi Marie Obenour
aca1d27ac3 efi: xen: Implement memory descriptor lookup based on hypercall
Xen on x86 boots dom0 in EFI mode but without providing a memory map.
This means that some consistency checks we would like to perform on
configuration tables or other data structures in memory are not
currently possible.  Xen does, however, expose EFI memory descriptor
info via a Xen hypercall, so let's wire that up instead.  It turns out
that the returned information is not identical to what Linux's
efi_mem_desc_lookup would return: the address returned is the address
passed to the hypercall, and the size returned is the number of bytes
remaining in the configuration table.  However, none of the callers of
efi_mem_desc_lookup() currently care about this.  In the future, Xen may
gain a hypercall that returns the actual start address, which can be
used instead.

Co-developed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-22 10:14:15 +01:00
Demi Marie Obenour
ab03e91e60 efi: memmap: Disregard bogus entries instead of returning them
The ESRT code currently contains two consistency checks on the memory
descriptor it obtains, but one of them is both incomplete and can only
trigger on invalid descriptors.

So let's drop these checks, and instead disregard descriptors entirely
if the start address is misaligned, or if the number of pages reaches
to or beyond the end of the address space.  Note that the memory map as
a whole could still be inconsistent: multiple entries might cover the
same area, or the address could be outside of the addressable PA space,
but validating that goes beyond the scope of these helpers.  Also note
that since the physical address space is never 64-bits wide, a
descriptor that includes the last page of memory is not valid.  This is
fortunate, since it means that a valid physical address will never be an
error pointer and that the length of a memory descriptor in bytes will
fit in a 64-bit unsigned integer.

Co-developed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-22 10:14:15 +01:00
Johan Hovold
2cf9e278ef efi: efivars: make efivar_supports_writes() return bool
For consistency with the new efivar_is_available() function, change the
return type of efivar_supports_writes() to bool.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-17 16:21:45 +01:00
Johan Hovold
ade7fd908d efi: efivars: drop kobject from efivars_register()
Since commit 0f5b2c69a4 ("efi: vars: Remove deprecated 'efivars' sysfs
interface") and the removal of the sysfs interface there are no users of
the efivars kobject.

Drop the kobject argument from efivars_register() and add a new
efivar_is_available() helper in favour of the old efivars_kobject().

Note that the new helper uses the prefix 'efivar' (i.e. without an 's')
for consistency with efivar_supports_writes() and the rest of the
interface (except the registration functions).

For the benefit of drivers with optional EFI support, also provide a
dummy implementation of efivar_is_available().

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-17 16:14:36 +01:00
Andy Shevchenko
2a5b4ccf0d efi/earlycon: Speed up scrolling by disregarding empty space
Currently the scroll copies the full screen which is slow on high
resolution displays. At the same time, most of the screen is an empty
space which has no need to be copied over and over.

Optimize the scrolling algorithm by caching the x coordinates of the
last printed lines and scroll in accordance with the maximum x in that
cache.

On my Microsoft Surface Book (the first version) this produces a
significant speedup of the console 90 seconds vs. 168 seconds with the
kernel command line having

	ignore_loglevel earlycon=efifb keep_bootcon

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-10 15:16:12 +01:00
Andy Shevchenko
b7a1cd2438 efi/earlycon: Replace open coded strnchrnul()
strnchrnul() can be called in the early stages. Replace
open coded variant in the EFI early console driver.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-10 15:16:12 +01:00
Ding Hui
e006ac3003 efi: fix userspace infinite retry read efivars after EFI runtime services page fault
After [1][2], if we catch exceptions due to EFI runtime service, we will
clear EFI_RUNTIME_SERVICES bit to disable EFI runtime service, then the
subsequent routine which invoke the EFI runtime service should fail.

But the userspace cat efivars through /sys/firmware/efi/efivars/ will stuck
and infinite loop calling read() due to efivarfs_file_read() return -EINTR.

The -EINTR is converted from EFI_ABORTED by efi_status_to_err(), and is
an improper return value in this situation, so let virt_efi_xxx() return
EFI_DEVICE_ERROR and converted to -EIO to invoker.

Cc: <stable@vger.kernel.org>
Fixes: 3425d934fc ("efi/x86: Handle page faults occurring while running EFI runtime services")
Fixes: 23715a26c8 ("arm64: efi: Recover from synchronous exceptions occurring in firmware")
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-03 10:52:16 +01:00
Johan Hovold
703c13fe3c efi: fix NULL-deref in init error path
In cases where runtime services are not supported or have been disabled,
the runtime services workqueue will never have been allocated.

Do not try to destroy the workqueue unconditionally in the unlikely
event that EFI initialisation fails to avoid dereferencing a NULL
pointer.

Fixes: 98086df8b7 ("efi: add missed destroy_workqueue when efisubsys_init fails")
Cc: stable@vger.kernel.org
Cc: Li Heng <liheng40@huawei.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-03 10:52:15 +01:00
Johan Hovold
41a15855c1 efi: random: fix NULL-deref when refreshing seed
Do not try to refresh the RNG seed in case the firmware does not support
setting variables.

This is specifically needed to prevent a NULL-pointer dereference on the
Lenovo X13s with some firmware revisions, or more generally, whenever
the runtime services have been disabled (e.g. efi=noruntime or with
PREEMPT_RT).

Fixes: e7b813b32a ("efi: random: refresh non-volatile random seed when RNG is initialized")
Reported-by: Steev Klimaszewski <steev@kali.org>
Reported-by: Bjorn Andersson <andersson@kernel.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sc8280xp-lenovo-thinkpad-x13s
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-12-20 03:13:45 +01:00
Linus Torvalds
4cb1fc6fff ARM updates for 6.2
- update unwinder to cope with module PLTs
 - enable UBSAN on ARM
 - improve kernel fault message
 - update UEFI runtime page tables dump
 - avoid clang's __aeabi_uldivmod generated in NWFPE code
 - disable FIQs on CPU shutdown paths
 - update XOR register usage
 - a number of build updates (using .arch, thread pointer,
   removal of lazy evaluation in Makefile)
 - conversion of stacktrace code to stackwalk
 - findbit assembly updates
 - hwcap feature updates for ARMv8 CPUs
 - instruction dump updates for big-endian platforms
 - support for function error injection
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmOYbjMACgkQ9OeQG+St
 rGScZw//ePQ+E/Me/p+mV6ecVpx0r3n7iM01TCqtLj2j+wSuk/VhYQLqLAaNVUR1
 YeBxvpGbmigzOCERo2hUxosmloP0bTh9zelNYJCywg3yeezoV8IvfTYYY3UyTCBX
 mlWwm4lKyvTnfY3qXrmLCu/HxVJqyOi6IWLZFzqxAz9zS9VYX/nbUrsUzbZgpgs6
 Kvcysj/jvdknbh1aMHoD/uHV7EoOKLUegmW7BXQToBMiLKIemeEoeiaD1rMGl9Ro
 DJiyfnUlGJkchsy+sRWKXL1GQG4jCfPNVhnBoBpAfLJgjIa9ia9wTpfsKER69pJ2
 Xod2b78VusYim5SS72WU+AF53fH4HN8s1RMOiP35XazT0j+bYgv+WRUXLNwtyEYW
 lPBhFe4P622LjJgJlswilZ8+RWtY9Inw5Cl9xKfWbC+qwE88Bpi63FQ5lyshqUUJ
 anLQ+ic/6Gy8jQRWjZM6f1z5sEtESHgi631B+gJ8L4BeeaB3KozqrlYEtnMDkVRo
 Tz+4EO4RHV+fwUd0wj0O5ZxwKPXdFKivte++XWgogr5u/Qqhl+kzi9H+j27u4koF
 nvfMbz7Nf9xe4CSAiJTn7qs3f2mZWFiQNQHGtXWACAbZc7oGVPwhGXKDN44SFYAE
 oq7P7Hkcs+d51K8ZEL3IVC28bHejdR4pI5jNm9ECgFdG90s03+0=
 =1spR
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM updates from Russell King:

 - update unwinder to cope with module PLTs

 - enable UBSAN on ARM

 - improve kernel fault message

 - update UEFI runtime page tables dump

 - avoid clang's __aeabi_uldivmod generated in NWFPE code

 - disable FIQs on CPU shutdown paths

 - update XOR register usage

 - a number of build updates (using .arch, thread pointer, removal of
   lazy evaluation in Makefile)

 - conversion of stacktrace code to stackwalk

 - findbit assembly updates

 - hwcap feature updates for ARMv8 CPUs

 - instruction dump updates for big-endian platforms

 - support for function error injection

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (31 commits)
  ARM: 9279/1: support function error injection
  ARM: 9277/1: Make the dumped instructions are consistent with the disassembled ones
  ARM: 9276/1: Refactor dump_instr()
  ARM: 9275/1: Drop '-mthumb' from AFLAGS_ISA
  ARM: 9274/1: Add hwcap for Speculative Store Bypassing Safe
  ARM: 9273/1: Add hwcap for Speculation Barrier(SB)
  ARM: 9272/1: vfp: Add hwcap for FEAT_AA32I8MM
  ARM: 9271/1: vfp: Add hwcap for FEAT_AA32BF16
  ARM: 9270/1: vfp: Add hwcap for FEAT_FHM
  ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd
  ARM: 9268/1: vfp: Add hwcap FPHP and ASIMDHP for FEAT_FP16
  ARM: 9267/1: Define Armv8 registers in AArch32 state
  ARM: findbit: add unwinder information
  ARM: findbit: operate by words
  ARM: findbit: convert to macros
  ARM: findbit: provide more efficient ARMv7 implementation
  ARM: findbit: document ARMv5 bit offset calculation
  ARM: 9259/1: stacktrace: Convert stacktrace to generic ARCH_STACKWALK
  ARM: 9258/1: stacktrace: Make stack walk callback consistent with generic code
  ARM: 9265/1: pass -march= only to compiler
  ...
2022-12-13 15:22:14 -08:00
Linus Torvalds
4eb77fa102 - Do some spring cleaning to the compressed boot code by moving the
EFI mixed-mode code to a separate compilation unit, the AMD memory
 encryption early code where it belongs and fixing up build dependencies.
 Make the deprecated EFI handover protocol optional with the goal of
 removing it at some point (Ard Biesheuvel)
 
 - Skip realmode init code on Xen PV guests as it is not needed there
 
 - Remove an old 32-bit PIC code compiler workaround
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOYaiMACgkQEsHwGGHe
 VUrNVhAAk3lLagEsrBcQ24SnMMAyQvdKfRucn9fbs72jBCyWbDqXcE59qNgdbMS1
 3rIL+EJdF8jlm5K28GjRS1WSvwUyYbyFEfUcYfqZl9L/5PAl7PlG7nNQw7/gXnw+
 xS57w/Q3cONlo5LC0K2Zkbj/59RvDoBEs3nkhozkKR0npTDW/LK3Vl0zgKTkvqsV
 DzRIHhWsqSEvpdowbQmQCyqFh/pOoQlZkQwjYVA9+SaQYdH3Yo1dpLd5i9I9eVmJ
 dci/HDU+plwYYuZ1XhxwXr82PcdCUVYjJ/DTt9GkTVYq7u5EWx62puxTl+c+wbG2
 H1WBXuZHBGdzNMFdnb1k9RuLCaYdaxKTNlZh3FPMMDtkjtjKTl/olXTlFUYFgI6E
 FPv4hi15g6pMveS3K6YUAd0uGvpsjvLUZHPqMDVS2trhxLENQALc6Id/PwqzrQ1T
 FzfPYcDyFFwMM3MDuWc8ClwEDD9wr0Z4m4Aek/ca2r85AKEX8ZtTTlWZoI4E9A4B
 hEjUFnRhT/d6XLWwZqcOIKfwtbpKAjdsCN3ElFst8ogRFAXqW8luDoI4BRCkBC4p
 T4RHdij4afkuFjSAxBacazpaavtcCsDqXwBpeL4YN+4fA7+NokVZGiQVh/3S8BPn
 LlgIf6awFq6yQq7JyEGPdk+dWn5sknldixZ55m666ZLzSvQhvE8=
 =VGZx
 -----END PGP SIGNATURE-----

Merge tag 'x86_boot_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 boot updates from Borislav Petkov:
 "A  of early boot cleanups and fixes.

   - Do some spring cleaning to the compressed boot code by moving the
     EFI mixed-mode code to a separate compilation unit, the AMD memory
     encryption early code where it belongs and fixing up build
     dependencies. Make the deprecated EFI handover protocol optional
     with the goal of removing it at some point (Ard Biesheuvel)

   - Skip realmode init code on Xen PV guests as it is not needed there

   - Remove an old 32-bit PIC code compiler workaround"

* tag 'x86_boot_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Remove x86_32 PIC using %ebx workaround
  x86/boot: Skip realmode init code when running as Xen PV guest
  x86/efi: Make the deprecated EFI handover protocol optional
  x86/boot/compressed: Only build mem_encrypt.S if AMD_MEM_ENCRYPT=y
  x86/boot/compressed: Adhere to calling convention in get_sev_encryption_bit()
  x86/boot/compressed: Move startup32_check_sev_cbit() out of head_64.S
  x86/boot/compressed: Move startup32_check_sev_cbit() into .text
  x86/boot/compressed: Move startup32_load_idt() out of head_64.S
  x86/boot/compressed: Move startup32_load_idt() into .text section
  x86/boot/compressed: Pull global variable reference into startup32_load_idt()
  x86/boot/compressed: Avoid touching ECX in startup32_set_idt_entry()
  x86/boot/compressed: Simplify IDT/GDT preserve/restore in the EFI thunk
  x86/boot/compressed, efi: Merge multiple definitions of image_offset into one
  x86/boot/compressed: Move efi32_pe_entry() out of head_64.S
  x86/boot/compressed: Move efi32_entry out of head_64.S
  x86/boot/compressed: Move efi32_pe_entry into .text section
  x86/boot/compressed: Move bootargs parsing out of 32-bit startup code
  x86/boot/compressed: Move 32-bit entrypoint code into .text section
  x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S
2022-12-13 14:45:29 -08:00
Linus Torvalds
fc4c9f4504 EFI updates for v6.2:
- Refactor the zboot code so that it incorporates all the EFI stub
   logic, rather than calling the decompressed kernel as a EFI app.
 - Add support for initrd= command line option to x86 mixed mode.
 - Allow initrd= to be used with arbitrary EFI accessible file systems
   instead of just the one the kernel itself was loaded from.
 - Move some x86-only handling and manipulation of the EFI memory map
   into arch/x86, as it is not used anywhere else.
 - More flexible handling of any random seeds provided by the boot
   environment (i.e., systemd-boot) so that it becomes available much
   earlier during the boot.
 - Allow improved arch-agnostic EFI support in loaders, by setting a
   uniform baseline of supported features, and adding a generic magic
   number to the DOS/PE header. This should allow loaders such as GRUB or
   systemd-boot to reduce the amount of arch-specific handling
   substantially.
 - (arm64) Run EFI runtime services from a dedicated stack, and use it to
   recover from synchronous exceptions that might occur in the firmware
   code.
 - (arm64) Ensure that we don't allocate memory outside of the 48-bit
   addressable physical range.
 - Make EFI pstore record size configurable
 - Add support for decoding CXL specific CPER records
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmOTQ1cACgkQw08iOZLZ
 jyQRkAv+LqaZFWeVwhAQHiw/N3RnRM0nZHea6++D2p1y/ZbCpwv3pdLl2YHQ1KmW
 wDG9Nr4C1ITLtfy1YZKeYpwloQtq9S1GZDWnFpVv/hdo7L924eRAwIlxowWn1OnP
 ruxv2PaYXyb0plh1YD1f6E1BqrfUOtajET55Kxs9ZsxmnMtDpIX3NiYy4LKMBIZC
 +Eywt41M3uBX+wgmSujFBMVVJjhOX60WhUYXqy0RXwDKOyrz/oW5td+eotSCreB6
 FVbjvwQvUdtzn4s1FayOMlTrkxxLw4vLhsaUGAdDOHd3rg3sZT9Xh1HqFFD6nss6
 ZAzAYQ6BzdiV/5WSB9meJe+BeG1hjTNKjJI6JPO2lctzYJqlnJJzI6JzBuH9vzQ0
 dffLB8NITeEW2rphIh+q+PAKFFNbXWkJtV4BMRpqmzZ/w7HwupZbUXAzbWE8/5km
 qlFpr0kmq8GlVcbXNOFjmnQVrJ8jPYn+O3AwmEiVAXKZJOsMH0sjlXHKsonme9oV
 Sk71c6Em
 =JEXz
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:
 "Another fairly sizable pull request, by EFI subsystem standards.

  Most of the work was done by me, some of it in collaboration with the
  distro and bootloader folks (GRUB, systemd-boot), where the main focus
  has been on removing pointless per-arch differences in the way EFI
  boots a Linux kernel.

   - Refactor the zboot code so that it incorporates all the EFI stub
     logic, rather than calling the decompressed kernel as a EFI app.

   - Add support for initrd= command line option to x86 mixed mode.

   - Allow initrd= to be used with arbitrary EFI accessible file systems
     instead of just the one the kernel itself was loaded from.

   - Move some x86-only handling and manipulation of the EFI memory map
     into arch/x86, as it is not used anywhere else.

   - More flexible handling of any random seeds provided by the boot
     environment (i.e., systemd-boot) so that it becomes available much
     earlier during the boot.

   - Allow improved arch-agnostic EFI support in loaders, by setting a
     uniform baseline of supported features, and adding a generic magic
     number to the DOS/PE header. This should allow loaders such as GRUB
     or systemd-boot to reduce the amount of arch-specific handling
     substantially.

   - (arm64) Run EFI runtime services from a dedicated stack, and use it
     to recover from synchronous exceptions that might occur in the
     firmware code.

   - (arm64) Ensure that we don't allocate memory outside of the 48-bit
     addressable physical range.

   - Make EFI pstore record size configurable

   - Add support for decoding CXL specific CPER records"

* tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (43 commits)
  arm64: efi: Recover from synchronous exceptions occurring in firmware
  arm64: efi: Execute runtime services from a dedicated stack
  arm64: efi: Limit allocations to 48-bit addressable physical region
  efi: Put Linux specific magic number in the DOS header
  efi: libstub: Always enable initrd command line loader and bump version
  efi: stub: use random seed from EFI variable
  efi: vars: prohibit reading random seed variables
  efi: random: combine bootloader provided RNG seed with RNG protocol output
  efi/cper, cxl: Decode CXL Error Log
  efi/cper, cxl: Decode CXL Protocol Error Section
  efi: libstub: fix efi_load_initrd_dev_path() kernel-doc comment
  efi: x86: Move EFI runtime map sysfs code to arch/x86
  efi: runtime-maps: Clarify purpose and enable by default for kexec
  efi: pstore: Add module parameter for setting the record size
  efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures
  efi: memmap: Move manipulation routines into x86 arch tree
  efi: memmap: Move EFI fake memmap support into x86 arch tree
  efi: libstub: Undeprecate the command line initrd loader
  efi: libstub: Add mixed mode support to command line initrd loader
  efi: libstub: Permit mixed mode return types other than efi_status_t
  ...
2022-12-13 14:31:47 -08:00
Linus Torvalds
268325bda5 Random number generator updates for Linux 6.2-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmOU+U8ACgkQSfxwEqXe
 A67NnQ//Y5DltmvibyPd7r1TFT2gUYv+Rx3sUV9ZE1NYptd/SWhhcL8c5FZ70Fuw
 bSKCa1uiWjOxosjXT1kGrWq3de7q7oUpAPSOGxgxzoaNURIt58N/ajItCX/4Au8I
 RlGAScHy5e5t41/26a498kB6qJ441fBEqCYKQpPLINMBAhe8TQ+NVp0rlpUwNHFX
 WrUGg4oKWxdBIW3HkDirQjJWDkkAiklRTifQh/Al4b6QDbOnRUGGCeckNOhixsvS
 waHWTld+Td8jRrA4b82tUb2uVZ2/b8dEvj/A8CuTv4yC0lywoyMgBWmJAGOC+UmT
 ZVNdGW02Jc2T+Iap8ZdsEmeLHNqbli4+IcbY5xNlov+tHJ2oz41H9TZoYKbudlr6
 /ReAUPSn7i50PhbQlEruj3eg+M2gjOeh8OF8UKwwRK8PghvyWQ1ScW0l3kUhPIhI
 PdIG6j4+D2mJc1FIj2rTVB+Bg933x6S+qx4zDxGlNp62AARUFYf6EgyD6aXFQVuX
 RxcKb6cjRuFkzFiKc8zkqg5edZH+IJcPNuIBmABqTGBOxbZWURXzIQvK/iULqZa4
 CdGAFIs6FuOh8pFHLI3R4YoHBopbHup/xKDEeAO9KZGyeVIuOSERDxxo5f/ITzcq
 APvT77DFOEuyvanr8RMqqh0yUjzcddXqw9+ieufsAyDwjD9DTuE=
 =QRhK
 -----END PGP SIGNATURE-----

Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:

 - Replace prandom_u32_max() and various open-coded variants of it,
   there is now a new family of functions that uses fast rejection
   sampling to choose properly uniformly random numbers within an
   interval:

       get_random_u32_below(ceil) - [0, ceil)
       get_random_u32_above(floor) - (floor, U32_MAX]
       get_random_u32_inclusive(floor, ceil) - [floor, ceil]

   Coccinelle was used to convert all current users of
   prandom_u32_max(), as well as many open-coded patterns, resulting in
   improvements throughout the tree.

   I'll have a "late" 6.1-rc1 pull for you that removes the now unused
   prandom_u32_max() function, just in case any other trees add a new
   use case of it that needs to converted. According to linux-next,
   there may be two trivial cases of prandom_u32_max() reintroductions
   that are fixable with a 's/.../.../'. So I'll have for you a final
   conversion patch doing that alongside the removal patch during the
   second week.

   This is a treewide change that touches many files throughout.

 - More consistent use of get_random_canary().

 - Updates to comments, documentation, tests, headers, and
   simplification in configuration.

 - The arch_get_random*_early() abstraction was only used by arm64 and
   wasn't entirely useful, so this has been replaced by code that works
   in all relevant contexts.

 - The kernel will use and manage random seeds in non-volatile EFI
   variables, refreshing a variable with a fresh seed when the RNG is
   initialized. The RNG GUID namespace is then hidden from efivarfs to
   prevent accidental leakage.

   These changes are split into random.c infrastructure code used in the
   EFI subsystem, in this pull request, and related support inside of
   EFISTUB, in Ard's EFI tree. These are co-dependent for full
   functionality, but the order of merging doesn't matter.

 - Part of the infrastructure added for the EFI support is also used for
   an improvement to the way vsprintf initializes its siphash key,
   replacing an sleep loop wart.

 - The hardware RNG framework now always calls its correct random.c
   input function, add_hwgenerator_randomness(), rather than sometimes
   going through helpers better suited for other cases.

 - The add_latent_entropy() function has long been called from the fork
   handler, but is a no-op when the latent entropy gcc plugin isn't
   used, which is fine for the purposes of latent entropy.

   But it was missing out on the cycle counter that was also being mixed
   in beside the latent entropy variable. So now, if the latent entropy
   gcc plugin isn't enabled, add_latent_entropy() will expand to a call
   to add_device_randomness(NULL, 0), which adds a cycle counter,
   without the absent latent entropy variable.

 - The RNG is now reseeded from a delayed worker, rather than on demand
   when used. Always running from a worker allows it to make use of the
   CPU RNG on platforms like S390x, whose instructions are too slow to
   do so from interrupts. It also has the effect of adding in new inputs
   more frequently with more regularity, amounting to a long term
   transcript of random values. Plus, it helps a bit with the upcoming
   vDSO implementation (which isn't yet ready for 6.2).

 - The jitter entropy algorithm now tries to execute on many different
   CPUs, round-robining, in hopes of hitting even more memory latencies
   and other unpredictable effects. It also will mix in a cycle counter
   when the entropy timer fires, in addition to being mixed in from the
   main loop, to account more explicitly for fluctuations in that timer
   firing. And the state it touches is now kept within the same cache
   line, so that it's assured that the different execution contexts will
   cause latencies.

* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
  random: include <linux/once.h> in the right header
  random: align entropy_timer_state to cache line
  random: mix in cycle counter when jitter timer fires
  random: spread out jitter callback to different CPUs
  random: remove extraneous period and add a missing one in comments
  efi: random: refresh non-volatile random seed when RNG is initialized
  vsprintf: initialize siphash key using notifier
  random: add back async readiness notifier
  random: reseed in delayed work rather than on-demand
  random: always mix cycle counter in add_latent_entropy()
  hw_random: use add_hwgenerator_randomness() for early entropy
  random: modernize documentation comment on get_random_bytes()
  random: adjust comment to account for removed function
  random: remove early archrandom abstraction
  random: use random.trust_{bootloader,cpu} command line option only
  stackprotector: actually use get_random_canary()
  stackprotector: move get_random_canary() into stackprotector.h
  treewide: use get_random_u32_inclusive() when possible
  treewide: use get_random_u32_{above,below}() instead of manual loop
  treewide: use get_random_u32_below() instead of deprecated function
  ...
2022-12-12 16:22:22 -08:00
Linus Torvalds
7adcadb984 - Make ghes_edac a simple module like the rest of the EDAC drivers and
drop this forced built-in only configuration by disentangling it from
 GHES. Work by Jia He.
 
 - The usual small cleanups and improvements all over EDAC land
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOXPvoACgkQEsHwGGHe
 VUq98xAAmhz4u9e9pXG0Ixkx25ZtnZ+YxANeQ53Hsa2gWicbcoFgL2E30gi97c1y
 X9W361B2Q5dYq+J/YRUnEOXlI/KMWLxzNykSvVipUFNfxXZH+PijEAArz2V35/uE
 6ZISRLUYVYEtHEoUXbTogeyBmBUnIaJfYheZCluDQlWPggsDESP1qmE+FTg25OBs
 rDl5y+zUZYPxrWustNodVThPyhdMwGyYAUS6qYKCoNs9SNkAjGnrXoPc9j/U+cV+
 qMY2dNS3uKnCujKEssQhcHucyWgCEDvmEKWMH4ItryV2UBBjpNRoM6HDe7XFKwVJ
 riOKX8VDrpdSdlV1jbCx9KB47BUwFygOYsFdW7gIDJ1hb8usN4nSYQNDIlZKEIQG
 cHNpv2XGT+pCSvyc4Iv2Fgyvnp25XensSQwQAtk5Y4/lJL1yrgcPjMOkPmRS+mmH
 BclDWNbL+gwqkyWxgfoivDBOetLgwJYTr2ewBr6QbBtwLB8rL4BxXIdomcoFPuxi
 jAxixZnTbS+Xq5S7uYK4r6KbaHGcJtwolXMGjx13IHmPfvYtTTQzfRcrBlAtQ/pV
 BDLoygmDVlkhSVx6bi5V5QZ06rcWYR4cRsBQ54FnBGMr730ZljgFONOHFtUab28T
 C+YUOaeLEYEYI0cIkkyoSuiz6avB6YvQAiyEPM0EdHZrQFwhBBw=
 =DFp8
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - Make ghes_edac a simple module like the rest of the EDAC drivers and
   drop the forced built-in only configuration by disentangling it from
   GHES (Jia He)

 - The usual small cleanups and improvements all over EDAC land

* tag 'edac_updates_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/i10nm: fix refcount leak in pci_get_dev_wrapper()
  EDAC/i5400: Fix typo in comment: vaious -> various
  EDAC/mc_sysfs: Increase legacy channel support to 12
  MAINTAINERS: Make Mauro EDAC reviewer
  MAINTAINERS: Make Manivannan Sadhasivam the maintainer of qcom_edac
  EDAC/igen6: Return the correct error type when not the MC owner
  apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg()
  EDAC: Check for GHES preference in the chipset-specific EDAC drivers
  EDAC/ghes: Make ghes_edac a proper module
  EDAC/ghes: Prepare to make ghes_edac a proper module
  EDAC/ghes: Add a notifier for reporting memory errors
  efi/cper: Export several helpers for ghes_edac to use
  EDAC/i5000: Mark as BROKEN
2022-12-12 14:47:31 -08:00
Linus Torvalds
06cff4a58e arm64 updates for 6.2
ACPI:
 	* Enable FPDT support for boot-time profiling
 	* Fix CPU PMU probing to work better with PREEMPT_RT
 	* Update SMMUv3 MSI DeviceID parsing to latest IORT spec
 	* APMT support for probing Arm CoreSight PMU devices
 
 CPU features:
 	* Advertise new SVE instructions (v2.1)
 	* Advertise range prefetch instruction
 	* Advertise CSSC ("Common Short Sequence Compression") scalar
 	  instructions, adding things like min, max, abs, popcount
 	* Enable DIT (Data Independent Timing) when running in the kernel
 	* More conversion of system register fields over to the generated
 	  header
 
 CPU misfeatures:
 	* Workaround for Cortex-A715 erratum #2645198
 
 Dynamic SCS:
 	* Support for dynamic shadow call stacks to allow switching at
 	  runtime between Clang's SCS implementation and the CPU's
 	  pointer authentication feature when it is supported (complete
 	  with scary DWARF parser!)
 
 Tracing and debug:
 	* Remove static ftrace in favour of, err, dynamic ftrace!
 	* Seperate 'struct ftrace_regs' from 'struct pt_regs' in core
 	  ftrace and existing arch code
 	* Introduce and implement FTRACE_WITH_ARGS on arm64 to replace
 	  the old FTRACE_WITH_REGS
 	* Extend 'crashkernel=' parameter with default value and fallback
 	  to placement above 4G physical if initial (low) allocation
 	  fails
 
 SVE:
 	* Optimisation to avoid disabling SVE unconditionally on syscall
 	  entry and just zeroing the non-shared state on return instead
 
 Exceptions:
 	* Rework of undefined instruction handling to avoid serialisation
 	  on global lock (this includes emulation of user accesses to the
 	  ID registers)
 
 Perf and PMU:
 	* Support for TLP filters in Hisilicon's PCIe PMU device
 	* Support for the DDR PMU present in Amlogic Meson G12 SoCs
 	* Support for the terribly-named "CoreSight PMU" architecture
 	  from Arm (and Nvidia's implementation of said architecture)
 
 Misc:
 	* Tighten up our boot protocol for systems with memory above
           52 bits physical
 	* Const-ify static keys to satisty jump label asm constraints
 	* Trivial FFA driver cleanups in preparation for v1.1 support
 	* Export the kernel_neon_* APIs as GPL symbols
 	* Harden our instruction generation routines against
 	  instrumentation
 	* A bunch of robustness improvements to our arch-specific selftests
 	* Minor cleanups and fixes all over (kbuild, kprobes, kfence, PMU, ...)
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmOPLFAQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNPRcCACLyDTvkimiqfoPxzzgdkx/6QOvw9s3/mXg
 UcTORSZBR1VnYkiMYEKVz/tTfG99dnWtD8/0k/rz48NbhBfsF2sN4ukyBBXVf0zR
 fjnaVyVC11LUgBgZKPo6maV+jf/JWf9hJtpPl06KTiPb2Hw2JX4DXg+PeF8t2hGx
 NLH4ekQOrlDM8mlsN5mc0YsHbiuO7Xe/NRuet8TsgU4bEvLAwO6bzOLVUMqDQZNq
 bQe2ENcGVAzAf7iRJb38lj9qB/5hrQTHRXqLXMSnJyyVjQEwYca0PeJMa7x30bXF
 ZZ+xQ8Wq0mxiffZraf6SE34yD4gaYS4Fziw7rqvydC15vYhzJBH1
 =hV+2
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The highlights this time are support for dynamically enabling and
  disabling Clang's Shadow Call Stack at boot and a long-awaited
  optimisation to the way in which we handle the SVE register state on
  system call entry to avoid taking unnecessary traps from userspace.

  Summary:

  ACPI:
   - Enable FPDT support for boot-time profiling
   - Fix CPU PMU probing to work better with PREEMPT_RT
   - Update SMMUv3 MSI DeviceID parsing to latest IORT spec
   - APMT support for probing Arm CoreSight PMU devices

  CPU features:
   - Advertise new SVE instructions (v2.1)
   - Advertise range prefetch instruction
   - Advertise CSSC ("Common Short Sequence Compression") scalar
     instructions, adding things like min, max, abs, popcount
   - Enable DIT (Data Independent Timing) when running in the kernel
   - More conversion of system register fields over to the generated
     header

  CPU misfeatures:
   - Workaround for Cortex-A715 erratum #2645198

  Dynamic SCS:
   - Support for dynamic shadow call stacks to allow switching at
     runtime between Clang's SCS implementation and the CPU's pointer
     authentication feature when it is supported (complete with scary
     DWARF parser!)

  Tracing and debug:
   - Remove static ftrace in favour of, err, dynamic ftrace!
   - Seperate 'struct ftrace_regs' from 'struct pt_regs' in core ftrace
     and existing arch code
   - Introduce and implement FTRACE_WITH_ARGS on arm64 to replace the
     old FTRACE_WITH_REGS
   - Extend 'crashkernel=' parameter with default value and fallback to
     placement above 4G physical if initial (low) allocation fails

  SVE:
   - Optimisation to avoid disabling SVE unconditionally on syscall
     entry and just zeroing the non-shared state on return instead

  Exceptions:
   - Rework of undefined instruction handling to avoid serialisation on
     global lock (this includes emulation of user accesses to the ID
     registers)

  Perf and PMU:
   - Support for TLP filters in Hisilicon's PCIe PMU device
   - Support for the DDR PMU present in Amlogic Meson G12 SoCs
   - Support for the terribly-named "CoreSight PMU" architecture from
     Arm (and Nvidia's implementation of said architecture)

  Misc:
   - Tighten up our boot protocol for systems with memory above 52 bits
     physical
   - Const-ify static keys to satisty jump label asm constraints
   - Trivial FFA driver cleanups in preparation for v1.1 support
   - Export the kernel_neon_* APIs as GPL symbols
   - Harden our instruction generation routines against instrumentation
   - A bunch of robustness improvements to our arch-specific selftests
   - Minor cleanups and fixes all over (kbuild, kprobes, kfence, PMU, ...)"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (151 commits)
  arm64: kprobes: Return DBG_HOOK_ERROR if kprobes can not handle a BRK
  arm64: kprobes: Let arch do_page_fault() fix up page fault in user handler
  arm64: Prohibit instrumentation on arch_stack_walk()
  arm64:uprobe fix the uprobe SWBP_INSN in big-endian
  arm64: alternatives: add __init/__initconst to some functions/variables
  arm_pmu: Drop redundant armpmu->map_event() in armpmu_event_init()
  kselftest/arm64: Allow epoll_wait() to return more than one result
  kselftest/arm64: Don't drain output while spawning children
  kselftest/arm64: Hold fp-stress children until they're all spawned
  arm64/sysreg: Remove duplicate definitions from asm/sysreg.h
  arm64/sysreg: Convert ID_DFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_DFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_AFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_MMFR5_EL1 to automatic generation
  arm64/sysreg: Convert MVFR2_EL1 to automatic generation
  arm64/sysreg: Convert MVFR1_EL1 to automatic generation
  arm64/sysreg: Convert MVFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR2_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR0_EL1 to automatic generation
  ...
2022-12-12 09:50:05 -08:00
Linus Torvalds
98d0052d0d printk changes for 6.2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmORzikACgkQUqAMR0iA
 lPKF/g/7Bmcao3rJkZjEagsYY+s7rGhaFaSbML8FDdyE3UzeXLJOnNxBLrD0JIe9
 XFW7+DMqr2uRxsab5C7APy0mrIWp/zCGyJ8CmBILnrPDNcAQ27OhFzxv6WlMUmEc
 xEjGHrk5dFV96s63gyHGLkKGOZMd/cfcpy/QDOyg0vfF8EZCiPywWMbQQ2Ij8E50
 N6UL70ExkoLjT9tzb8NXQiaDqHxqNRvd15aIomDjRrce7eeaL4TaZIT7fKnEcULz
 0Lmdo8RUknonCI7Y00RWdVXMqqPD2JsKz3+fh0vBnXEN+aItwyxis/YajtN+m6l7
 jhPGt7hNhCKG17auK0/6XVJ3717QwjI3+xLXCvayA8jyewMK14PgzX70hCws0eXM
 +5M+IeXI4ze5qsq+ln9Dt8zfC+5HGmwXODUtaYTBWhB4nVWdL/CZ+nTv349zt+Uc
 VIi/QcPQ4vq6EfsxUZR2r6Y12+sSH40iLIROUfqSchtujbLo7qxSNF5x7x9+rtff
 nWuXo5OsjGE7TZDwn3kr0zSuJ+w/pkWMYQ7jch+A2WqUMYyGC86sL3At7ocL+Esq
 34uvzwEgWnNySV8cLiMh34kBmgBwhAP34RhV0RS9iCv8kev2DV7pLQTs9V3QAjw9
 EZnFDHATUdikgugaFKCeDV86R3wFgnRWWOdlRrRi6aAzFDqNcYk=
 =1PTZ
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Add NMI-safe SRCU reader API. It uses atomic_inc() instead of
   this_cpu_inc() on strong load-store architectures.

 - Introduce new console_list_lock to synchronize a manipulation of the
   list of registered consoles and their flags.

   This is a first step in removing the big-kernel-lock-like behavior of
   console_lock(). This semaphore still serializes console->write()
   calbacks against:

      - each other. It primary prevents potential races between early
        and proper console drivers using the same device.

      - suspend()/resume() callbacks and init() operations in some
        drivers.

      - various other operations in the tty/vt and framebufer
        susbsystems. It is likely that console_lock() serializes even
        operations that are not directly conflicting with the
        console->write() callbacks here. This is the most complicated
        big-kernel-lock aspect of the console_lock() that will be hard
        to untangle.

 - Introduce new console_srcu lock that is used to safely iterate and
   access the registered console drivers under SRCU read lock.

   This is a prerequisite for introducing atomic console drivers and
   console kthreads. It will reduce the complexity of serialization
   against normal consoles and console_lock(). Also it should remove the
   risk of deadlock during critical situations, like Oops or panic, when
   only atomic consoles are registered.

 - Check whether the console is registered instead of enabled on many
   locations. It was a historical leftover.

 - Cleanly force a preferred console in xenfb code instead of a dirty
   hack.

 - A lot of code and comment clean ups and improvements.

* tag 'printk-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (47 commits)
  printk: htmldocs: add missing description
  tty: serial: sh-sci: use setup() callback for early console
  printk: relieve console_lock of list synchronization duties
  tty: serial: kgdboc: use console_list_lock to trap exit
  tty: serial: kgdboc: synchronize tty_find_polling_driver() and register_console()
  tty: serial: kgdboc: use console_list_lock for list traversal
  tty: serial: kgdboc: use srcu console list iterator
  proc: consoles: use console_list_lock for list iteration
  tty: tty_io: use console_list_lock for list synchronization
  printk, xen: fbfront: create/use safe function for forcing preferred
  netconsole: avoid CON_ENABLED misuse to track registration
  usb: early: xhci-dbc: use console_is_registered()
  tty: serial: xilinx_uartps: use console_is_registered()
  tty: serial: samsung_tty: use console_is_registered()
  tty: serial: pic32_uart: use console_is_registered()
  tty: serial: earlycon: use console_is_registered()
  tty: hvc: use console_is_registered()
  efi: earlycon: use console_is_registered()
  tty: nfcon: use console_is_registered()
  serial_core: replace uart_console_enabled() with uart_console_registered()
  ...
2022-12-12 09:01:36 -08:00
Linus Torvalds
059c4a341d pstore updates for v6.2-rc1
- Reporting improvements and return path fixes (Guilherme G. Piccoli,
   Wang Yufen, Kees Cook).
 
 - Clean up kmsg_bytes module parameter usage (Guilherme G. Piccoli).
 
 - Add Guilherme to pstore MAINTAINERS entry.
 
 - Choose friendlier allocation flags (Qiujun Huang, Stephen Boyd).
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOOi3cWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJm8QD/901WcETCGFZlkWKsXLym8123rr
 Y87WifzKuI3cTf1oYTtG7zrYBTWMaFYEiPZBltcy0nEbLlUs0YtYukNlkykEt9S4
 CWmyxV7DDFn2sZ/HluPhKvsIZlzcHtW1o5dzxoJadRMN06pjnAFZOHkktpuVniVN
 0IXDOOTTEEBxh11BjbD7UrilnYR6BA9kXGKcZTd6Oo/GmO8EkpzXGnVxLRr6U1/i
 qwxhOZGgVzhFuCogQvOo1VQ0DcJ8l5u3h1UIS3b9vQD/oZlpe4brVGCoD5CGugwQ
 1IpqqiBsLrsXIBtqbtg02MMgSy1bELgyLgb5jHRClfuuEiwcxw1GvAy6JzS78Uye
 5g3eiKh3oVkF9/TojSVMAzD3ObAukH4hBo4y98Jy+X2PYvSzUn/WpW0itnxFIaou
 MqZZeYn2Xz7AMXQ5N3WF3fJLjscKoCT2D0WyyiNOqoWAaYSHeZcILXUGltT+Zjtz
 vyvEhLlzQ+avh6Tx0NOKrnIA91nemuW0TYjtGlKx4X8uBvEmt+cFaKd0oZ2M8grB
 l+B2iRxVMlIrMk63mzy+qISVzLN73XCdmhcpPw60Gqin7TyIOGJ6JvZ3viq9Col7
 os5ii4MZyoerDM0bsdmPQlUq8bn0DMDUV+4kGAiZwczPkB1oigxn37ksDHMNbwRu
 jrFtb+v5Vazmb5Lafg==
 =EsLr
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore updates from Kees Cook:
 "A small collection of bug fixes, refactorings, and general
  improvements:

   - Reporting improvements and return path fixes (Guilherme G. Piccoli,
     Wang Yufen, Kees Cook)

   - Clean up kmsg_bytes module parameter usage (Guilherme G. Piccoli)

   - Add Guilherme to pstore MAINTAINERS entry

   - Choose friendlier allocation flags (Qiujun Huang, Stephen Boyd)"

* tag 'pstore-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore: Avoid kcore oops by vmap()ing with VM_IOREMAP
  pstore/ram: Fix error return code in ramoops_probe()
  pstore: Alert on backend write error
  MAINTAINERS: Update pstore maintainers
  pstore/ram: Set freed addresses to NULL
  pstore/ram: Move internal definitions out of kernel-wide include
  pstore/ram: Move pmsg init earlier
  pstore/ram: Consolidate kfree() paths
  efi: pstore: Follow convention for the efi-pstore backend name
  pstore: Inform unregistered backend names as well
  pstore: Expose kmsg_bytes as a module parameter
  pstore: Improve error reporting in case of backend overlap
  pstore/zone: Use GFP_ATOMIC to allocate zone buffer
2022-12-12 08:31:13 -08:00
Ard Biesheuvel
e8dfdf3162 arm64: efi: Recover from synchronous exceptions occurring in firmware
Unlike x86, which has machinery to deal with page faults that occur
during the execution of EFI runtime services, arm64 has nothing like
that, and a synchronous exception raised by firmware code brings down
the whole system.

With more EFI based systems appearing that were not built to run Linux
(such as the Windows-on-ARM laptops based on Qualcomm SOCs), as well as
the introduction of PRM (platform specific firmware routines that are
callable just like EFI runtime services), we are more likely to run into
issues of this sort, and it is much more likely that we can identify and
work around such issues if they don't bring down the system entirely.

Since we already use a EFI runtime services call wrapper in assembler,
we can quite easily add some code that captures the execution state at
the point where the call is made, allowing us to revert to this state
and proceed execution if the call triggered a synchronous exception.

Given that the kernel and the firmware don't share any data structures
that could end up in an indeterminate state, we can happily continue
running, as long as we mark the EFI runtime services as unavailable from
that point on.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2022-12-08 18:33:34 +01:00
Ard Biesheuvel
a37dac5c5d arm64: efi: Limit allocations to 48-bit addressable physical region
The UEFI spec does not mention or reason about the configured size of
the virtual address space at all, but it does mention that all memory
should be identity mapped using a page size of 4 KiB.

This means that a LPA2 capable system that has any system memory outside
of the 48-bit addressable physical range and follows the spec to the
letter may serve page allocation requests from regions of memory that
the kernel cannot access unless it was built with LPA2 support and
enables it at runtime.

So let's ensure that all page allocations are limited to the 48-bit
range.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-12-07 19:50:44 +01:00
Ard Biesheuvel
29636a5ce8 efi: Put Linux specific magic number in the DOS header
GRUB currently relies on the magic number in the image header of ARM and
arm64 EFI kernel images to decide whether or not the image in question
is a bootable kernel.

However, the purpose of the magic number is to identify the image as one
that implements the bare metal boot protocol, and so GRUB, which only
does EFI boot, is limited unnecessarily to booting images that could
potentially be booted in a non-EFI manner as well.

This is problematic for the new zboot decompressor image format, as it
can only boot in EFI mode, and must therefore not use the bare metal
boot magic number in its header.

For this reason, the strict magic number was dropped from GRUB, to
permit essentially any kind of EFI executable to be booted via the
'linux' command, blurring the line between the linux loader and the
chainloader.

So let's use the same field in the DOS header that RISC-V and arm64
already use for their 'bare metal' magic numbers to store a 'generic
Linux kernel' magic number, which can be used to identify bootable
kernel images in PE format which don't necessarily implement a bare
metal boot protocol in the same binary. Note that, in the context of
EFI, the MS-DOS header is only described in terms of the fields that it
shares with the hybrid PE/COFF image format, (i.e., the MS-DOS EXE magic
number at offset #0 and the PE header offset at byte offset #0x3c).
Since we aim for compatibility with EFI only, and not with MS-DOS or
MS-Windows, we can use the remaining space in the MS-DOS header however
we want.

Let's set the generic magic number for x86 images as well: existing
bootloaders already have their own methods to identify x86 Linux images
that can be booted in a non-EFI manner, and having the magic number in
place there will ease any future transitions in loader implementations
to merge the x86 and non-x86 EFI boot paths.

Note that 32-bit ARM already uses the same location in the header for a
different purpose, but the ARM support is already widely implemented and
the EFI zboot decompressor is not available on ARM anyway, so we just
disregard it here.

Acked-by: Leif Lindholm <quic_llindhol@quicinc.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-12-05 09:31:46 +01:00
John Ogness
794c8e847d efi: earlycon: use console_is_registered()
The CON_ENABLED status of a console is a runtime setting that does not
involve the console driver. Drivers must not assume that if the console
is disabled then proper hardware management is not needed. For the EFI
earlycon case, it is about remapping/unmapping memory for the
framebuffer.

Use console_is_registered() instead of checking CON_ENABLED.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-25-john.ogness@linutronix.de
2022-12-02 11:25:01 +01:00
Ard Biesheuvel
e346bebbd3 efi: libstub: Always enable initrd command line loader and bump version
In preparation for setting a cross-architecture baseline for EFI boot
support, remove the Kconfig option that permits the command line initrd
loader to be disabled. Also, bump the minor version so that any image
built with the new version can be identified as supporting this.

Acked-by: Leif Lindholm <quic_llindhol@quicinc.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-12-01 16:08:18 +01:00
Jason A. Donenfeld
a89474aaf7 efi: stub: use random seed from EFI variable
EFI has a rather unique benefit that it has access to some limited
non-volatile storage, where the kernel can store a random seed. Read
that seed in EFISTUB and concatenate it with other seeds we wind up
passing onward to the kernel in the configuration table. This is
complementary to the current other two sources - previous bootloaders,
and the EFI RNG protocol.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
[ardb: check for non-NULL RNG protocol pointer, call GetVariable()
       without buffer first to obtain the size]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-12-01 09:51:21 +01:00
Ard Biesheuvel
4b52016247 x86/boot/compressed, efi: Merge multiple definitions of image_offset into one
There is no need for head_32.S and head_64.S both declaring a copy of
the global 'image_offset' variable, so drop those and make the extern C
declaration the definition.

When image_offset is moved to the .c file, it needs to be placed
particularly in the .data section because it lands by default in the
.bss section which is cleared too late, in .Lrelocated, before the first
access to it and thus garbage gets read, leading to SEV guests exploding
in early boot.

This happens only when the SEV guest kernel is loaded through grub. If
supplied with qemu's -kernel command line option, that memory is always
cleared upfront by qemu and all is fine there.

  [ bp: Expand commit message with SEV aspect. ]

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-8-ardb@kernel.org
2022-11-24 08:55:55 +01:00
Jason A. Donenfeld
e7b813b32a efi: random: refresh non-volatile random seed when RNG is initialized
EFI has a rather unique benefit that it has access to some limited
non-volatile storage, where the kernel can store a random seed. Register
a notification for when the RNG is initialized, and at that point, store
a new random seed.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-22 14:53:36 +01:00
Ard Biesheuvel
196dff2712 efi: random: combine bootloader provided RNG seed with RNG protocol output
Instead of blindly creating the EFI random seed configuration table if
the RNG protocol is implemented and works, check whether such a EFI
configuration table was provided by an earlier boot stage and if so,
concatenate the existing and the new seeds, leaving it up to the core
code to mix it in and credit it the way it sees fit.

This can be used for, e.g., systemd-boot, to pass an additional seed to
Linux in a way that can be consumed by the kernel very early. In that
case, the following definitions should be used to pass the seed to the
EFI stub:

struct linux_efi_random_seed {
      u32     size; // of the 'seed' array in bytes
      u8      seed[];
};

The memory for the struct must be allocated as EFI_ACPI_RECLAIM_MEMORY
pool memory, and the address of the struct in memory should be installed
as a EFI configuration table using the following GUID:

LINUX_EFI_RANDOM_SEED_TABLE_GUID        1ce1e5bc-7ceb-42f2-81e5-8aadf180f57b

Note that doing so is safe even on kernels that were built without this
patch applied, but the seed will simply be overwritten with a seed
derived from the EFI RNG protocol, if available. The recommended seed
size is 32 bytes, and seeds larger than 512 bytes are considered
corrupted and ignored entirely.

In order to preserve forward secrecy, seeds from previous bootloaders
are memzero'd out, and in order to preserve memory, those older seeds
are also freed from memory. Freeing from memory without first memzeroing
is not safe to do, as it's possible that nothing else will ever
overwrite those pages used by EFI.

Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
[ardb: incorporate Jason's followup changes to extend the maximum seed
       size on the consumer end, memzero() it and drop a needless printk]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 10:19:55 +01:00
Smita Koralahalli
2fb6999dd0 efi/cper, cxl: Decode CXL Error Log
Print the CXL Error Log field as found in CXL Protocol Error Section.

The CXL RAS Capability structure will be reused by OS First Handling
and the duplication/appropriate placement will be addressed eventually.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:10 +01:00
Smita Koralahalli
abdbf1a25d efi/cper, cxl: Decode CXL Protocol Error Section
Add support for decoding CXL Protocol Error Section as defined in UEFI 2.10
Section N.2.13.

Do the section decoding in a new cper_cxl.c file. This new file will be
used in the future for more CXL CPERs decode support. Add this to the
existing UEFI_CPER config.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:10 +01:00
Jialin Zhang
d981a88c16 efi: libstub: fix efi_load_initrd_dev_path() kernel-doc comment
commit f4dc7fffa9 ("efi: libstub: unify initrd loading between
architectures") merge the first and the second parameters into a
struct without updating the kernel-doc. Let's fix it.

Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:09 +01:00
Ard Biesheuvel
1fff234de2 efi: x86: Move EFI runtime map sysfs code to arch/x86
The EFI runtime map code is only wired up on x86, which is the only
architecture that has a need for it in its implementation of kexec.

So let's move this code under arch/x86 and drop all references to it
from generic code. To ensure that the efi_runtime_map_init() is invoked
at the appropriate time use a 'sync' subsys_initcall() that will be
called right after the EFI initcall made from generic code where the
original invocation of efi_runtime_map_init() resided.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Dave Young <dyoung@redhat.com>
2022-11-18 09:14:09 +01:00
Ard Biesheuvel
8dfac4d8ad efi: runtime-maps: Clarify purpose and enable by default for kexec
The current Kconfig logic for CONFIG_EFI_RUNTIME_MAPS does not convey
that without it, a kexec kernel is not able to boot in EFI mode at all.
So clarify this, and make the option only configurable via the menu
system if CONFIG_EXPERT is set.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Dave Young <dyoung@redhat.com>
2022-11-18 09:14:09 +01:00
Guilherme G. Piccoli
36d5786a1c efi: pstore: Add module parameter for setting the record size
By default, the efi-pstore backend hardcode the UEFI variable size
as 1024 bytes. The historical reasons for that were discussed by
Ard in threads [0][1]:

"there is some cargo cult from prehistoric EFI times going
on here, it seems. Or maybe just misinterpretation of the maximum
size for the variable *name* vs the variable itself.".

"OVMF has
OvmfPkg/OvmfPkgX64.dsc:
gEfiMdeModulePkgTokenSpaceGuid.PcdMaxVariableSize|0x2000
OvmfPkg/OvmfPkgX64.dsc:
gEfiMdeModulePkgTokenSpaceGuid.PcdMaxVariableSize|0x8400

where the first one is without secure boot and the second with secure
boot. Interestingly, the default is

gEfiMdeModulePkgTokenSpaceGuid.PcdMaxVariableSize|0x400

so this is probably where this 1k number comes from."

With that, and since there is not such a limit in the UEFI spec, we
have the confidence to hereby add a module parameter to enable advanced
users to change the UEFI record size for efi-pstore data collection,
this way allowing a much easier reading of the collected log, which
wouldn't be scattered anymore among many small files.

Through empirical analysis we observed that extreme low values (like 8
bytes) could eventually cause writing issues, so given that and the OVMF
default discussed, we limited the minimum value to 1024 bytes, which also
is still the default.

[0] https://lore.kernel.org/lkml/CAMj1kXF4UyRMh2Y_KakeNBHvkHhTtavASTAxXinDO1rhPe_wYg@mail.gmail.com/
[1] https://lore.kernel.org/lkml/CAMj1kXFy-2KddGu+dgebAdU9v2sindxVoiHLWuVhqYw+R=kqng@mail.gmail.com/

Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:09 +01:00
Ard Biesheuvel
d85e3e3494 efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures
Currently, the EFI_PARAVIRT flag is only used by Xen dom0 boot on x86,
even though other architectures also support pseudo-EFI boot, where the
core kernel is invoked directly and provided with a set of data tables
that resemble the ones constructed by the EFI stub, which never actually
runs in that case.

Let's fix this inconsistency, and always set this flag when booting dom0
via the EFI boot path. Note that Xen on x86 does not provide the EFI
memory map in this case, whereas other architectures do, so move the
associated EFI_PARAVIRT check into the x86 platform code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:09 +01:00
Ard Biesheuvel
fdc6d38d64 efi: memmap: Move manipulation routines into x86 arch tree
The EFI memory map is a description of the memory layout as provided by
the firmware, and only x86 manipulates it in various different ways for
its own memory bookkeeping. So let's move the memmap routines that are
only used by x86 into the x86 arch tree.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:09 +01:00
Ard Biesheuvel
4059ba656c efi: memmap: Move EFI fake memmap support into x86 arch tree
The EFI fake memmap support is specific to x86, which manipulates the
EFI memory map in various different ways after receiving it from the EFI
stub. On other architectures, we have managed to push back on this, and
the EFI memory map is kept pristine.

So let's move the fake memmap code into the x86 arch tree, where it
arguably belongs.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:09 +01:00
Ard Biesheuvel
75e1a2460d efi: libstub: Undeprecate the command line initrd loader
The initrd= command line loader can be useful for development, but it
was limited to loading files from the same file system as the loaded
kernel (and it didn't work on x86 mixed mode).

As both issues have been fixed, and the initrd= can now be used with
files residing on any simple file system exposed by the EFI firmware,
let's permit it to be enabled on RISC-V and LoongArch, which did not
support it up to this point.

Note that LoadFile2 remains the preferred option, as it is much simpler
to use and implement, but generic loaders (including the UEFI shell) may
not implement this so there, initrd= can now be used as well (if enabled
in the build)

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:08 +01:00
Ard Biesheuvel
f8a31244d7 efi: libstub: Add mixed mode support to command line initrd loader
Now that we have support for calling protocols that need additional
marshalling for mixed mode, wire up the initrd command line loader.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:08 +01:00
Ard Biesheuvel
a61962d8e7 efi: libstub: Permit mixed mode return types other than efi_status_t
Rework the EFI stub macro wrappers around protocol method calls and
other indirect calls in order to allow return types other than
efi_status_t. This means the widening should be conditional on whether
or not the return type is efi_status_t, and should be omitted otherwise.

Also, switch to _Generic() to implement the type based compile time
conditionals, which is more concise, and distinguishes between
efi_status_t and u64 properly.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:08 +01:00
Ard Biesheuvel
7091298554 efi: libstub: Implement devicepath support for initrd commandline loader
Currently, the initrd= command line option to the EFI stub only supports
loading files that reside on the same volume as the loaded image, which
is not workable for loaders like GRUB that don't even implement the
volume abstraction (EFI_SIMPLE_FILE_SYSTEM_PROTOCOL), and load the
kernel from an anonymous buffer in memory. For this reason, another
method was devised that relies on the LoadFile2 protocol.

However, the command line loader is rather useful when using the UEFI
shell or other generic loaders that have no awareness of Linux specific
protocols so let's make it a bit more flexible, by permitting textual
device paths to be provided to initrd= as well, provided that they refer
to a file hosted on a EFI_SIMPLE_FILE_SYSTEM_PROTOCOL volume. E.g.,

  initrd=PciRoot(0x0)/Pci(0x3,0x0)/HD(1,MBR,0xBE1AFDFA,0x3F,0xFBFC1)/rootfs.cpio.gz

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:08 +01:00
Ard Biesheuvel
9cf42bca30 efi: libstub: use EFI_LOADER_CODE region when moving the kernel in memory
The EFI spec is not very clear about which permissions are being given
when allocating pages of a certain type. However, it is quite obvious
that EFI_LOADER_CODE is more likely to permit execution than
EFI_LOADER_DATA, which becomes relevant once we permit booting the
kernel proper with the firmware's 1:1 mapping still active.

Ostensibly, recent systems such as the Surface Pro X grant executable
permissions to EFI_LOADER_CODE regions but not EFI_LOADER_DATA regions.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18 09:14:08 +01:00
Ard Biesheuvel
977122898e Merge tag 'efi-zboot-direct-for-v6.2' into efi/next 2022-11-18 09:13:57 +01:00
Ard Biesheuvel
550b33cfd4 arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines
Ampere Altra machines are reported to misbehave when the SetTime() EFI
runtime service is called after ExitBootServices() but before calling
SetVirtualAddressMap(). Given that the latter is horrid, pointless and
explicitly documented as optional by the EFI spec, we no longer invoke
it at boot if the configured size of the VA space guarantees that the
EFI runtime memory regions can remain mapped 1:1 like they are at boot
time.

On Ampere Altra machines, this results in SetTime() calls issued by the
rtc-efi driver triggering synchronous exceptions during boot.  We can
now recover from those without bringing down the system entirely, due to
commit 23715a26c8 ("arm64: efi: Recover from synchronous
exceptions occurring in firmware"). However, it would be better to avoid
the issue entirely, given that the firmware appears to remain in a funny
state after this.

So attempt to identify these machines based on the 'family' field in the
type #1 SMBIOS record, and call SetVirtualAddressMap() unconditionally
in that case.

Tested-by: Alexandru Elisei <alexandru.elisei@gmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-10 23:14:14 +01:00
Ard Biesheuvel
68c76ad4a9 arm64: unwind: add asynchronous unwind tables to kernel and modules
Enable asynchronous unwind table generation for both the core kernel as
well as modules, and emit the resulting .eh_frame sections as init code
so we can use the unwind directives for code patching at boot or module
load time.

This will be used by dynamic shadow call stack support, which will rely
on code patching rather than compiler codegen to emit the shadow call
stack push and pop instructions.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20221027155908.1940624-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09 18:06:35 +00:00
Ard Biesheuvel
c51e97e7f1 efi: libstub: Merge zboot decompressor with the ordinary stub
Even though our EFI zboot decompressor is pedantically spec compliant
and idiomatic for EFI image loaders, calling LoadImage() and
StartImage() for the nested image is a bit of a burden. Not only does it
create workflow issues for the distros (as both the inner and outer
PE/COFF images need to be signed for secure boot), it also copies the
image around in memory numerous times:
- first, the image is decompressed into a buffer;
- the buffer is consumed by LoadImage(), which copies the sections into
  a newly allocated memory region to hold the executable image;
- once the EFI stub is invoked by StartImage(), it will also move the
  image in memory in case of KASLR, mirrored memory or if the image must
  execute from a certain a priori defined address.

There are only two EFI spec compliant ways to load code into memory and
execute it:
- use LoadImage() and StartImage(),
- call ExitBootServices() and take ownership of the entire system, after
  which anything goes.

Given that the EFI zboot decompressor always invokes the EFI stub, and
given that both are built from the same set of objects, let's merge the
two, so that we can avoid LoadImage()/StartImage but still load our
image into memory without breaking the above rules.

This also means we can decompress the image directly into its final
location, which could be randomized or meet other platform specific
constraints that LoadImage() does not know how to adhere to. It also
means that, even if the encapsulated image still has the EFI stub
incorporated as well, it does not need to be signed for secure boot when
wrapping it in the EFI zboot decompressor.

In the future, we might decide to retire the EFI stub attached to the
decompressed image, but for the time being, they can happily coexist.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09 12:42:04 +01:00
Ard Biesheuvel
d729b554e1 efi/loongarch: libstub: Split off kernel image relocation for builtin stub
The LoongArch build of the EFI stub is part of the core kernel image, and
therefore accesses section markers directly when it needs to figure out
the size of the various section.

The zboot decompressor does not have access to those symbols, but
doesn't really need that either. So let's move handle_kernel_image()
into a separate file (or rather, move everything else into a separate
file) so that the zboot build does not pull in unused code that links to
symbols that it does not define.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09 12:42:03 +01:00
Ard Biesheuvel
0efb61c89f efi/loongarch: Don't jump to kernel entry via the old image
Currently, the EFI entry code for LoongArch is set up to copy the
executable image to the preferred offset, but instead of branching
directly into that image, it branches to the local copy of kernel_entry,
and relies on the logic in that function to switch to the link time
address instead.

This is a bit sloppy, and not something we can support once we merge the
EFI decompressor with the EFI stub. So let's clean this up a bit, by
adding a helper that computes the offset of kernel_entry from the start
of the image, and simply adding the result to VMLINUX_LOAD_ADDRESS.

And considering that we cannot execute from anywhere else anyway, let's
avoid efi_relocate_kernel() and just allocate the pages instead.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09 12:42:03 +01:00