Commit Graph

125 Commits

Author SHA1 Message Date
Matthew Auld
46f1f4b0f3 drm/xe: improve hibernation on igpu
The GGTT looks to be stored inside stolen memory on igpu which is not
treated as normal RAM.  The core kernel skips this memory range when
creating the hibernation image, therefore when coming back from
hibernation the GGTT programming is lost. This seems to cause issues
with broken resume where GuC FW fails to load:

[drm] *ERROR* GT0: load failed: status = 0x400000A0, time = 10ms, freq = 1250MHz (req 1300MHz), done = -1
[drm] *ERROR* GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
[drm] *ERROR* GT0: firmware signature verification failed
[drm] *ERROR* CRITICAL: Xe has declared device 0000:00:02.0 as wedged.

Current GGTT users are kernel internal and tracked as pinned, so it
should be possible to hook into the existing save/restore logic that we
use for dgpu, where the actual evict is skipped but on restore we
importantly restore the GGTT programming.  This has been confirmed to
fix hibernation on at least ADL and MTL, though likely all igpu
platforms are affected.

This also means we have a hole in our testing, where the existing s4
tests only really test the driver hooks, and don't go as far as actually
rebooting and restoring from the hibernation image and in turn powering
down RAM (and therefore losing the contents of stolen).

v2 (Brost)
 - Remove extra newline and drop unnecessary parentheses.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3275
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241101170156.213490-2-matthew.auld@intel.com
(cherry picked from commit f2a6b8e396)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-13 07:58:31 -08:00
Matthew Brost
dd886a63d6 drm/xe: Restore system memory GGTT mappings
GGTT mappings reside on the device and this state is lost during suspend
/ d3cold thus this state must be restored resume regardless if the BO is
in system memory or VRAM.

v2:
 - Unnecessary parentheses around bo->placements[0] (Checkpatch)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241031182257.2949579-1-matthew.brost@intel.com
(cherry picked from commit a19d1db9a3)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-13 07:58:22 -08:00
Zhanjun Dong
7257d9c9a3 drm/xe: Prevent null pointer access in xe_migrate_copy
xe_migrate_copy designed to copy content of TTM resources. When source
resource is null, it will trigger a NULL pointer dereference in
xe_migrate_copy. To avoid this situation, update lacks source flag to
true for this case, the flag will trigger xe_migrate_clear rather than
xe_migrate_copy.

Issue trace:
<7> [317.089847] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 14,
 sizes: 4194304 & 4194304
<7> [317.089945] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 15,
 sizes: 4194304 & 4194304
<1> [317.128055] BUG: kernel NULL pointer dereference, address:
 0000000000000010
<1> [317.128064] #PF: supervisor read access in kernel mode
<1> [317.128066] #PF: error_code(0x0000) - not-present page
<6> [317.128069] PGD 0 P4D 0
<4> [317.128071] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
<4> [317.128074] CPU: 1 UID: 0 PID: 1440 Comm: kunit_try_catch Tainted:
 G     U           N 6.11.0-rc7-xe #1
<4> [317.128078] Tainted: [U]=USER, [N]=TEST
<4> [317.128080] Hardware name: Intel Corporation Lunar Lake Client
 Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3221.D80.2407291239 07/29/2024
<4> [317.128082] RIP: 0010:xe_migrate_copy+0x66/0x13e0 [xe]
<4> [317.128158] Code: 00 00 48 89 8d e0 fe ff ff 48 8b 40 10 4c 89 85 c8
 fe ff ff 44 88 8d bd fe ff ff 65 48 8b 3c 25 28 00 00 00 48 89 7d d0 31
 ff <8b> 79 10 48 89 85 a0 fe ff ff 48 8b 00 48 89 b5 d8 fe ff ff 83 ff
<4> [317.128162] RSP: 0018:ffffc9000167f9f0 EFLAGS: 00010246
<4> [317.128164] RAX: ffff8881120d8028 RBX: ffff88814d070428 RCX:
 0000000000000000
<4> [317.128166] RDX: ffff88813cb99c00 RSI: 0000000004000000 RDI:
 0000000000000000
<4> [317.128168] RBP: ffffc9000167fbb8 R08: ffff88814e7b1f08 R09:
 0000000000000001
<4> [317.128170] R10: 0000000000000001 R11: 0000000000000001 R12:
 ffff88814e7b1f08
<4> [317.128172] R13: ffff88814e7b1f08 R14: ffff88813cb99c00 R15:
 0000000000000001
<4> [317.128174] FS:  0000000000000000(0000) GS:ffff88846f280000(0000)
 knlGS:0000000000000000
<4> [317.128176] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [317.128178] CR2: 0000000000000010 CR3: 000000011f676004 CR4:
 0000000000770ef0
<4> [317.128180] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
 0000000000000000
<4> [317.128182] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
 0000000000000400
<4> [317.128184] PKRU: 55555554
<4> [317.128185] Call Trace:
<4> [317.128187]  <TASK>
<4> [317.128189]  ? show_regs+0x67/0x70
<4> [317.128194]  ? __die_body+0x20/0x70
<4> [317.128196]  ? __die+0x2b/0x40
<4> [317.128198]  ? page_fault_oops+0x15f/0x4e0
<4> [317.128203]  ? do_user_addr_fault+0x3fb/0x970
<4> [317.128205]  ? lock_acquire+0xc7/0x2e0
<4> [317.128209]  ? exc_page_fault+0x87/0x2b0
<4> [317.128212]  ? asm_exc_page_fault+0x27/0x30
<4> [317.128216]  ? xe_migrate_copy+0x66/0x13e0 [xe]
<4> [317.128263]  ? __lock_acquire+0xb9d/0x26f0
<4> [317.128265]  ? __lock_acquire+0xb9d/0x26f0
<4> [317.128267]  ? sg_free_append_table+0x20/0x80
<4> [317.128271]  ? lock_acquire+0xc7/0x2e0
<4> [317.128273]  ? mark_held_locks+0x4d/0x80
<4> [317.128275]  ? trace_hardirqs_on+0x1e/0xd0
<4> [317.128278]  ? _raw_spin_unlock_irqrestore+0x31/0x60
<4> [317.128281]  ? __pm_runtime_resume+0x60/0xa0
<4> [317.128284]  xe_bo_move+0x682/0xc50 [xe]
<4> [317.128315]  ? lock_is_held_type+0xaa/0x120
<4> [317.128318]  ttm_bo_handle_move_mem+0xe5/0x1a0 [ttm]
<4> [317.128324]  ttm_bo_validate+0xd1/0x1a0 [ttm]
<4> [317.128328]  shrink_test_run_device+0x721/0xc10 [xe]
<4> [317.128360]  ? find_held_lock+0x31/0x90
<4> [317.128363]  ? lock_release+0xd1/0x2a0
<4> [317.128365]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
 [kunit]
<4> [317.128370]  xe_bo_shrink_kunit+0x11/0x20 [xe]
<4> [317.128397]  kunit_try_run_case+0x6e/0x150 [kunit]
<4> [317.128400]  ? trace_hardirqs_on+0x1e/0xd0
<4> [317.128402]  ? _raw_spin_unlock_irqrestore+0x31/0x60
<4> [317.128404]  kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
<4> [317.128407]  kthread+0xf5/0x130
<4> [317.128410]  ? __pfx_kthread+0x10/0x10
<4> [317.128412]  ret_from_fork+0x39/0x60
<4> [317.128415]  ? __pfx_kthread+0x10/0x10
<4> [317.128416]  ret_from_fork_asm+0x1a/0x30
<4> [317.128420]  </TASK>

Fixes: 266c858852 ("drm/xe/xe2: Handle flat ccs move for igfx.")
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240927161308.862323-2-zhanjun.dong@intel.com
(cherry picked from commit 59a1c9c7e1)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-03 01:24:22 -05:00
Matthew Auld
ddc73c4656 drm/xe/bo: add some annotations in bo_put()
If the put() triggers bo destroy then there is at least one potential
sleeping lock. Also annotate bos_lock and ggtt lock.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240911155527.178910-8-matthew.auld@intel.com
(cherry picked from commit 3b04c2cfd7)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-09-17 23:31:06 -05:00
Dave Airlie
2ef8d63da8 Cross-subsystem Changes:
- Split dma fence array creation into alloc and arm (Matthew Brost)
 
 Driver Changes:
 - Move kernel_lrc to execlist backend (Ilia)
 - Fix type width for pcode coommand (Karthik)
 - Make xe_drm.h include unambiguous (Jani)
 - Fixes and debug improvements for GSC load (Daniele)
 - Track resources and VF state by PF (Michal Wajdeczko)
 - Fix memory leak on error path (Nirmoy)
 - Cleanup header includes (Matt Roper)
 - Move pcode logic to tile scope (Matt Roper)
 - Move hwmon logic to device scope (Matt Roper)
 - Fix media TLB invalidation (Matthew Brost)
 - Threshold config fixes for PF (Michal Wajdeczko)
 - Remove extra "[drm]" from logs (Michal Wajdeczko)
 - Add missing runtime ref (Rodrigo Vivi)
 - Fix circular locking on runtime suspend (Rodrigo Vivi)
 - Fix rpm in TTM swapout path (Thomas)
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmbaZs8ZHGx1Y2FzLmRl
 bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU8cqEACG6sxanezneROhqqTW/kaj
 xEoOkPbVOwP7NBz08VDKwDvGtPjUNHclLlx2AaQkz/uJhThEmW8h8tgjEbvRy6kG
 4VpS5eglu4VZp1ZoEIJKUWeEroz/crJ49NgWfx9baUFW2AUp7iW1RDLW0TRjdb+B
 3++T/J15a+Wa7l5kJLYd/z8O3K9tQ6M4O1MJjZSoijrXv1C0MzgKcJemfU2HQYZD
 3Yz3/7FO4MFJvTCJX+zFFzqaRP+hSPisxknRAa0ImOnbEMdczdNLD74ebGLFwIri
 uP/1d/wVZ84nqQXndZF3D5+zmK0JOAL55yabZVYju0/KoPXb+GmnuiQXztPwen27
 EAnQvHueD1rwXIgKUh8neMh492RejHQhbMIZf1eF1abKEOif0NWjpCdMHxkEfHqW
 76v03Q4fdBIW7uK9PNNcI5hQ//7SbUfdXatwfUmJ61Rkryu3eYXtMA0PtZUIfqRL
 57+TIurQyf5ivRjgnoag0KhAhN7eP7YqN+Q70E8QGKy6Fql79AssTgLWCWRVAW8L
 +kDT3waNaZeDQgB9usZcyXKSrwb4rYGIqmC3ZDKya6XPmmOyTtfywe34UhVPK/FC
 qKg1TpEinEmCftm3WZz1rXzFwe4K4lRs2PFmc/3oHmDp83Qx5Rsu0iosqH6nhi9l
 az2p4dnhRUGMZGtN5EflAQ==
 =rTLT
 -----END PGP SIGNATURE-----

Merge tag 'drm-xe-next-2024-09-05' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Cross-subsystem Changes:
- Split dma fence array creation into alloc and arm (Matthew Brost)

Driver Changes:
- Move kernel_lrc to execlist backend (Ilia)
- Fix type width for pcode coommand (Karthik)
- Make xe_drm.h include unambiguous (Jani)
- Fixes and debug improvements for GSC load (Daniele)
- Track resources and VF state by PF (Michal Wajdeczko)
- Fix memory leak on error path (Nirmoy)
- Cleanup header includes (Matt Roper)
- Move pcode logic to tile scope (Matt Roper)
- Move hwmon logic to device scope (Matt Roper)
- Fix media TLB invalidation (Matthew Brost)
- Threshold config fixes for PF (Michal Wajdeczko)
- Remove extra "[drm]" from logs (Michal Wajdeczko)
- Add missing runtime ref (Rodrigo Vivi)
- Fix circular locking on runtime suspend (Rodrigo Vivi)
- Fix rpm in TTM swapout path (Thomas)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/eirx5vdvoflbbqlrzi5cip6bpu3zjojm2pxseufu3rlq4pp6xv@eytjvhizfyu6
2024-09-10 13:18:00 +10:00
Thomas Hellström
34bb7b813a drm/xe: Use xe_pm_runtime_get in xe_bo_move() if reclaim-safe.
xe_bo_move() might be called in the TTM swapout path from validation
by another TTM device. If so, we are not likely to have a RPM
reference. So iff xe_pm_runtime_get() is safe to call from reclaim,
use it instead of xe_pm_runtime_get_noresume().

Strictly this is currently needed only if handle_system_ccs is true,
but use xe_pm_runtime_get() if possible anyway to increase test
coverage.

At the same time warn if handle_system_ccs is true and we can't
call xe_pm_runtime_get() from reclaim context. This will likely trip
if someone tries to enable SRIOV on LNL, without fixing Xe SRIOV
runtime resume / suspend.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240903094232.166342-1-thomas.hellstrom@linux.intel.com
2024-09-04 09:28:09 +02:00
Dave Airlie
6d0ebb3904 Cross-driver (xe-core) Changes:
- Require BMG scanout buffers to be 64k physically aligned (Maarten)
 
 Core (drm) Changes:
 - Introducing Xe2 ccs modifiers for integrated and discrete graphics (Juha-Pekka)
 
 Driver Changes:
 - General cleanup and more work moving towards intel_display isolation (Jani)
 - New display workaround (Suraj)
 - Use correct cp_irq_count on HDCP (Suraj)
 - eDP PSR fix when CRC is enabled (Jouni)
 - Fix DP MST state after a sink reset (Imre)
 - Fix Arrow Lake GSC firmware version (John)
 - Use chained DSBs for LUT programming (Ville)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmbQf+AACgkQ+mJfZA7r
 E8pKQQf+K6n6lMhtpfNUYkFS6IFkNDMuWZV6R262AWNFIJkBumfiqb+LLsfoxfb8
 jcrYHKw9cPsx2BWg645KbImhT1L6iISfY0OKkUmDLCyUUV8uhbbnKScF8OK01HVW
 XWeZo9LB9gM8SmVJMljHsko+4JCiOq8+99Jm/GjVCZmW69tQ2fFfjeF3Nh9v32uI
 +DcGlFP33Htp7oynOJxyfqGK1RfKjkJSZEQKODNHKJ7Q+2LRkA85xs3yfHkUkBX8
 AjNI82r1SqAodhCgw8k19yeawvsAx8eu0h6Cal5UQzHlnAYEjfh+flbIDJv7YAZX
 Nh5lpn71LBBGCytCI/MvnsmUtcor7A==
 =tLyy
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2024-08-29' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

Cross-driver (xe-core) Changes:
- Require BMG scanout buffers to be 64k physically aligned (Maarten)

Core (drm) Changes:
- Introducing Xe2 ccs modifiers for integrated and discrete graphics (Juha-Pekka)

Driver Changes:
- General cleanup and more work moving towards intel_display isolation (Jani)
- New display workaround (Suraj)
- Use correct cp_irq_count on HDCP (Suraj)
- eDP PSR fix when CRC is enabled (Jouni)
- Fix DP MST state after a sink reset (Imre)
- Fix Arrow Lake GSC firmware version (John)
- Use chained DSBs for LUT programming (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZtCC0lJ0Zf3MoSdW@intel.com
2024-08-30 13:41:32 +10:00
Jani Nikula
87d8ecf015
drm/xe: replace #include <drm/xe_drm.h> with <uapi/drm/xe_drm.h>
include/drm/xe_drm.h does not exist. Prefer the explicit uapi include.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240827091539.4136838-1-jani.nikula@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-28 15:17:54 -04:00
Nirmoy Das
7546a8201b Revert "drm/xe/lnl: Offload system clear page activity to GPU"
This optimization relied on having to clear CCS on allocations.
If there is no need to clear CCS on allocations then this would mostly
help in reducing CPU utilization.

Revert this patch at this moment because of:
1 Currently Xe can't do clear on free and using a invalid ttm flag,
TTM_TT_FLAG_CLEARED_ON_FREE which could poison global ttm pool on
multi-device setup.

2 Also for LNL CPU:WB doesn't require clearing CCS as such BO will
not be allowed to bind with compression PTE. Subsequent patch will
disable clearing CCS for CPU:WB BOs for LNL.

This reverts commit 2368306180.

Cc: Christian König <christian.koenig@amd.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240828083635.23601-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-08-28 06:45:52 -07:00
Maarten Lankhorst
c66f4711f7
drm/xe: Align all VRAM scanout buffers to 64k physical pages when needed.
For CCS formats on affected platforms, CCS can be used freely, but
display engine requires a multiple of 64k physical pages. No other
changes are needed.

At the BO creation time we don't know if the BO will be used for CCS
or not. If the scanout flag is set, and the BO is a multiple of 64k,
we take the safe route and force the physical alignment of 64k pages.

If the BO is not a multiple of 64k, or the scanout flag was not set
at BO creation, we reject it for usage as CCS in display. The physical
pages are likely not aligned correctly, and this will cause corruption
when used as FB.

The scanout flag and size being a multiple of 64k are used together
to enforce 64k physical placement.

VM_BIND is completely unaffected, mappings to a VM can still be aligned
to 4k, just like for normal buffers.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240826170117.327709-3-maarten.lankhorst@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-27 18:15:20 -04:00
Daniel Vetter
4461e9e5c3 Linux 6.11-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmbK2B8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGFwkH/10QpUgzIfbFKbF+
 5hwcvaqS5myxWwJ4PjN0eR1qGE6RzVO0Tb24+TVql+7pxu+iWm1kYgC3+/T5xJsP
 ECAszdmPWSco1xaHrh2y3PyCJjaBiqFbIxdjPp7odjDpG9qarbcty8YpWs44u/gd
 RDXzHUuScEShBhEt0ZhvE1pIDL8jJ8JL3yqOMZ+XaDxtJbjaHw4GHp8efxlBWc8N
 jZKIVJi22q5NWG5T0tGtPWwzCm0ewA/JNMTEvE9leoSoAgO85NZ0ivxMC76q/tbj
 BrYk5KnzfhJs4b/n/KtIwWaLTgLyXKGqHMaMq8sbXtp410aUdgnRJO2cl3fI+1vc
 vxQfAfk=
 =RemI
 -----END PGP SIGNATURE-----

Merge v6.11-rc5 into drm-next

amdgpu pr conconflicts due to patches cherry-picked to -fixes, I might
as well catch up with a backmerge and handle them all. Plus both misc
and intel maintainers asked for a backmerge anyway.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-08-27 14:09:45 +02:00
Rodrigo Vivi
34e804220f
drm/xe: Make xe_ggtt_node struct independent
In some rare cases, the drm_mm node cannot be removed synchronously
due to runtime PM conditions. In this situation, the node removal will
be delegated to a workqueue that will be able to wake up the device
before removing the node.

However, in this situation, the lifetime of the xe_ggtt_node cannot
be restricted to the lifetime of the parent object. So, this patch
introduces the infrastructure so the xe_ggtt_node struct can be
allocated in advance and freed when needed.

By having the ggtt backpointer, it also ensure that the init function
is always called before any attempt to insert or reserve the node
in the GGTT.

v2: s/xe_ggtt_node_force_fini/xe_ggtt_node_fini and use it
    internaly (Brost)
v3: - Use GF_NOFS for node allocation (CI)
    - Avoid ggtt argument, now that we have it inside the node (Lucas)
    - Fix some missed fini cases (CI)
v4: - Fix SRIOV critical case where config->ggtt_region was
      lost (Michal)
    - Avoid ggtt argument also on removal (missed case on v3) (Michal)
    - Remove useless checks (Michal)
    - Return 0 instead of negative errno on a u32 addr. (Michal)
    - s/xe_ggtt_assign/xe_ggtt_node_assign for coherence, while we
      are touching it (Michal)
v5: - Fix VFs' ggtt_balloon

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-11-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-22 14:00:45 -04:00
Rodrigo Vivi
6062ea9398
drm/xe: Encapsulate drm_mm_node inside xe_ggtt_node
The xe_ggtt component uses drm_mm to manage the GGTT.
The drm_mm_node is just a node inside drm_mm, but in Xe we use that
only in the GGTT context. So, this patch encapsulates the drm_mm_node
into a xe_ggtt's new struct.

This is the first step towards limiting all the drm_mm access
through xe_ggtt. The ultimate goal is to have a better control of
the node insertion and removal, so the removal can be delegated
to a delayed workqueue.

v2: Fix includes and typos (Michal and Brost)

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-5-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-22 14:00:44 -04:00
Daniele Ceraolo Spurio
8636a5c29b
drm/xe: use devm instead of drmm for managed bo
The BO cleanup touches the GGTT and therefore requires the HW to be
available, so we need to use devm instead of drmm.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1160
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809231237.1503796-2-daniele.ceraolospurio@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 8d3a2d3d76)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-19 13:37:50 -04:00
Nirmoy Das
2368306180 drm/xe/lnl: Offload system clear page activity to GPU
On LNL because of flat CCS, driver creates migrates job to clear
CCS meta data. Extend that to also clear system pages using GPU.
Inform TTM to allocate pages without __GFP_ZERO to avoid double page
clearing by clearing out TTM_TT_FLAG_ZERO_ALLOC flag and set
TTM_TT_FLAG_CLEARED_ON_FREE while freeing to skip ttm pool's clear
on free as XE now takes care of clearing pages. If a bo is in system
placement such as BO created with  DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING
and there is a cpu map then for such BO gpu clear will be avoided as
there is no dma mapping for such BO at that moment to create migration
jobs.

Tested this patch api_overhead_benchmark_l0 from
https://github.com/intel/compute-benchmarks

Without the patch:
api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation:
UsmMemoryAllocation(api=l0 type=Host size=4KB) 84.206 us
UsmMemoryAllocation(api=l0 type=Host size=1GB) 105775.56 us
erf tool top 5 entries:
71.44% api_overhead_be  [kernel.kallsyms]   [k] clear_page_erms
6.34%  api_overhead_be  [kernel.kallsyms]   [k] __pageblock_pfn_to_page
2.24%  api_overhead_be  [kernel.kallsyms]   [k] cpa_flush
2.15%  api_overhead_be  [kernel.kallsyms]   [k] pages_are_mergeable
1.94%  api_overhead_be  [kernel.kallsyms]   [k] find_next_iomem_res

With the patch:
api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation:
UsmMemoryAllocation(api=l0 type=Host size=4KB) 79.439 us
UsmMemoryAllocation(api=l0 type=Host size=1GB) 98677.75 us
Perf tool top 5 entries:
11.16% api_overhead_be  [kernel.kallsyms]   [k] __pageblock_pfn_to_page
7.85%  api_overhead_be  [kernel.kallsyms]   [k] cpa_flush
7.59%  api_overhead_be  [kernel.kallsyms]   [k] find_next_iomem_res
7.24%  api_overhead_be  [kernel.kallsyms]   [k] pages_are_mergeable
5.53%  api_overhead_be  [kernel.kallsyms]   [k] lookup_address_in_pgd_attr

Without this patch clear_page_erms() dominates execution time which is
also not pipelined with migration jobs. With this patch page clearing
will get pipelined with migration job and will free CPU for more work.

v2: Handle regression on dgfx(Himal)
    Update commit message as no ttm API changes needed.
v3: Fix Kunit test.
v4: handle data leak on cpu mmap(Thomas)
v5: s/gpu_page_clear/gpu_page_clear_sys and move setting
    it to xe_ttm_sys_mgr_init() and other nits (Matt Auld)
v6: Disable it when init_on_alloc and/or init_on_free is active(Matt)
    Use compute-benchmarks as reporter used it to report this
    allocation latency issue also a proper test application than mime.
    In v5, the test showed significant reduction in alloc latency but
    that is not the case any more, I think this was mostly because
    previous test was done on IFWI which had low mem BW from CPU.

Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816135154.19678-2-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-08-19 17:49:00 +02:00
Nirmoy Das
6b77dab5da drm/xe: Remove redundant param from xe_bo_create_user
BO from xe_bo_create_user() will always be of type,
ttm_bo_type_device. So remove that redundant parameter.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816102248.25628-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-08-19 09:38:16 +02:00
Daniele Ceraolo Spurio
8d3a2d3d76 drm/xe: use devm instead of drmm for managed bo
The BO cleanup touches the GGTT and therefore requires the HW to be
available, so we need to use devm instead of drmm.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1160
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809231237.1503796-2-daniele.ceraolospurio@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-08-15 15:08:00 -07:00
Nirmoy Das
b8cdc47adf drm/xe/migrate: Parameterize ccs and bo data clear in xe_migrate_clear()
Parameterize clearing ccs and bo data in xe_migrate_clear() which  higher
layers can utilize. This patch will be used later on when doing bo data
clear for igfx as well.

v2: Replace multiple params with flags in xe_migrate_clear (Matt B)
v3: s/CLEAR_BO_DATA_FLAG_*/XE_MIGRATE_CLEAR_FLAG_* and move to
    xe_migrate.h. other nits(Matt B)

Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809220347.25330-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-08-13 10:11:07 +02:00
Michal Wajdeczko
25ec7e809c drm/xe: Add NEEDS_2M BO flag
In addition of NEEDS_64K BO flag, add similar one to force 2 MiB
alignment of the buffer objects. Explicitly use this flag during
VF LMEM provisioning as LMTT uses 2 MiB pages and one day we may
drop requirement of allocating pinned objects as contiguous.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240715180538.1418-3-michal.wajdeczko@intel.com
2024-07-22 12:53:06 +02:00
Michal Wajdeczko
9790bbe3ba drm/xe: Normalize NEEDS_64K BO flag
In commit 62742d1266 ("drm/xe: Normalize bo flags macros"),
we normalized all BO flags but XE_BO_NEEDS_64K. Do it now.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240715180538.1418-2-michal.wajdeczko@intel.com
2024-07-22 12:53:04 +02:00
Linus Torvalds
b3ce7a3084 drm next for 6.11-rc1:
core:
 - deprecate DRM data and return 0 date
 - connector: Create a set of helpers to help with HDMI support
 - Remove driver owner assignments
 - Allow more drivers to compile with COMPILE_TEST
 - Conversions to drm_edid
 - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
 - Remove drm_mm_replace_node
 - print: Add a drm prefix to warn level messages too, remove
          ___drm_dbg, consolidate prefix handling
 - New monochrome TV mode variant
 
 ttm:
 - improve number of page faults on some platforms
 - fix test builds under PREEMPT_RT
 - more test coverage
 
 ci:
 - Require a more recent version of mesa,
 - improve farm setup and test generation
 
 dma-buf:
 - warn if reserving 0 fence slots
 - internal API heap enhancements
 
 fbdev:
 - Create memory manager optimized fbdev emulation
 
 panic:
 - Allow to select fonts,
 - improve drm_fb_dma_get_scanout_buffer
 - Allow to dump kmsg to the screen
 
 bridge:
 - Remove redundant checks on bridge->encoder
 - Remove drm_bridge_chain_mode_fixup
 - bridge-connector: Plumb in the new HDMI helper
 - analogix_dp: Various improvements, handle AUX transfers timeout
 - samsung-dsim: Fix timings calculation
 - tc358767: Plenty of small fixes, fix no connector attach, fix clocks
 - sii902x: state validation improvements
 
 panels:
 - Switch panels from register table initialization to proper code
 - Now that the panel code tracks the panel state, remove every
   ad-hoc implementation in the panel drivers
 - More cleanup of prepare / enable state tracking in drivers
 - edp: Drop legacy panel compatibles
 - simple-bridge: Switch to devm_drm_bridge_add
 - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
   13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE
   nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView PM070WL4,
   Lincoln Technologies LCD197, Ortustech COM35H3P70ULC,
   AUO G104STN01, K&d kd101ne3-40ti
 
 amdgpu:
 - DCN 4.0.x support
 - GC 12.0 support
 - GMC 12.0 support
 - SDMA 7.0 support
 - MES12 support
 - MMHUB 4.1 support
 - GFX12 modifier and DCC support
 - lots of IP fixes/updates
 
 amdkfd:
 - Contiguous VRAM allocations
 - GC 12.0 support
 - SDMA 7.0 support
 - SR-IOV fixes
 - KFD GFX ALU exceptions
 
 i915:
 - Battlemage Xe2 HPD display enablement
 - Panel Replay enabling
 - DP AUX-less ALPM/LOBF
 - Enable link training failure fallback for DP MST links
 - CMRR (Content Match Refresh Rate) enabling
 - Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
 - Enable eDP AUX based HDR backlight
 - Support replaying GPU hangs with captured context image
 - Automate CCS Mode setting during engine resets
 - lots of refactoring
 - Support replaying GPU hangs with captured context image
 - Increase FLR timeout from 3s to 9s
 - Enable w/a 16021333562 for DG2, MTL and ARL [guc]
 
 xe:
 - update MAINATINERS
 - New uapi adding OA functionality to Xe
 - expose l3 bank mask
 - fix display detect on ADL-N
 - runtime PM Fixes
 - Fix silent backmerge issues
 - More prep for SR-IOV
 - HWmon additions
 - per client usage info
 - Rework GPU page fault handling
 - Drop EXEC_QUEUE_FLAG_BANNED
 - Add BMG PCI IDs
 - Scheduler fixes and improvements
 - Rename xe_exec_queue::compute to xe_exec_queue::lr
 - Use ttm_uncached for BO with NEEDS_UC flag
 - Rename xe perf layer as xe observation layer
 - lots of refactoring
 
 radeon:
 - Backlight workaround for iMac
 - Silence UBSAN flex array warnings
 
 msm:
 - Validate registers XML description against schema in CI
 - core/dpu: SM7150 support
 - mdp5: Add support for MSM8937
 - gpu: Add param for userspace to know if raytracing is supported
 - gpu: X185 support (aka gpu in X1 laptop chips)
 - gpu: a505 support
 
 ivpu:
 - hardware scheduler support
 - profiling support
 - improvements to the platform support layer
 - firmware handling improvements
 - clocks/power mgmt improvements
 - scheduler/logging improvements
 
 habanalabs:
 - Gradual sleep in polling memory macro.
 - Reduce Gaudi2 MSI-X interrupt count to 128.
 - Add Gaudi2-D revision support.
 - Add timestamp to CPLD info.
 - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error.
 - Align Gaudi2 interrupt names.
 - Check for errors after preboot is ready.
 - Change habanalabs maintainer and git repo path.
 
 mgag200:
 - refactoring and improvements
 - Add BMC output
 - enable polling
 
 nouveau:
 - add registry command line
 
 v3d:
 - perf counters improvements
 
 zynqmp:
 - irq and debugfs improvements
 
 atmel-hlcdc:
 - Support XLCDC in sam9x7
 
 mipi-dbi:
 - Remove mipi_dbi_machine_little_endian
 - make SPI bits per word configurable
 - support RGB888
 - allow pixel formats to be specified in the DT
 
 sun4i:
 - Rework the blender setup for DE2
 
 panfrost:
 - Enable MT8188 support
 
 vc4:
 - Monochrome TV support
 
 exynos:
 - fix fallback mode regression
 - fix memory leak
 - Use drm_edid_duplicate() instead of kmemdup()
 
 etnaviv:
 - fix i.MX8MP NPU clock gating
 - workaround FE register cdc issues on some cores
 - fix DMA sync handling for cached buffers
 - fix job timeout handling
 - keep TS enabled on MMUv2 cores for improved performance
 
 mediatek:
 - Convert to platform remove callback returning void-
 - Drop chain_mode_fixup call in mode_valid()
 - Fixes the errors of MediaTek display driver found by IGT.
 - Add display support for the MT8365-EVK board
 - Fix bit depth overwritten for mtk_ovl_set bit_depth()
 - Fix possible_crtcs calculation
 - Fix spurious kfree()
 
 ast:
 - refactor mode setting code
 
 stm:
 - Add LVDS support
 - DSI PHY updates
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmaYqVEACgkQDHTzWXnE
 hr5p3Q/+OOxTHKJ/8WMwfV1Tuep5otkCZdBgNdcuu9zqzpEMEDUDwmV1iboIvT9x
 qJsDwSAJomwbZAnVjDKsbZuycSHUBV6HQdf+5+rtq6be1EfFRwJVzOq0u5+D3KGt
 7f2vy6sM9tw4tR6EikiuP7vCvnSz4iGrWERvEJDEtXECbALhju8sulht8ZMnr6GW
 /MfUetULLSDjq0L1x3TWAq2MPGnJ5UxIkIeOBUP6n4etAUX1BPTNA6N76eN/xMvn
 a40JhtM+pCjjkHxvloIZ+KTYN3S+hskIRksczPHh9HtNX7y/A437wyhOHJZ1NvZb
 yc5ke9GjXxGcxyZH+PY5aCS7O/XElzSSkR1jFZ2s3/MX7PVKgCahGK7+yWjPsiK2
 R5oXebdObshUa8LHDE/3WgBUmTchkvKRTXV9cvGqzxEPhC2zrxArvwP5v6B4mhCn
 Vqo3Pv0Cyr+n65Z5Dzqz/9+m999LJjFTsTrug0p5b/qBJQKu2rQONe4lpZ0NFwwY
 ExyjdxILj7mqrQpKcA6V5Bel5ZCnlVsGfTshFL6Iux54VFlJyRMzKWZ+Gdv4av5k
 dbjz+re+CojKabn3ML/7pAQujK6Rqe58vPuHV78zkvAGJnQgJOOTrmYNYtn3oBqe
 ogdCN+/PREb/9U7i6mQv5hhdHs4tT9ROXaT9jyb8XSHXW+t9lBM=
 =g+Ad
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "There's a lot of stuff in here, amd, i915 and xe have new platform
  work, lots of core rework around EDID handling, some new COMPILE_TEST
  options, maintainer changes and a lots of other stuff. Summary:

  core:
   - deprecate DRM data and return 0 date
   - connector: Create a set of helpers to help with HDMI support
   - Remove driver owner assignments
   - Allow more drivers to compile with COMPILE_TEST
   - Conversions to drm_edid
   - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
   - Remove drm_mm_replace_node
   - print: Add a drm prefix to warn level messages too, remove
            ___drm_dbg, consolidate prefix handling
   - New monochrome TV mode variant

  ttm:
   - improve number of page faults on some platforms
   - fix test builds under PREEMPT_RT
   - more test coverage

  ci:
   - Require a more recent version of mesa
   - improve farm setup and test generation

  dma-buf:
   - warn if reserving 0 fence slots
   - internal API heap enhancements

  fbdev:
   - Create memory manager optimized fbdev emulation

  panic:
   - Allow to select fonts
   - improve drm_fb_dma_get_scanout_buffer
   - Allow to dump kmsg to the screen

  bridge:
   - Remove redundant checks on bridge->encoder
   - Remove drm_bridge_chain_mode_fixup
   - bridge-connector: Plumb in the new HDMI helper
   - analogix_dp: Various improvements, handle AUX transfers timeout
   - samsung-dsim: Fix timings calculation
   - tc358767: Plenty of small fixes, fix no connector attach, fix
               clocks
   - sii902x: state validation improvements

  panels:
   - Switch panels from register table initialization to proper code
   - Now that the panel code tracks the panel state, remove every ad-hoc
     implementation in the panel drivers
   - More cleanup of prepare / enable state tracking in drivers
   - edp: Drop legacy panel compatibles
   - simple-bridge: Switch to devm_drm_bridge_add
   - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
                 13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0,
                 BOE nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView
                 PM070WL4, Lincoln Technologies LCD197, Ortustech
                 COM35H3P70ULC, AUO G104STN01, K&d kd101ne3-40ti

  amdgpu:
   - DCN 4.0.x support
   - GC 12.0 support
   - GMC 12.0 support
   - SDMA 7.0 support
   - MES12 support
   - MMHUB 4.1 support
   - GFX12 modifier and DCC support
   - lots of IP fixes/updates

  amdkfd:
   - Contiguous VRAM allocations
   - GC 12.0 support
   - SDMA 7.0 support
   - SR-IOV fixes
   - KFD GFX ALU exceptions

  i915:
   - Battlemage Xe2 HPD display enablement
   - Panel Replay enabling
   - DP AUX-less ALPM/LOBF
   - Enable link training failure fallback for DP MST links
   - CMRR (Content Match Refresh Rate) enabling
   - Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
   - Enable eDP AUX based HDR backlight
   - Support replaying GPU hangs with captured context image
   - Automate CCS Mode setting during engine resets
   - lots of refactoring
   - Support replaying GPU hangs with captured context image
   - Increase FLR timeout from 3s to 9s
   - Enable w/a 16021333562 for DG2, MTL and ARL [guc]

  xe:
   - update MAINATINERS
   - New uapi adding OA functionality to Xe
   - expose l3 bank mask
   - fix display detect on ADL-N
   - runtime PM Fixes
   - Fix silent backmerge issues
   - More prep for SR-IOV
   - HWmon additions
   - per client usage info
   - Rework GPU page fault handling
   - Drop EXEC_QUEUE_FLAG_BANNED
   - Add BMG PCI IDs
   - Scheduler fixes and improvements
   - Rename xe_exec_queue::compute to xe_exec_queue::lr
   - Use ttm_uncached for BO with NEEDS_UC flag
   - Rename xe perf layer as xe observation layer
   - lots of refactoring

  radeon:
   - Backlight workaround for iMac
   - Silence UBSAN flex array warnings

  msm:
   - Validate registers XML description against schema in CI
   - core/dpu: SM7150 support
   - mdp5: Add support for MSM8937
   - gpu: Add param for userspace to know if raytracing is supported
   - gpu: X185 support (aka gpu in X1 laptop chips)
   - gpu: a505 support

  ivpu:
   - hardware scheduler support
   - profiling support
   - improvements to the platform support layer
   - firmware handling improvements
   - clocks/power mgmt improvements
   - scheduler/logging improvements

  habanalabs:
   - Gradual sleep in polling memory macro
   - Reduce Gaudi2 MSI-X interrupt count to 128
   - Add Gaudi2-D revision support
   - Add timestamp to CPLD info
   - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error
   - Align Gaudi2 interrupt names
   - Check for errors after preboot is ready
   - Change habanalabs maintainer and git repo path

  mgag200:
   - refactoring and improvements
   - Add BMC output
   - enable polling

  nouveau:
   - add registry command line

  v3d:
   - perf counters improvements

  zynqmp:
   - irq and debugfs improvements

  atmel-hlcdc:
   - Support XLCDC in sam9x7

  mipi-dbi:
   - Remove mipi_dbi_machine_little_endian
   - make SPI bits per word configurable
   - support RGB888
   - allow pixel formats to be specified in the DT

  sun4i:
   - Rework the blender setup for DE2

  panfrost:
   - Enable MT8188 support

  vc4:
   - Monochrome TV support

  exynos:
   - fix fallback mode regression
   - fix memory leak
   - Use drm_edid_duplicate() instead of kmemdup()

  etnaviv:
   - fix i.MX8MP NPU clock gating
   - workaround FE register cdc issues on some cores
   - fix DMA sync handling for cached buffers
   - fix job timeout handling
   - keep TS enabled on MMUv2 cores for improved performance

  mediatek:
   - Convert to platform remove callback returning void-
   - Drop chain_mode_fixup call in mode_valid()
   - Fixes the errors of MediaTek display driver found by IGT
   - Add display support for the MT8365-EVK board
   - Fix bit depth overwritten for mtk_ovl_set bit_depth()
   - Fix possible_crtcs calculation
   - Fix spurious kfree()

  ast:
   - refactor mode setting code

  stm:
   - Add LVDS support
   - DSI PHY updates"

* tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel: (2501 commits)
  drm/amdgpu/mes12: add missing opcode string
  drm/amdgpu/mes11: update opcode strings
  Revert "drm/amd/display: Reset freesync config before update new state"
  drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
  drm/xe: Drop trace_xe_hw_fence_free
  drm/xe/uapi: Rename xe perf layer as xe observation layer
  drm/amdgpu: remove exp hw support check for gfx12
  drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
  drm/amdgpu: flush all cached ras bad pages to eeprom
  drm/amdgpu: select compute ME engines dynamically
  drm/amd/display: Allow display DCC for DCN401
  drm/amdgpu: select compute ME engines dynamically
  drm/amdgpu/job: Replace DRM_INFO/ERROR logging
  drm/amdgpu: select compute ME engines dynamically
  drm/amd/pm: Ignore initial value in smu response register
  drm/amdgpu: Initialize VF partition mode
  drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping
  MAINTAINERS: fix Xinhui's name
  MAINTAINERS: update powerplay and swsmu
  drm/qxl: Pin buffer objects for internal mappings
  ...
2024-07-18 09:34:02 -07:00
Thomas Hellström
5207c393d3 drm/xe: Use write-back caching mode for system memory on DGFX
The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.

However, write-combined system memory is expensive to allocate and
even though it is pooled, the pool is expensive to shrink, since
it involves global CPU TLB flushes.

Moreover write-combined system memory from TTM is only reliably
available on x86 and DGFX doesn't have an x86 restriction.

So regardless of the cpu caching mode selected for a bo,
internally use write-back caching mode for system memory on DGFX.

Coherency is maintained, but user-space clients may perceive a
difference in cpu access speeds.

v2:
- Update RB- and Ack tags.
- Rephrase wording in xe_drm.h (Matt Roper)
v3:
- Really rephrase wording.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 622f709ca6 ("drm/xe/uapi: Add support for CPU caching mode")
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Acked-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: 622f709ca6 ("drm/xe/uapi: Add support for CPU caching mode")
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Effie Yu <effie.yu@intel.com> #On chat
Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 01e0cfc994)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-11 08:25:26 -07:00
Thomas Hellström
01e0cfc994 drm/xe: Use write-back caching mode for system memory on DGFX
The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.

However, write-combined system memory is expensive to allocate and
even though it is pooled, the pool is expensive to shrink, since
it involves global CPU TLB flushes.

Moreover write-combined system memory from TTM is only reliably
available on x86 and DGFX doesn't have an x86 restriction.

So regardless of the cpu caching mode selected for a bo,
internally use write-back caching mode for system memory on DGFX.

Coherency is maintained, but user-space clients may perceive a
difference in cpu access speeds.

v2:
- Update RB- and Ack tags.
- Rephrase wording in xe_drm.h (Matt Roper)
v3:
- Really rephrase wording.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 622f709ca6 ("drm/xe/uapi: Add support for CPU caching mode")
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Acked-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: 622f709ca6 ("drm/xe/uapi: Add support for CPU caching mode")
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Effie Yu <effie.yu@intel.com> #On chat
Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com
2024-07-06 11:05:46 +02:00
Michal Wajdeczko
8e7455dd0d drm/xe: Use ttm_uncached for BO with NEEDS_UC flag
We should honor requested uncached mode also at the TTM layer.
Otherwise, we risk losing updates to the memory based interrupts
source or status vectors, as those require uncached memory.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240618104947.729-1-michal.wajdeczko@intel.com
2024-06-19 13:45:16 +02:00
Radhakrishna Sripada
e46d3f813a drm/xe/trace: Extract bo, vm, vma traces
xe_trace.h is starting to get over crowded. Move the traces
related to bo, vm, vma's to its own file.

v2: Update year in License(Gustavo)

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Suggested-by: Jani Nikula <jani.nikula@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240607182943.3572524-2-radhakrishna.sripada@intel.com
2024-06-12 09:25:07 -07:00
Nirmoy Das
a17aceb34e drm/xe: Check empty pinned BO list with lock held.
Use lock that is meant to use for accessing the BO pin list.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240528115408.22094-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-29 10:56:47 +02:00
Nirmoy Das
a4b725767d drm/xe: Add function to check if BO has single placement
A new helper function xe_bo_has_single_placement() to check
if a BO has single placement.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430162529.21588-5-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-06 18:14:11 +02:00
Thomas Hellström
75521e8b56 drm/xe: Perform dma_map when moving system buffer objects to TT
Currently we dma_map on ttm_tt population and dma_unmap when
the pages are released in ttm_tt unpopulate.

Strictly, the dma_map is not needed until the bo is moved to the
XE_PL_TT placement, so perform the dma_mapping on such moves
instead, and remove the dma_mappig when moving to XE_PL_SYSTEM.

This is desired for the upcoming shrinker series where shrinking
of a ttm_tt might fail. That would lead to an odd construct where
we first dma_unmap, then shrink and if shrinking fails dma_map
again. If dma_mapping instead is performed on move like this,
shrinking does not need to care at all about dma mapping.

Finally, where a ttm_tt is destroyed while bound to a different
memory type than XE_PL_SYSTEM, we keep the dma_unmap in
unpopulate().

v2:
- Don't accidently unmap the dma-buf's sgtable.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502183251.10170-1-thomas.hellstrom@linux.intel.com
2024-05-03 09:46:54 +02:00
Rodrigo Vivi
783d6cdc82
drm/xe: Kill xe_device_mem_access_{get*,put}
Let's simply convert all the current callers towards direct
xe_pm_runtime access and remove this extra layer of indirection.

No functional change is expected with this patch since
xe_mem_access_get was already using the xe_pm_runtime_get_noresume
at this point.

v2: Convert all the current callers instead of a big refactor
at once.

v3: - Rebased
    - Squashed the GSC/HDCP
    - Added a new case: sriov_pf_policy
    - Improved commit message to highlight that
      there's no functional change in this patch.

Reviewed-by: Matthew Auld <matthew.auld@intel.com> #v2
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240418143049.43231-1-rodrigo.vivi@intel.com
2024-04-22 09:03:09 -04:00
Rodrigo Vivi
fdea94a4c2
drm/xe: Convert xe_gem_fault to use direct xe_pm_runtime calls
The gem page fault is one of the outer bound protections where
we want to ensure that the hardware is in D0 before proceeding
with memory access. Let's convert it towards the xe_pm_runtime
functions directly so we can then convert the mem_access to be
inner protection only and then Kill it for good.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-6-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-18 08:31:40 -04:00
Akshata Jahagirdar
e9c22984e9 drm/xe/xe2hpg: Remove extra allocation of CCS pages for dgfx
On Xe2 dGPU, compression is only supported with VRAM. When copying from
VRAM -> system memory the KMD uses mapping with uncompressed PAT
so the copy in system memory is guaranteed to be uncompressed.
When restoring such buffers from system memory -> VRAM the KMD can't
easily know which pages were originally compressed, so we always use
uncompressed -> uncompressed here.
so this means that there's no need for extra CCS storage on such
platforms.

v2: More description added to commit message

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240408170545.3769566-8-balasubramani.vivekanandan@intel.com
2024-04-09 14:22:04 -07:00
Lucas De Marchi
62742d1266 drm/xe: Normalize bo flags macros
The flags stored in the BO grew over time without following
much a naming pattern. First of all, get rid of the _BIT suffix that was
banned from everywhere else due to the guideline in
drivers/gpu/drm/i915/i915_reg.h that xe kind of follows:

	Define bits using ``REG_BIT(N)``. Do **not** add ``_BIT`` suffix to the name.

Here the flags aren't for a register, but it's good practice to keep it
consistent.

Second divergence on names is the use or not of "CREATE". This is
because most of the flags are passed to xe_bo_create*() family of
functions, changing its behavior. However, since the flags are also
stored in the bo itself and checked elsewhere in the code, it seems
better to just omit the CREATE part.

With those 2 guidelines, all the flags are given the form
XE_BO_FLAG_<FLAG_NAME> with the following commands:

	git grep -le "XE_BO_" -- drivers/gpu/drm/xe | xargs sed -i \
		-e "s/XE_BO_\([_A-Z0-9]*\)_BIT/XE_BO_\1/g" \
		-e 's/XE_BO_CREATE_/XE_BO_FLAG_/g'
	git grep -le "XE_BO_" -- drivers/gpu/drm/xe | xargs sed -i -r \
		-e 's/XE_BO_(DEFER_BACKING|SCANOUT|FIXED_PLACEMENT|PAGETABLE|NEEDS_CPU_ACCESS|NEEDS_UC|INTERNAL_TEST|INTERNAL_64K|GGTT_INVALIDATE)/XE_BO_FLAG_\1/g'

And then the defines in drivers/gpu/drm/xe/xe_bo.h are adjusted to
follow the coding style.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240322142702.186529-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-02 10:33:57 -07:00
Lucas De Marchi
e27f8a45c8 drm/xe: Stop passing user flag to xe_bo_create_user()
It's quite redundant to pass XE_BO_CREATE_USER_BIT to
xe_bo_create_user() since the only difference of that function is to
force that flag. Stop passing the flag in the few cases that were
explicitly doing so.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240322142702.186529-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-02 10:32:54 -07:00
Matthew Brost
231c411087 drm/xe: Add XE_BO_GGTT_INVALIDATE flag
Add XE_BO_GGTT_INVALIDATE flag which indicates the GGTT should be
invalidated when a BO is added / removed from the GGTT. This is
typically set when a BO is used by the GuC as the GuC has GGTT TLBs.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
[mlankhorst: Small fix to only inherit GGTT_INVALIDATE from src bo]
[mlankhorst: Remove _BIT from name]
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-4-matthew.brost@intel.com
2024-03-20 10:49:14 +01:00
Michal Wajdeczko
6583b0839a drm/xe: Allow VRAM BO allocations aligned to 64K
While today we are getting VRAM allocations aligned to 64K as the
XE_VRAM_FLAGS_NEED64K flag could be set, we shouldn't only rely on
that flag and we should also allow caller to specify required 64K
alignment explicitly.  Define new XE_BO_NEEDS_64K flag for that.

Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313104132.1045-2-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2024-03-15 22:20:52 +01:00
Nirmoy Das
002d8f0b4f drm/xe: Remove unused xe_bo->props struct
Property struct is not being used so remove it and related dead code.

Fixes: ddfa2d6a84 ("drm/xe/uapi: Kill VM_MADVISE IOCTL")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: intel-xe@lists.freedesktop.org
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240311151159.10036-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-03-13 21:21:56 -07:00
Matthew Brost
198bc28d0a drm/xe: Pipeline evict / restore of pinned BOs during suspend / resume
Rather than waiting for each evict / restore of pinned BOs to complete
just wait on migrate exec queue to be idle once during suspend / resume.

Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305173503.285223-1-matthew.brost@intel.com
2024-03-05 11:45:51 -08:00
Priyanka Dandamudi
8034f6b070 drm/xe/xe_trace: Add move_lacks_source detail to xe_bo_move trace
Add move_lacks_source detail to xe_bo_move trace to make it readable
that is to check if it is migrate clear or migrate copy.

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: a0df2cc858 ("drm/xe/xe_bo_move: Enhance xe_bo_move trace")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221101950.1019312-1-priyanka.dandamudi@intel.com
2024-02-29 11:51:58 +01:00
Michał Winiarski
a44bbace73 drm/xe/guc: Allocate GuC data structures in system memory for initial load
GuC load will need to happen at an earlier point in probe, where local
memory is not yet available. Use system memory for GuC data structures
used for initial "hwconfig" load, and realloc at a later,
"post-hwconfig" load if needed, when local memory is available.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-1-michal.winiarski@intel.com
2024-02-20 14:13:42 -05:00
Lucas De Marchi
fbb944086f Merge drm/drm-next into drm-xe-next
Bring changes from drm-misc-next that got merged in drm-next back to
drm-xe so they can be used for additional features.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-02-20 09:57:17 -08:00
Priyanka Dandamudi
a0df2cc858 drm/xe/xe_bo_move: Enhance xe_bo_move trace
Enhanced xe_bo_move trace to be more readable.
It will help to show the migration details.
Src and dst details.

v2: Modify trace_xe_bo_move(), it takes the integer mem_type
rather than a string.
Make mem_type_to_name() extern, it will be used by trace.(Thomas)

v3: Move mem_type_to_name() to xe_bo.[ch] (Thomas, Matt)

v4: Add device details to reduce ambiquity related to vram0/vram1. (Oak)

v5: Rename mem_type_to_name to xe_mem_type_to_name. (Thomas)

v6: Optimised code to use xe_bo_device(__entry->bo). (Thomas)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Kempczynski Zbigniew <Zbigniew.Kempczynski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240220044748.948496-1-priyanka.dandamudi@intel.com
2024-02-20 08:35:14 +01:00
Thomas Zimmermann
0e85f1ae4a Merge drm/drm-next into drm-misc-next
Backmerging to update drm-misc-next to the state of v6.8-rc3. Also
fixes a build problem with xe.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-02-07 13:02:20 +01:00
Jani Nikula
2fe36db5fd drm/xe: make xe_ttm_funcs const
Place the function pointers in rodata. Also drop the extra declaration
while at it.

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117122044.1544174-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-01-19 14:34:27 +02:00
Brian Welty
8049e3954a drm/xe: Fix bounds checking in __xe_bo_placement_for_flags()
Requesting all memory regions on PVC will fill bo->placements up to
XE_BO_MAX_PLACEMENTS. The subsequent call to try_add_stolen() will trip
over the bounds checking even though XE_PL_STOLEN is not expected to
be used in this case.

This is hit with igt@xe_exec_fault_mode@once-basic-prefetch:
    xe 0000:8c:00.0: [drm] Assertion `*c < (sizeof(bo->placements) / sizeof((bo->placements)[0]) + ((int)(sizeof(struct { int:(-!!(__builtin_types_compatible_p(typeof((bo->placements)), typeof(&(bo->placements)[0])))); }))))` failed!
    WARNING: CPU: 30 PID: 6161 at drivers/gpu/drm/xe/xe_bo.c:203 __xe_bo_placement_for_flags+0x218/0x240 [xe]

Is fixed here by moving the bounds checks closer to where we actually
write into the bo->placement array.

Fixes: 8c54ee8a86 ("drm/xe: Ensure that we don't access the placements array out-of-bounds")
Link: https://patchwork.freedesktop.org/patch/msgid/20240111002111.10190-1-brian.welty@intel.com
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
(cherry picked from commit 52e3fa3e3e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15 15:37:03 +01:00
Thomas Hellström
77232e6a28 drm/xe: Annotate xe_mem_region::mapping with __iomem
The pointer points to IO memory, but the __iomem annotation was
incorrectly placed. Annotate it correctly, update its usage accordingly
and fix the corresponding sparse error.

Fixes: 0887a2e7ab ("drm/xe: Make xe_mem_region struct")
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-3-thomas.hellstrom@linux.intel.com
(cherry picked from commit 20855b62a3)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15 15:36:42 +01:00
Brian Welty
52e3fa3e3e drm/xe: Fix bounds checking in __xe_bo_placement_for_flags()
Requesting all memory regions on PVC will fill bo->placements up to
XE_BO_MAX_PLACEMENTS. The subsequent call to try_add_stolen() will trip
over the bounds checking even though XE_PL_STOLEN is not expected to
be used in this case.

This is hit with igt@xe_exec_fault_mode@once-basic-prefetch:
    xe 0000:8c:00.0: [drm] Assertion `*c < (sizeof(bo->placements) / sizeof((bo->placements)[0]) + ((int)(sizeof(struct { int:(-!!(__builtin_types_compatible_p(typeof((bo->placements)), typeof(&(bo->placements)[0])))); }))))` failed!
    WARNING: CPU: 30 PID: 6161 at drivers/gpu/drm/xe/xe_bo.c:203 __xe_bo_placement_for_flags+0x218/0x240 [xe]

Is fixed here by moving the bounds checks closer to where we actually
write into the bo->placement array.

Fixes: 8c54ee8a86 ("drm/xe: Ensure that we don't access the placements array out-of-bounds")
Link: https://patchwork.freedesktop.org/patch/msgid/20240111002111.10190-1-brian.welty@intel.com
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2024-01-12 09:36:37 -08:00
Thomas Hellström
20855b62a3 drm/xe: Annotate xe_mem_region::mapping with __iomem
The pointer points to IO memory, but the __iomem annotation was
incorrectly placed. Annotate it correctly, update its usage accordingly
and fix the corresponding sparse error.

Fixes: 0887a2e7ab ("drm/xe: Make xe_mem_region struct")
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-3-thomas.hellstrom@linux.intel.com
2024-01-09 18:00:55 +01:00
Badal Nilawar
fa78e188d8 drm/xe/dgfx: Release mmap mappings on rpm suspend
Release all mmap mappings for all vram objects which are associated
with userfault such that, while pcie function in D3hot, any access
to memory mappings will raise a userfault.

Upon userfault, in order to access memory mappings, if graphics
function is in D3 then runtime resume of dgpu will be triggered to
transition to D0.

v2:
  - Avoid iomem check before bo migration check as bo can migrate
    to system memory (Matthew Auld)
v3:
  - Delete bo userfault link during bo destroy
  - Upon bo move (vram-smem), do bo userfault link deletion in
    xe_bo_move_notify instead of xe_bo_move (Thomas Hellström)
  - Grab lock in rpm hook while deleting bo userfault link (Matthew Auld)
v4:
  - Add kernel doc and wrap vram_userfault related
    stuff in the structure (Matthew Auld)
  - Get rpm wakeref before taking dma reserve lock (Matthew Auld)
  - In suspend path apply lock for entire list op
    including list iteration (Matthew Auld)
v5:
  - Use mutex lock instead of spin lock
v6:
  - Fix review comments (Matthew Auld)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #For the xe_bo_move_notify() changes
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20240104130702.950078-1-badal.nilawar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-01-08 16:55:44 -05:00
Lucas De Marchi
80166e9567 drm/xe/bo: Remove unusued variable
bo is not used since all the checks are against tbo. Fix warning:

	../drivers/gpu/drm/xe/xe_bo.c: In function ‘xe_evict_flags’:
	../drivers/gpu/drm/xe/xe_bo.c:250:23: error: variable ‘bo’ set but not used [-Werror=unused-but-set-variable]
	  250 |         struct xe_bo *bo;

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:17 -05:00
Himal Prasad Ghimiray
266c858852 drm/xe/xe2: Handle flat ccs move for igfx.
- Clear flat ccs during user bo creation.
- copy ccs meta data between flat ccs and bo during eviction and
restore.
- Add a bool field ccs_cleared in bo, true means ccs region of bo is
already cleared.

v2:
 - Rebase.

v3:
 - Maintain order of xe_bo_move_notify for ttm_bo_type_sg.

v4:
 - xe_migrate_copy can be used to copy src to dst bo on igfx too.
Add a bool which handles only ccs metadata copy.

v5:
- on dgfx ccs should be cleared even if the bo is not compression enabled.

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:15 -05:00