Commit Graph

31433 Commits

Author SHA1 Message Date
Alex Deucher
44f392fbf6 Revert "drm/amd/pm: correct the workload setting"
This reverts commit 74e1006430.

This causes a regression in the workload selection.
A more extensive fix is being worked on.
For now, revert.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Fixes: 74e1006430 ("drm/amd/pm: correct the workload setting")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-16 09:41:11 -05:00
Vijendar Mukunda
7013a8268d drm/amd: Fix initialization mistake for NBIO 7.7.0
There is a strapping issue on NBIO 7.7.0 that can lead to spurious PME
events while in the D0 state.

Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20241112161142.28974-1-mario.limonciello@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 447a54a0f7)
Cc: stable@vger.kernel.org
2024-11-12 17:37:39 -05:00
Alex Deucher
5f77ee21eb Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"
This reverts commit 694c79769c.

This was not the root cause.  Revert.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3678
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: aurabindo.pillai@amd.com
Cc: hamishclaxton@gmail.com
(cherry picked from commit 3c2296b1ee)
Cc: stable@vger.kernel.org # 6.11.x
2024-11-12 17:37:39 -05:00
Hamish Claxton
4bb2f52ac0 drm/amd/display: Fix failure to read vram info due to static BP_RESULT
The static declaration causes the check to fail.  Remove it.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678
Fixes: 00c391102a ("drm/amd/display: Add misc DC changes for DCN401")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Hamish Claxton <hamishclaxton@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: aurabindo.pillai@amd.com
Cc: hamishclaxton@gmail.com
(cherry picked from commit 91314e7dfd)
Cc: stable@vger.kernel.org # 6.11.x
2024-11-12 17:37:38 -05:00
Christian König
5a67c31669 drm/amdgpu: enable GTT fallback handling for dGPUs only
That is just a waste of time on APUs.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3704
Fixes: 216c1282dd ("drm/amdgpu: use GTT only as fallback for VRAM|GTT")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e8fc090d32)
Cc: stable@vger.kernel.org
2024-11-12 17:37:38 -05:00
Jack Xiao
79365ea707 drm/amdgpu/mes12: correct kiq unmap latency
Correct kiq unmap queue timeout value.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cfe98204a0)
Cc: stable@vger.kernel.org # 6.11.x
2024-11-11 14:05:51 -05:00
Christian König
0e5ac88fb9 drm/amdgpu: fix check in gmc_v9_0_get_vm_pte()
The coherency flags can only be determined when the BO is locked and that
in turn is only guaranteed when the mapping is validated.

Fix the check, move the resource check into the function and add an assert
that the BO is locked.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: d1a372af1c ("drm/amdgpu: Set MTYPE in PTE based on BO flags")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1b4ca8546f)
Cc: stable@vger.kernel.org
2024-11-11 14:05:44 -05:00
Tim Huang
df0279e2a1 drm/amd/pm: print pp_dpm_mclk in ascending order on SMU v14.0.0
Currently, the pp_dpm_mclk values are reported in descending order
on SMU IP v14.0.0/1/4. Adjust to ascending order for consistency
with other clock interfaces.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d4be16ccfd)
Cc: stable@vger.kernel.org
2024-11-11 14:05:39 -05:00
David Rosca
d641a151fc drm/amdgpu: Fix video caps for H264 and HEVC encode maximum size
H264 supports 4096x4096 starting from Polaris.
HEVC also supports 4096x4096; with VCN 3 and newer, 8192x4352
is supported.

Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 69e9a9e65b)
Cc: stable@vger.kernel.org
2024-11-11 14:05:36 -05:00
Rodrigo Siqueira
16dd2825c2 drm/amd/display: Adjust VSDB parser for replay feature
At some point, the IEEE ID identification for the replay check in the
AMD EDID was added. However, this check causes the following
out-of-bounds issues when using KASAN:

[   27.804016] BUG: KASAN: slab-out-of-bounds in amdgpu_dm_update_freesync_caps+0xefa/0x17a0 [amdgpu]
[   27.804788] Read of size 1 at addr ffff8881647fdb00 by task systemd-udevd/383

...

[   27.821207] Memory state around the buggy address:
[   27.821215]  ffff8881647fda00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   27.821224]  ffff8881647fda80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   27.821234] >ffff8881647fdb00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   27.821243]                    ^
[   27.821250]  ffff8881647fdb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   27.821259]  ffff8881647fdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   27.821268] ==================================================================

This happens because the ID extraction reads beyond the EDID length.
This commit addresses the issue by taking the amd_vsdb_block size into
account.

Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b7e381b1cc)
Cc: stable@vger.kernel.org
2024-11-11 14:05:30 -05:00
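Note on 16dd2825c2: the fix amounts to checking the remaining buffer length against the block size before reading the ID bytes. A minimal sketch of that ordering follows, assuming a hypothetical 5-byte block layout and a hypothetical vendor ID; the names `vsdb_block`, `WANTED_ID` and `vsdb_matches` are illustrative and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block layout, for illustration only. */
struct vsdb_block {
    uint8_t ieee_id[3];
    uint8_t version;
    uint8_t caps;
};

/* Hypothetical vendor ID used only in this sketch. */
static const uint8_t WANTED_ID[3] = { 0x1a, 0x00, 0x00 };

/*
 * Return true if a complete block with the wanted ID starts at `offset`.
 * The size check comes first, so the ID bytes are never read past `len`.
 */
static bool vsdb_matches(const uint8_t *buf, size_t len, size_t offset)
{
    assert(buf != NULL);

    if (offset > len || len - offset < sizeof(struct vsdb_block))
        return false;

    return buf[offset] == WANTED_ID[0] &&
           buf[offset + 1] == WANTED_ID[1] &&
           buf[offset + 2] == WANTED_ID[2];
}
```

Writing the comparison as `len - offset < size` after the `offset > len` test avoids unsigned wrap-around in the subtraction.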
Dillon Varone
9fc0cbcb6e drm/amd/display: Require minimum VBlank size for stutter optimization
If the nominal VBlank is too small, optimizing for stutter can cause
the prefetch bandwidth to increase drastically, resulting in higher
clock and power requirements. Only optimize if it is >3x the stutter
latency.

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 003215f962)
Cc: stable@vger.kernel.org
2024-11-11 14:05:26 -05:00
Ryan Seto
6825cb07b7 drm/amd/display: Handle dml allocation failure to avoid crash
[Why]
In the case where a dml allocation fails for any reason, the
current state's dml contexts would no longer be valid. Subsequent
calls to dc_state_copy_internal would then shallow copy invalid
memory, and if the new state was released, a double free would
occur.

[How]
Reset the dml pointers in new_state to NULL to avoid carrying over
invalid pointers.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Ryan Seto <ryanseto@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bcafdc6152)
Cc: stable@vger.kernel.org
2024-11-11 14:05:22 -05:00
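Note on 6825cb07b7: the hazard is a shallow copy that leaves two states pointing at the same allocation, so releasing both frees it twice. A minimal user-space sketch of the "reset to NULL before anything can fail" idea follows. The types `dml_ctx` and `state` and the `simulate_failure` flag are hypothetical stand-ins, not the DC structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real state and dml context types. */
struct dml_ctx {
    int config;
};

struct state {
    int id;
    struct dml_ctx *dml;
};

/* Copy src into dst; `simulate_failure` forces the allocation to fail. */
static int state_copy(struct state *dst, const struct state *src,
                      int simulate_failure)
{
    assert(dst != NULL && src != NULL);

    *dst = *src;      /* shallow copy: dst->dml now aliases src->dml */
    dst->dml = NULL;  /* drop the alias before anything can fail */

    if (src->dml == NULL)
        return 0;

    dst->dml = simulate_failure ? NULL : malloc(sizeof(*dst->dml));
    if (dst->dml == NULL)
        return -1;    /* dst owns nothing, so releasing it is safe */

    *dst->dml = *src->dml;
    return 0;
}

static void state_release(struct state *s)
{
    free(s->dml);     /* free(NULL) is a no-op */
    s->dml = NULL;
}
```

With the pointer cleared up front, a failed copy can be released without touching memory that the source state still owns.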
Tom Chung
bd8a957661 drm/amd/display: Fix Panel Replay not update screen correctly
[Why]
In certain use cases, such as the KDE login screen, there is no atomic
commit while the frame is updated.
If Panel Replay is enabled, the screen is not updated and the system
appears to hang.

[How]
Delay enabling Panel Replay by a few atomic commits, just like PSR.

Fixes: be64336307 ("drm/amd/display: Re-enable panel replay feature")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3686
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3682
Tested-By: Corey Hickey <bugfood-c@fatooh.org>
Tested-By: James Courtier-Dutton <james.dutton@gmail.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ca628f0edd)
Cc: stable@vger.kernel.org # 6.11+
2024-11-11 14:05:15 -05:00
Tom Chung
b8d9d5fef4 drm/amd/display: Change some variable name of psr
The Panel Replay feature may use the same variables as PSR.
Rename them so that they are not specific to PSR.

Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c7fafb7a46)
Cc: stable@vger.kernel.org # 6.11+
2024-11-11 14:05:11 -05:00
Alex Deucher
4d75b94680 drm/amdgpu: add missing size check in amdgpu_debugfs_gprwave_read()
Avoid a possible buffer overflow if size is larger than 4K.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f5d873f582)
Cc: stable@vger.kernel.org
2024-11-05 10:54:11 -05:00
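Note on 4d75b94680: the message describes a read handler that stages data in a fixed 4K buffer, so a larger request must be rejected before any copy. A minimal user-space sketch of that check follows. The function `bounded_read`, the `STAGING_SIZE` constant and the choice of `-EINVAL` as the error value are assumptions for illustration, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define STAGING_SIZE 4096 /* assumed staging buffer size (4K) */

/*
 * Copy up to `size` bytes from src to dst through a fixed staging buffer.
 * Requests larger than the staging buffer are rejected up front.
 */
static long bounded_read(char *dst, size_t size,
                         const char *src, size_t src_len)
{
    char staging[STAGING_SIZE];

    assert(dst != NULL && src != NULL);

    if (size > sizeof(staging))
        return -EINVAL;
    if (size > src_len)
        size = src_len;

    memcpy(staging, src, size);
    memcpy(dst, staging, size);
    return (long)size;
}
```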
Alex Deucher
f790a2c494 drm/amdgpu: Adjust debugfs eviction and IB access permissions
Users should not be able to run these.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ba9395430)
Cc: stable@vger.kernel.org
2024-11-05 10:53:48 -05:00
Alex Deucher
b46dadf7e3 drm/amdgpu: Adjust debugfs register access permissions
Regular users shouldn't have read access.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c0cfd2e652)
Cc: stable@vger.kernel.org
2024-11-05 10:53:21 -05:00
Lijo Lazar
3ce3f85787 drm/amdgpu: Fix DPX valid mode check on GC 9.4.3
For DPX mode, the number of memory partitions supported should be less
than or equal to 2.

Fixes: 1589c82a10 ("drm/amdgpu: Check memory ranges for valid xcp mode")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 990c4f5807)
Cc: stable@vger.kernel.org
2024-11-05 10:52:40 -05:00
Kenneth Feng
74e1006430 drm/amd/pm: correct the workload setting
Correct the workload setting in order not to mix the setting
with the end user. Update the workload mask accordingly.

v2: changes as below:
1. the end user cannot erase the workload from the driver, except the default workload.
2. always show the real highest priority workload to the end user.
3. the real workload mask is the driver workload mask combined with the end user workload mask.

v3: apply this to the other ASICs as well.
v4: simplify the code
v5: refine the code based on the review comments.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8cc438be5d)
Cc: stable@vger.kernel.org # 6.11.x
2024-11-04 12:51:01 -05:00
Kenneth Feng
1356bfc54c drm/amd/pm: always pick the pptable from IFWI
Always pick the pptable from IFWI on SMU v14.0.2/3.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 136ce12bd5)
Cc: stable@vger.kernel.org # 6.11.x
2024-11-04 12:48:51 -05:00
Antonio Quartulli
a6dd15981c drm/amdgpu: prevent NULL pointer dereference if ATIF is not supported
acpi_evaluate_object() may return AE_NOT_FOUND (failure), which
would result in dereferencing buffer.pointer (obj) while being NULL.

Although this case may be unrealistic for the current code, it is
still better to protect against possible bugs.

Bail out also when status is AE_NOT_FOUND.

This fixes 1 FORWARD_NULL issue reported by Coverity
Report: CID 1600951:  Null pointer dereferences  (FORWARD_NULL)

Signed-off-by: Antonio Quartulli <antonio@mandelbit.com>
Fixes: c9b7c809b8 ("drm/amd: Guard against bad data for ATIF ACPI method")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241031152848.4716-1-antonio@mandelbit.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 91c9e221fe)
Cc: stable@vger.kernel.org
2024-11-04 12:48:21 -05:00
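Note on a6dd15981c: the pattern is to return on every failure status, including "not found", before the result pointer is dereferenced. A minimal sketch follows, using hypothetical status codes and a hypothetical buffer type in place of the ACPI ones; none of these names come from ACPICA or the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for ACPI status values. */
enum eval_status { EVAL_OK, EVAL_NOT_FOUND, EVAL_ERROR };

/* Hypothetical result buffer: pointer may be NULL when evaluation failed. */
struct eval_buffer {
    size_t length;
    const int *pointer;
};

/*
 * Read the first value from an evaluation result.
 * Every failure status returns before buf->pointer is touched.
 */
static int read_result(enum eval_status status,
                       const struct eval_buffer *buf, int *out)
{
    assert(buf != NULL && out != NULL);

    if (status != EVAL_OK)
        return -1;
    if (buf->pointer == NULL || buf->length < sizeof(int))
        return -1;

    *out = buf->pointer[0];
    return 0;
}
```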
Aurabindo Pillai
694c79769c drm/amd/display: parse umc_info or vram_info based on ASIC
An upstream bug report suggests that there are production dGPUs that are
older than DCN401 but still have a umc_info in VBIOS tables with the
same version as expected for a DCN401 product. Hence, reading these
tables should be guarded with a version check.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2551b4a321)
Fixes: 00c391102a ("drm/amd/display: Add misc DC changes for DCN401")
Cc: stable@vger.kernel.org # 6.11.x
2024-11-04 12:44:07 -05:00
Tom Chung
4f26c95ffc drm/amd/display: Fix brightness level not retained over reboot
[Why]
During boot up and resume, the DC layer resets the panel
brightness to fix a flicker issue.

As a result, dm->actual_brightness no longer matches the current panel
brightness level (dm->brightness holds the correct panel level).

[How]
Set the backlight level after the mode is set.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Fixes: d9e865826c ("drm/amd/display: Simplify brightness initialization")
Reported-by: Mark Herbert <mark.herbert42@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3655
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7875afafba)
Cc: stable@vger.kernel.org
2024-11-04 12:43:03 -05:00
Alex Deucher
935abb86a9 drm/amdgpu/smu13: fix profile reporting
The following 3 commits landed in parallel:
commit d7d2688bf4 ("drm/amd/pm: update workload mask after the setting")
commit 7a1613e47e ("drm/amdgpu/smu13: always apply the powersave optimization")
commit 7c210ca5a2 ("drm/amdgpu: handle default profile on on devices without fullscreen 3D")
While everything is set correctly, this caused the profile to be
reported incorrectly because both the powersave and fullscreen3d bits
were set in the mask and when the driver prints the profile, it looks
for the first bit set.

Fixes: d7d2688bf4 ("drm/amd/pm: update workload mask after the setting")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ecfe9b2376)
Cc: stable@vger.kernel.org
2024-10-28 17:19:45 -04:00
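Note on 935abb86a9: when a reporting path looks for the first set bit in a mask, a mask with two profile bits set can only ever report one of them. A minimal sketch of that effect follows. The profile numbering and the names `profile_bit` and `reported_profile` are hypothetical, and the sketch does not claim which bit the real driver ends up reporting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical profile numbering, chosen only for this sketch. */
enum profile {
    PROFILE_DEFAULT = 0,
    PROFILE_FULLSCREEN3D = 1,
    PROFILE_POWERSAVE = 2,
    PROFILE_COUNT
};

/* Bit in the workload mask that corresponds to profile p. */
static uint32_t profile_bit(enum profile p)
{
    assert((unsigned int)p < PROFILE_COUNT);
    return UINT32_C(1) << p;
}

/* Report the lowest-numbered profile whose bit is set, or -1 if none. */
static int reported_profile(uint32_t mask)
{
    for (int bit = 0; bit < PROFILE_COUNT; bit++) {
        if (mask & (UINT32_C(1) << bit))
            return bit;
    }
    return -1;
}
```

In this sketch, a mask holding both the powersave and fullscreen 3D bits reports fullscreen 3D only, because its bit has the lower number.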
Tvrtko Ursulin
4aa923a6e6 drm/amd/pm: Vangogh: Fix kernel memory out of bounds write
KASAN reports that the GPU metrics table allocated in
vangogh_tables_init() is not large enough for the memset done in
smu_cmn_init_soft_gpu_metrics(). Condensed report follows:

[   33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu]
[   33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067
...
[   33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G        W          6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544
[   33.861816] Tainted: [W]=WARN
[   33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023
[   33.861822] Call Trace:
[   33.861826]  <TASK>
[   33.861829]  dump_stack_lvl+0x66/0x90
[   33.861838]  print_report+0xce/0x620
[   33.861853]  kasan_report+0xda/0x110
[   33.862794]  kasan_check_range+0xfd/0x1a0
[   33.862799]  __asan_memset+0x23/0x40
[   33.862803]  smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[   33.863306]  vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[   33.864257]  vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[   33.865682]  amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[   33.866160]  amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[   33.867135]  dev_attr_show+0x43/0xc0
[   33.867147]  sysfs_kf_seq_show+0x1f1/0x3b0
[   33.867155]  seq_read_iter+0x3f8/0x1140
[   33.867173]  vfs_read+0x76c/0xc50
[   33.867198]  ksys_read+0xfb/0x1d0
[   33.867214]  do_syscall_64+0x90/0x160
...
[   33.867353] Allocated by task 378 on cpu 7 at 22.794876s:
[   33.867358]  kasan_save_stack+0x33/0x50
[   33.867364]  kasan_save_track+0x17/0x60
[   33.867367]  __kasan_kmalloc+0x87/0x90
[   33.867371]  vangogh_init_smc_tables+0x3f9/0x840 [amdgpu]
[   33.867835]  smu_sw_init+0xa32/0x1850 [amdgpu]
[   33.868299]  amdgpu_device_init+0x467b/0x8d90 [amdgpu]
[   33.868733]  amdgpu_driver_load_kms+0x19/0xf0 [amdgpu]
[   33.869167]  amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu]
[   33.869608]  local_pci_probe+0xda/0x180
[   33.869614]  pci_device_probe+0x43f/0x6b0

Empirically we can confirm that the former allocates 152 bytes for the
table, while the latter memsets a 168-byte block.

The root cause appears to be that when the GPU metrics tables for v2_4
parts were added, the table was not enlarged to fit.

The fix in this patch is rather "brute force" and perhaps later should be
done in a smarter way, by extracting and consolidating the part version to
size logic to a common helper, instead of brute forcing the largest
possible allocation. Nevertheless, for now this works and fixes the out of
bounds write.

v2:
 * Drop impossible v3_0 case. (Mario)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 41cec40bc9 ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Wenyou Yang <WenYou.Yang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0880f58f96)
Cc: stable@vger.kernel.org # v6.6+
2024-10-28 17:14:08 -04:00
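Note on 4aa923a6e6: the "largest possible allocation" approach can be sketched as sizing the table for the biggest supported layout, so that initializing any version stays in bounds. In the sketch below only the two sizes (152 and 168 bytes) come from the report above. The struct names, the helper names and the 0xff fill value are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layouts; only the sizes match the report. */
struct metrics_old { uint8_t raw[152]; };
struct metrics_new { uint8_t raw[168]; };

#define MAX_SIZE(a, b) ((a) > (b) ? (a) : (b))

/* Size the shared table for the largest layout that may be written to it. */
static size_t metrics_table_size(void)
{
    return MAX_SIZE(sizeof(struct metrics_old), sizeof(struct metrics_new));
}

static void *metrics_table_alloc(void)
{
    return calloc(1, metrics_table_size());
}

/* Initialize the table for one layout; the assert documents the invariant. */
static void metrics_table_init(void *table, size_t version_size)
{
    assert(table != NULL);
    assert(version_size <= metrics_table_size());
    memset(table, 0xff, version_size);
}
```

A shared helper that maps a part version to its table size, as the commit message suggests for later, would replace the fixed maximum used here.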
Ovidiu Bunea
1b6063a577 Revert "drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35"
This reverts
commit 9dad21f910 ("drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35")

[why & how]
The offending commit exposes a hang with lid close/open behavior.
Both issues seem to be related to ODM 2:1 mode switching, so there
is another issue generic to that sequence that needs to be
investigated.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 68bf95317e)
Cc: stable@vger.kernel.org
2024-10-28 17:13:25 -04:00
Alex Deucher
7c210ca5a2 drm/amdgpu: handle default profile on on devices without fullscreen 3D
Some devices do not support fullscreen 3D.

v2: Make the check generic.

Fixes: ec1aab7816 ("drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
(cherry picked from commit 1cdd67510e)
2024-10-22 18:21:51 -04:00
Mario Limonciello
ba1959f711 drm/amd/display: Disable PSR-SU on Parade 08-01 TCON too
Stuart Hayhurst has found that both bootup and fullscreen VA-API video
lead to black screens for around 1 second and kernel WARNING [1] traces
when calling dmub_psr_enable() with Parade 08-01 TCON.

These symptoms all go away with PSR-SU disabled for this TCON, so disable
it for now while DMUB traces [2] from the failure can be analyzed and the failure
state properly root caused.

Cc: Marc Rossi <Marc.Rossi@amd.com>
Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/uploads/a832dd515b571ee171b3e3b566e99a13/dmesg.log [1]
Link: https://gitlab.freedesktop.org/drm/amd/uploads/8f13ff3b00963c833e23e68aa8116959/output.log [2]
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2645
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Link: https://lore.kernel.org/r/20240205211233.2601-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit afb634a682)
Cc: stable@vger.kernel.org
2024-10-22 18:13:03 -04:00
Frank Min
108bc59fe8 drm/amdgpu: fix random data corruption for sdma 7
Const fill causes random data corruption because the write compression
mode is not configured correctly.

Configure the compression mode correctly for const fill.

Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 75400f8d6e)
Cc: stable@vger.kernel.org # 6.11.x
2024-10-22 18:11:43 -04:00
Aurabindo Pillai
63feb35cd2 drm/amd/display: temp w/a for DP Link Layer compliance
[Why&How]
Disabling P-State support on full updates for DCN401 results in
introducing additional communication with SMU. A UCLK hard min message
to SMU takes 4 seconds to go through, which was due to DCN not allowing
pstate switch, which was caused by incorrect value for TTU watermark
before blanking the HUBP prior to DPG on for servicing the test request.

Fix the issue temporarily by disallowing pstate changes for compliance
test while test request handler is reworked for a proper fix.

Fixes: 67ea53a4bd ("drm/amd/display: Disable DCN401 UCLK P-State support on full updates")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8a79f7cdbb)
Cc: stable@vger.kernel.org
2024-10-22 18:11:20 -04:00
Aurabindo Pillai
23d16ede33 drm/amd/display: temp w/a for dGPU to enter idle optimizations
[Why&How]
vblank immediate disable currently does not work for all asics. On
DCN401, the vblank interrupts never stop coming, and hence we never
get a chance to trigger idle optimizations.

Add a workaround to enable immediate disable only on APUs for now. This
adds a 2-frame delay for triggering idle optimization, which is a
negligible overhead.

Fixes: 58a261bfc9 ("drm/amd/display: use a more lax vblank enable policy for older ASICs")
Fixes: e45b6716de ("drm/amd/display: use a more lax vblank enable policy for DCN35+")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9b47278cec)
Cc: stable@vger.kernel.org
2024-10-22 18:10:55 -04:00
Kenneth Feng
f67644b219 drm/amd/pm: update deep sleep status on smu v14.0.2/3
Disable deep sleep during compute workloads to avoid potential
performance loss on SMU v14.0.2/3.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7d9af459f4)
2024-10-22 18:10:17 -04:00
Kenneth Feng
f888e3d34b drm/amd/pm: update overdrive function on smu v14.0.2/3
update overdrive function on smu v14.0.2/3

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit dcf822fca5)
2024-10-22 18:10:08 -04:00
Kenneth Feng
9515e74d75 drm/amd/pm: update the driver-fw interface file for smu v14.0.2/3
update the driver-fw interface file for smu v14.0.2/3

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0642c95efb)
2024-10-22 18:09:06 -04:00
Mario Limonciello
bf58f03931 drm/amd: Guard against bad data for ATIF ACPI method
If a BIOS provides bad data in response to an ATIF method call
this causes a NULL pointer dereference in the caller.

```
? show_regs (arch/x86/kernel/dumpstack.c:478 (discriminator 1))
? __die (arch/x86/kernel/dumpstack.c:423 arch/x86/kernel/dumpstack.c:434)
? page_fault_oops (arch/x86/mm/fault.c:544 (discriminator 2) arch/x86/mm/fault.c:705 (discriminator 2))
? do_user_addr_fault (arch/x86/mm/fault.c:440 (discriminator 1) arch/x86/mm/fault.c:1232 (discriminator 1))
? acpi_ut_update_object_reference (drivers/acpi/acpica/utdelete.c:642)
? exc_page_fault (arch/x86/mm/fault.c:1542)
? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
? amdgpu_atif_query_backlight_caps.constprop.0 (drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:387 (discriminator 2)) amdgpu
? amdgpu_atif_query_backlight_caps.constprop.0 (drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:386 (discriminator 1)) amdgpu
```

It has been encountered on at least one system, so guard for it.

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c9b7c809b8)
Cc: stable@vger.kernel.org
2024-10-22 18:08:12 -04:00
Alex Deucher
ec1aab7816 drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs
This uses more aggressive heuristics than the bootup default
profile.  On Windows, the OS has a special fullscreen 3D mode
where this is used.  Since we don't have the equivalent on Linux,
default to this profile for dGPUs.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1500
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 336568de91)
2024-10-16 15:51:10 -04:00
Alex Deucher
cb07c8338f drm/amdgpu/swsmu: Only force workload setup on init
Needed to set the workload type at init time so that
we can apply the navi3x margin optimization.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Fixes: c50fe289ed ("drm/amdgpu/swsmu: always force a state reprogram on init")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 580ad7cbd4)
Cc: stable@vger.kernel.org
2024-10-15 11:53:27 -04:00
Alex Deucher
7a1613e47e drm/amdgpu/smu13: always apply the powersave optimization
It can avoid margin issues in some very demanding applications.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Fixes: c50fe289ed ("drm/amdgpu/swsmu: always force a state reprogram on init")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 62f38b4cca)
Cc: stable@vger.kernel.org
2024-10-15 11:52:46 -04:00
Philip Yang
68d26c10ef drm/amdkfd: Accounting pdd vram_usage for svm
Process device data pdd->vram_usage is read by rocm-smi via sysfs. It
is currently missing the svm_bo usage accounting, so the per-process
VRAM usage reported by "rocm-smi --showpids" is incorrect.

Add pdd->vram_usage accounting on svm_bo allocation and release, and
change the type to atomic64_t because it is now updated outside the
process mutex.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 98c0b0efcc)
2024-10-15 11:50:13 -04:00
Srinivasan Shanmugam
e7457532cb drm/amd/amdgpu: Fix double unlock in amdgpu_mes_add_ring
This patch addresses a double unlock issue in the amdgpu_mes_add_ring
function. The mutex was being unlocked twice under certain error
conditions, which could lead to undefined behavior.

The fix ensures that the mutex is unlocked only once before jumping to
the clean_up_memory label. The unlock operation is moved to just before
the goto statement within the conditional block that checks the return
value of amdgpu_ring_init. This prevents the second unlock attempt after
the clean_up_memory label, which is no longer necessary as the mutex is
already unlocked by this point in the code flow.

This change resolves the potential double unlock and maintains the
correct mutex handling throughout the function.
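
The control flow described above can be sketched in user space as follows. This is a simplified illustration, not the driver function: `tracked_lock` is a hypothetical lock that counts unlocks made while it is not held, and the two failure flags stand in for the ring init and hardware queue calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock that records unlocks made while not held. */
struct tracked_lock {
    int held;
    int bad_unlocks;
};

static void tracked_lock_acquire(struct tracked_lock *l)
{
    assert(l != NULL && !l->held);
    l->held = 1;
}

static void tracked_lock_release(struct tracked_lock *l)
{
    if (!l->held)
        l->bad_unlocks++; /* this would be a double unlock */
    l->held = 0;
}

static int add_ring(struct tracked_lock *l, int init_fails,
                    int add_queue_fails)
{
    int r;

    tracked_lock_acquire(l);

    r = init_fails ? -1 : 0;
    if (r) {
        tracked_lock_release(l); /* unlock before jumping */
        goto clean_up_memory;
    }

    tracked_lock_release(l);

    r = add_queue_fails ? -2 : 0;
    if (r)
        goto clean_up_ring;

    return 0;

clean_up_ring:
    /* Ring teardown would happen here. */
clean_up_memory:
    /* No unlock here: every path reaching this label already released it. */
    return r;
}
```

Each path releases the lock exactly once, so the shared cleanup labels never need to know whether it is still held.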

Fixes below:
Commit d0c423b647 ("drm/amdgpu/mes: use ring for kernel queue
submission"), leads to the following Smatch static checker warning:

	drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1240 amdgpu_mes_add_ring()
	warn: double unlock '&adev->mes.mutex_hidden' (orig line 1213)

drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
    1143 int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
    1144                         int queue_type, int idx,
    1145                         struct amdgpu_mes_ctx_data *ctx_data,
    1146                         struct amdgpu_ring **out)
    1147 {
    1148         struct amdgpu_ring *ring;
    1149         struct amdgpu_mes_gang *gang;
    1150         struct amdgpu_mes_queue_properties qprops = {0};
    1151         int r, queue_id, pasid;
    1152
    1153         /*
    1154          * Avoid taking any other locks under MES lock to avoid circular
    1155          * lock dependencies.
    1156          */
    1157         amdgpu_mes_lock(&adev->mes);
    1158         gang = idr_find(&adev->mes.gang_id_idr, gang_id);
    1159         if (!gang) {
    1160                 DRM_ERROR("gang id %d doesn't exist\n", gang_id);
    1161                 amdgpu_mes_unlock(&adev->mes);
    1162                 return -EINVAL;
    1163         }
    1164         pasid = gang->process->pasid;
    1165
    1166         ring = kzalloc(sizeof(struct amdgpu_ring), GFP_KERNEL);
    1167         if (!ring) {
    1168                 amdgpu_mes_unlock(&adev->mes);
    1169                 return -ENOMEM;
    1170         }
    1171
    1172         ring->ring_obj = NULL;
    1173         ring->use_doorbell = true;
    1174         ring->is_mes_queue = true;
    1175         ring->mes_ctx = ctx_data;
    1176         ring->idx = idx;
    1177         ring->no_scheduler = true;
    1178
    1179         if (queue_type == AMDGPU_RING_TYPE_COMPUTE) {
    1180                 int offset = offsetof(struct amdgpu_mes_ctx_meta_data,
    1181                                       compute[ring->idx].mec_hpd);
    1182                 ring->eop_gpu_addr =
    1183                         amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
    1184         }
    1185
    1186         switch (queue_type) {
    1187         case AMDGPU_RING_TYPE_GFX:
    1188                 ring->funcs = adev->gfx.gfx_ring[0].funcs;
    1189                 ring->me = adev->gfx.gfx_ring[0].me;
    1190                 ring->pipe = adev->gfx.gfx_ring[0].pipe;
    1191                 break;
    1192         case AMDGPU_RING_TYPE_COMPUTE:
    1193                 ring->funcs = adev->gfx.compute_ring[0].funcs;
    1194                 ring->me = adev->gfx.compute_ring[0].me;
    1195                 ring->pipe = adev->gfx.compute_ring[0].pipe;
    1196                 break;
    1197         case AMDGPU_RING_TYPE_SDMA:
    1198                 ring->funcs = adev->sdma.instance[0].ring.funcs;
    1199                 break;
    1200         default:
    1201                 BUG();
    1202         }
    1203
    1204         r = amdgpu_ring_init(adev, ring, 1024, NULL, 0,
    1205                              AMDGPU_RING_PRIO_DEFAULT, NULL);
    1206         if (r)
    1207                 goto clean_up_memory;
    1208
    1209         amdgpu_mes_ring_to_queue_props(adev, ring, &qprops);
    1210
    1211         dma_fence_wait(gang->process->vm->last_update, false);
    1212         dma_fence_wait(ctx_data->meta_data_va->last_pt_update, false);
    1213         amdgpu_mes_unlock(&adev->mes);
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    1214
    1215         r = amdgpu_mes_add_hw_queue(adev, gang_id, &qprops, &queue_id);
    1216         if (r)
    1217                 goto clean_up_ring;
                         ^^^^^^^^^^^^^^^^^^

    1218
    1219         ring->hw_queue_id = queue_id;
    1220         ring->doorbell_index = qprops.doorbell_off;
    1221
    1222         if (queue_type == AMDGPU_RING_TYPE_GFX)
    1223                 sprintf(ring->name, "gfx_%d.%d.%d", pasid, gang_id, queue_id);
    1224         else if (queue_type == AMDGPU_RING_TYPE_COMPUTE)
    1225                 sprintf(ring->name, "compute_%d.%d.%d", pasid, gang_id,
    1226                         queue_id);
    1227         else if (queue_type == AMDGPU_RING_TYPE_SDMA)
    1228                 sprintf(ring->name, "sdma_%d.%d.%d", pasid, gang_id,
    1229                         queue_id);
    1230         else
    1231                 BUG();
    1232
    1233         *out = ring;
    1234         return 0;
    1235
    1236 clean_up_ring:
    1237         amdgpu_ring_fini(ring);
    1238 clean_up_memory:
    1239         kfree(ring);
--> 1240         amdgpu_mes_unlock(&adev->mes);
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    1241         return r;
    1242 }
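
The flow flagged above can be reproduced in miniature. The following is only a sketch under stated assumptions: `toy_lock`, `add_ring_reported()` and `add_ring_balanced()` are hypothetical stand-ins, not driver code, and the balanced variant shows one possible way to keep lock and unlock paired, not necessarily the change made by this commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the MES mutex; counts unbalanced unlocks. */
struct toy_lock {
	bool held;
	int double_unlocks;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* the toy lock is not recursive */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	if (!l->held)
		l->double_unlocks++;	/* what the checker warns about */
	l->held = false;
}

/* Same shape as the listing: unlock mid-function, unlock again in cleanup. */
static int add_ring_reported(struct toy_lock *l, bool init_fails,
			     bool hw_queue_fails)
{
	int r = 0;

	toy_lock_acquire(l);
	if (init_fails) {
		r = -1;
		goto clean_up_memory;
	}
	toy_lock_release(l);		/* corresponds to line 1213 */
	if (hw_queue_fails) {
		r = -2;
		goto clean_up_ring;
	}
	return 0;

clean_up_ring:
clean_up_memory:
	toy_lock_release(l);		/* corresponds to line 1240 */
	return r;
}

/* Balanced variant: every path releases the lock exactly once. */
static int add_ring_balanced(struct toy_lock *l, bool init_fails,
			     bool hw_queue_fails)
{
	int r = 0;

	toy_lock_acquire(l);
	if (init_fails) {
		r = -1;
		toy_lock_release(l);
		goto clean_up_memory;
	}
	toy_lock_release(l);
	if (hw_queue_fails) {
		r = -2;
		goto clean_up_ring;
	}
	return 0;

clean_up_ring:
clean_up_memory:
	return r;
}
```

In this sketch, a late failure (`hw_queue_fails`) in `add_ring_reported()` releases the lock twice, while `add_ring_balanced()` releases it once on every path.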

Fixes: d0c423b647 ("drm/amdgpu/mes: use ring for kernel queue submission")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bfaf188360)
2024-10-15 11:49:08 -04:00
Michael Chen
7760d7f93c drm/amdgpu/mes: fix issue of writing to the same log buffer from 2 MES pipes
With Unified MES enabled in gfx12, the 2 MES pipes need separate event
log buffers so that they do not overwrite each other's data.

Signed-off-by: Michael Chen <michael.chen@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 144df260f3)
Cc: stable@vger.kernel.org # 6.11.x
2024-10-15 11:48:36 -04:00
Mohammed Anees
c0ec082f10 drm/amdgpu: prevent BO_HANDLES error from being overwritten
Before this patch, if multiple BO_HANDLES chunks were submitted,
the error -EINVAL would be correctly set but could be overwritten
by the return value from amdgpu_cs_p1_bo_handles(). This patch
ensures that if there are multiple BO_HANDLES, we stop.
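
The overwrite described above can be illustrated with a simplified sketch. It assumes a parsing loop of roughly this shape; `toy_bo_handles()`, `parse_overwriting()` and `parse_stopping()` are hypothetical names for illustration and do not reproduce the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for amdgpu_cs_p1_bo_handles(); always succeeds. */
static int toy_bo_handles(bool *have_list)
{
	assert(have_list != NULL);
	*have_list = true;
	return 0;
}

/* Before: -EINVAL is recorded for a second chunk, then overwritten. */
static int parse_overwriting(int nr_bo_handles_chunks)
{
	bool have_list = false;
	int ret = 0;

	for (int i = 0; i < nr_bo_handles_chunks; i++) {
		if (have_list)
			ret = -EINVAL;			/* set here ... */
		ret = toy_bo_handles(&have_list);	/* ... lost here */
		if (ret)
			return ret;
	}
	return ret;
}

/* After: stop as soon as a second BO_HANDLES chunk is seen. */
static int parse_stopping(int nr_bo_handles_chunks)
{
	bool have_list = false;
	int ret = 0;

	for (int i = 0; i < nr_bo_handles_chunks; i++) {
		if (have_list)
			return -EINVAL;
		ret = toy_bo_handles(&have_list);
		if (ret)
			return ret;
	}
	return ret;
}
```

With two chunks, `parse_overwriting()` reports success because the later call replaces the error, whereas `parse_stopping()` returns `-EINVAL`.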

Fixes: fec5f8e8c6 ("drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit")
Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 40f2cd9882)
Cc: stable@vger.kernel.org
2024-10-15 11:48:05 -04:00
Alex Deucher
d2c72d96df drm/amdgpu: enable enforce_isolation sysfs node on VFs
It should be enabled on both bare metal and VFs.

Fixes: e189be9b2e ("drm/amdgpu: Add enforce_isolation sysfs attribute")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Cc: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
(cherry picked from commit dc8847b054)
2024-10-15 11:47:41 -04:00
Dave Airlie
fc4d262721 amd-drm-fixes-6.12-2024-10-08:
amdgpu:
 - Fix invalid UBSAN warnings
 - Fix artifacts in MPO transitions
 - Hibernation fix
 
 amdkfd:
 - Fix an eviction fence leak
 
 radeon:
 - Add late register for connectors
 - Always set GEM function pointers

Merge tag 'amd-drm-fixes-6.12-2024-10-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008142831.3739244-1-alexander.deucher@amd.com
2024-10-09 16:31:16 +10:00
Hamza Mahfooz
79bc412ef7 drm/amd/display: fix hibernate entry for DCN35+
Two suspend-resume cycles are required to enter hibernate, and idle
optimizations only need to be enabled in the first cycle (which is
pretty much equivalent to s2idle). We can therefore check in_s0ix to
prevent the system from entering idle optimizations before it actually
enters hibernate (from display's perspective). Also, call
dc_set_power_state() before dc_allow_idle_optimizations(), since it's
safer to do so because dc_set_power_state() writes to DMUB.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2fe79508d9)
Cc: stable@vger.kernel.org # 6.10+
2024-10-07 14:59:28 -04:00
Josip Pavic
0a9906cc45 drm/amd/display: Clear update flags after update has been applied
[Why]
Since the surface/stream update flags aren't cleared after applying
updates, those same updates may be applied again in a future call to
update surfaces/streams for surfaces/streams that aren't actually part
of that update (i.e. applying an update for one surface/stream can
trigger unintended programming on a different surface/stream).

For example, when an update results in a call to
program_front_end_for_ctx, that function may call program_pipe on all
pipes. If there are surface update flags that were never cleared on the
surface some pipe is attached to, then the same update will be
programmed again.

[How]
Clear the surface and stream update flags after applying the updates.
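
The effect described in [Why] can be sketched with a toy model. This is an illustration under stated assumptions only: `toy_surface`, `program_all_pipes()` and `apply_update()` are hypothetical and far simpler than the real DC structures and functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical surface with a single update flag and a programming counter. */
struct toy_surface {
	bool update_pending;
	int times_programmed;
};

/* Programs every surface whose flag is set, like a pass over all pipes. */
static void program_all_pipes(struct toy_surface *s, int n)
{
	for (int i = 0; i < n; i++)
		if (s[i].update_pending)
			s[i].times_programmed++;
}

/* Applies an update to one surface; optionally clears flags afterwards. */
static void apply_update(struct toy_surface *s, int n, int target,
			 bool clear_after)
{
	assert(target >= 0 && target < n);
	s[target].update_pending = true;
	program_all_pipes(s, n);
	if (clear_after)
		for (int i = 0; i < n; i++)
			s[i].update_pending = false;
}
```

Updating surface 0 and then surface 1 without clearing programs surface 0 twice; with clearing, each surface is programmed once.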

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3441
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3616
Cc: Melissa Wen <mwen@igalia.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Josip Pavic <Josip.Pavic@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7671f62c10)
Cc: stable@vger.kernel.org
2024-10-07 14:58:58 -04:00
Alex Deucher
d6b9f492e2 drm/amdgpu: partially revert powerplay __counted_by changes
Partially revert
commit 0ca9f757a0 ("drm/amd/pm: powerplay: Add `__counted_by` attribute for flexible arrays")

The count attribute for these arrays does not get set until
after the arrays are allocated and populated, leading to false
UBSAN warnings.

Fixes: 0ca9f757a0 ("drm/amd/pm: powerplay: Add `__counted_by` attribute for flexible arrays")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3662
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8a5ae927b6)
Cc: stable@vger.kernel.org
2024-10-07 14:58:26 -04:00
Lang Yu
d7d7b947a4 drm/amdkfd: Fix an eviction fence leak
Only create a new reference for each process instead of for each VM.

Fixes: 9a1c1339ab ("drm/amdkfd: Run restore_workers on freezable WQs")
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5fa4362894)
Cc: stable@vger.kernel.org
2024-10-07 14:53:23 -04:00
Linus Torvalds
fe6fceceae drm fixes for 6.12-rc2
atomic:
 - Use correct type when reading damage rectangles
 
 display:
 - Fix kernel docs
 
 dp-mst:
 - Fix DSC decompression detection
 
 hdmi:
 - Fix infoframe size
 
 sched:
 - Update maintainers
 - Fix race condition when queueing up jobs
 - Fix locking in drm_sched_entity_modify_sched()
 - Fix pointer deref if entity queue changes
 
 sysfb:
 - Disable sysfb if framebuffer parent device is unknown
 
 amdgpu:
 - DML2 fix
 - DSC fix
 - Dispclk fix
 - eDP HDR fix
 - IPS fix
 - TBT fix
 
 i915:
 - One fix for bitwise and logical "and" mixup in PM code
 
 xe:
 - Restore pci state on resume
 - Fix locking on submission, queue and vm
 - Fix UAF on queue destruction
 - Fix resource release on freq init error path
 - Use rw_semaphore to reduce contention on ASID->VM lookup
 - Fix steering for media on Xe2_HPM
 - Tuning updates to Xe2
 - Resume TDR after GT reset to prevent jobs running forever
 - Move id allocation to avoid userspace using a guessed number
   to trigger UAF
 - Fix OA stream close preventing pbatch buffers to complete
 - Fix NPD when migrating memory on LNL
 - Fix memory leak when aborting binds
 
 panthor:
 - Fix locking
 - Set FOP_UNSIGNED_OFFSET in fops instance
 - Acquire lock in panthor_vm_prepare_map_op_ctx()
 - Avoid uninitialized variable in tick_ctx_cleanup()
 - Do not block scheduler queue if work is pending
 - Do not add write fences to the shared BOs
 
 vbox:
 - Fix VLA handling

Merge tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Weekly fixes, xe and amdgpu lead the way, with panthor and a few core
  components getting various fixes. Nothing seems too out of the
  ordinary.

  atomic:
   - Use correct type when reading damage rectangles

  display:
   - Fix kernel docs

  dp-mst:
   - Fix DSC decompression detection

  hdmi:
   - Fix infoframe size

  sched:
   - Update maintainers
   - Fix race condition when queueing up jobs
   - Fix locking in drm_sched_entity_modify_sched()
   - Fix pointer deref if entity queue changes

  sysfb:
   - Disable sysfb if framebuffer parent device is unknown

  amdgpu:
   - DML2 fix
   - DSC fix
   - Dispclk fix
   - eDP HDR fix
   - IPS fix
   - TBT fix

  i915:
   - One fix for bitwise and logical "and" mixup in PM code

  xe:
   - Restore pci state on resume
   - Fix locking on submission, queue and vm
   - Fix UAF on queue destruction
   - Fix resource release on freq init error path
   - Use rw_semaphore to reduce contention on ASID->VM lookup
   - Fix steering for media on Xe2_HPM
   - Tuning updates to Xe2
   - Resume TDR after GT reset to prevent jobs running forever
   - Move id allocation to avoid userspace using a guessed number to
     trigger UAF
   - Fix OA stream close preventing pbatch buffers to complete
   - Fix NPD when migrating memory on LNL
   - Fix memory leak when aborting binds

  panthor:
   - Fix locking
   - Set FOP_UNSIGNED_OFFSET in fops instance
   - Acquire lock in panthor_vm_prepare_map_op_ctx()
   - Avoid uninitialized variable in tick_ctx_cleanup()
   - Do not block scheduler queue if work is pending
   - Do not add write fences to the shared BOs

  vbox:
   - Fix VLA handling"

* tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel: (41 commits)
  drm/xe: Fix memory leak when aborting binds
  drm/xe: Prevent null pointer access in xe_migrate_copy
  drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close
  drm/xe/queue: move xa_alloc to prevent UAF
  drm/xe/vm: move xa_alloc to prevent UAF
  drm/xe: Clean up VM / exec queue file lock usage.
  drm/xe: Resume TDR after GT reset
  drm/xe/xe2: Add performance tuning for L3 cache flushing
  drm/xe/xe2: Extend performance tuning to media GT
  drm/xe/mcr: Use Xe2_LPM steering tables for Xe2_HPM
  drm/xe: Use helper for ASID -> VM in GPU faults and access counters
  drm/xe: Convert to USM lock to rwsem
  drm/xe: use devm_add_action_or_reset() helper
  drm/xe: fix UAF around queue destruction
  drm/xe/guc_submit: add missing locking in wedged_fini
  drm/xe: Restore pci state upon resume
  drm/amd/display: Fix system hang while resume with TBT monitor
  drm/amd/display: Enable idle workqueue for more IPS modes
  drm/amd/display: Add HDR workaround for specific eDP
  drm/amd/display: avoid set dispclk to 0
  ...
2024-10-04 11:25:14 -07:00
Dave Airlie
156cc376a2 amd-drm-fixes-6.12-2024-10-02:
amdgpu:
 - DML2 fix
 - DSC fix
 - Dispclk fix
 - eDP HDR fix
 - IPS fix
 - TBT fix

Merge tag 'amd-drm-fixes-6.12-2024-10-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241002135831.2510790-1-alexander.deucher@amd.com
2024-10-03 10:02:52 +10:00