linux-yocto/drivers/accel
Karol Wachowski 6420a8d27e accel/ivpu: Trigger device recovery on engine reset/resume failure
[ Upstream commit a47e36dc5d ]

Trigger full device recovery when the driver fails to restore device state
via engine reset and resume operations. This is necessary because, even if
submissions from a faulty context are blocked, the NPU may still process
previously submitted faulty jobs if the engine reset fails to abort them.
Such jobs can continue to generate faults and occupy device resources.
When engine reset is ineffective, the only way to recover is to perform
a full device recovery.

Fixes: dad945c27a ("accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW")
Cc: stable@vger.kernel.org # v6.15+
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250528154253.500556-1-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-06 11:01:38 +02:00
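
The escalation described in the commit message above can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: `struct fake_npu` and every `fake_*` function are hypothetical stand-ins invented for this example, and the real ivpu functions and their signatures are not reproduced here. The sketch only shows the control flow: if either the engine reset or the engine resume reports failure, fall back to a full device recovery instead of continuing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the device state; not the real ivpu structure. */
struct fake_npu {
    bool reset_ok;      /* simulated outcome of the engine reset */
    bool resume_ok;     /* simulated outcome of the engine resume */
    int recoveries;     /* how many full device recoveries were triggered */
    const char *reason; /* reason recorded for the last recovery, if any */
};

/* Simulated engine reset: 0 on success, negative on failure. */
static int fake_engine_reset(struct fake_npu *dev)
{
    return dev->reset_ok ? 0 : -1;
}

/* Simulated engine resume: 0 on success, negative on failure. */
static int fake_engine_resume(struct fake_npu *dev)
{
    return dev->resume_ok ? 0 : -1;
}

/* Simulated full device recovery: records that it was requested and why. */
static void fake_trigger_recovery(struct fake_npu *dev, const char *reason)
{
    dev->recoveries++;
    dev->reason = reason;
    fprintf(stderr, "full device recovery: %s\n", reason);
}

/*
 * Try the lightweight path first (engine reset, then resume). If either
 * step fails, previously submitted faulty jobs may still be running, so
 * escalate to a full device recovery.
 * Returns 0 if the engine path succeeded, -1 if recovery was triggered.
 */
static int fake_abort_faulty_context(struct fake_npu *dev)
{
    assert(dev != NULL);

    if (fake_engine_reset(dev)) {
        fake_trigger_recovery(dev, "engine reset failed");
        return -1;
    }
    if (fake_engine_resume(dev)) {
        fake_trigger_recovery(dev, "engine resume failed");
        return -1;
    }
    return 0;
}
```

Calling `fake_abort_faulty_context()` on a device whose `reset_ok` or `resume_ok` is false leaves `recoveries` at 1 and returns -1. With both set to true it returns 0 and triggers no recovery.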
habanalabs   accel/habanalabs: gradual sleep in polling memory macro             2024-06-23 09:53:33 +03:00
ivpu         accel/ivpu: Trigger device recovery on engine reset/resume failure  2025-07-06 11:01:38 +02:00
qaic         accel/qaic: Mask out SR-IOV PCI resources                           2025-05-29 11:03:07 +02:00
drm_accel.c  accel: Use XArray instead of IDR for minors                         2024-08-26 17:06:22 +02:00
Kconfig      accel/qaic: Add qaic driver to the build system                     2023-04-06 08:23:03 +02:00
Makefile     accel/qaic: Add qaic driver to the build system                     2023-04-06 08:23:03 +02:00