linux-yocto

mirror of git://git.yoctoproject.org/linux-yocto.git synced 2025-08-22 00:42:01 +02:00

Author	SHA1	Message	Date
Justin Sanders	3e554f1153	aoe: defer rexmit timer downdev work to workqueue [ Upstream commit `cffc873d68` ] When aoe's rexmit_timer() notices that an aoe target fails to respond to commands for more than aoe_deadsecs, it calls aoedev_downdev() which cleans the outstanding aoe and block queues. This can involve sleeping, such as in blk_mq_freeze_queue(), which should not occur in irq context. This patch defers that aoedev_downdev() call to the aoe device's workqueue. Link: https://bugzilla.kernel.org/show_bug.cgi?id=212665 Signed-off-by: Justin Sanders <jsanders.devel@gmail.com> Link: https://lore.kernel.org/r/20250610170600.869-2-jsanders.devel@gmail.com Tested-By: Valentin Kleibel <valentin@vrvis.at> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:07 +02:00
Maurizio Lombardi	7296c938df	scsi: target: Fix NULL pointer dereference in core_scsi3_decode_spec_i_port() [ Upstream commit `d8ab68bdb2` ] The function core_scsi3_decode_spec_i_port(), in its error code path, unconditionally calls core_scsi3_lunacl_undepend_item() passing the dest_se_deve pointer, which may be NULL. This can lead to a NULL pointer dereference if dest_se_deve remains unset. SPC-3 PR SPEC_I_PT: Unable to locate dest_tpg Unable to handle kernel paging request at virtual address dfff800000000012 Call trace: core_scsi3_lunacl_undepend_item+0x2c/0xf0 [target_core_mod] (P) core_scsi3_decode_spec_i_port+0x120c/0x1c30 [target_core_mod] core_scsi3_emulate_pro_register+0x6b8/0xcd8 [target_core_mod] target_scsi3_emulate_pr_out+0x56c/0x840 [target_core_mod] Fix this by adding a NULL check before calling core_scsi3_lunacl_undepend_item() Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20250612101556.24829-1-mlombard@redhat.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:07 +02:00
Heiko Stuebner	3d546c8b10	regulator: fan53555: add enable_time support and soft-start times [ Upstream commit `8acfb165a4` ] The datasheets for all the fan53555 variants (and clones using the same interface) define so called soft start times, from enabling the regulator until at least some percentage of the output (i.e. 92% for the rk860x types) are available. The regulator framework supports this with the enable_time property but currently the fan53555 driver does not define enable_times for any variant. I ran into a problem with this while testing the new driver for the Rockchip NPUs (rocket), which does runtime-pm including disabling and enabling a rk8602 as needed. When reenabling the regulator while running a load, fatal hangs could be observed while enabling the associated power-domain, which the regulator supplies. Experimentally setting the regulator to always-on, made the issue disappear, leading to the missing delay to let power stabilize. And as expected, setting the enable-time to a non-zero value according to the datasheet also resolved the regulator-issue. The datasheets in nearly all cases only specify "typical" values, except for the fan53555 type 08. There both a typical and maximum value are listed - 40uS apart. For all typical values I've added 100uS to be on the safe side. Individual details for the relevant regulators below: - fan53526: The datasheet for all variants lists a typical value of 150uS, so make that 250uS with safety margin. - fan53555: types 08 and 18 (unsupported) are given a typical enable time of 135uS but also a maximum of 175uS so use that value. All the other types only have a typical time in the datasheet of 300uS, so give a bit margin by setting it to 400uS. - rk8600 + rk8602: Datasheet reports a typical value of 260us, so use 360uS to be safe. - syr82x + syr83x: All datasheets report typical soft-start values of 300uS for these regulators, so use 400uS. - tcs452x: Datasheet sadly does not report a soft-start time, so I've not set an enable-time Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patch.msgid.link/20250606190418.478633-1-heiko@sntech.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:07 +02:00
Raven Black	2ec1cc322a	ASoC: amd: yc: update quirk data for HP Victus [ Upstream commit `13b86ea92e` ] Make the internal microphone work on HP Victus laptops. Signed-off-by: Raven Black <ravenblack@gmail.com> Link: https://patch.msgid.link/20250613-support-hp-victus-microphone-v1-1-bebc4c3a2041@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:07 +02:00
Madhavan Srinivasan	39e36a744e	powerpc: Fix struct termio related ioctl macros [ Upstream commit `ab10727660` ] Since termio interface is now obsolete, include/uapi/asm/ioctls.h has some constant macros referring to "struct termio", this caused build failure at userspace. In file included from /usr/include/asm/ioctl.h:12, from /usr/include/asm/ioctls.h:5, from tst-ioctls.c:3: tst-ioctls.c: In function 'get_TCGETA': tst-ioctls.c:12:10: error: invalid application of 'sizeof' to incomplete type 'struct termio' 12 \| return TCGETA; \| ^~~~~~ Even though termios.h provides "struct termio", trying to juggle definitions around to make it compile could introduce regressions. So better to open code it. Reported-by: Tulio Magno <tuliom@ascii.art.br> Suggested-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> Closes: https://lore.kernel.org/linuxppc-dev/8734dji5wl.fsf@ascii.art.br/ Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250517142237.156665-1-maddy@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:07 +02:00
Gyeyoung Baek	19bd759785	genirq/irq_sim: Initialize work context pointers properly [ Upstream commit `8a2277a3c9` ] Initialize `ops` member's pointers properly by using kzalloc() instead of kmalloc() when allocating the simulation work context. Otherwise the pointers contain random content leading to invalid dereferencing. Signed-off-by: Gyeyoung Baek <gye976@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250612124827.63259-1-gye976@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:07 +02:00
Mario Limonciello	c584b9b62c	platform/x86/amd/pmc: Add PCSpecialist Lafite Pro V 14M to 8042 quirks list [ Upstream commit `9ba75ccad8` ] Every other s2idle cycle fails to reach hardware sleep when keyboard wakeup is enabled. This appears to be an EC bug, but the vendor refuses to fix it. It was confirmed that turning off i8042 wakeup avoids ths issue (albeit keyboard wakeup is disabled). Take the lesser of two evils and add it to the i8042 quirk list. Reported-by: Raoul <ein4rth@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220116 Tested-by: Raoul <ein4rth@gmail.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20250611203341.3733478-1-superm1@kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:06 +02:00
Gabriel Santese	f8155ee19d	ASoC: amd: yc: Add quirk for MSI Bravo 17 D7VF internal mic [ Upstream commit `ba06528ad5` ] MSI Bravo 17 (D7VF), like other laptops from the family, has broken ACPI tables and needs a quirk for internal mic to work properly. Signed-off-by: Gabriel Santese <santesegabriel@gmail.com> Link: https://patch.msgid.link/20250530005444.23398-1-santesegabriel@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:06 +02:00
Johannes Berg	c24c06bd14	ata: pata_cs5536: fix build on 32-bit UML [ Upstream commit `fe5b391fc5` ] On 32-bit ARCH=um, CONFIG_X86_32 is still defined, so it doesn't indicate building on real X86 machines. There's no MSR on UML though, so add a check for CONFIG_X86. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20250606090110.15784-2-johannes@sipsolutions.net Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:06 +02:00
Tasos Sahanidis	3ce57d493d	ata: libata-acpi: Do not assume 40 wire cable if no devices are enabled [ Upstream commit `33877220b8` ] On at least an ASRock 990FX Extreme 4 with a VIA VT6330, the devices have not yet been enabled by the first time ata_acpi_cbl_80wire() is called. This means that the ata_for_each_dev loop is never entered, and a 40 wire cable is assumed. The VIA controller on this board does not report the cable in the PCI config space, thus having to fall back to ACPI even though no SATA bridge is present. The _GTM values are correctly reported by the firmware through ACPI, which has already set up faster transfer modes, but due to the above the controller is forced down to a maximum of UDMA/33. Resolve this by modifying ata_acpi_cbl_80wire() to directly return the cable type. First, an unknown cable is assumed which preserves the mode set by the firmware, and then on subsequent calls when the devices have been enabled, an 80 wire cable is correctly detected. Since the function now directly returns the cable type, it is renamed to ata_acpi_cbl_pata_type(). Signed-off-by: Tasos Sahanidis <tasos@tasossah.com> Link: https://lore.kernel.org/r/20250519085945.1399466-1-tasos@tasossah.com Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:06 +02:00
Takashi Iwai	f42b8e5753	ALSA: sb: Force to disable DMAs once when DMA mode is changed [ Upstream commit `4c267ae2ef` ] When the DMA mode is changed on the (still real!) SB AWE32 after playing a stream and closing, the previous DMA setup was still silently kept, and it can confuse the hardware, resulting in the unexpected noises. As a workaround, enforce the disablement of DMA setups when the DMA setup is changed by the kcontrol. https://bugzilla.kernel.org/show_bug.cgi?id=218185 Link: https://patch.msgid.link/20250610064322.26787-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:06 +02:00
Takashi Iwai	c5e0af68c8	ALSA: sb: Don't allow changing the DMA mode during operations [ Upstream commit `ed29e073ba` ] When a PCM stream is already running, one shouldn't change the DMA mode via kcontrol, which may screw up the hardware. Return -EBUSY instead. Link: https://bugzilla.kernel.org/show_bug.cgi?id=218185 Link: https://patch.msgid.link/20250610064322.26787-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:06 +02:00
Rob Clark	3f6ce8433a	drm/msm: Fix another leak in the submit error path [ Upstream commit `f681c2aa86` ] put_unused_fd() doesn't free the installed file, if we've already done fd_install(). So we need to also free the sync_file. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/653583/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:06 +02:00
Rob Clark	0eaa495b3d	drm/msm: Fix a fence leak in submit error path [ Upstream commit `5d319f75cc` ] In error paths, we could unref the submit without calling drm_sched_entity_push_job(), so msm_job_free() will never get called. Since drm_sched_job_cleanup() will NULL out the s_fence, we can use that to detect this case. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/653584/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:05 +02:00
Ewan D. Milne	c0527f7534	scsi: lpfc: Restore clearing of NLP_UNREG_INP in ndlp->nlp_flag [ Upstream commit `040492ac25` ] Commit `32566a6f1a` ("scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist structure") introduced a regression with SLI-3 adapters (e.g. LPe12000 8Gb) where a Link Down / Link Up such as caused by disabling an host FC switch port would result in the devices remaining in the transport-offline state and multipath reporting them as failed. This problem was not seen with newer SLI-4 adapters. The problem was caused by portions of the patch which removed the functions __lpfc_sli_rpi_release() and lpfc_sli_rpi_release() and all their callers. This was presumably because with the removal of the NLP_RELEASE_RPI flag there was no need to free the rpi. However, __lpfc_sli_rpi_release() and lpfc_sli_rpi_release() which calls it reset the NLP_UNREG_INP flag. And, lpfc_sli_def_mbox_cmpl() has a path where __lpfc_sli_rpi_release() was called in a particular case where NLP_UNREG_INP was not otherwise cleared because of other conditions. Restoring the else clause of this conditional and simply clearing the NLP_UNREG_INP flag appears to resolve the problem with SLI-3 adapters. It should be noted that the code path in question is not specific to SLI-3, but there are other SLI-4 code paths which may have masked the issue. Fixes: `32566a6f1a` ("scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist structure") Cc: stable@vger.kernel.org Tested-by: Marco Patalano <mpatalan@redhat.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Link: https://lore.kernel.org/r/20250317163731.356873-1-emilne@redhat.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:05 +02:00
Tejun Heo	790ce73721	sched_ext: Make scx_group_set_weight() always update tg->scx.weight [ Upstream commit `c50784e99f` ] Otherwise, tg->scx.weight can go out of sync while scx_cgroup is not enabled and ops.cgroup_init() may be called with a stale weight value. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `8195136669` ("sched_ext: Add cgroup support") Cc: stable@vger.kernel.org # v6.12+ Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:05 +02:00
Alex Deucher	7ccaa5fa5d	drm/amdgpu/mes: add missing locking in helper functions [ Upstream commit `40f970ba7a` ] We need to take the MES lock. Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:05 +02:00
Johan Hovold	238a218d42	arm64: dts: qcom: x1e80100-crd: mark l12b and l15b always-on [ Upstream commit `abf89bc4bb` ] The l12b and l15b supplies are used by components that are not (fully) described (and some never will be) and must never be disabled. Mark the regulators as always-on to prevent them from being disabled, for example, when consumers probe defer or suspend. Fixes: `bd50b1f5b6` ("arm64: dts: qcom: x1e80100: Add Compute Reference Device") Cc: stable@vger.kernel.org # 6.8 Cc: Abel Vesa <abel.vesa@linaro.org> Cc: Rajendra Nayak <quic_rjendra@quicinc.com> Cc: Sibi Sankar <quic_sibis@quicinc.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20250314145440.11371-2-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:05 +02:00
Nicholas Kazlauskas	6464427589	drm/amd/display: Add more checks for DSC / HUBP ONO guarantees [ Upstream commit `0d57dd1765` ] [WHY] For non-zero DSC instances it's possible that the HUBP domain required to drive it for sequential ONO ASICs isn't met, potentially causing the logic to the tile to enter an undefined state leading to a system hang. [HOW] Add more checks to ensure that the HUBP domain matching the DSC instance is appropriately powered. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Duncan Ma <duncan.ma@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `da63df0711`) Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:05 +02:00
Frank Min	81ebb8d755	drm/amdgpu: add kicker fws loading for gfx11/smu13/psp13 [ Upstream commit `854171405e` ] 1. Add kicker firmwares loading for gfx11/smu13/psp13 2. Register additional MODULE_FIRMWARE entries for kicker fws - gc_11_0_0_rlc_kicker.bin - gc_11_0_0_imu_kicker.bin - psp_13_0_0_sos_kicker.bin - psp_13_0_0_ta_kicker.bin - smu_13_0_0_kicker.bin Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `fb5ec2174d`) Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:04 +02:00
Imre Deak	710deaff6a	drm/i915/dp_mst: Work around Thunderbolt sink disconnect after SINK_COUNT_ESI read [ Upstream commit `9cb1547891` ] Due to a problem in the iTBT DP-in adapter's firmware the sink on a TBT link may get disconnected inadvertently if the SINK_COUNT_ESI and the DP_LINK_SERVICE_IRQ_VECTOR_ESI0 registers are read in a single AUX transaction. Work around the issue by reading these registers in separate transactions. The issue affects MTL+ platforms and will be fixed in the DP-in adapter firmware, however releasing that firmware fix may take some time and is not guaranteed to be available for all systems. Based on this apply the workaround on affected platforms. See HSD #13013007775. v2: Cc'ing Mika Westerberg. Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13760 Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14147 Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250519133417.1469181-1-imre.deak@intel.com (cherry picked from commit `c3a48363cf`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:04 +02:00
Sonny Jiang	b47a1f9323	drm/amdgpu: VCN v5_0_1 to prevent FW checking RB during DPG pause [ Upstream commit `46e15197b5` ] Add a protection to ensure programming are all complete prior VCPU starting. This is a WA for an unintended VCPU running. Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `c29521b529`) Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:04 +02:00
Thomas Zimmermann	4f77d8f8a9	drm/simpledrm: Do not upcast in release helpers [ Upstream commit `d231cde7c8` ] The res pointer passed to simpledrm_device_release_clocks() and simpledrm_device_release_regulators() points to an instance of struct simpledrm_device. No need to upcast from struct drm_device. The upcast is harmless, as DRM device is the first field in struct simpledrm_device. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: `11e8f5fd22` ("drm: Add simpledrm driver") Cc: <stable@vger.kernel.org> # v5.14+ Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250407134753.985925-2-tzimmermann@suse.de Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:04 +02:00
Stephen Smalley	acf9ab15ec	selinux: change security_compute_sid to return the ssid or tsid on match [ Upstream commit `fde46f60f6` ] If the end result of a security_compute_sid() computation matches the ssid or tsid, return that SID rather than looking it up again. This avoids the problem of multiple initial SIDs that map to the same context. Cc: stable@vger.kernel.org Reported-by: Guido Trentalancia <guido@trentalancia.com> Fixes: `ae254858ce` ("selinux: introduce an initial SID for early boot processes") Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Tested-by: Guido Trentalancia <guido@trentalancia.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:04 +02:00
Michal Wajdeczko	6d0b588614	drm/xe/guc: Explicitly exit CT safe mode on unwind [ Upstream commit `ad40098da5` ] During driver probe we might be briefly using CT safe mode, which is based on a delayed work, but usually we are able to stop this once we have IRQ fully operational. However, if we abort the probe quite early then during unwind we might try to destroy the workqueue while there is still a pending delayed work that attempts to restart itself which triggers a WARN. This was recently observed during unsuccessful VF initialization: [ ] xe 0000:00:02.1: probe with driver xe failed with error -62 [ ] ------------[ cut here ]------------ [ ] workqueue: cannot queue safe_mode_worker_func [xe] on wq xe-g2h-wq [ ] WARNING: CPU: 9 PID: 0 at kernel/workqueue.c:2257 __queue_work+0x287/0x710 [ ] RIP: 0010:__queue_work+0x287/0x710 [ ] Call Trace: [ ] delayed_work_timer_fn+0x19/0x30 [ ] call_timer_fn+0xa1/0x2a0 Exit the CT safe mode on unwind to avoid that warning. Fixes: `09b286950f` ("drm/xe/guc: Allow CTB G2H processing without G2H IRQ") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250612220937.857-3-michal.wajdeczko@intel.com (cherry picked from commit `2ddbb73ec2`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:04 +02:00
John Harrison	ff6482fb45	drm/xe/guc: Dead CT helper [ Upstream commit `d2c5a5a926` ] Add a worker function helper for asynchronously dumping state when an internal/fatal error is detected in CT processing. Being asynchronous is required to avoid deadlocks and scheduling-while-atomic or process-stalled-for-too-long issues. Also check for a bunch more error conditions and improve the handling of some existing checks. v2: Use compile time CONFIG check for new (but not directly CT_DEAD related) checks and use unsigned int for a bitmask, rename CT_DEAD_RESET to CT_DEAD_REARM and add some explaining comments, rename 'hxg' macro parameter to 'ctb' - review feedback from Michal W. Drop CT_DEAD_ALIVE as no need for a bitfield define to just set the entire mask to zero. v3: Fix kerneldoc v4: Nullify some floating pointers after free. v5: Add section headings and device info to make the state dump look more like a devcoredump to allow parsing by the same tools (eventual aim is to just call the devcoredump code itself, but that currently requires an xe_sched_job, which is not available in the CT code). v6: Fix potential for leaking snapshots with concurrent error conditions (review feedback from Julia F). v7: Don't complain about unexpected G2H messages yet because there is a known issue causing them. Fix bit shift bug with v6 change. Add GT id to fake coredump headers and use puts instead of printf. v8: Disable the head mis-match check in g2h_read because it is failing on various discrete platforms due to unknown reasons. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-9-John.C.Harrison@Intel.com Stable-dep-of: `ad40098da5` ("drm/xe/guc: Explicitly exit CT safe mode on unwind") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:04 +02:00
Nitin Gote	e595433c63	drm/xe: Replace double space with single space after comma [ Upstream commit `cd89de14bb` ] Avoid using double space, ", " in function or macro parameters where it's not required by any alignment purpose. Replace it with a single space, ", ". Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240823080643.2461992-1-nitin.r.gote@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Stable-dep-of: `ad40098da5` ("drm/xe/guc: Explicitly exit CT safe mode on unwind") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:03 +02:00
Matthew Auld	0dadcd17e2	drm/xe: move DPT l2 flush to a more sensible place [ Upstream commit `f16873f42a` ] Only need the flush for DPT host updates here. Normal GGTT updates don't need special flush. Fixes: `01570b4469` ("drm/xe/bmg: implement Wa_16023588340") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250606104546.1996818-4-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `35db1da40c`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:03 +02:00
Niranjana Vishwanathapura	1883a83695	drm/xe: Allow bo mapping on multiple ggtts [ Upstream commit `5a3b0df25d` ] Make bo->ggtt an array to support bo mapping on multiple ggtts. Add XE_BO_FLAG_GGTTx flags to map the bo on ggtt of tile 'x'. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241120000222.204095-2-John.C.Harrison@Intel.com Stable-dep-of: `f16873f42a` ("drm/xe: move DPT l2 flush to a more sensible place") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:03 +02:00
Juha-Pekka Heikkila	ce1ef3b64e	drm/xe: add interface to request physical alignment for buffer objects [ Upstream commit `3ad86ae1da` ] Add xe_bo_create_pin_map_at_aligned() which augment xe_bo_create_pin_map_at() with alignment parameter allowing to pass required alignemnt if it differ from default. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Mika Kahola <mika.kahola@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009151947.2240099-2-juhapekka.heikkila@gmail.com Stable-dep-of: `f16873f42a` ("drm/xe: move DPT l2 flush to a more sensible place") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:03 +02:00
Maarten Lankhorst	98e5c71e7e	drm/xe: Move DSB l2 flush to a more sensible place [ Upstream commit `a4b1b51ae1` ] Flushing l2 is only needed after all data has been written. Fixes: `01570b4469` ("drm/xe/bmg: implement Wa_16023588340") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/20250606104546.1996818-3-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `0dd2dd0182`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:03 +02:00
Maarten Lankhorst	e5f01b2b67	drm/xe: Fix DSB buffer coherency [ Upstream commit `71a3161e9d` ] Add the scanout flag to force WC caching, and add the memory barrier where needed. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240913114754.7956-2-maarten.lankhorst@linux.intel.com Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Stable-dep-of: `a4b1b51ae1` ("drm/xe: Move DSB l2 flush to a more sensible place") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:03 +02:00
Christophe JAILLET	61628111e7	mfd: exynos-lpass: Fix another error handling path in exynos_lpass_probe() [ Upstream commit `f41cc37f4b` ] If devm_of_platform_populate() fails, some clean-up needs to be done, as already done in the remove function. Add a new devm_add_action_or_reset() to fix the leak in the probe and remove the need of a remove function. Fixes: `c695abab24` ("mfd: Add Samsung Exynos Low Power Audio Subsystem driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/69471e839efc0249a504492a8de3497fcdb6a009.1745247209.git.christophe.jaillet@wanadoo.fr Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:02 +02:00
David Howells	e0fefe9bc0	netfs: Fix oops in write-retry from mis-resetting the subreq iterator [ Upstream commit `4481f7f2b3` ] Fix the resetting of the subrequest iterator in netfs_retry_write_stream() to use the iterator-reset function as the iterator may have been shortened by a previous retry. In such a case, the amount of data to be written by the subrequest is not "subreq->len" but "subreq->len - subreq->transferred". Without this, KASAN may see an error in iov_iter_revert(): BUG: KASAN: slab-out-of-bounds in iov_iter_revert lib/iov_iter.c:633 [inline] BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x443/0x5a0 lib/iov_iter.c:611 Read of size 4 at addr ffff88802912a0b8 by task kworker/u32:7/1147 CPU: 1 UID: 0 PID: 1147 Comm: kworker/u32:7 Not tainted 6.15.0-rc6-syzkaller-00052-g9f35e33144ae #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Workqueue: events_unbound netfs_write_collection_worker Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xc3/0x670 mm/kasan/report.c:521 kasan_report+0xe0/0x110 mm/kasan/report.c:634 iov_iter_revert lib/iov_iter.c:633 [inline] iov_iter_revert+0x443/0x5a0 lib/iov_iter.c:611 netfs_retry_write_stream fs/netfs/write_retry.c:44 [inline] netfs_retry_writes+0x166d/0x1a50 fs/netfs/write_retry.c:231 netfs_collect_write_results fs/netfs/write_collect.c:352 [inline] netfs_write_collection_worker+0x23fd/0x3830 fs/netfs/write_collect.c:374 process_one_work+0x9cf/0x1b70 kernel/workqueue.c:3238 process_scheduled_works kernel/workqueue.c:3319 [inline] worker_thread+0x6c8/0xf10 kernel/workqueue.c:3400 kthread+0x3c2/0x780 kernel/kthread.c:464 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> Fixes: `cd0277ed0c` ("netfs: Use new folio_queue data type and iterator instead of xarray iter") Reported-by: syzbot+25b83a6f2c702075fcbc@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=25b83a6f2c702075fcbc Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250519090707.2848510-2-dhowells@redhat.com Tested-by: syzbot+25b83a6f2c702075fcbc@syzkaller.appspotmail.com cc: Paulo Alcantara <pc@manguebit.com> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:02 +02:00
Beleswar Padhi	c2a952fb41	remoteproc: k3-r5: Refactor sequential core power up/down operations [ Upstream commit `701177511a` ] The existing implementation of the waiting mechanism in "k3_r5_cluster_rproc_init()" waits for the "released_from_reset" flag to be set as part of the firmware boot process in "k3_r5_rproc_start()". The "k3_r5_cluster_rproc_init()" function is invoked in the probe routine which causes unexpected failures in cases where the firmware is unavailable at boot time, resulting in probe failure and removal of the remoteproc handles in the sysfs paths. To address this, the waiting mechanism is refactored out of the probe routine into the appropriate "k3_r5_rproc_{prepare/unprepare}()" functions. This allows the probe routine to complete without depending on firmware booting, while still maintaining the required power-synchronization between cores. Further, this wait mechanism is dropped from "k3_r5_rproc_{start/stop}()" functions as they deal with Core Run/Halt operations, and as such, there is no constraint in Running or Halting the cores of a cluster in order. Fixes: `61f6f68447` ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1") Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Tested-by: Judith Mendez <jm@ti.com> Reviewed-by: Andrew Davis <afd@ti.com> Link: https://lore.kernel.org/r/20250513054510.3439842-4-b-padhi@ti.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:02 +02:00
Beleswar Padhi	b14a64c1a9	remoteproc: k3-r5: Use devm_rproc_add() helper [ Upstream commit `de182d2f5c` ] Use device lifecycle managed devm_rproc_add() helper function. This helps prevent mistakes like deleting out of order in cleanup functions and forgetting to delete on all error paths. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Reviewed-by: Andrew Davis <afd@ti.com> Link: https://lore.kernel.org/r/20241219110545.1898883-5-b-padhi@ti.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Stable-dep-of: `701177511a` ("remoteproc: k3-r5: Refactor sequential core power up/down operations") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:02 +02:00
Beleswar Padhi	0ea3572c15	remoteproc: k3-r5: Use devm_ioremap_wc() helper [ Upstream commit `a572439f71` ] Use a device lifecycle managed ioremap helper function. This helps prevent mistakes like unmapping out of order in cleanup functions and forgetting to unmap on all error paths. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Reviewed-by: Andrew Davis <afd@ti.com> Link: https://lore.kernel.org/r/20241219110545.1898883-4-b-padhi@ti.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Stable-dep-of: `701177511a` ("remoteproc: k3-r5: Refactor sequential core power up/down operations") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:02 +02:00
Beleswar Padhi	e392148f7f	remoteproc: k3-r5: Use devm_kcalloc() helper [ Upstream commit `f2e3d0d709` ] Use a device lifecycle managed action to free memory. This helps prevent mistakes like freeing out of order in cleanup functions and forgetting to free on error paths. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Reviewed-by: Andrew Davis <afd@ti.com> Link: https://lore.kernel.org/r/20241219110545.1898883-3-b-padhi@ti.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Stable-dep-of: `701177511a` ("remoteproc: k3-r5: Refactor sequential core power up/down operations") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:02 +02:00
Beleswar Padhi	f802fb717d	remoteproc: k3-r5: Add devm action to release reserved memory [ Upstream commit `972361e397` ] Use a device lifecycle managed action to release reserved memory. This helps prevent mistakes like releasing out of order in cleanup functions and forgetting to release on error paths. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Reviewed-by: Andrew Davis <afd@ti.com> Link: https://lore.kernel.org/r/20241219110545.1898883-2-b-padhi@ti.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Stable-dep-of: `701177511a` ("remoteproc: k3-r5: Refactor sequential core power up/down operations") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:02 +02:00
Markus Elfring	5eec92eb4f	remoteproc: k3: Call of_node_put(rmem_np) only once in three functions [ Upstream commit `a36d9f96d1` ] An of_node_put(rmem_np) call was immediately used after a pointer check for a of_reserved_mem_lookup() call in three function implementations. Thus call such a function only once instead directly before the checks. This issue was transformed by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Link: https://lore.kernel.org/r/c46b06f9-72b1-420b-9dce-a392b982140e@web.de Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Stable-dep-of: `701177511a` ("remoteproc: k3-r5: Refactor sequential core power up/down operations") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:01 +02:00
Kees Cook	5b6eb04c05	ubsan: integer-overflow: depend on BROKEN to keep this out of CI [ Upstream commit `d6a0e0bfec` ] Depending on !COMPILE_TEST isn't sufficient to keep this feature out of CI because we can't stop it from being included in randconfig builds. This feature is still highly experimental, and is developed in lock-step with Clang's Overflow Behavior Types[1]. Depend on BROKEN to keep it from being enabled by anyone not expecting it. Link: https://discourse.llvm.org/t/rfc-v2-clang-introduce-overflowbehaviortypes-for-wrapping-and-non-wrapping-arithmetic/86507 [1] Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202505281024.f42beaa7-lkp@intel.com Fixes: `557f8c582a` ("ubsan: Reintroduce signed overflow sanitizer") Acked-by: Eric Biggers <ebiggers@kernel.org> Link: https://lore.kernel.org/r/20250528182616.work.296-kees@kernel.org Reviewed-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Marco Elver <elver@google.com> Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:01 +02:00
Pengyu Luo	f3a472b914	arm64: dts: qcom: sm8650: add the missing l2 cache node [ Upstream commit `4becd72352` ] Only two little a520s share the same L2, every a720 has their own L2 cache. Fixes: `d235037799` ("arm64: dts: qcom: add initial SM8650 dtsi") Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250405105529.309711-1-mitltlatltl@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:01 +02:00
Geert Uytterhoeven	5a867d09f5	arm64: dts: renesas: white-hawk-single: Improve Ethernet TSN description [ Upstream commit `8ffec7d62c` ] - Add the missing "ethernet3" alias for the Ethernet TSN port, so U-Boot will fill its local-mac-address property based on the "eth3addr" environment variable (if set), avoiding a random MAC address being assigned by the OS, - Rename the numerical Ethernet PHY label to "tsn0_phy", to avoid future conflicts, and for consistency with the "avbN_phy" labels. Fixes: `3d8e475bd7` ("arm64: dts: renesas: white-hawk-single: Wire-up Ethernet TSN") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Link: https://lore.kernel.org/367f10a18aa196ff1c96734dd9bd5634b312c421.1746624368.git.geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:01 +02:00
Geert Uytterhoeven	7f0e933241	arm64: dts: renesas: Factor out White Hawk Single board support [ Upstream commit `d43c077cb8` ] Move the common parts for the Renesas White Hawk Single board to white-hawk-single.dtsi, to enable future reuse. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/1661743b18a9ff9fac716f98a663b39fc8488d7e.1733156661.git.geert+renesas@glider.be Stable-dep-of: `8ffec7d62c` ("arm64: dts: renesas: white-hawk-single: Improve Ethernet TSN description") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:01 +02:00
Geert Uytterhoeven	b9baad894b	arm64: dts: renesas: Use interrupts-extended for Ethernet PHYs [ Upstream commit `ba4d843a2a` ] Use the more concise interrupts-extended property to fully describe the interrupts. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # G2L family and G3S Link: https://lore.kernel.org/e9db8758d275ec63b0d6ce086ac3d0ea62966865.1728045620.git.geert+renesas@glider.be Stable-dep-of: `8ffec7d62c` ("arm64: dts: renesas: white-hawk-single: Improve Ethernet TSN description") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:01 +02:00
Luca Weiss	d8b92a122a	arm64: dts: qcom: sm8650: Fix domain-idle-state for CPU2 [ Upstream commit `9bb5ca4641` ] On SM8650 the CPUs 0-1 are "silver" (Cortex-A520), CPU 2-6 are "gold" (Cortex-A720) and CPU 7 is "gold-plus" (Cortex-X4). So reference the correct "gold" idle-state for CPU core 2. Fixes: `d235037799` ("arm64: dts: qcom: add initial SM8650 dtsi") Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250314-sm8650-cpu2-sleep-v1-1-31d5c7c87a5d@fairphone.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:01 +02:00
Krzysztof Kozlowski	67b3bb57fa	arm64: dts: qcom: sm8650: change labels to lower-case [ Upstream commit `20eb2057b3` ] DTS coding style expects labels to be lowercase. No functional impact. Verified with comparing decompiled DTB (dtx_diff and fdtdump+diff). Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20241022-dts-qcom-label-v3-14-0505bc7d2c56@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Stable-dep-of: `9bb5ca4641` ("arm64: dts: qcom: sm8650: Fix domain-idle-state for CPU2") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:00 +02:00
Yonghong Song	4265682c29	bpf: Do not include stack ptr register in precision backtracking bookkeeping [ Upstream commit `e2d2115e56` ] Yi Lai reported an issue ([1]) where the following warning appears in kernel dmesg: [ 60.643604] verifier backtracking bug [ 60.643635] WARNING: CPU: 10 PID: 2315 at kernel/bpf/verifier.c:4302 __mark_chain_precision+0x3a6c/0x3e10 [ 60.648428] Modules linked in: bpf_testmod(OE) [ 60.650471] CPU: 10 UID: 0 PID: 2315 Comm: test_progs Tainted: G OE 6.15.0-rc4-gef11287f8289-dirty #327 PREEMPT(full) [ 60.654385] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 60.656682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 60.660475] RIP: 0010:__mark_chain_precision+0x3a6c/0x3e10 [ 60.662814] Code: 5a 30 84 89 ea e8 c4 d9 01 00 80 3d 3e 7d d8 04 00 0f 85 60 fa ff ff c6 05 31 7d d8 04 01 48 c7 c7 00 58 30 84 e8 c4 06 a5 ff <0f> 0b e9 46 fa ff ff 48 ... [ 60.668720] RSP: 0018:ffff888116cc7298 EFLAGS: 00010246 [ 60.671075] RAX: 54d70e82dfd31900 RBX: ffff888115b65e20 RCX: 0000000000000000 [ 60.673659] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 00000000ffffffff [ 60.676241] RBP: 0000000000000400 R08: ffff8881f6f23bd3 R09: 1ffff1103ede477a [ 60.678787] R10: dffffc0000000000 R11: ffffed103ede477b R12: ffff888115b60ae8 [ 60.681420] R13: 1ffff11022b6cbc4 R14: 00000000fffffff2 R15: 0000000000000001 [ 60.684030] FS: 00007fc2aedd80c0(0000) GS:ffff88826fa8a000(0000) knlGS:0000000000000000 [ 60.686837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 60.689027] CR2: 000056325369e000 CR3: 000000011088b002 CR4: 0000000000370ef0 [ 60.691623] Call Trace: [ 60.692821] <TASK> [ 60.693960] ? __pfx_verbose+0x10/0x10 [ 60.695656] ? __pfx_disasm_kfunc_name+0x10/0x10 [ 60.697495] check_cond_jmp_op+0x16f7/0x39b0 [ 60.699237] do_check+0x58fa/0xab10 ... Further analysis shows the warning is at line 4302 as below: 4294 /* static subprog call instruction, which 4295 * means that we are exiting current subprog, 4296 * so only r1-r5 could be still requested as 4297 * precise, r0 and r6-r10 or any stack slot in 4298 * the current frame should be zero by now 4299 */ 4300 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) { 4301 verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); 4302 WARN_ONCE(1, "verifier backtracking bug"); 4303 return -EFAULT; 4304 } With the below test (also in the next patch): __used __naked static void __bpf_jmp_r10(void) { asm volatile ( "r2 = 2314885393468386424 ll;" "goto +0;" "if r2 <= r10 goto +3;" "if r1 >= -1835016 goto +0;" "if r2 <= 8 goto +0;" "if r3 <= 0 goto +0;" "exit;" ::: __clobber_all); } SEC("?raw_tp") __naked void bpf_jmp_r10(void) { asm volatile ( "r3 = 0 ll;" "call __bpf_jmp_r10;" "r0 = 0;" "exit;" ::: __clobber_all); } The following is the verifier failure log: 0: (18) r3 = 0x0 ; R3_w=0 2: (85) call pc+2 caller: R10=fp0 callee: frame1: R1=ctx() R3_w=0 R10=fp0 5: frame1: R1=ctx() R3_w=0 R10=fp0 ; asm volatile (" \ @ verifier_precision.c:184 5: (18) r2 = 0x20202000256c6c78 ; frame1: R2_w=0x20202000256c6c78 7: (05) goto pc+0 8: (bd) if r2 <= r10 goto pc+3 ; frame1: R2_w=0x20202000256c6c78 R10=fp0 9: (35) if r1 >= 0xffe3fff8 goto pc+0 ; frame1: R1=ctx() 10: (b5) if r2 <= 0x8 goto pc+0 mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1 mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0 mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3 mark_precise: frame1: regs=r2,r10 stack= before 7: (05) goto pc+0 mark_precise: frame1: regs=r2,r10 stack= before 5: (18) r2 = 0x20202000256c6c78 mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+2 BUG regs 400 The main failure reason is due to r10 in precision backtracking bookkeeping. Actually r10 is always precise and there is no need to add it for the precision backtracking bookkeeping. One way to fix the issue is to prevent bt_set_reg() if any src/dst reg is r10. Andrii suggested to go with push_insn_history() approach to avoid explicitly checking r10 in backtrack_insn(). This patch added push_insn_history() support for cond_jmp like 'rX <op> rY' operations. In check_cond_jmp_op(), if any of rX or rY is a stack pointer, push_insn_history() will record such information, and later backtrack_insn() will do bt_set_reg() properly for those register(s). [1] https://lore.kernel.org/bpf/Z%2F8q3xzpU59CIYQE@ly-workstation/ Reported by: Yi Lai <yi1.lai@linux.intel.com> Fixes: `407958a0e9` ("bpf: encapsulate precision backtracking bookkeeping") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250524041335.4046126-1-yonghong.song@linux.dev Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:00 +02:00
Andrii Nakryiko	c5474a7b04	bpf: use common instruction history across all states [ Upstream commit `96a30e469c` ] Instead of allocating and copying instruction history each time we enqueue child verifier state, switch to a model where we use one common dynamically sized array of instruction history entries across all states. The key observation for proving this is correct is that instruction history is only relevant while state is active, which means it either is a current state (and thus we are actively modifying instruction history and no other state can interfere with us) or we are checkpointed state with some children still active (either enqueued or being current). In the latter case our portion of instruction history is finalized and won't change or grow, so as long as we keep it immutable until the state is finalized, we are good. Now, when state is finalized and is put into state hash for potentially future pruning lookups, instruction history is not used anymore. This is because instruction history is only used by precision marking logic, and we never modify precision markings for finalized states. So, instead of each state having its own small instruction history, we keep a global dynamically-sized instruction history, where each state in current DFS path from root to active state remembers its portion of instruction history. Current state can append to this history, but cannot modify any of its parent histories. Async callback state enqueueing, while logically detached from parent state, still is part of verification backtracking tree, so has to follow the same schema as normal state checkpoints. Because the insn_hist array can be grown through realloc, states don't keep pointers, they instead maintain two indices, [start, end), into global instruction history array. End is exclusive index, so `start == end` means there is no relevant instruction history. This eliminates a lot of allocations and minimizes overall memory usage. For instance, running a worst-case test from [0] (but without the heuristics-based fix [1]), it took 12.5 minutes until we get -ENOMEM. With the changes in this patch the whole test succeeds in 10 minutes (very slow, so heuristics from [1] is important, of course). To further validate correctness, veristat-based comparison was performed for Meta production BPF objects and BPF selftests objects. In both cases there were no differences at all in terms of verdict or instruction and state counts, providing a good confidence in the change. Having this low-memory-overhead solution of keeping dynamic per-instruction history cheaply opens up some new possibilities, like keeping extra information for literally every single validated instruction. This will be used for simplifying precision backpropagation logic in follow up patches. [0] https://lore.kernel.org/bpf/20241029172641.1042523-2-eddyz87@gmail.com/ [1] https://lore.kernel.org/bpf/20241029172641.1042523-1-eddyz87@gmail.com/ Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20241115001303.277272-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: `e2d2115e56` ("bpf: Do not include stack ptr register in precision backtracking bookkeeping") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:00 +02:00
Longfang Liu	be1e0287ac	hisi_acc_vfio_pci: bugfix the problem of uninstalling driver [ Upstream commit `db6525a857` ] In a live migration scenario. If the number of VFs at the destination is greater than the source, the recovery operation will fail and qemu will not be able to complete the process and exit after shutting down the device FD. This will cause the driver to be unable to be unloaded normally due to abnormal reference counting of the live migration driver caused by the abnormal closing operation of fd. Therefore, make sure the migration file descriptor references are always released when the device is closed. Fixes: `b0eed08590` ("hisi_acc_vfio_pci: Add support for VFIO live migration") Signed-off-by: Longfang Liu <liulongfang@huawei.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Link: https://lore.kernel.org/r/20250510081155.55840-5-liulongfang@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:00 +02:00

1 2 3 4 5 ...

1319709 Commits