linux-yocto

mirror of git://git.yoctoproject.org/linux-yocto.git synced 2025-08-22 00:42:01 +02:00

Author	SHA1	Message	Date
Thomas Fourier	39dac98aca	scsi: qla4xxx: Fix missing DMA mapping error in qla4xxx_alloc_pdu() [ Upstream commit `00f452a1b0` ] dma_map_XXX() can fail and should be tested for errors with dma_mapping_error(). Fixes: `b3a271a94d` ("[SCSI] qla4xxx: support iscsiadm session mgmt") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Link: https://lore.kernel.org/r/20250618071742.21822-2-fourier.thomas@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:03:06 +02:00
Thomas Fourier	c3ec87fbb0	scsi: qla2xxx: Fix DMA mapping test in qla24xx_get_port_database() [ Upstream commit `c3b214719a` ] dma_map_XXX() functions return as error values DMA_MAPPING_ERROR which is often ~0. The error value should be tested with dma_mapping_error() like it was done in qla26xx_dport_diagnostics(). Fixes: `818c7f87a1` ("scsi: qla2xxx: Add changes in preparation for vendor extended FDMI/RDP") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Link: https://lore.kernel.org/r/20250617161115.39888-2-fourier.thomas@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:03:06 +02:00
Chen Yu	bf2c1643ab	scsi: megaraid_sas: Fix invalid node index commit `752eb816b5` upstream. On a system with DRAM interleave enabled, out-of-bound access is detected: megaraid_sas 0000:3f:00.0: requested/available msix 128/128 poll_queue 0 ------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in ./arch/x86/include/asm/topology.h:72:28 index -1 is out of range for type 'cpumask *[1024]' dump_stack_lvl+0x5d/0x80 ubsan_epilogue+0x5/0x2b __ubsan_handle_out_of_bounds.cold+0x46/0x4b megasas_alloc_irq_vectors+0x149/0x190 [megaraid_sas] megasas_probe_one.cold+0xa4d/0x189c [megaraid_sas] local_pci_probe+0x42/0x90 pci_device_probe+0xdc/0x290 really_probe+0xdb/0x340 __driver_probe_device+0x78/0x110 driver_probe_device+0x1f/0xa0 __driver_attach+0xba/0x1c0 bus_for_each_dev+0x8b/0xe0 bus_add_driver+0x142/0x220 driver_register+0x72/0xd0 megasas_init+0xdf/0xff0 [megaraid_sas] do_one_initcall+0x57/0x310 do_init_module+0x90/0x250 init_module_from_file+0x85/0xc0 idempotent_init_module+0x114/0x310 __x64_sys_finit_module+0x65/0xc0 do_syscall_64+0x82/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fix it accordingly. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Link: https://lore.kernel.org/r/20250604042556.3731059-1-yu.c.chen@intel.com Fixes: `8049da6f39` ("scsi: megaraid_sas: Use irq_set_affinity_and_hint()") Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-06 11:00:14 +02:00
Vitaliy Shevtsov	a77d0a14ed	scsi: elx: efct: Fix memory leak in efct_hw_parse_filter() [ Upstream commit `2a8a5a5dd0` ] strsep() modifies the address of the pointer passed to it so that it no longer points to the original address. This means kfree() gets the wrong pointer. Fix this by passing unmodified pointer returned from kstrdup() to kfree(). Found by Linux Verification Center (linuxtesting.org) with Svace. Fixes: `4df84e8466` ("scsi: elx: efct: Driver initialization routines") Signed-off-by: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru> Link: https://lore.kernel.org/r/20250612163616.24298-1-v.shevtsov@mt-integration.ru Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-27 11:09:04 +01:00
Dexuan Cui	8d60df50e6	scsi: storvsc: Increase the timeouts to storvsc_timeout commit `b2f966568f` upstream. Currently storvsc_timeout is only used in storvsc_sdev_configure(), and 5s and 10s are used elsewhere. It turns out that rarely the 5s is not enough on Azure, so let's use storvsc_timeout everywhere. In case a timeout happens and storvsc_channel_init() returns an error, close the VMBus channel so that any host-to-guest messages in the channel's ringbuffer, which might come late, can be safely ignored. Add a "const" to storvsc_timeout. Cc: stable@kernel.org Signed-off-by: Dexuan Cui <decui@microsoft.com> Link: https://lore.kernel.org/r/1749243459-10419-1-git-send-email-decui@microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-06-27 11:08:59 +01:00
Daniel Wagner	34c0a67055	scsi: lpfc: Use memcpy() for BIOS version [ Upstream commit `ae82eaf4ae` ] The strlcat() with FORTIFY support is triggering a panic because it thinks the target buffer will overflow although the correct target buffer size is passed in. Anyway, instead of memset() with 0 followed by a strlcat(), just use memcpy() and ensure that the resulting buffer is NULL terminated. BIOSVersion is only used for the lpfc_printf_log() which expects a properly terminated string. Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20250409-fix-lpfc-bios-str-v1-1-05dac9e51e13@kernel.org Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-27 11:08:56 +01:00
Justin Tee	32f25633f3	scsi: lpfc: Fix lpfc_check_sli_ndlp() handling for GEN_REQUEST64 commands [ Upstream commit `05ae6c9c73` ] In lpfc_check_sli_ndlp(), the get_job_els_rsp64_did remote_id assignment does not apply for GEN_REQUEST64 commands as it only has meaning for a ELS_REQUEST64 command. So, if (iocb->ndlp == ndlp) is false, we could erroneously return the wrong value. Fix by replacing the fallthrough statement with a break statement before the remote_id check. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250425194806.3585-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-27 11:08:55 +01:00
Alok Tiwari	d4cbcf274c	scsi: iscsi: Fix incorrect error path labels for flashnode operations [ Upstream commit `9b17621366` ] Correct the error handling goto labels used when host lookup fails in various flashnode-related event handlers: - iscsi_new_flashnode() - iscsi_del_flashnode() - iscsi_login_flashnode() - iscsi_logout_flashnode() - iscsi_logout_flashnode_sid() scsi_host_put() is not required when shost is NULL, so jumping to the correct label avoids unnecessary operations. These functions previously jumped to the wrong goto label (put_host), which did not match the intended cleanup logic. Use the correct exit labels (exit_new_fnode, exit_del_fnode, etc.) to ensure proper error handling. Also remove the unused put_host label under iscsi_new_flashnode() as it is no longer needed. No functional changes beyond accurate error path correction. Fixes: `c6a4bb2ef5` ("[SCSI] scsi_transport_iscsi: Add flash node mgmt support") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://lore.kernel.org/r/20250530193012.3312911-1-alok.a.tiwari@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-19 15:28:40 +02:00
Yihang Li	624b4cf6c4	scsi: hisi_sas: Call I_T_nexus after soft reset for SATA disk [ Upstream commit `e4d953ca55` ] In commit `21c7e97247` ("scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure"), if the softreset fails upon certain conditions, the PHY connected to the disk is disabled directly. Manual recovery is required, which is inconvenient for users in actual use. In addition, SATA disks do not support simultaneous connection of multiple hosts. Therefore, when multiple controllers are connected to a SATA disk at the same time, the controller which is connected later failed to issue an ATA softreset to the SATA disk. As a result, the PHY associated with the disk is disabled and cannot be automatically recovered. Now that, we will not focus on the execution result of softreset. No matter whether the execution is successful or not, we will directly carry out I_T_nexus_reset. Fixes: `21c7e97247` ("scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure") Signed-off-by: Yihang Li <liyihang9@huawei.com> Link: https://lore.kernel.org/r/20250414080845.1220997-4-liyihang9@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-19 15:28:12 +02:00
Kees Cook	ee96502062	scsi: qedf: Use designated initializer for struct qed_fcoe_cb_ops [ Upstream commit `d8720235d5` ] Recent fixes to the randstruct GCC plugin allowed it to notice that this structure is entirely function pointers and is therefore subject to randomization, but doing so requires that it always use designated initializers. Explicitly specify the "common" member as being initialized. Silences: drivers/scsi/qedf/qedf_main.c:702:9: error: positional initialization of field in 'struct' declared with 'designated_init' attribute [-Werror=designated-init] 702 \| { \| ^ Fixes: `035f7f87b7` ("randstruct: Enable Clang support") Link: https://lore.kernel.org/r/20250502224156.work.617-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-19 15:28:08 +02:00
Kai Mäkisara	32bcf54138	scsi: st: Restore some drive settings after reset [ Upstream commit `7081dc75df` ] Some of the allowed operations put the tape into a known position to continue operation assuming only the tape position has changed. But reset sets partition, density and block size to drive default values. These should be restored to the values before reset. Normally the current block size and density are stored by the drive. If the settings have been changed, the changed values have to be saved by the driver across reset. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250120194925.44432-2-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-04 14:42:13 +02:00
Justin Tee	1960bb56a9	scsi: lpfc: Free phba irq in lpfc_sli4_enable_msi() when pci_irq_vector() fails [ Upstream commit `f0842902b3` ] Fix smatch warning regarding missed calls to free_irq(). Free the phba IRQ in the failed pci_irq_vector cases. lpfc_init.c: lpfc_sli4_enable_msi() warn: 'phba->pcidev->irq' from request_irq() not released. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-04 14:42:13 +02:00
Justin Tee	3dfeee957a	scsi: lpfc: Handle duplicate D_IDs in ndlp search-by D_ID routine [ Upstream commit `56c3d809b7` ] After a port swap between separate fabrics, there may be multiple nodes in the vport's fc_nodes list with the same fabric well known address. Duplication is temporary and eventually resolves itself after dev_loss_tmo expires, but nameserver queries may still occur before dev_loss_tmo. This possibly results in returning stale fabric ndlp objects. Fix by adding an nlp_state check to ensure the ndlp search routine returns the correct newer allocated ndlp fabric object. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-04 14:42:13 +02:00
Shivasharan S	f38a1b35c8	scsi: mpt3sas: Send a diag reset if target reset fails [ Upstream commit `5612d6d51e` ] When an IOCTL times out and driver issues a target reset, if firmware fails the task management elevate the recovery by issuing a diag reset to controller. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Link: https://lore.kernel.org/r/1739410016-27503-5-git-send-email-shivasharan.srikanteshwara@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-04 14:42:06 +02:00
Kai Mäkisara	9f51fa1971	scsi: st: ERASE does not change tape location [ Upstream commit `ad77cebf97` ] The SCSI ERASE command erases from the current position onwards. Don't clear the position variables. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-3-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-04 14:41:58 +02:00
Kai Mäkisara	01195aa1d6	scsi: st: Tighten the page format heuristics with MODE SELECT [ Upstream commit `8db816c6f1` ] In the days when SCSI-2 was emerging, some drives did claim SCSI-2 but did not correctly implement it. The st driver first tries MODE SELECT with the page format bit set to set the block descriptor. If not successful, the non-page format is tried. The test only tests the sense code and this triggers also from illegal parameter in the parameter list. The test is limited to "old" devices and made more strict to remove false alarms. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-4-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-04 14:41:58 +02:00
Ranjan Kumar	c36f5f659a	scsi: mpi3mr: Add level check to control event logging [ Upstream commit `b0b7ee3b57` ] Ensure event logs are only generated when the debug logging level MPI3_DEBUG_EVENT is enabled. This prevents unnecessary logging. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250415101546.204018-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-04 14:41:53 +02:00
Steve Siwinski	c682a19344	scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer commit `e8007fad54` upstream. The REPORT ZONES buffer size is currently limited by the HBA's maximum segment count to ensure the buffer can be mapped. However, the block layer further limits the number of iovec entries to 1024 when allocating a bio. To avoid allocation of buffers too large to be mapped, further restrict the maximum buffer size to BIO_MAX_INLINE_VECS. Replace the UIO_MAXIOV symbolic name with the more contextually appropriate BIO_MAX_INLINE_VECS. Fixes: `b091ac6168` ("sd_zbc: Fix report zones buffer allocation") Cc: stable@vger.kernel.org Signed-off-by: Steve Siwinski <ssiwinski@atto.com> Link: https://lore.kernel.org/r/20250508200122.243129-1-ssiwinski@atto.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-05-22 14:12:22 +02:00
Michael Kelley	c0f3f0c88f	Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges commit `380b75d307` upstream. vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver and is hardcoded to create a single GPA range. To allow it to also be used by the netvsc driver to create multiple GPA ranges, no longer hardcode as having a single GPA range. Allow the calling driver to specify the rangecount in the supplied descriptor. Update the storvsc driver to reflect this new approach. Cc: <stable@vger.kernel.org> # 6.1.x Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://patch.msgid.link/20250513000604.1396-2-mhklinux@outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-05-22 14:12:21 +02:00
Igor Pylypiv	a862d24e1f	scsi: pm80xx: Set phy_attached to zero when device is gone [ Upstream commit `f7b705c238` ] When a fatal error occurs, a phy down event may not be received to set phy->phy_attached to zero. Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Salomon Dushimirimana <salomondush@google.com> Link: https://lore.kernel.org/r/20250319230305.3172920-1-salomondush@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-05-02 07:51:01 +02:00
Xingui Yang	fa99f1886e	scsi: hisi_sas: Fix I/O errors caused by hardware port ID changes [ Upstream commit `daff37f00c` ] The hw port ID of phy may change when inserting disks in batches, causing the port ID in hisi_sas_port and itct to be inconsistent with the hardware, resulting in I/O errors. The solution is to set the device state to gone to intercept I/O sent to the device, and then execute linkreset to discard and find the disk to re-update its information. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Link: https://lore.kernel.org/r/20250312095135.3048379-3-yangxingui@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-05-02 07:51:00 +02:00
Damien Le Moal	164bc7e26d	scsi: Improve CDL control commit `14a3cc7558` upstream. With ATA devices supporting the CDL feature, using CDL requires that the feature be enabled with a SET FEATURES command. This command is issued as the translated command for the MODE SELECT command issued by scsi_cdl_enable() when the user enables CDL through the device cdl_enable sysfs attribute. However, the implementation of scsi_cdl_enable() always issues a MODE SELECT command for ATA devices when the enable argument is true, even if CDL is already enabled on the device. While this does not cause any issue with using CDL descriptors with read/write commands (the CDL feature will be enabled on the drive), issuing the MODE SELECT command even when the device CDL feature is already enabled will cause a reset of the ATA device CDL statistics log page (as defined in ACS, any CDL enable action must reset the device statistics). Avoid these needless actions (and the implied statistics log page reset) by modifying scsi_cdl_enable() to issue the MODE SELECT command to enable CDL if and only if CDL is not reported as already enabled on the device. And while at it, simplify the initialization of the is_ata boolean variable and move the declaration of the scsi mode data and sense header variables to within the scope of ATA device handling. Fixes: `1b22cfb141` ("scsi: core: Allow enabling and disabling command duration limits") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Igor Pylypiv <ipylypiv@google.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-05-02 07:50:48 +02:00
Anastasia Kovaleva	4d6919dd7c	scsi: core: Clear flags for scsi_cmnd that did not complete [ Upstream commit `54bebe4687` ] Commands that have not been completed with scsi_done() do not clear the SCMD_INITIALIZED flag and therefore will not be properly reinitialized. Thus, the next time the scsi_cmnd structure is used, the command may fail in scsi_cmd_runtime_exceeded() due to the old jiffies_at_alloc value: kernel: sd 16:0:1:84: [sdts] tag#405 timing out command, waited 720s kernel: sd 16:0:1:84: [sdts] tag#405 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=66636s Clear flags for commands that have not been completed by SCSI. Fixes: `4abafdc436` ("block: remove the initialize_rq_fn blk_mq_ops method") Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com> Link: https://lore.kernel.org/r/20250324084933.15932-2-a.kovaleva@yadro.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-05-02 07:50:43 +02:00
Chandrakanth Patil	92d8a4e621	scsi: megaraid_sas: Block zero-length ATA VPD inquiry commit `aad9945623` upstream. A firmware bug was observed where ATA VPD inquiry commands with a zero-length data payload were not handled and failed with a non-standard status code of 0xf0. Avoid sending ATA VPD inquiry commands without data payload by setting the device no_vpd_size flag to 1. In addition, if the firmware returns a status code of 0xf0, set scsi_cmnd->result to CHECK_CONDITION to facilitate proper error handling. Suggested-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250402193735.5098-1-chandrakanth.patil@broadcom.com Tested-by: Ryan Lahfa <ryan@lahfa.xyz> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-04-25 10:45:51 +02:00
Miaoqian Lin	b0348f3394	scsi: iscsi: Fix missing scsi_host_put() in error path [ Upstream commit `72eea84a10` ] Add goto to ensure scsi_host_put() is called in all error paths of iscsi_set_host_param() function. This fixes a potential memory leak when strlen() check fails. Fixes: `ce51c81700` ("scsi: iscsi: Add strlen() check in iscsi_if_set{_host}_param()") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20250318094344.91776-1-linmq006@gmail.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-25 10:45:38 +02:00
Xingui Yang	a70ea92964	scsi: hisi_sas: Enable force phy when SATA disk directly connected [ Upstream commit `8aa580cd92` ] when a SATA disk is directly connected the SAS controller determines the disk to which I/Os are delivered based on the port ID in the DQ entry. When many phys are disconnected and reconnect, the port ID of phys were changed and used by other link, resulting in I/O being sent to incorrect disk. Data inconsistency on the SATA disk may occur during I/O retries using the old port ID. So enable force phy, then force the command to be executed in a certain phy, and if the actual phy ID of the port does not match the phy configured in the command, the chip will stop delivering the I/O to disk. Fixes: `ce60689e12` ("scsi: hisi_sas: add v3 code to send ATA frame") Signed-off-by: Xingui Yang <yangxingui@huawei.com> Link: https://lore.kernel.org/r/20250312095135.3048379-2-yangxingui@huawei.com Reviewed-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-25 10:45:38 +02:00
Kai Mäkisara	e4d1ca0a84	scsi: st: Fix array overflow in st_setup() [ Upstream commit `a018d1cf99` ] Change the array size to follow parms size instead of a fixed value. Reported-by: Chenyuan Yang <chenyuan0y@gmail.com> Closes: https://lore.kernel.org/linux-scsi/CALGdzuoubbra4xKOJcsyThdk5Y1BrAmZs==wbqjbkAgmKS39Aw@mail.gmail.com/ Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-2-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-25 10:45:16 +02:00
Steven Rostedt (Google)	f568fbe8c6	tracing: Allow creating instances with specified system events [ Upstream commit `d23569979c` ] A trace instance may only need to enable specific events. As the eventfs directory of an instance currently creates all events which adds overhead, allow internal instances to be created with just the events in systems that they care about. This currently only deals with systems and not individual events, but this should bring down the overhead of creating instances for specific use cases quite a bit. The trace_array_get_by_name() now has another parameter "systems". This parameter is a const string pointer of a comma/space separated list of event systems that should be created by the trace_array. (Note if the trace_array already exists, this parameter is ignored). The list of systems is saved and if a module is loaded, its events will not be added unless the system for those events also matches the systems string. Link: https://lore.kernel.org/linux-trace-kernel/20231213093701.03fddec0@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: Arun Easi <aeasi@marvell.com> Cc: Daniel Wagner <dwagner@suse.de> Tested-by: Dmytro Maluka <dmaluka@chromium.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Stable-dep-of: `0b4ffbe488` ("tracing: Correct the refcount if the hist/hist_debug file fails to open") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-10 14:37:41 +02:00
Magnus Lindholm	ea371d1cde	scsi: qla1280: Fix kernel oops when debug level > 2 [ Upstream commit `5233e3235d` ] A null dereference or oops exception will eventually occur when qla1280.c driver is compiled with DEBUG_QLA1280 enabled and ql_debug_level > 2. I think it's clear from the code that the intention here is sg_dma_len(s) not length of sg_next(s) when printing the debug info. Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/20250125095033.26188-1-linmag7@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-03-22 12:50:40 -07:00
Rik van Riel	e9d4044f4b	scsi: core: Use GFP_NOIO to avoid circular locking dependency [ Upstream commit `5363ee9d11` ] Filesystems can write to disk from page reclaim with __GFP_FS set. Marc found a case where scsi_realloc_sdev_budget_map() ends up in page reclaim with GFP_KERNEL, where it could try to take filesystem locks again, leading to a deadlock. WARNING: possible circular locking dependency detected 6.13.0 #1 Not tainted ------------------------------------------------------ kswapd0/70 is trying to acquire lock: ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0 but task is already holding lock: ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760 The full lockdep splat can be found in Marc's report: https://lkml.org/lkml/2025/1/24/1101 Avoid the potential deadlock by doing the allocation with GFP_NOIO, which prevents both filesystem and block layer recursion. Reported-by: Marc Aurèle La France <tsi@tuyoix.net> Signed-off-by: Rik van Riel <riel@surriel.com> Link: https://lore.kernel.org/r/20250129104525.0ae8421e@fangorn Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-03-22 12:50:40 -07:00
Ye Bin	06518de45e	scsi: core: Clear driver private data when retrying request [ Upstream commit `dce5c4afd0` ] After commit `1bad6c4a57` ("scsi: zero per-cmd private driver data for each MQ I/O"), the xen-scsifront/virtio_scsi/snic drivers all removed code that explicitly zeroed driver-private command data. In combination with commit `464a00c9e0` ("scsi: core: Kill DRIVER_SENSE"), after virtio_scsi performs a capacity expansion, the first request will return a unit attention to indicate that the capacity has changed. And then the original command is retried. As driver-private command data was not cleared, the request would return UA again and eventually time out and fail. Zero driver-private command data when a request is retried. Fixes: `f7de50da14` ("scsi: xen-scsifront: Remove code that zeroes driver-private command data") Fixes: `c2bb87318b` ("scsi: virtio_scsi: Remove code that zeroes driver-private command data") Fixes: `c3006a9264` ("scsi: snic: Remove code that zeroes driver-private command data") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20250217021628.2929248-1-yebin@huaweicloud.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-03-07 16:45:37 +01:00
Igor Pylypiv	3cfce644d8	scsi: core: Do not retry I/Os during depopulation [ Upstream commit `9ff7c383b8` ] Fail I/Os instead of retry to prevent user space processes from being blocked on the I/O completion for several minutes. Retrying I/Os during "depopulation in progress" or "depopulation restore in progress" results in a continuous retry loop until the depopulation completes or until the I/O retry loop is aborted due to a timeout by the scsi_cmd_runtime_exceeded(). Depopulation is slow and can take 24+ hours to complete on 20+ TB HDDs. Most I/Os in the depopulation retry loop end up taking several minutes before returning the failure to user space. Cc: stable@vger.kernel.org # 4.18.x: `2bbeb8d` scsi: core: Handle depopulation and restoration in progress Cc: stable@vger.kernel.org # 4.18.x Fixes: `e37c7d9a03` ("scsi: core: sanitize++ in progress") Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20250131184408.859579-1-ipylypiv@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-02-27 04:10:46 -08:00
Douglas Gilbert	7f818ac0ac	scsi: core: Handle depopulation and restoration in progress [ Upstream commit `2bbeb8d124` ] The default handling of the NOT READY sense key is to wait for the device to become ready. The "wait" is assumed to be relatively short. However there is a sub-class of NOT READY that have the "... in progress" phrase in their additional sense code and these can take much longer. Following on from commit `505aa4b6a8` ("scsi: sd: Defer spinning up drive while SANITIZE is in progress") we now have element depopulation and restoration that can take a long time. For example, over 24 hours for a 20 TB, 7200 rpm hard disk to depopulate 1 of its 20 elements. Add handling of ASC/ASCQ: 0x4,0x24 (depopulation in progress) and ASC/ASCQ: 0x4,0x25 (depopulation restoration in progress) to sd.c . The scsi_lib.c has incomplete handling of these two messages, so complete it. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20231015050650.131145-1-dgilbert@interlog.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: `9ff7c383b8` ("scsi: core: Do not retry I/Os during depopulation") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-02-27 04:10:46 -08:00
Long Li	7df68980e8	scsi: storvsc: Set correct data length for sending SCSI command without payload commit `87c4b5e8a6` upstream. In StorVSC, payload->range.len is used to indicate if this SCSI command carries payload. This data is allocated as part of the private driver data by the upper layer and may get passed to lower driver uninitialized. For example, the SCSI error handling mid layer may send TEST_UNIT_READY or REQUEST_SENSE while reusing the buffer from a failed command. The private data section may have stale data from the previous command. If the SCSI command doesn't carry payload, the driver may use this value as is for communicating with host, resulting in possible corruption. Fix this by always initializing this value. Fixes: `be0cf6ca30` ("scsi: storvsc: Set the tablesize based on the information given by the host") Cc: stable@kernel.org Tested-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Long Li <longli@microsoft.com> Link: https://lore.kernel.org/r/1737601642-7759-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-17 09:40:27 +01:00
Quinn Tran	217230bc87	scsi: qla2xxx: Move FCE Trace buffer allocation to user control commit `841df27d61` upstream. Currently FCE Tracing is enabled to log additional ELS events. Instead, user will enable or disable this feature through debugfs. Modify existing DFS knob to allow user to enable or disable this feature. echo [1 \| 0] > /sys/kernel/debug/qla2xxx/qla2xxx_??/fce cat /sys/kernel/debug/qla2xxx/qla2xxx_??/fce Cc: stable@vger.kernel.org Fixes: `df613b9607` ("[SCSI] qla2xxx: Add Fibre Channel Event (FCE) tracing support.") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-17 09:40:26 +01:00
Kai Mäkisara	7bfa83ee25	scsi: st: Don't set pos_unknown just after device recognition commit `98b37881b7` upstream. Commit `9604eea5bd` ("scsi: st: Add third party poweron reset handling") in v6.6 added new code to handle the Power On/Reset Unit Attention (POR UA) sense data. This was in addition to the existing method. When this Unit Attention is received, the driver blocks attempts to read, write and some other operations because the reset may have rewinded the tape. Because of the added code, also the initial POR UA resulted in blocking operations, including those that are used to set the driver options after the device is recognized. Also, reading and writing are refused, whereas they succeeded before this commit. Add code to not set pos_unknown to block operations if the POR UA is received from the first test_ready() call after the st device has been created. This restores the behavior before v6.6. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20241216113755.30415-1-Kai.Makisara@kolumbus.fi Fixes: `9604eea5bd` ("scsi: st: Add third party poweron reset handling") CC: stable@vger.kernel.org Closes: https://lore.kernel.org/linux-scsi/2201CF73-4795-4D3B-9A79-6EE5215CF58D@kolumbus.fi/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-17 09:40:26 +01:00
Paul Menzel	495dcb00d4	scsi: mpt3sas: Set ioc->manu_pg11.EEDPTagMode directly to 1 [ Upstream commit `ad7c3c0cb8` ] Currently, the code does: if (x == 0) { x &= ~0x3; x |= 0x1; } Zeroing bits 0 and 1 of a variable that is 0 is not necessary. So directly set the variable to 1. Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Fixes: `f92363d123` ("[SCSI] mpt3sas: add new driver supporting 12GB SAS") Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/r/20241212221817.78940-2-pmenzel@molgen.mpg.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-02-08 09:52:26 +01:00
Easwar Hariharan	088bde862f	scsi: storvsc: Ratelimit warning logs to prevent VM denial of service commit `d2138eab8c` upstream. If there's a persistent error in the hypervisor, the SCSI warning for failed I/O can flood the kernel log and max out CPU utilization, preventing troubleshooting from the VM side. Ratelimit the warning so it doesn't DoS the VM. Closes: https://github.com/microsoft/WSL/issues/9173 Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Link: https://lore.kernel.org/r/20250107-eahariha-ratelimit-storvsc-v1-1-7fc193d1f2b0@linux.microsoft.com Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-01 18:37:55 +01:00
Xiang Zhang	44c485f0fc	scsi: iscsi: Fix redundant response for ISCSI_UEVENT_GET_HOST_STATS request [ Upstream commit `63ca02221c` ] The ISCSI_UEVENT_GET_HOST_STATS request is already handled in iscsi_get_host_stats(). This fix ensures that redundant responses are skipped in iscsi_if_rx(). - On success: send reply and stats from iscsi_get_host_stats() within if_recv_msg(). - On error: fall through. Signed-off-by: Xiang Zhang <hawkxiang.cpp@gmail.com> Link: https://lore.kernel.org/r/20250107022432.65390-1-hawkxiang.cpp@gmail.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-02-01 18:37:51 +01:00
Yihang Li	9722973ad0	scsi: hisi_sas: Remove redundant checks for automatic debugfs dump commit `3f03055047` upstream. In commit `63f0733d07` ("scsi: hisi_sas: Allocate DFX memory during dump trigger"), the memory allocation time of the DFX is changed from device initialization to dump occurs, so .debugfs_itct is not a valid address and do not need to check. The parameter hisi_sas_debugfs_enable is enough to check whether automatic debugfs dump is triggered, so remove redunant checks. Fixes: `63f0733d07` ("scsi: hisi_sas: Allocate DFX memory during dump trigger") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-09 13:32:09 +01:00
Yihang Li	a47f0b0314	scsi: hisi_sas: Fix a deadlock issue related to automatic dump [ Upstream commit `3c4f53b2c3` ] If we issue a disabling PHY command, the device attached with it will go offline, if a 2 bit ECC error occurs at the same time, a hung task may be found: [ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds. [ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.674809] task:kworker/u256:0 state:D stack: 0 pid:165233 ppid: 2 flags:0x00000208 [ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas] [ 4613.691518] Call trace: [ 4613.694678] __switch_to+0xf8/0x17c [ 4613.698872] __schedule+0x660/0xee0 [ 4613.703063] schedule+0xac/0x240 [ 4613.706994] schedule_timeout+0x500/0x610 [ 4613.711705] __down+0x128/0x36c [ 4613.715548] down+0x240/0x2d0 [ 4613.719221] hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main] [ 4613.726618] sas_execute_internal_abort+0x144/0x310 [libsas] [ 4613.732976] sas_execute_internal_abort_dev+0x44/0x60 [libsas] [ 4613.739504] hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main] [ 4613.747499] hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main] [ 4613.753682] sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas] [ 4613.759781] sas_unregister_common_dev+0x4c/0x7a0 [libsas] [ 4613.765962] sas_destruct_devices+0xb8/0x120 [libsas] [ 4613.771709] sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas] [ 4613.778930] sas_revalidate_domain+0x60/0xa4 [libsas] [ 4613.784716] process_one_work+0x248/0x950 [ 4613.789424] worker_thread+0x318/0x934 [ 4613.793878] kthread+0x190/0x200 [ 4613.797810] ret_from_fork+0x10/0x18 [ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds. [ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 4613.824538] task:kworker/u256:4 state:D stack: 0 pid:316722 ppid: 2 flags:0x00000208 [ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main] [ 4613.841491] Call trace: [ 4613.844647] __switch_to+0xf8/0x17c [ 4613.848852] __schedule+0x660/0xee0 [ 4613.853052] schedule+0xac/0x240 [ 4613.856984] schedule_timeout+0x500/0x610 [ 4613.861695] __down+0x128/0x36c [ 4613.865542] down+0x240/0x2d0 [ 4613.869216] hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main] [ 4613.876324] hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main] [ 4613.883019] process_one_work+0x248/0x950 [ 4613.887732] worker_thread+0x318/0x934 [ 4613.892204] kthread+0x190/0x200 [ 4613.896118] ret_from_fork+0x10/0x18 [ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds. [ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.922852] task:kworker/u256:1 state:D stack: 0 pid:348985 ppid: 2 flags:0x00000208 [ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas] [ 4613.939549] Call trace: [ 4613.942702] __switch_to+0xf8/0x17c [ 4613.946892] __schedule+0x660/0xee0 [ 4613.951083] schedule+0xac/0x240 [ 4613.955015] schedule_timeout+0x500/0x610 [ 4613.959725] wait_for_common+0x200/0x610 [ 4613.964349] wait_for_completion+0x3c/0x5c [ 4613.969146] flush_workqueue+0x198/0x790 [ 4613.973776] sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas] [ 4613.979960] sas_port_event_worker+0x54/0xa0 [libsas] [ 4613.985708] process_one_work+0x248/0x950 [ 4613.990420] worker_thread+0x318/0x934 [ 4613.994868] kthread+0x190/0x200 [ 4613.998800] ret_from_fork+0x10/0x18 This is because when the device goes offline, we obtain the hisi_hba semaphore and send the ABORT_DEV command to the device. However, the internal abort timed out due to the 2 bit ECC error and triggers automatic dump. In addition, since the hisi_hba semaphore has been obtained, the dump cannot be executed and the controller cannot be reset. 
Therefore, the deadlocks occur on the following circular dependencies: hisi_sas_dev_gone() -> down() -> hisi_sas_internal_task_abort_dev() -> ... -> hisi_sas_internal_abort_timeout() -> down(). The deadlock is triggered only when the timeout occurs during device goes offline. To fix this issue, use .rst_ha_timeout to distinguish the scenario where a device goes offline from other scenarios. Fixes: `2ff07b5c6f` ("scsi: hisi_sas: Directly call register snapshot instead of using workqueue") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-09 13:31:53 +01:00
Ranjan Kumar	f5a2042408	scsi: mpi3mr: Start controller indexing from 0 [ Upstream commit `0d32014f1e` ] Instead of displaying the controller index starting from '1' make the driver display the controller index starting from '0'. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-09 13:31:51 +01:00
Guixin Liu	d424303d8d	scsi: mpi3mr: Use ida to manage mrioc ID [ Upstream commit `29b75184f7` ] To ensure that the same ID is not obtained during concurrent execution of the probe, an ida is used to manage the mrioc's ID. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Link: https://lore.kernel.org/r/20231229040331.52518-1-kanie@linux.alibaba.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: `0d32014f1e` ("scsi: mpi3mr: Start controller indexing from 0") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-09 13:31:51 +01:00
Yihang Li	7c8c50c985	scsi: hisi_sas: Create all dump files during debugfs initialization [ Upstream commit `9f564f15f8` ] For the current debugfs of hisi_sas, after user triggers dump, the driver allocate memory space to save the register information and create debugfs files to display the saved information. In this process, the debugfs files created after each dump. Therefore, when the dump is triggered while the driver is unbind, the following hang occurs: [67840.853907] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0 [67840.862947] Mem abort info: [67840.865855] ESR = 0x0000000096000004 [67840.869713] EC = 0x25: DABT (current EL), IL = 32 bits [67840.875125] SET = 0, FnV = 0 [67840.878291] EA = 0, S1PTW = 0 [67840.881545] FSC = 0x04: level 0 translation fault [67840.886528] Data abort info: [67840.889524] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [67840.895117] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [67840.900284] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [67840.905709] user pgtable: 4k pages, 48-bit VAs, pgdp=0000002803a1f000 [67840.912263] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000 [67840.919177] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [67840.996435] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [67841.003628] pc : down_write+0x30/0x98 [67841.007546] lr : start_creating.part.0+0x60/0x198 [67841.012495] sp : ffff8000b979ba20 [67841.016046] x29: ffff8000b979ba20 x28: 0000000000000010 x27: 0000000000024b40 [67841.023412] x26: 0000000000000012 x25: ffff20202b355ae8 x24: ffff20202b35a8c8 [67841.030779] x23: ffffa36877928208 x22: ffffa368b4972240 x21: ffff8000b979bb18 [67841.038147] x20: ffff00281dc1e3c0 x19: fffffffffffffffe x18: 0000000000000020 [67841.045515] x17: 0000000000000000 x16: ffffa368b128a530 x15: ffffffffffffffff [67841.052888] x14: ffff8000b979bc18 x13: ffffffffffffffff x12: ffff8000b979bb18 [67841.060263] x11: 0000000000000000 x10: 0000000000000000 x9 : 
ffffa368b1289b18 [67841.067640] x8 : 0000000000000012 x7 : 0000000000000000 x6 : 00000000000003a9 [67841.075014] x5 : 0000000000000000 x4 : ffff002818c5cb00 x3 : 0000000000000001 [67841.082388] x2 : 0000000000000000 x1 : ffff002818c5cb00 x0 : 00000000000000a0 [67841.089759] Call trace: [67841.092456] down_write+0x30/0x98 [67841.096017] start_creating.part.0+0x60/0x198 [67841.100613] debugfs_create_dir+0x48/0x1f8 [67841.104950] debugfs_create_files_v3_hw+0x88/0x348 [hisi_sas_v3_hw] [67841.111447] debugfs_snapshot_regs_v3_hw+0x708/0x798 [hisi_sas_v3_hw] [67841.118111] debugfs_trigger_dump_v3_hw_write+0x9c/0x120 [hisi_sas_v3_hw] [67841.125115] full_proxy_write+0x68/0xc8 [67841.129175] vfs_write+0xd8/0x3f0 [67841.132708] ksys_write+0x70/0x108 [67841.136317] __arm64_sys_write+0x24/0x38 [67841.140440] invoke_syscall+0x50/0x128 [67841.144385] el0_svc_common.constprop.0+0xc8/0xf0 [67841.149273] do_el0_svc+0x24/0x38 [67841.152773] el0_svc+0x38/0xd8 [67841.156009] el0t_64_sync_handler+0xc0/0xc8 [67841.160361] el0t_64_sync+0x1a4/0x1a8 [67841.164189] Code: b9000882 d2800002 d2800023 f9800011 (c85ffc05) [67841.170443] ---[ end trace 0000000000000000 ]--- To fix this issue, create all directories and files during debugfs initialization. In this way, the driver only needs to allocate memory space to save information each time the user triggers dumping. Signed-off-by: Yihang Li <liyihang9@huawei.com> Link: https://lore.kernel.org/r/20241008021822.2617339-13-liyihang9@huawei.com Reviewed-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-09 13:31:44 +01:00
Yihang Li	0449286798	scsi: hisi_sas: Allocate DFX memory during dump trigger [ Upstream commit `63f0733d07` ] Currently, if CONFIG_SCSI_HISI_SAS_DEBUGFS_DEFAULT_ENABLE is enabled, the memory space used by DFX is allocated during device initialization, which occupies a large number of memory resources. The memory usage before and after the driver is loaded is as follows: Memory usage before the driver is loaded: $ free -m total used free shared buff/cache available Mem: 867352 2578 864037 11 735 861681 Swap: 4095 0 4095 Memory usage after the driver which include 4 HBAs is loaded: $ insmod hisi_sas_v3_hw.ko $ free -m total used free shared buff/cache available Mem: 867352 4760 861848 11 743 859495 Swap: 4095 0 4095 The driver with 4 HBAs connected will allocate about 110 MB of memory without enabling debugfs. Therefore, to avoid wasting memory resources, DFX memory is allocated during dump triggering. The dump may fail due to memory allocation failure. After this change, each dump costs about 10 MB of memory, and each dump lasts about 100 ms. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1694571327-78697-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: `9f564f15f8` ("scsi: hisi_sas: Create all dump files during debugfs initialization") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-09 13:31:44 +01:00
Yihang Li	91e035e98f	scsi: hisi_sas: Directly call register snapshot instead of using workqueue [ Upstream commit `2ff07b5c6f` ] Currently, register information dump is performed via workqueue, regardless of the trigger mode (automatic or manual). There is a delay in dumping register through workqueue, the exact register information at trigger time cannot be obtained. Call register snapshot directly instead of through a workqueue. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1694571327-78697-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: `9f564f15f8` ("scsi: hisi_sas: Create all dump files during debugfs initialization") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-09 13:31:44 +01:00
Cathy Avery	3556af9a68	scsi: storvsc: Do not flag MAINTENANCE_IN return of SRB_STATUS_DATA_OVERRUN as an error [ Upstream commit `b1aee7f034` ] This partially reverts commit `812fe6420a` ("scsi: storvsc: Handle additional SRB status values"). HyperV does not support MAINTENANCE_IN resulting in FC passthrough returning the SRB_STATUS_DATA_OVERRUN value. Now that SRB_STATUS_DATA_OVERRUN is treated as an error, multipath ALUA paths go into a faulty state as multipath ALUA submits RTPG commands via MAINTENANCE_IN. [ 3.215560] hv_storvsc 1d69d403-9692-4460-89f9-a8cbcc0f94f3: tag#230 cmd 0xa3 status: scsi 0x0 srb 0x12 hv 0xc0000001 [ 3.215572] scsi 1:0:0:32: alua: rtpg failed, result 458752 Make MAINTENANCE_IN return success to avoid the error path as is currently done with INQUIRY and MODE_SENSE. Suggested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Cathy Avery <cavery@redhat.com> Link: https://lore.kernel.org/r/20241127181324.3318443-1-cavery@redhat.com Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-02 10:32:05 +01:00
Ranjan Kumar	cf4bea16bb	scsi: mpt3sas: Diag-Reset when Doorbell-In-Use bit is set during driver load time [ Upstream commit `3f5eb062e8` ] Issue a Diag-Reset when the "Doorbell-In-Use" bit is set during the driver load/initialization. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110173341.11595-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-02 10:32:05 +01:00
Tomas Henzl	f50783148e	scsi: megaraid_sas: Fix for a potential deadlock [ Upstream commit `50740f4dc7` ] This fixes a 'possible circular locking dependency detected' warning CPU0 CPU1 ---- ---- lock(&instance->reset_mutex); lock(&shost->scan_mutex); lock(&instance->reset_mutex); lock(&shost->scan_mutex); Fix this by temporarily releasing the reset_mutex. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20240923174833.45345-1-thenzl@redhat.com Acked-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-02 10:32:04 +01:00
Magnus Lindholm	fbd7deb459	scsi: qla1280: Fix hw revision numbering for ISP1020/1040 [ Upstream commit `c064de86d2` ] Fix the hardware revision numbering for Qlogic ISP1020/1040 boards. HWMASK suggests that the revision number only needs four bits, this is consistent with how NetBSD does things in their ISP driver. Verified on a IPS1040B which is seen as rev 5 not as BIT_4. Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/20241113225636.2276-1-linmag7@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-02 10:32:04 +01:00

24476 Commits