Commit Graph

Eric Biggers
878d87fc68 crypto: skcipher - call cond_resched() directly
In skcipher_walk_done(), instead of calling crypto_yield(), which
requires a translation between flags, just call cond_resched()
directly.  This has the same effect.
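
A sketch of the substitution (the surrounding code in
skcipher_walk_done() is simplified here):

  /* Before: translate the walk flag into a crypto_yield() flag. */
  crypto_yield(walk->flags & SKCIPHER_WALK_SLEEP ?
               CRYPTO_TFM_REQ_MAY_SLEEP : 0);

  /* After: yield directly when sleeping is allowed. */
  if (walk->flags & SKCIPHER_WALK_SLEEP)
          cond_resched();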

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:33 +08:00
Eric Biggers
8b13c2239d crypto: skcipher - optimize initializing skcipher_walk fields
The helper functions like crypto_skcipher_blocksize() take in a pointer
to a tfm object, but they actually return properties of the algorithm.
As the Linux kernel is compiled with -fno-strict-aliasing, the compiler
has to assume that the writes to struct skcipher_walk could clobber the
tfm's pointer to its algorithm.  Thus it gets repeatedly reloaded in the
generated code.  Therefore, replace the use of these helper functions
with straightforward accesses to the struct fields.

Note that while *users* of the skcipher and aead APIs are supposed to
use the helper functions, this particular code is part of the API
*implementation* in crypto/skcipher.c, which already accesses the
algorithm struct directly in many cases.  So there is no reason to
prefer the helper functions here.
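
An illustration of the kind of change involved (a sketch, not the
exact lines from the patch):

  /* Before: the helper rereads tfm->base.__crt_alg, which the compiler
   * must assume the preceding writes to *walk may have clobbered. */
  walk->blocksize = crypto_skcipher_blocksize(tfm);

  /* After: read the algorithm struct's field directly. */
  walk->blocksize = tfm->base.__crt_alg->cra_blocksize;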

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:33 +08:00
Eric Biggers
f2489456fe crypto: skcipher - clean up initialization of skcipher_walk::flags
- Initialize SKCIPHER_WALK_SLEEP in a consistent way, and check for
  atomic=true at the same time as CRYPTO_TFM_REQ_MAY_SLEEP.  Technically
  atomic=true only needs to apply after the first step, but it is very
  rarely used, so optimize for the common case and check 'atomic'
  alongside CRYPTO_TFM_REQ_MAY_SLEEP (see the sketch after this list).

- Initialize flags other than SKCIPHER_WALK_SLEEP to 0 rather than
  preserving them.  No caller initializes the flags beforehand, so
  their original values cannot carry any meaning, and all meaningful
  flags get overwritten anyway.  It may have been thought that clearing
  just one flag would be faster than clearing all of them, but that is
  not the case: the former is a read-modify-write operation whereas the
  latter is just a write.

- Move the explicit clearing of SKCIPHER_WALK_SLOW, SKCIPHER_WALK_COPY,
  and SKCIPHER_WALK_DIFF into skcipher_walk_done(), since it is now
  only needed on non-first steps.
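
A sketch of the initialization described in the first point (the exact
code in the patch may differ):

  walk->flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) && !atomic ?
                SKCIPHER_WALK_SLEEP : 0;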

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:33 +08:00
Eric Biggers
d97d0668e8 crypto: skcipher - fold skcipher_walk_skcipher() into skcipher_walk_virt()
Fold skcipher_walk_skcipher() into skcipher_walk_virt() which is its
only remaining caller.  No change in behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:33 +08:00
Eric Biggers
24300d282f crypto: skcipher - remove redundant check for SKCIPHER_WALK_SLOW
In skcipher_walk_done(), remove the check for SKCIPHER_WALK_SLOW because
it is always true.  All other flags (and lack thereof) were checked
earlier in the function, leaving SKCIPHER_WALK_SLOW as the only
remaining possibility.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:32 +08:00
Eric Biggers
a22a2316be crypto: skcipher - remove redundant clamping to page size
In the case where skcipher_walk_next() allocates a bounce page, that
page by definition has size PAGE_SIZE.  The number of bytes to copy 'n'
is guaranteed to fit in it, since earlier in the function it was clamped
to be at most a page.  Therefore remove the unnecessary logic that tried
to clamp 'n' again to fit in the bounce page.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:32 +08:00
Eric Biggers
807c8018f5 crypto: skcipher - remove unnecessary page alignment of bounce buffer
In the slow path of skcipher_walk where it uses a slab bounce buffer for
the data and/or IV, do not bother to avoid crossing a page boundary in
the part(s) of this buffer that are used, and do not bother to allocate
extra space in the buffer for that purpose.  The buffer is accessed only
by virtual address, so pages are irrelevant for it.

This logic may have been present due to the physical address support in
skcipher_walk, but that has now been removed.  Or it may have been
present to be consistent with the fast path that currently does not hand
back addresses that span pages, but that behavior is a side effect of
the pages being "mapped" one by one and is not actually a requirement.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:32 +08:00
Eric Biggers
e71778c95a crypto: skcipher - document skcipher_walk_done() and rename some vars
skcipher_walk_done() has an unusual calling convention, and some of its
local variables have unclear names.  Document it and rename variables to
make it a bit clearer what is going on.  No change in behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:32 +08:00
Eric Biggers
42c5675c2f crypto: omap - switch from scatter_walk to plain offset
The omap driver was using struct scatter_walk, but only to maintain an
offset, rather than iterating through the virtual addresses of the data
contained in the scatterlist, which is what scatter_walk is intended for.
Make it just use a plain offset instead.  This is simpler and avoids
using struct scatter_walk in a way that is not well supported.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:32 +08:00
Eric Biggers
ee3c9c7e27 crypto: powerpc/p10-aes-gcm - simplify handling of linear associated data
p10_aes_gcm_crypt() is abusing the scatter_walk API to get the virtual
address for the first source scatterlist element.  But this code is only
built for PPC64, which is a !HIGHMEM platform, and it can read past a
page boundary from the address returned by scatterwalk_map() which means
it already assumes the address is from the kernel's direct map.  Thus,
just use sg_virt() instead to get the same result in a simpler way.
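
For reference, sg_virt() (from <linux/scatterlist.h>) is essentially:

  static inline void *sg_virt(struct scatterlist *sg)
  {
          return page_address(sg_page(sg)) + sg->offset;
  }

i.e. it resolves a scatterlist entry to its kernel virtual address,
which is exactly the direct-map assumption this code already makes.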

Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Danny Tsen <dtsen@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:38:32 +08:00
Krzysztof Kozlowski
1742b0a0e4 crypto: bcm - Drop unused setting of local 'ptr' variable
spum_cipher_req_init() assigns 'spu_hdr' to the local 'ptr' variable
and later increments 'ptr' over specific fields, as if it were meant to
point at pieces of the message for some purpose.  However, the code
never reads 'ptr', so this entire iteration over 'spu_hdr' seems
pointless.

Reported by clang W=1 build:

  drivers/crypto/bcm/spu.c:839:6: error: variable 'ptr' set but not used [-Werror,-Wunused-but-set-variable]

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:31:13 +08:00
Yang Shen
061b27e372 crypto: hisilicon/qm - support new function communication
In the HiSilicon accelerator drivers, the PF/VFs driver can send
messages to the VFs/PF by writing hardware registers, and the VFs/PF
driver receives messages from the PF/VFs by reading hardware registers.
To support this feature, a new version id is added, and different
communication mechanisms are used based on the version id.

Signed-off-by: Yang Shen <shenyang39@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:31:13 +08:00
Thorsten Blum
a268231678 crypto: proc - Use str_yes_no() and str_no_yes() helpers
Remove hard-coded strings by using the str_yes_no() and str_no_yes()
helpers. Remove unnecessary curly braces.
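
The helpers come from <linux/string_choices.h>.  A sketch of the kind
of replacement (the actual lines in crypto/proc.c may differ):

  /* Before: hard-coded strings. */
  seq_printf(m, "internal     : %s\n",
             (alg->cra_flags & CRYPTO_ALG_INTERNAL) ? "yes" : "no");

  /* After: */
  seq_printf(m, "internal     : %s\n",
             str_yes_no(alg->cra_flags & CRYPTO_ALG_INTERNAL));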

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-14 11:31:13 +08:00
Eric Biggers
7fa4817340 crypto: ahash - make hash walk functions private to ahash.c
Due to the removal of the Niagara2 SPU driver, crypto_hash_walk_first(),
crypto_hash_walk_done(), crypto_hash_walk_last(), and struct
crypto_hash_walk are now only used in crypto/ahash.c.  Therefore, make
them all private to crypto/ahash.c.  I.e. un-export the two functions
that were exported, make the functions static, and move the struct
definition to the .c file.  As part of this, move the functions to
earlier in the file to avoid needing to add forward declarations.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-04 08:53:47 +08:00
Thomas Weißschuh
9ff6e943bc padata: fix sysfs store callback check
padata_sysfs_store() was copied from padata_sysfs_show(), but the
callback check was not adapted and still tests the wrong handler.
Today there is no attribute which can fail this check, but if one is
added it may as well be correct.
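
A sketch of the corrected callback (modeled on kernel/padata.c; details
may differ):

  static ssize_t padata_sysfs_store(struct kobject *kobj,
                                    struct attribute *attr,
                                    const char *buf, size_t count)
  {
          struct padata_instance *pinst = kobj2pinst(kobj);
          struct padata_sysfs_entry *pentry = attr2pentry(attr);
          ssize_t ret = -EIO;

          if (pentry->store)      /* was: pentry->show, copied from _show() */
                  ret = pentry->store(pinst, attr, buf, count);
          return ret;
  }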

Fixes: 5e017dc3f8 ("padata: Added sysfs primitives to padata subsystem")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-04 08:53:47 +08:00
Eric Biggers
730f67d8b8 crypto: keywrap - remove unused keywrap algorithm
The keywrap (kw) algorithm has no in-tree user.  It has never had an
in-tree user, and the patch that added it provided no justification for
its inclusion.  Even use of it via AF_ALG is impossible, as it uses a
weird calling convention where part of the ciphertext is returned via
the IV buffer, which is not returned to userspace in AF_ALG.

It's also unclear whether any new code in the kernel that does key
wrapping would actually use this algorithm.  It is controversial in the
cryptographic community due to having no clearly stated security goal,
no security proof, poor performance, and only a 64-bit auth tag.  Later
work (https://eprint.iacr.org/2006/221) suggested that the goal is
deterministic authenticated encryption.  But there are now more modern
algorithms for this, and this is not the same as key wrapping, for which
a regular AEAD such as AES-GCM usually can be (and is) used instead.

Therefore, remove this unused code.

There were several special cases for this algorithm in the self-tests,
due to its weird calling convention.  Remove those too.

Cc: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-04 08:53:47 +08:00
Eric Biggers
2890601f54 crypto: vmac - remove unused VMAC algorithm
Remove the vmac64 template, as it has no known users.  It also continues
to have longstanding bugs such as alignment violations (see
https://lore.kernel.org/r/20241226134847.6690-1-evepolonium@gmail.com/).

This code was added in 2009 by commit f1939f7c56 ("crypto: vmac - New
hash algorithm for intel_txt support").  Based on the mention of
intel_txt support in the commit title, it seems it was added as a
prerequisite for the contemporaneous patch
"intel_txt: add s3 userspace memory integrity verification"
(https://lore.kernel.org/r/4ABF2B50.6070106@intel.com/).  In the design
proposed by that patch, when an Intel Trusted Execution Technology (TXT)
enabled system resumed from suspend, the "tboot" trusted executable
launched the Linux kernel without verifying userspace memory, and then
the Linux kernel used VMAC to verify userspace memory.

However, that patch was never merged, as reviewers had objected to the
design.  It was later reworked into commit 4bd96a7a81 ("x86, tboot:
Add support for S3 memory integrity protection") which made tboot verify
the memory instead.  Thus the VMAC support in Linux was never used.

No in-tree user has appeared since then, other than potentially the
usual components that allow specifying arbitrary hash algorithms by
name, namely AF_ALG and dm-integrity.  However there are no indications
that VMAC is being used with these components.  Debian Code Search and
web searches for "vmac64" (the actual algorithm name) do not return any
results other than the kernel itself, suggesting that it does not appear
in any other code or documentation.  Explicitly grepping the source code
of the usual suspects (libell, iwd, cryptsetup) finds no matches either.

Before 2018, the vmac code was also completely broken due to using a
hardcoded nonce and the wrong endianness for the MAC.  It was then fixed
by commit ed331adab3 ("crypto: vmac - add nonced version with big
endian digest") and commit 0917b87312 ("crypto: vmac - remove insecure
version with hardcoded nonce").  These were intentionally breaking
changes that changed all the computed MAC values as well as the
algorithm name ("vmac" to "vmac64").  No complaints were ever received
about these breaking changes, strongly suggesting the absence of users.

The reason I had put some effort into fixing this code in 2018 is
because it was used by an out-of-tree driver.  But if it is still needed
in that particular out-of-tree driver, the code can be carried in that
driver instead.  There is no need to carry it upstream.

Cc: Atharva Tiwari <evepolonium@gmail.com>
Cc: Shane Wang <shane.wang@intel.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-04 08:52:03 +08:00
Md Sadre Alam
eb680160cf dt-bindings: crypto: qcom,prng: document ipq9574, ipq5424 and ipq5322
Document the ipq9574, ipq5424 and ipq5322 compatibles for the True
Random Number Generator.

Signed-off-by: Md Sadre Alam <quic_mdalam@quicinc.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-04 08:52:03 +08:00
Thorsten Blum
8f904adef6 crypto: fips - Use str_enabled_disabled() helper in fips_enable()
Remove hard-coded strings by using the str_enabled_disabled() helper
function.

Use pr_info() instead of printk(KERN_INFO) to silence a checkpatch
warning.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-04 08:52:03 +08:00
Kanchana P Sridhar
4ebd9a5ca4 crypto: iaa - Fix IAA disabling that occurs when sync_mode is set to 'async'
With the latest mm-unstable, setting the iaa_crypto sync_mode to 'async'
causes crypto testmgr.c test_acomp() failures with dmesg call traces,
and leaves zswap unable to use 'deflate-iaa' as a compressor:

echo async > /sys/bus/dsa/drivers/crypto/sync_mode

[  255.271030] zswap: compressor deflate-iaa not available
[  369.960673] INFO: task cryptomgr_test:4889 blocked for more than 122 seconds.
[  369.970127]       Not tainted 6.13.0-rc1-mm-unstable-12-16-2024+ #324
[  369.977411] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.986246] task:cryptomgr_test  state:D stack:0     pid:4889  tgid:4889  ppid:2      flags:0x00004000
[  369.986253] Call Trace:
[  369.986256]  <TASK>
[  369.986260]  __schedule+0x45c/0xfa0
[  369.986273]  schedule+0x2e/0xb0
[  369.986277]  schedule_timeout+0xe7/0x100
[  369.986284]  ? __prepare_to_swait+0x4e/0x70
[  369.986290]  wait_for_completion+0x8d/0x120
[  369.986293]  test_acomp+0x284/0x670
[  369.986305]  ? __pfx_cryptomgr_test+0x10/0x10
[  369.986312]  alg_test_comp+0x263/0x440
[  369.986315]  ? sched_balance_newidle+0x259/0x430
[  369.986320]  ? __pfx_cryptomgr_test+0x10/0x10
[  369.986323]  alg_test.part.27+0x103/0x410
[  369.986326]  ? __schedule+0x464/0xfa0
[  369.986330]  ? __pfx_cryptomgr_test+0x10/0x10
[  369.986333]  cryptomgr_test+0x20/0x40
[  369.986336]  kthread+0xda/0x110
[  369.986344]  ? __pfx_kthread+0x10/0x10
[  369.986346]  ret_from_fork+0x2d/0x40
[  369.986355]  ? __pfx_kthread+0x10/0x10
[  369.986358]  ret_from_fork_asm+0x1a/0x30
[  369.986365]  </TASK>

This happens because the only polling-without-interrupts mode that
iaa_crypto currently implements is 'sync'.  With 'async', iaa_crypto's
compress/decompress calls submit the descriptor and return -EINPROGRESS,
without any mechanism in the driver to poll for completions.  Hence
callers such as test_acomp() in crypto/testmgr.c or zswap, which wrap
the calls to crypto_acomp_compress() and crypto_acomp_decompress() in
synchronous wrappers, will block indefinitely.  Even before zswap can
notice this problem, crypto testmgr.c's test_acomp() will fail and
prevent registration of "deflate-iaa" as a valid crypto acomp
algorithm, thereby disallowing the use of "deflate-iaa" as a zswap
compressor (zswap will fall back to the default compressor in this
case).

To fix this issue, this patch modifies the iaa_crypto sync_mode set
function to treat 'async' as equivalent to 'sync'.  This enables the
driver's only supported mode (async polling without interrupts) and
lets zswap use 'deflate-iaa' as the compressor.
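
A sketch of the modified set function (names as in iaa_crypto; the
exact code may differ):

  static int set_iaa_sync_mode(const char *name)
  {
          if (sysfs_streq(name, "sync") || sysfs_streq(name, "async")) {
                  /* 'async' now behaves like 'sync': poll, no irqs */
                  async_mode = false;
                  use_irq = false;
          } else if (sysfs_streq(name, "async_irq")) {
                  async_mode = true;
                  use_irq = true;
          } else {
                  return -EINVAL;
          }
          return 0;
  }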

Hence, with this patch, this is what will happen:

echo async > /sys/bus/dsa/drivers/crypto/sync_mode
cat /sys/bus/dsa/drivers/crypto/sync_mode
sync

There are no crypto/testmgr.c test_acomp() errors, no call traces and zswap
can use 'deflate-iaa' without any errors. The iaa_crypto documentation has
also been updated to mention this caveat with 'async' and what to expect
with this fix.

True iaa_crypto async polling without interrupts is enabled in patch
"crypto: iaa - Implement batch_compress(), batch_decompress() API in
iaa_crypto." [1] which is under review as part of the "zswap IAA compress
batching" patch-series [2]. Until this is merged, we would appreciate it if
this current patch can be considered for a hotfix.

[1]: https://patchwork.kernel.org/project/linux-mm/patch/20241221063119.29140-5-kanchana.p.sridhar@intel.com/
[2]: https://patchwork.kernel.org/project/linux-mm/list/?series=920084

Fixes: 09646c98d ("crypto: iaa - Add irq support for the crypto async interface")
Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-28 19:49:23 +08:00
Herbert Xu
de662429f3 crypto: lib/aesgcm - Reduce stack usage in libaesgcm_init
The stack frame in libaesgcm_init triggers a size warning on x86-64.
Reduce it by making buf static.
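
The change is of this shape (a sketch; the buffer's real name and size
come from the selftest data):

  /* was: u8 buf[MAX_TEST_SIZE]; allocated on libaesgcm_init()'s stack */
  static u8 buf[MAX_TEST_SIZE];   /* now in .bss; safe because the
                                     init function runs only once */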

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-28 19:49:22 +08:00
Nathan Chancellor
7b6092ee7a crypto: qce - revert "use __free() for a buffer that's always freed"
Commit ce8fd0500b ("crypto: qce - use __free() for a buffer that's
always freed") introduced a buggy use of __free(), which clang
rightfully points out:

  drivers/crypto/qce/sha.c:365:3: error: cannot jump from this goto statement to its label
    365 |                 goto err_free_ahash;
        |                 ^
  drivers/crypto/qce/sha.c:373:6: note: jump bypasses initialization of variable with __attribute__((cleanup))
    373 |         u8 *buf __free(kfree) = kzalloc(keylen + QCE_MAX_ALIGN_SIZE,
        |             ^

Jumping over a variable declared with the cleanup attribute does not
prevent the cleanup function from running; instead, the cleanup function
is called with an uninitialized value.
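
A minimal illustration of that hazard (hypothetical code using
<linux/cleanup.h> and <linux/slab.h>, not the driver's):

  static int demo(size_t len, bool fail_early)
  {
          if (fail_early)
                  goto out;       /* bypasses buf's initialization */

          u8 *buf __free(kfree) = kzalloc(len, GFP_KERNEL);
          if (!buf)
                  return -ENOMEM;
  out:
          return 0;               /* buf's kfree() cleanup still runs,
                                     on an uninitialized pointer */
  }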

Moving the declaration back to the top function with __free() and a NULL
initialization would resolve the bug but that is really not much
different from the original code. Since the function is so simple and
there is no functional reason to use __free() here, just revert the
original change to resolve the issue.

Fixes: ce8fd0500b ("crypto: qce - use __free() for a buffer that's always freed")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/CA+G9fYtpAwXa5mUQ5O7vDLK2xN4t-kJoxgUe1ZFRT=AGqmLSRA@mail.gmail.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Joe Hattori
472a989029 crypto: ixp4xx - fix OF node reference leaks in init_ixp_crypto()
init_ixp_crypto() calls of_parse_phandle_with_fixed_args() multiple
times, but does not release all the obtained refcounts. Fix it by adding
of_node_put() calls.
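
The fix follows the standard pattern: each successful call returns a
reference on args.np that the caller owns.  A sketch (the property name
is illustrative):

  struct of_phandle_args queue_spec;
  int ret;

  ret = of_parse_phandle_with_fixed_args(np, "queue-rx", 1, 0,
                                         &queue_spec);
  if (ret)
          return ret;
  recv_qid = queue_spec.args[0];
  of_node_put(queue_spec.np);     /* drop the reference taken above */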

This bug was found by an experimental static analysis tool that I am
developing.

Fixes: 76f24b4f46 ("crypto: ixp4xx - Add device tree support")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Wenkai Lin
a5a9d95993 crypto: hisilicon/sec2 - fix for aead invalid authsize
When the digest alg is HMAC-SHAx or another, the authsize may be less
than 4 bytes and mac_len of the BD is set to zero, the hardware considers
it a BD configuration error and reports a ras error, so the sec driver
needs to switch to software calculation in this case, this patch add a
check for it and remove unnecessary check that has been done by crypto.

Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Wenkai Lin
fd337f852b crypto: hisilicon/sec2 - fix for aead icv error
When the AEAD algorithm is used for encryption or decryption, the input
authentication length varies, and the hardware needs the actual input
length to pass the integrity check.  Currently, the driver uses a fixed
authentication length, which causes decryption failures, so the length
configuration is modified.  In addition, the step of setting the auth
length in the setkey function is unnecessary, so it is removed.

Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Eric Biggers
3cd46a78ee crypto: x86/aes-xts - additional optimizations
Reduce latency by taking advantage of the property vaesenclast(key, a)
^ b == vaesenclast(key ^ b, a), like I did in the AES-GCM code.  (The
identity holds because the last AES round ends with a plain XOR of the
round key, so folding b into the key is equivalent to folding it into
the result.)

Also replace a vpand and vpxor with a vpternlogd.

On AMD Zen 5 this improves performance by about 3%.  Intel performance
remains about the same, with a 0.1% improvement being seen on Icelake.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Eric Biggers
68e95f5c64 crypto: x86/aes-xts - more code size optimizations
Prefer immediates of -128 to 128, since the former fits in a signed
byte, saving 3 bytes per instruction.  Also prefer VEX-coded
instructions to EVEX where this is easy to do.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Eric Biggers
77a4b5675b crypto: x86/aes-xts - change len parameter to int
The AES-XTS assembly code currently treats the length as signed, since
this saves a few instructions in the loop compared to treating it as
unsigned.  Therefore update the type to make this clear.  (It is not
actually passed any values larger than PAGE_SIZE.)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Eric Biggers
bd7e7df6e6 crypto: x86/aes-xts - improve some comments
Improve some of the comments in aes-xts-avx-x86_64.S.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Eric Biggers
d1bb1c32f9 crypto: x86/aes-xts - make the register aliases per-function
Since aes-xts-avx-x86_64.S contains multiple functions, move the
register aliases for the parameters and local variables of the XTS
update function into the macro that generates that function.  Then add
register aliases to aes_xts_encrypt_iv() to improve readability there.
This makes aes-xts-avx-x86_64.S consistent with the GCM assembly files.

No change in the generated code.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Eric Biggers
5b7981c1ca crypto: x86/aes-xts - use .irp when useful
Use .irp instead of repeating code.

No change in the generated code.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Eric Biggers
95791ccd11 crypto: x86/aes-gcm - tune better for AMD CPUs
Reorganize the main loop to free up the RNDKEYLAST[0-3] registers and
use them for more cached round keys.  This improves performance by about
2% on AMD Zen 4 and Zen 5.  Intel performance remains about the same.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Eric Biggers
3cae5a3c05 crypto: x86/aes-gcm - code size optimization
Prefer immediates of -128 to 128, since the former fits in a signed
byte, saving 3 bytes per instruction.  Also replace a vpand and vpxor
with a vpternlogd.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Dr. David Alan Gilbert
b9b894642f crypto: lib/gf128mul - Remove some bbe deadcode
gf128mul_4k_bbe(), gf128mul_bbe() and gf128mul_init_4k_bbe() are part
of the library originally added in 2006 by commit c494e0705d
("[CRYPTO] lib: table driven multiplications in GF(2^128)") but have
never been used.

Remove them.

(BBE stands for "big endian bytes, big endian bits".  Note that the
64k-table BBE version is still used, so it has been left in.)

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 22:46:24 +08:00
Breno Leitao
e1d3422c95 rhashtable: Fix potential deadlock by moving schedule_work outside lock
Move the hash table growth check and work scheduling outside the
rht lock to prevent a possible circular locking dependency.

The original implementation could trigger a lockdep warning due to
a potential deadlock scenario involving nested locks between
rhashtable bucket, rq lock, and dsq lock.  By relocating the growth
check and work scheduling after releasing the rht lock, we break this
potential deadlock chain.

This change expands the flexibility of rhashtable by removing
restrictive locking that previously limited its use in scheduler
and workqueue contexts.

It is important to note that this calls rht_grow_above_75(), which
reads from struct rhashtable without holding the lock.  If this turns
out to be a problem, we can move the check back under the lock and
schedule the workqueue after the lock is released.
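
A conceptual sketch of the reordering (bucket locking simplified; the
real code lives in the rhashtable insert slow path):

  /* Before: scheduled while the bucket lock is held. */
  lock(bucket);
  insert_entry();
  atomic_inc(&ht->nelems);
  if (rht_grow_above_75(ht, tbl))
          schedule_work(&ht->run_work);   /* nested under the lock */
  unlock(bucket);

  /* After: drop the bucket lock first, then account and maybe grow. */
  lock(bucket);
  insert_entry();
  unlock(bucket);
  atomic_inc(&ht->nelems);
  if (rht_grow_above_75(ht, tbl))
          schedule_work(&ht->run_work);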

Fixes: f0e1a0643a ("sched_ext: Implement BPF extensible scheduler class")
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>

Modified so that atomic_inc is also moved outside of the bucket
lock along with the growth above 75% check.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-21 17:05:29 +08:00
Eric Biggers
f916e44487 crypto: keywrap - remove assignment of 0 to cra_alignmask
Since this code is zero-initializing the algorithm struct, the
assignment of 0 to cra_alignmask is redundant.  Remove it to reduce the
number of matches that are found when grepping for cra_alignmask.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:44 +08:00
Eric Biggers
5478ced478 crypto: aegis - remove assignments of 0 to cra_alignmask
Struct fields are zero by default, so these lines of code have no
effect.  Remove them to reduce the number of matches that are found when
grepping for cra_alignmask.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:44 +08:00
Eric Biggers
a6185842d1 crypto: x86 - remove assignments of 0 to cra_alignmask
Struct fields are zero by default, so these lines of code have no
effect.  Remove them to reduce the number of matches that are found when
grepping for cra_alignmask.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:44 +08:00
Eric Biggers
047ea6d85e crypto: seed - stop using cra_alignmask
Instead of specifying a nonzero alignmask, use the unaligned access
helpers.  This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.
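
The typical shape of this conversion, using the helpers from
<linux/unaligned.h> (a sketch; seed.c's actual code differs in detail):

  /* Before: the cast relies on cra_alignmask keeping 'in' aligned. */
  const __be32 *src = (const __be32 *)in;
  u32 x = be32_to_cpu(src[0]);

  /* After: no alignment requirement on 'in'. */
  u32 x = get_unaligned_be32(in);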

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:44 +08:00
Eric Biggers
7e0061586f crypto: khazad - stop using cra_alignmask
Instead of specifying a nonzero alignmask, use the unaligned access
helpers.  This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:44 +08:00
Eric Biggers
5e252f490c crypto: tea - stop using cra_alignmask
Instead of specifying a nonzero alignmask, use the unaligned access
helpers.  This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Eric Biggers
6c178fd66b crypto: aria - stop using cra_alignmask
Instead of specifying a nonzero alignmask, use the unaligned access
helpers.  This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Eric Biggers
8d90528228 crypto: anubis - stop using cra_alignmask
Instead of specifying a nonzero alignmask, use the unaligned access
helpers.  This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Eric Biggers
07d58e0a60 crypto: skcipher - remove support for physical address walks
Since the physical address support in skcipher_walk is not used anymore,
remove all the code associated with it.  This includes:

- The skcipher_walk_async() and skcipher_walk_complete() functions;

- The SKCIPHER_WALK_PHYS flag and everything conditional on it;

- The buffers, phys, and virt.page fields in struct skcipher_walk;

- struct skcipher_walk_buffer.

As a result, skcipher_walk now just supports virtual addresses.
Physical address support in skcipher_walk is unneeded because drivers
that need physical addresses just use the scatterlists directly.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Eric Biggers
9cda46babd crypto: n2 - remove Niagara2 SPU driver
Remove the driver for the Stream Processing Unit (SPU) on the Niagara 2.

Removing this driver allows removing the support for physical address
walks in skcipher_walk.  That is a misfeature that is used only by this
driver and increases the overhead of the crypto API for everyone else.

There is little evidence that anyone cares about this driver.  The
Niagara 2, a.k.a. the UltraSPARC T2, is a server CPU released in
2007.  The SPU is also present on the SPARC T3, released in 2010.
However, the SPU went away in SPARC T4, released in 2012, which replaced
it with proper cryptographic instructions instead.  These newer
instructions are supported by the kernel in arch/sparc/crypto/.

This driver was completely broken from (at least) 2015 to 2022, from
commit 8996eafdcb ("crypto: ahash - ensure statesize is non-zero") to
commit 76a4e87459 ("crypto: n2 - add missing hash statesize"), since
its probe function always returned an error before registering any
algorithms.  Though, even with that obvious issue fixed, it is unclear
whether the driver now works correctly.  E.g., there are no indications
that anyone has run the self-tests recently.

One bug report for this driver in 2017
(https://lore.kernel.org/r/nycvar.YFH.7.76.1712110214220.28416@n3.vanv.qr)
complained that it crashed the kernel while being loaded.  The reporter
didn't seem to care about the functionality of the driver, but rather
just the fact that loading it crashed the kernel.  In fact not until
2022 was the driver fixed to maybe actually register its algorithms with
the crypto API.  The 2022 fix does have a Reported-by and Tested-by, but
that may similarly have been just about making the error messages go
away as opposed to someone actually wanting to use the driver.

As such, it seems appropriate to retire this driver in mainline.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Eric Biggers
49b9258b05 crypto: qce - fix priority to be less than ARMv8 CE
As QCE is an order of magnitude slower than the ARMv8 Crypto Extensions
on the CPU, and is also less well tested, give it a lower priority.
Previously the QCE SHA algorithms had higher priority than the ARMv8 CE
equivalents, and ciphers such as AES-XTS had the same priority, which
meant the QCE versions were chosen if they happened to be loaded later.
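
Algorithm lookup resolves a given cra_name to the registered
implementation with the highest cra_priority, so the fix amounts to
lowering the QCE values below the ARMv8 CE ones, e.g. (values
illustrative):

  .base.cra_priority = 300,     /* arm64 Crypto Extensions driver */
  .base.cra_priority = 275,     /* qce after this patch: strictly lower */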

Fixes: ec8f5d8f6f ("crypto: qce - Qualcomm crypto engine driver")
Cc: stable@vger.kernel.org
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Cc: Thara Gopinath <thara.gopinath@gmail.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Mario Limonciello
f1e532d05a crypto: ccp - Use scoped guard for mutex
Use a scoped guard to simplify the cleanup handling.
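
A sketch of the pattern with <linux/cleanup.h>'s guard (simplified from
the ccp code; error paths elided):

  static int sev_do_cmd(int cmd, void *data, int *psp_ret)
  {
          guard(mutex)(&sev_cmd_mutex);   /* unlocked at scope exit */
          return __sev_do_cmd_locked(cmd, data, psp_ret);
  }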

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Bartosz Golaszewski
3382c44f0c crypto: qce - switch to using a mutex
Having switched to workqueue from tasklet, we are no longer limited to
atomic APIs and can now convert the spinlock to a mutex. This, along
with the conversion from tasklet to workqueue grants us ~15% improvement
in cryptsetup benchmarks for AES encryption.

While at it: use guards to simplify locking code.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Bartosz Golaszewski
eb7986e5e1 crypto: qce - convert tasklet to workqueue
There's nothing about the qce driver that requires running from a
tasklet. Switch to using the system workqueue.
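
The shape of the conversion (a sketch; the work item and handler names
are illustrative):

  /* Before: */
  tasklet_init(&qce->done_tasklet, qce_tasklet_req_done,
               (unsigned long)qce);
  tasklet_schedule(&qce->done_tasklet);

  /* After: */
  INIT_WORK(&qce->done_work, qce_req_done_work);
  schedule_work(&qce->done_work);       /* system workqueue, can sleep */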

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00
Bartosz Golaszewski
ce8fd0500b crypto: qce - use __free() for a buffer that's always freed
The buffer allocated in qce_ahash_hmac_setkey() is always freed before
returning, so use __free() to automate the freeing.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-12-14 17:21:43 +08:00