linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-26 18:43:33 -05:00

Author	SHA1	Message	Date
Paulo Alcantara	a13ca780af	smb: client: stop flooding dmesg in smb2_calc_signature() When having several mounts that share same credential and the client couldn't re-establish an SMB session due to an expired kerberos ticket or rotated password, smb2_calc_signature() will end up flooding dmesg when not finding SMB sessions to calculate signatures. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-09-26 18:15:06 -05:00
Enzo Matsumiya	f7025d8616	smb: client: allocate crypto only for primary server For extra channels, point ->secmech.{enc,dec} to the primary server ones. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-09-26 18:15:02 -05:00
Enzo Matsumiya	b0abcd65ec	smb: client: fix UAF in async decryption Doing an async decryption (large read) crashes with a slab-use-after-free way down in the crypto API. Reproducer: # mount.cifs -o ...,seal,esize=1 //srv/share /mnt # dd if=/mnt/largefile of=/dev/null ... [ 194.196391] ================================================================== [ 194.196844] BUG: KASAN: slab-use-after-free in gf128mul_4k_lle+0xc1/0x110 [ 194.197269] Read of size 8 at addr ffff888112bd0448 by task kworker/u77:2/899 [ 194.197707] [ 194.197818] CPU: 12 UID: 0 PID: 899 Comm: kworker/u77:2 Not tainted 6.11.0-lku-00028-gfca3ca14a17a-dirty #43 [ 194.198400] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014 [ 194.199046] Workqueue: smb3decryptd smb2_decrypt_offload [cifs] [ 194.200032] Call Trace: [ 194.200191] <TASK> [ 194.200327] dump_stack_lvl+0x4e/0x70 [ 194.200558] ? gf128mul_4k_lle+0xc1/0x110 [ 194.200809] print_report+0x174/0x505 [ 194.201040] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 194.201352] ? srso_return_thunk+0x5/0x5f [ 194.201604] ? __virt_addr_valid+0xdf/0x1c0 [ 194.201868] ? gf128mul_4k_lle+0xc1/0x110 [ 194.202128] kasan_report+0xc8/0x150 [ 194.202361] ? gf128mul_4k_lle+0xc1/0x110 [ 194.202616] gf128mul_4k_lle+0xc1/0x110 [ 194.202863] ghash_update+0x184/0x210 [ 194.203103] shash_ahash_update+0x184/0x2a0 [ 194.203377] ? __pfx_shash_ahash_update+0x10/0x10 [ 194.203651] ? srso_return_thunk+0x5/0x5f [ 194.203877] ? crypto_gcm_init_common+0x1ba/0x340 [ 194.204142] gcm_hash_assoc_remain_continue+0x10a/0x140 [ 194.204434] crypt_message+0xec1/0x10a0 [cifs] [ 194.206489] ? __pfx_crypt_message+0x10/0x10 [cifs] [ 194.208507] ? srso_return_thunk+0x5/0x5f [ 194.209205] ? srso_return_thunk+0x5/0x5f [ 194.209925] ? srso_return_thunk+0x5/0x5f [ 194.210443] ? srso_return_thunk+0x5/0x5f [ 194.211037] decrypt_raw_data+0x15f/0x250 [cifs] [ 194.212906] ? __pfx_decrypt_raw_data+0x10/0x10 [cifs] [ 194.214670] ? srso_return_thunk+0x5/0x5f [ 194.215193] smb2_decrypt_offload+0x12a/0x6c0 [cifs] This is because TFM is being used in parallel. Fix this by allocating a new AEAD TFM for async decryption, but keep the existing one for synchronous READ cases (similar to what is done in smb3_calc_signature()). Also remove the calls to aead_request_set_callback() and crypto_wait_req() since it's always going to be a synchronous operation. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-09-26 18:14:48 -05:00
David Howells	df9b455633	netfs: Fix write oops in generic/346 (9p) and generic/074 (cifs) In netfslib, a buffered writeback operation has a 'write queue' of folios that are being written, held in a linear sequence of folio_queue structs. The 'issuer' adds new folio_queues on the leading edge of the queue and populates each one progressively; the 'collector' pops them off the trailing edge and discards them and the folios they point to as they are consumed. The queue is required to always retain at least one folio_queue structure. This allows the queue to be accessed without locking and with just a bit of barriering. When a new subrequest is prepared, its ->io_iter iterator is pointed at the current end of the write queue and then the iterator is extended as more data is added to the queue until the subrequest is committed. Now, the problem is that the folio_queue at the leading edge of the write queue when a subrequest is prepared might have been entirely consumed - but not yet removed from the queue as it is the only remaining one and is preventing the queue from collapsing. So, what happens is that subreq->io_iter is pointed at the spent folio_queue, then a new folio_queue is added, and, at that point, the collector is at entirely at liberty to immediately delete the spent folio_queue. This leaves the subreq->io_iter pointing at a freed object. If the system is lucky, iterate_folioq() sees ->io_iter, sees the as-yet uncorrupted freed object and advances to the next folio_queue in the queue. In the case seen, however, the freed object gets recycled and put back onto the queue at the tail and filled to the end. This confuses iterate_folioq() and it tries to step ->next, which may be NULL - resulting in an oops. Fix this by the following means: (1) When preparing a write subrequest, make sure there's a folio_queue struct with space in it at the leading edge of the queue. A function to make space is split out of the function to append a folio so that it can be called for this purpose. (2) If the request struct iterator is pointing to a completely spent folio_queue when we make space, then advance the iterator to the newly allocated folio_queue. The subrequest's iterator will then be set from this. The oops could be triggered using the generic/346 xfstest with a filesystem on9P over TCP with cache=loose. The oops looked something like: BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page ... RIP: 0010:_copy_from_iter+0x2db/0x530 ... Call Trace: <TASK> ... p9pdu_vwritef+0x3d8/0x5d0 p9_client_prepare_req+0xa8/0x140 p9_client_rpc+0x81/0x280 p9_client_write+0xcf/0x1c0 v9fs_issue_write+0x87/0xc0 netfs_advance_write+0xa0/0xb0 netfs_write_folio.isra.0+0x42d/0x500 netfs_writepages+0x15a/0x1f0 do_writepages+0xd1/0x220 filemap_fdatawrite_wbc+0x5c/0x80 v9fs_mmap_vm_close+0x7d/0xb0 remove_vma+0x35/0x70 vms_complete_munmap_vmas+0x11a/0x170 do_vmi_align_munmap+0x17d/0x1c0 do_vmi_munmap+0x13e/0x150 __vm_munmap+0x92/0xd0 __x64_sys_munmap+0x17/0x20 do_syscall_64+0x80/0xe0 entry_SYSCALL_64_after_hwframe+0x71/0x79 This also fixed a similar-looking issue with cifs and generic/074. Fixes: `cd0277ed0c` ("netfs: Use new folio_queue data type and iterator instead of xarray iter") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com Closes: https://lore.kernel.org/oe-lkp/202409131438.3f225fbf-oliver.sang@intel.com Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kernel test robot <oliver.sang@intel.com> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Paulo Alcantara <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: v9fs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2024-09-26 17:45:20 -05:00
Linus Torvalds	ac34bb40f7	12 smb3 client fixes, and also an important netfs fix for cifs mtime write regression -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmb0mWUACgkQiiy9cAdy T1Fwbgv/Zoe5LZukUe4s87xO7IC73Wfn2UBUQmvDUtK1djRF3HrL1QOtXLnFfPb/ pFJTPiNljM/NPcpXAk+7qz1XFihkOwGNJOFFuQPNrwcDX4LLF35sqoeRij1qRkXn 06yLPQRBI2SQLehLqi/Avk4TEatber7uGZMXgOaLN54doiNY8kMYcsIgEQWoe15h muxCUoPopSokU5+s0H6ObDoXX10KS3ir/1ArmmZ8oh1be363ysye0bf6+mnVNr/P I5yiERdYrN+oo6ZzC0XjyYSp0SnCbu8jck2g5ydIKUyQ7gbiSE8XqCNVy6ALndxg URMlYtL+gVknmJk9NJcc8gVp79EZcdjUIbFSTQ1Pa8x++nQCBl9rge1AZ9G/zzY2 Ul6xIVoP5DNgcwXvMka+lJgAsoRgB5olcEBMdltaCpKCLjWNjyzvOzb+kP2L30IC /nPZJbVQSrdr3ropybapAlHLG57Jk1ad1QdaBEiu5ss528mSmKc+t288zPQKIhU5 Ogqr3CxB =nVf0 -----END PGP SIGNATURE----- Merge tag 'v6.12-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: "Most are from the recent SMB3.1.1 test event, and also an important netfs fix for a cifs mtime write regression - fix mode reported by stat of readonly directories and files - DFS (global namespace) related fixes - fixes for special file support via reparse points - mount improvement and reconnect fix - fix for noisy log message on umount - two netfs related fixes, one fixing a recent regression, and add new write tracepoint" * tag 'v6.12-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: netfs, cifs: Fix mtime/ctime update for mmapped writes cifs: update internal version number smb: client: print failed session logoffs with FYI cifs: Fix reversion of the iter in cifs_readv_receive(). smb3: fix incorrect mode displayed for read-only files smb: client: fix parsing of device numbers smb: client: set correct device number on nfs reparse points smb: client: propagate error from cifs_construct_tcon() smb: client: fix DFS failover in multiuser mounts cifs: Make the write_{enter,done,err} tracepoints display netfs info smb: client: fix DFS interlink failover smb: client: improve purging of cached referrals smb: client: avoid unnecessary reconnects when refreshing referrals	2024-09-26 09:20:19 -07:00
Linus Torvalds	5159938e10	Probes updates for v6.12: - uprobes: make trace_uprobe->nhit counter a per-CPU one This makes uprobe event's hit counter per-CPU for improving scalability on multi-core environment. - kprobes: Remove obsoleted declaration for init_test_probes Remove unused init_test_probes() from header. - Raw tracepoint probe supports raw tracepoint events on modules. The tracepoint events using fprobe were introduced in v6.5, but tracepoints can be compiled in modules. This supports such a case. This includes the following improvements. . tracepoint: add a function for iterating over all tracepoints in all modules. . tracepoint: Add a function for iterating over tracepoints in a module. . tracing/fprobe: Support raw tracepoint events on modules. . tracing/fprobe: Support raw tracepoints on future loaded modules. This allows user to add tracepoint events on modules which is not loaded yet. . selftests/tracing: Add a test for tracepoint events on modules. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmb0HXgACgkQ2/sHvwUr Pxs7AAf+K89Q7eyqKLP/oG5LGsnmWwhZHP26HTbGKh7mRaxGE+cf3l1O2lCMAgBt 0Y1J0sHkgRSnubmlPrgEMKKLOKVBwnvwBqbqO8Zw8L3GxMegG5YYsl3Y60Q0T6Gq xiL17sHILbb/yefUqnf6C3QHoSjR4aTMEaQSpux1tsCqG/sLeU7V6DZrWdM5t4Fl CvQDuy//UdQUKFTUC5XOc6lRbKr94ktp/VTxdHZLXa5u6p/slq8ISf9EA+Rrsjkp m+FtW8MpfcYt3K+hs0kV58F43XWeRt9F7OlLf+MlyCeRRQor4xvkVlV0iw6VcRG9 sXt6ml6AmyA2JWRzR5qSKYvMAsNVyA== =GYlS -----END PGP SIGNATURE----- Merge tag 'probes-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes updates from Masami Hiramatsu: - uprobes: make trace_uprobe->nhit counter a per-CPU one This makes uprobe event's hit counter per-CPU for improving scalability on multi-core environment - kprobes: Remove obsoleted declaration for init_test_probes Remove unused init_test_probes() from header - Raw tracepoint probe supports raw tracepoint events on modules: - add a function for iterating over all tracepoints in all modules - add a function for iterating over tracepoints in a module - support raw tracepoint events on modules - support raw tracepoints on future loaded modules - add a test for tracepoint events on modules" * tag 'probes-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: sefltests/tracing: Add a test for tracepoint events on modules tracing/fprobe: Support raw tracepoints on future loaded modules tracing/fprobe: Support raw tracepoint events on modules tracepoint: Support iterating tracepoints in a loading module tracepoint: Support iterating over tracepoints on modules kprobes: Remove obsoleted declaration for init_test_probes uprobes: turn trace_uprobe's nhit counter to be per-CPU one	2024-09-26 08:55:36 -07:00
Linus Torvalds	0181f8c809	virtio: features, fixes, cleanups Several new features here: virtio-balloon supports new stats vdpa supports setting mac address vdpa/mlx5 suspend/resume as well as MKEY ops are now faster virtio_fs supports new sysfs entries for queue info virtio/vsock performance has been improved Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmbz7ykPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpkk8H/A3vMRYXBzne9anezZLvADKS/CpX7v0DFEVj VfSMWXvYdUariYDyyb7pZsvK5QR22pE0pIaW6Kcgv9fNwq27M/H6g6NJk5ny8a7d 216AQs1J28pXPPY+q03fhf3SzE3yHP8aeD9lyiO9QJYfs9vjtoyZeBGt3a4IUSX4 ZeNBAx8xWTBcEDIIcZLdY1DNDTbZ4+qQ12Ln9IKq7D4xkE6l7Xh+HGdgTWTnDZ8P qEUUOmJTFKTQdOiVuU4NN3wzgHKWHdwKg0uWXo7ereYr3kYe3q//jCcLMv88a1x0 XP7NRBQg/rsErwTMdLz6ffyqXJs6lGGqNXzRfZKEwAvmnh/+zs4= =gNBq -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "Several new features here: - virtio-balloon supports new stats - vdpa supports setting mac address - vdpa/mlx5 suspend/resume as well as MKEY ops are now faster - virtio_fs supports new sysfs entries for queue info - virtio/vsock performance has been improved And fixes, cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (34 commits) vsock/virtio: avoid queuing packets when intermediate queue is empty vsock/virtio: refactor virtio_transport_send_pkt_work fw_cfg: Constify struct kobj_type vdpa/mlx5: Postpone MR deletion vdpa/mlx5: Introduce init/destroy for MR resources vdpa/mlx5: Rename mr_mtx -> lock vdpa/mlx5: Extract mr members in own resource struct vdpa/mlx5: Rename function vdpa/mlx5: Delete direct MKEYs in parallel vdpa/mlx5: Create direct MKEYs in parallel MAINTAINERS: add virtio-vsock driver in the VIRTIO CORE section virtio_fs: add sysfs entries for queue information virtio_fs: introduce virtio_fs_put_locked helper vdpa: Remove unused declarations vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ command vdpa/mlx5: Small improvement for change_num_qps() vdpa/mlx5: Keep notifiers during suspend but ignore vdpa/mlx5: Parallelize device resume vdpa/mlx5: Parallelize device suspend vdpa/mlx5: Use async API for vq modify commands ...	2024-09-26 08:43:17 -07:00
Linus Torvalds	11a299a793	for-6.12/block-20240925 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmb0T5AQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpnfHEADCXqmqZC+xr3sHZH9T1lz9KaFp1FjuBhCw bGpUgXQ9aLcqQUWJxmYVer8N2x2+Ds+xq4fm/rP1BfvNgRupqheHBwuLxSrz14EX lYmKZ+krMIPTDaLFewmEWflDwmZX0WFgV6nKTMLiO5BMeI4zXCkFGtwYFys2+Cdd 9zYCFPgGDZUR77Ws5PpyqPVz2MoiNtsjrGmHpEmNZ+rIDzlpVOYgYk27X9ZbvNxC /l0KTc9+ayAeG0Kx5jO+m6Hrj3I6ehvM9JZMgpS/tF/jtccD2oVkJFJDlU+Jciv6 BwVzgyDPGV7sXFT1fnSqDBYYwr/73nzNH0Gk8wn4Jg2LhjmVANVo9eQSOXDTYZI+ O4HfIHGTIrk75TQd4bhq3dqaylS78pKBI/eQJUli2UNoyLWMrMyE88yh2YJam2Fs vJ/MHGxvFRurYbAlqLr33nb3ajvpg+D7XuAYfqHPMc2ZUe28Kza50Dj+luNjfVCu 3qfR6qBlsdWuABtUS3vneB9jZp5jDnOpVfuBgtcAqIboUjehTXsI7If09Ex/mxLq O0KqNwBMfunPOKd5kGXlAgY8LRMfOhNaAAFBlXYUZB2eAadQnqVselTFvHMZkXo7 wH/l6trd+/Tf+7Rav0YduNIlpVr7IctC+A7ph4zPdIjQxFEySCrC7cvAjel29LyV zgWW0Mw/sA== =yiWu -----END PGP SIGNATURE----- Merge tag 'for-6.12/block-20240925' of git://git.kernel.dk/linux Pull more block updates from Jens Axboe: - Improve blk-integrity segment counting and merging (Keith) - NVMe pull request via Keith: - Multipath fixes (Hannes) - Sysfs attribute list NULL terminate fix (Shin'ichiro) - Remove problematic read-back (Keith) - Fix for a regression with the IO scheduler switching freezing from 6.11 (Damien) - Use a raw spinlock for sbitmap, as it may get called from preempt disabled context (Ming) - Cleanup for bd_claiming waiting, using var_waitqueue() rather than the bit waitqueues, as that more accurately describes that it does (Neil) - Various cleanups (Kanchan, Qiu-ji, David) * tag 'for-6.12/block-20240925' of git://git.kernel.dk/linux: nvme: remove CC register read-back during enabling nvme: null terminate nvme_tls_attrs nvme-multipath: avoid hang on inaccessible namespaces nvme-multipath: system fails to create generic nvme device lib/sbitmap: define swap_lock as raw_spinlock_t block: Remove unused blk_limits_io_{min,opt} drbd: Fix atomicity violation in drbd_uuid_set_bm() block: Fix elv_iosched_local_module handling of "none" scheduler block: remove bogus union block: change wait on bd_claiming to use a var_waitqueue blk-integrity: improved sg segment mapping block: unexport blk_rq_count_integrity_sg nvme-rdma: use request to get integrity segments scsi: use request to get integrity segments block: provide a request helper for user integrity segments blk-integrity: consider entire bio list for merging blk-integrity: properly account for segments blk-mq: set the nr_integrity_segments from bio blk-mq: unconditional nr_integrity_segments	2024-09-25 14:56:40 -07:00
Linus Torvalds	fe29393877	spi: Fixes for v6.12 Some driver specific fixes that came in during the merge window. Lorenzo Bianconi did some extra testing on the recently added arioha driver and found some issues, Alexander Dahl fixed some issues with signal delays in the Atmel QSPI driver and Jinjie Ruan has been fixing some nits with runtime PM cleanup. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmbz0NIACgkQJNaLcl1U h9B1KQf/fcq8ApnJCgD9EykeE9ekziI6PRwxc3XqI+DCws7CN2n4EbIsJxCeiygl d4vWcnBssQGxVv6IbSl3Vgqr6PSAxBxibpBgmANR0HJ8YLFjxoaDTk9ufnJevOEm pMgAtzvt3Ral1VwspKsz3puXEVoJaIoELfEQRB7D8XCX26ypD7+sZSN/AjMATp/N zla5mPjQfGIX7dOis1kaRCQ0ysxoEdZ9zsyFlGs+VZkFIhHbu3KTLhlfADruOa7k RKuVABAIJwqPz6Gew82xlh4hQPP+aecWNdwCSK6+4FZkZLgvUz2rx7evVj3U5+xd Hc3jQmoGnVMYqeuK4BC8T0frnlp6eA== =iJa9 -----END PGP SIGNATURE----- Merge tag 'spi-fix-v6.12-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "Some driver specific fixes that came in during the merge window. Lorenzo Bianconi did some extra testing on the recently added arioha driver and found some issues, Alexander Dahl fixed some issues with signal delays in the Atmel QSPI driver and Jinjie Ruan has been fixing some nits with runtime PM cleanup" * tag 'spi-fix-v6.12-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: atmel-quadspi: Avoid overwriting delay register settings spi: airoha: remove read cache in airoha_snand_dirmap_read() spi: spi-fsl-lpspi: Undo runtime PM changes at driver exit time spi: atmel-quadspi: Undo runtime PM changes at driver exit time spi: airoha: fix airoha_snand_{write,read}_data data_len estimation spi: airoha: fix dirmap_{read,write} operations	2024-09-25 14:49:34 -07:00
Linus Torvalds	b2149f948c	RTC for 6.12 New driver: - DFRobot SD2405AL Drivers: - stm32: add alarm A out and LSCO support - sun6i: disable automatic clock input switching - m48t59: set range -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEBqsFVZXh8s/0O5JiY6TcMGxwOjIFAmbzMJ8ACgkQY6TcMGxw OjIVpw//YU/pOMIjD7I+cVcedK8UNGSP5GKlpd/4rK8sN7U3MiowbnvpSQT8rHvG N8WaRUjKkZRV6+Yo8iX5JQXacGTFv/WFfqQDCr3QgnU934AGibQT4WAQCooc8IrP fKB3Wzljcdr5cZZCWL5IBDiB6P7oWRdbdAHvwQUeTDMbF8zFd2ortLKDdCdHykXd iiK8gDDh7iVt2eiQ8duhhvruQXMLmxc7GdNNkTG/hhO6kis4sMUuwNwf+ifuO0fD 00HOJr0wEJkje5CeTlcU+R2z2Nf3PMnQMMyQzrQnoUv0AEaTcUDx4MiNN9YZ7Kds IGkJpVkH5UCMT2RyvVuKVcL2OWA9m2RP63NVfq1iV+lmFRf7neJlQ4Ul/sYcMaL8 wEL8UwIx4ya61O7mIZT8wI4ntl0yNm+hJw5r2yrIX1Cb7BACK4g0PzK6SoOOsMwN xbDkMrFQlpQm5YNL1vSV7ojFYvBwbvJXJfU3WTxF9V0O5RgqFFUSn/mlYog/DCZD zenPXYBQo4tp6Jkmg2A8J5g1R/274PQf08k4IQo4HTqtkjHyNRMEILDnneFX8Q0r f5W0k1HaWeYR8k08n6XI8A12lAj232RoGvFAtwaxRrqaEM5I5plcwrZIsMcpLSfK z0qnreEs9OAL2ZxNUYNNpWyphT4X3IDDSpRkh0Vj0chwgu1IQwo= =sjE7 -----END PGP SIGNATURE----- Merge tag 'rtc-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "More conversions of DT bindings to yaml. There is one new driver, for the DFRobot SD2405AL and support for important features of the stm32 RTC. Summary: New driver: - DFRobot SD2405AL Drivers: - stm32: add alarm A out and LSCO support - sun6i: disable automatic clock input switching - m48t59: set range" * tag 'rtc-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: rtc: rc5t619: use proper module tables rtc: m48t59: set range dt-bindings: rtc: microcrystal,rv3028: add #clock-cells property rtc: m48t59: Remove division condition with direct comparison rtc: at91sam9: fix OF node leak in probe() error path rtc: sun6i: disable automatic clock input switching dt-bindings: rtc: Drop non-trivial duplicate compatibles dt-bindings: vendor-prefixes: Add DFRobot. dt-bindings: rtc: Add support for SD2405AL. rtc: Add driver for SD2405AL rtc: s35390a: Drop vendorless compatible string from match table rtc: twl: convert comma to semicolon dt-bindings: rtc: sprd,sc2731-rtc: convert to YAML rtc: stm32: add alarm A out feature rtc: stm32: add Low Speed Clock Output (LSCO) support rtc: stm32: add pinctrl and pinmux interfaces dt-bindings: rtc: stm32: describe pinmux nodes	2024-09-25 14:38:37 -07:00
Linus Torvalds	aa486552a1	memblock: updates for 6.12-rc1 * new memblock_estimated_nr_free_pages() helper to replace totalram_pages() which is less accurate when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set * fixes for memblock tests -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmbejv0QHHJwcHRAa2Vy bmVsLm9yZwAKCRA5A4Ymyw79kVVlB/4yOoCDvJyUocEY0/Zv5bdRGXlAI0Igp3VV E0rEpvIjTBWwp/KZziQ8zMFk5zL/Aqb081vRsCko0lh2wjD5tFgNWWJG/sryQ/tX vc88p83KEXxNy4QC1qCh8dvHGIZVuLQ8oWQ7QFuH2ResdOaLdcfnobcu6/W/pBE0 60/0bNdNgFPgnCpFIcWvGFOqZ10akhw4xYrwRsCKAQEeqeKyQE/DBFUvNrqkOuNG +4k71X/9mcuEDBKGRCf5XzCf7nwk4k8pzOc4xMeEhAaaV2uZdENfQuu1Av7nqRah zhYveo0Wd0cnGWORBT/ddzPDeBjdP2ZM9qR70yoSj2mQ7a3ixLfd =wtsK -----END PGP SIGNATURE----- Merge tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock updates from Mike Rapoport: - new memblock_estimated_nr_free_pages() helper to replace totalram_pages() which is less accurate when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set - fixes for memblock tests * tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: s390/mm: get estimated free pages by memblock api kernel/fork.c: get estimated free pages by memblock api mm/memblock: introduce a new helper memblock_estimated_nr_free_pages() memblock test: fix implicit declaration of function 'strscpy' memblock test: fix implicit declaration of function 'isspace' memblock test: fix implicit declaration of function 'memparse' memblock test: add the definition of __setup() memblock test: fix implicit declaration of function 'virt_to_phys' tools/testing: abstract two init.h into common include directory memblock tests: include export.h in linkage.h as kernel dose memblock tests: include memory_hotplug.h in mmzone.h as kernel dose	2024-09-25 11:35:19 -07:00
Linus Torvalds	eb5b0f9812	This includes the following changes related to sparc for v6.12: - Remove an unused variable for sparc32 -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQQfqfbgobF48oKMeq81AykqDLayywUCZvPE4hQcYW5kcmVhc0Bn YWlzbGVyLmNvbQAKCRA1AykqDLayy85TAP94zlZ0Ggy3y/lwfo6Z10s8pZz91DL4 yQqTETw/2FPM2QEAyUN8B7IgJpJVcZaOJnZFZLziQlqESrU+6GAzW3RZdgU= =pK/R -----END PGP SIGNATURE----- Merge tag 'sparc-for-6.12-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc Pull sparc32 update from Andreas Larsson: - Remove an unused variable for sparc32 * tag 'sparc-for-6.12-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc: arch/sparc: remove unused varible paddrbase in function leon_swprobe()	2024-09-25 11:21:06 -07:00
Linus Torvalds	4ffc458083	powerpc fixes for 6.12 #2 - Fix build error in vdso32 when building 64-bit with COMPAT=y and -Os. - Fix build error in pseries EEH when CONFIG_DEBUG_FS is not set. Thanks to: Christophe Leroy, Narayana Murty N, Christian Zigotzky, Ritesh Harjani. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRjvi15rv0TSTaE+SIF0oADX8seIQUCZvPfwQAKCRAF0oADX8se IeqKAPkBZJBXheDywa0aZ1hIKfwB2mWeAMgZLb1AhWeAGpJMBAEA9LtZbM4KhQHG r2JCuHtV7rDyTG04+97V/K4M2iQSsgg= =cygk -----END PGP SIGNATURE----- Merge tag 'powerpc-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix build error in vdso32 when building 64-bit with COMPAT=y and -Os - Fix build error in pseries EEH when CONFIG_DEBUG_FS is not set Thanks to Christophe Leroy, Narayana Murty N, Christian Zigotzky, and Ritesh Harjani. * tag 'powerpc-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries/eeh: move pseries_eeh_err_inject() outside CONFIG_DEBUG_FS block powerpc/vdso32: Fix use of crtsavres for PPC64	2024-09-25 11:17:25 -07:00
Linus Torvalds	e520813b2d	clang-format changes for v6.12 A routine update of the 'for_each' macro list. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmbzPJ4ACgkQGXyLc2ht IW3j2BAAt+zGl9DQWLRv6OjyVzt70bBkcdLpwSn5OdcPnpmyadMSPZqOKjpdR0gY e4rD59WPpB+CCZknMHX6f83suHtSPDxEVvEeKNGcaHQMkJhAa5i9M/eiivEPPMlV 9sEK3x573CeGdoskKdIbdZRAhHiOIbMCymxTiCgBo4T9jx/r4DJZxNnHwFmZOrRU nlZPxreW7pGBJqcEnPPOCF85qZx6ToDhmgSQicu2LCZiU7QU9hLJ2lf2JbNYNGvS AHoBIOdwASMqYaGjUykNlaMcwC9eAQ0OY9XxjHdY2D4vYnonQobU/hBFYOeQkNEm idrojx0Q26+lZJVbEQF//jyCjWG4iU8RzIBRJOc85KRY3t+JZIsIQQWytTjCniaR t0n7tpWzu/xXmdN+es/Zvnf3LiXn7boxAY85CfjZ8WkOAiVU11HkCzBuZDPON+kS Bmn1H+LRdA7vlBBXryNUSqyVyQxGK1r6TNFfTAU7lyxrVHpW5HnCu2UZjRwR22jI 7awS5ccRfXgZCxXhRzAQ51yVj15/qUMeJmggPfUAza3n5tgEb2WFllYgJ+tkDY+n nc9i7Y+OAr5wPe2n4FZPPXrQJsD5ZFvnd/4Dbc4fmPugi5kLtj5aSTWbDQfv3Kuw LcqqIaneD5k4MQii6h1sWyY98o0iQUS9hrpks8tUD4epmWTY7q8= =Zybb -----END PGP SIGNATURE----- Merge tag 'clang-format-6.12' of https://github.com/ojeda/linux Pull clang-format updates from Miguel Ojeda: "A routine update of the 'for_each' macro list" * tag 'clang-format-6.12' of https://github.com/ojeda/linux: clang-format: Update with v6.11-rc1's `for_each` macro list	2024-09-25 11:10:39 -07:00
Linus Torvalds	1f9c4a9967	Kbuild: make MODVERSIONS support depend on not being a compile test build Currently the Rust support is gated on not having MODVERSIONS enabled, and as a result an "allmodconfig" build will disable Rust build tests. While MODVERSIONS configurations are worth build testing, the feature is not actually meaningful unless you run the result, and I'd rather get build coverage of Rust than MODVERSIONS. So let's disable MODVERSIONS for build testing until the Rust side clears up. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-09-25 11:08:28 -07:00
Linus Torvalds	5701725692	Rust changes for v6.12 Toolchain and infrastructure: - Support 'MITIGATION_{RETHUNK,RETPOLINE,SLS}' (which cleans up objtool warnings), teach objtool about 'noreturn' Rust symbols and mimic '___ADDRESSABLE()' for 'module_{init,exit}'. With that, we should be objtool-warning-free, so enable it to run for all Rust object files. - KASAN (no 'SW_TAGS'), KCFI and shadow call sanitizer support. - Support 'RUSTC_VERSION', including re-config and re-build on change. - Split helpers file into several files in a folder, to avoid conflicts in it. Eventually those files will be moved to the right places with the new build system. In addition, remove the need to manually export the symbols defined there, reusing existing machinery for that. - Relax restriction on configurations with Rust + GCC plugins to just the RANDSTRUCT plugin. 'kernel' crate: - New 'list' module: doubly-linked linked list for use with reference counted values, which is heavily used by the upcoming Rust Binder. This includes 'ListArc' (a wrapper around 'Arc' that is guaranteed unique for the given ID), 'AtomicTracker' (tracks whether a 'ListArc' exists using an atomic), 'ListLinks' (the prev/next pointers for an item in a linked list), 'List' (the linked list itself), 'Iter' (an iterator over a 'List'), 'Cursor' (a cursor into a 'List' that allows to remove elements), 'ListArcField' (a field exclusively owned by a 'ListArc'), as well as support for heterogeneous lists. - New 'rbtree' module: red-black tree abstractions used by the upcoming Rust Binder. This includes 'RBTree' (the red-black tree itself), 'RBTreeNode' (a node), 'RBTreeNodeReservation' (a memory reservation for a node), 'Iter' and 'IterMut' (immutable and mutable iterators), 'Cursor' (bidirectional cursor that allows to remove elements), as well as an entry API similar to the Rust standard library one. - 'init' module: add 'write_[pin_]init' methods and the 'InPlaceWrite' trait. Add the 'assert_pinned!' macro. - 'sync' module: implement the 'InPlaceInit' trait for 'Arc' by introducing an associated type in the trait. - 'alloc' module: add 'drop_contents' method to 'BoxExt'. - 'types' module: implement the 'ForeignOwnable' trait for 'Pin<Box<T>>' and improve the trait's documentation. In addition, add the 'into_raw' method to the 'ARef' type. - 'error' module: in preparation for the upcoming Rust support for 32-bit architectures, like arm, locally allow Clippy lint for those. Documentation: - https://rust.docs.kernel.org has been announced, so link to it. - Enable rustdoc's "jump to definition" feature, making its output a bit closer to the experience in a cross-referencer. - Debian Testing now also provides recent Rust releases (outside of the freeze period), so add it to the list. MAINTAINERS: - Trevor is joining as reviewer of the "RUST" entry. And a few other small bits. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmbzNz4ACgkQGXyLc2ht IW3muA/9HcPL0QqVB5+SqSRqcatmrFU/wq8Oaa6Z/No0JaynqyikK+R1WNokUd/5 WpQi4PC1OYV+ekyAuWdkooKmaSqagH5r53XlezNw+cM5zo8y7p0otVlbepQ0t3Ky pVEmfDRIeSFXsKrg91BJUKyJf70TQlgSggDVCExlanfOjPz88C1+s3EcJ/XWYGKQ cRk/XDdbF5eNaldp2MriVF0fw7XktgIrmVzxt/z0lb4PE7RaCAnO6gSQI+90Vb2d zvyOYKS4AkqE3suFvDIIUlPUv+8XbACj0c4wvBZHH5uZGTbgWUffqygJ45GqChEt c4fS/+E8VaM1z0EvxNczC0nQkfLwkTc1mgbP+sG3VZJMPVCJ2zQan1/ond7GqCpw pt6uQaGvDsAvllm7sbiAIVaAY81icqyYWKfNBXLLEL7DhY5je5Wq+E83XQ8d5u5F EuapnZhW3y12d6UCsSe9bD8W45NFoWHPXky1TzT+whTxnX1yH9YsPXbJceGSbbgd Lw3GmUtZx2bVAMToVjNFD2lPA3OmPY1e2lk0jwzTuQrEXfnZYuzbjqs3YUijb7xR AlsWfIb0IHBwHWpB7da24ezqWP2VD4eaDdD8/+LmDSj6XLngxMNWRLKmXT000eTW vIFP9GJrvag2R3YFPhrurgGpRsp8HUTLtvcZROxp2JVQGQ7Z4Ww= =52BN -----END PGP SIGNATURE----- Merge tag 'rust-6.12' of https://github.com/Rust-for-Linux/linux Pull Rust updates from Miguel Ojeda: "Toolchain and infrastructure: - Support 'MITIGATION_{RETHUNK,RETPOLINE,SLS}' (which cleans up objtool warnings), teach objtool about 'noreturn' Rust symbols and mimic '___ADDRESSABLE()' for 'module_{init,exit}'. With that, we should be objtool-warning-free, so enable it to run for all Rust object files. - KASAN (no 'SW_TAGS'), KCFI and shadow call sanitizer support. - Support 'RUSTC_VERSION', including re-config and re-build on change. - Split helpers file into several files in a folder, to avoid conflicts in it. Eventually those files will be moved to the right places with the new build system. In addition, remove the need to manually export the symbols defined there, reusing existing machinery for that. - Relax restriction on configurations with Rust + GCC plugins to just the RANDSTRUCT plugin. 'kernel' crate: - New 'list' module: doubly-linked linked list for use with reference counted values, which is heavily used by the upcoming Rust Binder. This includes 'ListArc' (a wrapper around 'Arc' that is guaranteed unique for the given ID), 'AtomicTracker' (tracks whether a 'ListArc' exists using an atomic), 'ListLinks' (the prev/next pointers for an item in a linked list), 'List' (the linked list itself), 'Iter' (an iterator over a 'List'), 'Cursor' (a cursor into a 'List' that allows to remove elements), 'ListArcField' (a field exclusively owned by a 'ListArc'), as well as support for heterogeneous lists. - New 'rbtree' module: red-black tree abstractions used by the upcoming Rust Binder. This includes 'RBTree' (the red-black tree itself), 'RBTreeNode' (a node), 'RBTreeNodeReservation' (a memory reservation for a node), 'Iter' and 'IterMut' (immutable and mutable iterators), 'Cursor' (bidirectional cursor that allows to remove elements), as well as an entry API similar to the Rust standard library one. - 'init' module: add 'write_[pin_]init' methods and the 'InPlaceWrite' trait. Add the 'assert_pinned!' macro. - 'sync' module: implement the 'InPlaceInit' trait for 'Arc' by introducing an associated type in the trait. - 'alloc' module: add 'drop_contents' method to 'BoxExt'. - 'types' module: implement the 'ForeignOwnable' trait for 'Pin<Box<T>>' and improve the trait's documentation. In addition, add the 'into_raw' method to the 'ARef' type. - 'error' module: in preparation for the upcoming Rust support for 32-bit architectures, like arm, locally allow Clippy lint for those. Documentation: - https://rust.docs.kernel.org has been announced, so link to it. - Enable rustdoc's "jump to definition" feature, making its output a bit closer to the experience in a cross-referencer. - Debian Testing now also provides recent Rust releases (outside of the freeze period), so add it to the list. MAINTAINERS: - Trevor is joining as reviewer of the "RUST" entry. And a few other small bits" * tag 'rust-6.12' of https://github.com/Rust-for-Linux/linux: (54 commits) kasan: rust: Add KASAN smoke test via UAF kbuild: rust: Enable KASAN support rust: kasan: Rust does not support KHWASAN kbuild: rust: Define probing macros for rustc kasan: simplify and clarify Makefile rust: cfi: add support for CFI_CLANG with Rust cfi: add CONFIG_CFI_ICALL_NORMALIZE_INTEGERS rust: support for shadow call stack sanitizer docs: rust: include other expressions in conditional compilation section kbuild: rust: replace proc macros dependency on `core.o` with the version text kbuild: rust: rebuild if the version text changes kbuild: rust: re-run Kconfig if the version text changes kbuild: rust: add `CONFIG_RUSTC_VERSION` rust: avoid `box_uninit_write` feature MAINTAINERS: add Trevor Gross as Rust reviewer rust: rbtree: add `RBTree::entry` rust: rbtree: add cursor rust: rbtree: add mutable iterator rust: rbtree: add iterator rust: rbtree: add red-black tree implementation backed by the C version ...	2024-09-25 10:25:40 -07:00
Masami Hiramatsu (Google)	4e78dd6b4c	sefltests/tracing: Add a test for tracepoint events on modules Add a test case for tracepoint events on modules. This checks if it can add and remove the events correctly. Link: https://lore.kernel.org/all/172397781494.286558.7581515061075998225.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-09-25 23:23:44 +09:00
Masami Hiramatsu (Google)	57a7e6de9e	tracing/fprobe: Support raw tracepoints on future loaded modules Support raw tracepoint events on future loaded (unloaded) modules. This allows user to create raw tracepoint events which can be used from module's __init functions. Note: since the kernel does not have any information about the tracepoints in the unloaded modules, fprobe events can not check whether the tracepoint exists nor extend the BTF based arguments. Link: https://lore.kernel.org/all/172397780593.286558.18360375226968537828.stgit@devnote2/ Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-09-25 23:23:44 +09:00
Masami Hiramatsu (Google)	67e9a9ee47	tracing/fprobe: Support raw tracepoint events on modules Support raw tracepoint event on module by fprobe events. Since it only uses for_each_kernel_tracepoint() to find a tracepoint, the tracepoints on modules are not handled. Thus if user specified a tracepoint on a module, it shows an error. This adds new for_each_module_tracepoint() API to tracepoint subsystem, and uses it to find tracepoints on modules. Link: https://lore.kernel.org/all/172397779651.286558.15903703620679186867.stgit@devnote2/ Reported-by: don <zds100@gmail.com> Closes: https://lore.kernel.org/all/20240530215718.aeec973a1d0bf058d39cb1e3@kernel.org/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-09-25 23:23:44 +09:00
Masami Hiramatsu (Google)	d4df54f338	tracepoint: Support iterating tracepoints in a loading module Add for_each_tracepoint_in_module() function to iterate tracepoints in a module. This API is needed for handling tracepoints in a loading module from tracepoint_module_notifier callback function. This also update for_each_module_tracepoint() to pass the module to callback function so that it can find module easily. Link: https://lore.kernel.org/all/172397778740.286558.15781131277732977643.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-09-25 23:23:44 +09:00
Masami Hiramatsu (Google)	d5dbf8b48a	tracepoint: Support iterating over tracepoints on modules Add for_each_module_tracepoint() for iterating over tracepoints on modules. This is similar to the for_each_kernel_tracepoint() but only for the tracepoints on modules (not including kernel built-in tracepoints). Link: https://lore.kernel.org/all/172397777800.286558.14554748203446214056.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-09-25 23:23:44 +09:00
Gaosheng Cui	ce4db753de	kprobes: Remove obsoleted declaration for init_test_probes The init_test_probes() have been removed since commit `e44e81c5b9` ("kprobes: convert tests to kunit"), and now it is useless, so remove it. Link: https://lore.kernel.org/all/20240826032552.4016314-1-cuigaosheng1@huawei.com/ Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-09-25 20:12:58 +09:00
Andrii Nakryiko	10cdb82aa7	uprobes: turn trace_uprobe's nhit counter to be per-CPU one trace_uprobe->nhit counter is not incremented atomically, so its value is questionable in when uprobe is hit on multiple CPUs simultaneously. Also, doing this shared counter increment across many CPUs causes heavy cache line bouncing, limiting uprobe/uretprobe performance scaling with number of CPUs. Solve both problems by making this a per-CPU counter. Link: https://lore.kernel.org/all/20240813203409.3985398-1-andrii@kernel.org/ Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-09-25 20:10:38 +09:00
Luigi Leonardi	efcd71af38	vsock/virtio: avoid queuing packets when intermediate queue is empty When the driver needs to send new packets to the device, it always queues the new sk_buffs into an intermediate queue (send_pkt_queue) and schedules a worker (send_pkt_work) to then queue them into the virtqueue exposed to the device. This increases the chance of batching, but also introduces a lot of latency into the communication. So we can optimize this path by adding a fast path to be taken when there is no element in the intermediate queue, there is space available in the virtqueue, and no other process that is sending packets (tx_lock held). The following benchmarks were run to check improvements in latency and throughput. The test bed is a host with Intel i7-10700KF CPU @ 3.80GHz and L1 guest running on QEMU/KVM with vhost process and all vCPUs pinned individually to pCPUs. - Latency Tool: Fio version 3.37-56 Mode: pingpong (h-g-h) Test runs: 50 Runtime-per-test: 50s Type: SOCK_STREAM In the following fio benchmark (pingpong mode) the host sends a payload to the guest and waits for the same payload back. fio process pinned both inside the host and the guest system. Before: Linux 6.9.8 Payload 64B: 1st perc. overall 99th perc. Before 12.91 16.78 42.24 us After 9.77 13.57 39.17 us Payload 512B: 1st perc. overall 99th perc. Before 13.35 17.35 41.52 us After 10.25 14.11 39.58 us Payload 4K: 1st perc. overall 99th perc. Before 14.71 19.87 41.52 us After 10.51 14.96 40.81 us - Throughput Tool: iperf-vsock The size represents the buffer length (-l) to read/write P represents the number of parallel streams P=1 4K 64K 128K Before 6.87 29.3 29.5 Gb/s After 10.5 39.4 39.9 Gb/s P=2 4K 64K 128K Before 10.5 32.8 33.2 Gb/s After 17.8 47.7 48.5 Gb/s P=4 4K 64K 128K Before 12.7 33.6 34.2 Gb/s After 16.9 48.1 50.5 Gb/s The performance improvement is related to this optimization, I used a ebpf kretprobe on virtio_transport_send_skb to check that each packet was sent directly to the virtqueue Co-developed-by: Marco Pinna <marco.pinn95@gmail.com> Signed-off-by: Marco Pinna <marco.pinn95@gmail.com> Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com> Message-Id: <20240730-pinna-v4-2-5c9179164db5@outlook.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2024-09-25 07:07:44 -04:00
Marco Pinna	26618da3b2	vsock/virtio: refactor virtio_transport_send_pkt_work Preliminary patch to introduce an optimization to the enqueue system. All the code used to enqueue a packet into the virtqueue is removed from virtio_transport_send_pkt_work() and moved to the new virtio_transport_send_skb() function. Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com> Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com> Signed-off-by: Marco Pinna <marco.pinn95@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20240730-pinna-v4-1-5c9179164db5@outlook.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:44 -04:00
Hongbo Li	4a21d31d7b	fw_cfg: Constify struct kobj_type This 'struct kobj_type' is not modified. It is only used in kobject_init_and_add() which takes a 'const struct kobj_type *ktype' parameter. Constifying this structure and moving it to a read-only section, and this can increase over all security. ``` [Before] text data bss dec hex filename 5974 1008 96 7078 1ba6 drivers/firmware/qemu_fw_cfg.o [After] text data bss dec hex filename 6038 944 96 7078 1ba6 drivers/firmware/qemu_fw_cfg.o ``` Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Message-Id: <20240904011743.2010319-1-lihongbo22@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:44 -04:00
Dragos Tatulea	6211165448	vdpa/mlx5: Postpone MR deletion Currently, when a new MR is set up, the old MR is deleted. MR deletion is about 30-40% the time of MR creation. As deleting the old MR is not important for the process of setting up the new MR, this operation can be postponed. This series adds a workqueue that does MR garbage collection at a later point. If the MR lock is taken, the handler will back off and reschedule. The exception during shutdown: then the handler must not postpone the work. Note that this is only a speculative optimization: if there is some mapping operation that is triggered while the garbage collector handler has the lock taken, this operation it will have to wait for the handler to finish. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240830105838.2666587-9-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:44 -04:00
Dragos Tatulea	f30a1232b6	vdpa/mlx5: Introduce init/destroy for MR resources There's currently not a lot of action happening during the init/destroy of MR resources. But more will be added in the upcoming patches. As the mr mutex lock init/destroy has been moved to these new functions, the lifetime has now shifted away from mlx5_vdpa_alloc_resources() / mlx5_vdpa_free_resources() into these new functions. However, the lifetime at the outer scope remains the same: mlx5_vdpa_dev_add() / mlx5_vdpa_dev_free() Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240830105838.2666587-8-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:44 -04:00
Dragos Tatulea	58d4d50e75	vdpa/mlx5: Rename mr_mtx -> lock Now that the mr resources have their own namespace in the struct, give the lock a clearer name. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240830105838.2666587-7-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:44 -04:00
Dragos Tatulea	5fc8567907	vdpa/mlx5: Extract mr members in own resource struct Group all mapping related resources into their own structure. Upcoming patches will add more members in this new structure. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240830105838.2666587-6-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	0b916a9c45	vdpa/mlx5: Rename function A followup patch will use this name for something else. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240830105838.2666587-5-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	e1ba5c947e	vdpa/mlx5: Delete direct MKEYs in parallel Use the async interface to issue MTT MKEY deletion. This makes destroy_user_mr() on average 8x times faster. This number is also dependent on the size of the MR being deleted. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240830105838.2666587-4-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	0071b138d4	vdpa/mlx5: Create direct MKEYs in parallel Use the async interface to issue MTT MKEY creation. Extra care is taken at the allocation of FW input commands due to the MTT tables having variable sizes depending on MR. The indirect MKEY is still created synchronously at the end as the direct MKEYs need to be filled in. This makes create_user_mr() 3-5x faster, depending on the size of the MR. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240830105838.2666587-3-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Stefano Garzarella	db0a314f84	MAINTAINERS: add virtio-vsock driver in the VIRTIO CORE section The virtio-vsock driver is already under VM SOCKETS (AF_VSOCK), managed pricipally with the net tree, and VIRTIO AND VHOST VSOCK DRIVER. However, changes that only affect the virtio part usually go with Michael's tree, so let's also put the driver in the VIRTIO CORE section to have its maintainers in CC for changes to the virtio-vsock driver. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20240829143757.85844-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2024-09-25 07:07:43 -04:00
Max Gurtovoy	87cbdc396a	virtio_fs: add sysfs entries for queue information Introduce sysfs entries to provide visibility to the multiple queues used by the Virtio FS device. This enhancement allows users to query information about these queues. Specifically, add two sysfs entries: 1. Queue name: Provides the name of each queue (e.g. hiprio/requests.8). 2. CPU list: Shows the list of CPUs that can process requests for each queue. The CPU list feature is inspired by similar functionality in the block MQ layer, which provides analogous sysfs entries for block devices. These new sysfs entries will improve observability and aid in debugging and performance tuning of Virtio FS devices. Reviewed-by: Idan Zach <izach@nvidia.com> Reviewed-by: Shai Malin <smalin@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <20240825130716.9506-2-mgurtovoy@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Max Gurtovoy	4045b64298	virtio_fs: introduce virtio_fs_put_locked helper Introduce a new helper function virtio_fs_put_locked to encapsulate the common pattern of releasing a virtio_fs reference while holding a lock. The existing virtio_fs_put helper will be used to release a virtio_fs reference while not holding a lock. Also add an assertion in case the lock is not taken when it should. Reviewed-by: Idan Zach <izach@nvidia.com> Reviewed-by: Shai Malin <smalin@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <20240825130716.9506-1-mgurtovoy@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2024-09-25 07:07:43 -04:00
Yue Haibing	561a16366e	vdpa: Remove unused declarations There is no caller and implementation in tree. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Message-Id: <20240819140930.122019-1-yuehaibing@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Zhu Lingshan <lingshan.zhu@kernel.org> Reviewed-by: Shannon Nelson <<a href="mailto:shannon.nelson@amd.com" target="_blank">shannon.nelson@amd.com</a>><br> Reviewed-by: Zhu Lingshan <lingshan.zhu@kernel.org>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	9dba41951a	vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ command change_num_qps() is still suspending/resuming VQs one by one. This change switches to parallel suspend/resume. When increasing the number of queues the flow has changed a bit for simplicity: the setup_vq() function will always be called before resume_vqs(). If the VQ is initialized, setup_vq() will exit early. If the VQ is not initialized, setup_vq() will create it and resume_vqs() will resume it. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-11-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	74c89072f2	vdpa/mlx5: Small improvement for change_num_qps() change_num_qps() has a lot of multiplications by 2 to convert the number of VQ pairs to number of VQs. This patch simplifies the code by doing the VQP -> VQ count conversion at the beginning in a variable. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-10-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	55a7cb05b0	vdpa/mlx5: Keep notifiers during suspend but ignore Unregistering notifiers is a costly operation. Instead of removing the notifiers during device suspend and adding them back at resume, simply ignore the call when the device is suspended. At resume time call queue_link_work() to make sure that the device state is propagated in case there were changes. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device suspend time is reduced from ~13 ms to ~2.5 ms. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240816090159.1967650-9-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	5eb8c7eb1e	vdpa/mlx5: Parallelize device resume Currently device resume works on vqs serially. Building up on previous changes that converted vq operations to the async api, this patch parallelizes the device resume. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device resume time is reduced from ~16 ms to ~4.5 ms. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240816090159.1967650-8-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	dcf3eac01f	vdpa/mlx5: Parallelize device suspend Currently device suspend works on vqs serially. Building up on previous changes that converted vq operations to the async api, this patch parallelizes the device suspend: 1) Suspend all active vqs parallel. 2) Query suspended vqs in parallel. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device suspend time is reduced from ~37 ms to ~13 ms. A later patch will remove the link unregister operation which will make it even faster. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240816090159.1967650-7-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	61674c154b	vdpa/mlx5: Use async API for vq modify commands Switch firmware vq modify command to be issued via the async API to allow future parallelization. The new refactored function applies the modify on a range of vqs and waits for their execution to complete. For now the command is still used in a serial fashion. A later patch will switch to modifying multiple vqs in parallel. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-6-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	1fcdf43ea6	vdpa/mlx5: Use async API for vq query command Switch firmware vq query command to be issued via the async API to allow future parallelization. For now the command is still serial but the infrastructure is there to issue commands in parallel, including ratelimiting the number of issued async commands to firmware. A later patch will switch to issuing more commands at a time. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-5-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	d89d58f488	vdpa/mlx5: Introduce async fw command wrapper Introduce a new function mlx5_vdpa_exec_async_cmds() which wraps the mlx5_core async firmware command API in a way that will be used to parallelize certain operation in this driver. The wrapper deals with the case when mlx5_cmd_exec_cb() returns EBUSY due to the command being throttled. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-4-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	de2cd39fc1	vdpa/mlx5: Introduce error logging function mlx5_vdpa_err() was missing. This patch adds it and uses it in the necessary places. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240816090159.1967650-3-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	7d627137dc	net/mlx5: Support throttled commands from async API Currently, commands that qualify as throttled can't be used via the async API. That's due to the fact that the throttle semaphore can sleep but the async API can't. This patch allows throttling in the async API by using the tentative variant of the semaphore and upon failure (semaphore at 0) returns EBUSY to signal to the caller that they need to wait for the completion of previously issued commands. Furthermore, make sure that the semaphore is released in the callback. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Cc: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-2-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:41 -04:00
Jens Axboe	a045553362	nvme fixes for Linux 6.12 - Multipath fixes (Hannes) - Sysfs attribute list NULL terminate fix (Shin'ichiro) - Remove problematic read-back (Keith) -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE3Fbyvv+648XNRdHTPe3zGtjzRgkFAmbzr78ACgkQPe3zGtjz Rgl12xAA6Gy854fs6YeiolYD8qrZIN+TQeRc+aK+X7IP83tzDHN+VqqIFSdIi4Uq GhzksHYBHs/xSNA7aFeNebu+7p84mHh/73yRcvxrliac2JBq42NJ8/n02npmjozO IUn693VV5UR/twURAgdH2XbMRymIoC5CkPo2d36pF3zvBQYq75ZypcfVHQZm4vT6 sU7Hb+t4VlWsEkwDT2gAVFzRzS9gOghtEThsNq0vzKOUoC/GXQO/w697rQ3HqKD9 Ab1Q2WQEhy/yRJnqwBh/gumMtJvwx2uxdpw6G9hjR3U7CSQWFjneHBDd3CwssErJ JbhN0Nof1cMuAwwCNSCg398JofcwUdK7BXtZVgBEMKOJ8hMYUnmHiA4N8DMwoo7I VeKQ9wy4otYX42wHP1ieuWQKYaGI/EwPOxExdLJPYvtv8JQlsFKc+miM/lTi95HR /JKZk1Icn4RqSqXr9tpEBCRVAcUXjmKCxxup1kde5x/HMkxt+HelLvb/G0c4SL3U yMJrKMk34eBCFAXbdCuUXHeeiXvK40DKM70ocj0dyMeZ53oaTvCPiLcilRqrsx4y zGzSZ49YczSUueup0sAEmdzl6dZOKUeUD+s0LCSe4xX8M7CSZeBGhT+MhwzdWBwg O8SYk2iqkEhapDh0tldGF35VZr3hLSDmhp/b4kpVgBN8m9ZYxGg= =bLzT -----END PGP SIGNATURE----- Merge tag 'nvme-6.12-2024-09-25' of git://git.infradead.org/nvme into for-6.12/block Pull NVMe fixes from Keith: "nvme fixes for Linux 6.12 - Multipath fixes (Hannes) - Sysfs attribute list NULL terminate fix (Shin'ichiro) - Remove problematic read-back (Keith)" * tag 'nvme-6.12-2024-09-25' of git://git.infradead.org/nvme: nvme: remove CC register read-back during enabling nvme: null terminate nvme_tls_attrs nvme-multipath: avoid hang on inaccessible namespaces nvme-multipath: system fails to create generic nvme device	2024-09-25 03:29:17 -06:00
Keith Busch	9064610348	nvme: remove CC register read-back during enabling Any non-posted read should flush the previous write, so we don't necessarily need to read back the value we just wrote. I've found at least some controllers that respond with 0 for short moments after writing the CC register with EN (enable) cleared, so the read-back is overwriting our valid ctrl_config value and ends up breaking on the subsequent enabling. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-09-24 23:35:10 -07:00
Shin'ichiro Kawasaki	83340d9c61	nvme: null terminate nvme_tls_attrs Commit `1e48b34c9b` ("nvme: split off TLS sysfs attributes into a separate group") introduced the struct attribute array nvme_tls_attrs. However, the array was not null terminated and caused BUG KASAN global- out-of-bounds. To avoid the BUG, null terminate the array. Reported-by: Yi Zhang <yi.zhang@redhat.com> Closes: https://lore.kernel.org/linux-nvme/jhllwfxcedrcxcnbajwl4x2l2ujcqowqcd4ps574zrafrqhjna@f4icvecutekm/ Fixes: `1e48b34c9b` ("nvme: split off TLS sysfs attributes into a separate group") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-09-24 23:34:13 -07:00

1 2 3 4 5 ...

1307379 commits