- 09 Aug 2022, 1 commit
-
-
Committed by Al Viro
Most of the users immediately follow a successful iov_iter_get_pages() with advancing by the amount it had returned. Provide inline wrappers doing that, and convert trivial open-coded uses of those. BTW, iov_iter_get_pages() never returns more than it had been asked to; such checks in cifs ought to be removed someday... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
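A minimal sketch of what such an advancing wrapper could look like; the wrapper name here is illustrative, not the exact helper introduced by the commit:

```c
#include <linux/uio.h>

/* Illustrative sketch: map pages for the iterator, then advance it by
 * however many bytes iov_iter_get_pages() actually returned. */
static inline ssize_t iov_iter_get_pages_advance(struct iov_iter *i,
		struct page **pages, size_t maxsize,
		unsigned int maxpages, size_t *start)
{
	ssize_t res = iov_iter_get_pages(i, pages, maxsize, maxpages, start);

	if (res > 0)
		iov_iter_advance(i, res);
	return res;
}
```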
-
- 02 Aug 2022, 6 commits
-
-
Committed by Ammar Faizi
Commit 2dec18ad forgets to call mutex_unlock() before the function returns in the error path: New smatch warnings: net/core/devlink.c:6392 devlink_nl_cmd_region_new() warn: inconsistent returns '&region->snapshot_lock'. Make sure we call mutex_unlock() in this error path. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 2dec18ad ("net: devlink: remove region snapshots list dependency on devlink->lock") Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20220801115742.1309329-1-ammar.faizi@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
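A simplified sketch of the locking rule being restored; the helper names echo the warning text but the function body itself is illustrative only:

```c
/* Every exit taken after grabbing region->snapshot_lock must drop it,
 * including the error path flagged by smatch. */
static int region_new_snapshot(struct devlink *devlink,
			       struct devlink_region *region)
{
	u32 snapshot_id;
	int err;

	mutex_lock(&region->snapshot_lock);

	err = __devlink_region_snapshot_id_get(devlink, &snapshot_id);
	if (err) {
		mutex_unlock(&region->snapshot_lock);	/* the missing unlock */
		return err;
	}

	mutex_unlock(&region->snapshot_lock);
	return 0;
}
```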
-
Committed by Tariq Toukan
destroy_workqueue() safely destroys the workqueue after draining it. No need for the explicit call to flush_workqueue(). Remove it. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220801112444.26175-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
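In code terms the cleanup amounts to this, with wq standing in for the driver's workqueue pointer:

```c
/* destroy_workqueue() drains the queue itself, so a preceding
 * flush_workqueue(wq) call is redundant and is simply removed. */
destroy_workqueue(wq);
```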
-
Committed by Xie Shaowen
Fix the following spelling mistakes: desconstructed ==> deconstructed, enforcment ==> enforcement. Reported-by: Hacash Robot <hacashRobot@santino.com> Signed-off-by: Xie Shaowen <studentxswpy@163.com> Link: https://lore.kernel.org/r/20220730092254.3102875-1-studentxswpy@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Hangyu Hua
In the case of sk->dccps_qpolicy == DCCPQ_POLICY_PRIO, dccp_qpolicy_full will drop an skb when the qpolicy is full. And the lock in dccp_sendmsg is released before sock_alloc_send_skb and then relocked after sock_alloc_send_skb. The following interleaving may lead dccp_qpolicy_push to add an skb to an already full sk_write_queue: thread1--->lock thread1--->dccp_qpolicy_full: queue is full. drop a skb thread1--->unlock thread2--->lock thread2--->dccp_qpolicy_full: queue is not full. no need to drop. thread2--->unlock thread1--->lock thread1--->dccp_qpolicy_push: add a skb. queue is full. thread1--->unlock thread2--->lock thread2--->dccp_qpolicy_push: add a skb! thread2--->unlock Fix this by moving the dccp_qpolicy_full() check. Fixes: b1308dc0 ("[DCCP]: Set TX Queue Length Bounds via Sysctl") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Link: https://lore.kernel.org/r/20220729110027.40569-1-hbh25y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
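An excerpt-style sketch of the corrected ordering in dccp_sendmsg(), assuming the check is moved below the point where the socket lock is re-taken (simplified, not the literal diff):

```c
/* The lock is dropped around the possibly-sleeping allocation... */
release_sock(sk);
skb = sock_alloc_send_skb(sk, size, noblock, &rc);
lock_sock(sk);
if (skb == NULL)
	goto out_release;

/* ...so the fullness check must be redone under the lock, right before
 * queueing, to close the race described above. */
if (dccp_qpolicy_full(sk)) {
	kfree_skb(skb);
	rc = -EAGAIN;
	goto out_release;
}
dccp_qpolicy_push(sk, skb);
```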
-
Committed by Eric Dumazet
This will help debugging netdevice refcount problems with CONFIG_NET_DEV_REFCNT_TRACKER=y. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tested-by: Bernard Pidoux <f6bvp@free.fr> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Eric Dumazet
Bernard reported that trying to unload the rose module would lead to the infamous messages: unregistered_netdevice: waiting for rose0 to become free. Usage count = xx This patch solves the issue by making sure each socket referring to a netdevice holds a reference count on it, and properly releases it in rose_release(). rose_dev_first() is also fixed to take a device reference before leaving the rcu_read_locked section. A following patch will add ref_tracker annotations to ease future bug hunting. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: Bernard Pidoux <f6bvp@free.fr> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Bernard Pidoux <f6bvp@free.fr> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
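A sketch of the rose_dev_first() half of the fix, simplified from the driver; the point is that dev_hold() is taken before rcu_read_unlock(), with the matching release done in rose_release():

```c
static struct net_device *rose_dev_first(void)
{
	struct net_device *dev, *first = NULL;

	rcu_read_lock();
	for_each_netdev_rcu(&init_net, dev) {
		if ((dev->flags & IFF_UP) && dev->type == ARPHRD_ROSE &&
		    (!first || strncmp(dev->name, first->name, 3) < 0))
			first = dev;
	}
	if (first)
		dev_hold(first);	/* released in rose_release() */
	rcu_read_unlock();

	return first;
}
```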
-
- 01 Aug 2022, 6 commits
-
-
Committed by Jiri Pirko
As the devlink_mutex was removed and all devlink instances are protected individually by the devlink->lock mutex, allow the netlink ops to run in parallel and therefore allow users to execute commands on multiple devlink instances simultaneously. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
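In genetlink terms this typically comes down to enabling parallel_ops on the family; a hedged sketch with the other fields elided:

```c
static struct genl_family devlink_nl_family __ro_after_init = {
	.name		= DEVLINK_GENL_NAME,
	.version	= DEVLINK_GENL_VERSION,
	.netnsok	= true,
	.parallel_ops	= true,	/* ops may now run concurrently; each
				 * instance is guarded by devlink->lock */
};
```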
-
Committed by Jiri Pirko
All accesses to the devlink structure from userspace and drivers are locked with the devlink->lock instance mutex. Also, iteration over the devlinks xarray is taken care of by iteration helpers taking a devlink reference. Therefore, remove devlink_mutex as it is no longer needed. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
Convert the reload command to behave the same way as the rest of the commands and let it be called with devlink->lock held. Remove the temporary devl_lock taking from drivers. As the DEVLINK_NL_FLAG_NO_LOCK flag is no longer used, remove it as well. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
Add a new mark called "unregistering" to be set at the beginning of the devlink_unregister() function. Check this mark during devlinks iteration in order to prevent taking a reference on a devlink which is currently being unregistered. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
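A hedged sketch of one way to express this with xarray marks; the mark name, helper name, and the use of the global devlinks xarray are illustrative assumptions, not the literal upstream code:

```c
#define DEVLINK_UNREGISTERING	XA_MARK_1	/* assumed mark name */

void devlink_unregister(struct devlink *devlink)
{
	/* flag the instance before teardown starts */
	xa_set_mark(&devlinks, devlink->index, DEVLINK_UNREGISTERING);
	/* actual unregistration follows */
}

/* Iteration helpers refuse to hand out references to marked instances. */
static bool devlink_is_alive(struct devlink *devlink)
{
	return !xa_get_mark(&devlinks, devlink->index, DEVLINK_UNREGISTERING);
}
```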
-
Committed by Kuniyuki Iwashima
__udp_sysctl_init() is called for init_net via udp_sysctl_ops. While at it, we can rename __udp_sysctl_init() to udp_sysctl_init(). Fixes: 1e802951 ("udp: Move the udp sysctl to namespace.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Li Qiong
If 'local_odp_mr->r_trans_private' is an error code, it is better to print the error code than to print the value of IS_ERR(). Signed-off-by: Li Qiong <liqiong@nfschina.com> Signed-off-by: David S. Miller <davem@davemloft.net>
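An illustrative sketch of the pattern (the message text and label are placeholders): report PTR_ERR() rather than the 0/1 result of IS_ERR():

```c
if (IS_ERR(local_odp_mr->r_trans_private)) {
	/* print the error code itself, e.g. -12, not "1" */
	pr_err("rds: MR allocation failed, err %ld\n",
	       PTR_ERR(local_odp_mr->r_trans_private));
	goto err;
}
```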
-
- 30 Jul 2022, 1 commit
-
-
Committed by Yu Zhe
Use "jiffies != now" to replace "jiffies - now > 0" to make the code more readable. We want to put a limit on how long the loop can run for before rescheduling. Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Link: https://lore.kernel.org/r/20220729061712.22666-1-yuzhe@nfschina.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
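A minimal sketch of the pattern being discussed; the helper names are placeholders:

```c
unsigned long now = jiffies;

/* Bound how long the batch may run before yielding: bail out as soon
 * as at least one tick has elapsed. */
while (have_pending_work()) {
	process_one_item();
	if (jiffies != now)	/* reads better than "jiffies - now > 0" */
		break;
}
```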
-
- 29 Jul 2022, 15 commits
-
-
Committed by Andrea Mayer
The SRv6 H.L2Encaps.Red behavior described in [1] is an optimization of the SRv6 H.L2Encaps behavior [2]. H.L2Encaps.Red reduces the length of the SRH by excluding the first segment (SID) from the SRH of the pushed IPv6 header. The first SID is only placed in the IPv6 Destination Address field of the pushed IPv6 header. When the SRv6 Policy only contains one SID, the SRH is omitted, unless there is an HMAC TLV to be carried. [1] - https://datatracker.ietf.org/doc/html/rfc8986#section-5.4 [2] - https://datatracker.ietf.org/doc/html/rfc8986#section-5.3 Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: Anton Makarov <anton.makarov11235@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Andrea Mayer
The SRv6 H.Encaps.Red behavior described in [1] is an optimization of the SRv6 H.Encaps behavior [2]. H.Encaps.Red reduces the length of the SRH by excluding the first segment (SID) from the SRH of the pushed IPv6 header. The first SID is only placed in the IPv6 Destination Address field of the pushed IPv6 header. When the SRv6 Policy only contains one SID, the SRH is omitted, unless there is an HMAC TLV to be carried. [1] - https://datatracker.ietf.org/doc/html/rfc8986#section-5.2 [2] - https://datatracker.ietf.org/doc/html/rfc8986#section-5.1 Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: Anton Makarov <anton.makarov11235@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
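A hedged sketch of the "reduced" encapsulation rule shared by H.Encaps.Red above and H.L2Encaps.Red in the previous commit; the helper names are illustrative, not the kernel's seg6 functions:

```c
/* The SRH is only needed if more than one SID must be carried, or if
 * an HMAC TLV has to travel with the packet. */
static bool reduced_encap_needs_srh(int nr_sids, bool have_hmac_tlv)
{
	return nr_sids > 1 || have_hmac_tlv;
}

/* The first SID of the policy goes only into the outer IPv6 DA; the
 * SRH (if present) carries just the remaining SIDs. */
static void reduced_encap_set_daddr(struct ipv6hdr *outer_hdr,
				    const struct in6_addr *first_sid)
{
	outer_hdr->daddr = *first_sid;
}
```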
-
Committed by Zhengchao Shao
Users can use an AF_PACKET socket to send packets with a length of 0. When min_header_len equals 0, packet_snd will call __dev_queue_xmit to send such packets, and sock->type can be any type. Reported-by: syzbot+5ea725c25d06fb9114c4@syzkaller.appspotmail.com Fixes: fd189422 ("bpf: Don't redirect packets with invalid pkt_len") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
While investigating a separate rose issue [1], and enabling CONFIG_NET_DEV_REFCNT_TRACKER=y, Bernard reported an orthogonal ax25 issue [2] An ax25_dev can be used by one (or many) struct ax25_cb. We thus need different dev_tracker, one per struct ax25_cb. After this patch is applied, we are able to focus on rose. [1] https://lore.kernel.org/netdev/fb7544a1-f42e-9254-18cc-c9b071f4ca70@free.fr/ [2] [ 205.798723] reference already released. [ 205.798732] allocated in: [ 205.798734] ax25_bind+0x1a2/0x230 [ax25] [ 205.798747] __sys_bind+0xea/0x110 [ 205.798753] __x64_sys_bind+0x18/0x20 [ 205.798758] do_syscall_64+0x5c/0x80 [ 205.798763] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 205.798768] freed in: [ 205.798770] ax25_release+0x115/0x370 [ax25] [ 205.798778] __sock_release+0x42/0xb0 [ 205.798782] sock_close+0x15/0x20 [ 205.798785] __fput+0x9f/0x260 [ 205.798789] ____fput+0xe/0x10 [ 205.798792] task_work_run+0x64/0xa0 [ 205.798798] exit_to_user_mode_prepare+0x18b/0x190 [ 205.798804] syscall_exit_to_user_mode+0x26/0x40 [ 205.798808] do_syscall_64+0x69/0x80 [ 205.798812] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 205.798827] ------------[ cut here ]------------ [ 205.798829] WARNING: CPU: 2 PID: 2605 at lib/ref_tracker.c:136 ref_tracker_free.cold+0x60/0x81 [ 205.798837] Modules linked in: rose netrom mkiss ax25 rfcomm cmac algif_hash algif_skcipher af_alg bnep snd_hda_codec_hdmi nls_iso8859_1 i915 rtw88_8821ce rtw88_8821c x86_pkg_temp_thermal rtw88_pci intel_powerclamp rtw88_core snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio coretemp snd_hda_intel kvm_intel snd_intel_dspcfg mac80211 snd_hda_codec kvm i2c_algo_bit drm_buddy drm_dp_helper btusb drm_kms_helper snd_hwdep btrtl snd_hda_core btbcm joydev crct10dif_pclmul btintel crc32_pclmul ghash_clmulni_intel mei_hdcp btmtk intel_rapl_msr aesni_intel bluetooth input_leds snd_pcm crypto_simd syscopyarea processor_thermal_device_pci_legacy sysfillrect cryptd intel_soc_dts_iosf snd_seq sysimgblt ecdh_generic fb_sys_fops rapl libarc4 processor_thermal_device intel_cstate processor_thermal_rfim cec snd_timer ecc snd_seq_device cfg80211 processor_thermal_mbox mei_me processor_thermal_rapl mei rc_core at24 snd intel_pch_thermal intel_rapl_common ttm soundcore int340x_thermal_zone video [ 205.798948] mac_hid acpi_pad sch_fq_codel ipmi_devintf ipmi_msghandler drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid i2c_i801 i2c_smbus r8169 xhci_pci ahci libahci realtek lpc_ich xhci_pci_renesas [last unloaded: ax25] [ 205.798992] CPU: 2 PID: 2605 Comm: ax25ipd Not tainted 5.18.11-F6BVP #3 [ 205.798996] Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./CK3, BIOS 5.011 09/16/2020 [ 205.798999] RIP: 0010:ref_tracker_free.cold+0x60/0x81 [ 205.799005] Code: e8 d2 01 9b ff 83 7b 18 00 74 14 48 c7 c7 2f d7 ff 98 e8 10 6e fc ff 8b 7b 18 e8 b8 01 9b ff 4c 89 ee 4c 89 e7 e8 5d fd 07 00 <0f> 0b b8 ea ff ff ff e9 30 05 9b ff 41 0f b6 f7 48 c7 c7 a0 fa 4e [ 205.799008] RSP: 0018:ffffaf5281073958 EFLAGS: 00010286 [ 205.799011] RAX: 0000000080000000 RBX: ffff9a0bd687ebe0 RCX: 0000000000000000 [ 205.799014] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 00000000ffffffff [ 205.799016] RBP: ffffaf5281073a10 R08: 0000000000000003 R09: fffffffffffd5618 [ 205.799019] R10: 0000000000ffff10 R11: 000000000000000f R12: ffff9a0bc53384d0 [ 205.799022] R13: 0000000000000282 R14: 00000000ae000001 R15: 0000000000000001 [ 205.799024] FS: 0000000000000000(0000) GS:ffff9a0d0f300000(0000) knlGS:0000000000000000 [ 205.799028] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 205.799031] CR2: 00007ff6b8311554 CR3: 000000001ac10004 CR4: 00000000001706e0 [ 205.799033] Call Trace: [ 205.799035] <TASK> [ 205.799038] ? ax25_dev_device_down+0xd9/0x1b0 [ax25] [ 205.799047] ? ax25_device_event+0x9f/0x270 [ax25] [ 205.799055] ? raw_notifier_call_chain+0x49/0x60 [ 205.799060] ? call_netdevice_notifiers_info+0x52/0xa0 [ 205.799065] ? dev_close_many+0xc8/0x120 [ 205.799070] ? unregister_netdevice_many+0x13d/0x890 [ 205.799073] ? unregister_netdevice_queue+0x90/0xe0 [ 205.799076] ? unregister_netdev+0x1d/0x30 [ 205.799080] ? mkiss_close+0x7c/0xc0 [mkiss] [ 205.799084] ? tty_ldisc_close+0x2e/0x40 [ 205.799089] ? tty_ldisc_hangup+0x137/0x210 [ 205.799092] ? __tty_hangup.part.0+0x208/0x350 [ 205.799098] ? tty_vhangup+0x15/0x20 [ 205.799103] ? pty_close+0x127/0x160 [ 205.799108] ? tty_release+0x139/0x5e0 [ 205.799112] ? __fput+0x9f/0x260 [ 205.799118] ax25_dev_device_down+0xd9/0x1b0 [ax25] [ 205.799126] ax25_device_event+0x9f/0x270 [ax25] [ 205.799135] raw_notifier_call_chain+0x49/0x60 [ 205.799140] call_netdevice_notifiers_info+0x52/0xa0 [ 205.799146] dev_close_many+0xc8/0x120 [ 205.799152] unregister_netdevice_many+0x13d/0x890 [ 205.799157] unregister_netdevice_queue+0x90/0xe0 [ 205.799161] unregister_netdev+0x1d/0x30 [ 205.799165] mkiss_close+0x7c/0xc0 [mkiss] [ 205.799170] tty_ldisc_close+0x2e/0x40 [ 205.799173] tty_ldisc_hangup+0x137/0x210 [ 205.799178] __tty_hangup.part.0+0x208/0x350 [ 205.799184] tty_vhangup+0x15/0x20 [ 205.799188] pty_close+0x127/0x160 [ 205.799193] tty_release+0x139/0x5e0 [ 205.799199] __fput+0x9f/0x260 [ 205.799203] ____fput+0xe/0x10 [ 205.799208] task_work_run+0x64/0xa0 [ 205.799213] do_exit+0x33b/0xab0 [ 205.799217] ? __handle_mm_fault+0xc4f/0x15f0 [ 205.799224] do_group_exit+0x35/0xa0 [ 205.799228] __x64_sys_exit_group+0x18/0x20 [ 205.799232] do_syscall_64+0x5c/0x80 [ 205.799238] ? handle_mm_fault+0xba/0x290 [ 205.799242] ? debug_smp_processor_id+0x17/0x20 [ 205.799246] ? fpregs_assert_state_consistent+0x26/0x50 [ 205.799251] ? exit_to_user_mode_prepare+0x49/0x190 [ 205.799256] ? irqentry_exit_to_user_mode+0x9/0x20 [ 205.799260] ? irqentry_exit+0x33/0x40 [ 205.799263] ? exc_page_fault+0x87/0x170 [ 205.799268] ? asm_exc_page_fault+0x8/0x30 [ 205.799273] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 205.799277] RIP: 0033:0x7ff6b80eaca1 [ 205.799281] Code: Unable to access opcode bytes at RIP 0x7ff6b80eac77. 
[ 205.799283] RSP: 002b:00007fff6dfd4738 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 [ 205.799287] RAX: ffffffffffffffda RBX: 00007ff6b8215a00 RCX: 00007ff6b80eaca1 [ 205.799290] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001 [ 205.799293] RBP: 0000000000000001 R08: ffffffffffffff80 R09: 0000000000000028 [ 205.799295] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff6b8215a00 [ 205.799298] R13: 0000000000000000 R14: 00007ff6b821aee8 R15: 00007ff6b821af00 [ 205.799304] </TASK> Fixes: feef318c ("ax25: fix UAF bugs of net_device caused by rebinding operation") Reported-by: Bernard F6BVP <f6bvp@free.fr> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Duoming Zhou <duoming@zju.edu.cn> Link: https://lore.kernel.org/r/20220728051821.3160118-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Moshe Shemesh
Let the core take the devlink instance lock around health callbacks and remove the now-redundant locking in the drivers. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Jiri Pirko
After the mlx4 driver is converted to do locked reload, devlink_region_snapshot_create() may be called from both locked and unlocked context. Note that in mlx4, region snapshots can be created on any command failure. That can happen in any flow that involves commands to FW, which means most of the driver flows. So resolve this by removing the dependency on devlink->lock for region snapshot list consistency and introducing a new mutex to ensure it. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Jiri Pirko
After the mlx4 driver is converted to do locked reload, the functions to get/put region snapshot IDs may be called from both locked and unlocked context. So resolve this by removing the dependency on devlink->lock for region snapshot ID tracking and using the internal xa_lock() to maintain snapshot_ids xarray consistency. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
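A hedged sketch of the idea, assuming the IDs live in an allocating xarray (the wrapper name and the stored value are illustrative): the xarray's internal lock is enough, so no devlink->lock is needed around the allocation:

```c
static int snapshot_id_get(struct devlink *devlink, u32 *id)
{
	/* xa_alloc() takes the xarray's own spinlock internally */
	return xa_alloc(&devlink->snapshot_ids, id, xa_mk_value(0),
			xa_limit_32b, GFP_KERNEL);
}
```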
-
Committed by Vikas Gupta
Add a framework for running selftests. The framework exposes devlink commands that let the user query the tests supported by the driver and execute them. Below are the new entries in devlink_nl_ops: devlink_nl_cmd_selftests_show_doit/dumpit, to query the selftests supported by the drivers, and devlink_nl_cmd_selftests_run, to execute selftests. Users can provide a test mask for executing group tests or standalone tests. The Documentation/networking/devlink/ path is already part of MAINTAINERS and the new files come under this path, hence no update to MAINTAINERS is needed. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
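A hedged sketch of what the new devlink_nl_ops entries could look like; the handler names come from the changelog, while the command enum values, flags, and attribute policy are assumptions:

```c
{
	.cmd	= DEVLINK_CMD_SELFTESTS_GET,		/* assumed enum name */
	.doit	= devlink_nl_cmd_selftests_show_doit,
	.dumpit	= devlink_nl_cmd_selftests_show_dumpit,
	/* read-only query, so no admin permission assumed */
},
{
	.cmd	= DEVLINK_CMD_SELFTESTS_RUN,		/* assumed enum name */
	.doit	= devlink_nl_cmd_selftests_run,
	.flags	= GENL_ADMIN_PERM,
},
```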
-
Committed by Tariq Toukan
Multiple TLS device-offloaded contexts can be added in parallel via concurrent calls to .tls_dev_add, while calls to .tls_dev_del are sequential in tls_device_gc_task. This is not a sustainable behavior. It creates a rate gap between add and del operations (the addition rate outperforms the deletion rate). When running for long enough, the TLS device resources can get exhausted, failing to offload new connections. Replace the single-threaded garbage-collector work with a per-context alternative, so they can be handled on several cores in parallel. Use a new dedicated destruct workqueue for this. Tested with an mlx5 device: Before: 22141 add/sec, 103 del/sec. After: 11684 add/sec, 11684 del/sec. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
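A hedged sketch of the per-context approach; the struct, field, and function names are illustrative, not the exact upstream symbols:

```c
static struct workqueue_struct *destruct_wq;	/* dedicated workqueue */

struct tls_offload_ctx {
	struct work_struct destruct_work;	/* one work item per context */
	/* device-offload state lives here */
};

static void tls_device_ctx_destruct(struct work_struct *work)
{
	struct tls_offload_ctx *ctx =
		container_of(work, struct tls_offload_ctx, destruct_work);

	/* may sleep: ends up invoking the driver's .tls_dev_del callback */
	tls_device_free_ctx(ctx);
}

static void tls_device_queue_destruct(struct tls_offload_ctx *ctx)
{
	/* contexts are now torn down in parallel across CPUs, instead of
	 * one by one from a single gc task */
	queue_work(destruct_wq, &ctx->destruct_work);
}
```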
-
Committed by Tariq Toukan
The TLS context destructor can be run in atomic context. Cleanup operations for device-offloaded contexts may require access to and interaction with the device callbacks, which might sleep. Hence, the cleanup of such contexts must be deferred and completed inside an async work. For all others, this is not necessary, as cleanup is atomic. Invoke cleanup immediately for them, avoiding the queueing of redundant gc work. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Yang Li
The return from the call to tls_rx_msg_size() is an int and can be a negative error code; however, it is being assigned to an unsigned long variable 'sz', so make 'sz' an int. Eliminate the following coccicheck warning: ./net/tls/tls_strp.c:211:6-8: WARNING: Unsigned expression compared with zero: sz < 0 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20220728031019.32838-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Jakub Kicinski
I went too far in the accessor conversion; we can't use tls_strp_msg() after decryption because the message may not be ready. What we care about on this path is that the output skb is detached, i.e. we didn't somehow just turn around and use the input skb with its TCP data still attached. So look at the anchor directly. Fixes: 84c61fe1 ("tls: rx: do not use the standard strparser") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Jakub Kicinski
Paolo points out that there seems to be no strong reason why strparser uses a single-threaded workqueue. Perhaps there were some performance or pinning considerations? Since we don't know (and it's the slow path), let's default to the most natural, multi-threaded choice. Also rename the workqueue to "tls-". Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
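A hedged sketch of the resulting initialization; the full workqueue name is an assumption (the changelog above only shows the "tls-" prefix), and the init-function name is illustrative:

```c
static struct workqueue_struct *tls_strp_wq;

int __init tls_strp_dev_init(void)
{
	/* default multi-threaded workqueue instead of
	 * create_singlethread_workqueue() */
	tls_strp_wq = create_workqueue("tls-strp");	/* assumed name */
	if (!tls_strp_wq)
		return -ENOMEM;
	return 0;
}
```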
-
Committed by Jakub Kicinski
Eric indicates that restarting rcvtimeo on every wait may be fine. I thought that we should consider it cumulative, and made tls_rx_reader_lock() return the remaining timeo after acquiring the reader lock. tls_rx_rec_wait() gets its timeout passed in by value, so it does not keep track of time previously spent. Make the lock waiting consistent with tls_rx_rec_wait() - don't keep track of time spent. Read the timeo fresh in tls_rx_rec_wait(). It's unclear to me why callers are supposed to cache the value. Link: https://lore.kernel.org/all/CANn89iKcmSfWgvZjzNGbsrndmCch2HC_EPZ7qmGboDNaWoviNQ@mail.gmail.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Kuniyuki Iwashima
When we close ping6 sockets, some resources are left unfreed because pingv6_prot is missing sk->sk_prot->destroy(). As reported by syzbot [0], just three syscalls leak 96 bytes and easily cause OOM. struct ipv6_sr_hdr *hdr; char data[24] = {0}; int fd; hdr = (struct ipv6_sr_hdr *)data; hdr->hdrlen = 2; hdr->type = IPV6_SRCRT_TYPE_4; fd = socket(AF_INET6, SOCK_DGRAM, NEXTHDR_ICMP); setsockopt(fd, IPPROTO_IPV6, IPV6_RTHDR, data, 24); close(fd); To fix the memory leaks, let's add a destroy function. Note the socket() syscall checks if the GID is within the range of net.ipv4.ping_group_range. The default value is [1, 0] so that no GID meets the condition (1 <= GID <= 0). Thus, the local DoS does not succeed until we change the default value. However, at least Ubuntu/Fedora/RHEL loosen it. $ cat /usr/lib/sysctl.d/50-default.conf ... -net.ipv4.ping_group_range = 0 2147483647 Also, there could be another path reported with these options, and some of them require CAP_NET_RAW. setsockopt IPV6_ADDRFORM (inet6_sk(sk)->pktoptions) IPV6_RECVPATHMTU (inet6_sk(sk)->rxpmtu) IPV6_HOPOPTS (inet6_sk(sk)->opt) IPV6_RTHDRDSTOPTS (inet6_sk(sk)->opt) IPV6_RTHDR (inet6_sk(sk)->opt) IPV6_DSTOPTS (inet6_sk(sk)->opt) IPV6_2292PKTOPTIONS (inet6_sk(sk)->opt) getsockopt IPV6_FLOWLABEL_MGR (inet6_sk(sk)->ipv6_fl_list) For the record, I left a different splat with syzbot's one. unreferenced object 0xffff888006270c60 (size 96): comm "repro2", pid 231, jiffies 4294696626 (age 13.118s) hex dump (first 32 bytes): 01 00 00 00 44 00 00 00 00 00 00 00 00 00 00 00 ....D........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000f6bc7ea9>] sock_kmalloc (net/core/sock.c:2564 net/core/sock.c:2554) [<000000006d699550>] do_ipv6_setsockopt.constprop.0 (net/ipv6/ipv6_sockglue.c:715) [<00000000c3c3b1f5>] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:1024) [<000000007096a025>] __sys_setsockopt (net/socket.c:2254) [<000000003a8ff47b>] __x64_sys_setsockopt (net/socket.c:2265 net/socket.c:2262 net/socket.c:2262) [<000000007c409dcb>] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [<00000000e939c4a9>] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [0]: https://syzkaller.appspot.com/bug?extid=a8430774139ec3ab7176 Fixes: 6d0bfe22 ("net: ipv6: Add IPv6 support to the ping socket.") Reported-by: syzbot+a8430774139ec3ab7176@syzkaller.appspotmail.com Reported-by: Ayushman Dutta <ayudutta@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220728012220.46918-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
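A hedged sketch of the shape of the fix: give pingv6_prot a destroy callback that releases the IPv6-specific per-socket state. The callback body and the subset of proto fields shown are assumptions; see the actual patch for the full definition:

```c
static void ping_v6_destroy(struct sock *sk)
{
	inet6_destroy_sock(sk);	/* frees pktoptions, sticky opts, flow labels */
}

struct proto pingv6_prot = {
	.name		= "PINGv6",
	.owner		= THIS_MODULE,
	.init		= ping_init_sock,
	.close		= ping_close,
	.destroy	= ping_v6_destroy,	/* the newly added hook */
	/* remaining callbacks unchanged */
};
```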
-
- 28 Jul 2022, 4 commits
-
-
Committed by Jiri Pirko
The net_eq() check is already performed inside the devlinks_xa_for_each_registered_get() helper, so remove the redundant occurrence. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20220727055912.568391-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Zhengchao Shao
Change the return type of cbq_set_lss to void. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20220726030748.243505-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Xin Long
A NULL pointer dereference was reported by Wei Chen: BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:__list_del_entry_valid+0x26/0x80 Call Trace: <TASK> sctp_sched_dequeue_common+0x1c/0x90 sctp_sched_prio_dequeue+0x67/0x80 __sctp_outq_teardown+0x299/0x380 sctp_outq_free+0x15/0x20 sctp_association_free+0xc3/0x440 sctp_do_sm+0x1ca7/0x2210 sctp_assoc_bh_rcv+0x1f6/0x340 This happens when calling sctp_sendmsg without connecting to the server first. In this case, a data chunk is already queued in the client's send queue when processing the INIT_ACK from the server in sctp_process_init(), where it calls sctp_stream_init() to alloc stream_in. If allocating stream_in fails, all stream_out will be freed in sctp_stream_init()'s err path. Then, when freeing the asoc, it will crash when dequeuing this data chunk because stream_out is missing. As we can't free stream_out before dequeuing all data from the send queue, this patch fixes it by moving the err-path stream_out/in freeing from sctp_stream_init() to sctp_stream_free(), which is eventually called when freeing the asoc in sctp_association_free(). This fix also makes the code in sctp_process_init() clearer. Note that when sctp_association_init() fails in sctp_stream_init(), sctp_association_free() will not be called, and in that case it should go to the 'stream_free' err path to free the stream instead of 'fail_init'. Fixes: 5bbbbe32 ("sctp: introduce stream scheduler foundations") Reported-by: Wei Chen <harperchen1110@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/831a3dc100c4908ff76e5bcc363be97f2778bc0b.1658787066.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Eric Dumazet
After the blamed commit, IPv4 SYN packets handled by a dual-stack IPv6 socket are dropped, even if perfectly valid. $ nstat | grep MD5 TcpExtTCPMD5Failure 5 0.0 For a dual-stack listener, an incoming IPv4 SYN packet would call tcp_inbound_md5_hash() with @family == AF_INET, while tp->af_specific is pointing to tcp_sock_ipv6_specific. Only later, when an IPv4-mapped child is created, is tp->af_specific changed to tcp_sock_ipv6_mapped_specific. Fixes: 7bbb765b ("net/tcp: Merge TCP-MD5 inbound callbacks") Reported-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Dmitry Safonov <dima@arista.com> Tested-by: Leonard Crestez <cdleonard@gmail.com> Link: https://lore.kernel.org/r/20220726115743.2759832-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 27 Jul 2022, 7 commits
-
-
Committed by Stefan Raspl
Previously, the smc and smc_diag modules were automatically loaded as dependencies of the ism module whenever an ISM device was present. With the pending rework of the ISM API, the smc module will no longer be loaded automatically in the presence of an ISM device. Usage of an AF_SMC socket will still trigger loading of the smc modules, but usage of a netlink socket will not. This is addressed by setting the correct module aliases. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
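A hedged sketch of what such aliases could look like; the exact macros and family name used upstream may differ and are assumptions here:

```c
/* smc.ko: pulled in when the SMC generic netlink family is requested */
MODULE_ALIAS_GENL_FAMILY(SMC_GENL_FAMILY_NAME);

/* smc_diag.ko: pulled in for sock_diag requests targeting AF_SMC */
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, AF_SMC);
```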
-
Committed by Stefan Raspl
Make the DMBE bits, which are passed on individually in ism_move() as parameter idx, available to the receiver. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Stefan Raspl
Rework the signature of the function that retrieves the system EID: there is no plausible reason to use a double pointer, nor to pass in the device as an argument, as this identifier is by definition per system, not per device. Plus some minor consistency edits. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Heiko Carstens
This struct is used in a single place only, and its usage generates inefficient code. Time to clean up! Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-and-tested-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
One rcu_read_unlock() call should have been removed in the blamed commit. Fixes: 9b1c21d8 ("ip6mr: do not acquire mrt_lock while calling ip6_mr_forward()") Reported-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220725200554.2563581-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Committed by Mat Martineau
New subflows are created within the kernel using O_NONBLOCK, so EINPROGRESS is the expected return value from kernel_connect(). __mptcp_subflow_connect() has the correct logic to consider EINPROGRESS a successful case, but it has also used that error code as its return value. Before v5.19 this was benign: all the callers ignored the return value. Starting in v5.19 there is a MPTCP_PM_CMD_SUBFLOW_CREATE generic netlink command that does use the return value, so the EINPROGRESS gets propagated to userspace. Make __mptcp_subflow_connect() always return 0 on success instead. Fixes: ec3edaa7 ("mptcp: Add handling of outgoing MP_JOIN requests") Fixes: 702c2f64 ("mptcp: netlink: allow userspace-driven subflow establishment") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/20220725205231.87529-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
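An excerpt-style sketch of the idea inside __mptcp_subflow_connect(); the label name and surrounding bookkeeping are illustrative:

```c
err = kernel_connect(sf, (struct sockaddr *)&addr, addrlen, O_NONBLOCK);
if (err && err != -EINPROGRESS)		/* EINPROGRESS counts as success here */
	goto failed_unlink;

/* subflow bookkeeping continues */

return 0;	/* previously the function could return -EINPROGRESS */
```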
-
Committed by Jakub Kicinski
TLS is a relatively poor fit for strparser. We pause the input every time a message is received, wait for a read which will decrypt the message, start the parser, repeat. strparser is built to delineate the messages, wrap them in individual skbs and let them float off into the stack or a different socket. TLS wants the data pages and nothing else. There's no need for TLS to keep cloning (and occasionally skb_unclone()'ing) the TCP rx queue. This patch uses a pre-allocated skb and attaches the skbs from the TCP rx queue to it as frags. TLS is careful never to modify the input skb without CoW'ing / detaching it first. Since we call the TCP rx queue cleanup directly, we also get back the benefit of skb deferred free. Overall this results in a 6% gain in my benchmarks. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
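A hedged sketch of the core idea (the function name is illustrative): page fragments from skbs on the TCP receive queue are attached to a pre-allocated anchor skb instead of cloning the skbs themselves:

```c
static void tls_strp_anchor_add_frag(struct sk_buff *anchor,
				     struct page *page, int offset, int len)
{
	int slot = skb_shinfo(anchor)->nr_frags;

	get_page(page);		/* the anchor holds its own page reference */
	skb_add_rx_frag(anchor, slot, page, offset, len, len);
}
```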
-