1. 15 Jun, 2021 40 commits
    • i40e: optimize for XDP_REDIRECT in xsk path · 011bd59c
      Committed by Magnus Karlsson
      stable inclusion
      from stable-5.10.43
      commit fbae1a97ce342470dcf2c3f51e63faf2e8e557f0
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 346497c7 ]
      
      Optimize i40e_run_xdp_zc() for the XDP program verdict being
      XDP_REDIRECT in the xsk zero-copy path. This path is only used when
      having AF_XDP zero-copy on and in that case most packets will be
      directed to user space. This provides a little over 100k extra packets
      in throughput on my server when running l2fwd in xdpsock.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
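The idea in the i40e commit above, testing the most likely verdict before falling into the generic switch, can be sketched in userspace C. This is a minimal illustration and not the driver's code: `run_xdp_zc_sketch`, `redirect_sketch`, `struct rx_stats_sketch`, and the `RX_*` result codes are hypothetical names introduced here, and `likely()` is redefined as a stand-in for the kernel macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* XDP verdicts; the numeric values mirror the kernel's UAPI enum xdp_action. */
enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };

/* Hypothetical result codes for this sketch (not the driver's constants). */
enum rx_result { RX_PASS, RX_CONSUMED, RX_TX, RX_REDIR };

/* Hypothetical counters, used only to make the two paths visible. */
struct rx_stats_sketch {
    unsigned long fast_path;
    unsigned long slow_path;
};

/* Userspace stand-in for the kernel's likely() branch hint (GCC/Clang). */
#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical stub for the redirect step: 0 on success, -1 on failure. */
static int redirect_sketch(bool target_ok)
{
    return target_ok ? 0 : -1;
}

enum rx_result run_xdp_zc_sketch(enum xdp_action verdict, bool target_ok,
                                 struct rx_stats_sketch *stats)
{
    assert(stats != NULL);

    /* Fast path: test the expected verdict before the generic switch. */
    if (likely(verdict == XDP_REDIRECT)) {
        stats->fast_path++;
        return redirect_sketch(target_ok) == 0 ? RX_REDIR : RX_CONSUMED;
    }

    /* Slow path: every other verdict goes through the usual dispatch. */
    stats->slow_path++;
    switch (verdict) {
    case XDP_PASS:
        return RX_PASS;
    case XDP_TX:
        return RX_TX;
    case XDP_ABORTED:
    case XDP_DROP:
    default:
        return RX_CONSUMED;
    }
}
```

In a zero-copy AF_XDP setup most packets are expected to take the first branch, so the switch is only reached for the uncommon verdicts.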
    • cxgb4: avoid link re-train during TC-MQPRIO configuration · 27e58c9a
      Committed by Rahul Lakkireddy
      stable inclusion
      from stable-5.10.43
      commit 1958a31c035dd8981a7313acf8f91ae1886173cc
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 3822d067 ]
      
      When configuring TC-MQPRIO offload, only turn off netdev carrier and
      don't bring physical link down in hardware. Otherwise, when the
      physical link is brought up again after configuration, it gets
      re-trained and stalls ongoing traffic.
      
      Also, when firmware is no longer accessible or crashed, avoid sending
      FLOWC and waiting for reply that will never come.
      
      Fix following hung_task_timeout_secs trace seen in these cases.
      
      INFO: task tc:20807 blocked for more than 122 seconds.
            Tainted: G S                5.13.0-rc3+ #122
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:tc   state:D stack:14768 pid:20807 ppid: 19366 flags:0x00000000
      Call Trace:
       __schedule+0x27b/0x6a0
       schedule+0x37/0xa0
       schedule_preempt_disabled+0x5/0x10
       __mutex_lock.isra.14+0x2a0/0x4a0
       ? netlink_lookup+0x120/0x1a0
       ? rtnl_fill_ifinfo+0x10f0/0x10f0
       __netlink_dump_start+0x70/0x250
       rtnetlink_rcv_msg+0x28b/0x380
       ? rtnl_fill_ifinfo+0x10f0/0x10f0
       ? rtnl_calcit.isra.42+0x120/0x120
       netlink_rcv_skb+0x4b/0xf0
       netlink_unicast+0x1a0/0x280
       netlink_sendmsg+0x216/0x440
       sock_sendmsg+0x56/0x60
       __sys_sendto+0xe9/0x150
       ? handle_mm_fault+0x6d/0x1b0
       ? do_user_addr_fault+0x1c5/0x620
       __x64_sys_sendto+0x1f/0x30
       do_syscall_64+0x3c/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f7f73218321
      RSP: 002b:00007ffd19626208 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 000055b7c0a8b240 RCX: 00007f7f73218321
      RDX: 0000000000000028 RSI: 00007ffd19626210 RDI: 0000000000000003
      RBP: 000055b7c08680ff R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000055b7c085f5f6
      R13: 000055b7c085f60a R14: 00007ffd19636470 R15: 00007ffd196262a0
      
      Fixes: b1396c2b ("cxgb4: parse and configure TC-MQPRIO offload")
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • i2c: qcom-geni: Add shutdown callback for i2c · 82a6e3f8
      Committed by Roja Rani Yarubandi
      stable inclusion
      from stable-5.10.43
      commit 21d494d4446b020e69e7b22fa6ed9274db1f175c
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 9f78c607 ]
      
      If the hardware is still accessing memory after SMMU translation
      is disabled (as part of the SMMU shutdown callback), the IOVAs
      (I/O virtual addresses) it was using will go on the bus as
      physical addresses, which will result in unknown crashes
      such as NoC/interconnect errors.
      
      So, implement shutdown callback for i2c driver to suspend the bus
      during system "reboot" or "shutdown".
      
      Fixes: 37692de5 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
      Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org>
      Reviewed-by: Stephen Boyd <swboyd@chromium.org>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • ice: Allow all LLDP packets from PF to Tx · 718c797a
      Committed by Dave Ertman
      stable inclusion
      from stable-5.10.43
      commit c4b796f20c9581ed8502e42be3d5bde59469143d
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit f9f83202 ]
      
      Currently in the ice driver, the check whether to
      allow an LLDP packet to egress the interface from the
      PF_VSI is based on the SKB's priority field.
      It checks whether the packet's priority is equal to
      TC_PRIO_CONTROL.  Injected LLDP packets do not always
      meet this condition.
      
      SCAPY defaults to a sk_buff->protocol value of ETH_P_ALL
      (0x0003) and does not set the priority field.  There will
      be other injection methods (even ones used by end users)
      that will not correctly configure the socket so that
      SKB fields are correctly populated.
      
      The Ethernet header has to have the correct value for
      the protocol, though.
      
      Add a check to also allow packets whose ethhdr->h_proto
      matches ETH_P_LLDP (0x88CC).
      
      Fixes: 0c3a6101 ("ice: Allow egress control packets from PF_VSI")
      Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
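The check described in the ice LLDP commit above (allow the frame if the priority marks it as control traffic, or if the Ethernet header carries the LLDP EtherType) can be sketched as follows. This is a userspace illustration that reads a raw frame buffer rather than an `sk_buff`; `lldp_tx_allowed_sketch` is a hypothetical name, and `TC_PRIO_CONTROL` is redefined locally with the value from `linux/pkt_sched.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_LLDP      0x88CC /* EtherType named in the commit message */
#define TC_PRIO_CONTROL 7      /* value from linux/pkt_sched.h */
#define ETH_HLEN        14     /* dst MAC (6) + src MAC (6) + EtherType (2) */

/*
 * Returns true if the frame should be allowed to egress as LLDP:
 * either the sender set the control priority, or the EtherType in the
 * Ethernet header identifies the frame as LLDP.
 */
bool lldp_tx_allowed_sketch(const uint8_t *frame, size_t len,
                            uint32_t skb_priority)
{
    assert(frame != NULL);

    /* Original condition: priority set by a correctly configured socket. */
    if (skb_priority == TC_PRIO_CONTROL)
        return true;

    /* Added condition: inspect the EtherType (big-endian, offset 12). */
    if (len < ETH_HLEN)
        return false;

    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return ethertype == ETH_P_LLDP;
}
```

The second test is what covers injected frames whose sender never set the priority field.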
    • ice: report supported and advertised autoneg using PHY capabilities · 00bccc22
      Committed by Paul Greenwalt
      stable inclusion
      from stable-5.10.43
      commit 68db78345f7383dcd3ffd3c20f379f2f8e1b445f
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 5cd349c3 ]
      
      Ethtool incorrectly reported supported and advertised auto-negotiation
      settings for a backplane PHY image which did not support auto-negotiation.
      This can occur when using media or PHY type for reporting ethtool
      supported and advertised auto-negotiation settings.
      
      Remove setting supported and advertised auto-negotiation settings based
      on PHY type in ice_phy_type_to_ethtool(), and MAC type in
      ice_get_link_ksettings().
      
      Ethtool supported and advertised auto-negotiation settings should be
      based on the PHY image using the AQ command get PHY capabilities with
      media. Add setting supported and advertised auto-negotiation settings
      based on get PHY capabilities with media in ice_get_link_ksettings().
      
      Fixes: 48cb27f2 ("ice: Implement handlers for ethtool PHY/link operations")
      Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • ice: handle the VF VSI rebuild failure · 397d4156
      Committed by Haiyue Wang
      stable inclusion
      from stable-5.10.43
      commit 8726b9e81be7b30d7a9f4f1e3426352b37e6129d
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit c7ee6ce1 ]
      
      VSI rebuild can fail during LAN queue configuration, in which case the
      VF's VSI will be NULL. When that happens, the VF reset should be stopped
      and the VF should enter the disabled state.
      
      Fixes: 12bb018c ("ice: Refactor VF reset")
      Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared · 8fa33b83
      Committed by Brett Creeley
      stable inclusion
      from stable-5.10.43
      commit a79883ce1e9f7ccc1616c7659332db1266a9d434
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 8679f07a ]
      
      Some AVF drivers expect the VF_MBX_ATQLEN register to be cleared for any
      type of VFR/VFLR. Fix this by clearing the VF_MBX_ATQLEN register at the
      same time as VF_MBX_ARQLEN.
      
      Fixes: 82ba0128 ("ice: clear VF ARQLEN register on reset")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • ice: Fix allowing VF to request more/less queues via virtchnl · ee56afb7
      Committed by Brett Creeley
      stable inclusion
      from stable-5.10.43
      commit b94580b055b8a3e90f47dc2f8c7192455306d655
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit f0457690 ]
      
      Commit 12bb018c ("ice: Refactor VF reset") caused a regression
      that removes the ability for a VF to request a different number of
      queues via VIRTCHNL_OP_REQUEST_QUEUES. This prevents VF drivers from
      either increasing or decreasing the number of queue pairs they are
      allocated. Fix this by using the variable vf->num_req_qs when
      determining the vf->num_vf_qs during VF VSI creation.
      
      Fixes: 12bb018c ("ice: Refactor VF reset")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
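The fix in the commit above, consulting `num_req_qs` when deciding `num_vf_qs`, might look roughly like the sketch below. Only the two field names come from the commit message. `struct vf_sketch`, `vf_set_queue_count_sketch`, the default of 4 queue pairs, and the convention that 0 means "no request pending" are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical default; the real driver derives its default elsewhere. */
#define DFLT_QS_PER_VF_SKETCH 4u

/* Hypothetical, trimmed-down view of the per-VF state. */
struct vf_sketch {
    unsigned int num_req_qs; /* queues requested by the VF, 0 = none (assumed) */
    unsigned int num_vf_qs;  /* queues actually assigned at VSI creation */
};

/* Honor a pending request if there is one, otherwise use the default. */
void vf_set_queue_count_sketch(struct vf_sketch *vf)
{
    assert(vf != NULL);

    if (vf->num_req_qs)
        vf->num_vf_qs = vf->num_req_qs;
    else
        vf->num_vf_qs = DFLT_QS_PER_VF_SKETCH;
}
```

The regression described in the commit corresponds to always taking the `else` branch, which silently discards the request.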
    • ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions · 6b967910
      Committed by Coco Li
      stable inclusion
      from stable-5.10.43
      commit 09870235827451409ff546b073d754a19fd17e2e
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 821bbf79 ]
      
      Reported by syzbot:
      HEAD commit:    90c911ad Merge tag 'fixes' of git://git.kernel.org/pub/scm..
      git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
      dashboard link: https://syzkaller.appspot.com/bug?extid=123aa35098fd3c000eb7
      compiler:       Debian clang version 11.0.1-2
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline]
      BUG: KASAN: slab-out-of-bounds in fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732
      Read of size 8 at addr ffff8880145c78f8 by task syz-executor.4/17760
      
      CPU: 0 PID: 17760 Comm: syz-executor.4 Not tainted 5.12.0-rc8-syzkaller #0
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x202/0x31e lib/dump_stack.c:120
       print_address_description+0x5f/0x3b0 mm/kasan/report.c:232
       __kasan_report mm/kasan/report.c:399 [inline]
       kasan_report+0x15c/0x200 mm/kasan/report.c:416
       fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline]
       fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732
       fib6_nh_release+0x9a/0x430 net/ipv6/route.c:3536
       fib6_info_destroy_rcu+0xcb/0x1c0 net/ipv6/ip6_fib.c:174
       rcu_do_batch kernel/rcu/tree.c:2559 [inline]
       rcu_core+0x8f6/0x1450 kernel/rcu/tree.c:2794
       __do_softirq+0x372/0x7a6 kernel/softirq.c:345
       invoke_softirq kernel/softirq.c:221 [inline]
       __irq_exit_rcu+0x22c/0x260 kernel/softirq.c:422
       irq_exit_rcu+0x5/0x20 kernel/softirq.c:434
       sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1100
       </IRQ>
       asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
      RIP: 0010:lock_acquire+0x1f6/0x720 kernel/locking/lockdep.c:5515
      Code: f6 84 24 a1 00 00 00 02 0f 85 8d 02 00 00 f7 c3 00 02 00 00 49 bd 00 00 00 00 00 fc ff df 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 3d 00 00 00 00 00 4b c7 44 3d 09 00 00 00 00 43 c7 44 3d
      RSP: 0018:ffffc90009e06560 EFLAGS: 00000206
      RAX: 1ffff920013c0cc0 RBX: 0000000000000246 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: ffffc90009e066e0 R08: dffffc0000000000 R09: fffffbfff1f992b1
      R10: fffffbfff1f992b1 R11: 0000000000000000 R12: 0000000000000000
      R13: dffffc0000000000 R14: 0000000000000000 R15: 1ffff920013c0cb4
       rcu_lock_acquire+0x2a/0x30 include/linux/rcupdate.h:267
       rcu_read_lock include/linux/rcupdate.h:656 [inline]
       ext4_get_group_info+0xea/0x340 fs/ext4/ext4.h:3231
       ext4_mb_prefetch+0x123/0x5d0 fs/ext4/mballoc.c:2212
       ext4_mb_regular_allocator+0x8a5/0x28f0 fs/ext4/mballoc.c:2379
       ext4_mb_new_blocks+0xc6e/0x24f0 fs/ext4/mballoc.c:4982
       ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4238
       ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638
       ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848
       ext4_bread+0x2a/0x1c0 fs/ext4/inode.c:900
       ext4_append+0x1a4/0x360 fs/ext4/namei.c:67
       ext4_init_new_dir+0x337/0xa10 fs/ext4/namei.c:2768
       ext4_mkdir+0x4b8/0xc00 fs/ext4/namei.c:2814
       vfs_mkdir+0x45b/0x640 fs/namei.c:3819
       ovl_do_mkdir fs/overlayfs/overlayfs.h:161 [inline]
       ovl_mkdir_real+0x53/0x1a0 fs/overlayfs/dir.c:146
       ovl_create_real+0x280/0x490 fs/overlayfs/dir.c:193
       ovl_workdir_create+0x425/0x600 fs/overlayfs/super.c:788
       ovl_make_workdir+0xed/0x1140 fs/overlayfs/super.c:1355
       ovl_get_workdir fs/overlayfs/super.c:1492 [inline]
       ovl_fill_super+0x39ee/0x5370 fs/overlayfs/super.c:2035
       mount_nodev+0x52/0xe0 fs/super.c:1413
       legacy_get_tree+0xea/0x180 fs/fs_context.c:592
       vfs_get_tree+0x86/0x270 fs/super.c:1497
       do_new_mount fs/namespace.c:2903 [inline]
       path_mount+0x196f/0x2be0 fs/namespace.c:3233
       do_mount fs/namespace.c:3246 [inline]
       __do_sys_mount fs/namespace.c:3454 [inline]
       __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3431
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x4665f9
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f68f2b87188 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
      RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 00000000004665f9
      RDX: 00000000200000c0 RSI: 0000000020000000 RDI: 000000000040000a
      RBP: 00000000004bfbb9 R08: 0000000020000100 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
      R13: 00007ffe19002dff R14: 00007f68f2b87300 R15: 0000000000022000
      
      Allocated by task 17768:
       kasan_save_stack mm/kasan/common.c:38 [inline]
       kasan_set_track mm/kasan/common.c:46 [inline]
       set_alloc_info mm/kasan/common.c:427 [inline]
       ____kasan_kmalloc+0xc2/0xf0 mm/kasan/common.c:506
       kasan_kmalloc include/linux/kasan.h:233 [inline]
       __kmalloc+0xb4/0x380 mm/slub.c:4055
       kmalloc include/linux/slab.h:559 [inline]
       kzalloc include/linux/slab.h:684 [inline]
       fib6_info_alloc+0x2c/0xd0 net/ipv6/ip6_fib.c:154
       ip6_route_info_create+0x55d/0x1a10 net/ipv6/route.c:3638
       ip6_route_add+0x22/0x120 net/ipv6/route.c:3728
       inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352
       rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553
       netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502
       netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
       netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338
       netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927
       sock_sendmsg_nosec net/socket.c:654 [inline]
       sock_sendmsg net/socket.c:674 [inline]
       ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350
       ___sys_sendmsg net/socket.c:2404 [inline]
       __sys_sendmsg+0x319/0x400 net/socket.c:2433
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Last potentially related work creation:
       kasan_save_stack+0x27/0x50 mm/kasan/common.c:38
       kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345
       __call_rcu kernel/rcu/tree.c:3039 [inline]
       call_rcu+0x1b1/0xa30 kernel/rcu/tree.c:3114
       fib6_info_release include/net/ip6_fib.h:337 [inline]
       ip6_route_info_create+0x10c4/0x1a10 net/ipv6/route.c:3718
       ip6_route_add+0x22/0x120 net/ipv6/route.c:3728
       inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352
       rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553
       netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502
       netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
       netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338
       netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927
       sock_sendmsg_nosec net/socket.c:654 [inline]
       sock_sendmsg net/socket.c:674 [inline]
       ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350
       ___sys_sendmsg net/socket.c:2404 [inline]
       __sys_sendmsg+0x319/0x400 net/socket.c:2433
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Second to last potentially related work creation:
       kasan_save_stack+0x27/0x50 mm/kasan/common.c:38
       kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345
       insert_work+0x54/0x400 kernel/workqueue.c:1331
       __queue_work+0x981/0xcc0 kernel/workqueue.c:1497
       queue_work_on+0x111/0x200 kernel/workqueue.c:1524
       queue_work include/linux/workqueue.h:507 [inline]
       call_usermodehelper_exec+0x283/0x470 kernel/umh.c:433
       kobject_uevent_env+0x1349/0x1730 lib/kobject_uevent.c:617
       kvm_uevent_notify_change+0x309/0x3b0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4809
       kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:877 [inline]
       kvm_put_kvm+0x9c/0xd10 arch/x86/kvm/../../../virt/kvm/kvm_main.c:920
       kvm_vcpu_release+0x53/0x60 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3120
       __fput+0x352/0x7b0 fs/file_table.c:280
       task_work_run+0x146/0x1c0 kernel/task_work.c:140
       tracehook_notify_resume include/linux/tracehook.h:189 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
       exit_to_user_mode_prepare+0x10b/0x1e0 kernel/entry/common.c:208
       __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
       syscall_exit_to_user_mode+0x26/0x70 kernel/entry/common.c:301
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The buggy address belongs to the object at ffff8880145c7800
       which belongs to the cache kmalloc-192 of size 192
      The buggy address is located 56 bytes to the right of
       192-byte region [ffff8880145c7800, ffff8880145c78c0)
      The buggy address belongs to the page:
      page:ffffea00005171c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x145c7
      flags: 0xfff00000000200(slab)
      raw: 00fff00000000200 ffffea00006474c0 0000000200000002 ffff888010c41a00
      raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880145c7780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff8880145c7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      >ffff8880145c7880: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
                                                                      ^
       ffff8880145c7900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880145c7980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      ==================================================================
      
      In the ip6_route_info_create function, in the case that the nh pointer
      is not NULL, the fib6_nh in fib6_info has not been allocated.
      Therefore, when trying to free fib6_info in this error case using
      fib6_info_release, the function will call fib6_info_destroy_rcu,
      which in turn calls fib6_nh_release(f6i->fib6_nh).
      However, f6i->fib6_nh was never allocated and holds no refcount yet,
      which causes the memory issue reported above.
      Therefore, the solution is to release the fib6_info directly instead
      of going through fib6_info_release.
      
      Fixes: f88d8ea6 ("ipv6: Plumb support for nexthop object in a fib6_info")
      Fixes: 706ec919 ("ipv6: Fix nexthop refcnt leak when creating ipv6 route info")
      Signed-off-by: Coco Li <lixiaoyan@google.com>
      Cc: David Ahern <dsahern@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • cxgb4: fix regression with HASH tc prio value update · a8a8c605
      Committed by Rahul Lakkireddy
      stable inclusion
      from stable-5.10.43
      commit 1dcf3d435bf6147691560b8670ff06bfaa023a69
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit a27fb314 ]
      
      Commit db43b30c ("cxgb4: add ethtool n-tuple filter deletion")
      moved the search for the next highest priority HASH filter rule to
      cxgb4_flow_rule_destroy(), which searches the rhashtable before the
      rule is removed from it and hence always finds at least 1 entry.
      Fix by removing the rule from the rhashtable before calling
      cxgb4_flow_rule_destroy(), which avoids fetching stale info.
      
      Fixes: db43b30c ("cxgb4: add ethtool n-tuple filter deletion")
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
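The ordering problem in the cxgb4 commit above can be shown with a deliberately simplified model. A flat array with an `in_table` flag stands in for the rhashtable, and all names here (`struct rule_sketch`, `table_max_prio`, `delete_rule_buggy`, `delete_rule_fixed`) are hypothetical. The point is only that a search run before the rule is unlinked still sees the rule being deleted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filter rule; in_table models membership in the hash table. */
struct rule_sketch {
    unsigned int prio;
    bool in_table;
};

/* Highest priority among rules still in the table, or 0 if it is empty. */
unsigned int table_max_prio(const struct rule_sketch *rules, size_t n)
{
    unsigned int max = 0;

    for (size_t i = 0; i < n; i++)
        if (rules[i].in_table && rules[i].prio > max)
            max = rules[i].prio;
    return max;
}

/* Buggy ordering: search first, unlink second, so the victim is counted. */
unsigned int delete_rule_buggy(struct rule_sketch *rules, size_t n,
                               size_t victim)
{
    assert(victim < n);

    unsigned int stale = table_max_prio(rules, n);

    rules[victim].in_table = false;
    return stale;
}

/* Fixed ordering: unlink first, then search what is actually left. */
unsigned int delete_rule_fixed(struct rule_sketch *rules, size_t n,
                               size_t victim)
{
    assert(victim < n);

    rules[victim].in_table = false;
    return table_max_prio(rules, n);
}
```

With rules of priority 5 and 9, deleting the priority-9 rule yields 9 under the buggy ordering and 5 under the fixed one.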
    • ixgbevf: add correct exception tracing for XDP · 04728f53
      Committed by Magnus Karlsson
      stable inclusion
      from stable-5.10.43
      commit 8067da904921c1de8c1cd055170ab1f5944945e3
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit faae8142 ]
      
      Add missing exception tracing to XDP for a number of different
      errors that can occur. The support was only partial. Several errors
      were not logged, which would confuse users, who could not tell
      where and why the packets disappeared.
      
      Fixes: 21092e9c ("ixgbevf: Add support for XDP_TX action")
      Reported-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: NVishakha Jambekar <vishakha.jambekar@intel.com>
      Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NChen Jun <chenjun102@huawei.com>
      Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
      04728f53
    • igb: add correct exception tracing for XDP · 8560c2c4
      Committed by Magnus Karlsson
      stable inclusion
      from stable-5.10.43
      commit e0b61cda5f07b87b07841ee15066e90b6cb2ca6e
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 74431c40 ]
      
      Add missing exception tracing to XDP for a number of different
      errors that can occur. The support was only partial. Several errors
      were not logged, which would confuse users, who could not tell
      where and why the packets disappeared.
      
      Fixes: 9cbc948b ("igb: add XDP support")
      Reported-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: NVishakha Jambekar <vishakha.jambekar@intel.com>
      Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NChen Jun <chenjun102@huawei.com>
      Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
      8560c2c4
    • ieee802154: fix error return code in ieee802154_llsec_getparams() · 4d7750db
      Committed by Wei Yongjun
      stable inclusion
      from stable-5.10.43
      commit e513d889625b5d8c6c2942243cd7f967455d300a
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 373e864c ]
      
      Fix to return negative error code -ENOBUFS from the error handling
      case instead of 0, as done elsewhere in this function.
      
      Fixes: 3e9c156e ("ieee802154: add netlink interfaces for llsec")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Link: https://lore.kernel.org/r/20210519141614.3040055-1-weiyongjun1@huawei.com
      Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
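The class of bug fixed in the ieee802154 commit above is a `goto`-based error path that is reached while the return variable still holds 0. The sketch below shows the pattern in userspace C. `getparams_sketch` and `put_attr_sketch` are hypothetical stand-ins and do not reproduce the real function body. Only the use of `-ENOBUFS` comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a message-building step that can fail. */
static bool put_attr_sketch(bool ok)
{
    return ok;
}

/* Returns 0 on success or a negative errno value on failure. */
int getparams_sketch(bool alloc_ok, bool put_ok)
{
    int rc = 0;

    if (!alloc_ok)
        return -ENOMEM;

    if (!put_attr_sketch(put_ok)) {
        /* The fix: without this assignment rc would still be 0 here. */
        rc = -ENOBUFS;
        goto out_free;
    }

    /* Success: the reply would be sent here. */
    return 0;

out_free:
    /* The message buffer would be released here. */
    assert(rc < 0); /* an error path must never report success */
    return rc;
}
```

A caller that treats 0 as success would otherwise continue as if the reply had been built.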
    • ieee802154: fix error return code in ieee802154_add_iface() · 1653a912
      Committed by Zhen Lei
      stable inclusion
      from stable-5.10.43
      commit 2a0ba0125c2c62566023882293face06046698a5
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 79c6b8ed ]
      
      Fix to return a negative error code from the error handling
      case instead of 0, as done elsewhere in this function.
      
      Fixes: be51da0f ("ieee802154: Stop using NLA_PUT*().")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Link: https://lore.kernel.org/r/20210508062517.2574-1-thunder.leizhen@huawei.com
      Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks · 2e0ed072
      Committed by Daniel Borkmann
      stable inclusion
      from stable-5.10.43
      commit ff5039ec75c83d2ed5b781dc7733420ee8c985fc
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit ff40e510 ]
      
      Commit 59438b46 ("security,lockdown,selinux: implement SELinux lockdown")
      added an implementation of the locked_down LSM hook to SELinux, with the aim
      to restrict which domains are allowed to perform operations that would breach
      lockdown. This is indirectly also getting audit subsystem involved to report
      events. The latter is problematic, as reported by Ondrej and Serhei, since it
      can bring down the whole system via audit:
      
        1) The audit events that are triggered due to calls to security_locked_down()
           can OOM kill a machine, see below details [0].
      
        2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit()
           when trying to wake up kauditd, for example, when using trace_sched_switch()
           tracepoint, see details in [1]. Triggering this was not via some hypothetical
           corner case, but with existing tools like runqlat & runqslower from bcc, for
           example, which make use of this tracepoint. Rough call sequence goes like:
      
           rq_lock(rq) -> -------------------------+
             trace_sched_switch() ->               |
               bpf_prog_xyz() ->                   +-> deadlock
                 selinux_lockdown() ->             |
                   audit_log_end() ->              |
                     wake_up_interruptible() ->    |
                       try_to_wake_up() ->         |
                         rq_lock(rq) --------------+
      
      What's worse is that the intention of 59438b46 to further restrict lockdown
      settings for specific applications in respect to the global lockdown policy is
      completely broken for BPF. The SELinux policy rule for the current lockdown check
      looks something like this:
      
        allow <who> <who> : lockdown { <reason> };
      
      However, this doesn't match the 'current' task in which security_locked_down()
      is executed. For example, httpd does a syscall. There is a tracing program attached
      to the syscall which triggers a BPF program to run, which ends up doing a
      bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does
      the permission check against 'current', that is, httpd in this example. httpd
      has literally zero relation to this tracing program, and it would be nonsensical
      having to write an SELinux policy rule against httpd to let the tracing helper
      pass. The policy in this case needs to be against the entity that is installing
      the BPF program. For example, if bpftrace would generate a histogram of syscall
      counts by user space application:
      
        bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
      
      bpftrace would then go and generate a BPF program from this internally. One way
      of doing it [for the sake of the example] could be to call bpf_get_current_task()
      helper and then access current->comm via one of bpf_probe_read_kernel{,_str}()
      helpers. So the program itself has nothing to do with httpd or any other random
      app doing a syscall here. The BPF program _explicitly initiated_ the lockdown
      check. The allow/deny policy belongs in the context of bpftrace: meaning, you
      want to grant bpftrace access to use these helpers, but other tracers on the
      system like my_random_tracer _not_.
      
      Therefore fix all three issues at the same time by taking a completely different
      approach for the security_locked_down() hook, that is, move the check into the
      program verification phase where we actually retrieve the BPF func proto. This
      also reliably gets the task (current) that is trying to install the BPF tracing
      program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since
      we're moving this out of the BPF helper's fast-path which can be called several
      millions of times per second.
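
      The effect of moving the check can be modelled with a small sketch. This is a conceptual model only: the class and function names are hypothetical and are not kernel APIs. It shows the policy being consulted once, against the task that installs the program, and never on the helper's fast path.

```python
# Conceptual model only: all names here are hypothetical, not kernel APIs.
class Lockdown:
    def __init__(self, allowed_tasks):
        self.allowed_tasks = set(allowed_tasks)
        self.checks = 0

    def locked_down(self, current_task):
        """Return True if `current_task` is denied; count every policy lookup."""
        self.checks += 1
        return current_task not in self.allowed_tasks


def load_program(lockdown, loader_task):
    """Check once at load time, against the task installing the program."""
    if lockdown.locked_down(loader_task):
        raise PermissionError(f"{loader_task} may not use the read helpers")
    # The returned helper performs no policy check on the fast path.
    return lambda current_task: f"read on behalf of {current_task}"


lockdown = Lockdown(allowed_tasks={"bpftrace"})
helper = load_program(lockdown, "bpftrace")
for task in ["httpd", "auditd", "httpd"]:
    helper(task)  # no check here, whatever 'current' happens to be
```

      In this model the policy is looked up once no matter how often the helper runs, and a loader such as my_random_tracer is rejected at load time rather than at every helper call.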
      
      The check is then also in line with other security_locked_down() hooks in the
      system where the enforcement is performed at open/load time, for example,
      open_kcore() for /proc/kcore access or module_sig_check() for module signatures
      just to pick a few random ones. What's out of scope in the fix as well as in
      other security_locked_down() hook locations /outside/ of BPF subsystem is that
      if the lockdown policy changes on the fly there is no retrospective action.
      This requires a different discussion, potentially complex infrastructure, and
      it's also not clear whether this can be solved generically. Either way, it is
      out of scope for a suitable stable fix which this one is targeting. Note that
      the breakage is specifically on 59438b46 where it started to rely on 'current'
      as UAPI behavior, and _not_ earlier infrastructure such as 9d1f8be5 ("bpf:
      Restrict bpf when kernel lockdown is in confidentiality mode").
      
      [0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says:
      
        I starting seeing this with F-34. When I run a container that is traced with
        BPF to record the syscalls it is doing, auditd is flooded with messages like:
      
        type=AVC msg=audit(1619784520.593:282387): avc:  denied  { confidentiality }
          for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM"
            scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0
              tclass=lockdown permissive=0
      
        This seems to be leading to auditd running out of space in the backlog buffer
        and eventually OOMs the machine.
      
        [...]
        auditd running at 99% CPU presumably processing all the messages, eventually I get:
        Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
        Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64
        Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
        Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1
        Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
        [...]
      
      [1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/,
          Serhei Makarov says:
      
        Upstream kernel 5.11.0-rc7 and later was found to deadlock during a
        bpf_probe_read_compat() call within a sched_switch tracepoint. The problem
        is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend
        testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on
        ppc64le. Example stack trace:
      
        [...]
        [  730.868702] stack backtrace:
        [  730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef.166.fc35.x86_64 #1
        [  730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
        [  730.873278] Call Trace:
        [  730.873770]  dump_stack+0x7f/0xa1
        [  730.874433]  check_noncircular+0xdf/0x100
        [  730.875232]  __lock_acquire+0x1202/0x1e10
        [  730.876031]  ? __lock_acquire+0xfc0/0x1e10
        [  730.876844]  lock_acquire+0xc2/0x3a0
        [  730.877551]  ? __wake_up_common_lock+0x52/0x90
        [  730.878434]  ? lock_acquire+0xc2/0x3a0
        [  730.879186]  ? lock_is_held_type+0xa7/0x120
        [  730.880044]  ? skb_queue_tail+0x1b/0x50
        [  730.880800]  _raw_spin_lock_irqsave+0x4d/0x90
        [  730.881656]  ? __wake_up_common_lock+0x52/0x90
        [  730.882532]  __wake_up_common_lock+0x52/0x90
        [  730.883375]  audit_log_end+0x5b/0x100
        [  730.884104]  slow_avc_audit+0x69/0x90
        [  730.884836]  avc_has_perm+0x8b/0xb0
        [  730.885532]  selinux_lockdown+0xa5/0xd0
        [  730.886297]  security_locked_down+0x20/0x40
        [  730.887133]  bpf_probe_read_compat+0x66/0xd0
        [  730.887983]  bpf_prog_250599c5469ac7b5+0x10f/0x820
        [  730.888917]  trace_call_bpf+0xe9/0x240
        [  730.889672]  perf_trace_run_bpf_submit+0x4d/0xc0
        [  730.890579]  perf_trace_sched_switch+0x142/0x180
        [  730.891485]  ? __schedule+0x6d8/0xb20
        [  730.892209]  __schedule+0x6d8/0xb20
        [  730.892899]  schedule+0x5b/0xc0
        [  730.893522]  exit_to_user_mode_prepare+0x11d/0x240
        [  730.894457]  syscall_exit_to_user_mode+0x27/0x70
        [  730.895361]  entry_SYSCALL_64_after_hwframe+0x44/0xae
        [...]
      
      Fixes: 59438b46 ("security,lockdown,selinux: implement SELinux lockdown")
      Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
      Reported-by: Jakub Hrozek <jhrozek@redhat.com>
      Reported-by: Serhei Makarov <smakarov@redhat.com>
      Reported-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Tested-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: James Morris <jamorris@linux.microsoft.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Frank Eigler <fche@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lore.kernel.org/bpf/01135120-8bf7-df2e-cff0-1d73f1f841c3@iogearbox.net
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      2e0ed072
    • T
      bpf: Simplify cases in bpf_base_func_proto · 52dff11d
      Tobias Klauser committed
      stable inclusion
      from stable-5.10.43
      commit cdf3f6db1a86fc1e3d70423f4ee4fa81e4831157
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 61ca36c8 ]
      
      !perfmon_capable() is checked before the last switch(func_id) in
      bpf_base_func_proto. Thus, the cases BPF_FUNC_trace_printk and
      BPF_FUNC_snprintf_btf can be moved to that last switch(func_id) to omit
      the inline !perfmon_capable() checks.
      
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20210127174615.3038-1-tklauser@distanz.ch
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      52dff11d
    • Z
      drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest() · 842e1a26
      Zhihao Cheng committed
      stable inclusion
      from stable-5.10.43
      commit 4cf297ef595ce98cb4e9d80ae3e00bb5af0a8de0
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 10c1f0cb ]
      
      In case of error, the function live_context() returns ERR_PTR() and never
      returns NULL. The NULL test in the return value check should be replaced
      with IS_ERR().
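
      Why a NULL test misses such an error can be shown with a small model of the ERR_PTR() encoding. This is a sketch that assumes 64-bit pointers and represents them as plain integers; err_ptr() and is_err() are stand-ins for the kernel macros, with MAX_ERRNO taken as 4095 as in include/linux/err.h.

```python
MAX_ERRNO = 4095          # as in include/linux/err.h
PTR_MASK = (1 << 64) - 1  # assumption: 64-bit pointers
ENOMEM = 12
NULL = 0


def err_ptr(errno_value):
    """Encode a negative errno as a pointer-sized integer, like ERR_PTR()."""
    return errno_value & PTR_MASK


def is_err(ptr):
    """Like IS_ERR(): true only for the top MAX_ERRNO addresses."""
    return ptr >= ((-MAX_ERRNO) & PTR_MASK)


ctx = err_ptr(-ENOMEM)    # what a failing allocation would hand back
null_test_catches_it = (ctx == NULL)
is_err_catches_it = is_err(ctx)
```

      The encoded error is a non-zero value, so a check against NULL lets it through, while the range test used by IS_ERR() catches it.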
      
      Fixes: 52c0fdb2 ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/33c46ef24cd547d0ad21dc106441491a@intel.com
      [tursulin: Wrap commit text, fix Fixes: tag.]
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      (cherry picked from commit 8f4caef8)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      842e1a26
    • P
      netfilter: nfnetlink_cthelper: hit EBUSY on updates if size mismatches · 38aa8b52
      Pablo Neira Ayuso committed
      stable inclusion
      from stable-5.10.43
      commit 8d614eebc003bb7763993e6fcdc8f853401bc17e
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 8971ee8b ]
      
      The private helper data size cannot be updated. However, updates that
      contain NFCTH_PRIV_DATA_LEN might bogusly hit EBUSY even if the size is
      the same.
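
      The intended behaviour can be sketched as follows. This is a minimal model, not the kernel code: update_helper() and the attrs dict are hypothetical stand-ins for the update path and the parsed netlink attributes.

```python
import errno


def update_helper(helper_data_len, attrs):
    """Return 0 on success, or -EBUSY if the update tries to change the size.

    `attrs` is a hypothetical dict standing in for the parsed netlink attributes.
    """
    if "NFCTH_PRIV_DATA_LEN" in attrs:
        if attrs["NFCTH_PRIV_DATA_LEN"] != helper_data_len:
            return -errno.EBUSY
    return 0
```

      An update that repeats the existing size, or omits the attribute entirely, succeeds; only a real size mismatch is refused.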
      
      Fixes: 12f7a505 ("netfilter: add user-space connection tracking helper infrastructure")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      38aa8b52
    • P
      netfilter: nft_ct: skip expectations for confirmed conntrack · 1c3c6414
      Pablo Neira Ayuso committed
      stable inclusion
      from stable-5.10.43
      commit 5f3429c05e4028a0e241afdad856dd15dec2ffb9
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 1710eb91 ]
      
      nft_ct_expect_obj_eval() calls nf_ct_ext_add() for a confirmed
      conntrack entry. However, nf_ct_ext_add() can only be called for
      !nf_ct_is_confirmed().
      
      [ 1825.349056] WARNING: CPU: 0 PID: 1279 at net/netfilter/nf_conntrack_extend.c:48 nf_ct_ext_add+0x18e/0x1a0 [nf_conntrack]
      [ 1825.351391] RIP: 0010:nf_ct_ext_add+0x18e/0x1a0 [nf_conntrack]
      [ 1825.351493] Code: 41 5c 41 5d 41 5e 41 5f c3 41 bc 0a 00 00 00 e9 15 ff ff ff ba 09 00 00 00 31 f6 4c 89 ff e8 69 6c 3d e9 eb 96 45 31 ed eb cd <0f> 0b e9 b1 fe ff ff e8 86 79 14 e9 eb bf 0f 1f 40 00 0f 1f 44 00
      [ 1825.351721] RSP: 0018:ffffc90002e1f1e8 EFLAGS: 00010202
      [ 1825.351790] RAX: 000000000000000e RBX: ffff88814f5783c0 RCX: ffffffffc0e4f887
      [ 1825.351881] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88814f578440
      [ 1825.351971] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88814f578447
      [ 1825.352060] R10: ffffed1029eaf088 R11: 0000000000000001 R12: ffff88814f578440
      [ 1825.352150] R13: ffff8882053f3a00 R14: 0000000000000000 R15: 0000000000000a20
      [ 1825.352240] FS:  00007f992261c900(0000) GS:ffff889faec00000(0000) knlGS:0000000000000000
      [ 1825.352343] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1825.352417] CR2: 000056070a4d1158 CR3: 000000015efe0000 CR4: 0000000000350ee0
      [ 1825.352508] Call Trace:
      [ 1825.352544]  nf_ct_helper_ext_add+0x10/0x60 [nf_conntrack]
      [ 1825.352641]  nft_ct_expect_obj_eval+0x1b8/0x1e0 [nft_ct]
      [ 1825.352716]  nft_do_chain+0x232/0x850 [nf_tables]
      
      Add the ct helper extension only for unconfirmed conntrack. Skip rule
      evaluation if the ct helper extension does not exist. Thus, you can
      only create expectations from the first packet.
      
      It should be possible to remove this limitation by adding a new action
      to attach a generic ct helper to the first packet. Then, use this ct
      helper extension from follow up packets to create the ct expectation.
      
      While at it, add a missing check to skip the template conntrack too
      and remove check for IPCT_UNTRACK which is implicit to !ct.
      
      Fixes: 857b4602 ("netfilter: nft_ct: add ct expectations support")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      1c3c6414
    • M
      nvmet: fix freeing unallocated p2pmem · 3c5b4de4
      Max Gurtovoy committed
      stable inclusion
      from stable-5.10.43
      commit c440cd080761b18a52cac20f2a42e5da1e3995af
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit bcd9a079 ]
      
      In case p2p device was found but the p2p pool is empty, the nvme target
      is still trying to free the sgl from the p2p pool instead of the
      regular sgl pool and causing a crash (BUG() is called). Instead, assign
      the p2p_dev for the request only if it was allocated from p2p pool.
      
      This is the crash that was caused:
      
      [Sun May 30 19:13:53 2021] ------------[ cut here ]------------
      [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518!
      [Sun May 30 19:13:53 2021] invalid opcode: 0000 [#1] SMP PTI
      ...
      [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518!
      ...
      [Sun May 30 19:13:53 2021] RIP: 0010:gen_pool_free_owner+0xa8/0xb0
      ...
      [Sun May 30 19:13:53 2021] Call Trace:
      [Sun May 30 19:13:53 2021] ------------[ cut here ]------------
      [Sun May 30 19:13:53 2021]  pci_free_p2pmem+0x2b/0x70
      [Sun May 30 19:13:53 2021]  pci_p2pmem_free_sgl+0x4f/0x80
      [Sun May 30 19:13:53 2021]  nvmet_req_free_sgls+0x1e/0x80 [nvmet]
      [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518!
      [Sun May 30 19:13:53 2021]  nvmet_rdma_release_rsp+0x4e/0x1f0 [nvmet_rdma]
      [Sun May 30 19:13:53 2021]  nvmet_rdma_send_done+0x1c/0x60 [nvmet_rdma]
      
      Fixes: c6e3f133 ("nvmet: add metadata support for block devices")
      Reviewed-by: Israel Rukshin <israelr@nvidia.com>
      Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      3c5b4de4
    • Y
      net/mlx5: DR, Create multi-destination flow table with level less than 64 · c912d5a8
      Yevgeny Kliteynik committed
      stable inclusion
      from stable-5.10.43
      commit 2a8cda3867cd06fbc3f414a78e1c692f973d21e4
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 216214c6 ]
      
      A flow table that contains a flow pointing to multiple flow tables or multiple
      TIRs must have a level lower than 64. In our case it applies to the
      multi-destination flow table.
      Fix the level of the created table to comply with HW Spec definitions, and
      still make sure that its level is lower than SW-owned tables, so that it
      would be possible to point from the multi-destination FW table to SW
      tables.
      
      Fixes: 34583bee ("net/mlx5: DR, Create multi-destination table for SW-steering use")
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Reviewed-by: Alex Vesker <valex@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      c912d5a8
    • R
      net/mlx5e: Check for needed capability for cvlan matching · cd61b0f9
      Roi Dayan committed
      stable inclusion
      from stable-5.10.43
      commit c8972cf28ea11043280135859903ad69b03e0851
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit afe93f71 ]
      
      If cvlan matching is not supported, show an error and return instead of
      trying to offload to the hardware and failing.
      
      Fixes: 699e96dd ("net/mlx5e: Support offloading tc double vlan headers match")
      Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      cd61b0f9
    • M
      net/mlx5: Check firmware sync reset requested is set before trying to abort it · 98d0b74f
      Moshe Shemesh committed
      stable inclusion
      from stable-5.10.43
      commit 730700337593b41551e17427bc33edcbd95d3f05
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 5940e642 ]
      
      If the driver sent a NACK to firmware on a sync reset request, it will still
      get a sync reset abort event even though it never set sync reset requested
      mode. Thus, in the sync reset abort event handler, the driver should check
      that reset requested is set before trying to stop the sync reset poll.
      
      Fixes: 7dd6df32 ("net/mlx5: Handle sync reset abort event")
      Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      98d0b74f
    • A
      net/mlx5e: Fix incompatible casting · 8620fcec
      Aya Levin committed
      stable inclusion
      from stable-5.10.43
      commit c1ea8c0e71ead1efaaba33e241c1e7d35e9cbf51
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit d8ec9200 ]
      
      Device supports setting of a single fec mode at a time, enforce this
      by bitmap_weight == 1. Input from fec command is in u32, avoid cast to
      unsigned long and use bitmap_from_arr32 to populate bitmap safely.
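
      The single-mode rule can be illustrated with a short sketch. It is a model of the check only: validate_fec_modes() is a hypothetical name, and counting set bits in a Python integer stands in for bitmap_weight() on a bitmap populated from the u32 input.

```python
def validate_fec_modes(fec_bits):
    """Accept a 32-bit FEC request only if exactly one mode bit is set."""
    fec_bits &= 0xFFFFFFFF              # the input arrives as a u32
    weight = bin(fec_bits).count("1")   # stand-in for bitmap_weight()
    return weight == 1
```

      A request with no mode bit, or with more than one, is rejected by this model.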
      
      Fixes: 4bd9d507 ("net/mlx5e: Enforce setting of a single FEC mode")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      8620fcec
    • M
      net/tls: Fix use-after-free after the TLS device goes down and up · aa3905c0
      Maxim Mikityanskiy committed
      stable inclusion
      from stable-5.10.43
      commit f1d4184f128dede82a59a841658ed40d4e6d3aa2
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit c55dcdd4 ]
      
      When a netdev with active TLS offload goes down, tls_device_down is
      called to stop the offload and tear down the TLS context. However, the
      socket stays alive, and it still points to the TLS context, which is now
      deallocated. If a netdev goes up, while the connection is still active,
      and the data flow resumes after a number of TCP retransmissions, it will
      lead to a use-after-free of the TLS context.
      
      This commit addresses this bug by keeping the context alive until its
      normal destruction, and implements the necessary fallbacks, so that the
      connection can resume in software (non-offloaded) kTLS mode.
      
      On the TX side tls_sw_fallback is used to encrypt all packets. The RX
      side already has all the necessary fallbacks, because receiving
      non-decrypted packets is supported. The thing needed on the RX side is
      to block resync requests, which are normally produced after receiving
      non-decrypted packets.
      
      The necessary synchronization is implemented for a graceful teardown:
      first the fallbacks are deployed, then the driver resources are released
      (it used to be possible to have a tls_dev_resync after tls_dev_del).
      
      A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
      mode. It's used to skip the RX resync logic completely, as it becomes
      useless, and some objects may be released (for example, resync_async,
      which is allocated and freed by the driver).
      
      Fixes: e8f69799 ("net/tls: Add generic NIC offload infrastructure")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      aa3905c0
    • M
      net/tls: Replace TLS_RX_SYNC_RUNNING with RCU · ea55ff3c
      Maxim Mikityanskiy committed
      stable inclusion
      from stable-5.10.43
      commit 874ece252ed269f5ac1f55167a3f2735ab0f249f
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 05fc8b6c ]
      
      RCU synchronization is guaranteed to finish in finite time, unlike a
      busy loop that polls a flag. This patch is a preparation for the bugfix
      in the next patch, where the same synchronize_net() call will also be
      used to sync with the TX datapath.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      ea55ff3c
    • A
      net: sock: fix in-kernel mark setting · 4f9ac2fe
      Alexander Aring committed
      stable inclusion
      from stable-5.10.43
      commit a5de17bb916a7f5b2e5b35a7c961ebee6d95bb28
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit dd9082f4 ]
      
      This patch fixes the in-kernel mark setting by doing an additional
      sk_dst_reset() which was introduced by commit 50254256 ("sock: Reset
      dst when changing sk_mark via setsockopt"). The code is now shared to
      avoid any further surprises when changing the socket mark value.
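
      The idea of sharing the code can be sketched like this. It is a simplified model: the Sock class and the dst_cache field are hypothetical, and clearing dst_cache stands in for sk_dst_reset(). Both entry points go through one helper, so neither can forget the reset.

```python
class Sock:
    def __init__(self):
        self.mark = 0
        self.dst_cache = "cached-route"  # stand-in for the cached dst entry


def set_mark(sk, val):
    """Shared helper: changing the mark also drops the cached route."""
    if val != sk.mark:
        sk.mark = val
        sk.dst_cache = None              # models sk_dst_reset()


def setsockopt_mark(sk, val):
    """Path taken when user space changes the mark."""
    set_mark(sk, val)


def sock_set_mark(sk, val):
    """Path taken by in-kernel users."""
    set_mark(sk, val)
```

      With a single helper, a mark change from either path invalidates the cached route, so later lookups take the new mark into account.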
      
      Fixes: 84d1c617 ("net: sock: add sock_set_mark")
      Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      4f9ac2fe
    • V
      net: dsa: tag_8021q: fix the VLAN IDs used for encoding sub-VLANs · a827c971
      Vladimir Oltean committed
      stable inclusion
      from stable-5.10.43
      commit 09fdb6747b7ed3bc4da720301de52ac7b159214a
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 4ef8d857 ]
      
      When using sub-VLANs in the range of 1-7, the resulting value from:
      
      	rx_vid = dsa_8021q_rx_vid_subvlan(ds, port, subvlan);
      
      is wrong according to the description from tag_8021q.c:
      
       | 11  | 10  |  9  |  8  |  7  |  6  |  5  |  4  |  3  |  2  |  1  |  0  |
       +-----------+-----+-----------------+-----------+-----------------------+
       |    DIR    | SVL |    SWITCH_ID    |  SUBVLAN  |          PORT         |
       +-----------+-----+-----------------+-----------+-----------------------+
      
      For example, when ds->index == 0, port == 3 and subvlan == 1,
      dsa_8021q_rx_vid_subvlan() returns 1027, same as it returns for
      subvlan == 0, but it should have returned 1043.
      
      This is because the low portion of the subvlan bits are not masked
      properly when writing into the 12-bit VLAN value. They are masked into
      bits 4:3, but they should be masked into bits 5:4.
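
      The two values quoted above can be reproduced by packing the fields according to the layout table. This is an illustrative sketch, not the kernel macros: the function mirrors the name used in the message, the direction value of 1 for RX is inferred from the quoted results, and placing the third sub-VLAN bit in the SVL field (bit 9) is an assumption.

```python
DIR_SHIFT = 10        # bits 11:10
SVL_SHIFT = 9         # bit 9
SWITCH_ID_SHIFT = 6   # bits 8:6
SUBVLAN_SHIFT = 4     # bits 5:4
DIR_RX = 1            # assumption: inferred from the values in the message


def rx_vid_subvlan(switch_id, port, subvlan):
    """Pack the fields into a 12-bit VLAN ID following the layout table."""
    subvlan_lo = subvlan & 0x3          # low two bits go to bits 5:4
    subvlan_hi = (subvlan >> 2) & 0x1   # assumption: third bit goes to bit 9
    return ((DIR_RX << DIR_SHIFT)
            | (subvlan_hi << SVL_SHIFT)
            | ((switch_id & 0x7) << SWITCH_ID_SHIFT)
            | (subvlan_lo << SUBVLAN_SHIFT)
            | (port & 0xF))


vid_subvlan0 = rx_vid_subvlan(0, 3, 0)  # 1024 + 3
vid_subvlan1 = rx_vid_subvlan(0, 3, 1)  # 1024 + 16 + 3
```

      With the sub-VLAN bits landing in bits 5:4, subvlan == 1 adds 16 to the VLAN ID, which gives 1043 instead of repeating the 1027 of subvlan == 0.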
      
      Fixes: 3eaae1d0 ("net: dsa: tag_8021q: support up to 8 VLANs per port using sub-VLANs")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      a827c971
    • L
      perf probe: Fix NULL pointer dereference in convert_variable_location() · da741510
      Li Huafei committed
      stable inclusion
      from stable-5.10.43
      commit 091283e3d5eb9f424b85e71804fc26092c3c4915
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 3cb17cce ]
      
      If we just check whether the variable can be converted, 'tvar' should be
      a null pointer. However, the null pointer check is missing in the
      'Constant value' execution path.
      
      The following cases can trigger this problem:
      
      	$ cat test.c
      	#include <stdio.h>
      
      	void main(void)
      	{
      	        int a;
      	        const int b = 1;
      
      	        asm volatile("mov %1, %0" : "=r"(a): "i"(b));
      	        printf("a: %d\n", a);
      	}
      
      	$ gcc test.c -o test -O -g
      	$ sudo ./perf probe -x ./test -L "main"
      	<main@/home/lhf/test.c:0>
      	      0  void main(void)
      	         {
      	      2          int a;
      	                 const int b = 1;
      
      	                 asm volatile("mov %1, %0" : "=r"(a): "i"(b));
      	      6          printf("a: %d\n", a);
      	         }
      
      	$ sudo ./perf probe -x ./test -V "main:6"
      	Segmentation fault
      
      The check on 'tvar' is added. If 'tvar' is a null pointer, we return 0
      to indicate that the variable can be converted. Now, we can successfully
      show the variables that can be accessed.
      
      	$ sudo ./perf probe -x ./test -V "main:6"
      	Available variables at main:6
      	        @<main+13>
      	                char*   __fmt
      	                int     a
      	                int     b
      
      However, the variable 'b' cannot be tracked.
      
      	$ sudo ./perf probe -x ./test -D "main:6 b"
      	Failed to find the location of the 'b' variable at this address.
      	 Perhaps it has been optimized out.
      	 Use -V with the --range option to show 'b' location range.
      	  Error: Failed to add events.
      
      This is because __die_find_variable_cb() did not successfully match
      variable 'b', which has the DW_AT_const_value attribute instead of
      DW_AT_location. We added support for DW_AT_const_value in
      __die_find_variable_cb(). With this modification, we can successfully
      track the variable 'b'.
      
      	$ sudo ./perf probe -x ./test -D "main:6 b"
      	p:probe_test/main_L6 /home/lhf/test:0x1156 b=\1:s32
      
      Fixes: 66f69b21 ("perf probe: Support DW_AT_const_value constant value")
      Signed-off-by: Li Huafei <lihuafei1@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Jianlin Lv <jianlin.lv@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
      http://lore.kernel.org/lkml/20210601092750.169601-1-lihuafei1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      da741510
    • E
      ACPICA: Clean up context mutex during object deletion · d9c889ef
      Erik Kaneda committed
      stable inclusion
      from stable-5.10.43
      commit 100c872c75112da26630309f4020991ffab2a11d
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit e4dfe108 ]
      
      ACPICA commit bc43c878fd4ff27ba75b1d111b97ee90d4a82707
      
      Fixes: c27f3d01 ("Fix race in GenericSerialBus (I2C) and GPIO OpRegion parameter handling")
      Link: https://github.com/acpica/acpica/commit/bc43c878
      Reported-by: John Garry <john.garry@huawei.com>
      Reported-by: Xiang Chen <chenxiang66@hisilicon.com>
      Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
      Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
      Signed-off-by: Bob Moore <robert.moore@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      d9c889ef
    • S
      nvme-rdma: fix in-casule data send for chained sgls · d314c14e
      Sagi Grimberg committed
      stable inclusion
      from stable-5.10.43
      commit df7c913f90c3dcda988a254141bf01eb3bb6f123
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 12b2aaad ]
      
      We have only 2 inline sg entries and we allow 4 sg entries for the send
      wr sge. Larger sgls will be chained. However, when we build the
      in-capsule send wr sge, we iterate without taking into account that the
      sgl may be chained and still fit in-capsule (which can happen if the sgl
      has more than 2 entries, but at most 4).
      
      Fix in-capsule data mapping to correctly iterate chained sgls.
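
      The difference between indexing into a table and walking a chained list can be modelled as below. This is a conceptual sketch: SgEntry, build_chained() and map_inline_data() are hypothetical names, and following the next link stands in for what sg_next() does in the kernel.

```python
class SgEntry:
    def __init__(self, length, next_entry=None):
        self.length = length
        self.next = next_entry  # models what sg_next() would return


def build_chained(lengths):
    """Build a linked list of entries; returns the head (None for empty input)."""
    head = None
    for length in reversed(lengths):
        head = SgEntry(length, head)
    return head


def map_inline_data(head, count):
    """Collect `count` entry lengths by following links, never by indexing."""
    sge = []
    entry = head
    for _ in range(count):
        sge.append(entry.length)
        entry = entry.next
    return sge
```

      A three-entry list is walked the same way whether or not its third entry lives in a chained chunk, which is the property that plain array indexing over the first chunk lacks.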
      
      Fixes: 38e18002 ("nvme-rdma: Avoid preallocating big SGL for data")
      Reported-by: Walker, Benjamin <benjamin.walker@intel.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      d314c14e
    • P
      mptcp: always parse mptcp options for MPC reqsk · 36ddcae8
      Paolo Abeni committed
      stable inclusion
      from stable-5.10.43
      commit b198f77a3613a066cefa91f2cd9e0766612a19ce
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 06f9a435 ]
      
      In subflow_syn_recv_sock() we currently skip options parsing
      for OoO packets, given that such packets may not carry the relevant
      MPC option.
      
      If the peer generates an MPC+data TSO packet and some of the early
      segments are lost or get reordered, the server will ignore the peer key,
      causing a transient, unexpected fallback to TCP.
      
      The solution is to always parse the incoming MPTCP options, and
      to do the fallback only for in-order packets. This actually cleans
      up the existing code a bit.
      
      Fixes: d22f4988 ("mptcp: process MP_CAPABLE data option")
      Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • A
      net/sched: act_ct: Fix ct template allocation for zone 0 · 32c726f1
      Ariel Levkovich committed
      stable inclusion
      from stable-5.10.43
      commit be0d8507268646a6ca524c0f40f29c501b3d78d9
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit fb91702b ]
      
      Fix current behavior of skipping template allocation in case the
      ct action is in zone 0.
      
      Skipping the allocation may cause the datapath ct code to ignore the
      entire ct action with all its attributes (commit, nat) in case the ct
      action in zone 0 was preceded by a ct clear action.
      
      The ct clear action sets the ct_state to untracked and resets the
      skb->_nfct pointer. Under these conditions and without an allocated
      ct template, the skb->_nfct pointer will remain NULL which will
      cause the tc ct action handler to exit without handling commit and nat
      actions, if such exist.
      
      For example, the following rule in OVS dp:
      recirc_id(0x2),ct_state(+new-est-rel-rpl+trk),ct_label(0/0x1), \
      in_port(eth0),actions:ct_clear,ct(commit,nat(src=10.11.0.12)), \
      recirc(0x37a)
      
      Will result in act_ct skipping the commit and nat actions in zone 0.
      
      The change removes the skipping of template allocation for zone 0 and
      treats it the same as any other zone.
      
      Fixes: b57dc7c1 ("net/sched: Introduce action ct")
      Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Link: https://lore.kernel.org/r/20210526170110.54864-1-lariel@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • P
      net/sched: act_ct: Offload connections with commit action · aed61203
      Paul Blakey committed
      stable inclusion
      from stable-5.10.43
      commit f07c548314776231f0d47d73ec6caa5b17e876e8
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 0cc254e5 ]
      
      Currently, established connections are not offloaded if the filter has a
      "ct commit" action. As a result, connections in the following scenario
      are not offloaded:
      
      $ tc_filter add dev $DEV ingress protocol ip prio 1 flower \
        ct_state -trk \
        action ct commit action goto chain 1
      
      $ tc_filter add dev $DEV ingress protocol ip chain 1 prio 1 flower \
        action mirred egress redirect dev $DEV2
      
      $ tc_filter add dev $DEV2 ingress protocol ip prio 1 flower \
        action ct commit action goto chain 1
      
      $ tc_filter add dev $DEV2 ingress protocol ip prio 1 chain 1 flower \
        ct_state +trk+est \
        action mirred egress redirect dev $DEV
      
      Offload established connections, regardless of the commit flag.
      
      Fixes: 46475bb2 ("net/sched: act_ct: Software offload of established flows")
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Link: https://lore.kernel.org/r/1622029449-27060-1-git-send-email-paulb@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • P
      devlink: Correct VIRTUAL port to not have phys_port attributes · 711cfdc2
      Parav Pandit committed
      stable inclusion
      from stable-5.10.43
      commit 4f00f9c169d9f6840613a44490d7800be8d73a61
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit b28d8f0c ]
      
      The physical port name and port number attributes do not belong to the
      virtual port flavour. When VF or SF virtual ports are registered, they
      incorrectly append the "np0" string to the netdevice name of the VF/SF.
      
      Before this fix, VF netdevice names were ens2f0np0v0, ens2f0np0v1 for VF
      0 and 1 respectively.
      
      After the fix, they are ens2f0v0, ens2f0v1.
      
      With this fix, reading /sys/class/net/ens2f0v0/phys_port_name returns
      -EOPNOTSUPP.
      
      Below is a devlink port show example for 2 VFs on one PF, confirming
      that no physical port attributes are exposed for the virtual ports.
      
      $ devlink port show
      pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false
      pci/0000:06:00.3/196608: type eth netdev ens2f0v0 flavour virtual splittable false
      pci/0000:06:00.4/262144: type eth netdev ens2f0v1 flavour virtual splittable false
      
      This change introduces a netdevice name change on systemd/udev
      versions 245 and higher, which honor the phys_port_name sysfs file
      when generating the netdevice name.
      
      This also aligns to phys_port_name usage which is limited to switchdev
      ports as described in [1].
      
      [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/Documentation/networking/switchdev.rst
      
      Fixes: acf1ee44 ("devlink: Introduce devlink port flavour virtual")
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Link: https://lore.kernel.org/r/20210526200027.14008-1-parav@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • A
      HID: i2c-hid: fix format string mismatch · 0c6f1052
      Arnd Bergmann committed
      stable inclusion
      from stable-5.10.43
      commit 56c45ab00abab9481bb55151bf5719ec7c93f01a
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit dc5f9f55 ]
      
      clang doesn't like printing a 32-bit integer using %hX format string:
      
      drivers/hid/i2c-hid/i2c-hid-core.c:994:18: error: format specifies type 'unsigned short' but the argument has type '__u32' (aka 'unsigned int') [-Werror,-Wformat]
                       client->name, hid->vendor, hid->product);
                                     ^~~~~~~~~~~
      drivers/hid/i2c-hid/i2c-hid-core.c:994:31: error: format specifies type 'unsigned short' but the argument has type '__u32' (aka 'unsigned int') [-Werror,-Wformat]
                       client->name, hid->vendor, hid->product);
                                                  ^~~~~~~~~~~~
      
      Use an explicit cast to truncate it to the low 16 bits instead.
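
As a rough userspace illustration of the cast described above (not the driver code itself), the sketch below formats two 32-bit IDs with `%hX` after truncating them to their low 16 bits. The function name `format_hid_name()` and its parameters are hypothetical; only the idea of an explicit cast comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format "name VVVV:PPPP" from 32-bit IDs, keeping only the low 16 bits. */
static int format_hid_name(char *buf, size_t len, const char *name,
                           uint32_t vendor, uint32_t product)
{
    assert(buf != NULL && name != NULL);
    /* %hX expects an unsigned short, so mask and cast explicitly. */
    return snprintf(buf, len, "%s %04hX:%04hX", name,
                    (unsigned short)(vendor & 0xFFFF),
                    (unsigned short)(product & 0xFFFF));
}
```

Passing the `uint32_t` values directly with `%hX` is what triggers the `-Wformat` diagnostic quoted above; with the cast, the argument type matches the conversion specifier.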
      
      Fixes: 9ee3e066 ("HID: i2c-hid: override HID descriptors for certain devices")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • Z
      HID: pidff: fix error return code in hid_pidff_init() · 5cd2b814
      Zhen Lei committed
      stable inclusion
      from stable-5.10.43
      commit 744db828d6f9a0908f64d337642bb8ee227a7ea9
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 3dd653c0 ]
      
      Fix to return a negative error code from the error handling
      case instead of 0, as done elsewhere in this function.
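
The general pattern behind this kind of fix can be sketched as follows. This is a generic illustration, not the body of hid_pidff_init(): `example_init()`, its parameters and the chosen errno values are hypothetical. The bug class is an error path that jumps to a shared failure label while the status variable still holds 0 from an earlier successful step.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical init routine with a shared failure label. */
static int example_init(int first_step_status, int feature_supported)
{
    int error;

    error = first_step_status;      /* 0 on success, negative errno on failure */
    if (error)
        goto fail;

    if (!feature_supported) {
        /*
         * Without this assignment, error would still be 0 here and
         * the caller would see success from a failed init.
         */
        error = -ENODEV;
        goto fail;
    }

    return 0;

fail:
    /* Cleanup of partially initialised state would go here. */
    assert(error < 0);
    return error;
}
```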
      
      Fixes: 224ee88f ("Input: add force feedback driver for PID devices")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • T
      HID: logitech-hidpp: initialize level variable · 433f94bf
      Tom Rix committed
      stable inclusion
      from stable-5.10.43
      commit 39b92726a38092c704c4e1bcc8262a8959ae978e
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 81c8bf91 ]
      
      Static analysis reports this representative problem:
      
      hid-logitech-hidpp.c:1356:23: warning: Assigned value is
        garbage or undefined
              hidpp->battery.level = level;
                                   ^ ~~~~~
      
      In some cases, 'level' is never set in hidpp20_battery_map_status_voltage().
      Since level is not available on all hw, initialize level to unknown.
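
A minimal sketch of the "initialize the out-parameter first" idea follows. It is not the driver function: the enum, the flag bits and `map_battery_status()` are hypothetical stand-ins. The point is that every path leaves `*level` with a defined value, even when the hardware reports nothing usable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity levels and status bits for this example only. */
enum { LEVEL_UNKNOWN = 0, LEVEL_CRITICAL = 1, LEVEL_FULL = 2 };
#define FLAG_CRITICAL 0x01u
#define FLAG_FULL     0x02u

/* Map status flags to a level; *level is always written. */
static void map_battery_status(unsigned int flags, int *level)
{
    assert(level != NULL);

    /* Default for hardware that does not report a level. */
    *level = LEVEL_UNKNOWN;

    if (flags & FLAG_CRITICAL)
        *level = LEVEL_CRITICAL;
    else if (flags & FLAG_FULL)
        *level = LEVEL_FULL;
    /* Any other flag combination carries no level information. */
}
```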
      
      Fixes: be281368 ("hid-logitech-hidpp: read battery voltage from newer devices")
      Signed-off-by: Tom Rix <trix@redhat.com>
      Reviewed-by: Filipe Laíns <lains@riseup.net>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • J
      ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service · dd517223
      Julian Anastasov committed
      stable inclusion
      from stable-5.10.43
      commit 4b1aba653642e469d954afefafa842025e10f00a
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 56e4ee82 ]
      
      syzbot reported a memory leak [1] when adding a service with the
      HASHED flag. We should ignore this flag in both sockopt- and
      netlink-provided data, otherwise the service is not hashed and
      is not visible while releasing resources.
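
The failure mode can be modelled in userspace as below. This is a sketch under stated assumptions, not the IPVS code: `struct service`, `service_hash()`, `service_add()` and the `SVC_F_*` constants are hypothetical names chosen for the example. The model assumes the hashing helper treats a set HASHED bit as "already in the table", so a user-supplied bit would make it skip the insertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this model. */
#define SVC_F_PERSISTENT 0x0001u
#define SVC_F_HASHED     0x0002u   /* internal state, not a user option */

struct service {
    unsigned int flags;
    int in_table;                  /* 1 once the service is reachable via the table */
};

/* Insert into the table unless the service claims to be hashed already. */
static int service_hash(struct service *svc)
{
    if (svc->flags & SVC_F_HASHED)
        return 0;                  /* believed to be hashed: nothing is inserted */
    svc->in_table = 1;
    svc->flags |= SVC_F_HASHED;
    return 1;
}

/* Add a service from user-provided flags, dropping the internal bit first. */
static void service_add(struct service *svc, unsigned int user_flags)
{
    assert(svc != NULL);
    svc->flags = user_flags & ~SVC_F_HASHED;
    svc->in_table = 0;
    service_hash(svc);
}
```

If `service_add()` copied `user_flags` unmasked, a caller passing `SVC_F_HASHED` would leave `in_table` at 0, and a later cleanup that walks the table would never find (or free) the object.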
      
      [1]
      BUG: memory leak
      unreferenced object 0xffff888115227800 (size 512):
        comm "syz-executor263", pid 8658, jiffies 4294951882 (age 12.560s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff83977188>] kmalloc include/linux/slab.h:556 [inline]
          [<ffffffff83977188>] kzalloc include/linux/slab.h:686 [inline]
          [<ffffffff83977188>] ip_vs_add_service+0x598/0x7c0 net/netfilter/ipvs/ip_vs_ctl.c:1343
          [<ffffffff8397d770>] do_ip_vs_set_ctl+0x810/0xa40 net/netfilter/ipvs/ip_vs_ctl.c:2570
          [<ffffffff838449a8>] nf_setsockopt+0x68/0xa0 net/netfilter/nf_sockopt.c:101
          [<ffffffff839ae4e9>] ip_setsockopt+0x259/0x1ff0 net/ipv4/ip_sockglue.c:1435
          [<ffffffff839fa03c>] raw_setsockopt+0x18c/0x1b0 net/ipv4/raw.c:857
          [<ffffffff83691f20>] __sys_setsockopt+0x1b0/0x360 net/socket.c:2117
          [<ffffffff836920f2>] __do_sys_setsockopt net/socket.c:2128 [inline]
          [<ffffffff836920f2>] __se_sys_setsockopt net/socket.c:2125 [inline]
          [<ffffffff836920f2>] __x64_sys_setsockopt+0x22/0x30 net/socket.c:2125
          [<ffffffff84350efa>] do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
          [<ffffffff84400068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Reported-and-tested-by: syzbot+e562383183e4b1766930@syzkaller.appspotmail.com
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Reviewed-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • M
      vfio/platform: fix module_put call in error flow · 4bcb33fa
      Max Gurtovoy committed
      stable inclusion
      from stable-5.10.43
      commit 46ae882bb19a12e8df9936e8dcf1924b6abb2b56
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit dc51ff91 ]
      
      The ->parent_module is the one that is used in try_module_get(). It
      should also be the one that we use in module_put() during
      vfio_platform_open().
      
      Fixes: 32a2d71c ("vfio: platform: introduce vfio-platform-base module")
      Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
      Message-Id: <20210518192133.59195-1-mgurtovoy@nvidia.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>