1. 26 Nov, 2019 1 commit
    • vfs: mark pipes and sockets as stream-like file descriptors · d8e464ec
      Committed by Linus Torvalds
      In commit 3975b097 ("convert stream-like files -> stream_open, even
      if they use noop_llseek") Kirill used a coccinelle script to change
      "nonseekable_open()" to "stream_open()", which changed the trivial cases
      of stream-like file descriptors to the new model with FMODE_STREAM.
      
      However, the two big cases - sockets and pipes - don't actually have
      that trivial pattern at all, and were thus never converted to
      FMODE_STREAM even though it makes lots of sense to do so.
      
      That's particularly true when looking forward to the next change:
      getting rid of FMODE_ATOMIC_POS entirely, and just using FMODE_STREAM to
      decide whether f_pos updates are needed or not.  And if they are, we'll
      always do them atomically.
      
      This came up because KCSAN (correctly) noted that the non-locked f_pos
      updates are data races: they are clearly benign for the case where we
      don't care, but it would be good to just not have that issue exist at
      all.
      
      Note that the reason we used FMODE_ATOMIC_POS originally is that only
      doing it for the minimal required case is "safer" in that it's possible
      that the f_pos locking can cause unnecessary serialization across the
      whole write() call.  And in the worst case, that kind of serialization
      can cause deadlock issues: think writers that need readers to empty the
      state using the same file descriptor.
      
      [ Note that the locking is per-file descriptor - because it protects
        "f_pos", which is obviously per-file descriptor - so it only affects
        cases where you literally use the same file descriptor to both read
        and write.
      
        So a regular pipe that has separate reading and writing file
        descriptors doesn't really have this situation even though it's the
        obvious case of "reader empties what a big writer concurrently fills"
      
        But we want to mark pipes as being stream-like anyway, because we
        don't want the unnecessary overhead of locking, and because a named
        pipe can be (ab-)used by reading and writing to the same file
        descriptor. ]
      
      There are likely a lot of other cases that might want FMODE_STREAM, and
      looking for ".llseek = no_llseek" users and other cases that don't have
      an lseek file operation at all and making them use "stream_open()" might
      be a good idea.  But pipes and sockets are likely to be the two main
      cases.
      
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Marco Elver <elver@google.com>
      Cc: Andrea Parri <parri.andrea@gmail.com>
      Cc: Paul McKenney <paulmck@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
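      The "stream-like" property discussed in the commit above is visible from
      user space: pipes and sockets have no file position, so lseek() on them
      fails with ESPIPE. The following is a minimal user-space sketch of that
      observation, not kernel code; is_stream_like() is a hypothetical helper
      name introduced here for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if fd has no usable file position (lseek() fails with ESPIPE),
 * 0 if it is seekable, and -1 on any other lseek() error.
 * Hypothetical helper: not part of the kernel or of libc.
 */
static int is_stream_like(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 0;
    return errno == ESPIPE ? 1 : -1;
}
```

      Calling is_stream_like() on either end of a pipe() pair should report 1,
      while a descriptor for a regular file should report 0.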
  2. 23 Nov, 2019 2 commits
    • udp: drop skb extensions before marking skb stateless · 677bf08c
      Committed by Florian Westphal
      Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free
      assumes all skb head state has been dropped already.
      
      This will leak the extension memory in case the skb has extensions other
      than the ipsec secpath, e.g. bridge nf data.
      
      To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have
      extensions or if the extension space can be free'd.
      
      Fixes: 895b5c9f ("netfilter: drop bridge nf reset from nf_reset")
      Cc: Paolo Abeni <pabeni@redhat.com>
      Reported-by: Byron Stanoszek <gandalf@winds.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: prevent underflows in do_setvfinfo() · ff08ddba
      Committed by Dan Carpenter
      The "ivm->vf" variable is a u32, but the problem is that a number of
      drivers cast it to an int and then forget to check for negatives.  An
      example of this is in the cxgb4 driver.
      
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
        2890  static int cxgb4_mgmt_get_vf_config(struct net_device *dev,
        2891                                      int vf, struct ifla_vf_info *ivi)
                                                  ^^^^^^
        2892  {
        2893          struct port_info *pi = netdev_priv(dev);
        2894          struct adapter *adap = pi->adapter;
        2895          struct vf_info *vfinfo;
        2896
        2897          if (vf >= adap->num_vfs)
                          ^^^^^^^^^^^^^^^^^^^
        2898                  return -EINVAL;
        2899          vfinfo = &adap->vfinfo[vf];
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
      
      There are 48 functions affected.
      
      drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8435 hclge_set_vf_vlan_filter() warn: can 'vfid' underflow 's32min-2147483646'
      drivers/net/ethernet/freescale/enetc/enetc_pf.c:377 enetc_pf_set_vf_mac() warn: can 'vf' underflow 's32min-2147483646'
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2899 cxgb4_mgmt_get_vf_config() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2960 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3019 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3038 cxgb4_mgmt_set_vf_vlan() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3086 cxgb4_mgmt_set_vf_link_state() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/chelsio/cxgb/cxgb2.c:791 get_eeprom() warn: can 'i' underflow 's32min-(-4),0,4-s32max'
      drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:82 bnxt_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:164 bnxt_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:186 bnxt_get_vf_config() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:228 bnxt_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:264 bnxt_set_vf_vlan() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:293 bnxt_set_vf_bw() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:333 bnxt_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63'
      drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63'
      drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2281 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
      drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2285 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
      drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2286 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
      drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2292 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
      drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2297 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
      drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1832 qlcnic_sriov_set_vf_mac() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1864 qlcnic_sriov_set_vf_tx_rate() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1937 qlcnic_sriov_set_vf_vlan() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2005 qlcnic_sriov_get_vf_config() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2036 qlcnic_sriov_set_vf_spoofchk() warn: can 'vf' underflow 's32min-254'
      drivers/net/ethernet/emulex/benet/be_main.c:1914 be_get_vf_config() warn: can 'vf' underflow 's32min-65534'
      drivers/net/ethernet/emulex/benet/be_main.c:1915 be_get_vf_config() warn: can 'vf' underflow 's32min-65534'
      drivers/net/ethernet/emulex/benet/be_main.c:1922 be_set_vf_tvt() warn: can 'vf' underflow 's32min-65534'
      drivers/net/ethernet/emulex/benet/be_main.c:1951 be_clear_vf_tvt() warn: can 'vf' underflow 's32min-65534'
      drivers/net/ethernet/emulex/benet/be_main.c:2063 be_set_vf_tx_rate() warn: can 'vf' underflow 's32min-65534'
      drivers/net/ethernet/emulex/benet/be_main.c:2091 be_set_vf_link_state() warn: can 'vf' underflow 's32min-65534'
      drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:2609 ice_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3050 ice_get_vf_cfg() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3103 ice_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3181 ice_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3237 ice_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3286 ice_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3919 i40e_validate_vf() warn: can 'vf_id' underflow 's32min-2147483646'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3957 i40e_ndo_set_vf_mac() warn: can 'vf_id' underflow 's32min-2147483646'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4104 i40e_ndo_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-2147483646'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4263 i40e_ndo_set_vf_bw() warn: can 'vf_id' underflow 's32min-2147483646'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4309 i40e_ndo_get_vf_config() warn: can 'vf_id' underflow 's32min-2147483646'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4371 i40e_ndo_set_vf_link_state() warn: can 'vf_id' underflow 's32min-2147483646'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646'
      drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4504 i40e_ndo_set_vf_trust() warn: can 'vf_id' underflow 's32min-2147483646'
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
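      The underflow described in the commit above can be modelled in a few
      lines of user-space C. This is only a sketch: driver_check_vf() mimics
      the driver-side pattern quoted from cxgb4, and validate_vf() is a
      hypothetical caller-side guard that rejects any u32 value which would
      not fit in a non-negative int. Both names are assumptions made for
      illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Driver-style bounds check on a signed index. A negative vf passes the
 * test and would then be used to index an array.
 */
static int driver_check_vf(int vf, int num_vfs)
{
    if (vf >= num_vfs)
        return -EINVAL;
    return 0; /* a real driver would access vfinfo[vf] here */
}

/*
 * Hypothetical guard at the caller: refuse values that would turn negative
 * (or overflow) when the unsigned index is handed to an int parameter.
 */
static int validate_vf(uint32_t vf, int num_vfs)
{
    if (vf >= INT_MAX)
        return -EINVAL;
    return driver_check_vf((int)vf, num_vfs);
}
```

      With num_vfs set to 8, driver_check_vf(-1, 8) accepts the index, whereas
      validate_vf(0xFFFFFFFF, 8) rejects it before any cast happens.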
  3. 22 Nov, 2019 1 commit
  4. 21 Nov, 2019 3 commits
    • net-sysfs: fix netdev_queue_add_kobject() breakage · 48a322b6
      Committed by Eric Dumazet
      kobject_put() should only be called in the error path.
      
      Fixes: b8eb7183 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jouni Hogander <jouni.hogander@unikie.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6/route: return if there is no fib_nh_gw_family · 004b3942
      Committed by Hangbin Liu
      Previously we returned directly if (!rt || !rt->fib6_nh.fib_nh_gw_family)
      in function rt6_probe(), but after commit cc3a86c8
      ("ipv6: Change rt6_probe to take a fib6_nh"), the logic changed to
      return if there is a fib_nh_gw_family.
      
      Fixes: cc3a86c8 ("ipv6: Change rt6_probe to take a fib6_nh")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject · b8eb7183
      Committed by Jouni Hogander
      kobject_init_and_add() takes a reference even when it fails. The caller
      has to give this reference up in its error handling; otherwise the memory
      allocated by kobject_init_and_add() is never freed. Originally found
      by syzkaller:
      
      BUG: memory leak
      unreferenced object 0xffff8880679f8b08 (size 8):
        comm "netdev_register", pid 269, jiffies 4294693094 (age 12.132s)
        hex dump (first 8 bytes):
          72 78 2d 30 00 36 20 d4                          rx-0.6 .
        backtrace:
          [<000000008c93818e>] __kmalloc_track_caller+0x16e/0x290
          [<000000001f2e4e49>] kvasprintf+0xb1/0x140
          [<000000007f313394>] kvasprintf_const+0x56/0x160
          [<00000000aeca11c8>] kobject_set_name_vargs+0x5b/0x140
          [<0000000073a0367c>] kobject_init_and_add+0xd8/0x170
          [<0000000088838e4b>] net_rx_queue_update_kobjects+0x152/0x560
          [<000000006be5f104>] netdev_register_kobject+0x210/0x380
          [<00000000e31dab9d>] register_netdevice+0xa1b/0xf00
          [<00000000f68b2465>] __tun_chr_ioctl+0x20d5/0x3dd0
          [<000000004c50599f>] tun_chr_ioctl+0x2f/0x40
          [<00000000bbd4c317>] do_vfs_ioctl+0x1c7/0x1510
          [<00000000d4c59e8f>] ksys_ioctl+0x99/0xb0
          [<00000000946aea81>] __x64_sys_ioctl+0x78/0xb0
          [<0000000038d946e5>] do_syscall_64+0x16f/0x580
          [<00000000e0aa5d8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
          [<00000000285b3d1a>] 0xffffffffffffffff
      
      Cc: David Miller <davem@davemloft.net>
      Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
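      The leak pattern in the commit above can be illustrated with a toy
      reference-counted object. This is a user-space model under stated
      assumptions, not the kobject API: struct toy_obj and the toy_* functions
      are hypothetical stand-ins in which, as in the description, the name is
      allocated and a reference is taken before the step that may fail.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a kobject: a reference count plus an allocated name. */
struct toy_obj {
    int refs;
    char *name;
};

/*
 * The name is allocated and the first reference taken before the "add"
 * step, so both are still held when this function reports failure.
 */
static int toy_init_and_add(struct toy_obj *obj, const char *name, int fail_add)
{
    obj->refs = 1;
    obj->name = malloc(strlen(name) + 1);
    if (!obj->name)
        return -1;
    strcpy(obj->name, name);
    return fail_add ? -1 : 0;
}

/* Drops one reference; the name is released when the count reaches zero. */
static void toy_put(struct toy_obj *obj)
{
    if (--obj->refs == 0) {
        free(obj->name);
        obj->name = NULL;
    }
}

/* Caller-side error handling: give the reference back only on failure. */
static int toy_register(struct toy_obj *obj, const char *name, int fail_add)
{
    int err = toy_init_and_add(obj, name, fail_add);

    if (err)
        toy_put(obj);
    return err;
}
```

      In this model, skipping toy_put() on the failure path leaves the name
      allocated forever, while calling it on the success path would release an
      object that is still in use.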
  5. 20 Nov, 2019 3 commits
    • net/sched: act_pedit: fix WARN() in the traffic path · f67169fe
      Committed by Davide Caratti
      When configuring act_pedit rules, the number of keys is validated only on
      addition of a new entry. This is not sufficient to avoid hitting a WARN()
      in the traffic path: for example, it is possible to replace a valid entry
      with a new one having 0 extended keys, thus causing splats in dmesg like:
      
       pedit BUG: index 42
       WARNING: CPU: 2 PID: 4054 at net/sched/act_pedit.c:410 tcf_pedit_act+0xc84/0x1200 [act_pedit]
       [...]
       RIP: 0010:tcf_pedit_act+0xc84/0x1200 [act_pedit]
       Code: 89 fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ac 00 00 00 48 8b 44 24 10 48 c7 c7 a0 c4 e4 c0 8b 70 18 e8 1c 30 95 ea <0f> 0b e9 a0 fa ff ff e8 00 03 f5 ea e9 14 f4 ff ff 48 89 58 40 e9
       RSP: 0018:ffff888077c9f320 EFLAGS: 00010286
       RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffac2983a2
       RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888053927bec
       RBP: dffffc0000000000 R08: ffffed100a726209 R09: ffffed100a726209
       R10: 0000000000000001 R11: ffffed100a726208 R12: ffff88804beea780
       R13: ffff888079a77400 R14: ffff88804beea780 R15: ffff888027ab2000
       FS:  00007fdeec9bd740(0000) GS:ffff888053900000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007ffdb3dfd000 CR3: 000000004adb4006 CR4: 00000000001606e0
       Call Trace:
        tcf_action_exec+0x105/0x3f0
        tcf_classify+0xf2/0x410
        __dev_queue_xmit+0xcbf/0x2ae0
        ip_finish_output2+0x711/0x1fb0
        ip_output+0x1bf/0x4b0
        ip_send_skb+0x37/0xa0
        raw_sendmsg+0x180c/0x2430
        sock_sendmsg+0xdb/0x110
        __sys_sendto+0x257/0x2b0
        __x64_sys_sendto+0xdd/0x1b0
        do_syscall_64+0xa5/0x4e0
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
       RIP: 0033:0x7fdeeb72e993
       Code: 48 8b 0d e0 74 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 0d d6 2c 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 4b cc 00 00 48 89 04 24
       RSP: 002b:00007ffdb3de8a18 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
       RAX: ffffffffffffffda RBX: 000055c81972b700 RCX: 00007fdeeb72e993
       RDX: 0000000000000040 RSI: 000055c81972b700 RDI: 0000000000000003
       RBP: 00007ffdb3dea130 R08: 000055c819728510 R09: 0000000000000010
       R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
       R13: 000055c81972b6c0 R14: 000055c81972969c R15: 0000000000000080
      
      Fix this by moving the check on 'nkeys' earlier in tcf_pedit_init(), so that
      attempts to install rules having 0 keys are always rejected with -EINVAL.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • taprio: don't reject same mqprio settings · b5a0faa3
      Committed by Ivan Khoronzhuk
      The taprio qdisc allows the mqprio settings to be set, but only once. If
      mqprio settings are provided a second time, an error is returned, because
      changing the traffic class mapping in flight is not allowed, and that
      is expected. But if the configuration is exactly the same, there is no
      need to return an error. This allows the same command to be issued
      several times, changing only the base time, for instance, or only the
      schedule maps, while leaving the mqprio settings unmodified. This matches
      the message more closely: "Changing the traffic mapping of a running
      schedule is not supported", so reject mqprio only if it has really changed.
      
      Also corrected TC_BITMASK + 1 for consistency, as proposed.
      
      Fixes: a3d43c0d ("taprio: Add support adding an admin schedule")
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Tested-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tls: enable sk_msg redirect to tls socket egress · d4ffb02d
      Committed by Willem de Bruijn
      Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket
      with TLS_TX takes the following path:
      
        tcp_bpf_sendmsg_redir
          tcp_bpf_push_locked
            tcp_bpf_push
              kernel_sendpage_locked
                sock->ops->sendpage_locked
      
      Also update the flags test in tls_sw_sendpage_locked to allow flag
      MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this.
      
      Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t
      Link: https://github.com/wdebruij/kerneltools/commits/icept.2
      Fixes: 0608c69c ("bpf: sk_msg, sock{map|hash} redirect through ULP")
      Fixes: f3de19af ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 19 Nov, 2019 3 commits
  7. 17 Nov, 2019 6 commits
    • ipmr: Fix skb headroom in ipmr_get_route(). · 7901cd97
      Committed by Guillaume Nault
      In route.c, inet_rtm_getroute_build_skb() creates an skb with no
      headroom. This skb is then used by inet_rtm_getroute() which may pass
      it to rt_fill_info() and, from there, to ipmr_get_route(). The latter
      might try to reuse this skb by cloning it and prepending an IPv4
      header. But since the original skb has no headroom, skb_push() triggers
      skb_under_panic():
      
      skbuff: skb_under_panic: text:00000000ca46ad8a len:80 put:20 head:00000000cd28494e data:000000009366fd6b tail:0x3c end:0xec0 dev:veth0
      ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:108!
      invalid opcode: 0000 [#1] SMP KASAN PTI
      CPU: 6 PID: 587 Comm: ip Not tainted 5.4.0-rc6+ #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
      RIP: 0010:skb_panic+0xbf/0xd0
      Code: 41 a2 ff 8b 4b 70 4c 8b 4d d0 48 c7 c7 20 76 f5 8b 44 8b 45 bc 48 8b 55 c0 48 8b 75 c8 41 54 41 57 41 56 41 55 e8 75 dc 7a ff <0f> 0b 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
      RSP: 0018:ffff888059ddf0b0 EFLAGS: 00010286
      RAX: 0000000000000086 RBX: ffff888060a315c0 RCX: ffffffff8abe4822
      RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88806c9a79cc
      RBP: ffff888059ddf118 R08: ffffed100d9361b1 R09: ffffed100d9361b0
      R10: ffff88805c68aee3 R11: ffffed100d9361b1 R12: ffff88805d218000
      R13: ffff88805c689fec R14: 000000000000003c R15: 0000000000000ec0
      FS:  00007f6af184b700(0000) GS:ffff88806c980000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffc8204a000 CR3: 0000000057b40006 CR4: 0000000000360ee0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       skb_push+0x7e/0x80
       ipmr_get_route+0x459/0x6fa
       rt_fill_info+0x692/0x9f0
       inet_rtm_getroute+0xd26/0xf20
       rtnetlink_rcv_msg+0x45d/0x630
       netlink_rcv_skb+0x1a5/0x220
       rtnetlink_rcv+0x15/0x20
       netlink_unicast+0x305/0x3a0
       netlink_sendmsg+0x575/0x730
       sock_sendmsg+0xb5/0xc0
       ___sys_sendmsg+0x497/0x4f0
       __sys_sendmsg+0xcb/0x150
       __x64_sys_sendmsg+0x48/0x50
       do_syscall_64+0xd2/0xac0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Actually the original skb used to have enough headroom, but the
      skb_reserve() call was lost with the introduction of
      inet_rtm_getroute_build_skb() by commit 404eb77e ("ipv4: support
      sport, dport and ip_proto in RTM_GETROUTE").
      
      We could reserve some headroom again in inet_rtm_getroute_build_skb(),
      but this function shouldn't be responsible for handling the special
      case of ipmr_get_route(). Let's handle that directly in
      ipmr_get_route() by calling skb_realloc_headroom() instead of
      skb_clone().
      
      Fixes: 404eb77e ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
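      The headroom problem in the commit above can be pictured with a toy
      buffer in user space. This is a sketch under stated assumptions, not the
      sk_buff API: struct toy_buf and the toy_* helpers are hypothetical, and
      toy_push() returns NULL where the kernel's skb_push() would end up in
      skb_under_panic().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer: payload is data[0..len), headroom is data - head. */
struct toy_buf {
    unsigned char *head;
    unsigned char *data;
    size_t len;
};

static size_t toy_headroom(const struct toy_buf *b)
{
    return (size_t)(b->data - b->head);
}

/* Prepend n bytes in front of the payload; NULL if the headroom is too small. */
static unsigned char *toy_push(struct toy_buf *b, size_t n)
{
    if (toy_headroom(b) < n)
        return NULL;
    b->data -= n;
    b->len += n;
    return b->data;
}

/* Copy src into a freshly allocated buffer with `headroom` bytes in front. */
static int toy_realloc_headroom(struct toy_buf *dst, const struct toy_buf *src,
                                size_t headroom)
{
    dst->head = malloc(headroom + src->len);
    if (!dst->head)
        return -1;
    dst->data = dst->head + headroom;
    dst->len = src->len;
    memcpy(dst->data, src->data, src->len);
    return 0;
}
```

      A plain copy of a buffer with zero headroom still has zero headroom, so
      prepending a 20-byte header fails; a copy made with 20 bytes of headroom
      accepts the same push.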
    • net/smc: fix fastopen for non-blocking connect() · 8204df72
      Committed by Ursula Braun
      FASTOPEN does not work with SMC-sockets. Since SMC allows fallback to
      TCP native during connection start, the FASTOPEN setsockopts trigger
      this fallback, if the SMC-socket is still in state SMC_INIT.
      But if a FASTOPEN setsockopt is called after a non-blocking connect(),
      this is broken, and fallback does not make sense.
      This change complements
      commit cd206360 ("net/smc: avoid fallback in case of non-blocking connect")
      and fixes the syzbot reported problem "WARNING in smc_unhash_sk".
      
      Reported-by: syzbot+8488cc4cf1c9e09b8b86@syzkaller.appspotmail.com
      Fixes: e1bbdd57 ("net/smc: reduce sock_put() for fallback sockets")
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds: ib: update WR sizes when bringing up connection · a36e629e
      Committed by Dag Moxnes
      Currently WR sizes are updated from rds_ib_sysctl_max_send_wr and
      rds_ib_sysctl_max_recv_wr when a connection is shut down. As a result,
      a connection being down while rds_ib_sysctl_max_send_wr or
      rds_ib_sysctl_max_recv_wr are updated, will not update the sizes when
      it comes back up.
      
      Move resizing of WRs to rds_ib_setup_qp so that connections will be set up
      with the most current WR sizes.
      Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: tag_8021q: Fix dsa_8021q_restore_pvid for an absent pvid · c80ed84e
      Committed by Vladimir Oltean
      This sequence of operations:
      ip link set dev br0 type bridge vlan_filtering 1
      bridge vlan del dev swp2 vid 1
      ip link set dev br0 type bridge vlan_filtering 1
      ip link set dev br0 type bridge vlan_filtering 0
      
      apparently fails with the message:
      
      [   31.305716] sja1105 spi0.1: Reset switch and programmed static config. Reason: VLAN filtering
      [   31.322161] sja1105 spi0.1: Couldn't determine PVID attributes (pvid 0)
      [   31.328939] sja1105 spi0.1: Failed to setup VLAN tagging for port 1: -2
      [   31.335599] ------------[ cut here ]------------
      [   31.340215] WARNING: CPU: 1 PID: 194 at net/switchdev/switchdev.c:157 switchdev_port_attr_set_now+0x9c/0xa4
      [   31.349981] br0: Commit of attribute (id=6) failed.
      [   31.354890] Modules linked in:
      [   31.357942] CPU: 1 PID: 194 Comm: ip Not tainted 5.4.0-rc6-01792-gf4f632e07665-dirty #2062
      [   31.366167] Hardware name: Freescale LS1021A
      [   31.370437] [<c03144dc>] (unwind_backtrace) from [<c030e184>] (show_stack+0x10/0x14)
      [   31.378153] [<c030e184>] (show_stack) from [<c11d1c1c>] (dump_stack+0xe0/0x10c)
      [   31.385437] [<c11d1c1c>] (dump_stack) from [<c034c730>] (__warn+0xf4/0x10c)
      [   31.392373] [<c034c730>] (__warn) from [<c034c7bc>] (warn_slowpath_fmt+0x74/0xb8)
      [   31.399827] [<c034c7bc>] (warn_slowpath_fmt) from [<c11ca204>] (switchdev_port_attr_set_now+0x9c/0xa4)
      [   31.409097] [<c11ca204>] (switchdev_port_attr_set_now) from [<c117036c>] (__br_vlan_filter_toggle+0x6c/0x118)
      [   31.418971] [<c117036c>] (__br_vlan_filter_toggle) from [<c115d010>] (br_changelink+0xf8/0x518)
      [   31.427637] [<c115d010>] (br_changelink) from [<c0f8e9ec>] (__rtnl_newlink+0x3f4/0x76c)
      [   31.435613] [<c0f8e9ec>] (__rtnl_newlink) from [<c0f8eda8>] (rtnl_newlink+0x44/0x60)
      [   31.443329] [<c0f8eda8>] (rtnl_newlink) from [<c0f89f20>] (rtnetlink_rcv_msg+0x2cc/0x51c)
      [   31.451477] [<c0f89f20>] (rtnetlink_rcv_msg) from [<c1008df8>] (netlink_rcv_skb+0xb8/0x110)
      [   31.459796] [<c1008df8>] (netlink_rcv_skb) from [<c1008648>] (netlink_unicast+0x17c/0x1f8)
      [   31.468026] [<c1008648>] (netlink_unicast) from [<c1008980>] (netlink_sendmsg+0x2bc/0x3b4)
      [   31.476261] [<c1008980>] (netlink_sendmsg) from [<c0f43858>] (___sys_sendmsg+0x230/0x250)
      [   31.484408] [<c0f43858>] (___sys_sendmsg) from [<c0f44c84>] (__sys_sendmsg+0x50/0x8c)
      [   31.492209] [<c0f44c84>] (__sys_sendmsg) from [<c0301000>] (ret_fast_syscall+0x0/0x28)
      [   31.500090] Exception stack(0xedf47fa8 to 0xedf47ff0)
      [   31.505122] 7fa0:                   00000002 b6f2e060 00000003 beabd6a4 00000000 00000000
      [   31.513265] 7fc0: 00000002 b6f2e060 5d6e3213 00000128 00000000 00000001 00000006 000619c4
      [   31.521405] 7fe0: 00086078 beabd658 0005edbc b6e7ce68
      
      The reason is the implementation of br_get_pvid:
      
      static inline u16 br_get_pvid(const struct net_bridge_vlan_group *vg)
      {
      	if (!vg)
      		return 0;
      
      	smp_rmb();
      	return vg->pvid;
      }
      
      Since VID 0 is an invalid pvid from the bridge's point of view, let's
      add this check in dsa_8021q_restore_pvid to avoid restoring a pvid that
      doesn't really exist.
      
      Fixes: 5f33183b ("net: dsa: tag_8021q: Restore bridge VLANs when enabling vlan_filtering")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • seg6: fix skb transport_header after decap_and_validate() · c71644d0
      Committed by Andrea Mayer
      In the receive path (more precisely in ip6_rcv_core()) the
      skb->transport_header is set to skb->network_header + sizeof(*hdr). As a
      consequence, after routing operations, destination input expects to find
      skb->transport_header correctly set to the next protocol (or extension
      header) that follows the network protocol. However, decap behaviors (DX*,
      DT*) remove the outer IPv6 and SRH extension and do not set again the
      skb->transport_header pointer correctly. For this reason, the patch sets
      the skb->transport_header to the skb->network_header + sizeof(hdr) in each
      DX* and DT* behavior.
      Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • seg6: fix srh pointer in get_srh() · 7f91ed8c
      Committed by Andrea Mayer
      pskb_may_pull() may change pointers into the skb header. For this reason,
      it is mandatory to reload any pointer that points into the skb header.
      Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
      Signed-off-by: David S. Miller <davem@davemloft.net>
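      The rule in the commit above (reload pointers after a call that may move
      the header) has a simple user-space analogue. The sketch below is not
      kernel code: struct toy_pkt, toy_may_pull() and toy_get_hdr() are
      hypothetical, and toy_may_pull() always moves the buffer when it grows it
      so that the effect is deterministic.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet: a heap buffer that may be moved when more bytes are needed. */
struct toy_pkt {
    unsigned char *data;
    size_t len;
};

/*
 * Make sure at least `need` bytes are available. When the buffer grows it
 * is moved, so pointers computed from the old p->data become stale.
 */
static int toy_may_pull(struct toy_pkt *p, size_t need)
{
    unsigned char *fresh;

    if (need <= p->len)
        return 0;
    fresh = calloc(1, need);
    if (!fresh)
        return -1;
    memcpy(fresh, p->data, p->len);
    free(p->data);
    p->data = fresh;
    p->len = need;
    return 0;
}

/* Keep an offset across the call and derive the pointer only afterwards. */
static unsigned char *toy_get_hdr(struct toy_pkt *p, size_t off, size_t hdr_len)
{
    if (toy_may_pull(p, off + hdr_len))
        return NULL;
    return p->data + off; /* computed after the pull, never cached before it */
}
```

      Carrying an offset rather than a pointer across the call is one way to
      make the reload hard to forget.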
  8. 15 Nov, 2019 1 commit
  9. 13 Nov, 2019 10 commits
  10. 12 Nov, 2019 2 commits
    • xfrm: release device reference for invalid state · 4944a4b1
      Committed by Xiaodong Xu
      An ESP packet could be decrypted in async mode if the input handler for
      this packet returns -EINPROGRESS in xfrm_input(). At this moment the device
      reference in skb is held. Later xfrm_input() will be invoked again to
      resume the processing.
      If the transform state is still valid it would continue to release the
      device reference and there won't be a problem; however if the transform
      state is not valid when async resumption happens, the packet will be
      dropped while the device reference is still being held.
      When the device is deleted for some reason and the reference to this
      device is not properly released, the kernel will keep logging like:
      
      unregister_netdevice: waiting for ppp2 to become free. Usage count = 1
      
      The issue is observed when running IPsec traffic over a PPPoE device based
      on a bridge interface. By terminating the PPPoE connection on the server
      end multiple times, the PPPoE device on the client side will eventually
      get stuck on the above warning message.
      
      This patch will check the async mode first and continue to release device
      reference in async resumption, before it is dropped due to invalid state.
      
      v2: Do not assign address family from outer_mode in the transform if the
      state is invalid
      
      v3: Release device reference in the error path instead of jumping to resume
      
      Fixes: 4ce3dbe3 ("xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)")
      Signed-off-by: Xiaodong Xu <stid.smth@gmail.com>
      Reported-by: Bo Chen <chenborfc@163.com>
      Tested-by: Bo Chen <chenborfc@163.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • devlink: Add method for time-stamp on reporter's dump · d279505b
      Committed by Aya Levin
      When setting the dump's time-stamp, use ktime_get_real in addition to
      jiffies. This simplifies the user space implementation and bypasses
      some inconsistent behavior with translating jiffies to current time.
      The time taken is converted into nsec, to avoid the y2038 issue.
      
      Fixes: c8e1da0b ("devlink: Add health report functionality")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 10 Nov, 2019 1 commit
  12. 09 Nov, 2019 1 commit
    • S
      vsock/virtio: fix sock refcnt holding during the shutdown · ad8a7220
      Stefano Garzarella committed
      The "42f5cda5" commit rightly set SOCK_DONE on peer shutdown,
      but there is an issue if we receive the SHUTDOWN(RDWR) while the
      virtio_transport_close_timeout() is scheduled.
      In this case, when the timeout fires, the SOCK_DONE is already
      set and the virtio_transport_close_timeout() will not call
      virtio_transport_reset() and virtio_transport_do_close().
      As a result, both sockets remain open and will never be released,
      preventing the unloading of [virtio|vhost]_transport modules.
      
      This patch fixes this issue, calling virtio_transport_reset() and
      virtio_transport_do_close() when we receive the SHUTDOWN(RDWR)
      and there is nothing left to read.
      
      Fixes: 42f5cda5 ("vsock/virtio: set SOCK_DONE on peer shutdown")
      Cc: Stephen Barber <smbarber@chromium.org>
      Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ad8a7220
  13. 08 Nov, 2019 5 commits
    • A
      mac80211: fix station inactive_time shortly after boot · 285531f9
      Ahmed Zaki committed
      In the first 5 minutes after boot (time of INITIAL_JIFFIES),
      ieee80211_sta_last_active() returns zero if last_ack is zero. This
      leads to "inactive time" showing jiffies_to_msecs(jiffies).
      
       # iw wlan0 station get fc:ec:da:64:a6:dd
       Station fc:ec:da:64:a6:dd (on wlan0)
      	inactive time:	4294894049 ms
      	.
      	.
      	connected time:	70 seconds
      
      Fix by returning last_rx if last_ack == 0.
      Signed-off-by: Ahmed Zaki <anzaki@gmail.com>
      Link: https://lore.kernel.org/r/20191031121243.27694-1-anzaki@gmail.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      285531f9
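
The arithmetic behind the `4294894049 ms` reading in the mac80211 commit above can be reproduced with a small model. This is a sketch under stated assumptions, not the mac80211 source: it assumes HZ=1000 and a 32-bit jiffies counter (modeled with `uint32_t`), and the function names are hypothetical. It relies on the usual two's-complement conversion from `uint32_t` to `int32_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: HZ=1000 and a 32-bit jiffies counter. */
#define HZ 1000
#define INITIAL_JIFFIES ((uint32_t)(-300 * HZ))

/* The counter starts 300 seconds before it wraps to zero. */
static_assert(INITIAL_JIFFIES == 4294667296u, "INITIAL_JIFFIES for HZ=1000");

/* Wrap-safe "a is after b", in the spirit of the kernel's time_after(). */
bool time_after32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Behaviour described as buggy: last_ack == 0 can win the comparison. */
uint32_t last_active_old(uint32_t last_rx, uint32_t last_ack)
{
    if (time_after32(last_rx, last_ack))
        return last_rx;
    return last_ack;
}

/* Behaviour after the fix: use last_rx when no ACK was ever recorded. */
uint32_t last_active_fixed(uint32_t last_rx, uint32_t last_ack)
{
    if (last_ack == 0 || time_after32(last_rx, last_ack))
        return last_rx;
    return last_ack;
}

/* With HZ=1000 one jiffy is one millisecond. */
uint32_t inactive_ms(uint32_t now, uint32_t last_active)
{
    return (now - last_active) * (1000 / HZ);
}
```

With `now = 4294894049` (about 227 seconds after boot), a frame received 50 jiffies earlier and `last_ack == 0`, the old logic treats 0 as the more recent timestamp because the counter has not wrapped yet, so the inactive time equals the raw jiffies value. The fixed logic reports 50 ms.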
    • J
      mac80211: fix ieee80211_txq_setup_flows() failure path · 6dd47d97
      Johannes Berg committed
      If ieee80211_txq_setup_flows() fails, we don't clean up LED
      state properly, leading to crashes later on, fix that.
      
      Fixes: dc8b274f ("mac80211: Move up init of TXQs")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
      Link: https://lore.kernel.org/r/20191105154110.1ccf7112ba5d.I0ba865792446d051867b33153be65ce6b063d98c@changeid
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      6dd47d97
    • D
      ipv4: Fix table id reference in fib_sync_down_addr · e0a31262
      David Ahern committed
      Hendrik reported routes in the main table using source address are not
      removed when the address is removed. The problem is that fib_sync_down_addr
      does not account for devices in the default VRF which are associated
      with the main table. Fix by updating the table id reference.
      
      Fixes: 5a56a0b3 ("net: Don't delete routes in different VRFs")
      Reported-by: Hendrik Donner <hd@os-cillation.de>
      Signed-off-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e0a31262
    • E
      ipv6: fixes rt6_probe() and fib6_nh->last_probe init · 1bef4c22
      Eric Dumazet committed
      While looking at a syzbot KCSAN report [1], I found multiple
      issues in this code:
      
      1) fib6_nh->last_probe has an initial value of 0.
      
         While probably okay on 64bit kernels, this causes an issue
         on 32bit kernels since the time_after(jiffies, 0 + interval)
         might be false ~24 days after boot (for HZ=1000)
      
      2) The data-race found by KCSAN
         I could use READ_ONCE() and WRITE_ONCE(), but we can also
         take the opportunity to avoid piling up too many rt6_probe_deferred()
         work items by using cmpxchg() instead, so that only one CPU wins the race.
      
      [1]
      BUG: KCSAN: data-race in find_match / find_match
      
      write to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 1:
       rt6_probe net/ipv6/route.c:663 [inline]
       find_match net/ipv6/route.c:757 [inline]
       find_match+0x5bd/0x790 net/ipv6/route.c:733
       __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
       find_rr_leaf net/ipv6/route.c:852 [inline]
       rt6_select net/ipv6/route.c:896 [inline]
       fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
       ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
       ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
       fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
       ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
       ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
       ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
       ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
       inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
       inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
       __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
       tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
       tcp_xmit_probe_skb+0x19b/0x1d0 net/ipv4/tcp_output.c:3735
      
      read to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 0:
       rt6_probe net/ipv6/route.c:657 [inline]
       find_match net/ipv6/route.c:757 [inline]
       find_match+0x521/0x790 net/ipv6/route.c:733
       __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
       find_rr_leaf net/ipv6/route.c:852 [inline]
       rt6_select net/ipv6/route.c:896 [inline]
       fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
       ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
       ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
       fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
       ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
       ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
       ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
       ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
       inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
       inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
       __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 18894 Comm: udevd Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: cc3a86c8 ("ipv6: Change rt6_probe to take a fib6_nh")
      Fixes: f547fac6 ("ipv6: rate-limit probes for neighbourless routes")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1bef4c22
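
Both points of the rt6_probe() commit above can be illustrated in userspace. This is a hedged sketch, not the code in net/ipv6/route.c: it models a 32-bit jiffies counter with `uint32_t`, uses C11 `atomic_compare_exchange_strong()` as an analogue of the kernel's `cmpxchg()`, and the helper name `try_claim_probe` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wrap-safe "a is after b" on a 32-bit counter, in the spirit of time_after(). */
bool time_after32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Hypothetical helper: returns true for at most one caller per interval.
 * The compare-and-swap succeeds only if nobody else updated last_probe
 * since it was read, so racing callers cannot all schedule a probe.
 */
bool try_claim_probe(_Atomic uint32_t *last_probe, uint32_t now, uint32_t interval)
{
    assert(last_probe != NULL);

    uint32_t last = atomic_load(last_probe);

    if (!time_after32(now, last + interval))
        return false;
    return atomic_compare_exchange_strong(last_probe, &last, now);
}
```

With `last_probe` left at 0 and the counter past 2^31 plus the interval (roughly 24.8 days at HZ=1000), `time_after32(now, 0 + interval)` is false, which is the first issue listed in the commit message. For the second issue, two calls with the same `now` return true once and false once, since the first call moves `last_probe` forward.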
    • P
      nfc: netlink: fix double device reference drop · 025ec40b
      Pan Bian committed
      The function nfc_put_device(dev) is called twice to drop the reference
      to dev when there is no associated local llcp. Remove one of them to fix
      the bug.
      
      Fixes: 52feb444 ("NFC: Extend netlink interface for LTO, RW, and MIUX parameters support")
      Fixes: d9b8d8e1 ("NFC: llcp: Service Name Lookup netlink interface")
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Reviewed-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      025ec40b
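
The double-drop pattern described in the nfc commit above can be shown with a toy reference counter. This is an illustrative sketch only: `struct toy_dev`, `toy_get`, `toy_put` and the two `get_params_*` functions are hypothetical stand-ins, not the NFC netlink code. The assumption is that the error branch dropped the reference and then fell through to a shared exit label that dropped it again.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device with a plain reference counter (no locking, no freeing). */
struct toy_dev { int refs; };

void toy_get(struct toy_dev *d) { assert(d != NULL); d->refs++; }
void toy_put(struct toy_dev *d) { assert(d != NULL); d->refs--; }

/* Buggy shape: the error branch puts, then the exit label puts again. */
int get_params_buggy(struct toy_dev *d, bool has_local)
{
    int rc = 0;

    toy_get(d);
    if (!has_local) {
        toy_put(d);          /* first drop */
        rc = -ENODEV;
        goto exit;
    }
    /* ... the local state would be used here ... */
exit:
    toy_put(d);              /* second drop on the error path */
    return rc;
}

/* Fixed shape: the reference is dropped once, at the exit label. */
int get_params_fixed(struct toy_dev *d, bool has_local)
{
    int rc = 0;

    toy_get(d);
    if (!has_local) {
        rc = -ENODEV;
        goto exit;
    }
    /* ... the local state would be used here ... */
exit:
    toy_put(d);
    return rc;
}
```

Starting from one reference held by the caller, the buggy error path leaves the count at 0, so the caller's own reference has been consumed. The fixed path leaves it at 1.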
  14. 07 Nov, 2019 1 commit