1. 07 2月, 2019 33 次提交
    • P
      CIFS: Do not count -ENODATA as failure for query directory · c6961288
      Pavel Shilovsky 提交于
      commit 8e6e72aeceaaed5aeeb1cb43d3085de7ceb14f79 upstream.
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c6961288
    • T
      virtio_net: Differentiate sk_buff and xdp_frame on freeing · fbb49172
      Toshiaki Makita 提交于
      [ Upstream commit 5050471d35d1316ba32dfcbb409978337eb9e75e
      
        I had to fold commit df133f3f9625 ("virtio_net: bulk free tx skbs")
        into this to make it work.  ]
      
      We do not reset or free up unused buffers when enabling/disabling XDP,
      so it can happen that xdp_frames are freed after disabling XDP or
      sk_buffs are freed after enabling XDP on xdp tx queues.
      Thus we need to handle both forms (xdp_frames and sk_buffs) regardless
      of XDP setting.
      One way to trigger this problem is to disable XDP when napi_tx is
      enabled. In that case, virtnet_xdp_set() calls virtnet_napi_enable()
      which kicks NAPI. The NAPI handler will call virtnet_poll_cleantx()
      which invokes free_old_xmit_skbs() for queues which have been used by
      XDP.
      
      Note that even with this change we need to keep skipping
      free_old_xmit_skbs() from NAPI handlers when XDP is enabled, because XDP
      tx queues do not aquire queue locks.
      
      - v2: Use napi_consume_skb() instead of dev_consume_skb_any()
      
      Fixes: 4941d472 ("virtio-net: do not reset during XDP set")
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fbb49172
    • T
      virtio_net: Use xdp_return_frame to free xdp_frames on destroying vqs · ed6a5fc8
      Toshiaki Makita 提交于
      [ Upstream commit 07b344f494ddda9f061b396407c96df8c46c82b5 ]
      
      put_page() can work as a fallback for freeing xdp_frames, but the
      appropriate way is to use xdp_return_frame().
      
      Fixes: cac320c8 ("virtio_net: convert to use generic xdp_frame and xdp_return_frame API")
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed6a5fc8
    • T
      virtio_net: Don't process redirected XDP frames when XDP is disabled · 05e260f9
      Toshiaki Makita 提交于
      [ Upstream commit 03aa6d34868c07b2b1b8b2db080602d7ec528173 ]
      
      Commit 8dcc5b0a ("virtio_net: fix ndo_xdp_xmit crash towards dev not
      ready for XDP") tried to avoid access to unexpected sq while XDP is
      disabled, but was not complete.
      
      There was a small window which causes out of bounds sq access in
      virtnet_xdp_xmit() while disabling XDP.
      
      An example case of
       - curr_queue_pairs = 6 (2 for SKB and 4 for XDP)
       - online_cpu_num = xdp_queue_paris = 4
      when XDP is enabled:
      
      CPU 0                         CPU 1
      (Disabling XDP)               (Processing redirected XDP frames)
      
                                    virtnet_xdp_xmit()
      virtnet_xdp_set()
       _virtnet_set_queues()
        set curr_queue_pairs (2)
                                     check if rq->xdp_prog is not NULL
                                     virtnet_xdp_sq(vi)
                                      qp = curr_queue_pairs -
                                           xdp_queue_pairs +
                                           smp_processor_id()
                                         = 2 - 4 + 1 = -1
                                      sq = &vi->sq[qp] // out of bounds access
        set xdp_queue_pairs (0)
        rq->xdp_prog = NULL
      
      Basically we should not change curr_queue_pairs and xdp_queue_pairs
      while someone can read the values. Thus, when disabling XDP, assign NULL
      to rq->xdp_prog first, and wait for RCU grace period, then change
      xxx_queue_pairs.
      Note that we need to keep the current order when enabling XDP though.
      
      - v2: Make rcu_assign_pointer/synchronize_net conditional instead of
            _virtnet_set_queues.
      
      Fixes: 186b3c99 ("virtio-net: support XDP_REDIRECT")
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05e260f9
    • T
      virtio_net: Fix out of bounds access of sq · 0921dd50
      Toshiaki Makita 提交于
      [ Upstream commit 1667c08a9d31c7cdf09f4890816bfbf20b685495 ]
      
      When XDP is disabled, curr_queue_pairs + smp_processor_id() can be
      larger than max_queue_pairs.
      There is no guarantee that we have enough XDP send queues dedicated for
      each cpu when XDP is disabled, so do not count drops on sq in that case.
      
      Fixes: 5b8f3c8d ("virtio_net: Add XDP related stats")
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0921dd50
    • T
      virtio_net: Fix not restoring real_num_rx_queues · d97117bd
      Toshiaki Makita 提交于
      [ Upstream commit 188313c137c4f76afd0862f50dbc185b198b9e2a ]
      
      When _virtnet_set_queues() failed we did not restore real_num_rx_queues.
      Fix this by placing the change of real_num_rx_queues after
      _virtnet_set_queues().
      This order is also in line with virtnet_set_channels().
      
      Fixes: 4941d472 ("virtio-net: do not reset during XDP set")
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d97117bd
    • T
      virtio_net: Don't call free_old_xmit_skbs for xdp_frames · 4c2e63dc
      Toshiaki Makita 提交于
      [ Upstream commit 534da5e856334fb54cb0272a9fb3afec28ea3aed ]
      
      When napi_tx is enabled, virtnet_poll_cleantx() called
      free_old_xmit_skbs() even for xdp send queue.
      This is bogus since the queue has xdp_frames, not sk_buffs, thus mangled
      device tx bytes counters because skb->len is meaningless value, and even
      triggered oops due to general protection fault on freeing them.
      
      Since xdp send queues do not aquire locks, old xdp_frames should be
      freed only in virtnet_xdp_xmit(), so just skip free_old_xmit_skbs() for
      xdp send queues.
      
      Similarly virtnet_poll_tx() called free_old_xmit_skbs(). This NAPI
      handler is called even without calling start_xmit() because cb for tx is
      by default enabled. Once the handler is called, it enabled the cb again,
      and then the handler would be called again. We don't need this handler
      for XDP, so don't enable cb as well as not calling free_old_xmit_skbs().
      
      Also, we need to disable tx NAPI when disabling XDP, so
      virtnet_poll_tx() can safely access curr_queue_pairs and
      xdp_queue_pairs, which are not atomically updated while disabling XDP.
      
      Fixes: b92f1e67 ("virtio-net: transmit napi")
      Fixes: 7b0411ef ("virtio-net: clean tx descriptors from rx napi")
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c2e63dc
    • T
      virtio_net: Don't enable NAPI when interface is down · b6862baa
      Toshiaki Makita 提交于
      [ Upstream commit 8be4d9a492f88b96d4d3a06c6cbedbc40ca14c83 ]
      
      Commit 4e09ff53 ("virtio-net: disable NAPI only when enabled during
      XDP set") tried to fix inappropriate NAPI enabling/disabling when
      !netif_running(), but was not complete.
      
      On error path virtio_net could enable NAPI even when !netif_running().
      This can cause enabling NAPI twice on virtnet_open(), which would
      trigger BUG_ON() in napi_enable().
      
      Fixes: 4941d472 ("virtio-net: do not reset during XDP set")
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b6862baa
    • X
      sctp: set flow sport from saddr only when it's 0 · 37b34a91
      Xin Long 提交于
      [ Upstream commit ecf938fe7d0088077ee1280419a2b3c5429b47c8 ]
      
      Now sctp_transport_pmtu() passes transport->saddr into .get_dst() to set
      flow sport from 'saddr'. However, transport->saddr is set only when
      transport->dst exists in sctp_transport_route().
      
      If sctp_transport_pmtu() is called without transport->saddr set, like
      when transport->dst doesn't exists, the flow sport will be set to 0
      from transport->saddr, which will cause a wrong route to be got.
      
      Commit 6e91b578 ("sctp: re-use sctp_transport_pmtu in
      sctp_transport_route") made the issue be triggered more easily
      since sctp_transport_pmtu() would be called in sctp_transport_route()
      after that.
      
      In gerneral, fl4->fl4_sport should always be set to
      htons(asoc->base.bind_addr.port), unless transport->asoc doesn't exist
      in sctp_v4/6_get_dst(), which is the case:
      
        sctp_ootb_pkt_new() ->
          sctp_transport_route()
      
      For that, we can simply handle it by setting flow sport from saddr only
      when it's 0 in sctp_v4/6_get_dst().
      
      Fixes: 6e91b578 ("sctp: re-use sctp_transport_pmtu in sctp_transport_route")
      Reported-by: NYing Xu <yinxu@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      37b34a91
    • X
      sctp: set chunk transport correctly when it's a new asoc · cbf23d40
      Xin Long 提交于
      [ Upstream commit 4ff40b86262b73553ee47cc3784ce8ba0f220bd8 ]
      
      In the paths:
      
        sctp_sf_do_unexpected_init() ->
          sctp_make_init_ack()
        sctp_sf_do_dupcook_a/b()() ->
          sctp_sf_do_5_1D_ce()
      
      The new chunk 'retval' transport is set from the incoming chunk 'chunk'
      transport. However, 'retval' transport belong to the new asoc, which
      is a different one from 'chunk' transport's asoc.
      
      It will cause that the 'retval' chunk gets set with a wrong transport.
      Later when sending it and because of Commit b9fd6839 ("sctp: add
      sctp_packet_singleton"), sctp_packet_singleton() will set some fields,
      like vtag to 'retval' chunk from that wrong transport's asoc.
      
      This patch is to fix it by setting 'retval' transport correctly which
      belongs to the right asoc in sctp_make_init_ack() and
      sctp_sf_do_5_1D_ce().
      
      Fixes: b9fd6839 ("sctp: add sctp_packet_singleton")
      Reported-by: NYing Xu <yinxu@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cbf23d40
    • B
      Revert "net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager" · a188f568
      Bodong Wang 提交于
      [ Upstream commit 4e046de0f50e04acd48eb373d6a9061ddf014e0c ]
      
      This reverts commit 5f5991f3.
      
      With the original commit, eswitch instance will not be initialized for
      a function which is vport group manager but not eswitch manager such as
      host PF on SmartNIC (BlueField) card. This will result in a kernel crash
      when such a vport group manager is trying to access vports in its group.
      E.g, PF vport manager (not eswitch manager) tries to configure the MAC
      of its VF vport, a kernel trace will happen similar as bellow:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
       ...
       RIP: 0010:mlx5_eswitch_get_vport_config+0xc/0x180 [mlx5_core]
       ...
      
      Fixes: 5f5991f3 ("net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager")
      Signed-off-by: NBodong Wang <bodong@mellanox.com>
      Reported-by: NYuval Avnery <yuvalav@mellanox.com>
      Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
      Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a188f568
    • N
      ip6mr: Fix notifiers call on mroute_clean_tables() · 505e5f3d
      Nir Dotan 提交于
      [ Upstream commit 146820cc240f4389cf33481c058d9493aef95e25 ]
      
      When the MC route socket is closed, mroute_clean_tables() is called to
      cleanup existing routes. Mistakenly notifiers call was put on the cleanup
      of the unresolved MC route entries cache.
      In a case where the MC socket closes before an unresolved route expires,
      the notifier call leads to a crash, caused by the driver trying to
      increment a non initialized refcount_t object [1] and then when handling
      is done, to decrement it [2]. This was detected by a test recently added in
      commit 6d4efada3b82 ("selftests: forwarding: Add multicast routing test").
      
      Fix that by putting notifiers call on the resolved entries traversal,
      instead of on the unresolved entries traversal.
      
      [1]
      
      [  245.748967] refcount_t: increment on 0; use-after-free.
      [  245.754829] WARNING: CPU: 3 PID: 3223 at lib/refcount.c:153 refcount_inc_checked+0x2b/0x30
      ...
      [  245.802357] Hardware name: Mellanox Technologies Ltd. MSN2740/SA001237, BIOS 5.6.5 06/07/2016
      [  245.811873] RIP: 0010:refcount_inc_checked+0x2b/0x30
      ...
      [  245.907487] Call Trace:
      [  245.910231]  mlxsw_sp_router_fib_event.cold.181+0x42/0x47 [mlxsw_spectrum]
      [  245.917913]  notifier_call_chain+0x45/0x7
      [  245.922484]  atomic_notifier_call_chain+0x15/0x20
      [  245.927729]  call_fib_notifiers+0x15/0x30
      [  245.932205]  mroute_clean_tables+0x372/0x3f
      [  245.936971]  ip6mr_sk_done+0xb1/0xc0
      [  245.940960]  ip6_mroute_setsockopt+0x1da/0x5f0
      ...
      
      [2]
      
      [  246.128487] refcount_t: underflow; use-after-free.
      [  246.133859] WARNING: CPU: 0 PID: 7 at lib/refcount.c:187 refcount_sub_and_test_checked+0x4c/0x60
      [  246.183521] Hardware name: Mellanox Technologies Ltd. MSN2740/SA001237, BIOS 5.6.5 06/07/2016
      ...
      [  246.193062] Workqueue: mlxsw_core_ordered mlxsw_sp_router_fibmr_event_work [mlxsw_spectrum]
      [  246.202394] RIP: 0010:refcount_sub_and_test_checked+0x4c/0x60
      ...
      [  246.298889] Call Trace:
      [  246.301617]  refcount_dec_and_test_checked+0x11/0x20
      [  246.307170]  mlxsw_sp_router_fibmr_event_work.cold.196+0x47/0x78 [mlxsw_spectrum]
      [  246.315531]  process_one_work+0x1fa/0x3f0
      [  246.320005]  worker_thread+0x2f/0x3e0
      [  246.324083]  kthread+0x118/0x130
      [  246.327683]  ? wq_update_unbound_numa+0x1b0/0x1b0
      [  246.332926]  ? kthread_park+0x80/0x80
      [  246.337013]  ret_from_fork+0x1f/0x30
      
      Fixes: 088aa3ee ("ip6mr: Support fib notifications")
      Signed-off-by: NNir Dotan <nird@mellanox.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      505e5f3d
    • A
      net/mlx5e: Allow MAC invalidation while spoofchk is ON · 50990a40
      Aya Levin 提交于
      [ Upstream commit 9d2cbdc5d334967c35b5f58c7bf3208e17325647 ]
      
      Prior to this patch the driver prohibited spoof checking on invalid MAC.
      Now the user can set this configuration if it wishes to.
      
      This is required since libvirt might invalidate the VF Mac by setting it
      to zero, while spoofcheck is ON.
      
      Fixes: 1ab2068a ("net/mlx5: Implement vports admin state backup/restore")
      Signed-off-by: NAya Levin <ayal@mellanox.com>
      Reviewed-by: NEran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50990a40
    • X
      sctp: improve the events for sctp stream adding · 4ec13999
      Xin Long 提交于
      [ Upstream commit 8220c870cb0f4eaa4e335c9645dbd9a1c461c1dd ]
      
      This patch is to improve sctp stream adding events in 2 places:
      
        1. In sctp_process_strreset_addstrm_out(), move up SCTP_MAX_STREAM
           and in stream allocation failure checks, as the adding has to
           succeed after reconf_timer stops for the in stream adding
           request retransmission.
      
        3. In sctp_process_strreset_addstrm_in(), no event should be sent,
           as no in or out stream is added here.
      
      Fixes: 50a41591 ("sctp: implement receiver-side procedures for the Add Outgoing Streams Request Parameter")
      Fixes: c5c4ebb3 ("sctp: implement receiver-side procedures for the Add Incoming Streams Request Parameter")
      Reported-by: NYing Xu <yinxu@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4ec13999
    • L
      net: ip6_gre: always reports o_key to userspace · 9f7d849b
      Lorenzo Bianconi 提交于
      [ Upstream commit c706863bc8902d0c2d1a5a27ac8e1ead5d06b79d ]
      
      As Erspan_v4, Erspan_v6 protocol relies on o_key to configure
      session id header field. However TUNNEL_KEY bit is cleared in
      ip6erspan_tunnel_xmit since ERSPAN protocol does not set the key field
      of the external GRE header and so the configured o_key is not reported
      to userspace. The issue can be triggered with the following reproducer:
      
      $ip link add ip6erspan1 type ip6erspan local 2000::1 remote 2000::2 \
          key 1 seq erspan_ver 1
      $ip link set ip6erspan1 up
      ip -d link sh ip6erspan1
      
      ip6erspan1@NONE: <BROADCAST,MULTICAST> mtu 1422 qdisc noop state DOWN mode DEFAULT
          link/ether ba:ff:09:24:c3:0e brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500
          ip6erspan remote 2000::2 local 2000::1 encaplimit 4 flowlabel 0x00000 ikey 0.0.0.1 iseq oseq
      
      Fix the issue adding TUNNEL_KEY bit to the o_flags parameter in
      ip6gre_fill_info
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f7d849b
    • J
      vhost: fix OOB in get_rx_bufs() · aafe74b7
      Jason Wang 提交于
      [ Upstream commit b46a0bf78ad7b150ef5910da83859f7f5a514ffd ]
      
      After batched used ring updating was introduced in commit e2b3b35e
      ("vhost_net: batch used ring update in rx"). We tend to batch heads in
      vq->heads for more than one packet. But the quota passed to
      get_rx_bufs() was not correctly limited, which can result a OOB write
      in vq->heads.
      
              headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx,
                          vhost_len, &in, vq_log, &log,
                          likely(mergeable) ? UIO_MAXIOV : 1);
      
      UIO_MAXIOV was still used which is wrong since we could have batched
      used in vq->heads, this will cause OOB if the next buffer needs more
      than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after we've
      batched 64 (VHOST_NET_BATCH) heads:
      Acked-by: NStefan Hajnoczi <stefanha@redhat.com>
      
      =============================================================================
      BUG kmalloc-8k (Tainted: G    B            ): Redzone overwritten
      -----------------------------------------------------------------------------
      
      INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc
      INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674
          kmem_cache_alloc_trace+0xbb/0x140
          alloc_pd+0x22/0x60
          gen8_ppgtt_create+0x11d/0x5f0
          i915_ppgtt_create+0x16/0x80
          i915_gem_create_context+0x248/0x390
          i915_gem_context_create_ioctl+0x4b/0xe0
          drm_ioctl_kernel+0xa5/0xf0
          drm_ioctl+0x2ed/0x3a0
          do_vfs_ioctl+0x9f/0x620
          ksys_ioctl+0x6b/0x80
          __x64_sys_ioctl+0x11/0x20
          do_syscall_64+0x43/0xf0
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x          (null) flags=0x200000000010201
      INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b
      
      Fixing this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for
      vhost-net. This is done through set the limitation through
      vhost_dev_init(), then set_owner can allocate the number of iov in a
      per device manner.
      
      This fixes CVE-2018-16880.
      
      Fixes: e2b3b35e ("vhost_net: batch used ring update in rx")
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aafe74b7
    • M
      ucc_geth: Reset BQL queue when stopping device · d0773852
      Mathias Thore 提交于
      [ Upstream commit e15aa3b2b1388c399c1a2ce08550d2cc4f7e3e14 ]
      
      After a timeout event caused by for example a broadcast storm, when
      the MAC and PHY are reset, the BQL TX queue needs to be reset as
      well. Otherwise, the device will exhibit severe performance issues
      even after the storm has ended.
      Co-authored-by: NDavid Gounaris <david.gounaris@infinera.com>
      Signed-off-by: NMathias Thore <mathias.thore@infinera.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0773852
    • G
      tun: move the call to tun_set_real_num_queues · dace5274
      George Amanakis 提交于
      [ Upstream commit 3a03cb8456cc1d61c467a5375e0a10e5207b948c ]
      
      Call tun_set_real_num_queues() after the increment of tun->numqueues
      since the former depends on it. Otherwise, the number of queues is not
      correctly accounted for, which results to warnings similar to:
      "vnet0 selects TX queue 11, but real number of TX queues is 11".
      
      Fixes: 0b7959b62573 ("tun: publish tfile after it's fully initialized")
      Reported-and-tested-by: NGeorge Amanakis <gamanakis@gmail.com>
      Signed-off-by: NGeorge Amanakis <gamanakis@gmail.com>
      Signed-off-by: NStanislav Fomichev <sdf@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dace5274
    • X
      sctp: improve the events for sctp stream reset · e569927a
      Xin Long 提交于
      [ Upstream commit 2e6dc4d95110becfe0ff4c3d4749c33ea166e9e7 ]
      
      This patch is to improve sctp stream reset events in 4 places:
      
        1. In sctp_process_strreset_outreq(), the flag should always be set with
           SCTP_STREAM_RESET_INCOMING_SSN instead of OUTGOING, as receiver's in
           stream is reset here.
        2. In sctp_process_strreset_outreq(), move up SCTP_STRRESET_ERR_WRONG_SSN
           check, as the reset has to succeed after reconf_timer stops for the
           in stream reset request retransmission.
        3. In sctp_process_strreset_inreq(), no event should be sent, as no in
           or out stream is reset here.
        4. In sctp_process_strreset_resp(), SCTP_STREAM_RESET_INCOMING_SSN or
           OUTGOING event should always be sent for stream reset requests, no
           matter it fails or succeeds to process the request.
      
      Fixes: 81054476 ("sctp: implement receiver-side procedures for the Outgoing SSN Reset Request Parameter")
      Fixes: 16e1a919 ("sctp: implement receiver-side procedures for the Incoming SSN Reset Request Parameter")
      Fixes: 11ae76e6 ("sctp: implement receiver-side procedures for the Reconf Response Parameter")
      Reported-by: NYing Xu <yinxu@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e569927a
    • S
      ravb: expand rx descriptor data to accommodate hw checksum · 4fae696c
      Simon Horman 提交于
      [ Upstream commit 12da64300fbc76b875900445f4146c3dc617d43e ]
      
      EtherAVB may provide a checksum of packet data appended to packet data. In
      order to allow this checksum to be received by the host descriptor data
      needs to be enlarged by 2 bytes to accommodate the checksum.
      
      In the case of MTU-sized packets without a VLAN tag the
      checksum were already accommodated by virtue of the space reserved for the
      VLAN tag. However, a packet of MTU-size with a  VLAN tag consumed all
      packet data space provided by a descriptor leaving no space for the
      trailing checksum.
      
      This was not detected by the driver which incorrectly used the last two
      bytes of packet data as the checksum and truncate the packet by two bytes.
      This resulted all such packets being dropped.
      
      A work around is to disable RX checksum offload
       # ethtool -K eth0 rx off
      
      This patch resolves this problem by increasing the size available for
      packet data in RX descriptors by two bytes.
      
      Tested on R-Car E3 (r8a77990) ES1.0 based Ebisu-4D board
      
      v2
      * Use sizeof(__sum16) directly rather than adding a driver-local
        #define for the size of the checksum provided by the hw (2 bytes).
      
      Fixes: 4d86d381 ("ravb: RX checksum offload")
      Signed-off-by: NSimon Horman <horms+renesas@verge.net.au>
      Reviewed-by: NSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4fae696c
    • J
      net: set default network namespace in init_dummy_netdev() · 5f1a18e0
      Josh Elsasser 提交于
      [ Upstream commit 35edfdc77f683c8fd27d7732af06cf6489af60a5 ]
      
      Assign a default net namespace to netdevs created by init_dummy_netdev().
      Fixes a NULL pointer dereference caused by busy-polling a socket bound to
      an iwlwifi wireless device, which bumps the per-net BUSYPOLLRXPACKETS stat
      if napi_poll() received packets:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000190
        IP: napi_busy_loop+0xd6/0x200
        Call Trace:
          sock_poll+0x5e/0x80
          do_sys_poll+0x324/0x5a0
          SyS_poll+0x6c/0xf0
          do_syscall_64+0x6b/0x1f0
          entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: 7db6b048 ("net: Commonize busy polling code to focus on napi_id instead of socket")
      Signed-off-by: NJosh Elsasser <jelsasser@appneta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5f1a18e0
    • B
      net/rose: fix NULL ax25_cb kernel panic · fc4154c7
      Bernard Pidoux 提交于
      [ Upstream commit b0cf029234f9b18e10703ba5147f0389c382bccc ]
      
      When an internally generated frame is handled by rose_xmit(),
      rose_route_frame() is called:
      
              if (!rose_route_frame(skb, NULL)) {
                      dev_kfree_skb(skb);
                      stats->tx_errors++;
                      return NETDEV_TX_OK;
              }
      
      We have the same code sequence in Net/Rom where an internally generated
      frame is handled by nr_xmit() calling nr_route_frame(skb, NULL).
      However, in this function NULL argument is tested while it is not in
      rose_route_frame().
      Then kernel panic occurs later on when calling ax25cmp() with a NULL
      ax25_cb argument as reported many times and recently with syzbot.
      
      We need to test if ax25 is NULL before using it.
      
      Testing:
      Built kernel with CONFIG_ROSE=y.
      Signed-off-by: NBernard Pidoux <f6bvp@free.fr>
      Acked-by: NDmitry Vyukov <dvyukov@google.com>
      Reported-by: syzbot+1a2c456a1ea08fa5b5f7@syzkaller.appspotmail.com
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Bernard Pidoux <f6bvp@free.fr>
      Cc: linux-hams@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc4154c7
    • C
      netrom: switch to sock timer API · 2c6b5724
      Cong Wang 提交于
      [ Upstream commit 63346650c1a94a92be61a57416ac88c0a47c4327 ]
      
      sk_reset_timer() and sk_stop_timer() properly handle
      sock refcnt for timer function. Switching to them
      could fix a refcounting bug reported by syzbot.
      
      Reported-and-tested-by: syzbot+defa700d16f1bd1b9a05@syzkaller.appspotmail.com
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-hams@vger.kernel.org
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c6b5724
    • A
      net/mlx4_core: Add masking for a few queries on HCA caps · 00865891
      Aya Levin 提交于
      [ Upstream commit a40ded6043658444ee4dd6ee374119e4e98b33fc ]
      
      Driver reads the query HCA capabilities without the corresponding masks.
      Without the correct masks, the base addresses of the queues are
      unaligned.  In addition some reserved bits were wrongly read.  Using the
      correct masks, ensures alignment of the base addresses and allows future
      firmware versions safe use of the reserved bits.
      
      Fixes: ab9c17a0 ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet")
      Fixes: 0ff1fb65 ("{NET, IB}/mlx4: Add device managed flow steering firmware API")
      Signed-off-by: NAya Levin <ayal@mellanox.com>
      Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      00865891
    • L
      net: ip_gre: use erspan key field for tunnel lookup · 0a198e0b
      Lorenzo Bianconi 提交于
      [ Upstream commit cb73ee40b1b381eaf3749e6dbeed567bb38e5258 ]
      
      Use ERSPAN key header field as tunnel key in gre_parse_header routine
      since ERSPAN protocol sets the key field of the external GRE header to
      0 resulting in a tunnel lookup fail in ip6gre_err.
      In addition remove key field parsing and pskb_may_pull check in
      erspan_rcv and ip6erspan_rcv
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a198e0b
    • L
      net: ip_gre: always reports o_key to userspace · 897ea28b
      Lorenzo Bianconi 提交于
      [ Upstream commit feaf5c796b3f0240f10d0d6d0b686715fd58a05b ]
      
      Erspan protocol (version 1 and 2) relies on o_key to configure
      session id header field. However TUNNEL_KEY bit is cleared in
      erspan_xmit since ERSPAN protocol does not set the key field
      of the external GRE header and so the configured o_key is not reported
      to userspace. The issue can be triggered with the following reproducer:
      
      $ip link add erspan1 type erspan local 192.168.0.1 remote 192.168.0.2 \
          key 1 seq erspan_ver 1
      $ip link set erspan1 up
      $ip -d link sh erspan1
      
      erspan1@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UNKNOWN mode DEFAULT
        link/ether 52:aa:99:95:9a:b5 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500
        erspan remote 192.168.0.2 local 192.168.0.1 ttl inherit ikey 0.0.0.1 iseq oseq erspan_index 0
      
      Fix the issue adding TUNNEL_KEY bit to the o_flags parameter in
      ipgre_fill_info
      
      Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN")
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      897ea28b
    • J
      l2tp: fix reading optional fields of L2TPv3 · 8de67666
      Jacob Wen 提交于
      [ Upstream commit 4522a70db7aa5e77526a4079628578599821b193 ]
      
      Use pskb_may_pull() to make sure the optional fields are in skb linear
      parts, so we can safely read them later.
      
      It's easy to reproduce the issue with a net driver that supports paged
      skb data. Just create a L2TPv3 over IP tunnel and then generates some
      network traffic.
      Once reproduced, rx err in /sys/kernel/debug/l2tp/tunnels will increase.
      
      Changes in v4:
      1. s/l2tp_v3_pull_opt/l2tp_v3_ensure_opt_in_linear/
      2. s/tunnel->version != L2TP_HDR_VER_2/tunnel->version == L2TP_HDR_VER_3/
      3. Add 'Fixes' in commit messages.
      
      Changes in v3:
      1. To keep consistency, move the code out of l2tp_recv_common.
      2. Use "net" instead of "net-next", since this is a bug fix.
      
      Changes in v2:
      1. Only fix L2TPv3 to make code simple.
         To fix both L2TPv3 and L2TPv2, we'd better refactor l2tp_recv_common.
         It's complicated to do so.
      2. Reloading pointers after pskb_may_pull
      
      Fixes: f7faffa3 ("l2tp: Add L2TPv3 protocol support")
      Fixes: 0d76751f ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
      Fixes: a32e0eec ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
      Signed-off-by: NJacob Wen <jian.w.wen@oracle.com>
      Acked-by: NGuillaume Nault <gnault@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8de67666
    • J
      l2tp: copy 4 more bytes to linear part if necessary · 3d418a25
      Jacob Wen 提交于
      [ Upstream commit 91c524708de6207f59dd3512518d8a1c7b434ee3 ]
      
      The size of L2TPv2 header with all optional fields is 14 bytes.
      l2tp_udp_recv_core only moves 10 bytes to the linear part of a
      skb. This may lead to l2tp_recv_common read data outside of a skb.
      
      This patch make sure that there is at least 14 bytes in the linear
      part of a skb to meet the maximum need of l2tp_udp_recv_core and
      l2tp_recv_common. The minimum size of both PPP HDLC-like frame and
      Ethernet frame is larger than 14 bytes, so we are safe to do so.
      
      Also remove L2TP_HDR_SIZE_NOSEQ, it is unused now.
      
      Fixes: fd558d18 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
      Suggested-by: NGuillaume Nault <gnault@redhat.com>
      Signed-off-by: NJacob Wen <jian.w.wen@oracle.com>
      Acked-by: NGuillaume Nault <gnault@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d418a25
    • D
      ipvlan, l3mdev: fix broken l3s mode wrt local routes · fcc9c69a
      Daniel Borkmann 提交于
      [ Upstream commit d5256083f62e2720f75bb3c5a928a0afe47d6bc3 ]
      
      While implementing ipvlan l3 and l3s mode for kubernetes CNI plugin,
      I ran into the issue that while l3 mode is working fine, l3s mode
      does not have any connectivity to kube-apiserver and hence all pods
      end up in Error state as well. The ipvlan master device sits on
      top of a bond device and hostns traffic to kube-apiserver (also running
      in hostns) is DNATed from 10.152.183.1:443 to 139.178.29.207:37573
      where the latter is the address of the bond0. While in l3 mode, a
      curl to https://10.152.183.1:443 or to https://139.178.29.207:37573
      works fine from hostns, neither of them do in case of l3s. In the
      latter only a curl to https://127.0.0.1:37573 appeared to work where
      for local addresses of bond0 I saw kernel suddenly starting to emit
      ARP requests to query HW address of bond0 which remained unanswered
      and neighbor entries in INCOMPLETE state. These ARP requests only
      happen while in l3s.
      
      Debugging this further, I found the issue is that l3s mode is piggy-
      backing on l3 master device, and in this case local routes are using
      l3mdev_master_dev_rcu(dev) instead of net->loopback_dev as per commit
      f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev
      if relevant") and 5f02ce24 ("net: l3mdev: Allow the l3mdev to be
      a loopback"). I found that reverting them back into using the
      net->loopback_dev fixed ipvlan l3s connectivity and got everything
      working for the CNI.
      
      Now judging from 4fbae7d8 ("ipvlan: Introduce l3s mode") and the
      l3mdev paper in [0] the only sole reason why ipvlan l3s is relying
      on l3 master device is to get the l3mdev_ip_rcv() receive hook for
      setting the dst entry of the input route without adding its own
      ipvlan specific hacks into the receive path, however, any l3 domain
      semantics beyond just that are breaking l3s operation. Note that
      ipvlan also has the ability to dynamically switch its internal
      operation from l3 to l3s for all ports via ipvlan_set_port_mode()
      at runtime. In any case, l3 vs l3s soley distinguishes itself by
      'de-confusing' netfilter through switching skb->dev to ipvlan slave
      device late in NF_INET_LOCAL_IN before handing the skb to L4.
      
      Minimal fix taken here is to add a IFF_L3MDEV_RX_HANDLER flag which,
      if set from ipvlan setup, gets us only the wanted l3mdev_l3_rcv() hook
      without any additional l3mdev semantics on top. This should also have
      minimal impact since dev->priv_flags is already hot in cache. With
      this set, l3s mode is working fine and I also get things like
      masquerading pod traffic on the ipvlan master properly working.
      
        [0] https://netdevconf.org/1.2/papers/ahern-what-is-l3mdev-paper.pdf
      
      Fixes: f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
      Fixes: 5f02ce24 ("net: l3mdev: Allow the l3mdev to be a loopback")
      Fixes: 4fbae7d8 ("ipvlan: Introduce l3s mode")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: David Ahern <dsa@cumulusnetworks.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Martynas Pumputis <m@lambda.lt>
      Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fcc9c69a
    • Y
      ipv6: sr: clear IP6CB(skb) on SRH ip4ip6 encapsulation · 2f704348
      Yohei Kanemaru 提交于
      [ Upstream commit ef489749aae508e6f17886775c075f12ff919fb1 ]
      
      skb->cb may contain data from previous layers (in an observed case
      IPv4 with L3 Master Device). In the observed scenario, the data in
      IPCB(skb)->frags was misinterpreted as IP6CB(skb)->frag_max_size,
      eventually caused an unexpected IPv6 fragmentation in ip6_fragment()
      through ip6_finish_output().
      
      This patch clears IP6CB(skb), which potentially contains garbage data,
      on the SRH ip4ip6 encapsulation.
      
      Fixes: 32d99d0b ("ipv6: sr: add support for ip4ip6 encapsulation")
      Signed-off-by: NYohei Kanemaru <yohei.kanemaru@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f704348
    • D
      ipv6: Consider sk_bound_dev_if when binding a socket to an address · 7e9a6476
      David Ahern 提交于
      [ Upstream commit c5ee066333ebc322a24a00a743ed941a0c68617e ]
      
      IPv6 does not consider if the socket is bound to a device when binding
      to an address. The result is that a socket can be bound to eth0 and then
      bound to the address of eth1. If the device is a VRF, the result is that
      a socket can only be bound to an address in the default VRF.
      
      Resolve by considering the device if sk_bound_dev_if is set.
      
      This problem exists from the beginning of git history.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e9a6476
    • A
      drm/msm/gpu: fix building without debugfs · 8877843b
      Arnd Bergmann 提交于
      commit c878a628e0c483ec36fa70f4590e4a58e34a6e49 upstream.
      
      When debugfs is disabled, but coredump is turned on, the adreno driver fails to build:
      
      drivers/gpu/drm/msm/adreno/a3xx_gpu.c:460:4: error: 'struct msm_gpu_funcs' has no member named 'show'
         .show = adreno_show,
          ^~~~
      drivers/gpu/drm/msm/adreno/a3xx_gpu.c:460:11: note: (near initialization for 'funcs.base')
      drivers/gpu/drm/msm/adreno/a3xx_gpu.c:460:11: error: initialization of 'void (*)(struct msm_gpu *, struct msm_gem_submit *, struct msm_file_private *)' from incompatible pointer type 'void (*)(struct msm_gpu *, struct msm_gpu_state *, struct drm_printer *)' [-Werror=incompatible-pointer-types]
      drivers/gpu/drm/msm/adreno/a3xx_gpu.c:460:11: note: (near initialization for 'funcs.base.submit')
      drivers/gpu/drm/msm/adreno/a4xx_gpu.c:546:4: error: 'struct msm_gpu_funcs' has no member named 'show'
      drivers/gpu/drm/msm/adreno/a5xx_gpu.c:1460:4: error: 'struct msm_gpu_funcs' has no member named 'show'
      drivers/gpu/drm/msm/adreno/a6xx_gpu.c:769:4: error: 'struct msm_gpu_funcs' has no member named 'show'
      drivers/gpu/drm/msm/msm_gpu.c: In function 'msm_gpu_devcoredump_read':
      drivers/gpu/drm/msm/msm_gpu.c:289:12: error: 'const struct msm_gpu_funcs' has no member named 'show'
      
      Adjust the #ifdef to make it build again.
      
      Fixes: c0fec7f5 ("drm/msm/gpu: Capture the GPU state on a GPU hang")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NRob Clark <robdclark@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8877843b
    • G
      Fix "net: ipv4: do not handle duplicate fragments as overlapping" · 8c763a3c
      Greg Kroah-Hartman 提交于
      ade446403bfb ("net: ipv4: do not handle duplicate fragments as
      overlapping") was backported to many stable trees, but it had a problem
      that was "accidentally" fixed by the upstream commit 0ff89efb5246 ("ip:
      fail fast on IP defrag errors")
      
      This is the fixup for that problem as we do not want the larger patch in
      the older stable trees.
      
      Fixes: ade446403bfb ("net: ipv4: do not handle duplicate fragments as overlapping")
      Reported-by: NIvan Babrou <ivan@cloudflare.com>
      Reported-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8c763a3c
  2. 31 1月, 2019 7 次提交