1. 22 5月, 2018 1 次提交
  2. 18 5月, 2018 2 次提交
  3. 08 5月, 2018 3 次提交
    • T
      cfg80211: Expose TXQ stats and parameters to userspace · 52539ca8
      Toke Høiland-Jørgensen 提交于
      This adds support for exporting the mac80211 TXQ stats via nl80211 by
      way of a nested TXQ stats attribute, as well as for configuring the
      quantum and limits that were previously only changeable through debugfs.
      
      This commit adds just the nl80211 API, a subsequent commit adds support to
      mac80211 itself.
      Signed-off-by: NToke Høiland-Jørgensen <toke@toke.dk>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      52539ca8
    • B
      cfg80211: average ack rssi support for data frames · 81d5439d
      Balaji Pothunoori 提交于
      Average ack rssi will be given to userspace via NL80211 interface
      if firmware is capable. Userspace tool ‘iw’ can process this
      information and give the output as one of the fields in
      ‘iw dev wlanX station dump’.
      
      Example output :
      
      localhost ~ #iw dev wlan-5000mhz station dump Station
      34:f3:9a:aa:3b:29 (on wlan-5000mhz)
              inactive time:  5370 ms
              rx bytes:       85321
              rx packets:     576
              tx bytes:       14225
              tx packets:     71
              tx retries:     0
              tx failed:      2
              beacon loss:    0
              rx drop misc:   0
              signal:         -54 dBm
              signal avg:     -53 dBm
              tx bitrate:     866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2
              rx bitrate:     866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2
              avg ack signal: -56 dBm
              authorized:     yes
              authenticated:  yes
              associated:     yes
              preamble:       short
              WMM/WME:        yes
              MFP:            no
              TDLS peer:      no
              DTIM period:    2
              beacon interval:100
             short preamble: yes
             short slot time:yes
             connected time: 203 seconds
      
      Main use case is to measure the signal strength of a connected station
      to AP. Data packet transmit rates and bandwidth used by station can vary
      a lot even if the station is at fixed location, especially if the rates
      used are multi stream(2stream, 3stream) rates with different bandwidth(20/40/80 Mhz).
      These multi stream rates are sensitive and station can use different transmit power
      for each of the rate and bandwidth combinations. RSSI measured from these RX packets
      on AP will be not stable and can vary a lot with in a short time.
      Whereas 802.11 ack frames from station are sent relatively at a constant
      rate (6/12/24 Mbps) with constant bandwidth(20 Mhz).
      So average rssi of the ack packets is good and more accurate.
      Signed-off-by: NBalaji Pothunoori <bpothuno@codeaurora.org>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      81d5439d
    • G
      mac80211: add api to set CSA counter in mac80211 · 03737001
      Gregory Greenman 提交于
      Sometimes the most updated CSA counter values are known only
      to the device. Add an API to pass this data to mac80211.
      Signed-off-by: NGregory Greenman <gregory.greenman@intel.com>
      Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      03737001
  4. 18 4月, 2018 19 次提交
  5. 17 4月, 2018 11 次提交
    • J
      xdp: transition into using xdp_frame for return API · 03993094
      Jesper Dangaard Brouer 提交于
      Changing API xdp_return_frame() to take struct xdp_frame as argument,
      seems like a natural choice. But there are some subtle performance
      details here that needs extra care, which is a deliberate choice.
      
      When de-referencing xdp_frame on a remote CPU during DMA-TX
      completion, result in the cache-line is change to "Shared"
      state. Later when the page is reused for RX, then this xdp_frame
      cache-line is written, which change the state to "Modified".
      
      This situation already happens (naturally) for, virtio_net, tun and
      cpumap as the xdp_frame pointer is the queued object.  In tun and
      cpumap, the ptr_ring is used for efficiently transferring cache-lines
      (with pointers) between CPUs. Thus, the only option is to
      de-referencing xdp_frame.
      
      It is only the ixgbe driver that had an optimization, in which it can
      avoid doing the de-reference of xdp_frame.  The driver already have
      TX-ring queue, which (in case of remote DMA-TX completion) have to be
      transferred between CPUs anyhow.  In this data area, we stored a
      struct xdp_mem_info and a data pointer, which allowed us to avoid
      de-referencing xdp_frame.
      
      To compensate for this, a prefetchw is used for telling the cache
      coherency protocol about our access pattern.  My benchmarks show that
      this prefetchw is enough to compensate the ixgbe driver.
      
      V7: Adjust for commit d9314c47 ("i40e: add support for XDP_REDIRECT")
      V8: Adjust for commit bd658dda ("net/mlx5e: Separate dma base address
      and offset in dma_sync call")
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      03993094
    • J
      xdp: allow page_pool as an allocator type in xdp_return_frame · 57d0a1c1
      Jesper Dangaard Brouer 提交于
      New allocator type MEM_TYPE_PAGE_POOL for page_pool usage.
      
      The registered allocator page_pool pointer is not available directly
      from xdp_rxq_info, but it could be (if needed).  For now, the driver
      should keep separate track of the page_pool pointer, which it should
      use for RX-ring page allocation.
      
      As suggested by Saeed, to maintain a symmetric API it is the drivers
      responsibility to allocate/create and free/destroy the page_pool.
      Thus, after the driver have called xdp_rxq_info_unreg(), it is drivers
      responsibility to free the page_pool, but with a RCU free call.  This
      is done easily via the page_pool helper page_pool_destroy() (which
      avoids touching any driver code during the RCU callback, which could
      happen after the driver have been unloaded).
      
      V8: address issues found by kbuild test robot
       - Address sparse should be static warnings
       - Allow xdp.o to be compiled without page_pool.o
      
      V9: Remove inline from .c file, compiler knows best
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      57d0a1c1
    • J
      page_pool: refurbish version of page_pool code · ff7d6b27
      Jesper Dangaard Brouer 提交于
      Need a fast page recycle mechanism for ndo_xdp_xmit API for returning
      pages on DMA-TX completion time, which have good cross CPU
      performance, given DMA-TX completion time can happen on a remote CPU.
      
      Refurbish my page_pool code, that was presented[1] at MM-summit 2016.
      Adapted page_pool code to not depend the page allocator and
      integration into struct page.  The DMA mapping feature is kept,
      even-though it will not be activated/used in this patchset.
      
      [1] http://people.netfilter.org/hawk/presentations/MM-summit2016/generic_page_pool_mm_summit2016.pdf
      
      V2: Adjustments requested by Tariq
       - Changed page_pool_create return codes, don't return NULL, only
         ERR_PTR, as this simplifies err handling in drivers.
      
      V4: many small improvements and cleanups
      - Add DOC comment section, that can be used by kernel-doc
      - Improve fallback mode, to work better with refcnt based recycling
        e.g. remove a WARN as pointed out by Tariq
        e.g. quicker fallback if ptr_ring is empty.
      
      V5: Fixed SPDX license as pointed out by Alexei
      
      V6: Adjustments requested by Eric Dumazet
       - Adjust ____cacheline_aligned_in_smp usage/placement
       - Move rcu_head in struct page_pool
       - Free pages quicker on destroy, minimize resources delayed an RCU period
       - Remove code for forward/backward compat ABI interface
      
      V8: Issues found by kbuild test robot
       - Address sparse should be static warnings
       - Only compile+link when a driver use/select page_pool,
         mlx5 selects CONFIG_PAGE_POOL, although its first used in two patches
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ff7d6b27
    • J
      xdp: rhashtable with allocator ID to pointer mapping · 8d5d8852
      Jesper Dangaard Brouer 提交于
      Use the IDA infrastructure for getting a cyclic increasing ID number,
      that is used for keeping track of each registered allocator per
      RX-queue xdp_rxq_info.  Instead of using the IDR infrastructure, which
      uses a radix tree, use a dynamic rhashtable, for creating ID to
      pointer lookup table, because this is faster.
      
      The problem that is being solved here is that, the xdp_rxq_info
      pointer (stored in xdp_buff) cannot be used directly, as the
      guaranteed lifetime is too short.  The info is needed on a
      (potentially) remote CPU during DMA-TX completion time . In an
      xdp_frame the xdp_mem_info is stored, when it got converted from an
      xdp_buff, which is sufficient for the simple page refcnt based recycle
      schemes.
      
      For more advanced allocators there is a need to store a pointer to the
      registered allocator.  Thus, there is a need to guard the lifetime or
      validity of the allocator pointer, which is done through this
      rhashtable ID map to pointer. The removal and validity of of the
      allocator and helper struct xdp_mem_allocator is guarded by RCU.  The
      allocator will be created by the driver, and registered with
      xdp_rxq_info_reg_mem_model().
      
      It is up-to debate who is responsible for freeing the allocator
      pointer or invoking the allocator destructor function.  In any case,
      this must happen via RCU freeing.
      
      Use the IDA infrastructure for getting a cyclic increasing ID number,
      that is used for keeping track of each registered allocator per
      RX-queue xdp_rxq_info.
      
      V4: Per req of Jason Wang
      - Use xdp_rxq_info_reg_mem_model() in all drivers implementing
        XDP_REDIRECT, even-though it's not strictly necessary when
        allocator==NULL for type MEM_TYPE_PAGE_SHARED (given it's zero).
      
      V6: Per req of Alex Duyck
      - Introduce rhashtable_lookup() call in later patch
      
      V8: Address sparse should be static warnings (from kbuild test robot)
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d5d8852
    • J
      bpf: cpumap convert to use generic xdp_frame · 70280ed9
      Jesper Dangaard Brouer 提交于
      The generic xdp_frame format, was inspired by the cpumap own internal
      xdp_pkt format.  It is now time to convert it over to the generic
      xdp_frame format.  The cpumap needs one extra field dev_rx.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      70280ed9
    • J
      xdp: introduce a new xdp_frame type · c0048cff
      Jesper Dangaard Brouer 提交于
      This is needed to convert drivers tuntap and virtio_net.
      
      This is a generalization of what is done inside cpumap, which will be
      converted later.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c0048cff
    • J
      xdp: move struct xdp_buff from filter.h to xdp.h · 106ca27f
      Jesper Dangaard Brouer 提交于
      This is done to prepare for the next patch, and it is also
      nice to move this XDP related struct out of filter.h.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      106ca27f
    • J
      xdp: introduce xdp_return_frame API and use in cpumap · 5ab073ff
      Jesper Dangaard Brouer 提交于
      Introduce an xdp_return_frame API, and convert over cpumap as
      the first user, given it have queued XDP frame structure to leverage.
      
      V3: Cleanup and remove C99 style comments, pointed out by Alex Duyck.
      V6: Remove comment that id will be added later (Req by Alex Duyck)
      V8: Rename enum mem_type to xdp_mem_type (found by kbuild test robot)
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ab073ff
    • E
      tcp: implement mmap() for zero copy receive · 93ab6cc6
      Eric Dumazet 提交于
      Some networks can make sure TCP payload can exactly fit 4KB pages,
      with well chosen MSS/MTU and architectures.
      
      Implement mmap() system call so that applications can avoid
      copying data without complex splice() games.
      
      Note that a successful mmap( X bytes) on TCP socket is consuming
      bytes, as if recvmsg() has been done. (tp->copied += X)
      
      Only PROT_READ mappings are accepted, as skb page frags
      are fundamentally shared and read only.
      
      If tcp_mmap() finds data that is not a full page, or a patch of
      urgent data, -EINVAL is returned, no bytes are consumed.
      
      Application must fallback to recvmsg() to read the problematic sequence.
      
      mmap() wont block,  regardless of socket being in blocking or
      non-blocking mode. If not enough bytes are in receive queue,
      mmap() would return -EAGAIN, or -EIO if socket is in a state
      where no other bytes can be added into receive queue.
      
      An application might use SO_RCVLOWAT, poll() and/or ioctl( FIONREAD)
      to efficiently use mmap()
      
      On the sender side, MSG_EOR might help to clearly separate unaligned
      headers and 4K-aligned chunks if necessary.
      
      Tested:
      
      mlx4 (cx-3) 40Gbit NIC, with tcp_mmap program provided in following patch.
      MTU set to 4168  (4096 TCP payload, 40 bytes IPv6 header, 32 bytes TCP header)
      
      Without mmap() (tcp_mmap -s)
      
      received 32768 MB (0 % mmap'ed) in 8.13342 s, 33.7961 Gbit,
        cpu usage user:0.034 sys:3.778, 116.333 usec per MB, 63062 c-switches
      received 32768 MB (0 % mmap'ed) in 8.14501 s, 33.748 Gbit,
        cpu usage user:0.029 sys:3.997, 122.864 usec per MB, 61903 c-switches
      received 32768 MB (0 % mmap'ed) in 8.11723 s, 33.8635 Gbit,
        cpu usage user:0.048 sys:3.964, 122.437 usec per MB, 62983 c-switches
      received 32768 MB (0 % mmap'ed) in 8.39189 s, 32.7552 Gbit,
        cpu usage user:0.038 sys:4.181, 128.754 usec per MB, 55834 c-switches
      
      With mmap() on receiver (tcp_mmap -s -z)
      
      received 32768 MB (100 % mmap'ed) in 8.03083 s, 34.2278 Gbit,
        cpu usage user:0.024 sys:1.466, 45.4712 usec per MB, 65479 c-switches
      received 32768 MB (100 % mmap'ed) in 7.98805 s, 34.4111 Gbit,
        cpu usage user:0.026 sys:1.401, 43.5486 usec per MB, 65447 c-switches
      received 32768 MB (100 % mmap'ed) in 7.98377 s, 34.4296 Gbit,
        cpu usage user:0.028 sys:1.452, 45.166 usec per MB, 65496 c-switches
      received 32768 MB (99.9969 % mmap'ed) in 8.01838 s, 34.281 Gbit,
        cpu usage user:0.02 sys:1.446, 44.7388 usec per MB, 65505 c-switches
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93ab6cc6
    • E
      tcp: avoid extra wakeups for SO_RCVLOWAT users · 03f45c88
      Eric Dumazet 提交于
      SO_RCVLOWAT is properly handled in tcp_poll(), so that POLLIN is only
      generated when enough bytes are available in receive queue, after
      David change (commit c7004482 "tcp: Respect SO_RCVLOWAT in tcp_poll().")
      
      But TCP still calls sk->sk_data_ready() for each chunk added in receive
      queue, meaning thread is awaken, and goes back to sleep shortly after.
      
      Tested:
      
      tcp_mmap test program, receiving 32768 MB of data with SO_RCVLOWAT set to 512KB
      
      -> Should get ~2 wakeups (c-switches) per MB, regardless of how many
      (tiny or big) packets were received.
      
      High speed (mostly full size GRO packets)
      
      received 32768 MB (100 % mmap'ed) in 8.03112 s, 34.2266 Gbit,
        cpu usage user:0.037 sys:1.404, 43.9758 usec per MB, 65497 c-switches
      
      received 32768 MB (99.9954 % mmap'ed) in 7.98453 s, 34.4263 Gbit,
        cpu usage user:0.03 sys:1.422, 44.3115 usec per MB, 65485 c-switches
      
      Low speed (sender is ratelimited and sends 1-MSS at a time, so GRO is not helping)
      
      received 22474.5 MB (100 % mmap'ed) in 6015.35 s, 0.0313414 Gbit,
        cpu usage user:0.05 sys:1.586, 72.7952 usec per MB, 44950 c-switches
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      03f45c88
    • E
      tcp: fix SO_RCVLOWAT and RCVBUF autotuning · d1361840
      Eric Dumazet 提交于
      Applications might use SO_RCVLOWAT on TCP socket hoping to receive
      one [E]POLLIN event only when a given amount of bytes are ready in socket
      receive queue.
      
      Problem is that receive autotuning is not aware of this constraint,
      meaning sk_rcvbuf might be too small to allow all bytes to be stored.
      
      Add a new (struct proto_ops)->set_rcvlowat method so that a protocol
      can override the default setsockopt(SO_RCVLOWAT) behavior.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d1361840
  6. 11 4月, 2018 1 次提交
    • T
      slip: Check if rstate is initialized before uncompressing · 3f01ddb9
      Tejaswi Tanikella 提交于
      On receiving a packet the state index points to the rstate which must be
      used to fill up IP and TCP headers. But if the state index points to a
      rstate which is unitialized, i.e. filled with zeros, it gets stuck in an
      infinite loop inside ip_fast_csum trying to compute the ip checsum of a
      header with zero length.
      
      89.666953:   <2> [<ffffff9dd3e94d38>] slhc_uncompress+0x464/0x468
      89.666965:   <2> [<ffffff9dd3e87d88>] ppp_receive_nonmp_frame+0x3b4/0x65c
      89.666978:   <2> [<ffffff9dd3e89dd4>] ppp_receive_frame+0x64/0x7e0
      89.666991:   <2> [<ffffff9dd3e8a708>] ppp_input+0x104/0x198
      89.667005:   <2> [<ffffff9dd3e93868>] pppopns_recv_core+0x238/0x370
      89.667027:   <2> [<ffffff9dd4428fc8>] __sk_receive_skb+0xdc/0x250
      89.667040:   <2> [<ffffff9dd3e939e4>] pppopns_recv+0x44/0x60
      89.667053:   <2> [<ffffff9dd4426848>] __sock_queue_rcv_skb+0x16c/0x24c
      89.667065:   <2> [<ffffff9dd4426954>] sock_queue_rcv_skb+0x2c/0x38
      89.667085:   <2> [<ffffff9dd44f7358>] raw_rcv+0x124/0x154
      89.667098:   <2> [<ffffff9dd44f7568>] raw_local_deliver+0x1e0/0x22c
      89.667117:   <2> [<ffffff9dd44c8ba0>] ip_local_deliver_finish+0x70/0x24c
      89.667131:   <2> [<ffffff9dd44c92f4>] ip_local_deliver+0x100/0x10c
      
      ./scripts/faddr2line vmlinux slhc_uncompress+0x464/0x468 output:
       ip_fast_csum at arch/arm64/include/asm/checksum.h:40
       (inlined by) slhc_uncompress at drivers/net/slip/slhc.c:615
      
      Adding a variable to indicate if the current rstate is initialized. If
      such a packet arrives, move to toss state.
      Signed-off-by: NTejaswi Tanikella <tejaswit@codeaurora.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3f01ddb9
  7. 09 4月, 2018 1 次提交
    • J
      devlink: convert occ_get op to separate registration · fc56be47
      Jiri Pirko 提交于
      This resolves race during initialization where the resources with
      ops are registered before driver and the structures used by occ_get
      op is initialized. So keep occ_get callbacks registered only when
      all structs are initialized.
      
      The example flows, as it is in mlxsw:
      1) driver load/asic probe:
         mlxsw_core
            -> mlxsw_sp_resources_register
              -> mlxsw_sp_kvdl_resources_register
                -> devlink_resource_register IDX
         mlxsw_spectrum
            -> mlxsw_sp_kvdl_init
              -> mlxsw_sp_kvdl_parts_init
                -> mlxsw_sp_kvdl_part_init
                  -> devlink_resource_size_get IDX (to get the current setup
                                                    size from devlink)
              -> devlink_resource_occ_get_register IDX (register current
                                                        occupancy getter)
      2) reload triggered by devlink command:
        -> mlxsw_devlink_core_bus_device_reload
          -> mlxsw_sp_fini
            -> mlxsw_sp_kvdl_fini
      	-> devlink_resource_occ_get_unregister IDX
          (struct mlxsw_sp *mlxsw_sp is freed at this point, call to occ get
           which is using mlxsw_sp would cause use-after free)
          -> mlxsw_sp_init
            -> mlxsw_sp_kvdl_init
              -> mlxsw_sp_kvdl_parts_init
                -> mlxsw_sp_kvdl_part_init
                  -> devlink_resource_size_get IDX (to get the current setup
                                                    size from devlink)
              -> devlink_resource_occ_get_register IDX (register current
                                                        occupancy getter)
      
      Fixes: d9f9b9a4 ("devlink: Add support for resource abstraction")
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc56be47
  8. 08 4月, 2018 2 次提交
    • E
      soreuseport: initialise timewait reuseport field · 3099a529
      Eric Dumazet 提交于
      syzbot reported an uninit-value in inet_csk_bind_conflict() [1]
      
      It turns out we never propagated sk->sk_reuseport into timewait socket.
      
      [1]
      BUG: KMSAN: uninit-value in inet_csk_bind_conflict+0x5f9/0x990 net/ipv4/inet_connection_sock.c:151
      CPU: 1 PID: 3589 Comm: syzkaller008242 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       inet_csk_bind_conflict+0x5f9/0x990 net/ipv4/inet_connection_sock.c:151
       inet_csk_get_port+0x1d28/0x1e40 net/ipv4/inet_connection_sock.c:320
       inet6_bind+0x121c/0x1820 net/ipv6/af_inet6.c:399
       SYSC_bind+0x3f2/0x4b0 net/socket.c:1474
       SyS_bind+0x54/0x80 net/socket.c:1460
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x4416e9
      RSP: 002b:00007ffce6d15c88 EFLAGS: 00000217 ORIG_RAX: 0000000000000031
      RAX: ffffffffffffffda RBX: 0100000000000000 RCX: 00000000004416e9
      RDX: 000000000000001c RSI: 0000000020402000 RDI: 0000000000000004
      RBP: 0000000000000000 R08: 00000000e6d15e08 R09: 00000000e6d15e08
      R10: 0000000000000004 R11: 0000000000000217 R12: 0000000000009478
      R13: 00000000006cd448 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
       __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
       tcp_time_wait+0xf17/0xf50 net/ipv4/tcp_minisocks.c:283
       tcp_rcv_state_process+0xebe/0x6490 net/ipv4/tcp_input.c:6003
       tcp_v6_do_rcv+0x11dd/0x1d90 net/ipv6/tcp_ipv6.c:1331
       sk_backlog_rcv include/net/sock.h:908 [inline]
       __release_sock+0x2d6/0x680 net/core/sock.c:2271
       release_sock+0x97/0x2a0 net/core/sock.c:2786
       tcp_close+0x277/0x18f0 net/ipv4/tcp.c:2269
       inet_release+0x240/0x2a0 net/ipv4/af_inet.c:427
       inet6_release+0xaf/0x100 net/ipv6/af_inet6.c:435
       sock_release net/socket.c:595 [inline]
       sock_close+0xe0/0x300 net/socket.c:1149
       __fput+0x49e/0xa10 fs/file_table.c:209
       ____fput+0x37/0x40 fs/file_table.c:243
       task_work_run+0x243/0x2c0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x10e1/0x38d0 kernel/exit.c:867
       do_group_exit+0x1a0/0x360 kernel/exit.c:970
       SYSC_exit_group+0x21/0x30 kernel/exit.c:981
       SyS_exit_group+0x25/0x30 kernel/exit.c:979
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
       __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
       inet_twsk_alloc+0xaef/0xc00 net/ipv4/inet_timewait_sock.c:182
       tcp_time_wait+0xd9/0xf50 net/ipv4/tcp_minisocks.c:258
       tcp_rcv_state_process+0xebe/0x6490 net/ipv4/tcp_input.c:6003
       tcp_v6_do_rcv+0x11dd/0x1d90 net/ipv6/tcp_ipv6.c:1331
       sk_backlog_rcv include/net/sock.h:908 [inline]
       __release_sock+0x2d6/0x680 net/core/sock.c:2271
       release_sock+0x97/0x2a0 net/core/sock.c:2786
       tcp_close+0x277/0x18f0 net/ipv4/tcp.c:2269
       inet_release+0x240/0x2a0 net/ipv4/af_inet.c:427
       inet6_release+0xaf/0x100 net/ipv6/af_inet6.c:435
       sock_release net/socket.c:595 [inline]
       sock_close+0xe0/0x300 net/socket.c:1149
       __fput+0x49e/0xa10 fs/file_table.c:209
       ____fput+0x37/0x40 fs/file_table.c:243
       task_work_run+0x243/0x2c0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x10e1/0x38d0 kernel/exit.c:867
       do_group_exit+0x1a0/0x360 kernel/exit.c:970
       SYSC_exit_group+0x21/0x30 kernel/exit.c:981
       SyS_exit_group+0x25/0x30 kernel/exit.c:979
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
       kmem_cache_alloc+0xaab/0xb90 mm/slub.c:2756
       inet_twsk_alloc+0x13b/0xc00 net/ipv4/inet_timewait_sock.c:163
       tcp_time_wait+0xd9/0xf50 net/ipv4/tcp_minisocks.c:258
       tcp_rcv_state_process+0xebe/0x6490 net/ipv4/tcp_input.c:6003
       tcp_v6_do_rcv+0x11dd/0x1d90 net/ipv6/tcp_ipv6.c:1331
       sk_backlog_rcv include/net/sock.h:908 [inline]
       __release_sock+0x2d6/0x680 net/core/sock.c:2271
       release_sock+0x97/0x2a0 net/core/sock.c:2786
       tcp_close+0x277/0x18f0 net/ipv4/tcp.c:2269
       inet_release+0x240/0x2a0 net/ipv4/af_inet.c:427
       inet6_release+0xaf/0x100 net/ipv6/af_inet6.c:435
       sock_release net/socket.c:595 [inline]
       sock_close+0xe0/0x300 net/socket.c:1149
       __fput+0x49e/0xa10 fs/file_table.c:209
       ____fput+0x37/0x40 fs/file_table.c:243
       task_work_run+0x243/0x2c0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x10e1/0x38d0 kernel/exit.c:867
       do_group_exit+0x1a0/0x360 kernel/exit.c:970
       SYSC_exit_group+0x21/0x30 kernel/exit.c:981
       SyS_exit_group+0x25/0x30 kernel/exit.c:979
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: da5e3630 ("soreuseport: TCP/IPv4 implementation")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3099a529
    • E
      net: fix rtnh_ok() · b1993a2d
      Eric Dumazet 提交于
      syzbot reported :
      
      BUG: KMSAN: uninit-value in rtnh_ok include/net/nexthop.h:11 [inline]
      BUG: KMSAN: uninit-value in fib_count_nexthops net/ipv4/fib_semantics.c:469 [inline]
      BUG: KMSAN: uninit-value in fib_create_info+0x554/0x8d20 net/ipv4/fib_semantics.c:1091
      
      @remaining is an integer, coming from user space.
      If it is negative we want rtnh_ok() to return false.
      
      Fixes: 4e902c57 ("[IPv4]: FIB configuration using struct fib_config")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b1993a2d