1. 14 Aug, 2018 22 commits
  2. 13 Aug, 2018 8 commits
    • H
      r8169: don't use MSI-X on RTL8168g · 7c53a722
      Heiner Kallweit committed
      There have been two reports that network doesn't come back on resume
      from suspend when using MSI-X. Both cases affect the same chip version
      (RTL8168g - version 40), on different systems. Falling back to MSI
      fixes the issue.
      Even though we don't really have a proof yet that the network chip
      version is to blame, let's disable MSI-X for this version.
      Reported-by: Steve Dodd <steved424@gmail.com>
      Reported-by: Lou Reed <gogen@disroot.org>
      Tested-by: Steve Dodd <steved424@gmail.com>
      Tested-by: Lou Reed <gogen@disroot.org>
      Fixes: 6c6aa15f ("r8169: improve interrupt handling")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7c53a722
    • D
      Merge branch 'nixge-Minor-cleanups' · 9ebcc22c
      David S. Miller committed
      Moritz Fischer says:
      
      ====================
      net: nixge: Minor cleanups
      
      In preparation for my 64-bit support series, here's some
      minor cleanup that gets rid of unnecessary
      accesses to the descriptor application fields.
      
      I've confirmed that the hardware does not access the fields
      in all our configurations.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ebcc22c
    • M
      net: nixge: Don't store skb in app4 field of descriptor · fd5cf434
      Moritz Fischer committed
      Don't store skb in app4 field of descriptor since it is
      not being used anywhere (including hardware).
      Signed-off-by: Moritz Fischer <mdf@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fd5cf434
    • M
      net: nixge: Do not zero application specific fields in desc · e158770e
      Moritz Fischer committed
      Do not zero application specific fields in DMA descriptors.
      The hardware does ignore them, so should software.
      Signed-off-by: Moritz Fischer <mdf@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e158770e
    • W
      l2tp: use sk_dst_check() to avoid race on sk->sk_dst_cache · 6d37fa49
      Wei Wang committed
      In l2tp code, if it is a L2TP_UDP_ENCAP tunnel, tunnel->sk points to a
      UDP socket. A user could call sendmsg() on both this tunnel and the UDP
      socket itself concurrently. As l2tp_xmit_skb() holds the socket lock and
      calls __sk_dst_check() to refresh sk->sk_dst_cache, while udpv6_sendmsg()
      is lockless and calls sk_dst_check() to refresh sk->sk_dst_cache, there
      could be a race that causes the dst cache to be freed multiple times.
      So we fix the l2tp side code to always call sk_dst_check() to guarantee
      xchg() is called when refreshing sk->sk_dst_cache to avoid race
      conditions.
      
      Syzkaller reported stack trace:
      BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
      BUG: KASAN: use-after-free in atomic_fetch_add_unless include/linux/atomic.h:575 [inline]
      BUG: KASAN: use-after-free in atomic_add_unless include/linux/atomic.h:597 [inline]
      BUG: KASAN: use-after-free in dst_hold_safe include/net/dst.h:308 [inline]
      BUG: KASAN: use-after-free in ip6_hold_safe+0xe6/0x670 net/ipv6/route.c:1029
      Read of size 4 at addr ffff8801aea9a880 by task syz-executor129/4829
      
      CPU: 0 PID: 4829 Comm: syz-executor129 Not tainted 4.18.0-rc7-next-20180802+ #30
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
       print_address_description+0x6c/0x20b mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.7+0x242/0x30d mm/kasan/report.c:412
       check_memory_region_inline mm/kasan/kasan.c:260 [inline]
       check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
       kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
       atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
       atomic_fetch_add_unless include/linux/atomic.h:575 [inline]
       atomic_add_unless include/linux/atomic.h:597 [inline]
       dst_hold_safe include/net/dst.h:308 [inline]
       ip6_hold_safe+0xe6/0x670 net/ipv6/route.c:1029
       rt6_get_pcpu_route net/ipv6/route.c:1249 [inline]
       ip6_pol_route+0x354/0xd20 net/ipv6/route.c:1922
       ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2098
       fib6_rule_lookup+0x283/0x890 net/ipv6/fib6_rules.c:122
       ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2126
       ip6_dst_lookup_tail+0x1278/0x1da0 net/ipv6/ip6_output.c:978
       ip6_dst_lookup_flow+0xc8/0x270 net/ipv6/ip6_output.c:1079
       ip6_sk_dst_lookup_flow+0x5ed/0xc50 net/ipv6/ip6_output.c:1117
       udpv6_sendmsg+0x2163/0x36b0 net/ipv6/udp.c:1354
       inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:632
       ___sys_sendmsg+0x51d/0x930 net/socket.c:2115
       __sys_sendmmsg+0x240/0x6f0 net/socket.c:2210
       __do_sys_sendmmsg net/socket.c:2239 [inline]
       __se_sys_sendmmsg net/socket.c:2236 [inline]
       __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2236
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x446a29
      Code: e8 ac b8 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f4de5532db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 00000000006dcc38 RCX: 0000000000446a29
      RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003
      RBP: 00000000006dcc30 R08: 00007f4de5533700 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dcc3c
      R13: 00007ffe2b830fdf R14: 00007f4de55339c0 R15: 0000000000000001
      
      Fixes: 71b1391a ("l2tp: ensure sk->dst is still valid")
      Reported-by: syzbot+05f840f3b04f211bad55@syzkaller.appspotmail.com
      Signed-off-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Guillaume Nault <g.nault@alphalink.fr>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6d37fa49
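The fix above depends on the cached pointer being detached with an atomic exchange, so that a stale entry is released by exactly one caller. The following is a minimal userspace sketch of that idea using C11 atomics. It is not the kernel's `sk_dst_check()`; `struct route`, `cache`, `route_put()` and `cache_check()` are hypothetical names chosen for illustration, and the reference count is deliberately simplified.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached route entry. */
struct route {
    int valid;  /* 0 once the entry is stale */
    int refs;   /* simplified, non-atomic reference count */
};

/* Shared cache slot, analogous in spirit to sk->sk_dst_cache. */
static _Atomic(struct route *) cache;

/* Drop one reference (a real implementation would free at zero). */
static void route_put(struct route *r)
{
    r->refs--;
}

/*
 * Return the cached route if it is still valid. If it is stale, detach the
 * slot contents with an atomic exchange and release whatever pointer was
 * swapped out. A second caller racing on the same stale entry gets NULL
 * back from the exchange and therefore releases nothing.
 */
static struct route *cache_check(void)
{
    struct route *r = atomic_load(&cache);

    if (r == NULL || r->valid)
        return r;

    struct route *old = atomic_exchange(&cache, NULL);
    if (old != NULL)
        route_put(old);
    return NULL;
}
```

Calling `cache_check()` twice on the same stale entry drops its reference only once, which is the property the commit message is after when it insists that xchg() is always used for the refresh.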
    • V
      ipv6: Add icmp_echo_ignore_all support for ICMPv6 · e6f86b0f
      Virgile Jarry committed
      Preventing the kernel from responding to ICMP Echo Request messages
      can be useful in several ways. The sysctl parameter
      'icmp_echo_ignore_all' can be used to prevent the kernel from
      responding to IPv4 ICMP echo requests. For IPv6 pings, such
      a sysctl kernel parameter did not exist.
      
      Add the ability to prevent the kernel from responding to IPv6
      ICMP echo requests through the use of the following sysctl
      parameter: /proc/sys/net/ipv6/icmp/echo_ignore_all.
      Update the documentation to reflect this change.
      Signed-off-by: Virgile Jarry <virgile@acceis.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6f86b0f
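As an illustration of how the new parameter could be inspected or toggled from a program, here is a small sketch that reads and writes an integer sysctl file. Only the path comes from the commit message; the helper names `read_sysctl_int()` and `write_sysctl_int()` are made up for this example, and writing under /proc/sys normally requires root.

```c
#include <stdio.h>

/* Path taken from the commit message above. */
#define ECHO_IGNORE_ALL_PATH "/proc/sys/net/ipv6/icmp/echo_ignore_all"

/*
 * Read an integer value from the file at `path`.
 * Returns the value, or -1 if the file is missing or unreadable
 * (for example on a kernel that lacks this patch).
 */
static int read_sysctl_int(const char *path)
{
    FILE *f = fopen(path, "r");
    int value;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/*
 * Write an integer value to the file at `path`.
 * Returns 0 on success, -1 on failure.
 */
static int write_sysctl_int(const char *path, int value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, `write_sysctl_int(ECHO_IGNORE_ALL_PATH, 1)` would ask the kernel to stop answering IPv6 echo requests, and `read_sysctl_int(ECHO_IGNORE_ALL_PATH)` would report the current setting.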
    • D
      Merge branch 'net-tls-Combined-memory-allocation-for-decryption-request' · 8f780044
      David S. Miller committed
      Vakul Garg says:
      
      ====================
      net/tls: Combined memory allocation for decryption request
      
      This patch does a combined memory allocation from heap for scatterlists,
      aead_request, aad and iv for the tls record decryption path. In present
      code, aead_request is allocated from heap, scatterlists on a conditional
      basis are allocated on heap or on stack. This is inefficient as it may
      require multiple kmalloc/kfree calls.
      
      The initialization vector passed in the crypto request is allocated on
      stack. This is a problem since the stack memory is not dma-able from
      crypto accelerators.
      
      Doing one combined memory allocation for each decryption request fixes
      both the above issues. It also paves a way to be able to submit multiple
      async decryption requests while the previous one is pending i.e. being
      processed or queued.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8f780044
    • V
      net/tls: Combined memory allocation for decryption request · 0b243d00
      Vakul Garg committed
      For preparing decryption request, several memory chunks are required
      (aead_req, sgin, sgout, iv, aad). For submitting the decrypt request to
      an accelerator, it is required that the buffers which are read by the
      accelerator must be dma-able and not come from stack. The buffers for
      aad and iv can be separately kmalloced each, but it is inefficient.
      This patch does a combined allocation for preparing decryption request
      and then segments into aead_req || sgin || sgout || iv || aad.
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0b243d00
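The layout "aead_req || sgin || sgout || iv || aad" described above can be sketched in plain userspace C as one heap allocation carved into aligned regions. This is only an illustration of the technique: `struct dec_req`, `dec_req_alloc()` and `dec_req_free()` are hypothetical names, the regions are plain byte buffers rather than kernel types, and `malloc()` stands in for `kmalloc()`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical view onto one combined allocation; not the kernel's types. */
struct dec_req {
    void *base;           /* start of the single allocation; pass to free() */
    unsigned char *req;   /* request context (aead_req in the commit) */
    unsigned char *sgin;  /* input scatterlist area */
    unsigned char *sgout; /* output scatterlist area */
    unsigned char *iv;    /* initialization vector */
    unsigned char *aad;   /* additional authenticated data */
};

/* Round n up to the strictest fundamental alignment so each region is aligned. */
static size_t align_up(size_t n)
{
    size_t a = _Alignof(max_align_t);

    return (n + a - 1) / a * a;
}

/*
 * Allocate req || sgin || sgout || iv || aad with one malloc() and carve the
 * buffer into regions. Returns 0 on success, -1 on allocation failure.
 * Size overflow checks are omitted to keep the sketch short.
 */
static int dec_req_alloc(struct dec_req *d, size_t req_len, size_t sgin_len,
                         size_t sgout_len, size_t iv_len, size_t aad_len)
{
    size_t off_sgin = align_up(req_len);
    size_t off_sgout = off_sgin + align_up(sgin_len);
    size_t off_iv = off_sgout + align_up(sgout_len);
    size_t off_aad = off_iv + align_up(iv_len);
    size_t total = off_aad + aad_len;
    unsigned char *mem = malloc(total > 0 ? total : 1);

    if (mem == NULL)
        return -1;

    d->base = mem;
    d->req = mem;
    d->sgin = mem + off_sgin;
    d->sgout = mem + off_sgout;
    d->iv = mem + off_iv;
    d->aad = mem + off_aad;
    return 0;
}

/* One free() releases every region at once. */
static void dec_req_free(struct dec_req *d)
{
    free(d->base);
    d->base = NULL;
}
```

Because the iv and aad regions live in the same heap buffer as the request, none of them sits on the stack, and setup and teardown cost one allocation and one free per request.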
  3. 12 Aug, 2018 10 commits
    • D
      Merge branch 'ip-faster-in-order-IP-fragments' · 78cbac64
      David S. Miller committed
      Peter Oskolkov says:
      
      ====================
      ip: faster in-order IP fragments
      
      Added "Signed-off-by" in v2.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      78cbac64
    • P
      ip: process in-order fragments efficiently · a4fd284a
      Peter Oskolkov committed
      This patch changes the runtime behavior of IP defrag queue:
      incoming in-order fragments are added to the end of the current
      list/"run" of in-order fragments at the tail.
      
      On some workloads, UDP stream performance is substantially improved:
      
      RX: ./udp_stream -F 10 -T 2 -l 60
      TX: ./udp_stream -c -H <host> -F 10 -T 5 -l 60
      
      with this patchset applied on a 10Gbps receiver:
      
        throughput=9524.18
        throughput_units=Mbit/s
      
      upstream (net-next):
      
        throughput=4608.93
        throughput_units=Mbit/s
      Reported-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Peter Oskolkov <posk@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a4fd284a
    • P
      ip: add helpers to process in-order fragments faster. · 353c9cb3
      Peter Oskolkov committed
      This patch introduces several helper functions/macros that will be
      used in the follow-up patch. No runtime changes yet.
      
      The new logic (fully implemented in the second patch) is as follows:
      
      * Nodes in the rb-tree will now contain not single fragments, but lists
        of consecutive fragments ("runs").
      
      * At each point in time, the current "active" run at the tail is
        maintained/tracked. Fragments that arrive in-order, adjacent
        to the previous tail fragment, are added to this tail run without
        triggering the re-balancing of the rb-tree.
      
      * If a fragment arrives out of order with the offset _before_ the tail run,
        it is inserted into the rb-tree as a single fragment.
      
      * If a fragment arrives after the current tail fragment (with a gap),
        it starts a new "tail" run, and is inserted into the rb-tree
        at the end as the head of the new run.
      
      skb->cb is used to store additional information
      needed here (suggested by Eric Dumazet).
      Reported-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Peter Oskolkov <posk@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      353c9cb3
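The placement rules listed in this commit message can be sketched as a small decision function. This is a simplified model, not the kernel code: it tracks only the end offset of the tail run, leaves out the rb-tree itself, and does not detect overlapping fragments. `struct tail_state`, `frag_classify()` and `frag_queue()` are hypothetical names.

```c
/* What to do with an incoming fragment, following the rules listed above. */
enum frag_action {
    FRAG_APPEND_TO_TAIL_RUN, /* adjacent to the tail: extend the run, no tree insert */
    FRAG_START_NEW_TAIL_RUN, /* past the tail with a gap, or first fragment: new run */
    FRAG_INSERT_SINGLE       /* before the tail: insert into the tree on its own */
};

/* Hypothetical summary of the tail state of one defrag queue. */
struct tail_state {
    int has_tail;     /* 0 while no fragment has been queued */
    unsigned int end; /* offset just past the last byte of the tail run */
};

/* Decide where a fragment starting at `offset` belongs. */
static enum frag_action frag_classify(const struct tail_state *t,
                                      unsigned int offset)
{
    if (!t->has_tail || offset > t->end)
        return FRAG_START_NEW_TAIL_RUN;
    if (offset == t->end)
        return FRAG_APPEND_TO_TAIL_RUN;
    /* Overlap detection is omitted; anything earlier is treated as out of order. */
    return FRAG_INSERT_SINGLE;
}

/* Classify the fragment and advance the tail when it extends or replaces it. */
static enum frag_action frag_queue(struct tail_state *t, unsigned int offset,
                                   unsigned int len)
{
    enum frag_action act = frag_classify(t, offset);

    if (act != FRAG_INSERT_SINGLE) {
        t->has_tail = 1;
        t->end = offset + len;
    }
    return act;
}
```

For a stream that arrives in order, every call after the first returns `FRAG_APPEND_TO_TAIL_RUN`, which corresponds to the path that avoids re-balancing the tree.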
    • D
      6a92ef08
    • D
      Merge branch 'Remove-rtnl-lock-dependency-from-all-action-implementations' · 9a95d9c6
      David S. Miller committed
      Vlad Buslov says:
      
      ====================
      Remove rtnl lock dependency from all action implementations
      
      Currently, all netlink protocol handlers for updating rules, actions and
      qdiscs are protected with single global rtnl lock which removes any
      possibility for parallelism. This patch set is a second step to remove
      rtnl lock dependency from TC rules update path.
      
      Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added.
      Handlers registered with this flag are called without RTNL taken. End
      goal is to have rule update handlers(RTM_NEWTFILTER, RTM_DELTFILTER,
      etc.) to be registered with UNLOCKED flag to allow parallel execution.
      However, there is no intention to completely remove or split rtnl lock
      itself. This patch set addresses specific problems in implementation of
      tc actions that prevent their control path from being executed
      concurrently. Additional changes are required to refactor classifiers
      API and individual classifiers for parallel execution. This patch set
      lays groundwork to eventually register rule update handlers as
      rtnl-unlocked.
      
      Action API is already prepared for parallel execution with previous
      patch set, which means that action ops that use action API for their
      implementation do not require additional modifications. (delete, search,
      etc.) Action API implements concurrency-safe reference counting and
      guarantees that cleanup/delete is called only once, after last reference
      to action is released.
      
      The goal of this change is to update specific actions APIs that access
      action private state directly, in order to be independent from external
      locking. General approach is to re-use existing tcf_lock spinlock (used
      by some action implementation to synchronize control path with data
      path) to protect action private state from concurrent modification. If
      action has rcu-protected pointer, tcf spinlock is used to protect its
      update code, instead of relying on rtnl lock.
      
      Some actions need to determine rtnl mutex status in order to release it.
      For example, ife action can load additional kernel modules(meta ops) and
      must make sure that no locks are held during module load. In such cases
      'rtnl_held' argument is used to conditionally release rtnl mutex.
      
      Changes from V1 to V2:
      - Patch 12:
        - new patch
      - Patch 14:
        - refactor gen_new_estimator() to reuse stats_lock when re-assigning
          rate estimator statistics pointer
      - Remove mirred and tunnel_key helper function changes. (to be submitted
        as a standalone patch)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a95d9c6
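The 'rtnl_held' convention mentioned in the cover letter, where a handler conditionally drops a global lock around an operation that must run with no locks held, can be illustrated with a toy lock. Everything here is an assumption made for the sketch: `big_lock` is a C11 `atomic_flag` spinlock standing in for the rtnl mutex, and `load_module_stub()` and `action_init()` are hypothetical functions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy global lock standing in for the rtnl mutex (assumption for this sketch). */
static atomic_flag big_lock = ATOMIC_FLAG_INIT;

/* Recorded by the stub below so callers can inspect what it observed. */
static bool lock_free_during_load;

static void big_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&big_lock))
        ; /* spin; a real mutex would sleep instead */
}

static void big_lock_release(void)
{
    atomic_flag_clear(&big_lock);
}

/* Hypothetical slow operation that must run with no locks held. */
static int load_module_stub(void)
{
    /* Probe the lock: if it can be taken, nobody was holding it. */
    if (!atomic_flag_test_and_set(&big_lock)) {
        lock_free_during_load = true;
        atomic_flag_clear(&big_lock);
    } else {
        lock_free_during_load = false;
    }
    return 0;
}

/*
 * Handler that may be entered with or without the global lock held.
 * `lock_held` plays the role of the 'rtnl_held' argument: the lock is
 * dropped around the slow operation only if the caller actually holds it,
 * and is re-taken before returning so the caller's view is unchanged.
 */
static int action_init(bool lock_held)
{
    int ret;

    if (lock_held)
        big_lock_release();
    ret = load_module_stub();
    if (lock_held)
        big_lock_acquire();
    return ret;
}
```

A caller that holds the lock passes `true` and still holds it on return; a caller running unlocked passes `false` and the lock is never touched.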
    • V
      net: sched: act_police: remove dependency on rtnl lock · e329bc42
      Vlad Buslov committed
      Use tcf spinlock to protect police action private data from concurrent
      modification during dump. (init already uses tcf spinlock when changing
      police action state)
      
      Pass tcf spinlock as estimator lock argument to gen_replace_estimator()
      during action init.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e329bc42
    • V
      net: core: protect rate estimator statistics pointer with lock · 51a9f5ae
      Vlad Buslov committed
      Extend gen_new_estimator() to also take stats_lock when re-assigning rate
      estimator statistics pointer. (to be used by unlocked actions)
      
      Rename 'stats_lock' to 'lock' and change argument description to explain
      that it is now also used for control path.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      51a9f5ae
    • V
      net: sched: act_mirred: remove dependency on rtnl lock · 4e232818
      Vlad Buslov committed
      Re-introduce mirred list spinlock, that was removed some time ago, in order
      to protect it from concurrent modifications, instead of relying on rtnl
      lock.
      
      Use tcf spinlock to protect mirred action private data from concurrent
      modification in init and dump. Rearrange access to mirred data in order to
      be performed only while holding the lock.
      
      Rearrange net dev access to always hold reference while working with it,
      instead of relying on rtnl lock.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e232818
    • V
      net: sched: extend action ops with put_dev callback · 84a75b32
      Vlad Buslov committed
      As a preparation for removing dependency on rtnl lock from rules update
      path, all users of shared objects must take reference while working with
      them.
      
      Extend action ops with put_dev() API to be used on net device returned by
      get_dev().
      
      Modify mirred action (only action that implements get_dev callback):
      - Take reference to net device in get_dev.
      - Implement put_dev API that releases reference to net device.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      84a75b32
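The get_dev()/put_dev() pairing described in this commit can be sketched as an ops table whose getter takes a reference and whose putter releases it. The types below are simplified stand-ins rather than the kernel's structures: `struct net_dev`, `struct action`, `struct action_ops` and the `mirred_*` functions are hypothetical, and the counter is a plain int where real code would use an atomic reference count.

```c
#include <stddef.h>

/* Hypothetical device with a simplified, non-atomic reference count. */
struct net_dev {
    int refcnt;
};

/* Hypothetical action that may point at a device. */
struct action {
    struct net_dev *dev;
};

/* Ops table: every device handed out by get_dev() is returned via put_dev(). */
struct action_ops {
    struct net_dev *(*get_dev)(const struct action *a);
    void (*put_dev)(struct net_dev *dev);
};

/* Take a reference before handing the device to the caller. */
static struct net_dev *mirred_get_dev(const struct action *a)
{
    if (a->dev != NULL)
        a->dev->refcnt++;
    return a->dev;
}

/* Release the reference taken by get_dev(); NULL is accepted. */
static void mirred_put_dev(struct net_dev *dev)
{
    if (dev != NULL)
        dev->refcnt--;
}

static const struct action_ops mirred_ops = {
    .get_dev = mirred_get_dev,
    .put_dev = mirred_put_dev,
};
```

A caller would use `mirred_ops.get_dev(&act)`, work with the returned device, and then call `mirred_ops.put_dev(dev)`. Holding its own reference for that window is what lets the caller stop depending on an outer lock to keep the device alive.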
    • V
      net: sched: act_vlan: remove dependency on rtnl lock · 764e9a24
      Vlad Buslov committed
      Use tcf spinlock to protect vlan action private data from concurrent
      modification during dump and init. Use rcu swap operation to reassign
      params pointer under protection of tcf lock. (old params value is not used
      by init, so there is no need of standalone rcu dereference step)
      
      Remove rtnl assertion that is no longer necessary.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      764e9a24