1. 10 10月, 2017 26 次提交
    • W
      locking/core: Remove {read,spin,write}_can_lock() · a8a217c2
      Will Deacon 提交于
      Outside of the locking code itself, {read,spin,write}_can_lock() have no
      users in tree. Apparmor (the last remaining user of write_can_lock()) got
      moved over to lockdep by the previous patch.
      
      This patch removes the use of {read,spin,write}_can_lock() from the
      BUILD_LOCK_OPS macro, deferring to the trylock operation for testing the
      lock status, and subsequently removes the unused macros altogether. They
      aren't guaranteed to work in a concurrent environment and can give
      incorrect results in the case of qrwlock.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: paulmck@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/r/1507055129-12300-2-git-send-email-will.deacon@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a8a217c2
    • W
      locking/rwsem, security/apparmor: Replace homebrew use of write_can_lock() with lockdep · 26c4eb19
      Will Deacon 提交于
      The lockdep subsystem provides a robust way to assert that a lock is
      held, so use that instead of write_can_lock, which can give incorrect
      results for qrwlocks.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NJohn Johansen <john.johansen@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: paulmck@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/r/1507055129-12300-1-git-send-email-will.deacon@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      26c4eb19
    • K
      locking/rwsem, fs: Use killable down_read() in iterate_dir() · 0dc208b5
      Kirill Tkhai 提交于
      There was mutex_lock_interruptible() initially, and it was changed
      to rwsem, but there were not killable rwsem primitives that time.
      >From commit 9902af79:
      
          "The main issue is the lack of down_write_killable(), so the places
           like readdir.c switched to plain inode_lock(); once killable
           variants of rwsem primitives appear, that'll be dealt with"
      
      Use down_read_killable() same as down_write_killable() in !shared
      case, as concurrent inode_lock() may take much time, that may be
      wanted to be interrupted by user.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: avagin@virtuozzo.com
      Cc: davem@davemloft.net
      Cc: fenghua.yu@intel.com
      Cc: gorcunov@virtuozzo.com
      Cc: heiko.carstens@de.ibm.com
      Cc: hpa@zytor.com
      Cc: ink@jurassic.park.msu.ru
      Cc: mattst88@gmail.com
      Cc: rientjes@google.com
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Cc: viro@zeniv.linux.org.uk
      Link: http://lkml.kernel.org/r/150670120820.23930.5455667921545937220.stgit@localhost.localdomainSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0dc208b5
    • K
      locking/rwsem: Add down_read_killable() · 76f8507f
      Kirill Tkhai 提交于
      Similar to down_read() and down_write_killable(),
      add killable version of down_read(), based on
      __down_read_killable() function, added in previous
      patches.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: avagin@virtuozzo.com
      Cc: davem@davemloft.net
      Cc: fenghua.yu@intel.com
      Cc: gorcunov@virtuozzo.com
      Cc: heiko.carstens@de.ibm.com
      Cc: hpa@zytor.com
      Cc: ink@jurassic.park.msu.ru
      Cc: mattst88@gmail.com
      Cc: rientjes@google.com
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Cc: viro@zeniv.linux.org.uk
      Link: http://lkml.kernel.org/r/150670119884.23930.2585570605960763239.stgit@localhost.localdomainSigned-off-by: NIngo Molnar <mingo@kernel.org>
      76f8507f
    • K
      locking/arch, x86: Add __down_read_killable() · 19c60923
      Kirill Tkhai 提交于
      Similar to __down_write_killable(), add read killable primitive:
      extract current __down_read() code to macros and teach it to get
      different functions as slow_path argument:
      store ax register to ret, and add sp register and preserve its value.
      
      Add call_rwsem_down_read_failed_killable() assembly entry similar
      to call_rwsem_down_read_failed():
      push dx register to stack in additional to common registers,
      as it's not declarated as modifiable in ____down_read().
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: avagin@virtuozzo.com
      Cc: davem@davemloft.net
      Cc: fenghua.yu@intel.com
      Cc: gorcunov@virtuozzo.com
      Cc: heiko.carstens@de.ibm.com
      Cc: hpa@zytor.com
      Cc: ink@jurassic.park.msu.ru
      Cc: mattst88@gmail.com
      Cc: rientjes@google.com
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Cc: viro@zeniv.linux.org.uk
      Link: http://lkml.kernel.org/r/150670118802.23930.1316107715255410256.stgit@localhost.localdomainSigned-off-by: NIngo Molnar <mingo@kernel.org>
      19c60923
    • K
      locking/arch, s390: Add __down_read_killable() · a61ba2c8
      Kirill Tkhai 提交于
      Similar to __down_write_killable(), and read killable primitive.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: avagin@virtuozzo.com
      Cc: davem@davemloft.net
      Cc: fenghua.yu@intel.com
      Cc: gorcunov@virtuozzo.com
      Cc: heiko.carstens@de.ibm.com
      Cc: hpa@zytor.com
      Cc: ink@jurassic.park.msu.ru
      Cc: mattst88@gmail.com
      Cc: rientjes@google.com
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Cc: viro@zeniv.linux.org.uk
      Link: http://lkml.kernel.org/r/150670117817.23930.13068785028558453848.stgit@localhost.localdomainSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a61ba2c8
    • K
      locking/arch, ia64: Add __down_read_killable() · c0905115
      Kirill Tkhai 提交于
      Similar to __down_write_killable(), and read killable primitive.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: avagin@virtuozzo.com
      Cc: davem@davemloft.net
      Cc: fenghua.yu@intel.com
      Cc: gorcunov@virtuozzo.com
      Cc: heiko.carstens@de.ibm.com
      Cc: hpa@zytor.com
      Cc: ink@jurassic.park.msu.ru
      Cc: mattst88@gmail.com
      Cc: rientjes@google.com
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Cc: viro@zeniv.linux.org.uk
      Link: http://lkml.kernel.org/r/150670116749.23930.14976888440968191759.stgit@localhost.localdomainSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c0905115
    • K
      locking/arch, alpha: Add __down_read_killable() · 8c74392a
      Kirill Tkhai 提交于
      Similar to __down_write_killable(), and read killable primitive.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: avagin@virtuozzo.com
      Cc: davem@davemloft.net
      Cc: fenghua.yu@intel.com
      Cc: gorcunov@virtuozzo.com
      Cc: heiko.carstens@de.ibm.com
      Cc: hpa@zytor.com
      Cc: ink@jurassic.park.msu.ru
      Cc: mattst88@gmail.com
      Cc: rientjes@google.com
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Cc: viro@zeniv.linux.org.uk
      Link: http://lkml.kernel.org/r/150670115677.23930.5711263025537758463.stgit@localhost.localdomainSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8c74392a
    • J
      locking/spinlocks, paravirt, xen: Correct the xen_nopvspin case · e6fd28eb
      Juergen Gross 提交于
      With the boot parameter "xen_nopvspin" specified a Xen guest should not
      make use of paravirt spinlocks, but behave as if running on bare
      metal. This is not true, however, as the qspinlock code will fall back
      to a test-and-set scheme when it is detecting a hypervisor.
      
      In order to avoid this disable the virt_spin_lock_key.
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NWaiman Long <longman@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akataria@vmware.com
      Cc: boris.ostrovsky@oracle.com
      Cc: chrisw@sous-sol.org
      Cc: hpa@zytor.com
      Cc: jeremy@goop.org
      Cc: rusty@rustcorp.com.au
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/20170906173625.18158-3-jgross@suse.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e6fd28eb
    • J
      locking/paravirt: Use new static key for controlling call of virt_spin_lock() · 9043442b
      Juergen Gross 提交于
      There are cases where a guest tries to switch spinlocks to bare metal
      behavior (e.g. by setting "xen_nopvspin" boot parameter). Today this
      has the downside of falling back to unfair test and set scheme for
      qspinlocks due to virt_spin_lock() detecting the virtualized
      environment.
      
      Add a static key controlling whether virt_spin_lock() should be
      called or not. When running on bare metal set the new key to false.
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NWaiman Long <longman@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akataria@vmware.com
      Cc: boris.ostrovsky@oracle.com
      Cc: chrisw@sous-sol.org
      Cc: hpa@zytor.com
      Cc: jeremy@goop.org
      Cc: rusty@rustcorp.com.au
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/20170906173625.18158-2-jgross@suse.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9043442b
    • I
      af1a34f2
    • P
      locking/selftest: Avoid false BUG report · c7e2f69d
      Peter Zijlstra 提交于
      The work-around for the expected failure is providing another failure :/
      
      Only when CONFIG_PROVE_LOCKING=y do we increment unexpected_testcase_failures,
      so only then do we need to decrement, otherwise we'll end up with a negative
      number and that will again trigger a BUG (printout, not crash).
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Tested-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: d82fed75 ("locking/lockdep/selftests: Fix mixed read-write ABBA tests")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c7e2f69d
    • P
      locking/lockdep: Fix stacktrace mess · 8b405d5c
      Peter Zijlstra 提交于
      There is some complication between check_prevs_add() and
      check_prev_add() wrt. saving stack traces. The problem is that we want
      to be frugal with saving stack traces, since it consumes static
      resources.
      
      We'll only know in check_prev_add() if we need the trace, but we can
      call into it multiple times. So we want to do on-demand and re-use.
      
      A further complication is that check_prev_add() can drop graph_lock
      and mess with our static resources.
      
      In any case, the current state; after commit:
      
        ce07a941 ("locking/lockdep: Make check_prev_add() able to handle external stack_trace")
      
      is that we'll assume the trace contains valid data once
      check_prev_add() returns '2'. However, as noted by Josh, this is
      false, check_prev_add() can return '2' before having saved a trace,
      this then result in the possibility of using uninitialized data.
      Testing, as reported by Wu, shows a NULL deref.
      
      So simplify.
      
      Since the graph_lock() thing is a debug path that hasn't
      really been used in a long while, take it out back and avoid the
      head-ache.
      
      Further initialize the stack_trace to a known 'empty' state; as long
      as nr_entries == 0, nothing should deref entries. We can then use the
      'entries == NULL' test for a valid trace / on-demand saving.
      Analyzed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: ce07a941 ("locking/lockdep: Make check_prev_add() able to handle external stack_trace")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      8b405d5c
    • L
      Merge branch 'ppc-bundle' (bundle from Michael Ellerman) · 529a86e0
      Linus Torvalds 提交于
      Merge powerpc transactional memory fixes from Michael Ellerman:
       "I figured I'd still send you the commits using a bundle to make sure
        it works in case I need to do it again in future"
      
      This fixes transactional memory state restore for powerpc.
      
      * bundle'd patches from Michael Ellerman:
        powerpc/tm: Fix illegal TM state in signal handler
        powerpc/64s: Use emergency stack for kernel TM Bad Thing program checks
      529a86e0
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ff33952e
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Fix object leak on IPSEC offload failure, from Steffen Klassert.
      
       2) Fix range checks in ipset address range addition operations, from
          Jozsef Kadlecsik.
      
       3) Fix pernet ops unregistration order in ipset, from Florian Westphal.
      
       4) Add missing netlink attribute policy for nl80211 packet pattern
          attrs, from Peng Xu.
      
       5) Fix PPP device destruction race, from Guillaume Nault.
      
       6) Write marks get lost when BPF verifier processes R1=R2 register
          assignments, causing incorrect liveness information and less state
          pruning. Fix from Alexei Starovoitov.
      
       7) Fix blockhole routes so that they are marked dead and therefore not
          cached in sockets, otherwise IPSEC stops working. From Steffen
          Klassert.
      
       8) Fix broadcast handling of UDP socket early demux, from Paolo Abeni.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits)
        cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan
        net: thunderx: mark expected switch fall-throughs in nicvf_main()
        udp: fix bcast packet reception
        netlink: do not set cb_running if dump's start() errs
        ipv4: Fix traffic triggered IPsec connections.
        ipv6: Fix traffic triggered IPsec connections.
        ixgbe: incorrect XDP ring accounting in ethtool tx_frame param
        net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
        Revert commit 1a8b6d76 ("net:add one common config...")
        ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register
        ixgbe: Return error when getting PHY address if PHY access is not supported
        netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
        netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook
        tipc: Unclone message at secondary destination lookup
        tipc: correct initialization of skb list
        gso: fix payload length when gso_size is zero
        mlxsw: spectrum_router: Avoid expensive lookup during route removal
        bpf: fix liveness marking
        doc: Fix typo "8023.ad" in bonding documentation
        ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real
        ...
      ff33952e
    • A
      cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan · fdfbad32
      Aleksander Morgado 提交于
      The u-blox TOBY-L2 is a LTE Cat 4 module with HSPA+ and 2G fallback.
      This module allows switching to different USB profiles with the
      'AT+UUSBCONF' command, and provides a ECM network interface when the
      'AT+UUSBCONF=2' profile is selected.
      
      The u-blox SARA-U2 is a HSPA module with 2G fallback. The default USB
      configuration includes a ECM network interface.
      
      Both these modules are controlled via AT commands through one of the
      TTYs exposed. Connecting these modules may be done just by activating
      the desired PDP context with 'AT+CGACT=1,<cid>' and then running DHCP
      on the ECM interface.
      Signed-off-by: NAleksander Morgado <aleksander@aleksander.es>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fdfbad32
    • L
      Merge tag 'nfs-for-4.14-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 68ebe3cb
      Linus Torvalds 提交于
      Pull NFS client bugfixes from Trond Myklebust:
       "Hightlights include:
      
        stable fixes:
         - nfs/filelayout: fix oops when freeing filelayout segment
         - NFS: Fix uninitialized rpc_wait_queue
      
        bugfixes:
         - NFSv4/pnfs: Fix an infinite layoutget loop
         - nfs: RPC_MAX_AUTH_SIZE is in bytes"
      
      * tag 'nfs-for-4.14-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFSv4/pnfs: Fix an infinite layoutget loop
        nfs/filelayout: fix oops when freeing filelayout segment
        sunrpc: remove redundant initialization of sock
        NFS: Fix uninitialized rpc_wait_queue
        NFS: Cleanup error handling in nfs_idmap_request_key()
        nfs: RPC_MAX_AUTH_SIZE is in bytes
      68ebe3cb
    • G
      net: thunderx: mark expected switch fall-throughs in nicvf_main() · 1a2ace56
      Gustavo A. R. Silva 提交于
      In preparation to enabling -Wimplicit-fallthrough, mark switch cases
      where we are expecting to fall through.
      
      Cc: Sunil Goutham <sgoutham@cavium.com>
      Cc: Robert Richter <rric@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: netdev@vger.kernel.org
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1a2ace56
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · fb60bccc
      David S. Miller 提交于
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter/IPVS fixes for your net tree,
      they are:
      
      1) Fix packet drops due to incorrect ECN handling in IPVS, from Vadim
         Fedorenko.
      
      2) Fix splat with mark restoration in xt_socket with non-full-sock,
         patch from Subash Abhinov Kasiviswanathan.
      
      3) ipset bogusly bails out when adding IPv4 range containing more than
         2^31 addresses, from Jozsef Kadlecsik.
      
      4) Incorrect pernet unregistration order in ipset, from Florian Westphal.
      
      5) Races between dump and swap in ipset results in BUG_ON splats, from
         Ross Lagerwall.
      
      6) Fix chain renames in nf_tables, from JingPiao Chen.
      
      7) Fix race in pernet codepath with ebtables table registration, from
         Artem Savkov.
      
      8) Memory leak in error path in set name allocation in nf_tables, patch
         from Arvind Yadav.
      
      9) Don't dump chain counters if they are not available, this fixes a
         crash when listing the ruleset.
      
      10) Fix out of bound memory read in strlcpy() in x_tables compat code,
          from Eric Dumazet.
      
      11) Make sure we only process TCP packets in SYNPROXY hooks, patch from
          Lin Zhang.
      
      12) Cannot load rules incrementally anymore after xt_bpf with pinned
          objects, added in revision 1. From Shmulik Ladkani.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fb60bccc
    • D
      Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 5766cd68
      David S. Miller 提交于
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2017-10-09
      
      This series contains updates to ixgbe and arch/Kconfig.
      
      Mark fixes a case where PHY register access is not supported and we were
      returning a PHY address, when we should have been returning -EOPNOTSUPP.
      
      Sabrina Dubroca fixes the use of a logical "and" when it should have been
      the bitwise "and" operator.
      
      Ding Tianhong reverts the commit that added the Kconfig bool option
      ARCH_WANT_RELAX_ORDER, since there is now a new flag
      PCI_DEV_FLAGS_NO_RELAXED_ORDERING that has been added to indicate that
      Relaxed Ordering Attributes should not be used for Transaction Layer
      Packets.  Then follows up with making the needed changes to ixgbe to
      use the new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag.
      
      John Fastabend fixes an issue in the ring accounting when the transmit
      ring parameters are changed via ethtool when an XDP program is attached.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5766cd68
    • P
      udp: fix bcast packet reception · 996b44fc
      Paolo Abeni 提交于
      The commit bc044e8d ("udp: perform source validation for
      mcast early demux") does not take into account that broadcast packets
      lands in the same code path and they need different checks for the
      source address - notably, zero source address are valid for bcast
      and invalid for mcast.
      
      As a result, 2nd and later broadcast packets with 0 source address
      landing to the same socket are dropped. This breaks dhcp servers.
      
      Since we don't have stringent performance requirements for ingress
      broadcast traffic, fix it by disabling UDP early demux such traffic.
      Reported-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Fixes: bc044e8d ("udp: perform source validation for mcast early demux")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      996b44fc
    • J
      netlink: do not set cb_running if dump's start() errs · 41c87425
      Jason A. Donenfeld 提交于
      It turns out that multiple places can call netlink_dump(), which means
      it's still possible to dereference partially initialized values in
      dump() that were the result of a faulty returned start().
      
      This fixes the issue by calling start() _before_ setting cb_running to
      true, so that there's no chance at all of hitting the dump() function
      through any indirect paths.
      
      It also moves the call to start() to be when the mutex is held. This has
      the nice side effect of serializing invocations to start(), which is
      likely desirable anyway. It also prevents any possible other races that
      might come out of this logic.
      
      In testing this with several different pieces of tricky code to trigger
      these issues, this commit fixes all avenues that I'm aware of.
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Reviewed-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      41c87425
    • D
      Merge tag 'mac80211-for-davem-2017-10-09' of... · 6df4d17c
      David S. Miller 提交于
      Merge tag 'mac80211-for-davem-2017-10-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      pull-request: mac80211 2017-10-09
      
      The QCA folks found another netlink problem - we were missing validation
      of some attributes. It's not super problematic since one can only read a
      few bytes beyond the message (and that memory must exist), but here's the
      fix for it.
      
      I thought perhaps we can make nla_parse_nested() require a policy, but
      given the two-stage validation/parsing in regular netlink that won't work.
      
      Please pull and let me know if there's any problem.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6df4d17c
    • D
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 93b03193
      David S. Miller 提交于
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2017-10-09
      
      1) Fix some error paths of the IPsec offloading API.
      
      2) Fix a NULL pointer dereference when IPsec is used
         with vti. From Alexey Kodanev.
      
      3) Don't call xfrm_policy_cache_flush under xfrm_state_lock,
         it triggers several locking warnings. From Artem Savkov.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93b03193
    • S
      ipv4: Fix traffic triggered IPsec connections. · 6c0e7284
      Steffen Klassert 提交于
      A recent patch removed the dst_free() on the allocated
      dst_entry in ipv4_blackhole_route(). The dst_free() marked the
      dst_entry as dead and added it to the gc list. I.e. it was setup
      for a one time usage. As a result we may now have a blackhole
      route cached at a socket on some IPsec scenarios. This makes the
      connection unusable.
      
      Fix this by marking the dst_entry directly at allocation time
      as 'dead', so it is used only once.
      
      Fixes: b838d5e1 ("ipv4: mark DST_NOGC and remove the operation of dst_free()")
      Reported-by: NTobias Brunner <tobias@strongswan.org>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6c0e7284
    • S
      ipv6: Fix traffic triggered IPsec connections. · 62cf27e5
      Steffen Klassert 提交于
      A recent patch removed the dst_free() on the allocated
      dst_entry in ipv6_blackhole_route(). The dst_free() marked
      the dst_entry as dead and added it to the gc list. I.e. it
      was setup for a one time usage. As a result we may now have
      a blackhole route cached at a socket on some IPsec scenarios.
      This makes the connection unusable.
      
      Fix this by marking the dst_entry directly at allocation time
      as 'dead', so it is used only once.
      
      Fixes: 587fea74 ("ipv6: mark DST_NOGC and remove the operation of dst_free()")
      Reported-by: NTobias Brunner <tobias@strongswan.org>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      62cf27e5
  2. 09 10月, 2017 12 次提交
  3. 08 10月, 2017 2 次提交
    • A
      bpf: fix liveness marking · 8fe2d6cc
      Alexei Starovoitov 提交于
      while processing Rx = Ry instruction the verifier does
      regs[insn->dst_reg] = regs[insn->src_reg]
      which often clears write mark (when Ry doesn't have it)
      that was just set by check_reg_arg(Rx) prior to the assignment.
      That causes mark_reg_read() to keep marking Rx in this block as
      REG_LIVE_READ (since the logic incorrectly misses that it's
      screened by the write) and in many of its parents (until lucky
      write into the same Rx or beginning of the program).
      That causes is_state_visited() logic to miss many pruning opportunities.
      
      Furthermore mark_reg_read() logic propagates the read mark
      for BPF_REG_FP as well (though it's readonly) which causes
      harmless but unnecssary work during is_state_visited().
      Note that do_propagate_liveness() skips FP correctly,
      so do the same in mark_reg_read() as well.
      It saves 0.2 seconds for the test below
      
      program               before  after
      bpf_lb-DLB_L3.o       2604    2304
      bpf_lb-DLB_L4.o       11159   3723
      bpf_lb-DUNKNOWN.o     1116    1110
      bpf_lxc-DDROP_ALL.o   34566   28004
      bpf_lxc-DUNKNOWN.o    53267   39026
      bpf_netdev.o          17843   16943
      bpf_overlay.o         8672    7929
      time                  ~11 sec  ~4 sec
      
      Fixes: dc503a8a ("bpf/verifier: track liveness for pruning")
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NEdward Cree <ecree@solarflare.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8fe2d6cc
    • A
      doc: Fix typo "8023.ad" in bonding documentation · 00a534e5
      Axel Beckert 提交于
      Should be "802.3ad" like everywhere else in the document.
      Signed-off-by: NAxel Beckert <abe@deuxchevaux.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      00a534e5