1. 10 9月, 2016 4 次提交
  2. 09 9月, 2016 13 次提交
  3. 08 9月, 2016 1 次提交
  4. 07 9月, 2016 6 次提交
  5. 05 9月, 2016 6 次提交
    • L
      af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock' · 6e1ce3c3
      Linus Torvalds 提交于
      Right now we use the 'readlock' both for protecting some of the af_unix
      IO path and for making the bind be single-threaded.
      
      The two are independent, but using the same lock makes for a nasty
      deadlock due to ordering with regards to filesystem locking.  The bind
      locking would want to nest outside the VSF pathname locking, but the IO
      locking wants to nest inside some of those same locks.
      
      We tried to fix this earlier with commit c845acb3 ("af_unix: Fix
      splice-bind deadlock") which moved the readlock inside the vfs locks,
      but that caused problems with overlayfs that will then call back into
      filesystem routines that take the lock in the wrong order anyway.
      
      Splitting the locks means that we can go back to having the bind lock be
      the outermost lock, and we don't have any deadlocks with lock ordering.
      Acked-by: NRainer Weikusat <rweikusat@cyberadapt.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6e1ce3c3
    • L
      Revert "af_unix: Fix splice-bind deadlock" · 38f7bd94
      Linus Torvalds 提交于
      This reverts commit c845acb3.
      
      It turns out that it just replaces one deadlock with another one: we can
      still get the wrong lock ordering with the readlock due to overlayfs
      calling back into the filesystem layer and still taking the vfs locks
      after the readlock.
      
      The proper solution ends up being to just split the readlock into two
      pieces: the bind lock (taken *outside* the vfs locks) and the IO lock
      (taken *inside* the filesystem locks).  The two locks are independent
      anyway.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: NShmulik Ladkani <shmulik.ladkani@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      38f7bd94
    • D
      Merge branch 'vxlan-fixes' · 2f83a53a
      David S. Miller 提交于
      Jiri Benc says:
      
      ====================
      vxlan: fix error reporting
      
      This patchset improves checking for invalid configuration in VXLAN and
      fixes problems with duplicated and inappropriate error messages.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2f83a53a
    • J
      vxlan: fix duplicated and wrong error messages · 3555621d
      Jiri Benc 提交于
      vxlan_dev_configure outputs error messages before returning, no need to
      print again the same mesages in vxlan_newlink. Also, vxlan_dev_configure may
      return a particular error code for a different reason than vxlan_newlink
      thinks.
      
      Move the remaining error messages into vxlan_dev_configure and let
      vxlan_newlink just pass on the error code.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3555621d
    • J
      vxlan: reject multicast destination without an interface · 9b4cdd51
      Jiri Benc 提交于
      Currently, kernel accepts configurations such as:
      
        ip l a type vxlan dstport 4789 id 1 group 239.192.0.1
        ip l a type vxlan dstport 4789 id 1 group ff0e::110
      
      However, neither of those really works. In the IPv4 case, the interface
      cannot be brought up ("RTNETLINK answers: No such device"). This is because
      multicast join will be rejected without the interface being specified.
      
      In the IPv6 case, multicast wil be joined on the first interface found. This
      is not what the user wants as it depends on random factors (order of
      interfaces).
      
      Note that it's possible to add a local address but it doesn't solve
      anything. For IPv4, it's not considered in the multicast join (thus the same
      error as above is returned on ifup). This could be added but it wouldn't
      help for IPv6 anyway. For IPv6, we do need the interface.
      
      Just reject a configuration that sets multicast address and does not provide
      an interface. Nobody can depend on the previous behavior as it never worked.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b4cdd51
    • M
      bonding: Fix bonding crash · 24b27fc4
      Mahesh Bandewar 提交于
      Following few steps will crash kernel -
      
        (a) Create bonding master
            > modprobe bonding miimon=50
        (b) Create macvlan bridge on eth2
            > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
      	   type macvlan
        (c) Now try adding eth2 into the bond
            > echo +eth2 > /sys/class/net/bond0/bonding/slaves
            <crash>
      
      Bonding does lots of things before checking if the device enslaved is
      busy or not.
      
      In this case when the notifier call-chain sends notifications, the
      bond_netdev_event() assumes that the rx_handler /rx_handler_data is
      registered while the bond_enslave() hasn't progressed far enough to
      register rx_handler for the new slave.
      
      This patch adds a rx_handler check that can be performed right at the
      beginning of the enslave code to avoid getting into this situation.
      Signed-off-by: NMahesh Bandewar <maheshb@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      24b27fc4
  6. 03 9月, 2016 6 次提交
  7. 02 9月, 2016 4 次提交
    • E
      ipv6: Don't unset flowi6_proto in ipxip6_tnl_xmit() · ab343801
      Eli Cooper 提交于
      Commit 8eb30be0 ("ipv6: Create ip6_tnl_xmit") unsets
      flowi6_proto in ip4ip6_tnl_xmit() and ip6ip6_tnl_xmit().
      Since xfrm_selector_match() relies on this info, IPv6 packets
      sent by an ip6tunnel cannot be properly selected by their
      protocols after removing it. This patch puts flowi6_proto back.
      
      Cc: stable@vger.kernel.org
      Fixes: 8eb30be0 ("ipv6: Create ip6_tnl_xmit")
      Signed-off-by: NEli Cooper <elicooper@gmx.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ab343801
    • G
      bnx2x: don't reset chip on cleanup if PCI function is offline · b44e108b
      Guilherme G. Piccoli 提交于
      When PCI error is detected, in some architectures (like PowerPC) a slot
      reset is performed - the driver's error handlers are in charge of "disable"
      device before the reset, and re-enable it after a successful slot reset.
      
      There are two cases though that another path is taken on the code: if the
      slot reset is not successful or if too many errors already happened in the
      specific adapter (meaning that possibly the device is experiencing a HW
      failure that slot reset is not able to solve), the core PCI error mechanism
      (called EEH in PowerPC) will remove the adapter from the system, since it
      will consider this as a permanent failure on device. In this case, a path
      is taken that leads to bnx2x_chip_cleanup() calling bnx2x_reset_hw(), which
      then tries to perform a HW reset on chip. This reset won't succeed since
      the HW is in a fault state, which can be seen by multiple messages on
      kernel log like below:
      
      	bnx2x: [bnx2x_issue_dmae_with_comp:552(eth1)]DMAE timeout!
      	bnx2x: [bnx2x_write_dmae:600(eth1)]DMAE returned failure -1
      
      After some time, the PCI error mechanism gives up on waiting the driver's
      correct removal procedure and forcibly remove the adapter from the system.
      We can see soft lockup while core PCI error mechanism is waiting for driver
      to accomplish the right removal process.
      
      This patch adds a verification to avoid a chip reset whenever the function
      is in PCI error state - since this case is only reached when we have a
      device being removed because of a permanent failure, the HW chip reset is
      not expected to work fine neither is necessary.
      
      Also, as a minor improvement in error path, we avoid the MCP information dump
      in case of non-recoverable PCI error (when adapter is about to be removed),
      since it will certainly fail.
      Reported-by: NHarsha Thyagaraja <hathyaga@in.ibm.com>
      Signed-off-by: NGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Acked-By: NYuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b44e108b
    • G
      rps: flow_dissector: Fix uninitialized flow_keys used in __skb_get_hash possibly · 635c223c
      Gao Feng 提交于
      The original codes depend on that the function parameters are evaluated from
      left to right. But the parameter's evaluation order is not defined in C
      standard actually.
      
      When flow_keys_have_l4(&keys) is invoked before ___skb_get_hash(skb, &keys,
      hashrnd) with some compilers or environment, the keys passed to
      flow_keys_have_l4 is not initialized.
      
      Fixes: 6db61d79 ("flow_dissector: Ignore flow dissector return value from ___skb_get_hash")
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NGao Feng <fgao@ikuai8.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      635c223c
    • N
      tcp: fastopen: fix rcv_wup initialization for TFO server on SYN/data · 28b346cb
      Neal Cardwell 提交于
      Yuchung noticed that on the first TFO server data packet sent after
      the (TFO) handshake, the server echoed the TCP timestamp value in the
      SYN/data instead of the timestamp value in the final ACK of the
      handshake. This problem did not happen on regular opens.
      
      The tcp_replace_ts_recent() logic that decides whether to remember an
      incoming TS value needs tp->rcv_wup to hold the latest receive
      sequence number that we have ACKed (latest tp->rcv_nxt we have
      ACKed). This commit fixes this issue by ensuring that a TFO server
      properly updates tp->rcv_wup to match tp->rcv_nxt at the time it sends
      a SYN/ACK for the SYN/data.
      Reported-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      28b346cb