1. 05 Feb, 2016 2 commits
  2. 23 Jan, 2016 2 commits
  3. 22 Jan, 2016 8 commits
    • libceph: remove outdated comment · 7e01726a
      Ilya Dryomov committed
      MClientMount{,Ack} are long gone.  The receipt of bare monmap doesn't
      actually indicate a mount success as we are yet to authenticate at that
      point in time.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: kill off ceph_x_ticket_handler::validity · f6cdb292
      Ilya Dryomov committed
      With it gone, no need to preserve ceph_timespec in process_one_ticket()
      either.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
    • libceph: invalidate AUTH in addition to a service ticket · 187d131d
      Ilya Dryomov committed
      If we fault due to authentication, we invalidate the service ticket we
      have and request a new one - the idea being that if a service rejected
      our authorizer, it must have expired, despite mon_client's attempts at
      periodic renewal.  (The other possibility is that our ticket is too new
      and the service hasn't gotten it yet, in which case invalidating isn't
      necessary but doesn't hurt.)
      
      Invalidating just the service ticket is not enough, though.  If we
      assume a failure on mon_client's part to renew a service ticket, we
      have to assume the same for the AUTH ticket.  If our AUTH ticket is
      bad, we won't get any service tickets no matter how hard we try, so
      invalidate AUTH ticket along with the service ticket.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
    • libceph: fix authorizer invalidation, take 2 · 6abe097d
      Ilya Dryomov committed
      Back in 2013, commit 4b8e8b5d ("libceph: fix authorizer
      invalidation") tried to fix authorizer invalidation issues by clearing
      validity field.  However, nothing ever consults this field, so it
      doesn't force us to request any new secrets in any way and therefore we
      never get out of the exponential backoff mode:
      
          [  129.973812] libceph: osd2 192.168.122.1:6810 connect authorization failure
          [  130.706785] libceph: osd2 192.168.122.1:6810 connect authorization failure
          [  131.710088] libceph: osd2 192.168.122.1:6810 connect authorization failure
          [  133.708321] libceph: osd2 192.168.122.1:6810 connect authorization failure
          [  137.706598] libceph: osd2 192.168.122.1:6810 connect authorization failure
          ...
      
      AFAICT this was the case at the time 4b8e8b5d was merged, too.
      
      Using timespec solely as a bool isn't nice, so introduce a new have_key
      flag, specifically for this purpose.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
    • libceph: clear messenger auth_retry flag if we fault · f6330cc1
      Ilya Dryomov committed
      Commit 20e55c4c ("libceph: clear messenger auth_retry flag when we
      authenticate") got us only half way there.  We clear the flag if the
      second attempt succeeds, but it also needs to be cleared if that
      attempt fails, to allow for the exponential backoff to kick in.
      Otherwise, if ->should_authenticate() thinks our keys are valid, we
      will busy loop, incrementing auth_retry to no avail:
      
          process_connect ffff880079a63830 got BADAUTHORIZER attempt 1
          process_connect ffff880079a63830 got BADAUTHORIZER attempt 2
          process_connect ffff880079a63830 got BADAUTHORIZER attempt 3
          process_connect ffff880079a63830 got BADAUTHORIZER attempt 4
          process_connect ffff880079a63830 got BADAUTHORIZER attempt 5
          ...
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
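The retry-then-back-off behaviour described in the commit above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct conn`, `conn_fault()`, `on_bad_authorizer()`, `on_auth_success()` and the delay constants are hypothetical names and values chosen for illustration, not the actual libceph messenger code.

```c
#include <stdbool.h>

/* Hypothetical, simplified connection state (not struct ceph_connection). */
struct conn {
    int auth_retry;        /* 1 once the single immediate retry has been used */
    unsigned delay_ms;     /* current reconnect delay; 0 means "reconnect now" */
};

enum { BASE_DELAY_MS = 250, MAX_DELAY_MS = 8000 };  /* illustrative values */

/* Fault path: this is where the exponential backoff is applied. */
void conn_fault(struct conn *c)
{
    c->auth_retry = 0;     /* clear the flag on failure too, as the commit argues */
    if (c->delay_ms == 0)
        c->delay_ms = BASE_DELAY_MS;
    else if (c->delay_ms < MAX_DELAY_MS)
        c->delay_ms *= 2;
}

/* The peer rejected our authorizer. Returns true if we retry immediately. */
bool on_bad_authorizer(struct conn *c)
{
    if (c->auth_retry) {   /* the second attempt failed as well: give up, back off */
        conn_fault(c);
        return false;
    }
    c->auth_retry = 1;     /* first failure: rebuild the authorizer, retry once */
    return true;
}

/* Authentication succeeded: reset both the flag and the delay. */
void on_auth_success(struct conn *c)
{
    c->auth_retry = 0;
    c->delay_ms = 0;
}
```

In this model, every second consecutive rejection goes through `conn_fault()`, so the delay doubles instead of the retry counter growing without bound.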
    • libceph: fix ceph_msg_revoke() · 67645d76
      Ilya Dryomov committed
      There are a number of problems with revoking a "was sending" message:
      
      (1) We never make any attempt to revoke data - only kvecs contribute to
      con->out_skip.  However, once the header (envelope) is written to the
      socket, our peer learns data_len and sets itself to expect at least
      data_len bytes to follow front or front+middle.  If ceph_msg_revoke()
      is called while the messenger is sending message's data portion,
      anything we send after that call is counted by the OSD towards the now
      revoked message's data portion.  The effects vary; the most common one
      is an eventual hang: higher layers get stuck waiting for the reply to
      a message that was sent out after ceph_msg_revoke() returned and was
      treated by the OSD as a bunch of data bytes.  This is what Matt ran
      into.
      
      (2) Flat out zeroing con->out_kvec_bytes worth of bytes to handle kvecs
      is wrong.  If ceph_msg_revoke() is called before the tag is sent out or
      while the messenger is sending the header, we will get a connection
      reset, either due to a bad tag (0 is not a valid tag) or a bad header
      CRC, which kind of defeats the purpose of revoke.  Currently the kernel
      client refuses to work with header CRCs disabled, but that will likely
      change in the future, making this even worse.
      
      (3) con->out_skip is not reset on connection reset, leading to one or
      more spurious connection resets if we happen to get a real one between
      the time con->out_skip is set in ceph_msg_revoke() and the time it is
      cleared in write_partial_skip().
      
      Fixing (1) and (3) is trivial.  The idea behind fixing (2) is to never
      zero the tag or the header, i.e. send out tag+header regardless of when
      ceph_msg_revoke() is called.  That way the header is always correct, no
      unnecessary resets are induced and revoke stands ready for disabled
      CRCs.  Since ceph_msg_revoke() rips out con->out_msg, introduce a new
      "message out temp" and copy the header into it before sending.
      
      Cc: stable@vger.kernel.org # 4.0+
      Reported-by: Matt Conner <matt.conner@keepertech.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Tested-by: Matt Conner <matt.conner@keepertech.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
    • libceph: use list_for_each_entry_safe · 10bcee14
      Geliang Tang committed
      Use list_for_each_entry_safe() instead of list_for_each_safe() to
      simplify the code.
      Signed-off-by: Geliang Tang <geliangtang@163.com>
      [idryomov@gmail.com: nuke call to list_splice_init() as well]
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
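To illustrate the simplification named in the commit above, here is a userspace sketch that re-implements a minimal subset of the kernel list macros and drains the same list both ways. The macros mirror the shape of those in `include/linux/list.h`, but this is a stand-alone approximation (it relies on the GNU `__typeof__` extension), and `struct msg`, `fill()`, `drain_old()` and `drain_new()` are hypothetical names used only for the example.

```c
#include <stddef.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Iterates over raw list_head pointers; the caller converts by hand. */
#define list_for_each_safe(pos, n, head) \
    for (pos = (head)->next, n = pos->next; pos != (head); \
         pos = n, n = pos->next)

/* Iterates over the containing entries directly. */
#define list_for_each_entry_safe(pos, n, head, member)                   \
    for (pos = list_entry((head)->next, __typeof__(*pos), member),       \
         n = list_entry(pos->member.next, __typeof__(*pos), member);     \
         &pos->member != (head);                                         \
         pos = n, n = list_entry(n->member.next, __typeof__(*n), member))

void list_init(struct list_head *h) { h->next = h->prev = h; }

void list_add_tail(struct list_head *e, struct list_head *head)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;
    head->prev = e;
}

void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

struct msg { int id; struct list_head item; };   /* hypothetical entry type */

/* Append 'count' freshly allocated messages; returns how many were added. */
int fill(struct list_head *head, int count)
{
    int added = 0;
    for (int i = 0; i < count; i++) {
        struct msg *m = malloc(sizeof(*m));
        if (!m)
            break;
        m->id = i;
        list_add_tail(&m->item, head);
        added++;
    }
    return added;
}

/* Older style: explicit list_entry() conversion inside the loop body. */
int drain_old(struct list_head *head)
{
    struct list_head *p, *n;
    int freed = 0;
    list_for_each_safe(p, n, head) {
        struct msg *m = list_entry(p, struct msg, item);
        list_del(&m->item);
        free(m);
        freed++;
    }
    return freed;
}

/* Newer style: the macro hands back the entry itself. */
int drain_new(struct list_head *head)
{
    struct msg *m, *n;
    int freed = 0;
    list_for_each_entry_safe(m, n, head, item) {
        list_del(&m->item);
        free(m);
        freed++;
    }
    return freed;
}
```

Both loops are safe against removal of the current entry because the next element is fetched before the body runs; the entry-based form just drops the manual conversion.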
    • libceph: use list_next_entry instead of list_entry_next · 17ddc49b
      Geliang Tang committed
      list_next_entry has been defined in list.h, so I replace list_entry_next
      with it.
      Signed-off-by: Geliang Tang <geliangtang@163.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  4. 21 Jan, 2016 4 commits
  5. 20 Jan, 2016 11 commits
  6. 19 Jan, 2016 1 commit
    • ovs: limit ovs recursions in ovs_execute_actions to not corrupt stack · b064d0d8
      Hannes Frederic Sowa committed
      It was seen that defective configurations of openvswitch could overwrite
      the STACK_END_MAGIC and cause a hard crash of the kernel because of too
      many recursions within ovs.
      
      This problem arises due to the high stack usage of openvswitch. The rest
      of the kernel is fine with the current limit of 10 (RECURSION_LIMIT).
      
      We use the already existing recursion counter in ovs_execute_actions to
      implement an upper bound of 5 recursions.
      
      Cc: Pravin Shelar <pshelar@ovn.org>
      Cc: Simon Horman <simon.horman@netronome.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
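The bounded-recursion idea from the commit above can be sketched in userspace C as follows. The names `OVS_LIMIT_SKETCH`, `exec_level`, `max_level_seen` and `execute_actions()` are hypothetical, a plain global stands in for the kernel's per-CPU counter, and the `-ENETDOWN` return value is an assumption made for the example rather than something stated in the commit message.

```c
#include <errno.h>

#define OVS_LIMIT_SKETCH 5   /* upper bound from the commit message; name is made up */

int exec_level;              /* stand-in for the per-CPU recursion counter */
int max_level_seen;          /* bookkeeping for the example only */

/* Hypothetical executor: 'depth' is how many further nested calls it triggers. */
int execute_actions(int depth)
{
    int err = 0;

    exec_level++;
    if (exec_level > OVS_LIMIT_SKETCH) {
        err = -ENETDOWN;     /* assumed error code: refuse to go deeper */
        goto out;
    }
    if (exec_level > max_level_seen)
        max_level_seen = exec_level;

    if (depth > 0)
        err = execute_actions(depth - 1);
out:
    exec_level--;            /* always undo the increment on the way out */
    return err;
}
```

The counter is decremented on every exit path, so a rejected call does not leave the level permanently raised.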
  7. 18 Jan, 2016 1 commit
  8. 16 Jan, 2016 11 commits
    • batman-adv: Drop immediate orig_node free function · 42eff6a6
      Sven Eckelmann committed
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_orig_node_free_ref.
      
      Fixes: 72822225 ("batman-adv: Fix rcu_barrier() miss due to double call_rcu() in TT code")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
    • batman-adv: Drop immediate batadv_hard_iface free function · b4d922cf
      Sven Eckelmann committed
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_hardif_free_ref.
      
      Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
    • batman-adv: Drop immediate neigh_ifinfo free function · ae3e1e36
      Sven Eckelmann committed
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_neigh_ifinfo_free_ref.
      
      Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
    • batman-adv: Drop immediate batadv_hardif_neigh_node free function · f6389692
      Sven Eckelmann committed
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_hardif_neigh_free_ref.
      
      Fixes: cef63419 ("batman-adv: add list of unique single hop neighbors per hard-interface")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
    • batman-adv: Drop immediate batadv_neigh_node free function · 2baa753c
      Sven Eckelmann committed
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_neigh_node_free_ref.
      
      Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
    • batman-adv: Drop immediate batadv_orig_ifinfo free function · deed9660
      Sven Eckelmann committed
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_orig_ifinfo_free_ref.
      
      Fixes: 7351a482 ("batman-adv: split out router from orig_node")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
    • batman-adv: Avoid recursive call_rcu for batadv_nc_node · 44e8e7e9
      Sven Eckelmann committed
      The batadv_nc_node_free_ref function uses call_rcu to delay the free of the
      batadv_nc_node object until no (already started) rcu_read_lock is enabled
      anymore. This makes sure that no context is still trying to access the
      object which should be removed. But batadv_nc_node also contains a
      reference to orig_node which must be removed.
      
      The reference drop of orig_node was done in the call_rcu function
      batadv_nc_node_free_rcu but should actually be done in the
      batadv_nc_node_release function to avoid nested call_rcus. This is
      important because rcu_barrier (e.g. batadv_softif_free or batadv_exit) will
      not detect the inner call_rcu as relevant for its execution. Otherwise this
      barrier will most likely be inserted in the queue before the callback of
      the first call_rcu was executed. The caller of rcu_barrier will therefore
      continue to run before the inner call_rcu callback finished.
      
      Fixes: d56b1705 ("batman-adv: network coding - detect coding nodes and remove these after timeout")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
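Why a nested deferred callback escapes a barrier, as the commit above describes, can be shown with a toy single-threaded model. This is not RCU: `toy_defer()` and `toy_barrier()` are hypothetical stand-ins for `call_rcu()` and `rcu_barrier()` that only capture the ordering argument (the barrier waits for callbacks queued before it, not for ones queued by those callbacks). All struct and function names are invented for the sketch.

```c
#include <stddef.h>

#define QUEUE_MAX 16

typedef void (*callback_fn)(void *arg);

struct cb { callback_fn fn; void *arg; };

struct cb queue[QUEUE_MAX];
size_t q_head, q_tail;

/* Toy stand-in for call_rcu(): run fn(arg) at some later point. */
void toy_defer(callback_fn fn, void *arg)
{
    if (q_tail < QUEUE_MAX) {          /* small fixed queue, enough for the demo */
        queue[q_tail].fn = fn;
        queue[q_tail].arg = arg;
        q_tail++;
    }
}

/* Toy stand-in for rcu_barrier(): only covers callbacks queued before the call. */
void toy_barrier(void)
{
    size_t stop = q_tail;
    while (q_head < stop) {
        struct cb c = queue[q_head++];
        c.fn(c.arg);
    }
}

size_t toy_pending(void) { return q_tail - q_head; }

struct parent { int freed; };
struct child  { struct parent *parent; int freed; };

void parent_free_cb(void *arg) { ((struct parent *)arg)->freed = 1; }

/* Drop what is assumed to be the last reference to the parent. */
void parent_put(struct parent *p) { toy_defer(parent_free_cb, p); }

/* Problematic pattern: the parent reference is dropped inside the callback. */
void child_free_cb_nested(void *arg)
{
    struct child *c = arg;
    parent_put(c->parent);             /* queues a second, nested callback */
    c->freed = 1;
}
void child_release_nested(struct child *c) { toy_defer(child_free_cb_nested, c); }

/* Pattern the commit moves to: drop the reference in release, then defer the free. */
void child_free_cb(void *arg) { ((struct child *)arg)->freed = 1; }
void child_release_fixed(struct child *c)
{
    parent_put(c->parent);
    toy_defer(child_free_cb, c);
}
```

With `child_release_nested()`, one `toy_barrier()` call returns while the parent's callback is still pending; with `child_release_fixed()`, both callbacks are already queued when the barrier starts, so a single call covers them.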
    • batman-adv: Avoid recursive call_rcu for batadv_bla_claim · 63b39927
      Sven Eckelmann committed
      The batadv_claim_free_ref function uses call_rcu to delay the free of the
      batadv_bla_claim object until no (already started) rcu_read_lock is enabled
      anymore. This makes sure that no context is still trying to access the
      object which should be removed. But batadv_bla_claim also contains a
      reference to backbone_gw which must be removed.
      
      The reference drop of backbone_gw was done in the call_rcu function
      batadv_claim_free_rcu but should actually be done in the
      batadv_claim_release function to avoid nested call_rcus. This is important
      because rcu_barrier (e.g. batadv_softif_free or batadv_exit) will not
      detect the inner call_rcu as relevant for its execution. Otherwise this
      barrier will most likely be inserted in the queue before the callback of
      the first call_rcu was executed. The caller of rcu_barrier will therefore
      continue to run before the inner call_rcu callback finished.
      
      Fixes: 23721387 ("batman-adv: add basic bridge loop avoidance code")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Acked-by: Simon Wunderlich <sw@simonwunderlich.de>
      Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
    • bridge: fix lockdep addr_list_lock false positive splat · c6894dec
      Nikolay Aleksandrov committed
      After promisc mode management was introduced a bridge device could do
      dev_set_promiscuity from its ndo_change_rx_flags() callback which in
      turn can be called after the bridge's addr_list_lock has been taken
      (e.g. by dev_uc_add). This causes a false positive lockdep splat because
      the port interfaces' addr_list_lock is taken when br_manage_promisc()
      runs after the bridge's addr list lock was already taken.
      To remove the false positive introduce a custom bridge addr_list_lock
      class and set it on bridge init.
      A simple way to reproduce this is with the following:
      $ brctl addbr br0
      $ ip l add l br0 br0.100 type vlan id 100
      $ ip l set br0 up
      $ ip l set br0.100 up
      $ echo 1 > /sys/class/net/br0/bridge/vlan_filtering
      $ brctl addif br0 eth0
      Splat:
      [   43.684325] =============================================
      [   43.684485] [ INFO: possible recursive locking detected ]
      [   43.684636] 4.4.0-rc8+ #54 Not tainted
      [   43.684755] ---------------------------------------------
      [   43.684906] brctl/1187 is trying to acquire lock:
      [   43.685047]  (_xmit_ETHER){+.....}, at: [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40
      [   43.685460]  but task is already holding lock:
      [   43.685618]  (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80
      [   43.686015]  other info that might help us debug this:
      [   43.686316]  Possible unsafe locking scenario:
      
      [   43.686743]        CPU0
      [   43.686967]        ----
      [   43.687197]   lock(_xmit_ETHER);
      [   43.687544]   lock(_xmit_ETHER);
      [   43.687886] *** DEADLOCK ***
      
      [   43.688438]  May be due to missing lock nesting notation
      
      [   43.688882] 2 locks held by brctl/1187:
      [   43.689134]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81510317>] rtnl_lock+0x17/0x20
      [   43.689852]  #1:  (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80
      [   43.690575] stack backtrace:
      [   43.690970] CPU: 0 PID: 1187 Comm: brctl Not tainted 4.4.0-rc8+ #54
      [   43.691270] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
      [   43.691770]  ffffffff826a25c0 ffff8800369fb8e0 ffffffff81360ceb ffffffff826a25c0
      [   43.692425]  ffff8800369fb9b8 ffffffff810d0466 ffff8800369fb968 ffffffff81537139
      [   43.693071]  ffff88003a08c880 0000000000000000 00000000ffffffff 0000000002080020
      [   43.693709] Call Trace:
      [   43.693931]  [<ffffffff81360ceb>] dump_stack+0x4b/0x70
      [   43.694199]  [<ffffffff810d0466>] __lock_acquire+0x1e46/0x1e90
      [   43.694483]  [<ffffffff81537139>] ? netlink_broadcast_filtered+0x139/0x3e0
      [   43.694789]  [<ffffffff8153b5da>] ? nlmsg_notify+0x5a/0xc0
      [   43.695064]  [<ffffffff810d10f5>] lock_acquire+0xe5/0x1f0
      [   43.695340]  [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40
      [   43.695623]  [<ffffffff815edea5>] _raw_spin_lock_bh+0x45/0x80
      [   43.695901]  [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40
      [   43.696180]  [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40
      [   43.696460]  [<ffffffff8150189c>] dev_set_promiscuity+0x3c/0x50
      [   43.696750]  [<ffffffffa0586845>] br_port_set_promisc+0x25/0x50 [bridge]
      [   43.697052]  [<ffffffffa05869aa>] br_manage_promisc+0x8a/0xe0 [bridge]
      [   43.697348]  [<ffffffffa05826ee>] br_dev_change_rx_flags+0x1e/0x20 [bridge]
      [   43.697655]  [<ffffffff81501532>] __dev_set_promiscuity+0x132/0x1f0
      [   43.697943]  [<ffffffff81501672>] __dev_set_rx_mode+0x82/0x90
      [   43.698223]  [<ffffffff815072de>] dev_uc_add+0x5e/0x80
      [   43.698498]  [<ffffffffa05b3c62>] vlan_device_event+0x542/0x650 [8021q]
      [   43.698798]  [<ffffffff8109886d>] notifier_call_chain+0x5d/0x80
      [   43.699083]  [<ffffffff810988b6>] raw_notifier_call_chain+0x16/0x20
      [   43.699374]  [<ffffffff814f456e>] call_netdevice_notifiers_info+0x6e/0x80
      [   43.699678]  [<ffffffff814f4596>] call_netdevice_notifiers+0x16/0x20
      [   43.699973]  [<ffffffffa05872be>] br_add_if+0x47e/0x4c0 [bridge]
      [   43.700259]  [<ffffffffa058801e>] add_del_if+0x6e/0x80 [bridge]
      [   43.700548]  [<ffffffffa0588b5f>] br_dev_ioctl+0xaf/0xc0 [bridge]
      [   43.700836]  [<ffffffff8151a7ac>] dev_ifsioc+0x30c/0x3c0
      [   43.701106]  [<ffffffff8151aac9>] dev_ioctl+0xf9/0x6f0
      [   43.701379]  [<ffffffff81254345>] ? mntput_no_expire+0x5/0x450
      [   43.701665]  [<ffffffff812543ee>] ? mntput_no_expire+0xae/0x450
      [   43.701947]  [<ffffffff814d7b02>] sock_do_ioctl+0x42/0x50
      [   43.702219]  [<ffffffff814d8175>] sock_ioctl+0x1e5/0x290
      [   43.702500]  [<ffffffff81242d0b>] do_vfs_ioctl+0x2cb/0x5c0
      [   43.702771]  [<ffffffff81243079>] SyS_ioctl+0x79/0x90
      [   43.703033]  [<ffffffff815eebb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
      
      CC: Vlad Yasevich <vyasevic@redhat.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: Bridge list <bridge@lists.linux-foundation.org>
      CC: Andy Gospodarek <gospo@cumulusnetworks.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Fixes: 2796d0c6 ("bridge: Automatically manage port promiscuous mode.")
      Reported-by: Andy Gospodarek <gospo@cumulusnetworks.com>
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sctp: Move sequence start handling into sctp_transport_get_idx() · fb331185
      Geert Uytterhoeven committed
      net/sctp/proc.c: In function ‘sctp_transport_get_idx’:
      net/sctp/proc.c:313: warning: ‘obj’ may be used uninitialized in this function
      
      This is currently a false positive, as all callers check for a zero
      offset first, and handle this case in the exact same way.
      
      Move the check and handling into sctp_transport_get_idx() to kill the
      compiler warning, and avoid future bugs.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: update skb->csum when CE mark is propagated · 34ae6a1a
      Eric Dumazet committed
      When a tunnel decapsulates the outer header, it has to comply
      with RFC 6040 and eventually propagate CE mark into inner header.
      
      It turns out IP6_ECN_set_ce() does not correctly update skb->csum
      for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
      messages and stack traces.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
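The checksum issue in the commit above comes down to this: if a running ones' complement sum already covers the header word that gets the CE mark, the sum has to be patched when that word changes. The following userspace sketch shows one way to do that incrementally (subtract the old word, add the new one). The helper names are hypothetical and only resemble the kernel's `csum_add()`/`csum_sub()` in spirit; the buffer length is assumed to be a multiple of 4, and `htonl()` comes from POSIX `<arpa/inet.h>`.

```c
#include <arpa/inet.h>   /* htonl(); POSIX */
#include <stdint.h>
#include <string.h>

#define ECN_CE 0x3u      /* "Congestion Experienced" codepoint */

/* 32-bit ones' complement addition with end-around carry. */
uint32_t csum_add32(uint32_t csum, uint32_t addend)
{
    uint32_t res = csum + addend;
    return res + (res < addend);
}

/* Ones' complement subtraction: add the bitwise complement. */
uint32_t csum_sub32(uint32_t csum, uint32_t addend)
{
    return csum_add32(csum, ~addend);
}

/* Full sum over a buffer whose length is assumed to be a multiple of 4. */
uint32_t csum_full(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0, word;
    for (size_t i = 0; i + 4 <= len; i += 4) {
        memcpy(&word, buf + i, 4);     /* avoids unaligned access */
        sum = csum_add32(sum, word);
    }
    return sum;
}

/* Fold a 32-bit running sum down to 16 bits. */
uint16_t csum_fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Set CE in the first 32-bit word of an IPv6 header (version, traffic
 * class, flow label) and return the running sum patched to match.
 */
uint32_t set_ce_and_update(uint8_t *hdr, uint32_t csum)
{
    uint32_t from, to;

    memcpy(&from, hdr, 4);
    to = from | htonl(ECN_CE << 20);   /* ECN is the low 2 bits of traffic class */
    memcpy(hdr, &to, 4);

    return csum_add32(csum_sub32(csum, from), to);
}
```

Patching the sum this way avoids re-reading the whole packet; the result should fold to the same 16-bit value as a full recomputation over the modified buffer.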