1. 11 1月, 2013 1 次提交
  2. 09 1月, 2013 1 次提交
    • T
      SUNRPC: Ensure we release the socket write lock if the rpc_task exits early · 87ed5003
      Trond Myklebust 提交于
      If the rpc_task exits while holding the socket write lock before it has
      allocated an rpc slot, then the usual mechanism for releasing the write
      lock in xprt_release() is defeated.
      
      The problem occurs if the call to xprt_lock_write() initially fails, so
      that the rpc_task is put on the xprt->sending wait queue. If the task
      exits after being assigned the lock by __xprt_lock_write_func, but
      before it has retried the call to xprt_lock_and_alloc_slot(), then
      it calls xprt_release() while holding the write lock, but will
      immediately exit due to the test for task->tk_rqstp != NULL.
      Reported-by: NChris Perl <chris.perl@gmail.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org [>= 3.1]
      87ed5003
  3. 08 1月, 2013 2 次提交
    • H
      s390/irq: remove split irq fields from /proc/stat · 420f42ec
      Heiko Carstens 提交于
      Now that irq sum accounting for /proc/stat's "intr" line works again we
      have the oddity that the sum field (first field) contains only the sum
      of the second (external irqs) and third field (I/O interrupts).
      The reason for that is that these two fields are already sums of all other
      fields. So if we would sum up everything we would count every interrupt
      twice.
      This is broken since the split interrupt accounting was merged two years
      ago: 052ff461 "[S390] irq: have detailed
      statistics for interrupt types".
      To fix this remove the split interrupt fields from /proc/stat's "intr"
      line again and only have them in /proc/interrupts.
      
      This restores the old behaviour, seems to be the only sane fix and mimics
      a behaviour from other architectures where /proc/interrupts also contains
      more than /proc/stat's "intr" line does.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      420f42ec
    • A
      sctp: fix Kconfig bug in default cookie hmac selection · 36a25de2
      Alex Elder 提交于
      Commit 0d0863b0 ("sctp: Change defaults on cookie hmac selection")
      added a "choice" to the sctp Kconfig file.  It introduced a bug which
      led to an infinite loop when while running "make oldconfig".
      
      The problem is that the wrong symbol was defined as the default value
      for the choice.  Using the correct value gets rid of the infinite loop.
      
      Note:  if CONFIG_SCTP_COOKIE_HMAC_SHA1=y was present in the input
      config file, both that and CONFIG_SCTP_COOKIE_HMAC_MD5=y be present
      in the generated config file.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      36a25de2
  4. 07 1月, 2013 1 次提交
  5. 05 1月, 2013 7 次提交
  6. 03 1月, 2013 2 次提交
  7. 28 12月, 2012 4 次提交
    • S
      libceph: fix protocol feature mismatch failure path · 0fa6ebc6
      Sage Weil 提交于
      We should not set con->state to CLOSED here; that happens in
      ceph_fault() in the caller, where it first asserts that the state
      is not yet CLOSED.  Avoids a BUG when the features don't match.
      
      Since the fail_protocol() has become a trivial wrapper, replace
      calls to it with direct calls to reset_connection().
      Signed-off-by: NSage Weil <sage@inktank.com>
      Reviewed-by: NAlex Elder <elder@inktank.com>
      0fa6ebc6
    • A
      libceph: WARN, don't BUG on unexpected connection states · 122070a2
      Alex Elder 提交于
      A number of assertions in the ceph messenger are implemented with
      BUG_ON(), killing the system if connection's state doesn't match
      what's expected.  At this point our state model is (evidently) not
      well understood enough for these assertions to trigger a BUG().
      Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
      so we learn about these issues without killing the machine.
      
      We now recognize that a connection fault can occur due to a socket
      closure at any time, regardless of the state of the connection.  So
      there is really nothing we can assert about the state of the
      connection at that point so eliminate that assertion.
      Reported-by: NUgis <ugis22@gmail.com>
      Tested-by: NUgis <ugis22@gmail.com>
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NSage Weil <sage@inktank.com>
      122070a2
    • A
      libceph: always reset osds when kicking · e6d50f67
      Alex Elder 提交于
      When ceph_osdc_handle_map() is called to process a new osd map,
      kick_requests() is called to ensure all affected requests are
      updated if necessary to reflect changes in the osd map.  This
      happens in two cases:  whenever an incremental map update is
      processed; and when a full map update (or the last one if there is
      more than one) gets processed.
      
      In the former case, the kick_requests() call is followed immediately
      by a call to reset_changed_osds() to ensure any connections to osds
      affected by the map change are reset.  But for full map updates
      this isn't done.
      
      Both cases should be doing this osd reset.
      
      Rather than duplicating the reset_changed_osds() call, move it into
      the end of kick_requests().
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NSage Weil <sage@inktank.com>
      e6d50f67
    • A
      libceph: move linger requests sooner in kick_requests() · ab60b16d
      Alex Elder 提交于
      The kick_requests() function is called by ceph_osdc_handle_map()
      when an osd map change has been indicated.  Its purpose is to
      re-queue any request whose target osd is different from what it
      was when it was originally sent.
      
      It is structured as two loops, one for incomplete but registered
      requests, and a second for handling completed linger requests.
      As a special case, in the first loop if a request marked to linger
      has not yet completed, it is moved from the request list to the
      linger list.  This is as a quick and dirty way to have the second
      loop handle sending the request along with all the other linger
      requests.
      
      Because of the way it's done now, however, this quick and dirty
      solution can result in these incomplete linger requests never
      getting re-sent as desired.  The problem lies in the fact that
      the second loop only arranges for a linger request to be sent
      if it appears its target osd has changed.  This is the proper
      handling for *completed* linger requests (it avoids issuing
      the same linger request twice to the same osd).
      
      But although the linger requests added to the list in the first loop
      may have been sent, they have not yet completed, so they need to be
      re-sent regardless of whether their target osd has changed.
      
      The first required fix is we need to avoid calling __map_request()
      on any incomplete linger request.  Otherwise the subsequent
      __map_request() call in the second loop will find the target osd
      has not changed and will therefore not re-send the request.
      
      Second, we need to be sure that a sent but incomplete linger request
      gets re-sent.  If the target osd is the same with the new osd map as
      it was when the request was originally sent, this won't happen.
      This can be fixed through careful handling when we move these
      requests from the request list to the linger list, by unregistering
      the request *before* it is registered as a linger request.  This
      works because a side-effect of unregistering the request is to make
      the request's r_osd pointer be NULL, and *that* will ensure the
      second loop actually re-sends the linger request.
      
      Processing of such a request is done at that point, so continue with
      the next one once it's been moved.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NSage Weil <sage@inktank.com>
      ab60b16d
  8. 27 12月, 2012 10 次提交
  9. 25 12月, 2012 1 次提交
  10. 24 12月, 2012 1 次提交
    • P
      netfilter: xt_CT: recover NOTRACK target support · 10db9069
      Pablo Neira Ayuso 提交于
      Florian Westphal reported that the removal of the NOTRACK target
      (96550501 netfilter: remove xt_NOTRACK) is breaking some existing
      setups.
      
      That removal was scheduled for removal since long time ago as
      described in Documentation/feature-removal-schedule.txt
      
      What:  xt_NOTRACK
      Files: net/netfilter/xt_NOTRACK.c
      When:  April 2011
      Why:   Superseded by xt_CT
      
      Still, people may have not notice / may have decided to stick to an
      old iptables version. I agree with him in that some more conservative
      approach by spotting some printk to warn users for some time is less
      agressive.
      
      Current iptables 1.4.16.3 already contains the aliasing support
      that makes it point to the CT target, so upgrading would fix it.
      Still, the policy so far has been to avoid pushing our users to
      upgrade.
      
      As a solution, this patch recovers the NOTRACK target inside the CT
      target and it now spots a warning.
      Reported-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      10db9069
  11. 22 12月, 2012 7 次提交
    • S
      net: sched: integer overflow fix · d2fe85da
      Stefan Hasko 提交于
      Fixed integer overflow in function htb_dequeue
      Signed-off-by: NStefan Hasko <hasko.stevo@gmail.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d2fe85da
    • G
      CONFIG_HOTPLUG removal from networking core · 8baf82b3
      Greg KH 提交于
      CONFIG_HOTPLUG is always enabled now, so remove the unused code that was
      trying to be compiled out when this option was disabled, in the
      networking core.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8baf82b3
    • G
      bridge: call br_netpoll_disable in br_add_if · 9b1536c4
      Gao feng 提交于
      When netdev_set_master faild in br_add_if, we should
      call br_netpoll_disable to do some cleanup jobs,such
      as free the memory of struct netpoll which allocated
      in br_netpoll_enable.
      Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
      Acked-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b1536c4
    • E
      ipv4: arp: fix a lockdep splat in arp_solicit() · 9650388b
      Eric Dumazet 提交于
      Yan Burman reported following lockdep warning :
      
      =============================================
      [ INFO: possible recursive locking detected ]
      3.7.0+ #24 Not tainted
      ---------------------------------------------
      swapper/1/0 is trying to acquire lock:
        (&n->lock){++--..}, at: [<ffffffff8139f56e>] __neigh_event_send
      +0x2e/0x2f0
      
      but task is already holding lock:
        (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit+0x1d4/0x280
      
      other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&n->lock);
         lock(&n->lock);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
      4 locks held by swapper/1/0:
        #0:  (((&n->timer))){+.-...}, at: [<ffffffff8104b350>]
      call_timer_fn+0x0/0x1c0
        #1:  (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit
      +0x1d4/0x280
        #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81395400>]
      dev_queue_xmit+0x0/0x5d0
        #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff813cb41e>]
      ip_finish_output+0x13e/0x640
      
      stack backtrace:
      Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24
      Call Trace:
        <IRQ>  [<ffffffff8108c7ac>] validate_chain+0xdcc/0x11f0
        [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
        [<ffffffff81120565>] ? kmem_cache_free+0xe5/0x1c0
        [<ffffffff8108d570>] __lock_acquire+0x440/0xc30
        [<ffffffff813c3570>] ? inet_getpeer+0x40/0x600
        [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
        [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
        [<ffffffff8108ddf5>] lock_acquire+0x95/0x140
        [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
        [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
        [<ffffffff81448d4b>] _raw_write_lock_bh+0x3b/0x50
        [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
        [<ffffffff8139f56e>] __neigh_event_send+0x2e/0x2f0
        [<ffffffff8139f99b>] neigh_resolve_output+0x16b/0x270
        [<ffffffff813cb62d>] ip_finish_output+0x34d/0x640
        [<ffffffff813cb41e>] ? ip_finish_output+0x13e/0x640
        [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
        [<ffffffff813cb9a0>] ip_output+0x80/0xf0
        [<ffffffff813ca368>] ip_local_out+0x28/0x80
        [<ffffffffa046f25a>] vxlan_xmit+0x66a/0xbec [vxlan]
        [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
        [<ffffffff81394a50>] ? skb_gso_segment+0x2b0/0x2b0
        [<ffffffff81449355>] ? _raw_spin_unlock_irqrestore+0x65/0x80
        [<ffffffff81394c57>] ? dev_queue_xmit_nit+0x207/0x270
        [<ffffffff813950c8>] dev_hard_start_xmit+0x298/0x5d0
        [<ffffffff813956f3>] dev_queue_xmit+0x2f3/0x5d0
        [<ffffffff81395400>] ? dev_hard_start_xmit+0x5d0/0x5d0
        [<ffffffff813f5788>] arp_xmit+0x58/0x60
        [<ffffffff813f59db>] arp_send+0x3b/0x40
        [<ffffffff813f6424>] arp_solicit+0x204/0x280
        [<ffffffff813a1a70>] ? neigh_add+0x310/0x310
        [<ffffffff8139f515>] neigh_probe+0x45/0x70
        [<ffffffff813a1c10>] neigh_timer_handler+0x1a0/0x2a0
        [<ffffffff8104b3cf>] call_timer_fn+0x7f/0x1c0
        [<ffffffff8104b350>] ? detach_if_pending+0x120/0x120
        [<ffffffff8104b748>] run_timer_softirq+0x238/0x2b0
        [<ffffffff813a1a70>] ? neigh_add+0x310/0x310
        [<ffffffff81043e51>] __do_softirq+0x101/0x280
        [<ffffffff814518cc>] call_softirq+0x1c/0x30
        [<ffffffff81003b65>] do_softirq+0x85/0xc0
        [<ffffffff81043a7e>] irq_exit+0x9e/0xc0
        [<ffffffff810264f8>] smp_apic_timer_interrupt+0x68/0xa0
        [<ffffffff8145122f>] apic_timer_interrupt+0x6f/0x80
        <EOI>  [<ffffffff8100a054>] ? mwait_idle+0xa4/0x1c0
        [<ffffffff8100a04b>] ? mwait_idle+0x9b/0x1c0
        [<ffffffff8100a6a9>] cpu_idle+0x89/0xe0
        [<ffffffff81441127>] start_secondary+0x1b2/0x1b6
      
      Bug is from arp_solicit(), releasing the neigh lock after arp_send()
      In case of vxlan, we eventually need to write lock a neigh lock later.
      
      Its a false positive, but we can get rid of it without lockdep
      annotations.
      
      We can instead use neigh_ha_snapshot() helper.
      Reported-by: NYan Burman <yanb@mellanox.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9650388b
    • E
      net: devnet_rename_seq should be a seqcount · 30e6c9fa
      Eric Dumazet 提交于
      Using a seqlock for devnet_rename_seq is not a good idea,
      as device_rename() can sleep.
      
      As we hold RTNL, we dont need a protection for writers,
      and only need a seqcount so that readers can catch a change done
      by a writer.
      
      Bug added in commit c91f6df2 (sockopt: Change getsockopt() of
      SO_BINDTODEVICE to return an interface name)
      Reported-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Brian Haley <brian.haley@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      30e6c9fa
    • E
      ip_gre: fix possible use after free · f7e75ba1
      Eric Dumazet 提交于
      Once skb_realloc_headroom() is called, tiph might point to freed memory.
      
      Cache tiph->ttl value before the reallocation, to avoid unexpected
      behavior.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Isaku Yamahata <yamahata@valinux.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f7e75ba1
    • I
      ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally · 412ed947
      Isaku Yamahata 提交于
      ipgre_tunnel_xmit() parses network header as IP unconditionally.
      But transmitting packets are not always IP packet. For example such packet
      can be sent by packet socket with sockaddr_ll.sll_protocol set.
      So make the function check if skb->protocol is IP.
      Signed-off-by: NIsaku Yamahata <yamahata@valinux.co.jp>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      412ed947
  12. 21 12月, 2012 3 次提交