1. 13 Feb, 2008 13 commits
    • [NETLABEL]: Compilation for CONFIG_AUDIT=n case. · 94de7feb
      Committed by Pavel Emelyanov
      With CONFIG_AUDIT=n, audit_log_start() expands into an empty
      do { } while (0) construct and audit_ctx becomes unused.
      
      The solution: pass current->audit_context to audit_log_start()
      directly, since it is not required anywhere else in the
      calling function.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [GENETLINK]: Relax dances with genl_lock. · 910d6c32
      Committed by Pavel Emelyanov
      genl_unregister_family() calls genl_unregister_mc_groups(),
      which takes and releases genl_lock, and then takes and releases
      the same lock itself.
      
      Relax this behavior, all the more so since genl_unregister_mc_groups()
      is called only from genl_unregister_family().
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETLABEL]: Fix lookup logic of netlbl_domhsh_search_def. · 4c3a0a25
      Committed by Pavel Emelyanov
      Currently, if the call to netlbl_domhsh_search() succeeds, the
      returned result is still NULL.
      
      Fix that by returning the found entry (if any).
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Paul Moore <paul.moore@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
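
The lookup bug described in this commit can be illustrated with a small user-space sketch. Everything here is hypothetical (struct dom_map, the table, and the search_def_* helpers are stand-ins, not the kernel's NetLabel code); it only assumes the shape the message describes: a "search, else fall back to the default entry" helper whose success path fell through to a final return NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a domain mapping entry. */
struct dom_map {
    const char *domain;
    int valid;
};

static struct dom_map table[] = {
    { "web", 1 },
    { "db", 1 },
};
static struct dom_map default_entry = { NULL, 1 };

/* Exact-match lookup; returns NULL when the domain is unknown. */
static struct dom_map *search(const char *domain)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (table[i].valid && strcmp(table[i].domain, domain) == 0)
            return &table[i];
    return NULL;
}

/* Buggy shape: only the default-entry path ever returns non-NULL. */
static struct dom_map *search_def_buggy(const char *domain)
{
    struct dom_map *entry = search(domain);

    if (entry == NULL) {
        entry = &default_entry;
        if (entry->valid)
            return entry;
    }
    return NULL; /* a successful search() also ends up here */
}

/* Fixed shape: return the found entry, else a valid default, else NULL. */
static struct dom_map *search_def_fixed(const char *domain)
{
    struct dom_map *entry = search(domain);

    if (entry == NULL) {
        entry = &default_entry;
        if (!entry->valid)
            return NULL;
    }
    return entry;
}

/* Small self-check showing the difference between the two shapes. */
void lookup_self_check(void)
{
    assert(search_def_buggy("web") == NULL);          /* match is lost */
    assert(search_def_fixed("web") == &table[0]);     /* match is returned */
    assert(search_def_fixed("mail") == &default_entry);
}
```

In the buggy shape a caller cannot tell a configured domain from a missing one, which is the symptom the commit message reports.
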
    • [NET]: Fix comment for skb_pull_rcsum · fee54fa5
      Committed by Urs Thuermann
      Fix comment for skb_pull_rcsum
      Signed-off-by: Urs Thuermann <urs@isnogud.escape.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [IPV6]: Fix IPsec datagram fragmentation · 28a89453
      Committed by Herbert Xu
      This is a long-standing bug in the IPsec IPv6 code that breaks
      when we emit an IPsec tunnel-mode datagram packet.  The problem
      is that the code that emits the packet assumes the IPv6 stack
      will fragment it later, but the IPv6 stack assumes that whoever
      is emitting the packet is going to pre-fragment the packet.
      
      In the long term we need to fix both sides, e.g., to get the
      datagram code to pre-fragment as well as to get the IPv6 stack
      to fragment locally generated tunnel-mode packets.
      
      For now this patch does the second part, which should make it
      work for the IPsec host case.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NDISC]: Fix race in generic address resolution · 69cc64d8
      Committed by David S. Miller
      Frank Blaschka provided the bug report and the initial suggested fix
      for this bug.  He also validated this version of the fix.
      
      The problem is that access to neigh->arp_queue is inconsistent: we
      grab references when dropping the lock to call
      neigh->ops->solicit(), but this does not prevent other threads of
      control from trying to send out that packet at the same time, causing
      corruption because both code paths believe they have exclusive access
      to the skb.
      
      The best option seems to be to hold the write lock on neigh->lock
      during the ->solicit() call.  I looked at all of the ndisc_ops
      implementations and this seems workable.  The only case that needs
      special care is the IPV4 ARP implementation of arp_solicit().  It
      wants to take neigh->lock as a reader to protect the header entry in
      neigh->ha during the emission of the solicitation.  We can simply
      remove the read lock calls to take care of that, since holding the lock
      as a writer at the caller provides a superset of the protection
      afforded by the existing read locking.
      
      The rest of the ->solicit() implementations don't care whether the
      neigh is locked or not.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [AX25] ax25_ds_timer: use mod_timer instead of add_timer · e848b583
      Committed by Jarek Poplawski
      This patch changes the current use of init_timer(), add_timer()
      and del_timer() to setup_timer() with mod_timer(), which
      should be safer anyway.
      Reported-by: Jann Traschewski <jann@gmx.de>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [AX25] ax25_timer: use mod_timer instead of add_timer · 21fab4a8
      Committed by Jarek Poplawski
      According to one of Jann's OOPS reports it looks like
      BUG_ON(timer_pending(timer)) triggers during add_timer()
      in ax25_start_t1timer(). This patch changes the current use
      of init_timer(), add_timer() and del_timer() to
      setup_timer() with mod_timer(), which should be safer
      anyway.
      Reported-by: Jann Traschewski <jann@gmx.de>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
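
Why mod_timer() is the safer call can be modelled in user space. This is only a sketch under stated assumptions: struct model_timer and the model_* functions are hypothetical stand-ins, not the kernel timer API. The model reflects two facts: arming an already-pending timer with add_timer() is treated as a bug (the BUG_ON quoted above), while mod_timer() re-arms the timer whether or not it is pending and reports whether it was.

```c
#include <assert.h>

/* Hypothetical user-space model of a one-shot timer. */
struct model_timer {
    int pending;            /* 1 while the timer is armed */
    unsigned long expires;  /* expiry time in arbitrary ticks */
};

/*
 * Models add_timer(): the timer must not already be pending.
 * The kernel would hit BUG_ON(); the model returns -1 instead.
 */
int model_add_timer(struct model_timer *t, unsigned long expires)
{
    if (t->pending)
        return -1;
    t->expires = expires;
    t->pending = 1;
    return 0;
}

/*
 * Models mod_timer(): (re)arms the timer in either state.
 * Returns 1 if the timer was pending before the call, else 0.
 */
int model_mod_timer(struct model_timer *t, unsigned long expires)
{
    int was_pending = t->pending;

    t->expires = expires;
    t->pending = 1;
    return was_pending;
}

/* Starting the same timer twice: add fails, mod simply updates expiry. */
void timer_self_check(void)
{
    struct model_timer t1 = { 0, 0 };

    assert(model_add_timer(&t1, 100) == 0);
    assert(model_add_timer(&t1, 200) == -1); /* the BUG_ON case */
    assert(model_mod_timer(&t1, 200) == 1);  /* safe re-arm */
    assert(t1.expires == 200);
}
```

With the mod_timer() pattern, a start routine that races with an earlier, still-pending start updates the expiry instead of tripping the assertion.
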
    • [AX25] ax25_route: make ax25_route_lock BH safe · 4de211f1
      Committed by Jarek Poplawski
      > =================================
      > [ INFO: inconsistent lock state ]
      > 2.6.24-dg8ngn-p02 #1
      > ---------------------------------
      > inconsistent {softirq-on-W} -> {in-softirq-R} usage.
      > linuxnet/3046 [HC0[0]:SC1[2]:HE1:SE0] takes:
      >  (ax25_route_lock){--.+}, at: [<f8a0cfb7>] ax25_get_route+0x18/0xb7 [ax25]
      > {softirq-on-W} state was registered at:
      ...
      
      This lockdep report shows that ax25_route_lock is taken for reading in
      softirq context, and for writing in process context with BHs enabled.
      So, to make this safe, all write_locks in ax25_route.c are changed to
      _bh versions.
      
      Reported-by: Jann Traschewski <jann@gmx.de>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [AX25] af_ax25: remove sock lock in ax25_info_show() · 1105b5d1
      Committed by Jarek Poplawski
      This lockdep warning:
      
      > =======================================================
      > [ INFO: possible circular locking dependency detected ]
      > 2.6.24 #3
      > -------------------------------------------------------
      > swapper/0 is trying to acquire lock:
      >  (ax25_list_lock){-+..}, at: [<f91dd3b1>] ax25_destroy_socket+0x171/0x1f0 [ax25]
      >
      > but task is already holding lock:
      >  (slock-AF_AX25){-+..}, at: [<f91dbabc>] ax25_std_heartbeat_expiry+0x1c/0xe0 [ax25]
      >
      > which lock already depends on the new lock.
      ...
      
      shows that ax25_list_lock and slock-AF_AX25 are taken in different
      orders: ax25_info_show() takes slock (bh_lock_sock(ax25->sk)) while
      ax25_list_lock is held, the reverse of the order used by other
      functions. To fix this, the sock lock would have to move to
      ax25_info_start(), and there would still be a problem with breaking
      ax25_list_lock (it seems this "proper" order isn't optimal yet). But
      since this is only for reading proc info, the sock lock does not seem
      necessary (e.g. ax25_send_to_raw() does similar reading without this
      lock too).
      
      So this patch removes the sock lock to avoid the possibility of
      deadlock. It also uses the sock_i_ino() function, which reads
      sk_socket under the proper read lock. Additionally, the printf format
      for this i_ino is changed to %lu.
      Reported-by: Bernard Pidoux F6BVP <f6bvp@free.fr>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: /proc/net/route performance improvement · 8315f5d8
      Committed by Stephen Hemminger
      Use key/offset caching to change /proc/net/route (used by iputils route)
      from O(n^2) to O(n). This improves the time for 160,000 routes from
      30 seconds to 1 second.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
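
The O(n^2) to O(n) effect of caching the last position can be sketched on a plain linked list. This is a simplified model, not the fib_trie code: struct node, struct cursor and the seek_* helpers are hypothetical, and the cursor caches a node pointer for brevity, whereas the commit title says key/offset caching (a raw pointer would only be safe if the structure cannot change between reads). The sketch assumes a reader that asks for positions 0, 1, 2, ... in turn, as a sequential file read does.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int key;
    struct node *next;
};

/* Remembers where the previous seek ended. */
struct cursor {
    struct node *cached;
    size_t cached_pos;
};

static size_t steps; /* number of list hops, for comparison */

/* Naive seek: always walks from the head, O(pos) per call. */
static struct node *seek_naive(struct node *head, size_t pos)
{
    struct node *n = head;

    for (size_t i = 0; n != NULL && i < pos; i++) {
        n = n->next;
        steps++;
    }
    return n;
}

/* Cached seek: resumes from the last position when moving forward. */
static struct node *seek_cached(struct cursor *c, struct node *head, size_t pos)
{
    struct node *n = head;
    size_t i = 0;

    if (c->cached != NULL && c->cached_pos <= pos) {
        n = c->cached;
        i = c->cached_pos;
    }
    for (; n != NULL && i < pos; i++) {
        n = n->next;
        steps++;
    }
    if (n != NULL) {
        c->cached = n;
        c->cached_pos = pos;
    }
    return n;
}

#define MAX_NODES 1000
static struct node nodes[MAX_NODES];

/* Visits every position once and returns the total number of hops. */
size_t walk_all(int use_cache)
{
    struct cursor c = { NULL, 0 };

    for (size_t i = 0; i < MAX_NODES; i++) {
        nodes[i].key = (int)i;
        nodes[i].next = (i + 1 < MAX_NODES) ? &nodes[i + 1] : NULL;
    }
    steps = 0;
    for (size_t pos = 0; pos < MAX_NODES; pos++) {
        struct node *n = use_cache ? seek_cached(&c, nodes, pos)
                                   : seek_naive(nodes, pos);
        assert(n != NULL && n->key == (int)pos);
    }
    return steps;
}
```

For 1000 entries the naive walk performs 499500 hops (0 + 1 + ... + 999), while the cached walk performs 999, one per entry after the first.
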
    • fib_trie: handle empty tree · ec28cf73
      Committed by Stephen Hemminger
      This fixes possible problems when trie_firstleaf() returns NULL
      to trie_leafindex().
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [IPV4]: Remove IP_TOS setting privilege checks. · e4f8b5d4
      Committed by David S. Miller
      Various RFCs have all sorts of things to say about the CS field of the
      DSCP value.  In particular they try to make the distinction between
      values that should be used by "user applications" and things like
      routing daemons.
      
      This seems to have influenced the CAP_NET_ADMIN check which exists for
      IP_TOS socket option settings, but in fact it has an off-by-one error
      so it wasn't allowing CS5 which is meant for "user applications" as
      well.
      
      Further adding to the inconsistency and brokenness here, IPV6 does not
      validate the DSCP values specified for the IPV6_TCLASS socket option.
      
      The actual uses of these TOS values are ultimately system specific,
      and these RFC recommendations are just that, "a recommendation".
      In fact the standards very purposefully use "SHOULD" and
      "SHOULD NOT" when describing how these values can be used.
      
      In the final analysis the only clean way to provide consistency here
      is to remove the CAP_NET_ADMIN check.  The alternatives just don't
      work out:
      
      1) If we add the CAP_NET_ADMIN check to ipv6, this can break existing
         setups.
      
      2) If we just fix the off-by-one error in the class comparison in
         IPV4, certain DSCP values can be used in IPV6 but not IPV4 by
         default.  So people will just ask for a sysctl to override
         that.
      
      I checked several other freely available kernel trees and they
      do not make any privilege checks in this area like we do.  For
      the BSD stacks, this goes back all the way to Stevens Volume 2
      and beyond.
      Signed-off-by: David S. Miller <davem@davemloft.net>
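
The off-by-one mentioned in this commit can be made concrete with the TOS byte layout, where the class selector occupies the top three bits (CS5 is 0xa0, CS6 is 0xc0). The functions below are a hypothetical sketch: the commit message only says the comparison excluded CS5, so the exact expressions and all names here are assumptions, not the kernel's code.

```c
#include <assert.h>

#define TOS_PREC_MASK 0xe0 /* top three bits: precedence / class selector */
#define TOS_CS4       0x80
#define TOS_CS5       0xa0
#define TOS_CS6       0xc0

/* Assumed shape of the old check: ">=" also catches CS5. */
int needs_privilege_off_by_one(int tos)
{
    return (tos & TOS_PREC_MASK) >= TOS_CS5;
}

/* Alternative 2 from the message: fix the comparison so CS5 is allowed. */
int needs_privilege_above_cs5(int tos)
{
    return (tos & TOS_PREC_MASK) > TOS_CS5;
}

/* What the commit does instead: no privilege check at all. */
int needs_privilege_removed(int tos)
{
    (void)tos;
    return 0;
}

/* CS5 is the only class on which the first two variants disagree. */
void tos_self_check(void)
{
    assert(needs_privilege_off_by_one(TOS_CS4) == 0);
    assert(needs_privilege_off_by_one(TOS_CS5) == 1);
    assert(needs_privilege_above_cs5(TOS_CS5) == 0);
    assert(needs_privilege_above_cs5(TOS_CS6) == 1);
    assert(needs_privilege_removed(TOS_CS6) == 0);
}
```

The low five bits of the TOS byte do not affect any of the variants, since each one masks with TOS_PREC_MASK before comparing.
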
  2. 11 Feb, 2008 1 commit
  3. 10 Feb, 2008 10 commits
  4. 09 Feb, 2008 2 commits
    • [PKT_SCHED] ematch: oops from uninitialized variable (resend) · 268bcca1
      Committed by Stephen Hemminger
      Setting up a meta match causes a kernel OOPS because of uninitialized
      elements in the tree.
      
      [   37.322381] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [   37.322381] IP: [<ffffffff883fc717>] :em_meta:em_meta_destroy+0x17/0x80
      
      [   37.322381] Call Trace:
      [   37.322381]  [<ffffffff803ec83d>] tcf_em_tree_destroy+0x2d/0xa0
      [   37.322381]  [<ffffffff803ecc8c>] tcf_em_tree_validate+0x2dc/0x4a0
      [   37.322381]  [<ffffffff803f06d2>] nla_parse+0x92/0xe0
      [   37.322381]  [<ffffffff883f9672>] :cls_basic:basic_change+0x202/0x3c0
      [   37.322381]  [<ffffffff802a3917>] kmem_cache_alloc+0x67/0xa0
      [   37.322381]  [<ffffffff803ea221>] tc_ctl_tfilter+0x3b1/0x580
      [   37.322381]  [<ffffffff803dffd0>] rtnetlink_rcv_msg+0x0/0x260
      [   37.322381]  [<ffffffff803ee944>] netlink_rcv_skb+0x74/0xa0
      [   37.322381]  [<ffffffff803dffc8>] rtnetlink_rcv+0x18/0x20
      [   37.322381]  [<ffffffff803ee6c3>] netlink_unicast+0x263/0x290
      [   37.322381]  [<ffffffff803cf276>] __alloc_skb+0x96/0x160
      [   37.322381]  [<ffffffff803ef014>] netlink_sendmsg+0x274/0x340
      [   37.322381]  [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140
      [   37.322381]  [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30
      [   37.322381]  [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30
      [   37.322381]  [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140
      [   37.322381]  [<ffffffff80288611>] zone_statistics+0xb1/0xc0
      [   37.322381]  [<ffffffff803c7e5e>] sys_sendmsg+0x20e/0x360
      [   37.322381]  [<ffffffff803c7411>] sockfd_lookup_light+0x41/0x80
      [   37.322381]  [<ffffffff8028d04b>] handle_mm_fault+0x3eb/0x7f0
      [   37.322381]  [<ffffffff8020c2fb>] system_call_after_swapgs+0x7b/0x80
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • namespaces: mark NET_NS with "depends on NAMESPACES" · cbdc7387
      Committed by Pavel Emelyanov
      There's already an option controlling the net namespaces cloning code, so make
      it work the same way as all the other namespaces' options.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 08 Feb, 2008 14 commits