1. 30 3月, 2007 1 次提交
  2. 28 3月, 2007 1 次提交
  3. 26 3月, 2007 3 次提交
    • D
      [IPV6]: Fix routing round-robin locking. · f11e6659
      David S. Miller 提交于
      As per RFC2461, section 6.3.6, item #2, when no routers on the
      matching list are known to be reachable or probably reachable we
      do round robin on those available routes so that we make sure
      to probe as many of them as possible to detect when one becomes
      reachable faster.
      
      Each routing table has a rwlock protecting the tree and the linked
      list of routes at each leaf.  The round robin code executes during
      lookup and thus with the rwlock taken as a reader.  A small local
      spinlock tries to provide protection but this does not work at all
      for two reasons:
      
      1) The round-robin list manipulation, as coded, goes like this (with
         read lock held):
      
      	walk routes finding head and tail
      
      	spin_lock();
      	rotate list using head and tail
      	spin_unlock();
      
         While one thread is rotating the list, another thread can
         end up with stale values of head and tail and then proceed
         to corrupt the list when it gets the lock.  This ends up causing
         the OOPS in fib6_add() later onthat many people have been hitting.
      
      2) All the other code paths that run with the rwlock held as
         a reader do not expect the list to change on them, they
         expect it to remain completely fixed while they hold the
         lock in that way.
      
      So, simply stated, it is impossible to implement this correctly using
      a manipulation of the list without violating the rwlock locking
      semantics.
      
      Reimplement using a per-fib6_node round-robin pointer.  This way we
      don't need to manipulate the list at all, and since the round-robin
      pointer can only ever point to real existing entries we don't need
      to perform any locking on the changing of the round-robin pointer
      itself.  We only need to reset the round-robin pointer to NULL when
      the entry it is pointing to is removed.
      
      The idea is from Thomas Graf and it is very similar to how this
      was implemented before the advanced router selection code when in.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f11e6659
    • A
      [NET]: Fix neighbour destructor handling. · ecbb4169
      Alexey Kuznetsov 提交于
      ->neigh_destructor() is killed (not used), replaced with
      ->neigh_cleanup(), which is called when neighbor entry goes to dead
      state. At this point everything is still valid: neigh->dev,
      neigh->parms etc.
      
      The device should guarantee that dead neighbor entries (neigh->dead !=
      0) do not get private part initialized, otherwise nobody will cleanup
      it.
      
      I think this is enough for ipoib which is the only user of this thing.
      Initialization private part of neighbor entries happens in ipib
      start_xmit routine, which is not reached when device is down.  But it
      would be better to add explicit test for neigh->dead in any case.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ecbb4169
    • T
      [NET]: Fix fib_rules compatibility breakage · e1701c68
      Thomas Graf 提交于
      Based upon a patch from Patrick McHardy.
      
      The fib_rules netlink attribute policy introduced in 2.6.19 broke
      userspace compatibilty. When specifying a rule with "from all"
      or "to all", iproute adds a zero byte long netlink attribute,
      but the policy requires all addresses to have a size equal to
      sizeof(struct in_addr)/sizeof(struct in6_addr), resulting in a
      validation error.
      
      Check attribute length of FRA_SRC/FRA_DST in the generic framework
      by letting the family specific rules implementation provide the
      length of an address. Report an error if address length is non
      zero but no address attribute is provided. Fix actual bug by
      checking address length for non-zero instead of relying on
      availability of attribute.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e1701c68
  4. 20 3月, 2007 2 次提交
  5. 08 3月, 2007 1 次提交
    • E
      [IPSEC]: xfrm_policy delete security check misplaced · ef41aaa0
      Eric Paris 提交于
      The security hooks to check permissions to remove an xfrm_policy were
      actually done after the policy was removed.  Since the unlinking and
      deletion are done in xfrm_policy_by* functions this moves the hooks
      inside those 2 functions.  There we have all the information needed to
      do the security check and it can be done before the deletion.  Since
      auditing requires the result of that security check err has to be passed
      back and forth from the xfrm_policy_by* functions.
      
      This patch also fixes a bug where a deletion that failed the security
      check could cause improper accounting on the xfrm_policy
      (xfrm_get_policy didn't have a put on the exit path for the hold taken
      by xfrm_policy_by*)
      
      It also fixes the return code when no policy is found in
      xfrm_add_pol_expire.  In old code (at least back in the 2.6.18 days) err
      wasn't used before the return when no policy is found and so the
      initialization would cause err to be ENOENT.  But since err has since
      been used above when we don't get a policy back from the xfrm_policy_by*
      function we would always return 0 instead of the intended ENOENT.  Also
      fixed some white space damage in the same area.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Acked-by: NVenkat Yekkirala <vyekkirala@trustedcs.com>
      Acked-by: NJames Morris <jmorris@namei.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ef41aaa0
  6. 07 3月, 2007 1 次提交
  7. 06 3月, 2007 2 次提交
    • E
      187f5f84
    • P
      [NETFILTER]: conntrack: fix {nf,ip}_ct_iterate_cleanup endless loops · ec68e97d
      Patrick McHardy 提交于
      Fix {nf,ip}_ct_iterate_cleanup unconfirmed list handling:
      
      - unconfirmed entries can not be killed manually, they are removed on
        confirmation or final destruction of the conntrack entry, which means
        we might iterate forever without making forward progress.
      
        This can happen in combination with the conntrack event cache, which
        holds a reference to the conntrack entry, which is only released when
        the packet makes it all the way through the stack or a different
        packet is handled.
      
      - taking references to an unconfirmed entry and using it outside the
        locked section doesn't work, the list entries are not refcounted and
        another CPU might already be waiting to destroy the entry
      
      What the code really wants to do is make sure the references of the hash
      table to the selected conntrack entries are released, so they will be
      destroyed once all references from skbs and the event cache are dropped.
      
      Since unconfirmed entries haven't even entered the hash yet, simply mark
      them as dying and skip confirmation based on that.
      
      Reported and tested by Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec68e97d
  8. 03 3月, 2007 1 次提交
    • W
      [NET]: Fix bugs in "Whether sock accept queue is full" checking · 8488df89
      Wei Dong 提交于
      	when I use linux TCP socket, and find there is a bug in function
      sk_acceptq_is_full().
      
      	When a new SYN comes, TCP module first checks its validation. If valid,
      send SYN,ACK to the client and add the sock to the syn hash table. Next
      time if received the valid ACK for SYN,ACK from the client. server will
      accept this connection and increase the sk->sk_ack_backlog -- which is
      done in function tcp_check_req().We check wether acceptq is full in
      function tcp_v4_syn_recv_sock().
      
      Consider an example:
      
       After listen(sockfd, 1) system call, sk->sk_max_ack_backlog is set to
      1. As we know, sk->sk_ack_backlog is initialized to 0. Assuming accept()
      system call is not invoked now.
      
      1. 1st connection comes. invoke sk_acceptq_is_full(). sk-
      >sk_ack_backlog=0 sk->sk_max_ack_backlog=1, function return 0 accept
      this connection. Increase the sk->sk_ack_backlog
      2. 2nd connection comes. invoke sk_acceptq_is_full(). sk-
      >sk_ack_backlog=1 sk->sk_max_ack_backlog=1, function return 0 accept
      this connection. Increase the sk->sk_ack_backlog
      3. 3rd connection comes. invoke sk_acceptq_is_full(). sk-
      >sk_ack_backlog=2 sk->sk_max_ack_backlog=1, function return 1. Refuse
      this connection.
      
      I think it has bugs. after listen system call. sk->sk_max_ack_backlog=1
      but now it can accept 2 connections.
      Signed-off-by: NWei Dong <weid@np.css.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8488df89
  9. 01 3月, 2007 1 次提交
    • P
      [NET]: Handle disabled preemption in gfp_any() · 4498121c
      Patrick McHardy 提交于
      ctnetlink uses netlink_unicast from an atomic_notifier_chain
      (which is called within a RCU read side critical section)
      without holding further locks. netlink_unicast calls netlink_trim
      with the result of gfp_any() for the gfp flags, which are passed
      down to pskb_expand_header. gfp_any() only checks for softirq
      context and returns GFP_KERNEL, resulting in this warning:
      
      BUG: sleeping function called from invalid context at mm/slab.c:3032
      in_atomic():1, irqs_disabled():0
      no locks held by rmmod/7010.
      
      Call Trace:
       [<ffffffff8109467f>] debug_show_held_locks+0x9/0xb
       [<ffffffff8100b0b4>] __might_sleep+0xd9/0xdb
       [<ffffffff810b5082>] __kmalloc+0x68/0x110
       [<ffffffff811ba8f2>] pskb_expand_head+0x4d/0x13b
       [<ffffffff81053147>] netlink_broadcast+0xa5/0x2e0
       [<ffffffff881cd1d7>] :nfnetlink:nfnetlink_send+0x83/0x8a
       [<ffffffff8834f6a6>] :nf_conntrack_netlink:ctnetlink_conntrack_event+0x94c/0x96a
       [<ffffffff810624d6>] notifier_call_chain+0x29/0x3e
       [<ffffffff8106251d>] atomic_notifier_call_chain+0x32/0x60
       [<ffffffff881d266d>] :nf_conntrack:destroy_conntrack+0xa5/0x1d3
       [<ffffffff881d194e>] :nf_conntrack:nf_ct_cleanup+0x8c/0x12c
       [<ffffffff881d4614>] :nf_conntrack:kill_l3proto+0x0/0x13
       [<ffffffff881d482a>] :nf_conntrack:nf_conntrack_l3proto_unregister+0x90/0x94
       [<ffffffff883551b3>] :nf_conntrack_ipv4:nf_conntrack_l3proto_ipv4_fini+0x2b/0x5d
       [<ffffffff8109d44f>] sys_delete_module+0x1b5/0x1e6
       [<ffffffff8105f245>] trace_hardirqs_on_thunk+0x35/0x37
       [<ffffffff8105911e>] system_call+0x7e/0x83
      
      Since netlink_unicast is supposed to be callable from within RCU
      read side critical sections, make gfp_any() check for in_atomic()
      instead of in_softirq().
      
      Additionally nfnetlink_send needs to use gfp_any() as well for the
      call to netlink_broadcast).
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4498121c
  10. 27 2月, 2007 1 次提交
  11. 14 2月, 2007 2 次提交
  12. 13 2月, 2007 4 次提交
  13. 11 2月, 2007 5 次提交
  14. 09 2月, 2007 11 次提交
  15. 26 1月, 2007 1 次提交
  16. 24 1月, 2007 2 次提交
  17. 05 1月, 2007 1 次提交