  1. 07 Dec, 2011 1 commit
  2. 06 Dec, 2011 1 commit
  3. 04 Dec, 2011 4 commits
  4. 02 Dec, 2011 1 commit
  5. 01 Dec, 2011 2 commits
  6. 29 Nov, 2011 2 commits
    • net: dont call jump_label_dec from irq context · b90e5794
      Eric Dumazet committed
      Igor Maravic reported an error caused by jump_label_dec() being called
      from IRQ context:
      
       BUG: sleeping function called from invalid context at kernel/mutex.c:271
       in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper
       1 lock held by swapper/0:
        #0:  (&n->timer){+.-...}, at: [<ffffffff8107ce90>] call_timer_fn+0x0/0x340
       Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1
      Call Trace:
       <IRQ>  [<ffffffff8104f417>] __might_sleep+0x137/0x1f0
       [<ffffffff816b9a2f>] mutex_lock_nested+0x2f/0x370
       [<ffffffff810a89fd>] ? trace_hardirqs_off+0xd/0x10
       [<ffffffff8109a37f>] ? local_clock+0x6f/0x80
       [<ffffffff810a90a5>] ? lock_release_holdtime.part.22+0x15/0x1a0
       [<ffffffff81557929>] ? sock_def_write_space+0x59/0x160
       [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
       [<ffffffff810969cd>] atomic_dec_and_mutex_lock+0x5d/0x80
       [<ffffffff8112fc1d>] jump_label_dec+0x1d/0x50
       [<ffffffff81566525>] net_disable_timestamp+0x15/0x20
       [<ffffffff81557a75>] sock_disable_timestamp+0x45/0x50
       [<ffffffff81557b00>] __sk_free+0x80/0x200
       [<ffffffff815578d0>] ? sk_send_sigurg+0x70/0x70
       [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
       [<ffffffff81557cba>] sock_wfree+0x3a/0x70
       [<ffffffff8155c2b0>] skb_release_head_state+0x70/0x120
       [<ffffffff8155c0b6>] __kfree_skb+0x16/0x30
       [<ffffffff8155c119>] kfree_skb+0x49/0x170
       [<ffffffff815e936e>] arp_error_report+0x3e/0x90
       [<ffffffff81575bd9>] neigh_invalidate+0x89/0xc0
       [<ffffffff81578dbe>] neigh_timer_handler+0x9e/0x2a0
       [<ffffffff81578d20>] ? neigh_update+0x640/0x640
       [<ffffffff81073558>] __do_softirq+0xc8/0x3a0
      
      Since jump_label_{inc|dec} must be called from process context only,
      we must defer jump_label_dec() if net_disable_timestamp() is called
      from interrupt context.
      Reported-by: Igor Maravic <igorm@etf.rs>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
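The deferral described in this commit message can be sketched in user space as follows. This is only a model under stated assumptions: `netstamp_needed`, `netstamp_needed_deferred`, `label_inc()` and `label_dec()` are hypothetical stand-ins for the kernel's jump label and its helpers, and the "are we in IRQ context" check is passed in as a plain flag instead of being queried from the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the jump label reference count. */
static int netstamp_needed;
/* Decrements that were requested from IRQ context and postponed. */
static atomic_int netstamp_needed_deferred;

/* Stand-ins for jump_label_inc()/jump_label_dec(), which may sleep. */
static void label_inc(void) { netstamp_needed++; }
static void label_dec(void) { netstamp_needed--; }

/* Always called from process context: flush postponed decrements first. */
void enable_timestamp(void)
{
    int deferred = atomic_exchange(&netstamp_needed_deferred, 0);

    while (deferred-- > 0)
        label_dec();
    label_inc();
}

/* May be called from any context; in_irq models the context check. */
void disable_timestamp(bool in_irq)
{
    if (in_irq) {
        /* Must not sleep here: only record that a decrement is owed. */
        atomic_fetch_add(&netstamp_needed_deferred, 1);
        return;
    }
    label_dec();
}

int timestamp_users(void) { return netstamp_needed; }
int deferred_count(void) { return atomic_load(&netstamp_needed_deferred); }
```

In this model a disable request made with `in_irq` set leaves the count untouched until the next `enable_timestamp()` call, which applies the owed decrement before taking its own reference.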
    • ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given. · 2a38e6d5
      Li Wei committed
      We need to set np->mcast_hops to its default value at this moment;
      otherwise, when we later use it and find its value is -1, the logic to
      get the default hop limit doesn't take multicast into account and will
      return the wrong hop limit (IPV6_DEFAULT_HOPLIMIT), which is for unicast.
      Signed-off-by: Li Wei <lw@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
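A minimal sketch of the idea, assuming a simplified socket structure: `struct pinfo_model` and `set_mcast_hops()` are hypothetical names, the range check and the `-1` error return are illustrative, and the two constants carry the values I understand the kernel headers to use (64 for unicast, 1 for multicast).

```c
#define IPV6_DEFAULT_HOPLIMIT  64 /* unicast default, assumed value */
#define IPV6_DEFAULT_MCASTHOPS 1  /* multicast default, assumed value */

/* Hypothetical, trimmed-down stand-in for the per-socket IPv6 state. */
struct pinfo_model {
    int hop_limit;
    int mcast_hops;
};

/*
 * Model of the setsockopt path: -1 means "use the default", so resolve it
 * to the multicast default immediately instead of storing -1.
 */
int set_mcast_hops(struct pinfo_model *np, int val)
{
    if (val < -1 || val > 255)
        return -1; /* illustrative error value */

    np->mcast_hops = (val == -1) ? IPV6_DEFAULT_MCASTHOPS : val;
    return 0;
}
```

Resolving the default at set time means later readers of `mcast_hops` never see -1 and so never fall back to the unicast value.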
  7. 27 Nov, 2011 3 commits
  8. 24 Nov, 2011 3 commits
  9. 23 Nov, 2011 2 commits
  10. 19 Nov, 2011 1 commit
    • ipv6: Remove all uses of LL_ALLOCATED_SPACE · a7ae1992
      Herbert Xu committed
      The macro LL_ALLOCATED_SPACE was ill-conceived.  It applies the
      alignment to the sum of needed_headroom and needed_tailroom.  As
      the amount that is then reserved for head room is needed_headroom
      with alignment, this means that the tail room left may be too small.
      
      This patch replaces all uses of LL_ALLOCATED_SPACE in net/ipv6
      with the macro LL_RESERVED_SPACE and direct reference to
      needed_tailroom.
      
      This also fixes the problem with needed_headroom changing between
      allocating the skb and reserving the head room.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
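The arithmetic behind this complaint can be worked through with a small model. The two functions below mirror how I understand the macros to have been defined (round down to a multiple of `HH_DATA_MOD`, then add `HH_DATA_MOD`); the struct, the function names and the exact formulas are assumptions for illustration, not copies of the kernel headers.

```c
#define HH_DATA_MOD 16u

/* Hypothetical subset of the device fields that the macros consult. */
struct dev_model {
    unsigned hard_header_len;
    unsigned needed_headroom;
    unsigned needed_tailroom;
};

/* Round down to a multiple of HH_DATA_MOD, then add one full unit. */
static unsigned ll_align(unsigned len)
{
    return (len & ~(HH_DATA_MOD - 1u)) + HH_DATA_MOD;
}

/* Head room that gets reserved in front of the packet data. */
unsigned ll_reserved_space(const struct dev_model *d)
{
    return ll_align(d->hard_header_len + d->needed_headroom);
}

/* Old approach: alignment applied to head room and tail room together. */
unsigned ll_allocated_space(const struct dev_model *d)
{
    return ll_align(d->hard_header_len + d->needed_headroom +
                    d->needed_tailroom);
}

/* Tail room that remains once the aligned head room has been reserved. */
unsigned tail_room_old(const struct dev_model *d)
{
    return ll_allocated_space(d) - ll_reserved_space(d);
}

/* New approach: aligned head room plus the tail room, added explicitly. */
unsigned alloc_extra_new(const struct dev_model *d)
{
    return ll_reserved_space(d) + d->needed_tailroom;
}
```

For a device with `hard_header_len = 14`, no extra head room and `needed_tailroom = 1`, both old-style values come out as 16, so `tail_room_old()` is 0 even though one byte was requested. The explicit sum gives 17, which always covers the tail room.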
  11. 17 Nov, 2011 3 commits
  12. 16 Nov, 2011 1 commit
  13. 15 Nov, 2011 2 commits
  14. 14 Nov, 2011 3 commits
    • neigh: new unresolved queue limits · 8b5c171b
      Eric Dumazet committed
      On Wednesday 09 November 2011 at 16:21 -0500, David Miller wrote:
      > From: David Miller <davem@davemloft.net>
      > Date: Wed, 09 Nov 2011 16:16:44 -0500 (EST)
      >
      > > From: Eric Dumazet <eric.dumazet@gmail.com>
      > > Date: Wed, 09 Nov 2011 12:14:09 +0100
      > >
      > >> unres_qlen is the number of frames we are able to queue per unresolved
      > >> neighbour. Its default value (3) was never changed and is responsible
      > >> for strange drops, especially if IP fragments are used, or multiple
      > >> sessions start in parallel. Even a single tcp flow can hit this limit.
      > >  ...
      > >
      > > Ok, I've applied this, let's see what happens :-)
      >
      > Early answer, build fails.
      >
      > Please test build this patch with DECNET enabled and resubmit.  The
      > decnet neigh layer still refers to the removed ->queue_len member.
      >
      > Thanks.
      
      Ouch, this was fixed on one machine yesterday, but not the other one I
      used this morning, sorry.
      
      [PATCH V5 net-next] neigh: new unresolved queue limits
      
      unres_qlen is the number of frames we are able to queue per unresolved
      neighbour. Its default value (3) was never changed and is responsible
      for strange drops, especially if IP fragments are used, or multiple
      sessions start in parallel. Even a single tcp flow can hit this limit.
      
      $ arp -d 192.168.20.108 ; ping -c 2 -s 8000 192.168.20.108
      PING 192.168.20.108 (192.168.20.108) 8000(8028) bytes of data.
      8008 bytes from 192.168.20.108: icmp_seq=2 ttl=64 time=0.322 ms
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip6_tunnel: copy parms.name after register_netdevice · 731abb9c
      Josh Boyer committed
      Commit 1c5cae81 removed an explicit call to dev_alloc_name in ip6_tnl_create
      because register_netdevice will now create a valid name.  This works for the
      net_device itself.
      
      However the tunnel keeps a copy of the name in the parms structure for the
      ip6_tnl associated with the tunnel.  parms.name is set by copying the net_device
      name in ip6_tnl_dev_init_gen.  That function is called from ip6_tnl_dev_init in
      ip6_tnl_create, but it is done before register_netdevice is called so the name
      is set to a bogus value in the parms.name structure.
      
      This shows up if you do a simple tunnel add, followed by a tunnel show:
      
      [root@localhost ~]# ip -6 tunnel add remote fec0::100 local fec0::200
      [root@localhost ~]# ip -6 tunnel show
      ip6tnl0: ipv6/ipv6 remote :: local :: encaplimit 0 hoplimit 0 tclass 0x00 flowlabel 0x00000 (flowinfo 0x00000000)
      ip6tnl%d: ipv6/ipv6 remote fec0::100 local fec0::200 encaplimit 4 hoplimit 64 tclass 0x00 flowlabel 0x00000 (flowinfo 0x00000000)
      [root@localhost ~]#
      
      Fix this by moving the strcpy out of ip6_tnl_dev_init_gen, and calling it after
      register_netdevice has successfully returned.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Josh Boyer <jwboyer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
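The ordering problem can be illustrated with a user-space model. Everything here is hypothetical scaffolding: `register_model()` only imitates the part of register_netdevice that turns a `%d` template into a concrete name, and the two structs keep nothing but the name field.

```c
#include <stdio.h>
#include <string.h>

#define IFNAMSIZ 16

/* Hypothetical stand-ins holding only the name fields. */
struct net_dev_model   { char name[IFNAMSIZ]; };
struct tnl_parms_model { char name[IFNAMSIZ]; };

/* Imitates name resolution at registration: "ip6tnl%d" -> "ip6tnl1". */
int register_model(struct net_dev_model *dev, int unit)
{
    char resolved[IFNAMSIZ];
    const char *tmpl = strstr(dev->name, "%d");

    if (tmpl) {
        int prefix_len = (int)(tmpl - dev->name);
        int n = snprintf(resolved, sizeof(resolved), "%.*s%d",
                         prefix_len, dev->name, unit);

        if (n < 0 || n >= (int)sizeof(resolved))
            return -1; /* name would not fit */
        strcpy(dev->name, resolved);
    }
    return 0;
}

/* Copy the name into parms only after registration has succeeded. */
int create_tunnel(struct net_dev_model *dev, struct tnl_parms_model *parms,
                  int unit)
{
    int err = register_model(dev, unit);

    if (err)
        return err;
    strcpy(parms->name, dev->name); /* both buffers are IFNAMSIZ bytes */
    return 0;
}
```

Copying before `register_model()` would leave the literal template `ip6tnl%d` in `parms->name`, which matches the bogus name in the tunnel listing above.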
    • ipv6: reduce percpu needs for icmpv6msg mibs · 2a24444f
      Eric Dumazet committed
      Reading /proc/net/snmp6 on a machine with a lot of cpus is very
      expensive (can be ~88000 us).
      
      This is because the ICMPV6MSG MIB uses 4096 bytes per cpu, and folding
      values for all possible cpus can read 16 Mbytes of memory (32 Mbytes on
      non-x86 arches).
      
      ICMP messages are not considered fast path on a typical server, and
      in practice few cpus handle them anyway. We can afford an atomic
      operation instead of using percpu data.
      
      This saves 4096 bytes per cpu and per network namespace.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
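The trade-off can be sketched with C11 atomics. The model below assumes 512 counter slots (chosen so that 512 eight-byte counters match the 4096 bytes quoted above) and uses hypothetical names throughout. It is a user-space illustration of one shared table of atomic counters, not the kernel's MIB code.

```c
#include <stdatomic.h>

#define ICMP6MSG_SLOTS 512u /* assumed slot count, see lead-in */

/* One shared table instead of one table per cpu. */
struct icmp6msg_mib_model {
    atomic_long counters[ICMP6MSG_SLOTS];
};

/* Writers pay for an atomic add, which is acceptable off the fast path. */
void mib_inc(struct icmp6msg_mib_model *mib, unsigned slot)
{
    if (slot < ICMP6MSG_SLOTS)
        atomic_fetch_add(&mib->counters[slot], 1);
}

/* Readers load one value per slot; no folding across cpus is needed. */
long mib_read(struct icmp6msg_mib_model *mib, unsigned slot)
{
    if (slot >= ICMP6MSG_SLOTS)
        return 0;
    return atomic_load(&mib->counters[slot]);
}
```

With per-cpu tables a reader has to sum every slot across all possible cpus. As an illustration, 4096 possible cpus at 4096 bytes each gives the 16 Mbytes quoted above, whereas the shared table is read once.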
  15. 13 Nov, 2011 1 commit
  16. 10 Nov, 2011 3 commits
    • ipv4: PKTINFO doesnt need dst reference · d826eb14
      Eric Dumazet committed
      On Monday 07 November 2011 at 15:33 +0100, Eric Dumazet wrote:
      
      > At least, in recent kernels we don't change dst->refcnt in the forwarding
      > path (using NOREF skb->dst)
      >
      > One particular point is the atomic_inc(dst->refcnt) we have to perform
      > when queuing a UDP packet if the socket asked for PKTINFO stuff (for example a
      > typical DNS server has to set up this option)
      >
      > I have one patch somewhere that stores the information in skb->cb[] and
      > avoids the atomic_{inc|dec}(dst->refcnt).
      >
      
      OK, I found it. I did some extra tests and believe it's ready.
      
      [PATCH net-next] ipv4: IP_PKTINFO doesnt need dst reference
      
      When a socket uses IP_PKTINFO notifications, we currently force a dst
      reference for each received skb. Reader has to access dst to get needed
      information (rt_iif & rt_spec_dst) and must release dst reference.
      
      We also forced a dst reference if skb was put in socket backlog, even
      without IP_PKTINFO handling. This happens under stress/load.
      
      We can instead store the needed information in skb->cb[], so that only
      the softirq handler really accesses dst, improving cache hit ratios.
      
      This removes two atomic operations per packet, and false sharing as
      well.
      
      On a benchmark using a single-threaded receiver (doing only recvmsg()
      calls), I can reach 720,000 pps instead of 570,000 pps.
      
      IP_PKTINFO is typically used by DNS servers, and any multihomed aware
      UDP application.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ah: Read nexthdr value before overwriting it in ahash input callback. · b7ea81a5
      Nick Bowler committed
      The AH4/6 ahash input callbacks read out the nexthdr field from the AH
      header *after* they overwrite that header.  This is obviously not going
      to end well.  Fix it up.
      Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ah: Correctly pass error codes in ahash output callback. · 069294e8
      Nick Bowler committed
      The AH4/6 ahash output callbacks pass nexthdr to xfrm_output_resume
      instead of the error code.  This appears to be a copy+paste error from
      the input case, where nexthdr is expected.  This causes the driver to
      continuously add AH headers to the datagram until either an allocation
      fails and the packet is dropped or the ahash driver hits a synchronous
      fallback and the resulting monstrosity is transmitted.
      
      Correct this issue by simply passing the error code unadulterated.
      Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 09 Nov, 2011 4 commits
  18. 02 Nov, 2011 1 commit
  19. 01 Nov, 2011 2 commits