1. 25 Jun, 2011 1 commit
    • bridge: Only flood unregistered groups to routers · bd4265fe
      Herbert Xu committed
      The bridge currently floods packets to groups that we have never
      seen before to all ports.  This is not required by RFC4541 and
      in fact it is not desirable in environments where traffic to
      unregistered groups is always present.
      
      This patch changes the behaviour so that we only send traffic
      to unregistered groups to ports marked as routers.
      
      The user can always force flooding behaviour to any given port
      by marking it as a router.
      
      Note that this change does not apply to traffic to 224.0.0.X
      as traffic to those groups must always be flooded to all ports.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
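The forwarding rule described in this commit message can be sketched as a small decision function. This is a minimal illustration, not the bridge code itself: `struct port_model`, `is_local_multicast()` and `flood_unregistered()` are hypothetical names, and group addresses are assumed to be IPv4 in host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical port model; the real bridge keeps this state elsewhere. */
struct port_model {
    bool is_router;   /* port has been marked as a router port */
};

/* True for 224.0.0.X (224.0.0.0/24), which must always be flooded. */
static bool is_local_multicast(uint32_t group)
{
    return (group & 0xFFFFFF00u) == 0xE0000000u;
}

/* Decide whether a packet for an unregistered group leaves via port p. */
static bool flood_unregistered(const struct port_model *p, uint32_t group)
{
    if (is_local_multicast(group))
        return true;          /* exempt from the new behaviour */
    return p->is_router;      /* otherwise only router ports get it */
}
```

Marking a port as a router in this model restores the old flood-everywhere behaviour for that port, which matches the escape hatch the commit message mentions.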
  2. 22 Jun, 2011 2 commits
    • udp/recvmsg: Clear MSG_TRUNC flag when starting over for a new packet · 9cfaa8de
      Xufeng Zhang committed
      Consider this scenario: when the first received UDP packet is larger
      than the receive buffer, the MSG_TRUNC bit is set in msg->msg_flags.
      However, if a checksum error occurs and this is a blocking socket, the
      code jumps to try_again to receive the next packet.  If the next UDP
      packet is smaller than the receive buffer, MSG_TRUNC should not be set,
      but because the bit is not cleared in msg->msg_flags before receiving
      the next packet, MSG_TRUNC is still set, which is wrong.
      
      Fix this problem by clearing the MSG_TRUNC flag when starting over for a
      new packet.
      Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
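The stale-flag scenario above can be reproduced in a small user-space model. This is only a sketch under stated assumptions: `MODEL_MSG_TRUNC`, `struct pkt_model` and `recv_model()` are hypothetical stand-ins, and the loop imitates the try_again retry rather than copying the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MSG_TRUNC 0x20   /* stand-in for the MSG_TRUNC bit */

/* Hypothetical queued datagram: its length and whether its checksum is valid. */
struct pkt_model {
    size_t len;
    bool   csum_ok;
};

/* Model of a blocking receive: returns the flags reported for the first
 * packet with a valid checksum, or -1 if the queue runs out. */
static int recv_model(const struct pkt_model *q, size_t n, size_t buflen,
                      bool clear_on_retry)
{
    int msg_flags = 0;

    for (size_t i = 0; i < n; i++) {
        if (clear_on_retry)
            msg_flags &= ~MODEL_MSG_TRUNC;  /* the fix: start each attempt clean */
        if (q[i].len > buflen)
            msg_flags |= MODEL_MSG_TRUNC;   /* this packet does not fit */
        if (q[i].csum_ok)
            return msg_flags;
        /* checksum error on a blocking socket: try the next packet */
    }
    return -1;
}
```

With a too-large packet that fails its checksum followed by a small valid one, the model without the fix still reports the truncation bit for the second packet; with the fix it does not.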
    • ipv6/udp: Use the correct variable to determine non-blocking condition · 32c90254
      Xufeng Zhang committed
      The udpv6_recvmsg() function does not use the correct variable to
      determine whether the socket is in non-blocking operation, which leads
      to unexpected behavior when a UDP checksum error occurs.
      
      Consider a non-blocking udp receive scenario: when udpv6_recvmsg() is
      called by sock_common_recvmsg(), MSG_DONTWAIT bit of flags variable in
      udpv6_recvmsg() is cleared by "flags & ~MSG_DONTWAIT" in this call:
      
          err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
                         flags & ~MSG_DONTWAIT, &addr_len);
      
      i.e. with udpv6_recvmsg() getting these values:
      
      	int noblock = flags & MSG_DONTWAIT
      	int flags = flags & ~MSG_DONTWAIT
      
      So, when a UDP checksum error occurs, execution goes to
      csum_copy_err, and then the problem happens:
      
          csum_copy_err:
                  ...............
                  if (flags & MSG_DONTWAIT)
                          return -EAGAIN;
                  goto try_again;
                  ...............
      
      But it will always go to try_again as MSG_DONTWAIT has been cleared
      from flags at call time -- only noblock contains the original value
      of MSG_DONTWAIT, so the test should be:
      
                  if (noblock)
                          return -EAGAIN;
      
      This is also consistent with what the ipv4/udp code does.
      Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
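The effect of the caller's split can be checked in isolation. The sketch below is a user-space model, not the kernel function: `MODEL_MSG_DONTWAIT`, `struct split_model` and the two `csum_err_*` helpers are hypothetical, and a return value of 0 stands for "goto try_again".

```c
#include <assert.h>
#include <errno.h>

#define MODEL_MSG_DONTWAIT 0x40   /* stand-in for the MSG_DONTWAIT bit */

/* The two values the receive function gets after the caller's split. */
struct split_model {
    int noblock;   /* flags & MSG_DONTWAIT  */
    int flags;     /* flags & ~MSG_DONTWAIT */
};

static struct split_model split_flags(int flags)
{
    struct split_model s = { flags & MODEL_MSG_DONTWAIT,
                             flags & ~MODEL_MSG_DONTWAIT };
    return s;
}

/* Old test: the bit was already cleared, so this never returns -EAGAIN. */
static int csum_err_buggy(struct split_model s)
{
    return (s.flags & MODEL_MSG_DONTWAIT) ? -EAGAIN : 0;
}

/* Fixed test: noblock still carries the original MSG_DONTWAIT value. */
static int csum_err_fixed(struct split_model s)
{
    return s.noblock ? -EAGAIN : 0;
}
```

For a non-blocking caller the buggy variant falls through to the retry path, while the fixed variant returns -EAGAIN as intended.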
  3. 21 Jun, 2011 1 commit
  4. 20 Jun, 2011 1 commit
  5. 19 Jun, 2011 1 commit
  6. 18 Jun, 2011 2 commits
    • inet_diag: fix inet_diag_bc_audit() · eeb14972
      Eric Dumazet committed
      A malicious user or buggy application can inject code and trigger an
      infinite loop in inet_diag_bc_audit().
      
      Also make sure each instruction is aligned on a 4-byte boundary, to
      avoid unaligned accesses.
      Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
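To illustrate the two properties this commit message asks for (guaranteed forward progress and 4-byte alignment), here is a minimal validator over a made-up bytecode format. It is a sketch only: `struct bc_op_model`, `jump_ok()` and `bc_audit_model()` are hypothetical and are not the kernel's inet_diag structures or checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte instruction: 'yes' and 'no' are forward jump
 * distances in bytes, taken when the test passes or fails. */
struct bc_op_model {
    uint8_t  code;
    uint8_t  yes;
    uint16_t no;
};

/* A jump is acceptable if it moves forward by at least one instruction,
 * keeps 4-byte alignment, and stays inside the remaining bytecode. */
static bool jump_ok(size_t dist, size_t remaining)
{
    return dist >= sizeof(struct bc_op_model) &&
           (dist & 3) == 0 &&
           dist <= remaining;
}

/* Walk the program along the 'yes' path, validating both jump targets. */
static bool bc_audit_model(const uint8_t *bc, size_t len)
{
    size_t off = 0;

    while (off < len) {
        struct bc_op_model op;
        size_t remaining = len - off;

        if (remaining < sizeof op)
            return false;
        memcpy(&op, bc + off, sizeof op);   /* avoid unaligned reads */
        if (!jump_ok(op.yes, remaining) || !jump_ok(op.no, remaining))
            return false;
        off += op.yes;                      /* always > 0, so the loop ends */
    }
    return true;
}
```

Rejecting a zero or misaligned jump distance is what rules out the endless loop in this model: every accepted instruction advances the offset by at least four bytes.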
    • net: rfs: enable RFS before first data packet is received · 1eddcead
      Eric Dumazet committed
      On Thursday 16 June 2011 at 23:38 -0400, David Miller wrote:
      > From: Ben Hutchings <bhutchings@solarflare.com>
      > Date: Fri, 17 Jun 2011 00:50:46 +0100
      >
      > > On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
      > >> @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
      > >>  			goto discard;
      > >>
      > >>  		if (nsk != sk) {
      > >> +			sock_rps_save_rxhash(nsk, skb->rxhash);
      > >>  			if (tcp_child_process(sk, nsk, skb)) {
      > >>  				rsk = nsk;
      > >>  				goto reset;
      > >>
      > >
      > > I haven't tried this, but it looks reasonable to me.
      > >
      > > What about IPv6?  The logic in tcp_v6_do_rcv() looks very similar.
      >
      > Indeed ipv6 side needs the same fix.
      >
      > Eric please add that part and resubmit.  And in fact I might stick
      > this into net-2.6 instead of net-next-2.6
      >
      
      OK, here is the net-2.6 based one then, thanks !
      
      [PATCH v2] net: rfs: enable RFS before first data packet is received
      
      First packet received on a passive tcp flow is not correctly RFS
      steered.
      
      One sock_rps_record_flow() call is missing in inet_accept().
      
      But before that, we must also record rxhash when the child socket is set up.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Tom Herbert <therbert@google.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      CC: Jamal Hadi Salim <hadi@cyberus.ca>
      Signed-off-by: David S. Miller <davem@conan.davemloft.net>
  7. 17 Jun, 2011 4 commits
  8. 16 Jun, 2011 6 commits
  9. 15 Jun, 2011 1 commit
  10. 14 Jun, 2011 2 commits
  11. 13 Jun, 2011 2 commits
    • IPVS netns exit causes crash in conntrack · 8f4e0a18
      Hans Schillstrom committed
      Quote from Patrick McHardy:
      "This looks like nfnetlink.c exited and destroyed the nfnl socket, but
      ip_vs was still holding a reference to a conntrack. When the conntrack
      got destroyed it created a ctnetlink event, causing an oops in
      netlink_has_listeners when trying to use the destroyed nfnetlink
      socket."
      
      If nf_conntrack_netlink is loaded before ip_vs this is not a problem.
      
      This patch simply avoids calling ip_vs_conn_drop_conntrack()
      when netns is dying as suggested by Julian.
      Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • Delay struct net freeing while there's a sysfs instance referring to it · a685e089
      Al Viro committed
      * new refcount in struct net, controlling actual freeing of the memory
      * new method in kobj_ns_type_operations (->drop_ns())
      * ->current_ns() semantics change: it is supposed to be followed by the
        corresponding ->drop_ns().  For struct net in case of CONFIG_NET_NS it
        bumps the new refcount; net_drop_ns() decrements it and calls net_free()
        if the last reference has been dropped.  Method renamed to
        ->grab_current_ns().
      * old net_free() callers call net_drop_ns() instead.
      * sysfs_exit_ns() is gone, along with a large part of the call chain
        leading to it; now that the references stored in ->ns[...] stay valid we
        do not need to hunt them down and replace them with NULL.  That fixes
        problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
        of the sb->s_instances abuse.
      
      Note that the struct net *shutdown* logic has not changed: net_cleanup()
      is called exactly when it used to be called.  The only thing postponed by
      having a sysfs instance referring to that struct net is the actual freeing
      of the memory occupied by struct net.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
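The grab/drop pairing described in this commit can be modelled with a plain counter. This is a single-threaded sketch under stated assumptions: `struct net_model`, `grab_ns_model()` and `drop_ns_model()` are hypothetical names, a `freed` flag stands in for the real free, and the kernel would use atomic operations rather than a bare int.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct net with its extra refcount. */
struct net_model {
    int  passive;   /* references that only delay freeing of the memory */
    bool freed;     /* set where the real code would free the structure */
};

/* Take a reference; every grab must be followed by one drop. */
static struct net_model *grab_ns_model(struct net_model *net)
{
    net->passive++;
    return net;
}

/* Release a reference; the last drop frees the structure. */
static void drop_ns_model(struct net_model *net)
{
    assert(net->passive > 0);
    if (--net->passive == 0)
        net->freed = true;
}
```

In this model a sysfs-style holder that grabbed a reference keeps the memory alive after the owner has dropped its own, which is the only thing the commit says is postponed.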
  12. 12 Jun, 2011 2 commits
  13. 11 Jun, 2011 2 commits
  14. 10 Jun, 2011 2 commits
  15. 09 Jun, 2011 3 commits
  16. 08 Jun, 2011 4 commits
  17. 07 Jun, 2011 4 commits
    • net: cpu offline cause napi stall · 264524d5
      Heiko Carstens committed
      Frank Blaschka reported :
      <quote>
        During heavy network load we turn off/on cpus.
        Sometimes this causes a stall on the network device.
        Digging into the dump I found out following:
      
        napi is scheduled but does not run. From the I/O buffers
        and the napi state I see napi/rx_softirq processing has stopped
        because the budget was reached. napi stays in the
        softnet_data poll_list and the rx_softirq was raised again.
      
        I assume at this time the cpu offline comes in,
        the rx softirq is raised/moved to another cpu but napi stays in the
        poll_list of the softnet_data of the now offline cpu.
      
        Reviewing dev_cpu_callback (net/core/dev.c) I did not find that the
        poll_list is transferred to the new cpu.
      </quote>
      
      This patch is a straightforward implementation of Frank's suggestion:
      
      Transfer poll_list and trigger NET_RX_SOFTIRQ on the new cpu.
      Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
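The suggested fix, moving pending entries off the offline CPU's list and re-raising the softirq on the CPU that takes over, can be sketched with a simple linked list. The types and the function below are hypothetical models (`struct napi_model`, `struct softnet_model`, `transfer_poll_list()`), and a boolean stands in for raising NET_RX_SOFTIRQ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a napi instance and per-cpu softnet data. */
struct napi_model {
    int                id;
    struct napi_model *next;
};

struct softnet_model {
    struct napi_model *poll_list;          /* scheduled but not yet polled */
    bool               rx_softirq_raised;  /* models raising NET_RX_SOFTIRQ */
};

/* Append everything pending on the offline cpu to the live cpu's list
 * and make sure the live cpu will actually poll it. */
static void transfer_poll_list(struct softnet_model *dead,
                               struct softnet_model *live)
{
    struct napi_model **tail = &live->poll_list;

    if (dead->poll_list == NULL)
        return;
    while (*tail != NULL)
        tail = &(*tail)->next;
    *tail = dead->poll_list;
    dead->poll_list = NULL;
    live->rx_softirq_raised = true;
}
```

Without the transfer step, an entry left on the offline CPU's list is never polled again in this model, which mirrors the stall in the report.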
    • bridge: provide a cow_metrics method for fake_ops · 6407d74c
      Alexander Holler committed
      As in commit 0972ddb2 (provide cow_metrics() methods to blackhole
      dst_ops), we must provide a cow_metrics method for the bridge's
      fake_dst_ops as well.
      
      This fixes a regression coming from commits 62fa8a84 (net: Implement
      read-only protection and COW'ing of metrics.) and 33eb9873 (bridge:
      initialize fake_rtable metrics)
      
      ip link set mybridge mtu 1234
      -->
      [  136.546243] Pid: 8415, comm: ip Tainted: P 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc.         V1Sn /V1Sn
      [  136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
      [  136.546268] EIP is at 0x0
      [  136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
      [  136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
      [  136.546285]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      [  136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80 task.ti=f15c2000)
      [  136.546297] Stack:
      [  136.546301]  f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80 ffffffa1 f15c3bbc
      [  136.546315]  c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80 ffffffa6 f15c3be4
      [  136.546329]  00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae 00000000 00000000
      [  136.546343] Call Trace:
      [  136.546359]  [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
      [  136.546372]  [<c12cb9c8>] dev_set_mtu+0x38/0x80
      [  136.546381]  [<c12da347>] do_setlink+0x1a7/0x860
      [  136.546390]  [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
      [  136.546400]  [<c11d6cae>] ? nla_parse+0x6e/0xb0
      [  136.546409]  [<c12db931>] rtnl_newlink+0x361/0x510
      [  136.546420]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
      [  136.546429]  [<c1362762>] ? error_code+0x5a/0x60
      [  136.546438]  [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
      [  136.546446]  [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
      [  136.546454]  [<c12db180>] ? __rtnl_unlock+0x20/0x20
      [  136.546463]  [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
      [  136.546471]  [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
      [  136.546479]  [<c12edafa>] netlink_unicast+0x23a/0x280
      [  136.546487]  [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
      [  136.546497]  [<c12bb828>] sock_sendmsg+0xc8/0x100
      [  136.546508]  [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
      [  136.546517]  [<c11d0602>] ? _copy_from_user+0x42/0x60
      [  136.546525]  [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
      [  136.546534]  [<c12bd805>] sys_sendmsg+0x1c5/0x200
      [  136.546542]  [<c10c2150>] ? __do_fault+0x310/0x410
      [  136.546549]  [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
      [  136.546557]  [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
      [  136.546565]  [<c12bd1af>] ? sys_getsockname+0x7f/0x90
      [  136.546574]  [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
      [  136.546582]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
      [  136.546589]  [<c10233b3>] ? do_page_fault+0x173/0x3d0
      [  136.546596]  [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
      [  136.546605]  [<c12bdd83>] sys_socketcall+0x293/0x2d0
      [  136.546614]  [<c13629d0>] sysenter_do_call+0x12/0x26
      [  136.546619] Code:  Bad EIP value.
      [  136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
      [  136.546645] CR2: 0000000000000000
      [  136.546652] ---[ end trace 6909b560e78934fa ]---
      Signed-off-by: Alexander Holler <holler@ahsoftware.de>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • af_packet: prevent information leak · 13fcb7bd
      Eric Dumazet committed
      In 2.6.27, commit 393e52e3 (packet: deliver VLAN TCI to userspace)
      added a small information leak.
      
      Add a padding field and make sure it is zeroed before copying to user space.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
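The general pattern behind this kind of fix is to give the trailing bytes of a structure an explicit name and write them before the structure is copied out. The sketch below uses a hypothetical `struct aux_model` and `fill_aux_model()`; it is not the af_packet structure, only an illustration of the technique.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record handed to user space. Without the explicit field,
 * the two bytes after vlan_tci would be compiler padding that keeps
 * whatever was previously in that memory. */
struct aux_model {
    uint32_t status;
    uint16_t vlan_tci;
    uint16_t padding;   /* named so that it can be zeroed explicitly */
};

/* Fill every byte of the record, including the padding field. */
static void fill_aux_model(struct aux_model *aux, uint32_t status,
                           uint16_t vlan_tci)
{
    aux->status   = status;
    aux->vlan_tci = vlan_tci;
    aux->padding  = 0;  /* nothing stale is left to leak */
}
```

Because every member is assigned, the record carries no leftover memory contents regardless of what the buffer held before.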
    • irda: iriap: Use separate lockdep class for irias_objects->hb_spinlock · 79b38915
      David S. Miller committed
      The SEQ output functions grab the obj->attrib->hb_spinlock lock of
      sub-objects found in the hash traversal.  These locks are in a different
      realm than the one used for the irias_objects hash table itself.
      
      So put the latter into its own lockdep class.
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>