1. 07 11月, 2008 1 次提交
    • D
      net: Fix recursive descent in __scm_destroy(). · 3b53fbf4
      David S. Miller 提交于
      __scm_destroy() walks the list of file descriptors in the scm_fp_list
      pointed to by the scm_cookie argument.
      
      Those, in turn, can close sockets and invoke __scm_destroy() again.
      
      There is nothing which limits how deeply this can occur.
      
      The idea for how to fix this is from Linus.  Basically, we do all of
      the fput()s at the top level by collecting all of the scm_fp_list
      objects hit by an fput().  Inside of the initial __scm_destroy() we
      keep running the list until it is empty.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b53fbf4
  2. 05 11月, 2008 6 次提交
    • D
      tcp: Fix recvmsg MSG_PEEK influence of blocking behavior. · 518a09ef
      David S. Miller 提交于
      Vito Caputo noticed that tcp_recvmsg() returns immediately from
      partial reads when MSG_PEEK is used.  In particular, this means that
      SO_RCVLOWAT is not respected.
      
      Simply remove the test.  And this matches the behavior of several
      other systems, including BSD.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      518a09ef
    • A
      netfilter: netns ct: walk netns list under RTNL · efb9a8c2
      Alexey Dobriyan 提交于
      netns list (just list) is under RTNL. But helper and proto unregistration
      happen during rmmod when RTNL is not held, and that's how it was tested:
      modprobe/rmmod vs clone(CLONE_NEWNET)/exit.
      
      BUG: unable to handle kernel paging request at 0000000000100100	<===
      IP: [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
      PGD 15e300067 PUD 15e1d8067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      last sysfs file: /sys/kernel/uevent_seqnum
      CPU 0
      Modules linked in: nf_conntrack_proto_sctp(-) nf_conntrack_proto_dccp(-) af_packet iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 sr_mod cdrom [last unloaded: nf_conntrack_proto_sctp]
      Pid: 16758, comm: rmmod Not tainted 2.6.28-rc2-netns-xfrm #3
      RIP: 0010:[<ffffffffa009890f>]  [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
      RSP: 0018:ffff88015dc1fec8  EFLAGS: 00010212
      RAX: 0000000000000000 RBX: 00000000001000f8 RCX: 0000000000000000
      RDX: ffffffffa009575c RSI: 0000000000000003 RDI: ffffffffa00956b5
      RBP: ffff88015dc1fed8 R08: 0000000000000002 R09: 0000000000000000
      R10: 0000000000000000 R11: ffff88015dc1fe48 R12: ffffffffa0458f60
      R13: 0000000000000880 R14: 00007fff4c361d30 R15: 0000000000000880
      FS:  00007f624435a6f0(0000) GS:ffffffff80521580(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000100100 CR3: 0000000168969000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process rmmod (pid: 16758, threadinfo ffff88015dc1e000, task ffff880179864218)
      Stack:
       ffffffffa0459100 0000000000000000 ffff88015dc1fee8 ffffffffa0457934
       ffff88015dc1ff78 ffffffff80253fef 746e6e6f635f666e 6f72705f6b636172
       00707463735f6f74 ffffffff8024cb30 00000000023b8010 0000000000000000
      Call Trace:
       [<ffffffffa0457934>] nf_conntrack_proto_sctp_fini+0x10/0x1e [nf_conntrack_proto_sctp]
       [<ffffffff80253fef>] sys_delete_module+0x19f/0x1fe
       [<ffffffff8024cb30>] ? trace_hardirqs_on_caller+0xf0/0x114
       [<ffffffff803ea9b2>] ? trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff8020b52b>] system_call_fastpath+0x16/0x1b
      Code: 13 35 e0 e8 c4 6c 1a e0 48 8b 1d 6d c6 46 e0 eb 16 48 89 df 4c 89 e2 48 c7 c6 fc 85 09 a0 e8 61 cd ff ff 48 8b 5b 08 48 83 eb 08 <48> 8b 43 08 0f 18 08 48 8d 43 08 48 3d 60 4f 50 80 75 d3 5b 41
      RIP  [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
       RSP <ffff88015dc1fec8>
      CR2: 0000000000100100
      ---[ end trace bde8ac82debf7192 ]---
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efb9a8c2
    • B
      ipv6: fix run pending DAD when interface becomes ready · e3ec6cfc
      Benjamin Thery 提交于
      With some net devices types, an IPv6 address configured while the
      interface was down can stay 'tentative' forever, even after the interface
      is set up. In some case, pending IPv6 DADs are not executed when the
      device becomes ready.
      
      I observed this while doing some tests with kvm. If I assign an IPv6 
      address to my interface eth0 (kvm driver rtl8139) when it is still down
      then the address is flagged tentative (IFA_F_TENTATIVE). Then, I set
      eth0 up, and to my surprise, the address stays 'tentative', no DAD is
      executed and the address can't be pinged.
      
      I also observed the same behaviour, without kvm, with virtual interfaces
      types macvlan and veth.
      
      Some easy steps to reproduce the issue with macvlan:
      
      1. ip link add link eth0 type macvlan
      2. ip -6 addr add 2003::ab32/64 dev macvlan0
      3. ip addr show dev macvlan0
         ... 
         inet6 2003::ab32/64 scope global tentative
         ...
      4. ip link set macvlan0 up
      5. ip addr show dev macvlan0
         ...
         inet6 2003::ab32/64 scope global tentative
         ...
         Address is still tentative
      
      I think there's a bug in net/ipv6/addrconf.c, addrconf_notify():
      addrconf_dad_run() is not always run when the interface is flagged IF_READY.
      Currently it is only run when receiving NETDEV_CHANGE event. Looks like
      some (virtual) devices doesn't send this event when becoming up.
      
      For both NETDEV_UP and NETDEV_CHANGE events, when the interface becomes
      ready, run_pending should be set to 1. Patch below.
      
      'run_pending = 1' could be moved below the if/else block but it makes 
      the code less readable.
      Signed-off-by: NBenjamin Thery <benjamin.thery@bull.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e3ec6cfc
    • R
      net/9p: fix printk format warnings · b22cecdd
      Randy Dunlap 提交于
      Fix printk format warnings in net/9p.
      Built cleanly on 7 arches.
      
      net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
      net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
      net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
      net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
      net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
      net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
      net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
      net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
      net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
      net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
      net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
      net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
      net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
      net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
      net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
      net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
      net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
      net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b22cecdd
    • P
      net: fix packet socket delivery in rx irq handler · 9b22ea56
      Patrick McHardy 提交于
      The changes to deliver hardware accelerated VLAN packets to packet
      sockets (commit bc1d0411) caused a warning for non-NAPI drivers.
      The __vlan_hwaccel_rx() function is called directly from the drivers
      RX function, for non-NAPI drivers that means its still in RX IRQ
      context:
      
      [   27.779463] ------------[ cut here ]------------
      [   27.779509] WARNING: at kernel/softirq.c:136 local_bh_enable+0x37/0x81()
      ...
      [   27.782520]  [<c0264755>] netif_nit_deliver+0x5b/0x75
      [   27.782590]  [<c02bba83>] __vlan_hwaccel_rx+0x79/0x162
      [   27.782664]  [<f8851c1d>] atl1_intr+0x9a9/0xa7c [atl1]
      [   27.782738]  [<c0155b17>] handle_IRQ_event+0x23/0x51
      [   27.782808]  [<c015692e>] handle_edge_irq+0xc2/0x102
      [   27.782878]  [<c0105fd5>] do_IRQ+0x4d/0x64
      
      Split hardware accelerated VLAN reception into two parts to fix this:
      
      - __vlan_hwaccel_rx just stores the VLAN TCI and performs the VLAN
        device lookup, then calls netif_receive_skb()/netif_rx()
      
      - vlan_hwaccel_do_receive(), which is invoked by netif_receive_skb()
        in softirq context, performs the real reception and delivery to
        packet sockets.
      Reported-and-tested-by: NRamon Casellas <ramon.casellas@cttc.es>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b22ea56
    • A
      xfrm: Have af-specific init_tempsel() initialize family field of temporary selector · 79654a76
      Andreas Steffen 提交于
      While adding MIGRATE support to strongSwan, Andreas Steffen noticed that
      the selectors provided in XFRM_MSG_ACQUIRE have their family field
      uninitialized (those in MIGRATE do have their family set).
      
      Looking at the code, this is because the af-specific init_tempsel()
      (called via afinfo->init_tempsel() in xfrm_init_tempsel()) do not set
      the value.
      Reported-by: NAndreas Steffen <andreas.steffen@strongswan.org>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NArnaud Ebalard <arno@natisbad.org>
      79654a76
  3. 04 11月, 2008 9 次提交
  4. 03 11月, 2008 4 次提交
  5. 02 11月, 2008 5 次提交
  6. 01 11月, 2008 1 次提交
    • A
      key: fix setkey(8) policy set breakage · 920da692
      Alexey Dobriyan 提交于
      Steps to reproduce:
      
      	#/usr/sbin/setkey -f
      	flush;
      	spdflush;
      
      	add 192.168.0.42 192.168.0.1 ah 24500 -A hmac-md5 "1234567890123456";
      	add 192.168.0.42 192.168.0.1 esp 24501 -E 3des-cbc "123456789012123456789012";
      
      	spdadd 192.168.0.42 192.168.0.1 any -P out ipsec
      		esp/transport//require
      		ah/transport//require;
      
      setkey: invalid keymsg length
      
      Policy dump will bail out with the same message after that.
      
      -recv(4, "\2\16\0\0\32\0\3\0\0\0\0\0\37\r\0\0\3\0\5\0\377 \0\0\2\0\0\0\300\250\0*\0"..., 32768, 0) = 208
      +recv(4, "\2\16\0\0\36\0\3\0\0\0\0\0H\t\0\0\3\0\5\0\377 \0\0\2\0\0\0\300\250\0*\0"..., 32768, 0) = 208
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      920da692
  7. 31 10月, 2008 14 次提交
    • I
      bpa10x: free sk_buff with kfree_skb · cbafe312
      Ilpo Järvinen 提交于
      Inspired by Sergio Luis' similar patches, I finally found
      a case which is trivial enough that spatch won't choke
      on it.
      Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: NMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cbafe312
    • F
      xfrm: do not leak ESRCH to user space · a4322266
      fernando@oss.ntt.co 提交于
      I noticed that, under certain conditions, ESRCH can be leaked from the
      xfrm layer to user space through sys_connect. In particular, this seems
      to happen reliably when the kernel fails to resolve a template either
      because the AF_KEY receive buffer being used by racoon is full or
      because the SA entry we are trying to use is in XFRM_STATE_EXPIRED
      state.
      
      However, since this could be a transient issue it could be argued that
      EAGAIN would be more appropriate. Besides this error code is not even
      documented in the man page for sys_connect (as of man-pages 3.07).
      Signed-off-by: NFernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a4322266
    • D
      net: Really remove all of LOOPBACK_TSO code. · 3a8af722
      David S. Miller 提交于
      As noticed by Saikiran Madugula, commit 7447ef63
      ("loopback: Remove rest of LOOPBACK_TSO code.") got rid of
      emulate_large_send_offload() but didn't get rid of the call
      site as well.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3a8af722
    • D
    • A
      netfilter: nf_conntrack_proto_gre: switch to register_pernet_gen_subsys() · 61e57448
      Alexey Dobriyan 提交于
      register_pernet_gen_device() can't be used is nf_conntrack_pptp module is
      also used (compiled in or loaded).
      
      Right now, proto_gre_net_exit() is called before nf_conntrack_pptp_net_exit().
      The former shutdowns and frees GRE piece of netns, however the latter
      absolutely needs it to flush keymap. Oops is inevitable.
      
      Switch to shiny new register_pernet_gen_subsys() to get correct ordering in
      netns ops list.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      61e57448
    • A
      netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys · 485ac57b
      Alexey Dobriyan 提交于
      netns ops which are registered with register_pernet_gen_device() are
      shutdown strictly before those which are registered with
      register_pernet_subsys(). Sometimes this leads to opposite (read: buggy)
      shutdown ordering between two modules.
      
      Add register_pernet_gen_subsys()/unregister_pernet_gen_subsys() for modules
      which aren't elite enough for entry in struct net, and which can't use
      register_pernet_gen_device(). PPTP conntracking module is such one.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      485ac57b
    • R
      net: delete excess kernel-doc notation · ad1d967c
      Randy Dunlap 提交于
      Remove excess kernel-doc function parameters from networking header
      & driver files:
      
      Warning(include/net/sock.h:946): Excess function parameter or struct member 'sk' description in 'sk_filter_release'
      Warning(include/linux/netdevice.h:1545): Excess function parameter or struct member 'cpu' description in 'netif_tx_lock'
      Warning(drivers/net/wan/z85230.c:712): Excess function parameter or struct member 'regs' description in 'z8530_interrupt'
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad1d967c
    • D
    • D
    • D
      pppoe: Fix socket leak. · 263e69cb
      David S. Miller 提交于
      Move SKB trim before we lookup the socket so we don't have to
      put it on failure.
      
      Based upon an initial patch by Jarek Poplawski and suggestions
      from Herbert Xu.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      263e69cb
    • T
      gianfar: Don't reset TBI<->SerDes link if it's already up · bdb59f94
      Trent Piepho 提交于
      The link may be up already via the chip's reset strapping, or though action
      of U-Boot, or from the last time the interface was brought up.  Resetting
      the link causes it to go down for several seconds.  This can significantly
      increase the time from power-on to DHCP completion and a device being
      accessible to the network.
      Signed-off-by: NTrent Piepho <tpiepho@freescale.com>
      Acked-by: NAndy Fleming <afleming@freescale.com>
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      bdb59f94
    • T
      gianfar: Fix race in TBI/SerDes configuration · c132419e
      Trent Piepho 提交于
      The init_phy() function attaches to the PHY, then configures the
      SerDes<->TBI link (in SGMII mode).  The TBI is on the MDIO bus with the PHY
      (sort of) and is accessed via the gianfar's MDIO registers, using the
      functions gfar_local_mdio_read/write(), which don't do any locking.
      
      The previously attached PHY will start a work-queue on a timer, and
      probably an irq handler as well, which will talk to the PHY and thus use
      the MDIO bus.  This uses phy_read/write(), which have locking, but not
      against the gfar_local_mdio versions.
      
      The result is that PHY code will try to use the MDIO bus at the same time
      as the SerDes setup code, corrupting the transfers.
      
      Setting up the SerDes before attaching to the PHY will insure that there is
      no race between the SerDes code and *our* PHY, but doesn't fix everything.
      Typically the PHYs for all gianfar devices are on the same MDIO bus, which
      is associated with the first gianfar device.  This means that the first
      gianfar's SerDes code could corrupt the MDIO transfers for a different
      gianfar's PHY.
      
      The lock used by phy_read/write() is contained in the mii_bus structure,
      which is pointed to by the PHY.  This is difficult to access from the
      gianfar drivers, as there is no link between a gianfar device and the
      mii_bus which shares the same MDIO registers.  As far as the device layer
      and drivers are concerned they are two unrelated devices (which happen to
      share registers).
      
      Generally all gianfar devices' PHYs will be on the bus associated with the
      first gianfar.  But this might not be the case, so simply locking the
      gianfar's PHY's mii bus might not lock the mii bus that the SerDes setup
      code is going to use.
      
      We solve this by having the code that creates the gianfar platform device
      look in the device tree for an mdio device that shares the gianfar's
      registers.  If one is found the ID of its platform device is saved in the
      gianfar's platform data.
      
      A new function in the gianfar mii code, gfar_get_miibus(), can use the bus
      ID to search through the platform devices for a gianfar_mdio device with
      the right ID.  The platform device's driver data is the mii_bus structure,
      which the SerDes setup code can use to lock the current bus.
      Signed-off-by: NTrent Piepho <tpiepho@freescale.com>
      CC: Andy Fleming <afleming@freescale.com>
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      c132419e
    • D
      at91_ether: request/free GPIO for PHY interrupt · 71527ef4
      David Brownell 提交于
      When the at91_ether driver is using a GPIO for its PHY interrupt,
      be sure to request (and later, if needed, free) that GPIO.
      Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net>
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      71527ef4
    • C
      amd8111e: fix dma_free_coherent context · e83603fd
      Chunbo Luo 提交于
      Acoording commit aa24886e,
      dma_free_coherent() need irqs enabled.
      
      This patch fix following warning messages:
      
      WARNING: at linux/arch/x86/kernel/pci-dma.c:376 dma_free_coherent+0xaa/0xb0()
      
      Call Trace:
       [<ffffffff8023f80f>] warn_on_slowpath+0x5f/0x90
       [<ffffffff80496ffa>] ? __kfree_skb+0x3a/0xa0
       [<ffffffff802a4723>] ? discard_slab+0x23/0x40
       [<ffffffff8021274a>] dma_free_coherent+0xaa/0xb0
       [<ffffffff8043668f>] amd8111e_close+0x10f/0x1b0
       [<ffffffff8049f3ae>] dev_close+0x5e/0xb0
       [<ffffffff8049efa1>] dev_change_flags+0xa1/0x1e0
       [<ffffffff806b2171>] ic_close_devs+0x36/0x4e
       [<ffffffff806b29ee>] ip_auto_config+0x581/0x10f3
       [<ffffffff803a6e19>] ? kobject_add+0x69/0x90
       [<ffffffff803a698a>] ? kobject_get+0x1a/0x30
       [<ffffffff803a785b>] ? kobject_uevent+0xb/0x10
       [<ffffffff803a6c62>] ? kset_register+0x52/0x60
       [<ffffffff803a6f9b>] ? kset_create_and_add+0x6b/0xa0
       [<ffffffff804e2e74>] ? tcp_ca_find+0x24/0x50
       [<ffffffff806b246d>] ? ip_auto_config+0x0/0x10f3
       [<ffffffff8020903c>] _stext+0x3c/0x150
       [<ffffffff802772d3>] ? register_irq_proc+0xd3/0xf0
       [<ffffffff802f0000>] ? mb_cache_create+0x80/0x1f0
       [<ffffffff80688693>] kernel_init+0x141/0x1b8
       [<ffffffff80688552>] ? kernel_init+0x0/0x1b8
       [<ffffffff8020d609>] child_rip+0xa/0x11
       [<ffffffff80688552>] ? kernel_init+0x0/0x1b8
       [<ffffffff80688552>] ? kernel_init+0x0/0x1b8
       [<ffffffff8020d5ff>] ? child_rip+0x0/0x11
      Signed-off-by: NChunbo Luo <chunbo.luo@windriver.com>
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      e83603fd