1. 04 6月, 2008 2 次提交
    • T
      netlink: Improve returned error codes · bc3ed28c
      Thomas Graf 提交于
      Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and
      nla_nest_cancel() void functions.
      
      Return -EMSGSIZE instead of -1 if the provided message buffer is not
      big enough.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bc3ed28c
    • S
      net: neighbour table ABI problem · b9f5f52c
      Stephen Hemminger 提交于
      The neighbor table time of last use information is returned in the
      incorrect unit. Kernel to user space ABI's need to use USER_HZ (or
      milliseconds), otherwise the application has to try and discover the
      real system HZ value which is problematic.  Linux has standardized on
      keeping USER_HZ consistent (100hz) even when kernel is running
      internally at some other value.
      
      This change is small, but it breaks the ABI for older version of
      iproute2 utilities.  But these utilities are already broken since they
      are looking at the psched_hz values which are completely different. So
      let's just go ahead and fix both kernel and user space. Older
      utilities will just print wrong values.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b9f5f52c
  2. 02 5月, 2008 1 次提交
  3. 28 3月, 2008 2 次提交
  4. 26 3月, 2008 5 次提交
  5. 25 3月, 2008 1 次提交
    • P
      [NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3). · fa86d322
      Pavel Emelyanov 提交于
      Proxy neighbors do not have any reference counting, so any caller
      of pneigh_lookup (unless it's a netlink triggered add/del routine)
      should _not_ perform any actions on the found proxy entry. 
      
      There's one exception from this rule - the ipv6's ndisc_recv_ns() 
      uses found entry to check the flags for NTF_ROUTER.
      
      This creates a race between the ndisc and pneigh_delete - after 
      the pneigh is returned to the caller, the nd_tbl.lock is dropped 
      and the deleting procedure may proceed.
      
      One of the fixes would be to add a reference counting, but this
      problem exists for ndisc only. Besides such a patch would be too 
      big for -rc4.
      
      So I propose to introduce a __pneigh_lookup() which is supposed
      to be called with the lock held and use it in ndisc code to check
      the flags on alive pneigh entry.
      
      
      Changes from v2:
      As David noticed, Exported the __pneigh_lookup() to ipv6 module. 
      The checkpatch generates a warning on it, since the EXPORT_SYMBOL 
      does not follow the symbol itself, but in this file all the 
      exports come at the end, so I decided no to break this harmony.
      
      Changes from v1:
      Fixed comments from YOSHIFUJI - indentation of prototype in header
      and the pndisc_check_router() name - and a compilation fix, pointed
      by Daniel - the is_routed was (falsely) considered as uninitialized
      by gcc.
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fa86d322
  6. 04 3月, 2008 1 次提交
  7. 29 2月, 2008 3 次提交
  8. 24 2月, 2008 1 次提交
  9. 20 2月, 2008 1 次提交
  10. 18 2月, 2008 1 次提交
  11. 13 2月, 2008 1 次提交
    • D
      [NDISC]: Fix race in generic address resolution · 69cc64d8
      David S. Miller 提交于
      Frank Blaschka provided the bug report and the initial suggested fix
      for this bug.  He also validated this version of this fix.
      
      The problem is that the access to neigh->arp_queue is inconsistent, we
      grab references when dropping the lock lock to call
      neigh->ops->solicit() but this does not prevent other threads of
      control from trying to send out that packet at the same time causing
      corruptions because both code paths believe they have exclusive access
      to the skb.
      
      The best option seems to be to hold the write lock on neigh->lock
      during the ->solicit() call.  I looked at all of the ndisc_ops
      implementations and this seems workable.  The only case that needs
      special care is the IPV4 ARP implementation of arp_solicit().  It
      wants to take neigh->lock as a reader to protect the header entry in
      neigh->ha during the emission of the soliciation.  We can simply
      remove the read lock calls to take care of that since holding the lock
      as a writer at the caller providers a superset of the protection
      afforded by the existing read locking.
      
      The rest of the ->solicit() implementations don't care whether the
      neigh is locked or not.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      69cc64d8
  12. 29 1月, 2008 12 次提交
  13. 21 1月, 2008 1 次提交
  14. 10 1月, 2008 1 次提交
    • P
      [NEIGH]: Fix race between neigh_parms_release and neightbl_fill_parms · 9cd40029
      Pavel Emelyanov 提交于
      The neightbl_fill_parms() is called under the write-locked tbl->lock
      and accesses the parms->dev. The negh_parm_release() calls the
      dev_put(parms->dev) without this lock. This creates a tiny race window
      on which the parms contains potentially stale dev pointer.
      
      To fix this race it's enough to move the dev_put() upper under the
      tbl->lock, but note, that the parms are held by neighbors and thus can
      live after the neigh_parms_release() is called, so we still can have a
      parm with bad dev pointer.
      
      I didn't find where the neigh->parms->dev is accessed, but still think
      that putting the dev is to be done in a place, where the parms are
      really freed. Am I right with that?
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9cd40029
  15. 07 11月, 2007 1 次提交
    • A
      [NET]: Remove /proc/net/stat/*_arp_cache upon module removal · 3f192b5c
      Alexey Dobriyan 提交于
      neigh_table_init_no_netlink() creates them, but they aren't removed anywhere.
      
      Steps to reproduce:
      
      	modprobe clip
      	rmmod clip
      	cat /proc/net/stat/clip_arp_cache
      
      BUG: unable to handle kernel paging request at virtual address f89d7758
      printing eip: c05a99da *pdpt = 0000000000004001 *pde = 0000000004408067 *pte = 0000000000000000
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in: atm af_packet ipv6 binfmt_misc sbs sbshc fan dock battery backlight ac power_supply parport loop rtc_cmos rtc_core rtc_lib serio_raw button k8temp hwmon amd_rng sr_mod cdrom shpchp pci_hotplug ehci_hcd ohci_hcd uhci_hcd usbcore
      Pid: 2082, comm: cat Not tainted (2.6.24-rc1-b1d08ac0-bloat #4)
      EIP: 0060:[<c05a99da>] EFLAGS: 00210256 CPU: 0
      EIP is at neigh_stat_seq_next+0x26/0x3f
      EAX: 00000001 EBX: f89d7600 ECX: c587bf40 EDX: 00000000
      ESI: 00000000 EDI: 00000001 EBP: 00000400 ESP: c587bf1c
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      Process cat (pid: 2082, ti=c587b000 task=c5984e10 task.ti=c587b000)
      Stack: c06228cc c5313790 c049e5c0 0804f000 c45a7b00 c53137b0 00000000 00000000
             00000082 00000001 00000000 00000000 00000000 fffffffb c58d6780 c049e437
             c45a7b00 c04b1f93 c587bfa0 00000400 0804f000 00000400 0804f000 c04b1f2f
      Call Trace:
       [<c049e5c0>] seq_read+0x189/0x281
       [<c049e437>] seq_read+0x0/0x281
       [<c04b1f93>] proc_reg_read+0x64/0x77
       [<c04b1f2f>] proc_reg_read+0x0/0x77
       [<c048907e>] vfs_read+0x80/0xd1
       [<c0489491>] sys_read+0x41/0x67
       [<c04080fa>] sysenter_past_esp+0x6b/0xc1
       =======================
      Code: e9 ec 8d 05 00 56 8b 11 53 8b 40 70 8b 58 3c eb 29 0f a3 15 80 91 7b c0 19 c0 85 c0 8d 42 01 74 17 89 c6 c1 fe 1f 89 01 89 71 04 <8b> 83 58 01 00 00 f7 d0 8b 04 90 eb 09 89 c2 83 fa 01 7e d2 31
      EIP: [<c05a99da>] neigh_stat_seq_next+0x26/0x3f SS:ESP 0068:c587bf1c
      Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3f192b5c
  16. 22 10月, 2007 1 次提交
  17. 19 10月, 2007 1 次提交
  18. 16 10月, 2007 1 次提交
  19. 11 10月, 2007 3 次提交
    • S
      [NET]: Move hardware header operations out of netdevice. · 3b04ddde
      Stephen Hemminger 提交于
      Since hardware header operations are part of the protocol class
      not the device instance, make them into a separate object and
      save memory.
      Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b04ddde
    • S
      [NET]: Wrap netdevice hardware header creation. · 0c4e8581
      Stephen Hemminger 提交于
      Add inline for common usage of hardware header creation, and
      fix bug in IPV6 mcast where the assumption about negative return is
      an errno. Negative return from hard_header means not enough space
      was available,(ie -N bytes).
      Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c4e8581
    • E
      [NET]: Make the device list and device lookups per namespace. · 881d966b
      Eric W. Biederman 提交于
      This patch makes most of the generic device layer network
      namespace safe.  This patch makes dev_base_head a
      network namespace variable, and then it picks up
      a few associated variables.  The functions:
      dev_getbyhwaddr
      dev_getfirsthwbytype
      dev_get_by_flags
      dev_get_by_name
      __dev_get_by_name
      dev_get_by_index
      __dev_get_by_index
      dev_ioctl
      dev_ethtool
      dev_load
      wireless_process_ioctl
      
      were modified to take a network namespace argument, and
      deal with it.
      
      vlan_ioctl_set and brioctl_set were modified so their
      hooks will receive a network namespace argument.
      
      So basically anthing in the core of the network stack that was
      affected to by the change of dev_base was modified to handle
      multiple network namespaces.  The rest of the network stack was
      simply modified to explicitly use &init_net the initial network
      namespace.  This can be fixed when those components of the network
      stack are modified to handle multiple network namespaces.
      
      For now the ifindex generator is left global.
      
      Fundametally ifindex numbers are per namespace, or else
      we will have corner case problems with migration when
      we get that far.
      
      At the same time there are assumptions in the network stack
      that the ifindex of a network device won't change.  Making
      the ifindex number global seems a good compromise until
      the network stack can cope with ifindex changes when
      you change namespaces, and the like.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      881d966b