- 25 3月, 2008 1 次提交
-
-
由 Pavel Emelyanov 提交于
Proxy neighbors do not have any reference counting, so any caller of pneigh_lookup (unless it's a netlink triggered add/del routine) should _not_ perform any actions on the found proxy entry. There's one exception from this rule - the ipv6's ndisc_recv_ns() uses found entry to check the flags for NTF_ROUTER. This creates a race between the ndisc and pneigh_delete - after the pneigh is returned to the caller, the nd_tbl.lock is dropped and the deleting procedure may proceed. One of the fixes would be to add a reference counting, but this problem exists for ndisc only. Besides such a patch would be too big for -rc4. So I propose to introduce a __pneigh_lookup() which is supposed to be called with the lock held and use it in ndisc code to check the flags on alive pneigh entry. Changes from v2: As David noticed, Exported the __pneigh_lookup() to ipv6 module. The checkpatch generates a warning on it, since the EXPORT_SYMBOL does not follow the symbol itself, but in this file all the exports come at the end, so I decided no to break this harmony. Changes from v1: Fixed comments from YOSHIFUJI - indentation of prototype in header and the pndisc_check_router() name - and a compilation fix, pointed by Daniel - the is_routed was (falsely) considered as uninitialized by gcc. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 3月, 2008 1 次提交
-
-
由 Frank Blaschka 提交于
neigh_update sends skb from neigh->arp_queue while neigh_timer_handler has increased skbs refcount and calls solicit with the skb. neigh_timer_handler should not increase skbs refcount but make a copy of the skb and do solicit with the copy. Signed-off-by: NFrank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 2月, 2008 1 次提交
-
-
由 Wang Chen 提交于
Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: NWang Chen <wangchen@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 2月, 2008 1 次提交
-
-
由 Pavel Emelyanov 提交于
The neigh_hash_grow() may update the tbl->hash_rnd value, which is used in all tbl->hash callbacks to calculate the hashval. Two lookup routines may race with this, since they call the ->hash callback without the tbl->lock held. Since the hash_rnd is changed with this lock write-locked moving the calls to ->hash under this lock read-locked closes this gap. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 2月, 2008 1 次提交
-
-
由 Denis V. Lunev 提交于
release_net is missed on the error path in pneigh_lookup. Signed-off-by: NDenis V. Lunev <den@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 2月, 2008 1 次提交
-
-
由 David S. Miller 提交于
This reverts commit 69cc64d8. It causes recursive locking in IPV6 because unlike other neighbour layer clients, it even needs neighbour cache entries to send neighbour soliciation messages :-( We'll have to find another way to fix this race. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 2月, 2008 1 次提交
-
-
由 David S. Miller 提交于
Frank Blaschka provided the bug report and the initial suggested fix for this bug. He also validated this version of this fix. The problem is that the access to neigh->arp_queue is inconsistent, we grab references when dropping the lock lock to call neigh->ops->solicit() but this does not prevent other threads of control from trying to send out that packet at the same time causing corruptions because both code paths believe they have exclusive access to the skb. The best option seems to be to hold the write lock on neigh->lock during the ->solicit() call. I looked at all of the ndisc_ops implementations and this seems workable. The only case that needs special care is the IPV4 ARP implementation of arp_solicit(). It wants to take neigh->lock as a reader to protect the header entry in neigh->ha during the emission of the soliciation. We can simply remove the read lock calls to take care of that since holding the lock as a writer at the caller providers a superset of the protection afforded by the existing read locking. The rest of the ->solicit() implementations don't care whether the neigh is locked or not. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 1月, 2008 12 次提交
-
-
由 Denis V. Lunev 提交于
Make them static. [ Moved the inline before, instead of after, call sites. -DaveM ] Signed-off-by: NDenis V. Lunev <den@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
No need for this. It is declared in the neighbour.h Signed-off-by: NDenis V. Lunev <den@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
Valid network device is always passed into neigh_param_alloc, so remove extra checking for dev == NULL. Additionally, cleanup bogus netns assignment. Signed-off-by: NDenis V. Lunev <den@openvz.org> Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
seq_open_net requires that first field of the seq->private data to be struct seq_net_private. In reality this is a single pointer to a struct net for now. The patch makes code consistent. Signed-off-by: NDenis V. Lunev <den@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Add __acquires() and __releases() annotations to suppress some sparse warnings. example of warnings : net/ipv4/udp.c:1555:14: warning: context imbalance in 'udp_seq_start' - wrong count at exit net/ipv4/udp.c:1571:13: warning: context imbalance in 'udp_seq_stop' - unexpected unlock Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
I'm actually surprised at how much was involved. At first glance it appears that the neighbour table data structures are already split by network device so all that should be needed is to modify the user interface commands to filter the set of neighbours by the network namespace of their devices. However a couple things turned up while I was reading through the code. The proxy neighbour table allows entries with no network device, and the neighbour parms are per network device (except for the defaults) so they now need a per network namespace default. So I updated the two structures (which surprised me) with their very own network namespace parameter. Updated the relevant lookup and destroy routines with a network namespace parameter and modified the code that interacts with users to filter out neighbour table entries for devices of other namespaces. I'm a little concerned that we can modify and display the global table configuration and from all network namespaces. But this appears good enough for now. I keep thinking modifying the neighbour table to have per network namespace instances of each table type would should be cleaner. The hash table is already dynamically sized so there are it is not a limiter. The default parameter would be straight forward to take care of. However when I look at the how the network table is built and used I still find some assumptions that there is only a single neighbour table for each type of table in the kernel. The netlink operations, neigh_seq_start, the non-core network users that call neigh_lookup. So while it might be doable it would require more refactoring than my current approach of just doing a little extra filtering in the code. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDaniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The neigh_del_timer() looks sane - it removes the timer and (conditionally) puts the neighbor. I expected, that the neigh_add_timer() is symmetrical to the del one - i.e. it holds the neighbor and arms the timer - but it turned out that it was not so. I think, that making them look symmetrical makes the code more readable. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The appropriate path is prepared right inside this function. It is prepared similar to how the ctl tables were. Since the path is modified, it is put on the stack, to avoid possible races with multiple calls to neigh_sysctl_register() : it is called by protocols and I didn't find any protection in this case. Did I overlooked the rtnl lock?. The stack growth of the neigh_sysctl_register() is 40 bytes. I believe this is OK, since this is not that much and this function is not called with the deep stack (device/protocols register). The device's name is stored on the template to free it later. This will help with the net namespaces, as each namespace should have its own set of these ctls. Besides, this saves ~350 bytes from the neigh template :) Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
This mainly removes the err variable, as this call always return the same error code (-ENOBUFS). Besides, I moved the call to kmalloc() from the *t declaration into the code (this is confusing when a variable is initialized with the result of some call) and removed unneeded comment near the error path. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
After this patch none of the netlink callback support anything except the initial network namespace but the rtnetlink infrastructure now handles multiple network namespaces. Changes from v2: - IPv6 addrlabel processing Changes from v1: - no need for special rtnl_unlock handling - fixed IPv6 ndisc Signed-off-by: NDenis V. Lunev <den@openvz.org> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
Before I can enable rtnetlink to work in all network namespaces I need to be certain that something won't break. So this patch deliberately disables all of the rtnletlink methods in everything except the initial network namespace. After the methods have been audited this extra check can be disabled. Changes from v1: - added IPv6 addrlabel protection Signed-off-by: NDenis V. Lunev <den@openvz.org> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Pavel Emelyanov 提交于
Many-many code in the kernel initialized the timer->function and timer->data together with calling init_timer(timer). There is already a helper for this. Use it for networking code. The patch is HUGE, but makes the code 130 lines shorter (98 insertions(+), 228 deletions(-)). Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 1月, 2008 1 次提交
-
-
由 David S. Miller 提交于
Commit 9cd40029 (Fix race between neigh_parms_release and neightbl_fill_parms) introduced device reference counting regressions for several people, see: http://bugzilla.kernel.org/show_bug.cgi?id=9778 for example. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 1月, 2008 1 次提交
-
-
由 Pavel Emelyanov 提交于
The neightbl_fill_parms() is called under the write-locked tbl->lock and accesses the parms->dev. The negh_parm_release() calls the dev_put(parms->dev) without this lock. This creates a tiny race window on which the parms contains potentially stale dev pointer. To fix this race it's enough to move the dev_put() upper under the tbl->lock, but note, that the parms are held by neighbors and thus can live after the neigh_parms_release() is called, so we still can have a parm with bad dev pointer. I didn't find where the neigh->parms->dev is accessed, but still think that putting the dev is to be done in a place, where the parms are really freed. Am I right with that? Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 11月, 2007 1 次提交
-
-
由 Alexey Dobriyan 提交于
neigh_table_init_no_netlink() creates them, but they aren't removed anywhere. Steps to reproduce: modprobe clip rmmod clip cat /proc/net/stat/clip_arp_cache BUG: unable to handle kernel paging request at virtual address f89d7758 printing eip: c05a99da *pdpt = 0000000000004001 *pde = 0000000004408067 *pte = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: atm af_packet ipv6 binfmt_misc sbs sbshc fan dock battery backlight ac power_supply parport loop rtc_cmos rtc_core rtc_lib serio_raw button k8temp hwmon amd_rng sr_mod cdrom shpchp pci_hotplug ehci_hcd ohci_hcd uhci_hcd usbcore Pid: 2082, comm: cat Not tainted (2.6.24-rc1-b1d08ac0-bloat #4) EIP: 0060:[<c05a99da>] EFLAGS: 00210256 CPU: 0 EIP is at neigh_stat_seq_next+0x26/0x3f EAX: 00000001 EBX: f89d7600 ECX: c587bf40 EDX: 00000000 ESI: 00000000 EDI: 00000001 EBP: 00000400 ESP: c587bf1c DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 2082, ti=c587b000 task=c5984e10 task.ti=c587b000) Stack: c06228cc c5313790 c049e5c0 0804f000 c45a7b00 c53137b0 00000000 00000000 00000082 00000001 00000000 00000000 00000000 fffffffb c58d6780 c049e437 c45a7b00 c04b1f93 c587bfa0 00000400 0804f000 00000400 0804f000 c04b1f2f Call Trace: [<c049e5c0>] seq_read+0x189/0x281 [<c049e437>] seq_read+0x0/0x281 [<c04b1f93>] proc_reg_read+0x64/0x77 [<c04b1f2f>] proc_reg_read+0x0/0x77 [<c048907e>] vfs_read+0x80/0xd1 [<c0489491>] sys_read+0x41/0x67 [<c04080fa>] sysenter_past_esp+0x6b/0xc1 ======================= Code: e9 ec 8d 05 00 56 8b 11 53 8b 40 70 8b 58 3c eb 29 0f a3 15 80 91 7b c0 19 c0 85 c0 8d 42 01 74 17 89 c6 c1 fe 1f 89 01 89 71 04 <8b> 83 58 01 00 00 f7 d0 8b 04 90 eb 09 89 c2 83 fa 01 7e d2 31 EIP: [<c05a99da>] neigh_stat_seq_next+0x26/0x3f SS:ESP 0068:c587bf1c Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 10月, 2007 1 次提交
-
-
由 Randy Dunlap 提交于
net/atm/clip.c crashes the kernel if it (module) is loaded, removed, and then loaded again. Its exit call to neigh_table_clear() should destroy the cache after freeing it. Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 10月, 2007 1 次提交
-
-
由 Eric W. Biederman 提交于
- In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary sysctl names for a function that works with proc. - In neighbour.c reorder the table to put the possibly unused entries at the end so we can remove them by terminating the table early. - In neighbour.c kill the entries with questionable binary sysctl handling behavior. - In neighbour.c if we don't have a strategy routine remove the binary path. So we don't the default sysctl strategy routine on data that is not ready for it. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 10月, 2007 1 次提交
-
-
由 Pavel Emelyanov 提交于
The pnigh_lookup is used to lookup proxy entries and to create them in case lookup failed. However, the "creation" code does not perform the re-lookup after GFP_KERNEL allocation. This is done because the code is expected to be protected with the RTNL lock, so add the assertion (mainly to address future questions from new network developers like me :) ). Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 10月, 2007 6 次提交
-
-
由 Stephen Hemminger 提交于
Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Add inline for common usage of hardware header creation, and fix bug in IPV6 mcast where the assumption about negative return is an errno. Negative return from hard_header means not enough space was available,(ie -N bytes). Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to be handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Thomas Graf 提交于
Currently neighbour event notifications are limited to update notifications and only sent if the ARP daemon is enabled. This patch extends the existing notification code by also reporting neighbours being removed due to gc or administratively and removes the dependency on the ARP daemon. This allows to keep track of neighbour states without periodically fetching the complete neighbour table. Signed-off-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Thomas Graf 提交于
Introduces neigh_cleanup_and_release() to be used after a neighbour has been removed from its neighbour table. Serves as preparation to add event notifications. Signed-off-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 8月, 2007 1 次提交
-
-
由 vignesh babu 提交于
Replacing n & (n - 1) for power of 2 check by is_power_of_2(n) Signed-off-by: Nvignesh babu <vignesh.babu@wipro.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 7月, 2007 1 次提交
-
-
由 Paul Mundt 提交于
Slab destructors were no longer supported after Christoph's c59def9f change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
-
- 08 6月, 2007 1 次提交
-
-
由 Patrick McHardy 提交于
Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 4月, 2007 3 次提交
-
-
由 Thomas Graf 提交于
Signed-off-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
The seq_file operations stuff can be marked constant to get it out of dirty cache. Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
For the quite common 'skb->nh.raw - skb->data' sequence. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 4月, 2007 1 次提交
-
-
由 Pavel Emelianov 提交于
Otherwise the following calltrace will lead to a wrong lockdep warning: neigh_proxy_process() `- lock(neigh_table->proxy_queue.lock); arp_redo /* via tbl->proxy_redo */ arp_process neigh_event_ns neigh_update skb_queue_purge `- lock(neighbor->arp_queue.lock); This is not a deadlock actually, as neighbor table's proxy_queue and the neighbor's arp_queue are different queues. Lockdep thinks there is a deadlock as both queues are initialized with skb_queue_head_init() and thus have a common class. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 3月, 2007 1 次提交
-
-
由 Alexey Kuznetsov 提交于
->neigh_destructor() is killed (not used), replaced with ->neigh_cleanup(), which is called when neighbor entry goes to dead state. At this point everything is still valid: neigh->dev, neigh->parms etc. The device should guarantee that dead neighbor entries (neigh->dead != 0) do not get private part initialized, otherwise nobody will cleanup it. I think this is enough for ipoib which is the only user of this thing. Initialization private part of neighbor entries happens in ipib start_xmit routine, which is not reached when device is down. But it would be better to add explicit test for neigh->dead in any case. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 2月, 2007 1 次提交
-
-
由 Eric W. Biederman 提交于
The semantic effect of insert_at_head is that it would allow new registered sysctl entries to override existing sysctl entries of the same name. Which is pain for caching and the proc interface never implemented. I have done an audit and discovered that none of the current users of register_sysctl care as (excpet for directories) they do not register duplicate sysctl entries. So this patch simply removes the support for overriding existing entries in the sys_sysctl interface since no one uses it or cares and it makes future enhancments harder. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NRalf Baechle <ralf@linux-mips.org> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andi Kleen <ak@muc.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Corey Minyard <minyard@acm.org> Cc: Neil Brown <neilb@suse.de> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Jan Kara <jack@ucw.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-