- 29 1月, 2008 2 次提交
-
-
由 Patrick McHardy 提交于
The IPv4 and IPv6 hook values are identical, yet some code tries to figure out the "correct" value by looking at the address family. Introduce NF_INET_* values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__ section for userspace compatibility. Signed-off-by: NPatrick McHardy <kaber@trash.net> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Many-many code in the kernel initialized the timer->function and timer->data together with calling init_timer(timer). There is already a helper for this. Use it for networking code. The patch is HUGE, but makes the code 130 lines shorter (98 insertions(+), 228 deletions(-)). Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 12月, 2007 1 次提交
-
-
由 Joe Perches 提交于
Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 11月, 2007 1 次提交
-
-
由 Peter P Waskiewicz Jr 提交于
The only qdiscs that check subqueue state before dequeue'ing are PRIO and RR. The other qdiscs, including the default pfifo_fast qdisc, will allow traffic bound for subqueue 0 through to hard_start_xmit. The check for netif_queue_stopped() is done above in pkt_sched.h, so it is unnecessary for qdisc_restart(). However, if the underlying driver is multiqueue capable, and only sets queue states on subqueues, this will allow packets to enter the driver when it's currently unable to process packets, resulting in expensive requeues and driver entries. This patch re-adds the check for the subqueue status before calling hard_start_xmit, so we can try and avoid the driver entry when the queues are stopped. Signed-off-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 11月, 2007 1 次提交
-
-
由 Radu Rendec 提交于
Computing the rank of the first set bit in the hash mask (for using later in u32_hash_fold()) was done with plain C code. Using ffs() instead makes the code more readable and improves performance (since ffs() is better optimized in assembler). Using the conditional operator on hash mask before applying ntohl() also saves one ntohl() call if mask is 0. Signed-off-by: NRadu Rendec <radu.rendec@ines.ro> Signed-off-by: NJarek Poplawski <jarkao2@o2.pl> Acked-by: NJamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 11月, 2007 2 次提交
-
-
由 Radu Rendec 提交于
While trying to implement u32 hashes in my shaping machine I ran into a possible bug in the u32 hash/bucket computing algorithm (net/sched/cls_u32.c). The problem occurs only with hash masks that extend over the octet boundary, on little endian machines (where htonl() actually does something). Let's say that I would like to use 0x3fc0 as the hash mask. This means 8 contiguous "1" bits starting at b6. With such a mask, the expected (and logical) behavior is to hash any address in, for instance, 192.168.0.0/26 in bucket 0, then any address in 192.168.0.64/26 in bucket 1, then 192.168.0.128/26 in bucket 2 and so on. This is exactly what would happen on a big endian machine, but on little endian machines, what would actually happen with current implementation is 0x3fc0 being reversed (into 0xc03f0000) by htonl() in the userspace tool and then applied to 192.168.x.x in the u32 classifier. When shifting right by 16 bits (rank of first "1" bit in the reversed mask) and applying the divisor mask (0xff for divisor 256), what would actually remain is 0x3f applied on the "168" octet of the address. One could say is this can be easily worked around by taking endianness into account in userspace and supplying an appropriate mask (0xfc03) that would be turned into contiguous "1" bits when reversed (0x03fc0000). But the actual problem is the network address (inside the packet) not being converted to host order, but used as a host-order value when computing the bucket. Let's say the network address is written as n31 n30 ... n0, with n0 being the least significant bit. When used directly (without any conversion) on a little endian machine, it becomes n7 ... n0 n8 ..n15 etc in the machine's registers. Thus bits n7 and n8 would no longer be adjacent and 192.168.64.0/26 and 192.168.128.0/26 would no longer be consecutive. The fix is to apply ntohl() on the hmask before computing fshift, and in u32_hash_fold() convert the packet data to host order before shifting down by fshift. With helpful feedback from Jamal Hadi Salim and Jarek Poplawski. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Evgeniy Polyakov 提交于
tecl_reset() is called from deactivate and qdisc is set to noop already, but subsequent teql_xmit does not know about it and dereference private data as teql qdisc and thus oopses. not catch it first :) Signed-off-by: NEvgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 10月, 2007 1 次提交
-
-
由 Jamal Hadi Salim 提交于
clean skb_clone of any signs of CONFIG_NET_CLS_ACT and have mirred us skb_act_clone() Signed-off-by: NJamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 10月, 2007 1 次提交
-
-
由 Pavel Emelyanov 提交于
Fix one more user of netiff_subqueue_stopped. To check for the queue id one must use the __netiff_subqueue_stoped call. This run out of my sight when I made the: 668f895a [NET]: Hide the queue_mapping field inside netif_subqueue_stopped commit :( Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 10月, 2007 2 次提交
-
-
由 Pavel Emelyanov 提交于
Many places get the queue_mapping field from skb to pass it to the netif_subqueue_stopped() which will be 0 in any case. Make the helper that works with sk_buff Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Make the helper for getting the field, symmetrical to the "set" one. Return 0 if CONFIG_NETDEVICES_MULTIQUEUE=n Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 10月, 2007 1 次提交
-
-
由 Robert P. J. Day 提交于
Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: NRobert P. J. Day <rpjday@mindspring.com> Signed-off-by: NAdrian Bunk <bunk@kernel.org>
-
- 19 10月, 2007 2 次提交
-
-
由 Herbert Xu 提交于
The function dev_deactivate is supposed to only return when all outstanding transmissions have completed. Unfortunately it is possible for store operations in the driver's transmit function to only become visible after dev_deactivate returns. This patch fixes this by taking the queue lock after we see the end of the queue run. This ensures that all effects of any previous transmit calls are visible. If however we detect that there is another queue run occuring, then we'll warn about it because this should never happen as we have pointed dev->qdisc to noop_qdisc within the same queue lock earlier in the functino. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Randy Dunlap 提交于
Convert "QoS and/or fair queueing" to menuconfig. This makes it easy for someone to disable all sub-options with one config symbol. Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 10月, 2007 1 次提交
-
-
由 Jeff Garzik 提交于
While looking at a net driver with the following construct, if (!netif_carrier_ok(dev)) netif_carrier_on(dev); it stuck me that the netif_carrier_ok() check was redundant, since netif_carrier_on() checks bit __LINK_STATE_NOCARRIER anyway. This is the same reason why netif_queue_stopped() need not be called prior to netif_wake_queue(). This is true, but there is however an unwanted side effect from assuming that netif_carrier_on() can be called multiple times: it touches the watchdog, regardless of pre-existing carrier state. The fix: move watchdog-up inside the bit-cleared code path. Signed-off-by: NJeff Garzik <jgarzik@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 10月, 2007 1 次提交
-
-
由 Herbert Xu 提交于
With all the users of the double pointers removed, this patch mops up by finally replacing all occurances of sk_buff ** in the netfilter API by sk_buff *. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 10月, 2007 11 次提交
-
-
由 Patrick McHardy 提交于
The fourth parameter of /proc/net/psched is supposed to show the timer resultion and is used by HTB userspace to calculate the necessary burst rate. Currently we show the clock resolution, which results in a too low burst rate when the two differ. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Fix a bunch of sparse warnings. Mostly about 0 used as NULL pointer, and shadowed variable declarations. One notable case was that hash size should have been unsigned. Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Add inline for common usage of hardware header creation, and fix bug in IPV6 mcast where the assumption about negative return is an errno. Negative return from hard_header means not enough space was available,(ie -N bytes). Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jamal Hadi Salim 提交于
For N cpus, with full throttle traffic on all N CPUs, funneling traffic to the same ethernet device, the devices queue lock is contended by all N CPUs constantly. The TX lock is only contended by a max of 2 CPUS. In the current mode of operation, after all the work of entering the dequeue region, we may endup aborting the path if we are unable to get the tx lock and go back to contend for the queue lock. As N goes up, this gets worse. The changes in this patch result in a small increase in performance with a 4CPU (2xdual-core) with no irq binding. Both e1000 and tg3 showed similar behavior; Signed-off-by: NJamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ralf Baechle 提交于
It's been a useless no-op for long enough in 2.6 so I figured it's time to remove it. The number of people that could object because they're maintaining unified 2.4 and 2.6 drivers is probably rather small. [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ] Signed-off-by: NRalf Baechle <ralf@linux-mips.org> Signed-off-by: NJeff Garzik <jeff@garzik.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
Change L2T (length to time) macros, in all rate based schedulers, to call a common function qdisc_l2t() that does the rate table lookup. This function handles if the packet size lookup is larger than the rate table, which often occurs with TSO enabled. Signed-off-by: NJesper Dangaard Brouer <hawk@comx.dk> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to be handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Several devices have multiple independant RX queues per net device, and some have a single interrupt doorbell for several queues. In either case, it's easier to support layouts like that if the structure representing the poll is independant from the net device itself. The signature of the ->poll() call back goes from: int foo_poll(struct net_device *dev, int *budget) to int foo_poll(struct napi_struct *napi, int budget) The caller is returned the number of RX packets processed (or the number of "NAPI credits" consumed if you want to get abstract). The callee no longer messes around bumping dev->quota, *budget, etc. because that is all handled in the caller upon return. The napi_struct is to be embedded in the device driver private data structures. Furthermore, it is the driver's responsibility to disable all NAPI instances in it's ->stop() device close handler. Since the napi_struct is privatized into the driver's private data structures, only the driver knows how to get at all of the napi_struct instances it may have per-device. With lots of help and suggestions from Rusty Russell, Roland Dreier, Michael Chan, Jeff Garzik, and Jamal Hadi Salim. Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra, Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan. [ Ported to current tree and all drivers converted. Integrated Stephen's follow-on kerneldoc additions, and restored poll_list handling to the old style to fix mutual exclusion issues. -DaveM ] Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 10月, 2007 1 次提交
-
-
由 Stephen Hemminger 提交于
Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 10月, 2007 1 次提交
-
-
由 Alexey Kuznetsov 提交于
This is followup to Patrick's patch. A little optimization to enqueue routine allows to remove artificial limitation on queue length. Plus, testing showed that hash function used by SFQ is too bad or even worse. It does not even sweep the whole range of hash values. Switched to Jenkins' hash. Signed-off-by: NAlexey Kuznetsov <kaber@ms2.inr.ac.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 9月, 2007 1 次提交
-
-
由 Alexey Kuznetsov 提交于
Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 9月, 2007 1 次提交
-
-
由 Satyam Sharma 提交于
net/sched/sch_cbq.c: In function 'cbq_enqueue': net/sched/sch_cbq.c:383: warning: 'ret' may be used uninitialized in this function has been verified to be a bogus case. So let's shut it up. Signed-off-by: NSatyam Sharma <satyam@infradead.org> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 9月, 2007 1 次提交
-
-
由 Jamal Hadi Salim 提交于
(with no apologies to C Heston) On Mon, 2007-10-09 at 21:00 +0800, Herbert Xu wrote: On Sun, Sep 02, 2007 at 01:11:29PM +0000, Christian Kujau wrote: > > > > after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep > > was quite noisy when I tried to shape my external (wireless) interface: > > > > [ 6400.534545] FahCore_78.exe/3552 just changed the state of lock: > > [ 6400.534713] (&dev->ingress_lock){-+..}, at: [<c038d595>] > > netif_receive_skb+0x2d5/0x3c0 > > [ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the > > past: > > [ 6400.535145] (police_lock){-.--} > > This is a genuine dead-lock. The police lock can be taken > for reading with softirqs on. If a second CPU tries to take > the police lock for writing, while holding the ingress lock, > then a softirq on the first CPU can dead-lock when it tries > to get the ingress lock. Signed-off-by: NJamal Hadi Salim <hadi@cyberus.ca> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 8月, 2007 1 次提交
-
-
由 Lucas Nussbaum 提交于
When CONFIG_NET_CLS_ACT is enabled, tc_classify() is called twice in prio_classify(). This causes "interesting" behaviour: with the setup below, packets are duplicated, sent twice to ifb0, and then loop in and out of ifb0. The patch uses the previously calculated return value in the switch, which is probably what Patrick had in mind in commit bdba91ec -- maybe Patrick can double-check this? -- example setup -- ifconfig ifb0 up tc qdisc add dev ifb0 root netem delay 2s tc qdisc add dev $ETH root handle 1: prio tc filter add dev $ETH parent 1: protocol ip prio 10 u32 \ match ip dst 172.24.110.6/32 flowid 1:1 \ action mirred egress redirect dev ifb0 ping -c1 172.24.110.6 Signed-off-by: NLucas Nussbaum <lucas.nussbaum@imag.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 8月, 2007 1 次提交
-
-
由 Jesper Juhl 提交于
This patch cleans up duplicate includes in net/sched/ Signed-off-by: NJesper Juhl <jesper.juhl@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 7月, 2007 3 次提交
-
-
由 Peter P Waskiewicz Jr 提交于
Fix the check in prio_tune() to see if sch->parent is TC_H_ROOT instead of sch->handle to load or reject the qdisc for multiqueue devices. Signed-off-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Fix sch_api to correctly set sch->parent for both ingress and egress qdiscs in qdisc_create(). Signed-off-by: NPatrick McHardy <trash@kaber.net> Signed-off-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Fix handling of empty or completely non-matching filter chains. In that case -1 is returned and tcf_result is uninitialized, the qdisc should fall back to default classification in that case. Noticed by PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 7月, 2007 2 次提交
-
-
由 Gabriel Craciunescu 提交于
Signed-off-by: NGabriel Craciunescu <nix.or.die@googlemail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 vignesh babu 提交于
Signed-off-by: Nvignesh babu <vignesh.babu@wipro.com> Signed-off-by: Nchas williams <chas@cmf.nrl.navy.mil> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 7月, 2007 1 次提交
-
-
由 Patrick McHardy 提交于
The NET_CLS_ACT option is now a full replacement for NET_CLS_POLICE, remove the old code. The config option will be kept around to select the equivalent NET_CLS_ACT options for a short time to allow easier upgrades. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-