- 06 9月, 2009 2 次提交
-
-
由 Patrick McHardy 提交于
Some schedulers don't support creating, changing or deleting classes. Make the respective callbacks optionally and consistently return -EOPNOTSUPP for unsupported operations, instead of currently either -EOPNOTSUPP, -ENOSYS or no error. In case of sch_prio and sch_multiq, the removed operations additionally checked for an invalid class. This is not necessary since the class argument can only orginate from ->get() or in case of ->change is 0 for creation of new classes, in which case ->change() incorrectly returned -ENOENT. As a side-effect, this patch fixes a possible (root-only) NULL pointer function call in sch_ingress, which didn't implement a so far mandatory ->delete() operation. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Some qdiscs don't support attaching filters. Handle this centrally in cls_api and return a proper errno code (EOPNOTSUPP) instead of EINVAL. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 9月, 2009 1 次提交
-
-
由 Patrick McHardy 提交于
If the parent qdisc doesn't support classes, use EOPNOTSUPP. If the parent class doesn't exist, use ENOENT. Currently EINVAL is returned in both cases. Additionally check whether grafting is supported and remove a now unnecessary graft function from sch_ingress. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 9月, 2009 1 次提交
-
-
由 David S. Miller 提交于
These are full of unresolved problems, mainly that conversions don't work 1-1 from hrtimers to tasklet_hrtimers because unlike hrtimers tasklets can't be killed from softirq context. And when a qdisc gets reset, that's exactly what we need to do here. We'll work this out in the net-next-2.6 tree and if warranted we'll backport that work to -stable. This reverts the following 3 changesets: a2cb6a4d ("pkt_sched: Fix bogon in tasklet_hrtimer changes.") 38acce2d ("pkt_sched: Convert CBQ to tasklet_hrtimer.") ee5f9757 ("pkt_sched: Convert qdisc_watchdog to tasklet_hrtimer") Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 9月, 2009 1 次提交
-
-
由 Stephen Hemminger 提交于
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 8月, 2009 1 次提交
-
-
由 Krishna Kumar 提交于
pfifo_fast_enqueue has this check: if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) { which allows each band to enqueue upto tx_queue_len skbs for a total of 3*tx_queue_len skbs. I am not sure if this was the intention of limiting in qdisc. Patch compiled and 32 simultaneous netperf testing ran fine. Also: # tc -s qdisc show dev eth2 qdisc pfifo_fast 0: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25) rate 0bit 0pps backlog 0b 0p requeues 25 Signed-off-by: NKrishna Kumar <krkumar2@in.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 8月, 2009 1 次提交
-
-
由 Krishna Kumar 提交于
Maintain a per-qdisc bitmap for pfifo_fast giving availability of skbs for each band. This allows faster lookup for a skb when there are no high priority skbs. Also, it helps in (rare) cases when there are no skbs on the list, where an immediate lookup is faster than iterating through the three bands. Signed-off-by: NKrishna Kumar <krkumar2@in.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 8月, 2009 1 次提交
-
-
由 David S. Miller 提交于
Reported by Stephen Rothwell, luckily it's harmless: net/sched/sch_api.c: In function 'qdisc_watchdog': net/sched/sch_api.c:460: warning: initialization from incompatible pointer type net/sched/sch_cbq.c: In function 'cbq_undelay': net/sched/sch_cbq.c:595: warning: initialization from incompatible pointer type Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 8月, 2009 1 次提交
-
-
由 David S. Miller 提交于
This code expects to run in softirq context, and bare hrtimers run in hw IRQ context. Signed-off-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NThomas Gleixner <tglx@linutronix.de>
-
- 23 8月, 2009 1 次提交
-
-
由 David S. Miller 提交于
None of this stuff should execute in hw IRQ context, therefore use a tasklet_hrtimer so that it runs in softirq context. Signed-off-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NThomas Gleixner <tglx@linutronix.de>
-
- 18 8月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
In 5e140dfc "net: reorder struct Qdisc for better SMP performance" the definition of struct gnet_stats_basic changed incompatibly, as copies of this struct are shipped to userland via netlink. Restoring old behavior is not welcome, for performance reason. Fix is to use a private structure for kernel, and teach gnet_stats_copy_basic() to convert from kernel to user land, using legacy structure (struct gnet_stats_basic) Based on a report and initial patch from Michael Spang. Reported-by: NMichael Spang <mspang@csclub.uwaterloo.ca> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 8月, 2009 1 次提交
-
-
由 Krishna Kumar 提交于
dev_queue_xmit enqueue's a skb and calls qdisc_run which dequeue's the skb and xmits it. In most cases, the skb that is enqueue'd is the same one that is dequeue'd (unless the queue gets stopped or multiple cpu's write to the same queue and ends in a race with qdisc_run). For default qdiscs, we can remove the redundant enqueue/dequeue and simply xmit the skb since the default qdisc is work-conserving. The patch uses a new flag - TCQ_F_CAN_BYPASS to identify the default fast queue. The controversial part of the patch is incrementing qlen when a skb is requeued - this is to avoid checks like the second line below: + } else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) && >> !q->gso_skb && + !test_and_set_bit(__QDISC_STATE_RUNNING, &q->state)) { Results of a 2 hour testing for multiple netperf sessions (1, 2, 4, 8, 12 sessions on a 4 cpu system-X). The BW numbers are aggregate Mb/s across iterations tested with this version on System-X boxes with Chelsio 10gbps cards: ---------------------------------- Size | ORG BW NEW BW | ---------------------------------- 128K | 156964 159381 | 256K | 158650 162042 | ---------------------------------- Changes from ver1: 1. Move sch_direct_xmit declaration from sch_generic.h to pkt_sched.h 2. Update qdisc basic statistics for direct xmit path. 3. Set qlen to zero in qdisc_reset. 4. Changed some function names to more meaningful ones. Signed-off-by: NKrishna Kumar <krkumar2@in.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 7月, 2009 1 次提交
-
-
由 Patrick McHardy 提交于
This patch is the result of an automatic spatch transformation to convert all ndo_start_xmit() return values of 0 to NETDEV_TX_OK. Some occurences are missed by the automatic conversion, those will be handled in a seperate patch. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 6月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
commit 2b85a34e (net: No more expensive sock_hold()/sock_put() on each tx) changed initial sk_wmem_alloc value. We need to take into account this offset when reporting sk_wmem_alloc to user, in PROC_FS files or various ioctls (SIOCOUTQ/TIOCOUTQ) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jarek Poplawski 提交于
Action police statistics could be misleading because drops are not shown when expected. With feedback from: Jamal Hadi Salim <hadi@cyberus.ca> Reported-by: NPawel Staszewski <pstaszewski@itcare.pl> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Acked-by: NJamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 6月, 2009 1 次提交
-
-
由 Jarek Poplawski 提交于
Let's use TICKS instead of US, so PSCHED_TICKS2NS and PSCHED_NS2TICKS (like in PSCHED_TICKS_PER_SEC already) to avoid misleading. Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 6月, 2009 1 次提交
-
-
由 Patrick McHardy 提交于
Convert magic values 1 and -1 to NETDEV_TX_BUSY and NETDEV_TX_LOCKED respectively. 0 (NETDEV_TX_OK) is not changed to keep the noise down, except in very few cases where its in direct proximity to one of the other values. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 6月, 2009 2 次提交
-
-
由 Jarek Poplawski 提交于
Use PSCHED_SHIFT constant instead of '10' in PSCHED_US2NS() and PSCHED_NS2US() macros to enable changing this value later. Additionally use PSCHED_SHIFT in sch_hfsc SM_SHIFT and ISM_SHIFT definitions. This part of the patch is based on feedback from Patrick McHardy <kaber@trash.net>. Reported-by: NAntonio Almeida <vexwek@gmail.com> Tested-by: NAntonio Almeida <vexwek@gmail.com> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Minoru Usui 提交于
I found a bug in cls_cgroup_change() in cls_cgroup.c. cls_cgroup_change() expected tca[TCA_OPTIONS] was set from user space properly, but tc in iproute2-2.6.29-1 (which I used) didn't set it. In the current source code of tc in git, it set tca[TCA_OPTIONS]. git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git If we always use a newest iproute2 in git when we use cls_cgroup, we don't face this oops probably. But I think, kernel shouldn't panic regardless of use program's behaviour. Signed-off-by: NMinoru Usui <usui@mxm.nes.nec.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
Define three accessors to get/set dst attached to a skb struct dst_entry *skb_dst(const struct sk_buff *skb) void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb Delete skb->rtable field Setting rtable is not allowed, just set dst instead as rtable is an alias. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 6月, 2009 1 次提交
-
-
由 Minoru Usui 提交于
net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup This patch fixes a bug which unconfigured struct tcf_proto keeps chaining in tc_ctl_tfilter(), and avoids kernel panic in cls_cgroup_classify() when we use cls_cgroup. When we execute 'tc filter add', tcf_proto is allocated, initialized by classifier's init(), and chained. After it's chained, tc_ctl_tfilter() calls classifier's change(). When classifier's change() fails, tc_ctl_tfilter() does not free and keeps tcf_proto. In addition, cls_cgroup is initialized in change() not in init(). It accesses unconfigured struct tcf_proto which is chained before change(), then hits Oops. Signed-off-by: NMinoru Usui <usui@mxm.nes.nec.co.jp> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NJamal Hadi Salim <hadi@cyberus.ca> Tested-by: NMinoru Usui <usui@mxm.nes.nec.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 5月, 2009 1 次提交
-
-
由 Paul Menage 提交于
Avoid reading the unsynchronized value cs->classid multiple times, since it could change concurrently from non-zero to zero; this would result in the classifier returning a positive result with a bogus (zero) classid. Signed-off-by: NPaul Menage <menage@google.com> Reviewed-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 5月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
We would like to get rid of netdev->trans_start = jiffies; that about all net drivers have to use in their start_xmit() function, and use txq->trans_start instead. This can be done generically in core network, as suggested by David. Some devices, (particularly loopback) dont need trans_start update, because they dont have transmit watchdog. We could add a new device flag, or rely on fact that txq->tran_start can be updated is txq->xmit_lock_owner is different than -1. Use a helper function to hide our choice. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 5月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
We can slightly reduce size of teqlN structure, not duplicating stats structure in teql_master but using stats field from net_device.stats for tx_errors and from netdev_queue for tx_bytes/tx_packets/tx_dropped values. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 5月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
It is illegal to dereference a skb after a successful ndo_start_xmit() call. We must store skb length in a local variable instead. Bug was introduced in 2.6.27 by commit 0abf77e5 (net_sched: Add accessor function for packet length for qdiscs) Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 5月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
struct net_device trans_start field is a hot spot on SMP and high performance devices, particularly multi queues ones, because every transmitter dirties it. Is main use is tx watchdog and bonding alive checks. But as most devices dont use NETIF_F_LLTX, we have to lock a netdev_queue before calling their ndo_start_xmit(). So it makes sense to move trans_start from net_device to netdev_queue. Its update will occur on a already present (and in exclusive state) cache line, for free. We can do this transition smoothly. An old driver continue to update dev->trans_start, while an updated one updates txq->trans_start. Further patches could also put tx_bytes/tx_packets counters in netdev_queue to avoid dirtying dev->stats (vlan device comes to mind) Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Li Zefan 提交于
We can remove this lock here, since we are in cgroup write handler and thus the cgrp is guaranteed to be valid, and no lock is needed when writing a u32 variable. Signed-off-by: NLi Zefan <lizf@cn.fujitsuc.com> Acked-by: NPaul Menage <menage@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 5月, 2009 1 次提交
-
-
由 Patrick McHardy 提交于
When no limit is given, the bfifo uses a default of tx_queue_len * mtu. Packets handled by qdiscs include the link layer header, so this should be taken into account, similar to what other qdiscs do. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 5月, 2009 1 次提交
-
-
由 Robert Love 提交于
The kernel should only be using the high 16 bits of a kernel generated priority. Filter priorities in all other cases only use the upper 16 bits of the u32 'prio' field of 'struct tcf_proto', but when the kernel generates the priority of a filter is saves all 32 bits which can result in incorrect lookup failures when a filter needs to be deleted or modified. Signed-off-by: NRobert Love <robert.w.love@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 4月, 2009 1 次提交
-
-
由 Jarek Poplawski 提交于
Alex Sidorenko reported: "while experimenting with 'netem' we have found some strange behaviour. It seemed that ingress delay as measured by 'ping' command shows up on some hosts but not on others. After some investigation I have found that the problem is that skbuff->tstamp field value depends on whether there are any packet sniffers enabled. That is: - if any ptype_all handler is registered, the tstamp field is as expected - if there are no ptype_all handlers, the tstamp field does not show the delay" This patch prevents unnecessary update of tstamp in dev_queue_xmit_nit() on ingress path (with act_mirred) adding a check, so minimal overhead on the fast path, but only when sniffers etc. are active. Since netem at ingress seems to logically emulate a network before a host, tstamp is zeroed to trigger the update and pretend delays are from the outside. Reported-by: NAlex Sidorenko <alexandre.sidorenko@hp.com> Tested-by: NAlex Sidorenko <alexandre.sidorenko@hp.com> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 4月, 2009 1 次提交
-
-
由 Stephen Hemminger 提交于
When vlan acceleration is used on receive, the vlan tag is maintained outside of the skb data. The existing vlan tag match only works on TX path because it uses vlan_get_tag which tests for VLAN_HW_TX_ACCEL. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2009 1 次提交
-
-
由 Ilpo Järvinen 提交于
tcp_sack_swap seems unnecessary so I pushed swap to the caller. Also removed comment that seemed then pointless, and added include when not already there. Compile tested. Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 3月, 2009 1 次提交
-
-
由 Jarek Poplawski 提交于
While looking for a possible reason of bugzilla report on HTB oops: http://bugzilla.kernel.org/show_bug.cgi?id=12858 I found the code in htb_delete calling htb_destroy_class on zero refcount is very misleading: it can suggest this is a common path, and destroy is called under sch_tree_lock. Actually, this can never happen like this because before deletion cops->get() is done, and after delete a class is still used by tclass_notify. The class destroy is always called from cops->put(), so without sch_tree_lock. This doesn't mean much now (since 2.6.27) because all vulnerable calls were moved from htb_destroy_class to htb_delete, but there was a bug in older kernels. The same change is done for other classful scheds, which, it seems, didn't have similar locking problems here. Reported-by: Nm0sia <m0sia@m0sia.ru> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 3月, 2009 1 次提交
-
-
由 Jarek Poplawski 提交于
A commit c1b56878 "tc: policing requires a rate estimator" introduced a test which invalidates previously working configs, based on examples from iproute2: doc/actions/actions-general. This is too rigorous: a rate estimator is needed only when police's "avrate" option is used. Reported-by: NJoao Correia <joaomiguelcorreia@gmail.com> Diagnosed-by: NJohn Dykstra <john.dykstra1@gmail.com> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 2月, 2009 1 次提交
-
-
由 Jarek Poplawski 提交于
drr_change_class lacks a check for NULL of tca[TCA_OPTIONS], so oops is possible. Reported-by: NDenys Fedoryschenko <denys@visp.net.lb> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 2月, 2009 1 次提交
-
-
由 Jarek Poplawski 提交于
Current "RTNETLINK answers: Invalid argument" warning, while trying to add multiq qdisc to non-multiqueue device, isn't very helpful and some of these devs can be changed btw., so let's use a better errno. With feedback from Stephen Hemminger <shemminger@vyatta.com> Reported-by: NBadalian Vyacheslav <slavon@bigtelecom.ru> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 2月, 2009 3 次提交
-
-
由 Jarek Poplawski 提交于
Patrick McHardy <kaber@trash.net> suggested using a workqueue instead of hrtimers to trigger netif_schedule() when there is a problem with setting exact time of this event: 'The differnce - yeah, it shouldn't make much, mainly wake up the qdisc earlier (but not too early) after "too many events" occured _and_ no further enqueue events wake up the qdisc anyways.' Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jarek Poplawski 提交于
Let's get some info on possible config problems. This patch brings back an old warning, but is printed only once now. With feedback from Patrick McHardy <kaber@trash.net> Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jarek Poplawski 提交于
Patrick McHardy <kaber@trash.net> suggested: > How about making this flag and the warning message (in a out-of-line > function) globally available? Other qdiscs (f.i. HFSC) can't deal with > inner non-work-conserving qdiscs as well. This patch uses qdisc->flags field of "suspected" child qdisc. Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-