Commits · ead81cc5fc6d996db6afb20f211241612610a07a · openanolis / cloud-kernel

18 Jul, 2008 4 commits

netdevice: Move qdisc_list back into net_device proper. · ead81cc5
Authored by David S. Miller on Jul 17, 2008
```
And give it its own lock.
Signed-off-by: David S. Miller <davem@davemloft.net>
```
ead81cc5

pkt_sched: Schedule qdiscs instead of netdev_queue. · 37437bb2

Authored by David S. Miller on Jul 16, 2008

When we have shared qdiscs, packets come out of the qdiscs
for multiple transmit queues.

Therefore it doesn't make any sense to schedule the transmit
queue when logically we cannot know ahead of time the TX
queue of the SKB that the qdisc->dequeue() will give us.

Just for sanity I added a BUG check to make sure we never
get into a state where the noop_qdisc is scheduled.
Signed-off-by: David S. Miller <davem@davemloft.net>

37437bb2

pkt_sched: Add and use qdisc_root() and qdisc_root_lock(). · 7698b4fc

Authored by David S. Miller on Jul 16, 2008

When code wants to lock the qdisc tree state, the logical
operation it's doing is locking the top-level qdisc that
sits at the root of the netdev_queue.

Add qdisc_root_lock() to represent this and convert the
easiest cases.

In order for this to work out in all cases, we have to
hook up the noop_qdisc to a dummy netdev_queue.
Signed-off-by: David S. Miller <davem@davemloft.net>

7698b4fc

netdev: Allocate multiple queues for TX. · e8a0464c

Authored by David S. Miller on Jul 17, 2008

alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.

Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces.  This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.

Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.
Signed-off-by: David S. Miller <davem@davemloft.net>

e8a0464c
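
The per-device TX queue array described in e8a0464c can be pictured with a small userspace sketch. This is not the kernel code: the `_sketch` names, the `tx` field, and the simplified structures are assumptions made for illustration, and only the callback shape `f(dev, queue, arg)` is meant to mirror how the iteration helper is used.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified userspace stand-ins; the real kernel structures carry far more state. */
struct net_device;

struct netdev_queue {
    struct net_device *dev;          /* back-pointer to the owning device */
};

struct net_device {
    struct netdev_queue *tx;         /* hypothetical field name: array of TX queues */
    unsigned int num_tx_queues;
};

/* Allocate a device together with queue_count TX queues (0 is rejected). */
struct net_device *alloc_netdev_mq_sketch(unsigned int queue_count)
{
    struct net_device *dev;

    if (queue_count == 0)
        return NULL;
    dev = calloc(1, sizeof(*dev));
    if (!dev)
        return NULL;
    dev->tx = calloc(queue_count, sizeof(*dev->tx));
    if (!dev->tx) {
        free(dev);
        return NULL;
    }
    dev->num_tx_queues = queue_count;
    for (unsigned int i = 0; i < queue_count; i++)
        dev->tx[i].dev = dev;
    return dev;
}

/* Single access point for one TX queue, so every user is easy to grep for. */
struct netdev_queue *netdev_get_tx_queue_sketch(const struct net_device *dev,
                                                unsigned int index)
{
    assert(index < dev->num_tx_queues);
    return &dev->tx[index];
}

/* Visit every TX queue of the device with a caller-supplied callback. */
void netdev_for_each_tx_queue_sketch(struct net_device *dev,
        void (*f)(struct net_device *, struct netdev_queue *, void *),
        void *arg)
{
    for (unsigned int i = 0; i < dev->num_tx_queues; i++)
        f(dev, &dev->tx[i], arg);
}

/* Example callback: counts the queues it is handed. */
void count_queue_sketch(struct net_device *dev, struct netdev_queue *txq, void *arg)
{
    (void)dev;
    (void)txq;
    (*(unsigned int *)arg)++;
}

void free_netdev_sketch(struct net_device *dev)
{
    free(dev->tx);
    free(dev);
}
```

A caller that still hard-codes index 0 in `netdev_get_tx_queue_sketch(dev, 0)` is the kind of single-queue assumption the commit message suggests grepping for.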

09 Jul, 2008 9 commits

netdev: Make netif_schedule() routines work with netdev_queue objects. · 86d804e1

Authored by David S. Miller on Jul 08, 2008

Only plain netif_schedule() remains taking a net_device, mostly as a
compatibility item while we transition the rest of these interfaces.

Everything else calls netif_schedule_queue() or __netif_schedule(),
both of which take a netdev_queue pointer.
Signed-off-by: David S. Miller <davem@davemloft.net>

86d804e1

pkt_sched: Kill stats_lock member of struct Qdisc. · 68dfb427
Authored by David S. Miller on Jul 08, 2008
```
It is always equal to qdisc->dev_queue->lock.
Signed-off-by: David S. Miller <davem@davemloft.net>
```
68dfb427

netdev: Kill qdisc_ingress, use netdev->rx_queue.qdisc instead. · 816f3258

Authored by David S. Miller on Jul 08, 2008

Now that our qdisc management is bi-directional, per-queue, and fully
orthogonal, there is no reason to have a special ingress qdisc pointer
in struct net_device.
Signed-off-by: David S. Miller <davem@davemloft.net>

816f3258

netdev: Move rest of qdisc state into struct netdev_queue · b0e1e646
Authored by David S. Miller on Jul 08, 2008
```
Now qdisc, qdisc_sleeping, and qdisc_list also live there.
Signed-off-by: David S. Miller <davem@davemloft.net>
```
b0e1e646

netdev: The ingress_lock member is no longer needed. · 555353cf

Authored by David S. Miller on Jul 08, 2008

Every qdisc is associated with a queue, and in the case of ingress
qdiscs that will now be netdev->rx_queue, so using that queue's lock is
the thing to do.
Signed-off-by: David S. Miller <davem@davemloft.net>

555353cf

netdev: Move queue_lock into struct netdev_queue. · dc2b4847

Authored by David S. Miller on Jul 08, 2008

The lock is now an attribute of the device queue.

One thing to notice is that "suspicious" places
emerge which will need specific training about
multiple queue handling.  They are so marked with
explicit "netdev->rx_queue" and "netdev->tx_queue"
references.
Signed-off-by: David S. Miller <davem@davemloft.net>

dc2b4847

pkt_sched: Remove 'dev' member of struct Qdisc. · 5ce2d488

Authored by David S. Miller on Jul 08, 2008

It can be obtained via the netdev_queue. So create a helper routine,
qdisc_dev(), to make the transformations nicer looking.

Now qdisc_alloc() no longer needs a net_device pointer argument.
Signed-off-by: David S. Miller <davem@davemloft.net>

5ce2d488
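
The resulting `net_device --> netdev_queue --> Qdisc` chain, and the role of a `qdisc_dev()`-style helper, can be sketched as follows. The structures are stripped-down stand-ins and `qdisc_dev_sketch` is a hypothetical name; only the `dev_queue` back-pointer is taken from the commit messages on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins: each level only keeps the pointer needed for the walk. */
struct net_device {
    const char *name;
};

struct netdev_queue {
    struct net_device *dev;          /* queue -> owning device */
};

struct Qdisc {
    struct netdev_queue *dev_queue;  /* qdisc -> queue it manages */
};

/*
 * With the 'dev' member gone from struct Qdisc, the device is reached
 * through the queue. Wrapping the two-step dereference in one helper
 * keeps call sites short and easy to convert.
 */
struct net_device *qdisc_dev_sketch(const struct Qdisc *qdisc)
{
    assert(qdisc != NULL && qdisc->dev_queue != NULL);
    return qdisc->dev_queue->dev;
}
```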

netdev: Create netdev_queue abstraction. · bb949fbd

Authored by David S. Miller on Jul 08, 2008

A netdev_queue is an entity managed by a qdisc.

Currently there is one RX and one TX queue, and a netdev_queue merely
contains a backpointer to the net_device.

The Qdisc struct is augmented with a netdev_queue pointer as well.

Eventually the 'dev' Qdisc member will go away and we will have the
resulting hierarchy:

	net_device --> netdev_queue --> Qdisc

Also, qdisc_alloc() and qdisc_create_dflt() now take a netdev_queue
pointer argument.
Signed-off-by: David S. Miller <davem@davemloft.net>

bb949fbd

pkt_sched: Remove comment reference to old style TX locking. · e65d22e1
Authored by David S. Miller on Jul 08, 2008
```
We haven't had netdev->tbusy in many years :)
Signed-off-by: David S. Miller <davem@davemloft.net>
```
e65d22e1

06 Jul, 2008 1 commit

net-sched: add dynamically sized qdisc class hash helpers · 6fe1c7a5

Authored by Patrick McHardy on Jul 05, 2008

Currently all qdiscs that allow creating classes use a fixed-size hash
table with size 16 to hash the classes. This causes a large bottleneck
when using thousands of classes and unbound filters.

Add helpers for dynamically sized class hashes to fix this. The following
patches will convert the qdiscs to use them.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

6fe1c7a5
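
To illustrate why a growable table removes the bottleneck, here is a minimal userspace sketch of a chained hash that doubles its bucket array as classes are added. All names, the hash function, and the 3/4 load-factor threshold are assumptions for illustration, not the helpers the commit actually adds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct class_node {
    uint32_t classid;
    struct class_node *next;         /* chain within one bucket */
};

struct class_hash {
    struct class_node **buckets;
    unsigned int size;               /* bucket count, always a power of two */
    unsigned int elems;              /* number of stored classes */
};

/* Fold the class id and mask it down to a bucket index (assumed hash). */
static unsigned int class_hash_index(uint32_t classid, unsigned int mask)
{
    classid ^= (classid >> 8) ^ (classid >> 4);
    return classid & mask;
}

int class_hash_init(struct class_hash *h, unsigned int size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    h->buckets = calloc(size, sizeof(*h->buckets));
    if (!h->buckets)
        return -1;
    h->size = size;
    h->elems = 0;
    return 0;
}

/* Double the table and rehash once the load factor exceeds 3/4. */
void class_hash_grow(struct class_hash *h)
{
    unsigned int nsize = h->size * 2;
    struct class_node **nb;

    if (h->elems * 4 <= h->size * 3)
        return;
    nb = calloc(nsize, sizeof(*nb));
    if (!nb)
        return;                      /* keep the old table on failure */
    for (unsigned int i = 0; i < h->size; i++) {
        struct class_node *n = h->buckets[i];
        while (n) {
            struct class_node *next = n->next;
            unsigned int idx = class_hash_index(n->classid, nsize - 1);
            n->next = nb[idx];
            nb[idx] = n;
            n = next;
        }
    }
    free(h->buckets);
    h->buckets = nb;
    h->size = nsize;
}

void class_hash_insert(struct class_hash *h, struct class_node *n)
{
    unsigned int idx = class_hash_index(n->classid, h->size - 1);

    n->next = h->buckets[idx];
    h->buckets[idx] = n;
    h->elems++;
    class_hash_grow(h);
}

struct class_node *class_hash_find(const struct class_hash *h, uint32_t classid)
{
    struct class_node *n = h->buckets[class_hash_index(classid, h->size - 1)];

    for (; n; n = n->next)
        if (n->classid == classid)
            return n;
    return NULL;
}
```

With a fixed 16-bucket table, thousands of classes would mean chains hundreds of entries long; here the average chain length stays below one entry per bucket after each growth step.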

02 Jul, 2008 1 commit

net-sched: change tcf_destroy_chain() to clear start of filter list · ff31ab56

Authored by Patrick McHardy on Jul 01, 2008

Pass double tcf_proto pointers to tcf_destroy_chain() to make it
clear the start of the filter list for more consistency.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff31ab56
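
The point of passing a double pointer is that the destroy routine itself can leave the caller's list head empty. A minimal sketch, assuming a simplified singly linked filter type and hypothetical `_sketch` names:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a filter: only the chain link matters here. */
struct tcf_proto_sketch {
    struct tcf_proto_sketch *next;
};

/* Prepend a freshly allocated filter; returns 0 on success, -1 on failure. */
int tcf_push_sketch(struct tcf_proto_sketch **fl)
{
    struct tcf_proto_sketch *tp = malloc(sizeof(*tp));

    if (!tp)
        return -1;
    tp->next = *fl;
    *fl = tp;
    return 0;
}

/*
 * Destroy every filter on the chain. Because the head is updated through
 * the double pointer on each step, *fl is NULL when the loop ends and the
 * caller cannot be left holding a dangling head.
 */
void tcf_destroy_chain_sketch(struct tcf_proto_sketch **fl)
{
    struct tcf_proto_sketch *tp;

    assert(fl != NULL);
    while ((tp = *fl) != NULL) {
        *fl = tp->next;
        free(tp);
    }
}
```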

15 Apr, 2008 1 commit

[NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop · 066a3b5b

Authored by Jarek Poplawski on Apr 14, 2008

TC_H_MAJ(parentid) for root classes is the same as for ingress, and if
an ingress qdisc has been created, qdisc_lookup() returns its pointer
(without ingress, NULL is returned). After this, all qdisc_lookup() calls
return the same qdisc, and we get an endless loop. (I don't know how this
could hide for so long - it should trigger with every leaf class deleted
if its qdisc isn't empty.)

After this fix qdisc_lookup() is omitted both for ingress and root
parents, but looking for root is only wasting a little time here...
Many thanks to Enrico Demarin for finding a test for catching this
bug, which probably bothered quite a lot of admins.

Reported-by: Enrico Demarin <enrico@superclick.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

066a3b5b
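
The termination condition described in 066a3b5b can be modelled in userspace. The `TC_H_*` values below match the kernel's packet scheduler header; the flat lookup table, the `_sketch` names, and the struct layout are assumptions made so the loop can run outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TC_H_MAJ_MASK 0xFFFF0000U
#define TC_H_MAJ(h)   ((h) & TC_H_MAJ_MASK)
#define TC_H_ROOT     0xFFFFFFFFU
#define TC_H_INGRESS  0xFFFFFFF1U

struct qdisc_sketch {
    uint32_t handle;     /* major:0 handle of this qdisc */
    uint32_t parent;     /* class id it is attached to, or ROOT/INGRESS */
    unsigned int qlen;
};

/* Stand-in for qdisc_lookup(): linear search by handle. */
struct qdisc_sketch *lookup_sketch(struct qdisc_sketch *tbl, size_t cnt,
                                   uint32_t handle)
{
    for (size_t i = 0; i < cnt; i++)
        if (tbl[i].handle == handle)
            return &tbl[i];
    return NULL;
}

/*
 * Walk from 'sch' up to the root, subtracting n from every ancestor.
 * TC_H_ROOT and TC_H_INGRESS share the major ffff:, so one comparison
 * stops the walk for both. Without it, a lookup of ffff: would return
 * the ingress qdisc, whose parent has the same major, forever.
 */
void tree_decrease_qlen_sketch(struct qdisc_sketch *tbl, size_t cnt,
                               struct qdisc_sketch *sch, unsigned int n)
{
    uint32_t parentid;

    assert(sch != NULL);
    if (n == 0)
        return;
    while ((parentid = sch->parent) != 0) {
        if (TC_H_MAJ(parentid) == TC_H_MAJ(TC_H_INGRESS))
            return;
        sch = lookup_sketch(tbl, cnt, TC_H_MAJ(parentid));
        if (!sch)
            return;
        sch->qlen -= n;
    }
}
```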

26 Mar, 2008 1 commit

[NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS. · 3b1e0a65

Authored by YOSHIFUJI Hideaki on Mar 26, 2008

Introduce per-sock inlines: sock_net(), sock_net_set()
and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

3b1e0a65
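
The idea behind the accessor inlines can be sketched as below: when `CONFIG_NET_NS` is not defined, the per-socket pointer disappears and the getter returns a compile-time constant address that the compiler can fold. The struct layout and `_sketch` names are hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct net {
    int id;
};

/* The initial namespace: the only one that exists without CONFIG_NET_NS. */
struct net init_net = { 0 };

struct sock_sketch {
#ifdef CONFIG_NET_NS
    struct net *sk_net;              /* only present when namespaces are built in */
#endif
    int placeholder;                 /* stands in for the rest of the socket */
};

static inline struct net *sock_net_sketch(const struct sock_sketch *sk)
{
    assert(sk != NULL);
#ifdef CONFIG_NET_NS
    return sk->sk_net;
#else
    return &init_net;                /* constant: callers can be optimized */
#endif
}

static inline void sock_net_set_sketch(struct sock_sketch *sk, struct net *net)
{
    assert(sk != NULL);
#ifdef CONFIG_NET_NS
    sk->sk_net = net;
#else
    (void)net;                       /* nothing to store without namespaces */
#endif
}
```

Built without `-DCONFIG_NET_NS`, the setter is a no-op and every socket reports `&init_net`; built with it, the pair behaves like an ordinary getter and setter.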

29 Jan, 2008 7 commits

[NET_SCHED]: sch_api: introduce constant for rate table size · 5feb5e1a
Authored by Patrick McHardy on Jan 23, 2008
```
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
5feb5e1a

[NET_SCHED]: Use NLA_PUT_STRING for string dumping · 57e1c487
Authored by Patrick McHardy on Jan 23, 2008
```
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
57e1c487

[NET_SCHED]: Convert packet schedulers from rtnetlink to new netlink API · 1e90474c

Authored by Patrick McHardy on Jan 22, 2008

Convert packet schedulers to use the netlink API. Unfortunately a gradual
conversion is not possible without breaking compilation in the middle or
adding lots of casts, so this patch converts them all in one step. The
patch has been mostly generated automatically with some minor edits to
at least allow separate conversion of classifiers and actions.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

1e90474c

[NET_SCHED]: Move EXPORT_SYMBOL next to exported symbol · 62e3ba1b
Authored by Patrick McHardy on Jan 22, 2008
```
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
62e3ba1b

[NET]: Make rtnetlink infrastructure network namespace aware (v3) · 97c53cac

Authored by Denis V. Lunev on Nov 19, 2007

After this patch none of the netlink callbacks support anything
except the initial network namespace, but the rtnetlink infrastructure
now handles multiple network namespaces.

Changes from v2:
- IPv6 addrlabel processing

Changes from v1:
- no need for special rtnl_unlock handling
- fixed IPv6 ndisc
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

97c53cac

[NET]: Modify all rtnetlink methods to only work in the initial namespace (v2) · b854272b

Authored by Denis V. Lunev on Dec 01, 2007

Before I can enable rtnetlink to work in all network namespaces I need
to be certain that something won't break.  So this patch deliberately
disables all of the rtnetlink methods in everything except the
initial network namespace.  After the methods have been audited this
extra check can be disabled.

Changes from v1:
- added IPv6 addrlabel protection
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

b854272b

[NET]: Move Qdisc_class_ops and Qdisc_ops in appropriate sections. · 20fea08b

Authored by Eric Dumazet on Nov 14, 2007

Qdisc_class_ops are const, and Qdisc_ops are mostly read.

Using "const" and "__read_mostly" qualifiers helps to reduce false
sharing.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20fea08b

11 Oct, 2007 3 commits

[NET_SCHED]: Show timer resolution instead of clock resolution in /proc/net/psched · 3c0cfc13

Authored by Patrick McHardy on Oct 10, 2007

The fourth parameter of /proc/net/psched is supposed to show the timer
resolution and is used by HTB userspace to calculate the necessary
burst rate. Currently we show the clock resolution, which results in a
too low burst rate when the two differ.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c0cfc13

[NET]: Make the device list and device lookups per namespace. · 881d966b

Authored by Eric W. Biederman on Sep 17, 2007

This patch makes most of the generic device layer network
namespace safe.  This patch makes dev_base_head a
network namespace variable, and then it picks up
a few associated variables.  The functions:
dev_getbyhwaddr
dev_getfirsthwbytype
dev_get_by_flags
dev_get_by_name
__dev_get_by_name
dev_get_by_index
__dev_get_by_index
dev_ioctl
dev_ethtool
dev_load
wireless_process_ioctl

were modified to take a network namespace argument, and
deal with it.

vlan_ioctl_set and brioctl_set were modified so their
hooks will receive a network namespace argument.

So basically anything in the core of the network stack that was
affected by the change of dev_base was modified to handle
multiple network namespaces.  The rest of the network stack was
simply modified to explicitly use &init_net, the initial network
namespace.  This can be fixed when those components of the network
stack are modified to handle multiple network namespaces.

For now the ifindex generator is left global.

Fundamentally ifindex numbers are per namespace, or else
we will have corner case problems with migration when
we get that far.

At the same time there are assumptions in the network stack
that the ifindex of a network device won't change.  Making
the ifindex number global seems a good compromise until
the network stack can cope with ifindex changes when
you change namespaces, and the like.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

881d966b

[NET]: Make /proc/net per network namespace · 457c4cbc

Authored by Eric W. Biederman on Sep 12, 2007

This patch makes /proc/net per network namespace. It modifies the global
variables proc_net and proc_net_stat to be per network namespace.
The proc_net file helpers are modified to take a network namespace argument,
and all of their callers are fixed to pass &init_net for that argument.
This ensures that all of the /proc/net files are only visible and
usable in the initial network namespace until the code behind them
has been updated to handle multiple network namespaces.

Making /proc/net per namespace is necessary as at least some files
in /proc/net depend upon the set of network devices which is per
network namespace, and even more files in /proc/net have contents
that are relevant to a single network namespace.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

457c4cbc

31 Jul, 2007 1 commit

[NET]: Fix sch_api to properly set sch->parent on the root. · ffc8fefa

Authored by Patrick McHardy on Jul 30, 2007

Fix sch_api to correctly set sch->parent for both ingress and egress
qdiscs in qdisc_create().
Signed-off-by: Patrick McHardy <trash@kaber.net>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ffc8fefa

15 Jul, 2007 2 commits

[NET_SCHED]: act_api: qdisc internal reclassify support · 73ca4918

Authored by Patrick McHardy on Jul 15, 2007

The behaviour of NET_CLS_POLICE for TC_POLICE_RECLASSIFY was to return
it to the qdisc, which could handle it internally or ignore it. With
NET_CLS_ACT however, tc_classify starts over at the first classifier
and never returns it to the qdisc. This makes it impossible to support
qdisc-internal reclassification, which in turn makes it impossible to
remove the old NET_CLS_POLICE code without breaking compatibility since
we have two qdiscs (CBQ and ATM) that support this.

This patch adds a tc_classify_compat function that handles
reclassification the old way and changes CBQ and ATM to use it.

This again is of course not fully backwards compatible with the previous
NET_CLS_ACT behaviour. Unfortunately there is no way to fully maintain
compatibility *and* support qdisc internal reclassification with
NET_CLS_ACT, but this seems like the better choice over keeping the two
incompatible options around forever.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

73ca4918

[NET_SCHED]: Revert "avoid transmit softirq on watchdog wakeup" optimization · 0621ed2e

Authored by Patrick McHardy on Jul 14, 2007

As noticed by Ranko Zivojnovic <ranko@spidernet.net>, calling qdisc_run
from the timer handler can result in deadlock:

> CPU#0
>
> qdisc_watchdog() fires and gets dev->queue_lock
> qdisc_run()...qdisc_restart()...
> -> releases dev->queue_lock and enters dev_hard_start_xmit()
>
> CPU#1
>
> tc del qdisc dev ...
> qdisc_graft()...dev_graft_qdisc()...dev_deactivate()...
> -> grabs dev->queue_lock ...
>
> qdisc_reset()...{cbq,hfsc,htb,netem,tbf}_reset()...qdisc_watchdog_cancel()...
> -> hrtimer_cancel() - waiting for the qdisc_watchdog() to exit, while still
>		        holding dev->queue_lock
>
> CPU#0
>
> dev_hard_start_xmit() returns ...
> -> wants to get dev->queue_lock(!)
>
> DEADLOCK!

The entire optimization is a bit questionable IMO: it moves potentially
large parts of NET_TX_SOFTIRQ work to TIMER_SOFTIRQ/HRTIMER_SOFTIRQ,
which kind of defeats the separation of them.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Ranko Zivojnovic <ranko@spidernet.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

0621ed2e

11 Jul, 2007 2 commits

[NET_SCHED]: Remove unnecessary includes · 0ba48053

Authored by Patrick McHardy on Jul 02, 2007

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ba48053

[NET_SCHED]: Remove CONFIG_NET_ESTIMATOR option · 876d48aa

Authored by Patrick McHardy on Jul 02, 2007

The generic estimator is always built in anyway, and all the config option
does is prevent including a minimal amount of code for setting it up.
Additionally the option is already automatically selected for most cases.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

876d48aa

04 May, 2007 1 commit

[NET]: Rework dev_base via list_head (v3) · 7562f876

Authored by Pavel Emelianov on May 03, 2007

Cleanup of dev_base list use, with the aim of simplifying making the device
list per-namespace. On almost every occasion, use of the dev_base variable
and dev->next pointer could be easily replaced by a for_each_netdev
loop. A few of the most complicated places were converted to using
first_netdev()/next_netdev().
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

7562f876
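
The conversion from a hand-rolled `dev->next` chain to a `list_head` based loop can be sketched in userspace as follows. The list primitives are a minimal reimplementation for illustration, and the `_sketch` names and the global head are assumptions rather than the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list in the style of the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct net_device {
    const char *name;
    struct list_head dev_list;       /* replaces the old dev->next pointer */
};

/* Empty list: head points at itself in both directions. */
struct list_head dev_base_head = { &dev_base_head, &dev_base_head };

/* Append a device at the tail of the global device list. */
void register_netdev_sketch(struct net_device *dev)
{
    struct list_head *entry;

    assert(dev != NULL);
    entry = &dev->dev_list;
    entry->prev = dev_base_head.prev;
    entry->next = &dev_base_head;
    dev_base_head.prev->next = entry;
    dev_base_head.prev = entry;
}

/* Iterate over every registered device; 'd' is a struct net_device pointer. */
#define for_each_netdev_sketch(d)                                              \
    for ((d) = container_of_sketch(dev_base_head.next, struct net_device,      \
                                   dev_list);                                  \
         &(d)->dev_list != &dev_base_head;                                     \
         (d) = container_of_sketch((d)->dev_list.next, struct net_device,      \
                                   dev_list))
```

Because callers only see the loop macro, the storage behind it can later move from one global head to a per-namespace head without touching each loop body again.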

26 Apr, 2007 7 commits

[NET_SCHED]: ingress: switch back to using ingress_lock · fd44de7c

Authored by Patrick McHardy on Apr 16, 2007

Switch ingress queueing back to use ingress_lock. qdisc_lock_tree now locks
both the ingress and egress qdiscs on the device. All changes to data that
might be used on both ingress and egress need to be protected by using
qdisc_lock_tree instead of manually taking dev->queue_lock. Additionally
the qdisc stats_lock needs to be initialized to ingress_lock for ingress
qdiscs.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

fd44de7c

[NET_SCHED]: Eliminate qdisc_tree_lock · 0463d4ae

Authored by Patrick McHardy on Apr 16, 2007

Since we're now holding the rtnl during the entire dump operation, we
can remove qdisc_tree_lock, whose only purpose is to protect dump
callbacks from concurrent changes to the qdisc tree.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

0463d4ae

[NET_SCHED]: qdisc: remove unnecessary memory barriers · c95e9395

Authored by Patrick McHardy on Mar 23, 2007

We're holding dev->queue_lock in qdisc_watchdog_schedule and
qdisc_watchdog_cancel, no need for the barriers.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

c95e9395

[NET_SCHED]: Unline tcf_destroy · a48b5a61

Authored by Patrick McHardy on Mar 23, 2007

Uninline tcf_destroy and add a helper function to destroy an entire filter
chain.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

a48b5a61

[NET_SCHED] qdisc: avoid transmit softirq on watchdog wakeup · 1936502d

Authored by Stephen Hemminger on Mar 22, 2007

If possible, avoid having to do a transmit softirq when a qdisc
watchdog decides to re-enable.  The watchdog routine runs off
a timer, so it is already in the same effective context as
the softirq.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

1936502d

[NETEM]: avoid excessive requeues · 11274e5a

Authored by Stephen Hemminger on Mar 22, 2007

The netem code would call getnstimeofday() and dequeue/requeue after
every packet, even if it was waiting. Avoid this overhead by using
the throttled flag.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

11274e5a

[PKT_SCHED] qdisc: Use rtnl registration interface · be577ddc
Authored by Thomas Graf on Mar 22, 2007
```
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
be577ddc

openanolis / cloud-kernel synced successfully about 1 year ago