提交 · c462238d6a6d8ee855bda10f9fff442971540ed2 · openeuler / raspberrypi-kernel

26 4月, 2007 40 次提交

由 Stephen Hemminger 提交于 4月 20, 2007

It isn't any faster to test a boolean global variable than do a simple
check for empty list.
Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9be9a6b9

[NET] skbuff: skb_store_bits const is backwards · 0c6fcc8a

由 Stephen Hemminger 提交于 4月 20, 2007

Getting warnings becuase skb_store_bits has skb as constant,
but the function overwrites it. Looks like const was on the
wrong side.
Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0c6fcc8a

[NET_SCHED]: ingress: switch back to using ingress_lock · fd44de7c

由 Patrick McHardy 提交于 4月 16, 2007

Switch ingress queueing back to use ingress_lock. qdisc_lock_tree now locks
both the ingress and egress qdiscs on the device. All changes to data that
might be used on both ingress and egress needs to be protected by using
qdisc_lock_tree instead of manually taking dev->queue_lock. Additionally
the qdisc stats_lock needs to be initialized to ingress_lock for ingress
qdiscs.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd44de7c

[RTNETLINK]: Remove unnecessary locking in dump callbacks · 6313c1e0

由 Patrick McHardy 提交于 4月 16, 2007

Since we're now holding the rtnl during the entire dump operation, we can
remove additional locking for rtnl protected data. This patch does that
for all simple cases (dev_base_lock for dev_base walking, RCU protection
for FIB rule dumping).
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6313c1e0

[RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks · 1c2d670f

由 Patrick McHardy 提交于 4月 16, 2007

Hold rtnl_mutex during the entire netlink dump operation. This allows
to simplify locking in the dump callbacks, since they can now rely on
that no concurrent changes happen.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1c2d670f

[NETLINK]: Switch cb_lock spinlock to mutex and allow to override it · af65bdfc

由 Patrick McHardy 提交于 4月 20, 2007

Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

af65bdfc

[SK_BUFF]: Fix missing offset adjustment in skb_copy_expand · efd1e8d5

由 Patrick McHardy 提交于 4月 10, 2007

skb_copy_expand changes the headroom, so it needs to adjust the header
offsets by the difference between the old and the new value.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

efd1e8d5

bridge: eliminate call by reference · 6229e362

由 Stephen Hemminger 提交于 3月 21, 2007

Change the bridging hook to be simple function with return value
rather than modifying the skb argument. This could generate better
code and is cleaner.
Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>

6229e362

[NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY · 60476372

由 Herbert Xu 提交于 4月 09, 2007

When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
maps to the semantics of CHECKSUM_UNNECESSARY.  Therefore we should
treat it as such in the stack.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

60476372

[NET]: Use csum_start offset instead of skb_transport_header · 663ead3b

由 Herbert Xu 提交于 4月 09, 2007

The skb transport pointer is currently used to specify the start
of the checksum region for transmit checksum offload.  Unfortunately,
the same pointer is also used during receive side processing.

This creates a problem when we want to retransmit a received
packet with partial checksums since the skb transport pointer
would be overwritten.

This patch solves this problem by creating a new 16-bit csum_start
offset value to replace the skb transport header for the purpose
of checksums.  This offset is calculated from skb->head so that
it does not have to change when skb->data changes.

No extra space is required since csum_offset itself fits within
a 16-bit word so we can use the other 16 bits for csum_start.

For backwards compatibility, just before we push a packet with
partial checksums off into the device driver, we set the skb
transport header to what it would have been under the old scheme.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

663ead3b

[SK_BUFF]: Fix missing offset adjustment in pskb_expand_head · 56eb8882

由 Patrick McHardy 提交于 4月 09, 2007

Since we're increasing the headroom, the header offsets need to be
increased by the same amount as well.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

56eb8882

[RTNL]: Improve error codes for unsupported operations · 038890fe

由 Thomas Graf 提交于 4月 05, 2007

The most common trigger of these errors is that the
config option hasn't been enable wich would make the
functionality available. Therefore returning EOPNOTSUPP
gives a better idea on what is going wrong.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

038890fe

[NET]: Move generic skbuff stuff from XFRM code to generic code · 716ea3a7

由 David Howells 提交于 4月 02, 2007

Move generic skbuff stuff from XFRM code to generic code so that
AF_RXRPC can use it too.

The kdoc comments I've attached to the functions needs to be checked
by whoever wrote them as I had to make some guesses about the workings
of these functions.
Signed-off-By: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

716ea3a7

[SK_BUFF]: Introduce skb_copy_to_linear_data{_offset} · 27d7ff46

由 Arnaldo Carvalho de Melo 提交于 3月 31, 2007

To clearly state the intent of copying to linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>

27d7ff46

[NET]: Inline net_device_stats · c45d286e

由 Rusty Russell 提交于 3月 28, 2007

Network drivers which keep stats allocate their own stats structure
then write a get_stats() function to return them.  It would be nice if
this were done by default.

1) Add a new "stats" field to "struct net_device".
2) Add a new feature field to say "this driver uses the internal one"
3) Have a default "get_stats" which returns NULL if that feature not set.
4) Change callers to check result of get_stats call for NULL, not if
   ->get_stats is set.

This should not break backwards compatibility with older drivers, yet
allow modern drivers to shed some boilerplate code.

Lightly tested: works for a modified lguest network driver.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c45d286e

[SK_BUFF]: Introduce skb_copy_from_linear_data{_offset} · d626f62b

由 Arnaldo Carvalho de Melo 提交于 3月 27, 2007

To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

d626f62b

[NET] fib_rules: Flush route cache after rule modifications · 73417f61

由 Thomas Graf 提交于 3月 27, 2007

The results of FIB rules lookups are cached in the routing cache
except for IPv6 as no such cache exists. So far, it was the
responsibility of the user to flush the cache after modifying any
rules. This lead to many false bug reports due to misunderstanding
of this concept.

This patch automatically flushes the route cache after inserting
or deleting a rule.

Thanks to Muli Ben-Yehuda <muli@il.ibm.com> for catching a bug
in the previous patch.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73417f61

[NET] fib_rules: Add no-operation action · fa0b2d1d

由 Thomas Graf 提交于 3月 26, 2007

The use of nop rules simplifies the usage of goto rules
and adds more flexibility as they allow targets to remain
while the actual content of the branches can change easly.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fa0b2d1d

[NET] fib_rules: Mark rules detached from the device · 2b443683

由 Thomas Graf 提交于 3月 26, 2007

Rules which match against device names in their selector can
remain while the device itself disappears, in fact the device
doesn't have to present when the rule is added in the first
place. The device name is resolved by trying when the rule is
added and later by listening to NETDEV_REGISTER/UNREGISTER
notifications.

This patch adds the flag FIB_RULE_DEV_DETACHED which is set
towards userspace when a rule contains a device match which
is unresolved at the moment. This eases spotting the reason
why certain rules seem not to function properly.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2b443683

[NET] fib_rules: goto rule action · 0947c9fe

由 Thomas Graf 提交于 3月 26, 2007

This patch adds a new rule action FR_ACT_GOTO which allows
to skip a set of rules by jumping to another rule. The rule
to jump to is specified via the FRA_GOTO attribute which
carries a rule preference.

Referring to a rule which doesn't exists is explicitely allowed.
Such goto rules are marked with the flag FIB_RULE_UNRESOLVED
and will act like a rule with a non-matching selector. The rule
will become functional as soon as its target is present.

The goto action enables performance optimizations by reducing
the average number of rules that have to be passed per lookup.

Example:
0:      from all lookup local
40:     not from all to 192.168.23.128 goto 32766
41:     from all fwmark 0xa blackhole
42:     from all fwmark 0xff blackhole
32766:  from all lookup main
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0947c9fe

[NETFILTER]: nf_conntrack: don't use nfct in skb if conntrack is disabled · 5f79e0f9

由 Yasuyuki Kozakai 提交于 3月 23, 2007

Signed-off-by: NYasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f79e0f9

[NETLINK]: Directly return -EINTR from netlink_dump_start() · c702e804

由 Thomas Graf 提交于 3月 22, 2007

Now that all users of netlink_dump_start() use netlink_run_queue()
to process the receive queue, it is possible to return -EINTR from
netlink_dump_start() directly, therefore simplying the callers.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c702e804

[NETLINK]: Remove error pointer from netlink message handler · 1d00a4eb

由 Thomas Graf 提交于 3月 22, 2007

The error pointer argument in netlink message handlers is used
to signal the special case where processing has to be interrupted
because a dump was started but no error happened. Instead it is
simpler and more clear to return -EINTR and have netlink_run_queue()
deal with getting the queue right.

nfnetlink passed on this error pointer to its subsystem handlers
but only uses it to signal the start of a netlink dump. Therefore
it can be removed there as well.

This patch also cleans up the error handling in the affected
message handlers to be consistent since it had to be touched anyway.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1d00a4eb

[NETLINK]: Ignore control messages directly in netlink_run_queue() · 45e7ae7f

由 Thomas Graf 提交于 3月 22, 2007

Changes netlink_rcv_skb() to skip netlink controll messages and don't
pass them on to the message handler.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

45e7ae7f

[NETLINK]: Ignore !NLM_F_REQUEST messages directly in netlink_run_queue() · d35b6856

由 Thomas Graf 提交于 3月 22, 2007

netlink_rcv_skb() is changed to skip messages which don't have the
NLM_F_REQUEST bit to avoid every netlink family having to perform this
check on their own.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d35b6856

[RTNL]: Properly return rntl message handler · 51057f2f

由 Thomas Graf 提交于 3月 22, 2007

Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

51057f2f

[NET] rules: Unified rules dumping · c454673d

由 Thomas Graf 提交于 3月 25, 2007

Implements a unified, protocol independant rules dumping function
which is capable of both, dumping a specific protocol family or
all of them. This speeds up dumping as less lookups are required.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c454673d

T
[RTNL]: Use rtnl registration interface for dump-all aliases · 687ad8cc
由 Thomas Graf 提交于 3月 22, 2007
```
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
687ad8cc

[NET] rules: Use rtnl registration interface · 9d9e6a58

由 Thomas Graf 提交于 3月 25, 2007

Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9d9e6a58

[NEIGH]: Use rtnl registration interface · c8822a4e

由 Thomas Graf 提交于 3月 22, 2007

Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c8822a4e

[NET] link: Use rtnl registration interface · 340d17fc

由 Thomas Graf 提交于 3月 22, 2007

Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

340d17fc

[RTNL]: Message handler registration interface · e2849863

由 Thomas Graf 提交于 3月 22, 2007

This patch adds a new interface to register rtnetlink message
handlers replacing the exported rtnl_links[] array which
required many message handlers to be exported unnecessarly.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e2849863

[NETLINK]: Use nlmsg_trim() where appropriate · dc5fc579

由 Arnaldo Carvalho de Melo 提交于 3月 25, 2007

Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dc5fc579

A
[SK_BUFF]: Remove skb_add_mtu() leftovers · 897933bc
由 Arnaldo Carvalho de Melo 提交于 3月 19, 2007
```
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
```
897933bc

[SK_BUFF]: Adjust the zeroing up to tail in __alloc_skb too · ca0605a7

由 Arnaldo Carvalho de Melo 提交于 3月 19, 2007

I did it just in alloc_skb_from_cache, forgot __alloc_skb, fixed now.
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ca0605a7

[SK_BUFF]: Convert skb->end to sk_buff_data_t · 4305b541

由 Arnaldo Carvalho de Melo 提交于 4月 19, 2007

Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4305b541

[SK_BUFF]: Convert skb->tail to sk_buff_data_t · 27a884dc

由 Arnaldo Carvalho de Melo 提交于 4月 19, 2007

So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

27a884dc

[SK_BUFF]: Use offsets for skb->{mac,network,transport}_header on 64bit architectures · 2e07fa9c

由 Arnaldo Carvalho de Melo 提交于 4月 10, 2007

With this we save 8 bytes per network packet, leaving a 4 bytes hole to be used
in further shrinking work, likely with the offsetization of other pointers,
such as ->{data,tail,end}, at the cost of adds, that were minimized by the
usual practice of setting skb->{mac,nh,n}.raw to a local variable that is then
accessed multiple times in each function, it also is not more expensive than
before with regards to most of the handling of such headers, like setting one
of these headers to another (transport to network, etc), or subtracting, adding
to/from it, comparing them, etc.

Now we have this layout for sk_buff on a x86_64 machine:

[acme@mica net-2.6.22]$ pahole vmlinux sk_buff
struct sk_buff {
	struct sk_buff *       next;             /*   0   8 */
	struct sk_buff *       prev;             /*   8   8 */
	struct rb_node         rb;               /*  16  24 */
	struct sock *          sk;               /*  40   8 */
	ktime_t                tstamp;           /*  48   8 */
	struct net_device *    dev;              /*  56   8 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct net_device *    input_dev;        /*  64   8 */
	sk_buff_data_t         transport_header; /*  72   4 */
	sk_buff_data_t         network_header;   /*  76   4 */
	sk_buff_data_t         mac_header;       /*  80   4 */

	/* XXX 4 bytes hole, try to pack */

	struct dst_entry *     dst;              /*  88   8 */
	struct sec_path *      sp;               /*  96   8 */
	char                   cb[48];           /* 104  48 */
	/* cacheline 2 boundary (128 bytes) was 24 bytes ago*/
	unsigned int           len;              /* 152   4 */
	unsigned int           data_len;         /* 156   4 */
	unsigned int           mac_len;          /* 160   4 */
	union {
		__wsum         csum;             /*       4 */
		__u32          csum_offset;      /*       4 */
	};                                       /* 164   4 */
	__u32                  priority;         /* 168   4 */
	__u8                   local_df:1;       /* 172   1 */
	__u8                   cloned:1;         /* 172   1 */
	__u8                   ip_summed:2;      /* 172   1 */
	__u8                   nohdr:1;          /* 172   1 */
	__u8                   nfctinfo:3;       /* 172   1 */
	__u8                   pkt_type:3;       /* 173   1 */
	__u8                   fclone:2;         /* 173   1 */
	__u8                   ipvs_property:1;  /* 173   1 */

	/* XXX 2 bits hole, try to pack */

	__be16                 protocol;         /* 174   2 */
	void    (*destructor)(struct sk_buff *); /* 176   8 */
	struct nf_conntrack *  nfct;             /* 184   8 */
	/* --- cacheline 3 boundary (192 bytes) --- */
	struct sk_buff *       nfct_reasm;       /* 192   8 */
	struct nf_bridge_info *nf_bridge;        /* 200   8 */
	__u16                  tc_index;         /* 208   2 */
	__u16                  tc_verd;          /* 210   2 */
	dma_cookie_t           dma_cookie;       /* 212   4 */
	__u32                  secmark;          /* 216   4 */
	__u32                  mark;             /* 220   4 */
	unsigned int           truesize;         /* 224   4 */
	atomic_t               users;            /* 228   4 */
	unsigned char *        head;             /* 232   8 */
	unsigned char *        data;             /* 240   8 */
	unsigned char *        tail;             /* 248   8 */
	/* --- cacheline 4 boundary (256 bytes) --- */
	unsigned char *        end;              /* 256   8 */
}; /* size: 264, cachelines: 5 */
   /* sum members: 260, holes: 1, sum holes: 4 */
   /* bit holes: 1, sum bit holes: 2 bits */
   /* last cacheline: 8 bytes */

On 32 bits nothing changes, and pointers continue to be used with the compiler
turning all this abstraction layer into dust. But there are some sk_buff
validation tricks that are now possible, humm... :-)
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2e07fa9c

[SK_BUFF]: unions of just one member don't get anything done, kill them · b0e380b1

由 Arnaldo Carvalho de Melo 提交于 4月 10, 2007

Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
skb->mac to skb->mac_header, to match the names of the associated helpers
(skb[_[re]set]_{transport,network,mac}_header).
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b0e380b1

[SK_BUFF]: Introduce skb_network_header_len · cfe1fc77

由 Arnaldo Carvalho de Melo 提交于 3月 16, 2007

For the common sequence "skb->h.raw - skb->nh.raw", similar to skb->mac_len,
that is precalculated tho, don't think we need to bloat skb with one more
member, so just use this new helper, reducing the number of non-skbuff.h
references to the layer headers even more.
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cfe1fc77