Commits · 6b25d30bf112370a12d05c3c0fd43732985dab01 · openeuler / Kernel

11 Jul, 2007 21 commits

[NET]: Fix gen_estimator timer removal race · 6b25d30b

Committed by Patrick McHardy on Jul 09, 2007

As noticed by Jarek Poplawski <jarkao2@o2.pl>, the timer removal in
gen_kill_estimator races with the timer function rearming the timer.

Check whether the timer list is empty before rearming the timer
in the timer function to fix this.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

6b25d30b

[NETPOLL]: Fix a leak-n-bug in netpoll_cleanup() · 1498b3f1

Committed by Satyam Sharma on Jul 09, 2007

93ec2c72 applied excessive duct tape to
the netpoll beast's netpoll_cleanup(), thus substituting one leak with
another, and opening up a little buglet :-)

net_device->npinfo (netpoll_info) is a shared and refcounted object and
cannot simply be set NULL the first time netpoll_cleanup() is called.
Otherwise, further netpoll_cleanup()'s see np->dev->npinfo == NULL and
become no-ops, thus leaking. And it's a bug too: the first call to
netpoll_cleanup() would thus (annoyingly) "disable" other (still alive)
netpolls too. Maybe nobody noticed this because netconsole (only user
of netpoll) never supported multiple netpoll objects earlier.

This is a trivial and obvious one-line fixlet.
Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Signed-off-by: David S. Miller <davem@davemloft.net>

1498b3f1
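
The reference-counting rule that this fix restores can be sketched in plain C. This is a minimal userspace illustration and not the kernel code: `toy_npinfo`, `toy_dev` and `toy_netpoll_cleanup()` are hypothetical names, and the atomic counting and locking used by the real netpoll_info are left out.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for netpoll_info and net_device. */
struct toy_npinfo {
    int refcnt;                 /* number of netpoll users sharing this object */
};

struct toy_dev {
    struct toy_npinfo *npinfo;  /* shared, refcounted */
};

/*
 * Drop one reference. The shared pointer is cleared and the object freed
 * only when the last user goes away, so earlier callers do not disable
 * the netpolls that are still alive.
 * Returns 1 if the object was freed, 0 otherwise.
 */
static int toy_netpoll_cleanup(struct toy_dev *dev)
{
    struct toy_npinfo *info = dev->npinfo;

    if (info == NULL)
        return 0;               /* nothing attached */
    if (--info->refcnt > 0)
        return 0;               /* other users remain: keep dev->npinfo */

    dev->npinfo = NULL;
    free(info);
    return 1;
}
```

With two users attached, the first call leaves `dev->npinfo` in place and only the second call clears and frees it.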

[NET]: "wrong timeout value in sk_wait_data()": cleanups · 6f11df83

Committed by Andrew Morton on Jul 09, 2007

- save 4 bytes

- it's read-mostly.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f11df83

[NET]: Make some network-related proc files use seq_list_xxx helpers · 60f0438a

Committed by Pavel Emelianov on Jul 09, 2007

This includes /proc/net/protocols, /proc/net/rxrpc_calls and
/proc/net/rxrpc_connections files.

All three need seq_list_start_head to show some header.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

60f0438a

[NETFILTER]: x_tables: add TRACE target · ba9dda3a

Committed by Jozsef Kadlecsik on Jul 07, 2007

The TRACE target can be used to follow IP and IPv6 packets through
the ruleset.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba9dda3a

[PKTGEN]: IPSEC support · a553e4a6

Committed by Jamal Hadi Salim on Jul 02, 2007

Added transport mode ESP support for starters.  I will send more of
these modes and types once I have resolved the tunnel mode issues.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>

a553e4a6

[PKTGEN]: Introduce sequential flows · 007a531b

Committed by Jamal Hadi Salim on Jul 02, 2007

By default all flows in pktgen are randomly selected.
This patch introduces the ability to send all defined flows
sequentially. Robert defined randomness to be the
default behavior.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>

007a531b

[PKTGEN]: Centralize packet overhead tracking · 16dab72f

Committed by Jamal Hadi Salim on Jul 02, 2007

Track the extra packet overhead for VLAN tags, MPLS, IPSEC etc.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>

16dab72f

[NET]: Fix secondary unicast/multicast address count maintenance · 61cbc2fc

Committed by Patrick McHardy on Jun 30, 2007

When a reference to an existing address is increased or decreased without
hitting zero, the address count is incorrectly adjusted.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

61cbc2fc

[CORE] Stack changes to add multiqueue hardware support API · f25f4e44

Committed by Peter P Waskiewicz Jr on Jul 06, 2007

Add the multiqueue hardware device support API to the core network
stack.  Allow drivers to allocate multiple queues and manage them at
the netdev level if they choose to do so.

Added a new field to sk_buff, namely queue_mapping, for drivers to
know which tx_ring to select based on OS classification of the flow.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

f25f4e44

[NET]: Fix TX checksum feature check · a298830c

Committed by Herbert Xu on Jun 28, 2007

This patch fixes a boolean error in the new TX checksum check
that causes bogus TSO packets to be generated.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

a298830c

[NET]: dev: secondary unicast address support · 4417da66

Committed by Patrick McHardy on Jun 27, 2007

Add support for configuring secondary unicast addresses on network
devices. To support this, devices capable of filtering multiple
unicast addresses need to change their set_multicast_list function
to configure unicast filters as well and assign it to dev->set_rx_mode
instead of dev->set_multicast_list. Other devices are put into promiscuous
mode when secondary unicast addresses are present.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

4417da66

[NET]: dev_mcast: switch to generic net_device address lists · 3fba5a8b

Committed by Patrick McHardy on Jun 27, 2007

Use generic net_device address lists for multicast list handling.
Some defines are used to keep drivers working.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

3fba5a8b

[NET]: dev: introduce generic net_device address lists · bf742482

Committed by Patrick McHardy on Jun 27, 2007

Introduce struct dev_addr_list and list maintenance functions
based on dev_mc_list and the related functions. This will be
used by follow-up patches for both multicast and secondary
unicast addresses.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf742482

[NET]: dev_mcast: unexport dev_mc_upload · 75ebe8f7

Committed by Patrick McHardy on Jun 27, 2007

dev_mc_add/dev_mc_delete take care of uploading the list when
necessary and that's the only interface other code should use.
Also remove two incorrect calls in DECnet.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

75ebe8f7

[NET]: IPV6 checksum offloading in network devices · d212f87b

Committed by Stephen Hemminger on Jun 27, 2007

The existing model for checksum offload does not correctly handle
devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
implies the device can do any arbitrary protocol.

This patch:
 * adds NETIF_F_IPV6_CSUM for those devices
 * fixes bnx2 and tg3 devices that need it
 * adds NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
 * fixes assumptions about NETIF_F_ALL_CSUM in nat
 * adjusts bridge union of checksumming computation
Signed-off-by: David S. Miller <davem@davemloft.net>

d212f87b

[RTNETLINK]: Fix rtnetlink compat attribute patch · 2371baa4

Committed by Patrick McHardy on Jun 26, 2007

Sent the wrong patch previously.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

2371baa4

[RTNETLINK]: Add nested compat attribute · afdc3238

Committed by Patrick McHardy on Jun 25, 2007

Add a nested compat attribute type that can be used to convert
attributes that contain a structure to nested attributes in a
backwards compatible way.

The attribute looks like this:

struct {
        [ compat contents ]
        struct rtattr {
                .rta_len        = total size,
                .rta_type       = type,
        } rta;
        struct old_structure struct;

        [ nested top-level attribute ]
        struct rtattr {
                .rta_len        = nest size,
                .rta_type       = type,
        } nest_attr;

        [ optional 0 .. n nested attributes ]
        struct rtattr {
                .rta_len        = private attribute len,
                .rta_type       = private attribute typ,
        } nested_attr;
        struct nested_data data;
};

Since both userspace and kernel deal correctly with attributes that are
larger than expected, old versions will just parse the compat part and
ignore the rest.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

afdc3238

[SKBUFF]: Keep track of writable header len of headerless clones · 334a8132

Committed by Patrick McHardy on Jun 25, 2007

Currently NAT (and others) that want to modify cloned skbs copy them,
even if in the vast majority of cases it's not necessary because the
skb is a clone made by TCP and the portion NAT wants to modify is
actually writable because TCP releases the header reference before
cloning.

The problem is that there is no clean way for NAT to find out how
long the writable header area is, so this patch introduces skb->hdr_len
to hold this length. When a headerless skb is cloned skb->hdr_len
is set to the current headroom, for regular clones it is copied from
the original. A new function skb_clone_writable(skb, len) returns
whether the skb is writable up to len bytes from skb->data. To avoid
enlarging the skb the mac_len field is reduced to 16 bit and the
new hdr_len field is put in the remaining 16 bit.

I've done a few rough benchmarks of NAT (not with this exact patch,
but a very similar one). As expected it saves huge amounts of system
time in case of sendfile, bringing it down to basically the same
amount as without NAT, with sendmsg it only helps on loopback,
probably because of the large MTU.

Transmit a 1GB file using sendfile/sendmsg over eth0/lo with and
without NAT:

- sendfile eth0, no NAT:	sys     0m0.388s
- sendfile eth0, NAT:		sys     0m1.835s
- sendfile eth0, NAT + patch:	sys     0m0.370s	(~ -80%)

- sendfile lo, no NAT:		sys     0m0.258s
- sendfile lo, NAT:		sys     0m2.609s
- sendfile lo, NAT + patch:	sys     0m0.260s	(~ -90%)

- sendmsg eth0, no NAT:		sys     0m2.508s
- sendmsg eth0, NAT:		sys     0m2.539s
- sendmsg eth0, NAT + patch:	sys     0m2.445s	(no change)

- sendmsg lo, no NAT:		sys	0m2.151s
- sendmsg lo, NAT:		sys     0m3.557s
- sendmsg lo, NAT + patch:	sys     0m2.159s	(~ -40%)

I expect other users can see a similar performance improvement,
packet mangling iptables targets, ipip and ip_gre come to mind ..
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

334a8132
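
The writability check described in this commit can be modelled with a small toy structure. This is a sketch under stated assumptions rather than the kernel's implementation: `toy_skb`, its `headroom` and `header_shared` fields and `toy_clone_writable()` are hypothetical names, and only `hdr_len` corresponds to a field named in the commit message.

```c
#include <stdbool.h>

/* Toy stand-in for the parts of an skb that matter for this check. */
struct toy_skb {
    unsigned int headroom;  /* bytes between the buffer start and skb->data */
    unsigned int hdr_len;   /* writable header length recorded at clone time */
    bool header_shared;     /* true while another user still references the header */
};

/*
 * A clone is treated as writable for `len` bytes starting at the data
 * pointer when nobody else references the header and the range ends
 * inside the area that was recorded as writable.
 */
static bool toy_clone_writable(const struct toy_skb *skb, unsigned int len)
{
    return !skb->header_shared && skb->headroom + len <= skb->hdr_len;
}
```

In this model a caller such as NAT would copy the buffer only when the check returns false, which is the case the benchmarks above show to be rare for clones made by TCP.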

[RTNETLINK]: Link creation API · 38f7b870

Committed by Patrick McHardy on Jun 13, 2007

Add rtnetlink API for creating, changing and deleting software devices.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

38f7b870

[RTNETLINK]: Split up rtnl_setlink · 0157f60c

Committed by Patrick McHardy on Jun 13, 2007

Split up rtnl_setlink into a function performing validation and a function
performing the actual changes. This allows sharing the modification logic
with rtnl_newlink, which is introduced by the next patch.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

0157f60c

06 Jul, 2007 3 commits

[NETPOLL]: Fixups for 'fix soft lockup when removing module' · 25442caf

Committed by Jarek Poplawski on Jul 05, 2007

From my recent patch:

> >    #1
> >    Until kernel ver. 2.6.21 (including) cancel_rearming_delayed_work()
> >    required a work function should always (unconditionally) rearm with
> >    delay > 0 - otherwise it would endlessly loop. This patch replaces
> >    this function with cancel_delayed_work(). Later kernel versions don't
> >    require this, so here it's only for uniformity.

But Oleg Nesterov <oleg@tv-sign.ru> found:

> But 2.6.22 doesn't need this change, why it was merged?
> 
> In fact, I suspect this change adds a race,
...

His description was right (thanks), so this patch reverts #1.
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

25442caf

[NET]: net/core/netevent.c should #include <net/netevent.h> · 94b83419

Committed by Adrian Bunk on Jul 05, 2007

Every file should include the headers containing the prototypes for
its global functions.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

94b83419

[NET] skbuff: remove export of static symbol · 2cd052e4

Committed by Johannes Berg on Jul 05, 2007

skb_clone_fraglist is static so it shouldn't be exported.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

2cd052e4

29 Jun, 2007 1 commit

[NETPOLL] netconsole: fix soft lockup when removing module · 17200811

Committed by Jarek Poplawski on Jun 28, 2007

#1
Until kernel ver. 2.6.21 (including) cancel_rearming_delayed_work()
required a work function should always (unconditionally) rearm with
delay > 0 - otherwise it would endlessly loop. This patch replaces
this function with cancel_delayed_work(). Later kernel versions don't
require this, so here it's only for uniformity.

#2
After deleting a timer in cancel_[rearming_]delayed_work() there could
stay a last skb queued in npinfo->txq causing a memory leak after
kfree(npinfo).

Initial patch & testing by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

17200811

27 Jun, 2007 1 commit

[NETPOLL]: tx lock deadlock fix · 0db3dc73

Committed by Stephen Hemminger on Jun 27, 2007

If sky2 device poll routine is called from netpoll_send_skb, it would
deadlock. The netpoll_send_skb held the netif_tx_lock, and the poll
routine could acquire it to clean up skb's. Other drivers might use
same locking model.

The driver is correct, netpoll should not introduce more locking
problems than it causes already. So change the code to drop lock
before calling poll handler.
Signed-off-by: Stephen Hemminger <shemminger@linux.foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0db3dc73

24 Jun, 2007 3 commits

[NET]: Make skb_seq_read unmap the last fragment · 5b5a60da

Committed by Olaf Kirch on Jun 23, 2007

Having walked through the entire skbuff, skb_seq_read would leave the
last fragment mapped.  As a consequence, the unwary caller would leak
kmaps, and proceed with preempt_count off by one. The only (kind of
non-intuitive) workaround is to use skb_seq_read_abort.

This patch makes sure skb_seq_read always unmaps frag_data after
having cycled through the skb's paged part.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b5a60da

[NET]: Re-enable irqs before pushing pending DMA requests · 515e06c4

Committed by Shannon Nelson on Jun 23, 2007

This moves the local_irq_enable() call in net_rx_action() to before
calling the CONFIG_NET_DMA's dma_async_memcpy_issue_pending() rather
than after.  This shortens the irq disabled window and allows for DMA
drivers that need to do their own irq hold.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

515e06c4

[SKBUFF]: Fix incorrect config #ifdef around skb_copy_secmark · dbbeb2f9

Committed by Patrick McHardy on Jun 23, 2007

secmark doesn't depend on CONFIG_NET_SCHED.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

dbbeb2f9

08 Jun, 2007 4 commits

[NET]: Avoid duplicate netlink notification when changing link state · 7c355f53

Committed by Thomas Graf on Jun 05, 2007

When the link state is changed from userspace without affecting any other
flags, two duplicate notifications are sent: once as an action
in the NETDEV_UP/NETDEV_DOWN notification chain and a second time
when comparing old and new device flags after the change has been
completed. Although harmless, the duplicates should be avoided.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c355f53

[RTNETLINK]: ifindex 0 does not exist · 51055be8

Committed by Patrick McHardy on Jun 05, 2007

ifindex == 0 does not exist and implies we should do a lookup by name if
one was given.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

51055be8

[NETLINK]: Mark netlink policies const · ef7c79ed

Committed by Patrick McHardy on Jun 05, 2007

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

ef7c79ed

[NET]: Merge dst_discard_in and dst_discard_out. · c4b1010f

Committed by Denis Cheng on Jun 05, 2007

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c4b1010f

04 Jun, 2007 1 commit

[NET] gso: Fix GSO feature mask in sk_setup_caps · 4fcd6b99

Committed by Herbert Xu on May 31, 2007

This isn't a bug just yet as only TCP uses sk_setup_caps for GSO.
However, if and when UDP or something else starts using it this is
likely to cause a problem if we forget to add software emulation
for it at the same time.

The problem is that right now we translate GSO emulation to the
bitmask NETIF_F_GSO_MASK, which includes every protocol, even
ones that we cannot emulate.

This patch makes it provide only the ones that we can emulate.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

4fcd6b99

31 May, 2007 2 commits

[NET]: parse ip:port strings correctly in in4_pton · 83f03fa5

Committed by Jerome Borsboom on May 29, 2007

in4_pton converts a textual representation of an ip4 address
into an integer representation. However, when the textual representation
is of the form ip:port, e.g. 192.168.1.1:5060, and 'delim' is set to
-1, the function bails out with an error when reading the colon.

It makes sense to allow the colon as a delimiting character without
explicitly having to set it through the 'delim' variable as there can be
no ambiguity in the point where the ip address is completely parsed. This
function is indeed called from nf_conntrack_sip.c in this way to parse
textual ip:port combinations, which fails due to the reason stated above.
Signed-off-by: Jerome Borsboom <j.borsboom@erasmusmc.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>

83f03fa5
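
The parsing behaviour argued for in this commit, accepting a colon as an unambiguous end of the address, can be sketched with a simplified userspace parser. `parse_ipv4_prefix()` is a hypothetical helper written for this illustration; it is not the kernel's in4_pton() and does not reproduce its signature or its 'delim' handling.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read a dotted-quad IPv4 address into out[4]. Parsing stops at the end
 * of the string or at a ':' (the start of a port), since neither can be
 * part of the address. Returns 1 on success, 0 on malformed input.
 * On success, *end (if non-NULL) points at the first unparsed character.
 */
static int parse_ipv4_prefix(const char *s, uint8_t out[4], const char **end)
{
    for (int i = 0; i < 4; i++) {
        unsigned int val = 0;
        int digits = 0;

        while (*s >= '0' && *s <= '9') {
            val = val * 10 + (unsigned int)(*s - '0');
            if (val > 255 || ++digits > 3)
                return 0;       /* octet out of range or too long */
            s++;
        }
        if (digits == 0)
            return 0;           /* empty octet */
        out[i] = (uint8_t)val;

        if (i < 3) {
            if (*s != '.')
                return 0;       /* octets must be separated by dots */
            s++;
        }
    }

    /* After four octets only end-of-string or ':' is acceptable. */
    if (*s != '\0' && *s != ':')
        return 0;
    if (end != NULL)
        *end = s;
    return 1;
}
```

Given "192.168.1.1:5060", the helper fills in the four octets and leaves `*end` pointing at the colon, so a caller can continue with the port.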

[XFRM]: Allow XFRM_ACQ_EXPIRES to be tunable via sysctl. · 01e67d08

Committed by David S. Miller on May 25, 2007

Signed-off-by: David S. Miller <davem@davemloft.net>

01e67d08

25 May, 2007 2 commits

[XFRM]: Allow packet drops during larval state resolution. · 14e50e57

Committed by David S. Miller on May 24, 2007

The current IPSEC rule resolution behavior we have does not work for a
lot of people, even though technically it's an improvement from the
-EAGAIN business we had before.

Right now we'll block until the key manager resolves the route.  That
works for simple cases, but many folks would rather packets get
silently dropped until the key manager resolves the IPSEC rules.

We can't tell these folks to "set the socket non-blocking" because
they don't have control over the non-block setting of things like the
sockets used to resolve DNS deep inside of the resolver libraries in
libc.

With that in mind I coded up the patch below with some help from
Herbert Xu which provides packet-drop behavior during larval state
resolution, controllable via sysctl and off by default.

This lays the framework to either:

1) Make this default at some point or...

2) Move this logic into xfrm{4,6}_policy.c and implement the
   ARP-like resolution queue we've all been dreaming of.
   The idea would be to queue packets to the policy, then
   once the larval state is resolved by the key manager we
   re-resolve the route and push the packets out.  The
   packets would timeout if the rule didn't get resolved
   in a certain amount of time.
Signed-off-by: David S. Miller <davem@davemloft.net>

14e50e57

[NET]: "wrong timeout value" in sk_wait_data() v2 · ba78073e

Committed by Vasily Averin on May 24, 2007

sys_setsockopt() does not properly check timeout values for
SO_RCVTIMEO/SO_SNDTIMEO; for example, it is possible to set negative timeout
values. POSIX does not define the behaviour of setsockopt() for negative
timeouts, but requires that setsockopt() shall fail with -EDOM if the send and
receive timeout values are too big to fit into the timeout fields in the socket
structure.
In the current implementation a negative timeout can lead to error messages like
"schedule_timeout: wrong timeout value".

Proposed patch:
- checks tv_usec and returns -EDOM if it is invalid
- does not allow negative timeout values to be set (sets 0 instead) and outputs
  a ratelimited informational message about such attempts.
Signed-off-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba78073e
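
The two checks listed in this commit can be sketched in userspace C. This is an illustration only: `sanitize_timeout()` is a hypothetical helper, the exact range accepted for tv_usec is an assumption, and the ratelimited log message mentioned in the commit is omitted.

```c
#include <errno.h>
#include <sys/time.h>

#define TOY_USEC_PER_SEC 1000000L

/*
 * Validate and normalise a socket timeout.
 * Returns -EDOM when tv_usec is outside [0, 1000000) (assumed range),
 * and clamps a negative timeout to zero instead of passing it on.
 * Returns 0 on success.
 */
static int sanitize_timeout(struct timeval *tv)
{
    if (tv->tv_usec < 0 || tv->tv_usec >= TOY_USEC_PER_SEC)
        return -EDOM;

    if (tv->tv_sec < 0) {
        /* Negative timeouts are not allowed; use 0 instead. */
        tv->tv_sec = 0;
        tv->tv_usec = 0;
    }
    return 0;
}
```

Rejecting the bad microsecond field first means the clamp only ever sees values whose sub-second part is already known to be in range.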

23 May, 2007 2 commits

[RTNETLINK]: Remove remains of wireless extensions over rtnetlink · 575c3e2a

Committed by Patrick McHardy on May 22, 2007

Remove some unused variables and function arguments related to the
recently removed wireless extensions over rtnetlink.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

575c3e2a

[RTNETLINK]: Allow changing of subsets of netdevice flags in rtnl_setlink · 83b496e9

Committed by Patrick McHardy on May 22, 2007

rtnl_setlink doesn't allow changing subsets of the flags, just overriding
the set entirely with a new one. This means that for simply setting a device
up or down userspace first needs to query the current flags, change them and
send the changed flags back, which is racy and needlessly complicated.

Mask the flags using ifi_change since this is what it is intended for.
For backwards compatibility treat ifi_change == 0 as ~0 (even though it
seems quite unlikely that anyone has been using this so far).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

83b496e9
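
The masking rule described in this commit comes down to one bitwise expression. The sketch below is a userspace illustration; `apply_flag_change()` is a hypothetical helper and not the function used in the kernel.

```c
#include <stdint.h>

/*
 * Merge requested flags into the current ones, touching only the bits
 * selected by `change`. A zero mask is treated as "all bits", matching
 * the backwards-compatible handling of ifi_change == 0.
 */
static uint32_t apply_flag_change(uint32_t current, uint32_t requested,
                                  uint32_t change)
{
    if (change == 0)
        change = ~(uint32_t)0;
    return (current & ~change) | (requested & change);
}
```

With a mask that selects a single bit, userspace can set or clear that bit without first reading the current flags, which removes the query-modify-write race mentioned in the message.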
