提交 · 4b0b72f7dd617b13abd1b04c947e15873e011a24 · openeuler / raspberrypi-kernel

29 4月, 2010 1 次提交

net: speedup udp receive path · 4b0b72f7

由 Eric Dumazet 提交于 4月 28, 2010

Since commit 95766fff ([UDP]: Add memory accounting.), 
each received packet needs one extra sock_lock()/sock_release() pair.

This added latency because of possible backlog handling. Then later,
ticket spinlocks added yet another latency source in case of DDOS.

This patch introduces lock_sock_bh() and unlock_sock_bh()
synchronization primitives, avoiding one atomic operation and backlog
processing.

skb_free_datagram_locked() uses them instead of full blown
lock_sock()/release_sock(). skb is orphaned inside locked section for
proper socket memory reclaim, and finally freed outside of it.

UDP receive path now take the socket spinlock only once.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4b0b72f7

28 4月, 2010 1 次提交

net: sk_add_backlog() take rmem_alloc into account · c377411f

由 Eric Dumazet 提交于 4月 27, 2010

Current socket backlog limit is not enough to really stop DDOS attacks,
because user thread spend many time to process a full backlog each
round, and user might crazy spin on socket lock.

We should add backlog size and receive_queue size (aka rmem_alloc) to
pace writers, and let user run without being slow down too much.

Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in
stress situations.

Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp
receiver can now process ~200.000 pps (instead of ~100 pps before the
patch) on a 8 core machine.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c377411f

24 4月, 2010 2 次提交

IPv6: Complete IPV6_DONTFRAG support · 4b340ae2

由 Brian Haley 提交于 4月 23, 2010

Finally add support to detect a local IPV6_DONTFRAG event
and return the relevant data to the user if they've enabled
IPV6_RECVPATHMTU on the socket.  The next recvmsg() will
return no data, but have an IPV6_PATHMTU as ancillary data.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4b340ae2

IPv6: Add dontfrag argument to relevant functions · 13b52cd4

由 Brian Haley 提交于 4月 23, 2010

Add dontfrag argument to relevant functions for
IPV6_DONTFRAG support, as well as allowing the value
to be passed-in via ancillary cmsg data.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

13b52cd4

21 4月, 2010 1 次提交

net: Fix various endianness glitches · 0eae88f3

由 Eric Dumazet 提交于 4月 20, 2010

Sparse can help us find endianness bugs, but we need to make some
cleanups to be able to more easily spot real bugs.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0eae88f3

09 4月, 2010 1 次提交

udp: fix for unicast RX path optimization · 1223c67c

由 Jorge Boncompte [DTI2] 提交于 4月 08, 2010

Commits 5051ebd2 and
5051ebd2 ("ipv[46]: udp: optimize unicast RX
path") broke some programs.

	After upgrading a L2TP server to 2.6.33 it started to fail, tunnels going up an
down, after the 10th tunnel came up. My modified rp-l2tp uses a global
unconnected socket bound to (INADDR_ANY, 1701) and one connected socket per
tunnel after parameter negotiation.

	After ten sockets were open and due to mixed parameters to
udp[46]_lib_lookup2() kernel started to drop packets.
Signed-off-by: NJorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1223c67c

30 3月, 2010 1 次提交

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

由 Tejun Heo 提交于 3月 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

06 3月, 2010 2 次提交

net: backlog functions rename · a3a858ff

由 Zhu Yi 提交于 3月 04, 2010

sk_add_backlog -> __sk_add_backlog
sk_add_backlog_limited -> sk_add_backlog
Signed-off-by: NZhu Yi <yi.zhu@intel.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a3a858ff

udp: use limited socket backlog · 55349790

由 Zhu Yi 提交于 3月 04, 2010

Make udp adapt to the limited socket backlog change.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: NZhu Yi <yi.zhu@intel.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

55349790

19 2月, 2010 1 次提交

ipv6: drop unused "dev" arg of icmpv6_send() · 3ffe533c

由 Alexey Dobriyan 提交于 2月 18, 2010

Dunno, what was the idea, it wasn't used for a long time.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3ffe533c

13 2月, 2010 1 次提交

udp: remove redundant variable · 81d54ec8

由 Gerrit Renker 提交于 2月 10, 2010

The variable 'copied' is used in udp_recvmsg() to emphasize that the passed
'len' is adjusted to fit the actual datagram length. But the same can be
done by adjusting 'len' directly. This patch thus removes the indirection.
Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

81d54ec8

18 1月, 2010 1 次提交

net: spread __net_init, __net_exit · 2c8c1e72

由 Alexey Dobriyan 提交于 1月 17, 2010

__net_init/__net_exit are apparently not going away, so use them
to full extent.

In some cases __net_init was removed, because it was called from
__net_exit code.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c8c1e72

11 11月, 2009 2 次提交

IPv6: use ipv6_addr_v4mapped() · 856540ee

由 Brian Haley 提交于 11月 09, 2009

Change udp6_portaddr_hash() to use ipv6_addr_v4mapped()
inline instead of ipv6_addr_type().
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

856540ee

udp: bind() optimisation · 30fff923

由 Eric Dumazet 提交于 11月 09, 2009

UDP bind() can be O(N^2) in some pathological cases.

Thanks to secondary hash tables, we can make it O(N)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

30fff923

09 11月, 2009 4 次提交

udp: multicast RX should increment SNMP/sk_drops counter in allocation failures · f6b8f32c

由 Eric Dumazet 提交于 11月 08, 2009

When skb_clone() fails, we should increment sk_drops and SNMP counters.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f6b8f32c

ipv6: udp: Optimise multicast reception · a1ab77f9

由 Eric Dumazet 提交于 11月 08, 2009

IPV6 UDP multicast rx path is a bit complex and can hold a spinlock
for a long time.

Using a small (32 or 64 entries) stack of socket pointers can help
to perform expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock, in most cases.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a1ab77f9

ipv6: udp: optimize unicast RX path · fddc17de

由 Eric Dumazet 提交于 11月 08, 2009

We first locate the (local port) hash chain head
If few sockets are in this chain, we proceed with previous lookup algo.

If too many sockets are listed, we take a look at the secondary
(port, address) hash chain.

We choose the shortest chain and proceed with a RCU lookup on the elected chain.

But, if we chose (port, address) chain, and fail to find a socket on given address,
 we must try another lookup on (port, in6addr_any) chain to find sockets not bound
to a particular IP.

-> No extra cost for typical setups, where the first lookup will probabbly
be performed.

RCU lookups everywhere, we dont acquire spinlock.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fddc17de

udp: split sk_hash into two u16 hashes · d4cada4a

由 Eric Dumazet 提交于 11月 08, 2009

Union sk_hash with two u16 hashes for udp (no extra memory taken)

One 16 bits hash on (local port) value (the previous udp 'hash')

One 16 bits hash on (local address, local port) values, initialized
but not yet used. This second hash is using jenkin hash for better
distribution.

Because the 'port' is xored later, a partial hash is performed
on local address + net_hash_mix(net)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4cada4a

06 11月, 2009 1 次提交

net: drop capability from protocol definitions · 13f18aa0

由 Eric Paris 提交于 11月 05, 2009

struct can_proto had a capability field which wasn't ever used.  It is
dropped entirely.

struct inet_protosw had a capability field which can be more clearly
expressed in the code by just checking if sock->type = SOCK_RAW.
Signed-off-by: NEric Paris <eparis@redhat.com>
Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

13f18aa0

31 10月, 2009 1 次提交

net: fix sk_forward_alloc corruption · 9d410c79

由 Eric Dumazet 提交于 10月 30, 2009

On UDP sockets, we must call skb_free_datagram() with socket locked,
or risk sk_forward_alloc corruption. This requirement is not respected
in SUNRPC.

Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC
Reported-by: NFrancis Moreau <francis.moro@gmail.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9d410c79

19 10月, 2009 2 次提交

net: sk_drops consolidation part 2 · 8edf19c2

由 Eric Dumazet 提交于 10月 15, 2009

- skb_kill_datagram() can increment sk->sk_drops itself, not callers.

- UDP on IPV4 & IPV6 dropped frames (because of bad checksum or policy checks) increment sk_drops
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8edf19c2

inet: rename some inet_sock fields · c720c7e8

由 Eric Dumazet 提交于 10月 15, 2009

In order to have better cache layouts of struct sock (separate zones
for rx/tx paths), we need this preliminary patch.

Goal is to transfert fields used at lookup time in the first
read-mostly cache line (inside struct sock_common) and move sk_refcnt
to a separate cache line (only written by rx path)

This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
sport and id fields. This allows a future patch to define these
fields as macros, like sk_refcnt, without name clashes.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c720c7e8

15 10月, 2009 1 次提交

net: sk_drops consolidation · 766e9037

由 Eric Dumazet 提交于 10月 14, 2009

sock_queue_rcv_skb() can update sk_drops itself, removing need for
callers to take care of it. This is more consistent since
sock_queue_rcv_skb() also reads sk_drops when queueing a skb.

This adds sk_drops managment to many protocols that not cared yet.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

766e9037

13 10月, 2009 1 次提交

net: Generalize socket rx gap / receive queue overflow cmsg · 3b885787

由 Neil Horman 提交于 10月 12, 2009

Create a new socket level option to report number of queue overflows

Recently I augmented the AF_PACKET protocol to report the number of frames lost
on the socket receive queue between any two enqueued frames. This value was
exported via a SOL_PACKET level cmsg. AFter I completed that work it was
requested that this feature be generalized so that any datagram oriented socket
could make use of this option. As such I've created this patch, It creates a
new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a
SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue
overflowed between any two given frames. It also augments the AF_PACKET
protocol to take advantage of this new feature (as it previously did not touch
sk->sk_drops, which this patch uses to record the overflow count). Tested
successfully by me.

Notes:

1) Unlike my previous patch, this patch simply records the sk_drops value, which
is not a number of drops between packets, but rather a total number of drops.
Deltas must be computed in user space.

2) While this patch currently works with datagram oriented protocols, it will
also be accepted by non-datagram oriented protocols. I'm not sure if thats
agreeable to everyone, but my argument in favor of doing so is that, for those
protocols which aren't applicable to this option, sk_drops will always be zero,
and reporting no drops on a receive queue that isn't used for those
non-participating protocols seems reasonable to me. This also saves us having
to code in a per-protocol opt in mechanism.

3) This applies cleanly to net-next assuming that commit
97775007 (my af packet cmsg patch) is reverted
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b885787

08 10月, 2009 2 次提交

udp: dynamically size hash tables at boot time · f86dcc5a

由 Eric Dumazet 提交于 10月 07, 2009

UDP_HTABLE_SIZE was initialy defined to 128, which is a bit small for
several setups.

4000 active UDP sockets -> 32 sockets per chain in average. An
incoming frame has to lookup all sockets to find best match, so long
chains hurt latency.

Instead of a fixed size hash table that cant be perfect for every
needs, let UDP stack choose its table size at boot time like tcp/ip
route, using alloc_large_system_hash() helper

Add an optional boot parameter, uhash_entries=x so that an admin can
force a size between 256 and 65536 if needed, like thash_entries and
rhash_entries.

dmesg logs two new lines :
[    0.647039] UDP hash table entries: 512 (order: 0, 4096 bytes)
[    0.647099] UDP Lite hash table entries: 512 (order: 0, 4096 bytes)

Maximal size on 64bit arches would be 65536 slots, ie 1 MBytes for non
debugging spinlocks.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f86dcc5a

IPv6: use ipv6_addr_set_v4mapped() · b301e82c

由 Brian Haley 提交于 10月 07, 2009

Might as well use the ipv6_addr_set_v4mapped() inline we created last
year.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b301e82c

07 10月, 2009 1 次提交

Use sk_mark for IPv6 routing lookups · 51953d5b

由 Brian Haley 提交于 10月 05, 2009

Atis Elsts wrote:
> Not sure if there is need to fill the mark from skb in tunnel xmit functions. In any case, it's not done for GRE or IPIP tunnels at the moment.

Ok, I'll just drop that part, I'm not sure what should be done in this case.

> Also, in this patch you are doing that for SIT (v6-in-v4) tunnels only, and not doing it for v4-in-v6 or v6-in-v6 tunnels. Any reason for that?

I just sent that patch out too quickly, here's a better one with the updates.

Add support for IPv6 route lookups using sk_mark.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

51953d5b

01 10月, 2009 1 次提交

net: Make setsockopt() optlen be unsigned. · b7058842

由 David S. Miller 提交于 9月 30, 2009

This provides safety against negative optlen at the type
level instead of depending upon (sometimes non-trivial)
checks against this sprinkled all over the the place, in
each and every implementation.

Based upon work done by Arjan van de Ven and feedback
from Linus Torvalds.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b7058842

15 9月, 2009 1 次提交

net: constify struct inet6_protocol · 41135cc8

由 Alexey Dobriyan 提交于 9月 14, 2009

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

41135cc8

03 9月, 2009 1 次提交

ip: Report qdisc packet drops · 6ce9e7b5

由 Eric Dumazet 提交于 9月 02, 2009

Christoph Lameter pointed out that packet drops at qdisc level where not
accounted in SNMP counters. Only if application sets IP_RECVERR, drops
are reported to user (-ENOBUFS errors) and SNMP counters updated.

IP_RECVERR is used to enable extended reliable error message passing,
but these are not needed to update system wide SNMP stats.

This patch changes things a bit to allow SNMP counters to be updated,
regardless of IP_RECVERR being set or not on the socket.

Example after an UDP tx flood
# netstat -s 
...
IP:
    1487048 outgoing packets dropped
...
Udp:
...
    SndbufErrors: 1487048


send() syscalls, do however still return an OK status, to not
break applications.

Note : send() manual page explicitly says for -ENOBUFS error :

 "The output queue for a network interface was full.
  This generally indicates that the interface has stopped sending,
  but may be caused by transient congestion.
  (Normally, this does not occur in Linux. Packets are just silently
  dropped when a device queue overflows.) "

This is not true for IP_RECVERR enabled sockets : a send() syscall
that hit a qdisc drop returns an ENOBUFS error.

Many thanks to Christoph, David, and last but not least, Alexey !
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6ce9e7b5

14 8月, 2009 1 次提交

inet6: Conversion from u8 to int · e651f03a

由 Gerrit Renker 提交于 8月 09, 2009

This replaces assignments of the type "int on LHS" = "u8 on RHS" with
simpler code. The LHS can express all of the unsigned right hand side
values, hence the assigned value can not be negative.
Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e651f03a

13 7月, 2009 2 次提交

udpv6: Handle large incoming UDP/IPv6 packets and support software UFO · ba735425

由 Sridhar Samudrala 提交于 7月 09, 2009

- validate and forward GSO UDP/IPv6 packets from untrusted sources.
- do software UFO if the outgoing device doesn't support UFO.
Signed-off-by: NSridhar Samudrala <sri@us.ibm.com>
Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ba735425

udpv6: Fix HW checksum support for outgoing UFO packets · 493c6be3

由 Sridhar Samudrala 提交于 7月 09, 2009

- add HW checksum support for outgoing large UDP/IPv6 packets destined for
  a UFO enabled device.
Signed-off-by: NSridhar Samudrala <sri@us.ibm.com>
Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

493c6be3

23 6月, 2009 1 次提交

ipv6: Use correct data types for ICMPv6 type and code · d5fdd6ba

由 Brian Haley 提交于 6月 23, 2009

Change all the code that deals directly with ICMPv6 type and code
values to use u8 instead of a signed int as that's the actual data
type.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d5fdd6ba

18 6月, 2009 1 次提交

net: correct off-by-one write allocations reports · 31e6d363

由 Eric Dumazet 提交于 6月 17, 2009

commit 2b85a34e
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.

We need to take into account this offset when reporting
sk_wmem_alloc to user, in PROC_FS files or various
ioctls (SIOCOUTQ/TIOCOUTQ)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

31e6d363

03 6月, 2009 1 次提交

net: skb->dst accessors · adf30907

由 Eric Dumazet 提交于 6月 02, 2009

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

adf30907

11 4月, 2009 1 次提交

ipv6: Fix NULL pointer dereference with time-wait sockets · 499923c7

由 Vlad Yasevich 提交于 4月 09, 2009

Commit b2f5e7cd
(ipv6: Fix conflict resolutions during ipv6 binding)
introduced a regression where time-wait sockets were
not treated correctly.  This resulted in the following:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000062
IP: [<ffffffff805d7d61>] ipv4_rcv_saddr_equal+0x61/0x70
...
Call Trace:
[<ffffffffa033847b>] ipv6_rcv_saddr_equal+0x1bb/0x250 [ipv6]
[<ffffffffa03505a8>] inet6_csk_bind_conflict+0x88/0xd0 [ipv6]
[<ffffffff805bb18e>] inet_csk_get_port+0x1ee/0x400
[<ffffffffa0319b7f>] inet6_bind+0x1cf/0x3a0 [ipv6]
[<ffffffff8056d17c>] ? sockfd_lookup_light+0x3c/0xd0
[<ffffffff8056ed49>] sys_bind+0x89/0x100
[<ffffffff80613ea2>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[<ffffffff8020bf9b>] system_call_fastpath+0x16/0x1b
Tested-by: NBrian Haley <brian.haley@hp.com>
Tested-by: NEd Tomlinson <edt@aei.ca>
Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

499923c7

25 3月, 2009 1 次提交

ipv6: Fix conflict resolutions during ipv6 binding · b2f5e7cd

由 Vlad Yasevich 提交于 3月 24, 2009

The ipv6 version of bind_conflict code calls ipv6_rcv_saddr_equal()
which at times wrongly identified intersections between addresses.
It particularly broke down under a few instances and caused erroneous
bind conflicts.
Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b2f5e7cd

16 12月, 2008 1 次提交

ipv6: fix the outgoing interface selection order in udpv6_sendmsg() · 9f690db7

由 Yang Hongyang 提交于 12月 16, 2008

1.When no interface is specified in an IPV6_PKTINFO ancillary data
  item, the interface specified in an IPV6_PKTINFO sticky optionis 
  is used.

RFC3542:
6.7.  Summary of Outgoing Interface Selection

   This document and [RFC-3493] specify various methods that affect the
   selection of the packet's outgoing interface.  This subsection
   summarizes the ordering among those in order to ensure deterministic
   behavior.

   For a given outgoing packet on a given socket, the outgoing interface
   is determined in the following order:

   1. if an interface is specified in an IPV6_PKTINFO ancillary data
      item, the interface is used.

   2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky
      option, the interface is used.
Signed-off-by: NYang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9f690db7

26 11月, 2008 1 次提交

netns xfrm: lookup in netns · 52479b62

由 Alexey Dobriyan 提交于 11月 25, 2008

Pass netns to xfrm_lookup()/__xfrm_lookup(). For that pass netns
to flow_cache_lookup() and resolver callback.

Take it from socket or netdevice. Stub DECnet to init_net.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

52479b62