提交 · d5fdd6babcfc2b0e6a8da1acf492a69fb54b4c47 · OpenHarmony / kernel_linux

17 6月, 2009 1 次提交

net: sk_wmem_alloc has initial value of one, not zero · c564039f

由 Eric Dumazet 提交于 6月 16, 2009

commit 2b85a34e
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.

Some protocols check sk_wmem_alloc value to determine if a timer
must delay socket deallocation. We must take care of the sk_wmem_alloc
value being one instead of zero when no write allocations are pending.

Reported by Ingo Molnar, and full diagnostic from David Miller.

This patch introduces three helpers to get read/write allocations
and a followup patch will use these helpers to report correct
write allocations to user.
Reported-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c564039f

11 6月, 2009 1 次提交

net: No more expensive sock_hold()/sock_put() on each tx · 2b85a34e

由 Eric Dumazet 提交于 6月 11, 2009

One of the problem with sock memory accounting is it uses
a pair of sock_hold()/sock_put() for each transmitted packet.

This slows down bidirectional flows because the receive path
also needs to take a refcount on socket and might use a different
cpu than transmit path or transmit completion path. So these
two atomic operations also trigger cache line bounces.

We can see this in tx or tx/rx workloads (media gateways for example),
where sock_wfree() can be in top five functions in profiles.

We use this sock_hold()/sock_put() so that sock freeing
is delayed until all tx packets are completed.

As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
by one unit at init time, until sk_free() is called.
Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
to decrement initial offset and atomicaly check if any packets
are in flight.

skb_set_owner_w() doesnt call sock_hold() anymore

sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc
reached 0 to perform the final freeing.

Drawback is that a skb->truesize error could lead to unfreeable sockets, or
even worse, prematurely calling __sk_free() on a live socket.

Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s
on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt
contention point. 5 % speedup on a UDP transmit workload (depends
on number of flows), lowering TX completion cpu usage.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2b85a34e

18 2月, 2009 1 次提交

net: Kill skb_truesize_check(), it only catches false-positives. · 92a0acce

由 David S. Miller 提交于 2月 17, 2009

A long time ago we had bugs, primarily in TCP, where we would modify
skb->truesize (for TSO queue collapsing) in ways which would corrupt
the socket memory accounting.

skb_truesize_check() was added in order to try and catch this error
more systematically.

However this debugging check has morphed into a Frankenstein of sorts
and these days it does nothing other than catch false-positives.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

92a0acce

16 2月, 2009 1 次提交

net: socket infrastructure for SO_TIMESTAMPING · 20d49473

由 Patrick Ohly 提交于 2月 12, 2009

The overlap with the old SO_TIMESTAMP[NS] options is handled so
that time stamping in software (net_enable_timestamp()) is
enabled when SO_TIMESTAMP[NS] and/or SO_TIMESTAMPING_RX_SOFTWARE
is set.  It's disabled if all of these are off.
Signed-off-by: NPatrick Ohly <patrick.ohly@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

20d49473

13 2月, 2009 1 次提交

net: don't use in_atomic() in gfp_any() · 99709372

由 Andrew Morton 提交于 2月 12, 2009

The problem is that in_atomic() will return false inside spinlocks if
CONFIG_PREEMPT=n.  This will lead to deadlockable GFP_KERNEL allocations
from spinlocked regions.

Secondly, if CONFIG_PREEMPT=y, this bug solves itself because networking
will instead use GFP_ATOMIC from this callsite.  Hence we won't get the
might_sleep() debugging warnings which would have informed us of the buggy
callsites.

Solve both these problems by switching to in_interrupt().  Now, if someone
runs a gfp_any() allocation from inside spinlock we will get the warning
if CONFIG_PREEMPT=y.

I reviewed all callsites and most of them were too complex for my little
brain and none of them documented their interface requirements.  I have no
idea what this patch will do.
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

99709372

05 2月, 2009 1 次提交

net: Reexport sock_alloc_send_pskb · 4cc7f68d

由 Herbert Xu 提交于 2月 04, 2009

The function sock_alloc_send_pskb is completely useless if not
exported since most of the code in it won't be used as is.  In
fact, this code has already been duplicated in the tun driver.

Now that we need accounting in the tun driver, we can in fact
use this function as is.  So this patch marks it for export again.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4cc7f68d

26 11月, 2008 2 次提交

net: Use a percpu_counter for orphan_count · dd24c001

由 Eric Dumazet 提交于 11月 25, 2008

Instead of using one atomic_t per protocol, use a percpu_counter
for "orphan_count", to reduce cache line contention on
heavy duty network servers. 
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dd24c001

net: Use a percpu_counter for sockets_allocated · 1748376b

由 Eric Dumazet 提交于 11月 25, 2008

Instead of using one atomic_t per protocol, use a percpu_counter
for "sockets_allocated", to reduce cache line contention on
heavy duty network servers. 

Note : We revert commit (248969ae
net: af_unix can make unix_nr_socks visbile in /proc),
since it is not anymore used after sock_prot_inuse_add() addition
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1748376b

17 11月, 2008 1 次提交

udp: Use hlist_nulls in UDP RCU code · 88ab1932

由 Eric Dumazet 提交于 11月 16, 2008

This is a straightforward patch, using hlist_nulls infrastructure.

RCUification already done on UDP two weeks ago.

Using hlist_nulls permits us to avoid some memory barriers, both
at lookup time and delete time.

Patch is large because it adds new macros to include/net/sock.h.
These macros will be used by TCP & DCCP in next patch.
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

88ab1932

14 11月, 2008 1 次提交

lockdep: include/linux/lockdep.h - fix warning in net/bluetooth/af_bluetooth.c · e8f6fbf6

由 Ingo Molnar 提交于 11月 12, 2008

fix this warning:

  net/bluetooth/af_bluetooth.c:60: warning: ‘bt_key_strings’ defined but not used
  net/bluetooth/af_bluetooth.c:71: warning: ‘bt_slock_key_strings’ defined but not used

this is a lockdep macro problem in the !LOCKDEP case.

We cannot convert it to an inline because the macro works on multiple types,
but we can mark the parameter used.

[ also clean up a misaligned tab in sock_lock_init_class_and_name() ]

[ also remove #ifdefs from around af_family_clock_key strings - which
  were certainly added to get rid of the ugly build warnings. ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e8f6fbf6

13 11月, 2008 1 次提交

net: ifdef struct sock::sk_async_wait_queue · 23789824

由 Alexey Dobriyan 提交于 11月 12, 2008

Every user is under CONFIG_NET_DMA already, so ifdef field as well.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

23789824

12 11月, 2008 1 次提交

lockdep: include/linux/lockdep.h - fix warning in net/bluetooth/af_bluetooth.c · e25cf3db

由 Ingo Molnar 提交于 10月 17, 2008

fix this warning:

  net/bluetooth/af_bluetooth.c:60: warning: ‘bt_key_strings’ defined but not used
  net/bluetooth/af_bluetooth.c:71: warning: ‘bt_slock_key_strings’ defined but not used

this is a lockdep macro problem in the !LOCKDEP case.

We cannot convert it to an inline because the macro works on multiple types,
but we can mark the parameter used.

[ also clean up a misaligned tab in sock_lock_init_class_and_name() ]

[ also remove #ifdefs from around af_family_clock_key strings - which
  were certainly added to get rid of the ugly build warnings. ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e25cf3db

05 11月, 2008 1 次提交

net: #ifdef ->sk_security · d5f64238

由 Alexey Dobriyan 提交于 11月 04, 2008

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d5f64238

31 10月, 2008 1 次提交

net: delete excess kernel-doc notation · ad1d967c

由 Randy Dunlap 提交于 10月 30, 2008

Remove excess kernel-doc function parameters from networking header
& driver files:

Warning(include/net/sock.h:946): Excess function parameter or struct member 'sk' description in 'sk_filter_release'
Warning(include/linux/netdevice.h:1545): Excess function parameter or struct member 'cpu' description in 'netif_tx_lock'
Warning(drivers/net/wan/z85230.c:712): Excess function parameter or struct member 'regs' description in 'z8530_interrupt'
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ad1d967c

30 10月, 2008 1 次提交

udp: introduce sk_for_each_rcu_safenext() · 96631ed1

由 Eric Dumazet 提交于 10月 29, 2008

Corey Minyard found a race added in commit 271b72c7
(udp: RCU handling for Unicast packets.)

 "If the socket is moved from one list to another list in-between the
 time the hash is calculated and the next field is accessed, and the
 socket has moved to the end of the new list, the traversal will not
 complete properly on the list it should have, since the socket will
 be on the end of the new list and there's not a way to tell it's on a
 new list and restart the list traversal.  I think that this can be
 solved by pre-fetching the "next" field (with proper barriers) before
 checking the hash."

This patch corrects this problem, introducing a new
sk_for_each_rcu_safenext() macro.
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

96631ed1

29 10月, 2008 3 次提交

udp: RCU handling for Unicast packets. · 271b72c7

由 Eric Dumazet 提交于 10月 29, 2008

Goals are :

1) Optimizing handling of incoming Unicast UDP frames, so that no memory
 writes should happen in the fast path.

 Note: Multicasts and broadcasts still will need to take a lock,
 because doing a full lockless lookup in this case is difficult.

2) No expensive operations in the socket bind/unhash phases :
  - No expensive synchronize_rcu() calls.

  - No added rcu_head in socket structure, increasing memory needs,
  but more important, forcing us to use call_rcu() calls,
  that have the bad property of making sockets structure cold.
  (rcu grace period between socket freeing and its potential reuse
   make this socket being cold in CPU cache).
  David did a previous patch using call_rcu() and noticed a 20%
  impact on TCP connection rates.
  Quoting Cristopher Lameter :
   "Right. That results in cacheline cooldown. You'd want to recycle
    the object as they are cache hot on a per cpu basis. That is screwed
    up by the delayed regular rcu processing. We have seen multiple
    regressions due to cacheline cooldown.
    The only choice in cacheline hot sensitive areas is to deal with the
    complexity that comes with SLAB_DESTROY_BY_RCU or give up on RCU."

  - Because udp sockets are allocated from dedicated kmem_cache,
  use of SLAB_DESTROY_BY_RCU can help here.

Theory of operation :
---------------------

As the lookup is lockfree (using rcu_read_lock()/rcu_read_unlock()),
special attention must be taken by readers and writers.

Use of SLAB_DESTROY_BY_RCU is tricky too, because a socket can be freed,
reused, inserted in a different chain or in worst case in the same chain
while readers could do lookups in the same time.

In order to avoid loops, a reader must check each socket found in a chain
really belongs to the chain the reader was traversing. If it finds a
mismatch, lookup must start again at the begining. This *restart* loop
is the reason we had to use rdlock for the multicast case, because
we dont want to send same message several times to the same socket.

We use RCU only for fast path.
Thus, /proc/net/udp still takes spinlocks.
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

271b72c7

udp: introduce struct udp_table and multiple spinlocks · 645ca708

由 Eric Dumazet 提交于 10月 29, 2008

UDP sockets are hashed in a 128 slots hash table.

This hash table is protected by *one* rwlock.

This rwlock is readlocked each time an incoming UDP message is handled.

This rwlock is writelocked each time a socket must be inserted in
hash table (bind time), or deleted from this table (close time)

This is not scalable on SMP machines :

1) Even in read mode, lock() and unlock() are atomic operations and
 must dirty a contended cache line, shared by all cpus.

2) A writer might be starved if many readers are 'in flight'. This can
 happen on a machine with some NIC receiving many UDP messages. User
 process can be delayed a long time at socket creation/dismantle time.

This patch prepares RCU migration, by introducing 'struct udp_table
and struct udp_hslot', and using one spinlock per chain, to reduce
contention on central rwlock.

Introducing one spinlock per chain reduces latencies, for port
randomization on heavily loaded UDP servers. This also speedup
bindings to specific ports.

udp_lib_unhash() was uninlined, becoming to big.

Some cleanups were done to ease review of following patch
(RCUification of UDP Unicast lookups)
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

645ca708

net: reduce structures when XFRM=n · def8b4fa

由 Alexey Dobriyan 提交于 10月 28, 2008

ifdef out
* struct sk_buff::sp		(pointer)
* struct dst_entry::xfrm	(pointer)
* struct sock::sk_policy	(2 pointers)
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

def8b4fa

08 10月, 2008 2 次提交

net: wrap sk->sk_backlog_rcv() · c57943a1

由 Peter Zijlstra 提交于 10月 07, 2008

Wrap calling sk->sk_backlog_rcv() in a function. This will allow extending the
generic sk_backlog_rcv behaviour.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c57943a1

inet: Don't lookup the socket if there's a socket attached to the skb · 23542618

由 KOVACS Krisztian 提交于 10月 07, 2008

Use the socket cached in the skb if it's present.
Signed-off-by: NKOVACS Krisztian <hidden@sch.bme.hu>
Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

23542618

28 8月, 2008 1 次提交

net: more #ifdef CONFIG_COMPAT · af01d537

由 Alexey Dobriyan 提交于 8月 28, 2008

All users of struct proto::compat_[gs]etsockopt and
struct inet_connection_sock_af_ops::compat_[gs]etsockopt are under
#ifdef already, so use it in structure definition too.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

af01d537

17 7月, 2008 1 次提交

sock: add net to prot->enter_memory_pressure callback · 5c52ba17

由 Pavel Emelyanov 提交于 7月 16, 2008

The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't
have where to get the net from.

I decided to add a sk argument, not the net itself, only to factor
all the required sock_net(sk) calls inside the enter_memory_pressure 
callback itself.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5c52ba17

18 6月, 2008 2 次提交

net: Add sk_set_socket() helper. · 972692e0

由 David S. Miller 提交于 6月 17, 2008

In order to more easily grep for all things that set
sk->sk_socket, add sk_set_socket() helper inline function.

Suggested (although only half-seriously) by Evgeniy Polyakov.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

972692e0

udp: sk_drops handling · cb61cb9b

由 Eric Dumazet 提交于 6月 17, 2008

In commits 33c732c3 ([IPV4]: Add raw
drops counter) and a92aa318 ([IPV6]:
Add raw drops counter), Wang Chen added raw drops counter for
/proc/net/raw & /proc/net/raw6

This patch adds this capability to UDP sockets too (/proc/net/udp &
/proc/net/udp6).

This means that 'RcvbufErrors' errors found in /proc/net/snmp can be also
be examined for each udp socket.

# grep Udp: /proc/net/snmp
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 23971006 75 899420 16390693 146348 0

# cat /proc/net/udp
 sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt  ---
uid  timeout inode ref pointer drops
 75: 00000000:02CB 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2358 2 ffff81082a538c80 0
111: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2286 2 ffff81042dd35c80 146348

In this example, only port 111 (0x006F) was flooded by messages that
user program could not read fast enough. 146348 messages were lost.
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cb61cb9b

17 6月, 2008 1 次提交
- D
  net: Kill SOCK_SLEEP_PRE and SOCK_SLEEP_POST, no users. · 338db085
  由 David S. Miller 提交于 6月 17, 2008
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  338db085
15 6月, 2008 1 次提交

net: change proto destroy method to return void · 7d06b2e0

由 Brian Haley 提交于 6月 14, 2008

Change struct proto destroy function pointer to return void.  Noticed
by Al Viro.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7d06b2e0

16 4月, 2008 1 次提交

[NETNS]: Add netns refcnt debug for kernel sockets. · 65a18ec5

由 Denis V. Lunev 提交于 4月 16, 2008

Protocol control sockets and netlink kernel sockets should not prevent the
namespace stop request. They are initialized and disposed in a special way by
sk_change_net/sk_release_kernel.
Signed-off-by: NDenis V. Lunev <den@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

65a18ec5

10 4月, 2008 1 次提交

socket: sk_filter deinline · 43db6d65

由 Stephen Hemminger 提交于 4月 10, 2008

The sk_filter function is too big to be inlined. This saves 2296 bytes
of text on allyesconfig.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

43db6d65

01 4月, 2008 1 次提交

[SOCK][NETNS]: Add a struct net argument to sock_prot_inuse_add and _get. · c29a0bc4

由 Pavel Emelyanov 提交于 3月 31, 2008

This counter is about to become per-proto-and-per-net, so we'll need 
two arguments to determine which cell in this "table" to work with.

All the places, but proc already pass proper net to it - proc will be
tuned a bit later.

Some indentation with spaces in proc files is done to keep the file
coding style consistent.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c29a0bc4

29 3月, 2008 4 次提交

[SOCK]: Drop inuse pcounter from struct proto (v2). · bdcde3d7

由 Pavel Emelyanov 提交于 3月 28, 2008

An uppercut - do not use the pcounter on struct proto.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Acked-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bdcde3d7

[SOCK]: Drop per-proto inuse init and fre functions (v2). · 60e7663d

由 Pavel Emelyanov 提交于 3月 28, 2008

Constructive part of the set is finished here. We have to remove the
pcounter, so start with its init and free functions.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Acked-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

60e7663d

[SOCK]: Introduce a percpu inuse counters array (v2). · 1338d466

由 Pavel Emelyanov 提交于 3月 28, 2008

And redirect sock_prot_inuse_add and _get to use one.

As far as the dereferences are concerned. Before the patch we made
1 dereference to proto->inuse.add call, the call itself and then
called the __get_cpu_var() on a static variable. After the patch we 
make a direct call, then one dereference to proto->inuse_idx and 
then the same __get_cpu_var() on a still static variable. So this 
patch doesn't seem to produce performance penalty on SMP.

This is not per-net yet, but I will deliberately make NET_NS=y case
separated from NET_NS=n one, since it'll cost us one-or-two more 
dereferences to get the struct net and the inuse counter.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Acked-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1338d466

[SOCK]: Enumerate struct proto-s to facilitate percpu inuse accounting (v2). · 13ff3d6f

由 Pavel Emelyanov 提交于 3月 28, 2008

The inuse counters are going to become a per-cpu array.  Introduce an
index for this array on the struct proto.

To handle the case of proto register-unregister-register loop the
bitmap is used. All its bits manipulations are protected with
proto_list_lock and a sanity check for the bitmap being exhausted is
also added.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Acked-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

13ff3d6f

26 3月, 2008 2 次提交

[NETNS]: Compilation warnings under CONFIG_NET_NS. · f5aa23fd

由 Denis V. Lunev 提交于 3月 26, 2008

Recent commits from YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
have been introduced a several compilation warnings
'assignment discards qualifiers from pointer target type'
due to extra const modifier in the inline call parameters of
{dev|sock|twsk}_net_set.

Drop it.
Signed-off-by: NDenis V. Lunev <den@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f5aa23fd

[NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS. · 3b1e0a65

由 YOSHIFUJI Hideaki 提交于 3月 26, 2008

Introduce per-sock inlines: sock_net(), sock_net_set()
and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.
Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

3b1e0a65

23 3月, 2008 2 次提交

[RAW]: Add raw_hashinfo member on struct proto. · fc8717ba

由 Pavel Emelyanov 提交于 3月 22, 2008

Sorry for the patch sequence confusion :| but I found that the similar
thing can be done for raw sockets easily too late.

Expand the proto.h union with the raw_hashinfo member and use it in
raw_prot and rawv6_prot. This allows to drop the protocol specific
versions of hash and unhash callbacks.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fc8717ba

[SOCK]: Add udp_hash member to struct proto. · 39d8cda7

由 Pavel Emelyanov 提交于 3月 22, 2008

Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to 
struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 
and -v6 protocols.

The result is not that exciting, but it removes some levels of
indirection in udpxxx_get_port and saves some space in code and text.

The first step is to union existing hashinfo and new udp_hash on the
struct proto and give a name to this union, since future initialization 
of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc 
warning about inability to initialize anonymous member this way.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

39d8cda7

22 3月, 2008 1 次提交

socket: SOCK_DEBUG type checking · 4cd9029d

由 Stephen Hemminger 提交于 3月 21, 2008

Use the inline trick (same as pr_debug) to get checking of debug
statements even if no code is generated.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4cd9029d

21 3月, 2008 1 次提交

[NET]: Add per-connection option to set max TSO frame size · 82cc1a7a

由 Peter P Waskiewicz Jr 提交于 3月 21, 2008

Update: My mailer ate one of Jarek's feedback mails...  Fixed the
parameter in netif_set_gso_max_size() to be u32, not u16.  Fixed the
whitespace issue due to a patch import botch.  Changed the types from
u32 to unsigned int to be more consistent with other variables in the
area.  Also brought the patch up to the latest net-2.6.26 tree.

Update: Made gso_max_size container 32 bits, not 16.  Moved the
location of gso_max_size within netdev to be less hotpath.  Made more
consistent names between the sock and netdev layers, and added a
define for the max GSO size.

Update: Respun for net-2.6.26 tree.

Update: changed max_gso_frame_size and sk_gso_max_size from signed to
unsigned - thanks Stephen!

This patch adds the ability for device drivers to control the size of
the TSO frames being sent to them, per TCP connection.  By setting the
netdevice's gso_max_size value, the socket layer will set the GSO
frame size based on that value.  This will propogate into the TCP
layer, and send TSO's of that size to the hardware.

This can be desirable to help tune the bursty nature of TSO on a
per-adapter basis, where one may have 1 GbE and 10 GbE devices
coexisting in a system, one running multiqueue and the other not, etc.

This can also be desirable for devices that cannot support full 64 KB
TSO's, but still want to benefit from some level of segmentation
offloading.
Signed-off-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82cc1a7a

01 3月, 2008 1 次提交

[NET]: Make netlink_kernel_release publically available as sk_release_kernel. · edf02087

由 Denis V. Lunev 提交于 2月 29, 2008

This staff will be needed for non-netlink kernel sockets, which should
also not pin a namespace like tcp_socket and icmp_socket.
Signed-off-by: NDenis V. Lunev <den@openvz.org>
Acked-by: NDaniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

edf02087

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多