Commits · 77be155cba4e163e8bba9fd27222a8b6189ec4f7 · openanolis / cloud-kernel

31 Oct, 2008 6 commits

pkt_sched: Add peek emulation for non-work-conserving qdiscs. · 77be155c

Committed by Jarek Poplawski on Oct 31, 2008

This patch adds qdisc_peek_dequeued() wrapper to emulate peek method
with qdisc->dequeue() and storing "peeked" skb in qdisc->gso_skb until
dequeuing. This is mainly for compatibility reasons not to break some
strange configs because peeking is expected for non-work-conserving
parent qdiscs to query work-conserving child qdiscs.

This implementation requires using qdisc_dequeue_peeked() wrapper
instead of directly calling qdisc->dequeue() for all qdiscs ever
queried with qdisc->ops->peek() or qdisc_peek_dequeued().
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

77be155c
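
To make the peek emulation described in 77be155c concrete, here is a minimal userspace sketch in C. It is not the kernel code: `struct pkt`, `struct toy_qdisc`, its `stash` field (standing in for qdisc->gso_skb) and the `toy_*` helpers are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's sk_buff and Qdisc. */
struct pkt { int id; };

struct toy_qdisc {
    struct pkt *(*dequeue)(struct toy_qdisc *q); /* the real dequeue method */
    struct pkt *stash;                           /* plays the role of gso_skb */
    struct pkt **ring;                           /* backing FIFO for the demo */
    size_t head, tail;
};

/* Plain FIFO dequeue used as the underlying ->dequeue(). */
static struct pkt *fifo_dequeue(struct toy_qdisc *q)
{
    assert(q->head <= q->tail);
    if (q->head == q->tail)
        return NULL;
    return q->ring[q->head++];
}

/* Emulated peek: dequeue once and remember the packet until it is consumed. */
static struct pkt *toy_peek_dequeued(struct toy_qdisc *q)
{
    if (!q->stash)
        q->stash = q->dequeue(q);
    return q->stash;
}

/* Must be used instead of q->dequeue() once the qdisc has been peeked. */
static struct pkt *toy_dequeue_peeked(struct toy_qdisc *q)
{
    struct pkt *p = q->stash;

    if (p)
        q->stash = NULL;       /* hand out the stashed packet first */
    else
        p = q->dequeue(q);     /* nothing stashed: fall through to dequeue */
    return p;
}
```

Once `toy_peek_dequeued()` has been called on a qdisc, calling its raw `dequeue` method directly would bypass the stashed packet and reorder the queue, which is why the commit message requires the dequeue wrapper for every qdisc that has ever been peeked.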

pkt_sched: Add ->peek() methods for fifo, prio and SFQ qdiscs. · 48a8f519

Committed by Patrick McHardy on Oct 31, 2008

From: Patrick McHardy <kaber@trash.net>

Just as a demonstration how easy adding a peek operation to the
work-conserving qdiscs actually is. It doesn't need to keep or change
any internal state in many cases thanks to the guarantee that the
packet will either be dequeued or, if another packet arrives, the
upper qdisc will immediately ->peek again to reevaluate the state.

(This is Patrick's patch, only slightly modified.)
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

48a8f519

pkt_sched: sch_generic: Add Qdisc_ops peek() method. · 90d841fd

Committed by Jarek Poplawski on Oct 31, 2008

Add Qdisc_ops peek() method in order to replace requeuing.

Based on ideas and patches of Herbert Xu, Patrick McHardy and
David S. Miller.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

90d841fd

xfrm: remove unused struct xfrm_policy::next · cc0fe835

Committed by Alexey Dobriyan on Oct 31, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc0fe835

netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys · 485ac57b

Committed by Alexey Dobriyan on Oct 30, 2008

netns ops which are registered with register_pernet_gen_device() are
shut down strictly before those which are registered with
register_pernet_subsys(). Sometimes this leads to opposite (read: buggy)
shutdown ordering between two modules.

Add register_pernet_gen_subsys()/unregister_pernet_gen_subsys() for modules
which aren't elite enough for entry in struct net, and which can't use
register_pernet_gen_device(). The PPTP conntracking module is one such module.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

485ac57b

net: delete excess kernel-doc notation · ad1d967c

Committed by Randy Dunlap on Oct 30, 2008

Remove excess kernel-doc function parameters from networking header
& driver files:

Warning(include/net/sock.h:946): Excess function parameter or struct member 'sk' description in 'sk_filter_release'
Warning(include/linux/netdevice.h:1545): Excess function parameter or struct member 'cpu' description in 'netif_tx_lock'
Warning(drivers/net/wan/z85230.c:712): Excess function parameter or struct member 'regs' description in 'z8530_interrupt'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad1d967c

30 Oct, 2008 2 commits

net: replace %p6 with %pI6 · 5b095d98

Committed by Harvey Harrison on Oct 29, 2008

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b095d98

udp: introduce sk_for_each_rcu_safenext() · 96631ed1

Committed by Eric Dumazet on Oct 29, 2008

Corey Minyard found a race added in commit 271b72c7
(udp: RCU handling for Unicast packets.)

 "If the socket is moved from one list to another list in-between the
 time the hash is calculated and the next field is accessed, and the
 socket has moved to the end of the new list, the traversal will not
 complete properly on the list it should have, since the socket will
 be on the end of the new list and there's not a way to tell it's on a
 new list and restart the list traversal.  I think that this can be
 solved by pre-fetching the "next" field (with proper barriers) before
 checking the hash."

This patch corrects this problem, introducing a new
sk_for_each_rcu_safenext() macro.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

96631ed1

29 Oct, 2008 5 commits

udp: RCU handling for Unicast packets. · 271b72c7

Committed by Eric Dumazet on Oct 29, 2008

Goals are :

1) Optimizing handling of incoming Unicast UDP frames, so that no memory
 writes should happen in the fast path.

 Note: Multicasts and broadcasts still will need to take a lock,
 because doing a full lockless lookup in this case is difficult.

2) No expensive operations in the socket bind/unhash phases :
  - No expensive synchronize_rcu() calls.

  - No added rcu_head in socket structure, increasing memory needs,
  but more important, forcing us to use call_rcu() calls,
  that have the bad property of making sockets structure cold.
  (rcu grace period between socket freeing and its potential reuse
   make this socket being cold in CPU cache).
  David did a previous patch using call_rcu() and noticed a 20%
  impact on TCP connection rates.
  Quoting Christoph Lameter:
   "Right. That results in cacheline cooldown. You'd want to recycle
    the object as they are cache hot on a per cpu basis. That is screwed
    up by the delayed regular rcu processing. We have seen multiple
    regressions due to cacheline cooldown.
    The only choice in cacheline hot sensitive areas is to deal with the
    complexity that comes with SLAB_DESTROY_BY_RCU or give up on RCU."

  - Because udp sockets are allocated from dedicated kmem_cache,
  use of SLAB_DESTROY_BY_RCU can help here.

Theory of operation :
---------------------

As the lookup is lockfree (using rcu_read_lock()/rcu_read_unlock()),
special attention must be taken by readers and writers.

Use of SLAB_DESTROY_BY_RCU is tricky too, because a socket can be freed,
reused, inserted in a different chain or, in the worst case, in the same chain
while readers are doing lookups at the same time.

In order to avoid loops, a reader must check that each socket found in a chain
really belongs to the chain the reader was traversing. If it finds a
mismatch, the lookup must start again at the beginning. This *restart* loop
is the reason we had to use rdlock for the multicast case, because
we don't want to send the same message several times to the same socket.

We use RCU only for fast path.
Thus, /proc/net/udp still takes spinlocks.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

271b72c7
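
The chain-membership check from the theory of operation in 271b72c7 can be sketched in plain C as follows. This is a single-threaded illustration of the restart logic only: it has no RCU primitives or memory barriers and is not safe for concurrent use. `struct toy_sock`, `struct toy_table`, `toy_hash()` and the other `toy_*` names are assumptions made for the example, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 128u  /* power of two, so the hash can be a simple mask */

/* Hypothetical socket: remembers which chain it currently lives on. */
struct toy_sock {
    unsigned port;
    unsigned slot;
    struct toy_sock *next;
};

struct toy_table {
    struct toy_sock *chain[TOY_SLOTS];
};

static unsigned toy_hash(unsigned port)
{
    return port & (TOY_SLOTS - 1u);
}

/* Writer side (would hold the per-chain lock in a real implementation). */
static void toy_insert(struct toy_table *t, struct toy_sock *sk)
{
    assert(t != NULL && sk != NULL);
    sk->slot = toy_hash(sk->port);
    sk->next = t->chain[sk->slot];
    t->chain[sk->slot] = sk;
}

/* Reader side: restart if a socket turns out to belong to another chain. */
static struct toy_sock *toy_lookup(struct toy_table *t, unsigned port)
{
    unsigned slot = toy_hash(port);
    struct toy_sock *sk;

begin:
    for (sk = t->chain[slot]; sk != NULL; sk = sk->next) {
        if (sk->slot != slot)
            goto begin;   /* socket was recycled elsewhere: start over */
        if (sk->port == port)
            return sk;
    }
    return NULL;
}
```

In this sketch the `goto begin` branch is never taken, because nothing moves sockets during a lookup. It marks the place where a lockless reader would detect that a freed and reused socket has led it onto a different chain.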

udp: introduce struct udp_table and multiple spinlocks · 645ca708

Committed by Eric Dumazet on Oct 29, 2008

UDP sockets are hashed in a 128-slot hash table.

This hash table is protected by *one* rwlock.

This rwlock is readlocked each time an incoming UDP message is handled.

This rwlock is writelocked each time a socket must be inserted in
hash table (bind time), or deleted from this table (close time)

This is not scalable on SMP machines :

1) Even in read mode, lock() and unlock() are atomic operations and
 must dirty a contended cache line, shared by all cpus.

2) A writer might be starved if many readers are 'in flight'. This can
 happen on a machine with some NIC receiving many UDP messages. User
 process can be delayed a long time at socket creation/dismantle time.

This patch prepares RCU migration by introducing 'struct udp_table'
and 'struct udp_hslot', and using one spinlock per chain, to reduce
contention on the central rwlock.

Introducing one spinlock per chain reduces latencies for port
randomization on heavily loaded UDP servers. This also speeds up
bindings to specific ports.

udp_lib_unhash() was uninlined, as it was becoming too big.

Some cleanups were done to ease review of following patch
(RCUification of UDP Unicast lookups)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

645ca708

net: replace uses of NIP6_FMT with %p6 · 0c6ce78a

Committed by Harvey Harrison on Oct 28, 2008

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c6ce78a

net: reduce structures when XFRM=n · def8b4fa

Committed by Alexey Dobriyan on Oct 28, 2008

ifdef out
* struct sk_buff::sp		(pointer)
* struct dst_entry::xfrm	(pointer)
* struct sock::sk_policy	(2 pointers)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

def8b4fa

netlink: constify struct nlattr * arg to parsing functions · b057efd4

Committed by Patrick McHardy on Oct 28, 2008

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

b057efd4

28 Oct, 2008 2 commits

net: implement emergency route cache rebulds when gc_elasticity is exceeded · 1080d709

Committed by Neil Horman on Oct 27, 2008

This is a patch to provide on-demand route cache rebuilding. Currently, our
route cache is rebuilt periodically regardless of need. This introduces
unneeded periodic latency. This patch offers a better approach. Using code
provided by Eric Dumazet, we compute the standard deviation of the average hash
bucket chain length while running rt_check_expire. Should any given chain
length grow larger than the average plus 4 standard deviations, we trigger an
emergency hash table rebuild for that net namespace. This allows for the common
case, in which chains are well behaved and do not grow unevenly, to not incur any
latency at all, while those systems that may be under malicious attack
rebuild only when the attack is detected. This patch takes 2 other factors into
account:
1) chains with multiple entries that differ by attributes that do not affect the
hash value are only counted once, so as not to unduly bias the system toward
rebuilding if features like QOS are heavily used
2) if rebuilding crosses a certain threshold (which is adjustable via the added
sysctl in this patch), route caching is disabled entirely for that net
namespace, since constant rebuilding is less efficient than no caching at all

Tested successfully by me.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1080d709
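
The rebuild trigger described in 1080d709 (a chain longer than the average plus 4 standard deviations) can be approximated in userspace C as below. This is a sketch under stated assumptions: `toy_needs_rebuild()` is a hypothetical name, it works on a plain array of chain lengths, and it uses floating point and compares squared distances to avoid a square root. None of this is claimed to match how the kernel patch does the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if any chain length exceeds mean + 4 * stddev, else 0.
 * Uses the population variance: var = E[x^2] - (E[x])^2.
 */
static int toy_needs_rebuild(const unsigned *len, size_t n)
{
    double sum = 0.0, sumsq = 0.0, mean, var;
    size_t i;

    assert(len != NULL && n > 0);
    for (i = 0; i < n; i++) {
        sum += len[i];
        sumsq += (double)len[i] * len[i];
    }
    mean = sum / (double)n;
    var = sumsq / (double)n - mean * mean;

    for (i = 0; i < n; i++) {
        double d = (double)len[i] - mean;

        /* d > 4 * sd  is equivalent to  d > 0 and d * d > 16 * var */
        if (d > 0.0 && d * d > 16.0 * var)
            return 1;
    }
    return 0;
}
```

With only a handful of buckets no single chain can ever sit 4 standard deviations above the mean, so the check is only meaningful for tables with many chains, which is the situation the commit message describes.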

mac80211.h: fix kernel-doc excesses · ea2d8b59

Committed by Randy Dunlap on Oct 27, 2008

Fix mac80211.h kernel-doc: it had some extra parameters that were
no longer valid and incorrect format for a return value in 2 places.

Warning(lin2628-rc2//include/net/mac80211.h:1487): Excess function parameter or struct member 'control' description in 'ieee80211_beacon_get'
Warning(lin2628-rc2//include/net/mac80211.h:1596): Excess function parameter or struct member 'control' description in 'ieee80211_get_buffered_bc'
Warning(lin2628-rc2//include/net/mac80211.h:1632): Excess function parameter or struct member 'rc4key' description in 'ieee80211_get_tkip_key'
Warning(lin2628-rc2//include/net/mac80211.h:1735): Excess function parameter or struct member 'return' description in 'ieee80211_start_tx_ba_session'
Warning(lin2628-rc2//include/net/mac80211.h:1775): Excess function parameter or struct member 'return' description in 'ieee80211_stop_tx_ba_session'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

ea2d8b59

27 Oct, 2008 1 commit

Phonet: include generic link-layer header size in MAX_PHONET_HEADER · e214a8cc

Committed by Remi Denis-Courmont on Oct 26, 2008

This fixes an OOPS in hard_header if a Phonet address is assigned to a
non-Phonet network interface.
Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e214a8cc

23 Oct, 2008 4 commits

sctp: Fix to handle SHUTDOWN in SHUTDOWN_RECEIVED state · 2e3f92da

Committed by Wei Yongjun on Oct 23, 2008

Once an endpoint has reached the SHUTDOWN-RECEIVED state,
it MUST NOT send a SHUTDOWN in response to a ULP request.
The Cumulative TSN Ack of the received SHUTDOWN chunk
MUST be processed.

This patch fixes processing of the Cumulative TSN Ack of the received
SHUTDOWN chunk in the SHUTDOWN_RECEIVED state.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e3f92da

9p: fix sparse warnings · e45c5405

Committed by Eric Van Hensbergen on Oct 22, 2008

Several sparse warnings were introduced by patches accepted during the merge
window which weren't caught. This patch fixes those warnings.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

e45c5405

9p: rdma: RDMA Transport Support for 9P · fc79d4b1

Committed by Tom Tucker on Oct 22, 2008

This patch implements the RDMA transport provider for 9P. It allows
mounts to be performed over iWARP and IB capable network interfaces.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Latchesar Ionkov <lionkov@lanl.gov>

fc79d4b1

9p: fix debug build error · 0b15a3a5

Committed by Eric Van Hensbergen on Oct 22, 2008

Fixes build problem with 9p when building with debug disabled.
Also contains some fixes for warnings which pop up when 
CONFIG_NET_9P_DEBUG is disabled.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

0b15a3a5

20 Oct, 2008 1 commit

netfilter: netns: use NFPROTO_NUMPROTO instead of NUMPROTO for tables array · 10a03a42

Committed by Patrick McHardy on Oct 20, 2008

The netfilter families have been decoupled from regular protocol families.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

10a03a42

18 Oct, 2008 13 commits

9p: Improve debug support · e7f4b8f1

Committed by Eric Van Hensbergen on Oct 17, 2008

The new debug support lacks some of the information that the previous fcprint
code provided -- this patch focuses on better presentation of debug data along
with more helpful debug along error paths.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

e7f4b8f1

9p: eliminate depricated conv functions · 02da398b

Committed by Eric Van Hensbergen on Oct 16, 2008

Remove deprecated conv functions which have been replaced with new
protocol routines.

This patch also reworks the one instance of the file-system code which
directly calls conversion routines (to accomplish unpacking dirreads).
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

02da398b

9p: rework client code to use new protocol support functions · 51a87c55

Committed by Eric Van Hensbergen on Oct 16, 2008

Now that the new protocol functions are in place, this patch switches
the client code to using the new support code.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

51a87c55

9p: remove unnecessary tag field from p9_req_t structure · cb198131

Committed by Eric Van Hensbergen on Oct 16, 2008

This removes the vestigial tag field from the p9_req_t structure.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

cb198131

9p: remove 9p fcall debug prints · 51d71f9f

Committed by Eric Van Hensbergen on Oct 16, 2008

One of the current debug options allows users to get a verbose dump of fcalls.
This isn't really necessary as correctly parsed protocol frames can be printed
as part of the code in the client functions. The consolidated printfcalls
structure would require new entries to be added for every extension. This
patch removes the debug print methods and their use.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

51d71f9f

9p: add new protocol support code · ace51c4d

Committed by Eric Van Hensbergen on Oct 13, 2008

This adds new protocol processing support code based on Anthony Liguori's
9p library code. This code performs protocol marshalling/unmarshalling using
printf-like strings to represent protocol elements. It is my intent to use
it to replace the current functions in conv.c as well as the
p9_create_* functions.

This should make the client implementation much more clear, and also make it
much easier to add new protocol extensions by limiting the number of places
in which changes need to be made.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

ace51c4d
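
As an illustration of marshalling driven by printf-like strings, the following C sketch packs fields into a little-endian buffer. The function `toy_pack()` and the format letters it accepts are assumptions chosen for the example; they are not claimed to match the format language of the actual 9p support code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store the low `n` bytes of `v` in little-endian order; returns new offset. */
static size_t put_le(uint8_t *buf, size_t off, uint64_t v, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        buf[off + i] = (uint8_t)(v >> (8 * i));
    return off + n;
}

/*
 * Hypothetical packer. Format letters: 'b' = 1 byte, 'w' = 2 bytes,
 * 'd' = 4 bytes, 's' = 2-byte length followed by the string bytes.
 * Returns the number of bytes written, or 0 on error.
 */
static size_t toy_pack(uint8_t *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    size_t off = 0;

    assert(buf != NULL && fmt != NULL);
    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        const char *s = NULL;
        uint64_t v = 0;
        size_t need;

        switch (*fmt) {
        case 'b': need = 1; v = (unsigned)va_arg(ap, int); break;
        case 'w': need = 2; v = (unsigned)va_arg(ap, int); break;
        case 'd': need = 4; v = va_arg(ap, uint32_t); break;
        case 's':
            s = va_arg(ap, const char *);
            need = 2 + strlen(s);
            break;
        default:              /* unknown format letter */
            va_end(ap);
            return 0;
        }
        /* Reject fields that do not fit or strings longer than 16 bits. */
        if (need > cap - off || need > 2 + 0xffffu) {
            va_end(ap);
            return 0;
        }
        if (s) {
            off = put_le(buf, off, need - 2, 2);
            memcpy(buf + off, s, need - 2);
            off += need - 2;
        } else {
            off = put_le(buf, off, v, need);
        }
    }
    va_end(ap);
    return off;
}
```

A call such as `toy_pack(buf, sizeof buf, "bwds", 100, 0xffff, 8192u, "9P2000")` writes a one-byte field, a two-byte field, a four-byte field and a length-prefixed string in a single statement. Adding a new message layout then only requires a new format string, which is the maintenance benefit the commit message points to.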

9p: move dirread to fs layer · 06b55b46

Committed by Eric Van Hensbergen on Oct 13, 2008

Currently reading a directory is implemented in the client code.
This function is not actually a wire operation, but a meta operation
which calls read operations and processes the results.

This patch moves this functionality to the fs layer and calls component
wire operations instead of constructing their packets. This provides a
cleaner separation and will help when we reorganize the client functions
and protocol processing methods.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

06b55b46

9p: move readn meta-function from client to fs layer · fbedadc1

Committed by Eric Van Hensbergen on Oct 13, 2008

There are a couple of methods in the client code which aren't actually
wire operations. To keep things better organized, these operations are
being moved to the fs layer.

This patch moves the readn meta-function (which executes multiple wire
reads until a buffer is full) to the fs layer.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

fbedadc1
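
The readn meta-function described in fbedadc1 (repeat wire reads until the buffer is full) can be sketched as a loop. Everything named `toy_*` here is hypothetical, including the callback signature and the in-memory `toy_wire_read()` backend, which returns at most `max_chunk` bytes per call to mimic a size-limited wire read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "wire read": returns bytes read, 0 at end of file, <0 on error. */
typedef long (*toy_read_fn)(void *ctx, char *buf, size_t count, size_t offset);

/* Demo backend: an in-memory file served in small chunks. */
struct toy_file {
    const char *data;
    size_t len;
    size_t max_chunk;
};

static long toy_wire_read(void *ctx, char *buf, size_t count, size_t offset)
{
    struct toy_file *f = ctx;
    size_t n;

    if (offset >= f->len)
        return 0;
    n = f->len - offset;
    if (n > count)
        n = count;
    if (n > f->max_chunk)
        n = f->max_chunk;
    memcpy(buf, f->data + offset, n);
    return (long)n;
}

/* Meta-operation: issue wire reads until `count` bytes, EOF or an error. */
static long toy_readn(toy_read_fn rd, void *ctx, char *buf,
                      size_t count, size_t offset)
{
    size_t total = 0;

    assert(rd != NULL && buf != NULL);
    while (total < count) {
        long n = rd(ctx, buf + total, count - total, offset + total);

        if (n < 0)
            return n;       /* propagate the error from the wire read */
        if (n == 0)
            break;          /* end of file: return a short count */
        total += (size_t)n;
    }
    return (long)total;
}
```

Because `toy_readn()` only composes calls to the read callback and never builds a packet itself, it needs no knowledge of the wire format. That is the separation the commit message gives as the reason for moving the function to the fs layer.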

9p: consolidate read/write functions · 0fc9655e

Committed by Eric Van Hensbergen on Oct 13, 2008

Currently there are two separate versions of read and write: one for
dealing with user buffers and the other for dealing with kernel buffers.
There is a tremendous amount of code duplication in the otherwise
identical versions of these functions. This patch adds an additional
user buffer parameter to read and write and conditionalizes handling of
the buffer on whether the kernel buffer or the user buffer is populated.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

0fc9655e

9p: make rpc code common and rework flush code · 91b8534f

Committed by Eric Van Hensbergen on Oct 13, 2008

This code moves the rpc function to the common client base,
reorganizes the flush code to be simpler and more stable, and
makes the necessary adjustments to the underlying transports
to adapt to the new structure.

This reduces the overall amount of code duplication between the
transports and should make adding new transports more straightforward.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

91b8534f

9p: apply common request code to trans_fd · 673d62cd

Committed by Eric Van Hensbergen on Oct 13, 2008

Apply the now common p9_req_t structure to the fd transport.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

673d62cd

9p: move request management to client code · fea511a6

Committed by Eric Van Hensbergen on Oct 13, 2008

The virtio transport uses a simplified request management system
that I want to use for all transports. This patch adapts and moves the
existing code for managing requests to the client common code.
Later patches will apply these mechanisms to the other transports.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

fea511a6

9p: consolidate transport structure · 8b81ef58

Committed by Eric Van Hensbergen on Oct 13, 2008

Right now there is a transport module structure which provides per-transport
type functions and data and a transport structure which contains per-instance
public data as well as function pointers to instance specific functions.

This patch moves public transport visible instance data to the client
structure (which in some cases had duplicate data) and consolidates the
functions into the transport module structure.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

8b81ef58

17 Oct, 2008 2 commits

sysctl: simplify ->strategy · f221e726

Committed by Alexey Dobriyan on Oct 15, 2008

The name and nlen parameters passed to the ->strategy hook are unused, so
remove them.  In general, a ->strategy hook should know what it's doing and
shouldn't do anything tricky for which, say, a pointer to the original
userspace array (name) may be needed.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ]
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f221e726

include: replace __FUNCTION__ with __func__ · d5c003b4

Committed by Harvey Harrison on Oct 15, 2008

__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d5c003b4
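
A small C example of the identifier that d5c003b4 switches to. `__func__` is the predefined identifier standardized in C99; `toy_whoami()` is a made-up function for the demonstration.

```c
#include <assert.h>
#include <string.h>

/* Returns the name of the enclosing function via the C99 identifier. */
static const char *toy_whoami(void)
{
    /* __func__ behaves like a static const char array holding the name. */
    assert(sizeof(__func__) == sizeof("toy_whoami"));
    return __func__;
}
```

Calling `toy_whoami()` yields the string "toy_whoami" on any C99 compiler, without relying on a compiler-specific spelling.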

15 Oct, 2008 2 commits

mac80211: fixme for kernel-doc · e1a65b58

Committed by Randy Dunlap on Oct 13, 2008

Fix kernel-doc warnings in mac80211.h.
Fields need real explanations added to them.

Warning(lin2627-g3-kdocfixes//include/net/mac80211.h:659): No description found for parameter 'icv_len'
Warning(lin2627-g3-kdocfixes//include/net/mac80211.h:659): No description found for parameter 'iv_len'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

e1a65b58

netfilter: ctnetlink: remove bogus module dependency between ctnetlink and nf_nat · e6a7d3c0

Committed by Pablo Neira Ayuso on Oct 14, 2008

This patch removes the module dependency between ctnetlink and
nf_nat by means of an indirect call that is initialized when
nf_nat is loaded. Now, nf_conntrack_netlink only requires
nf_conntrack and nfnetlink.

This patch puts nfnetlink_parse_nat_setup_hook into the
nf_conntrack_core to avoid dependencies between ctnetlink,
nf_conntrack_ipv4 and nf_conntrack_ipv6.

This patch also introduces the function ctnetlink_change_nat
that is only invoked from the creation path. Actually, the
nat handling cannot be invoked from the update path since
this is not allowed. By introducing this function, we remove
the useless nat handling in the update path and we avoid
deadlock-prone code.

This patch also adds the required EAGAIN logic for nfnetlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

e6a7d3c0
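
The indirect call described in e6a7d3c0 follows a common pattern: the core owns a function pointer that stays NULL until the optional module registers itself. The sketch below is a userspace illustration; all `toy_*` names, the hook signature and the exact error handling are assumptions rather than the real ctnetlink or nf_nat code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook type; the real hook takes different arguments. */
typedef int (*toy_nat_setup_fn)(int conn_id);

/* Lives in the "core"; NULL until the NAT "module" is loaded. */
static toy_nat_setup_fn toy_nat_setup_hook;

/* --- optional NAT module --- */
static int toy_nat_setup_impl(int conn_id)
{
    return conn_id >= 0 ? 0 : -EINVAL;
}

static void toy_nat_module_init(void)
{
    assert(toy_nat_setup_hook == NULL);   /* register exactly once */
    toy_nat_setup_hook = toy_nat_setup_impl;
}

static void toy_nat_module_exit(void)
{
    toy_nat_setup_hook = NULL;
}

/* --- caller: depends only on the hook, not on the NAT module --- */
static int toy_change_nat(int conn_id)
{
    toy_nat_setup_fn fn = toy_nat_setup_hook;

    if (fn == NULL)
        return -EAGAIN;   /* assumed: caller retries once the module is loaded */
    return fn(conn_id);
}
```

Because the caller references only the pointer, it carries no link-time dependency on the module that provides the implementation, which is the dependency the commit removes.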

12 Oct, 2008 1 commit

net: fix dummy 'nf_conntrack_event_cache()' · 64f1b653

Committed by Linus Torvalds on Oct 11, 2008

The dummy version of 'nf_conntrack_event_cache()' (used when the
NF_CONNTRACK_EVENTS config option is not enabled) had not been updated
when the calling convention changed.

This was introduced by commit a71996fc
("netfilter: netns nf_conntrack: pass conntrack to
nf_conntrack_event_cache() not skb")

Tssk.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

64f1b653

10 Oct, 2008 1 commit

netlabel: Add configuration support for local labeling · d91d4079

Committed by Paul Moore on Oct 10, 2008

Add the necessary NetLabel support for the new CIPSO mapping,
CIPSO_V4_MAP_LOCAL, which allows full LSM label/context support.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>

d91d4079

openanolis / cloud-kernel synced successfully about 1 year ago
