Commits · 9981151086385eecc2febf4ba95a14593f834b3d · openanolis / cloud-kernel

22 Jul, 2010 9 commits

Bluetooth: Implemented HCI frame reassembly for RX from stream · 99811510

Committed by Suraj Sumangala on Jul 14, 2010

Implemented frame reassembly for fragments received from a stream.
Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
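
The idea of reassembling HCI frames from a byte stream can be sketched as a small state machine. This is not the kernel code from this commit: `struct reasm` and `reasm_feed()` are hypothetical names, the 1024-byte buffer is an arbitrary assumption, and only the H4 packet indicators and header layouts (ACL, SCO, event) come from the Bluetooth UART transport format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* H4 packet type indicators that precede each frame on a UART stream. */
enum { PKT_ACL = 0x02, PKT_SCO = 0x03, PKT_EVENT = 0x04 };

/* Hypothetical reassembly state; zero it before first use. */
struct reasm {
    uint8_t  type;       /* 0 while waiting for a packet type byte */
    int      in_payload; /* 0: collecting header, 1: collecting payload */
    size_t   have;       /* bytes of the current frame collected so far */
    size_t   expect;     /* bytes needed to finish the current stage */
    unsigned frames;     /* number of frames completed so far */
    size_t   last_len;   /* header + payload length of the last frame */
    uint8_t  buf[1024];  /* frame in progress (size is an assumption) */
};

static size_t header_len(uint8_t type)
{
    switch (type) {
    case PKT_ACL:   return 4; /* handle(2) + data length(2, little endian) */
    case PKT_SCO:   return 3; /* handle(2) + data length(1) */
    case PKT_EVENT: return 2; /* event code(1) + parameter length(1) */
    default:        return 0; /* unknown type */
    }
}

static size_t payload_len(const struct reasm *r)
{
    switch (r->type) {
    case PKT_ACL: return (size_t)r->buf[2] | ((size_t)r->buf[3] << 8);
    case PKT_SCO: return r->buf[2];
    default:      return r->buf[1]; /* PKT_EVENT */
    }
}

/* Feed one chunk; returns 0, or -1 on unknown type or oversized frame. */
static int reasm_feed(struct reasm *r, const uint8_t *data, size_t count)
{
    assert(r != NULL && (data != NULL || count == 0));
    while (count > 0) {
        size_t n;

        if (r->type == 0) {              /* first byte selects the header */
            size_t hlen = header_len(data[0]);
            if (hlen == 0)
                return -1;
            r->type = data[0];
            r->in_payload = 0;
            r->have = 0;
            r->expect = hlen;
            data++; count--;
            continue;
        }
        n = r->expect - r->have;
        if (n > count)
            n = count;
        memcpy(r->buf + r->have, data, n);
        r->have += n; data += n; count -= n;
        if (r->have < r->expect)
            break;                       /* chunk exhausted, wait for more */
        if (!r->in_payload) {            /* header done: learn payload size */
            size_t plen = payload_len(r);
            if (r->have + plen > sizeof(r->buf)) {
                r->type = 0;
                return -1;
            }
            r->in_payload = 1;
            r->expect += plen;
            if (plen > 0)
                continue;
        }
        r->frames++;                     /* frame complete */
        r->last_len = r->have;
        r->type = 0;
    }
    return 0;
}
```

A caller would zero a `struct reasm` once and pass every received chunk to `reasm_feed()`, regardless of where the chunk boundaries fall relative to frame boundaries.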

Bluetooth: Implement hci_reassembly helper to reassemble RX packets · 33e882a5

Committed by Suraj Sumangala on Jul 14, 2010

Implements a feature to reassemble received HCI frames from any input stream.
Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add one more buffer for HCI stream reassembly · cd4c5391

Committed by Suraj Sumangala on Jul 14, 2010

Additional reassembly buffer to keep track of stream reassembly.
Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add Copyright notice to L2CAP · ce5706bd

Committed by Gustavo F. Padovan on Jul 13, 2010

Copyright for the time I worked on L2CAP during the Google Summer of Code
program.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Remove the send_lock spinlock from ERTM · e0f66218

Committed by Gustavo F. Padovan on Jun 21, 2010

Using a lock to deal with the ERTM race condition - interruption with
new data from the hci layer - is wrong. We should use the native skb
backlog queue.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Disconnect early if mode is not supported · cf6c2c0b

Committed by Gustavo F. Padovan on Jun 07, 2010

When the mode is mandatory we shall not send a connect request, and we
report this to userspace as well.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Reassigned copyright to Code Aurora Forum · 2d0a0346

Committed by Ron Shaffer on May 28, 2010

Qualcomm, Inc. has reassigned rights to Code Aurora Forum. Accordingly,
as files are modified by Code Aurora Forum members, the copyright
statement will be updated.
Signed-off-by: Ron Shaffer <rshaffer@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Remove extraneous white space · 04fafe4e

Committed by Ron Shaffer on May 28, 2010

Deleted extraneous white space from the end of several lines.
Signed-off-by: Ron Shaffer <rshaffer@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add blacklist support for incoming connections · f0358568

Committed by Johan Hedberg on May 18, 2010

In some circumstances it could be desirable to reject incoming
connections on the baseband level. This patch adds this feature through
two new ioctls: HCIBLOCKADDR and HCIUNBLOCKADDR. Both take a simple
Bluetooth address as a parameter. BDADDR_ANY can be used with
HCIUNBLOCKADDR to remove all devices from the blacklist.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

16 Jul, 2010 1 commit

tcp: sizeof struct tcp_skb_cb is 44 · f86586fa

Committed by Eric Dumazet on Jul 15, 2010

Correct the comment stating sizeof(struct tcp_skb_cb) is 36 or 40; it has
been 44 bytes since commit 951dbc8a ([IPV6]: Move nextheader offset
to the IP6CB).
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Jul, 2010 1 commit

net: fix problem in reading sock TX queue · b0f77d0e

Committed by Tom Herbert on Jul 14, 2010

Fix problem in reading the tx_queue recorded in a socket.  In
dev_pick_tx, the TX queue is read by doing a check with
sk_tx_queue_recorded on the socket, followed by a sk_tx_queue_get.
The problem is that there is no mutual exclusion across these
calls in the socket, so it is possible that the queue in the
sock can be invalidated after sk_tx_queue_recorded is called so
that sk_tx_queue_get returns -1, which sets 65535 in queue_index
and thus dev_pick_tx returns 65536 which is a bogus queue and
can cause a crash in dev_queue_xmit.

We fix this by only calling sk_tx_queue_get, which does the proper
checks.  The interface is that sk_tx_queue_get returns the TX queue
if the sock argument is non-NULL and a TX queue is recorded, else it
returns -1.  sk_tx_queue_recorded is no longer used so it can be
completely removed.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
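
The check-then-read race and its fix can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the kernel code: `struct sock_model`, `tx_queue_get()`, `fallback_queue()` and `pick_tx()` are hypothetical stand-ins, and the fallback simply returns queue 0 where the real code would hash the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sock: -1 means "no TX queue recorded". */
struct sock_model {
    int tx_queue_mapping;
};

/* One read that both checks and fetches, as the commit message describes. */
static int tx_queue_get(const struct sock_model *sk)
{
    return sk ? sk->tx_queue_mapping : -1;
}

/* Hypothetical fallback used when nothing is recorded. */
static uint16_t fallback_queue(void)
{
    return 0;
}

/* Fixed pattern: read once into a signed local, then test that local. */
static uint16_t pick_tx(const struct sock_model *sk)
{
    int queue_index = tx_queue_get(sk);

    if (queue_index < 0)
        return fallback_queue();
    return (uint16_t)queue_index;
}

/*
 * What the racy path effectively did when the mapping was invalidated
 * between the "recorded" check and the read: -1 stored into a 16-bit
 * unsigned index becomes 65535.
 */
static uint16_t store_unchecked(int value_from_get)
{
    return (uint16_t)value_from_get;
}
```

The point of the single-read pattern is that the value being tested and the value being used are the same local copy, so a concurrent reset cannot slip in between the two.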

13 Jul, 2010 2 commits

inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and inet_sendpage() · 7ba42910

Committed by Changli Gao on Jul 10, 2010

A new boolean flag, no_autobind, is added to struct proto to avoid the
autobind calls when the protocol is TCP. Then sock_rps_record_flow() is
called in TCP's sendmsg() and sendpage() paths.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_common.h |    4 ++++
 include/net/sock.h        |    1 +
 include/net/tcp.h         |    8 ++++----
 net/ipv4/af_inet.c        |   15 +++++++++------
 net/ipv4/tcp.c            |   11 +++++------
 net/ipv4/tcp_ipv4.c       |    3 +++
 net/ipv6/af_inet6.c       |    8 ++++----
 net/ipv6/tcp_ipv6.c       |    3 +++
 8 files changed, 33 insertions(+), 20 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cleanups · 53d3176b

Committed by Changli Gao on Jul 10, 2010

Remove useless blanks.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_common.h |   55 ++++-------
 include/net/tcp.h         |  222 +++++++++++++++++-----------------------------
 include/net/udp.h         |   38 +++----
 3 files changed, 123 insertions(+), 192 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Jul, 2010 2 commits

net: decreasing real_num_tx_queues needs to flush qdisc · f0796d5c

Committed by John Fastabend on Jul 01, 2010

Reducing real_num_tx_queues needs to flush the qdisc, otherwise
skbs with queue_mappings greater than real_num_tx_queues can
be sent to the underlying driver.

The flow for this is,

dev_queue_xmit()
	dev_pick_tx()
		skb_tx_hash()  => hash using real_num_tx_queues
		skb_set_queue_mapping()
	...
	qdisc_enqueue_root() => enqueue skb on txq from hash
...
dev->real_num_tx_queues -= n
...
sch_direct_xmit()
	dev_hard_start_xmit()
		ndo_start_xmit(skb,dev) => skb queue set with old hash

skbs are enqueued on the qdisc with skb->queue_mapping set
0 < queue_mappings < real_num_tx_queues.  When the driver
decreases real_num_tx_queues, skbs may be dequeued from the
qdisc with a queue_mapping greater than real_num_tx_queues.

This fixes a case in ixgbe where this was occurring with DCB
and FCoE. Because the driver is using queue_mapping to map
skbs to tx descriptor rings we can potentially map skbs to
rings that no longer exist.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sched: qdisc_reset_all_tx is calling qdisc_reset without qdisc_lock · 4ef6acff

Committed by John Fastabend on Jul 01, 2010

When calling qdisc_reset() the qdisc lock needs to be held.  In
this case there is at least one driver, i4l, which is using this
without holding the lock.  Add the locking here.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Jul, 2010 4 commits

fragment: add fast path for in-order fragments · d6bebca9

Committed by Changli Gao on Jun 29, 2010

add fast path for in-order fragments

As fragments are sent in order by most OSes, such as Windows, Darwin and
FreeBSD, it is likely that new fragments belong at the end of the
inet_frag_queue. In the fast path, we check if the skb at the end of the
inet_frag_queue is the prev we expect.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_frag.h |    1 +
 net/ipv4/ip_fragment.c  |   12 ++++++++++++
 net/ipv6/reassembly.c   |   11 +++++++++++
 3 files changed, 24 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
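
The fast path amounts to checking the tail of an offset-sorted list before walking it from the head. The following is a minimal sketch of that technique, assuming a plain singly linked list; `struct frag`, `struct frag_queue` and `frag_insert()` are hypothetical names, and overlap handling, which real reassembly code needs, is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment node, ordered by byte offset within the datagram. */
struct frag {
    unsigned int offset;
    struct frag *next;
};

/* Hypothetical queue that remembers its tail as well as its head. */
struct frag_queue {
    struct frag *head;
    struct frag *tail;
};

/*
 * Insert f in offset order. Returns 1 when the fast path was taken
 * (queue empty, or f belongs after the current tail), 0 when the list
 * had to be walked from the head.
 */
static int frag_insert(struct frag_queue *q, struct frag *f)
{
    struct frag *prev = q->tail;
    struct frag *next = NULL;
    int fast = 1;

    assert(q != NULL && f != NULL);
    if (prev && prev->offset >= f->offset) {
        /* Slow path: find the first node at or beyond the new offset. */
        fast = 0;
        prev = NULL;
        for (next = q->head; next; next = next->next) {
            if (next->offset >= f->offset)
                break;
            prev = next;
        }
    }

    f->next = next;
    if (!next)
        q->tail = f;          /* appended at the end */
    if (prev)
        prev->next = f;
    else
        q->head = f;          /* inserted at the front */
    return fast;
}
```

When fragments arrive in order, every insertion is a constant-time append; only out-of-order arrivals pay for the list walk.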

snmp: 64bit ipstats_mib for all arches · 4ce3c183

Committed by Eric Dumazet on Jun 30, 2010

/proc/net/snmp and /proc/net/netstat expose SNMP counters.

Width of these counters is either 32 or 64 bits, depending on the size
of "unsigned long" in kernel.

This means user program parsing these files must already be prepared to
deal with 64bit values, regardless of user program being 32 or 64 bit.

This patch introduces 64bit snmp values for IPSTAT mib, where some
counters can wrap pretty fast if they are 32bit wide.

# netstat -s|egrep "InOctets|OutOctets"
    InOctets: 244068329096
    OutOctets: 244069348848
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
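
To see why 32-bit octet counters are a problem, take the InOctets value shown above and truncate it to 32 bits. The helper names below are hypothetical; the arithmetic is just unsigned truncation in standard C.

```c
#include <assert.h>
#include <stdint.h>

/* What a 32-bit wide counter would show after counting this many octets. */
static uint32_t as_u32_counter(uint64_t octets)
{
    return (uint32_t)octets;          /* keeps only the low 32 bits */
}

/* How many times a 32-bit counter would have wrapped to get there. */
static uint32_t wraps_of_u32(uint64_t octets)
{
    return (uint32_t)(octets >> 32);
}
```

For the 244068329096 octets in the netstat output, a 32-bit counter would read 3550160520 after wrapping 56 times, so a reader polling it cannot recover the true total.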

net/neighbour.h: fix typo · 787a3445

Committed by Kulikov Vasiliy on Jun 30, 2010

'Shoul' must be 'should'.
Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm: fix XFRMA_MARK extraction in xfrm_mark_get · 4efd7e83

Committed by Andreas Steffen on Jun 30, 2010

Determine the size of the xfrm_mark struct, not of its pointer.
Signed-off-by: Andreas Steffen <andreas.steffen@strongswan.org>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
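
The class of bug fixed here is taking `sizeof` of a pointer where the size of the pointed-to struct was meant. A small sketch, using a hypothetical `struct mark_model` with two 32-bit fields as a stand-in for struct xfrm_mark (the helper names are also assumptions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct xfrm_mark: a value and a mask. */
struct mark_model {
    uint32_t v;
    uint32_t m;
};

/* Size of the pointer itself: 4 on 32-bit targets, 8 on 64-bit ones. */
static size_t size_via_pointer(const struct mark_model *mark)
{
    return sizeof(mark);
}

/* Size of the object the pointer refers to: what a copy needs. */
static size_t size_via_struct(const struct mark_model *mark)
{
    return sizeof(*mark);
}

/* Copy an attribute payload into the struct using the struct's size. */
static void mark_get(struct mark_model *dst, const void *payload)
{
    assert(dst != NULL && payload != NULL);
    memcpy(dst, payload, sizeof(*dst));
}
```

With an 8-byte struct the two sizes happen to coincide on 64-bit targets, which is one way such a mistake can go unnoticed until the code runs on a 32-bit system and copies only the first field.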

29 Jun, 2010 2 commits

caif-driver: Add CAIF-SPI Protocol driver. · 529d6dad

Committed by Sjur Braendeland on Jun 29, 2010

This patch introduces the CAIF SPI Protocol Driver for
CAIF Link Layer.

This driver implements a platform driver to accommodate a
platform specific SPI device. A general platform driver is not
possible as there is no SPI slave-side kernel API defined.
A sample CAIF SPI Platform device can be found in
.../Documentation/networking/caif/spi_porting.txt
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

act_mirred: don't clone skb when skb isn't shared · 210d6de7

Committed by Changli Gao on Jun 24, 2010

don't clone skb when skb isn't shared

When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need
to clone a new skb. As the skb will be freed after this function returns, we
can use it freely once we get a reference to it.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/sch_generic.h |   11 +++++++++--
 net/sched/act_mirred.c    |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
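
The optimisation can be illustrated with a reference-counted buffer: when the caller is about to drop the only reference anyway, taking another reference is cheaper than making a copy. Everything below is a hypothetical userspace model (`struct buf`, `buf_for_mirror()`, the `ACT_*` codes); in particular a real skb clone shares packet data, whereas this model copies the whole buffer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical action codes; only the "stolen" case matters here. */
enum { ACT_OK = 0, ACT_STOLEN = 1 };

/* Hypothetical reference-counted buffer standing in for an skb. */
struct buf {
    int users;               /* number of holders */
    size_t len;
    unsigned char data[64];
};

static struct buf *buf_new(size_t len)
{
    struct buf *b = calloc(1, sizeof(*b));

    if (b) {
        b->users = 1;
        b->len = len < sizeof(b->data) ? len : sizeof(b->data);
    }
    return b;
}

/* Drop one reference; free the buffer when the last one goes away. */
static void buf_put(struct buf *b)
{
    assert(b != NULL && b->users > 0);
    if (--b->users == 0)
        free(b);
}

/*
 * Return a buffer the mirror path may keep. If the action steals the
 * packet and nobody else holds it, just take a reference; otherwise
 * fall back to a private copy. Returns NULL if the copy cannot be made.
 */
static struct buf *buf_for_mirror(struct buf *b, int action)
{
    struct buf *n;

    assert(b != NULL);
    if (action == ACT_STOLEN && b->users == 1) {
        b->users++;
        return b;
    }
    n = malloc(sizeof(*n));
    if (n) {
        *n = *b;
        n->users = 1;
    }
    return n;
}
```

In both cases the caller still drops its own reference afterwards, so the original is freed exactly once.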

27 Jun, 2010 1 commit

syncookies: add support for ECN · 172d69e6

Committed by Florian Westphal on Jun 21, 2010

Allows use of ECN when syncookies are in effect by encoding ecn_ok
into the syn-ack tcp timestamp.

While at it, remove an unneeded #ifdef CONFIG_SYN_COOKIES.
With CONFIG_SYN_COOKIES=nm want_cookie is ifdef'd to 0 and gcc
removes the "if (0)".
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Jun, 2010 1 commit

snmp: add align parameter to snmp_mib_init() · 1823e4c8

Committed by Eric Dumazet on Jun 22, 2010

In preparation for 64bit snmp counters for some mibs,
add an 'align' parameter to snmp_mib_init(), instead
of assuming mibs only contain 'unsigned long' fields.

Callers can use __alignof__(type) to provide correct
alignment.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Jun, 2010 2 commits

netfilter: xt_connbytes: Force CT accounting to be enabled · a8756201

Committed by Tim Gardner on Jun 25, 2010

Check at rule install time that CT accounting is enabled. Force it
to be enabled if not while also emitting a warning since this is not
the default state.

This is in preparation for deprecating CONFIG_NF_CT_ACCT upon which
CONFIG_NETFILTER_XT_MATCH_CONNBYTES depended being set.

Added 2 CT accounting support functions:

nf_ct_acct_enabled() - Get CT accounting state.
nf_ct_set_acct() - Enable/disable CT accounting.
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>

cfg80211/mac80211: Update set_tx_power to use mBm instead of dBm units · fa61cf70

Committed by Juuso Oikarinen on Jun 23, 2010

In preparation for a TX power setting interface in the nl80211, change the
.set_tx_power function to use mBm units instead of dBm for greater accuracy and
smaller power levels.

Also, move the tx_power_setting enumeration to nl80211 in advance.

This change affects the .set_tx_power function prototype. As a result,
corresponding changes are needed in the modules using it. These are mac80211,
iwmc3200wifi and rndis_wlan.

Cc: Samuel Ortiz <samuel.ortiz@intel.com>
Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Acked-by: Samuel Ortiz <samuel.ortiz@intel.com>
Acked-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
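
The unit change is a factor of 100: one mBm is one hundredth of a dBm, so integer mBm values can express steps that integer dBm values cannot. A minimal sketch with hypothetical helper names:

```c
#include <assert.h>

/* 1 dBm == 100 mBm, so this conversion is exact. */
static int dbm_to_mbm(int dbm)
{
    return dbm * 100;
}

/* Integer division truncates toward zero, so sub-dBm precision is lost. */
static int mbm_to_dbm(int mbm)
{
    return mbm / 100;
}
```

For example, 15.5 dBm is representable as 1550 mBm, while converting that value back with `mbm_to_dbm()` yields 15.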

24 Jun, 2010 3 commits

net - IP_NODEFRAG option for IPv4 socket · 7b2ff18e

Committed by Jiri Olsa on Jun 15, 2010

This patch implements the IP_NODEFRAG option for IPv4 sockets.
The reason is that there is no other way to send out a packet with a
user-customized header for the reassembly part.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix a typo in netlink.h · 1dc8d8c0

Committed by Justin P. Mattock on Jun 21, 2010

Fix a typo in include/net/netlink.h:
it should be "finalize" instead of "finanlize".
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

snmp: fix SNMP_ADD_STATS() · 8f1c14b2

Committed by Eric Dumazet on Jun 23, 2010

commit aa2ea058 (tcp: fix outsegs stat for TSO segments) incorrectly
assumed SNMP_ADD_STATS() was used from BH context.

Fix this using mib[!in_softirq()] instead of mib[0].
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Jun, 2010 1 commit

mac80211: Add interface for driver to temporarily disable dynamic ps · f90754c1

Committed by Juuso Oikarinen on Jun 21, 2010

The mechanism introduced in this patch applies (at least) to hardware
designs using a single shared antenna for both WLAN and BT. In these designs,
the antenna must be toggled between WLAN and BT.

On such hardware, managing WLAN co-existence with Bluetooth requires WLAN
full power save whenever there is Bluetooth activity in order for WLAN to be
able to periodically relinquish the antenna to be used for BT. This is because
BT can only access the shared antenna when WLAN is idle or asleep.

Some hardware, for instance the wl1271, is able to indicate to the host
whenever there is BT traffic. In essence, the hardware will send an indication
to the host whenever there is, for example, SCO traffic or A2DP traffic, and
will send another indication when the traffic is over.

The hardware gets information of Bluetooth traffic via hardware co-existence
control lines - these lines are used to negotiate the shared antenna
ownership. The hardware will give the antenna to BT whenever WLAN is sleeping.

This patch adds the interface to mac80211 to facilitate temporarily disabling
of dynamic power save as per request of the WLAN driver. This interface will
immediately force WLAN to full powersave, hence allowing BT coexistence as
described above.

In these kinds of shared antenna designs, when WLAN powersave is fully disabled,
Bluetooth will not work simultaneously with WLAN at all. This patch does not
address that problem. This interface will not change PSM state, so if PSM is
disabled it will remain so. Solving this problem requires knowledge about BT
state, and is best done in user-space.
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

21 Jun, 2010 3 commits

caif: Use link layer MTU instead of fixed MTU · 2aa40aef

Committed by Sjur Braendeland on Jun 17, 2010

Previously CAIF supported a maximum transfer size of ~4050.
The transfer size is now calculated dynamically based on the
link layer's MTU size.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>

caif: Bugfix - RFM must support segmentation. · a7da1f55

Committed by Sjur Braendeland on Jun 17, 2010

CAIF Remote File Manager may send or receive more than 4050 bytes.
Because of this, the CAIF RFM service has to support segmentation.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>

caif: Bugfix not all services uses flow-ctrl. · b1c74247

Committed by Sjur Braendeland on Jun 17, 2010

Flow control is not used by all CAIF services.
The usage of flow control is now part of the general
initialization function for CAIF Services.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Jun, 2010 7 commits

netfilter: nf_nat: support user-specified SNAT rules in LOCAL_IN · c68cd6cc

Committed by Patrick McHardy on Jun 17, 2010

2.6.34 introduced 'conntrack zones' to deal with cases where packets
from multiple identical networks are handled by conntrack/NAT. Packets
are looped through veth devices, during which they are NATed to private
addresses, after which they can continue normally through the stack
and possibly have NAT rules applied a second time.

This works well, but is needlessly complicated for cases where only
a single SNAT/DNAT mapping needs to be applied to these packets. In that
case, all that needs to be done is to assign each network to a separate
zone and perform NAT as usual. However this doesn't work for packets
destined for the machine performing NAT itself since it's currently not
possible to configure SNAT mappings for the LOCAL_IN chain.

This patch adds a new INPUT chain to the NAT table and changes the
targets performing SNAT to be usable in that chain.

Example usage with two identical networks (192.168.0.0/24) on eth0/eth1:

iptables -t raw -A PREROUTING -i eth0 -j CT --zone 1
iptables -t raw -A PREROUTING -i eth0 -j MARK --set-mark 1
iptables -t raw -A PREROUTING -i eth1 -j CT --zone 2
iptables -t raw -A PREROUTING -i eth1 -j MARK --set-mark 2

iptables -t nat -A INPUT -m mark --mark 1 -j NETMAP --to 10.0.0.0/24
iptables -t nat -A POSTROUTING -m mark --mark 1 -j NETMAP --to 10.0.0.0/24
iptables -t nat -A INPUT -m mark --mark 2 -j NETMAP --to 10.0.1.0/24
iptables -t nat -A POSTROUTING -m mark --mark 2 -j NETMAP --to 10.0.1.0/24

iptables -t raw -A PREROUTING -d 10.0.0.0/24 -j CT --zone 1
iptables -t raw -A OUTPUT -d 10.0.0.0/24 -j CT --zone 1
iptables -t raw -A PREROUTING -d 10.0.1.0/24 -j CT --zone 2
iptables -t raw -A OUTPUT -d 10.0.1.0/24 -j CT --zone 2

iptables -t nat -A PREROUTING -d 10.0.0.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A OUTPUT -d 10.0.0.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A PREROUTING -d 10.0.1.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A OUTPUT -d 10.0.1.0/24 -j NETMAP --to 192.168.0.0/24
Signed-off-by: Patrick McHardy <kaber@trash.net>

af_unix: Allow credentials to work across user and pid namespaces. · 7361c36c

Committed by Eric W. Biederman on Jun 13, 2010

In unix_skb_parms store pointers to struct pid and struct cred instead
of raw uid, gid, and pid values, then translate the credentials on
reception into values that are meaningful in the receiving process's
namespaces.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

scm: Capture the full credentials of the scm sender. · 257b5358

Committed by Eric W. Biederman on Jun 13, 2010

Start capturing not only the userspace pid, uid and gid values of the
sending process but also the struct pid and struct cred of the sending
process.

This is in preparation for properly supporting SCM_CREDENTIALS for
sockets that have different uid and/or pid namespaces at the different
ends.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: Allow SO_PEERCRED to work across namespaces. · 109f6e39

Committed by Eric W. Biederman on Jun 13, 2010

Use struct pid and struct cred to store the peer credentials on struct
sock.  This gives enough information to convert the peer credential
information to a value relative to whatever namespace the socket is in
at the time.

This removes nasty surprises when using SO_PEERCRED on socket
connections where the processes on either side are in different pid and
user namespaces.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

scm: Reorder scm_cookie. · 812e876e

Committed by Eric W. Biederman on Jun 13, 2010

Reorder the fields in scm_cookie so they pack better on 64bit.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
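
Why field order affects size on 64-bit targets can be shown with two hypothetical structs that hold the same members. These are not the real scm_cookie fields, only an illustration of alignment padding between 32-bit and pointer-sized members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 32-bit fields interleaved with pointers. */
struct cookie_before {
    uint32_t a;
    void    *p;   /* needs pointer alignment, so padding follows 'a' */
    uint32_t b;
    void    *q;   /* and again after 'b' */
};

/* Same members, pointers first, 32-bit fields packed together. */
struct cookie_after {
    void    *p;
    void    *q;
    uint32_t a;
    uint32_t b;
};

/* Bytes saved by the reordering on the current target. */
static size_t padding_saved(void)
{
    return sizeof(struct cookie_before) - sizeof(struct cookie_after);
}
```

On a typical LP64 target the first layout occupies 32 bytes and the second 24, while on a 32-bit target both are the same size, which matches the commit's "pack better on 64bit" wording.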

syncookies: check decoded options against sysctl settings · 8c763681

Committed by Florian Westphal on Jun 16, 2010

Discard the ACK if we find options that do not match current sysctl
settings.

Previously it was possible to create a connection with sack, wscale,
etc. enabled even if the feature was disabled via sysctl.

Also remove an unneeded call to tcp_sack_reset() in
cookie_check_timestamp: Both call sites (cookie_v4_check,
cookie_v6_check) zero "struct tcp_options_received", hand it to
tcp_parse_options() (which does not change tcp_opt->num_sacks/dsack)
and then call cookie_check_timestamp().

Even if num_sacks/dsacks were changed, the structure is allocated on
the stack and after cookie_check_timestamp returns only a few selected
members are copied to the inet_request_sock.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

inetpeer: restore small inet_peer structures · 317fe0e6

Committed by Eric Dumazet on Jun 16, 2010

Addition of rcu_head to struct inet_peer added 16 bytes on 64bit arches.

That's a bit unfortunate, since the old size was exactly 64 bytes.

This can be solved using a union between this rcu_head and four fields
that are normally used only when a refcount is taken on inet_peer.
rcu_head is used only when refcnt=-1, right before structure freeing.

Add an inet_peer_refcheck() function to check this assertion for a while.

We can bring back the SLAB_HWCACHE_ALIGN qualifier in kmem cache creation.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Jun, 2010 1 commit

inetpeer: RCU conversion · aa1039e7

Committed by Eric Dumazet on Jun 15, 2010

inetpeer currently uses an AVL tree protected by an rwlock.

It's possible to make most lookups use RCU:

1) Add a struct rcu_head to struct inet_peer

2) add a lookup_rcu_bh() helper to perform lockless and opportunistic
lookup. This is a normal function, not a macro like lookup().

3) Add a limit to number of links followed by lookup_rcu_bh(). This is
needed in case we fall in a loop.

4) add an smp_wmb() in link_to_pool() right before node insert.

5) make unlink_from_pool() use atomic_cmpxchg() to make sure it can take
last reference to an inet_peer, since lockless readers could increase
refcount, even while we hold peers.lock.

6) Delay struct inet_peer freeing after rcu grace period so that
lookup_rcu_bh() cannot crash.

7) inet_getpeer() first attempts a lockless lookup.
   Note this lookup can fail even if the target is in the AVL tree, because
a concurrent writer can leave the tree in an inconsistent state.
   If this attempt fails, the lock is taken and a regular lookup is
performed again.

8) convert peers.lock from rwlock to a spinlock

9) Remove SLAB_HWCACHE_ALIGN when peer_cachep is created, because
rcu_head adds 16 bytes on 64bit arches, doubling effective size (64 ->
128 bytes).
In a future patch, it is probably possible to revert this part if the rcu
field is put in a union to share space with rid, ip_id_count, tcp_ts &
tcp_ts_stamp, since these fields are manipulated only with refcnt > 0.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
