提交 · f8ff182c716c6f11ca3061961f5722f26a14e101 · openeuler / Kernel

18 11月, 2010 2 次提交

rtnetlink: Link address family API · f8ff182c

由 Thomas Graf 提交于 11月 16, 2010

Each net_device contains address family specific data such as
per device settings and statistics. We already expose this data
via procfs/sysfs and partially netlink.

The netlink method requires the requester to send one RTM_GETLINK
request for each address family it wishes to receive data of
and then merge this data itself.

This patch implements a new API which combines all address family
specific link data in a new netlink attribute IFLA_AF_SPEC.
IFLA_AF_SPEC contains a sequence of nested attributes, one for each
address family which in turn defines the structure of its own
attribute. Example:

   [IFLA_AF_SPEC] = {
       [AF_INET] = {
           [IFLA_INET_CONF] = ...,
       },
       [AF_INET6] = {
           [IFLA_INET6_FLAGS] = ...,
           [IFLA_INET6_CONF] = ...,
       }
   }

The API also allows for address families to implement a function
which parses the IFLA_AF_SPEC attribute sent by userspace to
implement address family specific link options.
Signed-off-by: NThomas Graf <tgraf@infradead.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f8ff182c

netfilter: allow hooks to pass error code back up the stack · da683650

由 Eric Paris 提交于 11月 16, 2010

SELinux would like to pass certain fatal errors back up the stack.  This patch
implements the generic netfilter support for this functionality.
Based-on-patch-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NEric Paris <eparis@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da683650

17 11月, 2010 4 次提交

net: reorder struct sock fields · b178bb3d

由 Eric Dumazet 提交于 11月 16, 2010

Right now, fields in struct sock are not optimally ordered, because each
path (RX softirq, TX completion, RX user,  TX user) has to touch fields
that are contained in many different cache lines.

The really critical thing is to shrink number of cache lines that are
used at RX softirq time : CPU handling softirqs for a device can receive
many frames per second for many sockets. If load is too big, we can drop
frames at NIC level. RPS or multiqueue cards can help, but better reduce
latency if possible.

This patch starts with UDP protocol, then additional patches will try to
reduce latencies of other ones as well.

At RX softirq time, fields of interest for UDP protocol are :
(not counting ones in inet struct for the lookup)

Read/Written:
sk_refcnt   (atomic increment/decrement)
sk_rmem_alloc & sk_backlog.len (to check if there is room in queues)
sk_receive_queue
sk_backlog (if socket locked by user program)
sk_rxhash
sk_forward_alloc
sk_drops

Read only:
sk_rcvbuf (sk_rcvqueues_full())
sk_filter
sk_wq
sk_policy[0]
sk_flags

Additional notes :

- sk_backlog has one hole on 64bit arches. We can fill it to save 8
bytes.
- sk_backlog is used only if RX sofirq handler finds the socket while
locked by user.
- sk_rxhash is written only once per flow.
- sk_drops is written only if queues are full

Final layout :

[1] One section grouping all read/write fields, but placing rxhash and
sk_backlog at the end of this section.

[2] One section grouping all read fields in RX handler
   (sk_filter, sk_rcv_buf, sk_wq)

[3] Section used by other paths

I'll post a patch on its own to put sk_refcnt at the end of struct
sock_common so that it shares same cache line than section [1]

New offsets on 64bit arch :

sizeof(struct sock)=0x268
offsetof(struct sock, sk_refcnt)  =0x10
offsetof(struct sock, sk_lock)    =0x48
offsetof(struct sock, sk_receive_queue)=0x68
offsetof(struct sock, sk_backlog)=0x80
offsetof(struct sock, sk_rmem_alloc)=0x80
offsetof(struct sock, sk_forward_alloc)=0x98
offsetof(struct sock, sk_rxhash)=0x9c
offsetof(struct sock, sk_rcvbuf)=0xa4
offsetof(struct sock, sk_drops) =0xa0
offsetof(struct sock, sk_filter)=0xa8
offsetof(struct sock, sk_wq)=0xb0
offsetof(struct sock, sk_policy)=0xd0
offsetof(struct sock, sk_flags) =0xe0

Instead of :

sizeof(struct sock)=0x270
offsetof(struct sock, sk_refcnt)  =0x10
offsetof(struct sock, sk_lock)    =0x50
offsetof(struct sock, sk_receive_queue)=0xc0
offsetof(struct sock, sk_backlog)=0x70
offsetof(struct sock, sk_rmem_alloc)=0xac
offsetof(struct sock, sk_forward_alloc)=0x10c
offsetof(struct sock, sk_rxhash)=0x128
offsetof(struct sock, sk_rcvbuf)=0x4c
offsetof(struct sock, sk_drops) =0x16c
offsetof(struct sock, sk_filter)=0x198
offsetof(struct sock, sk_wq)=0x88
offsetof(struct sock, sk_policy)=0x98
offsetof(struct sock, sk_flags) =0x130
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b178bb3d

udp: use atomic_inc_not_zero_hint · c31504dc

由 Eric Dumazet 提交于 11月 15, 2010

UDP sockets refcount is usually 2, unless an incoming frame is going to
be queued in receive or backlog queue.

Using atomic_inc_not_zero_hint() permits to reduce latency, because
processor issues less memory transactions.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c31504dc

macvlan: lockless tx path · 8ffab51b

由 Eric Dumazet 提交于 11月 10, 2010

macvlan is a stacked device, like tunnels. We should use the lockless
mechanism we are using in tunnels and loopback.

This patch completely removes locking in TX path.

tx stat counters are added into existing percpu stat structure, renamed
from rx_stats to pcpu_stats.

Note : this reverts commit 2c114553 (macvlan: add multiqueue
capability)

Note : rx_errors converted to a 32bit counter, like tx_dropped, since
they dont need 64bit range.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8ffab51b

netlink: let nlmsg and nla functions take pointer-to-const args · 3654654f

由 Jan Engelhardt 提交于 11月 16, 2010

The changed functions do not modify the NL messages and/or attributes
at all. They should use const (similar to strchr), so that callers
which have a const nlmsg/nlattr around can make use of them without
casting.

While at it, constify a data array.
Signed-off-by: NJan Engelhardt <jengelh@medozas.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3654654f

16 11月, 2010 12 次提交

netdev: add rcu annotations to receive handler hook · 61391cde

由 stephen hemminger 提交于 11月 15, 2010

Suggested by Eric's bridge RCU changes.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

61391cde

bridge: add proper RCU annotation to should_route_hook · a386f990

由 Eric Dumazet 提交于 11月 15, 2010

Add br_should_route_hook_t typedef, this is the only way we can
get a clean RCU implementation for function pointer.

Move route_hook to location where it is used.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a386f990

include/net/caif/cfctrl.h: Remove unnecessary semicolons · d577f1cc

由 Joe Perches 提交于 11月 14, 2010

Signed-off-by: NJoe Perches <joe@perches.com>
Acked-by: NSjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d577f1cc

J
include/linux/if_macvlan.h: Remove unnecessary semicolons · c59504eb
由 Joe Perches 提交于 11月 14, 2010
```
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
c59504eb

net: Simplify RX queue allocation · fe822240

由 Tom Herbert 提交于 11月 09, 2010

This patch move RX queue allocation to alloc_netdev_mq and freeing of
the queues to free_netdev (symmetric to TX queue allocation). Each
kobject RX queue takes a reference to the queue's device so that the
device can't be freed before all the kobjects have been released-- this
obviates the need for reference counts specific to RX queues.
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fe822240

xfrm: use gre key as flow upper protocol info · cc9ff19d

由 Timo Teräs 提交于 11月 03, 2010

The GRE Key field is intended to be used for identifying an individual
traffic flow within a tunnel. It is useful to be able to have XFRM
policy selector matches to have different policies for different
GRE tunnels.
Signed-off-by: NTimo Teräs <timo.teras@iki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cc9ff19d

bitops: Provide generic sign_extend32 function · 7919a57b

由 Andreas Herrmann 提交于 8月 30, 2010

This patch moves code out from wireless drivers where two different
functions are defined in three code locations for the same purpose and
provides a common function to sign extend a 32-bit value.
Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

7919a57b

wl1271: ref_clock cosmetic changes · c8aea565

由 Gery Kahn 提交于 10月 05, 2010

Cosmetic cleanup for ref_clock code while configured by board.
Signed-off-by: NGery Kahn <geryk@ti.com>
Signed-off-by: NLuciano Coelho <luciano.coelho@nokia.com>

c8aea565

cfg80211: fix disabling channels based on hints · ca4ffe8f

由 Luis R. Rodriguez 提交于 10月 20, 2010

After a module loads you will have loaded the world roaming regulatory
domain or a custom regulatory domain. Further regulatory hints are
welcomed and should be respected unless the regulatory hint is coming
from a country IE as the IEEE spec allows for a country IE to be a subset
of what is allowed by the local regulatory agencies.

So disable all channels that do not fit a regulatory domain sent
from a unless the hint is from a country IE and the country IE had
no information about the band we are currently processing.

This fixes a few regulatory issues, for example for drivers that depend
on CRDA and had no 5 GHz freqencies allowed were not properly disabling
5 GHz at all, furthermore it also allows users to restrict devices
further as was intended.

If you recieve a country IE upon association we will also disable the
channels that are not allowed if the country IE had at least one
channel on the respective band we are procesing.

This was the original intention behind this design but it was
completely overlooked...

Cc: David Quan <david.quan@atheros.com>
Cc: Jouni Malinen <jouni.malinen@atheros.com>
cc: Easwar Krishnan <easwar.krishnan@atheros.com>
Cc: stable@kernel.org
Signed-off-by: NLuis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

ca4ffe8f

cfg80211: fix allowing country IEs for WIPHY_FLAG_STRICT_REGULATORY · 749b527b

由 Luis R. Rodriguez 提交于 10月 20, 2010

We should be enabling country IE hints for WIPHY_FLAG_STRICT_REGULATORY
even if we haven't yet recieved regulatory domain hint for the driver
if it needed one. Without this Country IEs are not passed on to drivers
that have set WIPHY_FLAG_STRICT_REGULATORY, today this is just all
Atheros chipset drivers: ath5k, ath9k, ar9170, carl9170.

This was part of the original design, however it was completely
overlooked...

Cc: Easwar Krishnan <easwar.krishnan@atheros.com>
Cc: stable@kernel.org
Signed-off-by: NLuis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

749b527b

rfkill: remove dead code · 2e48928d

由 Stephen Hemminger 提交于 10月 20, 2010

The following code is defined but never used.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

2e48928d

offloading: Force software GSO for multiple vlan tags. · 58e998c6

由 Jesse Gross 提交于 10月 29, 2010

We currently use vlan_features to check for TSO support if there is
a vlan tag.  However, it's quite likely that the NIC is not able to
do TSO when there is an arbitrary number of tags.  Therefore if there
is more than one tag (in-band or out-of-band), fall back to software
emulation.
Signed-off-by: NJesse Gross <jesse@nicira.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

58e998c6

15 11月, 2010 1 次提交

dccp ccid-2: Schedule Sync as out-of-band mechanism · d83447f0

由 Gerrit Renker 提交于 11月 14, 2010

The problem with Ack Vectors is that
  i) their length is variable and can in principle grow quite large,
 ii) it is hard to predict exactly how large they will be.

Due to the second point it seems not a good idea to reduce the MPS; in
particular when on average there is enough room for the Ack Vector and an
increase in length is momentarily due to some burst loss, after which the
Ack Vector returns to its normal/average length.

The solution taken by this patch is to subtract a minimum-expected Ack Vector
length from the MPS, and to defer any larger Ack Vectors onto a separate
Sync - but only if indeed there is no space left on the skb.

This patch provides the infrastructure to schedule Sync-packets for transporting
(urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since
it does not need to wait for new application data.
Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>

d83447f0

13 11月, 2010 2 次提交

igmp: RCU conversion of in_dev->mc_list · 1d7138de

由 Eric Dumazet 提交于 11月 12, 2010

in_dev->mc_list is protected by one rwlock (in_dev->mc_list_lock).

This can easily be converted to a RCU protection.

Writers hold RTNL, so mc_list_lock is removed, not replaced by a
spinlock.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Cypher Wu <cypher.w@gmail.com>
Cc: Américo Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1d7138de

vlan: Add function to retrieve EtherType from vlan packets. · 0a85df00

由 Hao Zheng 提交于 11月 11, 2010

Depending on how a packet is vlan tagged (i.e. hardware accelerated or
not), the encapsulated protocol is stored in different locations.  This
provides a consistent method of accessing that protocol, which is needed
by drivers, security checks, etc.
Signed-off-by: NHao Zheng <hzheng@nicira.com>
Signed-off-by: NJesse Gross <jesse@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0a85df00

12 11月, 2010 13 次提交

backlight: add low threshold to pwm backlight · fef7764f

由 Arun Murthy 提交于 11月 11, 2010

The intensity of the backlight can be varied from a range of
max_brightness to zero.  Though most, if not all the pwm based backlight
devices start flickering at lower brightness value.  And also for each
device there exists a brightness value below which the backlight appears
to be turned off though the value is not equal to zero.

If the range of brightness for a device is from zero to max_brightness.  A
graph is plotted for brightness Vs intensity for the pwm based backlight
device has to be a linear graph.

intensity
	  |   /
	  |  /
	  | /
	  |/
	  ---------
	 0	max_brightness

But pratically on measuring the above we note that the intensity of
backlight goes to zero(OFF) when the value in not zero almost nearing to
zero(some x%).  so the graph looks like

intensity
	  |    /
	  |   /
	  |  /
	  |  |
	  ------------
	 0   x	 max_brightness

In order to overcome this drawback knowing this x% i.e nothing but the low
threshold beyond which the backlight is off and will have no effect, the
brightness value is being offset by the low threshold value(retaining the
linearity of the graph).  Now the graph becomes

intensity
	  |     /
	  |    /
	  |   /
	  |  /
	  -------------
	   0	  max_brightness

With this for each and every digit increment in the brightness from zero
there is a change in the intensity of backlight.  Devices having this
behaviour can set the low threshold brightness(lth_brightness) and pass
the same as platform data else can have it as zero.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NArun Murthy <arun.murthy@stericsson.com>
Acked-by: NLinus Walleij <linus.walleij@stericsson.com>
Acked-by: NRichard Purdie <rpurdie@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fef7764f

leds: driver for National Semiconductors LP5523 chip · 0efba16c

由 Samu Onkalo 提交于 11月 11, 2010

LP5523 chip is nine channel led driver with programmable engines.  Driver
provides support for that chip for direct access via led class or via
programmable engines.
Signed-off-by: NSamu Onkalo <samu.p.onkalo@nokia.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0efba16c

leds: driver for National Semiconductor LP5521 chip · 500fe141

由 Samu Onkalo 提交于 11月 11, 2010

This patchset provides support for LP5521 and LP5523 LED driver chips from
National Semicondutor.  Both drivers supports programmable engines and
naturally LED class features.

Documentation is provided as a part of the patchset.  I created "leds"
subdirectory under Documentation.  Perhaps the rest of the leds*
documentation should be moved there.

Datasheets are freely available at National Semiconductor www pages.

This patch:

LP5521 chip is three channel led driver with programmable engines.  Driver
provides support for that chip for direct access via led class or via
programmable engines.
Signed-off-by: NSamu Onkalo <samu.p.onkalo@nokia.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

500fe141

led-class: always implement blinking · 5ada28bf

由 Johannes Berg 提交于 11月 11, 2010

Currently, blinking LEDs can be awkward because it is not guaranteed that
all LEDs implement blinking.  The trigger that wants it to blink then
needs to implement its own timer solution.

Rather than require that, add led_blink_set() API that triggers can use.
This function will attempt to use hw blinking, but if that fails
implements a timer for it.  To stop blinking again, brightness_set() also
needs to be wrapped into API that will stop the software blink.

As a result of this, the timer trigger becomes a very trivial one, and
hopefully we can finally see triggers using blinking as well because it's
always easy to use.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Acked-by: NRichard Purdie <rpurdie@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5ada28bf

radix-tree: fix RCU bug · 27d20fdd

由 Nick Piggin 提交于 11月 11, 2010

Salman Qazi describes the following radix-tree bug:

In the following case, we get can get a deadlock:

0.  The radix tree contains two items, one has the index 0.
1.  The reader (in this case find_get_pages) takes the rcu_read_lock.
2.  The reader acquires slot(s) for item(s) including the index 0 item.
3.  The non-zero index item is deleted, and as a consequence the other item is
    moved to the root of the tree. The place where it used to be is queued for
    deletion after the readers finish.
3b. The zero item is deleted, removing it from the direct slot, it remains in
    the rcu-delayed indirect node.
4.  The reader looks at the index 0 slot, and finds that the page has 0 ref
    count
5.  The reader looks at it again, hoping that the item will either be freed or
    the ref count will increase. This never happens, as the slot it is looking
    at will never be updated. Also, this slot can never be reclaimed because
    the reader is holding rcu_read_lock and is in an infinite loop.

The fix is to re-use the same "indirect" pointer case that requires a slot
lookup retry into a general "retry the lookup" bit.
Signed-off-by: NNick Piggin <npiggin@kernel.dk>
Reported-by: NSalman Qazi <sqazi@google.com>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

27d20fdd

Restrict unprivileged access to kernel syslog · eaf06b24

由 Dan Rosenberg 提交于 11月 11, 2010

The kernel syslog contains debugging information that is often useful
during exploitation of other vulnerabilities, such as kernel heap
addresses.  Rather than futilely attempt to sanitize hundreds (or
thousands) of printk statements and simultaneously cripple useful
debugging functionality, it is far simpler to create an option that
prevents unprivileged users from reading the syslog.

This patch, loosely based on grsecurity's GRKERNSEC_DMESG, creates the
dmesg_restrict sysctl.  When set to "0", the default, no restrictions are
enforced.  When set to "1", only users with CAP_SYS_ADMIN can read the
kernel syslog via dmesg(8) or other mechanisms.

[akpm@linux-foundation.org: explain the config option in kernel.txt]
Signed-off-by: NDan Rosenberg <drosenberg@vsecurity.com>
Acked-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NEugene Teo <eugeneteo@kernel.org>
Acked-by: NKees Cook <kees.cook@canonical.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

eaf06b24

include/linux/highmem.h needs hardirq.h · 43b3a0c7

由 Catalin Marinas 提交于 11月 11, 2010

Commit 3e4d3af5 ("mm: stack based kmap_atomic()") introduced the
kmap_atomic_idx_push() function which warns on in_irq() with
CONFIG_DEBUG_HIGHMEM enabled.  This patch includes linux/hardirq.h for
the in_irq definition.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

43b3a0c7

atomic: add atomic_inc_not_zero_hint() · 3f9d35b9

由 Eric Dumazet 提交于 11月 11, 2010

Followup of perf tools session in Netfilter WorkShop 2010

In the network stack we make high usage of atomic_inc_not_zero() in
contexts we know the probable value of atomic before increment (2 for udp
sockets for example)

Using a special version of atomic_inc_not_zero() giving this hint can help
processor to use less bus transactions.

On x86 (MESI protocol) for example, this avoids entering Shared state,
because "lock cmpxchg" issues an RFO (Read For Ownership)

akpm: Adds a new include/linux/atomic.h.  This means that new code should
henceforth include linux/atomic.h and not asm/atomic.h.  The presence of
include/linux/atomic.h will in fact cause checkpatch.pl to warn about use
of asm/atomic.h.  The new include/linux/atomic.h becomes the place where
arch-neutral atomic_t code should be placed.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3f9d35b9

include/linux/resource.h needs types.h · 8705a1ba

由 Jean Delvare 提交于 11月 11, 2010

Fix the following warning:
usr/include/linux/resource.h:49: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: NJean Delvare <khali@linux-fr.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8705a1ba

netfilter: NF_HOOK_COND has wrong conditional · ac5aa2e3

由 Eric Paris 提交于 11月 12, 2010

The NF_HOOK_COND returns 0 when it shouldn't due to what I believe to be an
error in the code as the order of operations is not what was intended.  C will
evalutate == before =.  Which means ret is getting set to the bool result,
rather than the return value of the function call.  The code says

if (ret = function() == 1)
when it meant to say:
if ((ret = function()) == 1)

Normally the compiler would warn, but it doesn't notice it because its
a actually complex conditional and so the wrong code is wrapped in an explict
set of () [exactly what the compiler wants you to do if this was intentional].
Fixing this means that errors when netfilter denies a packet get propagated
back up the stack rather than lost.

Problem introduced by commit 2249065f (netfilter: get rid of the grossness
in netfilter.h).
Signed-off-by: NEric Paris <eparis@redhat.com>
Cc: stable@kernel.org
Signed-off-by: NPatrick McHardy <kaber@trash.net>

ac5aa2e3

ipv4: Make rt->fl.iif tests lest obscure. · c7537967

由 David S. Miller 提交于 11月 11, 2010

When we test rt->fl.iif against zero, we're seeing if it's
an output or an input route.

Make that explicit with some helper functions.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c7537967

net: get rid of rtable->idev · 72cdd1d9

由 Eric Dumazet 提交于 11月 11, 2010

It seems idev field in struct rtable has no special purpose, but adding
extra atomic ops.

We hold refcounts on the device itself (using percpu data, so pretty
cheap in current kernel).

infiniband case is solved using dst.dev instead of idev->dev

Removal of this field means routing without route cache is now using
shared data, percpu data, and only potential contention is a pair of
atomic ops on struct neighbour per forwarded packet.

About 5% speedup on routing test.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

72cdd1d9

neigh: reorder struct neighbour · 46b13fc5

由 Eric Dumazet 提交于 11月 11, 2010

It is important to move nud_state outside of the often modified cache
line (because of refcnt), to reduce false sharing in neigh_event_send()

This is a followup of commit 0ed8ddf4 (neigh: Protect neigh->ha[]
with a seqlock)

This gives a 7% speedup on routing test with IP route cache disabled.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

46b13fc5

11 11月, 2010 3 次提交

block: remove unused copy_io_context() · cedb4a7d

由 Jens Axboe 提交于 11月 11, 2010

Reported-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

cedb4a7d

perf_events: Fix time tracking in samples · eed01528

由 Stephane Eranian 提交于 10月 26, 2010

This patch corrects time tracking in samples. Without this patch
both time_enabled and time_running are bogus when user asks for
PERF_SAMPLE_READ.

One uses PERF_SAMPLE_READ to sample the values of other counters
in each sample. Because of multiplexing, it is necessary to know
both time_enabled, time_running to be able to scale counts correctly.

In this second version of the patch, we maintain a shadow
copy of ctx->time which allows us to compute ctx->time without
calling update_context_time() from NMI context. We avoid the
issue that update_context_time() must always be called with
ctx->lock held.

We do not keep shadow copies of the other event timings
because if the lead event is overflowing then it is active
and thus it's been scheduled in via event_sched_in() in
which case neither tstamp_stopped, tstamp_running can be modified.

This timing logic only applies to samples when PERF_SAMPLE_READ
is used.

Note that this patch does not address timing issues related
to sampling inheritance between tasks. This will be addressed
in a future patch.

With this patch, the libpfm4 example task_smpl now reports
correct counts (shown on 2.4GHz Core 2):

$ task_smpl -p 2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears  noploop 5
noploop for 5 seconds
IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3
	2,400,000,254 unhalted_core_cycles:u (33)
	2,399,273,744 instructions_retired:u (34)
	53,340 baclears (35)
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4cc6e14b.1e07e30a.256e.5190@mx.google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

eed01528

net: avoid limits overflow · 8d987e5c

由 Eric Dumazet 提交于 11月 09, 2010

Robin Holt tried to boot a 16TB machine and found some limits were
reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

We can switch infrastructure to use long "instead" of "int", now
atomic_long_t primitives are available for free.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Reported-by: NRobin Holt <holt@sgi.com>
Reviewed-by: NRobin Holt <holt@sgi.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8d987e5c

10 11月, 2010 3 次提交

block: remove REQ_HARDBARRIER · 02e031cb

由 Christoph Hellwig 提交于 11月 10, 2010

REQ_HARDBARRIER is dead now, so remove the leftovers.  What's left
at this point is:

 - various checks inside the block layer.
 - sanity checks in bio based drivers.
 - now unused bio_empty_barrier helper.
 - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while,
   but Xen really needs to sort out it's barrier situaton.
 - setting of ordered tags in uas - dead code copied from old scsi
   drivers.
 - scsi different retry for barriers - it's dead and should have been
   removed when flushes were converted to FS requests.
 - blktrace handling of barriers - removed.  Someone who knows blktrace
   better should add support for REQ_FLUSH and REQ_FUA, though.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

02e031cb

drm/ttm: Be consistent on ttm_bo_init() failures · 7dfbbdcf

由 Thomas Hellstrom 提交于 11月 09, 2010

Call destroy() on _all_ ttm_bo_init() failures, and make sure that
behavior is documented in the function description.
Signed-off-by: NThomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>

7dfbbdcf

tty: Fix formatting in tty.h · 65f8e441

由 Alan Cox 提交于 11月 09, 2010

Someone added a new ldisc number and messed up the tabbing. Fix it before
anyone else copies it.
Signed-off-by: NAlan Cox <alan@linux.intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

65f8e441

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功