提交 · a5560a6c17a4af4e8f45d5c1215ba7d5bb0ff109 · OpenHarmony / kernel_linux

24 5月, 2013 1 次提交

xen: netif.h: document feature-split-event-channels · a5560a6c

由 Wei Liu 提交于 5月 22, 2013

This patch synchronises documentation for feature-split-event-channels from
Xen canonical header file.
Signed-off-by: NWei Liu <wei.liu2@citrix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a5560a6c

21 5月, 2013 5 次提交

phy: add phy_mac_interrupt() to use with PHY_IGNORE_INTERRUPT · 5ea94e76

由 Florian Fainelli 提交于 5月 19, 2013

There is currently no way for an Ethernet MAC driver servicing PHY link
interrupts to notify this to the PHY state machine without defining its
own state machine. Since most drivers are not so special, introduce a
helper: phy_mac_interrupt() which can be called from a link up/down
interrupt routine to update the PHY state machine. To avoid code
duplication some refactoring has been done to expose the workqueue and
its corresponding callback internally.
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5ea94e76

phy: fix the use of PHY_IGNORE_INTERRUPT · 2c7b4921

由 Florian Fainelli 提交于 5月 19, 2013

When a PHY device is registered with the special IRQ value
PHY_IGNORE_INTERRUPT (-2) it will not properly be handled by the PHY
library:

- it continues to poll its register, while we do not want this
  because such PHY link events or register changes are serviced by an
  Ethernet MAC
- it will still try to configure PHY interrupts at the PHY level, such
  interrupts do not exist at the PHY but at the MAC level
- the state machine only handles PHY_POLL, but should also handle
  PHY_IGNORE_INTERRUPT similarly

This patch updates the PHY state machine and initialization paths to
account for the specific PHY_IGNORE_INTERRUPT. Based on an earlier patch
by Thomas Petazzoni, and reworked to add the missing bits. Add a helper
phy_interrupt_is_valid() which specifically tests for a PHY interrupt
not to be PHY_POLL or PHY_IGNORE_INTERRUPT and use it throughout the
code.
Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c7b4921

tcp: md5: remove spinlock usage in fast path · 71cea17e

由 Eric Dumazet 提交于 5月 20, 2013

TCP md5 code uses per cpu variables but protects access to them with
a shared spinlock, which is a contention point.

[ tcp_md5sig_pool_lock is locked twice per incoming packet ]

Makes things much simpler, by allocating crypto structures once, first
time a socket needs md5 keys, and not deallocating them as they are
really small.

Next step would be to allow crypto allocations being done in a NUMA
aware way.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71cea17e

net: ipv6: remove 'next' member from inet6_dev · 168fc21a

由 Daniel Borkmann 提交于 5月 20, 2013

The next pointer within the inet6_dev structure seems not to be used
anywhere. So just remove it. Tested with allmodconfig on x86_64.
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

168fc21a

rps: selective flow shedding during softnet overflow · 99bbc707

由 Willem de Bruijn 提交于 5月 20, 2013

A cpu executing the network receive path sheds packets when its input
queue grows to netdev_max_backlog. A single high rate flow (such as a
spoofed source DoS) can exceed a single cpu processing rate and will
degrade throughput of other flows hashed onto the same cpu.

This patch adds a more fine grained hashtable. If the netdev backlog
is above a threshold, IRQ cpus track the ratio of total traffic of
each flow (using 4096 buckets, configurable). The ratio is measured
by counting the number of packets per flow over the last 256 packets
from the source cpu. Any flow that occupies a large fraction of this
(set at 50%) will see packet drop while above the threshold.

Tested:
Setup is a muli-threaded UDP echo server with network rx IRQ on cpu0,
kernel receive (RPS) on cpu0 and application threads on cpus 2--7
each handling 20k req/s. Throughput halves when hit with a 400 kpps
antagonist storm. With this patch applied, antagonist overload is
dropped and the server processes its complete load.

The patch is effective when kernel receive processing is the
bottleneck. The above RPS scenario is a extreme, but the same is
reached with RFS and sufficient kernel processing (iptables, packet
socket tap, ..).
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

99bbc707

20 5月, 2013 3 次提交

filter: do not output bpf image address for security reason · 16495445

由 Eric Dumazet 提交于 5月 17, 2013

Do not leak starting address of BPF JIT code for non root users,
as it might help intruders to perform an attack.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

16495445

tcp: remove bad timeout logic in fast recovery · 3e59cb0d

由 Yuchung Cheng 提交于 5月 17, 2013

tcp_timeout_skb() was intended to trigger fast recovery on timeout,
unfortunately in reality it often causes spurious retransmission
storms during fast recovery. The particular sign is a fast retransmit
over the highest sacked sequence (SND.FACK).

Currently the RTO timer re-arming (as in RFC6298) offers a nice cushion
to avoid spurious timeout: when SND.UNA advances the sender re-arms
RTO and extends the timeout by icsk_rto. The sender does not offset
the time elapsed since the packet at SND.UNA was sent.

But if the next (DUP)ACK arrives later than ~RTTVAR and triggers
tcp_fastretrans_alert(), then tcp_timeout_skb() will mark any packet
sent before the icsk_rto interval lost, including one that's above the
highest sacked sequence. Most likely a large part of scorebard will be
marked.

If most packets are not lost then the subsequent DUPACKs with new SACK
blocks will cause the sender to continue to retransmit packets beyond
SND.FACK spuriously. Even if only one packet is lost the sender may
falsely retransmit almost the entire window.

The situation becomes common in the world of bufferbloat: the RTT
continues to grow as the queue builds up but RTTVAR remains small and
close to the minimum 200ms. If a data packet is lost and the DUPACK
triggered by the next data packet is slightly delayed, then a spurious
retransmission storm forms.

As the original comment on tcp_timeout_skb() suggests: the usefulness
of this feature is questionable. It also wastes cycles walking the
sack scoreboard and is actually harmful because of false recovery.

It's time to remove this.
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NNandita Dukkipati <nanditad@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3e59cb0d

ipv6: add support of peer address · caeaba79

由 Nicolas Dichtel 提交于 5月 16, 2013

This patch adds the support of peer address for IPv6. For example, it is
possible to specify the remote end of a 6inY tunnel.
This was already possible in IPv4:
 ip addr add ip1 peer ip2 dev dev1

The peer address is specified with IFA_ADDRESS and the local address with
IFA_LOCAL (like explained in include/uapi/linux/if_addr.h).
Note that the API is not changed, because before this patch, it was not
possible to specify two different addresses in IFA_LOCAL and IFA_REMOTE.
There is a small change for the dump: if the peer is different from ::,
IFA_ADDRESS will contain the peer address instead of the local address.
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

caeaba79

12 5月, 2013 2 次提交

ipv6: do not clear pinet6 field · f77d6021

由 Eric Dumazet 提交于 5月 09, 2013

We have seen multiple NULL dereferences in __inet6_lookup_established()

After analysis, I found that inet6_sk() could be NULL while the
check for sk_family == AF_INET6 was true.

Bug was added in linux-2.6.29 when RCU lookups were introduced in UDP
and TCP stacks.

Once an IPv6 socket, using SLAB_DESTROY_BY_RCU is inserted in a hash
table, we no longer can clear pinet6 field.

This patch extends logic used in commit fcbdf09d
("net: fix nulls list corruptions in sk_prot_alloc")

TCP/UDP/UDPLite IPv6 protocols provide their own .clear_sk() method
to make sure we do not clear pinet6 field.

At socket clone phase, we do not really care, as cloning the parent (non
NULL) pinet6 is not adding a fatal race.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f77d6021

net/mlx4: Strengthen VLAN tags/priorities enforcement in VST mode · 7677fc96

由 Rony Efraim 提交于 5月 08, 2013

Make sure that the following steps are taken:

- drop packets sent by the VF with vlan tag
- block packets with vlan tag which are steered to the VF
- drop/block tagged packets when the policy is priority-tagged
- make sure VLAN stripping for received packets is set
- make sure force UP bit for the VF QP is set

Use enum values for all the above instead of numerical bit offsets.
Signed-off-by: NRony Efraim <ronye@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7677fc96

10 5月, 2013 10 次提交

[SCSI] iscsi class, qla4xxx: fix sess/conn refcounting when find fns are used · 8526cb11

由 Mike Christie 提交于 5月 06, 2013

This fixes a bug where the iscsi class/driver did not do a put_device
when a sess/conn device was found. This also simplifies the interface
by not having to pass in some arguments that were duplicated and did
not need to be exported.
Reported-by: NZhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: NMike Christie <michaelc@cs.wisc.edu>
Acked-by: NVikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>

8526cb11

[SCSI] sas: unify the pointlessly separated enums sas_dev_type and sas_device_type · aa9f8328

由 James Bottomley 提交于 5月 07, 2013

These enums have been separate since the dawn of SAS, mainly because the
latter is a procotol only enum and the former includes additional state
for libsas. The dichotomy causes endless confusion about which one you
should use where and leads to pointless warnings like this:

drivers/scsi/mvsas/mv_sas.c: In function 'mvs_update_phyinfo':
drivers/scsi/mvsas/mv_sas.c:1162:34: warning: comparison between 'enum sas_device_type' and 'enum sas_dev_type' [-Wenum-compare]

Fix by eliminating one of them. The one kept is effectively the sas.h
one, but call it sas_device_type and make sure the enums are all
properly namespaced with the SAS_ prefix.
Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>

aa9f8328

dm: document iterate_devices · 058ce5ca

由 Alasdair G Kergon 提交于 5月 10, 2013

Document iterate_devices in device-mapper.h.
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

058ce5ca

drm: Use names of ioctls in debug traces · b9434d0f

由 Chris Cummins 提交于 5月 09, 2013

The intention here is to make the output of dmesg with full verbosity a
bit easier for a human to parse. This commit transforms:

[drm:drm_ioctl], pid=699, cmd=0x6458, nr=0x58, dev 0xe200, auth=1
[drm:drm_ioctl], pid=699, cmd=0xc010645b, nr=0x5b, dev 0xe200, auth=1
[drm:drm_ioctl], pid=699, cmd=0xc0106461, nr=0x61, dev 0xe200, auth=1
[drm:drm_ioctl], pid=699, cmd=0xc01c64ae, nr=0xae, dev 0xe200, auth=1
[drm:drm_mode_addfb], [FB:32]
[drm:drm_ioctl], pid=699, cmd=0xc0106464, nr=0x64, dev 0xe200, auth=1
[drm:drm_vm_open_locked], 0x7fd9302fe000,0x00a00000
[drm:drm_ioctl], pid=699, cmd=0x400c645f, nr=0x5f, dev 0xe200, auth=1
[drm:drm_ioctl], pid=699, cmd=0xc00464af, nr=0xaf, dev 0xe200, auth=1
[drm:intel_crtc_set_config], [CRTC:3] [NOFB]

into:

[drm:drm_ioctl], pid=699, dev=0xe200, auth=1, I915_GEM_THROTTLE
[drm:drm_ioctl], pid=699, dev=0xe200, auth=1, I915_GEM_CREATE
[drm:drm_ioctl], pid=699, dev=0xe200, auth=1, I915_GEM_SET_TILING
[drm:drm_ioctl], pid=699, dev=0xe200, auth=1, IOCTL_MODE_ADDFB
[drm:drm_mode_addfb], [FB:32]
[drm:drm_ioctl], pid=699, dev=0xe200, auth=1, I915_GEM_MMAP_GTT
[drm:drm_vm_open_locked], 0x7fd9302fe000,0x00a00000
[drm:drm_ioctl], pid=699, dev=0xe200, auth=1, I915_GEM_SET_DOMAIN
[drm:drm_ioctl], pid=699, dev=0xe200, auth=1, DRM_IOCTL_MODE_RMFB
[drm:intel_crtc_set_config], [CRTC:3] [NOFB]

v2: drm_ioctls is now a constant (Ville Syrjälä)
Signed-off-by: NChris Cummins <christopher.e.cummins@intel.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>

b9434d0f

V
drm: Remove pointless '-' characters from drm_fb_helper documentation · 7b97936f
由 Ville Syrjälä 提交于 5月 08, 2013
```
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>
```
7b97936f
V
drm: Add kernel-doc for drm_fb_helper_funcs->initial_config · 54afc121
由 Ville Syrjälä 提交于 5月 08, 2013
```
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>
```
54afc121

tracing: Modify soft-mode only if there's no other referrer · 1cf4c073

由 Masami Hiramatsu 提交于 5月 09, 2013

Modify soft-mode flag only if no other soft-mode referrer
(currently only the ftrace triggers) by using a reference
counter in each ftrace_event_file.

Without this fix, adding and removing several different
enable/disable_event triggers on the same event clear
soft-mode bit from the ftrace_event_file. This also
happens with a typo of glob on setting triggers.

e.g.

 # echo vfs_symlink:enable_event:net:netif_rx > set_ftrace_filter
 # cat events/net/netif_rx/enable
 0*
 # echo typo_func:enable_event:net:netif_rx > set_ftrace_filter
 # cat events/net/netif_rx/enable
 0
 # cat set_ftrace_filter
 #### all functions enabled ####
 vfs_symlink:enable_event:net:netif_rx:unlimited

As above, we still have a trigger, but soft-mode is gone.

Link: http://lkml.kernel.org/r/20130509054429.30398.7464.stgit@mhiramat-M0-7522

Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: David Sharp <dhsharp@google.com>
Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

1cf4c073

ftrace, kprobes: Fix a deadlock on ftrace_regex_lock · f04f24fb

由 Masami Hiramatsu 提交于 5月 09, 2013

Fix a deadlock on ftrace_regex_lock which happens when setting
an enable_event trigger on dynamic kprobe event as below.

----
sh-2.05b# echo p vfs_symlink > kprobe_events
sh-2.05b# echo vfs_symlink:enable_event:kprobes:p_vfs_symlink_0 > set_ftrace_filter

=============================================
[ INFO: possible recursive locking detected ]
3.9.0+ #35 Not tainted
---------------------------------------------
sh/72 is trying to acquire lock:
 (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810ba6c1>] ftrace_set_hash+0x81/0x1f0

but task is already holding lock:
 (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810b7cbd>] ftrace_regex_write.isra.29.part.30+0x3d/0x220

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(ftrace_regex_lock);
  lock(ftrace_regex_lock);

 *** DEADLOCK ***
----

To fix that, this introduces a finer regex_lock for each ftrace_ops.
ftrace_regex_lock is too big of a lock which protects all
filter/notrace_hash operations, but it doesn't need to be a global
lock after supporting multiple ftrace_ops because each ftrace_ops
has its own filter/notrace_hash.

Link: http://lkml.kernel.org/r/20130509054417.30398.84254.stgit@mhiramat-M0-7522

Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[ Added initialization flag and automate mutex initialization for
  non ftrace.c ftrace_probes. ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

f04f24fb

ARM: imx: Select GENERIC_ALLOCATOR · 60371952

由 Fabio Estevam 提交于 5月 08, 2013

Since commit 657eee7d (media: coda: use genalloc API) the following build
error happens with imx_v4_v5_defconfig:

drivers/built-in.o: In function 'coda_remove':
clk-composite.c:(.text+0x112180): undefined reference to 'gen_pool_free'
drivers/built-in.o: In function 'coda_probe':
clk-composite.c:(.text+0x112310): undefined reference to 'of_get_named_gen_pool'
clk-composite.c:(.text+0x1123f4): undefined reference to 'gen_pool_alloc'
clk-composite.c:(.text+0x11240c): undefined reference to 'gen_pool_virt_to_phys'
clk-composite.c:(.text+0x112458): undefined reference to 'dev_get_gen_pool'

Select GENERIC_ALLOCATOR and get rid of the custom IRAM_ALLOC.
Signed-off-by: NFabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: NShawn Guo <shawn.guo@linaro.org>
Signed-off-by: NOlof Johansson <olof@lixom.net>

60371952

A
unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE · 91c2e0bc
由 Al Viro 提交于 3月 05, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
91c2e0bc

09 5月, 2013 2 次提交

if_cablemodem.h: Add parenthesis around ioctl macros · 4f924b2a

由 Josh Boyer 提交于 5月 08, 2013

Protect the SIOCGCM* ioctl macros with parenthesis.
Reported-by: NPaul Wouters <pwouters@redhat.com>
Signed-off-by: NJosh Boyer <jwboyer@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f924b2a

usbnet: allow status interrupt URB to always be active · 6eecdc5f

由 Dan Williams 提交于 5月 06, 2013

Some drivers (sierra_net) need the status interrupt URB
active even when the device is closed, because they receive
custom indications from firmware.  Add functions to refcount
the status interrupt URB submit/kill operation so that
sub-drivers and the generic driver don't fight over whether
the status interrupt URB is active or not.

A sub-driver can call usbnet_status_start() at any time, but
the URB is only submitted the first time the function is
called.  Likewise, when the sub-driver is done with the URB,
it calls usbnet_status_stop() but the URB is only killed when
all users have stopped it.  The URB is still killed and
re-submitted for suspend/resume, as before, with the same
refcount it had at suspend.
Signed-off-by: NDan Williams <dcbw@redhat.com>
Acked-by: NOliver Neukum <oliver@neukum.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6eecdc5f

08 5月, 2013 17 次提交

NVMe: Simplify Firmware Activate code slightly · ab3ea5bf

由 Matthew Wilcox 提交于 5月 06, 2013

Add definitions for the three Firmware Activate actions, and change the
SCSI translation code to construct the command into a temporary variable
instead of translating the endianness back-and-forth.
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: NVishal Verma <vishal.l.verma@linux.intel.com>

ab3ea5bf

ALSA: Add comment for control TLV API · d24f5a9a

由 David Henningsson 提交于 5月 08, 2013

Userspace is not meant to have to handle all strange dB ranges,
so add a specification comment.
Signed-off-by: NDavid Henningsson <david.henningsson@canonical.com>
Signed-off-by: NTakashi Iwai <tiwai@suse.de>

d24f5a9a

aio: don't include aio.h in sched.h · a27bb332

由 Kent Overstreet 提交于 5月 07, 2013

Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a27bb332

aio: kill ki_retry · 41ef4eb8

由 Kent Overstreet 提交于 5月 07, 2013

Thanks to Zach Brown's work to rip out the retry infrastructure, we don't
need this anymore - ki_retry was only called right after the kiocb was
initialized.

This also refactors and trims some duplicated code, as well as cleaning up
the refcounting/error handling a bit.

[akpm@linux-foundation.org: use fmode_t in aio_run_iocb()]
[akpm@linux-foundation.org: fix file_start_write/file_end_write tests]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

41ef4eb8

aio: kill ki_key · 8a660890

由 Kent Overstreet 提交于 5月 07, 2013

ki_key wasn't actually used for anything previously - it was always 0.
Drop it to trim struct kiocb a bit.
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8a660890

audit: Make testing for a valid loginuid explicit. · 780a7654

由 Eric W. Biederman 提交于 4月 09, 2013

audit rule additions containing "-F auid!=4294967295" were failing
with EINVAL because of a regression caused by e1760bd5.

Apparently some userland audit rule sets want to know if loginuid uid
has been set and are using a test for auid != 4294967295 to determine
that.

In practice that is a horrible way to ask if a value has been set,
because it relies on subtle implementation details and will break
every time the uid implementation in the kernel changes.

So add a clean way to test if the audit loginuid has been set, and
silently convert the old idiom to the cleaner and more comprehensible
new idiom.

Cc: <stable@vger.kernel.org> # 3.7
Reported-By: NRichard Guy Briggs <rgb@redhat.com>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: NRichard Guy Briggs <rgb@redhat.com>
Signed-off-by: NEric Paris <eparis@redhat.com>

780a7654

aio: kill batch allocation · a1c8eae7

由 Kent Overstreet 提交于 5月 07, 2013

Previously, allocating a kiocb required touching quite a few global
(well, per kioctx) cachelines...  so batching up allocation to amortize
those was worthwhile.  But we've gotten rid of some of those, and in
another couple of patches kiocb allocation won't require writing to any
shared cachelines, so that means we can just rip this code out.
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a1c8eae7

aio: use cancellation list lazily · 0460fef2

由 Kent Overstreet 提交于 5月 07, 2013

Cancelling kiocbs requires adding them to a per kioctx linked list,
which is one of the few things we need to take the kioctx lock for in
the fast path.  But most kiocbs can't be cancelled - so if we just do
this lazily, we can avoid quite a bit of locking overhead.

While we're at it, instead of using a flag bit switch to using ki_cancel
itself to indicate that a kiocb has been cancelled/completed.  This lets
us get rid of ki_flags entirely.

[akpm@linux-foundation.org: remove buggy BUG()]
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0460fef2

wait: add wait_event_hrtimeout() · 774a08b3

由 Kent Overstreet 提交于 5月 07, 2013

Analagous to wait_event_timeout() and friends, this adds
wait_event_hrtimeout() and wait_event_interruptible_hrtimeout().

Note that unlike the versions that use regular timers, these don't
return the amount of time remaining when they return - instead, they
return 0 or -ETIME if they timed out.  because I was uncomfortable with
the semantics of doing it the other way (that I could get it right,
anyways).

If the timer expires, there's no real guarantee that expire_time -
current_time would be <= 0 - due to timer slack certainly, and I'm not
sure I want to know the implications of the different clock bases in
hrtimers.

If the timer does expire and the code calculates that the time remaining
is nonnegative, that could be even worse if the calling code then reuses
that timeout.  Probably safer to just return 0 then, but I could imagine
weird bugs or at least unintended behaviour arising from that too.

I came to the conclusion that if other users end up actually needing the
amount of time remaining, the sanest thing to do would be to create a
version that uses absolute timeouts instead of relative.

[akpm@linux-foundation.org: fix description of `timeout' arg]
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

774a08b3

aio: make aio_put_req() lockless · 11599eba

由 Kent Overstreet 提交于 5月 07, 2013

Freeing a kiocb needed to touch the kioctx for three things:

 * Pull it off the reqs_active list
 * Decrementing reqs_active
 * Issuing a wakeup, if the kioctx was in the process of being freed.

This patch moves these to aio_complete(), for a couple reasons:

 * aio_complete() already has to issue the wakeup, so if we drop the
   kioctx refcount before aio_complete does its wakeup we don't have to
   do it twice.
 * aio_complete currently has to take the kioctx lock, so it makes sense
   for it to pull the kiocb off the reqs_active list too.
 * A later patch is going to change reqs_active to include unreaped
   completions - this will mean allocating a kiocb doesn't have to look
   at the ringbuffer. So taking the decrement of reqs_active out of
   kiocb_free() is useful prep work for that patch.

This doesn't really affect cancellation, since existing (usb) code that
implements a cancel function still calls aio_complete() - we just have
to make sure that aio_complete does the necessary teardown for cancelled
kiocbs.

It does affect code paths where we free kiocbs that were never
submitted; they need to decrement reqs_active and pull the kiocb off the
reqs_active list.  This occurs in two places: kiocb_batch_free(), which
is going away in a later patch, and the error path in io_submit_one.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: NJeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

11599eba

aio: move private stuff out of aio.h · 4e179bca

由 Kent Overstreet 提交于 5月 07, 2013

Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: NJeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4e179bca

aio: kill return value of aio_complete() · 2d68449e

由 Kent Overstreet 提交于 5月 07, 2013

Nothing used the return value, and it probably wasn't possible to use it
safely for the locked versions (aio_complete(), aio_put_req()).  Just
kill it.
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Acked-by: NZach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: NJeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2d68449e

aio: remove retry-based AIO · 41003a7b

由 Zach Brown 提交于 5月 07, 2013

This removes the retry-based AIO infrastructure now that nothing in tree
is using it.

We want to remove retry-based AIO because it is fundemantally unsafe.
It retries IO submission from a kernel thread that has only assumed the
mm of the submitting task.  All other task_struct references in the IO
submission path will see the kernel thread, not the submitting task.
This design flaw means that nothing of any meaningful complexity can use
retry-based AIO.

This removes all the code and data associated with the retry machinery.
The most significant benefit of this is the removal of the locking
around the unused run list in the submission path.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Signed-off-by: NZach Brown <zab@redhat.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: NJeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

41003a7b

aio: remove dead code from aio.h · 4b49bb8a

由 Zach Brown 提交于 5月 07, 2013

Signed-off-by: NZach Brown <zab@redhat.com>
Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: NJeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4b49bb8a

remove unused random32() and srandom32() · 22ea9c07

由 Akinobu Mita 提交于 5月 07, 2013

After finishing a naming transition, remove unused backward
compatibility wrapper macros
Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

22ea9c07

hugetlbfs: fix mmap failure in unaligned size request · af73e4d9

由 Naoya Horiguchi 提交于 5月 07, 2013

The current kernel returns -EINVAL unless a given mmap length is
"almost" hugepage aligned.  This is because in sys_mmap_pgoff() the
given length is passed to vm_mmap_pgoff() as it is without being aligned
with hugepage boundary.

This is a regression introduced in commit 40716e29 ("hugetlbfs: fix
alignment of huge page requests"), where alignment code is pushed into
hugetlb_file_setup() and the variable len in caller side is not changed.

To fix this, this patch partially reverts that commit, and adds
alignment code in caller side.  And it also introduces hstate_sizelog()
in order to get proper hstate to specified hugepage size.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=56881

[akpm@linux-foundation.org: fix warning when CONFIG_HUGETLB_PAGE=n]
Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Reported-by: <iceman_dvd@yahoo.com>
Cc: Steven Truelove <steven.truelove@utoronto.ca>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

af73e4d9

include/linux/mm.h: complete the mm_walk definition · 0f157a5b

由 Andrew Morton 提交于 5月 07, 2013

That nameless-function-arguments thing drives me batty.  Fix.

Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0f157a5b

OpenHarmony / kernel_linux 上一次同步 大约 4 年

OpenHarmony / kernel_linux
上一次同步大约 4 年