Commits · 4819568f23a8bef0ca99b740ca60fe2450ab0aac · openeuler / Kernel

13 Dec, 2009 1 commit

ftrace.h: Use common pr_info fmt string · 4819568f

Committed by Joe Perches on Dec 12, 2009

Reduces fmt string space a bit.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1260651974.2637.4.camel@Joe-Laptop.home>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

4819568f

10 Dec, 2009 2 commits

tracing: Add full state to trace_seq · d184b31c

Committed by Johannes Berg on Nov 25, 2009

The trace_seq buffer might fill up, and right now one needs to check the
return value of each printf into the buffer to check for that.

Instead, have the buffer keep track of whether it is full or not, and
reject more input if it is full or would have overflowed with an input
that wasn't added.

Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

d184b31c
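
The "full" state described in this commit can be illustrated with a small userspace sketch. It is not the kernel's trace_seq code: the struct, the function names and the tiny buffer size are all hypothetical, chosen only to show a sticky flag that rejects input once something failed to fit.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define SEQ_SIZE 16 /* deliberately tiny so overflow is easy to trigger */

/* Hypothetical stand-in for the kernel's struct trace_seq. */
struct seq_sketch {
    char buffer[SEQ_SIZE];
    size_t len;
    int full; /* sticky: set once an input did not fit */
};

void seq_init(struct seq_sketch *s)
{
    s->buffer[0] = '\0';
    s->len = 0;
    s->full = 0;
}

/* Returns 1 if the text was added, 0 if it was rejected. */
int seq_printf_sketch(struct seq_sketch *s, const char *fmt, ...)
{
    size_t room = SEQ_SIZE - s->len; /* always >= 1, see below */
    va_list ap;
    int n;

    if (s->full)
        return 0; /* reject everything after the first overflow */

    va_start(ap, fmt);
    n = vsnprintf(s->buffer + s->len, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        s->buffer[s->len] = '\0'; /* drop the partially written input */
        s->full = 1;
        return 0;
    }
    s->len += (size_t)n; /* len stays below SEQ_SIZE, keeping room >= 1 */
    return 1;
}
```

With this shape a caller can issue a series of writes and inspect `full` once at the end, rather than checking every individual return value.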

tracing: Buffer the output of seq_file in case of filled buffer · a63ce5b3

Committed by Steven Rostedt on Dec 07, 2009

If the seq_read fills the buffer it will call s_start again on the next
iteration with the same position. This causes a problem with the
function_graph tracer because it consumes the iteration in order to
determine leaf functions.

What happens is that the iterator stores the entry, and the function
graph plugin will look at the next entry. If that next entry is a return
of the same function and task, then the function is a leaf and the
function_graph plugin calls ring_buffer_read which moves the ring buffer
iterator forward (the trace iterator still points to the function start
entry).

The copying of the trace_seq to the seq_file buffer will fail if the
seq_file buffer is full. The seq_read will not show this entry.
The next read by userspace will cause seq_read to again call s_start
which will reuse the trace iterator entry (the function start entry).
But the function return entry was already consumed. The function graph
plugin will think that this entry is a nested function and not a leaf.

To solve this, the trace code now checks the return status of the
seq_printf (trace_print_seq). If the writing to the seq_file buffer
fails, we set a flag in the iterator (leftover) and we do not reset
the trace_seq buffer. On the next call to s_start, we check the leftover
flag, and if it is set, we just reuse the trace_seq buffer and do not
call into the plugin print functions.

Before this patch:

 2)               |      fput() {
 2)               |        __fput() {
 2)   0.550 us    |          inotify_inode_queue_event();
 2)               |          __fsnotify_parent() {
 2)   0.540 us    |          inotify_dentry_parent_queue_event();

After the patch:

 2)               |      fput() {
 2)               |        __fput() {
 2)   0.550 us    |          inotify_inode_queue_event();
 2)   0.548 us    |          __fsnotify_parent();
 2)   0.540 us    |          inotify_dentry_parent_queue_event();

[
  Updated the patch to fix a missing return 0 from the trace_print_seq()
  stub when CONFIG_TRACING is disabled.
Reported-by: Ingo Molnar <mingo@elte.hu>
]
Reported-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

a63ce5b3
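
The leftover mechanism can be modelled in a few lines of userspace C. This is only a sketch under assumptions: `out_buf` stands in for the seq_file buffer, `iter_sketch` for the trace iterator with its trace_seq, and every name and size here is hypothetical. The point it shows is that a failed copy keeps the staged text and sets a flag, so the next call retries the copy instead of formatting (and thereby consuming) the entry a second time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define OUT_SIZE 32 /* tiny on purpose, so the output buffer fills quickly */

/* Hypothetical stand-in for the seq_file buffer handed to userspace. */
struct out_buf {
    char data[OUT_SIZE];
    size_t len;
};

/* Hypothetical stand-in for the trace iterator and its trace_seq. */
struct iter_sketch {
    char staged[24];  /* formatted text waiting to be copied out */
    int leftover;     /* set when the last copy did not fit */
    int next_entry;   /* formatting an entry consumes it */
    int format_calls; /* how often the "plugin" formatted an entry */
};

/* Copy the staged text to out; return 0 on success, -1 if it does not fit. */
int flush_staged(struct iter_sketch *it, struct out_buf *out)
{
    size_t n = strlen(it->staged);

    if (out->len + n > OUT_SIZE)
        return -1;
    memcpy(out->data + out->len, it->staged, n);
    out->len += n;
    it->staged[0] = '\0'; /* reset only after a successful copy */
    return 0;
}

/* Show one entry; on a full output buffer, remember the staged text. */
int show_one(struct iter_sketch *it, struct out_buf *out,
             const char *const *entries)
{
    if (!it->leftover) {
        /* Normal path: format the next entry, which consumes it. */
        snprintf(it->staged, sizeof(it->staged), "%s\n",
                 entries[it->next_entry++]);
        it->format_calls++;
    }
    if (flush_staged(it, out) < 0) {
        it->leftover = 1; /* keep staged text for the next call */
        return -1;
    }
    it->leftover = 0;
    return 0;
}
```

When the third 11-byte line does not fit into the 32-byte output buffer, `show_one()` fails once; called again with an emptied buffer, it emits the same line without another formatting pass.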

07 Dec, 2009 2 commits

i2c: Drop probe, ignore and force module parameters · c7b25a9e

Committed by Jean Delvare on Dec 06, 2009

The legacy probe and force module parameters are now obsolete: the
same can be achieved using the new_device sysfs interface, which is
both more flexible and cheaper (it is implemented by i2c-core rather
than replicated in every driver module).

The legacy ignore module parameters can be dropped as well. Ignoring
can be done by instantiating a "dummy" device at the problematic
address.

This is the first step of a huge cleanup to i2c-core's i2c_detect
function, i2c.h's I2C_CLIENT_INSMOD* macros, and all drivers that made
use of them.
Signed-off-by: Jean Delvare <khali@linux-fr.org>

c7b25a9e

i2c: Prevent priority inversion on top of bus lock · 194684e5

Committed by Mika Kuoppala on Dec 06, 2009

A low-priority thread holding the i2c bus mutex could block
higher-priority threads from accessing the bus, resulting in
unacceptable latencies. Change the mutex type to rt_mutex to prevent
priority inversion.
Tested-by: Peter Ujfalusi <peter.ujfalusi@nokia.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@nokia.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

194684e5

06 Dec, 2009 2 commits

PM: Add flag for devices capable of generating run-time wake-up events · 7a1a8eb5

Committed by Rafael J. Wysocki on Dec 03, 2009

Apparently, there are devices that can wake up the system from sleep
states and yet are incapable of generating wake-up events at run
time.  Thus, introduce a flag indicating whether a given device is
capable of generating run-time wake-up events.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

7a1a8eb5

Add support for GCC-4.5's __builtin_unreachable() to compiler.h (v2) · 38938c87

Committed by David Daney on Dec 04, 2009

Starting with version 4.5, GCC has a new built-in function
__builtin_unreachable() that can be used in places like the kernel's
BUG() where inline assembly is used to transfer control flow.  This
eliminates the need for an endless loop in these places.

The patch adds a new macro 'unreachable()' that will expand to either
__builtin_unreachable() or an endless loop depending on the compiler
version.

Change from v1: Simplify unreachable() for non-GCC 4.5 case.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

38938c87
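
A version-dependent macro of the kind this commit describes could look roughly like the following. This is a sketch, not the text of compiler.h: the version test and the `color_name()` example are assumptions made for illustration. A compiler that does not report itself as GCC 4.5 or later simply takes the endless-loop branch.

```c
#include <assert.h>
#include <string.h>

/* Pick the builtin when the compiler reports GCC 4.5 or later. */
#if defined(__GNUC__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
#define unreachable() __builtin_unreachable()
#else
#define unreachable() do { } while (1) /* fallback: endless loop */
#endif

enum color { RED, GREEN };

/* Hypothetical example: every enumerator returns inside the switch,
 * so control never gets past it for valid input. */
const char *color_name(enum color c)
{
    switch (c) {
    case RED:
        return "red";
    case GREEN:
        return "green";
    }
    unreachable(); /* tells the compiler this point is never reached */
}
```

Either expansion lets the compiler see that the function does not fall off its end without returning a value. Only the builtin does so without emitting loop code.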

04 Dec, 2009 22 commits

block: Fix io_context leak after failure of clone with CLONE_IO · b69f2292

Committed by Louis Rilling on Dec 04, 2009

With CLONE_IO, the parent's io_context->nr_tasks is incremented, but never
decremented if copy_process() fails afterwards, which prevents
exit_io_context() from calling the IO schedulers' exit functions.

Give a task_struct to exit_io_context(), and call exit_io_context() instead of
put_io_context() in the copy_process() cleanup path.
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b69f2292
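
The leak is easy to see in a toy model with two counters. The sketch below is not kernel code: the `_sketch` types and helpers are hypothetical, and a plain int replaces the atomic counters. It only shows why dropping the reference alone leaves nr_tasks too high, so the scheduler exit hook never runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of an io_context shared via CLONE_IO. */
struct ioc_sketch {
    int refcount;          /* references held on the structure */
    int nr_tasks;          /* tasks currently using the context */
    int sched_exit_called; /* set when the "IO scheduler exit" runs */
};

struct task_sketch {
    struct ioc_sketch *io_context;
};

/* Clone with CLONE_IO: the child shares the parent's context. */
void share_ioc(struct task_sketch *child, struct ioc_sketch *ioc)
{
    ioc->refcount++;
    ioc->nr_tasks++;
    child->io_context = ioc;
}

/* Drops a reference only; nr_tasks is left untouched. */
void put_ioc(struct ioc_sketch *ioc)
{
    ioc->refcount--;
}

/* Task-aware exit: drops nr_tasks first, then the reference. */
void exit_ioc(struct task_sketch *tsk)
{
    struct ioc_sketch *ioc = tsk->io_context;

    tsk->io_context = NULL;
    if (--ioc->nr_tasks == 0)
        ioc->sched_exit_called = 1; /* last user: run scheduler exit */
    put_ioc(ioc);
}
```

If the failed child is cleaned up with `put_ioc()` alone, nr_tasks stays at 2 and the parent's later exit only brings it down to 1. Cleaning up with `exit_ioc()` restores the count, so the parent's exit reaches 0.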

tcp: connect() race with timewait reuse · 13475a30

Committed by Eric Dumazet on Dec 02, 2009

It's currently possible that several threads issuing a connect() find
the same timewait socket and try to reuse it, leading to list
corruption.

The condition for the bug is that these threads bound their sockets to
the same address/port as the to-be-found timewait socket, and connected
to the same target. (SO_REUSEADDR needed)

To fix this problem, we can unhash the timewait socket while holding
the ehash lock, to make sure lookups/changes will be serialized. Only
the first thread finds the timewait socket; the other ones find the
established socket and return an EADDRNOTAVAIL error.

This second version takes into account Evgeniy's review and makes sure
inet_twsk_put() is called outside of locked sections.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

13475a30

netdevice: provide common routine for macvlan and vlan operstate management · fc4a7489

Committed by Patrick Mullaney on Dec 03, 2009

Provide common routine for the transition of operational state for a leaf
device during a root device transition.
Signed-off-by: Patrick Mullaney <pmullaney@novell.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc4a7489

usbnet & cdc-ether: Autosuspend for online devices · 69ee472f

Committed by Oliver Neukum on Dec 03, 2009

Use remote wakeup and delayed transmission to allow
online devices to go into USB autosuspend.
Provide minimal alternate support for devices that don't support
remote wakeup.
Signed-off-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

69ee472f

libata: Clarify ata_set_lba_range_entries function · d0634c4a

Committed by Martin K. Petersen on Nov 26, 2009

ata_set_lba_range_entries used the variable max for two different things,
which was confusing. Make the function take a buffer size in bytes as an
argument and return the used buffer size upon completion.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

d0634c4a

libata: Report zeroed read after TRIM and max discard size · e78db4df

Committed by Martin K. Petersen on Nov 26, 2009

Our current TRIM payload is a single sector that can accommodate 64 *
65535 blocks being unmapped.  Report this value in the Block Limits
Maximum Unmap LBA count field.

If a storage device supports TRIM and the DRAT and RZAT bits are set,
report TPRZ=1 in Read Capacity(16).
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

e78db4df
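
The 64 * 65535 figure follows from the payload format: a 512-byte sector holds 64 eight-byte range entries, and each entry carries a 16-bit block count next to a 48-bit starting LBA. The sketch below works through that arithmetic and fills such entries. The function name is hypothetical, entries are stored in host byte order for simplicity (a real payload would be little-endian), and unlike a real driver it does not pad the result to a sector boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BYTES 512u
#define ENTRY_BYTES  8u     /* one 64-bit range entry */
#define MAX_RANGE    65535u /* largest value of the 16-bit count field */
#define LBA_MASK     0xFFFFFFFFFFFFULL /* low 48 bits hold the LBA */

/* Blocks one single-sector payload can describe: 64 * 65535. */
#define TRIM_MAX_BLOCKS ((SECTOR_BYTES / ENTRY_BYTES) * MAX_RANGE)

/*
 * Hypothetical helper: split [lba, lba + count) into range entries.
 * Takes the buffer size in bytes and returns the number of bytes used.
 * Stops early if the buffer is too small for the whole range.
 */
size_t fill_trim_ranges(uint64_t *buf, size_t buf_bytes,
                        uint64_t lba, uint64_t count)
{
    size_t max_entries = buf_bytes / ENTRY_BYTES;
    size_t i = 0;

    while (i < max_entries && count > 0) {
        uint64_t n = count > MAX_RANGE ? MAX_RANGE : count;

        buf[i++] = (lba & LBA_MASK) | (n << 48);
        lba += n;
        count -= n;
    }
    return i * ENTRY_BYTES;
}
```

For example, unmapping 70000 blocks starting at LBA 1000 needs two entries: 65535 blocks at LBA 1000 and the remaining 4465 blocks at LBA 66535.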

tg3: Add some VPD preprocessor constants · 141518c9

Committed by Matt Carlson on Dec 03, 2009

This patch cleans up the VPD code by creating preprocessor definitions
and using them in the place of hardcoded constants.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

141518c9

net: Batch inet_twsk_purge · b099ce26

Committed by Eric W. Biederman on Dec 03, 2009

This function walks the whole hashtable so there is no point in
passing it a network namespace.  Instead I purge all timewait
sockets from dead network namespaces that I find.  If the namespace
is one of the ones I am trying to purge I am guaranteed no new timewait
sockets can be formed so this will get them all.  If the namespace
is one I am not acting for it might form a few more but I will
call inet_twsk_purge again shortly to get rid of them.  In
any event, if the network namespace is dead, timewait sockets are
useless.

Move the calls of inet_twsk_purge into batch_exit routines so
that if I am killing a bunch of namespaces at once I will just
call inet_twsk_purge once and save a lot of redundant
work.

In my simple 4k network namespace exit test the cleanup time dropped from
roughly 8.2s to 1.6s, while the time spent running inet_twsk_purge fell
to about 2ms: 1ms for ipv4 and 1ms for ipv6.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b099ce26

net: Allow fib_rule_unregister to batch · e9c5158a

Committed by Eric W. Biederman on Dec 03, 2009

Refactor the code so fib_rules_register always takes a template instead
of the actual fib_rules_ops structure that will be used.  This is
required for network namespace support, so 2 out of the 3 callers already
do this; it allows the error handling to be made common, and it allows
fib_rules_unregister to free the template for the caller.

Modify fib_rules_unregister to use call_rcu instead of synchronize_rcu
to allow multiple namespaces to be cleaned up in the same rcu grace
period.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e9c5158a

net: Allow xfrm_user_net_exit to batch efficiently. · d79d792e

Committed by Eric W. Biederman on Dec 03, 2009

xfrm.nlsk is provided by the xfrm_user module and is accessed via RCU from
other parts of the xfrm code.  Add xfrm.nlsk_stash, a copy of xfrm.nlsk that
will never be set to NULL.  This allows the synchronize_net and
netlink_kernel_release to be deferred until a whole batch of xfrm.nlsk sockets
have been set to NULL.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d79d792e

net: Add support for batching network namespace cleanups · 72ad937a

Committed by Eric W. Biederman on Dec 03, 2009

- Add exit_list to struct net to support building lists of network
  namespaces to clean up.

- Add exit_batch to pernet_operations to allow running operations only
  once during a network namespace exit, instead of once per network
  namespace.

- Factor out ops_exit_list and ops_exit_free so the logic for cleaning
  up a network namespace does not need to be duplicated.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

72ad937a
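
The difference between a per-namespace exit hook and a batched one can be sketched as follows. The structures are simplified, hypothetical stand-ins (a hand-rolled singly linked list instead of the kernel's list_head, `_sketch` names throughout), and `expensive_walks` merely counts how often a costly global operation such as a full hashtable walk would run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net with its exit_list linkage. */
struct net_sketch {
    int id;
    struct net_sketch *exit_next;
};

/* Hypothetical stand-in for pernet_operations. */
struct pernet_ops_sketch {
    void (*exit)(struct net_sketch *net);             /* per namespace */
    void (*exit_batch)(struct net_sketch *exit_list); /* once per batch */
};

int expensive_walks; /* counts simulated whole-table walks */

void purge_per_net(struct net_sketch *net)
{
    (void)net;
    expensive_walks++; /* one full walk for every namespace */
}

void purge_batch(struct net_sketch *exit_list)
{
    (void)exit_list;
    expensive_walks++; /* one full walk covers the whole batch */
}

/* Run one set of operations over every namespace on the exit list. */
void ops_exit_list_sketch(const struct pernet_ops_sketch *ops,
                          struct net_sketch *exit_list)
{
    struct net_sketch *net;

    if (ops->exit)
        for (net = exit_list; net; net = net->exit_next)
            ops->exit(net);
    if (ops->exit_batch)
        ops->exit_batch(exit_list);
}
```

With three namespaces on the list, the per-namespace hook triggers three walks while the batched hook triggers one, which is the kind of saving the batching commits in this series aim for.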

ipv4 05/05: add sysctl to accept packets with local source addresses · 8153a10c

Committed by Patrick McHardy on Dec 03, 2009

commit 8ec1e0ebe26087bfc5c0394ada5feb5758014fc8
Author: Patrick McHardy <kaber@trash.net>
Date:   Thu Dec 3 12:16:35 2009 +0100

    ipv4: add sysctl to accept packets with local source addresses

    Change fib_validate_source() to accept packets with a local source address when
    the "accept_local" sysctl is set for the incoming inet device. Combined with the
    previous patches, this allows communication between multiple local interfaces
    over the wire.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

8153a10c

net 03/05: fib_rules: add oif classification · 1b038a5e

Committed by Patrick McHardy on Dec 03, 2009

commit 68144d350f4f6c348659c825cde6a82b34c27a91
Author: Patrick McHardy <kaber@trash.net>
Date:   Thu Dec 3 12:05:25 2009 +0100

    net: fib_rules: add oif classification

    Support routing table lookup based on the flow's oif. This is useful to
    classify packets originating from sockets bound to interfaces differently.

    The route cache already includes the oif and needs no changes.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

1b038a5e

net 02/05: fib_rules: rename ifindex/ifname/FRA_IFNAME to iifindex/iifname/FRA_IIFNAME · 491deb24

Committed by Patrick McHardy on Dec 03, 2009

commit 229e77eec406ad68662f18e49fda8b5d366768c5
Author: Patrick McHardy <kaber@trash.net>
Date:   Thu Dec 3 12:05:23 2009 +0100

    net: fib_rules: rename ifindex/ifname/FRA_IFNAME to iifindex/iifname/FRA_IIFNAME

    The next patch will add oif classification, rename interface related members
    and attributes to reflect that they're used for iif classification.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

491deb24

net 01/05: fib_rules: rearrange struct fib_rule · d2858340

Committed by Patrick McHardy on Dec 03, 2009

commit b8952893d5d86f69c4e499d191b98c6658f64b0f
Author: Patrick McHardy <kaber@trash.net>
Date:   Thu Dec 3 12:05:22 2009 +0100

    net: fib_rules: rearrange struct fib_rule

    The ifname member is only used to resolve interface names and is not needed
    during rule lookups. The target and ctarget members however are used during
    rule lookups and are currently located in a second cacheline.

    Move ifname further to the end to make sure both target and ctarget are
    located in the same cacheline as other members used during rule lookups.

    The layout on 64 bit changes from:

    struct fib_rule {
    	...
            u32                        table;                /*    56     4 */
            u8                         action;               /*    60     1 */

            /* XXX 3 bytes hole, try to pack */

            /* --- cacheline 1 boundary (64 bytes) --- */
            u32                        target;               /*    64     4 */

            /* XXX 4 bytes hole, try to pack */

            struct fib_rule *          ctarget;              /*    72     8 */
            struct rcu_head            rcu;                  /*    80    16 */
            struct net *               fr_net;               /*    96     8 */
    };

    to:

    struct fib_rule {
    	...
            u32                        table;                /*    40     4 */
            u8                         action;               /*    44     1 */

            /* XXX 3 bytes hole, try to pack */

            u32                        target;               /*    48     4 */

            /* XXX 4 bytes hole, try to pack */

            struct fib_rule *          ctarget;              /*    56     8 */
            /* --- cacheline 1 boundary (64 bytes) --- */
            char                       ifname[16];           /*    64    16 */
            struct rcu_head            rcu;                  /*    80    16 */
            struct net *               fr_net;               /*    96     8 */

    };
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

d2858340
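
The new layout can be checked mechanically with offsetof(). The struct below is a hypothetical mirror of the "to:" listing, not the kernel definition: the members elided as "..." are replaced by a 40-byte placeholder, rcu_head is approximated by two pointers, and the offsets only match on a 64-bit (LP64) target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64

/* Approximation of struct rcu_head: two pointers, 16 bytes on LP64. */
struct rcu_head_sketch {
    void *next;
    void (*func)(void *);
};

/* Hypothetical mirror of the reordered struct fib_rule. */
struct fib_rule_sketch {
    char leading[40];  /* placeholder for the elided members ("...") */
    uint32_t table;    /* offset 40 */
    uint8_t action;    /* offset 44, followed by a 3-byte hole */
    uint32_t target;   /* offset 48, followed by a 4-byte hole */
    struct fib_rule_sketch *ctarget; /* offset 56 */
    char ifname[16];   /* offset 64: starts the second cacheline */
    struct rcu_head_sketch rcu;      /* offset 80 */
    void *fr_net;      /* offset 96 */
};

/* Does a member at [offset, offset + size) fit in the first cacheline? */
int in_first_cacheline(size_t offset, size_t size)
{
    return offset + size <= CACHELINE_BYTES;
}
```

In this arrangement target and ctarget both end within the first 64 bytes, while ifname, which the commit message says is not needed during rule lookups, begins exactly at the cacheline boundary.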

libata: add private driver field to struct ata_device · 95514fd8

Committed by Bartlomiej Zolnierkiewicz on Nov 25, 2009

This brings struct ata_device in-line with struct ata_{port,host}.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

95514fd8

pata_piccolo: Driver for old Toshiba chipsets · 8e182a90

Committed by Alan Cox on Nov 30, 2009

We were never able to get docs for this out of Toshiba for years. Dave
Barnes produced a NetBSD driver however and from that we can fill in the
needed tables.

As we correct the PCI identifiers a bit, also update the old ide generic driver
at the same time so it keeps compiling.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

8e182a90

Bluetooth: Implement RejActioned flag · 4ec10d97

Committed by Gustavo F. Padovan on Oct 03, 2009

RejActioned is used to prevent retransmission when an entity is in the
WAIT_F state, i.e., waiting for a frame with the F-bit set due to a local
busy condition or an expired retransmission timer. (When either of these
events occurs, the entity sends a frame with the Poll bit set and enters
the WAIT_F state to wait for a frame with the Final bit set.)
The local entity doesn't send I-frames (the data frames) until the receipt
of a frame with the F-bit set. When that happens it also sets RejActioned
to false.
RejActioned is a mandatory feature of the ERTM spec.
Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

4ec10d97

Bluetooth: Fix sending ReqSeq on I-frames · 9f121a5a

Committed by Gustavo F. Padovan on Oct 03, 2009

As specified by the ERTM spec, an ERTM channel can acknowledge received
I-frames (the data frames) by sending an I-frame with the proper ReqSeq
value (i.e. ReqSeq is set to BufferSeq). Until now we weren't setting the
ReqSeq value in the I-frame control bits. Setting it saves us from sending
S-frames (supervisory frames) only to acknowledge receipt of I-frames. It
is very helpful for full-duplex channels.
ReqSeq is the packet sequence number sent in an acknowledgement frame to
acknowledge receipt of frames up to (ReqSeq - 1).
BufferSeq controls the receiver buffer; it is used to delay
acknowledgement of new frames so as not to cause buffer overflow. The
BufferSeq value is not increased until frames are pulled by the reassembly
function.
Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

9f121a5a
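
Piggybacking the acknowledgement on an I-frame amounts to packing two 6-bit sequence numbers into the 16-bit control field. The sketch below assumes the enhanced control field layout with TxSeq in bits 1 to 6 and ReqSeq in bits 8 to 13, and bit 0 clear for an I-frame. These positions, like all the names here, are assumptions of this example and should be checked against the L2CAP specification before being relied on.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions in the 16-bit enhanced control field. */
#define CTRL_TXSEQ_SHIFT  1
#define CTRL_REQSEQ_SHIFT 8
#define SEQ_MASK          0x3F /* sequence numbers are 6 bits, modulo 64 */

/* Build an I-frame control field that also acknowledges received frames
 * by carrying the receiver's BufferSeq as ReqSeq. */
uint16_t iframe_control(uint8_t tx_seq, uint8_t buffer_seq)
{
    return (uint16_t)(((tx_seq & SEQ_MASK) << CTRL_TXSEQ_SHIFT) |
                      ((buffer_seq & SEQ_MASK) << CTRL_REQSEQ_SHIFT));
}

/* Extract ReqSeq from a received control field. */
uint8_t control_reqseq(uint16_t control)
{
    return (uint8_t)((control >> CTRL_REQSEQ_SHIFT) & SEQ_MASK);
}

/* ReqSeq acknowledges frames up to (ReqSeq - 1), wrapping modulo 64. */
uint8_t last_acked_seq(uint8_t req_seq)
{
    return (uint8_t)((req_seq + SEQ_MASK) & SEQ_MASK);
}
```

An I-frame sent with TxSeq 5 while BufferSeq is 9 therefore acknowledges everything up to sequence number 8 without a separate S-frame.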

Bluetooth: Unobfuscate tasklet_schedule usage · c78ae283

Committed by Marcel Holtmann on Nov 18, 2009

The tasklet schedule function helpers are just an obfuscation. So remove
them and call the schedule functions directly.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

c78ae283

Bluetooth: Turn hci_recv_frame into an exported function · 76bca880

Committed by Marcel Holtmann on Nov 18, 2009

For future simplification it is important that the hci_recv_frame
function is no longer an inline function. So move it into the module
itself and export it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

76bca880

blkio: Introduce blkio controller cgroup interface · 31e4c28d

Committed by Vivek Goyal on Dec 03, 2009

o This is a basic implementation of the blkio controller cgroup interface.
  This is the common interface visible to user space and should be used by
  different IO control policies as we implement those.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

31e4c28d

03 Dec, 2009 11 commits

writeback: introduce wbc.for_background · b17621fe

Committed by Wu Fengguang on Dec 03, 2009

It will lower the flush priority for NFS, and maybe more in future.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b17621fe

GFS2: Tag all metadata with jid · 0ab7d13f

Committed by Steven Whitehouse on Nov 06, 2009

There are two spare fields in the header common to all GFS2
metadata. One is just the right size to fit a journal id
in it, and this patch updates the journal code so that each
time a metadata block is modified, we tag it with the journal
id of the node which is performing the modification.

The reason for this is that it should make it much easier to
debug issues which arise if we can tell which node was the
last to modify a particular metadata block.

Since the field is updated before the block is written into
the journal, each journal should only contain metadata which
is tagged with its own journal id. The one exception to this
is the journal header block, which might have a different node's
id in it, if that journal was recovered by another node in the
cluster.

Thus each journal will contain a record of which nodes recovered
it, via the journal header.

The other field in the metadata header could potentially be
used to hold information about what kind of operation was
performed, but for the time being we just zero it on each
transaction so that if we use it for that in future, we'll
know that the information (where it exists) is reliable.

I did consider using the other field to hold the journal
sequence number, however since in GFS2's journaling we write
the modified data into the journal and not the original
data, this gives no information as to what action caused the
modification, so I think we can probably come up with a better
use for those 64 bits in the future.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

0ab7d13f

VFS: Export dquot_send_warning · 86e931a3

Committed by Steven Whitehouse on Sep 28, 2009

Sending a message to userspace in a generic format to warn
of events (e.g. quota exceeded) in the quota subsystem is
a generically useful feature. This patch makes some minor
changes to the send_message function from dquot.c renaming
it quota_send_message, moving it to quota.c and exporting it
for use by filesystems which do not use the dquot code.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

86e931a3

VFS: Add forget_all_cached_acls() · 796bd952

Committed by Steven Whitehouse on Sep 29, 2009

This is required for cluster filesystems which want to use
cached ACLs so that they can invalidate the cache when
required.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>

796bd952

block: Allow devices to indicate whether discarded blocks are zeroed · 98262f27

Committed by Martin K. Petersen on Dec 03, 2009

The discard ioctl is used by mkfs utilities to clear a block device
prior to putting metadata down. However, not all devices return zeroed
blocks after a discard. Some drives return stale data, potentially
containing old superblocks. It is therefore important to know whether
discarded blocks are properly zeroed.

Both ATA and SCSI drives have configuration bits that indicate whether
zeroes are returned after a discard operation. Implement a block level
interface that allows this information to be bubbled up the stack and
queried via a new block device ioctl.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

98262f27

libata: add translation for SCSI WRITE SAME (aka TRIM support) · 18f0f978

Committed by Christoph Hellwig on Nov 17, 2009

Add support for the ATA TRIM command in libata. We translate a WRITE SAME 16
command with the unmap bit set into an ATA TRIM command and export enough
information in READ CAPACITY 16 and the block limits EVPD page so that the new
SCSI layer discard support will drive this for us.

Note that I hardcode the WRITE_SAME_16 opcode for now as the patch to introduce
the symbolic name is not in 2.6.32 yet but only in the SCSI tree - as soon as it
is merged we can fix it up to properly use the symbolic name.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

18f0f978

libata: retry failed FLUSH if device didn't fail it · 6013efd8

Committed by Tejun Heo on Nov 19, 2009

If an ATA device fails a FLUSH, it means that the device failed to write
out some amount of data and the error needs to be reported to upper
layers. As retries can't recover the lost data, FLUSH failures need to
be reported immediately in general.

However, if FLUSH fails due to transmission errors, the FLUSH needs to
be retried; otherwise, filesystems may switch to RO mode and/or raid
array may drop a drive for a random transmission glitch.

This condition can be rather easily reproduced on certain ahci
controllers which go through a PHY event after powersave mode switch +
ext4 combination.  Powersave mode switch is often closely followed by
flush from the filesystem failing the FLUSH with ATA bus error which
makes the filesystem code believe that data is lost and drop to RO
mode.  This was reported in the following bugzilla bug.

  http://bugzilla.kernel.org/show_bug.cgi?id=14543

This patch makes libata EH retry FLUSH if it wasn't failed by the
device.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andrey Vihrov <andrey.vihrov@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

6013efd8

KVM: s390: Make psw available on all exits, not just a subset · d7b0b5eb

Committed by Carsten Otte on Nov 19, 2009

This patch moves the s390 processor status word into the base kvm_run
struct and keeps it up to date on all userspace exits.

The userspace ABI is broken by this, however there are no applications
in the wild using this.  A capability check is provided so users can
verify the updated API exists.

Cc: stable@kernel.org
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d7b0b5eb

KVM: x86: Add KVM_GET/SET_VCPU_EVENTS · 3cfc3092

Committed by Jan Kiszka on Nov 12, 2009

This new IOCTL exports all states related to exceptions, interrupts,
and NMIs that were not yet visible to userspace. Together with appropriate
user space changes, this fixes sporadic problems with vmsave/restore, live
migration and system reset.

[avi: future-proof abi by adding a flags field]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3cfc3092

KVM: VMX: Report unexpected simultaneous exceptions as internal errors · 65ac7264

Committed by Avi Kivity on Nov 04, 2009

These happen when we trap an exception when another exception is being
delivered; we only expect these with MCEs and page faults.  If something
unexpected happens, things probably went south and we're better off reporting
an internal error and freezing.
Signed-off-by: Avi Kivity <avi@redhat.com>

65ac7264

KVM: Allow internal errors reported to userspace to carry extra data · a9c7399d

Committed by Avi Kivity on Nov 04, 2009

Usually userspace will freeze the guest so we can inspect it, but some
internal state is not available.  Add extra data to internal error
reporting so we can expose it to the debugger.  Extra data is specific
to the suberror.
Signed-off-by: Avi Kivity <avi@redhat.com>

a9c7399d

openeuler / Kernel synced successfully almost 2 years ago