Commits · d77385f23830ee6c400569bac8b37e6eb3b7d360 · openeuler / raspberrypi-kernel

19 Oct, 2011 · 15 commits

SUNRPC: Fix rpc_sockaddr2uaddr · d77385f2

Authored by Trond Myklebust on Oct 17, 2011

rpc_sockaddr2uaddr is only used by net/sunrpc/rpcb_clnt.c, where
it is used in a non-blockable context in at least one case.

Add non-blocking capability by adding a gfp_t argument
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs/super.c: local functions should be static · 45402c38

Authored by H Hartley Sweeten on Sep 02, 2011

commit ae50c0b5 "pnfs: client stats" added additional information to
the output of /proc/self/mountstats. The new functions introduced are
only used in this file and should be marked static.

If CONFIG_NFS_V4_1 is not defined, empty stub functions are used.  If
CONFIG_NFS_V4 is not defined these stub functions are not used at all.
Adding static for the functions results in compile warnings:

fs/nfs/super.c:743: warning: 'show_sessions' defined but not used
fs/nfs/super.c:756: warning: 'show_pnfs' defined but not used

Fix this by adding a #ifdef CONFIG_NFS_V4 guard around the two
show_ functions.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
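The guard described in this commit can be sketched in plain C. This is a minimal illustration, not the code from fs/nfs/super.c: `CONFIG_DEMO_V4`, `show_sessions_demo` and `show_stats_demo` are hypothetical names standing in for the Kconfig symbol and the show_ functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a Kconfig symbol such as CONFIG_NFS_V4. */
#define CONFIG_DEMO_V4 1

#ifdef CONFIG_DEMO_V4
/* Built only when its single caller is built, so a static definition
 * cannot trigger a "defined but not used" warning. */
static int show_sessions_demo(char *buf, size_t len)
{
	return snprintf(buf, len, ",sessions");
}
#endif

/* The caller always exists; the optional part is guarded the same way. */
int show_stats_demo(char *buf, size_t len)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "opts:rw");

#ifdef CONFIG_DEMO_V4
	if (n >= 0 && (size_t)n < len)
		n += show_sessions_demo(buf + n, len - (size_t)n);
#endif
	return n;
}
```

Removing the `#define` drops both the helper and its call site together, which is the property the commit relies on.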

pnfsblock: fix writeback deadlock · 75422745

Authored by Peng Tao on Sep 22, 2011

We should check if the sector is already initialized before
trying to grab the page from page cache. Otherwise when two
pages of the same block are written back by two threads each
calling from writepage_locked, it can cause a deadlock like the one below.

 [ 1080.972099] INFO: task kswapd0:25 blocked for more than 120 seconds.
 [ 1080.972377] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [ 1080.972812] kswapd0         D ffff88000c4926c0     0    25      2 0x00000000
 [ 1080.972816]  ffff88000df276b0 0000000000000046 ffff88000df27640 ffffffff81013ba7
 [ 1080.972821]  ffff88000c492310 ffff88000df27fd8 ffff88000df27fd8 00000000001d3440
 [ 1080.972824]  ffff88000c378000 ffff88000c492310 ffff8800175d3d40 ffff880017fc75a8
 [ 1080.972828] Call Trace:
 [ 1080.972860]  [<ffffffff81013ba7>] ? read_tsc+0x9/0x19
 [ 1080.972877]  [<ffffffff810e0b23>] ? lock_page+0x2b/0x2b
 [ 1080.972899]  [<ffffffff81475a1d>] io_schedule+0x63/0x7e
 [ 1080.972902]  [<ffffffff810e0b31>] sleep_on_page+0xe/0x12
 [ 1080.972905]  [<ffffffff81475fe8>] __wait_on_bit_lock+0x46/0x8f
 [ 1080.972916]  [<ffffffff810822d7>] ? lock_release_holdtime.part.7+0x6b/0x72
 [ 1080.972919]  [<ffffffff810e0af6>] __lock_page+0x66/0x68
 [ 1080.972928]  [<ffffffff81072705>] ? autoremove_wake_function+0x3d/0x3d
 [ 1080.972932]  [<ffffffff810e0b1f>] lock_page+0x27/0x2b
 [ 1080.972934]  [<ffffffff810e0bcf>] find_lock_page+0x34/0x57
 [ 1080.972937]  [<ffffffff810e1738>] find_or_create_page+0x34/0x8a
 [ 1080.972947]  [<ffffffffa034245b>] bl_write_pagelist+0x205/0x6da [blocklayoutdriver]
 [ 1080.972951]  [<ffffffffa034145d>] ? bl_free_lseg+0x38/0x38 [blocklayoutdriver]
 [ 1080.972995]  [<ffffffffa02e27b9>] ? nfs_write_rpcsetup+0x118/0x123 [nfs]
 [ 1080.973033]  [<ffffffffa030246b>] pnfs_generic_pg_writepages+0x10b/0x1f4 [nfs]
 [ 1080.973089]  [<ffffffffa02deaae>] nfs_pageio_doio+0x1a/0x43 [nfs]
 [ 1080.973098]  [<ffffffffa02df035>] nfs_pageio_complete+0x16/0x2d [nfs]
 [ 1080.973108]  [<ffffffffa02e2d8f>] nfs_writepage_locked+0xa0/0xbf [nfs]
 [ 1080.973119]  [<ffffffffa02e36a1>] nfs_writepage+0x16/0x2b [nfs]
 [ 1080.973122]  [<ffffffff810e8762>] ? clear_page_dirty_for_io+0x87/0x9a
 [ 1080.973133]  [<ffffffff810efc5b>] shrink_page_list+0x39b/0x6c8
 [ 1080.973139]  [<ffffffff810f03bb>] shrink_inactive_list+0x22c/0x39e
 [ 1080.973144]  [<ffffffff810822d7>] ? lock_release_holdtime.part.7+0x6b/0x72
 [ 1080.973148]  [<ffffffff810f0c33>] shrink_zone+0x445/0x588
 [ 1080.973152]  [<ffffffff810f1a11>] balance_pgdat+0x2c2/0x56b
 [ 1080.973170]  [<ffffffff81254208>] ? __bitmap_weight+0x34/0x80
 [ 1080.973175]  [<ffffffff810f1f78>] kswapd+0x2be/0x2fa
 [ 1080.973179]  [<ffffffff810726c8>] ? __init_waitqueue_head+0x4b/0x4b
 [ 1080.973183]  [<ffffffff810f1cba>] ? balance_pgdat+0x56b/0x56b
 [ 1080.973187]  [<ffffffff81071f69>] kthread+0xa8/0xb0
 [ 1080.973200]  [<ffffffff814806b4>] kernel_thread_helper+0x4/0x10
 [ 1080.973205]  [<ffffffff81071ec1>] ? __init_kthread_worker+0x5a/0x5a
 [ 1080.973210]  [<ffffffff814806b0>] ? gs_change+0x13/0x13
 [ 1080.973213] no locks held by kswapd0/25.
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

pnfsblock: fix NULL pointer dereference · e6d05a75

Authored by Peng Tao on Sep 22, 2011

bl_add_page_to_bio returns error pointer. bio should be reset to
NULL in failure cases as the out path always calls bl_submit_bio.
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

pnfs: recoalesce when ld read pagelist fails · 9b7eecdc

Authored by Peng Tao on Sep 22, 2011

For pnfs pagelist read failure, we need to pg_recoalesce and resend IO to
mds.
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

pnfs: recoalesce when ld write pagelist fails · 8ce160c5

Authored by Peng Tao on Sep 22, 2011

For pnfs pagelist write failure, we need to pg_recoalesce and resend IO to
mds.
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

pnfs: make _set_lo_fail generic · 1b0ae068

Authored by Peng Tao on Sep 22, 2011

The file layout and block layout drivers both use it to set the layout
I/O failure bit, so make it generic.
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

pnfsblock: add missing rpc_put_mount and path_put · 760383f1

Authored by Peng Tao on Sep 22, 2011

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC/NFS: make rpc pipe upcall generic · c1225158

Authored by Peng Tao on Sep 22, 2011

The same function is used by idmap, gss and blocklayout code. Make it
generic.
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

pnfsblock: fix size of upcall message · fdc17abb

Authored by Jim Rees on Sep 22, 2011

Make the status field explicitly 32 bits.  "...it's unlikely that the kernel
and userspace would differ on the size of an int here, but it might be a
good idea to go ahead and make that explicitly 32 bits in case we end up
dealing with more exotic arches at some point in the future."
Suggested-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
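The point about an explicitly 32-bit status field can be illustrated with fixed-width types and compile-time size checks. This is a sketch under assumptions: `struct demo_downcall_msg` and its field names are hypothetical and are not taken from the blocklayout driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reply message shared between kernel and userspace. */
struct demo_downcall_msg {
	uint32_t status;   /* fixed width, independent of sizeof(int) */
	uint32_t major;
	uint32_t minor;
};

/* Compile-time checks that the layout cannot drift with the ABI. */
static_assert(sizeof(((struct demo_downcall_msg *)0)->status) == 4,
	      "status must be exactly 32 bits");
static_assert(sizeof(struct demo_downcall_msg) == 12,
	      "message must be exactly three 32-bit words");

/* Size both sides should agree on when reading or writing the pipe. */
size_t demo_downcall_size(void)
{
	return sizeof(struct demo_downcall_msg);
}
```

With a plain `int` the first check would still pass on common platforms, which is why the quoted review calls the change a precaution rather than a bug fix.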

pnfsblock: fix return code confusion · 516f2e24

Authored by Jim Rees on Sep 22, 2011

Always return PTR_ERR, not NULL, from nfs4_blk_get_deviceinfo and
nfs4_blk_decode_device.

Check for IS_ERR, not NULL, in bl_set_layoutdriver when calling
nfs4_blk_get_deviceinfo.
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
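The error-pointer convention this commit standardizes on can be sketched in userspace. The helpers below are stand-ins written for this example and mimic the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); `demo_get_deviceinfo` and `demo_set_layoutdriver` are hypothetical names, not the functions in the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Userspace stand-ins for ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *demo_err_ptr(long err)
{
	assert(err < 0 && err >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_dev { int id; };

/* Returns a valid pointer or an encoded errno, never NULL. */
struct demo_dev *demo_get_deviceinfo(int id)
{
	struct demo_dev *d;

	if (id < 0)
		return demo_err_ptr(-EINVAL);
	d = malloc(sizeof(*d));
	if (!d)
		return demo_err_ptr(-ENOMEM);
	d->id = id;
	return d;
}

/* The caller uses the IS_ERR-style test; a NULL check would miss errors. */
int demo_set_layoutdriver(int id)
{
	struct demo_dev *d = demo_get_deviceinfo(id);

	if (demo_is_err(d))
		return (int)demo_ptr_err(d);
	free(d);
	return 0;
}
```

Mixing the two conventions is the confusion named in the subject line: an encoded error is non-NULL, so `if (!d)` would treat it as success.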

nfs: don't try to migrate pages with active requests · 2da95652

Authored by Jeff Layton on Oct 12, 2011

nfs_find_and_lock_request will take a reference to the nfs_page and
will then put it if the req is already locked. It's possible though
that the reference will be the last one. That put then can kick off
a whole series of reference puts:

nfs_page
   nfs_open_context
      dentry
          inode

If the inode ends up being deleted, then the VFS will call
truncate_inode_pages. That function will try to take the page lock, but
it was already locked when migrate_page was called. The code
deadlocks.

Fix this by simply refusing the migration request if PagePrivate is
already set, indicating that the page is already associated with an
active read or write request.

We've had a customer test a backported version of this patch and
the preliminary results seem good.

Cc: stable@kernel.org
Cc: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: fix bug about IPv6 address scope checking · b9dd3abb

Authored by Mi Jinlong on Oct 12, 2011

The result of ipv6_addr_scope() is not always a single scope value,
so we can't use an equality test to compare it with
IPV6_ADDR_SCOPE_LINKLOCAL in nfs_sockaddr_match_ipaddr6.

This patch fixes the problem and checks the address before the scope_id.
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
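The ordering the message describes (compare the address first, then the scope id) can be sketched with the POSIX socket headers. This is a userspace approximation, not the kernel function: it swaps in `IN6_IS_ADDR_LINKLOCAL` from `<netinet/in.h>` for the kernel's address-type helpers, and `demo_match_ipaddr6` and `demo_make_addr6` are hypothetical names.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Compare the addresses first; only for link-local addresses does the
 * scope id (the interface) also have to match.
 */
int demo_match_ipaddr6(const struct sockaddr_in6 *a,
		       const struct sockaddr_in6 *b)
{
	assert(a != NULL && b != NULL);
	if (memcmp(&a->sin6_addr, &b->sin6_addr, sizeof(a->sin6_addr)) != 0)
		return 0;
	if (IN6_IS_ADDR_LINKLOCAL(&a->sin6_addr))
		return a->sin6_scope_id == b->sin6_scope_id;
	return 1;
}

/* Helper for building example addresses; returns 0 on success. */
int demo_make_addr6(struct sockaddr_in6 *sa, const char *text,
		    unsigned int scope_id)
{
	memset(sa, 0, sizeof(*sa));
	sa->sin6_family = AF_INET6;
	sa->sin6_scope_id = scope_id;
	return inet_pton(AF_INET6, text, &sa->sin6_addr) == 1 ? 0 : -1;
}
```

Two `fe80::1` addresses on different interfaces do not match here, while two identical global addresses match regardless of the scope id they carry.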

nfs: don't redirty inode when ncommit == 0 in nfs_commit_unstable_pages · 3236c3e1

Authored by Jeff Layton on Oct 11, 2011

commit 420e3646 allowed the kernel to reduce the number of unnecessary
commit calls by skipping the commit when there are a large number of
outstanding pages.

However, the current test in nfs_commit_unstable_pages does not handle
the edge condition properly. When ncommit == 0, then that means that the
kernel doesn't need to do anything more for the inode. The current test
though in the WB_SYNC_NONE case will return true, and the inode will end
up being marked dirty. Once that happens the inode will never be clean
until there's a WB_SYNC_ALL flush.

Fix this by immediately returning from nfs_commit_unstable_pages when
ncommit == 0.

Mike noticed this problem initially in RHEL5 (2.6.18-based kernel) which
has a backported version of 420e3646. The inode cache there was growing
very large. The inode cache was unable to be shrunk since the inodes
were all marked dirty. Calling sync() would essentially "fix" the
problem -- the WB_SYNC_ALL flush would result in the inodes all being
marked clean.

What I'm not clear on is how big a problem this is in mainline kernels
as the writeback code there is very different. Either way, it seems
incorrect to re-mark the inode dirty in this case.
Reported-by: Mike McLean <mikem@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@kernel.org [2.6.34+]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
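The edge condition can be modelled with a small sketch. Everything here is an assumption made for illustration: `struct demo_inode`, the `DEMO_SYNC_*` modes and the "at most half the pages" threshold are hypothetical and only mimic the shape of the test the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_sync_mode { DEMO_SYNC_NONE, DEMO_SYNC_ALL };

struct demo_inode {
	unsigned long ncommit;  /* pages waiting for a commit */
	unsigned long npages;   /* pages with outstanding requests */
	bool dirty;
};

/* Returns true if a commit was issued. */
bool demo_commit_unstable_pages(struct demo_inode *ino,
				enum demo_sync_mode mode)
{
	assert(ino != NULL);

	/* Nothing to commit: leave the inode alone instead of redirtying it. */
	if (ino->ncommit == 0)
		return false;

	/* Non-blocking flush: defer while few pages are ready to commit. */
	if (mode == DEMO_SYNC_NONE && ino->ncommit <= (ino->npages >> 1)) {
		ino->dirty = true;      /* revisit this inode later */
		return false;
	}

	ino->ncommit = 0;               /* pretend the commit was sent */
	ino->dirty = false;
	return true;
}
```

Without the early return, an inode with `ncommit == 0` would satisfy the deferral test in the non-blocking case and be marked dirty even though there is nothing left to do.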

Revert "NFS: Ensure that writeback_single_inode() calls write_inode() when syncing" · 59b7c05f

Authored by Trond Myklebust on Oct 17, 2011

This reverts commit b80c3cb6.

The reverted commit was rendered obsolete by a VFS fix: commit
5547e8aa (writeback: Update dirty flags in
two steps). We now no longer need to worry about writeback_single_inode()
missing our marking the inode for COMMIT in 'do_writepages()' call.

Reverting this patch fixes a performance regression in which the inode
would continuously get queued to the dirty list, causing the writeback
code to unnecessarily try to send a COMMIT.

Signed-off-by: Trond Myklebust <Trond.Myklebust>
Tested-by: Simon Kirby <sim@hostway.ca>
Cc: stable@kernel.org [2.6.35+]

18 Oct, 2011 · 1 commit

Linux 3.1-rc10 · 899e3ee4

Authored by Linus Torvalds on Oct 17, 2011

17 Oct, 2011 · 2 commits

Avoid using variable-length arrays in kernel/sys.c · a84a79e4

Authored by Linus Torvalds on Oct 17, 2011

The size is always valid, but variable-length arrays generate worse code
for no good reason (unless the function happens to be inlined and the
compiler sees the length for the simple constant it is).

Also, there seems to be some code generation problem on POWER, where
Henrik Bakken reports that register r28 can get corrupted under some
subtle circumstances (interrupt happening at the wrong time?).  That all
indicates some seriously broken compiler issues, but since variable
length arrays are bad regardless, there's little point in trying to
chase it down.

"Just don't do that, then".
Reported-by: Henrik Grindal Bakken <henribak@cisco.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
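The replacement of a variable-length array by a fixed-size buffer can be sketched as follows. This is an illustration, not the code from kernel/sys.c: `DEMO_BUF_MAX` and `demo_copy_bounded` are hypothetical, and the caller is assumed to pass a destination of at least `DEMO_BUF_MAX` bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_MAX 65   /* hypothetical upper bound on the string size */

/*
 * Copy at most len bytes of src into dst through a fixed-size scratch
 * buffer. The variable-length form would be "char tmp[len + 1]"; with a
 * constant bound the stack frame size is known at compile time.
 */
size_t demo_copy_bounded(char *dst, const char *src, size_t len)
{
	char tmp[DEMO_BUF_MAX];
	size_t n;

	assert(dst != NULL && src != NULL);
	if (len >= sizeof(tmp))
		len = sizeof(tmp) - 1;  /* clamp instead of sizing the array */

	for (n = 0; n < len && src[n] != '\0'; n++)
		tmp[n] = src[n];
	tmp[n] = '\0';

	memcpy(dst, tmp, n + 1);
	return n;
}
```

The clamp is what makes the fixed bound safe when the requested length is larger than the buffer; the commit notes that in its case the size is always valid.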

Merge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm · 8bc03e8f

Authored by Linus Torvalds on Oct 16, 2011

* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
  ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS
  ARM: 7122/1: localtimer: add header linux/errno.h explicitly
  ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9
  ARM: 7113/1: mm: Align bank start to MAX_ORDER_NR_PAGES


15 Oct, 2011 · 4 commits

ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS · f8be12d1

Authored by Zoltan Devai on Oct 10, 2011

This is unneeded and causes an abort on the SPMP8000 platform.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Zoltan Devai <zoss@devai.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7122/1: localtimer: add header linux/errno.h explicitly · bb1ac3ec

Authored by Shawn Guo on Oct 06, 2011

Per the text in Documentation/SubmitChecklist as below, we should
explicitly have header linux/errno.h in localtimer.h for ENXIO
reference.

1: If you use a facility then #include the file that defines/declares
   that facility.  Don't depend on other header files pulling in ones
   that you use.

Otherwise, we may run into some compiling error like the following one,
if any file includes localtimer.h without CONFIG_LOCAL_TIMERS defined.

  arch/arm/include/asm/localtimer.h: In function ‘local_timer_setup’:
  arch/arm/include/asm/localtimer.h:53:10: error: ‘ENXIO’ undeclared (first use in this function)
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9 · 29a541f6

Authored by Will Deacon on Oct 03, 2011

Using COHERENT_LINE_{MISS,HIT} for cache misses and references
respectively is completely wrong. Instead, use the L1D events which
are a better and more useful approximation despite ignoring instruction
traffic.
Reported-by: Alasdair Grant <alasdair.grant@arm.com>
Reported-by: Matt Horsnell <matt.horsnell@arm.com>
Reported-by: Michael Williams <michael.williams@arm.com>
Cc: stable@kernel.org
Cc: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Merge branch 'hwmon-for-linus' of... · 4c41042d

Authored by Linus Torvalds on Oct 15, 2011

Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (w83627ehf) Properly report thermal diode sensors


14 Oct, 2011 · 8 commits

Merge branch 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6 · e9308cfd

Authored by Linus Torvalds on Oct 14, 2011

* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6:
  gpio-pca953x: fix gpio_base
  gpio/omap: fix build error with certain OMAP1 configs


Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · 48008296

Authored by Linus Torvalds on Oct 14, 2011

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: revert to using a kthread for AIL pushing
  xfs: force the log if we encounter pinned buffers in .iop_pushbuf
  xfs: do not update xa_last_pushed_lsn for locked items


Merge branch 'stable' of git://github.com/cmetcalf-tilera/linux-tile · 95bc156c

Authored by Linus Torvalds on Oct 14, 2011

* 'stable' of git://github.com/cmetcalf-tilera/linux-tile:
  tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files


Merge branch 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip · 2ad53110

Authored by Linus Torvalds on Oct 14, 2011

* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  x86: Default to vsyscall=native for now

x86, mrst: use a temporary variable for SFI irq · 153b19a3

Authored by Mika Westerberg on Oct 13, 2011

SFI tables reside in RAM and should not be modified once they are
written.  The current code sets pentry->irq to zero, which causes
subsequent reads to fail with an invalid SFI table checksum.  This
breaks kexec, as the second kernel fails to validate the SFI tables.

To fix this, use a temporary variable for the irq number.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
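The idea of adjusting a value in a local variable instead of writing it back into a checksummed table can be sketched like this. The entry layout, the `0xff` marker and the byte-sum checksum are all hypothetical stand-ins chosen for the example; the real SFI structures are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical firmware table entry. */
struct demo_entry {
	uint8_t type;
	uint8_t irq;
};

/* Byte-wise sum over the table, standing in for a table checksum. */
uint8_t demo_checksum(const struct demo_entry *tab, size_t n)
{
	const uint8_t *p = (const uint8_t *)tab;
	uint8_t sum = 0;
	size_t i;

	for (i = 0; i < n * sizeof(*tab); i++)
		sum = (uint8_t)(sum + p[i]);
	return sum;
}

/*
 * Work out the irq to use for an entry without writing to the table:
 * the adjusted value lives only in a local variable.
 */
int demo_irq_for(const struct demo_entry *e)
{
	int irq;

	assert(e != NULL);
	irq = e->irq;
	if (irq == 0xff)        /* hypothetical "no irq" marker */
		irq = 0;
	return irq;
}
```

Taking the entry as a `const` pointer lets the compiler reject any accidental write, so a later checksum pass over the same table still sees the original bytes.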

hwmon: (w83627ehf) Properly report thermal diode sensors · bf164c58

Authored by Jean Delvare on Oct 13, 2011

The w83627ehf driver is improperly reporting thermal diode sensors as
type 2, instead of 3. This caused "sensors" and possibly other
monitoring tools to report these sensors as "transistor" instead of
"thermal diode".

Furthermore, diode subtype selection (CPU vs. external) is only
supported by the original W83627EHF/EHG. All later models only support
CPU diode type, and some (NCT6776F) don't even have the register in
question so we should avoid reading from it.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>

gpio-pca953x: fix gpio_base · 25fcf2b7

Authored by Hartmut Knaack on Oct 11, 2011

gpio_base was set to 0 if no system platform data or open firmware
platform data was provided. This led to conflicts, if any other gpiochip
with a gpiobase of 0 was instantiated already. Setting it to -1 will
automatically use the first one available.
Signed-off-by: Hartmut Knaack <knaack.h@gmx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>

gpio/omap: fix build error with certain OMAP1 configs · 78a43158

Authored by Janusz Krzysztofik on Aug 23, 2011

With commit f64ad1a0, "gpio/omap: cleanup _set_gpio_wakeup(), remove
ifdefs", access to build time conditionally omitted 'suspend_wakeup'
member of the 'gpio_bank' structure has been placed unconditionally in
function _set_gpio_wakeup(), which is always built. This resulted in the
driver compilation broken for certain OMAP1, i.e., non-OMAP16xx,
configurations.

Really required or not in previously excluded cases, define this
structure member unconditionally as a fix.

Tested with a custom OMAP1510 only configuration.
Signed-off-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>

13 Oct, 2011 · 4 commits

tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files · d52104b2

Authored by Chris Metcalf on Oct 05, 2011

The 32-bit TILEPro support uses some #defines in <asm/atomic_32.h>
for atomic support routines in assembly.  To make this more explicit,
I've turned those includes into includes of <asm/atomic_32.h>, which
should hopefully make it clear that they shouldn't be bombed into
<linux/atomic.h> in any cleanups.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 37cf9516

Authored by Linus Torvalds on Oct 13, 2011

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  mscan: too much data copied to CAN frame due to 16 bit accesses
  gro: refetch inet6_protos[] after pulling ext headers
  bnx2x: fix cl_id allocation for non-eth clients for NPAR mode
  mlx4_en: fix endianness with blue frame support


ide: Fix file references in drivers/ide/ · 1d113601

Authored by Johann Felix Soden on Oct 10, 2011

Fix file references in drivers/ide/

There are a lot of file references to now moved or deleted files in the
whole tree, especially in documentation and Kconfig files.  This patch
fixes the references in drivers/ide/.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'btrfs-3.0' of git://github.com/chrismason/linux · b2f9452b

Authored by Linus Torvalds on Oct 13, 2011

* 'btrfs-3.0' of git://github.com/chrismason/linux:
  Btrfs: make sure not to defrag extents past i_size
  Btrfs: fix recursive auto-defrag


12 Oct, 2011 · 3 commits

xfs: revert to using a kthread for AIL pushing · 0030807c

Authored by Christoph Hellwig on Oct 11, 2011

Currently we have a few issues with the way the workqueue code is used to
implement AIL pushing:

 - it accidentally uses the same workqueue as the syncer action, and thus
   can be prevented from running if there are enough sync actions active
   in the system.
 - it doesn't use the HIGHPRI flag to queue at the head of the queue of
   work items

At this point I'm not confident enough in getting all the workqueue flags and
tweaks right to provide a perfectly reliable execution context for AIL
pushing, which is the most important piece in XFS to make forward progress
when the log fills.

Revert back to use a kthread per filesystem which fixes all the above issues
at the cost of having a task struct and stack around for each mounted
filesystem.  In addition this also gives us much better ways to diagnose
any issues involving hung AIL pushing and removes a small amount of code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: force the log if we encounter pinned buffers in .iop_pushbuf · 17b38471

Authored by Christoph Hellwig on Oct 11, 2011

We need to check for pinned buffers even in .iop_pushbuf, given that inode
items flush into the same buffers that may be pinned directly due to
operations on the unlinked inode list operating directly on buffers.  To do
this, add a return value to .iop_pushbuf that tells the AIL push about this
and use the existing log force mechanisms to unpin it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: do not update xa_last_pushed_lsn for locked items · bc6e588a

Authored by Christoph Hellwig on Oct 11, 2011

If an item was locked we should not update xa_last_pushed_lsn and thus skip
it when restarting the AIL scan as we need to be able to lock and write it
out as soon as possible.  Otherwise heavy lock contention might starve AIL
pushing too easily, especially given the larger backoff once we moved
xa_last_pushed_lsn all the way to the target lsn.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

11 Oct, 2011 · 3 commits

Btrfs: make sure not to defrag extents past i_size · f7f43cc8

Authored by Chris Mason on Oct 11, 2011

The btrfs file defrag code will loop through the extents and
force COW on them.  But if there is a concurrent truncate in the middle
of the defrag, it might end up defragging the same range over and over
again.

The problem is that writepage won't go through and do anything on pages
past i_size, so the cow won't happen, so the file will appear to still
be fragmented.  defrag will end up hitting the same extents again and
again.

In the worst case, the truncate can actually live lock with the defrag
because the defrag keeps creating new ordered extents which the truncate
code keeps waiting on.

The fix here is to make defrag check for i_size inside the main loop,
instead of just once before the looping starts.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
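The difference between checking the size once and checking it on every iteration can be shown with a toy model. This is only a sketch of the loop shape: `struct demo_file` is hypothetical, sizes are counted in pages, and the concurrent truncate is simulated inside the loop rather than by a second thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical file whose size may shrink while the loop is running. */
struct demo_file {
	size_t i_size;      /* current size in pages */
	size_t shrink_at;   /* simulate a truncate after this many steps */
	size_t shrink_to;   /* size in pages after the simulated truncate */
};

/* Re-reads i_size on every iteration; returns the number of pages visited. */
size_t demo_defrag(struct demo_file *f)
{
	size_t pos = 0, visited = 0;

	assert(f != NULL);
	while (pos < f->i_size) {               /* size checked inside the loop */
		visited++;
		pos++;
		if (visited == f->shrink_at)    /* simulated concurrent truncate */
			f->i_size = f->shrink_to;
	}
	return visited;
}
```

A version that cached `i_size` before the loop would keep walking all ten pages in the first test case below, which is the repeated work on ranges past the end of the file that the message describes.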

x86: Default to vsyscall=native for now · 2b666859

Authored by Adrian Bunk on Oct 06, 2011

This UML breakage:

linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790

Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add
vsyscall= parameter") - the vsyscall emulation code is not fully cooked
yet as UML relies on some rather fragile SIGSEGV semantics.

Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default
to vsyscall=native for now, this patch implements that.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Andrew Lutomirski <luto@mit.edu>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20111005214047.GE14406@localhost.pp.htv.fi
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Btrfs: fix recursive auto-defrag · 2a0f7f57

Authored by Li Zefan on Oct 10, 2011

Follow these steps:

  # mount -o autodefrag /dev/sda7 /mnt
  # dd if=/dev/urandom of=/mnt/tmp bs=200K count=1
  # sync
  # dd if=/dev/urandom of=/mnt/tmp bs=8K count=1 conv=notrunc

and then it'll go into a loop: writeback -> defrag -> writeback ...

It's because writeback writes [8K, 200K] and then writes [0, 8K].

I tried to make writeback know if the pages are dirtied by defrag,
but the patch was a bit intrusive. Here I simply set writeback_index
when we defrag a file.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>