1. 03 February 2010 (1 commit)
    • cfq-iosched: Do not idle on async queues · 1efe8fe1
      Authored by Vivek Goyal
      A few weeks back, Shaohua Li posted a similar patch. I am reposting it
      with more test results.
      
      This patch does two things.
      
      - Do not idle on async queues.
      
      - It also changes the WRITE queue depth that CFQ drives (cfq_may_dispatch()).
        Currently we seem to always be driving a queue depth of 1 for WRITES. This is
        true even if there is only one write queue in the system; the logic that allows
        an unbounded queue depth for a single busy queue, as well as the logic that
        slowly increases queue depth based on the last delayed sync request, does not
        seem to be kicking in at all.
      
      This patch will allow deeper WRITE queue depths (subject to the other
      WRITE queue depth constraints like cfq_quantum and the last delayed sync
      request).
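
      As a rough, self-contained illustration of the two decisions described above
      (this is not the actual cfq-iosched diff; the helper names, struct and the
      MODEL_QUANTUM constant below are invented for the sketch), a small user-space
      C model:

          /* Illustrative model only -- not the kernel patch itself. */
          #include <stdbool.h>
          #include <stdio.h>

          #define MODEL_QUANTUM 4        /* stand-in for cfq_quantum */

          struct model_queue {
                  bool sync;             /* sync (reads) vs async (buffered writes) */
                  int  in_flight;        /* requests currently dispatched */
          };

          /* Change 1: only sync queues are worth idling on. */
          static bool should_idle(const struct model_queue *q)
          {
                  return q->sync;
          }

          /* Change 2: async queues may dispatch up to the quantum, not just 1. */
          static bool may_dispatch(const struct model_queue *q)
          {
                  if (q->sync)
                          return true;   /* sync depth is governed elsewhere */
                  return q->in_flight < MODEL_QUANTUM;
          }

          int main(void)
          {
                  struct model_queue async_q = { .sync = false, .in_flight = 0 };

                  printf("idle on async queue? %s\n",
                         should_idle(&async_q) ? "yes" : "no");

                  while (may_dispatch(&async_q))
                          async_q.in_flight++;

                  printf("async depth reached: %d (cap %d)\n",
                         async_q.in_flight, MODEL_QUANTUM);
                  return 0;
          }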
      
      Shaohua Li had reported getting more throughput out of his SSD. In my case, I
      have one LUN exported from an HP EVA, and with pure buffered writes running I
      can get more out of the system. Following are test results of pure buffered
      writes (with end_fsync=1) with vanilla and patched kernels. These results are
      the average of 3 sets of runs with an increasing number of threads.
      
      AVERAGE[bufwfs][vanilla]
      -------
      job       Set NR  ReadBW(KB/s)   MaxClat(us)    WriteBW(KB/s)  MaxClat(us)
      ---       --- --  ------------   -----------    -------------  -----------
      bufwfs    3   1   0              0              95349          474141
      bufwfs    3   2   0              0              100282         806926
      bufwfs    3   4   0              0              109989         2.7301e+06
      bufwfs    3   8   0              0              116642         3762231
      bufwfs    3   16  0              0              118230         6902970
      
      AVERAGE[bufwfs] [patched kernel]
      -------
      job       Set NR  ReadBW(KB/s)   MaxClat(us)    WriteBW(KB/s)  MaxClat(us)
      ---       --- --  ------------   -----------    -------------  -----------
      bufwfs    3   1   0              0              270722         404352
      bufwfs    3   2   0              0              206770         1.06552e+06
      bufwfs    3   4   0              0              195277         1.62283e+06
      bufwfs    3   8   0              0              260960         2.62979e+06
      bufwfs    3   16  0              0              299260         1.70731e+06
      
      I also ran buffered writes along with some sequential reads and some
      buffered reads on a SATA disk, because the potential risk is that driving a
      higher WRITE queue depth in the presence of sync IO could hurt the goal of
      keeping max clat low.
      
      With some random and sequential reads going on on one SATA disk, I did not
      see any significant increase in max clat, so it looks like the other WRITE
      queue depth control logic is doing its job. Here are the results.
      
      AVERAGE[brr, bsr, bufw together] [vanilla]
      -------
      job       Set NR  ReadBW(KB/s)   MaxClat(us)    WriteBW(KB/s)  MaxClat(us)
      ---       --- --  ------------   -----------    -------------  -----------
      brr       3   1   850            546345         0              0
      bsr       3   1   14650          729543         0              0
      bufw      3   1   0              0              23908          8274517
      
      brr       3   2   981.333        579395         0              0
      bsr       3   2   14149.7        1175689        0              0
      bufw      3   2   0              0              21921          1.28108e+07
      
      brr       3   4   898.333        1.75527e+06    0              0
      bsr       3   4   12230.7        1.40072e+06    0              0
      bufw      3   4   0              0              19722.3        2.4901e+07
      
      brr       3   8   900            3160594        0              0
      bsr       3   8   9282.33        1.91314e+06    0              0
      bufw      3   8   0              0              18789.3        23890622
      
      AVERAGE[brr, bsr, bufw mixed] [patched kernel]
      -------
      job       Set NR  ReadBW(KB/s)   MaxClat(us)    WriteBW(KB/s)  MaxClat(us)
      ---       --- --  ------------   -----------    -------------  -----------
      brr       3   1   837            417973         0              0
      bsr       3   1   14357.7        591275         0              0
      bufw      3   1   0              0              24869.7        8910662
      
      brr       3   2   1038.33        543434         0              0
      bsr       3   2   13351.3        1205858        0              0
      bufw      3   2   0              0              18626.3        13280370
      
      brr       3   4   913            1.86861e+06    0              0
      bsr       3   4   12652.3        1430974        0              0
      bufw      3   4   0              0              15343.3        2.81305e+07
      
      brr       3   8   890            2.92695e+06    0              0
      bsr       3   8   9635.33        1.90244e+06    0              0
      bufw      3   8   0              0              17200.3        24424392
      
      So it looks like it makes sense to include this patch.
      
      Thanks
      Vivek
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  2. 01 February 2010 (1 commit)
    • blk-cgroup: Fix potential deadlock in blk-cgroup · bcf4dd43
      Authored by Gui Jianfeng
      I triggered the following lockdep warning.
      
      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.33-rc2 #1
      -------------------------------------------------------
      test_io_control/7357 is trying to acquire lock:
       (blkio_list_lock){+.+...}, at: [<c053a990>] blkiocg_weight_write+0x82/0x9e
      
      but task is already holding lock:
       (&(&blkcg->lock)->rlock){......}, at: [<c053a949>] blkiocg_weight_write+0x3b/0x9e
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #2 (&(&blkcg->lock)->rlock){......}:
             [<c04583b7>] validate_chain+0x8bc/0xb9c
             [<c0458dba>] __lock_acquire+0x723/0x789
             [<c0458eb0>] lock_acquire+0x90/0xa7
             [<c0692b0a>] _raw_spin_lock_irqsave+0x27/0x5a
             [<c053a4e1>] blkiocg_add_blkio_group+0x1a/0x6d
             [<c053cac7>] cfq_get_queue+0x225/0x3de
             [<c053eec2>] cfq_set_request+0x217/0x42d
             [<c052c8a6>] elv_set_request+0x17/0x26
             [<c0532a0f>] get_request+0x203/0x2c5
             [<c0532ae9>] get_request_wait+0x18/0x10e
             [<c0533470>] __make_request+0x2ba/0x375
             [<c0531985>] generic_make_request+0x28d/0x30f
             [<c0532da7>] submit_bio+0x8a/0x8f
             [<c04d827a>] submit_bh+0xf0/0x10f
             [<c04d91d2>] ll_rw_block+0xc0/0xf9
             [<f86e9705>] ext3_find_entry+0x319/0x544 [ext3]
             [<f86eae58>] ext3_lookup+0x2c/0xb9 [ext3]
             [<c04c3e1b>] do_lookup+0xd3/0x172
             [<c04c56c8>] link_path_walk+0x5fb/0x95c
             [<c04c5a65>] path_walk+0x3c/0x81
             [<c04c5b63>] do_path_lookup+0x21/0x8a
             [<c04c66cc>] do_filp_open+0xf0/0x978
             [<c04c0c7e>] open_exec+0x1b/0xb7
             [<c04c1436>] do_execve+0xbb/0x266
             [<c04081a9>] sys_execve+0x24/0x4a
             [<c04028a2>] ptregs_execve+0x12/0x18
      
      -> #1 (&(&q->__queue_lock)->rlock){..-.-.}:
             [<c04583b7>] validate_chain+0x8bc/0xb9c
             [<c0458dba>] __lock_acquire+0x723/0x789
             [<c0458eb0>] lock_acquire+0x90/0xa7
             [<c0692b0a>] _raw_spin_lock_irqsave+0x27/0x5a
             [<c053dd2a>] cfq_unlink_blkio_group+0x17/0x41
             [<c053a6eb>] blkiocg_destroy+0x72/0xc7
             [<c0467df0>] cgroup_diput+0x4a/0xb2
             [<c04ca473>] dentry_iput+0x93/0xb7
             [<c04ca4b3>] d_kill+0x1c/0x36
             [<c04cb5c5>] dput+0xf5/0xfe
             [<c04c6084>] do_rmdir+0x95/0xbe
             [<c04c60ec>] sys_rmdir+0x10/0x12
             [<c04027cc>] sysenter_do_call+0x12/0x32
      
      -> #0 (blkio_list_lock){+.+...}:
             [<c0458117>] validate_chain+0x61c/0xb9c
             [<c0458dba>] __lock_acquire+0x723/0x789
             [<c0458eb0>] lock_acquire+0x90/0xa7
             [<c06929fd>] _raw_spin_lock+0x1e/0x4e
             [<c053a990>] blkiocg_weight_write+0x82/0x9e
             [<c0467f1e>] cgroup_file_write+0xc6/0x1c0
             [<c04bd2f3>] vfs_write+0x8c/0x116
             [<c04bd7c6>] sys_write+0x3b/0x60
             [<c04027cc>] sysenter_do_call+0x12/0x32
      
      other info that might help us debug this:
      
      1 lock held by test_io_control/7357:
       #0:  (&(&blkcg->lock)->rlock){......}, at: [<c053a949>] blkiocg_weight_write+0x3b/0x9e
      stack backtrace:
      Pid: 7357, comm: test_io_control Not tainted 2.6.33-rc2 #1
      Call Trace:
       [<c045754f>] print_circular_bug+0x91/0x9d
       [<c0458117>] validate_chain+0x61c/0xb9c
       [<c0458dba>] __lock_acquire+0x723/0x789
       [<c0458eb0>] lock_acquire+0x90/0xa7
       [<c053a990>] ? blkiocg_weight_write+0x82/0x9e
       [<c06929fd>] _raw_spin_lock+0x1e/0x4e
       [<c053a990>] ? blkiocg_weight_write+0x82/0x9e
       [<c053a990>] blkiocg_weight_write+0x82/0x9e
       [<c0467f1e>] cgroup_file_write+0xc6/0x1c0
       [<c0454df5>] ? trace_hardirqs_off+0xb/0xd
       [<c044d93a>] ? cpu_clock+0x2e/0x44
       [<c050e6ec>] ? security_file_permission+0xf/0x11
       [<c04bcdda>] ? rw_verify_area+0x8a/0xad
       [<c0467e58>] ? cgroup_file_write+0x0/0x1c0
       [<c04bd2f3>] vfs_write+0x8c/0x116
       [<c04bd7c6>] sys_write+0x3b/0x60
       [<c04027cc>] sysenter_do_call+0x12/0x32
      
      To prevent the deadlock, we should take the locks in the following order:
      
      blkio_list_lock -> queue_lock -> blkcg_lock
      
      The following patch fixes this bug.
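
      As a rough, user-space sketch of that ordering rule (a pthreads model of the
      idea, not the kernel patch; the function names below are invented for the
      illustration), every path that needs more than one of these locks acquires
      them in the same global order:

          /* Illustrative user-space model of the lock ordering -- not the kernel fix. */
          #include <pthread.h>
          #include <stdio.h>

          /* Agreed global order: blkio_list_lock -> queue_lock -> blkcg_lock. */
          static pthread_mutex_t blkio_list_lock = PTHREAD_MUTEX_INITIALIZER;
          static pthread_mutex_t queue_lock      = PTHREAD_MUTEX_INITIALIZER;
          static pthread_mutex_t blkcg_lock      = PTHREAD_MUTEX_INITIALIZER;

          /* Weight-update style path: take blkio_list_lock before blkcg_lock,
           * instead of the inverted order that lockdep complained about. */
          static void weight_write(unsigned int weight)
          {
                  pthread_mutex_lock(&blkio_list_lock);
                  pthread_mutex_lock(&blkcg_lock);
                  printf("weight updated to %u\n", weight);
                  pthread_mutex_unlock(&blkcg_lock);
                  pthread_mutex_unlock(&blkio_list_lock);
          }

          /* A path that needs all three locks also follows the global order. */
          static void teardown_path(void)
          {
                  pthread_mutex_lock(&blkio_list_lock);
                  pthread_mutex_lock(&queue_lock);
                  pthread_mutex_lock(&blkcg_lock);
                  printf("teardown done under consistent ordering\n");
                  pthread_mutex_unlock(&blkcg_lock);
                  pthread_mutex_unlock(&queue_lock);
                  pthread_mutex_unlock(&blkio_list_lock);
          }

          int main(void)
          {
                  weight_write(500);
                  teardown_path();
                  return 0;
          }
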
      Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  3. 31 January 2010 (1 commit)
  4. 28 January 2010 (1 commit)
  5. 26 January 2010 (2 commits)
  6. 25 January 2010 (3 commits)
  7. 24 January 2010 (2 commits)
  8. 22 January 2010 (6 commits)
  9. 21 January 2010 (23 commits)