Commits · 44fb5511638938a2c37c895abc14df648ffc07e9 · openanolis / cloud-kernel

05 Jun, 2009 1 commit

Btrfs: Fix oops and use after free during space balancing · 44fb5511

Committed by Chris Mason on Jun 04, 2009

The btrfs allocator uses list_for_each to walk the available block
groups when searching for free blocks.  It starts off with a hint
to help find the best block group for a given allocation.

The hint is resolved into a block group, but we don't properly check
to make sure the block group we find isn't in the middle of being
freed due to filesystem shrinking or balancing.  If it is being
freed, the list pointers in it are bogus and can't be trusted.  But,
the code happily goes along and uses them in the list_for_each loop,
leading to all kinds of fun.

The fix used here is to check to make sure the block group we find really
is on the list before we use it.  list_del_init is used when removing
it from the list, so we can do a proper check.

The allocation clustering code has a similar bug where it will trust
the block group in the current free space cluster.  If our allocation
flags have changed (going from single spindle dup to raid1 for example)
because the drives in the FS have changed, we're not allowed to use
the old block group any more.

The fix used here is to check the current cluster against the
current allocation flags.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

44fb5511
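The on-list check described in this commit can be sketched in userspace C. This is only an illustration under assumptions: the list helpers below are small re-implementations that borrow the kernel's names, and `struct block_group` and `hint_is_usable()` are hypothetical, not the btrfs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Unlink the entry and point it back at itself, so that a later
 * list_empty() on the entry reports "not on any list". */
static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    INIT_LIST_HEAD(entry);
}

static bool list_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Hypothetical, stripped-down block group. */
struct block_group {
    struct list_head list;
};

/* A hinted group may only seed the list walk while it is still linked. */
static bool hint_is_usable(const struct block_group *bg)
{
    return bg != NULL && !list_empty(&bg->list);
}
```

Because removal re-initialises the entry instead of leaving stale pointers behind, the emptiness test on the entry itself is a reliable "still on the list" check.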

04 Jun, 2009 1 commit

Btrfs: set device->total_disk_bytes when adding new device · 2cc3c559

Committed by Yan Zheng on Jun 04, 2009

It was not being properly initialized, and so the size saved to
disk was not correct.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

2cc3c559

15 May, 2009 6 commits

Btrfs: Spelling fix in btrfs_lookup_first_block_group comments · 9f55684c

Committed by Sankar P on May 14, 2009

Signed-off-by: Sankar P <sankar.curiosity@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

9f55684c

Btrfs: make show_options result match actual option names · 6b65c5c6

Committed by Sage Weil on May 14, 2009

The notreelog and flushoncommit mount options were being printed slightly
differently.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6b65c5c6

Btrfs: remove outdated comment in btrfs_ioctl_resize() · 5d847a8e

Committed by Li Hong on May 14, 2009

In Li Zefan's commit dae7b665,
a combination call of kmalloc() and copy_from_user() is replaced by
memdup_user(). So btrfs_ioctl_resize() doesn't use GFP_NOFS any more.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

5d847a8e

Btrfs: remove some WARN_ONs in the IO failure path · cc7b0c9b

Committed by Chris Mason on May 14, 2009

These debugging WARN_ONs make too much console noise during regular
IO failures. An IO failure will still generate a number of messages
as we verify checksums etc, but these two are not needed.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

cc7b0c9b

Btrfs: Don't loop forever on metadata IO failures · 76a05b35

Committed by Chris Mason on May 14, 2009

When a btrfs metadata read fails, the first thing we try to do is find
a good copy on another mirror of the block.  If this fails, read_tree_block()
ends up returning a buffer that isn't up to date.

The btrfs btree reading code was reworked to drop locks and repeat
the search when IO was done, but the changes didn't add a check for failed
reads.  The end result was looping forever on buffers that were never
going to become up to date.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

76a05b35

Btrfs: init inode ordered_data_close flag properly · 2757495c

Committed by Chris Mason on May 14, 2009

This flag is used to decide when we need to send a given file through
the ordered code to make sure it is fully written before a transaction
commits.  It was not being properly set to zero when the inode was
being set up.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

2757495c

28 Apr, 2009 2 commits

Btrfs: look for acls during btrfs_read_locked_inode · 46a53cca

Committed by Chris Mason on Apr 27, 2009

This changes btrfs_read_locked_inode() to peek ahead in the btree for acl items.
If it is certain a given inode has no acls, it will set the in memory acl
fields to null to avoid acl lookups completely.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

46a53cca

Btrfs: fix acl caching · 7b1a14bb

Committed by Chris Mason on Apr 27, 2009

Linus noticed the btrfs code to cache acls wasn't properly caching
a NULL acl when the inode didn't have any acls.  This meant the common
case of no acls resulted in expensive btree searches every time the
kernel checked permissions (which is quite often).

This is a modified version of Linus' original patch:

Properly set initial acl fields to BTRFS_ACL_NOT_CACHED in the inode.
This forces an acl lookup when permission checks are done.

Fix btrfs_get_acl to avoid lookups and locking when the inode acls fields
are set to null.

Fix btrfs_get_acl to use the right return value from __btrfs_getxattr
when deciding to cache a NULL acl.  It was storing a NULL acl when
__btrfs_getxattr returned -ENOENT, but __btrfs_getxattr was actually returning
-ENODATA for this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7b1a14bb
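The caching rule in this commit, where "not looked up yet" is kept distinct from "looked up, no acl", can be sketched as follows. Everything here is a hypothetical userspace model: the sentinel is represented by the address of a static object, and `lookup_acl_xattr()` merely simulates a lookup that finds no acl and reports `-ENODATA`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct acl {
    int unused;
};

/* Sentinel meaning "never looked up". NULL is reserved for
 * "looked up, and this inode has no acl". */
static struct acl acl_not_cached_marker;
#define ACL_NOT_CACHED (&acl_not_cached_marker)

struct inode_info {
    struct acl *cached_acl;
    int lookups;            /* counts simulated btree searches */
};

/* Hypothetical xattr lookup: the inode has no acl, reported as -ENODATA. */
static int lookup_acl_xattr(struct inode_info *ino, struct acl **out)
{
    ino->lookups++;
    *out = NULL;
    return -ENODATA;
}

static struct acl *get_acl(struct inode_info *ino)
{
    struct acl *acl = NULL;
    int ret;

    assert(ino != NULL);
    if (ino->cached_acl != ACL_NOT_CACHED)
        return ino->cached_acl;     /* may be NULL: cached "no acl" */

    ret = lookup_acl_xattr(ino, &acl);
    if (ret == 0 || ret == -ENODATA)
        ino->cached_acl = acl;      /* cache the negative result too */
    return acl;
}
```

Testing for the wrong error code here would leave `cached_acl` at the sentinel, so every permission check would repeat the search, which is the behaviour the commit message describes.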

27 Apr, 2009 6 commits

Btrfs: Fix a bunch of printk() warnings. · 21380931

Committed by Joel Becker on Apr 21, 2009

Just happened to notice a bunch of %llu vs u64 warnings.  Here's a patch
to cast them all.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

21380931
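The kind of cast this commit refers to can be shown with a userspace analogue, using `snprintf` in place of `printk`. The `u64` typedef and `format_bytenr()` are illustrative assumptions, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t u64;

/* On some 64-bit ABIs a 64-bit integer type is "unsigned long", which
 * does not match %llu. Casting to unsigned long long keeps the format
 * string and the argument in agreement on every platform. */
static int format_bytenr(char *buf, size_t len, u64 bytenr)
{
    assert(buf != NULL);
    return snprintf(buf, len, "bad block %llu", (unsigned long long)bytenr);
}
```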

Btrfs: Fix a trivial warning using max() of u64 vs ULL. · e63b6a6c

Committed by Joel Becker on Apr 21, 2009

A small warning popped up on ia64 because inode-map.c was comparing a
u64 object id with the ULL FIRST_FREE_OBJECTID.  My first thought was
that all the OBJECTID constants should contain the u64 cast because
btrfs code deals entirely in u64s.  But then I saw how large that was,
and figured I'd just fix the max() call.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e63b6a6c

Btrfs: remove unused btrfs_bit_radix slab · 45c06543

Committed by Chris Mason on Apr 27, 2009

Signed-off-by: Chris Mason <chris.mason@oracle.com>

45c06543

Btrfs: ratelimit IO error printks · 193f284d

Committed by Chris Mason on Apr 27, 2009

Btrfs has printks for various IO errors, including bad checksums and
mismatches between what we expect the block headers to contain and what
we actually find on the disk.

Longer term we need a real reporting mechanism for this, but for now
printk is going to have to do.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

193f284d

Btrfs: remove #if 0 code · b7967db7

Committed by Chris Mason on Apr 27, 2009

Btrfs had some old code sitting around under #if 0, this drops it.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b7967db7

Btrfs: When shrinking, only update disk size on success · d6397bae

Committed by Chris Ball on Apr 27, 2009

Previously, we updated a device's size prior to attempting a shrink
operation. This patch moves the device resizing logic to only happen if
the shrink completes successfully. In the process, it introduces a new
field to btrfs_device -- disk_total_bytes -- to track the on-disk size.
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d6397bae

25 Apr, 2009 6 commits

Btrfs: fix deadlocks and stalls on dead root removal · 59bc5c75

Committed by Chris Mason on Apr 24, 2009

After a transaction commit, the old root of the subvol btrees are sent through
snapshot removal. This is what actually frees up any blocks replaced by
COW, and anything the old blocks pointed to.

Snapshot deletion will pause when a transaction commit has started, which
helps to avoid a huge amount of delayed reference count updates piling up
as the transaction is trying to close.

But, this pause happens after the snapshot deletion process has asked other
procs on the system to throttle back a bit so that it can make progress.

We don't want to throttle everyone while we're waiting for the transaction
commit, it leads to deadlocks in the user transaction ioctls used by Ceph
and makes things slower in general.

This patch changes things to avoid the throttling while we sleep.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

59bc5c75

Btrfs: fix fallocate deadlock on inode extent lock · e980b50c

Committed by Chris Mason on Apr 24, 2009

The btrfs fallocate call takes an extent lock on the entire range
being fallocated, and then runs through insert_reserved_extent on each
extent as they are allocated.

The problem with this is that btrfs_drop_extents may decide to try
and take the same extent lock fallocate was already holding.  The solution
used here is to push down knowledge of the range that is already locked
going into btrfs_drop_extents.

It turns out that at least one other caller had the same bug.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e980b50c

Btrfs: kill btrfs_cache_create · 9601e3f6

Committed by Christoph Hellwig on Apr 13, 2009

Just use kmem_cache_create directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

9601e3f6

Btrfs: don't export symbols · 0d4bf11e

Committed by Christoph Hellwig on Apr 13, 2009

Currently the extent_map code is only for btrfs so don't export its
symbols.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

0d4bf11e

Btrfs: simplify makefile · 2ea2544e

Committed by Christoph Hellwig on Apr 13, 2009

Get rid of the hacks for building out of tree, and always use += for
assigning to the object lists.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

2ea2544e

Btrfs: try to keep a healthy ratio of metadata vs data block groups · 97e728d4

Committed by Josef Bacik on Apr 21, 2009

This patch makes the chunk allocator keep a good ratio of metadata vs data
block groups. By default for every 8 data block groups, we'll allocate 1
metadata chunk, or about 12% of the disk will be allocated for metadata. This
can be changed by specifying the metadata_ratio mount option.

This is simply the number of data block groups that have to be allocated to
force a metadata chunk allocation. By making sure we allocate metadata chunks
more often, we are less likely to get into situations where the whole disk
has been allocated as data block groups.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

97e728d4
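The ratio rule described above (force one metadata chunk after every N data block groups) can be modelled with a simple counter. This is a sketch only: `struct chunk_alloc_state` and `note_data_chunk()` are hypothetical names, and treating a ratio of 0 as "never force" is an assumption of this example rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct chunk_alloc_state {
    unsigned int metadata_ratio;         /* data chunks per forced metadata chunk */
    unsigned int data_chunks_since_meta; /* data chunks seen since the last force */
};

/* Record one data chunk allocation. Returns true when the caller
 * should force a metadata chunk allocation. */
static bool note_data_chunk(struct chunk_alloc_state *s)
{
    assert(s != NULL);
    if (s->metadata_ratio == 0)          /* assumption: 0 disables forcing */
        return false;
    if (++s->data_chunks_since_meta < s->metadata_ratio)
        return false;
    s->data_chunks_since_meta = 0;
    return true;
}
```

With a ratio of 8, every eighth data chunk triggers a forced metadata chunk, so metadata allocation keeps pace with data allocation instead of being left until the disk is fully allocated.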

22 Apr, 2009 1 commit

Btrfs: fix btrfs fallocate oops and deadlock · 546888da

Committed by Chris Mason on Apr 21, 2009

Btrfs fallocate was incorrectly starting a transaction with a lock held
on the extent_io tree for the file, which could deadlock. Strictly
speaking it was using join_transaction which would be safe, but it is better
to move the transaction outside of the lock.

When preallocated extents are overwritten, btrfs_mark_buffer_dirty was
being called on an unlocked buffer. This was triggering an assertion and
oops because the lock is supposed to be held.

The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had
been run. btrfs_del_item takes care of dirtying things, so the solution is
to skip the btrfs_mark_buffer_dirty call in this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

546888da

21 Apr, 2009 4 commits

Btrfs: use the right node in reada_for_balance · 8c594ea8

Committed by Chris Mason on Apr 20, 2009

reada_for_balance was using the wrong index into the path node array,
so it wasn't reading the right blocks.  We never directly used the
results of the read done by this function because the btree search is
started over at the end.

This fixes reada_for_balance to reada in the correct node and to
avoid searching past the last slot in the node.  It also makes sure to
hold the parent lock while we are finding the nodes to read.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

8c594ea8

Btrfs: fix oops on page->mapping->host during writepage · 11c8349b

Committed by Chris Mason on Apr 20, 2009

The extent_io writepage call updates the writepage index in the inode
as it makes progress.  But, it was doing the update after unlocking the page,
which isn't legal because page->mapping can't be trusted once the page
is unlocked.

This led to an oops, especially common with compression turned on.  The
fix here is to update the writeback index before unlocking the page.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

11c8349b

Btrfs: add a priority queue to the async thread helpers · d313d7a3

Committed by Chris Mason on Apr 20, 2009

Btrfs is using WRITE_SYNC_PLUG to send down synchronous IOs with a
higher priority.  But, the checksumming helper threads prevent it
from being fully effective.

There are two problems.  First, a big queue of pending checksumming
will delay the synchronous IO behind other lower priority writes.  Second,
the checksumming uses an ordered async work queue.  The ordering makes sure
that IOs are sent to the block layer in the same order they are sent
to the checksumming threads.  Usually this gives us less seeky IO.

But, when we start mixing IO priorities, the lower priority IO can delay
the higher priority IO.

This patch solves both problems by adding a high priority list to the async
helper threads, and a new btrfs_set_work_high_prio(), which is used
to put a new async work item onto the higher priority list.

The ordering is still done on high priority IO, but all of the high
priority bios are ordered separately from the low priority bios.  This
ordering is purely an IO optimization, it is not involved in data
or metadata integrity.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d313d7a3
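The two-list scheme from this commit can be illustrated with a small single-threaded model: high priority items are always served first, and each priority class keeps its own FIFO order. All names below (`struct work`, `struct worker_queue`, `set_work_high_prio()`, `queue_work()`, `next_work()`) are hypothetical stand-ins, not the btrfs async-thread API, and the fixed-size ring buffer is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 16

struct work {
    int id;
    bool high_prio;
};

/* Fixed-size FIFO; tail - head is the number of queued items. */
struct fifo {
    struct work *items[QUEUE_CAP];
    size_t head, tail;
};

struct worker_queue {
    struct fifo high;    /* served first */
    struct fifo normal;  /* served only when high is empty */
};

static bool fifo_push(struct fifo *q, struct work *w)
{
    if (q->tail - q->head == QUEUE_CAP)
        return false;                       /* full */
    q->items[q->tail++ % QUEUE_CAP] = w;
    return true;
}

static struct work *fifo_pop(struct fifo *q)
{
    if (q->head == q->tail)
        return NULL;                        /* empty */
    return q->items[q->head++ % QUEUE_CAP];
}

static void set_work_high_prio(struct work *w)
{
    w->high_prio = true;
}

/* Queue onto the list matching the item's priority. */
static bool queue_work(struct worker_queue *wq, struct work *w)
{
    assert(wq != NULL && w != NULL);
    return fifo_push(w->high_prio ? &wq->high : &wq->normal, w);
}

/* High priority work first; FIFO order is kept within each class. */
static struct work *next_work(struct worker_queue *wq)
{
    struct work *w = fifo_pop(&wq->high);
    return w ? w : fifo_pop(&wq->normal);
}
```

Keeping two separate FIFOs preserves submission order inside each class while preventing a backlog of low priority items from delaying high priority ones.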

Btrfs: use WRITE_SYNC for synchronous writes · ffbd517d

Committed by Chris Mason on Apr 20, 2009

Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for
writes we plan on waiting on in the near future.  This patch
mirrors recent changes in other filesystems and the generic code to
use WRITE_SYNC when WB_SYNC_ALL is passed and to use WRITE_SYNC for
other latency critical writes.

Btrfs uses async worker threads for checksumming before the write is done,
and then again to actually submit the bios.  The bio submission code just
runs a per-device list of bios that need to be sent down the pipe.

This list is split into low priority and high priority lists so the
WRITE_SYNC IO happens first.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ffbd517d

15 Apr, 2009 10 commits

Linux 2.6.30-rc2 · 0882e8dd

Committed by Linus Torvalds on Apr 14, 2009

0882e8dd

Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel · b897e6fb

Committed by Linus Torvalds on Apr 14, 2009

* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
  drm/i915: fix scheduling while holding the new active list spinlock
  drm/i915: Allow tiling of objects with bit 17 swizzling by the CPU.
  drm/i915: Correctly set the write flag for get_user_pages in pread.
  drm/i915: Fix use of uninitialized var in 40a5f0de
  drm/i915: indicate framebuffer restore key in SysRq help message
  drm/i915: sync hdmi detection by hdmi identifier with 2D
  drm/i915: Fix a mismerge of the IGD patch (new .find_pll hooks missed)
  drm/i915: Implement batch and ring buffer dumping

b897e6fb

x86 microcode: revert some work_on_cpu · 6f66cbc6

Committed by Hugh Dickins on Apr 14, 2009

Revert part of af5c820a ("x86: cpumask:
use work_on_cpu in arch/x86/kernel/microcode_core.c")

That change is causing only one Intel CPU's microcode to be updated e.g.
microcode: CPU3 updated from revision 0x9 to 0x17, date = 2005-04-22
where before it announced that also for CPU0 and CPU1 and CPU2.

We cannot use work_on_cpu() in the CONFIG_MICROCODE_OLD_INTERFACE code,
because Intel's request_microcode_user() involves a copy_from_user() from
/sbin/microcode_ctl, which therefore needs to be on that CPU at the time.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6f66cbc6

drm/i915: fix scheduling while holding the new active list spinlock · 68c84342

Committed by Shaohua Li on Apr 08, 2009

regression caused by commit 5e118f41:
i915_gem_object_move_to_inactive() should be called in task context,
as it calls fput();

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
[anholt: Add more detail to the comment about the lock break that's added]
Signed-off-by: Eric Anholt <eric@anholt.net>

68c84342

Merge branch 'core-fixes-for-linus' of... · 610f26e7

Committed by Linus Torvalds on Apr 14, 2009

Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: warn about lockdep disabling after kernel taint, fix

610f26e7

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · e9de427e

Committed by Linus Torvalds on Apr 14, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix "direct_io" private mmap
  fuse: fix argument type in fuse_get_user_pages()

e9de427e

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 · 9fc0178c

Committed by Linus Torvalds on Apr 14, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix possible mismatch of sufile counters on recovery
  nilfs2: segment usage file cleanups
  nilfs2: fix wrong accounting and duplicate brelse in nilfs_sufile_set_error
  nilfs2: simplify handling of active state of segments fix
  nilfs2: remove module version
  nilfs2: fix lockdep recursive locking warning on meta data files
  nilfs2: fix lockdep recursive locking warning on bmap
  nilfs2: return f_fsid for statfs2

9fc0178c

Merge branch 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze · 2b6b6d38

Committed by Linus Torvalds on Apr 14, 2009

* 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Add missing FILE tag to MAINTAINERS
  microblaze: remove duplicated #include's
  microblaze: struct device - replace bus_id with dev_name()
  microblaze: Simplify copy_thread()
  microblaze: Add TIMESTAMPING constants to socket.h
  microblaze: Add missing empty ftrace.h file
  microblaze: Fix problem with removing zero length files

2b6b6d38

Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 · 3e862dd5

Committed by Linus Torvalds on Apr 14, 2009

* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: Add in PCI bus for DMA API debugging.
  sh: Pre-allocate a reasonable number of DMA debug entries.
  sh: sh7786: modify usb setup timeout judgment bug.
  MAINTAINERS: Update sh architecture file patterns.
  sh: ap325: use edge control for ov772x camera
  sh: Plug in support for ARCH=sh64 using sh SRCARCH.
  sh: urquell: Fix up address mapping in board comments.
  sh: Add support for DMA API debugging.
  sh: Provide cpumask_of_pcibus() to fix NUMA build.
  sh: urquell: Add board comment
  sh: wire up sys_preadv/sys_pwritev() syscalls.
  sh: sh7785lcr: fix PCI address map for 32-bit mode
  sh: intc: Added resume from hibernation support to the intc

3e862dd5

Fix lpfc_parse_bg_err()'s use of do_div() · 2344b5b6

Committed by David Howells on Apr 14, 2009

Fix lpfc_parse_bg_err()'s use of do_div(). It should be passing a 64-bit
variable as the first parameter. However, since it's only using a 32-bit
variable, it doesn't need to use do_div() at all, but can instead use the
division operator.

This deals with the following warnings:

CC drivers/scsi/lpfc/lpfc_scsi.o
drivers/scsi/lpfc/lpfc_scsi.c: In function 'lpfc_parse_bg_err':
drivers/scsi/lpfc/lpfc_scsi.c:1397: warning: comparison of distinct pointer types lacks a cast
drivers/scsi/lpfc/lpfc_scsi.c:1397: warning: right shift count >= width of type
drivers/scsi/lpfc/lpfc_scsi.c:1397: warning: passing argument 1 of '__div64_32' from incompatible pointer type
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2344b5b6

14 Apr, 2009 3 commits

tty: Update some of the USB kernel doc · 78c5b82e

Committed by Leandro Dorileo on Apr 14, 2009

Updates some usb_serial_port members documentation.
Signed-off-by: Leandro Dorileo <ldorileo@gmail.com>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

78c5b82e

parport_pc: Fix build failure drivers/parport/parport_pc.c for powerpc · 19e05426

Committed by Tony Breeds on Apr 14, 2009

In commit 51dcdfec ("parport: Use the
PCI IRQ if offered") parport_pc_probe_port() gained an irqflags arg.
This isn't being supplied on powerpc. This patch makes powerpc fall back
to the old behaviour, that is using "0" for irqflags.

Fixes build failure:

In file included from drivers/parport/parport_pc.c:68:
arch/powerpc/include/asm/parport.h: In function 'parport_pc_find_nonpci_ports':
arch/powerpc/include/asm/parport.h:32: error: too few arguments to function 'parport_pc_probe_port'
arch/powerpc/include/asm/parport.h:32: error: too few arguments to function 'parport_pc_probe_port'
arch/powerpc/include/asm/parport.h:32: error: too few arguments to function 'parport_pc_probe_port'
make[3]: *** [drivers/parport/parport_pc.o] Error 1
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

19e05426

parport: Fix various uses of parport_pc · 28783eb5

Committed by Alan Cox on Apr 14, 2009

These got overlooked first time around.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

28783eb5
