Commits · 51a8cf9d2d97017d334f33f1b39067bd2f03bc49 · openanolis / cloud-kernel

24 Jul, 2012 23 commits

Btrfs: fix btrfs_is_free_space_inode to recognize btree inode · 51a8cf9d

Committed by Liu Bo on Jul 10, 2012

For btree inode, its root is also 'tree root', so btree inode can be
misunderstood as a free space inode.

We should add one more check for btree inode.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

51a8cf9d
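The extra check described in 51a8cf9d can be sketched in user-space C. This is a minimal model, not the kernel code: `struct root_model`, `struct inode_model`, `is_free_space_inode_model()` and the constant `BTREE_INODE_OBJECTID_MODEL` are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the object id reserved for the btree inode. */
#define BTREE_INODE_OBJECTID_MODEL 1ULL

struct root_model {
    int id;
};

struct inode_model {
    const struct root_model *root; /* root this inode belongs to */
    uint64_t objectid;             /* inode number within that root */
};

/*
 * An inode that lives in the tree root is treated as a free space inode,
 * unless it is the btree inode, which also lives in the tree root.
 */
static bool is_free_space_inode_model(const struct inode_model *inode,
                                      const struct root_model *tree_root)
{
    assert(inode != NULL && tree_root != NULL);
    return inode->root == tree_root &&
           inode->objectid != BTREE_INODE_OBJECTID_MODEL;
}
```

Without the second condition, the btree inode would satisfy the root comparison and be classified as a free space inode, which is the misclassification the commit message describes.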

Btrfs: avoid I/O repair BUG() from btree_read_extent_buffer_pages() · c0901581

Committed by Stefan Behrens on Jul 10, 2012

From btree_read_extent_buffer_pages(), currently repair_io_failure()
can be called with mirror_num being zero when submit_one_bio() returned
an error before. This used to cause a BUG_ON(!mirror_num) in
repair_io_failure() and indeed this is not a case that needs the I/O
repair code to rewrite disk blocks.
This commit prevents calling repair_io_failure() in this case and thus
avoids the BUG_ON() and malfunction.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

c0901581
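The guard described in c0901581 amounts to not entering the repair path when no mirror number is known. A minimal user-space sketch follows; `repair_io_failure_model()` and `maybe_repair()` are hypothetical names, and the `assert()` merely stands in for the kernel's `BUG_ON(!mirror_num)`.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the (modelled) repair path was entered. */
static int repairs_attempted;

/* Model of the repair routine: it must never see mirror number zero. */
static void repair_io_failure_model(int mirror_num)
{
    assert(mirror_num > 0); /* stands in for BUG_ON(!mirror_num) */
    repairs_attempted++;
}

/*
 * Caller-side guard: a failed_mirror of 0 means the bio was never
 * submitted to a specific mirror, so there is nothing to rewrite.
 */
static bool maybe_repair(int failed_mirror)
{
    if (failed_mirror == 0)
        return false;
    repair_io_failure_model(failed_mirror);
    return true;
}
```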

Btrfs: rework shrink_delalloc · f4c738c2

Committed by Josef Bacik on Jul 02, 2012

So shrink_delalloc has grown all sorts of cruft over the years thanks to
many reworkings of how we track enospc. What happens now as we fill up the
disk is we will loop for freaking ever hoping to reclaim an arbitrary amount
of metadata space; this was from when everybody flushed at the same time.
Now we only have people flushing one at a time. So instead of trying to
reclaim a huge amount of space, just try to flush a decent chunk of space,
and stop looping as soon as we have enough free space to satisfy our
reservation. This makes xfstests 224 go much faster. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

f4c738c2

Btrfs: do not set subvolume flags in readonly mode · b9ca0664

Committed by Liu Bo on Jun 29, 2012

$ mkfs.btrfs /dev/sdb7
$ btrfstune -S1 /dev/sdb7
$ mount /dev/sdb7 /mnt/btrfs
mount: block device /dev/sdb7 is write-protected, mounting read-only
$ btrfs dev add /dev/sdb8 /mnt/btrfs/

Now we get a btrfs in which mnt flags has readonly but sb flags does
not.  So for those ioctls that only check sb flags with MS_RDONLY, it
is going to be a problem.
Setting subvolume flags is such an ioctl, so we should use mnt_want_write_file()
to check RO flags.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>

b9ca0664
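The mismatch described in b9ca0664 (mount flags read-only, superblock flags not) can be modelled in a few lines. This is a sketch under stated assumptions: `struct sb_model`, `struct mnt_model`, `sb_only_check()` and `want_write_model()` are hypothetical, and returning `-EROFS` is an assumption of the sketch rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct sb_model {
    bool rdonly; /* models MS_RDONLY in the superblock flags */
};

struct mnt_model {
    bool rdonly; /* models the per-mount read-only flag */
    const struct sb_model *sb;
};

/* Incomplete check: looks only at the superblock flags. */
static int sb_only_check(const struct mnt_model *mnt)
{
    assert(mnt != NULL && mnt->sb != NULL);
    return mnt->sb->rdonly ? -EROFS : 0;
}

/* Mount-aware check: refuses if either level is read-only. */
static int want_write_model(const struct mnt_model *mnt)
{
    assert(mnt != NULL && mnt->sb != NULL);
    if (mnt->rdonly || mnt->sb->rdonly)
        return -EROFS;
    return 0;
}
```

In the scenario from the commit message, `sb_only_check()` would let the ioctl through while `want_write_model()` would reject it.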

Btrfs: use mnt_want_write_file instead of mnt_want_write · e54bfa31

Committed by Liu Bo on Jun 29, 2012

mnt_want_write_file is faster when file has been opened for write.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>

e54bfa31

Btrfs: remove redundant r/o check for superblock · 768e9dfe

Committed by Liu Bo on Jun 29, 2012

mnt_want_write() and mnt_want_write_file() will check sb->s_flags with
MS_RDONLY, and we don't need to do it ourselves.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>

768e9dfe

Btrfs: check write access to mount earlier while creating snapshots · a874a63e

Committed by Liu Bo on Jun 29, 2012

Move check of write access to mount into upper functions so that we can
use mnt_want_write_file instead, which is faster than mnt_want_write.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>

a874a63e

Btrfs: fix typo in cow_file_range_async and async_cow_submit · 287082b0

Committed by Liu Bo on Jun 28, 2012

It should be 10 * 1024 * 1024.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

287082b0

Btrfs: change how we indicate we're adding csums · 0e721106

Committed by Josef Bacik on Jun 26, 2012

There is weird logic I had to put in place to make sure that when we were
adding csums that we'd use the delalloc block rsv instead of the global
block rsv. Part of this meant that we had to free up our transaction
reservation before we ran the delayed refs since csum deletion happens
during the delayed ref work. The problem with this is that when we release
a reservation we will add it to the global reserve if it is not full in
order to keep us going along longer before we have to force a transaction
commit. By releasing our reservation before we run delayed refs we don't
get the opportunity to drain down the global reserve for the work we did, so
we won't refill it as often. This isn't a problem per-se, it just results
in us possibly committing transactions more and more often, and in rare
cases could cause those WARN_ON()'s to pop in use_block_rsv because we ran
out of space in our block rsv.

This also helps us by holding onto space while the delayed refs run so we
don't end up with as many people trying to do things at the same time, which
again will help us not force commits or hit the use_block_rsv warnings.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

0e721106

Btrfs: return error of btrfs_update_inode() to caller · b9959295

Committed by Tsutomu Itoh on Jun 25, 2012

We didn't check error of btrfs_update_inode(), but that error looks
easy to bubble back up.
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

b9959295

Btrfs: fix error handling in __add_reloc_root() · 23291a04

Committed by Dan Carpenter on Jun 25, 2012

We dereferenced "node" in the error message after freeing it.  Also
btrfs_panic() can return so we should return an error code instead of
continuing.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

23291a04
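The two problems named in 23291a04 (reading a freed node for the error message, and continuing after the error) can be illustrated with a user-space sketch. All names here (`struct mapping_node_model`, `add_node_model()`, the `insert_fails` switch) are hypothetical, and `-EEXIST` is an assumed error code; the point is only the ordering: copy what the message needs before freeing, then return.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct mapping_node_model {
    uint64_t bytenr;
};

/*
 * insert_fails simulates finding a duplicate entry. On that path the
 * value needed for the message is saved first, the node is freed, and
 * an error is returned instead of falling through.
 */
static int add_node_model(uint64_t bytenr, int insert_fails,
                          uint64_t *reported)
{
    struct mapping_node_model *node;

    assert(reported != NULL);
    node = malloc(sizeof(*node));
    if (!node)
        return -ENOMEM;
    node->bytenr = bytenr;

    if (insert_fails) {
        uint64_t saved = node->bytenr; /* copy before the node is freed */

        free(node);
        fprintf(stderr, "duplicate node for bytenr %llu\n",
                (unsigned long long)saved);
        *reported = saved;
        return -EEXIST; /* do not continue after reporting */
    }

    free(node); /* model only: a real tree would keep the node */
    return 0;
}
```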

Btrfs: do not ignore errors from btrfs_cleanup_fs_roots() when mounting · 44c44af2

Committed by Ilya Dryomov on Jun 22, 2012

There used to be a BUG_ON(ret) there before EH patch (79787eaa) went in.
Bail out with EINVAL.

Cc: David Sterba <dsterba@suse.cz>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

44c44af2

Btrfs: do not return EINVAL instead of ENOMEM from open_ctree() · fed425c7

Committed by Ilya Dryomov on Jun 22, 2012

When bailing from open_ctree() err is returned, not ret.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

fed425c7
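The pattern behind fed425c7 is easy to reproduce outside the kernel: a function keeps two status variables, stores the failure in one, and returns the other from its exit label. The sketch below is hypothetical (`open_model_buggy()` and `open_model_fixed()` are illustrative names, not the kernel functions) and only shows the shape of the mistake.

```c
#include <assert.h>
#include <errno.h>

/* Buggy shape: the failure is stored in ret, but the exit path returns err. */
static int open_model_buggy(int alloc_fails)
{
    int err = -EINVAL; /* default error for the exit path */
    int ret = 0;

    assert(alloc_fails == 0 || alloc_fails == 1);
    if (alloc_fails) {
        ret = -ENOMEM;
        goto fail;
    }
    return 0;
fail:
    (void)ret;  /* the real cause is silently dropped */
    return err; /* caller sees -EINVAL */
}

/* Fixed shape: the failure is stored in the variable that is returned. */
static int open_model_fixed(int alloc_fails)
{
    int err = -EINVAL;

    assert(alloc_fails == 0 || alloc_fails == 1);
    if (alloc_fails) {
        err = -ENOMEM;
        goto fail;
    }
    return 0;
fail:
    return err;
}
```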

Btrfs: add DEVICE_READY ioctl · 02db0844

Committed by Josef Bacik on Jun 21, 2012

This will be used in conjunction with btrfs device ready <dev>.  This is
needed for initrd's to have a nice and lightweight way to tell if all of the
devices needed for a file system are in the cache currently.  This keeps
them from having to do mount+sleep loops waiting for devices to show up.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

02db0844

Btrfs: flush delayed inodes if we're short on space · 96c3f433

Committed by Josef Bacik on Jun 21, 2012

Those crazy gentoo guys have been complaining about ENOSPC errors on their
portage volumes. This is because doing things like untar tends to create
lots of new files which will soak up all the reservation space in the
delayed inodes. Usually this gets papered over by the fact that we will try
and commit the transaction, however if this happens in the wrong spot or we
choose not to commit the transaction you will be screwed. So add the
ability to explicitly flush delayed inodes to free up space. Please test
this out guys to make sure it works since as usual I cannot reproduce.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

96c3f433

btrfs: join DEV_STATS ioctls to one · b27f7c0c

Committed by David Sterba on Jun 22, 2012

Commit c11d2c23 (Btrfs: add ioctl to get and reset the device
stats) introduced two ioctls doing almost the same thing distinguished
by just the ioctl number which encodes "do reset after read". I have
suggested

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16604.html

to implement it via the ioctl args. This hasn't happened, and I think we
should use a cleaner way to pass flags and should not waste ioctl
numbers.

CC: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: David Sterba <dsterba@suse.cz>

b27f7c0c
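The idea in b27f7c0c, one ioctl whose argument carries a "reset after read" flag instead of two ioctl numbers, can be sketched as follows. Everything here is hypothetical: `struct get_stats_args_model`, `STATS_RESET_FLAG_MODEL` and `get_dev_stats_model()` are illustrative names, and rejecting unknown flag bits with `-EINVAL` is a design choice of the sketch, not something the commit message specifies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit: reset the counter after reading it. */
#define STATS_RESET_FLAG_MODEL (1ULL << 0)

struct get_stats_args_model {
    uint64_t flags; /* in: behaviour flags */
    uint64_t value; /* out: counter value at the time of the call */
};

/* One entry point serves both "read" and "read and reset". */
static int get_dev_stats_model(uint64_t *counter,
                               struct get_stats_args_model *args)
{
    assert(counter != NULL && args != NULL);

    /* Assumption of this sketch: unknown flag bits are rejected. */
    if (args->flags & ~STATS_RESET_FLAG_MODEL)
        return -EINVAL;

    args->value = *counter;
    if (args->flags & STATS_RESET_FLAG_MODEL)
        *counter = 0;
    return 0;
}
```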

btrfs: ignore unfragmented file checks in defrag when compression enabled - rebased · a43a2111

Committed by Andrew Mahone on Jun 19, 2012

Rebased on btrfs-next and retested.

Inform should_defrag_range if BTRFS_DEFRAG_RANGE_COMPRESS is set. If so, skip
checks for adjacent extents and extent size when deciding whether to defrag,
as these can prevent an uncompressed and unfragmented file from being
compressed as requested.
Signed-off-by: Andrew Mahone <andrew.mahone@gmail.com>

a43a2111

Btrfs: small naming cleanup in join_transaction() · e4b50e14

Committed by Dan Carpenter on Jun 19, 2012

"root->fs_info" and "fs_info" are the same, but "fs_info" is preferred
because it is shorter and that's what is used in the rest of the
function.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

e4b50e14

Btrfs: don't update atime on RO subvolumes · 2bc55652

Committed by Alexander Block on Jun 15, 2012

Before the update_time inode operation was introduced, it was
not possible to prevent updates of atime on RO subvolumes. VFS
was only able to check for RO on the mount, but did not know
anything about btrfs subvolumes.

btrfs_update_time now checks whether the root is RO and skips
updating the times.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

2bc55652
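The check described in 2bc55652 is a per-root test that runs before any timestamp is touched. A minimal sketch, assuming hypothetical types `struct subvol_root_model` and `struct timed_inode_model`; returning `-EROFS` for a read-only root is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct subvol_root_model {
    bool readonly; /* models a read-only subvolume */
};

struct timed_inode_model {
    const struct subvol_root_model *root;
    long atime;
};

/*
 * Per-filesystem hook: the generic layer only knows about the mount,
 * so the subvolume check has to happen here.
 */
static int update_time_model(struct timed_inode_model *inode, long now)
{
    assert(inode != NULL && inode->root != NULL);
    if (inode->root->readonly)
        return -EROFS; /* leave the timestamps untouched */
    inode->atime = now;
    return 0;
}
```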

Btrfs: allow mount -o remount,compress=no · 063849ea

Committed by Arnd Hannemann on Apr 16, 2012

Btrfs allows turning on compression on a mounted and used filesystem
by issuing mount -o remount,compress=lzo.
This patch allows turning compression off again
while the filesystem is mounted. As suggested by David Sterba,
if the compress-force option was set, it is implicitly cleared
when compression is turned off.
Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Arnd Hannemann <arnd@arndnet.de>

063849ea

Btrfs: remove ->dirty_inode · c5c3c5f3

Committed by Josef Bacik on Apr 05, 2012

We do all of our inode updating when we change it, and now that we do
->update_time we don't need ->dirty_inode for atime updates anymore, so just
remove it.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

c5c3c5f3

Btrfs: reduce calls to wake_up on uncontended locks · cbea5ac1

Committed by Chris Mason on Jul 23, 2012

The btrfs locks were unconditionally calling wake_up as the
locks were released.  This led to extra thrashing on the waitqueue,
especially for locks that were dominated by readers.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

cbea5ac1
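The optimisation in cbea5ac1 boils down to "only wake if someone is waiting". The single-threaded model below uses hypothetical names (`struct lock_model`, `wake_up_model()`, `read_unlock_model()`) and a plain counter in place of a wait queue. Kernel code would more likely test the wait queue with something like `waitqueue_active()` together with a suitable memory barrier; that detail is an assumption and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

struct lock_model {
    int readers; /* current read holders */
    int waiters; /* tasks sleeping on the lock */
    int wakeups; /* how many times the wake path ran */
};

/* Stands in for the comparatively expensive wake_up() call. */
static void wake_up_model(struct lock_model *lock)
{
    lock->wakeups++;
}

/* Release a read hold; touch the wait queue only when it is non-empty. */
static void read_unlock_model(struct lock_model *lock)
{
    assert(lock != NULL && lock->readers > 0);
    lock->readers--;
    if (lock->waiters > 0)
        wake_up_model(lock);
}
```

For a reader-dominated lock with no sleepers, every unlock in this model skips the wake path entirely.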

Btrfs: don't wait around for new log writers on an SSD · e39e64ac

Committed by Chris Mason on Jul 23, 2012

Waiting on spindles improves performance, but ssds want all the
IO as quickly as we can push it down.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

e39e64ac

22 Jul, 2012 7 commits

Linux 3.5 · 28a33cbc

Committed by Linus Torvalds on Jul 21, 2012

28a33cbc

Remove SYSTEM_SUSPEND_DISK system state · bff9d186

Committed by Rafael J. Wysocki on Jul 21, 2012

The SYSTEM_SUSPEND_DISK system state is never used, so drop it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bff9d186

Merge branch 'anton-kgdb' (kgdb dmesg fixups) · 9a2bc860

Committed by Linus Torvalds on Jul 21, 2012

Merge emailed kgdb dmesg fixups patches from Anton Vorontsov:
 "The dmesg command appears to be broken after the printk rework.  The
  old logic in the kdb code makes no sense in terms of current
  printk/logging storage format, and KDB simply hangs forever upon
  entering 'dmesg' command.

  The first patch revives the command by switching to kmsg_dumper
  iterator.  As a side-effect, the code is now much simpler.

  A few changes were needed in the printk.c: we needed unlocked variant
  of the kmsg_dumper iterator, but these can surely wait for 3.6.

  It's probably too late even for the first patch to go to 3.5, but I'll
  try to convince otherwise.  :-) Here we go:

   - The current code is broken for sure, and has no hope to work at
     all.  It is a regression
   - The new code works for me, and probably works for everyone else;
   - If it compiles (and I urge everyone to compile-test it on your
     setup), it hardly can make things worse."

* Merge emailed patches from Anton Vorontsov: (4 commits)
  kdb: Switch to nolock variants of kmsg_dump functions
  printk: Implement some unlocked kmsg_dump functions
  printk: Remove kdb_syslog_data
  kdb: Revive dmesg command

9a2bc860

kdb: Switch to nolock variants of kmsg_dump functions · c064da47

Committed by Anton Vorontsov on Jul 20, 2012

The locked variants are prone to deadlocks (suppose we got to the
debugger w/ the logbuf lock held), so let's switch to nolock variants.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c064da47

printk: Implement some unlocked kmsg_dump functions · 533827c9

Committed by Anton Vorontsov on Jul 20, 2012

If used from KDB, the locked variants are prone to deadlocks (suppose we
got to the debugger w/ the logbuf lock held).

So, we have to implement a few routines that grab no logbuf lock.

Yet we don't need these functions in modules, so we don't export them.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

533827c9

printk: Remove kdb_syslog_data · 1b499d05

Committed by Anton Vorontsov on Jul 20, 2012

The function is no longer needed, so remove it.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1b499d05

kdb: Revive dmesg command · bc792e61

Committed by Anton Vorontsov on Jul 20, 2012

The kgdb dmesg command is broken after the printk rework.  The old logic
in kdb code makes no sense in terms of current printk/logging storage
format, and KDB simply hangs forever.

This patch revives the command by switching to kmsg_dumper iterator.

The code is now much simpler and shorter.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bc792e61

21 Jul, 2012 4 commits

Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · d75e2c9a

Committed by Linus Torvalds on Jul 20, 2012

Pull late MIPS fixes from Ralf Baechle:
 "This fixes a number of loose ends in the MIPS code and various bug
  fixes.

  Aside from dropping some patch that should not be in this pull request
  everything has sat in -next for quite a while and there are no known
  issues.

  The biggest patch in this patch set moves the allocation of an array
  that is aliased to a function (for runtime generated code) to
  assembler code.  This avoids an issue with certain toolchains when
  building for microMIPS."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (35 commits)
  MIPS: PCI: Move fixups from __init to __devinit.
  MIPS: Fix bug.h MIPS build regression
  MIPS: sync-r4k: remove redundant irq operation
  MIPS: smp: Warn on too early irq enable
  MIPS: call set_cpu_online() on cpu being brought up with irq disabled
  MIPS: call ->smp_finish() a little late
  MIPS: Yosemite: delay irq enable to ->smp_finish()
  MIPS: SMTC: delay irq enable to ->smp_finish()
  MIPS: BMIPS: delay irq enable to ->smp_finish()
  MIPS: Octeon: delay enable irq to ->smp_finish()
  MIPS: Oprofile: Fix build as a module.
  MIPS: BCM63XX: Fix BCM6368 IPSec clock bit
  MIPS: perf: Fix build error caused by unused counters_per_cpu_to_total()
  MIPS: Fix Magic SysRq L kernel crash.
  MIPS: BMIPS: Fix duplicate header inclusion.
  mips: mark const init data with __initconst instead of __initdata
  MIPS: cmpxchg.h: Add missing include
  MIPS: Malta may also be equipped with MIPS64 R2 processors.
  MIPS: Fix typo multipy -> multiply
  MIPS: Cavium: Fix duplicate ARCH_SPARSEMEM_ENABLE in kconfig.
  ...

d75e2c9a

Merge tag 'dm-3.5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm · 93517374

Committed by Linus Torvalds on Jul 20, 2012

Pull device-mapper discard fixes from Alasdair G Kergon:
  - avoid a crash in dm-raid1 when discards coincide with mirror
    recovery;
  - avoid discarding shared data that's still needed in dm-thin;
  - don't guarantee that discarded blocks will be wiped in dm-raid1.

* tag 'dm-3.5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm raid1: set discard_zeroes_data_unsupported
  dm thin: do not send discards to shared blocks
  dm raid1: fix crash with mirror recovery and discard

93517374

Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd · ce9f8d6b

Committed by Linus Torvalds on Jul 20, 2012

Pull pnfs/ore fixes from Boaz Harrosh:
 "These are catastrophic fixes to the pnfs objects-layout that were just
  discovered.  They are also destined for @stable.

  I have found these and worked on them at around RC1 time but
  unfortunately went to the hospital for kidney stones and had a very
  slow recovery.  I refrained from sending them as is, before proper
  testing, and surely I have found a bug just yesterday.

  So now they are all well tested, and have my sign-off.  Other than
  fixing the problem at hand, and assuming there are no bugs in the new
  code, there is low risk to any surrounding code.  And in any case they
  affect only these paths that are now broken.  That is RAID5 in pnfs
  objects-layout code.  It does also affect exofs (which was not broken)
  but I have tested exofs and it is lower priority than objects-layout
  because no one is using exofs, but objects-layout has lots of users."

* 'for-linus' of git://git.open-osd.org/linux-open-osd:
  pnfs-obj: Fix __r4w_get_page when offset is beyond i_size
  pnfs-obj: don't leak objio_state if ore_write/read fails
  ore: Unlock r4w pages in exact reverse order of locking
  ore: Remove support of partial IO request (NFS crash)
  ore: Fix NFS crash by supporting any unaligned RAID IO

ce9f8d6b

Merge tag 'upstream-3.5-rc8' of git://git.infradead.org/linux-ubifs · 17934162

Committed by Linus Torvalds on Jul 20, 2012

Pull UBIFS free space fix-up bugfix from Artem Bityutskiy:
 "It's been reported already twice recently:

    http://lists.infradead.org/pipermail/linux-mtd/2012-May/041408.html
    http://lists.infradead.org/pipermail/linux-mtd/2012-June/042422.html

  and we finally have the fix.  I am quite confident the fix is correct
  because I could reproduce the problem with nandsim and verify the fix.
  It was also verified by Iwo (the reporter).

  I am also confident that this is OK to merge the fix so late because
  this patch affects only the fixup functionality, which is not used by
  most users."

* tag 'upstream-3.5-rc8' of git://git.infradead.org/linux-ubifs:
  UBIFS: fix a bug in empty space fix-up

17934162

20 Jul, 2012 6 commits

dm raid1: set discard_zeroes_data_unsupported · 7c8d3a42

Committed by Mikulas Patocka on Jul 20, 2012

We can't guarantee that REQ_DISCARD on dm-mirror zeroes the data even if
the underlying disks support zero on discard.  So this patch sets
ti->discard_zeroes_data_unsupported.

For example, if the mirror is in the process of resynchronizing, it may
happen that kcopyd reads a piece of data, then discard is sent on the
same area and then kcopyd writes the piece of data to another leg.
Consequently, the data is not zeroed.

The flag was made available by commit 983c7db3
(dm crypt: always disable discard_zeroes_data).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

7c8d3a42

dm thin: do not send discards to shared blocks · 650d2a06

Committed by Mikulas Patocka on Jul 20, 2012

When process_discard receives a partial discard that doesn't cover a
full block, it sends this discard down to that block. Unfortunately, the
block can be shared and the discard would corrupt the other snapshots
sharing this block.

This patch detects block sharing and ends the discard with success when
sending it to the shared block.

The above change means that if the device supports discard it can't be
guaranteed that a discard request zeroes data. Therefore, we set
ti->discard_zeroes_data_unsupported.

Thin target discard support with this bug arrived in commit
104655fd (dm thin: support discards).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

650d2a06

dm raid1: fix crash with mirror recovery and discard · 751f188d

Committed by Mikulas Patocka on Jul 20, 2012

This patch fixes a crash when a discard request is sent during mirror
recovery.

Firstly, some background.  Generally, the following sequence happens during
mirror synchronization:
- function do_recovery is called
- do_recovery calls dm_rh_recovery_prepare
- dm_rh_recovery_prepare uses a semaphore to limit the number of
  simultaneously recovered regions (by default the semaphore value is 1,
  so only one region at a time is recovered)
- dm_rh_recovery_prepare calls __rh_recovery_prepare,
  __rh_recovery_prepare asks the log driver for the next region to
  recover. Then, it sets the region state to DM_RH_RECOVERING. If there
  are no pending I/Os on this region, the region is added to
  quiesced_regions list. If there are pending I/Os, the region is not
  added to any list. It is added to the quiesced_regions list later (by
  dm_rh_dec function) when all I/Os finish.
- when the region is on quiesced_regions list, there are no I/Os in
  flight on this region. The region is popped from the list in
  dm_rh_recovery_start function. Then, a kcopyd job is started in the
  recover function.
- when the kcopyd job finishes, recovery_complete is called. It calls
  dm_rh_recovery_end. dm_rh_recovery_end adds the region to
  recovered_regions or failed_recovered_regions list (depending on
  whether the copy operation was successful or not).

The above mechanism assumes that if the region is in DM_RH_RECOVERING
state, no new I/Os are started on this region. When I/O is started,
dm_rh_inc_pending is called, which increases reg->pending count. When
I/O is finished, dm_rh_dec is called. It decreases reg->pending count.
If the count is zero and the region was in DM_RH_RECOVERING state,
dm_rh_dec adds it to the quiesced_regions list.

Consequently, if we call dm_rh_inc_pending/dm_rh_dec while the region is
in DM_RH_RECOVERING state, it could be added to quiesced_regions list
multiple times or it could be added to this list when kcopyd is copying
data (it is assumed that the region is not on any list while kcopyd does
its jobs). This results in memory corruption and crash.

There already exist bypasses for REQ_FLUSH requests: REQ_FLUSH requests
do not belong to any region, so they are always added to the sync list
in do_writes. dm_rh_inc_pending does not increase count for REQ_FLUSH
requests. In mirror_end_io, dm_rh_dec is never called for REQ_FLUSH
requests. These bypasses avoid the crash possibility described above.

These bypasses were improperly implemented for REQ_DISCARD when
the mirror target gained discard support in commit
5fc2ffea (dm raid1: support discard).

In do_writes, REQ_DISCARD requests are always added to the sync queue and
immediately dispatched (even if the region is in DM_RH_RECOVERING).  However,
dm_rh_inc and dm_rh_dec are called for REQ_DISCARD requests.  So it violates the
rule that no I/Os are started on DM_RH_RECOVERING regions, and causes the list
corruption described above.

This patch changes it so that REQ_DISCARD requests follow the same path
as REQ_FLUSH. This avoids the crash.

Reference: https://bugzilla.redhat.com/837607
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

751f188d
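The accounting problem described in 751f188d can be reproduced with a deliberately simplified model. The names below (`struct region_model`, `inc_pending_model()`, `dec_pending_model()`, `discard_during_recovery()`) are hypothetical, and the quiesced list is reduced to a counter of how many times the region was queued. The `discard_bypass` switch selects between counting discards like ordinary writes and letting them skip accounting the way flush requests do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum region_state_model { RH_CLEAN_MODEL, RH_RECOVERING_MODEL };
enum req_kind_model { REQ_WRITE_MODEL, REQ_FLUSH_MODEL, REQ_DISCARD_MODEL };

struct region_model {
    enum region_state_model state;
    int pending;        /* in-flight I/Os counted against the region */
    int times_quiesced; /* how often it was put on the quiesced list */
};

/* Flushes always skip accounting; discards do so only with the bypass. */
static bool bypasses_accounting(enum req_kind_model kind, bool discard_bypass)
{
    return kind == REQ_FLUSH_MODEL ||
           (discard_bypass && kind == REQ_DISCARD_MODEL);
}

static void inc_pending_model(struct region_model *r, enum req_kind_model kind,
                              bool discard_bypass)
{
    assert(r != NULL);
    if (bypasses_accounting(kind, discard_bypass))
        return;
    r->pending++;
}

static void dec_pending_model(struct region_model *r, enum req_kind_model kind,
                              bool discard_bypass)
{
    assert(r != NULL);
    if (bypasses_accounting(kind, discard_bypass))
        return;
    assert(r->pending > 0);
    r->pending--;
    /* Last I/O done on a recovering region: queue it as quiesced. */
    if (r->pending == 0 && r->state == RH_RECOVERING_MODEL)
        r->times_quiesced++;
}

/* A discard arrives while the region is recovering and already queued once. */
static int discard_during_recovery(bool discard_bypass)
{
    struct region_model r = { RH_RECOVERING_MODEL, 0, 1 };

    inc_pending_model(&r, REQ_DISCARD_MODEL, discard_bypass);
    dec_pending_model(&r, REQ_DISCARD_MODEL, discard_bypass);
    return r.times_quiesced;
}
```

With the bypass off, the region ends up queued twice, which corresponds to the list corruption in the commit message; with the bypass on, it stays queued once.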

pnfs-obj: Fix __r4w_get_page when offset is beyond i_size · c999ff68

Committed by Boaz Harrosh on Jun 08, 2012

It is very common for the end of the file to be unaligned on
stripe size. But since we know it's beyond file's end then
the XOR should be performed with all zeros.

Old code used to just read zeros out of the OSD devices, which is a great
waste. But what scares me more about this situation is that we now have
pages attached to the file's mapping that are beyond i_size. I don't
like the kind of bugs this calls for.

Fix both birds, by returning a global zero_page, if offset is beyond
i_size.

TODO:
	Change the API to ->__r4w_get_page() so a NULL can be
	returned without being considered as error, since XOR API
	treats NULL entries as zero_pages.

[Bug since 3.2. Should apply the same way to all Kernels since]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>

c999ff68

pnfs-obj: don't leak objio_state if ore_write/read fails · 9909d45a

Committed by Boaz Harrosh on Jun 08, 2012

[Bug since 3.2 Kernel]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>

9909d45a

ore: Unlock r4w pages in exact reverse order of locking · 537632e0

Committed by Boaz Harrosh on Jul 11, 2012

The read-4-write pages are locked in address ascending order.
But were unlocked in a way easiest for coding. Fix that,
locks should be released in opposite order of locking, i.e.
descending address order.

I have not hit this dead-lock. It was found by inspecting the
debug print-outs. I suspect there is a higher lock at caller that
protects us, but fix it regardless.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>

537632e0
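The ordering rule from 537632e0 (lock ascending, unlock descending) can be sketched with a plain array. `struct page_model`, `lock_pages_model()` and `unlock_pages_reverse()` are hypothetical names; the `order` array exists only so the release order can be inspected.

```c
#include <assert.h>
#include <stddef.h>

struct page_model {
    int locked;
};

/* Lock pages in ascending index order; returns how many were locked. */
static size_t lock_pages_model(struct page_model *pages, size_t n)
{
    assert(pages != NULL);
    for (size_t i = 0; i < n; i++) {
        assert(!pages[i].locked);
        pages[i].locked = 1;
    }
    return n;
}

/*
 * Unlock in exactly the reverse order of locking. order[k] receives the
 * index of the k-th page released.
 */
static void unlock_pages_reverse(struct page_model *pages, size_t n,
                                 size_t *order)
{
    assert(pages != NULL && order != NULL);
    for (size_t i = n; i-- > 0; ) {
        pages[i].locked = 0;
        order[n - 1 - i] = i;
    }
}
```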

openanolis / cloud-kernel synced successfully more than 1 year ago