Commits · f1e89c86fdd0f5e59f6768146c86437934202033 · openeuler / raspberrypi-kernel

23 Oct, 2010 25 commits

nilfs2: use iget for all metadata files · f1e89c86

Committed by Ryusuke Konishi on Sep 05, 2010

This uses iget5_locked to allocate or look up inodes for metadata
files, so that nilfs2 no longer needs its own inode allocator.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: get rid of GCDAT inode · c1c1d709

Committed by Ryusuke Konishi on Aug 29, 2010

This applies the previously prepared rollback and redirect functions
of metadata files to the DAT file, and eliminates the GCDAT inode.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: add routines to redirect access to buffers of DAT file · b1f6a4f2

Committed by Ryusuke Konishi on Aug 31, 2010

During garbage collection (GC), DAT file, which converts virtual block
number to real block number, may return disk block number that is not
yet written to the device.

To avoid access to unwritten blocks, the current implementation stores
changes to the caches of GCDAT during GC and atomically commits the
changes into the DAT file after they are written to the device.

This patch, instead, adds a function that makes a copy of specified
buffer and stores it in nilfs_shadow_map, and a function to get the
backup copy as needed (nilfs_mdt_freeze_buffer and
nilfs_mdt_get_frozen_buffer respectively).

Before DAT changes the block number in an entry block, it makes a copy
and redirects access to the buffer so that the address conversion
function (i.e. nilfs_dat_translate) refers to the old address saved in
the copy.

This patch provides the prerequisites for such redirection.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
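
The freeze-and-redirect idea in the message above can be illustrated with a small user-space sketch. It is not the kernel code: `dat_block`, `dat_freeze`, `dat_move`, `dat_translate`, and `dat_commit` are hypothetical names, and the frozen copy lives inside the block here rather than in a separate nilfs_shadow_map.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES_PER_BLOCK 4 /* hypothetical, tiny for illustration */

/* One DAT entry block: virtual block index -> disk block number. */
struct dat_block {
    uint64_t blocknr[ENTRIES_PER_BLOCK]; /* live contents, modified by GC */
    uint64_t frozen[ENTRIES_PER_BLOCK];  /* backup copy taken before changes */
    int has_frozen;                      /* nonzero while a backup exists */
};

/* Keep a copy of the block as it was before the first modification. */
void dat_freeze(struct dat_block *b)
{
    assert(b != NULL);
    if (!b->has_frozen) {
        memcpy(b->frozen, b->blocknr, sizeof(b->frozen));
        b->has_frozen = 1;
    }
}

/* GC relocates an entry: freeze first, then change the live block. */
void dat_move(struct dat_block *b, unsigned idx, uint64_t new_blocknr)
{
    assert(idx < ENTRIES_PER_BLOCK);
    dat_freeze(b);
    b->blocknr[idx] = new_blocknr;
}

/* Translation is redirected to the frozen copy while one exists. */
uint64_t dat_translate(const struct dat_block *b, unsigned idx)
{
    assert(b != NULL && idx < ENTRIES_PER_BLOCK);
    return b->has_frozen ? b->frozen[idx] : b->blocknr[idx];
}

/* Once the new blocks are on the device, drop the backup. */
void dat_commit(struct dat_block *b)
{
    assert(b != NULL);
    b->has_frozen = 0;
}
```

Until `dat_commit` runs, readers keep seeing the old disk block number, which is the property the commit message asks for.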

nilfs2: add routines to roll back state of DAT file · ebdfed4d

Committed by Ryusuke Konishi on Sep 06, 2010

This adds optional function to metadata files which makes a copy of
bmap, page caches, and b-tree node cache, and rolls back to the copy
as needed.

This enhancement is intended to displace gcdat inode that provides a
similar function in a different way.

In this patch, the nilfs_shadow_map structure is added to store a copy
of the foregoing states.  nilfs_mdt_setup_shadow_map relates this
structure to a metadata file.  nilfs_mdt_save_to_shadow_map() and
nilfs_mdt_restore_from_shadow_map() provide the save and restore
functions respectively.  Finally, nilfs_mdt_clear_shadow_map() clears
the states of nilfs_shadow_map.

The copy of b-tree node cache and page cache is made by duplicating
only dirty pages into corresponding caches in nilfs_shadow_map.  Their
restoration is done by clearing dirty pages from original caches and
by copying dirty pages back from nilfs_shadow_map.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
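
The save and restore rule for dirty pages described above can be sketched in user space as follows. This is only an illustration under simplifying assumptions: the cache is a fixed array, and `page_cache`, `shadow_map`, `shadow_save`, and `shadow_restore` are hypothetical names, not the kernel's structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NPAGES 4     /* hypothetical cache size */
#define PAGE_BYTES 8 /* hypothetical page size */

struct page_cache {
    char data[NPAGES][PAGE_BYTES];
    bool present[NPAGES];
    bool dirty[NPAGES];
};

struct shadow_map {
    struct page_cache saved; /* holds copies of dirty pages only */
};

/* Save: duplicate only the dirty pages into the shadow cache. */
void shadow_save(struct shadow_map *s, const struct page_cache *c)
{
    assert(s != NULL && c != NULL);
    memset(&s->saved, 0, sizeof(s->saved));
    for (int i = 0; i < NPAGES; i++) {
        if (c->present[i] && c->dirty[i]) {
            memcpy(s->saved.data[i], c->data[i], PAGE_BYTES);
            s->saved.present[i] = true;
            s->saved.dirty[i] = true;
        }
    }
}

/* Restore: drop current dirty pages, then copy the saved ones back. */
void shadow_restore(const struct shadow_map *s, struct page_cache *c)
{
    assert(s != NULL && c != NULL);
    for (int i = 0; i < NPAGES; i++) {
        if (c->dirty[i]) {
            memset(c->data[i], 0, PAGE_BYTES);
            c->present[i] = false;
            c->dirty[i] = false;
        }
    }
    for (int i = 0; i < NPAGES; i++) {
        if (s->saved.present[i]) {
            memcpy(c->data[i], s->saved.data[i], PAGE_BYTES);
            c->present[i] = true;
            c->dirty[i] = true;
        }
    }
}
```

Clean pages are never copied in this sketch, on the assumption that they still match what is on disk and so need no rollback.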

nilfs2: add routines to save and restore bmap state · a8070dd3

Committed by Ryusuke Konishi on Aug 30, 2010

This adds routines to save and restore the state of bmap structure.
The bmap state is stored in a given nilfs_bmap_store object.

These routines will be used to roll back the state of dat inode
without using gcdat inode.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: do not allocate nilfs_mdt_info structure to gc-inodes · adbb39b5

Committed by Ryusuke Konishi on Sep 05, 2010

A GC-inode no longer needs the nilfs_mdt_info structure, and there is
no reason to treat it as a kind of metadata file.

This stops the allocation and makes GC-inodes independent of the
metadata file routines.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: allow nilfs_clear_inode to clear metadata file inodes · 518d1a6a

Committed by Ryusuke Konishi on Aug 20, 2010

Allows the clear inode function (nilfs_clear_inode) to handle metadata
files that use the bitmap-based object allocator.  DAT and ifile
are such files.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: simplify life cycle management of nilfs object · 348fe8da

Committed by Ryusuke Konishi on Sep 09, 2010

This stops pre-allocating nilfs object in nilfs_get_sb routine, and
stops managing its life cycle by reference counting.

nilfs_find_or_create_nilfs() function, nilfs->ns_mount_mutex,
nilfs_objects list, and the reference counter will be removed through
the simplification.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: do not allocate multiple super block instances for a device · f11459ad

Committed by Ryusuke Konishi on Aug 16, 2010

This stops allocating multiple super block instances for a device.

All snapshots and a current mode mount (i.e. latest tree) will be
controlled with nilfs_root objects that are kept within an sb
instance.

nilfs_get_sb() is rewritten so that it always has a root object for
the latest tree and snapshots make additional root objects.

The root dentry of the latest tree is bound to sb->s_root even if it
isn't attached to a directory.  Root dentries of snapshots or the
latest tree are bound to mnt->mnt_root on which they are mounted.

With this patch, nilfs_find_sbinfo() function, nilfs->ns_supers list,
and nilfs->ns_current back pointer, are deleted.  In addition,
init_nilfs() and load_nilfs() are simplified since they will be called
once for a device, not repeatedly called for mount points.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: split out nilfs_attach_snapshot · ab4d8f7e

Committed by Ryusuke Konishi on Aug 26, 2010

This splits the code to attach snapshots into a separate routine for
convenience.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: split out nilfs_get_root_dentry · 367ea334

Committed by Ryusuke Konishi on Aug 26, 2010

This splits the code to allocate root dentry into a separate routine
for convenience in successive changes.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: deny write access to inodes in snapshots · dc3d3b81

Committed by Ryusuke Konishi on Aug 15, 2010

Snapshots of nilfs are read-only.

Once super block instances (sb) are unified, nilfs will need to
check write access by some means other than the implicit test with
IS_RDONLY(inode).  This is because IS_RDONLY() refers to the MS_RDONLY
bit of inode->i_sb->s_flags, which will become inaccurate after the
unification of sb.

To prepare for the issue, this uses i_op->permission to deny write
access to inodes in snapshots.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
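
A permission hook of the kind described above might look roughly like the following user-space sketch. It assumes each inode can reach the root object of its checkpoint and that the latest tree is marked by a reserved checkpoint number. The types, the `SK_` constants, and `permission_sketch` are hypothetical stand-ins, not the kernel's i_op->permission signature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SK_CURRENT_CNO 0  /* assumed marker for the latest (writable) tree */
#define SK_MAY_READ  0x4  /* hypothetical access mask bits */
#define SK_MAY_WRITE 0x2

struct root_sketch {
    uint64_t cno; /* checkpoint number this tree belongs to */
};

struct inode_sketch {
    const struct root_sketch *root; /* NULL for inodes without a root */
};

/* Deny write access to any inode that belongs to a snapshot. */
int permission_sketch(const struct inode_sketch *inode, int mask)
{
    assert(inode != NULL);
    if ((mask & SK_MAY_WRITE) && inode->root != NULL &&
        inode->root->cno != SK_CURRENT_CNO)
        return -EROFS;
    /* A real hook would continue with the generic permission check. */
    return 0;
}
```

The check depends only on the inode's own root, so it stays correct even when a snapshot and the latest tree share one super block whose MS_RDONLY bit no longer distinguishes them.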

nilfs2: use checkpoint tree for mount check of snapshots · fd522029

Committed by Ryusuke Konishi on Aug 14, 2010

This rewrites nilfs_checkpoint_is_mounted() function so that it
decides whether a checkpoint is mounted by whether the corresponding
root object is found in checkpoint tree.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: move inode count and block count into root object · b7c06342

Committed by Ryusuke Konishi on Aug 14, 2010

This moves sbi->s_inodes_count and sbi->s_blocks_count into nilfs_root
object.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: use root object to get ifile · e912a5b6

Committed by Ryusuke Konishi on Aug 14, 2010

This rewrites functions using ifile so that they get ifile from
nilfs_root object, and will remove sbi->s_ifile. Some functions that
don't know the root object are extended to receive it from caller.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: make snapshots in checkpoint tree exportable · 8e656fd5

Committed by Ryusuke Konishi on Aug 27, 2010

The previous export operations cannot handle multiple versions of
a filesystem if they belong to the same sb instance.

This adds a new type of file handle and extends export operations so
that they can get the inode specified by a checkpoint number as well
as an inode number and a generation number.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: set pointer to root object in inodes · 4d8d9293

Committed by Ryusuke Konishi on Aug 25, 2010

This puts a pointer to nilfs_root object in the private part of
on-memory inode, and makes nilfs_iget function pick up the inode with
the same root object.

Non-root inodes inherit their nilfs_root object from the parent inode.
That of the root inode is allocated through the
nilfs_attach_checkpoint() function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: add checkpoint tree to nilfs object · ba65ae47

Committed by Ryusuke Konishi on Aug 14, 2010

To hold multiple versions of a filesystem in one sb instance, a new
on-memory structure is necessary to handle one or more checkpoints.

This adds a red-black tree of checkpoints to nilfs object, and adds
lookup and create functions for them.

Each checkpoint is represented by "nilfs_root" structure, and this
structure has rb_node to configure the rb-tree.

The nilfs_root object is identified with a checkpoint number.  For
each snapshot, a nilfs_root object is allocated and the checkpoint
number of snapshot is assigned to it.  For a regular mount
(i.e. current mode mount), NILFS_CPTREE_CURRENT_CNO constant is
assigned to the corresponding nilfs_root object.

Each nilfs_root object has an ifile inode and some counters.  These
items will displace those of nilfs_sb_info structure in successive
patches.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
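
The lookup and create operations on the checkpoint tree can be sketched as below. For brevity this uses a plain, unbalanced binary search tree keyed by checkpoint number as a stand-in for the kernel's red-black tree, and it ignores locking. `root_node`, `cptree_lookup`, and `cptree_find_or_create` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One entry per checkpoint, ordered by checkpoint number. */
struct root_node {
    uint64_t cno;                   /* key: checkpoint number */
    int count;                      /* users of this root object */
    struct root_node *left, *right; /* stand-in for an rb_node */
};

/* Return the root object for cno, or NULL if none is in the tree. */
struct root_node *cptree_lookup(struct root_node *tree, uint64_t cno)
{
    while (tree != NULL) {
        if (cno < tree->cno)
            tree = tree->left;
        else if (cno > tree->cno)
            tree = tree->right;
        else
            return tree;
    }
    return NULL;
}

/* Return the existing root object for cno, or insert a new one. */
struct root_node *cptree_find_or_create(struct root_node **tree, uint64_t cno)
{
    struct root_node **link = tree;
    struct root_node *n;

    assert(tree != NULL);
    while (*link != NULL) {
        if (cno < (*link)->cno) {
            link = &(*link)->left;
        } else if (cno > (*link)->cno) {
            link = &(*link)->right;
        } else {
            (*link)->count++;
            return *link;
        }
    }
    n = calloc(1, sizeof(*n));
    if (n == NULL)
        return NULL;
    n->cno = cno;
    n->count = 1;
    *link = n; /* attach the new node at the empty slot we reached */
    return n;
}
```

With such a tree, asking whether a checkpoint is mounted reduces to asking whether `cptree_lookup` finds a root object for its number.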

nilfs2: remove own inode hash used for GC · 263d90ce

Committed by Ryusuke Konishi on Aug 20, 2010

This uses the inode hash function that the vfs provides instead of
nilfs's own hash table for caching gc inodes.  This finally removes
the private inode hash from nilfs.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: separate initializer of metadata file inode · 5e19a995

Committed by Ryusuke Konishi on Aug 21, 2010

This separates out a part of the initialization code for metadata file
inodes, and makes it available to the nilfs iget function that a later
patch will add.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: use iget5_locked to get inode · 0e14a359

Committed by Ryusuke Konishi on Aug 20, 2010

This uses iget5_locked instead of iget_locked so that gc cache can
look up inodes with an inode number and an optional checkpoint number.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
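
iget5_locked differs from iget_locked in that the caller supplies a test callback (plus a set callback) that decides whether a cached inode matches. The user-space sketch below shows what such a match rule could look like for an inode number plus an optional checkpoint number. The structures and the linear search are hypothetical simplifications, not the nilfs2 implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Lookup key: inode number, plus a checkpoint number for GC lookups. */
struct lookup_key {
    unsigned long ino;
    uint64_t cno;
    bool for_gc;
};

struct cached_inode {
    unsigned long ino;
    uint64_t cno; /* meaningful only for GC inodes */
    bool is_gc;
};

/* Match rule in the style of an iget5_locked test callback. */
int key_matches(const struct cached_inode *inode, const struct lookup_key *key)
{
    if (inode->ino != key->ino)
        return 0;
    if (!key->for_gc)
        return !inode->is_gc; /* regular lookups skip GC inodes */
    return inode->is_gc && inode->cno == key->cno;
}

/* Linear scan standing in for the VFS inode hash lookup. */
struct cached_inode *cache_find(struct cached_inode *cache, size_t n,
                                const struct lookup_key *key)
{
    assert(key != NULL);
    for (size_t i = 0; i < n; i++) {
        if (key_matches(&cache[i], key))
            return &cache[i];
    }
    return NULL;
}
```

Several cached inodes can share one inode number this way, and the checkpoint number tells the GC versions apart.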

nilfs2: keep zero value in i_cno except for gc-inodes · 6c43f410

Committed by Ryusuke Konishi on Aug 20, 2010

On-memory inode structures of nilfs have a member "i_cno" which stores
a checkpoint number related to the inode.  For gc-inodes, this field
indicates version of data each gc-inode caches for GC.  Log writer
temporarily uses "i_cno" to transfer the latest checkpoint number.

This stops the latter use and lets only gc-inodes use it.

The purpose of this patch is to allow a subsequent change to use
"i_cno" for inode lookup.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: allow nilfs_dirty_inode to mark metadata file inodes dirty · 7d6cd92f

Committed by Ryusuke Konishi on Aug 21, 2010

This allows sop->dirty_inode callback function (nilfs_dirty_inode) to
handle metadata file inodes.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: allow nilfs_destroy_inode to destroy metadata file inodes · b91c9a97

Committed by Ryusuke Konishi on Aug 20, 2010

The current nilfs_destroy_inode() doesn't handle metadata file inodes
including gc inodes (dummy inodes used for garbage collection).

This allows nilfs_destroy_inode() to destroy inodes of metadata files.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: accept future revisions · 9566a7a8

Committed by Ryusuke Konishi on Aug 10, 2010

Compatibility of nilfs partitions is now managed with three feature
sets.  This changes old compatibility check with revision number so
that it can accept future revisions.

Note that we can stop supporting experimental versions of nilfs that
don't know the feature sets by incrementing NILFS_CURRENT_REV.  We
don't have to do it soon, but it would be a possible option whenever
the need arises.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
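
One common way to combine a revision floor with three feature sets is sketched below. The message above does not spell out the exact rules, so this is an assumption modeled on the usual compat / read-only-compat / incompat split. Every name and constant here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MIN_SUPP_REV    2u     /* hypothetical minimum supported revision */
#define SK_KNOWN_INCOMPAT  0x0ull /* hypothetical: no incompat features known */
#define SK_KNOWN_COMPAT_RO 0x1ull /* hypothetical: one ro-compat feature known */

enum mount_decision { MOUNT_RW, MOUNT_RO_ONLY, MOUNT_REJECT };

struct sb_sketch {
    uint32_t rev_level;
    uint64_t feature_compat;    /* unknown bits here are always harmless */
    uint64_t feature_compat_ro; /* unknown bits allow read-only mounts */
    uint64_t feature_incompat;  /* unknown bits forbid mounting */
};

enum mount_decision check_compat(const struct sb_sketch *sb)
{
    assert(sb != NULL);
    /* Only a lower bound is checked, so future revisions are accepted. */
    if (sb->rev_level < SK_MIN_SUPP_REV)
        return MOUNT_REJECT;
    if (sb->feature_incompat & ~SK_KNOWN_INCOMPAT)
        return MOUNT_REJECT;
    if (sb->feature_compat_ro & ~SK_KNOWN_COMPAT_RO)
        return MOUNT_RO_ONLY;
    return MOUNT_RW;
}
```

Under this scheme, compatibility with newer on-disk formats is decided by the feature bits rather than by an exact revision match.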

05 Oct, 2010 2 commits

BKL: Remove BKL from NILFS2 · d6d4c19c

Committed by Jan Blunck on Feb 24, 2010

The BKL is only used in put_super, fill_super and remount_fs, all three
of which are protected by the superblock's s_umount rw_semaphore.
Therefore it is safe to remove the BKL entirely.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

BKL: Explicitly add BKL around get_sb/fill_super · db719222

Committed by Jan Blunck on Aug 15, 2010

This patch is a preparation necessary to remove the BKL from do_new_mount().
It explicitly adds calls to lock_kernel()/unlock_kernel() around
get_sb/fill_super operations for filesystems that still use the BKL.

I've read through all the code formerly covered by the BKL inside
do_kern_mount() and have satisfied myself that it doesn't need the BKL
any more.

do_kern_mount() is already called without the BKL when mounting the rootfs
and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
afs_mntpt_follow_link(). Both of the latter functions are actually the
filesystem's follow_link inode operation. vfs_kern_mount() calls the
specified get_sb function and lets the filesystem do its job by calling
the given fill_super function.

Therefore I think it is safe to push down the BKL from the VFS to the
low-level filesystems' get_sb/fill_super operations.

[arnd: do not add the BKL to those file systems that already
       don't use it elsewhere]
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Christoph Hellwig <hch@infradead.org>

30 Aug, 2010 1 commit

nilfs2: fix leak of shadow dat inode in error path of load_nilfs · 4afc3134

Committed by Ryusuke Konishi on Aug 29, 2010

If load_nilfs() gets an error while doing recovery, it will fail to
free the shadow inode of dat (nilfs->ns_gc_dat).

This fixes the leak issue.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

18 Aug, 2010 2 commits

nilfs2: wait for discard to finish · 1cb0c924

Committed by Ryusuke Konishi on Aug 18, 2010

nilfs_discard_segment() doesn't wait for completion of discard
requests.  This specifies BLKDEV_IFL_WAIT flag when calling
blkdev_issue_discard() in order to fix the sync failure.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Christoph Hellwig <hch@lst.de>

kill BH_Ordered flag · 87e99511

Committed by Christoph Hellwig on Aug 11, 2010

Instead of abusing a buffer_head flag just add a variant of
sync_dirty_buffer which allows passing the exact type of write
flag required.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

16 Aug, 2010 2 commits

nilfs2: fix false warning saying one of two super blocks is broken · ea1a16f7

Committed by Ryusuke Konishi on Aug 15, 2010

After applying commit b2ac86e1, the following message appeared
after an unclean shutdown:

> NILFS warning: broken superblock. using spare superblock.

This turns out to be a false message due to the change which updates
two super blocks alternately.  The secondary super block now can be
selected if it's newer than the primary one.

This kills the false warning by suppressing it if another super block
is not actually broken.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: fix list corruption after ifile creation failure · af4e3631

Committed by Ryusuke Konishi on Aug 13, 2010

If nilfs_attach_checkpoint() gets a memory allocation failure during
creation of ifile, it will return without removing nilfs_sb_info
struct from ns_supers list.  When a concurrently mounted snapshot is
unmounted or another new snapshot is mounted after that, this causes
kernel oops as below:

> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<f83662ff>] nilfs_find_sbinfo+0x74/0xa4 [nilfs2]
> *pde = 00000000
> Oops: 0000 [#1] SMP
<snip>
> Call Trace:
>  [<f835dc29>] ? nilfs_get_sb+0x165/0x532 [nilfs2]
>  [<c1173c87>] ? ida_get_new_above+0x16d/0x187
>  [<c109a7f8>] ? alloc_vfsmnt+0x7e/0x10a
>  [<c1070790>] ? kstrdup+0x2c/0x40
>  [<c1089041>] ? vfs_kern_mount+0x96/0x14e
>  [<c108913d>] ? do_kern_mount+0x32/0xbd
>  [<c109b331>] ? do_mount+0x642/0x6a1
>  [<c101a415>] ? do_page_fault+0x0/0x2d1
>  [<c1099c00>] ? copy_mount_options+0x80/0xe2
>  [<c10705d8>] ? strndup_user+0x48/0x67
>  [<c109b3f1>] ? sys_mount+0x61/0x90
>  [<c10027cc>] ? sysenter_do_call+0x12/0x22

This fixes the problem.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: stable@kernel.org

10 Aug, 2010 7 commits

convert nilfs2 to ->evict_inode() · 6fd1e5c9

Committed by Al Viro on Jun 07, 2010

[folded build fix from sfr]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

simplify checks for I_CLEAR/I_FREEING · a4ffdde6

Committed by Al Viro on Jun 02, 2010

add I_CLEAR instead of replacing I_FREEING with it.  I_CLEAR is
equivalent to I_FREEING for almost all code looking at either;
it's there to keep track of having called clear_inode() exactly
once per inode lifetime, at some point after having set I_FREEING.
I_CLEAR and I_FREEING never get set at the same time with the
current code, so we can switch to setting i_flags to I_FREEING | I_CLEAR
instead of I_CLEAR without loss of information.  As the result of
such change, checks become simpler and the amount of code that needs
to know about I_CLEAR shrinks a lot.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
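
The effect on callers can be shown with a small sketch. The flag values and function names are hypothetical stand-ins for the kernel's definitions. The point is only that once the cleared state also carries the freeing bit, a check for "this inode is going away" needs to test one flag instead of two.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_I_FREEING 0x1u /* hypothetical values */
#define SK_I_CLEAR   0x2u

/* Old scheme: clearing replaces FREEING with CLEAR. */
unsigned mark_cleared_old(unsigned state)
{
    assert(state & SK_I_FREEING);
    return SK_I_CLEAR;
}

/* New scheme: CLEAR is added on top of FREEING. */
unsigned mark_cleared_new(unsigned state)
{
    assert(state & SK_I_FREEING);
    return SK_I_FREEING | SK_I_CLEAR;
}

/* Old check: callers had to know about both flags. */
bool going_away_old(unsigned state)
{
    return (state & (SK_I_FREEING | SK_I_CLEAR)) != 0;
}

/* New check: FREEING alone is enough. */
bool going_away_new(unsigned state)
{
    return (state & SK_I_FREEING) != 0;
}
```

No information is lost, because the CLEAR bit still records that clear_inode() has run.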

remove inode_setattr · 1025774c

Committed by Christoph Hellwig on Jun 04, 2010

Replace inode_setattr with opencoded variants of it in all callers.  This
moves the remaining call to vmtruncate into the filesystem methods where it
can be replaced with the proper truncate sequence.

In a few cases it was obvious that we would never end up calling vmtruncate
so it was left out in the opencoded variant:

 spufs: explicitly checks for ATTR_SIZE earlier
 btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier
 ufs: contains an opencoded simple_setattr + truncate that sets the filesize just above

In addition to that, ncpfs called inode_setattr with handcrafted iattrs,
which allowed trimming down the opencoded variant.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

get rid of block_write_begin_newtrunc · 155130a4

Committed by Christoph Hellwig on Jun 04, 2010

Move the call to vmtruncate to get rid of excessive blocks to the callers
in preparation for the new truncate sequence and rename the non-truncating
version to block_write_begin.

While we're at it also remove several unused arguments to block_write_begin.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

introduce __block_write_begin · 6e1db88d

Committed by Christoph Hellwig on Jun 04, 2010

Split up the block_write_begin implementation - __block_write_begin is a new
trivial wrapper for block_prepare_write that always takes an already
allocated page and can be either called from block_write_begin or filesystem
code that already has a page allocated.  Remove the handling of already
allocated pages from block_write_begin after switching all callers that
do it to __block_write_begin.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

clean up write_begin usage for directories in pagecache · f4e420dc

Committed by Christoph Hellwig on Jun 04, 2010

For filesystems that implement directories in pagecache we call
block_write_begin with an already allocated page for this code, while the
normal regular file write path uses the default block_write_begin behaviour.

Get rid of the __foofs_write_begin helper and opencode the normal write_begin
call in foofs_write_begin, while adding a new foofs_prepare_chunk helper for
the directory code. The added benefit is that foofs_prepare_chunk has
a much saner calling convention.

Note that the interruptible flag passed into block_write_begin is always
ignored if we already pass in a page (see next patch for details), and
we were never doing truncations of excessive blocks for this case either so
we can switch directly to block_write_begin_newtrunc.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

sort out blockdev_direct_IO variants · eafdc7d1

Committed by Christoph Hellwig on Jun 04, 2010

Move the call to vmtruncate to get rid of excessive blocks to the callers
in preparation for the new truncate calling sequence. This was only done
for DIO_LOCKING filesystems, so the __blockdev_direct_IO_newtrunc variant
was not needed anyway. Get rid of blockdev_direct_IO_no_locking and
its _newtrunc variant while at it as just opencoding the two additional
parameters is shorter than the name suffix.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

08 Aug, 2010 1 commit

block: unify flags for struct bio and struct request · 7b6d91da

Committed by Christoph Hellwig on Aug 07, 2010

Remove the current bio flags and reuse the request flags for the bio, too.
This makes it easier to trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superfluous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>