提交 · 8e656fd518784b49453f60c5f78b78703bc85cb2 · openeuler / raspberrypi-kernel

23 10月, 2010 10 次提交

nilfs2: make snapshots in checkpoint tree exportable · 8e656fd5

由 Ryusuke Konishi 提交于 8月 27, 2010

The previous export operations cannot handle multiple versions of
a filesystem if they belong to the same sb instance.

This adds a new type of file handle and extends export operations so
that they can get the inode specified by a checkpoint number as well
as an inode number and a generation number.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

8e656fd5

nilfs2: set pointer to root object in inodes · 4d8d9293

由 Ryusuke Konishi 提交于 8月 25, 2010

This puts a pointer to nilfs_root object in the private part of
on-memory inode, and makes nilfs_iget function pick up the inode with
the same root object.

Non-root inodes inherit its nilfs_root object from parent inode.  That
of the root inode is allocated through nilfs_attach_checkpoint()
function.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

4d8d9293

nilfs2: add checkpoint tree to nilfs object · ba65ae47

由 Ryusuke Konishi 提交于 8月 14, 2010

To hold multiple versions of a filesystem in one sb instance, a new
on-memory structure is necessary to handle one or more checkpoints.

This adds a red-black tree of checkpoints to nilfs object, and adds
lookup and create functions for them.

Each checkpoint is represented by "nilfs_root" structure, and this
structure has rb_node to configure the rb-tree.

The nilfs_root object is identified with a checkpoint number.  For
each snapshot, a nilfs_root object is allocated and the checkpoint
number of snapshot is assigned to it.  For a regular mount
(i.e. current mode mount), NILFS_CPTREE_CURRENT_CNO constant is
assigned to the corresponding nilfs_root object.

Each nilfs_root object has an ifile inode and some counters.  These
items will displace those of nilfs_sb_info structure in successive
patches.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

ba65ae47

nilfs2: remove own inode hash used for GC · 263d90ce

由 Ryusuke Konishi 提交于 8月 20, 2010

This uses inode hash function that vfs provides instead of the own
hash table for caching gc inodes.  This finally removes the own inode
hash from nilfs.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

263d90ce

nilfs2: separate initializer of metadata file inode · 5e19a995

由 Ryusuke Konishi 提交于 8月 21, 2010

This separates a part of initialization code of metadata file inode,
and makes it available from the nilfs iget function that a later patch
will add to.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

5e19a995

nilfs2: use iget5_locked to get inode · 0e14a359

由 Ryusuke Konishi 提交于 8月 20, 2010

This uses iget5_locked instead of iget_locked so that gc cache can
look up inodes with an inode number and an optional checkpoint number.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

0e14a359

nilfs2: keep zero value in i_cno except for gc-inodes · 6c43f410

由 Ryusuke Konishi 提交于 8月 20, 2010

On-memory inode structures of nilfs have a member "i_cno" which stores
a checkpoint number related to the inode.  For gc-inodes, this field
indicates version of data each gc-inode caches for GC.  Log writer
temporarily uses "i_cno" to transfer the latest checkpoint number.

This stops the latter use and lets only gc-inodes use it.

The purpose of this patch is to allow the successive change use
"i_cno" for inode lookup.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

6c43f410

nilfs2: allow nilfs_dirty_inode to mark metadata file inodes dirty · 7d6cd92f

由 Ryusuke Konishi 提交于 8月 21, 2010

This allows sop->dirty_inode callback function (nilfs_dirty_inode) to
handle metadata file inodes.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

7d6cd92f

nilfs2: allow nilfs_destroy_inode to destroy metadata file inodes · b91c9a97

由 Ryusuke Konishi 提交于 8月 20, 2010

The current nilfs_destroy_inode() doesn't handle metadata file inodes
including gc inodes (dummy inodes used for garbage collection).

This allows nilfs_destroy_inode() to destroy inodes of metadata files.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

b91c9a97

nilfs2: accept future revisions · 9566a7a8

由 Ryusuke Konishi 提交于 8月 10, 2010

Compatibility of nilfs partitions is now managed with three feature
sets.  This changes old compatibility check with revision number so
that it can accept future revisions.

Note that we can stop support of experimental versions of nilfs that
doesn't know the feature sets by incrementing NILFS_CURRENT_REV.  We
don't have to do it soon, but it would be a possible option whenever
the need arises.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

9566a7a8

05 10月, 2010 2 次提交

BKL: Remove BKL from NILFS2 · d6d4c19c

由 Jan Blunck 提交于 2月 24, 2010

The BKL is only used in put_super, fill_super and remount_fs that are all
three protected by the superblocks s_umount rw_semaphore. Therefore it is
safe to remove the BKL entirely.
Signed-off-by: NJan Blunck <jblunck@infradead.org>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>

d6d4c19c

BKL: Explicitly add BKL around get_sb/fill_super · db719222

由 Jan Blunck 提交于 8月 15, 2010

This patch is a preparation necessary to remove the BKL from do_new_mount().
It explicitly adds calls to lock_kernel()/unlock_kernel() around
get_sb/fill_super operations for filesystems that still uses the BKL.

I've read through all the code formerly covered by the BKL inside
do_kern_mount() and have satisfied myself that it doesn't need the BKL
any more.

do_kern_mount() is already called without the BKL when mounting the rootfs
and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
afs_mntpt_follow_link(). Both later functions are actually the filesystems
follow_link inode operation. vfs_kern_mount() is calling the specified
get_sb function and lets the filesystem do its job by calling the given
fill_super function.

Therefore I think it is safe to push down the BKL from the VFS to the
low-level filesystems get_sb/fill_super operation.

[arnd: do not add the BKL to those file systems that already
       don't use it elsewhere]
Signed-off-by: NJan Blunck <jblunck@infradead.org>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Christoph Hellwig <hch@infradead.org>

db719222

30 8月, 2010 1 次提交

nilfs2: fix leak of shadow dat inode in error path of load_nilfs · 4afc3134

由 Ryusuke Konishi 提交于 8月 29, 2010

If load_nilfs() gets an error while doing recovery, it will fail to
free the shadow inode of dat (nilfs->ns_gc_dat).

This fixes the leak issue.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

4afc3134

18 8月, 2010 2 次提交

nilfs2: wait for discard to finish · 1cb0c924

由 Ryusuke Konishi 提交于 8月 18, 2010

nilfs_discard_segment() doesn't wait for completion of discard
requests.  This specifies BLKDEV_IFL_WAIT flag when calling
blkdev_issue_discard() in order to fix the sync failure.
Reported-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Christoph Hellwig <hch@lst.de>

1cb0c924

kill BH_Ordered flag · 87e99511

由 Christoph Hellwig 提交于 8月 11, 2010

Instead of abusing a buffer_head flag just add a variant of
sync_dirty_buffer which allows passing the exact type of write
flag required.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

87e99511

16 8月, 2010 2 次提交

nilfs2: fix false warning saying one of two super blocks is broken · ea1a16f7

由 Ryusuke Konishi 提交于 8月 15, 2010

After applying commit b2ac86e1, the following message got appeared
after unclean shutdown:

> NILFS warning: broken superblock. using spare superblock.

This turns out to be a false message due to the change which updates
two super blocks alternately.  The secondary super block now can be
selected if it's newer than the primary one.

This kills the false warning by suppressing it if another super block
is not actually broken.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

ea1a16f7

nilfs2: fix list corruption after ifile creation failure · af4e3631

由 Ryusuke Konishi 提交于 8月 13, 2010

If nilfs_attach_checkpoint() gets a memory allocation failure during
creation of ifile, it will return without removing nilfs_sb_info
struct from ns_supers list.  When a concurrently mounted snapshot is
unmounted or another new snapshot is mounted after that, this causes
kernel oops as below:

> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<f83662ff>] nilfs_find_sbinfo+0x74/0xa4 [nilfs2]
> *pde = 00000000
> Oops: 0000 [#1] SMP
<snip>
> Call Trace:
>  [<f835dc29>] ? nilfs_get_sb+0x165/0x532 [nilfs2]
>  [<c1173c87>] ? ida_get_new_above+0x16d/0x187
>  [<c109a7f8>] ? alloc_vfsmnt+0x7e/0x10a
>  [<c1070790>] ? kstrdup+0x2c/0x40
>  [<c1089041>] ? vfs_kern_mount+0x96/0x14e
>  [<c108913d>] ? do_kern_mount+0x32/0xbd
>  [<c109b331>] ? do_mount+0x642/0x6a1
>  [<c101a415>] ? do_page_fault+0x0/0x2d1
>  [<c1099c00>] ? copy_mount_options+0x80/0xe2
>  [<c10705d8>] ? strndup_user+0x48/0x67
>  [<c109b3f1>] ? sys_mount+0x61/0x90
>  [<c10027cc>] ? sysenter_do_call+0x12/0x22

This fixes the problem.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: stable@kernel.org

af4e3631

10 8月, 2010 7 次提交

A
convert nilfs2 to ->evict_inode() · 6fd1e5c9
由 Al Viro 提交于 6月 07, 2010
```
[folded build fix from sfr]
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
6fd1e5c9

simplify checks for I_CLEAR/I_FREEING · a4ffdde6

由 Al Viro 提交于 6月 02, 2010

add I_CLEAR instead of replacing I_FREEING with it.  I_CLEAR is
equivalent to I_FREEING for almost all code looking at either;
it's there to keep track of having called clear_inode() exactly
once per inode lifetime, at some point after having set I_FREEING.
I_CLEAR and I_FREEING never get set at the same time with the
current code, so we can switch to setting i_flags to I_FREEING | I_CLEAR
instead of I_CLEAR without loss of information.  As the result of
such change, checks become simpler and the amount of code that needs
to know about I_CLEAR shrinks a lot.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a4ffdde6

remove inode_setattr · 1025774c

由 Christoph Hellwig 提交于 6月 04, 2010

Replace inode_setattr with opencoded variants of it in all callers.  This
moves the remaining call to vmtruncate into the filesystem methods where it
can be replaced with the proper truncate sequence.

In a few cases it was obvious that we would never end up calling vmtruncate
so it was left out in the opencoded variant:

 spufs: explicitly checks for ATTR_SIZE earlier
 btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier
 ufs: contains an opencoded simple_seattr + truncate that sets the filesize just above

In addition to that ncpfs called inode_setattr with handcrafted iattrs,
which allowed to trim down the opencoded variant.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1025774c

get rid of block_write_begin_newtrunc · 155130a4

由 Christoph Hellwig 提交于 6月 04, 2010

Move the call to vmtruncate to get rid of accessive blocks to the callers
in preparation of the new truncate sequence and rename the non-truncating
version to block_write_begin.

While we're at it also remove several unused arguments to block_write_begin.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

155130a4

introduce __block_write_begin · 6e1db88d

由 Christoph Hellwig 提交于 6月 04, 2010

Split up the block_write_begin implementation - __block_write_begin is a new
trivial wrapper for block_prepare_write that always takes an already
allocated page and can be either called from block_write_begin or filesystem
code that already has a page allocated.  Remove the handling of already
allocated pages from block_write_begin after switching all callers that
do it to __block_write_begin.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

6e1db88d

clean up write_begin usage for directories in pagecache · f4e420dc

由 Christoph Hellwig 提交于 6月 04, 2010

For filesystem that implement directories in pagecache we call
block_write_begin with an already allocated page for this code, while the
normal regular file write path uses the default block_write_begin behaviour.

Get rid of the __foofs_write_begin helper and opencode the normal write_begin
call in foofs_write_begin, while adding a new foofs_prepare_chunk helper for
the directory code. The added benefit is that foofs_prepare_chunk has
a much saner calling convention.

Note that the interruptible flag passed into block_write_begin is always
ignored if we already pass in a page (see next patch for details), and
we never were doing truncations of exessive blocks for this case either so we
can switch directly to block_write_begin_newtrunc.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

f4e420dc

sort out blockdev_direct_IO variants · eafdc7d1

由 Christoph Hellwig 提交于 6月 04, 2010

Move the call to vmtruncate to get rid of accessive blocks to the callers
in prepearation of the new truncate calling sequence. This was only done
for DIO_LOCKING filesystems, so the __blockdev_direct_IO_newtrunc variant
was not needed anyway. Get rid of blockdev_direct_IO_no_locking and
its _newtrunc variant while at it as just opencoding the two additional
paramters is shorted than the name suffix.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

eafdc7d1

08 8月, 2010 1 次提交

block: unify flags for struct bio and struct request · 7b6d91da

由 Christoph Hellwig 提交于 8月 07, 2010

Remove the current bio flags and reuse the request flags for the bio, too.
This allows to more easily trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superflous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

7b6d91da

25 7月, 2010 2 次提交

nilfs2: reject filesystem with unsupported block size · 89c0fd01

由 Ryusuke Konishi 提交于 7月 25, 2010

This inserts sanity check that refuses to mount a filesystem with
unsupported block size.

Previously, kernel code of nilfs was looking only limitation of
devices though mkfs.nilfs2 limits the range of block sizes; there was
no check that prevents rec_len overflow with larger block sizes.

With this change, block sizes larger than 64KB or smaller than 1KB
will get rejected explicitly by kernel.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

89c0fd01

nilfs2: avoid rec_len overflow with 64KB block size · 6cda9fa2

由 Ryusuke Konishi 提交于 7月 25, 2010

With 64KB blocksize, a directory entry can have size 64KB which does
not fit into 16 bits we have for entry length. So this patch stores
0xffff instead and converts value when read from / written to disk.

Nilfs derives its directory implementation from ext2 filesystem, and
this draws upon the corresponding change on ext2.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

6cda9fa2

24 7月, 2010 1 次提交

nilfs2: simplify nilfs_get_page function · c28e69d9

由 Ryusuke Konishi 提交于 7月 24, 2010

Implementation of nilfs_get_page() is a bit old as below:

 - A common read_mapping_page inline function is now available instead
   of its read_cache_page use.
 - wait_on_page_locked() use in the function is eliminable since
   read_cache_page function does the same thing through wait_on_page_read().
 - PageUptodate() check is eliminable for the same reason.

This renews nilfs_get_page() based on these points.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

c28e69d9

23 7月, 2010 12 次提交

nilfs2: reject incompatible filesystem · c5ca48aa

由 Ryusuke Konishi 提交于 7月 22, 2010

This forces nilfs to check compatibility of feature flags so as to
reject a filesystem with unknown features when it mounts or remounts
the filesystem.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

c5ca48aa

nilfs2: apply read-ahead for nilfs_btree_lookup_contig · 03bdb5ac

由 Ryusuke Konishi 提交于 7月 18, 2010

This applies read-ahead to nilfs_btree_do_lookup and
nilfs_btree_lookup_contig functions and extends them to read ahead
siblings of level 1 btree nodes that hold data blocks.

At present, the read-ahead is not applied to most btree operations;
only get_block() callback function, which is used during read of
regular files or directories, receives the benefit.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

03bdb5ac

nilfs2: introduce check flag to btree node buffer · 4e13e66b

由 Ryusuke Konishi 提交于 7月 18, 2010

nilfs_btree_get_block() now may return untested buffer due to
read-ahead. This adds a new flag for buffer heads so that the btree
code can check whether the buffer is already verified or not.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

4e13e66b

nilfs2: add btree get block function with readahead option · 464ece88

由 Ryusuke Konishi 提交于 7月 18, 2010

This adds __nilfs_btree_get_block() function that can issue a series
of read-ahead requests for sibling btree nodes.

This read-ahead needs parent node block, so nilfs_btree_readahead_info
structure is added to pass the information that
__nilfs_btree_get_block() needs.

This also replaces the previous nilfs_btree_get_block() implementation
with a wrapper function of __nilfs_btree_get_block().
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

464ece88

nilfs2: add read ahead mode to nilfs_btnode_submit_block · 26dfdd8e

由 Ryusuke Konishi 提交于 7月 18, 2010

This adds mode argument to nilfs_btnode_submit_block() function and
allows it to issue a read-ahead request.

An optional submit_ptr argument is also added to store the actual
block address for which bio is sent.  submit_ptr is used for a series
of read-ahead requests, and helps to decide if each requested block is
continous to the previous one on disk.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

26dfdd8e

nilfs2: fix buffer head leak in nilfs_btnode_submit_block · f8e6cc01

由 Ryusuke Konishi 提交于 7月 15, 2010

nilfs_btnode_submit_block() refers to buffer head just before
returning from the function, but it releases the buffer head earlier
than that if nilfs_dat_translate() gets an error.

This has potential for oops in the erroneous case.  This fixes the
issue.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

f8e6cc01

nilfs2: eliminate inline keywords in btree implementation · 7c397a81

由 Ryusuke Konishi 提交于 7月 13, 2010

This removes all inline uses from btree.c.  Gcc now agressively apply
inline expansion even for the functions declared without the keyword;
the inline use in btree.c looks excessive.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

7c397a81

nilfs2: get maximum number of child nodes from bmap object · 5ad2686e

由 Ryusuke Konishi 提交于 7月 13, 2010

The patch "reduce repetitive calculation of max number of child nodes"
gathered up the calculation of maximum number of child nodes into
nilfs_btree_nchildren_per_block() function.  This makes the function
get resultant value from a private variable in bmap object instead of
calculating it for each call.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

5ad2686e

nilfs2: reduce repetitive calculation of max number of child nodes · 9b7b265c

由 Ryusuke Konishi 提交于 7月 13, 2010

The current btree implementation repeats the same calculation on the
maximum number of child nodes.  This is because a few low level
routines use the calculation for index addressing in a btree node
block.

This reduces the calculation by explicitly passing the maximum number
of child nodes (ncmax) through their argument.

This changes parameter passing of the following functions:

 - nilfs_btree_node_dptrs
 - nilfs_btree_node_get_ptr
 - nilfs_btree_node_set_ptr
 - nilfs_btree_node_init
 - nilfs_btree_node_move_left
 - nilfs_btree_node_move_right
 - nilfs_btree_node_insert
 - nilfs_btree_node_delete, and
 - nilfs_btree_get_node

The following functions are removed:

 - nilfs_btree_node_nchildren_min
 - nilfs_btree_node_nchildren_max

Most middle level btree operations are rewritten to pass a proper
ncmax value depending on whether each occurrence of node is "root" or
not.

A constant NILFS_BTREE_ROOT_NCHILDREN_MAX is used for the root node,
whereas nilfs_btree_nchildren_per_block() function is used for
non-root nodes.  If a node could be either root or a non-root node, an
output argument of nilfs_btree_get_node() is used to set up ncmax.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

9b7b265c

nilfs2: optimize calculation of min/max number of btree node children · ea64ab87

由 Ryusuke Konishi 提交于 7月 13, 2010

nilfs_btree_node_nchildren_max() and nilfs_btree_node_nchildren_min()
functions switch return value depending on whether target node is the
root or a node block.  In most uses of these functions, however, the
node type is fixed, and moreover the same calculation is repeatedly
performed in loop.

This unfold these functions depending on context and move them outside
loops wherever possible.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

ea64ab87

nilfs2: remove redundant pointer checks in bmap lookup functions · 364ec2d7

由 Ryusuke Konishi 提交于 7月 13, 2010

nilfs_bmap_lookup and its variants are supposed to take a valid
pointer argument to return a block address, thus pointer checks in
nilfs_btree_lookup and nilfs_direct_lookup are needless.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

364ec2d7

nilfs2: get rid of nilfs_bmap_union · 05d0e94b

由 Ryusuke Konishi 提交于 7月 10, 2010

This removes nilfs_bmap_union and finally unifies three structures and
the union in bmap/btree code into one.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

05d0e94b