- 01 10月, 2013 1 次提交
-
-
由 Vyacheslav Dubeyko 提交于
Many NILFS2 users were reported about strange file system corruption (for example): NILFS: bad btree node (blocknr=185027): level = 0, flags = 0x0, nchildren = 768 NILFS error (device sda4): nilfs_bmap_last_key: broken bmap (inode number=11540) But such error messages are consequence of file system's issue that takes place more earlier. Fortunately, Jerome Poulin <jeromepoulin@gmail.com> and Anton Eliasson <devel@antoneliasson.se> were reported about another issue not so recently. These reports describe the issue with segctor thread's crash: BUG: unable to handle kernel paging request at 0000000000004c83 IP: nilfs_end_page_io+0x12/0xd0 [nilfs2] Call Trace: nilfs_segctor_do_construct+0xf25/0x1b20 [nilfs2] nilfs_segctor_construct+0x17b/0x290 [nilfs2] nilfs_segctor_thread+0x122/0x3b0 [nilfs2] kthread+0xc0/0xd0 ret_from_fork+0x7c/0xb0 These two issues have one reason. This reason can raise third issue too. Third issue results in hanging of segctor thread with eating of 100% CPU. REPRODUCING PATH: One of the possible way or the issue reproducing was described by Jermoe me Poulin <jeromepoulin@gmail.com>: 1. init S to get to single user mode. 2. sysrq+E to make sure only my shell is running 3. start network-manager to get my wifi connection up 4. login as root and launch "screen" 5. cd /boot/log/nilfs which is a ext3 mount point and can log when NILFS dies. 6. lscp | xz -9e > lscp.txt.xz 7. mount my snapshot using mount -o cp=3360839,ro /dev/vgUbuntu/root /mnt/nilfs 8. start a screen to dump /proc/kmsg to text file since rsyslog is killed 9. start a screen and launch strace -f -o find-cat.log -t find /mnt/nilfs -type f -exec cat {} > /dev/null \; 10. start a screen and launch strace -f -o apt-get.log -t apt-get update 11. launch the last command again as it did not crash the first time 12. apt-get crashes 13. ps aux > ps-aux-crashed.log 13. sysrq+W 14. sysrq+E wait for everything to terminate 15. sysrq+SUSB Simplified way of the issue reproducing is starting kernel compilation task and "apt-get update" in parallel. REPRODUCIBILITY: The issue is reproduced not stable [60% - 80%]. It is very important to have proper environment for the issue reproducing. The critical conditions for successful reproducing: (1) It should have big modified file by mmap() way. (2) This file should have the count of dirty blocks are greater that several segments in size (for example, two or three) from time to time during processing. (3) It should be intensive background activity of files modification in another thread. INVESTIGATION: First of all, it is possible to see that the reason of crash is not valid page address: NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82 NILFS [nilfs_segctor_complete_write]:2101 segbuf->sb_segnum 6783 Moreover, value of b_page (0x1a82) is 6786. This value looks like segment number. And b_blocknr with b_size values look like block numbers. So, buffer_head's pointer points on not proper address value. Detailed investigation of the issue is discovered such picture: [-----------------------------SEGMENT 6783-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111149024, segbuf->sb_segnum 6783 [-----------------------------SEGMENT 6784-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff8802174a6798, bh->b_assoc_buffers.prev ffff880221cffee8 NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6784 NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50 NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111150080, segbuf->sb_segnum 6784, segbuf->sb_nbio 0 [----------] ditto NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111164416, segbuf->sb_segnum 6784, segbuf->sb_nbio 15 [-----------------------------SEGMENT 6785-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff880219277e80, bh->b_assoc_buffers.prev ffff880221cffc88 NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6785 NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8 NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111165440, segbuf->sb_segnum 6785, segbuf->sb_nbio 0 [----------] ditto NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111177728, segbuf->sb_segnum 6785, segbuf->sb_nbio 12 NILFS [nilfs_segctor_do_construct]:2399 nilfs_segctor_wait NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6783 NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6784 NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6785 NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82 BUG: unable to handle kernel paging request at 0000000000001a82 IP: [<ffffffffa024d0f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2] Usually, for every segment we collect dirty files in list. Then, dirty blocks are gathered for every dirty file, prepared for write and submitted by means of nilfs_segbuf_submit_bh() call. Finally, it takes place complete write phase after calling nilfs_end_bio_write() on the block layer. Buffers/pages are marked as not dirty on final phase and processed files removed from the list of dirty files. It is possible to see that we had three prepare_write and submit_bio phases before segbuf_wait and complete_write phase. Moreover, segments compete between each other for dirty blocks because on every iteration of segments processing dirty buffer_heads are added in several lists of payload_buffers: [SEGMENT 6784]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50 [SEGMENT 6785]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8 The next pointer is the same but prev pointer has changed. It means that buffer_head has next pointer from one list but prev pointer from another. Such modification can be made several times. And, finally, it can be resulted in various issues: (1) segctor hanging, (2) segctor crashing, (3) file system metadata corruption. FIX: This patch adds: (1) setting of BH_Async_Write flag in nilfs_segctor_prepare_write() for every proccessed dirty block; (2) checking of BH_Async_Write flag in nilfs_lookup_dirty_data_buffers() and nilfs_lookup_dirty_node_buffers(); (3) clearing of BH_Async_Write flag in nilfs_segctor_complete_write(), nilfs_abort_logs(), nilfs_forget_buffer(), nilfs_clear_dirty_page(). Reported-by: NJerome Poulin <jeromepoulin@gmail.com> Reported-by: NAnton Eliasson <devel@antoneliasson.se> Cc: Paul Fertser <fercerpav@gmail.com> Cc: ARAI Shun-ichi <hermes@ceres.dti.ne.jp> Cc: Piotr Szymaniak <szarpaj@grubelek.pl> Cc: Juan Barry Manuel Canham <Linux@riotingpacifist.net> Cc: Zahid Chowdhury <zahid.chowdhury@starsolutions.com> Cc: Elmer Zhang <freeboy6716@gmail.com> Cc: Kenneth Langga <klangga@gmail.com> Signed-off-by: NVyacheslav Dubeyko <slava@dubeyko.com> Acked-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 7月, 2013 1 次提交
-
-
由 Vyacheslav Dubeyko 提交于
The cp_inodes_count and cp_blocks_count are represented as __le64 type in on-disk structure (struct nilfs_checkpoint). But analogous fields in in-core structure (struct nilfs_root) are represented by atomic_t type. This patch replaces atomic_t on atomic64_t type in representation of inodes_count and blocks_count fields in struct nilfs_root. Signed-off-by: NVyacheslav Dubeyko <slava@dubeyko.com> Acked-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: NJoern Engel <joern@logfs.org> Cc: Clemens Eisserer <linuxhippy@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 7月, 2012 1 次提交
-
-
由 Jan Kara 提交于
We change nilfs_page_mkwrite() to provide proper freeze protection for writeable page faults (we must wait for frozen filesystem even if the page is fully mapped). We remove all vfs_check_frozen() checks since they are now handled by the generic code. CC: linux-nilfs@vger.kernel.org CC: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 21 6月, 2012 1 次提交
-
-
由 Ryusuke Konishi 提交于
A gc-inode is a pseudo inode used to buffer the blocks to be moved by garbage collection. Block caches of gc-inodes must be cleared every time a garbage collection function (nilfs_clean_segments) completes. Otherwise, stale blocks buffered in the caches may be wrongly reused in successive calls of the GC function. For user files, this is not a problem because their gc-inodes are distinguished by a checkpoint number as well as an inode number. They never buffer different blocks if either an inode number, a checkpoint number, or a block offset differs. However, gc-inodes of sufile, cpfile and DAT file can store different data for the same block offset. Thus, the nilfs_clean_segments function can move incorrect block for these meta-data files if an old block is cached. I found this is really causing meta-data corruption in nilfs. This fixes the issue by ensuring cache clear of gc-inodes and resolves reported GC problems including checkpoint file corruption, b-tree corruption, and the following warning during GC. nilfs_palloc_freev: entry number 307234 already freed. ... Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> [2.6.37+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 11月, 2011 1 次提交
-
-
由 Tejun Heo 提交于
There is no reason to export two functions for entering the refrigerator. Calling refrigerator() instead of try_to_freeze() doesn't save anything noticeable or removes any race condition. * Rename refrigerator() to __refrigerator() and make it return bool indicating whether it scheduled out for freezing. * Update try_to_freeze() to return bool and relay the return value of __refrigerator() if freezing(). * Convert all refrigerator() users to try_to_freeze(). * Update documentation accordingly. * While at it, add might_sleep() to try_to_freeze(). Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org>
-
- 11 6月, 2011 1 次提交
-
-
由 Ryusuke Konishi 提交于
Checkpoint generation interval of nilfs goes wrong after user has changed the interval parameter with nilfs-tune tool. segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds segctord starting. Construction interval = 0 seconds, CP frequency < 30 seconds This turned out to be caused by a trivial bug in initialization code of log writer. This will fix it. Reported-by: NAndrea Gelmini <andrea.gelmini@gmail.com> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 10 5月, 2011 7 次提交
-
-
由 Ryusuke Konishi 提交于
This replaces nilfs_mdt_mark_buffer_dirty and nilfs_btnode_mark_dirty macros with mark_buffer_dirty and gets rid of nilfs_mark_buffer_dirty, an own mark buffer dirty function. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
In the current nilfs, page cache for btree nodes and meta data files do not set a valid back pointer to the host inode in mapping->host. This will change it so that every address space in nilfs uses mapping->host to hold its host inode. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This uses list_first_entry macro instead of list_entry if it's used to get the first entry. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
The super root block is newly-allocated each time it is written back to disk, so unused portion of the block should be cleared. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
The size of super root structure depends on inode size, so NILFS_SR_BYTES macro should be a function of the inode size. This fixes the issue. Even though a different size value will be written for a possible future filesystem with extended inode, but fortunately this does not break disk format compatibility. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Previously, nilfs was cloning pages for mmapped region to freeze their data and ensure consistency of checksum during writeback cycles. A private page allocator was used for this page cloning. But, we no longer need to do that since clear_page_dirty_for_io function sets up pte so that vm_ops->page_mkwrite function is called right before the mmapped pages are modified and nilfs_page_mkwrite function can safely wait for the pages to be written back to disk. So, this stops making a copy of mmapped pages during writeback, and eliminates the private page allocation and deallocation functions from nilfs. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Nicolas Kaiser 提交于
Merge list_del() + list_add_tail() to list_move_tail(). Signed-off-by: NNicolas Kaiser <nikai@nikai.net> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 09 3月, 2011 7 次提交
-
-
由 Ryusuke Konishi 提交于
This directly uses sb->s_fs_info to keep a nilfs filesystem object and fully removes the intermediate nilfs_sb_info structure. With this change, the hierarchy of on-memory structures of nilfs will be simplified as follows: Before: super_block -> nilfs_sb_info -> the_nilfs -> cptree --+-> nilfs_root (current file system) +-> nilfs_root (snapshot A) +-> nilfs_root (snapshot B) : -> nilfs_sc_info (log writer structure) After: super_block -> the_nilfs -> cptree --+-> nilfs_root (current file system) +-> nilfs_root (snapshot A) +-> nilfs_root (snapshot B) : -> nilfs_sc_info (log writer structure) The reason why we didn't design so from the beginning is because the initial shape also differed from the above. The early hierachy was composed of "per-mount-point" super_block -> nilfs_sb_info pairs and a shared nilfs object. On the kernel 2.6.37, it was changed to the current shape in order to unify super block instances into one per device, and this cleanup became applicable as the result. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This replaces sbi uses with direct reference to sb instance. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Removes sci->sc_sbi which is a back pointer to nilfs_sb_info struct from log writer object (nilfs_sc_info). Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Log writer is held by the nilfs_sb_info structure. This moves it into nilfs object and replaces all uses of NILFS_SC() accessor. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Moves s_inode_lock spinlock and s_dirty_files list to nilfs object from nilfs_sb_info structure. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This moves four parameter variables on nilfs_sb_info s_resuid, s_resgid, s_interval and s_watermark to the nilfs object. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This moves mount_opt local variable to nilfs object from nilfs_sb_info struct. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 02 3月, 2011 1 次提交
-
-
由 Ryusuke Konishi 提交于
According to the report from Jiro SEKIBA titled "regression in 2.6.37?" (Message-Id: <8739n8vs1f.wl%jir@sekiba.com>), on 2.6.37 and later kernels, lscp command no longer displays "i" flag on checkpoints that snapshot operations or garbage collection created. This is a regression of nilfs2 checkpointing function, and it's critical since it broke behavior of a part of nilfs2 applications. For instance, snapshot manager of TimeBrowse gets to create meaningless snapshots continuously; snapshot creation triggers another checkpoint, but applications cannot distinguish whether the new checkpoint contains meaningful changes or not without the i-flag. This patch fixes the regression and brings that application behavior back to normal. Reported-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: NJiro SEKIBA <jir@unicus.jp> Cc: stable <stable@kernel.org> [2.6.37]
-
- 10 1月, 2011 3 次提交
-
-
由 Ryusuke Konishi 提交于
nilfs_dat_inode function was a wrapper to switch between normal dat inode and gcdat, a clone of the dat inode for garbage collection. This function got obsolete when the gcdat inode was removed, and now we can access the dat inode directly from a nilfs object. So, we will unfold the wrapper and remove it. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Nilfs does not allocate new blocks on disk until they are actually written to. To implement fiemap, we need to deal with such blocks. To allow successive fiemap patch to distinguish mapped but unallocated regions, this marks buffer heads of those new blocks as delayed and clears the flag after the blocks are written to disk. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Some functions using nilfs bmap routines can wrongly return invalid argument error (i.e. -EINVAL) that bmap returns as an internal code for btree corruption. This fixes the issue by catching and converting the internal EINVAL to EIO and calling nilfs_error function inside bmap routines. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 27 10月, 2010 1 次提交
-
-
由 Michael Rubin 提交于
To help developers and applications gain visibility into writeback behaviour this patch adds two counters to /proc/vmstat. # grep nr_dirtied /proc/vmstat nr_dirtied 3747 # grep nr_written /proc/vmstat nr_written 3618 These entries allow user apps to understand writeback behaviour over time and learn how it is impacting their performance. Currently there is no way to inspect dirty and writeback speed over time. It's not possible for nr_dirty/nr_writeback. These entries are necessary to give visibility into writeback behaviour. We have /proc/diskstats which lets us understand the io in the block layer. We have blktrace for more in depth understanding. We have e2fsprogs and debugsfs to give insight into the file systems behaviour, but we don't offer our users the ability understand what writeback is doing. There is no way to know how active it is over the whole system, if it's falling behind or to quantify it's efforts. With these values exported users can easily see how much data applications are sending through writeback and also at what rates writeback is processing this data. Comparing the rates of change between the two allow developers to see when writeback is not able to keep up with incoming traffic and the rate of dirty memory being sent to the IO back end. This allows folks to understand their io workloads and track kernel issues. Non kernel engineers at Google often use these counters to solve puzzling performance problems. Patch #4 adds a pernode vmstat file with nr_dirtied and nr_written Patch #5 add writeback thresholds to /proc/vmstat Currently these values are in debugfs. But they should be promoted to /proc since they are useful for developers who are writing databases and file servers and are not debugging the kernel. The output is as below: # grep threshold /proc/vmstat nr_pages_dirty_threshold 409111 nr_pages_dirty_background_threshold 818223 This patch: This allows code outside of the mm core to safely manipulate page writeback state and not worry about the other accounting. Not using these routines means that some code will lose track of the accounting and we get bugs. Modify nilfs2 to use interface. Signed-off-by: NMichael Rubin <mrubin@google.com> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NWu Fengguang <fengguang.wu@intel.com> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Jiro SEKIBA <jir@unicus.jp> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 10月, 2010 9 次提交
-
-
由 Jiro SEKIBA 提交于
insert sparse annotations to fix following sparse warning. fs/nilfs2/segment.c:2681:3: warning: context imbalance in 'nilfs_segctor_kill_thread' - unexpected unlock nilfs_segctor_kill_thread is only called inside sc_state_lock lock. sparse doesn't detect the context and warn "unexpected unlock". __acquires/__releases pretend to lock/unlock the sc_state_lock for sparse. Signed-off-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
由 Ryusuke Konishi 提交于
Nilfs hasn't supported the freeze/thaw feature because it didn't work due to the peculiar design that multiple super block instances could be allocated for a device. This limitation was removed by the patch "nilfs2: do not allocate multiple super block instances for a device". So now this adds the freeze/thaw support to nilfs. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Nilfs object holds a back pointer to a writable super block instance in nilfs->ns_writer, and this became eliminable since sb is now made per device and all inodes have a valid pointer to it. This deletes the ns_writer pointer and a reader/writer semaphore protecting it. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This applies prepared rollback function and redirect function of metadata file to DAT file, and eliminates GCDAT inode. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
During garbage collection (GC), DAT file, which converts virtual block number to real block number, may return disk block number that is not yet written to the device. To avoid access to unwritten blocks, the current implementation stores changes to the caches of GCDAT during GC and atomically commit the changes into the DAT file after they are written to the device. This patch, instead, adds a function that makes a copy of specified buffer and stores it in nilfs_shadow_map, and a function to get the backup copy as needed (nilfs_mdt_freeze_buffer and nilfs_mdt_get_frozen_buffer respectively). Before DAT changes block number in an entry block, it makes a copy and redirect access to the buffer so that address conversion function (i.e. nilfs_dat_translate) refers to the old address saved in the copy. This patch gives requisites for such redirection. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This moves sbi->s_inodes_count and sbi->s_blocks_count into nilfs_root object. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This rewrites functions using ifile so that they get ifile from nilfs_root object, and will remove sbi->s_ifile. Some functions that don't know the root object are extended to receive it from caller. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This uses inode hash function that vfs provides instead of the own hash table for caching gc inodes. This finally removes the own inode hash from nilfs. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
On-memory inode structures of nilfs have a member "i_cno" which stores a checkpoint number related to the inode. For gc-inodes, this field indicates version of data each gc-inode caches for GC. Log writer temporarily uses "i_cno" to transfer the latest checkpoint number. This stops the latter use and lets only gc-inodes use it. The purpose of this patch is to allow the successive change use "i_cno" for inode lookup. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 23 7月, 2010 4 次提交
-
-
由 Ryusuke Konishi 提交于
Super blocks of nilfs are periodically overwritten in order to record the recent log position. This shortens recovery time after unclean unmount, but the current implementation performs the update even for a few blocks of change. If the filesystem gets small changes slowly and continually, super blocks may be updated excessively. This moderates the issue by skipping update of log cursor if it does not cross a segment boundary. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Jiro SEKIBA 提交于
This will sync super blocks in turns instead of syncing duplicate super blocks at the time. This will help searching valid super root when super block is written into disk before log is written, which is happen when barrier-less block devices are unmounted uncleanly. In the situation, old super block likely points to valid log. This patch introduces ns_sbwcount member to the nilfs object and adds nilfs_sb_will_flip() function; ns_sbwcount counts how many times super blocks write back to the disk. And, nilfs_sb_will_flip() decides whether flipping required or not based on the count of ns_sbwcount to sync super blocks asymmetrically. The following functions are also changed: - nilfs_prepare_super(): flips super blocks according to the argument. The argument is calculated by nilfs_sb_will_flip() function. - nilfs_cleanup_super(): sets "clean" flag to both super blocks if they point to the same checkpoint. To update both of super block information, caller of nilfs_commit_super must set the information on both super blocks. Signed-off-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
由 Jiro SEKIBA 提交于
This function checks validity of super block pointers. If first super block is invalid, it will swap the super blocks. The function should be called before any super block information updates. Caller must obtain nilfs->ns_sem. Signed-off-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
由 Ryusuke Konishi 提交于
This removes macros to test segment summary flags and redefines a few relevant macros with inline functions. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 10 5月, 2010 2 次提交
-
-
由 Ryusuke Konishi 提交于
This kills the following sparse warnings: fs/nilfs2/segment.c:567:28: warning: symbol 'nilfs_sc_file_ops' was not declared. Should it be static? fs/nilfs2/segment.c:617:28: warning: symbol 'nilfs_sc_dat_ops' was not declared. Should it be static? fs/nilfs2/segment.c:625:28: warning: symbol 'nilfs_sc_dsync_ops' was not declared. Should it be static? Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Li Hong 提交于
In nilfs_segctor_thread(), timer is a local variable allocated on stack. Its address can't be set to sci->sc_timer and passed in several procedures. It works now by chance, just because other procedures are called by nilfs_segctor_thread() directly or indirectly and the stack hasn't been deallocated yet. Signed-off-by: NLi Hong <lihong.hi@gmail.com> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-