- 10 5月, 2011 7 次提交
-
-
由 Ryusuke Konishi 提交于
This replaces nilfs_mdt_mark_buffer_dirty and nilfs_btnode_mark_dirty macros with mark_buffer_dirty and gets rid of nilfs_mark_buffer_dirty, an own mark buffer dirty function. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
In the current nilfs, page cache for btree nodes and meta data files do not set a valid back pointer to the host inode in mapping->host. This will change it so that every address space in nilfs uses mapping->host to hold its host inode. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This uses list_first_entry macro instead of list_entry if it's used to get the first entry. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
The super root block is newly-allocated each time it is written back to disk, so unused portion of the block should be cleared. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
The size of super root structure depends on inode size, so NILFS_SR_BYTES macro should be a function of the inode size. This fixes the issue. Even though a different size value will be written for a possible future filesystem with extended inode, but fortunately this does not break disk format compatibility. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Previously, nilfs was cloning pages for mmapped region to freeze their data and ensure consistency of checksum during writeback cycles. A private page allocator was used for this page cloning. But, we no longer need to do that since clear_page_dirty_for_io function sets up pte so that vm_ops->page_mkwrite function is called right before the mmapped pages are modified and nilfs_page_mkwrite function can safely wait for the pages to be written back to disk. So, this stops making a copy of mmapped pages during writeback, and eliminates the private page allocation and deallocation functions from nilfs. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Nicolas Kaiser 提交于
Merge list_del() + list_add_tail() to list_move_tail(). Signed-off-by: NNicolas Kaiser <nikai@nikai.net> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 09 3月, 2011 7 次提交
-
-
由 Ryusuke Konishi 提交于
This directly uses sb->s_fs_info to keep a nilfs filesystem object and fully removes the intermediate nilfs_sb_info structure. With this change, the hierarchy of on-memory structures of nilfs will be simplified as follows: Before: super_block -> nilfs_sb_info -> the_nilfs -> cptree --+-> nilfs_root (current file system) +-> nilfs_root (snapshot A) +-> nilfs_root (snapshot B) : -> nilfs_sc_info (log writer structure) After: super_block -> the_nilfs -> cptree --+-> nilfs_root (current file system) +-> nilfs_root (snapshot A) +-> nilfs_root (snapshot B) : -> nilfs_sc_info (log writer structure) The reason why we didn't design so from the beginning is because the initial shape also differed from the above. The early hierachy was composed of "per-mount-point" super_block -> nilfs_sb_info pairs and a shared nilfs object. On the kernel 2.6.37, it was changed to the current shape in order to unify super block instances into one per device, and this cleanup became applicable as the result. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This replaces sbi uses with direct reference to sb instance. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Removes sci->sc_sbi which is a back pointer to nilfs_sb_info struct from log writer object (nilfs_sc_info). Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Log writer is held by the nilfs_sb_info structure. This moves it into nilfs object and replaces all uses of NILFS_SC() accessor. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Moves s_inode_lock spinlock and s_dirty_files list to nilfs object from nilfs_sb_info structure. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This moves four parameter variables on nilfs_sb_info s_resuid, s_resgid, s_interval and s_watermark to the nilfs object. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This moves mount_opt local variable to nilfs object from nilfs_sb_info struct. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 02 3月, 2011 1 次提交
-
-
由 Ryusuke Konishi 提交于
According to the report from Jiro SEKIBA titled "regression in 2.6.37?" (Message-Id: <8739n8vs1f.wl%jir@sekiba.com>), on 2.6.37 and later kernels, lscp command no longer displays "i" flag on checkpoints that snapshot operations or garbage collection created. This is a regression of nilfs2 checkpointing function, and it's critical since it broke behavior of a part of nilfs2 applications. For instance, snapshot manager of TimeBrowse gets to create meaningless snapshots continuously; snapshot creation triggers another checkpoint, but applications cannot distinguish whether the new checkpoint contains meaningful changes or not without the i-flag. This patch fixes the regression and brings that application behavior back to normal. Reported-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: NJiro SEKIBA <jir@unicus.jp> Cc: stable <stable@kernel.org> [2.6.37]
-
- 10 1月, 2011 3 次提交
-
-
由 Ryusuke Konishi 提交于
nilfs_dat_inode function was a wrapper to switch between normal dat inode and gcdat, a clone of the dat inode for garbage collection. This function got obsolete when the gcdat inode was removed, and now we can access the dat inode directly from a nilfs object. So, we will unfold the wrapper and remove it. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Nilfs does not allocate new blocks on disk until they are actually written to. To implement fiemap, we need to deal with such blocks. To allow successive fiemap patch to distinguish mapped but unallocated regions, this marks buffer heads of those new blocks as delayed and clears the flag after the blocks are written to disk. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Some functions using nilfs bmap routines can wrongly return invalid argument error (i.e. -EINVAL) that bmap returns as an internal code for btree corruption. This fixes the issue by catching and converting the internal EINVAL to EIO and calling nilfs_error function inside bmap routines. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 27 10月, 2010 1 次提交
-
-
由 Michael Rubin 提交于
To help developers and applications gain visibility into writeback behaviour this patch adds two counters to /proc/vmstat. # grep nr_dirtied /proc/vmstat nr_dirtied 3747 # grep nr_written /proc/vmstat nr_written 3618 These entries allow user apps to understand writeback behaviour over time and learn how it is impacting their performance. Currently there is no way to inspect dirty and writeback speed over time. It's not possible for nr_dirty/nr_writeback. These entries are necessary to give visibility into writeback behaviour. We have /proc/diskstats which lets us understand the io in the block layer. We have blktrace for more in depth understanding. We have e2fsprogs and debugsfs to give insight into the file systems behaviour, but we don't offer our users the ability understand what writeback is doing. There is no way to know how active it is over the whole system, if it's falling behind or to quantify it's efforts. With these values exported users can easily see how much data applications are sending through writeback and also at what rates writeback is processing this data. Comparing the rates of change between the two allow developers to see when writeback is not able to keep up with incoming traffic and the rate of dirty memory being sent to the IO back end. This allows folks to understand their io workloads and track kernel issues. Non kernel engineers at Google often use these counters to solve puzzling performance problems. Patch #4 adds a pernode vmstat file with nr_dirtied and nr_written Patch #5 add writeback thresholds to /proc/vmstat Currently these values are in debugfs. But they should be promoted to /proc since they are useful for developers who are writing databases and file servers and are not debugging the kernel. The output is as below: # grep threshold /proc/vmstat nr_pages_dirty_threshold 409111 nr_pages_dirty_background_threshold 818223 This patch: This allows code outside of the mm core to safely manipulate page writeback state and not worry about the other accounting. Not using these routines means that some code will lose track of the accounting and we get bugs. Modify nilfs2 to use interface. Signed-off-by: NMichael Rubin <mrubin@google.com> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: NWu Fengguang <fengguang.wu@intel.com> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Jiro SEKIBA <jir@unicus.jp> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 10月, 2010 9 次提交
-
-
由 Jiro SEKIBA 提交于
insert sparse annotations to fix following sparse warning. fs/nilfs2/segment.c:2681:3: warning: context imbalance in 'nilfs_segctor_kill_thread' - unexpected unlock nilfs_segctor_kill_thread is only called inside sc_state_lock lock. sparse doesn't detect the context and warn "unexpected unlock". __acquires/__releases pretend to lock/unlock the sc_state_lock for sparse. Signed-off-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
由 Ryusuke Konishi 提交于
Nilfs hasn't supported the freeze/thaw feature because it didn't work due to the peculiar design that multiple super block instances could be allocated for a device. This limitation was removed by the patch "nilfs2: do not allocate multiple super block instances for a device". So now this adds the freeze/thaw support to nilfs. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
Nilfs object holds a back pointer to a writable super block instance in nilfs->ns_writer, and this became eliminable since sb is now made per device and all inodes have a valid pointer to it. This deletes the ns_writer pointer and a reader/writer semaphore protecting it. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This applies prepared rollback function and redirect function of metadata file to DAT file, and eliminates GCDAT inode. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
During garbage collection (GC), DAT file, which converts virtual block number to real block number, may return disk block number that is not yet written to the device. To avoid access to unwritten blocks, the current implementation stores changes to the caches of GCDAT during GC and atomically commit the changes into the DAT file after they are written to the device. This patch, instead, adds a function that makes a copy of specified buffer and stores it in nilfs_shadow_map, and a function to get the backup copy as needed (nilfs_mdt_freeze_buffer and nilfs_mdt_get_frozen_buffer respectively). Before DAT changes block number in an entry block, it makes a copy and redirect access to the buffer so that address conversion function (i.e. nilfs_dat_translate) refers to the old address saved in the copy. This patch gives requisites for such redirection. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This moves sbi->s_inodes_count and sbi->s_blocks_count into nilfs_root object. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This rewrites functions using ifile so that they get ifile from nilfs_root object, and will remove sbi->s_ifile. Some functions that don't know the root object are extended to receive it from caller. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This uses inode hash function that vfs provides instead of the own hash table for caching gc inodes. This finally removes the own inode hash from nilfs. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
On-memory inode structures of nilfs have a member "i_cno" which stores a checkpoint number related to the inode. For gc-inodes, this field indicates version of data each gc-inode caches for GC. Log writer temporarily uses "i_cno" to transfer the latest checkpoint number. This stops the latter use and lets only gc-inodes use it. The purpose of this patch is to allow the successive change use "i_cno" for inode lookup. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 23 7月, 2010 4 次提交
-
-
由 Ryusuke Konishi 提交于
Super blocks of nilfs are periodically overwritten in order to record the recent log position. This shortens recovery time after unclean unmount, but the current implementation performs the update even for a few blocks of change. If the filesystem gets small changes slowly and continually, super blocks may be updated excessively. This moderates the issue by skipping update of log cursor if it does not cross a segment boundary. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Jiro SEKIBA 提交于
This will sync super blocks in turns instead of syncing duplicate super blocks at the time. This will help searching valid super root when super block is written into disk before log is written, which is happen when barrier-less block devices are unmounted uncleanly. In the situation, old super block likely points to valid log. This patch introduces ns_sbwcount member to the nilfs object and adds nilfs_sb_will_flip() function; ns_sbwcount counts how many times super blocks write back to the disk. And, nilfs_sb_will_flip() decides whether flipping required or not based on the count of ns_sbwcount to sync super blocks asymmetrically. The following functions are also changed: - nilfs_prepare_super(): flips super blocks according to the argument. The argument is calculated by nilfs_sb_will_flip() function. - nilfs_cleanup_super(): sets "clean" flag to both super blocks if they point to the same checkpoint. To update both of super block information, caller of nilfs_commit_super must set the information on both super blocks. Signed-off-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
由 Jiro SEKIBA 提交于
This function checks validity of super block pointers. If first super block is invalid, it will swap the super blocks. The function should be called before any super block information updates. Caller must obtain nilfs->ns_sem. Signed-off-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
由 Ryusuke Konishi 提交于
This removes macros to test segment summary flags and redefines a few relevant macros with inline functions. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 10 5月, 2010 7 次提交
-
-
由 Ryusuke Konishi 提交于
This kills the following sparse warnings: fs/nilfs2/segment.c:567:28: warning: symbol 'nilfs_sc_file_ops' was not declared. Should it be static? fs/nilfs2/segment.c:617:28: warning: symbol 'nilfs_sc_dat_ops' was not declared. Should it be static? fs/nilfs2/segment.c:625:28: warning: symbol 'nilfs_sc_dsync_ops' was not declared. Should it be static? Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Li Hong 提交于
In nilfs_segctor_thread(), timer is a local variable allocated on stack. Its address can't be set to sci->sc_timer and passed in several procedures. It works now by chance, just because other procedures are called by nilfs_segctor_thread() directly or indirectly and the stack hasn't been deallocated yet. Signed-off-by: NLi Hong <lihong.hi@gmail.com> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
由 Li Hong 提交于
There are only two lines of code in nilfs_segctor_init(). From a logic design view, the first line 'sci->sc_seq_done = sci->sc_seq_request;' should be put in nilfs_segctor_new(). Even in nilfs_segctor_new(), this initialization is needless because sci is kzalloc-ed. So nilfs_segctor_init() is only a wrap call to nilfs_segctor_start_thread(). Signed-off-by: NLi Hong <lihong.hi@gmail.com> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
由 Ryusuke Konishi 提交于
This adds a field to record the latest checkpoint number in the nilfs_segment_summary structure. This will help to recover the latest checkpoint number from logs on disk. This field is intended for crucial cases in which super blocks have lost pointer to the latest log. Even though this will change the disk format, both backward and forward compatibility is preserved by a size field prepared in the segment summary header. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Li Hong 提交于
This cleanup patch gives several improvements: - Moving all kmem_cache_{create_destroy} calls into one place, which removes some small function calls, cleans up error check code and clarify the logic. - Mark all initial code in __init section. - Remove some very obvious comments. - Adjust some declarations. - Fix some space-tab issues. Signed-off-by: NLi Hong <lihong.hi@gmail.com> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This moves out checksum routines in log writer to segbuf.c for cleanup. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> -
由 Ryusuke Konishi 提交于
This moves a pointer to buffer storing super root block to each log buffer from nilfs_sc_info struct for simplicity. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 30 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: NTejun Heo <tj@kernel.org> Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-