Commit ae005cbe, author: Linus Torvalds

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (43 commits)
  ext4: fix a BUG in mb_mark_used during trim.
  ext4: unused variables cleanup in fs/ext4/extents.c
  ext4: remove redundant set_buffer_mapped() in ext4_da_get_block_prep()
  ext4: add more tracepoints and use dev_t in the trace buffer
  ext4: don't kfree uninitialized s_group_info members
  ext4: add missing space in printk's in __ext4_grp_locked_error()
  ext4: add FITRIM to compat_ioctl.
  ext4: handle errors in ext4_clear_blocks()
  ext4: unify the ext4_handle_release_buffer() api
  ext4: handle errors in ext4_rename
  jbd2: add COW fields to struct jbd2_journal_handle
  jbd2: add the b_cow_tid field to journal_head struct
  ext4: Initialize fsync transaction ids in ext4_new_inode()
  ext4: Use single thread to perform DIO unwritten convertion
  ext4: optimize ext4_bio_write_page() when no extent conversion is needed
  ext4: skip orphan cleanup if fs has unknown ROCOMPAT features
  ext4: use the nblocks arg to ext4_truncate_restart_trans()
  ext4: fix missing iput of root inode for some mount error paths
  ext4: make FIEMAP and delayed allocation play well together
  ext4: suppress verbose debugging information if malloc-debug is off
  ...

Fix up conflicts in fs/ext4/super.c due to workqueue changes
......@@ -48,7 +48,7 @@ Description:
will have its blocks allocated out of its own unique
preallocation pool.
What: /sys/fs/ext4/<disk>/inode_readahead
What: /sys/fs/ext4/<disk>/inode_readahead_blks
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
......@@ -85,7 +85,14 @@ Date: June 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Tuning parameter which (if non-zero) controls the goal
inode used by the inode allocator in p0reference to
all other allocation hueristics. This is intended for
inode used by the inode allocator in preference to
all other allocation heuristics. This is intended for
debugging use only, and should be 0 on production
systems.
What: /sys/fs/ext4/<disk>/max_writeback_mb_bump
Date: September 2009
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
The maximum number of megabytes the writeback code will
try to write out before move on to another inode.
......@@ -367,12 +367,47 @@ init_itable=n The lazy itable init code will wait n times the
minimizes the impact on the systme performance
while file system's inode table is being initialized.
discard Controls whether ext4 should issue discard/TRIM
discard Controls whether ext4 should issue discard/TRIM
nodiscard(*) commands to the underlying block device when
blocks are freed. This is useful for SSD devices
and sparse/thinly-provisioned LUNs, but it is off
by default until sufficient testing has been done.
nouid32 Disables 32-bit UIDs and GIDs. This is for
interoperability with older kernels which only
store and expect 16-bit values.
resize Allows to resize filesystem to the end of the last
existing block group, further resize has to be done
with resize2fs either online, or offline. It can be
used only with conjunction with remount.
block_validity This options allows to enables/disables the in-kernel
noblock_validity facility for tracking filesystem metadata blocks
within internal data structures. This allows multi-
block allocator and other routines to quickly locate
extents which might overlap with filesystem metadata
blocks. This option is intended for debugging
purposes and since it negatively affects the
performance, it is off by default.
dioread_lock Controls whether or not ext4 should use the DIO read
dioread_nolock locking. If the dioread_nolock option is specified
ext4 will allocate uninitialized extent before buffer
write and convert the extent to initialized after IO
completes. This approach allows ext4 code to avoid
using inode mutex, which improves scalability on high
speed storages. However this does not work with nobh
option and the mount will fail. Nor does it work with
data journaling and dioread_nolock option will be
ignored with kernel warning. Note that dioread_nolock
code path is only used for extent-based files.
Because of the restrictions this options comprises
it is off by default (e.g. dioread_lock).
i_version Enable 64-bit inode version support. This option is
off by default.
Data Mode
=========
There are 3 different data modes:
......@@ -400,6 +435,176 @@ needs to be read from and written to disk at the same time where it
outperforms all others modes. Currently ext4 does not have delayed
allocation support if this data journalling mode is selected.
/proc entries
=============
Information about mounted ext4 file systems can be found in
/proc/fs/ext4. Each mounted filesystem will have a directory in
/proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or
/proc/fs/ext4/dm-0). The files in each per-device directory are shown
in table below.
Files in /proc/fs/ext4/<devname>
..............................................................................
File Content
mb_groups details of multiblock allocator buddy cache of free blocks
..............................................................................
/sys entries
============
Information about mounted ext4 file systems can be found in
/sys/fs/ext4. Each mounted filesystem will have a directory in
/sys/fs/ext4 based on its device name (i.e., /sys/fs/ext4/hdc or
/sys/fs/ext4/dm-0). The files in each per-device directory are shown
in table below.
Files in /sys/fs/ext4/<devname>
(see also Documentation/ABI/testing/sysfs-fs-ext4)
..............................................................................
File Content
delayed_allocation_blocks This file is read-only and shows the number of
blocks that are dirty in the page cache, but
which do not have their location in the
filesystem allocated yet.
inode_goal Tuning parameter which (if non-zero) controls
the goal inode used by the inode allocator in
preference to all other allocation heuristics.
This is intended for debugging use only, and
should be 0 on production systems.
inode_readahead_blks Tuning parameter which controls the maximum
number of inode table blocks that ext4's inode
table readahead algorithm will pre-read into
the buffer cache
lifetime_write_kbytes This file is read-only and shows the number of
kilobytes of data that have been written to this
filesystem since it was created.
max_writeback_mb_bump The maximum number of megabytes the writeback
code will try to write out before move on to
another inode.
mb_group_prealloc The multiblock allocator will round up allocation
requests to a multiple of this tuning parameter if
the stripe size is not set in the ext4 superblock
mb_max_to_scan The maximum number of extents the multiblock
allocator will search to find the best extent
mb_min_to_scan The minimum number of extents the multiblock
allocator will search to find the best extent
mb_order2_req Tuning parameter which controls the minimum size
for requests (as a power of 2) where the buddy
cache is used
mb_stats Controls whether the multiblock allocator should
collect statistics, which are shown during the
unmount. 1 means to collect statistics, 0 means
not to collect statistics
mb_stream_req Files which have fewer blocks than this tunable
parameter will have their blocks allocated out
of a block group specific preallocation pool, so
that small files are packed closely together.
Each large file will have its blocks allocated
out of its own unique preallocation pool.
session_write_kbytes This file is read-only and shows the number of
kilobytes of data that have been written to this
filesystem since it was mounted.
..............................................................................
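As an illustration of how the per-device files listed above might be consumed from user space, here is a minimal sketch that reads one numeric attribute such as lifetime_write_kbytes. The helper name read_ext4_attr() and its return convention are assumptions made for this example; they are not part of the kernel or of any library.

```c
#include <stdio.h>

/*
 * Read one numeric ext4 sysfs attribute, for example
 * attr = "lifetime_write_kbytes" with devname = "dm-0".
 * Returns 0 on success, -1 if the file cannot be opened or
 * does not start with a decimal number.
 * read_ext4_attr() is a hypothetical helper for illustration only.
 */
static int read_ext4_attr(const char *devname, const char *attr,
                          unsigned long long *value)
{
    char path[256];
    FILE *f;
    int n, ok;

    /* Build /sys/fs/ext4/<devname>/<attr>, refusing truncated paths. */
    n = snprintf(path, sizeof(path), "/sys/fs/ext4/%s/%s", devname, attr);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;

    ok = (fscanf(f, "%llu", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

A caller would pass the device name as it appears under /sys/fs/ext4 (for instance "dm-0") and check the return value before using the result.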
Ioctls
======
There is some Ext4 specific functionality which can be accessed by applications
through the system call interfaces. The list of all Ext4 specific ioctls are
shown in the table below.
Table of Ext4 specific ioctls
..............................................................................
Ioctl Description
EXT4_IOC_GETFLAGS Get additional attributes associated with inode.
The ioctl argument is an integer bitfield, with
bit values described in ext4.h. This ioctl is an
alias for FS_IOC_GETFLAGS.
EXT4_IOC_SETFLAGS Set additional attributes associated with inode.
The ioctl argument is an integer bitfield, with
bit values described in ext4.h. This ioctl is an
alias for FS_IOC_SETFLAGS.
EXT4_IOC_GETVERSION
EXT4_IOC_GETVERSION_OLD
Get the inode i_generation number stored for
each inode. The i_generation number is normally
changed only when new inode is created and it is
particularly useful for network filesystems. The
'_OLD' version of this ioctl is an alias for
FS_IOC_GETVERSION.
EXT4_IOC_SETVERSION
EXT4_IOC_SETVERSION_OLD
Set the inode i_generation number stored for
each inode. The '_OLD' version of this ioctl
is an alias for FS_IOC_SETVERSION.
EXT4_IOC_GROUP_EXTEND This ioctl has the same purpose as the resize
mount option. It allows to resize filesystem
to the end of the last existing block group,
further resize has to be done with resize2fs,
either online, or offline. The argument points
to the unsigned logn number representing the
filesystem new block count.
EXT4_IOC_MOVE_EXT Move the block extents from orig_fd (the one
this ioctl is pointing to) to the donor_fd (the
one specified in move_extent structure passed
as an argument to this ioctl). Then, exchange
inode metadata between orig_fd and donor_fd.
This is especially useful for online
defragmentation, because the allocator has the
opportunity to allocate moved blocks better,
ideally into one contiguous extent.
EXT4_IOC_GROUP_ADD Add a new group descriptor to an existing or
new group descriptor block. The new group
descriptor is described by ext4_new_group_input
structure, which is passed as an argument to
this ioctl. This is especially useful in
conjunction with EXT4_IOC_GROUP_EXTEND,
which allows online resize of the filesystem
to the end of the last existing block group.
Those two ioctls combined is used in userspace
online resize tool (e.g. resize2fs).
EXT4_IOC_MIGRATE This ioctl operates on the filesystem itself.
It converts (migrates) ext3 indirect block mapped
inode to ext4 extent mapped inode by walking
through indirect block mapping of the original
inode and converting contiguous block ranges
into ext4 extents of the temporary inode. Then,
inodes are swapped. This ioctl might help, when
migrating from ext3 to ext4 filesystem, however
suggestion is to create fresh ext4 filesystem
and copy data from the backup. Note, that
filesystem has to support extents for this ioctl
to work.
EXT4_IOC_ALLOC_DA_BLKS Force all of the delay allocated blocks to be
allocated to preserve application-expected ext3
behaviour. Note that this will also start
triggering a write of the data blocks, but this
behaviour may change in the future as it is
not necessary and has been done this way only
for sake of simplicity.
..............................................................................
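The table above notes that EXT4_IOC_GETFLAGS is an alias for FS_IOC_GETFLAGS and that the argument is an integer bitfield. A minimal user-space sketch of such a call could look like the following. The wrapper name get_inode_flags() and its negative-errno return convention are assumptions of this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Fetch the inode attribute flags of 'path' into *flags.
 * Returns 0 on success or a negative errno value on failure.
 * get_inode_flags() is a hypothetical wrapper for illustration only.
 */
static int get_inode_flags(const char *path, int *flags)
{
    int err = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -errno;

    /* The kernel writes an int-sized bitfield through the pointer. */
    if (ioctl(fd, FS_IOC_GETFLAGS, flags) < 0)
        err = -errno;

    close(fd);
    return err;
}
```

On filesystems that do not implement this ioctl the call is expected to fail, so callers should treat a non-zero return as "flags unavailable" rather than as a fatal error.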
References
==========
......
......@@ -21,6 +21,8 @@
#include "ext4_jbd2.h"
#include "mballoc.h"
#include <trace/events/ext4.h>
/*
* balloc.c contains the blocks allocation and deallocation routines
*/
......@@ -342,6 +344,7 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
* We do it here so the bitmap uptodate bit
* get set with buffer lock held.
*/
trace_ext4_read_block_bitmap_load(sb, block_group);
set_bitmap_uptodate(bh);
if (bh_submit_read(bh) < 0) {
put_bh(bh);
......
......@@ -202,13 +202,6 @@ static inline int ext4_handle_has_enough_credits(handle_t *handle, int needed)
return 1;
}
static inline void ext4_journal_release_buffer(handle_t *handle,
struct buffer_head *bh)
{
if (ext4_handle_valid(handle))
jbd2_journal_release_buffer(handle, bh);
}
static inline handle_t *ext4_journal_start(struct inode *inode, int nblocks)
{
return ext4_journal_start_sb(inode->i_sb, nblocks);
......
......@@ -44,6 +44,8 @@
#include "ext4_jbd2.h"
#include "ext4_extents.h"
#include <trace/events/ext4.h>
static int ext4_ext_truncate_extend_restart(handle_t *handle,
struct inode *inode,
int needed)
......@@ -664,6 +666,8 @@ ext4_ext_find_extent(struct inode *inode, ext4_lblk_t block,
if (unlikely(!bh))
goto err;
if (!bh_uptodate_or_lock(bh)) {
trace_ext4_ext_load_extent(inode, block,
path[ppos].p_block);
if (bh_submit_read(bh) < 0) {
put_bh(bh);
goto err;
......@@ -1034,7 +1038,7 @@ static int ext4_ext_split(handle_t *handle, struct inode *inode,
for (i = 0; i < depth; i++) {
if (!ablocks[i])
continue;
ext4_free_blocks(handle, inode, 0, ablocks[i], 1,
ext4_free_blocks(handle, inode, NULL, ablocks[i], 1,
EXT4_FREE_BLOCKS_METADATA);
}
}
......@@ -2059,7 +2063,7 @@ static int ext4_ext_rm_idx(handle_t *handle, struct inode *inode,
if (err)
return err;
ext_debug("index is empty, remove it, free block %llu\n", leaf);
ext4_free_blocks(handle, inode, 0, leaf, 1,
ext4_free_blocks(handle, inode, NULL, leaf, 1,
EXT4_FREE_BLOCKS_METADATA | EXT4_FREE_BLOCKS_FORGET);
return err;
}
......@@ -2156,7 +2160,7 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
num = le32_to_cpu(ex->ee_block) + ee_len - from;
start = ext4_ext_pblock(ex) + ee_len - num;
ext_debug("free last %u blocks starting %llu\n", num, start);
ext4_free_blocks(handle, inode, 0, start, num, flags);
ext4_free_blocks(handle, inode, NULL, start, num, flags);
} else if (from == le32_to_cpu(ex->ee_block)
&& to <= le32_to_cpu(ex->ee_block) + ee_len - 1) {
printk(KERN_INFO "strange request: removal %u-%u from %u:%u\n",
......@@ -3108,14 +3112,13 @@ static int check_eofblocks_fl(handle_t *handle, struct inode *inode,
{
int i, depth;
struct ext4_extent_header *eh;
struct ext4_extent *ex, *last_ex;
struct ext4_extent *last_ex;
if (!ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS))
return 0;
depth = ext_depth(inode);
eh = path[depth].p_hdr;
ex = path[depth].p_ext;
if (unlikely(!eh->eh_entries)) {
EXT4_ERROR_INODE(inode, "eh->eh_entries == 0 and "
......@@ -3295,9 +3298,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
struct ext4_map_blocks *map, int flags)
{
struct ext4_ext_path *path = NULL;
struct ext4_extent_header *eh;
struct ext4_extent newex, *ex;
ext4_fsblk_t newblock;
ext4_fsblk_t newblock = 0;
int err = 0, depth, ret;
unsigned int allocated = 0;
struct ext4_allocation_request ar;
......@@ -3305,6 +3307,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
ext_debug("blocks %u/%u requested for inode %lu\n",
map->m_lblk, map->m_len, inode->i_ino);
trace_ext4_ext_map_blocks_enter(inode, map->m_lblk, map->m_len, flags);
/* check in cache */
if (ext4_ext_in_cache(inode, map->m_lblk, &newex)) {
......@@ -3352,7 +3355,6 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
err = -EIO;
goto out2;
}
eh = path[depth].p_hdr;
ex = path[depth].p_ext;
if (ex) {
......@@ -3485,7 +3487,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
/* not a good idea to call discard here directly,
* but otherwise we'd need to call it every free() */
ext4_discard_preallocations(inode);
ext4_free_blocks(handle, inode, 0, ext4_ext_pblock(&newex),
ext4_free_blocks(handle, inode, NULL, ext4_ext_pblock(&newex),
ext4_ext_get_actual_len(&newex), 0);
goto out2;
}
......@@ -3525,6 +3527,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
ext4_ext_drop_refs(path);
kfree(path);
}
trace_ext4_ext_map_blocks_exit(inode, map->m_lblk,
newblock, map->m_len, err ? err : allocated);
return err ? err : allocated;
}
......@@ -3658,6 +3662,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
return -EOPNOTSUPP;
trace_ext4_fallocate_enter(inode, offset, len, mode);
map.m_lblk = offset >> blkbits;
/*
* We can't just convert len to max_blocks because
......@@ -3673,6 +3678,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
ret = inode_newsize_ok(inode, (len + offset));
if (ret) {
mutex_unlock(&inode->i_mutex);
trace_ext4_fallocate_exit(inode, offset, max_blocks, ret);
return ret;
}
retry:
......@@ -3717,6 +3723,8 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
goto retry;
}
mutex_unlock(&inode->i_mutex);
trace_ext4_fallocate_exit(inode, offset, max_blocks,
ret > 0 ? ret2 : ret);
return ret > 0 ? ret2 : ret;
}
......@@ -3775,6 +3783,7 @@ int ext4_convert_unwritten_extents(struct inode *inode, loff_t offset,
}
return ret > 0 ? ret2 : ret;
}
/*
* Callback function called for each extent to gather FIEMAP information.
*/
......@@ -3782,38 +3791,162 @@ static int ext4_ext_fiemap_cb(struct inode *inode, struct ext4_ext_path *path,
struct ext4_ext_cache *newex, struct ext4_extent *ex,
void *data)
{
struct fiemap_extent_info *fieinfo = data;
unsigned char blksize_bits = inode->i_sb->s_blocksize_bits;
__u64 logical;
__u64 physical;
__u64 length;
loff_t size;
__u32 flags = 0;
int error;
int ret = 0;
struct fiemap_extent_info *fieinfo = data;
unsigned char blksize_bits;
logical = (__u64)newex->ec_block << blksize_bits;
blksize_bits = inode->i_sb->s_blocksize_bits;
logical = (__u64)newex->ec_block << blksize_bits;
if (newex->ec_start == 0) {
pgoff_t offset;
struct page *page;
/*
* No extent in extent-tree contains block @newex->ec_start,
* then the block may stay in 1)a hole or 2)delayed-extent.
*
* Holes or delayed-extents are processed as follows.
* 1. lookup dirty pages with specified range in pagecache.
* If no page is got, then there is no delayed-extent and
* return with EXT_CONTINUE.
* 2. find the 1st mapped buffer,
* 3. check if the mapped buffer is both in the request range
* and a delayed buffer. If not, there is no delayed-extent,
* then return.
* 4. a delayed-extent is found, the extent will be collected.
*/
ext4_lblk_t end = 0;
pgoff_t last_offset;
pgoff_t offset;
pgoff_t index;
struct page **pages = NULL;
struct buffer_head *bh = NULL;
struct buffer_head *head = NULL;
unsigned int nr_pages = PAGE_SIZE / sizeof(struct page *);
pages = kmalloc(PAGE_SIZE, GFP_KERNEL);
if (pages == NULL)
return -ENOMEM;
offset = logical >> PAGE_SHIFT;
page = find_get_page(inode->i_mapping, offset);
if (!page || !page_has_buffers(page))
return EXT_CONTINUE;
repeat:
last_offset = offset;
head = NULL;
ret = find_get_pages_tag(inode->i_mapping, &offset,
PAGECACHE_TAG_DIRTY, nr_pages, pages);
if (!(flags & FIEMAP_EXTENT_DELALLOC)) {
/* First time, try to find a mapped buffer. */
if (ret == 0) {
out:
for (index = 0; index < ret; index++)
page_cache_release(pages[index]);
/* just a hole. */
kfree(pages);
return EXT_CONTINUE;
}
bh = page_buffers(page);
/* Try to find the 1st mapped buffer. */
end = ((__u64)pages[0]->index << PAGE_SHIFT) >>
blksize_bits;
if (!page_has_buffers(pages[0]))
goto out;
head = page_buffers(pages[0]);
if (!head)
goto out;
if (!bh)
return EXT_CONTINUE;
bh = head;
do {
if (buffer_mapped(bh)) {
/* get the 1st mapped buffer. */
if (end > newex->ec_block +
newex->ec_len)
/* The buffer is out of
* the request range.
*/
goto out;
goto found_mapped_buffer;
}
bh = bh->b_this_page;
end++;
} while (bh != head);
if (buffer_delay(bh)) {
flags |= FIEMAP_EXTENT_DELALLOC;
page_cache_release(page);
/* No mapped buffer found. */
goto out;
} else {
page_cache_release(page);
return EXT_CONTINUE;
/*Find contiguous delayed buffers. */
if (ret > 0 && pages[0]->index == last_offset)
head = page_buffers(pages[0]);
bh = head;
}
found_mapped_buffer:
if (bh != NULL && buffer_delay(bh)) {
/* 1st or contiguous delayed buffer found. */
if (!(flags & FIEMAP_EXTENT_DELALLOC)) {
/*
* 1st delayed buffer found, record
* the start of extent.
*/
flags |= FIEMAP_EXTENT_DELALLOC;
newex->ec_block = end;
logical = (__u64)end << blksize_bits;
}
/* Find contiguous delayed buffers. */
do {
if (!buffer_delay(bh))
goto found_delayed_extent;
bh = bh->b_this_page;
end++;
} while (bh != head);
for (index = 1; index < ret; index++) {
if (!page_has_buffers(pages[index])) {
bh = NULL;
break;
}
head = page_buffers(pages[index]);
if (!head) {
bh = NULL;
break;
}
if (pages[index]->index !=
pages[0]->index + index) {
/* Blocks are not contiguous. */
bh = NULL;
break;
}
bh = head;
do {
if (!buffer_delay(bh))
/* Delayed-extent ends. */
goto found_delayed_extent;
bh = bh->b_this_page;
end++;
} while (bh != head);
}
} else if (!(flags & FIEMAP_EXTENT_DELALLOC))
/* a hole found. */
goto out;
found_delayed_extent:
newex->ec_len = min(end - newex->ec_block,
(ext4_lblk_t)EXT_INIT_MAX_LEN);
if (ret == nr_pages && bh != NULL &&
newex->ec_len < EXT_INIT_MAX_LEN &&
buffer_delay(bh)) {
/* Have not collected an extent and continue. */
for (index = 0; index < ret; index++)
page_cache_release(pages[index]);
goto repeat;
}
for (index = 0; index < ret; index++)
page_cache_release(pages[index]);
kfree(pages);
}
physical = (__u64)newex->ec_start << blksize_bits;
......@@ -3822,32 +3955,16 @@ static int ext4_ext_fiemap_cb(struct inode *inode, struct ext4_ext_path *path,
if (ex && ext4_ext_is_uninitialized(ex))
flags |= FIEMAP_EXTENT_UNWRITTEN;
/*
* If this extent reaches EXT_MAX_BLOCK, it must be last.
*
* Or if ext4_ext_next_allocated_block is EXT_MAX_BLOCK,
* this also indicates no more allocated blocks.
*
* XXX this might miss a single-block extent at EXT_MAX_BLOCK
*/
if (ext4_ext_next_allocated_block(path) == EXT_MAX_BLOCK ||
newex->ec_block + newex->ec_len - 1 == EXT_MAX_BLOCK) {
loff_t size = i_size_read(inode);
loff_t bs = EXT4_BLOCK_SIZE(inode->i_sb);
size = i_size_read(inode);
if (logical + length >= size)
flags |= FIEMAP_EXTENT_LAST;
if ((flags & FIEMAP_EXTENT_DELALLOC) &&
logical+length > size)
length = (size - logical + bs - 1) & ~(bs-1);
}
error = fiemap_fill_next_extent(fieinfo, logical, physical,
ret = fiemap_fill_next_extent(fieinfo, logical, physical,
length, flags);
if (error < 0)
return error;
if (error == 1)
if (ret < 0)
return ret;
if (ret == 1)
return EXT_BREAK;
return EXT_CONTINUE;
}
......
......@@ -164,20 +164,20 @@ int ext4_sync_file(struct file *file, int datasync)
J_ASSERT(ext4_journal_current_handle() == NULL);
trace_ext4_sync_file(file, datasync);
trace_ext4_sync_file_enter(file, datasync);
if (inode->i_sb->s_flags & MS_RDONLY)
return 0;
ret = ext4_flush_completed_IO(inode);
if (ret < 0)
return ret;
goto out;
if (!journal) {
ret = generic_file_fsync(file, datasync);
if (!ret && !list_empty(&inode->i_dentry))
ext4_sync_parent(inode);
return ret;
goto out;
}
/*
......@@ -194,8 +194,10 @@ int ext4_sync_file(struct file *file, int datasync)
* (they were dirtied by commit). But that's OK - the blocks are
* safe in-journal, which is all fsync() needs to ensure.
*/
if (ext4_should_journal_data(inode))
return ext4_force_commit(inode->i_sb);
if (ext4_should_journal_data(inode)) {
ret = ext4_force_commit(inode->i_sb);
goto out;
}
commit_tid = datasync ? ei->i_datasync_tid : ei->i_sync_tid;
if (jbd2_log_start_commit(journal, commit_tid)) {
......@@ -215,5 +217,7 @@ int ext4_sync_file(struct file *file, int datasync)
ret = jbd2_log_wait_commit(journal, commit_tid);
} else if (journal->j_flags & JBD2_BARRIER)
blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL);
out:
trace_ext4_sync_file_exit(inode, ret);
return ret;
}
......@@ -152,6 +152,7 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
* We do it here so the bitmap uptodate bit
* get set with buffer lock held.
*/
trace_ext4_load_inode_bitmap(sb, block_group);
set_bitmap_uptodate(bh);
if (bh_submit_read(bh) < 0) {
put_bh(bh);
......@@ -649,7 +650,7 @@ static int find_group_other(struct super_block *sb, struct inode *parent,
*group = parent_group + flex_size;
if (*group > ngroups)
*group = 0;
return find_group_orlov(sb, parent, group, mode, 0);
return find_group_orlov(sb, parent, group, mode, NULL);
}
/*
......@@ -1054,6 +1055,11 @@ struct inode *ext4_new_inode(handle_t *handle, struct inode *dir, int mode,
}
}
if (ext4_handle_valid(handle)) {
ei->i_sync_tid = handle->h_transaction->t_tid;
ei->i_datasync_tid = handle->h_transaction->t_tid;
}
err = ext4_mark_inode_dirty(handle, inode);
if (err) {
ext4_std_error(sb, err);
......
This diff is collapsed.
......@@ -334,16 +334,22 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
case FITRIM:
{
struct super_block *sb = inode->i_sb;
struct request_queue *q = bdev_get_queue(sb->s_bdev);
struct fstrim_range range;
int ret = 0;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
if (!blk_queue_discard(q))
return -EOPNOTSUPP;
if (copy_from_user(&range, (struct fstrim_range *)arg,
sizeof(range)))
return -EFAULT;
range.minlen = max((unsigned int)range.minlen,
q->limits.discard_granularity);
ret = ext4_trim_fs(sb, &range);
if (ret < 0)
return ret;
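For context, the FITRIM path in the hunk above is reached from user space through an ioctl on any open file descriptor of the mounted filesystem. The sketch below shows one way a caller might request a trim of the whole filesystem. The function name trim_filesystem() and its return convention are assumptions of this example. As the hunk shows, the kernel side returns -EPERM without CAP_SYS_ADMIN and -EOPNOTSUPP when the device does not support discard, and raises minlen to the device's discard granularity.

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <linux/fs.h>   /* FITRIM, struct fstrim_range */
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Ask the filesystem mounted at 'mountpoint' to discard all free
 * extents of at least 'minlen' bytes. On success, stores the number
 * of bytes the kernel reports as trimmed in *trimmed (if non-NULL).
 * Returns 0 or a negative errno value.
 * trim_filesystem() is a hypothetical wrapper for illustration only.
 */
static int trim_filesystem(const char *mountpoint,
                           unsigned long long minlen,
                           unsigned long long *trimmed)
{
    struct fstrim_range range;
    int fd, err = 0;

    memset(&range, 0, sizeof(range));
    range.start = 0;
    range.len = ULLONG_MAX;     /* cover the whole filesystem */
    range.minlen = minlen;

    fd = open(mountpoint, O_RDONLY);
    if (fd < 0)
        return -errno;

    if (ioctl(fd, FITRIM, &range) < 0)
        err = -errno;
    else if (trimmed)
        *trimmed = range.len;   /* kernel writes back bytes trimmed */

    close(fd);
    return err;
}
```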
......@@ -421,6 +427,7 @@ long ext4_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
return err;
}
case EXT4_IOC_MOVE_EXT:
case FITRIM:
break;
default:
return -ENOIOCTLCMD;
......
......@@ -432,9 +432,10 @@ static void *mb_find_buddy(struct ext4_buddy *e4b, int order, int *max)
}
/* at order 0 we see each particular block */
*max = 1 << (e4b->bd_blkbits + 3);
if (order == 0)
if (order == 0) {
*max = 1 << (e4b->bd_blkbits + 3);
return EXT4_MB_BITMAP(e4b);
}
bb = EXT4_MB_BUDDY(e4b) + EXT4_SB(e4b->bd_sb)->s_mb_offsets[order];
*max = EXT4_SB(e4b->bd_sb)->s_mb_maxs[order];
......@@ -616,7 +617,6 @@ static int __mb_check_buddy(struct ext4_buddy *e4b, char *file,
MB_CHECK_ASSERT(e4b->bd_info->bb_fragments == fragments);
grp = ext4_get_group_info(sb, e4b->bd_group);
buddy = mb_find_buddy(e4b, 0, &max);
list_for_each(cur, &grp->bb_prealloc_list) {
ext4_group_t groupnr;
struct ext4_prealloc_space *pa;
......@@ -635,7 +635,12 @@ static int __mb_check_buddy(struct ext4_buddy *e4b, char *file,
#define mb_check_buddy(e4b)
#endif
/* FIXME!! need more doc */
/*
* Divide blocks started from @first with length @len into
* smaller chunks with power of 2 blocks.
* Clear the bits in bitmap which the blocks of the chunk(s) covered,
* then increase bb_counters[] for corresponded chunk size.
*/
static void ext4_mb_mark_free_simple(struct super_block *sb,
void *buddy, ext4_grpblk_t first, ext4_grpblk_t len,
struct ext4_group_info *grp)
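The new comment on ext4_mb_mark_free_simple() describes dividing a block range into power-of-two chunks and bumping a per-order counter for each chunk. The user-space sketch below illustrates that idea only. It is not the kernel implementation: MAX_ORDER, the helper names, and the plain counters[] array standing in for bb_counters[] are all assumptions of this example, and the bitmap clearing is omitted.

```c
#include <assert.h>

#define MAX_ORDER 16    /* hypothetical cap on chunk order for this sketch */

/* Index of the lowest set bit; v must be non-zero. */
static int lowest_set_bit(unsigned int v)
{
    int n = 0;

    assert(v != 0);
    while (!(v & 1u)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Index of the highest set bit; v must be non-zero. */
static int highest_set_bit(unsigned int v)
{
    int n = 0;

    assert(v != 0);
    while (v >>= 1)
        n++;
    return n;
}

/*
 * Split the range [first, first + len) into naturally aligned
 * power-of-two chunks, incrementing counters[order] once per chunk.
 * Returns the number of chunks produced.
 */
static int split_into_pow2_chunks(unsigned int first, unsigned int len,
                                  unsigned int counters[MAX_ORDER + 1])
{
    int chunks = 0;

    while (len > 0) {
        /* Largest order permitted by the alignment of 'first'. */
        int align_order = lowest_set_bit(first | (1u << MAX_ORDER));
        /* Largest order that still fits in the remaining length. */
        int len_order = highest_set_bit(len);
        int order = align_order < len_order ? align_order : len_order;
        unsigned int chunk = 1u << order;

        counters[order]++;
        first += chunk;
        len -= chunk;
        chunks++;
    }
    return chunks;
}
```

For example, the range starting at block 5 with length 10 splits into chunks of 1, 2, 4, 2 and 1 blocks, so orders 0 and 1 are each counted twice and order 2 once.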
......@@ -2381,7 +2386,7 @@ static int ext4_mb_init_backend(struct super_block *sb)
/* An 8TB filesystem with 64-bit pointers requires a 4096 byte
* kmalloc. A 128kb malloc should suffice for a 256TB filesystem.
* So a two level scheme suffices for now. */
sbi->s_group_info = kmalloc(array_size, GFP_KERNEL);
sbi->s_group_info = kzalloc(array_size, GFP_KERNEL);
if (sbi->s_group_info == NULL) {
printk(KERN_ERR "EXT4-fs: can't allocate buddy meta group\n");
return -ENOMEM;
......@@ -3208,7 +3213,7 @@ ext4_mb_check_group_pa(ext4_fsblk_t goal_block,
cur_distance = abs(goal_block - cpa->pa_pstart);
new_distance = abs(goal_block - pa->pa_pstart);
if (cur_distance < new_distance)
if (cur_distance <= new_distance)
return cpa;
/* drop the previous reference */
......@@ -3907,7 +3912,8 @@ static void ext4_mb_show_ac(struct ext4_allocation_context *ac)
struct super_block *sb = ac->ac_sb;
ext4_group_t ngroups, i;
if (EXT4_SB(sb)->s_mount_flags & EXT4_MF_FS_ABORTED)
if (!mb_enable_debug ||
(EXT4_SB(sb)->s_mount_flags & EXT4_MF_FS_ABORTED))
return;
printk(KERN_ERR "EXT4-fs: Can't allocate:"
......@@ -4753,7 +4759,8 @@ static int ext4_trim_extent(struct super_block *sb, int start, int count,
* bitmap. Then issue a TRIM command on this extent and free the extent in
* the group buddy bitmap. This is done until whole group is scanned.
*/
ext4_grpblk_t ext4_trim_all_free(struct super_block *sb, struct ext4_buddy *e4b,
static ext4_grpblk_t
ext4_trim_all_free(struct super_block *sb, struct ext4_buddy *e4b,
ext4_grpblk_t start, ext4_grpblk_t max, ext4_grpblk_t minblocks)
{
void *bitmap;
......@@ -4863,10 +4870,15 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
break;
}
if (len >= EXT4_BLOCKS_PER_GROUP(sb))
len -= (EXT4_BLOCKS_PER_GROUP(sb) - first_block);
else
/*
* For all the groups except the last one, last block will
* always be EXT4_BLOCKS_PER_GROUP(sb), so we only need to
* change it for the last group in which case start +
* len < EXT4_BLOCKS_PER_GROUP(sb).
*/
if (first_block + len < EXT4_BLOCKS_PER_GROUP(sb))
last_block = first_block + len;
len -= last_block - first_block;
if (e4b.bd_info->bb_free >= minlen) {
cnt = ext4_trim_all_free(sb, &e4b, first_block,
......
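The reworked length handling in ext4_trim_fs() can be easier to follow in isolation. The sketch below reproduces only the arithmetic from the hunk above: last_block defaults to the group size, is shortened only when the remaining length ends inside the current group, and len is then reduced by the span just covered. Resetting first_block to 0 for every group after the first is an assumption about the surrounding loop, which this hunk does not show. The struct and function names are hypothetical, and the loop here runs on the remaining length rather than on group numbers as the kernel does.

```c
#include <assert.h>

/* Half-open block range [first, last) within a single block group. */
struct group_range {
    unsigned long first;
    unsigned long last;
};

/*
 * Split a trim request that starts at 'first_block' inside the first
 * group and covers 'len' blocks into per-group ranges.
 * Writes at most 'max_groups' entries to 'out' and returns the count.
 */
static int split_trim_range(unsigned long first_block, unsigned long len,
                            unsigned long blocks_per_group,
                            struct group_range *out, int max_groups)
{
    int n = 0;

    assert(blocks_per_group > 0 && first_block < blocks_per_group);

    while (len > 0 && n < max_groups) {
        unsigned long last_block = blocks_per_group;

        /* Only the final group ends before the group boundary. */
        if (first_block + len < blocks_per_group)
            last_block = first_block + len;
        len -= last_block - first_block;

        out[n].first = first_block;
        out[n].last = last_block;
        n++;

        /* Assumed: later groups are scanned from their first block. */
        first_block = 0;
    }
    return n;
}
```

With 100 blocks per group, a request starting at block 30 for 250 blocks yields the ranges [30, 100), [0, 100) and [0, 80).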
......@@ -169,7 +169,7 @@ struct ext4_allocation_context {
/* original request */
struct ext4_free_extent ac_o_ex;
/* goal request (after normalization) */
/* goal request (normalized ac_o_ex) */
struct ext4_free_extent ac_g_ex;
/* the best found extent */
......
......@@ -263,7 +263,7 @@ static int free_dind_blocks(handle_t *handle,
for (i = 0; i < max_entries; i++) {
if (tmp_idata[i]) {
extend_credit_for_blkdel(handle, inode);
ext4_free_blocks(handle, inode, 0,
ext4_free_blocks(handle, inode, NULL,
le32_to_cpu(tmp_idata[i]), 1,
EXT4_FREE_BLOCKS_METADATA |
EXT4_FREE_BLOCKS_FORGET);
......@@ -271,7 +271,7 @@ static int free_dind_blocks(handle_t *handle,
}
put_bh(bh);
extend_credit_for_blkdel(handle, inode);
ext4_free_blocks(handle, inode, 0, le32_to_cpu(i_data), 1,
ext4_free_blocks(handle, inode, NULL, le32_to_cpu(i_data), 1,
EXT4_FREE_BLOCKS_METADATA |
EXT4_FREE_BLOCKS_FORGET);
return 0;
......@@ -302,7 +302,7 @@ static int free_tind_blocks(handle_t *handle,
}
put_bh(bh);
extend_credit_for_blkdel(handle, inode);
ext4_free_blocks(handle, inode, 0, le32_to_cpu(i_data), 1,
ext4_free_blocks(handle, inode, NULL, le32_to_cpu(i_data), 1,
EXT4_FREE_BLOCKS_METADATA |
EXT4_FREE_BLOCKS_FORGET);
return 0;
......@@ -315,7 +315,7 @@ static int free_ind_block(handle_t *handle, struct inode *inode, __le32 *i_data)
/* ei->i_data[EXT4_IND_BLOCK] */
if (i_data[0]) {
extend_credit_for_blkdel(handle, inode);
ext4_free_blocks(handle, inode, 0,
ext4_free_blocks(handle, inode, NULL,
le32_to_cpu(i_data[0]), 1,
EXT4_FREE_BLOCKS_METADATA |
EXT4_FREE_BLOCKS_FORGET);
......@@ -428,7 +428,7 @@ static int free_ext_idx(handle_t *handle, struct inode *inode,
}
put_bh(bh);
extend_credit_for_blkdel(handle, inode);
ext4_free_blocks(handle, inode, 0, block, 1,
ext4_free_blocks(handle, inode, NULL, block, 1,
EXT4_FREE_BLOCKS_METADATA | EXT4_FREE_BLOCKS_FORGET);
return retval;
}
......
......@@ -40,6 +40,7 @@
#include "xattr.h"
#include "acl.h"
#include <trace/events/ext4.h>
/*
* define how far ahead to read directories while searching them.
*/
......@@ -2183,6 +2184,7 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry)
struct ext4_dir_entry_2 *de;
handle_t *handle;
trace_ext4_unlink_enter(dir, dentry);
/* Initialize quotas before so that eventual writes go
* in separate transaction */
dquot_initialize(dir);
......@@ -2228,6 +2230,7 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry)
end_unlink:
ext4_journal_stop(handle);
brelse(bh);
trace_ext4_unlink_exit(dentry, retval);
return retval;
}
......@@ -2402,6 +2405,10 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
if (!new_inode && new_dir != old_dir &&
EXT4_DIR_LINK_MAX(new_dir))
goto end_rename;
BUFFER_TRACE(dir_bh, "get_write_access");
retval = ext4_journal_get_write_access(handle, dir_bh);
if (retval)
goto end_rename;
}
if (!new_bh) {
retval = ext4_add_entry(handle, new_dentry, old_inode);
......@@ -2409,7 +2416,9 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
goto end_rename;
} else {
BUFFER_TRACE(new_bh, "get write access");
ext4_journal_get_write_access(handle, new_bh);
retval = ext4_journal_get_write_access(handle, new_bh);
if (retval)
goto end_rename;
new_de->inode = cpu_to_le32(old_inode->i_ino);
if (EXT4_HAS_INCOMPAT_FEATURE(new_dir->i_sb,
EXT4_FEATURE_INCOMPAT_FILETYPE))
......@@ -2470,8 +2479,6 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
old_dir->i_ctime = old_dir->i_mtime = ext4_current_time(old_dir);
ext4_update_dx_flag(old_dir);
if (dir_bh) {
BUFFER_TRACE(dir_bh, "get_write_access");
ext4_journal_get_write_access(handle, dir_bh);
PARENT_INO(dir_bh->b_data, new_dir->i_sb->s_blocksize) =
cpu_to_le32(new_dir->i_ino);
BUFFER_TRACE(dir_bh, "call ext4_handle_dirty_metadata");
......
......@@ -259,6 +259,11 @@ static void ext4_end_bio(struct bio *bio, int error)
bi_sector >> (inode->i_blkbits - 9));
}
if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) {
ext4_free_io_end(io_end);
return;
}
/* Add the io_end to per-inode completed io list*/
spin_lock_irqsave(&EXT4_I(inode)->i_completed_io_lock, flags);
list_add_tail(&io_end->list, &EXT4_I(inode)->i_completed_io_list);
......@@ -279,9 +284,9 @@ void ext4_io_submit(struct ext4_io_submit *io)
BUG_ON(bio_flagged(io->io_bio, BIO_EOPNOTSUPP));
bio_put(io->io_bio);
}
io->io_bio = 0;
io->io_bio = NULL;
io->io_op = 0;
io->io_end = 0;
io->io_end = NULL;
}
static int io_submit_init(struct ext4_io_submit *io,
......@@ -380,8 +385,6 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
BUG_ON(!PageLocked(page));
BUG_ON(PageWriteback(page));
set_page_writeback(page);
ClearPageError(page);
io_page = kmem_cache_alloc(io_page_cachep, GFP_NOFS);
if (!io_page) {
......@@ -392,6 +395,8 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
io_page->p_page = page;
atomic_set(&io_page->p_count, 1);
get_page(page);
set_page_writeback(page);
ClearPageError(page);
for (bh = head = page_buffers(page), block_start = 0;
bh != head || !block_start;
......
......@@ -230,7 +230,7 @@ static int setup_new_group_blocks(struct super_block *sb,
}
/* Zero out all of the reserved backup group descriptor table blocks */
ext4_debug("clear inode table blocks %#04llx -> %#04llx\n",
ext4_debug("clear inode table blocks %#04llx -> %#04lx\n",
block, sbi->s_itb_per_group);
err = sb_issue_zeroout(sb, gdblocks + start + 1, reserved_gdb,
GFP_NOFS);
......@@ -248,7 +248,7 @@ static int setup_new_group_blocks(struct super_block *sb,
/* Zero out all of the inode table blocks */
block = input->inode_table;
ext4_debug("clear inode table blocks %#04llx -> %#04llx\n",
ext4_debug("clear inode table blocks %#04llx -> %#04lx\n",
block, sbi->s_itb_per_group);
err = sb_issue_zeroout(sb, block, sbi->s_itb_per_group, GFP_NOFS);
if (err)
......@@ -499,12 +499,12 @@ static int add_new_gdb(handle_t *handle, struct inode *inode,
return err;
exit_inode:
/* ext4_journal_release_buffer(handle, iloc.bh); */
/* ext4_handle_release_buffer(handle, iloc.bh); */
brelse(iloc.bh);
exit_dindj:
/* ext4_journal_release_buffer(handle, dind); */
/* ext4_handle_release_buffer(handle, dind); */
exit_sbh:
/* ext4_journal_release_buffer(handle, EXT4_SB(sb)->s_sbh); */
/* ext4_handle_release_buffer(handle, EXT4_SB(sb)->s_sbh); */
exit_dind:
brelse(dind);
exit_bh:
......@@ -586,7 +586,7 @@ static int reserve_backup_gdb(handle_t *handle, struct inode *inode,
/*
int j;
for (j = 0; j < i; j++)
ext4_journal_release_buffer(handle, primary[j]);
ext4_handle_release_buffer(handle, primary[j]);
*/
goto exit_bh;
}
......
......@@ -54,9 +54,9 @@
static struct proc_dir_entry *ext4_proc_root;
static struct kset *ext4_kset;
struct ext4_lazy_init *ext4_li_info;
struct mutex ext4_li_mtx;
struct ext4_features *ext4_feat;
static struct ext4_lazy_init *ext4_li_info;
static struct mutex ext4_li_mtx;
static struct ext4_features *ext4_feat;
static int ext4_load_journal(struct super_block *, struct ext4_super_block *,
unsigned long journal_devnum);
......@@ -75,6 +75,7 @@ static void ext4_write_super(struct super_block *sb);
static int ext4_freeze(struct super_block *sb);
static struct dentry *ext4_mount(struct file_system_type *fs_type, int flags,
const char *dev_name, void *data);
static int ext4_feature_set_ok(struct super_block *sb, int readonly);
static void ext4_destroy_lazyinit_thread(void);
static void ext4_unregister_li_request(struct super_block *sb);
static void ext4_clear_request_list(void);
......@@ -594,7 +595,7 @@ __acquires(bitlock)
vaf.fmt = fmt;
vaf.va = &args;
printk(KERN_CRIT "EXT4-fs error (device %s): %s:%d: group %u",
printk(KERN_CRIT "EXT4-fs error (device %s): %s:%d: group %u, ",
sb->s_id, function, line, grp);
if (ino)
printk(KERN_CONT "inode %lu: ", ino);
......@@ -997,13 +998,10 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
if (test_opt(sb, OLDALLOC))
seq_puts(seq, ",oldalloc");
#ifdef CONFIG_EXT4_FS_XATTR
if (test_opt(sb, XATTR_USER) &&
!(def_mount_opts & EXT4_DEFM_XATTR_USER))
if (test_opt(sb, XATTR_USER))
seq_puts(seq, ",user_xattr");
if (!test_opt(sb, XATTR_USER) &&
(def_mount_opts & EXT4_DEFM_XATTR_USER)) {
if (!test_opt(sb, XATTR_USER))
seq_puts(seq, ",nouser_xattr");
}
#endif
#ifdef CONFIG_EXT4_FS_POSIX_ACL
if (test_opt(sb, POSIX_ACL) && !(def_mount_opts & EXT4_DEFM_ACL))
......@@ -1041,8 +1039,8 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
!(def_mount_opts & EXT4_DEFM_NODELALLOC))
seq_puts(seq, ",nodelalloc");
if (test_opt(sb, MBLK_IO_SUBMIT))
seq_puts(seq, ",mblk_io_submit");
if (!test_opt(sb, MBLK_IO_SUBMIT))
seq_puts(seq, ",nomblk_io_submit");
if (sbi->s_stripe)
seq_printf(seq, ",stripe=%lu", sbi->s_stripe);
/*
......@@ -1451,7 +1449,7 @@ static int parse_options(char *options, struct super_block *sb,
* Initialize args struct so we know whether arg was
* found; some options take optional arguments.
*/
args[0].to = args[0].from = 0;
args[0].to = args[0].from = NULL;
token = match_token(p, tokens, args);
switch (token) {
case Opt_bsd_df:
......@@ -1771,7 +1769,7 @@ static int parse_options(char *options, struct super_block *sb,
return 0;
if (option < 0 || option > (1 << 30))
return 0;
if (!is_power_of_2(option)) {
if (option && !is_power_of_2(option)) {
ext4_msg(sb, KERN_ERR,
"EXT4-fs: inode_readahead_blks"
" must be a power of 2");
......@@ -2120,6 +2118,13 @@ static void ext4_orphan_cleanup(struct super_block *sb,
return;
}
/* Check if feature set would not allow a r/w mount */
if (!ext4_feature_set_ok(sb, 0)) {
ext4_msg(sb, KERN_INFO, "Skipping orphan cleanup due to "
"unknown ROCOMPAT features");
return;
}
if (EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS) {
if (es->s_last_orphan)
jbd_debug(1, "Errors on filesystem, "
......@@ -2412,7 +2417,7 @@ static ssize_t inode_readahead_blks_store(struct ext4_attr *a,
if (parse_strtoul(buf, 0x40000000, &t))
return -EINVAL;
if (!is_power_of_2(t))
if (t && !is_power_of_2(t))
return -EINVAL;
sbi->s_inode_readahead_blks = t;
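Note on the two inode_readahead_blks hunks above (parse_options() and inode_readahead_blks_store()): both checks now accept 0 in addition to powers of two, presumably so that inode readahead can be switched off. The following is a minimal userspace sketch of that validation, not the kernel code itself. `readahead_blks_valid` is a hypothetical helper name, and `is_power_of_2` is reimplemented here as a stand-in for the kernel helper of the same name.

```c
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's is_power_of_2():
 * true for 1, 2, 4, ... and false for 0.
 */
static bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper mirroring the patched check: a value is accepted
 * when it is 0 or a power of two, and no larger than 1 << 30
 * (the 0x40000000 limit used in the hunks above).
 */
static bool readahead_blks_valid(unsigned long t)
{
	if (t > (1UL << 30))
		return false;
	if (t && !is_power_of_2(t))
		return false;
	return true;
}
```

With the old check, `!is_power_of_2(0)` is true, so a value of 0 was rejected; the added `t &&` (or `option &&`) guard skips the power-of-two test for 0.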
......@@ -3095,14 +3100,14 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
}
if (def_mount_opts & EXT4_DEFM_UID16)
set_opt(sb, NO_UID32);
/* xattr user namespace & acls are now defaulted on */
#ifdef CONFIG_EXT4_FS_XATTR
if (def_mount_opts & EXT4_DEFM_XATTR_USER)
set_opt(sb, XATTR_USER);
set_opt(sb, XATTR_USER);
#endif
#ifdef CONFIG_EXT4_FS_POSIX_ACL
if (def_mount_opts & EXT4_DEFM_ACL)
set_opt(sb, POSIX_ACL);
set_opt(sb, POSIX_ACL);
#endif
set_opt(sb, MBLK_IO_SUBMIT);
if ((def_mount_opts & EXT4_DEFM_JMODE) == EXT4_DEFM_JMODE_DATA)
set_opt(sb, JOURNAL_DATA);
else if ((def_mount_opts & EXT4_DEFM_JMODE) == EXT4_DEFM_JMODE_ORDERED)
......@@ -3516,7 +3521,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
* concurrency isn't really necessary. Limit it to 1.
*/
EXT4_SB(sb)->dio_unwritten_wq =
alloc_workqueue("ext4-dio-unwritten", WQ_MEM_RECLAIM, 1);
alloc_workqueue("ext4-dio-unwritten", WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
if (!EXT4_SB(sb)->dio_unwritten_wq) {
printk(KERN_ERR "EXT4-fs: failed to create DIO workqueue\n");
goto failed_mount_wq;
......@@ -3531,17 +3536,16 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
if (IS_ERR(root)) {
ext4_msg(sb, KERN_ERR, "get root inode failed");
ret = PTR_ERR(root);
root = NULL;
goto failed_mount4;
}
if (!S_ISDIR(root->i_mode) || !root->i_blocks || !root->i_size) {
iput(root);
ext4_msg(sb, KERN_ERR, "corrupt root inode, run e2fsck");
goto failed_mount4;
}
sb->s_root = d_alloc_root(root);
if (!sb->s_root) {
ext4_msg(sb, KERN_ERR, "get root dentry failed");
iput(root);
ret = -ENOMEM;
goto failed_mount4;
}
......@@ -3657,6 +3661,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
goto failed_mount;
failed_mount4:
iput(root);
sb->s_root = NULL;
ext4_msg(sb, KERN_ERR, "mount failed");
destroy_workqueue(EXT4_SB(sb)->dio_unwritten_wq);
failed_mount_wq:
......
......@@ -735,7 +735,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
int offset = (char *)s->here - bs->bh->b_data;
unlock_buffer(bs->bh);
jbd2_journal_release_buffer(handle, bs->bh);
ext4_handle_release_buffer(handle, bs->bh);
if (ce) {
mb_cache_entry_release(ce);
ce = NULL;
......@@ -833,7 +833,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
new_bh = sb_getblk(sb, block);
if (!new_bh) {
getblk_failed:
ext4_free_blocks(handle, inode, 0, block, 1,
ext4_free_blocks(handle, inode, NULL, block, 1,
EXT4_FREE_BLOCKS_METADATA);
error = -EIO;
goto cleanup;
......
......@@ -432,13 +432,35 @@ struct jbd2_journal_handle
int h_err;
/* Flags [no locking] */
unsigned int h_sync: 1; /* sync-on-close */
unsigned int h_jdata: 1; /* force data journaling */
unsigned int h_aborted: 1; /* fatal error on handle */
unsigned int h_sync:1; /* sync-on-close */
unsigned int h_jdata:1; /* force data journaling */
unsigned int h_aborted:1; /* fatal error on handle */
unsigned int h_cowing:1; /* COWing block to snapshot */
/* Number of buffers requested by user:
* (before adding the COW credits factor) */
unsigned int h_base_credits:14;
/* Number of buffers the user is allowed to dirty:
* (counts only buffers dirtied when !h_cowing) */
unsigned int h_user_credits:14;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map h_lockdep_map;
#endif
#ifdef CONFIG_JBD2_DEBUG
/* COW debugging counters: */
unsigned int h_cow_moved; /* blocks moved to snapshot */
unsigned int h_cow_copied; /* blocks copied to snapshot */
unsigned int h_cow_ok_jh; /* blocks already COWed during current
transaction */
unsigned int h_cow_ok_bitmap; /* blocks not set in COW bitmap */
unsigned int h_cow_ok_mapped;/* blocks already mapped in snapshot */
unsigned int h_cow_bitmaps; /* COW bitmaps created */
unsigned int h_cow_excluded; /* blocks set in exclude bitmap */
#endif
};
......
......@@ -40,6 +40,13 @@ struct journal_head {
*/
unsigned b_modified;
/*
* This field tracks the last transaction id in which this buffer
* has been cowed
* [jbd_lock_bh_state()]
*/
unsigned b_cow_tid;
/*
* Copy of the buffer data frozen for writing to the log.
* [jbd_lock_bh_state()]
......
This diff is collapsed.
......@@ -17,19 +17,17 @@ TRACE_EVENT(jbd2_checkpoint,
TP_ARGS(journal, result),
TP_STRUCT__entry(
__field( int, dev_major )
__field( int, dev_minor )
__field( dev_t, dev )
__field( int, result )
),
TP_fast_assign(
__entry->dev_major = MAJOR(journal->j_fs_dev->bd_dev);
__entry->dev_minor = MINOR(journal->j_fs_dev->bd_dev);
__entry->dev = journal->j_fs_dev->bd_dev;
__entry->result = result;
),
TP_printk("dev %d,%d result %d",
__entry->dev_major, __entry->dev_minor, __entry->result)
TP_printk("dev %s result %d",
jbd2_dev_to_name(__entry->dev), __entry->result)
);
DECLARE_EVENT_CLASS(jbd2_commit,
......@@ -39,22 +37,20 @@ DECLARE_EVENT_CLASS(jbd2_commit,
TP_ARGS(journal, commit_transaction),
TP_STRUCT__entry(
__field( int, dev_major )
__field( int, dev_minor )
__field( dev_t, dev )
__field( char, sync_commit )
__field( int, transaction )
),
TP_fast_assign(
__entry->dev_major = MAJOR(journal->j_fs_dev->bd_dev);
__entry->dev_minor = MINOR(journal->j_fs_dev->bd_dev);
__entry->dev = journal->j_fs_dev->bd_dev;
__entry->sync_commit = commit_transaction->t_synchronous_commit;
__entry->transaction = commit_transaction->t_tid;
),
TP_printk("dev %d,%d transaction %d sync %d",
__entry->dev_major, __entry->dev_minor,
__entry->transaction, __entry->sync_commit)
TP_printk("dev %s transaction %d sync %d",
jbd2_dev_to_name(__entry->dev), __entry->transaction,
__entry->sync_commit)
);
DEFINE_EVENT(jbd2_commit, jbd2_start_commit,
......@@ -91,24 +87,22 @@ TRACE_EVENT(jbd2_end_commit,
TP_ARGS(journal, commit_transaction),
TP_STRUCT__entry(
__field( int, dev_major )
__field( int, dev_minor )
__field( dev_t, dev )
__field( char, sync_commit )
__field( int, transaction )
__field( int, head )
),
TP_fast_assign(
__entry->dev_major = MAJOR(journal->j_fs_dev->bd_dev);
__entry->dev_minor = MINOR(journal->j_fs_dev->bd_dev);
__entry->dev = journal->j_fs_dev->bd_dev;
__entry->sync_commit = commit_transaction->t_synchronous_commit;
__entry->transaction = commit_transaction->t_tid;
__entry->head = journal->j_tail_sequence;
),
TP_printk("dev %d,%d transaction %d sync %d head %d",
__entry->dev_major, __entry->dev_minor,
__entry->transaction, __entry->sync_commit, __entry->head)
TP_printk("dev %s transaction %d sync %d head %d",
jbd2_dev_to_name(__entry->dev), __entry->transaction,
__entry->sync_commit, __entry->head)
);
TRACE_EVENT(jbd2_submit_inode_data,
......@@ -117,20 +111,17 @@ TRACE_EVENT(jbd2_submit_inode_data,
TP_ARGS(inode),
TP_STRUCT__entry(
__field( int, dev_major )
__field( int, dev_minor )
__field( dev_t, dev )
__field( ino_t, ino )
),
TP_fast_assign(
__entry->dev_major = MAJOR(inode->i_sb->s_dev);
__entry->dev_minor = MINOR(inode->i_sb->s_dev);
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
),
TP_printk("dev %d,%d ino %lu",
__entry->dev_major, __entry->dev_minor,
(unsigned long) __entry->ino)
TP_printk("dev %s ino %lu",
jbd2_dev_to_name(__entry->dev), (unsigned long) __entry->ino)
);
TRACE_EVENT(jbd2_run_stats,
......@@ -140,8 +131,7 @@ TRACE_EVENT(jbd2_run_stats,
TP_ARGS(dev, tid, stats),
TP_STRUCT__entry(
__field( int, dev_major )
__field( int, dev_minor )
__field( dev_t, dev )
__field( unsigned long, tid )
__field( unsigned long, wait )
__field( unsigned long, running )
......@@ -154,8 +144,7 @@ TRACE_EVENT(jbd2_run_stats,
),
TP_fast_assign(
__entry->dev_major = MAJOR(dev);
__entry->dev_minor = MINOR(dev);
__entry->dev = dev;
__entry->tid = tid;
__entry->wait = stats->rs_wait;
__entry->running = stats->rs_running;
......@@ -167,9 +156,9 @@ TRACE_EVENT(jbd2_run_stats,
__entry->blocks_logged = stats->rs_blocks_logged;
),
TP_printk("dev %d,%d tid %lu wait %u running %u locked %u flushing %u "
TP_printk("dev %s tid %lu wait %u running %u locked %u flushing %u "
"logging %u handle_count %u blocks %u blocks_logged %u",
__entry->dev_major, __entry->dev_minor, __entry->tid,
jbd2_dev_to_name(__entry->dev), __entry->tid,
jiffies_to_msecs(__entry->wait),
jiffies_to_msecs(__entry->running),
jiffies_to_msecs(__entry->locked),
......@@ -186,8 +175,7 @@ TRACE_EVENT(jbd2_checkpoint_stats,
TP_ARGS(dev, tid, stats),
TP_STRUCT__entry(
__field( int, dev_major )
__field( int, dev_minor )
__field( dev_t, dev )
__field( unsigned long, tid )
__field( unsigned long, chp_time )
__field( __u32, forced_to_close )
......@@ -196,8 +184,7 @@ TRACE_EVENT(jbd2_checkpoint_stats,
),
TP_fast_assign(
__entry->dev_major = MAJOR(dev);
__entry->dev_minor = MINOR(dev);
__entry->dev = dev;
__entry->tid = tid;
__entry->chp_time = stats->cs_chp_time;
__entry->forced_to_close= stats->cs_forced_to_close;
......@@ -205,9 +192,9 @@ TRACE_EVENT(jbd2_checkpoint_stats,
__entry->dropped = stats->cs_dropped;
),
TP_printk("dev %d,%d tid %lu chp_time %u forced_to_close %u "
TP_printk("dev %s tid %lu chp_time %u forced_to_close %u "
"written %u dropped %u",
__entry->dev_major, __entry->dev_minor, __entry->tid,
jbd2_dev_to_name(__entry->dev), __entry->tid,
jiffies_to_msecs(__entry->chp_time),
__entry->forced_to_close, __entry->written, __entry->dropped)
);
......@@ -220,8 +207,7 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
TP_ARGS(journal, first_tid, block_nr, freed),
TP_STRUCT__entry(
__field( int, dev_major )
__field( int, dev_minor )
__field( dev_t, dev )
__field( tid_t, tail_sequence )
__field( tid_t, first_tid )
__field(unsigned long, block_nr )
......@@ -229,18 +215,16 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
),
TP_fast_assign(
__entry->dev_major = MAJOR(journal->j_fs_dev->bd_dev);
__entry->dev_minor = MINOR(journal->j_fs_dev->bd_dev);
__entry->dev = journal->j_fs_dev->bd_dev;
__entry->tail_sequence = journal->j_tail_sequence;
__entry->first_tid = first_tid;
__entry->block_nr = block_nr;
__entry->freed = freed;
),
TP_printk("dev %d,%d from %u to %u offset %lu freed %lu",
__entry->dev_major, __entry->dev_minor,
__entry->tail_sequence, __entry->first_tid,
__entry->block_nr, __entry->freed)
TP_printk("dev %s from %u to %u offset %lu freed %lu",
jbd2_dev_to_name(__entry->dev), __entry->tail_sequence,
__entry->first_tid, __entry->block_nr, __entry->freed)
);
#endif /* _TRACE_JBD2_H */
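Note on the jbd2 tracepoint hunks above: each event now records a single dev_t instead of separate dev_major and dev_minor ints, and leaves formatting to TP_printk time. The sketch below illustrates why nothing is lost by doing so, since major and minor can be recovered from the packed value when printing. It assumes the in-kernel dev_t layout (a 20-bit minor in the low bits, the major above it). All `sketch_` names are hypothetical, and the sketch does not reproduce jbd2_dev_to_name(), whose output format is not shown in this diff.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: 20 low bits of minor, major in the bits above. */
#define SKETCH_MINORBITS 20
#define SKETCH_MINORMASK ((1U << SKETCH_MINORBITS) - 1)

typedef uint32_t sketch_dev_t;

/* Pack a major/minor pair into one value, as stored in the trace entry. */
static sketch_dev_t sketch_mkdev(unsigned int major, unsigned int minor)
{
	return (major << SKETCH_MINORBITS) | (minor & SKETCH_MINORMASK);
}

/* Recover the major number from the packed value. */
static unsigned int sketch_major(sketch_dev_t dev)
{
	return dev >> SKETCH_MINORBITS;
}

/* Recover the minor number from the packed value. */
static unsigned int sketch_minor(sketch_dev_t dev)
{
	return dev & SKETCH_MINORMASK;
}

/* Format the device at print time instead of at record time. */
static int sketch_format_dev(sketch_dev_t dev, char *buf, size_t len)
{
	return snprintf(buf, len, "%u,%u", sketch_major(dev), sketch_minor(dev));
}
```

Recording one field keeps the fast-assign path to a single store and shrinks each trace entry; the cost of splitting or naming the device is paid only when the buffer is read.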
......