- 02 5月, 2016 1 次提交
-
-
由 Christoph Hellwig 提交于
This will allow us to do per-I/O sync file writes, as required by a lot of fileservers or storage targets. XXX: Will need a few additional audits for O_DSYNC Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 05 4月, 2016 1 次提交
-
-
由 Kirill A. Shutemov 提交于
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 1月, 2016 1 次提交
-
-
由 Al Viro 提交于
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 22 12月, 2015 1 次提交
-
-
由 Junxiao Bi 提交于
Commit 4f656367 ("Move locks API users to locks_lock_inode_wait()") moved flock/posix lock identify code to locks_lock_inode_wait(), but missed to set fl_flags to FL_FLOCK which will cause kernel panic in locks_lock_inode_wait(). Fixes: 4f656367 ("Move locks API users to locks_lock_inode_wait()") Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com> Signed-off-by: NBob Peterson <rpeterso@redhat.com>
-
- 15 12月, 2015 2 次提交
-
-
由 Bob Peterson 提交于
This patch makes no functional changes. Its goal is to reduce the size of the gfs2 inode in memory by rearranging structures and changing the size of some variables within the structure. Signed-off-by: NBob Peterson <rpeterso@redhat.com>
-
由 Bob Peterson 提交于
Before this patch, multi-block reservation structures were allocated from a special slab. This patch folds the structure into the gfs2_inode structure. The disadvantage is that the gfs2_inode needs more memory, even when a file is opened read-only. The advantages are: (a) we don't need the special slab and the extra time it takes to allocate and deallocate from it. (b) we no longer need to worry that the structure exists for things like quota management. (c) This also allows us to remove the calls to get_write_access and put_write_access since we know the structure will exist. Signed-off-by: NBob Peterson <rpeterso@redhat.com>
-
- 24 11月, 2015 1 次提交
-
-
由 Bob Peterson 提交于
This patch basically reverts the majority of patch 5407e242. That patch eliminated the gfs2_qadata structure in favor of just using the reservations structure. The problem with doing that is that it increases the size of the reservations structure. That is not an issue until it comes time to fold the reservations structure into the inode in memory so we know it's always there. By separating out the quota structure again, we aren't punishing the non-quota users by making all the inodes bigger, requiring more slab space. This patch creates a new slab area to allocate the quota stuff so it's managed a little more sanely. Signed-off-by: NBob Peterson <rpeterso@redhat.com>
-
- 11 11月, 2015 1 次提交
-
-
由 Abhi Das 提交于
When new files and directories are created inside a parent directory we automatically inherit the GFS2_DIF_SYSTEM flag (if set) and assign it to the new file/dirs. All new system files/dirs created in the metafs by, say gfs2_jadd, will have this flag set because they will have parent directories in the metafs whose GFS2_DIF_SYSTEM flag has already been set (most likely by a previous mkfs.gfs2) Signed-off-by: NAbhi Das <adas@redhat.com> Signed-off-by: NBob Peterson <rpeterso@redhat.com>
-
- 23 10月, 2015 1 次提交
-
-
由 Benjamin Coddington 提交于
Instead of having users check for FL_POSIX or FL_FLOCK to call the correct locks API function, use the check within locks_lock_inode_wait(). This allows for some later cleanup. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
-
- 22 9月, 2015 1 次提交
-
-
由 Andrew Price 提交于
Previously __gfs2_fallocate() relied on file_update_time() marking the inode dirty, but that's not a safe assumption as that function doesn't dirty the inode in some cases. Mark the inode dirty explicitly. Signed-off-by: NAndrew Price <anprice@redhat.com> Signed-off-by: NBob Peterson <rpeterso@redhat.com>
-
- 09 6月, 2015 1 次提交
-
-
由 Abhi Das 提交于
We cannot provide an efficient implementation due to the headers on the data blocks, so there doesn't seem much point in having it. Signed-off-by: NAbhi Das <adas@redhat.com> Signed-off-by: NBob Peterson <rpeterso@redhat.com>
-
- 06 5月, 2015 1 次提交
-
-
由 Benjamin Marzinski 提交于
At the end of gfs2_set_inode_flags inode->i_flags is set to flags, so we should be modifying flags instead of inode->i_flags, so it isn't overwritten. Signed-off-by: Benjamin Marzinski <bmarzins redhat com> Signed-off-by: NBob Peterson <rpeterso@redhat.com>
-
- 12 4月, 2015 2 次提交
-
-
由 Al Viro 提交于
... avoiding write_iter/fcntl races. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
All places outside of core VFS that checked ->read and ->write for being NULL or called the methods directly are gone now, so NULL {read,write} with non-NULL {read,write}_iter will do the right thing in all cases. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 26 3月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 19 3月, 2015 4 次提交
-
-
由 Abhi Das 提交于
We can quickly get an estimate of how many blocks are available for allocation restricted by quota and fs size respectively, using the ap->allowed field in the gfs2_alloc_parms structure. gfs2_quota_check() and gfs2_inplace_reserve() provide these values. Once we have the total number of blocks available to us, we can compute how many bytes of data can be written using those blocks instead of guessing inefficiently. Signed-off-by: NAbhi Das <adas@redhat.com> Signed-off-by: NBob Peterson <rpeterso@redhat.com> Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Abhi Das 提交于
Use struct gfs2_alloc_parms as an argument to gfs2_quota_check() and gfs2_quota_lock_check() to check for quota violations while accounting for the new blocks requested by the current operation in ap->target. Previously, the number of new blocks requested during an operation were not accounted for during quota_check and would allow these operations to exceed quota. This was not very apparent since most operations allocated only 1 block at a time and quotas would get violated in the next operation. i.e. quota excess would only be by 1 block or so. With fallocate, (where we allocate a bunch of blocks at once) the quota excess is non-trivial and is addressed by this patch. Signed-off-by: NAbhi Das <adas@redhat.com> Signed-off-by: NBob Peterson <rpeterso@redhat.com> Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Bob Peterson 提交于
This patch moves function gfs2_file_splice_write so it's not conditionally compiled. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Bob Peterson 提交于
This patch adds a GFS2-specific function for splice_write which first calls function gfs2_rs_alloc to make sure a reservation structure has been allocated before attempting to reserve blocks. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 11 2月, 2015 1 次提交
-
-
由 Kirill A. Shutemov 提交于
Nobody uses it anymore. [akpm@linux-foundation.org: fix filemap_xip.c] Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 2月, 2015 1 次提交
-
-
由 Theodore Ts'o 提交于
Add a new mount option which enables a new "lazytime" mode. This mode causes atime, mtime, and ctime updates to only be made to the in-memory version of the inode. The on-disk times will only get updated when (a) if the inode needs to be updated for some non-time related change, (b) if userspace calls fsync(), syncfs() or sync(), or (c) just before an undeleted inode is evicted from memory. This is OK according to POSIX because there are no guarantees after a crash unless userspace explicitly requests via a fsync(2) call. For workloads which feature a large number of random write to a preallocated file, the lazytime mount option significantly reduces writes to the inode table. The repeated 4k writes to a single block will result in undesirable stress on flash devices and SMR disk drives. Even on conventional HDD's, the repeated writes to the inode table block will trigger Adjacent Track Interference (ATI) remediation latencies, which very negatively impact long tail latencies --- which is a very big deal for web serving tiers (for example). Google-Bug-Id: 18297052 Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 14 11月, 2014 3 次提交
-
-
由 Andrew Price 提交于
gfs2_fallocate() wasn't updating ctime and mtime when modifying the inode. Add a call to file_update_time() to do that. Signed-off-by: NAndrew Price <anprice@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Andrew Price 提交于
This addresses an issue caught by fsx where the inode size was not being updated to the expected value after fallocate(2) with mode 0. The problem was caused by the offset and len parameters being converted to multiples of the file system's block size, so i_size would be rounded up to the nearest block size multiple instead of the requested size. This replaces the per-chunk i_size updates with a single i_size_write on successful completion of the operation. With this patch gfs2 gets through a complete run of fsx. For clarity, the check for (error == 0) following the loop is removed as all failures before that point jump to out_* labels or return. Signed-off-by: NAndrew Price <anprice@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Andrew Price 提交于
gfs2_fallocate wasn't checking inode_newsize_ok nor get_write_access. Split out the context setup and inode locking pieces into a separate function to make it more clear and add these missing calls. inode_newsize_ok is called conditional on FALLOC_FL_KEEP_SIZE as there is no need to enforce a file size limit if it isn't going to change. Signed-off-by: NAndrew Price <anprice@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 04 11月, 2014 1 次提交
-
-
由 Bob Peterson 提交于
If an application does a sequence of (1) big write, (2) little write we don't necessarily want to reset the size hint based on the smaller size. The fact that they did any big writes implies they may do more, and therefore we should try to allocate bigger block reservations, even if the last few were small writes. Therefore this patch changes function gfs2_size_hint so that the size hint can only grow; it cannot shrink. This is especially important where there are multiple writers. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 10 9月, 2014 1 次提交
-
-
由 Jeff Layton 提交于
GFS2 and NFS have setlease routines that always just return -EINVAL. Turn that into a generic routine that can live in fs/libfs.c. Cc: <linux-nfs@vger.kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: <cluster-devel@redhat.com> Signed-off-by: NJeff Layton <jlayton@primarydata.com> Acked-by: NTrond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 21 8月, 2014 1 次提交
-
-
由 Bob Peterson 提交于
This patch changes the flock code so that it uses the TRY_1CB flag instead of the TRY flag on the first attempt. That forces any holding nodes to issue a dlm callback, which requests a demote of the glock. Then, if the "try" failed, it sleeps a small amount of time for the demote to occur. Then it tries again, for an increasing amount of time. Subsequent attempts to gain the "try" lock don't use "_1CB" so that only one callback is issued. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 18 7月, 2014 2 次提交
-
-
由 Bob Peterson 提交于
This patch removes the GLF_NOCACHE flag from the glocks associated with flocks. There should be no good reason not to cache glocks for flocks: they only force the glock to be demoted before they can be reacquired, which can slow down performance and even cause glock hangs, especially in cases where the flocks are held in Shared (SH) mode. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Bob Peterson 提交于
This patch allows flock glocks to use a non-blocking dequeue rather than dq_wait. It also reverts the previous patch I had posted regarding dq_wait. The reverted patch isn't necessarily a bad idea, but I decided this might avoid unforeseen side effects, and was therefore safer. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 12 6月, 2014 1 次提交
-
-
由 Al Viro 提交于
iter_file_splice_write() - a ->splice_write() instance that gathers the pipe buffers, builds a bio_vec-based iov_iter covering those and feeds it to ->write_iter(). A bunch of simple cases coverted to that... [AV: fixed the braino spotted by Cyrill] Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 16 5月, 2014 1 次提交
-
-
由 Fabian Frederick 提交于
Related function is not gfs2_set_flags but do_gfs2_set_flags Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 14 5月, 2014 1 次提交
-
-
由 Benjamin Marzinski 提交于
GFS2 has a transaction glock, which must be grabbed for every transaction, whose purpose is to deal with freezing the filesystem. Aside from this involving a large amount of locking, it is very easy to make the current fsfreeze code hang on unfreezing. This patch rewrites how gfs2 handles freezing the filesystem. The transaction glock is removed. In it's place is a freeze glock, which is cached (but not held) in a shared state by every node in the cluster when the filesystem is mounted. This lock only needs to be grabbed on freezing, and actions which need to be safe from freezing, like recovery. When a node wants to freeze the filesystem, it grabs this glock exclusively. When the freeze glock state changes on the nodes (either from shared to unlocked, or shared to exclusive), the filesystem does a special log flush. gfs2_log_flush() does all the work for flushing out the and shutting down the incore log, and then it tries to grab the freeze glock in a shared state again. Since the filesystem is stuck in gfs2_log_flush, no new transaction can start, and nothing can be written to disk. Unfreezing the filesytem simply involes dropping the freeze glock, allowing gfs2_log_flush() to grab and then release the shared lock, so it is cached for next time. However, in order for the unfreezing ioctl to occur, gfs2 needs to get a shared lock on the filesystem root directory inode to check permissions. If that glock has already been grabbed exclusively, fsfreeze will be unable to get the shared lock and unfreeze the filesystem. In order to allow the unfreeze, this patch makes gfs2 grab a shared lock on the filesystem root directory during the freeze, and hold it until it unfreezes the filesystem. The functions which need to grab a shared lock in order to allow the unfreeze ioctl to be issued now use the lock grabbed by the freeze code instead. The freeze and unfreeze code take care to make sure that this shared lock will not be dropped while another process is using it. Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 07 5月, 2014 2 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 08 4月, 2014 1 次提交
-
-
由 Kirill A. Shutemov 提交于
filemap_map_pages() is generic implementation of ->map_pages() for filesystems who uses page cache. It should be safe to use filemap_map_pages() for ->map_pages() if filesystem use filemap_fault() for ->fault(). Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Ning Qu <quning@gmail.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 2月, 2014 1 次提交
-
-
由 Bob Peterson 提交于
This patch causes GFS2 to lock the i_mutex during fallocate. It also switches from using a dinode's inode glock to using a local holder like the other GFS2 i_operations. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 02 10月, 2013 1 次提交
-
-
由 Steven Whitehouse 提交于
This patch adds a structure to contain allocation parameters with the intention of future expansion of this structure. The idea is that we should be able to add more information about the allocation in the future in order to allow the allocator to make a better job of placing the requests on-disk. There is no functional difference from applying this patch. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 27 9月, 2013 1 次提交
-
-
由 Steven Whitehouse 提交于
The reservation for an inode should be cleared when it is truncated so that we can start again at a different offset for future allocations. We could try and do better than that, by resetting the search based on where the truncation started from, but this is only a first step. In addition, there are three callers of gfs2_rs_delete() but only one of those should really be testing the value of i_writecount. While we get away with that in the other cases currently, I think it would be better if we made that test specific to the one case which requires it. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 05 9月, 2013 1 次提交
-
-
由 Benjamin Marzinski 提交于
GFS2 was only setting I_DIRTY_DATASYNC on files that it wrote to, when it actually increased the file size. If gfs2_fsync was called without I_DIRTY_DATASYNC set, it didn't flush the incore data to the log before returning, so any metadata or journaled data changes were not getting fsynced. This meant that writes to the middle of files were not always getting fsynced properly. This patch makes gfs2 set I_DIRTY_DATASYNC whenever metadata has been updated during a write. It also make gfs2_sync flush the incore log if I_DIRTY_PAGES is set, and the file is using data journalling. This will make sure that all incore logged data gets written to disk before returning from a fsync. Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 29 6月, 2013 1 次提交
-
-
由 Jeff Layton 提交于
Having a global lock that protects all of this code is a clear scalability problem. Instead of doing that, move most of the code to be protected by the i_lock instead. The exceptions are the global lists that the ->fl_link sits on, and the ->fl_block list. ->fl_link is what connects these structures to the global lists, so we must ensure that we hold those locks when iterating over or updating these lists. Furthermore, sound deadlock detection requires that we hold the blocked_list state steady while checking for loops. We also must ensure that the search and update to the list are atomic. For the checking and insertion side of the blocked_list, push the acquisition of the global lock into __posix_lock_file and ensure that checking and update of the blocked_list is done without dropping the lock in between. On the removal side, when waking up blocked lock waiters, take the global lock before walking the blocked list and dequeue the waiters from the global list prior to removal from the fl_block list. With this, deadlock detection should be race free while we minimize excessive file_lock_lock thrashing. Finally, in order to avoid a lock inversion problem when handling /proc/locks output we must ensure that manipulations of the fl_block list are also protected by the file_lock_lock. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-