- 06 5月, 2010 2 次提交
-
-
由 Mark Fasheh 提交于
Add a per-inode reservations structure and pass it through to the reservations code. Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
jbd[2]_journal_dirty_metadata() only returns 0. It's been returning 0 since before the kernel moved to git. There is no point in checking this error. ocfs2_journal_dirty() has been faithfully returning the status since the beginning. All over ocfs2, we have blocks of code checking this can't fail status. In the past few years, we've tried to avoid adding these checks, because they are pointless. But anyone who looks at our code assumes they are needed. Finally, ocfs2_journal_dirty() is made a void function. All error checking is removed from other files. We'll BUG_ON() the status of jbd2_journal_dirty_metadata() just in case they change it someday. They won't. Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 05 3月, 2010 4 次提交
-
-
由 Christoph Hellwig 提交于
Get rid of the initialize dquot operation - it is now always called from the filesystem and if a filesystem really needs it's own (which none currently does) it can just call into it's own routine directly. Rename the now static low-level dquot_initialize helper to __dquot_initialize and vfs_dq_init to dquot_initialize to have a consistent namespace. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Christoph Hellwig 提交于
Currently various places in the VFS call vfs_dq_init directly. This means we tie the quota code into the VFS. Get rid of that and make the filesystem responsible for the initialization. For most metadata operations this is a straight forward move into the methods, but for truncate and open it's a bit more complicated. For truncate we currently only call vfs_dq_init for the sys_truncate case because open already takes care of it for ftruncate and open(O_TRUNC) - the new code causes an additional vfs_dq_init for those which is harmless. For open the initialization is moved from do_filp_open into the open method, which means it happens slightly earlier now, and only for regular files. The latter is fine because we don't need to initialize it for operations on special files, and we already do it as part of the namespace operations for directories. Add a dquot_file_open helper that filesystems that support generic quotas can use to fill in ->open. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Christoph Hellwig 提交于
Get rid of the transfer dquot operation - it is now always called from the filesystem and if a filesystem really needs it's own (which none currently does) it can just call into it's own routine directly. Rename the now static low-level dquot_transfer helper to __dquot_transfer and vfs_dq_transfer to dquot_transfer to have a consistent namespace, and make the new dquot_transfer return a normal negative errno value which all callers expect. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Christoph Hellwig 提交于
Get rid of the alloc_space, free_space, reserve_space, claim_space and release_rsv dquot operations - they are always called from the filesystem and if a filesystem really needs their own (which none currently does) it can just call into it's own routine directly. Move shared logic into the common __dquot_alloc_space, dquot_claim_space_nodirty and __dquot_free_space low-level methods, and rationalize the wrappers around it to move as much as possible code into the common block for CONFIG_QUOTA vs not. Also rename all these helpers to be named dquot_* instead of vfs_dq_*. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 28 2月, 2010 1 次提交
-
-
由 Wengang Wang 提交于
This patch makes ocfs2 send SIGXFSZ if new file size exceeds the rlimit. Processes may get SIGXFSZ on one node (in the cluster) while others will not on another if file size limits are different on the two nodes. Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 27 2月, 2010 2 次提交
-
-
由 Coly Li 提交于
This patch fixes a compiling warning in ocfs2_file_aio_write(). Signed-off-by: NColy Li <coly.li@suse.de> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Wengang Wang 提交于
When ocfs2 has to do CoW for refcounted extents, we disable direct I/O and go through the buffered I/O path. This makes the combined check easier to read. Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 03 2月, 2010 1 次提交
-
-
由 Tao Ma 提交于
Add parenthesis to wrap the check for O_DIRECT. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 26 1月, 2010 1 次提交
-
-
由 Sunil Mushran 提交于
Patch removes trailing whitespaces. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 31 12月, 2009 1 次提交
-
-
由 Tao Ma 提交于
In case of writing to a refcounted cluster with O_DIRECT, we need to fall back to buffer write. And when it is finished, we need to flush the page and the journal as we did for other O_DIRECT writes. This patch fix oss bug 1191. http://oss.oracle.com/bugzilla/show_bug.cgi?id=1191Signed-off-by: NTao Ma <tao.ma@oracle.com> Tested-by: NTristan Ye <tristan.ye@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 10 12月, 2009 1 次提交
-
-
由 Christoph Hellwig 提交于
While Linux provided an O_SYNC flag basically since day 1, it took until Linux 2.4.0-test12pre2 to actually get it implemented for filesystems, since that day we had generic_osync_around with only minor changes and the great "For now, when the user asks for O_SYNC, we'll actually give O_DSYNC" comment. This patch intends to actually give us real O_SYNC semantics in addition to the O_DSYNC semantics. After Jan's O_SYNC patches which are required before this patch it's actually surprisingly simple, we just need to figure out when to set the datasync flag to vfs_fsync_range and when not. This patch renames the existing O_SYNC flag to O_DSYNC while keeping it's numerical value to keep binary compatibility, and adds a new real O_SYNC flag. To guarantee backwards compatiblity it is defined as expanding to both the O_DSYNC and the new additional binary flag (__O_SYNC) to make sure we are backwards-compatible when compiled against the new headers. This also means that all places that don't care about the differences can just check O_DSYNC and get the right behaviour for O_SYNC, too - only places that actuall care need to check __O_SYNC in addition. Drivers and network filesystems have been updated in a fail safe way to always do the full sync magic if O_DSYNC is set. The few places setting O_SYNC for lower layers are kept that way for now to stay failsafe. We enforce that O_DSYNC is set when __O_SYNC is set early in the open path to make sure we always get these sane options. Note that parisc really screwed up their headers as they already define a O_DSYNC that has always been a no-op. We try to repair it by using it for the new O_DSYNC and redefinining O_SYNC to send both the traditional O_SYNC numerical value _and_ the O_DSYNC one. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger@sun.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NKyle McMartin <kyle@mcmartin.ca> Acked-by: NUlrich Drepper <drepper@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 29 10月, 2009 1 次提交
-
-
由 Tao Ma 提交于
The old reflink fails to handle inodes with inline data and will oops if it encounters them. This patch copies inline data to the new inode. Extended attributes may still be refcounted. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com> Tested-by: NTristan Ye <tristan.ye@oracle.com>
-
- 23 9月, 2009 3 次提交
-
-
由 Tao Ma 提交于
Now with xattr refcount support, we need to check whether we have xattr refcounted before we remove the refcount tree. Now the mechanism is: 1) Check whether i_clusters == 0, if no, exit. 2) check whether we have i_xattr_loc in dinode. if yes, exit. 2) Check whether we have inline xattr stored outside, if yes, exit. 4) Remove the tree. Signed-off-by: NTao Ma <tao.ma@oracle.com>
-
由 Tao Ma 提交于
When we truncate a file to a specific size which resides in a reflinked cluster, we need to CoW it since ocfs2_zero_range_for_truncate will zero the space after the size(just another type of write). So we add a "max_cpos" in ocfs2_refcount_cow so that it will stop when it hit the max cluster offset. Signed-off-by: NTao Ma <tao.ma@oracle.com>
-
由 Tao Ma 提交于
When we use mmap, we CoW the refcountd clusters in ocfs2_write_begin_nolock. While for normal file io(including directio), we do CoW in ocfs2_prepare_inode_for_write. Signed-off-by: NTao Ma <tao.ma@oracle.com>
-
- 14 9月, 2009 2 次提交
-
-
由 Jan Kara 提交于
Update ocfs2 specific splicing code to use generic syncing helper. The sync now does not happen under rw_lock because generic_write_sync() acquires i_mutex which ranks above rw_lock. That should not matter because standard fsync path does not hold it either. Acked-by: NJoel Becker <Joel.Becker@oracle.com> Acked-by: NMark Fasheh <mfasheh@suse.com> CC: ocfs2-devel@oss.oracle.com Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Jan Kara 提交于
Use the new helper. We have to submit data pages ourselves in case of O_SYNC write because __generic_file_aio_write does not do it for us. OCFS2 developpers might think about moving the sync out of i_mutex which seems to be easily possible but that's out of scope of this patch. CC: ocfs2-devel@oss.oracle.com Acked-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 05 9月, 2009 3 次提交
-
-
由 Joel Becker 提交于
With this commit, extent tree operations are divorced from inodes and rely on ocfs2_caching_info. Phew! Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Joel Becker 提交于
One more function that doesn't need a struct inode to pass to its children. Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Joel Becker 提交于
The next step in divorcing metadata I/O management from struct inode is to pass struct ocfs2_caching_info to the journal functions. Thus the journal locks a metadata cache with the cache io_lock function. It also can compare ci_last_trans and ci_created_trans directly. This is a large patch because of all the places we change ocfs2_journal_access..(handle, inode, ...) to ocfs2_journal_access..(handle, INODE_CACHE(inode), ...). Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 21 7月, 2009 1 次提交
-
-
由 Goldwyn Rodrigues 提交于
generic_write_checks() expects count to be initialized to the size of the write. Writes to files open with O_DIRECT|O_LARGEFILE write 0 bytes because count is uninitialized. Signed-off-by: NGoldwyn Rodrigues <rgoldwyn@suse.de> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 11 7月, 2009 1 次提交
-
-
由 Wengang Wang 提交于
in ocfs2_file_aio_write(), log_exit() could don't log the value which is really returned. this patch fixes it. Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 23 6月, 2009 1 次提交
-
-
由 Tao Ma 提交于
We should call ocfs2_inode_lock_atime instead of ocfs2_inode_lock in ocfs2_file_splice_read like we do in ocfs2_file_aio_read so that we can update atime in splice read if necessary. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 10 6月, 2009 1 次提交
-
-
由 Hisashi Hifumi 提交于
In ocfs2, fdatasync and fsync are identical. I think fdatasync should skip committing transaction when inode->i_state is set just I_DIRTY_SYNC and this indicates only atime or/and mtime updates. Following patch improves fdatasync throughput. #sysbench --num-threads=16 --max-requests=300000 --test=fileio --file-block-size=4K --file-total-size=16G --file-test-mode=rndwr --file-fsync-mode=fdatasync run Results: -2.6.30-rc8 Test execution summary: total time: 107.1445s total number of events: 119559 total time taken by event execution: 116.1050 per-request statistics: min: 0.0000s avg: 0.0010s max: 0.1220s approx. 95 percentile: 0.0016s Threads fairness: events (avg/stddev): 7472.4375/303.60 execution time (avg/stddev): 7.2566/0.64 -2.6.30-rc8-patched Test execution summary: total time: 86.8529s total number of events: 300016 total time taken by event execution: 24.3077 per-request statistics: min: 0.0000s avg: 0.0001s max: 0.0336s approx. 95 percentile: 0.0001s Threads fairness: events (avg/stddev): 18751.0000/718.75 execution time (avg/stddev): 1.5192/0.05 Signed-off-by: NHisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Acked-by: NMark Fasheh <mfasheh@suse.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 04 6月, 2009 1 次提交
-
-
由 Jan Kara 提交于
We called vfs_dq_transfer() with global quota file lock held. This can lead to deadlocks as if vfs_dq_transfer() has to allocate new quota structure, it calls ocfs2_dquot_acquire() which tries to get quota file lock again and this can block if another node requested the lock in the mean time. Since we have to call vfs_dq_transfer() with transaction already started and quota file lock ranks above the transaction start, we cannot just rely on ocfs2_dquot_acquire() or ocfs2_dquot_release() on getting the lock if they need it. We fix the problem by acquiring pointers to all quota structures needed by vfs_dq_transfer() already before calling the function. By this we are sure that all quota structures are properly allocated and they can be freed only after we drop references to them. Thus we don't need quota file lock anywhere inside vfs_dq_transfer(). Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
- 15 4月, 2009 1 次提交
-
-
由 Miklos Szeredi 提交于
Rearrange locking of i_mutex on destination and call to ocfs2_rw_lock() so locks are only held while buffers are copied with the pipe_to_file() actor, and not while waiting for more data on the pipe. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 07 4月, 2009 1 次提交
-
-
由 Miklos Szeredi 提交于
There's a possible deadlock in generic_file_splice_write(), splice_from_pipe() and ocfs2_file_splice_write(): - task A calls generic_file_splice_write() - this calls inode_double_lock(), which locks i_mutex on both pipe->inode and target inode - ordering depends on inode pointers, can happen that pipe->inode is locked first - __splice_from_pipe() needs more data, calls pipe_wait() - this releases lock on pipe->inode, goes to interruptible sleep - task B calls generic_file_splice_write(), similarly to the first - this locks pipe->inode, then tries to lock inode, but that is already held by task A - task A is interrupted, it tries to lock pipe->inode, but fails, as it is already held by task B - ABBA deadlock Fix this by explicitly ordering locks: the outer lock must be on target inode and the inner lock (which is later unlocked and relocked) must be on pipe->inode. This is OK, pipe inodes and target inodes form two nonoverlapping sets, generic_file_splice_write() and friends are not called with a target which is a pipe. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Acked-by: NMark Fasheh <mfasheh@suse.com> Acked-by: NJens Axboe <jens.axboe@oracle.com> Cc: stable@kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 1月, 2009 1 次提交
-
-
由 Fernando Carrijo 提交于
Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: NTheodore Ts'o <tytso@mit.edu> Acked-by: NMark Fasheh <mfasheh@suse.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Acked-by: NCasey Schaufler <casey@schaufler-ca.com> Acked-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 1月, 2009 7 次提交
-
-
由 Joel Becker 提交于
The per-metadata-type ocfs2_journal_access_*() functions hook up jbd2 commit triggers and allow us to compute metadata ecc right before the buffers are written out. This commit provides ecc for inodes, extent blocks, group descriptors, and quota blocks. It is not safe to use extened attributes and metaecc at the same time yet. The ocfs2_extent_tree and ocfs2_path abstractions in alloc.c both hide the type of block at their root. Before, it didn't matter, but now the root block must use the appropriate ocfs2_journal_access_*() function. To keep this abstract, the structures now have a pointer to the matching journal_access function and a wrapper call to call it. A few places use naked ocfs2_write_block() calls instead of adding the blocks to the journal. We make sure to calculate their checksum and ecc before the write. Since we pass around the journal_access functions. Let's typedef them in ocfs2.h. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Jan Kara 提交于
Add quota calls for allocation and freeing of inodes and space, also update estimates on number of needed credits for a transaction. Move out inode allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called outside of a transaction. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Jan Kara 提交于
For each quota type each node has local quota file. In this file it stores changes users have made to disk usage via this node. Once in a while this information is synced to global file (and thus with other nodes) so that limits enforcement at least aproximately works. Global quota files contain all the information about usage and limits. It's mostly handled by the generic VFS code (which implements a trie of structures inside a quota file). We only have to provide functions to convert structures from on-disk format to in-memory one. We also have to provide wrappers for various quota functions starting transactions and acquiring necessary cluster locks before the actual IO is really started. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
The ocfs2 code currently reads inodes off disk with a simple ocfs2_read_block() call. Each place that does this has a different set of sanity checks it performs. Some check only the signature. A couple validate the block number (the block read vs di->i_blkno). A couple others check for VALID_FL. Only one place validates i_fs_generation. A couple check nothing. Even when an error is found, they don't all do the same thing. We wrap inode reading into ocfs2_read_inode_block(). This will validate all the above fields, going readonly if they are invalid (they never should be). ocfs2_read_inode_block_full() is provided for the places that want to pass read_block flags. Every caller is passing a struct inode with a valid ip_blkno, so we don't need a separate blkno argument either. We will remove the validation checks from the rest of the code in a later commit, as they are no longer necessary. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tiger Yang 提交于
This function is used to update acl xattrs during file mode changes. Signed-off-by: NTiger Yang <tiger.yang@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tiger Yang 提交于
This function is used to enhance permission checking with POSIX ACLs. Signed-off-by: NTiger Yang <tiger.yang@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Mark Fasheh 提交于
This patch genericizes the high level handling of extent removal. ocfs2_remove_btree_range() is nearly identical to __ocfs2_remove_inode_range(), except that extent tree operations have been used where necessary. We update ocfs2_remove_inode_range() to use the generic helper. Now extent tree based structures have an easy way to truncate ranges. Signed-off-by: NMark Fasheh <mfasheh@suse.com> Acked-by: NJoel Becker <joel.becker@oracle.com>
-
- 11 11月, 2008 2 次提交
-
-
由 Dmitri Monakhov 提交于
Signed-off-by: NDmitri Monakhov <dmonakhov@openvz.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Joel Becker <Joel.Becker@oracle.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Jan Kara 提交于
On failure, ocfs2_start_trans() returns values like ERR_PTR(-ENOMEM). Thus checks for !handle are wrong. Fix them to use IS_ERR(). Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
- 31 10月, 2008 1 次提交
-
-
由 Nick Piggin 提交于
Nothing uses prepare_write or commit_write. Remove them from the tree completely. [akpm@linux-foundation.org: schedule simple_prepare_write() for unexporting] Signed-off-by: NNick Piggin <npiggin@suse.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-