- 10 Jan 2015, 6 commits
-
-
Committed by Jaegeuk Kim

Now that we use inmemory pages only for atomic writes and provide an abort procedure, we don't need to truncate them explicitly.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Committed by Jaegeuk Kim

The ckpt_valid_map and cur_valid_map are synced by seg_info_to_raw_sit. In the case of small discards, the candidates are selected before this sync, while fitrim selects candidates after it. So, for small discards, we need to add only those candidates that have just become obsolete.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Committed by Jaegeuk Kim

This patch adds two new ioctls to release inmemory pages grabbed by atomic writes.

o f2fs_ioc_abort_volatile_write
  - If the transaction failed, all the grabbed pages and data should be written.

o f2fs_ioc_release_volatile_write
  - This is to enhance the performance of PERSIST mode in sqlite. In order to avoid huge memory consumption which causes OOM, this patch changes volatile writes to use normal dirty pages, whose flushing to disk is blocked as long as the system does not suffer from memory pressure.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
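For context, a userspace transaction would drive these ioctls roughly as in the minimal sketch below. The ioctl macro values (magic 0xf5, commands 3-5, mirroring the f2fs header of this era) and the file path are assumptions; check your kernel's fs/f2fs headers before relying on them.

    /* Sketch of a volatile-write transaction; path and macro values are
     * assumptions -- verify against your kernel sources. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ioctl.h>

    #define F2FS_IOCTL_MAGIC                0xf5
    #define F2FS_IOC_START_VOLATILE_WRITE   _IO(F2FS_IOCTL_MAGIC, 3)
    #define F2FS_IOC_RELEASE_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 4)
    #define F2FS_IOC_ABORT_VOLATILE_WRITE   _IO(F2FS_IOCTL_MAGIC, 5)

    int main(void)
    {
        int fd = open("/mnt/f2fs/journal.db", O_RDWR | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        ioctl(fd, F2FS_IOC_START_VOLATILE_WRITE);

        if (write(fd, "txn", 3) == 3) {
            /* transaction done: demote the pages to normal dirty pages */
            ioctl(fd, F2FS_IOC_RELEASE_VOLATILE_WRITE);
        } else {
            /* transaction failed: write out the grabbed pages and data */
            ioctl(fd, F2FS_IOC_ABORT_VOLATILE_WRITE);
        }
        close(fd);
        return 0;
    }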
-
Committed by Jaegeuk Kim

We don't need to call lock_op and lock_page in the abort path.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Committed by Jaegeuk Kim

If there is not enough available memory, we need to trigger f2fs_sync_fs.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Committed by Jaegeuk Kim

We don't need to force writing dirty_exceeded for f2fs_balance_fs_bg. This flag was only meaningful to write-bypassing conditions.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 09 Jan 2015, 3 commits
-
-
Committed by David Drysdale

Fix clashing values for O_PATH and FMODE_NONOTIFY on sparc. The clashing O_PATH value was added in commit 5229645b ("vfs: add nonconflicting values for O_PATH"), but this can't be changed as it is user-visible. FMODE_NONOTIFY is only used internally in the kernel, but it is in the same numbering space as the other O_* flags, as indicated by the comment at the top of include/uapi/asm-generic/fcntl.h (and its use in fs/notify/fanotify/fanotify_user.c). So renumber it to avoid the clash.

All of this has happened before (commit 12ed2e36: "fanotify: FMODE_NONOTIFY and __O_SYNC in sparc conflict"), and all of this will happen again -- so update the uniqueness check in fcntl_init() to include __FMODE_NONOTIFY.

Signed-off-by: David Drysdale <drysdale@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
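To illustrate the invariant that such a uniqueness check enforces, here is a small userspace sketch: when all O_* flags (plus the internal __FMODE_NONOTIFY) occupy distinct bits, OR-ing them together sets exactly as many bits as there are flags. The flag values below are made up for the example, not sparc's real ones.

    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        /* illustrative stand-ins, not the real sparc values */
        unsigned int flags[] = {
            0x0100,   /* an O_PATH-like flag       */
            0x0200,   /* an __O_SYNC-like flag     */
            0x0400,   /* an __FMODE_NONOTIFY-like  */
        };
        unsigned int all = 0;
        unsigned int n = sizeof(flags) / sizeof(flags[0]);

        for (unsigned int i = 0; i < n; i++)
            all |= flags[i];

        /* a shared bit would make the popcount come up short */
        assert((unsigned int)__builtin_popcount(all) == n);
        puts("no clashing bits");
        return 0;
    }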
-
Committed by Xue jiufei

In ocfs2_link(), the parent directory inode passed to ocfs2_lookup_ino_from_name() is wrong: the parameter dir is the parent of new_dentry, not old_dentry. We should get old_dir from old_dentry and look up old_dentry in old_dir, in case another node removes the old dentry. With this change, hard linking works again when paths are relative with at least one subdirectory. This is how the problem was reproducible:

    # mkdir a
    # mkdir b
    # touch a/test
    # ln a/test b/test
    ln: failed to create hard link `b/test' => `a/test': No such file or directory

However, when creating links in the same directory it worked well. Now the link gets created.

Fixes: 0e048316 ("ocfs2: check existence of old dentry in ocfs2_link()")
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reported-by: Szabo Aron - UBIT <aron@ubit.hu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Tested-by: Aron Szabo <aron@ubit.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joseph Qi

In dlm_process_recovery_data, ret is set to -ENOMEM only when dlm_new_lock failed, and in that case newlock is definitely NULL. So testing newlock is meaningless; remove it.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 03 Jan 2015, 2 commits
-
-
Committed by Jakub Wilk

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Committed by Theodore Ts'o

This reverts commit 14516bb7. It was causing regression test failures with generic/285 on an ext3 filesystem using CONFIG_EXT4_USE_FOR_EXT23.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
- 27 Dec 2014, 1 commit
-
-
Committed by Theodore Ts'o

Prevent a BUG or corrupted file systems after the following:

    mkfs.ext4 /dev/vdc 100M
    mount -t ext4 -o sb=40961 /dev/vdc /vdc
    resize2fs /dev/vdc

We previously prevented online resizing using the old resize ioctl. Move the code to ext4_resize_begin(), so the check applies to all of the resize ioctls.

Reported-by: Maxim Malkov <malkov@ispras.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
- 23 Dec 2014, 1 commit
-
-
Committed by Nakajima Akira

In spite of a different file type, if a file has the same name and same inode number, the old inode cache is used. This means you cannot cd into the directory or cat the SymbolicLink. So with this patch, if the file type is different, return an error.

Reproducible sample:
1. Create file 'a' at the cifs client.
2. Repeat rm and mkdir 'a' 4 times at the server; then a directory 'a' having the same inode number is created. (Repeat 4 times, then the same inode number is recycled. When the server runs RHEL 6.6, 1 time is enough: the same inode number is always recycled.)
3. ls -li at the client; then you cannot cd into the directory and cannot remove it.

SymbolicLink has the same problem.

Bug link: https://bugzilla.kernel.org/show_bug.cgi?id=90011
Signed-off-by: Nakajima Akira <nakajima.akira@nttcom.co.jp>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Steve French <steve.french@primarydata.com>
-
- 22 Dec 2014, 2 commits
-
-
Committed by Jan Kara

Replace repeated dereferences like dir->i_sb by storing the superblock pointer in a variable and using that.

Signed-off-by: Jan Kara <jack@suse.cz>
-
Committed by Jan Kara

Check that the length specified in a component of a symlink fits in the input buffer we are reading. Also properly ignore the component length for component types that do not use it. Otherwise we read memory after the end of the buffer for a corrupted udf image.

Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
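The shape of the added check can be sketched as below, assuming a generic length-prefixed component layout (a 2-byte header carrying a type byte and a length byte, then 'length' bytes of payload). The helper name and layout are hypothetical, not the actual udf code.

    #include <errno.h>
    #include <stddef.h>

    /* walk length-prefixed components without reading past 'len' bytes */
    static int walk_components(const unsigned char *buf, size_t len)
    {
        size_t off = 0;

        while (off < len) {
            if (len - off < 2)            /* the 2-byte header must fit */
                return -EIO;
            size_t clen = buf[off + 1];   /* declared payload length */
            if (len - off - 2 < clen)     /* ...and so must the payload */
                return -EIO;
            /* consume component: type buf[off], payload at buf[off + 2] */
            off += 2 + clen;
        }
        return 0;
    }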
-
- 19 Dec 2014, 10 commits
-
-
Committed by Jan Kara

The symlink reading code does not check whether the resulting path fits into the page provided by the generic code. This isn't as easy as just checking the symlink size, because of the various encoding conversions we perform on the path. So we have to check whether there is still enough space in the buffer on the fly.

CC: stable@vger.kernel.org
Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
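A minimal sketch of the on-the-fly space check, with a hypothetical append helper: since encoding conversions make the final length unpredictable, each fragment is bounds-checked as it is produced rather than once up front.

    #include <errno.h>
    #include <stddef.h>
    #include <string.h>

    /* append 'fraglen' bytes to a fixed-size output buffer, refusing
     * to overflow it no matter what the declared symlink size said */
    static int emit(char *out, size_t outlen, size_t *pos,
                    const char *frag, size_t fraglen)
    {
        if (outlen - *pos < fraglen)
            return -ENAMETOOLONG;
        memcpy(out + *pos, frag, fraglen);
        *pos += fraglen;
        return 0;
    }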
-
Committed by Jan Kara

The UDF specification allows arbitrarily large symlinks, but we support only symlinks at most one block large. Check the length of the symlink so that we don't access memory beyond the end of the symlink block.

CC: stable@vger.kernel.org
Reported-by: Carl Henrik Lunde <chlunde@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
-
Committed by Jan Kara

Verify that the inode size is sane when loading an inode with data stored in the ICB. Otherwise we may get confused later when working with the inode and the inode size is too big.

CC: stable@vger.kernel.org
Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
-
Committed by Jan Kara

We didn't check the length of rock ridge ER records before printing them. Thus a corrupted isofs image can cause us to access and print some memory behind the buffer, with obvious consequences.

Reported-and-tested-by: Carl Henrik Lunde <chlunde@ping.uio.no>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
-
Committed by Junxiao Bi

For buffer writes, the page lock is taken in write_begin and released in write_end. In ocfs2_write_end_nolock(), before it unlocks the page in ocfs2_free_write_ctxt(), it calls ocfs2_run_deallocs(), which asks for the read lock of journal->j_trans_barrier. Holding the page lock while asking for journal->j_trans_barrier breaks the locking order.

This can deadlock with the journal commit threads: ocfs2cmt takes the write lock of journal->j_trans_barrier first, then wakes up kjournald2 to do the commit work, and finally waits until it is done. To commit the journal, kjournald2 needs to flush data first, which requires taking the page-cache page lock. Since some ocfs2 cluster locks are held by the writing process, this deadlock may hang the whole cluster.

Unlocking the pages before ocfs2_run_deallocs() fixes the locking order; also put the unlock before ocfs2_commit_trans() so the page lock is released before j_trans_barrier, preserving the unlocking order.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
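The ordering problem reduces to a classic ABBA pattern; the pthread toy below (names illustrative, not the ocfs2 code) shows why releasing the page lock before touching the barrier removes the deadlock.

    #include <pthread.h>

    static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_rwlock_t j_trans_barrier = PTHREAD_RWLOCK_INITIALIZER;

    /* before: read-locks the barrier while still holding the page lock,
     * deadlocking against a committer that holds the barrier for write
     * and then needs the page lock to flush data */
    static void write_end_deadlocky(void)
    {
        pthread_mutex_lock(&page_lock);
        /* ... write_end work ... */
        pthread_rwlock_rdlock(&j_trans_barrier);
        /* ... deallocs ... */
        pthread_rwlock_unlock(&j_trans_barrier);
        pthread_mutex_unlock(&page_lock);
    }

    /* after: the page lock is dropped first, restoring a single order */
    static void write_end_fixed(void)
    {
        pthread_mutex_lock(&page_lock);
        /* ... write_end work ... */
        pthread_mutex_unlock(&page_lock);
        pthread_rwlock_rdlock(&j_trans_barrier);
        /* ... deallocs ... */
        pthread_rwlock_unlock(&j_trans_barrier);
    }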
-
Committed by Joseph Qi

Commit ac4fef4d ("ocfs2/dlm: do not purge lockres that is queued for assert master") may have the following possible race:

    dlm_dispatch_assert_master                dlm_wq
    ========================================================================
    queue_work(dlm->dlm_worker,
               &dlm->dispatched_work);
                                              dispatch work,
                                              dlm_lockres_drop_inflight_worker
                                              *BUG_ON(res->inflight_assert_workers == 0)*
    dlm_lockres_grab_inflight_worker
        inflight_assert_workers++

So ensure inflight_assert_workers is increased first.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Xue jiufei <xuejiufei@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
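In miniature, the fix is the usual "account before you publish" rule when handing work to another thread; a toy C11 sketch (names hypothetical) of the corrected ordering:

    #include <assert.h>
    #include <stdatomic.h>

    static atomic_int inflight_assert_workers;

    static void worker_fn(void)
    {
        /* the BUG_ON from the race trace above */
        assert(atomic_load(&inflight_assert_workers) > 0);
        /* ... do the assert-master work ... */
        atomic_fetch_sub(&inflight_assert_workers, 1);
    }

    static void dispatch(void)
    {
        /* grab the in-flight ref BEFORE queueing, so the worker can
         * never observe a zero count, however fast it runs */
        atomic_fetch_add(&inflight_assert_workers, 1);
        worker_fn();   /* stands in for queue_work() */
    }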
-
Committed by Junxiao Bi

When running the ocfs2 test suite's multi-node reflink stress test on a 4-node cluster, every unlink() of a refcounted file takes about 700s. The slow unlink is caused by contention on the refcount tree lock, since all nodes unlink files using the same refcount tree. When the file being unlinked has many extents (over 1600 in our test), most of the extents have the refcounted flag set. In ocfs2_commit_truncate(), the following call chain is executed for every extent, which means the refcount tree lock is taken and released about 1600 times:

    ocfs2_remove_btree_range()
      ocfs2_lock_refcount_tree()
        ocfs2_refcount_lock()
          __ocfs2_cluster_lock()

And when several nodes do this at the same time, performance is very low. ocfs2_refcount_lock() is costly; moving it to ocfs2_commit_truncate() so that lock/unlock happens only once improves performance a lot.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Wengang <wen.gang.wang@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
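The change amounts to hoisting a loop-invariant lock out of the per-extent loop; a sketch with stub helpers (hypothetical names, not the ocfs2 functions):

    static void lock_refcount_tree(void)   { /* cluster-lock stub */ }
    static void unlock_refcount_tree(void) { /* cluster-unlock stub */ }
    static void remove_extent(int i)       { (void)i; /* btree-range stub */ }

    /* before: ~1600 cluster-lock round trips per truncate */
    static void truncate_per_extent(int nr_extents)
    {
        for (int i = 0; i < nr_extents; i++) {
            lock_refcount_tree();
            remove_extent(i);
            unlock_refcount_tree();
        }
    }

    /* after: one lock/unlock around the whole commit */
    static void truncate_hoisted(int nr_extents)
    {
        lock_refcount_tree();
        for (int i = 0; i < nr_extents; i++)
            remove_extent(i);
        unlock_refcount_tree();
    }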
-
Committed by Pintu Kumar

This patch includes CMA info (CmaTotal, CmaFree) in /proc/meminfo. Currently, on a CMA-enabled system, if somebody wants to know the total CMA size declared, there is no way to tell other than dmesg or the /var/log/messages logs. With this patch we show the CMA info as part of meminfo, so that it can be determined at any point in time. It is populated only when CMA is enabled.

Below is sample output from an ARM-based device with 512MB RAM and 16MB CMA:

    MemTotal:         471172 kB
    MemFree:          111712 kB
    MemAvailable:     271172 kB
    ...
    CmaTotal:          16384 kB
    CmaFree:            6144 kB

This patch also fixes the checkpatch errors that were found during these changes:

    ERROR: space required after that ',' (ctx:ExV)
    199: FILE: fs/proc/meminfo.c:199:
    +  ,atomic_long_read(&num_poisoned_pages) << (PAGE_SHIFT - 10)

    ERROR: space required after that ',' (ctx:ExV)
    202: FILE: fs/proc/meminfo.c:202:
    +  ,K(global_page_state(NR_ANON_TRANSPARENT_HUGEPAGES) *

    ERROR: space required after that ',' (ctx:ExV)
    206: FILE: fs/proc/meminfo.c:206:
    +  ,K(totalcma_pages)

    total: 3 errors, 0 warnings, 2 checks, 236 lines checked

Signed-off-by: Pintu Kumar <pintu.k@samsung.com>
Signed-off-by: Vishnu Pratap Singh <vishnu.ps@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
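The new fields can be read back like any other meminfo counter; a small sketch:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[256];
        FILE *f = fopen("/proc/meminfo", "r");
        if (!f) { perror("/proc/meminfo"); return 1; }
        while (fgets(line, sizeof(line), f))
            if (!strncmp(line, "CmaTotal:", 9) ||
                !strncmp(line, "CmaFree:", 8))
                fputs(line, stdout);   /* only present when CMA is enabled */
        fclose(f);
        return 0;
    }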
-
Committed by Sougata Santra

Long names are not correctly handled by the hfsplus driver. If an attempt is made to create a file/directory with a long name (>255), it succeeds by creating a file/directory with a name truncated to HFSPLUS_MAX_STRLEN and an incorrect catalog key, thus leaving the volume in an inconsistent state. This patch fixes the issue.

Although lookup is always called first to create a negative entry, so just doing a check in lookup would probably fix this issue, I chose to propagate the error to the other iops as well. Please NOTE: I have factored out hfsplus_cat_build_key_with_cnid from hfsplus_cat_build_key, to avoid unnecessary branching. Thanks a lot.

TEST:
------
    dir="TEST_DIR"
    cdir=`pwd`
    name255="_123456789_123456789_123456789_123456789_123456789_123456789\
    _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
    _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
    _123456789_123456789_123456789_123456789_123456789_1234"
    name256="${name255}5"

    mkdir $dir
    cd $dir
    touch $name255
    rm -f $name255
    touch $name256
    ls -la
    cd $cdir
    rm -rf $dir

RESULT:
-------
    [sougata@ultrabook tmp]$ cdir=`pwd`
    [sougata@ultrabook tmp]$ name255="_123456789_123456789_123456789_123456789_123456789_123456789\
    > _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
    > _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
    > _123456789_123456789_123456789_123456789_123456789_1234"
    [sougata@ultrabook tmp]$ name256="${name255}5"
    [sougata@ultrabook tmp]$
    [sougata@ultrabook tmp]$ mkdir $dir
    [sougata@ultrabook tmp]$ cd $dir
    [sougata@ultrabook TEST_DIR]$ touch $name255
    [sougata@ultrabook TEST_DIR]$ rm -f $name255
    [sougata@ultrabook TEST_DIR]$ touch $name256
    [sougata@ultrabook TEST_DIR]$ ls -la
    ls: cannot access _123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234: No such file or directory
    total 0
    drwxrwxr-x 1 sougata sougata 3 Feb 20 19:56 .
    drwxrwxrwx 1 root    root    6 Feb 20 19:56 ..
    -????????? ? ?       ?       ?            ? _123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234
    [sougata@ultrabook TEST_DIR]$ cd $cdir
    [sougata@ultrabook tmp]$ rm -rf $dir
    rm: cannot remove `TEST_DIR': Directory not empty

-ENAMETOOLONG returned from hfsplus_asc2uni was not propagated to the iops. This allowed hfsplus to create files/directories with names truncated to HFSPLUS_MAX_STRLEN and incorrect keys, leaving the FS in an inconsistent state. This patch fixes this issue.

Signed-off-by: Sougata Santra <sougata@tuxera.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Eric W. Biederman

While reviewing the code of umount_tree I realized that when we append to a preexisting unmounted list, we do not change pprev of the former first item in the list. This means that later, in namespace_unlock, hlist_del_init(&mnt->mnt_hash) on the former first item of the list will stomp unmounted.first, leaving it set to some random mount point which we are likely to free soon. This isn't likely to hit, but if it does I don't know how anyone could track it down.

[ This happened because we don't have all the same operations for hlists as we do for normal doubly-linked lists. In particular, list_splice() is easy on our standard doubly-linked lists, while hlist_splice() doesn't exist and needs both start/end entries of the hlist. And commit 38129a13 incorrectly open-coded that missing hlist_splice(). We should think about making these kinds of "mindless" conversions easier to get right by adding the missing hlist helpers - Linus ]

Fixes: 38129a13 ("switch mnt_hash to hlist")
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
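For reference, the missing hlist_splice() Linus alludes to could look like the sketch below (the node layout matches the kernel's hlist; the helper itself is hypothetical). The second pprev update is exactly the step the open-coded version in umount_tree missed.

    struct hlist_node { struct hlist_node *next, **pprev; };
    struct hlist_head { struct hlist_node *first; };

    /* splice the chain [first..last] onto the front of 'head' */
    static void hlist_splice(struct hlist_node *first,
                             struct hlist_node *last,
                             struct hlist_head *head)
    {
        last->next = head->first;
        if (head->first)                      /* the step that was missed: */
            head->first->pprev = &last->next; /* re-point old first's pprev */
        head->first = first;
        first->pprev = &head->first;
    }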
-
- 18 Dec 2014, 15 commits
-
-
Committed by Linus Torvalds

Neither Sage nor I noticed that Zheng Yan had mistakenly committed fs/ceph/super.h.rej as part of commit 31c542a1 ("ceph: add inline data to pagecache"). Remove it.

Requested-by: Yan, Zheng <ukernel@gmail.com>
Cc: Sage Weil <sweil@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Yan, Zheng

Make sure 'value' is not NULL; otherwise __ceph_setxattr will remove the extended attribute.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
-
Committed by Yan, Zheng

A mksnap reply contains only 'target' and no 'dentry', so it's wrong to use req->r_reply_info.head->is_dentry to detect a traceless reply.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
-
Committed by Dan Carpenter

Probably this code was syncing a lot more often than intended because the do_sync variable wasn't set to zero.

Cc: stable@vger.kernel.org # v3.11+
Fixes: c62988ec ('ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
-
Committed by Yan, Zheng

Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng

After converting inline data to normal data, the client needs to flush the new i_inline_version (CEPH_INLINE_NONE) to the MDS. This commit makes cap messages (sent to the MDS) contain inline_version and inline_data. The client always converts inline data to normal data before a data write, so the inline data length part is always zero.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng

Before any data write, convert inline data to normal data and set i_inline_version to CEPH_INLINE_NONE. The OSD request that saves inline data to the object contains 3 operations (CMPXATTR, WRITE and SETXATTR). It compares an xattr named 'inline_version' to prevent old data from overwriting newer data.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
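The guard the three-operation request implements can be reduced to a toy compare-then-write on a versioned object (a single-threaded stand-in; on the OSD the three steps execute as one atomic unit):

    #include <errno.h>
    #include <string.h>

    struct object {
        unsigned long inline_version;   /* the 'inline_version' xattr */
        char data[64];
    };

    static int write_if_newer(struct object *o, unsigned long version,
                              const char *buf, size_t len)
    {
        if (o->inline_version >= version)  /* CMPXATTR: our data is stale */
            return -ECANCELED;
        if (len > sizeof(o->data))
            return -EFBIG;
        memcpy(o->data, buf, len);         /* WRITE */
        o->inline_version = version;       /* SETXATTR */
        return 0;
    }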
-
Committed by Yan, Zheng

We can't use getattr to fetch inline data while holding the Fr cap, because it can cause deadlock. If we need to sync-read inline data, drop the cap refs first, then use getattr to fetch the inline data.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng

We can't use getattr to fetch inline data after getting Fcr caps, because it can cause deadlock. The solution is to try bringing the inline data into the page cache while not holding any cap, and hope the inline data page is still there after getting the Fcr caps. If the page is still there, pin it in the page cache for later IO.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng

Add a new parameter 'locked_page' to ceph_do_getattr(). If it is non-NULL, inline data in the getattr reply will be copied to that page.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng

Request replies and cap messages can contain inline data. Add the inline data to the page cache if there is an Fc cap.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng

Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng

Allow specifying the position of the extent operation in a multi-operation OSD request. This is required for cephfs to convert inline data to normal data (compare xattr, then write object).

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
-
Committed by Ilya Dryomov

These were used to report git versions a long time ago.

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
-
Committed by Yan, Zheng

The current snapshot code does not properly handle moving an inode from one empty snap realm to another empty snap realm. After changing the inode's snap realm, some dirty pages' snap context can be unequal to the inode's i_head_snapc, which can trigger a BUG() in ceph_put_wrbuffer_cap_refs(). The fix is to introduce a global empty snap context for all empty snap realms. This avoids triggering the BUG() for filesystems with no snapshots.

Fixes: http://tracker.ceph.com/issues/9928
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
-