- 15 1月, 2016 7 次提交
-
-
由 Joseph Qi 提交于
Since iput will take care the NULL check itself, NULL check before calling it is redundant. So clean them up. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 jiangyiwen 提交于
Commit f3f85464 ("ocfs2_dlm: Ensure correct ordering of set/clear refmap bit on lockres") still exists a race which can't ensure the ordering is exactly correct. Node1 Node2 Node3 umount, migrate lockres to Node2 migrate finished, send migrate request to Node3 received migrate request, create a migration_mle, respond to Node2. set DLM_LOCK_RES_SETREF_INPROG and send assert master to Node3 delete migration_mle in assert_master_handler, Node3 umount without response dlm_thread purge this lockres, send drop deref message to Node2 found the flag of DLM_LOCK_RES_SETREF_INPROG is set, dispatch dlm_deref_lockres_worker to clear refmap, but in function of dlm_deref_lockres_worker, only if node in refmap it wait DLM_LOCK_RES_SETREF_INPROG to be cleared. So worker is done successfully purge lockres, send assert master response to Node1, and finish umount set Node3 in refmap, and it won't be cleared forever, thus lead to umount hung so wait until DLM_LOCK_RES_SETREF_INPROG is cleared in dlm_deref_lockres_worker. Signed-off-by: NYiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: NJoseph Qi <joseph.qi@huawei.com> Reviewed-by: NJunxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Julia Lawall 提交于
The ocfs2_extent_tree_operations structures are never modified, so declare them as const. Done with the help of Coccinelle. Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Xue jiufei 提交于
We found a race between purge and migration when doing code review. Node A put lockres to purgelist before receiving the migrate message from node B which is the master. Node A call dlm_mig_lockres_handler to handle this message. dlm_mig_lockres_handler dlm_lookup_lockres >>>>>> race window, dlm_run_purge_list may run and send deref message to master, waiting the response spin_lock(&res->spinlock); res->state |= DLM_LOCK_RES_MIGRATING; spin_unlock(&res->spinlock); dlm_mig_lockres_handler returns >>>>>> dlm_thread receives the response from master for the deref message and triggers the BUG because the lockres has the state DLM_LOCK_RES_MIGRATING with the following message: dlm_purge_lockres:209 ERROR: 6633EB681FA7474A9C280A4E1A836F0F: res M0000000000000000030c0300000000 in use after deref Signed-off-by: NJiufei Xue <xuejiufei@huawei.com> Reviewed-by: NJoseph Qi <joseph.qi@huawei.com> Reviewed-by: NYiwen Jiang <jiangyiwen@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Junxiao Bi 提交于
When run multiple xattr test of ocfs2-test on a three-nodes cluster, mount failed sometimes with the following message. o2hb: Unable to stabilize heartbeart on region D18B775E758D4D80837E8CF3D086AD4A (xvdb) Stabilize heartbeat depends on the timing order to mount ocfs2 from cluster nodes and how fast the tcp connections are established. So increase unsteady interations to leave more time for it. Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 John Haxby 提交于
Some versions of tar assume that files with st_blocks == 0 do not contain any data and will skip reading them entirely. See also commit 9206c561 ("ext4: return non-zero st_blocks for inline data"). Signed-off-by: NJohn Haxby <john.haxby@oracle.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Acked-by: NGang He <ghe@suse.com> Reviewed-by: NJunxiao Bi <junxiao.bi@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Norton.Zhu 提交于
In ocfs2_parse_options, a) it's better to declare variables(small size) outside of while loop; b) 'option' will be set by match_int, 'option = 0;' makes no sense, if match_int failed, it just goto bail and return. Signed-off-by: NNorton.Zhu <norton.zhu@huawei.com> Reviewed-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Gang He <ghe@suse.com> Cc: Mark Fasheh <mfasheh@suse.de> Acked-by: NJoel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 12月, 2015 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 30 12月, 2015 3 次提交
-
-
由 xuejiufei 提交于
We have found a BUG on res->migration_pending when migrating lock resources. The situation is as follows. dlm_mark_lockres_migration res->migration_pending = 1; __dlm_lockres_reserve_ast dlm_lockres_release_ast returns with res->migration_pending remains because other threads reserve asts wait dlm_migration_can_proceed returns 1 >>>>>>> o2hb found that target goes down and remove target from domain_map dlm_migration_can_proceed returns 1 dlm_mark_lockres_migrating returns -ESHOTDOWN with res->migration_pending still remains. When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON with res->migration_pending. So clear migration_pending when target is down. Signed-off-by: NJiufei Xue <xuejiufei@huawei.com> Reviewed-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Junxiao Bi 提交于
Commit 4f656367 ("Move locks API users to locks_lock_inode_wait()") move flock/posix lock indentify code to locks_lock_inode_wait(), but missed to set fl_flags to FL_FLOCK which caused the following kernel panic on 4.4.0_rc5. kernel BUG at fs/locks.c:1895! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2(O) ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront CPU: 0 PID: 20268 Comm: flock_unit_test Tainted: G O 4.4.0-rc5-next-20151217 #1 Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014 task: ffff88007b3672c0 ti: ffff880028b58000 task.ti: ffff880028b58000 RIP: locks_lock_inode_wait+0x2e/0x160 Call Trace: ocfs2_do_flock+0x91/0x160 [ocfs2] ocfs2_flock+0x76/0xd0 [ocfs2] SyS_flock+0x10f/0x1a0 entry_SYSCALL_64_fastpath+0x12/0x71 Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 48 81 ec 88 00 00 00 8b 46 40 83 e0 03 83 f8 01 0f 84 ad 00 00 00 83 f8 02 74 04 <0f> 0b eb fe 4c 8d ad 60 ff ff ff 4c 8d 7b 58 e8 0e 8e 73 00 4d RIP locks_lock_inode_wait+0x2e/0x160 RSP <ffff880028b5bce8> ---[ end trace dfca74ec9b5b274c ]--- Fixes: 4f656367 ("Move locks API users to locks_lock_inode_wait()") Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joseph Qi 提交于
When resizing, it firstly extends the last gd. Once it should backup super in the gd, it calculates new backup super and update the corresponding value. But it currently doesn't consider the situation that the backup super is already done. And in this case, it still sets the bit in gd bitmap and then decrease from bg_free_bits_count, which leads to a corrupted gd and trigger the BUG in ocfs2_block_group_set_bits: BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits); So check whether the backup super is done and then do the updates. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Reviewed-by: NJiufei Xue <xuejiufei@huawei.com> Reviewed-by: NYiwen Jiang <jiangyiwen@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 12月, 2015 1 次提交
-
-
由 Andreas Gruenbacher 提交于
The list operations of the ocfs2 xattr handlers were never called anywhere. Remove them and directly check in ocfs2_xattr_list_entry which attributes should be skipped over instead. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: ocfs2-devel@oss.oracle.com Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 13 12月, 2015 1 次提交
-
-
由 Junxiao Bi 提交于
Commit 8f1eb487 ("ocfs2: fix umask ignored issue") introduced an issue, SGID of sub dir was not inherited from its parents dir. It is because SGID is set into "inode->i_mode" in ocfs2_get_init_inode(), but is overwritten by "mode" which don't have SGID set later. Fixes: 8f1eb487 ("ocfs2: fix umask ignored issue") Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Acked-by: NSrinivas Eeda <srinivas.eeda@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 12月, 2015 2 次提交
-
-
由 Al Viro 提交于
new method: ->get_link(); replacement of ->follow_link(). The differences are: * inode and dentry are passed separately * might be called both in RCU and non-RCU mode; the former is indicated by passing it a NULL dentry. * when called that way it isn't allowed to block and should return ERR_PTR(-ECHILD) if it needs to be called in non-RCU mode. It's a flagday change - the old method is gone, all in-tree instances converted. Conversion isn't hard; said that, so far very few instances do not immediately bail out when called in RCU mode. That'll change in the next commits. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
kmap() in page_follow_link_light() needed to go - allowing to hold an arbitrary number of kmaps for long is a great way to deadlocking the system. new helper (inode_nohighmem(inode)) needs to be used for pagecache symlinks inodes; done for all in-tree cases. page_follow_link_light() instrumented to yell about anything missed. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 07 12月, 2015 1 次提交
-
-
由 Andreas Gruenbacher 提交于
Add an additional "name" field to struct xattr_handler. When the name is set, the handler matches attributes with exactly that name. When the prefix is set instead, the handler matches attributes with the given prefix and with a non-empty suffix. This patch should avoid bugs like the one fixed in commit c361016a in the future. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Reviewed-by: NJames Morris <james.l.morris@oracle.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 21 11月, 2015 1 次提交
-
-
由 Junxiao Bi 提交于
New created file's mode is not masked with umask, and this makes umask not work for ocfs2 volume. Fixes: 702e5bc6 ("ocfs2: use generic posix ACL infrastructure") Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 11月, 2015 1 次提交
-
-
由 Andreas Gruenbacher 提交于
The xattr_handler operations are currently all passed a file system specific flags value which the operations can use to disambiguate between different handlers; some file systems use that to distinguish the xattr namespace, for example. In some oprations, it would be useful to also have access to the handler prefix. To allow that, pass a pointer to the handler to operations instead of the flags value alone. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 06 11月, 2015 9 次提交
-
-
由 Joseph Qi 提交于
readahead_pages in ocfs2_duplicate_clusters_by_page is defined but not used, so clean it up. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joseph Qi 提交于
A node can mount multiple ocfs2 volumes. And if thread names are same for each volume/domain, it will bring inconvenience when analyzing problems because we have to identify which volume/domain the messages belong to. Since thread name will be printed to messages, so add volume uuid or dlm name to thread name can benefit problem analysis. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Gang He <ghe@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 alex chen 提交于
In ocfs2_mknod_locked if '__ocfs2_mknod_locke d' returns an error, we should reclaim the inode successfully claimed above, otherwise, the inode never be reused. The case is described below: ocfs2_mknod ocfs2_mknod_locked ocfs2_claim_new_inode Successfully claim the inode __ocfs2_mknod_locked ocfs2_journal_access_di Failed because of -ENOMEM or other reasons, the inode lockres has not been initialized yet. iput(inode) ocfs2_evict_inode ocfs2_delete_inode ocfs2_inode_lock ocfs2_inode_lock_full_nested __ocfs2_cluster_lock Return -EINVAL because of the inode lockres has not been initialized. So the following operations are not performed ocfs2_wipe_inode ocfs2_remove_inode ocfs2_free_dinode ocfs2_free_suballoc_bits Signed-off-by: NAlex Chen <alex.chen@huawei.com> Reviewed-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joseph Qi 提交于
There is a race case between mount and delete node/cluster, which will lead o2hb_thread to malfunctioning dead loop. o2hb_thread { o2nm_depend_this_node(); <<<<<< race window, node may have already been deleted, and then enter the loop, o2hb thread will be malfunctioning because of no configured nodes found. while (!kthread_should_stop() && !reg->hr_unclean_stop && !reg->hr_aborted_start) { } So check the return value of o2nm_depend_this_node() is needed. If node has been deleted, do not enter the loop and let mount fail. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joseph Qi 提交于
We have no need to take inode mutex, rw and inode lock if it is not dio entry when recover orphans. Optimize it by adding a flag OCFS2_INODE_DIO_ORPHAN_ENTRY to ocfs2_inode_info to reduce contention. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joseph Qi 提交于
dio entry will only do truncate in case of ORPHAN_NEED_TRUNCATE. So do not include it when doing normal orphan scan to reduce contention. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joseph Qi 提交于
Currently cluster allocation is always trying to find a victim chain (a chian has most space), and this may lead to poor performance because of discontiguous allocation in some scenarios. Our test case is block size 4k, cluster size 1M and mount option with localalloc=2048 (2G), since a gd is 32256M (about 31.5G) and a localalloc window is only 2G, creating 50G file will result in 2G from gd0, 2G from gd1, ... One way to improve performance is enlarge localalloc window size (max 31104M), but this will make end user feel that about 30G is suddenly "missing", and localalloc currently do not support steal, which means one node cannot use another node's localalloc even it is not used in fact. So using the last gd to record the allocation and continues with the gd if it has enough space for a localalloc window can make the allocation as more contiguous as possible. Our test result is below (evaluated in IOPS), which is using iometer running in VM, dynamic vhd virtual disk stored in ocfs2. IO model Original After Improved(%) 16K60%Write100%Random 703 876 24.59% 8K90%Write100%Random 735 827 12.59% 4K100%Write100%Random 859 915 6.52% 4K100%Read100%Random 2092 2600 24.30% Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Tested-by: NNorton Zhu <norton.zhu@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 jiangyiwen 提交于
A simplified test case is (this case from Ryan): 1) dd if=/dev/zero of=/mnt/hello bs=512 count=1 oflag=direct; 2) truncate /mnt/hello -s 2097152 file 'hello' is not exist before test. After this command, file 'hello' should be all zero. But 512~4096 is some random data. Setting bh state to new when get a new block, if so, direct_io_worker()->dio_zero_block() will fill-in the unused portion of the block with zero. Signed-off-by: NYiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Norton.Zhu 提交于
If ocfs2_is_overwrite failed, ocfs2_direct_IO_write mays till return success to the caller. Signed-off-by: NNorton.Zhu <norton.zhu@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 10月, 2015 2 次提交
-
-
由 Joseph Qi 提交于
dlm_lockres_put will call dlm_lockres_release if it is the last reference, and then it may call dlm_print_one_lock_resource and take lockres spinlock. So unlock lockres spinlock before dlm_lockres_put to avoid deadlock. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Benjamin Coddington 提交于
Instead of having users check for FL_POSIX or FL_FLOCK to call the correct locks API function, use the check within locks_lock_inode_wait(). This allows for some later cleanup. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
-
- 14 10月, 2015 2 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
由 Christoph Hellwig 提交于
The test and separate set bit scheme was racy to start with, so move to do a test_and_set_bit after doing the earlier error checks inside the actual store methods. Also remove the locking for the local attribute which already has a different scheme to synchronize. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
- 23 9月, 2015 1 次提交
-
-
由 Joseph Qi 提交于
The order of the following three spinlocks should be: dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock But dlm_dispatch_assert_master() is called while holding dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls dlm_grab() which will take dlm_domain_lock. Once another thread (for example, dlm_query_join_handler) has already taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock happens. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: "Junxiao Bi" <junxiao.bi@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 9月, 2015 1 次提交
-
-
由 Andrew Morton 提交于
Revert commit f83c7b5e ("ocfs2/dlm: use list_for_each_entry instead of list_for_each"). list_for_each_entry() will dereference its `pos' argument, which can be NULL in dlm_process_recovery_data(). Reported-by: NJulia Lawall <julia.lawall@lip6.fr> Reported-by: NFengguang Wu <fengguang.wu@gmail.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 9月, 2015 7 次提交
-
-
由 Kees Cook 提交于
Many file systems that implement the show_options hook fail to correctly escape their output which could lead to unescaped characters (e.g. new lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This could lead to confusion, spoofed entries (resulting in things like systemd issuing false d-bus "mount" notifications), and who knows what else. This looks like it would only be the root user stepping on themselves, but it's possible weird things could happen in containers or in other situations with delegated mount privileges. Here's an example using overlay with setuid fusermount trusting the contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use of "sudo" is something more sneaky: $ BASE="ovl" $ MNT="$BASE/mnt" $ LOW="$BASE/lower" $ UP="$BASE/upper" $ WORK="$BASE/work/ 0 0 none /proc fuse.pwn user_id=1000" $ mkdir -p "$LOW" "$UP" "$WORK" $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt $ cat /proc/mounts none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0 none /proc fuse.pwn user_id=1000 0 0 $ fusermount -u /proc $ cat /proc/mounts cat: /proc/mounts: No such file or directory This fixes the problem by adding new seq_show_option and seq_show_option_n helpers, and updating the vulnerable show_option handlers to use them as needed. Some, like SELinux, need to be open coded due to unusual existing escape mechanisms. [akpm@linux-foundation.org: add lost chunk, per Kees] [keescook@chromium.org: seq_show_option should be using const parameters] Signed-off-by: NKees Cook <keescook@chromium.org> Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Acked-by: NJan Kara <jack@suse.com> Acked-by: NPaul Moore <paul@paul-moore.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: NKees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joseph Qi 提交于
NULL check before kfree is redundant and so clean them up. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joe Perches 提交于
These uses sometimes do and sometimes don't have '\n' terminations. Make the uses consistently use '\n' terminations and remove the newline from the functions. Miscellanea: o Coalesce formats o Realign arguments Signed-off-by: NJoe Perches <joe@perches.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Xue jiufei 提交于
While appending an extent to a file, it will call these functions: ocfs2_insert_extent -> call ocfs2_grow_tree() if there's no free rec -> ocfs2_add_branch add a new branch to extent tree, now rec[0] in the leaf of rightmost path is empty -> ocfs2_do_insert_extent -> ocfs2_rotate_tree_right -> ocfs2_extend_rotate_transaction -> jbd2_journal_restart if jbd2_journal_extend fail -> ocfs2_insert_path -> ocfs2_extend_trans -> jbd2_journal_restart if jbd2_journal_extend fail -> ocfs2_insert_at_leaf -> ocfs2_et_update_clusters Function jbd2_journal_restart() may be called and it may happened that buffers dirtied in ocfs2_add_branch() are committed while buffers dirtied in ocfs2_insert_at_leaf() and ocfs2_et_update_clusters() are not. So an empty rec[0] is left in rightmost path which will cause read-only filesystem when call ocfs2_commit_truncate() with the error message: "Inode %lu has an empty extent record". This is not a serious problem, so remove the rightmost path when call ocfs2_commit_truncate(). Signed-off-by: Njoyce.xue <xuejiufei@huawei.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 yangwenfang 提交于
1: After we call ocfs2_journal_access_di() in ocfs2_write_begin(), jbd2_journal_restart() may also be called, in this function transaction A's t_updates-- and obtains a new transaction B. If jbd2_journal_commit_transaction() is happened to commit transaction A, when t_updates==0, it will continue to complete commit and unfile buffer. So when jbd2_journal_dirty_metadata(), the handle is pointed a new transaction B, and the buffer head's journal head is already freed, jh->b_transaction == NULL, jh->b_next_transaction == NULL, it returns EINVAL, So it triggers the BUG_ON(status). thread 1 jbd2 ocfs2_write_begin jbd2_journal_commit_transaction ocfs2_write_begin_nolock ocfs2_start_trans jbd2__journal_start(t_updates+1, transaction A) ocfs2_journal_access_di ocfs2_write_cluster_by_desc ocfs2_mark_extent_written ocfs2_change_extent_flag ocfs2_split_extent ocfs2_extend_rotate_transaction jbd2_journal_restart (t_updates-1,transaction B) t_updates==0 __jbd2_journal_refile_buffer (jh->b_transaction = NULL) ocfs2_write_end ocfs2_write_end_nolock ocfs2_journal_dirty jbd2_journal_dirty_metadata(bug) ocfs2_commit_trans 2. In ext4, I found that: jbd2_journal_get_write_access() called by ext4_write_end. ext4_write_begin ext4_journal_start __ext4_journal_start_sb ext4_journal_check_start jbd2__journal_start ext4_write_end ext4_mark_inode_dirty ext4_reserve_inode_write ext4_journal_get_write_access jbd2_journal_get_write_access ext4_mark_iloc_dirty ext4_do_update_inode ext4_handle_dirty_metadata jbd2_journal_dirty_metadata 3. So I think we should put ocfs2_journal_access_di before ocfs2_journal_dirty in the ocfs2_write_end. and it works well after my modification. Signed-off-by: Nvicky <vicky.yangwenfang@huawei.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Zhangguanghui <zhang.guanghui@h3c.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tina Ruchandani 提交于
o2hb_elapsed_msecs computes the time taken for a disk heartbeat. 'struct timeval' variables are used to store start and end times. On 32-bit systems, the 'tv_sec' component of 'struct timeval' will overflow in year 2038 and beyond. This patch solves the overflow with the following: 1. Replace o2hb_elapsed_msecs using 'ktime_t' values to measure start and end time, and built-in function 'ktime_ms_delta' to compute the elapsed time. ktime_get_real() is used since the code prints out the wallclock time. 2. Changes format string to print time as a single 64-bit nanoseconds value ("%lld") instead of seconds and microseconds. This simplifies the code since converting ktime_t to that format would need expensive computation. However, the debug log string is less readable than the previous format. Signed-off-by: NTina Ruchandani <ruchandani.tina@gmail.com> Suggested by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joseph Qi 提交于
There is a race case between crashed dio and rm, which will lead to OCFS2_VALID_FL not set read-only. N1 N2 ------------------------------------------------------------------------ dd with direct flag rm file crashed with an dio entry left in orphan dir clear OCFS2_VALID_FL in ocfs2_remove_inode recover N1 and read the corrupted inode, and set filesystem read-only So we skip the inode deletion this time and wait for dio entry recovered first. Signed-off-by: NJoseph Qi <joseph.qi@huawei.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-