- 15 Jan, 2016 40 commits
-
-
Committed by Yaowei Bai
Hardcoding the index into the zonelists array in gfp_zonelist() is not a good idea; enumerate it to improve readability. No functional change.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix CONFIG_NUMA=n build]
[n-horiguchi@ah.jp.nec.com: fix warning in comparing enumerator]
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
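The idea of replacing bare array indices with enumerators can be sketched in userspace C. This is a minimal illustration, not the kernel code: the enumerator names and the simplified selector `pick_zonelist()` are assumptions made for the example.

```c
#include <stdbool.h>

/* Assumed names for illustration: named indices replace the bare 0 and 1. */
enum zonelist_index {
    ZONELIST_FALLBACK,    /* list that may fall back to other nodes */
    ZONELIST_NOFALLBACK,  /* list restricted to the requested node */
    MAX_ZONELISTS         /* number of lists, usable as an array size */
};

/* Hypothetical stand-in for gfp_zonelist(): choose an index from a flag. */
static int pick_zonelist(bool this_node_only)
{
    return this_node_only ? ZONELIST_NOFALLBACK : ZONELIST_FALLBACK;
}
```

A caller then indexes with `zonelists[pick_zonelist(flag)]`, and the array can be declared with `MAX_ZONELISTS` elements instead of a magic number.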
-
Committed by Yaowei Bai
Since commit a0b8cab3 ("mm: remove lru parameter from __pagevec_lru_add and remove parts of pagevec API") there is no user of this function anymore, so remove it.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Yaowei Bai
Make memblock_is_memory() and memblock_is_reserved() return bool to improve readability, because these functions only ever return one or zero. No functional change.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Yaowei Bai
Make is_file_hugepages() return bool to improve readability, because this function only ever returns one or zero. This patch also removes the if condition so that is_file_hugepages() returns the comparison result directly. No functional change.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
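A before/after sketch of this kind of cleanup, written as standalone C. The struct types and the two operation tables are hypothetical stand-ins, not the kernel's `struct file` or hugetlbfs definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's file and file_operations types. */
struct demo_ops { int id; };
struct demo_file { const struct demo_ops *f_op; };

static const struct demo_ops hugetlb_ops = { 1 };
static const struct demo_ops other_ops = { 2 };

/* Before: int return value with an explicit branch. */
static int is_hugepage_file_old(const struct demo_file *f)
{
    if (f->f_op == &hugetlb_ops)
        return 1;
    return 0;
}

/* After: bool return value, with the comparison returned directly. */
static bool is_hugepage_file(const struct demo_file *f)
{
    return f->f_op == &hugetlb_ops;
}
```

Both versions give the same answer for every input, which is why such a change can be described as having no functional effect.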
-
Committed by yalin wang
Move the node_id, zone_idx and shrink flags calculations into the trace function, so that we don't need to calculate these arguments when the trace is disabled. This also gives the function fewer arguments.
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joonsoo Kim
We now have a tracepoint in test_pages_isolated() that reports the pfn which cannot be isolated. But in alloc_contig_range(), some error paths don't call test_pages_isolated(), so it is still hard to know the exact pfn that causes an allocation failure. This patch changes the situation by calling test_pages_isolated() in almost every error path. In the allocation failure case this adds some overhead, but allocation failure is a really rare event, so it should not matter.
In the fatal-signal-pending case, we don't call test_pages_isolated() because this failure is intentional.
There was a bogus outer_start problem due to an unchecked buddy order, and this patch also fixes it. Before this patch it didn't matter, because the end result was the same. But after this patch the tracepoint will report the failed pfn, so it should be accurate.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joonsoo Kim
CMA allocation should be guaranteed to succeed, but sometimes it can fail in the current implementation. To track down the problem, we need to know which page is problematic, and this new tracepoint will report it.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joonsoo Kim
This is a preparation step for reporting the pfn that failed the test in a new tracepoint, in order to analyze CMA allocation failures. There is no functional change in this patch.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Nathan Zimmer
When running the SPECint_rate gcc benchmark on some very large boxes, it was noticed that the system was spending lots of time in mpol_shared_policy_lookup(). The gamess benchmark can also show it, and is what I mostly used to chase down the issue since I found its setup to be easier.
To be clear, the binaries were on tmpfs because of disk I/O requirements. We then used text replication to avoid icache misses and having all the copies banging on the memory where the instruction code resides. This results in us hitting a bottleneck in mpol_shared_policy_lookup(), since lookup is serialised by the shared_policy lock.
I have only reproduced this on very large (3k+ cores) boxes. The problem starts showing up at just a few hundred ranks, getting worse until it threatens to livelock once it gets large enough. For example, on the gamess benchmark at 128 ranks this area consumes only ~1% of time, at 512 ranks it consumes nearly 13%, and at 2k ranks it is over 90%.
To alleviate the contention in this area, I converted the spinlock to an rwlock. This allows a large number of lookups to happen simultaneously. The results were quite good, reducing this consumption at max ranks to around 2%.
[akpm@linux-foundation.org: tidy up code comments]
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Nadia Yvette Chambers <nyc@holomorphy.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Chen Gang
__phys_to_pfn and __pfn_to_phys are symmetric, and PHYS_PFN and PFN_PHYS are symmetric:
  - y = (phys_addr_t)x << PAGE_SHIFT
  - y >> PAGE_SHIFT = (phys_addr_t)x
  - (unsigned long)(y >> PAGE_SHIFT) = x
[akpm@linux-foundation.org: use macro arg name `x']
[arnd@arndb.de: include linux/pfn.h for PHYS_PFN definition]
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
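The symmetry stated above can be checked with a small standalone sketch. The macro bodies follow the identities in the message; the 4 KiB page size (`PAGE_SHIFT` of 12) and the 64-bit `phys_addr_t` typedef are assumptions for this example, since both vary by architecture and configuration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12            /* assumption: 4 KiB pages */
typedef uint64_t phys_addr_t;    /* assumption: 64-bit physical addresses */

/* Page frame number -> physical address of the start of that page. */
#define PFN_PHYS(x) ((phys_addr_t)(x) << PAGE_SHIFT)

/* Physical address -> page frame number (the offset within the page is dropped). */
#define PHYS_PFN(x) ((unsigned long)((x) >> PAGE_SHIFT))
```

With these definitions `PHYS_PFN(PFN_PHYS(x))` gives back `x`, while `PFN_PHYS(PHYS_PFN(y))` rounds `y` down to a page boundary, so the round trip is exact only in the pfn-to-address-to-pfn direction.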
-
Committed by yalin wang
Move trace_reclaim_flags() into the trace function, so that we don't need to calculate these flags when the trace is disabled.
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Chen Gang
Simplify may_expand_vm().
[akpm@linux-foundation.org: further simplification, per Naoya Horiguchi]
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Alexey Klimov
The page pointer, initialized to NULL, is reinitialized by follow_page_mask() before it is used. Drop the useless initialization of the page pointer at the beginning of the loop.
Signed-off-by: Alexey Klimov <klimov.linux@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Vladimir Davydov
Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below:
  - threadinfo
  - task_struct
  - task_delay_info
  - pid
  - cred
  - mm_struct
  - vm_area_struct and vm_region (nommu)
  - anon_vma and anon_vma_chain
  - signal_struct
  - sighand_struct
  - fs_struct
  - files_struct
  - fdtable and fdtable->full_fds_bits
  - dentry and external_name
  - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method.
The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to the "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not, in fact, account everything).
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Vladimir Davydov
Make the vmalloc family of functions allocate vmalloc area pages with alloc_kmem_pages, so that if __GFP_ACCOUNT is set they will be accounted to memcg. This is needed, at least, to account alloc_fdmem allocations.
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Vladimir Davydov
Currently, if we want to account all objects of a particular kmem cache, we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is inconvenient. This patch introduces the SLAB_ACCOUNT flag which, if passed to kmem_cache_create, forces accounting for every allocation from this cache even if __GFP_ACCOUNT is not passed.
This patch does not make any of the existing caches use this flag; that will be done later in the series.
Note that a cache with SLAB_ACCOUNT cannot be merged with a cache without SLAB_ACCOUNT, because merged caches share the same kmem_cache struct and hence cannot have different sets of SLAB_* flags. Thus using this flag will probably reduce the number of merged slabs even if kmem accounting is not used (only compiled in).
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Suggested-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
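The interplay between a per-call opt-in flag and a per-cache flag can be modelled in a few lines of standalone C. Everything here is hypothetical: the `DEMO_*` bit values, `struct demo_cache`, and the two helpers only mimic the rules described in the message and are not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
#define DEMO_GFP_ACCOUNT   0x1u   /* per-call opt-in */
#define DEMO_SLAB_ACCOUNT  0x1u   /* per-cache opt-in, set at creation time */

struct demo_cache {
    unsigned int flags;           /* DEMO_SLAB_* flags fixed at creation */
};

/* Account when either the cache or the individual call opted in. */
static bool demo_should_account(const struct demo_cache *c, unsigned int gfp)
{
    return (c->flags & DEMO_SLAB_ACCOUNT) || (gfp & DEMO_GFP_ACCOUNT);
}

/* Two caches may share one structure only if their accounting flags agree. */
static bool demo_can_merge(const struct demo_cache *a, const struct demo_cache *b)
{
    return (a->flags & DEMO_SLAB_ACCOUNT) == (b->flags & DEMO_SLAB_ACCOUNT);
}
```

The second helper reflects the merging note above: because merged caches share one set of flags, an accounted cache and an unaccounted one have to stay separate.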
-
Committed by Vladimir Davydov
The black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be fragile and difficult to maintain, because there seem to be many more allocations that should not be accounted than those that should be. Besides, falsely accounting an allocation might result in much worse consequences than not accounting at all, namely increased memory consumption due to pinned dead kmem caches.
So this patch switches kmem accounting to the white-list policy: now only those kmem allocations that are marked as __GFP_ACCOUNT are accounted to memcg. Currently, no kmem allocations are marked like this. The following patches will mark several kmem allocations that are known to be easily triggered from userspace and therefore should be accounted to memcg.
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Vladimir Davydov
This reverts commit 8f4fc071 ("gfp: add __GFP_NOACCOUNT").
The black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be fragile and difficult to maintain, because there seem to be many more allocations that should not be accounted than those that should be. Besides, falsely accounting an allocation might result in much worse consequences than not accounting at all, namely increased memory consumption due to pinned dead kmem caches.
So it was decided to switch to the white-list policy. This patch reverts the bits introducing the black-list policy. The white-list policy will be introduced later in the series.
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Vladimir Davydov
Currently, all kmem allocations (namely every kmem_cache_alloc, kmalloc, alloc_kmem_pages call) are accounted to the memory cgroup automatically. Callers have to explicitly opt out if they don't want or need accounting for some reason. Such a design decision leads to several problems:
  - kmalloc users are highly sensitive to failures; many of them implicitly rely on the fact that kmalloc never fails, while memcg makes failures quite plausible.
  - A lot of objects are shared among different containers by design. Accounting such objects to one of the containers is just unfair. Moreover, it might lead to pinning a dead memcg along with its kmem caches, which aren't tiny, which might result in a noticeable increase in memory consumption for no apparent reason in the long run.
  - There are tons of short-lived objects. Accounting them to memcg will only result in slight noise and won't change the overall picture, but we still have to pay the accounting overhead.
For more info, see
  - http://lkml.kernel.org/r/20151105144002.GB15111%40dhcp22.suse.cz
  - http://lkml.kernel.org/r/20151106090555.GK29259@esperanza
Therefore this patchset switches to the white-list policy. Now kmalloc users have to explicitly opt in by passing the __GFP_ACCOUNT flag.
Currently, the list of accounted objects is quite limited and only includes those allocations that (1) are known to be easily triggered from userspace and (2) can fail gracefully (for the full list see patch no. 6), and it still misses many object types. However, accounting only those objects should be a satisfactory approximation of the behavior we used to have for most sane workloads.
This patch (of 6):
Revert 499611ed ("kernfs: do not account ino_ida allocations to memcg").
The black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be fragile and difficult to maintain, because there seem to be many more allocations that should not be accounted than those that should be. Besides, falsely accounting an allocation might result in much worse consequences than not accounting at all, namely increased memory consumption due to pinned dead kmem caches.
So it was decided to switch to the white-list policy. This patch reverts the bits introducing the black-list policy. The white-list policy will be introduced later in the series.
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Geliang Tang
Add a new helper function get_first_slab() that gets the first slab from a kmem_cache_node.
Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Geliang Tang
Simplify the code with list_for_each_entry().
Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Geliang Tang
Simplify the code with list_first_entry_or_null().
Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
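For readers unfamiliar with the helper, here is a userspace re-creation of the pattern. It is a sketch modelled on the kernel's intrusive list idiom, not a copy of `include/linux/list.h`; the `demo_item` type and the simplified helpers are assumptions for illustration.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel idiom. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* First entry of the list, or NULL when the list is empty. */
#define list_first_entry_or_null(head, type, member) \
    ((head)->next != (head) ? container_of((head)->next, type, member) : NULL)

/* Hypothetical payload type embedding a list node. */
struct demo_item {
    int value;
    struct list_head node;
};
```

The helper folds the usual "if the list is empty return NULL, otherwise take the first entry" sequence into a single expression, which is the simplification the commit refers to.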
-
Committed by Andrew Morton
A little cleanup: the invocation site provides the semicolon.
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joseph Qi
The lksb flags are defined in both dlmapi.h and dlmcommon.h, so remove the duplicates from dlmcommon.h.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Junxiao Bi
Found during patch review; remove it to make the code clearer and save a little CPU time.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joseph Qi
Currently, ocfs2_orphan_del finds and deletes the entry first, and then accesses the orphan dir dinode. This causes a problem if ocfs2_journal_access_di fails: the entry will have been removed from the orphan dir, but the inode has not actually been deleted. In other words, the file is missing but not actually deleted. So we should access the orphan dinode first, as unlink and rename do.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by xuejiufei
When two processes are migrating the same lockres, dlm_add_migration_mle() returns -EEXIST but still inserts a new mle into the hash list. dlm_migrate_lockres() will detach the old mle and free the new one, which is already in the hash list, and that will corrupt the list.
Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by xuejiufei
We have found that the migration source will trigger a BUG, because the refcount of the mle is already zero before the put, when the target goes down during migration. The situation is as follows:
dlm_migrate_lockres
  dlm_add_migration_mle
  dlm_mark_lockres_migrating
  dlm_get_mle_inuse
  <<<<<< Now the refcount of the mle is 2.
  dlm_send_one_lockres and wait for the target to become the new master.
  <<<<<< o2hb detects that the target is down and cleans the migration mle. Now the refcount is 1.
dlm_migrate_lockres is woken and puts the mle twice when it finds that the target has gone down, which triggers the BUG with the following message:
"ERROR: bad mle: ".
Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Goldwyn Rodrigues
DLM does not cache locks. So, blocking lock and unlock will only make performance worse where contention over the locks is high.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by jiangyiwen
The following case will lead to a slot being overwritten (steps in time order, on nodes N1 and N2):
  N1: mount ocfs2 volume, find and allocate slot 0, then set osb->slot_num to 0, begin to write slot info to disk
  N2: mount ocfs2 volume, wait for super lock
  N1: write block fails because the storage link is down, unlock super lock
  N2: got super lock and also allocate slot 0, then unlock super lock
  N1: mount fails and then dismounts; since osb->slot_num is 0, try to put invalid slot to disk. And it will succeed if the storage link is restored.
  N2: slot info is now overwritten
Once another node, say N3, mounts, it will find and allocate slot 0 again, which will lead to a mount hang because the journal has already been locked by N2. So when writing slot info fails, invalidate the slot in advance to avoid overwriting the slot.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Xue jiufei
dlm_grab() may return NULL when the node is unmounting. During code review, we found that some dlm handlers may return an error to the caller when dlm_grab() returns NULL, causing the caller to BUG or hit other problems. Here is an example (steps in time order):
  Node 1: receives migration message from node 3, and sends migrate request to others
  Node 2: starts unmounting
  Node 2: receives migrate request from node 1 and calls dlm_migrate_request_handler()
  Node 2: unmount thread unregisters domain handlers and removes dlm_context from dlm_domains
  Node 2: dlm_migrate_request_handlers() returns -EINVAL to node 1
  Node 1: exits migration, neither clearing the migration state nor sending the assert master message to node 3, which causes node 3 to hang
Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joseph Qi
Since iput() takes care of the NULL check itself, a NULL check before calling it is redundant. So clean them up.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
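The cleanup relies on a common convention: a release function that tolerates NULL, much like `free()` does. A minimal standalone sketch, in which `demo_inode` and `demo_iput()` are hypothetical stand-ins rather than the kernel's `struct inode` and `iput()`:

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted object. */
struct demo_inode {
    int refcount;
};

/* Drop one reference; a NULL argument is accepted and ignored. */
static void demo_iput(struct demo_inode *inode)
{
    if (!inode)
        return;
    inode->refcount--;
}
```

Because the check lives inside the helper, a caller can write `demo_iput(ptr);` unconditionally instead of `if (ptr) demo_iput(ptr);`.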
-
Committed by jiangyiwen
Commit f3f85464 ("ocfs2_dlm: Ensure correct ordering of set/clear refmap bit on lockres") still leaves a race in which the ordering is not guaranteed to be correct. The timeline below was originally laid out in three columns (Node1, Node2, Node3); the column layout has been lost:
Node1 Node2 Node3 umount, migrate lockres to Node2 migrate finished, send migrate request to Node3 received migrate request, create a migration_mle, respond to Node2. set DLM_LOCK_RES_SETREF_INPROG and send assert master to Node3 delete migration_mle in assert_master_handler, Node3 umount without response dlm_thread purge this lockres, send drop deref message to Node2 found the flag of DLM_LOCK_RES_SETREF_INPROG is set, dispatch dlm_deref_lockres_worker to clear refmap, but in function of dlm_deref_lockres_worker, only if node in refmap it wait DLM_LOCK_RES_SETREF_INPROG to be cleared. So worker is done successfully purge lockres, send assert master response to Node1, and finish umount set Node3 in refmap, and it won't be cleared forever, thus lead to umount hung
So wait until DLM_LOCK_RES_SETREF_INPROG is cleared in dlm_deref_lockres_worker.
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Julia Lawall
The ocfs2_extent_tree_operations structures are never modified, so declare them as const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Xue jiufei
We found a race between purge and migration during code review. Node A puts the lockres on the purge list before receiving the migrate message from node B, which is the master. Node A calls dlm_mig_lockres_handler to handle this message.
dlm_mig_lockres_handler
  dlm_lookup_lockres
  >>>>>> race window: dlm_run_purge_list may run and send a deref message to the master, waiting for the response
  spin_lock(&res->spinlock);
  res->state |= DLM_LOCK_RES_MIGRATING;
  spin_unlock(&res->spinlock);
  dlm_mig_lockres_handler returns
  >>>>>> dlm_thread receives the response from the master for the deref message and triggers the BUG, because the lockres has the state DLM_LOCK_RES_MIGRATING, with the following message:
dlm_purge_lockres:209 ERROR: 6633EB681FA7474A9C280A4E1A836F0F: res M0000000000000000030c0300000000 in use after deref
Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Junxiao Bi
When running multiple xattr tests of ocfs2-test on a three-node cluster, mount sometimes failed with the following message:
o2hb: Unable to stabilize heartbeart on region D18B775E758D4D80837E8CF3D086AD4A (xvdb)
Stabilizing the heartbeat depends on the order in which the cluster nodes mount ocfs2 and on how fast the TCP connections are established. So increase the number of unsteady iterations to leave more time for it.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by John Haxby
Some versions of tar assume that files with st_blocks == 0 do not contain any data and will skip reading them entirely. See also commit 9206c561 ("ext4: return non-zero st_blocks for inline data").
Signed-off-by: John Haxby <john.haxby@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Gang He <ghe@suse.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Norton.Zhu
In ocfs2_parse_options:
  a) it is better to declare (small) variables outside of the while loop;
  b) 'option' will be set by match_int, so 'option = 0;' makes no sense; if match_int fails, the code just goes to bail and returns.
Signed-off-by: Norton.Zhu <norton.zhu@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Arnd Bergmann
Fix build errors that happen when CONFIG_LOGFS=y and CONFIG_MTD=m:
  fs/built-in.o: In function `logfs_mount':
  super.c:(.text+0x92a6f): undefined reference to `logfs_get_sb_mtd'
  fs/built-in.o: In function `logfs_get_sb_bdev':
  (.text+0x93530): undefined reference to `logfs_get_sb_mtd'
This patch avoids the error by changing the dependencies of logfs so that logfs can no longer be configured as built-in when the MTD core is a loadable module, while still requiring at least one of MTD or BLOCK to be enabled.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Peter Chen <peter.chen@freescale.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Javier Martinez Canillas
Commit ac551828 ("modpost: i2c aliases need no trailing wildcard") removed the wildcard at the end of the I2C module aliases, because I2C devices have no IDs and so the aliases are just arbitrary device names.
This is also true for OF modaliases, since a compatible string is used to define a specific IP hardware block. So the modalias should match a specific compatible string, and not attempt to match a compatible string whose name merely begins with another one.
For example, the following driver module:
  $ modinfo cros_ec_keyb | grep alias
  alias: platform:cros-ec-keyb
  alias: of:N*T*Cgoogle,cros-ec-keyb*
will be tried for loading for an alias of:N*T*Cgoogle,cros-ec-keyb-v2, but there could be a different driver that supports the device for that compatible string, so it is better to remove the trailing wildcard for OF.
Also, remove the word "always" from the add_wildcard() function comment, since that was carried over from the time when a wildcard was always added at the end of the module alias for all devices.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Suggested-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
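The over-matching described in this message can be reproduced with shell-style globbing via POSIX `fnmatch()`. This is a hedged sketch: it assumes alias patterns are compared to device modalias strings with glob semantics, and the helper `alias_matches()` and the device strings used with it are made up for the example.

```c
#include <fnmatch.h>
#include <stdbool.h>

/*
 * Return true when a module alias pattern matches a device modalias
 * string under shell-style glob rules (assumed matching semantics).
 */
static bool alias_matches(const char *alias_pattern, const char *modalias)
{
    return fnmatch(alias_pattern, modalias, 0) == 0;
}
```

With the trailing wildcard, the pattern `of:N*T*Cgoogle,cros-ec-keyb*` matches a hypothetical device string ending in `Cgoogle,cros-ec-keyb-v2`; without it, only strings ending exactly in `Cgoogle,cros-ec-keyb` match, which is the stricter behavior the patch aims for.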
-