- 11 7月, 2013 3 次提交
-
-
由 Michel Lespinasse 提交于
Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: NMichel Lespinasse <walken@google.com> Acked-by: NRik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Seth Jennings 提交于
zswap is a thin backend for frontswap that takes pages that are in the process of being swapped out and attempts to compress them and store them in a RAM-based memory pool. This can result in a significant I/O reduction on the swap device and, in the case where decompressing from RAM is faster than reading from the swap device, can also improve workload performance. It also has support for evicting swap pages that are currently compressed in zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a true cache in that, once the cache is full, the oldest pages can be moved out of zswap to the swap device so newer pages can be compressed and stored in zswap. This patch adds the zswap driver to mm/ Signed-off-by: NSeth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: NRik van Riel <riel@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Jenifer Hopper <jhopper@us.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Hansen <dave@sr71.net> Cc: Joe Perches <joe@perches.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Hugh Dickens <hughd@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Seth Jennings 提交于
zbud is an special purpose allocator for storing compressed pages. It is designed to store up to two compressed pages per physical page. While this design limits storage density, it has simple and deterministic reclaim properties that make it preferable to a higher density approach when reclaim will be used. zbud works by storing compressed pages, or "zpages", together in pairs in a single memory page called a "zbud page". The first buddy is "left justifed" at the beginning of the zbud page, and the last buddy is "right justified" at the end of the zbud page. The benefit is that if either buddy is freed, the freed buddy space, coalesced with whatever slack space that existed between the buddies, results in the largest possible free region within the zbud page. zbud also provides an attractive lower bound on density. The ratio of zpages to zbud pages can not be less than 1. This ensures that zbud can never "do harm" by using more pages to store zpages than the uncompressed zpages would have used on their own. This implementation is a rewrite of the zbud allocator internally used by zcache in the driver/staging tree. The rewrite was necessary to remove some of the zcache specific elements that were ingrained throughout and provide a generic allocation interface that can later be used by zsmalloc and others. This patch adds zbud to mm/ for later use by zswap. Signed-off-by: NSeth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: NRik van Riel <riel@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Jenifer Hopper <jhopper@us.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Hansen <dave@sr71.net> Cc: Joe Perches <joe@perches.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Hugh Dickens <hughd@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Bob Liu <bob.liu@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 7月, 2013 36 次提交
-
-
由 Andrew Morton 提交于
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Toshi Kani 提交于
online_pages() is called from memory_block_action() when a user requests to online a memory block via sysfs. This function needs to return a proper error value in case of error. Signed-off-by: NToshi Kani <toshi.kani@hp.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
min_free_kbytes is updated during memory hotplug (by init_per_zone_wmark_min) currently which is right thing to do in most cases but this could be unexpected if admin increased the value to prevent from allocation failures and the new min_free_kbytes would be decreased as a result of memory hotadd. This patch saves the user defined value and allows updating min_free_kbytes only if it is higher than the saved one. A warning is printed when the new value is ignored. Signed-off-by: NMichal Hocko <mhocko@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Acked-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
Now memcg has the same life cycle with its corresponding cgroup, and a cgroup is freed via RCU and then mem_cgroup_css_free() will be called in a work function, so we can simply call __mem_cgroup_free() in mem_cgroup_css_free(). This actually reverts commit 59927fb9 ("memcg: free mem_cgroup by RCU to fix oops"). Signed-off-by: NLi Zefan <lizefan@huawei.com> Cc: Hugh Dickins <hughd@google.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
Now memcg has the same life cycle as its corresponding cgroup. Kill the useless refcnt. Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
The cgroup core guarantees it's always safe to access the parent. Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
Use css_get/put instead of mem_cgroup_get/put. A simple replacement will do. The historical reason that memcg has its own refcnt instead of always using css_get/put, is that cgroup couldn't be removed if there're still css refs, so css refs can't be used as long-lived reference. The situation has changed so that rmdir a cgroup will succeed regardless css refs, but won't be freed until css refs goes down to 0. Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
Use css_get/put instead of mem_cgroup_get/put. We can't do a simple replacement, because here mem_cgroup_put() is called during mem_cgroup_css_free(), while mem_cgroup_css_free() won't be called until css refcnt goes down to 0. Instead we increment css refcnt in mem_cgroup_css_offline(), and then check if there's still kmem charges. If not, css refcnt will be decremented immediately, otherwise the refcnt will be released after the last kmem allocation is uncahred. [akpm@linux-foundation.org: tweak comment] Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: NTejun Heo <tj@kernel.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
Use css_get()/css_put() instead of mem_cgroup_get()/mem_cgroup_put(). There are two things being done in the current code: First, we acquired a css_ref to make sure that the underlying cgroup would not go away. That is a short lived reference, and it is put as soon as the cache is created. At this point, we acquire a long-lived per-cache memcg reference count to guarantee that the memcg will still be alive. so it is: enqueue: css_get create : memcg_get, css_put destroy: memcg_put So we only need to get rid of the memcg_get, change the memcg_put to css_put, and get rid of the now extra css_put. (This changelog is mostly written by Glauber) Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
Use css_get/css_put instead of mem_cgroup_get/put. Note, if at the same time someone is moving @current to a different cgroup and removing the old cgroup, css_tryget() may return false, and sock->sk_cgrp won't be initialized, which is fine. Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
mem_cgroup_css_online calls mem_cgroup_put if memcg_init_kmem fails. This is not correct because only memcg_propagate_kmem takes an additional reference while mem_cgroup_sockets_init is allowed to fail as well (although no current implementation fails) but it doesn't take any reference. This all suggests that it should be memcg_propagate_kmem that should clean up after itself so this patch moves mem_cgroup_put over there. Unfortunately this is not that easy (as pointed out by Li Zefan) because memcg_kmem_mark_dead marks the group dead (KMEM_ACCOUNTED_DEAD) if it is marked active (KMEM_ACCOUNTED_ACTIVE) which is the case even if memcg_propagate_kmem fails so the additional reference is dropped in that case in kmem_cgroup_destroy which means that the reference would be dropped two times. The easiest way then would be to simply remove mem_cgrroup_put from mem_cgroup_css_online and rely on kmem_cgroup_destroy doing the right thing. Signed-off-by: NMichal Hocko <mhocko@suse.cz> Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.8] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
This reverts commit e4715f01. mem_cgroup_put is hierarchy aware so mem_cgroup_put(memcg) already drops an additional reference from all parents so the additional mem_cgrroup_put(parent) potentially causes use-after-free. Signed-off-by: NMichal Hocko <mhocko@suse.cz> Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.9+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jörn Engel 提交于
It is counterintuitive at best that mmap'ing a hugetlbfs file with MAP_HUGETLB fails, while mmap'ing it without will a) succeed and b) return huge pages. v2: use is_file_hugepages(), as suggested by Jianguo Signed-off-by: NJoern Engel <joern@logfs.org> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
After the patch "mm: vmscan: Flatten kswapd priority loop" was merged the scanning priority of kswapd changed. The priority now rises until it is scanning enough pages to meet the high watermark. shrink_inactive_list sets ZONE_WRITEBACK if a number of pages were encountered under writeback but this value is scaled based on the priority. As kswapd frequently scans with a higher priority now it is relatively easy to set ZONE_WRITEBACK. This patch removes the scaling and treates writeback pages similar to how it treats unqueued dirty pages and congested pages. The user-visible effect should be that kswapd will writeback fewer pages from reclaim context. Signed-off-by: NMel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Direct reclaim is not aborting to allow compaction to go ahead properly. do_try_to_free_pages is told to abort reclaim which is happily ignores and instead increases priority instead until it reaches 0 and starts shrinking file/anon equally. This patch corrects the situation by aborting reclaim when requested instead of raising priority. Signed-off-by: NMel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tang Chen 提交于
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tang Chen 提交于
Remove one redundant "nid" in the comment. Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
When searching a vmap area in the vmalloc space, we use (addr + size - 1) to check if the value is less than addr, which is an overflow. But we assign (addr + size) to vmap_area->va_end. So if we come across the below case: (addr + size - 1) : not overflow (addr + size) : overflow we will assign an overflow value (e.g 0) to vmap_area->va_end, And this will trigger BUG in __insert_vmap_area, causing system panic. So using (addr + size) to check the overflow should be the correct behaviour, not (addr + size - 1). Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Reported-by: NGhennadi Procopciuc <unix140@gmail.com> Tested-by: NDaniel Baluta <dbaluta@ixiacom.com> Cc: David Rientjes <rientjes@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joe Perches 提交于
These VM_<READfoo> macros aren't used very often and three of them aren't used at all. Expand the ones that are used in-place, and remove all the now unused #define VM_<foo> macros. VM_READHINTMASK, VM_NormalReadHint and VM_ClearReadHint were added just before 2.4 and appears have never been used. Signed-off-by: NJoe Perches <joe@perches.com> Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
With CONFIG_MEMORY_HOTREMOVE unset, there is a compile warning: mm/sparse.c:755: warning: `clear_hwpoisoned_pages' defined but not used And Bisecting it ended up pointing to 4edd7cef ("mm, hotplug: avoid compiling memory hotremove functions when disabled"). This is because the commit above put sparse_remove_one_section() within the protection of CONFIG_MEMORY_HOTREMOVE but the only user of clear_hwpoisoned_pages() is sparse_remove_one_section(), and it is not within the protection of CONFIG_MEMORY_HOTREMOVE. So put clear_hwpoisoned_pages within CONFIG_MEMORY_HOTREMOVE should fix the warning. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Acked-by: NToshi Kani <toshi.kani@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
This function is nowhere used, and it has a confusing name with put_page in mm/swap.c. So better to remove it. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
vfree() only needs schedule_work(&p->wq) if p->list was empty, otherwise vfree_deferred->wq is already pending or it is running and didn't do llist_del_all() yet. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
In __rmqueue_fallback(), current_order loops down from MAX_ORDER - 1 to the order passed. MAX_ORDER is typically 11 and pageblock_order is typically 9 on x86. Integer division truncates, so pageblock_order / 2 is 4. For the first eight iterations, it's guaranteed that current_order >= pageblock_order / 2 if it even gets that far! So just remove the unlikely(), it's completely bogus. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Suggested-by: NDavid Rientjes <rientjes@google.com> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
The callers of build_zonelists_node always pass MAX_NR_ZONES -1 as the zone_type argument, so we can directly use the value in build_zonelists_node and remove zone_type argument. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Glauber Costa 提交于
The memory we used to hold the memcg arrays is currently accounted to the current memcg. But that creates a problem, because that memory can only be freed after the last user is gone. Our only way to know which is the last user, is to hook up to freeing time, but the fact that we still have some in flight kmallocs will prevent freeing to happen. I believe therefore to be just easier to account this memory as global overhead. Signed-off-by: NGlauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Glauber Costa 提交于
The memory we used to hold the memcg arrays is currently accounted to the current memcg. But that creates a problem, because that memory can only be freed after the last user is gone. Our only way to know which is the last user, is to hook up to freeing time, but the fact that we still have some in flight kmallocs will prevent freeing to happen. I believe therefore to be just easier to account this memory as global overhead. This patch (of 2): Disabling accounting is only relevant for some specific memcg internal allocations. Therefore we would initially not have such check at memcg_kmem_newpage_charge, since direct calls to the page allocator that are marked with GFP_KMEMCG only happen outside memcg core. We are mostly concerned with cache allocations and by having this test at memcg_kmem_get_cache we are already able to relay the allocation to the root cache and bypass the memcg caches altogether. There is one exception, though: the SLUB allocator does not create large order caches, but rather service large kmallocs directly from the page allocator. Therefore, the following sequence, when backed by the SLUB allocator: memcg_stop_kmem_account(); kmalloc(<large_number>) memcg_resume_kmem_account(); would effectively ignore the fact that we should skip accounting, since it will drive us directly to this function without passing through the cache selector memcg_kmem_get_cache. Such large allocations are extremely rare but can happen, for instance, for the cache arrays. This was never a problem in practice, because we weren't skipping accounting for the cache arrays. All the allocations we were skipping were fairly small. However, the fact that we were not skipping those allocations are a problem and can prevent the memcgs from going away. As we fix that, we need to make sure that the fix will also work with the SLUB allocator. Signed-off-by: NGlauber Costa <glommer@openvz.org> Reported-by: NMichal Hocko <mhocko@suze.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
We should check the VM_UNITIALIZED flag in s_show(). If this flag is set, that said, the vm_struct is not fully initialized. So it is unnecessary to try to show the information contained in vm_struct. We checked this flag in show_numa_info(), but I think it's better to check it earlier. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
VM_UNLIST was used to indicate that the vm_struct is not listed in vmlist. But after commit 4341fa45 ("mm, vmalloc: remove list management of vmlist after initializing vmalloc"), the meaning of this flag changed. It now means the vm_struct is not fully initialized. So renaming it to VM_UNINITIALIZED seems more reasonable. Also change clear_vm_unlist to clear_vm_uninitialized_flag. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
Use goto to jump to the fail label to give a failure message before returning NULL. This makes the failure handling in this function consistent. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
As we have removed the dead code in the vb_alloc, it seems there is no place to use the alloc_map. So there is no reason to maintain the alloc_map in vmap_block. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
This function is nowhere used now, so remove it. Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
Space in a vmap block that was once allocated is considered dirty and not made available for allocation again before the whole block is recycled. The result is that free space within a vmap block is always contiguous. So if a vmap block has enough free space for allocation, the allocation is impossible to fail. Thus, the fragmented block purging was never invoked from vb_alloc(). So remove this dead code. [ Same patches also sent by: Chanho Min <chanho.min@lge.com> Johannes Weiner <hannes@cmpxchg.org> but git doesn't do "multiple authors" ] Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Carpenter 提交于
There is an extra semi-colon so the function always returns. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Yanfei 提交于
When calculating pages in a node, for each zone in that node, we will have zone_spanned_pages_in_node --> get_pfn_range_for_nid zone_absent_pages_in_node --> get_pfn_range_for_nid That is to say, we call the get_pfn_range_for_nid to get start_pfn and end_pfn of the node for MAX_NR_ZONES * 2 times. And this is totally unnecessary if we call the get_pfn_range_for_nid before zone_*_pages_in_node add two extra arguments node_start_pfn and node_end_pfn for zone_*_pages_in_node, then we can remove the get_pfn_range_in_node in zone_*_pages_in_node. [akpm@linux-foundation.org: make definitions more readable] Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Johannes Weiner 提交于
Remove struct mem_cgroup_lru_info and fold its single member, the variably sized nodeinfo[0], directly into struct mem_cgroup. This should make it more obvious why it has to be the last member there. Also move the comment that's above that special last member below it, so it is more visible to somebody that considers appending to the struct mem_cgroup. Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rasmus Villemoes 提交于
This patch is very similar to commit 84d96d89 ("mm: madvise: complete input validation before taking lock"): perform some basic validation of the input to mremap() before taking the ¤t->mm->mmap_sem lock. This also makes the MREMAP_FIXED => MREMAP_MAYMOVE dependency slightly more explicit. Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 7月, 2013 1 次提交
-
-
由 Kees Cook 提交于
Calling dev_set_name with a single paramter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents, including wrappers like device_create*() and bdi_register(). Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-