- 16 4月, 2015 1 次提交
-
-
由 Minchan Kim 提交于
Recently, we started to use zram heavily and some of issues popped. 1) external fragmentation I got a report from Juneho Choi that fork failed although there are plenty of free pages in the system. His investigation revealed zram is one of the culprit to make heavy fragmentation so there was no more contiguous 16K page for pgd to fork in the ARM. 2) non-movable pages Other problem of zram now is that inherently, user want to use zram as swap in small memory system so they use zRAM with CMA to use memory efficiently. However, unfortunately, it doesn't work well because zRAM cannot use CMA's movable pages unless it doesn't support compaction. I got several reports about that OOM happened with zram although there are lots of swap space and free space in CMA area. 3) internal fragmentation zRAM has started support memory limitation feature to limit memory usage and I sent a patchset(https://lkml.org/lkml/2014/9/21/148) for VM to be harmonized with zram-swap to stop anonymous page reclaim if zram consumed memory up to the limit although there are free space on the swap. One problem for that direction is zram has no way to know any hole in memory space zsmalloc allocated by internal fragmentation so zram would regard swap is full although there are free space in zsmalloc. For solving the issue, zram want to trigger compaction of zsmalloc before it decides full or not. This patchset is first step to support above issues. For that, it adds indirect layer between handle and object location and supports manual compaction to solve 3th problem first of all. After this patchset got merged, next step is to make VM aware of zsmalloc compaction so that generic compaction will move zsmalloced-pages automatically in runtime. In my imaginary experiment(ie, high compress ratio data with heavy swap in/out on 8G zram-swap), data is as follows, Before = zram allocated object : 60212066 bytes zram total used: 140103680 bytes ratio: 42.98 percent MemFree: 840192 kB Compaction After = frag ratio after compaction zram allocated object : 60212066 bytes zram total used: 76185600 bytes ratio: 79.03 percent MemFree: 901932 kB Juneho reported below in his real platform with small aging. So, I think the benefit would be bigger in real aging system for a long time. - frag_ratio increased 3% (ie, higher is better) - memfree increased about 6MB - In buddy info, Normal 2^3: 4, 2^2: 1: 2^1 increased, Highmem: 2^1 21 increased frag ratio after swap fragment used : 156677 kbytes total: 166092 kbytes frag_ratio : 94 meminfo before compaction MemFree: 83724 kB Node 0, zone Normal 13642 1364 57 10 61 17 9 5 4 0 0 Node 0, zone HighMem 425 29 1 0 0 0 0 0 0 0 0 num_migrated : 23630 compaction done frag ratio after compaction used : 156673 kbytes total: 160564 kbytes frag_ratio : 97 meminfo after compaction MemFree: 89060 kB Node 0, zone Normal 14076 1544 67 14 61 17 9 5 4 0 0 Node 0, zone HighMem 863 50 1 0 0 0 0 0 0 0 0 This patchset adds more logics(about 480 lines) in zsmalloc but when I tested heavy swapin/out program, the regression for swapin/out speed is marginal because most of overheads were caused by compress/decompress and other MM reclaim stuff. This patch (of 7): Currently, handle of zsmalloc encodes object's location directly so it makes support of migration hard. This patch decouples handle and object via adding indirect layer. For that, it allocates handle dynamically and returns it to user. The handle is the address allocated by slab allocation so it's unique and we could keep object's location in the memory space allocated for handle. With it, we can change object's position without changing handle itself. Signed-off-by: NMinchan Kim <minchan@kernel.org> Cc: Juneho Choi <juno.choi@lge.com> Cc: Gunho Lee <gunho.lee@lge.com> Cc: Luigi Semenzato <semenzato@google.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 2月, 2015 2 次提交
-
-
由 Ganesh Mahendran 提交于
Keeping fragmentation of zsmalloc in a low level is our target. But now we still need to add the debug code in zsmalloc to get the quantitative data. This patch adds a new configuration CONFIG_ZSMALLOC_STAT to enable the statistics collection for developers. Currently only the objects statatitics in each class are collected. User can get the information via debugfs. cat /sys/kernel/debug/zsmalloc/zram0/... For example: After I copied "jdk-8u25-linux-x64.tar.gz" to zram with ext4 filesystem: class size obj_allocated obj_used pages_used 0 32 0 0 0 1 48 256 12 3 2 64 64 14 1 3 80 51 7 1 4 96 128 5 3 5 112 73 5 2 6 128 32 4 1 7 144 0 0 0 8 160 0 0 0 9 176 0 0 0 10 192 0 0 0 11 208 0 0 0 12 224 0 0 0 13 240 0 0 0 14 256 16 1 1 15 272 15 9 1 16 288 0 0 0 17 304 0 0 0 18 320 0 0 0 19 336 0 0 0 20 352 0 0 0 21 368 0 0 0 22 384 0 0 0 23 400 0 0 0 24 416 0 0 0 25 432 0 0 0 26 448 0 0 0 27 464 0 0 0 28 480 0 0 0 29 496 33 1 4 30 512 0 0 0 31 528 0 0 0 32 544 0 0 0 33 560 0 0 0 34 576 0 0 0 35 592 0 0 0 36 608 0 0 0 37 624 0 0 0 38 640 0 0 0 40 672 0 0 0 42 704 0 0 0 43 720 17 1 3 44 736 0 0 0 46 768 0 0 0 49 816 0 0 0 51 848 0 0 0 52 864 14 1 3 54 896 0 0 0 57 944 13 1 3 58 960 0 0 0 62 1024 4 1 1 66 1088 15 2 4 67 1104 0 0 0 71 1168 0 0 0 74 1216 0 0 0 76 1248 0 0 0 83 1360 3 1 1 91 1488 11 1 4 94 1536 0 0 0 100 1632 5 1 2 107 1744 0 0 0 111 1808 9 1 4 126 2048 4 4 2 144 2336 7 3 4 151 2448 0 0 0 168 2720 15 15 10 190 3072 28 27 21 202 3264 0 0 0 254 4096 36209 36209 36209 Total 37022 36326 36288 We can calculate the overall fragentation by the last line: Total 37022 36326 36288 (37022 - 36326) / 37022 = 1.87% Also by analysing objects alocated in every class we know why we got so low fragmentation: Most of the allocated objects is in <class 254>. And there is only 1 page in class 254 zspage. So, No fragmentation will be introduced by allocating objs in class 254. And in future, we can collect other zsmalloc statistics as we need and analyse them. Signed-off-by: NGanesh Mahendran <opensource.ganesh@gmail.com> Suggested-by: NMinchan Kim <minchan@kernel.org> Acked-by: NMinchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Dan Streetman <ddstreet@ieee.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ganesh Mahendran 提交于
Currently the underlay of zpool: zsmalloc/zbud, do not know who creates them. There is not a method to let zsmalloc/zbud find which caller they belong to. Now we want to add statistics collection in zsmalloc. We need to name the debugfs dir for each pool created. The way suggested by Minchan Kim is to use a name passed by caller(such as zram) to create the zsmalloc pool. /sys/kernel/debug/zsmalloc/zram0 This patch adds an argument `name' to zs_create_pool() and other related functions. Signed-off-by: NGanesh Mahendran <opensource.ganesh@gmail.com> Acked-by: NMinchan Kim <minchan@kernel.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Dan Streetman <ddstreet@ieee.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 12月, 2014 1 次提交
-
-
由 Ganesh Mahendran 提交于
Currently functions in zsmalloc.c does not arranged in a readable and reasonable sequence. With the more and more functions added, we may meet below inconvenience. For example: Current functions: void zs_init() { } static void get_maxobj_per_zspage() { } Then I want to add a func_1() which is called from zs_init(), and this new added function func_1() will used get_maxobj_per_zspage() which is defined below zs_init(). void func_1() { get_maxobj_per_zspage() } void zs_init() { func_1() } static void get_maxobj_per_zspage() { } This will cause compiling issue. So we must add a declaration: static void get_maxobj_per_zspage(); before func_1() if we do not put get_maxobj_per_zspage() before func_1(). In addition, puting module_[init|exit] functions at the bottom of the file conforms to our habit. So, this patch ajusts function sequence as: /* helper functions */ ... obj_location_to_handle() ... /* Some exported functions */ ... zs_map_object() zs_unmap_object() zs_malloc() zs_free() zs_init() zs_exit() Signed-off-by: NGanesh Mahendran <opensource.ganesh@gmail.com> Cc: Nitin Gupta <ngupta@vflare.org> Acked-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 12月, 2014 6 次提交
-
-
由 Ganesh Mahendran 提交于
In zs_create_pool(), we allocate memory more then sizeof(struct zs_pool) ovhd_size = roundup(sizeof(*pool), PAGE_SIZE); This patch allocate memory of exactly needed size. Signed-off-by: NGanesh Mahendran <opensource.ganesh@gmail.com> Acked-by: NMinchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Dan Streetman <ddstreet@ieee.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ganesh Mahendran 提交于
In zs_create_pool(), prev_class is assigned (ZS_SIZE_CLASSES - 1) times. And the prev_class only references to the previous size_class. So we do not need unnecessary assignement. This patch assigns *prev_class* when a new size_class structure is allocated and uses prev_class to check whether the first class has been allocated. [akpm@linux-foundation.org: remove now-unused ZS_SIZE_CLASSES] Signed-off-by: NGanesh Mahendran <opensource.ganesh@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Reviewed-by: NDan Streetman <ddstreet@ieee.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mahendran Ganesh 提交于
I sent a patch [1] for unnecessary check in zsmalloc. And Minchan Kim found zsmalloc even does not support allocating an obj with the size of ZS_MAX_ALLOC_SIZE in some situations. For example: In system with 64KB PAGE_SIZE and 32 bit of physical addr. Then: ZS_MIN_ALLOC_SIZE is 32 bytes which is calculated by: MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS)) ZS_MAX_ALLOC_SIZE is 64KB(in current code, is PAGE_SIZE) ZS_SIZE_CLASS_DELTA is 256 bytes So, ZS_SIZE_CLASSES = (ZS_MAX_ALLOC_SIZE - ZS_MIN_ALLOC_SIZE) / ZS_SIZE_CLASS_DELTA + 1 = 256 In zs_create_pool(), the max size obj which can be allocated will be: ZS_MIN_ALLOC_SIZE + i * ZS_SIZE_CLASS_DELTA = 32 + 255*256 = 65312 We can see that 65312 < 65536 (ZS_MAX_ALLOC_SIZE). So we can NOT allocate objs with size ZS_MAX_ALLOC_SIZE(65536) which we promise upper users we can do. [1] http://lkml.iu.edu/hypermail/linux/kernel/1411.2/03835.html [2] http://lkml.iu.edu/hypermail/linux/kernel/1411.2/04534.html This patch fixes this issue by dynamiclly calculating zs_size_classes when module is loaded, allocates buffer with size ZS_MAX_ALLOC_SIZE. Then the max obj(size is ZS_MAX_ALLOC_SIZE) can be stored in it. [akpm@linux-foundation.org: restore ZS_SIZE_CLASSES to fix bisectability] Signed-off-by: NMahendran Ganesh <opensource.ganesh@gmail.com> Suggested-by: NMinchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
The kunmap_atomic should use virtual address getting by kmap_atomic. However, some pieces of code in zsmalloc uses modified address, not the one got by kmap_atomic for kunmap_atomic. It's okay for working because zsmalloc modifies the address inner PAGE_SIZE bounday so it works with current kmap_atomic's implementation. But it's still fragile with potential changing of kmap_atomic so let's correct it. I got a subtle bug when I implemented a new feature of zsmalloc (compaction) due to a link's mishandling (the link was over page boundary). Although it was totally my mistake, it took a while to find the cause because an unpredictable kmapped address was unmapped causing an almost random crash. Signed-off-by: NMinchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sergey Senozhatsky 提交于
Mahendran Ganesh reported that zpool-enabled zsmalloc should not call zpool_unregister_driver() from zs_init() if cpu notifier registration has failed, because error handling is performed before we register the driver via zpool_register_driver() call. Factor out cpu notifier registration and unregistration code and fix zs_init() error handling. link: http://lkml.iu.edu//hypermail/linux/kernel/1411.1/04156.html [akpm@linux-foundation.org: squash bogus gcc warning] [akpm@linux-foundation.org: use __init and __exit] Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com> Reported-by: NMahendran Ganesh <opensource.ganesh@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joonsoo Kim 提交于
zsmalloc has many size_classes to reduce fragmentation and they are in 16 bytes unit, for example, 16, 32, 48, etc., if PAGE_SIZE is 4096. And, zsmalloc has constraint that each zspage has 4 pages at maximum. In this situation, we can see interesting aspect. Let's think about size_class for 1488, 1472, ..., 1376. To prevent external fragmentation, they uses 4 pages per zspage and so all they can contain 11 objects at maximum. 16384 (4096 * 4) = 1488 * 11 + remains 16384 (4096 * 4) = 1472 * 11 + remains 16384 (4096 * 4) = ... 16384 (4096 * 4) = 1376 * 11 + remains It means that they have same characteristics and classification between them isn't needed. If we use one size_class for them, we can reduce fragementation and save some memory since both the 1488 and 1472 sized classes can only fit 11 objects into 4 pages, and an object that's 1472 bytes can fit into an object that's 1488 bytes, merging these classes to always use objects that are 1488 bytes will reduce the total number of size classes. And reducing the total number of size classes reduces overall fragmentation, because a wider range of compressed pages can fit into a single size class, leaving less unused objects in each size class. For this purpose, this patch implement size_class merging. If there is size_class that have same pages_per_zspage and same number of objects per zspage with previous size_class, we don't create new size_class. Instead, we use previous, same characteristic size_class. With this way, above example sizes (1488, 1472, ..., 1376) use just one size_class so we can get much more memory utilization. Below is result of my simple test. TEST ENV: EXT4 on zram, mount with discard option WORKLOAD: untar kernel source code, remove directory in descending order in size. (drivers arch fs sound include net Documentation firmware kernel tools) Each line represents orig_data_size, compr_data_size, mem_used_total, fragmentation overhead (mem_used - compr_data_size) and overhead ratio (overhead to compr_data_size), respectively, after untar and remove operation is executed. * untar-nomerge.out orig_size compr_size used_size overhead overhead_ratio 525.88MB 199.16MB 210.23MB 11.08MB 5.56% 288.32MB 97.43MB 105.63MB 8.20MB 8.41% 177.32MB 61.12MB 69.40MB 8.28MB 13.55% 146.47MB 47.32MB 56.10MB 8.78MB 18.55% 124.16MB 38.85MB 48.41MB 9.55MB 24.58% 103.93MB 31.68MB 40.93MB 9.25MB 29.21% 84.34MB 22.86MB 32.72MB 9.86MB 43.13% 66.87MB 14.83MB 23.83MB 9.00MB 60.70% 60.67MB 11.11MB 18.60MB 7.49MB 67.48% 55.86MB 8.83MB 16.61MB 7.77MB 88.03% 53.32MB 8.01MB 15.32MB 7.31MB 91.24% * untar-merge.out orig_size compr_size used_size overhead overhead_ratio 526.23MB 199.18MB 209.81MB 10.64MB 5.34% 288.68MB 97.45MB 104.08MB 6.63MB 6.80% 177.68MB 61.14MB 66.93MB 5.79MB 9.47% 146.83MB 47.34MB 52.79MB 5.45MB 11.51% 124.52MB 38.87MB 44.30MB 5.43MB 13.96% 104.29MB 31.70MB 36.83MB 5.13MB 16.19% 84.70MB 22.88MB 27.92MB 5.04MB 22.04% 67.11MB 14.83MB 19.26MB 4.43MB 29.86% 60.82MB 11.10MB 14.90MB 3.79MB 34.17% 55.90MB 8.82MB 12.61MB 3.79MB 42.97% 53.32MB 8.01MB 11.73MB 3.73MB 46.53% As you can see above result, merged one has better utilization (overhead ratio, 5th column) and uses less memory (mem_used_total, 3rd column). Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: NDan Streetman <ddstreet@ieee.org> Cc: Luigi Semenzato <semenzato@google.com> Cc: <juno.choi@lge.com> Cc: "seungho1.park" <seungho1.park@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 10月, 2014 4 次提交
-
-
由 Dan Streetman 提交于
Change zsmalloc init_zspage() logic to iterate through each object on each of its pages, checking the offset to verify the object is on the current page before linking it into the zspage. The current zsmalloc init_zspage free object linking code has logic that relies on there only being one page per zspage when PAGE_SIZE is a multiple of class->size. It calculates the number of objects for the current page, and iterates through all of them plus one, to account for the assumed partial object at the end of the page. While this currently works, the logic can be simplified to just link the object at each successive offset until the offset is larger than PAGE_SIZE, which does not rely on PAGE_SIZE being a multiple of class->size. Signed-off-by: NDan Streetman <ddstreet@ieee.org> Acked-by: NMinchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjennings@variantweb.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wang Sheng-Hui 提交于
The letter 'f' in "n <= N/f" stands for fullness_threshold_frac, not 1/fullness_threshold_frac. Signed-off-by: NWang Sheng-Hui <shhuiw@gmail.com> Acked-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
zs_get_total_size_bytes returns a amount of memory zsmalloc consumed with *byte unit* but zsmalloc operates *page unit* rather than byte unit so let's change the API so benefit we could get is that reduce unnecessary overhead (ie, change page unit with byte unit) in zsmalloc. Since return type is pages, "zs_get_total_pages" is better than "zs_get_total_size_bytes". Signed-off-by: NMinchan Kim <minchan@kernel.org> Reviewed-by: NDan Streetman <ddstreet@ieee.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: <juno.choi@lge.com> Cc: <seungho1.park@lge.com> Cc: Luigi Semenzato <semenzato@google.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: David Horner <ds2horner@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
Currently, zram has no feature to limit memory so theoretically zram can deplete system memory. Users have asked for a limit several times as even without exhaustion zram makes it hard to control memory usage of the platform. This patchset adds the feature. Patch 1 makes zs_get_total_size_bytes faster because it would be used frequently in later patches for the new feature. Patch 2 changes zs_get_total_size_bytes's return unit from bytes to page so that zsmalloc doesn't need unnecessary operation(ie, << PAGE_SHIFT). Patch 3 adds new feature. I added the feature into zram layer, not zsmalloc because limiation is zram's requirement, not zsmalloc so any other user using zsmalloc(ie, zpool) shouldn't affected by unnecessary branch of zsmalloc. In future, if every users of zsmalloc want the feature, then, we could move the feature from client side to zsmalloc easily but vice versa would be painful. Patch 4 adds news facility to report maximum memory usage of zram so that this avoids user polling frequently via /sys/block/zram0/ mem_used_total and ensures transient max are not missed. This patch (of 4): pages_allocated has counted in size_class structure and when user of zsmalloc want to see total_size_bytes, it should gather all of count from each size_class to report the sum. It's not bad if user don't see the value often but if user start to see the value frequently, it would be not a good deal for performance pov. This patch moves the count from size_class to zs_pool so it could reduce memory footprint (from [255 * 8byte] to [sizeof(atomic_long_t)]). Signed-off-by: NMinchan Kim <minchan@kernel.org> Reviewed-by: NDan Streetman <ddstreet@ieee.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: <juno.choi@lge.com> Cc: <seungho1.park@lge.com> Cc: Luigi Semenzato <semenzato@google.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjennings@variantweb.net> Reviewed-by: NDavid Horner <ds2horner@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 8月, 2014 1 次提交
-
-
由 Kees Cook 提交于
To avoid potential format string expansion via module parameters, do not use the zpool type directly in request_module() without a format string. Additionally, to avoid arbitrary modules being loaded via zpool API (e.g. via the zswap_zpool_type module parameter) add a "zpool-" prefix to the requested module, as well as module aliases for the existing zpool types (zbud and zsmalloc). Signed-off-by: NKees Cook <keescook@chromium.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Acked-by: NDan Streetman <ddstreet@ieee.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 8月, 2014 3 次提交
-
-
由 Dan Streetman 提交于
Update zbud and zsmalloc to implement the zpool api. [fengguang.wu@intel.com: make functions static] Signed-off-by: NDan Streetman <ddstreet@ieee.org> Tested-by: NSeth Jennings <sjennings@variantweb.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Weijie Yang <weijie.yang@samsung.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Streetman 提交于
Add zpool api. zpool provides an interface for memory storage, typically of compressed memory. Users can select what backend to use; currently the only implementations are zbud, a low density implementation with up to two compressed pages per storage page, and zsmalloc, a higher density implementation with multiple compressed pages per storage page. Signed-off-by: NDan Streetman <ddstreet@ieee.org> Tested-by: NSeth Jennings <sjennings@variantweb.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Weijie Yang <weijie.yang@samsung.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 WANG Chao 提交于
Currently map_vm_area() takes (struct page *** pages) as third argument, and after mapping, it moves (*pages) to point to (*pages + nr_mappped_pages). It looks like this kind of increment is useless to its caller these days. The callers don't care about the increments and actually they're trying to avoid this by passing another copy to map_vm_area(). The caller can always guarantee all the pages can be mapped into vm_area as specified in first argument and the caller only cares about whether map_vm_area() fails or not. This patch cleans up the pointer movement in map_vm_area() and updates its callers accordingly. Signed-off-by: NWANG Chao <chaowang@redhat.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 6月, 2014 2 次提交
-
-
由 Weijie Yang 提交于
According to calculation, ZS_SIZE_CLASSES value is 255 on systems with 4K page size, not 254. The old value may forget count the ZS_MIN_ALLOC_SIZE in. This patch fixes this trivial issue in the comments. Signed-off-by: NWeijie Yang <weijie.yang@samsung.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
Replace places where __get_cpu_var() is used for an address calculation with this_cpu_ptr(). Signed-off-by: NChristoph Lameter <cl@linux.com> Cc: Tejun Heo <tj@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 3月, 2014 1 次提交
-
-
由 Srivatsa S. Bhat 提交于
Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the zsmalloc code by using this latter form of callback registration. Cc: Nitin Gupta <ngupta@vflare.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 31 1月, 2014 2 次提交
-
-
由 Minchan Kim 提交于
Add my copyright to the zsmalloc source code which I maintain. Signed-off-by: NMinchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
This patch moves zsmalloc under mm directory. Before that, description will explain why we have needed custom allocator. Zsmalloc is a new slab-based memory allocator for storing compressed pages. It is designed for low fragmentation and high allocation success rate on large object, but <= PAGE_SIZE allocations. zsmalloc differs from the kernel slab allocator in two primary ways to achieve these design goals. zsmalloc never requires high order page allocations to back slabs, or "size classes" in zsmalloc terms. Instead it allows multiple single-order pages to be stitched together into a "zspage" which backs the slab. This allows for higher allocation success rate under memory pressure. Also, zsmalloc allows objects to span page boundaries within the zspage. This allows for lower fragmentation than could be had with the kernel slab allocator for objects between PAGE_SIZE/2 and PAGE_SIZE. With the kernel slab allocator, if a page compresses to 60% of it original size, the memory savings gained through compression is lost in fragmentation because another object of the same size can't be stored in the leftover space. This ability to span pages results in zsmalloc allocations not being directly addressable by the user. The user is given an non-dereferencable handle in response to an allocation request. That handle must be mapped, using zs_map_object(), which returns a pointer to the mapped region that can be used. The mapping is necessary since the object data may reside in two different noncontigious pages. The zsmalloc fulfills the allocation needs for zram perfectly [sjenning@linux.vnet.ibm.com: borrow Seth's quote] Signed-off-by: NMinchan Kim <minchan@kernel.org> Acked-by: NNitin Gupta <ngupta@vflare.org> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Luigi Semenzato <semenzato@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Pekka Enberg <penberg@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 12月, 2013 1 次提交
-
-
由 Paul Gortmaker 提交于
None of these files are actually using any __init type directives and hence don't need to include <linux/init.h>. Most are just a left over from __devinit and __cpuinit removal, or simply due to code getting copied from one driver to the next. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 11 12月, 2013 2 次提交
-
-
由 Nitin Cupta 提交于
This patch adds lots of comments and it will help others to review and enhance. Signed-off-by: NSeth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: NNitin Gupta <ngupta@vflare.org> Signed-off-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Minchan Kim 提交于
Zsmalloc has two methods 1) copy-based and 2) pte based to access objects that span two pages. You can see history why we supported two approach from [1]. But it was bad choice that adding hard coding to select arch which want to use pte based method because there are lots of SoC in an architecure and they can have different cache size, CPU speed and so on so it would be better to expose it to user as selectable Kconfig option like Andrew Morton suggested. [1] https://lkml.org/lkml/2012/7/11/58Acked-by: NNitin Gupta <ngupta@vflare.org> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 26 11月, 2013 1 次提交
-
-
由 Olav Haugan 提交于
zsmalloc encodes a handle using the pfn and an object index. On hardware platforms with physical memory starting at 0x0 the pfn can be 0. This causes the encoded handle to be 0 and is incorrectly interpreted as an allocation failure. This issue affects all current and future SoCs with physical memory starting at 0x0. All MSM8974 SoCs which includes Google Nexus 5 devices are affected. To prevent this false error we ensure that the encoded handle will not be 0 when allocation succeeds. Signed-off-by: NOlav Haugan <ohaugan@codeaurora.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 7月, 2013 1 次提交
-
-
由 Sunghan Suh 提交于
Signed-off-by: NSunghan Suh <sunghan.suh@samsung.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 22 5月, 2013 1 次提交
-
-
由 Sara Bird 提交于
The existing comments are using an odd style. Fixed them up to adhere to the StyleGuide. No code changes. Signed-off-by: NSara Bird <sara.bird.iar@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 21 5月, 2013 1 次提交
-
-
由 Marlies Ruck 提交于
Fixes the following checkpatch warning: WARNING: quoted string split across lines Signed-off-by: NMarlies Ruck <marlies.ruck@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 4月, 2013 1 次提交
-
-
由 Arnd Bergmann 提交于
Building zsmalloc as a module does not work on ARM because it uses an interface that is not exported: ERROR: "flush_tlb_kernel_range" [drivers/staging/zsmalloc/zsmalloc.ko] undefined! Since this is only used as a performance optimization and only on ARM, we can avoid the problem simply by not using that optimization when building zsmalloc it is a loadable module. flush_tlb_kernel_range is often an inline function, but out of the architectures that use an extern function, only powerpc exports it. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 29 3月, 2013 1 次提交
-
-
由 Joerg Roedel 提交于
Testing the arm chromebook config against the upstream kernel produces a linker error for the zsmalloc module from staging. The symbol flush_tlb_kernel_range is not available there. Fix this by removing the reimplementation of unmap_kernel_range in the zsmalloc module and using the function directly. The unmap_kernel_range function is not usable by modules, so also disallow building the driver as a module for now. Cc: stable <stable@vger.kernel.org> Signed-off-by: NJoerg Roedel <joro@8bytes.org> Acked-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 2月, 2013 1 次提交
-
-
由 Mel Gorman 提交于
The function names page_xchg_last_nid(), page_last_nid() and reset_page_last_nid() were judged to be inconsistent so rename them to a struct_field_op style pattern. As it looked jarring to have reset_page_mapcount() and page_nid_reset_last() beside each other in memmap_init_zone(), this patch also renames reset_page_mapcount() to page_mapcount_reset(). There are others like init_page_count() but as it is used throughout the arch code a rename would likely cause more conflicts than it is worth. [akpm@linux-foundation.org: fix zcache] Signed-off-by: NMel Gorman <mgorman@suse.de> Suggested-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 1月, 2013 1 次提交
-
-
由 Seth Jennings 提交于
zs_create_pool() currently takes a name argument which is never used in any useful way. This patch removes it. Signed-off-by: NSeth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: NNitin Gupta <ngupta@vflare.org> Acked-by: NRik van Riel <riel@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 30 1月, 2013 2 次提交
-
-
由 Minchan Kim 提交于
Recently, Matt Sealey reported he fail to build zsmalloc caused by using of local_flush_tlb_kernel_range which are architecture dependent function so !CONFIG_SMP in ARM couldn't implement it so it ends up build error following as. MODPOST 216 modules LZMA arch/arm/boot/compressed/piggy.lzma AS arch/arm/boot/compressed/lib1funcs.o ERROR: "v7wbi_flush_kern_tlb_range" [drivers/staging/zsmalloc/zsmalloc.ko] undefined! make[1]: *** [__modpost] Error 1 make: *** [modules] Error 2 make: *** Waiting for unfinished jobs.... The reason we used that function is copy method by [1] was really slow in ARM but at that time. More severe problem is ARM can prefetch speculatively on other CPUs so under us, other TLBs can have an entry only if we do flush local CPU. Russell King pointed that. Thanks! We don't have many choices except using flush_tlb_kernel_range. My experiment in ARMv7 processor 4 core didn't make any difference with zsmapbench[2] between local_flush_tlb_kernel_range and flush_tlb_kernel_range but still page-table based is much better than copy-based. * bigger is better. 1. local_flush_tlb_kernel_range: 3918795 mappings 2. flush_tlb_kernel_range : 3989538 mappings 3. copy-based: 635158 mappings This patch replace local_flush_tlb_kernel_range with flush_tlb_kernel_range which are avaialbe in all architectures because we already have used it in vmalloc allocator which are generic one so build problem should go away and performane loss shoud be void. [1] f553646a, zsmalloc: add page table mapping method [2] https://github.com/spartacus06/zsmapbench Cc: stable@vger.kernel.org Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Konrad Rzeszutek Wilk <konrad@darnok.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Reported-by: NMatt Sealey <matt@genesi-usa.com> Signed-off-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Seth Jennings 提交于
Right now ZS_SIZE_CLASS_DELTA is hardcoded to be 16. This creates 254 classes for systems with 4k pages. However, on PPC64 with 64k pages, it creates 4095 classes which is far too many. This patch makes ZS_SIZE_CLASS_DELTA relative to PAGE_SIZE so that regardless of the page size, there will be the same number of classes. Acked-by: NNitin Gupta <ngupta@vflare.org> Acked-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NSeth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: NDan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 16 1月, 2013 1 次提交
-
-
由 Davidlohr Bueso 提交于
Just as with zs_malloc() and zs_map_object(), it is worth formally commenting the zs_create_pool() function. Signed-off-by: NDavidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 14 8月, 2012 3 次提交
-
-
由 Seth Jennings 提交于
The patch collapses in the internal zsmalloc_int.h into the zsmalloc-main.c file. This is done in preparation for the promotion to mm/ where separate internal headers are discouraged. Signed-off-by: NSeth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: NMinchan Kim <minchan@kernel.org> Acked-by: NNitin Gupta <ngupta@vflare.org> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Seth Jennings 提交于
This patchset provides page mapping via the page table. On some archs, most notably ARM, this method has been demonstrated to be faster than copying. The logic controlling the method selection (copy vs page table) is controlled by the definition of USE_PGTABLE_MAPPING which is/can be defined for any arch that performs better with page table mapping. Signed-off-by: NSeth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Seth Jennings 提交于
Because we use per-cpu mapping areas shared among the pools/users, we can't allow mapping in interrupt context because it can corrupt another users mappings. Signed-off-by: NSeth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: NMinchan Kim <minchan@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-