Unverified commit 9c8467e9, authored by openeuler-ci-bot, committed by Gitee

!1802 zram: Support multiple compression streams

Merge Pull Request from: @ci-robot 
 
PR sync from: Jinjiang Tu <tujinjiang@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/UAQZCBQFUA5LYCLXXVHHJGAK2GNRA6LC/ 
Support multiple compression streams for zram.

Alexey Romanov (1):
  zram: add size class equals check into recompression

Andy Shevchenko (1):
  lib/cmdline: Export next_arg() for being used in modules

Ming Lei (1):
  zram: fix race between zram_reset_device() and disksize_store()

Sergey Senozhatsky (11):
  zram: preparation for multi-zcomp support
  zram: add recompression algorithm sysfs knob
  zram: factor out WB and non-WB zram read functions
  zram: introduce recompress sysfs knob
  zram: add recompress flag to read_block_state()
  zram: clarify writeback_store() comment
  zram: remove redundant checks from zram_recompress()
  zram: add algo parameter support to zram_recompress()
  documentation: add zram recompression documentation
  zram: add incompressible writeback
  zram: add incompressible flag to read_block_state()


-- 
2.25.1
 
https://gitee.com/openeuler/kernel/issues/I7TWVA 
 
Link:https://gitee.com/openeuler/kernel/pulls/1802 

Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
@@ -334,6 +334,11 @@ Admin can request writeback of those idle pages at right timing via::
 With the command, zram writeback idle pages from memory to the storage.
+If a user chooses to writeback only incompressible pages (pages that none of
+algorithms can compress) this can be accomplished with::
+echo incompressible > /sys/block/zramX/writeback
 If there are lots of write IO with flash device, potentially, it has
 flash wearout problem so that admin needs to design write limitation
 to guarantee storage health for entire product life.
@@ -382,6 +387,87 @@ budget in next setting is user's job.
 If admin wants to measure writeback count in a certain period, he could
 know it via /sys/block/zram0/bd_stat's 3rd column.
+recompression
+-------------
+With CONFIG_ZRAM_MULTI_COMP, zram can recompress pages using alternative
+(secondary) compression algorithms. The basic idea is that alternative
+compression algorithm can provide better compression ratio at a price of
+(potentially) slower compression/decompression speeds. Alternative compression
+algorithm can, for example, be more successful compressing huge pages (those
+that default algorithm failed to compress). Another application is idle pages
+recompression - pages that are cold and sit in the memory can be recompressed
+using more effective algorithm and, hence, reduce zsmalloc memory usage.
+With CONFIG_ZRAM_MULTI_COMP, zram supports up to 4 compression algorithms:
+one primary and up to 3 secondary ones. Primary zram compressor is explained
+in "3) Select compression algorithm", secondary algorithms are configured
+using recomp_algorithm device attribute.
+Example:::
+#show supported recompression algorithms
+cat /sys/block/zramX/recomp_algorithm
+#1: lzo lzo-rle lz4 lz4hc [zstd]
+#2: lzo lzo-rle lz4 [lz4hc] zstd
+Alternative compression algorithms are sorted by priority. In the example
+above, zstd is used as the first alternative algorithm, which has priority
+of 1, while lz4hc is configured as a compression algorithm with priority 2.
+Alternative compression algorithm's priority is provided during algorithms
+configuration:::
+#select zstd recompression algorithm, priority 1
+echo "algo=zstd priority=1" > /sys/block/zramX/recomp_algorithm
+#select deflate recompression algorithm, priority 2
+echo "algo=deflate priority=2" > /sys/block/zramX/recomp_algorithm
+Another device attribute that CONFIG_ZRAM_MULTI_COMP enables is recompress,
+which controls recompression.
+Examples:::
+#IDLE pages recompression is activated by `idle` mode
+echo "type=idle" > /sys/block/zramX/recompress
+#HUGE pages recompression is activated by `huge` mode
+echo "type=huge" > /sys/block/zram0/recompress
+#HUGE_IDLE pages recompression is activated by `huge_idle` mode
+echo "type=huge_idle" > /sys/block/zramX/recompress
+The number of idle pages can be significant, so user-space can pass a size
+threshold (in bytes) to the recompress knob: zram will recompress only pages
+of equal or greater size:::
+#recompress all pages larger than 3000 bytes
+echo "threshold=3000" > /sys/block/zramX/recompress
+#recompress idle pages larger than 2000 bytes
+echo "type=idle threshold=2000" > /sys/block/zramX/recompress
+Recompression of idle pages requires memory tracking.
+During re-compression for every page, that matches re-compression criteria,
+ZRAM iterates the list of registered alternative compression algorithms in
+order of their priorities. ZRAM stops either when re-compression was
+successful (re-compressed object is smaller in size than the original one)
+and matches re-compression criteria (e.g. size threshold) or when there are
+no secondary algorithms left to try. If none of the secondary algorithms can
+successfully re-compressed the page such a page is marked as incompressible,
+so ZRAM will not attempt to re-compress it in the future.
+This re-compression behaviour, when it iterates through the list of
+registered compression algorithms, increases our chances of finding the
+algorithm that successfully compresses a particular page. Sometimes, however,
+it is convenient (and sometimes even necessary) to limit recompression to
+only one particular algorithm so that it will not try any other algorithms.
+This can be achieved by providing a algo=NAME parameter:::
+#use zstd algorithm only (if registered)
+echo "type=huge algo=zstd" > /sys/block/zramX/recompress
 memory tracking
 ===============
@@ -392,9 +478,11 @@ pages of the process with*pagemap.
 If you enable the feature, you could see block state via
 /sys/kernel/debug/zram/zram0/block_state". The output is as follows::
-300 75.033841 .wh.
-301 63.806904 s...
-302 63.806919 ..hi
+300 75.033841 .wh...
+301 63.806904 s.....
+302 63.806919 ..hi..
+303 62.801919 ....r.
+304 146.781902 ..hi.n
 First column
 zram's block index.
@@ -411,6 +499,10 @@ Third column
 huge page
 i:
 idle page
+r:
+recompressed page (secondary compression algorithm)
+n:
+none (including secondary) of algorithms could compress it
 First line of above example says 300th block is accessed at 75.033841sec
 and the block's state is huge so it is written back to the backing
......
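Reviewer note: tools that consume block_state need to cope with the third column growing from four to six characters. A minimal user-space sketch of a parser for the new format follows. `struct block_state` and `parse_block_state()` are hypothetical names; the meanings of `s` (same page) and `w` (written back) come from the full zram.rst rather than from the excerpt shown here.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical decoded form of one block_state line. */
struct block_state {
	unsigned long index; /* first column: block index */
	double access_time;  /* second column: access time in seconds */
	bool same;           /* 's' */
	bool written_back;   /* 'w' */
	bool huge;           /* 'h' */
	bool idle;           /* 'i' */
	bool recompressed;   /* 'r', added by this series */
	bool incompressible; /* 'n', added by this series */
};

/* Parses one line such as "304 146.781902 ..hi.n"; returns false on mismatch. */
static bool parse_block_state(const char *line, struct block_state *bs)
{
	char flags[16];

	memset(bs, 0, sizeof(*bs));
	if (sscanf(line, "%lu %lf %15s", &bs->index, &bs->access_time, flags) != 3)
		return false;
	if (strlen(flags) != 6) /* the old four-character format is rejected */
		return false;

	bs->same = flags[0] == 's';
	bs->written_back = flags[1] == 'w';
	bs->huge = flags[2] == 'h';
	bs->idle = flags[3] == 'i';
	bs->recompressed = flags[4] == 'r';
	bs->incompressible = flags[5] == 'n';
	return true;
}
```

Rejecting the four-character form is a design choice of this sketch; a tool that must support both kernels could instead treat missing trailing positions as unset.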
@@ -37,3 +37,12 @@ config ZRAM_MEMORY_TRACKING
 /sys/kernel/debug/zram/zramX/block_state.
 See Documentation/admin-guide/blockdev/zram.rst for more information.
+config ZRAM_MULTI_COMP
+bool "Enable multiple compression streams"
+depends on ZRAM
+help
+This will enable multi-compression streams, so that ZRAM can
+re-compress pages using a potentially slower but more effective
+compression algorithm. Note, that IDLE page recompression
+requires ZRAM_MEMORY_TRACKING.
@@ -204,7 +204,7 @@ void zcomp_destroy(struct zcomp *comp)
 * case of allocation error, or any other error potentially
 * returned by zcomp_init().
 */
-struct zcomp *zcomp_create(const char *compress)
+struct zcomp *zcomp_create(const char *alg)
 {
 struct zcomp *comp;
 int error;
@@ -214,14 +214,14 @@ struct zcomp *zcomp_create(const char *compress)
 * is not loaded yet. We must do it here, otherwise we are about to
 * call /sbin/modprobe under CPU hot-plug lock.
 */
-if (!zcomp_available_algorithm(compress))
+if (!zcomp_available_algorithm(alg))
 return ERR_PTR(-EINVAL);
 comp = kzalloc(sizeof(struct zcomp), GFP_KERNEL);
 if (!comp)
 return ERR_PTR(-ENOMEM);
-comp->name = compress;
+comp->name = alg;
 error = zcomp_init(comp);
 if (error) {
 kfree(comp);
......
@@ -27,7 +27,7 @@ int zcomp_cpu_dead(unsigned int cpu, struct hlist_node *node);
 ssize_t zcomp_available_show(const char *comp, char *buf);
 bool zcomp_available_algorithm(const char *comp);
-struct zcomp *zcomp_create(const char *comp);
+struct zcomp *zcomp_create(const char *alg);
 void zcomp_destroy(struct zcomp *comp);
 struct zcomp_strm *zcomp_stream_get(struct zcomp *comp);
......
This diff is collapsed.
@@ -41,6 +41,9 @@
 */
 #define ZRAM_FLAG_SHIFT 24
+/* Only 2 bits are allowed for comp priority index */
+#define ZRAM_COMP_PRIORITY_MASK 0x3
 /* Flags for zram pages (table[page_no].flags) */
 enum zram_pageflags {
 /* zram slot is locked */
@@ -50,6 +53,10 @@ enum zram_pageflags {
 ZRAM_UNDER_WB, /* page is under writeback */
 ZRAM_HUGE, /* Incompressible page */
 ZRAM_IDLE, /* not accessed page since last idle marking */
+ZRAM_INCOMPRESSIBLE, /* none of the algorithms could compress it */
+ZRAM_COMP_PRIORITY_BIT1, /* First bit of comp priority index */
+ZRAM_COMP_PRIORITY_BIT2, /* Second bit of comp priority index */
 __NR_ZRAM_PAGEFLAGS,
 };
@@ -89,10 +96,20 @@ struct zram_stats {
 #endif
 };
+#ifdef CONFIG_ZRAM_MULTI_COMP
+#define ZRAM_PRIMARY_COMP 0U
+#define ZRAM_SECONDARY_COMP 1U
+#define ZRAM_MAX_COMPS 4U
+#else
+#define ZRAM_PRIMARY_COMP 0U
+#define ZRAM_SECONDARY_COMP 0U
+#define ZRAM_MAX_COMPS 1U
+#endif
 struct zram {
 struct zram_table_entry *table;
 struct zs_pool *mem_pool;
-struct zcomp *comp;
+struct zcomp *comps[ZRAM_MAX_COMPS];
 struct gendisk *disk;
 /* Prevent concurrent execution of device init */
 struct rw_semaphore init_lock;
@@ -107,7 +124,8 @@ struct zram {
 * we can store in a disk.
 */
 u64 disksize; /* bytes */
-char compressor[CRYPTO_MAX_ALG_NAME];
+const char *comp_algs[ZRAM_MAX_COMPS];
+s8 num_active_comps;
 /*
 * zram is claimed so open request will be failed
 */
......
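Reviewer note: two flag bits are enough to index the four entries of comps[] (ZRAM_MAX_COMPS is 4U and ZRAM_COMP_PRIORITY_MASK is 0x3). The code that actually stores the priority lives in the collapsed zram_drv.c diff, so the following is only a user-space sketch of how such a 2-bit field could be packed into a flags word. `set_priority()`, `get_priority()` and the `PRIO_BIT1` position are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ZRAM_FLAG_SHIFT         24
#define ZRAM_COMP_PRIORITY_MASK 0x3

/*
 * Assumption: bit position of ZRAM_COMP_PRIORITY_BIT1 inside the flags word.
 * The real value follows from the enum order, which is only partly visible
 * in the diff above.
 */
#define PRIO_BIT1 (ZRAM_FLAG_SHIFT + 7)

/* Store a priority (0..3) in the two priority bits, leaving other bits alone. */
static uint64_t set_priority(uint64_t flags, unsigned int prio)
{
	uint64_t mask = (uint64_t)ZRAM_COMP_PRIORITY_MASK << PRIO_BIT1;

	/* Clear the old value first so a page can be recompressed again later. */
	flags &= ~mask;
	flags |= ((uint64_t)prio & ZRAM_COMP_PRIORITY_MASK) << PRIO_BIT1;
	return flags;
}

/* Read the priority back out of the flags word. */
static unsigned int get_priority(uint64_t flags)
{
	unsigned int prio = (unsigned int)((flags >> PRIO_BIT1) & ZRAM_COMP_PRIORITY_MASK);

	assert(prio <= ZRAM_COMP_PRIORITY_MASK);
	return prio;
}
```

Masking the input means an out-of-range priority such as 5 is silently truncated to 1; a stricter variant could reject it instead.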
@@ -55,5 +55,7 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
 unsigned long zs_get_total_pages(struct zs_pool *pool);
 unsigned long zs_compact(struct zs_pool *pool);
+unsigned int zs_lookup_class_index(struct zs_pool *pool, unsigned int size);
 void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
 #endif
@@ -253,3 +253,4 @@ char *next_arg(char *args, char **param, char **val)
 /* Chew up trailing spaces. */
 return skip_spaces(next);
 }
+EXPORT_SYMBOL(next_arg);
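Reviewer note: next_arg() is exported so that a module can split strings such as "type=idle threshold=2000" into param/value pairs. The consumer is in the collapsed zram_drv.c diff, so the sketch below only illustrates the parsing pattern in user space. `next_kv()` is a simplified stand-in, not the kernel's next_arg() (it has no quote handling), and `struct recompress_args` and `parse_recompress()` are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the keys shown in the documentation examples. */
struct recompress_args {
	const char *type;        /* "idle", "huge" or "huge_idle" */
	const char *algo;        /* optional algorithm name */
	unsigned long threshold; /* size threshold in bytes, 0 if absent */
};

/*
 * Simplified tokenizer: splits the next "param=value" token off args in
 * place, stores pointers to both halves and returns the rest of the string.
 */
static char *next_kv(char *args, char **param, char **val)
{
	char *p;

	while (isspace((unsigned char)*args))
		args++;
	*param = args;
	*val = NULL;
	for (p = args; *p && !isspace((unsigned char)*p); p++) {
		if (*p == '=' && !*val) {
			*p = '\0';
			*val = p + 1;
		}
	}
	if (*p) {
		*p++ = '\0';
		while (isspace((unsigned char)*p))
			p++;
	}
	return p;
}

/* Returns 0 on success, -1 on an unknown key or a missing value. */
static int parse_recompress(char *args, struct recompress_args *out)
{
	char *param, *val;

	out->type = NULL;
	out->algo = NULL;
	out->threshold = 0;
	while (*args) {
		args = next_kv(args, &param, &val);
		if (!val || !*val)
			return -1;
		if (!strcmp(param, "type"))
			out->type = val;
		else if (!strcmp(param, "algo"))
			out->algo = val;
		else if (!strcmp(param, "threshold"))
			out->threshold = strtoul(val, NULL, 10);
		else
			return -1;
	}
	return 0;
}
```

The buffer is modified in place, so callers must pass a writable copy of the input; the trailing newline that `echo` appends is consumed as whitespace.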
@@ -1221,6 +1221,27 @@ static bool zspage_full(struct size_class *class, struct zspage *zspage)
 return get_zspage_inuse(zspage) == class->objs_per_zspage;
 }
+/**
+* zs_lookup_class_index() - Returns index of the zsmalloc &size_class
+* that hold objects of the provided size.
+* @pool: zsmalloc pool to use
+* @size: object size
+*
+* Context: Any context.
+*
+* Return: the index of the zsmalloc &size_class that hold objects of the
+* provided size.
+*/
+unsigned int zs_lookup_class_index(struct zs_pool *pool, unsigned int size)
+{
+struct size_class *class;
+class = pool->size_class[get_size_class_index(size)];
+return class->index;
+}
+EXPORT_SYMBOL_GPL(zs_lookup_class_index);
 unsigned long zs_get_total_pages(struct zs_pool *pool)
 {
 return atomic_long_read(&pool->pages_allocated);
......
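Reviewer note: the commit title "zram: add size class equals check into recompression" together with this new export suggests that recompression is only worthwhile when the new object lands in a smaller zsmalloc size class; the caller is in the collapsed zram_drv.c diff, so that reading is an inference. The toy model below illustrates the idea. `toy_class_index()`, `recompression_saves_memory()` and both constants are hypothetical, and real zsmalloc merges classes, so actual indices differ.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MIN_ALLOC   32 /* hypothetical smallest object size */
#define TOY_CLASS_DELTA 16 /* hypothetical spacing between size classes */

/* Toy mapping from object size to size-class index (rounds up). */
static unsigned int toy_class_index(size_t size)
{
	if (size <= TOY_MIN_ALLOC)
		return 0;
	return (unsigned int)((size - TOY_MIN_ALLOC + TOY_CLASS_DELTA - 1) /
			      TOY_CLASS_DELTA);
}

/*
 * An allocator that hands out class-sized slots uses the same amount of
 * memory for two objects in the same class, so a smaller object only helps
 * if it moves to a lower class.
 */
static bool recompression_saves_memory(size_t old_size, size_t new_size)
{
	return toy_class_index(new_size) < toy_class_index(old_size);
}
```

In this model, shrinking an object from 3000 to 2995 bytes keeps it in the same class and saves nothing, while shrinking it to 2000 bytes does move it to a lower class.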