提交 919e2bb8 编写于 作者: J Jonathan Corbet

Merge branch 'mauro' into docs-next

Mauro says:

This is the second part of a series I wrote sometime ago where I manually
convert lots of files to be properly parsed by Sphinx as ReST files.

As it touches on lot of stuff, this series is based on today's linux-next,
at tag next-20190617.

The first version of this series had 57 patches. The first part with 28 patches
were already merged. Right now, there are still ~76  patches pending applying
(including this series), and that's because I opted to do ~1 patch per converted
 directory.

That sounds too much to be send on a single round. So, I'm opting to split
it on 3 parts for the conversion, plus a final patch adding orphaned books
to existing ones.

Those patches should probably be good to be merged either by subsystem
maintainers or via the docs tree.

I opted to mark new files not included yet to the main index.rst (directly or
indirectly) with the :orphan: tag, in order to avoid adding warnings to the
build system. This should be removed after we find a "home" for all
the converted files within the new document tree arrangement, after I
submit the third part.
......@@ -895,7 +895,7 @@ this sysctl interface anymore.
pty
===
See Documentation/filesystems/devpts.txt.
See Documentation/filesystems/devpts.rst.
randomize_va_space
......
.. SPDX-License-Identifier: GPL-2.0
=================
Automount Support
=================
Support is available for filesystems that wish to do automounting
support (such as kAFS which can be found in fs/afs/ and NFS in
fs/nfs/). This facility includes allowing in-kernel mounts to be
......@@ -5,13 +12,12 @@ performed and mountpoint degradation to be requested. The latter can
also be requested by userspace.
======================
IN-KERNEL AUTOMOUNTING
In-Kernel Automounting
======================
See section "Mount Traps" of Documentation/filesystems/autofs.rst
Then from userspace, you can just do something like:
Then from userspace, you can just do something like::
[root@andromeda root]# mount -t afs \#root.afs. /afs
[root@andromeda root]# ls /afs
......@@ -21,7 +27,7 @@ Then from userspace, you can just do something like:
[root@andromeda root]# ls /afs/cambridge/afsdoc/
ChangeLog html LICENSE pdf RELNOTES-1.2.2
And then if you look in the mountpoint catalogue, you'll see something like:
And then if you look in the mountpoint catalogue, you'll see something like::
[root@andromeda root]# cat /proc/mounts
...
......@@ -30,8 +36,7 @@ And then if you look in the mountpoint catalogue, you'll see something like:
#afsdoc. /afs/cambridge.redhat.com/afsdoc afs rw 0 0
===========================
AUTOMATIC MOUNTPOINT EXPIRY
Automatic Mountpoint Expiry
===========================
Automatic expiration of mountpoints is easy, provided you've mounted the
......@@ -43,7 +48,8 @@ To do expiration, you need to follow these steps:
hung.
(2) When a new mountpoint is created in the ->d_automount method, add
the mnt to the list using mnt_set_expiry()
the mnt to the list using mnt_set_expiry()::
mnt_set_expiry(newmnt, &afs_vfsmounts);
(3) When you want mountpoints to be expired, call mark_mounts_for_expiry()
......@@ -70,8 +76,7 @@ and the copies of those that are on an expiration list will be added to the
same expiration list.
=======================
USERSPACE DRIVEN EXPIRY
Userspace Driven Expiry
=======================
As an alternative, it is possible for userspace to request expiry of any
......
==========================
FS-CACHE CACHE BACKEND API
==========================
.. SPDX-License-Identifier: GPL-2.0
==========================
FS-Cache Cache backend API
==========================
The FS-Cache system provides an API by which actual caches can be supplied to
FS-Cache for it to then serve out to network filesystems and other interested
......@@ -9,15 +11,14 @@ parties.
This API is declared in <linux/fscache-cache.h>.
====================================
INITIALISING AND REGISTERING A CACHE
Initialising and Registering a Cache
====================================
To start off, a cache definition must be initialised and registered for each
cache the backend wants to make available. For instance, CacheFS does this in
the fill_super() operation on mounting.
The cache definition (struct fscache_cache) should be initialised by calling:
The cache definition (struct fscache_cache) should be initialised by calling::
void fscache_init_cache(struct fscache_cache *cache,
struct fscache_cache_ops *ops,
......@@ -26,17 +27,17 @@ The cache definition (struct fscache_cache) should be initialised by calling:
Where:
(*) "cache" is a pointer to the cache definition;
* "cache" is a pointer to the cache definition;
(*) "ops" is a pointer to the table of operations that the backend supports on
* "ops" is a pointer to the table of operations that the backend supports on
this cache; and
(*) "idfmt" is a format and printf-style arguments for constructing a label
* "idfmt" is a format and printf-style arguments for constructing a label
for the cache.
The cache should then be registered with FS-Cache by passing a pointer to the
previously initialised cache definition to:
previously initialised cache definition to::
int fscache_add_cache(struct fscache_cache *cache,
struct fscache_object *fsdef,
......@@ -44,12 +45,12 @@ previously initialised cache definition to:
Two extra arguments should also be supplied:
(*) "fsdef" which should point to the object representation for the FS-Cache
* "fsdef" which should point to the object representation for the FS-Cache
master index in this cache. Netfs primary index entries will be created
here. FS-Cache keeps the caller's reference to the index object if
successful and will release it upon withdrawal of the cache.
(*) "tagname" which, if given, should be a text string naming this cache. If
* "tagname" which, if given, should be a text string naming this cache. If
this is NULL, the identifier will be used instead. For CacheFS, the
identifier is set to name the underlying block device and the tag can be
supplied by mount.
......@@ -58,20 +59,18 @@ This function may return -ENOMEM if it ran out of memory or -EEXIST if the tag
is already in use. 0 will be returned on success.
=====================
UNREGISTERING A CACHE
Unregistering a Cache
=====================
A cache can be withdrawn from the system by calling this function with a
pointer to the cache definition:
pointer to the cache definition::
void fscache_withdraw_cache(struct fscache_cache *cache);
In CacheFS's case, this is called by put_super().
========
SECURITY
Security
========
The cache methods are executed one of two contexts:
......@@ -89,8 +88,7 @@ be masqueraded for the duration of the cache driver's access to the cache.
This is left to the cache to handle; FS-Cache makes no effort in this regard.
===================================
CONTROL AND STATISTICS PRESENTATION
Control and Statistics Presentation
===================================
The cache may present data to the outside world through FS-Cache's interfaces
......@@ -101,11 +99,10 @@ is enabled. This is accessible through the kobject struct fscache_cache::kobj
and is for use by the cache as it sees fit.
========================
RELEVANT DATA STRUCTURES
Relevant Data Structures
========================
(*) Index/Data file FS-Cache representation cookie:
* Index/Data file FS-Cache representation cookie::
struct fscache_cookie {
struct fscache_object_def *def;
......@@ -121,7 +118,7 @@ RELEVANT DATA STRUCTURES
cache operations.
(*) In-cache object representation:
* In-cache object representation::
struct fscache_object {
int debug_id;
......@@ -150,7 +147,7 @@ RELEVANT DATA STRUCTURES
initialised by calling fscache_object_init(object).
(*) FS-Cache operation record:
* FS-Cache operation record::
struct fscache_operation {
atomic_t usage;
......@@ -173,7 +170,7 @@ RELEVANT DATA STRUCTURES
an operation needs more processing time, it should be enqueued again.
(*) FS-Cache retrieval operation record:
* FS-Cache retrieval operation record::
struct fscache_retrieval {
struct fscache_operation op;
......@@ -198,7 +195,7 @@ RELEVANT DATA STRUCTURES
it sees fit.
(*) FS-Cache storage operation record:
* FS-Cache storage operation record::
struct fscache_storage {
struct fscache_operation op;
......@@ -212,16 +209,17 @@ RELEVANT DATA STRUCTURES
storage.
================
CACHE OPERATIONS
Cache Operations
================
The cache backend provides FS-Cache with a table of operations that can be
performed on the denizens of the cache. These are held in a structure of type:
struct fscache_cache_ops
::
struct fscache_cache_ops
(*) Name of cache provider [mandatory]:
* Name of cache provider [mandatory]::
const char *name
......@@ -229,7 +227,7 @@ performed on the denizens of the cache. These are held in a structure of type:
the backend.
(*) Allocate a new object [mandatory]:
* Allocate a new object [mandatory]::
struct fscache_object *(*alloc_object)(struct fscache_cache *cache,
struct fscache_cookie *cookie)
......@@ -244,7 +242,7 @@ performed on the denizens of the cache. These are held in a structure of type:
form once lookup is complete or aborted.
(*) Look up and create object [mandatory]:
* Look up and create object [mandatory]::
void (*lookup_object)(struct fscache_object *object)
......@@ -263,7 +261,7 @@ performed on the denizens of the cache. These are held in a structure of type:
to abort the lookup of that object.
(*) Release lookup data [mandatory]:
* Release lookup data [mandatory]::
void (*lookup_complete)(struct fscache_object *object)
......@@ -271,7 +269,7 @@ performed on the denizens of the cache. These are held in a structure of type:
using to perform a lookup.
(*) Increment object refcount [mandatory]:
* Increment object refcount [mandatory]::
struct fscache_object *(*grab_object)(struct fscache_object *object)
......@@ -280,7 +278,7 @@ performed on the denizens of the cache. These are held in a structure of type:
It should return the object pointer if successful.
(*) Lock/Unlock object [mandatory]:
* Lock/Unlock object [mandatory]::
void (*lock_object)(struct fscache_object *object)
void (*unlock_object)(struct fscache_object *object)
......@@ -289,7 +287,7 @@ performed on the denizens of the cache. These are held in a structure of type:
to schedule with the lock held, so a spinlock isn't sufficient.
(*) Pin/Unpin object [optional]:
* Pin/Unpin object [optional]::
int (*pin_object)(struct fscache_object *object)
void (*unpin_object)(struct fscache_object *object)
......@@ -299,7 +297,7 @@ performed on the denizens of the cache. These are held in a structure of type:
enough space in the cache to permit this.
(*) Check coherency state of an object [mandatory]:
* Check coherency state of an object [mandatory]::
int (*check_consistency)(struct fscache_object *object)
......@@ -308,7 +306,7 @@ performed on the denizens of the cache. These are held in a structure of type:
if they're consistent and -ESTALE otherwise. -ENOMEM and -ERESTARTSYS
may also be returned.
(*) Update object [mandatory]:
* Update object [mandatory]::
int (*update_object)(struct fscache_object *object)
......@@ -317,7 +315,7 @@ performed on the denizens of the cache. These are held in a structure of type:
obtained by calling object->cookie->def->get_aux()/get_attr().
(*) Invalidate data object [mandatory]:
* Invalidate data object [mandatory]::
int (*invalidate_object)(struct fscache_operation *op)
......@@ -329,7 +327,7 @@ performed on the denizens of the cache. These are held in a structure of type:
fscache_op_complete() must be called on op before returning.
(*) Discard object [mandatory]:
* Discard object [mandatory]::
void (*drop_object)(struct fscache_object *object)
......@@ -341,7 +339,7 @@ performed on the denizens of the cache. These are held in a structure of type:
caller. The caller will invoke the put_object() method as appropriate.
(*) Release object reference [mandatory]:
* Release object reference [mandatory]::
void (*put_object)(struct fscache_object *object)
......@@ -349,7 +347,7 @@ performed on the denizens of the cache. These are held in a structure of type:
be freed when all the references to it are released.
(*) Synchronise a cache [mandatory]:
* Synchronise a cache [mandatory]::
void (*sync)(struct fscache_cache *cache)
......@@ -357,7 +355,7 @@ performed on the denizens of the cache. These are held in a structure of type:
device.
(*) Dissociate a cache [mandatory]:
* Dissociate a cache [mandatory]::
void (*dissociate_pages)(struct fscache_cache *cache)
......@@ -365,7 +363,7 @@ performed on the denizens of the cache. These are held in a structure of type:
cache withdrawal.
(*) Notification that the attributes on a netfs file changed [mandatory]:
* Notification that the attributes on a netfs file changed [mandatory]::
int (*attr_changed)(struct fscache_object *object);
......@@ -386,7 +384,7 @@ performed on the denizens of the cache. These are held in a structure of type:
execution of this operation.
(*) Reserve cache space for an object's data [optional]:
* Reserve cache space for an object's data [optional]::
int (*reserve_space)(struct fscache_object *object, loff_t size);
......@@ -404,7 +402,7 @@ performed on the denizens of the cache. These are held in a structure of type:
size if larger than that already.
(*) Request page be read from cache [mandatory]:
* Request page be read from cache [mandatory]::
int (*read_or_alloc_page)(struct fscache_retrieval *op,
struct page *page,
......@@ -446,7 +444,7 @@ performed on the denizens of the cache. These are held in a structure of type:
with. This will complete the operation when all pages are dealt with.
(*) Request pages be read from cache [mandatory]:
* Request pages be read from cache [mandatory]::
int (*read_or_alloc_pages)(struct fscache_retrieval *op,
struct list_head *pages,
......@@ -457,7 +455,7 @@ performed on the denizens of the cache. These are held in a structure of type:
of pages instead of one page. Any pages on which a read operation is
started must be added to the page cache for the specified mapping and also
to the LRU. Such pages must also be removed from the pages list and
*nr_pages decremented per page.
``*nr_pages`` decremented per page.
If there was an error such as -ENOMEM, then that should be returned; else
if one or more pages couldn't be read or allocated, then -ENOBUFS should
......@@ -466,7 +464,7 @@ performed on the denizens of the cache. These are held in a structure of type:
returned.
(*) Request page be allocated in the cache [mandatory]:
* Request page be allocated in the cache [mandatory]::
int (*allocate_page)(struct fscache_retrieval *op,
struct page *page,
......@@ -482,7 +480,7 @@ performed on the denizens of the cache. These are held in a structure of type:
allocated, then the netfs page should be marked and 0 returned.
(*) Request pages be allocated in the cache [mandatory]:
* Request pages be allocated in the cache [mandatory]::
int (*allocate_pages)(struct fscache_retrieval *op,
struct list_head *pages,
......@@ -493,7 +491,7 @@ performed on the denizens of the cache. These are held in a structure of type:
nr_pages should be treated as for the read_or_alloc_pages() method.
(*) Request page be written to cache [mandatory]:
* Request page be written to cache [mandatory]::
int (*write_page)(struct fscache_storage *op,
struct page *page);
......@@ -514,7 +512,7 @@ performed on the denizens of the cache. These are held in a structure of type:
appropriately.
(*) Discard retained per-page metadata [mandatory]:
* Discard retained per-page metadata [mandatory]::
void (*uncache_page)(struct fscache_object *object, struct page *page)
......@@ -523,13 +521,12 @@ performed on the denizens of the cache. These are held in a structure of type:
maintains for this page.
==================
FS-CACHE UTILITIES
FS-Cache Utilities
==================
FS-Cache provides some utilities that a cache backend may make use of:
(*) Note occurrence of an I/O error in a cache:
* Note occurrence of an I/O error in a cache::
void fscache_io_error(struct fscache_cache *cache)
......@@ -541,7 +538,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
This does not actually withdraw the cache. That must be done separately.
(*) Invoke the retrieval I/O completion function:
* Invoke the retrieval I/O completion function::
void fscache_end_io(struct fscache_retrieval *op, struct page *page,
int error);
......@@ -550,8 +547,8 @@ FS-Cache provides some utilities that a cache backend may make use of:
error value should be 0 if successful and an error otherwise.
(*) Record that one or more pages being retrieved or allocated have been dealt
with:
* Record that one or more pages being retrieved or allocated have been dealt
with::
void fscache_retrieval_complete(struct fscache_retrieval *op,
int n_pages);
......@@ -562,7 +559,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
completed.
(*) Record operation completion:
* Record operation completion::
void fscache_op_complete(struct fscache_operation *op);
......@@ -571,7 +568,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
one or more pending operations to start running.
(*) Set highest store limit:
* Set highest store limit::
void fscache_set_store_limit(struct fscache_object *object,
loff_t i_size);
......@@ -581,7 +578,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
rejected by fscache_read_alloc_page() and co with -ENOBUFS.
(*) Mark pages as being cached:
* Mark pages as being cached::
void fscache_mark_pages_cached(struct fscache_retrieval *op,
struct pagevec *pagevec);
......@@ -590,7 +587,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
the netfs must call fscache_uncache_page() to unmark the pages.
(*) Perform coherency check on an object:
* Perform coherency check on an object::
enum fscache_checkaux fscache_check_aux(struct fscache_object *object,
const void *data,
......@@ -603,29 +600,26 @@ FS-Cache provides some utilities that a cache backend may make use of:
One of three values will be returned:
(*) FSCACHE_CHECKAUX_OKAY
FSCACHE_CHECKAUX_OKAY
The coherency data indicates the object is valid as is.
(*) FSCACHE_CHECKAUX_NEEDS_UPDATE
FSCACHE_CHECKAUX_NEEDS_UPDATE
The coherency data needs updating, but otherwise the object is
valid.
(*) FSCACHE_CHECKAUX_OBSOLETE
FSCACHE_CHECKAUX_OBSOLETE
The coherency data indicates that the object is obsolete and should
be discarded.
(*) Initialise a freshly allocated object:
* Initialise a freshly allocated object::
void fscache_object_init(struct fscache_object *object);
This initialises all the fields in an object representation.
(*) Indicate the destruction of an object:
* Indicate the destruction of an object::
void fscache_object_destroyed(struct fscache_cache *cache);
......@@ -635,7 +629,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
all the objects.
(*) Indicate negative lookup on an object:
* Indicate negative lookup on an object::
void fscache_object_lookup_negative(struct fscache_object *object);
......@@ -650,7 +644,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
significant - all subsequent calls are ignored.
(*) Indicate an object has been obtained:
* Indicate an object has been obtained::
void fscache_obtained_object(struct fscache_object *object);
......@@ -667,7 +661,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
(2) that writes may now proceed against this object.
(*) Indicate that object lookup failed:
* Indicate that object lookup failed::
void fscache_object_lookup_error(struct fscache_object *object);
......@@ -676,7 +670,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
as possible.
(*) Indicate that a stale object was found and discarded:
* Indicate that a stale object was found and discarded::
void fscache_object_retrying_stale(struct fscache_object *object);
......@@ -685,7 +679,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
discarded from the cache and the lookup will be performed again.
(*) Indicate that the caching backend killed an object:
* Indicate that the caching backend killed an object::
void fscache_object_mark_killed(struct fscache_object *object,
enum fscache_why_object_killed why);
......@@ -693,13 +687,20 @@ FS-Cache provides some utilities that a cache backend may make use of:
This is called to indicate that the cache backend preemptively killed an
object. The why parameter should be set to indicate the reason:
FSCACHE_OBJECT_IS_STALE - the object was stale and needs discarding.
FSCACHE_OBJECT_NO_SPACE - there was insufficient cache space
FSCACHE_OBJECT_WAS_RETIRED - the object was retired when relinquished.
FSCACHE_OBJECT_WAS_CULLED - the object was culled to make space.
FSCACHE_OBJECT_IS_STALE
- the object was stale and needs discarding.
FSCACHE_OBJECT_NO_SPACE
- there was insufficient cache space
FSCACHE_OBJECT_WAS_RETIRED
- the object was retired when relinquished.
FSCACHE_OBJECT_WAS_CULLED
- the object was culled to make space.
(*) Get and release references on a retrieval record:
* Get and release references on a retrieval record::
void fscache_get_retrieval(struct fscache_retrieval *op);
void fscache_put_retrieval(struct fscache_retrieval *op);
......@@ -708,7 +709,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
asynchronous data retrieval and block allocation.
(*) Enqueue a retrieval record for processing.
* Enqueue a retrieval record for processing::
void fscache_enqueue_retrieval(struct fscache_retrieval *op);
......@@ -718,7 +719,7 @@ FS-Cache provides some utilities that a cache backend may make use of:
within the callback function.
(*) List of object state names:
* List of object state names::
const char *fscache_object_states[];
......
===============================================
CacheFiles: CACHE ON ALREADY MOUNTED FILESYSTEM
===============================================
.. SPDX-License-Identifier: GPL-2.0
Contents:
===============================================
CacheFiles: CACHE ON ALREADY MOUNTED FILESYSTEM
===============================================
.. Contents:
(*) Overview.
......@@ -27,8 +29,8 @@ Contents:
(*) Debugging.
========
OVERVIEW
Overview
========
CacheFiles is a caching backend that's meant to use as a cache a directory on
......@@ -58,8 +60,8 @@ spare space and automatically contract when the set of data requires more
space.
============
REQUIREMENTS
Requirements
============
The use of CacheFiles and its daemon requires the following features to be
......@@ -79,84 +81,70 @@ It is strongly recommended that the "dir_index" option is enabled on Ext3
filesystems being used as a cache.
=============
CONFIGURATION
Configuration
=============
The cache is configured by a script in /etc/cachefilesd.conf. These commands
set up cache ready for use. The following script commands are available:
(*) brun <N>%
(*) bcull <N>%
(*) bstop <N>%
(*) frun <N>%
(*) fcull <N>%
(*) fstop <N>%
brun <N>%, bcull <N>%, bstop <N>%, frun <N>%, fcull <N>%, fstop <N>%
Configure the culling limits. Optional. See the section on culling
The defaults are 7% (run), 5% (cull) and 1% (stop) respectively.
The commands beginning with a 'b' are file space (block) limits, those
beginning with an 'f' are file count limits.
(*) dir <path>
dir <path>
Specify the directory containing the root of the cache. Mandatory.
(*) tag <name>
tag <name>
Specify a tag to FS-Cache to use in distinguishing multiple caches.
Optional. The default is "CacheFiles".
(*) debug <mask>
debug <mask>
Specify a numeric bitmask to control debugging in the kernel module.
Optional. The default is zero (all off). The following values can be
OR'd into the mask to collect various information:
== =================================================
1 Turn on trace of function entry (_enter() macros)
2 Turn on trace of function exit (_leave() macros)
4 Turn on trace of internal debug points (_debug())
== =================================================
This mask can also be set through sysfs, eg:
This mask can also be set through sysfs, eg::
echo 5 >/sys/modules/cachefiles/parameters/debug
==================
STARTING THE CACHE
Starting the Cache
==================
The cache is started by running the daemon. The daemon opens the cache device,
configures the cache and tells it to begin caching. At that point the cache
binds to fscache and the cache becomes live.
The daemon is run as follows:
The daemon is run as follows::
/sbin/cachefilesd [-d]* [-s] [-n] [-f <configfile>]
The flags are:
(*) -d
``-d``
Increase the debugging level. This can be specified multiple times and
is cumulative with itself.
(*) -s
``-s``
Send messages to stderr instead of syslog.
(*) -n
``-n``
Don't daemonise and go into background.
(*) -f <configfile>
``-f <configfile>``
Use an alternative configuration file rather than the default one.
===============
THINGS TO AVOID
Things to Avoid
===============
Do not mount other things within the cache as this will cause problems. The
......@@ -179,8 +167,7 @@ Do not chmod files in the cache. The module creates things with minimal
permissions to prevent random users being able to access them directly.
=============
CACHE CULLING
Cache Culling
=============
The cache may need culling occasionally to make space. This involves
......@@ -192,27 +179,21 @@ Cache culling is done on the basis of the percentage of blocks and the
percentage of files available in the underlying filesystem. There are six
"limits":
(*) brun
(*) frun
brun, frun
If the amount of free space and the number of available files in the cache
rises above both these limits, then culling is turned off.
(*) bcull
(*) fcull
bcull, fcull
If the amount of available space or the number of available files in the
cache falls below either of these limits, then culling is started.
(*) bstop
(*) fstop
bstop, fstop
If the amount of available space or the number of available files in the
cache falls below either of these limits, then no further allocation of
disk space or files is permitted until culling has raised things above
these limits again.
These must be configured thusly:
These must be configured thusly::
0 <= bstop < bcull < brun < 100
0 <= fstop < fcull < frun < 100
......@@ -226,16 +207,14 @@ started as soon as space is made in the table. Objects will be skipped if
their atimes have changed or if the kernel module says it is still using them.
===============
CACHE STRUCTURE
Cache Structure
===============
The CacheFiles module will create two directories in the directory it was
given:
(*) cache/
(*) graveyard/
* cache/
* graveyard/
The active cache objects all reside in the first directory. The CacheFiles
kernel module moves any retired or culled objects that it can't simply unlink
......@@ -261,10 +240,10 @@ If an object has children, then it will be represented as a directory.
Immediately in the representative directory are a collection of directories
named for hash values of the child object keys with an '@' prepended. Into
this directory, if possible, will be placed the representations of the child
objects:
objects::
INDEX INDEX INDEX DATA FILES
========= ========== ================================= ================
/INDEX /INDEX /INDEX /DATA FILES
/=========/==========/=================================/================
cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400
cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...DB1ry
cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...N22ry
......@@ -275,7 +254,7 @@ If the key is so long that it exceeds NAME_MAX with the decorations added on to
it, then it will be cut into pieces, the first few of which will be used to
make a nest of directories, and the last one of which will be the objects
inside the last directory. The names of the intermediate directories will have
'+' prepended:
'+' prepended::
J1223/@23/+xy...z/+kl...m/Epqr
......@@ -288,11 +267,13 @@ To handle this, CacheFiles will use a suitably printable filename directly and
"base-64" encode ones that aren't directly suitable. The two versions of
object filenames indicate the encoding:
=============== =============== ===============
OBJECT TYPE PRINTABLE ENCODED
=============== =============== ===============
Index "I..." "J..."
Data "D..." "E..."
Special "S..." "T..."
=============== =============== ===============
Intermediate directories are always "@" or "+" as appropriate.
......@@ -307,8 +288,7 @@ Note that CacheFiles will erase from the cache any file it doesn't recognise or
any file of an incorrect type (such as a FIFO file or a device file).
==========================
SECURITY MODEL AND SELINUX
Security Model and SELinux
==========================
CacheFiles is implemented to deal properly with the LSM security features of
......@@ -331,26 +311,26 @@ When the CacheFiles module is asked to bind to its cache, it:
(1) Finds the security label attached to the root cache directory and uses
that as the security label with which it will create files. By default,
this is:
this is::
cachefiles_var_t
(2) Finds the security label of the process which issued the bind request
(presumed to be the cachefilesd daemon), which by default will be:
(presumed to be the cachefilesd daemon), which by default will be::
cachefilesd_t
and asks LSM to supply a security ID as which it should act given the
daemon's label. By default, this will be:
daemon's label. By default, this will be::
cachefiles_kernel_t
SELinux transitions the daemon's security ID to the module's security ID
based on a rule of this form in the policy.
based on a rule of this form in the policy::
type_transition <daemon's-ID> kernel_t : process <module's-ID>;
For instance:
For instance::
type_transition cachefilesd_t kernel_t : process cachefiles_kernel_t;
......@@ -370,7 +350,7 @@ There are policy source files available in:
http://people.redhat.com/~dhowells/fscache/cachefilesd-0.8.tar.bz2
and later versions. In that tarball, see the files:
and later versions. In that tarball, see the files::
cachefilesd.te
cachefilesd.fc
......@@ -379,7 +359,7 @@ and later versions. In that tarball, see the files:
They are built and installed directly by the RPM.
If a non-RPM based system is being used, then copy the above files to their own
directory and run:
directory and run::
make -f /usr/share/selinux/devel/Makefile
semodule -i cachefilesd.pp
......@@ -394,7 +374,7 @@ an auxiliary policy must be installed to label the alternate location of the
cache.
For instructions on how to add an auxiliary policy to enable the cache to be
located elsewhere when SELinux is in enforcing mode, please see:
located elsewhere when SELinux is in enforcing mode, please see::
/usr/share/doc/cachefilesd-*/move-cache.txt
......@@ -402,8 +382,7 @@ When the cachefilesd rpm is installed; alternatively, the document can be found
in the sources.
==================
A NOTE ON SECURITY
A Note on Security
==================
CacheFiles makes use of the split security in the task_struct. It allocates
......@@ -445,17 +424,18 @@ for CacheFiles to run in a context of a specific security label, or to create
files and directories with another security label.
=======================
STATISTICAL INFORMATION
Statistical Information
=======================
If FS-Cache is compiled with the following option enabled:
If FS-Cache is compiled with the following option enabled::
CONFIG_CACHEFILES_HISTOGRAM=y
then it will gather certain statistics and display them through a proc file.
(*) /proc/fs/cachefiles/histogram
/proc/fs/cachefiles/histogram
::
cat /proc/fs/cachefiles/histogram
JIFS SECS LOOKUPS MKDIRS CREATES
......@@ -465,36 +445,39 @@ then it will gather certain statistics and display them through a proc file.
between 0 jiffies and HZ-1 jiffies a variety of tasks took to run. The
columns are as follows:
======= =======================================================
COLUMN TIME MEASUREMENT
======= =======================================================
LOOKUPS Length of time to perform a lookup on the backing fs
MKDIRS Length of time to perform a mkdir on the backing fs
CREATES Length of time to perform a create on the backing fs
======= =======================================================
Each row shows the number of events that took a particular range of times.
Each step is 1 jiffy in size. The JIFS column indicates the particular
jiffy range covered, and the SECS field the equivalent number of seconds.
=========
DEBUGGING
Debugging
=========
If CONFIG_CACHEFILES_DEBUG is enabled, the CacheFiles facility can have runtime
debugging enabled by adjusting the value in:
debugging enabled by adjusting the value in::
/sys/module/cachefiles/parameters/debug
This is a bitmask of debugging streams to enable:
======= ======= =============================== =======================
BIT VALUE STREAM POINT
======= ======= =============================== =======================
0 1 General Function entry trace
1 2 Function exit trace
2 4 General
======= ======= =============================== =======================
The appropriate set of values should be OR'd together and the result written to
the control file. For example:
the control file. For example::
echo $((1|4|8)) >/sys/module/cachefiles/parameters/debug
......
.. SPDX-License-Identifier: GPL-2.0
Filesystem Caching
==================
.. toctree::
:maxdepth: 2
fscache
object
backend-api
cachefiles
netfs-api
operations
===============================
FS-CACHE NETWORK FILESYSTEM API
===============================
.. SPDX-License-Identifier: GPL-2.0
===============================
FS-Cache Network Filesystem API
===============================
There's an API by which a network filesystem can make use of the FS-Cache
facilities. This is based around a number of principles:
......@@ -19,7 +21,7 @@ facilities. This is based around a number of principles:
This API is declared in <linux/fscache.h>.
This document contains the following sections:
.. This document contains the following sections:
(1) Network filesystem definition
(2) Index definition
......@@ -41,12 +43,11 @@ This document contains the following sections:
(18) FS-Cache specific page flags.
=============================
NETWORK FILESYSTEM DEFINITION
Network Filesystem Definition
=============================
FS-Cache needs a description of the network filesystem. This is specified
using a record of the following structure:
using a record of the following structure::
struct fscache_netfs {
uint32_t version;
......@@ -71,7 +72,7 @@ The fields are:
another parameter passed into the registration function.
For example, kAFS (linux/fs/afs/) uses the following definitions to describe
itself:
itself::
struct fscache_netfs afs_cache_netfs = {
.version = 0,
......@@ -79,8 +80,7 @@ itself:
};
================
INDEX DEFINITION
Index Definition
================
Indices are used for two purposes:
......@@ -114,11 +114,10 @@ There are some limits on indices:
function is recursive. Too many layers will run the kernel out of stack.
=================
OBJECT DEFINITION
Object Definition
=================
To define an object, a structure of the following type should be filled out:
To define an object, a structure of the following type should be filled out::
struct fscache_cookie_def
{
......@@ -149,16 +148,13 @@ This has the following fields:
This is one of the following values:
(*) FSCACHE_COOKIE_TYPE_INDEX
FSCACHE_COOKIE_TYPE_INDEX
This defines an index, which is a special FS-Cache type.
(*) FSCACHE_COOKIE_TYPE_DATAFILE
FSCACHE_COOKIE_TYPE_DATAFILE
This defines an ordinary data file.
(*) Any other value between 2 and 255
Any other value between 2 and 255
This defines an extraordinary object such as an XATTR.
(2) The name of the object type (NUL terminated unless all 16 chars are used)
......@@ -192,9 +188,14 @@ This has the following fields:
If present, the function should return one of the following values:
(*) FSCACHE_CHECKAUX_OKAY - the entry is okay as is
(*) FSCACHE_CHECKAUX_NEEDS_UPDATE - the entry requires update
(*) FSCACHE_CHECKAUX_OBSOLETE - the entry should be deleted
FSCACHE_CHECKAUX_OKAY
- the entry is okay as is
FSCACHE_CHECKAUX_NEEDS_UPDATE
- the entry requires update
FSCACHE_CHECKAUX_OBSOLETE
- the entry should be deleted
This function can also be used to extract data from the auxiliary data in
the cache and copy it into the netfs's structures.
......@@ -236,32 +237,30 @@ This has the following fields:
This function is not required for indices as they're not permitted data.
===================================
NETWORK FILESYSTEM (UN)REGISTRATION
Network Filesystem (Un)registration
===================================
The first step is to declare the network filesystem to the cache. This also
involves specifying the layout of the primary index (for AFS, this would be the
"cell" level).
The registration function is:
The registration function is::
int fscache_register_netfs(struct fscache_netfs *netfs);
It just takes a pointer to the netfs definition. It returns 0 or an error as
appropriate.
For kAFS, registration is done as follows:
For kAFS, registration is done as follows::
ret = fscache_register_netfs(&afs_cache_netfs);
The last step is, of course, unregistration:
The last step is, of course, unregistration::
void fscache_unregister_netfs(struct fscache_netfs *netfs);
================
CACHE TAG LOOKUP
Cache Tag Lookup
================
FS-Cache permits the use of more than one cache. To permit particular index
......@@ -270,7 +269,7 @@ representation tags. This step is optional; it can be left entirely up to
FS-Cache as to which cache should be used. The problem with doing that is that
FS-Cache will always pick the first cache that was registered.
To get the representation for a named tag:
To get the representation for a named tag::
struct fscache_cache_tag *fscache_lookup_cache_tag(const char *name);
......@@ -278,7 +277,7 @@ This takes a text string as the name and returns a representation of a tag. It
will never return an error. It may return a dummy tag, however, if it runs out
of memory; this will inhibit caching with this tag.
Any representation so obtained must be released by passing it to this function:
Any representation so obtained must be released by passing it to this function::
void fscache_release_cache_tag(struct fscache_cache_tag *tag);
......@@ -286,13 +285,12 @@ The tag will be retrieved by FS-Cache when it calls the object definition
operation select_cache().
==================
INDEX REGISTRATION
Index Registration
==================
The third step is to inform FS-Cache about part of an index hierarchy that can
be used to locate files. This is done by requesting a cookie for each index in
the path to the file:
the path to the file::
struct fscache_cookie *
fscache_acquire_cookie(struct fscache_cookie *parent,
......@@ -339,7 +337,7 @@ must be enabled to do anything with it. A disabled cookie can be enabled by
calling fscache_enable_cookie() (see below).
For example, with AFS, a cell would be added to the primary index. This index
entry would have a dependent inode containing volume mappings within this cell:
entry would have a dependent inode containing volume mappings within this cell::
cell->cache =
fscache_acquire_cookie(afs_cache_netfs.primary_index,
......@@ -349,7 +347,7 @@ entry would have a dependent inode containing volume mappings within this cell:
cell, 0, true);
And then a particular volume could be added to that index by ID, creating
another index for vnodes (AFS inode equivalents):
another index for vnodes (AFS inode equivalents)::
volume->cache =
fscache_acquire_cookie(volume->cell->cache,
......@@ -359,13 +357,12 @@ another index for vnodes (AFS inode equivalents):
volume, 0, true);
======================
DATA FILE REGISTRATION
Data File Registration
======================
The fourth step is to request a data file be created in the cache. This is
identical to index cookie acquisition. The only difference is that the type in
the object definition should be something other than index type.
the object definition should be something other than index type::
vnode->cache =
fscache_acquire_cookie(volume->cache,
......@@ -375,15 +372,14 @@ the object definition should be something other than index type.
vnode, vnode->status.size, true);
=================================
MISCELLANEOUS OBJECT REGISTRATION
Miscellaneous Object Registration
=================================
An optional step is to request an object of miscellaneous type be created in
the cache. This is almost identical to index cookie acquisition. The only
difference is that the type in the object definition should be something other
than index type. While the parent object could be an index, it's more likely
it would be some other type of object such as a data file.
it would be some other type of object such as a data file::
xattr->cache =
fscache_acquire_cookie(vnode->cache,
......@@ -396,13 +392,12 @@ Miscellaneous objects might be used to store extended attributes or directory
entries for example.
==========================
SETTING THE DATA FILE SIZE
Setting the Data File Size
==========================
The fifth step is to set the physical attributes of the file, such as its size.
This doesn't automatically reserve any space in the cache, but permits the
cache to adjust its metadata for data tracking appropriately:
cache to adjust its metadata for data tracking appropriately::
int fscache_attr_changed(struct fscache_cookie *cookie);
......@@ -417,8 +412,7 @@ some point in the future, and as such, it may happen after the function returns
to the caller. The attribute adjustment excludes read and write operations.
=====================
PAGE ALLOC/READ/WRITE
Page alloc/read/write
=====================
And the sixth step is to store and retrieve pages in the cache. There are
......@@ -441,7 +435,7 @@ PAGE READ
Firstly, the netfs should ask FS-Cache to examine the caches and read the
contents cached for a particular page of a particular file if present, or else
allocate space to store the contents if not:
allocate space to store the contents if not::
typedef
void (*fscache_rw_complete_t)(struct page *page,
......@@ -474,14 +468,14 @@ Else if there's a copy of the page resident in the cache:
(4) When the read is complete, end_io_func() will be invoked with:
(*) The netfs data supplied when the cookie was created.
* The netfs data supplied when the cookie was created.
(*) The page descriptor.
* The page descriptor.
(*) The context argument passed to the above function. This will be
* The context argument passed to the above function. This will be
maintained with the get_context/put_context functions mentioned above.
(*) An argument that's 0 on success or negative for an error code.
* An argument that's 0 on success or negative for an error code.
If an error occurs, it should be assumed that the page contains no usable
data. fscache_readpages_cancel() may need to be called.
......@@ -504,11 +498,11 @@ This function may also return -ENOMEM or -EINTR, in which case it won't have
read any data from the cache.
PAGE ALLOCATE
Page Allocate
-------------
Alternatively, if there's not expected to be any data in the cache for a page
because the file has been extended, a block can simply be allocated instead:
because the file has been extended, a block can simply be allocated instead::
int fscache_alloc_page(struct fscache_cookie *cookie,
struct page *page,
......@@ -523,12 +517,12 @@ The mark_pages_cached() cookie operation will be called on the page if
successful.
PAGE WRITE
Page Write
----------
Secondly, if the netfs changes the contents of the page (either due to an
initial download or if a user performs a write), then the page should be
written back to the cache:
written back to the cache::
int fscache_write_page(struct fscache_cookie *cookie,
struct page *page,
......@@ -566,11 +560,11 @@ place if unforeseen circumstances arose (such as a disk error).
Writing takes place asynchronously.
MULTIPLE PAGE READ
Multiple Page Read
------------------
A facility is provided to read several pages at once, as requested by the
readpages() address space operation:
readpages() address space operation::
int fscache_read_or_alloc_pages(struct fscache_cookie *cookie,
struct address_space *mapping,
......@@ -598,7 +592,7 @@ This works in a similar way to fscache_read_or_alloc_page(), except:
be returned.
Otherwise, if all pages had reads dispatched, then 0 will be returned, the
list will be empty and *nr_pages will be 0.
list will be empty and ``*nr_pages`` will be 0.
(4) end_io_func will be called once for each page being read as the reads
complete. It will be called in process context if error != 0, but it may
......@@ -609,13 +603,13 @@ some of the pages being read and some being allocated. Those pages will have
been marked appropriately and will need uncaching.
CANCELLATION OF UNREAD PAGES
Cancellation of Unread Pages
----------------------------
If one or more pages are passed to fscache_read_or_alloc_pages() but not then
read from the cache and also not read from the underlying filesystem then
those pages will need to have any marks and reservations removed. This can be
done by calling:
done by calling::
void fscache_readpages_cancel(struct fscache_cookie *cookie,
struct list_head *pages);
......@@ -625,11 +619,10 @@ fscache_read_or_alloc_pages(). Every page in the pages list will be examined
and any that have PG_fscache set will be uncached.
==============
PAGE UNCACHING
Page Uncaching
==============
To uncache a page, this function should be called:
To uncache a page, this function should be called::
void fscache_uncache_page(struct fscache_cookie *cookie,
struct page *page);
......@@ -644,12 +637,12 @@ data file must be retired (see the relinquish cookie function below).
Furthermore, note that this does not cancel the asynchronous read or write
operation started by the read/alloc and write functions, so the page
invalidation functions must use:
invalidation functions must use::
bool fscache_check_page_write(struct fscache_cookie *cookie,
struct page *page);
to see if a page is being written to the cache, and:
to see if a page is being written to the cache, and::
void fscache_wait_on_page_write(struct fscache_cookie *cookie,
struct page *page);
......@@ -660,7 +653,7 @@ to wait for it to finish if it is.
When releasepage() is being implemented, a special FS-Cache function exists to
manage the heuristics of coping with vmscan trying to eject pages, which may
conflict with the cache trying to write pages to the cache (which may itself
need to allocate memory):
need to allocate memory)::
bool fscache_maybe_release_page(struct fscache_cookie *cookie,
struct page *page,
......@@ -676,12 +669,12 @@ storage request to complete, or it may attempt to cancel the storage request -
in which case the page will not be stored in the cache this time.
BULK INODE PAGE UNCACHE
Bulk Image Page Uncache
-----------------------
A convenience routine is provided to perform an uncache on all the pages
attached to an inode. This assumes that the pages on the inode correspond on a
1:1 basis with the pages in the cache.
1:1 basis with the pages in the cache::
void fscache_uncache_all_inode_pages(struct fscache_cookie *cookie,
struct inode *inode);
......@@ -692,12 +685,11 @@ written to the cache and for the cache to finish with the page generally. No
error is returned.
===============================
INDEX AND DATA FILE CONSISTENCY
Index and Data File consistency
===============================
To find out whether auxiliary data for an object is up to data within the
cache, the following function can be called:
cache, the following function can be called::
int fscache_check_consistency(struct fscache_cookie *cookie,
const void *aux_data);
......@@ -708,7 +700,7 @@ data buffer first. It returns 0 if it is and -ESTALE if it isn't; it may also
return -ENOMEM and -ERESTARTSYS.
To request an update of the index data for an index or other object, the
following function should be called:
following function should be called::
void fscache_update_cookie(struct fscache_cookie *cookie,
const void *aux_data);
......@@ -721,8 +713,7 @@ Note that partial updates may happen automatically at other times, such as when
data blocks are added to a data file object.
=================
COOKIE ENABLEMENT
Cookie Enablement
=================
Cookies exist in one of two states: enabled and disabled. If a cookie is
......@@ -731,7 +722,7 @@ invalidate its state; allocate, read or write backing pages - though it is
still possible to uncache pages and relinquish the cookie.
The initial enablement state is set by fscache_acquire_cookie(), but the cookie
can be enabled or disabled later. To disable a cookie, call:
can be enabled or disabled later. To disable a cookie, call::
void fscache_disable_cookie(struct fscache_cookie *cookie,
const void *aux_data,
......@@ -746,7 +737,7 @@ All possible failures are handled internally. The caller should consider
calling fscache_uncache_all_inode_pages() afterwards to make sure all page
markings are cleared up.
Cookies can be enabled or reenabled with:
Cookies can be enabled or reenabled with::
void fscache_enable_cookie(struct fscache_cookie *cookie,
const void *aux_data,
......@@ -771,13 +762,12 @@ In both cases, the cookie's auxiliary data buffer is updated from aux_data if
that is non-NULL inside the enablement lock before proceeding.
===============================
MISCELLANEOUS COOKIE OPERATIONS
Miscellaneous Cookie operations
===============================
There are a number of operations that can be used to control cookies:
(*) Cookie pinning:
* Cookie pinning::
int fscache_pin_cookie(struct fscache_cookie *cookie);
void fscache_unpin_cookie(struct fscache_cookie *cookie);
......@@ -790,7 +780,7 @@ There are a number of operations that can be used to control cookies:
-ENOSPC if there isn't enough space to honour the operation, -ENOMEM or
-EIO if there's any other problem.
(*) Data space reservation:
* Data space reservation::
int fscache_reserve_space(struct fscache_cookie *cookie, loff_t size);
......@@ -809,11 +799,10 @@ There are a number of operations that can be used to control cookies:
make space if it's not in use.
=====================
COOKIE UNREGISTRATION
Cookie Unregistration
=====================
To get rid of a cookie, this function should be called.
To get rid of a cookie, this function should be called::
void fscache_relinquish_cookie(struct fscache_cookie *cookie,
const void *aux_data,
......@@ -835,16 +824,14 @@ the cookies for "child" indices, objects and pages have been relinquished
first.
==================
INDEX INVALIDATION
Index Invalidation
==================
There is no direct way to invalidate an index subtree. To do this, the caller
should relinquish and retire the cookie they have, and then acquire a new one.
======================
DATA FILE INVALIDATION
Data File Invalidation
======================
Sometimes it will be necessary to invalidate an object that contains data.
......@@ -853,7 +840,7 @@ change - at which point the netfs has to throw away all the state it had for an
inode and reload from the server.
To indicate that a cache object should be invalidated, the following function
can be called:
can be called::
void fscache_invalidate(struct fscache_cookie *cookie);
......@@ -868,13 +855,12 @@ auxiliary data update operation as it is very likely these will have changed.
Using the following function, the netfs can wait for the invalidation operation
to have reached a point at which it can start submitting ordinary operations
once again:
once again::
void fscache_wait_on_invalidate(struct fscache_cookie *cookie);
===========================
FS-CACHE SPECIFIC PAGE FLAG
FS-cache Specific Page Flag
===========================
FS-Cache makes use of a page flag, PG_private_2, for its own purpose. This is
......@@ -898,7 +884,7 @@ was given under certain circumstances.
This bit does not overlap with such as PG_private. This means that FS-Cache
can be used with a filesystem that uses the block buffering code.
There are a number of operations defined on this flag:
There are a number of operations defined on this flag::
int PageFsCache(struct page *page);
void SetPageFsCache(struct page *page)
......
====================================================
IN-KERNEL CACHE OBJECT REPRESENTATION AND MANAGEMENT
====================================================
.. SPDX-License-Identifier: GPL-2.0
====================================================
In-Kernel Cache Object Representation and Management
====================================================
By: David Howells <dhowells@redhat.com>
Contents:
.. Contents:
(*) Representation
......@@ -18,8 +20,7 @@ Contents:
(*) The set of events.
==============
REPRESENTATION
Representation
==============
FS-Cache maintains an in-kernel representation of each object that a netfs is
......@@ -38,7 +39,7 @@ or even by no objects (it may not be cached).
Furthermore, both cookies and objects are hierarchical. The two hierarchies
correspond, but the cookies tree is a superset of the union of the object trees
of multiple caches:
of multiple caches::
NETFS INDEX TREE : CACHE 1 : CACHE 2
: :
......@@ -89,8 +90,7 @@ pointers to the cookies. The cookies themselves and any objects attached to
those cookies are hidden from it.
===============================
OBJECT MANAGEMENT STATE MACHINE
Object Management State Machine
===============================
Within FS-Cache, each active object is managed by its own individual state
......@@ -124,7 +124,7 @@ is not masked, the object will be queued for processing (by calling
fscache_enqueue_object()).
PROVISION OF CPU TIME
Provision of CPU Time
---------------------
The work to be done by the various states was given CPU time by the threads of
......@@ -141,7 +141,7 @@ because:
workqueues don't necessarily have the right numbers of threads.
LOCKING SIMPLIFICATION
Locking Simplification
----------------------
Because only one worker thread may be operating on any particular object's
......@@ -151,8 +151,7 @@ from the cache backend's representation (fscache_object) - which may be
requested from either end.
=================
THE SET OF STATES
The Set of States
=================
The object state machine has a set of states that it can be in. There are
......@@ -275,19 +274,17 @@ memory and potentially deletes stuff from disk:
this state.
THE SET OF EVENTS
The Set of Events
-----------------
There are a number of events that can be raised to an object state machine:
(*) FSCACHE_OBJECT_EV_UPDATE
FSCACHE_OBJECT_EV_UPDATE
The netfs requested that an object be updated. The state machine will ask
the cache backend to update the object, and the cache backend will ask the
netfs for details of the change through its cookie definition ops.
(*) FSCACHE_OBJECT_EV_CLEARED
FSCACHE_OBJECT_EV_CLEARED
This is signalled in two circumstances:
(a) when an object's last child object is dropped and
......@@ -296,20 +293,16 @@ There are a number of events that can be raised to an object state machine:
This is used to proceed from the dying state.
(*) FSCACHE_OBJECT_EV_ERROR
FSCACHE_OBJECT_EV_ERROR
This is signalled when an I/O error occurs during the processing of some
object.
(*) FSCACHE_OBJECT_EV_RELEASE
(*) FSCACHE_OBJECT_EV_RETIRE
FSCACHE_OBJECT_EV_RELEASE, FSCACHE_OBJECT_EV_RETIRE
These are signalled when the netfs relinquishes a cookie it was using.
The event selected depends on whether the netfs asks for the backing
object to be retired (deleted) or retained.
(*) FSCACHE_OBJECT_EV_WITHDRAW
FSCACHE_OBJECT_EV_WITHDRAW
This is signalled when the cache backend wants to withdraw an object.
This means that the object will have to be detached from the netfs's
cookie.
......
================================
ASYNCHRONOUS OPERATIONS HANDLING
================================
.. SPDX-License-Identifier: GPL-2.0
================================
Asynchronous Operations Handling
================================
By: David Howells <dhowells@redhat.com>
Contents:
.. Contents:
(*) Overview.
......@@ -17,8 +19,7 @@ Contents:
(*) Asynchronous callback.
========
OVERVIEW
Overview
========
FS-Cache has an asynchronous operations handling facility that it uses for its
......@@ -33,11 +34,10 @@ backend for completion.
To make use of this facility, <linux/fscache-cache.h> should be #included.
===============================
OPERATION RECORD INITIALISATION
Operation Record Initialisation
===============================
An operation is recorded in an fscache_operation struct:
An operation is recorded in an fscache_operation struct::
struct fscache_operation {
union {
......@@ -50,7 +50,7 @@ An operation is recorded in an fscache_operation struct:
};
Someone wanting to issue an operation should allocate something with this
struct embedded in it. They should initialise it by calling:
struct embedded in it. They should initialise it by calling::
void fscache_operation_init(struct fscache_operation *op,
fscache_operation_release_t release);
......@@ -67,8 +67,7 @@ FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the
operation and waited for afterwards.
==========
PARAMETERS
Parameters
==========
There are a number of parameters that can be set in the operation record's flag
......@@ -87,7 +86,7 @@ operations:
If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags
before submitting the operation, and the operating thread must wait for it
to be cleared before proceeding:
to be cleared before proceeding::
wait_on_bit(&op->flags, FSCACHE_OP_WAITING,
TASK_UNINTERRUPTIBLE);
......@@ -101,7 +100,7 @@ operations:
page to a netfs page after the backing fs has read the page in.
If this option is used, op->fast_work and op->processor must be
initialised before submitting the operation:
initialised before submitting the operation::
INIT_WORK(&op->fast_work, do_some_work);
......@@ -114,7 +113,7 @@ operations:
pages that have just been fetched from a remote server.
If this option is used, op->slow_work and op->processor must be
initialised before submitting the operation:
initialised before submitting the operation::
fscache_operation_init_slow(op, processor)
......@@ -132,8 +131,7 @@ Furthermore, operations may be one of two types:
operations running at the same time.
=========
PROCEDURE
Procedure
=========
Operations are used through the following procedure:
......@@ -143,7 +141,7 @@ Operations are used through the following procedure:
generic op embedded within.
(2) The submitting thread must then submit the operation for processing using
one of the following two functions:
one of the following two functions::
int fscache_submit_op(struct fscache_object *object,
struct fscache_operation *op);
......@@ -164,7 +162,7 @@ Operations are used through the following procedure:
operation of conflicting exclusivity is in progress on the object.
If the operation is asynchronous, the manager will retain a reference to
it, so the caller should put their reference to it by passing it to:
it, so the caller should put their reference to it by passing it to::
void fscache_put_operation(struct fscache_operation *op);
......@@ -179,12 +177,12 @@ Operations are used through the following procedure:
(4) The operation holds an effective lock upon the object, preventing other
exclusive ops conflicting until it is released. The operation can be
enqueued for further immediate asynchronous processing by adjusting the
CPU time provisioning option if necessary, eg:
CPU time provisioning option if necessary, eg::
op->flags &= ~FSCACHE_OP_TYPE;
op->flags |= ~FSCACHE_OP_FAST;
and calling:
and calling::
void fscache_enqueue_operation(struct fscache_operation *op)
......@@ -192,13 +190,12 @@ Operations are used through the following procedure:
pools.
=====================
ASYNCHRONOUS CALLBACK
Asynchronous Callback
=====================
When used in asynchronous mode, the worker thread pool will invoke the
processor method with a pointer to the operation. This should then get at the
container struct by using container_of():
container struct by using container_of()::
static void fscache_write_op(struct fscache_operation *_op)
{
......
.. SPDX-License-Identifier: GPL-2.0
===========================================
Mounting root file system via SMB (cifs.ko)
===========================================
Written 2019 by Paulo Alcantara <palcantara@suse.de>
Written 2019 by Aurelien Aptel <aaptel@suse.com>
The CONFIG_CIFS_ROOT option enables experimental root file system
......@@ -32,7 +36,7 @@ Server configuration
====================
To enable SMB1+UNIX extensions you will need to set these global
settings in Samba smb.conf:
settings in Samba smb.conf::
[global]
server min protocol = NT1
......@@ -41,12 +45,16 @@ settings in Samba smb.conf:
Kernel command line
===================
root=/dev/cifs
::
root=/dev/cifs
This is just a virtual device that basically tells the kernel to mount
the root file system via SMB protocol.
cifsroot=//<server-ip>/<share>[,options]
::
cifsroot=//<server-ip>/<share>[,options]
Enables the kernel to mount the root file system via SMB that are
located in the <server-ip> and <share> specified in this option.
......@@ -65,33 +73,33 @@ options
Examples
========
Export root file system as a Samba share in smb.conf file.
Export root file system as a Samba share in smb.conf file::
...
[linux]
path = /path/to/rootfs
read only = no
guest ok = yes
force user = root
force group = root
browseable = yes
writeable = yes
admin users = root
public = yes
create mask = 0777
directory mask = 0777
...
...
[linux]
path = /path/to/rootfs
read only = no
guest ok = yes
force user = root
force group = root
browseable = yes
writeable = yes
admin users = root
public = yes
create mask = 0777
directory mask = 0777
...
Restart smb service.
Restart smb service::
# systemctl restart smb
# systemctl restart smb
Test it under QEMU on a kernel built with CONFIG_CIFS_ROOT and
CONFIG_IP_PNP options enabled.
CONFIG_IP_PNP options enabled::
# qemu-system-x86_64 -enable-kvm -cpu host -m 1024 \
-kernel /path/to/linux/arch/x86/boot/bzImage -nographic \
-append "root=/dev/cifs rw ip=dhcp cifsroot=//10.0.2.2/linux,username=foo,password=bar console=ttyS0 3"
# qemu-system-x86_64 -enable-kvm -cpu host -m 1024 \
-kernel /path/to/linux/arch/x86/boot/bzImage -nographic \
-append "root=/dev/cifs rw ip=dhcp cifsroot=//10.0.2.2/linux,username=foo,password=bar console=ttyS0 3"
1: https://wiki.samba.org/index.php/UNIX_Extensions
configfs - Userspace-driven kernel object configuration.
=======================================================
Configfs - Userspace-driven Kernel Object Configuration
=======================================================
Joel Becker <joel.becker@oracle.com>
......@@ -9,7 +10,8 @@ Copyright (c) 2005 Oracle Corporation,
Joel Becker <joel.becker@oracle.com>
[What is configfs?]
What is configfs?
=================
configfs is a ram-based filesystem that provides the converse of
sysfs's functionality. Where sysfs is a filesystem-based view of
......@@ -35,10 +37,11 @@ kernel modules backing the items must respond to this.
Both sysfs and configfs can and should exist together on the same
system. One is not a replacement for the other.
[Using configfs]
Using configfs
==============
configfs can be compiled as a module or into the kernel. You can access
it by doing
it by doing::
mount -t configfs none /config
......@@ -56,28 +59,29 @@ values. Don't mix more than one attribute in one attribute file.
There are two types of configfs attributes:
* Normal attributes, which similar to sysfs attributes, are small ASCII text
files, with a maximum size of one page (PAGE_SIZE, 4096 on i386). Preferably
only one value per file should be used, and the same caveats from sysfs apply.
Configfs expects write(2) to store the entire buffer at once. When writing to
normal configfs attributes, userspace processes should first read the entire
file, modify the portions they wish to change, and then write the entire
buffer back.
files, with a maximum size of one page (PAGE_SIZE, 4096 on i386). Preferably
only one value per file should be used, and the same caveats from sysfs apply.
Configfs expects write(2) to store the entire buffer at once. When writing to
normal configfs attributes, userspace processes should first read the entire
file, modify the portions they wish to change, and then write the entire
buffer back.
* Binary attributes, which are somewhat similar to sysfs binary attributes,
but with a few slight changes to semantics. The PAGE_SIZE limitation does not
apply, but the whole binary item must fit in single kernel vmalloc'ed buffer.
The write(2) calls from user space are buffered, and the attributes'
write_bin_attribute method will be invoked on the final close, therefore it is
imperative for user-space to check the return code of close(2) in order to
verify that the operation finished successfully.
To avoid a malicious user OOMing the kernel, there's a per-binary attribute
maximum buffer value.
but with a few slight changes to semantics. The PAGE_SIZE limitation does not
apply, but the whole binary item must fit in single kernel vmalloc'ed buffer.
The write(2) calls from user space are buffered, and the attributes'
write_bin_attribute method will be invoked on the final close, therefore it is
imperative for user-space to check the return code of close(2) in order to
verify that the operation finished successfully.
To avoid a malicious user OOMing the kernel, there's a per-binary attribute
maximum buffer value.
When an item needs to be destroyed, remove it with rmdir(2). An
item cannot be destroyed if any other item has a link to it (via
symlink(2)). Links can be removed via unlink(2).
[Configuring FakeNBD: an Example]
Configuring FakeNBD: an Example
===============================
Imagine there's a Network Block Device (NBD) driver that allows you to
access remote block devices. Call it FakeNBD. FakeNBD uses configfs
......@@ -86,14 +90,14 @@ sysadmins use to configure FakeNBD, but somehow that program has to tell
the driver about it. Here's where configfs comes in.
When the FakeNBD driver is loaded, it registers itself with configfs.
readdir(3) sees this just fine:
readdir(3) sees this just fine::
# ls /config
fakenbd
A fakenbd connection can be created with mkdir(2). The name is
arbitrary, but likely the tool will make some use of the name. Perhaps
it is a uuid or a disk name:
it is a uuid or a disk name::
# mkdir /config/fakenbd/disk1
# ls /config/fakenbd/disk1
......@@ -102,7 +106,7 @@ it is a uuid or a disk name:
The target attribute contains the IP address of the server FakeNBD will
connect to. The device attribute is the device on the server.
Predictably, the rw attribute determines whether the connection is
read-only or read-write.
read-only or read-write::
# echo 10.0.0.1 > /config/fakenbd/disk1/target
# echo /dev/sda1 > /config/fakenbd/disk1/device
......@@ -111,7 +115,8 @@ read-only or read-write.
That's it. That's all there is. Now the device is configured, via the
shell no less.
[Coding With configfs]
Coding With configfs
====================
Every object in configfs is a config_item. A config_item reflects an
object in the subsystem. It has attributes that match values on that
......@@ -130,7 +135,10 @@ appears as a directory at the top of the configfs filesystem. A
subsystem is also a config_group, and can do everything a config_group
can.
[struct config_item]
struct config_item
==================
::
struct config_item {
char *ci_name;
......@@ -168,7 +176,10 @@ By itself, a config_item cannot do much more than appear in configfs.
Usually a subsystem wants the item to display and/or store attributes,
among other things. For that, it needs a type.
[struct config_item_type]
struct config_item_type
=======================
::
struct configfs_item_operations {
void (*release)(struct config_item *);
......@@ -192,7 +203,10 @@ allocated dynamically will need to provide the ct_item_ops->release()
method. This method is called when the config_item's reference count
reaches zero.
[struct configfs_attribute]
struct configfs_attribute
=========================
::
struct configfs_attribute {
char *ca_name;
......@@ -214,7 +228,10 @@ be called whenever userspace asks for a read(2) on the attribute. If an
attribute is writable and provides a ->store method, that method will be
be called whenever userspace asks for a write(2) on the attribute.
[struct configfs_bin_attribute]
struct configfs_bin_attribute
=============================
::
struct configfs_bin_attribute {
struct configfs_attribute cb_attr;
......@@ -240,11 +257,12 @@ will happen for write(2). The reads/writes are bufferred so only a
single read/write will occur; the attributes' need not concern itself
with it.
[struct config_group]
struct config_group
===================
A config_item cannot live in a vacuum. The only way one can be created
is via mkdir(2) on a config_group. This will trigger creation of a
child item.
child item::
struct config_group {
struct config_item cg_item;
......@@ -264,7 +282,7 @@ The config_group structure contains a config_item. Properly configuring
that item means that a group can behave as an item in its own right.
However, it can do more: it can create child items or groups. This is
accomplished via the group operations specified on the group's
config_item_type.
config_item_type::
struct configfs_group_operations {
struct config_item *(*make_item)(struct config_group *group,
......@@ -279,7 +297,8 @@ config_item_type.
};
A group creates child items by providing the
ct_group_ops->make_item() method. If provided, this method is called from mkdir(2) in the group's directory. The subsystem allocates a new
ct_group_ops->make_item() method. If provided, this method is called from
mkdir(2) in the group's directory. The subsystem allocates a new
config_item (or more likely, its container structure), initializes it,
and returns it to configfs. Configfs will then populate the filesystem
tree to reflect the new item.
......@@ -296,13 +315,14 @@ upon item allocation. If a subsystem has no work to do, it may omit
the ct_group_ops->drop_item() method, and configfs will call
config_item_put() on the item on behalf of the subsystem.
IMPORTANT: drop_item() is void, and as such cannot fail. When rmdir(2)
is called, configfs WILL remove the item from the filesystem tree
(assuming that it has no children to keep it busy). The subsystem is
responsible for responding to this. If the subsystem has references to
the item in other threads, the memory is safe. It may take some time
for the item to actually disappear from the subsystem's usage. But it
is gone from configfs.
Important:
drop_item() is void, and as such cannot fail. When rmdir(2)
is called, configfs WILL remove the item from the filesystem tree
(assuming that it has no children to keep it busy). The subsystem is
responsible for responding to this. If the subsystem has references to
the item in other threads, the memory is safe. It may take some time
for the item to actually disappear from the subsystem's usage. But it
is gone from configfs.
When drop_item() is called, the item's linkage has already been torn
down. It no longer has a reference on its parent and has no place in
......@@ -319,10 +339,11 @@ is implemented in the configfs rmdir(2) code. ->drop_item() will not be
called, as the item has not been dropped. rmdir(2) will fail, as the
directory is not empty.
[struct configfs_subsystem]
struct configfs_subsystem
=========================
A subsystem must register itself, usually at module_init time. This
tells configfs to make the subsystem appear in the file tree.
tells configfs to make the subsystem appear in the file tree::
struct configfs_subsystem {
struct config_group su_group;
......@@ -332,17 +353,19 @@ tells configfs to make the subsystem appear in the file tree.
int configfs_register_subsystem(struct configfs_subsystem *subsys);
void configfs_unregister_subsystem(struct configfs_subsystem *subsys);
A subsystem consists of a toplevel config_group and a mutex.
A subsystem consists of a toplevel config_group and a mutex.
The group is where child config_items are created. For a subsystem,
this group is usually defined statically. Before calling
configfs_register_subsystem(), the subsystem must have initialized the
group via the usual group _init() functions, and it must also have
initialized the mutex.
When the register call returns, the subsystem is live, and it
When the register call returns, the subsystem is live, and it
will be visible via configfs. At that point, mkdir(2) can be called and
the subsystem must be ready for it.
[An Example]
An Example
==========
The best example of these basic concepts is the simple_children
subsystem/group and the simple_child item in
......@@ -350,7 +373,8 @@ samples/configfs/configfs_sample.c. It shows a trivial object displaying
and storing an attribute, and a simple group creating and destroying
these children.
[Hierarchy Navigation and the Subsystem Mutex]
Hierarchy Navigation and the Subsystem Mutex
============================================
There is an extra bonus that configfs provides. The config_groups and
config_items are arranged in a hierarchy due to the fact that they
......@@ -375,7 +399,8 @@ be in its parent's cg_children list for the same duration. This allows
a subsystem to trust ci_parent and cg_children while they hold the
mutex.
[Item Aggregation Via symlink(2)]
Item Aggregation Via symlink(2)
===============================
configfs provides a simple group via the group->item parent/child
relationship. Often, however, a larger environment requires aggregation
......@@ -403,7 +428,8 @@ A config_item cannot be removed while it links to any other item, nor
can it be removed while an item links to it. Dangling symlinks are not
allowed in configfs.
[Automatically Created Subgroups]
Automatically Created Subgroups
===============================
A new config_group may want to have two types of child config_items.
While this could be codified by magic names in ->make_item(), it is much
......@@ -433,7 +459,8 @@ As a consequence of this, default groups cannot be removed directly via
rmdir(2). They also are not considered when rmdir(2) on the parent
group is checking for children.
[Dependent Subsystems]
Dependent Subsystems
====================
Sometimes other drivers depend on particular configfs items. For
example, ocfs2 mounts depend on a heartbeat region item. If that
......@@ -460,9 +487,11 @@ succeeds, then heartbeat knows the region is safe to give to ocfs2.
If it fails, it was being torn down anyway, and heartbeat can gracefully
pass up an error.
[Committable Items]
Committable Items
=================
NOTE: Committable items are currently unimplemented.
Note:
Committable items are currently unimplemented.
Some config_items cannot have a valid initial state. That is, no
default values can be specified for the item's attributes such that the
......@@ -504,5 +533,3 @@ As rmdir(2) does not work in the "live" directory, an item must be
shutdown, or "uncommitted". Again, this is done via rename(2), this
time from the "live" directory back to the "pending" one. The subsystem
is notified by the ct_group_ops->uncommit_object() method.
.. SPDX-License-Identifier: GPL-2.0
=====================
The Devpts Filesystem
=====================
Each mount of the devpts filesystem is now distinct such that ptys
and their indicies allocated in one mount are independent from ptys
and their indicies in all other mounts.
All mounts of the devpts filesystem now create a ``/dev/pts/ptmx`` node
with permissions ``0000``.
To retain backwards compatibility the a ptmx device node (aka any node
created with ``mknod name c 5 2``) when opened will look for an instance
of devpts under the name ``pts`` in the same directory as the ptmx device
node.
As an option instead of placing a ``/dev/ptmx`` device node at ``/dev/ptmx``
it is possible to place a symlink to ``/dev/pts/ptmx`` at ``/dev/ptmx`` or
to bind mount ``/dev/ptx/ptmx`` to ``/dev/ptmx``. If you opt for using
the devpts filesystem in this manner devpts should be mounted with
the ``ptmxmode=0666``, or ``chmod 0666 /dev/pts/ptmx`` should be called.
Total count of pty pairs in all instances is limited by sysctls::
kernel.pty.max = 4096 - global limit
kernel.pty.reserve = 1024 - reserved for filesystems mounted from the initial mount namespace
kernel.pty.nr - current count of ptys
Per-instance limit could be set by adding mount option ``max=<count>``.
This feature was added in kernel 3.4 together with
``sysctl kernel.pty.reserve``.
In kernels older than 3.4 sysctl ``kernel.pty.max`` works as per-instance limit.
Each mount of the devpts filesystem is now distinct such that ptys
and their indicies allocated in one mount are independent from ptys
and their indicies in all other mounts.
All mounts of the devpts filesystem now create a /dev/pts/ptmx node
with permissions 0000.
To retain backwards compatibility the a ptmx device node (aka any node
created with "mknod name c 5 2") when opened will look for an instance
of devpts under the name "pts" in the same directory as the ptmx device
node.
As an option instead of placing a /dev/ptmx device node at /dev/ptmx
it is possible to place a symlink to /dev/pts/ptmx at /dev/ptmx or
to bind mount /dev/ptx/ptmx to /dev/ptmx. If you opt for using
the devpts filesystem in this manner devpts should be mounted with
the ptmxmode=0666, or chmod 0666 /dev/pts/ptmx should be called.
Total count of pty pairs in all instances is limited by sysctls:
kernel.pty.max = 4096 - global limit
kernel.pty.reserve = 1024 - reserved for filesystems mounted from the initial mount namespace
kernel.pty.nr - current count of ptys
Per-instance limit could be set by adding mount option "max=<count>".
This feature was added in kernel 3.4 together with sysctl kernel.pty.reserve.
In kernels older than 3.4 sysctl kernel.pty.max works as per-instance limit.
Linux Directory Notification
============================
.. SPDX-License-Identifier: GPL-2.0
============================
Linux Directory Notification
============================
Stephen Rothwell <sfr@canb.auug.org.au>
......@@ -12,6 +15,7 @@ being delivered using signals.
The application decides which "events" it wants to be notified about.
The currently defined events are:
========= =====================================================
DN_ACCESS A file in the directory was accessed (read)
DN_MODIFY A file in the directory was modified (write,truncate)
DN_CREATE A file was created in the directory
......@@ -19,6 +23,7 @@ The currently defined events are:
DN_RENAME A file in the directory was renamed
DN_ATTRIB A file in the directory had its attributes
changed (chmod,chown)
========= =====================================================
Usually, the application must reregister after each notification, but
if DN_MULTISHOT is or'ed with the event mask, then the registration will
......@@ -36,7 +41,7 @@ especially important if DN_MULTISHOT is specified. Note that SIGRTMIN
is often blocked, so it is better to use (at least) SIGRTMIN + 1.
Implementation expectations (features and bugs :-))
---------------------------
---------------------------------------------------
The notification should work for any local access to files even if the
actual file system is on a remote server. This implies that remote
......
.. SPDX-License-Identifier: GPL-2.0
===================================
File management in the Linux kernel
-----------------------------------
===================================
This document describes how locking for files (struct file)
and file descriptor table (struct files) works.
......@@ -34,7 +37,7 @@ appear atomic. Here are the locking rules for
the fdtable structure -
1. All references to the fdtable must be done through
the files_fdtable() macro :
the files_fdtable() macro::
struct fdtable *fdt;
......@@ -61,7 +64,8 @@ the fdtable structure -
4. To look up the file structure given an fd, a reader
must use either fcheck() or fcheck_files() APIs. These
take care of barrier requirements due to lock-free lookup.
An example :
An example::
struct file *file;
......@@ -77,7 +81,7 @@ the fdtable structure -
of the fd (fget()/fget_light()) are lock-free, it is possible
that look-up may race with the last put() operation on the
file structure. This is avoided using atomic_long_inc_not_zero()
on ->f_count :
on ->f_count::
rcu_read_lock();
file = fcheck_files(files, fd);
......@@ -106,7 +110,8 @@ the fdtable structure -
holding files->file_lock. If ->file_lock is dropped, then
another thread expand the files thereby creating a new
fdtable and making the earlier fdtable pointer stale.
For example :
For example::
spin_lock(&files->file_lock);
fd = locate_fd(files, file, start);
......
.. SPDX-License-Identifier: GPL-2.0
==============
Fuse I/O Modes
==============
Fuse supports the following I/O modes:
- direct-io
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册