提交 e82894f8 编写于 作者: T Tom Zanussi 提交者: Linus Torvalds

[PATCH] relayfs

Here's the latest version of relayfs, against linux-2.6.11-mm2.  I'm hoping
you'll consider putting this version back into your tree - the previous
rounds of comment seem to have shaken out all the API issues and the number
of comments on the code itself have also steadily dwindled.

This patch is essentially the same as the relayfs redux part 5 patch, with
some minor changes based on reviewer comments.  Thanks again to Pekka
Enberg for those.  The patch size without documentation is now a little
smaller at just over 40k.  Here's a detailed list of the changes:

- removed the attribute_flags in relay open and changed it to a
  boolean specifying either overwrite or no-overwrite mode, and removed
  everything referencing the attribute flags.
- added a check for NULL names in relayfs_create_entry()
- got rid of the unnecessary multiple labels in relay_create_buf()
- some minor simplification of relay_alloc_buf() which got rid of a
  couple params
- updated the Documentation

In addition, this version (through code contained in the relay-apps tarball
linked to below, not as part of the relayfs patch) tries to make it as easy
as possible to create the cooperating kernel/user pieces of a typical and
common type of logging application, one where kernel logging is kicked off
when a user space data collection app starts and stops when the collection
app exits, with the data being automatically logged to disk in between.  To
create this type of application, you basically just include a header file
(relay-app.h, included in the relay-apps tarball) in your kernel module,
define a couple of callbacks and call an initialization function, and on
the user side call a single function that sets up and continuously monitors
the buffers, and writes data to files as it becomes available.  Channels
are created when the collection app is started and destroyed when it exits,
not when the kernel module is inserted, so different channel buffer sizes
can be specified for each separate run via command-line options.  See the
README in the relay-apps tarball for details.

Also included in the relay-apps tarball are a couple examples
demonstrating how you can use this to create quick and dirty kernel
logging/debugging applications.  They are:

- tprintk, short for 'tee printk', which temporarily puts a kprobe on
  printk() and writes a duplicate stream of printk output to a relayfs
  channel.  This could be used anywhere there's printk() debugging code
  in the kernel which you'd like to exercise, but would rather not have
  your system logs cluttered with debugging junk.  You'd probably want
  to kill klogd while you do this, otherwise there wouldn't be much
  point (since putting a kprobe on printk() doesn't change the output
  of printk()).  I've used this method to temporarily divert the packet
  logging output of the iptables LOG target from the system logs to
  relayfs files instead, for instance.

- klog, which just provides a printk-like formatted logging function
  on top of relayfs.  Again, you can use this to keep stuff out of your
  system logs if used in place of printk.

The example applications can be found here:

http://prdownloads.sourceforge.net/dprobes/relay-apps.tar.gz?download

From: Christoph Hellwig <hch@lst.de>

  avoid lookup_hash usage in relayfs
Signed-off-by: NTom Zanussi <zanussi@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
上级 8446f1d3
relayfs - a high-speed data relay filesystem
============================================
relayfs is a filesystem designed to provide an efficient mechanism for
tools and facilities to relay large and potentially sustained streams
of data from kernel space to user space.
The main abstraction of relayfs is the 'channel'. A channel consists
of a set of per-cpu kernel buffers each represented by a file in the
relayfs filesystem. Kernel clients write into a channel using
efficient write functions which automatically log to the current cpu's
channel buffer. User space applications mmap() the per-cpu files and
retrieve the data as it becomes available.
The format of the data logged into the channel buffers is completely
up to the relayfs client; relayfs does however provide hooks which
allow clients to impose some stucture on the buffer data. Nor does
relayfs implement any form of data filtering - this also is left to
the client. The purpose is to keep relayfs as simple as possible.
This document provides an overview of the relayfs API. The details of
the function parameters are documented along with the functions in the
filesystem code - please see that for details.
Semantics
=========
Each relayfs channel has one buffer per CPU, each buffer has one or
more sub-buffers. Messages are written to the first sub-buffer until
it is too full to contain a new message, in which case it it is
written to the next (if available). Messages are never split across
sub-buffers. At this point, userspace can be notified so it empties
the first sub-buffer, while the kernel continues writing to the next.
When notified that a sub-buffer is full, the kernel knows how many
bytes of it are padding i.e. unused. Userspace can use this knowledge
to copy only valid data.
After copying it, userspace can notify the kernel that a sub-buffer
has been consumed.
relayfs can operate in a mode where it will overwrite data not yet
collected by userspace, and not wait for it to consume it.
relayfs itself does not provide for communication of such data between
userspace and kernel, allowing the kernel side to remain simple and not
impose a single interface on userspace. It does provide a separate
helper though, described below.
klog, relay-app & librelay
==========================
relayfs itself is ready to use, but to make things easier, two
additional systems are provided. klog is a simple wrapper to make
writing formatted text or raw data to a channel simpler, regardless of
whether a channel to write into exists or not, or whether relayfs is
compiled into the kernel or is configured as a module. relay-app is
the kernel counterpart of userspace librelay.c, combined these two
files provide glue to easily stream data to disk, without having to
bother with housekeeping. klog and relay-app can be used together,
with klog providing high-level logging functions to the kernel and
relay-app taking care of kernel-user control and disk-logging chores.
It is possible to use relayfs without relay-app & librelay, but you'll
have to implement communication between userspace and kernel, allowing
both to convey the state of buffers (full, empty, amount of padding).
klog, relay-app and librelay can be found in the relay-apps tarball on
http://relayfs.sourceforge.net
The relayfs user space API
==========================
relayfs implements basic file operations for user space access to
relayfs channel buffer data. Here are the file operations that are
available and some comments regarding their behavior:
open() enables user to open an _existing_ buffer.
mmap() results in channel buffer being mapped into the caller's
memory space. Note that you can't do a partial mmap - you must
map the entire file, which is NRBUF * SUBBUFSIZE.
read() read the contents of a channel buffer. The bytes read are
'consumed' by the reader i.e. they won't be available again
to subsequent reads. If the channel is being used in
no-overwrite mode (the default), it can be read at any time
even if there's an active kernel writer. If the channel is
being used in overwrite mode and there are active channel
writers, results may be unpredictable - users should make
sure that all logging to the channel has ended before using
read() with overwrite mode.
poll() POLLIN/POLLRDNORM/POLLERR supported. User applications are
notified when sub-buffer boundaries are crossed.
close() decrements the channel buffer's refcount. When the refcount
reaches 0 i.e. when no process or kernel client has the buffer
open, the channel buffer is freed.
In order for a user application to make use of relayfs files, the
relayfs filesystem must be mounted. For example,
mount -t relayfs relayfs /mnt/relay
NOTE: relayfs doesn't need to be mounted for kernel clients to create
or use channels - it only needs to be mounted when user space
applications need access to the buffer data.
The relayfs kernel API
======================
Here's a summary of the API relayfs provides to in-kernel clients:
channel management functions:
relay_open(base_filename, parent, subbuf_size, n_subbufs,
callbacks)
relay_close(chan)
relay_flush(chan)
relay_reset(chan)
relayfs_create_dir(name, parent)
relayfs_remove_dir(dentry)
channel management typically called on instigation of userspace:
relay_subbufs_consumed(chan, cpu, subbufs_consumed)
write functions:
relay_write(chan, data, length)
__relay_write(chan, data, length)
relay_reserve(chan, length)
callbacks:
subbuf_start(buf, subbuf, prev_subbuf, prev_padding)
buf_mapped(buf, filp)
buf_unmapped(buf, filp)
helper functions:
relay_buf_full(buf)
subbuf_start_reserve(buf, length)
Creating a channel
------------------
relay_open() is used to create a channel, along with its per-cpu
channel buffers. Each channel buffer will have an associated file
created for it in the relayfs filesystem, which can be opened and
mmapped from user space if desired. The files are named
basename0...basenameN-1 where N is the number of online cpus, and by
default will be created in the root of the filesystem. If you want a
directory structure to contain your relayfs files, you can create it
with relayfs_create_dir() and pass the parent directory to
relay_open(). Clients are responsible for cleaning up any directory
structure they create when the channel is closed - use
relayfs_remove_dir() for that.
The total size of each per-cpu buffer is calculated by multiplying the
number of sub-buffers by the sub-buffer size passed into relay_open().
The idea behind sub-buffers is that they're basically an extension of
double-buffering to N buffers, and they also allow applications to
easily implement random-access-on-buffer-boundary schemes, which can
be important for some high-volume applications. The number and size
of sub-buffers is completely dependent on the application and even for
the same application, different conditions will warrant different
values for these parameters at different times. Typically, the right
values to use are best decided after some experimentation; in general,
though, it's safe to assume that having only 1 sub-buffer is a bad
idea - you're guaranteed to either overwrite data or lose events
depending on the channel mode being used.
Channel 'modes'
---------------
relayfs channels can be used in either of two modes - 'overwrite' or
'no-overwrite'. The mode is entirely determined by the implementation
of the subbuf_start() callback, as described below. In 'overwrite'
mode, also known as 'flight recorder' mode, writes continuously cycle
around the buffer and will never fail, but will unconditionally
overwrite old data regardless of whether it's actually been consumed.
In no-overwrite mode, writes will fail i.e. data will be lost, if the
number of unconsumed sub-buffers equals the total number of
sub-buffers in the channel. It should be clear that if there is no
consumer or if the consumer can't consume sub-buffers fast enought,
data will be lost in either case; the only difference is whether data
is lost from the beginning or the end of a buffer.
As explained above, a relayfs channel is made of up one or more
per-cpu channel buffers, each implemented as a circular buffer
subdivided into one or more sub-buffers. Messages are written into
the current sub-buffer of the channel's current per-cpu buffer via the
write functions described below. Whenever a message can't fit into
the current sub-buffer, because there's no room left for it, the
client is notified via the subbuf_start() callback that a switch to a
new sub-buffer is about to occur. The client uses this callback to 1)
initialize the next sub-buffer if appropriate 2) finalize the previous
sub-buffer if appropriate and 3) return a boolean value indicating
whether or not to actually go ahead with the sub-buffer switch.
To implement 'no-overwrite' mode, the userspace client would provide
an implementation of the subbuf_start() callback something like the
following:
static int subbuf_start(struct rchan_buf *buf,
void *subbuf,
void *prev_subbuf,
unsigned int prev_padding)
{
if (prev_subbuf)
*((unsigned *)prev_subbuf) = prev_padding;
if (relay_buf_full(buf))
return 0;
subbuf_start_reserve(buf, sizeof(unsigned int));
return 1;
}
If the current buffer is full i.e. all sub-buffers remain unconsumed,
the callback returns 0 to indicate that the buffer switch should not
occur yet i.e. until the consumer has had a chance to read the current
set of ready sub-buffers. For the relay_buf_full() function to make
sense, the consumer is reponsible for notifying relayfs when
sub-buffers have been consumed via relay_subbufs_consumed(). Any
subsequent attempts to write into the buffer will again invoke the
subbuf_start() callback with the same parameters; only when the
consumer has consumed one or more of the ready sub-buffers will
relay_buf_full() return 0, in which case the buffer switch can
continue.
The implementation of the subbuf_start() callback for 'overwrite' mode
would be very similar:
static int subbuf_start(struct rchan_buf *buf,
void *subbuf,
void *prev_subbuf,
unsigned int prev_padding)
{
if (prev_subbuf)
*((unsigned *)prev_subbuf) = prev_padding;
subbuf_start_reserve(buf, sizeof(unsigned int));
return 1;
}
In this case, the relay_buf_full() check is meaningless and the
callback always returns 1, causing the buffer switch to occur
unconditionally. It's also meaningless for the client to use the
relay_subbufs_consumed() function in this mode, as it's never
consulted.
The default subbuf_start() implementation, used if the client doesn't
define any callbacks, or doesn't define the subbuf_start() callback,
implements the simplest possible 'no-overwrite' mode i.e. it does
nothing but return 0.
Header information can be reserved at the beginning of each sub-buffer
by calling the subbuf_start_reserve() helper function from within the
subbuf_start() callback. This reserved area can be used to store
whatever information the client wants. In the example above, room is
reserved in each sub-buffer to store the padding count for that
sub-buffer. This is filled in for the previous sub-buffer in the
subbuf_start() implementation; the padding value for the previous
sub-buffer is passed into the subbuf_start() callback along with a
pointer to the previous sub-buffer, since the padding value isn't
known until a sub-buffer is filled. The subbuf_start() callback is
also called for the first sub-buffer when the channel is opened, to
give the client a chance to reserve space in it. In this case the
previous sub-buffer pointer passed into the callback will be NULL, so
the client should check the value of the prev_subbuf pointer before
writing into the previous sub-buffer.
Writing to a channel
--------------------
kernel clients write data into the current cpu's channel buffer using
relay_write() or __relay_write(). relay_write() is the main logging
function - it uses local_irqsave() to protect the buffer and should be
used if you might be logging from interrupt context. If you know
you'll never be logging from interrupt context, you can use
__relay_write(), which only disables preemption. These functions
don't return a value, so you can't determine whether or not they
failed - the assumption is that you wouldn't want to check a return
value in the fast logging path anyway, and that they'll always succeed
unless the buffer is full and no-overwrite mode is being used, in
which case you can detect a failed write in the subbuf_start()
callback by calling the relay_buf_full() helper function.
relay_reserve() is used to reserve a slot in a channel buffer which
can be written to later. This would typically be used in applications
that need to write directly into a channel buffer without having to
stage data in a temporary buffer beforehand. Because the actual write
may not happen immediately after the slot is reserved, applications
using relay_reserve() can keep a count of the number of bytes actually
written, either in space reserved in the sub-buffers themselves or as
a separate array. See the 'reserve' example in the relay-apps tarball
at http://relayfs.sourceforge.net for an example of how this can be
done. Because the write is under control of the client and is
separated from the reserve, relay_reserve() doesn't protect the buffer
at all - it's up to the client to provide the appropriate
synchronization when using relay_reserve().
Closing a channel
-----------------
The client calls relay_close() when it's finished using the channel.
The channel and its associated buffers are destroyed when there are no
longer any references to any of the channel buffers. relay_flush()
forces a sub-buffer switch on all the channel buffers, and can be used
to finalize and process the last sub-buffers before the channel is
closed.
Misc
----
Some applications may want to keep a channel around and re-use it
rather than open and close a new channel for each use. relay_reset()
can be used for this purpose - it resets a channel to its initial
state without reallocating channel buffer memory or destroying
existing mappings. It should however only be called when it's safe to
do so i.e. when the channel isn't currently being written to.
Finally, there are a couple of utility callbacks that can be used for
different purposes. buf_mapped() is called whenever a channel buffer
is mmapped from user space and buf_unmapped() is called when it's
unmapped. The client can use this notification to trigger actions
within the kernel application, such as enabling/disabling logging to
the channel.
Resources
=========
For news, example code, mailing list, etc. see the relayfs homepage:
http://relayfs.sourceforge.net
Credits
=======
The ideas and specs for relayfs came about as a result of discussions
on tracing involving the following:
Michel Dagenais <michel.dagenais@polymtl.ca>
Richard Moore <richardj_moore@uk.ibm.com>
Bob Wisniewski <bob@watson.ibm.com>
Karim Yaghmour <karim@opersys.com>
Tom Zanussi <zanussi@us.ibm.com>
Also thanks to Hubertus Franke for a lot of useful suggestions and bug
reports.
......@@ -816,6 +816,18 @@ config RAMFS
To compile this as a module, choose M here: the module will be called
ramfs.
config RELAYFS_FS
tristate "Relayfs file system support"
---help---
Relayfs is a high-speed data relay filesystem designed to provide
an efficient mechanism for tools and facilities to relay large
amounts of data from kernel space to user space.
To compile this code as a module, choose M here: the module will be
called relayfs.
If unsure, say N.
endmenu
menu "Miscellaneous filesystems"
......
......@@ -90,6 +90,7 @@ obj-$(CONFIG_AUTOFS_FS) += autofs/
obj-$(CONFIG_AUTOFS4_FS) += autofs4/
obj-$(CONFIG_ADFS_FS) += adfs/
obj-$(CONFIG_UDF_FS) += udf/
obj-$(CONFIG_RELAYFS_FS) += relayfs/
obj-$(CONFIG_SUN_OPENPROMFS) += openpromfs/
obj-$(CONFIG_JFS_FS) += jfs/
obj-$(CONFIG_XFS_FS) += xfs/
......
obj-$(CONFIG_RELAYFS_FS) += relayfs.o
relayfs-y := relay.o inode.o buffers.o
/*
* RelayFS buffer management code.
*
* Copyright (C) 2002-2005 - Tom Zanussi (zanussi@us.ibm.com), IBM Corp
* Copyright (C) 1999-2005 - Karim Yaghmour (karim@opersys.com)
*
* This file is released under the GPL.
*/
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>
#include <linux/relayfs_fs.h>
#include "relay.h"
#include "buffers.h"
/*
* close() vm_op implementation for relayfs file mapping.
*/
static void relay_file_mmap_close(struct vm_area_struct *vma)
{
struct rchan_buf *buf = vma->vm_private_data;
buf->chan->cb->buf_unmapped(buf, vma->vm_file);
}
/*
* nopage() vm_op implementation for relayfs file mapping.
*/
static struct page *relay_buf_nopage(struct vm_area_struct *vma,
unsigned long address,
int *type)
{
struct page *page;
struct rchan_buf *buf = vma->vm_private_data;
unsigned long offset = address - vma->vm_start;
if (address > vma->vm_end)
return NOPAGE_SIGBUS; /* Disallow mremap */
if (!buf)
return NOPAGE_OOM;
page = vmalloc_to_page(buf->start + offset);
if (!page)
return NOPAGE_OOM;
get_page(page);
if (type)
*type = VM_FAULT_MINOR;
return page;
}
/*
* vm_ops for relay file mappings.
*/
static struct vm_operations_struct relay_file_mmap_ops = {
.nopage = relay_buf_nopage,
.close = relay_file_mmap_close,
};
/**
* relay_mmap_buf: - mmap channel buffer to process address space
* @buf: relay channel buffer
* @vma: vm_area_struct describing memory to be mapped
*
* Returns 0 if ok, negative on error
*
* Caller should already have grabbed mmap_sem.
*/
int relay_mmap_buf(struct rchan_buf *buf, struct vm_area_struct *vma)
{
unsigned long length = vma->vm_end - vma->vm_start;
struct file *filp = vma->vm_file;
if (!buf)
return -EBADF;
if (length != (unsigned long)buf->chan->alloc_size)
return -EINVAL;
vma->vm_ops = &relay_file_mmap_ops;
vma->vm_private_data = buf;
buf->chan->cb->buf_mapped(buf, filp);
return 0;
}
/**
* relay_alloc_buf - allocate a channel buffer
* @buf: the buffer struct
* @size: total size of the buffer
*
* Returns a pointer to the resulting buffer, NULL if unsuccessful
*/
static void *relay_alloc_buf(struct rchan_buf *buf, unsigned long size)
{
void *mem;
unsigned int i, j, n_pages;
size = PAGE_ALIGN(size);
n_pages = size >> PAGE_SHIFT;
buf->page_array = kcalloc(n_pages, sizeof(struct page *), GFP_KERNEL);
if (!buf->page_array)
return NULL;
for (i = 0; i < n_pages; i++) {
buf->page_array[i] = alloc_page(GFP_KERNEL);
if (unlikely(!buf->page_array[i]))
goto depopulate;
}
mem = vmap(buf->page_array, n_pages, GFP_KERNEL, PAGE_KERNEL);
if (!mem)
goto depopulate;
memset(mem, 0, size);
buf->page_count = n_pages;
return mem;
depopulate:
for (j = 0; j < i; j++)
__free_page(buf->page_array[j]);
kfree(buf->page_array);
return NULL;
}
/**
* relay_create_buf - allocate and initialize a channel buffer
* @alloc_size: size of the buffer to allocate
* @n_subbufs: number of sub-buffers in the channel
*
* Returns channel buffer if successful, NULL otherwise
*/
struct rchan_buf *relay_create_buf(struct rchan *chan)
{
struct rchan_buf *buf = kcalloc(1, sizeof(struct rchan_buf), GFP_KERNEL);
if (!buf)
return NULL;
buf->padding = kmalloc(chan->n_subbufs * sizeof(size_t *), GFP_KERNEL);
if (!buf->padding)
goto free_buf;
buf->start = relay_alloc_buf(buf, chan->alloc_size);
if (!buf->start)
goto free_buf;
buf->chan = chan;
kref_get(&buf->chan->kref);
return buf;
free_buf:
kfree(buf->padding);
kfree(buf);
return NULL;
}
/**
* relay_destroy_buf - destroy an rchan_buf struct and associated buffer
* @buf: the buffer struct
*/
void relay_destroy_buf(struct rchan_buf *buf)
{
struct rchan *chan = buf->chan;
unsigned int i;
if (likely(buf->start)) {
vunmap(buf->start);
for (i = 0; i < buf->page_count; i++)
__free_page(buf->page_array[i]);
kfree(buf->page_array);
}
kfree(buf->padding);
kfree(buf);
kref_put(&chan->kref, relay_destroy_channel);
}
/**
* relay_remove_buf - remove a channel buffer
*
* Removes the file from the relayfs fileystem, which also frees the
* rchan_buf_struct and the channel buffer. Should only be called from
* kref_put().
*/
void relay_remove_buf(struct kref *kref)
{
struct rchan_buf *buf = container_of(kref, struct rchan_buf, kref);
relayfs_remove(buf->dentry);
}
#ifndef _BUFFERS_H
#define _BUFFERS_H
/* This inspired by rtai/shmem */
#define FIX_SIZE(x) (((x) - 1) & PAGE_MASK) + PAGE_SIZE
extern int relay_mmap_buf(struct rchan_buf *buf, struct vm_area_struct *vma);
extern struct rchan_buf *relay_create_buf(struct rchan *chan);
extern void relay_destroy_buf(struct rchan_buf *buf);
extern void relay_remove_buf(struct kref *kref);
#endif/* _BUFFERS_H */
/*
* VFS-related code for RelayFS, a high-speed data relay filesystem.
*
* Copyright (C) 2003-2005 - Tom Zanussi <zanussi@us.ibm.com>, IBM Corp
* Copyright (C) 2003-2005 - Karim Yaghmour <karim@opersys.com>
*
* Based on ramfs, Copyright (C) 2002 - Linus Torvalds
*
* This file is released under the GPL.
*/
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/pagemap.h>
#include <linux/init.h>
#include <linux/string.h>
#include <linux/backing-dev.h>
#include <linux/namei.h>
#include <linux/poll.h>
#include <linux/relayfs_fs.h>
#include "relay.h"
#include "buffers.h"
#define RELAYFS_MAGIC 0xF0B4A981
static struct vfsmount * relayfs_mount;
static int relayfs_mount_count;
static kmem_cache_t * relayfs_inode_cachep;
static struct backing_dev_info relayfs_backing_dev_info = {
.ra_pages = 0, /* No readahead */
.capabilities = BDI_CAP_NO_ACCT_DIRTY | BDI_CAP_NO_WRITEBACK,
};
static struct inode *relayfs_get_inode(struct super_block *sb, int mode,
struct rchan *chan)
{
struct rchan_buf *buf = NULL;
struct inode *inode;
if (S_ISREG(mode)) {
BUG_ON(!chan);
buf = relay_create_buf(chan);
if (!buf)
return NULL;
}
inode = new_inode(sb);
if (!inode) {
relay_destroy_buf(buf);
return NULL;
}
inode->i_mode = mode;
inode->i_uid = 0;
inode->i_gid = 0;
inode->i_blksize = PAGE_CACHE_SIZE;
inode->i_blocks = 0;
inode->i_mapping->backing_dev_info = &relayfs_backing_dev_info;
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
switch (mode & S_IFMT) {
case S_IFREG:
inode->i_fop = &relayfs_file_operations;
RELAYFS_I(inode)->buf = buf;
break;
case S_IFDIR:
inode->i_op = &simple_dir_inode_operations;
inode->i_fop = &simple_dir_operations;
/* directory inodes start off with i_nlink == 2 (for "." entry) */
inode->i_nlink++;
break;
default:
break;
}
return inode;
}
/**
* relayfs_create_entry - create a relayfs directory or file
* @name: the name of the file to create
* @parent: parent directory
* @mode: mode
* @chan: relay channel associated with the file
*
* Returns the new dentry, NULL on failure
*
* Creates a file or directory with the specifed permissions.
*/
static struct dentry *relayfs_create_entry(const char *name,
struct dentry *parent,
int mode,
struct rchan *chan)
{
struct dentry *d;
struct inode *inode;
int error = 0;
BUG_ON(!name || !(S_ISREG(mode) || S_ISDIR(mode)));
error = simple_pin_fs("relayfs", &relayfs_mount, &relayfs_mount_count);
if (error) {
printk(KERN_ERR "Couldn't mount relayfs: errcode %d\n", error);
return NULL;
}
if (!parent && relayfs_mount && relayfs_mount->mnt_sb)
parent = relayfs_mount->mnt_sb->s_root;
if (!parent) {
simple_release_fs(&relayfs_mount, &relayfs_mount_count);
return NULL;
}
parent = dget(parent);
down(&parent->d_inode->i_sem);
d = lookup_one_len(name, parent, strlen(name));
if (IS_ERR(d)) {
d = NULL;
goto release_mount;
}
if (d->d_inode) {
d = NULL;
goto release_mount;
}
inode = relayfs_get_inode(parent->d_inode->i_sb, mode, chan);
if (!inode) {
d = NULL;
goto release_mount;
}
d_instantiate(d, inode);
dget(d); /* Extra count - pin the dentry in core */
if (S_ISDIR(mode))
parent->d_inode->i_nlink++;
goto exit;
release_mount:
simple_release_fs(&relayfs_mount, &relayfs_mount_count);
exit:
up(&parent->d_inode->i_sem);
dput(parent);
return d;
}
/**
* relayfs_create_file - create a file in the relay filesystem
* @name: the name of the file to create
* @parent: parent directory
* @mode: mode, if not specied the default perms are used
* @chan: channel associated with the file
*
* Returns file dentry if successful, NULL otherwise.
*
* The file will be created user r on behalf of current user.
*/
struct dentry *relayfs_create_file(const char *name, struct dentry *parent,
int mode, struct rchan *chan)
{
if (!mode)
mode = S_IRUSR;
mode = (mode & S_IALLUGO) | S_IFREG;
return relayfs_create_entry(name, parent, mode, chan);
}
/**
* relayfs_create_dir - create a directory in the relay filesystem
* @name: the name of the directory to create
* @parent: parent directory, NULL if parent should be fs root
*
* Returns directory dentry if successful, NULL otherwise.
*
* The directory will be created world rwx on behalf of current user.
*/
struct dentry *relayfs_create_dir(const char *name, struct dentry *parent)
{
int mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
return relayfs_create_entry(name, parent, mode, NULL);
}
/**
* relayfs_remove - remove a file or directory in the relay filesystem
* @dentry: file or directory dentry
*
* Returns 0 if successful, negative otherwise.
*/
int relayfs_remove(struct dentry *dentry)
{
struct dentry *parent;
int error = 0;
if (!dentry)
return -EINVAL;
parent = dentry->d_parent;
if (!parent)
return -EINVAL;
parent = dget(parent);
down(&parent->d_inode->i_sem);
if (dentry->d_inode) {
if (S_ISDIR(dentry->d_inode->i_mode))
error = simple_rmdir(parent->d_inode, dentry);
else
error = simple_unlink(parent->d_inode, dentry);
if (!error)
d_delete(dentry);
}
if (!error)
dput(dentry);
up(&parent->d_inode->i_sem);
dput(parent);
if (!error)
simple_release_fs(&relayfs_mount, &relayfs_mount_count);
return error;
}
/**
* relayfs_remove_dir - remove a directory in the relay filesystem
* @dentry: directory dentry
*
* Returns 0 if successful, negative otherwise.
*/
int relayfs_remove_dir(struct dentry *dentry)
{
return relayfs_remove(dentry);
}
/**
* relayfs_open - open file op for relayfs files
* @inode: the inode
* @filp: the file
*
* Increments the channel buffer refcount.
*/
static int relayfs_open(struct inode *inode, struct file *filp)
{
struct rchan_buf *buf = RELAYFS_I(inode)->buf;
kref_get(&buf->kref);
return 0;
}
/**
* relayfs_mmap - mmap file op for relayfs files
* @filp: the file
* @vma: the vma describing what to map
*
* Calls upon relay_mmap_buf to map the file into user space.
*/
static int relayfs_mmap(struct file *filp, struct vm_area_struct *vma)
{
struct inode *inode = filp->f_dentry->d_inode;
return relay_mmap_buf(RELAYFS_I(inode)->buf, vma);
}
/**
* relayfs_poll - poll file op for relayfs files
* @filp: the file
* @wait: poll table
*
* Poll implemention.
*/
static unsigned int relayfs_poll(struct file *filp, poll_table *wait)
{
unsigned int mask = 0;
struct inode *inode = filp->f_dentry->d_inode;
struct rchan_buf *buf = RELAYFS_I(inode)->buf;
if (buf->finalized)
return POLLERR;
if (filp->f_mode & FMODE_READ) {
poll_wait(filp, &buf->read_wait, wait);
if (!relay_buf_empty(buf))
mask |= POLLIN | POLLRDNORM;
}
return mask;
}
/**
* relayfs_release - release file op for relayfs files
* @inode: the inode
* @filp: the file
*
* Decrements the channel refcount, as the filesystem is
* no longer using it.
*/
static int relayfs_release(struct inode *inode, struct file *filp)
{
struct rchan_buf *buf = RELAYFS_I(inode)->buf;
kref_put(&buf->kref, relay_remove_buf);
return 0;
}
/**
* relayfs_read_consume - update the consumed count for the buffer
*/
static void relayfs_read_consume(struct rchan_buf *buf,
size_t read_pos,
size_t bytes_consumed)
{
size_t subbuf_size = buf->chan->subbuf_size;
size_t n_subbufs = buf->chan->n_subbufs;
size_t read_subbuf;
if (buf->bytes_consumed + bytes_consumed > subbuf_size) {
relay_subbufs_consumed(buf->chan, buf->cpu, 1);
buf->bytes_consumed = 0;
}
buf->bytes_consumed += bytes_consumed;
read_subbuf = read_pos / buf->chan->subbuf_size;
if (buf->bytes_consumed + buf->padding[read_subbuf] == subbuf_size) {
if ((read_subbuf == buf->subbufs_produced % n_subbufs) &&
(buf->offset == subbuf_size))
return;
relay_subbufs_consumed(buf->chan, buf->cpu, 1);
buf->bytes_consumed = 0;
}
}
/**
* relayfs_read_avail - boolean, are there unconsumed bytes available?
*/
static int relayfs_read_avail(struct rchan_buf *buf, size_t read_pos)
{
size_t bytes_produced, bytes_consumed, write_offset;
size_t subbuf_size = buf->chan->subbuf_size;
size_t n_subbufs = buf->chan->n_subbufs;
size_t produced = buf->subbufs_produced % n_subbufs;
size_t consumed = buf->subbufs_consumed % n_subbufs;
write_offset = buf->offset > subbuf_size ? subbuf_size : buf->offset;
if (consumed > produced) {
if ((produced > n_subbufs) &&
(produced + n_subbufs - consumed <= n_subbufs))
produced += n_subbufs;
} else if (consumed == produced) {
if (buf->offset > subbuf_size) {
produced += n_subbufs;
if (buf->subbufs_produced == buf->subbufs_consumed)
consumed += n_subbufs;
}
}
if (buf->offset > subbuf_size)
bytes_produced = (produced - 1) * subbuf_size + write_offset;
else
bytes_produced = produced * subbuf_size + write_offset;
bytes_consumed = consumed * subbuf_size + buf->bytes_consumed;
if (bytes_produced == bytes_consumed)
return 0;
relayfs_read_consume(buf, read_pos, 0);
return 1;
}
/**
* relayfs_read_subbuf_avail - return bytes available in sub-buffer
*/
static size_t relayfs_read_subbuf_avail(size_t read_pos,
struct rchan_buf *buf)
{
size_t padding, avail = 0;
size_t read_subbuf, read_offset, write_subbuf, write_offset;
size_t subbuf_size = buf->chan->subbuf_size;
write_subbuf = (buf->data - buf->start) / subbuf_size;
write_offset = buf->offset > subbuf_size ? subbuf_size : buf->offset;
read_subbuf = read_pos / subbuf_size;
read_offset = read_pos % subbuf_size;
padding = buf->padding[read_subbuf];
if (read_subbuf == write_subbuf) {
if (read_offset + padding < write_offset)
avail = write_offset - (read_offset + padding);
} else
avail = (subbuf_size - padding) - read_offset;
return avail;
}
/**
* relayfs_read_start_pos - find the first available byte to read
*
* If the read_pos is in the middle of padding, return the
* position of the first actually available byte, otherwise
* return the original value.
*/
static size_t relayfs_read_start_pos(size_t read_pos,
struct rchan_buf *buf)
{
size_t read_subbuf, padding, padding_start, padding_end;
size_t subbuf_size = buf->chan->subbuf_size;
size_t n_subbufs = buf->chan->n_subbufs;
read_subbuf = read_pos / subbuf_size;
padding = buf->padding[read_subbuf];
padding_start = (read_subbuf + 1) * subbuf_size - padding;
padding_end = (read_subbuf + 1) * subbuf_size;
if (read_pos >= padding_start && read_pos < padding_end) {
read_subbuf = (read_subbuf + 1) % n_subbufs;
read_pos = read_subbuf * subbuf_size;
}
return read_pos;
}
/**
* relayfs_read_end_pos - return the new read position
*/
static size_t relayfs_read_end_pos(struct rchan_buf *buf,
size_t read_pos,
size_t count)
{
size_t read_subbuf, padding, end_pos;
size_t subbuf_size = buf->chan->subbuf_size;
size_t n_subbufs = buf->chan->n_subbufs;
read_subbuf = read_pos / subbuf_size;
padding = buf->padding[read_subbuf];
if (read_pos % subbuf_size + count + padding == subbuf_size)
end_pos = (read_subbuf + 1) * subbuf_size;
else
end_pos = read_pos + count;
if (end_pos >= subbuf_size * n_subbufs)
end_pos = 0;
return end_pos;
}
/**
* relayfs_read - read file op for relayfs files
* @filp: the file
* @buffer: the userspace buffer
* @count: number of bytes to read
* @ppos: position to read from
*
* Reads count bytes or the number of bytes available in the
* current sub-buffer being read, whichever is smaller.
*/
static ssize_t relayfs_read(struct file *filp,
char __user *buffer,
size_t count,
loff_t *ppos)
{
struct inode *inode = filp->f_dentry->d_inode;
struct rchan_buf *buf = RELAYFS_I(inode)->buf;
size_t read_start, avail;
ssize_t ret = 0;
void *from;
down(&inode->i_sem);
if(!relayfs_read_avail(buf, *ppos))
goto out;
read_start = relayfs_read_start_pos(*ppos, buf);
avail = relayfs_read_subbuf_avail(read_start, buf);
if (!avail)
goto out;
from = buf->start + read_start;
ret = count = min(count, avail);
if (copy_to_user(buffer, from, count)) {
ret = -EFAULT;
goto out;
}
relayfs_read_consume(buf, read_start, count);
*ppos = relayfs_read_end_pos(buf, read_start, count);
out:
up(&inode->i_sem);
return ret;
}
/**
* relayfs alloc_inode() implementation
*/
static struct inode *relayfs_alloc_inode(struct super_block *sb)
{
struct relayfs_inode_info *p = kmem_cache_alloc(relayfs_inode_cachep, SLAB_KERNEL);
if (!p)
return NULL;
p->buf = NULL;
return &p->vfs_inode;
}
/**
* relayfs destroy_inode() implementation
*/
static void relayfs_destroy_inode(struct inode *inode)
{
if (RELAYFS_I(inode)->buf)
relay_destroy_buf(RELAYFS_I(inode)->buf);
kmem_cache_free(relayfs_inode_cachep, RELAYFS_I(inode));
}
static void init_once(void *p, kmem_cache_t *cachep, unsigned long flags)
{
struct relayfs_inode_info *i = p;
if ((flags & (SLAB_CTOR_VERIFY | SLAB_CTOR_CONSTRUCTOR)) == SLAB_CTOR_CONSTRUCTOR)
inode_init_once(&i->vfs_inode);
}
struct file_operations relayfs_file_operations = {
.open = relayfs_open,
.poll = relayfs_poll,
.mmap = relayfs_mmap,
.read = relayfs_read,
.llseek = no_llseek,
.release = relayfs_release,
};
static struct super_operations relayfs_ops = {
.statfs = simple_statfs,
.drop_inode = generic_delete_inode,
.alloc_inode = relayfs_alloc_inode,
.destroy_inode = relayfs_destroy_inode,
};
static int relayfs_fill_super(struct super_block * sb, void * data, int silent)
{
struct inode *inode;
struct dentry *root;
int mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
sb->s_blocksize = PAGE_CACHE_SIZE;
sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
sb->s_magic = RELAYFS_MAGIC;
sb->s_op = &relayfs_ops;
inode = relayfs_get_inode(sb, mode, NULL);
if (!inode)
return -ENOMEM;
root = d_alloc_root(inode);
if (!root) {
iput(inode);
return -ENOMEM;
}
sb->s_root = root;
return 0;
}
static struct super_block * relayfs_get_sb(struct file_system_type *fs_type,
int flags, const char *dev_name,
void *data)
{
return get_sb_single(fs_type, flags, data, relayfs_fill_super);
}
static struct file_system_type relayfs_fs_type = {
.owner = THIS_MODULE,
.name = "relayfs",
.get_sb = relayfs_get_sb,
.kill_sb = kill_litter_super,
};
static int __init init_relayfs_fs(void)
{
int err;
relayfs_inode_cachep = kmem_cache_create("relayfs_inode_cache",
sizeof(struct relayfs_inode_info), 0,
0, init_once, NULL);
if (!relayfs_inode_cachep)
return -ENOMEM;
err = register_filesystem(&relayfs_fs_type);
if (err)
kmem_cache_destroy(relayfs_inode_cachep);
return err;
}
static void __exit exit_relayfs_fs(void)
{
unregister_filesystem(&relayfs_fs_type);
kmem_cache_destroy(relayfs_inode_cachep);
}
module_init(init_relayfs_fs)
module_exit(exit_relayfs_fs)
EXPORT_SYMBOL_GPL(relayfs_file_operations);
EXPORT_SYMBOL_GPL(relayfs_create_dir);
EXPORT_SYMBOL_GPL(relayfs_remove_dir);
MODULE_AUTHOR("Tom Zanussi <zanussi@us.ibm.com> and Karim Yaghmour <karim@opersys.com>");
MODULE_DESCRIPTION("Relay Filesystem");
MODULE_LICENSE("GPL");
/*
* Public API and common code for RelayFS.
*
* See Documentation/filesystems/relayfs.txt for an overview of relayfs.
*
* Copyright (C) 2002-2005 - Tom Zanussi (zanussi@us.ibm.com), IBM Corp
* Copyright (C) 1999-2005 - Karim Yaghmour (karim@opersys.com)
*
* This file is released under the GPL.
*/
#include <linux/errno.h>
#include <linux/stddef.h>
#include <linux/slab.h>
#include <linux/module.h>
#include <linux/string.h>
#include <linux/relayfs_fs.h>
#include "relay.h"
#include "buffers.h"
/**
* relay_buf_empty - boolean, is the channel buffer empty?
* @buf: channel buffer
*
* Returns 1 if the buffer is empty, 0 otherwise.
*/
int relay_buf_empty(struct rchan_buf *buf)
{
return (buf->subbufs_produced - buf->subbufs_consumed) ? 0 : 1;
}
/**
* relay_buf_full - boolean, is the channel buffer full?
* @buf: channel buffer
*
* Returns 1 if the buffer is full, 0 otherwise.
*/
int relay_buf_full(struct rchan_buf *buf)
{
size_t ready = buf->subbufs_produced - buf->subbufs_consumed;
return (ready >= buf->chan->n_subbufs) ? 1 : 0;
}
/*
* High-level relayfs kernel API and associated functions.
*/
/*
* rchan_callback implementations defining default channel behavior. Used
* in place of corresponding NULL values in client callback struct.
*/
/*
* subbuf_start() default callback. Does nothing.
*/
static int subbuf_start_default_callback (struct rchan_buf *buf,
void *subbuf,
void *prev_subbuf,
size_t prev_padding)
{
if (relay_buf_full(buf))
return 0;
return 1;
}
/*
* buf_mapped() default callback. Does nothing.
*/
static void buf_mapped_default_callback(struct rchan_buf *buf,
struct file *filp)
{
}
/*
* buf_unmapped() default callback. Does nothing.
*/
static void buf_unmapped_default_callback(struct rchan_buf *buf,
struct file *filp)
{
}
/* relay channel default callbacks */
static struct rchan_callbacks default_channel_callbacks = {
.subbuf_start = subbuf_start_default_callback,
.buf_mapped = buf_mapped_default_callback,
.buf_unmapped = buf_unmapped_default_callback,
};
/**
* wakeup_readers - wake up readers waiting on a channel
* @private: the channel buffer
*
* This is the work function used to defer reader waking. The
* reason waking is deferred is that calling directly from write
* causes problems if you're writing from say the scheduler.
*/
static void wakeup_readers(void *private)
{
struct rchan_buf *buf = private;
wake_up_interruptible(&buf->read_wait);
}
/**
* __relay_reset - reset a channel buffer
* @buf: the channel buffer
* @init: 1 if this is a first-time initialization
*
* See relay_reset for description of effect.
*/
static inline void __relay_reset(struct rchan_buf *buf, unsigned int init)
{
size_t i;
if (init) {
init_waitqueue_head(&buf->read_wait);
kref_init(&buf->kref);
INIT_WORK(&buf->wake_readers, NULL, NULL);
} else {
cancel_delayed_work(&buf->wake_readers);
flush_scheduled_work();
}
buf->subbufs_produced = 0;
buf->subbufs_consumed = 0;
buf->bytes_consumed = 0;
buf->finalized = 0;
buf->data = buf->start;
buf->offset = 0;
for (i = 0; i < buf->chan->n_subbufs; i++)
buf->padding[i] = 0;
buf->chan->cb->subbuf_start(buf, buf->data, NULL, 0);
}
/**
* relay_reset - reset the channel
* @chan: the channel
*
* This has the effect of erasing all data from all channel buffers
* and restarting the channel in its initial state. The buffers
* are not freed, so any mappings are still in effect.
*
* NOTE: Care should be taken that the channel isn't actually
* being used by anything when this call is made.
*/
void relay_reset(struct rchan *chan)
{
unsigned int i;
if (!chan)
return;
for (i = 0; i < NR_CPUS; i++) {
if (!chan->buf[i])
continue;
__relay_reset(chan->buf[i], 0);
}
}
/**
* relay_open_buf - create a new channel buffer in relayfs
*
* Internal - used by relay_open().
*/
static struct rchan_buf *relay_open_buf(struct rchan *chan,
const char *filename,
struct dentry *parent)
{
struct rchan_buf *buf;
struct dentry *dentry;
/* Create file in fs */
dentry = relayfs_create_file(filename, parent, S_IRUSR, chan);
if (!dentry)
return NULL;
buf = RELAYFS_I(dentry->d_inode)->buf;
buf->dentry = dentry;
__relay_reset(buf, 1);
return buf;
}
/**
* relay_close_buf - close a channel buffer
* @buf: channel buffer
*
* Marks the buffer finalized and restores the default callbacks.
* The channel buffer and channel buffer data structure are then freed
* automatically when the last reference is given up.
*/
static inline void relay_close_buf(struct rchan_buf *buf)
{
buf->finalized = 1;
buf->chan->cb = &default_channel_callbacks;
cancel_delayed_work(&buf->wake_readers);
flush_scheduled_work();
kref_put(&buf->kref, relay_remove_buf);
}
static inline void setup_callbacks(struct rchan *chan,
struct rchan_callbacks *cb)
{
if (!cb) {
chan->cb = &default_channel_callbacks;
return;
}
if (!cb->subbuf_start)
cb->subbuf_start = subbuf_start_default_callback;
if (!cb->buf_mapped)
cb->buf_mapped = buf_mapped_default_callback;
if (!cb->buf_unmapped)
cb->buf_unmapped = buf_unmapped_default_callback;
chan->cb = cb;
}
/**
* relay_open - create a new relayfs channel
* @base_filename: base name of files to create
* @parent: dentry of parent directory, NULL for root directory
* @subbuf_size: size of sub-buffers
* @n_subbufs: number of sub-buffers
* @cb: client callback functions
*
* Returns channel pointer if successful, NULL otherwise.
*
* Creates a channel buffer for each cpu using the sizes and
* attributes specified. The created channel buffer files
* will be named base_filename0...base_filenameN-1. File
* permissions will be S_IRUSR.
*/
struct rchan *relay_open(const char *base_filename,
struct dentry *parent,
size_t subbuf_size,
size_t n_subbufs,
struct rchan_callbacks *cb)
{
unsigned int i;
struct rchan *chan;
char *tmpname;
if (!base_filename)
return NULL;
if (!(subbuf_size && n_subbufs))
return NULL;
chan = kcalloc(1, sizeof(struct rchan), GFP_KERNEL);
if (!chan)
return NULL;
chan->version = RELAYFS_CHANNEL_VERSION;
chan->n_subbufs = n_subbufs;
chan->subbuf_size = subbuf_size;
chan->alloc_size = FIX_SIZE(subbuf_size * n_subbufs);
setup_callbacks(chan, cb);
kref_init(&chan->kref);
tmpname = kmalloc(NAME_MAX + 1, GFP_KERNEL);
if (!tmpname)
goto free_chan;
for_each_online_cpu(i) {
sprintf(tmpname, "%s%d", base_filename, i);
chan->buf[i] = relay_open_buf(chan, tmpname, parent);
chan->buf[i]->cpu = i;
if (!chan->buf[i])
goto free_bufs;
}
kfree(tmpname);
return chan;
free_bufs:
for (i = 0; i < NR_CPUS; i++) {
if (!chan->buf[i])
break;
relay_close_buf(chan->buf[i]);
}
kfree(tmpname);
free_chan:
kref_put(&chan->kref, relay_destroy_channel);
return NULL;
}
/**
* relay_switch_subbuf - switch to a new sub-buffer
* @buf: channel buffer
* @length: size of current event
*
* Returns either the length passed in or 0 if full.
* Performs sub-buffer-switch tasks such as invoking callbacks,
* updating padding counts, waking up readers, etc.
*/
size_t relay_switch_subbuf(struct rchan_buf *buf, size_t length)
{
void *old, *new;
size_t old_subbuf, new_subbuf;
if (unlikely(length > buf->chan->subbuf_size))
goto toobig;
if (buf->offset != buf->chan->subbuf_size + 1) {
buf->prev_padding = buf->chan->subbuf_size - buf->offset;
old_subbuf = buf->subbufs_produced % buf->chan->n_subbufs;
buf->padding[old_subbuf] = buf->prev_padding;
buf->subbufs_produced++;
if (waitqueue_active(&buf->read_wait)) {
PREPARE_WORK(&buf->wake_readers, wakeup_readers, buf);
schedule_delayed_work(&buf->wake_readers, 1);
}
}
old = buf->data;
new_subbuf = buf->subbufs_produced % buf->chan->n_subbufs;
new = buf->start + new_subbuf * buf->chan->subbuf_size;
buf->offset = 0;
if (!buf->chan->cb->subbuf_start(buf, new, old, buf->prev_padding)) {
buf->offset = buf->chan->subbuf_size + 1;
return 0;
}
buf->data = new;
buf->padding[new_subbuf] = 0;
if (unlikely(length + buf->offset > buf->chan->subbuf_size))
goto toobig;
return length;
toobig:
printk(KERN_WARNING "relayfs: event too large (%Zd)\n", length);
WARN_ON(1);
return 0;
}
/**
* relay_subbufs_consumed - update the buffer's sub-buffers-consumed count
* @chan: the channel
* @cpu: the cpu associated with the channel buffer to update
* @subbufs_consumed: number of sub-buffers to add to current buf's count
*
* Adds to the channel buffer's consumed sub-buffer count.
* subbufs_consumed should be the number of sub-buffers newly consumed,
* not the total consumed.
*
* NOTE: kernel clients don't need to call this function if the channel
* mode is 'overwrite'.
*/
void relay_subbufs_consumed(struct rchan *chan,
unsigned int cpu,
size_t subbufs_consumed)
{
struct rchan_buf *buf;
if (!chan)
return;
if (cpu >= NR_CPUS || !chan->buf[cpu])
return;
buf = chan->buf[cpu];
buf->subbufs_consumed += subbufs_consumed;
if (buf->subbufs_consumed > buf->subbufs_produced)
buf->subbufs_consumed = buf->subbufs_produced;
}
/**
* relay_destroy_channel - free the channel struct
*
* Should only be called from kref_put().
*/
void relay_destroy_channel(struct kref *kref)
{
struct rchan *chan = container_of(kref, struct rchan, kref);
kfree(chan);
}
/**
* relay_close - close the channel
* @chan: the channel
*
* Closes all channel buffers and frees the channel.
*/
void relay_close(struct rchan *chan)
{
unsigned int i;
if (!chan)
return;
for (i = 0; i < NR_CPUS; i++) {
if (!chan->buf[i])
continue;
relay_close_buf(chan->buf[i]);
}
kref_put(&chan->kref, relay_destroy_channel);
}
/**
* relay_flush - close the channel
* @chan: the channel
*
* Flushes all channel buffers i.e. forces buffer switch.
*/
void relay_flush(struct rchan *chan)
{
unsigned int i;
if (!chan)
return;
for (i = 0; i < NR_CPUS; i++) {
if (!chan->buf[i])
continue;
relay_switch_subbuf(chan->buf[i], 0);
}
}
EXPORT_SYMBOL_GPL(relay_open);
EXPORT_SYMBOL_GPL(relay_close);
EXPORT_SYMBOL_GPL(relay_flush);
EXPORT_SYMBOL_GPL(relay_reset);
EXPORT_SYMBOL_GPL(relay_subbufs_consumed);
EXPORT_SYMBOL_GPL(relay_switch_subbuf);
EXPORT_SYMBOL_GPL(relay_buf_full);
#ifndef _RELAY_H
#define _RELAY_H
struct dentry *relayfs_create_file(const char *name,
struct dentry *parent,
int mode,
struct rchan *chan);
extern int relayfs_remove(struct dentry *dentry);
extern int relay_buf_empty(struct rchan_buf *buf);
extern void relay_destroy_channel(struct kref *kref);
#endif /* _RELAY_H */
/*
* linux/include/linux/relayfs_fs.h
*
* Copyright (C) 2002, 2003 - Tom Zanussi (zanussi@us.ibm.com), IBM Corp
* Copyright (C) 1999, 2000, 2001, 2002 - Karim Yaghmour (karim@opersys.com)
*
* RelayFS definitions and declarations
*/
#ifndef _LINUX_RELAYFS_FS_H
#define _LINUX_RELAYFS_FS_H
#include <linux/config.h>
#include <linux/types.h>
#include <linux/sched.h>
#include <linux/wait.h>
#include <linux/list.h>
#include <linux/fs.h>
#include <linux/poll.h>
#include <linux/kref.h>
/*
* Tracks changes to rchan_buf struct
*/
#define RELAYFS_CHANNEL_VERSION 5
/*
* Per-cpu relay channel buffer
*/
struct rchan_buf
{
void *start; /* start of channel buffer */
void *data; /* start of current sub-buffer */
size_t offset; /* current offset into sub-buffer */
size_t subbufs_produced; /* count of sub-buffers produced */
size_t subbufs_consumed; /* count of sub-buffers consumed */
struct rchan *chan; /* associated channel */
wait_queue_head_t read_wait; /* reader wait queue */
struct work_struct wake_readers; /* reader wake-up work struct */
struct dentry *dentry; /* channel file dentry */
struct kref kref; /* channel buffer refcount */
struct page **page_array; /* array of current buffer pages */
unsigned int page_count; /* number of current buffer pages */
unsigned int finalized; /* buffer has been finalized */
size_t *padding; /* padding counts per sub-buffer */
size_t prev_padding; /* temporary variable */
size_t bytes_consumed; /* bytes consumed in cur read subbuf */
unsigned int cpu; /* this buf's cpu */
} ____cacheline_aligned;
/*
* Relay channel data structure
*/
struct rchan
{
u32 version; /* the version of this struct */
size_t subbuf_size; /* sub-buffer size */
size_t n_subbufs; /* number of sub-buffers per buffer */
size_t alloc_size; /* total buffer size allocated */
struct rchan_callbacks *cb; /* client callbacks */
struct kref kref; /* channel refcount */
void *private_data; /* for user-defined data */
struct rchan_buf *buf[NR_CPUS]; /* per-cpu channel buffers */
};
/*
* Relayfs inode
*/
struct relayfs_inode_info
{
struct inode vfs_inode;
struct rchan_buf *buf;
};
static inline struct relayfs_inode_info *RELAYFS_I(struct inode *inode)
{
return container_of(inode, struct relayfs_inode_info, vfs_inode);
}
/*
* Relay channel client callbacks
*/
struct rchan_callbacks
{
/*
* subbuf_start - called on buffer-switch to a new sub-buffer
* @buf: the channel buffer containing the new sub-buffer
* @subbuf: the start of the new sub-buffer
* @prev_subbuf: the start of the previous sub-buffer
* @prev_padding: unused space at the end of previous sub-buffer
*
* The client should return 1 to continue logging, 0 to stop
* logging.
*
* NOTE: subbuf_start will also be invoked when the buffer is
* created, so that the first sub-buffer can be initialized
* if necessary. In this case, prev_subbuf will be NULL.
*
* NOTE: the client can reserve bytes at the beginning of the new
* sub-buffer by calling subbuf_start_reserve() in this callback.
*/
int (*subbuf_start) (struct rchan_buf *buf,
void *subbuf,
void *prev_subbuf,
size_t prev_padding);
/*
* buf_mapped - relayfs buffer mmap notification
* @buf: the channel buffer
* @filp: relayfs file pointer
*
* Called when a relayfs file is successfully mmapped
*/
void (*buf_mapped)(struct rchan_buf *buf,
struct file *filp);
/*
* buf_unmapped - relayfs buffer unmap notification
* @buf: the channel buffer
* @filp: relayfs file pointer
*
* Called when a relayfs file is successfully unmapped
*/
void (*buf_unmapped)(struct rchan_buf *buf,
struct file *filp);
};
/*
* relayfs kernel API, fs/relayfs/relay.c
*/
struct rchan *relay_open(const char *base_filename,
struct dentry *parent,
size_t subbuf_size,
size_t n_subbufs,
struct rchan_callbacks *cb);
extern void relay_close(struct rchan *chan);
extern void relay_flush(struct rchan *chan);
extern void relay_subbufs_consumed(struct rchan *chan,
unsigned int cpu,
size_t consumed);
extern void relay_reset(struct rchan *chan);
extern int relay_buf_full(struct rchan_buf *buf);
extern size_t relay_switch_subbuf(struct rchan_buf *buf,
size_t length);
extern struct dentry *relayfs_create_dir(const char *name,
struct dentry *parent);
extern int relayfs_remove_dir(struct dentry *dentry);
/**
* relay_write - write data into the channel
* @chan: relay channel
* @data: data to be written
* @length: number of bytes to write
*
* Writes data into the current cpu's channel buffer.
*
* Protects the buffer by disabling interrupts. Use this
* if you might be logging from interrupt context. Try
* __relay_write() if you know you won't be logging from
* interrupt context.
*/
static inline void relay_write(struct rchan *chan,
const void *data,
size_t length)
{
unsigned long flags;
struct rchan_buf *buf;
local_irq_save(flags);
buf = chan->buf[smp_processor_id()];
if (unlikely(buf->offset + length > chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
local_irq_restore(flags);
}
/**
* __relay_write - write data into the channel
* @chan: relay channel
* @data: data to be written
* @length: number of bytes to write
*
* Writes data into the current cpu's channel buffer.
*
* Protects the buffer by disabling preemption. Use
* relay_write() if you might be logging from interrupt
* context.
*/
static inline void __relay_write(struct rchan *chan,
const void *data,
size_t length)
{
struct rchan_buf *buf;
buf = chan->buf[get_cpu()];
if (unlikely(buf->offset + length > buf->chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
put_cpu();
}
/**
* relay_reserve - reserve slot in channel buffer
* @chan: relay channel
* @length: number of bytes to reserve
*
* Returns pointer to reserved slot, NULL if full.
*
* Reserves a slot in the current cpu's channel buffer.
* Does not protect the buffer at all - caller must provide
* appropriate synchronization.
*/
static inline void *relay_reserve(struct rchan *chan, size_t length)
{
void *reserved;
struct rchan_buf *buf = chan->buf[smp_processor_id()];
if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
length = relay_switch_subbuf(buf, length);
if (!length)
return NULL;
}
reserved = buf->data + buf->offset;
buf->offset += length;
return reserved;
}
/**
* subbuf_start_reserve - reserve bytes at the start of a sub-buffer
* @buf: relay channel buffer
* @length: number of bytes to reserve
*
* Helper function used to reserve bytes at the beginning of
* a sub-buffer in the subbuf_start() callback.
*/
static inline void subbuf_start_reserve(struct rchan_buf *buf,
size_t length)
{
BUG_ON(length >= buf->chan->subbuf_size - 1);
buf->offset = length;
}
/*
* exported relayfs file operations, fs/relayfs/inode.c
*/
extern struct file_operations relayfs_file_operations;
#endif /* _LINUX_RELAYFS_FS_H */
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册