提交 · f78f5b90c4ffa559e400c3919a02236101f29f3f · openanolis / cloud-kernel

23 7月, 2015 1 次提交

rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN() · f78f5b90

由 Paul E. McKenney 提交于 6月 18, 2015

This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for
consistency with the WARN() series of macros.  This also requires
inverting the sense of the conditional, which this commit also does.
Reported-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: NIngo Molnar <mingo@kernel.org>

f78f5b90

01 7月, 2015 1 次提交

fs/file.c: don't acquire files->file_lock in fd_install() · 8a81252b

由 Eric Dumazet 提交于 6月 30, 2015

Mateusz Guzik reported :

 Currently obtaining a new file descriptor results in locking fdtable
 twice - once in order to reserve a slot and second time to fill it.

Holding the spinlock in __fd_install() is needed in case a resize is
done, or to prevent a resize.

Mateusz provided an RFC patch and a micro benchmark :
  http://people.redhat.com/~mguzik/pipebench.c

A resize is an unlikely operation in a process lifetime,
as table size is at least doubled at every resize.

We can use RCU instead of the spinlock.

__fd_install() must wait if a resize is in progress.

The resize must block new __fd_install() callers from starting,
and wait that ongoing install are finished (synchronize_sched())

resize should be attempted by a single thread to not waste resources.

rcu_sched variant is used, as __fd_install() and expand_fdtable() run
from process context.

It gives us a ~30% speedup using pipebench on a dual Intel(R) Xeon(R)
CPU E5-2696 v2 @ 2.50GHz
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NMateusz Guzik <mguzik@redhat.com>
Acked-by: NMateusz Guzik <mguzik@redhat.com>
Tested-by: NMateusz Guzik <mguzik@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

8a81252b

02 4月, 2014 1 次提交

get rid of files_defer_init() · 7f4b36f9

由 Al Viro 提交于 3月 14, 2014

the only thing it's doing these days is calculation of
upper limit for fs.nr_open sysctl and that can be done
statically
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

7f4b36f9

25 1月, 2014 1 次提交

introduce __fcheck_files() to fix rcu_dereference_check_fdtable(), kill rcu_my_thread_group_empty() · a8d4b834

由 Oleg Nesterov 提交于 1月 11, 2014

rcu_dereference_check_fdtable() looks very wrong,

1. rcu_my_thread_group_empty() was added by 844b9a87 "vfs: fix
   RCU-lockdep false positive due to /proc" but it doesn't really
   fix the problem. A CLONE_THREAD (without CLONE_FILES) task can
   hit the same race with get_files_struct().

   And otoh rcu_my_thread_group_empty() can suppress the correct
   warning if the caller is the CLONE_FILES (without CLONE_THREAD)
   task.

2. files->count == 1 check is not really right too. Even if this
   files_struct is not shared it is not safe to access it lockless
   unless the caller is the owner.

   Otoh, this check is sub-optimal. files->count == 0 always means
   it is safe to use it lockless even if files != current->files,
   but put_files_struct() has to take rcu_read_lock(). See the next
   patch.

This patch removes the buggy checks and turns fcheck_files() into
__fcheck_files() which uses rcu_dereference_raw(), the "unshared"
callers, fget_light() and fget_raw_light(), can use it to avoid
the warning from RCU-lockdep.

fcheck_files() is trivially reimplemented as rcu_lockdep_assert()
plus __fcheck_files().
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a8d4b834

02 5月, 2013 1 次提交
- A
  don't bother with deferred freeing of fdtables · ac3e3c5b
  由 Al Viro 提交于 4月 28, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  ac3e3c5b
29 11月, 2012 1 次提交
- A
  kill daemonize() · c4144670
  由 Al Viro 提交于 10月 02, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  c4144670
27 9月, 2012 9 次提交

new helper: daemonize_descriptors() · 864bdb3b

由 Al Viro 提交于 8月 22, 2012

descriptor-related parts of daemonize, done right.  As the
result we simplify the locking rules for ->files - we
hold task_lock in *all* cases when we modify ->files.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

864bdb3b

new helper: iterate_fd() · c3c073f8

由 Al Viro 提交于 8月 21, 2012

iterates through the opened files in given descriptor table,
calling a supplied function; we stop once non-zero is returned.
Callback gets struct file *, descriptor number and const void *
argument passed to iterator.  It is called with files->file_lock
held, so it is not allowed to block.

tty_io, netprio_cgroup and selinux flush_unauthorized_files()
converted to its use.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c3c073f8

A
make expand_files() and alloc_fd() static · ad47bd72
由 Al Viro 提交于 8月 21, 2012
```
no callers outside of fs/file.c left
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
ad47bd72
A
take __{set,clear}_{open_fd,close_on_exec}() into fs/file.c · b8318b01
由 Al Viro 提交于 8月 21, 2012
```
nobody uses those outside anymore.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
b8318b01

take close-on-exec logics to fs/file.c, clean it up a bit · 6a6d27de

由 Al Viro 提交于 8月 21, 2012

... and add cond_resched() there, while we are at it.  We can
get large latencies as is...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

6a6d27de

A
take descriptor-related part of close() to file.c · 483ce1d4
由 Al Viro 提交于 8月 19, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
483ce1d4

expose a low-level variant of fd_install() for binder · f869e8a7

由 Al Viro 提交于 8月 15, 2012

Similar situation to that of __alloc_fd(); do not use unless you
really have to.  You should not touch any descriptor table other
than your own; it's a sure sign of a really bad API design.

As with __alloc_fd(), you *must* use a first-class reference to
struct files_struct; something obtained by get_files_struct(some task)
(let alone direct task->files) will not do.  It must be either
current->files, or obtained by get_files_struct(current) by the
owner of that sucker and given to you.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

f869e8a7

A
move files_struct-related bits from kernel/exit.c to fs/file.c · 7cf4dc3c
由 Al Viro 提交于 8月 15, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
7cf4dc3c

new helper: __alloc_fd() · dcfadfa4

由 Al Viro 提交于 8月 12, 2012

Essentially, alloc_fd() in a files_struct we own a reference to.
Most of the time wanting to use it is a sign of lousy API
design (such as android/binder).  It's *not* a general-purpose
interface; better that than open-coding its guts, but again,
playing with other process' descriptor table is a sign of bad
design.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

dcfadfa4

20 2月, 2012 2 次提交

Replace the fd_sets in struct fdtable with an array of unsigned longs · 1fd36adc

由 David Howells 提交于 2月 16, 2012

Replace the fd_sets in struct fdtable with an array of unsigned longs and then
use the standard non-atomic bit operations rather than the FD_* macros.

This:

 (1) Removes the abuses of struct fd_set:

     (a) Since we don't want to allocate a full fd_set the vast majority of the
     	 time, we actually, in effect, just allocate a just-big-enough array of
     	 unsigned longs and cast it to an fd_set type - so why bother with the
     	 fd_set at all?

     (b) Some places outside of the core fdtable handling code (such as
     	 SELinux) want to look inside the array of unsigned longs hidden inside
     	 the fd_set struct for more efficient iteration over the entire set.

 (2) Eliminates the use of FD_*() macros in the kernel completely.

 (3) Permits the __FD_*() macros to be deleted entirely where not exposed to
     userspace.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/20120216174954.23314.48147.stgit@warthog.procyon.org.ukSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>

1fd36adc

Wrap accesses to the fd_sets in struct fdtable · 1dce27c5

由 David Howells 提交于 2月 16, 2012

Wrap accesses to the fd_sets in struct fdtable (for recording open files and
close-on-exec flags) so that we can move away from using fd_sets since we
abuse the fd_set structs by not allocating the full-sized structure under
normal circumstances and by non-core code looking at the internals of the
fd_sets.

The first abuse means that use of FD_ZERO() on these fd_sets is not permitted,
since that cannot be told about their abnormal lengths.

This introduces six wrapper functions for setting, clearing and testing
close-on-exec flags and fd-is-open flags:

	void __set_close_on_exec(int fd, struct fdtable *fdt);
	void __clear_close_on_exec(int fd, struct fdtable *fdt);
	bool close_on_exec(int fd, const struct fdtable *fdt);
	void __set_open_fd(int fd, struct fdtable *fdt);
	void __clear_open_fd(int fd, struct fdtable *fdt);
	bool fd_is_open(int fd, const struct fdtable *fdt);

Note that I've prepended '__' to the names of the set/clear functions because
they require the caller to hold a lock to use them.

Note also that I haven't added wrappers for looking behind the scenes at the
the array.  Possibly that should exist too.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/20120216174942.23314.1364.stgit@warthog.procyon.org.ukSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>

1dce27c5

27 7月, 2011 1 次提交

atomic: use <linux/atomic.h> · 60063497

由 Arun Sharma 提交于 7月 26, 2011

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: NArun Sharma <asharma@fb.com>
Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: NMike Frysinger <vapier@gentoo.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

60063497

09 7月, 2011 1 次提交

rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check · d8bf4ca9

由 Michal Hocko 提交于 7月 08, 2011

Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check use rcu_read_lock_held as a part of condition
automatically so callers do not have to do that as well.
Signed-off-by: NMichal Hocko <mhocko@suse.cz>
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

d8bf4ca9

20 8月, 2010 1 次提交

kernel: __rcu annotations · 4d2deb40

由 Arnd Bergmann 提交于 2月 24, 2010

This adds annotations for RCU operations in core kernel components
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>

4d2deb40

21 7月, 2010 1 次提交

vfs: fix RCU-lockdep false positive due to /proc · 844b9a87

由 Paul E. McKenney 提交于 7月 20, 2010

If a single-threaded process does a file-descriptor operation, and some
other process accesses that same file descriptor via /proc, the current
rcu_dereference_check_fdtable() can give a false-positive RCU-lockdep
splat due to the reference count being increased by the /proc access after
the reference-count check in fget_light() but before the check in
rcu_dereference_check_fdtable().

This commit prevents this false positive by checking for a single-threaded
process.  To avoid #include hell, this commit uses the wrapper for
thread_group_empty(current) defined by rcu_my_thread_group_empty()
provided in a separate commit.
Located-by: NMiles Lane <miles.lane@gmail.com>
Located-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

844b9a87

15 6月, 2010 1 次提交

vfs: add fs.h to define struct file · 2c666df8

由 Paul E. McKenney 提交于 4月 29, 2010

The sparse RCU-pointer annotations require definition of the
underlying type of any pointer passed to rcu_dereference() and friends.
So fcheck_files() needs "struct file" to be defined, so include fs.h.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>

2c666df8

25 2月, 2010 2 次提交

vfs: Abstract rcu_dereference_check for files-fdtable use · af61b96b

由 Paul E. McKenney 提交于 2月 22, 2010

Create an rcu_dereference_check_fdtable() that encapsulates the
rcu_dereference_check() condition for fcheck_files() use.  This
has the beneficial side-effect of getting rid of a very long
line.
Suggested-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1266887105-1528-9-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

af61b96b

vfs: Apply lockdep-based checking to rcu_dereference() uses · 7dc52157

由 Paul E. McKenney 提交于 2月 22, 2010

Add lockdep-ified RCU primitives to alloc_fd(), files_fdtable()
and fcheck_files().

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
LKML-Reference: <1266887105-1528-8-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7dc52157

03 4月, 2009 1 次提交

kmemtrace, fs: fix linux/fdtable.h header file dependencies · 21e54459

由 Ingo Molnar 提交于 3月 25, 2009

Impact: cleanup

We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace
changes), but this is not possible currently without breaking the
build because fdtable.h has an implicit include file dependency: it
uses __init does not include init.h.

This can cause build failures on non-x86 architectures:

 /home/mingo/tip/include/linux/fdtable.h:66: error: expected '=', ',',
 ';', 'asm' or '__attribute__' before 'files_defer_init'
 make[2]: *** [fs/locks.o] Error 1

We got this header included indirectly via rcupdate.h's percpu.h
inclusion - but if that is not there the build will break.

Fix it.

Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: paulmck@linux.vnet.ibm.com
LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

21e54459

01 1月, 2009 1 次提交

filp_cachep can be static in fs/file_table.c · b6b3fdea

由 Eric Dumazet 提交于 12月 10, 2008

Instead of creating the "filp" kmem_cache in vfs_caches_init(),
we can do it a litle be later in files_init(), so that filp_cachep
is static to fs/file_table.c
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b6b3fdea

17 5月, 2008 1 次提交

[PATCH] dup_fd() fixes, part 1 · 02afc626

由 Al Viro 提交于 5月 08, 2008

Move the sucker to fs/file.c in preparation to the rest
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

02afc626

02 5月, 2008 1 次提交

[PATCH] split linux/file.h · 9f3acc31

由 Al Viro 提交于 4月 24, 2008

Initial splitoff of the low-level stuff; taken to fdtable.h
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9f3acc31

25 4月, 2008 1 次提交

[PATCH] sanitize unshare_files/reset_files_struct · 3b125388

由 Al Viro 提交于 4月 22, 2008

* let unshare_files() give caller the displaced files_struct
* don't bother with grabbing reference only to drop it in the
  caller if it hadn't been shared in the first place
* in that form unshare_files() is trivially implemented via
  unshare_fd(), so we eliminate the duplicate logics in fork.c
* reset_files_struct() is not just only called for current;
  it will break the system if somebody ever calls it for anything
  else (we can't modify ->files of somebody else).  Lose the
  task_struct * argument.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

3b125388

19 4月, 2008 1 次提交

[PATCH] r/o bind mounts: create helper to drop file write access · aceaf78d

由 Dave Hansen 提交于 2月 15, 2008

If someone decides to demote a file from r/w to just
r/o, they can use this same code as __fput().

NFS does just that, and will use this in the next
patch.

AV: drop write access in __fput() only after we evict from file list.
Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
Cc: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J Bruce Fields" <bfields@fieldses.org>
Acked-by: NAl Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

aceaf78d

14 2月, 2008 1 次提交

include/linux: Remove all users of FASTCALL() macro · b3c97528

由 Harvey Harrison 提交于 2月 13, 2008

FASTCALL() is always expanded to empty, remove it.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b3c97528

17 10月, 2007 1 次提交

r/o bind mounts: filesystem helpers for custom 'struct file's · ce8d2cdf

由 Dave Hansen 提交于 10月 16, 2007

Why do we need r/o bind mounts?

This feature allows a read-only view into a read-write filesystem.  In the
process of doing that, it also provides infrastructure for keeping track of
the number of writers to any given mount.

This has a number of uses.  It allows chroots to have parts of filesystems
writable.  It will be useful for containers in the future because users may
have root inside a container, but should not be allowed to write to
somefilesystems.  This also replaces patches that vserver has had out of the
tree for several years.

It allows security enhancement by making sure that parts of your filesystem
read-only (such as when you don't trust your FTP server), when you don't want
to have entire new filesystems mounted, or when you want atime selectively
updated.  I've been using the following script to test that the feature is
working as desired.  It takes a directory and makes a regular bind and a r/o
bind mount of it.  It then performs some normal filesystem operations on the
three directories, including ones that are expected to fail, like creating a
file on the r/o mount.

This patch:

Some filesystems forego the vfs and may_open() and create their own 'struct
file's.

This patch creates a couple of helper functions which can be used by these
filesystems, and will provide a unified place which the r/o bind mount code
may patch.

Also, rename an existing, static-scope init_file() to a less generic name.
Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ce8d2cdf

17 7月, 2007 1 次提交

O_CLOEXEC for SCM_RIGHTS · 4a19542e

由 Ulrich Drepper 提交于 7月 15, 2007

Part two in the O_CLOEXEC saga: adding support for file descriptors received
through Unix domain sockets.

The patch is once again pretty minimal, it introduces a new flag for recvmsg
and passes it just like the existing MSG_CMSG_COMPAT flag.  I think this bit
is not used otherwise but the networking people will know better.

This new flag is not recognized by recvfrom and recv.  These functions cannot
be used for that purpose and the asymmetry this introduces is not worse than
the already existing MSG_CMSG_COMPAT situations.

The patch must be applied on the patch which introduced O_CLOEXEC.  It has to
remove static from the new get_unused_fd_flags function but since scm.c cannot
live in a module the function still hasn't to be exported.

Here's a test program to make sure the code works.  It's so much longer than
the actual patch...

#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

#ifndef O_CLOEXEC
# define O_CLOEXEC 02000000
#endif
#ifndef MSG_CMSG_CLOEXEC
# define MSG_CMSG_CLOEXEC 0x40000000
#endif

int
main (int argc, char *argv[])
{
  if (argc > 1)
    {
      int fd = atol (argv[1]);
      printf ("child: fd = %d\n", fd);
      if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
        {
          puts ("file descriptor valid in child");
          return 1;
        }
      return 0;

    }

  struct sockaddr_un sun;
  strcpy (sun.sun_path, "./testsocket");
  sun.sun_family = AF_UNIX;

  char databuf[] = "hello";
  struct iovec iov[1];
  iov[0].iov_base = databuf;
  iov[0].iov_len = sizeof (databuf);

  union
  {
    struct cmsghdr hdr;
    char bytes[CMSG_SPACE (sizeof (int))];
  } buf;
  struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1,
                        .msg_control = buf.bytes,
                        .msg_controllen = sizeof (buf) };
  struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);

  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;
  cmsg->cmsg_len = CMSG_LEN (sizeof (int));

  msg.msg_controllen = cmsg->cmsg_len;

  pid_t child = fork ();
  if (child == -1)
    error (1, errno, "fork");
  if (child == 0)
    {
      int sock = socket (PF_UNIX, SOCK_STREAM, 0);
      if (sock < 0)
        error (1, errno, "socket");

      if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
        error (1, errno, "bind");
      if (listen (sock, SOMAXCONN) < 0)
        error (1, errno, "listen");

      int conn = accept (sock, NULL, NULL);
      if (conn == -1)
        error (1, errno, "accept");

      *(int *) CMSG_DATA (cmsg) = sock;
      if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0)
        error (1, errno, "sendmsg");

      return 0;
    }

  /* For a test suite this should be more robust like a
     barrier in shared memory.  */
  sleep (1);

  int sock = socket (PF_UNIX, SOCK_STREAM, 0);
  if (sock < 0)
    error (1, errno, "socket");

  if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
    error (1, errno, "connect");
  unlink (sun.sun_path);

  *(int *) CMSG_DATA (cmsg) = -1;

  if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0)
    error (1, errno, "recvmsg");

  int fd = *(int *) CMSG_DATA (cmsg);
  if (fd == -1)
    error (1, 0, "no descriptor received");

  char fdname[20];
  snprintf (fdname, sizeof (fdname), "%d", fd);
  execl ("/proc/self/exe", argv[0], fdname, NULL);
  puts ("execl failed");
  return 1;
}

[akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch]
[akpm@linux-foundation.org: build fix]
Signed-off-by: NUlrich Drepper <drepper@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4a19542e

23 12月, 2006 1 次提交

[PATCH] fdtable: Provide free_fdtable() wrapper · 01b2d93c

由 Vadim Lobanov 提交于 12月 22, 2006

Christoph Hellwig has expressed concerns that the recent fdtable changes
expose the details of the RCU methodology used to release no-longer-used
fdtable structures to the rest of the kernel.  The trivial patch below
addresses these concerns by introducing the appropriate free_fdtable()
calls, which simply wrap the release RCU usage.  Since free_fdtable() is a
one-liner, it makes sense to promote it to an inline helper.
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

01b2d93c

11 12月, 2006 3 次提交

[PATCH] fdtable: Implement new pagesize-based fdtable allocator · 5466b456

由 Vadim Lobanov 提交于 12月 10, 2006

This patch provides an improved fdtable allocation scheme, useful for
expanding fdtable file descriptor entries.  The main focus is on the fdarray,
as its memory usage grows 128 times faster than that of an fdset.

The allocation algorithm sizes the fdarray in such a way that its memory usage
increases in easy page-sized chunks. The overall algorithm expands the allowed
size in powers of two, in order to amortize the cost of invoking vmalloc() for
larger allocation sizes. Namely, the following sizes for the fdarray are
considered, and the smallest that accommodates the requested fd count is
chosen:

    pagesize / 4
    pagesize / 2
    pagesize      <- memory allocator switch point
    pagesize * 2
    pagesize * 4
    ...etc...

Unlike the current implementation, this allocation scheme does not require a
loop to compute the optimal fdarray size, and can be done in efficient
straightline code.

Furthermore, since the fdarray overflows the pagesize boundary long before any
of the fdsets do, it makes sense to optimize run-time by allocating both
fdsets in a single swoop.  Even together, they will still be, by far, smaller
than the fdarray.  The fdtable->open_fds is now used as the anchor for the
fdset memory allocation.
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

5466b456

[PATCH] fdtable: Remove the free_files field · 4fd45812

由 Vadim Lobanov 提交于 12月 10, 2006

An fdtable can either be embedded inside a files_struct or standalone (after
being expanded).  When an fdtable is being discarded after all RCU references
to it have expired, we must either free it directly, in the standalone case,
or free the files_struct it is contained within, in the embedded case.

Currently the free_files field controls this behavior, but we can get rid of
it entirely, as all the necessary information is already recorded.  We can
distinguish embedded and standalone fdtables using max_fds, and if it is
embedded we can divine the relevant files_struct using container_of().
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4fd45812

[PATCH] fdtable: Make fdarray and fdsets equal in size · bbea9f69

由 Vadim Lobanov 提交于 12月 10, 2006

Currently, each fdtable supports three dynamically-sized arrays of data: the
fdarray and two fdsets.  The code allows the number of fds supported by the
fdarray (fdtable->max_fds) to differ from the number of fds supported by each
of the fdsets (fdtable->max_fdset).

In practice, it is wasteful for these two sizes to differ: whenever we hit a
limit on the smaller-capacity structure, we will reallocate the entire fdtable
and all the dynamic arrays within it, so any delta in the memory used by the
larger-capacity structure will never be touched at all.

Rather than hogging this excess, we shouldn't even allocate it in the first
place, and keep the capacities of the fdarray and the fdsets equal.  This
patch removes fdtable->max_fdset.  As an added bonus, most of the supporting
code becomes simpler.
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

bbea9f69

08 12月, 2006 2 次提交

[PATCH] Move filep_cachep to include/file.h · 8b7d91eb

由 Christoph Lameter 提交于 12月 06, 2006

filp_cachep is only used in fs/file_table.c and in fs/dcache.c where
it is defined.

Move it to related definitions in linux/file.h.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

8b7d91eb

[PATCH] Move files_cachep to include/file.h · 5d6538fc

由 Christoph Lameter 提交于 12月 06, 2006

Proper place is in file.h since files_cachep uses are rated to file I/O.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

5d6538fc

30 9月, 2006 1 次提交

[PATCH] Fix unserialized task->files changing · 3b9b8ab6

由 Kirill Korotaev 提交于 9月 29, 2006

Fixed race on put_files_struct on exec with proc.  Restoring files on
current on error path may lead to proc having a pointer to already kfree-d
files_struct.

->files changing at exit.c and khtread.c are safe as exit_files() makes all
things under lock.

Found during OpenVZ stress testing.

[akpm@osdl.org: add export]
Signed-off-by: NPavel Emelianov <xemul@openvz.org>
Signed-off-by: NKirill Korotaev <dev@openvz.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

3b9b8ab6

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功