提交 · ee1904ba44bd4a242b453e8fe179b374906da173 · openanolis / cloud-kernel

12 7月, 2018 5 次提交

make alloc_file() static · ee1904ba

由 Al Viro 提交于 6月 17, 2018

Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

ee1904ba

new helper: alloc_file_clone() · 183266f2

由 Al Viro 提交于 6月 17, 2018

alloc_file_clone(old_file, mode, ops): create a new struct file with
->f_path equal to that of old_file.  pipe converted.
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

183266f2

new wrapper: alloc_file_pseudo() · d93aa9d8

由 Al Viro 提交于 6月 09, 2018

takes inode, vfsmount, name, O_... flags and file_operations and
either returns a new struct file (in which case inode reference we
held is consumed) or returns ERR_PTR(), in which case no refcounts
are altered.

converted aio_private_file() and sock_alloc_file() to it
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d93aa9d8

fold put_filp() into fput() · 4d27f326

由 Al Viro 提交于 7月 09, 2018

Just check FMODE_OPENED in __fput() and be done with that...
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

4d27f326

alloc_file(): switch to passing O_... flags instead of FMODE_... mode · c9c554f2

由 Al Viro 提交于 7月 11, 2018

... so that it could set both ->f_flags and ->f_mode, without callers
having to set ->f_flags manually.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c9c554f2

02 11月, 2017 1 次提交

License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318

由 Greg Kroah-Hartman 提交于 11月 01, 2017

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

b2441318

06 12月, 2016 1 次提交
- A
  constify alloc_file() · a4141d7c
  由 Al Viro 提交于 11月 20, 2016
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  a4141d7c
03 5月, 2016 1 次提交
- A
  give readdir(2)/getdents(2)/etc. uniform exclusion with lseek() · 63b6df14
  由 Al Viro 提交于 4月 20, 2016
```
same as read() on regular files has, and for the same reason.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  63b6df14
11 12月, 2014 1 次提交

include/linux/file.h: remove get_unused_fd() macro · f938612d

由 Yann Droneaud 提交于 12月 10, 2014

Macro get_unused_fd() is used to allocate a file descriptor with default
flags.  Those default flags (0) don't enable close-on-exec.

This can be seen as an unsafe default: in most case close-on-exec should
be enabled to not leak file descriptor across exec().

It would be better to have a "safer" default set of flags, eg.  O_CLOEXEC
must be used to enable close-on-exec.

Instead this patch removes get_unused_fd() so that out of tree modules
won't be affect by a runtime behavor change which might introduce other
kind of bugs: it's better to catch the change at build time, making it
easier to fix.

Removing the macro will also promote use of get_unused_fd_flags() (or
anon_inode_getfd()) with flags provided by userspace.  Or, if flags cannot
be given by userspace, with flags set to O_CLOEXEC by default.
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f938612d

10 3月, 2014 2 次提交

get rid of fget_light() · bd2a31d5

由 Al Viro 提交于 3月 04, 2014

instead of returning the flags by reference, we can just have the
low-level primitive return those in lower bits of unsigned long,
with struct file * derived from the rest.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

bd2a31d5

vfs: atomic f_pos accesses as per POSIX · 9c225f26

由 Linus Torvalds 提交于 3月 03, 2014

Our write() system call has always been atomic in the sense that you get
the expected thread-safe contiguous write, but we haven't actually
guaranteed that concurrent writes are serialized wrt f_pos accesses, so
threads (or processes) that share a file descriptor and use "write()"
concurrently would quite likely overwrite each others data.

This violates POSIX.1-2008/SUSv4 Section XSI 2.9.7 that says:

 "2.9.7 Thread Interactions with Regular File Operations

  All of the following functions shall be atomic with respect to each
  other in the effects specified in POSIX.1-2008 when they operate on
  regular files or symbolic links: [...]"

and one of the effects is the file position update.

This unprotected file position behavior is not new behavior, and nobody
has ever cared.  Until now.  Yongzhi Pan reported unexpected behavior to
Michael Kerrisk that was due to this.

This resolves the issue with a f_pos-specific lock that is taken by
read/write/lseek on file descriptors that may be shared across threads
or processes.
Reported-by: NYongzhi Pan <panyongzhi@gmail.com>
Reported-by: NMichael Kerrisk <mtk.manpages@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9c225f26

27 9月, 2012 6 次提交
- A
  switch simple cases of fget_light to fdget · 2903ff01
  由 Al Viro 提交于 8月 28, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  2903ff01
- A
  new helpers: fdget()/fdput() · a5b470ba
  由 Al Viro 提交于 8月 27, 2012
```
Signed-off-bs: Al Viro <viro@zeniv.linux.org.uk>
```
  a5b470ba
- A
  make expand_files() and alloc_fd() static · ad47bd72
  由 Al Viro 提交于 8月 21, 2012
```
no callers outside of fs/file.c left
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  ad47bd72
- A
  new helper: replace_fd() · 8280d161
  由 Al Viro 提交于 8月 21, 2012
```
analog of dup2(), except that it takes struct file * as source.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  8280d161
- A
  take purely descriptor-related stuff from fcntl.c to file.c · fe17f22d
  由 Al Viro 提交于 8月 21, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  fe17f22d
- A
  make get_unused_fd_flags() a function · 1a7bd226
  由 Al Viro 提交于 8月 12, 2012
```
... and get_unused_fd() a macro around it
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  1a7bd226
23 7月, 2012 1 次提交

switch fput to task_work_add · 4a9d4b02

由 Al Viro 提交于 6月 24, 2012

... and schedule_work() for interrupt/kernel_thread callers
(and yes, now it *is* OK to call from interrupt).

We are guaranteed that __fput() will be done before we return
to userland (or exit).  Note that for fput() from a kernel
thread we get an async behaviour; it's almost always OK, but
sometimes you might need to have __fput() completed before
you do anything else.  There are two mechanisms for that -
a general barrier (flush_delayed_fput()) and explicit
__fput_sync().  Both should be used with care (as was the
case for fput() from kernel threads all along).  See comments
in fs/file_table.c for details.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

4a9d4b02

21 3月, 2012 1 次提交
- A
  vfs: drop_file_write_access() made static · b57ce969
  由 Al Viro 提交于 2月 12, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  b57ce969
15 3月, 2011 1 次提交

New kind of open files - "location only". · 1abf0c71

由 Al Viro 提交于 3月 13, 2011

New flag for open(2) - O_PATH.  Semantics:
	* pathname is resolved, but the file itself is _NOT_ opened
as far as filesystem is concerned.
	* almost all operations on the resulting descriptors shall
fail with -EBADF.  Exceptions are:
	1) operations on descriptors themselves (i.e.
		close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD),
		fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD),
		fcntl(fd, F_SETFD, ...))
	2) fcntl(fd, F_GETFL), for a common non-destructive way to
		check if descriptor is open
	3) "dfd" arguments of ...at(2) syscalls, i.e. the starting
		points of pathname resolution
	* closing such descriptor does *NOT* affect dnotify or
posix locks.
	* permissions are checked as usual along the way to file;
no permission checks are applied to the file itself.  Of course,
giving such thing to syscall will result in permission checks (at
the moment it means checking that starting point of ....at() is
a directory and caller has exec permissions on it).

fget() and fget_light() return NULL on such descriptors; use of
fget_raw() and fget_raw_light() is needed to get them.  That protects
existing code from dealing with those things.

There are two things still missing (they come in the next commits):
one is handling of symlinks (right now we refuse to open them that
way; see the next commit for semantics related to those) and another
is descriptor passing via SCM_RIGHTS datagrams.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1abf0c71

17 1月, 2011 1 次提交

fs: Remove unlikely() from fput_light() · c2b3e74b

由 Steven Rostedt 提交于 12月 13, 2010

In fput_light(), there's an unlikely(fput_needed), which running on
my normal desktop doing firefox, xchat, evolution and part of my distcc farm,
and running the annotate branch profiler shows that the unlikely is not
very unlikely.

 correct incorrect  %        Function             File              Line
 ------- ---------  -        --------             ----              ----
       0       48 100 fput_light                file.h               26
115828710 897415279  88 fput_light              file.h               26
865271179 5286128445  85 fput_light             file.h               26
19568539  8923664  31 fput_light                file.h               26
12353677  3562279  22 fput_light                file.h               26
  267691    67062  20 fput_light                file.h               26
15014853   348172   2 fput_light                file.h               26
  209258      205   0 fput_light                file.h               26
 1364164        0   0 fput_light                file.h               26

Which gives 1032903812 times it was correct and 6203351846 times it was
incorrect, or 85% incorrect.

Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c2b3e74b

28 5月, 2010 1 次提交

get rid of the magic around f_count in aio · d7065da0

由 Al Viro 提交于 5月 26, 2010

__aio_put_req() plays sick games with file refcount.  What
it wants is fput() from atomic context; it's almost always
done with f_count > 1, so they only have to deal with delayed
work in rare cases when their reference happens to be the
last one.  Current code decrements f_count and if it hasn't
hit 0, everything is fine.  Otherwise it keeps a pointer
to struct file (with zero f_count!) around and has delayed
work do __fput() on it.

Better way to do it: use atomic_long_add_unless( , -1, 1)
instead of !atomic_long_dec_and_test().  IOW, decrement it
only if it's not the last reference, leave refcount alone
if it was.  And use normal fput() in delayed work.

I've made that atomic_long_add_unless call a new helper -
fput_atomic().  Drops a reference to file if it's safe to
do in atomic (i.e. if that's not the last one), tells if
it had been able to do that.  aio.c converted to it, __fput()
use is gone.  req->ki_file *always* contributes to refcount
now.  And __fput() became static.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d7065da0

17 12月, 2009 2 次提交

switch alloc_file() to passing struct path · 2c48b9c4

由 Al Viro 提交于 8月 09, 2009

... and have the caller grab both mnt and dentry; kill
leak in infiniband, while we are at it.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

2c48b9c4

A
get rid of init_file() · 3d1e4631
由 Al Viro 提交于 8月 08, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
3d1e4631

21 10月, 2008 1 次提交
- A
  [PATCH] introduce fmode_t, do annotations · aeb5d727
  由 Al Viro 提交于 9月 02, 2008
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  aeb5d727
01 8月, 2008 1 次提交

[PATCH] merge locate_fd() and get_unused_fd() · 1027abe8

由 Al Viro 提交于 7月 30, 2008

	New primitive: alloc_fd(start, flags).  get_unused_fd() and
get_unused_fd_flags() become wrappers on top of it.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1027abe8

02 5月, 2008 1 次提交

[PATCH] split linux/file.h · 9f3acc31

由 Al Viro 提交于 4月 24, 2008

Initial splitoff of the low-level stuff; taken to fdtable.h
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9f3acc31

25 4月, 2008 1 次提交

[PATCH] sanitize unshare_files/reset_files_struct · 3b125388

由 Al Viro 提交于 4月 22, 2008

* let unshare_files() give caller the displaced files_struct
* don't bother with grabbing reference only to drop it in the
  caller if it hadn't been shared in the first place
* in that form unshare_files() is trivially implemented via
  unshare_fd(), so we eliminate the duplicate logics in fork.c
* reset_files_struct() is not just only called for current;
  it will break the system if somebody ever calls it for anything
  else (we can't modify ->files of somebody else).  Lose the
  task_struct * argument.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

3b125388

19 4月, 2008 1 次提交

[PATCH] r/o bind mounts: create helper to drop file write access · aceaf78d

由 Dave Hansen 提交于 2月 15, 2008

If someone decides to demote a file from r/w to just
r/o, they can use this same code as __fput().

NFS does just that, and will use this in the next
patch.

AV: drop write access in __fput() only after we evict from file list.
Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
Cc: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J Bruce Fields" <bfields@fieldses.org>
Acked-by: NAl Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

aceaf78d

14 2月, 2008 1 次提交

include/linux: Remove all users of FASTCALL() macro · b3c97528

由 Harvey Harrison 提交于 2月 13, 2008

FASTCALL() is always expanded to empty, remove it.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b3c97528

17 10月, 2007 1 次提交

r/o bind mounts: filesystem helpers for custom 'struct file's · ce8d2cdf

由 Dave Hansen 提交于 10月 16, 2007

Why do we need r/o bind mounts?

This feature allows a read-only view into a read-write filesystem.  In the
process of doing that, it also provides infrastructure for keeping track of
the number of writers to any given mount.

This has a number of uses.  It allows chroots to have parts of filesystems
writable.  It will be useful for containers in the future because users may
have root inside a container, but should not be allowed to write to
somefilesystems.  This also replaces patches that vserver has had out of the
tree for several years.

It allows security enhancement by making sure that parts of your filesystem
read-only (such as when you don't trust your FTP server), when you don't want
to have entire new filesystems mounted, or when you want atime selectively
updated.  I've been using the following script to test that the feature is
working as desired.  It takes a directory and makes a regular bind and a r/o
bind mount of it.  It then performs some normal filesystem operations on the
three directories, including ones that are expected to fail, like creating a
file on the r/o mount.

This patch:

Some filesystems forego the vfs and may_open() and create their own 'struct
file's.

This patch creates a couple of helper functions which can be used by these
filesystems, and will provide a unified place which the r/o bind mount code
may patch.

Also, rename an existing, static-scope init_file() to a less generic name.
Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ce8d2cdf

17 7月, 2007 1 次提交

O_CLOEXEC for SCM_RIGHTS · 4a19542e

由 Ulrich Drepper 提交于 7月 15, 2007

Part two in the O_CLOEXEC saga: adding support for file descriptors received
through Unix domain sockets.

The patch is once again pretty minimal, it introduces a new flag for recvmsg
and passes it just like the existing MSG_CMSG_COMPAT flag.  I think this bit
is not used otherwise but the networking people will know better.

This new flag is not recognized by recvfrom and recv.  These functions cannot
be used for that purpose and the asymmetry this introduces is not worse than
the already existing MSG_CMSG_COMPAT situations.

The patch must be applied on the patch which introduced O_CLOEXEC.  It has to
remove static from the new get_unused_fd_flags function but since scm.c cannot
live in a module the function still hasn't to be exported.

Here's a test program to make sure the code works.  It's so much longer than
the actual patch...

#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

#ifndef O_CLOEXEC
# define O_CLOEXEC 02000000
#endif
#ifndef MSG_CMSG_CLOEXEC
# define MSG_CMSG_CLOEXEC 0x40000000
#endif

int
main (int argc, char *argv[])
{
  if (argc > 1)
    {
      int fd = atol (argv[1]);
      printf ("child: fd = %d\n", fd);
      if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
        {
          puts ("file descriptor valid in child");
          return 1;
        }
      return 0;

    }

  struct sockaddr_un sun;
  strcpy (sun.sun_path, "./testsocket");
  sun.sun_family = AF_UNIX;

  char databuf[] = "hello";
  struct iovec iov[1];
  iov[0].iov_base = databuf;
  iov[0].iov_len = sizeof (databuf);

  union
  {
    struct cmsghdr hdr;
    char bytes[CMSG_SPACE (sizeof (int))];
  } buf;
  struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1,
                        .msg_control = buf.bytes,
                        .msg_controllen = sizeof (buf) };
  struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);

  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;
  cmsg->cmsg_len = CMSG_LEN (sizeof (int));

  msg.msg_controllen = cmsg->cmsg_len;

  pid_t child = fork ();
  if (child == -1)
    error (1, errno, "fork");
  if (child == 0)
    {
      int sock = socket (PF_UNIX, SOCK_STREAM, 0);
      if (sock < 0)
        error (1, errno, "socket");

      if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
        error (1, errno, "bind");
      if (listen (sock, SOMAXCONN) < 0)
        error (1, errno, "listen");

      int conn = accept (sock, NULL, NULL);
      if (conn == -1)
        error (1, errno, "accept");

      *(int *) CMSG_DATA (cmsg) = sock;
      if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0)
        error (1, errno, "sendmsg");

      return 0;
    }

  /* For a test suite this should be more robust like a
     barrier in shared memory.  */
  sleep (1);

  int sock = socket (PF_UNIX, SOCK_STREAM, 0);
  if (sock < 0)
    error (1, errno, "socket");

  if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
    error (1, errno, "connect");
  unlink (sun.sun_path);

  *(int *) CMSG_DATA (cmsg) = -1;

  if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0)
    error (1, errno, "recvmsg");

  int fd = *(int *) CMSG_DATA (cmsg);
  if (fd == -1)
    error (1, 0, "no descriptor received");

  char fdname[20];
  snprintf (fdname, sizeof (fdname), "%d", fd);
  execl ("/proc/self/exe", argv[0], fdname, NULL);
  puts ("execl failed");
  return 1;
}

[akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch]
[akpm@linux-foundation.org: build fix]
Signed-off-by: NUlrich Drepper <drepper@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4a19542e

23 12月, 2006 1 次提交

[PATCH] fdtable: Provide free_fdtable() wrapper · 01b2d93c

由 Vadim Lobanov 提交于 12月 22, 2006

Christoph Hellwig has expressed concerns that the recent fdtable changes
expose the details of the RCU methodology used to release no-longer-used
fdtable structures to the rest of the kernel.  The trivial patch below
addresses these concerns by introducing the appropriate free_fdtable()
calls, which simply wrap the release RCU usage.  Since free_fdtable() is a
one-liner, it makes sense to promote it to an inline helper.
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

01b2d93c

11 12月, 2006 3 次提交

[PATCH] fdtable: Implement new pagesize-based fdtable allocator · 5466b456

由 Vadim Lobanov 提交于 12月 10, 2006

This patch provides an improved fdtable allocation scheme, useful for
expanding fdtable file descriptor entries.  The main focus is on the fdarray,
as its memory usage grows 128 times faster than that of an fdset.

The allocation algorithm sizes the fdarray in such a way that its memory usage
increases in easy page-sized chunks. The overall algorithm expands the allowed
size in powers of two, in order to amortize the cost of invoking vmalloc() for
larger allocation sizes. Namely, the following sizes for the fdarray are
considered, and the smallest that accommodates the requested fd count is
chosen:

    pagesize / 4
    pagesize / 2
    pagesize      <- memory allocator switch point
    pagesize * 2
    pagesize * 4
    ...etc...

Unlike the current implementation, this allocation scheme does not require a
loop to compute the optimal fdarray size, and can be done in efficient
straightline code.

Furthermore, since the fdarray overflows the pagesize boundary long before any
of the fdsets do, it makes sense to optimize run-time by allocating both
fdsets in a single swoop.  Even together, they will still be, by far, smaller
than the fdarray.  The fdtable->open_fds is now used as the anchor for the
fdset memory allocation.
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

5466b456

[PATCH] fdtable: Remove the free_files field · 4fd45812

由 Vadim Lobanov 提交于 12月 10, 2006

An fdtable can either be embedded inside a files_struct or standalone (after
being expanded).  When an fdtable is being discarded after all RCU references
to it have expired, we must either free it directly, in the standalone case,
or free the files_struct it is contained within, in the embedded case.

Currently the free_files field controls this behavior, but we can get rid of
it entirely, as all the necessary information is already recorded.  We can
distinguish embedded and standalone fdtables using max_fds, and if it is
embedded we can divine the relevant files_struct using container_of().
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4fd45812

[PATCH] fdtable: Make fdarray and fdsets equal in size · bbea9f69

由 Vadim Lobanov 提交于 12月 10, 2006

Currently, each fdtable supports three dynamically-sized arrays of data: the
fdarray and two fdsets.  The code allows the number of fds supported by the
fdarray (fdtable->max_fds) to differ from the number of fds supported by each
of the fdsets (fdtable->max_fdset).

In practice, it is wasteful for these two sizes to differ: whenever we hit a
limit on the smaller-capacity structure, we will reallocate the entire fdtable
and all the dynamic arrays within it, so any delta in the memory used by the
larger-capacity structure will never be touched at all.

Rather than hogging this excess, we shouldn't even allocate it in the first
place, and keep the capacities of the fdarray and the fdsets equal.  This
patch removes fdtable->max_fdset.  As an added bonus, most of the supporting
code becomes simpler.
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

bbea9f69

08 12月, 2006 2 次提交

[PATCH] Move filep_cachep to include/file.h · 8b7d91eb

由 Christoph Lameter 提交于 12月 06, 2006

filp_cachep is only used in fs/file_table.c and in fs/dcache.c where
it is defined.

Move it to related definitions in linux/file.h.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

8b7d91eb

[PATCH] Move files_cachep to include/file.h · 5d6538fc

由 Christoph Lameter 提交于 12月 06, 2006

Proper place is in file.h since files_cachep uses are rated to file I/O.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

5d6538fc

30 9月, 2006 1 次提交

[PATCH] Fix unserialized task->files changing · 3b9b8ab6

由 Kirill Korotaev 提交于 9月 29, 2006

Fixed race on put_files_struct on exec with proc.  Restoring files on
current on error path may lead to proc having a pointer to already kfree-d
files_struct.

->files changing at exit.c and khtread.c are safe as exit_files() makes all
things under lock.

Found during OpenVZ stress testing.

[akpm@osdl.org: add export]
Signed-off-by: NPavel Emelianov <xemul@openvz.org>
Signed-off-by: NKirill Korotaev <dev@openvz.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

3b9b8ab6

23 3月, 2006 1 次提交

[PATCH] Shrinks sizeof(files_struct) and better layout · 0c9e63fd

由 Eric Dumazet 提交于 3月 23, 2006

1) Reduce the size of (struct fdtable) to exactly 64 bytes on 32bits
   platforms, lowering kmalloc() allocated space by 50%.

2) Reduce the size of (files_struct), using a special 32 bits (or
   64bits) embedded_fd_set, instead of a 1024 bits fd_set for the
   close_on_exec_init and open_fds_init fields.  This save some ram (248
   bytes per task) as most tasks dont open more than 32 files.  D-Cache
   footprint for such tasks is also reduced to the minimum.

3) Reduce size of allocated fdset.  Currently two full pages are
   allocated, that is 32768 bits on x86 for example, and way too much.  The
   minimum is now L1_CACHE_BYTES.

UP and SMP should benefit from this patch, because most tasks will touch
only one cache line when open()/close() stdin/stdout/stderr (0/1/2),
(next_fd, close_on_exec_init, open_fds_init, fd_array[0 ..  2] being in the
same cache line)
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

0c9e63fd

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功