Commits · c9f01245b6a7d77d17deaa71af10f6aca14fa24e · openeuler / raspberrypi-kernel

01 Nov, 2011 1 commit

oom: remove oom_disable_count · c9f01245

Committed by David Rientjes on Oct 31, 2011

This removes mm->oom_disable_count entirely since it's unnecessary and
currently buggy.  The counter was intended to be per-process but it's
currently decremented in the exit path for each thread that exits, causing
it to underflow.

The count was originally intended to prevent oom killing threads that
share memory with threads that cannot be killed since it doesn't lead to
future memory freeing.  The counter could be fixed to represent all
threads sharing the same mm, but it's better to remove the count since:

 - it is possible that the OOM_DISABLE thread sharing memory with the
   victim is waiting on that thread to exit and will actually cause
   future memory freeing, and

 - there is no guarantee that a thread is disabled from oom killing just
   because another thread sharing its mm is oom disabled.

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c9f01245
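The underflow described in c9f01245 can be illustrated with a toy model. This is a minimal Python sketch, not kernel code: the class `Mm` and the helpers `set_oom_disable` and `exit_thread_buggy` are hypothetical names, and the only behaviour modelled is the one the commit message describes (one increment per process, one decrement per exiting thread).

```python
class Mm:
    """Toy stand-in for an mm shared by all threads of a process."""

    def __init__(self):
        self.oom_disable_count = 0


def set_oom_disable(mm):
    # Assumption: the counter is bumped once when the process is OOM-disabled.
    mm.oom_disable_count += 1


def exit_thread_buggy(mm, oom_disabled):
    # The decrement runs in the exit path of every thread, not once per process.
    if oom_disabled:
        mm.oom_disable_count -= 1


def simulate(thread_count):
    """Return the counter after an OOM-disabled process with N threads exits."""
    mm = Mm()
    set_oom_disable(mm)
    for _ in range(thread_count):
        exit_thread_buggy(mm, oom_disabled=True)
    return mm.oom_disable_count
```

With a single thread the counter returns to zero; with more threads it goes negative, which is the per-process versus per-thread mismatch the commit removes.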

07 Aug, 2011 2 commits

vfs: show O_CLOEXE bit properly in /proc/<pid>/fdinfo/<fd> files · 1117f72e

Committed by Linus Torvalds on Aug 06, 2011

The CLOEXEC bit is magical, and for performance (and semantic) reasons we
don't actually maintain it in the file descriptor itself, but in a
separate bit array.  Which means that when we show f_flags, the CLOEXEC
status is shown incorrectly: we show the status not as it is now, but as
it was when the file was opened.

Fix that by looking up the bit properly in the 'fdt->close_on_exec' bit
array.

Uli needs this in order to re-implement the pfiles program:

  "For normal file descriptors (not sockets) this was the last piece of
   information which wasn't available.  This is all part of my 'give
   Solaris users no reason to not switch' effort.  I intend to offer the
   code to the util-linux-ng maintainers."

Requested-by: Ulrich Drepper <drepper@akkadia.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1117f72e
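With the fix in 1117f72e, a userspace tool can read the live close-on-exec state from the `flags:` line of /proc/<pid>/fdinfo/<fd>. The following is a minimal Python sketch under the assumption that the flags value is printed in octal; `fdinfo_has_cloexec` and `read_fdinfo` are hypothetical helper names, not part of any library.

```python
import os


def fdinfo_has_cloexec(fdinfo_text):
    """Return True if the 'flags:' line of an fdinfo dump has O_CLOEXEC set.

    Assumes the flags value is an octal number, e.g. 'flags:\t02000002'.
    """
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return bool(int(value.strip(), 8) & os.O_CLOEXEC)
    raise ValueError("no 'flags:' line found")


def read_fdinfo(fd, pid="self"):
    """Read /proc/<pid>/fdinfo/<fd>; only meaningful on Linux."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        return f.read()
```

On a Linux system, `fdinfo_has_cloexec(read_fdinfo(fd))` would then report the current state of the descriptor rather than the state at open time.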

oom_ajd: don't use WARN_ONCE, just use printk_once · c2142704

Committed by Linus Torvalds on Aug 06, 2011

WARN_ONCE() is very annoying, in that it shows the stack trace that we
don't care about at all, and also triggers various user-level "kernel
oopsed" logic that we really don't care about.  And it's not like the
user can do anything about the applications (sshd) in question, it's a
distro issue.

Requested-by: Andi Kleen <andi@firstfloor.org> (and many others)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c2142704

27 Jul, 2011 1 commit

proc: fix a race in do_io_accounting() · 293eb1e7

Committed by Vasiliy Kulikov on Jul 26, 2011

If an inode's mode permits opening /proc/PID/io and the resulting file
descriptor is kept across execve() of a setuid or similar binary, the
ptrace_may_access() check tries to prevent using this fd against the
task with escalated privileges.

Unfortunately, there is a race in the check against execve().  If
execve() is processed after the ptrace check, but before the actual io
information gathering, io statistics will be gathered from the
privileged process.  At least in theory this might lead to gathering
sensitive information (like ssh/ftp password length) that wouldn't be
available otherwise.

Holding task->signal->cred_guard_mutex while gathering the io
information should protect against the race.

The order of locking is similar to the one inside of ptrace_attach():
first goes cred_guard_mutex, then lock_task_sighand().

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

293eb1e7
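The check-then-use race fixed in 293eb1e7 can be modelled in userspace. This is a toy Python sketch, not the kernel implementation: `Task`, `read_io_racy` and `read_io_locked` are hypothetical names, a `threading.Lock` stands in for cred_guard_mutex, and the `on_window` callback marks the point where an execve() could slip in between the permission check and the read.

```python
import threading


class Task:
    """Toy stand-in for a task with I/O counters and an exec-time mutex."""

    def __init__(self):
        self.cred_guard_mutex = threading.Lock()
        self.privileged = False
        self.rchar = 100  # bytes read so far

    def execve_setuid(self, secret_len):
        # Assumption: exec holds the mutex while privileges change.
        with self.cred_guard_mutex:
            self.privileged = True
            self.rchar += secret_len  # I/O done by the privileged program


def read_io_racy(task, on_window=lambda: None):
    if task.privileged:           # permission check
        raise PermissionError("target is privileged")
    on_window()                   # execve() may run here
    return task.rchar             # statistics gathered after the check


def read_io_locked(task, on_window=lambda: None):
    with task.cred_guard_mutex:   # check and gather under one lock
        if task.privileged:
            raise PermissionError("target is privileged")
        on_window()               # a concurrent execve() has to wait
        return task.rchar
```

In the racy variant an exec that lands in the window exposes the privileged counters; in the locked variant the exec blocks until the read has finished, so the reader only sees the pre-exec value.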

26 Jul, 2011 1 commit

oom: make deprecated use of oom_adj more verbose · be8f684d

Committed by David Rientjes on Jul 25, 2011

/proc/pid/oom_adj is deprecated and scheduled for removal in August 2012
according to Documentation/feature-removal-schedule.txt.

This patch makes the warning more verbose by making it appear as a more
serious problem (the presence of a stack trace and being multiline should
attract more attention) so that applications still using the old interface
can get fixed.

Very popular users of the old interface have been converted since the oom
killer rewrite has been introduced.  udevd switched to the
/proc/pid/oom_score_adj interface for v162, kde switched in 4.6.1, and
opensshd switched in 5.7p1.

At the start of 2012, this should be changed into a WARN() to emit all
such incidents and then finally remove the tunable in August 2012 as
scheduled.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

be8f684d

21 Jul, 2011 1 commit

fs: seq_file - add event counter to simplify poll() support · f1514638

Committed by Kay Sievers on Jul 12, 2011

Moving the event counter into the dynamically allocated 'struct seq_file'
allows poll() support without the need to allocate its own tracking
structure.

All current users are switched over to use the new counter.

Requested-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: NeilBrown <neilb@suse.de>
Tested-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f1514638

20 Jul, 2011 3 commits

->permission() sanitizing: don't pass flags to ->permission() · 10556cb2

Committed by Al Viro on Jun 20, 2011

not used by the instances anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

10556cb2

->permission() sanitizing: don't pass flags to generic_permission() · 2830ba7f

Committed by Al Viro on Jun 20, 2011

redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of
them removes that bit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2830ba7f

kill check_acl callback of generic_permission() · 178ea735

Committed by Al Viro on Jun 20, 2011

its value depends only on inode and does not change; we might as
well store it in ->i_op->check_acl and be done with that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

178ea735

29 Jun, 2011 1 commit

proc: restrict access to /proc/PID/io · 1d1221f3

Committed by Vasiliy Kulikov on Jun 24, 2011

/proc/PID/io may be used for gathering private information.  E.g.  for
openssh and vsftpd daemons wchars/rchars may be used to learn the
precise password length.  Restrict it to processes being able to ptrace
the target process.

ptrace_may_access() is needed to prevent keeping open file descriptor of
"io" file, executing setuid binary and gathering io information of the
setuid'ed process.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1d1221f3

23 Jun, 2011 1 commit

ptrace: s/tracehook_tracer_task()/ptrace_parent()/ · 06d98473

Committed by Tejun Heo on Jun 17, 2011

tracehook.h is on the way out.  Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>

06d98473

20 Jun, 2011 1 commit

proc_fd_permission() is doesn't need to bail out in RCU mode · cf127911

Committed by Al Viro on Jun 18, 2011

nothing blocking except generic_permission()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cf127911

27 May, 2011 4 commits

arch/tile: more /proc and /sys file support · f133ecca

Committed by Chris Metcalf on May 26, 2011

This change introduces a few of the less controversial /proc and
/proc/sys interfaces for tile, along with sysfs attributes for
various things that were originally proposed as /proc/tile files.
It also adjusts the "hardwall" proc API.

Arnd Bergmann reviewed the initial arch/tile submission, which
included a complete set of all the /proc/tile and /proc/sys/tile
knobs that we had added in a somewhat ad hoc way during initial
development, and provided feedback on where most of them should go.

One knob turned out to be similar enough to the existing
/proc/sys/debug/exception-trace that it was re-implemented to use
that model instead.

Another knob was /proc/tile/grid, which reported the "grid" dimensions
of a tile chip (e.g. 8x8 processors = 64-core chip).  Arnd suggested
looking at sysfs for that, so this change moves that information
to a pair of sysfs attributes (chip_width and chip_height) in the
/sys/devices/system/cpu directory.  We also put the "chip_serial"
and "chip_revision" information from our old /proc/tile/board file
as attributes in /sys/devices/system/cpu.

Other information collected via hypervisor APIs is now placed in
/sys/hypervisor.  We create a /sys/hypervisor/type file (holding the
constant string "tilera") to be parallel with the Xen use of
/sys/hypervisor/type holding "xen".  We create three top-level files,
"version" (the hypervisor's own version), "config_version" (the
version of the configuration file), and "hvconfig" (the contents of
the configuration file).  The remaining information from our old
/proc/tile/board and /proc/tile/switch files becomes an attribute
group appearing under /sys/hypervisor/board/.

Finally, after some feedback from Arnd Bergmann for the previous
version of this patch, the /proc/tile/hardwall file is split up into
two conceptual parts.  First, a directory /proc/tile/hardwall/ which
contains one file per active hardwall, each file named after the
hardwall's ID and holding a cpulist that says which cpus are enclosed by
the hardwall.  Second, a /proc/PID file "hardwall" that is either
empty (for non-hardwall-using processes) or contains the hardwall ID.

This change also pushes the /proc/sys/tile/unaligned_fixup/
directory, with knobs controlling the kernel code for handling the
fixup of unaligned exceptions.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
f133ecca

proc: put check_mem_permission after __get_free_page in mem_write · 30cd8903

Committed by KOSAKI Motohiro on May 26, 2011

It would be better to put check_mem_permission after __get_free_page in
mem_write, the same as in mem_read.

Hugh Dickins explained the reason.

    check_mem_permission gets a reference to the mm.  If we __get_free_page
    after check_mem_permission, imagine what happens if the system is out
    of memory, and the mm we're looking at is selected for killing by the
    OOM killer: while we wait in __get_free_page for more memory, no memory
    is freed from the selected mm because it cannot reach exit_mmap while
    we hold that reference.

Reported-by: Jovi Zhang <bookjovi@gmail.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Stephen Wilson <wilsons@start.ca>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

30cd8903

fs/proc: convert to kstrtoX() · 0a8cb8e3

Committed by Alexey Dobriyan on May 26, 2011

Convert fs/proc/ from strict_strto*() to kstrto*() functions.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0a8cb8e3

mm: extract exe_file handling from procfs · 38646013

Committed by Jiri Slaby on May 26, 2011

Setup and cleanup of mm_struct->exe_file is currently done in fs/proc/.
This was because exe_file was needed only for /proc/<pid>/exe.  Since we
will need the exe_file functionality also for core dumps (so core name can
contain full binary path), build this functionality always into the
kernel.

To achieve that move that out of proc FS to the kernel/ where in fact it
should belong.  By doing that we can make dup_mm_exe_file static.  Also we
can drop linux/proc_fs.h inclusion in fs/exec.c and kernel/fork.c.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

38646013

11 May, 2011 1 commit

ns: proc files for namespace naming policy. · 6b4e306a

Committed by Eric W. Biederman on Mar 07, 2010

Create files under /proc/<pid>/ns/ to allow controlling the
namespaces of a process.

This addresses three specific problems that can make namespaces hard to
work with.
- Namespaces require a dedicated process to pin them in memory.
- It is not possible to use a namespace unless you are the child
  of the original creator.
- Namespaces don't have names that userspace can use to talk about
  them.

The namespace files under /proc/<pid>/ns/ can be opened and the
file descriptor can be used to talk about a specific namespace, and
to keep the specified namespace alive.

A namespace can be kept alive by either holding the file descriptor
open or bind mounting the file someplace else.  aka:
mount --bind /proc/self/ns/net /some/filesystem/path
mount --bind /proc/self/fd/<N> /some/filesystem/path

This allows namespaces to be named with userspace policy.

It requires additional support to make use of these file descriptors
and that will be coming in the following patches.

Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

6b4e306a
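Holding a namespace alive through an open file descriptor, as 6b4e306a describes, needs nothing more than an ordinary open() on the file under /proc/<pid>/ns/. The following is a minimal Python sketch; `ns_path` and `pin_namespace` are hypothetical helper names, and the namespace kind `net` is taken from the mount example in the commit message.

```python
import os


def ns_path(kind, pid="self"):
    """Build the path of a namespace file, e.g. /proc/self/ns/net."""
    return f"/proc/{pid}/ns/{kind}"


def pin_namespace(kind, pid="self"):
    """Open the namespace file read-only and return the descriptor.

    Per the commit message, the namespace stays alive while the returned
    descriptor remains open. The caller is responsible for os.close().
    """
    return os.open(ns_path(kind, pid), os.O_RDONLY)
```

A caller on a kernel that provides these files would keep the descriptor from `pin_namespace("net")` open for as long as the namespace should persist.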

19 Apr, 2011 1 commit

proc: do proper range check on readdir offset · d8bdc59f

Committed by Linus Torvalds on Apr 18, 2011

Rather than pass in some random truncated offset to the pid-related
functions, check that the offset is in range up-front.

This is just cleanup, the previous commit fixed the real problem.

Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d8bdc59f
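The idea behind d8bdc59f, validating the offset before it can be truncated, can be sketched outside the kernel. This is a hedged Python illustration only: the constant values below are placeholders chosen for the example, and `truncate_to_u32` and `tgid_from_offset` are hypothetical helpers rather than the kernel's functions.

```python
PID_MAX_LIMIT = 4 * 1024 * 1024  # illustrative value, not read from a kernel
TGID_OFFSET = 300                # hypothetical placeholder for this sketch


def truncate_to_u32(offset):
    """Model a 64-bit offset being squeezed into a 32-bit variable."""
    return offset & 0xFFFFFFFF


def tgid_from_offset(f_pos):
    """Map a directory offset to the tgid to resume at, or None if out of range.

    The range check happens up-front, before any narrowing conversion.
    Offsets below TGID_OFFSET address non-pid entries and are outside the
    scope of this sketch, so they also yield None here.
    """
    if not (TGID_OFFSET <= f_pos < PID_MAX_LIMIT + TGID_OFFSET):
        return None
    return f_pos - TGID_OFFSET
```

Without the check, a huge offset such as `(1 << 32) + TGID_OFFSET + 1` truncates to something that looks like a request for pid 1; with the check it is rejected outright.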

31 Mar, 2011 1 commit

Fix common misspellings · 25985edc

Committed by Lucas De Marchi on Mar 30, 2011

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>

25985edc

24 Mar, 2011 12 commits

procfs: fix some wrong error code usage · fc3d8767

Committed by Jovi Zhang on Mar 23, 2011

[root@wei 1]# cat /proc/1/mem
cat: /proc/1/mem: No such process

error code -ESRCH is wrong in this situation.  Return -EPERM instead.

Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fc3d8767

proc: hide kernel addresses via %pK in /proc/<pid>/stack · 51e03149

Committed by Konstantin Khlebnikov on Mar 23, 2011

This file is readable for the task owner.  Hide kernel addresses from
unprivileged users, leave them function names and offsets.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Kees Cook <kees.cook@canonical.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

51e03149

deal with races in /proc/*/{syscall,stack,personality} · a9712bc1

Committed by Al Viro on Mar 23, 2011

All of those are rw-r--r-- and all are broken for suid - if you open
a file before the target does suid-root exec, you'll be still able
to access it.  For personality it's not a big deal, but for syscall
and stack it's a real problem.

Fix: check that task is traceable for you at the time of read().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a9712bc1

proc: enable writing to /proc/pid/mem · 198214a7

Committed by Stephen Wilson on Mar 13, 2011

With recent changes there is no longer a security hazard with writing to
/proc/pid/mem.  Remove the #ifdef.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

198214a7

proc: make check_mem_permission() return an mm_struct on success · 8b0db9db

Committed by Stephen Wilson on Mar 13, 2011

This change allows us to take advantage of access_remote_vm(), which in turn
eliminates a security issue with the mem_write() implementation.

The previous implementation of mem_write() was insecure since the target task
could exec a setuid-root binary between the permission check and the actual
write.  Holding a reference to the target mm_struct eliminates this
vulnerability.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8b0db9db

proc: hold cred_guard_mutex in check_mem_permission() · 18f661bc

Committed by Stephen Wilson on Mar 13, 2011

Avoid a potential race when task exec's and we get a new ->mm but check against
the old credentials in ptrace_may_access().

Holding of the mutex is implemented by factoring out the body of the code into a
helper function __check_mem_permission(). Performing this factorization now
simplifies upcoming changes and minimizes churn in the diffs.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

18f661bc

proc: disable mem_write after exec · 26947f8c

Committed by Stephen Wilson on Mar 13, 2011

This change makes mem_write() observe the same constraints as mem_read(). This
is particularly important for mem_write as an accidental leak of the fd across
an exec could result in arbitrary modification of the target process' memory.
IOW, /proc/pid/mem is implicitly close-on-exec.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

26947f8c

auxv: require the target to be tracable (or yourself) · 2fadaef4

Committed by Al Viro on Feb 15, 2011

same as for environ, except that we didn't do any checks to
prevent access after suid execve

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2fadaef4

close race in /proc/*/environ · d6f64b89

Committed by Al Viro on Feb 15, 2011

Switch to mm_for_maps().  Maybe we ought to make it r--r--r--,
since we do checks on IO anyway...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d6f64b89

report errors in /proc/*/*map* sanely · ec6fd8a4

Committed by Al Viro on Feb 15, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ec6fd8a4

pagemap: close races with suid execve · ca6b0bf0

Committed by Al Viro on Feb 15, 2011

just use mm_for_maps()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ca6b0bf0

make sessionid permissions in /proc/*/task/* match those in /proc/* · 26ec3c64

Committed by Al Viro on Feb 15, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

26ec3c64

10 Mar, 2011 1 commit

/proc/self is never going to be invalidated... · ae50adcb

Committed by Al Viro on Feb 16, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ae50adcb

14 Jan, 2011 4 commits

oom: allow a non-CAP_SYS_RESOURCE proces to oom_score_adj down · dabb16f6

Committed by Mandeep Singh Baines on Jan 13, 2011

We'd like to be able to oom_score_adj a process up/down as it
enters/leaves the foreground.  Currently, it is not possible to oom_adj
down without CAP_SYS_RESOURCE.  This patch allows a task to decrease its
oom_score_adj back to the value that a CAP_SYS_RESOURCE thread set it to
or its inherited value at fork.  Assuming the thread that has forked it
has oom_score_adj of 0, each process could decrease it back from 0 upon
activation unless a CAP_SYS_RESOURCE thread elevated it to something
higher.

Alternatives considered:

* a setuid binary
* a daemon with CAP_SYS_RESOURCE

Since you don't want all processes to be able to reduce their oom_adj, a
setuid or daemon implementation would be complex.  The alternatives also
have much higher overhead.

This patch was updated from the original patch based on feedback from
David Rientjes.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dabb16f6
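The permission rule described in dabb16f6 can be written down as a small model. This is a minimal Python sketch under stated assumptions, not the kernel code: `Task` and `write_oom_score_adj` are hypothetical names, the field `oom_score_adj_min` is this sketch's name for "the value a CAP_SYS_RESOURCE thread set it to or its inherited value at fork", and the valid range of -1000 to 1000 is the documented oom_score_adj range.

```python
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


class Task:
    """Toy task with a current adjustment and a floor it may return to."""

    def __init__(self, adj=0, adj_min=0):
        self.oom_score_adj = adj
        self.oom_score_adj_min = adj_min


def write_oom_score_adj(task, new_adj, has_cap_sys_resource=False):
    """Apply a write to oom_score_adj following the rule in the commit text."""
    if not OOM_SCORE_ADJ_MIN <= new_adj <= OOM_SCORE_ADJ_MAX:
        raise ValueError("oom_score_adj out of range")
    # Going below the floor needs the capability; going back down to it does not.
    if new_adj < task.oom_score_adj_min and not has_cap_sys_resource:
        raise PermissionError("CAP_SYS_RESOURCE required")
    task.oom_score_adj = new_adj
    if has_cap_sys_resource:
        # A privileged write establishes the new floor.
        task.oom_score_adj_min = new_adj
```

An unprivileged task can therefore raise its score when it leaves the foreground and lower it again on return, but never below the floor a privileged writer or fork inheritance gave it.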

proc: use single_open() correctly · c6a34058

Committed by Jovi Zhang on Jan 12, 2011

single_open()'s third argument is for copying into seq_file->private.  Use
that, rather than open-coding it.

Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c6a34058

proc: use seq_puts()/seq_putc() where possible · 9d6de12f

Committed by Alexey Dobriyan on Jan 12, 2011

For string without format specifiers, use seq_puts().
For seq_printf("\n"), use seq_putc('\n').

   text	   data	    bss	    dec	    hex	filename
  61866	    488	    112	  62466	   f402	fs/proc/proc.o
  61729	    488	    112	  62329	   f379	fs/proc/proc.o
  ----------------------------------------------------
  			   -139
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9d6de12f

fs/proc/base.c, kernel/latencytop.c: convert sprintf_symbol() to %ps · 34e49d4f

Committed by Joe Perches on Jan 12, 2011

Use temporary lr for struct latency_record for improved readability and
fewer columns used.  Removed trailing space from output.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

34e49d4f

07 Jan, 2011 4 commits

fs: provide rcu-walk aware permission i_ops · b74c79e9

Committed by Nick Piggin on Jan 07, 2011

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

b74c79e9

fs: rcu-walk aware d_revalidate method · 34286d66

Committed by Nick Piggin on Jan 07, 2011

Require filesystems be aware of .d_revalidate being called in rcu-walk
mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning
-ECHILD from all implementations.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

34286d66

fs: dcache reduce branches in lookup path · fb045adb

由 Nick Piggin 提交于 1月 07, 2011

Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry->d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.

Patched with:

git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

fb045adb
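The two sed substitutions quoted in fb045adb can be tried on individual lines without touching a source tree. Below is a minimal Python sketch that mirrors them with the `re` module; `convert_d_op_assignment` is a hypothetical helper name, and each pattern is applied at most once per line because the sed commands carry no `g` flag.

```python
import re

# Mirrors: s/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/
ARROW_ASSIGN = re.compile(r"([^\t ]*)->d_op = (.*);")
# Mirrors: s/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/
DOT_ASSIGN = re.compile(r"([^\t ]*)\.d_op = (.*);")


def convert_d_op_assignment(line):
    """Rewrite a direct d_op assignment into a d_set_d_op() call."""
    line = ARROW_ASSIGN.sub(r"d_set_d_op(\1, \2);", line, count=1)
    return DOT_ASSIGN.sub(r"d_set_d_op(&\1, \2);", line, count=1)
```

Lines that do not assign to `d_op` pass through unchanged, so the helper can be mapped over a whole file to preview what the scripted conversion would produce.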

fs: change d_delete semantics · fe15ce44

Committed by Nick Piggin on Jan 07, 2011

Change d_delete from a dentry deletion notification to a dentry caching
advice, more like ->drop_inode. Require it to be constant and idempotent,
and not take d_lock. This is how all existing filesystems use the callback
anyway.

This makes fine grained dentry locking of dput and dentry lru scanning
much simpler.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

fe15ce44