Commits · fb045adb99d9b7c562dc7fef834857f78249daa1 · openeuler / raspberrypi-kernel

07 Jan, 2011 3 commits

fs: avoid inode RCU freeing for pseudo fs · ff0c7d15

Committed by Nick Piggin on Jan 07, 2011

Pseudo filesystems whose inodes are never put on an RCU list or made
reachable through rcu-walk dentries do not need to RCU-free their inodes.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

ff0c7d15

fs: icache RCU free inodes · fa0d7e3d

Committed by Nick Piggin on Jan 07, 2011

RCU free the struct inode. This will allow:

- Subsequent store-free path walking patch. The inode must be consulted for
  permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
  to take i_lock no longer need to take sb_inode_list_lock to walk the list in
  the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
  page lock to follow page->mapping.

The downside of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.

In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.

The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

fa0d7e3d

fs: dcache remove dcache_lock · b5c84bf6

Committed by Nick Piggin on Jan 07, 2011

dcache_lock no longer protects anything. Remove it.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

b5c84bf6

02 Dec, 2010 1 commit

Call the filesystem back whenever a page is removed from the page cache · 6072d13c

Committed by Linus Torvalds on Dec 01, 2010

NFS needs to be able to release objects that are stored in the page
cache once the page itself is no longer visible from the page cache.

This patch adds a callback to the address space operations that allows
filesystems to perform page cleanups once the page has been removed
from the page cache.

Original patch by: Linus Torvalds <torvalds@linux-foundation.org>
[trondmy: cover the cases of invalidate_inode_pages2() and
          truncate_inode_pages()]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

6072d13c

25 Nov, 2010 1 commit

include/linux/fs.h: fix userspace build · 3a3a1af3

Committed by Loïc Minier on Nov 24, 2010

dpkg uses fiemap but didn't particularly need to include stdint.h so far.
Since 367a51a3 ("fs: Add FITRIM ioctl"), build of linux/fs.h failed in
dpkg with:

  In file included from ../../src/filesdb.c:27:0:
  /usr/include/linux/fs.h:37:2: error: expected specifier-qualifier-list before 'uint64_t'

Use exportable type __u64 to avoid the dependency on stdint.h.

b31d42a5 ("Fix compile brekage with !CONFIG_BLOCK") fixed only the
kernel build by including linux/types.h, but this also fixed "make
headers_check", so don't revert it.
Signed-off-by: Loïc Minier <loic.minier@linaro.org>
Tested-by: Arnd Bergmann <arnd.bergmann@linaro.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3a3a1af3

20 Nov, 2010 1 commit

fs: Do not dispatch FITRIM through separate super_operation · 93bb41f4

Committed by Lukas Czerner on Nov 19, 2010

There was concern that the FITRIM ioctl is not common enough to be included
in the core vfs ioctl code. As Christoph Hellwig pointed out, there is no
real point in dispatching it through a separate vector instead of just
through ->ioctl.

So this commit removes ioctl_fstrim() from the vfs ioctl code and trim_fs
from the super_operations structure.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

93bb41f4

31 Oct, 2010 2 commits

locks: remove fl_copy_lock lock_manager operation · bb8430a2

Committed by Christoph Hellwig on Oct 31, 2010

This one was only used for a nasty hack in nfsd, which has recently
been removed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bb8430a2

locks: fix setlease methods to free passed-in lock · 05fa3135

Committed by J. Bruce Fields on Oct 30, 2010

We modified setlease to require the caller to allocate the new lease in
the case of creating a new lease, but forgot to fix up the filesystem
methods.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

05fa3135

30 Oct, 2010 1 commit

readv/writev: do the same MAX_RW_COUNT truncation that read/write does · 435f49a5

Committed by Linus Torvalds on Oct 29, 2010

We used to protect against overflow, but rather than return an error, do
what read/write does, namely to limit the total size to MAX_RW_COUNT.
This is not only more consistent, but it also means that any broken
low-level read/write routine that still keeps counts in 'int' can't
break.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

435f49a5
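
The truncation described in this commit can be sketched in userspace C. This is only an illustration, not the kernel's code: clamp_iovec_total() and the SKETCH_* names are hypothetical, and the 4 KiB page size is an assumption (the kernel derives the limit from its own page mask).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/uio.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE    4096UL
/* Largest page-aligned byte count that still fits in an int. */
#define SKETCH_MAX_RW_COUNT ((size_t)INT_MAX & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Walk the iovec array and shorten segments in place so that the total
 * length never exceeds SKETCH_MAX_RW_COUNT. Returns the (possibly
 * truncated) total instead of failing with an error.
 */
size_t clamp_iovec_total(struct iovec *iov, int nr_segs)
{
	size_t total = 0;

	assert(nr_segs >= 0);
	for (int i = 0; i < nr_segs; i++) {
		size_t room = SKETCH_MAX_RW_COUNT - total;

		if (iov[i].iov_len > room)
			iov[i].iov_len = room;	/* truncate, do not fail */
		total += iov[i].iov_len;
	}
	return total;
}
```

Segments past the limit end up with a length of zero, so a low-level routine that keeps its count in an int never sees a total it cannot represent.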

29 Oct, 2010 7 commits

switch get_sb_ns() users · ceefda69

Committed by Al Viro on Jul 26, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ceefda69
convert get_sb_pseudo() users · 51139ada

Committed by Al Viro on Jul 25, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
51139ada
convert get_sb_nodev() users · 3c26ff6e

Committed by Al Viro on Jul 25, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3c26ff6e
convert get_sb_single() users · fc14f2fe

Committed by Al Viro on Jul 25, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
fc14f2fe

new helper: mount_bdev() · 152a0836

Committed by Al Viro on Jul 25, 2010

... and switch of the obvious get_sb_bdev() users to ->mount()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

152a0836

beginning of transtion: ->mount() · c96e41e9

Committed by Al Viro on Jul 25, 2010

Eventual replacement for ->get_sb(): it does *not* get a vfsmount, and
returns either ERR_PTR(error) or the root of the subtree to be mounted.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c96e41e9

Fix compile brekage with !CONFIG_BLOCK · b31d42a5

Committed by Ingo Molnar on Oct 28, 2010

Today's git tree fails to build on !CONFIG_BLOCK, due to upstream commit
367a51a3 ("fs: Add FITRIM ioctl"):

include/linux/fs.h:36: error: expected specifier-qualifier-list before ‘uint64_t’
include/linux/fs.h:36: error: expected specifier-qualifier-list before ‘uint64_t’
include/linux/fs.h:36: error: expected specifier-qualifier-list before ‘uint64_t’

The commit adds uint64_t type usage to fs.h, but linux/types.h is not included
explicitly - it's only included implicitly via linux/blk_types.h, and there only if
CONFIG_BLOCK is enabled.

Add the explicit #include to fix this.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b31d42a5

28 Oct, 2010 3 commits

fs: Add FITRIM ioctl · 367a51a3

Committed by Lukas Czerner on Oct 27, 2010

Adds a filesystem-independent ioctl to allow implementation of file
system batched discard support. It takes an fstrim_range structure as an
argument. fstrim_range is defined in include/linux/fs.h and its
definition is as follows.

struct fstrim_range {
	uint64_t start;
	uint64_t len;
	uint64_t minlen;
};

start	- first Byte to trim
len	- number of Bytes to trim from start
minlen	- minimum extent length to trim, free extents shorter than this
	  number of Bytes will be ignored. This will be rounded up to fs
	  block size.

It is also possible to specify NULL as an argument. In this case the
fields default to the following values:

start = 0;
len = ULLONG_MAX;
minlen = 0;

So it will trim the whole file system at one run.

After the FITRIM is done, the number of actually discarded Bytes is stored
in fstrim_range.len to give the user better insight on how much storage
space has been really released for wear-leveling.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

367a51a3
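
As a rough illustration of how userspace might drive this ioctl, here is a minimal sketch. FITRIM and struct fstrim_range come from <linux/fs.h>; the helper names fstrim_whole_fs_range() and trim_filesystem() are hypothetical. The call typically needs sufficient privileges and a filesystem that implements discard, so an error return is the expected outcome on many systems.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <linux/fs.h>	/* FITRIM, struct fstrim_range */
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Fill in a range covering the whole filesystem (the defaults listed above). */
void fstrim_whole_fs_range(struct fstrim_range *r)
{
	memset(r, 0, sizeof(*r));
	r->start = 0;
	r->len = ULLONG_MAX;
	r->minlen = 0;
}

/*
 * Hypothetical helper: trim the filesystem mounted at `mountpoint`.
 * Returns 0 and stores the number of discarded bytes in *trimmed,
 * or a negative errno value on failure.
 */
int trim_filesystem(const char *mountpoint, unsigned long long *trimmed)
{
	struct fstrim_range r;
	int fd, ret = 0;

	assert(mountpoint && trimmed);
	fd = open(mountpoint, O_RDONLY);
	if (fd < 0)
		return -errno;

	fstrim_whole_fs_range(&r);
	if (ioctl(fd, FITRIM, &r) < 0)
		ret = -errno;
	else
		*trimmed = r.len;	/* kernel writes back the bytes discarded */

	close(fd);
	return ret;
}
```

A caller would pass a mount point such as "/" and report *trimmed on success; the sketch passes an explicit range rather than relying on the NULL-argument behaviour.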

fasync: re-organize fasync entry insertion to allow it under a spinlock · f7347ce4

Committed by Linus Torvalds on Oct 27, 2010

You currently cannot use "fasync_helper()" in an atomic environment to
insert a new fasync entry, because it will need to allocate the new
"struct fasync_struct".

Yet fcntl_setlease() wants to call this under lock_flocks(), which is in
the process of being converted from the BKL to a spinlock.

In order to fix this, this abstracts out the actual fasync list
insertion and the fasync allocations into functions of their own, and
teaches fs/locks.c to pre-allocate the fasync_struct entry.  That way
the actual list insertion can happen while holding the required
spinlock.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bfields@redhat.com: rebase on top of my changes to Arnd's patch]
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

f7347ce4
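
The "allocate first, insert under the lock" pattern this commit describes can be sketched in userspace, with a pthread mutex standing in for the spinlock. All names here (struct entry, entry_alloc(), entry_insert(), entry_add()) are hypothetical and are not the kernel's fasync functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct entry {
	int fd;
	struct entry *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry *list_head;

/* Step 1: allocate while no lock is held, so the allocator may block. */
struct entry *entry_alloc(void)
{
	return calloc(1, sizeof(struct entry));
}

/*
 * Step 2: insert under the lock; nothing here allocates or sleeps.
 * Returns the preallocated entry if it turned out to be unnecessary
 * (fd already on the list), or NULL if it was consumed.
 */
struct entry *entry_insert(int fd, struct entry *new)
{
	struct entry *e, *unused = new;

	assert(new != NULL);
	pthread_mutex_lock(&list_lock);
	for (e = list_head; e; e = e->next)
		if (e->fd == fd)
			goto out;	/* already present: keep `new` unused */
	new->fd = fd;
	new->next = list_head;
	list_head = new;
	unused = NULL;
out:
	pthread_mutex_unlock(&list_lock);
	return unused;
}

/* Convenience wrapper for callers that are allowed to block. */
int entry_add(int fd)
{
	struct entry *new = entry_alloc();

	if (!new)
		return -1;
	free(entry_insert(fd, new));	/* free(NULL) is a no-op */
	return 0;
}
```

A caller that already holds its own lock would call entry_alloc() before taking it and entry_insert() while holding it, freeing any entry that comes back unused.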

locks/nfsd: allocate file lock outside of spinlock · c5b1f0d9

Committed by Arnd Bergmann on Oct 27, 2010

As suggested by Christoph Hellwig, this moves allocation
of new file locks out of generic_setlease into the
callers, nfs4_open_delegation and fcntl_setlease in order
to allow GFP_KERNEL allocations when lock_flocks has
become a spinlock.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>

c5b1f0d9

27 Oct, 2010 3 commits

fs: allow for more than 2^31 files · 518de9b3

Committed by Eric Dumazet on Oct 26, 2010

Robin Holt tried to boot a 16TB system and found af_unix was overflowing
a 32bit value :

<quote>

We were seeing a failure which prevented boot.  The kernel was incapable
of creating either a named pipe or unix domain socket.  This comes down
to a common kernel function called unix_create1() which does:

        atomic_inc(&unix_nr_socks);
        if (atomic_read(&unix_nr_socks) > 2 * get_max_files())
                goto out;

The function get_max_files() is a simple return of files_stat.max_files.
files_stat.max_files is a signed integer and is computed in
fs/file_table.c's files_init().

        n = (mempages * (PAGE_SIZE / 1024)) / 10;
        files_stat.max_files = n;

In our case, mempages (total_ram_pages) is approx 3,758,096,384
(0xe0000000).  That leaves max_files at approximately 1,503,238,553.
This causes 2 * get_max_files() to integer overflow.

</quote>

Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long
integers, and change af_unix to use an atomic_long_t instead of atomic_t.

get_max_files() is changed to return an unsigned long.  get_nr_files() is
changed to return a long.

unix_nr_socks is changed from atomic_t to atomic_long_t, although this is
not strictly needed to address Robin's problem.

Before patch (on a 64bit kernel) :
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
-18446744071562067968

After patch:
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
2147483648
# cat /proc/sys/fs/file-nr
704     0       2147483648
Reported-by: Robin Holt <holt@sgi.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Reviewed-by: Robin Holt <holt@sgi.com>
Tested-by: Robin Holt <holt@sgi.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

518de9b3
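
The arithmetic in the quoted report can be checked with a small sketch that does the computation in 64 bits and then asks whether the result would still fit in a signed 32-bit int. The function names and the 4 KiB page size are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096

/* The files_init() formula, evaluated in 64 bits so nothing overflows. */
int64_t max_files_for(int64_t mempages)
{
	assert(mempages >= 0);
	return (mempages * (SKETCH_PAGE_SIZE / 1024)) / 10;
}

/* Would "2 * max_files" still fit in a signed 32-bit int? */
int socket_limit_fits_in_int32(int64_t max_files)
{
	return 2 * max_files <= INT32_MAX;
}
```

With mempages = 3,758,096,384 the formula yields 1,503,238,553, which fits in 32 bits on its own, but doubling it gives 3,006,477,106, which exceeds INT32_MAX (2,147,483,647). That doubled value is what the unix_create1() comparison overflowed on.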

IMA: explicit IMA i_flag to remove global lock on inode_delete · 196f5181

Committed by Eric Paris on Oct 25, 2010

Currently for every removed inode IMA must take a global lock and search
the IMA rbtree looking for an associated integrity structure. Instead
we explicitly mark an inode when we add an integrity structure so we
only have to take the global lock and do the removal if it exists.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

196f5181

IMA: move read counter into struct inode · a178d202

Committed by Eric Paris on Oct 25, 2010

IMA currently allocates an inode integrity structure for every inode in
core.  This structure is about 120 bytes long.  Most files however
(especially on a system which doesn't make use of IMA) will never need
any of this space.  The problem is that if IMA is enabled we need to
know information about the number of readers and the number of writers
for every inode on the box.  At the moment we collect that information
in the per inode iint structure and waste the rest of the space.  This
patch moves those counters into the struct inode so we can eventually
stop allocating an IMA integrity structure except when absolutely
needed.

This patch does the minimum needed to move the location of the data.
Further cleanups, especially the location of counter updates, may still
be possible.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a178d202

26 Oct, 2010 13 commits

fs: inode split IO and LRU lists · 7ccf19a8

Committed by Nick Piggin on Oct 21, 2010

The use of the same inode list structure (inode->i_list) for two
different list constructs with different lifecycles and purposes
makes it impossible to separate the locking of the different
operations. Therefore, to enable the separation of the locking of
the writeback and reclaim lists, split the inode->i_list into two
separate lists dedicated to their specific tracking functions.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7ccf19a8

fs: use percpu counter for nr_dentry and nr_dentry_unused · 312d3ca8

Committed by Christoph Hellwig on Oct 10, 2010

The nr_dentry stat is a globally touched cacheline and atomic operation
twice over the lifetime of a dentry. It is used for the benefit of userspace
only. Turn it into a per-cpu counter and always decrement it in d_free instead
of doing various batching operations to reduce lock hold times in the callers.

Based on an earlier patch from Nick Piggin <npiggin@suse.de>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

312d3ca8

fs: do not assign default i_ino in new_inode · 85fe4025

Committed by Christoph Hellwig on Oct 23, 2010

Instead of always assigning an increasing inode number in new_inode
move the call to assign it into those callers that actually need it.
For now the set of callers that need it is estimated conservatively, that is
the call is added to all filesystems that do not assign an i_ino
by themselves.  For a few more filesystems we can avoid assigning
any inode number given that they aren't user visible, and for others
it could be done lazily when an inode number is actually needed,
but that's left for later patches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

85fe4025

new helper: ihold() · 7de9c6ee

Committed by Al Viro on Oct 23, 2010

Clones an existing reference to inode; caller must already hold one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7de9c6ee

fs: remove inode_add_to_list/__inode_add_to_list · 646ec461

Committed by Christoph Hellwig on Oct 23, 2010

Split up inode_add_to_list/__inode_add_to_list.  Locking for the two
lists will be split soon so these helpers really don't buy us much
anymore.

The __ prefixes for the sb list helpers will go away soon, but until
inode_lock is gone we'll need them to distinguish between the locked
and unlocked variants.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

646ec461

fs: Implement lazy LRU updates for inodes · 9e38d86f

Committed by Nick Piggin on Oct 23, 2010

Convert the inode LRU to use lazy updates to reduce lock and
cacheline traffic.  We avoid moving inodes around in the LRU list
during iget/iput operations so these frequent operations don't need
to access the LRUs. Instead, we defer the refcount checks to
reclaim-time and use a per-inode state flag, I_REFERENCED, to tell
reclaim that iget has touched the inode in the past. This means that
only reclaim should be touching the LRU with any frequency, hence
significantly reducing lock acquisitions and the amount of contention
on LRU updates.

This also removes the inode_in_use list, which means we now only
have one list for tracking the inode LRU status. This makes it much
simpler to split out the LRU list operations under its own lock.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9e38d86f
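
The lazy-update idea can be modelled with a deliberately simplified userspace sketch. It is not the kernel's implementation: struct sketch_inode, its fields, and the sketch_* functions are hypothetical, and the `referenced` flag merely stands in for I_REFERENCED. The point it shows is that the get/put paths never reorder the LRU; only the reclaim scan decides what happens to an inode.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_inode {
	int refcount;		/* active users */
	bool referenced;	/* stand-in for I_REFERENCED */
	bool on_lru;		/* is the inode linked on the LRU? */
};

enum scan_result { SCAN_SKIP_IN_USE, SCAN_ROTATE, SCAN_RECLAIM };

/* Get side: take a reference; the LRU is not touched at all. */
void sketch_iget(struct sketch_inode *inode)
{
	inode->refcount++;
}

/* Put side: on the last reference, only set flags; no list reordering. */
void sketch_iput(struct sketch_inode *inode)
{
	assert(inode->refcount > 0);
	if (--inode->refcount == 0) {
		inode->referenced = true;
		inode->on_lru = true;
	}
}

/* Reclaim side: the refcount and flag checks are deferred to this point. */
enum scan_result sketch_scan_one(struct sketch_inode *inode)
{
	if (inode->refcount > 0) {	/* in use: lazily drop it from the LRU */
		inode->on_lru = false;
		return SCAN_SKIP_IN_USE;
	}
	if (inode->referenced) {	/* recently used: give it another pass */
		inode->referenced = false;
		return SCAN_ROTATE;
	}
	inode->on_lru = false;		/* idle and unreferenced: reclaim it */
	return SCAN_RECLAIM;
}
```

An inode that was touched since the last scan survives one extra pass and is reclaimed on the next one if nobody uses it in between.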

fs: Convert nr_inodes and nr_unused to per-cpu counters · cffbc8aa

Committed by Dave Chinner on Oct 23, 2010

The number of inodes allocated does not need to be tied to the
addition or removal of an inode to/from a list. If we are not tied
to a list lock, we could update the counters when inodes are
initialised or destroyed, but to do that we need to convert the
counters to be per-cpu (i.e. independent of a lock). This means that
we have the freedom to change the list/locking implementation
without needing to care about the counters.

Based on a patch originally from Eric Dumazet.

[AV: cleaned up a bit, fixed build breakage on weird configs]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cffbc8aa

new helper: inode_unhashed() · 1d3382cb

Committed by Al Viro on Oct 23, 2010

Note: for race-free uses you need inode_lock held.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1d3382cb

unexport invalidate_inodes · a8dade34

Committed by Al Viro on Oct 24, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a8dade34

vfs: introduce FMODE_UNSIGNED_OFFSET for allowing negative f_pos · 4a3956c7

Committed by KAMEZAWA Hiroyuki on Oct 01, 2010

Currently, rw_verify_area() checks whether f_pos is negative and, if so,
returns -EINVAL.

But some special files, such as /dev/(k)mem and /proc/<pid>/mem, use
offsets that are negative when treated as signed values, so those regions
cannot be accessed via read/write on the file (device).

So introduce FMODE_UNSIGNED_OFFSET to allow negative file offsets.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4a3956c7
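
The shape of the check this commit describes can be sketched as follows. This is a hypothetical userspace model, not the kernel's rw_verify_area(): the flag value and the function name sketch_verify_pos() are made up for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical bit value; the real flag lives in the file's f_mode. */
#define SKETCH_FMODE_UNSIGNED_OFFSET 0x1u

/*
 * Reject a negative position unless the file has opted in to having its
 * offset interpreted as an unsigned quantity.
 */
int sketch_verify_pos(unsigned int f_mode, int64_t pos)
{
	if (pos < 0 && !(f_mode & SKETCH_FMODE_UNSIGNED_OFFSET))
		return -EINVAL;
	return 0;
}
```

Ordinary files keep the old behaviour, while a file that sets the flag (as a /dev/mem-style device would) can be read at an offset whose top bit is set.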

fs: allow for more than 2^31 files · 7e360c38

Committed by Eric Dumazet on Oct 05, 2010

Andrew,

Could you please review this patch, you probably are the right guy to
take it, because it crosses fs and net trees.

Note: /proc/sys/fs/file-nr is a read-only file, so this patch doesn't
depend on previous patch (sysctl: fix min/max handling in
__do_proc_doulongvec_minmax())

Thanks !

[PATCH V4] fs: allow for more than 2^31 files

Robin Holt tried to boot a 16TB system and found af_unix was overflowing
a 32bit value :

<quote>

We were seeing a failure which prevented boot.  The kernel was incapable
of creating either a named pipe or unix domain socket.  This comes down
to a common kernel function called unix_create1() which does:

        atomic_inc(&unix_nr_socks);
        if (atomic_read(&unix_nr_socks) > 2 * get_max_files())
                goto out;

The function get_max_files() is a simple return of files_stat.max_files.
files_stat.max_files is a signed integer and is computed in
fs/file_table.c's files_init().

        n = (mempages * (PAGE_SIZE / 1024)) / 10;
        files_stat.max_files = n;

In our case, mempages (total_ram_pages) is approx 3,758,096,384
(0xe0000000).  That leaves max_files at approximately 1,503,238,553.
This causes 2 * get_max_files() to integer overflow.

</quote>

Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long
integers, and change af_unix to use an atomic_long_t instead of
atomic_t.

get_max_files() is changed to return an unsigned long.
get_nr_files() is changed to return a long.

unix_nr_socks is changed from atomic_t to atomic_long_t, although this is
not strictly needed to address Robin's problem.

Before patch (on a 64bit kernel) :
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
-18446744071562067968

After patch:
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
2147483648
# cat /proc/sys/fs/file-nr
704     0       2147483648
Reported-by: Robin Holt <holt@sgi.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Reviewed-by: Robin Holt <holt@sgi.com>
Tested-by: Robin Holt <holt@sgi.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7e360c38

fs: mark destroy_inode static · 56b0dacf

Committed by Christoph Hellwig on Oct 06, 2010

Hugetlbfs used to need it, but after the destroy_inode and evict_inode
changes it's not required anymore.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

56b0dacf

fs: add sync_inode_metadata · c3765016

Committed by Christoph Hellwig on Oct 06, 2010

Add a new helper to write out the inode using the writeback code,
including the correct dirty bit and list manipulation.  A few
filesystems already open-code this, and a lot of others should be
using it instead of write_inode_now, which also writes out the
data.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c3765016

05 Oct, 2010 1 commit

fs/locks.c: prepare for BKL removal · b89f4321

Committed by Arnd Bergmann on Sep 18, 2010

This prepares the removal of the big kernel lock from the
file locking code. We still use the BKL as long as fs/lockd
uses it and ceph might sleep, but we can flip the definition
to a private spinlock as soon as that's done.
All users outside of fs/lockd get converted to use
lock_flocks() instead of lock_kernel() where appropriate.

Based on an earlier patch to use a spinlock from Matthew
Wilcox, who has attempted this a few times before, the
earliest patch from over 10 years ago turned it into
a semaphore, which ended up being slower than the BKL
and was subsequently reverted.

Someone should do some serious performance testing when
this becomes a spinlock, since this has caused problems
before. Using a spinlock should be at least as good
as the BKL in theory, but who knows...
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Sage Weil <sage@newdream.net>
Cc: linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org

b89f4321

22 Sep, 2010 1 commit

fs: {lock,unlock}_flocks() stubs to prepare for BKL removal · 8b15575c

Committed by Sage Weil on Sep 21, 2010

The lock structs are currently protected by the BKL, but are accessed by
code in fs/locks.c and misc file system and DLM code.  These stubs will
allow all users to switch to the new interface before the implementation
is changed to a spinlock.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8b15575c

16 Sep, 2010 1 commit

libfs: use generic_file_llseek for simple_attr · 1ec5584e

Committed by Arnd Bergmann on Aug 15, 2010

Simple attribute files need to be seekable to
allow resetting the file for another read.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

1ec5584e

10 Sep, 2010 2 commits

ext3/ext4: Factor out disk addressability check · 30ca22c7

Committed by Patrick J. LoPresti on Jul 22, 2010

As part of adding support for OCFS2 to mount huge volumes, we need to
check that the sector_t and page cache of the system are capable of
addressing the entire volume.

An identical check already appears in ext3 and ext4.  This patch moves
the addressability check into its own function in fs/libfs.c and
modifies ext3 and ext4 to invoke it.

[Edited to -EINVAL instead of BUG_ON() for bad blocksize_bits -- Joel]
Signed-off-by: Patrick LoPresti <lopresti@gmail.com>
Cc: linux-ext4@vger.kernel.org
Acked-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

30ca22c7
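
The kind of check this commit factors out can be sketched in userspace. This is an approximation under stated assumptions, not the function added to fs/libfs.c: check_addressable() is a hypothetical name, the 4 KiB page size is assumed, the widths of sector_t and pgoff_t are passed in as parameters (in the kernel they are fixed by the build configuration), and returning -EFBIG for an oversized volume is this sketch's choice. Only the -EINVAL for bad blocksize_bits is stated in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * sector_bits / pgoff_bits model the widths of sector_t and pgoff_t
 * (32 or 64 bits).
 */
int check_addressable(unsigned blocksize_bits, uint64_t num_blocks,
		      unsigned sector_bits, unsigned pgoff_bits)
{
	uint64_t last_block, last_page, max_sector, max_pgoff;

	assert(sector_bits == 32 || sector_bits == 64);
	assert(pgoff_bits == 32 || pgoff_bits == 64);

	if (num_blocks == 0)
		return 0;
	if (blocksize_bits < 9 || blocksize_bits > SKETCH_PAGE_SHIFT)
		return -EINVAL;

	last_block = num_blocks - 1;
	last_page = last_block >> (SKETCH_PAGE_SHIFT - blocksize_bits);
	max_sector = sector_bits == 64 ? UINT64_MAX : UINT32_MAX;
	max_pgoff = pgoff_bits == 64 ? UINT64_MAX : UINT32_MAX;

	/* Sectors are 512 bytes: one block spans 2^(blocksize_bits - 9) of them. */
	if (last_block > (max_sector >> (blocksize_bits - 9)) ||
	    last_page > max_pgoff)
		return -EFBIG;
	return 0;
}
```

For example, a volume of 2^30 blocks of 4 KiB each (4 TiB) is rejected when sector_t is 32 bits wide but accepted with 64-bit types.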

block: remove the BLKDEV_IFL_BARRIER flag · 8c555367

Committed by Christoph Hellwig on Aug 18, 2010

Remove support for barriers on discards, which is unused now.  Also
remove the DISCARD_NOBARRIER I/O type in favour of just setting the
rw flags up locally in blkdev_issue_discard.

tj: Also remove DISCARD_SECURE and use REQ_SECURE directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

8c555367