提交 · 750d1151a6c95ef9b9a188bb7cff6b80ee30da17 · openeuler / raspberrypi-kernel

27 6月, 2006 1 次提交
- S
  [CIFS] Fix allocation of buffers for new session setup routine to allow · 750d1151
  由 Steve French 提交于 6月 27, 2006
```
longer user and domain names and allow passing sec options on mount
Signed-off-by: NSteve French <sfrench@us.ibm.com>
```
  750d1151
26 6月, 2006 2 次提交

[CIFS] Remove calls to to take f_owner.lock · 124a27fe

由 Ingo Molnar 提交于 6月 26, 2006

CIFS takes/releases f_owner.lock - why?  It does not change anything in the
fowner state.  Remove this locking.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NSteve French <sfrench@us.ibm.com>

124a27fe

S
[CIFS] remove some redundant null pointer checks · cd49b492
由 Steve French 提交于 6月 26, 2006
```
some of them pointed out by Dave Jones
Signed-off-by: NSteve French <sfrench@us.ibm.com>
```
cd49b492

25 6月, 2006 1 次提交
- S
  [CIFS] Fix compile warning when CONFIG_CIFS_EXPERIMENTAL is off · f90f00a3
  由 Steve French 提交于 6月 25, 2006
```
Signed-off-by: NSteve French <sfrench@us.ibm.com>
```
  f90f00a3
23 6月, 2006 34 次提交

[PATCH] splice: retrieve mapping after locking the page · 9e94cd4f

由 Jens Axboe 提交于 6月 20, 2006

Otherwise we could be racing with truncate/mapping removal.

Problem found/fixed by Nick Piggin <npiggin@suse.de>, logic rewritten
by me.
Signed-off-by: NJens Axboe <axboe@suse.de>

9e94cd4f

[PATCH] Kill PF_SYNCWRITE flag · b31dc66a

由 Jens Axboe 提交于 6月 13, 2006

A process flag to indicate whether we are doing sync io is incredibly
ugly. It also causes performance problems when one does a lot of async
io and then proceeds to sync it. Part of the io will go out as async,
and the other part as sync. This causes a disconnect between the
previously submitted io and the synced io. For io schedulers such as CFQ,
this will cause us lost merges and suboptimal behaviour in scheduling.

Remove PF_SYNCWRITE completely from the fsync/msync paths, and let
the O_DIRECT path just directly indicate that the writes are sync
by using WRITE_SYNC instead.
Signed-off-by: NJens Axboe <axboe@suse.de>

b31dc66a

[PATCH] make kernel warn about incorrectly sized partitions · 98bd34ea

由 Mike Miller 提交于 6月 23, 2006

Sometimes partitions claim to be larger than the reported capacity of a
disk device. This patch makes the kernel warn about those partitions.

We still permit these patitions to be used. Quoting Andries Brouwer
<Andries.Brouwer@cwi.nl>:

Case 1: The kernel is mistaken about the size of the disk. (There are
commands to clip a disk to a certain capacity, there are jumpers to tell a
disk that it should report a certain capacity etc. Usually this is because
of BIOS bugs. In bad cases the machine will crash in the BIOS and hence fail
to boot if the disk reports full capacity.) In such cases actually accessing
the blocks of the partition may work fine, or may work fine after running an
unclip utility. I wrote "setmax" some years ago precisely for this reason.

Case 2: There was a messy partition table (maybe just a rounding error) but
the actual filesystem on the partition is contained in the physical disk.
Now using the filesystem goes without problem.

Case 3: Both partition and filesystem extend beyond the end of the disk. In
forensic or debugging situations one often uses a copy of the start of a
disk. Now access beyond the end gives an expected I/O error.
Signed-off-by: NMike Miller <mike.miller@hp.com>
Signed-off-by: NStephen Cameron <steve.cameron@hp.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

98bd34ea

[PATCH] JBD: split checkpoint lists · 78ce89c9

由 Jan Kara 提交于 6月 23, 2006

Split the checkpoint list of the transaction into two lists.  In the first
list we keep the buffers that need to be submitted for IO.  In the second
list are kept buffers that were already submitted and we just have to wait
for the IO to complete.  This should simplify a handling of checkpoint
lists a bit and can eventually be also a performance gain.
Signed-off-by: NJan Kara <jack@suse.cz>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

78ce89c9

[PATCH] list: use list_replace_init() instead of list_splice_init() · 626ab0e6

由 Oleg Nesterov 提交于 6月 23, 2006

list_splice_init(list, head) does unneeded job if it is known that
list_empty(head) == 1.  We can use list_replace_init() instead.
Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

626ab0e6

[PATCH] percpu counter data type changes to suppport more than 2**31 ext3 free blocks counter · 0216bfcf

由 Mingming Cao 提交于 6月 23, 2006

The percpu counter data type are changed in this set of patches to support
more users like ext3 who need more than 32 bit to store the free blocks
total in the filesystem.

- Generic perpcu counters data type changes.  The size of the global counter
  and local counter were explictly specified using s64 and s32.  The global
  counter is changed from long to s64, while the local counter is changed from
  long to s32, so we could avoid doing 64 bit update in most cases.

- Users of the percpu counters are updated to make use of the new
  percpu_counter_init() routine now taking an additional parameter to allow
  users to pass the initial value of the global counter.
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

0216bfcf

[PATCH] binflt_elf: remove more casts · 785d5570

由 Jesper Juhl 提交于 6月 23, 2006

Remove redundant casts from NEW_AUX_ENT() arguments in fs/binfmt_elf.c
Signed-off-by: NJesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

785d5570

[PATCH] binfmt_elf: CodingStyle cleanup and remove some pointless casts · f4e5cc2c

由 Jesper Juhl 提交于 6月 23, 2006

Do a CodingStyle cleanup of fs/binfmt_elf.c and also remove some pointless
casts of kmalloc() return values in the same file.
Signed-off-by: NJesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f4e5cc2c

[PATCH] ext3_clear_inode(): avoid kfree(NULL) · e6022603

由 Andrew Morton 提交于 6月 23, 2006

Steven Rostedt <rostedt@goodmis.org> points out that `rsv' here is usually
NULL, so we should avoid calling kfree().

Also, fix up some nearby whitespace damage.
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e6022603

[PATCH] jbd: avoid kfree(NULL) · 304c4c84

由 Andrew Morton 提交于 6月 23, 2006

There are a couple of places where JBD has to check to see whether an unneeded
memory allocation was performed.  Usually it _was_ needed, so we end up
calling kfree(NULL).  We can micro-optimise that by checking the pointer
before calling kfree().

Thanks to Steven Rostedt <rostedt@goodmis.org> for identifying this.
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

304c4c84

[PATCH] jbd: fix BUG in journal_commit_transaction() · 9ada7340

由 Jan Kara 提交于 6月 23, 2006

Fix possible assertion failure in journal_commit_transaction() on
jh->b_next_transaction == NULL (when we are processing BJ_Forget list and
buffer is not jbddirty).

!jbddirty buffers can be placed on BJ_Forget list for example by
journal_forget() or by __dispose_buffer() - generally such buffer means
that it has been freed by this transaction.

Freed buffers should not be reallocated until the transaction has committed
(that's why we have the assertion there) but they *can* be reallocated when
the transaction has already been committed to disk and we are just
processing the BJ_Forget list (as soon as we remove b_committed_data from
the bitmap bh, ext3 will be able to reallocate buffers freed by the
committing transaction).  So we have to also count with the case that the
buffer has been reallocated and b_next_transaction has been already set.

And one more subtle point: it can happen that we manage to reallocate the
buffer and also mark it jbddirty.  Then we also add the freed buffer to the
checkpoint list of the committing trasaction.  But that should do no harm.

Non-jbddirty buffers should be filed to BJ_Reserved and not BJ_Metadata
list.  It can actually happen that we refile such buffers during the commit
phase when we reallocate in the running transaction blocks deleted in
committing transaction (and that can happen if the committing transaction
already wrote all the data and is just cleaning up BJ_Forget list).
Signed-off-by: NJan Kara <jack@suse.cz>
Acked-by: N"Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

9ada7340

[PATCH] Poll cleanups/microoptimizations · 4a4b69f7

由 Vadim Lobanov 提交于 6月 23, 2006

The "count" and "pt" variables are declared and modified by do_poll(), as
well as accessed and written indirectly in the do_pollfd() subroutine.

This patch pulls all handling of these variables into the do_poll()
function, thereby eliminating the odd use of indirection in do_pollfd().
This is done by pulling the "struct pollfd" traversal loop from do_pollfd()
into its only caller do_poll(). As an added bonus, the patch saves a few
clock cycles, and also adds comments to make the code easier to follow.
Signed-off-by: NVadim Lobanov <vlobanov@speakeasy.net>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4a4b69f7

[PATCH] fs/fat/misc.c: unexport fat_sync_bhs · 2da13264

由 Adrian Bunk 提交于 6月 23, 2006

Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Acked-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

2da13264

[PATCH] fs/locks.c: make posix_locks_deadlock() static · b0904e14

由 Adrian Bunk 提交于 6月 23, 2006

We can now make posix_locks_deadlock() static.
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b0904e14

[PATCH] vfs: add lock owner argument to flush operation · 75e1fcc0

由 Miklos Szeredi 提交于 6月 23, 2006

Pass the POSIX lock owner ID to the flush operation.

This is useful for filesystems which don't want to store any locking state
in inode->i_flock but want to handle locking/unlocking POSIX locks
internally.  FUSE is one such filesystem but I think it possible that some
network filesystems would need this also.

Also add a flag to indicate that a POSIX locking request was generated by
close(), so filesystems using the above feature won't send an extra locking
request in this case.
Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

75e1fcc0

[PATCH] locks: clean up locks_remove_posix() · ff7b86b8

由 Miklos Szeredi 提交于 6月 23, 2006

locks_remove_posix() can use posix_lock_file() instead of doing the lock
removal by hand.  posix_lock_file() now does exacly the same.

The comment about pids no longer applies, posix_lock_file() takes only the
owner into account.
Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

ff7b86b8

[PATCH] locks: don't do unnecessary allocations · 39005d02

由 Miklos Szeredi 提交于 6月 23, 2006

posix_lock_file() always allocates new locks in advance, even if it's easy to
determine that no allocations will be needed.

Optimize these cases:

 - FL_ACCESS flag is set

 - Unlocking the whole range
Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

39005d02

[PATCH] locks: don't unnecessarily fail posix lock operations · 0d9a490a

由 Miklos Szeredi 提交于 6月 23, 2006

posix_lock_file() was too cautious, failing operations on OOM, even if they
didn't actually require an allocation.

This has the disadvantage, that a failing unlock on process exit could lead to
a memory leak.  There are two possibilites for this:

- filesystem implements .lock() and calls back to posix_lock_file().  On
cleanup of files_struct locks_remove_posix() is called which should remove all
locks belonging to files_struct.  However if filesystem calls
posix_lock_file() which fails, then those locks will never be freed.

- if a file is closed while a lock is blocked, then after acquiring
fcntl_setlk() will undo the lock.  But this unlock itself might fail on OOM,
again possibly leaking the lock.

The solution is to move the checking of the allocations until after it is sure
that they will be needed.  This will solve the above problem since unlock will
always succeed unless it splits an existing region.
Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

0d9a490a

[PATCH] read_mapping_page for address space · 090d2b18

由 Pekka Enberg 提交于 6月 23, 2006

Add read_mapping_page() which is used for callers that pass
mapping->a_ops->readpage as the filler for read_cache_page.  This removes
some duplication from filesystem code.
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

090d2b18

[PATCH] ext2 XIP won't build without MMU · 0c426f26

由 Al Viro 提交于 6月 23, 2006

Disable Ext2 XIP if the kernel is configured in no-MMU mode as the former
won't build.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

0c426f26

[PATCH] frv: binfmt_elf_fdpic __user annotations · 530018bf

由 Al Viro 提交于 6月 23, 2006

Add __user annotations to binfmt_elf_fdpic.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

530018bf

[PATCH] lsm: add task_setioprio hook · 03e68060

由 James Morris 提交于 6月 23, 2006

Implement an LSM hook for setting a task's IO priority, similar to the hook
for setting a tasks's nice value.

A previous version of this LSM hook was included in an older version of
multiadm by Jan Engelhardt, although I don't recall it being submitted
upstream.

Also included is the corresponding SELinux hook, which re-uses the setsched
permission in the proccess class.
Signed-off-by: NJames Morris <jmorris@namei.org>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

03e68060

[PATCH] writeback: fix range handling · 111ebb6e

由 OGAWA Hirofumi 提交于 6月 23, 2006

When a writeback_control's `start' and `end' fields are used to
indicate a one-byte-range starting at file offset zero, the required
values of .start=0,.end=0 mean that the ->writepages() implementation
has no way of telling that it is being asked to perform a range
request.  Because we're currently overloading (start == 0 && end == 0)
to mean "this is not a write-a-range request".

To make all this sane, the patch changes range of writeback_control.

So caller does: If it is calling ->writepages() to write pages, it
sets range (range_start/end or range_cyclic) always.

And if range_cyclic is true, ->writepages() thinks the range is
cyclic, otherwise it just uses range_start and range_end.

This patch does,

    - Add LLONG_MAX, LLONG_MIN, ULLONG_MAX to include/linux/kernel.h
      -1 is usually ok for range_end (type is long long). But, if someone did,

		range_end += val;		range_end is "val - 1"
		u64val = range_end >> bits;	u64val is "~(0ULL)"

      or something, they are wrong. So, this adds LLONG_MAX to avoid nasty
      things, and uses LLONG_MAX for range_end.

    - All callers of ->writepages() sets range_start/end or range_cyclic.

    - Fix updates of ->writeback_index. It seems already bit strange.
      If it starts at 0 and ended by check of nr_to_write, this last
      index may reduce chance to scan end of file.  So, this updates
      ->writeback_index only if range_cyclic is true or whole-file is
      scanned.
Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

111ebb6e

[PATCH] tightening hugetlb strict accounting · a43a8c39

由 Chen, Kenneth W 提交于 6月 23, 2006

Current hugetlb strict accounting for shared mapping always assume mapping
starts at zero file offset and reserves pages between zero and size of the
file.  This assumption often reserves (or lock down) a lot more pages then
necessary if application maps at none zero file offset.  libhugetlbfs is
one example that requires proper reservation on shared mapping starts at
none zero offset.

This patch extends the reservation and hugetlb strict accounting to support
any arbitrary pair of (offset, len), resulting a much more robust and
accurate scheme.  More importantly, it won't lock down any hugetlb pages
outside file mapping.
Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
Acked-by: NAdam Litke <agl@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

a43a8c39

[PATCH] for_each_possible_cpu: xfs · 6f0419e0

由 KAMEZAWA Hiroyuki 提交于 6月 23, 2006

for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu.
in xfs.
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: NNathan Scott <nathans@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

6f0419e0

[PATCH] XFS: Use the dentry passed to statfs() to limit the scope of the results · d6938d1b

由 David Howells 提交于 6月 23, 2006

Enable XFS to limit the statfs() results to the project quota covering the
dentry used as a base for call.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NNathan Scott <nathans@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d6938d1b

[PATCH] VFS: Permit filesystem to perform statfs with a known root dentry · 726c3342

由 David Howells 提交于 6月 23, 2006

Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch.  That reduced the significance of
sb->s_root, allowing NFS to place a fake root there.  However, NFS does
require a dentry to use as a target for the statfs operation.  This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

726c3342

[PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398

由 David Howells 提交于 6月 23, 2006

Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience function is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing are currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

454e2398

S
[CIFS] Enable sec flags on mount for cifs (part one) · 189acaae
由 Steve French 提交于 6月 23, 2006
```
Signed-off-by: NSteve French <sfrench@us.ibm.com>
```
189acaae

[PATCH] prune_one_dentry() tweaks · d702ccb3

由 Andrew Morton 提交于 6月 22, 2006

- Add description of d_lock handling to comments over prune_one_dentry().

- It has three callsites - uninline it, saving 200 bytes of text.

Cc: Jan Blunck <jblunck@suse.de>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d702ccb3

[PATCH] Fix dcache race during umount · 0feae5c4

由 NeilBrown 提交于 6月 22, 2006

The race is that the shrink_dcache_memory shrinker could get called while a
filesystem is being unmounted, and could try to prune a dentry belonging to
that filesystem.

If it does, then it will call in to iput on the inode while the dentry is
no longer able to be found by the umounting process.  If iput takes a
while, generic_shutdown_super could get all the way though
shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
ever waiting on this particular inode.

Eventually the superblock gets freed anyway and if the iput tried to touch
it (which some filesystems certainly do), it will lose.  The promised
"Self-destruct in 5 seconds" doesn't lead to a nice day.

The race is closed by holding s_umount while calling prune_one_dentry on
someone else's dentry.  As a down_read_trylock is used,
shrink_dcache_memory will no longer try to prune the dentry of a filesystem
that is being unmounted, and unmount will not be able to start until any
such active prune_one_dentry completes.

This requires that prune_dcache *knows* which filesystem (if any) it is
doing the prune on behalf of so that it can be careful of other
filesystems.  shrink_dcache_memory isn't called it on behalf of any
filesystem, and so is careful of everything.

shrink_dcache_anon is now passed a super_block rather than the s_anon list
out of the superblock, so it can get the s_anon list itself, and can pass
the superblock down to prune_dcache.

If prune_dcache finds a dentry that it cannot free, it leaves it where it
is (at the tail of the list) and exits, on the assumption that some other
thread will be removing that dentry soon.  To try to make sure that some
work gets done, a limited number of dnetries which are untouchable are
skipped over while choosing the dentry to work on.

I believe this race was first found by Kirill Korotaev.

Cc: Jan Blunck <jblunck@suse.de>
Acked-by: NKirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Acked-by: NBalbir Singh <balbir@in.ibm.com>
Signed-off-by: NNeil Brown <neilb@suse.de>
Signed-off-by: NBalbir Singh <balbir@in.ibm.com>
Acked-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

0feae5c4

[PATCH] remove steal_locks() · c89681ed

由 Miklos Szeredi 提交于 6月 22, 2006

This patch removes the steal_locks() function.

steal_locks() doesn't work correctly with any filesystem that does it's own
lock management, including NFS, CIFS, etc.

In addition it has weird semantics on local filesystems in case tasks
sharing file-descriptor tables are doing POSIX locking operations in
parallel to execve().

The steal_locks() function has an effect on applications doing:

clone(CLONE_FILES)
  /* in child */
  lock
  execve
  lock

POSIX locks acquired before execve (by "child", "parent" or any further
task sharing files_struct) will after the execve be owned exclusively by
"child".

According to Chris Wright some LSB/LTP kind of suite triggers without the
stealing behavior, but there's no known real-world application that would
also fail.

Apps using NPTL are not affected, since all other threads are killed before
execve.

Apps using LinuxThreads are only affected if they

  - have multiple threads during exec (LinuxThreads doesn't kill other
    threads, the app may do it with pthread_kill_other_threads_np())
  - rely on POSIX locks being inherited across exec

Both conditions are documented, but not their interaction.

Apps using clone() natively are affected if they

  - use clone(CLONE_FILES)
  - rely on POSIX locks being inherited across exec

The above scenarios are unlikely, but possible.

If the patch is vetoed, there's a plan B, that involves mostly keeping the
weird stealing semantics, but changing the way lock ownership is handled so
that network and local filesystems work consistently.

That would add more complexity though, so this solution seems to be
preferred by most people.
Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c89681ed

[PATCH] Fix a race condition between ->i_mapping and iput() · 09d967c6

由 OGAWA Hirofumi 提交于 6月 22, 2006

This race became a cause of oops, and can reproduce by the following.

    while true; do
	dd if=/dev/zero of=/dev/.static/dev/hdg1 bs=512 count=1000 & sync
    done

This race condition was between __sync_single_inode() and iput().

          cpu0 (fs's inode)                 cpu1 (bdev's inode)
          -----------------                 -------------------
                                       close("/dev/hda2")
                                       [...]
__sync_single_inode()
   /* copy the bdev's ->i_mapping */
   mapping = inode->i_mapping;

                                       generic_forget_inode()
                                          bdev_clear_inode()
					     /* restre the fs's ->i_mapping */
				             inode->i_mapping = &inode->i_data;
				          /* bdev's inode was freed */
                                          destroy_inode(inode);

   if (wait) {
      /* dereference a freed bdev's mapping->host */
      filemap_fdatawait(mapping);  /* Oops */

Since __sync_single_inode() is only taking a ref-count of fs's inode, the
another process can be close() and freeing the bdev's inode while writing
fs's inode.  So, __sync_signle_inode() accesses the freed ->i_mapping,
oops.

This patch takes a ref-count on the bdev's inode for the fs's inode before
setting a ->i_mapping, and the clear_inode() of the fs's inode does iput() on
the bdev's inode.  So if the fs's inode is still living, bdev's inode
shouldn't be freed.
Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

09d967c6

[PATCH] NTFS: Critical bug fix (affects MIPS and possibly others) · f893afbe

由 Anton Altaparmakov 提交于 6月 22, 2006

Many thanks to Pauline Ng for the detailed bug report and analysis!
Signed-off-by: NAnton Altaparmakov <aia21@cantab.net>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f893afbe

22 6月, 2006 1 次提交

[PATCH] Driver core: add generic "subsystem" link to all devices · b9d9c82b

由 Kay Sievers 提交于 6月 15, 2006

Like the SUBSYTEM= key we find in the environment of the uevent, this
creates a generic "subsystem" link in sysfs for every device. Userspace
usually doesn't care at all if its a "class" or a "bus" device. This
provides an unified way to determine the subsytem of a device, regardless
of the way the driver core has created it.
Signed-off-by: NKay Sievers <kay.sievers@suse.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

b9d9c82b

20 6月, 2006 1 次提交

[PATCH] log more info for directory entry change events · 9c937dcc

由 Amy Griffis 提交于 6月 08, 2006

When an audit event involves changes to a directory entry, include
a PATH record for the directory itself.  A few other notable changes:

    - fixed audit_inode_child() hooks in fsnotify_move()
    - removed unused flags arg from audit_inode()
    - added audit log routines for logging a portion of a string

Here's some sample output.

before patch:
type=SYSCALL msg=audit(1149821605.320:26): arch=40000003 syscall=39 success=yes exit=0 a0=bf8d3c7c a1=1ff a2=804e1b8 a3=bf8d3c7c items=1 ppid=739 pid=800 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
type=CWD msg=audit(1149821605.320:26):  cwd="/root"
type=PATH msg=audit(1149821605.320:26): item=0 name="foo" parent=164068 inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0

after patch:
type=SYSCALL msg=audit(1149822032.332:24): arch=40000003 syscall=39 success=yes exit=0 a0=bfdd9c7c a1=1ff a2=804e1b8 a3=bfdd9c7c items=2 ppid=714 pid=777 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
type=CWD msg=audit(1149822032.332:24):  cwd="/root"
type=PATH msg=audit(1149822032.332:24): item=0 name="/root" inode=164068 dev=03:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_dir_t:s0
type=PATH msg=audit(1149822032.332:24): item=1 name="foo" inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0
Signed-off-by: NAmy Griffis <amy.griffis@hp.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9c937dcc