Commits · 95a631e2d9853c9138e14fbaa9a51e6451f040b4 · openanolis / cloud-kernel

20 Jul, 2007 4 commits

Committed by Nick Piggin on Jul 19, 2007

This patch completes Linus's wish that the fault return codes be made into
bit flags, which I agree makes everything nicer.  This requires
all handle_mm_fault callers to be modified (possibly the modifications
should go further and do things like fault accounting in handle_mm_fault --
however that would be for another patch).

[akpm@linux-foundation.org: fix alpha build]
[akpm@linux-foundation.org: fix s390 build]
[akpm@linux-foundation.org: fix sparc build]
[akpm@linux-foundation.org: fix sparc64 build]
[akpm@linux-foundation.org: fix ia64 build]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Bryan Wu <bryan.wu@analog.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Still apparently needs some ARM and PPC loving - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

83c54070
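
A minimal sketch of what "fault return codes as bit flags" buys a caller of handle_mm_fault, as described in 83c54070 above. The flag names and values here are hypothetical stand-ins, not the kernel's actual constants; the point is only that several conditions can be tested independently with a mask instead of a switch over exclusive codes.

```c
/* Hypothetical flag values, for illustration only. */
enum sketch_fault_flag {
    SKETCH_FAULT_OOM    = 0x01,
    SKETCH_FAULT_SIGBUS = 0x02,
    SKETCH_FAULT_MAJOR  = 0x04,
    SKETCH_FAULT_ERROR  = SKETCH_FAULT_OOM | SKETCH_FAULT_SIGBUS,
};

struct sketch_fault_stats {
    unsigned long maj_flt;
    unsigned long min_flt;
};

/*
 * Hypothetical arch-side caller: returns 0 when the fault was handled,
 * -1 when any error bit is set. Accounting only happens on success.
 */
static int sketch_account_fault(unsigned int fault,
                                struct sketch_fault_stats *st)
{
    if (fault & SKETCH_FAULT_ERROR)
        return -1;
    if (fault & SKETCH_FAULT_MAJOR)
        st->maj_flt++;
    else
        st->min_flt++;
    return 0;
}
```

Because the error test is a single mask, adding a new error bit later would only require extending the mask rather than touching every caller.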

mm: fault feedback #1 · d0217ac0

Committed by Nick Piggin on Jul 19, 2007

Change ->fault prototype.  We now return an int, which contains
VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
 FAULT_RET_ code tells the VM whether a page was found, whether it has been
locked, and potentially other things.  This is not quite the way he wanted
it yet, but that's changed in the next patch (which requires changes to
arch code).

This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
that a page is locked which requires filemap_nopage to go away (because we
can no longer remain backward compatible without that flag), but we were
going to do that anyway.

struct fault_data is renamed to struct vm_fault as Linus asked. address
is now a void __user * that we should firmly encourage drivers not to use
without really good reason.

The page is now returned via a page pointer in the vm_fault struct.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d0217ac0
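
The interim return format described in d0217ac0 above (an outcome code in the low byte, feedback flags in the next byte) can be sketched as follows. The constant names and values are assumptions made for this example; only the byte layout comes from the commit message.

```c
#include <assert.h>

/* Hypothetical codes, for illustration only. */
#define SKETCH_VM_FAULT_MINOR    0x01u  /* outcome code, lives in the low byte */
#define SKETCH_FAULT_RET_LOCKED  0x02u  /* feedback flag, lives in the next byte */

/* Pack an outcome code and feedback flags into one int-sized value. */
static unsigned int sketch_pack_fault(unsigned int code, unsigned int ret_flags)
{
    assert(code <= 0xffu && ret_flags <= 0xffu);
    return code | (ret_flags << 8);
}

/* Extract the low-byte outcome code. */
static unsigned int sketch_fault_code(unsigned int v)
{
    return v & 0xffu;
}

/* Extract the feedback flags from the second byte. */
static unsigned int sketch_fault_ret(unsigned int v)
{
    return (v >> 8) & 0xffu;
}
```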

mm: merge populate and nopage into fault (fixes nonlinear) · 54cb8821

Committed by Nick Piggin on Jul 19, 2007

Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
the virtual address -> file offset differently from linear mappings.

->populate is a layering violation because the filesystem/pagecache code
should not need to know anything about the virtual memory mapping.  The hitch here
is that the ->nopage handler didn't pass down enough information (ie.  pgoff).
 But it is more logical to pass pgoff rather than have the ->nopage function
calculate it itself anyway (because that's a similar layering violation).

Having the populate handler install the pte itself is likewise a nasty thing
to be doing.

This patch introduces a new fault handler that replaces ->nopage and
->populate and (later) ->nopfn.  Most of the old mechanism is still in place
so there is a lot of duplication and nice cleanups that can be removed if
everyone switches over.

The rationale for doing this in the first place is that nonlinear mappings are
subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
to duplicate the synchronisation logic rather than just consolidate the two.

After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
pagecache.  Seems like a fringe functionality anyway.

NOPAGE_REFAULT is removed.  This should be implemented with ->fault, and no
users have hit mainline yet.

[akpm@linux-foundation.org: cleanup]
[randy.dunlap@oracle.com: doc. fixes for readahead]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

54cb8821
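
For a linear mapping, the pgoff that 54cb8821 above wants the VM to pass down is a simple function of the faulting address. A minimal sketch, assuming 4 KiB pages and a cut-down stand-in for the vma structure (the struct and function names are hypothetical):

```c
#define SKETCH_PAGE_SHIFT 12  /* assumes 4 KiB pages */

/* Hypothetical, reduced stand-in for a vma. */
struct sketch_vma {
    unsigned long vm_start;  /* first virtual address of the mapping */
    unsigned long vm_pgoff;  /* file offset of vm_start, in pages */
};

/* Page offset into the file for a faulting address in a linear mapping. */
static unsigned long sketch_linear_pgoff(const struct sketch_vma *vma,
                                         unsigned long address)
{
    return ((address - vma->vm_start) >> SKETCH_PAGE_SHIFT) + vma->vm_pgoff;
}
```

A nonlinear mapping would obtain pgoff from a different encoding, which is why computing it in the VM layer keeps the filesystem code unaware of the mapping type.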

mm: fix fault vs invalidate race for linear mappings · d00806b1

Committed by Nick Piggin on Jul 19, 2007

Fix the race between invalidate_inode_pages and do_no_page.

Andrea Arcangeli identified a subtle race between invalidation of pages from
pagecache with userspace mappings, and do_no_page.

The issue is that invalidation has to shoot down all mappings to the page,
before it can be discarded from the pagecache. Between shooting down ptes to
a particular page, and actually dropping the struct page from the pagecache,
do_no_page from any process might fault on that page and establish a new
mapping to the page just before it gets discarded from the pagecache.

The most common case where such invalidation is used is in file truncation.
This case was catered for by doing a sort of open-coded seqlock between the
file's i_size, and its truncate_count.

Truncation will decrease i_size, then increment truncate_count before
unmapping userspace pages; do_no_page will read truncate_count, then find the
page if it is within i_size, and then check truncate_count under the page
table lock and back out and retry if it had subsequently been changed (ptl
will serialise against unmapping, and ensure a potentially updated
truncate_count is actually visible).

Complexity and documentation issues aside, the locking protocol fails in the
case where we would like to invalidate pagecache inside i_size. do_no_page
can come in anytime and filemap_nopage is not aware of the invalidation in
progress (as it is when it is outside i_size). The end result is that
dangling (->mapping == NULL) pages that appear to be from a particular file
may be mapped into userspace with nonsense data. Valid mappings to the same
place will see a different page.

Andrea implemented two working fixes, one using a real seqlock, another using
a page->flags bit. He also proposed using the page lock in do_no_page, but
that was initially considered too heavyweight. However, it is not a global or
per-file lock, and the page cacheline is modified in do_no_page to increment
_count and _mapcount anyway, so a further modification should not be a large
performance hit. Scalability is not an issue.

This patch implements this latter approach. ->nopage implementations return
with the page locked if it is possible for their underlying file to be
invalidated (in that case, they must set a special vm_flags bit to indicate
so). do_no_page only unlocks the page after setting up the mapping
completely. invalidation is excluded because it holds the page lock during
invalidation of each page (and ensures that the page is not mapped while
holding the lock).

This also allows significant simplifications in do_no_page, because we have
the page locked in the right place in the pagecache from the start.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d00806b1
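
The old truncate_count protocol described in d00806b1 above can be sketched as a single-threaded simulation. This is not kernel code: there is no locking here (the real recheck happens under the page table lock), all names are hypothetical, and the `racer` callback merely stands in for a truncation that lands between the page lookup and the recheck.

```c
#include <stddef.h>

struct sketch_file {
    unsigned long i_size;          /* file size, in pages */
    unsigned long truncate_count;  /* bumped by every truncation */
};

/* Truncation: shrink i_size first, then bump the counter. */
static void sketch_truncate(struct sketch_file *f, unsigned long new_size)
{
    f->i_size = new_size;
    f->truncate_count++;
}

/*
 * Fault path: sample the counter, look the page up, then recheck the
 * counter before installing the mapping.
 * Returns 1 if the page may be mapped, 0 if it lies beyond i_size.
 */
static int sketch_fault(struct sketch_file *f, unsigned long pgoff,
                        void (*racer)(struct sketch_file *))
{
    for (;;) {
        unsigned long seq = f->truncate_count;
        int found = pgoff < f->i_size;

        if (racer) {
            racer(f);
            racer = NULL;  /* the simulated race happens only once */
        }
        if (seq != f->truncate_count)
            continue;      /* truncated under us: back out and retry */
        return found;
    }
}

/* Example racer: truncate the file down to a single page. */
static void sketch_truncate_to_one(struct sketch_file *f)
{
    sketch_truncate(f, 1);
}
```

The sketch also shows the limitation the commit message points out: an invalidation that leaves i_size alone never makes the lookup fail, so the fault path has no reason to reject the page. That is the gap the page lock approach closes.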

19 Jul, 2007 19 commits

[CIFS] merge conflict in fs/cifs/export.c · 70b315b0

Committed by Steve French on Jul 19, 2007

Signed-off-by: Steve French <sfrench@us.ibm.com>

70b315b0

[CIFS] Allow disabling CIFS Unix Extensions as mount option · c18c842b

Committed by Steve French on Jul 18, 2007

Previously the only way to do this was to umount all mounts to that server and
turn off a proc setting (/proc/fs/cifs/LinuxExtensionsEnabled).

Fixes Samba bugzilla bug number: 4582 (and also 2008)
Signed-off-by: Steve French <sfrench@us.ibm.com>

c18c842b

locks: fix vfs_test_lock() comment · 6924c554

Committed by J. Bruce Fields on May 11, 2007

Thanks to Doug Chapman for pointing out that the comment here is
inconsistent with the function prototype.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>

6924c554

locks: make posix_test_lock() interface more consistent · 6d34ac19

Committed by J. Bruce Fields on May 11, 2007

Since posix_test_lock(), like fcntl() and ->lock(), indicates absence or
presence of a conflicting lock by setting fl_type to, respectively, F_UNLCK
or something other than F_UNLCK, the return value is no longer needed.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>

6d34ac19
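
The convention that 6d34ac19 above aligns posix_test_lock() with is visible from userspace through fcntl(F_GETLK): the answer comes back in the lock structure's type field, not in the return value. A minimal userspace sketch (the helper name is hypothetical; the fcntl call and struct flock fields are standard POSIX):

```c
#include <fcntl.h>
#include <stdio.h>   /* SEEK_SET */
#include <unistd.h>

/*
 * Ask whether a whole-file write lock could be placed on fd.
 * Returns 1 if no conflicting lock exists, 0 if one does, -1 on error.
 */
static int sketch_write_lock_is_free(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;  /* 0 means "through end of file" */

    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;

    /* On success the kernel rewrites l_type: F_UNLCK means "no conflict". */
    return fl.l_type == F_UNLCK;
}
```

Note that F_GETLK never reports locks held by the calling process itself, so a process probing its own file sees F_UNLCK.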

nfs: disable leases over NFS · 370f6599

Committed by J. Bruce Fields on Jun 08, 2007

As Peter Staubach says elsewhere
(http://marc.info/?l=linux-kernel&m=118113649526444&w=2):

> The problem is that some file system such as NFSv2 and NFSv3 do
> not have sufficient support to be able to support leases correctly.
> In particular for these two file systems, there is no over the wire
> protocol support.
>
> Currently, these two file systems fail the fcntl(F_SETLEASE) call
> accidentally, due to a reference counting difference.  These file
> systems should fail more consciously, with a proper error to
> indicate that the call is invalid for them.

Define an nfs setlease method that just returns -EINVAL.

If someone can demonstrate a real need, perhaps we could reenable
them in the presence of the "nolock" mount option.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Cc: Peter Staubach <staubach@redhat.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>

370f6599

gfs2: stop giving out non-cluster-coherent leases · 60446067

Committed by Marc Eshel on Jan 15, 2007

Since gfs2 can't prevent conflicting opens or leases on other nodes, we
probably shouldn't allow it to give out leases at all.

Put the newly defined lease operation into use in gfs2 by turning off
lease, unless we're using the "nolock" locking module (in which case all
locking is local anyway).
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>

60446067

locks: export setlease to filesystems · 4698afe8

Committed by J. Bruce Fields on Jul 04, 2007

Export setlease so it can be used by filesystems to implement their lease
methods.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>

4698afe8

locks: provide a file lease method enabling cluster-coherent leases · f9ffed26

Committed by J. Bruce Fields on Nov 14, 2006

Currently leases are only kept locally, so there's no way for a distributed
filesystem to enforce them against multiple clients.  We're particularly
interested in the case of nfsd exporting a cluster filesystem, in which
case nfsd needs cluster-coherent leases in order to implement delegations
correctly.

Also add some documentation.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

f9ffed26

locks: rename lease functions to reflect locks.c conventions · a9933cea

Committed by J. Bruce Fields on Jun 07, 2007

We've been using the convention that vfs_foo is the function that calls
a filesystem-specific foo method if it exists, or falls back on a
generic method if it doesn't; thus vfs_foo is what is called when some
other part of the kernel (normally lockd or nfsd) wants to get a lock,
whereas foo is what filesystems call to use the underlying local
functionality as part of their lock implementation.

So rename setlease to vfs_setlease (which will call a
filesystem-specific setlease after a later patch) and __setlease to
setlease.

Also, vfs_setlease need only be GPL-exported as long as it's only needed
by lockd and nfsd.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>

a9933cea
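
The vfs_foo / foo naming convention described in a9933cea above can be sketched with a reduced method table. All types and names below are hypothetical stand-ins, not the kernel's definitions: the dispatcher calls the filesystem's own method when one is provided and otherwise falls back to the generic local implementation. The refusing method illustrates a filesystem that cannot support leases, returning -EINVAL as the NFS commit in this list does.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_file;

struct sketch_fops {
    /* Optional filesystem-specific method; NULL means "use the generic one". */
    int (*setlease)(struct sketch_file *f, long arg);
};

struct sketch_file {
    const struct sketch_fops *f_op;
    long lease;  /* locally recorded lease type */
};

/* Generic, purely local implementation (the role "setlease" plays). */
static int sketch_setlease(struct sketch_file *f, long arg)
{
    f->lease = arg;
    return 0;
}

/* Entry point for other subsystems (the role "vfs_setlease" plays). */
static int sketch_vfs_setlease(struct sketch_file *f, long arg)
{
    if (f->f_op && f->f_op->setlease)
        return f->f_op->setlease(f, arg);
    return sketch_setlease(f, arg);
}

/* A filesystem that cannot support leases refuses explicitly. */
static int sketch_refuse_setlease(struct sketch_file *f, long arg)
{
    (void)f;
    (void)arg;
    return -EINVAL;
}
```

A filesystem that wants local behaviour plus its own bookkeeping could call the generic function from inside its method, which is the reason the generic function is exported.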

locks: share more common lease code · 6d5e8b05

Committed by J. Bruce Fields on May 31, 2007

Share more code between setlease (used by nfsd) and fcntl.

Also some minor cleanup.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Acked-by: Christoph Hellwig <hch@infradead.org>

6d5e8b05

locks: clean up lease_alloc() · e32b8ee2

Committed by J. Bruce Fields on Mar 01, 2007

Return the newly allocated structure as the return value instead of
using a struct ** parameter.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

e32b8ee2

locks: convert an -EINVAL return to a BUG · d2ab0b0c

Committed by J. Bruce Fields on Jun 30, 2007

There's no point trying to return an error in these cases, which all represent
bugs in the callers.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

d2ab0b0c

leases: minor break_lease() comment clarification · 87250dd2

Committed by david m. richter on May 09, 2007

Clarify that break_lease() checks for presence of any lock, not just leases.
Signed-off-by: David M. Richter <richterd@citi.umich.edu>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>

87250dd2

sysfs: cosmetic clean up on node creation failure paths · 967e35dc

Committed by Tejun Heo on Jul 18, 2007

Node addition failure is detected by testing return value of
sysfs_addrm_finish() which returns the number of added and removed
nodes.  As the function is called as the last step of addition right
on top of error handling block, the if blocks looked like the
following.

	if (sysfs_addrm_finish(&acxt))
		success handling, usually return;
	/* fall through to error handling */

This is the opposite of usual convention in sysfs and makes the code
difficult to understand.  This patch inverts the test and makes those
blocks look more like others.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Gabriel C <nix.or.die@googlemail.com>
Cc: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

967e35dc

sysfs: kill an extra put in sysfs_create_link() failure path · a1da4dfe

Committed by Tejun Heo on Jul 18, 2007

There is a subtle bug in sysfs_create_link() failure path.  When
symlink creation fails because there's already a node with the same
name, the target sysfs_dirent is put twice - once by failure path of
sysfs_create_link() and once more when the symlink is released.

Fix it by making only the symlink node responsible for putting
target_sd.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Gabriel C <nix.or.die@googlemail.com>
Cc: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

a1da4dfe

sysfs: make sysfs_init_inode() static · bc37e283

Committed by Tejun Heo on Jul 18, 2007

With sysfs_fill_super() converted to use sysfs_get_inode(), there is
no user of sysfs_init_inode() outside of fs/sysfs/inode.c.  Make it
static.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bc37e283

sysfs: fix sysfs root inode nlink accounting · e080e436

Committed by Tejun Heo on Jul 18, 2007

While making sysfs inodes hashed, sysfs root inode was left out.  Now
that nlink accounting depends on the inode being on the hash, sysfs
root inode nlink isn't adjusted properly.

Put sysfs root inode on the inode hash by allocating it using
sysfs_get_inode() like other sysfs inodes.  While at it, massage
comments a bit.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e080e436

sysfs: avoid kmem_cache_free(NULL) · 01da2425

Committed by Akinobu Mita on Jul 14, 2007

kmem_cache_free() with NULL is not allowed. But it may happen
if out of memory error is triggered in sysfs_new_dirent().
This patch fixes that error handling.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

01da2425

debugfs: remove rmdir() non-empty complaint · a6bb340d

Committed by Jens Axboe on Jul 11, 2007

Hi,

This patch kills the pointless debugfs rmdir() printk() when called on a
non-empty directory. blktrace will sometimes have to call it a few times
when forcefully ending a trace, which pollutes the log with pointless
warnings.

Rationale:

- It's more code to work-around this "problem" in the debugfs users, and
  you would have to add code to check for empty directories to do so (or
  assume that debugfs is using simple_ helpers, but that would be a
  layering violation).

- Other rmdir() implementations don't complain about something this
  silly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

a6bb340d

18 Jul, 2007 17 commits

usermodehelper: Tidy up waiting · 86313c48

Committed by Jeremy Fitzhardinge on Jul 17, 2007

Rather than using a tri-state integer for the wait flag in
call_usermodehelper_exec, define a proper enum, and use that.  I've
preserved the integer values so that any callers I've missed should
still work OK.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: David Howells <dhowells@redhat.com>

86313c48
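
The change in 86313c48 above, replacing a tri-state integer with an enum while preserving the integer values, can be sketched as below. The enumerator names and the specific values -1, 0 and 1 are assumptions made for this example and should be checked against the actual patch; the helper is hypothetical.

```c
/* Assumed names and values; the commit only states that the old integers are preserved. */
enum sketch_umh_wait {
    SKETCH_UMH_NO_WAIT   = -1,  /* do not wait at all */
    SKETCH_UMH_WAIT_EXEC = 0,   /* wait for the exec, but not for the process */
    SKETCH_UMH_WAIT_PROC = 1,   /* wait for the process to complete */
};

/* Hypothetical helper: does this mode wait for the helper process to exit? */
static int sketch_waits_for_exit(enum sketch_umh_wait wait)
{
    return wait == SKETCH_UMH_WAIT_PROC;
}
```

Because the enumerators keep the old numeric values, an unconverted caller that still passes a bare `1` or `0` gets the same behaviour as before.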

ext4: extent macros cleanup · e9f410b1

Committed by Dmitry Monakhov on Jul 18, 2007

Use the EXT_LAST_INDEX macro; that's what it's there for.

Clean up ext4_ext_grow_indepth() so the correct EXT_FIRST_INDEX or
EXT_FIRST_MACRO is used as necessary.  The two macros are equivalent, so
the C will collapse the if statement out, but it makes the code much
more readable.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Acked-by: Alex Tomas <alex@clusterfs.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e9f410b1

Fix compilation with EXT_DEBUG, also fix leXX_to_cpu conversions. · 26d535ed

Committed by Dmitry Monakhov on Jul 18, 2007

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Acked-by: Alex Tomas <alex@clusterfs.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

26d535ed

ext4: remove extra IS_RDONLY() check · d699594d

Committed by Dave Hansen on Jul 18, 2007

ext4_change_inode_journal_flag() is only called from one location:
ext4_ioctl(EXT3_IOC_SETFLAGS).  That ioctl case already has an IS_RDONLY()
call in it so this one is superfluous.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d699594d

ext4: Use is_power_of_2() · 1330593e

Committed by Vignesh Babu on Jul 18, 2007

Replace (n & (n-1)) in the context of power of 2 checks with
is_power_of_2()
Signed-off-by: Vignesh Babu <vignesh.babu@wipro.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1330593e
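
A userspace sketch of the power-of-two test that 1330593e above switches to. The function name is a local stand-in for the kernel helper. The bare `(n & (n-1))` idiom reports zero as a power of two, so a helper of this kind also has to rule out zero.

```c
#include <stdbool.h>

/* True only when exactly one bit of n is set. */
static bool sketch_is_power_of_2(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```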

Use zero_user_page() in ext4 where possible · fc0e15a6

Committed by Eric Sandeen on Jul 18, 2007

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fc0e15a6

ext4: Remove 65000 subdirectory limit · f8628a14

Committed by Andreas Dilger on Jul 18, 2007

This patch adds support to ext4 for allowing more than 65000
subdirectories. Currently the maximum number of subdirectories is capped
at 32000.

If we exceed 65000 subdirectories in an htree directory it sets the
inode link count to 1 and no longer counts subdirectories.  The
directory link count is not actually used when determining if a
directory is empty, as that only counts subdirectories and not regular
files that might be in there. 

An EXT4_FEATURE_RO_COMPAT_DIR_NLINK flag has been added and it is set if
the subdir count for any directory crosses 65000. A later fsck will clear
EXT4_FEATURE_RO_COMPAT_DIR_NLINK if there are no longer any directories
with >65000 subdirs.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f8628a14
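
The link-count behaviour described in f8628a14 above can be sketched as follows. This is a simplified model with hypothetical names: the exact boundary at which the count collapses and the way the feature flag is stored are assumptions, and the htree condition is left out.

```c
#define SKETCH_LINK_MAX 65000u

struct sketch_dir {
    unsigned int nlink;     /* link count; 1 means "subdirectories no longer counted" */
    int dir_nlink_feature;  /* stands in for the RO_COMPAT_DIR_NLINK flag */
};

/* Account for one new subdirectory. */
static void sketch_add_subdir(struct sketch_dir *d)
{
    if (d->nlink == 1)
        return;                    /* counting was already abandoned */
    if (++d->nlink > SKETCH_LINK_MAX) {
        d->nlink = 1;              /* stop counting from now on */
        d->dir_nlink_feature = 1;  /* record that such a directory exists */
    }
}
```

A link count of 1 is never produced by ordinary counting (a directory starts at 2), which is what makes it usable as the "not counted" marker.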

ext4: Expand extra_inodes space per the s_{want,min}_extra_isize fields · 6dd4ee7c

Committed by Kalpak Shah on Jul 18, 2007

We need to make sure that existing ext3 filesystems can also avail the
new fields that have been added to the ext4 inode. We use
s_want_extra_isize and s_min_extra_isize to decide by how much we should
expand the inode. If EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature is set
then we expand the inode by max(s_want_extra_isize, s_min_extra_isize ,
sizeof(ext4_inode) - EXT4_GOOD_OLD_INODE_SIZE) bytes. Actually it is
still an open question about whether users should be able to set
s_*_extra_isize smaller than the known fields or not.

This patch also adds the functionality to expand inodes to include the
newly added fields. We start by trying to expand by s_want_extra_isize
bytes and if it fails we try to expand by s_min_extra_isize bytes. This
is done by changing the i_extra_isize if enough space is available in
the inode and no EAs are present. If EAs are present and there is enough
space in the inode then the EAs in the inode are shifted to make space.
If enough space is not available in the inode due to the EAs then 1 or
more EAs are shifted to the external EA block. In the worst case when
even the external EA block does not have enough space we inform the user
that some EA would need to be deleted or s_min_extra_isize would have to
be reduced.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6dd4ee7c
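
The expansion target described in 6dd4ee7c above is the maximum of three quantities. A minimal sketch, with a hypothetical function name and with the size of the kernel's known inode layout passed in as a parameter rather than taken from a real structure:

```c
#include <assert.h>

#define SKETCH_GOOD_OLD_INODE_SIZE 128u  /* size of the original on-disk inode */

/*
 * Bytes of extra inode space to aim for:
 * max(s_want_extra_isize, s_min_extra_isize, known fields beyond the old inode).
 */
static unsigned int sketch_extra_isize(unsigned int want, unsigned int min,
                                       unsigned int known_inode_size)
{
    unsigned int extra;

    assert(known_inode_size >= SKETCH_GOOD_OLD_INODE_SIZE);
    extra = known_inode_size - SKETCH_GOOD_OLD_INODE_SIZE;
    if (want > extra)
        extra = want;
    if (min > extra)
        extra = min;
    return extra;
}
```

For example, with a 156-byte known layout and both superblock fields at zero, the result is 28, so the known fields always fit.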

ext4: Add nanosecond timestamps · ef7f3835

Committed by Kalpak Shah on Jul 18, 2007

This patch adds nanosecond timestamps for ext4. This involves adding
*time_extra fields to the ext4_inode to extend the timestamps to
64-bits.  Creation time is also added by this patch.

These extended fields will fit into an inode if the filesystem was
formatted with large inodes (-I 256 or larger) and there are currently
no EAs consuming all of the available space. For new inodes we always
reserve enough space for the kernel's known extended fields, but for
inodes created with an old kernel this might not have been the case. So
this patch also adds the EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature
flag (ro-compat so that older kernels can't create inodes with a smaller
extra_isize), which indicates whether the fields fitting inside
s_min_extra_isize are available or not.  If the expansion of inodes is
unsuccessful then this feature will be disabled.  This feature is only
enabled if requested by the sysadmin.

None of the extended inode fields is critical for correct filesystem
operation.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ef7f3835
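
One way a 32-bit "extra" field can carry both nanoseconds and additional seconds bits, as ef7f3835 above describes for the *time_extra fields. The split used here (low 2 bits extend the seconds, upper 30 bits hold nanoseconds) is stated as an assumption about the layout, the function names are hypothetical, and only non-negative seconds are handled.

```c
#include <stdint.h>

#define SKETCH_EPOCH_BITS 2
#define SKETCH_EPOCH_MASK ((1u << SKETCH_EPOCH_BITS) - 1)

/* Build the extra field from a 64-bit seconds value and nanoseconds (< 1e9). */
static uint32_t sketch_encode_extra(uint64_t sec, uint32_t nsec)
{
    return (uint32_t)((sec >> 32) & SKETCH_EPOCH_MASK) | (nsec << SKETCH_EPOCH_BITS);
}

/* Recombine the classic 32-bit seconds field with the extra field. */
static void sketch_decode(uint32_t base, uint32_t extra,
                          uint64_t *sec, uint32_t *nsec)
{
    *sec = (uint64_t)base | ((uint64_t)(extra & SKETCH_EPOCH_MASK) << 32);
    *nsec = extra >> SKETCH_EPOCH_BITS;
}
```

Nanosecond values stay below 2^30, so shifting them left by two bits cannot overflow the 32-bit field.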

jbd2: Move jbd2-debug file to debugfs · 0f49d5d0

Committed by Jose R. Santos on Jul 18, 2007

The jbd2-debug file used to be located in /proc/sys/fs/jbd2-debug, but it
incorrectly used create_proc_entry() instead of the sysctl routines, and
no proc entry was ever created.

Instead of fixing this we might as well move the jbd2-debug file to
debugfs which would be the preferred location for this kind of tunable.
The new location is now /sys/kernel/debug/jbd2/jbd2-debug.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0f49d5d0

jbd2: Fix CONFIG_JBD_DEBUG ifdef to be CONFIG_JBD2_DEBUG · e23291b9

Committed by Jose R. Santos on Jul 18, 2007

When the JBD code was forked to create the new JBD2 code base, the
references to CONFIG_JBD_DEBUG were never changed to
CONFIG_JBD2_DEBUG.  This patch fixes that.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e23291b9

ext4: Set the journal JBD2_FEATURE_INCOMPAT_64BIT on large devices · eb40a09c

Committed by Jose R. Santos on Jul 18, 2007

Set the journal's JBD2_FEATURE_INCOMPAT_64BIT on devices with more
than 32bit block sizes during mount time.  This ensures proper record
length when writing to the journal.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

eb40a09c

ext4: Make extents code sanely handle on-disk corruption · c29c0ae7

Committed by Alex Tomas on Jul 18, 2007

Add more run-time checking of extent header fields and remove BUG_ON
checks so we don't panic the kernel just because the on-disk filesystem
is corrupted.
Signed-off-by: Alex Tomas <alex@clusterfs.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c29c0ae7
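
The approach in c29c0ae7 above, validating on-disk fields at run time and returning an error instead of hitting BUG_ON, can be sketched like this. The struct is a reduced, host-endian stand-in for an extent header, the set of checks is illustrative and not the patch's actual list, and the function name is hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_EXT_MAGIC 0xf30a

/* Reduced stand-in for an on-disk extent header (endianness ignored). */
struct sketch_extent_header {
    uint16_t eh_magic;    /* format signature */
    uint16_t eh_entries;  /* number of valid entries */
    uint16_t eh_max;      /* capacity in entries */
    uint16_t eh_depth;    /* tree depth below this node */
};

/* Returns 0 if the header looks sane, -EIO if the data is corrupted. */
static int sketch_check_header(const struct sketch_extent_header *eh,
                               unsigned int expected_depth)
{
    if (eh->eh_magic != SKETCH_EXT_MAGIC)
        return -EIO;
    if (eh->eh_depth != expected_depth)
        return -EIO;
    if (eh->eh_max == 0)
        return -EIO;
    if (eh->eh_entries > eh->eh_max)
        return -EIO;
    return 0;
}
```

The caller can then fail the single operation and report the corruption, while the rest of the system keeps running.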

ext4: copy i_flags to inode flags on write · ff9ddf7e

Committed by Jan Kara on Jul 18, 2007

    
Propagate flags such as S_APPEND, S_IMMUTABLE, etc. from i_flags into
ext4-specific i_flags.  Quota code changes these flags on quota files
(to make it harder for sysadmin to screw himself) and these changes were
not correctly propagated into the filesystem.

(This is a forward port patch from ext3)
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ff9ddf7e

ext4: Enable extents by default · 1e2462f9

Committed by Mingming Cao on Jul 18, 2007

Turn on extents feature by default in ext4 filesystem, to get wider
testing of extents feature in ext4dev.  This can be disabled using 
-o noextents.  
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1e2462f9

Change on-disk format to support 2^15 uninitialized extents · 749269fa

Committed by Amit Arora on Jul 18, 2007

This change was suggested by Andreas Dilger.
This patch changes the EXT_MAX_LEN value and extent code which marks/checks
uninitialized extents. With this change it will be possible to have
initialized extents with 2^15 blocks (earlier the max blocks we could have
was 2^15 - 1). This way we can have better extent-to-block alignment.
Now, maximum number of blocks we can have in an initialized extent is 2^15
and in an uninitialized extent is 2^15 - 1.
Signed-off-by: Amit Arora <aarora@in.ibm.com>

749269fa
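
One length encoding consistent with the limits stated in 749269fa above: a 16-bit length field in which values up to 2^15 denote initialized extents and larger values denote uninitialized extents of (value - 2^15) blocks. The helper names are hypothetical and the scheme is presented as an assumption about how the marking works, not as the patch's code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_INIT_MAX_LEN   (1u << 15)                   /* 32768 blocks */
#define SKETCH_UNINIT_MAX_LEN (SKETCH_INIT_MAX_LEN - 1u)   /* 32767 blocks */

/* An extent is uninitialized when its raw length exceeds 2^15. */
static int sketch_is_uninit(uint16_t ee_len)
{
    return ee_len > SKETCH_INIT_MAX_LEN;
}

/* Number of blocks covered, regardless of the initialized state. */
static unsigned int sketch_actual_len(uint16_t ee_len)
{
    return ee_len <= SKETCH_INIT_MAX_LEN ? ee_len
                                         : ee_len - SKETCH_INIT_MAX_LEN;
}

/* Mark a block count as an uninitialized extent length. */
static uint16_t sketch_mark_uninit(uint16_t len)
{
    assert(len >= 1 && len <= SKETCH_UNINIT_MAX_LEN);
    return (uint16_t)(len | SKETCH_INIT_MAX_LEN);
}
```

Under this encoding the raw value 2^15 itself stays available for initialized extents, which is where the extra block compared with the earlier 2^15 - 1 limit comes from.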

write support for preallocated blocks · 56055d3a

Committed by Amit Arora on Jul 17, 2007

This patch adds write support to the uninitialized extents that get
created when a preallocation is done using fallocate(). It takes care of
splitting the extents into multiple (up to three) extents and merging the
new split extents with neighbouring ones, if possible.
Signed-off-by: Amit Arora <aarora@in.ibm.com>

56055d3a
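
The "up to three extents" split mentioned in 56055d3a above can be sketched as plain range arithmetic. The struct and function are hypothetical and model only the split; merging with neighbouring extents and all on-disk details are omitted.

```c
#include <assert.h>

struct sketch_extent {
    unsigned long start;  /* first logical block */
    unsigned long len;    /* number of blocks */
    int uninit;           /* 1 = preallocated but not yet written */
};

/*
 * Split an uninitialized extent around a written range
 * [wstart, wstart + wlen). Writes 1 to 3 pieces into out[] and
 * returns how many were produced.
 */
static int sketch_split(struct sketch_extent ex, unsigned long wstart,
                        unsigned long wlen, struct sketch_extent out[3])
{
    unsigned long ex_end = ex.start + ex.len;
    unsigned long wend = wstart + wlen;
    int n = 0;

    assert(wlen > 0 && wstart >= ex.start && wend <= ex_end);

    if (wstart > ex.start)  /* untouched head stays uninitialized */
        out[n++] = (struct sketch_extent){ ex.start, wstart - ex.start, 1 };
    out[n++] = (struct sketch_extent){ wstart, wlen, 0 };  /* written part */
    if (wend < ex_end)      /* untouched tail stays uninitialized */
        out[n++] = (struct sketch_extent){ wend, ex_end - wend, 1 };
    return n;
}
```

A write in the middle yields three pieces, a write touching one edge yields two, and a write covering the whole extent simply converts it.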

openanolis / cloud-kernel synced successfully about 1 year ago