提交 · a2bbcb60ff9a8e8a4159e11bc3ed84f7221fe79f · openeuler / Kernel

07 3月, 2016 10 次提交

D

Merge branch 'xfs-gut-icdinode-4.6' into for-next · a2bbcb60
由 Dave Chinner 提交于 3月 07, 2016

a2bbcb60
D

Merge branch 'xfs-misc-fixes-4.6' into for-next · 6d247d47
由 Dave Chinner 提交于 3月 07, 2016

6d247d47
D

Merge branch 'xfs-dio-fix-4.6' into for-next · acb3e26f
由 Dave Chinner 提交于 3月 07, 2016

acb3e26f
D

Merge branch 'xfs-get-next-dquot-4.6' into for-next · 1b186d25
由 Dave Chinner 提交于 3月 07, 2016

1b186d25
D

Merge branch 'xfs-rt-fixes-4.6' into for-next · c53473be
由 Dave Chinner 提交于 3月 07, 2016

c53473be
D

Merge branch 'xfs-torn-log-fixes-4.5' into for-next · 9deed095
由 Dave Chinner 提交于 3月 07, 2016

9deed095

xfs: only run torn log write detection on dirty logs · 7f6aff3a

由 Brian Foster 提交于 3月 07, 2016

XFS uses CRC verification over a sub-range of the head of the log to
detect and handle torn writes. This torn log write detection currently
runs unconditionally at mount time, regardless of whether the log is
dirty or clean. This is problematic in cases where a filesystem might
end up being moved across different, incompatible (i.e., opposite
byte-endianness) architectures.

The problem lies in the fact that log data is not necessarily written in
an architecture independent format. For example, certain bits of data
are written in native endian format. Further, the size of certain log
data structures differs (i.e., struct xlog_rec_header) depending on the
word size of the cpu. This leads to false positive crc verification
errors and ultimately failed mounts when a cleanly unmounted filesystem
is mounted on a system with an incompatible architecture from data that
was written near the head of the log.

Update the log head/tail discovery code to run torn write detection only
when the log is not clean. This means something other than an unmount
record resides at the head of the log and log recovery is imminent. It
is a requirement to run log recovery on the same type of host that had
written the content of the dirty log and therefore CRC failures are
legitimate corruptions in that scenario.
Reported-by: NJan Beulich <JBeulich@suse.com>
Tested-by: NJan Beulich <JBeulich@suse.com>
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

7f6aff3a

xfs: refactor in-core log state update to helper · 717bc0eb

由 Brian Foster 提交于 3月 07, 2016

Once the record at the head of the log is identified and verified, the
in-core log state is updated based on the record. This includes
information such as the current head block and cycle, the start block of
the last record written to the log, the tail lsn, etc.

Once torn write detection is conditional, this logic will need to be
reused. Factor the code to update the in-core log data structures into a
new helper function. This patch does not change behavior.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

717bc0eb

xfs: refactor unmount record detection into helper · 65b99a08

由 Brian Foster 提交于 3月 07, 2016

Once the mount sequence has identified the head and tail blocks of the
physical log, the record at the head of the log is located and examined
for an unmount record to determine if the log is clean. This currently
occurs after torn write verification of the head region of the log.

This must ultimately be separated from torn write verification and may
need to be called again if the log head is walked back due to a torn
write (to determine whether the new head record is an unmount record).
Separate this logic into a new helper function. This patch does not
change behavior.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

65b99a08

xfs: separate log head record discovery from verification · 82ff6cc2

由 Brian Foster 提交于 3月 07, 2016

The code that locates the log record at the head of the log is buried in
the log head verification function. This is fine when torn write
verification occurs unconditionally, but this behavior is problematic
for filesystems that might be moved across systems with different
architectures.

In preparation for separating examination of the log head for unmount
records from torn write detection, lift the record location logic out of
the log verification function and into the caller. This patch does not
change behavior.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

82ff6cc2

29 2月, 2016 1 次提交

ext4: Fix data exposure after failed AIO DIO · 74c66bcb

由 Jan Kara 提交于 2月 29, 2016

When AIO DIO fails e.g. due to IO error, we must not convert unwritten
extents as that will expose uninitialized data. Handle this case
by clearing unwritten flag from io_end in case of error and thus
preventing extent conversion.
Signed-off-by: NJan Kara <jack@suse.cz>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

74c66bcb

09 2月, 2016 11 次提交

xfs: mode di_mode to vfs inode · c19b3b05

由 Dave Chinner 提交于 2月 09, 2016

Move the di_mode value from the xfs_icdinode to the VFS inode, reducing
the xfs_icdinode byte another 2 bytes and collapsing another 2 byte hole
in the structure.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

c19b3b05

xfs: move di_changecount to VFS inode · 83e06f21

由 Dave Chinner 提交于 2月 09, 2016

We can store the di_changecount in the i_version field of the VFS
inode and remove another 8 bytes from the xfs_icdinode.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

83e06f21

xfs: move inode generation count to VFS inode · 9e9a2674

由 Dave Chinner 提交于 2月 09, 2016

Pull another 4 bytes out of the xfs_icdinode.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

9e9a2674

xfs: use vfs inode nlink field everywhere · 54d7b5c1

由 Dave Chinner 提交于 2月 09, 2016

The VFS tracks the inode nlink just like the xfs_icdinode. We can
remove the variable from the icdinode and use the VFS inode variable
everywhere, reducing the size of the xfs_icdinode by a further 4
bytes.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

54d7b5c1

xfs: reinitialise recycled VFS inode correctly · 50997470

由 Dave Chinner 提交于 2月 09, 2016

We are going to keep certain on-disk information in the VFS inode
rather than in a separate XFS specific stucture, so we have to be
careful of the VFS code clearing that information when we
re-initialise reclaimable cached inodes during lookup. If we don't
do this, then we lose critical information from the inode and that
results in corruption being detected.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

50997470

xfs: move v1 inode conversion to xfs_inode_from_disk · faeb4e47

由 Dave Chinner 提交于 2月 09, 2016

So we don't have to carry an di_onlink variable around anymore, move
the inode conversion from v1 inode format to v2 inode format into
xfs_inode_from_disk(). This means we can remove the di_onlink fields
from the struct xfs_icdinode.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

faeb4e47

xfs: cull unnecessary icdinode fields · 93f958f9

由 Dave Chinner 提交于 2月 09, 2016

Now that the struct xfs_icdinode is not directly related to the
on-disk format, we can cull things in it we really don't need to
store:

	- magic number never changes
	- padding is not necessary
	- next_unlinked is never used
	- inode number is redundant
	- uuid is redundant
	- lsn is accessed directly from dinode
	- inode CRC is only accessed directly from dinode

Hence we can remove these from the struct xfs_icdinode and redirect
the code that uses them to the xfs_dinode appripriately.  This
reduces the size of the struct icdinode from 152 bytes to 88 bytes,
and removes a fair chunk of unnecessary code, too.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

93f958f9

xfs: remove timestamps from incore inode · 3987848c

由 Dave Chinner 提交于 2月 09, 2016

The struct xfs_inode has two copies of the current timestamps in it,
one in the vfs inode and one in the struct xfs_icdinode. Now that we
no longer log the struct xfs_icdinode directly, we don't need to
keep the timestamps in this structure. instead we can copy them
straight out of the VFS inode when formatting the inode log item or
the on-disk inode.

This reduces the struct xfs_inode in size by 24 bytes.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

3987848c

xfs: introduce inode log format object · f8d55aa0

由 Dave Chinner 提交于 2月 09, 2016

We currently carry around and log an entire inode core in the
struct xfs_inode. A lot of the information in the inode core is
duplicated in the VFS inode, but we cannot remove this duplication
of infomration because the inode core is logged directly in
xfs_inode_item_format().

Add a new function xfs_inode_item_format_core() that copies the
inode core data into a struct xfs_icdinode that is pulled directly
from the log vector buffer. This means we no longer directly
copy the inode core, but copy the structures one member at a time.
This will be slightly less efficient than copying, but will allow us
to remove duplicate and unnecessary items from the struct xfs_inode.

To enable us to do this, call the new structure a xfs_log_dinode,
so that we know it's different to the physical xfs_dinode and the
in-core xfs_icdinode.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>

f8d55aa0

xfs: RT bitmap and summary buffers need verifiers · bf85e099

由 Dave Chinner 提交于 2月 09, 2016

Buffers without verifiers issue runtime warnings on XFS. We don't
have anything we can actually verify in the RT buffers (no CRCs, not
magic numbers, etc), but we still need verifiers to avoid the
warnings.

Add a set of dummy verifier operations for the realtime buffers and
apply them in the appropriate places.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

bf85e099

xfs: RT bitmap and summary buffers are not typed · f67ca6ec

由 Dave Chinner 提交于 2月 09, 2016

When logging buffers, we attach a type to them that follows the
buffer all the way into the log and is used to identify the buffer
contents in log recovery. Both the realtime summary buffers and the
bitmap buffers do not have types defined or set, so when we try to
log them we see assert failure:

XFS: Assertion failed: (bip->bli_flags & XFS_BLI_STALE) || (xfs_blft_from_flags(&bip->__bli_format) > XFS_BLFT_UNKNOWN_BUF && xfs_blft_from_flags(&bip->__bli_format) < XFS_BLFT_MAX_BUF), file: fs/xfs/xfs_buf_item.c, line: 294

Fix this by adding buffer log format types for these buffers, and
add identification support into log recovery for them. Only build the log
recovery support if CONFIG_XFS_RT=y - we can't get into log recovery for real
time filesystems if support is not built into the kernel, and this avoids
potential build problems.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

f67ca6ec

08 2月, 2016 18 次提交

xfs: fix xfs_log_ticket leak in xfs_end_io() after fs shutdown · af055e37

由 Brian Foster 提交于 2月 08, 2016

If the filesystem has shut down, xfs_end_io() currently sets an
error on the ioend and proceeds to ioend destruction. The ioend
might contain a truncate transaction if the I/O extended the size of
the file. This transaction is only cleaned up in
xfs_setfilesize_ioend(), however, which is skipped in this case.
This results in an xfs_log_ticket leak message when the associate
cache slab is destroyed (e.g., on rmmod).

This was originally reproduced by xfs/141 on a distro kernel. The
problem is reproducible on an upstream kernel, but not easily
detected in current upstream if the xfs_log_ticket cache happens to
be merged with another cache. This can be reproduced more
deterministically with the 'slab_nomerge' kernel boot option.

Update xfs_end_io() to proceed with normal end I/O processing after
an error is set on an ioend due to fs shutdown. The I/O type-based
processing is already designed to handle an I/O error and ensure
that the ioend is cleaned up correctly.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

af055e37

xfs: clean up unwritten buffers on write failure · 60630fe6

由 Brian Foster 提交于 2月 08, 2016

The xfs_vm_write_failed() handler is currently responsible for cleaning
up any delalloc blocks over the range of a failed write beyond EOF.
Failure to do so results in warning messages and other inconsistencies
between buffer and extent state. The ->releasepage() handler currently
warns in the event of a page being released with either unwritten or
delalloc buffers, as neither is ever expected by the time a page is
released.

As has been reproduced by generic/083 on a -bsize=1k fs, it is currently
possible to trigger the ->releasepage() warning for a page with
unwritten buffers when a filesystem is near ENOSPC. This is reproduced
by the following sequence:

  $ mkfs.xfs -f -b size=1k -d size=100m <dev>
  $ mount <dev> /mnt/
  $
  $ xfs_io -fc "falloc -k 0 1k" /mnt/file
  $ dd if=/dev/zero of=/mnt/enospc conv=notrunc oflag=append
  $
  $ xfs_io -c "pwrite 512 1k" /mnt/file
  $ xfs_io -d -c "pwrite 16k 1k" /mnt/file

The first pwrite command attempts a block unaligned write across an
unwritten block and a hole. The delalloc for the hole fails with ENOSPC
and the subsequent error handling does not clean up the unwritten buffer
that was instantiated during the first ->get_block() call.

The second pwrite triggers a warning as part of the inode mapping
invalidation that occurs prior to direct I/O. The releasepage() handler
detects the unwritten buffer at this time, warns and prevents the
release of the page.

To deal with this problem, update xfs_vm_write_failed() to clean up
unwritten as well as delalloc buffers that are beyond EOF and within the
range of the failed write.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

60630fe6

xfs: move struct xfs_attr_shortform to xfs_da_format.h · 244efeaf

由 Darrick J. Wong 提交于 2月 08, 2016

Move the shortform attr structure definition to the same place as the
other attribute structure definitions for consistency and also so that
xfs/122 verifies the structure size.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

244efeaf

xfs: Make xfsaild freezeable again · 18f1df4e

由 Michal Hocko 提交于 2月 08, 2016

Hendik has reported suspend failures due to xfsaild blocking the freezer
to settle down.
Jan 17 19:59:56 linux-6380 kernel: PM: Syncing filesystems ... done.
Jan 17 19:59:56 linux-6380 kernel: PM: Preparing system for sleep (mem)
Jan 17 19:59:56 linux-6380 kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
Jan 17 19:59:56 linux-6380 kernel: Freezing remaining freezable tasks ...
Jan 17 19:59:56 linux-6380 kernel: Freezing of tasks failed after 20.002 seconds (1 tasks refusing to freeze, wq_busy=0):
Jan 17 19:59:56 linux-6380 kernel: xfsaild/dm-5 S 00000000 0 1293 2 0x00000080
Jan 17 19:59:56 linux-6380 kernel: f0ef5f00 00000046 00000200 00000000 ffff9022 c02d3800 00000000 00000032
Jan 17 19:59:56 linux-6380 kernel: ee0b2400 00000032 f71e0d00 f36fabc0 f0ef2d00 f0ef6000 f0ef2d00 f12f90c0
Jan 17 19:59:56 linux-6380 kernel: f0ef5f0c c0844e44 00000000 f0ef5f6c f811e0be 00000000 00000000 f0ef2d00
Jan 17 19:59:56 linux-6380 kernel: Call Trace:
Jan 17 19:59:56 linux-6380 kernel: [<c0844e44>] schedule+0x34/0x90
Jan 17 19:59:56 linux-6380 kernel: [<f811e0be>] xfsaild+0x5de/0x600 [xfs]
Jan 17 19:59:56 linux-6380 kernel: [<c0286cbb>] kthread+0x9b/0xb0
Jan 17 19:59:56 linux-6380 kernel: [<c0848a79>] ret_from_kernel_thread+0x21/0x38

The issue has been there for quite some time but it has been made
visible by only by 24ba16bb ("xfs: clear PF_NOFREEZE for xfsaild
kthread") because the suspend started seeing xfsaild.

The above commit has missed that the !xfs_ail_min branch might call
schedule with TASK_INTERRUPTIBLE without calling try_to_freeze so the pm
suspend would wake up the kernel thread over and over again without any
progress. What we want here is to use freezable_schedule instead to hide
the thread from the suspend.

While we are here also change schedule_timeout to freezable variant to
prevent from spurious wakeups by suspend.

[dchinner: re-add set_freezeable call so the freezer will account properly
for this kthread. ]
Reported-by: NHendrik Woltersdorf <hendrikw@arcor.de>
Signed-off-by: NMichal Hocko <mhocko@suse.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

18f1df4e

xfs: remove unused function definitions · de0b85a8

由 Eric Sandeen 提交于 2月 08, 2016

Old leftovers.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

de0b85a8

xfs: move buffer invalidation to xfs_btree_free_block · edfd9dd5

由 Christoph Hellwig 提交于 2月 08, 2016

... instead of leaving it in the methods.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

edfd9dd5

xfs: factor btree block freeing into a helper · c46ee8ad

由 Christoph Hellwig 提交于 2月 08, 2016

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

c46ee8ad

xfs: handle errors from ->free_blocks in xfs_btree_kill_iroot · 196328ec

由 Christoph Hellwig 提交于 2月 08, 2016

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

196328ec

xfs: fold xfs_vm_do_dio into xfs_vm_direct_IO · c19b104a

由 Christoph Hellwig 提交于 2月 08, 2016

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

c19b104a

xfs: don't use ioends for direct write completions · 273dda76

由 Christoph Hellwig 提交于 2月 08, 2016

We only need to communicate two bits of information to the direct I/O
completion handler:

 (1) do we need to convert any unwritten extents in the range
 (2) do we need to check if we need to update the inode size based
     on the range passed to the completion handler

We can use the private data passed to the get_block handler and the
completion handler as a simple bitmask to communicate this information
instead of the current complicated infrastructure reusing the ioends
from the buffer I/O path, and thus avoiding a memory allocation and
a context switch for any non-trivial direct write.  As a nice side
effect we also decouple the direct I/O path implementation from that
of the buffered I/O path.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

273dda76

direct-io: always call ->end_io if non-NULL · 187372a3

由 Christoph Hellwig 提交于 2月 08, 2016

This way we can pass back errors to the file system, and allow for
cleanup required for all direct I/O invocations.

Also allow the ->end_io handlers to return errors on their own, so that
I/O completion errors can be passed on to the callers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

187372a3

xfs: Split default quota limits by quota type · be607946

由 Carlos Maiolino 提交于 2月 08, 2016

Default quotas are globally set due historical reasons. IRIX only
supported user and project quotas, and default quota was only
applied to user quotas.

In Linux, when a default quota is set, all different quota types
inherits the same default value.

An user with a quota limit larger than the default quota value, will
still be limited to the default value because the group quotas also
inherits the default quotas. Unless the group which the user belongs
to have a custom quota limit set.

This patch aims to split the default quota value by quota type.
Allowing each quota type having different default values.

Default time limits are still set globally. XFS does not set a
per-user/group timer, but a single global timer. For changing this
behavior, some changes should be made in user-space tools another
bugs being fixed.
Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

be607946

xfs: wire up Q_XGETNEXTQUOTA / get_nextdqblk · 296c24e2

由 Eric Sandeen 提交于 2月 08, 2016

Add code to allow the Q_XGETNEXTQUOTA quotactl to quickly find
all active quotas by examining the quota inode, and skipping
over unallocated or uninitialized regions.

Userspace can then use this interface rather than i.e. a
getpwent() loop when asked to report all active quotas.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

296c24e2

xfs: Factor xfs_seek_hole_data into helper · 8aa7d37e

由 Eric Sandeen 提交于 2月 08, 2016

Factor xfs_seek_hole_data into an unlocked helper which takes
an xfs inode rather than a file for internal use.

Also allow specification of "end" - the vfs lseek interface is
defined such that any offset past eof/i_size shall return -ENXIO,
but we will use this for quota code which does not maintain i_size,
and we want to be able to SEEK_DATA past i_size as well.  So the
lseek path can send in i_size, and the quota code can determine
its own ending offset.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

8aa7d37e

xfs: get quota inode from mp & flags rather than dqp · 4d4d9523

由 Eric Sandeen 提交于 2月 08, 2016

Allow us to get the appropriate quota inode from any
mp & quota flags, not necessarily associated with a
particular dqp.  Needed for when we are searching for
the next active ID with quotas and we want to examine
the quota inode.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

4d4d9523

xfs: don't overflow quota ID when initializing dqblk · a484bcdd

由 Eric Sandeen 提交于 2月 08, 2016

Quota IDs are unsigned, and so we can pass in values up
to 2^32-1.  But if we try to initialize a block containing
values over MAX_INT, curid will overflow and assert.

curid holds a quota ID, so give it the proper
xfs_dqid_t type (and remove the now-impossible ASSERT).
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

a484bcdd

quota: add new quotactl Q_GETNEXTQUOTA · 926132c0

由 Eric Sandeen 提交于 2月 08, 2016

Q_GETNEXTQUOTA is exactly like Q_GETQUOTA, except that it
will return quota information for the id equal to or greater
than the id requested.  In other words, if the requested id has
no quota, the command will return quota information for the
next higher id which does have a quota set.  If no higher id
has an active quota, -ESRCH is returned.

This allows filesystems to do efficient iteration in kernelspace,
much like extN filesystems do in userspace when asked to report
all active quotas.

This does require a new data structure for userspace, as the
current structure does not include an ID for the returned quota
information.

Today, Ext4 with a hidden quota inode requires getpwent-style
iterations, and for systems which have i.e. LDAP backends,
this can be very slow, or even impossible if iteration is not
allowed in the configuration.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NDave Chinner <david@fromorbit.com>

926132c0

quota: add new quotactl Q_XGETNEXTQUOTA · 8b375249

由 Eric Sandeen 提交于 2月 08, 2016

Q_XGETNEXTQUOTA is exactly like Q_XGETQUOTA, except that it
will return quota information for the id equal to or greater
than the id requested.  In other words, if the requested id has
no quota, the command will return quota information for the
next higher id which does have a quota set.  If no higher id
has an active quota, -ESRCH is returned.

This allows filesystems to do efficient iteration in kernelspace,
much like extN filesystems do in userspace when asked to report
all active quotas.

The patch adds a d_id field to struct qc_dqblk so that we can
pass back the id of the quota which was found, and return it
to userspace.

Today, filesystems such as XFS require getpwent-style iterations,
and for systems which have i.e. LDAP backends, this can be very
slow, or even impossible if iteration is not allowed in the
configuration.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NDave Chinner <david@fromorbit.com>

8b375249

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功