Commits · d20a5e3851969fa685f118a80e4df670255a4e8d · openeuler / Kernel

Sep 26, 2017 · 5 commits

xfs: report zeroed or not correctly in xfs_zero_range() · d20a5e38

Committed by Eryu Guan on Sep 18, 2017

The 'did_zero' param of xfs_zero_range() was not passed to
iomap_zero_range() correctly. This was introduced by commit
7bb41db3 ("xfs: handle 64-bit length in xfs_iozero"), and found
by code inspection.
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
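
The class of bug described in this commit, a wrapper that accepts an out-parameter but does not forward it to the helper that fills it in, can be sketched in userspace C. This is only an illustration under stated assumptions: `inner_zero_range`, `zero_range_buggy`, and `zero_range_fixed` are hypothetical names, not the actual XFS or iomap functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical inner helper: zeroes buf[pos .. pos+len) and, when did_zero
 * is non-NULL, reports whether any bytes were zeroed. */
static int inner_zero_range(unsigned char *buf, size_t pos, size_t len,
                            bool *did_zero)
{
    assert(buf != NULL);
    if (len == 0)
        return 0;
    memset(buf + pos, 0, len);
    if (did_zero)
        *did_zero = true;
    return 0;
}

/* Buggy wrapper (assumed shape of the mistake): takes did_zero but hands
 * NULL to the helper, so the caller's flag is never updated. */
static int zero_range_buggy(unsigned char *buf, size_t pos, size_t len,
                            bool *did_zero)
{
    (void)did_zero;
    return inner_zero_range(buf, pos, len, NULL);
}

/* Fixed wrapper: forwards the caller's out-parameter unchanged. */
static int zero_range_fixed(unsigned char *buf, size_t pos, size_t len,
                            bool *did_zero)
{
    return inner_zero_range(buf, pos, len, did_zero);
}
```

Because the helper tolerates a NULL out-parameter, a caller that does not care about the result can pass NULL directly, which is the simplification the next commit in this list relies on.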

xfs: kill meaningless variable 'zero' · 64671baf

Committed by Eryu Guan on Sep 18, 2017

In xfs_file_aio_write_checks(), variable 'zero' is there only to
satisfy xfs_zero_eof(), the result of it is ignored. Now, with
iomap_zero_range() based xfs_zero_eof(), we can safely pass NULL as
the last param of it and kill 'zero'.
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

fs/xfs: Use %pS printk format for direct addresses · e150dcd4

Committed by Helge Deller on Sep 18, 2017

Use the %pS instead of the %pF printk format specifier for printing symbols
from direct addresses. This is needed for the ia64, ppc64 and parisc64
architectures.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: evict CoW fork extents when performing finsert/fcollapse · 3af423b0

Committed by Darrick J. Wong on Sep 18, 2017

When we perform an finsert/fcollapse operation, cancel all the CoW
extents for the affected file offset range so that they don't end up
pointing to the wrong blocks.
Reported-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: don't unconditionally clear the reflink flag on zero-block files · cc6f7771

Committed by Darrick J. Wong on Sep 18, 2017

If we have speculative cow preallocations hanging around in the cow
fork, don't let a truncate operation clear the reflink flag because if
we do then there's a chance we'll forget to free those extents when we
destroy the incore inode.
Reported-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Sep 15, 2017 · 9 commits

vfs: constify path argument to kernel_read_file_from_path · 711aab1d

Committed by Mimi Zohar on Sep 12, 2017

This patch constifies the path argument to kernel_read_file_from_path().
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

orangefs: Adjust three checks for null pointers · 0b08273c

Committed by Markus Elfring on Aug 17, 2017

The script “checkpatch.pl” reported warnings like the following.

Comparison to NULL could be written !…

Thus fix the affected places in the source code.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: Use kcalloc() in orangefs_prepare_cdm_array() · 5e273a0e

Committed by Markus Elfring on Aug 17, 2017

* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data structure by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
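
The two points in this commit have a close userspace analogue: use an array allocator instead of a hand-written multiplication, and size the element from the pointer rather than by naming the type. The sketch below uses the standard `calloc()` as a stand-in for the kernel's `kcalloc()`; `struct sketch_entry` and `alloc_entries` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical array element; not the actual orangefs structure. */
struct sketch_entry {
    uint64_t id;
    uint32_t flags;
};

/* Allocate a zeroed array of n entries. The allocator receives the element
 * count and element size separately, so it can detect a product that would
 * overflow instead of silently allocating a short buffer. sizeof(*arr)
 * stays correct even if the type of arr changes later. */
static struct sketch_entry *alloc_entries(size_t n)
{
    struct sketch_entry *arr;

    assert(n > 0);
    arr = calloc(n, sizeof(*arr));
    return arr;   /* NULL on failure, including an overflowing n */
}
```

A caller checks the result for NULL and releases it with `free()`; in the kernel the corresponding pair would be `kcalloc()` and `kfree()`.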

orangefs: Delete error messages for a failed memory allocation in five functions · 07a25853

Committed by Markus Elfring on Aug 17, 2017

Omit an extra message for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: constify xattr_handler structure · 12174444

Committed by Julia Lawall on Aug 02, 2017

The xattr_handler structure is only stored in an array of const
structures.  Thus the xattr_handler structure itself can be
const.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: don't call filemap_write_and_wait from fsync · 49e55713

Committed by Jeff Layton on Apr 12, 2017

Orangefs doesn't do buffered writes yet, so there's no point in
initiating and waiting for writeback.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: off by ones in xattr size checks · 5f13e587

Committed by Dan Carpenter on May 22, 2017

A previous patch which claimed to remove off by ones actually introduced
them.

strlen() returns the length of the string not including the NUL
character.  We are using strcpy() to copy "name" into a buffer which is
ORANGEFS_MAX_XATTR_NAMELEN characters long.  We should make sure to
leave space for the NUL, otherwise we're writing one character beyond
the end of the buffer.

Fixes: e675c5ec ("orangefs: clean up oversize xattr validation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
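
The size check described in this commit can be sketched as follows. This is a minimal illustration, assuming a small hypothetical limit `SKETCH_MAX_NAMELEN` in place of ORANGEFS_MAX_XATTR_NAMELEN; the helper name and the `-EINVAL` return value are also assumptions, not taken from the orangefs source.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical buffer size standing in for ORANGEFS_MAX_XATTR_NAMELEN. */
#define SKETCH_MAX_NAMELEN 8

/* Copy name into a buffer of SKETCH_MAX_NAMELEN bytes.
 * strlen() does not count the terminating NUL, so a name of exactly
 * SKETCH_MAX_NAMELEN characters would need one byte more than the buffer
 * holds. Rejecting with ">=" (not ">") leaves room for the NUL. */
static int copy_xattr_name(char dst[SKETCH_MAX_NAMELEN], const char *name)
{
    assert(dst != NULL && name != NULL);
    if (strlen(name) >= SKETCH_MAX_NAMELEN)
        return -EINVAL;
    strcpy(dst, name);
    return 0;
}
```

With the limit at 8, a 7-character name is accepted and an 8-character name is rejected; a check written with `>` would accept the latter and write the NUL one byte past the end of the buffer.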

orangefs: react properly to posix_acl_update_mode's aftermath. · 4bef6900

Committed by Mike Marshall on Aug 10, 2017

posix_acl_update_mode checks to see if the permissions
described by the ACL can be encoded into the
object's mode. If so, it sets "acl" to NULL
and "mode" to the new desired value. Prior to this patch
we failed to actually propagate the new mode back to the
server.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: Don't clear SGID when inheriting ACLs · b5accbb0

Committed by Jan Kara on Jun 22, 2017

When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.

Fix the problem by creating __orangefs_set_acl() function that does not
call posix_acl_update_mode() and use it when inheriting ACLs. That
prevents SGID bit clearing and the mode has been properly set by
posix_acl_create() anyway.

Fixes: 07393101
CC: stable@vger.kernel.org
CC: Mike Marshall <hubcap@omnibond.com>
CC: pvfs2-developers@beowulf-underground.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

Sep 14, 2017 · 3 commits

mm: treewide: remove GFP_TEMPORARY allocation flag · 0ee931c4

Committed by Michal Hocko on Sep 13, 2017

GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  Its
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation.  As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
high-level GFP_TEMPORARY allocation flag.  How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.

The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing callers provide a way to reclaim the allocated memory.  So
this is rather misleading and hard to evaluate for any benefits.

I have checked some random users and none of them has added the flag
with a specific justification.  I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring.  This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.

I believe that our gfp flags are quite complex already and especially
those with high-level semantics should be clearly defined to prevent
confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
converting all existing users to simply use GFP_KERNEL.  Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.

I can see reasons we might want some gfp flag to reflect short-term
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.

This was brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
seems to be a heuristic without any measured advantage for most (if not
all) its current users.  The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers.  So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.

[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
  Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fscache: fix fscache_objlist_show format processing · ebfddb3d

Committed by Arnd Bergmann on Sep 13, 2017

gcc points out a minor bug in the handling of unknown cookie types,
which could result in a string overflow when the integer is copied into
a 3-byte string:

  fs/fscache/object-list.c: In function 'fscache_objlist_show':
  fs/fscache/object-list.c:265:19: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
   sprintf(_type, "%02u", cookie->def->type);
                  ^~~~~~
  fs/fscache/object-list.c:265:4: note: 'sprintf' output between 3 and 4 bytes into a destination of size 3

This is currently harmless as no code sets a type other than 0 or 1, but
it makes sense to use snprintf() here to avoid overflowing the array if
that changes.

Link: http://lkml.kernel.org/r/20170714120720.906842-22-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
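
The overflow gcc warns about, and the bounded alternative, can be reproduced in a few lines of userspace C. This is a sketch, not the fscache code: `format_type` is a hypothetical helper that formats an integer into a 3-byte buffer the way the message describes.

```c
#include <assert.h>
#include <stdio.h>

/* Format type as at least two decimal digits into a 3-byte buffer.
 * "%02u" needs 2 digits plus the NUL for values 0..99, but 3 or more
 * digits for larger values. snprintf() never writes more than the given
 * size (including the NUL) and returns the length the full output would
 * have had, so the caller can detect truncation. */
static int format_type(char out[3], unsigned int type)
{
    assert(out != NULL);
    return snprintf(out, 3, "%02u", type);
}
```

A return value of 3 or more means the text was truncated to two characters; with plain `sprintf()` the same input would have written past the end of the buffer.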

procfs: remove unused variable · 6dec0dd4

Committed by Arnd Bergmann on Sep 13, 2017

In NOMMU configurations, we get a warning about a variable that has become
unused:

  fs/proc/task_nommu.c: In function 'nommu_vma_show':
  fs/proc/task_nommu.c:148:28: error: unused variable 'priv' [-Werror=unused-variable]

Link: http://lkml.kernel.org/r/20170911200231.3171415-1-arnd@arndb.de
Fixes: 1240ea0d ("fs, proc: remove priv argument from is_stack")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Sep 13, 2017 · 4 commits

xfs: XFS_IS_REALTIME_INODE() should be false if no rt device present · b31ff3cd

Committed by Richard Wareing on Sep 13, 2017

If using a kernel with CONFIG_XFS_RT=y and we set the RTINHERIT flag on
a directory in a filesystem that does not have a realtime device and
create a new file in that directory, it gets marked as a real time file.
When data is written and a fsync is issued, the filesystem attempts to
flush a non-existent rt device during the fsync process.

This results in a crash dereferencing a null buftarg pointer in
xfs_blkdev_issue_flush():

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  IP: xfs_blkdev_issue_flush+0xd/0x20
  .....
  Call Trace:
    xfs_file_fsync+0x188/0x1c0
    vfs_fsync_range+0x3b/0xa0
    do_fsync+0x3d/0x70
    SyS_fsync+0x10/0x20
    do_syscall_64+0x4d/0xb0
    entry_SYSCALL64_slow_path+0x25/0x25

Setting RT inode flags does not require special privileges so any
unprivileged user can cause this oops to occur.  To reproduce, confirm
kernel is compiled with CONFIG_XFS_RT=y and run:

  # mkfs.xfs -f /dev/pmem0
  # mount /dev/pmem0 /mnt/test
  # mkdir /mnt/test/foo
  # xfs_io -c 'chattr +t' /mnt/test/foo
  # xfs_io -f -c 'pwrite 0 5m' -c fsync /mnt/test/foo/bar

Or just run xfstests with MKFS_OPTIONS="-d rtinherit=1" and wait.

Kernels built with CONFIG_XFS_RT=n are not exposed to this bug.

Fixes: f538d4da ("[XFS] write barrier support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Richard Wareing <rwareing@fb.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
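
The idea in the commit title, that an inode should only count as realtime when a realtime device actually exists, can be sketched as a predicate. Everything below is hypothetical (structure names, field names, and the flag value); the real XFS_IS_REALTIME_INODE() macro and its data structures may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_FLAG_REALTIME 0x1u   /* hypothetical per-inode flag */

struct sketch_mount {
    void *rt_dev;                   /* NULL when no realtime device exists */
};

struct sketch_inode {
    const struct sketch_mount *mnt;
    unsigned int flags;
};

/* An inode is treated as realtime only if its flag is set AND the
 * filesystem has a realtime device. Testing the flag alone would send
 * later code (such as a cache flush) to a device that is not there. */
static bool inode_is_realtime(const struct sketch_inode *ip)
{
    assert(ip != NULL && ip->mnt != NULL);
    return (ip->flags & SKETCH_FLAG_REALTIME) && ip->mnt->rt_dev != NULL;
}
```

With this shape, a file that inherited the realtime flag on a filesystem without a realtime device is simply handled as a regular data-device file.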

f2fs: hurry up to issue discard after io interruption · e6c6de18

Committed by Chao Yu on Sep 12, 2017

When issuing discards is interrupted by I/O, we delay for a long time
before the next round; if the system becomes I/O idle during that time,
we may lose the opportunity to issue discards. So this patch issues
discards sooner after an I/O interruption.

Besides, this patch also fixes discard issuing so that it accurately
follows the assigned rate.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to show correct discard_granularity in sysfs · 80647e5f

Committed by Chao Yu on Sep 12, 2017

Fix the incorrect display below when reading the discard_granularity sysfs node.

$ cat /sys/fs/f2fs/<device>/discard_granularity
16
$ echo 32 > /sys/fs/f2fs/<device>/discard_granularity
$ cat /sys/fs/f2fs/<device>/discard_granularity
16
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: detect dirty inode in evict_inode · ca7d802a

Committed by Chao Yu on Sep 12, 2017

Add a bugon in f2fs_evict_inode to detect inconsistent status between
inode cache and related node page cache.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Sep 12, 2017 · 10 commits

ovl: fix false positive ESTALE on lookup · 939ae4ef

Committed by Amir Goldstein on Sep 11, 2017

Commit b9ac5c27 ("ovl: hash overlay non-dir inodes by copy up origin")
verifies that the origin lower inode stored in the overlayfs inode matches
the inode of a copy up origin dentry found by lookup.

There is a false positive result in that check when lower fs does not
support file handles and copy up origin cannot be followed by file handle
at lookup time.

The false positive happens when finding an overlay inode in cache on a
copied up overlay dentry lookup. The overlay inode still 'remembers' the
copy up origin inode, but the copy up origin dentry is not available for
verification.

Relax the check in case copy up origin dentry is not available.

Fixes: b9ac5c27 ("ovl: hash overlay non-dir inodes by copy up...")
Cc: <stable@vger.kernel.org> # v4.13
Reported-by: Jordi Pujol <jordipujolp@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: getattr cleanup · 5b97eeac

Committed by Miklos Szeredi on Sep 12, 2017

The refreshed argument isn't used by any caller, get rid of it.

Use a helper for just updating the inode (no need to fill in a kstat).
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: honor iocb sync flags on write · e1c0eecb

Committed by Miklos Szeredi on Sep 12, 2017

If the IOCB_DSYNC flag is set a sync is not being performed by
fuse_file_write_iter.

Honor IOCB_DSYNC/IOCB_SYNC by setting O_DSYNC/O_SYNC respectively in the
flags field of the write request.

We don't need to sync data or metadata, since fuse_perform_write() does
write-through and the filesystem is responsible for updating file times.

Original patch by Vitaly Zolotusky.
Reported-by: Nate Clark <nate@neworld.us>
Cc: Vitaly Zolotusky <vitaly@unitc.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
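
The flag translation this commit describes can be sketched in userspace C. The `SKETCH_IOCB_*` constants are hypothetical stand-ins for the kernel's IOCB_DSYNC and IOCB_SYNC (their values here are illustrative only), and `write_request_flags` is an assumed helper name; only `O_DSYNC` and `O_SYNC` come from the standard `<fcntl.h>`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>

/* Hypothetical per-I/O sync flags; values are not the kernel's. */
#define SKETCH_IOCB_DSYNC 0x1u
#define SKETCH_IOCB_SYNC  0x2u

/* Build the flags field of a write request: start from the file's open
 * flags and add O_DSYNC / O_SYNC when the corresponding per-I/O flag is
 * set, so the filesystem server can see that the write must be durable. */
static unsigned int write_request_flags(unsigned int open_flags,
                                        unsigned int iocb_flags)
{
    /* Only the two sketch flags are expected here. */
    assert((iocb_flags & ~(SKETCH_IOCB_DSYNC | SKETCH_IOCB_SYNC)) == 0);

    if (iocb_flags & SKETCH_IOCB_DSYNC)
        open_flags |= O_DSYNC;
    if (iocb_flags & SKETCH_IOCB_SYNC)
        open_flags |= O_SYNC;
    return open_flags;
}
```

The sketch only sets flags; per the commit message, no extra data or metadata sync is issued on the kernel side because the write path is already write-through.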

fuse: allow server to run in different pid_ns · 5d6d3a30

Committed by Miklos Szeredi on Sep 12, 2017

Commit 0b6e9ea0 ("fuse: Add support for pid namespaces") broke
Sandstorm.io development tools, which have been sending FUSE file
descriptors across PID namespace boundaries since early 2014.

The above patch added a check that prevented I/O on the fuse device file
descriptor if the pid namespace of the reader/writer was different from the
pid namespace of the mounter. With this change passing the device file
descriptor to a different pid namespace simply doesn't work. The check was
added because pids are transferred to/from the fuse userspace server in the
namespace registered at mount time.

To fix this regression, remove the checks and do the following:

1) the pid in the request header (the pid of the task that initiated the
filesystem operation) is translated to the reader's pid namespace. If a
mapping doesn't exist for this pid, then a zero pid is used. Note: even if
a mapping would exist between the initiator task's pid namespace and the
reader's pid namespace the pid will be zero if either mapping from
initiator's to mounter's namespace or mapping from mounter's to reader's
namespace doesn't exist.

2) The lk.pid value in setlk/setlkw requests and getlk reply is left alone.
Userspace should not interpret this value anyway. Also allow the
setlk/setlkw operations if the pid of the task cannot be represented in the
mounter's namespace (pid being zero in that case).
Reported-by: Kenton Varda <kenton@sandstorm.io>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 0b6e9ea0 ("fuse: Add support for pid namespaces")
Cc: <stable@vger.kernel.org> # v4.12+
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Seth Forshee <seth.forshee@canonical.com>

f2fs: clear radix tree dirty tag of pages whose dirty flag is cleared · 0abd8e70

Committed by Daeho Jeong on Sep 11, 2017

In a scenario like writing out the first dirty page of the inode
as the inline data, we only cleared dirty flags of the pages, but
didn't clear the dirty tags of those pages in the radix tree.

If we don't clear the dirty tags of the pages in the radix tree, the
inodes which contain the pages will be marked with I_DIRTY_PAGES again
and again, and writepages() for the inodes will be invoked in every
writeback period. As a result, nothing will be done in every
writepages() for the inodes and it will just consume CPU time
meaninglessly.
Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

NFS: various changes relating to reporting IO errors. · bf4b4905

Committed by NeilBrown on Sep 11, 2017

1/ remove 'start' and 'end' args from nfs_file_fsync_commit().
   They aren't used.

2/ Make nfs_context_set_write_error() a "static inline" in internal.h
   so we can...

3/ Use nfs_context_set_write_error() instead of mapping_set_error()
   if nfs_pageio_add_request() fails before sending any request.
   NFS generally keeps errors in the open_context, not the mapping,
   so this is more consistent.

4/ If filemap_write_and_wait_range() reports any error, still
   check ctx->error.  The value in ctx->error is likely to be
   more useful.  As part of this, NFS_CONTEXT_ERROR_WRITE is
   cleared slightly earlier, before nfs_file_fsync_commit() is called,
   rather than at the start of that function.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Add static NFS I/O tracepoints · 8224b273

Committed by Chuck Lever on Aug 21, 2017

Tools like tcpdump and rpcdebug can be very useful. But there are
plenty of environments where they are difficult or impossible to
use. For example, we've had customers report I/O failures during
workloads so heavy that collecting network traffic or enabling
RPC debugging are themselves onerous.

The kernel's static tracepoints are lightweight (less likely to
introduce timing changes) and efficient (the trace data is compact).
They also work in scenarios where capturing network traffic is not
possible due to lack of hardware support (some InfiniBand HCAs) or
where data or network privacy is a concern.

Introduce tracepoints that show when an NFS READ, WRITE, or COMMIT
is initiated, and when it completes. Record the arguments and
results of each operation, which are not shown by existing sunrpc
module's tracepoints.

For instance, the recorded offset and count can be used to match an
"initiate" event to a "done" event. If an NFS READ result returns
fewer bytes than requested or zero, seeing the EOF flag can be
probative. Seeing an NFS4ERR_BAD_STATEID result is also indication
of a particular class of problems. The timing information attached
to each event record can often be useful as well.

Usage example:

[root@manet tmp]# trace-cmd record -e nfs:*initiate* -e nfs:*done
/sys/kernel/debug/tracing/events/nfs/*initiate*/filter
/sys/kernel/debug/tracing/events/nfs/*done/filter
Hit Ctrl^C to stop recording
^CKernel buffer statistics:
  Note: "entries" are the entries left in the kernel ring buffer and are not
        recorded in the trace data. They should all be zero.

CPU: 0
entries: 0
overrun: 0
commit overrun: 0
bytes: 3680
oldest event ts:    78.367422
now ts:   100.124419
dropped events: 0
read events: 74

... and so on.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS: Use the standard I/O stateid when calling LAYOUTGET · 70d2f7b1

Committed by Trond Myklebust on Sep 11, 2017

Instead of having a private method for copying the open/delegation stateid,
use the same call that is used for standard I/O through the MDS.

Note that this means we transmit the stateid with a zero seqid, avoiding
issues with NFS4ERR_OLD_STATEID.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f2fs: speed up gc_urgent mode with SSR · b3a97a2a

Committed by Jaegeuk Kim on Sep 09, 2017

This patch activates SSR in gc_urgent mode.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: better to wait for fstrim completion · 1eb1ef4a

Committed by Jaegeuk Kim on Sep 09, 2017

On Android, we'd better wait for fstrim completion instead of issuing the
discard commands asynchronously.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Sep 11, 2017 · 1 commit

dax: remove the pmem_dax_ops->flush abstraction · c3ca015f

Committed by Mikulas Patocka on Aug 31, 2017

Commit abebfbe2 ("dm: add ->flush() dax operation support") is
buggy. A DM device may be composed of multiple underlying devices and
all of them need to be flushed. That commit just routes the flush
request to the first device and ignores the other devices.

It could be fixed by adding more complex logic to the device mapper. But
there is only one implementation of the method pmem_dax_ops->flush - that
is pmem_dax_flush() - and it calls arch_wb_cache_pmem(). Consequently, we
don't need the pmem_dax_ops->flush abstraction at all, we can call
arch_wb_cache_pmem() directly from dax_flush() because dax_dev->ops->flush
can't ever reach anything different from arch_wb_cache_pmem().

It should be also pointed out that for some uses of persistent memory it
is needed to flush only a very small amount of data (such as 1 cacheline),
and it would be overkill if we go through that device mapper machinery for
a single flushed cache line.

Fix this by removing the pmem_dax_ops->flush abstraction and call
arch_wb_cache_pmem() directly from dax_flush(). Also, remove the device
mapper code that forwards the flushes.

Fixes: abebfbe2 ("dm: add ->flush() dax operation support")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

Sep 10, 2017 · 4 commits

NFS: Count the bytes of skipped subrequests in nfs_lock_and_join_requests() · 1bd5d6d0

Committed by Trond Myklebust on Sep 09, 2017

If we skip a subrequest due to a zero refcount, we should still count
the byte range that it covered so that we accurately reconstruct the
original request size.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
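
The accounting rule in this commit, count a subrequest's byte range even when the subrequest itself is skipped, can be sketched as a simple loop. `struct sketch_subreq` and `join_subrequests` are hypothetical; the real nfs_lock_and_join_requests() works on struct nfs_page groups with locking that is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subrequest: a byte range plus a reference count. */
struct sketch_subreq {
    size_t bytes;    /* length of the byte range this subrequest covers */
    int refcount;    /* zero means the subrequest is being torn down */
};

/* Return the total size of the original request. Subrequests with a zero
 * refcount are skipped for joining, but their bytes are still added so the
 * reconstructed size matches the original. *joined receives the number of
 * subrequests that were not skipped. */
static size_t join_subrequests(const struct sketch_subreq *subs, size_t n,
                               size_t *joined)
{
    size_t total = 0;
    size_t i;

    assert(subs != NULL || n == 0);
    assert(joined != NULL);

    *joined = 0;
    for (i = 0; i < n; i++) {
        total += subs[i].bytes;        /* count the range unconditionally */
        if (subs[i].refcount == 0)
            continue;                  /* skipped: nothing else to do */
        (*joined)++;
    }
    return total;
}
```

If the byte count were added only after the refcount check, the total would come up short by exactly the size of the skipped subrequests.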

NFS: Don't hold the group lock when calling nfs_release_request() · 8b77484f

Committed by Trond Myklebust on Sep 09, 2017

That can deadlock if this is the last reference since
nfs_page_group_destroy() calls nfs_page_group_sync_on_bit().
Note that even if the page was removed from the subpage list,
the req->wb_head could still be pointing to the old head.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Remove pnfs_generic_transfer_commit_list() · 5d2a9d9d

Committed by Trond Myklebust on Sep 09, 2017

It's pretty much a duplicate of nfs_scan_commit_list() that also
clears the PG_COMMIT_TO_DS flag.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: nfs_lock_and_join_requests and nfs_scan_commit_list can deadlock · 137da553

Committed by Trond Myklebust on Sep 09, 2017

Since the commit list is not ordered, it is possible for nfs_scan_commit_list
to hold a request that nfs_lock_and_join_requests() is waiting for, while
at the same time trying to grab a request that nfs_lock_and_join_requests
already holds.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Sep 09, 2017 · 4 commits

squashfs: Add zstd support · 87bf54bb

Committed by Sean Purcell on Aug 09, 2017

Add zstd compression and decompression support to SquashFS. zstd is a
great fit for SquashFS because it can compress at ratios approaching xz,
while decompressing twice as fast as zlib. For SquashFS in particular,
it can decompress as fast as lzo and lz4. It also has the flexibility
to turn down the compression ratio for faster compression times.

The compression benchmark is run on the file tree from the SquashFS archive
found in ubuntu-16.10-desktop-amd64.iso [1]. It uses `mksquashfs` with the
default block size (128 KB) and various compression algorithms/levels.
xz and zstd are also benchmarked with 256 KB blocks. The decompression
benchmark times how long it takes to `tar` the file tree into `/dev/null`.
See the benchmark file in the upstream zstd source repository located under
`contrib/linux-kernel/squashfs-benchmark.sh` [2] for details.

I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD.

| Method         | Ratio | Compression MB/s | Decompression MB/s |
|----------------|-------|------------------|--------------------|
| gzip           |  2.92 |               15 |                128 |
| lzo            |  2.64 |              9.5 |                217 |
| lz4            |  2.12 |               94 |                218 |
| xz             |  3.43 |              5.5 |                 35 |
| xz 256 KB      |  3.53 |              5.4 |                 40 |
| zstd 1         |  2.71 |               96 |                210 |
| zstd 5         |  2.93 |               69 |                198 |
| zstd 10        |  3.01 |               41 |                225 |
| zstd 15        |  3.13 |             11.4 |                224 |
| zstd 16 256 KB |  3.24 |              8.1 |                210 |

This patch was written by Sean Purcell <me@seanp.xyz>, but I will be
taking over the submission process.

[1] http://releases.ubuntu.com/16.10/
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh

zstd source repository: https://github.com/facebook/zstd
Signed-off-by: Sean Purcell <me@seanp.xyz>
Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: Phillip Lougher <phillip@squashfs.org.uk>

NFS: Fix 2 use after free issues in the I/O code · 196639eb

Committed by Trond Myklebust on Sep 08, 2017

The writeback code wants to send a commit after processing the pages,
which is why we want to delay releasing the struct path until after
that's done.

Also, the layout code expects that we do not free the inode before
we've put the layout segments in pnfs_writehdr_free() and
pnfs_readhdr_free().

Fixes: 919e3bd9 ("NFS: Ensure we commit after writeback is complete")
Fixes: 4714fb51 ("nfs: remove pgio_header refcount, related cleanup")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

vfat: deduplicate hex2bin() · 5680db4b

Committed by OGAWA Hirofumi on Sep 08, 2017

We may use hex2bin() instead of custom approach.

Link: http://lkml.kernel.org/r/87zibktpil.fsf@devron
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
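
For readers unfamiliar with what a hex2bin() style helper does, here is a minimal userspace sketch of the conversion. It is modeled loosely on the idea only: `sketch_hex_to_bin` and `sketch_hex2bin` are hypothetical names, and the return convention (0 on success, -1 on bad input) is an assumption of this sketch rather than a statement about the kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* Convert one hex digit to its value, or return -1 if it is not one. */
static int sketch_hex_to_bin(char ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    if (ch >= 'a' && ch <= 'f')
        return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F')
        return ch - 'A' + 10;
    return -1;
}

/* Convert 2 * count hex digits from src into count bytes at dst.
 * Stops at the first invalid digit (including an early NUL), so it never
 * reads past the end of a shorter NUL-terminated string. */
static int sketch_hex2bin(unsigned char *dst, const char *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    while (count--) {
        int hi = sketch_hex_to_bin(*src++);
        int lo;

        if (hi < 0)
            return -1;
        lo = sketch_hex_to_bin(*src++);
        if (lo < 0)
            return -1;
        *dst++ = (unsigned char)((hi << 4) | lo);
    }
    return 0;
}
```

Replacing an open-coded digit loop with one shared helper of this kind is the deduplication the commit title refers to.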

autofs: use unsigned int/long instead of uint/ulong for ioctl args · b9fa2ad1

Committed by Tomohiro Kusumi on Sep 08, 2017

The standard types unsigned int and unsigned long should be used for
.compat_ioctl.  autofs is the only fs using uint/ulong for this, and these
are even the only uint/ulong in the entire autofs code.

Drop unneeded long cast in return value of autofs_dev_ioctl_compat().
It's already long.

Link: http://lkml.kernel.org/r/150285069709.4670.3884827966280147529.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
