提交 · c8d78c1823f46519473949d33f0d1d33fe21ea16 · openanolis / cloud-kernel

11 2月, 2015 33 次提交

mm: replace remap_file_pages() syscall with emulation · c8d78c18

由 Kirill A. Shutemov 提交于 2月 10, 2015

remap_file_pages(2) was invented to be able efficiently map parts of
huge file into limited 32-bit virtual address space such as in database
workloads.

Nonlinear mappings are pain to support and it seems there's no
legitimate use-cases nowadays since 64-bit systems are widely available.

Let's drop it and get rid of all these special-cased code.

The patch replaces the syscall with emulation which creates new VMA on
each remap_file_pages(), unless they it can be merged with an adjacent
one.

I didn't find *any* real code that uses remap_file_pages(2) to test
emulation impact on.  I've checked Debian code search and source of all
packages in ALT Linux.  No real users: libc wrappers, mentions in
strace, gdb, valgrind and this kind of stuff.

There are few basic tests in LTP for the syscall.  They work just fine
with emulation.

To test performance impact, I've written small test case which
demonstrate pretty much worst case scenario: map 4G shmfs file, write to
begin of every page pgoff of the page, remap pages in reverse order,
read every page.

The test creates 1 million of VMAs if emulation is in use, so I had to
set vm.max_map_count to 1100000 to avoid -ENOMEM.

Before:		23.3 ( +-  4.31% ) seconds
After:		43.9 ( +-  0.85% ) seconds
Slowdown:	1.88x

I believe we can live with that.

Test case:

        #define _GNU_SOURCE
        #include <assert.h>
        #include <stdlib.h>
        #include <stdio.h>
        #include <sys/mman.h>

        #define MB	(1024UL * 1024)
        #define SIZE	(4096 * MB)

        int main(int argc, char **argv)
        {
                unsigned long *p;
                long i, pass;

                for (pass = 0; pass < 10; pass++) {
                        p = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
                                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
                        if (p == MAP_FAILED) {
                                perror("mmap");
                                return -1;
                        }

                        for (i = 0; i < SIZE / 4096; i++)
                                p[i * 4096 / sizeof(*p)] = i;

                        for (i = 0; i < SIZE / 4096; i++) {
                                if (remap_file_pages(p + i * 4096 / sizeof(*p), 4096,
                                                0, (SIZE - 4096 * (i + 1)) >> 12, 0)) {
                                        perror("remap_file_pages");
                                        return -1;
                                }
                        }

                        for (i = SIZE / 4096 - 1; i >= 0; i--)
                                assert(p[i * 4096 / sizeof(*p)] == SIZE / 4096 - i - 1);

                        munmap(p, SIZE);
                }

                return 0;
        }

[akpm@linux-foundation.org: fix spello]
[sasha.levin@oracle.com: initialize populate before usage]
[sasha.levin@oracle.com: grab file ref to prevent race while mmaping]
Signed-off-by: N"Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Armin Rigo <arigo@tunes.org>
Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c8d78c18

mm/vmstat.c: fix/cleanup ifdefs · 3c486871

由 Andrew Morton 提交于 2月 10, 2015

CONFIG_COMPACTION=y, CONFIG_DEBUG_FS=n:

  mm/vmstat.c:690: warning: 'frag_start' defined but not used
  mm/vmstat.c:702: warning: 'frag_next' defined but not used
  mm/vmstat.c:710: warning: 'frag_stop' defined but not used
  mm/vmstat.c:715: warning: 'walk_zones_in_node' defined but not used

It's all a bit of a tangly mess and it's unclear why CONFIG_COMPACTION
figures in there at all.  Move frag_start/frag_next/frag_stop and
migratetype_names[] into the existing CONFIG_PROC_FS block.

walk_zones_in_node() gets a special ifdef.

Also move the #include lines up to where #include lines live.

[axel.lin@ingics.com: fix build error when !CONFIG_PROC_FS]
Signed-off-by: NAxel Lin <axel.lin@ingics.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Tested-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3c486871

mm/slab_common.c: use kmem_cache_free() · 7c4da061

由 Vaishali Thakkar 提交于 2月 10, 2015

Here, free memory is allocated using kmem_cache_zalloc.  So, use
kmem_cache_free instead of kfree.

This is done using Coccinelle and semantic patch used
is as follows:

@@
expression x,E,c;
@@

 x = \(kmem_cache_alloc\|kmem_cache_zalloc\|kmem_cache_alloc_node\)(c,...)
 ... when != x = E
     when != &x
?-kfree(x)
+kmem_cache_free(c,x)
Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com>
Acked-by: NChristoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7c4da061

mm/slub.c: fix typo in comment · 94e4d712

由 Kim Phillips 提交于 2月 10, 2015

Signed-off-by: NKim Phillips <kim.phillips@freescale.com>
Acked-by: NChristoph Lameter <cl@linux.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

94e4d712

mm: don't use compound_head() in virt_to_head_page() · ccaafd7f

由 Joonsoo Kim 提交于 2月 10, 2015

compound_head() is implemented with assumption that there would be race
condition when checking tail flag.  This assumption is only true when we
try to access arbitrary positioned struct page.

The situation that virt_to_head_page() is called is different case.  We
call virt_to_head_page() only in the range of allocated pages, so there
is no race condition on tail flag.  In this case, we don't need to
handle race condition and we can reduce overhead slightly.  This patch
implements compound_head_fast() which is similar with compound_head()
except tail flag race handling.  And then, virt_to_head_page() uses this
optimized function to improve performance.

I saw 1.8% win in a fast-path loop over kmem_cache_alloc/free, (14.063
ns -> 13.810 ns) if target object is on tail page.
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: NChristoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ccaafd7f

mm/slub: optimize alloc/free fastpath by removing preemption on/off · 9aabf810

由 Joonsoo Kim 提交于 2月 10, 2015

We had to insert a preempt enable/disable in the fastpath a while ago in
order to guarantee that tid and kmem_cache_cpu are retrieved on the same
cpu.  It is the problem only for CONFIG_PREEMPT in which scheduler can
move the process to other cpu during retrieving data.

Now, I reach the solution to remove preempt enable/disable in the
fastpath.  If tid is matched with kmem_cache_cpu's tid after tid and
kmem_cache_cpu are retrieved by separate this_cpu operation, it means
that they are retrieved on the same cpu.  If not matched, we just have
to retry it.

With this guarantee, preemption enable/disable isn't need at all even if
CONFIG_PREEMPT, so this patch removes it.

I saw roughly 5% win in a fast-path loop over kmem_cache_alloc/free in
CONFIG_PREEMPT.  (14.821 ns -> 14.049 ns)

Below is the result of Christoph's slab_test reported by Jesper Dangaard
Brouer.

* Before

 Single thread testing
 =====================
 1. Kmalloc: Repeatedly allocate then free test
 10000 times kmalloc(8) -> 49 cycles kfree -> 62 cycles
 10000 times kmalloc(16) -> 48 cycles kfree -> 64 cycles
 10000 times kmalloc(32) -> 53 cycles kfree -> 70 cycles
 10000 times kmalloc(64) -> 64 cycles kfree -> 77 cycles
 10000 times kmalloc(128) -> 74 cycles kfree -> 84 cycles
 10000 times kmalloc(256) -> 84 cycles kfree -> 114 cycles
 10000 times kmalloc(512) -> 83 cycles kfree -> 116 cycles
 10000 times kmalloc(1024) -> 81 cycles kfree -> 120 cycles
 10000 times kmalloc(2048) -> 104 cycles kfree -> 136 cycles
 10000 times kmalloc(4096) -> 142 cycles kfree -> 165 cycles
 10000 times kmalloc(8192) -> 238 cycles kfree -> 226 cycles
 10000 times kmalloc(16384) -> 403 cycles kfree -> 264 cycles
 2. Kmalloc: alloc/free test
 10000 times kmalloc(8)/kfree -> 68 cycles
 10000 times kmalloc(16)/kfree -> 68 cycles
 10000 times kmalloc(32)/kfree -> 69 cycles
 10000 times kmalloc(64)/kfree -> 68 cycles
 10000 times kmalloc(128)/kfree -> 68 cycles
 10000 times kmalloc(256)/kfree -> 68 cycles
 10000 times kmalloc(512)/kfree -> 74 cycles
 10000 times kmalloc(1024)/kfree -> 75 cycles
 10000 times kmalloc(2048)/kfree -> 74 cycles
 10000 times kmalloc(4096)/kfree -> 74 cycles
 10000 times kmalloc(8192)/kfree -> 75 cycles
 10000 times kmalloc(16384)/kfree -> 510 cycles

* After

 Single thread testing
 =====================
 1. Kmalloc: Repeatedly allocate then free test
 10000 times kmalloc(8) -> 46 cycles kfree -> 61 cycles
 10000 times kmalloc(16) -> 46 cycles kfree -> 63 cycles
 10000 times kmalloc(32) -> 49 cycles kfree -> 69 cycles
 10000 times kmalloc(64) -> 57 cycles kfree -> 76 cycles
 10000 times kmalloc(128) -> 66 cycles kfree -> 83 cycles
 10000 times kmalloc(256) -> 84 cycles kfree -> 110 cycles
 10000 times kmalloc(512) -> 77 cycles kfree -> 114 cycles
 10000 times kmalloc(1024) -> 80 cycles kfree -> 116 cycles
 10000 times kmalloc(2048) -> 102 cycles kfree -> 131 cycles
 10000 times kmalloc(4096) -> 135 cycles kfree -> 163 cycles
 10000 times kmalloc(8192) -> 238 cycles kfree -> 218 cycles
 10000 times kmalloc(16384) -> 399 cycles kfree -> 262 cycles
 2. Kmalloc: alloc/free test
 10000 times kmalloc(8)/kfree -> 65 cycles
 10000 times kmalloc(16)/kfree -> 66 cycles
 10000 times kmalloc(32)/kfree -> 65 cycles
 10000 times kmalloc(64)/kfree -> 66 cycles
 10000 times kmalloc(128)/kfree -> 66 cycles
 10000 times kmalloc(256)/kfree -> 71 cycles
 10000 times kmalloc(512)/kfree -> 72 cycles
 10000 times kmalloc(1024)/kfree -> 71 cycles
 10000 times kmalloc(2048)/kfree -> 71 cycles
 10000 times kmalloc(4096)/kfree -> 71 cycles
 10000 times kmalloc(8192)/kfree -> 65 cycles
 10000 times kmalloc(16384)/kfree -> 511 cycles

Most of the results are better than before.

Note that this change slightly worses performance in !CONFIG_PREEMPT,
roughly 0.3%.  Implementing each case separately would help performance,
but, since it's so marginal, I didn't do that.  This would help
maintanance since we have same code for all cases.
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: NChristoph Lameter <cl@linux.com>
Tested-by: NJesper Dangaard Brouer <brouer@redhat.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9aabf810

fsioctl.c: make generic_block_fiemap() signal-tolerant · 913e027c

由 Dmitry Monakhov 提交于 2月 10, 2015

__generic_block_fiemap may spin very long time for large sparse files.

Without this patch an unprivileged user may abuse system resources simply
by spawning a vast number of unkilable busyloops (works on ext2/ext3):

  truncate --size 1T test
  for ((i=0;i<1024;i++))
  do
         filefrag test > /dev/null &
  done
Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

913e027c

o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper · 99b8874e

由 Srinivas Eeda 提交于 2月 10, 2015

A tiny race between BAST and unlock message causes the NULL dereference.

A node sends an unlock request to master and receives a response.  Before
processing the response it receives a BAST from the master.  Since both
requests are processed by different threads it creates a race.  While the
BAST is being processed, lock can get freed by unlock code.

This patch makes bast to return immediately if lock is found but unlock is
pending.  The code should handle this race.  We also have to fix master
node to skip sending BAST after receiving unlock message.

Below is the crash stack

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
    IP: o2dlm_blocking_ast_wrapper+0xd/0x16
    dlm_do_local_bast+0x8e/0x97 [ocfs2_dlm]
    dlm_proxy_ast_handler+0x838/0x87e [ocfs2_dlm]
    o2net_process_message+0x395/0x5b8 [ocfs2_nodemanager]
    o2net_rx_until_empty+0x762/0x90d [ocfs2_nodemanager]
    worker_thread+0x14d/0x1ed

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NSrinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: NMark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

99b8874e

ocfs2: prune the dcache before deleting the dentry of directory · 10ab8811

由 alex chen 提交于 2月 10, 2015

In ocfs2_dentry_convert_worker, we should prune the dcache before deleting
the dentry of directory, otherwise, in the following cases the inode of
directory will still remain in orphan directory until the device being
umounted.

Mount point: /mnt/ocfs2
Node A                              Node B
mkdir /mnt/ocfs2/testdir
  ocfs2_mkdir
  ->ocfs2_mknod
  ->ocfs2_dentry_attach_lock
  ->ocfs2_dentry_lock(dentry, 0)
  ... ...
touch /mnt/ocfs2/testdir/testfile
                                    unlink /mnt/test/testdir/testfile
                                    rmdir /mnt/ocfs2/testdir
                                      ocfs2_unlink
                                      ->ocfs2_remote_dentry_delete
                                      ->ocfs2_dentry_lock(dentry, 1)
                                      ... ...
... ...
ocfs2_downconvert_thread
->ocfs2_unblock_lock
->ocfs2_dentry_convert_worker
->ocfs2_find_local_alias
  ->dget_dlock
->d_delete
Here the dentry can not be
released because the children's
dentry is negative but still exist.
Finally, this inode will still remain
in orphan directory until its children
are destroyed.

So before deleting dentry of directory, we should prune the dcache to
remove unused children of the parent dentry by shrink_dcache_parent().
Signed-off-by: NAlex Chen <alex.chen@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Reviewed-by: Njoyce.xue <xuejiufei@huawei.com>
Reviewed-by: NMark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

10ab8811

ocfs2: make resv_lock spinlock static · 8989b673

由 Fabian Frederick 提交于 2月 10, 2015

resv_lock is only used in reservations.c
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8989b673

ocfs2: remove unreachable code in __ocfs2_recovery_thread() · 9d6008c7

由 Daeseok Youn 提交于 2月 10, 2015

Signed-off-by: NDaeseok Youn <daeseok.youn@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9d6008c7

ocfs2: removes mlog_errno() call twice in ocfs2_find_dir_space_el() · 9b572691

由 Daeseok Youn 提交于 2月 10, 2015

mlog_errno() is called twice when some functions are failed.
Signed-off-by: NDaeseok Youn <daeseok.youn@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9b572691

ocfs2: remove unreachable code · 592a202a

由 Daeseok Youn 提交于 2月 10, 2015

Signed-off-by: NDaeseok Youn <daeseok.youn@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

592a202a

ocfs2: o2net: silence uninitialized variable warning · e6d9f86d

由 Dan Carpenter 提交于 2月 10, 2015

Smatch complains that, if o2net_tx_can_proceed() returns false, then "sc"
and "ret" are uninialized or maybe we are re-using the data from previous
iteration.  I do not know if we can hit this bug in real life but checking
the return value is harmless and we may as well silence the static checker
warning.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e6d9f86d

ocfs2: remove pointless assignment from ocfs2_calc_refcount_meta_credits() · d9510a20

由 Jan Kara 提交于 2月 10, 2015

The assigned value is never used.

Coverity-id 1226847.
Signed-off-by: NJan Kara <jack@suse.cz>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d9510a20

ocfs2: add a mount option journal_async_commit on ocfs2 filesystem · 1dfeb768

由 alex chen 提交于 2月 10, 2015

Add a mount option to support JBD2 feature:

JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT.  When this feature is opened, journal
commit block can be written to disk without waiting for descriptor blocks,
which can improve journal commit performance.  This option will enable
'journal_checksum' internally.

Using the fs_mark benchmark, using journal_async_commit shows a 50%
improvement, the files per second go up from 215.2 to 317.5.

test script:
fs_mark  -d  /mnt/ocfs2/  -s  10240  -n  1000

default:
FSUse%        Count         Size    Files/sec     App Overhead
     0         1000        10240        215.2            17878

with journal_async_commit option:
FSUse%        Count         Size    Files/sec     App Overhead
     0         1000        10240        317.5            17881
Signed-off-by: NAlex Chen <alex.chen@huawei.com>
Signed-off-by: NWeiwei Wang <wangww631@huawei.comm>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Reviewed-by: NMark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1dfeb768

ocfs2: fix journal commit deadlock in ocfs2_convert_inline_data_to_extents · 15eba0fe

由 alex chen 提交于 2月 10, 2015

Similar to ocfs2_write_end_nolock() which is metioned at commit
136f49b9 ("ocfs2: fix journal commit deadlock"), we should unlock
pages before ocfs2_commit_trans() in ocfs2_convert_inline_data_to_extents.

Otherwise, it will cause a deadlock with journal commit threads.
Signed-off-by: NAlex Chen <alex.chen@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

15eba0fe

ocfs2: dlm: dlmdomain: remove unused function · 95671c63

由 Rickard Strandqvist 提交于 2月 10, 2015

Remove dlm_joined() that is not used anywhere.

This was partially found by using a static code analysis program called
cppcheck.
Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

95671c63

ocfs2: quota_local: remove unused function · 7ea62d70

由 Rickard Strandqvist 提交于 2月 10, 2015

Remove ol_dqblk_file_block() that is not used anywhere.

This was partially found by using a static code analysis program called
cppcheck.
Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7ea62d70

ocfs2: xattr: remove unused function · d6edc87a

由 Rickard Strandqvist 提交于 2月 10, 2015

Remove ocfs2_xattr_bucket_get_val() that is not used anywhere.

This was partially found by using a static code analysis program called
cppcheck.
Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d6edc87a

ocfs2: fix snprintf format specifier in dlmdebug.c · 79c83ea1

由 alex chen 提交于 2月 10, 2015

Use snprintf format specifier "%lu" instead of "%ld" for argument of type
'unsigned long'.
Signed-off-by: NAlex Chen <alex.chen@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Reviewed-by: NMark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

79c83ea1

ocfs2: fix wrong comment · 41d6247f

由 Junxiao Bi 提交于 2月 10, 2015

O2NET_CONN_IDLE_DELAY is not defined, connection attempts will not be
canceled due to timeout.
Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

41d6247f

ocfs2: fix uninitialized variable access · 696cdf73

由 Junxiao Bi 提交于 2月 10, 2015

Variable "why" is not yet initialized at line 615, fix it.
Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

696cdf73

ocfs2: remove unnecessary else in ocfs2_set_acl() · e6e74618

由 Fabian Frederick 提交于 2月 10, 2015

else is unnecessary after return.
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e6e74618

ocfs2/dlm: add missing dlm_lock_put() when recovery master down · b934beaf

由 Xue jiufei 提交于 2月 10, 2015

When the recovery master is down, the owner of $RECOVERY calls
dlm_do_local_recovery_cleanup() to prune any $RECOVERY entries for dead
nodes.  The lock is in the granted list and the refcount must be 2.  We
should put twice to remove this lock.  Otherwise, it will lead to a memory
leak.
Signed-off-by: Njoyce.xue <xuejiufei@huawei.com>
Reported-by: Nyangwenfang <vicky.yangwenfang@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b934beaf

sh: eliminate unused irq_reg_{readl,writel} accessors · 102ca660

由 Kevin Cernekee 提交于 2月 10, 2015

Defining these macros way down in arch/sh/.../irq.c doesn't cause
kernel/irq/generic-chip.c to use them.  As far as I can tell this code has
no effect.

Fixes: 332fd7c4 ("genirq: Generic chip: Change irq_reg_{readl,writel} arguments")
Signed-off-by: NKevin Cernekee <cernekee@gmail.com>
Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> (cpp/asm comparison)
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

102ca660

sh: build superh without CONFIG_EXPERT · 560b8c0e

由 Rob Landley 提交于 2月 10, 2015

What sh4 actually wants is HAVE_PATA_PLATFORM, so select that instead.
Signed-off-by: NRob Landley <rob@landley.net>
Acked-by: NRandy Dunlap <rdunlap@infradead.org>
Acked-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

560b8c0e

fsnotify: fix handling of renames in audit · 6ee8e25f

由 Jan Kara 提交于 2月 10, 2015

Commit e9fd702a ("audit: convert audit watches to use fsnotify
instead of inotify") broke handling of renames in audit.  Audit code
wants to update inode number of an inode corresponding to watched name
in a directory.  When something gets renamed into a directory to a
watched name, inotify previously passed moved inode to audit code
however new fsnotify code passes directory inode where the change
happened.  That confuses audit and it starts watching parent directory
instead of a file in a directory.

This can be observed for example by doing:

  cd /tmp
  touch foo bar
  auditctl -w /tmp/foo
  touch foo
  mv bar foo
  touch foo

In audit log we see events like:

  type=CONFIG_CHANGE msg=audit(1423563584.155:90): auid=1000 ses=2 op="updated rules" path="/tmp/foo" key=(null) list=4 res=1
  ...
  type=PATH msg=audit(1423563584.155:91): item=2 name="bar" inode=1046884 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=DELETE
  type=PATH msg=audit(1423563584.155:91): item=3 name="foo" inode=1046842 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=DELETE
  type=PATH msg=audit(1423563584.155:91): item=4 name="foo" inode=1046884 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=CREATE
  ...

and that's it - we see event for the first touch after creating the
audit rule, we see events for rename but we don't see any event for the
last touch.  However we start seeing events for unrelated stuff
happening in /tmp.

Fix the problem by passing moved inode as data in the FS_MOVED_FROM and
FS_MOVED_TO events instead of the directory where the change happens.
This doesn't introduce any new problems because noone besides
audit_watch.c cares about the passed value:

  fs/notify/fanotify/fanotify.c cares only about FSNOTIFY_EVENT_PATH events.
  fs/notify/dnotify/dnotify.c doesn't care about passed 'data' value at all.
  fs/notify/inotify/inotify_fsnotify.c uses 'data' only for FSNOTIFY_EVENT_PATH.
  kernel/audit_tree.c doesn't care about passed 'data' at all.
  kernel/audit_watch.c expects moved inode as 'data'.

Fixes: e9fd702a ("audit: convert audit watches to use fsnotify instead of inotify")
Signed-off-by: NJan Kara <jack@suse.cz>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6ee8e25f

inotify: update documentation to reflect code changes · a5b2f95d

由 Zhang Zhen 提交于 2月 10, 2015

The inotify interface has changed a lot.  The user interface was too
old, and the kernel interface was removed by Eric Paris in commit:
2dfc1cae inotify: remove inotify in kernel interface.
Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Wang Kai <morgan.wang@huawei.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Robert Love <robert.w.love@intel.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Acked-by: NJan Kara <jack@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a5b2f95d

fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask · 66ba93c0

由 Lino Sanfilippo 提交于 2月 10, 2015

Currently FAN_ONDIR is always set on a mark's ignored mask when the
event mask is extended without FAN_MARK_ONDIR being set.  This may
result in events for directories being ignored unexpectedly for call
sequences like

  fanotify_mark(fd, FAN_MARK_ADD, FAN_OPEN | FAN_ONDIR , AT_FDCWD, "dir");
  fanotify_mark(fd, FAN_MARK_ADD, FAN_CLOSE, AT_FDCWD, "dir");

Also FAN_MARK_ONDIR is only honored when adding events to a mark's mask,
but not for event removal.  Fix both issues by not setting FAN_ONDIR
implicitly on the ignore mask any more.  Instead treat FAN_ONDIR as any
other event flag and require FAN_MARK_ONDIR to be set by the user for
both event mask and ignore mask.  Furthermore take FAN_MARK_ONDIR into
account when set for event removal.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NLino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

66ba93c0

fanotify: don't recalculate a marks mask if only the ignored mask changed · d2c1874c

由 Lino Sanfilippo 提交于 2月 10, 2015

If removing bits from a mark's ignored mask, the concerning
inodes/vfsmounts mask is not affected.  So don't recalculate it.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NLino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d2c1874c

fanotify: only destroy mark when both mask and ignored_mask are cleared · a118449a

由 Lino Sanfilippo 提交于 2月 10, 2015

In fanotify_mark_remove_from_mask() a mark is destroyed if only one of
both bitmasks (mask or ignored_mask) of a mark is cleared.  However the
other mask may still be set and contain information that should not be
lost.  So only destroy a mark if both masks are cleared.
Signed-off-by: NLino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a118449a

hugetlb, x86: register 1G page size if we can allocate them at runtime · ece84b39

由 Kirill A. Shutemov 提交于 2月 10, 2015

After commit 944d9fec ("hugetlb: add support for gigantic page
allocation at runtime") we can allocate 1G pages at runtime if CMA is
enabled.

Let's register 1G pages into hugetlb even if the user hasn't requested
them explicitly at boot time with hugepagesz=1G.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: NLuiz Capitulino <lcapitulino@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ece84b39

10 2月, 2015 7 次提交

Merge branch 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 3e8c04eb

由 Linus Torvalds 提交于 2月 09, 2015

Pull libata changes from Tejun Heo:
 "Mostly driver-specific changes.  Nothing too noteworthy.

  This pull request contains three merges from for-3.19-fixes.  The
  first two are to pull ahci_xgene and sata_dwc_460ex fix commits which
  are depended upon by later changes.  The last one is to pull in a fix
  patch which missed the v3.19-rc window"

* 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (24 commits)
  ahci_xgene: Fix the dma state machine lockup for the ATA_CMD_SMART PIO mode command.
  ata: libahci: Use of_platform_device_create only if supported
  sata_mv: Delete unnecessary checks before the function call "phy_power_off"
  ata: Delete unnecessary checks before the function call "pci_dev_put"
  ata: pata_platform: fix owner module reference mismatch for scsi host
  ata: ahci_platform: fix owner module reference mismatch for scsi host
  pata_pdc2027x: Use 64-bit timekeeping
  ata: libahci: Fix devres cleanup on failure
  ata: libahci: Allow using multiple regulators
  Documentation: bindings: Add the regulator property to the sub-nodes AHCI bindings
  ata: libahci: Clean-up the ahci_platform_en/disable_phys functions
  sata_rcar: extend PM methods
  sata_dwc_460ex: disable compilation on ARM and ARM64
  ata: libata-core: Remove unused function
  sata_dwc_460ex: convert to devm_kzalloc in ->probe()
  sata_dwc_460ex: remove extra message
  sata_dwc_460ex: use np local variable in ->probe()
  sata_dwc_460ex: fix most of the sparse warnings
  sata_dwc_460ex: enable COMPILE_TEST for the driver
  sata_dwc_460ex: remove redundant dev_set_drvdata
  ...

3e8c04eb

Merge branch 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 44dbf058

由 Linus Torvalds 提交于 2月 09, 2015

Pull workqueue changes from Tejun Heo:
 "One cosmetic cleanup patch"

* 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue.h: remove loops of single statement macros

44dbf058

Merge branch 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 15763db1

由 Linus Torvalds 提交于 2月 09, 2015

Pull cgroup changes from Tejun Heo:
 "Three mostly trivial patches.  The biggest change is that blkio is now
  initialized before memcg which will be needed to make memcg and blkcg
  cooperate on writeback IOs"

* 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: add dummy css_put() for !CONFIG_CGROUPS
  cgroup: reorder SUBSYS(blkio) in cgroup_subsys.h
  Update of Documentation/cgroups/00-INDEX

15763db1

Merge branch 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu · c2189e3a

由 Linus Torvalds 提交于 2月 09, 2015

Pull percpu changes from Tejun Heo:
 "Nothing too interesting.  One cleanup patch and another to add a
  trivial state check function"

* 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu_ref: implement percpu_ref_is_dying()
  percpu_ref: remove unnecessary ACCESS_ONCE() in percpu_ref_tryget_live()

c2189e3a

Merge tag 'edac_for_3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · ed824a62

由 Linus Torvalds 提交于 2月 09, 2015

Pull EDAC updates from Borislav Petkov:

 - a new synopsys_edac.c driver for the Synopsys DDR controller, from
   Punnaiah Choudary Kalluri.

 - minor fixes/cleanups all around

 - Mauro and I are adding the repo URLs to MAINTAINERS since people
   asked for trees to base upcoming work on.

* tag 'edac_for_3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC: Add repo URLs to MAINTAINERS
  EDAC, mv64x60_edac: Fix an error code in probe()
  EDAC: edac_mc_sysfs: Make stuff static
  EDAC: Fix the leak of mci->bus->name when bus_register fails
  edac: i5100_edac: Remove unused i5100_recmema_dm_buf_id
  EDAC, synps: Add EDAC support for zynq ddr ecc controller
  mpc85xx_edac: Fix a typo in comments
  EDAC, mce_amd_inj: Fix sparse non static symbol warning

ed824a62

Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e07e0d4c

由 Linus Torvalds 提交于 2月 09, 2015

Pull x86 RAS update from Ingo Molnar:
 "The changes in this cycle were:

   - allow mmcfg access to APEI error injection handlers

   - improve MCE error messages

   - smaller cleanups"

* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, mce: Fix sparse errors
  x86, mce: Improve timeout error messages
  ACPI, EINJ: Enhance error injection tolerance level

e07e0d4c

Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 57d36294

由 Linus Torvalds 提交于 2月 09, 2015

Pull x86 mm cleanups from Ingo Molnar:
 "Two cleanups: simplify parse_setup_data() and sanitize_e820_map()
  usage"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, e820: Clean up sanitize_e820_map() users
  x86, setup: Let early_memremap() handle page alignment

57d36294

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功