Commits · d31af0a325ca4ff8e57b8616ab2228913df369ad · openeuler / Kernel

01 Apr, 2015 5 commits

NVMe: increase depth of admin queue · d31af0a3

Authored by Jens Axboe on Mar 06, 2015

Usually the admin queue depth of 64 is plenty, but for some use cases we
really need it larger. Examples are use cases like MAT, where you have
to touch all of NAND for init/format like purposes. In those cases, we
see a good 2x increase with an increased queue depth.
Signed-off-by: Jens Axboe <axboe@fb.com>
Acked-by: Keith Busch <keith.busch@intel.com>

d31af0a3

nvme: Fix PRP list calculation for non-4k system page size · f137e0f1

Authored by Murali Iyer on Mar 26, 2015

PRP list calculation is supposed to be based on the device's page size.
Systems with a page size larger than the device's page size cause corruption
to the namespace as well as system memory without this fix.
Systems like x86 might not experience this issue because they use a
PAGE_SIZE of 4K, whereas powerpc uses a PAGE_SIZE of 64K; the NVMe device's
page size varies depending upon the vendor.
Signed-off-by: Murali Iyer <mniyer@us.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

f137e0f1
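
To illustrate why the page size used in the PRP calculation matters, here is a minimal userspace sketch. It is not the driver's code: prp_entries() and its parameters are hypothetical names, and the only assumption is that one PRP entry addresses one device page.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the nvme driver: count how many
 * PRP entries a transfer needs when each entry covers one device page.
 */
static size_t prp_entries(size_t offset, size_t length, size_t dev_page_size)
{
    /* Bytes spanned, measured from the start of the first device page. */
    size_t span = (offset % dev_page_size) + length;

    /* Round up to a whole number of device pages. */
    return (span + dev_page_size - 1) / dev_page_size;
}
```

With a 4K device page, a 64K buffer needs 16 entries, whereas counting in 64K host pages would yield a single entry. That mismatch is the kind of miscount the commit message describes.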

NVMe: Fix blk-mq hot cpu notification · 1efccc9d

Authored by Keith Busch on Mar 31, 2015

The driver may issue commands to a device that may never return, so its
request_queue could always have active requests while the controller is
running. Waiting for the queue to freeze could block forever, which is
what blk-mq's hot cpu notification handler was doing when nvme drives
were in use.

This has the nvme driver make the asynchronous event command's tag
reserved and does not keep the request active. We can't have more than
one since the request is released back to the request_queue before the
command is completed. Having only one avoids potential tag collisions,
and reserving the tag for this purpose prevents other admin tasks from
reusing the tag.

I also couldn't think of a scenario where issuing AEN requests single
depth is worse than issuing them in batches, so I don't think we lose
anything with this change.

As an added bonus, doing it this way removes "Cancelling I/O" warnings
observed when unbinding the nvme driver from a device.
Reported-by: Yigal Korman <yigal@plexistor.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

1efccc9d

NVMe: embedded iod mask cleanup · fda631ff

Authored by Chong Yuan on Mar 27, 2015

Remove unused mask in nvme_alloc_iod
Signed-off-by: Chong Yuan <chong.yuan@memblaze.com>
Reviewed-by: Wenbo Wang <wenbo.wang@memblaze.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

fda631ff

NVMe: Freeze admin queue on device failure · 6df3dbc8

Authored by Keith Busch on Mar 26, 2015

This fixes a race that accesses an invalid address when a controller's admin
queue is in use while a reset for failure or hot removal occurs. The
admin queue will be frozen to prevent new users from entering prior to
the doorbell queue being unmapped.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

6df3dbc8

25 Mar, 2015 2 commits

block, drbd: use mempool_create_slab_pool() · cbc4ffdb

Authored by David Rientjes on Mar 24, 2015

Mempools created for slab caches should use
mempool_create_slab_pool().

Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

cbc4ffdb

block, drbd: fix drbd_req_new() initialization · 23fe8f8b

Authored by David Rientjes on Mar 24, 2015

mempool_alloc() does not support __GFP_ZERO since elements may come from
memory that has already been released by mempool_free().

Remove __GFP_ZERO from mempool_alloc() in drbd_req_new() and properly
initialize it to 0.

Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

23fe8f8b
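
The reasoning in the commit above can be sketched with a toy userspace pool. Everything here is hypothetical (toy_req, toy_alloc(), toy_free(), toy_req_new()); it only mimics the property that a pool may hand back a previously released element with stale contents, so the caller has to clear it explicitly.

```c
#include <stdlib.h>
#include <string.h>

struct toy_req {
    int flags;
    int refs;
};

/* One-slot toy pool: a freed element is parked and reused as-is. */
static struct toy_req *parked;

static struct toy_req *toy_alloc(void)
{
    struct toy_req *r = parked;

    if (r) {
        parked = NULL;
        return r;               /* recycled element, contents are stale */
    }
    return malloc(sizeof(*r));  /* fresh element, contents are undefined */
}

static void toy_free(struct toy_req *r)
{
    if (!parked)
        parked = r;             /* keep it in reserve for the next caller */
    else
        free(r);
}

/* Allocation wrapper that zeroes the element itself. */
static struct toy_req *toy_req_new(void)
{
    struct toy_req *r = toy_alloc();

    if (!r)
        return NULL;
    memset(r, 0, sizeof(*r));   /* explicit initialization to 0 */
    return r;
}
```

Zeroing in the wrapper keeps the guarantee in one place regardless of whether the element came from the allocator or from the reserve.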

18 Mar, 2015 1 commit

brd: update maintainer to be Jens Axboe · ea7618ec

Authored by Ross Zwisler on Mar 17, 2015

Nick Piggin is currently listed as the maintainer of BRD in MAINTAINERS,
but the mails sent to the listed address are returned as undeliverable.
Update the maintainer for BRD to be Jens Axboe, since patches for BRD
flow up through him.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

ea7618ec

13 Mar, 2015 24 commits

blk-mq: don't wait in blk_mq_queue_enter() if __GFP_WAIT isn't set · bfd343aa

Authored by Keith Busch on Mar 11, 2015

Return -EBUSY if we're unable to enter a queue immediately when
allocating a blk-mq request without __GFP_WAIT.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

bfd343aa

blk-mq: export blk_mq_run_hw_queues · b94ec296

Authored by Mike Snitzer on Mar 11, 2015

Rename blk_mq_run_queues to blk_mq_run_hw_queues, add async argument,
and export it.

DM's suspend support must be able to run the queue without starting
stopped hw queues.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

b94ec296

blk-mq: add blk_mq_init_allocated_queue and export blk_mq_register_disk · b62c21b7

Authored by Mike Snitzer on Mar 12, 2015

Add a variant of blk_mq_init_queue that allows a previously allocated
queue to be initialized. blk_mq_init_allocated_queue models
blk_init_allocated_queue -- which was also created for DM's use.

DM's approach to device creation requires a placeholder request_queue be
allocated for use with alloc_dev() but the decision about what type of
request_queue will be ultimately created is deferred until all component
devices referenced in the DM table are processed to determine the table
type (request-based, blk-mq request-based, or bio-based).

Also, because of DM's late finalization of the request_queue type
the call to blk_mq_register_disk() doesn't happen during alloc_dev().
Must export blk_mq_register_disk() so that DM can backfill the 'mq' dir
once the blk-mq queue is fully allocated.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

b62c21b7

Merge branch 'for-linus' into for-4.1/core · 64f9b683
Authored by Jens Axboe on Mar 13, 2015

64f9b683

blk-mq: fix use of incorrect goto label in blk_mq_init_queue error path · 9a30b096

Authored by Mike Snitzer on Mar 12, 2015

If percpu_ref_init() fails the allocated q and hctxs must get cleaned
up; using 'err_map' doesn't allow that to happen.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Ming Lei <ming.lei@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

9a30b096
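
The goto-label issue can be illustrated with a generic unwind ladder. This is a hypothetical userspace sketch, not blk_mq_init_queue(): the struct, the label names, the allocation order and the fail_at test hook are all assumptions made for illustration.

```c
#include <stdlib.h>

struct toy_queue {
    int *map;
    int *hctxs;
};

/* fail_at is a hypothetical hook: 0 = no failure, 1..3 = failing step. */
static struct toy_queue *toy_init_queue(int fail_at)
{
    struct toy_queue *q = calloc(1, sizeof(*q));

    if (!q)
        return NULL;

    q->map = (fail_at == 1) ? NULL : malloc(sizeof(int));
    if (!q->map)
        goto err_queue;

    q->hctxs = (fail_at == 2) ? NULL : malloc(sizeof(int));
    if (!q->hctxs)
        goto err_map;

    /* A late failure must jump to the label that unwinds everything. */
    if (fail_at == 3)
        goto err_hctxs;

    return q;

err_hctxs:
    free(q->hctxs);
err_map:
    free(q->map);
err_queue:
    free(q);
    return NULL;
}
```

Jumping to an earlier label from a later failure point skips the cleanup steps above that label, which is the class of mistake the commit corrects.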

Merge branch 'akpm' (patches from Andrew) · c202baf0

Authored by Linus Torvalds on Mar 12, 2015

Merge misc fixes from Andrew Morton:
 "13 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  memcg: disable hierarchy support if bound to the legacy cgroup hierarchy
  mm: reorder can_do_mlock to fix audit denial
  kasan, module: move MODULE_ALIGN macro into <linux/moduleloader.h>
  kasan, module, vmalloc: rework shadow allocation for modules
  fanotify: fix event filtering with FAN_ONDIR set
  mm/nommu.c: export symbol max_mapnr
  arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU
  nilfs2: fix deadlock of segment constructor during recovery
  mm: cma: fix CMA aligned offset calculation
  mm, hugetlb: close race when setting PageTail for gigantic pages
  mm, oom: do not fail __GFP_NOFAIL allocation if oom killer is disabled
  drivers/rtc/rtc-s3c.c: add .needs_src_clk to s3c6410 RTC data
  ocfs2: make append_dio an incompat feature

c202baf0

memcg: disable hierarchy support if bound to the legacy cgroup hierarchy · 7feee590

Authored by Vladimir Davydov on Mar 12, 2015

If the memory cgroup controller is initially mounted in the scope of the
default cgroup hierarchy and then remounted to a legacy hierarchy, it will
still have hierarchy support enabled, which is incorrect.  We should
disable hierarchy support if bound to the legacy cgroup hierarchy.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7feee590

mm: reorder can_do_mlock to fix audit denial · a5a6579d

Authored by Jeff Vander Stoep on Mar 12, 2015

A userspace call to mmap(MAP_LOCKED) may result in the successful locking
of memory while also producing a confusing audit log denial.  can_do_mlock
checks capable and rlimit.  If either of these returns positive,
can_do_mlock returns true.  The capable check leads to an LSM hook used by
AppArmor and SELinux, which produce the audit denial.  Reordering so that
rlimit is checked first eliminates the denial on success, only recording a
denial when the lock is unsuccessful as a result of the denial.
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Paul Cassella <cassella@cray.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a5a6579d
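
The effect of the reordering can be sketched in userspace. The names toy_capable(), toy_can_do_mlock() and audit_denials are hypothetical stand-ins; the sketch only assumes that a failed capability check records a denial as a side effect.

```c
#include <stdbool.h>

/* Counts simulated audit records (hypothetical). */
static int audit_denials;

/* Stand-in for a capability check whose failure is audited. */
static bool toy_capable(bool has_cap)
{
    if (!has_cap)
        audit_denials++;
    return has_cap;
}

/*
 * Check the rlimit first, so the audited capability check runs only
 * when the rlimit alone does not permit locking.
 */
static bool toy_can_do_mlock(unsigned long rlimit_memlock, bool has_cap)
{
    if (rlimit_memlock != 0)
        return true;
    return toy_capable(has_cap);
}
```

With this order, a caller that is allowed by its rlimit never triggers the audited path, so a denial is recorded only when the lock is actually refused.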

kasan, module: move MODULE_ALIGN macro into <linux/moduleloader.h> · d3733e5c

Authored by Andrey Ryabinin on Mar 12, 2015

include/linux/moduleloader.h is a more suitable place for this macro.
Also change the alignment to PAGE_SIZE for CONFIG_KASAN=n, as such
alignment is already assumed in several places.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d3733e5c

kasan, module, vmalloc: rework shadow allocation for modules · a5af5aa8

Authored by Andrey Ryabinin on Mar 12, 2015

The current approach to handling shadow memory for modules is broken.

Shadow memory can be freed only after the memory that the shadow corresponds
to is no longer used.  vfree() called from interrupt context could use the
memory it is freeing to store a 'struct llist_node' in it:

    void vfree(const void *addr)
    {
    ...
        if (unlikely(in_interrupt())) {
            struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
            if (llist_add((struct llist_node *)addr, &p->list))
                    schedule_work(&p->wq);

Later this list node is used in free_work(), which actually frees the memory.
Currently, module_memfree() called in interrupt context will free the shadow
before freeing the module's memory, which could provoke a kernel crash.

So shadow memory should be freed after module's memory.  However, such
deallocation order could race with kasan_module_alloc() in module_alloc().

Free the shadow right before releasing the vm area.  At this point the
vfree()'d memory is no longer used and is not yet available for other
allocations.  A new VM_KASAN flag is used to indicate that a vm area has
dynamically allocated shadow memory, so kasan frees the shadow only if it
was previously allocated.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a5af5aa8

fanotify: fix event filtering with FAN_ONDIR set · b3c1030d

Authored by Suzuki K. Poulose on Mar 12, 2015

With FAN_ONDIR set, the user can end up getting events, which it hasn't
marked.  This was revealed with fanotify04 testcase failure on
Linux-4.0-rc1, and is a regression from 3.19, revealed with 66ba93c0
("fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask").

   # /opt/ltp/testcases/bin/fanotify04
   [ ... ]
  fanotify04    7  TPASS  :  event generated properly for type 100000
  fanotify04    8  TFAIL  :  fanotify04.c:147: got unexpected event 30
  fanotify04    9  TPASS  :  No event as expected

The testcase adds the following marks: FAN_OPEN | FAN_ONDIR for
a fanotify on a dir.  Then does an open(), followed by close() of the
directory and expects to see an event FAN_OPEN(0x20).  However, the
fanotify returns (FAN_OPEN|FAN_CLOSE_NOWRITE(0x10)).  This happens due to
the flaw in the check for event_mask in fanotify_should_send_event() which
does:

	if (event_mask & marks_mask & ~marks_ignored_mask)
		return true;

where, event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE),
       marks_mask == (FAN_ONDIR | FAN_OPEN),
       marks_ignored_mask == 0

Fix this by masking the outgoing events to the user, as we already take
care of FAN_ONDIR and FAN_EVENT_ON_CHILD.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b3c1030d
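
The mask arithmetic in the commit above can be checked with a small userspace sketch. The three FAN_* values match the fanotify UAPI header; TOY_OUTGOING and both helper functions are hypothetical and only approximate the idea of masking the events that reach the user, not the exact kernel change.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAN_CLOSE_NOWRITE 0x00000010u
#define FAN_OPEN          0x00000020u
#define FAN_ONDIR         0x40000000u

/* Hypothetical set of event bits that may be reported to the user. */
#define TOY_OUTGOING (FAN_OPEN | FAN_CLOSE_NOWRITE)

/* The check quoted in the commit message: FAN_ONDIR alone can match. */
static bool should_send_unmasked(uint32_t ev, uint32_t marks, uint32_t ignored)
{
    return (ev & marks & ~ignored) != 0;
}

/* Same check, restricted to real outgoing event bits. */
static bool should_send_masked(uint32_t ev, uint32_t marks, uint32_t ignored)
{
    return (ev & TOY_OUTGOING & marks & ~ignored) != 0;
}
```

For event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE) and marks_mask == (FAN_ONDIR | FAN_OPEN), the unmasked check is true only because both masks share FAN_ONDIR, while the masked check is false because no actual event bit is common to both.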

mm/nommu.c: export symbol max_mapnr · 5b8bf307

Authored by gchen gchen on Mar 12, 2015

Several modules may need max_mapnr, so export it.  The related error with
allmodconfig under c6x:

  MODPOST 3327 modules
  ERROR: "max_mapnr" [fs/pstore/ramoops.ko] undefined!
  ERROR: "max_mapnr" [drivers/media/v4l2-core/videobuf2-dma-contig.ko] undefined!
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5b8bf307

arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU · 65b9ab88

Authored by Chen Gang on Mar 12, 2015

When !MMU, asm-generic will not define default pgprot_writecombine, so c6x
needs to define it by itself.  The related error:

    CC [M]  fs/pstore/ram_core.o
  fs/pstore/ram_core.c: In function 'persistent_ram_vmap':
  fs/pstore/ram_core.c:399:10: error: implicit declaration of function 'pgprot_writecombine' [-Werror=implicit-function-declaration]
     prot = pgprot_writecombine(PAGE_KERNEL);
            ^
  fs/pstore/ram_core.c:399:8: error: incompatible types when assigning to type 'pgprot_t {aka struct <anonymous>}' from type 'int'
     prot = pgprot_writecombine(PAGE_KERNEL);
          ^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

65b9ab88

nilfs2: fix deadlock of segment constructor during recovery · 283ee148

Authored by Ryusuke Konishi on Mar 12, 2015

According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
during recovery at mount time.  The code path that caused the deadlock was
as follows:

  nilfs_fill_super()
    load_nilfs()
      nilfs_salvage_orphan_logs()
        * Do roll-forwarding, attach segment constructor for recovery,
          and kick it.

        nilfs_segctor_thread()
          nilfs_segctor_thread_construct()
           * A lock is held with nilfs_transaction_lock()
             nilfs_segctor_do_construct()
               nilfs_segctor_drop_written_files()
                 iput()
                   iput_final()
                     write_inode_now()
                       writeback_single_inode()
                         __writeback_single_inode()
                           do_writepages()
                             nilfs_writepage()
                               nilfs_construct_dsync_segment()
                                 nilfs_transaction_lock() --> deadlock

This can happen if commit 7ef3ff2f ("nilfs2: fix deadlock of segment
constructor over I_SYNC flag") is applied and roll-forward recovery was
performed at mount time.  The roll-forward recovery can happen if datasync
write is done and the file system crashes immediately after that.  For
instance, we can reproduce the issue with the following steps:

 < nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
 # dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
 # dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k count=1 && reboot -nfh
 < the system will immediately reboot >
 # mount -t nilfs2 /dev/sdb1 /nilfs

The deadlock occurs because iput() can run segment constructor through
writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags.  The
above commit changed segment constructor so that it calls iput()
asynchronously for inodes with i_nlink == 0, but that change was
imperfect.

This fixes the other deadlock by deferring iput() in the segment constructor
even for the case that mount is not finished, that is, for the case that
MS_ACTIVE flag is not set.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reported-by: Yuxuan Shui <yshuiv7@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

283ee148

mm: cma: fix CMA aligned offset calculation · 850fc430

Authored by Danesh Petigara on Mar 12, 2015

The CMA aligned offset calculation is incorrect for non-zero order_per_bit
values.

For example, if cma->order_per_bit=1, cma->base_pfn= 0x2f800000 and
align_order=12, the function returns a value of 0x17c00 instead of 0x400.

This patch fixes the CMA aligned offset calculation.

The previous calculation was wrong and would return too-large values for
the offset, so that when cma_alloc looks for free pages in the bitmap with
the requested alignment > order_per_bit, it starts too far into the bitmap
and so CMA allocations will fail despite there actually being plenty of
free pages remaining.  It will also probably have the wrong alignment.
With this change, we will get the correct offset into the bitmap.

One affected user is powerpc KVM, which has kvm_cma->order_per_bit set to
KVM_CMA_CHUNK_ORDER - PAGE_SHIFT, or 18 - 12 = 6.

[gregory.0xf0@gmail.com: changelog additions]
Signed-off-by: Danesh Petigara <dpetigara@broadcom.com>
Reviewed-by: Gregory Fong <gregory.0xf0@gmail.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

850fc430
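
The offset arithmetic from the commit above can be reproduced in userspace. The two formulas below are written from memory of mm/cma.c before and after the fix and should be treated as an assumption; struct toy_cma, TOY_ALIGN and the function names are hypothetical. The quoted results (0x17c00 and 0x400) come out when base_pfn is 0x2f800, i.e. when 0x2f800000 is read as the physical base address with 4K pages, which is also an assumption.

```c
/* Round x up to a multiple of a (a must be a power of two). */
#define TOY_ALIGN(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

struct toy_cma {
    unsigned long base_pfn;
    unsigned int order_per_bit;
};

/* Calculation as recalled from before the fix. */
static unsigned long offset_old(const struct toy_cma *cma,
                                unsigned int align_order)
{
    unsigned long alignment;

    if (align_order <= cma->order_per_bit)
        return 0;
    alignment = 1UL << (align_order - cma->order_per_bit);
    return TOY_ALIGN(cma->base_pfn, alignment) -
           (cma->base_pfn >> cma->order_per_bit);
}

/* Calculation as recalled from after the fix. */
static unsigned long offset_new(const struct toy_cma *cma,
                                unsigned int align_order)
{
    if (align_order <= cma->order_per_bit)
        return 0;
    return (TOY_ALIGN(cma->base_pfn, 1UL << align_order) -
            cma->base_pfn) >> cma->order_per_bit;
}
```

The newer form first measures the distance in pages from base_pfn to the next aligned pfn and only then converts it to bitmap bits, so the result stays a small offset into the bitmap.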

mm, hugetlb: close race when setting PageTail for gigantic pages · 44fc8057

Authored by David Rientjes on Mar 12, 2015

Now that gigantic pages are dynamically allocatable, care must be taken to
ensure that p->first_page is valid before setting PageTail.

If this isn't done, then it is possible to race and have compound_head()
return NULL.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

44fc8057

mm, oom: do not fail __GFP_NOFAIL allocation if oom killer is disabled · e009d5dc

Authored by Michal Hocko on Mar 12, 2015

Tetsuo Handa has pointed out that __GFP_NOFAIL allocations might fail
after OOM killer is disabled if the allocation is performed by a kernel
thread.  This behavior was introduced from the very beginning by
7f33d49a ("mm, PM/Freezer: Disable OOM killer when tasks are frozen").
This means that the basic contract for the allocation request is broken
and the context requesting such an allocation might blow up unexpectedly.

There are basically two ways forward.

1) move oom_killer_disable after kernel threads are frozen.  This has a
   risk that the OOM victim wouldn't be able to finish because it would
   depend on an already frozen kernel thread.  This would be really tricky
   to debug.

2) do not fail a __GFP_NOFAIL allocation no matter what, and accept the
   risk that freezable kernel threads will loop and fail the suspend.
   Incidental allocations after kernel threads are frozen will at least
   dump a warning - if we are lucky and the serial console is still active
   of course...

This patch implements the latter option because it is safer.  We would see
a warning rather than allocation failures for the kernel threads, which would
blow up otherwise, and have a higher chance of identifying __GFP_NOFAIL users
from deeper pm code.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e009d5dc

drivers/rtc/rtc-s3c.c: add .needs_src_clk to s3c6410 RTC data · 8792f777

Authored by Javier Martinez Canillas on Mar 12, 2015

Commit df9e26d0 ("rtc: s3c: add support for RTC of Exynos3250 SoC")
added an "rtc_src" DT property to specify the clock used as a source to
the S3C real-time clock.

Not all SoCs need this, so commit eaf3a659 ("drivers/rtc/rtc-s3c.c:
fix initialization failure without rtc source clock") changed the driver to
check the struct s3c_rtc_data .needs_src_clk to conditionally grab the clock.

But that commit didn't update the data for each IP version, so the RTC
broke on the boards that need a source clock. This is the case for at
least Exynos5250 and Exynos5440, which use the s3c6410 RTC IP block.

This commit fixes the S3C RTC on the Exynos5250 Snow and Exynos5420
Peach Pit and Pi Chromebooks.
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Chanwoo Choi <cw00.choi@samsung.com>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Tyler Baker <tyler.baker@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8792f777

ocfs2: make append_dio an incompat feature · 18d585f0

Authored by Mark Fasheh on Mar 12, 2015

It turns out that making this feature ro_compat isn't quite enough to
prevent accidental corruption on mount from older kernels.  Ocfs2 (like
other file systems) will process orphaned inodes even when the user mounts
in 'ro' mode.  So for the case of a filesystem not knowing the append_dio
feature, mounting the filesystem could result in orphaned-for-dio files
being deleted, which we clearly don't want.

So instead, turn this into an incompat flag.

Btw, this is kind of my fault - initially I asked that we add a flag to
cover the feature and even suggested that we use an ro flag.  It wasn't
until I was looking through our commits for v4.0-rc1 that I realized we
actually want this to be incompat.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

18d585f0

mm: thp: Return the correct value for change_huge_pmd · ba68bc01

Authored by Mel Gorman on Mar 07, 2015

The wrong value is being returned by change_huge_pmd since commit
10c1045f ("mm: numa: avoid unnecessary TLB flushes when setting
NUMA hinting entries") which allows a fallthrough that tries to adjust
non-existent PTEs. This patch corrects it.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ba68bc01

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 09d35919

Authored by Linus Torvalds on Mar 12, 2015

Pull i2c fix from Wolfram Sang:
 "An important bugfix for the I2C subsystem core"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  Revert "i2c: core: Dispose OF IRQ mapping at client removal time"

09d35919

Merge tag 'pci-v4.0-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 91e9134e

Authored by Linus Torvalds on Mar 12, 2015

Pull PCI fixes from Bjorn Helgaas:
 "Here are a couple updates for v4.0.

  One fixes a config accessor problem on APM X-Gene that we introduced
  when switching to generic config accessors, and the other fixes an
  older read-past-end-of-buffer problem in sysfs.

  APM X-Gene host bridge driver
    - Add register offset to config space base address (Feng Kan)

  Miscellaneous
    - Don't read past the end of sysfs "driver_override" buffer (Sasha Levin)"

* tag 'pci-v4.0-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: xgene: Add register offset to config space base address
  PCI: Don't read past the end of sysfs "driver_override" buffer

91e9134e

Merge tag 'microblaze-4.0-rc4' of git://git.monstr.eu/linux-2.6-microblaze · d3dd73fc

Authored by Linus Torvalds on Mar 12, 2015

Pull arch/microblaze fixes from Michal Simek:
 "Fix syscall error recovery.

  Two patches - one is just preparation patch for the second which is
  fixing the problem with syscalls"

* tag 'microblaze-4.0-rc4' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Fix syscall error recovery for invalid syscall IDs
  microblaze: Coding style cleanup

d3dd73fc

Merge tag 'nios2-fix-4.0-rc4' of git://git.rocketboards.org/linux-socfpga-next · 56275112

Authored by Linus Torvalds on Mar 12, 2015

Pull arch/nios2 fix from Ley Foon Tan:
 "Remove pt_regs from user header and use generic ucontext.h"

* tag 'nios2-fix-4.0-rc4' of git://git.rocketboards.org/linux-socfpga-next:
  nios2: update pt_regs

56275112

12 Mar, 2015 3 commits

mm: fix up numa read-only thread grouping logic · 53da3bc2

Authored by Linus Torvalds on Mar 12, 2015

Dave Chinner reported that commit 4d942466 ("mm: convert
p[te|md]_mknonnuma and remaining page table manipulations") slowed down
his xfsrepair test enormously.  In particular, it was using more system
time due to extra TLB flushing.

The ultimate reason turns out to be how the change to use the regular
page table accessor functions broke the NUMA grouping logic.  The old
special mknuma/mknonnuma code accessed the page table present bit and
the magic NUMA bit directly, while the new code just changes the page
protections using PROT_NONE and the regular vma protections.

That sounds equivalent, and from a fault standpoint it really is, but a
subtle side effect is that the *other* protection bits of the page table
entries also change.  And the code to decide how to group the NUMA
entries together used the writable bit to decide whether a particular
page was likely to be shared read-only or not.

And with the change to make the NUMA handling use the regular permission
setting functions, that writable bit was basically always cleared for
private mappings due to COW.  So even if the page actually ends up being
written to in the end, the NUMA balancing would act as if it was always
shared RO.

This code is a heuristic anyway, so the fix - at least for now - is to
instead check whether the page is dirty rather than writable.  The bit
doesn't change with protection changes.

NOTE! This also adds a FIXME comment to revisit this issue.

Not only should we probably re-visit the whole "is this a shared
read-only page" heuristic (we might want to take the vma permissions
into account and base this more on those than the per-page ones, and
also look at whether the particular access that triggers it is a write
or not), but the whole COW issue shows that we should think about the
NUMA fault handling some more.

For example, maybe we should do the early-COW thing that a regular fault
does.  Or maybe we should accept that while using the same bits as
PROTNONE was a good thing (and got rid of the special NUMA bit), we
might still want to just preserve the other protection bits across NUMA
faulting.

Those are bigger questions, left for later.  This just fixes up the
heuristic so that it at least approximates working again.  More analysis
and work needed.
Reported-by: Dave Chinner <david@fromorbit.com>
Tested-by: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

53da3bc2

Revert "i2c: core: Dispose OF IRQ mapping at client removal time" · a4944572

Authored by Jakub Kicinski on Mar 11, 2015

This reverts commit e4df3a0b
("i2c: core: Dispose OF IRQ mapping at client removal time")

Calling irq_dispose_mapping() will destroy the mapping and disassociate
the IRQ from the IRQ chip to which it belongs. Keeping it is OK, because
existent mappings are reused properly.

Also, this commit breaks drivers using devm* for IRQ management on
OF-based systems because devm* cleanup happens in device code, after
bus's remove() method returns.
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Reported-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[wsa: updated the commit message with findings from the other bug report]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Fixes: e4df3a0b

a4944572

nios2: update pt_regs · 92d5dd8c

Authored by Chung-Ling Tang on Mar 12, 2015

Remove struct pt_regs from user header and use generic ucontext.h.
Signed-off-by: Chung-Ling Tang <cltang@codesourcery.com>
Acked-by: Ley Foon Tan <lftan@altera.com>

92d5dd8c

11 Mar, 2015 2 commits

Merge tag 'for-linus-20150310' of git://git.infradead.org/linux-mtd · cca28a5f

Authored by Linus Torvalds on Mar 10, 2015

Pull MTD fixes from Brian Norris:

 * pxa3xx_nand
   - fix timeout issues when draining the FIFO (BCH only)
   - don't crash when no chip-selects are used

 * hisi504_nand
   - depend on HAS_DMA, to fix compile errors

* tag 'for-linus-20150310' of git://git.infradead.org/linux-mtd:
  mtd: nand: MTD_NAND_HISI504 should depend on HAS_DMA
  mtd: pxa3xx_nand: fix driver when num_cs is 0
  mtd: nand: pxa3xx: Fix PIO FIFO draining

cca28a5f

Merge tag 'iommu-fixes-v4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 9c3e1323

Authored by Linus Torvalds on Mar 10, 2015

Pull iommu fixes from Joerg Roedel:
 "The patches contain:

   - fix multiple ARM IOMMU drivers to behave well when the hardware is
     not present

   - mark MSM driver as broken

   - fix build errors with the new ARM generic io-page-table code"

* tag 'iommu-fixes-v4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/io-pgtable-arm: Add built time dependency
  iommu/msm: Mark driver BROKEN
  iommu/rockchip: Play nice in multi-platform builds
  iommu/omap: Play nice in multi-platform builds
  iommu/exynos: Play nice in multi-platform builds
  iommu/io-pgtable-arm: Fix self-test WARNs on i386

9c3e1323

10 Mar, 2015 3 commits

Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · affb8172

Authored by Linus Torvalds on Mar 09, 2015

Pull kvm/s390 bugfixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: non-LPAR case obsolete during facilities mask init
  KVM: s390: include guest facilities in kvm facility test
  KVM: s390: fix in memory copy of facility lists
  KVM: s390/cpacf: Fix kernel bug under z/VM
  KVM: s390/cpacf: Enable key wrapping by default

affb8172

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · ec0e6bd3

Authored by Linus Torvalds on Mar 09, 2015

Pull s390 fixes from Martin Schwidefsky:
 "One performance optimization for page_clear and a couple of bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: fix incorrect ASCE after crst_table_downgrade
  s390/ftrace: fix crashes when switching tracers / add notrace to cpu_relax()
  s390/pci: unify pci_iomap symbol exports
  s390/pci: fix [un]map_resources sequence
  s390: let the compiler do page clearing
  s390/pci: fix possible information leak in mmio syscall
  s390/dcss: array index 'i' is used before limits check.
  s390/scm_block: fix off by one during cluster reservation
  s390/jump label: improve and fix sanity check
  s390/jump label: add missing jump_label_apply_nops() call

ec0e6bd3

Merge tag 'trace-fixes-v4.0-rc2-2' of... · e7901af1

Authored by Linus Torvalds on Mar 09, 2015

Merge tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull seq-buf/ftrace fixes from Steven Rostedt:
 "This includes fixes for seq_buf_bprintf() truncation issue.  It also
  contains fixes to ftrace when /proc/sys/kernel/ftrace_enabled and
  function tracing are started.  Doing the following causes some issues:

    # echo 0 > /proc/sys/kernel/ftrace_enabled
    # echo function_graph > /sys/kernel/debug/tracing/current_tracer
    # echo 1 > /proc/sys/kernel/ftrace_enabled
    # echo nop > /sys/kernel/debug/tracing/current_tracer
    # echo function_graph > /sys/kernel/debug/tracing/current_tracer

  As well as with function tracing too.  Pratyush Anand first reported
  this issue to me and supplied a patch.  When I tested this on my x86
  test box, it caused thousands of backtraces and warnings to appear in
  dmesg, which also caused a denial of service (a warning for every
  function that was listed).  I applied Pratyush's patch but it did not
  fix the issue for me.  I looked into it and found a slight problem
  with trampoline accounting.  I fixed it and sent Pratyush a patch, but
  he said that it did not fix the issue for him.

  I later learned that Pratyush was using an ARM64 server, and when I
  tested on my ARM board, I was able to reproduce the same issue as
  Pratyush.  After applying his patch, it fixed the problem.  The above
  test uncovered two different bugs, one in x86 and one in ARM and
  ARM64.  As this looked like it would affect PowerPC, I tested it on my
  PPC64 box.  It too broke, but neither the patch that fixed ARM or x86
  fixed this box (the changes were all in generic code!).  The above
  test uncovered two more bugs that affected PowerPC.  Again, the
  changes were only done to generic code.  It's the way the arch code
  expected things to be done that was different between the archs.  Some
  were more sensitive than others.

  The rest of this series fixes the PPC bugs as well"

* tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled
  ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl
  ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl
  seq_buf: Fix seq_buf_bprintf() truncation
  seq_buf: Fix seq_buf_vprintf() truncation

e7901af1

openeuler / Kernel: synced successfully more than 1 year ago