1. 27 3月, 2019 5 次提交
  2. 23 3月, 2019 2 次提交
  3. 21 3月, 2019 3 次提交
  4. 20 3月, 2019 1 次提交
    • I
      libceph: wait for latest osdmap in ceph_monc_blacklist_add() · bb229bbb
      Ilya Dryomov 提交于
      Because map updates are distributed lazily, an OSD may not know about
      the new blacklist for quite some time after "osd blacklist add" command
      is completed.  This makes it possible for a blacklisted but still alive
      client to overwrite a post-blacklist update, resulting in data
      corruption.
      
      Waiting for latest osdmap in ceph_monc_blacklist_add() and thus using
      the post-blacklist epoch for all post-blacklist requests ensures that
      all such requests "wait" for the blacklist to come into force on their
      respective OSDs.
      
      Cc: stable@vger.kernel.org
      Fixes: 6305a3b4 ("libceph: support for blacklisting clients")
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: NJason Dillaman <dillaman@redhat.com>
      bb229bbb
  5. 19 3月, 2019 3 次提交
    • D
      blk-mq: remove unused 'nr_expired' from blk_mq_hw_ctx · 9496c015
      Dongli Zhang 提交于
      There is no usage of 'nr_expired'.
      
      The 'nr_expired' was introduced by commit 1d9bd516 ("blk-mq: replace
      timeout synchronization with a RCU and generation based scheme"). Its usage
      was removed since commit 12f5b931 ("blk-mq: Remove generation
      seqeunce").
      Signed-off-by: NDongli Zhang <dongli.zhang@oracle.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      9496c015
    • J
      block: add BIO_NO_PAGE_REF flag · 399254aa
      Jens Axboe 提交于
      If bio_iov_iter_get_pages() is called on an iov_iter that is flagged
      with NO_REF, then we don't need to add a page reference for the pages
      that we add.
      
      Add BIO_NO_PAGE_REF to track this in the bio, so IO completion knows
      not to drop a reference to these pages.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      399254aa
    • J
      iov_iter: add ITER_BVEC_FLAG_NO_REF flag · 875f1d07
      Jens Axboe 提交于
      For ITER_BVEC, if we're holding on to kernel pages, the caller
      doesn't need to grab a reference to the bvec pages, and drop that
      same reference on IO completion. This is essentially safe for any
      ITER_BVEC, but some use cases end up reusing pages and uncondtionally
      dropping a page reference on completion. And example of that is
      sendfile(2), that ends up being a splice_in + splice_out on the
      pipe pages.
      
      Add a flag that tells us it's fine to not grab a page reference
      to the bvec pages, since that caller knows not to drop a reference
      when it's done with the pages.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      875f1d07
  6. 18 3月, 2019 1 次提交
  7. 17 3月, 2019 2 次提交
  8. 16 3月, 2019 1 次提交
    • J
      filemap: kill page_cache_read usage in filemap_fault · a75d4c33
      Josef Bacik 提交于
      Patch series "drop the mmap_sem when doing IO in the fault path", v6.
      
      Now that we have proper isolation in place with cgroups2 we have started
      going through and fixing the various priority inversions.  Most are all
      gone now, but this one is sort of weird since it's not necessarily a
      priority inversion that happens within the kernel, but rather because of
      something userspace does.
      
      We have giant applications that we want to protect, and parts of these
      giant applications do things like watch the system state to determine how
      healthy the box is for load balancing and such.  This involves running
      'ps' or other such utilities.  These utilities will often walk
      /proc/<pid>/whatever, and these files can sometimes need to
      down_read(&task->mmap_sem).  Not usually a big deal, but we noticed when
      we are stress testing that sometimes our protected application has latency
      spikes trying to get the mmap_sem for tasks that are in lower priority
      cgroups.
      
      This is because any down_write() on a semaphore essentially turns it into
      a mutex, so even if we currently have it held for reading, any new readers
      will not be allowed on to keep from starving the writer.  This is fine,
      except a lower priority task could be stuck doing IO because it has been
      throttled to the point that its IO is taking much longer than normal.  But
      because a higher priority group depends on this completing it is now stuck
      behind lower priority work.
      
      In order to avoid this particular priority inversion we want to use the
      existing retry mechanism to stop from holding the mmap_sem at all if we
      are going to do IO.  This already exists in the read case sort of, but
      needed to be extended for more than just grabbing the page lock.  With
      io.latency we throttle at submit_bio() time, so the readahead stuff can
      block and even page_cache_read can block, so all these paths need to have
      the mmap_sem dropped.
      
      The other big thing is ->page_mkwrite.  btrfs is particularly shitty here
      because we have to reserve space for the dirty page, which can be a very
      expensive operation.  We use the same retry method as the read path, and
      simply cache the page and verify the page is still setup properly the next
      pass through ->page_mkwrite().
      
      I've tested these patches with xfstests and there are no regressions.
      
      This patch (of 3):
      
      If we do not have a page at filemap_fault time we'll do this weird forced
      page_cache_read thing to populate the page, and then drop it again and
      loop around and find it.  This makes for 2 ways we can read a page in
      filemap_fault, and it's not really needed.  Instead add a FGP_FOR_MMAP
      flag so that pagecache_get_page() will return a unlocked page that's in
      pagecache.  Then use the normal page locking and readpage logic already in
      filemap_fault.  This simplifies the no page in page cache case
      significantly.
      
      [akpm@linux-foundation.org: fix comment text]
      [josef@toxicpanda.com: don't unlock null page in FGP_FOR_MMAP case]
        Link: http://lkml.kernel.org/r/20190312201742.22935-1-josef@toxicpanda.com
      Link: http://lkml.kernel.org/r/20181211173801.29535-2-josef@toxicpanda.comSigned-off-by: NJosef Bacik <josef@toxicpanda.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a75d4c33
  9. 15 3月, 2019 1 次提交
  10. 14 3月, 2019 1 次提交
  11. 13 3月, 2019 20 次提交
    • D
      tracing: kdb: Fix ftdump to not sleep · 31b265b3
      Douglas Anderson 提交于
      As reported back in 2016-11 [1], the "ftdump" kdb command triggers a
      BUG for "sleeping function called from invalid context".
      
      kdb's "ftdump" command wants to call ring_buffer_read_prepare() in
      atomic context.  A very simple solution for this is to add allocation
      flags to ring_buffer_read_prepare() so kdb can call it without
      triggering the allocation error.  This patch does that.
      
      Note that in the original email thread about this, it was suggested
      that perhaps the solution for kdb was to either preallocate the buffer
      ahead of time or create our own iterator.  I'm hoping that this
      alternative of adding allocation flags to ring_buffer_read_prepare()
      can be considered since it means I don't need to duplicate more of the
      core trace code into "trace_kdb.c" (for either creating my own
      iterator or re-preparing a ring allocator whose memory was already
      allocated).
      
      NOTE: another option for kdb is to actually figure out how to make it
      reuse the existing ftrace_dump() function and totally eliminate the
      duplication.  This sounds very appealing and actually works (the "sr
      z" command can be seen to properly dump the ftrace buffer).  The
      downside here is that ftrace_dump() fully consumes the trace buffer.
      Unless that is changed I'd rather not use it because it means "ftdump
      | grep xyz" won't be very useful to search the ftrace buffer since it
      will throw away the whole trace on the first grep.  A future patch to
      dump only the last few lines of the buffer will also be hard to
      implement.
      
      [1] https://lkml.kernel.org/r/20161117191605.GA21459@google.com
      
      Link: http://lkml.kernel.org/r/20190308193205.213659-1-dianders@chromium.orgReported-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NDouglas Anderson <dianders@chromium.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      31b265b3
    • C
      f2fs: print more parameters in trace_f2fs_map_blocks · 76630f20
      Chao Yu 提交于
      for better map_blocks trace.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      76630f20
    • C
      f2fs: trace f2fs_ioc_shutdown · 559e87c4
      Chao Yu 提交于
      This patch supports to trace f2fs_ioc_shutdown.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      559e87c4
    • Z
      f2fs: correct spelling mistake · 68b79cdc
      Zeng Guangyue 提交于
      correct spelling mistake for "nunmber"
      Signed-off-by: NZeng Guangyue <zengguangyue@hisilicon.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      68b79cdc
    • A
      dt-bindings: clock: imx8mq: Fix numbering overlaps and gaps · 010d5166
      Abel Vesa 提交于
      IMX8MQ_CLK_USB_PHY_REF changes from 163 to 153, this way removing the gap.
      All the following clock ids are now decreased by 10 to keep the numbering
      right. Doing this, the IMX8MQ_CLK_CSI2_CORE is not overlapped with
      IMX8MQ_CLK_GPT1 anymore. IMX8MQ_CLK_GPT1_ROOT changes from 193 to 183 and
      all the following ids are updated accordingly.
      Reported-by: NPatrick Wildt <patrick@blueri.se>
      Fixes: 1cf3817b ("dt-bindings: Add binding for i.MX8MQ CCM")
      Signed-off-by: NAbel Vesa <abel.vesa@nxp.com>
      Signed-off-by: NStephen Boyd <sboyd@kernel.org>
      010d5166
    • O
      fix null pointer deref in tracepoints in back channel · f87b543a
      Olga Kornievskaia 提交于
      Backchannel doesn't have the rq_task->tk_clientid pointer set.
      
      Otherwise can lead to the following oops:
      ocalhost login: [  111.385319] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
      [  111.388073] #PF error: [normal kernel read fault]
      [  111.389452] PGD 80000000290d8067 P4D 80000000290d8067 PUD 75f25067 PMD 0
      [  111.391224] Oops: 0000 [#1] SMP PTI
      [  111.392151] CPU: 0 PID: 3533 Comm: NFSv4 callback Not tainted 5.0.0-rc7+ #1
      [  111.393787] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
      [  111.396340] RIP: 0010:trace_event_raw_event_xprt_enq_xmit+0x6f/0xf0 [sunrpc]
      [  111.397974] Code: 00 00 00 48 89 ee 48 89 e7 e8 bd 0a 85 d7 48 85 c0 74 4a 41 0f b7 94 24 e0 00 00 00 48 89 e7 89 50 08 49 8b 94 24 a8 00 00 00 <8b> 52 04 89 50 0c 49 8b 94 24 c0 00 00 00 8b 92 a8 00 00 00 0f ca
      [  111.402215] RSP: 0018:ffffb98743263cf8 EFLAGS: 00010286
      [  111.403406] RAX: ffffa0890fc3bc88 RBX: 0000000000000003 RCX: 0000000000000000
      [  111.405057] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb98743263cf8
      [  111.406656] RBP: ffffa0896f5368f0 R08: 0000000000000246 R09: 0000000000000000
      [  111.408437] R10: ffffe19b01c01500 R11: 0000000000000000 R12: ffffa08977d28a00
      [  111.410210] R13: 0000000000000004 R14: ffffa089315303f0 R15: ffffa08931530000
      [  111.411856] FS:  0000000000000000(0000) GS:ffffa0897bc00000(0000) knlGS:0000000000000000
      [  111.413699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  111.415068] CR2: 0000000000000004 CR3: 000000002ac90004 CR4: 00000000001606f0
      [  111.416745] Call Trace:
      [  111.417339]  xprt_request_enqueue_transmit+0x2b6/0x4a0 [sunrpc]
      [  111.418709]  ? rpc_task_need_encode+0x40/0x40 [sunrpc]
      [  111.419957]  call_bc_transmit+0xd5/0x170 [sunrpc]
      [  111.421067]  __rpc_execute+0x7e/0x3f0 [sunrpc]
      [  111.422177]  rpc_run_bc_task+0x78/0xd0 [sunrpc]
      [  111.423212]  bc_svc_process+0x281/0x340 [sunrpc]
      [  111.424325]  nfs41_callback_svc+0x130/0x1c0 [nfsv4]
      [  111.425430]  ? remove_wait_queue+0x60/0x60
      [  111.426398]  kthread+0xf5/0x130
      [  111.427155]  ? nfs_callback_authenticate+0x50/0x50 [nfsv4]
      [  111.428388]  ? kthread_bind+0x10/0x10
      [  111.429270]  ret_from_fork+0x1f/0x30
      
      localhost login: [  467.462259] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
      [  467.464411] #PF error: [normal kernel read fault]
      [  467.465445] PGD 80000000728c1067 P4D 80000000728c1067 PUD 728c0067 PMD 0
      [  467.466980] Oops: 0000 [#1] SMP PTI
      [  467.467759] CPU: 0 PID: 3517 Comm: NFSv4 callback Not tainted 5.0.0-rc7+ #1
      [  467.469393] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
      [  467.471840] RIP: 0010:trace_event_raw_event_xprt_transmit+0x7c/0xf0 [sunrpc]
      [  467.473392] Code: f6 48 85 c0 74 4b 49 8b 94 24 98 00 00 00 48 89 e7 0f b7 92 e0 00 00 00 89 50 08 49 8b 94 24 98 00 00 00 48 8b 92 a8 00 00 00 <8b> 52 04 89 50 0c 41 8b 94 24 a8 00 00 00 0f ca 89 50 10 41 8b 94
      [  467.477605] RSP: 0018:ffffabe7434fbcd0 EFLAGS: 00010282
      [  467.478793] RAX: ffff99720fc3bce0 RBX: 0000000000000003 RCX: 0000000000000000
      [  467.480409] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffabe7434fbcd0
      [  467.482011] RBP: ffff99726f631948 R08: 0000000000000246 R09: 0000000000000000
      [  467.483591] R10: 0000000070000000 R11: 0000000000000000 R12: ffff997277dfcc00
      [  467.485226] R13: 0000000000000000 R14: 0000000000000000 R15: ffff99722fecdca8
      [  467.486830] FS:  0000000000000000(0000) GS:ffff99727bc00000(0000) knlGS:0000000000000000
      [  467.488596] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  467.489931] CR2: 0000000000000004 CR3: 00000000270e6006 CR4: 00000000001606f0
      [  467.491559] Call Trace:
      [  467.492128]  xprt_transmit+0x303/0x3f0 [sunrpc]
      [  467.493143]  ? rpc_task_need_encode+0x40/0x40 [sunrpc]
      [  467.494328]  call_bc_transmit+0x49/0x170 [sunrpc]
      [  467.495379]  __rpc_execute+0x7e/0x3f0 [sunrpc]
      [  467.496451]  rpc_run_bc_task+0x78/0xd0 [sunrpc]
      [  467.497467]  bc_svc_process+0x281/0x340 [sunrpc]
      [  467.498507]  nfs41_callback_svc+0x130/0x1c0 [nfsv4]
      [  467.499751]  ? remove_wait_queue+0x60/0x60
      [  467.500686]  kthread+0xf5/0x130
      [  467.501438]  ? nfs_callback_authenticate+0x50/0x50 [nfsv4]
      [  467.502640]  ? kthread_bind+0x10/0x10
      [  467.503454]  ret_from_fork+0x1f/0x30
      Signed-off-by: NOlga Kornievskaia <kolga@netapp.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
      f87b543a
    • K
      Drop flex_arrays · 586187d7
      Kent Overstreet 提交于
      All existing users have been converted to generic radix trees
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-8-kent.overstreet@gmail.comSigned-off-by: NKent Overstreet <kent.overstreet@gmail.com>
      Acked-by: NDave Hansen <dave.hansen@intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      586187d7
    • K
      sctp: convert to genradix · 2075e50c
      Kent Overstreet 提交于
      This also makes sctp_stream_alloc_(out|in) saner, in that they no longer
      allocate new flex_arrays/genradixes, they just preallocate more
      elements.
      
      This code does however have a suspicious lack of locking.
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-7-kent.overstreet@gmail.comSigned-off-by: NKent Overstreet <kent.overstreet@gmail.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2075e50c
    • K
      generic radix trees · ba20ba2e
      Kent Overstreet 提交于
      Very simple radix tree implementation that supports storing arbitrary
      size entries, up to PAGE_SIZE - upcoming patches will convert existing
      flex_array users to genradixes.  The new genradix code has a much
      simpler API and implementation, and doesn't have a hard limit on the
      number of elements like flex_array does.
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-5-kent.overstreet@gmail.comSigned-off-by: NKent Overstreet <kent.overstreet@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ba20ba2e
    • M
      memblock: remove memblock_{set,clear}_region_flags · fe145124
      Mike Rapoport 提交于
      The memblock API provides dedicated helpers to set or clear a flag on a
      memory region, e.g.  memblock_{mark,clear}_hotplug().
      
      The memblock_{set,clear}_region_flags() functions are used only by the
      memblock internal function that adjusts the region flags.  Drop these
      functions and use open-coded implementation instead.
      
      Link: http://lkml.kernel.org/r/1549455025-17706-2-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fe145124
    • M
      memblock: drop memblock_alloc_*_nopanic() variants · 26fb3dae
      Mike Rapoport 提交于
      As all the memblock allocation functions return NULL in case of error
      rather than panic(), the duplicates with _nopanic suffix can be removed.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-22-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Petr Mladek <pmladek@suse.com>		[printk]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      26fb3dae
    • M
      memblock: make memblock_find_in_range_node() and choose_memblock_flags() static · c366ea89
      Mike Rapoport 提交于
      These functions are not used outside memblock.  Make them static.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-12-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c366ea89
    • M
      memblock: refactor internal allocation functions · 92d12f95
      Mike Rapoport 提交于
      Currently, memblock has several internal functions with overlapping
      functionality.  They all call memblock_find_in_range_node() to find free
      memory and then reserve the allocated range and mark it with kmemleak.
      However, there is difference in the allocation constraints and in
      fallback strategies.
      
      The allocations returning physical address first attempt to find free
      memory on the specified node within mirrored memory regions, then retry
      on the same node without the requirement for memory mirroring and
      finally fall back to all available memory.
      
      The allocations returning virtual address start with clamping the
      allowed range to memblock.current_limit, attempt to allocate from the
      specified node from regions with mirroring and with user defined minimal
      address.  If such allocation fails, next attempt is done with node
      restriction lifted.  Next, the allocation is retried with minimal
      address reset to zero and at last without the requirement for mirrored
      regions.
      
      Let's consolidate various fallbacks handling and make them more
      consistent for physical and virtual variants.  Most of the fallback
      handling is moved to memblock_alloc_range_nid() and it now handles node
      and mirror fallbacks.
      
      The memblock_alloc_internal() uses memblock_alloc_range_nid() to get a
      physical address of the allocated range and converts it to virtual
      address.
      
      The fallback for allocation below the specified minimal address remains
      in memblock_alloc_internal() because memblock_alloc_range_nid() is used
      by CMA with exact requirement for lower bounds.
      
      The memblock_phys_alloc_nid() function is completely dropped as it is not
      used anywhere outside memblock and its only usage can be replaced by a
      call to memblock_alloc_range_nid().
      
      [rppt@linux.ibm.com: fix parameter order in memblock_phys_alloc_try_nid()]
        Link: http://lkml.kernel.org/r/20190203113915.GC8620@rapoport-lnx
      Link: http://lkml.kernel.org/r/1548057848-15136-11-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Tested-by: NMichael Ellerman <mpe@ellerman.id.au>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      92d12f95
    • M
      memblock: drop memblock_alloc_base() · 0ba9e6ed
      Mike Rapoport 提交于
      The memblock_alloc_base() function tries to allocate a memory up to the
      limit specified by its max_addr parameter and panics if the allocation
      fails.  Replace its usage with memblock_phys_alloc_range() and make the
      callers check the return value and panic in case of error.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-10-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ba9e6ed
    • M
      memblock: drop __memblock_alloc_base() · 42b46aef
      Mike Rapoport 提交于
      The __memblock_alloc_base() function tries to allocate a memory up to
      the limit specified by its max_addr parameter.  Depending on the value
      of this parameter, the __memblock_alloc_base() can is replaced with the
      appropriate memblock_phys_alloc*() variant.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-9-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Acked-by: NRob Herring <robh@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      42b46aef
    • M
      memblock: memblock_phys_alloc(): don't panic · ecc3e771
      Mike Rapoport 提交于
      Make the memblock_phys_alloc() function an inline wrapper for
      memblock_phys_alloc_range() and update the memblock_phys_alloc() callers
      to check the returned value and panic in case of error.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-8-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ecc3e771
    • M
      memblock: emphasize that memblock_alloc_range() returns a physical address · 8a770c2a
      Mike Rapoport 提交于
      Rename memblock_alloc_range() to memblock_phys_alloc_range() to
      emphasize that it returns a physical address.
      
      While on it, remove the 'enum memblock_flags' parameter from this
      function as its only user anyway sets it to MEMBLOCK_NONE, which is the
      default for the most of memblock allocations.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-6-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a770c2a
    • M
      memblock: drop memblock_alloc_base_nid() · 53d818d2
      Mike Rapoport 提交于
      memblock_alloc_base_nid() is a oneliner wrapper for
      memblock_alloc_range_nid() without any side effect.
      
      Replace it's usage by the direct calls to memblock_alloc_range_nid().
      
      Link: http://lkml.kernel.org/r/1548057848-15136-5-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      53d818d2
    • N
      mm: refactor readahead defines in mm.h · b5420237
      Nikolay Borisov 提交于
      All users of VM_MAX_READAHEAD actually convert it to kbytes and then to
      pages. Define the macro explicitly as (SZ_128K / PAGE_SIZE). This
      simplifies the expression in every filesystem. Also rename the macro to
      VM_READAHEAD_PAGES to properly convey its meaning. Finally remove unused
      VM_MIN_READAHEAD
      
      [akpm@linux-foundation.org: fix fs/io_uring.c, per Stephen]
      Link: http://lkml.kernel.org/r/20181221144053.24318-1-nborisov@suse.comSigned-off-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NMatthew Wilcox <willy@infradead.org>
      Reviewed-by: NDavid Hildenbrand <david@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Latchesar Ionkov <lucho@ionkov.net>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Chris Mason <clm@fb.com>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: David Sterba <dsterba@suse.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b5420237
    • S
      mm/hmm: convert to use vm_fault_t · b57e622e
      Souptick Joarder 提交于
      Convert to use vm_fault_t type as return type for fault handler.
      
      kbuild reported warning during testing of
      *mm-create-the-new-vm_fault_t-type.patch* available in below link -
      https://patchwork.kernel.org/patch/10752741/
      
        kernel/memremap.c:46:34: warning: incorrect type in return expression
                                 (different base types)
        kernel/memremap.c:46:34: expected restricted vm_fault_t
        kernel/memremap.c:46:34: got int
      
      This patch has fixed the warnings and also hmm_devmem_fault() is
      converted to return vm_fault_t to avoid further warnings.
      
      [sfr@canb.auug.org.au: drm/nouveau/dmem: update for struct hmm_devmem_ops member change]
        Link: http://lkml.kernel.org/r/20190220174407.753d94e5@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20190110145900.GA1317@jordon-HP-15-Notebook-PCSigned-off-by: NSouptick Joarder <jrdr.linux@gmail.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Reviewed-by: NJérôme Glisse <jglisse@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b57e622e