1. 21 Nov, 2019 1 commit
    • R
      PCI: of: Add inbound resource parsing to helpers · 331f6345
      Rob Herring committed
      Extend devm_of_pci_get_host_bridge_resources() and
      pci_parse_request_of_pci_ranges() helpers to also parse the inbound
      addresses from DT 'dma-ranges' and populate a resource list with the
      translated addresses. This will help ensure 'dma-ranges' is always
      parsed in a consistent way.
      Tested-by: Srinath Mannam <srinath.mannam@broadcom.com>
      Tested-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> # for AArdvark
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Srinath Mannam <srinath.mannam@broadcom.com>
      Reviewed-by: Andrew Murray <andrew.murray@arm.com>
      Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
      Cc: Jingoo Han <jingoohan1@gmail.com>
      Cc: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Toan Le <toan@os.amperecomputing.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Tom Joseph <tjoseph@cadence.com>
      Cc: Ray Jui <rjui@broadcom.com>
      Cc: Scott Branden <sbranden@broadcom.com>
      Cc: bcm-kernel-feedback-list@broadcom.com
      Cc: Ryder Lee <ryder.lee@mediatek.com>
      Cc: Karthikeyan Mitran <m.karthikeyan@mobiveil.co.in>
      Cc: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Shawn Lin <shawn.lin@rock-chips.com>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: rfi@lists.rocketboards.org
      Cc: linux-mediatek@lists.infradead.org
      Cc: linux-renesas-soc@vger.kernel.org
      Cc: linux-rockchip@lists.infradead.org
  2. 29 Oct, 2019 1 commit
  3. 29 Sep, 2019 2 commits
    • D
      Revert "Revert "Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"" · 19deb769
      David Rientjes committed
      This reverts commit 92717d42.
      
      Since commit a8282608 ("Revert "mm, thp: restore node-local hugepage
      allocations"") is reverted in this series, it is better to restore the
      previous 5.2 behavior between the thp allocation and the page allocator
      rather than to attempt any consolidation or cleanup for a policy that is
      now reverted.  It's less risky during an rc cycle and subsequent patches
      in this series further modify the same policy that the pre-5.3 behavior
      implements.
      
      Consolidation and cleanup can be done subsequent to a sane default page
      allocation strategy, so this patch reverts a cleanup done on a strategy
      that is now reverted and thus is the least risky option.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • D
      Revert "Revert "mm, thp: restore node-local hugepage allocations"" · ac79f78d
      David Rientjes committed
      This reverts commit a8282608.
      
      The commit references the original intended semantic for MADV_HUGEPAGE
      which has subsequently taken on three unique purposes:
      
       - enables or disables thp for a range of memory depending on the system's
         config (is thp "enabled" set to "always" or "madvise"),
      
       - determines the synchronous compaction behavior for thp allocations at
         fault (is thp "defrag" set to "always", "defer+madvise", or "madvise"),
         and
      
       - reverts a previous MADV_NOHUGEPAGE (there is no madvise mode to only
         clear previous hugepage advice).
      
      These are the three purposes that currently exist in 5.2 and over the
      past several years that userspace has been written around.  Adding a
      NUMA locality preference adds a fourth dimension to an already conflated
      advice mode.
      
      Based on the semantic that MADV_HUGEPAGE has provided over the past
      several years, there exist workloads that use the tunable based on these
      principles: specifically that the allocation should attempt to
      defragment a local node before falling back.  It is agreed that remote
      hugepages typically (but not always) have a better access latency than
      remote native pages, although on Naples this is at parity for
      intersocket.
      
      The revert commit that this patch reverts allows hugepage allocation to
      immediately allocate remotely when local memory is fragmented.  This is
      contrary to the semantic of MADV_HUGEPAGE over the past several years:
      that is, memory compaction should be attempted locally before falling
      back.
      
      The performance degradation of remote hugepages over local hugepages on
      Rome, for example, is 53.5% increased access latency.  For this reason,
      the goal is to revert back to the 5.2 and previous behavior that would
      attempt local defragmentation before falling back.  With the patch that
      is reverted by this patch, we see performance degradations at the tail
      because the allocator happily allocates the remote hugepage rather than
      even attempting to make a local hugepage available.
      
      zone_reclaim_mode is not a solution to this problem since it does not
      only impact hugepage allocations but rather changes the memory
      allocation strategy for *all* page allocations.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 28 Sep, 2019 1 commit
    • F
      sk_buff: drop all skb extensions on free and skb scrubbing · 174e2381
      Florian Westphal committed
      Now that we have a 3rd extension, add a new helper that drops the
      extension space and use it when we need to scrub an sk_buff.
      
      At this time, scrubbing clears secpath and bridge netfilter data but
      retains the tc skb extension; after this patch, all three get cleared.
      
      NAPI reuse/free assumes we can only have a secpath attached to skb, but
      it seems better to clear all extensions there as well.
      
      v2: add unlikely hint (Eric Dumazet)
      
      Fixes: 95a7233c ("net: openvswitch: Set OvS recirc_id from tc chain index")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 27 Sep, 2019 1 commit
    • M
      mm: treewide: clarify pgtable_page_{ctor,dtor}() naming · b4ed71f5
      Mark Rutland committed
      The naming of pgtable_page_{ctor,dtor}() seems to have confused a few
      people, and until recently arm64 used these erroneously/pointlessly for
      other levels of page table.
      
      To make it incredibly clear that these only apply to the PTE level, and to
      align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them
      to pgtable_pte_page_{ctor,dtor}().
      
      These changes were generated with the following shell script:
      
      ----
      git grep -lw 'pgtable_page_.tor' | while read FILE; do
          sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE;
          sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE;
      done
      ----
      
      ... with the documentation re-flowed to remain under 80 columns, and
      whitespace fixed up in macros to keep backslashes aligned.
      
      There should be no functional change as a result of this patch.
      
      Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 26 Sep, 2019 12 commits
    • M
      mm: introduce MADV_PAGEOUT · 1a4e58cc
      Minchan Kim committed
      When a process expects no accesses to a certain memory range for a long
      time, it could hint kernel that the pages can be reclaimed instantly but
      data should be preserved for future use.  This could reduce workingset
      eviction so it ends up increasing performance.
      
      This patch introduces the new MADV_PAGEOUT hint to madvise(2) syscall.
      MADV_PAGEOUT can be used by a process to mark a memory range as not
      expected to be used for a long time so that kernel reclaims *any LRU*
      pages instantly.  The hint can help kernel in deciding which pages to
      evict proactively.
      
      A note: it intentionally doesn't apply the SWAP_CLUSTER_MAX LRU page
      isolation limit because it's automatically bounded by PMD size.  If the
      PMD size (e.g., 256) causes trouble, we could fix it later by limiting it
      to SWAP_CLUSTER_MAX [1].
      
      - man-page material
      
      MADV_PAGEOUT (since Linux x.x)
      
      Do not expect access in the near future so pages in the specified
      regions could be reclaimed instantly regardless of memory pressure.
      Thus, access in the range after successful operation could cause
      major page fault but never lose the up-to-date contents unlike
      MADV_DONTNEED. Pages belonging to a shared mapping are only processed
      if a write access is allowed for the calling process.
      
      MADV_PAGEOUT cannot be applied to locked pages, Huge TLB pages, or
      VM_PFNMAP pages.
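
A minimal userspace sketch of the behavior described in the man-page text above: the hint may cause the pages to be reclaimed, but their contents must still be there on the next access. The helper name `pageout_preserves_data` is hypothetical, and the fallback value for `MADV_PAGEOUT` is an assumption taken from the generic Linux UAPI headers for systems whose libc headers predate the hint.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: generic UAPI value, used only if libc headers lack it. */
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Fill an anonymous private mapping, hint that it can be paged out, then
 * read it back.  Returns 1 if every byte survived, 0 if not, and -1 if
 * the mapping could not be created.
 */
int pageout_preserves_data(size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    size_t len, i;
    int intact = 1;

    assert(page > 0 && npages > 0);
    len = npages * (size_t)page;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xA5, len);

    /*
     * Only a hint: kernels without MADV_PAGEOUT reject it, and without
     * swap the anonymous pages simply stay resident.  Either way the
     * data must remain readable, so the return value is ignored here.
     */
    (void)madvise(p, len, MADV_PAGEOUT);

    /* Reading may take major faults, but the contents must match. */
    for (i = 0; i < len; i++) {
        if (p[i] != 0xA5) {
            intact = 0;
            break;
        }
    }

    munmap(p, len);
    return intact;
}
```

Calling `pageout_preserves_data(4)` should report intact data whether or not the running kernel honors the hint, which is the contrast with MADV_DONTNEED drawn above.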
      
      [1] https://lore.kernel.org/lkml/20190710194719.GS29695@dhcp22.suse.cz/
      
      [minchan@kernel.org: clear PG_active on MADV_PAGEOUT]
        Link: http://lkml.kernel.org/r/20190802200643.GA181880@google.com
      [akpm@linux-foundation.org: resolve conflicts with hmm.git]
      Link: http://lkml.kernel.org/r/20190726023435.214162-5-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reported-by: kbuild test robot <lkp@intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oleksandr Natalenko <oleksandr@redhat.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Sonny Rao <sonnyrao@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Tim Murray <timmurray@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      mm: introduce MADV_COLD · 9c276cc6
      Minchan Kim committed
      Patch series "Introduce MADV_COLD and MADV_PAGEOUT", v7.
      
      - Background
      
      The Android terminology used for forking a new process and starting an app
      from scratch is a cold start, while resuming an existing app is a hot
      start.  While we continually try to improve the performance of cold
      starts, hot starts will always be significantly less power hungry as well
      as faster so we are trying to make hot start more likely than cold start.
      
      To increase hot start, Android userspace manages the order that apps
      should be killed in a process called ActivityManagerService.
      ActivityManagerService tracks every Android app or service that the user
      could be interacting with at any time and translates that into a ranked
      list for lmkd(low memory killer daemon).  They are likely to be killed by
      lmkd if the system has to reclaim memory.  In that sense they are similar
      to entries in any other cache.  Those apps are kept alive for
      opportunistic performance improvements but those performance improvements
      will vary based on the memory requirements of individual workloads.
      
      - Problem
      
      Naturally, cached apps were the dominant consumers of memory on the
      system.  However, they were not significant consumers of swap even though
      they are good candidates for swap.  On investigation, swapping out only
      begins once the low zone watermark is hit and kswapd wakes up, but the
      overall allocation rate in the system might trip lmkd thresholds and cause
      a cached process to be killed.  (We measured the performance of swapping
      out vs. zapping the memory by killing a process.  Unsurprisingly, zapping
      is 10x faster even though we use zram, which is much faster than real
      storage.)  A kill from lmkd will therefore often satisfy the high zone
      watermark, resulting in very few pages actually being moved to swap.
      
      - Approach
      
      The approach we chose was to use a new interface to allow userspace to
      proactively reclaim entire processes by leveraging platform information.
      This allowed us to bypass the inaccuracy of the kernel’s LRUs for pages
      that are known to be cold from userspace and to avoid races with lmkd by
      reclaiming apps as soon as they entered the cached state.  Additionally,
      it could provide many chances for platform to use much information to
      optimize memory efficiency.
      
      To achieve the goal, the patchset introduces two new options for madvise.
      One is MADV_COLD which will deactivate activated pages and the other is
      MADV_PAGEOUT which will reclaim private pages instantly.  These new
      options complement MADV_DONTNEED and MADV_FREE by adding non-destructive
      ways to gain some free memory space.  MADV_PAGEOUT is similar to
      MADV_DONTNEED in a way that it hints the kernel that memory region is not
      currently needed and should be reclaimed immediately; MADV_COLD is similar
      to MADV_FREE in a way that it hints the kernel that memory region is not
      currently needed and should be reclaimed when memory pressure rises.
      
      This patch (of 5):
      
      When a process expects no accesses to a certain memory range, it could
      give a hint to kernel that the pages can be reclaimed when memory pressure
      happens but data should be preserved for future use.  This could reduce
      workingset eviction so it ends up increasing performance.
      
      This patch introduces the new MADV_COLD hint to madvise(2) syscall.
      MADV_COLD can be used by a process to mark a memory range as not expected
      to be used in the near future.  The hint can help kernel in deciding which
      pages to evict early during memory pressure.
      
      It works for all LRU pages, like MADV_[DONTNEED|FREE]. IOW, it moves
      
      	active file page -> inactive file LRU
      	active anon page -> inactive anon LRU
      
      Unlike MADV_FREE, it doesn't move active anonymous pages to the inactive
      file LRU's head because MADV_COLD has slightly different semantics.
      MADV_FREE means it's okay to discard the pages under memory pressure
      because the content of the page is *garbage*, so freeing such pages has
      almost zero overhead: we don't need to swap out, and a later access causes
      just a minor fault.  Thus, it makes sense to put those freeable pages on
      the inactive file LRU to compete with other used-once pages.  It makes
      sense from an implementation point of view too, because the memory is no
      longer swap-backed until it is re-dirtied.  It even gives a bonus in that
      the pages can be reclaimed on a swapless system.  However, MADV_COLD
      doesn't mean garbage, so reclaiming those pages requires swap-out/in in
      the end, which is a bigger cost.  Since we have designed VM LRU aging
      based on a cost model, anonymous cold pages are better positioned on the
      inactive anon LRU list, not the file LRU.  Furthermore, it helps to avoid
      unnecessary scanning if the system doesn't have a swap device.  Let's
      start with the simpler way without adding complexity at this moment.
      However, keep in mind the caveat that workloads with a lot of page cache
      are likely to ignore MADV_COLD on anonymous memory because we rarely age
      anonymous LRU lists.
      
      * man-page material
      
      MADV_COLD (since Linux x.x)
      
      Pages in the specified regions will be treated as less-recently-accessed
      compared to pages in the system with similar access frequencies.  In
      contrast to MADV_FREE, the contents of the region are preserved regardless
      of subsequent writes to pages.
      
      MADV_COLD cannot be applied to locked pages, Huge TLB pages, or VM_PFNMAP
      pages.
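
A small sketch of how a caller might issue the hint and cope with kernels that do not know it. The names `hint_cold`, `cold_keeps_contents`, and the `cold_result` enum are hypothetical, and the fallback value for `MADV_COLD` is an assumption taken from the generic Linux UAPI headers. Note that `EINVAL` can also mean a misaligned address, so treating it as "unsupported" is a simplification.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: generic UAPI value, used only if libc headers lack it. */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif

enum cold_result { COLD_APPLIED = 0, COLD_UNSUPPORTED = 1, COLD_ERROR = -1 };

/* addr must be page-aligned, as for any madvise() call. */
enum cold_result hint_cold(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
    if (madvise(addr, len, MADV_COLD) == 0)
        return COLD_APPLIED;
    /* Simplification: older kernels report an unknown advice as EINVAL. */
    return errno == EINVAL ? COLD_UNSUPPORTED : COLD_ERROR;
}

/*
 * Map a scratch buffer, mark it cold, and check that the contents are
 * still there.  Returns 1 if intact, 0 if not, -1 if mmap() failed.
 */
int cold_keeps_contents(size_t len)
{
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int intact;

    if (p == MAP_FAILED)
        return -1;

    memset(p, 'c', len);
    (void)hint_cold(p, len);    /* the hint never discards data */
    intact = (p[0] == 'c' && p[len - 1] == 'c');

    munmap(p, len);
    return intact;
}
```

Because the hint only changes how soon the pages are considered for reclaim, the caller can keep using the range afterwards without any re-initialization, unlike with MADV_FREE.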
      
      [akpm@linux-foundation.org: resolve conflicts with hmm.git]
      Link: http://lkml.kernel.org/r/20190726023435.214162-2-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reported-by: kbuild test robot <lkp@intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oleksandr Natalenko <oleksandr@redhat.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Sonny Rao <sonnyrao@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Tim Murray <timmurray@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • D
      kgdb: don't use a notifier to enter kgdb at panic; call directly · 7d92bda2
      Douglas Anderson committed
      Right now kgdb/kdb hooks up to debug panics by registering for the panic
      notifier.  This works OK except that it means that kgdb/kdb gets called
      _after_ the CPUs in the system are taken offline.  That means that if
      anything important was happening on those CPUs (like something that might
      have contributed to the panic) you can't debug them.
      
      Specifically I ran into a case where I got a panic because a task was
      "blocked for more than 120 seconds" which was detected on CPU 2.  I nicely
      got shown stack traces in the kernel log for all CPUs including CPU 0,
      which was running 'PID: 111 Comm: kworker/0:1H' and was in the middle of
      __mmc_switch().
      
      I then ended up at the kdb prompt where I switched over to kgdb to try to
      look at local variables of the process on CPU 0.  I found that I couldn't.
      Digging more, I found that I had no info on any tasks running on CPUs
      other than CPU 2 and that asking kdb for help showed me "Error: no saved
      data for this cpu".  This was because all the CPUs were offline.
      
      Let's move the entry of kdb/kgdb to a direct call from panic() and stop
      using the generic notifier.  Putting a direct call in allows us to order
      things more properly and it also doesn't seem like we're breaking any
      abstractions by calling into the debugger from the panic function.
      
      Daniel said:
      
      : This patch changes the way kdump and kgdb interact with each other.
      : However it would seem rather odd to have both tools simultaneously armed
      : and, even if they were, the user still has the option to use panic_timeout
      : to force a kdump to happen.  Thus I think the change of order is
      : acceptable.
      
      Link: http://lkml.kernel.org/r/20190703170354.217312-1-dianders@chromium.org
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: YueHaibing <yuehaibing@huawei.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • K
      uaccess: add missing __must_check attributes · 9dd819a1
      Kees Cook committed
      The usercopy implementation comments describe that callers of the
      copy_*_user() family of functions must always have their return values
      checked.  This can be enforced at compile time with __must_check, so add
      it where needed.
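
As a rough userspace illustration of the mechanism: the kernel's `__must_check` expands to the compiler's `warn_unused_result` attribute, which makes GCC and Clang warn when a caller discards the return value. The `MUST_CHECK` macro and the function `copy_in_sketch` below are hypothetical stand-ins, not kernel code; the function only mimics the copy_*_user() convention of returning the number of bytes that could not be copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same attribute the kernel's __must_check expands to. */
#define MUST_CHECK __attribute__((warn_unused_result))

/*
 * Hypothetical stand-in for copy_from_user(): copies at most `avail`
 * bytes of the requested `n` and returns how many bytes were NOT
 * copied, so 0 means complete success.
 */
MUST_CHECK size_t copy_in_sketch(void *dst, const void *src,
                                 size_t n, size_t avail)
{
    size_t ok = n < avail ? n : avail;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, ok);
    return n - ok;
}
```

A bare statement such as `copy_in_sketch(dst, src, n, avail);` draws an unused-result warning at compile time, whereas `if (copy_in_sketch(dst, src, n, avail)) return -1;` compiles cleanly.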
      
      Link: http://lkml.kernel.org/r/201908251609.ADAD5CAAC1@keescook
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • V
      kexec: restore arch_kexec_kernel_image_probe declaration · d5372c39
      Vasily Gorbik committed
      The arch_kexec_kernel_image_probe function declaration was removed by
      commit 9ec4ecef ("kexec_file,x86,powerpc: factor out kexec_file_ops
      functions").  This function is still overridden by a couple of
      architectures, so a proper prototype declaration is important; bring it back.
      This fixes the following sparse warning on s390:
      arch/s390/kernel/machine_kexec_file.c:333:5: warning: symbol
      'arch_kexec_kernel_image_probe' was not declared.  Should it be static?
      
      Link: http://lkml.kernel.org/r/patch.git-ff1c9045ebdc.your-ad-here.call-01564402297-ext-5690@work.hours
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      cpumask: nicer for_each_cpumask_and() signature · 2a4a4082
      Alexey Dobriyan committed
      Mask arguments can be swapped without changing anything.  Make the
      argument names reflect that:
      
      	#define for_each_cpu_and(cpu, mask1, mask2)
      
      Link: http://lkml.kernel.org/r/20190724183350.GA15041@avx2
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • S
      fork: improve error message for corrupted page tables · 8495f7e6
      Sai Praneeth Prakhya committed
      When a user process exits, the kernel cleans up the mm_struct of the user
      process and during cleanup, check_mm() checks the page tables of the user
      process for corruption (E.g: unexpected page flags set/cleared).  For
      corrupted page tables, the error message printed by check_mm() isn't very
      clear as it prints the loop index instead of page table type (E.g:
      Resident file mapping pages vs Resident shared memory pages).  The loop
      index in check_mm() is used to index rss_stat[] which represents
      individual memory type stats.  Hence, instead of printing index, print
      memory type, thereby improving error message.
      
      Without patch:
      --------------
      [  204.836425] mm/pgtable-generic.c:29: bad p4d 0000000089eb4e92(800000025f941467)
      [  204.836544] BUG: Bad rss-counter state mm:00000000f75895ea idx:0 val:2
      [  204.836615] BUG: Bad rss-counter state mm:00000000f75895ea idx:1 val:5
      [  204.836685] BUG: non-zero pgtables_bytes on freeing mm: 20480
      
      With patch:
      -----------
      [   69.815453] mm/pgtable-generic.c:29: bad p4d 0000000084653642(800000025ca37467)
      [   69.815872] BUG: Bad rss-counter state mm:00000000014a6c03 type:MM_FILEPAGES val:2
      [   69.815962] BUG: Bad rss-counter state mm:00000000014a6c03 type:MM_ANONPAGES val:5
      [   69.816050] BUG: non-zero pgtables_bytes on freeing mm: 20480
      
      Also, change print function (from printk(KERN_ALERT, ..) to pr_alert()) so
      that it matches the other print statement.
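
The idea of printing a name instead of a loop index can be sketched in plain C with a name table indexed by the counter enum. This is a userspace approximation, not the kernel's code: the names `NAMED_INDEX`, `rss_type_names`, and `format_bad_rss` are hypothetical, the enum order is an assumption chosen to match the log output above (index 0 is MM_FILEPAGES, index 1 is MM_ANONPAGES), and the `mm:` pointer field of the real message is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed counter order, consistent with the "idx:0" / "idx:1" lines. */
enum {
    MM_FILEPAGES,
    MM_ANONPAGES,
    MM_SWAPENTS,
    MM_SHMEMPAGES,
    NR_MM_COUNTERS
};

/* Designated initializer keeps each name tied to its enum value. */
#define NAMED_INDEX(x) [x] = #x

static const char *const rss_type_names[NR_MM_COUNTERS] = {
    NAMED_INDEX(MM_FILEPAGES),
    NAMED_INDEX(MM_ANONPAGES),
    NAMED_INDEX(MM_SWAPENTS),
    NAMED_INDEX(MM_SHMEMPAGES),
};

/* Format the improved message into buf; returns snprintf()'s result. */
int format_bad_rss(char *buf, size_t len, int idx, long val)
{
    assert(idx >= 0 && idx < NR_MM_COUNTERS);
    return snprintf(buf, len,
                    "BUG: Bad rss-counter state type:%s val:%ld",
                    rss_type_names[idx], val);
}
```

Tying the table to the enum with designated initializers means that reordering the enum cannot silently mislabel a counter.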
      
      Link: http://lkml.kernel.org/r/da75b5153f617f4c5739c08ee6ebeb3d19db0fbc.1565123758.git.sai.praneeth.prakhya@intel.com
      Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Suggested-by: Dave Hansen <dave.hansen@intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Dave Hansen <dave.hansen@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • S
      lib/hexdump: make print_hex_dump_bytes() a nop on !DEBUG builds · 091cb099
      Stephen Boyd committed
      I'm seeing a bunch of debug prints from a user of print_hex_dump_bytes()
      in my kernel logs, but I don't have CONFIG_DYNAMIC_DEBUG enabled nor do I
      have DEBUG defined in my build.  The problem is that
      print_hex_dump_bytes() calls a wrapper function in lib/hexdump.c that
      calls print_hex_dump() with KERN_DEBUG level.  There are three cases to
      consider here:

        1. CONFIG_DYNAMIC_DEBUG=y  --> call dynamic_hex_dump()
        2. CONFIG_DYNAMIC_DEBUG=n && DEBUG --> call print_hex_dump()
        3. CONFIG_DYNAMIC_DEBUG=n && !DEBUG --> stub it out
      
      Right now, that last case isn't detected and we still call
      print_hex_dump() from the stub wrapper.
      
      Let's make print_hex_dump_bytes() only call print_hex_dump_debug() so that
      it works properly in all cases.
      
      Case #1, print_hex_dump_debug() calls dynamic_hex_dump() and we get same
      behavior.  Case #2, print_hex_dump_debug() calls print_hex_dump() with
      KERN_DEBUG and we get the same behavior.  Case #3, print_hex_dump_debug()
      is a nop, changing behavior to what we want, i.e.  print nothing.
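
The three cases can be approximated in userspace with the same layering: one debug-level entry point that is selected at compile time, and a bytes wrapper that only ever calls that entry point. Everything prefixed `sketch_` or `SKETCH_` is hypothetical and merely stands in for the kernel symbols named above; the dynamic debug case is reduced to a single runtime flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

int sketch_lines_emitted;   /* counts dumps that were actually printed */

/* Stand-in for print_hex_dump(): always prints. */
void sketch_hex_dump(const char *prefix, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i;

    assert(prefix != NULL && (buf != NULL || len == 0));
    fputs(prefix, stdout);
    for (i = 0; i < len; i++)
        printf("%02x ", p[i]);
    putchar('\n');
    sketch_lines_emitted++;
}

#if defined(SKETCH_DYNAMIC_DEBUG)
/* Case 1: the call site can be toggled at run time. */
int sketch_dyndbg_on;
#define sketch_hex_dump_debug(prefix, buf, len)          \
    do {                                                 \
        if (sketch_dyndbg_on)                            \
            sketch_hex_dump(prefix, buf, len);           \
    } while (0)
#elif defined(SKETCH_DEBUG)
/* Case 2: debug build without dynamic debug, always print. */
#define sketch_hex_dump_debug(prefix, buf, len) \
    sketch_hex_dump(prefix, buf, len)
#else
/* Case 3: neither is set, so the call compiles to nothing. */
static inline void sketch_hex_dump_debug(const char *prefix,
                                         const void *buf, size_t len)
{
    (void)prefix;
    (void)buf;
    (void)len;
}
#endif

/* The wrapper only forwards to the debug variant, so it inherits case 3. */
void sketch_hex_dump_bytes(const char *prefix, const void *buf, size_t len)
{
    sketch_hex_dump_debug(prefix, buf, len);
}
```

Built without either macro, `sketch_hex_dump_bytes()` prints nothing; building with `-DSKETCH_DEBUG` restores the unconditional dump.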
      
      Link: http://lkml.kernel.org/r/20190816235624.115280-1-swboyd@chromium.org
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      kernel-doc: core-api: include string.h into core-api · 917cda27
      Joe Perches committed
      core-api should show all the various string functions including the newly
      added stracpy and stracpy_pad.
      
      Miscellanea:
      
      o Update the Returns: value for strscpy
      o fix a defect with %NUL)
      
      [joe@perches.com: correct return of -E2BIG descriptions]
        Link: http://lkml.kernel.org/r/29f998b4c1a9d69fbeae70500ba0daa4b340c546.1563889130.git.joe@perches.com
      Link: http://lkml.kernel.org/r/224a6ebf39955f4107c0c376d66155d970e46733.1563841972.git.joe@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Stephen Kitt <steve@sk2.org>
      Cc: Nitin Gote <nitin.r.gote@intel.com>
      Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
      Cc: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      augmented rbtree: rework the RB_DECLARE_CALLBACKS macro definition · 6d2052d1
      Michel Lespinasse committed
      Change the definition of the RBCOMPUTE function.  The propagate callback
      repeatedly calls RBCOMPUTE as it moves from leaf to root.  It wants to
      stop recomputing once the augmented subtree information doesn't change.
      This was previously checked using the == operator, but that only works
      when the augmented subtree information is a scalar field.  This commit
      modifies the RBCOMPUTE function so that it now sets the augmented subtree
      information instead of returning it, and returns a boolean value
      indicating if the propagate callback should stop.
      
      The motivation for this change is that I want to introduce augmented
      rbtree uses where the augmented data for the subtree is a struct instead
      of a scalar.
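
A simplified sketch of the reworked contract, using a plain parent-linked binary tree rather than the kernel's rbtree: the compute step stores the augmented value itself and returns a boolean saying whether propagation may stop, which also works when the augmented value is a struct that cannot be compared with `==`. All names here (`struct span`, `struct node`, `node_compute`, `node_propagate`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Augmented data that is a struct, not a scalar. */
struct span {
    long lo, hi;
};

struct node {
    struct node *parent, *left, *right;
    long start, last;       /* this node's own interval */
    struct span subtree;    /* hull of every interval in the subtree */
};

/*
 * Recompute n->subtree from n and its children.  Returns true when the
 * value was already up to date and exit_early allows stopping; otherwise
 * stores the new value and returns false.
 */
bool node_compute(struct node *n, bool exit_early)
{
    struct node *kids[2];
    struct span s;
    int i;

    assert(n != NULL);
    kids[0] = n->left;
    kids[1] = n->right;
    s.lo = n->start;
    s.hi = n->last;

    for (i = 0; i < 2; i++) {
        if (!kids[i])
            continue;
        if (kids[i]->subtree.lo < s.lo)
            s.lo = kids[i]->subtree.lo;
        if (kids[i]->subtree.hi > s.hi)
            s.hi = kids[i]->subtree.hi;
    }

    if (exit_early && s.lo == n->subtree.lo && s.hi == n->subtree.hi)
        return true;
    n->subtree = s;
    return false;
}

/* Walk toward the root until `stop`; returns how many nodes changed. */
int node_propagate(struct node *n, struct node *stop)
{
    int changed = 0;

    while (n != stop) {
        if (node_compute(n, true))
            break;
        changed++;
        n = n->parent;
    }
    return changed;
}
```

Passing `exit_early = false` forces a recomputation, which is what a caller would want when a node's children have just been rearranged and the cached value cannot be trusted.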
      
      Link: http://lkml.kernel.org/r/20190703040156.56953-4-walken@google.com
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      augmented rbtree: add new RB_DECLARE_CALLBACKS_MAX macro · 315cc066
      Michel Lespinasse committed
      Add RB_DECLARE_CALLBACKS_MAX, which generates augmented rbtree callbacks
      for the case where the augmented value is a scalar whose definition
      follows a max(f(node)) pattern.  This actually covers all present uses of
      RB_DECLARE_CALLBACKS, and saves some (source) code duplication in the
      various RBCOMPUTE function definitions.
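
The max(f(node)) pattern lends itself to macro generation, roughly as follows. This is only a sketch of the idea on a plain binary node: `SKETCH_DECLARE_MAX`, `struct vnode`, `vnode_gap`, and `vnode_compute_max` are hypothetical names, and the real RB_DECLARE_CALLBACKS_MAX takes more arguments and also generates the rotate and copy callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Generate a compute function for a scalar augmented field that holds
 * the maximum of COMPUTE(node) over the node's subtree.
 */
#define SKETCH_DECLARE_MAX(NAME, STRUCT, AUGMENTED, COMPUTE)    \
bool NAME(STRUCT *n, bool exit_early)                           \
{                                                               \
    STRUCT *kids[2];                                            \
    long max;                                                   \
    int i;                                                      \
                                                                \
    assert(n != NULL);                                          \
    kids[0] = n->left;                                          \
    kids[1] = n->right;                                         \
    max = COMPUTE(n);                                           \
    for (i = 0; i < 2; i++)                                     \
        if (kids[i] && kids[i]->AUGMENTED > max)                \
            max = kids[i]->AUGMENTED;                           \
    if (exit_early && n->AUGMENTED == max)                      \
        return true;                                            \
    n->AUGMENTED = max;                                         \
    return false;                                               \
}

struct vnode {
    struct vnode *left, *right;
    long gap;               /* per-node value, f(node) */
    long subtree_max_gap;   /* max of gap over the subtree */
};

#define vnode_gap(n) ((n)->gap)

/* One line replaces a hand-written compute function. */
SKETCH_DECLARE_MAX(vnode_compute_max, struct vnode, subtree_max_gap, vnode_gap)
```

Each user then supplies only the per-node expression, which is where the saving in duplicated source code comes from.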
      
      [walken@google.com: fix mm/vmalloc.c]
        Link: http://lkml.kernel.org/r/CANN689FXgK13wDYNh1zKxdipeTuALG4eKvKpsdZqKFJ-rvtGiQ@mail.gmail.com
      [walken@google.com: re-add check to check_augmented()]
        Link: http://lkml.kernel.org/r/20190727022027.GA86863@google.com
      Link: http://lkml.kernel.org/r/20190703040156.56953-3-walken@google.com
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      augmented rbtree: add comments for RB_DECLARE_CALLBACKS macro · 444b8a83
      Michel Lespinasse committed
      Patch series "make RB_DECLARE_CALLBACKS more generic", v3.
      
      These changes are intended to make the RB_DECLARE_CALLBACKS macro more
      generic (allowing the augmented subtree information to be a struct instead
      of a scalar).
      
      I have verified the compiled lib/interval_tree.o and mm/mmap.o files to
      check that they didn't change.  This held as expected for interval_tree.o;
      mmap.o did have some changes which could be reverted by marking
      __vma_link_rb as noinline.  I did not add such a change to the patchset; I
      felt it was reasonable enough to leave the inlining decision up to the
      compiler.
      
      This patch (of 3):
      
      Add a short comment summarizing the arguments to RB_DECLARE_CALLBACKS.
      The arguments are also now capitalized.  This copies the style of the
      INTERVAL_TREE_DEFINE macro.
      
      No functional changes in this commit, only comments and capitalization.
      
      Link: http://lkml.kernel.org/r/20190703040156.56953-2-walken@google.com
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: Davidlohr Bueso <dbueso@suse.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 25 Sep, 2019 22 commits