1. 15 May, 2019 40 commits
    • A
      userfaultfd: use RCU to free the task struct when fork fails · c3f3ce04
      Andrea Arcangeli committed
      The task structure is freed while get_mem_cgroup_from_mm() holds
      rcu_read_lock() and dereferences mm->owner.
      
        get_mem_cgroup_from_mm()                failing fork()
        ----                                    ---
        task = mm->owner
                                                mm->owner = NULL;
                                                free(task)
        if (task) *task; /* use after free */
      
      The fix consists in freeing the task with RCU also in the fork failure
      case, exactly like it always happens for the regular exit(2) path.  That
      is enough to make the rcu_read_lock hold in get_mem_cgroup_from_mm()
      (left side above) effective to avoid a use after free when dereferencing
      the task structure.
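
The effect of deferring the free can be illustrated with a single-threaded toy model. This is only a sketch under loud assumptions: every `toy_*` name is hypothetical, the "grace period" is a plain reader counter rather than the kernel's RCU machinery, and freeing is represented by a flag so the ordering can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's task_struct and mm_struct. */
struct toy_task {
	int  memcg_id;  /* the field a reader wants to look at */
	bool freed;     /* set instead of calling free(), so it can be observed */
};

struct toy_mm {
	struct toy_task *owner;
};

/* Toy "RCU": a count of active readers and one deferred-free slot. */
static int              toy_readers;
static struct toy_task *toy_pending;

/* Run the deferred free only once no reader is inside a read-side section. */
static void toy_run_pending(void)
{
	if (toy_readers == 0 && toy_pending != NULL) {
		toy_pending->freed = true;
		toy_pending = NULL;
	}
}

void toy_rcu_read_lock(void)
{
	toy_readers++;
}

void toy_rcu_read_unlock(void)
{
	assert(toy_readers > 0);
	toy_readers--;
	toy_run_pending();
}

/* Failing-fork path: clear mm->owner, then defer the free instead of freeing now. */
void toy_fork_failed(struct toy_mm *mm)
{
	assert(toy_pending == NULL);  /* the toy holds a single deferred free */
	toy_pending = mm->owner;
	mm->owner = NULL;
	toy_run_pending();
}
```

If a reader calls `toy_rcu_read_lock()` and loads `mm->owner` before `toy_fork_failed()` runs, the task stays valid until the matching `toy_rcu_read_unlock()`, which is the property the changelog relies on.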
      
      An alternate possible fix would be to defer the delivery of the
      userfaultfd contexts to the monitor until after fork() is guaranteed to
      succeed.  Such a change would require more changes because it would
      create a strict ordering dependency where the uffd methods would need to
      be called beyond the last potentially failing branch in order to be
      safe.  This solution, by contrast, only adds the dependency to common code
      to set mm->owner to NULL and to free the task struct that was pointed to by
      mm->owner with RCU, if fork ends up failing.  The userfaultfd methods
      can still be called anywhere during the fork runtime and the monitor
      will keep discarding orphaned "mm" coming from failed forks in userland.
      
      This race condition couldn't trigger if CONFIG_MEMCG was set =n at build
      time.
      
      [aarcange@redhat.com: improve changelog, reduce #ifdefs per Michal]
        Link: http://lkml.kernel.org/r/20190429035752.4508-1-aarcange@redhat.com
      Link: http://lkml.kernel.org/r/20190325225636.11635-2-aarcange@redhat.com
      Fixes: 893e26e6 ("userfaultfd: non-cooperative: Add fork() event")
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: zhong jiang <zhongjiang@huawei.com>
      Reported-by: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: zhong jiang <zhongjiang@huawei.com>
      Cc: syzbot+cbb52e396df3e565ab02@syzkaller.appspotmail.com
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c3f3ce04
    • A
      kernel/Makefile: don't assume that kernel/gen_ikh_data.sh is executable · acb2ec3d
      Andrew Morton committed
      If the user downloads and applies patch-5.1.gz using patch(1), the x bit
      on kernel/gen_ikh_data.sh is not set.
      
        /bin/sh: 1: ./kernel/gen_ikh_data.sh: Permission denied
      
      Fix this by using CONFIG_SHELL.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      acb2ec3d
    • L
      Merge tag 'backlight-next-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight · e0654264
      Linus Torvalds committed
      Pull backlight updates from Lee Jones:
       "Fix-ups:
         - Remove unused BACKLIGHT_LCD_SUPPORT symbol
         - Remove unused BACKLIGHT_CLASS_DEVICE dependencies
         - Add DT support to lm3630a_bl
      
        Bug Fixes:
         - Fix error path issues in lm3630a_bl"
      
      * tag 'backlight-next-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
        backlight: lm3630a: Add firmware node support
        dt-bindings: backlight: Add lm3630a bindings
        backlight: lm3630a: Return 0 on success in update_status functions
        video: lcd: Remove useless BACKLIGHT_CLASS_DEVICE dependencies
        video: backlight: Remove useless BACKLIGHT_LCD_SUPPORT kernel symbol
      e0654264
    • L
      Merge tag 'mfd-next-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · ebcf5bb2
      Linus Torvalds committed
      Pull MFD updates from Lee Jones:
       "Core Framework:
         - Document (kerneldoc) core mfd_add_devices() API
      
        New Drivers:
         - Altera SOCFPGA System Manager
         - Maxim MAX77650/77651 PMIC
         - Maxim MAX77663 PMIC
         - ST Multi-Function eXpander (STMFX)
      
        New Device Support:
         - LEDs support in Intel Cherry Trail Whiskey Cove PMIC
         - RTC support in SAMSUNG Electronics S2MPA01 PMIC
         - SAM9X60 support in Atmel HLCDC (High-end LCD Controller)
         - USB X-Powers AXP 8xx PMICs
         - Integrated Sensor Hub (ISH) in ChromeOS EC
         - USB PD Logger in ChromeOS EC
         - AXP223 in X-Powers AXP series PMICs
         - Power Supply in X-Powers AXP 803 PMICs
         - Comet Lake in Intel Low Power Subsystem
         - Fingerprint MCU in ChromeOS EC
         - Touchpad MCU in ChromeOS EC
         - Move TI LM3532 support to LED
      
        New Functionality:
         - max77650, max77620: Add/extend DT support
         - max77620 power-off
         - syscon clocking
         - cros_ec host sleep event
      
        Fix-ups:
         - Trivial; Formatting, spelling, etc; Kconfig, sec-core, ab8500-debugfs
         - Remove unused functionality; rk808, da9063-*
         - SPDX conversion; da9063-*, atmel-*
         - Adapt/add new register definitions; cs47l35-tables, cs47l90-tables, imx6q-iomuxc-gpr
         - Fix-up DT bindings; ti-lmu, cirrus,lochnagar
         - Simplify obtaining driver data; ssbi, t7l66xb, tc6387xb, tc6393xb
      
        Bug Fixes:
         - Fix incorrect defined values; max77620, da9063
         - Fix device initialisation; twl6040
         - Reset device on init; intel-lpss
         - Fix build warnings when !OF; sun6i-prcm
         - Register OF match tables; tps65912-spi
         - Fix DMI matching; intel_quark_i2c_gpio"
      
      * tag 'mfd-next-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (65 commits)
        mfd: Use dev_get_drvdata() directly
        mfd: cros_ec: Instantiate properly CrOS Touchpad MCU device
        mfd: cros_ec: Instantiate properly CrOS FP MCU device
        mfd: cros_ec: Update the EC feature codes
        mfd: intel-lpss: Add Intel Comet Lake PCI IDs
        mfd: lochnagar: Add links to binding docs for sound and hwmon
        mfd: ab8500-debugfs: Fix a typo ("deubgfs")
        mfd: imx6sx: Add MQS register definition for iomuxc gpr
        dt-bindings: mfd: LMU: Fix lm3632 dt binding example
        mfd: intel_quark_i2c_gpio: Adjust IOT2000 matching
        mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L
        mfd: tps65912-spi: Add missing of table registration
        mfd: axp20x: Add USB power supply mfd cell to AXP803
        mfd: sun6i-prcm: Fix build warning for non-OF configurations
        mfd: intel-lpss: Set the device in reset state when init
        platform/chrome: Add support for v1 of host sleep event
        mfd: cros_ec: Add host_sleep_event_v1 command
        mfd: cros_ec: Instantiate the CrOS USB PD logger driver
        mfd: cs47l90: Make DAC_AEC_CONTROL_2 readable
        mfd: cs47l35: Make DAC_AEC_CONTROL_2 readable
        ...
      ebcf5bb2
    • L
      Merge tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 414147d9
      Linus Torvalds committed
      Pull PCI updates from Bjorn Helgaas:
       "Enumeration changes:
      
         - Add _HPX Type 3 settings support, which gives firmware more
           influence over device configuration (Alexandru Gagniuc)
      
         - Support fixed bus numbers from bridge Enhanced Allocation
           capabilities (Subbaraya Sundeep)
      
         - Add "external-facing" DT property to identify cases where we
           require IOMMU protection against untrusted devices (Jean-Philippe
           Brucker)
      
         - Enable PCIe services for host controller drivers that use managed
           host bridge alloc (Jean-Philippe Brucker)
      
         - Log PCIe port service messages with pci_dev, not the pcie_device
           (Frederick Lawler)
      
         - Convert pciehp from pciehp_debug module parameter to generic
           dynamic debug (Frederick Lawler)
      
        Peer-to-peer DMA:
      
         - Add whitelist of Root Complexes that support peer-to-peer DMA
           between Root Ports (Christian König)
      
        Native controller drivers:
      
         - Add PCI host bridge DMA ranges for bridges that can't DMA
           everywhere, e.g., iProc (Srinath Mannam)
      
         - Add Amazon Annapurna Labs PCIe host controller driver (Jonathan
           Chocron)
      
         - Fix Tegra MSI target allocation so DMA doesn't generate unwanted
           MSIs (Vidya Sagar)
      
         - Fix of_node reference leaks (Wen Yang)
      
         - Fix Hyper-V module unload & device removal issues (Dexuan Cui)
      
         - Cleanup R-Car driver (Marek Vasut)
      
         - Cleanup Keystone driver (Kishon Vijay Abraham I)
      
         - Cleanup i.MX6 driver (Andrey Smirnov)
      
        Significant bug fixes:
      
         - Reset Lenovo ThinkPad P50 GPU so nouveau works after reboot (Lyude
           Paul)
      
         - Fix Switchtec firmware update performance issue (Wesley Sheng)
      
         - Work around Pericom switch link retraining erratum (Stefan Mätje)"
      
      * tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (141 commits)
        MAINTAINERS: Add Karthikeyan Mitran and Hou Zhiqiang for Mobiveil PCI
        PCI: pciehp: Remove pointless MY_NAME definition
        PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition
        PCI: pciehp: Remove unused dbg/err/info/warn() wrappers
        PCI: pciehp: Log messages with pci_dev, not pcie_device
        PCI: pciehp: Replace pciehp_debug module param with dyndbg
        PCI: pciehp: Remove pciehp_debug uses
        PCI/AER: Log messages with pci_dev, not pcie_device
        PCI/DPC: Log messages with pci_dev, not pcie_device
        PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()
        PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info()
        PCI: Replace dev_printk(KERN_DEBUG) with dev_info(), etc
        PCI: Replace printk(KERN_INFO) with pr_info(), etc
        PCI: Use dev_printk() when possible
        PCI: Cleanup setup-bus.c comments and whitespace
        PCI: imx6: Allow asynchronous probing
        PCI: dwc: Save root bus for driver remove hooks
        PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code
        PCI: dwc: Free MSI in dw_pcie_host_init() error path
        PCI: dwc: Free MSI IRQ page in dw_pcie_free_msi()
        ...
      414147d9
    • L
      Merge branch 'akpm' (patches from Andrew) · 318222a3
      Linus Torvalds committed
      Merge misc updates from Andrew Morton:
      
       - a few misc things and hotfixes
      
       - ocfs2
      
       - almost all of MM
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (139 commits)
        kernel/memremap.c: remove the unused device_private_entry_fault() export
        mm: delete find_get_entries_tag
        mm/huge_memory.c: make __thp_get_unmapped_area static
        mm/mprotect.c: fix compilation warning because of unused 'mm' variable
        mm/page-writeback: introduce tracepoint for wait_on_page_writeback()
        mm/vmscan: simplify trace_reclaim_flags and trace_shrink_flags
        mm/Kconfig: update "Memory Model" help text
        mm/vmscan.c: don't disable irq again when count pgrefill for memcg
        mm: memblock: make keeping memblock memory opt-in rather than opt-out
        hugetlbfs: always use address space in inode for resv_map pointer
        mm/z3fold.c: support page migration
        mm/z3fold.c: add structure for buddy handles
        mm/z3fold.c: improve compression by extending search
        mm/z3fold.c: introduce helper functions
        mm/page_alloc.c: remove unnecessary parameter in rmqueue_pcplist
        mm/hmm: add ARCH_HAS_HMM_MIRROR ARCH_HAS_HMM_DEVICE Kconfig
        mm/vmscan.c: simplify shrink_inactive_list()
        fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback
        xen/privcmd-buf.c: convert to use vm_map_pages_zero()
        xen/gntdev.c: convert to use vm_map_pages()
        ...
      318222a3
    • C
      kernel/memremap.c: remove the unused device_private_entry_fault() export · 640be2d1
      Christoph Hellwig committed
      This export has been entirely unused since it was added more than 1 1/2
      years ago.
      
      Link: http://lkml.kernel.org/r/20190429115535.12793-1-hch@lst.de
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      640be2d1
    • M
      mm: delete find_get_entries_tag · a1b8e6ab
      Matthew Wilcox (Oracle) committed
      I removed the only user of this and hadn't noticed it was now unused.
      
      Link: http://lkml.kernel.org/r/20190430152929.21813-1-willy@infradead.org
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Ross Zwisler <zwisler@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a1b8e6ab
    • B
      mm/huge_memory.c: make __thp_get_unmapped_area static · b3b07077
      Bharath Vedartham committed
      __thp_get_unmapped_area is only used in mm/huge_memory.c.  Make it static.
      Tested by building and booting the kernel.
      
      Link: http://lkml.kernel.org/r/20190504102353.GA22525@bharath12345-Inspiron-5559
      Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b3b07077
    • M
      mm/mprotect.c: fix compilation warning because of unused 'mm' variable · 94393c78
      Mike Rapoport committed
      Since 0cbe3e26 ("mm: update ptep_modify_prot_start/commit to take
      vm_area_struct as arg") the only place that uses the local 'mm' variable
      in change_pte_range() is the call to set_pte_at().
      
      Many architectures define set_pte_at() as macro that does not use the 'mm'
      parameter, which generates the following compilation warning:
      
       CC      mm/mprotect.o
      mm/mprotect.c: In function 'change_pte_range':
      mm/mprotect.c:42:20: warning: unused variable 'mm' [-Wunused-variable]
        struct mm_struct *mm = vma->vm_mm;
                          ^~
      
      Fix it by passing vma->vm_mm to set_pte_at() and dropping the local 'mm'
      variable in change_pte_range().
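
The warning and its fix can be reproduced outside the kernel with a small sketch. The types below are hypothetical, heavily simplified stand-ins, and `toy_change_pte()` is an illustrative name rather than the kernel's change_pte_range(); only the shape of the problem (a macro that never expands its `mm` argument) is taken from the changelog.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel types. */
typedef unsigned long pte_t;
struct mm_struct { int placeholder; };
struct vm_area_struct { struct mm_struct *vm_mm; };

/*
 * Modelled on architectures where set_pte_at() is a macro that never
 * expands its 'mm' argument.
 */
#define set_pte_at(mm, addr, ptep, pte) ((void)(addr), *(ptep) = (pte))

/*
 * Before the fix the body looked roughly like:
 *
 *     struct mm_struct *mm = vma->vm_mm;
 *     set_pte_at(mm, addr, ptep, newpte);
 *
 * After preprocessing, 'mm' is never referenced, hence -Wunused-variable.
 */
void toy_change_pte(struct vm_area_struct *vma, unsigned long addr,
		    pte_t *ptep, pte_t newpte)
{
	assert(vma != NULL && ptep != NULL);
	/* After the fix: no local variable, so nothing is left unused. */
	set_pte_at(vma->vm_mm, addr, ptep, newpte);
}
```

Passing the expression directly works for both kinds of architecture: where set_pte_at() is a real function the argument is evaluated, and where it is a macro the argument simply disappears without leaving a dead local behind.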
      
      [liu.song.a23@gmail.com: fix missed conversions]
        Link: http://lkml.kernel.org/r/CAPhsuW6wcQgYLHNdBdw6m0YiR4RWsS4XzfpSKU7wBLLeOCTbpw@mail.gmail.com
      Link: http://lkml.kernel.org/r/1557305432-4940-1-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      94393c78
    • Y
      mm/page-writeback: introduce tracepoint for wait_on_page_writeback() · 19343b5b
      Yafang Shao committed
      Recently there have been some hung tasks on our server due to
      wait_on_page_writeback(), and we want to know the details of this
      PG_writeback, i.e. which device this page is being written back to.  But
      it is not so convenient to get the details.
      
      I think it would be better to introduce a tracepoint for diagnosing the
      writeback details.
      
      Link: http://lkml.kernel.org/r/1556274402-19018-1-git-send-email-laoar.shao@gmail.com
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      19343b5b
    • Y
      mm/vmscan: simplify trace_reclaim_flags and trace_shrink_flags · 60b62ff7
      Yafang Shao committed
      trace_reclaim_flags and trace_shrink_flags are almost the same.
      We can simplify them to avoid redundant code.
      
      Link: http://lkml.kernel.org/r/1556169203-5858-1-git-send-email-laoar.shao@gmail.com
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      60b62ff7
    • M
      mm/Kconfig: update "Memory Model" help text · d66d109d
      Mike Rapoport committed
      The help describing the memory model selection is outdated.  It still says
      that SPARSEMEM is experimental and DISCONTIGMEM is preferred over
      SPARSEMEM.
      
      Update the help text for the relevant options:
      * add a generic help for the "Memory Model" prompt
      * add description for FLATMEM
      * reduce the description of DISCONTIGMEM and add a deprecation note
      * prefer SPARSEMEM over DISCONTIGMEM
      
      Link: http://lkml.kernel.org/r/1556188531-20728-1-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d66d109d
    • Y
      mm/vmscan.c: don't disable irq again when count pgrefill for memcg · 2fa2690c
      Yafang Shao committed
      We can use __count_memcg_events() directly because this callsite is already
      protected by spin_lock_irq().
      
      Link: http://lkml.kernel.org/r/1556093494-30798-1-git-send-email-laoar.shao@gmail.com
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2fa2690c
    • M
      mm: memblock: make keeping memblock memory opt-in rather than opt-out · 350e88ba
      Mike Rapoport committed
      Most architectures do not need the memblock memory after the page
      allocator is initialized, but only few enable ARCH_DISCARD_MEMBLOCK in the
      arch Kconfig.
      
      Replacing ARCH_DISCARD_MEMBLOCK with ARCH_KEEP_MEMBLOCK and inverting the
      logic makes it clear which architectures actually use memblock after
      system initialization and skips the necessity to add ARCH_DISCARD_MEMBLOCK
      to the architectures that are still missing that option.
      
      Link: http://lkml.kernel.org/r/1556102150-32517-1-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      350e88ba
    • M
      hugetlbfs: always use address space in inode for resv_map pointer · f27a5136
      Mike Kravetz committed
      Continuing discussion about 58b6e5e8 ("hugetlbfs: fix memory leak for
      resv_map") brought up the issue that inode->i_mapping may not point to the
      address space embedded within the inode at inode eviction time.  The
      hugetlbfs truncate routine handles this by explicitly using inode->i_data.
      However, code cleaning up the resv_map will still use the address space
      pointed to by inode->i_mapping.  Luckily, private_data is NULL for address
      spaces in all such cases today but, there is no guarantee this will
      continue.
      
      Change all hugetlbfs code getting a resv_map pointer to explicitly get it
      from the address space embedded within the inode.  In addition, add more
      comments in the code to indicate why this is being done.
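
The difference between the two lookups can be sketched with toy structures. The field names `i_mapping`, `i_data` and `private_data` follow the changelog; the struct layouts and both helper names are hypothetical simplifications, not the hugetlbfs code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct resv_map { int regions; };
struct address_space { void *private_data; };
struct inode {
	struct address_space *i_mapping; /* may point at some other address space */
	struct address_space  i_data;    /* always embedded in this inode */
};

/* Fragile lookup: follows i_mapping, wherever it currently points. */
struct resv_map *toy_resv_map_via_mapping(struct inode *inode)
{
	assert(inode != NULL && inode->i_mapping != NULL);
	return inode->i_mapping->private_data;
}

/* Robust lookup: always uses the address space embedded in the inode. */
struct resv_map *toy_resv_map_via_data(struct inode *inode)
{
	assert(inode != NULL);
	return inode->i_data.private_data;
}
```

When `i_mapping` points at the embedded `i_data` the two helpers agree; they diverge only in the case the changelog worries about, where `i_mapping` has been redirected to a different address space.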
      
      Link: http://lkml.kernel.org/r/20190419204435.16984-1-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reported-by: Yufen Yu <yuyufen@huawei.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f27a5136
    • V
      mm/z3fold.c: support page migration · 1f862989
      Vitaly Wool committed
      Now that we are not using page address in handles directly, we can make
      z3fold pages movable to decrease the memory fragmentation z3fold may
      create over time.
      
      This patch starts advertising non-headless z3fold pages as movable and
      uses the existing kernel infrastructure to implement moving of such pages
      per memory management subsystem's request.  It thus implements 3 required
      callbacks for page migration:
      
      * isolation callback: z3fold_page_isolate(): try to isolate the page by
        removing it from all lists.  Pages scheduled for some activity and
        mapped pages will not be isolated.  Return true if isolation was
        successful or false otherwise
      
      * migration callback: z3fold_page_migrate(): re-check critical
        conditions and migrate page contents to the new page provided by the
        memory subsystem.  Returns 0 on success or negative error code otherwise
      
      * putback callback: z3fold_page_putback(): put back the page if
        z3fold_page_migrate() for it failed permanently (i.e. not with
        -EAGAIN code).
      
      [lkp@intel.com: z3fold_page_isolate() can be static]
        Link: http://lkml.kernel.org/r/20190419130924.GA161478@ivb42
      Link: http://lkml.kernel.org/r/20190417103922.31253da5c366c4ebe0419cfc@gmail.com
      Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
      Signed-off-by: kbuild test robot <lkp@intel.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1f862989
    • V
      mm/z3fold.c: add structure for buddy handles · 7c2b8baa
      Vitaly Wool committed
      For z3fold to be able to move its pages per request of the memory
      subsystem, it should not use direct object addresses in handles.  Instead,
      it will create abstract handles (3 per page) which will contain pointers
      to z3fold objects.  Thus, it will be possible to change these pointers
      when z3fold page is moved.
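
The indirection described above can be sketched in a few lines of userspace C. All names (`toy_page`, `toy_slots`, `toy_handle`, `toy_move_page`) and the 64-byte page size are hypothetical; this only illustrates why a handle that points at a slot survives a move, and does not reproduce z3fold's actual encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUDDIES   3   /* "3 per page", as in the changelog */
#define TOY_PAGE_SIZE 64  /* arbitrary small size for the sketch */

/* Per-page slot table holding the current object addresses. */
struct toy_slots {
	char *addr[TOY_BUDDIES];
};

struct toy_page {
	char data[TOY_PAGE_SIZE];
	struct toy_slots *slots;
};

/* A handle points at a slot, not at the object itself. */
typedef char **toy_handle;

toy_handle toy_alloc(struct toy_page *page, int buddy, size_t offset)
{
	assert(buddy >= 0 && buddy < TOY_BUDDIES && offset < TOY_PAGE_SIZE);
	page->slots->addr[buddy] = page->data + offset;
	return &page->slots->addr[buddy];
}

/* Resolve a handle to the object's current address. */
char *toy_map(toy_handle handle)
{
	return *handle;
}

/* Move contents to a new page and retarget the slots; handles stay valid. */
void toy_move_page(struct toy_page *dst, struct toy_page *src)
{
	memcpy(dst->data, src->data, TOY_PAGE_SIZE);
	dst->slots = src->slots;
	for (int i = 0; i < TOY_BUDDIES; i++) {
		char *old = dst->slots->addr[i];

		if (old != NULL)
			dst->slots->addr[i] = dst->data + (old - src->data);
	}
	src->slots = NULL;
}
```

A caller keeps only the `toy_handle`; after `toy_move_page()` the same handle resolves through `toy_map()` to the object's new location, which would not be possible if the handle had encoded the old page address.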
      
      Link: http://lkml.kernel.org/r/20190417103826.484eaf18c1294d682769880f@gmail.com
      Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7c2b8baa
    • V
      mm/z3fold.c: improve compression by extending search · 351618b2
      Vitaly Wool committed
      The current z3fold implementation only searches this CPU's page lists for
      a fitting page to put a new object into.  This patch adds quick search for
      very well fitting pages (i.e. those having exactly the required amount
      of free space) on other CPUs too, before allocating a new page for that
      object.
      
      Link: http://lkml.kernel.org/r/20190417103733.72ae81abe1552397c95a008e@gmail.com
      Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      351618b2
    • V
      mm/z3fold.c: introduce helper functions · 9050cce1
      Vitaly Wool committed
      Patch series "z3fold: support page migration", v2.
      
      This patchset implements page migration support and slightly better buddy
      search.  To implement page migration support, z3fold has to move away from
      the current scheme of handle encoding, i.e. stop encoding page address
      in handles.  Instead, a small per-page structure is created which will
      contain actual addresses for z3fold objects, while pointers to fields of
      that structure will be used as handles.
      
      Thus, it will be possible to change the underlying addresses to reflect
      page migration.
      
      To support migration itself, 3 callbacks will be implemented:
      
      1: isolation callback: z3fold_page_isolate(): try to isolate the page
         by removing it from all lists.  Pages scheduled for some activity and
         mapped pages will not be isolated.  Return true if isolation was
         successful or false otherwise
      
      2: migration callback: z3fold_page_migrate(): re-check critical
         conditions and migrate page contents to the new page provided by the
         system.  Returns 0 on success or negative error code otherwise
      
      3: putback callback: z3fold_page_putback(): put back the page if
         z3fold_page_migrate() for it failed permanently (i.e. not with
         -EAGAIN code).
      
      To make sure an isolated page doesn't get freed, its kref is incremented
      in z3fold_page_isolate() and decremented during post-migration compaction,
      if migration was successful, or by z3fold_page_putback() in the other
      case.
      
      Since the new handle encoding scheme implies slight memory consumption
      increase, better buddy search (which decreases memory consumption) is
      included in this patchset.
      
      This patch (of 4):
      
      Introduce a separate helper function for object allocation, as well as 2
      smaller helpers to add a buddy to the list and to get a pointer to the
      pool from the z3fold header.  No functional changes here.
      
      Link: http://lkml.kernel.org/r/20190417103633.a4bb770b5bf0fb7e43ce1666@gmail.com
      Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9050cce1
    • Y
      mm/page_alloc.c: remove unnecessary parameter in rmqueue_pcplist · 1c52e6d0
      Yafang Shao committed
      Because rmqueue_pcplist() is only called when order is 0, we don't need to
      use order as a parameter.
      
      Link: http://lkml.kernel.org/r/1555591709-11744-1-git-send-email-laoar.shao@gmail.com
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Pankaj Gupta <pagupta@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1c52e6d0
    • J
      mm/hmm: add ARCH_HAS_HMM_MIRROR ARCH_HAS_HMM_DEVICE Kconfig · 2c8fc3dc
      Jérôme Glisse committed
      Add 2 new Kconfig variables that are not used by anyone.  I checked that
      various make ARCH=somearch allmodconfig do work and do not complain.  This
      new Kconfig needs to be added first so that device drivers that depend on
      HMM can be updated.
      
      Once drivers are updated then I can update the HMM Kconfig to depend on
      this new Kconfig in a followup patch.
      
      This is about solving Kconfig for HMM: given that device drivers go
      through their own trees, we want to avoid changing them from the mm
      tree.  So the plan is:
      
      1 - Kernel release N add the new Kconfig to mm/Kconfig (this patch)
      2 - Kernel release N+1 update driver to depend on new Kconfig, i.e.
          stop using ARCH_HAS_HMM and start using ARCH_HAS_HMM_MIRROR
          and ARCH_HAS_HMM_DEVICE (one or the other or both depending
          on the driver)
      3 - Kernel release N+2 remove ARCH_HAS_HMM and do final Kconfig
          update in mm/Kconfig
      
      Link: http://lkml.kernel.org/r/20190417211141.17580-1-jglisse@redhat.com
      Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2c8fc3dc
    • K
      mm/vmscan.c: simplify shrink_inactive_list() · f46b7912
      Kirill Tkhai committed
      This merges together duplicated patterns of code.  Also, replace
      count_memcg_events() with its irq-careless namesake, because they are
      already called in interrupts disabled context.
      
      Link: http://lkml.kernel.org/r/2ece1df4-2989-bc9b-6172-61e9fdde5bfd@virtuozzo.com
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f46b7912
    • A
      fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback · c553ea4f
      Amir Goldstein committed
      23d01270 ("fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE
      writeback") claims that sync_file_range(2) syscall was "created for
      userspace to be able to issue background writeout and so waiting for
      in-flight IO is undesirable there" and changes the writeback (back) to
      WB_SYNC_NONE.
      
      This claim is only partially true.  It is true for users that use the flag
      SYNC_FILE_RANGE_WRITE by itself, as does PostgreSQL, the user that was the
      reason for changing to WB_SYNC_NONE writeback.
      
      However, that claim is not true for users that use the flag combination
      SYNC_FILE_RANGE_{WAIT_BEFORE|WRITE|WAIT_AFTER}.  Those users explicitly
      requested to wait for in-flight IO as well as writeback of dirty pages.
      
      Re-brand that flag combination as SYNC_FILE_RANGE_WRITE_AND_WAIT and use
      WB_SYNC_ALL writeback to perform the full range sync request.
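      
      As an illustration from the userspace side, the two usage patterns
      discussed above can be sketched as follows.  This is a minimal sketch,
      not code from the patch: the helper names are hypothetical, and the flag
      combination is spelled out by hand so that it also builds against headers
      that do not yet define SYNC_FILE_RANGE_WRITE_AND_WAIT.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>

/*
 * The three-flag combination described above, OR-ed together by hand.
 * Headers that carry this change may expose the same value under the
 * name SYNC_FILE_RANGE_WRITE_AND_WAIT.
 */
unsigned int write_and_wait_flags(void)
{
	return SYNC_FILE_RANGE_WAIT_BEFORE |
	       SYNC_FILE_RANGE_WRITE |
	       SYNC_FILE_RANGE_WAIT_AFTER;
}

/*
 * Hypothetical helper: only start writeback of dirty pages in the range
 * and return without waiting (the SYNC_FILE_RANGE_WRITE-only use case).
 */
int flush_range_async(int fd, off_t offset, off_t nbytes)
{
	return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
}

/*
 * Hypothetical helper: wait for in-flight IO, write back dirty pages,
 * then wait for that writeback too (the full range sync request).
 */
int flush_range_and_wait(int fd, off_t offset, off_t nbytes)
{
	return sync_file_range(fd, offset, nbytes, write_and_wait_flags());
}
```

      Both helpers return 0 on success and -1 with errno set on failure, as
      sync_file_range(2) does.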
      
      Link: http://lkml.kernel.org/r/20190409114922.30095-1-amir73il@gmail.com
      Link: http://lkml.kernel.org/r/20190419072938.31320-1-amir73il@gmail.com
      Fixes: 23d01270 ("fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE")
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Acked-by: Jan Kara <jack@suse.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c553ea4f
    • S
      xen/privcmd-buf.c: convert to use vm_map_pages_zero() · 53269057
      Souptick Joarder committed
      Convert to use vm_map_pages_zero() to map range of kernel memory to user
      vma.
      
      This driver has ignored vm_pgoff.  We could later "fix" these drivers to
      behave according to the normal vm_pgoff offsetting simply by removing the
      _zero suffix on the function name and if that causes regressions, it gives
      us an easy way to revert.
      
      Link: http://lkml.kernel.org/r/acf678e81d554d01a9b590716ac0ccbdcdf71c25.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      53269057
    • S
      xen/gntdev.c: convert to use vm_map_pages() · df9bde01
      Souptick Joarder committed
      Convert to use vm_map_pages() to map range of kernel memory to user vma.
      
      map->count is passed to vm_map_pages(), and the internal API verifies
      map->count against count (count = vma_pages(vma)) to catch page array
      boundary overruns.
      
      Link: http://lkml.kernel.org/r/88e56e82d2db98705c2d842e9c9806c00b366d67.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      df9bde01
    • S
      videobuf2/videobuf2-dma-sg.c: convert to use vm_map_pages() · a17ae147
      Souptick Joarder committed
      Convert to use vm_map_pages() to map range of kernel memory to user vma.
      
      vm_pgoff is treated in the V4L2 API as a 'cookie' to select a buffer, not
      as an in-buffer offset, by design, and it always wants to mmap a whole
      buffer from its beginning.
      
      Link: http://lkml.kernel.org/r/a953fe6b3056de1cc6eab654effdd4a22f125375.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Suggested-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a17ae147
    • S
      iommu/dma-iommu.c: convert to use vm_map_pages() · b0d0084f
      Souptick Joarder committed
      Convert to use vm_map_pages() to map range of kernel memory to user vma.
      
      Link: http://lkml.kernel.org/r/80c3d220fc6ada73a88ce43ca049afb55a889258.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b0d0084f
    • S
      drm/xen/xen_drm_front_gem.c: convert to use vm_map_pages() · e60b72b1
      Souptick Joarder committed
      Convert to use vm_map_pages() to map range of kernel memory to user vma.
      
      Link: http://lkml.kernel.org/r/ff8e10ba778d79419c66ee8215bccf01560540fd.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e60b72b1
    • S
      drm/rockchip/rockchip_drm_gem.c: convert to use vm_map_pages() · 2f69b3c8
      Souptick Joarder committed
      Convert to use vm_map_pages() to map range of kernel memory to user vma.
      
      Tested on Rockchip hardware and display is working, including talking to
      Lima via prime.
      
      Link: http://lkml.kernel.org/r/7ba359eb1aceac388d05983c1f29b915bdf291f9.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Tested-by: Heiko Stuebner <heiko@sntech.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2f69b3c8
    • S
      drivers/firewire/core-iso.c: convert to use vm_map_pages_zero() · 22660db8
      Souptick Joarder committed
      Convert to use vm_map_pages_zero() to map range of kernel memory to user
      vma.
      
      This driver has ignored vm_pgoff and mapped the entire pages.  We could
      later "fix" these drivers to behave according to the normal vm_pgoff
      offsetting simply by removing the _zero suffix on the function name and if
      that causes regressions, it gives us an easy way to revert.
      
      Link: http://lkml.kernel.org/r/88645f5ea8202784a8baaf389e592aeb8c505e8e.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      22660db8
    • S
      arm: mm: dma-mapping: convert to use vm_map_pages() · 6248461d
      Souptick Joarder committed
      Convert to use vm_map_pages() to map range of kernel memory to user vma.
      
      Link: http://lkml.kernel.org/r/936e5e107c746a7310e3a3c471188ca3ac8f9754.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6248461d
    • S
      mm: introduce new vm_map_pages() and vm_map_pages_zero() API · a667d745
      Souptick Joarder committed
      Patch series "mm: Use vm_map_pages() and vm_map_pages_zero() API", v5.
      
      This patch (of 5):
      
      Previously, drivers had their own way of mapping a range of kernel
      pages/memory into a user vma, and this was done by invoking
      vm_insert_page() within a loop.
      
      As this pattern is common across different drivers, it can be generalized
      by creating new functions and using them across the drivers.
      
      vm_map_pages() is the API which can be used to map kernel memory/pages in
      drivers which have considered vm_pgoff.
      
      vm_map_pages_zero() is the API which can be used to map a range of kernel
      memory/pages in drivers which have not considered vm_pgoff.  vm_pgoff is
      passed as default 0 for those drivers.
      
      We _could_ then at a later time "fix" these drivers which are using
      vm_map_pages_zero() to behave according to the normal vm_pgoff offsetting
      simply by removing the _zero suffix on the function name and if that
      causes regressions, it gives us an easy way to revert.
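      
      The difference between the two variants can be illustrated with a small
      userspace model.  This is only a sketch under stated assumptions, not the
      kernel implementation: "pages" are plain ints, "mapping" is a copy into
      an output array, all function names are hypothetical, and the -ENXIO
      return value for an out-of-range request is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Userspace model of the offset-aware variant: map vma_pages entries
 * starting at index pgoff of a page array that holds num entries.
 */
int model_map_pages(const int *pages, unsigned long num,
		    unsigned long pgoff, unsigned long vma_pages, int *out)
{
	unsigned long i;

	/* The requested offset lies beyond the end of the page array. */
	if (pgoff >= num)
		return -ENXIO;

	/* The requested size would overrun the remaining pages. */
	if (vma_pages > num - pgoff)
		return -ENXIO;

	/* Stand-in for the per-page insert loop drivers used to open-code. */
	for (i = 0; i < vma_pages; i++)
		out[i] = pages[pgoff + i];

	return 0;
}

/* Model of the _zero variant: the offset is ignored and treated as 0. */
int model_map_pages_zero(const int *pages, unsigned long num,
			 unsigned long vma_pages, int *out)
{
	return model_map_pages(pages, num, 0, vma_pages, out);
}
```

      In this model, switching a caller from model_map_pages_zero() to
      model_map_pages() is the one-line change the text describes: the same
      bounds checks apply, but the offset starts to matter.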
      
      Tested on Rockchip hardware and display is working, including talking to
      Lima via prime.
      
      Link: http://lkml.kernel.org/r/751cb8a0f4c3e67e95c58a3b072937617f338eea.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Suggested-by: Russell King <linux@armlinux.org.uk>
      Suggested-by: Matthew Wilcox <willy@infradead.org>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Tested-by: Heiko Stuebner <heiko@sntech.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a667d745
    • B
      mm: remove redundant 'default n' from Kconfig-s · 62afcd1c
      Bartlomiej Zolnierkiewicz committed
      'default n' is the default value for any bool or tristate Kconfig
      setting so there is no need to write it explicitly.
      
      Also since commit f467c564 ("kconfig: only write '# CONFIG_FOO
      is not set' for visible symbols") the Kconfig behavior is the same
      regardless of 'default n' being present or not:
      
          ...
          One side effect of (and the main motivation for) this change is making
          the following two definitions behave exactly the same:
      
              config FOO
                      bool
      
              config FOO
                      bool
                      default n
      
          With this change, neither of these will generate a
          '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
          That might make it clearer to people that a bare 'default n' is
          redundant.
          ...
      
      Link: http://lkml.kernel.org/r/c3385916-e4d4-37d3-b330-e6b7dff83a52@samsung.com
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      62afcd1c
    • J
      mm: fix false-positive OVERCOMMIT_GUESS failures · 8c7829b0
      Johannes Weiner committed
      With the default overcommit==guess we occasionally run into mmap
      rejections despite plenty of memory that would get dropped under
      pressure but just isn't accounted reclaimable. One example of this is
      dying cgroups pinned by some page cache. A previous case was auxiliary
      path name memory associated with dentries; we have since annotated
      those allocations to avoid overcommit failures (see d79f7aa4 ("mm:
      treat indirectly reclaimable memory as free in overcommit logic")).
      
      But trying to classify all allocated memory reliably as reclaimable
      and unreclaimable is a bit of a fool's errand. There could be a myriad
      of dependencies that constantly change with kernel versions.
      
      It becomes even more questionable of an effort when considering how
      this estimate of available memory is used: it's not compared to the
      system-wide allocated virtual memory in any way. It's not even
      compared to the allocating process's address space. It's compared to
      the single allocation request at hand!
      
      So we have an elaborate left-hand side of the equation that tries to
      assess the exact breathing room the system has available down to a
      page - and then compare it to an isolated allocation request with no
      additional context. We could fail an allocation of N bytes, but for
      two allocations of N/2 bytes we'd do this elaborate dance twice in a
      row and then still let N bytes of virtual memory through. This doesn't
      make a whole lot of sense.
      
      Let's take a step back and look at the actual goal of the
      heuristic. From the documentation:
      
         Heuristic overcommit handling. Obvious overcommits of address
         space are refused. Used for a typical system. It ensures a
         seriously wild allocation fails while allowing overcommit to
         reduce swap usage.  root is allowed to allocate slightly more
         memory in this mode. This is the default.
      
      If all we want to do is catch clearly bogus allocation requests
      irrespective of the general virtual memory situation, the physical
      memory counter-part doesn't need to be that complicated, either.
      
      When in GUESS mode, catch wild allocations by comparing their request
      size to total amount of ram and swap in the system.
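      
      The simplified check can be modelled in a few lines.  This is a userspace
      sketch with hypothetical names, not the kernel code; it only shows that
      each request is compared on its own against the total of RAM and swap.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the GUESS-mode check: a request is refused only
 * when it is larger than all RAM plus all swap in the system.  All
 * quantities are in pages.
 */
bool guess_allows(unsigned long request_pages,
		  unsigned long total_ram_pages,
		  unsigned long total_swap_pages)
{
	return request_pages <= total_ram_pages + total_swap_pages;
}
```

      With 64 pages of RAM and 64 pages of swap in this model, a single
      request of 200 pages is refused while two requests of 100 pages each
      are both allowed, since the check has no context beyond the request at
      hand.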
      
      Link: http://lkml.kernel.org/r/20190412191418.26333-1-hannes@cmpxchg.org
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Roman Gushchin <guro@fb.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8c7829b0
    • D
      mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail · ac5c9426
      David Hildenbrand committed
      All callers of arch_remove_memory() ignore errors.  And we should really
      try to remove any errors from the memory removal path.  No more errors are
      reported from __remove_pages().  BUG() in s390x code in case
      arch_remove_memory() is triggered.  We may implement that properly later.
      WARN in case powerpc code failed to remove the section mapping, which is
      better than ignoring the error completely right now.
      
      Link: http://lkml.kernel.org/r/20190409100148.24703-5-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ac5c9426
    • D
      mm/memory_hotplug: make __remove_section() never fail · 9d1d887d
      David Hildenbrand committed
      Let's just warn in case a section is not valid instead of failing to
      remove somewhere in the middle of the process, returning an error that
      will be mostly ignored by callers.
      
      Link: http://lkml.kernel.org/r/20190409100148.24703-4-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9d1d887d
    • D
      mm/memory_hotplug: make unregister_memory_section() never fail · cb7b3a36
      David Hildenbrand committed
      Failing while removing memory is mostly ignored and cannot really be
      handled.  Let's treat errors in unregister_memory_section() in a nice way,
      warning, but continuing.
      
      Link: http://lkml.kernel.org/r/20190409100148.24703-3-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cb7b3a36
    • D
      mm/memory_hotplug: release memory resource after arch_remove_memory() · d9eb1417
      David Hildenbrand committed
      Patch series "mm/memory_hotplug: Better error handling when removing
      memory", v1.
      
      Error handling when removing memory is somewhat messed up right now.  Some
      errors result in warnings, others are completely ignored.  Memory unplug
      code can essentially not deal with errors properly as of now.
      remove_memory() will never fail.
      
      We have basically two choices:
      1. Allow arch_remove_memory() and friends to fail, propagating errors via
         remove_memory(). Might be problematic (e.g. DIMMs consisting of multiple
         pieces added/removed separately).
      2. Don't allow the functions to fail, handling errors in a nicer way.
      
      It seems like most errors that can theoretically happen are really corner
      cases and mostly theoretical (e.g.  "section not valid").  However e.g.
      aborting removal of sections while all callers simply continue in case of
      errors is not nice.
      
      If we can guarantee that removal of memory always works (and WARN/skip in
      case of theoretical errors so we can figure out what is going on), we can
      go ahead and implement better error handling when adding memory.
      
      E.g. via add_memory():
      
      arch_add_memory()
      ret = do_stuff()
      if (ret) {
      	arch_remove_memory();
      	goto error;
      }
      
      Handling here that arch_remove_memory() might fail is basically
      impossible.  So I suggest, let's avoid reporting errors while removing
      memory, warning on theoretical errors instead and continuing instead of
      aborting.
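
The rollback pattern described above can be sketched in userspace C. This is a toy model rather than the kernel implementation: `struct toy_range`, its fields, the `fail_later` flag, and the bodies of all functions are assumptions made for illustration. Only the names `add_memory()`, `arch_add_memory()`, `arch_remove_memory()` and `do_stuff()` come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a memory range; not a kernel structure. */
struct toy_range {
	bool mapped;   /* set once arch_add_memory() has succeeded */
	int warnings;  /* number of "theoretical" errors seen on removal */
};

static int arch_add_memory(struct toy_range *r)
{
	if (r->mapped)
		return -EEXIST;
	r->mapped = true;
	return 0;
}

/*
 * Removal returns void: an unexpected state is warned about and skipped
 * instead of being reported to the caller.
 */
static void arch_remove_memory(struct toy_range *r)
{
	if (!r->mapped) {
		r->warnings++;
		fprintf(stderr, "WARN: range not mapped, skipping\n");
		return;
	}
	r->mapped = false;
}

/* Hypothetical later step of add_memory() that may fail. */
static int do_stuff(bool should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

static int add_memory(struct toy_range *r, bool fail_later)
{
	int ret = arch_add_memory(r);

	if (ret)
		return ret;

	ret = do_stuff(fail_later);
	if (ret) {
		/* Removal cannot fail, so the rollback needs no error path. */
		arch_remove_memory(r);
		assert(!r->mapped);
		return ret;
	}
	return 0;
}
```

Because `arch_remove_memory()` has no return value in this sketch, the error path of `add_memory()` stays a straight line: it undoes the earlier step and returns the original error code.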
      
      This patch (of 4):
      
      __add_pages() doesn't add the memory resource, so __remove_pages()
      shouldn't remove it.  Let's factor it out.  Especially as it is a special
      case for memory used as system memory, added via add_memory() and friends.
      
      We now remove the resource after removing the sections instead of doing it
      the other way around.  I don't think this change is problematic.
      
      add_memory()
      	register memory resource
      	arch_add_memory()
      
      remove_memory
      	arch_remove_memory()
      	release memory resource
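
The ordering above, together with the "warn and continue" handling of a failed resource release, might look roughly like the following userspace sketch. All `toy_*` names, the structure layout, and the exact-match rule for releasing a resource are assumptions for illustration and do not reflect the real resource API.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model: one registered resource plus a "sections exist" flag. */
struct toy_mem {
	unsigned long res_start;
	unsigned long res_size;   /* 0 means no resource is registered */
	bool sections_present;
};

static int toy_register_resource(struct toy_mem *m, unsigned long start,
				 unsigned long size)
{
	if (m->res_size || !size)
		return -1;
	m->res_start = start;
	m->res_size = size;
	return 0;
}

/* Assumption: a release only succeeds for exactly the range that was added. */
static int toy_release_resource(struct toy_mem *m, unsigned long start,
				unsigned long size)
{
	if (!m->res_size || m->res_start != start || m->res_size != size)
		return -1;
	m->res_size = 0;
	return 0;
}

static int toy_add_memory(struct toy_mem *m, unsigned long start,
			  unsigned long size)
{
	int ret = toy_register_resource(m, start, size); /* resource first */

	if (ret)
		return ret;
	m->sections_present = true;  /* stands in for arch_add_memory() */
	return 0;
}

static void toy_remove_memory(struct toy_mem *m, unsigned long start,
			      unsigned long size)
{
	/* Simplification: the toy drops all sections regardless of range. */
	m->sections_present = false; /* stands in for arch_remove_memory() */

	/* Resource last; a mismatch in granularity is warned about, not fatal. */
	if (toy_release_resource(m, start, size))
		fprintf(stderr, "WARN: unable to release resource %#lx-%#lx\n",
			start, start + size - 1);
}
```

Removing with a different size than the one used for adding leaves the resource registered and only prints a warning, which mirrors the case the commit message says is ignored.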
      
      While at it, explain why we ignore errors and that they only happen if
      we remove memory at a different granularity than we added it.
      
      [david@redhat.com: fix printk warning]
        Link: http://lkml.kernel.org/r/20190417120204.6997-1-david@redhat.com
      Link: http://lkml.kernel.org/r/20190409100148.24703-2-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d9eb1417
    • L