1. 15 Jun, 2019 40 commits
    • C
      f2fs: fix to do sanity check on valid block count of segment · 64024854
      Chao Yu committed
      [ Upstream commit e95bcdb2fefa129f37bd9035af1d234ca92ee4ef ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203233
      
      - Overview
      When mounting the attached crafted image and running the program, the following errors are reported.
      Additionally, it hangs on sync after running the program.
      
      The image is intentionally fuzzed from a normal f2fs image for testing.
      Compile options for F2FS are as follows.
      CONFIG_F2FS_FS=y
      CONFIG_F2FS_STAT_FS=y
      CONFIG_F2FS_FS_XATTR=y
      CONFIG_F2FS_FS_POSIX_ACL=y
      CONFIG_F2FS_CHECK_FS=y
      
      - Reproduces
      cc poc_13.c
      mkdir test
      mount -t f2fs tmp.img test
      cp a.out test
      cd test
      sudo ./a.out
      sync
      
      - Kernel messages
       F2FS-fs (sdb): Bitmap was wrongly set, blk:4608
       kernel BUG at fs/f2fs/segment.c:2102!
       RIP: 0010:update_sit_entry+0x394/0x410
       Call Trace:
        f2fs_allocate_data_block+0x16f/0x660
        do_write_page+0x62/0x170
        f2fs_do_write_node_page+0x33/0xa0
        __write_node_page+0x270/0x4e0
        f2fs_sync_node_pages+0x5df/0x670
        f2fs_write_checkpoint+0x372/0x1400
        f2fs_sync_fs+0xa3/0x130
        f2fs_do_sync_file+0x1a6/0x810
        do_fsync+0x33/0x60
        __x64_sys_fsync+0xb/0x10
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      sit.vblocks and the number of valid blocks summed from sit.valid_map
      may be inconsistent. A segment with zero vblocks is treated as a free
      segment, so when allocating from it we may pick a block whose bitmap
      bit is already set, which crashes the kernel on bitmap verification
      failure.
      
      To avoid further serious metadata inconsistency and corruption, it is
      worthwhile to detect SIT inconsistency. So let's enable
      check_block_count() to verify vblocks and valid_map all the time,
      rather than only when CONFIG_F2FS_CHECK_FS is enabled.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
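The consistency check described in the commit above can be pictured with a small user-space sketch. It assumes a SIT entry reduced to a `vblocks` counter and a `valid_map` bitmap; the struct layout, the 64-byte map size, and the name `sit_entry_consistent` are hypothetical and are not the kernel's check_block_count().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_BYTES 64 /* assumed bitmap size: one bit per block, 512 blocks */

/* Hypothetical, simplified SIT entry; not the on-disk f2fs layout. */
struct sit_entry {
    unsigned int vblocks;         /* recorded valid block count */
    uint8_t valid_map[MAP_BYTES]; /* one bit per block */
};

/* Return true when the recorded count matches the bits set in the map. */
static bool sit_entry_consistent(const struct sit_entry *e)
{
    unsigned int set_bits = 0;

    for (size_t i = 0; i < MAP_BYTES; i++) {
        uint8_t b = e->valid_map[i];

        /* Each iteration clears the lowest set bit of this byte. */
        while (b) {
            b &= (uint8_t)(b - 1);
            set_bits++;
        }
    }
    return set_bits == e->vblocks;
}
```

An entry with `vblocks == 0` but a bit set in `valid_map` is exactly the mismatch the commit message describes, and this sketch would report it as inconsistent.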
    • C
      f2fs: fix to use inline space only if inline_xattr is enable · 101e48fe
      Chao Yu committed
      [ Upstream commit 622927f3b8809206f6da54a6a7ed4df1a7770fce ]
      
      With below mkfs and mount option:
      
      MKFS_OPTIONS  -- -O extra_attr -O project_quota -O inode_checksum -O flexible_inline_xattr -O inode_crtime -f
      MOUNT_OPTIONS -- -o noinline_xattr
      
      We may miss xattr data with below testcase:
      - mkdir dir
      - setfattr -n "user.name" -v 0 dir
      - for ((i = 0; i < 190; i++)) do touch dir/$i; done
      - umount
      - mount
      - getfattr -n "user.name" dir
      
      user.name: No such attribute
      
      The root cause is that we persist xattr data into the reserved inline
      xattr space even if inline_xattr is not enabled on the inline directory
      inode. After inline dentry conversion, the reserved space no longer
      exists, so the xattr data is lost.
      
      Let's use inline xattr space only if the inline_xattr flag is set on the
      inode to fix this issue.
      
      Fixes: 6afc662e ("f2fs: support flexible inline xattr size")
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • C
      f2fs: fix to avoid panic in dec_valid_block_count() · 45624f0e
      Chao Yu committed
      [ Upstream commit 5e159cd349bf3a31fb7e35c23a93308eb30f4f71 ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203209
      
      - Overview
      When mounting the attached crafted image and running program, I got this error.
      Additionally, it hangs on sync after this script.
      
      The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on.
      
      - Reproduces
      cc poc_01.c
      ./run.sh f2fs
      sync
      
       kernel BUG at fs/f2fs/f2fs.h:1788!
       RIP: 0010:f2fs_truncate_data_blocks_range+0x342/0x350
       Call Trace:
        f2fs_truncate_blocks+0x36d/0x3c0
        f2fs_truncate+0x88/0x110
        f2fs_setattr+0x3e1/0x460
        notify_change+0x2da/0x400
        do_truncate+0x6d/0xb0
        do_sys_ftruncate+0xf1/0x160
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The reason is that dec_valid_block_count() triggers a kernel panic when
      inode.i_blocks is inconsistent with the actual block count.
      
      To avoid the panic, let's just print a debug message and set SBI_NEED_FSCK
      to give fsck a hint for later repair.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix build warning and add unlikely]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
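The "flag for fsck instead of panicking" approach from the commit above can be sketched as follows. This is only an illustration under stated assumptions: `struct fs_state`, its fields, and `release_blocks` are hypothetical stand-ins, the debug message is left out, and none of this is the f2fs code itself.

```c
#include <stdbool.h>

/* Hypothetical miniature of the state involved; not the f2fs structures. */
struct fs_state {
    unsigned long long inode_blocks; /* blocks charged to the inode */
    bool need_fsck;                  /* stand-in for SBI_NEED_FSCK */
};

/* Release `count` blocks; on an inconsistent count, flag fsck and bail out. */
static void release_blocks(struct fs_state *s, unsigned long long count)
{
    if (s->inode_blocks < count) {
        s->need_fsck = true; /* hint for a later fsck run */
        return;              /* leave the counter untouched */
    }
    s->inode_blocks -= count;
}
```

The point of the design is that a corrupted image leads to a recorded hint and an early return rather than an assertion failure.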
    • C
      f2fs: fix to clear dirty inode in error path of f2fs_iget() · 47a92acf
      Chao Yu committed
      [ Upstream commit 546d22f070d64a7b96f57c93333772085d3a5e6d ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203217
      
      - Overview
      When mounting the attached crafted image and running program, I got this error.
      Additionally, it hangs on sync after running the program.
      
      The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on.
      
      - Reproduces
      cc poc_test_05.c
      mkdir test
      mount -t f2fs tmp.img test
      sudo ./a.out
      sync
      
      - Messages
       kernel BUG at fs/f2fs/inode.c:707!
       RIP: 0010:f2fs_evict_inode+0x33f/0x3a0
       Call Trace:
        evict+0xba/0x180
        f2fs_iget+0x598/0xdf0
        f2fs_lookup+0x136/0x320
        __lookup_slow+0x92/0x140
        lookup_slow+0x30/0x50
        walk_component+0x1c1/0x350
        path_lookupat+0x62/0x200
        filename_lookup+0xb3/0x1a0
        do_readlinkat+0x56/0x110
        __x64_sys_readlink+0x16/0x20
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      During inode loading, __recover_inline_status() can recover the inode
      status and mark the inode dirty. If a later step then fails, the inode
      fails the check in f2fs_evict_inode(), which triggers BUG_ON().
      
      Let's clear the dirty inode in the error path of f2fs_iget() to avoid the panic.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • C
      f2fs: fix to do sanity check on free nid · ca9fcbc5
      Chao Yu committed
      [ Upstream commit 626bcf2b7ce87211dba565f2bfa7842ba5be5c1b ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203225
      
      - Overview
      When mounting the attached crafted image and unmounting it, following errors are reported.
      Additionally, it hangs on sync after unmounting.
      
      The image is intentionally fuzzed from a normal f2fs image for testing.
      Compile options for F2FS are as follows.
      CONFIG_F2FS_FS=y
      CONFIG_F2FS_STAT_FS=y
      CONFIG_F2FS_FS_XATTR=y
      CONFIG_F2FS_FS_POSIX_ACL=y
      CONFIG_F2FS_CHECK_FS=y
      
      - Reproduces
      mkdir test
      mount -t f2fs tmp.img test
      touch test/t
      umount test
      sync
      
      - Messages
       kernel BUG at fs/f2fs/node.c:3073!
       RIP: 0010:f2fs_destroy_node_manager+0x2f0/0x300
       Call Trace:
        f2fs_put_super+0xf4/0x270
        generic_shutdown_super+0x62/0x110
        kill_block_super+0x1c/0x50
        kill_f2fs_super+0xad/0xd0
        deactivate_locked_super+0x35/0x60
        cleanup_mnt+0x36/0x70
        task_work_run+0x75/0x90
        exit_to_usermode_loop+0x93/0xa0
        do_syscall_64+0xba/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
       RIP: 0010:f2fs_destroy_node_manager+0x2f0/0x300
      
      The NAT table is corrupted, so reserved meta/node inode ids were
      incorrectly added to the free list. During file creation, the reserved
      id is already cached in the inode hash, so creation fails and the
      preallocated nid cannot be released later, resulting in a kernel panic.
      
      To fix this issue, let's do a nid boundary check during free nid loading.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
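A nid boundary check of the kind the commit above mentions might look roughly like this. The function name and both bounds are assumptions made for illustration: `first_usable` stands for the first id that is not reserved for meta/node inodes, and `max_nid` for the exclusive upper limit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical boundary check: ids below first_usable are treated as
 * reserved, and ids at or above max_nid are out of range.
 */
static bool nid_may_be_free(uint32_t nid, uint32_t first_usable, uint32_t max_nid)
{
    return nid >= first_usable && nid < max_nid;
}
```

Rejecting reserved ids while the free list is being loaded keeps them from ever being handed out during file creation.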
    • C
      f2fs: fix to avoid panic in f2fs_remove_inode_page() · f3aa313d
      Chao Yu committed
      [ Upstream commit 8b6810f8acfe429fde7c7dad4714692cc5f75651 ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203219
      
      - Overview
      When mounting the attached crafted image and running program, I got this error.
      Additionally, it hangs on sync after running the program.
      
      The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on.
      
      - Reproduces
      cc poc_06.c
      mkdir test
      mount -t f2fs tmp.img test
      cp a.out test
      cd test
      sudo ./a.out
      sync
      
      - Messages
       kernel BUG at fs/f2fs/node.c:1183!
       RIP: 0010:f2fs_remove_inode_page+0x294/0x2d0
       Call Trace:
        f2fs_evict_inode+0x2a3/0x3a0
        evict+0xba/0x180
        __dentry_kill+0xbe/0x160
        dentry_kill+0x46/0x180
        dput+0xbb/0x100
        do_renameat2+0x3c9/0x550
        __x64_sys_rename+0x17/0x20
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The reason is that f2fs_remove_inode_page() triggers a kernel panic when
      the inode's i_blocks value is inconsistent.
      
      To avoid the panic, let's just print a debug message and set SBI_NEED_FSCK
      to give fsck a hint for later repair of potential image corruption.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix build warning and add unlikely]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • C
      f2fs: fix to avoid panic in f2fs_inplace_write_data() · 0325c5cc
      Chao Yu committed
      [ Upstream commit 05573d6ccf702df549a7bdeabef31e4753df1a90 ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203239
      
      - Overview
      When mounting the attached crafted image and running the program, the following errors are reported.
      Additionally, it hangs on sync after running the program.
      
      The image is intentionally fuzzed from a normal f2fs image for testing.
      Compile options for F2FS are as follows.
      CONFIG_F2FS_FS=y
      CONFIG_F2FS_STAT_FS=y
      CONFIG_F2FS_FS_XATTR=y
      CONFIG_F2FS_FS_POSIX_ACL=y
      CONFIG_F2FS_CHECK_FS=y
      
      - Reproduces
      cc poc_15.c
      ./run.sh f2fs
      sync
      
      - Kernel messages
       ------------[ cut here ]------------
       kernel BUG at fs/f2fs/segment.c:3162!
       RIP: 0010:f2fs_inplace_write_data+0x12d/0x160
       Call Trace:
        f2fs_do_write_data_page+0x3c1/0x820
        __write_data_page+0x156/0x720
        f2fs_write_cache_pages+0x20d/0x460
        f2fs_write_data_pages+0x1b4/0x300
        do_writepages+0x15/0x60
        __filemap_fdatawrite_range+0x7c/0xb0
        file_write_and_wait_range+0x2c/0x80
        f2fs_do_sync_file+0x102/0x810
        do_fsync+0x33/0x60
        __x64_sys_fsync+0xb/0x10
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The reason is that f2fs_inplace_write_data() triggers a kernel panic when
      a data block is located in a node-type segment.
      
      To avoid the panic, let's just return an error code and set SBI_NEED_FSCK
      to give fsck a hint for later repair.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • C
      f2fs: fix to avoid panic in do_recover_data() · 8490bf2d
      Chao Yu committed
      [ Upstream commit 22d61e286e2d9097dae36f75ed48801056b77cac ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203227
      
      - Overview
      When mounting the attached crafted image, following errors are reported.
      Additionally, it hangs on sync after trying to mount it.
      
      The image is intentionally fuzzed from a normal f2fs image for testing.
      Compile options for F2FS are as follows.
      CONFIG_F2FS_FS=y
      CONFIG_F2FS_STAT_FS=y
      CONFIG_F2FS_FS_XATTR=y
      CONFIG_F2FS_FS_POSIX_ACL=y
      CONFIG_F2FS_CHECK_FS=y
      
      - Reproduces
      mkdir test
      mount -t f2fs tmp.img test
      sync
      
      - Messages
       kernel BUG at fs/f2fs/recovery.c:549!
       RIP: 0010:recover_data+0x167a/0x1780
       Call Trace:
        f2fs_recover_fsync_data+0x613/0x710
        f2fs_fill_super+0x1043/0x1aa0
        mount_bdev+0x16d/0x1a0
        mount_fs+0x4a/0x170
        vfs_kern_mount+0x5d/0x100
        do_mount+0x200/0xcf0
        ksys_mount+0x79/0xc0
        __x64_sys_mount+0x1c/0x20
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      During recovery, if ofs_of_node is inconsistent between the recovered
      node page and the original checkpointed node page, let's just fail
      recovery instead of panicking the kernel.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • M
      ntp: Allow TAI-UTC offset to be set to zero · 0b50d08c
      Miroslav Lichvar committed
      [ Upstream commit fdc6bae940ee9eb869e493990540098b8c0fd6ab ]
      
      The ADJ_TAI adjtimex mode sets the TAI-UTC offset of the system clock.
      It is typically set by NTP/PTP implementations and it is automatically
      updated by the kernel on leap seconds. The initial value is zero (which
      applications may interpret as unknown), but this value cannot be set by
      adjtimex. This limitation seems to go back to the original "nanokernel"
      implementation by David Mills.
      
      Change the ADJ_TAI check to accept zero as a valid TAI-UTC offset in
      order to allow setting it back to the initial value.
      
      Fixes: 153b5d05 ("ntp: support for TAI")
      Suggested-by: Ondrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Link: https://lkml.kernel.org/r/20190417084833.7401-1-mlichvar@redhat.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
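The effect of the ADJ_TAI change above can be illustrated with a one-line predicate. This sketch assumes that, after the change, the check rejects only negative offsets; the function name is hypothetical and this is not the kernel's adjtimex validation code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the ADJ_TAI range check. */
static bool tai_offset_acceptable(long offset)
{
    /* Zero (the initial value) is accepted; negatives are assumed rejected. */
    return offset >= 0;
}
```

With a strictly-greater comparison in its place, an offset of zero could never be restored once a nonzero value had been set.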
    • F
      mailbox: stm32-ipcc: check invalid irq · 102f6e12
      Fabien Dessenne committed
      [ Upstream commit 68a1c8485cf83734d4da9d81cd3b5d2ae7c0339b ]
      
      On failure, of_irq_get() returns a negative value or zero, which is
      not handled as an error in the existing implementation.
      Instead of using this API, use platform_get_irq(), which returns
      only a negative value on failure.
      Also, do not log an error on a deferred probe.
      Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com>
      Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • M
      pwm: meson: Use the spin-lock only to protect register modifications · c5b2c824
      Martin Blumenstingl committed
      [ Upstream commit f173747fffdf037c791405ab4f1ec0eb392fc48e ]
      
      Holding the spin-lock for all of the code in meson_pwm_apply() can
      result in a "BUG: scheduling while atomic". This can happen because
      clk_get_rate() (which is called from meson_pwm_calc()) may sleep.
      Only hold the spin-lock when modifying registers to solve this.
      
      The reason why we need a spin-lock in the driver is because the
      REG_MISC_AB register is shared between the two channels provided by one
      PWM controller. The only functions where REG_MISC_AB is modified are
      meson_pwm_enable() and meson_pwm_disable() so the register reads/writes
      in there need to be protected by the spin-lock.
      
      The original code also used the spin-lock to protect the values in
      struct meson_pwm_channel. This could be necessary if two consumers can
      use the same PWM channel. However, PWM core doesn't allow this so we
      don't need to protect the values in struct meson_pwm_channel with a
      lock.
      
      Fixes: 211ed630 ("pwm: Add support for Meson PWM Controller")
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
      Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • M
      EDAC/mpc85xx: Prevent building as a module · 689fe88d
      Michael Ellerman committed
      [ Upstream commit 2b8358a951b1e2a534a54924cd8245e58a1c5fb8 ]
      
      The mpc85xx EDAC driver can be configured as a module but then fails to
      build because it uses two unexported symbols:
      
        ERROR: ".pci_find_hose_for_OF_device" [drivers/edac/mpc85xx_edac_mod.ko] undefined!
        ERROR: ".early_find_capability" [drivers/edac/mpc85xx_edac_mod.ko] undefined!
      
      We don't want to export those symbols just for this driver, so make the
      driver only configurable as a built-in.
      
      This seems to have been broken since at least
      
        c92132f5 ("edac/85xx: Add PCIe error interrupt edac support")
      
      (Nov 2013).
      
       [ bp: make it depend on EDAC=y so that the EDAC core doesn't get built
         as a module. ]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Johannes Thumshirn <jth@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: linuxppc-dev@ozlabs.org
      Cc: morbidrsa@gmail.com
      Link: https://lkml.kernel.org/r/20190502141941.12927-1-mpe@ellerman.id.au
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • K
      bpf: fix undefined behavior in narrow load handling · f9ee13ce
      Krzesimir Nowak committed
      [ Upstream commit e2f7fc0ac6957cabff4cecf6c721979b571af208 ]
      
      Commit 31fd8581 ("bpf: permits narrower load from bpf program
      context fields") made the verifier add AND instructions to clear the
      unwanted bits with a mask when doing a narrow load. The mask is
      computed with
      
        (1 << size * 8) - 1
      
      where "size" is the size of the narrow load. When doing a 4 byte load
      of an 8 byte field the verifier shifts the literal 1 by 32 places to
      the left. This results in an overflow of a signed integer, which is
      undefined behavior. Typically, the computed mask was zero, so the
      result of the narrow load ended up being zero too.
      
      Cast the literal to long long to avoid overflows. Note that narrow
      load of the 4 byte fields does not have the undefined behavior,
      because the load size can only be either 1 or 2 bytes, so shifting 1
      by 8 or 16 places will not overflow it. And reading 4 bytes would not
      be a narrow load of a 4 bytes field.
      
      Fixes: 31fd8581 ("bpf: permits narrower load from bpf program context fields")
      Reviewed-by: Alban Crequy <alban@kinvolk.io>
      Reviewed-by: Iago López Galeiras <iago@kinvolk.io>
      Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
      Cc: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
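The mask computation discussed in the commit above can be reproduced in a few lines of user-space C. This is a sketch, not the verifier code: the helper name `narrow_load_mask` is hypothetical, and it uses an unsigned 64-bit literal as one way of widening the shift, where the commit message speaks of a cast to long long.

```c
#include <stdint.h>

/*
 * Hypothetical helper: mask that keeps the low `size` bytes of a value,
 * for size in 1..7.
 */
static uint64_t narrow_load_mask(unsigned int size)
{
    /*
     * 1ULL makes the shift happen in 64 bits. A plain `1` is an int
     * (typically 32 bits), and shifting it by 32 is undefined behavior.
     */
    return (1ULL << (size * 8)) - 1;
}
```

For a 4 byte load from an 8 byte field, `narrow_load_mask(4)` keeps the low 32 bits, which is the case that went wrong with the narrower literal.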
    • B
      drm/nouveau/kms/gv100-: fix spurious window immediate interlocks · 991b5104
      Ben Skeggs committed
      [ Upstream commit d2434e4d942c32cadcbdbcd32c58f35098f3b604 ]
      
      Cursor position updates were accidentally causing us to attempt to interlock
      window with window immediate, and without a matching window immediate update,
      NVDisplay could hang forever in some circumstances.
      
      Fixes suspend/resume on (at least) Quadro RTX4000 (TU104).
      Reported-by: Lyude Paul <lyude@redhat.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • J
      objtool: Don't use ignore flag for fake jumps · 20e1a167
      Josh Poimboeuf committed
      [ Upstream commit e6da9567959e164f82bc81967e0d5b10dee870b4 ]
      
      The ignore flag is set on fake jumps in order to keep
      add_jump_destinations() from setting their jump_dest, since it already
      got set when the fake jump was created.
      
      But using the ignore flag is a bit of a hack.  It's normally used to
      skip validation of an instruction, which doesn't really make sense for
      fake jumps.
      
      Also, after the next patch, using the ignore flag for fake jumps can
      trigger a false "why am I validating an ignored function?" warning.
      
      Instead just add an explicit check in add_jump_destinations() to skip
      fake jumps.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/71abc072ff48b2feccc197723a9c52859476c068.1557766718.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • M
      drm/bridge: adv7511: Fix low refresh rate selection · 124c23dc
      Matt Redfearn committed
      [ Upstream commit 67793bd3b3948dc8c8384b6430e036a30a0ecb43 ]
      
      The driver currently sets register 0xfb (Low Refresh Rate) based on the
      value of mode->vrefresh. Firstly, this field is specified to be in Hz,
      but the magic numbers used by the code are Hz * 1000. This essentially
      leads to the low refresh rate always being set to 0x01, since the
      vrefresh value will always be less than 24000. Fix the magic numbers to
      be in Hz.
      Secondly, according to the comment in drm_modes.h, the field is not
      supposed to be used in a functional way anyway. Instead, use the helper
      function drm_mode_vrefresh().
      
      Fixes: 9c8af882 ("drm: Add adv7511 encoder driver")
      Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Matt Redfearn <matt.redfearn@thinci.com>
      Signed-off-by: Sean Paul <seanpaul@chromium.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424132210.26338-1-matt.redfearn@thinci.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
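The unit mismatch described in the adv7511 commit above is easy to see in a small selection function. The sketch below is illustrative only: the enum labels and the function name are hypothetical, and the 25 and 30 Hz thresholds are assumptions (the commit message only mentions the 24000 comparison).

```c
/* Hypothetical labels for the setting; not the driver's register values. */
enum low_refresh {
    LOW_REFRESH_NONE,
    LOW_REFRESH_24HZ,
    LOW_REFRESH_25HZ,
    LOW_REFRESH_30HZ
};

/* Pick the setting from a refresh rate given in Hz (not Hz * 1000). */
static enum low_refresh pick_low_refresh(int vrefresh_hz)
{
    if (vrefresh_hz <= 24)
        return LOW_REFRESH_24HZ;
    if (vrefresh_hz <= 25)
        return LOW_REFRESH_25HZ;
    if (vrefresh_hz <= 30)
        return LOW_REFRESH_30HZ;
    return LOW_REFRESH_NONE;
}
```

If the first threshold were 24000 while the input stayed in Hz, every realistic mode (60 Hz, for instance) would fall into the first branch, which matches the symptom the commit describes.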
    • B
      drm/nouveau/kms/gf119-gp10x: push HeadSetControlOutputResource() mthd when encoders change · 2a3f2b43
      Ben Skeggs committed
      [ Upstream commit a0b694d0af21c9993d1a39a75fd814bd48bf7eb4 ]
      
      HW has error checks in place which check that pixel depth is explicitly
      provided on DP, while HDMI has a "default" setting that we use.
      
      In multi-display configurations with identical modelines, but different
      protocols (HDMI + DP, in this case), it was possible for the DP head to
      get swapped to the head which previously drove the HDMI output, without
      updating HeadSetControlOutputResource(), triggering the error check and
      hanging the core update.
      Reported-by: Lyude Paul <lyude@redhat.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • S
      perf/x86/intel: Allow PEBS multi-entry in watermark mode · f9706dd9
      Stephane Eranian committed
      [ Upstream commit c7a286577d7592720c2f179aadfb325a1ff48c95 ]
      
      This patch fixes a restriction/bug introduced by:
      
         583feb08e7f7 ("perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS")
      
      The original patch prevented using multi-entry PEBS when wakeup_events != 0.
      However, wakeup_events is part of a union with wakeup_watermark, so in
      watermark mode multi-entry PEBS is also disabled, which is not the
      intent. This patch fixes this by checking whether watermark mode is enabled.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Cc: vincent.weaver@maine.edu
      Fixes: 583feb08e7f7 ("perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS")
      Link: http://lkml.kernel.org/r/20190514003400.224340-1-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
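The union aliasing behind the PEBS commit above can be shown with a reduced structure. Everything here is a simplified assumption: `struct attr` models only the `watermark` flag and the shared storage, `multi_entry_allowed` is a hypothetical name, and the other conditions the kernel applies are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced attribute: the two wakeup fields share storage. */
struct attr {
    bool watermark; /* true: the union holds wakeup_watermark */
    union {
        uint32_t wakeup_events;
        uint32_t wakeup_watermark;
    };
};

/* Multi-entry is blocked only when wakeup_events is really in use. */
static bool multi_entry_allowed(const struct attr *a)
{
    if (a->watermark)
        return true; /* the value is a watermark, not an event count */
    return a->wakeup_events == 0;
}
```

Testing `wakeup_events != 0` without looking at the flag would misread a nonzero watermark as an event count, which is the restriction the commit lifts.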
    • T
      mfd: twl6040: Fix device init errors for ACCCTL register · 5540d014
      Tony Lindgren committed
      [ Upstream commit 48171d0ea7caccf21c9ee3ae75eb370f2a756062 ]
      
      I noticed that we can get -EREMOTEIO errors on at least omap4 duovero:
      
      twl6040 0-004b: Failed to write 2d = 19: -121
      
      And then any following register access will produce errors.
      
      The 2d offset above is register ACCCTL, which gets written on twl6040
      powerup. With error checking added to the related regcache_sync() call,
      the -EREMOTEIO error is reproducible on twl6040 powerup, at least on
      duovero.
      
      To fix the error, we need to wait until twl6040 is accessible after the
      powerup. Based on tests on omap4 duovero, we need to wait over 8ms after
      powerup before register write will complete without failures. Let's also
      make sure we warn about possible errors too.
      
      Note that we have twl6040_patch[] reg_sequence with the ACCCTL register
      configuration and regcache_sync() will write the new value to ACCCTL.
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • B
      drm/nouveau/disp/dp: respect sink limits when selecting failsafe link configuration · 3b8892be
      Ben Skeggs committed
      [ Upstream commit 13d03e9daf70dab032c03dc172e75bb98ad899c4 ]
      
      Where possible, we want the failsafe link configuration (one which won't
      hang the OR during modeset because of not enough bandwidth for the mode)
      to also be supported by the sink.
      
      This prevents "link rate unsupported by sink" messages when link training
      fails.
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • B
      mfd: intel-lpss: Set the device in reset state when init · e9a8c980
      Binbin Wu committed
      [ Upstream commit dad06532292d77f37fbe831a02948a593500f682 ]
      
      In a virtualized setup, an interrupt storm is seen when the system
      reboots due to a warm reset.
      
      Call Trace:
      <IRQ>
      dump_stack+0x70/0xa5
      __report_bad_irq+0x2e/0xc0
      note_interrupt+0x248/0x290
      ? add_interrupt_randomness+0x30/0x220
      handle_irq_event_percpu+0x54/0x80
      handle_irq_event+0x39/0x60
      handle_fasteoi_irq+0x91/0x150
      handle_irq+0x108/0x180
      do_IRQ+0x52/0xf0
      common_interrupt+0xf/0xf
      </IRQ>
      RIP: 0033:0x76fc2cfabc1d
      Code: 24 28 bf 03 00 00 00 31 c0 48 8d 35 63 77 0e 00 48 8d 15 2e
      94 0e 00 4c 89 f9 49 89 d9 4c 89 d3 e8 b8 e2 01 00 48 8b 54 24 18
      <48> 89 ef 48 89 de 4c 89 e1 e8 d5 97 01 00 84 c0 74 2d 48 8b 04
      24
      RSP: 002b:00007ffd247c1fc0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffda
      RAX: 0000000000000000 RBX: 00007ffd247c1ff0 RCX: 000000000003d3ce
      RDX: 0000000000000000 RSI: 00007ffd247c1ff0 RDI: 000076fc2cbb6010
      RBP: 000076fc2cded010 R08: 00007ffd247c2210 R09: 00007ffd247c22a0
      R10: 000076fc29465470 R11: 0000000000000000 R12: 00007ffd247c1fc0
      R13: 000076fc2ce8e470 R14: 000076fc27ec9960 R15: 0000000000000414
      handlers:
      [<000000000d3fa913>] idma64_irq
      Disabling IRQ #27
      
      To avoid the interrupt storm, put the device into reset state
      before bringing it out of reset.
      
      Changelog v2:
      - correct the subject line by adding "mfd: "
      Signed-off-by: Binbin Wu <binbin.wu@intel.com>
      Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • D
      mfd: tps65912-spi: Add missing of table registration · 12c57327
      Daniel Gomez committed
      [ Upstream commit 9e364e87ad7f2c636276c773d718cda29d62b741 ]
      
      MODULE_DEVICE_TABLE(of, <of_match_table>) should be called to complete
      the DT OF matching mechanism and register the table.
      
      Before this patch:
      modinfo drivers/mfd/tps65912-spi.ko | grep alias
      alias:          spi:tps65912
      
      After this patch:
      modinfo drivers/mfd/tps65912-spi.ko | grep alias
      alias:          of:N*T*Cti,tps65912C*
      alias:          of:N*T*Cti,tps65912
      alias:          spi:tps65912
      Reported-by: Javier Martinez Canillas <javier@dowhile0.org>
      Signed-off-by: Daniel Gomez <dagmcr@gmail.com>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • A
      drivers: thermal: tsens: Don't print error message on -EPROBE_DEFER · 1196b79a
      Amit Kucheria committed
      [ Upstream commit fc7d18cf6a923cde7f5e7ba2c1105bb106d3e29a ]
      
      We print a calibration failure message on -EPROBE_DEFER from
      nvmem/qfprom as follows:
      [    3.003090] qcom-tsens 4a9000.thermal-sensor: version: 1.4
      [    3.005376] qcom-tsens 4a9000.thermal-sensor: tsens calibration failed
      [    3.113248] qcom-tsens 4a9000.thermal-sensor: version: 1.4
      
      This confuses people when, in fact, calibration succeeds later when
      nvmem/qfprom device is available. Don't print this message on a
      -EPROBE_DEFER.
      Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
      Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
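The logging rule from the tsens commit above amounts to a small predicate. This is a sketch with a hypothetical helper name; `EPROBE_DEFER` is a kernel-internal errno value that user space does not provide, so it is defined locally here for illustration.

```c
#include <stdbool.h>

#define EPROBE_DEFER 517 /* kernel-internal errno, defined for this sketch */

/* Hypothetical helper: should a failed calibration be logged as an error? */
static bool should_log_calibration_failure(int ret)
{
    /* Deferral means "retry later", so it is not reported as a failure. */
    return ret < 0 && ret != -EPROBE_DEFER;
}
```

A caller would still propagate the return value unchanged; only the error message is skipped for the deferral case.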
    • J
      thermal: rcar_gen3_thermal: disable interrupt in .remove · fd77a511
      Jiada Wang committed
      [ Upstream commit 63f55fcea50c25ae5ad45af92d08dae3b84534c2 ]
      
      Currently the IRQ remains enabled after .remove. If the device is later
      probed again, the IRQ is requested before .thermal_init, so the IRQ
      handler may be called before the device is initialized.
      
      This patch disables the interrupt in .remove to ensure the IRQ handler
      is only called after the device is fully initialized.

      Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
      Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
      Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fd77a511
    • C
      kernel/sys.c: prctl: fix false positive in validate_prctl_map() · c50c4fb0
      Cyrill Gorcunov committed
      [ Upstream commit a9e73998f9d705c94a8dca9687633adc0f24a19a ]
      
      While validating new map we require the @start_data to be strictly less
      than @end_data, which is fine for regular applications (this is why this
      nit didn't trigger for that long).  These members are set from executable
      loaders such as ELF handlers; still, it is perfectly valid to have a loadable
      data section with zero size in file, in such case the start_data is equal
      to end_data once kernel loader finishes.
      
      As a result when we're trying to restore such programs the procedure fails
      and the kernel returns -EINVAL.  From the image dump of a program:
      
       | "mm_start_code": "0x400000",
       | "mm_end_code": "0x8f5fb4",
       | "mm_start_data": "0xf1bfb0",
       | "mm_end_data": "0xf1bfb0",
      
      Thus validate_prctl_map() needs to compare with less-than-or-equal
      instead of strictly-less-than.
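
      A minimal sketch of the relaxed comparison, assuming a reduced check
      that looks only at the data range. check_data_range() is a hypothetical
      stand-in and not the kernel's validate_prctl_map().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, reduced stand-in for the data-range part of the check. */
static int check_data_range(unsigned long start_data, unsigned long end_data)
{
    /*
     * A loadable data section with zero size in the file leaves
     * start_data == end_data, which must be accepted.
     */
    if (start_data > end_data)
        return -EINVAL;

    return 0;
}
```

      With the values from the image dump above (both 0xf1bfb0), this check
      passes, whereas a strictly-less-than test would reject them.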
      
      Link: http://lkml.kernel.org/r/20190408143554.GY1421@uranus.lan
      Fixes: f606b77f ("prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation")
      Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Andrey Vagin <avagin@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c50c4fb0
    • Q
      mm/slab.c: fix an infinite loop in leaks_show() · 515d18ce
      Qian Cai committed
      [ Upstream commit 745e10146c31b1c6ed3326286704ae251b17f663 ]
      
      "cat /proc/slab_allocators" could hang forever on SMP machines with
      kmemleak or object debugging enabled, because other CPUs running
      do_drain() keep making kmemleak_object or debug_objects_cache dirty, so
      the first loop in leaks_show() can never exit:
      
      do {
      	set_store_user_clean(cachep);
      	drain_cpu_caches(cachep);
      	...
      
      } while (!is_store_user_clean(cachep));
      
      For example,
      
      do_drain
        slabs_destroy
          slab_destroy
            kmem_cache_free
              __cache_free
                ___cache_free
                  kmemleak_free_recursive
                    delete_object_full
                      __delete_object
                        put_object
                          free_object_rcu
                            kmem_cache_free
                              cache_free_debugcheck --> dirty kmemleak_object
      
      One approach is to check cachep->name and skip both kmemleak_object and
      debug_objects_cache in leaks_show().  The other is to set store_user_clean
      after drain_cpu_caches().  This leaves a small window between
      drain_cpu_caches() and set_store_user_clean() in which per-CPU caches
      could become dirty again, so slightly wrong information may be stored,
      but it also speeds things up significantly, which sounds like a good
      compromise.  For example,
      
       # cat /proc/slab_allocators
       0m42.778s # 1st approach
       0m0.737s  # 2nd approach
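
      The effect of reordering the two calls can be shown with a toy model.
      This sketch assumes the worst case described above, in which every drain
      dirties the cache again; struct cache, drain() and both loop helpers are
      hypothetical and only mirror the shape of the loop in leaks_show().

```c
#include <assert.h>
#include <stdbool.h>

struct cache {
    bool store_user_clean;
};

/* Stand-in for drain_cpu_caches(): in this model draining always dirties. */
static void drain(struct cache *c)
{
    c->store_user_clean = false;
}

/* Old order: mark clean, then drain. Returns passes used, or -1 at the cap. */
static int passes_clean_before_drain(struct cache *c, int limit)
{
    int n = 0;

    do {
        if (n++ == limit)
            return -1;          /* would spin forever without the cap */
        c->store_user_clean = true;
        drain(c);
    } while (!c->store_user_clean);

    return n;
}

/* New order: drain, then mark clean. Returns passes used, or -1 at the cap. */
static int passes_clean_after_drain(struct cache *c, int limit)
{
    int n = 0;

    do {
        if (n++ == limit)
            return -1;
        drain(c);
        c->store_user_clean = true;
    } while (!c->store_user_clean);

    return n;
}
```

      In this model the old order never terminates on its own, while the new
      order finishes after a single pass.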
      
      [akpm@linux-foundation.org: tweak comment]
      Link: http://lkml.kernel.org/r/20190411032635.10325-1-cai@lca.pw
      Fixes: d31676df ("mm/slab: alternative implementation for DEBUG_SLAB_LEAK")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      515d18ce
    • Y
      mm/cma_debug.c: fix the break condition in cma_maxchunk_get() · 13e1ea08
      Yue Hu committed
      [ Upstream commit f0fd50504a54f5548eb666dc16ddf8394e44e4b7 ]
      
      If find_next_zero_bit() does not find a zero bit, it returns the size
      parameter passed in, so the start bit should be compared with bitmap_maxno
      rather than cma->count.  Although getting maxchunk currently works
      because order_per_bit is zero, the operation will get stuck if
      order_per_bit is set to a non-zero value.
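
      The break condition can be sketched in userspace as follows.  The
      helpers next_zero() and next_set() are simplified, hypothetical
      stand-ins for find_next_zero_bit()/find_next_bit() that use one byte per
      bit, and max_zero_run() is not the kernel's cma_maxchunk_get().

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the next zero bit at or after start, or size if none. */
static size_t next_zero(const unsigned char *map, size_t size, size_t start)
{
    while (start < size && map[start])
        start++;
    return start < size ? start : size;
}

/* Returns the index of the next set bit at or after start, or size if none. */
static size_t next_set(const unsigned char *map, size_t size, size_t start)
{
    while (start < size && !map[start])
        start++;
    return start < size ? start : size;
}

/* Longest run of zero bits, measured in bits. */
static size_t max_zero_run(const unsigned char *map, size_t bitmap_maxno)
{
    size_t start, end = 0, maxchunk = 0;

    for (;;) {
        start = next_zero(map, bitmap_maxno, end);
        /* "Not found" is reported as the bitmap size, so compare with it. */
        if (start >= bitmap_maxno)
            break;
        end = next_set(map, bitmap_maxno, start);
        if (end - start > maxchunk)
            maxchunk = end - start;
    }

    return maxchunk;
}
```

      Comparing start against a page count larger than the bitmap size would
      never satisfy the break condition once the search runs off the end,
      which is the stuck loop the commit message describes.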
      
      Link: http://lkml.kernel.org/r/20190319092734.276-1-zbestahu@gmail.com
      Signed-off-by: Yue Hu <huyue2@yulong.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Safonov <d.safonov@partner.samsung.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      13e1ea08
    • A
      mm: page_mkclean vs MADV_DONTNEED race · 38c5fce7
      Aneesh Kumar K.V committed
      [ Upstream commit 024eee0e83f0df52317be607ca521e0fc572aa07 ]
      
      MADV_DONTNEED is handled with mmap_sem taken in read mode.  We call
      page_mkclean without holding mmap_sem.
      
      MADV_DONTNEED implies that pages in the region are unmapped and subsequent
      access to the pages in that range is handled as a new page fault.  This
      implies that if there is no parallel access to the region when
      MADV_DONTNEED is run, we expect that range to be unallocated.

      With respect to page_mkclean(), we need to make sure that we don't break
      the MADV_DONTNEED semantics.  MADV_DONTNEED checks for pmd_none without
      holding pmd_lock.  This implies we skip the pmd if we temporarily mark
      the pmd none.  Avoid doing that while marking the page clean.

      Keep the sequence the same for dax too, even though we don't support
      MADV_DONTNEED for dax mappings.
      
      The bug was noticed by code review and I didn't observe any failures w.r.t
      test run.  This is similar to
      
      commit 58ceeb6b
      Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Date:   Thu Apr 13 14:56:26 2017 -0700
      
          thp: fix MADV_DONTNEED vs. MADV_FREE race
      
      commit ced10803
      Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Date:   Thu Apr 13 14:56:20 2017 -0700
      
          thp: fix MADV_DONTNEED vs. numa balancing race
      
      Link: http://lkml.kernel.org/r/20190321040610.14226-1-aneesh.kumar@linux.ibm.com
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      38c5fce7
    • Y
      mm/cma.c: fix the bitmap status to show failed allocation reason · 77a01e33
      Yue Hu committed
      [ Upstream commit 2b59e01a3aa665f751d1410b99fae9336bd424e1 ]
      
      Currently one bit in the cma bitmap represents a number of pages rather
      than one page, while cma->count is the cma size in pages.  So to find
      available pages via find_next_zero_bit()/find_next_bit() we should use
      the cma size in bits, not in pages.  The current number of free pages is
      only correct because order_per_bit is zero; once order_per_bit is
      changed, the bitmap status will be incorrect.

      The size passed in cma_debug_show_areas() is not correct.  It affects
      the available pages reported at some positions when debugging an
      allocation failure.
      
      This is an example with order_per_bit = 1
      
      Before this change:
      [    4.120060] cma: number of available pages: 1@93+4@108+7@121+7@137+7@153+7@169+7@185+7@201+3@213+3@221+3@229+3@237+3@245+3@253+3@261+3@269+3@277+3@285+3@293+3@301+3@309+3@317+3@325+19@333+15@369+512@512=> 638 free of 1024 total pages
      
      After this change:
      [    4.143234] cma: number of available pages: 2@93+8@108+14@121+14@137+14@153+14@169+14@185+14@201+6@213+6@221+6@229+6@237+6@245+6@253+6@261+6@269+6@277+6@285+6@293+6@301+6@309+6@317+6@325+38@333+30@369=> 252 free of 1024 total pages
      
      Obviously the bitmap status before is incorrect.
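
      The bits-versus-pages arithmetic behind the two log lines can be
      sketched as below.  The helper names are hypothetical; the sketch only
      assumes that one bitmap bit covers 2^order_per_bit pages, as the commit
      message states.

```c
#include <assert.h>

/* Number of bitmap bits needed to cover `count` pages. */
static unsigned long bitmap_bits(unsigned long count, unsigned int order_per_bit)
{
    return count >> order_per_bit;
}

/* Number of pages represented by a run of `nbits` bitmap bits. */
static unsigned long bits_to_pages(unsigned long nbits, unsigned int order_per_bit)
{
    return nbits << order_per_bit;
}
```

      For the example above (1024 pages, order_per_bit = 1) the bitmap holds
      512 bits, so scanning up to 1024 reads past its end, and a run such as
      19@333 in bits corresponds to 38@333 in pages.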
      
      Link: http://lkml.kernel.org/r/20190320060829.9144-1-zbestahu@gmail.com
      Signed-off-by: Yue Hu <huyue2@yulong.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      77a01e33
    • C
      initramfs: free initrd memory if opening /initrd.image fails · 25511676
      Christoph Hellwig committed
      [ Upstream commit 54c7a8916a887f357088f99e9c3a7720cd57d2c8 ]
      
      Patch series "initramfs tidyups".
      
      I've spent some time chasing down behavior in initramfs and found
      plenty of opportunity to improve the code.  A first stab on that is
      contained in this series.
      
      This patch (of 7):
      
      We free the initrd memory for all successful or error cases except for the
      case where opening /initrd.image fails, which looks like an oversight.
      
      Steven said:
      
      : This also changes the behaviour when CONFIG_INITRAMFS_FORCE is enabled
      : - specifically it means that the initrd is freed (previously it was
      : ignored and never freed).  But that seems like reasonable behaviour and
      : the previous behaviour looks like another oversight.
      
      Link: http://lkml.kernel.org/r/20190213174621.29297-3-hch@lst.de
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Steven Price <steven.price@arm.com>
      Acked-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      25511676
    • Y
      mm/cma.c: fix crash on CMA allocation if bitmap allocation fails · e5f8857e
      Yue Hu committed
      [ Upstream commit 1df3a339074e31db95c4790ea9236874b13ccd87 ]
      
      f022d8cb ("mm: cma: Don't crash on allocation if CMA area can't be
      activated") fixes the crash when activation fails by setting
      cma->count to 0.  The same logic is needed if bitmap allocation fails.
      
      Link: http://lkml.kernel.org/r/20190325081309.6004-1-zbestahu@gmail.com
      Signed-off-by: Yue Hu <huyue2@yulong.com>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e5f8857e
    • L
      mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE · 5094a85d
      Linxu Fang committed
      [ Upstream commit 299c83dce9ea3a79bb4b5511d2cb996b6b8e5111 ]
      
      342332e6 ("mm/page_alloc.c: introduce kernelcore=mirror option") and
      later patches rewrote the calculation of node spanned pages.
      
      e506b996 ("mem-hotplug: fix node spanned pages when we have a movable
      node") followed, but the current code still has problems.

      When we have a node with only zone_movable and the node id is not zero,
      the size of node spanned pages is added twice.
      
      That's because we have an empty normal zone, and zone_start_pfn or
      zone_end_pfn is not between arch_zone_lowest_possible_pfn and
      arch_zone_highest_possible_pfn, so we need to use clamp to constrain the
      range just like the commit <96e907d1> (bootmem: Reimplement
      __absent_pages_in_range() using for_each_mem_pfn_range()).
      
      e.g.
      Zone ranges:
        DMA      [mem 0x0000000000001000-0x0000000000ffffff]
        DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
        Normal   [mem 0x0000000100000000-0x000000023fffffff]
      Movable zone start for each node
        Node 0: 0x0000000100000000
        Node 1: 0x0000000140000000
      Early memory node ranges
        node   0: [mem 0x0000000000001000-0x000000000009efff]
        node   0: [mem 0x0000000000100000-0x00000000bffdffff]
        node   0: [mem 0x0000000100000000-0x000000013fffffff]
        node   1: [mem 0x0000000140000000-0x000000023fffffff]
      
      node 0 DMA	spanned:0xfff   present:0xf9e   absent:0x61
      node 0 DMA32	spanned:0xff000 present:0xbefe0	absent:0x40020
      node 0 Normal	spanned:0	present:0	absent:0
      node 0 Movable	spanned:0x40000 present:0x40000 absent:0
      On node 0 totalpages(node_present_pages): 1048446
      node_spanned_pages:1310719
      node 1 DMA	spanned:0	    present:0		absent:0
      node 1 DMA32	spanned:0	    present:0		absent:0
      node 1 Normal	spanned:0x100000    present:0x100000	absent:0
      node 1 Movable	spanned:0x100000    present:0x100000	absent:0
      On node 1 totalpages(node_present_pages): 2097152
      node_spanned_pages:2097152
      Memory: 6967796K/12582392K available (16388K kernel code, 3686K rwdata,
      4468K rodata, 2160K init, 10444K bss, 5614596K reserved, 0K
      cma-reserved)
      
      It shows that the current memory of node 1 is double added.
      After this patch, the problem is fixed.
      
      node 0 DMA	spanned:0xfff   present:0xf9e   absent:0x61
      node 0 DMA32	spanned:0xff000 present:0xbefe0	absent:0x40020
      node 0 Normal	spanned:0	present:0	absent:0
      node 0 Movable	spanned:0x40000 present:0x40000 absent:0
      On node 0 totalpages(node_present_pages): 1048446
      node_spanned_pages:1310719
      node 1 DMA	spanned:0	    present:0		absent:0
      node 1 DMA32	spanned:0	    present:0		absent:0
      node 1 Normal	spanned:0	    present:0		absent:0
      node 1 Movable	spanned:0x100000    present:0x100000	absent:0
      On node 1 totalpages(node_present_pages): 1048576
      node_spanned_pages:1048576
      memory: 6967796K/8388088K available (16388K kernel code, 3686K rwdata,
      4468K rodata, 2160K init, 10444K bss, 1420292K reserved, 0K
      cma-reserved)
      
      Link: http://lkml.kernel.org/r/1554178276-10372-1-git-send-email-fanglinxu@huawei.com
      Signed-off-by: Linxu Fang <fanglinxu@huawei.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5094a85d
    • M
      hugetlbfs: on restore reserve error path retain subpool reservation · ffaafd27
      Mike Kravetz committed
      [ Upstream commit 0919e1b69ab459e06df45d3ba6658d281962db80 ]
      
      When a huge page is allocated, PagePrivate() is set if the allocation
      consumed a reservation.  When freeing a huge page, PagePrivate is checked.
      If set, it indicates the reservation should be restored.  PagePrivate
      being set at free huge page time mostly happens on error paths.
      
      When huge page reservations are created, a check is made to determine if
      the mapping is associated with an explicitly mounted filesystem.  If so,
      pages are also reserved within the filesystem.  The default action when
      freeing a huge page is to decrement the usage count in any associated
      explicitly mounted filesystem.  However, if the reservation is to be
      restored the reservation/use count within the filesystem should not be
      decremented.  Otherwise, a subsequent page allocation and free for the same
      mapping location will cause the filesystem usage to go 'negative'.
      
      Filesystem                         Size  Used Avail Use% Mounted on
      nodev                              4.0G -4.0M  4.1G    - /opt/hugepool
      
      To fix, when freeing a huge page do not adjust filesystem usage if
      PagePrivate() is set to indicate the reservation should be restored.
      
      I did not cc stable as the problem has been around since reserves were
      added to hugetlbfs and nobody has noticed.
      
      Link: http://lkml.kernel.org/r/20190328234704.27083-2-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ffaafd27
    • J
      mm/hmm: select mmu notifier when selecting HMM · 85e1a6c4
      Jérôme Glisse committed
      [ Upstream commit 734fb89968900b5c5f8edd5038bd4cdeab8c61d2 ]
      
      To avoid random config build issues, select mmu notifier when HMM is
      selected.  Whenever HMM gets selected, it will be by users that also
      want the mmu notifier.
      
      Link: http://lkml.kernel.org/r/20190403193318.16478-2-jglisse@redhat.com
      Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      85e1a6c4
    • A
      ARM: prevent tracing IPI_CPU_BACKTRACE · e0c3fc1f
      Arnd Bergmann committed
      [ Upstream commit be167862ae7dd85c56d385209a4890678e1b0488 ]
      
      Patch series "compiler: allow all arches to enable
      CONFIG_OPTIMIZE_INLINING", v3.
      
      This patch (of 11):
      
      When function tracing for IPIs is enabled, we get a warning for an
      overflow of the ipi_types array with the IPI_CPU_BACKTRACE type as
      triggered by raise_nmi():
      
        arch/arm/kernel/smp.c: In function 'raise_nmi':
        arch/arm/kernel/smp.c:489:2: error: array subscript is above array bounds [-Werror=array-bounds]
          trace_ipi_raise(target, ipi_types[ipinr]);
      
      This is a correct warning as we actually overflow the array here.
      
      This patch changes raise_nmi() to call __smp_cross_call() instead of
      smp_cross_call(), to avoid calling into ftrace.  For clarification, I'm
      also adding two new code comments describing how this one is special.
      
      The warning appears to have shown up after commit e7273ff4 ("ARM:
      8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI"), which changed the
      number assignment from '15' to '8', but as far as I can tell has existed
      since the IPI tracepoints were first introduced.  If we decide to
      backport this patch to stable kernels, we probably need to backport
      e7273ff4 as well.
      
      [yamada.masahiro@socionext.com: rebase on v5.1-rc1]
      Link: http://lkml.kernel.org/r/20190423034959.13525-2-yamada.masahiro@socionext.com
      Fixes: e7273ff4 ("ARM: 8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI")
      Fixes: 365ec7b1 ("ARM: add IPI tracepoints") # v3.17
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Boris Brezillon <bbrezillon@kernel.org>
      Cc: Miquel Raynal <miquel.raynal@bootlin.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e0c3fc1f
    • G
      drm/pl111: Initialize clock spinlock early · 4d3811a6
      Guenter Roeck committed
      [ Upstream commit 3e01ae2612bdd7975c74ec7123d7f8f5e6eed795 ]
      
      The following warning is seen on systems with broken clock divider.
      
      INFO: trying to register non-static key.
      the code is fine but needs lockdep annotation.
      turning off the locking correctness validator.
      CPU: 0 PID: 1 Comm: swapper Not tainted 5.1.0-09698-g1fb3b52 #1
      Hardware name: ARM Integrator/CP (Device Tree)
      [<c0011be8>] (unwind_backtrace) from [<c000ebb8>] (show_stack+0x10/0x18)
      [<c000ebb8>] (show_stack) from [<c07d3fd0>] (dump_stack+0x18/0x24)
      [<c07d3fd0>] (dump_stack) from [<c0060d48>] (register_lock_class+0x674/0x6f8)
      [<c0060d48>] (register_lock_class) from [<c005de2c>]
      	(__lock_acquire+0x68/0x2128)
      [<c005de2c>] (__lock_acquire) from [<c0060408>] (lock_acquire+0x110/0x21c)
      [<c0060408>] (lock_acquire) from [<c07f755c>] (_raw_spin_lock+0x34/0x48)
      [<c07f755c>] (_raw_spin_lock) from [<c0536c8c>]
      	(pl111_display_enable+0xf8/0x5fc)
      [<c0536c8c>] (pl111_display_enable) from [<c0502f54>]
      	(drm_atomic_helper_commit_modeset_enables+0x1ec/0x244)
      
      Since commit eedd6033 ("drm/pl111: Support variants with broken clock
      divider"), the spinlock is not initialized if the clock divider is broken.
      Initialize it earlier to fix the problem.
      
      Fixes: eedd6033 ("drm/pl111: Support variants with broken clock divider")
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/1557758781-23586-1-git-send-email-linux@roeck-us.net
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4d3811a6
    • L
      ipc: prevent lockup on alloc_msg and free_msg · 20de754a
      Li Rongqing committed
      [ Upstream commit d6a2946a88f524a47cc9b79279667137899db807 ]
      
      msgctl10 of LTP triggers the following lockup when CONFIG_KASAN is
      enabled on large-memory SMP systems: page initialization can take a
      long time if msgctl10 requests a huge block of memory, and this blocks
      the RCU scheduler, so release the CPU actively.

      After adding schedule() in free_msg(), free_msg() can no longer be
      called while holding a spinlock, so add the messages to a temporary
      list and free them outside the spinlock.
      
        rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
        rcu:     Tasks blocked on level-1 rcu_node (CPUs 16-31): P32505
        rcu:     Tasks blocked on level-1 rcu_node (CPUs 48-63): P34978
        rcu:     (detected by 11, t=35024 jiffies, g=44237529, q=16542267)
        msgctl10        R  running task    21608 32505   2794 0x00000082
        Call Trace:
         preempt_schedule_irq+0x4c/0xb0
         retint_kernel+0x1b/0x2d
        RIP: 0010:__is_insn_slot_addr+0xfb/0x250
        Code: 82 1d 00 48 8b 9b 90 00 00 00 4c 89 f7 49 c1 ee 03 e8 59 83 1d 00 48 b8 00 00 00 00 00 fc ff df 4c 39 eb 48 89 9d 58 ff ff ff <41> c6 04 06 f8 74 66 4c 8d 75 98 4c 89 f1 48 c1 e9 03 48 01 c8 48
        RSP: 0018:ffff88bce041f758 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
        RAX: dffffc0000000000 RBX: ffffffff8471bc50 RCX: ffffffff828a2a57
        RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88bce041f780
        RBP: ffff88bce041f828 R08: ffffed15f3f4c5b3 R09: ffffed15f3f4c5b3
        R10: 0000000000000001 R11: ffffed15f3f4c5b2 R12: 000000318aee9b73
        R13: ffffffff8471bc50 R14: 1ffff1179c083ef0 R15: 1ffff1179c083eec
         kernel_text_address+0xc1/0x100
         __kernel_text_address+0xe/0x30
         unwind_get_return_address+0x2f/0x50
         __save_stack_trace+0x92/0x100
         create_object+0x380/0x650
         __kmalloc+0x14c/0x2b0
         load_msg+0x38/0x1a0
         do_msgsnd+0x19e/0xcf0
         do_syscall_64+0x117/0x400
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
        rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
        rcu:     Tasks blocked on level-1 rcu_node (CPUs 0-15): P32170
        rcu:     (detected by 14, t=35016 jiffies, g=44237525, q=12423063)
        msgctl10        R  running task    21608 32170  32155 0x00000082
        Call Trace:
         preempt_schedule_irq+0x4c/0xb0
         retint_kernel+0x1b/0x2d
        RIP: 0010:lock_acquire+0x4d/0x340
        Code: 48 81 ec c0 00 00 00 45 89 c6 4d 89 cf 48 8d 6c 24 20 48 89 3c 24 48 8d bb e4 0c 00 00 89 74 24 0c 48 c7 44 24 20 b3 8a b5 41 <48> c1 ed 03 48 c7 44 24 28 b4 25 18 84 48 c7 44 24 30 d0 54 7a 82
        RSP: 0018:ffff88af83417738 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
        RAX: dffffc0000000000 RBX: ffff88bd335f3080 RCX: 0000000000000002
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88bd335f3d64
        RBP: ffff88af83417758 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000001 R11: ffffed13f3f745b2 R12: 0000000000000000
        R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
         is_bpf_text_address+0x32/0xe0
         kernel_text_address+0xec/0x100
         __kernel_text_address+0xe/0x30
         unwind_get_return_address+0x2f/0x50
         __save_stack_trace+0x92/0x100
         save_stack+0x32/0xb0
         __kasan_slab_free+0x130/0x180
         kfree+0xfa/0x2d0
         free_msg+0x24/0x50
         do_msgrcv+0x508/0xe60
         do_syscall_64+0x117/0x400
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Davidlohr said:
       "So after releasing the lock, the msg rbtree/list is empty and new
        calls will not see those in the newly populated tmp_msg list, and
        therefore they cannot access the delayed msg freeing pointers, which
        is good. Also the fact that the node_cache is now freed before the
        actual messages seems to be harmless as this is wanted for
        msg_insert() avoiding GFP_ATOMIC allocations, and after releasing the
        info->lock the thing is freed anyway so it should not change things"
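
      The "detach under the lock, free outside it" pattern discussed above can
      be sketched in userspace as follows.  This is an illustration under
      stated assumptions: struct queue, its locked flag (a stand-in for the
      spinlock), push_msg(), drain_queue() and the freed_under_lock counter
      are all hypothetical and do not correspond to the ipc/ code.

```c
#include <assert.h>
#include <stdlib.h>

struct msg {
    struct msg *next;
};

struct queue {
    int locked;         /* stand-in for a held spinlock */
    struct msg *head;
};

/* Counts frees that happened while the "lock" was held. */
static int freed_under_lock;

static void free_msg_sketch(struct queue *q, struct msg *m)
{
    if (q->locked)
        freed_under_lock++;
    free(m);            /* the real free path may sleep or reschedule */
}

static int push_msg(struct queue *q)
{
    struct msg *m = malloc(sizeof(*m));

    if (!m)
        return -1;
    m->next = q->head;
    q->head = m;
    return 0;
}

static void drain_queue(struct queue *q)
{
    struct msg *tmp, *m;

    q->locked = 1;      /* lock */
    tmp = q->head;      /* move every message to a private list */
    q->head = NULL;
    q->locked = 0;      /* unlock */

    /* Free outside the lock; other users no longer see these messages. */
    while ((m = tmp) != NULL) {
        tmp = m->next;
        free_msg_sketch(q, m);
    }
}
```

      Once the list head is cleared under the lock, nothing else can reach the
      detached messages, which is the property the quoted review relies on.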
      
      Link: http://lkml.kernel.org/r/1552029161-4957-1-git-send-email-lirongqing@baidu.com
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
      Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      20de754a
    • C
      sysctl: return -EINVAL if val violates minmax · 91ae202e
      Christian Brauner committed
      [ Upstream commit e260ad01f0aa9e96b5386d5cd7184afd949dc457 ]
      
      Currently, when userspace gives us a value that overflows, e.g. for
      file-max and other callers of __do_proc_doulongvec_minmax(), we simply
      ignore the new value and leave the current value untouched.

      This can be problematic as it gives the illusion that the limit has
      indeed been bumped when in fact it failed.  This commit makes sure to
      return EINVAL when an overflow is detected.  Please note that this is a
      userspace-facing change.
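
      The behavioural change can be sketched with a reduced range check.
      store_ulong_minmax() is a hypothetical helper, not the kernel's
      __do_proc_doulongvec_minmax(); it only shows an out-of-range write being
      reported instead of silently dropped.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: store val into *dst only if it lies in [min, max]. */
static int store_ulong_minmax(unsigned long *dst, unsigned long val,
                              unsigned long min, unsigned long max)
{
    /* Previously such a value was skipped and success was still reported. */
    if (val < min || val > max)
        return -EINVAL;

    *dst = val;
    return 0;
}
```

      The stored value is still left untouched on failure; the difference is
      that the caller now learns the write did not take effect.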
      
      Link: http://lkml.kernel.org/r/20190210203943.8227-4-christian@brauner.io
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Acked-by: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Joe Lawrence <joe.lawrence@redhat.com>
      Cc: Waiman Long <longman@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      91ae202e
    • H
      fs/fat/file.c: issue flush after the writeback of FAT · 5b6619b4
      Hou Tao committed
      [ Upstream commit bd8309de0d60838eef6fb575b0c4c7e95841cf73 ]
      
      fsync() needs to make sure the data & meta-data of the file are persistent
      after fsync() returns, even when a power failure occurs later.  In
      the case of fat-fs, the FAT belongs to the meta-data of the file, so we
      need to issue a flush after the writeback of the FAT instead of before.
      
      Also bail out early when any stage of fsync fails.
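
      The ordering and the early bail-out can be sketched as below.  This is
      a toy model that only records which stages ran; enum stage, struct trace
      and fsync_order_sketch() are hypothetical names and no real I/O is done.

```c
#include <assert.h>

enum stage { STAGE_FILE, STAGE_FAT, STAGE_FLUSH };

struct trace {
    enum stage done[3];
    int count;
};

/* err_file / err_fat simulate a failing stage (0 means success). */
static int fsync_order_sketch(struct trace *t, int err_file, int err_fat)
{
    t->count = 0;

    t->done[t->count++] = STAGE_FILE;   /* write back file data and inode */
    if (err_file)
        return err_file;

    t->done[t->count++] = STAGE_FAT;    /* write back the FAT itself */
    if (err_fat)
        return err_fat;

    t->done[t->count++] = STAGE_FLUSH;  /* only now flush the device cache */
    return 0;
}
```

      Flushing last means the FAT updates are covered by the flush, and a
      failure in an earlier stage is returned without running the later ones.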
      
      Link: http://lkml.kernel.org/r/20190409030158.136316-1-houtao1@huawei.com
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5b6619b4
    • K
      rapidio: fix a NULL pointer dereference when create_workqueue() fails · 2a89e4c5
      Kangjie Lu committed
      [ Upstream commit 23015b22e47c5409620b1726a677d69e5cd032ba ]
      
      In case create_workqueue fails, the fix releases resources and returns
      -ENOMEM to avoid NULL pointer dereference.
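
      The error-handling pattern can be sketched in userspace as follows.
      struct ctx, make_wq() (a stand-in for create_workqueue() with an
      injectable failure), ctx_init() and ctx_destroy() are hypothetical names
      chosen for the illustration and are not taken from the rapidio driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct ctx {
    void *buf;
    void *wq;
};

/* Stand-in for create_workqueue(); returns NULL when told to fail. */
static void *make_wq(int fail)
{
    return fail ? NULL : malloc(1);
}

static int ctx_init(struct ctx *c, int wq_fails)
{
    c->wq = NULL;
    c->buf = malloc(64);
    if (!c->buf)
        return -ENOMEM;

    c->wq = make_wq(wq_fails);
    if (!c->wq) {
        /* Release what was already acquired instead of using a NULL wq. */
        free(c->buf);
        c->buf = NULL;
        return -ENOMEM;
    }

    return 0;
}

static void ctx_destroy(struct ctx *c)
{
    free(c->wq);
    free(c->buf);
    c->wq = NULL;
    c->buf = NULL;
}
```

      Checking the return value at the point of creation keeps later code from
      ever queueing work on a NULL workqueue pointer.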

      Signed-off-by: Kangjie Lu <kjlu@umn.edu>
      Acked-by: Alexandre Bounine <alex.bou9@gmail.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2a89e4c5