1. 21 11月, 2018 20 次提交
    • N
      drm/amd/display: Raise dispclk value for dce120 by 15% · ce46da15
      Nicholas Kazlauskas 提交于
      [ Upstream commit 481f576c6c21bf0446eaa23623ef0262e9a5387c ]
      
      [Why]
      
      The DISPCLK value was previously requested to be 15% higher for all
      ASICs that went through the dce110 bandwidth code path. As part of a
      refactoring of dce_clocks and the dce110 set bandwidth codepath this
      was removed for power saving considerations.
      
      That change caused display corruption under certain hardware
      configurations with Vega10.
      
      [How]
      
      The 15% DISPCLK increase is brought back but only on dce110 for now.
      This is should be a temporary workaround until the root cause is sorted
      out for why this occurs on Vega (or other ASICs, if reported).
      Tested-by: NNick Sarnie <sarnex@gentoo.org>
      Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com>
      Reviewed-by: NHarry Wentland <Harry.Wentland@amd.com>
      Acked-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
      Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ce46da15
    • T
      drm/omap: fix memory barrier bug in DMM driver · 7423c288
      Tomi Valkeinen 提交于
      [ Upstream commit 538f66ba204944470a653a4cccc5f8befdf97c22 ]
      
      A DMM timeout "timed out waiting for done" has been observed on DRA7
      devices. The timeout happens rarely, and only when the system is under
      heavy load.
      
      Debugging showed that the timeout can be made to happen much more
      frequently by optimizing the DMM driver, so that there's almost no code
      between writing the last DMM descriptors to RAM, and writing to DMM
      register which starts the DMM transaction.
      
      The current theory is that a wmb() does not properly ensure that the
      data written to RAM is observable by all the components in the system.
      
      This DMM timeout has caused interesting (and rare) bugs as the error
      handling was not functioning properly (the error handling has been fixed
      in previous commits):
      
       * If a DMM timeout happened when a GEM buffer was being pinned for
         display on the screen, a timeout error would be shown, but the driver
         would continue programming DSS HW with broken buffer, leading to
         SYNCLOST floods and possible crashes.
      
       * If a DMM timeout happened when other user (say, video decoder) was
         pinning a GEM buffer, a timeout would be shown but if the user
         handled the error properly, no other issues followed.
      
       * If a DMM timeout happened when a GEM buffer was being released, the
         driver does not even notice the error, leading to crashes or hang
         later.
      
      This patch adds wmb() and readl() calls after the last bit is written to
      RAM, which should ensure that the execution proceeds only after the data
      is actually in RAM, and thus observable by DMM.
      
      The read-back should not be needed. Further study is required to understand
      if DMM is somehow special case and read-back is ok, or if DRA7's memory
      barriers do not work correctly.
      Signed-off-by: NTomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: NPeter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7423c288
    • C
      powerpc/mm: Don't report hugepage tables as memory leaks when using kmemleak · 5b248698
      Christophe Leroy 提交于
      [ Upstream commit 803d690e68f0c5230183f1a42c7d50a41d16e380 ]
      
      When a process allocates a hugepage, the following leak is
      reported by kmemleak. This is a false positive which is
      due to the pointer to the table being stored in the PGD
      as physical memory address and not virtual memory pointer.
      
      unreferenced object 0xc30f8200 (size 512):
        comm "mmap", pid 374, jiffies 4872494 (age 627.630s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<e32b68da>] huge_pte_alloc+0xdc/0x1f8
          [<9e0df1e1>] hugetlb_fault+0x560/0x8f8
          [<7938ec6c>] follow_hugetlb_page+0x14c/0x44c
          [<afbdb405>] __get_user_pages+0x1c4/0x3dc
          [<b8fd7cd9>] __mm_populate+0xac/0x140
          [<3215421e>] vm_mmap_pgoff+0xb4/0xb8
          [<c148db69>] ksys_mmap_pgoff+0xcc/0x1fc
          [<4fcd760f>] ret_from_syscall+0x0/0x38
      
      See commit a984506c ("powerpc/mm: Don't report PUDs as
      memory leaks when using kmemleak") for detailed explanation.
      
      To fix that, this patch tells kmemleak to ignore the allocated
      hugepage table.
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b248698
    • S
      drm/msm: dpu: Allow planes to extend past active display · fb4dd7a9
      Sean Paul 提交于
      [ Upstream commit 96fc56a7 ]
      
      The atomic_check is a bit too aggressive with respect to planes which
      leave the active area. This caused a bunch of log spew when the cursor
      got to the edge of the screen and stopped it from going all the way.
      
      This patch removes the conservative bounds checks from atomic and clips
      the dst rect such that we properly display planes which go off the
      screen.
      
      Changes in v2:
      - Apply the clip to src as well (taking into account scaling)
      Changes in v3:
      - Use drm_atomic_helper_check_plane_state() to clip src/dst
      
      Cc: Sravanthi Kollukuduru <skolluku@codeaurora.org>
      Cc: Jeykumar Sankaran <jsanka@codeaurora.org>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NJeykumar Sankaran <jsanka@codeaurora.org>
      Signed-off-by: NSean Paul <seanpaul@chromium.org>
      Signed-off-by: NRob Clark <robdclark@gmail.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb4dd7a9
    • S
      drm/msm/disp/dpu: Use proper define for drm_encoder_init() 'encoder_type' · 721b54a6
      Stephen Boyd 提交于
      [ Upstream commit 2c043eeffea4813b8f569e84b46035a08de5eb47 ]
      
      We got a bug report that this function oopses when trying to do a kasprintf().
      
      PC is at string+0x2c/0x60
      LR is at vsnprintf+0x28c/0x4ec
      pc : [<ffffff80088d35d8>] lr : [<ffffff80088d5fc4>] pstate: a0c00049
      sp : ffffff80095fb540
      x29: ffffff80095fb540 x28: ffffff8008ad42bc
      x27: 00000000ffffffd8 x26: 0000000000000000
      x25: ffffff8008c216c8 x24: 0000000000000000
      x23: 0000000000000000 x22: ffffff80095fb720
      x21: 0000000000000000 x20: ffffff80095fb720
      x19: ffffff80095fb6f0 x18: 000000000000000a
      x17: 00000000b42ba473 x16: ffffff800805bbe8
      x15: 00000000000a157d x14: 000000000000000c
      x13: 0000000000000000 x12: 0000ffff0000000f
      x11: 0000000000000003 x10: 0000000000000001
      x9 : 0000000000000040 x8 : 000000000000001c
      x7 : ffffffffffffffff x6 : 0000000000000000
      x5 : 0000000000000228 x4 : 0000000000000000
      x3 : ffff0a00ffffff04 x2 : 0000000000007961
      x1 : 0000000000000000 x0 : 0000000000000000
      Process kworker/3:1 (pid: 61, stack limit = 0xffffff80095f8000)
      Call trace:
      Exception stack(0xffffff80095fb400 to 0xffffff80095fb540)
      b400: 0000000000000000 0000000000000000 0000000000007961 ffff0a00ffffff04
      b420: 0000000000000000 0000000000000228 0000000000000000 ffffffffffffffff
      b440: 000000000000001c 0000000000000040 0000000000000001 0000000000000003
      b460: 0000ffff0000000f 0000000000000000 000000000000000c 00000000000a157d
      b480: ffffff800805bbe8 00000000b42ba473 000000000000000a ffffff80095fb6f0
      b4a0: ffffff80095fb720 0000000000000000 ffffff80095fb720 0000000000000000
      b4c0: 0000000000000000 ffffff8008c216c8 0000000000000000 00000000ffffffd8
      b4e0: ffffff8008ad42bc ffffff80095fb540 ffffff80088d5fc4 ffffff80095fb540
      b500: ffffff80088d35d8 00000000a0c00049 ffffff80095fb550 ffffff80080d06a4
      b520: ffffffffffffffff ffffff80088d5e0c ffffff80095fb540 ffffff80088d35d8
      [<ffffff80088d35d8>] string+0x2c/0x60
      [<ffffff80088d5fc4>] vsnprintf+0x28c/0x4ec
      [<ffffff80083973b8>] kvasprintf+0x68/0x100
      [<ffffff800839755c>] kasprintf+0x60/0x80
      [<ffffff800849cc24>] drm_encoder_init+0x134/0x164
      [<ffffff80084d9a7c>] dpu_encoder_init+0x60/0x94
      [<ffffff80084eced0>] _dpu_kms_drm_obj_init+0xa0/0x424
      [<ffffff80084ed870>] dpu_kms_hw_init+0x61c/0x6bc
      [<ffffff80084f7614>] msm_drm_bind+0x380/0x67c
      [<ffffff80085114e4>] try_to_bring_up_master+0x228/0x264
      [<ffffff80085116e8>] component_master_add_with_match+0x90/0xc0
      [<ffffff80084f722c>] msm_pdev_probe+0x260/0x2c8
      [<ffffff800851a910>] platform_drv_probe+0x58/0xa8
      [<ffffff80085185c8>] driver_probe_device+0x2d8/0x40c
      [<ffffff8008518928>] __device_attach_driver+0xd4/0x10c
      [<ffffff800851644c>] bus_for_each_drv+0xb4/0xd0
      [<ffffff8008518230>] __device_attach+0xd0/0x160
      [<ffffff8008518984>] device_initial_probe+0x24/0x30
      [<ffffff800851744c>] bus_probe_device+0x38/0x98
      [<ffffff8008517aac>] deferred_probe_work_func+0x144/0x148
      [<ffffff80080c8654>] process_one_work+0x218/0x3bc
      [<ffffff80080c883c>] process_scheduled_works+0x44/0x48
      [<ffffff80080c95bc>] worker_thread+0x288/0x32c
      [<ffffff80080cea30>] kthread+0x134/0x13c
      [<ffffff8008084750>] ret_from_fork+0x10/0x18
      Code: 910003fd 2a0403e6 eb0400ff 54000060 (38646845)
      
      Looking at the code I see that drm_encoder_init() is called from the DPU
      code with 'DRM_MODE_CONNECTOR_DSI' passed in as the 'encoder_type'
      argument (follow from _dpu_kms_initialize_dsi()). That corresponds to
      the integer 16. That is then indexed into drm_encoder_enum_list in
      drm_encoder_init() to look up the name of the encoder. If you're still
      following along, that's an encoder not a connector! We really want to
      use DRM_MODE_ENCODER_DSI (integer 6) instead of DRM_MODE_CONNECTOR_DSI
      here, or we'll go out of bounds of the encoder array. Pass the right
      thing and everything is fine.
      
      Cc: Jeykumar Sankaran <jsanka@codeaurora.org>
      Cc: Jordan Crouse <jcrouse@codeaurora.org>
      Cc: Sean Paul <seanpaul@chromium.org>
      Fixes: 25fdd593 (drm/msm: Add SDM845 DPU support)
      Tested-by: NSai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
      Reviewed-by: NJeykumar Sankaran <jsanka@codeaurora.org>
      Signed-off-by: NStephen Boyd <swboyd@chromium.org>
      Signed-off-by: NSean Paul <seanpaul@chromium.org>
      Signed-off-by: NRob Clark <robdclark@gmail.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      721b54a6
    • A
      drm/msm/gpu: fix parameters in function msm_gpu_crashstate_capture · 9565e7c0
      Anders Roxell 提交于
      [ Upstream commit 6969019f ]
      
      When CONFIG_DEV_COREDUMP isn't defined msm_gpu_crashstate_capture
      doesn't pass the correct parameters.
      drivers/gpu/drm/msm/msm_gpu.c: In function ‘recover_worker’:
      drivers/gpu/drm/msm/msm_gpu.c:479:34: error: passing argument 2 of ‘msm_gpu_crashstate_capture’ from incompatible pointer type [-Werror=incompatible-pointer-types]
        msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
                                        ^~~~~~
      drivers/gpu/drm/msm/msm_gpu.c:388:13: note: expected ‘char *’ but argument is of type ‘struct msm_gem_submit *’
       static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, char *comm,
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/gpu/drm/msm/msm_gpu.c:479:2: error: too many arguments to function ‘msm_gpu_crashstate_capture’
        msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/gpu/drm/msm/msm_gpu.c:388:13: note: declared here
       static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, char *comm,
      
      In current code the function msm_gpu_crashstate_capture parameters.
      
      Fixes: cdb95931 ("drm/msm/gpu: Add the buffer objects from the submit to the crash dump")
      Signed-off-by: NAnders Roxell <anders.roxell@linaro.org>
      Reviewed-By: NJordan Crouse <jcrouse@codeaurora.org>
      Signed-off-by: NRob Clark <robdclark@gmail.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9565e7c0
    • D
      powerpc/nohash: fix undefined behaviour when testing page size support · b0f85991
      Daniel Axtens 提交于
      [ Upstream commit f5e284803a7206d43e26f9ffcae5de9626d95e37 ]
      
      When enumerating page size definitions to check hardware support,
      we construct a constant which is (1U << (def->shift - 10)).
      
      However, the array of page size definitions is only initalised for
      various MMU_PAGE_* constants, so it contains a number of 0-initialised
      elements with def->shift == 0. This means we end up shifting by a
      very large number, which gives the following UBSan splat:
      
      ================================================================================
      UBSAN: Undefined behaviour in /home/dja/dev/linux/linux/arch/powerpc/mm/tlb_nohash.c:506:21
      shift exponent 4294967286 is too large for 32-bit type 'unsigned int'
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-00045-ga604f927b012-dirty #6
      Call Trace:
      [c00000000101bc20] [c000000000a13d54] .dump_stack+0xa8/0xec (unreliable)
      [c00000000101bcb0] [c0000000004f20a8] .ubsan_epilogue+0x18/0x64
      [c00000000101bd30] [c0000000004f2b10] .__ubsan_handle_shift_out_of_bounds+0x110/0x1a4
      [c00000000101be20] [c000000000d21760] .early_init_mmu+0x1b4/0x5a0
      [c00000000101bf10] [c000000000d1ba28] .early_setup+0x100/0x130
      [c00000000101bf90] [c000000000000528] start_here_multiplatform+0x68/0x80
      ================================================================================
      
      Fix this by first checking if the element exists (shift != 0) before
      constructing the constant.
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b0f85991
    • F
      ARM: imx_v6_v7_defconfig: Select CONFIG_TMPFS_POSIX_ACL · 270cbdee
      Fabio Estevam 提交于
      [ Upstream commit 35d3cbe8 ]
      
      Andreas Müller reports:
      
      "Fixes:
      
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[220]: Failed to apply ACL on /dev/v4l-subdev0: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[224]: Failed to apply ACL on /dev/v4l-subdev1: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[215]: Failed to apply ACL on /dev/v4l-subdev10: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[228]: Failed to apply ACL on /dev/v4l-subdev2: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[232]: Failed to apply ACL on /dev/v4l-subdev5: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[217]: Failed to apply ACL on /dev/v4l-subdev11: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[214]: Failed to apply ACL on /dev/dri/card1: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[216]: Failed to apply ACL on /dev/v4l-subdev8: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[226]: Failed to apply ACL on /dev/v4l-subdev9: Operation not supported
      
      and nasty follow-ups: Starting weston from sddm as unpriviledged user fails
      with some hints on missing access rights."
      
      Select the CONFIG_TMPFS_POSIX_ACL option to fix these issues.
      Reported-by: NAndreas Müller <schnitzeltony@gmail.com>
      Signed-off-by: NFabio Estevam <fabio.estevam@nxp.com>
      Acked-by: NOtavio Salvador <otavio@ossystems.com.br>
      Signed-off-by: NShawn Guo <shawnguo@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      270cbdee
    • C
      drm/amdgpu/powerplay: fix missing break in switch statements · 17a5d018
      Colin Ian King 提交于
      [ Upstream commit 14b284832e7dea6f54f0adfd7bed105548b94e57 ]
      
      There are several switch statements that are missing break statements.
      Add missing breaks to handle any fall-throughs corner cases.
      
      Detected by CoverityScan, CID#1457175 ("Missing break in switch")
      
      Fixes: 18aafc59 ("drm/amd/powerplay: implement fw related smu interface for iceland.")
      Acked-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      17a5d018
    • G
      drm/nouveau/secboot/acr: fix memory leak · 05dff1e2
      Gustavo A. R. Silva 提交于
      [ Upstream commit 74a07c0a59fa372b069d879971ba4d9e341979cf ]
      
      In case memory resources for *bl_desc* were allocated, release
      them before return.
      
      Addresses-Coverity-ID: 1472021 ("Resource leak")
      Fixes: 0d466901 ("drm/nouveau/secboot/acr: Remove VLA usage")
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05dff1e2
    • M
      tracing/kprobes: Check the probe on unloaded module correctly · 784c2eb3
      Masami Hiramatsu 提交于
      [ Upstream commit 59158ec4 ]
      
      Current kprobe event doesn't checks correctly whether the
      given event is on unloaded module or not. It just checks
      the event has ":" in the name.
      
      That is not enough because if we define a probe on non-exist
      symbol on loaded module, it allows to define that (with
      warning message)
      
      To ensure it correctly, this searches the module name on
      loaded module list and only if there is not, it allows to
      define it. (this event will be available when the target
      module is loaded)
      
      Link: http://lkml.kernel.org/r/153547309528.26502.8300278470528281328.stgit@devboxSigned-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      784c2eb3
    • M
      tty: check name length in tty_find_polling_driver() · 7ecd146b
      Miles Chen 提交于
      [ Upstream commit 33a1a7be198657c8ca26ad406c4d2a89b7162bcc ]
      
      The issue is found by a fuzzing test.
      If tty_find_polling_driver() recevies an incorrect input such as
      ',,' or '0b', the len becomes 0 and strncmp() always return 0.
      In this case, a null p->ops->poll_init() is called and it causes a kernel
      panic.
      
      Fix this by checking name length against zero in tty_find_polling_driver().
      
      $echo ,, > /sys/module/kgdboc/parameters/kgdboc
      [   20.804451] WARNING: CPU: 1 PID: 104 at drivers/tty/serial/serial_core.c:457
      uart_get_baud_rate+0xe8/0x190
      [   20.804917] Modules linked in:
      [   20.805317] CPU: 1 PID: 104 Comm: sh Not tainted 4.19.0-rc7ajb #8
      [   20.805469] Hardware name: linux,dummy-virt (DT)
      [   20.805732] pstate: 20000005 (nzCv daif -PAN -UAO)
      [   20.805895] pc : uart_get_baud_rate+0xe8/0x190
      [   20.806042] lr : uart_get_baud_rate+0xc0/0x190
      [   20.806476] sp : ffffffc06acff940
      [   20.806676] x29: ffffffc06acff940 x28: 0000000000002580
      [   20.806977] x27: 0000000000009600 x26: 0000000000009600
      [   20.807231] x25: ffffffc06acffad0 x24: 00000000ffffeff0
      [   20.807576] x23: 0000000000000001 x22: 0000000000000000
      [   20.807807] x21: 0000000000000001 x20: 0000000000000000
      [   20.808049] x19: ffffffc06acffac8 x18: 0000000000000000
      [   20.808277] x17: 0000000000000000 x16: 0000000000000000
      [   20.808520] x15: ffffffffffffffff x14: ffffffff00000000
      [   20.808757] x13: ffffffffffffffff x12: 0000000000000001
      [   20.809011] x11: 0101010101010101 x10: ffffff880d59ff5f
      [   20.809292] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
      [   20.809549] x7 : 0000000000000000 x6 : ffffff880d59ff5f
      [   20.809803] x5 : 0000000080008001 x4 : 0000000000000003
      [   20.810056] x3 : ffffff900853e6b4 x2 : dfffff9000000000
      [   20.810693] x1 : ffffffc06acffad0 x0 : 0000000000000cb0
      [   20.811005] Call trace:
      [   20.811214]  uart_get_baud_rate+0xe8/0x190
      [   20.811479]  serial8250_do_set_termios+0xe0/0x6f4
      [   20.811719]  serial8250_set_termios+0x48/0x54
      [   20.811928]  uart_set_options+0x138/0x1bc
      [   20.812129]  uart_poll_init+0x114/0x16c
      [   20.812330]  tty_find_polling_driver+0x158/0x200
      [   20.812545]  configure_kgdboc+0xbc/0x1bc
      [   20.812745]  param_set_kgdboc_var+0xb8/0x150
      [   20.812960]  param_attr_store+0xbc/0x150
      [   20.813160]  module_attr_store+0x40/0x58
      [   20.813364]  sysfs_kf_write+0x8c/0xa8
      [   20.813563]  kernfs_fop_write+0x154/0x290
      [   20.813764]  vfs_write+0xf0/0x278
      [   20.813951]  __arm64_sys_write+0x84/0xf4
      [   20.814400]  el0_svc_common+0xf4/0x1dc
      [   20.814616]  el0_svc_handler+0x98/0xbc
      [   20.814804]  el0_svc+0x8/0xc
      [   20.822005] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [   20.826913] Mem abort info:
      [   20.827103]   ESR = 0x84000006
      [   20.827352]   Exception class = IABT (current EL), IL = 16 bits
      [   20.827655]   SET = 0, FnV = 0
      [   20.827855]   EA = 0, S1PTW = 0
      [   20.828135] user pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
      [   20.828484] [0000000000000000] pgd=00000000aadee003, pud=00000000aadee003, pmd=0000000000000000
      [   20.829195] Internal error: Oops: 84000006 [#1] SMP
      [   20.829564] Modules linked in:
      [   20.829890] CPU: 1 PID: 104 Comm: sh Tainted: G        W         4.19.0-rc7ajb #8
      [   20.830545] Hardware name: linux,dummy-virt (DT)
      [   20.830829] pstate: 60000085 (nZCv daIf -PAN -UAO)
      [   20.831174] pc :           (null)
      [   20.831457] lr : serial8250_do_set_termios+0x358/0x6f4
      [   20.831727] sp : ffffffc06acff9b0
      [   20.831936] x29: ffffffc06acff9b0 x28: ffffff9008d7c000
      [   20.832267] x27: ffffff900969e16f x26: 0000000000000000
      [   20.832589] x25: ffffff900969dfb0 x24: 0000000000000000
      [   20.832906] x23: ffffffc06acffad0 x22: ffffff900969e160
      [   20.833232] x21: 0000000000000000 x20: ffffffc06acffac8
      [   20.833559] x19: ffffff900969df90 x18: 0000000000000000
      [   20.833878] x17: 0000000000000000 x16: 0000000000000000
      [   20.834491] x15: ffffffffffffffff x14: ffffffff00000000
      [   20.834821] x13: ffffffffffffffff x12: 0000000000000001
      [   20.835143] x11: 0101010101010101 x10: ffffff880d59ff5f
      [   20.835467] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
      [   20.835790] x7 : 0000000000000000 x6 : ffffff880d59ff5f
      [   20.836111] x5 : c06419717c314100 x4 : 0000000000000007
      [   20.836419] x3 : 0000000000000000 x2 : 0000000000000000
      [   20.836732] x1 : 0000000000000001 x0 : ffffff900969df90
      [   20.837100] Process sh (pid: 104, stack limit = 0x(____ptrval____))
      [   20.837396] Call trace:
      [   20.837566]            (null)
      [   20.837816]  serial8250_set_termios+0x48/0x54
      [   20.838089]  uart_set_options+0x138/0x1bc
      [   20.838570]  uart_poll_init+0x114/0x16c
      [   20.838834]  tty_find_polling_driver+0x158/0x200
      [   20.839119]  configure_kgdboc+0xbc/0x1bc
      [   20.839380]  param_set_kgdboc_var+0xb8/0x150
      [   20.839658]  param_attr_store+0xbc/0x150
      [   20.839920]  module_attr_store+0x40/0x58
      [   20.840183]  sysfs_kf_write+0x8c/0xa8
      [   20.840183]  sysfs_kf_write+0x8c/0xa8
      [   20.840440]  kernfs_fop_write+0x154/0x290
      [   20.840702]  vfs_write+0xf0/0x278
      [   20.840942]  __arm64_sys_write+0x84/0xf4
      [   20.841209]  el0_svc_common+0xf4/0x1dc
      [   20.841471]  el0_svc_handler+0x98/0xbc
      [   20.841713]  el0_svc+0x8/0xc
      [   20.842057] Code: bad PC value
      [   20.842764] ---[ end trace a8835d7de79aaadf ]---
      [   20.843134] Kernel panic - not syncing: Fatal exception
      [   20.843515] SMP: stopping secondary CPUs
      [   20.844289] Kernel Offset: disabled
      [   20.844634] CPU features: 0x0,21806002
      [   20.844857] Memory Limit: none
      [   20.845172] ---[ end Kernel panic - not syncing: Fatal exception ]---
      Signed-off-by: NMiles Chen <miles.chen@mediatek.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ecd146b
    • S
      powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() · 234bee8b
      Sam Bobroff 提交于
      [ Upstream commit f9bc28aedfb5bbd572d2d365f3095c1becd7209b ]
      
      If an error occurs during an unplug operation, it's possible for
      eeh_dump_dev_log() to be called when edev->pdn is null, which
      currently leads to dereferencing a null pointer.
      
      Handle this by skipping the error log for those devices.
      Signed-off-by: NSam Bobroff <sbobroff@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      234bee8b
    • J
      powerpc/Makefile: Fix PPC_BOOK3S_64 ASFLAGS · 32f2674c
      Joel Stanley 提交于
      [ Upstream commit 960e30029863db95ec79a71009272d4661db5991 ]
      
      Ever since commit 15a3204d ("powerpc/64s: Set assembler machine type
      to POWER4") we force -mpower4 to be passed to the assembler
      irrespective of the CFLAGS used (for Book3s 64).
      
      When building a powerpc64 kernel with clang, clang will not add -many
      to the assembler flags, so any instructions that the compiler has
      generated that are not available on power4 will cause an error:
      
        /usr/bin/as -a64 -mppc64 -mlittle-endian -mpower8 \
         -I ./arch/powerpc/include -I ./arch/powerpc/include/generated \
         -I ./include -I ./arch/powerpc/include/uapi \
         -I ./arch/powerpc/include/generated/uapi -I ./include/uapi \
         -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc \
         -maltivec -mpower4 -o init/do_mounts.o /tmp/do_mounts-3b0a3d.s
        /tmp/do_mounts-51ce54.s:748: Error: unrecognized opcode: `isel'
      
      GCC does include -many, so the GCC driven gas call will succeed:
      
        as -v -I ./arch/powerpc/include -I ./arch/powerpc/include/generated -I
        ./include -I ./arch/powerpc/include/uapi
        -I ./arch/powerpc/include/generated/uapi -I ./include/uapi
        -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc
         -a64 -mpower8 -many -mlittle -maltivec -mpower4 -o init/do_mounts.o
      
      Note that isel is power7 and above for IBM CPUs. GCC only generates it
      for Power9 and above, but the above test was run against the clang
      generated assembly.
      
      Peter Bergner explains:
      
        When using -many -mpower4, gas will first try and find a matching
        power4 mnemonic and failing that, it will then allow any valid
        mnemonic that gas knows about. GCC's use of -many predates me
        though.
      
        IIRC, Alan looked at trying to remove it, but I forget why he
        didn't. Could be either a gcc or gas issue at the time. I'm not sure
        whether issue still exists or not. He and I have modified how gas
        works internally a fair amount since he tried removing gcc use of
        -many.
      
        I will also note that when using -many, gas will choose the first
        mnemonic that matches in the mnemonic table and we have (mostly)
        sorted the table so that server mnemonics show up earlier in the
        table than other mnemonics, so they'll be seen/chosen first.
      
      By explicitly setting -many we can build with Clang and GCC while
      retaining the -mpower4 option.
      Signed-off-by: NJoel Stanley <joel@jms.id.au>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      32f2674c
    • R
      Input: wm97xx-ts - fix exit path · d7dba42c
      Randy Dunlap 提交于
      [ Upstream commit a3f7c3fc ]
      
      Loading then unloading wm97xx-ts.ko when CONFIG_AC97_BUS=m
      causes a WARNING: from drivers/base/driver.c:
      
      Unexpected driver unregister!
      WARNING: CPU: 0 PID: 1709 at ../drivers/base/driver.c:193 driver_unregister+0x30/0x40
      
      Fix this by only calling driver_unregister() with the same
      condition that driver_register() is called.
      
      Fixes: ae9d1b5f ("Input: wm97xx: add new AC97 bus support")
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7dba42c
    • S
      drm/amd/display: fix bug of accessing invalid memory · 139dde1e
      Su Sung Chung 提交于
      [ Upstream commit 43c3ff27 ]
      
      [Why]
      A loop inside of build_evenly_distributed_points function that traverse through
      the array of points become an infinite loop when m_GammaUpdates does not
      get assigned to any value.
      
      [How]
      In DMColor, clear m_gammaIsValid bit just before writting all Zeromem for
      m_GammaUpdates, to prevent calling build_evenly_distributed_points
      before m_GammaUpdates gets assigned to some value.
      Signed-off-by: NSu Sung Chung <Su.Chung@amd.com>
      Reviewed-by: NAric Cyr <Aric.Cyr@amd.com>
      Acked-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
      Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      139dde1e
    • C
      powerpc/mm: fix always true/false warning in slice.c · 56e4367f
      Christophe Leroy 提交于
      [ Upstream commit 37e9c674e7e6f445e12cb1151017bd4bacdd1e2d ]
      
      This patch fixes the following warnings (obtained with make W=1).
      
      arch/powerpc/mm/slice.c: In function 'slice_range_to_mask':
      arch/powerpc/mm/slice.c:73:12: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (start < SLICE_LOW_TOP) {
                  ^
      arch/powerpc/mm/slice.c:81:20: error: comparison is always false due to limited range of data type [-Werror=type-limits]
        if ((start + len) > SLICE_LOW_TOP) {
                          ^
      arch/powerpc/mm/slice.c: In function 'slice_mask_for_free':
      arch/powerpc/mm/slice.c:136:17: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (high_limit <= SLICE_LOW_TOP)
                       ^
      arch/powerpc/mm/slice.c: In function 'slice_check_range_fits':
      arch/powerpc/mm/slice.c:185:12: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (start < SLICE_LOW_TOP) {
                  ^
      arch/powerpc/mm/slice.c:195:39: error: comparison is always false due to limited range of data type [-Werror=type-limits]
        if (SLICE_NUM_HIGH && ((start + len) > SLICE_LOW_TOP)) {
                                             ^
      arch/powerpc/mm/slice.c: In function 'slice_scan_available':
      arch/powerpc/mm/slice.c:306:11: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (addr < SLICE_LOW_TOP) {
                 ^
      arch/powerpc/mm/slice.c: In function 'get_slice_psize':
      arch/powerpc/mm/slice.c:709:11: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (addr < SLICE_LOW_TOP) {
                 ^
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56e4367f
    • M
      powerpc/mm: Fix page table dump to work on Radix · 287deb22
      Michael Ellerman 提交于
      [ Upstream commit 0d923962ab69c27cca664a2d535e90ef655110ca ]
      
      When we're running on Book3S with the Radix MMU enabled the page table
      dump currently prints the wrong addresses because it uses the wrong
      start address.
      
      Fix it to use PAGE_OFFSET rather than KERN_VIRT_START.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      287deb22
    • N
      powerpc/64/module: REL32 relocation range check · 63880b2f
      Nicholas Piggin 提交于
      [ Upstream commit b851ba02a6f3075f0f99c60c4bc30a4af80cf428 ]
      
      The recent module relocation overflow crash demonstrated that we
      have no range checking on REL32 relative relocations. This patch
      implements a basic check, the same kernel that previously oopsed
      and rebooted now continues with some of these errors when loading
      the module:
      
        module_64: x_tables: REL32 527703503449812 out of range!
      
      Possibly other relocations (ADDR32, REL16, TOC16, etc.) should also have
      overflow checks.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      63880b2f
    • C
      powerpc/traps: restore recoverability of machine_check interrupts · 54de3d7d
      Christophe Leroy 提交于
      [ Upstream commit daf00ae71dad8aa05965713c62558aeebf2df48e ]
      
      commit b96672dd ("powerpc: Machine check interrupt is a non-
      maskable interrupt") added a call to nmi_enter() at the beginning of
      machine check restart exception handler. Due to that, in_interrupt()
      always returns true regardless of the state before entering the
      exception, and die() panics even when the system was not already in
      interrupt.
      
      This patch calls nmi_exit() before calling die() in order to restore
      the interrupt state we had before calling nmi_enter()
      
      Fixes: b96672dd ("powerpc: Machine check interrupt is a non-maskable interrupt")
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54de3d7d
  2. 14 11月, 2018 20 次提交
    • G
      Linux 4.19.2 · 7950eb31
      Greg Kroah-Hartman 提交于
      7950eb31
    • S
      MD: fix invalid stored role for a disk - try2 · 589c3750
      Shaohua Li 提交于
      commit 9e753ba9b9b405e3902d9f08aec5f2ea58a0c317 upstream.
      
      Commit d595567dc4f0 (MD: fix invalid stored role for a disk) broke linear
      hotadd. Let's only fix the role for disks in raid1/10.
      Based on Guoqing's original patch.
      Reported-by: Nkernel test robot <rong.a.chen@intel.com>
      Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
      Cc: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      589c3750
    • T
      vga_switcheroo: Fix missing gpu_bound call at audio client registration · fcd90d7d
      Takashi Iwai 提交于
      commit fc09ab7a767394f9ecdad84ea6e85d68b83c8e21 upstream.
      
      The commit 37a3a98e ("ALSA: hda - Enable runtime PM only for
      discrete GPU") added a new ops gpu_bound to be called when GPU gets
      bound.  The patch overlooked, however, that vga_switcheroo_enable() is
      called only once at GPU is bound.  When an audio client is registered
      after that point, it would miss the gpu_bound call.  This leads to the
      unexpected lack of runtime PM in HD-audio side.
      
      For addressing that regression, just call gpu_bound callback manually
      at vga_switcheroo_register_audio_client() when the GPU was already
      bound.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201615
      Fixes: 37a3a98e ("ALSA: hda - Enable runtime PM only for discrete GPU")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: NLukas Wunner <lukas@wunner.de>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fcd90d7d
    • D
      bpf: wait for running BPF programs when updating map-in-map · 34d96152
      Daniel Colascione 提交于
      commit 1ae80cf31938c8f77c37a29bbe29e7f1cd492be8 upstream.
      
      The map-in-map frequently serves as a mechanism for atomic
      snapshotting of state that a BPF program might record.  The current
      implementation is dangerous to use in this way, however, since
      userspace has no way of knowing when all programs that might have
      retrieved the "old" value of the map may have completed.
      
      This change ensures that map update operations on map-in-map map types
      always wait for all references to the old map to drop before returning
      to userspace.
      Signed-off-by: NDaniel Colascione <dancol@google.com>
      Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NChenbo Feng <fengc@google.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34d96152
    • J
      userns: also map extents in the reverse map to kernel IDs · 9a7a80fb
      Jann Horn 提交于
      commit d2f007dbe7e4c9583eea6eb04d60001e85c6f1bd upstream.
      
      The current logic first clones the extent array and sorts both copies, then
      maps the lower IDs of the forward mapping into the lower namespace, but
      doesn't map the lower IDs of the reverse mapping.
      
      This means that code in a nested user namespace with >5 extents will see
      incorrect IDs. It also breaks some access checks, like
      inode_owner_or_capable() and privileged_wrt_inode_uidgid(), so a process
      can incorrectly appear to be capable relative to an inode.
      
      To fix it, we have to make sure that the "lower_first" members of extents
      in both arrays are translated; and we have to make sure that the reverse
      map is sorted *after* the translation (since otherwise the translation can
      break the sorting).
      
      This is CVE-2018-18955.
      
      Fixes: 6397fac4 ("userns: bump idmap limits to 340")
      Cc: stable@vger.kernel.org
      Signed-off-by: NJann Horn <jannh@google.com>
      Tested-by: NEric W. Biederman <ebiederm@xmission.com>
      Reviewed-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9a7a80fb
    • M
      vt: fix broken display when running aptitude · 13794b24
      Mikulas Patocka 提交于
      commit 943210ba807ec50aafa2fa7b13bd6d36a478969b upstream.
      
      If you run aptitude on framebuffer console, the display is corrupted. The
      corruption is caused by the commit d8ae7242. The patch adds "offset" to
      "start" when calling scr_memsetw, but it forgets to do the same addition
      on a subsequent call to do_update_region.
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Fixes: d8ae7242 ("vt: preserve unicode values corresponding to screen characters")
      Reviewed-by: NNicolas Pitre <nico@linaro.org>
      Cc: stable@vger.kernel.org	# 4.19
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      13794b24
    • D
      net: sched: Remove TCA_OPTIONS from policy · ab5d01b6
      David Ahern 提交于
      commit e72bde6b66299602087c8c2350d36a525e75d06e upstream.
      
      Marco reported an error with hfsc:
      root@Calimero:~# tc qdisc add dev eth0 root handle 1:0 hfsc default 1
      Error: Attribute failed policy validation.
      
      Apparently a few implementations pass TCA_OPTIONS as a binary instead
      of nested attribute, so drop TCA_OPTIONS from the policy.
      
      Fixes: 8b4c3cdd ("net: sched: Add policy validation for tc attributes")
      Reported-by: NMarco Berizzi <pupilla@libero.it>
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab5d01b6
    • F
      Btrfs: fix use-after-free when dumping free space · 0c286e9d
      Filipe Manana 提交于
      commit 9084cb6a24bf5838a665af92ded1af8363f9e563 upstream.
      
      We were iterating a block group's free space cache rbtree without locking
      first the lock that protects it (the free_space_ctl->free_space_offset
      rbtree is protected by the free_space_ctl->tree_lock spinlock).
      
      KASAN reported an use-after-free problem when iterating such a rbtree due
      to a concurrent rbtree delete:
      
      [ 9520.359168] ==================================================================
      [ 9520.359656] BUG: KASAN: use-after-free in rb_next+0x13/0x90
      [ 9520.359949] Read of size 8 at addr ffff8800b7ada500 by task btrfs-transacti/1721
      [ 9520.360357]
      [ 9520.360530] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G             L    4.19.0-rc8-nbor #555
      [ 9520.360990] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [ 9520.362682] Call Trace:
      [ 9520.362887]  dump_stack+0xa4/0xf5
      [ 9520.363146]  print_address_description+0x78/0x280
      [ 9520.363412]  kasan_report+0x263/0x390
      [ 9520.363650]  ? rb_next+0x13/0x90
      [ 9520.363873]  __asan_load8+0x54/0x90
      [ 9520.364102]  rb_next+0x13/0x90
      [ 9520.364380]  btrfs_dump_free_space+0x146/0x160 [btrfs]
      [ 9520.364697]  dump_space_info+0x2cd/0x310 [btrfs]
      [ 9520.364997]  btrfs_reserve_extent+0x1ee/0x1f0 [btrfs]
      [ 9520.365310]  __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs]
      [ 9520.365646]  ? btrfs_update_time+0x180/0x180 [btrfs]
      [ 9520.365923]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.366204]  ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs]
      [ 9520.366549]  btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs]
      [ 9520.366880]  cache_save_setup+0x42e/0x580 [btrfs]
      [ 9520.367220]  ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs]
      [ 9520.367518]  ? lock_downgrade+0x2f0/0x2f0
      [ 9520.367799]  ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs]
      [ 9520.368104]  ? kasan_check_read+0x11/0x20
      [ 9520.368349]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.368638]  btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs]
      [ 9520.368978]  ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs]
      [ 9520.369282]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.369534]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.369811]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.370137]  commit_cowonly_roots+0x4b9/0x610 [btrfs]
      [ 9520.370560]  ? commit_fs_roots+0x350/0x350 [btrfs]
      [ 9520.370926]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.371285]  btrfs_commit_transaction+0x5e5/0x10e0 [btrfs]
      [ 9520.371612]  ? btrfs_apply_pending_changes+0x90/0x90 [btrfs]
      [ 9520.371943]  ? start_transaction+0x168/0x6c0 [btrfs]
      [ 9520.372257]  transaction_kthread+0x21c/0x240 [btrfs]
      [ 9520.372537]  kthread+0x1d2/0x1f0
      [ 9520.372793]  ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs]
      [ 9520.373090]  ? kthread_park+0xb0/0xb0
      [ 9520.373329]  ret_from_fork+0x3a/0x50
      [ 9520.373567]
      [ 9520.373738] Allocated by task 1804:
      [ 9520.373974]  kasan_kmalloc+0xff/0x180
      [ 9520.374208]  kasan_slab_alloc+0x11/0x20
      [ 9520.374447]  kmem_cache_alloc+0xfc/0x2d0
      [ 9520.374731]  __btrfs_add_free_space+0x40/0x580 [btrfs]
      [ 9520.375044]  unpin_extent_range+0x4f7/0x7a0 [btrfs]
      [ 9520.375383]  btrfs_finish_extent_commit+0x15f/0x4d0 [btrfs]
      [ 9520.375707]  btrfs_commit_transaction+0xb06/0x10e0 [btrfs]
      [ 9520.376027]  btrfs_alloc_data_chunk_ondemand+0x237/0x5c0 [btrfs]
      [ 9520.376365]  btrfs_check_data_free_space+0x81/0xd0 [btrfs]
      [ 9520.376689]  btrfs_delalloc_reserve_space+0x25/0x80 [btrfs]
      [ 9520.377018]  btrfs_direct_IO+0x42e/0x6d0 [btrfs]
      [ 9520.377284]  generic_file_direct_write+0x11e/0x220
      [ 9520.377587]  btrfs_file_write_iter+0x472/0xac0 [btrfs]
      [ 9520.377875]  aio_write+0x25c/0x360
      [ 9520.378106]  io_submit_one+0xaa0/0xdc0
      [ 9520.378343]  __se_sys_io_submit+0xfa/0x2f0
      [ 9520.378589]  __x64_sys_io_submit+0x43/0x50
      [ 9520.378840]  do_syscall_64+0x7d/0x240
      [ 9520.379081]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 9520.379387]
      [ 9520.379557] Freed by task 1802:
      [ 9520.379782]  __kasan_slab_free+0x173/0x260
      [ 9520.380028]  kasan_slab_free+0xe/0x10
      [ 9520.380262]  kmem_cache_free+0xc1/0x2c0
      [ 9520.380544]  btrfs_find_space_for_alloc+0x4cd/0x4e0 [btrfs]
      [ 9520.380866]  find_free_extent+0xa99/0x17e0 [btrfs]
      [ 9520.381166]  btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
      [ 9520.381474]  btrfs_get_blocks_direct+0x60b/0xbd0 [btrfs]
      [ 9520.381761]  __blockdev_direct_IO+0x10ee/0x58a1
      [ 9520.382059]  btrfs_direct_IO+0x25a/0x6d0 [btrfs]
      [ 9520.382321]  generic_file_direct_write+0x11e/0x220
      [ 9520.382623]  btrfs_file_write_iter+0x472/0xac0 [btrfs]
      [ 9520.382904]  aio_write+0x25c/0x360
      [ 9520.383172]  io_submit_one+0xaa0/0xdc0
      [ 9520.383416]  __se_sys_io_submit+0xfa/0x2f0
      [ 9520.383678]  __x64_sys_io_submit+0x43/0x50
      [ 9520.383927]  do_syscall_64+0x7d/0x240
      [ 9520.384165]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 9520.384439]
      [ 9520.384610] The buggy address belongs to the object at ffff8800b7ada500
                      which belongs to the cache btrfs_free_space of size 72
      [ 9520.385175] The buggy address is located 0 bytes inside of
                      72-byte region [ffff8800b7ada500, ffff8800b7ada548)
      [ 9520.385691] The buggy address belongs to the page:
      [ 9520.385957] page:ffffea0002deb680 count:1 mapcount:0 mapping:ffff880108a1d700 index:0x0 compound_mapcount: 0
      [ 9520.388030] flags: 0x8100(slab|head)
      [ 9520.388281] raw: 0000000000008100 ffffea0002deb608 ffffea0002728808 ffff880108a1d700
      [ 9520.388722] raw: 0000000000000000 0000000000130013 00000001ffffffff 0000000000000000
      [ 9520.389169] page dumped because: kasan: bad access detected
      [ 9520.389473]
      [ 9520.389658] Memory state around the buggy address:
      [ 9520.389943]  ffff8800b7ada400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.390368]  ffff8800b7ada480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.390796] >ffff8800b7ada500: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc
      [ 9520.391223]                    ^
      [ 9520.391461]  ffff8800b7ada580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.391885]  ffff8800b7ada600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.392313] ==================================================================
      [ 9520.392772] BTRFS critical (device vdc): entry offset 2258497536, bytes 131072, bitmap no
      [ 9520.393247] BUG: unable to handle kernel NULL pointer dereference at 0000000000000011
      [ 9520.393705] PGD 800000010dbab067 P4D 800000010dbab067 PUD 107551067 PMD 0
      [ 9520.394059] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [ 9520.394378] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G    B        L    4.19.0-rc8-nbor #555
      [ 9520.394858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [ 9520.395350] RIP: 0010:rb_next+0x3c/0x90
      [ 9520.396461] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292
      [ 9520.396762] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c
      [ 9520.397115] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011
      [ 9520.397468] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc
      [ 9520.397821] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000
      [ 9520.398188] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000
      [ 9520.398555] FS:  0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000
      [ 9520.399007] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9520.399335] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0
      [ 9520.399679] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9520.400023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9520.400400] Call Trace:
      [ 9520.400648]  btrfs_dump_free_space+0x146/0x160 [btrfs]
      [ 9520.400974]  dump_space_info+0x2cd/0x310 [btrfs]
      [ 9520.401287]  btrfs_reserve_extent+0x1ee/0x1f0 [btrfs]
      [ 9520.401609]  __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs]
      [ 9520.401952]  ? btrfs_update_time+0x180/0x180 [btrfs]
      [ 9520.402232]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.402522]  ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs]
      [ 9520.402882]  btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs]
      [ 9520.403261]  cache_save_setup+0x42e/0x580 [btrfs]
      [ 9520.403570]  ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs]
      [ 9520.403871]  ? lock_downgrade+0x2f0/0x2f0
      [ 9520.404161]  ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs]
      [ 9520.404481]  ? kasan_check_read+0x11/0x20
      [ 9520.404732]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.405026]  btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs]
      [ 9520.405375]  ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs]
      [ 9520.405694]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.405958]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.406243]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.406574]  commit_cowonly_roots+0x4b9/0x610 [btrfs]
      [ 9520.406899]  ? commit_fs_roots+0x350/0x350 [btrfs]
      [ 9520.407253]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.407589]  btrfs_commit_transaction+0x5e5/0x10e0 [btrfs]
      [ 9520.407925]  ? btrfs_apply_pending_changes+0x90/0x90 [btrfs]
      [ 9520.408262]  ? start_transaction+0x168/0x6c0 [btrfs]
      [ 9520.408582]  transaction_kthread+0x21c/0x240 [btrfs]
      [ 9520.408870]  kthread+0x1d2/0x1f0
      [ 9520.409138]  ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs]
      [ 9520.409440]  ? kthread_park+0xb0/0xb0
      [ 9520.409682]  ret_from_fork+0x3a/0x50
      [ 9520.410508] Dumping ftrace buffer:
      [ 9520.410764]    (ftrace buffer empty)
      [ 9520.411007] CR2: 0000000000000011
      [ 9520.411297] ---[ end trace 01a0863445cf360a ]---
      [ 9520.411568] RIP: 0010:rb_next+0x3c/0x90
      [ 9520.412644] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292
      [ 9520.412932] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c
      [ 9520.413274] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011
      [ 9520.413616] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc
      [ 9520.414007] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000
      [ 9520.414349] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000
      [ 9520.416074] FS:  0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000
      [ 9520.416536] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9520.416848] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0
      [ 9520.418477] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9520.418846] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9520.419204] Kernel panic - not syncing: Fatal exception
      [ 9520.419666] Dumping ftrace buffer:
      [ 9520.419930]    (ftrace buffer empty)
      [ 9520.420168] Kernel Offset: disabled
      [ 9520.420406] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Fix this by acquiring the respective lock before iterating the rbtree.
      Reported-by: NNikolay Borisov <nborisov@suse.com>
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c286e9d
    • F
      Btrfs: fix use-after-free during inode eviction · 5b7a4630
      Filipe Manana 提交于
      commit 421f0922a2cfb0c75acd9746454aaa576c711a65 upstream.
      
      At inode.c:evict_inode_truncate_pages(), when we iterate over the
      inode's extent states, we access an extent state record's "state" field
      after we unlocked the inode's io tree lock. This can lead to a
      use-after-free issue because after we unlock the io tree that extent
      state record might have been freed due to being merged into another
      adjacent extent state record (a previous inflight bio for a read
      operation finished in the meanwhile which unlocked a range in the io
      tree and cause a merge of extent state records, as explained in the
      comment before the while loop added in commit 6ca07097 ("Btrfs: fix
      hang during inode eviction due to concurrent readahead")).
      
      Fix this by keeping a copy of the extent state's flags in a local
      variable and using it after unlocking the io tree.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201189
      Fixes: b9d0b389 ("btrfs: Add handler for invalidate page")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b7a4630
    • J
      btrfs: move the dio_sem higher up the callchain · 51c62a33
      Josef Bacik 提交于
      commit c495144bc6962186feae31d687596d2472000e45 upstream.
      
      We're getting a lockdep splat because we take the dio_sem under the
      log_mutex.  What we really need is to protect fsync() from logging an
      extent map for an extent we never waited on higher up, so just guard the
      whole thing with dio_sem.
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.18.0-rc4-xfstests-00025-g5de5edbaf1d4 #411 Not tainted
      ------------------------------------------------------
      aio-dio-invalid/30928 is trying to acquire lock:
      0000000092621cfd (&mm->mmap_sem){++++}, at: get_user_pages_unlocked+0x5a/0x1e0
      
      but task is already holding lock:
      00000000cefe6b35 (&ei->dio_sem){++++}, at: btrfs_direct_IO+0x3be/0x400
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #5 (&ei->dio_sem){++++}:
             lock_acquire+0xbd/0x220
             down_write+0x51/0xb0
             btrfs_log_changed_extents+0x80/0xa40
             btrfs_log_inode+0xbaf/0x1000
             btrfs_log_inode_parent+0x26f/0xa80
             btrfs_log_dentry_safe+0x50/0x70
             btrfs_sync_file+0x357/0x540
             do_fsync+0x38/0x60
             __ia32_sys_fdatasync+0x12/0x20
             do_fast_syscall_32+0x9a/0x2f0
             entry_SYSENTER_compat+0x84/0x96
      
      -> #4 (&ei->log_mutex){+.+.}:
             lock_acquire+0xbd/0x220
             __mutex_lock+0x86/0xa10
             btrfs_record_unlink_dir+0x2a/0xa0
             btrfs_unlink+0x5a/0xc0
             vfs_unlink+0xb1/0x1a0
             do_unlinkat+0x264/0x2b0
             do_fast_syscall_32+0x9a/0x2f0
             entry_SYSENTER_compat+0x84/0x96
      
      -> #3 (sb_internal#2){.+.+}:
             lock_acquire+0xbd/0x220
             __sb_start_write+0x14d/0x230
             start_transaction+0x3e6/0x590
             btrfs_evict_inode+0x475/0x640
             evict+0xbf/0x1b0
             btrfs_run_delayed_iputs+0x6c/0x90
             cleaner_kthread+0x124/0x1a0
             kthread+0x106/0x140
             ret_from_fork+0x3a/0x50
      
      -> #2 (&fs_info->cleaner_delayed_iput_mutex){+.+.}:
             lock_acquire+0xbd/0x220
             __mutex_lock+0x86/0xa10
             btrfs_alloc_data_chunk_ondemand+0x197/0x530
             btrfs_check_data_free_space+0x4c/0x90
             btrfs_delalloc_reserve_space+0x20/0x60
             btrfs_page_mkwrite+0x87/0x520
             do_page_mkwrite+0x31/0xa0
             __handle_mm_fault+0x799/0xb00
             handle_mm_fault+0x7c/0xe0
             __do_page_fault+0x1d3/0x4a0
             async_page_fault+0x1e/0x30
      
      -> #1 (sb_pagefaults){.+.+}:
             lock_acquire+0xbd/0x220
             __sb_start_write+0x14d/0x230
             btrfs_page_mkwrite+0x6a/0x520
             do_page_mkwrite+0x31/0xa0
             __handle_mm_fault+0x799/0xb00
             handle_mm_fault+0x7c/0xe0
             __do_page_fault+0x1d3/0x4a0
             async_page_fault+0x1e/0x30
      
      -> #0 (&mm->mmap_sem){++++}:
             __lock_acquire+0x42e/0x7a0
             lock_acquire+0xbd/0x220
             down_read+0x48/0xb0
             get_user_pages_unlocked+0x5a/0x1e0
             get_user_pages_fast+0xa4/0x150
             iov_iter_get_pages+0xc3/0x340
             do_direct_IO+0xf93/0x1d70
             __blockdev_direct_IO+0x32d/0x1c20
             btrfs_direct_IO+0x227/0x400
             generic_file_direct_write+0xcf/0x180
             btrfs_file_write_iter+0x308/0x58c
             aio_write+0xf8/0x1d0
             io_submit_one+0x3a9/0x620
             __ia32_compat_sys_io_submit+0xb2/0x270
             do_int80_syscall_32+0x5b/0x1a0
             entry_INT80_compat+0x88/0xa0
      
      other info that might help us debug this:
      
      Chain exists of:
        &mm->mmap_sem --> &ei->log_mutex --> &ei->dio_sem
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&ei->dio_sem);
                                     lock(&ei->log_mutex);
                                     lock(&ei->dio_sem);
        lock(&mm->mmap_sem);
      
       *** DEADLOCK ***
      
      1 lock held by aio-dio-invalid/30928:
       #0: 00000000cefe6b35 (&ei->dio_sem){++++}, at: btrfs_direct_IO+0x3be/0x400
      
      stack backtrace:
      CPU: 0 PID: 30928 Comm: aio-dio-invalid Not tainted 4.18.0-rc4-xfstests-00025-g5de5edbaf1d4 #411
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
      Call Trace:
       dump_stack+0x7c/0xbb
       print_circular_bug.isra.37+0x297/0x2a4
       check_prev_add.constprop.45+0x781/0x7a0
       ? __lock_acquire+0x42e/0x7a0
       validate_chain.isra.41+0x7f0/0xb00
       __lock_acquire+0x42e/0x7a0
       lock_acquire+0xbd/0x220
       ? get_user_pages_unlocked+0x5a/0x1e0
       down_read+0x48/0xb0
       ? get_user_pages_unlocked+0x5a/0x1e0
       get_user_pages_unlocked+0x5a/0x1e0
       get_user_pages_fast+0xa4/0x150
       iov_iter_get_pages+0xc3/0x340
       do_direct_IO+0xf93/0x1d70
       ? __alloc_workqueue_key+0x358/0x490
       ? __blockdev_direct_IO+0x14b/0x1c20
       __blockdev_direct_IO+0x32d/0x1c20
       ? btrfs_run_delalloc_work+0x40/0x40
       ? can_nocow_extent+0x490/0x490
       ? kvm_clock_read+0x1f/0x30
       ? can_nocow_extent+0x490/0x490
       ? btrfs_run_delalloc_work+0x40/0x40
       btrfs_direct_IO+0x227/0x400
       ? btrfs_run_delalloc_work+0x40/0x40
       generic_file_direct_write+0xcf/0x180
       btrfs_file_write_iter+0x308/0x58c
       aio_write+0xf8/0x1d0
       ? kvm_clock_read+0x1f/0x30
       ? __might_fault+0x3e/0x90
       io_submit_one+0x3a9/0x620
       ? io_submit_one+0xe5/0x620
       __ia32_compat_sys_io_submit+0xb2/0x270
       do_int80_syscall_32+0x5b/0x1a0
       entry_INT80_compat+0x88/0xa0
      
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      51c62a33
    • J
      btrfs: don't run delayed_iputs in commit · dd472956
      Josef Bacik 提交于
      commit 30928e9baac238a7330085a1c5747f0b5df444b4 upstream.
      
      This could result in a really bad case where we do something like
      
      evict
        evict_refill_and_join
          btrfs_commit_transaction
            btrfs_run_delayed_iputs
              evict
                evict_refill_and_join
                  btrfs_commit_transaction
      ... forever
      
      We have plenty of other places where we run delayed iputs that are much
      safer, let those do the work.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dd472956
    • J
      btrfs: fix insert_reserved error handling · 186b5248
      Josef Bacik 提交于
      commit 80ee54bfe8a3850015585ebc84e8d207fcae6831 upstream.
      
      We were not handling the reserved byte accounting properly for data
      references.  Metadata was fine, if it errored out the error paths would
      free the bytes_reserved count and pin the extent, but it even missed one
      of the error cases.  So instead move this handling up into
      run_one_delayed_ref so we are sure that both cases are properly cleaned
      up in case of a transaction abort.
      
      CC: stable@vger.kernel.org # 4.18+
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      186b5248
    • J
      btrfs: only free reserved extent if we didn't insert it · 5a1e9bf4
      Josef Bacik 提交于
      commit 49940bdd57779c78462da7aa5a8650b2fea8c2ff upstream.
      
      When we insert the file extent once the ordered extent completes we free
      the reserved extent reservation as it'll have been migrated to the
      bytes_used counter.  However if we error out after this step we'll still
      clear the reserved extent reservation, resulting in a negative
      accounting of the reserved bytes for the block group and space info.
      Fix this by only doing the free if we didn't successfully insert a file
      extent for this extent.
      
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5a1e9bf4
    • J
      btrfs: don't use ctl->free_space for max_extent_size · a746cfd0
      Josef Bacik 提交于
      commit fb5c39d7a887108087de6ff93d3f326b01b4ef41 upstream.
      
      max_extent_size is supposed to be the largest contiguous range for the
      space info, and ctl->free_space is the total free space in the block
      group.  We need to keep track of these separately and _only_ use the
      max_free_space if we don't have a max_extent_size, as that means our
      original request was too large to search any of the block groups for and
      therefore wouldn't have a max_extent_size set.
      
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a746cfd0
    • J
      btrfs: set max_extent_size properly · 4a351e75
      Josef Bacik 提交于
      commit ad22cf6ea47fa20fbe11ac324a0a15c0a9a4a2a9 upstream.
      
      We can't use entry->bytes if our entry is a bitmap entry, we need to use
      entry->max_extent_size in that case.  Fix up all the logic to make this
      consistent.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4a351e75
    • J
      btrfs: reset max_extent_size properly · e982beca
      Josef Bacik 提交于
      commit 21a94f7a upstream.
      
      If we use up our block group before allocating a new one we'll easily
      get a max_extent_size that's set really really low, which will result in
      a lot of fragmentation.  We need to make sure we're resetting the
      max_extent_size when we add a new chunk or add new space.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e982beca
    • F
      Btrfs: fix deadlock when writing out free space caches · ea9c846f
      Filipe Manana 提交于
      commit 5ce555578e0919237fa4bda92b4670e2dd176f85 upstream.
      
      When writing out a block group free space cache we can end deadlocking
      with ourselves on an extent buffer lock resulting in a warning like the
      following:
      
        [245043.379979] WARNING: CPU: 4 PID: 2608 at fs/btrfs/locking.c:251 btrfs_tree_lock+0x1be/0x1d0 [btrfs]
        [245043.392792] CPU: 4 PID: 2608 Comm: btrfs-transacti Tainted: G
          W I      4.16.8 #1
        [245043.395489] RIP: 0010:btrfs_tree_lock+0x1be/0x1d0 [btrfs]
        [245043.396791] RSP: 0018:ffffc9000424b840 EFLAGS: 00010246
        [245043.398093] RAX: 0000000000000a30 RBX: ffff8807e20a3d20 RCX: 0000000000000001
        [245043.399414] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff8807e20a3d20
        [245043.400732] RBP: 0000000000000001 R08: ffff88041f39a700 R09: ffff880000000000
        [245043.402021] R10: 0000000000000040 R11: ffff8807e20a3d20 R12: ffff8807cb220630
        [245043.403296] R13: 0000000000000001 R14: ffff8807cb220628 R15: ffff88041fbdf000
        [245043.404780] FS:  0000000000000000(0000) GS:ffff88082fc80000(0000) knlGS:0000000000000000
        [245043.406050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [245043.407321] CR2: 00007fffdbdb9f10 CR3: 0000000001c09005 CR4: 00000000000206e0
        [245043.408670] Call Trace:
        [245043.409977]  btrfs_search_slot+0x761/0xa60 [btrfs]
        [245043.411278]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
        [245043.412572]  btrfs_insert_item+0x5b/0xc0 [btrfs]
        [245043.413922]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
        [245043.415216]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
        [245043.416487]  find_free_extent+0xcd0/0xf60 [btrfs]
        [245043.417813]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
        [245043.419105]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
        [245043.420378]  __btrfs_cow_block+0x127/0x550 [btrfs]
        [245043.421652]  btrfs_cow_block+0xee/0x190 [btrfs]
        [245043.422979]  btrfs_search_slot+0x227/0xa60 [btrfs]
        [245043.424279]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
        [245043.425538]  ? iput+0x72/0x1e0
        [245043.426798]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
        [245043.428131]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
        [245043.429419]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
        [245043.430712]  ? start_transaction+0x8e/0x410 [btrfs]
        [245043.432006]  transaction_kthread+0x184/0x1a0 [btrfs]
        [245043.433341]  kthread+0xf0/0x130
        [245043.434628]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
        [245043.435928]  ? kthread_create_worker_on_cpu+0x40/0x40
        [245043.437236]  ret_from_fork+0x1f/0x30
        [245043.441054] ---[ end trace 15abaa2aaf36827f ]---
      
      This is because at write_one_cache_group() when we are COWing a leaf from
      the extent tree we end up allocating a new block group (chunk) and,
      because we have hit a threshold on the number of bytes reserved for system
      chunks, we attempt to finalize the creation of new block groups from the
      current transaction, by calling btrfs_create_pending_block_groups().
      However here we also need to modify the extent tree in order to insert
      a block group item, and if the location for this new block group item
      happens to be in the same leaf that we were COWing earlier, we deadlock
      since btrfs_search_slot() tries to write lock the extent buffer that we
      locked before at write_one_cache_group().
      
      We have already hit similar cases in the past and commit d9a0540a
      ("Btrfs: fix deadlock when finalizing block group creation") fixed some
      of those cases by delaying the creation of pending block groups at the
      known specific spots that could lead to a deadlock. This change reworks
      that commit to be more generic so that we don't have to add similar logic
      to every possible path that can lead to a deadlock. This is done by
      making __btrfs_cow_block() disallowing the creation of new block groups
      (setting the transaction's can_flush_pending_bgs to false) before it
      attempts to allocate a new extent buffer for either the extent, chunk or
      device trees, since those are the trees that pending block creation
      modifies. Once the new extent buffer is allocated, it allows creation of
      pending block groups to happen again.
      
      This change depends on a recent patch from Josef which is not yet in
      Linus' tree, named "btrfs: make sure we create all new block groups" in
      order to avoid occasional warnings at btrfs_trans_release_chunk_metadata().
      
      Fixes: d9a0540a ("Btrfs: fix deadlock when finalizing block group creation")
      CC: stable@vger.kernel.org # 4.4+
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199753
      Link: https://lore.kernel.org/linux-btrfs/CAJtFHUTHna09ST-_EEiyWmDH6gAqS6wa=zMNMBsifj8ABu99cw@mail.gmail.com/Reported-by: NE V <eliventer@gmail.com>
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea9c846f
    • F
      Btrfs: fix assertion on fsync of regular file when using no-holes feature · e17af96e
      Filipe Manana 提交于
      commit 7ed586d0 upstream.
      
      When using the NO_HOLES feature and logging a regular file, we were
      expecting that if we find an inline extent, that either its size in RAM
      (uncompressed and unenconded) matches the size of the file or if it does
      not, that it matches the sector size and it represents compressed data.
      This assertion does not cover a case where the length of the inline extent
      is smaller than the sector size and also smaller the file's size, such
      case is possible through fallocate. Example:
      
        $ mkfs.btrfs -f -O no-holes /dev/sdb
        $ mount /dev/sdb /mnt
      
        $ xfs_io -f -c "pwrite -S 0xb60 0 21" /mnt/foobar
        $ xfs_io -c "falloc 40 40" /mnt/foobar
        $ xfs_io -c "fsync" /mnt/foobar
      
      In the above example we trigger the assertion because the inline extent's
      length is 21 bytes while the file size is 80 bytes. The fallocate() call
      merely updated the file's size and did not touch the existing inline
      extent, as expected.
      
      So fix this by adjusting the assertion so that an inline extent length
      smaller than the file size is valid if the file size is smaller than the
      filesystem's sector size.
      
      A test case for fstests follows soon.
      Reported-by: NAnatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Fixes: a89ca6f2 ("Btrfs: fix fsync after truncate when no_holes feature is enabled")
      CC: stable@vger.kernel.org # 4.14+
      Link: https://lore.kernel.org/linux-btrfs/CAE5jQCfRSBC7n4pUTFJcmHh109=gwyT9mFkCOL+NKfzswmR=_Q@mail.gmail.com/Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e17af96e
    • F
      Btrfs: fix null pointer dereference on compressed write path error · 85c5f244
      Filipe Manana 提交于
      commit 3527a018c00e5dbada2f9d7ed5576437b6dd5cfb upstream.
      
      At inode.c:compress_file_range(), under the "free_pages_out" label, we can
      end up dereferencing the "pages" pointer when it has a NULL value. This
      case happens when "start" has a value of 0 and we fail to allocate memory
      for the "pages" pointer. When that happens we jump to the "cont" label and
      then enter the "if (start == 0)" branch where we immediately call the
      cow_file_range_inline() function. If that function returns 0 (success
      creating an inline extent) or an error (like -ENOMEM for example) we jump
      to the "free_pages_out" label and then access "pages[i]" leading to a NULL
      pointer dereference, since "nr_pages" has a value greater than zero at
      that point.
      
      Fix this by setting "nr_pages" to 0 when we fail to allocate memory for
      the "pages" pointer.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201119
      Fixes: 771ed689 ("Btrfs: Optimize compressed writeback and reads")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85c5f244
    • Q
      btrfs: qgroup: Dirty all qgroups before rescan · 8181a8f8
      Qu Wenruo 提交于
      commit 9c7b0c2e8dbfbcd80a71e2cbfe02704f26c185c6 upstream.
      
      [BUG]
      In the following case, rescan won't zero out the number of qgroup 1/0:
      
        $ mkfs.btrfs -fq $DEV
        $ mount $DEV /mnt
      
        $ btrfs quota enable /mnt
        $ btrfs qgroup create 1/0 /mnt
        $ btrfs sub create /mnt/sub
        $ btrfs qgroup assign 0/257 1/0 /mnt
      
        $ dd if=/dev/urandom of=/mnt/sub/file bs=1k count=1000
        $ btrfs sub snap /mnt/sub /mnt/snap
        $ btrfs quota rescan -w /mnt
        $ btrfs qgroup show -pcre /mnt
        qgroupid         rfer         excl     max_rfer     max_excl parent  child
        --------         ----         ----     --------     -------- ------  -----
        0/5          16.00KiB     16.00KiB         none         none ---     ---
        0/257      1016.00KiB     16.00KiB         none         none 1/0     ---
        0/258      1016.00KiB     16.00KiB         none         none ---     ---
        1/0        1016.00KiB     16.00KiB         none         none ---     0/257
      
      So far so good, but:
      
        $ btrfs qgroup remove 0/257 1/0 /mnt
        WARNING: quotas may be inconsistent, rescan needed
        $ btrfs quota rescan -w /mnt
        $ btrfs qgroup show -pcre  /mnt
        qgoupid         rfer         excl     max_rfer     max_excl parent  child
        --------         ----         ----     --------     -------- ------  -----
        0/5          16.00KiB     16.00KiB         none         none ---     ---
        0/257      1016.00KiB     16.00KiB         none         none ---     ---
        0/258      1016.00KiB     16.00KiB         none         none ---     ---
        1/0        1016.00KiB     16.00KiB         none         none ---     ---
      	     ^^^^^^^^^^     ^^^^^^^^ not cleared
      
      [CAUSE]
      Before rescan we call qgroup_rescan_zero_tracking() to zero out all
      qgroups' accounting numbers.
      
      However we don't mark all qgroups dirty, but rely on rescan to do so.
      
      If we have any high level qgroup without children, it won't be marked
      dirty during rescan, since we cannot reach that qgroup.
      
      This will cause QGROUP_INFO items of childless qgroups never get updated
      in the quota tree, thus their numbers will stay the same in "btrfs
      qgroup show" output.
      
      [FIX]
      Just mark all qgroups dirty in qgroup_rescan_zero_tracking(), so even if
      we have childless qgroups, their QGROUP_INFO items will still get
      updated during rescan.
      Reported-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Reviewed-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Tested-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8181a8f8