1. 21 11月, 2018 15 次提交
    • A
      drm/msm/gpu: fix parameters in function msm_gpu_crashstate_capture · 9565e7c0
      Anders Roxell 提交于
      [ Upstream commit 6969019f65b43afb6da6a26f1d9e55bbdfeebcd5 ]
      
      When CONFIG_DEV_COREDUMP isn't defined msm_gpu_crashstate_capture
      doesn't pass the correct parameters.
      drivers/gpu/drm/msm/msm_gpu.c: In function ‘recover_worker’:
      drivers/gpu/drm/msm/msm_gpu.c:479:34: error: passing argument 2 of ‘msm_gpu_crashstate_capture’ from incompatible pointer type [-Werror=incompatible-pointer-types]
        msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
                                        ^~~~~~
      drivers/gpu/drm/msm/msm_gpu.c:388:13: note: expected ‘char *’ but argument is of type ‘struct msm_gem_submit *’
       static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, char *comm,
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/gpu/drm/msm/msm_gpu.c:479:2: error: too many arguments to function ‘msm_gpu_crashstate_capture’
        msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/gpu/drm/msm/msm_gpu.c:388:13: note: declared here
       static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, char *comm,
      
      In current code the function msm_gpu_crashstate_capture parameters.
      
      Fixes: cdb95931 ("drm/msm/gpu: Add the buffer objects from the submit to the crash dump")
      Signed-off-by: NAnders Roxell <anders.roxell@linaro.org>
      Reviewed-By: NJordan Crouse <jcrouse@codeaurora.org>
      Signed-off-by: NRob Clark <robdclark@gmail.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9565e7c0
    • D
      powerpc/nohash: fix undefined behaviour when testing page size support · b0f85991
      Daniel Axtens 提交于
      [ Upstream commit f5e284803a7206d43e26f9ffcae5de9626d95e37 ]
      
      When enumerating page size definitions to check hardware support,
      we construct a constant which is (1U << (def->shift - 10)).
      
      However, the array of page size definitions is only initalised for
      various MMU_PAGE_* constants, so it contains a number of 0-initialised
      elements with def->shift == 0. This means we end up shifting by a
      very large number, which gives the following UBSan splat:
      
      ================================================================================
      UBSAN: Undefined behaviour in /home/dja/dev/linux/linux/arch/powerpc/mm/tlb_nohash.c:506:21
      shift exponent 4294967286 is too large for 32-bit type 'unsigned int'
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-00045-ga604f927b012-dirty #6
      Call Trace:
      [c00000000101bc20] [c000000000a13d54] .dump_stack+0xa8/0xec (unreliable)
      [c00000000101bcb0] [c0000000004f20a8] .ubsan_epilogue+0x18/0x64
      [c00000000101bd30] [c0000000004f2b10] .__ubsan_handle_shift_out_of_bounds+0x110/0x1a4
      [c00000000101be20] [c000000000d21760] .early_init_mmu+0x1b4/0x5a0
      [c00000000101bf10] [c000000000d1ba28] .early_setup+0x100/0x130
      [c00000000101bf90] [c000000000000528] start_here_multiplatform+0x68/0x80
      ================================================================================
      
      Fix this by first checking if the element exists (shift != 0) before
      constructing the constant.
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b0f85991
    • F
      ARM: imx_v6_v7_defconfig: Select CONFIG_TMPFS_POSIX_ACL · 270cbdee
      Fabio Estevam 提交于
      [ Upstream commit 35d3cbe84544da74e39e1cec01374092467e3119 ]
      
      Andreas Müller reports:
      
      "Fixes:
      
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[220]: Failed to apply ACL on /dev/v4l-subdev0: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[224]: Failed to apply ACL on /dev/v4l-subdev1: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[215]: Failed to apply ACL on /dev/v4l-subdev10: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[228]: Failed to apply ACL on /dev/v4l-subdev2: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[232]: Failed to apply ACL on /dev/v4l-subdev5: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[217]: Failed to apply ACL on /dev/v4l-subdev11: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[214]: Failed to apply ACL on /dev/dri/card1: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[216]: Failed to apply ACL on /dev/v4l-subdev8: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[226]: Failed to apply ACL on /dev/v4l-subdev9: Operation not supported
      
      and nasty follow-ups: Starting weston from sddm as unpriviledged user fails
      with some hints on missing access rights."
      
      Select the CONFIG_TMPFS_POSIX_ACL option to fix these issues.
      Reported-by: NAndreas Müller <schnitzeltony@gmail.com>
      Signed-off-by: NFabio Estevam <fabio.estevam@nxp.com>
      Acked-by: NOtavio Salvador <otavio@ossystems.com.br>
      Signed-off-by: NShawn Guo <shawnguo@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      270cbdee
    • C
      drm/amdgpu/powerplay: fix missing break in switch statements · 17a5d018
      Colin Ian King 提交于
      [ Upstream commit 14b284832e7dea6f54f0adfd7bed105548b94e57 ]
      
      There are several switch statements that are missing break statements.
      Add missing breaks to handle any fall-throughs corner cases.
      
      Detected by CoverityScan, CID#1457175 ("Missing break in switch")
      
      Fixes: 18aafc59 ("drm/amd/powerplay: implement fw related smu interface for iceland.")
      Acked-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      17a5d018
    • G
      drm/nouveau/secboot/acr: fix memory leak · 05dff1e2
      Gustavo A. R. Silva 提交于
      [ Upstream commit 74a07c0a59fa372b069d879971ba4d9e341979cf ]
      
      In case memory resources for *bl_desc* were allocated, release
      them before return.
      
      Addresses-Coverity-ID: 1472021 ("Resource leak")
      Fixes: 0d466901 ("drm/nouveau/secboot/acr: Remove VLA usage")
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05dff1e2
    • M
      tracing/kprobes: Check the probe on unloaded module correctly · 784c2eb3
      Masami Hiramatsu 提交于
      [ Upstream commit 59158ec4aef7d44be51a6f3e7e17fc64c32604eb ]
      
      Current kprobe event doesn't checks correctly whether the
      given event is on unloaded module or not. It just checks
      the event has ":" in the name.
      
      That is not enough because if we define a probe on non-exist
      symbol on loaded module, it allows to define that (with
      warning message)
      
      To ensure it correctly, this searches the module name on
      loaded module list and only if there is not, it allows to
      define it. (this event will be available when the target
      module is loaded)
      
      Link: http://lkml.kernel.org/r/153547309528.26502.8300278470528281328.stgit@devboxSigned-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      784c2eb3
    • M
      tty: check name length in tty_find_polling_driver() · 7ecd146b
      Miles Chen 提交于
      [ Upstream commit 33a1a7be198657c8ca26ad406c4d2a89b7162bcc ]
      
      The issue is found by a fuzzing test.
      If tty_find_polling_driver() recevies an incorrect input such as
      ',,' or '0b', the len becomes 0 and strncmp() always return 0.
      In this case, a null p->ops->poll_init() is called and it causes a kernel
      panic.
      
      Fix this by checking name length against zero in tty_find_polling_driver().
      
      $echo ,, > /sys/module/kgdboc/parameters/kgdboc
      [   20.804451] WARNING: CPU: 1 PID: 104 at drivers/tty/serial/serial_core.c:457
      uart_get_baud_rate+0xe8/0x190
      [   20.804917] Modules linked in:
      [   20.805317] CPU: 1 PID: 104 Comm: sh Not tainted 4.19.0-rc7ajb #8
      [   20.805469] Hardware name: linux,dummy-virt (DT)
      [   20.805732] pstate: 20000005 (nzCv daif -PAN -UAO)
      [   20.805895] pc : uart_get_baud_rate+0xe8/0x190
      [   20.806042] lr : uart_get_baud_rate+0xc0/0x190
      [   20.806476] sp : ffffffc06acff940
      [   20.806676] x29: ffffffc06acff940 x28: 0000000000002580
      [   20.806977] x27: 0000000000009600 x26: 0000000000009600
      [   20.807231] x25: ffffffc06acffad0 x24: 00000000ffffeff0
      [   20.807576] x23: 0000000000000001 x22: 0000000000000000
      [   20.807807] x21: 0000000000000001 x20: 0000000000000000
      [   20.808049] x19: ffffffc06acffac8 x18: 0000000000000000
      [   20.808277] x17: 0000000000000000 x16: 0000000000000000
      [   20.808520] x15: ffffffffffffffff x14: ffffffff00000000
      [   20.808757] x13: ffffffffffffffff x12: 0000000000000001
      [   20.809011] x11: 0101010101010101 x10: ffffff880d59ff5f
      [   20.809292] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
      [   20.809549] x7 : 0000000000000000 x6 : ffffff880d59ff5f
      [   20.809803] x5 : 0000000080008001 x4 : 0000000000000003
      [   20.810056] x3 : ffffff900853e6b4 x2 : dfffff9000000000
      [   20.810693] x1 : ffffffc06acffad0 x0 : 0000000000000cb0
      [   20.811005] Call trace:
      [   20.811214]  uart_get_baud_rate+0xe8/0x190
      [   20.811479]  serial8250_do_set_termios+0xe0/0x6f4
      [   20.811719]  serial8250_set_termios+0x48/0x54
      [   20.811928]  uart_set_options+0x138/0x1bc
      [   20.812129]  uart_poll_init+0x114/0x16c
      [   20.812330]  tty_find_polling_driver+0x158/0x200
      [   20.812545]  configure_kgdboc+0xbc/0x1bc
      [   20.812745]  param_set_kgdboc_var+0xb8/0x150
      [   20.812960]  param_attr_store+0xbc/0x150
      [   20.813160]  module_attr_store+0x40/0x58
      [   20.813364]  sysfs_kf_write+0x8c/0xa8
      [   20.813563]  kernfs_fop_write+0x154/0x290
      [   20.813764]  vfs_write+0xf0/0x278
      [   20.813951]  __arm64_sys_write+0x84/0xf4
      [   20.814400]  el0_svc_common+0xf4/0x1dc
      [   20.814616]  el0_svc_handler+0x98/0xbc
      [   20.814804]  el0_svc+0x8/0xc
      [   20.822005] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [   20.826913] Mem abort info:
      [   20.827103]   ESR = 0x84000006
      [   20.827352]   Exception class = IABT (current EL), IL = 16 bits
      [   20.827655]   SET = 0, FnV = 0
      [   20.827855]   EA = 0, S1PTW = 0
      [   20.828135] user pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
      [   20.828484] [0000000000000000] pgd=00000000aadee003, pud=00000000aadee003, pmd=0000000000000000
      [   20.829195] Internal error: Oops: 84000006 [#1] SMP
      [   20.829564] Modules linked in:
      [   20.829890] CPU: 1 PID: 104 Comm: sh Tainted: G        W         4.19.0-rc7ajb #8
      [   20.830545] Hardware name: linux,dummy-virt (DT)
      [   20.830829] pstate: 60000085 (nZCv daIf -PAN -UAO)
      [   20.831174] pc :           (null)
      [   20.831457] lr : serial8250_do_set_termios+0x358/0x6f4
      [   20.831727] sp : ffffffc06acff9b0
      [   20.831936] x29: ffffffc06acff9b0 x28: ffffff9008d7c000
      [   20.832267] x27: ffffff900969e16f x26: 0000000000000000
      [   20.832589] x25: ffffff900969dfb0 x24: 0000000000000000
      [   20.832906] x23: ffffffc06acffad0 x22: ffffff900969e160
      [   20.833232] x21: 0000000000000000 x20: ffffffc06acffac8
      [   20.833559] x19: ffffff900969df90 x18: 0000000000000000
      [   20.833878] x17: 0000000000000000 x16: 0000000000000000
      [   20.834491] x15: ffffffffffffffff x14: ffffffff00000000
      [   20.834821] x13: ffffffffffffffff x12: 0000000000000001
      [   20.835143] x11: 0101010101010101 x10: ffffff880d59ff5f
      [   20.835467] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
      [   20.835790] x7 : 0000000000000000 x6 : ffffff880d59ff5f
      [   20.836111] x5 : c06419717c314100 x4 : 0000000000000007
      [   20.836419] x3 : 0000000000000000 x2 : 0000000000000000
      [   20.836732] x1 : 0000000000000001 x0 : ffffff900969df90
      [   20.837100] Process sh (pid: 104, stack limit = 0x(____ptrval____))
      [   20.837396] Call trace:
      [   20.837566]            (null)
      [   20.837816]  serial8250_set_termios+0x48/0x54
      [   20.838089]  uart_set_options+0x138/0x1bc
      [   20.838570]  uart_poll_init+0x114/0x16c
      [   20.838834]  tty_find_polling_driver+0x158/0x200
      [   20.839119]  configure_kgdboc+0xbc/0x1bc
      [   20.839380]  param_set_kgdboc_var+0xb8/0x150
      [   20.839658]  param_attr_store+0xbc/0x150
      [   20.839920]  module_attr_store+0x40/0x58
      [   20.840183]  sysfs_kf_write+0x8c/0xa8
      [   20.840183]  sysfs_kf_write+0x8c/0xa8
      [   20.840440]  kernfs_fop_write+0x154/0x290
      [   20.840702]  vfs_write+0xf0/0x278
      [   20.840942]  __arm64_sys_write+0x84/0xf4
      [   20.841209]  el0_svc_common+0xf4/0x1dc
      [   20.841471]  el0_svc_handler+0x98/0xbc
      [   20.841713]  el0_svc+0x8/0xc
      [   20.842057] Code: bad PC value
      [   20.842764] ---[ end trace a8835d7de79aaadf ]---
      [   20.843134] Kernel panic - not syncing: Fatal exception
      [   20.843515] SMP: stopping secondary CPUs
      [   20.844289] Kernel Offset: disabled
      [   20.844634] CPU features: 0x0,21806002
      [   20.844857] Memory Limit: none
      [   20.845172] ---[ end Kernel panic - not syncing: Fatal exception ]---
      Signed-off-by: NMiles Chen <miles.chen@mediatek.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ecd146b
    • S
      powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() · 234bee8b
      Sam Bobroff 提交于
      [ Upstream commit f9bc28aedfb5bbd572d2d365f3095c1becd7209b ]
      
      If an error occurs during an unplug operation, it's possible for
      eeh_dump_dev_log() to be called when edev->pdn is null, which
      currently leads to dereferencing a null pointer.
      
      Handle this by skipping the error log for those devices.
      Signed-off-by: NSam Bobroff <sbobroff@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      234bee8b
    • J
      powerpc/Makefile: Fix PPC_BOOK3S_64 ASFLAGS · 32f2674c
      Joel Stanley 提交于
      [ Upstream commit 960e30029863db95ec79a71009272d4661db5991 ]
      
      Ever since commit 15a3204d ("powerpc/64s: Set assembler machine type
      to POWER4") we force -mpower4 to be passed to the assembler
      irrespective of the CFLAGS used (for Book3s 64).
      
      When building a powerpc64 kernel with clang, clang will not add -many
      to the assembler flags, so any instructions that the compiler has
      generated that are not available on power4 will cause an error:
      
        /usr/bin/as -a64 -mppc64 -mlittle-endian -mpower8 \
         -I ./arch/powerpc/include -I ./arch/powerpc/include/generated \
         -I ./include -I ./arch/powerpc/include/uapi \
         -I ./arch/powerpc/include/generated/uapi -I ./include/uapi \
         -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc \
         -maltivec -mpower4 -o init/do_mounts.o /tmp/do_mounts-3b0a3d.s
        /tmp/do_mounts-51ce54.s:748: Error: unrecognized opcode: `isel'
      
      GCC does include -many, so the GCC driven gas call will succeed:
      
        as -v -I ./arch/powerpc/include -I ./arch/powerpc/include/generated -I
        ./include -I ./arch/powerpc/include/uapi
        -I ./arch/powerpc/include/generated/uapi -I ./include/uapi
        -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc
         -a64 -mpower8 -many -mlittle -maltivec -mpower4 -o init/do_mounts.o
      
      Note that isel is power7 and above for IBM CPUs. GCC only generates it
      for Power9 and above, but the above test was run against the clang
      generated assembly.
      
      Peter Bergner explains:
      
        When using -many -mpower4, gas will first try and find a matching
        power4 mnemonic and failing that, it will then allow any valid
        mnemonic that gas knows about. GCC's use of -many predates me
        though.
      
        IIRC, Alan looked at trying to remove it, but I forget why he
        didn't. Could be either a gcc or gas issue at the time. I'm not sure
        whether issue still exists or not. He and I have modified how gas
        works internally a fair amount since he tried removing gcc use of
        -many.
      
        I will also note that when using -many, gas will choose the first
        mnemonic that matches in the mnemonic table and we have (mostly)
        sorted the table so that server mnemonics show up earlier in the
        table than other mnemonics, so they'll be seen/chosen first.
      
      By explicitly setting -many we can build with Clang and GCC while
      retaining the -mpower4 option.
      Signed-off-by: NJoel Stanley <joel@jms.id.au>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      32f2674c
    • R
      Input: wm97xx-ts - fix exit path · d7dba42c
      Randy Dunlap 提交于
      [ Upstream commit a3f7c3fcf60868c1e90671df5d0cf9be5900a09b ]
      
      Loading then unloading wm97xx-ts.ko when CONFIG_AC97_BUS=m
      causes a WARNING: from drivers/base/driver.c:
      
      Unexpected driver unregister!
      WARNING: CPU: 0 PID: 1709 at ../drivers/base/driver.c:193 driver_unregister+0x30/0x40
      
      Fix this by only calling driver_unregister() with the same
      condition that driver_register() is called.
      
      Fixes: ae9d1b5f ("Input: wm97xx: add new AC97 bus support")
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7dba42c
    • S
      drm/amd/display: fix bug of accessing invalid memory · 139dde1e
      Su Sung Chung 提交于
      [ Upstream commit 43c3ff27a47d83d153c4adc088243ba594582bf5 ]
      
      [Why]
      A loop inside of build_evenly_distributed_points function that traverse through
      the array of points become an infinite loop when m_GammaUpdates does not
      get assigned to any value.
      
      [How]
      In DMColor, clear m_gammaIsValid bit just before writting all Zeromem for
      m_GammaUpdates, to prevent calling build_evenly_distributed_points
      before m_GammaUpdates gets assigned to some value.
      Signed-off-by: NSu Sung Chung <Su.Chung@amd.com>
      Reviewed-by: NAric Cyr <Aric.Cyr@amd.com>
      Acked-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
      Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      139dde1e
    • C
      powerpc/mm: fix always true/false warning in slice.c · 56e4367f
      Christophe Leroy 提交于
      [ Upstream commit 37e9c674e7e6f445e12cb1151017bd4bacdd1e2d ]
      
      This patch fixes the following warnings (obtained with make W=1).
      
      arch/powerpc/mm/slice.c: In function 'slice_range_to_mask':
      arch/powerpc/mm/slice.c:73:12: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (start < SLICE_LOW_TOP) {
                  ^
      arch/powerpc/mm/slice.c:81:20: error: comparison is always false due to limited range of data type [-Werror=type-limits]
        if ((start + len) > SLICE_LOW_TOP) {
                          ^
      arch/powerpc/mm/slice.c: In function 'slice_mask_for_free':
      arch/powerpc/mm/slice.c:136:17: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (high_limit <= SLICE_LOW_TOP)
                       ^
      arch/powerpc/mm/slice.c: In function 'slice_check_range_fits':
      arch/powerpc/mm/slice.c:185:12: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (start < SLICE_LOW_TOP) {
                  ^
      arch/powerpc/mm/slice.c:195:39: error: comparison is always false due to limited range of data type [-Werror=type-limits]
        if (SLICE_NUM_HIGH && ((start + len) > SLICE_LOW_TOP)) {
                                             ^
      arch/powerpc/mm/slice.c: In function 'slice_scan_available':
      arch/powerpc/mm/slice.c:306:11: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (addr < SLICE_LOW_TOP) {
                 ^
      arch/powerpc/mm/slice.c: In function 'get_slice_psize':
      arch/powerpc/mm/slice.c:709:11: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (addr < SLICE_LOW_TOP) {
                 ^
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56e4367f
    • M
      powerpc/mm: Fix page table dump to work on Radix · 287deb22
      Michael Ellerman 提交于
      [ Upstream commit 0d923962ab69c27cca664a2d535e90ef655110ca ]
      
      When we're running on Book3S with the Radix MMU enabled the page table
      dump currently prints the wrong addresses because it uses the wrong
      start address.
      
      Fix it to use PAGE_OFFSET rather than KERN_VIRT_START.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      287deb22
    • N
      powerpc/64/module: REL32 relocation range check · 63880b2f
      Nicholas Piggin 提交于
      [ Upstream commit b851ba02a6f3075f0f99c60c4bc30a4af80cf428 ]
      
      The recent module relocation overflow crash demonstrated that we
      have no range checking on REL32 relative relocations. This patch
      implements a basic check, the same kernel that previously oopsed
      and rebooted now continues with some of these errors when loading
      the module:
      
        module_64: x_tables: REL32 527703503449812 out of range!
      
      Possibly other relocations (ADDR32, REL16, TOC16, etc.) should also have
      overflow checks.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      63880b2f
    • C
      powerpc/traps: restore recoverability of machine_check interrupts · 54de3d7d
      Christophe Leroy 提交于
      [ Upstream commit daf00ae71dad8aa05965713c62558aeebf2df48e ]
      
      commit b96672dd ("powerpc: Machine check interrupt is a non-
      maskable interrupt") added a call to nmi_enter() at the beginning of
      machine check restart exception handler. Due to that, in_interrupt()
      always returns true regardless of the state before entering the
      exception, and die() panics even when the system was not already in
      interrupt.
      
      This patch calls nmi_exit() before calling die() in order to restore
      the interrupt state we had before calling nmi_enter()
      
      Fixes: b96672dd ("powerpc: Machine check interrupt is a non-maskable interrupt")
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54de3d7d
  2. 14 11月, 2018 25 次提交
    • G
      Linux 4.19.2 · 7950eb31
      Greg Kroah-Hartman 提交于
      7950eb31
    • S
      MD: fix invalid stored role for a disk - try2 · 589c3750
      Shaohua Li 提交于
      commit 9e753ba9b9b405e3902d9f08aec5f2ea58a0c317 upstream.
      
      Commit d595567dc4f0 (MD: fix invalid stored role for a disk) broke linear
      hotadd. Let's only fix the role for disks in raid1/10.
      Based on Guoqing's original patch.
      Reported-by: Nkernel test robot <rong.a.chen@intel.com>
      Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
      Cc: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      589c3750
    • T
      vga_switcheroo: Fix missing gpu_bound call at audio client registration · fcd90d7d
      Takashi Iwai 提交于
      commit fc09ab7a767394f9ecdad84ea6e85d68b83c8e21 upstream.
      
      The commit 37a3a98e ("ALSA: hda - Enable runtime PM only for
      discrete GPU") added a new ops gpu_bound to be called when GPU gets
      bound.  The patch overlooked, however, that vga_switcheroo_enable() is
      called only once at GPU is bound.  When an audio client is registered
      after that point, it would miss the gpu_bound call.  This leads to the
      unexpected lack of runtime PM in HD-audio side.
      
      For addressing that regression, just call gpu_bound callback manually
      at vga_switcheroo_register_audio_client() when the GPU was already
      bound.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201615
      Fixes: 37a3a98e ("ALSA: hda - Enable runtime PM only for discrete GPU")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: NLukas Wunner <lukas@wunner.de>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fcd90d7d
    • D
      bpf: wait for running BPF programs when updating map-in-map · 34d96152
      Daniel Colascione 提交于
      commit 1ae80cf3 upstream.
      
      The map-in-map frequently serves as a mechanism for atomic
      snapshotting of state that a BPF program might record.  The current
      implementation is dangerous to use in this way, however, since
      userspace has no way of knowing when all programs that might have
      retrieved the "old" value of the map may have completed.
      
      This change ensures that map update operations on map-in-map map types
      always wait for all references to the old map to drop before returning
      to userspace.
      Signed-off-by: NDaniel Colascione <dancol@google.com>
      Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NChenbo Feng <fengc@google.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34d96152
    • J
      userns: also map extents in the reverse map to kernel IDs · 9a7a80fb
      Jann Horn 提交于
      commit d2f007dbe7e4c9583eea6eb04d60001e85c6f1bd upstream.
      
      The current logic first clones the extent array and sorts both copies, then
      maps the lower IDs of the forward mapping into the lower namespace, but
      doesn't map the lower IDs of the reverse mapping.
      
      This means that code in a nested user namespace with >5 extents will see
      incorrect IDs. It also breaks some access checks, like
      inode_owner_or_capable() and privileged_wrt_inode_uidgid(), so a process
      can incorrectly appear to be capable relative to an inode.
      
      To fix it, we have to make sure that the "lower_first" members of extents
      in both arrays are translated; and we have to make sure that the reverse
      map is sorted *after* the translation (since otherwise the translation can
      break the sorting).
      
      This is CVE-2018-18955.
      
      Fixes: 6397fac4 ("userns: bump idmap limits to 340")
      Cc: stable@vger.kernel.org
      Signed-off-by: NJann Horn <jannh@google.com>
      Tested-by: NEric W. Biederman <ebiederm@xmission.com>
      Reviewed-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9a7a80fb
    • M
      vt: fix broken display when running aptitude · 13794b24
      Mikulas Patocka 提交于
      commit 943210ba807ec50aafa2fa7b13bd6d36a478969b upstream.
      
      If you run aptitude on framebuffer console, the display is corrupted. The
      corruption is caused by the commit d8ae7242. The patch adds "offset" to
      "start" when calling scr_memsetw, but it forgets to do the same addition
      on a subsequent call to do_update_region.
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Fixes: d8ae7242 ("vt: preserve unicode values corresponding to screen characters")
      Reviewed-by: NNicolas Pitre <nico@linaro.org>
      Cc: stable@vger.kernel.org	# 4.19
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      13794b24
    • D
      net: sched: Remove TCA_OPTIONS from policy · ab5d01b6
      David Ahern 提交于
      commit e72bde6b66299602087c8c2350d36a525e75d06e upstream.
      
      Marco reported an error with hfsc:
      root@Calimero:~# tc qdisc add dev eth0 root handle 1:0 hfsc default 1
      Error: Attribute failed policy validation.
      
      Apparently a few implementations pass TCA_OPTIONS as a binary instead
      of nested attribute, so drop TCA_OPTIONS from the policy.
      
      Fixes: 8b4c3cdd ("net: sched: Add policy validation for tc attributes")
      Reported-by: NMarco Berizzi <pupilla@libero.it>
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab5d01b6
    • F
      Btrfs: fix use-after-free when dumping free space · 0c286e9d
      Filipe Manana 提交于
      commit 9084cb6a24bf5838a665af92ded1af8363f9e563 upstream.
      
      We were iterating a block group's free space cache rbtree without locking
      first the lock that protects it (the free_space_ctl->free_space_offset
      rbtree is protected by the free_space_ctl->tree_lock spinlock).
      
      KASAN reported an use-after-free problem when iterating such a rbtree due
      to a concurrent rbtree delete:
      
      [ 9520.359168] ==================================================================
      [ 9520.359656] BUG: KASAN: use-after-free in rb_next+0x13/0x90
      [ 9520.359949] Read of size 8 at addr ffff8800b7ada500 by task btrfs-transacti/1721
      [ 9520.360357]
      [ 9520.360530] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G             L    4.19.0-rc8-nbor #555
      [ 9520.360990] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [ 9520.362682] Call Trace:
      [ 9520.362887]  dump_stack+0xa4/0xf5
      [ 9520.363146]  print_address_description+0x78/0x280
      [ 9520.363412]  kasan_report+0x263/0x390
      [ 9520.363650]  ? rb_next+0x13/0x90
      [ 9520.363873]  __asan_load8+0x54/0x90
      [ 9520.364102]  rb_next+0x13/0x90
      [ 9520.364380]  btrfs_dump_free_space+0x146/0x160 [btrfs]
      [ 9520.364697]  dump_space_info+0x2cd/0x310 [btrfs]
      [ 9520.364997]  btrfs_reserve_extent+0x1ee/0x1f0 [btrfs]
      [ 9520.365310]  __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs]
      [ 9520.365646]  ? btrfs_update_time+0x180/0x180 [btrfs]
      [ 9520.365923]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.366204]  ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs]
      [ 9520.366549]  btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs]
      [ 9520.366880]  cache_save_setup+0x42e/0x580 [btrfs]
      [ 9520.367220]  ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs]
      [ 9520.367518]  ? lock_downgrade+0x2f0/0x2f0
      [ 9520.367799]  ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs]
      [ 9520.368104]  ? kasan_check_read+0x11/0x20
      [ 9520.368349]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.368638]  btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs]
      [ 9520.368978]  ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs]
      [ 9520.369282]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.369534]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.369811]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.370137]  commit_cowonly_roots+0x4b9/0x610 [btrfs]
      [ 9520.370560]  ? commit_fs_roots+0x350/0x350 [btrfs]
      [ 9520.370926]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.371285]  btrfs_commit_transaction+0x5e5/0x10e0 [btrfs]
      [ 9520.371612]  ? btrfs_apply_pending_changes+0x90/0x90 [btrfs]
      [ 9520.371943]  ? start_transaction+0x168/0x6c0 [btrfs]
      [ 9520.372257]  transaction_kthread+0x21c/0x240 [btrfs]
      [ 9520.372537]  kthread+0x1d2/0x1f0
      [ 9520.372793]  ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs]
      [ 9520.373090]  ? kthread_park+0xb0/0xb0
      [ 9520.373329]  ret_from_fork+0x3a/0x50
      [ 9520.373567]
      [ 9520.373738] Allocated by task 1804:
      [ 9520.373974]  kasan_kmalloc+0xff/0x180
      [ 9520.374208]  kasan_slab_alloc+0x11/0x20
      [ 9520.374447]  kmem_cache_alloc+0xfc/0x2d0
      [ 9520.374731]  __btrfs_add_free_space+0x40/0x580 [btrfs]
      [ 9520.375044]  unpin_extent_range+0x4f7/0x7a0 [btrfs]
      [ 9520.375383]  btrfs_finish_extent_commit+0x15f/0x4d0 [btrfs]
      [ 9520.375707]  btrfs_commit_transaction+0xb06/0x10e0 [btrfs]
      [ 9520.376027]  btrfs_alloc_data_chunk_ondemand+0x237/0x5c0 [btrfs]
      [ 9520.376365]  btrfs_check_data_free_space+0x81/0xd0 [btrfs]
      [ 9520.376689]  btrfs_delalloc_reserve_space+0x25/0x80 [btrfs]
      [ 9520.377018]  btrfs_direct_IO+0x42e/0x6d0 [btrfs]
      [ 9520.377284]  generic_file_direct_write+0x11e/0x220
      [ 9520.377587]  btrfs_file_write_iter+0x472/0xac0 [btrfs]
      [ 9520.377875]  aio_write+0x25c/0x360
      [ 9520.378106]  io_submit_one+0xaa0/0xdc0
      [ 9520.378343]  __se_sys_io_submit+0xfa/0x2f0
      [ 9520.378589]  __x64_sys_io_submit+0x43/0x50
      [ 9520.378840]  do_syscall_64+0x7d/0x240
      [ 9520.379081]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 9520.379387]
      [ 9520.379557] Freed by task 1802:
      [ 9520.379782]  __kasan_slab_free+0x173/0x260
      [ 9520.380028]  kasan_slab_free+0xe/0x10
      [ 9520.380262]  kmem_cache_free+0xc1/0x2c0
      [ 9520.380544]  btrfs_find_space_for_alloc+0x4cd/0x4e0 [btrfs]
      [ 9520.380866]  find_free_extent+0xa99/0x17e0 [btrfs]
      [ 9520.381166]  btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
      [ 9520.381474]  btrfs_get_blocks_direct+0x60b/0xbd0 [btrfs]
      [ 9520.381761]  __blockdev_direct_IO+0x10ee/0x58a1
      [ 9520.382059]  btrfs_direct_IO+0x25a/0x6d0 [btrfs]
      [ 9520.382321]  generic_file_direct_write+0x11e/0x220
      [ 9520.382623]  btrfs_file_write_iter+0x472/0xac0 [btrfs]
      [ 9520.382904]  aio_write+0x25c/0x360
      [ 9520.383172]  io_submit_one+0xaa0/0xdc0
      [ 9520.383416]  __se_sys_io_submit+0xfa/0x2f0
      [ 9520.383678]  __x64_sys_io_submit+0x43/0x50
      [ 9520.383927]  do_syscall_64+0x7d/0x240
      [ 9520.384165]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 9520.384439]
      [ 9520.384610] The buggy address belongs to the object at ffff8800b7ada500
                      which belongs to the cache btrfs_free_space of size 72
      [ 9520.385175] The buggy address is located 0 bytes inside of
                      72-byte region [ffff8800b7ada500, ffff8800b7ada548)
      [ 9520.385691] The buggy address belongs to the page:
      [ 9520.385957] page:ffffea0002deb680 count:1 mapcount:0 mapping:ffff880108a1d700 index:0x0 compound_mapcount: 0
      [ 9520.388030] flags: 0x8100(slab|head)
      [ 9520.388281] raw: 0000000000008100 ffffea0002deb608 ffffea0002728808 ffff880108a1d700
      [ 9520.388722] raw: 0000000000000000 0000000000130013 00000001ffffffff 0000000000000000
      [ 9520.389169] page dumped because: kasan: bad access detected
      [ 9520.389473]
      [ 9520.389658] Memory state around the buggy address:
      [ 9520.389943]  ffff8800b7ada400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.390368]  ffff8800b7ada480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.390796] >ffff8800b7ada500: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc
      [ 9520.391223]                    ^
      [ 9520.391461]  ffff8800b7ada580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.391885]  ffff8800b7ada600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.392313] ==================================================================
      [ 9520.392772] BTRFS critical (device vdc): entry offset 2258497536, bytes 131072, bitmap no
      [ 9520.393247] BUG: unable to handle kernel NULL pointer dereference at 0000000000000011
      [ 9520.393705] PGD 800000010dbab067 P4D 800000010dbab067 PUD 107551067 PMD 0
      [ 9520.394059] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [ 9520.394378] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G    B        L    4.19.0-rc8-nbor #555
      [ 9520.394858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [ 9520.395350] RIP: 0010:rb_next+0x3c/0x90
      [ 9520.396461] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292
      [ 9520.396762] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c
      [ 9520.397115] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011
      [ 9520.397468] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc
      [ 9520.397821] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000
      [ 9520.398188] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000
      [ 9520.398555] FS:  0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000
      [ 9520.399007] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9520.399335] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0
      [ 9520.399679] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9520.400023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9520.400400] Call Trace:
      [ 9520.400648]  btrfs_dump_free_space+0x146/0x160 [btrfs]
      [ 9520.400974]  dump_space_info+0x2cd/0x310 [btrfs]
      [ 9520.401287]  btrfs_reserve_extent+0x1ee/0x1f0 [btrfs]
      [ 9520.401609]  __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs]
      [ 9520.401952]  ? btrfs_update_time+0x180/0x180 [btrfs]
      [ 9520.402232]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.402522]  ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs]
      [ 9520.402882]  btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs]
      [ 9520.403261]  cache_save_setup+0x42e/0x580 [btrfs]
      [ 9520.403570]  ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs]
      [ 9520.403871]  ? lock_downgrade+0x2f0/0x2f0
      [ 9520.404161]  ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs]
      [ 9520.404481]  ? kasan_check_read+0x11/0x20
      [ 9520.404732]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.405026]  btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs]
      [ 9520.405375]  ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs]
      [ 9520.405694]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.405958]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.406243]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.406574]  commit_cowonly_roots+0x4b9/0x610 [btrfs]
      [ 9520.406899]  ? commit_fs_roots+0x350/0x350 [btrfs]
      [ 9520.407253]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.407589]  btrfs_commit_transaction+0x5e5/0x10e0 [btrfs]
      [ 9520.407925]  ? btrfs_apply_pending_changes+0x90/0x90 [btrfs]
      [ 9520.408262]  ? start_transaction+0x168/0x6c0 [btrfs]
      [ 9520.408582]  transaction_kthread+0x21c/0x240 [btrfs]
      [ 9520.408870]  kthread+0x1d2/0x1f0
      [ 9520.409138]  ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs]
      [ 9520.409440]  ? kthread_park+0xb0/0xb0
      [ 9520.409682]  ret_from_fork+0x3a/0x50
      [ 9520.410508] Dumping ftrace buffer:
      [ 9520.410764]    (ftrace buffer empty)
      [ 9520.411007] CR2: 0000000000000011
      [ 9520.411297] ---[ end trace 01a0863445cf360a ]---
      [ 9520.411568] RIP: 0010:rb_next+0x3c/0x90
      [ 9520.412644] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292
      [ 9520.412932] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c
      [ 9520.413274] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011
      [ 9520.413616] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc
      [ 9520.414007] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000
      [ 9520.414349] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000
      [ 9520.416074] FS:  0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000
      [ 9520.416536] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9520.416848] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0
      [ 9520.418477] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9520.418846] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9520.419204] Kernel panic - not syncing: Fatal exception
      [ 9520.419666] Dumping ftrace buffer:
      [ 9520.419930]    (ftrace buffer empty)
      [ 9520.420168] Kernel Offset: disabled
      [ 9520.420406] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Fix this by acquiring the respective lock before iterating the rbtree.
      Reported-by: NNikolay Borisov <nborisov@suse.com>
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c286e9d
    • F
      Btrfs: fix use-after-free during inode eviction · 5b7a4630
      Filipe Manana 提交于
      commit 421f0922a2cfb0c75acd9746454aaa576c711a65 upstream.
      
      At inode.c:evict_inode_truncate_pages(), when we iterate over the
      inode's extent states, we access an extent state record's "state" field
      after we unlocked the inode's io tree lock. This can lead to a
      use-after-free issue because after we unlock the io tree that extent
      state record might have been freed due to being merged into another
      adjacent extent state record (a previous inflight bio for a read
      operation finished in the meanwhile which unlocked a range in the io
      tree and cause a merge of extent state records, as explained in the
      comment before the while loop added in commit 6ca07097 ("Btrfs: fix
      hang during inode eviction due to concurrent readahead")).
      
      Fix this by keeping a copy of the extent state's flags in a local
      variable and using it after unlocking the io tree.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201189
      Fixes: b9d0b389 ("btrfs: Add handler for invalidate page")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b7a4630
    • J
      btrfs: move the dio_sem higher up the callchain · 51c62a33
      Josef Bacik 提交于
      commit c495144bc6962186feae31d687596d2472000e45 upstream.
      
      We're getting a lockdep splat because we take the dio_sem under the
      log_mutex.  What we really need is to protect fsync() from logging an
      extent map for an extent we never waited on higher up, so just guard the
      whole thing with dio_sem.
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.18.0-rc4-xfstests-00025-g5de5edbaf1d4 #411 Not tainted
      ------------------------------------------------------
      aio-dio-invalid/30928 is trying to acquire lock:
      0000000092621cfd (&mm->mmap_sem){++++}, at: get_user_pages_unlocked+0x5a/0x1e0
      
      but task is already holding lock:
      00000000cefe6b35 (&ei->dio_sem){++++}, at: btrfs_direct_IO+0x3be/0x400
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #5 (&ei->dio_sem){++++}:
             lock_acquire+0xbd/0x220
             down_write+0x51/0xb0
             btrfs_log_changed_extents+0x80/0xa40
             btrfs_log_inode+0xbaf/0x1000
             btrfs_log_inode_parent+0x26f/0xa80
             btrfs_log_dentry_safe+0x50/0x70
             btrfs_sync_file+0x357/0x540
             do_fsync+0x38/0x60
             __ia32_sys_fdatasync+0x12/0x20
             do_fast_syscall_32+0x9a/0x2f0
             entry_SYSENTER_compat+0x84/0x96
      
      -> #4 (&ei->log_mutex){+.+.}:
             lock_acquire+0xbd/0x220
             __mutex_lock+0x86/0xa10
             btrfs_record_unlink_dir+0x2a/0xa0
             btrfs_unlink+0x5a/0xc0
             vfs_unlink+0xb1/0x1a0
             do_unlinkat+0x264/0x2b0
             do_fast_syscall_32+0x9a/0x2f0
             entry_SYSENTER_compat+0x84/0x96
      
      -> #3 (sb_internal#2){.+.+}:
             lock_acquire+0xbd/0x220
             __sb_start_write+0x14d/0x230
             start_transaction+0x3e6/0x590
             btrfs_evict_inode+0x475/0x640
             evict+0xbf/0x1b0
             btrfs_run_delayed_iputs+0x6c/0x90
             cleaner_kthread+0x124/0x1a0
             kthread+0x106/0x140
             ret_from_fork+0x3a/0x50
      
      -> #2 (&fs_info->cleaner_delayed_iput_mutex){+.+.}:
             lock_acquire+0xbd/0x220
             __mutex_lock+0x86/0xa10
             btrfs_alloc_data_chunk_ondemand+0x197/0x530
             btrfs_check_data_free_space+0x4c/0x90
             btrfs_delalloc_reserve_space+0x20/0x60
             btrfs_page_mkwrite+0x87/0x520
             do_page_mkwrite+0x31/0xa0
             __handle_mm_fault+0x799/0xb00
             handle_mm_fault+0x7c/0xe0
             __do_page_fault+0x1d3/0x4a0
             async_page_fault+0x1e/0x30
      
      -> #1 (sb_pagefaults){.+.+}:
             lock_acquire+0xbd/0x220
             __sb_start_write+0x14d/0x230
             btrfs_page_mkwrite+0x6a/0x520
             do_page_mkwrite+0x31/0xa0
             __handle_mm_fault+0x799/0xb00
             handle_mm_fault+0x7c/0xe0
             __do_page_fault+0x1d3/0x4a0
             async_page_fault+0x1e/0x30
      
      -> #0 (&mm->mmap_sem){++++}:
             __lock_acquire+0x42e/0x7a0
             lock_acquire+0xbd/0x220
             down_read+0x48/0xb0
             get_user_pages_unlocked+0x5a/0x1e0
             get_user_pages_fast+0xa4/0x150
             iov_iter_get_pages+0xc3/0x340
             do_direct_IO+0xf93/0x1d70
             __blockdev_direct_IO+0x32d/0x1c20
             btrfs_direct_IO+0x227/0x400
             generic_file_direct_write+0xcf/0x180
             btrfs_file_write_iter+0x308/0x58c
             aio_write+0xf8/0x1d0
             io_submit_one+0x3a9/0x620
             __ia32_compat_sys_io_submit+0xb2/0x270
             do_int80_syscall_32+0x5b/0x1a0
             entry_INT80_compat+0x88/0xa0
      
      other info that might help us debug this:
      
      Chain exists of:
        &mm->mmap_sem --> &ei->log_mutex --> &ei->dio_sem
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&ei->dio_sem);
                                     lock(&ei->log_mutex);
                                     lock(&ei->dio_sem);
        lock(&mm->mmap_sem);
      
       *** DEADLOCK ***
      
      1 lock held by aio-dio-invalid/30928:
       #0: 00000000cefe6b35 (&ei->dio_sem){++++}, at: btrfs_direct_IO+0x3be/0x400
      
      stack backtrace:
      CPU: 0 PID: 30928 Comm: aio-dio-invalid Not tainted 4.18.0-rc4-xfstests-00025-g5de5edbaf1d4 #411
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
      Call Trace:
       dump_stack+0x7c/0xbb
       print_circular_bug.isra.37+0x297/0x2a4
       check_prev_add.constprop.45+0x781/0x7a0
       ? __lock_acquire+0x42e/0x7a0
       validate_chain.isra.41+0x7f0/0xb00
       __lock_acquire+0x42e/0x7a0
       lock_acquire+0xbd/0x220
       ? get_user_pages_unlocked+0x5a/0x1e0
       down_read+0x48/0xb0
       ? get_user_pages_unlocked+0x5a/0x1e0
       get_user_pages_unlocked+0x5a/0x1e0
       get_user_pages_fast+0xa4/0x150
       iov_iter_get_pages+0xc3/0x340
       do_direct_IO+0xf93/0x1d70
       ? __alloc_workqueue_key+0x358/0x490
       ? __blockdev_direct_IO+0x14b/0x1c20
       __blockdev_direct_IO+0x32d/0x1c20
       ? btrfs_run_delalloc_work+0x40/0x40
       ? can_nocow_extent+0x490/0x490
       ? kvm_clock_read+0x1f/0x30
       ? can_nocow_extent+0x490/0x490
       ? btrfs_run_delalloc_work+0x40/0x40
       btrfs_direct_IO+0x227/0x400
       ? btrfs_run_delalloc_work+0x40/0x40
       generic_file_direct_write+0xcf/0x180
       btrfs_file_write_iter+0x308/0x58c
       aio_write+0xf8/0x1d0
       ? kvm_clock_read+0x1f/0x30
       ? __might_fault+0x3e/0x90
       io_submit_one+0x3a9/0x620
       ? io_submit_one+0xe5/0x620
       __ia32_compat_sys_io_submit+0xb2/0x270
       do_int80_syscall_32+0x5b/0x1a0
       entry_INT80_compat+0x88/0xa0
      
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      51c62a33
    • J
      btrfs: don't run delayed_iputs in commit · dd472956
      Josef Bacik 提交于
      commit 30928e9baac238a7330085a1c5747f0b5df444b4 upstream.
      
      This could result in a really bad case where we do something like
      
      evict
        evict_refill_and_join
          btrfs_commit_transaction
            btrfs_run_delayed_iputs
              evict
                evict_refill_and_join
                  btrfs_commit_transaction
      ... forever
      
      We have plenty of other places where we run delayed iputs that are much
      safer, let those do the work.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dd472956
    • J
      btrfs: fix insert_reserved error handling · 186b5248
      Josef Bacik 提交于
      commit 80ee54bfe8a3850015585ebc84e8d207fcae6831 upstream.
      
      We were not handling the reserved byte accounting properly for data
      references.  Metadata was fine, if it errored out the error paths would
      free the bytes_reserved count and pin the extent, but it even missed one
      of the error cases.  So instead move this handling up into
      run_one_delayed_ref so we are sure that both cases are properly cleaned
      up in case of a transaction abort.
      
      CC: stable@vger.kernel.org # 4.18+
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      186b5248
    • J
      btrfs: only free reserved extent if we didn't insert it · 5a1e9bf4
      Josef Bacik 提交于
      commit 49940bdd57779c78462da7aa5a8650b2fea8c2ff upstream.
      
      When we insert the file extent once the ordered extent completes we free
      the reserved extent reservation as it'll have been migrated to the
      bytes_used counter.  However if we error out after this step we'll still
      clear the reserved extent reservation, resulting in a negative
      accounting of the reserved bytes for the block group and space info.
      Fix this by only doing the free if we didn't successfully insert a file
      extent for this extent.
      
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5a1e9bf4
    • J
      btrfs: don't use ctl->free_space for max_extent_size · a746cfd0
      Josef Bacik 提交于
      commit fb5c39d7a887108087de6ff93d3f326b01b4ef41 upstream.
      
      max_extent_size is supposed to be the largest contiguous range for the
      space info, and ctl->free_space is the total free space in the block
      group.  We need to keep track of these separately and _only_ use the
      max_free_space if we don't have a max_extent_size, as that means our
      original request was too large to search any of the block groups for and
      therefore wouldn't have a max_extent_size set.
      
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a746cfd0
    • J
      btrfs: set max_extent_size properly · 4a351e75
      Josef Bacik 提交于
      commit ad22cf6ea47fa20fbe11ac324a0a15c0a9a4a2a9 upstream.
      
      We can't use entry->bytes if our entry is a bitmap entry, we need to use
      entry->max_extent_size in that case.  Fix up all the logic to make this
      consistent.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4a351e75
    • J
      btrfs: reset max_extent_size properly · e982beca
      Josef Bacik 提交于
      commit 21a94f7acf0f748599ea552af5d9ee7d7e41c72f upstream.
      
      If we use up our block group before allocating a new one we'll easily
      get a max_extent_size that's set really really low, which will result in
      a lot of fragmentation.  We need to make sure we're resetting the
      max_extent_size when we add a new chunk or add new space.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e982beca
    • F
      Btrfs: fix deadlock when writing out free space caches · ea9c846f
      Filipe Manana 提交于
      commit 5ce555578e0919237fa4bda92b4670e2dd176f85 upstream.
      
      When writing out a block group free space cache we can end deadlocking
      with ourselves on an extent buffer lock resulting in a warning like the
      following:
      
        [245043.379979] WARNING: CPU: 4 PID: 2608 at fs/btrfs/locking.c:251 btrfs_tree_lock+0x1be/0x1d0 [btrfs]
        [245043.392792] CPU: 4 PID: 2608 Comm: btrfs-transacti Tainted: G
          W I      4.16.8 #1
        [245043.395489] RIP: 0010:btrfs_tree_lock+0x1be/0x1d0 [btrfs]
        [245043.396791] RSP: 0018:ffffc9000424b840 EFLAGS: 00010246
        [245043.398093] RAX: 0000000000000a30 RBX: ffff8807e20a3d20 RCX: 0000000000000001
        [245043.399414] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff8807e20a3d20
        [245043.400732] RBP: 0000000000000001 R08: ffff88041f39a700 R09: ffff880000000000
        [245043.402021] R10: 0000000000000040 R11: ffff8807e20a3d20 R12: ffff8807cb220630
        [245043.403296] R13: 0000000000000001 R14: ffff8807cb220628 R15: ffff88041fbdf000
        [245043.404780] FS:  0000000000000000(0000) GS:ffff88082fc80000(0000) knlGS:0000000000000000
        [245043.406050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [245043.407321] CR2: 00007fffdbdb9f10 CR3: 0000000001c09005 CR4: 00000000000206e0
        [245043.408670] Call Trace:
        [245043.409977]  btrfs_search_slot+0x761/0xa60 [btrfs]
        [245043.411278]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
        [245043.412572]  btrfs_insert_item+0x5b/0xc0 [btrfs]
        [245043.413922]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
        [245043.415216]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
        [245043.416487]  find_free_extent+0xcd0/0xf60 [btrfs]
        [245043.417813]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
        [245043.419105]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
        [245043.420378]  __btrfs_cow_block+0x127/0x550 [btrfs]
        [245043.421652]  btrfs_cow_block+0xee/0x190 [btrfs]
        [245043.422979]  btrfs_search_slot+0x227/0xa60 [btrfs]
        [245043.424279]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
        [245043.425538]  ? iput+0x72/0x1e0
        [245043.426798]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
        [245043.428131]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
        [245043.429419]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
        [245043.430712]  ? start_transaction+0x8e/0x410 [btrfs]
        [245043.432006]  transaction_kthread+0x184/0x1a0 [btrfs]
        [245043.433341]  kthread+0xf0/0x130
        [245043.434628]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
        [245043.435928]  ? kthread_create_worker_on_cpu+0x40/0x40
        [245043.437236]  ret_from_fork+0x1f/0x30
        [245043.441054] ---[ end trace 15abaa2aaf36827f ]---
      
      This is because at write_one_cache_group() when we are COWing a leaf from
      the extent tree we end up allocating a new block group (chunk) and,
      because we have hit a threshold on the number of bytes reserved for system
      chunks, we attempt to finalize the creation of new block groups from the
      current transaction, by calling btrfs_create_pending_block_groups().
      However here we also need to modify the extent tree in order to insert
      a block group item, and if the location for this new block group item
      happens to be in the same leaf that we were COWing earlier, we deadlock
      since btrfs_search_slot() tries to write lock the extent buffer that we
      locked before at write_one_cache_group().
      
      We have already hit similar cases in the past and commit d9a0540a
      ("Btrfs: fix deadlock when finalizing block group creation") fixed some
      of those cases by delaying the creation of pending block groups at the
      known specific spots that could lead to a deadlock. This change reworks
      that commit to be more generic so that we don't have to add similar logic
      to every possible path that can lead to a deadlock. This is done by
      making __btrfs_cow_block() disallowing the creation of new block groups
      (setting the transaction's can_flush_pending_bgs to false) before it
      attempts to allocate a new extent buffer for either the extent, chunk or
      device trees, since those are the trees that pending block creation
      modifies. Once the new extent buffer is allocated, it allows creation of
      pending block groups to happen again.
      
      This change depends on a recent patch from Josef which is not yet in
      Linus' tree, named "btrfs: make sure we create all new block groups" in
      order to avoid occasional warnings at btrfs_trans_release_chunk_metadata().
      
      Fixes: d9a0540a ("Btrfs: fix deadlock when finalizing block group creation")
      CC: stable@vger.kernel.org # 4.4+
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199753
      Link: https://lore.kernel.org/linux-btrfs/CAJtFHUTHna09ST-_EEiyWmDH6gAqS6wa=zMNMBsifj8ABu99cw@mail.gmail.com/Reported-by: NE V <eliventer@gmail.com>
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea9c846f
    • F
      Btrfs: fix assertion on fsync of regular file when using no-holes feature · e17af96e
      Filipe Manana 提交于
      commit 7ed586d0a8241e81d58c656c5b315f781fa6fc97 upstream.
      
      When using the NO_HOLES feature and logging a regular file, we were
      expecting that if we find an inline extent, that either its size in RAM
      (uncompressed and unenconded) matches the size of the file or if it does
      not, that it matches the sector size and it represents compressed data.
      This assertion does not cover a case where the length of the inline extent
      is smaller than the sector size and also smaller the file's size, such
      case is possible through fallocate. Example:
      
        $ mkfs.btrfs -f -O no-holes /dev/sdb
        $ mount /dev/sdb /mnt
      
        $ xfs_io -f -c "pwrite -S 0xb60 0 21" /mnt/foobar
        $ xfs_io -c "falloc 40 40" /mnt/foobar
        $ xfs_io -c "fsync" /mnt/foobar
      
      In the above example we trigger the assertion because the inline extent's
      length is 21 bytes while the file size is 80 bytes. The fallocate() call
      merely updated the file's size and did not touch the existing inline
      extent, as expected.
      
      So fix this by adjusting the assertion so that an inline extent length
      smaller than the file size is valid if the file size is smaller than the
      filesystem's sector size.
      
      A test case for fstests follows soon.
      Reported-by: NAnatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Fixes: a89ca6f2 ("Btrfs: fix fsync after truncate when no_holes feature is enabled")
      CC: stable@vger.kernel.org # 4.14+
      Link: https://lore.kernel.org/linux-btrfs/CAE5jQCfRSBC7n4pUTFJcmHh109=gwyT9mFkCOL+NKfzswmR=_Q@mail.gmail.com/Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e17af96e
    • F
      Btrfs: fix null pointer dereference on compressed write path error · 85c5f244
      Filipe Manana 提交于
      commit 3527a018c00e5dbada2f9d7ed5576437b6dd5cfb upstream.
      
      At inode.c:compress_file_range(), under the "free_pages_out" label, we can
      end up dereferencing the "pages" pointer when it has a NULL value. This
      case happens when "start" has a value of 0 and we fail to allocate memory
      for the "pages" pointer. When that happens we jump to the "cont" label and
      then enter the "if (start == 0)" branch where we immediately call the
      cow_file_range_inline() function. If that function returns 0 (success
      creating an inline extent) or an error (like -ENOMEM for example) we jump
      to the "free_pages_out" label and then access "pages[i]" leading to a NULL
      pointer dereference, since "nr_pages" has a value greater than zero at
      that point.
      
      Fix this by setting "nr_pages" to 0 when we fail to allocate memory for
      the "pages" pointer.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201119
      Fixes: 771ed689 ("Btrfs: Optimize compressed writeback and reads")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85c5f244
    • Q
      btrfs: qgroup: Dirty all qgroups before rescan · 8181a8f8
      Qu Wenruo 提交于
      commit 9c7b0c2e8dbfbcd80a71e2cbfe02704f26c185c6 upstream.
      
      [BUG]
      In the following case, rescan won't zero out the number of qgroup 1/0:
      
        $ mkfs.btrfs -fq $DEV
        $ mount $DEV /mnt
      
        $ btrfs quota enable /mnt
        $ btrfs qgroup create 1/0 /mnt
        $ btrfs sub create /mnt/sub
        $ btrfs qgroup assign 0/257 1/0 /mnt
      
        $ dd if=/dev/urandom of=/mnt/sub/file bs=1k count=1000
        $ btrfs sub snap /mnt/sub /mnt/snap
        $ btrfs quota rescan -w /mnt
        $ btrfs qgroup show -pcre /mnt
        qgroupid         rfer         excl     max_rfer     max_excl parent  child
        --------         ----         ----     --------     -------- ------  -----
        0/5          16.00KiB     16.00KiB         none         none ---     ---
        0/257      1016.00KiB     16.00KiB         none         none 1/0     ---
        0/258      1016.00KiB     16.00KiB         none         none ---     ---
        1/0        1016.00KiB     16.00KiB         none         none ---     0/257
      
      So far so good, but:
      
        $ btrfs qgroup remove 0/257 1/0 /mnt
        WARNING: quotas may be inconsistent, rescan needed
        $ btrfs quota rescan -w /mnt
        $ btrfs qgroup show -pcre  /mnt
        qgoupid         rfer         excl     max_rfer     max_excl parent  child
        --------         ----         ----     --------     -------- ------  -----
        0/5          16.00KiB     16.00KiB         none         none ---     ---
        0/257      1016.00KiB     16.00KiB         none         none ---     ---
        0/258      1016.00KiB     16.00KiB         none         none ---     ---
        1/0        1016.00KiB     16.00KiB         none         none ---     ---
      	     ^^^^^^^^^^     ^^^^^^^^ not cleared
      
      [CAUSE]
      Before rescan we call qgroup_rescan_zero_tracking() to zero out all
      qgroups' accounting numbers.
      
      However we don't mark all qgroups dirty, but rely on rescan to do so.
      
      If we have any high level qgroup without children, it won't be marked
      dirty during rescan, since we cannot reach that qgroup.
      
      This will cause QGROUP_INFO items of childless qgroups never get updated
      in the quota tree, thus their numbers will stay the same in "btrfs
      qgroup show" output.
      
      [FIX]
      Just mark all qgroups dirty in qgroup_rescan_zero_tracking(), so even if
      we have childless qgroups, their QGROUP_INFO items will still get
      updated during rescan.
      Reported-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Reviewed-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Tested-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8181a8f8
    • F
      Btrfs: fix wrong dentries after fsync of file that got its parent replaced · d2c6df39
      Filipe Manana 提交于
      commit 0f375eed92b5a407657532637ed9652611a682f5 upstream.
      
      In a scenario like the following:
      
        mkdir /mnt/A               # inode 258
        mkdir /mnt/B               # inode 259
        touch /mnt/B/bar           # inode 260
      
        sync
      
        mv /mnt/B/bar /mnt/A/bar
        mv -T /mnt/A /mnt/B
        fsync /mnt/B/bar
      
        <power fail>
      
      After replaying the log we end up with file bar having 2 hard links, both
      with the name 'bar' and one in the directory with inode number 258 and the
      other in the directory with inode number 259. Also, we end up with the
      directory inode 259 still existing and with the directory inode 258 still
      named as 'A', instead of 'B'. In this scenario, file 'bar' should only
      have one hard link, located at directory inode 258, the directory inode
      259 should not exist anymore and the name for directory inode 258 should
      be 'B'.
      
      This incorrect behaviour happens because when attempting to log the old
      parents of an inode, we skip any parents that no longer exist. Fix this
      by forcing a full commit if an old parent no longer exists.
      
      A test case for fstests follows soon.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2c6df39
    • F
      Btrfs: fix warning when replaying log after fsync of a tmpfile · 55f21e16
      Filipe Manana 提交于
      commit f2d72f42d5fa3bf33761d9e47201745f624fcff5 upstream.
      
      When replaying a log which contains a tmpfile (which necessarily has a
      link count of 0) we end up calling inc_nlink(), at
      fs/btrfs/tree-log.c:replay_one_buffer(), which produces a warning like
      the following:
      
        [195191.943673] WARNING: CPU: 0 PID: 6924 at fs/inode.c:342 inc_nlink+0x33/0x40
        [195191.943723] CPU: 0 PID: 6924 Comm: mount Not tainted 4.19.0-rc6-btrfs-next-38 #1
        [195191.943724] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
        [195191.943726] RIP: 0010:inc_nlink+0x33/0x40
        [195191.943728] RSP: 0018:ffffb96e425e3870 EFLAGS: 00010246
        [195191.943730] RAX: 0000000000000000 RBX: ffff8c0d1e6af4f0 RCX: 0000000000000006
        [195191.943731] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8c0d1e6af4f0
        [195191.943731] RBP: 0000000000000097 R08: 0000000000000001 R09: 0000000000000000
        [195191.943732] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb96e425e3a60
        [195191.943733] R13: ffff8c0d10cff0c8 R14: ffff8c0d0d515348 R15: ffff8c0d78a1b3f8
        [195191.943735] FS:  00007f570ee24480(0000) GS:ffff8c0dfb200000(0000) knlGS:0000000000000000
        [195191.943736] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [195191.943737] CR2: 00005593286277c8 CR3: 00000000bb8f2006 CR4: 00000000003606f0
        [195191.943739] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [195191.943740] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [195191.943741] Call Trace:
        [195191.943778]  replay_one_buffer+0x797/0x7d0 [btrfs]
        [195191.943802]  walk_up_log_tree+0x1c1/0x250 [btrfs]
        [195191.943809]  ? rcu_read_lock_sched_held+0x3f/0x70
        [195191.943825]  walk_log_tree+0xae/0x1d0 [btrfs]
        [195191.943840]  btrfs_recover_log_trees+0x1d7/0x4d0 [btrfs]
        [195191.943856]  ? replay_dir_deletes+0x280/0x280 [btrfs]
        [195191.943870]  open_ctree+0x1c3b/0x22a0 [btrfs]
        [195191.943887]  btrfs_mount_root+0x6b4/0x800 [btrfs]
        [195191.943894]  ? rcu_read_lock_sched_held+0x3f/0x70
        [195191.943899]  ? pcpu_alloc+0x55b/0x7c0
        [195191.943906]  ? mount_fs+0x3b/0x140
        [195191.943908]  mount_fs+0x3b/0x140
        [195191.943912]  ? __init_waitqueue_head+0x36/0x50
        [195191.943916]  vfs_kern_mount+0x62/0x160
        [195191.943927]  btrfs_mount+0x134/0x890 [btrfs]
        [195191.943936]  ? rcu_read_lock_sched_held+0x3f/0x70
        [195191.943938]  ? pcpu_alloc+0x55b/0x7c0
        [195191.943943]  ? mount_fs+0x3b/0x140
        [195191.943952]  ? btrfs_remount+0x570/0x570 [btrfs]
        [195191.943954]  mount_fs+0x3b/0x140
        [195191.943956]  ? __init_waitqueue_head+0x36/0x50
        [195191.943960]  vfs_kern_mount+0x62/0x160
        [195191.943963]  do_mount+0x1f9/0xd40
        [195191.943967]  ? memdup_user+0x4b/0x70
        [195191.943971]  ksys_mount+0x7e/0xd0
        [195191.943974]  __x64_sys_mount+0x21/0x30
        [195191.943977]  do_syscall_64+0x60/0x1b0
        [195191.943980]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [195191.943983] RIP: 0033:0x7f570e4e524a
        [195191.943986] RSP: 002b:00007ffd83589478 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
        [195191.943989] RAX: ffffffffffffffda RBX: 0000563f335b2060 RCX: 00007f570e4e524a
        [195191.943990] RDX: 0000563f335b2240 RSI: 0000563f335b2280 RDI: 0000563f335b2260
        [195191.943992] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000020
        [195191.943993] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000563f335b2260
        [195191.943994] R13: 0000563f335b2240 R14: 0000000000000000 R15: 00000000ffffffff
        [195191.944002] irq event stamp: 8688
        [195191.944010] hardirqs last  enabled at (8687): [<ffffffff9cb004c3>] console_unlock+0x503/0x640
        [195191.944012] hardirqs last disabled at (8688): [<ffffffff9ca037dd>] trace_hardirqs_off_thunk+0x1a/0x1c
        [195191.944018] softirqs last  enabled at (8638): [<ffffffff9cc0a5d1>] __set_page_dirty_nobuffers+0x101/0x150
        [195191.944020] softirqs last disabled at (8634): [<ffffffff9cc26bbe>] wb_wakeup_delayed+0x2e/0x60
        [195191.944022] ---[ end trace 5d6e873a9a0b811a ]---
      
      This happens because the inode does not have the flag I_LINKABLE set,
      which is a runtime only flag, not meant to be persisted, set when the
      inode is created through open(2) if the flag O_EXCL is not passed to it.
      Except for the warning, there are no other consequences (like corruptions
      or metadata inconsistencies).
      
      Since it's pointless to replay a tmpfile as it would be deleted in a
      later phase of the log replay procedure (it has a link count of 0), fix
      this by not logging tmpfiles and if a tmpfile is found in a log (created
      by a kernel without this change), skip the replay of the inode.
      
      A test case for fstests follows soon.
      
      Fixes: 471d557a ("Btrfs: fix loss of prealloc extents past i_size after fsync log replay")
      CC: stable@vger.kernel.org # 4.18+
      Reported-by: NMartin Steigerwald <martin@lichtvoll.de>
      Link: https://lore.kernel.org/linux-btrfs/3666619.NTnn27ZJZE@merkaba/Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55f21e16
    • J
      btrfs: make sure we create all new block groups · 1d6d4a03
      Josef Bacik 提交于
      commit 545e3366db823dc3342ca9d7fea803f829c9062f upstream.
      
      Allocating new chunks modifies both the extent and chunk tree, which can
      trigger new chunk allocations.  So instead of doing list_for_each_safe,
      just do while (!list_empty()) so we make sure we don't exit with other
      pending bg's still on our list.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Reviewed-by: NLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d6d4a03
    • J
      btrfs: reset max_extent_size on clear in a bitmap · 9aabbb2e
      Josef Bacik 提交于
      commit 553cceb49681d60975d00892877d4c871bf220f9 upstream.
      
      We need to clear the max_extent_size when we clear bits from a bitmap
      since it could have been from the range that contains the
      max_extent_size.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9aabbb2e
    • J
      btrfs: protect space cache inode alloc with GFP_NOFS · 9006ad16
      Josef Bacik 提交于
      commit 84de76a2fb217dc1b6bc2965cc397d1648aa1404 upstream.
      
      If we're allocating a new space cache inode it's likely going to be
      under a transaction handle, so we need to use memalloc_nofs_save() in
      order to avoid deadlocks, and more importantly lockdep messages that
      make xfstests fail.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9006ad16