1. 27 2月, 2020 5 次提交
    • R
      perf annotate: Fix --show-nr-samples for tui/stdio2 · 46ccb442
      Ravi Bangoria 提交于
      perf annotate --show-nr-samples does not really show number of samples.
      
      The reason is we have two separate variables for the same purpose.
      
      One is in symbol_conf.show_nr_samples and another is
      annotation_options.show_nr_samples.
      
      We save command line option in symbol_conf.show_nr_samples but uses
      annotation_option.show_nr_samples while rendering tui/stdio2 browser.
      
      Though, we copy symbol_conf.show_nr_samples to
      annotation__default_options.show_nr_samples but that is not really
      effective as we don't use annotation__default_options once we copy
      default options to dynamic variable annotate.opts in cmd_annotate().
      
      Instead of all these complication, keep only one variable and use it all
      over. symbol_conf.show_nr_samples is used by perf report/top as well. So
      let's kill annotation_options.show_nr_samples.
      
      On a side note, I've kept annotation_options.show_nr_samples definition
      because it's still used by perf-config code. Follow up patch to fix
      perf-config for annotate will remove annotation_options.show_nr_samples.
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yisheng Xie <xieyisheng1@huawei.com>
      Link: http://lore.kernel.org/lkml/20200213064306.160480-4-ravi.bangoria@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      46ccb442
    • R
      perf annotate: Fix --show-total-period for tui/stdio2 · 68aac855
      Ravi Bangoria 提交于
      perf annotate --show-total-period does not really show total period.
      
      The reason is we have two separate variables for the same purpose.
      
      One is in symbol_conf.show_total_period and another is
      annotation_options.show_total_period.
      
      We save command line option in symbol_conf.show_total_period but uses
      annotation_option.show_total_period while rendering tui/stdio2 browser.
      
      Though, we copy symbol_conf.show_total_period to
      annotation__default_options.show_total_period but that is not really
      effective as we don't use annotation__default_options once we copy
      default options to dynamic variable annotate.opts in cmd_annotate().
      
      Instead of all these complication, keep only one variable and use it all
      over. symbol_conf.show_total_period is used by perf report/top as well.
      So let's kill annotation_options.show_total_period.
      
      On a side note, I've kept annotation_options.show_total_period
      definition because it's still used by perf-config code. Follow up patch
      to fix perf-config for annotate will remove
      annotation_options.show_total_period.
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yisheng Xie <xieyisheng1@huawei.com>
      Link: http://lore.kernel.org/lkml/20200213064306.160480-3-ravi.bangoria@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      68aac855
    • R
      perf annotate/tui: Re-render title bar after switching back from script browser · 54cf752c
      Ravi Bangoria 提交于
      The 'perf annotate' TUI browser provides a 'r' hot key to switch to a
      script browser. But the annotate browser title bar becomes hidden while
      switching back from script browser. Fix it.
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yisheng Xie <xieyisheng1@huawei.com>
      Link: http://lore.kernel.org/lkml/20200213064306.160480-2-ravi.bangoria@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      54cf752c
    • A
      tools headers UAPI: Update tools's copy of kvm.h headers · 0d6f94fd
      Arnaldo Carvalho de Melo 提交于
      Picking the changes from:
      
        5ef8acbd ("KVM: nVMX: Emulate MTF when performing instruction emulation")
      
      Silencing this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
        diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
      
      No change in tooling ensues, just the x86 kvm tooling gets rebuilt as
      those headers are included in its build:
      
        $ cp arch/x86/include/uapi/asm/kvm.h tools/arch/x86/include/uapi/asm/kvm.h
        $ make -C tools/perf
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j12' parallel build
      
        Auto-detecting system features:
        ...                         dwarf: [ on  ]
        <SNIP>
        ...        disassembler-four-args: [ on  ]
      
          DESCEND  plugins
          CC       /tmp/build/perf/arch/x86/util/kvm-stat.o
        <SNIP>
          LD       /tmp/build/perf/arch/x86/util/perf-in.o
          LD       /tmp/build/perf/arch/x86/perf-in.o
          LD       /tmp/build/perf/arch/perf-in.o
          LD       /tmp/build/perf/perf-in.o
          LINK     /tmp/build/perf/perf
        <SNIP>
        $
      
      As it doesn't seem to be used there:
      
        $ grep STATE tools/perf/arch/x86/util/kvm-stat.c
        $
      
      And the 'perf trace' beautifier table generator isn't interested in
      these things:
      
        $ grep regex= tools/perf/trace/beauty/kvm_ioctl.sh
        regex='^#[[:space:]]*define[[:space:]]+KVM_(\w+)[[:space:]]+_IO[RW]*\([[:space:]]*KVMIO[[:space:]]*,[[:space:]]*(0x[[:xdigit:]]+).*'
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Oliver Upton <oupton@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0d6f94fd
    • A
      tools arch x86: Sync the msr-index.h copy with the kernel sources · d8e3ee2e
      Arnaldo Carvalho de Melo 提交于
      To pick up the changes from these csets:
      
        21b5ee59 ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF")
      
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
        $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
        $ git diff
        diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
        index ebe1685e92dd..d5e517d1c3dd 100644
        --- a/tools/arch/x86/include/asm/msr-index.h
        +++ b/tools/arch/x86/include/asm/msr-index.h
        @@ -512,6 +512,8 @@
         #define MSR_K7_HWCR                    0xc0010015
         #define MSR_K7_HWCR_SMMLOCK_BIT                0
         #define MSR_K7_HWCR_SMMLOCK            BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
        +#define MSR_K7_HWCR_IRPERF_EN_BIT      30
        +#define MSR_K7_HWCR_IRPERF_EN          BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT)
         #define MSR_K7_FID_VID_CTL             0xc0010041
         #define MSR_K7_FID_VID_STATUS          0xc0010042
        $
      
      That don't result in any change in tooling:
      
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
        $ diff -u before after
        $
      
      To silence this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
        diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d8e3ee2e
  2. 26 2月, 2020 3 次提交
    • I
      Merge tag 'perf-urgent-for-mingo-5.6-20200220' of... · 4c45945a
      Ingo Molnar 提交于
      Merge tag 'perf-urgent-for-mingo-5.6-20200220' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      auxtrace:
      
        Adrian Hunter:
      
        - Fix endless record after being terminated on arm-spe.
      
        Wei Li:
      
        - Fix endless record after being terminated on Intel PT and BTS and
          on ARM's cs-etm.
      
      perf test:
      
        Thomas Richter
      
        - Fix test trace+probe_vfs_getname.sh on s390
      
      PowerPC:
      
        Arnaldo Carvalho de Melo:
      
        - Sync powerpc syscall.tbl with the kernel sources.
      
      BPF:
      
        Arnaldo Carvalho de Melo:
      
        - Remove extraneous bpf/ subdir from bpf.h headers used to build bpf events.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4c45945a
    • L
      Merge tag 'riscv-for-linux-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · c5f86891
      Linus Torvalds 提交于
      Pull RISC-V fixes from Palmer Dabbelt:
       "This contains a handful of RISC-V related fixes that I've collected
        and would like to target for 5.6-rc4:
      
         - A fix to set up the PMPs on boot, which allows the kernel to access
           memory on systems that don't set up permissive PMPs before getting
           to Linux. This only effects machine-mode kernels, which currently
           means only NOMMU kernels.
      
         - A fix to avoid enabling supervisor-mode interrupts when running in
           machine-mode, also only for NOMMU kernels.
      
         - A pair of fixes to our KASAN support to avoid corrupting memory.
      
         - A gitignore fix.
      
        This boots on QEMU's virt board for me"
      
      * tag 'riscv-for-linux-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: adjust the indent
        riscv: allocate a complete page size for each page table
        riscv: Fix gitignore
        RISC-V: Don't enable all interrupts in trap_init()
        riscv: set pmp configuration if kernel is running in M-mode
      c5f86891
    • L
      Merge branch 'mips-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · d67f250e
      Linus Torvalds 提交于
      Pull MIPS fixes from Paul Burton:
       "Here are a few MIPS fixes, and a MAINTAINERS update to hand over MIPS
        maintenance to Thomas Bogendoerfer - this will be my final pull
        request as MIPS maintainer.
      
        Thanks for your helpful comments, useful corrections & responsiveness
        during the time I've fulfilled the role, and I'm sure I'll pop up
        elsewhere in the tree somewhere down the line"
      
      * 'mips-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MAINTAINERS: Hand MIPS over to Thomas
        MIPS: ingenic: DTS: Fix watchdog nodes
        MIPS: X1000: Fix clock of watchdog node.
        MIPS: vdso: Wrap -mexplicit-relocs in cc-option
        MIPS: VPE: Fix a double free and a memory leak in 'release_vpe()'
        MIPS: cavium_octeon: Fix syncw generation.
        mips: vdso: add build time check that no 'jalr t9' calls left
        MIPS: Disable VDSO time functionality on microMIPS
        mips: vdso: fix 'jalr t9' crash in vdso code
      d67f250e
  3. 25 2月, 2020 8 次提交
  4. 24 2月, 2020 4 次提交
    • L
      Linux 5.6-rc3 · f8788d86
      Linus Torvalds 提交于
      f8788d86
    • L
      Merge tag 'for-5.6-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · d2eee258
      Linus Torvalds 提交于
      Pull btrfs fixes from David Sterba:
       "These are fixes that were found during testing with help of error
        injection, plus some other stable material.
      
        There's a fixup to patch added to rc1 causing locking in wrong context
        warnings, tests found one more deadlock scenario. The patches are
        tagged for stable, two of them now in the queue but we'd like all
        three released at the same time.
      
        I'm not happy about fixes to fixes in such a fast succession during
        rcs, but I hope we found all the fallouts of commit 28553fa9
        ('Btrfs: fix race between shrinking truncate and fiemap')"
      
      * tag 'for-5.6-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: fix deadlock during fast fsync when logging prealloc extents beyond eof
        Btrfs: fix btrfs_wait_ordered_range() so that it waits for all ordered extents
        btrfs: fix bytes_may_use underflow in prealloc error condtition
        btrfs: handle logged extent failure properly
        btrfs: do not check delayed items are empty for single transaction cleanup
        btrfs: reset fs_root to NULL on error in open_ctree
        btrfs: destroy qgroup extent records on transaction abort
      d2eee258
    • L
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · a3163ca0
      Linus Torvalds 提交于
      Pull ext4 fixes from Ted Ts'o:
       "More miscellaneous ext4 bug fixes (all stable fodder)"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix mount failure with quota configured as module
        jbd2: fix ocfs2 corrupt when clearing block group bits
        ext4: fix race between writepages and enabling EXT4_EXTENTS_FL
        ext4: rename s_journal_flag_rwsem to s_writepages_rwsem
        ext4: fix potential race between s_flex_groups online resizing and access
        ext4: fix potential race between s_group_info online resizing and access
        ext4: fix potential race between online resizing and write operations
        ext4: add cond_resched() to __ext4_find_entry()
        ext4: fix a data race in EXT4_I(inode)->i_disksize
      a3163ca0
    • L
      Merge tag 'csky-for-linus-5.6-rc3' of git://github.com/c-sky/csky-linux · c6188dff
      Linus Torvalds 提交于
      Pull csky updates from Guo Ren:
       "Sorry, I missed 5.6-rc1 merge window, but in this pull request the
        most are the fixes and the rests are between fixes and features. The
        only outside modification is the MAINTAINERS file update with our
        mailing list.
      
         - cache flush implementation fixes
      
         - ftrace modify panic fix
      
         - CONFIG_SMP boot problem fix
      
         - fix pt_regs saving for atomic.S
      
         - fix fixaddr_init without highmem.
      
         - fix stack protector support
      
         - fix fake Tightly-Coupled Memory code compile and use
      
         - fix some typos and coding convention"
      
      * tag 'csky-for-linus-5.6-rc3' of git://github.com/c-sky/csky-linux: (23 commits)
        csky: Replace <linux/clk-provider.h> by <linux/of_clk.h>
        csky: Implement copy_thread_tls
        csky: Add PCI support
        csky: Minimize defconfig to support buildroot config.fragment
        csky: Add setup_initrd check code
        csky: Cleanup old Kconfig options
        arch/csky: fix some Kconfig typos
        csky: Fixup compile warning for three unimplemented syscalls
        csky: Remove unused cache implementation
        csky: Fixup ftrace modify panic
        csky: Add flush_icache_mm to defer flush icache all
        csky: Optimize abiv2 copy_to_user_page with VM_EXEC
        csky: Enable defer flush_dcache_page for abiv2 cpus (807/810/860)
        csky: Remove unnecessary flush_icache_* implementation
        csky: Support icache flush without specific instructions
        csky/Kconfig: Add Kconfig.platforms to support some drivers
        csky/smp: Fixup boot failed when CONFIG_SMP
        csky: Set regs->usp to kernel sp, when the exception is from kernel
        csky/mm: Fixup export invalid_pte_table symbol
        csky: Separate fixaddr_init from highmem
        ...
      c6188dff
  5. 23 2月, 2020 16 次提交
  6. 22 2月, 2020 4 次提交
    • X
      io_uring: fix __io_iopoll_check deadlock in io_sq_thread · c7849be9
      Xiaoguang Wang 提交于
      Since commit a3a0e43f ("io_uring: don't enter poll loop if we have
      CQEs pending"), if we already events pending, we won't enter poll loop.
      In case SETUP_IOPOLL and SETUP_SQPOLL are both enabled, if app has
      been terminated and don't reap pending events which are already in cq
      ring, and there are some reqs in poll_list, io_sq_thread will enter
      __io_iopoll_check(), and find pending events, then return, this loop
      will never have a chance to exit.
      
      I have seen this issue in fio stress tests, to fix this issue, let
      io_sq_thread call io_iopoll_getevents() with argument 'min' being zero,
      and remove __io_iopoll_check().
      
      Fixes: a3a0e43f ("io_uring: don't enter poll loop if we have CQEs pending")
      Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      c7849be9
    • J
      ext4: fix mount failure with quota configured as module · 9db176bc
      Jan Kara 提交于
      When CONFIG_QFMT_V2 is configured as a module, the test in
      ext4_feature_set_ok() fails and so mount of filesystems with quota or
      project features fails. Fix the test to use IS_ENABLED macro which
      works properly even for modules.
      
      Link: https://lore.kernel.org/r/20200221100835.9332-1-jack@suse.cz
      Fixes: d65d87a0 ("ext4: improve explanation of a mount failure caused by a misconfigured kernel")
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      9db176bc
    • W
      jbd2: fix ocfs2 corrupt when clearing block group bits · 8eedabfd
      wangyan 提交于
      I found a NULL pointer dereference in ocfs2_block_group_clear_bits().
      The running environment:
      	kernel version: 4.19
      	A cluster with two nodes, 5 luns mounted on two nodes, and do some
      	file operations like dd/fallocate/truncate/rm on every lun with storage
      	network disconnection.
      
      The fallocate operation on dm-23-45 caused an null pointer dereference.
      
      The information of NULL pointer dereference as follows:
      	[577992.878282] JBD2: Error -5 detected when updating journal superblock for dm-23-45.
      	[577992.878290] Aborting journal on device dm-23-45.
      	...
      	[577992.890778] JBD2: Error -5 detected when updating journal superblock for dm-24-46.
      	[577992.890908] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890916] (fallocate,88392,52):ocfs2_extend_trans:474 ERROR: status = -30
      	[577992.890918] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890920] (fallocate,88392,52):ocfs2_rotate_tree_right:2500 ERROR: status = -30
      	[577992.890922] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890924] (fallocate,88392,52):ocfs2_do_insert_extent:4382 ERROR: status = -30
      	[577992.890928] (fallocate,88392,52):ocfs2_insert_extent:4842 ERROR: status = -30
      	[577992.890928] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890930] (fallocate,88392,52):ocfs2_add_clusters_in_btree:4947 ERROR: status = -30
      	[577992.890933] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890939] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890949] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
      	[577992.890950] Mem abort info:
      	[577992.890951]   ESR = 0x96000004
      	[577992.890952]   Exception class = DABT (current EL), IL = 32 bits
      	[577992.890952]   SET = 0, FnV = 0
      	[577992.890953]   EA = 0, S1PTW = 0
      	[577992.890954] Data abort info:
      	[577992.890955]   ISV = 0, ISS = 0x00000004
      	[577992.890956]   CM = 0, WnR = 0
      	[577992.890958] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000f8da07a9
      	[577992.890960] [0000000000000020] pgd=0000000000000000
      	[577992.890964] Internal error: Oops: 96000004 [#1] SMP
      	[577992.890965] Process fallocate (pid: 88392, stack limit = 0x00000000013db2fd)
      	[577992.890968] CPU: 52 PID: 88392 Comm: fallocate Kdump: loaded Tainted: G        W  OE     4.19.36 #1
      	[577992.890969] Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 0.98 08/25/2019
      	[577992.890971] pstate: 60400009 (nZCv daif +PAN -UAO)
      	[577992.891054] pc : _ocfs2_free_suballoc_bits+0x63c/0x968 [ocfs2]
      	[577992.891082] lr : _ocfs2_free_suballoc_bits+0x618/0x968 [ocfs2]
      	[577992.891084] sp : ffff0000c8e2b810
      	[577992.891085] x29: ffff0000c8e2b820 x28: 0000000000000000
      	[577992.891087] x27: 00000000000006f3 x26: ffffa07957b02e70
      	[577992.891089] x25: ffff807c59d50000 x24: 00000000000006f2
      	[577992.891091] x23: 0000000000000001 x22: ffff807bd39abc30
      	[577992.891093] x21: ffff0000811d9000 x20: ffffa07535d6a000
      	[577992.891097] x19: ffff000001681638 x18: ffffffffffffffff
      	[577992.891098] x17: 0000000000000000 x16: ffff000080a03df0
      	[577992.891100] x15: ffff0000811d9708 x14: 203d207375746174
      	[577992.891101] x13: 73203a524f525245 x12: 20373439343a6565
      	[577992.891103] x11: 0000000000000038 x10: 0101010101010101
      	[577992.891106] x9 : ffffa07c68a85d70 x8 : 7f7f7f7f7f7f7f7f
      	[577992.891109] x7 : 0000000000000000 x6 : 0000000000000080
      	[577992.891110] x5 : 0000000000000000 x4 : 0000000000000002
      	[577992.891112] x3 : ffff000001713390 x2 : 2ff90f88b1c22f00
      	[577992.891114] x1 : ffff807bd39abc30 x0 : 0000000000000000
      	[577992.891116] Call trace:
      	[577992.891139]  _ocfs2_free_suballoc_bits+0x63c/0x968 [ocfs2]
      	[577992.891162]  _ocfs2_free_clusters+0x100/0x290 [ocfs2]
      	[577992.891185]  ocfs2_free_clusters+0x50/0x68 [ocfs2]
      	[577992.891206]  ocfs2_add_clusters_in_btree+0x198/0x5e0 [ocfs2]
      	[577992.891227]  ocfs2_add_inode_data+0x94/0xc8 [ocfs2]
      	[577992.891248]  ocfs2_extend_allocation+0x1bc/0x7a8 [ocfs2]
      	[577992.891269]  ocfs2_allocate_extents+0x14c/0x338 [ocfs2]
      	[577992.891290]  __ocfs2_change_file_space+0x3f8/0x610 [ocfs2]
      	[577992.891309]  ocfs2_fallocate+0xe4/0x128 [ocfs2]
      	[577992.891316]  vfs_fallocate+0x11c/0x250
      	[577992.891317]  ksys_fallocate+0x54/0x88
      	[577992.891319]  __arm64_sys_fallocate+0x28/0x38
      	[577992.891323]  el0_svc_common+0x78/0x130
      	[577992.891325]  el0_svc_handler+0x38/0x78
      	[577992.891327]  el0_svc+0x8/0xc
      
      My analysis process as follows:
      ocfs2_fallocate
        __ocfs2_change_file_space
          ocfs2_allocate_extents
            ocfs2_extend_allocation
              ocfs2_add_inode_data
                ocfs2_add_clusters_in_btree
                  ocfs2_insert_extent
                    ocfs2_do_insert_extent
                      ocfs2_rotate_tree_right
                        ocfs2_extend_rotate_transaction
                          ocfs2_extend_trans
                            jbd2_journal_restart
                              jbd2__journal_restart
                                /* handle->h_transaction is NULL,
                                 * is_handle_aborted(handle) is true
                                 */
                                handle->h_transaction = NULL;
                                start_this_handle
                                  return -EROFS;
                  ocfs2_free_clusters
                    _ocfs2_free_clusters
                      _ocfs2_free_suballoc_bits
                        ocfs2_block_group_clear_bits
                          ocfs2_journal_access_gd
                            __ocfs2_journal_access
                              jbd2_journal_get_undo_access
                                /* I think jbd2_write_access_granted() will
                                 * return true, because do_get_write_access()
                                 * will return -EROFS.
                                 */
                                if (jbd2_write_access_granted(...)) return 0;
                                do_get_write_access
                                  /* handle->h_transaction is NULL, it will
                                   * return -EROFS here, so do_get_write_access()
                                   * was not called.
                                   */
                                  if (is_handle_aborted(handle)) return -EROFS;
                          /* bh2jh(group_bh) is NULL, caused NULL
                             pointer dereference */
                          undo_bg = (struct ocfs2_group_desc *)
                                      bh2jh(group_bh)->b_committed_data;
      
      If handle->h_transaction == NULL, then jbd2_write_access_granted()
      does not really guarantee that journal_head will stay around,
      not even speaking of its b_committed_data. The bh2jh(group_bh)
      can be removed after ocfs2_journal_access_gd() and before call
      "bh2jh(group_bh)->b_committed_data". So, we should move
      is_handle_aborted() check from do_get_write_access() into
      jbd2_journal_get_undo_access() and jbd2_journal_get_write_access()
      before the call to jbd2_write_access_granted().
      
      Link: https://lore.kernel.org/r/f72a623f-b3f1-381a-d91d-d22a1c83a336@huawei.comSigned-off-by: NYan Wang <wangyan122@huawei.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NJun Piao <piaojun@huawei.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      8eedabfd
    • E
      ext4: fix race between writepages and enabling EXT4_EXTENTS_FL · cb85f4d2
      Eric Biggers 提交于
      If EXT4_EXTENTS_FL is set on an inode while ext4_writepages() is running
      on it, the following warning in ext4_add_complete_io() can be hit:
      
      WARNING: CPU: 1 PID: 0 at fs/ext4/page-io.c:234 ext4_put_io_end_defer+0xf0/0x120
      
      Here's a minimal reproducer (not 100% reliable) (root isn't required):
      
              while true; do
                      sync
              done &
              while true; do
                      rm -f file
                      touch file
                      chattr -e file
                      echo X >> file
                      chattr +e file
              done
      
      The problem is that in ext4_writepages(), ext4_should_dioread_nolock()
      (which only returns true on extent-based files) is checked once to set
      the number of reserved journal credits, and also again later to select
      the flags for ext4_map_blocks() and copy the reserved journal handle to
      ext4_io_end::handle.  But if EXT4_EXTENTS_FL is being concurrently set,
      the first check can see dioread_nolock disabled while the later one can
      see it enabled, causing the reserved handle to unexpectedly be NULL.
      
      Since changing EXT4_EXTENTS_FL is uncommon, and there may be other races
      related to doing so as well, fix this by synchronizing changing
      EXT4_EXTENTS_FL with ext4_writepages() via the existing
      s_writepages_rwsem (previously called s_journal_flag_rwsem).
      
      This was originally reported by syzbot without a reproducer at
      https://syzkaller.appspot.com/bug?extid=2202a584a00fffd19fbf,
      but now that dioread_nolock is the default I also started seeing this
      when running syzkaller locally.
      
      Link: https://lore.kernel.org/r/20200219183047.47417-3-ebiggers@kernel.org
      Reported-by: syzbot+2202a584a00fffd19fbf@syzkaller.appspotmail.com
      Fixes: 6b523df4 ("ext4: use transaction reservation for extent conversion in ext4_end_io")
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      cb85f4d2