1. 18 Jan, 2020 22 commits
  2. 27 Dec, 2019 4 commits
  3. 26 Dec, 2019 1 commit
  4. 23 Dec, 2019 13 commits
    • R
      ext4: Move to shared i_rwsem even without dioread_nolock mount opt · bc6385da
      Ritesh Harjani authored
      Previously, DIO overwrites took the shared lock only when the dioread_nolock
      mount option was set. With the current code this mount option condition is
      no longer needed, because:

      1. There is no race between buffered writes and DIO overwrites, since buffered
      writes take the exclusive lock and DIO overwrites take the shared lock. The
      DIO path also makes sure to flush and wait for any dirty page cache data.

      2. There is no race between buffered reads and DIO overwrites, since no block
      allocation is possible with DIO overwrites, so no stale data can be exposed.
      The same holds between DIO reads and DIO overwrites.

      3. Other paths such as truncate are also protected, since they wait for any
      DIO in flight to complete.
      Reviewed-by: Jan Kara <jack@suse.cz>
      Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Link: https://lore.kernel.org/r/20191212055557.11151-4-riteshh@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • R
      ext4: Start with shared i_rwsem in case of DIO instead of exclusive · aa9714d0
      Ritesh Harjani authored
      Earlier there was no shared lock in the DIO read path. Commit 16c54688
      ("ext4: Allow parallel DIO reads") simplified some of the locking while still
      allowing parallel DIO reads, by taking the shared inode lock in the DIO read
      path.

      But this created a problem with mixed read/write workloads. In the DIO write
      path we first take the exclusive lock and downgrade it only once we determine
      that the IO is an overwrite. Since DIO reads take the shared lock, that
      initial exclusive lock makes writers contend with readers.

      This patch fixes the issue by starting with the shared lock and switching to
      the exclusive lock only when ext4_dio_write_checks() finds it is required.
      
      It also simplifies the following cases:

      1. Simplified the ext4_unaligned_aio API to ext4_unaligned_io. The previous
      API was misused: it did not actually check for AIO anywhere, and it also
      checked for extending writes. It has been renamed and simplified to
      ext4_unaligned_io(), which only checks whether the IO is really unaligned.

      In the case of unaligned direct IO, iomap_dio_rw needs to zero the partial
      block, which requires serialization against other direct IOs to the same
      block. So we take the exclusive inode lock for any unaligned DIO. In the case
      of AIO we also need to wait for any outstanding IOs to complete, so that
      conversion from unwritten to written is finished before anyone tries to map
      the overlapping block. Hence, for unaligned DIO we take the exclusive inode
      lock and also call inode_dio_wait(). Note that since we take the exclusive
      lock for unaligned IO anyway, inode_dio_wait() becomes a no-op for non-AIO
      DIO.

      2. Added ext4_extending_io(), which checks whether the IO extends the file.

      3. Added ext4_dio_write_checks(). Here we start with the shared inode lock and
      switch to the exclusive lock only if required. So in most cases (aligned,
      non-extending, dioread_nolock, overwrite) the write proceeds under the shared
      lock. Otherwise we restart the operation in ext4_dio_write_checks() after
      acquiring the exclusive lock.
      Reviewed-by: Jan Kara <jack@suse.cz>
      Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Link: https://lore.kernel.org/r/20191212055557.11151-3-riteshh@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • R
      ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAIT · f629afe3
      Ritesh Harjani authored
      Apparently our current rwsem code doesn't like doing the trylock, then
      lock for real scheme.  So change our dax read/write methods to just do the
      trylock for the RWF_NOWAIT case.
      This seems to fix an AIM7 regression of up to ~25% in some scalable
      filesystems, as claimed in commit 942491c9 ("xfs: fix AIM7 regression").
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
      Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Link: https://lore.kernel.org/r/20191212055557.11151-2-riteshh@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • T
      ext4: treat buffers contining write errors as valid in ext4_sb_bread() · cf2834a5
      Theodore Ts'o authored
      In commit 7963e5ac ("ext4: treat buffers with write errors as
      containing valid data") we missed changing ext4_sb_bread() to use
      ext4_buffer_uptodate().  So fix this oversight.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • L
      Linux 5.5-rc3 · 46cf053e
      Linus Torvalds authored
    • L
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 9efa3ed5
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "Eric's s_inodes softlockup fixes + Jan's fix for recent regression
        from pipe rework"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: call fsnotify_sb_delete after evict_inodes
        fs: avoid softlockups in s_inodes iterators
        pipe: Fix bogus dereference in iov_iter_alignment()
    • L
      Merge tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · c6017471
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Fix a few bugs that could lead to corrupt files, fsck complaints, and
        filesystem crashes:
      
         - Minor documentation fixes
      
         - Fix a file corruption due to read racing with an insert range
           operation.
      
         - Fix log reservation overflows when allocating large rt extents
      
         - Fix a buffer log item flags check
      
         - Don't allow administrators to mount with sunit= options that will
           cause later xfs_repair complaints about the root directory being
           suspicious because the fs geometry appeared inconsistent
      
         - Fix a non-static helper that should have been static"
      
      * tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: Make the symbol 'xfs_rtalloc_log_count' static
        xfs: don't commit sunit/swidth updates to disk if that would cause repair failures
        xfs: split the sunit parameter update into two parts
        xfs: refactor agfl length computation function
        libxfs: resync with the userspace libxfs
        xfs: use bitops interface for buf log item AIL flag check
        xfs: fix log reservation overflows when allocating large rt extents
        xfs: stabilize insert range start boundary to avoid COW writeback race
        xfs: fix Sphinx documentation warning
    • L
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · a3965607
      Linus Torvalds authored
      Pull ext4 bug fixes from Ted Ts'o:
       "Ext4 bug fixes, including a regression fix"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: clarify impact of 'commit' mount option
        ext4: fix unused-but-set-variable warning in ext4_add_entry()
        jbd2: fix kernel-doc notation warning
        ext4: use RCU API in debug_print_tree
        ext4: validate the debug_want_extra_isize mount option at parse time
        ext4: reserve revoke credits in __ext4_new_inode
        ext4: unlock on error in ext4_expand_extra_isize()
        ext4: optimize __ext4_check_dir_entry()
        ext4: check for directory entries too close to block end
        ext4: fix ext4_empty_dir() for directories with holes
    • L
      Merge tag 'block-5.5-20191221' of git://git.kernel.dk/linux-block · 44579f35
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Let's try this one again, this time without the compat_ioctl changes.
        We've got those fixed up, but that can go out next week.
      
        This contains:
      
         - block queue flush lockdep annotation (Bart)
      
         - Type fix for bsg_queue_rq() (Bart)
      
         - Three dasd fixes (Stefan, Jan)
      
         - nbd deadlock fix (Mike)
      
         - Error handling bio user map fix (Yang)
      
         - iocost fix (Tejun)
      
         - sbitmap waitqueue addition fix that affects the kyber IO scheduler
           (David)"
      
      * tag 'block-5.5-20191221' of git://git.kernel.dk/linux-block:
        sbitmap: only queue kyber's wait callback if not already active
        block: fix memleak when __blk_rq_map_user_iov() is failed
        s390/dasd: fix typo in copyright statement
        s390/dasd: fix memleak in path handling error case
        s390/dasd/cio: Interpret ccw_device_get_mdc return value correctly
        block: Fix a lockdep complaint triggered by request queue flushing
        block: Fix the type of 'sts' in bsg_queue_rq()
        block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT
        nbd: fix shutdown and recv work deadlock v2
        iocost: over-budget forced IOs should schedule async delay
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a313c8e0
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "PPC:
         - Fix a bug where we try to do an ultracall on a system without an
           ultravisor
      
        KVM:
         - Fix uninitialised sysreg accessor
         - Fix handling of demand-paged device mappings
         - Stop spamming the console on IMPDEF sysregs
         - Relax mappings of writable memslots
         - Assorted cleanups
      
        MIPS:
         - Now orphan, James Hogan is stepping down
      
        x86:
         - MAINTAINERS change, so long Radim and thanks for all the fish
         - supported CPUID fixes for AMD machines without SPEC_CTRL"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        MAINTAINERS: remove Radim from KVM maintainers
        MAINTAINERS: Orphan KVM for MIPS
        kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD
        kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD
        KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor
        KVM: arm/arm64: Properly handle faulting of device mappings
        KVM: arm64: Ensure 'params' is initialised when looking up sys register
        KVM: arm/arm64: Remove excessive permission check in kvm_arch_prepare_memory_region
        KVM: arm64: Don't log IMP DEF sysreg traps
        KVM: arm64: Sanely ratelimit sysreg messages
        KVM: arm/arm64: vgic: Use wrapper function to lock/unlock all vcpus in kvm_vgic_create()
        KVM: arm/arm64: vgic: Fix potential double free dist->spis in __kvm_vgic_destroy()
        KVM: arm/arm64: Get rid of unused arg in cpu_init_hyp_mode()
    • L
      Merge tag 'riscv/for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 7214618c
      Linus Torvalds authored
      Pull RISC-V fixes from Paul Walmsley:
       "Several fixes, and one cleanup, for RISC-V.
      
        Fixes:
      
         - Fix an error in a Kconfig file that resulted in an undefined
           Kconfig option "CONFIG_CONFIG_MMU"
      
         - Fix undefined Kconfig option "CONFIG_CONFIG_MMU"
      
         - Fix scratch register clearing in M-mode (affects nommu users)
      
         - Fix a mismerge on my part that broke the build for
           CONFIG_SPARSEMEM_VMEMMAP users
      
        Cleanup:
      
         - Move SiFive L2 cache-related code to drivers/soc, per request"
      
      * tag 'riscv/for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: move sifive_l2_cache.c to drivers/soc
        riscv: define vmemmap before pfn_to_page calls
        riscv: fix scratch register clearing in M-mode.
        riscv: Fix use of undefined config option CONFIG_CONFIG_MMU
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 78bac77b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Several nf_flow_table_offload fixes from Pablo Neira Ayuso,
          including adding a missing ipv6 match description.
      
       2) Several heap overflow fixes in mwifiex from qize wang and Ganapathi
          Bhat.
      
       3) Fix uninit value in bond_neigh_init(), from Eric Dumazet.
      
       4) Fix non-ACPI probing of nxp-nci, from Stephan Gerhold.
      
       5) Fix use after free in tipc_disc_rcv(), from Tuong Lien.
      
       6) Enforce limit of 33 tail calls in mips and riscv JIT, from Paul
          Chaignon.
      
       7) Multicast MAC limit test is off by one in qede, from Manish Chopra.
      
       8) Fix established socket lookup race when socket goes from
          TCP_ESTABLISHED to TCP_LISTEN, because there lacks an intervening
          RCU grace period. From Eric Dumazet.
      
       9) Don't send empty SKBs from tcp_write_xmit(), also from Eric Dumazet.
      
      10) Fix active backup transition after link failure in bonding, from
          Mahesh Bandewar.
      
      11) Avoid zero sized hash table in gtp driver, from Taehee Yoo.
      
      12) Fix wrong interface passed to ->mac_link_up(), from Russell King.
      
      13) Fix DSA egress flooding settings in b53, from Florian Fainelli.
      
      14) Memory leak in gmac_setup_txqs(), from Navid Emamdoost.
      
      15) Fix double free in dpaa2-ptp code, from Ioana Ciornei.
      
      16) Reject invalid MTU values in stmmac, from Jose Abreu.
      
      17) Fix refcount leak in error path of u32 classifier, from Davide
          Caratti.
      
      18) Fix regression causing iwlwifi firmware crashes on boot, from Anders
          Kaseorg.
      
      19) Fix inverted return value logic in llc2 code, from Chan Shu Tak.
      
      20) Disable hardware GRO when XDP is attached to qede, from Manish
          Chopra.
      
      21) Since we encode state in the low pointer bits, dst metrics must be
          at least 4 byte aligned, which is not necessarily true on m68k. Add
          annotations to fix this, from Geert Uytterhoeven.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (160 commits)
        sfc: Include XDP packet headroom in buffer step size.
        sfc: fix channel allocation with brute force
        net: dst: Force 4-byte alignment of dst_metrics
        selftests: pmtu: fix init mtu value in description
        hv_netvsc: Fix unwanted rx_table reset
        net: phy: ensure that phy IDs are correctly typed
        mod_devicetable: fix PHY module format
        qede: Disable hardware gro when xdp prog is installed
        net: ena: fix issues in setting interrupt moderation params in ethtool
        net: ena: fix default tx interrupt moderation interval
        net/smc: unregister ib devices in reboot_event
        net: stmmac: platform: Fix MDIO init for platforms without PHY
        llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c)
        net: hisilicon: Fix a BUG trigered by wrong bytes_compl
        net: dsa: ksz: use common define for tag len
        s390/qeth: don't return -ENOTSUPP to userspace
        s390/qeth: fix promiscuous mode after reset
        s390/qeth: handle error due to unsupported transport mode
        cxgb4: fix refcount init for TC-MQPRIO offload
        tc-testing: initial tdc selftests for cls_u32
        ...
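      Item 21 above notes that state is encoded in the low bits of a pointer, so
      the pointed-to object must be at least 4-byte aligned. The sketch below
      shows why, in plain C11. It is not the kernel's dst_metrics code:
      toy_metrics, toy_pack(), toy_unpack_ptr() and toy_unpack_flags() are
      hypothetical names, and alignas(4) merely stands in for whatever alignment
      annotation the actual fix uses.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/*
 * Three 16-bit values would naturally need only 2-byte alignment on some
 * ABIs. Forcing 4-byte alignment guarantees the two low address bits are
 * always zero and can therefore carry flags.
 */
struct toy_metrics {
    alignas(4) unsigned short vals[3];
};
static_assert(alignof(struct toy_metrics) >= 4, "low two bits must be free");

#define TOY_FLAG_MASK ((uintptr_t)3)

/* Combine a pointer and up to two flag bits into one word. */
static uintptr_t toy_pack(const struct toy_metrics *m, unsigned int flags)
{
    assert(((uintptr_t)m & TOY_FLAG_MASK) == 0);
    return (uintptr_t)m | ((uintptr_t)flags & TOY_FLAG_MASK);
}

/* Recover the pointer by masking off the flag bits. */
static const struct toy_metrics *toy_unpack_ptr(uintptr_t v)
{
    return (const struct toy_metrics *)(v & ~TOY_FLAG_MASK);
}

/* Recover the flag bits. */
static unsigned int toy_unpack_flags(uintptr_t v)
{
    return (unsigned int)(v & TOY_FLAG_MASK);
}
```

      If the structure were only 2-byte aligned, an object could sit at an address
      with bit 1 set, and masking off the flag bits would yield the wrong pointer.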
    • J
      pipe: fix empty pipe check in pipe_write() · 0dd1e377
      Jan Stancek authored
      LTP pipeio_1 test is hanging with v5.5-rc2-385-gb8e382a1,
      with read side observing empty pipe and sleeping and write
      side running out of space and then sleeping as well. In this
      scenario there are 5 writers and 1 reader.
      
      Problem is that after pipe_write() reacquires pipe lock, it
      re-checks for empty pipe with potentially stale 'head' and
      doesn't wake up read side anymore. pipe->tail can advance
      beyond 'head', because there are multiple writers.
      
      Use pipe->head for empty pipe check after reacquiring lock
      to observe current state.
      
      Testing: With the patch, LTP pipeio_1 ran successfully in a loop for 1 hour.
               Without the patch it hung within a minute.
      
      Fixes: 1b6b26ae ("pipe: fix and clarify pipe write wakeup logic")
      Reported-by: Rachel Sibley <rasibley@redhat.com>
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
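      The stale-'head' problem described in the commit message above can be
      modelled with two counters. This is a single-threaded sketch of the check
      only, not the pipe code itself: toy_pipe, toy_was_empty_stale() and
      toy_was_empty_current() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring state: the pipe is empty when head == tail. */
struct toy_pipe {
    unsigned int head;  /* next slot a writer fills */
    unsigned int tail;  /* next slot the reader drains */
};

/*
 * Check modelled on the buggy behaviour: compares a 'head' value that was
 * read before the writer slept against the current tail.
 */
static bool toy_was_empty_stale(const struct toy_pipe *p, unsigned int head_snapshot)
{
    assert(p != NULL);
    return head_snapshot == p->tail;
}

/* Check modelled on the fix: reads the current head after relocking. */
static bool toy_was_empty_current(const struct toy_pipe *p)
{
    assert(p != NULL);
    return p->head == p->tail;
}
```

      Suppose a writer recorded head = 4 and slept, other writers then advanced
      head to 8, and the reader drained everything so tail = 8. The stale check
      compares 4 with 8 and reports "not empty", so the reader is never woken,
      while the current check correctly reports an empty pipe.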