1. 02 Oct, 2018 3 commits
    • ext4: fix reserved cluster accounting at page invalidation time · f456767d
      Eric Whitney committed
      Add new code to count canceled pending cluster reservations on bigalloc
      file systems and to reduce the cluster reservation count on all file
      systems using delayed allocation.  This replaces old code in
      ext4_da_page_release_reservations that was incorrect.
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix reserved cluster accounting at delayed write time · 0b02f4c0
      Eric Whitney committed
      The code in ext4_da_map_blocks sometimes reserves space for more
      delayed allocated clusters than it should, resulting in premature
      ENOSPC, exceeded quota, and inaccurate free space reporting.
      
      Fix this by checking for written and unwritten blocks shared in the
      same cluster with the newly delayed allocated block.  A cluster
      reservation should not be made for a cluster for which physical space
      has already been allocated.
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
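The rule stated in 0b02f4c0 (no cluster reservation when physical space already backs the cluster) can be illustrated with a small userspace sketch. Everything here is hypothetical: `blk_state`, `cluster_needs_reservation`, and the flat array standing in for the extents status tree are illustration only, and the cluster ratio is assumed to be a power of two. This is not the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-block states; ext4 tracks these in the extents status tree. */
enum blk_state { BLK_HOLE, BLK_DELAYED, BLK_WRITTEN, BLK_UNWRITTEN };

/*
 * Return true if delaying allocation of block `lblk` should reserve a new
 * cluster. `ratio` is the number of blocks per cluster (assumed power of two).
 */
static bool cluster_needs_reservation(const enum blk_state *blocks,
                                      size_t nblocks, size_t lblk,
                                      size_t ratio)
{
    size_t first = lblk & ~(ratio - 1);   /* first block of the cluster */
    size_t last = first + ratio;          /* one past the last block */

    if (last > nblocks)
        last = nblocks;
    for (size_t i = first; i < last; i++) {
        /* Physical space already exists for this cluster: do not reserve. */
        if (blocks[i] == BLK_WRITTEN || blocks[i] == BLK_UNWRITTEN)
            return false;
    }
    return true;
}
```

With a ratio of 4, a delayed write to block 2 needs no reservation if block 1 is already written, while a write into an entirely empty cluster does.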
    • ext4: generalize extents status tree search functions · ad431025
      Eric Whitney committed
      Ext4 contains a few functions that are used to search for delayed
      extents or blocks in the extents status tree.  Rather than duplicate
      code to add new functions to search for extents with different status
      values, such as written or a combination of delayed and unwritten,
      generalize the existing code to search for caller-specified extents
      status values.  Also, move this code into extents_status.c where it
      is better associated with the data structures it operates upon, and
      where it can be more readily used to implement new extents status tree
      functions that might want a broader scope for i_es_lock.
      
      Three missing static specifiers in RFC version of patch reported and
      fixed by Fengguang Wu <fengguang.wu@intel.com>.
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
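The generalization described in ad431025 amounts to passing the matching criterion in from the caller instead of hard-coding "delayed". A minimal userspace sketch of that idea follows. The names (`struct extent`, `match_fn`, `find_extent_range`, the `ES_*` bits) are hypothetical, and a sorted array stands in for the kernel's tree.

```c
#include <stddef.h>

/* Hypothetical status bits, loosely modeled on the statuses named above. */
enum {
    ES_WRITTEN   = 1 << 0,
    ES_UNWRITTEN = 1 << 1,
    ES_DELAYED   = 1 << 2,
};

struct extent {
    unsigned long start;   /* first logical block */
    unsigned long len;     /* number of blocks */
    unsigned int status;   /* one of the ES_* bits */
};

/* Caller-supplied predicate deciding whether an extent matches. */
typedef int (*match_fn)(const struct extent *es);

static int is_delayed(const struct extent *es)
{
    return (es->status & ES_DELAYED) != 0;
}

static int is_delayed_or_unwritten(const struct extent *es)
{
    return (es->status & (ES_DELAYED | ES_UNWRITTEN)) != 0;
}

/*
 * Return the first extent overlapping [lblk, end] that satisfies `match`,
 * or NULL if there is none.
 */
static const struct extent *find_extent_range(const struct extent *tree,
                                              size_t n, match_fn match,
                                              unsigned long lblk,
                                              unsigned long end)
{
    for (size_t i = 0; i < n; i++) {
        const struct extent *es = &tree[i];
        unsigned long es_end = es->start + es->len - 1;

        if (es_end < lblk || es->start > end)
            continue;          /* no overlap with the requested range */
        if (match(es))
            return es;
    }
    return NULL;
}
```

A new search (for example, "written only") then needs just one more small predicate rather than another copy of the traversal.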
  2. 16 Sep, 2018 2 commits
    • ext4, dax: set ext4_dax_aops for dax files · cce6c9f7
      Toshi Kani committed
      Sync syscall to DAX file needs to flush processor cache, but it
      currently does not flush to existing DAX files.  This is because
      'ext4_da_aops' is set to address_space_operations of existing DAX
      files, instead of 'ext4_dax_aops', since S_DAX flag is set after
      ext4_set_aops() in the open path.
      
        New file
        --------
        lookup_open
          ext4_create
            __ext4_new_inode
              ext4_set_inode_flags   // Set S_DAX flag
            ext4_set_aops            // Set aops to ext4_dax_aops
      
        Existing file
        -------------
        lookup_open
          ext4_lookup
            ext4_iget
              ext4_set_aops          // Set aops to ext4_da_aops
              ext4_set_inode_flags   // Set S_DAX flag
      
      Change ext4_iget() to initialize i_flags before ext4_set_aops().
      
      Fixes: 5f0663bb ("ext4, dax: introduce ext4_dax_aops")
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Suggested-by: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
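The ordering problem in cce6c9f7 can be reproduced with a toy model: the function that picks the address-space operations reads a flag that has not been set yet. The sketch below uses hypothetical stand-ins (`toy_inode`, `S_DAX_FLAG`, `set_aops`, `set_inode_flags`) and only mirrors the two call orders shown in the commit message; it is not kernel code.

```c
#include <stdbool.h>

#define S_DAX_FLAG 0x1u   /* hypothetical stand-in for the kernel's S_DAX */

enum aops_kind { AOPS_NONE, AOPS_DA, AOPS_DAX };

struct toy_inode {
    bool wants_dax;          /* whether this file should use DAX */
    unsigned int i_flags;
    enum aops_kind aops;
};

static void set_inode_flags(struct toy_inode *inode)
{
    if (inode->wants_dax)
        inode->i_flags |= S_DAX_FLAG;
}

/* Chooses operations from i_flags, so it must run after set_inode_flags(). */
static void set_aops(struct toy_inode *inode)
{
    inode->aops = (inode->i_flags & S_DAX_FLAG) ? AOPS_DAX : AOPS_DA;
}

/* Order used for existing files before the fix: aops first, flags second. */
static void iget_before_fix(struct toy_inode *inode)
{
    set_aops(inode);
    set_inode_flags(inode);
}

/* Order after the fix: flags are initialized before aops are chosen. */
static void iget_after_fix(struct toy_inode *inode)
{
    set_inode_flags(inode);
    set_aops(inode);
}
```

In the first order a DAX file ends up with the delayed-allocation operations even though its flag is eventually set; in the second it gets the DAX operations.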
    • ext4, dax: add ext4_bmap to ext4_dax_aops · 94dbb631
      Toshi Kani committed
      Ext4 mount path calls .bmap to the journal inode. This currently
      works for the DAX mount case because ext4_iget() always set
      'ext4_da_aops' to any regular files.
      
      In preparation to fix ext4_iget() to set 'ext4_dax_aops' for ext4
      DAX files, add ext4_bmap() to 'ext4_dax_aops', since bmap works for
      DAX inodes.
      
      Fixes: 5f0663bb ("ext4, dax: introduce ext4_dax_aops")
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Suggested-by: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
  3. 12 Sep, 2018 1 commit
  4. 02 Sep, 2018 1 commit
  5. 18 Aug, 2018 1 commit
  6. 02 Aug, 2018 1 commit
  7. 30 Jul, 2018 2 commits
  8. 10 Jul, 2018 1 commit
  9. 17 Jun, 2018 1 commit
  10. 16 Jun, 2018 1 commit
  11. 23 May, 2018 1 commit
  12. 14 May, 2018 2 commits
  13. 10 May, 2018 1 commit
    • ext4: use raw i_version value for ea_inode · e254d1af
      Eryu Guan committed
      Currently, creating large xattr (e.g. 2k) in ea_inode would cause
      ea_inode refcount corruption, e.g.
      
        Pass 4: Checking reference counts
        Extended attribute inode 13 ref count is 0, should be 1. Fix? no
      
      This is because we save the lower 32 bits of the refcount in
      inode->i_version and store it in raw_inode->i_disk_version on disk.
      But since commit ee73f9a5 ("ext4: convert to new i_version
      API"), we load/store modified i_disk_version from/to disk instead of
      raw value, which causes on-disk ea_inode refcount corruption.
      
      Fix it by loading/storing raw i_version/i_disk_version, because it's
      a self-managed value in this case.
      
      Fixes: ee73f9a5 ("ext4: convert to new i_version API")
      Cc: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  14. 31 Mar, 2018 1 commit
    • ext4, dax: introduce ext4_dax_aops · 5f0663bb
      Dan Williams committed
      In preparation for the dax implementation to start associating dax pages
      to inodes via page->mapping, we need to provide a 'struct
      address_space_operations' instance for dax. Otherwise, direct-I/O
      triggers incorrect page cache assumptions and warnings.
      
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: linux-ext4@vger.kernel.org
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  15. 30 Mar, 2018 1 commit
  16. 28 Mar, 2018 1 commit
  17. 26 Mar, 2018 1 commit
  18. 22 Mar, 2018 4 commits
    • ext4: remove EXT4_STATE_DIOREAD_LOCK flag · 1d39834f
      Nikolay Borisov committed
      Commit 16c54688 ("ext4: Allow parallel DIO reads") reworked the way
      locking happens around parallel dio reads. This resulted in obviating
      the need for EXT4_STATE_DIOREAD_LOCK flag and accompanying logic.
      Currently this amounts to dead code so let's remove it. No functional
      changes.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
    • ext4: fix offset overflow on 32-bit archs in ext4_iomap_begin() · fe23cb65
      Jiri Slaby committed
      ext4_iomap_begin() has a bug where offset returned in the iomap
      structure will be truncated to unsigned long size. On 64-bit
      architectures this is fine but on 32-bit architectures obviously not.
      Not many places actually use the offset stored in the iomap structure
      but one of visible failures is in SEEK_HOLE / SEEK_DATA implementation.
      If we create a file like:
      
      dd if=/dev/urandom of=file bs=1k seek=8m count=1
      
      then
      
      lseek64("file", 0x100000000ULL, SEEK_DATA)
      
      wrongly returns 0x100000000 on unfixed kernel while it should return
      0x200000000. Avoid the overflow by proper type cast.
      
      Fixes: 545052e9 ("ext4: Switch to iomap for SEEK_HOLE / SEEK_DATA")
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org # v4.15
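The truncation described in fe23cb65 is easy to reproduce outside the kernel. In the sketch below, `uint32_t` stands in for `unsigned long` on a 32-bit architecture, and a 4 KiB block size (`blkbits = 12`) is assumed, so the 8 GiB offset created by the `dd` command corresponds to block `0x200000`. The function names are hypothetical.

```c
#include <stdint.h>

/* uint32_t stands in for `unsigned long` on a 32-bit architecture. */
static uint64_t offset_truncated(uint32_t first_block, unsigned int blkbits)
{
    uint32_t off = first_block << blkbits;   /* high bits are lost here */
    return off;
}

/* Widening to 64 bits before the shift keeps the full byte offset. */
static uint64_t offset_widened(uint32_t first_block, unsigned int blkbits)
{
    return (uint64_t)first_block << blkbits;
}
```

For block `0x200000`, the first function yields 0 because the shift happens in 32-bit arithmetic, while the second yields `0x200000000`, the offset the commit message says `SEEK_DATA` should report.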
    • ext4: update i_disksize if direct write past ondisk size · 45d8ec4d
      Eryu Guan committed
      Currently in ext4 direct write path, we update i_disksize only when
      new eof is greater than i_size, and don't update it even when new
      eof is greater than i_disksize but less than i_size. This doesn't
      work well with delalloc buffer write, which updates i_size and
      i_disksize only when delalloc blocks are resolved (at writeback
      time), the i_disksize from direct write can be lost if a previous
      buffer write succeeded at write time but failed at writeback time,
      then results in corrupted ondisk inode size.
      
      Consider this case, first buffer write 4k data to a new file at
      offset 16k with delayed allocation, then direct write 4k data to the
      same file at offset 4k before delalloc blocks are resolved, which
      doesn't update i_disksize because it writes within i_size(20k), but
      the extent tree metadata has been committed in journal. Then
      writeback of the delalloc blocks fails (due to device error etc.),
      and i_size/i_disksize from buffer write can't be written to disk
      (still zero). A subsequent umount/mount cycle recovers journal and
      writes extent tree metadata from direct write to disk, but with
      i_disksize being zero.
      
      Fix it by updating i_disksize too in direct write path when new eof
      is greater than i_disksize but less than i_size, so i_disksize is
      always consistent with direct write.
      
      This fixes occasional i_size corruption in fstests generic/475.
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
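The scenario in 45d8ec4d can be traced with two integers. The sketch below models only the size bookkeeping at the end of a direct write; `toy_sizes`, `dio_end_old`, and `dio_end_new` are hypothetical names, and locking, journaling, and error paths are deliberately left out.

```c
#include <stdint.h>

struct toy_sizes {
    int64_t i_size;       /* in-memory file size */
    int64_t i_disksize;   /* size recorded for the on-disk inode */
};

/* Behavior described as buggy: only writes extending i_size update i_disksize. */
static void dio_end_old(struct toy_sizes *s, int64_t new_eof)
{
    if (new_eof > s->i_size) {
        s->i_size = new_eof;
        s->i_disksize = new_eof;
    }
}

/* Behavior described as the fix: i_disksize is also raised on its own. */
static void dio_end_new(struct toy_sizes *s, int64_t new_eof)
{
    if (new_eof > s->i_size)
        s->i_size = new_eof;
    if (new_eof > s->i_disksize)
        s->i_disksize = new_eof;
}
```

Starting from `i_size = 20k` and `i_disksize = 0` (the buffered write at offset 16k has not been written back), a direct write ending at 8k leaves `i_disksize` at 0 under the old rule and raises it to 8k under the new one.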
    • ext4: protect i_disksize update by i_data_sem in direct write path · 73fdad00
      Eryu Guan committed
      i_disksize update should be protected by i_data_sem, by either taking
      the lock explicitly or by using ext4_update_i_disksize() helper. But the
      i_disksize updates in ext4_direct_IO_write() are not protected at all,
      which may be racing with i_disksize updates in writeback path in
      delalloc buffer write path.
      
      This is found by code inspection, and I didn't hit any i_disksize
      corruption due to this bug. Thanks to Jan Kara for catching this bug and
      suggesting the fix!
      Reported-by: Jan Kara <jack@suse.cz>
      Suggested-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
  19. 29 Jan, 2018 2 commits
  20. 10 Jan, 2018 1 commit
    • ext4: fix a race in the ext4 shutdown path · abbc3f93
      Harshad Shirwadkar committed
      This patch fixes a race between the shutdown path and bio completion
      handling. In the ext4 direct io path with async io, after submitting a
      bio to the block layer, if journal starting fails,
      ext4_direct_IO_write() would bail out pretending that the IO
      failed. The caller would have had no way of knowing whether or not the
      IO was successfully submitted. So instead, we return -EIOCBQUEUED in
      this case. Now, the caller knows that the IO was submitted.  The bio
      completion handler takes care of the error.
      
      Tested: Ran the shutdown xfstest test 461 in loop for over 2 hours across
      4 machines resulting in over 400 runs. Verified that the race didn't
      occur. Usually the race was seen in about 20-30 iterations.
      Signed-off-by: Harshad Shirwadkar <harshads@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
  21. 04 Dec, 2017 1 commit
    • ext4: support fast symlinks from ext3 file systems · fc82228a
      Andi Kleen committed
      407cd7fb (ext4: change fast symlink test to not rely on i_blocks)
      broke ~10 years old ext3 file systems created by 2.6.17. Any ELF
      executable fails because the /lib/ld-linux.so.2 fast symlink
      cannot be read anymore.
      
      The patch assumed fast symlinks were created in a specific way,
      but that's not true on these really old file systems.
      
      The new behavior is apparently needed only with the large EA inode
      feature.
      
      Revert to the old behavior if the large EA inode feature is not set.
      
      This makes my old VM boot again.
      
      Fixes: 407cd7fb (ext4: change fast symlink test to not rely on i_blocks)
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Cc: stable@vger.kernel.org
  22. 28 Nov, 2017 1 commit
    • Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Linus Torvalds committed
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      
      The script to do this was:
      
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
          SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
                DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
                POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
                I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
                ACTIVE NOUSER"
      
          SED_PROG=
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
      
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
      
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 16 Nov, 2017 3 commits
  24. 14 Nov, 2017 1 commit
  25. 03 Nov, 2017 1 commit
    • ext4: Support for synchronous DAX faults · b8a6176c
      Jan Kara committed
      We return IOMAP_F_DIRTY flag from ext4_iomap_begin() when asked to
      prepare blocks for writing and the inode has some uncommitted metadata
      changes. In the fault handler ext4_dax_fault() we then detect this case
      (through VM_FAULT_NEEDDSYNC return value) and call helper
      dax_finish_sync_fault() to flush metadata changes and insert page table
      entry. Note that this will also dirty corresponding radix tree entry
      which is what we want - fsync(2) will still provide data integrity
      guarantees for applications not using userspace flushing. And
      applications using userspace flushing can avoid calling fsync(2) and
      thus avoid the performance overhead.
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  26. 02 Nov, 2017 1 commit
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman committed
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  27. 19 Oct, 2017 2 commits
    • ext4: switch to fscrypt_prepare_setattr() · 3ce2b8dd
      Eric Biggers committed
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • fs, fscrypt: add an S_ENCRYPTED inode flag · 2ee6a576
      Eric Biggers committed
      Introduce a flag S_ENCRYPTED which can be set in ->i_flags to indicate
      that the inode is encrypted using the fscrypt (fs/crypto/) mechanism.
      
      Checking this flag will give the same information that
      inode->i_sb->s_cop->is_encrypted(inode) currently does, but will be more
      efficient.  This will be useful for adding higher-level helper functions
      for filesystems to use.  For example we'll be able to replace this:
      
      	if (ext4_encrypted_inode(inode)) {
      		ret = fscrypt_get_encryption_info(inode);
      		if (ret)
      			return ret;
      		if (!fscrypt_has_encryption_key(inode))
      			return -ENOKEY;
      	}
      
      with this:
      
      	ret = fscrypt_require_key(inode);
      	if (ret)
      		return ret;
      
      ... since we'll be able to retain the fast path for unencrypted files as
      a single flag check, using an inline function.  This wasn't possible
      before because we'd have had to frequently call through the
      ->i_sb->s_cop->is_encrypted function pointer, even when the encryption
      support was disabled or not being used.
      
      Note: we don't define S_ENCRYPTED to 0 if CONFIG_FS_ENCRYPTION is
      disabled because we want to continue to return an error if an encrypted
      file is accessed without encryption support, rather than pretending that
      it is unencrypted.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
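The efficiency argument in 2ee6a576 is that an inline helper can test one bit in `i_flags` and return early for unencrypted files, without an indirect call. A rough userspace sketch of that shape follows. The names `enc_inode`, `TOY_S_ENCRYPTED`, `TOY_ENOKEY`, `load_key`, and `toy_require_key` are all hypothetical, and this is not the fscrypt implementation.

```c
#define TOY_S_ENCRYPTED 0x1u   /* hypothetical flag value */
#define TOY_ENOKEY      126    /* hypothetical stand-in for ENOKEY */

struct enc_inode {
    unsigned int i_flags;
    int has_key;            /* nonzero once the encryption key is set up */
};

/* Slow path, only reached for encrypted inodes. */
static int load_key(const struct enc_inode *inode)
{
    return inode->has_key ? 0 : -TOY_ENOKEY;
}

/* Fast path: unencrypted files cost a single flag test, no indirect call. */
static inline int toy_require_key(const struct enc_inode *inode)
{
    if (!(inode->i_flags & TOY_S_ENCRYPTED))
        return 0;
    return load_key(inode);
}
```

An unencrypted inode returns 0 immediately, while an encrypted inode without a key returns the negative error, matching the behavior the commit message wants to keep when encryption support is unavailable.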
  28. 13 Oct, 2017 1 commit