1. 20 11月, 2020 1 次提交
    • T
      ext4: drop fast_commit from /proc/mounts · 704c2317
      Theodore Ts'o 提交于
      The options in /proc/mounts must be valid mount options --- and
      fast_commit is not a mount option.  Otherwise, command sequences like
      this will fail:
      
          # mount /dev/vdc /vdc
          # mkdir -p /vdc/phoronix_test_suite /pts
          # mount --bind /vdc/phoronix_test_suite /pts
          # mount -o remount,nodioread_nolock /pts
          mount: /pts: mount point not mounted or bad option.
      
      And in the system logs, you'll find:
      
          EXT4-fs (vdc): Unrecognized mount option "fast_commit" or missing value
      
      Fixes: 995a3ed6 ("ext4: add fast_commit feature and handling for extended mount options")
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      704c2317
  2. 12 11月, 2020 1 次提交
  3. 07 11月, 2020 7 次提交
  4. 29 10月, 2020 2 次提交
  5. 22 10月, 2020 5 次提交
  6. 18 10月, 2020 13 次提交
  7. 22 9月, 2020 2 次提交
    • E
      fscrypt: make fscrypt_set_test_dummy_encryption() take a 'const char *' · c8c868ab
      Eric Biggers 提交于
      fscrypt_set_test_dummy_encryption() requires that the optional argument
      to the test_dummy_encryption mount option be specified as a substring_t.
      That doesn't work well with filesystems that use the new mount API,
      since the new way of parsing mount options doesn't use substring_t.
      
      Make it take the argument as a 'const char *' instead.
      
      Instead of moving the match_strdup() into the callers in ext4 and f2fs,
      make them just use arg->from directly.  Since the pattern is
      "test_dummy_encryption=%s", the argument will be null-terminated.
      Acked-by: NJeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-14-ebiggers@kernel.orgSigned-off-by: NEric Biggers <ebiggers@google.com>
      c8c868ab
    • E
      fscrypt: handle test_dummy_encryption in more logical way · ac4acb1f
      Eric Biggers 提交于
      The behavior of the test_dummy_encryption mount option is that when a
      new file (or directory or symlink) is created in an unencrypted
      directory, it's automatically encrypted using a dummy encryption policy.
      That's it; in particular, the encryption (or lack thereof) of existing
      files (or directories or symlinks) doesn't change.
      
      Unfortunately the implementation of test_dummy_encryption is a bit weird
      and confusing.  When test_dummy_encryption is enabled and a file is
      being created in an unencrypted directory, we set up an encryption key
      (->i_crypt_info) for the directory.  This isn't actually used to do any
      encryption, however, since the directory is still unencrypted!  Instead,
      ->i_crypt_info is only used for inheriting the encryption policy.
      
      One consequence of this is that the filesystem ends up providing a
      "dummy context" (policy + nonce) instead of a "dummy policy".  In
      commit ed318a6c ("fscrypt: support test_dummy_encryption=v2"), I
      mistakenly thought this was required.  However, actually the nonce only
      ends up being used to derive a key that is never used.
      
      Another consequence of this implementation is that it allows for
      'inode->i_crypt_info != NULL && !IS_ENCRYPTED(inode)', which is an edge
      case that can be forgotten about.  For example, currently
      FS_IOC_GET_ENCRYPTION_POLICY on an unencrypted directory may return the
      dummy encryption policy when the filesystem is mounted with
      test_dummy_encryption.  That seems like the wrong thing to do, since
      again, the directory itself is not actually encrypted.
      
      Therefore, switch to a more logical and maintainable implementation
      where the dummy encryption policy inheritance is done without setting up
      keys for unencrypted directories.  This involves:
      
      - Adding a function fscrypt_policy_to_inherit() which returns the
        encryption policy to inherit from a directory.  This can be a real
        policy, a dummy policy, or no policy.
      
      - Replacing struct fscrypt_dummy_context, ->get_dummy_context(), etc.
        with struct fscrypt_dummy_policy, ->get_dummy_policy(), etc.
      
      - Making fscrypt_fname_encrypted_size() take an fscrypt_policy instead
        of an inode.
      Acked-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Acked-by: NJeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-13-ebiggers@kernel.orgSigned-off-by: NEric Biggers <ebiggers@google.com>
      ac4acb1f
  8. 19 9月, 2020 1 次提交
  9. 20 8月, 2020 1 次提交
    • B
      ext4: limit the length of per-inode prealloc list · 27bc446e
      brookxu 提交于
      In the scenario of writing sparse files, the per-inode prealloc list may
      be very long, resulting in high overhead for ext4_mb_use_preallocated().
      To circumvent this problem, we limit the maximum length of per-inode
      prealloc list to 512 and allow users to modify it.
      
      After patching, we observed that the sys ratio of cpu has dropped, and
      the system throughput has increased significantly. We created a process
      to write the sparse file, and the running time of the process on the
      fixed kernel was significantly reduced, as follows:
      
      Running time on unfixed kernel:
      [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
      real    0m2.051s
      user    0m0.008s
      sys     0m2.026s
      
      Running time on fixed kernel:
      [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
      real    0m0.471s
      user    0m0.004s
      sys     0m0.395s
      Signed-off-by: NChunguang Xu <brookxu@tencent.com>
      Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      27bc446e
  10. 08 8月, 2020 7 次提交
    • X
      fs: prevent BUG_ON in submit_bh_wbc() · 377254b2
      Xianting Tian 提交于
      If a device is hot-removed --- for example, when a physical device is
      unplugged from pcie slot or a nbd device's network is shutdown ---
      this can result in a BUG_ON() crash in submit_bh_wbc().  This is
      because the when the block device dies, the buffer heads will have
      their Buffer_Mapped flag get cleared, leading to the crash in
      submit_bh_wbc.
      
      We had attempted to work around this problem in commit a17712c8
      ("ext4: check superblock mapped prior to committing").  Unfortunately,
      it's still possible to hit the BUG_ON(!buffer_mapped(bh)) if the
      device dies between when the work-around check in ext4_commit_super()
      and when submit_bh_wbh() is finally called:
      
      Code path:
      ext4_commit_super
          judge if 'buffer_mapped(sbh)' is false, return <== commit a17712c8
                lock_buffer(sbh)
                ...
                unlock_buffer(sbh)
                     __sync_dirty_buffer(sbh,...
                          lock_buffer(sbh)
                              judge if 'buffer_mapped(sbh))' is false, return <== added by this patch
                                  submit_bh(...,sbh)
                                      submit_bh_wbc(...,sbh,...)
      
      [100722.966497] kernel BUG at fs/buffer.c:3095! <== BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc()
      [100722.966503] invalid opcode: 0000 [#1] SMP
      [100722.966566] task: ffff8817e15a9e40 task.stack: ffffc90024744000
      [100722.966574] RIP: 0010:submit_bh_wbc+0x180/0x190
      [100722.966575] RSP: 0018:ffffc90024747a90 EFLAGS: 00010246
      [100722.966576] RAX: 0000000000620005 RBX: ffff8818a80603a8 RCX: 0000000000000000
      [100722.966576] RDX: ffff8818a80603a8 RSI: 0000000000020800 RDI: 0000000000000001
      [100722.966577] RBP: ffffc90024747ac0 R08: 0000000000000000 R09: ffff88207f94170d
      [100722.966578] R10: 00000000000437c8 R11: 0000000000000001 R12: 0000000000020800
      [100722.966578] R13: 0000000000000001 R14: 000000000bf9a438 R15: ffff88195f333000
      [100722.966580] FS:  00007fa2eee27700(0000) GS:ffff88203d840000(0000) knlGS:0000000000000000
      [100722.966580] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [100722.966581] CR2: 0000000000f0b008 CR3: 000000201a622003 CR4: 00000000007606e0
      [100722.966582] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [100722.966583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [100722.966583] PKRU: 55555554
      [100722.966583] Call Trace:
      [100722.966588]  __sync_dirty_buffer+0x6e/0xd0
      [100722.966614]  ext4_commit_super+0x1d8/0x290 [ext4]
      [100722.966626]  __ext4_std_error+0x78/0x100 [ext4]
      [100722.966635]  ? __ext4_journal_get_write_access+0xca/0x120 [ext4]
      [100722.966646]  ext4_reserve_inode_write+0x58/0xb0 [ext4]
      [100722.966655]  ? ext4_dirty_inode+0x48/0x70 [ext4]
      [100722.966663]  ext4_mark_inode_dirty+0x53/0x1e0 [ext4]
      [100722.966671]  ? __ext4_journal_start_sb+0x6d/0xf0 [ext4]
      [100722.966679]  ext4_dirty_inode+0x48/0x70 [ext4]
      [100722.966682]  __mark_inode_dirty+0x17f/0x350
      [100722.966686]  generic_update_time+0x87/0xd0
      [100722.966687]  touch_atime+0xa9/0xd0
      [100722.966690]  generic_file_read_iter+0xa09/0xcd0
      [100722.966694]  ? page_cache_tree_insert+0xb0/0xb0
      [100722.966704]  ext4_file_read_iter+0x4a/0x100 [ext4]
      [100722.966707]  ? __inode_security_revalidate+0x4f/0x60
      [100722.966709]  __vfs_read+0xec/0x160
      [100722.966711]  vfs_read+0x8c/0x130
      [100722.966712]  SyS_pread64+0x87/0xb0
      [100722.966716]  do_syscall_64+0x67/0x1b0
      [100722.966719]  entry_SYSCALL64_slow_path+0x25/0x25
      
      To address this, add the check of 'buffer_mapped(bh)' to
      __sync_dirty_buffer().  This also has the benefit of fixing this for
      other file systems.
      
      With this addition, we can drop the workaround in ext4_commit_supper().
      
      [ Commit description rewritten by tytso. ]
      Signed-off-by: NXianting Tian <xianting_tian@126.com>
      Link: https://lore.kernel.org/r/1596211825-8750-1-git-send-email-xianting_tian@126.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      377254b2
    • J
      ext4: correctly restore system zone info when remount fails · 0f5bde1d
      Jan Kara 提交于
      When remounting filesystem fails late during remount handling and
      block_validity mount option is also changed during the remount, we fail
      to restore system zone information to a state matching the mount option.
      This is mostly harmless, just the block validity checking will not match
      the situation described by the mount option. Make sure these two are always
      consistent.
      Reported-by: NLukas Czerner <lczerner@redhat.com>
      Reviewed-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20200728130437.7804-7-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      0f5bde1d
    • J
      ext4: handle error of ext4_setup_system_zone() on remount · d176b1f6
      Jan Kara 提交于
      ext4_setup_system_zone() can fail. Handle the failure in ext4_remount().
      Reviewed-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20200728130437.7804-2-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      d176b1f6
    • D
      ext4: export msg_count and warning_count via sysfs · 1cf006ed
      Dmitry Monakhov 提交于
      This numbers can be analized by system automation similar to errors_count.
      In ideal world it would be nice to have separate counters for different
      log-levels, but this makes this patch too intrusive.
      Signed-off-by: NDmitry Monakhov <dmtrmonakhov@yandex-team.ru>
      Link: https://lore.kernel.org/r/20200725123313.4467-1-dmtrmonakhov@yandex-team.ruSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      1cf006ed
    • L
      ext4: handle option set by mount flags correctly · f25391eb
      Lukas Czerner 提交于
      Currently there is a problem with mount options that can be both set by
      vfs using mount flags or by a string parsing in ext4.
      
      i_version/iversion options gets lost after remount, for example
      
      $ mount -o i_version /dev/pmem0 /mnt
      $ grep pmem0 /proc/self/mountinfo | grep i_version
      310 95 259:0 / /mnt rw,relatime shared:163 - ext4 /dev/pmem0 rw,seclabel,i_version
      $ mount -o remount,ro /mnt
      $ grep pmem0 /proc/self/mountinfo | grep i_version
      
      nolazytime gets ignored by ext4 on remount, for example
      
      $ mount -o lazytime /dev/pmem0 /mnt
      $ grep pmem0 /proc/self/mountinfo | grep lazytime
      310 95 259:0 / /mnt rw,relatime shared:163 - ext4 /dev/pmem0 rw,lazytime,seclabel
      $ mount -o remount,nolazytime /mnt
      $ grep pmem0 /proc/self/mountinfo | grep lazytime
      310 95 259:0 / /mnt rw,relatime shared:163 - ext4 /dev/pmem0 rw,lazytime,seclabel
      
      Fix it by applying the SB_LAZYTIME and SB_I_VERSION flags from *flags to
      s_flags before we parse the option and use the resulting state of the
      same flags in *flags at the end of successful remount.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com>
      Link: https://lore.kernel.org/r/20200723150526.19931-1-lczerner@redhat.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      f25391eb
    • T
      ext4: add prefetch_block_bitmaps mount option · 3d392b26
      Theodore Ts'o 提交于
      For file systems where we can afford to keep the buddy bitmaps cached,
      we can speed up initial writes to large file systems by starting to
      load the block allocation bitmaps as soon as the file system is
      mounted.  This won't work well for _super_ large file systems, or
      memory constrained systems, so we only enable this when it is
      requested via a mount option.
      
      Addresses-Google-Bug: 159488342
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
      3d392b26
    • Z
      jbd2: remove unused parameter in jbd2_journal_try_to_free_buffers() · 529a781e
      zhangyi (F) 提交于
      Parameter gfp_mask in jbd2_journal_try_to_free_buffers() is no longer
      used after commit <536fc240> ("jbd2: clean up
      jbd2_journal_try_to_free_buffers()"), so just remove it.
      Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
      Link: https://lore.kernel.org/r/20200620025427.1756360-6-yi.zhang@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      529a781e