1. 08 Jun, 2016 1 commit
  2. 17 May, 2016 1 commit
  3. 26 Apr, 2016 1 commit
  4. 05 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov committed
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
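
To see why the substitutions listed above are behaviour-preserving, here is a minimal userspace sketch. It assumes a 4 KiB page (`PAGE_SHIFT` of 12 is an illustrative value, not something the patch fixes) and defines the old aliases the way the message describes them, as equal to their `PAGE_*` counterparts. `old_index()` and `new_index()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>

/* Assumed value for illustration; the kernel takes this from the architecture. */
#define PAGE_SHIFT        12
#define PAGE_SIZE         (1UL << PAGE_SHIFT)

/* The old aliases: always equal to the PAGE_* constants. */
#define PAGE_CACHE_SHIFT  PAGE_SHIFT
#define PAGE_CACHE_SIZE   PAGE_SIZE

/* Compile-time check that the alias and the real constant agree. */
static_assert(PAGE_CACHE_SIZE == PAGE_SIZE, "page cache pages are ordinary pages");

/* Old style (hypothetical helper): shift by the difference of the two shifts. */
static unsigned long old_index(unsigned long pgoff)
{
	return pgoff >> (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}

/* New style: the shift distance is always 0, so the expression is just pgoff. */
static unsigned long new_index(unsigned long pgoff)
{
	return pgoff;
}
```

Because the shift distance is the constant 0, both helpers return their argument unchanged, which is why the first two rewrite rules can simply drop the shift.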
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 04 Apr, 2016 1 commit
    • ext4: ignore quota mount options if the quota feature is enabled · c325a67c
      Theodore Ts'o committed
      Previously, ext4 would fail the mount if the file system had the quota
      feature enabled and quota mount options (used for the older quota
      setups) were present.  This broke xfstests, since xfs silently ignores
      the usrquota and grpquota mount options if they are specified.  This
      commit changes things so that we are consistent with xfs; having the
      mount options specified is harmless, so there is no sense in breaking
      users by forbidding them.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  6. 02 Apr, 2016 1 commit
  7. 01 Apr, 2016 1 commit
    • ext4: add lockdep annotations for i_data_sem · daf647d2
      Theodore Ts'o committed
      With the internal Quota feature, mke2fs creates empty quota inodes and
      quota usage tracking is enabled as soon as the file system is mounted.
      Since quotacheck is no longer preallocating all of the blocks in the
      quota inode that are likely needed to be written to, we are now seeing
      a lockdep false positive caused by needing to allocate a quota block
      from inside ext4_map_blocks(), while holding i_data_sem for a data
      inode.  This results in this complaint:
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&ei->i_data_sem);
                                      lock(&s->s_dquot.dqio_mutex);
                                      lock(&ei->i_data_sem);
         lock(&s->s_dquot.dqio_mutex);
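
The report above is an order inversion between two lock classes. As a rough userspace sketch of the idea (not how lockdep is implemented, and with hypothetical class names), the checker below records which class was acquired while another was held and flags the reverse order. It only detects direct two-lock inversions. Treating the quota inode's semaphore as its own class, which is an assumption about what the annotations achieve, makes the second ordering distinct and the report disappears.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for illustration only. */
enum lock_class {
	CLASS_DATA_SEM,        /* i_data_sem of an ordinary data inode */
	CLASS_DQIO_MUTEX,      /* quota I/O mutex */
	CLASS_DATA_SEM_QUOTA,  /* i_data_sem of a quota inode, as a separate class */
	NR_LOCK_CLASSES
};

/* seen[a][b] is true once class b was acquired while class a was held. */
static bool seen[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/* Forget all recorded orderings. */
static void reset_lock_orders(void)
{
	memset(seen, 0, sizeof(seen));
}

/* Record "held -> next"; return true if the reverse order was seen before. */
static bool acquire_while_holding(enum lock_class held, enum lock_class next)
{
	assert(held < NR_LOCK_CLASSES && next < NR_LOCK_CLASSES);
	seen[held][next] = true;
	return seen[next][held];
}
```

With a single class for every `i_data_sem`, recording data-sem then dqio-mutex followed by dqio-mutex then data-sem reports an inversion; with the quota inode in its own class, the second pair no longer closes the cycle.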
      
      Google-Bug-Id: 27907753
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
  8. 13 Mar, 2016 1 commit
  9. 09 Mar, 2016 2 commits
    • ext4: remove i_ioend_count · 600be30a
      Jan Kara committed
      Remove counter of pending io ends as it is unused.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: use i_mutex to serialize unaligned AIO DIO · e142d052
      Jan Kara committed
      Currently we've used hashed aio_mutex to serialize unaligned AIO DIO.
      However the code cleanups that happened after 2011 when the lock was
      introduced made aio_mutex acquired at almost the same places where we
      already have exclusion using i_mutex. So just use i_mutex for the
      exclusion of unaligned AIO DIO.
      
      The change moves waiting for pending unwritten extent conversion under
      i_mutex. That makes special handling of O_APPEND writes unnecessary and
      also avoids possible livelocking of unaligned AIO DIO with aligned one
      (nothing was preventing contiguous stream of aligned AIO DIOs to let
      unaligned AIO DIO wait forever).
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  10. 23 Feb, 2016 2 commits
  11. 20 Feb, 2016 1 commit
  12. 09 Feb, 2016 1 commit
  13. 23 Jan, 2016 1 commit
    • wrappers for ->i_mutex access · 5955102c
      Al Viro committed
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
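
The naming rule above, inode_foo(inode) standing for mutex_foo(&inode->i_mutex), can be sketched in userspace as follows. This is a mock, not the kernel code: `struct inode` here holds only a pthread mutex, and the pthread calls stand in for the kernel's mutex API. The point of the wrappers is that callers stop naming the lock type, so it can later change without touching them.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the kernel's struct inode; only the lock is modelled. */
struct inode {
	pthread_mutex_t i_mutex;
};

/* inode_lock(inode) plays the role of mutex_lock(&inode->i_mutex). */
static void inode_lock(struct inode *inode)
{
	pthread_mutex_lock(&inode->i_mutex);
}

/* inode_unlock(inode) plays the role of mutex_unlock(&inode->i_mutex). */
static void inode_unlock(struct inode *inode)
{
	pthread_mutex_unlock(&inode->i_mutex);
}

/* inode_trylock(inode) returns true only if the lock was taken. */
static bool inode_trylock(struct inode *inode)
{
	return pthread_mutex_trylock(&inode->i_mutex) == 0;
}
```

A caller written against `inode_lock()`/`inode_unlock()` would keep compiling unchanged if the field behind them became a different lock type.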
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  14. 15 Jan, 2016 1 commit
    • kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov committed
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 09 Jan, 2016 2 commits
  16. 08 Dec, 2015 2 commits
    • ext4: document lock ordering · e74031fd
      Jan Kara committed
      We have enough locks that it's probably worth documenting the lock
      ordering rules we have in ext4.
      Signed-off-by: Jan Kara <jack@suse.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix races between page faults and hole punching · ea3d7209
      Jan Kara committed
      Currently, page faults and hole punching are completely unsynchronized.
      This can result in page fault faulting in a page into a range that we
      are punching after truncate_pagecache_range() has been called and thus
      we can end up with a page mapped to disk blocks that will be shortly
      freed. Filesystem corruption will shortly follow. Note that the same
      race is avoided for truncate by checking page fault offset against
      i_size but there isn't similar mechanism available for punching holes.
      
      Fix the problem by creating new rw semaphore i_mmap_sem in inode and
      grab it for writing over truncate, hole punching, and other functions
      removing blocks from extent tree and for read over page faults. We
      cannot easily use i_data_sem for this since that ranks below transaction
      start and we need something ranking above it so that it can be held over
      the whole truncate / hole punching operation. Also remove various
      workarounds we had in the code to reduce race window when page fault
      could have created pages with stale mapping information.
      Signed-off-by: Jan Kara <jack@suse.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  17. 17 Nov, 2015 1 commit
    • ext2, ext4: warn when mounting with dax enabled · ef83b6e8
      Dan Williams committed
      Similar to XFS warn when mounting DAX while it is still considered under
      development.  Also, aspects of the DAX implementation, for example
      synchronization against multiple faults and faults causing block
      allocation, depend on the correct implementation in the filesystem.  The
      maturity of a given DAX implementation is filesystem specific.
      
      Cc: <stable@vger.kernel.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: linux-ext4@vger.kernel.org
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: Dave Chinner <david@fromorbit.com>
      Acked-by: Jan Kara <jack@suse.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  18. 07 Nov, 2015 1 commit
    • mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman committed
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
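
The flag split described above can be sketched in userspace as below. The bit values are illustrative assumptions, not the kernel's encoding, and `GFP_KERNEL` is simplified to just the reclaim bits (the real definition carries further flags). The double-underscore names mirror the message and are only suitable for a demo, since such identifiers are reserved in ordinary userspace C. `gfpflags_allow_blocking()` is the helper named in the message; the body shown here is the obvious reading of it, a test of the direct-reclaim bit.

```c
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Illustrative bit values only. */
#define __GFP_HIGH            0x01u  /* high priority reserve */
#define __GFP_ATOMIC          0x02u  /* truly atomic, cannot sleep */
#define __GFP_DIRECT_RECLAIM  0x04u  /* caller may sleep in direct reclaim */
#define __GFP_KSWAPD_RECLAIM  0x08u  /* caller wants kswapd woken */

/* The old __GFP_WAIT meaning: willing to do both kinds of reclaim. */
#define __GFP_RECLAIM  (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)

/* Simplified composites for the sketch. */
#define GFP_ATOMIC  (__GFP_HIGH | __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL  (__GFP_RECLAIM)

/* A caller may block only if it is willing to enter direct reclaim. */
static bool gfpflags_allow_blocking(gfp_t gfp_flags)
{
	return (gfp_flags & __GFP_DIRECT_RECLAIM) != 0;
}
```

A mempool-backed caller of the kind described in the second bullet would pass `GFP_KERNEL & ~__GFP_DIRECT_RECLAIM`: kswapd is still woken, but the helper reports that the caller must not block.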
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 19 Oct, 2015 3 commits
    • ext4: do not allow journal_opts for fs w/o journal · 1e381f60
      Dmitry Monakhov committed
      It appears that we can pass journal-related mount options to a file system
      without a journal, and such options are then shown in /proc/mounts
      
      Example:
      #mkfs.ext4 -F /dev/vdb
      #tune2fs -O ^has_journal /dev/vdb
      #mount /dev/vdb /mnt/  -ocommit=20,journal_async_commit
      #cat /proc/mounts  | grep /mnt
       /dev/vdb /mnt ext4 rw,relatime,journal_checksum,journal_async_commit,commit=20,data=ordered 0 0
      
      But the options "journal_checksum,journal_async_commit,commit=20,data=ordered"
      have nothing to do with reality because there is no journal at all.
      
      This patch disallows the following options for journalless configurations:
       - journal_checksum
       - journal_async_commit
       - commit=%ld
       - data={writeback,ordered,journal}
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
    • ext4: explicit mount options parsing cleanup · c93cf2d7
      Dmitry Monakhov committed
      Currently MOPT_EXPLICIT treated as EXPLICIT_DELALLOC which may be changed
      in future. Let's fix it now.
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4, jbd2: ensure entering into panic after recording an error in superblock · 4327ba52
      Daeho Jeong committed
      If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the
      journaling will be aborted first and the error number will be recorded
      into JBD2 superblock and, finally, the system will enter into the
      panic state in "errors=panic" option.  But, in the rare case, this
      sequence is little twisted like the below figure and it will happen
      that the system enters into panic state, which means the system reset
      in mobile environment, before completion of recording an error in the
      journal superblock. In this case, e2fsck cannot recognize that the
      filesystem failure occurred in the previous run and the corruption
      wouldn't be fixed.
      
      Task A                        Task B
      ext4_handle_error()
      -> jbd2_journal_abort()
        -> __journal_abort_soft()
          -> __jbd2_journal_abort_hard()
          | -> journal->j_flags |= JBD2_ABORT;
          |
          |                         __ext4_abort()
          |                         -> jbd2_journal_abort()
          |                         | -> __journal_abort_soft()
          |                         |   -> if (journal->j_flags & JBD2_ABORT)
          |                         |           return;
          |                         -> panic()
          |
          -> jbd2_journal_update_sb_errno()
      Tested-by: Hobin Woo <hobin.woo@samsung.com>
      Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
  20. 18 Oct, 2015 3 commits
  21. 24 Sep, 2015 2 commits
  22. 05 Sep, 2015 1 commit
    • fs: create and use seq_show_option for escaping · a068acf2
      Kees Cook committed
      Many file systems that implement the show_options hook fail to correctly
      escape their output which could lead to unescaped characters (e.g.  new
      lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
      could lead to confusion, spoofed entries (resulting in things like
      systemd issuing false d-bus "mount" notifications), and who knows what
      else.  This looks like it would only be the root user stepping on
      themselves, but it's possible weird things could happen in containers or
      in other situations with delegated mount privileges.
      
      Here's an example using overlay with setuid fusermount trusting the
      contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
      of "sudo" is something more sneaky:
      
        $ BASE="ovl"
        $ MNT="$BASE/mnt"
        $ LOW="$BASE/lower"
        $ UP="$BASE/upper"
        $ WORK="$BASE/work/ 0 0
        none /proc fuse.pwn user_id=1000"
        $ mkdir -p "$LOW" "$UP" "$WORK"
        $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
        $ cat /proc/mounts
        none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
        none /proc fuse.pwn user_id=1000 0 0
        $ fusermount -u /proc
        $ cat /proc/mounts
        cat: /proc/mounts: No such file or directory
      
      This fixes the problem by adding new seq_show_option and
      seq_show_option_n helpers, and updating the vulnerable show_option
      handlers to use them as needed.  Some, like SELinux, need to be open
      coded due to unusual existing escape mechanisms.
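
The kind of escaping involved can be sketched in userspace as below. This is not the kernel helper: `escape_mount_option()` is a hypothetical function, and the set of escaped characters (comma, equals, space, tab, newline, backslash) is an assumption about which bytes are ambiguous in a mounts line. Each such byte is written as a backslash followed by three octal digits, so an embedded newline can no longer start a spoofed entry.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bytes treated as ambiguous in a mounts line (an assumed set). */
static const char ESCAPE_SET[] = ",= \t\n\\";

/*
 * Copy src into dst, writing each byte found in ESCAPE_SET as a backslash
 * followed by three octal digits. Returns the number of bytes written
 * (excluding the terminating NUL), or -1 if dst is too small.
 */
static int escape_mount_option(char *dst, size_t dst_len, const char *src)
{
	size_t out = 0;

	assert(dst != NULL && src != NULL);
	if (dst_len == 0)
		return -1;

	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;

		if (strchr(ESCAPE_SET, c)) {
			/* Need room for 4 output bytes plus the final NUL. */
			if (out + 4 >= dst_len)
				return -1;
			snprintf(dst + out, 5, "\\%03o", (unsigned int)c);
			out += 4;
		} else {
			/* Need room for 1 output byte plus the final NUL. */
			if (out + 1 >= dst_len)
				return -1;
			dst[out++] = (char)c;
		}
	}
	dst[out] = '\0';
	return (int)out;
}
```

Applied to the `workdir` value in the example above, the embedded newline and spaces would come out as octal escapes on a single line instead of producing a second, forged `/proc/mounts` entry.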
      
      [akpm@linux-foundation.org: add lost chunk, per Kees]
      [keescook@chromium.org: seq_show_option should be using const parameters]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Jan Kara <jack@suse.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Cc: J. R. Okajima <hooanon05g@gmail.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 16 Aug, 2015 2 commits
    • Revert "ext4: remove block_device_ejected" · bdfe0cbd
      Theodore Ts'o committed
      This reverts commit 08439fec.
      
      Unfortunately we still need to test for bdi->dev to avoid a crash when a
      USB stick is yanked out while a file system is mounted:
      
         usb 2-2: USB disconnect, device number 2
         Buffer I/O error on dev sdb1, logical block 15237120, lost sync page write
         JBD2: Error -5 detected when updating journal superblock for sdb1-8.
         BUG: unable to handle kernel paging request at 34beb000
         IP: [<c136ce88>] __percpu_counter_add+0x18/0xc0
         *pdpt = 0000000023db9001 *pde = 0000000000000000 
         Oops: 0000 [#1] SMP 
         CPU: 0 PID: 4083 Comm: umount Tainted: G     U     OE   4.1.1-040101-generic #201507011435
         Hardware name: LENOVO 7675CTO/7675CTO, BIOS 7NETC2WW (2.22 ) 03/22/2011
         task: ebf06b50 ti: ebebc000 task.ti: ebebc000
         EIP: 0060:[<c136ce88>] EFLAGS: 00010082 CPU: 0
         EIP is at __percpu_counter_add+0x18/0xc0
         EAX: f21c8e88 EBX: f21c8e88 ECX: 00000000 EDX: 00000001
         ESI: 00000001 EDI: 00000000 EBP: ebebde60 ESP: ebebde40
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
         CR0: 8005003b CR2: 34beb000 CR3: 33354200 CR4: 000007f0
         Stack:
          c1abe100 edcb0098 edcb00ec ffffffff f21c8e68 ffffffff f21c8e68 f286d160
          ebebde84 c1160454 00000010 00000282 f72a77f8 00000984 f72a77f8 f286d160
          f286d170 ebebdea0 c11e613f 00000000 00000282 f72a77f8 edd7f4d0 00000000
         Call Trace:
          [<c1160454>] account_page_dirtied+0x74/0x110
          [<c11e613f>] __set_page_dirty+0x3f/0xb0
          [<c11e6203>] mark_buffer_dirty+0x53/0xc0
          [<c124a0cb>] ext4_commit_super+0x17b/0x250
          [<c124ac71>] ext4_put_super+0xc1/0x320
          [<c11f04ba>] ? fsnotify_unmount_inodes+0x1aa/0x1c0
          [<c11cfeda>] ? evict_inodes+0xca/0xe0
          [<c11b925a>] generic_shutdown_super+0x6a/0xe0
          [<c10a1df0>] ? prepare_to_wait_event+0xd0/0xd0
          [<c1165a50>] ? unregister_shrinker+0x40/0x50
          [<c11b92f6>] kill_block_super+0x26/0x70
          [<c11b94f5>] deactivate_locked_super+0x45/0x80
          [<c11ba007>] deactivate_super+0x47/0x60
          [<c11d2b39>] cleanup_mnt+0x39/0x80
          [<c11d2bc0>] __cleanup_mnt+0x10/0x20
          [<c1080b51>] task_work_run+0x91/0xd0
          [<c1011e3c>] do_notify_resume+0x7c/0x90
          [<c1720da5>] work_notify
         Code: 8b 55 e8 e9 f4 fe ff ff 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 ec 20 89 5d f4 89 c3 89 75 f8 89 d6 89 7d fc 89 cf 8b 48 14 <64> 8b 01 89 45 ec 89 c2 8b 45 08 c1 fa 1f 01 75 ec 89 55 f0 89
         EIP: [<c136ce88>] __percpu_counter_add+0x18/0xc0 SS:ESP 0068:ebebde40
         CR2: 0000000034beb000
         ---[ end trace dd564a7bea834ecd ]---
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101011
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
    • ext4: ratelimit the file system mounted message · e294a537
      Theodore Ts'o committed
      The xfstests ext4/305 will mount and unmount the same file system over
      4,000 times, and each one of these will cause a system log message.
      Ratelimit this message since if we are getting more than a few dozen
      of these messages, they probably aren't going to be helpful.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  24. 15 Aug, 2015 1 commit
  25. 24 Jul, 2015 1 commit
    • fs: Remove ext3 filesystem driver · c290ea01
      Jan Kara committed
      The functionality of ext3 is fully supported by ext4 driver. Major
      distributions (SUSE, RedHat) already use ext4 driver to handle ext3
      filesystems for quite some time. There is some ugliness in mm resulting
      from jbd cleaning buffers in a dirty page without cleaning page dirty
      bit and also support for buffer bouncing in the block layer when stable
      pages are required is there only because of jbd. So let's remove the
      ext3 driver. This saves us some 28k lines of duplicated code.
      Acked-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jan Kara <jack@suse.cz>
  26. 23 Jul, 2015 1 commit
    • ext4, jbd2: add REQ_FUA flag when recording an error in the superblock · 564bc402
      Daeho Jeong committed
      When an error condition is detected, an error status should be recorded into
      superblocks of EXT4 or JBD2. However, the write request is submitted now
      without REQ_FUA flag, even in "barrier=1" mode, which is followed by
      panic() function in "errors=panic" mode. On mobile devices which make
      whole system reset as soon as kernel panic occurs, this write request
      containing an error flag will disappear just from storage cache without
      written to the physical cells. Therefore, when next start, even forever,
      the error flag cannot be shown in both superblocks, and e2fsck cannot fix
      the filesystem problems automatically, unless e2fsck is executed in
      force checking mode.
      
      [ Changed use test_opt(sb, BARRIER) of checking the journal flags -- TYT ]
      Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  27. 22 Jul, 2015 2 commits
    • ext4: reject journal options for ext2 mounts · 5ba92bcf
      Carlos Maiolino committed
      There is no reason to allow ext2 filesystems be mounted with journal
      mount options. So, this patch adds them to the MOPT_NO_EXT2 mount
      options list.
      Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: implement cgroup writeback support · 001e4a87
      Tejun Heo committed
      For ordered and writeback data modes, all data IOs go through
      ext4_io_submit.  This patch adds cgroup writeback support by invoking
      wbc_init_bio() from io_submit_init_bio() and wbc_account_io() in
      io_submit_add_bh().  Journal data which is written by jbd2 worker is
      left alone by this patch and will always be written out from the root
      cgroup.
      
      ext4_fill_super() is updated to set MS_CGROUPWB when data mode is
      either ordered or writeback.  In journaled data mode, most IOs become
      synchronous through the journal and enabling cgroup writeback support
      doesn't make much sense or difference.  Journaled data mode is left
      alone.
      
      Lightly tested with sequential data write workload.  Behaves as
      expected.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  28. 26 Jun, 2015 1 commit
  29. 24 Jun, 2015 1 commit