1. 16 9月, 2016 3 次提交
  2. 15 9月, 2016 5 次提交
    • F
      ext4: remove unused definition for MAX_32_NUM · be32197c
      Fabian Frederick 提交于
      MAX_32_NUM isn't used in ext4
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      be32197c
    • F
      ext4: create EXT4_MAX_BLOCKS() macro · 518eaa63
      Fabian Frederick 提交于
      Create a macro to calculate length + offset -> maximum blocks
      This adds more readability.
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      518eaa63
    • F
      ext4: remove unneeded test in ext4_alloc_file_blocks() · c3fe493c
      Fabian Frederick 提交于
      ext4_alloc_file_blocks() is called from ext4_zero_range() and
      ext4_fallocate() both already testing EXT4_INODE_EXTENTS
      We can call ext_depth(inode) unconditionnally.
      
      [ Added BUG_ON check to make sure ext4_alloc_file_blocks() won't get
        called for a indirect-mapped inode in the future.  -- tytso ]
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      c3fe493c
    • F
      ext4: fix memory leak in ext4_insert_range() · edf15aa1
      Fabian Frederick 提交于
      Running xfstests generic/013 with kmemleak gives the following:
      
      unreferenced object 0xffff8801d3d27de0 (size 96):
        comm "fsstress", pid 4941, jiffies 4294860168 (age 53.485s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff818eaaf3>] kmemleak_alloc+0x23/0x40
          [<ffffffff81179805>] __kmalloc+0xf5/0x1d0
          [<ffffffff8122ef5c>] ext4_find_extent+0x1ec/0x2f0
          [<ffffffff8123530c>] ext4_insert_range+0x34c/0x4a0
          [<ffffffff81235942>] ext4_fallocate+0x4e2/0x8b0
          [<ffffffff81181334>] vfs_fallocate+0x134/0x210
          [<ffffffff8118203f>] SyS_fallocate+0x3f/0x60
          [<ffffffff818efa9b>] entry_SYSCALL_64_fastpath+0x13/0x8f
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Problem seems mitigated by dropping refs and freeing path
      when there's no path[depth].p_ext
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      edf15aa1
    • W
      ext4: bugfix for mmaped pages in mpage_release_unused_pages() · 4e800c03
      wangguang 提交于
      Pages clear buffers after ext4 delayed block allocation failed,
      However, it does not clean its pte_dirty flag.
      if the pages unmap ,in cording to the pte_dirty ,
      unmap_page_range may try to call __set_page_dirty,
      
      which may lead to the bugon at 
      mpage_prepare_extent_to_map:head = page_buffers(page);.
      
      This patch just call clear_page_dirty_for_io to clean pte_dirty 
      at mpage_release_unused_pages for pages mmaped. 
      
      Steps to reproduce the bug:
      
      (1) mmap a file in ext4
      	addr = (char *)mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED,
      	       	            fd, 0);
      	memset(addr, 'i', 4096);
      
      (2) return EIO at 
      
      	ext4_writepages->mpage_map_and_submit_extent->mpage_map_one_extent 
      
      which causes this log message to be print:
      
                      ext4_msg(sb, KERN_CRIT,
                              "Delayed block allocation failed for "
                              "inode %lu at logical offset %llu with"
                              " max blocks %u with error %d",
                              inode->i_ino,
                              (unsigned long long)map->m_lblk,
                              (unsigned)map->m_len, -err);
      
      (3)Unmap the addr cause warning at
      
      	__set_page_dirty:WARN_ON_ONCE(warn && !PageUptodate(page));
      
      (4) wait for a minute,then bugon happen.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Nwangguang <wangguang03@zte.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      4e800c03
  3. 06 9月, 2016 5 次提交
    • D
      ext4: improve ext4lazyinit scalability · e22834f0
      Dmitry Monakhov 提交于
      ext4lazyinit is a global thread. This thread performs itable
      initalization under li_list_mtx mutex.
      
      It basically does the following:
      ext4_lazyinit_thread
        ->mutex_lock(&eli->li_list_mtx);
        ->ext4_run_li_request(elr)
          ->ext4_init_inode_table-> Do a lot of IO if the list is large
      
      And when new mount/umount arrive they have to block on ->li_list_mtx
      because  lazy_thread holds it during full walk procedure.
      ext4_fill_super
       ->ext4_register_li_request
         ->mutex_lock(&ext4_li_info->li_list_mtx);
         ->list_add(&elr->lr_request, &ext4_li_info >li_request_list);
      In my case mount takes 40minutes on server with 36 * 4Tb HDD.
      Common user may face this in case of very slow dev ( /dev/mmcblkXXX)
      Even more. If one of filesystems was frozen lazyinit_thread will simply
      block on sb_start_write() so other mount/umount will be stuck forever.
      
      This patch changes logic like follows:
      - grab ->s_umount read sem before processing new li_request.
        After that it is safe to drop li_list_mtx because all callers of
        li_remove_request are holding ->s_umount for write.
      - li_thread skips frozen SB's
      
      Locking order:
      Mh KOrder is asserted by umount path like follows: s_umount ->li_list_mtx so
      the only way to to grab ->s_mount inside li_thread is via down_read_trylock
      
      xfstests:ext4/023
      #PSBM-49658
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      e22834f0
    • J
      ext4: cleanup ext4_sync_parent() · 6ae4c5a6
      Jan Kara 提交于
      A condition !hlist_empty(&inode->i_dentry) is always true for open file.
      Just remove it. Also ext4_sync_parent() could use some explanation why
      races with rmdir() are not an issue - add a comment explaining that.
      Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      6ae4c5a6
    • K
      ext4: remove old feature helpers · 0b7b7779
      Kaho Ng 提交于
      Use the ext4_{has,set,clear}_feature_* helpers to replace the old
      feature helpers.
      Signed-off-by: NKaho Ng <ngkaho1234@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0b7b7779
    • J
      ext4: enable quota enforcement based on mount options · 49da9392
      Jan Kara 提交于
      When quota information is stored in quota files, we enable only quota
      accounting on mount and enforcement is enabled only in response to
      Q_QUOTAON quotactl. To make ext4 behavior consistent with XFS, we add a
      possibility to enable quota enforcement on mount by specifying
      corresponding quota mount option (usrquota, grpquota, prjquota).
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      49da9392
    • D
      ext4: reinforce check of i_dtime when clearing high fields of uid and gid · 93e3b4e6
      Daeho Jeong 提交于
      Now, ext4_do_update_inode() clears high 16-bit fields of uid/gid
      of deleted and evicted inode to fix up interoperability with old
      kernels. However, it checks only i_dtime of an inode to determine
      whether the inode was deleted and evicted, and this is very risky,
      because i_dtime can be used for the pointer maintaining orphan inode
      list, too. We need to further check whether the i_dtime is being
      used for the orphan inode list even if the i_dtime is not NULL.
      
      We found that high 16-bit fields of uid/gid of inode are unintentionally
      and permanently cleared when the inode truncation is just triggered,
      but not finished, and the inode metadata, whose high uid/gid bits are
      cleared, is written on disk, and the sudden power-off follows that
      in order.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NDaeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: NHobin Woo <hobin.woo@samsung.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      93e3b4e6
  4. 31 8月, 2016 1 次提交
  5. 30 8月, 2016 12 次提交
    • E
      ext4: enforce online defrag restriction for encrypted files · 14fbd4aa
      Eric Whitney 提交于
      Online defragging of encrypted files is not currently implemented.
      However, the move extent ioctl can still return successfully when
      called.  For example, this occurs when xfstest ext4/020 is run on an
      encrypted file system, resulting in a corrupted test file and a
      corresponding test failure.
      
      Until the proper functionality is implemented, fail the move extent
      ioctl if either the original or donor file is encrypted.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NEric Whitney <enwlinux@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      14fbd4aa
    • J
      ext4: factor out loop for freeing inode xattr space · dfa2064b
      Jan Kara 提交于
      Move loop to make enough space in the inode from
      ext4_expand_extra_isize_ea() into a separate function to make that
      function smaller and better readable and also to avoid delaration of
      variables inside a loop block.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      dfa2064b
    • J
      ext4: remove (almost) unused variables from ext4_expand_extra_isize_ea() · 6e0cd088
      Jan Kara 提交于
      'start' variable is completely unused in ext4_expand_extra_isize_ea().
      Variable 'first' is used only once in one place. So just remove them.
      Variables 'entry' and 'last' are only really used later in the function
      inside a loop. Move their declarations there.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      6e0cd088
    • J
      ext4: factor out xattr moving · 3f2571c1
      Jan Kara 提交于
      Factor out function for moving xattrs from inode into external xattr
      block from ext4_expand_extra_isize_ea(). That function is already quite
      long and factoring out this rather standalone functionality helps
      readability.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      3f2571c1
    • J
      ext4: replace bogus assertion in ext4_xattr_shift_entries() · 94405713
      Jan Kara 提交于
      We were checking whether computed offsets do not exceed end of block in
      ext4_xattr_shift_entries(). However this does not make sense since we
      always only decrease offsets. So replace that assertion with a check
      whether we really decrease xattrs value offsets.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      94405713
    • J
      ext4: remove checks for e_value_block · 1cba4237
      Jan Kara 提交于
      Currently we don't support xattrs with e_value_block set. We don't allow
      them to pass initial xattr check so there's no point for checking for
      this later. Since these tests were untested, bugs were creeping in and
      not all places which should have checked were checking e_value_block
      anyway.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      1cba4237
    • J
      ext4: Check that external xattr value block is zero · 2de58f11
      Jan Kara 提交于
      Currently we don't support xattrs with values stored out of line. Check
      for that in ext4_xattr_check_names() to make sure we never work with
      such xattrs since not all the code counts with that resulting is possible
      weird corruption issues.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      2de58f11
    • J
      ext4: fixup free space calculations when expanding inodes · e3014d14
      Jan Kara 提交于
      Conditions checking whether there is enough free space in an xattr block
      and when xattr is large enough to make enough space in the inode forgot
      to account for the fact that inode need not be completely filled up with
      xattrs. Thus we could move unnecessarily many xattrs out of inode or
      even falsely claim there is not enough space to expand the inode. We
      also forgot to update the amount of free space in xattr block when moving
      more xattrs and thus could decide to move too big xattr resulting in
      unexpected failure.
      
      Fix these problems by properly updating free space in the inode and
      xattr block as we move xattrs. To simplify the math, avoid shifting
      xattrs after removing each one xattr and instead just shift xattrs only
      once there is enough free space in the inode.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      e3014d14
    • L
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · b8927721
      Linus Torvalds 提交于
      Pull ext4 fixes from Ted Ts'o:
       "Fix bugs that could cause kernel deadlocks or file system corruption
        while moving xattrs to expand the extended inode.
      
        Also add some sanity checks to the block group descriptors to make
        sure we don't end up overwriting the superblock"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: avoid deadlock when expanding inode size
        ext4: properly align shifted xattrs when expanding inodes
        ext4: fix xattr shifting when expanding inodes part 2
        ext4: fix xattr shifting when expanding inodes
        ext4: validate that metadata blocks do not overlap superblock
        ext4: reserve xattr index for the Hurd
      b8927721
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 1f6a563e
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Segregate namespaces properly in conntrack dumps, from Liping Zhang.
      
       2) tcp listener refcount fix in netfilter tproxy, from Eric Dumazet.
      
       3) Fix timeouts in qed driver due to xmit_more, from Yuval Mintz.
      
       4) Fix use-after-free in tcp_xmit_retransmit_queue().
      
       5) Userspace header fixups (use of __u32, missing includes, etc.) from
          Mikko Rapeli.
      
       6) Further refinements to fragmentation wrt gso and tunnels, from
          Shmulik Ladkani.
      
       7) Trigger poll correctly for zero length UDP packets, from Eric
          Dumazet.
      
       8) TCP window scaling fix, also from Eric Dumazet.
      
       9) SLAB_DESTROY_BY_RCU is not relevant any more for UDP sockets.
      
      10) Module refcount leak in qdisc_create_dflt(), from Eric Dumazet.
      
      11) Fix deadlock in cp_rx_poll() of 8139cp driver, from Gao Feng.
      
      12) Memory leak in rhashtable's alloc_bucket_locks(), from Eric Dumazet.
      
      13) Add new device ID to alx driver, from Owen Lin.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits)
        Add Killer E2500 device ID in alx driver.
        net: smc91x: fix SMC accesses
        Documentation: networking: dsa: Remove platform device TODO
        net/mlx5: Increase number of ethtool steering priorities
        net/mlx5: Add error prints when validate ETS failed
        net/mlx5e: Fix memory leak if refreshing TIRs fails
        net/mlx5e: Add ethtool counter for TX xmit_more
        net/mlx5e: Fix ethtool -g/G rx ring parameter report with striding RQ
        net/mlx5e: Don't wait for SQ completions on close
        net/mlx5e: Don't post fragmented MPWQE when RQ is disabled
        net/mlx5e: Don't wait for RQ completions on close
        net/mlx5e: Limit UMR length to the device's limitation
        rhashtable: fix a memory leak in alloc_bucket_locks()
        sfc: fix potential stack corruption from running past stat bitmask
        team: loadbalance: push lacpdus to exact delivery
        net: hns: dereference ppe_cb->ppe_common_cb if it is non-null
        8139cp: Fix one possible deadloop in cp_rx_poll
        i40e: Change some init flow for the client
        Revert "phy: IRQ cannot be shared"
        net: dsa: bcm_sf2: Fix race condition while unmasking interrupts
        ...
      1f6a563e
    • L
      Merge tag 'platform-drivers-x86-v4.8-4' of... · cf4d3779
      Linus Torvalds 提交于
      Merge tag 'platform-drivers-x86-v4.8-4' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
      
      Pull x86 platform driver fixes from Darren Hart:
       "Remove module related code from two drivers that are only configurable
        as built-in: intel_pmic_gpio and platform/olpc"
      
      * tag 'platform-drivers-x86-v4.8-4' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
        intel_pmic_gpio: Make explicitly non-modular
        platform/olpc: Make ec explicitly non-modular
      cf4d3779
    • L
      Merge tag 'powerpc-4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 2a90309e
      Linus Torvalds 提交于
      Pull powerpc fixes from Ben Herrenschmidt:
       "This was meant to be sent early last week, but I has a change pending
        on one of the fixes and other things made me forget all about.  Ugh.
      
        We have some misc fixes for powerpc 4.8.  Some trivial bits and some
        regressions, and a trivial cleanup or two that I saw no point in
        letting rot in patchwork"
      
      * tag 'powerpc-4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: signals: Discard transaction state from signal frames
        powerpc/powernv : Drop reference added by kset_find_obj()
        powerpc/tm: do not use r13 for tabort_syscall
        powerpc: move hmi.c to arch/powerpc/kvm/
        powerpc: sysdev: cpm: fix gpio save_regs functions
        powerpc/pseries: PACA save area fix for MCE vs MCE
        powerpc/pseries: PACA save area fix for general exception vs MCE
        powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
        powerpc, hotplug: Avoid to touch non-existent cpumasks.
        powerpc: migrate exception table users off module.h and onto extable.h
        powerpc/powernv/pci: fix iterator signedness
        powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)
        cxl: use pcibios_free_controller_deferred() when removing vPHBs
        powerpc: mpc8349emitx: Delete unnecessary assignment for the field "owner"
        powerpc/512x: Delete unnecessary assignment for the field "owner"
        drivers/macintosh: Delete owner assignment
        powerpc: cputhreads: Add missing include file
      2a90309e
  6. 29 8月, 2016 14 次提交