1. 28 1月, 2015 2 次提交
  2. 07 11月, 2014 2 次提交
  3. 30 9月, 2014 1 次提交
  4. 26 9月, 2014 1 次提交
  5. 19 9月, 2014 1 次提交
  6. 08 9月, 2014 2 次提交
    • A
      UBIFS: fix free log space calculation · ba29e721
      Artem Bityutskiy 提交于
      Hu (hujianyang <hujianyang@huawei.com>) discovered an issue in the
      'empty_log_bytes()' function, which calculates how many bytes are left in the
      log:
      
      "
      If 'c->lhead_lnum + 1 == c->ltail_lnum' and 'c->lhead_offs == c->leb_size', 'h'
      would equalent to 't' and 'empty_log_bytes()' would return 'c->log_bytes'
      instead of 0.
      "
      
      At this point it is not clear what would be the consequences of this, and
      whether this may lead to any problems, but this patch addresses the issue just
      in case.
      
      Cc: stable@vger.kernel.org
      Tested-by: Nhujianyang <hujianyang@huawei.com>
      Reported-by: Nhujianyang <hujianyang@huawei.com>
      Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
      ba29e721
    • A
      UBIFS: fix a race condition · 052c2807
      Artem Bityutskiy 提交于
      Hu (hujianyang@huawei.com) discovered a race condition which may lead to a
      situation when UBIFS is unable to mount the file-system after an unclean
      reboot. The problem is theoretical, though.
      
      In UBIFS, we have the log, which basically a set of LEBs in a certain area. The
      log has the tail and the head.
      
      Every time user writes data to the file-system, the UBIFS journal grows, and
      the log grows as well, because we append new reference nodes to the head of the
      log. So the head moves forward all the time, while the log tail stays at the
      same position.
      
      At any time, the UBIFS master node points to the tail of the log. When we mount
      the file-system, we scan the log, and we always start from its tail, because
      this is where the master node points to. The only occasion when the tail of the
      log changes is the commit operation.
      
      The commit operation has 2 phases - "commit start" and "commit end". The former
      is relatively short, and does not involve much I/O. During this phase we mostly
      just build various in-memory lists of the things which have to be written to
      the flash media during "commit end" phase.
      
      During the commit start phase, what we do is we "clean" the log. Indeed, the
      commit operation will index all the data in the journal, so the entire journal
      "disappears", and therefore the data in the log become unneeded. So we just
      move the head of the log to the next LEB, and write the CS node there. This LEB
      will be the tail of the new log when the commit operation finishes.
      
      When the "commit start" phase finishes, users may write more data to the
      file-system, in parallel with the ongoing "commit end" operation. At this point
      the log tail was not changed yet, it is the same as it had been before we
      started the commit. The log head keeps moving forward, though.
      
      The commit operation now needs to write the new master node, and the new master
      node should point to the new log tail. After this the LEBs between the old log
      tail and the new log tail can be unmapped and re-used again.
      
      And here is the possible problem. We do 2 operations: (a) We first update the
      log tail position in memory (see 'ubifs_log_end_commit()'). (b) And then we
      write the master node (see the big lock of code in 'do_commit()').
      
      But nothing prevents the log head from moving forward between (a) and (b), and
      the log head may "wrap" now to the old log tail. And when the "wrap" happens,
      the contends of the log tail gets erased. Now a power cut happens and we are in
      trouble. We end up with the old master node pointing to the old tail, which was
      erased. And replay fails because it expects the master node to point to the
      correct log tail at all times.
      
      This patch merges the abovementioned (a) and (b) operations by moving the master
      node change code to the 'ubifs_log_end_commit()' function, so that it runs with
      the log mutex locked, which will prevent the log from being changed benween
      operations (a) and (b).
      
      Cc: stable@vger.kernel.org # 07e19dff UBIFS: remove mst_mutex
      Cc: stable@vger.kernel.org
      Reported-by: Nhujianyang <hujianyang@huawei.com>
      Tested-by: Nhujianyang <hujianyang@huawei.com>
      Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
      052c2807
  7. 31 7月, 2014 1 次提交
  8. 29 7月, 2014 1 次提交
  9. 19 7月, 2014 14 次提交
  10. 12 6月, 2014 1 次提交
    • A
      ->splice_write() via ->write_iter() · 8d020765
      Al Viro 提交于
      iter_file_splice_write() - a ->splice_write() instance that gathers the
      pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
      it to ->write_iter().  A bunch of simple cases coverted to that...
      
      [AV: fixed the braino spotted by Cyrill]
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      8d020765
  11. 03 6月, 2014 1 次提交
  12. 02 6月, 2014 2 次提交
    • D
      UBIFS: respect MS_SILENT mount flag · 90bea5a3
      Daniel Golle 提交于
      When attempting to mount a non-ubifs formatted volume, lots of error
      messages (including a stack dump) are thrown to the kernel log even if
      the MS_SILENT mount flag is set.
      Fix this by introducing adding an additional state-variable in
      struct ubifs_info and suppress error messages in ubifs_read_node if
      MS_SILENT is set.
      Signed-off-by: NDaniel Golle <daniel@makrotopia.org>
      Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
      90bea5a3
    • H
      UBIFS: Remove incorrect assertion in shrink_tnc() · 72abc8f4
      hujianyang 提交于
      I hit the same assert failed as Dolev Raviv reported in Kernel v3.10
      shows like this:
      
      [ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297)
      [ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G           O 3.10.40 #1
      [ 9641.234116] [<c0011a6c>] (unwind_backtrace+0x0/0x12c) from [<c000d0b0>] (show_stack+0x20/0x24)
      [ 9641.234137] [<c000d0b0>] (show_stack+0x20/0x24) from [<c0311134>] (dump_stack+0x20/0x28)
      [ 9641.234188] [<c0311134>] (dump_stack+0x20/0x28) from [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs])
      [ 9641.234265] [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs])
      [ 9641.234307] [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [<c00cdad8>] (shrink_slab+0x1d4/0x2f8)
      [ 9641.234327] [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) from [<c00d03d0>] (do_try_to_free_pages+0x300/0x544)
      [ 9641.234344] [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) from [<c00d0a44>] (try_to_free_pages+0x2d0/0x398)
      [ 9641.234363] [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) from [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8)
      [ 9641.234382] [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) from [<c00f62d8>] (new_slab+0x78/0x238)
      [ 9641.234400] [<c00f62d8>] (new_slab+0x78/0x238) from [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c)
      [ 9641.234419] [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) from [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188)
      [ 9641.234459] [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) from [<bf227908>] (do_readpage+0x168/0x468 [ubifs])
      [ 9641.234553] [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) from [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs])
      [ 9641.234606] [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) from [<c00c17c0>] (filemap_fault+0x304/0x418)
      [ 9641.234638] [<c00c17c0>] (filemap_fault+0x304/0x418) from [<c00de694>] (__do_fault+0xd4/0x530)
      [ 9641.234665] [<c00de694>] (__do_fault+0xd4/0x530) from [<c00e10c0>] (handle_pte_fault+0x480/0xf54)
      [ 9641.234690] [<c00e10c0>] (handle_pte_fault+0x480/0xf54) from [<c00e2bf8>] (handle_mm_fault+0x140/0x184)
      [ 9641.234716] [<c00e2bf8>] (handle_mm_fault+0x140/0x184) from [<c0316688>] (do_page_fault+0x150/0x3ac)
      [ 9641.234737] [<c0316688>] (do_page_fault+0x150/0x3ac) from [<c000842c>] (do_DataAbort+0x3c/0xa0)
      [ 9641.234759] [<c000842c>] (do_DataAbort+0x3c/0xa0) from [<c0314e38>] (__dabt_usr+0x38/0x40)
      
      After analyzing the code, I found a condition that may cause this failed
      in correct operations. Thus, I think this assertion is wrong and should be
      removed.
      
      Suppose there are two clean znodes and one dirty znode in TNC. So the
      per-filesystem atomic_t @clean_zn_cnt is (2). If commit start, dirty_znode
      is set to COW_ZNODE in get_znodes_to_commit() in case of potentially ops
      on this znode. We clear COW bit and DIRTY bit in write_index() without
      @tnc_mutex locked. We don't increase @clean_zn_cnt in this place. As the
      comments in write_index() shows, if another process hold @tnc_mutex and
      dirty this znode after we clean it, @clean_zn_cnt would be decreased to (1).
      We will increase @clean_zn_cnt to (2) with @tnc_mutex locked in
      free_obsolete_znodes() to keep it right.
      
      If shrink_tnc() performs between decrease and increase, it will release
      other 2 clean znodes it holds and found @clean_zn_cnt is less than zero
      (1 - 2 = -1), then hit the assertion. Because free_obsolete_znodes() will
      soon correct @clean_zn_cnt and no harm to fs in this case, I think this
      assertion could be removed.
      
      2 clean zondes and 1 dirty znode, @clean_zn_cnt == 2
      
      Thread A (commit)         Thread B (write or others)       Thread C (shrinker)
      ->write_index
         ->clear_bit(DIRTY_NODE)
         ->clear_bit(COW_ZNODE)
      
                  @clean_zn_cnt == 2
                                ->mutex_locked(&tnc_mutex)
                                ->dirty_cow_znode
                                    ->!ubifs_zn_cow(znode)
                                    ->!test_and_set_bit(DIRTY_NODE)
                                    ->atomic_dec(&clean_zn_cnt)
                                ->mutex_unlocked(&tnc_mutex)
      
                  @clean_zn_cnt == 1
                                                                 ->mutex_locked(&tnc_mutex)
                                                                 ->shrink_tnc
                                                                   ->destroy_tnc_subtree
                                                                   ->atomic_sub(&clean_zn_cnt, 2)
                                                                   ->ubifs_assert  <- hit
                                                                 ->mutex_unlocked(&tnc_mutex)
      
                  @clean_zn_cnt == -1
      ->mutex_lock(&tnc_mutex)
      ->free_obsolete_znodes
         ->atomic_inc(&clean_zn_cnt)
      ->mutux_unlock(&tnc_mutex)
      
                  @clean_zn_cnt == 0 (correct after shrink)
      Signed-off-by: Nhujianyang <hujianyang@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
      72abc8f4
  13. 28 5月, 2014 1 次提交
  14. 27 5月, 2014 2 次提交
  15. 13 5月, 2014 2 次提交
  16. 07 5月, 2014 2 次提交
  17. 05 5月, 2014 1 次提交
  18. 18 4月, 2014 1 次提交
  19. 08 4月, 2014 1 次提交
  20. 04 4月, 2014 1 次提交
    • J
      mm + fs: store shadow entries in page cache · 91b0abe3
      Johannes Weiner 提交于
      Reclaim will be leaving shadow entries in the page cache radix tree upon
      evicting the real page.  As those pages are found from the LRU, an
      iput() can lead to the inode being freed concurrently.  At this point,
      reclaim must no longer install shadow pages because the inode freeing
      code needs to ensure the page tree is really empty.
      
      Add an address_space flag, AS_EXITING, that the inode freeing code sets
      under the tree lock before doing the final truncate.  Reclaim will check
      for this flag before installing shadow pages.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Reviewed-by: NMinchan Kim <minchan@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Metin Doslu <metin@citusdata.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ozgun Erdogan <ozgun@citusdata.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roman Gushchin <klamm@yandex-team.ru>
      Cc: Ryan Mallon <rmallon@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      91b0abe3