1. 14 May, 2011 3 commits
  2. 03 May, 2011 2 commits
    • A
      UBIFS: seek journal heads to the latest bud in replay · 52c6e6f9
      Artem Bityutskiy committed
      This is the second fix of the following symptom:
      
      UBIFS error (pid 34456): could not find an empty LEB
      
      which sometimes happens after power cuts when we mount the file-system - UBIFS
      refuses it with the above error message which comes from the
      'ubifs_rcvry_gc_commit()' function. I can reproduce this using the integck test
      with the UBIFS power cut emulation enabled.
      
      Analysis of the problem.
      
      Currently UBIFS replay seeks the journal heads to the last _replayed_ bud.
      But the buds are replayed out-of-order, so the replay basically seeks journal
      heads to the "random" bud belonging to this head, and not to the _last_ one.
      
      The result of this is that the GC head may be seeked to a full LEB with no free
      space, or very little free space. And 'ubifs_rcvry_gc_commit()' tries to find a
      fully or mostly dirty LEB to match the current GC head (because we need to
      garbage-collect that dirty LEB in one go, since we do not have @c->gc_lnum).
      So 'ubifs_find_dirty_leb()' fails, we fall back to finding an empty LEB, and
      that fails too. As a result, recovery fails and mounting fails.
      
      This patch teaches the replay to initialize the GC heads exactly to the latest
      buds, i.e. the buds which have the largest sequence number in corresponding
      log reference nodes.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Cc: stable@kernel.org
      52c6e6f9
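A minimal sketch of the selection rule described in this commit message: for each journal head, pick the bud whose log reference node has the largest sequence number, regardless of the order in which buds were replayed. The `struct bud` layout and the `latest_bud()` helper are hypothetical simplifications for illustration, not the real UBIFS data structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a bud (not the real UBIFS structure). */
struct bud {
    int lnum;                 /* LEB number of the bud */
    int jhead;                /* journal head the bud belongs to */
    unsigned long long sqnum; /* sequence number of its log reference node */
};

/*
 * Return the index of the bud with the largest sequence number for the
 * given journal head, or -1 if that head has no buds. The result does
 * not depend on the order in which the buds appear in the array.
 */
static int latest_bud(const struct bud *buds, size_t n, int jhead)
{
    int best = -1;

    assert(buds != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (buds[i].jhead != jhead)
            continue;
        if (best < 0 || buds[i].sqnum > buds[best].sqnum)
            best = (int)i;
    }
    return best;
}
```

Seeking a head to `buds[latest_bud(...)]` rather than to whichever bud happened to be replayed last is the behavior the message describes.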
    • A
      UBIFS: do not free write-buffers when in R/O mode · b50b9f40
      Artem Bityutskiy committed
      Currently UBIFS has a small optimization - it frees write-buffers when it is
      re-mounted from R/W mode to R/O mode. Of course, when it is mounted R/O, it
      does not allocate write-buffers either.
      
      This optimization is nice but it leads to subtle problems and complications
      in recovery, which I can reproduce using the integck test. The symptoms are
      that after a power cut the file-system cannot be mounted if we first mount
      it R/O, and then re-mount R/W - 'ubifs_rcvry_gc_commit()' prints:
      
      UBIFS error (pid 34456): could not find an empty LEB
      
      Analysis of the problem.
      
      When mounting R/W, the replay process sets journal heads to buds [1], but
      when mounting R/O - it does not do this, because the write-buffers are not
      allocated. So 'ubifs_rcvry_gc_commit()' works completely differently for the
      same file-system but for the following 2 cases:
      
      1. mounting R/W after a power cut and recovering
      2. mounting R/O after a power cut, re-mounting R/W and running deferred recovery
      
      In the former case, we have journal heads seeked to a bud, in the latter
      case, they are non-seeked (wbuf->lnum == -1). So in the latter case we do not
      try to recover the GC LEB by garbage-collecting to the GC head, but we just
      try to find an empty LEB, and there may be no empty LEBs, so we just fail.
      On the other hand, in the former case (mount R/W), we are able to make a GC LEB
      (@c->gc_lnum) by garbage-collecting.
      
      Thus, let's remove this small nice optimization and always allocate
      write-buffers. This should not make too big a difference - we have only 3
      of them, each of max. write unit size, which is usually 2KiB. So this is
      about 6KiB of RAM for the typical case, and only when mounted R/O.
      
      [1]: Note, currently the replay process is setting (seeking) the journal heads
      to _some_ buds, not necessarily to the buds which had been the journal heads
      before the power cut happened. This will be fixed separately.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Cc: stable@kernel.org
      b50b9f40
  3. 21 April, 2011 2 commits
    • A
      UBIFS: fix master node recovery · 6e0d9fd3
      Artem Bityutskiy committed
      This patch fixes the following symptoms:
      1. Unmount UBIFS cleanly.
      2. Start mounting UBIFS R/W and have a power cut immediately
      3. Start mounting UBIFS R/O, this succeeds
      4. Try to re-mount UBIFS R/W - this fails immediately or later on,
         because UBIFS will write the master node to the flash area
         which has been written before.
      
      The analysis of the problem:
      
      1. UBIFS is unmounted cleanly, both copies of the master node are clean.
      2. UBIFS is being mounted R/W, starts changing master node copy N1, and
         a power cut happens. Copy N1 becomes corrupted.
      3. UBIFS is being mounted R/O. It notices the copy N1 is corrupted and
         reads copy N2. Copy N2 is clean.
      4. Because of R/O mode, UBIFS cannot recover copy N1.
      5. The mount code (ubifs_mount()) sees that the master node is clean,
         so it decides that no recovery is needed.
      6. We are re-mounting R/W. UBIFS believes no recovery is needed and
         starts updating the master node, but copy N1 is still corrupted
         and was not recovered!
      
      Fix this problem by marking the master node as dirty every time we
      recover it while in R/O mode. This forces further recovery, so
      UBIFS cleans up the corruption and recovers copy N1 when
      re-mounting R/W later.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Cc: stable@kernel.org
      6e0d9fd3
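The fix described above can be modeled with a small state sketch. Everything here is a hypothetical simplification (the `struct mst_sketch` fields and both helpers are invented for illustration and do not mirror the real UBIFS code): when a corrupted master node copy is detected during an R/O mount, the master node is flagged dirty so that a later R/W re-mount still runs recovery.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two master node copies and the dirty flag. */
struct mst_sketch {
    bool copy1_ok; /* copy N1 is readable and valid */
    bool copy2_ok; /* copy N2 is readable and valid */
    bool dirty;    /* "recovery is still needed" marker */
};

/* Mount-time handling: repair when R/W, defer (and mark dirty) when R/O. */
static void mount_sketch(struct mst_sketch *m, bool ro_mount)
{
    assert(m->copy1_ok || m->copy2_ok); /* at least one copy must be usable */
    if (m->copy1_ok && m->copy2_ok)
        return;
    if (ro_mount) {
        /* Cannot write to the media: remember that recovery is pending. */
        m->dirty = true;
        return;
    }
    m->copy1_ok = m->copy2_ok = true;
}

/* Re-mount R/W: the dirty flag forces the deferred recovery to run. */
static void remount_rw_sketch(struct mst_sketch *m)
{
    if (m->dirty) {
        m->copy1_ok = m->copy2_ok = true;
        m->dirty = false;
    }
}
```

Without the `dirty = true` step in the R/O branch, `remount_rw_sketch()` would skip the repair and leave copy N1 corrupted, which is the failure sequence listed in the analysis.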
    • A
      UBIFS: fix false assertion warning in case of I/O failures · 1a067a22
      Artem Bityutskiy committed
      When UBIFS switches to R/O mode because it detects I/O failures, we may
      still have allocated budget at unmount time, and the assertions which
      verify that we have no budget will fire. But it is expected to have the
      budget in case of I/O failures, so the assertion warnings are false
      positives. Suppress them for the I/O failure case.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      1a067a22
  4. 20 April, 2011 1 commit
  5. 13 April, 2011 2 commits
  6. 06 April, 2011 1 commit
    • J
      fs: export empty_aops · 7dcda1c9
      Jens Axboe committed
      With the ->sync_page() hook gone, we have a few users that
      add their own static address_space_operations without any
      functions defined.
      
      fs/inode.c already has an empty_aops that it uses for init
      purposes. Let's export that and use it in the places where
      an otherwise empty aops was defined.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      7dcda1c9
  7. 05 April, 2011 7 commits
    • A
      UBIFS: fix debugging failure in dbg_check_space_info · 7da6443a
      Artem Bityutskiy committed
      This patch fixes a debugging failure which looks like this:
      UBIFS error (pid 32313): dbg_check_space_info: free space changed from 6019344 to 6022654
      
      The reason for this failure is described in the comment this patch adds
      to the code. But in short - 'c->freeable_cnt' may be different before
      and after re-mounting, and this is normal. So the debugging code should
      make sure that free space calculations do not depend on 'c->freeable_cnt'.
      
      A similar issue has been reported here:
      http://lists.infradead.org/pipermail/linux-mtd/2011-April/034647.html
      
      This patch should fix it.
      
      For the -stable guys: this patch is only relevant for kernels 2.6.30
      onwards.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Cc: stable@kernel.org [2.6.30+]
      7da6443a
    • A
      UBIFS: fix error path in dbg_debugfs_init_fs · 95169535
      Artem Bityutskiy committed
      The debug interface is substandard and on error returns either
      NULL or an error code packed in the pointer. So using "IS_ERR"
      for the pointers returned by debugfs function is incorrect.
      Instead, we should use IS_ERR_OR_NULL.
      
      This patch is an improved version of the original patch from
      Phil Carmody.
      Reported-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
      95169535
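To illustrate why an "IS_ERR"-style check alone misses a NULL return, here is a user-space sketch of the error-pointer convention. The `sketch_*` helpers and `SKETCH_MAX_ERRNO` are stand-ins written for this example; they imitate the idea behind the kernel's `IS_ERR` and `IS_ERR_OR_NULL` but are not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the "error code" range at the top of the address space. */
#define SKETCH_MAX_ERRNO 4095

/* Pack a negative errno value into a pointer. */
static void *sketch_err_ptr(long err)
{
    assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* True only for pointers that carry a packed error code. */
static bool sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* True for packed error codes and for NULL. */
static bool sketch_is_err_or_null(const void *p)
{
    return p == NULL || sketch_is_err(p);
}
```

With an interface that may return either NULL or a packed error code, `sketch_is_err(NULL)` is false, so a caller relying on it alone would treat a failed call as a success.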
    • A
      UBIFS: unify error path dbg_debugfs_init_fs · cc6a86b9
      Artem Bityutskiy committed
      This is just a small clean-up patch which simplifies and unifies the
      error path in dbg_debugfs_init_fs(). We have a common error path
      for all failure cases in this function except the very first
      one. This patch makes the first failure case use the same
      error path as the other cases by using the 'fname' and 'dent'
      variables.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
      cc6a86b9
    • A
      UBIFS: do not select KALLSYMS_ALL · 81354de3
      Artem Bityutskiy committed
      All UBIFS needs is to make sure we have stacktraces when UBIFS debugging
      is enabled. It is enough to select KALLSYMS for this, KALLSYMS_ALL
      is not necessary. Moreover, Randy Dunlap reported that UBIFS causes
      the following Kconfig dependency warning:
      
      warning: (UBIFS_FS_DEBUG && LOCKDEP && LATENCYTOP) selects KALLSYMS_ALL
      which has unmet direct dependencies (DEBUG_KERNEL && KALLSYMS)
      
      The reason is that KALLSYMS_ALL requires DEBUG_KERNEL and KALLSYMS, so
      ideally, to select KALLSYMS_ALL we'd need to select DEBUG_KERNEL and
      KALLSYMS first.
      
      This seems to be too much to select. The easiest way to go is to forget
      about KALLSYMS_ALL and just select KALLSYMS when UBIFS debugging is
      enabled - that should be enough for stackdumps.
      Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
      81354de3
    • A
      UBIFS: fix assertion warnings · c88ac00c
      Artem Bityutskiy committed
      This patch fixes UBIFS assertion warnings like:
      
      UBIFS assert failed in ubifs_leb_unmap at 135 (pid 29365)
      Pid: 29365, comm: integck Tainted: G          I 2.6.37-ubi-2.6+ #34
      Call Trace:
       [<ffffffffa047c663>] ubifs_lpt_init+0x95e/0x9ee [ubifs]
       [<ffffffffa04623a7>] ubifs_remount_fs+0x2c7/0x762 [ubifs]
       [<ffffffff810f066e>] do_remount_sb+0xb6/0x101
       [<ffffffff81106ff4>] ? do_mount+0x191/0x78e
       [<ffffffff811070bb>] do_mount+0x258/0x78e
       [<ffffffff810da1e8>] ? alloc_pages_current+0xa2/0xc5
       [<ffffffff81107674>] sys_mount+0x83/0xbd
       [<ffffffff81009a12>] system_call_fastpath+0x16/0x1b
      
      They happen when we re-mount from R/O mode to R/W mode. While
      re-mounting, we write to the media, but we still have the c->ro_mount
      flag set. The fix is very simple - just clear the flag before
      starting re-mounting R/W.
      
      These warnings are caused by the following commit:
      2ef13294
      
      For -stable guys: this bug was introduced in 2.6.38, this is material
      for 2.6.38-stable.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Cc: stable@kernel.org [2.6.38]
      c88ac00c
    • A
      UBIFS: fix oops on error path in read_pnode · 54acbaaa
      Artem Bityutskiy committed
      Thanks to Coverity which spotted that UBIFS will oops if 'kmalloc()'
      in 'read_pnode()' fails and we dereference a NULL 'pnode' pointer
      when we 'goto out'.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Cc: stable@kernel.org
      54acbaaa
    • A
      UBIFS: do not read flash unnecessarily · 8b229c76
      Artem Bityutskiy committed
      This fix makes the 'dbg_check_old_index()' function return
      immediately if debugging is disabled, instead of executing
      an incorrect 'goto out', which causes UBIFS to:
      
      1. Allocate memory
      2. Read the flash
      
      This happens on every commit. OK, we do not commit that often, but it is
      still silly to do unneeded I/O anyway.
      
      Credits to Coverity for spotting this silly issue.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Cc: stable@kernel.org
      8b229c76
  8. 31 March, 2011 1 commit
  9. 24 March, 2011 4 commits
    • A
      UBIFS: fix assertion warning and refine comments · 6ed09c34
      Artem Bityutskiy committed
      This patch fixes the following UBIFS assertion warning:
      
      UBIFS assert failed in do_readpage at 115 (pid 199)
      [<b00321b8>] (unwind_backtrace+0x0/0xdc) from [<af025118>]
      (do_readpage+0x108/0x594 [ubifs])
      [<af025118>] (do_readpage+0x108/0x594 [ubifs]) from [<af025764>]
      (ubifs_write_end+0x1c0/0x2e8 [ubifs])
      [<af025764>] (ubifs_write_end+0x1c0/0x2e8 [ubifs]) from
      [<b00a0164>] (generic_file_buffered_write+0x18c/0x270)
      [<b00a0164>] (generic_file_buffered_write+0x18c/0x270) from
      [<b00a08d4>] (__generic_file_aio_write+0x478/0x4c0)
      [<b00a08d4>] (__generic_file_aio_write+0x478/0x4c0) from
      [<b00a0984>] (generic_file_aio_write+0x68/0xc8)
      [<b00a0984>] (generic_file_aio_write+0x68/0xc8) from
      [<af024a78>] (ubifs_aio_write+0x178/0x1d8 [ubifs])
      [<af024a78>] (ubifs_aio_write+0x178/0x1d8 [ubifs]) from
      [<b00d104c>] (do_sync_write+0xb0/0x100)
      [<b00d104c>] (do_sync_write+0xb0/0x100) from [<b00d1abc>]
      (vfs_write+0xac/0x154)
      [<b00d1abc>] (vfs_write+0xac/0x154) from [<b00d1c10>]
      (sys_write+0x3c/0x68)
      [<b00d1c10>] (sys_write+0x3c/0x68) from [<b002d9a0>]
      (ret_fast_syscall+0x0/0x2c)
      
      The 'PG_checked' flag is used to indicate that the page supposedly
      does not exist on the media (e.g., a hole or a page beyond the
      inode size), so it requires a slightly bigger budget, because we have
      to account for the indexing size increase. This flag basically
      says that the budget for this page has to be the "new page budget".
      The "new page budget" is slightly bigger than the "existing page
      budget".
      
      The 'do_readpage()' function has the following assertion which
      sometimes is hit: 'ubifs_assert(!PageChecked(page))'. Obviously,
      the meaning of this assertion is: "I should not be asked to read
      a page which does not exist on the media".
      
      However, in 'ubifs_write_begin()' we have a small "trick". Notice
      that VFS may write pages which were not read yet, so the page data
      were not loaded from the media to the page cache yet. If VFS says
      that it is going to change only some part of the page, we obviously
      have to load it from the media. However, if VFS says that it is
      going to change the whole page, we do not read it from the media for
      optimization purposes.
      
      However, since we do not read it, we do not know if it exists on
      the media or not (a hole, etc). So we set the 'PG_checked' flag
      to this page to force bigger budget, just in case.
      
      So 'ubifs_write_begin()' sets 'PG_checked'. Then we are in
      'ubifs_write_end()'. And VFS tells us: "hey, for some reason I
      changed my mind and did not change the whole page". Frankly, I do not
      know why this happens, but I hit this somehow on an ARM platform.
      And this is extremely rare.
      
      So in this case UBIFS does the following:
      
      1. Cancels allocated budget.
      2. Loads the page from the media by calling 'do_readpage()'.
      3. Asks VFS to repeat the whole write operation from the very
         beginning (call '->write_begin()' again, etc).
      
      And the assertion warning is hit at step 2 - remember we have
      'PG_checked' set for this page, and 'do_readpage()' does not
      like this. So this patch fixes the problem by adding step 1.5 and
      clearing 'PG_checked' before calling 'do_readpage()'.
      
      All in all, this patch does not fix any functionality issue, but it
      silences a UBIFS false positive warning which may happen in very
      rare cases.
      
      And while at it, this patch also improves the comment which explains
      the reasons for setting the 'PG_checked' flag for the page. The old
      comment was a bit difficult to understand.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      6ed09c34
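The three steps above, plus the added "step 1.5", can be sketched as follows. This is a toy model under stated assumptions: `struct page_sketch`, `do_readpage_sketch()` and `write_end_sketch()` are hypothetical names, budgeting is left out, and the condition for the rare case (a full-page write that ended up short) is a simplification of what the message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page cache page with two state bits. */
struct page_sketch {
    bool checked;  /* models 'PG_checked': budget as a new page */
    bool uptodate; /* page contents were loaded from the media */
};

/* Models the assertion in 'do_readpage()': the page must not be "checked". */
static void do_readpage_sketch(struct page_sketch *pg)
{
    assert(!pg->checked);
    pg->uptodate = true;
}

/*
 * Models the rare path: a whole-page write was announced (len equals
 * page_size) but fewer bytes were actually copied. Returns the number
 * of bytes accepted; 0 asks the caller to repeat the whole write.
 */
static unsigned write_end_sketch(struct page_sketch *pg, unsigned len,
                                 unsigned copied, unsigned page_size)
{
    if (copied < len && len == page_size) {
        /* Step 1: cancel the allocated budget (omitted in this model). */
        pg->checked = false;    /* Step 1.5: clear the flag first. */
        do_readpage_sketch(pg); /* Step 2: load the page from the media. */
        return 0;               /* Step 3: have the write repeated. */
    }
    return copied;
}
```

Removing the `pg->checked = false` line makes the assertion in `do_readpage_sketch()` fire for the short-write case, which mirrors the warning this commit silences.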
    • A
      UBIFS: kill CONFIG_UBIFS_FS_DEBUG_CHKS · 9d523caf
      Artem Bityutskiy committed
      Simplify the UBIFS configuration menu and kill the option to enable
      self-checks at compile time. We do not really need this because we can do
      this at run time using the module parameters or the corresponding sysfs
      interfaces. And there is value in simplifying the kernel configuration
      menu, which is becoming increasingly large.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      9d523caf
    • A
      UBIFS: use GFP_NOFS properly · fc5e58c0
      Artem Bityutskiy committed
      This patch fixes a brown-paperbag bug which was introduced by me:
      I used incorrect "GFP_KERNEL | GFP_NOFS" allocation flags to make
      sure my allocations do not cause write-back. But the correct form
      is "GFP_NOFS".
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      fc5e58c0
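A short sketch of why OR-ing the two flag sets defeats the purpose. It assumes, as in kernels of that era, that the "no FS" mask is the general-purpose mask with the "may call into the file-system" bit left out. The `SK_*` names and bit values below are illustrative only and are not taken from kernel headers.

```c
#include <assert.h>

/* Illustrative bit values; real kernel values differ between versions. */
#define SK_GFP_WAIT   0x10u /* allocation may sleep */
#define SK_GFP_IO     0x40u /* allocation may start physical I/O */
#define SK_GFP_FS     0x80u /* allocation may call into the file-system */

#define SK_GFP_NOFS   (SK_GFP_WAIT | SK_GFP_IO)
#define SK_GFP_KERNEL (SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS)

/* Would an allocation with these flags be allowed to re-enter the FS? */
static int may_enter_fs(unsigned int flags)
{
    return (flags & SK_GFP_FS) != 0;
}

/* OR-ing can only add bits, so the combined mask equals SK_GFP_KERNEL. */
static void gfp_sketch_selfcheck(void)
{
    assert((SK_GFP_KERNEL | SK_GFP_NOFS) == SK_GFP_KERNEL);
    assert(may_enter_fs(SK_GFP_KERNEL | SK_GFP_NOFS));
    assert(!may_enter_fs(SK_GFP_NOFS));
}
```

Since a restriction is expressed by the absence of a bit, it has to be requested by passing the restricted mask on its own.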
    • S
      userns: rename is_owner_or_cap to inode_owner_or_capable · 2e149670
      Serge E. Hallyn committed
      And give it a kernel-doc comment.
      
      [akpm@linux-foundation.org: btrfs changed in linux-next]
      Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2e149670
  10. 16 March, 2011 7 commits
  11. 15 March, 2011 1 commit
  12. 11 March, 2011 4 commits
    • A
      UBIFS: do not check data crc by default · 2bcf0021
      Artem Bityutskiy committed
      Change the default UBIFS behavior WRT data CRC checking. Currently,
      UBIFS checks data CRC when reading, which slows it down quite a bit,
      and this is the default option. However, it looks like the average
      user does not need this feature and would prefer faster read speed
      over extra reliability. And it seems to be the de-facto standard that
      file-systems do not check data CRC every time they read from the
      media.
      
      Thus, make UBIFS default behavior so that it does not check data
      CRC. This corresponds to the no_chk_data_crc mount option. Those users
      who need extra protection can always enable it using the chk_data_crc
      option.
      
      Please, read more information about this feature here:
      http://www.linux-mtd.infradead.org/doc/ubifs.html#L_checksumming
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      2bcf0021
    • A
      UBIFS: simplify UBIFS Kconfig menu · cce3f612
      Artem Bityutskiy committed
      Remove debug message level and debug checks Kconfig options as they
      proved to be useless anyway. We have a sysfs interface which we can
      use for fine-grained selection of debugging messages and checks, see
      Documentation/filesystems/ubifs.txt for more details.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      cce3f612
    • A
      UBIFS: print max. index node size · 6342aaeb
      Artem Bityutskiy committed
      Improve debugging messages by printing the maximum index node size
      on mount.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      6342aaeb
    • M
      UBIFS: handle allocation failures in UBIFS write path · d882962f
      Matthew L. Creech committed
      Running kernel 2.6.37, my PPC-based device occasionally gets an
      order-2 allocation failure in UBIFS, which causes the root FS to
      become unwritable:
      
      kswapd0: page allocation failure. order:2, mode:0x4050
      Call Trace:
      [c787dc30] [c00085b8] show_stack+0x7c/0x194 (unreliable)
      [c787dc70] [c0061aec] __alloc_pages_nodemask+0x4f0/0x57c
      [c787dd00] [c0061b98] __get_free_pages+0x20/0x50
      [c787dd10] [c00e4f88] ubifs_jnl_write_data+0x54/0x200
      [c787dd50] [c00e82d4] do_writepage+0x94/0x198
      [c787dd90] [c00675e4] shrink_page_list+0x40c/0x77c
      [c787de40] [c0067de0] shrink_inactive_list+0x1e0/0x370
      [c787de90] [c0068224] shrink_zone+0x2b4/0x2b8
      [c787df00] [c0068854] kswapd+0x408/0x5d4
      [c787dfb0] [c0037bcc] kthread+0x80/0x84
      [c787dff0] [c000ef44] kernel_thread+0x4c/0x68
      
      Similar problems were encountered last April by Tomasz Stanislawski:
      
      http://patchwork.ozlabs.org/patch/50965/
      
      This patch implements Artem's suggested fix: fall back to a
      mutex-protected static buffer, allocated at mount time.  I tested it
      by forcing execution down the failure path, and didn't see any ill
      effects.
      
      Artem: massaged the patch a little, improved it so that we'd not
      allocate the write reserve buffer when we are in R/O mode.
      Signed-off-by: Matthew L. Creech <mlcreech@gmail.com>
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      d882962f
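The fallback idea from this commit message can be sketched in user space as follows. This is a hedged illustration, not the UBIFS code: `get_write_buf()`, `put_write_buf()`, `RESERVE_SIZE` and the `force_fail` test hook are all hypothetical, `malloc()` and a pthread mutex stand in for the kernel allocator and kernel mutex, and a static array stands in for the reserve buffer that the message says is allocated at mount time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

#define RESERVE_SIZE 4096 /* assumed upper bound for one write */

/* Reserve buffer shared by all writers, so access must be serialized. */
static unsigned char reserve_buf[RESERVE_SIZE];
static pthread_mutex_t reserve_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Try a normal allocation first. If it fails (or force_fail is set, a
 * hook for exercising the failure path), take the mutex and hand out
 * the reserve buffer instead. *used_reserve tells the caller which.
 */
static void *get_write_buf(size_t len, bool force_fail, bool *used_reserve)
{
    void *buf;

    assert(len > 0 && len <= RESERVE_SIZE);
    buf = force_fail ? NULL : malloc(len);
    if (buf) {
        *used_reserve = false;
        return buf;
    }
    pthread_mutex_lock(&reserve_mutex);
    *used_reserve = true;
    return reserve_buf;
}

/* Release whatever get_write_buf() handed out. */
static void put_write_buf(void *buf, bool used_reserve)
{
    if (used_reserve)
        pthread_mutex_unlock(&reserve_mutex);
    else
        free(buf);
}
```

The trade-off is that writers using the fallback path are serialized by the mutex, in exchange for the write path no longer failing when a multi-page allocation cannot be satisfied.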
  13. 10 March, 2011 1 commit
  14. 08 March, 2011 4 commits
    • A
      UBIFS: use max_write_size during recovery · 2765df7d
      Artem Bityutskiy committed
      When recovering from unclean reboots UBIFS scans the journal and checks nodes.
      If a corrupted node is found, UBIFS tries to check if this is the last node
      in the LEB or not. This is done by checking if there are only 0xFF bytes
      starting from the next min. I/O unit. However, since we now write in
      c->max_write_size units, we should actually check for 0xFFs starting from the
      next max. write unit.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      2765df7d
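A minimal sketch of the check described above, assuming the LEB contents are available as a flat byte array. The function name and parameters are hypothetical. It rounds the corruption offset up to the start of the next max. write unit and verifies that everything from there to the end of the LEB is 0xFF (erased flash).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if the corrupted data at 'corrupt_offs' looks like the
 * last write to this LEB: every byte from the next max. write unit
 * boundary up to the end of the LEB is 0xFF.
 */
static bool is_last_write_sketch(const uint8_t *leb, size_t leb_size,
                                 size_t corrupt_offs, size_t max_write_size)
{
    size_t start;

    assert(max_write_size > 0 && corrupt_offs < leb_size);
    /* First unit boundary strictly after corrupt_offs. */
    start = (corrupt_offs + max_write_size) / max_write_size * max_write_size;
    for (size_t i = start; i < leb_size; i++)
        if (leb[i] != 0xFF)
            return false;
    return true;
}
```

Using the min. I/O unit here instead would start the scan inside a larger interrupted write, where leftover non-0xFF bytes could make an unfinished last write look like mid-LEB corruption.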
    • A
      UBIFS: use max_write_size for write-buffers · 6c7f74f7
      Artem Bityutskiy committed
      Switch write-buffers from 'c->min_io_size' to 'c->max_write_size' which
      presumably has to be more write speed-efficient. However, when a write-buffer
      is synchronized, write only the min. I/O units which contain the
      data, do not write the whole write-buffer. This is more space-efficient.
      
      Additionally, this patch takes into account that the LEB might not start
      from the max. write unit-aligned address.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      6c7f74f7
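The space-saving rule for synchronization can be expressed as a rounding step: the number of bytes written is the used part of the write-buffer rounded up to a whole number of min. I/O units, not the full buffer size. The helper below is a hypothetical illustration of that arithmetic only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes to write when synchronizing a write-buffer that holds 'used'
 * bytes of data: 'used' rounded up to a multiple of min_io_size.
 */
static size_t sync_write_len(size_t used, size_t min_io_size)
{
    assert(min_io_size > 0);
    return (used + min_io_size - 1) / min_io_size * min_io_size;
}
```

For example, with a 512-byte min. I/O unit and a 2048-byte write-buffer holding 100 bytes, this yields 512 bytes to write, leaving the remaining 1536 bytes of the max. write unit unwritten.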
    • A
      UBIFS: introduce write-buffer size field · 3c89f396
      Artem Bityutskiy committed
      Currently we assume write-buffer size is always min_io_size. But
      this is about to change and write-buffers may be of variable size.
      Namely, they will be of max_write_size at the beginning, but will
      get smaller when we are approaching the end of the LEB.
      
      This is a preparation patch which introduces 'size' field in
      the write-buffer structure which carries the current write-buffer
      size.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      3c89f396
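One way to picture the variable write-buffer size is the sketch below: the buffer spans a full max. write unit unless less than that remains in the LEB, in which case it shrinks to the remaining space. The function is a hypothetical illustration of the sizing rule stated in the message and ignores any alignment adjustments.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Size of the write-buffer positioned at offset 'offs' of a LEB:
 * max_write_size normally, less when the end of the LEB is closer.
 */
static size_t wbuf_size_sketch(size_t leb_size, size_t offs,
                               size_t max_write_size)
{
    size_t left;

    assert(offs <= leb_size);
    left = leb_size - offs;
    return left >= max_write_size ? max_write_size : left;
}
```

Carrying the result in a per-buffer 'size' field, as this preparation patch does, lets the rest of the code stop assuming a fixed buffer size.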
    • A
      UBI: incorporate LEB offset information · ca2ec61d
      Artem Bityutskiy committed
      Incorporate the LEB offset information into UBIFS. We'll use this
      information in one of the next patches to figure out what are the
      max. write size offsets relative to the PEB. So this patch is just
      a preparation.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      ca2ec61d