1. 21 12月, 2012 1 次提交
  2. 21 9月, 2012 1 次提交
  3. 06 5月, 2012 1 次提交
  4. 02 4月, 2012 1 次提交
    • P
      logfs: destroy the reserved inodes while unmounting · d2dcd908
      Prasad Joshi 提交于
      We were assuming that the evict_inode() would never be called on
      reserved inodes. However, (after the commit 8e22c1a4 logfs: get rid
      of magical inodes) while unmounting the file system, in put_super, we
      call iput() on all of the reserved inodes.
      
      The following simple test used to cause a kernel panic on LogFS:
      
      1. Mount a LogFS file system on /mnt
      
      2. Create a file
         $ touch /mnt/a
      
      3. Try to unmount the FS
         $ umount /mnt
      
      The simple fix would be to drop the assumption and properly destroy
      the reserved inodes.
      Signed-off-by: NPrasad Joshi <prasadjoshi.linux@gmail.com>
      d2dcd908
  5. 20 3月, 2012 1 次提交
  6. 28 1月, 2012 4 次提交
    • J
      logfs: Grow inode in delete path · bbe01387
      Joern Engel 提交于
      Can be necessary if an inode gets deleted (through -ENOSPC) before being
      written.  Might be better to move this into logfs_write_rec(), but for
      now go with the stupid&safe patch.
      Signed-off-by: NJoern Engel <joern@logfs.org>
      bbe01387
    • P
      logfs: Propagate page parameter to __logfs_write_inode · 0bd90387
      Prasad Joshi 提交于
      During GC LogFS has to rewrite each valid block to a separate segment.
      Rewrite operation reads data from an old segment and writes it to a
      newly allocated segment. Since every write operation changes data
      block pointers maintained in inode, inode should also be rewritten.
      
      In GC path to avoid AB-BA deadlock LogFS marks a page with
      PG_pre_locked in addition to locking the page (PG_locked). The page
      lock is ignored iff the page is pre-locked.
      
      LogFS uses a special file called segment file. The segment file
      maintains an 8 bytes entry for every segment. It keeps track of erase
      count, level etc. for every segment.
      
      Bad things happen with a segment belonging to the segment file is GCed
      
       ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/readwrite.c:297!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: logfs joydev usbhid hid psmouse e1000 i2c_piix4
      		serio_raw [last unloaded: logfs]
      Pid: 20161, comm: mount Not tainted 3.1.0-rc3+ #3 innotek GmbH
      		VirtualBox
      EIP: 0060:[<f809132a>] EFLAGS: 00010292 CPU: 0
      EIP is at logfs_lock_write_page+0x6a/0x70 [logfs]
      EAX: 00000027 EBX: f73f5b20 ECX: c16007c8 EDX: 00000094
      ESI: 00000000 EDI: e59be6e4 EBP: c7337b28 ESP: c7337b18
      DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process mount (pid: 20161, ti=c7336000 task=eb323f70 task.ti=c7336000)
      Stack:
      f8099a3d c7337b24 f73f5b20 00001002 c7337b50 f8091f6d f8099a4d f80994e4
      00000003 00000000 c7337b68 00000000 c67e4400 00001000 c7337b80 f80935e5
      00000000 00000000 00000000 00000000 e1fcf000 0000000f e59be618 c70bf900
      Call Trace:
      [<f8091f6d>] logfs_get_write_page.clone.16+0xdd/0x100 [logfs]
      [<f80935e5>] logfs_mod_segment_entry+0x55/0x110 [logfs]
      [<f809460d>] logfs_get_segment_entry+0x1d/0x20 [logfs]
      [<f8091060>] ? logfs_cleanup_journal+0x50/0x50 [logfs]
      [<f809521b>] ostore_get_erase_count+0x1b/0x40 [logfs]
      [<f80965b8>] logfs_open_area+0xc8/0x150 [logfs]
      [<c141a7ec>] ? kmemleak_alloc+0x2c/0x60
      [<f809668e>] __logfs_segment_write.clone.16+0x4e/0x1b0 [logfs]
      [<c10dd563>] ? mempool_kmalloc+0x13/0x20
      [<c10dd563>] ? mempool_kmalloc+0x13/0x20
      [<f809696f>] logfs_segment_write+0x17f/0x1d0 [logfs]
      [<f8092e8c>] logfs_write_i0+0x11c/0x180 [logfs]
      [<f8092f35>] logfs_write_direct+0x45/0x90 [logfs]
      [<f80934cd>] __logfs_write_buf+0xbd/0xf0 [logfs]
      [<c102900e>] ? kmap_atomic_prot+0x4e/0xe0
      [<f809424b>] logfs_write_buf+0x3b/0x60 [logfs]
      [<f80947a9>] __logfs_write_inode+0xa9/0x110 [logfs]
      [<f8094cb0>] logfs_rewrite_block+0xc0/0x110 [logfs]
      [<f8095300>] ? get_mapping_page+0x10/0x60 [logfs]
      [<f8095aa0>] ? logfs_load_object_aliases+0x2e0/0x2f0 [logfs]
      [<f808e57d>] logfs_gc_segment+0x2ad/0x310 [logfs]
      [<f808e62a>] __logfs_gc_once+0x4a/0x80 [logfs]
      [<f808ed43>] logfs_gc_pass+0x683/0x6a0 [logfs]
      [<f8097a89>] logfs_mount+0x5a9/0x680 [logfs]
      [<c1126b21>] mount_fs+0x21/0xd0
      [<c10f6f6f>] ? __alloc_percpu+0xf/0x20
      [<c113da41>] ? alloc_vfsmnt+0xb1/0x130
      [<c113db4b>] vfs_kern_mount+0x4b/0xa0
      [<c113e06e>] do_kern_mount+0x3e/0xe0
      [<c113f60d>] do_mount+0x34d/0x670
      [<c10f2749>] ? strndup_user+0x49/0x70
      [<c113fcab>] sys_mount+0x6b/0xa0
      [<c142d87c>] syscall_call+0x7/0xb
      Code: f8 e8 8b 93 39 c9 8b 45 f8 3e 0f ba 28 00 19 d2 85 d2 74 ca eb d0 0f 0b 8d 45 fc 89 44 24 04 c7 04 24 3d 9a 09 f8 e8 09 92 39 c9 <0f> 0b 8d 74 26 00 55 89 e5 3e 8d 74 26 00 8b 10 80 e6 01 74 09
      EIP: [<f809132a>] logfs_lock_write_page+0x6a/0x70 [logfs] SS:ESP 0068:c7337b18
      ---[ end trace 96e67d5b3aa3d6ca ]---
      
      The patch passes locked page to __logfs_write_inode. It calls function
      logfs_get_wblocks() to pre-lock the page. This ensures any further
      attempts to lock the page are ignored (esp from get_erase_count).
      Acked-by: NJoern Engel <joern@logfs.org>
      Signed-off-by: NPrasad Joshi <prasadjoshi.linux@gmail.com>
      0bd90387
    • P
      logfs: take write mutex lock during fsync and sync · 13ced29c
      Prasad Joshi 提交于
      LogFS uses super->s_write_mutex while writing data to disk. Taking the
      same mutex lock in sync and fsync code path solves the following BUG:
      
      ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/dev_bdev.c:134!
      
      Pid: 2387, comm: flush-253:16 Not tainted 3.0.0+ #4 Bochs Bochs
      RIP: 0010:[<ffffffffa007deed>]  [<ffffffffa007deed>]
                      bdev_writeseg+0x25d/0x270 [logfs]
      Call Trace:
      [<ffffffffa007c381>] logfs_open_area+0x91/0x150 [logfs]
      [<ffffffff8128dcb2>] ? find_level.clone.9+0x62/0x100
      [<ffffffffa007c49c>] __logfs_segment_write.clone.20+0x5c/0x190 [logfs]
      [<ffffffff810ef005>] ? mempool_kmalloc+0x15/0x20
      [<ffffffff810ef383>] ? mempool_alloc+0x53/0x130
      [<ffffffffa007c7a4>] logfs_segment_write+0x1d4/0x230 [logfs]
      [<ffffffffa0078f8e>] logfs_write_i0+0x12e/0x190 [logfs]
      [<ffffffffa0079300>] __logfs_write_rec+0x140/0x220 [logfs]
      [<ffffffffa0079444>] logfs_write_rec+0x64/0xd0 [logfs]
      [<ffffffffa00795b6>] __logfs_write_buf+0x106/0x110 [logfs]
      [<ffffffffa007a13e>] logfs_write_buf+0x4e/0x80 [logfs]
      [<ffffffffa0073e33>] __logfs_writepage+0x23/0x80 [logfs]
      [<ffffffffa007410c>] logfs_writepage+0xdc/0x110 [logfs]
      [<ffffffff810f5ba7>] __writepage+0x17/0x40
      [<ffffffff810f6208>] write_cache_pages+0x208/0x4f0
      [<ffffffff810f5b90>] ? set_page_dirty+0x70/0x70
      [<ffffffff810f653a>] generic_writepages+0x4a/0x70
      [<ffffffff810f75d1>] do_writepages+0x21/0x40
      [<ffffffff8116b9d1>] writeback_single_inode+0x101/0x250
      [<ffffffff8116bdbd>] writeback_sb_inodes+0xed/0x1c0
      [<ffffffff8116c5fb>] writeback_inodes_wb+0x7b/0x1e0
      [<ffffffff8116cc23>] wb_writeback+0x4c3/0x530
      [<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
      [<ffffffff8116cd6b>] wb_do_writeback+0xdb/0x290
      [<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
      [<ffffffff814d6208>] ? _raw_spin_unlock_irqrestore+0x18/0x40
      [<ffffffff8105aa5a>] ? del_timer+0x8a/0x120
      [<ffffffff8116cfac>] bdi_writeback_thread+0x8c/0x2e0
      [<ffffffff8116cf20>] ? wb_do_writeback+0x290/0x290
      [<ffffffff8106d2e6>] kthread+0x96/0xa0
      [<ffffffff814de514>] kernel_thread_helper+0x4/0x10
      [<ffffffff8106d250>] ? kthread_worker_fn+0x190/0x190
      [<ffffffff814de510>] ? gs_change+0xb/0xb
      RIP  [<ffffffffa007deed>] bdev_writeseg+0x25d/0x270 [logfs]
      ---[ end trace 0211ad60a57657c4 ]---
      Reviewed-by: NJoern Engel <joern@logfs.org>
      Signed-off-by: NPrasad Joshi <prasadjoshi.linux@gmail.com>
      13ced29c
    • P
      logfs: update page reference count for pined pages · 96150606
      Prasad Joshi 提交于
      LogFS sets PG_private flag to indicate a pined page. We assumed that
      marking a page as private is enough to ensure its existence. But
      instead it is necessary to hold a reference count to the page.
      
      The change resolves the following BUG
      
      BUG: Bad page state in process flush-253:16  pfn:6a6d0
      page flags: 0x100000000000808(uptodate|private)
      Suggested-and-Acked-by: NJoern Engel <joern@logfs.org>
      Signed-off-by: NPrasad Joshi <prasadjoshi.linux@gmail.com>
      96150606
  7. 02 11月, 2011 1 次提交
  8. 10 4月, 2011 1 次提交
  9. 31 3月, 2011 1 次提交
  10. 23 12月, 2010 1 次提交
    • P
      logfs: fix "Kernel BUG at readwrite.c:1193" · f06328d7
      Prasad Joshi 提交于
      This happens when __logfs_create() tries to write a new inode to the disk
      which is full.
      
      __logfs_create() associates the transaction pointer with inode.  During
      the logfs_write_inode() function call chain this transaction pointer is
      moved from inode to page->private using function move_inode_to_page
      (do_write_inode() -> inode_to_page() -> move_inode_to_page)
      
      When the write inode fails, the transaction is aborted and iput is called
      on the failed inode.  During delete_inode the same transaction pointer
      associated with the page is getting used.  Thus causing kernel BUG.
      
      The patch checks for error in write_inode() and restores the page->private
      to NULL.
      
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=20162Signed-off-by: NPrasad Joshi <prasadjoshi124@gmail.com>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Florian Mickler <florian@mickler.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f06328d7
  11. 10 8月, 2010 2 次提交
  12. 05 5月, 2010 1 次提交
  13. 02 5月, 2010 1 次提交
  14. 29 4月, 2010 1 次提交
  15. 21 4月, 2010 1 次提交
  16. 15 4月, 2010 1 次提交
  17. 13 4月, 2010 1 次提交
    • J
      [LogFS] Prevent memory corruption on large deletes · 032d8f72
      Joern Engel 提交于
      Removing sufficiently large files would create aliases for a large
      number of segments.  This in turn results in a large number of journal
      entries and an overflow of s_je_array.
      
      Cheap fix is to add a BUG_ON, turning memory corruption into something
      annoying, but less dangerous.  Real fix is to count the number of
      affected segments and prevent the problem completely.
      Signed-off-by: NJoern Engel <joern@logfs.org>
      032d8f72
  18. 31 3月, 2010 1 次提交
  19. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  20. 28 3月, 2010 1 次提交
  21. 05 3月, 2010 1 次提交
  22. 21 11月, 2009 1 次提交