1. 28 May, 2013 19 commits
    • J
      f2fs: add f2fs_readonly() · 77888c1e
      Committed by Jaegeuk Kim
      Introduce a simple macro function for readability.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: avoid RECLAIM_FS-ON-W: deadlock · 6f85b352
      Committed by Jaegeuk Kim
      This patch tries to avoid the following deadlock condition, in which the
      reclaim path can trigger f2fs_balance_fs again.
      
      =================================
      [ INFO: inconsistent lock state ]
      ---------------------------------
      inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      kswapd0/41 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (&sbi->gc_mutex){+.+.?.}, at: f2fs_balance_fs+0xe6/0x100 [f2fs]
      {RECLAIM_FS-ON-W} state was registered at:
        [<ffffffff810aa5a9>] mark_held_locks+0xb9/0x140
        [<ffffffff810aae85>] lockdep_trace_alloc+0x85/0xf0
        [<ffffffff8113ab2c>] __alloc_pages_nodemask+0x7c/0x9b0
        [<ffffffff81175aa8>] alloc_pages_current+0xb8/0x180
        [<ffffffff811319cf>] __page_cache_alloc+0xaf/0xd0
        [<ffffffff8113225c>] find_or_create_page+0x4c/0xb0
        [<ffffffffa021359e>] find_data_page+0x14e/0x210 [f2fs]
        [<ffffffffa021161b>] f2fs_gc+0x9eb/0xd90 [f2fs]
        [<ffffffffa0218fae>] f2fs_balance_fs+0xee/0x100 [f2fs]
        [<ffffffffa020848c>] f2fs_setattr+0x6c/0x200 [f2fs]
        [<ffffffff811ae51b>] notify_change+0x1db/0x3a0
        [<ffffffff8118fbd0>] do_truncate+0x60/0xa0
        [<ffffffff8118fd95>] vfs_truncate+0x185/0x1b0
        [<ffffffff8118fe1c>] do_sys_truncate+0x5c/0xa0
        [<ffffffff8118ffee>] SyS_truncate+0xe/0x10
        [<ffffffff816e2b42>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: don't do checkpoint if error is occurred · 2c2c149f
      Committed by Jaegeuk Kim
      If we meet an error during dentry recovery, we should not conduct a checkpoint.
      Otherwise, some erroneous dentry blocks overwrite the existing blocks that
      contain the remaining recovery information.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: fix to unlock page before exit · 45856aff
      Committed by Jaegeuk Kim
      If we get an error after lock_page, we should unlock the page before exiting.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
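The rule in this commit message (unlock on every exit path once the lock is taken) can be sketched in plain C. This is only an illustration under stated assumptions: `struct sketch_page`, `sketch_lock_page()`, `sketch_unlock_page()` and `sketch_read_page()` are hypothetical userspace stand-ins, not the kernel's page API or the actual f2fs change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page with a lock bit (not the kernel struct). */
struct sketch_page {
	bool locked;
	bool uptodate;
};

static void sketch_lock_page(struct sketch_page *p)
{
	assert(!p->locked);	/* single-threaded sketch: no contention */
	p->locked = true;
}

static void sketch_unlock_page(struct sketch_page *p)
{
	assert(p->locked);
	p->locked = false;
}

/*
 * Returns 0 on success and -1 on failure. Every path taken after
 * sketch_lock_page() leaves through out_unlock, so the page is never
 * returned to the caller while still locked.
 */
static int sketch_read_page(struct sketch_page *p, bool io_fails)
{
	int err = 0;

	sketch_lock_page(p);
	if (io_fails) {
		err = -1;
		goto out_unlock;
	}
	p->uptodate = true;
out_unlock:
	sketch_unlock_page(p);
	return err;
}
```

Calling `sketch_read_page()` with `io_fails` set returns -1 and still leaves `locked` cleared, which is the property the fix restores.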
    • J
      f2fs: remove unnecessary kmap/kunmap operations · 9a55ed65
      Committed by Jaegeuk Kim
      The page allocated for recovery is not in HIGHMEM, so we don't need to use
      kmap/kunmap.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • N
      f2fs: reorganize f2fs_vm_page_mkwrite · 9851e6e1
      Committed by Namjae Jeon
      A few things can be changed in the default mkwrite function:
      1) Call file_update_time at the start, before acquiring any lock.
      2) The condition page_offset(page) >= i_size_read(inode) should be
       changed to page_offset(page) > i_size_read(inode).
      3) Move wait_on_page_writeback.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • M
      f2fs: use list_for_each_entry rather than list_for_each_entry_safe · 145b04e5
      Committed by majianpeng
      We can do this because we now use a global mutex, f2fs_stat_mutex, to protect
      its list operations.
      Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
      [Jaegeuk Kim: add description]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • H
      f2fs: remove unecessary variable and code · 81fb5e87
      Committed by Haicheng Li
      Code cleanup without behavior change.
      Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • P
      f2fs, lockdep: annotate mutex_lock_all() · bfe35965
      Committed by Peter Zijlstra
      Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all()
      acquires an array of locks (in global/local lock style).
      
      Any such operation is always serialized using cp_mutex, therefore there is no
      fs_lock[] lock-order issue; tell lockdep about this using the
      mutex_lock_nest_lock() primitive.
      Reported-by: majianpeng <majianpeng@gmail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: add debug msgs in the recovery routine · f356fe0c
      Committed by Jaegeuk Kim
      This patch adds some trivial debugging messages in the recovery process.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: update inode page after creation · 44a83ff6
      Committed by Jaegeuk Kim
      I found a bug when testing power-off-recovery as follows.
      
      [Bug Scenario]
      1. create a file
      2. fsync the file
      3. reboot w/o any sync
      4. try to recover the file
       - found its fsync mark
       - found its dentry mark
         : try to recover its dentry
          - get its file name
          - get its parent inode number
           : here we got zero value
      
      We get the wrong parent inode number because the inode page was not fully
      synchronized with the newly created inode information.

      In particular, f2fs previously stored fi->i_pino and wrote it to the cached
      node page in the wrong order, which leaves a zero-valued i_pino during
      recovery.

      So, this patch modifies the creation flow to fix the order in which the inode
      page is synchronized with its inode.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: change get_new_data_page to pass a locked node page · 64aa7ed9
      Committed by Jaegeuk Kim
      This patch is for passing a locked node page to get_dnode_of_data.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: skip get_node_page if locked node page is passed · 1646cfac
      Committed by Jaegeuk Kim
      If get_dnode_of_data gets a locked node page, let's skip redundant
      get_node_page calls.
      This is for further enhancement.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: remove unnecessary por_doing check · 0a364af1
      Committed by Jaegeuk Kim
      This por_doing check is not related to the recovery process at all.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: fix BUG_ON during f2fs_evict_inode(dir) · 74d0b917
      Committed by Jaegeuk Kim
      During the dentry recovery routine, recover_inode() triggers __f2fs_add_link
      with its directory inode.
      
      In the following scenario, a bug is captured.
       1. dir = f2fs_iget(pino)
       2. __f2fs_add_link(dir, name)
       3. iput(dir)
        -> f2fs_evict_inode() hits BUG_ON(atomic_read(fi->dirty_dents))
      
      Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable]
      [<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs]
      Call Trace:
       [<ffffffff8118ea00>] evict+0xb0/0x1b0
       [<ffffffff8118f1c5>] iput+0x105/0x190
       [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs]
       [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0
       [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0
       [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70
       [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140
       [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0
       [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30
       [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70
      
      This means that we should flush all the dentry pages between iget and iput().
      However, that is not allowed during the recovery routine for consistency
      reasons, so we have to wait until the whole recovery process finishes.
      After that, write_checkpoint flushes all the dirty dentry blocks, and we can
      safely put the stale dir inodes from the dirty_dir_inode_list.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: fix por_doing variable coverage · 8c26d7d5
      Committed by Jaegeuk Kim
      The reason for using sbi->por_doing is to reduce data writes during
      recovery.
      find_fsync_dnodes() produces some dirty dentry pages, so we should
      cover it with sbi->por_doing too.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: remove redundant assignment · addbe45b
      Committed by Jaegeuk Kim
      We don't need to assign a value redundantly.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: fix the inconsistent state of data pages · 650495de
      Committed by Jaegeuk Kim
      In get_lock_data_page, if there is a data race between get_dnode_of_data for
      the node and grab_cache_page for the data, f2fs can hit the following
      BUG_ON(dn.data_blkaddr == NEW_ADDR).
      
      kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/data.c:251!
       [<ffffffffa044966c>] get_lock_data_page+0x1ec/0x210 [f2fs]
      Call Trace:
       [<ffffffffa043b089>] f2fs_readdir+0x89/0x210 [f2fs]
       [<ffffffff811a0920>] ? fillonedir+0x100/0x100
       [<ffffffff811a0920>] ? fillonedir+0x100/0x100
       [<ffffffff811a07f8>] vfs_readdir+0xb8/0xe0
       [<ffffffff811a0b4f>] sys_getdents+0x8f/0x110
       [<ffffffff816d7999>] system_call_fastpath+0x16/0x1b
      
      This bug can occur when the block address of the data block is changed after
      f2fs_put_dnode().
      To avoid that, this patch fixes the lock order of node and data blocks so
      that the node block lock is covered by the data block lock.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • J
      f2fs: fix inconsistency of block count during recovery · 65e5cd0a
      Committed by Jaegeuk Kim
      Currently f2fs recovers the dentry of fsynced files.
      When power-off-recovery is conducted, the newly recovered inode should increase
      the node block count as well as the inode block count.

      This patch resolves the inconsistency that arises as follows:
      
      1. create a file
      2. write data
      3. fsync
      4. reboot without sync
      5. mount and recover the file
      6. node block count is 1 and inode block count is 2
       : fall into the inconsistent state
      7. unlink the file
       : trigger the following BUG_ON
      
      ------------[ cut here ]------------
      kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/f2fs.h:716!
      Call Trace:
       [<ffffffffa0344100>] ? get_node_page+0x50/0x1a0 [f2fs]
       [<ffffffffa0344bfc>] remove_inode_page+0x8c/0x100 [f2fs]
       [<ffffffffa03380f0>] ? f2fs_evict_inode+0x180/0x2d0 [f2fs]
       [<ffffffffa033812e>] f2fs_evict_inode+0x1be/0x2d0 [f2fs]
       [<ffffffff811c7a67>] evict+0xa7/0x1a0
       [<ffffffff811c82b5>] iput+0x105/0x190
       [<ffffffff811c2b30>] d_kill+0xe0/0x120
       [<ffffffff811c2c57>] dput+0xe7/0x1e0
       [<ffffffff811acc3d>] __fput+0x19d/0x2d0
       [<ffffffff811acd7e>] ____fput+0xe/0x10
       [<ffffffff81070645>] task_work_run+0xb5/0xe0
       [<ffffffff81002941>] do_notify_resume+0x71/0xb0
       [<ffffffff8175f14a>] int_signal+0x12/0x17
      Reported-and-Tested-by: Chris Fries <C.Fries@motorola.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  2. 27 May, 2013 6 commits
    • L
      Linux 3.10-rc3 · e4aa937e
      Committed by Linus Torvalds
    • M
      ipc/sem.c: Fix missing wakeups in do_smart_update_queue() · ab465df9
      Committed by Manfred Spraul
      do_smart_update_queue() is called when an operation (semop,
      semctl(SETVAL), semctl(SETALL), ...) modified the array.  It must check
      which of the sleeping tasks can proceed.
      
      do_smart_update_queue() missed a few wakeups:
       - if a sleeping complex op was completed, then all per-semaphore queues
         must be scanned - not only those that were modified by *sops
       - if a sleeping simple op proceeded, then the global queue must be
         scanned again
      
      And:
       - the test for "|| sops == NULL" before scanning the global queue is not
         required: If the global queue is empty, then it doesn't need to be
         scanned - regardless of the reason for calling do_smart_update_queue()
      
      The patch is not optimized, i.e.  even completing a wait-for-zero
      operation causes a rescan.  This is done to keep the patch as simple as
      possible.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • L
      Merge tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 89ff7783
      Committed by Linus Torvalds
      Pull NFS client bugfixes from Trond Myklebust:
      
       - Stable fix to prevent an rpc_task wakeup race
       - Fix a NFSv4.1 session drain deadlock
       - Fix a NFSv4/v4.1 mount regression when not running rpc.gssd
       - Ensure auth_gss pipe detection works in namespaces
       - Fix SETCLIENTID fallback if rpcsec_gss is not available
      
      * tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFS: Fix SETCLIENTID fallback if GSS is not available
        SUNRPC: Prevent an rpc_task wakeup race
        NFSv4.1 Fix a pNFS session draining deadlock
        SUNRPC: Convert auth_gss pipe detection to work in namespaces
        SUNRPC: Faster detection if gssd is actually running
        SUNRPC: Fix a bug in gss_create_upcall
    • L
      Merge tag 'edac_fixes_for_3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · 932ff06b
      Committed by Linus Torvalds
      Pull amd64 edac fix from Borislav Petkov:
       "A sysfs file permissions correction"
      
      * tag 'edac_fixes_for_3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
        amd64_edac: Fix bogus sysfs file permissions
    • L
      Merge branch 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 95f4838e
      Committed by Linus Torvalds
      Pull parisc fixes from Helge Deller:
       "This time we made the kernel- and interruption stack allocation
        reentrant which fixed some strange kernel crashes (specifically
        protection ID traps).
      
        Furthermore this patchset fixes the interrupt stack in UP and SMP
        configurations by using native locking instructions.  And finally
        usage of floating point calculations on parisc were disabled in the
        MPILIB."
      
      * 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: fix irq stack on UP and SMP
        parisc/superio: Use module_pci_driver to register driver
        parisc: make interrupt and interruption stack allocation reentrant
        parisc: show number of FPE and unaligned access handler calls in /proc/interrupts
        parisc: add additional parisc git tree to MAINTAINERS file
        parisc: use PAGE_SHIFT instead of hardcoded value 12 in pacache.S
        parisc: add rp5470 entry to machine database
        MPILIB: disable usage of floating point registers on parisc
    • L
      Merge tag 'for-linus-v3.10-rc3' of git://oss.sgi.com/xfs/xfs · 088d812f
      Committed by Linus Torvalds
      Pull xfs fixes from Ben Myers:
       "Here are fixes for corruption on 512 byte filesystems, a rounding
        error, a use-after-free, some flags to fix lockdep reports, and
        several fixes related to CRCs.  We have a somewhat larger post -rc1
        queue than usual due to fixes related to the CRC feature we merged for
        3.10:
      
         - Fix for corruption with FSX on 512 byte blocksize filesystems
         - Fix rounding error in xfs_free_file_space
         - Fix use-after-free with extent free intents
         - Add several missing KM_NOFS flags to fix lockdep reports
         - Several fixes for CRC related code"
      
      * tag 'for-linus-v3.10-rc3' of git://oss.sgi.com/xfs/xfs:
        xfs: remote attribute lookups require the value length
        xfs: xfs_attr_shortform_allfit() does not handle attr3 format.
        xfs: xfs_da3_node_read_verify() doesn't handle XFS_ATTR3_LEAF_MAGIC
        xfs: fix missing KM_NOFS tags to keep lockdep happy
        xfs: Don't reference the EFI after it is freed
        xfs: fix rounding in xfs_free_file_space
        xfs: fix sub-page blocksize data integrity writes
  3. 26 May, 2013 6 commits
  4. 25 May, 2013 9 commits
    • V
      ARC: lazy dcache flush broke gdb in non-aliasing configs · 7bb66f6e
      Committed by Vineet Gupta
      gdbserver inserting a breakpoint ends up calling copy_user_page() for a
      code page. The generic version of which (non-aliasing config) didn't set
      the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for
      corresponding dynamic loader code page - causing garbage to be executed.
      
      So now aliasing versions of copy_user_highpage()/clear_page() are made
      default. There is no significant overhead since all of special alias
      handling code is compiled out for non-aliasing builds.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • L
      Merge branch 'akpm' (incoming from Andrew Morton) · 9cf18482
      Committed by Linus Torvalds
      Merge fixes from Andrew Morton:
       "A bunch of fixes and one simple fbdev driver which missed the merge
        window because people were still talking about it (to no great
        effect)."
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (30 commits)
        aio: fix kioctx not being freed after cancellation at exit time
        mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas
        drivers/rtc/rtc-max8998.c: check for pdata presence before dereferencing
        ocfs2: goto out_unlock if ocfs2_get_clusters_nocache() failed in ocfs2_fiemap()
        random: fix accounting race condition with lockless irq entropy_count update
        drivers/char/random.c: fix priming of last_data
        mm/memory_hotplug.c: fix printk format warnings
        nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary
        drivers/block/brd.c: fix brd_lookup_page() race
        fbdev: FB_GOLDFISH should depend on HAS_DMA
        drivers/rtc/rtc-pl031.c: pass correct pointer to free_irq()
        auditfilter.c: fix kernel-doc warnings
        aio: fix io_getevents documentation
        revert "selftest: add simple test for soft-dirty bit"
        drivers/leds/leds-ot200.c: fix error caused by shifted mask
        mm/THP: use pmd_populate() to update the pmd with pgtable_t pointer
        linux/kernel.h: fix kernel-doc warning
        mm compaction: fix of improper cache flush in migration code
        rapidio/tsi721: fix bug in MSI interrupt handling
        hfs: avoid crash in hfs_bnode_create
        ...
    • L
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 00cec111
      Committed by Linus Torvalds
      Pull ARM SoC fixes from Olof Johansson:
       "We didn't have any fixes sent up for -rc2, so this is a slightly
        larger batch.  A bit all over the place platform-wise; OMAP, at91,
        marvell, renesas, sunxi, ux500, etc.
      
        I tried to summarize highlights but there isn't a whole lot to point
        out.  Lots of little things fixed all over.  A couple of defconfig
        updates due to new/changing options."
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (44 commits)
        ARM: at91/sama5: fix incorrect PMC pcr div definition
        ARM: at91/dt: fix macb pinctrl_macb_rmii_mii_alt definition
        ARM: at91: at91sam9n12: move external irq declatation to DT
        ARM: shmobile: marzen: Use error values in usb_power_*
        ARM: tegra: defconfig fixes
        ARM: nomadik: fix IRQ assignment for SMC ethernet
        ARM: vt8500: Add missing NULL terminator in dt_compat
        clk: tegra: add ac97 controller clock
        clk: tegra: remove USB from clk init table
        ARM: dts: mvebu: Fix wrong the address reg value for the L2-cache node
        ARM: plat-orion: Fix num_resources and id for ge10 and ge11
        ARM: OMAP2+: hwmod: Remove sysc slave idle and auto idle apis
        SERIAL: OMAP: Remove the slave idle handling from the driver
        ARM: OMAP2+: serial: Remove the un-used slave idle hooks
        ARM: OMAP2+: hwmod-data: UART IP needs software control to manage sidle modes
        ARM: OMAP2+: hwmod: Add a new flag to handle SIDLE in SWSUP only in active
        ARM: OMAP2+: hwmod: Fix sidle programming in _enable_sysc()/_idle_sysc()
        arm: mvebu: fix the 'ranges' property to handle PCIe
        ARM: mvebu: select ARCH_REQUIRE_GPIOLIB for mvebu platform
        ARM: AM33XX: Add missing .clkdm_name to clkdiv32k_ick clock
        ...
    • B
      aio: fix kioctx not being freed after cancellation at exit time · 03e04f04
      Committed by Benjamin LaHaise
      The recent changes overhauling fs/aio.c introduced a bug that results in
      the kioctx not being freed when outstanding kiocbs are cancelled at
      exit_aio() time.  Specifically, a kiocb that is cancelled has its
      completion events discarded by batch_complete_aio(), which then fails to
      wake up the process stuck in free_ioctx().  Fix this by modifying the
      wait_event() condition in free_ioctx() appropriately.
      
      This patch was tested with the cancel operation in the thread based code
      posted yesterday.
      
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Josh Boyer <jwboyer@redhat.com>
      Cc: Zach Brown <zab@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • C
      mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas · a9ff785e
      Committed by Cliff Wickman
      A panic can be caused by simply cat'ing /proc/<pid>/smaps while an
      application has a VM_PFNMAP range.  It happened in-house when a
      benchmarker was trying to decipher the memory layout of his program.
      
      /proc/<pid>/smaps and similar walks through a user page table should not
      be looking at VM_PFNMAP areas.
      
      Certain tests in walk_page_range() (specifically split_huge_page_pmd())
      assume that all the mapped PFN's are backed with page structures.  And
      this is not usually true for VM_PFNMAP areas.  This can result in panics
      on kernel page faults when attempting to address those page structures.
      
      There are a half dozen callers of walk_page_range() that walk through a
      task's entire page table (as N.  Horiguchi pointed out).  So rather than
      change all of them, this patch changes just walk_page_range() to ignore
      VM_PFNMAP areas.
      
      The logic of hugetlb_vma() is moved back into walk_page_range(), as we
      want to test any vma in the range.
      
      VM_PFNMAP areas are used by:
      - graphics memory manager   gpu/drm/drm_gem.c
      - global reference unit     sgi-gru/grufile.c
      - sgi special memory        char/mspec.c
      - and probably several out-of-tree modules
      
      [akpm@linux-foundation.org: remove now-unused hugetlb_vma() stub]
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Sterba <dsterba@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
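The idea of skipping VM_PFNMAP areas inside the walker itself, rather than in each caller, can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `struct sketch_vma`, the `SKETCH_VM_PFNMAP` flag value and `sketch_walk_range()` are hypothetical and only mimic the shape of the check, not the real walk_page_range().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_VM_PFNMAP 0x1UL	/* hypothetical flag value for this sketch */

/* Hypothetical, simplified mapping descriptor (singly linked list). */
struct sketch_vma {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	unsigned long flags;
	struct sketch_vma *next;
};

/*
 * Count the pages in [start, end) that the walker would visit.
 * Mappings flagged SKETCH_VM_PFNMAP are skipped in one central place,
 * because their PFNs may have no page structure behind them.
 */
static unsigned long sketch_walk_range(const struct sketch_vma *vma,
				       unsigned long start, unsigned long end,
				       unsigned long page_size)
{
	unsigned long visited = 0;

	assert(page_size > 0);
	for (; vma; vma = vma->next) {
		unsigned long lo = vma->start > start ? vma->start : start;
		unsigned long hi = vma->end < end ? vma->end : end;

		if (lo >= hi)
			continue;	/* mapping does not overlap the range */
		if (vma->flags & SKETCH_VM_PFNMAP)
			continue;	/* ignore PFN-mapped areas entirely */
		visited += (hi - lo) / page_size;
	}
	return visited;
}
```

Putting the test in the walker means every caller that walks a whole address space gets the protection without being changed, which is the trade-off the commit message describes.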
    • T
      drivers/rtc/rtc-max8998.c: check for pdata presence before dereferencing · 43c523bf
      Committed by Tomasz Figa
      Currently the driver can crash with a NULL pointer dereference if no
      pdata is provided, despite successful registration of the MFD part.
      This patch fixes the problem by adding a NULL check before dereferencing
      the pdata pointer.
      Signed-off-by: Tomasz Figa <t.figa@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Sachin Kamat <sachin.kamat@linaro.org>
      Reviewed-by: Jingoo Han <jg1.han@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
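A minimal sketch of the guard described above, assuming a probe routine that reads an optional platform-data pointer. The types `struct sketch_pdata` and `struct sketch_dev`, the `rtc_delay` field as used here, and `sketch_probe()` are hypothetical stand-ins and do not reproduce the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical platform data; the field name is an assumption. */
struct sketch_pdata {
	bool rtc_delay;
};

/* Hypothetical device whose platform data may legitimately be absent. */
struct sketch_dev {
	const struct sketch_pdata *pdata;
};

/*
 * Decide whether a delay workaround is needed. The pdata pointer is
 * tested before it is dereferenced, so a device registered without
 * platform data simply gets the default (no workaround).
 */
static int sketch_probe(const struct sketch_dev *dev, bool *need_delay)
{
	assert(dev != NULL && need_delay != NULL);

	*need_delay = false;
	if (dev->pdata && dev->pdata->rtc_delay)
		*need_delay = true;
	return 0;
}
```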
    • J
      ocfs2: goto out_unlock if ocfs2_get_clusters_nocache() failed in ocfs2_fiemap() · b4ca2b4b
      Committed by Joseph Qi
      We previously found a lock/unlock bug in ocfs2_file_aio_write, so we
      did a thorough search of all lock resources in ocfs2_inode_info,
      including the rw, inode and open lockres, and found this bug.  My kernel
      version is 3.0.13, and the bug is also present in the latest version,
      3.9.  In ocfs2_fiemap, once ocfs2_get_clusters_nocache fails, it should
      goto out_unlock instead of out, because we need to release the buffer
      head, up the read alloc sem and unlock the inode.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Jie Liu <jeff.liu@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
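The difference between the two labels can be shown with a staged-cleanup sketch: `out` is only safe before anything has been acquired, while `out_unlock` releases what is held. The struct, its counters and `sketch_fiemap()` are hypothetical and merely echo the three resources named in the message (buffer head, alloc sem, inode lock); this is not the ocfs2 code.

```c
#include <assert.h>

/* Hypothetical counters standing in for the three held resources. */
struct sketch_inode {
	int inode_locked;
	int alloc_sem_readers;
	int bh_refs;
};

/*
 * Returns 0 on success, -1 if taking the inode lock fails, and -2 if
 * the later lookup fails. A failure after the resources are taken must
 * leave through out_unlock; jumping to out there would leak all three.
 */
static int sketch_fiemap(struct sketch_inode *in, int lock_fails,
			 int lookup_fails)
{
	int ret = 0;

	if (lock_fails) {
		ret = -1;
		goto out;		/* nothing acquired yet */
	}
	in->inode_locked = 1;
	in->bh_refs++;
	in->alloc_sem_readers++;

	if (lookup_fails) {
		ret = -2;
		goto out_unlock;	/* resources held: release them */
	}

out_unlock:
	assert(in->bh_refs > 0 && in->alloc_sem_readers > 0);
	in->bh_refs--;
	in->alloc_sem_readers--;
	in->inode_locked = 0;
out:
	return ret;
}
```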
    • J
      random: fix accounting race condition with lockless irq entropy_count update · 10b3a32d
      Committed by Jiri Kosina
      Commit 902c098a ("random: use lockless techniques in the interrupt
      path") turned IRQ path from being spinlock protected into lockless
      cmpxchg-retry update.
      
      That commit removed r->lock serialization between crediting entropy bits
      from IRQ context and accounting when extracting entropy on userspace
      read path, but didn't turn the r->entropy_count reads/updates in
      account() to use cmpxchg as well.
      
      It has been observed that under certain circumstances this leads to
      read() on /dev/urandom returning 0 (EOF), as r->entropy_count gets
      corrupted and becomes negative, which in turn results in propagating 0
      all the way from account() to the actual read() call.
      
      Convert the accounting code to be the proper lockless counterpart of
      what has been partially done by 902c098a.
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Greg KH <greg@kroah.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
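The lockless pattern discussed here, where both the crediting side and the accounting side update the counter with a compare-exchange retry loop, can be sketched with C11 atomics. This is a simplified userspace illustration, not the random.c implementation: `struct sketch_pool`, `sketch_credit()` and `sketch_account()` are hypothetical names, and the policy of granting at most the available whole bytes is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical pool; only the entropy counter matters for this sketch. */
struct sketch_pool {
	atomic_int entropy_count;	/* in bits */
};

/* Credit bits, capped at max_bits, using a compare-exchange retry loop. */
static void sketch_credit(struct sketch_pool *r, int nbits, int max_bits)
{
	int old = atomic_load(&r->entropy_count);
	int new;

	do {
		new = old + nbits;
		if (new > max_bits)
			new = max_bits;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&r->entropy_count, &old, new));
}

/*
 * Debit up to nbytes of entropy and return the bytes granted. The new
 * value is always computed from the snapshot that the compare-exchange
 * then validates, so a concurrent credit cannot be lost and the counter
 * cannot go negative.
 */
static size_t sketch_account(struct sketch_pool *r, size_t nbytes)
{
	int old = atomic_load(&r->entropy_count);
	int new;
	size_t granted;

	do {
		size_t avail = old > 0 ? (size_t)old / 8 : 0;

		granted = nbytes < avail ? nbytes : avail;
		new = old - (int)(granted * 8);
		assert(new >= 0);
	} while (!atomic_compare_exchange_weak(&r->entropy_count, &old, new));

	return granted;
}
```

A plain read-modify-write in `sketch_account()` would reintroduce the race from the message: a credit landing between the read and the write would be overwritten.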
    • J
      drivers/char/random.c: fix priming of last_data · 1e7e2e05
      Committed by Jarod Wilson
      Commit ec8f02da ("random: prime last_data value per fips
      requirements") added priming of last_data per fips requirements.
      
      Unfortunately, it did so in a way that can lead to multiple threads all
      incrementing nbytes, but only one actually doing anything with the extra
      data, which leads to some fun random corruption and panics.
      
      The fix is to simply do everything needed to prime last_data in a single
      shot, so there's no window for multiple cpus to increment nbytes -- in
      fact, we won't even increment or decrement nbytes anymore, we'll just
      extract the needed EXTRACT_SIZE one time per pool and then carry on with
      the normal routine.
      
      All these changes have been tested across multiple hosts and
      architectures where panics were previously encountered.  The code changes
      are strictly limited to areas only touched when booted in fips
      mode.
      
      This change should also go into 3.8-stable, to make the myriads of fips
      users on 3.8.x happy.
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Tested-by: Jan Stancek <jstancek@redhat.com>
      Tested-by: Jan Stodola <jstodola@redhat.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>