1. 25 5月, 2011 2 次提交
    • G
      mtd: create function to perform large allocations · 33b53716
      Grant Erickson 提交于
      Introduce a common function to handle large, contiguous kmalloc buffer
      allocations by exponentially backing off on the size of the requested
      kernel transfer buffer until it succeeds or until the requested
      transfer buffer size falls below the page size.
      
      This helps ensure the operation can succeed under low-memory, highly-
      fragmented situations albeit somewhat more slowly.
      
      Artem: so this patch solves the problem that the kernel tries to kmalloc too
      large buffers, which (a) may fail and does fail - people complain about this,
      and (b) slows down the system in case of high memory fragmentation, because
      the kernel starts dropping caches, writing back, swapping, etc. But we do not
      really have to allocate a lot of memory to do the I/O, we may do this even with
      as little as one min. I/O unit (NAND page) of RAM. So the idea of this patch is
      that if the user asks to read or write a lot, we try to kmalloc a lot, with GFP
      flags which make the kernel _not_ drop caches, etc. If we can allocate it - good,
      if not - we try to allocate twice as less, and so on, until we reach the min.
      I/O unit size, which is our last resort allocation and use the normal
      GFP_KERNEL flag.
      
      Artem: re-write the allocation function so that it makes sure the allocated
      buffer is aligned to the min. I/O size of the flash.
      Signed-off-by: NGrant Erickson <marathon96@gmail.com>
      Tested-by: NBen Gardiner <bengardiner@nanometrics.ca>
      Tested-by: NStefano Babic <sbabic@denx.de>
      Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      33b53716
    • B
      mtd: nand: renumber conflicting BBT flags · a626743f
      Brian Norris 提交于
      The NAND_USE_FLASH_BBT_NO_OOB and NAND_CREATE_EMPTY_BBT flags conflict
      with the NAND_BBT_SCANBYTE1AND6 and NAND_BBT_DYNAMICSTRUCT flags,
      respectively. This change will allow us to utilize these options
      independently.
      Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      a626743f
  2. 19 5月, 2011 2 次提交
    • G
      drivercore: revert addition of of_match to struct device · b1608d69
      Grant Likely 提交于
      Commit b826291c, "drivercore/dt: add a match table pointer to struct
      device" added an of_match pointer to struct device to cache the
      of_match_table entry discovered at driver match time.  This was unsafe
      because matching is not an atomic operation with probing a driver.  If
      two or more drivers are attempted to be matched to a driver at the
      same time, then the cached matching entry pointer could get
      overwritten.
      
      This patch reverts the of_match cache pointer and reworks all users to
      call of_match_device() directly instead.
      Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
      b1608d69
    • M
      of: fix race when matching drivers · 01294d82
      Milton Miller 提交于
      If two drivers are probing devices at the same time, both will write
      their match table result to the dev->of_match cache at the same time.
      
      Only write the result if the device matches.
      
      In a thread titled "SBus devices sometimes detected, sometimes not",
      Meelis reported his SBus hme was not detected about 50% of the time.
      From the debug suggested by Grant it was obvious another driver matched
      some devices between the call to match the hme and the hme discovery
      failling.
      Reported-by: NMeelis Roos <mroos@linux.ee>
      Signed-off-by: NMilton Miller <miltonm@bga.com>
      [grant.likely: modified to only call of_match_device() once]
      Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
      01294d82
  3. 18 5月, 2011 1 次提交
  4. 16 5月, 2011 1 次提交
  5. 15 5月, 2011 1 次提交
    • L
      fs: remove FS_COW_FL · e1e8fb6a
      Li Zefan 提交于
      FS_COW_FL and FS_NOCOW_FL were newly introduced to control per file
      COW in btrfs, but FS_NOCOW_FL is sufficient.
      
      The fact is we don't have corresponding BTRFS_INODE_COW flag.
      
      COW is default, and FS_NOCOW_FL can be used to switch off COW for
      a single file.
      
      If we mount btrfs with nodatacow, a newly created file will be set with
      the FS_NOCOW_FL flag. So to turn on COW for it, we can just clear the
      FS_NOCOW_FL flag.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      e1e8fb6a
  6. 14 5月, 2011 1 次提交
    • S
      Cache user_ns in struct cred · 47a150ed
      Serge E. Hallyn 提交于
      If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns).
      
      Get rid of _current_user_ns.  This requires nsown_capable() to be
      defined in capability.c rather than as static inline in capability.h,
      so do that.
      
      Request_key needs init_user_ns defined at current_user_ns if
      !CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS
      at current_user_ns() define.
      
      Compile-tested with and without CONFIG_USERNS.
      Signed-off-by: NSerge E. Hallyn <serge.hallyn@canonical.com>
      [ This makes a huge performance difference for acl_permission_check(),
        up to 30%.  And that is one of the hottest kernel functions for loads
        that are pathname-lookup heavy.  ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      47a150ed
  7. 12 5月, 2011 4 次提交
    • L
      fbcon: add lifetime refcount to opened frame buffers · 698b3682
      Linus Torvalds 提交于
      This just adds the refcount and the new registration lock logic.  It
      does not (for example) actually change the read/write/ioctl routines to
      actually use the frame buffer that was opened: those function still end
      up alway susing whatever the current frame buffer is at the time of the
      call.
      
      Without this, if something holds the frame buffer open over a
      framebuffer switch, the close() operation after the switch will access a
      fb_info that has been free'd by the unregistering of the old frame
      buffer.
      
      (The read/write/ioctl operations will normally not cause problems,
      because they will - illogically - pick up the new fbcon instead.  But a
      switch that happens just as one of those is going on might see problems
      too, the window is just much smaller: one individual op rather than the
      whole open-close sequence.)
      
      This use-after-free is apparently fairly easily triggered by the Ubuntu
      11.04 boot sequence.
      Acked-by: NTim Gardner <tim.gardner@canonical.com>
      Tested-by: NDaniel J Blueman <daniel.blueman@gmail.com>
      Tested-by: NAnca Emanuel <anca.emanuel@gmail.com>
      Cc: Bruno Prémont <bonbons@linux-vserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Andy Whitcroft <andy.whitcroft@canonical.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      698b3682
    • T
      NFSv4.1: Ensure that layoutget uses the correct gfp modes · a75b9df9
      Trond Myklebust 提交于
      Currently, writebacks may end up recursing back into the filesystem due to
      GFP_KERNEL direct reclaims in the pnfs subsystem.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      a75b9df9
    • A
      mm: add alloc_pages_exact_nid() · ee85c2e1
      Andi Kleen 提交于
      Add a alloc_pages_exact_nid() that allocates on a specific node.
      
      The naming is quite broken, but fixing that would need a larger renaming
      action.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: tweak comment]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee85c2e1
    • Y
      mm: use alloc_bootmem_node_nopanic() on really needed path · 8f389a99
      Yinghai Lu 提交于
      Stefan found nobootmem does not work on his system that has only 8M of
      RAM.  This causes an early panic:
      
        BIOS-provided physical RAM map:
         BIOS-88: 0000000000000000 - 000000000009f000 (usable)
         BIOS-88: 0000000000100000 - 0000000000840000 (usable)
        bootconsole [earlyser0] enabled
        Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS!
        DMI not present or invalid.
        last_pfn = 0x840 max_arch_pfn = 0x100000
        init_memory_mapping: 0000000000000000-0000000000840000
        8MB LOWMEM available.
          mapped low ram: 0 - 00840000
          low ram: 0 - 00840000
        Zone PFN ranges:
          DMA      0x00000001 -> 0x00001000
          Normal   empty
        Movable zone start PFN for each node
        early_node_map[2] active PFN ranges
            0: 0x00000001 -> 0x0000009f
            0: 0x00000100 -> 0x00000840
        BUG: Int 6: CR2 (null)
             EDI c034663c  ESI (null)  EBP c0329f38  ESP c0329ef4
             EBX c0346380  EDX 00000006  ECX ffffffff  EAX fffffff4
             err (null)  EIP c0353191   CS c0320060  flg 00010082
        Stack: (null) c030c533 000007cd (null) c030c533 00000001 (null) (null)
               00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null)
               (null) 00000100 00000840 (null) c0329f64 00000001 00001000 (null)
        Pid: 0, comm: swapper Not tainted 2.6.36 #5
        Call Trace:
         [<c02e3707>] ? 0xc02e3707
         [<c035e6e5>] 0xc035e6e5
         [<c0353191>] ? 0xc0353191
         [<c03534d6>] 0xc03534d6
         [<c034f1cd>] 0xc034f1cd
         [<c034a824>] 0xc034a824
         [<c03513cb>] ? 0xc03513cb
         [<c0349432>] 0xc0349432
         [<c0349066>] 0xc0349066
      
      It turns out that we should ignore the low limit of 16M.
      
      Use alloc_bootmem_node_nopanic() in this case.
      
      [akpm@linux-foundation.org: less mess]
      Signed-off-by: NYinghai LU <yinghai@kernel.org>
      Reported-by: NStefan Hellermann <stefan@the2masters.de>
      Tested-by: NStefan Hellermann <stefan@the2masters.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@kernel.org>		[2.6.34+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8f389a99
  8. 10 5月, 2011 1 次提交
    • M
      Don't lock guardpage if the stack is growing up · a09a79f6
      Mikulas Patocka 提交于
      Linux kernel excludes guard page when performing mlock on a VMA with
      down-growing stack. However, some architectures have up-growing stack
      and locking the guard page should be excluded in this case too.
      
      This patch fixes lvm2 on PA-RISC (and possibly other architectures with
      up-growing stack). lvm2 calculates number of used pages when locking and
      when unlocking and reports an internal error if the numbers mismatch.
      
      [ Patch changed fairly extensively to also fix /proc/<pid>/maps for the
        grows-up case, and to move things around a bit to clean it all up and
        share the infrstructure with the /proc bits.
      
        Tested on ia64 that has both grow-up and grow-down segments  - Linus ]
      Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Tested-by: NTony Luck <tony.luck@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a09a79f6
  9. 07 5月, 2011 1 次提交
  10. 05 5月, 2011 1 次提交
    • T
      slub: Fix the lockless code on 32-bit platforms with no 64-bit cmpxchg · 30106b8c
      Thomas Gleixner 提交于
      The SLUB allocator use of the cmpxchg_double logic was wrong: it
      actually needs the irq-safe one.
      
      That happens automatically when we use the native unlocked 'cmpxchg8b'
      instruction, but when compiling the kernel for older x86 CPUs that do
      not support that instruction, we fall back to the generic emulation
      code.
      
      And if you don't specify that you want the irq-safe version, the generic
      code ends up just open-coding the cmpxchg8b equivalent without any
      protection against interrupts or preemption.  Which definitely doesn't
      work for SLUB.
      
      This was reported by Werner Landgraf <w.landgraf@ru.ru>, who saw
      instability with his distro-kernel that was compiled to support pretty
      much everything under the sun.  Most big Linux distributions tend to
      compile for PPro and later, and would never have noticed this problem.
      
      This also fixes the prototypes for the irqsafe cmpxchg_double functions
      to use 'bool' like they should.
      
      [ Btw, that whole "generic code defaults to no protection" design just
        sounds stupid - if the code needs no protection, there is no reason to
        use "cmpxchg_double" to begin with.  So we should probably just remove
        the unprotected version entirely as pointless.   - Linus ]
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reported-and-tested-by: Nwerner <w.landgraf@ru.ru>
      Acked-and-tested-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Tejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1105041539050.3005@ionosSigned-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      30106b8c
  11. 02 5月, 2011 1 次提交
    • J
      i2c-i801: Move device ID definitions to driver · a6e5e2be
      Jean Delvare 提交于
      Move the SMBus device ID definitions of recent devices from pci_ids.h
      to the i2c-i801.c driver file. They don't have to be shared, as they
      are clearly identified and only used in this driver. In the future,
      such IDs will go to i2c-i801 directly. This will make adding support
      for new devices much faster and easier, as it will avoid cross-
      subsystem patch sets and merge conflicts.
      Signed-off-by: NJean Delvare <khali@linux-fr.org>
      Cc: Seth Heasley <seth.heasley@intel.com>
      Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      a6e5e2be
  12. 29 4月, 2011 3 次提交
  13. 28 4月, 2011 3 次提交
  14. 26 4月, 2011 2 次提交
  15. 25 4月, 2011 3 次提交
  16. 24 4月, 2011 3 次提交
    • T
      libata: Implement ATA_FLAG_NO_DIPM and apply it to mcp65 · ae01b249
      Tejun Heo 提交于
      NVIDIA mcp65 familiy of controllers cause command timeouts when DIPM
      is used.  Implement ATA_FLAG_NO_DIPM and apply it.
      
      This problem was reported by Stefan Bader in the following thread.
      
       http://thread.gmane.org/gmane.linux.ide/48841
      
      stable: applicable to 2.6.37 and 38.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NStefan Bader <stefan.bader@canonical.com>
      Cc: stable@kernel.org
      Signed-off-by: NJeff Garzik <jgarzik@pobox.com>
      ae01b249
    • T
      libata: Kill unused ATA_DFLAG_{H|D}IPM flags · 3f7ac1d6
      Tejun Heo 提交于
      ATA_DFLAG_{H|D}IPM flags are no longer used.  Kill them.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJeff Garzik <jgarzik@pobox.com>
      3f7ac1d6
    • L
      vfs: get rid of insane dentry hashing rules · dea3667b
      Linus Torvalds 提交于
      The dentry hashing rules have been really quite complicated for a long
      while, in odd ways.  That made functions like __d_drop() very fragile
      and non-obvious.
      
      In particular, whether a dentry was hashed or not was indicated with an
      explicit DCACHE_UNHASHED bit.  That's despite the fact that the hash
      abstraction that the dentries use actually have a 'is this entry hashed
      or not' model (which is a simple test of the 'pprev' pointer).
      
      The reason that was done is because we used the normal 'is this entry
      unhashed' model to mark whether the dentry had _ever_ been hashed in the
      dentry hash tables, and that logic goes back many years (commit
      b3423415: "dcache: avoid RCU for never-hashed dentries").
      
      That, in turn, meant that __d_drop had totally different unhashing logic
      for the dentry hash table case and for the anonymous dcache case,
      because in order to use the "is this dentry hashed" logic as a flag for
      whether it had ever been on the RCU hash table, we had to unhash such a
      dentry differently so that we'd never think that it wasn't 'unhashed'
      and wouldn't be free'd correctly.
      
      That's just insane.  It made the logic really hard to follow, when there
      were two different kinds of "unhashed" states, and one of them (the one
      that used "list_bl_unhashed()") really had nothing at all to do with
      being unhashed per se, but with a very subtle lifetime rule instead.
      
      So turn all of it around, and make it logical.
      
      Instead of having a DENTRY_UNHASHED bit in d_flags to indicate whether
      the dentry is on the hash chains or not, use the hash chain unhashed
      logic for that.  Suddenly "d_unhashed()" just uses "list_bl_unhashed()",
      and everything makes sense.
      
      And for the lifetime rule, just use an explicit DENTRY_RCUACCEES bit.
      If we ever insert the dentry into the dentry hash table so that it is
      visible to RCU lookup, we mark it DENTRY_RCUACCESS to show that it now
      needs the RCU lifetime rules.  Now suddently that test at dentry free
      time makes sense too.
      
      And because unhashing now is sane and doesn't depend on where the dentry
      got unhashed from (because the dentry hash chain details doesn't have
      some subtle side effects), we can re-unify the __d_drop() logic and use
      common code for the unhashing.
      
      Also fix one more open-coded hash chain bit_spin_lock() that I missed in
      the previous chain locking cleanup commit.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dea3667b
  17. 23 4月, 2011 1 次提交
  18. 19 4月, 2011 6 次提交
  19. 18 4月, 2011 3 次提交