1. 23 Jun, 2006 40 commits
    • C
      [PATCH] Swapless page migration: rip out swap based logic · d75a0fcd
      Christoph Lameter committed
      Rip the page migration logic out.
      
      Remove all code that has to do with swapping during page migration.
      
      This also guts the ability to migrate pages to swap.  No one used that, so let's
      let it go for good.
      
      Page migration should be a bit broken after this patch.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d75a0fcd
    • C
      [PATCH] Swapless page migration: add R/W migration entries · 0697212a
      Christoph Lameter committed
      Implement read/write migration ptes
      
      We take the upper two swapfiles for the two types of migration ptes and define
      a series of macros in swapops.h.
      
      The VM is modified to handle the migration entries.  Migration entries can
      only be encountered when the page they are pointing to is locked.  This limits
      the number of places one has to fix.  We also check in copy_pte_range and in
      mprotect_pte_range() for migration ptes.
      
      We check for migration ptes in do_swap_cache and call a function that will
      then wait on the page lock.  This allows us to effectively stop all accesses
      to a page.
      
      Migration entries are created by try_to_unmap if called for migration and
      removed by local functions in migrate.c
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Several times while testing swapless page migration (I've no NUMA, just
        hacking it up to migrate recklessly while running load), I've hit the
        BUG_ON(!PageLocked(p)) in migration_entry_to_page.
      
        This comes from an orphaned migration entry, unrelated to the current
        correctly locked migration, but hit by remove_anon_migration_ptes as it
        checks an address in each vma of the anon_vma list.
      
        Such an orphan may be left behind if an earlier migration raced with fork:
        copy_one_pte can duplicate a migration entry from parent to child, after
        remove_anon_migration_ptes has checked the child vma, but before it has
        removed it from the parent vma.  (If the process were later to fault on this
        orphaned entry, it would hit the same BUG from migration_entry_wait.)
      
        This could be fixed by locking anon_vma in copy_one_pte, but we'd rather
        not.  There's no such problem with file pages, because vma_prio_tree_add
        adds child vma after parent vma, and the page table locking at each end is
        enough to serialize.  Follow that example with anon_vma: add new vmas to the
        tail instead of the head.
      
        (There's no corresponding problem when inserting migration entries,
        because a missed pte will leave the page count and mapcount high, which is
        allowed for.  And there's no corresponding problem when migrating via swap,
        because a leftover swap entry will be correctly faulted.  But the swapless
        method has no refcounting of its entries.)
      
      From: Ingo Molnar <mingo@elte.hu>
      
        pte_unmap_unlock() takes the pte pointer as an argument.
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Several times while testing swapless page migration, gcc has tried to exec
        a pointer instead of a string: smells like COW mappings are not being
        properly write-protected on fork.
      
        The protection in copy_one_pte looks very convincing, until at last you
        realize that the second arg to make_migration_entry is a boolean "write",
        and SWP_MIGRATION_READ is 30.
      
        Anyway, it's better done like in change_pte_range, using
        is_write_migration_entry and make_migration_entry_read.
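The pitfall described above can be sketched in a few lines. This is a toy model, not the kernel code: the entry encoding, the `struct toy_entry` type and the value of `SWP_MIGRATION_WRITE` are assumptions made for illustration; only the helper names and `SWP_MIGRATION_READ == 30` come from the message.

```c
#include <stdbool.h>

/* SWP_MIGRATION_READ is 30 per the message; the WRITE value is assumed. */
enum { SWP_MIGRATION_READ = 30, SWP_MIGRATION_WRITE = 31 };

/* Hypothetical stand-in for a swap-style entry. */
struct toy_entry {
    int type;
    unsigned long offset;
};

/* The second argument is a boolean "write" flag, not a swap type. */
static struct toy_entry make_migration_entry(unsigned long pfn, bool write)
{
    struct toy_entry e = { write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ, pfn };
    return e;
}

static bool is_write_migration_entry(struct toy_entry e)
{
    return e.type == SWP_MIGRATION_WRITE;
}

/* Downgrade an existing entry to read-only without rebuilding it. */
static struct toy_entry make_migration_entry_read(struct toy_entry e)
{
    e.type = SWP_MIGRATION_READ;
    return e;
}
```

Calling `make_migration_entry(pfn, SWP_MIGRATION_READ)` looks like a request for a read entry, but 30 is non-zero, so the flag is true and a write entry comes back. Testing with `is_write_migration_entry` and converting with `make_migration_entry_read` avoids passing a type where a flag is expected.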
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Remove unnecessary obfuscation from sys_swapon's range check on swap type,
        which blew up causing memory corruption once swapless migration made
        MAX_SWAPFILES no longer 2 ^ MAX_SWAPFILES_SHIFT.
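A rough sketch of why the range check broke, assuming a shift of 5 and two reserved types (both values are assumptions here, as are the function names). A check derived from the number of type bits accepts values that the smaller table cannot hold.

```c
#include <stdbool.h>

#define MAX_SWAPFILES_SHIFT 5                            /* assumed value */
#define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) - 2)   /* two types reserved */

/* Indirect check: accept any type that fits in the type bits. */
static bool type_ok_by_bits(unsigned int type)
{
    unsigned int max_encodable = (1u << MAX_SWAPFILES_SHIFT) - 1;
    return type <= max_encodable;
}

/* Direct check against the number of usable table slots. */
static bool type_ok_by_count(unsigned int type)
{
    return type < MAX_SWAPFILES;
}
```

With these numbers, type 30 passes the bit-based check but is already out of range for a 30-entry table, which is the kind of overrun the direct comparison prevents.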
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Christoph Lameter <clameter@engr.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      From: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      0697212a
    • C
      [PATCH] page migration cleanup: pass "mapping" to migration functions · 2d1db3b1
      Christoph Lameter committed
      Change handling of address spaces.
      
      Pass a pointer to the address space in which the page is migrated to all
      migration functions.  This avoids repeatedly having to retrieve the address
      space pointer from the page and checking it for validity.  The old page
      mapping will change once migration has gone to a certain step, so it is less
      confusing to have the pointer always available.
      
      Move the setting of the mapping and index for the new page into
      migrate_pages().
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      2d1db3b1
    • C
      [PATCH] page migration cleanup: remove useless definitions · e7340f73
      Christoph Lameter committed
      Remove the export for migrate_page_remove_references() and migrate_page_copy()
      that are unlikely to be used directly by filesystems implementing migration.
      The export was useful when buffer_migrate_page() lived in fs/buffer.c but it
      has now been moved to migrate.c in the migration reorg.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e7340f73
    • O
      [PATCH] writeback: fix range handling · 111ebb6e
      OGAWA Hirofumi committed
      When a writeback_control's `start' and `end' fields are used to
      indicate a one-byte-range starting at file offset zero, the required
      values of .start=0,.end=0 mean that the ->writepages() implementation
      has no way of telling that it is being asked to perform a range
      request, because we currently overload (start == 0 && end == 0)
      to mean "this is not a write-a-range request".
      
      To make all this sane, the patch changes the range fields of writeback_control.
      
      A caller that invokes ->writepages() to write pages must now always set a
      range, either range_start/range_end or range_cyclic.
      
      If range_cyclic is true, ->writepages() treats the range as
      cyclic; otherwise it just uses range_start and range_end.
      
      This patch does the following:
      
          - Add LLONG_MAX, LLONG_MIN, ULLONG_MAX to include/linux/kernel.h.
            -1 is usually OK for range_end (its type is long long).  But if someone did
      
      		range_end += val;		range_end is "val - 1"
      		u64val = range_end >> bits;	u64val is "~(0ULL)"
      
            or something similar, the result would be wrong.  So this adds LLONG_MAX
            to avoid nasty things, and uses LLONG_MAX for range_end.
      
          - Make all callers of ->writepages() set range_start/end or range_cyclic.
      
          - Fix updates of ->writeback_index, which already seemed a bit strange.
            If the scan starts at 0 and is ended by the nr_to_write check, recording
            this last index may reduce the chance of scanning the end of the file.
            So this updates ->writeback_index only if range_cyclic is true or the
            whole file is scanned.
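The rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not the kernel's ->writepages() code: `struct wbc_sketch`, `struct scan_plan` and `plan_scan` are hypothetical names, and treating "whole file" as the range 0..LLONG_MAX is an assumption based on the message's use of LLONG_MAX for range_end.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for writeback_control. */
struct wbc_sketch {
    long long range_start;
    long long range_end;
    bool range_cyclic;
};

/* What a ->writepages() implementation would decide to do. */
struct scan_plan {
    long long start;
    long long end;
    bool update_index;   /* whether to store the final position */
};

static struct scan_plan plan_scan(const struct wbc_sketch *wbc,
                                  long long writeback_index)
{
    struct scan_plan p;

    if (wbc->range_cyclic) {
        /* Cyclic: resume where the previous pass stopped. */
        p.start = writeback_index;
        p.end = LLONG_MAX;
        p.update_index = true;
    } else {
        /* Explicit range: (0, 0) now really means "byte 0 only". */
        p.start = wbc->range_start;
        p.end = wbc->range_end;
        p.update_index = (p.start == 0 && p.end == LLONG_MAX);
    }
    return p;
}
```

Because the cyclic case has its own flag, a request with range_start = 0 and range_end = 0 is no longer ambiguous.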
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: "Vladimir V. Saveliev" <vs@namesys.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      111ebb6e
    • N
      [PATCH] radix-tree: direct data · 612d6c19
      Nick Piggin committed
      The ability to have height 0 radix trees (a direct pointer to the data item
      rather than going through a full node->slot) quietly disappeared with
      old-2.6-bkcvs commit ffee171812d51652f9ba284302d9e5c5cc14bdfd.  On 64-bit
      machines this causes nearly 600 bytes to be used for every <= 4K file in
      pagecache.
      
      Re-introduce this feature, with root tags stored in spare ->gfp_mask bits.
      
      Simplify radix_tree_delete's complex tag clearing arrangement (which would
      become even more complex) by just falling back to tag clearing functions
      (the pagecache radix-tree never uses this path anyway, so the icache
      savings will mean it's actually a speedup).
      
      On my 4GB G5, this saves 8MB RAM per kernel source+object tree in
      pagecache.
      
      Pagecache lookup, insertion, and removal speed for small files will also be
      improved.
      
      This makes RCU radix tree harder, but it's worth it.
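To make the "height 0" idea concrete, here is a minimal sketch of a lookup that treats the root pointer as the data item itself when the tree has no nodes. The structure names, the fan-out of 64 and the omission of taller trees and tags are all simplifications assumed for illustration.

```c
#include <stddef.h>

#define SLOTS 64   /* assumed fan-out for this sketch */

struct node {
    void *slots[SLOTS];
};

struct root {
    unsigned int height;  /* 0: rnode is the item stored at index 0 */
    void *rnode;
};

static void *lookup(const struct root *r, unsigned long index)
{
    if (r->height == 0)
        /* Direct data: no node allocated, only index 0 can exist. */
        return index == 0 ? r->rnode : NULL;

    if (r->height == 1 && index < SLOTS)
        return ((struct node *)r->rnode)->slots[index];

    return NULL;  /* taller trees are left out of the sketch */
}
```

A file that fits in one page only ever uses index 0, so in this model it needs no `struct node` at all, which is where the per-file saving comes from.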
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      612d6c19
    • D
      [PATCH] change gen_pool allocator to not touch managed memory · 929f9727
      Dean Nelson committed
      Modify the gen_pool allocator (lib/genalloc.c) to utilize a bitmap scheme
      instead of the buddy scheme.  The purpose of this change is to eliminate
      the touching of the actual memory being allocated.
      
      Since the change modifies the interface, a change to the uncached allocator
      (arch/ia64/kernel/uncached.c) is also required.
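A minimal sketch of the bitmap idea, assuming a single 32-unit chunk and first-fit search. The names `bitmap_pool`, `pool_alloc` and `pool_free` are hypothetical and are not the lib/genalloc.c interface. The point it illustrates is that all bookkeeping lives in the bitmap, so the managed addresses are only computed and never dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_UNITS 32   /* allocation units tracked by one bitmap word */

struct bitmap_pool {
    uintptr_t base;   /* start of the managed region, never dereferenced */
    size_t unit;      /* bytes represented by one bit */
    uint32_t used;    /* bit i set: unit i is allocated */
};

/* Build a mask of n low bits, 1 <= n <= 32. */
static uint32_t low_mask(size_t n)
{
    return n == 32 ? UINT32_MAX : ((UINT32_C(1) << n) - 1);
}

/* First-fit search for nunits consecutive free units; returns 0 on failure. */
static uintptr_t pool_alloc(struct bitmap_pool *p, size_t nunits)
{
    if (nunits == 0 || nunits > CHUNK_UNITS)
        return 0;

    uint32_t mask = low_mask(nunits);
    for (size_t i = 0; i + nunits <= CHUNK_UNITS; i++) {
        if ((p->used & (mask << i)) == 0) {
            p->used |= mask << i;
            return p->base + i * p->unit;
        }
    }
    return 0;
}

/* The caller passes the size back; nothing is stored in the region itself. */
static void pool_free(struct bitmap_pool *p, uintptr_t addr, size_t nunits)
{
    size_t i = (addr - p->base) / p->unit;
    p->used &= ~(low_mask(nunits) << i);
}
```

A buddy scheme typically keeps its free-list links inside the free blocks, which is what forces it to touch the managed memory; keeping the state in a separate bitmap avoids that at the cost of a linear search.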
      
      Both Andrey Volkov and Jes Sorensen have expressed a desire that the
      gen_pool allocator not write to the memory being managed. See the
      following:
      
        http://marc.theaimsgroup.com/?l=linux-kernel&m=113518602713125&w=2
        http://marc.theaimsgroup.com/?l=linux-kernel&m=113533568827916&w=2
      Signed-off-by: Dean Nelson <dcn@sgi.com>
      Cc: Andrey Volkov <avolkov@varma-el.com>
      Acked-by: Jes Sorensen <jes@trained-monkey.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      929f9727
    • N
      [PATCH] mm: introduce remap_vmalloc_range() · 83342314
      Nick Piggin committed
      Add remap_vmalloc_range, vmalloc_user, and vmalloc_32_user so that drivers
      can have a nice interface for remapping vmalloc memory.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      83342314
    • Y
      [PATCH] Unify pxm_to_node() and node_to_pxm() · 762834e8
      Yasunori Goto committed
      Consolidate the various arch-specific implementations of pxm_to_node() and
      node_to_pxm() into a single generic version.
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: "Brown, Len" <len.brown@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      762834e8
    • C
      [PATCH] tightening hugetlb strict accounting · a43a8c39
      Chen, Kenneth W committed
      Current hugetlb strict accounting for shared mappings always assumes that the
      mapping starts at file offset zero and reserves pages between zero and the
      size of the file.  This assumption often reserves (or locks down) a lot more
      pages than necessary if the application maps at a non-zero file offset.
      libhugetlbfs is one example that requires proper reservation for shared
      mappings that start at a non-zero offset.
      
      This patch extends the reservation and hugetlb strict accounting to support
      any arbitrary pair of (offset, len), resulting in a much more robust and
      accurate scheme.  More importantly, it won't lock down any hugetlb pages
      outside the file mapping.
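As a rough model of (offset, len) accounting, the sketch below tracks which huge pages of a file are already reserved and charges only the newly covered ones. The per-page byte array, the 64-page limit and the name `reserve_range` are assumptions for illustration; the kernel's actual bookkeeping is not shown here.

```c
#define FILE_PAGES 64   /* assumed file size in huge pages */

struct resv_sketch {
    unsigned char reserved[FILE_PAGES];  /* 1: page already reserved */
};

/* Reserve pages [from, to) and return how many were newly reserved. */
static unsigned long reserve_range(struct resv_sketch *r,
                                   unsigned long from, unsigned long to)
{
    unsigned long added = 0;

    for (unsigned long i = from; i < to && i < FILE_PAGES; i++) {
        if (!r->reserved[i]) {
            r->reserved[i] = 1;
            added++;
        }
    }
    return added;
}
```

In this model, mapping four pages at offset 10 charges four pages, whereas a scheme that always starts at offset zero would charge fourteen for the same mapping.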
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Acked-by: Adam Litke <agl@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      a43a8c39
    • A
      [PATCH] reserve space for swap label · e8f03d02
      Andreas Dilger committed
      Reserve space in the swap disk header for a LABEL and UUID to be specified.
      This has been possible with util-linux-2.12b (via e2fsprogs 1.36
      libblkid), and is used by at least FC3 and later.  The kernel doesn't
      really care about this, but the space shouldn't accidentally be used by
      something else either.
      
      Also make the on-disk structures be fixed-size types, instead of "int",
      though I don't know of any architecture in use where an "int" isn't the
      same size as a "__u32" (all current kernel arches have it as "unsigned
      int").
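The two points above (reserved room for a label and UUID, and fixed-size on-disk types) can be illustrated with a header layout like the following. The field names, sizes and padding count are assumptions chosen for the sketch and should not be read as the kernel's exact definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative on-disk layout only; all names and sizes are assumptions. */
struct swap_info_sketch {
    char          bootbits[1024];  /* left alone for boot blocks, labels */
    uint32_t      version;
    uint32_t      last_page;
    uint32_t      nr_badpages;
    unsigned char uuid[16];        /* space reserved for a UUID */
    unsigned char label[16];       /* space reserved for a volume LABEL */
    uint32_t      padding[117];
    uint32_t      badpages[1];
};

/* With fixed-size types the layout can be checked at compile time. */
_Static_assert(offsetof(struct swap_info_sketch, uuid) == 1036, "uuid offset");
_Static_assert(offsetof(struct swap_info_sketch, label) == 1052, "label offset");
_Static_assert(sizeof(struct swap_info_sketch) == 1540, "total size");
```

Carving the 32 bytes out of what would otherwise be padding keeps every later field at the same offset, so existing headers stay readable.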
      Signed-off-by: Andreas Dilger <adilger@shaw.ca>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e8f03d02
    • K
      [PATCH] support for panic at OOM · fadd8fbd
      KAMEZAWA Hiroyuki committed
      This patch adds panic_on_oom sysctl under sys.vm.
      
      When sysctl vm.panic_on_oom = 1, the kernel panics instead of killing rogue
      processes.  And if vm.panic_on_oom is 0 the kernel will do oom_kill() in
      the same way as it does today.  Of course, the default value is 0 and only
      root can modify it.
      
      In general, oom_killer works well and kills rogue processes, so the whole
      system can survive.  But there are environments where a panic is preferable
      to killing some processes.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      fadd8fbd
    • A
      [PATCH] squash duplicate page_to_pfn and pfn_to_page · 67de6482
      Andy Whitcroft committed
      We have architectures where the size of page_to_pfn and pfn_to_page are
      significant enough to overall image size that they wish to push them out of
      line.  However, in the process we have grown a second copy of the
      implementation of each of these routines for each memory model.  Share the
      implementation, exposing it either inline or out-of-line as required.
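One way to share a single implementation between inline and out-of-line use is sketched below for a flat memory map. The macro names `PFN_TO_PAGE_IMPL` and `PAGE_TO_PFN_IMPL`, the `SKETCH_OUT_OF_LINE` switch, the offset of 16 and the tiny `mem_map` are all assumptions for illustration.

```c
struct page {
    unsigned long flags;
};

#define ARCH_PFN_OFFSET 16UL   /* assumed first valid page frame number */
#define NR_PAGES 8

static struct page mem_map[NR_PAGES];

/* The one shared implementation, written once as macros. */
#define PFN_TO_PAGE_IMPL(pfn)   (mem_map + ((pfn) - ARCH_PFN_OFFSET))
#define PAGE_TO_PFN_IMPL(page)  ((unsigned long)((page) - mem_map) + ARCH_PFN_OFFSET)

#ifdef SKETCH_OUT_OF_LINE
/* Out of line: one function body per routine, callers pay a call. */
struct page *pfn_to_page(unsigned long pfn)
{
    return PFN_TO_PAGE_IMPL(pfn);
}

unsigned long page_to_pfn(struct page *page)
{
    return PAGE_TO_PFN_IMPL(page);
}
#else
/* Inline: callers expand the shared macros directly. */
#define pfn_to_page PFN_TO_PAGE_IMPL
#define page_to_pfn PAGE_TO_PFN_IMPL
#endif
```

Either way the arithmetic is written once, so the two variants cannot drift apart.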
      Signed-off-by: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      67de6482
    • Y
      [PATCH] wait_table and zonelist initializing for memory hotadd: add return code for init_current_empty_zone · 718127cc
      Yasunori Goto committed
      When add_zone() is called against an empty (unpopulated) zone, we have to
      initialize the zone, which wasn't initialized at boot time.  But
      init_currently_empty_zone() may fail because it allocates the wait table.  So
      this patch catches its error code.
      
      Changes to wait_table are in the next patch.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      718127cc
    • Y
      [PATCH] wait_table and zonelist initializing for memory hotadd: change to meminit for build_zonelist · 86356ab1
      Yasunori Goto committed
      Change definitions of some functions and data from __init to __meminit.
      
      With this patch, these functions and data remain usable after bootup so that
      hot-add code can use them.
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      86356ab1
    • Y
      [PATCH] wait_table and zonelist initializing for memory hotadd: change name of wait_table_size() · 02b694de
      Yasunori Goto committed
      This just renames wait_table_size() to wait_table_hash_nr_entries().
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      02b694de
    • A
      [PATCH] PG_uncached is ia64 only · f886ed44
      Andrew Morton committed
      As Nick points out, only ia64 uses PG_uncached.  So we can push it up into the
      higher bits of the lower half of page->flags and make room for another flag on
      32-bit machines.
      
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Jesse Barnes <jbarnes@sgi.com>
      Cc: Jes Sorensen <jes@trained-monkey.org>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f886ed44
    • A
      [PATCH] zone handle unaligned zone boundaries · cb2b95e1
      Andy Whitcroft committed
      The buddy allocator has a requirement that boundaries between contiguous
      zones occur aligned with the MAX_ORDER ranges.  Where they do not we
      will incorrectly merge pages across zone boundaries.  This can lead to pages
      from the wrong zone being handed out.
      
      Originally the buddy allocator would check that buddies were in the same
      zone by referencing the zone start and end page frame numbers.  This was
      removed as it became very expensive and the buddy allocator already made
      the assumption that zone boundaries were aligned.
      
      It is clear that not all configurations and architectures are honouring
      this alignment requirement.  Therefore it seems safest to reintroduce
      support for non-aligned zone boundaries.  This patch introduces a new check:
      when considering a page as a buddy, it compares the zone_table index for the
      two pages and refuses to merge the pages where they do not match.  The
      zone_table index is unique for each node/zone combination when
      FLATMEM/DISCONTIGMEM is enabled and for each section/zone combination when
      SPARSEMEM is enabled (a SPARSEMEM section is at least a MAX_ORDER size).
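The check described above can be sketched as follows. The `page_sketch` structure, its `zone_id` field (standing in for the zone_table index) and the function names are assumptions for illustration; the buddy of a block is found with the usual XOR on the block index.

```c
#include <stdbool.h>

/* Hypothetical per-page state; zone_id stands in for the zone_table index. */
struct page_sketch {
    unsigned int zone_id;
    unsigned int order;   /* order of the free block starting here */
    bool free;
};

/* The buddy of a block at idx with the given order differs in one bit. */
static unsigned long buddy_index(unsigned long idx, unsigned int order)
{
    return idx ^ (1UL << order);
}

/* A block may merge only with a free buddy of the same order and zone. */
static bool can_merge(const struct page_sketch *pages,
                      unsigned long idx, unsigned int order)
{
    const struct page_sketch *page = &pages[idx];
    const struct page_sketch *buddy = &pages[buddy_index(idx, order)];

    if (page->zone_id != buddy->zone_id)
        return false;   /* never merge across a zone boundary */

    return buddy->free && buddy->order == order;
}
```

Comparing two small integers per merge attempt is much cheaper than looking up each zone's start and end page frame numbers, which was the cost that led to the original check being dropped.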
      Signed-off-by: Andy Whitcroft <apw@shadowen.org>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      cb2b95e1
    • D
      [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry · 726c3342
      David Howells committed
      Give the statfs superblock operation a dentry pointer rather than a superblock
      pointer.
      
      This complements the get_sb() patch.  That reduced the significance of
      sb->s_root, allowing NFS to place a fake root there.  However, NFS does
      require a dentry to use as a target for the statfs operation.  This permits
      the root in the vfsmount to be used instead.
      
      linux/mount.h has been added where necessary to make allyesconfig build
      successfully.
      
      Interest has also been expressed for use with the FUSE and XFS filesystems.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      726c3342
    • D
      [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells committed
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience functions is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing is currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
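The calling convention described in this message can be modelled with toy structures. Everything below is a user-space sketch: the structure definitions are cut down to the fields the message names, reference counting is omitted, and `foo_get_sb` with its single argument is a hypothetical filesystem hook, not the real get_sb() signature.

```c
#include <stddef.h>

/* Toy versions of the objects named in the message. */
struct dentry {
    const char *name;
};

struct super_block {
    struct dentry *s_root;
};

struct vfsmount {
    struct super_block *mnt_sb;
    struct dentry *mnt_root;
};

/* Default behaviour: root the mount at the superblock's s_root. */
static int simple_set_mnt(struct vfsmount *mnt, struct super_block *sb)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = sb->s_root;
    return 0;   /* always 0, so it can be tail-called */
}

static struct dentry foo_root = { "/" };
static struct super_block foo_sb = { &foo_root };

/* Hypothetical filesystem hook: fills in mnt and returns an error code. */
static int foo_get_sb(struct vfsmount *mnt)
{
    return simple_set_mnt(mnt, &foo_sb);
}
```

A filesystem that shares one superblock between several mounts would skip `simple_set_mnt` and assign `mnt_sb` and a different `mnt_root` itself, which is the case the integer return value and the vfsmount argument make possible.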
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      454e2398
    • K
      [PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name() · 4a31e348
      Krzysztof Halasa committed
      David Boggs noticed that register_hdlc_device() no longer needs
      to call dev_alloc_name() as it's called by register_netdev().
      register_hdlc_device() is currently equivalent to register_netdev().
      
      hdlc_setup() is now EXPORTed as per David's request.
      Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      4a31e348
    • S
      [PATCH] network driver for Hilscher netx · 92aa674d
      Sascha Hauer committed
      This is a patch for the Hilscher netx builtin ethernet ports. The
      netx board support was merged into 2.6.17-git2.
      The netx is an arm926 based SoC.
      Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
      Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
      
      --
       drivers/net/Kconfig             |   11
       drivers/net/Makefile            |    1
       drivers/net/netx-eth.c          |  516 ++++++++++++++++++++++++++++++++++++++++
       include/asm-arm/arch-netx/eth.h |   27 ++
       4 files changed, 555 insertions(+)
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      92aa674d
    • R
      [PATCH] zlib_inflate: Upgrade library code to a recent version · 4f3865fb
      Richard Purdie committed
      Upgrade the zlib_inflate implementation in the kernel from a patched
      version 1.1.3/4 to a patched 1.2.3.
      
      The code in the kernel is about seven years old and I noticed that the
      external zlib library's inflate performance was significantly faster (~50%)
      than the code in the kernel on ARM (and faster again on x86_32).
      
      For comparison the newer deflate code is 20% slower on ARM and 50% slower
      on x86_32 but gives an approx 1% compression ratio improvement.  I don't
      consider this to be an improvement for kernel use so have no plans to
      change the zlib_deflate code.
      
      Various changes have been made to the zlib code in the kernel, the most
      significant being the extra functions/flush option used by ppp_deflate.
      This update reimplements the features PPP needs to ensure it continues to
      work.
      
      This code has been tested on ARM under both JFFS2 (with zlib compression
      enabled) and ppp_deflate and on x86_32.  JFFS2 sees an approx. 10% real
      world file read speed improvement.
      
      This patch also removes ZLIB_VERSION as it no longer has a correct value.
      We don't need version checks anyway as the kernel's module handling will
      take care of that for us.  This removal is also more in keeping with the
      zlib author's wishes (http://www.zlib.net/zlib_faq.html#faq24) and I've
      added something to the zlib.h header to note it's a modified version.
      Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
      Acked-by: Joern Engel <joern@wh.fh-wedel.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4f3865fb
    • B
      [PATCH] vgacon: make VGA_MAP_MEM take size, remove extra use · 4f1bcaf0
      Bjorn Helgaas committed
      VGA_MAP_MEM translates to ioremap() on some architectures.  It makes sense
      to do this to vga_vram_base, because we're going to access memory between
      vga_vram_base and vga_vram_end.
      
      But it doesn't really make sense to map starting at vga_vram_end, because
      we aren't going to access memory starting there.  On ia64, which always has
      to be different, ioremapping vga_vram_end gives you something completely
      incompatible with ioremapped vga_vram_start, so vga_vram_size ends up being
      nonsense.
      
      As a bonus, we often know the size up front, so we can use ioremap()
      correctly, rather than giving it a zero size.
      Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: "Antonino A. Daplas" <adaplas@pol.net>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4f1bcaf0
    • N
      [PATCH] Fix dcache race during umount · 0feae5c4
      NeilBrown committed
      The race is that the shrink_dcache_memory shrinker could get called while a
      filesystem is being unmounted, and could try to prune a dentry belonging to
      that filesystem.
      
      If it does, then it will call in to iput on the inode while the dentry is
      no longer able to be found by the umounting process.  If iput takes a
      while, generic_shutdown_super could get all the way through
      shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
      ever waiting on this particular inode.
      
      Eventually the superblock gets freed anyway and if the iput tried to touch
      it (which some filesystems certainly do), it will lose.  The promised
      "Self-destruct in 5 seconds" doesn't lead to a nice day.
      
      The race is closed by holding s_umount while calling prune_one_dentry on
      someone else's dentry.  As a down_read_trylock is used,
      shrink_dcache_memory will no longer try to prune the dentry of a filesystem
      that is being unmounted, and unmount will not be able to start until any
      such active prune_one_dentry completes.
      
      This requires that prune_dcache *knows* which filesystem (if any) it is
      doing the prune on behalf of so that it can be careful of other
      filesystems.  shrink_dcache_memory isn't called on behalf of any
      filesystem, and so is careful of everything.
      
      shrink_dcache_anon is now passed a super_block rather than the s_anon list
      out of the superblock, so it can get the s_anon list itself, and can pass
      the superblock down to prune_dcache.
      
      If prune_dcache finds a dentry that it cannot free, it leaves it where it
      is (at the tail of the list) and exits, on the assumption that some other
      thread will be removing that dentry soon.  To try to make sure that some
      work gets done, a limited number of dentries which are untouchable are
      skipped over while choosing the dentry to work on.
      
      I believe this race was first found by Kirill Korotaev.
      
      Cc: Jan Blunck <jblunck@suse.de>
      Acked-by: Kirill Korotaev <dev@openvz.org>
      Cc: Olaf Hering <olh@suse.de>
      Acked-by: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Balbir Singh <balbir@in.ibm.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      0feae5c4
    • M
      [PATCH] remove steal_locks() · c89681ed
      Miklos Szeredi committed
      This patch removes the steal_locks() function.
      
      steal_locks() doesn't work correctly with any filesystem that does its own
      lock management, including NFS, CIFS, etc.
      
      In addition it has weird semantics on local filesystems in case tasks
      sharing file-descriptor tables are doing POSIX locking operations in
      parallel to execve().
      
      The steal_locks() function has an effect on applications doing:
      
      clone(CLONE_FILES)
        /* in child */
        lock
        execve
        lock
      
      POSIX locks acquired before execve (by "child", "parent" or any further
      task sharing files_struct) will after the execve be owned exclusively by
      "child".
      
      According to Chris Wright some LSB/LTP kind of suite triggers without the
      stealing behavior, but there's no known real-world application that would
      also fail.
      
      Apps using NPTL are not affected, since all other threads are killed before
      execve.
      
      Apps using LinuxThreads are only affected if they
      
        - have multiple threads during exec (LinuxThreads doesn't kill other
          threads, the app may do it with pthread_kill_other_threads_np())
        - rely on POSIX locks being inherited across exec
      
      Both conditions are documented, but not their interaction.
      
      Apps using clone() natively are affected if they
      
        - use clone(CLONE_FILES)
        - rely on POSIX locks being inherited across exec
      
      The above scenarios are unlikely, but possible.
      
      If the patch is vetoed, there's a plan B, that involves mostly keeping the
      weird stealing semantics, but changing the way lock ownership is handled so
      that network and local filesystems work consistently.
      
      That would add more complexity though, so this solution seems to be
      preferred by most people.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Steven French <sfrench@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c89681ed
    • B
      [PATCH] PCI: Add PCI_CAP_ID_VNDR · 0e5b3781
      Brice Goglin committed
      Add the vendor-specific extended capability PCI_CAP_ID_VNDR.  It is required
      by the Myri-10G Ethernet driver.
      Signed-off-by: Brice Goglin <brice@myri.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Jeff Garzik <jeff@garzik.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      0e5b3781
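A driver that needs the vendor-specific capability has to locate it in the device's capability list. The following is a minimal user-space sketch of that walk over a 256-byte copy of PCI configuration space. It assumes the standard capability list layout and the ID value 0x09 for PCI_CAP_ID_VNDR; the function name find_capability() and the loop bound are illustrative and not taken from the patch.

```c
#include <stdint.h>

#define PCI_STATUS           0x06  /* status register (low byte) */
#define PCI_STATUS_CAP_LIST  0x10  /* status bit: capability list present */
#define PCI_CAPABILITY_LIST  0x34  /* offset of the first capability pointer */
#define PCI_CAP_LIST_ID      0     /* capability ID byte within an entry */
#define PCI_CAP_LIST_NEXT    1     /* next-entry pointer byte within an entry */
#define PCI_CAP_ID_VNDR      0x09  /* vendor-specific capability */

/*
 * Return the config-space offset of the first capability with the given
 * ID, or 0 if the device has no such capability.
 */
int find_capability(const uint8_t cfg[256], uint8_t id)
{
    int ttl = 48;  /* illustrative bound so a corrupt list cannot loop forever */
    uint8_t pos;

    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;

    /* Capability pointers are dword-aligned; the low two bits are reserved. */
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
    while (pos >= 0x40 && ttl--) {
        if (cfg[pos + PCI_CAP_LIST_ID] == id)
            return pos;
        pos = cfg[pos + PCI_CAP_LIST_NEXT] & 0xFC;
    }
    return 0;
}
```

Inside the kernel a driver would not open-code this walk; it would normally call pci_find_capability() with PCI_CAP_ID_VNDR.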
    • D
      [PATCH] Keys: Fix race between two instantiators of a key · 04c567d9
      David Howells committed
      Add a revocation notification method to the key type and call it whilst
      the key's semaphore is still write-locked after setting the revocation
      flag.
      
      The patch then uses this to maintain a reference on the task_struct of the
      process that calls request_key() for as long as the authorisation key
      remains unrevoked.
      
      This fixes a potential race between two processes both of which have
      assumed the authority to instantiate a key (one may have forked the other
      for example).  The problem is that there's no locking around the check for
      revocation of the auth key and the use of the task_struct it points to, nor
      does the auth key keep a reference on the task_struct.
      
      Access to the "context" pointer in the auth key must thenceforth be done
      with the auth key semaphore held.  The revocation method is called with the
      target key semaphore held write-locked and the search of the context
      process's keyrings is done with the auth key semaphore read-locked.
      
      The check for the revocation state of the auth key just prior to searching
      it is done after the auth key is read-locked for the search.  This ensures
      that the auth key can't be revoked between the check and the search.
      
      The revocation notification method is added so that the context task_struct
      can be released as soon as instantiation happens rather than waiting for
      the auth key to be destroyed, thus avoiding the unnecessary pinning of the
      requesting process.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      04c567d9
    • M
      [PATCH] selinux: add hooks for key subsystem · d720024e
      Michael LeMay committed
      Introduce SELinux hooks to support the access key retention subsystem
      within the kernel.  Incorporate new flask headers from a modified version
      of the SELinux reference policy, with support for the new security class
      representing retained keys.  Extend the "key_alloc" security hook with a
      task parameter representing the intended ownership context for the key
      being allocated.  Attach security information to root's default keyrings
      within the SELinux initialization routine.
      
      Has passed David's testsuite.
      Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
      Acked-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d720024e
    • B
      [ARM] 3629/1: S3C24XX: fix missing bracket in regs-dsc.h · 02916526
      Ben Dooks committed
      Patch from Ben Dooks
      
      Fix missing bracket in include/asm-arm/arch-s3c2410/regs-dsc.h
      Signed-off-by: Ben Dooks <ben-linux@fluff.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      02916526
    • P
      [ARM] 3601/1: i.MX/MX1 DMA error handling for signaled channels only · fa3e686a
      Pavel Pisa committed
      Patch from Pavel Pisa
      
      There was a bug: dma_err_handler() touched even channels
      that were not signaling an error condition.
      
      Problem noticed by Andrea Paterniani.
      Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      fa3e686a
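The idea behind the DMA error handling fix above, acting only on channels whose error bit is set, can be sketched as follows. This is not the driver's code: the channel count of 11, the single combined error mask, and the name demo_dma_error_channels() are assumptions made for illustration.

```c
#include <stdint.h>

#define DEMO_DMA_CHANNELS 11  /* assumed channel count, for illustration only */

/*
 * Collect the channels whose bit is set in err_mask into out[] and
 * return how many were found. Channels with a clear bit are skipped,
 * so an error handler built on this never touches them.
 */
int demo_dma_error_channels(uint32_t err_mask, int out[], int max_out)
{
    int count = 0;

    for (int ch = 0; ch < DEMO_DMA_CHANNELS && count < max_out; ch++) {
        if (!(err_mask & (UINT32_C(1) << ch)))
            continue;  /* this channel did not signal an error */
        out[count++] = ch;
    }
    return count;
}
```

With an error mask of 0x5, only channels 0 and 2 are reported; all other channels are left alone.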
    • J
      [ALSA] version 1.0.12rc1 · 0dad31d2
      Jaroslav Kysela committed
      0dad31d2
    • R
      [ALSA] Disable AC97 AUX and VIDEO controls for WM9705 touchscreen · 1459c784
      Rodolfo Giometti committed
      This patch by Rodolfo Giometti disables the AC97 AUX and VIDEO controls
      on the WM9705 when the touchscreen is selected as the AUX and VIDEO
      lines are shared with the touch controller.
      Changes:
       o Add the AC97_HAS_NO_AUX flag
       o Test for the AC97_HAS_NO_AUX flag in snd_ac97_mixer_build()
       o Set AC97_HAS_NO_VIDEO and AC97_HAS_NO_AUX in patch_wolfson05() when
         the WM9705 touch driver is selected
      Signed-off-by: Rodolfo Giometti <giometti@linux.it>
      Signed-off-by: Liam Girdwood <liam.girdwood@wolfsonmicro.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      1459c784
    • T
      [ALSA] Change an argument of snd_mpu401_uart_new() to bit flags · 302e4c2f
      Takashi Iwai committed
      Change the 5th argument of snd_mpu401_uart_new() to bit flags
      instead of a boolean.  The argument takes bits that consist of
      MPU401_INFO_XXX flags.
      The callers that used the value 1 there are replaced with
      MPU401_INFO_INTEGRATED.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      302e4c2f
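The move from a boolean argument to bit flags in the snd_mpu401_uart_new() commit above follows a common pattern, sketched here. Only the name MPU401_INFO_INTEGRATED comes from the commit message; the DEMO_* constants, their bit positions, and both helper functions are hypothetical and are not copied from the ALSA headers.

```c
/* Hypothetical flag bits; the real MPU401_INFO_XXX values may differ. */
#define DEMO_MPU401_INFO_INTEGRATED  (1u << 0)
#define DEMO_MPU401_INFO_MMIO        (1u << 1)

/*
 * Map the old boolean argument onto the new flags argument: callers that
 * used to pass 1 now pass the "integrated" flag, and callers that passed
 * 0 pass no flags.
 */
unsigned int demo_flags_from_legacy_bool(int integrated)
{
    return integrated ? DEMO_MPU401_INFO_INTEGRATED : 0u;
}

/* Test a single flag without being affected by other bits. */
int demo_is_integrated(unsigned int info_flags)
{
    return (info_flags & DEMO_MPU401_INFO_INTEGRATED) != 0;
}
```

A flags argument lets later patches add properties by defining a new bit, without changing the function signature again.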
    • T
      [ALSA] Fix rwlock around snd_iprintf() in sound core · 746df948
      Takashi Iwai committed
      Fix the rwlock held around snd_iprintf() in the sound core by
      replacing it with a mutex.
      Also, make the mutex and flags static variables and add a
      snd_card_locked() function (just for sound.c).
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      746df948
    • C
      [ALSA] rawmidi: add get_port_info callback for sequencer information flags · a7b928ac
      Clemens Ladisch committed
      Add a get_port_info callback to the snd_rawmidi_global_ops structure to
      allow the USB MIDI driver to supply information flags for the sequencer
      ports created by seq_midi.
      Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
      a7b928ac
    • C
      [ALSA] add more sequencer port type information bits · 450047a7
      Clemens Ladisch committed
      Add four new information flags SNDRV_SEQ_PORT_TYPE_HARDWARE, _SOFTWARE,
      _SYNTHESIZER, _PORT for sequencer ports.  This makes it easier for apps
      like Rosegarden to make policy decisions based on the port type.
      Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
      450047a7
    • T
      [ALSA] Fix mmap_count with O_APPEND opened streams · 9c323fcb
      Takashi Iwai committed
      Move mmap_count to snd_pcm_substream instead of runtime struct
      so that multiply opened substreams via O_APPEND can be handled
      correctly.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      9c323fcb
    • T
      [ALSA] Add O_APPEND flag support to PCM · 0df63e44
      Takashi Iwai committed
      Added O_APPEND flag support to PCM to enable shared substreams
      among multiple processes.  This mechanism is used by dmix and
      dsnoop plugins.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      0df63e44
    • T
      [ALSA] Remove unneeded read/write_size fields in proc text ops · bf850204
      Takashi Iwai committed
      Remove unneeded read/write_size fields in proc text ops.
      snd_info_set_text_ops() is fixed, too.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      bf850204