1. 02 Oct, 2006 25 commits
  2. 01 Oct, 2006 15 commits
    • Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart · 82965add
      Linus Torvalds committed
      * master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
        [AGPGART] printk fixups.
        [AGPGART] Use pci_get_slot not pci_find_slot
    • Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq · f0b364a1
      Linus Torvalds committed
      * master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
        [CPUFREQ] Make acpi-cpufreq unsticky again.
        [CPUFREQ] longhaul: remove duplicated code.
        [CPUFREQ] Longhaul - Disable arbiter CLE266
        [CPUFREQ] Fix section mismatch warning
        [CPUFREQ] Fix cut-n-paste bug in suspend printk
    • [PATCH] Some config.h removals · 5a73fdc5
      Zachary Amsden committed
      While tracking down a PAE compile failure, I found that config.h was being
      included in a bunch of places in i386 code.  It is no longer necessary, so
      drop it.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] paravirt: update pte hook · 789e6ac0
      Zachary Amsden committed
      Add a pte_update_hook which notifies about pte changes that have been made
      without using the set_pte / clear_pte interfaces.  This allows shadow mode
      hypervisors which do not trap on page table access to maintain synchronized
      shadows.
      
      It also turns out that there was one pte update in PAE mode that wasn't using any
      accessor interface at all for setting NX protection.  Considering it is PAE
      specific, and the accessor is i386 specific, I didn't want to add a generic
      encapsulation of this behavior yet.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] paravirt: remove set pte atomic · a93cb055
      Zachary Amsden committed
      Now that ptep_establish has a definition in PAE i386 3-level paging code, the
      only paging model which is insane enough to have multi-word hardware PTEs
      which are not efficient to set atomically, we can remove the ghost of
      set_pte_atomic from other architectures which falsely duplicated it, and
      remove all knowledge of it from the generic pgtable code.
      
      set_pte_atomic is now a private pte operator which is specific to i386.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] paravirt: optimize ptep establish for pae · d6d861e3
      Zachary Amsden committed
      The ptep_establish macro is only used on user-level PTEs, for P->P mapping
      changes.  Since these always happen under protection of the pagetable lock,
      the strong synchronization of a 64-bit cmpxchg is not needed; in fact, not
      even a lock prefix is required.  Instead, we can simply clear the P-bit and
      follow it with a normal set.  The write ordering is still important to avoid
      the possibility of the TLB snooping a partially written PTE and getting a bad
      mapping installed.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
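The ordering described in the ptep_establish commit above can be modelled in user space. This is only a sketch under stated assumptions, not the kernel's code: `struct pae_pte`, `pae_pte_establish()`, `pae_pte_read()` and `write_barrier()` are hypothetical names, and the compiler-only barrier stands in for whatever write barrier the real code uses. It shows a 64-bit PAE entry written as two 32-bit halves, with the Present bit cleared first so that a half-written entry is never seen as valid.

```c
#include <stdint.h>

/* Hypothetical user-space model of a PAE PTE: two 32-bit words. */
struct pae_pte {
    volatile uint32_t low;   /* bit 0 is the Present (P) bit */
    volatile uint32_t high;  /* upper address bits and NX */
};

#define PTE_PRESENT 0x1u

/* Assumption: a compiler-only barrier is enough for this model. */
#define write_barrier() __asm__ __volatile__("" ::: "memory")

/* Replace one present mapping with another without cmpxchg8b or a lock
 * prefix.  The caller is assumed to hold the page table lock. */
static void pae_pte_establish(struct pae_pte *ptep, uint64_t val)
{
    ptep->low = 0;                       /* 1. clear P: entry is now not present */
    write_barrier();
    ptep->high = (uint32_t)(val >> 32);  /* 2. safe to change the high word */
    write_barrier();
    ptep->low = (uint32_t)val;           /* 3. low word last: P becomes set again */
}

/* Reassemble the 64-bit value from both halves. */
static uint64_t pae_pte_read(const struct pae_pte *ptep)
{
    return ((uint64_t)ptep->high << 32) | ptep->low;
}
```

At every point between the three stores the entry is either the complete old value, not present, or the complete new value, which is the property the commit message relies on.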
    • [PATCH] paravirt: kpte flush · 23002d88
      Zachary Amsden committed
      Create a new PTE function which combines clearing a kernel PTE with the
      subsequent flush.  This allows the two to be easily combined into a single
      hypercall or paravirt-op.  More subtly, reverse the order of the flush for
      kmap_atomic.  Instead of flushing on establishing a mapping, flush on clearing
      a mapping.  This eliminates the possibility of leaving stale kmap entries
      which may still have valid TLB mappings.  This is required for direct mode
      hypervisors, which need to reprotect all mappings of a given page when
      changing the page type from a normal page to a protected page (such as a page
      table or descriptor table page).  But it also provides some nicer semantics
      for real hardware, by providing extra debug-proofing against using stale
      mappings, as well as ensuring that no stale mappings exist when changing the
      cacheability attributes of a page, which could lead to cache conflicts when
      two different types of mappings exist for the same page.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] paravirt: combine flush accessed dirty.patch · 25e4df5b
      Zachary Amsden committed
      Remove ptep_test_and_clear_{dirty|young} from i386, and instead use the
      dominating functions, ptep_clear_flush_{dirty|young}.  This allows the TLB
      page flush to be contained in the same macro, and allows for an eager
      optimization: if reading the PTE initially returned dirty/accessed, we can
      assume that no subsequent update to the PTE has cleared accessed /
      dirty, as the only way A/D bits can change without holding the
      page table lock is if a remote processor clears them.  This eliminates an
      extra branch which came from the generic version of the code, as we know that
      no other CPU could have cleared the A/D bit, so the flush will always be
      needed.
      
      We still export these two defines, even though we do not actually define
      the macros in the i386 code:
      
       #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
       #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_DIRTY
      
      The reason for this is that the only use of these functions is within the
      generic clear_flush functions, and we want a strong guarantee that there
      are no other users of these functions, so we want to prevent the generic
      code from defining them for us.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
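As a rough illustration of the combined clear-and-flush idea in the commit above, the following user-space model tests the dirty bit once and, if it was set, clears it and flushes unconditionally, with no second test after the clear. All names here (`MODEL_PAGE_DIRTY`, `struct tlb_model`, `model_clear_flush_dirty()`) are hypothetical, the TLB flush is just a counter, and the plain read-modify-write stands in for the atomic bit operation a kernel would need.

```c
#include <stdint.h>

#define MODEL_PAGE_DIRTY 0x040u  /* x86 PTE dirty bit (bit 6) */

/* Stand-in for the TLB: only counts single-page flushes. */
struct tlb_model {
    unsigned page_flushes;
};

/* Model of a combined "clear dirty and flush" operation.  The caller is
 * assumed to hold the page table lock, so once the bit has been observed
 * set, the clear and the flush are done without re-testing it. */
static int model_clear_flush_dirty(uint32_t *pte_low, struct tlb_model *tlb)
{
    int dirty = (*pte_low & MODEL_PAGE_DIRTY) != 0;

    if (dirty) {
        *pte_low &= ~MODEL_PAGE_DIRTY;  /* real code would use an atomic bit clear */
        tlb->page_flushes++;            /* flush is always needed on this path */
    }
    return dirty;
}
```

A second call on the same entry returns 0 and performs no flush, since the bit is already clear.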
    • [PATCH] paravirt: lazy mmu mode hooks.patch · 6606c3e0
      Zachary Amsden committed
      Implement lazy MMU update hooks which are SMP safe for both direct and shadow
      page tables.  The idea is that PTE updates and page invalidations while in
      lazy mode can be batched into a single hypercall.  We use this in VMI for
      shadow page table synchronization, and it is a win.  It also can be used by
      PPC and for direct page tables on Xen.
      
      For SMP, the enter / leave must happen under protection of the page table
      locks for page tables which are being modified.  This is because otherwise,
      you end up with stale state in the batched hypercall, which other CPUs can
      race ahead of.  Doing this under the protection of the locks guarantees the
      synchronization is correct, and also means that spurious faults which are
      generated during this window by remote CPUs are properly handled, as the page
      fault handler must re-check the PTE under protection of the same lock.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
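The batching idea behind the lazy MMU hooks can be sketched with a small user-space model. Everything here is an assumption made for illustration: `struct lazy_mmu`, `lazy_enter()`, `lazy_leave()`, `lazy_set_pte()` and the `hypercalls` counter are hypothetical and are not the hook names or data structures used by the patch. The point is only that updates made between enter and leave are queued and applied in one step, and that reading the PTE memory in between still shows the old value.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 16

struct pte_update {
    uint64_t *ptep;
    uint64_t val;
};

struct lazy_mmu {
    int lazy;                           /* non-zero while in lazy mode */
    size_t count;                       /* queued updates */
    struct pte_update queue[BATCH_MAX];
    unsigned hypercalls;                /* models trips to the hypervisor */
};

/* Apply every queued update in one go: a single modelled hypercall. */
static void lazy_flush(struct lazy_mmu *m)
{
    if (m->count == 0)
        return;
    for (size_t i = 0; i < m->count; i++)
        *m->queue[i].ptep = m->queue[i].val;
    m->count = 0;
    m->hypercalls++;
}

/* Enter and leave are assumed to be called with the page table lock held. */
static void lazy_enter(struct lazy_mmu *m) { m->lazy = 1; }
static void lazy_leave(struct lazy_mmu *m) { lazy_flush(m); m->lazy = 0; }

static void lazy_set_pte(struct lazy_mmu *m, uint64_t *ptep, uint64_t val)
{
    if (!m->lazy) {                     /* eager mode: one hypercall per update */
        *ptep = val;
        m->hypercalls++;
        return;
    }
    if (m->count == BATCH_MAX)          /* queue full: flush early */
        lazy_flush(m);
    m->queue[m->count].ptep = ptep;
    m->queue[m->count].val = val;
    m->count++;
}
```

In this model, four updates made in lazy mode cost one modelled hypercall instead of four, and the page table memory is unchanged until `lazy_leave()` runs, which is why code must not read a PTE back directly after modifying it in lazy mode.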
    • [PATCH] paravirt: pte clear not present · 9888a1ca
      Zachary Amsden committed
      Change pte_clear_full to a more appropriately named pte_clear_not_present,
      allowing optimizations when not-present mapping changes need not be reflected
      in the hardware TLB for protected page table modes.  There is also another
      case that can use it in the fremap code.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] paravirt: remove read hazard from cow · 3dc90795
      Zachary Amsden committed
      We don't want to read PTEs directly like this after they have been modified,
      as a lazy MMU implementation of direct page tables may not have written the
      updated PTE back to memory yet.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] invalidate_inode_pages2(): ignore page refcounts · bd4c8ce4
      Andrew Morton committed
      The recent fix to invalidate_inode_pages() (git commit 016eb4a0) managed to
      unfix invalidate_inode_pages2().
      
      The problem is that various bits of code in the kernel can take transient refs
      on pages: the page scanner will do this when inspecting a batch of pages, and
      the lru_cache_add() batching pagevecs also hold a ref.
      
      Net result is transient failures in invalidate_inode_pages2().  This affects
      NFS directory invalidation (observed) and presumably also block-backed
      direct-io (not yet reported).
      
      Fix it by reverting invalidate_inode_pages2() back to the old version which
      ignores the page refcounts.
      
      We may come up with something more clever later, but for now we need a 2.6.18
      fix for NFS.
      
      Cc: Chuck Lever <cel@citi.umich.edu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Support piping into commands in /proc/sys/kernel/core_pattern · d025c9db
      Andi Kleen committed
      Using the infrastructure created in previous patches implement support to
      pipe core dumps into programs.
      
      This is done by overloading the existing core_pattern sysctl
      with a new syntax:
      
      |program
      
      When the first character of the pattern is a '|' the kernel will instead
      treat the rest of the pattern as a command to run.  The core dump will be
      written to the standard input of that program instead of to a file.
      
      This is useful for having automatic core dump analysis without filling up
      disks.  The program can do some simple analysis and save only a summary of
      the core dump.
      
      The core dump process will run with the privileges and in the name space of
      the process that caused the core dump.
      
      I also increased the core pattern size to 128 bytes so that longer command
      lines fit.
      
      Most of the changes come from allowing core dumps without seeks.  They are
      fairly straightforward though.
      
      One small incompatibility is that if someone previously had a core pattern
      that started with '|' they will suddenly get new behaviour.  I think that's
      unlikely to be a real problem though.
      
      Additional background:
      
      > Very nice, do you happen to have a program that can accept this kind of
      > input for crash dumps?  I'm guessing that the embedded people will
      > really want this functionality.
      
      I had a cheesy demo/prototype.  Basically it wrote the dump to a file again,
      ran gdb on it to get a backtrace and wrote the summary to a shared directory.
      Then there was a simple CGI script to generate a "top 10" crashes HTML
      listing.
      
      Unfortunately this still had the disadvantage of needing full disk space for a
      dump, apart from deleting it afterwards (in fact it was worse because holes
      didn't work over the pipe, so if you have a holey address map it would
      require more space).
      
      Fortunately gdb seems to be happy to handle /proc/pid/fd/xxx input pipes as
      cores (at least it worked with zsh's =(cat core) syntax), so it would
      likely be possible to do it without temporary space with a simple wrapper that
      calls it in the right way.  I ran out of time before doing that though.
      
      The demo prototype scripts weren't very good.  If there is really interest I
      can dig them out (they are currently on a laptop disk on the desk with the
      laptop itself being in service), but I would recommend rewriting them for any
      serious application of this and fixing the disk space problem.
      
      Also to be really useful it should probably find a way to automatically fetch
      the debuginfos (I cheated and just installed them in advance).  If nobody else
      does it I can probably do the rewrite myself again at some point.
      
      My hope at some point was that desktops would support it in their builtin
      crash reporters, but at least the KDE people I talked to seemed to be happy
      with their user space only solution.
      
      Alan sayeth:
      
        I don't believe that piping as such is necessarily the right model, but
        the ability to intercept and process core dumps from user space is asked
        for by many enterprise users as well.  They want to know about, capture,
        analyse and process core dumps, often centrally and in automated form.
      
      [akpm@osdl.org: loff_t != unsigned long]
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
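A helper on the receiving end of the '|program' syntax described in the commit above only has to read the dump sequentially from its standard input. The following is a minimal sketch of such a reader, not the author's prototype: `struct core_summary` and `summarize_core()` are hypothetical names, and the "analysis" is limited to counting bytes and checking for the ELF magic number. It never seeks, which matters because the input is a pipe.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical summary record; a real helper would extract more. */
struct core_summary {
    unsigned long long bytes;  /* total size of the dump as received */
    int is_elf;                /* non-zero if the stream starts with the ELF magic */
};

/* Read a core dump sequentially from a stream (no seeks, so it also
 * works on a pipe) and fill in a small summary.  Returns 0 on success,
 * -1 on a read error. */
static int summarize_core(FILE *in, struct core_summary *out)
{
    unsigned char buf[4096];
    size_t n;
    int first = 1;

    out->bytes = 0;
    out->is_elf = 0;
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (first) {
            /* ELF files begin with the bytes 0x7f 'E' 'L' 'F'. */
            out->is_elf = n >= 4 && memcmp(buf, "\177ELF", 4) == 0;
            first = 0;
        }
        out->bytes += n;
    }
    return ferror(in) ? -1 : 0;
}
```

A complete helper would call `summarize_core(stdin, &summary)` from `main()` and write the result somewhere small. It would be enabled by writing a pattern of the form `|` followed by the helper's path to /proc/sys/kernel/core_pattern; the path itself is up to the administrator.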
    • [PATCH] Create call_usermodehelper_pipe() · e239ca54
      Andi Kleen committed
      A new member in the ever growing family of call_usermode* functions is
      born.  The new call_usermodehelper_pipe() function allows piping data to
      the stdin of the called user mode program and otherwise behaves like the
      normal call_usermodehelper() (except that it always waits for the child to
      finish).
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Some cleanup in the pipe code · d6cbd281
      Andi Kleen committed
      Split the big, hard-to-read do_pipe function into smaller pieces.
      
      This creates new create_write_pipe/free_write_pipe/create_read_pipe
      functions.  These functions are made global so that they can be used by
      other parts of the kernel.
      
      The resulting code is more generic and easier to read, and has cleaner error
      handling and fewer gotos.
      
      [akpm@osdl.org: cleanup]
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>