1. 06 9月, 2005 25 次提交
  2. 05 9月, 2005 15 次提交
    • B
      [PATCH] Uml support: add PTRACE_SYSEMU_SINGLESTEP option to i386 · 1b38f006
      Bodo Stroesser 提交于
      This patch implements the new ptrace option PTRACE_SYSEMU_SINGLESTEP, which
      can be used by UML to singlestep a process: it will receive SINGLESTEP
      interceptions for normal instructions and syscalls, but syscall execution will
      be skipped just like with PTRACE_SYSEMU.
      Signed-off-by: NBodo Stroesser <bstroesser@fujitsu-siemens.com>
      Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1b38f006
    • L
      [PATCH] UML Support - Ptrace: adds the host SYSEMU support, for UML and general usage · ed75e8d5
      Laurent Vivier 提交于
            Jeff Dike <jdike@addtoit.com>,
            Paolo 'Blaisorblade' Giarrusso <blaisorblade_spam@yahoo.it>,
            Bodo Stroesser <bstroesser@fujitsu-siemens.com>
      
      Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
      except that the kernel does not execute the requested syscall; this is useful
      to improve performance for virtual environments, like UML, which want to run
      the syscall on their own.
      
      In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
      and on exit, and each time you also have two context switches; with SYSEMU you
      avoid the 2nd stop and so save two context switches per syscall.
      
      Also, some architectures don't have support in the host for changing the
      syscall number via ptrace(), which is currently needed to skip syscall
      execution (UML turns any syscall into getpid() to avoid it being executed on
      the host).  Fixing that is hard, while SYSEMU is easier to implement.
      
      * This version of the patch includes some suggestions of Jeff Dike to avoid
        adding any instructions to the syscall fast path, plus some other little
        changes, by myself, to make it work even when the syscall is executed with
        SYSENTER (but I'm unsure about them). It has been widely tested for quite a
        lot of time.
      
      * Various fixed were included to handle the various switches between
        various states, i.e. when for instance a syscall entry is traced with one of
        PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
        Basically, this is done by remembering which one of them was used even after
        the call to ptrace_notify().
      
      * We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
        to make do_syscall_trace() notice that the current syscall was started with
        SYSEMU on entry, so that no notification ought to be done in the exit path;
        this is a bit of a hack, so this problem is solved in another way in next
        patches.
      
      * Also, the effects of the patch:
      "Ptrace - i386: fix Syscall Audit interaction with singlestep"
      are cancelled; they are restored back in the last patch of this series.
      
      Detailed descriptions of the patches doing this kind of processing follow (but
      I've already summed everything up).
      
      * Fix behaviour when changing interception kind #1.
      
        In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
        only after doing the debugger notification; but the debugger might have
        changed the status of this flag because he continued execution with
        PTRACE_SYSCALL, so this is wrong.  This patch fixes it by saving the flag
        status before calling ptrace_notify().
      
      * Fix behaviour when changing interception kind #2:
        avoid intercepting syscall on return when using SYSCALL again.
      
        A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
        crashes.
      
        The problem is in arch/i386/kernel/entry.S.  The current SYSEMU patch
        inhibits the syscall-handler to be called, but does not prevent
        do_syscall_trace() to be called after this for syscall completion
        interception.
      
        The appended patch fixes this.  It reuses the flag TIF_SYSCALL_EMU to
        remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
        the flag is unused in the depicted situation.
      
      * Fix behaviour when changing interception kind #3:
        avoid intercepting syscall on return when using SINGLESTEP.
      
        When testing 2.6.9 and the skas3.v6 patch, with my latest patch and had
        problems with singlestepping on UML in SKAS with SYSEMU.  It looped
        receiving SIGTRAPs without moving forward.  EIP of the traced process was
        the same for all SIGTRAPs.
      
      What's missing is to handle switching from PTRACE_SYSCALL_EMU to
      PTRACE_SINGLESTEP in a way very similar to what is done for the change from
      PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.
      
      I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
      notified and then wake ups the process; the syscall is executed (or skipped,
      when do_syscall_trace() returns 0, i.e.  when using PTRACE_SYSEMU), and
      do_syscall_trace() is called again.  Since we are on the return path of a
      SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
      we must still avoid notifying the parent of the syscall exit.  Now, this
      behaviour is extended even to resuming with PTRACE_SINGLESTEP.
      Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ed75e8d5
    • P
      [PATCH] swsusp: switch pm_message_t to struct · ca078bae
      Pavel Machek 提交于
      This adds type-checking to pm_message_t, so that people can't confuse it
      with int or u32.  It also allows us to fix "disk yoyo" during suspend (disk
      spinning down/up/down).
      
      [We've tried that before; since that cpufreq problems were fixed and I've
      tried make allyes config and fixed resulting damage.]
      Signed-off-by: NPavel Machek <pavel@suse.cz>
      Signed-off-by: NAlexander Nyberg <alexn@telia.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ca078bae
    • M
      [PATCH] x86: fix EFI memory map parsing · 7ae65fd3
      Matt Tolentino 提交于
      The memory descriptors that comprise the EFI memory map are not fixed in
      stone such that the size could change in the future.  This uses the memory
      descriptor size obtained from EFI to iterate over the memory map entries
      during boot.  This enables the removal of an x86 specific pad (and ifdef)
      in the EFI header.  I also couldn't stomach the broken up nature of the
      function to put EFI runtime calls into virtual mode any longer so I fixed
      that up a bit as well.
      
      For reference, this patch only impacts x86.
      Signed-off-by: NMatt Tolentino <matthew.e.tolentino@intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      7ae65fd3
    • R
      [PATCH] MIPS Technologies PCI ID bits · dc4ec916
      Ralf Baechle 提交于
      - MIPS Denmark does no longer exist; the PCI vendor ID is now owned by
        MIPS Technologies.
      
      - Add ID for SOC-it, MIPS's system controller.
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      dc4ec916
    • M
      [PATCH] ppc32: mv64x60 updates & enhancements · d01c08c9
      Mark A. Greer 提交于
      Updates and enhancement to the ppc32 mv64x60 code:
      - move code to get mem size from mem ctlr to bootwrapper
      - address some errata in the mv64360 pic code
      - some minor cleanups
      - export one of the bridge's regs via sysfs so user daemon can watch for
        extraction events
      Signed-off-by: NMark A. Greer <mgreer@mvista.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d01c08c9
    • M
      [PATCH] VM: add page_state info to per-node meminfo · c07e02db
      Martin Hicks 提交于
      Add page_state info to the per-node meminfo file in sysfs.  This is mostly
      just for informational purposes.
      
      The lack of this information was brought up recently during a discussion
      regarding pagecache clearing, and I put this patch together to test out one
      of the suggestions.
      
      It seems like interesting info to have, so I'm submitting the patch.
      Signed-off-by: NMartin Hicks <mort@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c07e02db
    • C
      [PATCH] remove hugetlb_clean_stale_pgtable() and fix huge_pte_alloc() · 0e5c9f39
      Chen, Kenneth W 提交于
      I don't think we need to call hugetlb_clean_stale_pgtable() anymore
      in 2.6.13 because of the rework with free_pgtables().  It now collect
      all the pte page at the time of munmap.  It used to only collect page
      table pages when entire one pgd can be freed and left with staled pte
      pages.  Not anymore with 2.6.13.  This function will never be called
      and We should turn it into a BUG_ON.
      
      I also spotted two problems here, not Adam's fault :-)
      (1) in huge_pte_alloc(), it looks like a bug to me that pud is not
          checked before calling pmd_alloc()
      (2) in hugetlb_clean_stale_pgtable(), it also missed a call to
          pmd_free_tlb.  I think a tlb flush is required to flush the mapping
          for the page table itself when we clear out the pmd pointing to a
          pte page.  However, since hugetlb_clean_stale_pgtable() is never
          called, so it won't trigger the bug.
      Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0e5c9f39
    • D
      [PATCH] arm: allow for arch-specific IOREMAP_MAX_ORDER · fd195c49
      Deepak Saxena 提交于
      Version 6 of the ARM architecture introduces the concept of 16MB pages
      (supersections) and 36-bit (40-bit actually, but nobody uses this) physical
      addresses.  36-bit addressed memory and I/O and ARMv6 can only be mapped
      using supersections and the requirement on these is that both virtual and
      physical addresses be 16MB aligned.  In trying to add support for ioremap()
      of 36-bit I/O, we run into the issue that get_vm_area() allows for a
      maximum of 512K alignment via the IOREMAP_MAX_ORDER constant.  To work
      around this, we can:
      
      - Allocate a larger VM area than needed (size + (1ul << IOREMAP_MAX_ORDER))
        and then align the pointer ourselves, but this ends up with 512K of
        wasted VM per ioremap().
      
      - Provide a new __get_vm_area_aligned() API and make __get_vm_area() sit
        on top of this. I did this and it works but I don't like the idea
        adding another VM API just for this one case.
      
      - My preferred solution which is to allow the architecture to override
        the IOREMAP_MAX_ORDER constant with it's own version.
      Signed-off-by: NDeepak Saxena <dsaxena@plexity.net>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fd195c49
    • P
      [PATCH] comment typo fix · e83a9596
      Paolo 'Blaisorblade' Giarrusso 提交于
      smp_entry_t -> swap_entry_t
      Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e83a9596
    • M
      [PATCH] VM: add capabilites check to set_zone_reclaim · bce5f6ba
      Martin Hicks 提交于
      Add a capability check to sys_set_zone_reclaim().  This syscall is not
      something that should be available to a user.
      Signed-off-by: NMartin Hicks <mort@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      bce5f6ba
    • N
      [PATCH] mm: remove atomic · 242e5468
      Nick Piggin 提交于
      This bitop does not need to be atomic because it is performed when there will
      be no references to the page (ie.  the page is being freed).
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      242e5468
    • C
      [PATCH] /proc/<pid>/numa_maps to show on which nodes pages reside · 6e21c8f1
      Christoph Lameter 提交于
      This patch was recently discussed on linux-mm:
      http://marc.theaimsgroup.com/?t=112085728500002&r=1&w=2
      
      I inherited a large code base from Ray for page migration.  There was a
      small patch in there that I find to be very useful since it allows the
      display of the locality of the pages in use by a process.  I reworked that
      patch and came up with a /proc/<pid>/numa_maps that gives more information
      about the vma's of a process.  numa_maps is indexes by the start address
      found in /proc/<pid>/maps.  F.e.  with this patch you can see the page use
      of the "getty" process:
      
      margin:/proc/12008 # cat maps
      00000000-00004000 r--p 00000000 00:00 0
      2000000000000000-200000000002c000 r-xp 00000000 08:04 516                /lib/ld-2.3.3.so
      2000000000038000-2000000000040000 rw-p 00028000 08:04 516                /lib/ld-2.3.3.so
      2000000000040000-2000000000044000 rw-p 2000000000040000 00:00 0
      2000000000058000-2000000000260000 r-xp 00000000 08:04 54707842           /lib/tls/libc.so.6.1
      2000000000260000-2000000000268000 ---p 00208000 08:04 54707842           /lib/tls/libc.so.6.1
      2000000000268000-2000000000274000 rw-p 00200000 08:04 54707842           /lib/tls/libc.so.6.1
      2000000000274000-2000000000280000 rw-p 2000000000274000 00:00 0
      2000000000280000-20000000002b4000 r--p 00000000 08:04 9126923            /usr/lib/locale/en_US.utf8/LC_CTYPE
      2000000000300000-2000000000308000 r--s 00000000 08:04 60071467           /usr/lib/gconv/gconv-modules.cache
      2000000000318000-2000000000328000 rw-p 2000000000318000 00:00 0
      4000000000000000-4000000000008000 r-xp 00000000 08:04 29576399           /sbin/mingetty
      6000000000004000-6000000000008000 rw-p 00004000 08:04 29576399           /sbin/mingetty
      6000000000008000-600000000002c000 rw-p 6000000000008000 00:00 0          [heap]
      60000fff7fffc000-60000fff80000000 rw-p 60000fff7fffc000 00:00 0
      60000ffffff44000-60000ffffff98000 rw-p 60000ffffff44000 00:00 0          [stack]
      a000000000000000-a000000000020000 ---p 00000000 00:00 0                  [vdso]
      
      cat numa_maps
      2000000000000000 default MaxRef=43 Pages=11 Mapped=11 N0=4 N1=3 N2=2 N3=2
      2000000000038000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
      2000000000040000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      2000000000058000 default MaxRef=43 Pages=61 Mapped=61 N0=14 N1=15 N2=16 N3=16
      2000000000268000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
      2000000000274000 default MaxRef=1 Pages=3 Mapped=3 Anon=3 N0=3
      2000000000280000 default MaxRef=8 Pages=3 Mapped=3 N0=3
      2000000000300000 default MaxRef=8 Pages=2 Mapped=2 N0=2
      2000000000318000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N2=1
      4000000000000000 default MaxRef=6 Pages=2 Mapped=2 N1=2
      6000000000004000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      6000000000008000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      60000fff7fffc000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      60000ffffff44000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      
      getty uses ld.so.  The first vma is the code segment which is used by 43
      other processes and the pages are evenly distributed over the 4 nodes.
      
      The second vma is the process specific data portion for ld.so.  This is
      only one page.
      
      The display format is:
      
      <startaddress>	 Links to information in /proc/<pid>/map
      <memory policy>  This can be "default" "interleave={}", "prefer=<node>" or "bind={<zones>}"
      MaxRef=		<maximum reference to a page in this vma>
      Pages=		<Nr of pages in use>
      Mapped=		<Nr of pages with mapcount >
      Anon=		<nr of anonymous pages>
      Nx=		<Nr of pages on Node x>
      
      The content of the proc-file is self-evident.  If this would be tied into
      the sparsemem system then the contents of this file would not be too
      useful.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      6e21c8f1
    • H
      [PATCH] swap: swap_lock replace list+device · 5d337b91
      Hugh Dickins 提交于
      The idea of a swap_device_lock per device, and a swap_list_lock over them all,
      is appealing; but in practice almost every holder of swap_device_lock must
      already hold swap_list_lock, which defeats the purpose of the split.
      
      The only exceptions have been swap_duplicate, valid_swaphandles and an
      untrodden path in try_to_unuse (plus a few places added in this series).
      valid_swaphandles doesn't show up high in profiles, but swap_duplicate does
      demand attention.  However, with the hold time in get_swap_pages so much
      reduced, I've not yet found a load and set of swap device priorities to show
      even swap_duplicate benefitting from the split.  Certainly the split is mere
      overhead in the common case of a single swap device.
      
      So, replace swap_list_lock and swap_device_lock by spinlock_t swap_lock
      (generally we seem to prefer an _ in the name, and not hide in a macro).
      
      If someone can show a regression in swap_duplicate, then probably we should
      add a hashlock for the swap_map entries alone (shorts being anatomic), so as
      to help the case of the single swap device too.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      5d337b91
    • H
      [PATCH] swap: scan_swap_map drop swap_device_lock · 52b7efdb
      Hugh Dickins 提交于
      get_swap_page has often shown up on latency traces, doing lengthy scans while
      holding two spinlocks.  swap_list_lock is already dropped, now scan_swap_map
      drop swap_device_lock before scanning the swap_map.
      
      While scanning for an empty cluster, don't worry that racing tasks may
      allocate what was free and free what was allocated; but when allocating an
      entry, check it's still free after retaking the lock.  Avoid dropping the lock
      in the expected common path.  No barriers beyond the locks, just let the
      cookie crumble; highest_bit limit is volatile, but benign.
      
      Guard against swapoff: must check SWP_WRITEOK before allocating, must raise
      SWP_SCANNING reference count while in scan_swap_map, swapoff wait for that to
      fall - just use schedule_timeout, we don't want to burden scan_swap_map
      itself, and it's very unlikely that anyone can really still be in
      scan_swap_map once swapoff gets this far.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      52b7efdb