1. 26 June 2005, 11 commits
  2. 24 June 2005, 19 commits
• [PATCH] kprobes: Temporary disarming of reentrant probe for x86_64 · aa3d7e3d
Committed by Prasanna S Panchamukhi
      This patch includes x86_64 architecture specific changes to support temporary
      disarming on reentrancy of probes.
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] Move kprobe [dis]arming into arch specific code · 7e1048b1
Committed by Rusty Lynch
The architecture independent code of the current kprobes implementation is
arming and disarming kprobes at registration time.  The problem is that the
code assumes that arming and disarming are just a simple write
of some magic value to an address.  This is problematic for ia64, where our
instructions look more like structures, and we cannot insert breakpoints
by just doing something like:
      
      *p->addr = BREAKPOINT_INSTRUCTION;
      
      The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent
      functions:
      
           * void arch_arm_kprobe(struct kprobe *p)
           * void arch_disarm_kprobe(struct kprobe *p)
      
      and then adds the new functions for each of the architectures that already
implement kprobes (sparc64/ppc64/i386/x86_64).
      
      I thought arch_[dis]arm_kprobe was the most descriptive of what was really
      happening, but each of the architectures already had a disarm_kprobe()
      function that was really a "disarm and do some other clean-up items as
      needed when you stumble across a recursive kprobe." So...  I took the
      liberty of changing the code that was calling disarm_kprobe() to call
      arch_disarm_kprobe(), and then do the cleanup in the block of code dealing
      with the recursive kprobe case.
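
On breakpoint-based architectures the two new hooks can stay as trivial as the
write shown above.  A minimal sketch of what they might look like on
i386/x86_64 (assuming the existing kprobes fields p->addr and p->opcode, where
the original opcode is saved at registration time):

void arch_arm_kprobe(struct kprobe *p)
{
	*p->addr = BREAKPOINT_INSTRUCTION;
	flush_icache_range((unsigned long) p->addr,
			   (unsigned long) p->addr + sizeof(kprobe_opcode_t));
}

void arch_disarm_kprobe(struct kprobe *p)
{
	/* put back the original opcode that registration saved away */
	*p->addr = p->opcode;
	flush_icache_range((unsigned long) p->addr,
			   (unsigned long) p->addr + sizeof(kprobe_opcode_t));
}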
      
So far this patch has been tested on i386, x86_64, and ppc64, but it still
needs to be tested on sparc64.
Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64 specific function return probes · 73649dab
Committed by Rusty Lynch
      The following patch adds the x86_64 architecture specific implementation
      for function return probes.
      
      Function return probes is a mechanism built on top of kprobes that allows
      a caller to register a handler to be called when a given function exits.
      For example, to instrument the return path of sys_mkdir:
      
      static int sys_mkdir_exit(struct kretprobe_instance *i, struct pt_regs *regs)
      {
      	printk("sys_mkdir exited\n");
      	return 0;
      }
      static struct kretprobe return_probe = {
      	.handler = sys_mkdir_exit,
      };
      
      <inside setup function>
      
      return_probe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("sys_mkdir");
      if (register_kretprobe(&return_probe)) {
      	printk(KERN_DEBUG "Unable to register return probe!\n");
      	/* do error path */
      }
      
      <inside cleanup function>
      unregister_kretprobe(&return_probe);
      
      The way this works is that:
      
      * At system initialization time, kernel/kprobes.c installs a kprobe
        on a function called kretprobe_trampoline() that is implemented in
  arch/x86_64/kernel/kprobes.c (more on this later).
      
      * When a return probe is registered using register_kretprobe(),
        kernel/kprobes.c will install a kprobe on the first instruction of the
        targeted function with the pre handler set to arch_prepare_kretprobe()
        which is implemented in arch/x86_64/kernel/kprobes.c.
      
      * arch_prepare_kretprobe() will prepare a kretprobe instance that stores:
        - nodes for hanging this instance in an empty or free list
        - a pointer to the return probe
        - the original return address
        - a pointer to the stack address
      
        With all this stowed away, arch_prepare_kretprobe() then sets the return
        address for the targeted function to a special trampoline function called
        kretprobe_trampoline() implemented in arch/x86_64/kernel/kprobes.c
      
      * The kprobe completes as normal, with control passing back to the target
        function that executes as normal, and eventually returns to our trampoline
        function.
      
      * Since a kprobe was installed on kretprobe_trampoline() during system
        initialization, control passes back to kprobes via the architecture
  specific function trampoline_probe_handler(), which will look up the
  instance in a hlist maintained by kernel/kprobes.c, and then call
        the handler function.
      
      * When trampoline_probe_handler() is done, the kprobes infrastructure
  single-steps the original instruction (in this case just a nop), and
        then calls trampoline_post_handler().  trampoline_post_handler() then
        looks up the instance again, puts the instance back on the free list,
        and then makes a long jump back to the original return instruction.
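
A sketch of the return-address swap performed by arch_prepare_kretprobe(), as
described above; the free/used-list helpers are the kernel/kprobes.c ones
alluded to, and their names here are illustrative:

/* kretprobe_trampoline is the probed asm stub in arch/x86_64/kernel/kprobes.c */
void arch_prepare_kretprobe(struct kretprobe *rp, struct pt_regs *regs)
{
	struct kretprobe_instance *ri;

	if ((ri = get_free_rp_inst(rp)) != NULL) {
		ri->rp = rp;
		ri->stack_addr = (void *) regs->rsp;
		/* remember where the function really wants to return to ... */
		ri->ret_addr = (void *) *(unsigned long *) regs->rsp;
		add_rp_inst(ri);
		/* ... then make it return into the trampoline instead */
		*(unsigned long *) regs->rsp =
				(unsigned long) &kretprobe_trampoline;
	} else {
		rp->nmissed++;
	}
}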
      
      So to recap, to instrument the exit path of a function this implementation
      will cause four interruptions:
      
        - A breakpoint at the very beginning of the function allowing us to
          switch out the return address
        - A single step interruption to execute the original instruction that
          we replaced with the break instruction (normal kprobe flow)
        - A breakpoint in the trampoline function where our instrumented function
          returned to
        - A single step interruption to execute the original instruction that
          we replaced with the break instruction (normal kprobe flow)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] xen: x86_64: use more usermode macro · 76381fee
Committed by Vincent Hanquez
Make use of the user_mode macro where possible.  This is useful for Xen
because it will then only need to redefine the macro as a hypervisor call.
Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] xen: x86_64: Add macro for debugreg · e9129e56
Committed by Vincent Hanquez
Add two macros to set and get the debug registers (debugreg) on x86_64.  This
is useful for Xen because it will then only need to redefine each macro as a
hypervisor call.
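
A sketch of what such accessors can look like as plain inline assembly; the
point is that Xen only has to replace these two macros with hypercalls (the
exact names and operand forms in the patch may differ):

/* read/write debug register "register" (0-7) on x86_64 */
#define get_debugreg(var, register)			\
	__asm__("movq %%db" #register ", %0"		\
		: "=r" (var))
#define set_debugreg(value, register)			\
	__asm__("movq %0, %%db" #register		\
		: /* no output */			\
		: "r" (value))
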
Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: avoid wasting IRQs · 701067c4
Committed by Natalie Protasevich
I suggest changing the way IRQs are handed out to PCI devices.
      
Currently, each I/O APIC pin gets associated with an IRQ, no matter if the
pin is used or not.  It is expected that each pin can potentially be
engaged by a device inserted into the corresponding PCI slot.  However,
this imposes a severe limitation on systems whose designs employ
many I/O APICs while utilizing only a couple of lines of each, such as the
P64H2 chipset.

It is used in the ES7000, and currently there is no way to boot the system
with more than 9 I/O APICs.
      
The simple change below allows booting a system with, say, 64 (or more) I/O
APICs, each providing one slot, which is otherwise impossible because of the
IRQ gaps created for unused lines on each I/O APIC.  It does not resolve the
problem of the number of devices exceeding the number of possible IRQs, but it
eases the pressure on IRQs on any large system with a potentially large
number of devices.
      
I only implemented this for ACPI boot, since if the system is this big
and uses newer chipsets it is probably (better be!) an ACPI based system
:).  The change is completely "mechanical" and does not alter any internal
structures or the interrupt model/implementation.  The patch works for both
the i386 and x86_64 archs.  It works with MSIs just fine, and should not
interfere with implementations like shared vectors, when they get worked
out and incorporated.
      
To illustrate, below is the interrupt distribution for a 2-cell ES7000 with
20 I/O APICs, and an Ethernet card in the last slot, which should be eth1
and which was not configured because its IRQ exceeded the allowable number (it
actually turned out huge: 480!):
      
      zorro-tb2:~ # cat /proc/interrupts
                 CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
        0:      65716      30012      30007      30002      30009      30010      30010      30010    IO-APIC-edge  timer
        4:        373          0        725        280          0          0          0          0    IO-APIC-edge  serial
        8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
        9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
       14:         39          3          0          0          0          0          0          0    IO-APIC-edge  ide0
       16:        108         13          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
       18:          0          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb3
       19:         15          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
       23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
       96:       4240        397         18          0          0          0          0          0   IO-APIC-level  aic7xxx
       97:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx
      192:        847          0          0          0          0          0          0          0   IO-APIC-level  eth0
      NMI:          0          0          0          0          0          0          0          0
      LOC:     273423     274528     272829     274228     274092     273761     273827     273694
      ERR:          7
      MIS:          0
      
Even though the system doesn't have that many devices, some don't get
enabled simply because of the IRQ numbering model.
      
      This is the IRQ picture after the patch was applied:
      
      zorro-tb2:~ # cat /proc/interrupts
                 CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
        0:      44169      10004      10004      10001      10004      10003      10004       6135    IO-APIC-edge  timer
        4:        345          0          0          0          0        244          0          0    IO-APIC-edge  serial
        8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
        9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
       14:         39          0          3          0          0          0          0          0    IO-APIC-edge  ide0
       17:       4425          0          9          0          0          0          0          0   IO-APIC-level  aic7xxx
       18:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx, uhci_hcd:usb3
       21:        231          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
       22:         26          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
       23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
       24:        348          0          0          0          0          0          0          0   IO-APIC-level  eth0
       25:          6        192          0          0          0          0          0          0   IO-APIC-level  eth1
      NMI:          0          0          0          0          0          0          0          0
      LOC:     107981     107636     108899     108698     108489     108326     108331     108254
      ERR:          7
      MIS:          0
      
Not only do we see the card in the last I/O APIC, but we are not even close
to using up the available IRQs, since we didn't waste any.
Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: never block forced SIGSEGV · 0928d6ef
Committed by Roland McGrath
      This is the x86_64 version of the signal fix I just posted for i386.
      
      This problem was first noticed on PPC and has already been fixed there.
      But the exact same issue applies to other platforms in the same way.  The
      signal blocking for sa_mask and the handled signal takes place after the
      handler setup.  When the stack is bogus, the handler setup forces a
      SIGSEGV.  But then this will be blocked, and returning to user mode will
      fault again and iterate.  This patch fixes the problem by checking whether
      signal handler setup failed, and not doing the signal-blocking if so.  This
      copies what was done in the ppc code.  I think all architectures' signal
      handler setup code follows this pattern and needs the change.
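
The resulting shape of handle_signal() is roughly as below; this is a sketch
following the description and the earlier ppc fix, not the exact x86-64 hunk:

	ret = setup_rt_frame(sig, ka, info, oldset, regs);

	if (ret == 0) {
		/* Block sa_mask and the handled signal only if frame setup
		 * succeeded; a failed setup already forced SIGSEGV, which
		 * must stay deliverable. */
		spin_lock_irq(&current->sighand->siglock);
		sigorsets(&current->blocked, &current->blocked,
			  &ka->sa.sa_mask);
		if (!(ka->sa.sa_flags & SA_NODEFER))
			sigaddset(&current->blocked, sig);
		recalc_sigpending();
		spin_unlock_irq(&current->sighand->siglock);
	}

	return ret;
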
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: fix hpet for systems that don't support legacy replacement · a3a00751
Committed by john stultz
      Currently the x86-64 HPET code assumes the entire HPET implementation from
      the spec is present.  This breaks on boxes that do not implement the
      optional legacy timer replacement functionality portion of the spec.
      
      This patch fixes this issue, allowing x86-64 systems that cannot use the
      HPET for the timer interrupt and RTC to still use the HPET as a time
source.  I've tested this patch on systems without HPET, on systems with HPET
but without legacy timer replacement, and on systems with HPET with legacy
timer replacement.
      
      This version adds a minor check to cap the HPET counter value in
      gettimeoffset_hpet to avoid possible time inconsistencies.  Please ignore
      the A2 version I sent to you earlier.
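
The capping check amounts to something like the following in
do_gettimeoffset_hpet(); field names follow the x86-64 vxtime bookkeeping of
that era, so treat this as a sketch rather than the exact hunk:

static unsigned int do_gettimeoffset_hpet(void)
{
	/* never account for more than one tick worth of HPET counts,
	 * so a late read cannot produce an inconsistent offset */
	unsigned long t = hpet_readl(HPET_COUNTER) - vxtime.last;

	if (t > hpet_tick)
		t = hpet_tick;

	return (t * vxtime.quot) >> 32;
}
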
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: i8259.c iso99 structure initialization · c0a88c98
Committed by Alexander Nyberg
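
"iso99" here refers to ISO C99 designated initializers replacing positional
struct initializers.  An illustrative before/after for the cascade IRQ action
in i8259.c (the field values are assumptions, not the exact hunk):

/* before: positional, fragile against struct irqaction field reordering */
static struct irqaction irq2 = { no_action, 0, CPU_MASK_NONE, "cascade", NULL, NULL };

/* after: C99 designated initializers name the fields explicitly */
static struct irqaction irq2 = {
	.handler = no_action,
	.mask    = CPU_MASK_NONE,
	.name    = "cascade",
};
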
      Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] i386: Selectable Frequency of the Timer Interrupt · 59121003
Committed by Christoph Lameter
      Make the timer frequency selectable. The timer interrupt may cause bus
      and memory contention in large NUMA systems since the interrupt occurs
      on each processor HZ times per second.
Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] allow early printk to use more than 25 lines · 799d19f6
Committed by Jan Beulich
      Allow early printk code to take advantage of the full size of the screen, not
      just the first 25 lines.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86/x86_64: pcibus_to_node · 8c5a0908
Committed by Christoph Lameter
      Define pcibus_to_node to be able to figure out which NUMA node contains a
      given PCI device.  This defines pcibus_to_node(bus) in
      include/linux/topology.h and adjusts the macros for i386 and x86_64 that
      already provided a way to determine the cpumask of a pci device.
      
x86_64 was changed to not build an array of cpumasks anymore.  Instead, an
array of nodes is built, which can be used to generate the cpumask via
node_to_cpumask.
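
The generic fallback and the cpumask derivation fit together roughly like this
(a sketch of the asm-generic side; the per-bus node array behind the x86_64
pcibus_to_node() is arch-specific and not shown):

#ifndef pcibus_to_node
#define pcibus_to_node(bus)	(-1)	/* node unknown */
#endif

#ifndef pcibus_to_cpumask
#define pcibus_to_cpumask(bus)					\
	(pcibus_to_node(bus) == -1 ?				\
		CPU_MASK_ALL :					\
		node_to_cpumask(pcibus_to_node(bus)))
#endif
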
Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] Platform SMIs and their interferance with tsc based delay calibration · 8a9e1b0f
Committed by Venkatesh Pallipadi
      Issue:
The current TSC-based delay calibration can result in significant errors in
the loops_per_jiffy count when platform events like SMIs
(System Management Interrupts, which are non-maskable) are present. This could
lead to a kernel panic(). This issue is becoming more visible with the 2.6
kernel (as the default HZ is 1000) and on platforms with higher SMI handling
latencies. During boot, SMIs are mostly used by the BIOS (for things
like legacy keyboard emulation).
      
      Description:
The pseudocode for the current TSC-based delay calibration looks like this:
      (0) Estimate a value for loops_per_jiffy
      (1) While (loops_per_jiffy estimate is accurate enough)
      (2)   wait for jiffy transition (jiffy1)
      (3)   Note down current tsc (tsc1)
      (4)   loop until tsc becomes tsc1 + loops_per_jiffy
      (5)   check whether jiffy changed since jiffy1 or not and refine
      loops_per_jiffy estimate
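
In C the loop looks roughly like this, with the two windows where an SMI skews
the estimate marked; accurate_enough() and refine() are stand-ins for the real
convergence logic, and lpj holds the current loops_per_jiffy estimate:

	while (!accurate_enough(lpj)) {			/* (1) */
		unsigned long j = jiffies;
		cycles_t tsc1;

		while (jiffies == j)			/* (2) wait for a jiffy edge */
			cpu_relax();
		tsc1 = get_cycles();			/* (3) SMI between (2) and (3): lpj too low */
		while (get_cycles() < tsc1 + lpj)	/* (4) SMI between (3) and (4): lpj too high */
			cpu_relax();
		lpj = refine(lpj, jiffies - j);		/* (5) */
	}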
      
      Consider the following cases
      Case 1:
If SMIs happen between (2) and (3) above, we can end up with a
loops_per_jiffy value that is too low. This results in shortened delays and
the kernel can panic() during boot (mostly at IOAPIC timer initialization,
in timer_irq_works(), as we don't have enough timer interrupts in the
specified interval).
      
      Case 2:
If SMIs happen between (3) and (4) above, then we can end up with a
loops_per_jiffy value that is too high. And with the current i386 code, a too
high lpj value (greater than 17M) can result in an overflow in
delay.c:__const_udelay(), again resulting in a shorter delay and a panic().
      
      Solution:
The patch below makes the calibration routine aware of asynchronous events
like SMIs. We increase the delay calibration time and also identify any
significant errors (greater than 12.5%) in the calibration and notify the
user.
      
      Patch below changes both i386 and x86-64 architectures to use this
      new and improved calibrate_delay_direct() routine.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] use ${CROSS_COMPILE}installkernel in arch/*/boot/install.sh · 0f8e2d62
Committed by Ian Campbell
      The attached patch causes the various arch specific install.sh scripts to
      look for ${CROSS_COMPILE}installkernel rather than just installkernel (in
      both /sbin/ and ~/bin/ where the script already did this).  This allows you
      to have e.g.  arm-linux-installkernel as a handy way to install on your
      cross target.  It also prevents the script picking up on the host
      /sbin/installkernel which causes the script to fall through and do the
      install itself (which is what I actually use myself, with $INSTALL_PATH
      set).
      
      I don't believe it causes back-compatibility problems since calling the
      host installkernel was never likely to work or be what you wanted when
      cross compiling anyway.  If $CROSS_COMPILE isn't set then nothing changes.
      
      I only use ARM and i386 myself but I figured it couldn't hurt to do the
      whole lot.  I've cc'd those who I hope are the arch maintainers for files
      that I've touched.
Signed-off-by: Ian Campbell <icampbell@arcom.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] add x86-64 specific support for sparsemem · bbfceef4
Committed by Matt Tolentino
      This patch adds in the necessary support for sparsemem such that x86-64
      kernels may use sparsemem as an alternative to discontigmem for NUMA
kernels.  Note that this does not preclude one from continuing to build NUMA
      kernels using discontigmem, but merely allows the option to build NUMA
      kernels with sparsemem.
      
      Interestingly, the use of sparsemem in lieu of discontigmem in NUMA kernels
      results in reduced text size for otherwise equivalent kernels as shown in
      the example builds below:
      
         text	   data	    bss	    dec	    hex	filename
      2371036	 765884	1237108	4374028	 42be0c	vmlinux.discontig
      2366549	 776484	1302772	4445805	 43d66d	vmlinux.sparse
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] reorganize x86-64 NUMA and DISCONTIGMEM config options · 2b97690f
Committed by Matt Tolentino
In order to use the alternative sparsemem implementation for NUMA kernels,
      we need to reorganize the config options.  This patch effectively abstracts
      out the CONFIG_DISCONTIGMEM options to CONFIG_NUMA in most cases.  Thus,
      the discontigmem implementation may be employed as always, but the
      sparsemem implementation may be used alternatively.
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] add x86-64 Kconfig options for sparsemem · 1035faf1
Committed by Matt Tolentino
      Add the requisite arch specific Kconfig options to enable the use of the
      sparsemem implementation for NUMA kernels on x86-64.
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] remove direct ref to contig_page_data for x86-64 · 07332663
Committed by Matt Tolentino
      This patch pulls out all remaining direct references to contig_page_data
      from arch/x86-64, thus saving an ifdef in one case.
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] make each arch use mm/Kconfig · 3f22ab27
Committed by Dave Hansen
      For all architectures, this just means that you'll see a "Memory Model"
      choice in your architecture menu.  For those that implement DISCONTIGMEM,
      you may eventually want to make your ARCH_DISCONTIGMEM_ENABLE a "def_bool
      y" and make your users select DISCONTIGMEM right out of the new choice
      menu.  The only disadvantage might be if you have some specific things that
      you need in your help option to explain something about DISCONTIGMEM.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
3. 22 June 2005, 3 commits
• [PATCH] Avoiding mmap fragmentation · 1363c3cd
Committed by Wolfgang Wander
      Ingo recently introduced a great speedup for allocating new mmaps using the
      free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
      causes huge performance increases in thread creation.
      
      The downside of this patch is that it does lead to fragmentation in the
      mmap-ed areas (visible via /proc/self/maps), such that some applications
      that work fine under 2.4 kernels quickly run out of memory on any 2.6
      kernel.
      
      The problem is twofold:
      
        1) the free_area_cache is used to continue a search for memory where
           the last search ended.  Before the change new areas were always
           searched from the base address on.
      
     So now new small areas are cluttering holes of all sizes
     throughout the whole mmap-able region, whereas before, small requests
     tended to fill holes near the base, leaving holes far from the base
     large and available for larger requests.
      
        2) the free_area_cache also is set to the location of the last
           munmap-ed area so in scenarios where we allocate e.g.  five regions of
           1K each, then free regions 4 2 3 in this order the next request for 1K
           will be placed in the position of the old region 3, whereas before we
           appended it to the still active region 1, placing it at the location
           of the old region 2.  Before we had 1 free region of 2K, now we only
           get two free regions of 1K -> fragmentation.
      
The patch addresses these issues by introducing yet another cache descriptor,
cached_hole_size, that contains the largest known hole size below the
current free_area_cache.  If a new request comes in, its size is compared
against the cached_hole_size, and if the request can be filled with a hole
below free_area_cache, the search is started from the base instead.
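
The decision boils down to a check at the top of the unmapped-area search,
simplified here from the generic arch_get_unmapped_area() rather than quoted
from the patch:

	if (len > mm->cached_hole_size) {
		/* no known hole below free_area_cache is big enough:
		 * continue searching upward from where we stopped last time */
		start_addr = addr = mm->free_area_cache;
	} else {
		/* a hole below the cache could fit this request:
		 * restart the search from the base */
		start_addr = addr = TASK_UNMAPPED_BASE;
		mm->cached_hole_size = 0;
	}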
      
      The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
      (earlier posted) leakme.c test program terminates after 50000+ iterations
      with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
      (as expected) with thread creation, Ingo's test_str02 with 20000 threads
      requires 0.7s system time.
      
Taking out Ingo's patch (un-patch available per request) by basically
deleting all mentions of free_area_cache from the kernel and starting the
search for new memory always at the respective bases, we observe: leakme
terminates successfully with 11 distinct, hardly fragmented areas in
/proc/self/maps, but thread creation is grindingly slow: 30+s(!) system
time for Ingo's test_str02 with 20000 threads.
      
      Now - drumroll ;-) the appended patch works fine with leakme: it ends with
      only 7 distinct areas in /proc/self/maps and also thread creation seems
      sufficiently fast with 0.71s for 20000 threads.
Signed-off-by: Wolfgang Wander <wwc@rentec.com>
Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] smp_processor_id() cleanup · 39c715b7
Committed by Ingo Molnar
      This patch implements a number of smp_processor_id() cleanup ideas that
      Arjan van de Ven and I came up with.
      
The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow on both the implementation and the
usage side.
      
      Some of the complexity arose from picking wrong names, some of the
      complexity comes from the fact that not all architectures defined
      __smp_processor_id.
      
      In the new code, there are two externally visible symbols:
      
       - smp_processor_id(): debug variant.
      
       - raw_smp_processor_id(): nondebug variant. Replaces all existing
         uses of _smp_processor_id() and __smp_processor_id(). Defined
         by every SMP architecture in include/asm-*/smp.h.
      
      There is one new internal symbol, dependent on DEBUG_PREEMPT:
      
       - debug_smp_processor_id(): internal debug variant, mapped to
                                   smp_processor_id().
      
Also, I moved debug_smp_processor_id() from lib/kernel_lock.c into a new
      lib/smp_processor_id.c file.  All related comments got updated and/or
      clarified.
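
The resulting mapping between the three symbols is roughly the following
(a sketch of the include/linux/smp.h side under the names above):

#ifdef CONFIG_DEBUG_PREEMPT
  extern unsigned int debug_smp_processor_id(void);
# define smp_processor_id()	debug_smp_processor_id()
#else
# define smp_processor_id()	raw_smp_processor_id()
#endif
/* raw_smp_processor_id() itself comes from each architecture's asm/smp.h */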
      
      I have build/boot tested the following 8 .config combinations on x86:
      
       {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}
      
      I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
      architectures are untested, but should work just fine.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: TASK_SIZE fixes for compatibility mode processes · 84929801
Committed by Suresh Siddha
The appended patch will set up the compatibility mode TASK_SIZE properly.  This
will fix at least three known bugs that can be encountered while running
compatibility mode apps.
      
      a) A malicious 32bit app can have an elf section at 0xffffe000.  During
         exec of this app, we will have a memory leak as insert_vm_struct() is
         not checking for return value in syscall32_setup_pages() and thus not
         freeing the vma allocated for the vsyscall page.  And instead of exec
         failing (as it has addresses > TASK_SIZE), we were allowing it to
         succeed previously.
      
      b) With a 32bit app, hugetlb_get_unmapped_area/arch_get_unmapped_area
         may return addresses beyond 32bits, ultimately causing corruption
         because of wrap-around and resulting in SEGFAULT, instead of returning
         ENOMEM.
      
c) A 32bit app doing the mmap below will now fail.
      
        mmap((void *)(0xFFFFE000UL), 0x10000UL, PROT_READ|PROT_WRITE,
      	MAP_FIXED|MAP_PRIVATE|MAP_ANON, 0, 0);
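
The core of the fix is making TASK_SIZE depend on the thread's 32-bit flag.  A
sketch of the kind of definitions involved, assuming the TIF_IA32 flag and a
3GB/legacy-layout split (the exact constants may differ):

#define IA32_PAGE_OFFSET	((current->personality & ADDR_LIMIT_3GB) ? \
					0xc0000000UL : 0xFFFFe000UL)
#define TASK_SIZE		(test_thread_flag(TIF_IA32) ? \
					IA32_PAGE_OFFSET : TASK_SIZE64)
#define TASK_SIZE_OF(child)	(test_tsk_thread_flag(child, TIF_IA32) ? \
					IA32_PAGE_OFFSET : TASK_SIZE64)
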
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4. 09 June 2005, 1 commit
  5. 01 June 2005, 2 commits
  6. 29 May 2005, 1 commit
  7. 27 May 2005, 1 commit
• [PATCH] Note on ACPI build fix · 8aadff7d
Committed by Alexander Nyberg
      Even after the previous fix you can still set CONFIG_ACPI_BOOT
      indirectly even without CONFIG_ACPI by choosing CONFIG_PCI and
      CONFIG_PCI_MMCONFIG.
      
      That doesn't build very well either.
      
      This makes PCI_MMCONFIG depend on ACPI, fixing that hole.
      
      [ I guess in theory Kconfig could follow the whole chain of dependencies
        for things that get selected, but that sounds insanely complicated, so
        we'll just fix up these things by hand.  --Linus ]
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
8. 26 May 2005, 1 commit
  9. 21 May 2005, 1 commit