1. 08 7月, 2005 1 次提交
  2. 06 7月, 2005 1 次提交
  3. 30 6月, 2005 1 次提交
  4. 28 6月, 2005 3 次提交
    • R
      [PATCH] Return probe redesign: x86_64 specific changes · ba8af12f
      Rusty Lynch 提交于
      The following patch contains the x86_64 specific changes for the new
      return probe design.  Changes include:
       * Removing the architecture specific functions for querying a return probe
         instance off a stack address
       * Complete rework onf arch_prepare_kretprobe() and trampoline_probe_handler()
       * Removing trampoline_post_handler()
       * Adding arch_init() so that now we handle registering the return probe
         trampoline instead of kernel/kprobes.c doing it
      
      NOTE:
      Note that with this new design, the dependency on calculating a pointer to
      the task off the stack pointer no longer exist (resolving the problem of
      interruption stacks as pointed out in the original feedback to this port.)
      Signed-off-by: NRusty Lynch <rusty.lynch@intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ba8af12f
    • A
      [PATCH] kprobes: fix single-step out of line - take2 · 9ec4b1f3
      Ananth N Mavinakayanahalli 提交于
      Now that PPC64 has no-execute support, here is a second try to fix the
      single step out of line during kprobe execution.  Kprobes on x86_64 already
      solved this problem by allocating an executable page and using it as the
      scratch area for stepping out of line.  Reuse that.
      Signed-off-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      9ec4b1f3
    • A
      [PATCH] seccomp: tsc disable · ffaa8bd6
      Andrea Arcangeli 提交于
      I believe at least for seccomp it's worth to turn off the tsc, not just for
      HT but for the L2 cache too.  So it's up to you, either you turn it off
      completely (which isn't very nice IMHO) or I recommend to apply this below
      patch.
      
      This has been tested successfully on x86-64 against current cogito
      repository (i686 compiles so I didn't bother testing ;).  People selling
      the cpu through cpushare may appreciate this bit for a peace of mind.
      
      There's no way to get any timing info anymore with this applied
      (gettimeofday is forbidden of course).  The seccomp environment is
      completely deterministic so it can't be allowed to get timing info, it has
      to be deterministic so in the future I can enable a computing mode that
      does a parallel computing for each task with server side transparent
      checkpointing and verification that the output is the same from all the 2/3
      seller computers for each task, without the buyer even noticing (for now
      the verification is left to the buyer client side and there's no
      checkpointing, since that would require more kernel changes to track the
      dirty bits but it'll be easy to extend once the basic mode is finished).
      
      Eliminating a cold-cache read of the cr4 global variable will save one
      cacheline during the tlb flush while making the code per-cpu-safe at the
      same time.  Thanks to Mikael Pettersson for noticing the tlb flush wasn't
      per-cpu-safe.
      
      The global tlb flush can run from irq (IPI calling do_flush_tlb_all) but
      it'll be transparent to the switch_to code since the IPI won't make any
      change to the cr4 contents from the point of view of the interrupted code
      and since it's now all per-cpu stuff, it will not race.  So no need to
      disable irqs in switch_to slow path.
      Signed-off-by: NAndrea Arcangeli <andrea@cpushare.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ffaa8bd6
  5. 26 6月, 2005 20 次提交
  6. 24 6月, 2005 14 次提交
    • P
      [PATCH] kprobes: Temporary disarming of reentrant probe for x86_64 · aa3d7e3d
      Prasanna S Panchamukhi 提交于
      This patch includes x86_64 architecture specific changes to support temporary
      disarming on reentrancy of probes.
      Signed-of-by: NPrasanna S Panchamukhi <prasanna@in.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      aa3d7e3d
    • R
      [PATCH] Move kprobe [dis]arming into arch specific code · 7e1048b1
      Rusty Lynch 提交于
      The architecture independent code of the current kprobes implementation is
      arming and disarming kprobes at registration time.  The problem is that the
      code is assuming that arming and disarming is a just done by a simple write
      of some magic value to an address.  This is problematic for ia64 where our
      instructions look more like structures, and we can not insert break points
      by just doing something like:
      
      *p->addr = BREAKPOINT_INSTRUCTION;
      
      The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent
      functions:
      
           * void arch_arm_kprobe(struct kprobe *p)
           * void arch_disarm_kprobe(struct kprobe *p)
      
      and then adds the new functions for each of the architectures that already
      implement kprobes (spar64/ppc64/i386/x86_64).
      
      I thought arch_[dis]arm_kprobe was the most descriptive of what was really
      happening, but each of the architectures already had a disarm_kprobe()
      function that was really a "disarm and do some other clean-up items as
      needed when you stumble across a recursive kprobe." So...  I took the
      liberty of changing the code that was calling disarm_kprobe() to call
      arch_disarm_kprobe(), and then do the cleanup in the block of code dealing
      with the recursive kprobe case.
      
      So far this patch as been tested on i386, x86_64, and ppc64, but still
      needs to be tested in sparc64.
      Signed-off-by: NRusty Lynch <rusty.lynch@intel.com>
      Signed-off-by: NAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      7e1048b1
    • R
      [PATCH] x86_64 specific function return probes · 73649dab
      Rusty Lynch 提交于
      The following patch adds the x86_64 architecture specific implementation
      for function return probes.
      
      Function return probes is a mechanism built on top of kprobes that allows
      a caller to register a handler to be called when a given function exits.
      For example, to instrument the return path of sys_mkdir:
      
      static int sys_mkdir_exit(struct kretprobe_instance *i, struct pt_regs *regs)
      {
      	printk("sys_mkdir exited\n");
      	return 0;
      }
      static struct kretprobe return_probe = {
      	.handler = sys_mkdir_exit,
      };
      
      <inside setup function>
      
      return_probe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("sys_mkdir");
      if (register_kretprobe(&return_probe)) {
      	printk(KERN_DEBUG "Unable to register return probe!\n");
      	/* do error path */
      }
      
      <inside cleanup function>
      unregister_kretprobe(&return_probe);
      
      The way this works is that:
      
      * At system initialization time, kernel/kprobes.c installs a kprobe
        on a function called kretprobe_trampoline() that is implemented in
        the arch/x86_64/kernel/kprobes.c  (More on this later)
      
      * When a return probe is registered using register_kretprobe(),
        kernel/kprobes.c will install a kprobe on the first instruction of the
        targeted function with the pre handler set to arch_prepare_kretprobe()
        which is implemented in arch/x86_64/kernel/kprobes.c.
      
      * arch_prepare_kretprobe() will prepare a kretprobe instance that stores:
        - nodes for hanging this instance in an empty or free list
        - a pointer to the return probe
        - the original return address
        - a pointer to the stack address
      
        With all this stowed away, arch_prepare_kretprobe() then sets the return
        address for the targeted function to a special trampoline function called
        kretprobe_trampoline() implemented in arch/x86_64/kernel/kprobes.c
      
      * The kprobe completes as normal, with control passing back to the target
        function that executes as normal, and eventually returns to our trampoline
        function.
      
      * Since a kprobe was installed on kretprobe_trampoline() during system
        initialization, control passes back to kprobes via the architecture
        specific function trampoline_probe_handler() which will lookup the
        instance in an hlist maintained by kernel/kprobes.c, and then call
        the handler function.
      
      * When trampoline_probe_handler() is done, the kprobes infrastructure
        single steps the original instruction (in this case just a top), and
        then calls trampoline_post_handler().  trampoline_post_handler() then
        looks up the instance again, puts the instance back on the free list,
        and then makes a long jump back to the original return instruction.
      
      So to recap, to instrument the exit path of a function this implementation
      will cause four interruptions:
      
        - A breakpoint at the very beginning of the function allowing us to
          switch out the return address
        - A single step interruption to execute the original instruction that
          we replaced with the break instruction (normal kprobe flow)
        - A breakpoint in the trampoline function where our instrumented function
          returned to
        - A single step interruption to execute the original instruction that
          we replaced with the break instruction (normal kprobe flow)
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      73649dab
    • V
      [PATCH] xen: x86_64: use more usermode macro · 76381fee
      Vincent Hanquez 提交于
      Make use of the user_mode macro where it's possible.  This is useful for Xen
      because it will need only to redefine only the macro to a hypervisor call.
      Signed-off-by: NVincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
      Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      76381fee
    • V
      [PATCH] xen: x86_64: Add macro for debugreg · e9129e56
      Vincent Hanquez 提交于
      Add 2 macros to set and get debugreg on x86_64.  This is useful for Xen
      because it will need only to redefine each macro to a hypervisor call.
      Signed-off-by: NVincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
      Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e9129e56
    • N
      [PATCH] x86_64: avoid wasting IRQs · 701067c4
      Natalie Protasevich 提交于
      I suggest to change the way IRQs are handed out to PCI devices.
      
      Currently, each I/O APIC pin gets associated with an IRQ, no matter if the
      pin is used or not.  It is expected that each pin can potentually be
      engaged by a device inserted into the corresponding PCI slot.  However,
      this imposes severe limitation on systems that have designs that employ
      many I/O APICs, only utilizing couple lines of each, such as P64H2 chipset.
      
      It is used in ES7000, and currently, there is no way to boot the system
      with more that 9 I/O APICs.
      
      The simple change below allows to boot a system with say 64 (or more) I/O
      APICs, each providing 1 slot, which otherwise impossible because of the IRQ
      gaps created for unused lines on each I/O APIC.  It does not resolve the
      problem with number of devices that exceeds number of possible IRQs, but
      eases up a tension for IRQs on any large system with potentually large
      number of devices.
      
      I only implemented this for the ACPI boot, since if the system is this big
      and using newer chipsets it is probably (better be!) an ACPI based system
      :).  The change is completely "mechanical" and does not alter any internal
      structures or interrupt model/implementation.  The patch works for both
      i386 and x86_64 archs.  It works with MSIs just fine, and should not
      intervene with implementations like shared vectors, when they get worked
      out and incorporated.
      
      To illustrate, below is the interrupt distribution for 2-cell ES7000 with
      20 I/O APICs, and an Ethernet card in the last slot, which should be eth1
      and which was not configured because its IRQ exceeded allowable number (it
      actially turned out huge - 480!):
      
      zorro-tb2:~ # cat /proc/interrupts
                 CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
        0:      65716      30012      30007      30002      30009      30010      30010      30010    IO-APIC-edge  timer
        4:        373          0        725        280          0          0          0          0    IO-APIC-edge  serial
        8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
        9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
       14:         39          3          0          0          0          0          0          0    IO-APIC-edge  ide0
       16:        108         13          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
       18:          0          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb3
       19:         15          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
       23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
       96:       4240        397         18          0          0          0          0          0   IO-APIC-level  aic7xxx
       97:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx
      192:        847          0          0          0          0          0          0          0   IO-APIC-level  eth0
      NMI:          0          0          0          0          0          0          0          0
      LOC:     273423     274528     272829     274228     274092     273761     273827     273694
      ERR:          7
      MIS:          0
      
      Even though the system doesn't have that many devices, some don't get
      enabled only because of IRQ numbering model.
      
      This is the IRQ picture after the patch was applied:
      
      zorro-tb2:~ # cat /proc/interrupts
                 CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
        0:      44169      10004      10004      10001      10004      10003      10004       6135    IO-APIC-edge  timer
        4:        345          0          0          0          0        244          0          0    IO-APIC-edge  serial
        8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
        9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
       14:         39          0          3          0          0          0          0          0    IO-APIC-edge  ide0
       17:       4425          0          9          0          0          0          0          0   IO-APIC-level  aic7xxx
       18:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx, uhci_hcd:usb3
       21:        231          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
       22:         26          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
       23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
       24:        348          0          0          0          0          0          0          0   IO-APIC-level  eth0
       25:          6        192          0          0          0          0          0          0   IO-APIC-level  eth1
      NMI:          0          0          0          0          0          0          0          0
      LOC:     107981     107636     108899     108698     108489     108326     108331     108254
      ERR:          7
      MIS:          0
      
      Not only we see the card in the last I/O APIC, but we are not even close to
      using up available IRQs, since we didn't waste any.
      Signed-off-by: NNatalie Protasevich <Natalie.Protasevich@unisys.com>
      Acked-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      701067c4
    • R
      [PATCH] x86_64: never block forced SIGSEGV · 0928d6ef
      Roland McGrath 提交于
      This is the x86_64 version of the signal fix I just posted for i386.
      
      This problem was first noticed on PPC and has already been fixed there.
      But the exact same issue applies to other platforms in the same way.  The
      signal blocking for sa_mask and the handled signal takes place after the
      handler setup.  When the stack is bogus, the handler setup forces a
      SIGSEGV.  But then this will be blocked, and returning to user mode will
      fault again and iterate.  This patch fixes the problem by checking whether
      signal handler setup failed, and not doing the signal-blocking if so.  This
      copies what was done in the ppc code.  I think all architectures' signal
      handler setup code follows this pattern and needs the change.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0928d6ef
    • J
      [PATCH] x86_64: fix hpet for systems that don't support legacy replacement · a3a00751
      john stultz 提交于
      Currently the x86-64 HPET code assumes the entire HPET implementation from
      the spec is present.  This breaks on boxes that do not implement the
      optional legacy timer replacement functionality portion of the spec.
      
      This patch fixes this issue, allowing x86-64 systems that cannot use the
      HPET for the timer interrupt and RTC to still use the HPET as a time
      source.  I've tested this patch on a system systems without HPET, with HPET
      but without legacy timer replacement, as well as HPET with legacy timer
      replacement.
      
      This version adds a minor check to cap the HPET counter value in
      gettimeoffset_hpet to avoid possible time inconsistencies.  Please ignore
      the A2 version I sent to you earlier.
      Acked-by: NAndi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      a3a00751
    • A
      [PATCH] x86_64: i8259.c iso99 structure initialization · c0a88c98
      Alexander Nyberg 提交于
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c0a88c98
    • J
      [PATCH] allow early printk to use more than 25 lines · 799d19f6
      Jan Beulich 提交于
      Allow early printk code to take advantage of the full size of the screen, not
      just the first 25 lines.
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Acked-by: NAndi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      799d19f6
    • C
      [PATCH] x86/x86_64: pcibus_to_node · 8c5a0908
      Christoph Lameter 提交于
      Define pcibus_to_node to be able to figure out which NUMA node contains a
      given PCI device.  This defines pcibus_to_node(bus) in
      include/linux/topology.h and adjusts the macros for i386 and x86_64 that
      already provided a way to determine the cpumask of a pci device.
      
      x86_64 was changed to not build an array of cpumasks anymore.  Instead an
      array of nodes is build which can be used to generate the cpumask via
      node_to_cpumask.
      Signed-off-by: NChristoph Lameter <christoph@lameter.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      8c5a0908
    • M
      [PATCH] add x86-64 specific support for sparsemem · bbfceef4
      Matt Tolentino 提交于
      This patch adds in the necessary support for sparsemem such that x86-64
      kernels may use sparsemem as an alternative to discontigmem for NUMA
      kernels.  Note that this does no preclude one from continuing to build NUMA
      kernels using discontigmem, but merely allows the option to build NUMA
      kernels with sparsemem.
      
      Interestingly, the use of sparsemem in lieu of discontigmem in NUMA kernels
      results in reduced text size for otherwise equivalent kernels as shown in
      the example builds below:
      
         text	   data	    bss	    dec	    hex	filename
      2371036	 765884	1237108	4374028	 42be0c	vmlinux.discontig
      2366549	 776484	1302772	4445805	 43d66d	vmlinux.sparse
      Signed-off-by: NMatt Tolentino <matthew.e.tolentino@intel.com>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      bbfceef4
    • M
      [PATCH] reorganize x86-64 NUMA and DISCONTIGMEM config options · 2b97690f
      Matt Tolentino 提交于
      In order to use the alternative sparsemem implmentation for NUMA kernels,
      we need to reorganize the config options.  This patch effectively abstracts
      out the CONFIG_DISCONTIGMEM options to CONFIG_NUMA in most cases.  Thus,
      the discontigmem implementation may be employed as always, but the
      sparsemem implementation may be used alternatively.
      Signed-off-by: NMatt Tolentino <matthew.e.tolentino@intel.com>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      2b97690f
    • M
      [PATCH] remove direct ref to contig_page_data for x86-64 · 07332663
      Matt Tolentino 提交于
      This patch pulls out all remaining direct references to contig_page_data
      from arch/x86-64, thus saving an ifdef in one case.
      Signed-off-by: NMatt Tolentino <matthew.e.tolentino@intel.com>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      07332663