1. 09 May, 2007 1 commit
    • [POWERPC] Introduce address space "slices" · d0f13e3c
      Benjamin Herrenschmidt committed
      The basic issue is to be able to do what hugetlbfs does but with
      different page sizes for some other special filesystems; more
      specifically, my need is:
      
       - Huge pages
      
       - SPE local store mappings using 64K pages on a 4K base page size
      kernel on Cell
      
       - Some special 4K segments in 64K-page kernels for mapping a dodgy
      type of powerpc-specific infiniband hardware that requires 4K MMU
      mappings for various reasons I won't explain here.
      
      The main issues are:
      
       - To maintain/keep track of the page size per "segment" (as we can
      only have one page size per segment on powerpc, which are 256MB
      divisions of the address space).
      
       - To make sure special mappings stay within their allotted
      "segments" (including MAP_FIXED crap)
      
       - To make sure everybody else doesn't mmap/brk/grow_stack into a
      "segment" that is used for a special mapping
      
      Some of the necessary mechanisms to handle that were present in the
      hugetlbfs code, but mostly in ways not suitable for anything else.
      
      The patch relies on some changes to the generic get_unmapped_area()
      that just got merged.  It still hijacks hugetlb callbacks here or
      there as the generic code hasn't been entirely cleaned up yet but
      that shouldn't be a problem.
      
      So what is a slice?  Well, I re-used the mechanism used formerly by our
      hugetlbfs implementation which divides the address space into
      "meta-segments" which I called "slices".  The division is done using
      256MB slices below 4G, and 1T slices above.  Thus the address space is
      divided currently into 16 "low" slices and 16 "high" slices.  (Special
      case: high slice 0 is the area between 4G and 1T).
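
A minimal sketch of the division described above, written as plain userspace C rather than kernel code. The macro, struct, and function names are illustrative assumptions and not necessarily the identifiers used by the patch; only the 256MB/4G/1T boundaries and the count of 16 high slices come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants: names are assumptions, values follow the text. */
#define SLICE_LOW_SHIFT   28             /* 256MB low slices */
#define SLICE_HIGH_SHIFT  40             /* 1T high slices */
#define SLICE_LOW_TOP     (1ULL << 32)   /* low slices cover [0, 4G) */
#define SLICE_NUM_HIGH    16

struct slice_id {
    int      is_high;  /* 0: low slice, 1: high slice */
    unsigned index;    /* 0..15 within its group */
};

/* Map an address to the slice that contains it. */
static struct slice_id slice_of(uint64_t addr)
{
    struct slice_id id;

    /* 16 high slices of 1T each bound the covered address space at 16T. */
    assert(addr < ((uint64_t)SLICE_NUM_HIGH << SLICE_HIGH_SHIFT));

    if (addr < SLICE_LOW_TOP) {
        id.is_high = 0;
        id.index = (unsigned)(addr >> SLICE_LOW_SHIFT);
    } else {
        /* Addresses in [4G, 1T) shift down to 0: the special high slice 0. */
        id.is_high = 1;
        id.index = (unsigned)(addr >> SLICE_HIGH_SHIFT);
    }
    return id;
}
```

Under these assumptions, an address just above 4G and an address just below 1T both land in high slice 0, which is the special case mentioned above.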
      
      Doing so simplifies significantly the tracking of segments and avoids
      having to keep track of all the 256MB segments in the address space.
      
      While I used the "concepts" of hugetlbfs, I mostly re-implemented
      everything in a more generic way and "ported" hugetlbfs to it.
      
      Slices can have an associated page size, which is encoded in the mmu
      context and used by the SLB miss handler to set the segment sizes.  The
      hash code currently doesn't care, it has a specific check for hugepages,
      though I might add a mechanism to provide per-slice hash mapping
      functions in the future.
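
One way to picture a per-slice page size is as a small index packed into a machine word, one field per slice. The sketch below assumes 4 bits per slice and 16 slices per 64-bit word; that layout and all names are assumptions made for illustration, since the message only says the page size is encoded in the mmu context.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 16 slices per word, one 4-bit page-size index each. */
#define SLICES_PER_WORD  16
#define PSIZE_BITS       4
#define PSIZE_MASK       0xfULL

/* Read the page-size index stored for one slice. */
static unsigned slice_psize_get(uint64_t word, unsigned slice)
{
    assert(slice < SLICES_PER_WORD);
    return (unsigned)((word >> (slice * PSIZE_BITS)) & PSIZE_MASK);
}

/* Return a copy of 'word' with the page-size index of one slice replaced. */
static uint64_t slice_psize_set(uint64_t word, unsigned slice, unsigned psize)
{
    unsigned shift;

    assert(slice < SLICES_PER_WORD);
    assert(psize <= PSIZE_MASK);

    shift = slice * PSIZE_BITS;
    word &= ~(PSIZE_MASK << shift);      /* clear the old field */
    word |= (uint64_t)psize << shift;    /* store the new index */
    return word;
}
```

With such a layout, a miss handler would only need a shift and a mask to find the page size for the slice it is faulting on.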
      
      The slice code provides a pair of "generic" get_unmapped_area() (bottomup
      and topdown) functions that should work with any slice size.  There is
      some trickiness here so I would appreciate it if people had a look at the
      implementation of these and let me know if I got something wrong.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  2. 30 April, 2007 1 commit
  3. 24 April, 2007 4 commits
  4. 10 March, 2007 1 commit
    • [POWERPC] Fix spu SLB invalidations · 94b2a439
      Benjamin Herrenschmidt committed
      The SPU code doesn't properly invalidate SPU SLBs when necessary,
      for example when changing a segment size from the hugetlbfs code. In
      addition, it saves and restores the SLB content on context switches
      which makes it harder to properly handle those invalidations.
      
      This patch removes the saving & restoring for now, something more
      efficient might be found later on. It also adds a spu_flush_all_slbs(mm)
      that can be used by the core mm code to flush the SLBs of all SPEs that
      are running a given mm at the time of the flush.
      
      In order to do that, it adds a spinlock to the list of all SPEs and moves
      some bits & pieces from spufs to spu_base.c.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  5. 24 January, 2007 1 commit
  6. 04 December, 2006 4 commits
  7. 10 November, 2006 1 commit
  8. 25 October, 2006 5 commits
  9. 16 October, 2006 1 commit
  10. 07 October, 2006 1 commit
  11. 05 October, 2006 4 commits
    • [POWERPC] spufs: support new OF device tree format · 7650f2f2
      Arnd Bergmann committed
      The properties we used traditionally in the device tree are somewhat
      nonstandard.  This adds support for a more conventional format using
      'interrupts' and 'reg' properties.
      
      The interrupts are specified in three cells (class 0, 1 and 2) and
      registered at the interrupt-parent.
      
      The reg property contains either three or four register areas in the
      order 'local-store', 'problem', 'priv2', and 'priv1', so the priv1 one
      can be left out in case of hypervisor driven systems that access these
      through hcalls.
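
A small sketch of how the fixed ordering lets a reader of the 'reg' property name each area from its position alone. The table and helper below are hypothetical illustrations, not code from spufs; only the four area names, their order, and the "three or four areas" rule come from the text.

```c
#include <stddef.h>

/* Area names in the order given by the commit message. */
static const char *const spu_reg_order[] = {
    "local-store", "problem", "priv2", "priv1",
};

/*
 * Return the name of the i-th register area of a 'reg' property holding
 * n_areas entries, or NULL if the layout or index is not valid.
 * Only 3 (priv1 omitted, e.g. under a hypervisor) or 4 areas are accepted.
 */
static const char *spu_reg_area_name(size_t n_areas, size_t i)
{
    if (n_areas != 3 && n_areas != 4)
        return NULL;
    if (i >= n_areas)
        return NULL;
    return spu_reg_order[i];
}
```

Because 'priv1' comes last, dropping it leaves the positions of the other three areas unchanged.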
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: remove support for ancient firmware · 772920e5
      Arnd Bergmann committed
      Any firmware that still uses the 'spc' nodes already
      stopped running for other reasons, so let's get rid of this.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: implement error event delivery to user space · 9add11da
      Arnd Bergmann committed
      This tries to fix spufs so we have an interface closer to what is
      specified in the man page for events returned in the third argument of
      spu_run.
      
      Fortunately, libspe has never been using the returned contents of that
      register, as they were the same as the return code of spu_run (duh!).
      
      Unlike the specification that we never implemented correctly, we now
      require a SPU_CREATE_EVENTS_ENABLED flag passed to spu_create, in
      order to get the new behavior. When this flag is not passed, spu_run
      will simply ignore the third argument now.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: scheduler support for NUMA. · a68cf983
      Mark Nutter committed
      This patch adds NUMA support to the spufs scheduler.
      
      The new arch/powerpc/platforms/cell/spufs/sched.c is greatly
      simplified, in an attempt to reduce complexity while adding
      support for NUMA scheduler domains.  SPUs are allocated starting
      from the calling thread's node, moving to others as supported by
      current->cpus_allowed.  Preemption is gone as it was buggy, but
      should be re-enabled in another patch when stable.
      
      The new arch/powerpc/platforms/cell/spu_base.c maintains idle
      lists on a per-node basis, and allows the caller to specify which
      node(s) an SPU should be allocated from, while passing -1 tells
      spu_alloc() that any node is allowed.
      
      Since the patch removes the currently implemented preemptive
      scheduling, it is technically a regression, but practically
      all users have since migrated to this version, as it is
      part of the IBM SDK and the yellowdog distribution, so there
      is not much point holding it back while the new preemptive
      scheduling patch gets delayed further.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  12. 04 October, 2006 1 commit
    • [POWERPC] Cell interrupt rework · 2e194583
      Benjamin Herrenschmidt committed
      This patch reworks the cell iic interrupt handling so that:
      
       - Node ID is back in the interrupt number (only one IRQ host is created
      for all nodes). This allows interrupts from sources on another node to
      be routed non-locally. This may one day make it possible to fix maxcpus=1
      or 2 and still get interrupts from devices on BE 1 (a bit more fixing
      is needed for that), and it will allow us to implement actual affinity
      control of external interrupts.
      
       - Added handling of the IO exceptions interrupts (badly named, but I
      re-used the name initially used by STI). Those are the interrupts
      exposed by IIC_ISR and IIC_IRR, such as the IOC translation exception,
      performance monitor, etc... Those get their special numbers in the IRQ
      number space and are internally implemented as a cascade on unit 0xe,
      class 1 of each node.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  13. 26 September, 2006 1 commit
  14. 31 July, 2006 1 commit
  15. 11 July, 2006 1 commit
    • [PATCH] powerpc: fix trigger handling in the new irq code · 6e99e458
      Benjamin Herrenschmidt committed
      This patch slightly reworks the new irq code to fix a small design error.  I
      removed the passing of the trigger to the map() calls entirely, it was not a
      good idea to have one call do two different things.  It also fixes a couple of
      corner cases.
      
      Mapping a linux virtual irq to a physical irq now does only that.  Setting the
      trigger is a different action which has a different call.
      
      The main changes are:
      
      - I no longer call host->ops->map() for an already mapped irq, I just return
        the virtual number that was already mapped.  It was called before to give an
        opportunity to change the trigger, but that was causing issues as that could
        happen while the interrupt was in use by a device, and because of the
        trigger change, map would potentially muck around with things in a racy way.
        That put a heavy burden on a given controller's implementation of
        map() to get it right.  This is much simpler now.  map() is only called on
        the initial mapping of an irq, meaning that you know that this irq is _not_
        being used.  You can initialize the hardware if you want (though you don't
        have to).
      
      - Controllers that can handle different types of triggers (level/edge/etc...)
        now implement the standard irq_chip->set_type() call as defined by the
        generic code.  That means that you can use the standard set_irq_type() to
        configure an irq line manually if you wish or (though I don't like that
        interface), pass explicit trigger flags to request_irq() as defined by the
        generic kernel interfaces.  Also, using those interfaces guarantees that
        your controller set_type callback is called with the descriptor lock held,
        thus providing locking against activity on the same interrupt (including
        mask/unmask/etc...) automatically.  A result is that, for example, MPIC's
        own map() implementation calls irq_set_type(NONE) to configure the hardware
        to the default triggers.
      
      - To allow the above, the irq_map array entry for the new mapped interrupt
        is now set before map() callback is called for the controller.
      
      - The irq_create_of_mapping() (also used by irq_of_parse_and_map()) function
        for mapping interrupts from the device-tree now also calls the separate
        set_irq_type(), and only does so if there is a change in the trigger type.
      
      - While I was at it, I changed pci_read_irq_line() (which is the helper I
        would expect most archs to use in their pcibios_fixup() to get the PCI
        interrupt routing from the device tree) to also handle a fallback when the
        DT mapping fails consisting of reading the PCI_INTERRUPT_PIN to know whether
        the device has an interrupt at all, and then the PCI_INTERRUPT_LINE to get an
        interrupt number from the device.  That number is then mapped using the
        default controller, and the trigger is set to level low.  That default
        behaviour works for several platforms that don't have a proper interrupt
        tree like Pegasos.  If it doesn't work for your platform, then either
        provide a proper interrupt tree from the firmware so that fallback isn't
        needed, or don't call pci_read_irq_line().
      
      - Add back a bit that got dropped by my main rework patch for properly
        clearing pending IPIs on pSeries when using a kexec.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
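
The separation this commit message describes, where mapping only maps and setting the trigger is a distinct call, can be modelled with a toy in plain C. Everything below (the `toy_` names, the table size, the trigger enum) is a hypothetical illustration and not the kernel's API. It only mirrors three behaviours stated in the message: an already mapped irq is returned without calling map() again, the map entry is recorded before map() runs, and the trigger is changed through its own call.

```c
#include <assert.h>

#define TOY_NR_VIRQS 16
#define TOY_NO_IRQ   0   /* illustrative "no irq" value */

enum toy_trigger { TOY_TRIGGER_NONE, TOY_TRIGGER_EDGE, TOY_TRIGGER_LEVEL_LOW };

struct toy_host {
    int              used[TOY_NR_VIRQS];     /* is this virq mapped? */
    unsigned         hwirq[TOY_NR_VIRQS];    /* virq -> hardware irq */
    enum toy_trigger trigger[TOY_NR_VIRQS];
    int              map_calls;              /* how often map() ran */
};

/* Controller callback: runs once per irq, on the initial mapping only. */
static void toy_map(struct toy_host *h, unsigned virq)
{
    assert(h->used[virq]);   /* the table entry is set before map() runs */
    h->map_calls++;
    h->trigger[virq] = TOY_TRIGGER_NONE;   /* start from a default trigger */
}

/* Map a hardware irq to a virtual irq; never touches an existing mapping. */
static unsigned toy_create_mapping(struct toy_host *h, unsigned hwirq)
{
    unsigned virq;

    /* Already mapped: return the existing number without calling map(). */
    for (virq = 1; virq < TOY_NR_VIRQS; virq++)
        if (h->used[virq] && h->hwirq[virq] == hwirq)
            return virq;

    /* First mapping: record the entry, then let the controller set up. */
    for (virq = 1; virq < TOY_NR_VIRQS; virq++) {
        if (!h->used[virq]) {
            h->used[virq] = 1;
            h->hwirq[virq] = hwirq;
            toy_map(h, virq);
            return virq;
        }
    }
    return TOY_NO_IRQ;
}

/* Changing the trigger is a separate action from mapping. */
static void toy_set_type(struct toy_host *h, unsigned virq, enum toy_trigger t)
{
    assert(virq > 0 && virq < TOY_NR_VIRQS && h->used[virq]);
    h->trigger[virq] = t;
}
```

In this model, mapping the same hardware irq twice yields the same virtual number and leaves any trigger set in between untouched, which is the property the rework is after.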
  16. 03 July, 2006 2 commits
    • [POWERPC] Add new interrupt mapping core and change platforms to use it · 0ebfff14
      Benjamin Herrenschmidt committed
      This adds the new irq remapper core and removes the old one.  Because
      there are some fundamental conflicts with the old code, like the value
      of NO_IRQ which I'm now setting to 0 (as per discussions with Linus),
      etc..., this commit also changes the relevant platform and driver code
      over to use the new remapper (so as not to cause difficulties later
      in bisecting).
      
      This patch removes the old pre-parsing of the open firmware interrupt
      tree along with all the bogus assumptions it made to try to renumber
      interrupts according to the platform. This is all to be handled by the
      new code now.
      
      For the pSeries XICS interrupt controller, a single remapper host is
      created for the whole machine regardless of how many interrupt
      presentation and source controllers are found, and it's set to match
      any device node that isn't an 8259.  That works fine on pSeries and
      avoids having to deal with some of the complexities of split source
      controllers vs. presentation controllers in the pSeries device trees.
      
      The powerpc i8259 PIC driver now always requests the legacy interrupt
      range. It also has the feature of being able to match any device node
      (including NULL) if passed no device node as an input. That will help
      porting over platforms with broken device-trees like Pegasos that don't
      have a proper interrupt tree.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [PATCH] irq-flags: POWERPC: Use the new IRQF_ constants · 6714465e
      Thomas Gleixner committed
      Use the new IRQF_ constants and remove the SA_INTERRUPT define
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  17. 28 June, 2006 1 commit
  18. 21 June, 2006 7 commits
  19. 02 May, 2006 2 commits