1. 05 Oct, 2006 13 commits
    • [POWERPC] cell: fix bugs found by sparse · 43b4f406
      Arnd Bergmann committed
      - Some long constants should be marked 'ul'.
      - When using desc->handler_data to pass an __iomem
        register area, we need to add casts to and from
        __iomem.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
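The two points in the message above can be illustrated with a small user-space sketch. It is not the code from the patch: the struct and helper names are hypothetical, and `__iomem` and `__force` are defined as empty macros (which is what they expand to when sparse is not running) so that the fragment compiles outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel's sparse annotations (assumption: empty here). */
#define __iomem
#define __force

/* A constant marked 'ul' has type unsigned long rather than unsigned int. */
#define SKETCH_STATUS_BIT 0x80000000ul

/* Hypothetical, cut-down descriptor holding only the field discussed above. */
struct irq_desc_sketch {
    void *handler_data;
};

/* Store an __iomem register pointer in the untyped slot: cast away __iomem. */
void sketch_set_regs(struct irq_desc_sketch *desc, uint32_t __iomem *regs)
{
    assert(desc != NULL);
    desc->handler_data = (void __force *)regs;
}

/* Fetch it again: cast back to an __iomem pointer. */
uint32_t __iomem *sketch_get_regs(const struct irq_desc_sketch *desc)
{
    assert(desc != NULL);
    return (uint32_t __iomem __force *)desc->handler_data;
}
```

With sparse enabled, the explicit casts document that the address-space change is intentional in both directions.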
    • [POWERPC] spiderpic: enable new style devtree support · f7e2ce78
      Arnd Bergmann committed
      This enables support for new firmware test releases.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: add infrastructure for finding elf objects · 86767277
      Arnd Bergmann committed
      This adds an 'object-id' file that the spe library can
      use to store a pointer to its ELF object. This was
      originally meant for use by oprofile, but is now
      also used by the GNU debugger, if available.
      
      In order for oprofile to find the location in an spu-elf
      binary where an event counter triggered, we need a way
      to identify the binary in the first place.
      
      Unfortunately, that binary itself can be embedded in a
      powerpc ELF binary. Since we can assume it is mapped into
      the effective address space of the running process,
      have that one write the pointer value into a new spufs
      file.
      
      When a context switch occurs, pass the user value to
      the profiler so that it can look at the mapped file (with
      some care).
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: support new OF device tree format · 7650f2f2
      Arnd Bergmann committed
      The properties we used traditionally in the device tree are somewhat
      nonstandard.  This adds support for a more conventional format using
      'interrupts' and 'reg' properties.
      
      The interrupts are specified in three cells (class 0, 1 and 2) and
      registered at the interrupt-parent.
      
      The reg property contains either three or four register areas in the
      order 'local-store', 'problem', 'priv2', and 'priv1', so the priv1 one
      can be left out in case of hypervisor driven systems that access these
      through hcalls.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
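The ordering of the 'reg' areas described above can be sketched as a lookup table. The enum, table, and function names are hypothetical and only model the ordering rule (three or four areas, with priv1 last and optional); this is not the parsing code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Register areas in the order given for the 'reg' property. */
enum spu_reg_area_sketch {
    SPU_AREA_LOCAL_STORE,
    SPU_AREA_PROBLEM,
    SPU_AREA_PRIV2,
    SPU_AREA_PRIV1,   /* absent on hypervisor-driven systems */
    SPU_AREA_MAX
};

static const char *const spu_area_names[SPU_AREA_MAX] = {
    "local-store", "problem", "priv2", "priv1"
};

/* Name of the area at 'index' for a node with 'n_areas' entries,
 * or NULL if the count is invalid or the area is not present. */
const char *spu_area_name_sketch(int n_areas, int index)
{
    if (n_areas != 3 && n_areas != 4)
        return NULL;
    if (index < 0 || index >= n_areas)
        return NULL;
    return spu_area_names[index];
}

/* With only three areas, priv1 must be reached some other way (hcalls). */
int spu_has_priv1_sketch(int n_areas)
{
    assert(n_areas == 3 || n_areas == 4);
    return n_areas == 4;
}
```

Putting priv1 last is what allows it to be dropped without renumbering the other three areas.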
    • [POWERPC] spufs: add support for read/write on cntl · e1dbff2b
      Arnd Bergmann committed
      Writing to cntl can be used to stop execution on the
      spu and to restart it; reading from cntl gives the
      contents of the current status register.
      
      The access is always in ASCII, as for most other files.
      
      This was always meant to be there, but we had a little
      problem with writing to runctl so it was left out so
      far.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: remove support for ancient firmware · 772920e5
      Arnd Bergmann committed
      Any firmware that still uses the 'spc' nodes already
      stopped running for other reasons, so let's get rid of this.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: make mailbox functions handle multiple elements · cdcc89bb
      Arnd Bergmann committed
      Since libspe2 will provide a function that can read/write
      multiple mailbox elements at once, the kernel should handle
      that efficiently.
      
      read/write on the three mailbox files can now access the
      spe context multiple times to operate on any number of
      mailbox data elements.
      
      If the spu application keeps writing to its outbound
      mailbox, the read call will pick up all the data in a
      single system call.
      
      Unfortunately, if the user passes an invalid pointer,
      we may lose a mailbox element on read, since we can't
      put it back. This is probably impossible to solve if the
      user also accesses the mailbox through direct register
      access.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
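The multi-element read and the lost-element caveat described above can be modelled in user space as follows. This is a sketch under assumptions, not the spufs implementation: the queue depth, the struct, and `copy_out_sketch()` (a stand-in for a user copy that fails on a bad pointer) are all hypothetical, and a real kernel would return an error code such as -EFAULT rather than -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBOX_DEPTH_SKETCH 4   /* hypothetical queue depth */

struct mbox_sketch {
    uint32_t data[MBOX_DEPTH_SKETCH];
    size_t head;
    size_t count;
};

/* Take one element out of the mailbox; returns 0 when it is empty. */
static int mbox_pop_sketch(struct mbox_sketch *mb, uint32_t *out)
{
    if (mb->count == 0)
        return 0;
    *out = mb->data[mb->head];
    mb->head = (mb->head + 1) % MBOX_DEPTH_SKETCH;
    mb->count--;
    return 1;
}

/* Stand-in for a copy to user space: a NULL destination models a bad pointer. */
static int copy_out_sketch(uint32_t *dst, uint32_t val)
{
    if (dst == NULL)
        return -1;
    *dst = val;
    return 0;
}

/* Read as many whole elements as fit into 'len' bytes in a single call.
 * Returns the number of bytes copied, or -1 if the very first copy fails. */
long mbox_read_many_sketch(struct mbox_sketch *mb, uint32_t *buf, size_t len)
{
    size_t done = 0;
    uint32_t val;

    assert(mb != NULL);
    while (done + sizeof(val) <= len && mbox_pop_sketch(mb, &val)) {
        uint32_t *dst = buf ? buf + done / sizeof(val) : NULL;

        if (copy_out_sketch(dst, val) != 0) {
            /* The element was already popped and cannot be put back. */
            return done ? (long)done : -1;
        }
        done += sizeof(val);
    }
    return (long)done;
}
```

The pop happens before the copy, which is why a failed copy loses exactly one element in this model.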
    • [POWERPC] spufs: use correct pg_prot for mapping SPU local store · ac91cb8d
      Arnd Bergmann committed
      This hopefully fixes a long-standing bug in the spu file system.
      An spu context comes with local memory that can be either saved
      in kernel pages or point directly to a physical SPE.
      
      When mapping the physical SPE, that mapping needs to be cache-inhibited.
      For simplicity, we used to map the kernel backing memory that way
      too, but unfortunately that was not only inefficient, but also incorrect
      because the same page could then be accessed simultaneously through
      a cacheable and a cache-inhibited mapping, which is not allowed
      by the powerpc specification and in our case caused data inconsistency
      for which we did a really ugly workaround in user space.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: Add infrastructure needed for gang scheduling · 6263203e
      Arnd Bergmann committed
      Add the concept of a gang to spufs as a new type of object.
      So far, this has no impact whatsoever on scheduling, but makes
      it possible to add that later.
      
      A new type of object in spufs is now a spu_gang. It is created
      with the spu_create system call with the flags argument set
      to SPU_CREATE_GANG (0x2). Inside of a spu_gang, it
      is then possible to create spu_context objects, which until
      now was only possible at the root of spufs.
      
      There is a new member in struct spu_context pointing to
      the spu_gang it belongs to, if any. The spu_gang maintains
      a list of spu_context structures that are its children.
      This information can then be used in the scheduler in the
      future.
      
      There is still a bug that needs to be resolved in this
      basic infrastructure regarding the order in which objects
      are removed. When the spu_gang file descriptor is closed
      before the spu_context descriptors, we leak the dentry
      and inode for the gang. Any ideas how to cleanly solve
      this are appreciated.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
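The relationship between a gang and its contexts described above can be sketched with two small structs. The `_sketch` types and helpers are hypothetical and use a plain singly linked list rather than whatever list type the kernel code uses; only the flag value 0x2 comes from the message.

```c
#include <assert.h>
#include <stddef.h>

#define SPU_CREATE_GANG 0x2   /* flag value quoted in the commit message */

struct spu_gang_sketch;

struct spu_context_sketch {
    struct spu_gang_sketch *gang;      /* NULL for a context at the spufs root */
    struct spu_context_sketch *next;   /* link in the gang's list of children */
};

struct spu_gang_sketch {
    struct spu_context_sketch *contexts;
    int n_contexts;
};

/* Does this spu_create flags word ask for a gang rather than a context? */
int flags_request_gang_sketch(unsigned int flags)
{
    return (flags & SPU_CREATE_GANG) != 0;
}

/* Make 'ctx' a child of 'gang'. */
void gang_add_ctx_sketch(struct spu_gang_sketch *gang,
                         struct spu_context_sketch *ctx)
{
    assert(gang != NULL && ctx != NULL);
    ctx->gang = gang;
    ctx->next = gang->contexts;
    gang->contexts = ctx;
    gang->n_contexts++;
}

/* Unlink 'ctx' from 'gang'; does nothing if it is not a member. */
void gang_remove_ctx_sketch(struct spu_gang_sketch *gang,
                            struct spu_context_sketch *ctx)
{
    struct spu_context_sketch **p;

    assert(gang != NULL && ctx != NULL);
    for (p = &gang->contexts; *p != NULL; p = &(*p)->next) {
        if (*p == ctx) {
            *p = ctx->next;
            ctx->next = NULL;
            ctx->gang = NULL;
            gang->n_contexts--;
            return;
        }
    }
}
```

A scheduler could later walk `contexts` to treat the members of one gang as a unit, which is the future use the message mentions.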
    • [POWERPC] spufs: implement error event delivery to user space · 9add11da
      Arnd Bergmann committed
      This tries to fix spufs so we have an interface closer to what is
      specified in the man page for events returned in the third argument of
      spu_run.
      
      Fortunately, libspe has never been using the returned contents of that
      register, as they were the same as the return code of spu_run (duh!).
      
      Unlike the specification that we never implemented correctly, we now
      require a SPU_CREATE_EVENTS_ENABLED flag passed to spu_create, in
      order to get the new behavior. When this flag is not passed, spu_run
      will simply ignore the third argument now.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: fix context switch during page fault · 28347bce
      HyeonSeung Jang committed
      For better explanation, I break down the page fault handling into steps:
      
      1) There is a page fault caused by DMA operation initiated by SPU and
      DMA is suspended.
      
      2) The interrupt handler 'spu_irq_class_1()/__spu_trap_data_map()' is
      called and it just wakes up the sleeping spe-manager thread.
      
      3) The PPE scheduler runs the corresponding bottom half,
      spu_irq_class_1_bottom(), in process context and the DMA is
      restarted.
      
      There can be a quite large time gap between 2) and 3) and I found
      the following problem:
      
      If the context becomes unbound between 2) and 3), step 3) is not executed,
      because by the time the spe-manager thread is woken up, the context is
      already saved. (This can happen, for example, when a high-priority spe
      thread is newly started in that time gap.)
      
      But the actual problem is that the corresponding SPU context does not
      work even if it is bound again to an SPU.
      
      Besides, I can see the following warning in the mambo simulator when the
      context becomes unbound (in save_mfc_cmd()), i.e. when unbind() is called
      for the context after step 2) and before step 3):
      
      'WARNING: 61392752237: SPE2: MFC_CMD_QUEUE channel count of 15 is
      inconsistent with number of available DMA queue entries of 16'
      
      After going through the available documents, I found that the problem is
      that the suspended DMA is not restarted when the context is bound again.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] spufs: scheduler support for NUMA. · a68cf983
      Mark Nutter committed
      This patch adds NUMA support to the spufs scheduler.
      
      The new arch/powerpc/platforms/cell/spufs/sched.c is greatly
      simplified, in an attempt to reduce complexity while adding
      support for NUMA scheduler domains.  SPUs are allocated starting
      from the calling thread's node, moving to others as supported by
      current->cpus_allowed.  Preemption is gone as it was buggy, but
      should be re-enabled in another patch when stable.
      
      The new arch/powerpc/platforms/cell/spu_base.c maintains idle
      lists on a per-node basis, and allows the caller to specify which
      node(s) an SPU should be allocated from; passing -1 tells
      spu_alloc() that any node is allowed.
      
      Since the patch removes the currently implemented preemptive
      scheduling, it is technically a regression, but practically
      all users have since migrated to this version, as it is
      part of the IBM SDK and the yellowdog distribution, so there
      is not much point holding it back while the new preemptive
      scheduling patch gets delayed further.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
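The per-node idle lists and the "-1 means any node" convention described above can be modelled roughly as below. This is a simplified sketch: the node count, the pool struct, and the function name are hypothetical, idle lists are reduced to counters, and the function returns the node an SPU was taken from rather than an SPU object.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NODES 2   /* hypothetical number of NUMA nodes */

/* Idle SPUs are tracked per node; a counter stands in for each idle list. */
struct spu_pool_sketch {
    int idle[SKETCH_NODES];
};

/* Take one idle SPU from 'node', or from any node when 'node' is -1.
 * Returns the node the SPU came from, or -1 if none was available. */
int spu_alloc_node_sketch(struct spu_pool_sketch *pool, int node)
{
    int n;

    assert(pool != NULL);
    if (node < -1 || node >= SKETCH_NODES)
        return -1;                      /* invalid node id */

    if (node >= 0) {
        if (pool->idle[node] == 0)
            return -1;                  /* requested node has no idle SPU */
        pool->idle[node]--;
        return node;
    }

    /* node == -1: any node is allowed, so take the first one with an idle SPU. */
    for (n = 0; n < SKETCH_NODES; n++) {
        if (pool->idle[n] > 0) {
            pool->idle[n]--;
            return n;
        }
    }
    return -1;
}
```

A scheduler that prefers the calling thread's node could try that node first and then fall back to the other permitted nodes.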
    • [POWERPC] spufs: cell spu problem state mapping updates · 27d5bf2a
      Benjamin Herrenschmidt committed
      This patch adds a new "psmap" file to spufs that allows mmap of all of
      the problem state mapping of SPEs. It is compatible with 64k pages. In
      addition, it removes mmap ability of individual files when using 64k
      pages, with the exception of signal1 and signal2 which will both map the
      entire 64k page holding both registers. It also removes
      CONFIG_SPUFS_MMAP as there is no point in not building mmap support in
      spufs.
      
      It goes along with a separate patch to libspe that implements use of the
      new file to access problem state registers.
      
      Another patch will follow up to fix races opened up by accessing
      the 'runcntl' register directly, which is made possible with this
      patch.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  2. 04 Oct, 2006 2 commits
    • Remove all inclusions of <linux/config.h> · 038b0a6d
      Dave Jones committed
      kbuild explicitly includes this at build time.
      Signed-off-by: Dave Jones <davej@redhat.com>
    • [POWERPC] Cell interrupt rework · 2e194583
      Benjamin Herrenschmidt committed
      This patch reworks the cell iic interrupt handling so that:
      
       - Node ID is back in the interrupt number (only one IRQ host is created
      for all nodes). This allows interrupts from sources on another node to
      be routed non-locally. It may one day make it possible to fix maxcpus=1
      or 2 and still get interrupts from devices on BE 1 (a bit more fixing
      is needed for that), and it will allow us to implement actual affinity
      control of external interrupts.
      
       - Added handling of the IO exceptions interrupts (badly named, but I
      re-used the name initially used by STI). Those are the interrupts
      exposed by IIC_ISR and IIC_IRR, such as the IOC translation exception,
      performance monitor, etc... Those get their special numbers in the IRQ
      number space and are internally implemented as a cascade on unit 0xe,
      class 1 of each node.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  3. 27 Sep, 2006 2 commits
    • [PATCH] inode-diet: Eliminate i_blksize from the inode structure · ba52de12
      Theodore Ts'o committed
      This eliminates the i_blksize field from struct inode.  Filesystems that want
      to provide a per-inode st_blksize can do so by providing their own getattr
      routine instead of using the generic_fillattr() function.
      
      Note that some filesystems were providing pretty much random (and incorrect)
      values for i_blksize.
      
      [bunk@stusta.de: cleanup]
      [akpm@osdl.org: generic_fillattr() fix]
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
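The idea of a filesystem supplying its own getattr routine to report a per-inode st_blksize can be sketched as below. Everything here is a user-space model with hypothetical names: the structs are cut down to one field each, the assumption that the generic helper derives the block size from a block-bits field is an illustration rather than a statement about the patch, and `MYFS_PREFERRED_IO` is an invented value.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-ins for the inode and stat structures. */
struct inode_sketch {
    unsigned int i_blkbits;   /* log2 of the block size */
};

struct kstat_sketch {
    unsigned long blksize;
};

#define MYFS_PREFERRED_IO 65536ul   /* invented preferred I/O size */

/* Generic helper (assumption): block size derived from the block bits. */
void generic_fillattr_sketch(const struct inode_sketch *inode,
                             struct kstat_sketch *stat)
{
    assert(inode != NULL && stat != NULL);
    stat->blksize = 1ul << inode->i_blkbits;
}

/* A filesystem that wants a different st_blksize starts from the generic
 * values and overrides the one field in its own getattr routine. */
int myfs_getattr_sketch(const struct inode_sketch *inode,
                        struct kstat_sketch *stat)
{
    generic_fillattr_sketch(inode, stat);
    stat->blksize = MYFS_PREFERRED_IO;
    return 0;
}
```

Filesystems that are happy with the generic value need no getattr routine of their own in this model.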
    • [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private · 8e18e294
      Theodore Ts'o committed
      The following patches reduce the size of the VFS inode structure by 28 bytes
      on a UP x86.  (It would be more on an x86_64 system).  This is a 10% reduction
      in the inode size on a UP kernel that is configured in a production mode
      (i.e., with no spinlock or other debugging functions enabled; if you want to
      save memory taken up by in-core inodes, the first thing you should do is
      disable the debugging options; they are responsible for a huge amount of bloat
      in the VFS inode structure).
      
      This patch:
      
      The filesystem or device-specific pointer in the inode is inside a union,
      which is pretty pointless given that all 30+ users of this field have been
      using the void pointer.  Get rid of the union and rename it to i_private, with
      a comment to explain who is allowed to use the void pointer.  This is just a
      cleanup, but it allows us to reuse the union 'u' for something where
      the union will actually be used.
      
      [judith@osdl.org: powerpc build fix]
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Judith Lebzelter <judith@osdl.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  4. 26 Sep, 2006 2 commits
  5. 25 Aug, 2006 3 commits
  6. 31 Jul, 2006 1 commit
  7. 11 Jul, 2006 1 commit
    • [PATCH] powerpc: fix trigger handling in the new irq code · 6e99e458
      Benjamin Herrenschmidt committed
      This patch slightly reworks the new irq code to fix a small design error.  I
      removed the passing of the trigger to the map() calls entirely; it was not a
      good idea to have one call do two different things.  It also fixes a couple of
      corner cases.
      
      Mapping a linux virtual irq to a physical irq now does only that.  Setting the
      trigger is a different action which has a different call.
      
      The main changes are:
      
      - I no longer call host->ops->map() for an already mapped irq, I just return
        the virtual number that was already mapped.  It was called before to give an
        opportunity to change the trigger, but that was causing issues as that could
        happen while the interrupt was in use by a device, and because of the
        trigger change, map would potentially muck around with things in a racy way.
        That put a heavy burden on a given controller's implementation of
        map() to get it right.  This is much simpler now.  map() is only called on
        the initial mapping of an irq, meaning that you know that this irq is _not_
        being used.  You can initialize the hardware if you want (though you don't
        have to).
      
      - Controllers that can handle different types of triggers (level/edge/etc...)
        now implement the standard irq_chip->set_type() call as defined by the
        generic code.  That means that you can use the standard set_irq_type() to
        configure an irq line manually if you wish or (though I don't like that
        interface) pass explicit trigger flags to request_irq() as defined by the
        generic kernel interfaces.  Also, using those interfaces guarantees that
        your controller set_type callback is called with the descriptor lock held,
        thus providing locking against activity on the same interrupt (including
        mask/unmask/etc...) automatically.  A result is that, for example, MPIC's
        own map() implementation calls irq_set_type(NONE) to configure the hardware
        to the default triggers.
      
      - To allow the above, the irq_map array entry for the new mapped interrupt
        is now set before map() callback is called for the controller.
      
      - The irq_create_of_mapping() function (also used by irq_of_parse_and_map())
        for mapping interrupts from the device-tree now also calls the separate
        set_irq_type(), and only does so if there is a change in the trigger type.
      
      - While I was at it, I changed pci_read_irq_line() (which is the helper I
        would expect most archs to use in their pcibios_fixup() to get the PCI
        interrupt routing from the device tree) to also handle a fallback when the
        DT mapping fails.  The fallback reads PCI_INTERRUPT_PIN to know whether
        the device has an interrupt at all, and then PCI_INTERRUPT_LINE to get an
        interrupt number from the device.  That number is then mapped using the
        default controller, and the trigger is set to level low.  That default
        behaviour works for several platforms that don't have a proper interrupt
        tree like Pegasos.  If it doesn't work for your platform, then either
        provide a proper interrupt tree from the firmware so that fallback isn't
        needed, or don't call pci_read_irq_line().
      
      - Add back a bit that got dropped by my main rework patch for properly
        clearing pending IPIs on pSeries when using a kexec
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
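The split between mapping and trigger setup described above can be modelled in a few lines. This is a toy user-space sketch with hypothetical `_sketch` names, a fixed-size table, and a counter in place of a real controller callback; it only shows that the map step runs once per hardware irq, after the table entry is filled in, and that setting the trigger is a separate call.

```c
#include <assert.h>
#include <stddef.h>

#define NR_VIRQS_SKETCH 16
#define NO_VIRQ_SKETCH  0u   /* virq 0 is treated as "no mapping" here */

enum trigger_sketch { TRIGGER_NONE, TRIGGER_EDGE_RISING, TRIGGER_LEVEL_LOW };

struct irq_host_sketch {
    int used[NR_VIRQS_SKETCH];
    unsigned int hwirq[NR_VIRQS_SKETCH];
    enum trigger_sketch type[NR_VIRQS_SKETCH];
    int map_calls;   /* how often the controller's map step ran */
};

/* Stand-in for the controller's map() callback: runs only for a fresh virq. */
static void host_map_sketch(struct irq_host_sketch *h, unsigned int virq)
{
    assert(h->used[virq]);          /* the table entry is set before map() */
    h->type[virq] = TRIGGER_NONE;   /* hardware default */
    h->map_calls++;
}

/* Map a hardware irq to a virtual one; an existing mapping is returned as is. */
unsigned int irq_create_mapping_sketch(struct irq_host_sketch *h,
                                       unsigned int hwirq)
{
    unsigned int v;

    for (v = 1; v < NR_VIRQS_SKETCH; v++)
        if (h->used[v] && h->hwirq[v] == hwirq)
            return v;               /* already mapped: map() is not called */

    for (v = 1; v < NR_VIRQS_SKETCH; v++) {
        if (!h->used[v]) {
            h->used[v] = 1;
            h->hwirq[v] = hwirq;
            host_map_sketch(h, v);
            return v;
        }
    }
    return NO_VIRQ_SKETCH;          /* table full */
}

/* Setting the trigger is a separate action with its own call. */
int irq_set_type_sketch(struct irq_host_sketch *h, unsigned int virq,
                        enum trigger_sketch type)
{
    if (virq == NO_VIRQ_SKETCH || virq >= NR_VIRQS_SKETCH || !h->used[virq])
        return -1;
    h->type[virq] = type;
    return 0;
}
```

Because the map step never sees an irq that is already in use, it does not need to guard against concurrent activity on that line in this model.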
  8. 03 Jul, 2006 4 commits
  9. 01 Jul, 2006 1 commit
  10. 30 Jun, 2006 1 commit
    • [PATCH] genirq: rename desc->handler to desc->chip · d1bef4ed
      Ingo Molnar committed
      This patch-queue improves the generic IRQ layer to be truly generic, by adding
      various abstractions and features to it, without impacting existing
      functionality.
      
      While the queue can be best described as "fix and improve everything in the
      generic IRQ layer that we could think of", and thus it consists of many
      smaller features and lots of cleanups, the one feature that stands out most is
      the new 'irq chip' abstraction.
      
      The irq-chip abstraction is about describing and coding an IRQ controller
      driver by mapping its raw hardware capabilities [and quirks, if needed] in a
      straightforward way, without having to think about "IRQ flow"
      (level/edge/etc.) type of details.
      
      This stands in contrast with the current 'irq-type' model of genirq
      architectures, which 'mixes' raw hardware capabilities with 'flow' details.
      The patchset supports both types of irq controller designs at once, and
      converts i386 and x86_64 to the new irq-chip design.
      
      As a bonus side-effect of the irq-chip approach, chained interrupt controllers
      (master/slave PIC constructs, etc.) are now supported by design as well.
      
      The end result of this patchset intends to be simpler architecture-level code
      and more consolidation between architectures.
      
      We reused many bits of code and many concepts from Russell King's ARM IRQ
      layer, the merging of which was one of the motivations for this patchset.
      
      This patch:
      
      rename desc->handler to desc->chip.
      
      Originally I did not want to do this, because it's a big patch.  But having
      "desc->handler", "desc->handle_irq" and "action->handler" side by side caused a
      large degree of confusion and made the code appear a lot less clean than it
      truly is.
      
      I also attempted a dual approach by introducing a
      desc->chip alias - but that just wasn't robust enough and broke
      frequently.
      
      So let's get this over with quickly.  The conversion was done automatically
      via scripts and converts all the code in the kernel.
      
      This renaming patch is the first one amongst the patches, so that the
      remaining patches can stay flexible and can be merged and split up
      without having some big monolithic patch act as a merge barrier.
      
      [akpm@osdl.org: build fix]
      [akpm@osdl.org: another build fix]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  11. 28 Jun, 2006 7 commits
  12. 27 Jun, 2006 1 commit
  13. 26 Jun, 2006 2 commits