1. 14 8月, 2013 1 次提交
    • C
      tile: fast-path unaligned memory access for tilegx · 2f9ac29e
      Chris Metcalf 提交于
      This change enables unaligned userspace memory access via a kernel
      fast path on tilegx.  The kernel tracks user PC/instruction pairs
      per-thread using a direct-mapped cache in userspace.  The cache
      maps those PC/instruction pairs to JIT'ed instruction sequences that
      load or store using byte-wide load store intructions and then
      synthesize 2-, 4- or 8-byte load or store results.  Once an
      instruction has been seen to generate an unaligned access once,
      subsequent hits on that instruction typically require overhead
      of only around 50 cycles if cache and TLB is hot.
      
      We support the prctl() PR_GET_UNALIGN / PR_SET_UNALIGN sys call to
      enable or disable unaligned fixups on a per-process basis.
      
      To do this we pull some of the tilepro unaligned support out of the
      single_step.c file; tilepro uses instruction disassembly for both
      single-step and unaligned access support.  Since tilegx actually has
      hardware singlestep support, though, it's cleaner to keep the tilegx
      unaligned access code in a separate file.  While we're at it,
      properly rename the tilepro-specific types, etc., to have tilepro
      suffixes instead of generic tile suffixes.
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      2f9ac29e
  2. 19 7月, 2012 2 次提交
  3. 26 5月, 2012 2 次提交
    • C
      arch/tile: support kexec() for tilegx · fc0c49f5
      Chris Metcalf 提交于
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      fc0c49f5
    • C
      arch/tile: optimize get_user/put_user and friends · 47d632f9
      Chris Metcalf 提交于
      Use direct load/store for the get_user/put_user.
      
      Previously, we would call out to a helper routine that would do the
      appropriate thing and then return, handling the possible exception
      internally.  Now we inline the load or store, along with a "we succeeded"
      indication in a register; if the load or store faults, we write a
      "we failed" indication into the same register and then return to the
      following instruction.  This is more efficient and gives us more compact
      code, as well as being more in line with what other architectures do.
      
      The special futex assembly source file for TILE-Gx also disappears in
      this change; we just use the same inlining idiom there as well, putting
      the appropriate atomic operations directly into futex_atomic_op_inuser()
      (and thus into the FUTEX_WAIT function).
      
      The underlying atomic copy_from_user, copy_to_user functions were
      renamed using the (cryptic) x86 convention as copy_from_user_ll and
      copy_to_user_ll.
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      47d632f9
  4. 05 5月, 2012 1 次提交
  5. 27 5月, 2011 1 次提交
    • C
      arch/tile: more /proc and /sys file support · f133ecca
      Chris Metcalf 提交于
      This change introduces a few of the less controversial /proc and
      /proc/sys interfaces for tile, along with sysfs attributes for
      various things that were originally proposed as /proc/tile files.
      It also adjusts the "hardwall" proc API.
      
      Arnd Bergmann reviewed the initial arch/tile submission, which
      included a complete set of all the /proc/tile and /proc/sys/tile
      knobs that we had added in a somewhat ad hoc way during initial
      development, and provided feedback on where most of them should go.
      
      One knob turned out to be similar enough to the existing
      /proc/sys/debug/exception-trace that it was re-implemented to use
      that model instead.
      
      Another knob was /proc/tile/grid, which reported the "grid" dimensions
      of a tile chip (e.g. 8x8 processors = 64-core chip).  Arnd suggested
      looking at sysfs for that, so this change moves that information
      to a pair of sysfs attributes (chip_width and chip_height) in the
      /sys/devices/system/cpu directory.  We also put the "chip_serial"
      and "chip_revision" information from our old /proc/tile/board file
      as attributes in /sys/devices/system/cpu.
      
      Other information collected via hypervisor APIs is now placed in
      /sys/hypervisor.  We create a /sys/hypervisor/type file (holding the
      constant string "tilera") to be parallel with the Xen use of
      /sys/hypervisor/type holding "xen".  We create three top-level files,
      "version" (the hypervisor's own version), "config_version" (the
      version of the configuration file), and "hvconfig" (the contents of
      the configuration file).  The remaining information from our old
      /proc/tile/board and /proc/tile/switch files becomes an attribute
      group appearing under /sys/hypervisor/board/.
      
      Finally, after some feedback from Arnd Bergmann for the previous
      version of this patch, the /proc/tile/hardwall file is split up into
      two conceptual parts.  First, a directory /proc/tile/hardwall/ which
      contains one file per active hardwall, each file named after the
      hardwall's ID and holding a cpulist that says which cpus are enclosed by
      the hardwall.  Second, a /proc/PID file "hardwall" that is either
      empty (for non-hardwall-using processes) or contains the hardwall ID.
      
      Finally, this change pushes the /proc/sys/tile/unaligned_fixup/
      directory, with knobs controlling the kernel code for handling the
      fixup of unaligned exceptions.
      Reviewed-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      f133ecca
  6. 25 11月, 2010 1 次提交
    • C
      pci root complex: support for tile architecture · f02cbbe6
      Chris Metcalf 提交于
      This change enables PCI root complex support for TILEPro.  Unlike
      TILE-Gx, TILEPro has no support for memory-mapped I/O, so the PCI
      support consists of hypervisor upcalls for PIO, DMA, etc.  However,
      the performance is fine for the devices we have tested with so far
      (1Gb Ethernet, SATA, etc.).
      
      The <asm/io.h> header was tweaked to be a little bit more aggressive
      about disabling attempts to map/unmap IO port space.  The hacky
      <asm/pci-bridge.h> header was rolled into the <asm/pci.h> header
      and the result was simplified.  Both of the latter two headers were
      preliminary versions not meant for release before now - oh well.
      
      There is one quirk for our TILEmpower platform, which accidentally
      negotiates up to 5GT and needs to be kicked down to 2.5GT.
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      f02cbbe6
  7. 07 7月, 2010 1 次提交
    • C
      arch/tile: Add driver to enable access to the user dynamic network. · 9f9c0382
      Chris Metcalf 提交于
      This network (the "UDN") connects all the cpus on the chip in a
      wormhole-routed dynamic network.  Subrectangles of the chip can
      be allocated by a "create" ioctl on /dev/hardwall, and then to access the
      UDN in that rectangle, tasks must perform an "activate" ioctl on that
      same file object after affinitizing themselves to a single cpu in
      the region.  Sending a wormhole-routed message that tries to leave
      that subrectangle causes all activated tasks to receive a SIGILL
      (just as they would if they tried to access the UDN without first
      activating themselves to a hardwall rectangle).
      
      The original submission of this code to LKML had the driver
      instantiated under /proc/tile/hardwall.  Now we just use a character
      device for this, conventionally /dev/hardwall.  Some futures planning
      for the TILE-Gx chip suggests that we may want to have other types of
      devices that share the general model of "bind a task to a cpu, then
      'activate' a file descriptor on a pseudo-device that gives access to
      some hardware resource".  As such, we are using a device rather
      than, for example, a syscall, to set up and activate this code.
      
      As part of this change, the compat_ptr() declaration was fixed and used
      to pass the compat_ioctl argument to the normal ioctl.  So far we limit
      compat code to 2GB, so the difference between zero-extend and sign-extend
      (the latter being correct, eventually) had been overlooked.
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      9f9c0382
  8. 05 6月, 2010 1 次提交