1. 19 Jan, 2016 2 commits
    • arch/tile: move user_exit() to early kernel entry sequence · 1bb50cad
      Chris Metcalf committed
      This ensures that we always notify context tracking that we
      have exited from user space no matter how we enter the kernel.
      It is similar to how arm64 handles context tracking, for example.
      
      This allows the removal of all the exception_enter() calls that
      were added in commit 49e4e156 ("tile: support CONTEXT_TRACKING and
      thus NOHZ_FULL").
      Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
    • arch/tile: adopt prepare_exit_to_usermode() model from x86 · 583b24a2
      Chris Metcalf committed
      This change is a prerequisite change for TASK_ISOLATION but also
      stands on its own for readability and maintainability.  The existing
      tile do_work_pending() was called in a loop from assembly on
      the slow path; this change moves the loop into C code as well.
      For the x86 version see commit c5c46f59 ("x86/entry: Add new,
      comprehensible entry and exit handlers written in C").
      
      This change exposes a pre-existing bug on the older tilepro platform;
      the singlestep processing is done last, but on tilepro (unlike tilegx)
      we enable interrupts while doing that processing, so we could in
      theory miss a signal or other asynchronous event.  A future change
      could fix this by breaking the singlestep work into a "prepare"
      step done in the main loop, and a "trigger" step done after exiting
      the loop.  Since this change is intended as purely a restructuring
      change, we call out the bug explicitly now, but don't yet fix it.
      Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
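The "loop in C" model described in this commit can be sketched roughly as follows. This is a minimal illustration, not the tile or x86 kernel code: the flag names, the `thread_state` struct, and the handlers are all hypothetical stand-ins for the real thread-info flags and their work functions.

```c
/* Hypothetical work flags; the real kernel uses thread-info flag bits. */
enum {
    WORK_RESCHED = 1u << 0,
    WORK_SIGNAL  = 1u << 1,
    WORK_NOTIFY  = 1u << 2,
};

/* Hypothetical stand-in for the per-thread state the exit path inspects. */
struct thread_state {
    unsigned int flags;   /* pending work bits */
    int handled;          /* number of work items processed so far */
};

static void handle_resched(struct thread_state *t)
{
    t->flags &= ~WORK_RESCHED;
    t->handled++;
}

static void handle_signal(struct thread_state *t)
{
    t->flags &= ~WORK_SIGNAL;
    /* Handling one item may queue another, which is why a loop is needed. */
    t->flags |= WORK_NOTIFY;
    t->handled++;
}

static void handle_notify(struct thread_state *t)
{
    t->flags &= ~WORK_NOTIFY;
    t->handled++;
}

/*
 * Keep processing pending work until none is left, then return to the
 * (assembly) caller. Returns the number of passes through the loop.
 */
static int prepare_exit_to_usermode_sketch(struct thread_state *t)
{
    int passes = 0;

    while (t->flags != 0) {
        unsigned int work = t->flags;   /* snapshot for this pass */

        if (work & WORK_RESCHED)
            handle_resched(t);
        if (work & WORK_SIGNAL)
            handle_signal(t);
        if (work & WORK_NOTIFY)
            handle_notify(t);
        passes++;
    }
    return passes;
}
```

In this sketch, a thread starting with `WORK_RESCHED | WORK_SIGNAL` needs two passes, because the signal handler queues further work that is only seen when the flags are re-read.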
  2. 31 Jul, 2015 1 commit
  3. 08 Mar, 2014 2 commits
  4. 28 Sep, 2013 1 commit
  5. 04 Sep, 2013 2 commits
    • tile: remove support for TILE64 · d7c96611
      Chris Metcalf committed
      This chip is no longer being actively developed for (it was superseded
      by the TILEPro64 in 2008), and in any case the existing compiler and
      toolchain in the community do not support it.  It's unlikely that the
      kernel works with TILE64 at this point as the configuration has not been
      tested in years.  The support is also awkward as it requires maintaining
      a significant number of ifdefs.  So, just remove it altogether.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: parameterize VA and PA space more cleanly · acbde1db
      Chris Metcalf committed
      The existing code relied on the hardware definition (<arch/chip.h>)
      to specify how much VA and PA space was available.  It's convenient
      to allow customizing this for some configurations, so provide symbols
      MAX_PA_WIDTH and MAX_VA_WIDTH in <asm/page.h> that can be modified
      if desired.
      
      Additionally, move away from the MEM_XX_INTRPT nomenclature to
      define the start of various regions within the VA space.  In fact
      the cleaner symbol is, for example, MEM_SV_START, to indicate the
      start of the area used for supervisor code; the actual address of the
      interrupt vectors is not as important, and can be changed if desired.
      As part of this change, convert from "intrpt1" nomenclature (which
      built in the old privilege-level 1 model) to a simple "intrpt".
      
      Also strip out some tilepro-specific code supporting modifying the
      PL the kernel could run at, since we don't actually support using
      different PLs in tilepro, only tilegx.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  6. 30 Aug, 2013 1 commit
    • tilegx: change how we find the kernel stack · 35f05976
      Chris Metcalf committed
      Previously, we used a special-purpose register (SPR_SYSTEM_SAVE_K_0)
      to hold the CPU number and the top of the current kernel stack
      by using the low bits to hold the CPU number, and using the high
      bits to hold the address of the page just above where we'd want
      the kernel stack to be.  That way we could initialize a new SP
      when first entering the kernel by just masking the SPR value and
      subtracting a couple of words.
      
      However, it's actually more useful to be able to place an arbitrary
      kernel-top value in the SPR.  This allows us to create a new stack
      context (e.g. for virtualization) with an arbitrary top-of-stack VA.
      To make this work, we now store the CPU number in the high bits,
      above the highest legal VA bit (42 bits in the current tilegx
      microarchitecture).  The full 42 bits are thus available to store the
      top of stack value.  Getting the current cpu (a relatively common
      operation) is still fast; it's now a shift rather than a mask.
      
      We make this change only for tilegx, since tilepro has too few SPR
      bits to do this, and we don't need this support on tilepro anyway.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
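The new register layout can be illustrated with a small packing sketch. It assumes a 64-bit SPR and treats the top-of-stack VA as a plain unsigned 42-bit value; the helper names are hypothetical, and the real kernel may handle the VA differently (for example, by sign-extending it on extraction).

```c
#include <assert.h>
#include <stdint.h>

/* 42 VA bits, per the commit message; the CPU number sits above them. */
#define VA_BITS 42u
#define VA_MASK ((UINT64_C(1) << VA_BITS) - 1)

/* Hypothetical helper: combine CPU number and top-of-stack VA. */
static inline uint64_t spr_pack(uint64_t cpu, uint64_t ksp_top)
{
    assert((ksp_top & ~VA_MASK) == 0);               /* VA fits in 42 bits */
    assert(cpu < (UINT64_C(1) << (64 - VA_BITS)));   /* CPU fits above it */
    return (cpu << VA_BITS) | ksp_top;
}

/* Getting the CPU number is a single shift. */
static inline uint64_t spr_cpu(uint64_t spr)
{
    return spr >> VA_BITS;
}

/* The stack top is an arbitrary 42-bit value, recovered with a mask. */
static inline uint64_t spr_ksp_top(uint64_t spr)
{
    return spr & VA_MASK;
}
```

Under the old scheme the roles were reversed: the CPU number lived in the low bits and the stack address had to be page-aligned, which is what prevented an arbitrary top-of-stack value.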
  7. 14 Aug, 2013 3 commits
    • tile: provide traceability for hypervisor calls · 9ae09838
      Chris Metcalf committed
      This change adds infrastructure (CONFIG_TILE_HVGLUE_TRACE) that
      provides C code wrappers for the calls the kernel makes to the Tilera
      hypervisor.  This allows standard kernel infrastructure like FTRACE to
      be able to instrument hypervisor calls.
      
      To allow direct calls to the true API, we export their names with a
      leading underscore as well.  This is important for the few contexts
      where we need to make hypervisor calls without touching the stack.
      
      As part of this change, we also switch from creating the symbols
      with linker magic to creating them with assembler magic.  This lets
      us provide a symbol type and generally make them appear more as symbols
      and less as just random values in the Elf namespace.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: support CONFIG_PREEMPT · bc1a298f
      Chris Metcalf committed
      This change adds support for CONFIG_PREEMPT (full kernel preemption).
      In addition to the core support, this change includes a number
      of places where we fix up uses of smp_processor_id() and per-cpu
      variables.  I also eliminate the PAGE_HOME_HERE and PAGE_HOME_UNKNOWN
      values for page homing, as it turns out they weren't being used.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: fast-path unaligned memory access for tilegx · 2f9ac29e
      Chris Metcalf committed
      This change enables unaligned userspace memory access via a kernel
      fast path on tilegx.  The kernel tracks user PC/instruction pairs
      per-thread using a direct-mapped cache in userspace.  The cache
      maps those PC/instruction pairs to JIT'ed instruction sequences that
      load or store using byte-wide load/store instructions and then
      synthesize 2-, 4- or 8-byte load or store results.  Once an
      instruction has been seen to generate an unaligned access once,
      subsequent hits on that instruction typically require overhead
      of only around 50 cycles if the cache and TLB are hot.
      
      We support the prctl() PR_GET_UNALIGN / PR_SET_UNALIGN sys call to
      enable or disable unaligned fixups on a per-process basis.
      
      To do this we pull some of the tilepro unaligned support out of the
      single_step.c file; tilepro uses instruction disassembly for both
      single-step and unaligned access support.  Since tilegx actually has
      hardware singlestep support, though, it's cleaner to keep the tilegx
      unaligned access code in a separate file.  While we're at it,
      properly rename the tilepro-specific types, etc., to have tilepro
      suffixes instead of generic tile suffixes.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
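The idea of synthesizing a wide access from byte-wide accesses can be shown in portable C. This is only a sketch of the technique, assuming little-endian byte order; the function names are hypothetical, and the kernel's fast path emits JIT'ed machine instructions rather than calling C helpers like these.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Load `size` bytes (2, 4 or 8) from a possibly unaligned address using
 * only byte-wide loads, assembling the result in little-endian order.
 */
static uint64_t load_unaligned_le(const unsigned char *p, size_t size)
{
    uint64_t v = 0;

    for (size_t i = 0; i < size; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Store the low `size` bytes of `v` using only byte-wide stores. */
static void store_unaligned_le(unsigned char *p, uint64_t v, size_t size)
{
    for (size_t i = 0; i < size; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}
```

Because every access is one byte wide, the address `p` never needs any particular alignment, which is why a sequence of this shape cannot itself fault on alignment.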
  8. 13 Aug, 2013 1 commit
  9. 22 Mar, 2013 1 commit
  10. 24 Oct, 2012 2 commits
  11. 21 Oct, 2012 1 commit
  12. 17 May, 2012 1 commit
    • arch/tile: fix up some issues in calling do_work_pending() · fc327e26
      Chris Metcalf committed
      First, we were at risk of handling thread-info flags, in particular
      do_signal(), when returning from kernel space.  This could happen
      after a failed kernel_execve(), or when forking a kernel thread.
      The fix is to test in do_work_pending() for user_mode() and return
      immediately if not; we already had this test for one of the flags,
      so I just hoisted it to the top of the function.
      
      Second, if a ptraced process updated the callee-saved registers
      in the ptregs struct and then processed another thread-info flag, we
      would overwrite the modifications with the original callee-saved
      registers.  To fix this, we add a register to note if we've already
      saved the registers once, and skip doing it on additional passes
      through the loop.  To avoid a performance hit from the couple of
      extra instructions involved, I modified the GET_THREAD_INFO() macro
      to be guaranteed to be one instruction, then bundled it with adjacent
      instructions, yielding an overall net savings.
      Reported-By: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  13. 03 Apr, 2012 1 commit
  14. 13 Oct, 2011 1 commit
  15. 27 Jul, 2011 1 commit
  16. 05 May, 2011 1 commit
    • arch/tile: allow nonatomic stores to interoperate with fast atomic syscalls · df29ccb6
      Chris Metcalf committed
      This semantic was already true for atomic operations within the kernel,
      and this change makes it true for the fast atomic syscalls (__NR_cmpxchg
      and __NR_atomic_update) as well.  Previously, user-space had to use
      the fast atomic syscalls exclusively to update memory, since raw stores
      could lose a race with the atomic update code even when the atomic update
      hadn't actually modified the value.
      
      With this change, we no longer write back the value to memory if it
      hasn't changed.  This allows certain types of idioms in user space to
      work as expected, e.g. "atomic exchange" to acquire a spinlock, followed
      by a raw store of zero to release the lock.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
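The spinlock idiom mentioned here looks roughly like the sketch below. It uses C11 `<stdatomic.h>` as a portable analogue and does not call the tile fast atomic syscalls; the type and function names are hypothetical. The release is written as an atomic store with release ordering, which on most architectures compiles to an ordinary store, standing in for the "raw store of zero" in the commit message.

```c
#include <stdatomic.h>

/* Hypothetical lock type: 0 means free, 1 means held. */
typedef struct {
    atomic_int locked;
} spinlock_sketch;

/* One "atomic exchange" attempt; returns 1 if the lock was acquired. */
static int lock_try_acquire(spinlock_sketch *l)
{
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

/* Spin until the exchange observes the lock as free. */
static void lock_acquire(spinlock_sketch *l)
{
    while (!lock_try_acquire(l))
        ;   /* busy-wait */
}

/* Release with a simple store of zero rather than another atomic op. */
static void lock_release(spinlock_sketch *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The idiom relies on the property this commit adds: an atomic update that does not change the value must not write it back, otherwise a waiter's failed exchange could overwrite the holder's releasing store.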
  17. 03 May, 2011 1 commit
    • arch/tile: support TIF_NOTIFY_RESUME · 313ce674
      Chris Metcalf committed
      This support is required for CONFIG_KEYS, NFSv4 kernel DNS, etc.
      The change is slightly more complex than the minimal thing, since
      I took advantage of having to go into the assembly code to just
      move a bunch of stuff into C code: specifically, the schedule(),
      do_async_page_fault(), do_signal(), and single_step_once() support,
      in addition to the TIF_NOTIFY_RESUME support.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  18. 11 Mar, 2011 2 commits
    • arch/tile: support 4KB page size as well as 64KB · 76c567fb
      Chris Metcalf committed
      The Tilera architecture traditionally supports 64KB page sizes
      to improve TLB utilization and improve performance when the
      hardware is being used primarily to run a single application.
      
      For more generic server scenarios, it can be beneficial to run
      with 4KB page sizes, so this commit allows that to be specified
      (by modifying the arch/tile/include/hv/pagesize.h header).
      
      As part of this change, we also re-worked the PTE management
      slightly so that PTE writes all go through a __set_pte() function
      where we can do some additional validation.  The set_pte_order()
      function was eliminated since the "order" argument wasn't being used.
      
      One bug uncovered was in the PCI DMA code, which wasn't properly
      flushing the specified range.  This was benign with 64KB pages,
      but with 4KB pages we were getting some larger flushes wrong.
      
      The per-cpu memory reservation code also needed updating to
      conform with the newer percpu stuff; before it always chose 64KB,
      and that was always correct, but with 4KB granularity we now have
      to pay closer attention and reserve the amount of memory that will
      be requested when the percpu code starts allocating.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • arch/tile: fix some comments and whitespace · 5fb682b0
      Chris Metcalf committed
      This is a grab bag of changes with no actual change to generated code.
      This includes whitespace and comment typos, plus a couple of stale
      comments being removed.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  19. 02 Mar, 2011 1 commit
    • arch/tile: stop disabling INTCTRL_1 interrupts during hypervisor downcalls · b2ce2bda
      Chris Metcalf committed
      The problem was that this could lead to IPIs being disabled during
      the softirq processing after a hypervisor downcall (e.g. for I/O),
      since both IPI and device interrupts use the INTCTRL_1 downcall mechanism.
      When this happened at the wrong time, it could lead to deadlock.
      
      Luckily, we were already maintaining the per-interrupt state we need,
      and using it in the proper way in the hypervisor, so all we had to do
      was to change Linux to stop blocking downcall interrupts for the entire
      length of the downcall.  (Now they're blocked while we're executing the
      downcall routine itself, but not while we're executing any subsequent
      softirq routines.)  The hypervisor is doing a very small amount of
      work it no longer needs to do (masking INTCTRL_1 on entry to the client
      interrupt routine), but doing so means that older versions of Tile Linux
      will continue to work with a current hypervisor, so that seems reasonable.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  20. 18 Dec, 2010 1 commit
    • arch/tile: handle rt_sigreturn() more cleanly · 81711cee
      Chris Metcalf committed
      The current tile rt_sigreturn() syscall pattern uses the common idiom
      of loading up pt_regs with all the saved registers from the time of
      the signal, then anticipating the fact that we will clobber the ABI
      "return value" register (r0) as we return from the syscall by setting
      the rt_sigreturn return value to whatever random value was in the pt_regs
      for r0.
      
      However, this breaks in our 64-bit kernel when running "compat" tasks,
      since we always sign-extend the "return value" register to properly
      handle returned pointers that are in the upper 2GB of the 32-bit compat
      address space.  Doing this to the sigreturn path then causes occasional
      random corruption of the 64-bit r0 register.
      
      Instead, we stop doing the crazy "load the return-value register"
      hack in sigreturn.  We already have some sigreturn-specific assembly
      code that we use to pass the pt_regs pointer to C code.  We extend that
      code to also set the link register to point to a spot a few instructions
      after the usual syscall return address so we don't clobber the saved r0.
      Now it no longer matters what the rt_sigreturn syscall returns, and the
      pt_regs structure can be cleanly and completely reloaded.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
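The corruption described above follows from plain sign-extension arithmetic, which the following sketch reproduces. The helper name is hypothetical and this is not the kernel's return path; it only shows what happens when a saved 64-bit r0 is pushed through a 32-bit sign-extending "return value" step.

```c
#include <stdint.h>

/*
 * Hypothetical model of the compat return path: keep the low 32 bits of
 * the value and sign-extend them to 64 bits.
 */
static int64_t compat_sign_extend(uint64_t r0)
{
    return (int64_t)(int32_t)(uint32_t)r0;
}
```

A compat pointer in the upper 2GB, such as `0x80001000`, becomes `0xFFFFFFFF80001000`, which is the intended behavior for returned pointers. A saved register holding a full 64-bit value such as `0x123456789ABCDEF0` comes back as `0xFFFFFFFF9ABCDEF0`, so restoring r0 through this path loses its upper half.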
  21. 16 Oct, 2010 2 commits
  22. 15 Oct, 2010 3 commits
  23. 25 Sep, 2010 1 commit
    • arch/tile: remove dead code from intvec_32.S · ea44e06e
      Chris Metcalf committed
      This "bpt_code" instruction was killed off in our development line a while
      ago (the actual definition of bpt_code that is used is in kernel/traps.c)
      but I didn't push it for 2.6.36 because it seemed harmless and I didn't
      want to try to push more than absolutely necessary.
      
      However, we recently fixed a bug in our gcc that had been causing
      "-gdwarf2" not to be passed to the assembler, and passing this flag causes
      an erroneous assembler failure in the presence of code in a data section,
      sometimes.  While we'd like to track down the bug in the assembler,
      we'd also like to make sure 2.6.36 builds with the current toolchain,
      so I'm removing this dead code as well.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  24. 14 Aug, 2010 1 commit
    • arch/tile: extend syscall ABI to set r1 on return as well. · ba00376b
      Chris Metcalf committed
      Until now, the tile architecture ABI for syscall return has just been
      that r0 holds the return value, and an error is only signalled like it is
      for kernel code, with a negative small number.
      
      However, this means that in multiple places in userspace we end up writing
      the same three-cycle idiom that tests for a small negative number for
      error.  It seems cleaner to instead move that code into the kernel, and
      set r1 to hold zero on success or errno on failure; previously, r1 was
      just zeroed on return from the kernel (to avoid leaking kernel state).
      This way a single conditional branch after the syscall is sufficient
      to test for the failure case.  The number of cycles taken is the same,
      but the error-checking code is in just one place, so total code size is
      smaller, and random userspace syscall code is easier to understand.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
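The difference between the two return conventions can be sketched in C. The struct and function names are hypothetical, and the 4095 bound is the usual Linux convention for the errno range rather than something stated in this commit; real userspace would do these checks in assembly right after the syscall instruction.

```c
/* Hypothetical view of the two registers the kernel sets on return. */
struct syscall_ret {
    long r0;   /* return value, or negative errno on failure */
    long r1;   /* new ABI: zero on success, errno on failure */
};

#define MAX_ERRNO 4095   /* assumed bound on errno values */

/* Kernel side (sketch): derive r1 from r0 once, in one place. */
static struct syscall_ret kernel_set_return(long r0)
{
    struct syscall_ret ret = { r0, 0 };

    if ((unsigned long)r0 >= (unsigned long)-MAX_ERRNO)
        ret.r1 = -r0;
    return ret;
}

/* Old ABI: every userspace caller range-tests r0 itself. */
static long check_old_abi(long r0, int *err)
{
    if ((unsigned long)r0 >= (unsigned long)-MAX_ERRNO) {
        *err = (int)-r0;
        return -1;
    }
    *err = 0;
    return r0;
}

/* New ABI: a single test of r1 against zero is enough. */
static long check_new_abi(struct syscall_ret ret, int *err)
{
    if (ret.r1 != 0) {
        *err = (int)ret.r1;
        return -1;
    }
    *err = 0;
    return ret.r0;
}
```

Both checks give the same answer; the point of the change is that the range test lives only in the kernel, and callers need just one conditional branch.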
  25. 07 Jul, 2010 1 commit
    • arch/tile: Add driver to enable access to the user dynamic network. · 9f9c0382
      Chris Metcalf committed
      This network (the "UDN") connects all the cpus on the chip in a
      wormhole-routed dynamic network.  Subrectangles of the chip can
      be allocated by a "create" ioctl on /dev/hardwall, and then to access the
      UDN in that rectangle, tasks must perform an "activate" ioctl on that
      same file object after affinitizing themselves to a single cpu in
      the region.  Sending a wormhole-routed message that tries to leave
      that subrectangle causes all activated tasks to receive a SIGILL
      (just as they would if they tried to access the UDN without first
      activating themselves to a hardwall rectangle).
      
      The original submission of this code to LKML had the driver
      instantiated under /proc/tile/hardwall.  Now we just use a character
      device for this, conventionally /dev/hardwall.  Some future planning
      for the TILE-Gx chip suggests that we may want to have other types of
      devices that share the general model of "bind a task to a cpu, then
      'activate' a file descriptor on a pseudo-device that gives access to
      some hardware resource".  As such, we are using a device rather
      than, for example, a syscall, to set up and activate this code.
      
      As part of this change, the compat_ptr() declaration was fixed and used
      to pass the compat_ioctl argument to the normal ioctl.  So far we limit
      compat code to 2GB, so the difference between zero-extend and sign-extend
      (the latter being correct, eventually) had been overlooked.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
  26. 05 Jun, 2010 1 commit