1. 24 Oct, 2012 1 commit
  2. 21 Oct, 2012 1 commit
  3. 17 May, 2012 1 commit
    • arch/tile: fix up some issues in calling do_work_pending() · fc327e26
      Committed by Chris Metcalf
      First, we were at risk of handling thread-info flags, in particular
      do_signal(), when returning from kernel space.  This could happen
      after a failed kernel_execve(), or when forking a kernel thread.
      The fix is to test in do_work_pending() for user_mode() and return
      immediately if not; we already had this test for one of the flags,
      so I just hoisted it to the top of the function.
      
      Second, if a ptraced process updated the callee-saved registers
      in the ptregs struct and then processed another thread-info flag, we
      would overwrite the modifications with the original callee-saved
      registers.  To fix this, we add a register to note if we've already
      saved the registers once, and skip doing it on additional passes
      through the loop.  To avoid a performance hit from the couple of
      extra instructions involved, I modified the GET_THREAD_INFO() macro
      to be guaranteed to be one instruction, then bundled it with adjacent
      instructions, yielding an overall net savings.
      Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
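The two fixes described in fc327e26 can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the tile code: `regs_model`, `work_stats`, `do_work_pending_model()`, and the `WORK_*` flag values are hypothetical names invented here, and the real save of the callee-saved registers happens in assembly rather than in a C loop.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real TIF_* values live in the tile headers. */
#define WORK_NEED_RESCHED 0x1u
#define WORK_SIGPENDING   0x2u

struct regs_model {
    bool user_mode;     /* true if we are returning to user space */
    int  callee_saved;  /* stand-in for the callee-saved register block */
};

struct work_stats {
    int saves;          /* times the callee-saved state was captured */
    int handled;        /* number of flags processed */
};

struct work_stats do_work_pending_model(struct regs_model *regs,
                                        uint32_t flags,
                                        int live_callee_saved)
{
    struct work_stats st = { 0, 0 };
    bool saved = false;

    /* Fix 1: never handle thread-info flags on a return to kernel space. */
    if (!regs->user_mode)
        return st;

    while (flags != 0) {
        /* Fix 2: capture the callee-saved registers only on the first pass,
         * so later passes cannot overwrite a tracer's modifications. */
        if (!saved) {
            regs->callee_saved = live_callee_saved;
            saved = true;
            st.saves++;
        }
        if (flags & WORK_SIGPENDING) {
            regs->callee_saved += 100;  /* simulate a ptrace-style update */
            flags &= ~WORK_SIGPENDING;
        } else if (flags & WORK_NEED_RESCHED) {
            flags &= ~WORK_NEED_RESCHED;
        } else {
            flags = 0;                  /* ignore bits this model does not know */
        }
        st.handled++;
    }
    return st;
}
```

In this model, a kernel-mode return processes nothing, and a user-mode return with both flags set saves the registers once and keeps the simulated tracer update intact.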
  4. 03 Apr, 2012 1 commit
  5. 13 Oct, 2011 1 commit
  6. 27 Jul, 2011 1 commit
  7. 05 May, 2011 1 commit
    • arch/tile: allow nonatomic stores to interoperate with fast atomic syscalls · df29ccb6
      Committed by Chris Metcalf
      This semantic was already true for atomic operations within the kernel,
      and this change makes it true for the fast atomic syscalls (__NR_cmpxchg
      and __NR_atomic_update) as well.  Previously, user-space had to use
      the fast atomic syscalls exclusively to update memory, since raw stores
      could lose a race with the atomic update code even when the atomic update
      hadn't actually modified the value.
      
      With this change, we no longer write back the value to memory if it
      hasn't changed.  This allows certain types of idioms in user space to
      work as expected, e.g. "atomic exchange" to acquire a spinlock, followed
      by a raw store of zero to release the lock.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
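The spinlock idiom mentioned in df29ccb6 (atomic exchange to acquire, store of zero to release) might look roughly like the following. This is a portable sketch that uses C11 `<stdatomic.h>` as a stand-in for the tile fast atomic syscalls; the type and function names are hypothetical, and the release is written as an atomic store with release ordering because portable C has no direct equivalent of the "raw store" the commit refers to.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 means free, 1 means held. */
typedef struct {
    atomic_int locked;
} xchg_lock;

void xchg_lock_init(xchg_lock *l)
{
    atomic_init(&l->locked, 0);
}

/* Try once: swap in 1 and succeed only if the old value was 0. */
bool xchg_trylock(xchg_lock *l)
{
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

/* Spin until the exchange observes a free lock. */
void xchg_lock_acquire(xchg_lock *l)
{
    while (!xchg_trylock(l))
        ;  /* busy-wait */
}

/* Release with a plain store of zero, no read-modify-write needed. */
void xchg_unlock(xchg_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The point of the commit is that the release side does not need to go through the atomic update path: once a failed or no-op atomic update stops writing the old value back, a simple store of zero cannot be lost.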
  8. 03 May, 2011 1 commit
    • arch/tile: support TIF_NOTIFY_RESUME · 313ce674
      Committed by Chris Metcalf
      This support is required for CONFIG_KEYS, NFSv4 kernel DNS, etc.
      The change is slightly more complex than the minimal thing, since
      I took advantage of having to go into the assembly code to just
      move a bunch of stuff into C code: specifically, the schedule(),
      do_async_page_fault(), do_signal(), and single_step_once() support,
      in addition to the TIF_NOTIFY_RESUME support.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  9. 11 Mar, 2011 2 commits
    • arch/tile: support 4KB page size as well as 64KB · 76c567fb
      Committed by Chris Metcalf
      The Tilera architecture traditionally supports 64KB page sizes
      to improve TLB utilization and improve performance when the
      hardware is being used primarily to run a single application.
      
      For more generic server scenarios, it can be beneficial to run
      with 4KB page sizes, so this commit allows that to be specified
      (by modifying the arch/tile/include/hv/pagesize.h header).
      
      As part of this change, we also re-worked the PTE management
      slightly so that PTE writes all go through a __set_pte() function
      where we can do some additional validation.  The set_pte_order()
      function was eliminated since the "order" argument wasn't being used.
      
      One bug uncovered was in the PCI DMA code, which wasn't properly
      flushing the specified range.  This was benign with 64KB pages,
      but with 4KB pages we were getting some larger flushes wrong.
      
      The per-cpu memory reservation code also needed updating to
      conform with the newer percpu stuff; before it always chose 64KB,
      and that was always correct, but with 4KB granularity we now have
      to pay closer attention and reserve the amount of memory that will
      be requested when the percpu code starts allocating.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • arch/tile: fix some comments and whitespace · 5fb682b0
      Committed by Chris Metcalf
      This is a grab bag of changes with no actual change to generated code.
      This includes whitespace and comment typos, plus a couple of stale
      comments being removed.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  10. 02 Mar, 2011 1 commit
    • arch/tile: stop disabling INTCTRL_1 interrupts during hypervisor downcalls · b2ce2bda
      Committed by Chris Metcalf
      The problem was that this could lead to IPIs being disabled during
      the softirq processing after a hypervisor downcall (e.g. for I/O),
      since both IPI and device interrupts use the INTCTRL_1 downcall mechanism.
      When this happened at the wrong time, it could lead to deadlock.
      
      Luckily, we were already maintaining the per-interrupt state we need,
      and using it in the proper way in the hypervisor, so all we had to do
      was to change Linux to stop blocking downcall interrupts for the entire
      length of the downcall.  (Now they're blocked while we're executing the
      downcall routine itself, but not while we're executing any subsequent
      softirq routines.)  The hypervisor is doing a very small amount of
      work it no longer needs to do (masking INTCTRL_1 on entry to the client
      interrupt routine), but doing so means that older versions of Tile Linux
      will continue to work with a current hypervisor, so that seems reasonable.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  11. 18 Dec, 2010 1 commit
    • arch/tile: handle rt_sigreturn() more cleanly · 81711cee
      Committed by Chris Metcalf
      The current tile rt_sigreturn() syscall pattern uses the common idiom
      of loading up pt_regs with all the saved registers from the time of
      the signal, then anticipating the fact that we will clobber the ABI
      "return value" register (r0) as we return from the syscall by setting
      the rt_sigreturn return value to whatever random value was in the pt_regs
      for r0.
      
      However, this breaks in our 64-bit kernel when running "compat" tasks,
      since we always sign-extend the "return value" register to properly
      handle returned pointers that are in the upper 2GB of the 32-bit compat
      address space.  Doing this to the sigreturn path then causes occasional
      random corruption of the 64-bit r0 register.
      
      Instead, we stop doing the crazy "load the return-value register"
      hack in sigreturn.  We already have some sigreturn-specific assembly
      code that we use to pass the pt_regs pointer to C code.  We extend that
      code to also set the link register to point to a spot a few instructions
      after the usual syscall return address so we don't clobber the saved r0.
      Now it no longer matters what the rt_sigreturn syscall returns, and the
      pt_regs structure can be cleanly and completely reloaded.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
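The corruption described in 81711cee can be reproduced arithmetically. The sketch below only models the effect of sign-extending the low 32 bits of a 64-bit register value; `compat_sign_extend()` is a hypothetical helper, not a function from the tile sources.

```c
#include <stdint.h>

/* Model of the compat return path: keep the low 32 bits of the register
 * and replicate bit 31 into the upper half. */
uint64_t compat_sign_extend(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;

    if (low & 0x80000000u)
        return 0xFFFFFFFF00000000ull | low;
    return low;
}
```

For a returned pointer such as `0x80000000`, the result `0xFFFFFFFF80000000` is what the compat path wants. For a saved r0 being restored by rt_sigreturn, however, any value whose upper 32 bits carry information is altered: `0x123456787FFFFFFF` comes back as `0x7FFFFFFF`. That is why the commit bypasses the normal return-value handling on the sigreturn path.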
  12. 16 Oct, 2010 2 commits
  13. 15 Oct, 2010 3 commits
  14. 25 Sep, 2010 1 commit
    • arch/tile: remove dead code from intvec_32.S · ea44e06e
      Committed by Chris Metcalf
      This "bpt_code" instruction was killed off in our development line a while
      ago (the actual definition of bpt_code that is used is in kernel/traps.c)
      but I didn't push it for 2.6.36 because it seemed harmless and I didn't
      want to try to push more than absolutely necessary.
      
      However, we recently fixed a bug in our gcc that had been causing
      "-gdwarf2" not to be passed to the assembler, and passing this flag causes
      an erroneous assembler failure in the presence of code in a data section,
      sometimes.  While we'd like to track down the bug in the assembler,
      we'd also like to make sure 2.6.36 builds with the current toolchain,
      so I'm removing this dead code as well.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  15. 14 Aug, 2010 1 commit
    • arch/tile: extend syscall ABI to set r1 on return as well. · ba00376b
      Committed by Chris Metcalf
      Until now, the tile architecture ABI for syscall return has just been
      that r0 holds the return value, and an error is only signalled like it is
      for kernel code, with a negative small number.
      
      However, this means that in multiple places in userspace we end up writing
      the same three-cycle idiom that tests for a small negative number for
      error.  It seems cleaner to instead move that code into the kernel, and
      set r1 to hold zero on success or errno on failure; previously, r1 was
      just zeroed on return from the kernel (to avoid leaking kernel state).
      This way a single conditional branch after the syscall is sufficient
      to test for the failure case.  The number of cycles taken is the same,
      but the error-checking code is in just one place, so total code size is
      smaller, and random userspace syscall code is easier to understand.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
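The return convention described in ba00376b can be sketched in plain C. This is an illustrative model, not the kernel's assembly: `syscall_ret`, `make_syscall_ret()`, and `MAX_ERRNO_ASSUMED` are hypothetical names, and the 4095 bound is an assumption borrowed from the generic Linux convention for "small negative" error values.

```c
/* Assumed upper bound on errno values, following the common Linux
 * convention; the tile code may use a different test. */
#define MAX_ERRNO_ASSUMED 4095

/* Model of the two registers visible to user space after a syscall. */
struct syscall_ret {
    long r0;  /* raw return value, unchanged from the old ABI */
    long r1;  /* zero on success, positive errno on failure */
};

struct syscall_ret make_syscall_ret(long result)
{
    struct syscall_ret ret = { result, 0 };

    /* The "small negative number" test now lives in one place. */
    if (result < 0 && result >= -MAX_ERRNO_ASSUMED)
        ret.r1 = -result;
    return ret;
}
```

A user-space wrapper then only needs to branch on `r1 != 0` and copy `r1` into `errno`, instead of repeating the range check on `r0` after every syscall.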
  16. 07 Jul, 2010 1 commit
    • arch/tile: Add driver to enable access to the user dynamic network. · 9f9c0382
      Committed by Chris Metcalf
      This network (the "UDN") connects all the cpus on the chip in a
      wormhole-routed dynamic network.  Subrectangles of the chip can
      be allocated by a "create" ioctl on /dev/hardwall, and then to access the
      UDN in that rectangle, tasks must perform an "activate" ioctl on that
      same file object after affinitizing themselves to a single cpu in
      the region.  Sending a wormhole-routed message that tries to leave
      that subrectangle causes all activated tasks to receive a SIGILL
      (just as they would if they tried to access the UDN without first
      activating themselves to a hardwall rectangle).
      
      The original submission of this code to LKML had the driver
      instantiated under /proc/tile/hardwall.  Now we just use a character
      device for this, conventionally /dev/hardwall.  Some futures planning
      for the TILE-Gx chip suggests that we may want to have other types of
      devices that share the general model of "bind a task to a cpu, then
      'activate' a file descriptor on a pseudo-device that gives access to
      some hardware resource".  As such, we are using a device rather
      than, for example, a syscall, to set up and activate this code.
      
      As part of this change, the compat_ptr() declaration was fixed and used
      to pass the compat_ioctl argument to the normal ioctl.  So far we limit
      compat code to 2GB, so the difference between zero-extend and sign-extend
      (the latter being correct, eventually) had been overlooked.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
  17. 05 Jun, 2010 1 commit