1. 27 10月, 2012 1 次提交
    • D
      sparc64: Make montmul/montsqr/mpmul usable in 32-bit threads. · 517ffce4
      David S. Miller 提交于
      The Montgomery Multiply, Montgomery Square, and Multiple-Precision
      Multiply instructions work by loading a combination of the floating
      point and multiple register windows worth of integer registers
      with the inputs.
      
      These values are 64-bit.  But for 32-bit userland processes we only
      save the low 32-bits of each integer register during a register spill.
      This is because the register window save area is in the user stack and
      has a fixed layout.
      
      Therefore, the only way to use these instruction in 32-bit mode is to
      perform the following sequence:
      
      1) Load the top-32bits of a choosen integer register with a sentinel,
         say "-1".  This will be in the outer-most register window.
      
         The idea is that we're trying to see if the outer-most register
         window gets spilled, and thus the 64-bit values were truncated.
      
      2) Load all the inputs for the montmul/montsqr/mpmul instruction,
         down to the inner-most register window.
      
      3) Execute the opcode.
      
      4) Traverse back up to the outer-most register window.
      
      5) Check the sentinel, if it's still "-1" store the results.
         Otherwise retry the entire sequence.
      
      This retry is extremely troublesome.  If you're just unlucky and an
      interrupt or other trap happens, it'll push that outer-most window to
      the stack and clear the sentinel when we restore it.
      
      We could retry forever and never make forward progress if interrupts
      arrive at a fast enough rate (consider perf events as one example).
      So we have do limited retries and fallback to software which is
      extremely non-deterministic.
      
      Luckily it's very straightforward to provide a mechanism to let
      32-bit applications use a 64-bit stack.  Stacks in 64-bit mode are
      biased by 2047 bytes, which means that the lowest bit is set in the
      actual %sp register value.
      
      So if we see bit zero set in a 32-bit application's stack we treat
      it like a 64-bit stack.
      
      Runtime detection of such a facility is tricky, and cumbersome at
      best.  For example, just trying to use a biased stack and seeing if it
      works is hard to recover from (the signal handler will need to use an
      alt stack, plus something along the lines of longjmp).  Therefore, we
      add a system call to report a bitmask of arch specific features like
      this in a cheap and less hairy way.
      
      With help from Andy Polyakov.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      517ffce4
  2. 17 10月, 2012 2 次提交
  3. 11 10月, 2012 1 次提交
    • D
      sparc64: Fix deficiencies in sun4v error reporting. · f88620b9
      David S. Miller 提交于
      Missing error types, attributes, and report fields.  Pad out
      to 64-bytes.
      
      Make string reporting cleaner and easier to extend in the future using
      "const char *" arrays that index by either bit position, or absolute
      field value.
      
      Report the raw 64-byte error report as a sequence of u64s before the
      annotated version.
      
      Only report fields which are valid, given the context and the
      attribute bits which are set.
      
      For shutdown requests, use the local copy of the error report not the
      one we just freed up back to the queue.  Also, use orderly_poweroff()
      just like the Domain Services shutdown request code does.
      
      If the real-address reported is "-1" (unknown) try to disassemble the
      instruction to report the effective address of the access.  Only do
      this in privileged mode.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f88620b9
  4. 09 10月, 2012 7 次提交
  5. 06 10月, 2012 4 次提交
    • D
      Revert strace hiccups fix. · 2863bc54
      David S. Miller 提交于
      This reverts commit 40138249 and
      ffa9009c.
      
      There are problems with how the flag bytes were rearranged, in
      particular we really can't move values down into the lowest
      16 bits since those are used for individual state bits.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2863bc54
    • D
      sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy. · 9f825962
      David S. Miller 提交于
      This adds optimized memset/bzero/page-clear routines for Niagara-4.
      
      We basically can do what powerpc has been able to do for a decade (via
      the "dcbz" instruction), which is use cache line clearing stores for
      bzero and memsets with a 'c' argument of zero.
      
      As long as we make the cache initializing store to each 32-byte
      subblock of the L2 cache line, it works.
      
      As with other Niagara-4 optimized routines, the key is to make sure to
      avoid any usage of the %asi register, as reads and writes to it cost
      at least 50 cycles.
      
      For the user clear cases, we don't use these new routines, we use the
      Niagara-1 variants instead.  Those have to use %asi in an unavoidable
      way.
      
      A Niagara-4 8K page clear costs just under 600 cycles.
      
      Add definitions of the MRU variants of the cache initializing store
      ASIs.  By default, cache initializing stores install the line as Least
      Recently Used.  If we know we're going to use the data immediately
      (which is true for page copies and clears) we can use the Most
      Recently Used variant, to decrease the likelyhood of the lines being
      evicted before they get used.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9f825962
    • D
      compat: move compat_siginfo_t definition to asm/compat.h · 751f409d
      Denys Vlasenko 提交于
      This is a preparatory patch for the introduction of NT_SIGINFO elf note.
      
      Make the location of compat_siginfo_t uniform across eight architectures
      which have it.  Now it can be pulled in by including asm/compat.h or
      linux/compat.h.
      
      Most of the copies are verbatim.  compat_uid[32]_t had to be replaced by
      __compat_uid[32]_t.  compat_uptr_t had to be moved up before
      compat_siginfo_t in asm/compat.h on a several architectures (tile already
      had it moved up).  compat_sigval_t had to be relocated from linux/compat.h
      to asm/compat.h.
      Signed-off-by: NDenys Vlasenko <vda.linux@googlemail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Amerigo Wang <amwang@redhat.com>
      Cc: "Jonathan M. Foote" <jmfoote@cert.org>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      751f409d
    • J
      cross-arch: don't corrupt personality flags upon exec() · 16f3e95b
      Jiri Kosina 提交于
      Historically, the top three bytes of personality have been used for
      things such as ADDR_NO_RANDOMIZE, which made sense only for specific
      architectures.
      
      We now however have a flag there that is general no matter the
      architecture (UNAME26); generally we have to be careful to preserve the
      personality flags across exec().
      
      This patch tries to fix all architectures that forcefully overwrite
      personality flags during exec() (ppc32 and s390 have been fixed recently
      by commits f9783ec8 ("[S390] Do not clobber personality flags on
      exec") and 59e4c3a2 ("powerpc/32: Don't clobber personality flags on
      exec") in a similar way already).
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Zankel <chris@zankel.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      16f3e95b
  6. 05 10月, 2012 1 次提交
    • A
      sparc64: Rearrange thread info to cheaply clear syscall noerror state. · 40138249
      Al Viro 提交于
      After fixing a couple of brainos, it even seems to work.  What's done here
      is move of ->syscall_noerror right before FPDEPTH byte in ->flags and
      using sth to [%g6 + TI_SYS_NOERROR] instead of stb to [%g6 + TI_FPDEPTH] in
      both branches of etrap_save.  AFAICS, that ought to be solid.  Again,
      deciding what to do with now unused delay slot of branch on ->syscall_noerror
      and dealing with the order of tests in ret_from_sys is a separate question,
      but at least that way we don't have to clean ->syscall_noerror in there at
      all.  AFAICS, it ought to be a clear win - sth is not going to cost more than
      stb on etrap_64.S side of things, and we are losing write on syscalls.S one.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      40138249
  7. 04 10月, 2012 1 次提交
  8. 03 10月, 2012 3 次提交
  9. 01 10月, 2012 2 次提交
  10. 28 9月, 2012 1 次提交
    • D
      Make most arch asm/module.h files use asm-generic/module.h · 786d35d4
      David Howells 提交于
      Use the mapping of Elf_[SPE]hdr, Elf_Addr, Elf_Sym, Elf_Dyn, Elf_Rel/Rela,
      ELF_R_TYPE() and ELF_R_SYM() to either the 32-bit version or the 64-bit version
      into asm-generic/module.h for all arches bar MIPS.
      
      Also, use the generic definition mod_arch_specific where possible.
      
      To this end, I've defined three new config bools:
      
       (*) HAVE_MOD_ARCH_SPECIFIC
      
           Arches define this if they don't want to use the empty generic
           mod_arch_specific struct.
      
       (*) MODULES_USE_ELF_RELA
      
           Arches define this if their modules can contain RELA records.  This causes
           the Elf_Rela mapping to be emitted and allows apply_relocate_add() to be
           defined by the arch rather than have the core emit an error message.
      
       (*) MODULES_USE_ELF_REL
      
           Arches define this if their modules can contain REL records.  This causes
           the Elf_Rel mapping to be emitted and allows apply_relocate() to be
           defined by the arch rather than have the core emit an error message.
      
      Note that it is possible to allow both REL and RELA records: m68k and mips are
      two arches that do this.
      
      With this, some arch asm/module.h files can be deleted entirely and replaced
      with a generic-y marker in the arch Kbuild file.
      
      Additionally, I have removed the bits from m32r and score that handle the
      unsupported type of relocation record as that's now handled centrally.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      786d35d4
  11. 20 9月, 2012 1 次提交
  12. 07 9月, 2012 1 次提交
  13. 29 8月, 2012 1 次提交
  14. 19 8月, 2012 7 次提交
  15. 31 7月, 2012 1 次提交
  16. 27 7月, 2012 6 次提交