1. 11 Jun, 2015 1 commit
    • ia64: remove paravirt code · e55645ec
      Luis R. Rodriguez committed
      All the ia64 pvops code is now dead code since both
      xen and kvm support have been ripped out [0] [1]. Nobody
      had bothered to remove it until now. The only useful
      remaining piece was the old pvops documentation, which was
      recently generalized and moved out of ia64 [2].
      
      This has been run time tested on an ia64 Madison system.
      
      [0] 003f7de6 "KVM: ia64: remove" since v3.19-rc1
      [1] d52eefb4 "ia64/xen: Remove Xen support for ia64" since v3.14-rc1
      [2] "virtual: Documentation: simplify and generalize paravirt_ops.txt"
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  2. 06 Jan, 2015 1 commit
  3. 10 Oct, 2014 1 commit
  4. 19 Aug, 2014 1 commit
  5. 07 Aug, 2014 1 commit
  6. 20 May, 2014 1 commit
  7. 29 Jan, 2014 1 commit
  8. 14 Nov, 2013 1 commit
  9. 28 Jan, 2013 1 commit
    • cputime: Generic on-demand virtual cputime accounting · abf917cd
      Frederic Weisbecker committed
      If we want to stop the tick outside of idle as well, we need to be
      able to account the cputime without using the tick.
      
      Virtual based cputime accounting solves that problem by
      hooking into kernel/user boundaries.
      
      However, implementing CONFIG_VIRT_CPU_ACCOUNTING requires
      low level hooks and involves more overhead. But we already
      have a generic context tracking subsystem that is required
      for RCU needs by archs which plan to shut down the tick
      outside idle.
      
      This patch implements a generic virtual based cputime
      accounting that relies on these generic kernel/user hooks.
      
      There are some upsides of doing this:
      
      - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
      if context tracking is already built (already necessary for RCU in full
      tickless mode).
      
      - We can rely on the generic context tracking subsystem to dynamically
      (de)activate the hooks, so that we can switch anytime between virtual
      and tick based accounting. This way we don't have the overhead
      of the virtual accounting when the tick is running periodically.
      
      And one downside:
      
      - There is probably more overhead than a native virtual based cputime
      accounting. But this relies on hooks that are already set anyway.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
  10. 04 Jan, 2013 1 commit
  11. 29 Nov, 2012 1 commit
  12. 20 Oct, 2012 2 commits
  13. 15 Oct, 2012 1 commit
  14. 10 Jan, 2012 1 commit
  15. 03 Nov, 2011 1 commit
  16. 27 Aug, 2011 1 commit
  17. 01 Jun, 2011 1 commit
  18. 29 May, 2011 1 commit
    • ns: Wire up the setns system call · 7b21fddd
      Eric W. Biederman committed
      32bit and 64bit on x86 are tested and working.  The rest I have looked
      at closely and I can't find any problems.
      
      setns is an easy system call to wire up.  It just takes two ints so I
      don't expect any weird architecture porting problems.
      
      While doing this I have noticed that we have some architectures that are
      very slow to get new system calls.  cris seems to be the slowest where
      the last system calls wired up were preadv and pwritev.  avr32 is weird
      in that recvmmsg was wired up but never declared in unistd.h.  frv is
      behind with perf_event_open being the last syscall wired up.  On h8300
      the last system call wired up was epoll_wait.  On m32r the last system
      call wired up was fallocate.  mn10300 has recvmmsg as the last system
      call wired up.  The rest seem to at least have syncfs wired up, which was
      new in 2.6.39.
      
      v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
      v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
      v4: Moved wiring up of the system call to another patch
      v5: ported to v2.6.39-rc6
      v6: rebased onto parisc-next and net-next to avoid syscall conflicts.
      v7: ported to Linus's latest post 2.6.39 tree.
      
      >  arch/blackfin/include/asm/unistd.h     |    3 ++-
      >  arch/blackfin/mach-common/entry.S      |    1 +
      Acked-by: Mike Frysinger <vapier@gentoo.org>
      
      Oh - ia64 wiring looks good.
      Acked-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
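The message above notes that setns "just takes two ints". A minimal userspace sketch of that interface follows, assuming the glibc `setns()` wrapper from `<sched.h>`; the helper name `join_namespace` and the idea of passing a namespace file path are illustrative assumptions, not part of the commit.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* setns() is a GNU extension in glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a namespace file and join that namespace.
 * The two ints handed to setns() are the file descriptor and the
 * namespace type (0 means "any type").
 * Returns 0 on success or a negative errno value on failure.
 */
int join_namespace(const char *ns_path, int nstype)
{
    int fd = open(ns_path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -errno;

    /* Save the result before close(), which may overwrite errno. */
    int rc = (setns(fd, nstype) == 0) ? 0 : -errno;
    close(fd);
    return rc;
}
```

A caller would typically pass the path of a namespace file and a matching `CLONE_NEW*` constant (or 0); joining another namespace normally requires elevated privileges, so expect `-EPERM` when running unprivileged.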
  19. 23 Mar, 2011 1 commit
  20. 14 Aug, 2010 1 commit
  21. 09 Feb, 2010 1 commit
    • [IA64] Remove COMPAT_IA32 support · 32974ad4
      Tony Luck committed
      This has been broken since May 2008 when Al Viro killed altroot support.
      Since nobody has complained, it would appear that there are no users of
      this code (A plausible theory since the main OSVs that support ia64 prefer
      to use the IA32-EL software emulation).
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  22. 13 Oct, 2009 1 commit
    • net: Introduce recvmmsg socket syscall · a2e27255
      Arnaldo Carvalho de Melo committed
      recvmmsg means "receive multiple messages"; it reduces the number of
      syscalls and net stack entry/exit operations.
      
      Next patches will introduce mechanisms where protocols that want to
      optimize this operation will provide an unlocked_recvmsg operation.
      
      This takes into account comments made by:
      
      . Paul Moore: sock_recvmsg is called only for the first datagram,
        sock_recvmsg_nosec is used for the rest.
      
      . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
        works in the same fashion as the ppoll one.
      
        If the underlying protocol returns a datagram with MSG_OOB set, this
        will make recvmmsg return right away with as many datagrams (+ the OOB
        one) it has received so far.
      
      . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
        datagrams and then recvmsg returns an error, recvmmsg will return
        the successfully received datagrams, store the error and return it
        in the next call.
      
      This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
      where we will be able to acquire the lock only at batch start and end, not at
      every underlying recvmsg call.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
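To illustrate the batching described above, here is a small userspace sketch, assuming the glibc `recvmmsg()` wrapper and `struct mmsghdr`. The helper name `send_and_batch_receive`, the `AF_UNIX` datagram socketpair, and the buffer sizes are choices made only for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* recvmmsg() and struct mmsghdr are GNU extensions in glibc */
#endif
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_BATCH 8   /* vlen: the most datagrams one call may return */
#define MSG_LEN   32  /* per-datagram receive buffer size */

/*
 * Hypothetical demo: queue `count` one-byte datagrams on a socketpair, then
 * drain them with a single recvmmsg() call. Returns the number of datagrams
 * received, or -1 on error (including count == 0, where nothing is queued).
 */
int send_and_batch_receive(int count)
{
    int sv[2];
    if (count < 0 || count > MAX_BATCH)
        return -1;
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    for (int i = 0; i < count; i++) {
        char c = (char)('a' + i);
        if (send(sv[0], &c, 1, 0) != 1) {
            close(sv[0]);
            close(sv[1]);
            return -1;
        }
    }

    char bufs[MAX_BATCH][MSG_LEN];
    struct iovec iov[MAX_BATCH];
    struct mmsghdr msgs[MAX_BATCH];
    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < MAX_BATCH; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = MSG_LEN;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* MSG_DONTWAIT: return what is already queued instead of blocking
     * until all MAX_BATCH slots are filled. */
    int got = recvmmsg(sv[1], msgs, MAX_BATCH, MSG_DONTWAIT, NULL);

    close(sv[0]);
    close(sv[1]);
    return got;
}
```

After the call, `msgs[i].msg_len` holds the length of the i-th datagram. With three datagrams queued, one call returns all three where plain `recvmsg()` would need three syscalls.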
  23. 17 Jun, 2009 1 commit
  24. 09 Apr, 2009 1 commit
  25. 27 Mar, 2009 1 commit
  26. 14 Jan, 2009 3 commits
    • [CVE-2009-0029] Remove __attribute__((weak)) from sys_pipe/sys_pipe2 · 1134723e
      Heiko Carstens committed
      Remove __attribute__((weak)) from the common code sys_pipe implementation.
      IA64, ALPHA, SUPERH (32bit) and SPARC (32bit) have their own implementations
      with the same name. Just rename them.
      For sys_pipe2 there is no architecture specific implementation.
      
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    • ftrace, ia64: IA64 dynamic ftrace support · a14a07b8
      Shaohua Li committed
      IA64 dynamic ftrace support.
      The original _mcount stub for each function is like:
      	alloc r40=ar.pfs,12,8,0
      	mov r43=r0;;
      	mov r42=b0
      	mov r41=r1
      	nop.i 0x0
      	br.call.sptk.many b0 = _mcount;;
      
      The patch converts it to the following for nop:
      	[MII] nop.m 0x0
      	mov r3=ip
      	nop.i 0x0
      	[MLX] nop.m 0x0
      	nop.x 0x0;;
      This isn't a complete nop, as there is one instruction 'mov r3=ip', but
      it should be light and harmless for the code that follows it.
      
      And below is for call:
      	[MII] nop.m 0x0
      	mov r3=ip
      	nop.i 0x0
      	[MLX] nop.m 0x0
      	brl.many .;;
      In this way, only one instruction is changed to convert code between nop
      and call. This should meet dyn-ftrace's requirement.
      But this requires the CPU to support the brl instruction, so dyn-ftrace
      isn't supported on old Itanium systems. We assume that few such old
      systems are still running.
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace, ia64: IA64 static ftrace support · d3e75ff1
      Shaohua Li committed
      IA64 ftrace support. On IA64, the code below is added to each function
      if -pg is enabled.
      
      alloc r40=ar.pfs,12,8,0
      mov r43=r0;;
      mov r42=b0
      mov r41=r1
      nop.i 0x0
      br.call.sptk.many b0 = _mcount;;
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  27. 21 Nov, 2008 1 commit
    • [IA64] Rationalize kernel mode alignment checking · b704882e
      Tony Luck committed
      Itanium processors can handle some misaligned data accesses. They
      also provide a mode where all such accesses are forced to trap. The
      kernel was schizophrenic about use of this mode:
      
      * Base kernel code ran in permissive mode where the only traps
        generated were from those cases that the h/w could not handle.
      * Interrupt, syscall and trap code ran in strict mode where all
        unaligned accesses caused traps to the 0x5a00 unaligned reference
        vector.
      
      Use strict alignment checking throughout the kernel, but make
      sure that we continue to let user mode use more relaxed mode
      as the default.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  28. 07 Oct, 2008 1 commit
  29. 26 Jul, 2008 1 commit
  30. 28 May, 2008 1 commit
    • [IA64] pvops: paravirtualize entry.S · 4df8d22b
      Isaku Yamahata committed
      Paravirtualize ia64_switch_to, ia64_leave_syscall and ia64_leave_kernel.
      They include sensitive or performance critical privileged instructions,
      so they need paravirtualization.
      To paravirtualize them with the single-source, multi-compile approach,
      they are converted into indirect jumps, and each pv instance is defined.
      
      Cc: Keith Owens <kaos@ocs.com.au>
      Cc: "Dong, Eddie" <eddie.dong@intel.com>
      Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  31. 15 May, 2008 2 commits
  32. 22 Apr, 2008 1 commit
    • [IA64] disable interrupts on exit of ia64_trace_syscall · 38477ad7
      Hidetoshi Seto committed
      While testing with CONFIG_VIRT_CPU_ACCOUNTING=y, I found that
      some threads occasionally reported a huge system time.
      
      I dug into the issue and finally noticed that it was caused
      by an interrupt arriving in the following window:
      
      > [arch/ia64/kernel/entry.S: (!CONFIG_PREEMPT && CONFIG_VIRT_CPU_ACCOUNTING)]
      >
      > ENTRY(ia64_leave_syscall)
      >    :
      > (pUStk) rsm psr.i
      >         cmp.eq pLvSys,p0=r0,r0          // pLvSys=1: leave from syscall
      > (pUStk) cmp.eq.unc p6,p0=r0,r0          // p6 <- pUStk
      > .work_processed_syscall:
      >         adds r2=PT(LOADRS)+16,r12
      > (pUStk) mov.m r22=ar.itc                        // fetch time at leave
      >         adds r18=TI_FLAGS+IA64_TASK_SIZE,r13
      >         ;;
      > <<< window: from here >>>
      > (p6)    ld4 r31=[r18]  // load current_thread_info()->flags
      >         ld8 r19=[r2],PT(B6)-PT(LOADRS)
      >         adds r3=PT(AR_BSPSTORE)+16,r12
      >         ;;
      >         mov r16=ar.bsp
      >         ld8 r18=[r2],PT(R9)-PT(B6)
      > (p6)    and r15=TIF_WORK_MASK,r31  // any work other than TIF_SYSCALL_TRACE?
      >         ;;
      >         ld8 r23=[r3],PT(R11)-PT(AR_BSPSTORE)
      > (p6)    cmp4.ne.unc p6,p0=r15, r0               // any special work pending?
      > (p6)    br.cond.spnt .work_pending_syscall
      >         ;;
      >         ld8 r9=[r2],PT(CR_IPSR)-PT(R9)
      >         ld8 r11=[r3],PT(CR_IIP)-PT(R11)
      > (pNonSys) break 0 // bug check: we shouldn't be here if pNonSys is TRUE!
      >         ;;
      >         invala
      > <<< window: to here >>>
      >         rsm psr.i | psr.ic // turn off interrupts and interruption collection
      
      If pUStk is true, we are about to return to user mode, so we fetch
      ar.itc to record the time at which we leave the kernel.
      It would seem impossible for an interrupt to hit the window when pUStk
      is true, because interrupts are disabled earlier.  Disabling interrupts
      also makes sense because it makes reading current_thread_info()->flags safe.
      
      However, it was possible to be interrupted in the window while pUStk is true.
      The route was:
      ia64_trace_syscall
      -> .work_pending_syscall_end
      -> .work_processed_syscall
      Only when the window is entered via this route are interrupts enabled
      inside the window even though pUStk is true.  I suppose interrupts must be
      disabled here anyway if pUStk is true.
      I am not sure what other bad effects there may have been, apart from
      the crazy system time that I found.
      
      FYI, commit 6f6d7582 pointed out a bug at the
      same point (exit of ia64_trace_syscall) in 2006.
      It can be said that there was another bug.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  33. 21 Feb, 2008 1 commit
    • [IA64] VIRT_CPU_ACCOUNTING (accurate cpu time accounting) · b64f34cd
      Hidetoshi Seto committed
      This patch implements VIRT_CPU_ACCOUNTING for ia64,
      which enables more accurate cpu time accounting.
      
      VIRT_CPU_ACCOUNTING is a kernel config item that the s390
      and powerpc archs already have.  With this config turned on, these
      archs switch the cpu time accounting mechanism from a tick-sampling
      based one to a state-transition based one.
      
      State-transition based accounting reads the time
      (the processor's cycle counter) at every state-transition point,
      such as kernel entry/exit, interrupt, softirq, etc.
      The difference between two consecutive points is the actual time
      spent in that state. This value is more accurate than that of
      tick-sampling based accounting.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
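The difference between the two accounting schemes described above can be shown with a toy userspace model. This is only a sketch under simplifying assumptions: the trace format, the names `time_by_transition`, `time_by_tick` and `sample_trace`, and the abstract time units are all invented for illustration and do not mirror the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>

enum cpu_state { STATE_USER, STATE_SYSTEM };

/* At time `when` the CPU enters state `next`. */
struct transition {
    uint64_t when;
    enum cpu_state next;
};

/* Hypothetical trace: two short system bursts inside a long user run. */
static const struct transition sample_trace[] = {
    { 0, STATE_USER }, { 3, STATE_SYSTEM }, { 4, STATE_USER },
    { 13, STATE_SYSTEM }, { 14, STATE_USER },
};
static const size_t sample_len = sizeof(sample_trace) / sizeof(sample_trace[0]);

/* State-transition accounting: charge each state the exact interval
 * between entering it and the next transition (or the end of the trace). */
uint64_t time_by_transition(const struct transition *t, size_t n,
                            uint64_t end, enum cpu_state s)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t until = (i + 1 < n) ? t[i + 1].when : end;
        if (t[i].next == s)
            total += until - t[i].when;
    }
    return total;
}

/* Tick-sampling accounting: at every tick, charge one whole period to
 * whatever state the CPU happens to be in at that instant. */
uint64_t time_by_tick(const struct transition *t, size_t n, uint64_t end,
                      uint64_t period, enum cpu_state s)
{
    uint64_t total = 0;
    size_t i = 0;
    if (n == 0 || period == 0)
        return 0;
    for (uint64_t tick = period; tick <= end; tick += period) {
        while (i + 1 < n && t[i + 1].when <= tick)
            i++; /* advance to the state in effect at this tick */
        if (t[i].next == s)
            total += period;
    }
    return total;
}
```

On the sample trace with an end time of 20 and a tick period of 10, the transition-based function attributes 2 units to system time and 18 to user time, while the tick-based one attributes 0 and 20: neither system burst overlaps a tick, so sampling misses both.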
  34. 09 Feb, 2008 1 commit
  35. 06 Feb, 2008 1 commit
    • timerfd: new timerfd API · 4d672e7a
      Davide Libenzi committed
      This is the new timerfd API as it is implemented by the following patch:
      
      int timerfd_create(int clockid, int flags);
      int timerfd_settime(int ufd, int flags,
      		    const struct itimerspec *utmr,
      		    struct itimerspec *otmr);
      int timerfd_gettime(int ufd, struct itimerspec *otmr);
      
      The timerfd_create() API creates an un-programmed timerfd fd.  The "clockid"
      parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.
      
      The timerfd_settime() API applies new settings to the timerfd fd, optionally
      retrieving the previous expiration time (in case the "otmr" parameter is not
      NULL).
      
      The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit
      is set in the "flags" parameter.  Otherwise it's a relative time.
      
      The timerfd_gettime() API returns the next expiration time of the timer, or
      {0, 0} if the timerfd has not been set yet.
      
      Like the previous timerfd API implementation, read(2) and poll(2) are
      supported (with the same interface).  Here's a simple test program I used to
      exercise the new timerfd APIs:
      
      http://www.xmailserver.org/timerfd-test2.c
      
      [akpm@linux-foundation.org: coding-style cleanups]
      [akpm@linux-foundation.org: fix ia64 build]
      [akpm@linux-foundation.org: fix m68k build]
      [akpm@linux-foundation.org: fix mips build]
      [akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds]
      [heiko.carstens@de.ibm.com: fix s390]
      [akpm@linux-foundation.org: fix powerpc build]
      [akpm@linux-foundation.org: fix sparc64 more]
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  36. 20 Jul, 2007 1 commit