1. 11 Dec, 2009 1 commit
  2. 13 Oct, 2009 1 commit
    • net: Introduce recvmmsg socket syscall · a2e27255
      Arnaldo Carvalho de Melo committed
      recvmmsg means "receive multiple messages"; it reduces the number of
      syscalls and net stack entry/exit operations.
      
      Next patches will introduce mechanisms where protocols that want to
      optimize this operation will provide an unlocked_recvmsg operation.
      
      This takes into account comments made by:
      
      . Paul Moore: sock_recvmsg is called only for the first datagram,
        sock_recvmsg_nosec is used for the rest.
      
      . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
        works in the same fashion as the ppoll one.
      
        If the underlying protocol returns a datagram with MSG_OOB set, this
        will make recvmmsg return right away with as many datagrams (+ the OOB
        one) as it has received so far.
      
      . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
        datagrams and then recvmsg returns an error, recvmmsg will return
        the successfully received datagrams, store the error and return it
        in the next call.
      
      This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
      where we will be able to acquire the lock only at batch start and end, not at
      every underlying recvmsg call.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
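The batching described in commit a2e27255 can be sketched from userspace. This is only an illustration, not the kernel patch: it assumes the glibc recvmmsg() wrapper (declared under _GNU_SOURCE), and the helper name recv_batch_demo, the BATCH and BUFSZ constants, and the AF_UNIX datagram socketpair are choices made for the example.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>
#include <unistd.h>

#define BATCH 8   /* illustrative upper bound on datagrams per call */
#define BUFSZ 64  /* illustrative per-datagram buffer size */

/*
 * Hypothetical helper: queue `count` one-byte datagrams on a local
 * socketpair, then fetch them all with a single recvmmsg() call.
 * Returns the number of datagrams received, or -1 on error.
 */
int recv_batch_demo(unsigned int count)
{
    int sv[2];
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];
    char bufs[BATCH][BUFSZ];
    struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
    int received = -1;

    if (count == 0 || count > BATCH)
        return -1;
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    for (unsigned int i = 0; i < count; i++) {
        char byte = (char)('a' + i);
        if (send(sv[0], &byte, 1, 0) != 1)
            goto out;
    }

    /* One iovec per message header, each pointing at its own buffer. */
    memset(msgs, 0, sizeof(msgs));
    for (unsigned int i = 0; i < count; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = BUFSZ;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* One syscall fills up to `count` entries; msg_len of each entry
     * holds the size of the corresponding datagram. */
    received = recvmmsg(sv[1], msgs, count, 0, &timeout);
out:
    close(sv[0]);
    close(sv[1]);
    return received;
}
```

Calling recv_batch_demo(3) replaces three separate recvmsg() calls with one recvmmsg() call, which is the saving the commit message is after.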
  3. 21 Sep, 2009 1 commit
    • perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar committed
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has outgrown its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring and analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in all, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names (in an ABI-compatible fashion).
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks go to Stephane Eranian, who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility are not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 15 Aug, 2009 1 commit
    • ARM: 5677/1: ARM support for TIF_RESTORE_SIGMASK/pselect6/ppoll/epoll_pwait · 36984265
      Mikael Pettersson committed
      This patch adds support for TIF_RESTORE_SIGMASK to ARM's
      signal handling, which allows the pselect6, ppoll, and
      epoll_pwait syscalls to be hooked up on ARM.
      
      Tested here with eabi userspace and a test program with a
      deliberate race between a child's exit and the parent's
      sigprocmask/select sequence. Using sys_pselect6() instead
      of sigprocmask/select reliably prevents the race.
      
      Other archs' support for TIF_RESTORE_SIGMASK has evolved
      over time:
      
      In 2.6.16:
      - add TIF_RESTORE_SIGMASK which parallels TIF_SIGPENDING
      - test both when checking for pending signal [changed later]
      - reimplement sys_sigsuspend() to use current->saved_sigmask,
        TIF_RESTORE_SIGMASK [changed later], and -ERESTARTNOHAND;
        ditto for sys_rt_sigsuspend(), but drop private code and
        use common code via __ARCH_WANT_SYS_RT_SIGSUSPEND;
      - there are now no "extra" calls to do_signal() so its oldset
        parameter is always &current->blocked so need not be passed,
        also its return value is changed to void
      - change handle_signal() to return 0/-errno
      - change do_signal() to honor TIF_RESTORE_SIGMASK:
        + get oldset from current->saved_sigmask if TIF_RESTORE_SIGMASK
          is set
        + if handle_signal() was successful then clear TIF_RESTORE_SIGMASK
        + if no signal was delivered and TIF_RESTORE_SIGMASK is set then
          clear it and restore the sigmask
      - hook up sys_pselect6() and sys_ppoll()
      
      In 2.6.19:
      - hook up sys_epoll_pwait()
      
      In 2.6.26:
      - allow archs to override how TIF_RESTORE_SIGMASK is implemented;
        default set_restore_sigmask() sets both TIF_RESTORE_SIGMASK and
        TIF_SIGPENDING; archs need now just test TIF_SIGPENDING again
        when checking for pending signal work; some archs now implement
        TIF_RESTORE_SIGMASK as a secondary/non-atomic thread flag bit
      - call set_restore_sigmask() in sys_sigsuspend() instead of setting
        TIF_RESTORE_SIGMASK
      
      In 2.6.29-rc:
      - kill sys_pselect7() which no arch wanted
      
      So for 2.6.31-rc6/ARM this patch does the following:
      - Add TIF_RESTORE_SIGMASK. Use the generic set_restore_sigmask()
        which sets both TIF_SIGPENDING and TIF_RESTORE_SIGMASK, so
        TIF_RESTORE_SIGMASK need not claim one of the scarce low thread
        flags, and existing TIF_SIGPENDING and _TIF_WORK_MASK tests need
        not be extended for TIF_RESTORE_SIGMASK.
      - sys_sigsuspend() is reimplemented to use current->saved_sigmask
        and set_restore_sigmask(), making it identical to most other archs
      - The private code for sys_rt_sigsuspend() is removed, instead
        generic code supplies it via __ARCH_WANT_SYS_RT_SIGSUSPEND.
      - sys_sigsuspend() and sys_rt_sigsuspend() no longer need a pt_regs
        parameter, so their assembly code wrappers are removed.
      - handle_signal() is changed to return 0 on success or -errno.
      - The oldset parameter to do_signal() is now redundant and removed,
        and the return value is now also redundant and changed to void.
      - do_signal() is changed to honor TIF_RESTORE_SIGMASK:
        + get oldset from current->saved_sigmask if TIF_RESTORE_SIGMASK
          is set
        + if handle_signal() was successful then clear TIF_RESTORE_SIGMASK
        + if no signal was delivered and TIF_RESTORE_SIGMASK is set then
          clear it and restore the sigmask
      - Hook up sys_pselect6, sys_ppoll, and sys_epoll_pwait.
      Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
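The race mentioned in commit 36984265 (a child exiting between the parent's sigprocmask and select calls) can be avoided from userspace roughly as follows. This is a sketch under assumptions, not the author's test program: the helper name wait_for_child_exit, the 5-second timeout and the handler are invented for the example, and only standard POSIX calls are used.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigchld;

static void on_sigchld(int sig)
{
    (void)sig;
    got_sigchld = 1;
}

/*
 * Hypothetical helper: fork a child that exits at once and wait for its
 * SIGCHLD without a lost-wakeup window.
 * Returns 1 if the signal was seen, 0 on timeout, -1 on error.
 */
int wait_for_child_exit(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, orig, during_wait;
    struct timespec timeout = { .tv_sec = 5, .tv_nsec = 0 };
    int result = 0;
    pid_t pid;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    /* Block SIGCHLD so it can only be delivered inside pselect(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &orig) < 0)
        return -1;
    during_wait = orig;
    sigdelset(&during_wait, SIGCHLD);

    got_sigchld = 0;
    pid = fork();
    if (pid == 0)
        _exit(0);
    if (pid < 0) {
        result = -1;
    } else {
        /* pselect() installs during_wait and sleeps atomically, so a
         * SIGCHLD that is already pending interrupts it right away. */
        while (!got_sigchld && result == 0) {
            int rc = pselect(0, NULL, NULL, NULL, &timeout, &during_wait);
            if (rc == 0)
                break;                 /* timed out */
            if (rc < 0 && errno != EINTR)
                result = -1;
        }
        if (got_sigchld)
            result = 1;
        waitpid(pid, NULL, 0);
    }

    /* Restore the caller's signal mask and SIGCHLD disposition. */
    sigprocmask(SIG_SETMASK, &orig, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return result;
}
```

With a plain sigprocmask followed by select, the signal could arrive after the unblock and before the sleep, and the wakeup would be lost; the temporary mask passed to pselect closes that window.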
  5. 21 Jun, 2009 1 commit
  6. 20 Apr, 2009 1 commit
    • [ARM] 5456/1: add sys_preadv and sys_pwritev · eb8f3142
      Mikael Pettersson committed
      Kernel 2.6.30-rc1 added sys_preadv and sys_pwritev to most archs
      but not ARM, resulting in
      
      <stdin>:1421:2: warning: #warning syscall preadv not implemented
      <stdin>:1425:2: warning: #warning syscall pwritev not implemented
      
      This patch adds sys_preadv and sys_pwritev to ARM.
      
      These syscalls simply take five long-sized parameters, so they
      should have no calling-convention/ABI issues in the kernel.
      
      Tested on armv5tel eabi using a preadv/pwritev test program posted
      on linuxppc-dev earlier this month.
      
      It would be nice to get this into the kernel before 2.6.30 final,
      so that glibc's kernel version feature test for these syscalls
      doesn't have to special-case ARM.
      Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
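For readers unfamiliar with the two syscalls added in commit eb8f3142, here is a small userspace sketch using the glibc preadv() and pwritev() wrappers. It is not the test program the commit refers to; the helper name vectored_roundtrip, the temporary file path and the offset 4 are assumptions for illustration. The wrappers take a single off_t offset, while the raw syscalls split it into the low and high halves the commit message counts among the five parameters.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "hello" and "world" at file offset 4 with
 * one pwritev() call, then read them back into two separate buffers
 * with one preadv() call.
 * Returns 0 if the round trip matches, -1 otherwise.
 */
int vectored_roundtrip(void)
{
    char path[] = "/tmp/preadv-demo-XXXXXX";
    char a[] = "hello", b[] = "world";
    char x[5], y[5];
    struct iovec out[2] = { { a, 5 }, { b, 5 } };
    struct iovec in[2] = { { x, 5 }, { y, 5 } };
    int fd, ok;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    ok = pwritev(fd, out, 2, 4) == 10 &&
         preadv(fd, in, 2, 4) == 10 &&
         memcmp(x, "hello", 5) == 0 &&
         memcmp(y, "world", 5) == 0 &&
         /* Both calls use an explicit offset and leave the file
          * position where it was. */
         lseek(fd, 0, SEEK_CUR) == 0;

    close(fd);
    return ok ? 0 : -1;
}
```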
  7. 14 Jan, 2009 1 commit
  8. 13 Aug, 2008 1 commit
  9. 19 Apr, 2008 1 commit
  10. 28 Mar, 2008 1 commit
  11. 06 Feb, 2008 1 commit
    • timerfd: new timerfd API · 4d672e7a
      Davide Libenzi committed
      This is the new timerfd API as it is implemented by the following patch:
      
      int timerfd_create(int clockid, int flags);
      int timerfd_settime(int ufd, int flags,
      		    const struct itimerspec *utmr,
      		    struct itimerspec *otmr);
      int timerfd_gettime(int ufd, struct itimerspec *otmr);
      
      The timerfd_create() API creates an un-programmed timerfd fd.  The "clockid"
      parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.
      
      The timerfd_settime() API applies new settings to the timerfd fd, optionally
      retrieving the previous expiration time (in case the "otmr" parameter is not
      NULL).
      
      The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit
      is set in the "flags" parameter.  Otherwise it's a relative time.
      
      The timerfd_gettime() API returns the next expiration time of the timer, or
      {0, 0} if the timerfd has not been set yet.
      
      Like the previous timerfd API implementation, read(2) and poll(2) are
      supported (with the same interface).  Here's a simple test program I used to
      exercise the new timerfd APIs:
      
      http://www.xmailserver.org/timerfd-test2.c
      
      [akpm@linux-foundation.org: coding-style cleanups]
      [akpm@linux-foundation.org: fix ia64 build]
      [akpm@linux-foundation.org: fix m68k build]
      [akpm@linux-foundation.org: fix mips build]
      [akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds]
      [heiko.carstens@de.ibm.com: fix s390]
      [akpm@linux-foundation.org: fix powerpc build]
      [akpm@linux-foundation.org: fix sparc64 more]
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 13 Oct, 2007 1 commit
  13. 29 Jun, 2007 1 commit
    • Introduce fixed sys_sync_file_range2() syscall, implement on PowerPC and ARM · edd5cd4a
      David Woodhouse committed
      Not all the world is an i386.  Many architectures need 64-bit arguments to be
      aligned in suitable pairs of registers, and the original
      sys_sync_file_range(int, loff_t, loff_t, int) was therefore wasting an
      argument register for padding after the first integer.  Since we don't
      normally have more than 6 arguments for system calls, that left no room for
      the final argument on some architectures.
      
      Fix this by introducing sys_sync_file_range2(int, int, loff_t, loff_t) which
      all fits nicely.  In fact, ARM already had that, but called it
      sys_arm_sync_file_range.  Move it to fs/sync.c and rename it, then implement
      the needed compatibility routine.  And stop the missing syscall check from
      bitching about the absence of sys_sync_file_range() if we've implemented
      sys_sync_file_range2() instead.
      
      Tested on PPC32 and with 32-bit and 64-bit userspace on PPC64.
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 16 May, 2007 1 commit
  15. 16 Feb, 2007 1 commit
    • [ARM] 4137/1: Add kexec support · c587e4a6
      Richard Purdie committed
      Add kexec support to ARM.
      
      Improvements like commandline handling could be made but this patch gives
      basic functional support. It uses the next available syscall number, 347.
      
      Once the syscall number is known, userspace support will be
      finalised/submitted to kexec-tools, various patches already exist.
      
      Originally based on a patch by Maxim Syrchin but updated and forward
      ported by various people.
      Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  16. 18 Dec, 2006 1 commit
    • [ARM] Add more syscalls · 5a059f1a
      Russell King committed
      Add:
        sys_unshare
        sys_set_robust_list
        sys_get_robust_list
        sys_splice
        sys_arm_sync_file_range
        sys_tee
        sys_vmsplice
        sys_move_pages
        sys_getcpu
      
      Special note about sys_arm_sync_file_range(), which is implemented as:
      
      asmlinkage long sys_arm_sync_file_range(int fd, unsigned int flags,
                                              loff_t offset, loff_t nbytes)
      {
              return sys_sync_file_range(fd, offset, nbytes, flags);
      }
      
      We can't export sys_sync_file_range() directly on ARM because the
      argument list someone picked does not fit in the available registers.
      Would be nice if... there was an arch maintainer review mechanism for
      new syscalls before they hit the kernel.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  17. 10 Dec, 2006 1 commit
  18. 17 Feb, 2006 1 commit
  19. 09 Feb, 2006 1 commit
  20. 19 Jan, 2006 1 commit
    • [ARM] safer handling of syscall table padding · fa1b4f91
      Al Viro committed
      ARM entry-common.S needs to know syscall table size; in itself that would
      not be a problem, but there's an additional constraint - some of the
      instructions using it want a constant that would be a multiple of 4.
      So we have to pad syscall table with sys_ni_syscall and that's where
      the trouble begins.  .rept pseudo-op wants a constant expression for
      number of repetitions and subtraction of two labels (before and after
      syscall table) doesn't always get simplified to constant early enough
      for .rept.  If labels end up in different frags, we lose.  And while
      the frag size is large enough (slightly below 4Kb), the syscall table
      is about 1/3 of that.  We used to get away with that, but the recent
      changes had been enough to trigger the breakage.
      
      Proper fix is simple: have a macro (CALL(x)) to populate the table
      instead of using explicit .long x and the first time we include calls.S
      have it defined to .equ NR_syscalls,NR_syscalls+1.  Then we can find
      the proper amount of padding on the first inclusion simply by looking
      at NR_syscalls at that time.  And that will be constant, no matter what.
      
      Moreover, the same trick kills the need of having an estimate of padded
      NR_syscalls - it will be calculated for free at the same time.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
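The double-inclusion trick from commit fa1b4f91 lives in ARM assembly, but the idea carries over to the C preprocessor. The sketch below is an analogue under assumptions, not the kernel code: SYSCALL_LIST stands in for calls.S, the five handler names are invented, and the padding slots are handled in a dispatch function rather than by emitting sys_ni_syscall entries.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*syscall_fn)(void);

/* Stand-in handlers; the names and return values are made up. */
static long sys_a(void) { return 1; }
static long sys_b(void) { return 2; }
static long sys_c(void) { return 3; }
static long sys_d(void) { return 4; }
static long sys_e(void) { return 5; }
static long sys_ni_syscall(void) { return -ENOSYS; }

/* Plays the role of calls.S: one CALL(x) per table entry. */
#define SYSCALL_LIST(CALL) \
    CALL(sys_a) CALL(sys_b) CALL(sys_c) CALL(sys_d) CALL(sys_e)

/* First expansion: every entry only bumps the counter. */
#define COUNT_ENTRY(fn) + 1
enum { NR_SYSCALLS = 0 SYSCALL_LIST(COUNT_ENTRY) };

/* Round up to a multiple of 4; this is a compile-time constant. */
enum { NR_PADDED = (NR_SYSCALLS + 3) & ~3 };

/* Second expansion: every entry emits a table slot. */
#define TABLE_ENTRY(fn) fn,
static const syscall_fn sys_call_table[NR_PADDED] = {
    SYSCALL_LIST(TABLE_ENTRY)
};

/* Padding slots are left NULL here and routed to sys_ni_syscall. */
long dispatch(unsigned int nr)
{
    if (nr >= (unsigned int)NR_PADDED || sys_call_table[nr] == NULL)
        return sys_ni_syscall();
    return sys_call_table[nr]();
}
```

The count comes from expanding the list itself, so it never depends on subtracting two addresses, which is the property the commit message relies on for .rept.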
  21. 15 Jan, 2006 2 commits
  22. 17 Dec, 2005 1 commit
  23. 14 Sep, 2005 1 commit
  24. 10 Sep, 2005 1 commit
  25. 09 Sep, 2005 1 commit
  26. 01 Sep, 2005 1 commit
    • [ARM] 2865/2: fix fadvise64_64 syscall argument passing · 68d9102f
      Nicolas Pitre committed
      Patch from Nicolas Pitre
      
      The prototype for sys_fadvise64_64() is:
          long sys_fadvise64_64(int fd, loff_t offset, loff_t len, int advice)
      The argument list is therefore as follows on legacy ABI:
      	fd: type int (r0)
      	offset: type long long (r1-r2)
      	len: type long long (r3-sp[0])
      	advice: type int (sp[4])
      With EABI this becomes:
      	fd: type int (r0)
      	offset: type long long (r2-r3)
      	len: type long long (sp[0]-sp[4])
      	advice: type int (sp[8])
      Not only do we have ABI differences here, but the EABI version requires
      one additional word on the syscall stack.
      To avoid the ABI mismatch and the extra stack space required with EABI
      this syscall is now defined with a different argument ordering
      on ARM as follows:
          long sys_arm_fadvise64_64(int fd, int advice, loff_t offset, loff_t len)
      This gives us the following ABI independent argument distribution:
      	fd: type int (r0)
      	advice: type int (r1)
      	offset: type long long (r2-r3)
      	len: type long long (sp[0]-sp[4])
      Now, since the syscall entry code takes care of 5 registers only by
      default including the store of r4 to the stack, we need a wrapper to
      store r5 to the stack as well.  Because that wrapper was missing and was
      always required this means that sys_fadvise64_64 never worked on ARM and
      therefore we can safely reuse its syscall number for our new
      sys_arm_fadvise64_64 interface.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
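The argument layouts listed in commit 68d9102f follow from one rule: under EABI a 64-bit argument must start at an even 32-bit slot, while the legacy ABI packs arguments back to back. The model below is a simplified sketch of that rule only; the function name arg_slots and its interface are assumptions, and slots 0 to 3 stand for r0 to r3 with later slots standing for stack words.

```c
#include <stddef.h>

/*
 * Hypothetical model: count the 32-bit argument slots consumed by a
 * list of arguments whose sizes are 4 or 8 bytes. When align64 is
 * non-zero, an 8-byte argument is pushed to the next even slot (the
 * EABI behaviour described in the commit message); otherwise
 * arguments are packed without padding (the legacy ABI).
 */
unsigned int arg_slots(const unsigned int *sizes, size_t n, int align64)
{
    unsigned int slot = 0;

    for (size_t i = 0; i < n; i++) {
        if (sizes[i] == 8) {
            if (align64 && (slot & 1))
                slot++;        /* padding slot, wasted */
            slot += 2;
        } else {
            slot += 1;
        }
    }
    return slot;
}
```

For (int, loff_t, loff_t, int) the model gives 6 slots on the legacy ABI and 7 with alignment, which matches the extra stack word noted above; for the reordered (int, int, loff_t, loff_t) it gives 6 in both cases, which is why the new ordering is ABI independent.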
  27. 15 Aug, 2005 1 commit
  28. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!