1. 26 10月, 2009 1 次提交
  2. 01 10月, 2009 1 次提交
  3. 21 9月, 2009 1 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
  4. 09 8月, 2009 1 次提交
  5. 01 5月, 2009 1 次提交
  6. 03 4月, 2009 1 次提交
    • G
      preadv/pwritev: Add preadv and pwritev system calls. · f3554f4b
      Gerd Hoffmann 提交于
      This patch adds preadv and pwritev system calls.  These syscalls are a
      pretty straightforward combination of pread and readv (same for write).
      They are quite useful for doing vectored I/O in threaded applications.
      Using lseek+readv instead opens race windows you'll have to plug with
      locking.
      
      Other systems have such system calls too, for example NetBSD, check
      here: http://www.daemon-systems.org/man/preadv.2.html
      
      The application-visible interface provided by glibc should look like
      this to be compatible to the existing implementations in the *BSD family:
      
        ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset);
        ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset);
      
      This prototype has one problem though: On 32bit archs is the (64bit)
      offset argument unaligned, which the syscall ABI of several archs doesn't
      allow to do.  At least s390 needs a wrapper in glibc to handle this.  As
      we'll need a wrappers in glibc anyway I've decided to push problem to
      glibc entriely and use a syscall prototype which works without
      arch-specific wrappers inside the kernel: The offset argument is
      explicitly splitted into two 32bit values.
      
      The patch sports the actual system call implementation and the windup in
      the x86 system call tables.  Other archs follow as separate patches.
      Signed-off-by: NGerd Hoffmann <kraxel@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <linux-api@vger.kernel.org>
      Cc: <linux-arch@vger.kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f3554f4b
  7. 28 3月, 2009 1 次提交
    • C
      generic compat_sys_ustat · 2b1c6bd7
      Christoph Hellwig 提交于
      Due to a different size of ino_t ustat needs a compat handler, but
      currently only x86 and mips provide one.  Add a generic compat_sys_ustat
      and switch all architectures over to it.  Instead of doing various
      user copy hacks compat_sys_ustat just reimplements sys_ustat as
      it's trivial.  This was suggested by Arnd Bergmann.
      
      Found by Eric Sandeen when running xfstests/017 on ppc64, which causes
      stack smashing warnings on RHEL/Fedora due to the too large amount of
      data writen by the syscall.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      2b1c6bd7
  8. 07 2月, 2009 1 次提交
    • R
      x86-64: fix int $0x80 -ENOSYS return · c09249f8
      Roland McGrath 提交于
      One of my past fixes to this code introduced a different new bug.
      When using 32-bit "int $0x80" entry for a bogus syscall number,
      the return value is not correctly set to -ENOSYS.  This only happens
      when neither syscall-audit nor syscall tracing is enabled (i.e., never
      seen if auditd ever started).  Test program:
      
      	/* gcc -o int80-badsys -m32 -g int80-badsys.c
      	   Run on x86-64 kernel.
      	   Note to reproduce the bug you need auditd never to have started.  */
      
      	#include <errno.h>
      	#include <stdio.h>
      
      	int
      	main (void)
      	{
      	  long res;
      	  asm ("int $0x80" : "=a" (res) : "0" (99999));
      	  printf ("bad syscall returns %ld\n", res);
      	  return res != -ENOSYS;
      	}
      
      The fix makes the int $0x80 path match the sysenter and syscall paths.
      Reported-by: NDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      c09249f8
  9. 18 1月, 2009 1 次提交
  10. 08 12月, 2008 1 次提交
    • I
      performance counters: x86 support · 241771ef
      Ingo Molnar 提交于
      Implement performance counters for x86 Intel CPUs.
      
      It's simplified right now: the PERFMON CPU feature is assumed,
      which is available in Core2 and later Intel CPUs.
      
      The design is flexible to be extended to more CPU types as well.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      241771ef
  11. 17 10月, 2008 1 次提交
    • C
      compat: generic compat get/settimeofday · b418da16
      Christoph Hellwig 提交于
      Nothing arch specific in get/settimeofday.  The details of the timeval
      conversion varied a little from arch to arch, but all with the same
      results.
      
      Also add an extern declaration for sys_tz to linux/time.h because externs
      in .c files are fowned upon.  I'll kill the externs in various other files
      in a sparate patch.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: NKyle McMartin <kyle@mcmartin.ca>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b418da16
  12. 13 10月, 2008 1 次提交
  13. 26 7月, 2008 1 次提交
  14. 25 7月, 2008 7 次提交
    • U
      flag parameters add-on: remove epoll_create size param · 9fe5ad9c
      Ulrich Drepper 提交于
      Remove the size parameter from the new epoll_create syscall and renames the
      syscall itself.  The updated test program follows.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <fcntl.h>
      #include <stdio.h>
      #include <time.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      
      #ifndef __NR_epoll_create2
      # ifdef __x86_64__
      #  define __NR_epoll_create2 291
      # elif defined __i386__
      #  define __NR_epoll_create2 329
      # else
      #  error "need __NR_epoll_create2"
      # endif
      #endif
      
      #define EPOLL_CLOEXEC O_CLOEXEC
      
      int
      main (void)
      {
        int fd = syscall (__NR_epoll_create2, 0);
        if (fd == -1)
          {
            puts ("epoll_create2(0) failed");
            return 1;
          }
        int coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if (coe & FD_CLOEXEC)
          {
            puts ("epoll_create2(0) set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        fd = syscall (__NR_epoll_create2, EPOLL_CLOEXEC);
        if (fd == -1)
          {
            puts ("epoll_create2(EPOLL_CLOEXEC) failed");
            return 1;
          }
        coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if ((coe & FD_CLOEXEC) == 0)
          {
            puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        puts ("OK");
      
        return 0;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Acked-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9fe5ad9c
    • U
      flag parameters: inotify_init · 4006553b
      Ulrich Drepper 提交于
      This patch introduces the new syscall inotify_init1 (note: the 1 stands for
      the one parameter the syscall takes, as opposed to no parameter before).  The
      values accepted for this parameter are function-specific and defined in the
      inotify.h header.  Here the values must match the O_* flags, though.  In this
      patch CLOEXEC support is introduced.
      
      The following test must be adjusted for architectures other than x86 and
      x86-64 and in case the syscall numbers changed.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      
      #ifndef __NR_inotify_init1
      # ifdef __x86_64__
      #  define __NR_inotify_init1 294
      # elif defined __i386__
      #  define __NR_inotify_init1 332
      # else
      #  error "need __NR_inotify_init1"
      # endif
      #endif
      
      #define IN_CLOEXEC O_CLOEXEC
      
      int
      main (void)
      {
        int fd;
        fd = syscall (__NR_inotify_init1, 0);
        if (fd == -1)
          {
            puts ("inotify_init1(0) failed");
            return 1;
          }
        int coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if (coe & FD_CLOEXEC)
          {
            puts ("inotify_init1(0) set close-on-exit");
            return 1;
          }
        close (fd);
      
        fd = syscall (__NR_inotify_init1, IN_CLOEXEC);
        if (fd == -1)
          {
            puts ("inotify_init1(IN_CLOEXEC) failed");
            return 1;
          }
        coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if ((coe & FD_CLOEXEC) == 0)
          {
            puts ("inotify_init1(O_CLOEXEC) does not set close-on-exit");
            return 1;
          }
        close (fd);
      
        puts ("OK");
      
        return 0;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      [akpm@linux-foundation.org: add sys_ni stub]
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Acked-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4006553b
    • U
      flag parameters: pipe · ed8cae8b
      Ulrich Drepper 提交于
      This patch introduces the new syscall pipe2 which is like pipe but it also
      takes an additional parameter which takes a flag value.  This patch implements
      the handling of O_CLOEXEC for the flag.  I did not add support for the new
      syscall for the architectures which have a special sys_pipe implementation.  I
      think the maintainers of those archs have the chance to go with the unified
      implementation but that's up to them.
      
      The implementation introduces do_pipe_flags.  I did that instead of changing
      all callers of do_pipe because some of the callers are written in assembler.
      I would probably screw up changing the assembly code.  To avoid breaking code
      do_pipe is now a small wrapper around do_pipe_flags.  Once all callers are
      changed over to do_pipe_flags the old do_pipe function can be removed.
      
      The following test must be adjusted for architectures other than x86 and
      x86-64 and in case the syscall numbers changed.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      
      #ifndef __NR_pipe2
      # ifdef __x86_64__
      #  define __NR_pipe2 293
      # elif defined __i386__
      #  define __NR_pipe2 331
      # else
      #  error "need __NR_pipe2"
      # endif
      #endif
      
      int
      main (void)
      {
        int fd[2];
        if (syscall (__NR_pipe2, fd, 0) != 0)
          {
            puts ("pipe2(0) failed");
            return 1;
          }
        for (int i = 0; i < 2; ++i)
          {
            int coe = fcntl (fd[i], F_GETFD);
            if (coe == -1)
              {
                puts ("fcntl failed");
                return 1;
              }
            if (coe & FD_CLOEXEC)
              {
                printf ("pipe2(0) set close-on-exit for fd[%d]\n", i);
                return 1;
              }
          }
        close (fd[0]);
        close (fd[1]);
      
        if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0)
          {
            puts ("pipe2(O_CLOEXEC) failed");
            return 1;
          }
        for (int i = 0; i < 2; ++i)
          {
            int coe = fcntl (fd[i], F_GETFD);
            if (coe == -1)
              {
                puts ("fcntl failed");
                return 1;
              }
            if ((coe & FD_CLOEXEC) == 0)
              {
                printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i);
                return 1;
              }
          }
        close (fd[0]);
        close (fd[1]);
      
        puts ("OK");
      
        return 0;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Acked-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed8cae8b
    • U
      flag parameters: dup2 · 336dd1f7
      Ulrich Drepper 提交于
      This patch adds the new dup3 syscall.  It extends the old dup2 syscall by one
      parameter which is meant to hold a flag value.  Support for the O_CLOEXEC flag
      is added in this patch.
      
      The following test must be adjusted for architectures other than x86 and
      x86-64 and in case the syscall numbers changed.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <fcntl.h>
      #include <stdio.h>
      #include <time.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      
      #ifndef __NR_dup3
      # ifdef __x86_64__
      #  define __NR_dup3 292
      # elif defined __i386__
      #  define __NR_dup3 330
      # else
      #  error "need __NR_dup3"
      # endif
      #endif
      
      int
      main (void)
      {
        int fd = syscall (__NR_dup3, 1, 4, 0);
        if (fd == -1)
          {
            puts ("dup3(0) failed");
            return 1;
          }
        int coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if (coe & FD_CLOEXEC)
          {
            puts ("dup3(0) set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC);
        if (fd == -1)
          {
            puts ("dup3(O_CLOEXEC) failed");
            return 1;
          }
        coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if ((coe & FD_CLOEXEC) == 0)
          {
            puts ("dup3(O_CLOEXEC) set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        puts ("OK");
      
        return 0;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Acked-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      336dd1f7
    • U
      flag parameters: epoll_create · a0998b50
      Ulrich Drepper 提交于
      This patch adds the new epoll_create2 syscall.  It extends the old epoll_create
      syscall by one parameter which is meant to hold a flag value.  In this
      patch the only flag support is EPOLL_CLOEXEC which causes the close-on-exec
      flag for the returned file descriptor to be set.
      
      A new name EPOLL_CLOEXEC is introduced which in this implementation must
      have the same value as O_CLOEXEC.
      
      The following test must be adjusted for architectures other than x86 and
      x86-64 and in case the syscall numbers changed.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <fcntl.h>
      #include <stdio.h>
      #include <time.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      
      #ifndef __NR_epoll_create2
      # ifdef __x86_64__
      #  define __NR_epoll_create2 291
      # elif defined __i386__
      #  define __NR_epoll_create2 329
      # else
      #  error "need __NR_epoll_create2"
      # endif
      #endif
      
      #define EPOLL_CLOEXEC O_CLOEXEC
      
      int
      main (void)
      {
        int fd = syscall (__NR_epoll_create2, 1, 0);
        if (fd == -1)
          {
            puts ("epoll_create2(0) failed");
            return 1;
          }
        int coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if (coe & FD_CLOEXEC)
          {
            puts ("epoll_create2(0) set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC);
        if (fd == -1)
          {
            puts ("epoll_create2(EPOLL_CLOEXEC) failed");
            return 1;
          }
        coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if ((coe & FD_CLOEXEC) == 0)
          {
            puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        puts ("OK");
      
        return 0;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Acked-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a0998b50
    • U
      flag parameters: eventfd · b087498e
      Ulrich Drepper 提交于
      This patch adds the new eventfd2 syscall.  It extends the old eventfd
      syscall by one parameter which is meant to hold a flag value.  In this
      patch the only flag support is EFD_CLOEXEC which causes the close-on-exec
      flag for the returned file descriptor to be set.
      
      A new name EFD_CLOEXEC is introduced which in this implementation must
      have the same value as O_CLOEXEC.
      
      The following test must be adjusted for architectures other than x86 and
      x86-64 and in case the syscall numbers changed.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      
      #ifndef __NR_eventfd2
      # ifdef __x86_64__
      #  define __NR_eventfd2 290
      # elif defined __i386__
      #  define __NR_eventfd2 328
      # else
      #  error "need __NR_eventfd2"
      # endif
      #endif
      
      #define EFD_CLOEXEC O_CLOEXEC
      
      int
      main (void)
      {
        int fd = syscall (__NR_eventfd2, 1, 0);
        if (fd == -1)
          {
            puts ("eventfd2(0) failed");
            return 1;
          }
        int coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if (coe & FD_CLOEXEC)
          {
            puts ("eventfd2(0) sets close-on-exec flag");
            return 1;
          }
        close (fd);
      
        fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC);
        if (fd == -1)
          {
            puts ("eventfd2(EFD_CLOEXEC) failed");
            return 1;
          }
        coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if ((coe & FD_CLOEXEC) == 0)
          {
            puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        puts ("OK");
      
        return 0;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      [akpm@linux-foundation.org: add sys_ni stub]
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Acked-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b087498e
    • U
      flag parameters: signalfd · 9deb27ba
      Ulrich Drepper 提交于
      This patch adds the new signalfd4 syscall.  It extends the old signalfd
      syscall by one parameter which is meant to hold a flag value.  In this
      patch the only flag support is SFD_CLOEXEC which causes the close-on-exec
      flag for the returned file descriptor to be set.
      
      A new name SFD_CLOEXEC is introduced which in this implementation must
      have the same value as O_CLOEXEC.
      
      The following test must be adjusted for architectures other than x86 and
      x86-64 and in case the syscall numbers changed.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <fcntl.h>
      #include <signal.h>
      #include <stdio.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      
      #ifndef __NR_signalfd4
      # ifdef __x86_64__
      #  define __NR_signalfd4 289
      # elif defined __i386__
      #  define __NR_signalfd4 327
      # else
      #  error "need __NR_signalfd4"
      # endif
      #endif
      
      #define SFD_CLOEXEC O_CLOEXEC
      
      int
      main (void)
      {
        sigset_t ss;
        sigemptyset (&ss);
        sigaddset (&ss, SIGUSR1);
        int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0);
        if (fd == -1)
          {
            puts ("signalfd4(0) failed");
            return 1;
          }
        int coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if (coe & FD_CLOEXEC)
          {
            puts ("signalfd4(0) set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC);
        if (fd == -1)
          {
            puts ("signalfd4(SFD_CLOEXEC) failed");
            return 1;
          }
        coe = fcntl (fd, F_GETFD);
        if (coe == -1)
          {
            puts ("fcntl failed");
            return 1;
          }
        if ((coe & FD_CLOEXEC) == 0)
          {
            puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag");
            return 1;
          }
        close (fd);
      
        puts ("OK");
      
        return 0;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      [akpm@linux-foundation.org: add sys_ni stub]
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Acked-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9deb27ba
  15. 24 7月, 2008 1 次提交
    • R
      x86_64 ia32 syscall audit fast-path · 5cbf1565
      Roland McGrath 提交于
      This adds fast paths for 32-bit syscall entry and exit when
      TIF_SYSCALL_AUDIT is set, but no other kind of syscall tracing.
      These paths does not need to save and restore all registers as
      the general case of tracing does.  Avoiding the iret return path
      when syscall audit is enabled helps performance a lot.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      5cbf1565
  16. 17 7月, 2008 1 次提交
    • R
      x86 ptrace: unify syscall tracing · d4d67150
      Roland McGrath 提交于
      This unifies and cleans up the syscall tracing code on i386 and x86_64.
      
      Using a single function for entry and exit tracing on 32-bit made the
      do_syscall_trace() into some terrible spaghetti.  The logic is clear and
      simple using separate syscall_trace_enter() and syscall_trace_leave()
      functions as on 64-bit.
      
      The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
      on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
      tracing either 32-bit or 64-bit tasks.  It behaves just like 32-bit.
      
      Changing syscall_trace_enter() to return the syscall number shortens
      all the assembly paths, while adding the SYSEMU feature in a simple way.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      d4d67150
  17. 16 7月, 2008 1 次提交
  18. 09 7月, 2008 1 次提交
  19. 08 7月, 2008 3 次提交
  20. 19 6月, 2008 1 次提交
  21. 26 4月, 2008 1 次提交
  22. 17 4月, 2008 2 次提交
  23. 06 2月, 2008 1 次提交
  24. 30 1月, 2008 2 次提交
  25. 10 11月, 2007 1 次提交
    • C
      x86 - 32-bit ptrace emulation mishandles 6th arg · ecd744ee
      Chuck Ebbert 提交于
      [ jdike - Pushing Chuck's patch - see
      http://lkml.org/lkml/2005/9/16/261 for some history and a test
      program.  UML is also broken without this patch - its processes get
      SIGBUS from the corrupt 6th argument to mmap being interpretted as a
      file offset ]
      
      When the 32-bit vDSO is used to make a system call, the %ebp register for
      the 6th syscall arg has to be loaded from the user stack (where it's pushed
      by the vDSO user code).  The native i386 kernel always does this before
      stopping for syscall tracing, so %ebp can be seen and modified via ptrace
      to access the 6th syscall argument.  The x86-64 kernel fails to do this,
      presenting the stack address to ptrace instead.  This makes the %rbp value
      seen by 64-bit ptrace of a 32-bit process, and the %ebp value seen by a
      32-bit caller of ptrace, both differ from the native i386 behavior.
      
      This patch fixes the problem by putting the word loaded from the user stack
      into %rbp before calling syscall_trace_enter, and reloading the 6th syscall
      argument from there afterwards (so ptrace can change it).  This makes the
      behavior match that of i386 kernels.
      Original-Patch-By: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NChuck Ebbert <76306.1226@compuserve.com>
      Signed-off-by: NJeff Dike <jdike@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      ecd744ee
  26. 11 10月, 2007 1 次提交
  27. 22 9月, 2007 1 次提交
  28. 22 7月, 2007 1 次提交
  29. 18 7月, 2007 1 次提交
    • A
      sys_fallocate() implementation on i386, x86_64 and powerpc · 97ac7350
      Amit Arora 提交于
      fallocate() is a new system call being proposed here which will allow
      applications to preallocate space to any file(s) in a file system.
      Each file system implementation that wants to use this feature will need
      to support an inode operation called ->fallocate().
      Applications can use this feature to avoid fragmentation to certain
      level and thus get faster access speed. With preallocation, applications
      also get a guarantee of space for particular file(s) - even if later the
      the system becomes full.
      
      Currently, glibc provides an interface called posix_fallocate() which
      can be used for similar cause. Though this has the advantage of working
      on all file systems, but it is quite slow (since it writes zeroes to
      each block that has to be preallocated). Without a doubt, file systems
      can do this more efficiently within the kernel, by implementing
      the proposed fallocate() system call. It is expected that
      posix_fallocate() will be modified to call this new system call first
      and incase the kernel/filesystem does not implement it, it should fall
      back to the current implementation of writing zeroes to the new blocks.
      ToDos:
      1. Implementation on other architectures (other than i386, x86_64,
         and ppc). Patches for s390(x) and ia64 are already available from
         previous posts, but it was decided that they should be added later
         once fallocate is in the mainline. Hence not including those patches
         in this take.
      2. Changes to glibc,
         a) to support fallocate() system call
         b) to make posix_fallocate() and posix_fallocate64() call fallocate()
      Signed-off-by: NAmit Arora <aarora@in.ibm.com>
      97ac7350
  30. 17 7月, 2007 1 次提交
    • V
      diskquota: 32bit quota tools on 64bit architectures · b716395e
      Vasily Tarasov 提交于
      OpenVZ Linux kernel team has discovered the problem with 32bit quota tools
      working on 64bit architectures.  In 2.6.10 kernel sys32_quotactl() function
      was replaced by sys_quotactl() with the comment "sys_quotactl seems to be
      32/64bit clean, enable it for 32bit" However this isn't right.  Look at
      if_dqblk structure:
      
      struct if_dqblk {
              __u64 dqb_bhardlimit;
              __u64 dqb_bsoftlimit;
              __u64 dqb_curspace;
              __u64 dqb_ihardlimit;
              __u64 dqb_isoftlimit;
              __u64 dqb_curinodes;
              __u64 dqb_btime;
              __u64 dqb_itime;
              __u32 dqb_valid;
      };
      
      For 32 bit quota tools sizeof(if_dqblk) == 0x44.
      But for 64 bit kernel its size is 0x48, 'cause of alignment!
      Thus we got a problem. Attached patch reintroduce sys32_quotactl() function,
      that handles this and related situations.
      
      [michal.k.k.piotrowski@gmail.com: build fix]
      [akpm@linux-foundation.org: Make it link with CONFIG_QUOTA=n]
      Signed-off-by: NVasily Tarasov <vtaras@openvz.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NMichal Piotrowski <michal.k.k.piotrowski@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b716395e