1. 30 11月, 2016 1 次提交
  2. 27 4月, 2016 1 次提交
  3. 21 1月, 2016 1 次提交
  4. 16 12月, 2015 1 次提交
    • M
      Partial revert of "powerpc: Individual System V IPC system calls" · 2475c362
      Michael Ellerman 提交于
      This partially reverts commit a3423615.
      
      While reviewing the glibc patch to exploit the individual IPC calls,
      Arnd & Andreas noticed that we were still requiring userspace to pass
      IPC_64 in order to get the new style IPC API.
      
      With a bit of cleanup in the kernel we can drop that requirement, and
      instead only provide the new style API, which will simplify things for
      userspace.
      
      Rather than try and sneak that patch into 4.4, instead we will drop the
      individual IPC calls for powerpc, and merge them again in 4.5 once the
      cleanup patch has gone in.
      
      Because we've already added sys_mlock2() as syscall #378, we don't do a
      full revert of the IPC calls. Instead we drop the __NR #defines, and
      send those now undefined syscall numbers to sys_ni_syscall(). This
      leaves a gap in the syscall numbers, but we'll reuse them when we merge
      the individual IPC calls.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      2475c362
  5. 16 11月, 2015 1 次提交
  6. 15 10月, 2015 1 次提交
  7. 21 9月, 2015 1 次提交
  8. 09 9月, 2015 1 次提交
  9. 28 3月, 2015 1 次提交
    • M
      powerpc: Add a proper syscall for switching endianness · 529d235a
      Michael Ellerman 提交于
      We currently have a "special" syscall for switching endianness. This is
      syscall number 0x1ebe, which is handled explicitly in the 64-bit syscall
      exception entry.
      
      That has a few problems, firstly the syscall number is outside of the
      usual range, which confuses various tools. For example strace doesn't
      recognise the syscall at all.
      
      Secondly it's handled explicitly as a special case in the syscall
      exception entry, which is complicated enough without it.
      
      As a first step toward removing the special syscall, we need to add a
      regular syscall that implements the same functionality.
      
      The logic is simple, it simply toggles the MSR_LE bit in the userspace
      MSR. This is the same as the special syscall, with the caveat that the
      special syscall clobbers fewer registers.
      
      This version clobbers r9-r12, XER, CTR, and CR0-1,5-7.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      529d235a
  10. 29 12月, 2014 1 次提交
    • P
      powerpc: Wire up sys_execveat() syscall · 1e5d0fdb
      Pranith Kumar 提交于
      Wire up sys_execveat(). This passes the selftests for the system call.
      
      Check success of execveat(3, '../execveat', 0)... [OK]
      Check success of execveat(5, 'execveat', 0)... [OK]
      Check success of execveat(6, 'execveat', 0)... [OK]
      Check success of execveat(-100, '/home/pranith/linux/...ftests/exec/execveat', 0)... [OK]
      Check success of execveat(99, '/home/pranith/linux/...ftests/exec/execveat', 0)... [OK]
      Check success of execveat(8, '', 4096)... [OK]
      Check success of execveat(17, '', 4096)... [OK]
      Check success of execveat(9, '', 4096)... [OK]
      Check success of execveat(14, '', 4096)... [OK]
      Check success of execveat(14, '', 4096)... [OK]
      Check success of execveat(15, '', 4096)... [OK]
      Check failure of execveat(8, '', 0) with ENOENT... [OK]
      Check failure of execveat(8, '(null)', 4096) with EFAULT... [OK]
      Check success of execveat(5, 'execveat.symlink', 0)... [OK]
      Check success of execveat(6, 'execveat.symlink', 0)... [OK]
      Check success of execveat(-100, '/home/pranith/linux/...xec/execveat.symlink', 0)... [OK]
      Check success of execveat(10, '', 4096)... [OK]
      Check success of execveat(10, '', 4352)... [OK]
      Check failure of execveat(5, 'execveat.symlink', 256) with ELOOP... [OK]
      Check failure of execveat(6, 'execveat.symlink', 256) with ELOOP... [OK]
      Check failure of execveat(-100, '/home/pranith/linux/tools/testing/selftests/exec/execveat.symlink', 256) with ELOOP... [OK]
      Check success of execveat(3, '../script', 0)... [OK]
      Check success of execveat(5, 'script', 0)... [OK]
      Check success of execveat(6, 'script', 0)... [OK]
      Check success of execveat(-100, '/home/pranith/linux/...elftests/exec/script', 0)... [OK]
      Check success of execveat(13, '', 4096)... [OK]
      Check success of execveat(13, '', 4352)... [OK]
      Check failure of execveat(18, '', 4096) with ENOENT... [OK]
      Check failure of execveat(7, 'script', 0) with ENOENT... [OK]
      Check success of execveat(16, '', 4096)... [OK]
      Check success of execveat(16, '', 4096)... [OK]
      Check success of execveat(4, '../script', 0)... [OK]
      Check success of execveat(4, 'script', 0)... [OK]
      Check success of execveat(4, '../script', 0)... [OK]
      Check failure of execveat(4, 'script', 0) with ENOENT... [OK]
      Check failure of execveat(5, 'execveat', 65535) with EINVAL... [OK]
      Check failure of execveat(5, 'no-such-file', 0) with ENOENT... [OK]
      Check failure of execveat(6, 'no-such-file', 0) with ENOENT... [OK]
      Check failure of execveat(-100, 'no-such-file', 0) with ENOENT... [OK]
      Check failure of execveat(5, '', 4096) with EACCES... [OK]
      Check failure of execveat(5, 'Makefile', 0) with EACCES... [OK]
      Check failure of execveat(11, '', 4096) with EACCES... [OK]
      Check failure of execveat(12, '', 4096) with EACCES... [OK]
      Check failure of execveat(99, '', 4096) with EBADF... [OK]
      Check failure of execveat(99, 'execveat', 0) with EBADF... [OK]
      Check failure of execveat(8, 'execveat', 0) with ENOTDIR... [OK]
      Invoke copy of 'execveat' via filename of length 4093:
      Check success of execveat(19, '', 4096)... [OK]
      Check success of execveat(5, 'xxxxxxxxxxxxxxxxxxxx...yyyyyyyyyyyyyyyyyyyy', 0)... [OK]
      Invoke copy of 'script' via filename of length 4093:
      Check success of execveat(20, '', 4096)... [OK]
      /bin/sh: 0: Can't open /dev/fd/5/xxxxxxx(... a long line of x's and y's, 0)... [OK]
      Check success of execveat(5, 'xxxxxxxxxxxxxxxxxxxx...yyyyyyyyyyyyyyyyyyyy', 0)... [OK]
      
      Tested on a 32-bit powerpc system.
      Signed-off-by: NPranith Kumar <bobby.prani@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1e5d0fdb
  11. 22 10月, 2014 1 次提交
    • P
      powerpc: Wire up sys_bpf() syscall · fcbb539f
      Pranith Kumar 提交于
      This patch wires up the new syscall sys_bpf() on powerpc.
      
      Passes the tests in samples/bpf:
      
          #0 add+sub+mul OK
          #1 unreachable OK
          #2 unreachable2 OK
          #3 out of range jump OK
          #4 out of range jump2 OK
          #5 test1 ld_imm64 OK
          #6 test2 ld_imm64 OK
          #7 test3 ld_imm64 OK
          #8 test4 ld_imm64 OK
          #9 test5 ld_imm64 OK
          #10 no bpf_exit OK
          #11 loop (back-edge) OK
          #12 loop2 (back-edge) OK
          #13 conditional loop OK
          #14 read uninitialized register OK
          #15 read invalid register OK
          #16 program doesn't init R0 before exit OK
          #17 stack out of bounds OK
          #18 invalid call insn1 OK
          #19 invalid call insn2 OK
          #20 invalid function call OK
          #21 uninitialized stack1 OK
          #22 uninitialized stack2 OK
          #23 check valid spill/fill OK
          #24 check corrupted spill/fill OK
          #25 invalid src register in STX OK
          #26 invalid dst register in STX OK
          #27 invalid dst register in ST OK
          #28 invalid src register in LDX OK
          #29 invalid dst register in LDX OK
          #30 junk insn OK
          #31 junk insn2 OK
          #32 junk insn3 OK
          #33 junk insn4 OK
          #34 junk insn5 OK
          #35 misaligned read from stack OK
          #36 invalid map_fd for function call OK
          #37 don't check return value before access OK
          #38 access memory with incorrect alignment OK
          #39 sometimes access memory with incorrect alignment OK
          #40 jump test 1 OK
          #41 jump test 2 OK
          #42 jump test 3 OK
          #43 jump test 4 OK
      Signed-off-by: NPranith Kumar <bobby.prani@gmail.com>
      [mpe: test using samples/bpf]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      fcbb539f
  12. 09 9月, 2014 1 次提交
  13. 02 6月, 2014 1 次提交
  14. 29 1月, 2014 1 次提交
  15. 05 3月, 2013 1 次提交
  16. 14 12月, 2012 1 次提交
  17. 09 10月, 2012 1 次提交
  18. 03 10月, 2012 1 次提交
    • C
      compat: fs: Generic compat_sys_sendfile implementation · 8f9c0119
      Catalin Marinas 提交于
      This function is used by sparc, powerpc and arm64 for compat support.
      The patch adds a generic implementation which calls do_sendfile()
      directly and avoids set_fs().
      
      The sparc architecture has wrappers for the sign extensions while
      powerpc relies on the compiler to do the this. The patch adds wrappers
      for powerpc to handle the u32->int type conversion.
      
      compat_sys_sendfile64() can be replaced by a sys_sendfile() call since
      compat_loff_t has the same size as off_t on a 64-bit system.
      
      On powerpc, the patch also changes the 64-bit sendfile call from
      sys_sendile64 to sys_sendfile.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      8f9c0119
  19. 01 10月, 2012 1 次提交
  20. 31 7月, 2012 1 次提交
  21. 01 11月, 2011 1 次提交
    • C
      Cross Memory Attach · fcf63409
      Christopher Yeoh 提交于
      The basic idea behind cross memory attach is to allow MPI programs doing
      intra-node communication to do a single copy of the message rather than a
      double copy of the message via shared memory.
      
      The following patch attempts to achieve this by allowing a destination
      process, given an address and size from a source process, to copy memory
      directly from the source process into its own address space via a system
      call.  There is also a symmetrical ability to copy from the current
      process's address space into a destination process's address space.
      
      - Use of /proc/pid/mem has been considered, but there are issues with
        using it:
        - Does not allow for specifying iovecs for both src and dest, assuming
          preadv or pwritev was implemented either the area read from or
        written to would need to be contiguous.
        - Currently mem_read allows only processes who are currently
        ptrace'ing the target and are still able to ptrace the target to read
        from the target. This check could possibly be moved to the open call,
        but its not clear exactly what race this restriction is stopping
        (reason  appears to have been lost)
        - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
        domain socket is a bit ugly from a userspace point of view,
        especially when you may have hundreds if not (eventually) thousands
        of processes  that all need to do this with each other
        - Doesn't allow for some future use of the interface we would like to
        consider adding in the future (see below)
        - Interestingly reading from /proc/pid/mem currently actually
        involves two copies! (But this could be fixed pretty easily)
      
      As mentioned previously use of vmsplice instead was considered, but has
      problems.  Since you need the reader and writer working co-operatively if
      the pipe is not drained then you block.  Which requires some wrapping to
      do non blocking on the send side or polling on the receive.  In all to all
      communication it requires ordering otherwise you can deadlock.  And in the
      example of many MPI tasks writing to one MPI task vmsplice serialises the
      copying.
      
      There are some cases of MPI collectives where even a single copy interface
      does not get us the performance gain we could.  For example in an
      MPI_Reduce rather than copy the data from the source we would like to
      instead use it directly in a mathops (say the reduce is doing a sum) as
      this would save us doing a copy.  We don't need to keep a copy of the data
      from the source.  I haven't implemented this, but I think this interface
      could in the future do all this through the use of the flags - eg could
      specify the math operation and type and the kernel rather than just
      copying the data would apply the specified operation between the source
      and destination and store it in the destination.
      
      Although we don't have a "second user" of the interface (though I've had
      some nibbles from people who may be interested in using it for intra
      process messaging which is not MPI).  This interface is something which
      hardware vendors are already doing for their custom drivers to implement
      fast local communication.  And so in addition to this being useful for
      OpenMPI it would mean the driver maintainers don't have to fix things up
      when the mm changes.
      
      There was some discussion about how much faster a true zero copy would
      go. Here's a link back to the email with some testing I did on that:
      
      http://marc.info/?l=linux-mm&m=130105930902915&w=2
      
      There is a basic man page for the proposed interface here:
      
      http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt
      
      This has been implemented for x86 and powerpc, other architecture should
      mainly (I think) just need to add syscall numbers for the process_vm_readv
      and process_vm_writev. There are 32 bit compatibility versions for
      64-bit kernels.
      
      For arch maintainers there are some simple tests to be able to quickly
      verify that the syscalls are working correctly here:
      
      http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgzSigned-off-by: NChris Yeoh <yeohc@au1.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: <linux-man@vger.kernel.org>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fcf63409
  22. 29 5月, 2011 1 次提交
    • E
      ns: Wire up the setns system call · 7b21fddd
      Eric W. Biederman 提交于
      32bit and 64bit on x86 are tested and working.  The rest I have looked
      at closely and I can't find any problems.
      
      setns is an easy system call to wire up.  It just takes two ints so I
      don't expect any weird architecture porting problems.
      
      While doing this I have noticed that we have some architectures that are
      very slow to get new system calls.  cris seems to be the slowest where
      the last system calls wired up were preadv and pwritev.  avr32 is weird
      in that recvmmsg was wired up but never declared in unistd.h.  frv is
      behind with perf_event_open being the last syscall wired up.  On h8300
      the last system call wired up was epoll_wait.  On m32r the last system
      call wired up was fallocate.  mn10300 has recvmmsg as the last system
      call wired up.  The rest seem to at least have syncfs wired up which was
      new in the 2.6.39.
      
      v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
      v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
      v4: Moved wiring up of the system call to another patch
      v5: ported to v2.6.39-rc6
      v6: rebased onto parisc-next and net-next to avoid syscall  conflicts.
      v7: ported to Linus's latest post 2.6.39 tree.
      
      >  arch/blackfin/include/asm/unistd.h     |    3 ++-
      >  arch/blackfin/mach-common/entry.S      |    1 +
      Acked-by: NMike Frysinger <vapier@gentoo.org>
      
      Oh - ia64 wiring looks good.
      Acked-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7b21fddd
  23. 06 5月, 2011 1 次提交
    • A
      net: Add sendmmsg socket system call · 228e548e
      Anton Blanchard 提交于
      This patch adds a multiple message send syscall and is the send
      version of the existing recvmmsg syscall. This is heavily
      based on the patch by Arnaldo that added recvmmsg.
      
      I wrote a microbenchmark to test the performance gains of using
      this new syscall:
      
      http://ozlabs.org/~anton/junkcode/sendmmsg_test.c
      
      The test was run on a ppc64 box with a 10 Gbit network card. The
      benchmark can send both UDP and RAW ethernet packets.
      
      64B UDP
      
      batch   pkts/sec
      1       804570
      2       872800 (+ 8 %)
      4       916556 (+14 %)
      8       939712 (+17 %)
      16      952688 (+18 %)
      32      956448 (+19 %)
      64      964800 (+20 %)
      
      64B raw socket
      
      batch   pkts/sec
      1       1201449
      2       1350028 (+12 %)
      4       1461416 (+22 %)
      8       1513080 (+26 %)
      16      1541216 (+28 %)
      32      1553440 (+29 %)
      64      1557888 (+30 %)
      
      We see a 20% improvement in throughput on UDP send and 30%
      on raw socket send.
      
      [ Add sparc syscall entries. -DaveM ]
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      228e548e
  24. 30 3月, 2011 1 次提交
  25. 02 9月, 2010 1 次提交
  26. 24 8月, 2010 1 次提交
  27. 13 3月, 2010 2 次提交
    • C
      Add generic sys_olduname() · 5cacdb4a
      Christoph Hellwig 提交于
      Add generic implementations of the old and really old uname system calls.
      Note that sh only implements sys_olduname but not sys_oldolduname, but I'm
      not going to bother with another ifdef for that special case.
      
      m32r implemented an old uname but never wired it up, so kill it, too.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5cacdb4a
    • C
      Add generic sys_ipc wrapper · baed7fc9
      Christoph Hellwig 提交于
      Add a generic implementation of the ipc demultiplexer syscall.  Except for
      s390 and sparc64 all implementations of the sys_ipc are nearly identical.
      
      There are slight differences in the types of the parameters, where mips
      and powerpc as the only 64-bit architectures with sys_ipc use unsigned
      long for the "third" argument as it gets casted to a pointer later, while
      it traditionally is an "int" like most other paramters.  frv goes even
      further and uses unsigned long for all parameters execept for "ptr" which
      is a pointer type everywhere.  The change from int to unsigned long for
      "third" and back to "int" for the others on frv should be fine due to the
      in-register calling conventions for syscalls (we already had a similar
      issue with the generic sys_ptrace), but I'd prefer to have the arch
      maintainers looks over this in details.
      
      Except for that h8300, m68k and m68knommu lack an impplementation of the
      semtimedop sub call which this patch adds, and various architectures have
      gets used - at least on i386 it seems superflous as the compat code on
      x86-64 and ia64 doesn't even bother to implement it.
      
      [akpm@linux-foundation.org: add sys_ipc to sys_ni.c]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Reviewed-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NKyle McMartin <kyle@mcmartin.ca>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      baed7fc9
  28. 21 9月, 2009 1 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
  29. 15 6月, 2009 1 次提交
  30. 07 4月, 2009 1 次提交
  31. 09 1月, 2009 1 次提交
  32. 04 8月, 2008 1 次提交
  33. 25 7月, 2008 1 次提交
  34. 14 2月, 2008 1 次提交
  35. 24 1月, 2008 1 次提交
    • P
      [POWERPC] Provide a way to protect 4k subpages when using 64k pages · fa28237c
      Paul Mackerras 提交于
      Using 64k pages on 64-bit PowerPC systems makes life difficult for
      emulators that are trying to emulate an ISA, such as x86, which use a
      smaller page size, since the emulator can no longer use the MMU and
      the normal system calls for controlling page protections.  Of course,
      the emulator can emulate the MMU by checking and possibly remapping
      the address for each memory access in software, but that is pretty
      slow.
      
      This provides a facility for such programs to control the access
      permissions on individual 4k sub-pages of 64k pages.  The idea is
      that the emulator supplies an array of protection masks to apply to a
      specified range of virtual addresses.  These masks are applied at the
      level where hardware PTEs are inserted into the hardware page table
      based on the Linux PTEs, so the Linux PTEs are not affected.  Note
      that this new mechanism does not allow any access that would otherwise
      be prohibited; it can only prohibit accesses that would otherwise be
      allowed.  This new facility is only available on 64-bit PowerPC and
      only when the kernel is configured for 64k pages.
      
      The masks are supplied using a new subpage_prot system call, which
      takes a starting virtual address and length, and a pointer to an array
      of protection masks in memory.  The array has a 32-bit word per 64k
      page to be protected; each 32-bit word consists of 16 2-bit fields,
      for which 0 allows any access (that is otherwise allowed), 1 prevents
      write accesses, and 2 or 3 prevent any access.
      
      Implicit in this is that the regions of the address space that are
      protected are switched to use 4k hardware pages rather than 64k
      hardware pages (on machines with hardware 64k page support).  In fact
      the whole process is switched to use 4k hardware pages when the
      subpage_prot system call is used, but this could be improved in future
      to switch only the affected segments.
      
      The subpage protection bits are stored in a 3 level tree akin to the
      page table tree.  The top level of this tree is stored in a structure
      that is appended to the top level of the page table tree, i.e., the
      pgd array.  Since it will often only be 32-bit addresses (below 4GB)
      that are protected, the pointers to the first four bottom level pages
      are also stored in this structure (each bottom level page contains the
      protection bits for 1GB of address space), so the protection bits for
      addresses below 4GB can be accessed with one fewer loads than those
      for higher addresses.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      fa28237c
  36. 18 7月, 2007 1 次提交
    • A
      sys_fallocate() implementation on i386, x86_64 and powerpc · 97ac7350
      Amit Arora 提交于
      fallocate() is a new system call being proposed here which will allow
      applications to preallocate space to any file(s) in a file system.
      Each file system implementation that wants to use this feature will need
      to support an inode operation called ->fallocate().
      Applications can use this feature to avoid fragmentation to certain
      level and thus get faster access speed. With preallocation, applications
      also get a guarantee of space for particular file(s) - even if later the
      the system becomes full.
      
      Currently, glibc provides an interface called posix_fallocate() which
      can be used for similar cause. Though this has the advantage of working
      on all file systems, but it is quite slow (since it writes zeroes to
      each block that has to be preallocated). Without a doubt, file systems
      can do this more efficiently within the kernel, by implementing
      the proposed fallocate() system call. It is expected that
      posix_fallocate() will be modified to call this new system call first
      and incase the kernel/filesystem does not implement it, it should fall
      back to the current implementation of writing zeroes to the new blocks.
      ToDos:
      1. Implementation on other architectures (other than i386, x86_64,
         and ppc). Patches for s390(x) and ia64 are already available from
         previous posts, but it was decided that they should be added later
         once fallocate is in the mainline. Hence not including those patches
         in this take.
      2. Changes to glibc,
         a) to support fallocate() system call
         b) to make posix_fallocate() and posix_fallocate64() call fallocate()
      Signed-off-by: NAmit Arora <aarora@in.ibm.com>
      97ac7350
  37. 29 6月, 2007 1 次提交
    • D
      Introduce fixed sys_sync_file_range2() syscall, implement on PowerPC and ARM · edd5cd4a
      David Woodhouse 提交于
      Not all the world is an i386.  Many architectures need 64-bit arguments to be
      aligned in suitable pairs of registers, and the original
      sys_sync_file_range(int, loff_t, loff_t, int) was therefore wasting an
      argument register for padding after the first integer.  Since we don't
      normally have more than 6 arguments for system calls, that left no room for
      the final argument on some architectures.
      
      Fix this by introducing sys_sync_file_range2(int, int, loff_t, loff_t) which
      all fits nicely.  In fact, ARM already had that, but called it
      sys_arm_sync_file_range.  Move it to fs/sync.c and rename it, then implement
      the needed compatibility routine.  And stop the missing syscall check from
      bitching about the absence of sys_sync_file_range() if we've implemented
      sys_sync_file_range2() instead.
      
      Tested on PPC32 and with 32-bit and 64-bit userspace on PPC64.
      Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
      Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      edd5cd4a
  38. 17 5月, 2007 2 次提交