1. 08 12月, 2015 3 次提交
  2. 02 12月, 2015 1 次提交
    • Z
      vfs: add copy_file_range syscall and vfs helper · 29732938
      Zach Brown 提交于
      Add a copy_file_range() system call for offloading copies between
      regular files.
      
      This gives an interface to underlying layers of the storage stack which
      can copy without reading and writing all the data.  There are a few
      candidates that should support copy offloading in the nearer term:
      
      - btrfs shares extent references with its clone ioctl
      - NFS has patches to add a COPY command which copies on the server
      - SCSI has a family of XCOPY commands which copy in the device
      
      This system call avoids the complexity of also accelerating the creation
      of the destination file by operating on an existing destination file
      descriptor, not a path.
      
      Currently the high level vfs entry point limits copy offloading to files
      on the same mount and super (and not in the same file).  This can be
      relaxed if we get implementations which can copy between file systems
      safely.
      Signed-off-by: NZach Brown <zab@redhat.com>
      [Anna Schumaker: Change -EINVAL to -EBADF during file verification,
                       Change flags parameter from int to unsigned int,
                       Add function to include/linux/syscalls.h,
                       Check copy len after file open mode,
                       Don't forbid ranges inside the same file,
                       Use rw_verify_area() to veriy ranges,
                       Use file_out rather than file_in,
                       Add COPY_FR_REFLINK flag]
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      29732938
  3. 29 11月, 2015 1 次提交
  4. 26 11月, 2015 1 次提交
  5. 25 11月, 2015 1 次提交
    • A
      arm64: fix building without CONFIG_UID16 · fbc416ff
      Arnd Bergmann 提交于
      As reported by Michal Simek, building an ARM64 kernel with CONFIG_UID16
      disabled currently fails because the system call table still needs to
      reference the individual function entry points that are provided by
      kernel/sys_ni.c in this case, and the declarations are hidden inside
      of #ifdef CONFIG_UID16:
      
      arch/arm64/include/asm/unistd32.h:57:8: error: 'sys_lchown16' undeclared here (not in a function)
       __SYSCALL(__NR_lchown, sys_lchown16)
      
      I believe this problem only exists on ARM64, because older architectures
      tend to not need declarations when their system call table is built
      in assembly code, while newer architectures tend to not need UID16
      support. ARM64 only uses these system calls for compatibility with
      32-bit ARM binaries.
      
      This changes the CONFIG_UID16 check into CONFIG_HAVE_UID16, which is
      set unconditionally on ARM64 with CONFIG_COMPAT, so we see the
      declarations whenever we need them, but otherwise the behavior is
      unchanged.
      
      Fixes: af1839eb ("Kconfig: clean up the long arch list for the UID16 config option")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      fbc416ff
  6. 24 11月, 2015 2 次提交
  7. 23 11月, 2015 1 次提交
    • J
      slab/slub: adjust kmem_cache_alloc_bulk API · 865762a8
      Jesper Dangaard Brouer 提交于
      Adjust kmem_cache_alloc_bulk API before we have any real users.
      
      Adjust API to return type 'int' instead of previously type 'bool'.  This
      is done to allow future extension of the bulk alloc API.
      
      A future extension could be to allow SLUB to stop at a page boundary, when
      specified by a flag, and then return the number of objects.
      
      The advantage of this approach, would make it easier to make bulk alloc
      run without local IRQs disabled.  With an approach of cmpxchg "stealing"
      the entire c->freelist or page->freelist.  To avoid overshooting we would
      stop processing at a slab-page boundary.  Else we always end up returning
      some objects at the cost of another cmpxchg.
      
      To keep compatible with future users of this API linking against an older
      kernel when using the new flag, we need to return the number of allocated
      objects with this API change.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      865762a8
  8. 21 11月, 2015 5 次提交
    • P
      tty: audit: Fix audit source · 6b2a3d62
      Peter Hurley 提交于
      The data to audit/record is in the 'from' buffer (ie., the input
      read buffer).
      
      Fixes: 72586c60 ("n_tty: Fix auditing support for cannonical mode")
      Cc: stable <stable@vger.kernel.org> # 4.1+
      Cc: Miloslav Trmač <mitr@redhat.com>
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Acked-by: NLaura Abbott <labbott@fedoraproject.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b2a3d62
    • J
      mm: fix up sparse warning in gfpflags_allow_blocking · 21fa8442
      Jeff Layton 提交于
      sparse says:
      
          include/linux/gfp.h:274:26: warning: incorrect type in return expression (different base types)
          include/linux/gfp.h:274:26:    expected bool
          include/linux/gfp.h:274:26:    got restricted gfp_t
      
      ...add a forced cast to silence the warning.
      Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      21fa8442
    • R
      kernel/signal.c: unexport sigsuspend() · 9d8a7652
      Richard Weinberger 提交于
      sigsuspend() is nowhere used except in signal.c itself, so we can mark it
      static do not pollute the global namespace.
      
      But this patch is more than a boring cleanup patch, it fixes a real issue
      on UserModeLinux.  UML has a special console driver to display ttys using
      xterm, or other terminal emulators, on the host side.  Vegard reported
      that sometimes UML is unable to spawn a xterm and he's facing the
      following warning:
      
        WARNING: CPU: 0 PID: 908 at include/linux/thread_info.h:128 sigsuspend+0xab/0xc0()
      
      It turned out that this warning makes absolutely no sense as the UML
      xterm code calls sigsuspend() on the host side, at least it tries.  But
      as the kernel itself offers a sigsuspend() symbol the linker choose this
      one instead of the glibc wrapper.  Interestingly this code used to work
      since ever but always blocked signals on the wrong side.  Some recent
      kernel change made the WARN_ON() trigger and uncovered the bug.
      
      It is a wonderful example of how much works by chance on computers. :-)
      
      Fixes: 68f3f16d ("new helper: sigsuspend()")
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      Reported-by: NVegard Nossum <vegard.nossum@oracle.com>
      Tested-by: NVegard Nossum <vegard.nossum@oracle.com>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: <stable@vger.kernel.org>	[3.5+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9d8a7652
    • D
      configfs: allow dynamic group creation · 5cf6a51e
      Daniel Baluta 提交于
      This patchset introduces IIO software triggers, offers a way of configuring
      them via configfs and adds the IIO hrtimer based interrupt source to be used
      with software triggers.
      
      The architecture is now split in 3 parts, to remove all IIO trigger specific
      parts from IIO configfs core:
      
      (1) IIO configfs - creates the root of the IIO configfs subsys.
      (2) IIO software triggers - software trigger implementation, dynamically
          creating /config/iio/triggers group.
      (3) IIO hrtimer trigger - is the first interrupt source for software triggers
          (with syfs to follow). Each trigger type can implement its own set of
          attributes.
      
      Lockdep seems to be happy with the locking in configfs patch.
      
      This patch (of 5):
      
      We don't want to hardcode default groups at subsystem
      creation time. We export:
      	* configfs_register_group
      	* configfs_unregister_group
      to allow drivers to programatically create/destroy groups
      later, after module init time.
      
      This is needed for IIO configfs support.
      
      (akpm: the other 4 patches to be merged via the IIO tree)
      Signed-off-by: NDaniel Baluta <daniel.baluta@intel.com>
      Suggested-by: NLars-Peter Clausen <lars@metafoo.de>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: NJoel Becker <jlbec@evilplan.org>
      Cc: Hartmut Knaack <knaack.h@gmx.de>
      Cc: Octavian Purdila <octavian.purdila@intel.com>
      Cc: Paul Bolle <pebolle@tiscali.nl>
      Cc: Adriana Reus <adriana.reus@intel.com>
      Cc: Cristina Opriceana <cristina.opriceana@gmail.com>
      Cc: Peter Meerwald <pmeerw@pmeerw.net>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5cf6a51e
    • R
      slab.h: sprinkle __assume_aligned attributes · 94a58c36
      Rasmus Villemoes 提交于
      The various allocators return aligned memory.  Telling the compiler that
      allows it to generate better code in many cases, for example when the
      return value is immediately passed to memset().
      
      Some code does become larger, but at least we win twice as much as we lose:
      
      $ scripts/bloat-o-meter /tmp/vmlinux vmlinux
      add/remove: 0/0 grow/shrink: 13/52 up/down: 995/-2140 (-1145)
      
      An example of the different (and smaller) code can be seen in mm_alloc(). Before:
      
      :       48 8d 78 08             lea    0x8(%rax),%rdi
      :       48 89 c1                mov    %rax,%rcx
      :       48 89 c2                mov    %rax,%rdx
      :       48 c7 00 00 00 00 00    movq   $0x0,(%rax)
      :       48 c7 80 48 03 00 00    movq   $0x0,0x348(%rax)
      :       00 00 00 00
      :       31 c0                   xor    %eax,%eax
      :       48 83 e7 f8             and    $0xfffffffffffffff8,%rdi
      :       48 29 f9                sub    %rdi,%rcx
      :       81 c1 50 03 00 00       add    $0x350,%ecx
      :       c1 e9 03                shr    $0x3,%ecx
      :       f3 48 ab                rep stos %rax,%es:(%rdi)
      
      After:
      
      :       48 89 c2                mov    %rax,%rdx
      :       b9 6a 00 00 00          mov    $0x6a,%ecx
      :       31 c0                   xor    %eax,%eax
      :       48 89 d7                mov    %rdx,%rdi
      :       f3 48 ab                rep stos %rax,%es:(%rdi)
      
      So gcc's strategy is to do two possibly (but not really, of course)
      unaligned stores to the first and last word, then do an aligned rep stos
      covering the middle part with a little overlap.  Maybe arches which do not
      allow unaligned stores gain even more.
      
      I don't know if gcc can actually make use of alignments greater than 8 for
      anything, so one could probably drop the __assume_xyz_alignment macros and
      just use __assume_aligned(8).
      
      The increases in code size are mostly caused by gcc deciding to
      opencode strlen() using the check-four-bytes-at-a-time trick when it
      knows the buffer is sufficiently aligned (one function grew by 200
      bytes). Now it turns out that many of these strlen() calls showing up
      were in fact redundant, and they're gone from -next. Applying the two
      patches to next-20151001 bloat-o-meter instead says
      
      add/remove: 0/0 grow/shrink: 6/52 up/down: 244/-2140 (-1896)
      Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      94a58c36
  9. 20 11月, 2015 3 次提交
    • J
      lightnvm: add free and bad lun info to show luns · 2fde0e48
      Javier Gonzalez 提交于
      Add free block, used block, and bad block information to the show debug
      interface. This information is used to debug how targets track blocks.
      
      Also, change debug function name to make it more generic.
      Signed-off-by: NJavier Gonzalez <javier@cnexlabs.com>
      Signed-off-by: NMatias Bjørling <m@bjorling.me>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      2fde0e48
    • J
      lightnvm: keep track of block counts · 0b59733b
      Javier Gonzalez 提交于
      Maintain number of in use blocks, free blocks, and bad blocks in a per
      lun basis. This allows the upper layers to get information about the
      state of each lun.
      
      Also, account for blocks reserved to the device on the free block count.
      nr_free_blocks matches now the actual number of blocks on the free list
      when the device is booted.
      Signed-off-by: NJavier Gonzalez <javier@cnexlabs.com>
      Signed-off-by: NMatias Bjørling <m@bjorling.me>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      0b59733b
    • D
      block: protect rw_page against device teardown · 2e6edc95
      Dan Williams 提交于
      Fix use after free crashes like the following:
      
       general protection fault: 0000 [#1] SMP
       Call Trace:
        [<ffffffffa0050216>] ? pmem_do_bvec.isra.12+0xa6/0xf0 [nd_pmem]
        [<ffffffffa0050ba2>] pmem_rw_page+0x42/0x80 [nd_pmem]
        [<ffffffff8128fd90>] bdev_read_page+0x50/0x60
        [<ffffffff812972f0>] do_mpage_readpage+0x510/0x770
        [<ffffffff8128fd20>] ? I_BDEV+0x20/0x20
        [<ffffffff811d86dc>] ? lru_cache_add+0x1c/0x50
        [<ffffffff81297657>] mpage_readpages+0x107/0x170
        [<ffffffff8128fd20>] ? I_BDEV+0x20/0x20
        [<ffffffff8128fd20>] ? I_BDEV+0x20/0x20
        [<ffffffff8129058d>] blkdev_readpages+0x1d/0x20
        [<ffffffff811d615f>] __do_page_cache_readahead+0x28f/0x310
        [<ffffffff811d6039>] ? __do_page_cache_readahead+0x169/0x310
        [<ffffffff811c5abd>] ? pagecache_get_page+0x2d/0x1d0
        [<ffffffff811c76f6>] filemap_fault+0x396/0x530
        [<ffffffff811f816e>] __do_fault+0x4e/0xf0
        [<ffffffff811fce7d>] handle_mm_fault+0x11bd/0x1b50
      
      Cc: <stable@vger.kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Acked-by: NMatthew Wilcox <willy@linux.intel.com>
      [willy: symmetry fixups]
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      2e6edc95
  10. 19 11月, 2015 2 次提交
  11. 18 11月, 2015 2 次提交
  12. 17 11月, 2015 5 次提交
  13. 16 11月, 2015 2 次提交
  14. 14 11月, 2015 1 次提交
    • A
      9p: xattr simplifications · e409de99
      Andreas Gruenbacher 提交于
      Now that the xattr handler is passed to the xattr handler operations, we
      can use the same get and set operations for the user, trusted, and security
      xattr namespaces.  In those namespaces, we can access the full attribute
      name by "reattaching" the name prefix the vfs has skipped for us.  Add a
      xattr_full_name helper to make this obvious in the code.
      
      For the "system.posix_acl_access" and "system.posix_acl_default"
      attributes, handler->prefix is the full attribute name; the suffix is the
      empty string.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Ron Minnich <rminnich@sandia.gov>
      Cc: Latchesar Ionkov <lucho@ionkov.net>
      Cc: v9fs-developer@lists.sourceforge.net
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      e409de99
  15. 12 11月, 2015 1 次提交
  16. 11 11月, 2015 5 次提交
  17. 10 11月, 2015 4 次提交