1. 31 10月, 2016 1 次提交
  2. 20 10月, 2016 1 次提交
    • A
      fs/proc: Stop trying to report thread stacks · b18cb64e
      Andy Lutomirski 提交于
      This reverts more of:
      
        b7643757 ("procfs: mark thread stack correctly in proc/<pid>/maps")
      
      ... which was partially reverted by:
      
        65376df5 ("proc: revert /proc/<pid>/maps [stack:TID] annotation")
      
      Originally, /proc/PID/task/TID/maps was the same as /proc/TID/maps.
      
      In current kernels, /proc/PID/maps (or /proc/TID/maps even for
      threads) shows "[stack]" for VMAs in the mm's stack address range.
      
      In contrast, /proc/PID/task/TID/maps uses KSTK_ESP to guess the
      target thread's stack's VMA.  This is racy, probably returns garbage
      and, on arches with CONFIG_TASK_INFO_IN_THREAD=y, is also crash-prone:
      KSTK_ESP is not safe to use on tasks that aren't known to be running
      ordinary process-context kernel code.
      
      This patch removes the difference and just shows "[stack]" for VMAs
      in the mm's stack range.  This is IMO much more sensible -- the
      actual "stack" address really is treated specially by the VM code,
      and the current thread stack isn't even well-defined for programs
      that frequently switch stacks on their own.
      Reported-by: NJann Horn <jann@thejh.net>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linux API <linux-api@vger.kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tycho Andersen <tycho.andersen@canonical.com>
      Link: http://lkml.kernel.org/r/3e678474ec14e0a0ec34c611016753eea2e1b8ba.1475257877.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b18cb64e
  3. 12 10月, 2016 3 次提交
  4. 10 10月, 2016 1 次提交
  5. 08 10月, 2016 2 次提交
    • R
      Documentation/filesystems/proc.txt: add more description for maps/smaps · 53aeee7a
      Robert Ho 提交于
      Add some more description on the limitations for smaps/maps readings, as
      well as some guaruntees we can make.
      
      Link: http://lkml.kernel.org/r/1475296958-27652-2-git-send-email-robert.hu@intel.comSigned-off-by: NRobert Ho <robert.hu@intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Robert Hu <robert.hu@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      53aeee7a
    • A
      xattr: Stop calling {get,set,remove}xattr inode operations · 6c6ef9f2
      Andreas Gruenbacher 提交于
      All filesystems that support xattrs by now do so via xattr handlers.
      They all define sb->s_xattr, and their getxattr, setxattr, and
      removexattr inode operations use the generic inode operations.  On
      filesystems that don't support xattrs, the xattr inode operations are
      all NULL, and sb->s_xattr is also NULL.
      
      This means that we can remove the getxattr, setxattr, and removexattr
      inode operations and directly call the generic handlers, or better,
      inline expand those handlers into fs/xattr.c.
      
      Filesystems that do not support xattrs on some inodes should clear the
      IOP_XATTR i_opflags flag in those inodes.  (Right now, some filesystems
      have checks to disable xattrs on some inodes in the ->list, ->get, and
      ->set xattr handler operations instead.)  The IOP_XATTR flag is
      automatically cleared in inodes of filesystems that don't have xattr
      support.
      
      In orangefs, symlinks do have a setxattr iop but no getxattr iop.  Add a
      check for symlinks to orangefs_inode_getxattr to preserve the current,
      weird behavior; that check may not be necessary though.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6c6ef9f2
  6. 03 10月, 2016 1 次提交
  7. 01 10月, 2016 1 次提交
  8. 29 9月, 2016 1 次提交
  9. 27 9月, 2016 3 次提交
  10. 22 9月, 2016 1 次提交
  11. 20 9月, 2016 1 次提交
  12. 19 9月, 2016 1 次提交
  13. 16 9月, 2016 1 次提交
  14. 01 9月, 2016 1 次提交
  15. 30 8月, 2016 1 次提交
  16. 04 8月, 2016 1 次提交
  17. 03 8月, 2016 2 次提交
  18. 01 8月, 2016 1 次提交
  19. 27 7月, 2016 4 次提交
    • K
    • K
      mm: introduce fault_env · bae473a4
      Kirill A. Shutemov 提交于
      The idea borrowed from Peter's patch from patchset on speculative page
      faults[1]:
      
      Instead of passing around the endless list of function arguments,
      replace the lot with a single structure so we can change context without
      endless function signature changes.
      
      The changes are mostly mechanical with exception of faultaround code:
      filemap_map_pages() got reworked a bit.
      
      This patch is preparation for the next one.
      
      [1] http://lkml.kernel.org/r/20141020222841.302891540@infradead.org
      
      Link: http://lkml.kernel.org/r/1466021202-61880-9-git-send-email-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bae473a4
    • M
      mm: migrate: support non-lru movable page migration · bda807d4
      Minchan Kim 提交于
      We have allowed migration for only LRU pages until now and it was enough
      to make high-order pages.  But recently, embedded system(e.g., webOS,
      android) uses lots of non-movable pages(e.g., zram, GPU memory) so we
      have seen several reports about troubles of small high-order allocation.
      For fixing the problem, there were several efforts (e,g,.  enhance
      compaction algorithm, SLUB fallback to 0-order page, reserved memory,
      vmalloc and so on) but if there are lots of non-movable pages in system,
      their solutions are void in the long run.
      
      So, this patch is to support facility to change non-movable pages with
      movable.  For the feature, this patch introduces functions related to
      migration to address_space_operations as well as some page flags.
      
      If a driver want to make own pages movable, it should define three
      functions which are function pointers of struct
      address_space_operations.
      
      1. bool (*isolate_page) (struct page *page, isolate_mode_t mode);
      
      What VM expects on isolate_page function of driver is to return *true*
      if driver isolates page successfully.  On returing true, VM marks the
      page as PG_isolated so concurrent isolation in several CPUs skip the
      page for isolation.  If a driver cannot isolate the page, it should
      return *false*.
      
      Once page is successfully isolated, VM uses page.lru fields so driver
      shouldn't expect to preserve values in that fields.
      
      2. int (*migratepage) (struct address_space *mapping,
      		struct page *newpage, struct page *oldpage, enum migrate_mode);
      
      After isolation, VM calls migratepage of driver with isolated page.  The
      function of migratepage is to move content of the old page to new page
      and set up fields of struct page newpage.  Keep in mind that you should
      indicate to the VM the oldpage is no longer movable via
      __ClearPageMovable() under page_lock if you migrated the oldpage
      successfully and returns 0.  If driver cannot migrate the page at the
      moment, driver can return -EAGAIN.  On -EAGAIN, VM will retry page
      migration in a short time because VM interprets -EAGAIN as "temporal
      migration failure".  On returning any error except -EAGAIN, VM will give
      up the page migration without retrying in this time.
      
      Driver shouldn't touch page.lru field VM using in the functions.
      
      3. void (*putback_page)(struct page *);
      
      If migration fails on isolated page, VM should return the isolated page
      to the driver so VM calls driver's putback_page with migration failed
      page.  In this function, driver should put the isolated page back to the
      own data structure.
      
      4. non-lru movable page flags
      
      There are two page flags for supporting non-lru movable page.
      
      * PG_movable
      
      Driver should use the below function to make page movable under
      page_lock.
      
      	void __SetPageMovable(struct page *page, struct address_space *mapping)
      
      It needs argument of address_space for registering migration family
      functions which will be called by VM.  Exactly speaking, PG_movable is
      not a real flag of struct page.  Rather than, VM reuses page->mapping's
      lower bits to represent it.
      
      	#define PAGE_MAPPING_MOVABLE 0x2
      	page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;
      
      so driver shouldn't access page->mapping directly.  Instead, driver
      should use page_mapping which mask off the low two bits of page->mapping
      so it can get right struct address_space.
      
      For testing of non-lru movable page, VM supports __PageMovable function.
      However, it doesn't guarantee to identify non-lru movable page because
      page->mapping field is unified with other variables in struct page.  As
      well, if driver releases the page after isolation by VM, page->mapping
      doesn't have stable value although it has PAGE_MAPPING_MOVABLE (Look at
      __ClearPageMovable).  But __PageMovable is cheap to catch whether page
      is LRU or non-lru movable once the page has been isolated.  Because LRU
      pages never can have PAGE_MAPPING_MOVABLE in page->mapping.  It is also
      good for just peeking to test non-lru movable pages before more
      expensive checking with lock_page in pfn scanning to select victim.
      
      For guaranteeing non-lru movable page, VM provides PageMovable function.
      Unlike __PageMovable, PageMovable functions validates page->mapping and
      mapping->a_ops->isolate_page under lock_page.  The lock_page prevents
      sudden destroying of page->mapping.
      
      Driver using __SetPageMovable should clear the flag via
      __ClearMovablePage under page_lock before the releasing the page.
      
      * PG_isolated
      
      To prevent concurrent isolation among several CPUs, VM marks isolated
      page as PG_isolated under lock_page.  So if a CPU encounters PG_isolated
      non-lru movable page, it can skip it.  Driver doesn't need to manipulate
      the flag because VM will set/clear it automatically.  Keep in mind that
      if driver sees PG_isolated page, it means the page have been isolated by
      VM so it shouldn't touch page.lru field.  PG_isolated is alias with
      PG_reclaim flag so driver shouldn't use the flag for own purpose.
      
      [opensource.ganesh@gmail.com: mm/compaction: remove local variable is_lru]
        Link: http://lkml.kernel.org/r/20160618014841.GA7422@leo-test
      Link: http://lkml.kernel.org/r/1464736881-24886-3-git-send-email-minchan@kernel.orgSigned-off-by: NGioh Kim <gi-oh.kim@profitbricks.com>
      Signed-off-by: NMinchan Kim <minchan@kernel.org>
      Signed-off-by: NGanesh Mahendran <opensource.ganesh@gmail.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: John Einar Reitan <john.reitan@foss.arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bda807d4
    • R
      dax: some small updates to dax.txt documentation · 221c7dc8
      Ross Zwisler 提交于
      These are originally from Matthew Wilcox and were part of his huge
      "mm,fs,dax: Change ->pmd_fault to ->huge_fault" patch that was part of
      PUD support.
      
      I'm breaking these small changes out as they stand on their own and add
      useful information to Documentation/filesystems/dax.txt.
      
      Link: http://lkml.kernel.org/r/20160714214049.20075-1-ross.zwisler@linux.intel.comSigned-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      221c7dc8
  20. 25 7月, 2016 1 次提交
  21. 13 7月, 2016 1 次提交
    • D
      pmem: kill __pmem address space · 7a9eb206
      Dan Williams 提交于
      The __pmem address space was meant to annotate codepaths that touch
      persistent memory and need to coordinate a call to wmb_pmem().  Now that
      wmb_pmem() is gone, there is little need to keep this annotation.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      7a9eb206
  22. 10 7月, 2016 1 次提交
  23. 09 7月, 2016 1 次提交
  24. 02 7月, 2016 1 次提交
  25. 01 7月, 2016 1 次提交
  26. 30 6月, 2016 2 次提交
  27. 14 6月, 2016 1 次提交
  28. 06 6月, 2016 1 次提交
    • E
      devpts: Make each mount of devpts an independent filesystem. · eedf265a
      Eric W. Biederman 提交于
      The /dev/ptmx device node is changed to lookup the directory entry "pts"
      in the same directory as the /dev/ptmx device node was opened in.  If
      there is a "pts" entry and that entry is a devpts filesystem /dev/ptmx
      uses that filesystem.  Otherwise the open of /dev/ptmx fails.
      
      The DEVPTS_MULTIPLE_INSTANCES configuration option is removed, so that
      userspace can now safely depend on each mount of devpts creating a new
      instance of the filesystem.
      
      Each mount of devpts is now a separate and equal filesystem.
      
      Reserved ttys are now available to all instances of devpts where the
      mounter is in the initial mount namespace.
      
      A new vfs helper path_pts is introduced that finds a directory entry
      named "pts" in the directory of the passed in path, and changes the
      passed in path to point to it.  The helper path_pts uses a function
      path_parent_directory that was factored out of follow_dotdot.
      
      In the implementation of devpts:
       - devpts_mnt is killed as it is no longer meaningful if all mounts of
         devpts are equal.
       - pts_sb_from_inode is replaced by just inode->i_sb as all cached
         inodes in the tty layer are now from the devpts filesystem.
       - devpts_add_ref is rolled into the new function devpts_ptmx.  And the
         unnecessary inode hold is removed.
       - devpts_del_ref is renamed devpts_release and reduced to just a
         deacrivate_super.
       - The newinstance mount option continues to be accepted but is now
         ignored.
      
      In devpts_fs.h definitions for when !CONFIG_UNIX98_PTYS are removed as
      they are never used.
      
      Documentation/filesystems/devices.txt is updated to describe the current
      situation.
      
      This has been verified to work properly on openwrt-15.05, centos5,
      centos6, centos7, debian-6.0.2, debian-7.9, debian-8.2, ubuntu-14.04.3,
      ubuntu-15.10, fedora23, magia-5, mint-17.3, opensuse-42.1,
      slackware-14.1, gentoo-20151225 (13.0?), archlinux-2015-12-01.  With the
      caveat that on centos6 and on slackware-14.1 that there wind up being
      two instances of the devpts filesystem mounted on /dev/pts, the lower
      copy does not end up getting used.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Aurelien Jarno <aurelien@aurel32.net>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jann Horn <jann@thejh.net>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Florian Weimer <fw@deneb.enyo.de>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      eedf265a
  29. 28 5月, 2016 1 次提交
  30. 27 5月, 2016 1 次提交