1. 23 3月, 2016 2 次提交
    • M
      fat: add config option to set UTF-8 mount option by default · 38739380
      Maciej S. Szmigiero 提交于
      FAT has long supported its own default file name encoding config
      setting, separate from CONFIG_NLS_DEFAULT.
      
      However, if UTF-8 encoded file names are desired FAT character set
      should not be set to utf8 since this would make file names case
      sensitive even if case insensitive matching is requested.  Instead,
      "utf8" mount options should be provided to enable UTF-8 file names in
      FAT file system.
      
      Unfortunately, there was no possibility to set the default value of this
      option so on UTF-8 system "utf8" mount option had to be added manually
      to most FAT mounts.
      
      This patch adds config option to set such default value.
      Signed-off-by: NMaciej S. Szmigiero <mail@maciej.szmigiero.name>
      Acked-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      38739380
    • G
      ocfs2: add feature document for online file check · d750c42a
      Gang He 提交于
      This document will describe OCFS2 online file check feature.  OCFS2 is
      often used in high-availaibility systems.  However, OCFS2 usually
      converts the filesystem to read-only when encounters an error.  This may
      not be necessary, since turning the filesystem read-only would affect
      other running processes as well, decreasing availability.
      
      Then, a mount option (errors=continue) is introduced, which would return
      the -EIO errno to the calling process and terminate furhter processing
      so that the filesystem is not corrupted further.  The filesystem is not
      converted to read-only, and the problematic file's inode number is
      reported in the kernel log.  The user can try to check/fix this file via
      online filecheck feature.
      Signed-off-by: NGang He <ghe@suse.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d750c42a
  2. 18 3月, 2016 2 次提交
    • C
      nfsd: add SCSI layout support · f99d4fbd
      Christoph Hellwig 提交于
      This is a simple extension to the block layout driver to use SCSI
      persistent reservations for access control and fencing, as well as
      SCSI VPD pages for device identification.
      
      For this we need to pass the nfs4_client to the proc_getdeviceinfo method
      to generate the reservation key, and add a new fence_client method
      to allow for fence actions in the layout driver.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f99d4fbd
    • J
      proc: add /proc/<pid>/timerslack_ns interface · 5de23d43
      John Stultz 提交于
      This patch provides a proc/PID/timerslack_ns interface which exposes a
      task's timerslack value in nanoseconds and allows it to be changed.
      
      This allows power/performance management software to set timer slack for
      other threads according to its policy for the thread (such as when the
      thread is designated foreground vs.  background activity)
      
      If the value written is non-zero, slack is set to that value.  Otherwise
      sets it to the default for the thread.
      
      This interface checks that the calling task has permissions to to use
      PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we can ensure
      arbitrary apps do not change the timer slack for other apps.
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oren Laadan <orenl@cellrox.com>
      Cc: Ruchi Kandoi <kandoiruchi@google.com>
      Cc: Rom Lemarchand <romlem@android.com>
      Cc: Android Kernel Team <kernel-team@android.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5de23d43
  3. 12 3月, 2016 1 次提交
  4. 10 3月, 2016 2 次提交
  5. 08 3月, 2016 1 次提交
  6. 06 3月, 2016 1 次提交
  7. 27 2月, 2016 1 次提交
  8. 12 2月, 2016 2 次提交
  9. 11 2月, 2016 1 次提交
  10. 04 2月, 2016 2 次提交
    • K
      mm: polish virtual memory accounting · 30bdbb78
      Konstantin Khlebnikov 提交于
      * add VM_STACK as alias for VM_GROWSUP/DOWN depending on architecture
      * always account VMAs with flag VM_STACK as stack (as it was before)
      * cleanup classifying helpers
      * update comments and documentation
      Signed-off-by: NKonstantin Khlebnikov <koct9i@gmail.com>
      Tested-by: NSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      30bdbb78
    • J
      proc: revert /proc/<pid>/maps [stack:TID] annotation · 65376df5
      Johannes Weiner 提交于
      Commit b7643757 ("procfs: mark thread stack correctly in
      proc/<pid>/maps") added [stack:TID] annotation to /proc/<pid>/maps.
      
      Finding the task of a stack VMA requires walking the entire thread list,
      turning this into quadratic behavior: a thousand threads means a
      thousand stacks, so the rendering of /proc/<pid>/maps needs to look at a
      million combinations.
      
      The cost is not in proportion to the usefulness as described in the
      patch.
      
      Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
      /proc/<pid>/numa_maps) usable again for higher thread counts.
      
      The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained, as
      identifying the stack VMA there is an O(1) operation.
      
      Siddesh said:
       "The end users needed a way to identify thread stacks programmatically and
        there wasn't a way to do that.  I'm afraid I no longer remember (or have
        access to the resources that would aid my memory since I changed
        employers) the details of their requirement.  However, I did do this on my
        own time because I thought it was an interesting project for me and nobody
        really gave any feedback then as to its utility, so as far as I am
        concerned you could roll back the main thread maps information since the
        information is available in the thread-specific files"
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
      Cc: Shaohua Li <shli@fb.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      65376df5
  11. 21 1月, 2016 1 次提交
  12. 15 1月, 2016 6 次提交
    • R
      Documentation/filesystems: describe the shared memory usage/accounting · 0bc126d4
      Rodrigo Freire 提交于
      The Shared Memory accounting support is present in Kernel since commit
      4b02108a ("mm: oom analysis: add shmem vmstat") and in userland
      free(1) since 2014.  This patch updates the Documentation to reflect
      this change.
      Signed-off-by: NRodrigo Freire <rfreire@redhat.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0bc126d4
    • J
      mm, procfs: breakdown RSS for anon, shmem and file in /proc/pid/status · 8cee852e
      Jerome Marchand 提交于
      There are several shortcomings with the accounting of shared memory
      (SysV shm, shared anonymous mapping, mapping of a tmpfs file).  The
      values in /proc/<pid>/status and <...>/statm don't allow to distinguish
      between shmem memory and a shared mapping to a regular file, even though
      theirs implication on memory usage are quite different: during reclaim,
      file mapping can be dropped or written back on disk, while shmem needs a
      place in swap.
      
      Also, to distinguish the memory occupied by anonymous and file mappings,
      one has to read the /proc/pid/statm file, which has a field for the file
      mappings (again, including shmem) and total memory occupied by these
      mappings (i.e.  equivalent to VmRSS in the <...>/status file.  Getting
      the value for anonymous mappings only is thus not exactly user-friendly
      (the statm file is intended to be rather efficiently machine-readable).
      
      To address both of these shortcomings, this patch adds a breakdown of
      VmRSS in /proc/<pid>/status via new fields RssAnon, RssFile and
      RssShmem, making use of the previous preparatory patch.  These fields
      tell the user the memory occupied by private anonymous pages, mapped
      regular files and shmem, respectively.  Other existing fields in /status
      and /statm files are left without change.  The /statm file can be
      extended in the future, if there's a need for that.
      
      Example (part of) /proc/pid/status output including the new Rss* fields:
      
      VmPeak:  2001008 kB
      VmSize:  2001004 kB
      VmLck:         0 kB
      VmPin:         0 kB
      VmHWM:      5108 kB
      VmRSS:      5108 kB
      RssAnon:              92 kB
      RssFile:            1324 kB
      RssShmem:           3692 kB
      VmData:      192 kB
      VmStk:       136 kB
      VmExe:         4 kB
      VmLib:      1784 kB
      VmPTE:      3928 kB
      VmPMD:        20 kB
      VmSwap:        0 kB
      HugetlbPages:          0 kB
      
      [vbabka@suse.cz: forward-porting, tweak changelog]
      Signed-off-by: NJerome Marchand <jmarchan@redhat.com>
      Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8cee852e
    • V
      mm, proc: account for shmem swap in /proc/pid/smaps · c261e7d9
      Vlastimil Babka 提交于
      Currently, /proc/pid/smaps will always show "Swap: 0 kB" for
      shmem-backed mappings, even if the mapped portion does contain pages
      that were swapped out.  This is because unlike private anonymous
      mappings, shmem does not change pte to swap entry, but pte_none when
      swapping the page out.  In the smaps page walk, such page thus looks
      like it was never faulted in.
      
      This patch changes smaps_pte_entry() to determine the swap status for
      such pte_none entries for shmem mappings, similarly to how
      mincore_page() does it.  Swapped out shmem pages are thus accounted for.
      For private mappings of tmpfs files that COWed some of the pages, swaped
      out status of the original shmem pages is naturally ignored.  If some of
      the private copies was also swapped out, they are accounted via their
      page table swap entries, so the resulting reported swap usage is then a
      sum of both swapped out private copies, and swapped out shmem pages that
      were not COWed.  No double accounting can thus happen.
      
      The accounting is arguably still not as precise as for private anonymous
      mappings, since now we will count also pages that the process in
      question never accessed, but another process populated them and then let
      them become swapped out.  I believe it is still less confusing and
      subtle than not showing any swap usage by shmem mappings at all.
      Swapped out counter might of interest of users who would like to prevent
      from future swapins during performance critical operation and pre-fault
      them at their convenience.  Especially for larger swapped out regions
      the cost of swapin is much higher than a fresh page allocation.  So a
      differentiation between pte_none vs.  swapped out is important for those
      usecases.
      
      One downside of this patch is that it makes /proc/pid/smaps more
      expensive for shmem mappings, as we consult the radix tree for each
      pte_none entry, so the overal complexity is O(n*log(n)).  I have
      measured this on a process that creates a 2GB mapping and dirties single
      pages with a stride of 2MB, and time how long does it take to cat
      /proc/pid/smaps of this process 100 times.
      
      Private anonymous mapping:
      
      real    0m0.949s
      user    0m0.116s
      sys     0m0.348s
      
      Mapping of a /dev/shm/file:
      
      real    0m3.831s
      user    0m0.180s
      sys     0m3.212s
      
      The difference is rather substantial, so the next patch will reduce the
      cost for shared or read-only mappings.
      
      In a less controlled experiment, I've gathered pids of processes on my
      desktop that have either '/dev/shm/*' or 'SYSV*' in smaps.  This
      included the Chrome browser and some KDE processes.  Again, I've run cat
      /proc/pid/smaps on each 100 times.
      
      Before this patch:
      
      real    0m9.050s
      user    0m0.518s
      sys     0m8.066s
      
      After this patch:
      
      real    0m9.221s
      user    0m0.541s
      sys     0m8.187s
      
      This suggests low impact on average systems.
      
      Note that this patch doesn't attempt to adjust the SwapPss field for
      shmem mappings, which would need extra work to determine who else could
      have the pages mapped.  Thus the value stays zero except for COWed
      swapped out pages in a shmem mapping, which are accounted as usual.
      Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: NJerome Marchand <jmarchan@redhat.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c261e7d9
    • V
      mm, documentation: clarify /proc/pid/status VmSwap limitations for shmem · bf9683d6
      Vlastimil Babka 提交于
      This series is based on Jerome Marchand's [1] so let me quote the first
      paragraph from there:
      
      There are several shortcomings with the accounting of shared memory
      (sysV shm, shared anonymous mapping, mapping to a tmpfs file).  The
      values in /proc/<pid>/status and statm don't allow to distinguish
      between shmem memory and a shared mapping to a regular file, even though
      their implications on memory usage are quite different: at reclaim, file
      mapping can be dropped or written back on disk while shmem needs a place
      in swap.  As for shmem pages that are swapped-out or in swap cache, they
      aren't accounted at all.
      
      The original motivation for myself is that a customer found (IMHO
      rightfully) confusing that e.g.  top output for process swap usage is
      unreliable with respect to swapped out shmem pages, which are not
      accounted for.
      
      The fundamental difference between private anonymous and shmem pages is
      that the latter has PTE's converted to pte_none, and not swapents.  As
      such, they are not accounted to the number of swapents visible e.g.  in
      /proc/pid/status VmSwap row.  It might be theoretically possible to use
      swapents when swapping out shmem (without extra cost, as one has to
      change all mappers anyway), and on swap in only convert the swapent for
      the faulting process, leaving swapents in other processes until they
      also fault (so again no extra cost).  But I don't know how many
      assumptions this would break, and it would be too disruptive change for
      a relatively small benefit.
      
      Instead, my approach is to document the limitation of VmSwap, and
      provide means to determine the swap usage for shmem areas for those who
      are interested and willing to pay the price, using /proc/pid/smaps.
      Because outside of ipcs, I don't think it's possible to currently to
      determine the usage at all.  The previous patchset [1] did introduce new
      shmem-specific fields into smaps output, and functions to determine the
      values.  I take a simpler approach, noting that smaps output already has
      a "Swap: X kB" line, where currently X == 0 always for shmem areas.  I
      think we can just consider this a bug and provide the proper value by
      consulting the radix tree, as e.g.  mincore_page() does.  In the patch
      changelog I explain why this is also not perfect (and cannot be without
      swapents), but still arguably much better than showing a 0.
      
      The last two patches are adapted from Jerome's patchset and provide a
      VmRSS breakdown to RssAnon, RssFile and RssShm in /proc/pid/status.
      Hugh noted that this is a welcome addition, and I agree that it might
      help e.g.  debugging process memory usage at albeit non-zero, but still
      rather low cost of extra per-mm counter and some page flag checks.
      
      [1] http://lwn.net/Articles/611966/
      
      This patch (of 6):
      
      The documentation for /proc/pid/status does not mention that the value
      of VmSwap counts only swapped out anonymous private pages, and not
      swapped out pages of the underlying shmem objects (for shmem mappings).
      This is not obvious, so document this limitation.
      Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJerome Marchand <jmarchan@redhat.com>
      Acked-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bf9683d6
    • A
      Make sure that highmem pages are not added to symlink page cache · e8ecde25
      Al Viro 提交于
      inode_nohighmem() is sufficient to make sure that page_get_link()
      won't try to allocate a highmem page.  Moreover, it is sufficient
      to make sure that page_symlink/__page_symlink won't do the same
      thing.  However, any filesystem that manually preseeds the symlink's
      page cache upon symlink(2) needs to make sure that the page it
      inserts there won't be a highmem one.
      
      Fortunately, only nfs and shmem have run afoul of that...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      e8ecde25
    • S
      Documentation: update libhugetlbfs site url · ceec86ec
      SeongJae Park 提交于
      The site for libhugetlbfs has moved from sourceforge to github. This
      commit updates the old url.
      Signed-off-by: NSeongJae Park <sj38.park@gmail.com>
      Acked-by: NMike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: NJonathan Corbet <corbet@lwn.net>
      ceec86ec
  13. 14 1月, 2016 1 次提交
  14. 04 1月, 2016 1 次提交
    • P
      configfs: implement binary attributes · 03607ace
      Pantelis Antoniou 提交于
      ConfigFS lacked binary attributes up until now. This patch
      introduces support for binary attributes in a somewhat similar
      manner of sysfs binary attributes albeit with changes that
      fit the configfs usage model.
      
      Problems that configfs binary attributes fix are everything that
      requires a binary blob as part of the configuration of a resource,
      such as bitstream loading for FPGAs, DTBs for dynamically created
      devices etc.
      
      Look at Documentation/filesystems/configfs/configfs.txt for internals
      and howto use them.
      
      This patch is against linux-next as of today that contains
      Christoph's configfs rework.
      Signed-off-by: NPantelis Antoniou <pantelis.antoniou@konsulko.com>
      [hch: folded a fix from Geert Uytterhoeven <geert+renesas@glider.be>]
      [hch: a few tiny updates based on review feedback]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      03607ace
  15. 31 12月, 2015 1 次提交
  16. 17 12月, 2015 1 次提交
  17. 11 12月, 2015 1 次提交
  18. 09 12月, 2015 2 次提交
    • A
      replace ->follow_link() with new method that could stay in RCU mode · 6b255391
      Al Viro 提交于
      new method: ->get_link(); replacement of ->follow_link().  The differences
      are:
      	* inode and dentry are passed separately
      	* might be called both in RCU and non-RCU mode;
      the former is indicated by passing it a NULL dentry.
      	* when called that way it isn't allowed to block
      and should return ERR_PTR(-ECHILD) if it needs to be called
      in non-RCU mode.
      
      It's a flagday change - the old method is gone, all in-tree instances
      converted.  Conversion isn't hard; said that, so far very few instances
      do not immediately bail out when called in RCU mode.  That'll change
      in the next commits.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6b255391
    • A
      don't put symlink bodies in pagecache into highmem · 21fc61c7
      Al Viro 提交于
      kmap() in page_follow_link_light() needed to go - allowing to hold
      an arbitrary number of kmaps for long is a great way to deadlocking
      the system.
      
      new helper (inode_nohighmem(inode)) needs to be used for pagecache
      symlinks inodes; done for all in-tree cases.  page_follow_link_light()
      instrumented to yell about anything missed.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      21fc61c7
  19. 05 12月, 2015 1 次提交
  20. 12 11月, 2015 1 次提交
  21. 10 11月, 2015 1 次提交
    • R
      coredump: add DAX filtering for ELF coredumps · 5037835c
      Ross Zwisler 提交于
      Add two new flags to the existing coredump mechanism for ELF files to
      allow us to explicitly filter DAX mappings.  This is desirable because
      DAX mappings, like hugetlb mappings, have the potential to be very
      large.
      
      Update the coredump_filter documentation in
      Documentation/filesystems/proc.txt so that it addresses the new DAX
      coredump flags.  Also update the documented default value of
      coredump_filter to be consistent with the core(5) man page.  The
      documentation being updated talks about bit 4, Dump ELF headers, which
      is enabled if CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is turned on in the
      kernel config.  This kernel config option defaults to "y" if both ELF
      binaries and coredump are enabled.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      5037835c
  22. 06 11月, 2015 4 次提交
  23. 03 11月, 2015 1 次提交
  24. 30 10月, 2015 1 次提交
  25. 19 10月, 2015 1 次提交
    • L
      ipconfig: send Client-identifier in DHCP requests · 26fb342c
      Li RongQing 提交于
      A dhcp server may provide parameters to a client from a pool of IP
      addresses and using a shared rootfs, or provide a specific set of
      parameters for a specific client, usually using the MAC address to
      identify each client individually. The dhcp protocol also specifies
      a client-id field which can be used to determine the correct
      parameters to supply when no MAC address is available. There is
      currently no way to tell the kernel to supply a specific client-id,
      only the userspace dhcp clients support this feature, but this can
      not be used when the network is needed before userspace is available
      such as when the root filesystem is on NFS.
      
      This patch is to be able to do something like "ip=dhcp,client_id_type,
      client_id_value", as a kernel parameter to enable the kernel to
      identify itself to the server.
      Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      26fb342c
  26. 14 10月, 2015 1 次提交