1. 17 7月, 2007 2 次提交
    • L
      hugetlbfs: handle empty options string · b4c07bce
      Lee Schermerhorn 提交于
      I was seeing a null pointer deref in fs/super.c:vfs_kern_mount().
      Some file system get_sb() handler was returning NULL mnt_sb with
      a non-negative return value.  I also noticed a "hugetlbfs: Bad
      mount option:" message in the log.
      
      Turns out that hugetlbfs_parse_options() was not checking for an
      empty option string after call to strsep().  On failure,
      hugetlbfs_parse_options() returns 1.  hugetlbfs_fill_super() just
      passed this return code back up the call stack where
      vfs_kern_mount() missed the error and proceeded with a NULL mnt_sb.
      
      Apparently introduced by patch:
      	hugetlbfs-use-lib-parser-fix-docs.patch
      
      The problem was exposed by this line in my fstab:
      
      none        /huge       hugetlbfs   defaults    0 0
      
      It can also be demonstrated by invoking mount of hugetlbfs
      directly with no options or a bogus option.
      
      This patch:
      
      1) adds the check for empty option to hugetlbfs_parse_options(),
      2) enhances the error message to bracket any unrecognized
         option with quotes ,
      3) modifies hugetlbfs_parse_options() to return -EINVAL on any
         unrecognized option,
      4) adds a BUG_ON() to vfs_kern_mount() to catch any get_sb()
         handler that returns a NULL mnt->mnt_sb with a return value
         >= 0.
      Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com>
      Acked-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b4c07bce
    • R
      hugetlbfs: use lib/parser, fix docs · e73a75fa
      Randy Dunlap 提交于
      Use lib/parser.c to parse hugetlbfs mount options.  Correct docs in
      hugetlbpage.txt.
      
      old size of hugetlbfs_fill_super:  675 bytes
      new size of hugetlbfs_fill_super:  686 bytes
      (hugetlbfs_parse_options() is inlined)
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: Adam Litke <agl@us.ibm.com>
      Acked-by: NWilliam Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e73a75fa
  2. 17 6月, 2007 1 次提交
    • E
      shm: fix the filename of hugetlb sysv shared memory · 9d66586f
      Eric W. Biederman 提交于
      Some user space tools need to identify SYSV shared memory when examining
      /proc/<pid>/maps.  To do so they look for a block device with major zero, a
      dentry named SYSV<sysv key>, and having the minor of the internal sysv
      shared memory kernel mount.
      
      To help these tools and to make it easier for people just browsing
      /proc/<pid>/maps this patch modifies hugetlb sysv shared memory to use the
      SYSV<key> dentry naming convention.
      
      User space tools will still have to be aware that hugetlb sysv shared
      memory lives on a different internal kernel mount and so has a different
      block device minor number from the rest of sysv shared memory.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Albert Cahalan <acahalan@gmail.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9d66586f
  3. 17 5月, 2007 1 次提交
    • C
      Remove SLAB_CTOR_CONSTRUCTOR · a35afb83
      Christoph Lameter 提交于
      SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a35afb83
  4. 08 5月, 2007 5 次提交
  5. 13 2月, 2007 2 次提交
  6. 10 2月, 2007 1 次提交
    • K
      [PATCH] hugetlb: preserve hugetlb pte dirty state · 6649a386
      Ken Chen 提交于
      __unmap_hugepage_range() is buggy that it does not preserve dirty state of
      huge_pte when unmapping hugepage range.  It causes data corruption in the
      event of dop_caches being used by sys admin.  For example, an application
      creates a hugetlb file, modify pages, then unmap it.  While leaving the
      hugetlb file alive, comes along sys admin doing a "echo 3 >
      /proc/sys/vm/drop_caches".
      
      drop_pagecache_sb() will happily free all pages that aren't marked dirty if
      there are no active mapping.  Later when application remaps the hugetlb
      file back and all data are gone, triggering catastrophic flip over on
      application.
      
      Not only that, the internal resv_huge_pages count will also get all messed
      up.  Fix it up by marking page dirty appropriately.
      Signed-off-by: NKen Chen <kenchen@google.com>
      Cc: "Nish Aravamudan" <nish.aravamudan@gmail.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: <stable@kernel.org>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6649a386
  7. 22 12月, 2006 1 次提交
    • L
      VM: Remove "clear_page_dirty()" and "test_clear_page_dirty()" functions · fba2591b
      Linus Torvalds 提交于
      They were horribly easy to mis-use because of their tempting naming, and
      they also did way more than any users of them generally wanted them to
      do.
      
      A dirty page can become clean under two circumstances:
      
       (a) when we write it out.  We have "clear_page_dirty_for_io()" for
           this, and that function remains unchanged.
      
           In the "for IO" case it is not sufficient to just clear the dirty
           bit, you also have to mark the page as being under writeback etc.
      
       (b) when we actually remove a page due to it becoming inaccessible to
           users, notably because it was truncate()'d away or the file (or
           metadata) no longer exists, and we thus want to cancel any
           outstanding dirty state.
      
      For the (b) case, we now introduce "cancel_dirty_page()", which only
      touches the page state itself, and verifies that the page is not mapped
      (since cancelling writes on a mapped page would be actively wrong as it
      is still accessible to users).
      
      Some filesystems need to be fixed up for this: CIFS, FUSE, JFS,
      ReiserFS, XFS all use the old confusing functions, and will be fixed
      separately in subsequent commits (with some of them just removing the
      offending logic, and others using clear_page_dirty_for_io()).
      
      This was confirmed by Martin Michlmayr to fix the apt database
      corruption on ARM.
      
      Cc: Martin Michlmayr <tbm@cyrius.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Andrei Popa <andrei.popa@i-neo.ro>
      Cc: Andrew Morton <akpm@osdl.org>
      Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      Cc: Gordon Farquharson <gordonfarquharson@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fba2591b
  8. 09 12月, 2006 1 次提交
  9. 08 12月, 2006 2 次提交
  10. 15 11月, 2006 1 次提交
    • H
      [PATCH] hugetlb: prepare_hugepage_range check offset too · 68589bc3
      Hugh Dickins 提交于
      (David:)
      
      If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example,
      because the given file offset is not hugepage aligned - then do_mmap_pgoff
      will go to the unmap_and_free_vma backout path.
      
      But at this stage the vma hasn't been marked as hugepage, and the backout path
      will call unmap_region() on it.  That will eventually call down to the
      non-hugepage version of unmap_page_range().  On ppc64, at least, that will
      cause serious problems if there are any existing hugepage pagetable entries in
      the vicinity - for example if there are any other hugepage mappings under the
      same PUD.  unmap_page_range() will trigger a bad_pud() on the hugepage pud
      entries.  I suspect this will also cause bad problems on ia64, though I don't
      have a machine to test it on.
      
      (Hugh:)
      
      prepare_hugepage_range() should check file offset alignment when it checks
      virtual address and length, to stop MAP_FIXED with a bad huge offset from
      unmapping before it fails further down.  PowerPC should apply the same
      prepare_hugepage_range alignment checks as ia64 and all the others do.
      
      Then none of the alignment checks in hugetlbfs_file_mmap are required (nor
      is the check for too small a mapping); but even so, move up setting of
      VM_HUGETLB and add a comment to warn of what David Gibson discovered - if
      hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region
      when unwinding from error will go the non-huge way, which may cause bad
      behaviour on architectures (powerpc and ia64) which segregate their huge
      mappings into a separate region of the address space.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Acked-by: NAdam Litke <agl@us.ibm.com>
      Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      68589bc3
  11. 29 10月, 2006 2 次提交
    • H
      [PATCH] hugetlb: fix prio_tree unit · 856fc295
      Hugh Dickins 提交于
      hugetlb_vmtruncate_list was misconverted to prio_tree: its prio_tree is in
      units of PAGE_SIZE (PAGE_CACHE_SIZE) like any other, not HPAGE_SIZE (whereas
      its radix_tree is kept in units of HPAGE_SIZE, otherwise slots would be
      absurdly sparse).
      
      At first I thought the error benign, just calling __unmap_hugepage_range on
      more vmas than necessary; but on 32-bit machines, when the prio_tree is
      searched correctly, it happens to ensure the v_offset calculation won't
      overflow.  As it stood, when truncating at or beyond 4GB, it was liable to
      discard pages COWed from lower offsets; or even to clear pmd entries of
      preceding vmas, triggering exit_mmap's BUG_ON(nr_ptes).
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      856fc295
    • H
      [PATCH] hugetlb: fix size=4G parsing · b9d7e6ae
      Hugh Dickins 提交于
      On 32-bit machines, mount -t hugetlbfs -o size=4G gave a 0GB filesystem,
      size=5G gave a 1GB filesystem etc: there's no point in masking size with
      HPAGE_MASK just before shifting its lower bits away, and since HPAGE_MASK is a
      UL, that removed all the higher bits of the unsigned long long size.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      b9d7e6ae
  12. 12 10月, 2006 1 次提交
  13. 01 10月, 2006 1 次提交
  14. 30 9月, 2006 1 次提交
  15. 27 9月, 2006 1 次提交
  16. 11 7月, 2006 1 次提交
  17. 29 6月, 2006 1 次提交
  18. 23 6月, 2006 3 次提交
    • C
      [PATCH] tightening hugetlb strict accounting · a43a8c39
      Chen, Kenneth W 提交于
      Current hugetlb strict accounting for shared mapping always assume mapping
      starts at zero file offset and reserves pages between zero and size of the
      file.  This assumption often reserves (or lock down) a lot more pages then
      necessary if application maps at none zero file offset.  libhugetlbfs is
      one example that requires proper reservation on shared mapping starts at
      none zero offset.
      
      This patch extends the reservation and hugetlb strict accounting to support
      any arbitrary pair of (offset, len), resulting a much more robust and
      accurate scheme.  More importantly, it won't lock down any hugetlb pages
      outside file mapping.
      Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
      Acked-by: NAdam Litke <agl@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      a43a8c39
    • D
      [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry · 726c3342
      David Howells 提交于
      Give the statfs superblock operation a dentry pointer rather than a superblock
      pointer.
      
      This complements the get_sb() patch.  That reduced the significance of
      sb->s_root, allowing NFS to place a fake root there.  However, NFS does
      require a dentry to use as a target for the statfs operation.  This permits
      the root in the vfsmount to be used instead.
      
      linux/mount.h has been added where necessary to make allyesconfig build
      successfully.
      
      Interest has also been expressed for use with the FUSE and XFS filesystems.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      726c3342
    • D
      [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells 提交于
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience function is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing are currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      454e2398
  19. 29 3月, 2006 1 次提交
  20. 22 3月, 2006 2 次提交
    • C
      [PATCH] convert hugetlbfs_counter to atomic · bba1e9b2
      Chen, Kenneth W 提交于
      Implementation of hugetlbfs_counter() is functionally equivalent to
      atomic_inc_return().  Use the simpler atomic form.
      Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      bba1e9b2
    • D
      [PATCH] hugepage: Strict page reservation for hugepage inodes · b45b5bd6
      David Gibson 提交于
      These days, hugepages are demand-allocated at first fault time.  There's a
      somewhat dubious (and racy) heuristic when making a new mmap() to check if
      there are enough available hugepages to fully satisfy that mapping.
      
      A particularly obvious case where the heuristic breaks down is where a
      process maps its hugepages not as a single chunk, but as a bunch of
      individually mmap()ed (or shmat()ed) blocks without touching and
      instantiating the pages in between allocations.  In this case the size of
      each block is compared against the total number of available hugepages.
      It's thus easy for the process to become overcommitted, because each block
      mapping will succeed, although the total number of hugepages required by
      all blocks exceeds the number available.  In particular, this defeats such
      a program which will detect a mapping failure and adjust its hugepage usage
      downward accordingly.
      
      The patch below addresses this problem, by strictly reserving a number of
      physical hugepages for hugepage inodes which have been mapped, but not
      instatiated.  MAP_SHARED mappings are thus "safe" - they will fail on
      mmap(), not later with an OOM SIGKILL.  MAP_PRIVATE mappings can still
      trigger an OOM.  (Actually SHARED mappings can technically still OOM, but
      only if the sysadmin explicitly reduces the hugepage pool between mapping
      and instantiation)
      
      This patch appears to address the problem at hand - it allows DB2 to start
      correctly, for instance, which previously suffered the failure described
      above.
      
      This patch causes no regressions on the libhugetblfs testsuite, and makes a
      test (designed to catch this problem) pass which previously failed (ppc64,
      POWER5).
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      b45b5bd6
  21. 02 2月, 2006 1 次提交
  22. 15 1月, 2006 1 次提交
    • R
      [PATCH] Add tmpfs options for memory placement policies · 7339ff83
      Robin Holt 提交于
      Anything that writes into a tmpfs filesystem is liable to disproportionately
      decrease the available memory on a particular node.  Since there's no telling
      what sort of application (e.g.  dd/cp/cat) might be dropping large files
      there, this lets the admin choose the appropriate default behavior for their
      site's situation.
      
      Introduce a tmpfs mount option which allows specifying a memory policy and
      a second option to specify the nodelist for that policy.  With the default
      policy, tmpfs will behave as it does today.  This patch adds support for
      preferred, bind, and interleave policies.
      
      The default policy will cause pages to be added to tmpfs files on the node
      which is doing the writing.  Some jobs expect a single process to create
      and manage the tmpfs files.  This results in a node which has a
      significantly reduced number of free pages.
      
      With this patch, the administrator can specify the policy and nodes for
      that policy where they would prefer allocations.
      
      This patch was originally written by Brent Casavant and Hugh Dickins.  I
      added support for the bind and preferred policies and the mpol_nodelist
      mount option.
      Signed-off-by: NBrent Casavant <bcasavan@sgi.com>
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NRobin Holt <holt@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      7339ff83
  23. 12 1月, 2006 1 次提交
  24. 10 1月, 2006 1 次提交
  25. 07 1月, 2006 1 次提交
    • D
      [PATCH] Hugetlb: Copy on Write support · 1e8f889b
      David Gibson 提交于
      Implement copy-on-write support for hugetlb mappings so MAP_PRIVATE can be
      supported.  This helps us to safely use hugetlb pages in many more
      applications.  The patch makes the following changes.  If needed, I also have
      it broken out according to the following paragraphs.
      
      1. Add a pair of functions to set/clear write access on huge ptes.  The
         writable check in make_huge_pte is moved out to the caller for use by COW
         later.
      
      2. Hugetlb copy-on-write requires special case handling in the following
         situations:
      
         - copy_hugetlb_page_range() - Copied pages must be write protected so
           a COW fault will be triggered (if necessary) if those pages are written
           to.
      
         - find_or_alloc_huge_page() - Only MAP_SHARED pages are added to the
           page cache.  MAP_PRIVATE pages still need to be locked however.
      
      3. Provide hugetlb_cow() and calls from hugetlb_fault() and
         hugetlb_no_page() which handles the COW fault by making the actual copy.
      
      4. Remove the check in hugetlbfs_file_map() so that MAP_PRIVATE mmaps
         will be allowed.  Make MAP_HUGETLB exempt from the depricated VM_RESERVED
         mapping check.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NAdam Litke <agl@us.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "Seth, Rohit" <rohit.seth@intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1e8f889b
  26. 23 11月, 2005 1 次提交
    • D
      [PATCH] Fix hugetlbfs_statfs() reporting of block limits · 74a8a65c
      David Gibson 提交于
      Currently, if a hugetlbfs is mounted without limits (the default), statfs()
      will return -1 for max/free/used blocks.  This does not appear to be in
      line with normal convention: simple_statfs() and shmem_statfs() both return
      0 in similar cases.  Worse, it confuses the translation logic in
      put_compat_statfs(), causing it to return -EOVERFLOW on such a mount.
      
      This patch alters hugetlbfs_statfs() to return 0 for max/free/used blocks
      on a mount without limits.  Note that we need the test in the patch below,
      rather than just using 0 in the sbinfo structure, because the -1 marked in
      the free blocks field is used internally to tell the
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      74a8a65c
  27. 09 11月, 2005 1 次提交
  28. 30 10月, 2005 2 次提交
    • A
      [PATCH] hugetlb: overcommit accounting check · 2e9b367c
      Adam Litke 提交于
      Basic overcommit checking for hugetlb_file_map() based on an implementation
      used with demand faulting in SLES9.
      
      Since demand faulting can't guarantee the availability of pages at mmap
      time, this patch implements a basic sanity check to ensure that the number
      of huge pages required to satisfy the mmap are currently available.
      Despite the obvious race, I think it is a good start on doing proper
      accounting.  I'd like to work towards an accounting system that mimics the
      semantics of normal pages (especially for the MAP_PRIVATE/COW case).  That
      work is underway and builds on what this patch starts.
      
      Huge page shared memory segments are simpler and still maintain their
      commit on shmget semantics.
      Signed-off-by: NAdam Litke <agl@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      2e9b367c
    • A
      [PATCH] hugetlb: demand fault handler · 4c887265
      Adam Litke 提交于
      Below is a patch to implement demand faulting for huge pages.  The main
      motivation for changing from prefaulting to demand faulting is so that huge
      page memory areas can be allocated according to NUMA policy.
      
      Thanks to consolidated hugetlb code, switching the behavior requires changing
      only one fault handler.  The bulk of the patch just moves the logic from
      hugelb_prefault() to hugetlb_pte_fault() and find_get_huge_page().
      Signed-off-by: NAdam Litke <agl@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      4c887265