1. 18 Jul, 2011 3 commits
  2. 15 Jul, 2011 5 commits
  3. 13 Jul, 2011 10 commits
  4. 28 Jun, 2011 7 commits
    • mm: fix assertion mapping->nrpages == 0 in end_writeback() · 08142579
      Authored by Jan Kara
      Under heavy memory and filesystem load, users observe the assertion
      mapping->nrpages == 0 in end_writeback() trigger.  This can be caused by
      page reclaim reclaiming the last page from a mapping in the following
      race:
      
      	CPU0				CPU1
        ...
        shrink_page_list()
          __remove_mapping()
            __delete_from_page_cache()
              radix_tree_delete()
      					evict_inode()
      					  truncate_inode_pages()
      					    truncate_inode_pages_range()
      					      pagevec_lookup() - finds nothing
      					  end_writeback()
      					    mapping->nrpages != 0 -> BUG
              page->mapping = NULL
              mapping->nrpages--
      
      Fix the problem by doing a reliable check of mapping->nrpages under
      mapping->tree_lock in end_writeback().
      
      Analyzed by Jay <jinshan.xiong@whamcloud.com>, lost in LKML, and dug out
      by Miklos Szeredi <mszeredi@suse.de>.
      
      Cc: Jay <jinshan.xiong@whamcloud.com>
      Cc: Miklos Szeredi <mszeredi@suse.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
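The idea behind the fix can be sketched in userspace: the eviction side only trusts the page count after taking the same lock that the reclaim side holds while it removes the page and decrements the counter. This is a minimal model, not the kernel patch; `struct mapping_model`, the `model_*` helpers, and the C11 `atomic_flag` spinlock standing in for `mapping->tree_lock` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct address_space. */
struct mapping_model {
    atomic_flag tree_lock;   /* models mapping->tree_lock */
    unsigned long nrpages;   /* models mapping->nrpages */
};

static void model_init(struct mapping_model *m, unsigned long nrpages)
{
    atomic_flag_clear(&m->tree_lock);
    m->nrpages = nrpages;
}

static void model_lock(struct mapping_model *m)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&m->tree_lock,
                                             memory_order_acquire))
        ;
}

static void model_unlock(struct mapping_model *m)
{
    atomic_flag_clear_explicit(&m->tree_lock, memory_order_release);
}

/* Reclaim side: tree removal and counter update share one critical section. */
static void model_remove_last_page(struct mapping_model *m)
{
    model_lock(m);
    assert(m->nrpages > 0);
    /* the radix tree deletion would happen here */
    m->nrpages--;
    model_unlock(m);
}

/* Eviction side: read the counter only while holding the lock, so a
 * half-finished removal on another CPU can never be observed. */
static bool model_mapping_is_empty(struct mapping_model *m)
{
    bool empty;

    model_lock(m);
    empty = (m->nrpages == 0);
    model_unlock(m);
    return empty;
}
```

Because the reader cycles the lock, it either runs before the removal starts or after the counter has been decremented, never in between.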
    • include/linux/compat.h: declare compat_sys_sendmmsg() · 507c5f12
      Authored by Chris Metcalf
      This is required for tilegx to be able to use the compat unistd.h header
      where compat_sys_sendmmsg() is now mentioned.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tmpfs: add shmem_read_mapping_page_gfp · d9d90e5e
      Authored by Hugh Dickins
      Although it is used (by i915) on nothing but tmpfs, read_cache_page_gfp()
      is unsuited to tmpfs, because it inserts a page into pagecache before
      calling the filesystem's ->readpage: tmpfs may have pages in swapcache
      which only it knows how to locate and switch to filecache.
      
      At present tmpfs provides a ->readpage method, and copes with this by
      copying pages; but soon we can simplify it by removing its ->readpage.
      Provide shmem_read_mapping_page_gfp() now, ready for that transition.
      
      Export shmem_read_mapping_page_gfp() and add it to the list in shmem_fs.h,
      with shmem_read_mapping_page() inline for the common mapping_gfp case.
      
      (shmem_read_mapping_page_gfp or shmem_read_cache_page_gfp? Generally the
      read_mapping_page functions use the mapping's ->readpage, and the
      read_cache_page functions use the supplied filler, so I think
      read_cache_page_gfp was slightly misnamed.)
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tmpfs: take control of its truncate_range · 94c1e62d
      Authored by Hugh Dickins
      2.6.35's new truncate convention gave tmpfs the opportunity to control
      its file truncation, no longer enforced from outside by vmtruncate().
      We shall want to build upon that, to handle pagecache and swap together.
      
      Slightly redefine the ->truncate_range interface: let it now be called
      between the unmap_mapping_range()s, with the filesystem responsible for
      doing the truncate_inode_pages_range() from it - just as the filesystem
      is nowadays responsible for doing that from its ->setattr.
      
      Let's rename shmem_notify_change() to shmem_setattr().  Instead of
      calling the generic truncate_setsize(), bring that code in so we can
      call shmem_truncate_range() - which will later be updated to perform its
      own variant of truncate_inode_pages_range().
      
      Remove the punch_hole unmap_mapping_range() from shmem_truncate_range():
      now that the COW's unmap_mapping_range() comes after ->truncate_range,
      there is no need to call it a third time.
      
      Export shmem_truncate_range() and add it to the list in shmem_fs.h, so
      that i915_gem_object_truncate() can call it explicitly in future; get
      this patch in first, then update drm/i915 once this is available (until
      then, i915 will just be doing the truncate_inode_pages() twice).
      
      Though introduced five years ago, no other filesystem is implementing
      ->truncate_range, and its only other user is madvise(,,MADV_REMOVE): we
      expect to convert it to fallocate(,FALLOC_FL_PUNCH_HOLE,,) shortly,
      whereupon ->truncate_range can be removed from inode_operations -
      shmem_truncate_range() will help i915 across that transition too.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: move shmem prototypes to shmem_fs.h · 072441e2
      Authored by Hugh Dickins
      Before adding any more global entry points into shmem.c, gather such
      prototypes into shmem_fs.h.  Remove mm's own declarations from swap.h,
      but for now leave the ones in mm.h: because shmem_file_setup() and
      shmem_zero_setup() are called from various places, and we should not
      force other subsystems to update immediately.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Fix some kernel-doc warnings · 4d258b25
      Authored by Vitaliy Ivanov
      Fix 'make htmldocs' warnings:
      
        Warning(/include/linux/hrtimer.h:153): No description found for parameter 'clockid'
        Warning(/include/linux/device.h:604): Excess struct/union/enum/typedef member 'of_match' description in 'device'
        Warning(/include/net/sock.h:349): Excess struct/union/enum/typedef member 'sk_rmem_alloc' description in 'sock'
      Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Fix node_start/end_pfn() definition for mm/page_cgroup.c · c6830c22
      Authored by KAMEZAWA Hiroyuki
      Commit 21a3c964 uses node_start/end_pfn(nid) to detect the start and
      end of nodes. However, these macros are not defined in linux/mmzone.h
      but in /arch/???/include/mmzone.h, which is included only under
      CONFIG_NEED_MULTIPLE_NODES=y.
      
      As a result, we see:
        mm/page_cgroup.c: In function 'page_cgroup_init':
        mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
        mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'
      
      So fixing page_cgroup.c is one option...
      
      But node_start_pfn()/node_end_pfn() are very generic macros and
      should be implemented in the same manner for all archs.
      (m32r has a different implementation...)
      
      This patch removes the per-arch definitions of node_start/end_pfn()
      and defines a unified one in linux/mmzone.h. It is no longer under
      CONFIG_NEED_MULTIPLE_NODES.
      
      The result of macro expansion in mm/page_cgroup.c is:
      
      for !NUMA
        start_pfn = ((&contig_page_data)->node_start_pfn);
        end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});
      
      for NUMA (x86-64)
        start_pfn = ((node_data[nid])->node_start_pfn);
        end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});
      
      Changelog:
       - fixed to avoid using "nid" twice in node_end_pfn() macro.
      Reported-and-acked-by: Randy Dunlap <randy.dunlap@oracle.com>
      Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
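The changelog note about not using "nid" twice can be illustrated with a small userspace model of the macro shape shown in the expansion above. The names `pgdat_model_t`, `node_data_model`, and the `model_*` macros are hypothetical; the sketch relies on GNU C statement expressions (supported by GCC and Clang), as the kernel macro does.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for pg_data_t. */
typedef struct {
    unsigned long node_start_pfn;
    unsigned long node_spanned_pages;
} pgdat_model_t;

/* Two made-up nodes: pfns 0..999 and 1000..1499. */
static pgdat_model_t node_data_model[2] = {
    { 0, 1000 },
    { 1000, 500 },
};

#define NODE_DATA_MODEL(nid) (&node_data_model[(nid)])

#define model_node_start_pfn(nid) (NODE_DATA_MODEL(nid)->node_start_pfn)

/* Statement expression: the argument is expanded exactly once, into the
 * initializer of a local pointer, so side effects in it happen once. */
#define model_node_end_pfn(nid) ({                              \
    pgdat_model_t *pgdat_ = NODE_DATA_MODEL(nid);               \
    pgdat_->node_start_pfn + pgdat_->node_spanned_pages;        \
})

/* For contrast only: this variant expands its argument twice, so an
 * argument such as i++ would be evaluated twice. Do not use it that way. */
#define naive_node_end_pfn(nid)                                 \
    (NODE_DATA_MODEL(nid)->node_start_pfn +                     \
     NODE_DATA_MODEL(nid)->node_spanned_pages)

/* Number of pfns spanned by a node in this model. */
static unsigned long model_node_span(int nid)
{
    assert(nid >= 0 && nid < 2);
    return model_node_end_pfn(nid) - model_node_start_pfn(nid);
}
```

With the statement-expression form, a call such as `model_node_end_pfn(nid++)` increments `nid` once, which is the property the changelog entry refers to.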
  5. 22 Jun, 2011 2 commits
    • PM: Fix async resume following suspend failure · 6d0e0e84
      Authored by Alan Stern
      The PM core doesn't handle suspend failures correctly when it comes to
      asynchronously suspended devices.  These devices are moved onto the
      dpm_suspended_list as soon as the corresponding async thread is
      started up, and they remain on the list even if they fail to suspend
      or the sleep transition is cancelled before they get suspended.  As a
      result, when the PM core unwinds the transition, it tries to resume
      the devices even though they were never suspended.
      
      This patch (as1474) fixes the problem by adding a new "is_suspended"
      flag to dev_pm_info.  Devices are resumed only if the flag is set.
      
      [rjw:
       * Moved the dev->power.is_suspended check into device_resume(),
         because we need to complete dev->power.completion and clear
         dev->power.is_prepared too for devices whose
         dev->power.is_suspended flags are unset.
       * Fixed __device_suspend() to avoid setting dev->power.is_suspended
         if async_error is different from zero.]
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@kernel.org
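The unwinding behavior described above can be modeled in a few lines of userspace C. This is a simplified, synchronous sketch under stated assumptions: `struct dev_model`, its `will_fail` and `resume_calls` fields, and the `model_*` functions are hypothetical and only mirror the roles of `is_prepared` and `is_suspended` from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device plus its dev_pm_info flags. */
struct dev_model {
    bool will_fail;     /* test knob: force the suspend callback to fail */
    bool is_prepared;
    bool is_suspended;
    int resume_calls;   /* how often the resume callback actually ran */
};

static int model_device_suspend(struct dev_model *dev)
{
    if (dev->will_fail)
        return -1;              /* flag stays clear on failure */
    dev->is_suspended = true;
    return 0;
}

static void model_device_resume(struct dev_model *dev)
{
    /* The check lives inside resume so the bookkeeping below still runs
     * for devices that never reached the suspended state. */
    if (dev->is_suspended) {
        dev->resume_calls++;
        dev->is_suspended = false;
    }
    dev->is_prepared = false;
}

/* Prepare all devices, suspend until the first failure, then unwind. */
static int model_transition(struct dev_model *devs, size_t n)
{
    int error = 0;
    size_t i;

    assert(devs != NULL);
    for (i = 0; i < n; i++)
        devs[i].is_prepared = true;
    for (i = 0; i < n && !error; i++)
        error = model_device_suspend(&devs[i]);
    if (error)
        for (i = 0; i < n; i++)
            model_device_resume(&devs[i]);
    return error;
}
```

In this model, a failure on the second of three devices leads to exactly one resume callback (for the first device), while all three devices have `is_prepared` cleared.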
    • PM: Rename dev_pm_info.in_suspend to is_prepared · f76b168b
      Authored by Alan Stern
      This patch (as1473) renames the "in_suspend" field in struct
      dev_pm_info to "is_prepared", in preparation for an upcoming change.
      The new name is more descriptive of what the field really means.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@kernel.org
  6. 21 Jun, 2011 2 commits
    • vfs: i_state needs to be 'unsigned long' for now · 79568f5b
      Authored by Linus Torvalds
      Commit 13e12d14 ("vfs: reorganize 'struct inode' layout a bit")
      moved things around a bit and changed i_state to be unsigned int instead of
      unsigned long.  That was to help structure layout for the 64-bit case,
      and shrink 'struct inode' a bit (admittedly that only happened when
      spinlock debugging was on and i_flags didn't pack with i_lock).
      
      However, Meelis Roos reports that this results in unaligned exceptions
      on sparc, and it turns out that the bit-locking primitives that we use
      for the I_NEW bit want to use the bitops.  Which want 'unsigned long',
      not 'unsigned int'.
      
      We really should fix the bit locking code to not have that kind of
      requirement, but that's a much bigger change.  So for now, revert that
      field back to 'unsigned long' (but keep the other re-ordering changes
      from the commit that caused this).
      
      Andi points out that we have played games with this in 'struct page', so
      it's solvable with other hacks too, but since right now the struct inode
      size advantage only happens with some rare config options, it's not
      worth fighting.
      
      It _would_ be worth fixing the bitlocking code, though.  Especially
      since there is no type safety in the bitlocking code (this never caused
      any warnings, and worked fine on x86-64, because the bitlocks take a
      'void *' and x86-64 doesn't care that deeply about alignment).  So it's
      currently a very easy problem to trigger by mistake and never notice.
      Reported-by: Meelis Roos <mroos@linux.ee>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
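The type-safety point can be sketched with userspace bit helpers that take `unsigned long *` instead of `void *`. This is an illustrative, non-atomic model and not the kernel's bitops; `struct inode_model`, `MODEL_I_NEW_BIT`, and the `model_*` functions are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_I_NEW_BIT 3U   /* hypothetical bit number for this sketch */

/* Hypothetical stand-in for the relevant part of struct inode. */
struct inode_model {
    unsigned long i_state;   /* word-sized, as the bit helpers expect */
};

/* Takes unsigned long * rather than void *, so passing the address of an
 * unsigned int field is diagnosed by the compiler instead of silently
 * producing a wider (and possibly misaligned) access. Not atomic. */
static bool model_test_and_set_bit(unsigned int nr, unsigned long *addr)
{
    unsigned long *word;
    unsigned long mask;
    bool old;

    assert(addr != NULL);
    word = addr + nr / MODEL_BITS_PER_LONG;
    mask = 1UL << (nr % MODEL_BITS_PER_LONG);
    old = (*word & mask) != 0;
    *word |= mask;
    return old;
}

static void model_clear_bit(unsigned int nr, unsigned long *addr)
{
    assert(addr != NULL);
    addr[nr / MODEL_BITS_PER_LONG] &= ~(1UL << (nr % MODEL_BITS_PER_LONG));
}
```

Every access in these helpers is a full `unsigned long` load or store, which is why the field they operate on needs that type and alignment.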
    • NFSv4.1: file layout must consider pg_bsize for coalescing · 19345cb2
      Authored by Benny Halevy
      Otherwise we end up overflowing the rpc buffer size on the receive end.
      Signed-off-by: Benny Halevy <benny@tonian.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  7. 20 Jun, 2011 2 commits
  8. 19 Jun, 2011 1 commit
  9. 18 Jun, 2011 2 commits
  10. 17 Jun, 2011 2 commits
  11. 16 Jun, 2011 4 commits
    • gpio: add GPIOF_ values regardless on kconfig settings · c001fb72
      Authored by Randy Dunlap
      Make GPIOF_ defined values available even when neither GPIOLIB nor
      GENERIC_GPIO is enabled by moving them to <linux/gpio.h>.
      
      Fixes these build errors in linux-next:
      sound/soc/codecs/ak4641.c:524: error: 'GPIOF_OUT_INIT_LOW' undeclared (first use in this function)
      sound/soc/codecs/wm8915.c:2921: error: 'GPIOF_OUT_INIT_LOW' undeclared (first use in this function)
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
    • uts: make default hostname configurable, rather than always using "(none)" · bd5dc17b
      Authored by Josh Triplett
      The "hostname" tool falls back to setting the hostname to "localhost" if
      /etc/hostname does not exist.  Distribution init scripts have the same
      fallback.  However, if userspace never calls sethostname, such as when
      booting with init=/bin/sh, or otherwise booting a minimal system without
      the usual init scripts, the default hostname of "(none)" remains,
      unhelpfully appearing in various places such as prompts ("root@(none):~#")
      and logs.  Furthermore, "(none)" doesn't typically resolve to anything
      useful.
      
      Make the default hostname configurable.  This removes the need for the
      standard fallback, provides a useful default for systems that never call
      sethostname, and makes minimal systems that much more useful with less
      configuration.  Distributions could choose to use "localhost" here to
      avoid the fallback, while embedded systems may wish to use a specific
      target hostname.
      Signed-off-by: Josh Triplett <josh@joshtriplett.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: David Miller <davem@davemloft.net>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Cc: Kel Modderman <kel@otaku42.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
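A build-time configurable default of this kind can be sketched as a macro with a fallback. The macro name `MODEL_DEFAULT_HOSTNAME`, the `struct uts_model` type, and the 65-byte buffer are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical build-time knob: override with -DMODEL_DEFAULT_HOSTNAME=... */
#ifndef MODEL_DEFAULT_HOSTNAME
#define MODEL_DEFAULT_HOSTNAME "(none)"
#endif

/* Hypothetical stand-in for the nodename part of the UTS data. */
struct uts_model {
    char nodename[65];
};

/* Seed the hostname with the configured default; userspace may later
 * replace it, but a system that never does still has a sensible value. */
static void model_init_uts(struct uts_model *uts)
{
    assert(uts != NULL);
    assert(sizeof(MODEL_DEFAULT_HOSTNAME) <= sizeof(uts->nodename));
    strcpy(uts->nodename, MODEL_DEFAULT_HOSTNAME);
}
```

A distribution-style build of this sketch could pass `-DMODEL_DEFAULT_HOSTNAME='"localhost"'`, while an embedded build could pass its target hostname.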
    • BUILD_BUG_ON_ZERO: fix sparse breakage · ca39599c
      Authored by Dr. David Alan Gilbert
      BUILD_BUG_ON_ZERO and BUILD_BUG_ON_NULL must return values, even in the
      CHECKER case; otherwise various users of them become syntactically invalid.
      Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
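Why such a macro has to yield a value in every configuration can be seen from a userspace sketch in which it is used as an operand. The `MODEL_*` names and the `MODEL_CHECKER` switch are hypothetical; the negative-width bit-field trick and the zero-sized result rely on GCC/Clang behavior in C.

```c
#include <assert.h>

#ifdef MODEL_CHECKER
/* Checker build: no compile-time check, but still an expression. If this
 * expanded to nothing, "x + MODEL_BUILD_BUG_ON_ZERO(e)" would not parse. */
#define MODEL_BUILD_BUG_ON_ZERO(e) (0)
#else
/* A true condition gives a bit-field of width -1, which fails to compile.
 * A false condition gives width 0, and the struct has size 0 (GNU C). */
#define MODEL_BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
#endif

/* The check is embedded as an operand, so it must evaluate to zero. */
#define MODEL_FIELD_MASK(bits) \
    (((1UL << (bits)) - 1) + MODEL_BUILD_BUG_ON_ZERO((bits) >= 32))

/* Usage example: the low four bits of a value. */
static unsigned long model_low_nibble(unsigned long v)
{
    assert(MODEL_FIELD_MASK(4) == 0xfUL);
    return v & MODEL_FIELD_MASK(4);
}
```

Writing `MODEL_FIELD_MASK(40)` in a non-checker build of this sketch would stop compilation, while valid widths add zero to the mask.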
    • mm: increase RECLAIM_DISTANCE to 30 · 32e45ff4
      Authored by KOSAKI Motohiro
      Recently, Robert Mueller reported (http://lkml.org/lkml/2010/9/12/236)
      that zone_reclaim_mode doesn't work properly on his new NUMA server (Dual
      Xeon E5520 + Intel S5520UR MB).  He is using Cyrus IMAPd and it's built on
      a very traditional single-process model.
      
        * a master process which reads config files and manages the other
          processes
        * multiple imapd processes, one per connection
        * multiple pop3d processes, one per connection
        * multiple lmtpd processes, one per connection
        * periodic "cleanup" processes.
      
      There are thousands of independent processes.  The problem is that recent
      Intel motherboards turn on zone_reclaim_mode by default, and traditional
      prefork-model software doesn't work well with it.  Unfortunately, such
      models are still typical even in the 21st century.  We can't ignore them.
      
      This patch raises the zone_reclaim_mode threshold to 30.  The value 30
      has no specific meaning, but 20 corresponds to one-hop QPI/HyperTransport,
      and such relatively cheap 2-4 socket machines are often used for
      traditional servers like the one above.  The intention is that these
      machines don't use zone_reclaim_mode.
      
      Note: ia64 and Power have arch-specific RECLAIM_DISTANCE definitions.
      This patch doesn't change the behavior of such high-end NUMA machines.
      
      Dave Hansen said:
      
      : I know specifically of pieces of x86 hardware that set the information
      : in the BIOS to '21' *specifically* so they'll get the zone_reclaim_mode
      : behavior which that implies.
      :
      : They've done performance testing and run very large and scary benchmarks
      : to make sure that they _want_ this turned on.  What this means for them
      : is that they'll probably be de-optimized, at least on newer versions of
      : the kernel.
      :
      : If you want to do this for particular systems, maybe _that_'s what we
      : should do.  Have a list of specific configurations that need the
      : defaults overridden either because they're buggy, or they have an
      : unusual hardware configuration not really reflected in the distance
      : table.
      
      And later said:
      
      : The original change in the hardware tables was for the benefit of a
      : benchmark.  Said benchmark isn't going to get run on mainline until the
      : next batch of enterprise distros drops, at which point the hardware where
      : this was done will be irrelevant for the benchmark.  I'm sure any new
      : hardware will just set this distance to another yet arbitrary value to
      : make the kernel do what it wants.  :)
      :
      : Also, when the hardware got _set_ to this initially, I complained.  So, I
      : guess I'm getting my way now, with this patch.  I'm cool with it.
      Reported-by: Robert Mueller <robm@fastmail.fm>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
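The effect of the new threshold on a machine reporting a distance of 21 can be sketched as follows. This model assumes that zone reclaim is switched on when any node distance strictly exceeds the threshold; that comparison, the `MODEL_*` constants, and `model_zone_reclaim_enabled()` are assumptions for illustration, not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_OLD_RECLAIM_DISTANCE 20   /* threshold before the patch */
#define MODEL_NEW_RECLAIM_DISTANCE 30   /* threshold after the patch */

/* Returns true when at least one remote node is farther away than the
 * threshold, which in this model turns zone reclaim on by default. */
static bool model_zone_reclaim_enabled(const int *distances, size_t n,
                                       int threshold)
{
    size_t i;

    assert(distances != NULL);
    for (i = 0; i < n; i++)
        if (distances[i] > threshold)
            return true;
    return false;
}
```

For a two-node box whose distance table is {10, 21}, the old threshold enables zone reclaim and the new one does not; a larger machine with a distance such as 40 keeps it enabled under both.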