1. 10 6月, 2009 1 次提交
    • L
      Make /dev/zero reads interruptible by signals · 2b838687
      Linus Torvalds 提交于
      This helps with bad latencies for large reads from /dev/zero, but might
      conceivably break some application that "knows" that a read of /dev/zero
      cannot return early.  So do this early in the merge window to give us
      maximal test coverage, even if the patch is totally trivial.
      
      Obviously, no well-behaved application should ever depend on the read
      being uninterruptible, but hey, bugs happen.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2b838687
  2. 05 6月, 2009 1 次提交
    • S
      drivers/char/mem.c: avoid OOM lockup during large reads from /dev/zero · 730c586a
      Salman Qazi 提交于
      While running 20 parallel instances of dd as follows:
      
        #!/bin/bash
        for i in `seq 1 20`; do
                 dd if=/dev/zero of=/export/hda3/dd_$i bs=1073741824 count=1 &
        done
        wait
      
      on a 16G machine, we noticed that rather than just killing the processes,
      the entire kernel went down.  Stracing dd reveals that it first does an
      mmap2, which makes 1GB worth of zero page mappings.  Then it performs a
      read on those pages from /dev/zero, and finally it performs a write.
      
      The machine died during the reads.  Looking at the code, it was noticed
      that /dev/zero's read operation had been changed by
      557ed1fa ("remove ZERO_PAGE") from giving
      zero page mappings to actually zeroing the page.
      
      The zeroing of the pages causes physical pages to be allocated to the
      process.  But, when the process exhausts all the memory that it can, the
      kernel cannot kill it, as it is still in the kernel mode allocating more
      memory.  Consequently, the kernel eventually crashes.
      
      To fix this, I propose that when a fatal signal is pending during
      /dev/zero read operation, we simply return and let the user process die.
      Signed-off-by: NSalman Qazi <sqazi@google.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      [ Modified error return and comment trivially.  - Linus]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      730c586a
  3. 10 4月, 2009 1 次提交
  4. 07 1月, 2009 1 次提交
  5. 17 10月, 2008 1 次提交
  6. 25 7月, 2008 1 次提交
  7. 22 7月, 2008 1 次提交
  8. 20 7月, 2008 1 次提交
    • I
      Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM · d092633b
      Ingo Molnar 提交于
      From: Arjan van de Ven <arjan@infradead.org>
      Date: Sat, 19 Jul 2008 15:47:17 -0700
      
      CONFIG_NONPROMISC_DEVMEM was a rather confusing name - but renaming it
      to CONFIG_PROMISC_DEVMEM causes problems on architectures that do not
      support this feature; this patch renames it to CONFIG_STRICT_DEVMEM,
      so that architectures can opt-in into it.
      
      ( the polarity of the option is still the same as it was originally; it
        needs to be for now to not break architectures that don't have the
        infastructure yet to support this feature)
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: "V.Radhakrishnan" <rk@atr-labs.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ---
      d092633b
  9. 18 7月, 2008 1 次提交
    • I
      x86: rename CONFIG_NONPROMISC_DEVMEM to CONFIG_PROMISC_DEVMEM · 64d206d8
      Ingo Molnar 提交于
      Linus observed:
      
      > The real bug is that we shouldn't have "double negatives", and
      > certainly not negative config options. Making that "promiscuous
      > /dev/mem" option a negated thing as a config option was bad.
      
      right ... lets rename this option. There should never be a negation
      in config options.
      
      [ that reminds me of CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER, but that
        is for another commit ;-) ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      64d206d8
  10. 21 6月, 2008 1 次提交
  11. 29 4月, 2008 1 次提交
  12. 25 4月, 2008 5 次提交
  13. 29 10月, 2007 1 次提交
  14. 17 10月, 2007 2 次提交
    • P
      mm: bdi init hooks · e0bf68dd
      Peter Zijlstra 提交于
      provide BDI constructor/destructor hooks
      
      [akpm@linux-foundation.org: compile fix]
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e0bf68dd
    • N
      remove ZERO_PAGE · 557ed1fa
      Nick Piggin 提交于
      The commit b5810039 contains the note
      
        A last caveat: the ZERO_PAGE is now refcounted and managed with rmap
        (and thus mapcounted and count towards shared rss).  These writes to
        the struct page could cause excessive cacheline bouncing on big
        systems.  There are a number of ways this could be addressed if it is
        an issue.
      
      And indeed this cacheline bouncing has shown up on large SGI systems.
      There was a situation where an Altix system was essentially livelocked
      tearing down ZERO_PAGE pagetables when an HPC app aborted during startup.
      This situation can be avoided in userspace, but it does highlight the
      potential scalability problem with refcounting ZERO_PAGE, and corner
      cases where it can really hurt (we don't want the system to livelock!).
      
      There are several broad ways to fix this problem:
      1. add back some special casing to avoid refcounting ZERO_PAGE
      2. per-node or per-cpu ZERO_PAGES
      3. remove the ZERO_PAGE completely
      
      I will argue for 3. The others should also fix the problem, but they
      result in more complex code than does 3, with little or no real benefit
      that I can see.
      
      Why? Inserting a ZERO_PAGE for anonymous read faults appears to be a
      false optimisation: if an application is performance critical, it would
      not be doing many read faults of new memory, or at least it could be
      expected to write to that memory soon afterwards. If cache or memory use
      is critical, it should not be working with a significant number of
      ZERO_PAGEs anyway (a more compact representation of zeroes should be
      used).
      
      As a sanity check -- mesuring on my desktop system, there are never many
      mappings to the ZERO_PAGE (eg. 2 or 3), thus memory usage here should not
      increase much without it.
      
      When running a make -j4 kernel compile on my dual core system, there are
      about 1,000 mappings to the ZERO_PAGE created per second, but about 1,000
      ZERO_PAGE COW faults per second (less than 1 ZERO_PAGE mapping per second
      is torn down without being COWed). So removing ZERO_PAGE will save 1,000
      page faults per second when running kbuild, while keeping it only saves
      less than 1 page clearing operation per second. 1 page clear is cheaper
      than a thousand faults, presumably, so there isn't an obvious loss.
      
      Neither the logical argument nor these basic tests give a guarantee of no
      regressions. However, this is a reasonable opportunity to try to remove
      the ZERO_PAGE from the pagefault path. If it is found to cause regressions,
      we can reintroduce it and just avoid refcounting it.
      
      The /dev/zero ZERO_PAGE usage and TLB tricks also get nuked.  I don't see
      much use to them except on benchmarks.  All other users of ZERO_PAGE are
      converted just to use ZERO_PAGE(0) for simplicity. We can look at
      replacing them all and maybe ripping out ZERO_PAGE completely when we are
      more satisfied with this solution.
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus "snif" Torvalds <torvalds@linux-foundation.org>
      557ed1fa
  15. 11 7月, 2007 1 次提交
  16. 10 7月, 2007 1 次提交
  17. 09 5月, 2007 2 次提交
  18. 18 4月, 2007 1 次提交
  19. 23 1月, 2007 1 次提交
    • L
      Revert "[PATCH] Fix up mmap_kmem" · 6d3154cc
      Linus Torvalds 提交于
      This reverts commit 99a10a60.
      
      As per Hugh Dickins:
      
        "Nadia Derbey has reported that mmap of /dev/kmem no longer works with
         the kernel virtual address as offset, and Franck has confirmed that
         his patch came from a misunderstanding of what an offset means to
         /dev/kmem - whereas his patch description seems to say that he was
         correcting the offset on a few plaforms, there was no such problem to
         correct, and his patch was in fact changing its API on all platforms."
      Suggested-by: NHugh Dickins <hugh@veritas.com>
      Cc: Franck Bui-Huu <fbuihuu@gmail.com>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Andrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6d3154cc
  20. 11 12月, 2006 1 次提交
    • H
      [PATCH] read_zero_pagealigned() locking fix · 5fcf7bb7
      Hugh Dickins 提交于
      Ramiro Voicu hits the BUG_ON(!pte_none(*pte)) in zeromap_pte_range: kernel
      bugzilla 7645.  Right: read_zero_pagealigned uses down_read of mmap_sem,
      but another thread's racing read of /dev/zero, or a normal fault, can
      easily set that pte again, in between zap_page_range and zeromap_page_range
      getting there.  It's been wrong ever since 2.4.3.
      
      The simple fix is to use down_write instead, but that would serialize reads
      of /dev/zero more than at present: perhaps some app would be badly
      affected.  So instead let zeromap_page_range return the error instead of
      BUG_ON, and read_zero_pagealigned break to the slower clear_user loop in
      that case - there's no need to optimize for it.
      
      Use -EEXIST for when a pte is found: BUG_ON in mmap_zero (the other user of
      zeromap_page_range), though it really isn't interesting there.  And since
      mmap_zero wants -EAGAIN for out-of-memory, the zeromaps better return that
      than -ENOMEM.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Cc: Ramiro Voicu: <Ramiro.Voicu@cern.ch>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      5fcf7bb7
  21. 09 12月, 2006 1 次提交
  22. 02 12月, 2006 1 次提交
  23. 13 10月, 2006 2 次提交
    • L
      Include proper header file for PFN_DOWN() · b8a3ad5b
      Linus Torvalds 提交于
      The recent commit (99a10a60) to fix up
      mmap_kmem() broke compiles because it used PFN_DOWN() without including
      <linux/pfn.h>.
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      b8a3ad5b
    • F
      [PATCH] Fix up mmap_kmem · 99a10a60
      Franck Bui-Huu 提交于
      vma->vm_pgoff is an pfn _offset_ relatif to the begining
      of the memory start. The previous code was doing at first:
      
      	vma->vm_pgoff << PAGE_SHIFT
      
      which results into a wrong physical address since some
      platforms have a physical mem start that can be different
      from 0. After that the previous call __pa() on this
      wrong physical address, however __pa() is used to convert
      a _virtual_ address into a physical one.
      
      This patch rewrites this convertion. It calculates the
      pfn of PAGE_OFFSET which is the pfn of the mem start
      then it adds the vma->vm_pgoff to it.
      
      It also uses virt_to_phys() instead of __pa() since the
      latter shouldn't be used by drivers.
      Signed-off-by: NFranck Bui-Huu <fbuihuu@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      99a10a60
  24. 30 9月, 2006 1 次提交
  25. 27 9月, 2006 1 次提交
  26. 11 7月, 2006 1 次提交
    • L
      [PATCH] make valid_mmap_phys_addr_range() take a pfn · 06c67bef
      Lennert Buytenhek 提交于
      Newer ARMs have a 40 bit physical address space, but mapping physical
      memory above 4G needs a special page table format which we (currently?) do
      not use for userspace mappings, so what happens instead is that mapping an
      address >= 4G will happily discard the upper bits and wrap.
      
      There is a valid_mmap_phys_addr_range() arch hook where we could check for
      >= 4G addresses and deny the mapping, but this hook takes an unsigned long
      address:
      
      	static inline int valid_mmap_phys_addr_range(unsigned long addr, size_t size);
      
      And drivers/char/mem.c:mmap_mem() calls it like this:
      
      	static int mmap_mem(struct file * file, struct vm_area_struct * vma)
      	{
      		size_t size = vma->vm_end - vma->vm_start;
      
      		if (!valid_mmap_phys_addr_range(vma->vm_pgoff << PAGE_SHIFT, size))
      
      So that's not much help either.
      
      This patch makes the hook take a pfn instead of a phys address.
      Signed-off-by: NLennert Buytenhek <buytenh@wantstofly.org>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      06c67bef
  27. 04 7月, 2006 1 次提交
  28. 01 7月, 2006 1 次提交
  29. 27 6月, 2006 2 次提交
  30. 26 4月, 2006 1 次提交
  31. 29 3月, 2006 1 次提交
  32. 27 3月, 2006 1 次提交