1. 29 Apr, 2008 1 commit
  2. 28 Apr, 2008 1 commit
    • mm: introduce pte_special pte bit · 7e675137
      Committed by Nick Piggin
      s390 for one, cannot implement VM_MIXEDMAP with pfn_valid, due to their memory
      model (which is more dynamic than most).  Instead, they had proposed to
      implement it with an additional path through vm_normal_page(), using a bit in
      the pte to determine whether or not the page should be refcounted:
      
      vm_normal_page()
      {
      	...
      	if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
      		if (vma->vm_flags & VM_MIXEDMAP) {
      #ifdef s390
      			if (!mixedmap_refcount_pte(pte))
      				return NULL;
      #else
      			if (!pfn_valid(pfn))
      				return NULL;
      #endif
      			goto out;
      		}
      	...
      }
      
      This is fine; however, if we are allowed to use a bit in the pte to determine
      refcountedness, we can use that to _completely_ replace all the vma based
      schemes.  So instead of adding more cases to the already complex vma-based
      scheme, we can have a clearly separate and simple pte-based scheme (and get
      slightly better code generation in the process):
      
      vm_normal_page()
      {
      #ifdef s390
      	if (!mixedmap_refcount_pte(pte))
      		return NULL;
      	return pte_page(pte);
      #else
      	...
      #endif
      }
      
      And finally, we would rather make this concept usable by any architecture
      than make it s390-only, so implement a new type of pte state for this.
      Unfortunately the old vma based code must stay, because some architectures may
      not be able to spare pte bits.  This makes vm_normal_page a little bit more
      ugly than we would like, but the 2 cases are clearly separate.
      
      So introduce a pte_special pte state, and use it in mm/memory.c.  It is
      currently a noop for all architectures, so this doesn't actually result in any
      compiled code changes to mm/memory.o.
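
The pte-based scheme described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: the demo_* names, the bit
position of the special flag and the 16-entry page array are all hypothetical and
are not taken from the patch. It shows the one idea that matters here: a special
pte never yields a refcounted page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a pte; not the kernel's pte_t. */
typedef struct { uint64_t val; } demo_pte_t;
struct demo_page { int refcount; };

#define DEMO_PTE_SPECIAL (1ULL << 0)	/* assumed spare software bit */
#define DEMO_PFN_SHIFT   12
#define DEMO_NR_PAGES    16

static struct demo_page demo_mem_map[DEMO_NR_PAGES];

/* Build a pte that maps page frame number pfn, with no flags set. */
static inline demo_pte_t demo_mk_pte(uint64_t pfn)
{
	demo_pte_t pte = { pfn << DEMO_PFN_SHIFT };
	return pte;
}

static inline int demo_pte_special(demo_pte_t pte)
{
	return (pte.val & DEMO_PTE_SPECIAL) != 0;
}

static inline demo_pte_t demo_pte_mkspecial(demo_pte_t pte)
{
	pte.val |= DEMO_PTE_SPECIAL;
	return pte;
}

/* Return the refcounted page behind pte, or NULL for a special mapping. */
static inline struct demo_page *demo_vm_normal_page(demo_pte_t pte)
{
	uint64_t pfn = pte.val >> DEMO_PFN_SHIFT;

	if (demo_pte_special(pte))
		return NULL;
	assert(pfn < DEMO_NR_PAGES);
	return &demo_mem_map[pfn];
}
```

In this model the caller never has to look at vma flags: the decision is carried
entirely by the pte, which is the simplification the message argues for.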
      
      BTW:
      I haven't put vm_normal_page() into arch code as per an earlier suggestion.
      The reason is that, regardless of where vm_normal_page is actually
      implemented, the *abstraction* is still exactly the same. Also, while it
      depends on whether the architecture has pte_special or not, those are the
      only two possible cases, and it really isn't an arch specific function --
      the role of the arch code should be to provide primitive functions and
      accessors with which to build the core code; pte_special does that. We do
      not want architectures to know or care about vm_normal_page itself, and
      we definitely don't want them being able to invent something new there
      out of sight of mm/ code. If we made vm_normal_page an arch function, then
      we have to make vm_insert_mixed (next patch) an arch function too. So I
      don't think moving it to arch code fundamentally improves any abstractions,
      while it does practically make the code more difficult to follow, for both
      mm and arch developers, and easier to misuse.
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Carsten Otte <cotte@de.ibm.com>
      Cc: Jared Hulbert <jaredeh@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 17 Apr, 2008 1 commit
  4. 03 Apr, 2008 1 commit
    • kvm: provide kvm.h for all architecture: fixes headers_install · dd135ebb
      Committed by Christian Borntraeger
      Currently include/linux/kvm.h is not considered by make headers_install,
      because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h".  This problem
      was introduced by
      
      commit fb56dbb3
      Author: Avi Kivity <avi@qumranet.com>
      Date:   Sun Dec 2 10:50:06 2007 +0200
      
          KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM
      
          Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
          includes, not existing.  Rather than add a zillion <asm/kvm.h>s, export kvm.h
          only if the arch actually supports it.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
      
      which makes this a 2.6.25 regression.
      
      One way of solving the issue is to enhance Kbuild, but Avi and David convinced
      me that changing headers_install is not the way to go.  This patch changes
      the definition for linux/kvm.h to unifdef-y.
      
      If unifdef-y is used for linux/kvm.h, "make headers_check" will fail on all
      architectures without asm/kvm.h.  Therefore, this patch also provides
      asm/kvm.h on all architectures.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Avi Kivity <avi@qumranet.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 05 Mar, 2008 1 commit
  6. 09 Feb, 2008 5 commits
    • CONFIG_HIGHPTE vs. sub-page page tables. · 2f569afd
      Committed by Martin Schwidefsky
      Background: I've implemented 1K/2K page tables for s390.  These sub-page
      page tables are required to properly support the s390 virtualization
      instruction with KVM.  The SIE instruction requires that the page tables
      have 256 page table entries (pte) followed by 256 page status table entries
      (pgste).  The pgstes are only required if the process is using the SIE
      instruction.  The pgstes are updated by the hardware and by the hypervisor
      for a number of reasons, one of them is dirty and reference bit tracking.
      To avoid wasting memory the standard pte table allocation should return
      1K/2K (31/64 bit) and 2K/4K if the process is using SIE.
      
      Problem: Page size on s390 is 4K, page table size is 1K or 2K.  That means
      the s390 version for pte_alloc_one cannot return a pointer to a struct
      page.  Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
      cannot return a pointer to a pte either, since that would require more than
      32 bit for the return value of pte_alloc_one (and the pte * would not be
      accessible since it's not kmapped).
      
      Solution: The only solution I found to this dilemma is a new typedef: a
      pgtable_t.  For s390 pgtable_t will be a (pte *) - to be introduced with a
      later patch.  For everybody else it will be a (struct page *).  The
      additional problem with the initialization of the ptl lock and the
      NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
      a destructor pgtable_page_dtor.  The page table allocation and free
      functions need to call these two whenever a page table page is allocated or
      freed.  pmd_populate will get a pgtable_t instead of a struct page pointer.
      To get the pgtable_t back from a pmd entry that has been installed with
      pmd_populate a new function pmd_pgtable is added.  It replaces the pmd_page
      call in free_pte_range and apply_to_pte_range.
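
As a rough illustration of the solution described above, the following userspace
sketch models the "everybody else" case, where pgtable_t is a struct page
pointer. All demo_* names are hypothetical stand-ins; the real constructor also
initializes the ptl lock, which is reduced to a flag here, and the counter only
mimics the NR_PAGETABLE accounting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page. */
struct demo_page { int ptl_initialized; };

/* On most architectures the page table handle is a page pointer. */
typedef struct demo_page *demo_pgtable_t;
typedef struct { demo_pgtable_t table; } demo_pmd_t;

static long demo_nr_pagetable;	/* models the NR_PAGETABLE counter */

/* Constructor: set up the lock (modelled as a flag) and account the page. */
static inline void demo_pgtable_page_ctor(struct demo_page *page)
{
	page->ptl_initialized = 1;
	demo_nr_pagetable++;
}

/* Destructor: undo what the constructor did. */
static inline void demo_pgtable_page_dtor(struct demo_page *page)
{
	page->ptl_initialized = 0;
	demo_nr_pagetable--;
}

/* Allocation and free must always call the constructor and destructor. */
static inline demo_pgtable_t demo_pte_alloc_one(void)
{
	struct demo_page *page = calloc(1, sizeof(*page));

	if (page)
		demo_pgtable_page_ctor(page);
	return page;
}

static inline void demo_pte_free(demo_pgtable_t pte)
{
	demo_pgtable_page_dtor(pte);
	free(pte);
}

/* Install a page table in a pmd entry, and get the handle back later. */
static inline void demo_pmd_populate(demo_pmd_t *pmd, demo_pgtable_t pte)
{
	pmd->table = pte;
}

static inline demo_pgtable_t demo_pmd_pgtable(demo_pmd_t pmd)
{
	return pmd.table;
}
```

Because callers only ever see the opaque handle type, an architecture could swap
the typedef for a pte pointer without changing the calling code, which is the
point of introducing the typedef.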
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • avoid overflows in kernel/time.c · bdc80787
      Committed by H. Peter Anvin
      When the conversion factor between jiffies and milli- or microseconds is
      not a single multiply or divide, as for the case of HZ == 300, we currently
      do a multiply followed by a divide.  The intervening result, however, is
      subject to overflows, especially since the fraction is not simplified (for
      HZ == 300, we multiply by 300 and divide by 1000).
      
      This is exposed to the user when passing a large timeout to poll(), for
      example.
      
      This patch replaces the multiply-divide with a reciprocal multiplication on
      32-bit platforms.  When the input is an unsigned long, there is no portable
      way to do this on 64-bit platforms, since it requires a 128-bit intermediate
      result (which gcc does support on 64-bit platforms but may generate libgcc
      calls, e.g. on 64-bit s390).  However, since the output is a 32-bit integer
      in the cases affected, just simplify the multiply-divide (*3/10 instead of
      *300/1000).
      
      The reciprocal multiply used can have off-by-one errors in the upper half
      of the valid output range.  This could be avoided at the expense of having
      to deal with a potential 65-bit intermediate result.  Since the intent is
      to avoid overflow problems and most of the other time conversions are only
      semiexact, the off-by-one errors were considered an acceptable tradeoff.
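
The overflow and the two remedies described above can be reproduced in a few
lines for the HZ == 300 case. This is a minimal sketch, not the code from the
patch: the demo_* names are hypothetical, the multiplier is simply
ceil(2^32 * 3 / 10) computed by hand rather than by the Perl script, 32-bit
unsigned int is assumed, and all three functions round down for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HZ    300u
#define DEMO_MUL32 1288490189ull	/* ceil(2^32 * 3 / 10), hand-computed */

/* Naive form: the 32-bit product m * 300 wraps once m > UINT32_MAX / 300. */
static inline uint32_t demo_msecs_to_jiffies_naive(uint32_t m)
{
	return (uint32_t)(m * DEMO_HZ) / 1000u;
}

/* Simplified fraction: 300/1000 == 3/10, so the product wraps much later. */
static inline uint32_t demo_msecs_to_jiffies_simplified(uint32_t m)
{
	assert(m <= UINT32_MAX / 3u);	/* beyond this m * 3 would wrap too */
	return m * 3u / 10u;
}

/*
 * Reciprocal multiply: scale by a precomputed 32-bit constant using a
 * 64-bit intermediate, then shift.  No division and no 32-bit overflow.
 * Because the constant is rounded up, very large inputs can come out
 * one too high.
 */
static inline uint32_t demo_msecs_to_jiffies_recip(uint32_t m)
{
	return (uint32_t)(((uint64_t)m * DEMO_MUL32) >> 32);
}
```

For an input of 20000000 ms the exact answer is 6000000 jiffies. The naive form
wraps and returns a much smaller value, while the other two forms stay correct.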
      
      At Ralf Baechle's suggestion, this version uses a Perl script to compute
      the necessary constants.  We already have dependencies on Perl for kernel
      compiles.  This does, however, require the Perl module Math::BigInt, which
      is included in the standard Perl distribution starting with version 5.8.0.
      In order to support older versions of Perl, include a table of canned
      constants in the script itself, and structure the script so that
      Math::BigInt isn't required if pulling values from said table.
      
      Running the script requires that the HZ value is available from the
      Makefile.  Thus, this patch also adds the Kconfig variable CONFIG_HZ to the
      architectures which didn't already have it (alpha, cris, frv, h8300, m32r,
      m68k, m68knommu, sparc, v850, and xtensa.) It does *not* touch the sh or
      sh64 architectures, since Paul Mundt has dealt with those separately in the
      sh tree.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>,
      Cc: Sam Ravnborg <sam@ravnborg.org>,
      Cc: Paul Mundt <lethal@linux-sh.org>,
      Cc: Richard Henderson <rth@twiddle.net>,
      Cc: Michael Starvik <starvik@axis.com>,
      Cc: David Howells <dhowells@redhat.com>,
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>,
      Cc: Hirokazu Takata <takata@linux-m32r.org>,
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>,
      Cc: Roman Zippel <zippel@linux-m68k.org>,
      Cc: William L. Irwin <sparclinux@vger.kernel.org>,
      Cc: Chris Zankel <chris@zankel.net>,
      Cc: H. Peter Anvin <hpa@zytor.com>,
      Cc: Jan Engelhardt <jengelh@computergmbh.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • asm-*/posix_types.h: scrub __GLIBC__ · 531d7d42
      Committed by Mike Frysinger
      Some arches (like alpha and ia64) already have a clean posix_types.h header.
      This brings all the others in line by removing all references to __GLIBC__
      (and some undocumented __USE_ALL).
      Signed-off-by: Mike Frysinger <vapier@gentoo.org>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • aout: suppress A.OUT library support if !CONFIG_ARCH_SUPPORTS_AOUT · 7fa30315
      Committed by David Howells
      Suppress A.OUT library support if CONFIG_ARCH_SUPPORTS_AOUT is not set.
      
      Not all architectures support the A.OUT binfmt, so the ELF binfmt should not
      be permitted to go looking for A.OUT libraries to load in such a case.  Not
      only that, but under such conditions A.OUT core dumps are not produced either.
      
      To make this work, this patch also does the following:
      
       (1) Makes the existence of the contents of linux/a.out.h contingent on
           CONFIG_ARCH_SUPPORTS_AOUT.
      
       (2) Renames dump_thread() to aout_dump_thread() as it's only called by A.OUT
           core dumping code.
      
       (3) Moves aout_dump_thread() into asm/a.out-core.h and makes it inline.  This
           is then included only where needed.  This means that this bit of arch
           code will be stored in the appropriate A.OUT binfmt module rather than
           the core kernel.
      
       (4) Drops A.OUT support for Blackfin (according to Mike Frysinger it's not
           needed) and FRV.
      
      This patch depends on the previous patch to move STACK_TOP[_MAX] out of
      asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT
      format is available.
      
      [jdike@addtoit.com: uml: re-remove accidentally restored code]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Jeff Dike <jdike@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • aout: move STACK_TOP[_MAX] to asm/processor.h · 922a70d3
      Committed by David Howells
      Move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're
      required whether or not A.OUT format is available.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 08 Feb, 2008 4 commits
  8. 07 Feb, 2008 1 commit
  9. 06 Feb, 2008 2 commits
  10. 01 Feb, 2008 1 commit
  11. 29 Jan, 2008 1 commit
  12. 23 Oct, 2007 2 commits
    • Add CONFIG_DEBUG_SG sg validation · d6ec0842
      Committed by Jens Axboe
      Add a Kconfig entry which will toggle some sanity checks on the sg
      entry and tables.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • Change table chaining layout · 18dabf47
      Committed by Jens Axboe
      Change the page member of the scatterlist structure to be an unsigned
      long, and encode more stuff in the lower bits:
      
      - Bits 0 and 1 zero: this is a normal sg entry. Next sg entry is located
        at sg + 1.
      - Bit 0 set: this is a chain entry, the next real entry is at ->page_link
        with the two low bits masked off.
      - Bit 1 set: this is the final entry in the sg table. sg_next() will return
        NULL when passed such an entry.
      
      It's thus important that sg table users use the proper accessors to get
      and set the page member.
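
The encoding listed above can be sketched as a small userspace model. The
demo_* names and the struct layout are hypothetical and only mirror the three
cases in the list; the accessors in the patch itself may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a scatterlist entry. */
struct demo_sg {
	uintptr_t page_link;	/* pointer, with flags in the two low bits */
	unsigned int length;
};

#define DEMO_SG_CHAIN ((uintptr_t)0x1)	/* bit 0: entry links to another table */
#define DEMO_SG_END   ((uintptr_t)0x2)	/* bit 1: entry is the last one */
#define DEMO_SG_MASK  ((uintptr_t)0x3)

static inline int demo_sg_is_chain(const struct demo_sg *sg)
{
	return (sg->page_link & DEMO_SG_CHAIN) != 0;
}

static inline int demo_sg_is_last(const struct demo_sg *sg)
{
	return (sg->page_link & DEMO_SG_END) != 0;
}

/* Store a page pointer while preserving whatever flag bits are set. */
static inline void demo_sg_set_page(struct demo_sg *sg, void *page)
{
	assert(((uintptr_t)page & DEMO_SG_MASK) == 0);	/* needs 4-byte alignment */
	sg->page_link = (sg->page_link & DEMO_SG_MASK) | (uintptr_t)page;
}

/* Read the page pointer back with the flag bits masked off. */
static inline void *demo_sg_page(const struct demo_sg *sg)
{
	return (void *)(sg->page_link & ~DEMO_SG_MASK);
}

/* Turn link into a chain entry that points at the table starting at next. */
static inline void demo_sg_chain(struct demo_sg *link, struct demo_sg *next)
{
	link->page_link = (uintptr_t)next | DEMO_SG_CHAIN;
}

static inline void demo_sg_mark_end(struct demo_sg *sg)
{
	sg->page_link |= DEMO_SG_END;
}

/* Advance to the next real entry, following a chain entry if one is hit. */
static inline struct demo_sg *demo_sg_next(struct demo_sg *sg)
{
	if (demo_sg_is_last(sg))
		return NULL;
	sg++;
	if (demo_sg_is_chain(sg))
		sg = (struct demo_sg *)(sg->page_link & ~DEMO_SG_MASK);
	return sg;
}
```

Reading page_link directly would return a pointer with stray flag bits in it,
which is why going through the accessors matters.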
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  13. 20 Oct, 2007 3 commits
  14. 19 Oct, 2007 1 commit
    • bitops: introduce lock ops · 26333576
      Committed by Nick Piggin
      Introduce test_and_set_bit_lock / clear_bit_unlock bitops with lock semantics.
      Convert all architectures to use the generic implementation.
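
A rough userspace analogue of bitops with lock semantics can be written with C11
atomics. This is a sketch, not the generic kernel implementation: the demo_*
names are hypothetical, and it assumes that "lock semantics" means acquire
ordering on the test-and-set and release ordering on the clear.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Atomically set bit nr and return its previous value (0 or 1).
 * Acquire ordering: accesses after a successful lock cannot be
 * reordered before it.
 */
static inline int demo_test_and_set_bit_lock(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % DEMO_BITS_PER_LONG);
	atomic_ulong *word = addr + nr / DEMO_BITS_PER_LONG;
	unsigned long old;

	old = atomic_fetch_or_explicit(word, mask, memory_order_acquire);
	return (old & mask) != 0;
}

/*
 * Atomically clear bit nr.
 * Release ordering: accesses before the unlock cannot be reordered after it.
 */
static inline void demo_clear_bit_unlock(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % DEMO_BITS_PER_LONG);
	atomic_ulong *word = addr + nr / DEMO_BITS_PER_LONG;

	atomic_fetch_and_explicit(word, ~mask, memory_order_release);
}
```

A caller treats a return value of 0 from the test-and-set as "lock taken" and a
return value of 1 as "already held by someone else".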
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Bryan Wu <bryan.wu@analog.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Chris Zankel <chris@zankel.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 17 Oct, 2007 5 commits
  16. 14 Oct, 2007 2 commits
  17. 12 Sep, 2007 1 commit
  18. 23 Aug, 2007 3 commits
  19. 29 Jul, 2007 1 commit
  20. 27 Jul, 2007 1 commit
  21. 23 Jul, 2007 1 commit
  22. 20 Jul, 2007 1 commit