1. 10 12月, 2009 1 次提交
    • C
      vfs: Implement proper O_SYNC semantics · 6b2f3d1f
      Christoph Hellwig 提交于
      While Linux provided an O_SYNC flag basically since day 1, it took until
      Linux 2.4.0-test12pre2 to actually get it implemented for filesystems,
      since that day we had generic_osync_around with only minor changes and the
      great "For now, when the user asks for O_SYNC, we'll actually give
      O_DSYNC" comment.  This patch intends to actually give us real O_SYNC
      semantics in addition to the O_DSYNC semantics.  After Jan's O_SYNC
      patches which are required before this patch it's actually surprisingly
      simple, we just need to figure out when to set the datasync flag to
      vfs_fsync_range and when not.
      
      This patch renames the existing O_SYNC flag to O_DSYNC while keeping it's
      numerical value to keep binary compatibility, and adds a new real O_SYNC
      flag.  To guarantee backwards compatiblity it is defined as expanding to
      both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
      sure we are backwards-compatible when compiled against the new headers.
      
      This also means that all places that don't care about the differences can
      just check O_DSYNC and get the right behaviour for O_SYNC, too - only
      places that actuall care need to check __O_SYNC in addition.  Drivers and
      network filesystems have been updated in a fail safe way to always do the
      full sync magic if O_DSYNC is set.  The few places setting O_SYNC for
      lower layers are kept that way for now to stay failsafe.
      
      We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
      to make sure we always get these sane options.
      
      Note that parisc really screwed up their headers as they already define a
      O_DSYNC that has always been a no-op.  We try to repair it by using it for
      the new O_DSYNC and redefinining O_SYNC to send both the traditional
      O_SYNC numerical value _and_ the O_DSYNC one.
      
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger@sun.com>
      Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: NKyle McMartin <kyle@mcmartin.ca>
      Acked-by: NUlrich Drepper <drepper@redhat.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      6b2f3d1f
  2. 26 11月, 2009 1 次提交
  3. 24 11月, 2009 3 次提交
  4. 10 11月, 2009 1 次提交
  5. 24 9月, 2009 1 次提交
    • R
      x86: Reduce verbosity of "PAT enabled" kernel message · e23a8b6a
      Roland Dreier 提交于
      On modern systems, the kernel prints the message
      
          x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
      
      once for every CPU.
      
      This gets kind of ridiculous on huge systems; for example, on a
      64-thread system I was lucky enough to get:
      
          dmesg| grep 'PAT enabled' | wc
               64     704    5174
      
      There is already a BUG() if non-boot CPUs have PAT capabilities
      that don't match the boot CPU, so just print the message on the
      boot CPU. (I kept the print after the wrmsrl() that enables PAT,
      so that the log output continues to mean that the system survived
      enabling PAT on the boot CPU)
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <adavdj92sso.fsf@cisco.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e23a8b6a
  6. 18 9月, 2009 1 次提交
    • S
      x86, pat: don't use rb-tree based lookup in reserve_memtype() · dcb73bf4
      Suresh Siddha 提交于
      Recent enhancement of rb-tree based lookup exposed a  bug with the lookup
      mechanism in the reserve_memtype() which ensures that there are no conflicting
      memtype requests for the memory range.
      
      memtype_rb_search() returns an entry which has a start address <= new start
      address. And from here we traverse the linear linked list to check if there
      any conflicts with the existing mappings. As the rbtree is based on the
      start address of the memory range, it is quite possible that we have several
      overlapped mappings whose start address is much less than new requested start
      but the end is >= new requested end. This results in conflicting memtype
      mappings.
      
      Same bug exists with the old code which uses cached_entry from where
      we traverse the linear linked list. But the new rb-tree code exposes this
      bug fairly easily.
      
      For now, don't use the memtype_rb_search() and always start the search from
      the head of linear linked list in reserve_memtype(). Linear linked list
      for most of the systems grow's to few 10's of entries(as we track memory type
      of RAM pages using struct page). So we should be ok for now.
      
      We still retain the rbtree and use it to speed up free_memtype() which
      doesn't have the same bug(as we know what exactly we are searching for
      in free_memtype).
      
      Also use list_for_each_entry_from() in free_memtype() so that we start
      the search from rb-tree lookup result.
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <1253136483.4119.12.camel@sbs-t61.sc.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      dcb73bf4
  7. 06 9月, 2009 1 次提交
  8. 27 8月, 2009 7 次提交
  9. 18 8月, 2009 1 次提交
    • S
      x86, pat: Allow ISA memory range uncacheable mapping requests · 1adcaafe
      Suresh Siddha 提交于
      Max Vozeler reported:
      >  Bug 13877 -  bogl-term broken with CONFIG_X86_PAT=y, works with =n
      >
      >  strace of bogl-term:
      >  814   mmap2(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0)
      >				 = -1 EAGAIN (Resource temporarily unavailable)
      >  814   write(2, "bogl: mmaping /dev/fb0: Resource temporarily unavailable\n",
      >	       57) = 57
      
      PAT code maps the ISA memory range as WB in the PAT attribute, so that
      fixed range MTRR registers define the actual memory type (UC/WC/WT etc).
      
      But the upper level is_new_memtype_allowed() API checks are failing,
      as the request here is for UC and the return tracked type is WB (Tracked type is
      WB as MTRR type for this legacy range potentially will be different for each
      4k page).
      
      Fix is_new_memtype_allowed() by always succeeding the ISA address range
      checks, as the null PAT (WB) and def MTRR fixed range register settings
      satisfy the memory type needs of the applications that map the ISA address
      range.
      Reported-and-Tested-by: NMax Vozeler <xam@debian.org>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      1adcaafe
  10. 17 4月, 2009 1 次提交
  11. 12 4月, 2009 1 次提交
    • M
      x86: fix wrong section of pat_disable & make it static · 1ee4bd92
      Marcin Slusarz 提交于
      pat_disable cannot be __cpuinit anymore because it's called from pat_init
      and the callchain looks like this:
      pat_disable [cpuinit] <- pat_init <- generic_set_all <-
       ipi_handler <- set_mtrr <- (other non init/cpuinit functions)
      
      WARNING: arch/x86/mm/built-in.o(.text+0x449e): Section mismatch in reference
      from the function pat_init() to the function .cpuinit.text:pat_disable()
      The function pat_init() references
      the function __cpuinit pat_disable().
      This is often because pat_init lacks a __cpuinit
      annotation or the annotation of pat_disable is wrong.
      
      Non CONFIG_X86_PAT version of pat_disable is static inline, so this version
      can be static too (and there are no callers outside of this file).
      Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      LKML-Reference: <49DFB055.6070405@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1ee4bd92
  12. 10 4月, 2009 2 次提交
  13. 13 3月, 2009 1 次提交
  14. 28 2月, 2009 1 次提交
  15. 25 2月, 2009 1 次提交
  16. 12 2月, 2009 1 次提交
  17. 24 1月, 2009 1 次提交
    • H
      x86: handle PAT more like other CPU features · 75a04811
      H. Peter Anvin 提交于
      Impact: Cleanup
      
      When PAT was originally introduced, it was handled specially for a few
      reasons:
      
      - PAT bugs are hard to track down, so we wanted to maintain a
        whitelist of CPUs.
      - The i386 and x86-64 CPUID code was not yet unified.
      
      Both of these are now obsolete, so handle PAT like any other features,
      including ordinary feature blacklisting due to known bugs.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      75a04811
  18. 22 1月, 2009 1 次提交
  19. 16 1月, 2009 1 次提交
  20. 15 1月, 2009 1 次提交
  21. 14 1月, 2009 3 次提交
    • V
      x86 PAT: remove CPA WARN_ON for zero pte · 58dab916
      venkatesh.pallipadi@intel.com 提交于
      Impact: reduce scope of debug check - avoid warnings
      
      The logic to find whether identity map exists or not using
      high_memory or max_low_pfn_mapped/max_pfn_mapped are not complete
      as the memory withing the range may not be mapped if there is a
      unusable hole in e820.
      
      Specifically, on my test system I started seeing these warnings with
      tools like hwinfo, acpidump trying to map ACPI region.
      
      [   27.400018] ------------[ cut here ]------------
      [   27.400344] WARNING: at /home/venkip/src/linus/linux-2.6/arch/x86/mm/pageattr.c:560 __change_page_attr_set_clr+0xf3/0x8b8()
      [   27.400821] Hardware name: X7DB8
      [   27.401070] CPA: called for zero pte. vaddr = ffff8800cff6a000 cpa->vaddr = ffff8800cff6a000
      [   27.401569] Modules linked in:
      [   27.401882] Pid: 4913, comm: dmidecode Not tainted 2.6.28-05716-gfe0bdec6 #586
      [   27.402141] Call Trace:
      [   27.402488]  [<ffffffff80237c21>] warn_slowpath+0xd3/0x10f
      [   27.402749]  [<ffffffff80274ade>] ? find_get_page+0xb3/0xc9
      [   27.403028]  [<ffffffff80274a2b>] ? find_get_page+0x0/0xc9
      [   27.403333]  [<ffffffff80226425>] __change_page_attr_set_clr+0xf3/0x8b8
      [   27.403628]  [<ffffffff8028ec99>] ? __purge_vmap_area_lazy+0x192/0x1a1
      [   27.403883]  [<ffffffff8028eb52>] ? __purge_vmap_area_lazy+0x4b/0x1a1
      [   27.404172]  [<ffffffff80290268>] ? vm_unmap_aliases+0x1ab/0x1bb
      [   27.404512]  [<ffffffff80290105>] ? vm_unmap_aliases+0x48/0x1bb
      [   27.404766]  [<ffffffff80226d28>] change_page_attr_set_clr+0x13e/0x2e6
      [   27.405026]  [<ffffffff80698fa7>] ? _spin_unlock+0x26/0x2a
      [   27.405292]  [<ffffffff80227e6a>] ? reserve_memtype+0x19b/0x4e3
      [   27.405590]  [<ffffffff80226ffd>] _set_memory_wb+0x22/0x24
      [   27.405844]  [<ffffffff80225d28>] ioremap_change_attr+0x26/0x28
      [   27.406097]  [<ffffffff80228355>] reserve_pfn_range+0x1a3/0x235
      [   27.406427]  [<ffffffff80228430>] track_pfn_vma_new+0x49/0xb3
      [   27.406686]  [<ffffffff80286c46>] remap_pfn_range+0x94/0x32c
      [   27.406940]  [<ffffffff8022878d>] ? phys_mem_access_prot_allowed+0xb5/0x1a8
      [   27.407209]  [<ffffffff803e9bf4>] mmap_mem+0x75/0x9d
      [   27.407523]  [<ffffffff8028b3b4>] mmap_region+0x2cf/0x53e
      [   27.407776]  [<ffffffff8028b8cc>] do_mmap_pgoff+0x2a9/0x30d
      [   27.408034]  [<ffffffff8020f4a4>] sys_mmap+0x92/0xce
      [   27.408339]  [<ffffffff8020b65b>] system_call_fastpath+0x16/0x1b
      [   27.408614] ---[ end trace 4b16ad70c09a602d ]---
      [   27.408871] dmidecode:4913 reserve_pfn_range ioremap_change_attr failed write-back for cff6a000-cff6b000
      
      This is wih track_pfn_vma_new trying to keep identity map in sync.
      The address cff6a000 is the ACPI region according to e820.
      
      [    0.000000] BIOS-provided physical RAM map:
      [    0.000000]  BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
      [    0.000000]  BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
      [    0.000000]  BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)
      [    0.000000]  BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
      [    0.000000]  BIOS-e820: 0000000000100000 - 00000000cff60000 (usable)
      [    0.000000]  BIOS-e820: 00000000cff60000 - 00000000cff69000 (ACPI data)
      [    0.000000]  BIOS-e820: 00000000cff69000 - 00000000cff80000 (ACPI NVS)
      [    0.000000]  BIOS-e820: 00000000cff80000 - 00000000d0000000 (reserved)
      [    0.000000]  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
      [    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
      [    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
      [    0.000000]  BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
      [    0.000000]  BIOS-e820: 0000000100000000 - 0000000230000000 (usable)
      
      And is not mapped as per init_memory_mapping.
      
      [    0.000000] init_memory_mapping: 0000000000000000-00000000cff60000
      [    0.000000] init_memory_mapping: 0000000100000000-0000000230000000
      
      We can add logic to check for this. But, there can also be other holes in
      identity map when we have 1GB of aligned reserved space in e820.
      
      This patch handles it by removing the WARN_ON and returning a specific
      error value (EFAULT) to indicate that the address does not have any
      identity mapping.
      
      The code that tries to keep identity map in sync can ignore
      this error, with other callers of cpa still getting error here.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      58dab916
    • V
      x86 PAT: return compatible mapping to remap_pfn_range callers · cdecff68
      venkatesh.pallipadi@intel.com 提交于
      Impact: avoid warning message, potentially solve 3D performance regression
      
      Change x86 PAT code to return compatible memtype if the exact memtype that
      was requested in remap_pfn_rage and friends is not available due to some
      conflict.
      
      This is done by returning the compatible type in pgprot parameter of
      track_pfn_vma_new(), and the caller uses that memtype for page table.
      
      Note that track_pfn_vma_copy() which is basically called during fork gets the
      prot from existing page table and should not have any conflict. Hence we use
      strict memtype check there and do not allow compatible memtypes.
      
      This patch fixes the bug reported here:
      
        http://marc.info/?l=linux-kernel&m=123108883716357&w=2
      
      Specifically the error message:
      
        X:5010 map pfn expected mapping type write-back for d0000000-d0101000,
        got write-combining
      
      Should go away.
      Reported-and-bisected-by: NKevin Winchester <kjwinchester@gmail.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdecff68
    • V
      x86 PAT: change track_pfn_vma_new to take pgprot_t pointer param · e4b866ed
      venkatesh.pallipadi@intel.com 提交于
      Impact: cleanup
      
      Change the protection parameter for track_pfn_vma_new() into a pgprot_t pointer.
      Subsequent patch changes the x86 PAT handling to return a compatible
      memtype in pgprot_t, if what was requested cannot be allowed due to conflicts.
      No fuctionality change in this patch.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e4b866ed
  22. 24 12月, 2008 1 次提交
    • H
      x86: PAT: fix address types in track_pfn_vma_new() · c1c15b65
      H. Peter Anvin 提交于
      Impact: cleanup, fix warning
      
      This warning:
      
       arch/x86/mm/pat.c: In function track_pfn_vma_copy:
       arch/x86/mm/pat.c:701: warning: passing argument 5 of follow_phys from incompatible pointer type
      
      Triggers because physical addresses are resource_size_t, not u64.
      
      This really matters when calling an interface like follow_phys() which
      takes a pointer to a physical address -- although on x86, being
      littleendian, it would generally work anyway as long as the memory region
      wasn't completely uninitialized.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c1c15b65
  23. 20 12月, 2008 1 次提交
  24. 19 12月, 2008 2 次提交
  25. 31 10月, 2008 1 次提交
  26. 11 10月, 2008 2 次提交
  27. 21 8月, 2008 1 次提交