1. 22 Oct, 2007 18 commits
    • K
      Intel IOMMU: Avoid memory allocation failures in dma map api calls · eb3fa7cb
      Keshavamurthy, Anil S committed
      The Intel IOMMU driver needs memory during DMA map calls to set up its
      internal page tables and other data structures.  These DMA map calls are
      mostly made in interrupt context or with a spinlock held by the upper-level
      drivers (network/storage drivers).  To avoid memory allocation failures
      under low-memory conditions, this patch temporarily sets the PF_MEMALLOC
      flag for the current task before making memory allocation calls.
      
      We evaluated mempools as a backup when kmem_cache_alloc() fails
      and found that mempools are really not useful here because
       1) We don't know for sure how much to reserve in advance
       2) And mempools are not useful for GFP_ATOMIC case (as we call
          memory alloc functions with GFP_ATOMIC)
      
      (akpm: point 2 is wrong...)
      
      With the PF_MEMALLOC flag set in current->flags, the VM subsystem skips any
      watermark checks before allocating memory, thus guaranteeing memory down to
      the last free page.  Further, judging from the __alloc_pages() function in
      mm/page_alloc.c, this flag appears to be useful only in non-interrupt
      context.
      
      If we are in interrupt context and memory allocation in the IOMMU driver
      fails for some reason, then the DMA map APIs will return failure and it is
      up to the higher-level drivers to retry.  If an upper-level driver
      nevertheless programs the controller with the bad DMA virtual address, the
      IOMMU will block that DMA transaction, thus preventing any corruption of
      main memory.
      
      So far in our test scenarios, we have been unable to trigger any memory
      allocation failure inside DMA map API calls.
      Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
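The save/set/restore idiom described in the eb3fa7cb message can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct sim_task`, `SIM_PF_MEMALLOC`, and the `sim_*` helpers are hypothetical stand-ins for `current->flags`, `PF_MEMALLOC`, and the driver's allocation wrapper, and `calloc()` stands in for a `GFP_ATOMIC` kernel allocation. It is not the code from the patch.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's PF_MEMALLOC bit (value is illustrative). */
#define SIM_PF_MEMALLOC 0x0800u

/* Hypothetical stand-in for the current task; only the flags word is modelled. */
struct sim_task {
    unsigned int flags;
};

/* Set the flag and return the previous state of that one bit. */
static unsigned int sim_memalloc_save(struct sim_task *t)
{
    unsigned int old = t->flags & SIM_PF_MEMALLOC;

    t->flags |= SIM_PF_MEMALLOC;
    return old;
}

/* Put the bit back exactly as it was before sim_memalloc_save(). */
static void sim_memalloc_restore(struct sim_task *t, unsigned int old)
{
    t->flags = (t->flags & ~SIM_PF_MEMALLOC) | old;
}

/* Allocate with the flag temporarily set, then restore the caller's state. */
static void *sim_iommu_alloc(struct sim_task *t, size_t size)
{
    unsigned int old = sim_memalloc_save(t);
    void *p = calloc(1, size);  /* stands in for a GFP_ATOMIC allocation */

    sim_memalloc_restore(t, old);
    return p;
}
```

Saving only the one bit, rather than the whole flags word, means a caller that already had the flag set keeps it set afterwards, and any other flag that changes during the allocation is not overwritten.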
    • K
      Intel IOMMU: Intel IOMMU driver · ba395927
      Keshavamurthy, Anil S committed
      The actual Intel IOMMU driver.  The hardware spec can be found at:
      http://www.intel.com/technology/virtualization
      
      This driver sets the X86_64 'dma_ops' and so hooks into the standard DMA
      APIs.  In this way, PCI drivers get virtual DMA addresses.  This change is
      transparent to PCI drivers.
      
      [akpm@linux-foundation.org: remove unneeded cast]
      [akpm@linux-foundation.org: build fix]
      [bunk@stusta.de: fix duplicate CONFIG_DMAR Makefile line]
      Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • K
      Intel IOMMU: IOVA allocation and management routines · f8de50eb
      Keshavamurthy, Anil S committed
      This code implements generic IOVA allocation and management.  As per
      Dave's suggestion, we now allocate IO virtual addresses starting from the
      higher DMA limit address rather than from the low end of the address space.
      This eliminates the need to preserve the IO virtual address for multiple
      devices sharing the same domain virtual address.
      
      This code also uses red-black trees to store the allocated and reserved
      iova nodes.  This showed a good performance improvement over the previous
      linear linked list.
      
      [akpm@linux-foundation.org: remove inlines]
      [akpm@linux-foundation.org: coding style fixes]
      Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
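The top-down allocation policy described in the f8de50eb message (hand out the highest free range at or below a DMA limit) can be sketched as below. To keep the example short it uses a linked list sorted from high to low addresses instead of the red-black tree the commit describes, so it shows the policy, not the data structure or its performance. All `sim_*` names are hypothetical and are not the kernel's API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an allocated or reserved IOVA range. */
struct sim_iova {
    unsigned long pfn_lo;       /* first page frame in the range (inclusive) */
    unsigned long pfn_hi;       /* last page frame in the range (inclusive) */
    struct sim_iova *next;      /* next range, at lower addresses */
};

/* Ranges are kept sorted from highest to lowest address. */
struct sim_iova_domain {
    struct sim_iova *head;
};

/*
 * Allocate `size` page frames as high as possible at or below limit_pfn.
 * Returns the new range, or NULL if no gap is large enough.
 */
static struct sim_iova *sim_alloc_iova(struct sim_iova_domain *dom,
                                       unsigned long size,
                                       unsigned long limit_pfn)
{
    struct sim_iova **link = &dom->head;
    unsigned long top = limit_pfn;  /* highest pfn that may still be free */
    struct sim_iova *cur, *node;

    if (size == 0)
        return NULL;

    for (;;) {
        cur = *link;
        if (cur == NULL || cur->pfn_hi < top) {
            /* The free gap is [floor, top]; take it if it is large enough. */
            unsigned long floor = cur ? cur->pfn_hi + 1 : 0;

            if (top - floor >= size - 1)
                break;
        }
        if (cur == NULL || cur->pfn_lo == 0)
            return NULL;            /* ran out of address space */
        if (cur->pfn_lo - 1 < top)
            top = cur->pfn_lo - 1;  /* keep searching below this range */
        link = &cur->next;
    }

    node = malloc(sizeof(*node));
    if (node == NULL)
        return NULL;
    node->pfn_hi = top;
    node->pfn_lo = top - (size - 1);
    node->next = *link;             /* splice in, keeping descending order */
    *link = node;
    return node;
}
```

With an empty domain and a limit of 0xff, two 16-page requests yield the ranges 0xf0-0xff and then 0xe0-0xef, so successive allocations grow downward from the limit.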
    • K
      Intel IOMMU: clflush_cache_range now takes size param · a9c55b3b
      Keshavamurthy, Anil S committed
      Introduce the size param for clflush_cache_range().
      Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • K
      Intel IOMMU: PCI generic helper function · 994a65e2
      Keshavamurthy, Anil S committed
      When devices are under a p2p bridge, upstream transactions carry the device
      id of the bridge instead of the device's own, as the bridge owns the PCIe
      transaction.  Hence it is necessary to set up translations on behalf of the
      bridge as well.  Due to this limitation, all devices under a p2p bridge
      share the same domain in a DMAR.
      
      We just cache the type of device, i.e. whether or not it is a native PCIe
      device, for later use.
      
      [akpm@linux-foundation.org: BUG_ON -> WARN_ON+recover]
      Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • K
      Intel IOMMU: DMAR detection and parsing logic · 10e5247f
      Keshavamurthy, Anil S committed
      This patch supports the upcoming Intel IOMMU hardware, a.k.a. Intel(R)
      Virtualization Technology for Directed I/O Architecture.  The hardware spec
      can be found here:
      http://www.intel.com/technology/virtualization/index.htm
      
      FAQ! (questions from akpm, answers from ak)
      
      > So...  what's all this code for?
      >
      > I assume that the intent here is to speed things up under Xen, etc?
      
      Yes in some cases, but not this code.  That would be the Xen version of this
      code that could potentially assign whole devices to guests.  I expect this to
      be only useful in some special cases though because most hardware is not
      virtualizable and you typically want a separate instance for each guest.
      
      Ok at some point KVM might implement this too; i likely would use this code
      for this.
      
      > Do we
      > have any benchmark results to help us to decide whether a merge would be
      > justified?
      
      The main advantage for doing it in the normal kernel is not performance, but
      more safety.  Broken devices won't be able to corrupt memory by doing random
      DMA.
      
      Unfortunately that doesn't work for graphics yet; for that, user space
      interfaces for the X server are needed.
      
      There are some potential performance benefits too:
      
      - When you have a device that cannot address the complete address range an
        IOMMU can remap its memory instead of bounce buffering.  Remapping is likely
        cheaper than copying.
      
      - The IOMMU can merge sg lists into a single virtual block.  This could
        potentially speed up SG IO when the device is slow walking SG lists.  [I
        long ago benchmarked 5% on some block benchmark with an old MPT Fusion; but
        it probably depends a lot on the HBA]
      
      And you get better driver debugging because unexpected memory accesses from
      the devices will cause a trappable event.
      
      >
      > Does it slow anything down?
      
      It adds more overhead to each IO so yes.
      
      This patch:
      
      Add support for early detection and parsing of DMARs (DMA Remapping)
      reported to the OS via ACPI tables.
      
      DMA remapping (DMAR) device support enables independent address translation
      for Direct Memory Access (DMA) from devices.  These DMA remapping devices
      are reported via ACPI tables, which include the PCI device scope covered by
      each DMA remapping device.
      
      For detailed info on the specification of "Intel(R) Virtualization Technology
      for Directed I/O Architecture" please see
      http://www.intel.com/technology/virtualization/index.htm
      Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ext2: avoid rec_len overflow with 64KB block size · 89910ccc
      Jan Kara committed
      With a 64KB block size, a directory entry can be 64KB long, which does not
      fit into the 16 bits we have for the entry length.  So we store 0xffff
      instead and convert the value when it is read from / written to disk.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
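The conversion described in the 89910ccc message can be sketched as a pair of helpers. This is a minimal illustration, assuming hypothetical `sim_*` names and ignoring the little-endian byte-order conversion that on-disk fields also need. It relies on ext2 record lengths being multiples of 4, so 0xffff can never be a genuine length and is free to act as a marker for 64KB.

```c
#include <stdint.h>

/* Largest value the 16-bit on-disk field can hold; used as a marker for 64KB. */
#define SIM_EXT2_MAX_REC_LEN 0xffffu

/* Convert an in-memory record length to its 16-bit on-disk form. */
static uint16_t sim_rec_len_to_disk(unsigned int len)
{
    if (len == (1u << 16))
        return SIM_EXT2_MAX_REC_LEN;    /* 65536 does not fit in 16 bits */
    return (uint16_t)len;
}

/* Convert a 16-bit on-disk record length back to the in-memory value. */
static unsigned int sim_rec_len_from_disk(uint16_t dlen)
{
    if (dlen == SIM_EXT2_MAX_REC_LEN)
        return 1u << 16;                /* marker means a full 64KB entry */
    return dlen;
}
```

Every other length passes through unchanged, so directories on smaller block sizes are read and written exactly as before.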
    • J
      dcache: don't expose uninitialized memory in /proc/<pid>/fd/<fd> · 321bcf92
      J. Bruce Fields committed
      Well, it's not especially important that target->d_iname get the contents
      of dentry->d_iname, but it's important that it get initialized with
      *something*, otherwise we're just exposing some random piece of memory to
      anyone who reads the link at /proc/<pid>/fd/<fd> for the deleted file, when
      it's still held open by someone.
      
      I've run a test program that copies a short (<36 character) name on top of a
      long (>=36 character) name and see that the first time I run it, without
      this patch, I get unpredictable results out of /proc/<pid>/fd/<fd>.
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • S
      capabilities: clean up file capability reading · b68680e4
      Serge E. Hallyn committed
      Simplify the vfs_cap_data structure.
      
      Also fix get_file_caps which was declaring
      __le32 v1caps[XATTR_CAPS_SZ] on the stack, but
      XATTR_CAPS_SZ is already * sizeof(__le32).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Andrew Morgan <morgan@kernel.org>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
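The sizing mistake mentioned in the b68680e4 message is easy to reproduce in isolation. In the sketch below, `SIM_XATTR_CAPS_SZ` is a hypothetical stand-in for a size constant that is already expressed in bytes; the word count of 3 is an assumption chosen for illustration. Using a byte count as an array length over-allocates by a factor of `sizeof(uint32_t)`.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed number of 32-bit words in the attribute (illustrative value). */
#define SIM_CAPS_WORDS 3

/* A size in bytes: the word count is already multiplied by the element size. */
#define SIM_XATTR_CAPS_SZ (SIM_CAPS_WORDS * sizeof(uint32_t))

/* Mistake: a byte count used as an element count, giving 4x the space needed. */
static size_t sim_oversized_bytes(void)
{
    uint32_t caps[SIM_XATTR_CAPS_SZ];

    return sizeof(caps);
}

/* Intended: one element per word. */
static size_t sim_intended_bytes(void)
{
    uint32_t caps[SIM_CAPS_WORDS];

    return sizeof(caps);
}
```

Under these assumptions the oversized array occupies 48 bytes of stack where 12 were intended. The bug is harmless to correctness but wastes stack space.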
    • Y
      memory hotplug: make kmem_cache_node for SLUB on memory online avoid panic · b9049e23
      Yasunori Goto committed
      Fix a panic caused by dereferencing a NULL kmem_cache_node pointer in
      discard_slab() after memory online.
      
      When memory is onlined, kmem_cache_nodes are created in all SLUB caches
      for a new node whose memory becomes available.
      
      slab_mem_going_online_callback() is called to create the kmem_cache_node
      in the callback for the memory online event.  If it (or another callback)
      fails, then slab_mem_offline_callback() is called for rollback.
      
      On memory offline, slab_mem_going_offline_callback() is called to shrink
      all slub caches, then slab_mem_offline_callback() is called later.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: locking fix]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Y
      memory hotplug: rearrange memory hotplug notifier · 7b78d335
      Yasunori Goto committed
      The current memory notifier still has some defects.  (Fortunately, nothing
      uses it.)  This patch fixes and rearranges it.
      
        - Pass start_pfn, nr_pages, and the node id (if the node's status changes
          from/to a memoryless node) to callback functions.
          Callbacks can't do anything without this information.
        - Add a going-online status notification.
          It is necessary for creating per-node structures before the node's
          pages are available.
        - Move the GOING_OFFLINE status notification after page isolation.
          This is a good place for callbacks to return memory such as caches,
          because returned pages are not used again.
        - Add CANCEL events for rolling back when an error occurs.
        - Delete the MEM_MAPPING_INVALID notification.  It will not be used.
        - Fix a compile error in (un)register_memory_notifier().
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
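The notifier flow described in the 7b78d335 message (create per-node state on a going-online event, undo it on a cancel or offline event) can be sketched in user space as follows. Everything here is a hypothetical model: the `sim_*` event names, the `sim_memory_notify` layout, and the use of -1 for "no node status change" are assumptions for illustration, not the kernel's definitions.

```c
#include <stdlib.h>

/* Hypothetical event set modelled on the statuses named in the commit message. */
enum sim_mem_event {
    SIM_MEM_GOING_ONLINE,
    SIM_MEM_CANCEL_ONLINE,
    SIM_MEM_ONLINE,
    SIM_MEM_GOING_OFFLINE,
    SIM_MEM_CANCEL_OFFLINE,
    SIM_MEM_OFFLINE
};

/* Hypothetical argument carrying the information the commit says callbacks need. */
struct sim_memory_notify {
    unsigned long start_pfn;
    unsigned long nr_pages;
    int status_change_nid;  /* node gaining/losing all memory, or -1 (assumed) */
};

#define SIM_MAX_NODES 4
static void *sim_per_node[SIM_MAX_NODES];   /* per-node structures */

/* Returns 0 on success, -1 if per-node state could not be created. */
static int sim_mem_callback(enum sim_mem_event ev,
                            const struct sim_memory_notify *arg)
{
    int nid = arg->status_change_nid;

    if (nid < 0 || nid >= SIM_MAX_NODES)
        return 0;                       /* no node status change: nothing to do */

    switch (ev) {
    case SIM_MEM_GOING_ONLINE:
        /* Create per-node state before the node's pages become available. */
        if (sim_per_node[nid] == NULL) {
            sim_per_node[nid] = calloc(1, 64);
            if (sim_per_node[nid] == NULL)
                return -1;              /* caller would roll back with CANCEL */
        }
        break;
    case SIM_MEM_CANCEL_ONLINE:
    case SIM_MEM_OFFLINE:
        /* Roll back, or tear down once the memory is gone. */
        free(sim_per_node[nid]);
        sim_per_node[nid] = NULL;
        break;
    default:
        break;
    }
    return 0;
}
```

Handling the cancel event with the same teardown path as offline mirrors the rollback behaviour the SLUB commit (b9049e23) describes for a failed online.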
    • Y
      memory hotplug: document the memory hotplug notifier · 10020ca2
      Yasunori Goto committed
      Add a description of the event notification callback routine to the
      documentation.
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      i386: paravirt boot sequence · a24e7851
      Rusty Russell committed
      This patch uses the updated boot protocol to do paravirtualized boot.
      If the boot version is >= 2.07, then it will do two things:
      
       1. Check the bootparams loadflags to see if we should reload the
          segment registers and clear interrupts.  This is appropriate
          for normal native boot and some paravirtualized environments, but
          inappropriate for others.
      
       2. Check the hardware architecture, and dispatch to the appropriate
          kernel entrypoint.  If the bootloader doesn't set this, then we
          simply do the normal boot sequence.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
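The two checks described in the a24e7851 message can be sketched as a small decision function. This is a user-space model under stated assumptions: the `sim_*` types are hypothetical, the version is assumed to be encoded as 0x0207 for protocol 2.07, the keep-segments flag is assumed to be bit 6 of the load flags, and a subarchitecture value of 0 is assumed to mean "not set, use the normal boot sequence".

```c
#include <stdint.h>

#define SIM_BOOT_PROTO_2_07 0x0207u     /* assumed encoding of version 2.07 */
#define SIM_KEEP_SEGMENTS   (1u << 6)   /* assumed loadflags bit position */

/* Hypothetical subset of the boot header fields the commit message mentions. */
struct sim_boot_header {
    uint16_t version;
    uint8_t  loadflags;
    uint32_t hardware_subarch;
};

enum sim_entry {
    SIM_ENTRY_DEFAULT,  /* normal boot sequence */
    SIM_ENTRY_SUBARCH   /* subarchitecture-specific kernel entry point */
};

struct sim_boot_plan {
    int reload_segments;    /* reload segment registers and clear interrupts */
    enum sim_entry entry;
};

/* Decide how to proceed based on the boot protocol version and header fields. */
static struct sim_boot_plan sim_plan_boot(const struct sim_boot_header *h)
{
    struct sim_boot_plan plan = { 1, SIM_ENTRY_DEFAULT };

    if (h->version < SIM_BOOT_PROTO_2_07)
        return plan;                    /* older protocol: normal native boot */

    /* 1. Honour a request to leave the segment registers alone. */
    if (h->loadflags & SIM_KEEP_SEGMENTS)
        plan.reload_segments = 0;

    /* 2. Dispatch on the hardware subarchitecture if the bootloader set one. */
    if (h->hardware_subarch != 0)
        plan.entry = SIM_ENTRY_SUBARCH;

    return plan;
}
```

A bootloader that predates 2.07 leaves both fields untouched, so the function falls through to the ordinary native boot path.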
    • R
    • R
      update boot spec to 2.07 · e5371ac5
      Rusty Russell committed
      Updates for version 2.07 of the boot protocol.  This includes:
      
      load_flags.KEEP_SEGMENTS - flag to request/inhibit segment reloads
      hardware_subarch         - what subarchitecture we're booting under
      hardware_subarch_data    - per-architecture data
      
      The intention of these changes is to make booting a paravirtualized
      kernel work via the normal Linux boot protocol.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • T
      NFS: Fix a typo in nfs_call_unlink() · 55b70a03
      Trond Myklebust committed
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • T
    • L
      Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6 · efea90a4
      Linus Torvalds committed
      * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
        Blackfin arch: update boards files
        Blackfin arch: dma add some API and cleanup bf54x DMA definition
        Blackfin arch: cleanup and promote the general purpose timers api to a core blackfin component
        Blackfin arch: add a cheesy install target
        Blackfin arch: add functions for converting between sclks and usecs
        Blackfin arch: add assembly function for doing 64bit unsigned division
        Blackfin arch: -mno-fdpic works
        Blackfin arch: use "char bfin_board_name[]" rather than "char *bfin_board_name" per discussion on lkml as the former uses less storage
        Blackfin arch: Fixing Bug: balance calls to get_task_mm with corresponding mmput calls
        Blackfin serial driver Kconfig: depend on DMA not being enabled rather than a specific DMA size
        Blackfin arch: Fix bug: missing CHIPID register field definition of BF54x
        Blackfin arch: Fix up /proc/cpuinfo so it is like everyone else
        Blackfin arch: Optimization - no need to make additional math here
        Blackfin arch: force irq_flags into the .data section
        Blackfin arch BF548 defconfig: enable watchdog by default
        Blackfin arch: add new processor ADSP-BF52x arch/mach support
  2. 21 10月, 2007 3 次提交
  3. 22 10月, 2007 1 次提交
  4. 21 10月, 2007 1 次提交
  5. 22 10月, 2007 1 次提交
  6. 21 10月, 2007 9 次提交
  7. 22 10月, 2007 2 次提交
  8. 21 10月, 2007 5 次提交