1. 15 August 2007, 1 commit
    • [IOAT]: Remove redundant struct member to avoid descriptor cache miss · 54a09feb
      Committed by Shannon Nelson
      The layout of struct ioat_desc_sw is non-optimal and costs an extra
      cache miss for every descriptor processed.  By tightening up the struct
      layout and removing one member, we pull the fields used in the
      speed path together and get a little better performance.
      
      
      Before:
      -------
      struct ioat_desc_sw {
      	struct ioat_dma_descriptor * hw;                 /*     0     8 */
      	struct list_head           node;                 /*     8    16 */
      	int                        tx_cnt;               /*    24     4 */

      	/* XXX 4 bytes hole, try to pack */

      	dma_addr_t                 src;                  /*    32     8 */
      	__u32                      src_len;              /*    40     4 */

      	/* XXX 4 bytes hole, try to pack */

      	dma_addr_t                 dst;                  /*    48     8 */
      	__u32                      dst_len;              /*    56     4 */

      	/* XXX 4 bytes hole, try to pack */

      	/* --- cacheline 1 boundary (64 bytes) --- */
      	struct dma_async_tx_descriptor async_tx;         /*    64   144 */
      	/* --- cacheline 3 boundary (192 bytes) was 16 bytes ago --- */

      	/* size: 208, cachelines: 4 */
      	/* sum members: 196, holes: 3, sum holes: 12 */
      	/* last cacheline: 16 bytes */
      };	/* definitions: 1 */
      
      
      After:
      ------
      
      struct ioat_desc_sw {
      	struct ioat_dma_descriptor * hw;                 /*     0     8 */
      	struct list_head           node;                 /*     8    16 */
      	int                        tx_cnt;               /*    24     4 */
      	__u32                      len;                  /*    28     4 */
      	dma_addr_t                 src;                  /*    32     8 */
      	dma_addr_t                 dst;                  /*    40     8 */
      	struct dma_async_tx_descriptor async_tx;         /*    48   144 */
      	/* --- cacheline 3 boundary (192 bytes) --- */

      	/* size: 192, cachelines: 3 */
      };	/* definitions: 1 */
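
      In source form, the tightened struct would look roughly as below,
      reconstructed from the pahole dump above (a sketch, not a verbatim
      copy of the driver header); the single 'len' member replaces the
      src_len/dst_len pair, which is what closes the holes:

      struct ioat_desc_sw {
      	struct ioat_dma_descriptor *hw;	/* hardware descriptor */
      	struct list_head node;
      	int tx_cnt;
      	__u32 len;		/* one length covers both src and dst */
      	dma_addr_t src;
      	dma_addr_t dst;
      	struct dma_async_tx_descriptor async_tx;
      };
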
      Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 13 July 2007, 2 commits
    • dmaengine: make clients responsible for managing channels · d379b01e
      Committed by Dan Williams
      The current implementation assumes that a channel will only be used by
      one client at a time.  To enable channel sharing, the dmaengine core is
      changed to a model where clients subscribe to channel-available events.
      Instead of tracking how many channels a client wants and how many it
      has received, the core simply broadcasts the available channels and
      lets the clients optionally take a reference.  The core learns about a
      client's needs at dma_event_callback time.
      
      In support of multiple operation types, clients can specify a
      capability mask so that they are notified only of channels that
      satisfy a given set of capabilities, as in the sketch below.
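
      As a rough illustration of the subscription model, a minimal client
      might look like this.  The sketch assumes the dma_client /
      dma_state_client names of this era of the API (DMA_RESOURCE_AVAILABLE,
      DMA_ACK, dma_cap_set, dma_async_client_register); treat the exact
      signatures as assumptions, not a verbatim listing:

      #include <linux/dmaengine.h>

      /* Return DMA_ACK to take a reference on a newly advertised channel,
       * DMA_DUP if we already hold it, or DMA_NAK to decline. */
      static enum dma_state_client
      my_event_callback(struct dma_client *client, struct dma_chan *chan,
      		  enum dma_state state)
      {
      	if (state == DMA_RESOURCE_AVAILABLE)
      		return DMA_ACK;		/* keep this channel */
      	return DMA_NAK;			/* decline everything else */
      }

      static struct dma_client my_client = {
      	.event_callback = my_event_callback,
      };

      static int __init my_client_init(void)
      {
      	/* Only hear about channels that can do memcpy. */
      	dma_cap_set(DMA_MEMCPY, my_client.cap_mask);
      	dma_async_client_register(&my_client);
      	return 0;
      }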
      
      Changelog:
      * removed DMA_TX_ARRAY_INIT, no longer needed
      * dma_client_chan_free -> dma_chan_release: switch to global reference
        counting only at device unregistration time, before it was also happening
        at client unregistration time
      * clients now return dma_state_client to dmaengine (ack, dup, nak)
      * checkpatch.pl fixes
      * fixup merge with git-ioat
      
      Cc: Chris Leech <christopher.leech@intel.com>
      Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: David S. Miller <davem@davemloft.net>
    • dmaengine: refactor dmaengine around dma_async_tx_descriptor · 7405f74b
      Committed by Dan Williams
      The current dmaengine interface defines multiple routines per operation,
      e.g. dma_async_memcpy_buf_to_buf, dma_async_memcpy_buf_to_page, etc.
      Adding more operation types (xor, crc, etc.) to this model would result
      in an unmanageable number of method permutations.
      
      	Are we really going to add a set of hooks for each DMA engine
      	whizbang feature?
      		- Jeff Garzik
      
      The descriptor creation process is refactored using the new common
      dma_async_tx_descriptor structure.  Instead of per-driver
      do_<operation>_<dest>_to_<src> methods, drivers integrate
      dma_async_tx_descriptor into their private software descriptor and
      then define a 'prep' routine per operation.  The prep routine
      allocates a descriptor and ensures that the tx_set_src, tx_set_dest,
      and tx_submit routines are valid.  Descriptor creation and submission
      become:
      
      struct dma_device *dev;
      struct dma_chan *chan;
      struct dma_async_tx_descriptor *tx;
      
      tx = dev->device_prep_dma_<operation>(chan, len, int_flag);
      tx->tx_set_src(dma_addr_t, tx, index /* for multi-source ops */);
      tx->tx_set_dest(dma_addr_t, tx, index);
      tx->tx_submit(tx);
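
      Continuing with the declarations above, a concrete memcpy prep/submit
      might look like this sketch (src_phys/dst_phys are hypothetical,
      already-mapped DMA addresses, and 0 for int_flag assumes no
      completion interrupt is requested):

      dma_cookie_t cookie;

      tx = dev->device_prep_dma_memcpy(chan, len, 0);
      if (!tx)
      	return -ENOMEM;		/* no descriptor available */

      tx->tx_set_src(src_phys, tx, 0);	/* index 0: single source */
      tx->tx_set_dest(dst_phys, tx, 0);	/* index 0: single destination */
      cookie = tx->tx_submit(tx);		/* returns a dma_cookie_t */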
      
      In addition to the refactoring, dma_async_tx_descriptor also lays the
      groundwork for defining cross-channel operation dependencies and for a
      callback facility for asynchronous notification of operation
      completion (see the sketch below).
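
      For example, the callback facility could be used to signal a waiter,
      per this sketch; the callback/callback_param member names follow the
      "tx->callback = NULL" item in the changelog below, while my_done and
      'done' are hypothetical:

      #include <linux/completion.h>

      static void my_done(void *param)
      {
      	complete((struct completion *)param);	/* wake the waiter */
      }

      /* ... after prepping 'tx' as shown earlier ... */
      struct completion done;

      init_completion(&done);
      tx->callback = my_done;
      tx->callback_param = &done;
      tx->tx_submit(tx);
      wait_for_completion(&done);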
      
      Changelog:
      * drop dma mapping methods, suggested by Chris Leech
      * fix ioat_dma_dependency_added, also caught by Andrew Morton
      * fix dma_sync_wait, change from Andrew Morton
      * uninline large functions, change from Andrew Morton
      * add tx->callback = NULL to dmaengine calls to interoperate with async_tx
        calls
      * hookup ioat_tx_submit
      * convert channel capabilities to a 'cpumask_t like' bitmap
      * removed DMA_TX_ARRAY_INIT, no longer needed
      * checkpatch.pl fixes
      * make set_src, set_dest, and tx_submit descriptor specific methods
      * fixup git-ioat merge
      * move group_list and phys to dma_async_tx_descriptor
      
      Cc: Jeff Garzik <jeff@garzik.org>
      Cc: Chris Leech <christopher.leech@intel.com>
      Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: David S. Miller <davem@davemloft.net>
  3. 11 October 2006, 1 commit
  4. 18 June 2006, 2 commits