  1. 02 Mar, 2016 2 commits
  2. 26 Jan, 2016 1 commit
  3. 13 Jan, 2016 3 commits
  4. 07 Dec, 2015 3 commits
    • virtio_ring: shadow available ring flags & index · f277ec42
      Committed by Venkatesh Srinivas
      Improves cacheline transfer flow of available ring header.
      
      Virtqueues are implemented as a pair of rings, one producer->consumer
      avail ring and one consumer->producer used ring; preceding the
      avail ring in memory are two contiguous u16 fields -- avail->flags
      and avail->idx. A producer posts work by writing to avail->idx and
      a consumer reads avail->idx.
      
      The flags and idx fields only need to be written by a producer CPU
      and only read by a consumer CPU; when the producer and consumer are
      running on different CPUs and the virtio_ring code is structured to
      only have source writes/sink reads, we can continuously transfer the
      avail header cacheline between 'M' states between cores. This flow
      optimizes core -> core bandwidth on certain CPUs.
      
      (see: "Software Optimization Guide for AMD Family 15h Processors",
      Section 11.6; similar language appears in the 10h guide and should
      apply to CPUs w/ exclusive caches, using LLC as a transfer cache)
      
      Unfortunately the existing virtio_ring code issued reads to the
      avail->idx and read-modify-writes to avail->flags on the producer.
      
      This change shadows the flags and index fields in producer memory;
      the vring code now reads from the shadows and only ever writes to
      avail->flags and avail->idx, allowing the cacheline to transfer
      core -> core optimally.
      
      In a concurrent version of vring_bench, the time required for
      10,000,000 buffer checkout/returns was reduced by ~2% (average
      across many runs) on an AMD Piledriver (15h) CPU:
      
      (w/o shadowing):
       Performance counter stats for './vring_bench':
           5,451,082,016      L1-dcache-loads
           ...
             2.221477739 seconds time elapsed
      
      (w/ shadowing):
       Performance counter stats for './vring_bench':
           5,405,701,361      L1-dcache-loads
           ...
             2.168405376 seconds time elapsed
      
      The further away (in a NUMA sense) virtio producers and consumers are
      from each other, the more we expect to benefit. Physical implementations
      of virtio devices and implementations of virtio where the consumer polls
      vring avail indexes (vhost) should also benefit.
      Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
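The shadowing idea in the commit above can be sketched in user-space C. This is a minimal illustration, not the kernel's virtio_ring code: the `avail_hdr` and `producer` structs, the function names, and the `AVAIL_F_NO_INTERRUPT` constant are hypothetical stand-ins, and a real driver would also need memory barriers around the stores.

```c
#include <stdint.h>

/* Shared header: written by the producer, read by the consumer. */
struct avail_hdr {
    volatile uint16_t flags;
    volatile uint16_t idx;
};

/* Hypothetical stand-in for the ring's "no interrupt" flag bit. */
#define AVAIL_F_NO_INTERRUPT 1

/* Producer-private state; the shadow copies never leave producer memory. */
struct producer {
    struct avail_hdr *shared;
    uint16_t flags_shadow;
    uint16_t idx_shadow;
};

static void producer_init(struct producer *p, struct avail_hdr *hdr)
{
    p->shared = hdr;
    p->flags_shadow = 0;
    p->idx_shadow = 0;
    hdr->flags = 0;
    hdr->idx = 0;
}

/* Post one buffer: update the shadow, then store. No load from shared memory. */
static void producer_post(struct producer *p)
{
    p->idx_shadow++;
    /* A real driver would issue a write barrier before this store. */
    p->shared->idx = p->idx_shadow;
}

/* Set a flag by consulting the shadow instead of a read-modify-write
 * on the shared field. */
static void producer_disable_cb(struct producer *p)
{
    if (!(p->flags_shadow & AVAIL_F_NO_INTERRUPT)) {
        p->flags_shadow |= AVAIL_F_NO_INTERRUPT;
        p->shared->flags = p->flags_shadow;
    }
}
```

In this sketch the producer only ever writes the shared header, which is the access pattern the commit message says lets the cacheline move from core to core efficiently.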
    • virtio: Do not drop __GFP_HIGH in alloc_indirect · 82107539
      Committed by Michal Hocko
      b92b1b89 ("virtio: force vring descriptors to be allocated from
      lowmem") tried to exclude highmem pages for descriptors so it cleared
      __GFP_HIGHMEM from a given gfp mask. The patch also cleared __GFP_HIGH
      which doesn't make much sense for this fix because __GFP_HIGH only
      controls access to memory reserves and it doesn't have any influence
      on the zone selection. Some of the call paths use GFP_ATOMIC and
      dropping __GFP_HIGH will reduce their chances of success because of
      the lack of access to memory reserves.
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
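The difference between the two masks can be shown with a small stand-alone sketch. The `DEMO_*` bit values and the `demo_gfp_t` type below are illustrative assumptions, not the kernel's real definitions, which vary between versions.

```c
typedef unsigned int demo_gfp_t;

/* Illustrative bit values only; they are not the kernel's real flag values. */
#define DEMO_GFP_HIGHMEM 0x02u /* zone selector: allow highmem pages */
#define DEMO_GFP_HIGH    0x20u /* allow access to memory reserves */

/* Behaviour the commit describes as the problem: both bits cleared. */
static demo_gfp_t mask_before_fix(demo_gfp_t gfp)
{
    return gfp & ~(DEMO_GFP_HIGHMEM | DEMO_GFP_HIGH);
}

/* Behaviour after the fix: only the zone selector is cleared. */
static demo_gfp_t mask_after_fix(demo_gfp_t gfp)
{
    return gfp & ~DEMO_GFP_HIGHMEM;
}
```

With a caller mask that carries the reserve bit, `mask_after_fix` keeps that bit while still excluding highmem, whereas `mask_before_fix` strips it.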
    • virtio: fix memory leak of virtio ida cache layers · c13f99b7
      Committed by Suman Anna
      The virtio core uses a static ida named virtio_index_ida for
      assigning index numbers to virtio devices during registration.
      The ida core may allocate some internal idr cache layers and
      an ida bitmap upon any ida allocation, and all these layers are
      truly freed only upon the ida destruction. The virtio_index_ida
      is not destroyed at present, leading to a memory leak when using
      the virtio core as a module and at least one virtio device is
      registered and unregistered.
      
      Fix this by invoking ida_destroy() in the virtio core module
      exit.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Suman Anna <s-anna@ti.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
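The leak pattern described above can be mimicked with a toy user-space allocator. This is only an analogy under stated assumptions: `toy_ida` and its functions are hypothetical and do not reproduce the kernel's ida implementation. The sketch shows why releasing every ID is not enough when the allocator caches internal storage.

```c
#include <stdlib.h>

#define TOY_IDA_MAX 64

/* Toy ID allocator; the lazily allocated bitmap stands in for cached layers. */
struct toy_ida {
    unsigned char *bitmap;
};

/* Returns the lowest free ID, or -1 on failure. */
static int toy_ida_alloc(struct toy_ida *ida)
{
    int i;

    if (!ida->bitmap) {
        ida->bitmap = calloc(TOY_IDA_MAX, 1);
        if (!ida->bitmap)
            return -1;
    }
    for (i = 0; i < TOY_IDA_MAX; i++) {
        if (!ida->bitmap[i]) {
            ida->bitmap[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Releases one ID; the cached bitmap itself stays allocated. */
static void toy_ida_remove(struct toy_ida *ida, int id)
{
    if (ida->bitmap && id >= 0 && id < TOY_IDA_MAX)
        ida->bitmap[id] = 0;
}

/* Frees the cached storage; analogous in spirit to calling ida_destroy()
 * from the module exit path. */
static void toy_ida_destroy(struct toy_ida *ida)
{
    free(ida->bitmap);
    ida->bitmap = NULL;
}
```

After one allocate and remove cycle the bitmap is still held, so a module that unloads without the destroy step would leave it behind.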
  5. 08 Sep, 2015 3 commits
  6. 06 Aug, 2015 1 commit
  7. 24 Jun, 2015 1 commit
  8. 04 Jun, 2015 1 commit
  9. 28 May, 2015 1 commit
    • kernel/params: constify struct kernel_param_ops uses · 9c27847d
      Committed by Luis R. Rodriguez
      Most code already uses consts for the struct kernel_param_ops,
      sweep the kernel for the last offending stragglers. Other than
      include/linux/moduleparam.h and kernel/params.c all other changes
      were generated with the following Coccinelle SmPL patch. Merge
      conflicts between trees can be handled with Coccinelle.
      
      In the future git could get Coccinelle merge support to deal with
      patch --> fail --> grammar --> Coccinelle --> new patch conflicts
      automatically for us on patches where the grammar is available and
      the patch is of high confidence. Consider this a feature request.
      
      Test compiled on x86_64 against:
      
      	* allnoconfig
      	* allmodconfig
      	* allyesconfig
      
      @ const_found @
      identifier ops;
      @@
      
      const struct kernel_param_ops ops = {
      };
      
      @ const_not_found depends on !const_found @
      identifier ops;
      @@
      
      -struct kernel_param_ops ops = {
      +const struct kernel_param_ops ops = {
      };
      
      Generated-by: Coccinelle SmPL
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Junio C Hamano <gitster@pobox.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: cocci@systeme.lip6.fr
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
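The effect of the semantic patch on a single definition can be illustrated outside the kernel. The `demo_param_ops` struct and `demo_set` callback are hypothetical stand-ins for `struct kernel_param_ops` and a real setter; only the `const` qualifier on the ops table is the point.

```c
#include <stddef.h>

/* Hypothetical stand-in for an ops table of function pointers. */
struct demo_param_ops {
    int (*set)(const char *val, int *out);
};

/* Example setter: stores 1 if the string starts with '1', else 0. */
static int demo_set(const char *val, int *out)
{
    if (val == NULL || out == NULL)
        return -1;
    *out = (val[0] == '1');
    return 0;
}

/* After the change: the table is const, so it can be placed in read-only
 * data and cannot be modified at run time. */
static const struct demo_param_ops demo_ops = {
    .set = demo_set,
};
```

Callers are unaffected, because they only read the function pointers from the table.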
  10. 07 May, 2015 1 commit
  11. 15 Apr, 2015 5 commits
  12. 01 Apr, 2015 2 commits
  13. 29 Mar, 2015 1 commit
  14. 17 Mar, 2015 1 commit
    • virtio_mmio: fix access width for mmio · 704a0b5f
      Committed by Michael S. Tsirkin
      Going over the virtio mmio code, I noticed that it doesn't correctly
      access modern device config values using "natural" accessors: it uses
      readb to get/set them byte by byte, while the virtio 1.0 spec explicitly states:
      
      	4.2.2.2 Driver Requirements: MMIO Device Register Layout
      
      	...
      
      	The driver MUST only use 32 bit wide and aligned reads and writes to
      	access the control registers described in table 4.1.
      	For the device-specific configuration space, the driver MUST use
      	8 bit wide accesses for 8 bit wide fields, 16 bit wide and aligned
      	accesses for 16 bit wide fields and 32 bit wide and aligned accesses for
      	32 and 64 bit wide fields.
      
      Borrow code from virtio_pci_modern to do this correctly.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
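The width rule quoted from the spec can be sketched as a dispatch on field length. This is a user-space illustration, not the driver's code: the `demo_read*` helpers stand in for the kernel's `readb`/`readw`/`readl` accessors and read from an ordinary byte buffer, and endianness conversion is deliberately left out.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for MMIO accessors; here they read from a plain byte buffer. */
static uint8_t demo_readb(const uint8_t *addr)
{
    return *addr;
}

static uint16_t demo_readw(const uint8_t *addr)
{
    uint16_t v;
    memcpy(&v, addr, sizeof(v));
    return v;
}

static uint32_t demo_readl(const uint8_t *addr)
{
    uint32_t v;
    memcpy(&v, addr, sizeof(v));
    return v;
}

/* Read a config field with the access width that matches its size.
 * A 64-bit field is read as two 32-bit accesses. Returns 0 on success
 * and -1 for an unsupported length. */
static int demo_cfg_get(const uint8_t *base, unsigned int off,
                        void *buf, unsigned int len)
{
    uint8_t b;
    uint16_t w;
    uint32_t l[2];

    switch (len) {
    case 1:
        b = demo_readb(base + off);
        memcpy(buf, &b, sizeof(b));
        return 0;
    case 2:
        w = demo_readw(base + off);
        memcpy(buf, &w, sizeof(w));
        return 0;
    case 4:
        l[0] = demo_readl(base + off);
        memcpy(buf, &l[0], sizeof(l[0]));
        return 0;
    case 8:
        l[0] = demo_readl(base + off);
        l[1] = demo_readl(base + off + 4);
        memcpy(buf, l, sizeof(l));
        return 0;
    default:
        return -1;
    }
}
```

Rejecting other lengths, rather than falling back to byte-by-byte reads, keeps every access at one of the widths the spec allows.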
  15. 13 Mar, 2015 1 commit
  16. 10 Mar, 2015 2 commits
  17. 17 Feb, 2015 1 commit
  18. 11 Feb, 2015 2 commits
  19. 23 Jan, 2015 1 commit
    • virtio-mmio: Update the device to OASIS spec version · 1862ee22
      Committed by Pawel Moll
      This patch adds support for the second version of the virtio-mmio device,
      which follows OASIS "Virtual I/O Device (VIRTIO) Version 1.0"
      specification.
      
      Main changes:
      
      1. The control register symbolic names use the new device/driver
         nomenclature rather than the old guest/host one.
      
      2. The driver detects the device version (version 1 is the pre-OASIS
         spec, version 2 is compatible with the first revision of the OASIS
         spec) and drives the device accordingly.
      
      3. The new version uses direct addressing (a 64 bit address split into
         two low/high registers) instead of the guest page size based one,
         and addresses each part of the queue (descriptors, available, used)
         separately.
      
      4. The device activity is now explicitly triggered by writing to the
         "queue ready" register.
      
      5. Whole 64 bit features are properly handled now (both ways).
      Signed-off-by: Pawel Moll <pawel.moll@arm.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
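Points 3 and 4 of the commit above (split 64-bit addresses, then an explicit "queue ready" write) can be sketched as follows. The `demo_queue_regs` layout and function names are hypothetical and do not match the real register offsets; the struct is ordinary memory rather than a device mapping.

```c
#include <stdint.h>

/* Hypothetical register block; the real device uses fixed MMIO offsets. */
struct demo_queue_regs {
    uint32_t desc_low, desc_high;
    uint32_t avail_low, avail_high;
    uint32_t used_low, used_high;
    uint32_t ready;
};

/* Split a 64-bit address across a low/high pair of 32-bit registers. */
static void demo_write_addr(uint32_t *low, uint32_t *high, uint64_t addr)
{
    *low = (uint32_t)addr;
    *high = (uint32_t)(addr >> 32);
}

/* Recombine a low/high pair into the original 64-bit address. */
static uint64_t demo_read_addr(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Program each part of the queue separately, then mark the queue ready. */
static void demo_setup_queue(struct demo_queue_regs *r, uint64_t desc,
                             uint64_t avail, uint64_t used)
{
    demo_write_addr(&r->desc_low, &r->desc_high, desc);
    demo_write_addr(&r->avail_low, &r->avail_high, avail);
    demo_write_addr(&r->used_low, &r->used_high, used);
    r->ready = 1; /* activation is explicit, and comes last */
}
```

Writing the ready flag last reflects the ordering implied by the commit message: the addresses are in place before the device is told the queue is live.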
  20. 21 Jan, 2015 7 commits