1. 31 1月, 2019 1 次提交
    • J
      vhost: log dirty page correctly · 1688e75c
      Jason Wang 提交于
      [ Upstream commit cc5e710759470bc7f3c61d11fd54586f15fdbdf4 ]
      
      Vhost dirty page logging API is designed to sync through GPA. But we
      try to log GIOVA when device IOTLB is enabled. This is wrong and may
      lead to missing data after migration.
      
      To solve this issue, when logging with device IOTLB enabled, we will:
      
      1) reuse the device IOTLB translation result of GIOVA->HVA mapping to
         get HVA, for writable descriptor, get HVA through iovec. For used
         ring update, translate its GIOVA to HVA
      2) traverse the GPA->HVA mapping to get the possible GPA and log
         through GPA. Pay attention this reverse mapping is not guaranteed
         to be unique, so we should log each possible GPA in this case.
      
      This fix the failure of scp to guest during migration. In -next, we
      will probably support passing GIOVA->GPA instead of GIOVA->HVA.
      
      Fixes: 6b1e6cc7 ("vhost: new device IOTLB API")
      Reported-by: NJintack Lim <jintack@cs.columbia.edu>
      Cc: Jintack Lim <jintack@cs.columbia.edu>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1688e75c
  2. 10 1月, 2019 1 次提交
  3. 04 11月, 2018 1 次提交
  4. 26 8月, 2018 1 次提交
  5. 09 8月, 2018 1 次提交
  6. 07 8月, 2018 1 次提交
    • J
      vhost: switch to use new message format · 429711ae
      Jason Wang 提交于
      We use to have message like:
      
      struct vhost_msg {
      	int type;
      	union {
      		struct vhost_iotlb_msg iotlb;
      		__u8 padding[64];
      	};
      };
      
      Unfortunately, there will be a hole of 32bit in 64bit machine because
      of the alignment. This leads a different formats between 32bit API and
      64bit API. What's more it will break 32bit program running on 64bit
      machine.
      
      So fixing this by introducing a new message type with an explicit
      32bit reserved field after type like:
      
      struct vhost_msg_v2 {
      	__u32 type;
      	__u32 reserved;
      	union {
      		struct vhost_iotlb_msg iotlb;
      		__u8 padding[64];
      	};
      };
      
      We will have a consistent ABI after switching to use this. To enable
      this capability, introduce a new ioctl (VHOST_SET_BAKCEND_FEATURE) for
      userspace to enable this feature (VHOST_BACKEND_F_IOTLB_V2).
      
      Fixes: 6b1e6cc7 ("vhost: new device IOTLB API")
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      429711ae
  7. 13 6月, 2018 2 次提交
    • K
      treewide: kmalloc() -> kmalloc_array() · 6da2ec56
      Kees Cook 提交于
      The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
      patch replaces cases of:
      
              kmalloc(a * b, gfp)
      
      with:
              kmalloc_array(a * b, gfp)
      
      as well as handling cases of:
      
              kmalloc(a * b * c, gfp)
      
      with:
      
              kmalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kmalloc_array(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kmalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The tools/ directory was manually excluded, since it has its own
      implementation of kmalloc().
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kmalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kmalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kmalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kmalloc
      + kmalloc_array
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kmalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kmalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kmalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kmalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kmalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kmalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kmalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kmalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kmalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kmalloc(C1 * C2 * C3, ...)
      |
        kmalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kmalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kmalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kmalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kmalloc(sizeof(THING) * C2, ...)
      |
        kmalloc(sizeof(TYPE) * C2, ...)
      |
        kmalloc(C1 * C2 * C3, ...)
      |
        kmalloc(C1 * C2, ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: NKees Cook <keescook@chromium.org>
      6da2ec56
    • M
      Convert vhost to struct_size · b2303d7b
      Matthew Wilcox 提交于
      Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      b2303d7b
  8. 12 6月, 2018 1 次提交
  9. 26 5月, 2018 1 次提交
  10. 25 5月, 2018 1 次提交
  11. 11 4月, 2018 3 次提交
  12. 30 3月, 2018 1 次提交
  13. 28 3月, 2018 1 次提交
  14. 20 3月, 2018 1 次提交
  15. 12 2月, 2018 1 次提交
    • L
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds 提交于
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But they keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a9a08845
  16. 01 2月, 2018 5 次提交
  17. 25 1月, 2018 2 次提交
  18. 06 12月, 2017 1 次提交
  19. 29 11月, 2017 1 次提交
  20. 28 11月, 2017 3 次提交
  21. 15 11月, 2017 1 次提交
  22. 09 9月, 2017 1 次提交
  23. 30 7月, 2017 1 次提交
  24. 20 6月, 2017 1 次提交
    • I
      sched/wait: Rename wait_queue_t => wait_queue_entry_t · ac6424b9
      Ingo Molnar 提交于
      Rename:
      
      	wait_queue_t		=>	wait_queue_entry_t
      
      'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
      but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
      which had to carry the name.
      
      Start sorting this out by renaming it to 'wait_queue_entry_t'.
      
      This also allows the real structure name 'struct __wait_queue' to
      lose its double underscore and become 'struct wait_queue_entry',
      which is the more canonical nomenclature for such data types.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      ac6424b9
  25. 09 5月, 2017 1 次提交
  26. 02 3月, 2017 3 次提交
    • I
      sched/headers: Prepare to move signal wakeup & sigpending methods from... · 174cd4b1
      Ingo Molnar 提交于
      sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h>
      
      Fix up affected files that include this signal functionality via sched.h.
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      174cd4b1
    • I
      sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> · 6e84f315
      Ingo Molnar 提交于
      We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which
      will have to be picked up from other headers and a couple of .c files.
      
      Create a trivial placeholder <linux/sched/mm.h> file that just
      maps to <linux/sched.h> to make this patch obviously correct and
      bisectable.
      
      The APIs that are going to be moved first are:
      
         mm_alloc()
         __mmdrop()
         mmdrop()
         mmdrop_async_fn()
         mmdrop_async()
         mmget_not_zero()
         mmput()
         mmput_async()
         get_task_mm()
         mm_access()
         mm_release()
      
      Include the new header in the files that are going to need it.
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6e84f315
    • J
      vhost: introduce O(1) vq metadata cache · f8894913
      Jason Wang 提交于
      When device IOTLB is enabled, all address translations were stored in
      interval tree. O(lgN) searching time could be slow for virtqueue
      metadata (avail, used and descriptors) since they were accessed much
      often than other addresses. So this patch introduces an O(1) array
      which points to the interval tree nodes that store the translations of
      vq metadata. Those array were update during vq IOTLB prefetching and
      were reset during each invalidation and tlb update. Each time we want
      to access vq metadata, this small array were queried before interval
      tree. This would be sufficient for static mappings but not dynamic
      mappings, we could do optimizations on top.
      
      Test were done with l2fwd in guest (2M hugepage):
      
         noiommu  | before        | after
      tx 1.32Mpps | 1.06Mpps(82%) | 1.30Mpps(98%)
      rx 2.33Mpps | 1.46Mpps(63%) | 2.29Mpps(98%)
      
      We can almost reach the same performance as noiommu mode.
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      f8894913
  27. 28 2月, 2017 1 次提交
  28. 04 2月, 2017 1 次提交
    • H
      vhost: fix initialization for vq->is_le · cda8bba0
      Halil Pasic 提交于
      Currently, under certain circumstances vhost_init_is_le does just a part
      of the initialization job, and depends on vhost_reset_is_le being called
      too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
      when vq->private_data is NULL. This is not only counter intuitive, but
      also real a problem because it breaks vhost_net. The bug was introduced to
      vhost_net with commit 2751c988 ("vhost: cross-endian support for
      legacy devices"). The symptom is corruption of the vq's used.idx field
      (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
      shutdown on a vq with pending descriptors.
      
      Let us make sure the outcome of vhost_init_is_le never depend on the state
      it is actually supposed to initialize, and fix virtio_net by removing the
      reset from vhost_vq_init_access.
      
      With the above, there is no reason for vhost_reset_is_le to do just half
      of the job. Let us make vhost_reset_is_le reinitialize is_le.
      Signed-off-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
      Reported-by: NMichael A. Tebolt <miket@us.ibm.com>
      Reported-by: NDr. David Alan Gilbert <dgilbert@redhat.com>
      Fixes: commit 2751c988 ("vhost: cross-endian support for legacy devices")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: NGreg Kurz <groug@kaod.org>
      Tested-by: NMichael A. Tebolt <miket@us.ibm.com>
      cda8bba0