1. 26 7月, 2019 1 次提交
  2. 10 7月, 2019 1 次提交
  3. 19 6月, 2019 1 次提交
  4. 05 6月, 2019 1 次提交
    • H
      irqchip/gic-v3-its: Fix command queue pointer comparison bug · a050fa54
      Heyi Guo 提交于
      When we run several VMs with PCI passthrough and GICv4 enabled, not
      pinning vCPUs, we will occasionally see below warnings in dmesg:
      
      ITS queue timeout (65440 65504 480)
      ITS cmd its_build_vmovp_cmd failed
      
      The reason for the above issue is that in BUILD_SINGLE_CMD_FUNC:
      1. Post the write command.
      2. Release the lock.
      3. Start to read GITS_CREADR to get the reader pointer.
      4. Compare the reader pointer to the target pointer.
      5. If reader pointer does not reach the target, sleep 1us and continue
      to try.
      
      If we have several processors running the above concurrently, other
      CPUs will post write commands while the 1st CPU is waiting the
      completion. So we may have below issue:
      
      phase 1:
      ---rd_idx-----from_idx-----to_idx--0---------
      
      wait 1us:
      
      phase 2:
      --------------from_idx-----to_idx--0-rd_idx--
      
      That is the rd_idx may fly ahead of to_idx, and if in case to_idx is
      near the wrap point, rd_idx will wrap around. So the below condition
      will not be met even after 1s:
      
      if (from_idx < to_idx && rd_idx >= to_idx)
      
      There is another theoretical issue. For a slow and busy ITS, the
      initial rd_idx may fall behind from_idx a lot, just as below:
      
      ---rd_idx---0--from_idx-----to_idx-----------
      
      This will cause the wait function exit too early.
      
      Actually, it does not make much sense to use from_idx to judge if
      to_idx is wrapped, but we need a initial rd_idx when lock is still
      acquired, and it can be used to judge whether to_idx is wrapped and
      the current rd_idx is wrapped.
      
      We switch to a method of calculating the delta of two adjacent reads
      and accumulating it to get the sum, so that we can get the real rd_idx
      from the wrapped value even when the queue is almost full.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Signed-off-by: NHeyi Guo <guoheyi@huawei.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      a050fa54
  5. 03 5月, 2019 1 次提交
  6. 29 4月, 2019 4 次提交
  7. 05 4月, 2019 1 次提交
  8. 21 3月, 2019 1 次提交
  9. 21 2月, 2019 1 次提交
  10. 14 2月, 2019 1 次提交
  11. 29 1月, 2019 3 次提交
    • M
      irqchip/gic-v3-its: Gracefully fail on LPI exhaustion · 45725e0f
      Marc Zyngier 提交于
      In the unlikely event that we cannot find any available LPI in the
      system, we should gracefully return an error instead of carrying
      on with no LPI allocated at all.
      
      Fixes: 38dd7c49 ("irqchip/gic-v3-its: Drop chunk allocation compatibility")
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      45725e0f
    • M
      irqchip/gic-v3-its: Plug allocation race for devices sharing a DevID · 9791ec7d
      Marc Zyngier 提交于
      On systems or VMs where multiple devices share a single DevID
      (because they sit behind a PCI bridge, or because the HW is
      broken in funky ways), we reuse the save its_device structure
      in order to reflect this.
      
      It turns out that there is a distinct lack of locking when looking
      up the its_device, and two device being probed concurrently can result
      in double allocations. That's obviously not nice.
      
      A solution for this is to have a per-ITS mutex that serializes device
      allocation.
      
      A similar issue exists on the freeing side, which can run concurrently
      with the allocation. On top of now taking the appropriate lock, we
      also make sure that a shared device is never freed, as we have no way
      to currently track the life cycle of such object.
      Reported-by: NZheng Xiang <zhengxiang9@huawei.com>
      Tested-by: NZheng Xiang <zhengxiang9@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      9791ec7d
    • H
      irqchip/gic-v4: Fix occasional VLPI drop · 6479450f
      Heyi Guo 提交于
      1. In current implementation, every VLPI will temporarily be mapped to
      the first CPU in system (normally CPU0) and then moved to the real
      scheduled CPU later.
      
      2. So there is a time window and a VLPI may be sent to CPU0 instead of
      the real scheduled vCPU, in a multi-CPU virtual machine.
      
      3. However, CPU0 may have not been scheduled as a virtual CPU after
      system boots up, so the value of its GICR_VPROPBASER is unknown at
      that moment.
      
      4. If the INTID of VLPI is larger than 2^(GICR_VPROPBASER.IDbits+1),
      while IDbits is also in unknown state, GIC will behave as if the VLPI
      is out of range and simply drop it, which results in interrupt missing
      in Guest.
      
      As no code will clear GICR_VPROPBASER at runtime, we can safely
      initialize the IDbits field at boot time for each CPU to get rid of
      this issue.
      
      We also clear Valid bit of GICR_VPENDBASER in case any ancient
      programming gets left in and causes memory corrupting. A new function
      its_clear_vpend_valid() is added to reuse the code in
      its_vpe_deschedule().
      
      Fixes: e643d803 ("irqchip/gic-v3-its: Add VPE scheduling")
      Signed-off-by: NHeyi Guo <guoheyi@huawei.com>
      Signed-off-by: NHeyi Guo <heyi.guo@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      6479450f
  12. 18 1月, 2019 1 次提交
  13. 03 10月, 2018 1 次提交
  14. 02 10月, 2018 9 次提交
  15. 07 9月, 2018 1 次提交
  16. 06 8月, 2018 1 次提交
  17. 16 7月, 2018 6 次提交
    • M
      irqchip/gic-v3-its: Honor hypervisor enforced LPI range · 12b2905a
      Marc Zyngier 提交于
      A recent extension to the GIC architecture allows a hypervisor to
      arbitrarily reduce the number of LPIs available to a guest, no
      matter what the GIC says about the valid range of IntIDs.
      
      Let's factor in this information when computing the number of
      available LPIs
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      12b2905a
    • M
      irqchip/gic-v3: Expose GICD_TYPER in the rdist structure · a4f9edb2
      Marc Zyngier 提交于
      Instead of exposing the GIC distributor IntID field in the rdist
      structure that is passed to the ITS, let's replace it with a
      copy of the whole GICD_TYPER register. We are going to need
      some of this information at a later time.
      
      No functionnal change.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      a4f9edb2
    • M
      irqchip/gic-v3-its: Drop chunk allocation compatibility · 38dd7c49
      Marc Zyngier 提交于
      The chunk allocation system is now officially dead, so let's
      remove it.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      38dd7c49
    • M
      irqchip/gic-v3-its: Move minimum LPI requirements to individual busses · 147c8f37
      Marc Zyngier 提交于
      At the moment, the core ITS driver imposes the allocation to be
      in chunks of 32. As we want to relax this on a per bus basis, let's
      move the the the allocation constraints to each bus.
      
      No functionnal change.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      147c8f37
    • M
      irqchip/gic-v3-its: Use full range of LPIs · fe8e9350
      Marc Zyngier 提交于
      As we used to represent the LPI range using a bitmap, we were reducing
      the number of LPIs to at most 64k in order to preserve memory.
      
      With our new allocator, there is no such need, as dealing with 2^16
      or 2^32 LPIs takes the same amount of memory.
      
      So let's use the number of IntID bits reported by the GIC instead of
      an arbitrary limit.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      fe8e9350
    • M
      irqchip/gic-v3-its: Refactor LPI allocator · 880cb3cd
      Marc Zyngier 提交于
      Our current LPI allocator relies on a bitmap, each bit representing
      a chunk of 32 LPIs, meaning that each device gets allocated LPIs
      in multiple of 32. It served us well so far, but new use cases now
      require much more finer grain allocations, down the the individual
      LPI.
      
      Given the size of the IntID space (up to 32bit), it isn't practical
      to continue using a bitmap, so let's use a different data structure
      altogether.
      
      We switch to a list, where each element represent a contiguous range
      of LPIs. On allocation, we simply grab the first group big enough to
      satisfy the allocation, and substract what we need from it. If the
      group becomes empty, we just remove it. On freeing interrupts, we
      insert a new group of interrupt in the list, sort it and fuse the
      adjacent groups.
      
      This makes freeing interrupt much more expensive than allocating
      them (an unusual behaviour), but that's fine as long as we consider
      that freeing interrupts is an extremely rare event.
      
      We still allocate interrupts in blocks of 32 for the time being,
      but subsequent patches will relax this.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      880cb3cd
  18. 22 6月, 2018 4 次提交
  19. 13 6月, 2018 1 次提交
    • K
      treewide: kzalloc() -> kcalloc() · 6396bb22
      Kees Cook 提交于
      The kzalloc() function has a 2-factor argument form, kcalloc(). This
      patch replaces cases of:
      
              kzalloc(a * b, gfp)
      
      with:
              kcalloc(a * b, gfp)
      
      as well as handling cases of:
      
              kzalloc(a * b * c, gfp)
      
      with:
      
              kzalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kzalloc_array(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kzalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kzalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kzalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kzalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kzalloc
      + kcalloc
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kzalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(sizeof(THING) * C2, ...)
      |
        kzalloc(sizeof(TYPE) * C2, ...)
      |
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(C1 * C2, ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: NKees Cook <keescook@chromium.org>
      6396bb22