1. 27 Mar 2020, 5 commits
  2. 18 Mar 2020, 12 commits
  3. 23 Jan 2020, 1 commit
    • net: Fix packet reordering caused by GRO and listified RX cooperation · c8079432
      Authored by Maxim Mikityanskiy
      Commit 323ebb61 ("net: use listified RX for handling GRO_NORMAL
      skbs") introduces batching of GRO_NORMAL packets in napi_frags_finish,
      and commit 6570bc79 ("net: core: use listified Rx for GRO_NORMAL in
      napi_gro_receive()") adds the same to napi_skb_finish. However,
      dev_gro_receive (that is called just before napi_{frags,skb}_finish) can
      also pass skbs to the networking stack: e.g., when the GRO session is
      flushed, napi_gro_complete is called, which passes pp directly to
      netif_receive_skb_internal, skipping napi->rx_list. It means that the
      packet stored in pp will be handled by the stack earlier than the
      packets that arrived before, but are still waiting in napi->rx_list. It
      leads to TCP reorderings that can be observed in the TCPOFOQueue counter
      in netstat.
      
      This commit fixes the reordering issue by making napi_gro_complete also
      use napi->rx_list, so that all packets going through GRO will keep their
      order. In order to keep napi_gro_flush working properly, gro_normal_list
      calls are moved after the flush to clear napi->rx_list.
      
      iwlwifi calls napi_gro_flush directly and does the same thing that is
      done by gro_normal_list, so the same change is applied there:
      napi_gro_flush is moved to be before the flush of napi->rx_list.
      
      A few other drivers also use napi_gro_flush (brocade/bna/bnad.c,
      cortina/gemini.c, hisilicon/hns3/hns3_enet.c). The first two also use
      napi_complete_done afterwards, which performs the gro_normal_list flush,
      so they are fine. The latter calls napi_gro_receive right after
      napi_gro_flush, so it can end up with non-empty napi->rx_list anyway.
      
      Fixes: 323ebb61 ("net: use listified RX for handling GRO_NORMAL skbs")
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Cc: Alexander Lobakin <alobakin@dlink.ru>
      Cc: Edward Cree <ecree@solarflare.com>
      Acked-by: Alexander Lobakin <alobakin@dlink.ru>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 04 Jan 2020, 3 commits
    • iwlwifi: add device name to device_info · 0b295a1e
      Authored by Luca Coelho
      We have a lot of mostly duplicated data structures that are repeated
      only because the device name string is different.  To avoid this, move
      the string from the cfg to the trans structure and add it
      independently from the rest of the configuration to the PCI mapping
      tables.
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: implement a new device configuration table · 2a612a60
      Authored by Luca Coelho
      Add a new device table that contains information that can be checked
      at runtime in order to decide which configuration to use.  This allows
      us to map the full cfg independently from the trans-specific
      configuration.
      
      This is the first step in creating the new table.  Subsequent patches
      will add the possibility of checking different values at runtime in
      order to make the decision.
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: assume the driver_data is a trans_cfg, but allow full cfg · b3bd6416
      Authored by Luca Coelho
      With the new concept of separating the trans-specific (trans_cfg) data
      from the rest of the cfg, we will start mapping only the trans_cfg
      part to the PCI device ID/subsystem device ID.  So we can assume that
      the data passed to the probe function contains the trans_cfg, but
      since the full cfg still contains the trans_cfg at the beginning, we
      can allow a full cfg to be passed as well.  This makes it easier to
      convert the existing tables one by one.
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
  5. 24 Dec 2019, 2 commits
    • iwlwifi: pcie: always disable L0S states · cc894b85
      Authored by Luca Coelho
      L0S states have been found to be unstable with our devices and in
      newer hardware they are not supported at all, so we must always set
      the L0S_DISABLED bit.  Previously we were only disabling L0S states if
      L1 was supported, because the assumption was that transitions from L0S
      to L1 were the problematic case.  But now we should never use
      L0S, so do it regardless of whether L1 is supported or not.
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: pcie: rename L0S_ENABLED bit to L0S_DISABLED · 3d1b28fd
      Authored by Luca Coelho
      This bit has been misnamed since the initial implementation of the
      driver.  The correct semantics is that setting this bit disables L0S
      states, and we already clearly use it as such in the code.  Rename it
      to avoid confusion.
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
  6. 23 Dec 2019, 10 commits
    • iwlwifi: remove CSR registers abstraction · 6dece0e9
      Authored by Luca Coelho
      We needed this abstraction for some CSR registers for
      IWL_DEVICE_22560, but that has been removed, so we don't need the
      abstraction anymore.  Remove it.
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: remove some outdated iwl22000 configurations · b81b7bd0
      Authored by Luca Coelho
      A few configuration structures were either not referenced anymore or
      assigned to device IDs that were not in use anymore.  Remove them.
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: pcie: validate queue ID before array deref/bit ops · 0e002708
      Authored by Johannes Berg
      Validate that the queue ID is in range before trying to use it as
      an index or for test_bit().  The previous bug showed that this has
      in fact happened, and it was lucky that we caught it there; had the
      bit been set, we'd have actually used the value despite it being
      far out of range.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: pcie: use partial pages if applicable · cfdc20ef
      Authored by Johannes Berg
      If we have only 2k RBs like on the latest (AX210) hardware, then
      even on x86 where PAGE_SIZE is 4k we currently waste half of the
      memory.
      
      If this is the case, return partial pages from the allocator and
      track the offset in each RBD (to be able to find the data in them
      and remap them later.)
      
      This might also address other platforms with larger PAGE_SIZE by
      putting more RBs into a single large page.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: pcie: map only used part of RX buffers · 80084e35
      Authored by Johannes Berg
      We don't need to map *everything* of the RX buffers; we won't use
      that much, so map only the part we're going to use.  This saves some
      IOMMU space (if applicable, and if the IOMMU can deal with it) and
      also prepares a bit for mapping partial pages for 2k buffers later.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: allocate more receive buffers for HE devices · c042f0c7
      Authored by Johannes Berg
      For HE-capable devices, we need to allocate more receive buffers as
      there could be 256 frames aggregated into a single A-MPDU, and then
      they might contain A-MSDUs as well. Until 22000 family, the devices
      are able to put multiple frames into a single RB and the default RB
      size is 4k, but starting from AX210 family this is no longer true.
      On the other hand, those newer devices only use 2k receive buffers
      (by default).
      
      Modify the code and configuration to allocate an appropriate number
      of RBs depending on the device capabilities:
      
       * 4096 for AX210 HE devices, which use 2k buffers by default,
       * 2048 for 22000 family devices which use 4k buffers by default,
       * 512 for existing 9000 family devices, which doesn't really
         change anything since that's the default before this patch,
       * 512 also for AX210/22000 family devices that don't do HE.
      
      Theoretically, for devices lower than AX210, we wouldn't have to
      allocate that many RBs if the RB size was manually increased, but
      supporting that would have made the code more complex, and it didn't
      really seem necessary as that's a use case for monitor mode only,
      where hopefully the wasted memory isn't really much of a concern.
      
      Note that AX210 devices actually support bigger than 12-bit VID,
      which is required here as we want to allocate 4096 buffers plus
      some for quick recycling, so adjust the code for that as well.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: pcie: extend hardware workaround to context-info · d84a7a65
      Authored by Johannes Berg
      After more investigation on the hardware side, it appears that the
      hardware bug regarding 2^32 boundary reaching/crossing also affects
      other uses of the DMA engine, in particular the ones triggered by
      the context-info (image loader) mechanism.
      
      It also turns out that the bug only affects devices with the gen2
      TX hardware engine, so we don't need to change the context info for
      gen3.  It's simpler to still keep the TX path workarounds for both,
      though.
      
      Add the workaround to that code as well; this is a lot simpler as
      we have just a single way to allocate DMA memory there.
      
      I made the algorithm recursive (with a small limit) since it's
      actually (almost) impossible to hit this today - dma_alloc_coherent
      is currently documented to always return 32-bit addressable memory
      regardless of the DMA mask for it, and so we could only get REALLY
      unlucky to get the very last page in that area.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: pcie: allocate smaller dev_cmd for TX headers · a89c72ff
      Authored by Johannes Berg
      As noted in the previous commit, due to the way we allocate the
      dev_cmd headers with a 324 byte size and 4/8 byte alignment, the
      part of them we use (bytes 20..40-68) could still cross a page and
      thus a 2^32 boundary.
      
      Address this by using alignment to ensure that the allocation
      cannot cross a page boundary, on hardware that's affected. To
      make that not cause more memory consumption, reduce the size of
      the allocations to the necessary size - we go from 324 bytes in
      each allocation to 60/68 on gen2 depending on family, and ~120
      or so on gen1 (so on gen1 it's a pure reduction in size, since
      we don't need alignment there).
      
      To avoid size and clearing issues, add a new structure that's
      just the header, and use kmem_cache_zalloc().
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: pcie: detect the DMA bug and warn if it happens · c5a4e8eb
      Authored by Johannes Berg
      Warn if the DMA bug is going to happen. We don't have a good
      way of actually aborting in this case and we have workarounds
      in place for the cases where it happens, but in order to not
      be surprised add a safety-check and warn.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
    • iwlwifi: pcie: work around DMA hardware bug · c4a786b3
      Authored by Johannes Berg
      There's a hardware bug in the flow handler (DMA engine): if the
      address + len of some TB wraps around a 2^32 boundary, the carry
      bit is then carried over into the next TB.
      
      Work around this by copying the data to a new page when we find
      this situation, and then copy it in a way that we cannot hit the
      very end of the page.
      
      To be able to free the new page again later we need to chain it
      to the TSO page, use the last pointer there to make sure we can
      never use the page fully for DMA, and thus cannot cause the same
      overflow situation on this page.
      
      This leaves a few potential places (where we didn't observe the
      problem) unaddressed:
       * The second TB could reach or cross the end of a page (and thus
         2^32) due to the way we allocate the dev_cmd for the header
       * For host commands, a similar thing could happen since they're
         just kmalloc().
      We'll address these in further commits.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
  7. 20 Dec 2019, 1 commit
  8. 10 Dec 2019, 2 commits
  9. 28 Nov 2019, 1 commit
  10. 20 Nov 2019, 3 commits