1. 03 7月, 2017 1 次提交
  2. 20 4月, 2017 4 次提交
    • B
      mwifiex: pcie: clear outstanding work when resetting · 35e67d3d
      Brian Norris 提交于
      When we shut down the device (i.e., during 'reset'), we cancel any
      outstanding work, but we don't clear any work-related flags. This can
      cause problems if, e.g., we begin to queue a new firmware dump or card
      reset while the other one is in progress. That might leave work_flags
      with a stale value, and we might begin one of these *after* we've
      completely reset the device. That doesn't make sense, because all
      firmware context will have been lost by then.
      
      This fixes some forms of cascading failures, where I:
      
      (a) force a firmware dump (cat /sys/kernel/debug/mwifiex/mlan0/device_dump)
      (b) run a Wifi scan in parallel (iw mlan0 scan)
      (c) the scan times out due to (a) hogging the interface
      (d) the command timeout triggers another firmware dump and a reset [*]
      (e) the 2nd firmware dump flag persists across the reset
      (f) as soon as the interface comes back up, we trigger the pending
          firmware dump
      (g) subsequent commands time out again, while we are processing the
          firmware dump; return to (d)
      
      [*] Note that automatic card_reset() support is not yet implemented for
      the mwifiex PCIe driver, so we won't hit *exactly* this behavior yet.
      But we can see similarly-confusing behaviors today.
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      35e67d3d
    • B
      mwifiex: pcie: fix cmd_buf use-after-free in remove/reset · 3c8cb9ad
      Brian Norris 提交于
      Command buffers (skb's) are allocated by the main driver, and freed upon
      the last use. That last use is often in mwifiex_free_cmd_buffer(). In
      the meantime, if the command buffer gets used by the PCI driver, we map
      it as DMA-able, and store the mapping information in the 'cb' memory.
      
      However, if a command was in-flight when resetting the device (and
      therefore was still mapped), we don't get a chance to unmap this memory
      until after the core has cleaned up its command handling.
      
      Let's keep a refcount within the PCI driver, so we ensure the memory
      only gets freed after we've finished unmapping it.
      
      Noticed by KASAN when forcing a reset via:
      
        echo 1 > /sys/bus/pci/.../reset
      
      The same code path can presumably be exercised in remove() and
      shutdown().
      
      [  205.390377] mwifiex_pcie 0000:01:00.0: info: shutdown mwifiex...
      [  205.400393] ==================================================================
      [  205.407719] BUG: KASAN: use-after-free in mwifiex_unmap_pci_memory.isra.14+0x4c/0x100 [mwifiex_pcie] at addr ffffffc0ad471b28
      [  205.419040] Read of size 16 by task bash/1913
      [  205.423421] =============================================================================
      [  205.431625] BUG skbuff_head_cache (Tainted: G    B          ): kasan: bad access detected
      [  205.439815] -----------------------------------------------------------------------------
      [  205.439815]
      [  205.449534] INFO: Allocated in __build_skb+0x48/0x114 age=1311 cpu=4 pid=1913
      [  205.456709] 	alloc_debug_processing+0x124/0x178
      [  205.461282] 	___slab_alloc.constprop.58+0x528/0x608
      [  205.466196] 	__slab_alloc.isra.54.constprop.57+0x44/0x54
      [  205.471542] 	kmem_cache_alloc+0xcc/0x278
      [  205.475497] 	__build_skb+0x48/0x114
      [  205.479019] 	__netdev_alloc_skb+0xe0/0x170
      [  205.483244] 	mwifiex_alloc_cmd_buffer+0x68/0xdc [mwifiex]
      [  205.488759] 	mwifiex_init_fw+0x40/0x6cc [mwifiex]
      [  205.493584] 	_mwifiex_fw_dpc+0x158/0x520 [mwifiex]
      [  205.498491] 	mwifiex_reinit_sw+0x2c4/0x398 [mwifiex]
      [  205.503510] 	mwifiex_pcie_reset_notify+0x114/0x15c [mwifiex_pcie]
      [  205.509643] 	pci_reset_notify+0x5c/0x6c
      [  205.513519] 	pci_reset_function+0x6c/0x7c
      [  205.517567] 	reset_store+0x68/0x98
      [  205.521003] 	dev_attr_store+0x54/0x60
      [  205.524705] 	sysfs_kf_write+0x9c/0xb0
      [  205.528413] INFO: Freed in __kfree_skb+0xb0/0xbc age=131 cpu=4 pid=1913
      [  205.535064] 	free_debug_processing+0x264/0x370
      [  205.539550] 	__slab_free+0x84/0x40c
      [  205.543075] 	kmem_cache_free+0x1c8/0x2a0
      [  205.547030] 	__kfree_skb+0xb0/0xbc
      [  205.550465] 	consume_skb+0x164/0x178
      [  205.554079] 	__dev_kfree_skb_any+0x58/0x64
      [  205.558304] 	mwifiex_free_cmd_buffer+0xa0/0x158 [mwifiex]
      [  205.563817] 	mwifiex_shutdown_drv+0x578/0x5c4 [mwifiex]
      [  205.569164] 	mwifiex_shutdown_sw+0x178/0x310 [mwifiex]
      [  205.574353] 	mwifiex_pcie_reset_notify+0xd4/0x15c [mwifiex_pcie]
      [  205.580398] 	pci_reset_notify+0x5c/0x6c
      [  205.584274] 	pci_dev_save_and_disable+0x24/0x6c
      [  205.588837] 	pci_reset_function+0x30/0x7c
      [  205.592885] 	reset_store+0x68/0x98
      [  205.596324] 	dev_attr_store+0x54/0x60
      [  205.600017] 	sysfs_kf_write+0x9c/0xb0
      ...
      [  205.800488] Call trace:
      [  205.802980] [<ffffffc00020a69c>] dump_backtrace+0x0/0x190
      [  205.808415] [<ffffffc00020a96c>] show_stack+0x20/0x28
      [  205.813506] [<ffffffc0005d020c>] dump_stack+0xa4/0xcc
      [  205.818598] [<ffffffc0003be44c>] print_trailer+0x158/0x168
      [  205.824120] [<ffffffc0003be5f0>] object_err+0x4c/0x5c
      [  205.829210] [<ffffffc0003c45bc>] kasan_report+0x334/0x500
      [  205.834641] [<ffffffc0003c3994>] check_memory_region+0x20/0x14c
      [  205.840593] [<ffffffc0003c3b14>] __asan_loadN+0x14/0x1c
      [  205.845879] [<ffffffbffc46171c>] mwifiex_unmap_pci_memory.isra.14+0x4c/0x100 [mwifiex_pcie]
      [  205.854282] [<ffffffbffc461864>] mwifiex_pcie_delete_cmdrsp_buf+0x94/0xa8 [mwifiex_pcie]
      [  205.862421] [<ffffffbffc462028>] mwifiex_pcie_free_buffers+0x11c/0x158 [mwifiex_pcie]
      [  205.870302] [<ffffffbffc4620d4>] mwifiex_pcie_down_dev+0x70/0x80 [mwifiex_pcie]
      [  205.877736] [<ffffffbffc1397a8>] mwifiex_shutdown_sw+0x190/0x310 [mwifiex]
      [  205.884658] [<ffffffbffc4606b4>] mwifiex_pcie_reset_notify+0xd4/0x15c [mwifiex_pcie]
      [  205.892446] [<ffffffc000635f54>] pci_reset_notify+0x5c/0x6c
      [  205.898048] [<ffffffc00063a044>] pci_dev_save_and_disable+0x24/0x6c
      [  205.904350] [<ffffffc00063cf0c>] pci_reset_function+0x30/0x7c
      [  205.910134] [<ffffffc000641118>] reset_store+0x68/0x98
      [  205.915312] [<ffffffc000771588>] dev_attr_store+0x54/0x60
      [  205.920750] [<ffffffc00046f53c>] sysfs_kf_write+0x9c/0xb0
      [  205.926182] [<ffffffc00046dfb0>] kernfs_fop_write+0x184/0x1f8
      [  205.931963] [<ffffffc0003d64f4>] __vfs_write+0x6c/0x17c
      [  205.937221] [<ffffffc0003d7164>] vfs_write+0xf0/0x1c4
      [  205.942310] [<ffffffc0003d7da0>] SyS_write+0x78/0xd8
      [  205.947312] [<ffffffc000204634>] el0_svc_naked+0x24/0x28
      ...
      [  205.998268] ==================================================================
      
      This bug has been around in different forms for a while. It was sort of
      noticed in commit 955ab095 ("mwifiex: Do not kfree cmd buf while
      unregistering PCIe"), but it just fixed the double-free, without
      acknowledging the potential for use-after-free.
      
      Fixes: fc331460 ("mwifiex: use pci_alloc/free_consistent APIs for PCIe")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      3c8cb9ad
    • X
      mwifiex: pcie: extract wifi part from combo firmware during function level reset · efde6648
      Xinming Hu 提交于
      A separate wifi-only firmware was download during pcie function level
      reset. It is in fact the tail part of wifi/bt combo firmware. Per
      Brian's and Dmitry's suggestion, this patch extract the wifi part from
      combo firmware.
      
      After that, the mrvl/pcie8997_wlan_v4.bin image in linux-firmware repo
      is redundant (though I guess we keep it around to support older
      kernels).
      Signed-off-by: NXinming Hu <huxm@marvell.com>
      Signed-off-by: NGanapathi Bhat <gbhat@marvell.com>
      Signed-off-by: NCathy Luo <cluo@marvell.com>
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      efde6648
    • X
      mwifiex: pcie: correct scratch register name · 127ee1db
      Xinming Hu 提交于
      This patch correct pcie scratch register name, to keep the same with
      chipset side definition.
      Signed-off-by: NXinming Hu <huxm@marvell.com>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      127ee1db
  3. 05 4月, 2017 1 次提交
    • B
      mwifiex: catch mwifiex_fw_dpc() errors properly in reset · 755b37c9
      Brian Norris 提交于
      When resetting the device, we take a synchronous firmware-loading code
      path, which borrows a lot from the asynchronous path used at probe time.
      We don't catch errors correctly though, which means that in the PCIe
      driver, we may try to dereference the 'adapter' struct after
      mwifiex_fw_dpc() has freed it. See this (erronous) print in
      mwifiex_pcie_reset_notify():
      
      	mwifiex_dbg(adapter, INFO, "%s, successful\n", __func__);
      
      Let's instead refactor the synchronous (or "!req_fw_nowait") path so
      that we propagate errors and handle them properly.
      
      This fixes a use-after-free issue in the PCIe driver, as well as a
      misleading debug message ("successful"). It looks like the SDIO driver
      doesn't have these problems, since it doesn't do anything after
      mwifiex_reinit_sw().
      
      Fixes: 4c5dae59 ("mwifiex: add PCIe function level reset support")
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      755b37c9
  4. 21 3月, 2017 3 次提交
    • B
      mwifiex: fix kernel crash after shutdown command timeout · 5caa7f38
      Brian Norris 提交于
      We observed a SHUTDOWN command timeout during reboot stress test due to
      a corner case firmware bug. It can lead to either a use-after-free +
      OOPS (on either the adapter structure, or the 'card' structure) or an
      abort (where, e.g., the PCI device is "disabled" before we're done
      dumping the FW).
      
      We can avoid this by canceling/flushing the FW dump work:
      
      (a) after we've terminated all other work queues (e.g., for processing
          commands which could time out)
      (b) after we've disabled all interrupts (which could also queue more
          work for us)
      (c) after we've unregistered the netdev and wiphy structures (and
          implicitly, and debugfs entries which could manually trigger FW dumps)
      (d) before we've actually disabled the device (e.g.,
          pci_device_disable())
      
      Altogether, this means no card->work will be scheduled if we sync at
      a point that satisfies the above. This can be done at the beginning of
      the .cleanup_if() callback.
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      5caa7f38
    • D
      mwifiex: fix for unaligned reads · 92c70a95
      Devidas Puranik 提交于
      Using the accessor function e.g. get_unaligned_le32 instead of
      le32_to_cpu to avoid the unaligned access. This is for the
      architectures that don't handle the unaligned memory access
      Signed-off-by: NDevidas Puranik <devidas@marvell.com>
      Signed-off-by: NGanapathi Bhat <gbhat@marvell.com>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      92c70a95
    • B
      mwifiex: pcie: clean up error prints in mwifiex_pcie_reset_notify() · 52033415
      Brian Norris 提交于
      We shouldn't be printing a kernel pointer as a decimal integer. But we
      really shouldn't be printing this case at all; we should never get here
      with NULL drvdata. We've eliminated this unnecessary conditional in
      several other places, so kill it here too.
      
      Similarly, there's no need to check for '!pdev'; we are guaranteed to
      have a real device here.
      
      And finally, use dev_err() instead of pr_err().
      
      This yields (for failed PCIe resets):
      
      [   68.286586] mwifiex_pcie 0000:01:00.0: mwifiex_pcie_reset_notify: adapter structure is not valid
      
      instead of:
      
      [   82.932658] mwifiex_pcie: mwifiex_pcie_reset_notify: Card or adapter structure is not valid (-270880688088)
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      52033415
  5. 16 3月, 2017 1 次提交
    • B
      mwifiex: pcie: don't leak DMA buffers when removing · 4e841d3e
      Brian Norris 提交于
      When PCIe FLR support was added, much of the remove/release code for
      PCIe was migrated to ->down_dev(), but ->down_dev() is never called for
      device removal. Let's refactor the cleanup to be done in both cases.
      
      Also, drop the comments above mwifiex_cleanup_pcie(), because they were
      clearly wrong, and it's better to have clear and obvious code than to
      detail the code steps in comments anyway.
      
      Fixes: 4c5dae59 ("mwifiex: add PCIe function level reset support")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      4e841d3e
  6. 28 1月, 2017 2 次提交
  7. 20 1月, 2017 3 次提交
    • B
      mwifiex: pcie: read FROMDEVICE DMA-able memory with READ_ONCE() · fe116788
      Brian Norris 提交于
      In mwifiex_delay_for_sleep_cookie(), we're looping and waiting for the
      PCIe endpoint to write a magic value back to memory, to signal that it
      has finished going to sleep. We're not letting the compiler know that
      this might change underneath our feet though. Let's do that, for good
      hygiene.
      
      I'm not aware of this fixing any concrete problems. I also give no
      guarantee that this loop is actually correct in any other way, but at
      least this looks like an improvement to me.
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      fe116788
    • B
      mwifiex: pcie: don't loop/retry interrupt status checks · 5d5ddb5e
      Brian Norris 提交于
      The following sequence occurs when using IEEE power-save on 8997:
      (a) driver sees SLEEP event
      (b) driver issues SLEEP CONFIRM
      (c) driver recevies CMD interrupt; within the interrupt processing loop,
          we do (d) and (e):
      (d) wait for FW sleep cookie (and often time out; it takes a while), FW
          is putting card into low power mode
      (e) re-check PCIE_HOST_INT_STATUS register; quit loop with 0 value
      
      But at (e), no one actually signaled an interrupt (i.e., we didn't check
      adapter->int_status). And what's more, because the card is going to
      sleep, this register read appears to take a very long time in some cases
      -- 3 milliseconds in my case!
      
      Now, I propose that (e) is completely unnecessary. If there were any
      additional interrupts signaled after the start of this loop, then the
      interrupt handler would have set adapter->int_status to non-zero and
      queued more work for the main loop -- and we'd catch it on the next
      iteration of the main loop.
      
      So this patch drops all the looping/re-reading of PCIE_HOST_INT_STATUS,
      which avoids the problematic (and slow) register read in step (e).
      
      Incidentally, this is a very similar issue to the one fixed in commit
      ec815dd2 ("mwifiex: prevent register accesses after host is
      sleeping"), except that the register read is just very slow instead of
      fatal in this case.
      
      Tested on 8997 in both MSI and (though not technically supported at the
      moment) MSI-X mode.
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      5d5ddb5e
    • B
      mwifiex: pcie: use posted write to wake up firmware · 062e008a
      Brian Norris 提交于
      Depending on system factors (e.g., the PCIe link PM state), the first
      read to wake up the Wifi firmware can take a long time. There is no
      reason to use a (blocking, non-posted) read at this point, so let's just
      use a write instead. Write vs. read doesn't matter functionality-wise --
      it's just a dummy operation. But let's make sure to re-write with the
      correct "ready" signature, since we check for that in other parts of the
      driver.
      
      This has been shown to decrease the time spent blocking in this function
      on RK3399.
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      062e008a
  8. 17 1月, 2017 1 次提交
    • A
      mwifiex: fix uninitialized variable access in pcie_remove · 0e8edb9a
      Arnd Bergmann 提交于
      Checking the firmware status from PCIe register only works
      if the register is available, otherwise we end up with
      random behavior:
      
      drivers/net/wireless/marvell/mwifiex/pcie.c: In function 'mwifiex_pcie_remove':
      drivers/net/wireless/marvell/mwifiex/pcie.c:585:5: error: 'fw_status' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      This makes sure we treat the absence of the register as a failure.
      
      Fixes: 045f0c1b ("mwifiex: get rid of global user_rmmod flag")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      0e8edb9a
  9. 12 1月, 2017 8 次提交
  10. 29 11月, 2016 1 次提交
    • B
      mwifiex: pcie: implement timeout loop for FW programming doorbell · 22dde1ed
      Brian Norris 提交于
      Marvell Wifi PCIe modules don't always behave nicely for PCIe power
      management when their firmware hasn't been loaded, particularly after
      suspending the PCIe link one or more times. When this happens, we might
      end up spinning forever in this status-polling tight loop. Let's make
      this less tight by adding a timeout and by sleeping a bit in between
      reads, as we do with the other similar loops.
      
      This prevents us from hogging a CPU even in such pathological cases, and
      allows the FW initialization to just fail gracefully instead.
      
      I chose the same polling parameters as the earlier loop in this
      function, and empirically, I found that this loop never makes it more
      than about 12 cycles in a sane FW init sequence. I had no official
      information on the actual intended latency for this portion of the
      download.
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Acked-by: NAmitkumar Karwar <akarwar@marvell.com>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      22dde1ed
  11. 25 11月, 2016 1 次提交
  12. 19 11月, 2016 11 次提交
  13. 18 11月, 2016 1 次提交
    • B
      mwifiex: don't do unbalanced free()'ing in cleanup_if() · 66b9c182
      Brian Norris 提交于
      The cleanup_if() callback is the inverse of init_if(). We allocate our
      'card' interface structure in the probe() function, but we free it in
      cleanup_if(). That gives a few problems:
      (a) we leak this memory if probe() fails before we reach init_if()
      (b) we can't safely utilize 'card' after cleanup_if() -- namely, in
          remove() or suspend(), both of which might race with the cleanup
          paths in our asynchronous FW initialization path
      
      Solution: just use devm_kzalloc(), which will free this structure
      properly when the device is removed -- and drop the set_drvdata(...,
      NULL), since the driver core does this for us. This also removes the
      temptation to use drvdata == NULL as a hack for checking if the device
      has been "cleaned up."
      
      I *do* leave the set_drvdata(..., NULL) for the hacky SDIO
      mwifiex_recreate_adapter(), since the device core won't be able to clear
      that one for us.
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      66b9c182
  14. 09 11月, 2016 2 次提交
    • A
      mwifiex: report error to PCIe for suspend failure · 5190f2e4
      Amitkumar Karwar 提交于
      When host_sleep_config command fails, we should return an error to
      PCIe, instead of continuing (and possibly panicking, when we try to keep
      processing a timed-out ioctl after we return "successfully" from
      suspend).
      Signed-off-by: NAmitkumar Karwar <akarwar@marvell.com>
      Reviewed-by: NBrian Norris <briannorris@chromium.org>
      Tested-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      5190f2e4
    • A
      mwifiex: prevent register accesses after host is sleeping · ec815dd2
      Amitkumar Karwar 提交于
      Following is mwifiex driver-firmware host sleep handshake.
      It involves three threads. suspend handler, interrupt handler, interrupt
      processing in main work queue.
      
      1) Enter suspend handler
      2) Download HS_CFG command
      3) Response from firmware for HS_CFG
      4) Suspend thread waits until handshake completes(i.e hs_activate becomes
         true)
      5) SLEEP from firmware
      6) SLEEP confirm downloaded to firmware.
      7) SLEEP confirm response from firmware
      8) Driver processes SLEEP confirm response and set hs_activate to wake up
      suspend thread
      9) Exit suspend handler
      10) Read sleep cookie in loop and wait until it indicates firmware is
      sleep.
      11) After processing SLEEP confirm response, we are at the end of interrupt
      processing routine. Recheck if there are interrupts received while we were
      processing them.
      
      During suspend-resume stress test, it's been observed that we may end up
      acessing PCIe hardware(in 10 and 11) when PCIe bus is closed which leads
      to a kernel crash.
      
      This patch solves the problem with below changes.
      a) action 10 above can be done before 8
      b) Skip 11 if hs_activated is true. SLEEP confirm response
      is the last interrupt from firmware. No need to recheck for
      pending interrupts.
      c) Add flush_workqueue() in suspend handler.
      Signed-off-by: NAmitkumar Karwar <akarwar@marvell.com>
      Reviewed-by: NBrian Norris <briannorris@chromium.org>
      Tested-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      ec815dd2