1. 05 Mar, 2020 17 commits
  2. 31 Dec, 2019 1 commit
    • spi: spi-fsl-dspi: Fix 16-bit word order in 32-bit XSPI mode · ca59d5a5
      Vladimir Oltean authored
      When used in Extended SPI mode on LS1021A, the DSPI controller wants to
      have the least significant 16-bit word written first to the TX FIFO.
      
      In fact, the LS1021A reference manual says:
      
      33.5.2.4.2 Draining the TX FIFO
      
      When Extended SPI Mode (DSPIx_MCR[XSPI]) is enabled, if the frame size
      of SPI Data to be transmitted is more than 16 bits, then it causes two
      Data entries to be popped from TX FIFO simultaneously which are
      transferred to the shift register. The first of the two popped entries
      forms the 16 least significant bits of the SPI frame to be transmitted.
      
      So given the following TX buffer:
      
       +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
       | 0x0 | 0x1 | 0x2 | 0x3 | 0x4 | 0x5 | 0x6 | 0x7 | 0x8 | 0x9 | 0xa | 0xb |
       +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
       |     32-bit word 1     |     32-bit word 2     |     32-bit word 3     |
       +-----------------------+-----------------------+-----------------------+
      
      The correct way that a little-endian system should transmit it on the
      wire when bits_per_word is 32 is:
      
      0x03020100
      0x07060504
      0x0b0a0908
      
      But it is actually transmitted as follows, as seen with a scope:
      
      0x01000302
      0x05040706
      0x09080b0a
      
      It appears that this patch has been submitted at least once before:
      https://lkml.org/lkml/2018/9/21/286
      but in that case Chuanhua Han did not manage to explain the problem
      clearly enough and the patch did not get merged, leaving XSPI mode
      broken.
      
      Fixes: 8fcd151d ("spi: spi-fsl-dspi: XSPI FIFO handling (in TCFQ mode)")
      Cc: Esben Haabendal <eha@deif.com>
      Cc: Chuanhua Han <chuanhua.han@nxp.com>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20191228135536.14284-1-olteanv@gmail.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
  3. 28 Dec, 2019 1 commit
  4. 16 Dec, 2019 1 commit
  5. 15 Oct, 2019 1 commit
  6. 08 Oct, 2019 2 commits
    • spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode · 5d2af8bc
      Vladimir Oltean authored
      With this patch, the "interrupts" property from the device tree bindings
      is ignored, even if present, if the driver runs in TCFQ mode.
      
      Switching to using the DSPI in poll mode has several distinct
      benefits:
      
      - With interrupts, the DSPI driver in TCFQ mode raises an IRQ after each
        transmitted word. More time is spent waiting on the "waitq" event than
        on actual I/O, and the DSPI IRQ count can easily become the largest in
        /proc/interrupts on Freescale boards with attached SPI devices.
      
      - The SPI I/O time is both lower, and more consistently so. Attached to
        some Freescale devices are either PTP switches, or SPI RTCs. For
        reading time off of a SPI slave device, it is important that all SPI
        transfers take a deterministic time to complete.
      
      - In poll mode there is much less time spent by the CPU in hardirq
        context, which helps with the response latency of the system, and at
        the same time there is more control over when interrupts must be
        disabled (to get a precise timestamp measurement): win-win.
      
      On the LS1021A-TSN board, where the SPI device is a SJA1105 PTP switch
      (with a bits_per_word=8 driver), I created a "benchmark" where I read
      its PTP time once per second, for 120 seconds. Each "read PTP time" is a
      12-byte SPI transfer. I then recorded the time before putting the first
      byte in the TX FIFO, and the time after reading the last byte from the
      RX FIFO. That is the transfer delay in nanoseconds.
      
      Interrupt mode:
      
        delay: min 125120 max 168320 mean 150286 std dev 17675.3
      
      Poll mode:
      
        delay: min 69440 max 119040 mean 70312.9 std dev 8065.34
      
      Both the mean latency and the standard deviation are more than 50% lower
      in poll mode than in interrupt mode. This is with an 'ondemand' governor
      on an otherwise idle system - therefore running mostly at 600 MHz out of
      a max of 1200 MHz.
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20190905010114.26718-5-olteanv@gmail.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
    • spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode · d6b71dfa
      Vladimir Oltean authored
      In this mode, the DSPI controller uses PIO to transfer data word by word.
      In EOQ mode, by comparison, the 4-word deep FIFO is used, so the current
      logic would need some adaptation, which I cannot test because I do not
      have the hardware (Coldfire). It is not clear what the timing of DMA
      transfers is, or whether timestamping in the driver brings any overall
      performance increase compared to regular timestamping done in the core.
      
      Short phc2sys summary after 58 minutes of running on LS1021A-TSN with
      interrupts enabled during the critical section:
      
        offset: min -26251 max 16416 mean -21.8672 std dev 863.416
        delay: min 4720 max 57280 mean 5182.49 std dev 1607.19
        lost servo lock 3 times
      
      Summary of the same phc2sys service running for 120 minutes with
      interrupts disabled:
      
        offset: min -378 max 381 mean -0.0083089 std dev 101.495
        delay: min 4720 max 5920 mean 5129.38 std dev 154.899
        lost servo lock 0 times
      
      The minimum delay (pre to post time) in nanoseconds is the same in both
      runs, but the maximum delay is considerably higher in the first run,
      because interrupts are sometimes serviced in the middle of the
      measurement and interfere with it. Hence set disable_irqs whenever
      possible (that is, when the driver runs in poll mode; otherwise it would
      be a contradiction in terms).
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20190905010114.26718-4-olteanv@gmail.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
  7. 02 Oct, 2019 1 commit
    • spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode · 3c0f9d8b
      Vladimir Oltean authored
      With this patch, the "interrupts" property from the device tree bindings
      is ignored, even if present, if the driver runs in TCFQ mode.
      
      Switching to using the DSPI in poll mode has several distinct
      benefits:
      
      - With interrupts, the DSPI driver in TCFQ mode raises an IRQ after each
        transmitted word. More time is spent waiting on the "waitq" event than
        on actual I/O, and the DSPI IRQ count can easily become the largest in
        /proc/interrupts on Freescale boards with attached SPI devices.
      
      - The SPI I/O time is both lower, and more consistently so. Attached to
        some Freescale devices are either PTP switches, or SPI RTCs. For
        reading time off of a SPI slave device, it is important that all SPI
        transfers take a deterministic time to complete.
      
      - In poll mode there is much less time spent by the CPU in hardirq
        context, which helps with the response latency of the system, and at
        the same time there is more control over when interrupts must be
        disabled (to get a precise timestamp measurement, which will come in a
        future patch): win-win.
      
      On the LS1021A-TSN board, where the SPI device is a SJA1105 PTP switch
      (with a bits_per_word=8 driver), I created a "benchmark" where I
      periodically transferred a 12-byte message once per second, for 120
      seconds. I then recorded the time before putting the first byte in the
      TX FIFO, and the time after reading the last byte from the RX FIFO. That
      is the transfer delay in nanoseconds.
      
      Interrupt mode:
      
        delay: min 125120 max 168320 mean 150286 std dev 17675.3
      
      Poll mode:
      
        delay: min 69440 max 119040 mean 70312.9 std dev 8065.34
      
      Both the mean latency and the standard deviation are more than 50% lower
      in poll mode than in interrupt mode, and the 'max' in poll mode is lower
      than the 'min' in interrupt mode. This is with an 'ondemand' governor on
      an otherwise idle system - therefore running mostly at 600 MHz out of a
      max of 1200 MHz.
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20191001205216.32115-1-olteanv@gmail.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
  8. 01 Oct, 2019 1 commit
  9. 03 Sep, 2019 1 commit
    • spi: spi-fsl-dspi: Fix race condition in TCFQ/EOQ interrupt · e3273649
      Vladimir Oltean authored
      When the driver is working in TCFQ/EOQ mode (i.e. interacts with the SPI
      controller's FIFOs directly) the following sequence of operations
      happens:
      
      - The first byte of the tx buffer gets pushed to the TX FIFO (dspi->len
        gets decremented). This triggers the train of interrupts that handle
        the rest of the bytes.
      
      - The dspi_interrupt handles a TX confirmation event. It reads the newly
        available byte from the RX FIFO, checks the dspi->len exit condition,
        and if there's more to be done, it kicks off the next interrupt in the
        train by writing the next byte to the TX FIFO.
      
      Now the problem is that the wait queue is woken up one byte too early,
      because dspi->len becomes 0 as soon as the last byte has been pushed into
      the TX FIFO. At that point its interrupt has not yet been processed, and
      the corresponding RX byte has not yet been copied from the RX FIFO into
      the rx buffer.
      
      Depending on the timing of the wait queue wakeup vs the handling of the
      last dspi_interrupt, it can happen that the main SPI message pump thread
      has already returned into the spi_device driver. When the rx buffer is on
      the stack (which it can be, because in this mode the DSPI doesn't do
      DMA), the last interrupt will perform a memory write into an rx buffer
      whose stack frame has already been released. This manifests as stack
      corruption.
      
      The solution is to only wake up the wait queue when dspi_rxtx says so,
      i.e. after it has processed the last TX confirmation interrupt and
      collected the last RX byte.
      
      Fixes: c55be305 ("spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20190903105708.32273-1-olteanv@gmail.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
  10. 23 Aug, 2019 5 commits
  11. 20 Aug, 2019 9 commits