1. 01 Oct, 2018 8 commits
  2. 24 Sep, 2018 1 commit
    • netpoll: make ndo_poll_controller() optional · ac3d9dd0
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can
      be very dangerous on loaded hosts, since the cpu
      calling ndo_poll_controller() might steal all NAPI
      contexts (for all RX/TX queues of the NIC). This capture
      can last for an unlimited amount of time, since one
      cpu is generally not able to drain all the queues under load.
      
      It seems that all networking drivers that use NAPI for their
      TX completions should not provide an ndo_poll_controller().
      
      NAPI drivers have netpoll support already handled
      in core networking stack, since netpoll_poll_dev()
      uses poll_napi(dev) to iterate through registered
      NAPI contexts for a device.
      
      This patch allows netpoll_poll_dev() to process NAPI
      contexts even for drivers not providing ndo_poll_controller(),
      allowing for following patches in NAPI drivers.
      
      Also we export netpoll_poll_dev() so that it can be called
      by bonding/team drivers in following patches.
      Reported-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
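
The control flow described in this commit message can be sketched with a toy model. This is a minimal illustration, assuming that the core poll path calls the driver hook only when it is present and then always polls the registered NAPI contexts; every `toy_*` name is hypothetical and none of this is the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device and its driver hook. */
struct toy_dev {
    void (*ndo_poll_controller)(struct toy_dev *dev);  /* may be NULL */
    int controller_calls;  /* times the driver hook ran */
    int napi_polls;        /* times the NAPI contexts were polled */
};

/* Stand-in for a legacy driver's poll controller. */
static void toy_controller(struct toy_dev *dev)
{
    dev->controller_calls++;
}

/* Stand-in for poll_napi(dev): walk the registered NAPI contexts. */
static void toy_poll_napi(struct toy_dev *dev)
{
    dev->napi_polls++;
}

/* The hook is optional; NAPI polling happens either way. */
static void toy_netpoll_poll_dev(struct toy_dev *dev)
{
    if (dev->ndo_poll_controller)
        dev->ndo_poll_controller(dev);
    toy_poll_napi(dev);
}

static void demo_netpoll(void)
{
    struct toy_dev napi_only = { NULL, 0, 0 };
    struct toy_dev legacy = { toy_controller, 0, 0 };

    toy_netpoll_poll_dev(&napi_only);
    toy_netpoll_poll_dev(&legacy);
    assert(napi_only.controller_calls == 0 && napi_only.napi_polls == 1);
    assert(legacy.controller_calls == 1 && legacy.napi_polls == 1);
}
```

In this model a driver that leaves the hook unset still has its NAPI contexts polled, which is the property the commit message relies on for the follow-up driver patches.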
  3. 22 Sep, 2018 1 commit
    • block: use nanosecond resolution for iostat · b57e99b4
      Committed by Omar Sandoval
      Klaus Kusche reported that the I/O busy time in /proc/diskstats was not
      updating properly on 4.18. This is because we started using ktime to
      track elapsed time, and we convert nanoseconds to jiffies when we update
      the partition counter. However, this gets rounded down, so any I/Os that
      take less than a jiffy are not accounted for. Previously in this case,
      the value of jiffies would sometimes increment while we were doing I/O,
      so at least some I/Os were accounted for.
      
      Let's convert the stats to use nanoseconds internally. We still report
      milliseconds as before, now more accurately than ever. The value is
      still truncated to 32 bits for backwards compatibility.
      
      Fixes: 522a7775 ("block: consolidate struct request timestamp fields")
      Cc: stable@vger.kernel.org
      Reported-by: Klaus Kusche <klaus.kusche@computerix.info>
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
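
The rounding problem described in this commit message can be reproduced with plain integer arithmetic. The sketch below is an illustration only: it assumes a tick rate of 250 Hz (one jiffy = 4 ms), and the function names are hypothetical rather than taken from the block layer.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL
#define NSEC_PER_SEC  1000000000ULL
#define TOY_HZ        250ULL  /* assumed tick rate: one jiffy = 4 ms */

/* Old scheme: convert each I/O to jiffies first, then accumulate. */
static uint64_t busy_ms_via_jiffies(const uint64_t *io_ns, int n)
{
    uint64_t ticks = 0;

    for (int i = 0; i < n; i++)
        ticks += io_ns[i] * TOY_HZ / NSEC_PER_SEC;  /* rounds down per I/O */
    return ticks * 1000ULL / TOY_HZ;
}

/* New scheme: accumulate nanoseconds, convert once when reporting. */
static uint32_t busy_ms_via_ns(const uint64_t *io_ns, int n)
{
    uint64_t total_ns = 0;

    for (int i = 0; i < n; i++)
        total_ns += io_ns[i];
    return (uint32_t)(total_ns / NSEC_PER_MSEC);  /* reported as 32 bits */
}

/* 1000 I/Os of 100 us each: 100 ms of real busy time. */
static void demo_iostat(void)
{
    uint64_t io_ns[1000];

    for (int i = 0; i < 1000; i++)
        io_ns[i] = 100000;
    assert(busy_ms_via_jiffies(io_ns, 1000) == 0);  /* every I/O rounds to 0 */
    assert(busy_ms_via_ns(io_ns, 1000) == 100);     /* every I/O is counted */
}
```

With sub-jiffy I/Os the per-I/O conversion loses all of the busy time, while accumulating nanoseconds and converting once keeps it.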
  4. 21 Sep, 2018 2 commits
  5. 20 Sep, 2018 3 commits
  6. 19 Sep, 2018 1 commit
    • net: stmmac: Rework coalesce timer and fix multi-queue races · 8fce3331
      Committed by Jose Abreu
      This follows David Miller's advice and tries to fix the coalesce timer in
      multi-queue scenarios.
      
      We are now using per-queue coalesce values and a per-queue TX timer.
      
      The default coalesce timer value was changed to 1ms and the default
      coalesce frames value to 25.
      
      Tested in B2B setup between XGMAC2 and GMAC5.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Fixes: ce736788 ("net: stmmac: adding multiple buffers for TX")
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Neil Armstrong <narmstrong@baylibre.com>
      Cc: Jerome Brunet <jbrunet@baylibre.com>
      Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Alexandre Torgue <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 14 Sep, 2018 1 commit
    • mm: get rid of vmacache_flush_all() entirely · 7a9cdebd
      Committed by Linus Torvalds
      Jann Horn points out that the vmacache_flush_all() function is not only
      potentially expensive, it's buggy too.  It also happens to be entirely
      unnecessary, because the sequence number overflow case can be avoided by
      simply making the sequence number be 64-bit.  That doesn't even grow the
      data structures in question, because the other adjacent fields are
      already 64-bit.
      
      So simplify the whole thing by just making the sequence number overflow
      case go away entirely, which gets rid of all the complications and makes
      the code faster too.  Win-win.
      
      [ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics
        also just goes away entirely with this ]
      Reported-by: Jann Horn <jannh@google.com>
      Suggested-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
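
Why the width of the sequence number matters can be shown with a small model. This is a sketch under assumptions: a cached lookup is treated as valid while its stored sequence number equals the current one, and the helper names are hypothetical, not the kernel's vmacache API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A cached entry is trusted only while its sequence number matches. */
static bool cache_valid32(uint32_t cached_seq, uint32_t current_seq)
{
    return cached_seq == current_seq;
}

static bool cache_valid64(uint64_t cached_seq, uint64_t current_seq)
{
    return cached_seq == current_seq;
}

/* Advance a counter by n invalidations; the 32-bit one wraps modulo 2^32. */
static uint32_t bump32(uint32_t seq, uint64_t n)
{
    return (uint32_t)(seq + n);
}

static uint64_t bump64(uint64_t seq, uint64_t n)
{
    return seq + n;
}

static void demo_seqnum(void)
{
    const uint64_t wrap = 1ULL << 32;  /* 2^32 invalidations */

    /* 32-bit: the counter comes back around, a stale entry looks valid. */
    assert(cache_valid32(5, bump32(5, wrap)));
    /* 64-bit: the same number of invalidations is still distinguishable. */
    assert(!cache_valid64(5, bump64(5, wrap)));
}
```

A 32-bit counter therefore needs an explicit overflow path, whereas a 64-bit counter would need 2^64 invalidations to wrap, which is why the overflow handling can be dropped.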
  8. 13 Sep, 2018 2 commits
  9. 12 Sep, 2018 2 commits
  10. 10 Sep, 2018 1 commit
    • mfd: da9063: Fix DT probing with constraints · a318c243
      Committed by Philipp Zabel
      Commit 1c892e38 ("regulator: da9063: Handle less LDOs on DA9063L")
      reordered the da9063_regulator_info[] array, but not the DA9063_ID_*
      regulator ids and not the da9063_matches[] array, because ids are used
      as indices in the array initializer. This mismatch between regulator id
      and da9063_regulator_info[] array index causes the driver probe to fail
      because constraints from DT are not applied to the correct regulator:
      
        da9063 0-0058: Device detected (chip-ID: 0x61, var-ID: 0x50)
        DA9063_BMEM: Bringing 900000uV into 3300000-3300000uV
        DA9063_LDO9: Bringing 3300000uV into 2500000-2500000uV
        DA9063_LDO1: Bringing 900000uV into 3300000-3300000uV
        DA9063_LDO1: failed to apply 3300000-3300000uV constraint(-22)
      
      This patch reorders the DA9063_ID_* as apparently intended, and with
      them the entries in the da9063_matches[] array.
      
      Fixes: 1c892e38 ("regulator: da9063: Handle less LDOs on DA9063L")
      Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
      Reviewed-by: Marek Vasut <marek.vasut@gmail.com>
      Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
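
The kind of mismatch described in this commit message arises whenever one table is indexed by id through designated initializers and a parallel table is filled positionally. The sketch below uses a hypothetical four-entry enum and made-up table contents purely for illustration; it is not the da9063 driver's data.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical regulator ids, standing in for the DA9063_ID_* enum. */
enum { ID_BCORE, ID_LDO1, ID_LDO9, ID_BMEM, ID_COUNT };

/* Match table: designated initializers, so the id is the array index. */
static const char *const matches[ID_COUNT] = {
    [ID_BCORE] = "bcore",
    [ID_LDO1]  = "ldo1",
    [ID_LDO9]  = "ldo9",
    [ID_BMEM]  = "bmem",
};

/* Info table: positional, reordered without updating the ids. */
static const char *const info[ID_COUNT] = { "bcore", "ldo9", "bmem", "ldo1" };

/* Returns the first index where the two tables disagree, or -1 if none. */
static int first_mismatch(const char *const *a, const char *const *b, int n)
{
    for (int i = 0; i < n; i++)
        if (strcmp(a[i], b[i]) != 0)
            return i;
    return -1;
}

static void demo_tables(void)
{
    /* Settings looked up for "ldo1" would land on the "ldo9" entry. */
    assert(first_mismatch(matches, info, ID_COUNT) == ID_LDO1);
    /* A table compared with itself is trivially consistent. */
    assert(first_mismatch(matches, matches, ID_COUNT) == -1);
}
```

Renumbering the enum so that the ids follow the positional order, as the commit does for the real ids, makes both tables agree again without touching the designated initializers.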
  11. 06 Sep, 2018 3 commits
  12. 05 Sep, 2018 3 commits
  13. 04 Sep, 2018 1 commit
  14. 03 Sep, 2018 2 commits
  15. 01 Sep, 2018 2 commits
    • blkcg: delay blkg destruction until after writeback has finished · 59b57717
      Committed by Dennis Zhou (Facebook)
      Currently, blkcg destruction relies on a sequence of events:
        1. Destruction starts. blkcg_css_offline() is called and blkgs
           release their reference to the blkcg. This immediately destroys
           the cgwbs (writeback).
        2. With blkgs giving up their reference, the blkcg ref count should
           become zero and eventually call blkcg_css_free() which finally
           frees the blkcg.
      
      Jiufei Xue reported that there is a race between blkcg_bio_issue_check()
      and cgroup_rmdir(). To remedy this, blkg destruction becomes contingent
      on the completion of all writeback associated with the blkcg. A count of
      the number of cgwbs is maintained and once that goes to zero, blkg
      destruction can follow. This should prevent premature blkg destruction
      related to writeback.
      
      The new process for blkcg cleanup is as follows:
        1. Destruction starts. blkcg_css_offline() is called which offlines
           writeback. Blkg destruction is delayed on the cgwb_refcnt count to
           avoid punting potentially large amounts of outstanding writeback
           to root while maintaining any ongoing policies. Here, the base
           cgwb_refcnt is put back.
        2. When the cgwb_refcnt becomes zero, blkcg_destroy_blkgs() is called
           and handles destruction of blkgs. This is where the css reference
           held by each blkg is released.
        3. Once the blkcg ref count goes to zero, blkcg_css_free() is called.
           This finally frees the blkcg.
      
      It seems in the past blk-throttle didn't do the most understandable
      things with taking data from a blkg while associating with current. So,
      the simplification and unification of what blk-throttle is doing caused
      this.
      
      Fixes: 08e18eab ("block: add bi_blkg to the bio for cgroups")
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
      Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
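
The new cleanup order in this commit message can be modelled as a reference count with a base reference. The sketch below is a single-threaded toy under assumptions: the field name `cgwb_refcnt` comes from the commit text, but the struct, the helpers and the plain `int` counter are hypothetical and ignore the atomics and locking real code would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a blkcg. */
struct toy_blkcg {
    int  cgwb_refcnt;      /* base reference plus one per live cgwb */
    bool blkgs_destroyed;  /* set where blkcg_destroy_blkgs() would run */
};

static void toy_init(struct toy_blkcg *b)
{
    b->cgwb_refcnt = 1;  /* the base reference */
    b->blkgs_destroyed = false;
}

/* A cgwb (writeback context) takes a reference while it is alive. */
static void toy_cgwb_get(struct toy_blkcg *b)
{
    b->cgwb_refcnt++;
}

/* Dropping the last reference is what triggers blkg destruction. */
static void toy_cgwb_put(struct toy_blkcg *b)
{
    assert(b->cgwb_refcnt > 0);
    if (--b->cgwb_refcnt == 0)
        b->blkgs_destroyed = true;
}

/* Step 1: offlining only puts the base reference back. */
static void toy_css_offline(struct toy_blkcg *b)
{
    toy_cgwb_put(b);
}

static void demo_blkcg(void)
{
    struct toy_blkcg b;

    toy_init(&b);
    toy_cgwb_get(&b);            /* two writebacks in flight */
    toy_cgwb_get(&b);
    toy_css_offline(&b);
    assert(!b.blkgs_destroyed);  /* writeback still pending */
    toy_cgwb_put(&b);
    assert(!b.blkgs_destroyed);
    toy_cgwb_put(&b);            /* step 2: last cgwb finished */
    assert(b.blkgs_destroyed);
}
```

In this model, blkg destruction cannot run while any writeback still holds a reference, and an idle blkcg is torn down as soon as it is offlined.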
    • Revert "blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()" · 6b065462
      Committed by Dennis Zhou (Facebook)
      This reverts commit 4c699480.
      
      Destroying blkgs is tricky because of the nature of the relationship. A
      blkg should go away when either a blkcg or a request_queue goes away.
      However, blkg's pin the blkcg to ensure they remain valid. To break this
      cycle, when a blkcg is offlined, blkgs put back their css ref. This
      eventually lets css_free() get called which frees the blkcg.
      
      The above commit (4c699480) breaks this order of events by trying to
      destroy blkgs in css_free(). As the blkgs still hold references to the
      blkcg, css_free() is never called.
      
      The race between blkcg_bio_issue_check() and cgroup_rmdir() will be
      addressed in the following patch by delaying destruction of a blkg until
      all writeback associated with the blkcg has been finished.
      
      Fixes: 4c699480 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()")
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
      Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  16. 31 Aug, 2018 4 commits
  17. 30 Aug, 2018 2 commits
    • vfs: add the fadvise() file operation · 45cd0faa
      Committed by Amir Goldstein
      This is going to be used by overlayfs and may be useful
      for other filesystems.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • arm/arm64: smccc-1.1: Handle function result as parameters · 755a8bf5
      Committed by Marc Zyngier
      If someone has the silly idea to write something along those lines:
      
      	extern u64 foo(void);
      
      	void bar(struct arm_smccc_res *res)
      	{
      		arm_smccc_1_1_smc(0xbad, foo(), res);
      	}
      
      they are in for a surprise, as this gets compiled as:
      
      	0000000000000588 <bar>:
      	 588:   a9be7bfd        stp     x29, x30, [sp, #-32]!
      	 58c:   910003fd        mov     x29, sp
      	 590:   f9000bf3        str     x19, [sp, #16]
      	 594:   aa0003f3        mov     x19, x0
      	 598:   aa1e03e0        mov     x0, x30
      	 59c:   94000000        bl      0 <_mcount>
      	 5a0:   94000000        bl      0 <foo>
      	 5a4:   aa0003e1        mov     x1, x0
      	 5a8:   d4000003        smc     #0x0
      	 5ac:   b4000073        cbz     x19, 5b8 <bar+0x30>
      	 5b0:   a9000660        stp     x0, x1, [x19]
      	 5b4:   a9010e62        stp     x2, x3, [x19, #16]
      	 5b8:   f9400bf3        ldr     x19, [sp, #16]
      	 5bc:   a8c27bfd        ldp     x29, x30, [sp], #32
      	 5c0:   d65f03c0        ret
      	 5c4:   d503201f        nop
      
      The call to foo "overwrites" the x0 register for the return value,
      and we end up calling the wrong secure service.
      
      A solution is to evaluate all the parameters before assigning
      anything to specific registers, leading to the expected result:
      
      	0000000000000588 <bar>:
      	 588:   a9be7bfd        stp     x29, x30, [sp, #-32]!
      	 58c:   910003fd        mov     x29, sp
      	 590:   f9000bf3        str     x19, [sp, #16]
      	 594:   aa0003f3        mov     x19, x0
      	 598:   aa1e03e0        mov     x0, x30
      	 59c:   94000000        bl      0 <_mcount>
      	 5a0:   94000000        bl      0 <foo>
      	 5a4:   aa0003e1        mov     x1, x0
      	 5a8:   d28175a0        mov     x0, #0xbad
      	 5ac:   d4000003        smc     #0x0
      	 5b0:   b4000073        cbz     x19, 5bc <bar+0x34>
      	 5b4:   a9000660        stp     x0, x1, [x19]
      	 5b8:   a9010e62        stp     x2, x3, [x19, #16]
      	 5bc:   f9400bf3        ldr     x19, [sp, #16]
      	 5c0:   a8c27bfd        ldp     x29, x30, [sp], #32
      	 5c4:   d65f03c0        ret
      Reported-by: Julien Grall <julien.grall@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
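
The ordering issue in this commit message can be imitated portably by simulating the registers with globals. This is a sketch under stated assumptions: `x0` and `x1` are ordinary variables, `foo()` writes its return value into `x0` to mimic the calling convention, and `CALL_BUGGY`, `CALL_FIXED` and `smc()` are hypothetical names, not the real `arm_smccc_1_1_smc()` macro.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t x0, x1;  /* simulated argument registers */

/* Any call returns its value in x0, clobbering what was there. */
static uint64_t foo(void)
{
    x0 = 42;
    return x0;
}

/* Simulated secure monitor: reports which function id it received. */
static uint64_t smc(void)
{
    return x0;
}

/* Buggy order: x0 is loaded before the argument expression runs. */
#define CALL_BUGGY(id, arg, out) do {   \
        x0 = (id);                      \
        x1 = (arg);                     \
        *(out) = smc();                 \
    } while (0)

/* Fixed order: evaluate everything first, then load the registers. */
#define CALL_FIXED(id, arg, out) do {   \
        uint64_t t0 = (id);             \
        uint64_t t1 = (arg);            \
        x0 = t0;                        \
        x1 = t1;                        \
        *(out) = smc();                 \
    } while (0)

static void demo_smccc(void)
{
    uint64_t res;

    CALL_BUGGY(0xbad, foo(), &res);
    assert(res == 42);     /* the wrong service id reached the monitor */
    CALL_FIXED(0xbad, foo(), &res);
    assert(res == 0xbad);  /* the intended id */
    assert(x1 == 42);      /* with foo()'s result as the argument */
}
```

Evaluating every parameter into a temporary before any register is assigned means a call inside an argument expression can no longer overwrite a register that was already set up.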
  18. 29 Aug, 2018 1 commit