1. 30 Oct, 2019 9 commits
    • D
      io_uring: add set of tracing events · c826bd7a
      Dmitrii Dolgov committed
      To trace io_uring activity one can get information from workqueue and
      io trace events, but some parts can be hard to identify via this
      approach. Making what happens inside io_uring more transparent is
      important for reasoning about many aspects of it, hence introduce
      a set of tracing events.
      
      All such events could be roughly divided into two categories:
      
      * those that help in understanding correctness (from both the kernel's
        and an application's point of view), e.g. ring creation, file
        registration, or waiting for an available CQE. The proposed approach
        is to get a pointer to the original structure of interest (ring
        context or request), and then find the relevant events.
        io_uring_queue_async_work also exposes a pointer to work_struct, to
        make it possible to track down the corresponding workqueue events.
      
      * those that provide performance-related information. Mostly it's about
        events that change the flow of requests, e.g. whether an async work
        was queued, or delayed due to some dependencies. Another important
        case is how io_uring optimizations (e.g. registered files) are
        utilized.
      Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      c826bd7a
    • J
      io_uring: add support for canceling timeout requests · 11365043
      Jens Axboe committed
      We might have cases where the need for a specific timeout is gone. Add
      support for canceling an existing timeout operation. This works like the
      POLL_REMOVE command, where the application passes in the user_data of
      the timeout it wishes to cancel in the sqe->addr field.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      11365043
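The cancel-by-user_data lookup described in the timeout-cancel commit above can be sketched as a small userspace model. Everything here (`model_timeout`, `model_cancel_sqe`, `model_cancel_timeout`, the array-backed list) is hypothetical and is not the kernel's code. The only detail taken from the commit message is that the canceling request carries the target's user_data in its addr field. Returning -ENOENT for a missing target is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one pending timeout request. */
struct model_timeout {
    uint64_t user_data; /* tag the application attached at submission */
    int active;         /* 1 while the timeout is still armed */
};

/* Hypothetical model of the canceling request: per the commit message,
 * addr holds the user_data of the timeout that should be canceled. */
struct model_cancel_sqe {
    uint64_t addr;
    uint64_t user_data; /* tag of the cancel request itself */
};

/*
 * Disarm the first armed timeout whose user_data equals sqe->addr.
 * Returns 0 on success, or -ENOENT (an assumption of this sketch)
 * when no armed timeout matches.
 */
int model_cancel_timeout(struct model_timeout *pending, size_t n,
                         const struct model_cancel_sqe *sqe)
{
    for (size_t i = 0; i < n; i++) {
        if (pending[i].active && pending[i].user_data == sqe->addr) {
            pending[i].active = 0;
            return 0;
        }
    }
    return -ENOENT;
}
```

Canceling the same tag twice fails the second time in this model, because the first call already disarmed the matching entry.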
    • J
      io_uring: add support for absolute timeouts · a41525ab
      Jens Axboe committed
      This is a pretty trivial addition on top of the relative timeouts
      we have now, but it's handy for ensuring tighter timing for those
      that are building scheduling primitives on top of io_uring.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      a41525ab
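One way to see why absolute timeouts help with tighter timing: a periodic scheduler can derive each deadline from the previous deadline rather than from the current time, so submission latency does not accumulate as drift. The helper below is a hypothetical userspace sketch (`next_deadline` is not part of any io_uring API), and how the resulting timespec is handed to the kernel is deliberately left out.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Return prev + period_ns as a normalized timespec.
 * Assumes 0 <= prev.tv_nsec < NSEC_PER_SEC and period_ns >= 0.
 */
struct timespec next_deadline(struct timespec prev, long period_ns)
{
    struct timespec next = prev;

    next.tv_sec += period_ns / NSEC_PER_SEC;
    next.tv_nsec += period_ns % NSEC_PER_SEC;
    if (next.tv_nsec >= NSEC_PER_SEC) { /* carry into seconds */
        next.tv_sec += 1;
        next.tv_nsec -= NSEC_PER_SEC;
    }
    return next;
}
```

Chaining calls (`d = next_deadline(d, period)`) keeps every deadline an exact multiple of the period away from the first one, regardless of when each request is actually submitted.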
    • J
      io_uring: replace s->needs_lock with s->in_async · ba5290cc
      Jackie Liu committed
      There is no functional change; this just cleans up the code by using
      s->in_async so the code knows which context it is running in.
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      ba5290cc
    • J
      io_uring: allow application controlled CQ ring size · 33a107f0
      Jens Axboe committed
      We currently size the CQ ring as twice the SQ ring, to allow some
      flexibility in not overflowing the CQ ring. This is done because the
      SQE lifetime differs from that of the IO request itself: the SQE
      is consumed as soon as the kernel has seen the entry.
      
      Certain applications don't need a huge SQ ring size, since they just
      submit IO in batches. But they may have a lot of requests pending, and
      hence need a big CQ ring to hold them all. By allowing the application
      to control the CQ ring size multiplier, we can cater to those
      applications more efficiently.
      
      If an application wants to define its own CQ ring size, it must set
      IORING_SETUP_CQSIZE in the setup flags, and fill out
      io_uring_params->cq_entries. The value must be a power of two.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      33a107f0
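The sizing rule from the CQ ring commit above can be sketched as a pure function. This is a hypothetical model, not the kernel's setup path: `MODEL_SETUP_CQSIZE` and `model_cq_entries` are names invented here, the flag value is arbitrary, and any further bounds the kernel may enforce are not modeled. Only the two rules stated in the commit message are covered: the default is twice the SQ ring, and an application-supplied size must be a power of two.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bit for this model; not taken from the kernel headers. */
#define MODEL_SETUP_CQSIZE (1u << 3)

static int is_power_of_two(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Pick the CQ ring size for a ring with sq_entries submission slots.
 * Without the flag, the CQ ring is twice the SQ ring (the default the
 * commit message describes). With the flag, the caller's cq_entries is
 * used, provided it is a power of two. Returns the size, or -EINVAL.
 */
int64_t model_cq_entries(uint32_t sq_entries, uint32_t flags,
                         uint32_t cq_entries)
{
    if (!(flags & MODEL_SETUP_CQSIZE))
        return (int64_t)sq_entries * 2;
    if (!is_power_of_two(cq_entries))
        return -EINVAL;
    return cq_entries;
}
```

An application that submits in small batches but keeps many requests in flight would, in this model, pass a small `sq_entries` together with the flag and a large power-of-two `cq_entries`.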
    • J
      io_uring: add support for IORING_REGISTER_FILES_UPDATE · c3a31e60
      Jens Axboe committed
      Allows the application to remove/replace/add files to/from a file set.
      Passes in a struct:
      
      struct io_uring_files_update {
      	__u32 offset;
      	__s32 *fds;
      };
      
      that holds an array of fds, size of array passed in through the usual
      nr_args part of the io_uring_register() system call. The logic is as
      follows:
      
      1) If ->fds[i] is -1, the existing file at i + ->offset is removed from
         the set.
      2) If ->fds[i] is a valid fd, the existing file at i + ->offset is
         replaced with ->fds[i].
      
      For case #2, if the existing file is currently empty (fd == -1), the
      new fd is simply added to the array.
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      c3a31e60
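The update rules listed in the IORING_REGISTER_FILES_UPDATE commit above can be illustrated with a toy table of descriptors. This is a hypothetical userspace model: `model_files_update` mirrors the quoted struct with fixed-width types, `model_apply_update` is an invented name, and the bounds check and its -EINVAL result are assumptions of the sketch. The real kernel tracks file references rather than plain integers and validates each fd, which is not modeled here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the struct quoted in the commit message, using fixed-width
 * userspace types in place of __u32/__s32. */
struct model_files_update {
    uint32_t offset;
    int32_t *fds;
};

/*
 * Apply an update to a fixed-file table where -1 marks an empty slot.
 * nr_args is the length of up->fds. Returns the number of slots
 * processed, or -EINVAL if the update would run past the table
 * (the bounds check is an assumption of this sketch).
 */
int model_apply_update(int32_t *table, size_t table_len,
                       const struct model_files_update *up, uint32_t nr_args)
{
    if (up->offset > table_len || nr_args > table_len - up->offset)
        return -EINVAL;

    for (uint32_t i = 0; i < nr_args; i++) {
        /* -1 removes the file; any other value replaces an existing
         * file or fills an empty slot, so one assignment covers all
         * three cases in this model. */
        table[up->offset + i] = up->fds[i];
    }
    return (int)nr_args;
}
```

With a table of `{3, -1, 5, 7}`, an update at offset 1 with fds `{9, -1}` fills the empty slot 1 with 9 and removes the file in slot 2.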
    • J
      io_uring: allow sparse fixed file sets · 08a45173
      Jens Axboe committed
      This is in preparation for allowing updates to fixed file sets without
      requiring a full unregister+register.
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      08a45173
    • J
      io_uring: run dependent links inline if possible · ba816ad6
      Jens Axboe committed
      Currently any dependent link is executed from a new workqueue context,
      which means that we'll be doing a context switch per link in the chain.
      If we are running the completion of the current request from our async
      workqueue and find that the next request is a link, then run it directly
      from the workqueue context instead of forcing another switch.
      
      This improves the performance of linked SQEs, and reduces the CPU
      overhead.
      Reviewed-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      ba816ad6
    • A
      um-ubd: Entrust re-queue to the upper layers · d848074b
      Anton Ivanov committed
      Fixes crashes due to ubd requeue logic conflicting with the block-mq
      logic. Crash is reproducible in 5.0 - 5.3.
      
      Fixes: 53766def ("um: Clean-up command processing in UML UBD driver")
      Cc: stable@vger.kernel.org # v5.0+
      Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      d848074b
  2. 29 Oct, 2019 2 commits
    • A
      nvme-multipath: remove unused groups_only mode in ana log · 86cccfbf
      Anton Eidelman committed
      groups_only mode in nvme_read_ana_log() is no longer used: remove it.
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      86cccfbf
    • A
      nvme-multipath: fix possible io hang after ctrl reconnect · af8fd042
      Anton Eidelman committed
      The following scenario results in an IO hang:
      1) ctrl completes a request with NVME_SC_ANA_TRANSITION.
         NVME_NS_ANA_PENDING bit in ns->flags is set and ana_work is triggered.
      2) ana_work: nvme_read_ana_log() tries to get the ANA log page from the ctrl.
         This fails because ctrl disconnects.
         Therefore nvme_update_ns_ana_state() is not called
         and NVME_NS_ANA_PENDING bit in ns->flags is not cleared.
      3) ctrl reconnects: nvme_mpath_init(ctrl,...) calls
         nvme_read_ana_log(ctrl, groups_only=true).
         However, nvme_update_ana_state() does not update namespaces
         because nr_nsids = 0 (due to groups_only mode).
      4) scan_work calls nvme_validate_ns(), which finds the ns and re-validates it OK.
      
      Result:
      The ctrl is now live but NVME_NS_ANA_PENDING bit in ns->flags is still set.
      Consequently ctrl will never be considered a viable path by __nvme_find_path().
      IO will hang if ctrl is the only or the last path to the namespace.
      
      More generally, while ctrl is reconnecting, its ANA state may change.
      And because nvme_mpath_init() requests ANA log in groups_only mode,
      these changes are not propagated to the existing ctrl namespaces.
      This may result in a malfunction or an IO hang.
      
      Solution:
      nvme_mpath_init() will call nvme_read_ana_log() with groups_only set to false.
      This will not harm the new ctrl case (no namespaces present),
      and will make sure the ANA state of namespaces gets updated after reconnect.
      
      Note: Another option would be for nvme_mpath_init() to invoke
      nvme_parse_ana_log(..., nvme_set_ns_ana_state) for each existing namespace.
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      af8fd042
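The hang scenario in the nvme-multipath commit above comes down to one flag bit that a groups-only log read never clears. The toy model below is hypothetical throughout (`model_ns`, `MODEL_NS_ANA_PENDING`, and the three functions are invented names, and none of this is the driver's code). It only reproduces the state transitions the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the NVME_NS_ANA_PENDING bit in ns->flags. */
#define MODEL_NS_ANA_PENDING (1ul << 0)

struct model_ns {
    unsigned long flags;
};

/* Step 1: a request completes with an ANA-transition status. */
void model_mark_ana_transition(struct model_ns *ns)
{
    ns->flags |= MODEL_NS_ANA_PENDING;
}

/*
 * Model of reading the ANA log after reconnect. In groups_only mode no
 * namespace IDs are visited, so per-namespace state (and the pending
 * bit) is left untouched; a full read refreshes the namespace.
 */
void model_read_ana_log(struct model_ns *ns, bool groups_only)
{
    if (groups_only)
        return;
    ns->flags &= ~MODEL_NS_ANA_PENDING;
}

/* A path with the pending bit set is never picked for IO. */
bool model_path_usable(const struct model_ns *ns)
{
    return !(ns->flags & MODEL_NS_ANA_PENDING);
}
```

In this model, marking a transition and then reading the log with `groups_only` set to true leaves the path unusable, while reading it with `groups_only` set to false makes it usable again, which is the behavior change the fix introduces.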
  3. 28 Oct, 2019 3 commits
  4. 27 Oct, 2019 14 commits
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 153a971f
      Linus Torvalds committed
      Pull x86 fixes from Thomas Gleixner:
       "Two fixes for the VMWare guest support:
      
         - Unbreak VMWare platform detection which got wrecked by converting
           an integer constant to a string constant.
      
         - Fix the clang build of the VMWare hypercall by explicitly
           specifying the output register for INL instead of using the short
           form"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu/vmware: Fix platform detection VMWARE_PORT macro
        x86/cpu/vmware: Use the full form of INL in VMWARE_HYPERCALL, for clang/llvm
      153a971f
    • L
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2b776b54
      Linus Torvalds committed
      Pull timer fixes from Thomas Gleixner:
       "A small set of fixes for time(keeping):
      
         - Add a missing include to prevent compiler warnings.
      
         - Make the VDSO implementation of clock_getres() POSIX compliant
           again. A recent change dropped the NULL pointer guard which is
           required as NULL is a valid pointer value for this function.
      
         - Fix two function documentation typos"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        posix-cpu-timers: Fix two trivial comments
        timers/sched_clock: Include local timekeeping.h for missing declarations
        lib/vdso: Make clock_getres() POSIX compliant again
      2b776b54
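The clock_getres() item in the timer pull above concerns calls that pass NULL for the result pointer. A minimal sketch of such a call follows. `probe_clock_getres_null` is a hypothetical helper name; `clock_getres()` itself is the standard POSIX function, and the sketch assumes a libc and kernel where the NULL case behaves as POSIX describes.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/*
 * Query a clock's resolution without asking for the value back.
 * Passing NULL for the result pointer is permitted; the call should
 * simply report success or failure. Returns clock_getres()'s result.
 */
int probe_clock_getres_null(clockid_t clk)
{
    return clock_getres(clk, NULL);
}
```

A caller that only wants to know whether a clock ID is supported can use this form and check for a zero return.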
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a8a31fdc
      Linus Torvalds committed
      Pull perf fixes from Thomas Gleixner:
       "A set of perf fixes:
      
        kernel:
      
         - Unbreak the tracking of auxiliary buffer allocations which got
           imbalanced causing resource limit failures.
      
         - Fix the fallout of splitting of ToPA entries which failed to shift
           the base entry PA correctly.
      
         - Use the correct context to lookup the AUX event when unmapping the
           associated AUX buffer so the event can be stopped and the buffer
           reference dropped.
      
        tools:
      
         - Fix buildid-cache mode setting in copyfile_mode_ns() when copying
           /proc/kcore
      
         - Fix freeing id arrays in the event list so the correct event is
           closed.
      
         - Sync sched.h and kvm.h headers with the kernel sources.
      
         - Link jvmti against tools/lib/ctype.o to have weak strlcpy().
      
         - Fix multiple memory and file descriptor leaks, found by coverity in
           perf annotate.
      
         - Fix leaks in error handling paths in 'perf c2c', 'perf kmem', found
           by a static analysis tool"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/aux: Fix AUX output stopping
        perf/aux: Fix tracking of auxiliary trace buffer allocation
        perf/x86/intel/pt: Fix base for single entry topa
        perf kmem: Fix memory leak in compact_gfp_flags()
        tools headers UAPI: Sync sched.h with the kernel
        tools headers kvm: Sync kvm.h headers with the kernel sources
        tools headers kvm: Sync kvm headers with the kernel sources
        tools headers kvm: Sync kvm headers with the kernel sources
        perf c2c: Fix memory leak in build_cl_output()
        perf tools: Fix mode setting in copyfile_mode_ns()
        perf annotate: Fix multiple memory and file descriptor leaks
        perf tools: Fix resource leak of closedir() on the error paths
        perf evlist: Fix fix for freed id arrays
        perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()
      a8a31fdc
    • L
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1e1ac1cb
      Linus Torvalds committed
      Pull irq fixes from Thomas Gleixner:
       "Two fixes for interrupt controller drivers:
      
         - Skip IRQ_M_EXT entries in the device tree when initializing the
           RISCV PLIC controller to avoid a double init attempt.
      
         - Use the correct ITS list when issuing the VMOVP synchronization
           command so the operation works only on the ITS instances which are
           associated with a VM"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/sifive-plic: Skip contexts except supervisor in plic_init()
        irqchip/gic-v3-its: Use the exact ITSList for VMOVP
      1e1ac1cb
    • L
      Merge tag '5.4-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · c9a2e4a8
      Linus Torvalds committed
      Pull cifs fixes from Steve French:
       "Seven cifs/smb3 fixes, including three for stable"
      
      * tag '5.4-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
        CIFS: Fix use after free of file info structures
        CIFS: Fix retry mid list corruption on reconnects
        cifs: Fix missed free operations
        CIFS: avoid using MID 0xFFFF
        cifs: clarify comment about timestamp granularity for old servers
        cifs: Handle -EINPROGRESS only when noblockcnt is set
      c9a2e4a8
    • L
      Merge tag 'riscv/for-v5.4-rc5-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 6995a6a5
      Linus Torvalds committed
      Pull RISC-V fixes from Paul Walmsley:
       "Several minor fixes and cleanups for v5.4-rc5:
      
         - Three build fixes for various SPARSEMEM-related kernel
           configurations
      
         - Two cleanup patches for the kernel bug and breakpoint trap handler
           code"
      
      * tag 'riscv/for-v5.4-rc5-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: cleanup do_trap_break
        riscv: cleanup <asm/bug.h>
        riscv: Fix undefined reference to vmemmap_populate_basepages
        riscv: Fix implicit declaration of 'page_to_section'
        riscv: fix fs/proc/kcore.c compilation with sparsemem enabled
      6995a6a5
    • L
      Merge tag 'mips_fixes_5.4_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 5a1e843c
      Linus Torvalds committed
      Pull MIPS fixes from Paul Burton:
       "A few MIPS fixes:
      
         - Fix VDSO time-related function behavior for systems where we need
           to fall back to syscalls, but were instead returning bogus results.
      
         - A fix to TLB exception handlers for Cavium Octeon systems where
           they would inadvertently clobber the $1/$at register.
      
         - A build fix for bcm63xx configurations.
      
         - Switch to using my @kernel.org email address"
      
      * tag 'mips_fixes_5.4_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: tlbex: Fix build_restore_pagemask KScratch restore
        MIPS: bmips: mark exception vectors as char arrays
        mips: vdso: Fix __arch_get_hw_counter()
        MAINTAINERS: Use @kernel.org address for Paul Burton
      5a1e843c
    • L
      Merge tag 'tty-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 29768954
      Linus Torvalds committed
      Pull tty/serial driver fix from Greg KH:
       "Here is a single tty/serial driver fix for 5.4-rc5 that resolves a
        reported issue.
      
        It has been in linux-next for a while with no problems"
      
      * tag 'tty-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        8250-men-mcb: fix error checking when get_num_ports returns -ENODEV
      29768954
    • L
      Merge tag 'staging-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 228bd624
      Linus Torvalds committed
      Pull staging driver fix from Greg KH:
       "Here is a single staging driver fix, for the wlan-ng driver, that
        resolves a reported issue.
      
        It has been in linux-next for a while with no reported issues"
      
      * tag 'staging-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: wlan-ng: fix exit return when sme->key_idx >= NUM_WEPKEYS
      228bd624
    • L
      Merge tag 'driver-core-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 13fa692e
      Linus Torvalds committed
      Pull driver core fix from Greg KH:
       "Here is a single sysfs fix for 5.4-rc5.
      
        It resolves an error if you actually try to use the __BIN_ATTR_WO()
        macro, seems I never tested it properly before :(
      
        This has been in linux-next for a while with no reported issues"
      
      * tag 'driver-core-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        sysfs: Fixes __BIN_ATTR_WO() macro
      13fa692e
    • L
      Merge tag 'char-misc-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · a03885d5
      Linus Torvalds committed
      Pull binder fix from Greg KH:
       "This is a single binder fix to resolve a reported issue by Jann. It's
        been in linux-next for a while with no reported issues"
      
      * tag 'char-misc-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        binder: Don't modify VMA bounds in ->mmap handler
      a03885d5
    • L
      Merge tag 'usb-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 0ecdd78c
      Linus Torvalds committed
      Pull USB fixes from Greg KH:
       "Here are a number of small USB driver fixes for 5.4-rc5.
      
        More "fun" with some of the misc USB drivers as found by syzbot, and
        there are a number of other small bugfixes in here for reported
        issues.
      
        All have been in linux-next for a while with no reported issues"
      
      * tag 'usb-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: cdns3: Error out if USB_DR_MODE_UNKNOWN in cdns3_core_init_role()
        USB: ldusb: fix read info leaks
        USB: serial: ti_usb_3410_5052: clean up serial data access
        USB: serial: ti_usb_3410_5052: fix port-close races
        USB: usblp: fix use-after-free on disconnect
        usb: udc: lpc32xx: fix bad bit shift operation
        usb: cdns3: Fix dequeue implementation.
        USB: legousbtower: fix a signedness bug in tower_probe()
        USB: legousbtower: fix memleak on disconnect
        USB: ldusb: fix memleak on disconnect
      0ecdd78c
    • L
      Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 992cb107
      Linus Torvalds committed
      Pull i2c fixes from Wolfram Sang:
       "A few driver fixes for the I2C subsystem"
      
      * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: stm32f7: remove warning when compiling with W=1
        i2c: stm32f7: fix a race in slave mode with arbitration loss irq
        i2c: stm32f7: fix first byte to send in slave mode
        i2c: mt65xx: fix NULL ptr dereference
        i2c: aspeed: fix master pending state handling
      992cb107
    • L
      Merge tag 'for-linus-2019-10-26' of git://git.kernel.dk/linux-block · acf913b7
      Linus Torvalds committed
      Pull block and io_uring fixes from Jens Axboe:
       "A bit bigger than usual at this point in time, mostly due to some good
        bug hunting work by Pavel that resulted in three io_uring fixes from
        him and two from me. Anyway, this pull request contains:
      
         - Revert of the submit-and-wait optimization for io_uring, it can't
           always be done safely. It depends on commands always making
           progress on their own, which isn't necessarily the case outside of
           strict file IO. (me)
      
         - Series of two patches from me and three from Pavel, fixing issues
           with shared data and sequencing for io_uring.
      
         - Lastly, two timeout sequence fixes for io_uring (zhangyi)
      
         - Two nbd patches fixing races (Josef)
      
         - libahci regulator_get_optional() fix (Mark)"
      
      * tag 'for-linus-2019-10-26' of git://git.kernel.dk/linux-block:
        nbd: verify socket is supported during setup
        ata: libahci_platform: Fix regulator_get_optional() misuse
        nbd: handle racing with error'ed out commands
        nbd: protect cmd->status with cmd->lock
        io_uring: fix bad inflight accounting for SETUP_IOPOLL|SETUP_SQTHREAD
        io_uring: used cached copies of sq->dropped and cq->overflow
        io_uring: Fix race for sqes with userspace
        io_uring: Fix broken links with offloading
        io_uring: Fix corrupted user_data
        io_uring: correct timeout req sequence when inserting a new entry
        io_uring : correct timeout req sequence when waiting timeout
        io_uring: revert "io_uring: optimize submit_and_wait API"
      acf913b7
  5. 26 Oct, 2019 12 commits