1. 07 2月, 2018 6 次提交
  2. 02 2月, 2018 1 次提交
    • A
      lib/strscpy: Shut up KASAN false-positives in strscpy() · 1a3241ff
      Andrey Ryabinin 提交于
      strscpy() performs the word-at-a-time optimistic reads.  So it may may
      access the memory past the end of the object, which is perfectly fine
      since strscpy() doesn't use that (past-the-end) data and makes sure the
      optimistic read won't cross a page boundary.
      
      Use new read_word_at_a_time() to shut up the KASAN.
      
      Note that this potentially could hide some bugs.  In example bellow,
      stscpy() will copy more than we should (1-3 extra uninitialized bytes):
      
              char dst[8];
              char *src;
      
              src = kmalloc(5, GFP_KERNEL);
              memset(src, 0xff, 5);
              strscpy(dst, src, 8);
      Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1a3241ff
  3. 27 1月, 2018 1 次提交
  4. 24 1月, 2018 2 次提交
    • S
      vsprintf: Do not have bprintf dereference pointers · 841a915d
      Steven Rostedt (VMware) 提交于
      When trace_printk() was introduced, it was discussed that making it be as
      low overhead as possible, that the processing of the format string should be
      delayed until it is read. That is, a "trace_printk()" should not convert
      the %d into numbers and so on, but instead, save the fmt string and all the
      args in the buffer at the time of recording. When the trace_printk() data is
      read, it would then parse the format string and do the conversions of the
      saved arguments in the tracing buffer.
      
      The code to perform this was added to vsprintf where vbin_printf() would
      save the arguments of a specified format string in a buffer, then
      bstr_printf() could be used to convert the buffer with the same format
      string into the final output, as if vsprintf() was called in one go.
      
      The issue arises when dereferenced pointers are used. The problem is that
      something like %*pbl which reads a bitmask, will save the pointer to the
      bitmask in the buffer. Then the reading of the buffer via bstr_printf() will
      then look at the pointer to process the final output. Obviously the value of
      that pointer could have changed since the time it was recorded to the time
      the buffer is read. Worse yet, the bitmask could be unmapped, and the
      reading of the trace buffer could actually cause a kernel oops.
      
      Another problem is that user space tools such as perf and trace-cmd do not
      have access to the contents of these pointers, and they become useless when
      the tracing buffer is extracted.
      
      Instead of having vbin_printf() simply save the pointer in the buffer for
      later processing, have it perform the formatting at the time bin_printf() is
      called. This will fix the issue of dereferencing pointers at a later time,
      and has the extra benefit of having user space tools understand these
      values.
      
      Since perf and trace-cmd already can handle %p[sSfF] via saving kallsyms,
      their pointers are saved and not processed during vbin_printf(). If they
      were converted, it would break perf and trace-cmd, as they would not know
      how to deal with the conversion.
      
      Link: http://lkml.kernel.org/r/20171228204025.14a71d8f@gandalf.local.homeReported-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      841a915d
    • B
      kobject: Export kobj_ns_grab_current() and kobj_ns_drop() · 172856ea
      Bart Van Assche 提交于
      Make it possible to call these two functions from a kernel module.
      Note: despite their name, these two functions can be used meaningfully
      independent of kobjects. A later patch will add calls to these
      functions from the SRP driver because this patch series modifies the
      SRP driver such that it can hold a reference to a namespace that can
      last longer than the lifetime of the process through which the
      namespace reference was obtained.
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      172856ea
  5. 22 1月, 2018 2 次提交
  6. 20 1月, 2018 2 次提交
    • D
      bpf: add couple of test cases for signed extended imms · fcd1c917
      Daniel Borkmann 提交于
      Add a couple of test cases for interpreter and JIT that are
      related to an issue we faced some time ago in Cilium [1],
      which is fixed in LLVM with commit e53750e1e086 ("bpf: fix
      bug on silently truncating 64-bit immediate").
      
      Test cases were run-time checking kernel to behave as intended
      which should also provide some guidance for current or new
      JITs in case they should trip over this. Added for cBPF and
      eBPF.
      
        [1] https://github.com/cilium/cilium/pull/2162Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      fcd1c917
    • B
      lib/scatterlist: Fix chaining support in sgl_alloc_order() · 8c7a8d1c
      Bart Van Assche 提交于
      This patch avoids that workloads with large block sizes (megabytes)
      can trigger the following call stack with the ib_srpt driver (that
      driver is the only driver that chains scatterlists allocated by
      sgl_alloc_order()):
      
      BUG: Bad page state in process kworker/0:1H  pfn:2423a78
      page:fffffb03d08e9e00 count:-3 mapcount:0 mapping:          (null) index:0x0
      flags: 0x57ffffc0000000()
      raw: 0057ffffc0000000 0000000000000000 0000000000000000 fffffffdffffffff
      raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
      page dumped because: nonzero _count
      CPU: 0 PID: 733 Comm: kworker/0:1H Tainted: G          I      4.15.0-rc7.bart+ #1
      Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
      Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
      Call Trace:
       dump_stack+0x5c/0x83
       bad_page+0xf5/0x10f
       get_page_from_freelist+0xa46/0x11b0
       __alloc_pages_nodemask+0x103/0x290
       sgl_alloc_order+0x101/0x180
       target_alloc_sgl+0x2c/0x40 [target_core_mod]
       srpt_alloc_rw_ctxs+0x173/0x2d0 [ib_srpt]
       srpt_handle_new_iu+0x61e/0x7f0 [ib_srpt]
       __ib_process_cq+0x55/0xa0 [ib_core]
       ib_cq_poll_work+0x1b/0x60 [ib_core]
       process_one_work+0x141/0x340
       worker_thread+0x47/0x3e0
       kthread+0xf5/0x130
       ret_from_fork+0x1f/0x30
      
      Fixes: e80a0af4 ("lib/scatterlist: Introduce sgl_alloc() and sgl_free()")
      Reported-by: NLaurence Oberman <loberman@redhat.com>
      Tested-by: NLaurence Oberman <loberman@redhat.com>
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      8c7a8d1c
  7. 15 1月, 2018 16 次提交
  8. 13 1月, 2018 3 次提交
    • M
      error-injection: Support fault injection framework · 4b1a29a7
      Masami Hiramatsu 提交于
      Support in-kernel fault-injection framework via debugfs.
      This allows you to inject a conditional error to specified
      function using debugfs interfaces.
      
      Here is the result of test script described in
      Documentation/fault-injection/fault-injection.txt
      
        ===========
        # ./test_fail_function.sh
        1+0 records in
        1+0 records out
        1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0227404 s, 46.1 MB/s
        btrfs-progs v4.4
        See http://btrfs.wiki.kernel.org for more information.
      
        Label:              (null)
        UUID:               bfa96010-12e9-4360-aed0-42eec7af5798
        Node size:          16384
        Sector size:        4096
        Filesystem size:    1001.00MiB
        Block group profiles:
          Data:             single            8.00MiB
          Metadata:         DUP              58.00MiB
          System:           DUP              12.00MiB
        SSD detected:       no
        Incompat features:  extref, skinny-metadata
        Number of devices:  1
        Devices:
           ID        SIZE  PATH
            1  1001.00MiB  /dev/loop2
      
        mount: mount /dev/loop2 on /opt/tmpmnt failed: Cannot allocate memory
        SUCCESS!
        ===========
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      4b1a29a7
    • M
      error-injection: Add injectable error types · 663faf9f
      Masami Hiramatsu 提交于
      Add injectable error types for each error-injectable function.
      
      One motivation of error injection test is to find software flaws,
      mistakes or mis-handlings of expectable errors. If we find such
      flaws by the test, that is a program bug, so we need to fix it.
      
      But if the tester miss input the error (e.g. just return success
      code without processing anything), it causes unexpected behavior
      even if the caller is correctly programmed to handle any errors.
      That is not what we want to test by error injection.
      
      To clarify what type of errors the caller must expect for each
      injectable function, this introduces injectable error types:
      
       - EI_ETYPE_NULL : means the function will return NULL if it
      		    fails. No ERR_PTR, just a NULL.
       - EI_ETYPE_ERRNO : means the function will return -ERRNO
      		    if it fails.
       - EI_ETYPE_ERRNO_NULL : means the function will return -ERRNO
      		       (ERR_PTR) or NULL.
      
      ALLOW_ERROR_INJECTION() macro is expanded to get one of
      NULL, ERRNO, ERRNO_NULL to record the error type for
      each function. e.g.
      
       ALLOW_ERROR_INJECTION(open_ctree, ERRNO)
      
      This error types are shown in debugfs as below.
      
        ====
        / # cat /sys/kernel/debug/error_injection/list
        open_ctree [btrfs]	ERRNO
        io_ctl_init [btrfs]	ERRNO
        ====
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      663faf9f
    • M
      error-injection: Separate error-injection from kprobe · 540adea3
      Masami Hiramatsu 提交于
      Since error-injection framework is not limited to be used
      by kprobes, nor bpf. Other kernel subsystems can use it
      freely for checking safeness of error-injection, e.g.
      livepatch, ftrace etc.
      So this separate error-injection framework from kprobes.
      
      Some differences has been made:
      
      - "kprobe" word is removed from any APIs/structures.
      - BPF_ALLOW_ERROR_INJECTION() is renamed to
        ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
      - CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
        feature. It is automatically enabled if the arch supports
        error injection feature for kprobe or ftrace etc.
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      540adea3
  9. 12 1月, 2018 1 次提交
  10. 10 1月, 2018 2 次提交
    • C
      dma-mapping: move swiotlb arch helpers to a new header · ea8c64ac
      Christoph Hellwig 提交于
      phys_to_dma, dma_to_phys and dma_capable are helpers published by
      architecture code for use of swiotlb and xen-swiotlb only.  Drivers are
      not supposed to use these directly, but use the DMA API instead.
      
      Move these to a new asm/dma-direct.h helper, included by a
      linux/dma-direct.h wrapper that provides the default linear mapping
      unless the architecture wants to override it.
      
      In the MIPS case the existing dma-coherent.h is reused for now as
      untangling it will take a bit of work.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: NRobin Murphy <robin.murphy@arm.com>
      ea8c64ac
    • A
      bpf: introduce BPF_JIT_ALWAYS_ON config · 290af866
      Alexei Starovoitov 提交于
      The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.
      
      A quote from goolge project zero blog:
      "At this point, it would normally be necessary to locate gadgets in
      the host kernel code that can be used to actually leak data by reading
      from an attacker-controlled location, shifting and masking the result
      appropriately and then using the result of that as offset to an
      attacker-controlled address for a load. But piecing gadgets together
      and figuring out which ones work in a speculation context seems annoying.
      So instead, we decided to use the eBPF interpreter, which is built into
      the host kernel - while there is no legitimate way to invoke it from inside
      a VM, the presence of the code in the host kernel's text section is sufficient
      to make it usable for the attack, just like with ordinary ROP gadgets."
      
      To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
      option that removes interpreter from the kernel in favor of JIT-only mode.
      So far eBPF JIT is supported by:
      x64, arm64, arm32, sparc64, s390, powerpc64, mips64
      
      The start of JITed program is randomized and code page is marked as read-only.
      In addition "constant blinding" can be turned on with net.core.bpf_jit_harden
      
      v2->v3:
      - move __bpf_prog_ret0 under ifdef (Daniel)
      
      v1->v2:
      - fix init order, test_bpf and cBPF (Daniel's feedback)
      - fix offloaded bpf (Jakub's feedback)
      - add 'return 0' dummy in case something can invoke prog->bpf_func
      - retarget bpf tree. For bpf-next the patch would need one extra hunk.
        It will be sent when the trees are merged back to net-next
      
      Considered doing:
        int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
      but it seems better to land the patch as-is and in bpf-next remove
      bpf_jit_enable global variable from all JITs, consolidate in one place
      and remove this jit_init() function.
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      290af866
  11. 09 1月, 2018 2 次提交
    • J
      treewide: Use DEVICE_ATTR_RW · b6b996b6
      Joe Perches 提交于
      Convert DEVICE_ATTR uses to DEVICE_ATTR_RW where possible.
      
      Done with perl script:
      
      $ git grep -w --name-only DEVICE_ATTR | \
        xargs perl -i -e 'local $/; while (<>) { s/\bDEVICE_ATTR\s*\(\s*(\w+)\s*,\s*\(?(\s*S_IRUGO\s*\|\s*S_IWUSR|\s*S_IWUSR\s*\|\s*S_IRUGO\s*|\s*0644\s*)\)?\s*,\s*\1_show\s*,\s*\1_store\s*\)/DEVICE_ATTR_RW(\1)/g; print;}'
      Signed-off-by: NJoe Perches <joe@perches.com>
      Acked-by: NFelipe Balbi <felipe.balbi@linux.intel.com>
      Acked-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
      Acked-by: NBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Acked-by: NZhang Rui <rui.zhang@intel.com>
      Acked-by: NJarkko Nikula <jarkko.nikula@bitmer.com>
      Acked-by: NJani Nikula <jani.nikula@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b6b996b6
    • S
      symbol lookup: introduce dereference_symbol_descriptor() · 04b8eb7a
      Sergey Senozhatsky 提交于
      dereference_symbol_descriptor() invokes appropriate ARCH specific
      function descriptor dereference callbacks:
      - dereference_kernel_function_descriptor() if the pointer is a
        kernel symbol;
      
      - dereference_module_function_descriptor() if the pointer is a
        module symbol.
      
      This is the last step needed to make '%pS/%ps' smart enough to
      handle function descriptor dereference on affected ARCHs and
      to retire '%pF/%pf'.
      
      To refresh it:
        Some architectures (ia64, ppc64, parisc64) use an indirect pointer
        for C function pointers - the function pointer points to a function
        descriptor and we need to dereference it to get the actual function
        pointer.
      
        Function descriptors live in .opd elf section and all affected
        ARCHs (ia64, ppc64, parisc64) handle it properly for kernel and
        modules. So we, technically, can decide if the dereference is
        needed by simply looking at the pointer: if it belongs to .opd
        section then we need to dereference it.
      
        The kernel and modules have their own .opd sections, obviously,
        that's why we need to split dereference_function_descriptor()
        and use separate kernel and module dereference arch callbacks.
      
      Link: http://lkml.kernel.org/r/20171206043649.GB15885@jagdpanzerIV
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: James Bottomley <jejb@parisc-linux.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Tested-by: Tony Luck <tony.luck@intel.com> #ia64
      Tested-by: Santosh Sivaraj <santosh@fossix.org> #powerpc
      Tested-by: Helge Deller <deller@gmx.de> #parisc64
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      04b8eb7a
  12. 08 1月, 2018 1 次提交
  13. 07 1月, 2018 1 次提交