1. 05 1月, 2019 5 次提交
    • D
      net, skbuff: do not prefer skb allocation fails early · f8c468e8
      David Rientjes 提交于
      Commit dcda9b04 ("mm, tree wide: replace __GFP_REPEAT by
      __GFP_RETRY_MAYFAIL with more useful semantic") replaced __GFP_REPEAT in
      alloc_skb_with_frags() with __GFP_RETRY_MAYFAIL when the allocation may
      directly reclaim.
      
      The previous behavior would require reclaim up to 1 << order pages for
      skb aligned header_len of order > PAGE_ALLOC_COSTLY_ORDER before failing,
      otherwise the allocations in alloc_skb() would loop in the page allocator
      looking for memory.  __GFP_RETRY_MAYFAIL makes both allocations failable
      under memory pressure, including for the HEAD allocation.
      
      This can cause, among many other things, write() to fail with ENOTCONN
      during RPC when under memory pressure.
      
      These allocations should succeed as they did previous to dcda9b04
      even if it requires calling the oom killer and additional looping in the
      page allocator to find memory.  There is no way to specify the previous
      behavior of __GFP_REPEAT, but it's unlikely to be necessary since the
      previous behavior only guaranteed that 1 << order pages would be reclaimed
      before failing for order > PAGE_ALLOC_COSTLY_ORDER.  That reclaim is not
      guaranteed to be contiguous memory, so repeating for such large orders is
      usually not beneficial.
      
      Removing the setting of __GFP_RETRY_MAYFAIL to restore the previous
      behavior, specifically not allowing alloc_skb() to fail for small orders
      and oom kill if necessary rather than allowing RPCs to fail.
      
      Fixes: dcda9b04 ("mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic")
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f8c468e8
    • W
      soc/fsl/qe: fix err handling of ucc_of_parse_tdm · 8d68100a
      Wen Yang 提交于
      Currently there are some issues with the ucc_of_parse_tdm function:
      1, a possible null pointer dereference in ucc_of_parse_tdm,
      detected by the semantic patch deref_null.cocci,
      with the following warning:
      drivers/soc/fsl/qe/qe_tdm.c:177:21-24: ERROR: pdev is NULL but dereferenced.
      2, dev gets modified, so in any case that devm_iounmap() will fail
      even when the new pdev is valid, because the iomap was done with a
       different pdev.
      3, there is no driver bind with the "fsl,t1040-qe-si" or
      "fsl,t1040-qe-siram" device. So allocating resources using devm_*()
      with these devices won't provide a cleanup path for these resources
      when the caller fails.
      
      This patch fixes them.
      Suggested-by: NLi Yang <leoyang.li@nxp.com>
      Suggested-by: NChristophe LEROY <christophe.leroy@c-s.fr>
      Signed-off-by: NWen Yang <wen.yang99@zte.com.cn>
      Reviewed-by: NPeng Hao <peng.hao2@zte.com.cn>
      CC: Julia Lawall <julia.lawall@lip6.fr>
      CC: Zhao Qiang <qiang.zhao@nxp.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      CC: linuxppc-dev@lists.ozlabs.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d68100a
    • K
      r8169: Add support for new Realtek Ethernet · 36352991
      Kai-Heng Feng 提交于
      There are two new Realtek Ethernet devices which are re-branded r8168h.
      Add the IDs to to support them.
      Signed-off-by: NKai-Heng Feng <kai.heng.feng@canonical.com>
      Reviewed-by: NHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      36352991
    • A
      netlink: fixup regression in RTM_GETADDR · 7c1e8a38
      Arthur Gautier 提交于
      This commit fixes a regression in AF_INET/RTM_GETADDR and
      AF_INET6/RTM_GETADDR.
      
      Before this commit, the kernel would stop dumping addresses once the first
      skb was full and end the stream with NLMSG_DONE(-EMSGSIZE). The error
      shouldn't be sent back to netlink_dump so the callback is kept alive. The
      userspace is expected to call back with a new empty skb.
      
      Changes from V1:
       - The error is not handled in netlink_dump anymore but rather in
         inet_dump_ifaddr and inet6_dump_addr directly as suggested by
         David Ahern.
      
      Fixes: d7e38611 ("net/ipv4: Put target net when address dump fails due to bad attributes")
      Fixes: 242afaa6 ("net/ipv6: Put target net when address dump fails due to bad attributes")
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: "David S . Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: NArthur Gautier <baloo@gandi.net>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c1e8a38
    • C
      octeontx2-af: Fix a resource leak in an error handling path in 'cgx_probe()' · 1492623e
      Christophe JAILLET 提交于
      If an error occurs after the call to 'pci_alloc_irq_vectors()', we must
      call 'pci_free_irq_vectors()' in order to avoid a	resource leak.
      
      The same sequence is already in place in the corresponding 'cgx_remove()'
      function.
      
      Fixes: 1463f382 ("octeontx2-af: Add support for CGX link management")
      Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1492623e
  2. 04 1月, 2019 4 次提交
    • L
      Remove 'type' argument from access_ok() function · 96d4f267
      Linus Torvalds 提交于
      Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
      of the user address range verification function since we got rid of the
      old racy i386-only code to walk page tables by hand.
      
      It existed because the original 80386 would not honor the write protect
      bit when in kernel mode, so you had to do COW by hand before doing any
      user access.  But we haven't supported that in a long time, and these
      days the 'type' argument is a purely historical artifact.
      
      A discussion about extending 'user_access_begin()' to do the range
      checking resulted this patch, because there is no way we're going to
      move the old VERIFY_xyz interface to that model.  And it's best done at
      the end of the merge window when I've done most of my merges, so let's
      just get this done once and for all.
      
      This patch was mostly done with a sed-script, with manual fix-ups for
      the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
      
      There were a couple of notable cases:
      
       - csky still had the old "verify_area()" name as an alias.
      
       - the iter_iov code had magical hardcoded knowledge of the actual
         values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
         really used it)
      
       - microblaze used the type argument for a debug printout
      
      but other than those oddities this should be a total no-op patch.
      
      I tried to fix up all architectures, did fairly extensive grepping for
      access_ok() uses, and the changes are trivial, but I may have missed
      something.  Any missed conversion should be trivially fixable, though.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      96d4f267
    • L
      Merge tag 'locks-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux · 135143b2
      Linus Torvalds 提交于
      Pull file locking bugfix from Jeff Layton:
       "This is a one-line fix for a bug that syzbot turned up in the new
        patches to mitigate the thundering herd when a lock is released"
      
      * tag 'locks-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
        locks: fix error in locks_move_blocks()
      135143b2
    • L
      Merge tag 'sound-fix-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 810574ca
      Linus Torvalds 提交于
      Pull sound fixes from Takashi Iwai:
       "Among a few HD-audio fixes, the only significant one is the regression
        fix on some machines like Dell XPS due to the default binding changes.
        We ended up reverting the whole since the fix for ASoC HD-audio driver
        won't be available immediately"
      
      * tag 'sound-fix-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Revert DSP detection on legacy HD-audio driver
        ALSA: hda/tegra: clear pending irq handlers
        ALSA: hda/realtek: Enable the headset mic auto detection for ASUS laptops
      810574ca
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 43d86ee8
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
       "Several fixes here. Basically split down the line between newly
        introduced regressions and long existing problems:
      
         1) Double free in tipc_enable_bearer(), from Cong Wang.
      
         2) Many fixes to nf_conncount, from Florian Westphal.
      
         3) op->get_regs_len() can throw an error, check it, from Yunsheng
            Lin.
      
         4) Need to use GFP_ATOMIC in *_add_hash_mac_address() of fsl/fman
            driver, from Scott Wood.
      
         5) Inifnite loop in fib_empty_table(), from Yue Haibing.
      
         6) Use after free in ax25_fillin_cb(), from Cong Wang.
      
         7) Fix socket locking in nr_find_socket(), also from Cong Wang.
      
         8) Fix WoL wakeup enable in r8169, from Heiner Kallweit.
      
         9) On 32-bit sock->sk_stamp is not thread-safe, from Deepa Dinamani.
      
        10) Fix ptr_ring wrap during queue swap, from Cong Wang.
      
        11) Missing shutdown callback in hinic driver, from Xue Chaojing.
      
        12) Need to return NULL on error from ip6_neigh_lookup(), from Stefano
            Brivio.
      
        13) BPF out of bounds speculation fixes from Daniel Borkmann"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
        ipv6: Consider sk_bound_dev_if when binding a socket to an address
        ipv6: Fix dump of specific table with strict checking
        bpf: add various test cases to selftests
        bpf: prevent out of bounds speculation on pointer arithmetic
        bpf: fix check_map_access smin_value test when pointer contains offset
        bpf: restrict unknown scalars of mixed signed bounds for unprivileged
        bpf: restrict stack pointer arithmetic for unprivileged
        bpf: restrict map value pointer arithmetic for unprivileged
        bpf: enable access to ax register also from verifier rewrite
        bpf: move tmp variable into ax register in interpreter
        bpf: move {prev_,}insn_idx into verifier env
        isdn: fix kernel-infoleak in capi_unlocked_ioctl
        ipv6: route: Fix return value of ip6_neigh_lookup() on neigh_create() error
        net/hamradio/6pack: use mod_timer() to rearm timers
        net-next/hinic:add shutdown callback
        net: hns3: call hns3_nic_net_open() while doing HNAE3_UP_CLIENT
        ip: validate header length on virtual device xmit
        tap: call skb_probe_transport_header after setting skb->dev
        ptr_ring: wrap back ->producer in __ptr_ring_swap_queue()
        net: rds: remove unnecessary NULL check
        ...
      43d86ee8
  3. 03 1月, 2019 31 次提交
    • D
      ipv6: Consider sk_bound_dev_if when binding a socket to an address · c5ee0663
      David Ahern 提交于
      IPv6 does not consider if the socket is bound to a device when binding
      to an address. The result is that a socket can be bound to eth0 and then
      bound to the address of eth1. If the device is a VRF, the result is that
      a socket can only be bound to an address in the default VRF.
      
      Resolve by considering the device if sk_bound_dev_if is set.
      
      This problem exists from the beginning of git history.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5ee0663
    • D
      ipv6: Fix dump of specific table with strict checking · 73155879
      David Ahern 提交于
      Dump of a specific table with strict checking enabled is looping. The
      problem is that the end of the table dump is not marked in the cb. When
      dumping a specific table, cb args 0 and 1 are not used (they are the hash
      index and entry with an hash table index when dumping all tables). Re-use
      args[0] to hold a 'done' flag for the specific table dump.
      
      Fixes: 13e38901 ("net/ipv6: Plumb support for filtering route dumps")
      Reported-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      73155879
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 645ff1e8
      Linus Torvalds 提交于
      Pull input updates from Dmitry Torokhov:
       "A tiny pull request this merge window unfortunately, should get more
        material in for the next release:
      
         - new driver for Raspberry Pi's touchscreen (firmware interface)
      
         - miscellaneous input driver fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: elan_i2c - add ACPI ID for touchpad in ASUS Aspire F5-573G
        Input: atmel_mxt_ts - don't try to free unallocated kernel memory
        Input: drv2667 - fix indentation issues
        Input: touchscreen - fix coding style issue
        Input: add official Raspberry Pi's touchscreen driver
        Input: nomadik-ske-keypad - fix a loop timeout test
        Input: rotary-encoder - don't log EPROBE_DEFER to kernel log
        Input: olpc_apsp - remove set but not used variable 'np'
        Input: olpc_apsp - enable the SP clock
        Input: olpc_apsp - check FIFO status on open(), not probe()
        Input: olpc_apsp - drop CONFIG_OLPC dependency
        clk: mmp2: add SP clock
        dt-bindings: marvell,mmp2: Add clock id for the SP clock
        Input: ad7879 - drop platform data support
      645ff1e8
    • L
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · d548e659
      Linus Torvalds 提交于
      Pull virtio/vhost updates from Michael Tsirkin:
      "Features, fixes, cleanups:
      
         - discard in virtio blk
      
         - misc fixes and cleanups"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vhost: correct the related warning message
        vhost: split structs into a separate header file
        virtio: remove deprecated VIRTIO_PCI_CONFIG()
        vhost/vsock: switch to a mutex for vhost_vsock_hash
        virtio_blk: add discard and write zeroes support
      d548e659
    • L
      Merge tag 'for-4.21/block-20190102' of git://git.kernel.dk/linux-block · 77d0b194
      Linus Torvalds 提交于
      Pull more block updates from Jens Axboe:
      
       - Dead code removal for loop/sunvdc (Chengguang)
      
       - Mark BIDI support for bsg as deprecated, logging a single dmesg
         warning if anyone is actually using it (Christoph)
      
       - blkcg cleanup, killing a dead function and making the tryget_closest
         variant easier to read (Dennis)
      
       - Floppy fixes, one fixing a regression in swim3 (Finn)
      
       - lightnvm use-after-free fix (Gustavo)
      
       - gdrom leak fix (Wenwen)
      
       - a set of drbd updates (Lars, Luc, Nathan, Roland)
      
      * tag 'for-4.21/block-20190102' of git://git.kernel.dk/linux-block: (28 commits)
        block/swim3: Fix regression on PowerBook G3
        block/swim3: Fix -EBUSY error when re-opening device after unmount
        block/swim3: Remove dead return statement
        block/amiflop: Don't log error message on invalid ioctl
        gdrom: fix a memory leak bug
        lightnvm: pblk: fix use-after-free bug
        block: sunvdc: remove redundant code
        block: loop: remove redundant code
        bsg: deprecate BIDI support in bsg
        blkcg: remove unused __blkg_release_rcu()
        blkcg: clean up blkg_tryget_closest()
        drbd: Change drbd_request_detach_interruptible's return type to int
        drbd: Avoid Clang warning about pointless switch statment
        drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")
        drbd: skip spurious timeout (ping-timeo) when failing promote
        drbd: don't retry connection if peers do not agree on "authentication" settings
        drbd: fix print_st_err()'s prototype to match the definition
        drbd: avoid spurious self-outdating with concurrent disconnect / down
        drbd: do not block when adjusting "disk-options" while IO is frozen
        drbd: fix comment typos
        ...
      77d0b194
    • L
      Merge tag 'for-4.21/libata-20190102' of git://git.kernel.dk/linux-block · b79f9f93
      Linus Torvalds 提交于
      Pull libata fix from Jens Axboe:
       "This libata change missed the original libata pull request.
      
        Just a single fix in here, fixing a missed reference drop"
      
      * tag 'for-4.21/libata-20190102' of git://git.kernel.dk/linux-block:
        ata: pata_macio: add of_node_put()
      b79f9f93
    • L
      Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 0f2107da
      Linus Torvalds 提交于
      Pull more clk updates from Stephen Boyd:
       "One more patch to generalize a set of DT binding defines now before
        -rc1 comes out.
      
        This way the SoC DTS files can use the proper defines from a stable
        tag"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: imx8qxp: make the name of clock ID generic
      0f2107da
    • L
      Merge tag 'devprop-4.21-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 01766d27
      Linus Torvalds 提交于
      Pull device properties framework fixes from Rafael Wysocki:
       "Fix two potential NULL pointer dereferences found by Coverity in the
        software nodes code introduced recently (Colin Ian King)"
      
      * tag 'devprop-4.21-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        drivers: base: swnode: check if swnode is NULL before dereferencing it
        drivers: base: swnode: check if pointer p is NULL before dereferencing it
      01766d27
    • L
      Merge tag 'mailbox-v4.21' of git://git.linaro.org/landing-teams/working/fujitsu/integration · 35ddb06a
      Linus Torvalds 提交于
      Pull mailbox updates from Jassi Brar:
      
       - Introduce device-managed registration
         devm_mbox_controller_un/register and convert drivers to use it
      
       - Introduce flush api to support clients that must busy-wait in atomic
         context
      
       - Support multiple controllers per device
      
       - Hi3660: a bugfix and constify ops structure
      
       - TI-MsgMgr: off by one bugfix.
      
       - BCM: switch to spdx license
      
       - Tegra-HSP: support for shared mailboxes and suspend/resume.
      
      * tag 'mailbox-v4.21' of git://git.linaro.org/landing-teams/working/fujitsu/integration: (30 commits)
        mailbox: tegra-hsp: Use device-managed registration API
        mailbox: tegra-hsp: use devm_kstrdup_const()
        mailbox: tegra-hsp: Add suspend/resume support
        mailbox: tegra-hsp: Add support for shared mailboxes
        dt-bindings: tegra186-hsp: Add shared mailboxes
        mailbox: Allow multiple controllers per device
        mailbox: Support blocking transfers in atomic context
        mailbox: ti-msgmgr: Use device-managed registration API
        mailbox: stm32-ipcc: Use device-managed registration API
        mailbox: rockchip: Use device-managed registration API
        mailbox: qcom-apcs: Use device-managed registration API
        mailbox: platform-mhu: Use device-managed registration API
        mailbox: omap: Use device-managed registration API
        mailbox: mtk-cmdq: Remove needless devm_kfree() calls
        mailbox: mtk-cmdq: Use device-managed registration API
        mailbox: xgene-slimpro: Use device-managed registration API
        mailbox: sti: Use device-managed registration API
        mailbox: altera: Use device-managed registration API
        mailbox: imx: Use device-managed registration API
        mailbox: hi6220: Use device-managed registration API
        ...
      35ddb06a
    • L
      Merge branch 'for-linus-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · 6aa293d8
      Linus Torvalds 提交于
      Pull UML updates from Richard Weinberger:
      
       - DISCARD support for our block device driver
      
       - Many TLB flush optimizations
      
       - Various smaller fixes
      
       - And most important, Anton agreed to help me maintaining UML
      
      * 'for-linus-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
        um: Remove obsolete reenable_XX calls
        um: writev needs <sys/uio.h>
        Add Anton Ivanov to UML maintainers
        um: remove redundant generic-y
        um: Optimize Flush TLB for force/fork case
        um: Avoid marking pages with "changed protection"
        um: Skip TLB flushing where not needed
        um: Optimize TLB operations v2
        um: Remove unnecessary faulted check in uaccess.c
        um: Add support for DISCARD in the UBD Driver
        um: Remove unsafe printks from the io thread
        um: Clean-up command processing in UML UBD driver
        um: Switch to block-mq constants in the UML UBD driver
        um: Make GCOV depend on !KCOV
        um: Include sys/uio.h to have writev()
        um: Add HAVE_DEBUG_BUGVERBOSE
        um: Update maintainers file entry
      6aa293d8
    • L
      Merge tag 's390-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 04a17ede
      Linus Torvalds 提交于
      Pull s390 updates from Martin Schwidefsky:
      
       - A larger update for the zcrypt / AP bus code:
          + Update two inline assemblies in the zcrypt driver to make gcc happy
          + Add a missing reply code for invalid special commands for zcrypt
          + Allow AP device reset to be triggered from user space
          + Split the AP scan function into smaller, more readable functions
      
       - Updates for vfio-ccw and vfio-ap
          + Add maintainers and reviewer for vfio-ccw
          + Include facility.h in vfio_ap_drv.c to avoid fragile include chain
          + Simplicy vfio-ccw state machine
      
       - Use the common code version of bust_spinlocks
      
       - Make use of the DEFINE_SHOW_ATTRIBUTE
      
       - Fix three incorrect file permissions in the DASD driver
      
       - Remove bit spin-lock from the PCI interrupt handler
      
       - Fix GFP_ATOMIC vs GFP_KERNEL in the PCI code
      
      * tag 's390-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/zcrypt: rework ap scan bus code
        s390/zcrypt: make sysfs reset attribute trigger queue reset
        s390/pci: fix sleeping in atomic during hotplug
        s390/pci: remove bit_lock usage in interrupt handler
        s390/drivers: fix proc/debugfs file permissions
        s390: convert to DEFINE_SHOW_ATTRIBUTE
        MAINTAINERS/vfio-ccw: add Farhan and Eric, make Halil Reviewer
        vfio: ccw: Merge BUSY and BOXED states
        s390: use common bust_spinlocks()
        s390/zcrypt: improve special ap message cmd handling
        s390/ap: rework assembler functions to use unions for in/out register variables
        s390: vfio-ap: include <asm/facility> for test_facility()
      04a17ede
    • N
      locks: fix error in locks_move_blocks() · bf77ae4c
      NeilBrown 提交于
      After moving all requests from
         fl->fl_blocked_requests
      to
         new->fl_blocked_requests
      
      it is nonsensical to do anything to all the remaining elements, there
      aren't any.  This should do something to all the requests that have been
      moved. For simplicity, it does it to all requests in the target list.
      
      Setting "f->fl_blocker = new" to all members of new->fl_blocked_requests
      is "obviously correct" as it preserves the invariant of the linkage
      among requests.
      
      Reported-by: syzbot+239d99847eb49ecb3899@syzkaller.appspotmail.com
      Fixes: 5946c431 ("fs/locks: allow a lock request to block other requests.")
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NJeff Layton <jlayton@kernel.org>
      bf77ae4c
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · be630043
      David S. Miller 提交于
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2019-01-02
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) prevent out of bounds speculation on pointer arithmetic, from Daniel.
      
      2) typo fix, from Xiaozhou.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be630043
    • L
      Merge tag 'nfs-for-4.21-1' of git://git.linux-nfs.org/projects/anna/linux-nfs · e6b92572
      Linus Torvalds 提交于
      Pull NFS client updates from Anna Schumaker:
       "Stable bugfixes:
         - xprtrdma: Yet another double DMA-unmap # v4.20
      
        Features:
         - Allow some /proc/sys/sunrpc entries without CONFIG_SUNRPC_DEBUG
         - Per-xprt rdma receive workqueues
         - Drop support for FMR memory registration
         - Make port= mount option optional for RDMA mounts
      
        Other bugfixes and cleanups:
         - Remove unused nfs4_xdev_fs_type declaration
         - Fix comments for behavior that has changed
         - Remove generic RPC credentials by switching to 'struct cred'
         - Fix crossing mountpoints with different auth flavors
         - Various xprtrdma fixes from testing and auditing the close code
         - Fixes for disconnect issues when using xprtrdma with krb5
         - Clean up and improve xprtrdma trace points
         - Fix NFS v4.2 async copy reboot recovery"
      
      * tag 'nfs-for-4.21-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (63 commits)
        sunrpc: convert to DEFINE_SHOW_ATTRIBUTE
        sunrpc: Add xprt after nfs4_test_session_trunk()
        sunrpc: convert unnecessary GFP_ATOMIC to GFP_NOFS
        sunrpc: handle ENOMEM in rpcb_getport_async
        NFS: remove unnecessary test for IS_ERR(cred)
        xprtrdma: Prevent leak of rpcrdma_rep objects
        NFSv4.2 fix async copy reboot recovery
        xprtrdma: Don't leak freed MRs
        xprtrdma: Add documenting comment for rpcrdma_buffer_destroy
        xprtrdma: Replace outdated comment for rpcrdma_ep_post
        xprtrdma: Update comments in frwr_op_send
        SUNRPC: Fix some kernel doc complaints
        SUNRPC: Simplify defining common RPC trace events
        NFS: Fix NFSv4 symbolic trace point output
        xprtrdma: Trace mapping, alloc, and dereg failures
        xprtrdma: Add trace points for calls to transport switch methods
        xprtrdma: Relocate the xprtrdma_mr_map trace points
        xprtrdma: Clean up of xprtrdma chunk trace points
        xprtrdma: Remove unused fields from rpcrdma_ia
        xprtrdma: Cull dprintk() call sites
        ...
      e6b92572
    • L
      Merge tag 'nfsd-4.21' of git://linux-nfs.org/~bfields/linux · e45428a4
      Linus Torvalds 提交于
      Pull nfsd updates from Bruce Fields:
       "Thanks to Vasily Averin for fixing a use-after-free in the
        containerized NFSv4.2 client, and cleaning up some convoluted
        backchannel server code in the process.
      
        Otherwise, miscellaneous smaller bugfixes and cleanup"
      
      * tag 'nfsd-4.21' of git://linux-nfs.org/~bfields/linux: (25 commits)
        nfs: fixed broken compilation in nfs_callback_up_net()
        nfs: minor typo in nfs4_callback_up_net()
        sunrpc: fix debug message in svc_create_xprt()
        sunrpc: make visible processing error in bc_svc_process()
        sunrpc: remove unused xpo_prep_reply_hdr callback
        sunrpc: remove svc_rdma_bc_class
        sunrpc: remove svc_tcp_bc_class
        sunrpc: remove unused bc_up operation from rpc_xprt_ops
        sunrpc: replace svc_serv->sv_bc_xprt by boolean flag
        sunrpc: use-after-free in svc_process_common()
        sunrpc: use SVC_NET() in svcauth_gss_* functions
        nfsd: drop useless LIST_HEAD
        lockd: Show pid of lockd for remote locks
        NFSD remove OP_CACHEME from 4.2 op_flags
        nfsd: Return EPERM, not EACCES, in some SETATTR cases
        sunrpc: fix cache_head leak due to queued request
        nfsd: clean up indentation, increase indentation in switch statement
        svcrdma: Optimize the logic that selects the R_key to invalidate
        nfsd: fix a warning in __cld_pipe_upcall()
        nfsd4: fix crash on writing v4_end_grace before nfsd startup
        ...
      e45428a4
    • A
      Merge branch 'prevent-oob-under-speculation' · a67825f5
      Alexei Starovoitov 提交于
      Daniel Borkmann says:
      
      ====================
      This set fixes an out of bounds case under speculative execution
      by implementing masking of pointer alu into the verifier. For
      details please see the individual patches.
      
      Thanks!
      
      v2 -> v3:
        - 8/9: change states_equal condition into old->speculative &&
          !cur->speculative, thanks Jakub!
        - 8/9: remove incorrect speculative state test in
          propagate_liveness(), thanks Jakub!
      v1 -> v2:
        - Typo fixes in commit msg and a comment, thanks David!
      ====================
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      a67825f5
    • D
      bpf: add various test cases to selftests · 80c9b2fa
      Daniel Borkmann 提交于
      Add various map value pointer related test cases to test_verifier
      kselftest to reflect recent changes and improve test coverage. The
      tests include basic masking functionality, unprivileged behavior
      on pointer arithmetic which goes oob, mixed bounds tests, negative
      unknown scalar but resulting positive offset for access and helper
      range, handling of arithmetic from multiple maps, various masking
      scenarios with subsequent map value access and others including two
      test cases from Jann Horn for prior fixes.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      80c9b2fa
    • D
      bpf: prevent out of bounds speculation on pointer arithmetic · 979d63d5
      Daniel Borkmann 提交于
      Jann reported that the original commit back in b2157399
      ("bpf: prevent out-of-bounds speculation") was not sufficient
      to stop CPU from speculating out of bounds memory access:
      While b2157399 only focussed on masking array map access
      for unprivileged users for tail calls and data access such
      that the user provided index gets sanitized from BPF program
      and syscall side, there is still a more generic form affected
      from BPF programs that applies to most maps that hold user
      data in relation to dynamic map access when dealing with
      unknown scalars or "slow" known scalars as access offset, for
      example:
      
        - Load a map value pointer into R6
        - Load an index into R7
        - Do a slow computation (e.g. with a memory dependency) that
          loads a limit into R8 (e.g. load the limit from a map for
          high latency, then mask it to make the verifier happy)
        - Exit if R7 >= R8 (mispredicted branch)
        - Load R0 = R6[R7]
        - Load R0 = R6[R0]
      
      For unknown scalars there are two options in the BPF verifier
      where we could derive knowledge from in order to guarantee
      safe access to the memory: i) While </>/<=/>= variants won't
      allow to derive any lower or upper bounds from the unknown
      scalar where it would be safe to add it to the map value
      pointer, it is possible through ==/!= test however. ii) another
      option is to transform the unknown scalar into a known scalar,
      for example, through ALU ops combination such as R &= <imm>
      followed by R |= <imm> or any similar combination where the
      original information from the unknown scalar would be destroyed
      entirely leaving R with a constant. The initial slow load still
      precedes the latter ALU ops on that register, so the CPU
      executes speculatively from that point. Once we have the known
      scalar, any compare operation would work then. A third option
      only involving registers with known scalars could be crafted
      as described in [0] where a CPU port (e.g. Slow Int unit)
      would be filled with many dependent computations such that
      the subsequent condition depending on its outcome has to wait
      for evaluation on its execution port and thereby executing
      speculatively if the speculated code can be scheduled on a
      different execution port, or any other form of mistraining
      as described in [1], for example. Given this is not limited
      to only unknown scalars, not only map but also stack access
      is affected since both is accessible for unprivileged users
      and could potentially be used for out of bounds access under
      speculation.
      
      In order to prevent any of these cases, the verifier is now
      sanitizing pointer arithmetic on the offset such that any
      out of bounds speculation would be masked in a way where the
      pointer arithmetic result in the destination register will
      stay unchanged, meaning offset masked into zero similar as
      in array_index_nospec() case. With regards to implementation,
      there are three options that were considered: i) new insn
      for sanitation, ii) push/pop insn and sanitation as inlined
      BPF, iii) reuse of ax register and sanitation as inlined BPF.
      
      Option i) has the downside that we end up using from reserved
      bits in the opcode space, but also that we would require
      each JIT to emit masking as native arch opcodes meaning
      mitigation would have slow adoption till everyone implements
      it eventually which is counter-productive. Option ii) and iii)
      have both in common that a temporary register is needed in
      order to implement the sanitation as inlined BPF since we
      are not allowed to modify the source register. While a push /
      pop insn in ii) would be useful to have in any case, it
      requires once again that every JIT needs to implement it
      first. While possible, amount of changes needed would also
      be unsuitable for a -stable patch. Therefore, the path which
      has fewer changes, less BPF instructions for the mitigation
      and does not require anything to be changed in the JITs is
      option iii) which this work is pursuing. The ax register is
      already mapped to a register in all JITs (modulo arm32 where
      it's mapped to stack as various other BPF registers there)
      and used in constant blinding for JITs-only so far. It can
      be reused for verifier rewrites under certain constraints.
      The interpreter's tmp "register" has therefore been remapped
      into extending the register set with hidden ax register and
      reusing that for a number of instructions that needed the
      prior temporary variable internally (e.g. div, mod). This
      allows for zero increase in stack space usage in the interpreter,
      and enables (restricted) generic use in rewrites otherwise as
      long as such a patchlet does not make use of these instructions.
      The sanitation mask is dynamic and relative to the offset the
      map value or stack pointer currently holds.
      
      There are various cases that need to be taken under consideration
      for the masking, e.g. such operation could look as follows:
      ptr += val or val += ptr or ptr -= val. Thus, the value to be
      sanitized could reside either in source or in destination
      register, and the limit is different depending on whether
      the ALU op is addition or subtraction and depending on the
      current known and bounded offset. The limit is derived as
      follows: limit := max_value_size - (smin_value + off). For
      subtraction: limit := umax_value + off. This holds because
      we do not allow any pointer arithmetic that would
      temporarily go out of bounds or would have an unknown
      value with mixed signed bounds where it is unclear at
      verification time whether the actual runtime value would
      be either negative or positive. For example, we have a
      derived map pointer value with constant offset and bounded
      one, so limit based on smin_value works because the verifier
      requires that statically analyzed arithmetic on the pointer
      must be in bounds, and thus it checks if resulting
      smin_value + off and umax_value + off is still within map
      value bounds at time of arithmetic in addition to time of
      access. Similarly, for the case of stack access we derive
      the limit as follows: MAX_BPF_STACK + off for subtraction
      and -off for the case of addition where off := ptr_reg->off +
      ptr_reg->var_off.value. Subtraction is a special case for
      the masking which can be in form of ptr += -val, ptr -= -val,
      or ptr -= val. In the first two cases where we know that
      the value is negative, we need to temporarily negate the
      value in order to do the sanitation on a positive value
      where we later swap the ALU op, and restore original source
      register if the value was in source.
      
      The sanitation of pointer arithmetic alone is still not fully
      sufficient as is, since a scenario like the following could
      happen ...
      
        PTR += 0x1000 (e.g. K-based imm)
        PTR -= BIG_NUMBER_WITH_SLOW_COMPARISON
        PTR += 0x1000
        PTR -= BIG_NUMBER_WITH_SLOW_COMPARISON
        [...]
      
      ... which under speculation could end up as ...
      
        PTR += 0x1000
        PTR -= 0 [ truncated by mitigation ]
        PTR += 0x1000
        PTR -= 0 [ truncated by mitigation ]
        [...]
      
      ... and therefore still access out of bounds. To prevent such
      case, the verifier is also analyzing safety for potential out
      of bounds access under speculative execution. Meaning, it is
      also simulating pointer access under truncation. We therefore
      "branch off" and push the current verification state after the
      ALU operation with known 0 to the verification stack for later
      analysis. Given the current path analysis succeeded it is
      likely that the one under speculation can be pruned. In any
      case, it is also subject to existing complexity limits and
      therefore anything beyond this point will be rejected. In
      terms of pruning, it needs to be ensured that the verification
      state from speculative execution simulation must never prune
      a non-speculative execution path, therefore, we mark verifier
      state accordingly at the time of push_stack(). If verifier
      detects out of bounds access under speculative execution from
      one of the possible paths that includes a truncation, it will
      reject such program.
      
      Given we mask every reg-based pointer arithmetic for
      unprivileged programs, we've been looking into how it could
      affect real-world programs in terms of size increase. As the
      majority of programs are targeted for privileged-only use
      case, we've unconditionally enabled masking (with its alu
      restrictions on top of it) for privileged programs for the
      sake of testing in order to check i) whether they get rejected
      in its current form, and ii) by how much the number of
      instructions and size will increase. We've tested this by
      using Katran, Cilium and test_l4lb from the kernel selftests.
      For Katran we've evaluated balancer_kern.o, Cilium bpf_lxc.o
      and an older test object bpf_lxc_opt_-DUNKNOWN.o and l4lb
      we've used test_l4lb.o as well as test_l4lb_noinline.o. We
      found that none of the programs got rejected by the verifier
      with this change, and that impact is rather minimal to none.
      balancer_kern.o had 13,904 bytes (1,738 insns) xlated and
      7,797 bytes JITed before and after the change. Most complex
      program in bpf_lxc.o had 30,544 bytes (3,817 insns) xlated
      and 18,538 bytes JITed before and after and none of the other
      tail call programs in bpf_lxc.o had any changes either. For
      the older bpf_lxc_opt_-DUNKNOWN.o object we found a small
      increase from 20,616 bytes (2,576 insns) and 12,536 bytes JITed
      before to 20,664 bytes (2,582 insns) and 12,558 bytes JITed
      after the change. Other programs from that object file had
      similar small increase. Both test_l4lb.o had no change and
      remained at 6,544 bytes (817 insns) xlated and 3,401 bytes
      JITed and for test_l4lb_noinline.o constant at 5,080 bytes
      (634 insns) xlated and 3,313 bytes JITed. This can be explained
      in that LLVM typically optimizes stack based pointer arithmetic
      by using K-based operations and that use of dynamic map access
      is not overly frequent. However, in future we may decide to
      optimize the algorithm further under known guarantees from
      branch and value speculation. Latter seems also unclear in
      terms of prediction heuristics that today's CPUs apply as well
      as whether there could be collisions in e.g. the predictor's
      Value History/Pattern Table for triggering out of bounds access,
      thus masking is performed unconditionally at this point but could
      be subject to relaxation later on. We were generally also
      brainstorming various other approaches for mitigation, but the
      blocker was always lack of available registers at runtime and/or
      overhead for runtime tracking of limits belonging to a specific
      pointer. Thus, we found this to be minimally intrusive under
      given constraints.
      
      With that in place, a simple example with sanitized access on
      unprivileged load at post-verification time looks as follows:
      
        # bpftool prog dump xlated id 282
        [...]
        28: (79) r1 = *(u64 *)(r7 +0)
        29: (79) r2 = *(u64 *)(r7 +8)
        30: (57) r1 &= 15
        31: (79) r3 = *(u64 *)(r0 +4608)
        32: (57) r3 &= 1
        33: (47) r3 |= 1
        34: (2d) if r2 > r3 goto pc+19
        35: (b4) (u32) r11 = (u32) 20479  |
        36: (1f) r11 -= r2                | Dynamic sanitation for pointer
        37: (4f) r11 |= r2                | arithmetic with registers
        38: (87) r11 = -r11               | containing bounded or known
        39: (c7) r11 s>>= 63              | scalars in order to prevent
        40: (5f) r11 &= r2                | out of bounds speculation.
        41: (0f) r4 += r11                |
        42: (71) r4 = *(u8 *)(r4 +0)
        43: (6f) r4 <<= r1
        [...]
      
      For the case where the scalar sits in the destination register
      as opposed to the source register, the following code is emitted
      for the above example:
      
        [...]
        16: (b4) (u32) r11 = (u32) 20479
        17: (1f) r11 -= r2
        18: (4f) r11 |= r2
        19: (87) r11 = -r11
        20: (c7) r11 s>>= 63
        21: (5f) r2 &= r11
        22: (0f) r2 += r0
        23: (61) r0 = *(u32 *)(r2 +0)
        [...]
      
      JIT blinding example with non-conflicting use of r10:
      
        [...]
         d5:	je     0x0000000000000106    _
         d7:	mov    0x0(%rax),%edi       |
         da:	mov    $0xf153246,%r10d     | Index load from map value and
         e0:	xor    $0xf153259,%r10      | (const blinded) mask with 0x1f.
         e7:	and    %r10,%rdi            |_
         ea:	mov    $0x2f,%r10d          |
         f0:	sub    %rdi,%r10            | Sanitized addition. Both use r10
         f3:	or     %rdi,%r10            | but do not interfere with each
         f6:	neg    %r10                 | other. (Neither do these instructions
         f9:	sar    $0x3f,%r10           | interfere with the use of ax as temp
         fd:	and    %r10,%rdi            | in interpreter.)
        100:	add    %rax,%rdi            |_
        103:	mov    0x0(%rdi),%eax
       [...]
      
      Tested that it fixes Jann's reproducer, and also checked that test_verifier
      and test_progs suite with interpreter, JIT and JIT with hardening enabled
      on x86-64 and arm64 runs successfully.
      
        [0] Speculose: Analyzing the Security Implications of Speculative
            Execution in CPUs, Giorgi Maisuradze and Christian Rossow,
            https://arxiv.org/pdf/1801.04084.pdf
      
        [1] A Systematic Evaluation of Transient Execution Attacks and
            Defenses, Claudio Canella, Jo Van Bulck, Michael Schwarz,
            Moritz Lipp, Benjamin von Berg, Philipp Ortner, Frank Piessens,
            Dmitry Evtyushkin, Daniel Gruss,
            https://arxiv.org/pdf/1811.05441.pdf
      
      Fixes: b2157399 ("bpf: prevent out-of-bounds speculation")
      Reported-by: NJann Horn <jannh@google.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      979d63d5
    • D
      bpf: fix check_map_access smin_value test when pointer contains offset · b7137c4e
      Daniel Borkmann 提交于
      In check_map_access() we probe actual bounds through __check_map_access()
      with offset of reg->smin_value + off for lower bound and offset of
      reg->umax_value + off for the upper bound. However, even though the
      reg->smin_value could have a negative value, the final result of the
      sum with off could be positive when pointer arithmetic with known and
      unknown scalars is combined. In this case we reject the program with
      an error such as "R<x> min value is negative, either use unsigned index
      or do a if (index >=0) check." even though the access itself would be
      fine. Therefore extend the check to probe whether the actual resulting
      reg->smin_value + off is less than zero.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      b7137c4e
    • D
      bpf: restrict unknown scalars of mixed signed bounds for unprivileged · 9d7eceed
      Daniel Borkmann 提交于
      For unknown scalars of mixed signed bounds, meaning their smin_value is
      negative and their smax_value is positive, we need to reject arithmetic
      with pointer to map value. For unprivileged the goal is to mask every
      map pointer arithmetic and this cannot reliably be done when it is
      unknown at verification time whether the scalar value is negative or
      positive. Given this is a corner case, the likelihood of breaking should
      be very small.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      9d7eceed
    • D
      bpf: restrict stack pointer arithmetic for unprivileged · e4298d25
      Daniel Borkmann 提交于
      Restrict stack pointer arithmetic for unprivileged users in that
      arithmetic itself must not go out of bounds as opposed to the actual
      access later on. Therefore after each adjust_ptr_min_max_vals() with
      a stack pointer as a destination we simulate a check_stack_access()
      of 1 byte on the destination and once that fails the program is
      rejected for unprivileged program loads. This is analog to map
      value pointer arithmetic and needed for masking later on.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      e4298d25
    • D
      bpf: restrict map value pointer arithmetic for unprivileged · 0d6303db
      Daniel Borkmann 提交于
      Restrict map value pointer arithmetic for unprivileged users in that
      arithmetic itself must not go out of bounds as opposed to the actual
      access later on. Therefore after each adjust_ptr_min_max_vals() with a
      map value pointer as a destination it will simulate a check_map_access()
      of 1 byte on the destination and once that fails the program is rejected
      for unprivileged program loads. We use this later on for masking any
      pointer arithmetic with the remainder of the map value space. The
      likelihood of breaking any existing real-world unprivileged eBPF
      program is very small for this corner case.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      0d6303db
    • D
      bpf: enable access to ax register also from verifier rewrite · 9b73bfdd
      Daniel Borkmann 提交于
      Right now we are using BPF ax register in JIT for constant blinding as
      well as in interpreter as temporary variable. Verifier will not be able
      to use it simply because its use will get overridden from the former in
      bpf_jit_blind_insn(). However, it can be made to work in that blinding
      will be skipped if there is prior use in either source or destination
      register on the instruction. Taking constraints of ax into account, the
      verifier is then open to use it in rewrites under some constraints. Note,
      ax register already has mappings in every eBPF JIT.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      9b73bfdd
    • D
      bpf: move tmp variable into ax register in interpreter · 144cd91c
      Daniel Borkmann 提交于
      This change moves the on-stack 64 bit tmp variable in ___bpf_prog_run()
      into the hidden ax register. The latter is currently only used in JITs
      for constant blinding as a temporary scratch register, meaning the BPF
      interpreter will never see the use of ax. Therefore it is safe to use
      it for the cases where tmp has been used earlier. This is needed to later
      on allow restricted hidden use of ax in both interpreter and JITs.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      144cd91c
    • D
      bpf: move {prev_,}insn_idx into verifier env · c08435ec
      Daniel Borkmann 提交于
      Move prev_insn_idx and insn_idx from the do_check() function into
      the verifier environment, so they can be read inside the various
      helper functions for handling the instructions. It's easier to put
      this into the environment rather than changing all call-sites only
      to pass it along. insn_idx is useful in particular since this later
      on allows to hold state in env->insn_aux_data[env->insn_idx].
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      c08435ec
    • L
      Merge tag '9p-for-4.21' of git://github.com/martinetd/linux · 85f78456
      Linus Torvalds 提交于
      Pull 9p updates from Dominique Martinet:
       "Missing prototype warning fix and a syzkaller fix when a 9p server
        advertises a too small msize"
      
      * tag '9p-for-4.21' of git://github.com/martinetd/linux:
        9p/net: put a lower bound on msize
        net/9p: include trans_common.h to fix missing prototype warning.
      85f78456
    • L
      Merge tag '4.21-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · cacf02df
      Linus Torvalds 提交于
      Pull cifs updates from Steve French:
      
       - four fixes for stable
      
       - improvements to DFS including allowing failover to alternate targets
      
       - some small performance improvements
      
      * tag '4.21-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (39 commits)
        cifs: update internal module version number
        cifs: we can not use small padding iovs together with encryption
        cifs: Minor Kconfig clarification
        cifs: Always resolve hostname before reconnecting
        cifs: Add support for failover in cifs_reconnect_tcon()
        cifs: Add support for failover in smb2_reconnect()
        cifs: Only free DFS target list if we actually got one
        cifs: start DFS cache refresher in cifs_mount()
        cifs: Use GFP_ATOMIC when a lock is held in cifs_mount()
        cifs: Add support for failover in cifs_reconnect()
        cifs: Add support for failover in cifs_mount()
        cifs: remove set but not used variable 'sep'
        cifs: Make use of DFS cache to get new DFS referrals
        cifs: minor updates to documentation
        cifs: check kzalloc return
        cifs: remove set but not used variable 'server'
        cifs: Use kzfree() to free password
        cifs: Fix to use kmem_cache_free() instead of kfree()
        cifs: update for current_kernel_time64() removal
        cifs: Add DFS cache routines
        ...
      cacf02df
    • L
      Merge branch 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 74673fc5
      Linus Torvalds 提交于
      Pull TPM updates from James Morris:
      
       - Support for partial reads of /dev/tpm0.
      
       - Clean up for TPM 1.x code: move the commands to tpm1-cmd.c and make
         everything to use the same data structure for building TPM commands
         i.e. struct tpm_buf.
      
      * 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (25 commits)
        tpm: add support for partial reads
        tpm: tpm_ibmvtpm: fix kdoc warnings
        tpm: fix kdoc for tpm2_flush_context_cmd()
        tpm: tpm_try_transmit() refactor error flow.
        tpm: use u32 instead of int for PCR index
        tpm1: reimplement tpm1_continue_selftest() using tpm_buf
        tpm1: reimplement SAVESTATE using tpm_buf
        tpm1: rename tpm1_pcr_read_dev to tpm1_pcr_read()
        tpm1: implement tpm1_pcr_read_dev() using tpm_buf structure
        tpm: tpm1: rewrite tpm1_get_random() using tpm_buf structure
        tpm: tpm-space.c remove unneeded semicolon
        tpm: tpm-interface.c drop unused macros
        tpm: add tpm_auto_startup() into tpm-interface.c
        tpm: factor out tpm_startup function
        tpm: factor out tpm 1.x pm suspend flow into tpm1-cmd.c
        tpm: move tpm 1.x selftest code from tpm-interface.c tpm1-cmd.c
        tpm: factor out tpm1_get_random into tpm1-cmd.c
        tpm: move tpm_getcap to tpm1-cmd.c
        tpm: move tpm1_pcr_extend to tpm1-cmd.c
        tpm: factor out tpm_get_timeouts()
        ...
      74673fc5
    • L
      Merge branch 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 19f2e267
      Linus Torvalds 提交于
      Pull smack updates from James Morris:
       "Two Smack patches for 4.21.
      
        Jose's patch adds missing documentation and Zoran's fleshes out the
        access checks on keyrings"
      
      * 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        Smack: Improve Documentation
        smack: fix access permissions for keyring
      19f2e267
    • L
      block: don't use un-ordered __set_current_state(TASK_UNINTERRUPTIBLE) · 1ac5cd49
      Linus Torvalds 提交于
      This mostly reverts commit 849a3700 ("block: avoid ordered task
      state change for polled IO").  It was wrongly claiming that the ordering
      wasn't necessary.  The memory barrier _is_ necessary.
      
      If something is truly polling and not going to sleep, it's the whole
      state setting that is unnecessary, not the memory barrier.  Whenever you
      set your state to a sleeping state, you absolutely need the memory
      barrier.
      
      Note that sometimes the memory barrier can be elsewhere.  For example,
      the ordering might be provided by an external lock, or by setting the
      process state to sleeping before adding yourself to the wait queue list
      that is used for waking up (where the wait queue lock itself will
      guarantee that any wakeup will correctly see the sleeping state).
      
      But none of those cases were true here.
      
      NOTE! Some of the polling paths may indeed be able to drop the state
      setting entirely, at which point the memory barrier also goes away.
      
      (Also note that this doesn't revert the TASK_RUNNING cases: there is no
      race between a wakeup and setting the process state to TASK_RUNNING,
      since the end result doesn't depend on ordering).
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1ac5cd49
    • E
      isdn: fix kernel-infoleak in capi_unlocked_ioctl · d63967e4
      Eric Dumazet 提交于
      Since capi_ioctl() copies 64 bytes after calling
      capi20_get_manufacturer() we need to ensure to not leak
      information to user.
      
      BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
      CPU: 0 PID: 11245 Comm: syz-executor633 Not tainted 4.20.0-rc7+ #2
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x173/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
       kmsan_internal_check_memory+0x9d4/0xb00 mm/kmsan/kmsan.c:704
       kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
       _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
       capi_ioctl include/linux/uaccess.h:177 [inline]
       capi_unlocked_ioctl+0x1a0b/0x1bf0 drivers/isdn/capi/capi.c:939
       do_vfs_ioctl+0xebd/0x2bf0 fs/ioctl.c:46
       ksys_ioctl fs/ioctl.c:713 [inline]
       __do_sys_ioctl fs/ioctl.c:720 [inline]
       __se_sys_ioctl+0x1da/0x270 fs/ioctl.c:718
       __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:718
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      RIP: 0033:0x440019
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffdd4659fb8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440019
      RDX: 0000000020000080 RSI: 00000000c0044306 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
      R10: 0000000000000000 R11: 0000000000000213 R12: 00000000004018a0
      R13: 0000000000401930 R14: 0000000000000000 R15: 0000000000000000
      
      Local variable description: ----data.i@capi_unlocked_ioctl
      Variable was created at:
       capi_ioctl drivers/isdn/capi/capi.c:747 [inline]
       capi_unlocked_ioctl+0x82/0x1bf0 drivers/isdn/capi/capi.c:939
       do_vfs_ioctl+0xebd/0x2bf0 fs/ioctl.c:46
      
      Bytes 12-63 of 64 are uninitialized
      Memory access of size 64 starts at ffff88807ac5fce8
      Data copied to user address 0000000020000080
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Cc: Karsten Keil <isdn@linux-pingi.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d63967e4