1. 19 12月, 2017 2 次提交
    • J
      tipc: remove leaving group member from all lists · 3f42f5fe
      Jon Maloy 提交于
      A group member going into state LEAVING should never go back to any
      other state before it is finally deleted. However, this might happen
      if the socket needs to send out a RECLAIM message during this interval.
      Since we forget to remove the leaving member from the group's 'active'
      or 'pending' list, the member might be selected for reclaiming, change
      state to RECLAIMING, and get stuck in this state instead of being
      deleted. This might lead to suppression of the expected 'member down'
      event to the receiver.
      
      We fix this by removing the member from all lists, except the RB tree,
      at the moment it goes into state LEAVING.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3f42f5fe
    • J
      tipc: fix lost member events bug · 23483399
      Jon Maloy 提交于
      Group messages are not supposed to be returned to sender when the
      destination socket disappears. This is done correctly for regular
      traffic messages, by setting the 'dest_droppable' bit in the header.
      But we forget to do that in group protocol messages. This has the effect
      that such messages may sometimes bounce back to the sender, be perceived
      as a legitimate peer message, and wreak general havoc for the rest of
      the session. In particular, we have seen that a member in state LEAVING
      may go back to state RECLAIMED or REMITTED, hence causing suppression
      of an otherwise expected 'member down' event to the user.
      
      We fix this by setting the 'dest_droppable' bit even in group protocol
      messages.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      23483399
  2. 18 12月, 2017 1 次提交
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · b36025b1
      David S. Miller 提交于
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2017-12-17
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix a corner case in generic XDP where we have non-linear skbs
         but enough tailroom in the skb to not miss to linearizing there,
         from Song.
      
      2) Fix BPF JIT bugs in s390x and ppc64 to not recache skb data when
         BPF context is not skb, from Daniel.
      
      3) Fix a BPF JIT bug in sparc64 where recaching skb data after helper
         call would use the wrong register for the skb, from Daniel.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b36025b1
  3. 17 12月, 2017 5 次提交
    • A
      vxlan: restore dev->mtu setting based on lower device · f870c1ff
      Alexey Kodanev 提交于
      Stefano Brivio says:
          Commit a985343b ("vxlan: refactor verification and
          application of configuration") introduced a change in the
          behaviour of initial MTU setting: earlier, the MTU for a link
          created on top of a given lower device, without an initial MTU
          specification, was set to the MTU of the lower device minus
          headroom as a result of this path in vxlan_dev_configure():
      
      	if (!conf->mtu)
      		dev->mtu = lowerdev->mtu -
      			   (use_ipv6 ? VXLAN6_HEADROOM : VXLAN_HEADROOM);
      
          which is now gone. Now, the initial MTU, in absence of a
          configured value, is simply set by ether_setup() to ETH_DATA_LEN
          (1500 bytes).
      
          This breaks userspace expectations in case the MTU of
          the lower device is higher than 1500 bytes minus headroom.
      
      This patch restores the previous behaviour on newlink operation. Since
      max_mtu can be negative and we update dev->mtu directly, also check it
      for valid minimum.
      Reported-by: NJunhan Yan <juyan@redhat.com>
      Fixes: a985343b ("vxlan: refactor verification and application of configuration")
      Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f870c1ff
    • B
      ipv6: icmp6: Allow icmp messages to be looped back · 588753f1
      Brendan McGrath 提交于
      One example of when an ICMPv6 packet is required to be looped back is
      when a host acts as both a Multicast Listener and a Multicast Router.
      
      A Multicast Router will listen on address ff02::16 for MLDv2 messages.
      
      Currently, MLDv2 messages originating from a Multicast Listener running
      on the same host as the Multicast Router are not being delivered to the
      Multicast Router. This is due to dst.input being assigned the default
      value of dst_discard.
      
      This results in the packet being looped back but discarded before being
      delivered to the Multicast Router.
      
      This patch sets dst.input to ip6_input to ensure a looped back packet
      is delivered to the Multicast Router.
      Signed-off-by: NBrendan McGrath <redmcg@redmandi.dyndns.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      588753f1
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · f3b5ad89
      Linus Torvalds 提交于
      Pull rdma fixes from Jason Gunthorpe:
       "More fixes from testing done on the rc kernel, including more SELinux
        testing. Looking forward, lockdep found regression today in ipoib
        which is still being fixed.
      
        Summary:
      
         - Fix for SELinux on the umad SMI path. Some old hardware does not
           fill the PKey properly exposing another bug in the newer SELinux
           code.
      
         - Check the input port as we can exceed array bounds from this user
           supplied value
      
         - Users are unable to use the hash field support as they want due to
           incorrect checks on the field restrictions, correct that so the
           feature works as intended
      
         - User triggerable oops in the NETLINK_RDMA handler
      
         - cxgb4 driver fix for a bad interaction with CQ flushing in iser
           caused by patches in this merge window, and bad CQ flushing during
           normal close.
      
         - Unbalanced memalloc_noio in ipoib in an error path"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        IB/ipoib: Restore MM behavior in case of tx_ring allocation failure
        iw_cxgb4: only insert drain cqes if wq is flushed
        iw_cxgb4: only clear the ARMED bit if a notification is needed
        RDMA/netlink: Fix general protection fault
        IB/mlx4: Fix RSS hash fields restrictions
        IB/core: Don't enforce PKey security on SMI MADs
        IB/core: Bound check alternate path port number
      f3b5ad89
    • L
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · f25e2295
      Linus Torvalds 提交于
      Pull i2c fixes from Wolfram Sang:
       "Two bugfixes for the AT24 I2C eeprom driver and some minor corrections
        for I2C bus drivers"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: piix4: Fix port number check on release
        i2c: stm32: Fix copyrights
        i2c-cht-wc: constify platform_device_id
        eeprom: at24: change nvmem stride to 1
        eeprom: at24: fix I2C device selection for runtime PM
      f25e2295
    • L
      Merge tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs · d025fbf1
      Linus Torvalds 提交于
      Pull NFS client fixes from Anna Schumaker:
       "This has two stable bugfixes, one to fix a BUG_ON() when
        nfs_commit_inode() is called with no outstanding commit requests and
        another to fix a race in the SUNRPC receive codepath.
      
        Additionally, there are also fixes for an NFS client deadlock and an
        xprtrdma performance regression.
      
        Summary:
      
        Stable bugfixes:
         - NFS: Avoid a BUG_ON() in nfs_commit_inode() by not waiting for a
           commit in the case that there were no commit requests.
         - SUNRPC: Fix a race in the receive code path
      
        Other fixes:
         - NFS: Fix a deadlock in nfs client initialization
         - xprtrdma: Fix a performance regression for small IOs"
      
      * tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        SUNRPC: Fix a race in the receive code path
        nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests
        xprtrdma: Spread reply processing over more CPUs
        nfs: fix a deadlock in nfs client initialization
      d025fbf1
  4. 16 12月, 2017 32 次提交
    • L
      Revert "mm: replace p??_write with pte_access_permitted in fault + gup paths" · f6f37321
      Linus Torvalds 提交于
      This reverts commits 5c9d2d5c, c7da82b8, and e7fe7b5c.
      
      We'll probably need to revisit this, but basically we should not
      complicate the get_user_pages_fast() case, and checking the actual page
      table protection key bits will require more care anyway, since the
      protection keys depend on the exact state of the VM in question.
      
      Particularly when doing a "remote" page lookup (ie in somebody elses VM,
      not your own), you need to be much more careful than this was.  Dave
      Hansen says:
      
       "So, the underlying bug here is that we now a get_user_pages_remote()
        and then go ahead and do the p*_access_permitted() checks against the
        current PKRU. This was introduced recently with the addition of the
        new p??_access_permitted() calls.
      
        We have checks in the VMA path for the "remote" gups and we avoid
        consulting PKRU for them. This got missed in the pkeys selftests
        because I did a ptrace read, but not a *write*. I also didn't
        explicitly test it against something where a COW needed to be done"
      
      It's also not entirely clear that it makes sense to check the protection
      key bits at this level at all.  But one possible eventual solution is to
      make the get_user_pages_fast() case just abort if it sees protection key
      bits set, which makes us fall back to the regular get_user_pages() case,
      which then has a vma and can do the check there if we want to.
      
      We'll see.
      
      Somewhat related to this all: what we _do_ want to do some day is to
      check the PAGE_USER bit - it should obviously always be set for user
      pages, but it would be a good check to have back.  Because we have no
      generic way to test for it, we lost it as part of moving over from the
      architecture-specific x86 GUP implementation to the generic one in
      commit e585513b ("x86/mm/gup: Switch GUP to the generic
      get_user_page_fast() implementation").
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f6f37321
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7a3c296a
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Clamp timeouts to INT_MAX in conntrack, from Jay Elliot.
      
       2) Fix broken UAPI for BPF_PROG_TYPE_PERF_EVENT, from Hendrik
          Brueckner.
      
       3) Fix locking in ieee80211_sta_tear_down_BA_sessions, from Johannes
          Berg.
      
       4) Add missing barriers to ptr_ring, from Michael S. Tsirkin.
      
       5) Don't advertise gigabit in sh_eth when not available, from Thomas
          Petazzoni.
      
       6) Check network namespace when delivering to netlink taps, from Kevin
          Cernekee.
      
       7) Kill a race in raw_sendmsg(), from Mohamed Ghannam.
      
       8) Use correct address in TCP md5 lookups when replying to an incoming
          segment, from Christoph Paasch.
      
       9) Add schedule points to BPF map alloc/free, from Eric Dumazet.
      
      10) Don't allow silly mtu values to be used in ipv4/ipv6 multicast, also
          from Eric Dumazet.
      
      11) Fix SKB leak in tipc, from Jon Maloy.
      
      12) Disable MAC learning on OVS ports of mlxsw, from Yuval Mintz.
      
      13) SKB leak fix in skB_complete_tx_timestamp(), from Willem de Bruijn.
      
      14) Add some new qmi_wwan device IDs, from Daniele Palmas.
      
      15) Fix static key imbalance in ingress qdisc, from Jiri Pirko.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
        net: qcom/emac: Reduce timeout for mdio read/write
        net: sched: fix static key imbalance in case of ingress/clsact_init error
        net: sched: fix clsact init error path
        ip_gre: fix wrong return value of erspan_rcv
        net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support
        pkt_sched: Remove TC_RED_OFFLOADED from uapi
        net: sched: Move to new offload indication in RED
        net: sched: Add TCA_HW_OFFLOAD
        net: aquantia: Increment driver version
        net: aquantia: Fix typo in ethtool statistics names
        net: aquantia: Update hw counters on hw init
        net: aquantia: Improve link state and statistics check interval callback
        net: aquantia: Fill in multicast counter in ndev stats from hardware
        net: aquantia: Fill ndev stat couters from hardware
        net: aquantia: Extend stat counters to 64bit values
        net: aquantia: Fix hardware DMA stream overload on large MRRS
        net: aquantia: Fix actual speed capabilities reporting
        sock: free skb in skb_complete_tx_timestamp on error
        s390/qeth: update takeover IPs after configuration change
        s390/qeth: lock IP table while applying takeover changes
        ...
      7a3c296a
    • L
      Merge tag 'usb-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · c36c7a7c
      Linus Torvalds 提交于
      Pull USB fixes from Greg KH:
       "Here are some USB fixes for 4.15-rc4.
      
        There is the usual handful gadget/dwc2/dwc3 fixes as always, for
        reported issues. But the most important things in here is the core fix
        from Alan Stern to resolve a nasty security bug (my first attempt is
        reverted, Alan's was much cleaner), as well as a number of usbip fixes
        from Shuah Khan to resolve those reported security issues.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        USB: core: prevent malicious bNumInterfaces overflow
        Revert "USB: core: only clean up what we allocated"
        USB: core: only clean up what we allocated
        Revert "usb: gadget: allow to enable legacy drivers without USB_ETH"
        usb: gadget: webcam: fix V4L2 Kconfig dependency
        usb: dwc2: Fix TxFIFOn sizes and total TxFIFO size issues
        usb: dwc3: gadget: Fix PCM1 for ISOC EP with ep->mult less than 3
        usb: dwc3: of-simple: set dev_pm_ops
        usb: dwc3: of-simple: fix missing clk_disable_unprepare
        usb: dwc3: gadget: Wait longer for controller to end command processing
        usb: xhci: fix TDS for MTK xHCI1.1
        xhci: Don't add a virt_dev to the devs array before it's fully allocated
        usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer
        usbip: prevent vhci_hcd driver from leaking a socket pointer address
        usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input
        usbip: fix stub_rx: get_pipe() to validate endpoint number
        tools/usbip: fixes potential (minor) "buffer overflow" (detected on recent gcc with -Werror)
        USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID
        usb: musb: da8xx: fix babble condition handling
      c36c7a7c
    • L
      Merge tag 'staging-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · a84ec723
      Linus Torvalds 提交于
      Pull staging fixes from Greg KH:
       "Here are some small staging driver fixes for 4.15-rc4.
      
        One patch for the ccree driver to prevent an unitialized value from
        being returned to a caller, and the other fixes a logic error in the
        pi433 driver"
      
      * tag 'staging-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: pi433: Fixes issue with bit shift in rf69_get_modulation
        staging: ccree: Uninitialized return in ssi_ahash_import()
      a84ec723
    • L
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · d6e47eed
      Linus Torvalds 提交于
      Pull virtio regression fixes from Michael Tsirkin:
       "Fixes two issues in the latest kernel"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_mmio: fix devm cleanup
        ptr_ring: fix up after recent ptr_ring changes
      d6e47eed
    • L
      Merge tag 'for-4.15/dm-fixes' of... · ee1b43ec
      Linus Torvalds 提交于
      Merge tag 'for-4.15/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - fix a particularly nasty DM core bug in a 4.15 refcount_t conversion.
      
       - fix various targets to dm_register_target after module __init
         resources created; otherwise racing lvm2 commands could result in a
         NULL pointer during initialization of associated DM kernel module.
      
       - fix regression in bio-based DM multipath queue_if_no_path handling.
      
       - fix DM bufio's shrinker to reclaim more than one buffer per scan.
      
      * tag 'for-4.15/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm bufio: fix shrinker scans when (nr_to_scan < retain_target)
        dm mpath: fix bio-based multipath queue_if_no_path handling
        dm: fix various targets to dm_register_target after module __init resources created
        dm table: fix regression from improper dm_dev_internal.count refcount_t conversion
      ee1b43ec
    • L
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 66dbbd72
      Linus Torvalds 提交于
      Pull SCSI fixes from James Bottomley:
       "The most important one is the bfa fix because it's easy to oops the
        kernel with this driver (this includes the commit that corrects the
        compiler warning in the original), a regression in the new timespec
        conversion in aacraid and a regression in the Fibre Channel ELS
        handling patch.
      
        The other three are a theoretical problem with termination in the
        vendor/host matching code and a use after free in lpfc.
      
        The additional patches are a fix for an I/O hang in the mq code under
        certain circumstances and a rare oops in some debugging code"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: core: Fix a scsi_show_rq() NULL pointer dereference
        scsi: MAINTAINERS: change FCoE list to linux-scsi
        scsi: libsas: fix length error in sas_smp_handler()
        scsi: bfa: fix type conversion warning
        scsi: core: run queue if SCSI device queue isn't ready and queue is idle
        scsi: scsi_devinfo: cleanly zero-pad devinfo strings
        scsi: scsi_devinfo: handle non-terminated strings
        scsi: bfa: fix access to bfad_im_port_s
        scsi: aacraid: address UBSAN warning regression
        scsi: libfc: fix ELS request handling
        scsi: lpfc: Use after free in lpfc_rq_buf_free()
      66dbbd72
    • L
      Merge tag 'mmc-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 07a20ed1
      Linus Torvalds 提交于
      Pull MMC fixes from Ulf Hansson:
       "A couple of MMC fixes:
      
         - fix use of uninitialized drv_typ variable
      
         - apply NO_CMD23 quirk to some specific SD cards to make them work"
      
      * tag 'mmc-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: core: apply NO_CMD23 quirk to some specific cards
        mmc: core: properly init drv_type
      07a20ed1
    • L
      Merge tag 'ceph-for-4.15-rc4' of git://github.com/ceph/ceph-client · dd3d66b8
      Linus Torvalds 提交于
      Pull ceph fix from Ilya Dryomov:
       "CephFS inode trimming fix from Zheng, marked for stable"
      
      * tag 'ceph-for-4.15-rc4' of git://github.com/ceph/ceph-client:
        ceph: drop negative child dentries before try pruning inode's alias
      dd3d66b8
    • L
      Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 227701e0
      Linus Torvalds 提交于
      Pull overlayfs fixes from Miklos Szeredi:
      
       - fix incomplete syncing of filesystem
      
       - fix regression in readdir on ovl over 9p
      
       - only follow redirects when needed
      
       - misc fixes and cleanups
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: fix overlay: warning prefix
        ovl: Use PTR_ERR_OR_ZERO()
        ovl: Sync upper dirty data when syncing overlayfs
        ovl: update ctx->pos on impure dir iteration
        ovl: Pass ovl_get_nlink() parameters in right order
        ovl: don't follow redirects if redirect_dir=off
      227701e0
    • H
      net: qcom/emac: Reduce timeout for mdio read/write · 043ee1de
      Hemanth Puranik 提交于
      Currently mdio read/write takes around ~115us as the timeout
      between status check is set to 100us.
      By reducing the timeout to 1us mdio read/write takes ~15us to
      complete. This improves the link up event response.
      Signed-off-by: NHemanth Puranik <hpuranik@codeaurora.org>
      Acked-by: NTimur Tabi <timur@codeaurora.org>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      043ee1de
    • L
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 06f976ec
      Linus Torvalds 提交于
      Pull arm64 fixes from Will Deacon:
       "There are some significant fixes in here for FP state corruption,
        hardware access/dirty PTE corruption and an erratum workaround for the
        Falkor CPU.
      
        I'm hoping that things finally settle down now, but never say never...
      
        Summary:
      
         - Fix FPSIMD context switch regression introduced in -rc2
      
         - Fix ABI break with SVE CPUID register reporting
      
         - Fix use of uninitialised variable
      
         - Fixes to hardware access/dirty management and sanity checking
      
         - CPU erratum workaround for Falkor CPUs
      
         - Fix reporting of writeable+executable mappings
      
         - Fix signal reporting for RAS errors"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: fpsimd: Fix copying of FP state from signal frame into task struct
        arm64/sve: Report SVE to userspace via CPUID only if supported
        arm64: fix CONFIG_DEBUG_WX address reporting
        arm64: fault: avoid send SIGBUS two times
        arm64: hw_breakpoint: Use linux/uaccess.h instead of asm/uaccess.h
        arm64: Add software workaround for Falkor erratum 1041
        arm64: Define cputype macros for Falkor CPU
        arm64: mm: Fix false positives in set_pte_at access/dirty race detection
        arm64: mm: Fix pte_mkclean, pte_mkdirty semantics
        arm64: Initialise high_memory global variable earlier
      06f976ec
    • J
      net: sched: fix static key imbalance in case of ingress/clsact_init error · b59e6979
      Jiri Pirko 提交于
      Move static key increments to the beginning of the init function
      so they pair 1:1 with decrements in ingress/clsact_destroy,
      which is called in case ingress/clsact_init fails.
      
      Fixes: 6529eaba ("net: sched: introduce tcf block infractructure")
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b59e6979
    • J
      net: sched: fix clsact init error path · 343723dd
      Jiri Pirko 提交于
      Since in qdisc_create, the destroy op is called when init fails, we
      don't do cleanup in init and leave it up to destroy.
      This fixes use-after-free when trying to put already freed block.
      
      Fixes: 6e40cf2d ("net: sched: use extended variants of block_get/put in ingress and clsact qdiscs")
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      343723dd
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e53000b1
      Linus Torvalds 提交于
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes:
      
         - fix the s2ram regression related to confusion around segment
           register restoration, plus related cleanups that make the code more
           robust
      
         - a guess-unwinder Kconfig dependency fix
      
         - an isoimage build target fix for certain tool chain combinations
      
         - instruction decoder opcode map fixes+updates, and the syncing of
           the kernel decoder headers to the objtool headers
      
         - a kmmio tracing fix
      
         - two 5-level paging related fixes
      
         - a topology enumeration fix on certain SMP systems"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Resync objtool's instruction decoder source code copy with the kernel's latest version
        x86/decoder: Fix and update the opcodes map
        x86/power: Make restore_processor_context() sane
        x86/power/32: Move SYSENTER MSR restoration to fix_processor_context()
        x86/power/64: Use struct desc_ptr for the IDT in struct saved_context
        x86/unwinder/guess: Prevent using CONFIG_UNWINDER_GUESS=y with CONFIG_STACKDEPOT=y
        x86/build: Don't verify mtools configuration file for isoimage
        x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
        x86/boot/compressed/64: Print error if 5-level paging is not supported
        x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
        x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation
      e53000b1
    • L
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1f76a755
      Linus Torvalds 提交于
      Pull locking fixes from Ingo Molnar:
       "Misc fixes:
      
         - Fix a S390 boot hang that was caused by the lock-break logic.
           Remove lock-break to begin with, as review suggested it was
           unreasonably fragile and our confidence in its continued good
           health is lower than our confidence in its removal.
      
         - Remove the lockdep cross-release checking code for now, because of
           unresolved false positive warnings. This should make lockdep work
           well everywhere again.
      
         - Get rid of the final (and single) ACCESS_ONCE() straggler and
           remove the API from v4.15.
      
         - Fix a liblockdep build warning"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tools/lib/lockdep: Add missing declaration of 'pr_cont()'
        checkpatch: Remove ACCESS_ONCE() warning
        compiler.h: Remove ACCESS_ONCE()
        tools/include: Remove ACCESS_ONCE()
        tools/perf: Convert ACCESS_ONCE() to READ_ONCE()
        locking/lockdep: Remove the cross-release locking checks
        locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y
        locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK
      1f76a755
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a58653cc
      Linus Torvalds 提交于
      Pull scheduler fixes from Ingo Molnar:
       "Two fixes: a crash fix for an ARM SoC platform, and kernel-doc
        warnings fixes"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/rt: Do not pull from current CPU if only one CPU to pull
        sched/core: Fix kernel-doc warnings after code movement
      a58653cc
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3fba3614
      Linus Torvalds 提交于
      Pull perf tooling fix from Ingo Molnar:
       "Synchronize kernel <-> tooling headers to resolve two build warnings
        in the perf build"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tools/headers: Synchronize kernel <-> tooling headers
      3fba3614
    • L
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 35d57884
      Linus Torvalds 提交于
      Pull early_ioremap fix from Ingo Molnar:
       "A boot hang fix when the EFI earlyprintk driver is enabled"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep
      35d57884
    • L
      Merge tag 'for-linus-4.15-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · bde6b37e
      Linus Torvalds 提交于
      Pull xen fixes from Juergen Gross:
       "Two minor fixes for running as Xen dom0:
      
         - when built as 32 bit kernel on large machines the Xen LAPIC
           emulation should report a rather modern LAPIC in order to support
           enough APIC-Ids
      
         - The Xen LAPIC emulation is needed for dom0 only, so build it only
           for kernels supporting to run as Xen dom0"
      
      * tag 'for-linus-4.15-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: XEN_ACPI_PROCESSOR is Dom0-only
        x86/Xen: don't report ancient LAPIC version
      bde6b37e
    • T
      SUNRPC: Fix a race in the receive code path · 90d91b0c
      Trond Myklebust 提交于
      We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
      race with the call to xprt_complete_rqst().
      Reported-by: NChuck Lever <chuck.lever@oracle.com>
      Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=317
      Fixes: ce7c252a ("SUNRPC: Add a separate spinlock to protect..")
      Cc: stable@vger.kernel.org # 4.14+
      Reviewed-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      90d91b0c
    • S
      nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests · dc4fd9ab
      Scott Mayhew 提交于
      If there were no commit requests, then nfs_commit_inode() should not
      wait on the commit or mark the inode dirty, otherwise the following
      BUG_ON can be triggered:
      
      [ 1917.130762] kernel BUG at fs/inode.c:578!
      [ 1917.130766] Oops: Exception in kernel mode, sig: 5 [#1]
      [ 1917.130768] SMP NR_CPUS=2048 NUMA pSeries
      [ 1917.130772] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc sg nx_crypto pseries_rng ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod
      [ 1917.130805] CPU: 2 PID: 14923 Comm: umount.nfs4 Tainted: G               ------------ T 3.10.0-768.el7.ppc64 #1
      [ 1917.130810] task: c0000005ecd88040 ti: c00000004cea0000 task.ti: c00000004cea0000
      [ 1917.130813] NIP: c000000000354178 LR: c000000000354160 CTR: c00000000012db80
      [ 1917.130816] REGS: c00000004cea3720 TRAP: 0700   Tainted: G               ------------ T  (3.10.0-768.el7.ppc64)
      [ 1917.130820] MSR: 8000000100029032 <SF,EE,ME,IR,DR,RI>  CR: 22002822  XER: 20000000
      [ 1917.130828] CFAR: c00000000011f594 SOFTE: 1
      GPR00: c000000000354160 c00000004cea39a0 c0000000014c4700 c0000000018cc750
      GPR04: 000000000000c750 80c0000000000000 0600000000000000 04eeb76bea749a03
      GPR08: 0000000000000034 c0000000018cc758 0000000000000001 d000000005e619e8
      GPR12: c00000000012db80 c000000007b31200 0000000000000000 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR24: 0000000000000000 c000000000dfc3ec 0000000000000000 c0000005eefc02c0
      GPR28: d0000000079dbd50 c0000005b94a02c0 c0000005b94a0250 c0000005b94a01c8
      [ 1917.130867] NIP [c000000000354178] .evict+0x1c8/0x350
      [ 1917.130871] LR [c000000000354160] .evict+0x1b0/0x350
      [ 1917.130873] Call Trace:
      [ 1917.130876] [c00000004cea39a0] [c000000000354160] .evict+0x1b0/0x350 (unreliable)
      [ 1917.130880] [c00000004cea3a30] [c0000000003558cc] .evict_inodes+0x13c/0x270
      [ 1917.130884] [c00000004cea3af0] [c000000000327d20] .kill_anon_super+0x70/0x1e0
      [ 1917.130896] [c00000004cea3b80] [d000000005e43e30] .nfs_kill_super+0x20/0x60 [nfs]
      [ 1917.130900] [c00000004cea3c00] [c000000000328a20] .deactivate_locked_super+0xa0/0x1b0
      [ 1917.130903] [c00000004cea3c80] [c00000000035ba54] .cleanup_mnt+0xd4/0x180
      [ 1917.130907] [c00000004cea3d10] [c000000000119034] .task_work_run+0x114/0x150
      [ 1917.130912] [c00000004cea3db0] [c00000000001ba6c] .do_notify_resume+0xcc/0x100
      [ 1917.130916] [c00000004cea3e30] [c00000000000a7b0] .ret_from_except_lite+0x5c/0x60
      [ 1917.130919] Instruction dump:
      [ 1917.130921] 7fc3f378 486734b5 60000000 387f00a0 38800003 4bdcb365 60000000 e95f00a0
      [ 1917.130927] 694a0060 7d4a0074 794ad182 694a0001 <0b0a0000> 892d02a4 2f890000 40de0134
      Signed-off-by: NScott Mayhew <smayhew@redhat.com>
      Cc: stable@vger.kernel.org # 4.5+
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      dc4fd9ab
    • C
      xprtrdma: Spread reply processing over more CPUs · ccede759
      Chuck Lever 提交于
      Commit d8f532d2 ("xprtrdma: Invoke rpcrdma_reply_handler
      directly from RECV completion") introduced a performance regression
      for NFS I/O small enough to not need memory registration. In multi-
      threaded benchmarks that generate primarily small I/O requests,
      IOPS throughput is reduced by nearly a third. This patch restores
      the previous level of throughput.
      
      Because workqueues are typically BOUND (in particular ib_comp_wq,
      nfsiod_workqueue, and rpciod_workqueue), NFS/RDMA workloads tend
      to aggregate on the CPU that is handling Receive completions.
      
      The usual approach to addressing this problem is to create a QP
      and CQ for each CPU, and then schedule transactions on the QP
      for the CPU where you want the transaction to complete. The
      transaction then does not require an extra context switch during
      completion to end up on the same CPU where the transaction was
      started.
      
      This approach doesn't work for the Linux NFS/RDMA client because
      currently the Linux NFS client does not support multiple connections
      per client-server pair, and the RDMA core API does not make it
      straightforward for ULPs to determine which CPU is responsible for
      handling Receive completions for a CQ.
      
      So for the moment, record the CPU number in the rpcrdma_req before
      the transport sends each RPC Call. Then during Receive completion,
      queue the RPC completion on that same CPU.
      
      Additionally, move all RPC completion processing to the deferred
      handler so that even RPCs with simple small replies complete on
      the CPU that sent the corresponding RPC Call.
      
      Fixes: d8f532d2 ("xprtrdma: Invoke rpcrdma_reply_handler ...")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      ccede759
    • S
      nfs: fix a deadlock in nfs client initialization · c156618e
      Scott Mayhew 提交于
      The following deadlock can occur between a process waiting for a client
      to initialize in while walking the client list during nfsv4 server trunking
      detection and another process waiting for the nfs_clid_init_mutex so it
      can initialize that client:
      
      Process 1                               Process 2
      ---------                               ---------
      spin_lock(&nn->nfs_client_lock);
      list_add_tail(&CLIENTA->cl_share_link,
              &nn->nfs_client_list);
      spin_unlock(&nn->nfs_client_lock);
                                              spin_lock(&nn->nfs_client_lock);
                                              list_add_tail(&CLIENTB->cl_share_link,
                                                      &nn->nfs_client_list);
                                              spin_unlock(&nn->nfs_client_lock);
                                              mutex_lock(&nfs_clid_init_mutex);
                                              nfs41_walk_client_list(clp, result, cred);
                                              nfs_wait_client_init_complete(CLIENTA);
      (waiting for nfs_clid_init_mutex)
      
      Make sure nfs_match_client() only evaluates clients that have completed
      initialization in order to prevent that deadlock.
      
      This patch also fixes v4.0 trunking behavior by not marking the client
      NFS_CS_READY until the clientid has been confirmed.
      Signed-off-by: NScott Mayhew <smayhew@redhat.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      c156618e
    • H
      ip_gre: fix wrong return value of erspan_rcv · c05fad57
      Haishuang Yan 提交于
      If pskb_may_pull return failed, return PACKET_REJECT instead of -ENOMEM.
      
      Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN")
      Cc: William Tu <u9012063@gmail.com>
      Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Acked-by: NWilliam Tu <u9012063@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c05fad57
    • D
      net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support · c647c0d6
      Daniele Palmas 提交于
      This patch adds support for Telit ME910 PID 0x1101.
      Signed-off-by: NDaniele Palmas <dnlplm@gmail.com>
      Acked-by: NBjørn Mork <bjorn@mork.no>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c647c0d6
    • D
      Merge branch 'net-sched-Make-qdisc-offload-uapi-uniform' · d1fca67f
      David S. Miller 提交于
      Yuval Mintz says:
      
      ====================
      net: sched: Make qdisc offload uapi uniform
      
      Several qdiscs can already be offloaded to hardware, but there's an
      inconsistecy in regard to the uapi through which they indicate such
      an offload is taking place - indication is passed to the user via
      TCA_OPTIONS where each qdisc retains private logic for setting it.
      
      The recent addition of offloading to RED in
      602f3baf ("net_sch: red: Add offload ability to RED qdisc") caused
      the addition of yet another uapi field for this purpose -
      TC_RED_OFFLOADED.
      
      For clarity and prevention of bloat in the uapi we want to eliminate
      said added uapi, replacing it with a common mechanism that can be used
      to reflect offload status of the various qdiscs.
      
      The first patch introduces TCA_HW_OFFLOAD as the generic message meant
      for this purpose. The second changes the current RED implementation into
      setting the internal bits necessary for passing it, and the third removes
      TC_RED_OFFLOADED as its no longer needed.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d1fca67f
    • Y
      pkt_sched: Remove TC_RED_OFFLOADED from uapi · 4a98795b
      Yuval Mintz 提交于
      Following the previous patch, RED is now using the new uniform uapi
      for indicating it's offloaded. As a result, TC_RED_OFFLOADED is no
      longer utilized by kernel and can be removed [as it's still not
      part of any stable release].
      
      Fixes: 602f3baf ("net_sch: red: Add offload ability to RED qdisc")
      Signed-off-by: NYuval Mintz <yuvalm@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4a98795b
    • Y
      net: sched: Move to new offload indication in RED · 428a68af
      Yuval Mintz 提交于
      Let RED utilize the new internal flag, TCQ_F_OFFLOADED,
      to mark a given qdisc as offloaded instead of using a dedicated
      indication.
      
      Also, change internal logic into looking at said flag when possible.
      
      Fixes: 602f3baf ("net_sch: red: Add offload ability to RED qdisc")
      Signed-off-by: NYuval Mintz <yuvalm@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      428a68af
    • Y
      net: sched: Add TCA_HW_OFFLOAD · 7a4fa291
      Yuval Mintz 提交于
      Qdiscs can be offloaded to HW, but current implementation isn't uniform.
      Instead, qdiscs either pass information about offload status via their
      TCA_OPTIONS or omit it altogether.
      
      Introduce a new attribute - TCA_HW_OFFLOAD that would form a uniform
      uAPI for the offloading status of qdiscs.
      Signed-off-by: NYuval Mintz <yuvalm@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7a4fa291
    • D
      Merge branch 'aquantia-fixes' · 0a060697
      David S. Miller 提交于
      Igor Russkikh says:
      
      ====================
      net: aquantia: Atlantic driver 12/2017 updates
      
      The patchset contains important hardware fix for machines with large MRRS
      and couple of improvement in stats and capabilities reporting
      
      patch v3:
       - Fixed patch #7 after Andrew's finding. NIC level stats actually
         have to be cleaned only on hw struct creation (and this is done
         in kzalloc). On each hwinit we only have to reset link state
         to make sure hw stats update will not increment nic stats during init.
      
      patch v2:
       - split into more detailed commits
      
      Comment from David on wrong defines case will be submitted separately later
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a060697
    • I
      net: aquantia: Increment driver version · d4c242d4
      Igor Russkikh 提交于
      Add a suffix to distinguish kernel mainline version and aquantia releases
      Signed-off-by: NIgor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d4c242d4