1. 13 Aug, 2018 3 commits
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 921195d3
      Linus Torvalds committed
      Pull SCSI fixes from James Bottomley:
       "Eight fixes.
      
        The most important one is the mpt3sas fix which makes the driver work
        again on big endian systems. The rest are mostly minor error path or
        checker issues and the vmw_scsi one fixes a performance problem"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: vmw_pvscsi: Return DID_RESET for status SAM_STAT_COMMAND_TERMINATED
        scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management enabled
        scsi: mpt3sas: Swap I/O memory read value back to cpu endianness
        scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO
        scsi: fcoe: drop frames in ELS LOGO error path
        scsi: fcoe: fix use-after-free in fcoe_ctlr_els_send
        scsi: qedi: Fix a potential buffer overflow
        scsi: qla2xxx: Fix memory leak for allocating abort IOCB
      921195d3
    • init: rename and re-order boot_cpu_state_init() · b5b1404d
      Linus Torvalds committed
      This is purely a preparatory patch for upcoming changes during the 4.19
      merge window.
      
      We have a function called "boot_cpu_state_init()" that isn't really
      about the bootup cpu state: that is done much earlier by the similarly
      named "boot_cpu_init()" (note lack of "state" in name).
      
      This function initializes some hotplug CPU state, and needs to run after
      the percpu data has been properly initialized.  It even has a comment to
      that effect.
      
      Except it _doesn't_ actually run after the percpu data has been properly
      initialized.  On x86 it happens to do that, but on at least arm and
      arm64, the percpu base pointers are initialized by the arch-specific
      'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init().
      
      This had some unexpected results, and in particular we have a patch
      pending for the merge window that did the obvious cleanup of using
      'this_cpu_write()' in the cpu hotplug init code:
      
        -       per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
        +       this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
      
      which is obviously the right thing to do.  Except because of the
      ordering issue, it actually failed miserably and unexpectedly on arm64.
      
      So this just fixes the ordering, and changes the name of the function to
      be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu
      hotplug state, because the core CPU state was supposed to have already
      been done earlier.
      
      Marked for stable, since the (not yet merged) patch that will show this
      problem is marked for stable.
      Reported-by: Vlastimil Babka <vbabka@suse.cz>
      Reported-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com>
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b5b1404d
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d6dd6431
      Linus Torvalds committed
      Pull vfs fixes from Al Viro:
       "A bunch of race fixes, mostly around lazy pathwalk.
      
        All of it is -stable fodder, a large part going back to 2013"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        make sure that __dentry_kill() always invalidates d_seq, unhashed or not
        fix __legitimize_mnt()/mntput() race
        fix mntput/mntput race
        root dentries need RCU-delayed freeing
      d6dd6431
  2. 12 Aug, 2018 5 commits
  3. 11 Aug, 2018 5 commits
  4. 10 Aug, 2018 10 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · e91e2189
      David S. Miller committed
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-08-10
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix cpumap and devmap on teardown as they're under RCU context
         and won't have same assumption as running under NAPI protection,
         from Jesper.
      
      2) Fix various sockmap bugs in bpf_tcp_sendmsg() code, e.g. we had
         a bug where socket error was not propagated correctly, from Daniel.
      
      3) Fix incompatible libbpf header license for BTF code and match it
         before it gets officially released with the rest of libbpf which
         is LGPL-2.1, from Martin.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e91e2189
    • make sure that __dentry_kill() always invalidates d_seq, unhashed or not · 4c0d7cd5
      Al Viro committed
      RCU pathwalk relies upon the assumption that anything that changes
      ->d_inode of a dentry will invalidate its ->d_seq.  That's almost
      true - the one exception is that the final dput() of already unhashed
      dentry does *not* touch ->d_seq at all.  Unhashing does, though,
      so for anything we'd found by RCU dcache lookup we are fine.
      Unfortunately, we can *start* with an unhashed dentry or jump into
      it.
      
      We could try and be careful in the (few) places where that could
      happen.  Or we could just make the final dput() invalidate the damn
      thing, unhashed or not.  The latter is much simpler and easier to
      backport, so let's do it that way.
      Reported-by: "Dae R. Jeong" <threeearcat@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      4c0d7cd5
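      The invalidation rule described in this commit can be illustrated with a
      small userspace sketch of a sequence counter. This is not the kernel's
      seqcount or dcache code: struct toy_dentry and the toy_* helpers are
      hypothetical names, and C11 atomics stand in for the kernel primitives.
      The only point shown is that a reader's snapshot is rejected once the
      final-put path bumps the counter, whether or not the object was hashed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a dentry: a sequence counter guarding one field. */
struct toy_dentry {
    atomic_uint seq;    /* even: stable, odd: update in progress */
    atomic_int inode;   /* stand-in for ->d_inode; 0 means "none" */
};

/* Reader side: sample the counter before looking at the fields. */
static unsigned toy_read_begin(struct toy_dentry *d)
{
    return atomic_load_explicit(&d->seq, memory_order_acquire);
}

/* Reader side: true if the snapshot taken at 'start' can no longer be trusted. */
static bool toy_read_retry(struct toy_dentry *d, unsigned start)
{
    atomic_thread_fence(memory_order_acquire);
    return (start & 1u) ||
           atomic_load_explicit(&d->seq, memory_order_relaxed) != start;
}

/* Final-put path: bump the counter around the change, hashed or not. */
static void toy_kill(struct toy_dentry *d)
{
    atomic_fetch_add_explicit(&d->seq, 1u, memory_order_acq_rel); /* now odd */
    atomic_store_explicit(&d->inode, 0, memory_order_relaxed);
    atomic_fetch_add_explicit(&d->seq, 1u, memory_order_release); /* even again */
}
```

      A reader that called toy_read_begin() before toy_kill() ran gets true
      from toy_read_retry() afterwards and has to start over.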
    • fix __legitimize_mnt()/mntput() race · 119e1ef8
      Al Viro committed
      __legitimize_mnt() has two problems - one is that in case of success
      the check of mount_lock is not ordered wrt preceding increment of
      refcount, making it possible to have successful __legitimize_mnt()
      on one CPU just before the otherwise final mntput() on another,
      with __legitimize_mnt() not seeing mntput() taking the lock and
      mntput() not seeing the increment done by __legitimize_mnt().
      Solved by a pair of barriers.
      
      Another is that failure of __legitimize_mnt() on the second
      read_seqretry() leaves us with reference that'll need to be
      dropped by caller; however, if that races with final mntput()
      we can end up with caller dropping rcu_read_lock() and doing
      mntput() to release that reference - with the first mntput()
      having freed the damn thing just as rcu_read_lock() had been
      dropped.  Solution: in "do mntput() yourself" failure case
      grab mount_lock, check if MNT_DOOMED has been set by racing
      final mntput() that has missed our increment and if it has -
      undo the increment and treat that as "failure, caller doesn't
      need to drop anything" case.
      
      It's not easy to hit - the final mntput() has to come right
      after the first read_seqretry() in __legitimize_mnt() *and*
      manage to miss the increment done by __legitimize_mnt() before
      the second read_seqretry() in there.  The things that are almost
      impossible to hit on bare hardware are not impossible on SMP
      KVM, though...
      Reported-by: Oleg Nesterov <oleg@redhat.com>
      Fixes: 48a066e7 ("RCU'd vfsmounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      119e1ef8
    • fix mntput/mntput race · 9ea0a46c
      Al Viro committed
      mntput_no_expire() does the calculation of total refcount under mount_lock;
      unfortunately, the decrement (as well as all increments) are done outside
      of it, leading to false positives in the "are we dropping the last reference"
      test.  Consider the following situation:
      	* mnt is a lazy-umounted mount, kept alive by two opened files.  One
      of those files gets closed.  Total refcount of mnt is 2.  On CPU 42
      mntput(mnt) (called from __fput()) drops one reference, decrementing component
      	* After it has looked at component #0, the process on CPU 0 does
      mntget(), incrementing component #0, gets preempted and gets to run again -
      on CPU 69.  There it does mntput(), which drops the reference (component #69)
      and proceeds to spin on mount_lock.
      	* On CPU 42 our first mntput() finishes counting.  It observes the
      decrement of component #69, but not the increment of component #0.  As the
      result, the total it gets is not 1 as it should've been - it's 0.  At which
      point we decide that vfsmount needs to be killed and proceed to free it and
      shut the filesystem down.  However, there's still another opened file
      on that filesystem, with reference to (now freed) vfsmount, etc. and we are
      screwed.
      
      It's not a wide race, but it can be reproduced with artificial slowdown of
      the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups.
      
      Fix consists of moving the refcount decrement under mount_lock; the tricky
      part is that we want (and can) keep the fast case (i.e. mount that still
      has non-NULL ->mnt_ns) entirely out of mount_lock.  All places that zero
      mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu()
      before that mntput().  IOW, if mntput() observes (under rcu_read_lock())
      a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to
      be dropped.
      Reported-by: Jann Horn <jannh@google.com>
      Tested-by: Jann Horn <jannh@google.com>
      Fixes: 48a066e7 ("RCU'd vfsmounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      9ea0a46c
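      The interleaving described in this commit can be replayed deterministically
      in a few lines. This is only a sketch: struct toy_mnt, toy_sum() and
      toy_scanned_total() are hypothetical names, the per-CPU refcount is modelled
      as a plain array, and it is an assumption that both initial references were
      taken on CPU 0. The flag models the fix by making the second mntput()'s
      decrement wait until the scan has finished.

```c
#include <stdbool.h>

#define TOY_NCPU 70

/* Per-CPU reference count: the real total is the sum of all components. */
struct toy_mnt {
    int count[TOY_NCPU];
};

/* Sum the components in [from, to). */
static int toy_sum(const struct toy_mnt *m, int from, int to)
{
    int total = 0;
    for (int cpu = from; cpu < to; cpu++)
        total += m->count[cpu];
    return total;
}

/* Total observed by the scan running on CPU 42 for one fixed interleaving. */
static int toy_scanned_total(bool second_put_waits_for_lock)
{
    struct toy_mnt m = { {0} };
    int seen;

    m.count[0] = 2;     /* assumption: both references were taken on CPU 0 */
    m.count[42] -= 1;   /* first mntput(), on CPU 42 */

    seen = toy_sum(&m, 0, 1);          /* scan has looked at component #0 */
    m.count[0] += 1;                   /* mntget() on CPU 0, missed by the scan */
    if (!second_put_waits_for_lock)
        m.count[69] -= 1;              /* second mntput(), now on CPU 69 */
    seen += toy_sum(&m, 1, TOY_NCPU);  /* scan finishes components #1..#69 */
    return seen;
}
```

      With the decrement outside the lock the scan adds up to 0 and the mount
      would be freed; with the decrement held back it adds up to 1, so the
      "last reference" test does not fire.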
    • Merge branch 'bpf-fix-cpu-and-devmap-teardown' · 9c954201
      Daniel Borkmann committed
      Jesper Dangaard Brouer says:
      
      ====================
      Removing entries from cpumap and devmap goes through a number of
      synchronization steps to make sure no new xdp_frames can be enqueued.
      But there is a small chance that some xdp_frames remain which have not
      been flushed/processed yet.  Flushing these during teardown happens
      from RCU context and not, as usual, under RX NAPI context.

      The optimization introduced in commit 389ab7f0 ("xdp: introduce
      xdp_return_frame_rx_napi") missed that the flush operation can also
      be called from RCU context.  Thus, we cannot always use the
      xdp_return_frame_rx_napi call, which takes advantage of the protection
      provided by XDP RX running under NAPI.

      The samples/bpf program xdp_redirect_cpu has a --stress-mode, which is
      adjusted to make the race easier to reproduce (verified by Red Hat QA).
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      9c954201
    • xdp: fix bug in devmap teardown code path · 1bf9116d
      Jesper Dangaard Brouer committed
      Like cpumap teardown, the devmap teardown code also flushes remaining
      xdp_frames, via bq_xmit_all(), in case a map entry is removed.  The code
      can call xdp_return_frame_rx_napi() from the wrong context in case
      ndo_xdp_xmit() fails.
      
      Fixes: 389ab7f0 ("xdp: introduce xdp_return_frame_rx_napi")
      Fixes: 735fc405 ("xdp: change ndo_xdp_xmit API to support bulking")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      1bf9116d
    • samples/bpf: xdp_redirect_cpu adjustment to reproduce teardown race easier · 37d7ff25
      Jesper Dangaard Brouer committed
      The teardown race in cpumap is really hard to reproduce.  These changes
      make it easier to reproduce, for QA.

      The --stress-mode now has a case with a very small queue size of 8, which
      helps the teardown flush encounter a full queue, which results in calling
      the xdp_return_frame API in a non-NAPI-protected context.

      Also increase MAX_CPUS, as my QA department has larger machines than me.
      Tested-by: Jean-Tsung Hsiao <jhsiao@redhat.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      37d7ff25
    • xdp: fix bug in cpumap teardown code path · ad0ab027
      Jesper Dangaard Brouer committed
      When removing a cpumap entry, a number of synchronization steps happen.
      Eventually the teardown code __cpu_map_entry_free() is invoked from/via
      call_rcu.

      The teardown code __cpu_map_entry_free() flushes remaining xdp_frames
      by invoking bq_flush_to_queue(), which calls xdp_return_frame_rx_napi().
      The issue is that the teardown code is not running in the RX NAPI
      code path.  Thus, it is not allowed to invoke the NAPI variant of
      xdp_return_frame.

      This bug was found and triggered by using the --stress-mode option to
      the samples/bpf program xdp_redirect_cpu.  It is hard to trigger,
      because the ptr_ring has to be full, the cpumap bulk queue contains
      at most 8 packets, and a remote CPU is racing to empty the ptr_ring
      queue.
      
      Fixes: 389ab7f0 ("xdp: introduce xdp_return_frame_rx_napi")
      Tested-by: Jean-Tsung Hsiao <jhsiao@redhat.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      ad0ab027
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 112cbae2
      Linus Torvalds committed
      Pull crypto fix from Herbert Xu:
       "This fixes a performance regression in arm64 NEON crypto as well as a
        crash in x86 aegis/morus on unsupported CPUs"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: x86/aegis,morus - Fix and simplify CPUID checks
        crypto: arm64 - revert NEON yield for fast AEAD implementations
      112cbae2
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 6395ad85
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) The real fix for the ipv6 route metric leak Sabrina was seeing, from
          Cong Wang.
      
       2) Fix syzbot triggers AF_PACKET v3 ring buffer insufficient room
          conditions, from Willem de Bruijn.
      
       3) vsock can reinitialize active work struct, fix from Cong Wang.
      
       4) RXRPC keepalive generator can wedge a cpu, fix from David Howells.
      
       5) Fix locking in AF_SMC ioctl, from Ursula Braun.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        dsa: slave: eee: Allow ports to use phylink
        net/smc: move sock lock in smc_ioctl()
        net/smc: allow sysctl rmem and wmem defaults for servers
        net/smc: no shutdown in state SMC_LISTEN
        net: aquantia: Fix IFF_ALLMULTI flag functionality
        rxrpc: Fix the keepalive generator [ver #2]
        net/mlx5e: Cleanup of dcbnl related fields
        net/mlx5e: Properly check if hairpin is possible between two functions
        vhost: reset metadata cache when initializing new IOTLB
        llc: use refcount_inc_not_zero() for llc_sap_find()
        dccp: fix undefined behavior with 'cwnd' shift in ccid2_cwnd_restart()
        tipc: fix an interrupt unsafe locking scenario
        vsock: split dwork to avoid reinitializations
        net: thunderx: check for failed allocation lmac->dmacs
        cxgb4: mk_act_open_req() buggers ->{local, peer}_ip on big-endian hosts
        packet: refine ring v3 block size test to hold one frame
        ip6_tunnel: use the right value for ipv4 min mtu check in ip6_tnl_xmit
        ipv6: fix double refcount of fib6_metrics
      6395ad85
  5. 09 Aug, 2018 17 commits
    • i2c: xlp9xx: Fix case where SSIF read transaction completes early · 5eb173f5
      George Cherian committed
      During IPMI stress tests we see occasional transaction failures at
      boot time. This happens for I2C_M_RECV_LEN transactions when the
      read transfer completes (with the initial read length of 34) before
      the driver gets a chance to handle interrupts.

      The current driver code expects at least 2 interrupts for I2C_M_RECV_LEN
      transactions. The length is updated during the first interrupt, and the
      buffer contents are only copied during subsequent interrupts. In case of
      just one interrupt, we will complete the transaction without copying
      out the bytes from RX fifo.
      
      Update the code to drain the RX fifo after the length update,
      so that the transaction completes correctly in all cases.
      Signed-off-by: George Cherian <george.cherian@cavium.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Cc: stable@kernel.org
      5eb173f5
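      The single-interrupt case described in this commit can be modelled with a
      toy FIFO. This is a sketch under stated assumptions, not the xlp9xx driver:
      struct toy_fifo, struct toy_xfer, toy_drain() and toy_irq() are hypothetical
      names, and the whole message is assumed to sit in the FIFO already when the
      first interrupt is handled. The drain_after_len flag switches between
      returning right after the length update and draining the FIFO in the same
      pass.

```c
#include <stddef.h>

#define TOY_BUF_MAX 64

/* Bytes the hardware has already received. */
struct toy_fifo {
    const unsigned char *data;
    size_t avail;
    size_t pos;
};

/* Driver-side state of one I2C_M_RECV_LEN-style read. */
struct toy_xfer {
    unsigned char buf[TOY_BUF_MAX];
    size_t len;       /* total bytes expected, known after the first byte */
    size_t got;       /* bytes copied out so far */
    int len_known;
};

/* Copy out whatever is available, up to the expected length. */
static void toy_drain(struct toy_fifo *f, struct toy_xfer *x)
{
    while (f->pos < f->avail && x->got < x->len)
        x->buf[x->got++] = f->data[f->pos++];
}

/* One interrupt: learn the length from the first byte, then maybe drain. */
static void toy_irq(struct toy_fifo *f, struct toy_xfer *x, int drain_after_len)
{
    if (!x->len_known) {
        if (f->pos == f->avail)
            return;
        unsigned char n = f->data[f->pos++];
        x->buf[x->got++] = n;              /* the length byte is part of the data */
        x->len = (size_t)n + 1;
        if (x->len > TOY_BUF_MAX)
            x->len = TOY_BUF_MAX;          /* keep the toy buffer in bounds */
        x->len_known = 1;
        if (!drain_after_len)
            return;                        /* wait for a later interrupt */
    }
    toy_drain(f, x);
}
```

      If no further interrupt arrives, the variant that returns early leaves the
      payload in the FIFO, while the draining variant completes with every byte
      copied.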
    • dsa: slave: eee: Allow ports to use phylink · 1be52e97
      Andrew Lunn committed
      For a port to be able to use EEE, both the MAC and the PHY must
      support EEE. A phy can be provided by both a phydev or phylink. Verify
      at least one of these exist, not just phydev.
      
      Fixes: aab9c406 ("net: dsa: Plug in PHYLINK support")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1be52e97
    • Merge branch 'smc-fixes' · ef91b6f9
      David S. Miller committed
      Ursula Braun says:
      
      ====================
      net/smc: fixes 2018-08-08
      
      here are small fixes for SMC: The first patch makes sure, shutdown code
      is not executed for sockets in state SMC_LISTEN. The second patch resets
      send and receive buffer values for accepted sockets, since TCP buffer size
      optimizations for the internal CLC socket should not be forwarded to the
      outer SMC socket. The third patch solves a race between connect and ioctl
      reported by syzbot.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ef91b6f9
    • net/smc: move sock lock in smc_ioctl() · 7311d665
      Ursula Braun committed
      When an SMC socket is connecting it is decided whether fallback to
      TCP is needed. To avoid races between connect and ioctl move the
      sock lock before the use_fallback check.
      
      Reported-by: syzbot+5b2cece1a8ecb2ca77d8@syzkaller.appspotmail.com
      Reported-by: syzbot+19557374321ca3710990@syzkaller.appspotmail.com
      Fixes: 1992d998 ("net/smc: take sock lock in smc_ioctl()")
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7311d665
    • net/smc: allow sysctl rmem and wmem defaults for servers · bd58c7e0
      Ursula Braun committed
      Without setsockopt SO_SNDBUF and SO_RCVBUF settings, the sysctl
      defaults net.ipv4.tcp_wmem and net.ipv4.tcp_rmem should be the base
      for the sizes of the SMC sndbuf and rcvbuf. Any TCP buffer size
      optimizations for servers should be ignored.
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bd58c7e0
    • net/smc: no shutdown in state SMC_LISTEN · caa21e19
      Ursula Braun committed
      Invoking shutdown for a socket in state SMC_LISTEN does not make
      sense. Nevertheless programs like syzbot fuzzing the kernel may
      try to do this. For SMC this means a socket refcounting problem.
      This patch makes sure a shutdown call for an SMC socket in state
      SMC_LISTEN simply returns with -ENOTCONN.
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      caa21e19
    • net: aquantia: Fix IFF_ALLMULTI flag functionality · 11ba961c
      Dmitry Bogdanov committed
      It was noticed that the NIC always passes all multicast traffic to the host
      regardless of the IFF_ALLMULTI flag on the interface.
      The rule in the NIC's MC Filter Table that is configured to accept any
      multicast packets is turned on if the IFF_MULTICAST flag is set on the
      interface. This leads to passing all multicast traffic to the host.
      This fix changes the condition for turning on that rule to check the
      IFF_ALLMULTI flag, as it should.
      
      Fixes: b21f502f ("net:ethernet:aquantia: Fix for multicast filter handling.")
      Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      11ba961c
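      The change of condition described in this commit comes down to which flag
      bit enables the accept-all rule. The sketch below is not the aquantia
      driver code: the toy_* function names are hypothetical, and the two flag
      values are the Linux IFF_ALLMULTI and IFF_MULTICAST bits redefined under
      toy names so that the example does not depend on system headers.

```c
#include <stdbool.h>

/* Linux interface flag bits, redefined under toy names for this sketch. */
#define TOY_IFF_ALLMULTI  0x200u   /* receive all multicast packets */
#define TOY_IFF_MULTICAST 0x1000u  /* interface supports multicast */

/* Condition before the fix: any multicast-capable interface qualifies. */
static bool toy_accept_all_mc_old(unsigned flags)
{
    return (flags & TOY_IFF_MULTICAST) != 0;
}

/* Condition after the fix: only an explicit "all multicast" request qualifies. */
static bool toy_accept_all_mc_fixed(unsigned flags)
{
    return (flags & TOY_IFF_ALLMULTI) != 0;
}
```

      An interface with only the multicast-capable bit set enables the rule
      under the old condition but not under the fixed one.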
    • rxrpc: Fix the keepalive generator [ver #2] · 330bdcfa
      David Howells committed
      AF_RXRPC has a keepalive message generator that generates a message for a
      peer ~20s after the last transmission to that peer to keep firewall ports
      open.  The implementation is incorrect in the following ways:
      
       (1) It mixes up ktime_t and time64_t types.
      
       (2) It uses ktime_get_real(), the output of which may jump forward or
           backward due to adjustments to the time of day.
      
       (3) If the current time jumps forward too much or jumps backwards, the
           generator function will crank the base of the time ring round one slot
           at a time (ie. a 1s period) until it catches up, spewing out VERSION
           packets as it goes.
      
      Fix the problem by:
      
       (1) Only using time64_t.  There's no need for sub-second resolution.
      
       (2) Use ktime_get_seconds() rather than ktime_get_real() so that time
           isn't perceived to go backwards.
      
       (3) Simplifying rxrpc_peer_keepalive_worker() by splitting it into two
           parts:
      
           (a) The "worker" function that manages the buckets and the timer.
      
           (b) The "dispatch" function that takes the pending peers and
           	 potentially transmits a keepalive packet before putting them back
           	 in the ring into the slot appropriate to the revised last-Tx time.
      
       (4) Taking everything that's pending out of the ring and splicing it into
           a temporary collector list for processing.
      
           In the case that there's been a significant jump forward, the ring
           gets entirely emptied and then the time base can be warped forward
           before the peers are processed.
      
           The warping can't happen if the ring isn't empty because the slot a
           peer is in is keepalive-time dependent, relative to the base time.
      
       (5) Limit the number of iterations of the bucket array when scanning it.
      
       (6) Set the timer to skip any empty slots as there's no point waking up if
           there's nothing to do yet.
      
      This can be triggered by an incoming call from a server after a reboot with
      AF_RXRPC and AFS built into the kernel causing a peer record to be set up
      before userspace is started.  The system clock is then adjusted by
      userspace, thereby potentially causing the keepalive generator to have a
      meltdown - which leads to a message like:
      
      	watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:23]
      	...
      	Workqueue: krxrpcd rxrpc_peer_keepalive_worker
      	EIP: lock_acquire+0x69/0x80
      	...
      	Call Trace:
      	 ? rxrpc_peer_keepalive_worker+0x5e/0x350
      	 ? _raw_spin_lock_bh+0x29/0x60
      	 ? rxrpc_peer_keepalive_worker+0x5e/0x350
      	 ? rxrpc_peer_keepalive_worker+0x5e/0x350
      	 ? __lock_acquire+0x3d3/0x870
      	 ? process_one_work+0x110/0x340
      	 ? process_one_work+0x166/0x340
      	 ? process_one_work+0x110/0x340
      	 ? worker_thread+0x39/0x3c0
      	 ? kthread+0xdb/0x110
      	 ? cancel_delayed_work+0x90/0x90
      	 ? kthread_stop+0x70/0x70
      	 ? ret_from_fork+0x19/0x24
      
      Fixes: ace45bec ("rxrpc: Fix firewall route keepalive")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      330bdcfa
    • Merge branch 'mlx5-fixes' · f39cc1c7
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5e fixes 2018-08-07
      
      I know it is late into 4.18 release, and this is why I am submitting
      only two mlx5e ethernet fixes.
      
      The first one from Or, is needed for -stable and it fixes hairpin
      for "same device" check.
      
      The second fix is a non risk fix from Huy which cleans up and improves
      error return value reporting for dcbnl_ieee_setapp.
      
      For -stable v4.16
      - net/mlx5e: Properly check if hairpin is possible between two functions
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f39cc1c7
    • net/mlx5e: Cleanup of dcbnl related fields · f280c6a1
      Huy Nguyen committed
      Remove unused netdev_registered_init/remove in en.h
      Return ENOSUPPORT if the check MLX5_DSCP_SUPPORTED fails.
      Remove extra white space
      
      Fixes: 2a5e7a13 ("net/mlx5e: Add dcbnl dscp to priority support")
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Cc: Yuval Shaia <yuval.shaia@oracle.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f280c6a1
    • net/mlx5e: Properly check if hairpin is possible between two functions · 816f6706
      Or Gerlitz committed
      The current check relies on function BDF addresses and can get
      us wrong e.g when two VFs are assigned into a VM and the PCI
      v-address is set by the hypervisor.
      
      Fixes: 5c65c564 ('net/mlx5e: Support offloading TC NIC hairpin flows')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reported-by: Alaa Hleihel <alaa@mellanox.com>
      Tested-by: Alaa Hleihel <alaa@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      816f6706
    • parisc: Define mb() and add memory barriers to assembler unlock sequences · fedb8da9
      John David Anglin committed
      For years I thought all parisc machines executed loads and stores in
      order. However, Jeff Law recently indicated on gcc-patches that this is
      not correct. There are various degrees of out-of-order execution all the
      way back to the PA7xxx processor series (hit-under-miss). The PA8xxx
      series has full out-of-order execution for both integer operations, and
      loads and stores.
      
      This is described in the following article:
      http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml
      
      For this reason, we need to define mb() and to insert a memory barrier
      before the store unlocking spinlocks. This ensures that all memory
      accesses are complete prior to unlocking. The ldcw instruction performs
      the same function on entry.
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org # 4.0+
      Signed-off-by: Helge Deller <deller@gmx.de>
      fedb8da9
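      The idea of placing a full barrier before the unlocking store can be
      sketched in portable C11. This is not the parisc assembler sequence:
      toy_trylock() and toy_unlock() are hypothetical names, an atomic exchange
      stands in for the ldcw instruction, and the convention that a nonzero lock
      word means "free" is an assumption of this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Acquire attempt: swap in 0; a nonzero old value means the lock was free. */
static bool toy_trylock(atomic_int *lock)
{
    return atomic_exchange_explicit(lock, 0, memory_order_acquire) != 0;
}

/* Release: a full fence first, so that earlier loads and stores are
 * complete before the store that marks the lock as free. */
static void toy_unlock(atomic_int *lock)
{
    atomic_thread_fence(memory_order_seq_cst);   /* plays the role of mb() */
    atomic_store_explicit(lock, 1, memory_order_relaxed);
}
```

      Without the fence, a relaxed store on its own would let accesses from
      inside the critical section become visible after the lock already looks
      free to other CPUs.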
    • parisc: Enable CONFIG_MLONGCALLS by default · 66509a27
      Helge Deller committed
      Enable the -mlong-calls compiler option by default, because otherwise in most
      cases linking the vmlinux binary fails due to truncations of R_PARISC_PCREL22F
      relocations. This fixes building the 64-bit defconfig.
      
      Cc: stable@vger.kernel.org # 4.0+
      Signed-off-by: Helge Deller <deller@gmx.de>
      66509a27
    • Merge branch 'sockmap-fixes' · bf9bae0e
      Alexei Starovoitov committed
      Daniel Borkmann says:
      
      ====================
      Two sockmap fixes in bpf_tcp_sendmsg(), and one fix for the
      sockmap kernel selftest. Thanks!
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      bf9bae0e
    • bpf, sockmap: fix cork timeout for select due to epipe · 3c6ed988
      Daniel Borkmann committed
      I ran into the same issue as a009f1f3 ("selftests/bpf:
      test_sockmap, timing improvements") where I had a broken
      pipe error on the socket due to remote end timing out on
      select and then shutting down its sockets while the other
      side was still sending. We may need to do a bigger rework
      in general on the test_sockmap.c, but for now increase it
      to a more suitable timeout.
      
      Fixes: a18fda1a ("bpf: reduce runtime of test_sockmap tests")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      3c6ed988
    • bpf, sockmap: fix leak in bpf_tcp_sendmsg wait for mem path · 7c81c717
      Daniel Borkmann committed
      In bpf_tcp_sendmsg() the sk_alloc_sg() may fail. In the case of
      ENOMEM, it may also mean that we've partially filled the scatterlist
      entries with pages. Later jumping to sk_stream_wait_memory()
      we could further fail with an error for several reasons, however
      we fail to call free_start_sg() if the local sk_msg_buff was used.
      
      Fixes: 4f738adb ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      7c81c717
    • bpf, sockmap: fix bpf_tcp_sendmsg sock error handling · 5121700b
      Daniel Borkmann committed
      While working on bpf_tcp_sendmsg() code, I noticed that when a
      sk->sk_err is set we error out with err = sk->sk_err. However
      this is problematic since sk->sk_err is a positive error value
      and therefore we will neither go into sk_stream_error() nor will
      we report an error back to user space. I had this case with EPIPE
      and user space was thinking sendmsg() succeeded since EPIPE is
      a positive value, thinking we submitted 32 bytes. Fix it by
      negating the sk->sk_err value.
      
      Fixes: 4f738adb ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      5121700b
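      The sign problem described in this commit is easy to see in isolation. The
      sketch below is not the bpf_tcp_sendmsg() code: the toy_* functions are
      hypothetical, and the caller is assumed to treat any non-negative return
      value as a byte count, as user space does for sendmsg(). EPIPE has the
      value 32 on Linux, which is where the "32 bytes" in the message comes from.

```c
#include <errno.h>
#include <stdbool.h>

/* Before the fix: the pending socket error is passed on with its
 * positive sign, so it is indistinguishable from a byte count. */
static long toy_return_buggy(int sk_err)
{
    return sk_err;
}

/* After the fix: the error is negated, following the usual kernel
 * convention of returning -errno on failure. */
static long toy_return_fixed(int sk_err)
{
    return -sk_err;
}

/* How a caller interprets the result: non-negative means "bytes sent". */
static bool toy_caller_sees_success(long ret)
{
    return ret >= 0;
}
```

      With EPIPE pending, the first variant looks like a successful short write
      to the caller, while the second is recognised as a failure.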