1. 01 3月, 2018 17 次提交
    • D
      Merge branch 'net-smc-fixes' · 7358799c
      David S. Miller 提交于
      Ursula Braun says:
      
      ====================
      net/smc: fixes 2018-02-28
      
      here are 3 smc bug fixes for the net-tree. Karsten's first patch is
      the reworked version of last week's
         "[PATCH net-next 2/5] net/smc: fix structure size"
      patch, now solved without using __packed, and now targetted for net
      instead of net-next.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7358799c
    • D
      net/smc: fix NULL pointer dereference on sock_create_kern() error path · a5dcb73b
      Davide Caratti 提交于
      when sock_create_kern(..., a) returns an error, 'a' might not be a valid
      pointer, so it shouldn't be dereferenced to read a->sk->sk_sndbuf and
      and a->sk->sk_rcvbuf; not doing that caused the following crash:
      
      general protection fault: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
          (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 4254 Comm: syzkaller919713 Not tainted 4.16.0-rc1+ #18
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      RIP: 0010:smc_create+0x14e/0x300 net/smc/af_smc.c:1410
      RSP: 0018:ffff8801b06afbc8 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff8801b63457c0 RCX: ffffffff85a3e746
      RDX: 0000000000000004 RSI: 00000000ffffffff RDI: 0000000000000020
      RBP: ffff8801b06afbf0 R08: 00000000000007c0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: ffff8801b6345c08 R14: 00000000ffffffe9 R15: ffffffff8695ced0
      FS:  0000000001afb880(0000) GS:ffff8801db200000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000040 CR3: 00000001b0721004 CR4: 00000000001606f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        __sock_create+0x4d4/0x850 net/socket.c:1285
        sock_create net/socket.c:1325 [inline]
        SYSC_socketpair net/socket.c:1409 [inline]
        SyS_socketpair+0x1c0/0x6f0 net/socket.c:1366
        do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x26/0x9b
      RIP: 0033:0x4404b9
      RSP: 002b:00007fff44ab6908 EFLAGS: 00000246 ORIG_RAX: 0000000000000035
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004404b9
      RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000000002b
      RBP: 00007fff44ab6910 R08: 0000000000000002 R09: 00007fff44003031
      R10: 0000000020000040 R11: 0000000000000246 R12: ffffffffffffffff
      R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000
      Code: 48 c1 ea 03 80 3c 02 00 0f 85 b3 01 00 00 4c 8b a3 48 04 00 00 48
      b8
      00 00 00 00 00 fc ff df 49 8d 7c 24 20 48 89 fa 48 c1 ea 03 <80> 3c 02
      00
      0f 85 82 01 00 00 4d 8b 7c 24 20 48 b8 00 00 00 00
      RIP: smc_create+0x14e/0x300 net/smc/af_smc.c:1410 RSP: ffff8801b06afbc8
      
      Fixes: cd6851f3 smc: remote memory buffers (RMBs)
      Reported-and-tested-by: syzbot+aa0227369be2dcc26ebe@syzkaller.appspotmail.com
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a5dcb73b
    • K
      net/smc: use link_id of server in confirm link reply · 2be922f3
      Karsten Graul 提交于
      The CONFIRM LINK reply message must contain the link_id sent
      by the server. And set the link_id explicitly when
      initializing the link.
      Signed-off-by: NKarsten Graul <kgraul@linux.vnet.ibm.com>
      Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2be922f3
    • K
      net/smc: use a constant for control message length · cbba07a7
      Karsten Graul 提交于
      The sizeof(struct smc_cdc_msg) evaluates to 48 bytes instead of the
      required 44 bytes. We need to use the constant value of
      SMC_WR_TX_SIZE to set and check the control message length.
      Signed-off-by: NKarsten Graul <kgraul@linux.vnet.ibm.com>
      Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cbba07a7
    • J
      virtio-net: disable NAPI only when enabled during XDP set · 4e09ff53
      Jason Wang 提交于
      We try to disable NAPI to prevent a single XDP TX queue being used by
      multiple cpus. But we don't check if device is up (NAPI is enabled),
      this could result stall because of infinite wait in
      napi_disable(). Fixing this by checking device state through
      netif_running() before.
      
      Fixes: 4941d472 ("virtio-net: do not reset during XDP set")
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4e09ff53
    • J
      net/tcp/illinois: replace broken algorithm reference link · ecc83275
      Joey Pabalinas 提交于
      The link to the pdf containing the algorithm description is now a
      dead link; it seems http://www.ifp.illinois.edu/~srikant/ has been
      moved to https://sites.google.com/a/illinois.edu/srikant/ and none of
      the original papers can be found there...
      
      I have replaced it with the only working copy I was able to find.
      
      n.b. there is also a copy available at:
      
      http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.6350&rep=rep1&type=pdf
      
      However, this seems to only be a *cached* version, so I am unsure
      exactly how reliable that link can be expected to remain over time
      and have decided against using that one.
      Signed-off-by: NJoey Pabalinas <joeypabalinas@gmail.com>
      
       1 file changed, 1 insertion(+), 1 deletion(-)
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ecc83275
    • S
      tcp: purge write queue upon RST · a27fd7a8
      Soheil Hassas Yeganeh 提交于
      When the connection is reset, there is no point in
      keeping the packets on the write queue until the connection
      is closed.
      
      RFC 793 (page 70) and RFC 793-bis (page 64) both suggest
      purging the write queue upon RST:
      https://tools.ietf.org/html/draft-ietf-tcpm-rfc793bis-07
      
      Moreover, this is essential for a correct MSG_ZEROCOPY
      implementation, because userspace cannot call close(fd)
      before receiving zerocopy signals even when the connection
      is reset.
      
      Fixes: f214f915 ("tcp: enable MSG_ZEROCOPY")
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a27fd7a8
    • D
      Merge branch 'tcp-revert-a-F-RTO-extension-due-to-broken-middle-boxes' · 55e84dd7
      David S. Miller 提交于
      Yuchung Cheng says:
      
      ====================
      tcp: revert a F-RTO extension due to broken middle-boxes
      
      This patch series reverts a (non-standard) TCP F-RTO extension that aimed
      to detect more spurious timeouts. Unfortunately it could result in poor
      performance due to broken middle-boxes that modify TCP packets. E.g.
      https://www.spinics.net/lists/netdev/msg484154.html
      We believe the best and simplest solution is to just revert the change.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      55e84dd7
    • Y
      tcp: revert F-RTO extension to detect more spurious timeouts · fc68e171
      Yuchung Cheng 提交于
      This reverts commit 89fe18e4.
      
      While the patch could detect more spurious timeouts, it could cause
      poor TCP performance on broken middle-boxes that modifies TCP packets
      (e.g. receive window, SACK options). Since the performance gain is
      much smaller compared to the potential loss. The best solution is
      to fully revert the change.
      
      Fixes: 89fe18e4 ("tcp: extend F-RTO to catch more spurious timeouts")
      Reported-by: NTeodor Milkov <tm@del.bg>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc68e171
    • Y
      tcp: revert F-RTO middle-box workaround · d4131f09
      Yuchung Cheng 提交于
      This reverts commit cc663f4d. While fixing
      some broken middle-boxes that modifies receive window fields, it does not
      address middle-boxes that strip off SACK options. The best solution is
      to fully revert this patch and the root F-RTO enhancement.
      
      Fixes: cc663f4d ("tcp: restrict F-RTO to work-around broken middle-boxes")
      Reported-by: NTeodor Milkov <tm@del.bg>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d4131f09
    • D
      Merge branch 's390-qeth-fixes' · c8431622
      David S. Miller 提交于
      Julian Wiedmann says:
      
      ====================
      s390/qeth: fixes 2018-02-27
      
      please apply some more qeth patches for -net and stable.
      
      One patch fixes a performance bug in the TSO path. Then there's several
      more fixes for IP management on L3 devices - including a revert, so that
      the subsequent fix cleanly applies to earlier kernels.
      The final patch takes care of a race in the control IO code that causes
      qeth to miss the cmd response, and subsequently trigger device recovery.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c8431622
    • J
      s390/qeth: fix IPA command submission race · d22ffb5a
      Julian Wiedmann 提交于
      If multiple IPA commands are build & sent out concurrently,
      fill_ipacmd_header() may assign a seqno value to a command that's
      different from what send_control_data() later assigns to this command's
      reply.
      This is due to other commands passing through send_control_data(),
      and incrementing card->seqno.ipa along the way.
      
      So one IPA command has no reply that's waiting for its seqno, while some
      other IPA command has multiple reply objects waiting for it.
      Only one of those waiting replies wins, and the other(s) times out and
      triggers a recovery via send_ipa_cmd().
      
      Fix this by making sure that the same seqno value is assigned to
      a command and its reply object.
      Do so immediately before submitting the command & while holding the
      irq_pending "lock", to produce nicely ascending seqnos.
      
      As a side effect, *all* IPA commands now use a reply object that's
      waiting for its actual seqno. Previously, early IPA commands that were
      submitted while the card was still DOWN used the "catch-all" IDX seqno.
      Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d22ffb5a
    • J
      s390/qeth: fix IP address lookup for L3 devices · c5c48c58
      Julian Wiedmann 提交于
      Current code ("qeth_l3_ip_from_hash()") matches a queried address object
      against objects in the IP table by IP address, Mask/Prefix Length and
      MAC address ("qeth_l3_ipaddrs_is_equal()"). But what callers actually
      require is either
      a) "is this IP address registered" (ie. match by IP address only),
      before adding a new address.
      b) or "is this address object registered" (ie. match all relevant
         attributes), before deleting an address.
      
      Right now
      1. the ADD path is too strict in its lookup, and eg. doesn't detect
      conflicts between an existing NORMAL address and a new VIPA address
      (because the NORMAL address will have mask != 0, while VIPA has
      a mask == 0),
      2. the DELETE path is not strict enough, and eg. allows del_rxip() to
      delete a VIPA address as long as the IP address matches.
      
      Fix all this by adding helpers (_addr_match_ip() and _addr_match_all())
      that do the appropriate checking.
      
      Note that the ADD path for NORMAL addresses is special, as qeth keeps
      track of how many times such an address is in use (and there is no
      immediate way of returning errors to the caller). So when a requested
      NORMAL address _fully_ matches an existing one, it's not considered a
      conflict and we merely increment the refcount.
      
      Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5c48c58
    • J
      Revert "s390/qeth: fix using of ref counter for rxip addresses" · 4964c66f
      Julian Wiedmann 提交于
      This reverts commit cb816192.
      
      The issue this attempted to fix never actually occurs.
      l3_add_rxip() checks (via l3_ip_from_hash()) if the requested address
      was previously added to the card. If so, it returns -EEXIST and doesn't
      call l3_add_ip().
      As a result, the "address exists" path in l3_add_ip() is never taken
      for rxip addresses, and this patch had no effect.
      
      Fixes: cb816192 ("s390/qeth: fix using of ref counter for rxip addresses")
      Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4964c66f
    • J
      s390/qeth: fix double-free on IP add/remove race · 14d066c3
      Julian Wiedmann 提交于
      Registering an IPv4 address with the HW takes quite a while, so we
      temporarily drop the ip_htable lock. Any concurrent add/remove of the
      same IP adjusts the IP's use count, and (on remove) is then blocked by
      addr->in_progress.
      After the register call has completed, we check the use count for
      concurrently attempted add/remove calls - and possibly straight-away
      deregister the IP again. This happens via l3_delete_ip(), which
      1) looks up the queried IP in the htable (getting a reference to the
         *same* queried object),
      2) deregisters the IP from the HW, and
      3) frees the IP object.
      
      The caller in l3_add_ip() then does a second free on the same object.
      
      For this case, skip all the extra checks and lookups in l3_delete_ip()
      and just deregister & free the IP object ourselves.
      
      Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14d066c3
    • J
      s390/qeth: fix IP removal on offline cards · 98d823ab
      Julian Wiedmann 提交于
      If the HW is not reachable, then none of the IPs in qeth's internal
      table has been registered with the HW yet. So when deleting such an IP,
      there's no need to stage it for deregistration - just drop it from
      the table.
      
      This fixes the "add-delete-add" scenario on an offline card, where the
      the second "add" merely increments the IP's use count. But as the IP is
      still set to DISP_ADDR_DELETE from the previous "delete" step,
      l3_recover_ip() won't register it with the HW when the card goes online.
      
      Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      98d823ab
    • J
      s390/qeth: fix overestimated count of buffer elements · 12472af8
      Julian Wiedmann 提交于
      qeth_get_elements_for_range() doesn't know how to handle a 0-length
      range (ie. start == end), and returns 1 when it should return 0.
      Such ranges occur on TSO skbs, where the L2/L3/L4 headers (and thus all
      of the skb's linear data) are skipped when mapping the skb into regular
      buffer elements.
      
      This overestimation may cause several performance-related issues:
      1. sub-optimal IO buffer selection, where the next buffer gets selected
         even though the skb would actually still fit into the current buffer.
      2. forced linearization, if the element count for a non-linear skb
         exceeds QETH_MAX_BUFFER_ELEMENTS.
      
      Rather than modifying qeth_get_elements_for_range() and adding overhead
      to every caller, fix up those callers that are in risk of passing a
      0-length range.
      
      Fixes: 2863c613 ("qeth: refactor calculation of SBALE count")
      Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12472af8
  2. 28 2月, 2018 12 次提交
  3. 27 2月, 2018 11 次提交