1. 16 6月, 2017 5 次提交
  2. 15 6月, 2017 6 次提交
    • D
      rxrpc: Cache the congestion window setting · f7aec129
      David Howells 提交于
      Cache the congestion window setting that was determined during a call's
      transmission phase when it finishes so that it can be used by the next call
      to the same peer, thereby shortcutting the slow-start algorithm.
      
      The value is stored in the rxrpc_peer struct and is accessed without
      locking.  Each call takes the value that happens to be there when it starts
      and just overwrites the value when it finishes.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f7aec129
    • J
      net: don't global ICMP rate limit packets originating from loopback · 849a44de
      Jesper Dangaard Brouer 提交于
      Florian Weimer seems to have a glibc test-case which requires that
      loopback interfaces does not get ICMP ratelimited.  This was broken by
      commit c0303efe ("net: reduce cycles spend on ICMP replies that
      gets rate limited").
      
      An ICMP response will usually be routed back-out the same incoming
      interface.  Thus, take advantage of this and skip global ICMP
      ratelimit when the incoming device is loopback.  In the unlikely event
      that the outgoing it not loopback, due to strange routing policy
      rules, ICMP rate limiting still works via peer ratelimiting via
      icmpv4_xrlim_allow().  Thus, we should still comply with RFC1812
      (section 4.3.2.8 "Rate Limiting").
      
      This seems to fix the reproducer given by Florian.  While still
      avoiding to perform expensive and unneeded outgoing route lookup for
      rate limited packets (in the non-loopback case).
      
      Fixes: c0303efe ("net: reduce cycles spend on ICMP replies that gets rate limited")
      Reported-by: NFlorian Weimer <fweimer@redhat.com>
      Reported-by: N"H.J. Lu" <hjl.tools@gmail.com>
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      849a44de
    • D
      net/act_pedit: fix an error code · c4f65b09
      Dan Carpenter 提交于
      I'm reviewing static checker warnings where we do ERR_PTR(0), which is
      the same as NULL.  I'm pretty sure we intended to return ERR_PTR(-EINVAL)
      here.  Sometimes these bugs lead to a NULL dereference but I don't
      immediately see that problem here.
      
      Fixes: 71d0ed70 ("net/act_pedit: Support using offset relative to the conventional network headers")
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: NAmir Vadai <amir@vadai.me>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c4f65b09
    • P
      net: use skb_unref() in napi_consume_skb() · 7608894e
      Paolo Abeni 提交于
      The commit 83ada39bb79d ("net: factor out a helper to decrement the
      skb refcount") provided and used a helper for decrementing skb usage,
      but I missed at least a spot for it.
      
      This change remove some more duplicated code reusing skb_unref() in
      napi_consume_skb(), too. The helper uses an additional, unneeded
      unlikely(!skb) test - napi_consume_skb() already check it a few lines
      above - but the compiler is smart enough to optimize the duplicated
      test out.
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7608894e
    • Y
      bpf: permits narrower load from bpf program context fields · 31fd8581
      Yonghong Song 提交于
      Currently, verifier will reject a program if it contains an
      narrower load from the bpf context structure. For example,
              __u8 h = __sk_buff->hash, or
              __u16 p = __sk_buff->protocol
              __u32 sample_period = bpf_perf_event_data->sample_period
      which are narrower loads of 4-byte or 8-byte field.
      
      This patch solves the issue by:
        . Introduce a new parameter ctx_field_size to carry the
          field size of narrower load from prog type
          specific *__is_valid_access validator back to verifier.
        . The non-zero ctx_field_size for a memory access indicates
          (1). underlying prog type specific convert_ctx_accesses
               supporting non-whole-field access
          (2). the current insn is a narrower or whole field access.
        . In verifier, for such loads where load memory size is
          less than ctx_field_size, verifier transforms it
          to a full field load followed by proper masking.
        . Currently, __sk_buff and bpf_perf_event_data->sample_period
          are supporting narrowing loads.
        . Narrower stores are still not allowed as typical ctx stores
          are just normal stores.
      
      Because of this change, some tests in verifier will fail and
      these tests are removed. As a bonus, rename some out of bound
      __sk_buff->cb access to proper field name and remove two
      redundant "skb cb oob" tests.
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NYonghong Song <yhs@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      31fd8581
    • W
      net_sched: move tcf_lock down after gen_replace_estimator() · 74030603
      WANG Cong 提交于
      Laura reported a sleep-in-atomic kernel warning inside
      tcf_act_police_init() which calls gen_replace_estimator() with
      spinlock protection.
      
      It is not necessary in this case, we already have RTNL lock here
      so it is enough to protect concurrent writers. For the reader,
      i.e. tcf_act_police(), it needs to make decision based on this
      rate estimator, in the worst case we drop more/less packets than
      necessary while changing the rate in parallel, it is still acceptable.
      Reported-by: NLaura Abbott <labbott@redhat.com>
      Reported-by: NNick Huber <nicholashuber@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      74030603
  3. 14 6月, 2017 8 次提交
  4. 13 6月, 2017 15 次提交
  5. 12 6月, 2017 6 次提交
    • P
      udp: try to avoid 2 cache miss on dequeue · b65ac446
      Paolo Abeni 提交于
      when udp_recvmsg() is executed, on x86_64 and other archs, most skb
      fields are on cold cachelines.
      If the skb are linear and the kernel don't need to compute the udp
      csum, only a handful of skb fields are required by udp_recvmsg().
      Since we already use skb->dev_scratch to cache hot data, and
      there are 32 bits unused on 64 bit archs, use such field to cache
      as much data as we can, and try to prefetch on dequeue the relevant
      fields that are left out.
      
      This can save up to 2 cache miss per packet.
      
      v1 -> v2:
        - changed udp_dev_scratch fields types to u{32,16} variant,
          replaced bitfiled with bool
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b65ac446
    • P
      udp: avoid a cache miss on dequeue · 0a463c78
      Paolo Abeni 提交于
      Since UDP no more uses sk->destructor, we can clear completely
      the skb head state before enqueuing. Amend and use
      skb_release_head_state() for that.
      
      All head states share a single cacheline, which is not
      normally used/accesses on dequeue. We can avoid entirely accessing
      such cacheline implementing and using in the UDP code a specialized
      skb free helper which ignores the skb head state.
      
      This saves a cacheline miss at skb deallocation time.
      
      v1 -> v2:
        replaced secpath_reset() with skb_release_head_state()
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a463c78
    • P
      net: factor out a helper to decrement the skb refcount · 3889a803
      Paolo Abeni 提交于
      The same code is replicated in 3 different places; move it to a
      common helper.
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3889a803
    • C
      proc: snmp6: Use correct type in memset · 3500cd73
      Christian Perle 提交于
      Reading /proc/net/snmp6 yields bogus values on 32 bit kernels.
      Use "u64" instead of "unsigned long" in sizeof().
      
      Fixes: 4a4857b1 ("proc: Reduce cache miss in snmp6_seq_show")
      Signed-off-by: NChristian Perle <christian.perle@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3500cd73
    • M
      Bluetooth: Send HCI Set Event Mask Page 2 command only when needed · 313f6888
      Marcel Holtmann 提交于
      The Broadcom BCM20702 Bluetooth controller in ThinkPad-T530 devices
      report support for the Set Event Mask Page 2 command, but actually do
      return an error when trying to use it.
      
        < HCI Command: Read Local Supported Commands (0x04|0x0002) plen 0
        > HCI Event: Command Complete (0x0e) plen 68
             Read Local Supported Commands (0x04|0x0002) ncmd 1
               Status: Success (0x00)
               Commands: 162 entries
                 ...
                 Set Event Mask Page 2 (Octet 22 - Bit 2)
                 ...
      
        < HCI Command: Set Event Mask Page 2 (0x03|0x0063) plen 8
               Mask: 0x0000000000000000
        > HCI Event: Command Complete (0x0e) plen 4
             Set Event Mask Page 2 (0x03|0x0063) ncmd 1
               Status: Unknown HCI Command (0x01)
      
      Since these controllers do not support any feature that would require
      the event mask page 2 to be modified, it is safe to not send this
      command at all. The default value is all bits set to zero.
      
      T:  Bus=01 Lev=02 Prnt=02 Port=03 Cnt=03 Dev#=  9 Spd=12   MxCh= 0
      D:  Ver= 2.00 Cls=ff(vend.) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=0a5c ProdID=21e6 Rev= 1.12
      S:  Manufacturer=Broadcom Corp
      S:  Product=BCM20702A0
      S:  SerialNumber=F82FA8E8CFC0
      C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=  0mA
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      I:  If#= 1 Alt= 2 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      I:  If#= 1 Alt= 3 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      I:  If#= 1 Alt= 4 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      I:  If#= 1 Alt= 5 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=btusb
      E:  Ad=84(I) Atr=02(Bulk) MxPS=  32 Ivl=0ms
      E:  Ad=04(O) Atr=02(Bulk) MxPS=  32 Ivl=0ms
      I:* If#= 3 Alt= 0 #EPs= 0 Cls=fe(app. ) Sub=01 Prot=01 Driver=(none)
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Reported-by: NSedat Dilek <sedat.dilek@gmail.com>
      Tested-by: NSedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: NSzymon Janc <szymon.janc@codecoup.pl>
      313f6888
    • D
      net: ipmr: Fix some mroute forwarding issues in vrf's · 4b1f0d33
      Donald Sharp 提交于
      This patch fixes two issues:
      
      1) When forwarding on *,G mroutes that are in a vrf, the
      kernel was dropping information about the actual incoming
      interface when calling ip_mr_forward from ip_mr_input.
      This caused ip_mr_forward to send the multicast packet
      back out the incoming interface.  Fix this by
      modifying ip_mr_forward to be handed the correctly
      resolved dev.
      
      2) When a unresolved cache entry is created we store
      the incoming skb on the unresolved cache entry and
      upon mroute resolution from the user space daemon,
      we attempt to forward the packet.  Again we were
      not resolving to the correct incoming device for
      a vrf scenario, before calling ip_mr_forward.
      Fix this by resolving to the correct interface
      and calling ip_mr_forward with the result.
      
      Fixes: e58e4159 ("net: Enable support for VRF with ipv4 multicast")
      Signed-off-by: NDonald Sharp <sharpd@cumulusnetworks.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: NYotam Gigi <yotamg@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4b1f0d33