1. 09 9月, 2017 29 次提交
    • D
      bpf: make error reporting in bpf_warn_invalid_xdp_action more clear · 9beb8bed
      Daniel Borkmann 提交于
      Differ between illegal XDP action code and just driver
      unsupported one to provide better feedback when we throw
      a one-time warning here. Reason is that with 814abfab
      ("xdp: add bpf_redirect helper function") not all drivers
      support the new XDP return code yet and thus they will
      fall into their 'default' case when checking for return
      codes after program return, which then triggers a
      bpf_warn_invalid_xdp_action() stating that the return
      code is illegal, but from XDP perspective it's not.
      
      I decided not to place something like a XDP_ACT_MAX define
      into uapi i) given we don't have this either for all other
      program types, ii) future action codes could have further
      encoding there, which would render such define unsuitable
      and we wouldn't be able to rip it out again, and iii) we
      rarely add new action codes.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9beb8bed
    • F
      Revert "mdio_bus: Remove unneeded gpiod NULL check" · a010a2f6
      Florian Fainelli 提交于
      This reverts commit 95b80bf3 ("mdio_bus:
      Remove unneeded gpiod NULL check"), this commit assumed that GPIOLIB
      checks for NULL descriptors, so it's safe to drop them, but it is not
      when CONFIG_GPIOLIB is disabled in the kernel. If we do call
      gpiod_set_value_cansleep() on a GPIO descriptor we will issue warnings
      coming from the inline stubs declared in include/linux/gpio/consumer.h.
      
      Fixes: 95b80bf3 ("mdio_bus: Remove unneeded gpiod NULL check")
      Reported-by: NWoojung Huh <Woojung.Huh@microchip.com>
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a010a2f6
    • D
      Merge branch 'xdp-bpf-fixes' · a7bc5774
      David S. Miller 提交于
      John Fastabend says:
      
      ====================
      net: Fixes for XDP/BPF
      
      The following fixes, UAPI updates, and small improvement,
      
      i. XDP needs to be called inside RCU with preempt disabled.
      
      ii. Not strictly a bug fix but we have an attach command in the
      sockmap UAPI already to avoid having a single kernel released with
      only the attach and not the detach I'm pushing this into net branch.
      Its early in the RC cycle so I think this is OK (not ideal but better
      than supporting a UAPI with a missing detach forever).
      
      iii. Final patch replace cpu_relax with cond_resched in devmap.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7bc5774
    • J
      bpf: devmap, use cond_resched instead of cpu_relax · 374fb014
      John Fastabend 提交于
      Be a bit more friendly about waiting for flush bits to complete.
      Replace the cpu_relax() with a cond_resched().
      Suggested-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      374fb014
    • J
      bpf: add support for sockmap detach programs · 5a67da2a
      John Fastabend 提交于
      The bpf map sockmap supports adding programs via attach commands. This
      patch adds the detach command to keep the API symmetric and allow
      users to remove previously added programs. Otherwise the user would
      have to delete the map and re-add it to get in this state.
      
      This also adds a series of additional tests to capture detach operation
      and also attaching/detaching invalid prog types.
      
      API note: socks will run (or not run) programs depending on the state
      of the map at the time the sock is added. We do not for example walk
      the map and remove programs from previously attached socks.
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a67da2a
    • J
      net: rcu lock and preempt disable missing around generic xdp · bbbe211c
      John Fastabend 提交于
      do_xdp_generic must be called inside rcu critical section with preempt
      disabled to ensure BPF programs are valid and per-cpu variables used
      for redirect operations are consistent. This patch ensures this is true
      and fixes the splat below.
      
      The netif_receive_skb_internal() code path is now broken into two rcu
      critical sections. I decided it was better to limit the preempt_enable/disable
      block to just the xdp static key portion and the fallout is more
      rcu_read_lock/unlock calls. Seems like the best option to me.
      
      [  607.596901] =============================
      [  607.596906] WARNING: suspicious RCU usage
      [  607.596912] 4.13.0-rc4+ #570 Not tainted
      [  607.596917] -----------------------------
      [  607.596923] net/core/dev.c:3948 suspicious rcu_dereference_check() usage!
      [  607.596927]
      [  607.596927] other info that might help us debug this:
      [  607.596927]
      [  607.596933]
      [  607.596933] rcu_scheduler_active = 2, debug_locks = 1
      [  607.596938] 2 locks held by pool/14624:
      [  607.596943]  #0:  (rcu_read_lock_bh){......}, at: [<ffffffff95445ffd>] ip_finish_output2+0x14d/0x890
      [  607.596973]  #1:  (rcu_read_lock_bh){......}, at: [<ffffffff953c8e3a>] __dev_queue_xmit+0x14a/0xfd0
      [  607.597000]
      [  607.597000] stack backtrace:
      [  607.597006] CPU: 5 PID: 14624 Comm: pool Not tainted 4.13.0-rc4+ #570
      [  607.597011] Hardware name: Dell Inc. Precision Tower 5810/0HHV7N, BIOS A17 03/01/2017
      [  607.597016] Call Trace:
      [  607.597027]  dump_stack+0x67/0x92
      [  607.597040]  lockdep_rcu_suspicious+0xdd/0x110
      [  607.597054]  do_xdp_generic+0x313/0xa50
      [  607.597068]  ? time_hardirqs_on+0x5b/0x150
      [  607.597076]  ? mark_held_locks+0x6b/0xc0
      [  607.597088]  ? netdev_pick_tx+0x150/0x150
      [  607.597117]  netif_rx_internal+0x205/0x3f0
      [  607.597127]  ? do_xdp_generic+0xa50/0xa50
      [  607.597144]  ? lock_downgrade+0x2b0/0x2b0
      [  607.597158]  ? __lock_is_held+0x93/0x100
      [  607.597187]  netif_rx+0x119/0x190
      [  607.597202]  loopback_xmit+0xfd/0x1b0
      [  607.597214]  dev_hard_start_xmit+0x127/0x4e0
      
      Fixes: d4455169 ("net: xdp: support xdp generic on virtual devices")
      Fixes: b5cdae32 ("net: Generic XDP")
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bbbe211c
    • D
      bpf: don't select potentially stale ri->map from buggy xdp progs · 109980b8
      Daniel Borkmann 提交于
      We can potentially run into a couple of issues with the XDP
      bpf_redirect_map() helper. The ri->map in the per CPU storage
      can become stale in several ways, mostly due to misuse, where
      we can then trigger a use after free on the map:
      
      i) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT
      and running on a driver not supporting XDP_REDIRECT yet. The
      ri->map on that CPU becomes stale when the XDP program is unloaded
      on the driver, and a prog B loaded on a different driver which
      supports XDP_REDIRECT return code. prog B would have to omit
      calling to bpf_redirect_map() and just return XDP_REDIRECT, which
      would then access the freed map in xdp_do_redirect() since not
      cleared for that CPU.
      
      ii) prog A is calling bpf_redirect_map(), returning a code other
      than XDP_REDIRECT. prog A is then detached, which triggers release
      of the map. prog B is attached which, similarly as in i), would
      just return XDP_REDIRECT without having called bpf_redirect_map()
      and thus be accessing the freed map in xdp_do_redirect() since
      not cleared for that CPU.
      
      iii) prog A is attached to generic XDP, calling the bpf_redirect_map()
      helper and returning XDP_REDIRECT. xdp_do_generic_redirect() is
      currently not handling ri->map (will be fixed by Jesper), so it's
      not being reset. Later loading a e.g. native prog B which would,
      say, call bpf_xdp_redirect() and then returns XDP_REDIRECT would
      find in xdp_do_redirect() that a map was set and uses that causing
      use after free on map access.
      
      Fix thus needs to avoid accessing stale ri->map pointers, naive
      way would be to call a BPF function from drivers that just resets
      it to NULL for all XDP return codes but XDP_REDIRECT and including
      XDP_REDIRECT for drivers not supporting it yet (and let ri->map
      being handled in xdp_do_generic_redirect()). There is a less
      intrusive way w/o letting drivers call a reset for each BPF run.
      
      The verifier knows we're calling into bpf_xdp_redirect_map()
      helper, so it can do a small insn rewrite transparent to the prog
      itself in the sense that it fills R4 with a pointer to the own
      bpf_prog. We have that pointer at verification time anyway and
      R4 is allowed to be used as per calling convention we scratch
      R0 to R5 anyway, so they become inaccessible and program cannot
      read them prior to a write. Then, the helper would store the prog
      pointer in the current CPUs struct redirect_info. Later in
      xdp_do_*_redirect() we check whether the redirect_info's prog
      pointer is the same as passed xdp_prog pointer, and if that's
      the case then all good, since the prog holds a ref on the map
      anyway, so it is always valid at that point in time and must
      have a reference count of at least 1. If in the unlikely case
      they are not equal, it means we got a stale pointer, so we clear
      and bail out right there. Also do reset map and the owning prog
      in bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and
      bpf_xdp_redirect() won't get mixed up, only the last call should
      take precedence. A tc bpf_redirect() doesn't use map anywhere
      yet, so no need to clear it there since never accessed in that
      layer.
      
      Note that in case the prog is released, and thus the map as
      well we're still under RCU read critical section at that time
      and have preemption disabled as well. Once we commit with the
      __dev_map_insert_ctx() from xdp_do_redirect_map() and set the
      map to ri->map_to_flush, we still wait for a xdp_do_flush_map()
      to finish in devmap dismantle time once flush_needed bit is set,
      so that is fine.
      
      Fixes: 97f91a7c ("bpf: add bpf_redirect_map helper routine")
      Reported-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      109980b8
    • K
      net: tulip: Constify tulip_tbl · 9a486c9d
      Kees Cook 提交于
      It looks like all users of tulip_tbl are reads, so mark this table
      as read-only.
      
      $ git grep tulip_tbl  # edited to avoid line-wraps...
      interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ...
      interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs&~RxPollInt, ...
      interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ...
      interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs | TimerInt,
      pnic.c:      iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7);
      tulip.h:     extern struct tulip_chip_table tulip_tbl[];
      tulip_core.c:struct tulip_chip_table tulip_tbl[] = {
      tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR5);
      tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7);
      tulip_core.c:setup_timer(&tp->timer, tulip_tbl[tp->chip_id].media_timer,
      tulip_core.c:const char *chip_name = tulip_tbl[chip_idx].chip_name;
      tulip_core.c:if (pci_resource_len (pdev, 0) < tulip_tbl[chip_idx].io_size)
      tulip_core.c:ioaddr =  pci_iomap(..., tulip_tbl[chip_idx].io_size);
      tulip_core.c:tp->flags = tulip_tbl[chip_idx].flags;
      tulip_core.c:setup_timer(&tp->timer, tulip_tbl[tp->chip_id].media_timer,
      tulip_core.c:INIT_WORK(&tp->media_work, tulip_tbl[tp->chip_id].media_task);
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jarod Wilson <jarod@redhat.com>
      Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9a486c9d
    • I
      net: ethernet: ti: netcp_core: no need in netif_napi_del · e333ac1f
      Ivan Khoronzhuk 提交于
      Don't remove rx_napi specifically just before free_netdev(),
      it's supposed to be done in it and is confusing w/o tx_napi deletion.
      Signed-off-by: NIvan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e333ac1f
    • M
      davicom: Display proper debug level up to 6 · 0fdbedc7
      Mathieu Malaterre 提交于
      This will make it explicit some messages are of the form:
      dm9000_dbg(db, 5, ...
      Signed-off-by: NMathieu Malaterre <malat@debian.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0fdbedc7
    • B
      net: phy: sfp: rename dt properties to match the binding · 25ee0793
      Baruch Siach 提交于
      Make the Rx rate select control gpio property name match the documented
      binding. This would make the addition of 'rate-select1-gpios' for SFP+
      support more natural.
      
      Also, make the MOD-DEF0 gpio property name match the documentation.
      Signed-off-by: NBaruch Siach <baruch@tkos.co.il>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      25ee0793
    • B
      dt-binding: net: sfp binding documentation · 3ef37140
      Baruch Siach 提交于
      Add device-tree binding documentation SFP transceivers. Support for SFP
      transceivers has been recently introduced (drivers/net/phy/sfp.c).
      Signed-off-by: NBaruch Siach <baruch@tkos.co.il>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3ef37140
    • B
      dt-bindings: add SFF vendor prefix · 165da358
      Baruch Siach 提交于
      Acked-by: NRob Herring <robh@kernel.org>
      Signed-off-by: NBaruch Siach <baruch@tkos.co.il>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      165da358
    • B
      dt-bindings: net: don't confuse with generic PHY property · c43593d8
      Baruch Siach 提交于
      This complements commit 9a94b3a4 (dt-binding: phy: don't confuse with
      Ethernet phy properties).
      
      The generic PHY 'phys' property sometime appears in the same node with
      the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in
      ethernet.txt to reduce confusion.
      Signed-off-by: NBaruch Siach <baruch@tkos.co.il>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c43593d8
    • H
      ip6_tunnel: fix setting hop_limit value for ipv6 tunnel · 18e1173d
      Haishuang Yan 提交于
      Similar to vxlan/geneve tunnel, if hop_limit is zero, it should fall
      back to ip6_dst_hoplimt().
      Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      18e1173d
    • H
      ip_tunnel: fix setting ttl and tos value in collect_md mode · 0f693f19
      Haishuang Yan 提交于
      ttl and tos variables are declared and assigned, but are not used in
      iptunnel_xmit() function.
      
      Fixes: cfc7381b ("ip_tunnel: add collect_md mode to IPIP tunnel")
      Cc: Alexei Starovoitov <ast@fb.com>
      Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0f693f19
    • E
      ipv6: fix typo in fib6_net_exit() · 32a805ba
      Eric Dumazet 提交于
      IPv6 FIB should use FIB6_TABLE_HASHSZ, not FIB_TABLE_HASHSZ.
      
      Fixes: ba1cc08d ("ipv6: fix memory leak with multiple tables during netns destruction")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32a805ba
    • E
      tcp: fix a request socket leak · 1f3b359f
      Eric Dumazet 提交于
      While the cited commit fixed a possible deadlock, it added a leak
      of the request socket, since reqsk_put() must be called if the BPF
      filter decided the ACK packet must be dropped.
      
      Fixes: d624d276 ("tcp: fix possible deadlock in TCP stack vs BPF filter")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1f3b359f
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 10807461
      David S. Miller 提交于
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter/IPVS fixes for your net tree,
      they are:
      
      1) Fix SCTP connection setup when IPVS module is loaded and any scheduler
         is registered, from Xin Long.
      
      2) Don't create a SCTP connection from SCTP ABORT packets, also from
         Xin Long.
      
      3) WARN_ON() and drop packet, instead of BUG_ON() races when calling
         nf_nat_setup_info(). This is specifically a longstanding problem
         when br_netfilter with conntrack support is in place, patch from
         Florian Westphal.
      
      4) Avoid softlock splats via iptables-restore, also from Florian.
      
      5) Revert NAT hashtable conversion to rhashtable, semantics of rhlist
         are different from our simple NAT hashtable, this has been causing
         problems in the recent Linux kernel releases. From Florian.
      
      6) Add per-bucket spinlock for NAT hashtable, so at least we restore
         one of the benefits we got from the previous rhashtable conversion.
      
      7) Fix incorrect hashtable size in memory allocation in xt_hashlimit,
         from Zhizhou Tian.
      
      8) Fix build/link problems with hashlimit and 32-bit arches, to address
         recent fallout from a new hashlimit mode, from Vishwanath Pai.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10807461
    • D
      Merge tag 'wireless-drivers-for-davem-2017-09-08' of... · 91aac563
      David S. Miller 提交于
      Merge tag 'wireless-drivers-for-davem-2017-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for 4.14
      
      Few fixes to regressions introduced in the last one or two releases.
      The iwlwifi fix is for a regression reported by Linus.
      
      rtlwifi
      
      * fix two antenna selection related bugs
      
      iwlwifi
      
      * fix regression with older firmwares
      
      brcmfmac
      
      * workaround firmware crash for bcm4345
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91aac563
    • M
      sctp: fix missing wake ups in some situations · 7906b00f
      Marcelo Ricardo Leitner 提交于
      Commit fb586f25 ("sctp: delay calls to sk_data_ready() as much as
      possible") minimized the number of wake ups that are triggered in case
      the association receives a packet with multiple data chunks on it and/or
      when io_events are enabled and then commit 0970f5b3 ("sctp: signal
      sk_data_ready earlier on data chunks reception") moved the wake up to as
      soon as possible. It thus relies on the state machine running later to
      clean the flag that the event was already generated.
      
      The issue is that there are 2 call paths that calls
      sctp_ulpq_tail_event() outside of the state machine, causing the flag to
      linger and possibly omitting a needed wake up in the sequence.
      
      One of the call paths is when enabling SCTP_SENDER_DRY_EVENTS via
      setsockopt(SCTP_EVENTS), as noticed by Harald Welte. The other is when
      partial reliability triggers removal of chunks from the send queue when
      the application calls sendmsg().
      
      This commit fixes it by not setting the flag in case the socket is not
      owned by the user, as it won't be cleaned later. This works for
      user-initiated calls and also for rx path processing.
      
      Fixes: fb586f25 ("sctp: delay calls to sk_data_ready() as much as possible")
      Reported-by: NHarald Welte <laforge@gnumonks.org>
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7906b00f
    • V
      netfilter: xt_hashlimit: fix build error caused by 64bit division · 90c4ae4e
      Vishwanath Pai 提交于
      64bit division causes build/link errors on 32bit architectures. It
      prints out error messages like:
      
      ERROR: "__aeabi_uldivmod" [net/netfilter/xt_hashlimit.ko] undefined!
      
      The value of avg passed through by userspace in BYTE mode cannot exceed
      U32_MAX. Which means 64bit division in user2rate_bytes is unnecessary.
      To fix this I have changed the type of param 'user' to u32.
      
      Since anything greater than U32_MAX is an invalid input we error out in
      hashlimit_mt_check_common() when this is the case.
      
      Changes in v2:
      	Making return type as u32 would cause an overflow for small
      	values of 'user' (for example 2, 3 etc). To avoid this I bumped up
      	'r' to u64 again as well as the return type. This is OK since the
      	variable that stores the result is u64. We still avoid 64bit
      	division here since 'user' is u32.
      
      Fixes: bea74641 ("netfilter: xt_hashlimit: add rate match mode")
      Signed-off-by: NVishwanath Pai <vpai@akamai.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      90c4ae4e
    • Z
      netfilter: xt_hashlimit: alloc hashtable with right size · 05d0eae7
      Zhizhou Tian 提交于
      struct xt_byteslimit_htable used hlist_head, but memory allocation is
      done through sizeof(struct list_head).
      Signed-off-by: NZhizhou Tian <zhizhou.tian@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      05d0eae7
    • F
      netfilter: core: remove erroneous warn_on · 74585d4f
      Florian Westphal 提交于
      kernel test robot reported:
      
      WARNING: CPU: 0 PID: 1244 at net/netfilter/core.c:218 __nf_hook_entries_try_shrink+0x49/0xcd
      [..]
      
      After allowing batching in nf_unregister_net_hooks its possible that an earlier
      call to __nf_hook_entries_try_shrink already compacted the list.
      If this happens we don't need to do anything.
      
      Fixes: d3ad2c17 ("netfilter: core: batch nf_unregister_net_hooks synchronize_net calls")
      Reported-by: Nkernel test robot <xiaolong.ye@intel.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Acked-by: NAaron Conole <aconole@bytheb.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      74585d4f
    • F
      netfilter: nat: use keyed locks · 8073e960
      Florian Westphal 提交于
      no need to serialize on a single lock, we can partition the table and
      add/delete in parallel to different slots.
      This restores one of the advantages that got lost with the rhlist
      revert.
      
      Cc: Ivan Babrou <ibobrik@gmail.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      8073e960
    • F
      netfilter: nat: Revert "netfilter: nat: convert nat bysrc hash to rhashtable" · e1bf1687
      Florian Westphal 提交于
      This reverts commit 870190a9.
      
      It was not a good idea. The custom hash table was a much better
      fit for this purpose.
      
      A fast lookup is not essential, in fact for most cases there is no lookup
      at all because original tuple is not taken and can be used as-is.
      What needs to be fast is insertion and deletion.
      
      rhlist removal however requires a rhlist walk.
      We can have thousands of entries in such a list if source port/addresses
      are reused for multiple flows, if this happens removal requests are so
      expensive that deletions of a few thousand flows can take several
      seconds(!).
      
      The advantages that we got from rhashtable are:
      1) table auto-sizing
      2) multiple locks
      
      1) would be nice to have, but it is not essential as we have at
      most one lookup per new flow, so even a million flows in the bysource
      table are not a problem compared to current deletion cost.
      2) is easy to add to custom hash table.
      
      I tried to add hlist_node to rhlist to speed up rhltable_remove but this
      isn't doable without changing semantics.  rhltable_remove_fast will
      check that the to-be-deleted object is part of the table and that
      requires a list walk that we want to avoid.
      
      Furthermore, using hlist_node increases size of struct rhlist_head, which
      in turn increases nf_conn size.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=196821Reported-by: NIvan Babrou <ibobrik@gmail.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      e1bf1687
    • F
      netfilter: xtables: add scheduling opportunity in get_counters · a5d7a714
      Florian Westphal 提交于
      There are reports about spurious softlockups during iptables-restore, a
      backtrace i saw points at get_counters -- it uses a sequence lock and also
      has unbounded restart loop.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      a5d7a714
    • F
      netfilter: nf_nat: don't bug when mapping already exists · 75c26314
      Florian Westphal 提交于
      It seems preferrable to limp along if we have a conflicting mapping,
      its certainly better than a BUG().
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      75c26314
    • S
      ipv6: fix memory leak with multiple tables during netns destruction · ba1cc08d
      Sabrina Dubroca 提交于
      fib6_net_exit only frees the main and local tables. If another table was
      created with fib6_alloc_table, we leak it when the netns is destroyed.
      
      Fix this in the same way ip_fib_net_exit cleans up tables, by walking
      through the whole hashtable of fib6_table's. We can get rid of the
      special cases for local and main, since they're also part of the
      hashtable.
      
      Reproducer:
          ip netns add x
          ip -net x -6 rule add from 6003:1::/64 table 100
          ip netns del x
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Fixes: 58f09b78 ("[NETNS][IPV6] ip6_fib - make it per network namespace")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ba1cc08d
  2. 08 9月, 2017 10 次提交
  3. 07 9月, 2017 1 次提交
    • L
      rtlwifi: btcoexist: Fix antenna selection code · 6d622692
      Larry Finger 提交于
      In commit 87d8a9f3 ("rtlwifi: btcoex: call bind to setup btcoex"),
      the code turns on a call to exhalbtc_bind_bt_coex_withadapter(). This
      routine contains a bug that causes incorrect antenna selection for those
      HP laptops with only one antenna and an incorrectly programmed EFUSE.
      These boxes are the ones that need the ant_sel module parameter.
      
      Fixes: 87d8a9f3 ("rtlwifi: btcoex: call bind to setup btcoex")
      Signed-off-by: NLarry Finger <Larry.Finger@lwfinger.net>
      Cc: Ping-Ke Shih <pkshih@realtek.com>
      Cc: Yan-Hsuan Chuang <yhchuang@realtek.com>
      Cc: Birming Chiu <birming@realtek.com>
      Cc: Shaofu <shaofu@realtek.com>
      Cc: Steven Ting <steventing@realtek.com>
      Cc: Stable <stable@vger.kernel.org> # 4.13+
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      6d622692