1. 20 Aug, 2016 9 commits
  2. 19 Aug, 2016 10 commits
  3. 18 Aug, 2016 21 commits
    • L
      netfilter: cttimeout: fix use after free error when delete netns · b75911b6
      Liping Zhang committed
      In general, when we delete a netns, cttimeout_net_exit is called
      before ipt_unregister_table, i.e. before ctnl_timeout_put.
      
      But after cttimeout_net_exit has called kfree_rcu, ctnl_timeout_put
      still decrements the timeout object's refcnt. This is incorrect and
      causes a use-after-free error.
      
      It is easy to reproduce this problem:
        # while : ; do
        ip netns add xxx
        ip netns exec xxx nfct add timeout testx inet icmp timeout 200
        ip netns exec xxx iptables -t raw -p icmp -I OUTPUT -j CT --timeout testx
        ip netns del xxx
        done
      
        =======================================================================
        BUG kmalloc-96 (Tainted: G    B       E  ): Poison overwritten
        -----------------------------------------------------------------------
        INFO: 0xffff88002b5161e8-0xffff88002b5161e8. First byte 0x6a instead of
        0x6b
        INFO: Allocated in cttimeout_new_timeout+0xd4/0x240 [nfnetlink_cttimeout]
        age=104 cpu=0 pid=3330
        ___slab_alloc+0x4da/0x540
        __slab_alloc+0x20/0x40
        __kmalloc+0x1c8/0x240
        cttimeout_new_timeout+0xd4/0x240 [nfnetlink_cttimeout]
        nfnetlink_rcv_msg+0x21a/0x230 [nfnetlink]
        [ ... ]
      
      So call kfree_rcu to free the timeout object only when the refcnt has
      dropped to 0. And, as nfnetlink_acct does, use atomic_cmpxchg to
      avoid the race between ctnl_timeout_try_del and ctnl_timeout_put.
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      b75911b6
    • L
      netfilter: nfnetlink_acct: fix race between nfacct del and xt_nfacct destroy · 12be15dd
      Liping Zhang committed
      Suppose that we first run the following commands:
        # nfacct add test
        # iptables -A INPUT -m nfacct --nfacct-name test
      
      Now the "test" acct's refcnt is 2. If we later delete the "test"
      nfacct and the related iptables rule at the same time, a race may
      happen:
            CPU0                                    CPU1
        nfnl_acct_try_del                      nfnl_acct_put
        atomic_dec_and_test //ref=1,testfail          -
             -                                 atomic_dec_and_test //ref=0,testok
             -                                 kfree_rcu
        atomic_inc //ref=1                            -
      
      So after the RCU grace period, nf_acct will be freed while it is still
      linked in the nfnl_acct_list; a later access to it will cause an oops.
      
      Replace the atomic_dec_and_test and atomic_inc combination with a single
      atomic operation, atomic_cmpxchg, to fix this problem.
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      12be15dd
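The single-step delete described in the commit above can be sketched in userspace C11. This is only an illustration under stated assumptions: `struct acct_obj`, `acct_put` and `acct_try_del` are hypothetical names, C11 `<stdatomic.h>` stands in for the kernel's atomic helpers, and a `freed` flag stands in for `kfree_rcu`. It is not the kernel patch itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an accounting object; not the kernel struct. */
struct acct_obj {
    atomic_int refcnt;  /* 1 = held only by the list, >1 = also used by rules */
    bool on_list;       /* true while linked in the (imaginary) object list */
    bool freed;         /* set where the kernel would call kfree_rcu() */
};

/* Drop one reference; mark the object freed when the count reaches zero. */
static void acct_put(struct acct_obj *obj)
{
    /* atomic_fetch_sub returns the previous value, so 1 means we hit zero. */
    if (atomic_fetch_sub(&obj->refcnt, 1) == 1)
        obj->freed = true;
}

/*
 * Try to delete the object. The compare-and-swap moves refcnt from 1 to 0
 * in one atomic step, so a concurrent acct_put() never sees a temporarily
 * decremented count, which is the window shown in the CPU0/CPU1 diagram.
 * Returns 0 on success, -1 if the object is still in use.
 */
static int acct_try_del(struct acct_obj *obj)
{
    int expected = 1;

    if (!atomic_compare_exchange_strong(&obj->refcnt, &expected, 0))
        return -1;          /* someone else still holds a reference */

    obj->on_list = false;   /* unlink first ... */
    obj->freed = true;      /* ... then free (kfree_rcu in the kernel) */
    return 0;
}
```

With a count of 2 (list plus one rule), `acct_try_del` fails and leaves the count untouched; once the rule drops its reference, the delete succeeds.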
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 184ca823
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
          Fietkau.
      
       2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.
      
       3) Fix some tg3 ethtool logic bugs, and one that would cause no
          interrupts to be generated when rx-coalescing is set to 0.  From
          Satish Baddipadige and Siva Reddy Kallam.
      
       4) QLCNIC mailbox corruption and napi budget handling fix from Manish
          Chopra.
      
       5) Fix fib_trie logic when walking the trie during /proc/net/route
          output that can access a stale node pointer.  From David Forster.
      
       6) Several sctp_diag fixes from Phil Sutter.
      
       7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.
      
       8) Checksum fixup fixes in bpf from Daniel Borkmann.
      
       9) Memory leaks in nfnetlink, from Liping Zhang.
      
      10) Use after free in rxrpc, from David Howells.
      
      11) Use after free in new skb_array code of macvtap driver, from Jason
          Wang.
      
      12) Calipso resource leak, from Colin Ian King.
      
      13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.
      
      14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.
      
      15) Fix lockdep splats in macsec, from Sabrina Dubroca.
      
      16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
          handling.
      
      17) Various tc-action bug fixes, from CONG Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
        net_sched: allow flushing tc police actions
        net_sched: unify the init logic for act_police
        net_sched: convert tcf_exts from list to pointer array
        net_sched: move tc offload macros to pkt_cls.h
        net_sched: fix a typo in tc_for_each_action()
        net_sched: remove an unnecessary list_del()
        net_sched: remove the leftover cleanup_a()
        mlxsw: spectrum: Allow packets to be trapped from any PG
        mlxsw: spectrum: Unmap 802.1Q FID before destroying it
        mlxsw: spectrum: Add missing rollbacks in error path
        mlxsw: reg: Fix missing op field fill-up
        mlxsw: spectrum: Trap loop-backed packets
        mlxsw: spectrum: Add missing packet traps
        mlxsw: spectrum: Mark port as active before registering it
        mlxsw: spectrum: Create PVID vPort before registering netdevice
        mlxsw: spectrum: Remove redundant errors from the code
        mlxsw: spectrum: Don't return upon error in removal path
        i40e: check for and deal with non-contiguous TCs
        ixgbe: Re-enable ability to toggle VLAN filtering
        ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
        ...
      184ca823
    • D
      Merge branch 'tc_action-fixes' · b96c22c0
      David S. Miller committed
      Cong Wang says:
      
      ====================
      net_sched: tc action fixes and updates
      
      This patchset fixes a few regressions caused by the previous
      code refactor and more. Thanks to Jamal for catching them!
      
      Note, patch 3/7 and 4/7 are not strictly necessary for this patchset,
      I just want to carry them together.
      
      ---
      v4: adjust an indentation for Jamal
          add two more patches
      
      v3: avoid list for fast path, suggested by Jamal
      
      v2: replace flex_array with regular dynamic array
          keep tcf_action_stats_update() in act_api.h
          fix macro typos found by Amir
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b96c22c0
    • R
      net_sched: allow flushing tc police actions · b5ac8518
      Roman Mashak committed
      act_police uses its own code to walk the action
      hashtable, which means we could not flush
      standalone tc police actions, so just switch to
      tcf_generic_walker() like other actions.
      
      (Joint work from Roman and Cong.)
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b5ac8518
    • W
      net_sched: unify the init logic for act_police · 0852e455
      WANG Cong committed
      Jamal reported a crash when we create a police action
      with a specific index. This is because the init logic
      is not correct: we should always create one in this
      case. Just unify the logic with other tc actions.
      
      Fixes: a03e6fe5 ("act_police: fix a crash during removal")
      Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0852e455
    • W
      net_sched: convert tcf_exts from list to pointer array · 22dc13c8
      WANG Cong committed
      As pointed out by Jamal, an action could be shared by
      multiple filters, so we can't use a list to chain them
      any more after we get rid of the original tc_action.
      Instead, we could just save pointers to these actions
      in tcf_exts, since they are refcount'ed, so convert
      the list to an array of pointers.
      
      The "ugly" part is that the action API still accepts a list
      as a parameter, so I just introduce a helper function to
      convert the array of pointers to a list, instead of
      relying on the C99 feature to iterate the array.
      
      Fixes: a85a970a ("net_sched: move tc_action into tcf_common")
      Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      22dc13c8
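The array-to-list conversion mentioned in the commit above might look roughly like the following userspace sketch. All names here (`struct action`, `struct exts`, `exts_to_list`) are hypothetical simplifications chosen for illustration; the kernel's real types and helper are not reproduced.

```c
#include <stddef.h>

/* Hypothetical, simplified action; the real kernel type has more fields. */
struct action {
    int id;
    struct action *next;    /* used only while chained for one bulk operation */
};

/* Hypothetical container: a filter keeps its actions as an array of pointers. */
struct exts {
    size_t nr_actions;
    struct action **actions;
};

/*
 * Chain the actions in array order and return the head (NULL if empty).
 * The list is temporary: because an action may be shared by several
 * filters, the array stays the owner and the chain is rebuilt on demand.
 */
static struct action *exts_to_list(const struct exts *e)
{
    struct action *head = NULL;
    struct action **tail = &head;

    for (size_t i = 0; i < e->nr_actions; i++) {
        struct action *a = e->actions[i];

        a->next = NULL;     /* drop any stale link from an earlier chain */
        *tail = a;
        tail = &a->next;
    }
    return head;
}
```

Keeping pointers in the array and building the list only when a list-based API needs it avoids embedding one permanent list node in an object that several filters reference.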
    • W
      net_sched: move tc offload macros to pkt_cls.h · 2734437e
      WANG Cong committed
      struct tcf_exts belongs to filters and should not be
      visible to plain tc actions.
      
      Cc: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2734437e
    • W
      net_sched: fix a typo in tc_for_each_action() · 0c23c3e7
      WANG Cong committed
      It is harmless because all users pass 'a' to this macro.
      
      Fixes: 00175aec ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
      Cc: Amir Vadai <amir@vadai.me>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c23c3e7
    • W
      net_sched: remove an unnecessary list_del() · 824a7e88
      WANG Cong committed
      This list_del() for tc action is actually not needed,
      because we only use this list to chain bulk operations,
      so it should not be carried over to later operations.
      
      Fixes: ec0595cc ("net_sched: get rid of struct tcf_common")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      824a7e88
    • W
      net_sched: remove the leftover cleanup_a() · f07fed82
      WANG Cong committed
      After refactoring tc_action into tcf_common, we no
      longer need to clean up temporary "actions" in the list;
      they are permanently stored in the hashtable.
      
      Fixes: a85a970a ("net_sched: move tc_action into tcf_common")
      Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f07fed82
    • D
      Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · f4abf05f
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2016-08-16
      
      This series contains fixes to e1000e, igb, ixgbe and i40e.
      
      Kshitiz Gupta provides a fix for igb to resolve the PHY delay compensation
      math in several functions.
      
      Jarod Wilson provides a fix for e1000e which had to be broken up into 2
      patches. The first prepares the driver for expanding the list of NICs
      that have occasional ~10 hour clock jumps when being used for PTP.
      The second patch actually fixes i218 silicon, which has been experiencing
      the clock jumps while using PTP.
      
      Alex provides 2 patches for ixgbe now that he is back at Intel.  First
      fixes setting VLNCTRL.VFE bit, which was left unchanged in earlier patches
      which resulted in disabling VLAN filtering for all the VFs.  Second
      corrects the support for disabling the VLAN tag filtering via the
      feature bit.
      
      Lastly, David fixes i40e which was causing a kernel panic when
      non-contiguous traffic classes or traffic classes not starting with TC0,
      were configured on a link partner switch.  To fix this, changed the
      logic when determining the total number of TCs enabled.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f4abf05f
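The i40e item above only says that the logic for determining the number of enabled TCs changed. One way such a check could work is sketched below. Everything in it is an assumption made for illustration: the function name `count_enabled_tcs`, the 8-entry priority-to-TC table, and the choice to fall back to a single TC when the set is non-contiguous or does not start at TC0. It is not taken from the driver.

```c
#include <stdint.h>

#define MAX_USER_PRIORITY 8   /* assumed table size for this sketch */
#define MAX_TRAFFIC_CLASS 8   /* assumed number of traffic classes */

/*
 * Count the traffic classes referenced by a priority-to-TC table.
 * Returns the number of TCs if they form a contiguous run starting at TC0,
 * otherwise falls back to 1 (an assumed "safe default" for this sketch).
 */
static unsigned int count_enabled_tcs(const uint8_t prio_to_tc[MAX_USER_PRIORITY])
{
    uint8_t enabled = 0;      /* bitmap: bit n set means TC n is in use */
    unsigned int count = 0;
    int gap_seen = 0;

    for (int i = 0; i < MAX_USER_PRIORITY; i++)
        if (prio_to_tc[i] < MAX_TRAFFIC_CLASS)
            enabled |= (uint8_t)(1u << prio_to_tc[i]);

    for (int tc = 0; tc < MAX_TRAFFIC_CLASS; tc++) {
        if (enabled & (1u << tc)) {
            if (gap_seen)
                return 1;     /* non-contiguous, or not starting at TC0 */
            count++;
        } else {
            gap_seen = 1;
        }
    }
    return count ? count : 1;
}
```

Building a bitmap first, rather than taking the highest TC index seen, is what lets the sketch notice gaps such as a table that uses only TC0 and TC2.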
    • D
      Merge branch 'mlxsw-fixes' · 647f28c7
      David S. Miller committed
      Jiri Pirko says:
      
      ====================
      mlxsw: IPv4 UC router fixes
      
      Ido says:
      Patches 1-3 fix a long standing problem in the driver's init sequence,
      which manifests itself quite often when routing daemons try to configure
      an IP address on registered netdevs that don't yet have an associated
      vPort.
      
      Patches 4-9 add missing packet traps for the router to work properly and
      also fix ordering issue following the recent changes to the driver's init
      sequence.
      
      The last patch isn't related to the router, but fixes a general problem
      in which under certain conditions packets aren't trapped to CPU.
      
      v1->v2:
      - Change order of patch 7
      - Add patch 6 following Ilan's comment
      - Add patchset name and cover letter
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      647f28c7
    • I
      mlxsw: spectrum: Allow packets to be trapped from any PG · 9ffcc372
      Ido Schimmel committed
      When packets enter the device they are classified to a priority group
      (PG) buffer based on their PCP value. After their egress port and
      traffic class are determined they are moved to the switch's shared
      buffer and await transmission, if:
      
      (Ingress{Port}.Usage < Thres && Ingress{Port,PG}.Usage < Thres &&
       Egress{Port}.Usage < Thres && Egress{Port,TC}.Usage < Thres)
      ||
      (Ingress{Port}.Usage < Min || Ingress{Port,PG}.Usage < Min ||
       Egress{Port}.Usage < Min || Egress{Port,TC}.Usage < Min)
      
      Packets scheduled for transmission through the CPU port (trapped to CPU)
      use traffic class 7, which has zero maximum and minimum quotas. However,
      when such packets arrive from PG 0 they are admitted to the shared
      buffer as PG 0 has a non-zero minimum quota.
      
      Allow all packets to be trapped to the CPU - regardless of the PG they
      were classified to - by assigning a 10KB minimum quota for CPU port and
      TC7.
      
      Fixes: 8e8dfe9f ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support")
      Reported-by: Tamir Winetroub <tamirw@mellanox.com>
      Tested-by: Tamir Winetroub <tamirw@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ffcc372
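The shared-buffer admission condition quoted in the commit above can be written as a small predicate. The sketch below is a userspace illustration only: the struct layout and the names `pool_state`, `admit_ctx` and `admit` are hypothetical, and the numbers used with it are arbitrary units rather than real device quotas.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-pool counters: current usage, threshold, minimum quota. */
struct pool_state {
    uint32_t usage;
    uint32_t thres;
    uint32_t min;
};

/* The four pools consulted by the admission condition. */
struct admit_ctx {
    struct pool_state ingress_port;
    struct pool_state ingress_port_pg;
    struct pool_state egress_port;
    struct pool_state egress_port_tc;
};

static bool below_thres(const struct pool_state *p) { return p->usage < p->thres; }
static bool below_min(const struct pool_state *p)   { return p->usage < p->min; }

/* Admit if every pool is under its threshold OR any pool is under its minimum. */
static bool admit(const struct admit_ctx *c)
{
    bool all_below_thres = below_thres(&c->ingress_port) &&
                           below_thres(&c->ingress_port_pg) &&
                           below_thres(&c->egress_port) &&
                           below_thres(&c->egress_port_tc);
    bool any_below_min = below_min(&c->ingress_port) ||
                         below_min(&c->ingress_port_pg) ||
                         below_min(&c->egress_port) ||
                         below_min(&c->egress_port_tc);

    return all_below_thres || any_below_min;
}
```

With a zero threshold and zero minimum on the egress TC, the first clause can never hold, so admission depends entirely on some other pool having a non-zero minimum. That is why giving the CPU port and TC7 a non-zero minimum quota lets trapped packets in regardless of their PG.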
    • I
      mlxsw: spectrum: Unmap 802.1Q FID before destroying it · 8168287b
      Ido Schimmel committed
      Before destroying the 802.1Q FID we should first remove the VID-to-FID
      mapping. This makes mlxsw_sp_fid_destroy() symmetric with regard to
      mlxsw_sp_fid_create().
      
      Fixes: 14d39461 ("mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8168287b
    • I
      mlxsw: spectrum: Add missing rollbacks in error path · 0583272d
      Ido Schimmel committed
      While going over the code I noticed we are missing two rollbacks in the
      port's creation error path. Add them and adjust the place of one of them
      in the port's removal sequence so that both are symmetric.
      
      Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0583272d
    • J
      mlxsw: reg: Fix missing op field fill-up · 0e7df1a2
      Jiri Pirko committed
      The RALUE pack function needs to set op; otherwise op is always 0, i.e. add.
      
      Fixes: d5a1c749 ("mlxsw: reg: Add Router Algorithmic LPM Unicast Entry Register definition")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0e7df1a2
    • I
      mlxsw: spectrum: Trap loop-backed packets · a94a614f
      Ido Schimmel committed
      One of the conditions to generate an ICMP Redirect Message is that "the
      packet is being forwarded out the same physical interface that it was
      received from" (RFC 1812).
      
      Therefore, we need to be able to trap such packets and let the kernel
      decide what to do with them.
      
      For each RIF, enable the loop-back filter, which will raise the LBERROR
      trap whenever the ingress RIF equals the egress RIF.
      
      Fixes: 99724c18 ("mlxsw: spectrum: Introduce support for router interfaces")
      Reported-by: Ilan Tayari <ilant@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a94a614f
    • E
      mlxsw: spectrum: Add missing packet traps · c20b8018
      Elad Raz committed
      Add the following traps:
      
      1) MTU Error: Trap packets whose size is bigger than the egress RIF's
      MTU. If DF bit isn't set, traffic will continue to be routed in slow
      path.
      
      2) TTL Error: Trap packets whose TTL expired. This allows traceroute to
      work properly.
      
      3) OSPF packets.
      
      Fixes: 7b27ce7b ("mlxsw: spectrum: Add traps needed for router implementation")
      Signed-off-by: Elad Raz <eladr@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c20b8018
    • I
      mlxsw: spectrum: Mark port as active before registering it · 2f25844c
      Ido Schimmel committed
      Commit bbf2a475 ("mlxsw: spectrum: Initialize ports at the end of
      init sequence") moved ports initialization to the end of the init
      sequence, which means ports are the first to be removed during fini.
      
      Since the FDB delayed work is still active when ports are removed it's
      possible for it to process FDB notifications of inactive ports,
      resulting in a warning message.
      
      Fix that by marking ports as inactive only after unregistering them. The
      NETDEV_UNREGISTER event will invoke bridge's driver port removal
      sequence that will cause the FDB (and FDB notifications) to be flushed.
      
      Fixes: bbf2a475 ("mlxsw: spectrum: Initialize ports at the end of init sequence")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f25844c
    • I
      mlxsw: spectrum: Create PVID vPort before registering netdevice · 05978481
      Ido Schimmel committed
      After registering a netdevice it's possible for user space applications
      to configure an IP address on it. From the driver's perspective, this
      means a router interface (RIF) should be created for the PVID vPort.
      
      Therefore, we must create the PVID vPort before registering the
      netdevice.
      
      Fixes: 99724c18 ("mlxsw: spectrum: Introduce support for router interfaces")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      05978481