1. 22 9月, 2016 3 次提交
  2. 21 9月, 2016 2 次提交
  3. 20 9月, 2016 2 次提交
  4. 19 9月, 2016 1 次提交
  5. 17 9月, 2016 2 次提交
  6. 16 9月, 2016 1 次提交
    • M
      net: VRF: Pass original iif to ip_route_input() · d6f64d72
      Mark Tomlinson 提交于
      The function ip_rcv_finish() calls l3mdev_ip_rcv(). On any VRF except
      the global VRF, this replaces skb->dev with the VRF master interface.
      When calling ip_route_input_noref() from here, the checks for forwarding
      look at this master device instead of the initial ingress interface.
      This will allow packets to be routed which normally would be dropped.
      For example, an interface that is not assigned an IP address should
      drop packets, but because the checking is against the master device, the
      packet will be forwarded.
      
      The fix here is to still call l3mdev_ip_rcv(), but remember the initial
      net_device. This is passed to the other functions within ip_rcv_finish,
      so they still see the original interface.
      Signed-off-by: NMark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
      Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6f64d72
  7. 15 9月, 2016 1 次提交
    • J
      mac80211: reject TSPEC TIDs (TSIDs) for aggregation · 85d5313e
      Johannes Berg 提交于
      Since mac80211 doesn't currently support TSIDs 8-15 which can
      only be used after QoS TSPEC negotiation (and not even after
      WMM negotiation), reject attempts to set up aggregation
      sessions for them, which might confuse drivers. In mac80211
      we do correctly handle that, but the TSIDs should never get
      used anyway, and drivers might not be able to handle it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      85d5313e
  8. 14 9月, 2016 2 次提交
  9. 13 9月, 2016 4 次提交
    • X
      sctp: hold the transport before using it in sctp_hash_cmp · 715f5552
      Xin Long 提交于
      Since commit 4f008781 ("sctp: apply rhashtable api to send/recv
      path"), sctp uses transport rhashtable with .obj_cmpfn sctp_hash_cmp,
      in which it compares the members of the transport with the rhashtable
      args to check if it's the right transport.
      
      But sctp uses the transport without holding it in sctp_hash_cmp, it can
      cause a use-after-free panic. As after it gets transport from hashtable,
      another CPU may close the sk and free the asoc. In sctp_association_free,
      it frees all the transports, meanwhile, the assoc's refcnt may be reduced
      to 0, assoc can be destroyed by sctp_association_destroy.
      
      So after that, transport->assoc is actually an unavailable memory address
      in sctp_hash_cmp. Although sctp_hash_cmp is under rcu_read_lock, it still
      can not avoid this, as assoc is not freed by RCU.
      
      This patch is to hold the transport before checking it's members with
      sctp_transport_hold, in which it checks the refcnt first, holds it if
      it's not 0.
      
      Fixes: 4f008781 ("sctp: apply rhashtable api to send/recv path")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      715f5552
    • G
      netfilter: synproxy: Check oom when adding synproxy and seqadj ct extensions · 4440a2ab
      Gao Feng 提交于
      When memory is exhausted, nfct_seqadj_ext_add may fail to add the
      synproxy and seqadj extensions. The function nf_ct_seqadj_init doesn't
      check if get valid seqadj pointer by the nfct_seqadj.
      
      Now drop the packet directly when fail to add seqadj extension to
      avoid dereference NULL pointer in nf_ct_seqadj_init from
      init_conntrack().
      Signed-off-by: NGao Feng <fgao@ikuai8.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      4440a2ab
    • C
      svcauth_gss: Revert 64c59a37 ("Remove unnecessary allocation") · bf2c4b6f
      Chuck Lever 提交于
      rsc_lookup steals the passed-in memory to avoid doing an allocation of
      its own, so we can't just pass in a pointer to memory that someone else
      is using.
      
      If we really want to avoid allocation there then maybe we should
      preallocate somwhere, or reference count these handles.
      
      For now we should revert.
      
      On occasion I see this on my server:
      
      kernel: kernel BUG at /home/cel/src/linux/linux-2.6/mm/slub.c:3851!
      kernel: invalid opcode: 0000 [#1] SMP
      kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd btrfs xor iTCO_wdt iTCO_vendor_support raid6_pq pcspkr i2c_i801 i2c_smbus lpc_ich mfd_core mei_me sg mei shpchp wmi ioatdma ipmi_si ipmi_msghandler acpi_pad acpi_power_meter rpcrdma ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb mlx4_core ahci libahci libata ptp pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
      kernel: CPU: 7 PID: 145 Comm: kworker/7:2 Not tainted 4.8.0-rc4-00006-g9d06b0b #15
      kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
      kernel: Workqueue: events do_cache_clean [sunrpc]
      kernel: task: ffff8808541d8000 task.stack: ffff880854344000
      kernel: RIP: 0010:[<ffffffff811e7075>]  [<ffffffff811e7075>] kfree+0x155/0x180
      kernel: RSP: 0018:ffff880854347d70  EFLAGS: 00010246
      kernel: RAX: ffffea0020fe7660 RBX: ffff88083f9db064 RCX: 146ff0f9d5ec5600
      kernel: RDX: 000077ff80000000 RSI: ffff880853f01500 RDI: ffff88083f9db064
      kernel: RBP: ffff880854347d88 R08: ffff8808594ee000 R09: ffff88087fdd8780
      kernel: R10: 0000000000000000 R11: ffffea0020fe76c0 R12: ffff880853f01500
      kernel: R13: ffffffffa013cf76 R14: ffffffffa013cff0 R15: ffffffffa04253a0
      kernel: FS:  0000000000000000(0000) GS:ffff88087fdc0000(0000) knlGS:0000000000000000
      kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      kernel: CR2: 00007fed60b020c3 CR3: 0000000001c06000 CR4: 00000000001406e0
      kernel: Stack:
      kernel: ffff8808589f2f00 ffff880853f01500 0000000000000001 ffff880854347da0
      kernel: ffffffffa013cf76 ffff8808589f2f00 ffff880854347db8 ffffffffa013d006
      kernel: ffff8808589f2f20 ffff880854347e00 ffffffffa0406f60 0000000057c7044f
      kernel: Call Trace:
      kernel: [<ffffffffa013cf76>] rsc_free+0x16/0x90 [auth_rpcgss]
      kernel: [<ffffffffa013d006>] rsc_put+0x16/0x30 [auth_rpcgss]
      kernel: [<ffffffffa0406f60>] cache_clean+0x2e0/0x300 [sunrpc]
      kernel: [<ffffffffa04073ee>] do_cache_clean+0xe/0x70 [sunrpc]
      kernel: [<ffffffff8109a70f>] process_one_work+0x1ff/0x3b0
      kernel: [<ffffffff8109b15c>] worker_thread+0x2bc/0x4a0
      kernel: [<ffffffff8109aea0>] ? rescuer_thread+0x3a0/0x3a0
      kernel: [<ffffffff810a0ba4>] kthread+0xe4/0xf0
      kernel: [<ffffffff8169c47f>] ret_from_fork+0x1f/0x40
      kernel: [<ffffffff810a0ac0>] ? kthread_stop+0x110/0x110
      kernel: Code: f7 ff ff eb 3b 65 8b 05 da 30 e2 7e 89 c0 48 0f a3 05 a0 38 b8 00 0f 92 c0 84 c0 0f 85 d1 fe ff ff 0f 1f 44 00 00 e9 f5 fe ff ff <0f> 0b 49 8b 03 31 f6 f6 c4 40 0f 85 62 ff ff ff e9 61 ff ff ff
      kernel: RIP  [<ffffffff811e7075>] kfree+0x155/0x180
      kernel: RSP <ffff880854347d70>
      kernel: ---[ end trace 3fdec044969def26 ]---
      
      It seems to be most common after a server reboot where a client has been
      using a Kerberos mount, and reconnects to continue its workload.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      bf2c4b6f
    • P
      netfilter: nf_nat: handle NF_DROP from nfnetlink_parse_nat_setup() · ecfcdfec
      Pablo Neira Ayuso 提交于
      nf_nat_setup_info() returns NF_* verdicts, so convert them to error
      codes that is what ctnelink expects. This has passed overlook without
      having any impact since this nf_nat_setup_info() has always returned
      NF_ACCEPT so far. Since 870190a9 ("netfilter: nat: convert nat bysrc
      hash to rhashtable"), this is problem.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      ecfcdfec
  10. 12 9月, 2016 3 次提交
  11. 10 9月, 2016 1 次提交
    • M
      sctp: identify chunks that need to be fragmented at IP level · 7303a147
      Marcelo Ricardo Leitner 提交于
      Previously, without GSO, it was easy to identify it: if the chunk didn't
      fit and there was no data chunk in the packet yet, we could fragment at
      IP level. So if there was an auth chunk and we were bundling a big data
      chunk, it would fragment regardless of the size of the auth chunk. This
      also works for the context of PMTU reductions.
      
      But with GSO, we cannot distinguish such PMTU events anymore, as the
      packet is allowed to exceed PMTU.
      
      So we need another check: to ensure that the chunk that we are adding,
      actually fits the current PMTU. If it doesn't, trigger a flush and let
      it be fragmented at IP level in the next round.
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7303a147
  12. 09 9月, 2016 4 次提交
  13. 07 9月, 2016 6 次提交
    • W
      ipv6: addrconf: fix dev refcont leak when DAD failed · 751eb6b6
      Wei Yongjun 提交于
      In general, when DAD detected IPv6 duplicate address, ifp->state
      will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
      delayed work, the call tree should be like this:
      
      ndisc_recv_ns
        -> addrconf_dad_failure        <- missing ifp put
           -> addrconf_mod_dad_work
             -> schedule addrconf_dad_work()
               -> addrconf_dad_stop()  <- missing ifp hold before call it
      
      addrconf_dad_failure() called with ifp refcont holding but not put.
      addrconf_dad_work() call addrconf_dad_stop() without extra holding
      refcount. This will not cause any issue normally.
      
      But the race between addrconf_dad_failure() and addrconf_dad_work()
      may cause ifp refcount leak and netdevice can not be unregister,
      dmesg show the following messages:
      
      IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
      ...
      unregister_netdevice: waiting for eth0 to become free. Usage count = 1
      
      Cc: stable@vger.kernel.org
      Fixes: c15b1cca ("ipv6: move DAD and addrconf_verify processing
      to workqueue")
      Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      751eb6b6
    • M
      net: Don't delete routes in different VRFs · 5a56a0b3
      Mark Tomlinson 提交于
      When deleting an IP address from an interface, there is a clean-up of
      routes which refer to this local address. However, there was no check to
      see that the VRF matched. This meant that deletion wasn't confined to
      the VRF it should have been.
      
      To solve this, a new field has been added to fib_info to hold a table
      id. When removing fib entries corresponding to a local ip address, this
      table id is also used in the comparison.
      
      The table id is populated when the fib_info is created. This was already
      done in some places, but not in ip_rt_ioctl(). This has now been fixed.
      
      Fixes: 021dd3b8 ("net: Add routes to the table associated with the device")
      Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Tested-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NMark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a56a0b3
    • C
      xprtrdma: Fix receive buffer accounting · 05c97466
      Chuck Lever 提交于
      An RPC can terminate before its reply arrives, if a credential
      problem or a soft timeout occurs. After this happens, xprtrdma
      reports it is out of Receive buffers.
      
      A Receive buffer is posted before each RPC is sent, and returned to
      the buffer pool when a reply is received. If no reply is received
      for an RPC, that Receive buffer remains posted. But xprtrdma tries
      to post another when the next RPC is sent.
      
      If this happens a few dozen times, there are no receive buffers left
      to be posted at send time. I don't see a way for a transport
      connection to recover at that point, and it will spit warnings and
      unnecessarily delay RPCs on occasion for its remaining lifetime.
      
      Commit 1e465fd4 ("xprtrdma: Replace send and receive arrays")
      removed a little bit of logic to detect this case and not provide
      a Receive buffer so no more buffers are posted, and then transport
      operation continues correctly. We didn't understand what that logic
      did, and it wasn't commented, so it was removed as part of the
      overhaul to support backchannel requests.
      
      Restore it, but be wary of the need to keep extra Receives posted
      to deal with backchannel requests.
      
      Fixes: 1e465fd4 ("xprtrdma: Replace send and receive arrays")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Reviewed-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      05c97466
    • C
      xprtrdma: Revert 3d4cf35b ("xprtrdma: Reply buffer exhaustion...") · 78d506e1
      Chuck Lever 提交于
      Receive buffer exhaustion, if it were to actually occur, would be
      catastrophic. However, when there are no reply buffers to post, that
      means all of them have already been posted and are waiting for
      incoming replies. By design, there can never be more RPCs in flight
      than there are available receive buffers.
      
      A receive buffer can be left posted after an RPC exits without a
      received reply; say, due to a credential problem or a soft timeout.
      This does not result in fewer posted receive buffers than there are
      pending RPCs, and there is already logic in xprtrdma to deal
      appropriately with this case.
      
      It also looks like the "+ 2" that was removed was accidentally
      accommodating the number of extra receive buffers needed for
      receiving backchannel requests. That will need to be addressed by
      another patch.
      
      Fixes: 3d4cf35b ("xprtrdma: Reply buffer exhaustion can be...")
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Reviewed-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      78d506e1
    • D
      ipv6: release dst in ping_v6_sendmsg · 03c2778a
      Dave Jones 提交于
      Neither the failure or success paths of ping_v6_sendmsg release
      the dst it acquires.  This leads to a flood of warnings from
      "net/core/dst.c:288 dst_release" on older kernels that
      don't have 8bf4ada2 backported.
      
      That patch optimistically hoped this had been fixed post 3.10, but
      it seems at least one case wasn't, where I've seen this triggered
      a lot from machines doing unprivileged icmp sockets.
      
      Cc: Martin Lau <kafai@fb.com>
      Signed-off-by: NDave Jones <davej@codemonkey.org.uk>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      03c2778a
    • L
      netfilter: nft_chain_route: re-route before skb is queued to userspace · d1a6cba5
      Liping Zhang 提交于
      Imagine such situation, user add the following nft rules, and queue
      the packets to userspace for further check:
        # ip rule add fwmark 0x0/0x1 lookup eth0
        # ip rule add fwmark 0x1/0x1 lookup eth1
        # nft add table filter
        # nft add chain filter output {type route hook output priority 0 \;}
        # nft add rule filter output mark set 0x1
        # nft add rule filter output queue num 0
      
      But after we reinject the skbuff, the packet will be sent via the
      wrong route, i.e. in this case, the packet will be routed via eth0
      table, not eth1 table. Because we skip to do re-route when verdict
      is NF_QUEUE, even if the mark was changed.
      
      Acctually, we should not touch sk_buff if verdict is NF_DROP or
      NF_STOLEN, and when re-route fails, return NF_DROP with error code.
      This is consistent with the mangle table in iptables.
      Signed-off-by: NLiping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      d1a6cba5
  14. 06 9月, 2016 1 次提交
  15. 05 9月, 2016 3 次提交
  16. 03 9月, 2016 2 次提交
    • P
      sunrpc: fix UDP memory accounting · a41bd25a
      Paolo Abeni 提交于
      The commit f9b2ee71 ("SUNRPC: Move UDP receive data path
      into a workqueue context"), as a side effect, moved the
      skb_free_datagram() call outside the scope of the related socket
      lock, but UDP sockets require such lock to be held for proper
      memory accounting.
      Fix it by replacing skb_free_datagram() with
      skb_free_datagram_locked().
      
      Fixes: f9b2ee71 ("SUNRPC: Move UDP receive data path into a workqueue context")
      Reported-and-tested-by: NJan Stancek <jstancek@redhat.com>
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Cc: stable@vger.kernel.org # 4.4+
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      a41bd25a
    • S
      l2tp: fix use-after-free during module unload · 2f86953e
      Sabrina Dubroca 提交于
      Tunnel deletion is delayed by both a workqueue (l2tp_tunnel_delete -> wq
       -> l2tp_tunnel_del_work) and RCU (sk_destruct -> RCU ->
      l2tp_tunnel_destruct).
      
      By the time l2tp_tunnel_destruct() runs to destroy the tunnel and finish
      destroying the socket, the private data reserved via the net_generic
      mechanism has already been freed, but l2tp_tunnel_destruct() actually
      uses this data.
      
      Make sure tunnel deletion for the netns has completed before returning
      from l2tp_exit_net() by first flushing the tunnel removal workqueue, and
      then waiting for RCU callbacks to complete.
      
      Fixes: 167eb17e ("l2tp: create tunnel sockets in the right namespace")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2f86953e
  17. 02 9月, 2016 2 次提交