1. 25 1月, 2015 1 次提交
  2. 15 1月, 2015 2 次提交
  3. 14 1月, 2015 1 次提交
    • J
      neighbour: fix base_reachable_time(_ms) not effective immediatly when changed · 4bf6980d
      Jean-Francois Remy 提交于
      When setting base_reachable_time or base_reachable_time_ms on a
      specific interface through sysctl or netlink, the reachable_time
      value is not updated.
      
      This means that neighbour entries will continue to be updated using the
      old value until it is recomputed in neigh_period_work (which
          recomputes the value every 300*HZ).
      On systems with HZ equal to 1000 for instance, it means 5mins before
      the change is effective.
      
      This patch changes this behavior by recomputing reachable_time after
      each set on base_reachable_time or base_reachable_time_ms.
      The new value will become effective the next time the neighbour's timer
      is triggered.
      
      Changes are made in two places: the netlink code for set and the sysctl
      handling code. For sysctl, I use a proc_handler. The ipv6 network
      code does provide its own handler but it already refreshes
      reachable_time correctly so it's not an issue.
      Any other user of neighbour which provide its own handlers must
      refresh reachable_time.
      Signed-off-by: NJean-Francois Remy <jeff@melix.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4bf6980d
  4. 13 1月, 2015 1 次提交
    • J
      tipc: fix bug in broadcast retransmit code · 16416779
      Jon Paul Maloy 提交于
      In commit 58dc55f2 ("tipc: use generic
      SKB list APIs to manage link transmission queue") we replace all list
      traversal loops with the macros skb_queue_walk() or
      skb_queue_walk_safe(). While the previous loops were based on the
      assumption that the list was NULL-terminated, the standard macros
      stop when the iterator reaches the list head, which is non-NULL.
      
      In the function bclink_retransmit_pkt() this macro replacement has
      lead to a bug. When we receive a BCAST STATE_MSG we unconditionally
      call the function bclink_retransmit_pkt(), whether there really is
      anything to retransmit or not, assuming that the sequence number
      comparisons will lead to the correct behavior. However, if the
      transmission queue is empty, or if there are no eligible buffers in
      the transmission queue, we will by mistake pass the list head pointer
      to the function tipc_link_retransmit(). Since the list head is not a
      valid sk_buff, this leads to a crash.
      
      In this commit we fix this by only calling tipc_link_retransmit()
      if we actually found eligible buffers in the transmission queue.
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      16416779
  5. 12 1月, 2015 1 次提交
  6. 09 1月, 2015 1 次提交
  7. 08 1月, 2015 1 次提交
  8. 07 1月, 2015 4 次提交
    • P
      netfilter: nf_tables: fix flush ruleset chain dependencies · a2f18db0
      Pablo Neira Ayuso 提交于
      Jumping between chains doesn't mix well with flush ruleset. Rules
      from a different chain and set elements may still refer to us.
      
      [  353.373791] ------------[ cut here ]------------
      [  353.373845] kernel BUG at net/netfilter/nf_tables_api.c:1159!
      [  353.373896] invalid opcode: 0000 [#1] SMP
      [  353.373942] Modules linked in: intel_powerclamp uas iwldvm iwlwifi
      [  353.374017] CPU: 0 PID: 6445 Comm: 31c3.nft Not tainted 3.18.0 #98
      [  353.374069] Hardware name: LENOVO 5129CTO/5129CTO, BIOS 6QET47WW (1.17 ) 07/14/2010
      [...]
      [  353.375018] Call Trace:
      [  353.375046]  [<ffffffff81964c31>] ? nf_tables_commit+0x381/0x540
      [  353.375101]  [<ffffffff81949118>] nfnetlink_rcv+0x3d8/0x4b0
      [  353.375150]  [<ffffffff81943fc5>] netlink_unicast+0x105/0x1a0
      [  353.375200]  [<ffffffff8194438e>] netlink_sendmsg+0x32e/0x790
      [  353.375253]  [<ffffffff818f398e>] sock_sendmsg+0x8e/0xc0
      [  353.375300]  [<ffffffff818f36b9>] ? move_addr_to_kernel.part.20+0x19/0x70
      [  353.375357]  [<ffffffff818f44f9>] ? move_addr_to_kernel+0x19/0x30
      [  353.375410]  [<ffffffff819016d2>] ? verify_iovec+0x42/0xd0
      [  353.375459]  [<ffffffff818f3e10>] ___sys_sendmsg+0x3f0/0x400
      [  353.375510]  [<ffffffff810615fa>] ? native_sched_clock+0x2a/0x90
      [  353.375563]  [<ffffffff81176697>] ? acct_account_cputime+0x17/0x20
      [  353.375616]  [<ffffffff8110dc78>] ? account_user_time+0x88/0xa0
      [  353.375667]  [<ffffffff818f4bbd>] __sys_sendmsg+0x3d/0x80
      [  353.375719]  [<ffffffff81b184f4>] ? int_check_syscall_exit_work+0x34/0x3d
      [  353.375776]  [<ffffffff818f4c0d>] SyS_sendmsg+0xd/0x20
      [  353.375823]  [<ffffffff81b1826d>] system_call_fastpath+0x16/0x1b
      
      Release objects in this order: rules -> sets -> chains -> tables, to
      make sure no references to chains are held anymore.
      Reported-by: NAsbjoern Sloth Toennesen <asbjorn@asbjorn.biz>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      a2f18db0
    • P
      netfilter: nfnetlink: relax strict multicast group check from netlink_bind · 62924af2
      Pablo Neira Ayuso 提交于
      Relax the checking that was introduced in 97840cb6 ("netfilter:
      nfnetlink: fix insufficient validation in nfnetlink_bind") when the
      subscription bitmask is used. Existing userspace code code may request
      to listen to all of the existing netlink groups by setting an all to one
      subscription group bitmask. Netlink already validates subscription via
      setsockopt() for us.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      62924af2
    • P
      netfilter: nfnetlink: validate nfnetlink header from batch · 9ea2aa8b
      Pablo Neira Ayuso 提交于
      Make sure there is enough room for the nfnetlink header in the
      netlink messages that are part of the batch. There is a similar
      check in netlink_rcv_skb().
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      9ea2aa8b
    • P
      netfilter: conntrack: fix race between confirmation and flush · 8ca3f5e9
      Pablo Neira Ayuso 提交于
      Commit 5195c14c ("netfilter: conntrack: fix race in
      __nf_conntrack_confirm against get_next_corpse") aimed to resolve the
      race condition between the confirmation (packet path) and the flush
      command (from control plane). However, it introduced a crash when
      several packets race to add a new conntrack, which seems easier to
      reproduce when nf_queue is in place.
      
      Fix this race, in __nf_conntrack_confirm(), by removing the CT
      from unconfirmed list before checking the DYING bit. In case
      race occured, re-add the CT to the dying list
      
      This patch also changes the verdict from NF_ACCEPT to NF_DROP when
      we lose race. Basically, the confirmation happens for the first packet
      that we see in a flow. If you just invoked conntrack -F once (which
      should be the common case), then this is likely to be the first packet
      of the flow (unless you already called flush anytime soon in the past).
      This should be hard to trigger, but better drop this packet, otherwise
      we leave things in inconsistent state since the destination will likely
      reply to this packet, but it will find no conntrack, unless the origin
      retransmits.
      
      The change of the verdict has been discussed in:
      https://www.marc.info/?l=linux-netdev&m=141588039530056&w=2Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      8ca3f5e9
  9. 06 1月, 2015 6 次提交
  10. 05 1月, 2015 1 次提交
  11. 03 1月, 2015 2 次提交
  12. 31 12月, 2014 1 次提交
  13. 30 12月, 2014 1 次提交
  14. 27 12月, 2014 8 次提交
    • J
      netlink/genetlink: pass network namespace to bind/unbind · 023e2cfa
      Johannes Berg 提交于
      Netlink families can exist in multiple namespaces, and for the most
      part multicast subscriptions are per network namespace. Thus it only
      makes sense to have bind/unbind notifications per network namespace.
      
      To achieve this, pass the network namespace of a given client socket
      to the bind/unbind functions.
      
      Also do this in generic netlink, and there also make sure that any
      bind for multicast groups that only exist in init_net is rejected.
      This isn't really a problem if it is accepted since a client in a
      different namespace will never receive any notifications from such
      a group, but it can confuse the family if not rejected (it's also
      possible to silently (without telling the family) accept it, but it
      would also have to be ignored on unbind so families that take any
      kind of action on bind/unbind won't do unnecessary work for invalid
      clients like that.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      023e2cfa
    • J
      genetlink: pass multicast bind/unbind to families · c380d9a7
      Johannes Berg 提交于
      In order to make the newly fixed multicast bind/unbind
      functionality in generic netlink, pass them down to the
      appropriate family.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c380d9a7
    • J
      netlink: call unbind when releasing socket · 7d68536b
      Johannes Berg 提交于
      Currently, netlink_unbind() is only called when the socket
      explicitly unbinds, which limits its usefulness (luckily
      there are no users of it yet anyway.)
      
      Call netlink_unbind() also when a socket is released, so it
      becomes possible to track listeners with this callback and
      without also implementing a netlink notifier (and checking
      netlink_has_listeners() in there.)
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d68536b
    • J
      netlink: update listeners directly when removing socket · b10dcb3b
      Johannes Berg 提交于
      The code is now confusing to read - first in one function down
      (netlink_remove) any group subscriptions are implicitly removed
      by calling __sk_del_bind_node(), but the subscriber database is
      only updated far later by calling netlink_update_listeners().
      
      Move the latter call to just after removal from the list so it
      is easier to follow the code.
      
      This also enables moving the locking inside the kernel-socket
      conditional, which improves the normal socket destruction path.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b10dcb3b
    • J
      genetlink: pass only network namespace to genl_has_listeners() · f8403a2e
      Johannes Berg 提交于
      There's no point to force the caller to know about the internal
      genl_sock to use inside struct net, just have them pass the network
      namespace. This doesn't really change code generation since it's
      an inline, but makes the caller less magic - there's never any
      reason to pass another socket.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f8403a2e
    • J
      netlink: rename netlink_unbind() to netlink_undo_bind() · 02c81ab9
      Johannes Berg 提交于
      The new name is more expressive - this isn't a generic unbind
      function but rather only a little undo helper for use only in
      netlink_bind().
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02c81ab9
    • J
      net: Generalize ndo_gso_check to ndo_features_check · 5f35227e
      Jesse Gross 提交于
      GSO isn't the only offload feature with restrictions that
      potentially can't be expressed with the current features mechanism.
      Checksum is another although it's a general issue that could in
      theory apply to anything. Even if it may be possible to
      implement these restrictions in other ways, it can result in
      duplicate code or inefficient per-packet behavior.
      
      This generalizes ndo_gso_check so that drivers can remove any
      features that don't make sense for a given packet, similar to
      netif_skb_features(). It also converts existing driver
      restrictions to the new format, completing the work that was
      done to support tunnel protocols since the issues apply to
      checksums as well.
      
      By actually removing features from the set that are used to do
      offloading, it solves another problem with the existing
      interface. In these cases, GSO would run with the original set
      of features and not do anything because it appears that
      segmentation is not required.
      
      CC: Tom Herbert <therbert@google.com>
      CC: Joe Stringer <joestringer@nicira.com>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: NJesse Gross <jesse@nicira.com>
      Acked-by: NTom Herbert <therbert@google.com>
      Fixes: 04ffcb25 ("net: Add ndo_gso_check")
      Tested-by: NHayes Wang <hayeswang@realtek.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5f35227e
    • J
      net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding · 2c26d34b
      Jay Vosburgh 提交于
      When using VXLAN tunnels and a sky2 device, I have experienced
      checksum failures of the following type:
      
      [ 4297.761899] eth0: hw csum failure
      [...]
      [ 4297.765223] Call Trace:
      [ 4297.765224]  <IRQ>  [<ffffffff8172f026>] dump_stack+0x46/0x58
      [ 4297.765235]  [<ffffffff8162ba52>] netdev_rx_csum_fault+0x42/0x50
      [ 4297.765238]  [<ffffffff8161c1a0>] ? skb_push+0x40/0x40
      [ 4297.765240]  [<ffffffff8162325c>] __skb_checksum_complete+0xbc/0xd0
      [ 4297.765243]  [<ffffffff8168c602>] tcp_v4_rcv+0x2e2/0x950
      [ 4297.765246]  [<ffffffff81666ca0>] ? ip_rcv_finish+0x360/0x360
      
      	These are reliably reproduced in a network topology of:
      
      container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch
      
      	When VXLAN encapsulated traffic is received from a similarly
      configured peer, the above warning is generated in the receive
      processing of the encapsulated packet.  Note that the warning is
      associated with the container eth0.
      
              The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
      because the packet is an encapsulated Ethernet frame, the checksum
      generated by the hardware includes the inner protocol and Ethernet
      headers.
      
      	The receive code is careful to update the skb->csum, except in
      __dev_forward_skb, as called by dev_forward_skb.  __dev_forward_skb
      calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
      to skip over the Ethernet header, but does not update skb->csum when
      doing so.
      
      	This patch resolves the problem by adding a call to
      skb_postpull_rcsum to update the skb->csum after the call to
      eth_type_trans.
      Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c26d34b
  15. 25 12月, 2014 3 次提交
  16. 24 12月, 2014 6 次提交