1. 10 2月, 2017 1 次提交
    • J
      openvswitch: Add original direction conntrack tuple to sw_flow_key. · 9dd7f890
      Jarno Rajahalme 提交于
      Add the fields of the conntrack original direction 5-tuple to struct
      sw_flow_key.  The new fields are initially marked as non-existent, and
      are populated whenever a conntrack action is executed and either finds
      or generates a conntrack entry.  This means that these fields exist
      for all packets that were not rejected by conntrack as untrackable.
      
      The original tuple fields in the sw_flow_key are filled from the
      original direction tuple of the conntrack entry relating to the
      current packet, or from the original direction tuple of the master
      conntrack entry, if the current conntrack entry has a master.
      Generally, expected connections of connections having an assigned
      helper (e.g., FTP), have a master conntrack entry.
      
      The main purpose of the new conntrack original tuple fields is to
      allow matching on them for policy decision purposes, with the premise
      that the admissibility of tracked connections reply packets (as well
      as original direction packets), and both direction packets of any
      related connections may be based on ACL rules applying to the master
      connection's original direction 5-tuple.  This also makes it easier to
      make policy decisions when the actual packet headers might have been
      transformed by NAT, as the original direction 5-tuple represents the
      packet headers before any such transformation.
      
      When using the original direction 5-tuple the admissibility of return
      and/or related packets need not be based on the mere existence of a
      conntrack entry, allowing separation of admission policy from the
      established conntrack state.  While existence of a conntrack entry is
      required for admission of the return or related packets, policy
      changes can render connections that were initially admitted to be
      rejected or dropped afterwards.  If the admission of the return and
      related packets was based on mere conntrack state (e.g., connection
      being in an established state), a policy change that would make the
      connection rejected or dropped would need to find and delete all
      conntrack entries affected by such a change.  When using the original
      direction 5-tuple matching the affected conntrack entries can be
      allowed to time out instead, as the established state of the
      connection would not need to be the basis for packet admission any
      more.
      
      It should be noted that the directionality of related connections may
      be the same or different than that of the master connection, and
      neither the original direction 5-tuple nor the conntrack state bits
      carry this information.  If needed, the directionality of the master
      connection can be stored in master's conntrack mark or labels, which
      are automatically inherited by the expected related connections.
      
      The fact that neither ARP nor ND packets are trackable by conntrack
      allows mutual exclusion between ARP/ND and the new conntrack original
      tuple fields.  Hence, the IP addresses are overlaid in union with ARP
      and ND fields.  This allows the sw_flow_key to not grow much due to
      this patch, but it also means that we must be careful to never use the
      new key fields with ARP or ND packets.  ARP is easy to distinguish and
      keep mutually exclusive based on the ethernet type, but ND being an
      ICMPv6 protocol requires a bit more attention.
      Signed-off-by: NJarno Rajahalme <jarno@ovn.org>
      Acked-by: NJoe Stringer <joe@ovn.org>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9dd7f890
  2. 28 12月, 2016 1 次提交
    • P
      openvswitch: upcall: Fix vlan handling. · df30f740
      pravin shelar 提交于
      Networking stack accelerate vlan tag handling by
      keeping topmost vlan header in skb. This works as
      long as packet remains in OVS datapath. But during
      OVS upcall vlan header is pushed on to the packet.
      When such packet is sent back to OVS datapath, core
      networking stack might not handle it correctly. Following
      patch avoids this issue by accelerating the vlan tag
      during flow key extract. This simplifies datapath by
      bringing uniform packet processing for packets from
      all code paths.
      
      Fixes: 5108bbad ("openvswitch: add processing of L3 packets").
      CC: Jarno Rajahalme <jarno@ovn.org>
      CC: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      df30f740
  3. 13 11月, 2016 2 次提交
    • J
      openvswitch: add processing of L3 packets · 5108bbad
      Jiri Benc 提交于
      Support receiving, extracting flow key and sending of L3 packets (packets
      without an Ethernet header).
      
      Note that even after this patch, non-Ethernet interfaces are still not
      allowed to be added to bridges. Similarly, netlink interface for sending and
      receiving L3 packets to/from user space is not in place yet.
      
      Based on previous versions by Lorand Jakab and Simon Horman.
      Signed-off-by: NLorand Jakab <lojakab@cisco.com>
      Signed-off-by: NSimon Horman <simon.horman@netronome.com>
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5108bbad
    • J
      openvswitch: add mac_proto field to the flow key · 329f45bc
      Jiri Benc 提交于
      Use a hole in the structure. We support only Ethernet so far and will add
      a support for L2-less packets shortly. We could use a bool to indicate
      whether the Ethernet header is present or not but the approach with the
      mac_proto field is more generic and occupies the same number of bytes in the
      struct, while allowing later extensibility. It also makes the code in the
      next patches more self explaining.
      
      It would be nice to use ARPHRD_ constants but those are u16 which would be
      waste. Thus define our own constants.
      
      Another upside of this is that we can overload this new field to also denote
      whether the flow key is valid. This has the advantage that on
      refragmentation, we don't have to reparse the packet but can rely on the
      stored eth.type. This is especially important for the next patches in this
      series - instead of adding another branch for L2-less packets before calling
      ovs_fragment, we can just remove all those branches completely.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      329f45bc
  4. 13 10月, 2016 1 次提交
  5. 03 10月, 2016 1 次提交
  6. 21 9月, 2016 1 次提交
  7. 19 9月, 2016 2 次提交
  8. 09 9月, 2016 1 次提交
  9. 07 10月, 2015 1 次提交
  10. 01 9月, 2015 1 次提交
  11. 30 8月, 2015 3 次提交
  12. 28 8月, 2015 2 次提交
    • J
      openvswitch: Allow matching on conntrack label · c2ac6673
      Joe Stringer 提交于
      Allow matching and setting the ct_label field. As with ct_mark, this is
      populated by executing the CT action. The label field may be modified by
      specifying a label and mask nested under the CT action. It is stored as
      metadata attached to the connection. Label modification occurs after
      lookup, and will only persist when the conntrack entry is committed by
      providing the COMMIT flag to the CT action. Labels are currently fixed
      to 128 bits in size.
      Signed-off-by: NJoe Stringer <joestringer@nicira.com>
      Acked-by: NThomas Graf <tgraf@suug.ch>
      Acked-by: NPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c2ac6673
    • J
      openvswitch: Add conntrack action · 7f8a436e
      Joe Stringer 提交于
      Expose the kernel connection tracker via OVS. Userspace components can
      make use of the CT action to populate the connection state (ct_state)
      field for a flow. This state can be subsequently matched.
      
      Exposed connection states are OVS_CS_F_*:
      - NEW (0x01) - Beginning of a new connection.
      - ESTABLISHED (0x02) - Part of an existing connection.
      - RELATED (0x04) - Related to an established connection.
      - INVALID (0x20) - Could not track the connection for this packet.
      - REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
      - TRACKED (0x80) - This packet has been sent through conntrack.
      
      When the CT action is executed by itself, it will send the packet
      through the connection tracker and populate the ct_state field with one
      or more of the connection state flags above. The CT action will always
      set the TRACKED bit.
      
      When the COMMIT flag is passed to the conntrack action, this specifies
      that information about the connection should be stored. This allows
      subsequent packets for the same (or related) connections to be
      correlated with this connection. Sending subsequent packets for the
      connection through conntrack allows the connection tracker to consider
      the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.
      
      The CT action may optionally take a zone to track the flow within. This
      allows connections with the same 5-tuple to be kept logically separate
      from connections in other zones. If the zone is specified, then the
      "ct_zone" match field will be subsequently populated with the zone id.
      
      IP fragments are handled by transparently assembling them as part of the
      CT action. The maximum received unit (MRU) size is tracked so that
      refragmentation can occur during output.
      
      IP frag handling contributed by Andy Zhou.
      
      Based on original design by Justin Pettit.
      Signed-off-by: NJoe Stringer <joestringer@nicira.com>
      Signed-off-by: NJustin Pettit <jpettit@nicira.com>
      Signed-off-by: NAndy Zhou <azhou@nicira.com>
      Acked-by: NThomas Graf <tgraf@suug.ch>
      Acked-by: NPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f8a436e
  13. 22 7月, 2015 1 次提交
  14. 06 5月, 2015 1 次提交
  15. 15 4月, 2015 1 次提交
    • D
      mm: remove GFP_THISNODE · 4167e9b2
      David Rientjes 提交于
      NOTE: this is not about __GFP_THISNODE, this is only about GFP_THISNODE.
      
      GFP_THISNODE is a secret combination of gfp bits that have different
      behavior than expected.  It is a combination of __GFP_THISNODE,
      __GFP_NORETRY, and __GFP_NOWARN and is special-cased in the page
      allocator slowpath to fail without trying reclaim even though it may be
      used in combination with __GFP_WAIT.
      
      An example of the problem this creates: commit e97ca8e5 ("mm: fix
      GFP_THISNODE callers and clarify") fixed up many users of GFP_THISNODE
      that really just wanted __GFP_THISNODE.  The problem doesn't end there,
      however, because even it was a no-op for alloc_misplaced_dst_page(),
      which also sets __GFP_NORETRY and __GFP_NOWARN, and
      migrate_misplaced_transhuge_page(), where __GFP_NORETRY and __GFP_NOWAIT
      is set in GFP_TRANSHUGE.  Converting GFP_THISNODE to __GFP_THISNODE is a
      no-op in these cases since the page allocator special-cases
      __GFP_THISNODE && __GFP_NORETRY && __GFP_NOWARN.
      
      It's time to just remove GFP_THISNODE entirely.  We leave __GFP_THISNODE
      to restrict an allocation to a local node, but remove GFP_THISNODE and
      its obscurity.  Instead, we require that a caller clear __GFP_WAIT if it
      wants to avoid reclaim.
      
      This allows the aforementioned functions to actually reclaim as they
      should.  It also enables any future callers that want to do
      __GFP_THISNODE but also __GFP_NORETRY && __GFP_NOWARN to reclaim.  The
      rule is simple: if you don't want to reclaim, then don't set __GFP_WAIT.
      
      Aside: ovs_flow_stats_update() really wants to avoid reclaim as well, so
      it is unchanged.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Acked-by: NPekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Pravin Shelar <pshelar@nicira.com>
      Cc: Jarno Rajahalme <jrajahalme@nicira.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4167e9b2
  16. 12 2月, 2015 1 次提交
  17. 15 1月, 2015 1 次提交
  18. 14 1月, 2015 1 次提交
  19. 03 1月, 2015 1 次提交
  20. 10 11月, 2014 2 次提交
  21. 06 11月, 2014 1 次提交
  22. 18 10月, 2014 2 次提交
  23. 06 10月, 2014 3 次提交
  24. 16 9月, 2014 3 次提交
  25. 23 8月, 2014 1 次提交
  26. 30 6月, 2014 1 次提交
  27. 23 5月, 2014 3 次提交