1. 23 Oct, 2015 1 commit
    • R
      mpls: multipath route support · f8efb73c
      Roopa Prabhu committed
      This patch adds support for MPLS multipath routes.
      
      Includes following changes to support multipath:
      - splits struct mpls_route into 'struct mpls_route + struct mpls_nh'
      
      - 'struct mpls_nh' represents an MPLS nexthop label forwarding entry
      
      - moves mpls route and nexthop structures into internal.h
      
      - An mpls_route can point to multiple mpls_nh structs
      
      - the nexthops are maintained as an array (similar to ipv4 fib)
      
      - In the process of restructuring, this patch also consistently changes
        all labels to u8
      
      - Adds support to parse/fill RTA_MULTIPATH netlink attribute for
      multipath routes similar to ipv4/v6 fib
      
      - In this patch, the multipath route nexthop selection algorithm
      simply returns the first nexthop. It is replaced by a
      hash based algorithm from Robert Shearman in the next patch
      
      - mpls_route_update cleanup: remove 'dev' handling in mpls_route_update.
      mpls_route_update, though implemented to update based on dev, was
      never used that way. The dev handling also gets tricky with multiple
      nexthops, because a route cannot be matched against any single
      nexthop's dev. So this patch removes the unused 'dev' handling in
      mpls_route_update.
      
      - dead route/path handling will be implemented in a subsequent patch
      
      Example:
      
      $ ip -f mpls route add 100 nexthop as 200 via inet 10.1.1.2 dev swp1 \
                      nexthop as 700 via inet 10.1.1.6 dev swp2 \
                      nexthop as 800 via inet 40.1.1.2 dev swp3
      
      $ ip -f mpls route show
      100
              nexthop as to 200 via inet 10.1.1.2  dev swp1
              nexthop as to 700 via inet 10.1.1.6  dev swp2
              nexthop as to 800 via inet 40.1.1.2  dev swp3
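
The route/nexthop split and the placeholder selection described above can be sketched as follows. This is only an illustrative Python model, not the kernel code: the class names `Route` and `Nexthop`, their fields, and `select_nexthop` are hypothetical stand-ins for 'struct mpls_route', 'struct mpls_nh', and the selection routine.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Nexthop:
    """Hypothetical model of one nexthop label forwarding entry."""
    out_labels: List[int]
    via: str
    dev: str


@dataclass
class Route:
    """Hypothetical model of a route that owns an array of nexthops."""
    nexthops: List[Nexthop] = field(default_factory=list)


def select_nexthop(route: Route) -> Optional[Nexthop]:
    """Placeholder selection: always return the first nexthop, if any."""
    return route.nexthops[0] if route.nexthops else None


# Mirrors the three-nexthop route for label 100 from the example above.
route_100 = Route([
    Nexthop([200], "10.1.1.2", "swp1"),
    Nexthop([700], "10.1.1.6", "swp2"),
    Nexthop([800], "40.1.1.2", "swp3"),
])
```

With this placeholder, every packet for label 100 would leave via swp1 until a hash-based selection replaces `select_nexthop`.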
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Acked-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 01 Sep, 2015 1 commit
  3. 10 Aug, 2015 1 commit
    • R
      mpls: Enforce payload type of traffic sent using explicit NULL · 118d5234
      Robert Shearman committed
      RFC 4182 section 2 states that if an IPv4 Explicit NULL label is the
      only label on the stack, then after popping, the resulting packet must
      be treated as an IPv4 packet and forwarded based on the IPv4 header. The
      same is true for IPv6 Explicit NULL with an IPv6 packet following.
      
      Therefore, when installing the IPv4/IPv6 Explicit NULL label routes,
      add an attribute that specifies the expected payload type for use at
      forwarding time for determining the type of the encapsulated packet
      instead of inspecting the first nibble of the packet.
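
As a rough illustration of the idea, the sketch below records a payload type when a route is installed and prefers it over the first nibble at forwarding time. The label values 0 and 2 are the IPv4 and IPv6 Explicit NULL values from RFC 3032; the `Payload` enum and both function names are hypothetical, and the fallback behaviour for other labels is an assumption.

```python
from enum import Enum
from typing import Optional

IPV4_EXPLICIT_NULL = 0  # reserved label value (RFC 3032)
IPV6_EXPLICIT_NULL = 2  # reserved label value (RFC 3032)


class Payload(Enum):
    UNSPEC = "unspec"  # no enforced type: inspect the packet instead
    IPV4 = "ipv4"
    IPV6 = "ipv6"


def payload_for_label(label: int) -> Payload:
    """Payload type attached to the route at install time (hypothetical helper)."""
    if label == IPV4_EXPLICIT_NULL:
        return Payload.IPV4
    if label == IPV6_EXPLICIT_NULL:
        return Payload.IPV6
    return Payload.UNSPEC


def classify(route_payload: Payload, packet: bytes) -> Optional[Payload]:
    """Decide how to treat the packet left after the last label is popped."""
    if route_payload is not Payload.UNSPEC:
        return route_payload      # enforced by the route; packet is not inspected
    if not packet:
        return None
    version = packet[0] >> 4      # first nibble of the encapsulated packet
    return {4: Payload.IPV4, 6: Payload.IPV6}.get(version)
```

Under these assumptions, a packet that arrives with label 0 is treated as IPv4 even if its first nibble says otherwise.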
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 07 Aug, 2015 2 commits
  5. 04 Aug, 2015 1 commit
  6. 01 Aug, 2015 1 commit
  7. 22 Jul, 2015 2 commits
  8. 12 Jun, 2015 1 commit
    • R
      mpls: handle device renames for per-device sysctls · 0fae3bf0
      Robert Shearman committed
      If a device is renamed and the original name is subsequently reused
      for a new device, the following warning is generated:
      
      sysctl duplicate entry: /net/mpls/conf/veth0//input
      CPU: 3 PID: 1379 Comm: ip Not tainted 4.1.0-rc4+ #20
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
       0000000000000000 0000000000000000 ffffffff81566aaf 0000000000000000
       ffffffff81236279 ffff88002f7d7f00 0000000000000000 ffff88000db336d8
       ffff88000db33698 0000000000000005 ffff88002e046000 ffff8800168c9280
      Call Trace:
       [<ffffffff81566aaf>] ? dump_stack+0x40/0x50
       [<ffffffff81236279>] ? __register_sysctl_table+0x289/0x5a0
       [<ffffffffa051a24f>] ? mpls_dev_notify+0x1ff/0x300 [mpls_router]
       [<ffffffff8108db7f>] ? notifier_call_chain+0x4f/0x70
       [<ffffffff81470e72>] ? register_netdevice+0x2b2/0x480
       [<ffffffffa0524748>] ? veth_newlink+0x178/0x2d3 [veth]
       [<ffffffff8147f84c>] ? rtnl_newlink+0x73c/0x8e0
       [<ffffffff8147f27a>] ? rtnl_newlink+0x16a/0x8e0
       [<ffffffff81459ff2>] ? __kmalloc_reserve.isra.30+0x32/0x90
       [<ffffffff8147ccfd>] ? rtnetlink_rcv_msg+0x8d/0x250
       [<ffffffff8145b027>] ? __alloc_skb+0x47/0x1f0
       [<ffffffff8149badb>] ? __netlink_lookup+0xab/0xe0
       [<ffffffff8147cc70>] ? rtnetlink_rcv+0x30/0x30
       [<ffffffff8149e7a0>] ? netlink_rcv_skb+0xb0/0xd0
       [<ffffffff8147cc64>] ? rtnetlink_rcv+0x24/0x30
       [<ffffffff8149df17>] ? netlink_unicast+0x107/0x1a0
       [<ffffffff8149e4be>] ? netlink_sendmsg+0x50e/0x630
       [<ffffffff8145209c>] ? sock_sendmsg+0x3c/0x50
       [<ffffffff81452beb>] ? ___sys_sendmsg+0x27b/0x290
       [<ffffffff811bd258>] ? mem_cgroup_try_charge+0x88/0x110
       [<ffffffff811bd5b6>] ? mem_cgroup_commit_charge+0x56/0xa0
       [<ffffffff811d7700>] ? do_filp_open+0x30/0xa0
       [<ffffffff8145336e>] ? __sys_sendmsg+0x3e/0x80
       [<ffffffff8156c3f2>] ? system_call_fastpath+0x16/0x75
      
      Fix this by unregistering the previous sysctl table (registered for
      the path containing the original device name) and re-registering the
      table for the path containing the new device name.
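
The fix can be pictured with a toy registry that, like the sysctl code in the trace, rejects duplicate paths. Everything here (`SysctlRegistry`, `conf_path`, `rename_device`) is a hypothetical model written for illustration, not the kernel's sysctl API.

```python
class SysctlRegistry:
    """Toy registry that refuses duplicate paths, similar to the warning above."""

    def __init__(self):
        self._paths = set()

    def register(self, path):
        if path in self._paths:
            raise ValueError(f"sysctl duplicate entry: {path}")
        self._paths.add(path)

    def unregister(self, path):
        self._paths.discard(path)

    def __contains__(self, path):
        return path in self._paths


def conf_path(dev_name):
    """Per-device path; the layout is assumed from the warning message."""
    return f"net/mpls/conf/{dev_name}/input"


def rename_device(registry, old_name, new_name):
    """On rename, drop the table under the old name, then register the new one."""
    registry.unregister(conf_path(old_name))
    registry.register(conf_path(new_name))
```

Without the unregister step in `rename_device`, creating a second device named after the original would hit the duplicate-entry error.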
      
      Fixes: 37bde799 ("mpls: Per-device enabling of packet input")
      Reported-by: Scott Feldman <sfeldma@gmail.com>
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 08 Jun, 2015 1 commit
  10. 10 May, 2015 1 commit
  11. 06 May, 2015 1 commit
  12. 23 Apr, 2015 3 commits
  13. 13 Mar, 2015 1 commit
  14. 09 Mar, 2015 5 commits
  15. 07 Mar, 2015 1 commit
  16. 06 Mar, 2015 1 commit
  17. 05 Mar, 2015 1 commit
  18. 04 Mar, 2015 6 commits
    • E
      mpls: Multicast route table change notifications · 8de147dc
      Eric W. Biederman committed
      Unlike IPv4, this code notifies in all cases where mpls routes
      are added or removed, and it never automatically removes routes.
      This avoids both the userspace confusion that is caused by omitting
      route updates and the possibility of a flood of netlink traffic
      when an interface goes down.
      
      For now reserved labels are handled automatically and userspace
      is not notified.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      mpls: Netlink commands to add, remove, and dump routes · 03c05665
      Eric W. Biederman committed
      This change adds two new netlink routing attributes:
      RTA_VIA and RTA_NEWDST.
      
      RTA_VIA specifies the next machine to send a packet to,
      like RTA_GATEWAY.  RTA_VIA differs from RTA_GATEWAY in that it
      includes the address family of the address of the next machine to send
      a packet to.  Currently the MPLS code supports addresses in AF_INET,
      AF_INET6 and AF_PACKET.  For AF_INET and AF_INET6 the destination mac
      address is acquired from the neighbour table.  For AF_PACKET the
      destination mac address is specified in the netlink configuration.
      
      I think raw destination mac address support with the family AF_PACKET
      will prove useful.  There is MPLS-TP, which is defined to operate
      on machines that do not support internet packets of any flavor.  Further,
      there seem to be corner cases where it can be useful.  At this point
      I don't care much either way.
      
      RTA_NEWDST specifies the destination address to forward the packet
      with.  MPLS typically changes its destination address at every hop.
      For a swap operation RTA_NEWDST is specified with a length of one label.
      For a push operation RTA_NEWDST is specified with two or more labels.
      For a pop operation RTA_NEWDST is not specified or, equivalently, an empty
      RTA_NEWDST is specified.
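
The mapping from RTA_NEWDST length to label operation stated above is small enough to write down directly. The function name `label_operation` is hypothetical, and the input is assumed to be an already-parsed list of labels rather than raw attribute bytes.

```python
from typing import Optional, Sequence


def label_operation(newdst: Optional[Sequence[int]]) -> str:
    """Classify the operation implied by the labels carried in RTA_NEWDST."""
    if not newdst:          # attribute absent, or present but empty
        return "pop"
    if len(newdst) == 1:    # exactly one outgoing label
        return "swap"
    return "push"           # two or more outgoing labels
```

For example, `label_operation([200])` yields a swap, while `label_operation(None)` and `label_operation([])` both yield a pop.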
      
      Those new netlink attributes are used to implement handling of rt-netlink
      RTM_NEWROUTE, RTM_DELROUTE, and RTM_GETROUTE messages, to maintain the
      MPLS label table.
      
      rtm_to_route_config parses a netlink RTM_NEWROUTE or RTM_DELROUTE message,
      verifies that no unhandled attributes or unhandled values are present,
      and sets up the data structures for mpls_route_add and mpls_route_del.
      
      I did my best to match up with the existing conventions with the caveats
      that MPLS addresses are all destination-specific-addresses, and so
      don't properly have a scope.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      mpls: Functions for reading and writing mpls labels over netlink · 966bae33
      Eric W. Biederman committed
      Reading and writing addresses in network byte order in netlink is
      traditional and I see no reason to change that.  MPLS is interesting
      as effectively it has variable-length addresses (the MPLS label
      stack).  To represent these variable-length addresses in netlink
      I use a valid MPLS label stack (complete with stop bit).
      
      This achieves two things: a well defined existing format is used,
      and the data can be interpreted without looking at its length.
      
      Not needing to look at the length to decode the variable-length
      network representation allows existing userspace functions
      such as inet_ntop to be used without needing to change their
      prototype.
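
A minimal sketch of such a self-delimiting representation follows. It assumes the standard 32-bit label stack entry layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack, 8-bit TTL) and leaves traffic class and TTL at zero, which is an assumption of this sketch; the function names are hypothetical.

```python
import struct
from typing import List

LABEL_SHIFT = 12     # the label occupies the top 20 bits of each 32-bit entry
STOP_BIT = 1 << 8    # bottom-of-stack (S) bit


def encode_stack(labels: List[int]) -> bytes:
    """Encode labels as a label stack in network byte order, stop bit on the last."""
    if not labels:
        raise ValueError("a label stack needs at least one label")
    out = b""
    for i, label in enumerate(labels):
        if not 0 <= label < (1 << 20):
            raise ValueError("label does not fit in 20 bits")
        entry = label << LABEL_SHIFT
        if i == len(labels) - 1:
            entry |= STOP_BIT
        out += struct.pack("!I", entry)
    return out


def decode_stack(buf: bytes) -> List[int]:
    """Decode labels until the stop bit; the buffer length is never consulted."""
    labels = []
    offset = 0
    while True:
        (entry,) = struct.unpack_from("!I", buf, offset)
        labels.append(entry >> LABEL_SHIFT)
        offset += 4
        if entry & STOP_BIT:
            return labels
```

Because `decode_stack` stops at the stop bit, trailing bytes after the last entry are ignored, which is the property that lets a fixed-prototype function such as inet_ntop handle the format.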
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      mpls: Basic support for adding and removing routes · a2519929
      Eric W. Biederman committed
      mpls_route_add and mpls_route_del implement the basic logic for adding
      and removing Next Hop Label Forwarding Entries from the MPLS input
      label map.  The addition and subtraction is done in a way that is
      consistent with how the existing routing tables in Linux are
      maintained.  Thus all of the work to deal with NLM_F_APPEND,
      NLM_F_EXCL, NLM_F_REPLACE, and NLM_F_CREATE is done here.
      
      Cases that are not clearly defined, such as changing the interpretation
      of the mpls reserved labels, are not allowed.
      
      Because it seems like the right thing to do, adding an MPLS route without
      specifying an input label and allowing the kernel to pick a free label
      table entry is supported.  The implementation is currently less than
      optimal but that can be changed.
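
The flag handling and the free-label selection might look roughly like the sketch below. The NLM_F_* values are the standard netlink flag constants; the exact checks, the choice of exceptions, the rejection of NLM_F_APPEND, and the linear scan for a free slot are assumptions of this sketch rather than a transcription of mpls_route_add. Labels 0 to 15 are treated as reserved, following RFC 3032.

```python
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
NLM_F_APPEND = 0x800

FIRST_UNRESERVED = 16   # labels 0-15 are reserved (RFC 3032)


def route_add(table, label, route, flags):
    """Install route in table (a list indexed by label) and return the label used."""
    if label is None:
        # No input label given: pick the first free unreserved slot (linear scan).
        free = [i for i in range(FIRST_UNRESERVED, len(table)) if table[i] is None]
        if not free:
            raise OSError("no free label in the table")
        label = free[0]
    if label < FIRST_UNRESERVED or label >= len(table):
        raise ValueError("label is reserved or outside the table")
    if flags & NLM_F_APPEND:
        raise NotImplementedError("append is not modelled in this sketch")
    old = table[label]
    if old is not None and ((flags & NLM_F_EXCL) or not (flags & NLM_F_REPLACE)):
        raise FileExistsError(label)      # entry exists and may not be replaced
    if old is None and not (flags & NLM_F_CREATE):
        raise FileNotFoundError(label)    # nothing to replace and create not allowed
    table[label] = route
    return label
```

A caller that wants "add or fail" semantics would pass `NLM_F_CREATE | NLM_F_EXCL`, while "replace only" would pass `NLM_F_REPLACE` alone.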
      
      As I don't have anything else to test with, ethernet and the loopback
      device are the only two device types currently supported for forwarding
      MPLS over.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      mpls: Add a sysctl to control the size of the mpls label table · 7720c01f
      Eric W. Biederman committed
      This sysctl gives two benefits.  By defaulting the table size to 0,
      mpls, even when compiled in and enabled, defaults to not forwarding
      any packets.  This prevents unpleasant surprises for users.
      
      The other benefit is that, as mpls labels are allocated locally, a
      small dense label table may be used, which saves memory and
      is extremely simple and efficient to implement.
      
      This sysctl allows userspace to choose the restrictions on the label
      table size userspace applications need to cope with.
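
A dense table whose size starts at zero can be modelled in a few lines. The class `LabelTable` and its methods are hypothetical, and dropping entries on shrink is an assumption of this sketch.

```python
class LabelTable:
    """Dense label table indexed directly by label value."""

    def __init__(self):
        self.entries = []   # size 0 by default, so no label can match

    def resize(self, size):
        """Model of writing a new size to the sysctl; shrinking drops entries."""
        if size < 0:
            raise ValueError("table size cannot be negative")
        kept = self.entries[:size]
        self.entries = kept + [None] * (size - len(kept))

    def lookup(self, label):
        """Return the entry for label, or None if out of range or unset."""
        if 0 <= label < len(self.entries):
            return self.entries[label]
        return None
```

Until `resize` is called with a non-zero size, every `lookup` returns None, which corresponds to forwarding nothing by default.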
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      mpls: Basic routing support · 0189197f
      Eric W. Biederman committed
      This change adds a new Kconfig option MPLS_ROUTING.
      
      The core of this change is the code to look at an mpls packet received
      from another machine, look that packet up in a routing table, and
      forward the packet on.
      
      Support of MPLS over ATM is not considered or attempted here.  This
      implementation follows RFC3032 and implements the MPLS shim header that
      can pass over essentially any network.
      
      What RFC3031 refers to as the Incoming Label Map (ILM) I call
      net->mpls.platform_label[].  What RFC3031 refers to as the Next Hop
      Label Forwarding Entry (NHLFE) I call mpls_route.  Though calling it the
      label forwarding information base (lfib) might also be valid.
      
      Further the implementation forwards packets as described in RFC3032.
      There is no need and, given the original motivation for MPLS, a strong
      disincentive to have a flexible label forwarding path.  In essence
      the logic is that the topmost label is read, looked up, removed, and
      replaced by 0 or more new labels, and the packet is then sent out the
      specified interface to its next hop.
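
The fixed forwarding path described above (read the top label, look it up, remove it, add zero or more new labels, send) can be sketched as below. The `Entry` class, the dict-based table, and the rule of dropping when the TTL is 1 or less are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    """Hypothetical forwarding entry: labels to add and the output device."""
    out_labels: List[int]
    dev: str


def forward(stack: List[int], ttl: int,
            table: Dict[int, Entry]) -> Optional[Tuple[List[int], int, str]]:
    """Return (new stack, new TTL, device), or None if the packet is dropped."""
    if not stack:
        return None
    entry = table.get(stack[0])       # look up the topmost label
    if entry is None or ttl <= 1:     # unknown label or TTL exhausted: drop
        return None
    # Remove the top label, then put the entry's 0 or more labels in its place.
    return entry.out_labels + stack[1:], ttl - 1, entry.dev
```

An entry with an empty `out_labels` list acts as a pop, one label as a swap, and several labels as a swap plus push.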
      
      Quite a few optional features are not implemented here.  Among them
      are generation of ICMP errors when the TTL is exceeded or the packet
      is larger than the next hop MTU (those conditions are detected and the
      packets are dropped instead of generating an icmp error).  The traffic
      class field is always set to 0.  The implementation focuses on IP over
      MPLS and does not handle egress of other kinds of protocols.
      
      Instead of implementing coordination with the neighbour table and
      sorting out how to input next hops in a different address family (for
      which there is value), I was lazy and implemented a next hop mac
      address instead.  The code is simpler and there are flavors of MPLS
      such as MPLS-TP where neither an IPv4 nor an IPv6 next hop is
      appropriate, so a next hop by mac address would need to be implemented
      at some point.
      
      Two new definitions AF_MPLS and PF_MPLS are exposed to userspace.
      
      Decoding the mpls header must be done by first byteswapping a 32-bit
      big-endian word into the local cpu endianness and then bit shifting to
      extract the pieces.  There is no C bit-field that can represent a wire
      format mpls header on a little endian machine, as the low bits of the
      20-bit label wind up in the wrong half of the third byte.  Therefore
      internally everything is dealt with in cpu native byte order except
      when writing to and reading from a packet.
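
The swap-then-shift decoding can be illustrated as follows, assuming the RFC 3032 shim layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack, 8-bit TTL). The `MplsHeader` tuple and the two function names are hypothetical; `struct` with the `!I` format performs the big-endian to native conversion.

```python
import struct
from typing import NamedTuple


class MplsHeader(NamedTuple):
    label: int   # 20 bits
    tc: int      # 3 bits, traffic class
    bos: bool    # 1 bit, bottom of stack
    ttl: int     # 8 bits


def decode_header(data: bytes) -> MplsHeader:
    """Convert the 32-bit big-endian word to a native integer, then shift and mask."""
    (word,) = struct.unpack("!I", data[:4])
    return MplsHeader(
        label=word >> 12,
        tc=(word >> 9) & 0x7,
        bos=bool((word >> 8) & 0x1),
        ttl=word & 0xFF,
    )


def encode_header(h: MplsHeader) -> bytes:
    """Inverse of decode_header: assemble the word natively, then write big-endian."""
    word = (h.label << 12) | (h.tc << 9) | (int(h.bos) << 8) | h.ttl
    return struct.pack("!I", word)
```

All field arithmetic happens on the native integer; byte order only matters at the two `struct` calls, mirroring the "native internally, wire format at the packet boundary" rule in the text.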
      
      For management simplicity if a label is configured to forward out
      an interface that is down the packet is dropped early.  Similarly
      if a network interface is removed rt_dev is updated to NULL
      (so no reference is preserved) and any packets for that label
      are dropped.  Keeping the label entries in the kernel allows
      the kernel label table to function as the definitive source
      of which labels are allocated and which are not.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>