1. 16 7月, 2014 2 次提交
    • T
      net: set name assign type for renamed devices · 238fa362
      Tom Gundersen 提交于
      Based on a patch from David Herrmann.
      
      This is the only place devices can be renamed.
      
      v9: restore revers-christmas-tree order of local variables
      Signed-off-by: NTom Gundersen <teg@jklm.no>
      Reviewed-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      238fa362
    • T
      net: add name_assign_type netdev attribute · 685343fc
      Tom Gundersen 提交于
      Based on a patch by David Herrmann.
      
      The name_assign_type attribute gives hints where the interface name of a
      given net-device comes from. These values are currently defined:
        NET_NAME_ENUM:
          The ifname is provided by the kernel with an enumerated
          suffix, typically based on order of discovery. Names may
          be reused and unpredictable.
        NET_NAME_PREDICTABLE:
          The ifname has been assigned by the kernel in a predictable way
          that is guaranteed to avoid reuse and always be the same for a
          given device. Examples include statically created devices like
          the loopback device and names deduced from hardware properties
          (including being given explicitly by the firmware). Names
          depending on the order of discovery, or in any other way on the
          existence of other devices, must not be marked as PREDICTABLE.
        NET_NAME_USER:
          The ifname was provided by user-space during net-device setup.
        NET_NAME_RENAMED:
          The net-device has been renamed from userspace. Once this type is set,
          it cannot change again.
        NET_NAME_UNKNOWN:
          This is an internal placeholder to indicate that we yet haven't yet
          categorized the name. It will not be exposed to userspace, rather
          -EINVAL is returned.
      
      The aim of these patches is to improve user-space renaming of interfaces. As
      a general rule, userspace must rename interfaces to guarantee that names stay
      the same every time a given piece of hardware appears (at boot, or when
      attaching it). However, there are several situations where userspace should
      not perform the renaming, and that depends on both the policy of the local
      admin, but crucially also on the nature of the current interface name.
      
      If an interface was created in repsonse to a userspace request, and userspace
      already provided a name, we most probably want to leave that name alone. The
      main instance of this is wifi-P2P devices created over nl80211, which currently
      have a long-standing bug where they are getting renamed by udev. We label such
      names NET_NAME_USER.
      
      If an interface, unbeknown to us, has already been renamed from userspace, we
      most probably want to leave also that alone. This will typically happen when
      third-party plugins (for instance to udev, but the interface is generic so could
      be from anywhere) renames the interface without informing udev about it. A
      typical situation is when you switch root from an installer or an initrd to the
      real system and the new instance of udev does not know what happened before
      the switch. These types of problems have caused repeated issues in the past. To
      solve this, once an interface has been renamed, its name is labelled
      NET_NAME_RENAMED.
      
      In many cases, the kernel is actually able to name interfaces in such a
      way that there is no need for userspace to rename them. This is the case when
      the enumeration order of devices, or in fact any other (non-parent) device on
      the system, can not influence the name of the interface. Examples include
      statically created devices, or any naming schemes based on hardware properties
      of the interface. In this case the admin may prefer to use the kernel-provided
      names, and to make that possible we label such names NET_NAME_PREDICTABLE.
      We want the kernel to have tho possibilty of performing predictable interface
      naming itself (and exposing to userspace that it has), as the information
      necessary for a proper naming scheme for a certain class of devices may not
      be exposed to userspace.
      
      The case where renaming is almost certainly desired, is when the kernel has
      given the interface a name using global device enumeration based on order of
      discovery (ethX, wlanY, etc). These naming schemes are labelled NET_NAME_ENUM.
      
      Lastly, a fallback is left as NET_NAME_UNKNOWN, to indicate that a driver has
      not yet been ported. This is mostly useful as a transitionary measure, allowing
      us to label the various naming schemes bit by bit.
      
      v8: minor documentation fixes
      v9: move comment to the right commit
      Signed-off-by: NTom Gundersen <teg@jklm.no>
      Reviewed-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Reviewed-by: NKay Sievers <kay@vrfy.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      685343fc
  2. 14 7月, 2014 1 次提交
  3. 11 7月, 2014 2 次提交
    • J
      bridge: netlink dump interface at par with brctl · 5e6d2435
      Jamal Hadi Salim 提交于
      Actually better than brctl showmacs because we can filter by bridge
      port in the kernel.
      The current bridge netlink interface doesnt scale when you have many
      bridges each with large fdbs or even bridges with many bridge ports
      
      And now for the science non-fiction novel you have all been
      waiting for..
      
      //lets see what bridge ports we have
      root@moja-1:/configs/may30-iprt/bridge# ./bridge link show
      8: eth1 state DOWN : <BROADCAST,MULTICAST> mtu 1500 master br0 state
      disabled priority 32 cost 19
      17: sw1-p1 state DOWN : <BROADCAST,NOARP> mtu 1500 master br0 state
      disabled priority 32 cost 100
      
      // show all..
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show
      33:33:00:00:00:01 dev bond0 self permanent
      33:33:00:00:00:01 dev dummy0 self permanent
      33:33:00:00:00:01 dev ifb0 self permanent
      33:33:00:00:00:01 dev ifb1 self permanent
      33:33:00:00:00:01 dev eth0 self permanent
      01:00:5e:00:00:01 dev eth0 self permanent
      33:33:ff:22:01:01 dev eth0 self permanent
      02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 dev eth1 self permanent
      33:33:00:00:00:01 dev eth1 self permanent
      33:33:00:00:00:01 dev gretap0 self permanent
      da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent
      33:33:00:00:00:01 dev sw1-p1 self permanent
      
      //filter by bridge
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0
      02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 dev eth1 self permanent
      33:33:00:00:00:01 dev eth1 self permanent
      da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent
      33:33:00:00:00:01 dev sw1-p1 self permanent
      
      // bridge sw1 has no ports attached..
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br sw1
      
      //filter by port
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show brport eth1
      02:00:00:12:01:02 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 self permanent
      33:33:00:00:00:01 self permanent
      
      // filter by port + bridge
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0 brport
      sw1-p1
      da:ac:46:27:d9:53 vlan 0 master br0 permanent
      33:33:00:00:00:01 self permanent
      
      // for shits and giggles (as they say in New Brunswick), lets
      // change the mac that br0 uses
      // Note: a magical fdb entry with no brport is added ...
      root@moja-1:/configs/may30-iprt/bridge# ip link set dev br0 address
      02:00:00:12:01:04
      
      // lets see if we can see the unicorn ..
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show
      33:33:00:00:00:01 dev bond0 self permanent
      33:33:00:00:00:01 dev dummy0 self permanent
      33:33:00:00:00:01 dev ifb0 self permanent
      33:33:00:00:00:01 dev ifb1 self permanent
      33:33:00:00:00:01 dev eth0 self permanent
      01:00:5e:00:00:01 dev eth0 self permanent
      33:33:ff:22:01:01 dev eth0 self permanent
      02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 dev eth1 self permanent
      33:33:00:00:00:01 dev eth1 self permanent
      33:33:00:00:00:01 dev gretap0 self permanent
      02:00:00:12:01:04 dev br0 vlan 0 master br0 permanent <=== there it is
      da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent
      33:33:00:00:00:01 dev sw1-p1 self permanent
      
      //can we see it if we filter by bridge?
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0
      02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 dev eth1 self permanent
      33:33:00:00:00:01 dev eth1 self permanent
      02:00:00:12:01:04 dev br0 vlan 0 master br0 permanent <=== there it is
      da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent
      33:33:00:00:00:01 dev sw1-p1 self permanent
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5e6d2435
    • J
      bridge: fdb dumping takes a filter device · 5d5eacb3
      Jamal Hadi Salim 提交于
      Dumping a bridge fdb dumps every fdb entry
      held. With this change we are going to filter
      on selected bridge port.
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5d5eacb3
  4. 09 7月, 2014 2 次提交
  5. 08 7月, 2014 6 次提交
  6. 02 7月, 2014 4 次提交
    • J
      pktgen: RCU-ify "if_list" to remove lock in next_to_run() · 8788370a
      Jesper Dangaard Brouer 提交于
      The if_lock()/if_unlock() in next_to_run() adds a significant
      overhead, because its called for every packet in busy loop of
      pktgen_thread_worker().  (Thomas Graf originally pointed me
      at this lock problem).
      
      Removing these two "LOCK" operations should in theory save us approx
      16ns (8ns x 2), as illustrated below we do save 16ns when removing
      the locks and introducing RCU protection.
      
      Performance data with CLONE_SKB==100000, TX-size=512, rx-usecs=30:
       (single CPU performance, ixgbe 10Gbit/s, E5-2630)
       * Prev   : 5684009 pps --> 175.93ns (1/5684009*10^9)
       * RCU-fix: 6272204 pps --> 159.43ns (1/6272204*10^9)
       * Diff   : +588195 pps --> -16.50ns
      
      To understand this RCU patch, I describe the pktgen thread model
      below.
      
      In pktgen there is several kernel threads, but there is only one CPU
      running each kernel thread.  Communication with the kernel threads are
      done through some thread control flags.  This allow the thread to
      change data structures at a know synchronization point, see main
      thread func pktgen_thread_worker().
      
      Userspace changes are communicated through proc-file writes.  There
      are three types of changes, general control changes "pgctrl"
      (func:pgctrl_write), thread changes "kpktgend_X"
      (func:pktgen_thread_write), and interface config changes "etcX@N"
      (func:pktgen_if_write).
      
      Userspace "pgctrl" and "thread" changes are synchronized via the mutex
      pktgen_thread_lock, thus only a single userspace instance can run.
      The mutex is taken while the packet generator is running, by pgctrl
      "start".  Thus e.g. "add_device" cannot be invoked when pktgen is
      running/started.
      
      All "pgctrl" and all "thread" changes, except thread "add_device",
      communicate via the thread control flags.  The main problem is the
      exception "add_device", that modifies threads "if_list" directly.
      
      Fortunately "add_device" cannot be invoked while pktgen is running.
      But there exists a race between "rem_device_all" and "add_device"
      (which normally don't occur, because "rem_device_all" waits 125ms
      before returning). Background'ing "rem_device_all" and running
      "add_device" immediately allow the race to occur.
      
      The race affects the threads (list of devices) "if_list".  The if_lock
      is used for protecting this "if_list".  Other readers are given
      lock-free access to the list under RCU read sections.
      
      Note, interface config changes (via proc) can occur while pktgen is
      running, which worries me a bit.  I'm assuming proc_remove() takes
      appropriate locks, to assure no writers exists after proc_remove()
      finish.
      
      I've been running a script exercising the race condition (leading me
      to fix the proc_remove order), without any issues.  The script also
      exercises concurrent proc writes, while the interface config is
      getting removed.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8788370a
    • J
      pktgen: avoid expensive set_current_state() call in loop · baac167b
      Jesper Dangaard Brouer 提交于
      Avoid calling set_current_state() inside the busy-loop in
      pktgen_thread_worker().  In case of pkt_dev->delay, then it is still
      used/enabled in pktgen_xmit() via the spin() call.
      
      The set_current_state(TASK_INTERRUPTIBLE) uses a xchg, which implicit
      is LOCK prefixed.  I've measured the asm LOCK operation to take approx
      8ns on this E5-2630 CPU.  Performance increase corrolate with this
      measurement.
      
      Performance data with CLONE_SKB==100000, rx-usecs=30:
       (single CPU performance, ixgbe 10Gbit/s, E5-2630)
       * Prev:  5454050 pps --> 183.35ns (1/5454050*10^9)
       * Now:   5684009 pps --> 175.93ns (1/5684009*10^9)
       * Diff:  +229959 pps -->  -7.42ns
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      baac167b
    • J
      rtnetlink: allow to register ops without ops->setup set · b0ab2fab
      Jiri Pirko 提交于
      So far, it is assumed that ops->setup is filled up. But there might be
      case that ops might make sense even without ->setup. In that case,
      forbid to newlink and dellink.
      
      This allows to register simple rtnl link ops containing only ->kind.
      That allows consistent way of passing device kind (either device-kind or
      slave-kind) to userspace.
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b0ab2fab
    • Y
      net: fix some typos in comment · 9bf2b8c2
      Ying Xue 提交于
      In commit 37112105("net:
      QDISC_STATE_RUNNING dont need atomic bit ops") the
      __QDISC_STATE_RUNNING is renamed to __QDISC___STATE_RUNNING,
      but the old names existing in comment are not replaced with
      the new name completely.
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9bf2b8c2
  7. 26 6月, 2014 6 次提交
  8. 24 6月, 2014 1 次提交
  9. 20 6月, 2014 1 次提交
  10. 19 6月, 2014 2 次提交
  11. 18 6月, 2014 1 次提交
  12. 15 6月, 2014 1 次提交
    • T
      net: Fix save software checksum complete · 46fb51eb
      Tom Herbert 提交于
      Geert reported issues regarding checksum complete and UDP.
      The logic introduced in commit 7e3cead5
      ("net: Save software checksum complete") is not correct.
      
      This patch:
      1) Restores code in __skb_checksum_complete_header except for setting
         CHECKSUM_UNNECESSARY. This function may be calculating checksum on
         something less than skb->len.
      2) Adds saving checksum to __skb_checksum_complete. The full packet
         checksum 0..skb->len is calculated without adding in pseudo header.
         This value is saved in skb->csum and then the pseudo header is added
         to that to derive the checksum for validation.
      3) In both __skb_checksum_complete_header and __skb_checksum_complete,
         set skb->csum_valid to whether checksum of zero was computed. This
         allows skb_csum_unnecessary to return true without changing to
         CHECKSUM_UNNECESSARY which was done previously.
      4) Copy new csum related bits in __copy_skb_header.
      Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46fb51eb
  13. 13 6月, 2014 1 次提交
    • M
      rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 · e5eca6d4
      Michal Schmidt 提交于
      When running RHEL6 userspace on a current upstream kernel, "ip link"
      fails to show VF information.
      
      The reason is a kernel<->userspace API change introduced by commit
      88c5b5ce ("rtnetlink: Call nlmsg_parse() with correct header length"),
      after which the kernel does not see iproute2's IFLA_EXT_MASK attribute
      in the netlink request.
      
      iproute2 adjusted for the API change in its commit 63338dca4513
      ("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter").
      
      The problem has been noticed before:
      http://marc.info/?l=linux-netdev&m=136692296022182&w=2
      (Subject: Re: getting VF link info seems to be broken in 3.9-rc8)
      
      We can do better than tell those with old userspace to upgrade. We can
      recognize the old iproute2 in the kernel by checking the netlink message
      length. Even when including the IFLA_EXT_MASK attribute, its netlink
      message is shorter than struct ifinfomsg.
      
      With this patch "ip link" shows VF information in both old and new
      iproute2 versions.
      Signed-off-by: NMichal Schmidt <mschmidt@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e5eca6d4
  14. 12 6月, 2014 4 次提交
    • D
      net/core: Add VF link state control policy · c5b46160
      Doug Ledford 提交于
      Commit 1d8faf48 (net/core: Add VF link state control) added VF link state
      control to the netlink VF nested structure, but failed to add a proper entry
      for the new structure into the VF policy table.  Add the missing entry so
      the table and the actual data copied into the netlink nested struct are in
      sync.
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5b46160
    • T
      net: Save software checksum complete · 7e3cead5
      Tom Herbert 提交于
      In skb_checksum complete, if we need to compute the checksum for the
      packet (via skb_checksum) save the result as CHECKSUM_COMPLETE.
      Subsequent checksum verification can use this.
      
      Also, added csum_complete_sw flag to distinguish between software and
      hardware generated checksum complete, we should always be able to trust
      the software computation.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7e3cead5
    • O
      net: add __pskb_copy_fclone and pskb_copy_for_clone · bad93e9d
      Octavian Purdila 提交于
      There are several instances where a pskb_copy or __pskb_copy is
      immediately followed by an skb_clone.
      
      Add a couple of new functions to allow the copy skb to be allocated
      from the fclone cache and thus speed up subsequent skb_clone calls.
      
      Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
      Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Cc: Marek Lindner <mareklindner@neomailbox.ch>
      Cc: Simon Wunderlich <sw@simonwunderlich.de>
      Cc: Antonio Quartulli <antonio@meshcoding.com>
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Johan Hedberg <johan.hedberg@gmail.com>
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
      Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
      Cc: Samuel Ortiz <sameo@linux.intel.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Allan Stephens <allan.stephens@windriver.com>
      Cc: Andrew Hendry <andrew.hendry@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Reviewed-by: NChristoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bad93e9d
    • A
      net: filter: fix warning on 32-bit arch · 61f83d0d
      Alexei Starovoitov 提交于
      fix compiler warning on 32-bit architectures:
      
      net/core/filter.c: In function '__sk_run_filter':
      net/core/filter.c:540:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      net/core/filter.c:550:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      net/core/filter.c:560:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      61f83d0d
  15. 11 6月, 2014 2 次提交
    • W
      net: fix UDP tunnel GSO of frag_list GRO packets · 5882a07c
      Wei-Chun Chao 提交于
      This patch fixes a kernel BUG_ON in skb_segment. It is hit when
      testing two VMs on openvswitch with one VM acting as VXLAN gateway.
      
      During VXLAN packet GSO, skb_segment is called with skb->data
      pointing to inner TCP payload. skb_segment calls skb_network_protocol
      to retrieve the inner protocol. skb_network_protocol actually expects
      skb->data to point to MAC and it calls pskb_may_pull with ETH_HLEN.
      This ends up pulling in ETH_HLEN data from header tail. As a result,
      pskb_trim logic is skipped and BUG_ON is hit later.
      
      Move skb_push in front of skb_network_protocol so that skb->data
      lines up properly.
      
      kernel BUG at net/core/skbuff.c:2999!
      Call Trace:
      [<ffffffff816ac412>] tcp_gso_segment+0x122/0x410
      [<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390
      [<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170
      [<ffffffff816b3658>] skb_udp_tunnel_segment+0xd8/0x390
      [<ffffffff816b3c00>] udp4_ufo_fragment+0x120/0x140
      [<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390
      [<ffffffff8109d742>] ? default_wake_function+0x12/0x20
      [<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170
      [<ffffffff8164b4d0>] __skb_gso_segment+0x60/0xc0
      [<ffffffff8164b6b3>] dev_hard_start_xmit+0x183/0x550
      [<ffffffff8166c91e>] sch_direct_xmit+0xfe/0x1d0
      [<ffffffff8164bc94>] __dev_queue_xmit+0x214/0x4f0
      [<ffffffff8164bf90>] dev_queue_xmit+0x10/0x20
      [<ffffffff81687edb>] ip_finish_output+0x66b/0x890
      [<ffffffff81688a58>] ip_output+0x58/0x90
      [<ffffffff816c628f>] ? fib_table_lookup+0x29f/0x350
      [<ffffffff816881c9>] ip_local_out_sk+0x39/0x50
      [<ffffffff816cbfad>] iptunnel_xmit+0x10d/0x130
      [<ffffffffa0212200>] vxlan_xmit_skb+0x1d0/0x330 [vxlan]
      [<ffffffffa02a3919>] vxlan_tnl_send+0x129/0x1a0 [openvswitch]
      [<ffffffffa02a2cd6>] ovs_vport_send+0x26/0xa0 [openvswitch]
      [<ffffffffa029931e>] do_output+0x2e/0x50 [openvswitch]
      Signed-off-by: NWei-Chun Chao <weichunc@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5882a07c
    • A
      net: filter: cleanup A/X name usage · e430f34e
      Alexei Starovoitov 提交于
      The macro 'A' used in internal BPF interpreter:
       #define A regs[insn->a_reg]
      was easily confused with the name of classic BPF register 'A', since
      'A' would mean two different things depending on context.
      
      This patch is trying to clean up the naming and clarify its usage in the
      following way:
      
      - A and X are names of two classic BPF registers
      
      - BPF_REG_A denotes internal BPF register R0 used to map classic register A
        in internal BPF programs generated from classic
      
      - BPF_REG_X denotes internal BPF register R7 used to map classic register X
        in internal BPF programs generated from classic
      
      - internal BPF instruction format:
      struct sock_filter_int {
              __u8    code;           /* opcode */
              __u8    dst_reg:4;      /* dest register */
              __u8    src_reg:4;      /* source register */
              __s16   off;            /* signed offset */
              __s32   imm;            /* signed immediate constant */
      };
      
      - BPF_X/BPF_K is 1 bit used to encode source operand of instruction
      In classic:
        BPF_X - means use register X as source operand
        BPF_K - means use 32-bit immediate as source operand
      In internal:
        BPF_X - means use 'src_reg' register as source operand
        BPF_K - means use 32-bit immediate as source operand
      Suggested-by: NChema Gonzalez <chema@google.com>
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NChema Gonzalez <chema@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e430f34e
  16. 09 6月, 2014 1 次提交
  17. 06 6月, 2014 2 次提交
  18. 05 6月, 2014 1 次提交