1. 29 10月, 2016 9 次提交
  2. 28 10月, 2016 9 次提交
    • A
      net: skip genenerating uevents for network namespaces that are exiting · 002d8a1a
      Andrey Vagin 提交于
      No one can see these events, because a network namespace can not be
      destroyed, if it has sockets.
      
      Unlike other devices, uevent-s for network devices are generated
      only inside their network namespaces. They are filtered in
      kobj_bcast_filter()
      
      My experiments shows that net namespaces are destroyed more 30% faster
      with this optimization.
      
      Here is a perf output for destroying network namespaces without this
      patch.
      
      -   94.76%     0.02%  kworker/u48:1  [kernel.kallsyms]     [k] cleanup_net
         - 94.74% cleanup_net
            - 94.64% ops_exit_list.isra.4
               - 41.61% default_device_exit_batch
                  - 41.47% unregister_netdevice_many
                     - rollback_registered_many
                        - 40.36% netdev_unregister_kobject
                           - 14.55% device_del
                              + 13.71% kobject_uevent
                           - 13.04% netdev_queue_update_kobjects
                              + 12.96% kobject_put
                           - 12.72% net_rx_queue_update_kobjects
                                kobject_put
                              - kobject_release
                                 + 12.69% kobject_uevent
                        + 0.80% call_netdevice_notifiers_info
               + 19.57% nfsd_exit_net
               + 11.15% tcp_net_metrics_exit
               + 8.25% rpcsec_gss_exit_net
      
      It's very critical to optimize the exit path for network namespaces,
      because they are destroyed under net_mutex and many namespaces can be
      destroyed for one iteration.
      
      v2: use dev_set_uevent_suppress()
      
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrei Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      002d8a1a
    • S
      ethernet: fix min/max MTU typos · 110447f8
      Stefan Richter 提交于
      Fixes: d894be57('ethernet: use net core MTU range checking in more drivers')
      CC: Jarod Wilson <jarod@redhat.com>
      CC: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: NStefan Richter <stefanr@s5r6.in-berlin.de>
      Acked-by: NJarod Wilson <jarod@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      110447f8
    • D
      Merge branch 'genetlink-improvements' · 0fb6af70
      David S. Miller 提交于
      Johannes Berg says:
      
      ====================
      genetlink improvements
      
      This series contains some generic netlink improvements, making
      the API safer to use, and making the function pointers in the
      family struct safer by allowing it to be __ro_after_init.
      
      The first patch, introducing genl_family_attrbuf(), just ensures
      that the users of family->attrbuf aren't actually racy, but making
      them use the indirection function for obtaining a reference and
      checking that the context can actually do so.
      
      The second patch removes the more or less broken ability to have
      a static family ID, the three IDs that need to be static because
      it's simply needed (genl controller), or due to old API misused.
      Everything else couldn't be static anyway, or could fail when the
      family is registered, if somebody else already got a static ID.
      
      The third patch statically initializes the families, mostly to save
      some code. I wrote this initially because I thought I could make
      them all const, but that ends up being very inefficient (it would
      require always doing some kind of family -> id lookup), so now it's
      just here because I had it already and it reduces the code size.
      
      The fourth patch then, finally, lays the groundwork for what I had
      really wanted - now with __ro_after_init instead of const; I remove
      code there to do the ID->family hash table mapping in genetlink and
      use IDR instead to both allocate and map the IDs, which again ends
      up saving some code size.
      
      Finally, the fifth patch updates all families, as it turns out, no
      families exist that really dynamically register/unregister. This
      last patch should perhaps be split up, I could submit it for each
      subsystem separately, but it'd depend on the second and third to
      go in first, so would take a while. I can do that though, if that
      seems better to you.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0fb6af70
    • J
      genetlink: mark families as __ro_after_init · 56989f6d
      Johannes Berg 提交于
      Now genl_register_family() is the only thing (other than the
      users themselves, perhaps, but I didn't find any doing that)
      writing to the family struct.
      
      In all families that I found, genl_register_family() is only
      called from __init functions (some indirectly, in which case
      I've add __init annotations to clarifly things), so all can
      actually be marked __ro_after_init.
      
      This protects the data structure from accidental corruption.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56989f6d
    • J
      genetlink: use idr to track families · 2ae0f17d
      Johannes Berg 提交于
      Since generic netlink family IDs are small integers, allocated
      densely, IDR is an ideal match for lookups. Replace the existing
      hand-written hash-table with IDR for allocation and lookup.
      
      This lets the families only be written to once, during register,
      since the list_head can be removed and removal of a family won't
      cause any writes.
      
      It also slightly reduces the code size (by about 1.3k on x86-64).
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ae0f17d
    • J
      genetlink: statically initialize families · 489111e5
      Johannes Berg 提交于
      Instead of providing macros/inline functions to initialize
      the families, make all users initialize them statically and
      get rid of the macros.
      
      This reduces the kernel code size by about 1.6k on x86-64
      (with allyesconfig).
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      489111e5
    • J
      genetlink: no longer support using static family IDs · a07ea4d9
      Johannes Berg 提交于
      Static family IDs have never really been used, the only
      use case was the workaround I introduced for those users
      that assumed their family ID was also their multicast
      group ID.
      
      Additionally, because static family IDs would never be
      reserved by the generic netlink code, using a relatively
      low ID would only work for built-in families that can be
      registered immediately after generic netlink is started,
      which is basically only the control family (apart from
      the workaround code, which I also had to add code for so
      it would reserve those IDs)
      
      Thus, anything other than GENL_ID_GENERATE is flawed and
      luckily not used except in the cases I mentioned. Move
      those workarounds into a few lines of code, and then get
      rid of GENL_ID_GENERATE entirely, making it more robust.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a07ea4d9
    • J
      genetlink: introduce and use genl_family_attrbuf() · c90c39da
      Johannes Berg 提交于
      This helper function allows family implementations to access
      their family's attrbuf. This gets rid of the attrbuf usage
      in families, and also adds locking validation, since it's not
      valid to use the attrbuf with parallel_ops or outside of the
      dumpit callback.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c90c39da
    • A
      skbedit: allow the user to specify bitmask for mark · 4fe77d82
      Antonio Quartulli 提交于
      The user may want to use only some bits of the skb mark in
      his skbedit rules because the remaining part might be used by
      something else.
      
      Introduce the "mask" parameter to the skbedit actor in order
      to implement such functionality.
      
      When the mask is specified, only those bits selected by the
      latter are altered really changed by the actor, while the
      rest is left untouched.
      Signed-off-by: NAntonio Quartulli <antonio@open-mesh.com>
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4fe77d82
  3. 27 10月, 2016 15 次提交
  4. 24 10月, 2016 7 次提交
    • C
      net: ip, diag -- Add diag interface for raw sockets · 432490f9
      Cyrill Gorcunov 提交于
      In criu we are actively using diag interface to collect sockets
      present in the system when dumping applications. And while for
      unix, tcp, udp[lite], packet, netlink it works as expected,
      the raw sockets do not have. Thus add it.
      
      v2:
       - add missing sock_put calls in raw_diag_dump_one (by eric.dumazet@)
       - implement @destroy for diag requests (by dsa@)
      
      v3:
       - add export of raw_abort for IPv6 (by dsa@)
       - pass net-admin flag into inet_sk_diag_fill due to
         changes in net-next branch (by dsa@)
      
      v4:
       - use @pad in struct inet_diag_req_v2 for raw socket
         protocol specification: raw module carries sockets
         which may have custom protocol passed from socket()
         syscall and sole @sdiag_protocol is not enough to
         match underlied ones
       - start reporting protocol specifed in socket() call
         when sockets are raw ones for the same reason: user
         space tools like ss may parse this attribute and use
         it for socket matching
      
      v5 (by eric.dumazet@):
       - use sock_hold in raw_sock_get instead of atomic_inc,
         we're holding (raw_v4_hashinfo|raw_v6_hashinfo)->lock
         when looking up so counter won't be zero here.
      
      v6:
       - use sdiag_raw_protocol() helper which will access @pad
         structure used for raw sockets protocol specification:
         we can't simply rename this member without breaking uapi
      
      v7:
       - sine sdiag_raw_protocol() helper is not suitable for
         uapi lets rather make an alias structure with proper
         names. __check_inet_diag_req_raw helper will catch
         if any of structure unintentionally changed.
      
      CC: David S. Miller <davem@davemloft.net>
      CC: Eric Dumazet <eric.dumazet@gmail.com>
      CC: David Ahern <dsa@cumulusnetworks.com>
      CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      CC: James Morris <jmorris@namei.org>
      CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Andrey Vagin <avagin@openvz.org>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      432490f9
    • T
      lwt: Remove unused len field · f76a9db3
      Thomas Graf 提交于
      The field is initialized by ILA and MPLS but never used. Remove it.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f76a9db3
    • A
      net: allow to kill a task which waits net_mutex in copy_new_ns · 7281a665
      Andrey Vagin 提交于
      net_mutex can be locked for a long time. It may be because many
      namespaces are being destroyed or many processes decide to create
      a network namespace.
      
      Both these operations are heavy, so it is better to have an ability to
      kill a process which is waiting net_mutex.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrei Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7281a665
    • S
      net/sched: em_meta: Fix 'meta vlan' to correctly recognize zero VID frames · d65f2fa6
      Shmulik Ladkani 提交于
      META_COLLECTOR int_vlan_tag() assumes that if the accel tag (vlan_tci)
      is zero, then no vlan accel tag is present.
      
      This is incorrect for zero VID vlan accel packets, making the following
      match fail:
        tc filter add ... basic match 'meta(vlan mask 0xfff eq 0)' ...
      
      Apparently 'int_vlan_tag' was implemented prior VLAN_TAG_PRESENT was
      introduced in 05423b24 "vlan: allow null VLAN ID to be used"
      (and at time introduced, the 'vlan_tx_tag_get' call in em_meta was not
       adapted).
      
      Fix, testing skb_vlan_tag_present instead of testing skb_vlan_tag_get's
      value.
      
      Fixes: 05423b24 ("vlan: allow null VLAN ID to be used")
      Fixes: 1a31f204 ("netsched: Allow meta match on vlan tag on receive")
      Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d65f2fa6
    • D
      Merge branch 'mlxsw-cosmetics-plus-res-mgmt-rewrite' · 1bac9381
      David S. Miller 提交于
      Jiri Pirko says:
      
      ====================
      mlxsw: Driver update
      
      Mostly cosmetics and small resource values management rewrite.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1bac9381
    • J
      mlxsw: Convert resources into array · c1a38311
      Jiri Pirko 提交于
      Since the number of resources is going to get much bigger, ease up the
      addition by simly defining IDs. Convert the existing structure members
      to a set array, one for validity, one for values. Introduce a set of
      getters and setters for easy access.
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c1a38311
    • J
      mlxsw: cmd: Push resource query defines to cmd.h · f38a2314
      Jiri Pirko 提交于
      Push cmd resource query related defines to cmd.h where they belong.
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f38a2314