1. 03 Mar, 2015 1 commit
  2. 18 Jan, 2015 1 commit
    • netlink: make nlmsg_end() and genlmsg_end() void · 053c095a
      Johannes Berg committed
      Contrary to common expectations for an "int" return, these functions
      return only a positive value -- if used correctly they cannot even
      return 0 because the message header will necessarily be in the skb.
      
      This makes the very common pattern of
      
        if (genlmsg_end(...) < 0) { ... }
      
      be a whole bunch of dead code. Many places also simply do
      
        return nlmsg_end(...);
      
      and the caller is expected to deal with it.
      
      This also commonly (at least for me) causes errors, because it is very
      common to write
      
        if (my_function(...))
          /* error condition */
      
      and if my_function() does "return nlmsg_end()" this is of course wrong.
      
      Additionally, there's not a single place in the kernel that actually
      needs the message length returned, and if anyone needs it later then
      it'll be very easy to just use skb->len there.
      
      Remove this, and make the functions void. This removes a bunch of dead
      code as described above. The patch adds lines because I did
      
      -	return nlmsg_end(...);
      +	nlmsg_end(...);
      +	return 0;
      
      I could have preserved all the function's return values by returning
      skb->len, but instead I've audited all the places calling the affected
      functions and found that none cared. A few places actually compared
      the return value with <= 0 in dump functionality, but that could just
      be changed to < 0 with no change in behaviour, so I opted for the more
      efficient version.
      
      One instance of the error I've made numerous times now is also present
      in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
      check for <0 or <=0 and thus broke out of the loop every single time.
      I've preserved this since it will (I think) have caused the messages to
      userspace to be formatted differently with just a single message for
      every SKB returned to userspace. It's possible that this isn't needed
      for the tools that actually use this, but I don't even know what they
      are so couldn't test that changing this behaviour would be acceptable.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
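The pitfall described in the netlink commit above can be reproduced outside the kernel. The following is a minimal user-space sketch, not kernel code: `fake_msg`, `old_msg_end()`, `new_msg_end()`, `fill_old()`, `fill_new()` and `caller_sees_error()` are hypothetical stand-ins invented for illustration. It only shows why forwarding a positive length as a status code collides with the usual "non-zero means error" convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: it only tracks the message length. */
struct fake_msg {
    size_t len;
};

/* Old-style finalizer: returns the total length, which is always > 0
 * once a header has been written. */
static int old_msg_end(const struct fake_msg *msg)
{
    assert(msg != NULL);
    return (int)msg->len;
}

/* New-style finalizer: returns nothing, so nothing can be misread. */
static void new_msg_end(const struct fake_msg *msg)
{
    assert(msg != NULL);
}

/* Buggy fill helper: forwards the positive length as its "status". */
static int fill_old(struct fake_msg *msg)
{
    msg->len = 16; /* pretend a 16-byte header was written */
    return old_msg_end(msg);
}

/* Fixed fill helper: returns 0 on success, as callers expect. */
static int fill_new(struct fake_msg *msg)
{
    msg->len = 16;
    new_msg_end(msg);
    return 0;
}

/* Caller that follows the common "non-zero means error" convention. */
static bool caller_sees_error(int (*fill)(struct fake_msg *))
{
    struct fake_msg msg = { 0 };
    return fill(&msg) != 0;
}
```

With these definitions, `caller_sees_error(fill_old)` reports an error even though nothing failed, while `caller_sees_error(fill_new)` does not.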
  3. 03 Nov, 2014 1 commit
  4. 18 Sep, 2014 1 commit
    • ipsec: Remove obsolete MAX_AH_AUTH_LEN · 689f1c9d
      Herbert Xu committed
      While tracking down the MAX_AH_AUTH_LEN crash in an old kernel
      I thought that this limit was rather arbitrary and we should
      just get rid of it.
      
      In fact it seems that we've already done all the work needed
      to remove it apart from actually removing it.  This limit was
      there in order to limit stack usage.  Since we've already
      switched over to allocating scratch space using kmalloc, there
      is no longer any need to limit the authentication length.
      
      This patch kills all references to it, including the BUG_ONs
      that led me here.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  5. 02 Sep, 2014 1 commit
    • xfrm: configure policy hash table thresholds by netlink · 880a6fab
      Christophe Gouault committed
      Allow the local and remote prefix length thresholds for the policy
      hash table to be specified via a netlink XFRM_MSG_NEWSPDINFO message.
      
      Prefix length thresholds are specified by the optional
      XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH attributes
      (struct xfrmu_spdhthresh).
      
      Example:
      
          /* Designated initializers are separated by commas. */
          struct xfrmu_spdhthresh thresh4 = {
              .lbits = 0,
              .rbits = 24,
          };
          struct xfrmu_spdhthresh thresh6 = {
              .lbits = 0,
              .rbits = 56,
          };
          struct nlmsghdr *hdr;
          struct nl_msg *msg;
      
          /* sk is assumed to be a struct nl_sock * already connected to NETLINK_XFRM. */
          msg = nlmsg_alloc();
          /* The thresholds travel in an XFRM_MSG_NEWSPDINFO message. */
          hdr = nlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, XFRM_MSG_NEWSPDINFO, sizeof(__u32), NLM_F_REQUEST);
          nla_put(msg, XFRMA_SPD_IPV4_HTHRESH, sizeof(thresh4), &thresh4);
          nla_put(msg, XFRMA_SPD_IPV6_HTHRESH, sizeof(thresh6), &thresh6);
          nl_send_auto(sk, msg);
      
      The numbers are the policy selector minimum prefix lengths to put a
      policy in the hash table.
      
      - lbits is the local threshold (source address for out policies,
        destination address for in and fwd policies).
      
      - rbits is the remote threshold (destination address for out
        policies, source address for in and fwd policies).
      
      The default values are:
      
      XFRMA_SPD_IPV4_HTHRESH: 32 32
      XFRMA_SPD_IPV6_HTHRESH: 128 128
      
      The SPD is rebuilt dynamically when the threshold values are
      changed.
      
      The current thresholds can be read via an XFRM_MSG_GETSPDINFO request:
      the kernel replies to XFRM_MSG_GETSPDINFO requests with an
      XFRM_MSG_NEWSPDINFO message carrying both the
      XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH attributes.
      Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
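The threshold rule from the xfrm commit above ("minimum prefix lengths to put a policy in the hash table") can be sketched as a small predicate. This is an illustration under assumptions, not the kernel's implementation: `struct hthresh` and `selector_is_hashed()` are hypothetical names, and only the comparison stated in the commit message is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the two fields named in the commit message. */
struct hthresh {
    uint8_t lbits; /* minimum local prefix length */
    uint8_t rbits; /* minimum remote prefix length */
};

/* A selector qualifies for the hash table only if both of its prefix
 * lengths are at least as long as the configured thresholds. */
static bool selector_is_hashed(const struct hthresh *t,
                               uint8_t local_plen, uint8_t remote_plen)
{
    assert(t != NULL);
    return local_plen >= t->lbits && remote_plen >= t->rbits;
}
```

Under the default IPv4 thresholds (32 32) only host-to-host selectors qualify. With the example values from the commit (lbits = 0, rbits = 24), any selector whose remote prefix is /24 or longer would qualify.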
  6. 30 Jun, 2014 1 commit
  7. 04 Jun, 2014 1 commit
    • xfrm: fix race between netns cleanup and state expire notification · 21ee543e
      Michal Kubecek committed
      The xfrm_user module registers its pernet init/exit after xfrm
      itself so that its net exit function xfrm_user_net_exit() is
      executed before xfrm_net_exit(), which calls xfrm_state_fini() to
      clean up the SAs (xfrm states). This opens a window between
      zeroing the net->xfrm.nlsk pointer and deleting all xfrm_state
      instances which may access it (via the timer). If an xfrm state
      expires in this window, xfrm_exp_state_notify() will pass a null
      pointer as the socket to nlmsg_multicast().
      
      As the notifications are called inside an rcu_read_lock() block, it
      is sufficient to retrieve the nlsk socket with rcu_dereference()
      and check it for null.
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
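The "load the pointer once, then check it for null" pattern from the commit above can be approximated in user space. The sketch below is only an analogy: it uses a C11 atomic load in place of rcu_dereference(), it does not model RCU grace periods, and `fake_sock`, `fake_net` and `notify()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the netlink socket. */
struct fake_sock {
    int sent; /* number of notifications delivered */
};

/* Hypothetical stand-in for the per-namespace state holding nlsk. */
struct fake_net {
    _Atomic(struct fake_sock *) nlsk;
};

/* Read the socket pointer exactly once and bail out if teardown has
 * already cleared it, instead of passing NULL further down. */
static int notify(struct fake_net *net)
{
    struct fake_sock *sk;

    assert(net != NULL);
    sk = atomic_load_explicit(&net->nlsk, memory_order_acquire);
    if (sk == NULL)
        return -1; /* namespace is going away, drop the notification */

    sk->sent++;
    return 0;
}
```

The point of the single load is that the check and the use see the same pointer value, so a concurrent clear cannot slip in between them.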
  8. 25 Apr, 2014 1 commit
  9. 23 Apr, 2014 1 commit
  10. 22 Apr, 2014 1 commit
    • xfrm: Remove useless secid field from xfrm_audit. · f1370cc4
      Tetsuo Handa committed
      It seems to me that commit ab5f5e8b "[XFRM]: xfrm audit calls" is doing
      something strange in xfrm_audit_helper_usrinfo().
      If secid != 0 && security_secid_to_secctx(secid) != 0, the caller calls
      audit_log_task_context(), which basically does the same as the
      secid != 0 && security_secid_to_secctx(secid) == 0 case,
      except that secid is obtained from the current thread's context.
      
      Oh, what happens if the secid passed to xfrm_audit_helper_usrinfo() was
      obtained from another thread's context? It might audit the current thread's
      context rather than the other thread's context if security_secid_to_secctx()
      in xfrm_audit_helper_usrinfo() failed for some reason.
      
      Then, are all the callers of xfrm_audit_helper_usrinfo() passing either
      a secid obtained from the current thread's context or secid == 0?
      It seems to me that they are.
      
      If I didn't miss something, we don't need to pass secid to
      xfrm_audit_helper_usrinfo() because audit_log_task_context() will
      obtain secid from the current thread's context.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  11. 10 Mar, 2014 1 commit
    • selinux: add gfp argument to security_xfrm_policy_alloc and fix callers · 52a4c640
      Nikolay Aleksandrov committed
      security_xfrm_policy_alloc can be called in atomic context so the
      allocation should be done with GFP_ATOMIC. Add an argument to let the
      callers choose the appropriate way. In order to do so a gfp argument
      needs to be added to the method xfrm_policy_alloc_security in struct
      security_operations and to the internal function
      selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
      callers and leave GFP_KERNEL as before for the rest.
      The path that needed the gfp argument addition is:
      security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
      all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
      selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)
      
      Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
      add it to security_context_to_sid which is used inside and prior to this
      patch did only GFP_KERNEL allocation. So add gfp argument to
      security_context_to_sid and adjust all of its callers as well.
      
      CC: Paul Moore <paul@paul-moore.com>
      CC: Dave Jones <davej@redhat.com>
      CC: Steffen Klassert <steffen.klassert@secunet.com>
      CC: Fan Du <fan.du@windriver.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: LSM list <linux-security-module@vger.kernel.org>
      CC: SELinux list <selinux@tycho.nsa.gov>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  12. 07 Mar, 2014 1 commit
  13. 20 Feb, 2014 1 commit
  14. 17 Feb, 2014 1 commit
    • ipsec: add support of limited SA dump · d3623099
      Nicolas Dichtel committed
      The goal of this patch is to allow userland to dump only a subset of the
      SAs by specifying a filter during the dump.
      The kernel is in charge of filtering the SAs, which avoids generating
      useless netlink traffic (and also saves some CPU cycles). This is
      particularly useful when a large number of SAs are set on the system.
      
      Note that I removed the union in struct xfrm_state_walk to fix a problem on arm.
      struct netlink_callback->args is defined as an array of 6 longs and the first
      long is used in the xfrm code to flag the cb as initialized. Hence, we must have:
      sizeof(struct xfrm_state_walk) <= sizeof(long) * 5.
      With the union, this was false on arm (sizeof(struct xfrm_state_walk) was
      sizeof(long) * 7), due to the padding.
      In fact, whatever the arch is, this union seems useless: there will always be
      padding after it. Removing it will not increase the size of this struct (and
      will reduce it on arm).
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
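The size constraint from the SA dump commit above (the walk state must fit into the five longs left after the "initialized" flag) can be checked with a few lines of C. This is a sketch under assumptions: `struct walk_state` is a hypothetical layout loosely modelled on the description, not the kernel's struct xfrm_state_walk, and `fits_in_slots()` and `CB_ARGS_SLOTS` are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The commit message says the callback scratch area is 6 longs wide. */
#define CB_ARGS_SLOTS 6

/* Hypothetical doubly linked list node (two pointers). */
struct list_links {
    void *next;
    void *prev;
};

/* Hypothetical walk state without a union: three byte-sized fields
 * packed next to each other, then a 32-bit sequence and a pointer. */
struct walk_state {
    struct list_links all;
    uint8_t state;
    uint8_t dying;
    uint8_t proto;
    uint32_t seq;
    void *filter;
};

/* True if an object of `size` bytes fits into `free_slots` longs. */
static bool fits_in_slots(size_t size, size_t free_slots)
{
    assert(free_slots > 0);
    return size <= sizeof(long) * free_slots;
}
```

On the usual ILP32 and LP64 Linux ABIs, `fits_in_slots(sizeof(struct walk_state), CB_ARGS_SLOTS - 1)` holds, whereas a struct of `sizeof(long) * 7` bytes, the size reported for arm in the commit message, cannot fit.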
  15. 13 Feb, 2014 1 commit
  16. 12 Feb, 2014 1 commit
  17. 14 Jan, 2014 1 commit
  18. 02 Jan, 2014 2 commits
  19. 16 Dec, 2013 2 commits
  20. 06 Dec, 2013 3 commits
    • xfrm: Namespacify xfrm state/policy locks · 283bc9f3
      Fan Du committed
      Semantically, the xfrm layer is fully namespace aware, and so
      should its locks be, e.g. xfrm_state/policy_lock.
      Guarding the state/policy linked lists of different namespaces
      with one global lock is wrong in terms of semantics in the
      first place, as the namespaces are mutually independent, but
      more seriously it also causes a scalability problem.
      
      One practical scenario is an Open Network Stack where
      hundreds of lxc tenants act as routers within one host:
      a global xfrm_state/policy_lock becomes the bottleneck.
      Once those locks are decoupled in a per-namespace fashion,
      lock contention stays within the scope of a specific
      namespace, without causing additional SPD/SAD access delay
      for other namespaces.
      
      This patch therefore improves scalability without changing
      the original xfrm behavior.
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: Using the right namespace to migrate key info · 8d549c4f
      Fan Du committed
      Use the right namespace, because the home agent may well run
      in a net namespace other than init_net. The original behavior
      could lead to inconsistent key info.
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: Try to honor policy index if it's supplied by user · e682adf0
      Fan Du committed
      The xfrm code always searches for an unused policy index for a
      newly created policy, regardless of whether user space supplied
      a policy index hint.
      
      This patch honors such a hint, so that
      "ip xfrm ... index=xxx" can be used to set a
      specific policy index.
      
      Currently this behavior is broken; this patch makes
      it work as expected.
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  21. 17 Sep, 2013 1 commit
    • xfrm: Guard IPsec anti replay window against replay bitmap · 33fce60d
      Fan Du committed
      For the legacy IPsec anti-replay mechanism:
      
      The bitmap in struct xfrm_replay_state can only provide a 32-bit
      window in the current design, so the user-level parameter
      sadb_sa_replay should honor this limit. Otherwise setkey -D
      reports misleading output ("replay=244"):
      
      192.168.25.2 192.168.22.2
      	esp mode=transport spi=147561170(0x08cb9ad2) reqid=0(0x00000000)
      	E: aes-cbc  9a8d7468 7655cf0b 719d27be b0ddaac2
      	A: hmac-sha1  2d2115c2 ebf7c126 1c54f186 3b139b58 264a7331
      	seq=0x00000000 replay=244 flags=0x00000000 state=mature
      	created: Sep 17 14:00:00 2013	current: Sep 17 14:00:22 2013
      	diff: 22(s)	hard: 30(s)	soft: 26(s)
      	last: Sep 17 14:00:00 2013	hard: 0(s)	soft: 0(s)
      	current: 1408(bytes)	hard: 0(bytes)	soft: 0(bytes)
      	allocated: 22	hard: 0	soft: 0
      	sadb_seq=1 pid=4854 refcnt=0
      192.168.22.2 192.168.25.2
      	esp mode=transport spi=255302123(0x0f3799eb) reqid=0(0x00000000)
      	E: aes-cbc  6485d990 f61a6bd5 e5660252 608ad282
      	A: hmac-sha1  0cca811a eb4fa893 c47ae56c 98f6e413 87379a88
      	seq=0x00000000 replay=244 flags=0x00000000 state=mature
      	created: Sep 17 14:00:00 2013	current: Sep 17 14:00:22 2013
      	diff: 22(s)	hard: 30(s)	soft: 26(s)
      	last: Sep 17 14:00:00 2013	hard: 0(s)	soft: 0(s)
      	current: 1408(bytes)	hard: 0(bytes)	soft: 0(bytes)
      	allocated: 22	hard: 0	soft: 0
      	sadb_seq=0 pid=4854 refcnt=0
      
      Also, optimize the window check in xfrm_replay_check by setting the
      desired x->props.replay_window once, when the xfrm_state is first
      created, so the comparison is done only once.
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
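The guard described in the anti-replay commit above amounts to clamping the requested window to the width of the legacy bitmap once, at state creation time. A minimal sketch, assuming a 32-bit bitmap as stated in the commit message; `clamp_replay_window()` is a hypothetical helper name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a user-supplied replay window to what a 32-bit legacy bitmap
 * can actually track. Doing this once when the state is created means
 * the per-packet check never has to compare against the bitmap width. */
static uint32_t clamp_replay_window(uint32_t requested)
{
    const uint32_t max_window = (uint32_t)(sizeof(uint32_t) * 8);

    assert(max_window == 32);
    return requested < max_window ? requested : max_window;
}
```

With this clamp, the misleading value from the setkey output (a requested window of 244) would be stored as 32, while smaller requests are kept unchanged.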
  22. 16 Sep, 2013 1 commit
  23. 01 Jun, 2013 1 commit
  24. 06 Mar, 2013 2 commits
  25. 30 Jan, 2013 1 commit
  26. 19 Nov, 2012 1 commit
    • net: Allow userns root to control llc, netfilter, netlink, packet, and xfrm · df008c91
      Eric W. Biederman committed
      Allow an unprivileged user who has created a user namespace, and then
      created a network namespace, to effectively use the new network
      namespace, by reducing capable(CAP_NET_ADMIN) and
      capable(CAP_NET_RAW) calls to ns_capable(net->user_ns,
      CAP_NET_ADMIN) or ns_capable(net->user_ns, CAP_NET_RAW) calls.
      
      Allow creation of af_key sockets.
      Allow creation of llc sockets.
      Allow creation of af_packet sockets.
      
      Allow sending xfrm netlink control messages.
      
      Allow binding to netlink multicast groups.
      Allow sending to netlink multicast groups.
      Allow adding and dropping netlink multicast groups.
      Allow sending to all netlink multicast groups and port ids.
      
      Allow reading the netfilter SO_IP_SET socket option.
      Allow sending netfilter netlink messages.
      Allow setting and getting ip_vs netfilter socket options.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  27. 21 Sep, 2012 6 commits
  28. 19 Sep, 2012 2 commits
  29. 18 Sep, 2012 1 commit
    • userns: Convert the audit loginuid to be a kuid · e1760bd5
      Eric W. Biederman committed
      Always store audit loginuids in type kuid_t.
      
      Print loginuids by converting them into uids in the appropriate user
      namespace, and then printing the resulting uid.
      
      Modify audit_get_loginuid to return a kuid_t.
      
      Modify audit_set_loginuid to take a kuid_t.
      
      Modify /proc/<pid>/loginuid on read to convert the loginuid into the
      user namespace of the opener of the file.
      
      Modify /proc/<pid>/loginuid on write to convert the loginuid
      from the user namespace of the opener of the file.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Paul Moore <paul@paul-moore.com> ?
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>