  1. 08 Apr, 2015 3 commits
  2. 04 Apr, 2015 2 commits
  3. 30 Mar, 2015 1 commit
    • netns: don't clear nsid too early on removal · 4217291e
      Nicolas Dichtel committed
      With the current code, ids are removed too early.
      Suppose you have an ipip interface that stands in the netns foo and its link
      part in the netns bar (so the netns bar has an nsid into the netns foo).
      Now, you remove the netns bar:
       - the bar nsid into the netns foo is removed
       - the netns exit method of ipip is called, thus our ipip iface is removed:
         => a netlink message is sent in the netns foo to advertise this deletion
         => this netlink message requests an nsid for bar, thus a new nsid is
            allocated for bar and never removed.
      
      We must remove nsids only once we are sure that nobody will refer to the
      netns currently being cleaned up.
      
      Fixes: 0c7aecd4 ("netns: add rtnl cmd to add and get peer netns ids")
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 13 Mar, 2015 1 commit
  5. 24 Jan, 2015 1 commit
  6. 23 Jan, 2015 1 commit
  7. 20 Jan, 2015 1 commit
  8. 05 Dec, 2014 6 commits
  9. 10 Sep, 2014 1 commit
  10. 30 Jul, 2014 1 commit
    • namespaces: Use task_lock and not rcu to protect nsproxy · 728dba3a
      Eric W. Biederman committed
      The synchronous synchronize_rcu in switch_task_namespaces makes setns
      a sufficiently expensive system call that people have complained.
      
      Upon inspection, nsproxy no longer needs rcu protection for remote reads,
      and remote reads are rare.  So optimize for same-process reads and writes
      by switching to task_lock instead.
      
      This yields a simpler to understand lock, and a faster setns system call.
      
      In particular this fixes a performance regression observed
      by Rafael David Tinoco <rafael.tinoco@canonical.com>.
      
      This is effectively a revert of Pavel Emelyanov's commit
      cf7b708c Make access to task's nsproxy lighter
      from 2007.  The race this originally fixed no longer exists as
      do_notify_parent uses task_active_pid_ns(parent) instead of
      parent->nsproxy.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  11. 16 May, 2014 1 commit
  12. 27 Apr, 2014 1 commit
  13. 31 Aug, 2013 1 commit
  14. 02 May, 2013 1 commit
  15. 23 Feb, 2013 1 commit
  16. 15 Dec, 2012 1 commit
    • userns: Require CAP_SYS_ADMIN for most uses of setns. · 5e4a0847
      Eric W. Biederman committed
      Andy Lutomirski <luto@amacapital.net> found a nasty little bug in
      the permissions of setns.  With unprivileged user namespaces it
      became possible to create new namespaces without privilege.
      
      However, the setns calls were relaxed to only require CAP_SYS_ADMIN in
      the user namespace of the targeted namespace.
      
      This made the following nasty sequence possible:
      
      pid = clone(CLONE_NEWUSER | CLONE_NEWNS);
      if (pid == 0) { /* child */
      	system("mount --bind /home/me/passwd /etc/passwd");
      }
      else if (pid != 0) { /* parent */
      	char path[PATH_MAX];
      	snprintf(path, sizeof(path), "/proc/%d/ns/mnt", pid);
      	fd = open(path, O_RDONLY);
      	setns(fd, 0);
      	system("su -");
      }
      
      Prevent this possibility by requiring CAP_SYS_ADMIN
      in the current user namespace when joining all but the user namespace.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  17. 20 Nov, 2012 2 commits
    • proc: Usable inode numbers for the namespace file descriptors. · 98f842e6
      Eric W. Biederman committed
      Assign a unique proc inode to each namespace, and use that
      inode number to ensure we only allocate at most one proc
      inode for every namespace in proc.
      
      A single proc inode per namespace allows userspace to test
      to see if two processes are in the same namespace.
      
      This has been a long requested feature and only blocked because
      a naive implementation would put the id in a global space and
      would ultimately require having a namespace for the names of
      namespaces, making migration and certain virtualization tricks
      impossible.
      
      We still don't have per superblock inode numbers for proc, which
      appears necessary for application unaware checkpoint/restart and
      migrations (if the application is using namespace file descriptors)
      but that is now allowed by the design if it becomes important.
      
      I have preallocated the ipc and uts initial proc inode numbers so
      their structures can be statically initialized.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
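The same-namespace test that this commit makes possible can be sketched from userspace roughly as follows. This is a minimal sketch, not code from the commit: the helper names `same_namespace` and `same_namespace_as_self` are hypothetical, and it assumes `/proc` is mounted and that the caller is allowed to stat the other process's `/proc/<pid>/ns/<type>` entry.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if both processes are in the same
 * namespace of the given type ("net", "uts", "ipc", ...), 0 if they
 * are in different namespaces, and -1 if either entry cannot be read.
 */
int same_namespace(pid_t a, pid_t b, const char *type)
{
	char path_a[64], path_b[64];
	struct stat st_a, st_b;

	assert(type != NULL);
	snprintf(path_a, sizeof(path_a), "/proc/%ld/ns/%s", (long)a, type);
	snprintf(path_b, sizeof(path_b), "/proc/%ld/ns/%s", (long)b, type);

	/* stat() resolves each entry to the namespace's own inode. */
	if (stat(path_a, &st_a) != 0 || stat(path_b, &st_b) != 0)
		return -1;

	/* One inode per namespace: equal device and inode means same namespace. */
	return st_a.st_dev == st_b.st_dev && st_a.st_ino == st_b.st_ino;
}

/* Convenience wrapper comparing the calling process with another one. */
int same_namespace_as_self(pid_t other, const char *type)
{
	return same_namespace(getpid(), other, type);
}
```

Comparing the device number alongside the inode number is a defensive choice in this sketch; the commit message itself only speaks of the inode number.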
    • userns: Allow unprivileged use of setns. · 142e1d1d
      Eric W. Biederman committed
      - Push the permission check from the core setns syscall into
        the setns install methods where the user namespace of the
        target namespace can be determined, and used in a ns_capable
        call.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  18. 19 Nov, 2012 4 commits
  19. 19 Jul, 2012 1 commit
  20. 17 May, 2012 1 commit
  21. 18 Apr, 2012 1 commit
  22. 27 Jan, 2012 1 commit
    • netns: fix net_alloc_generic() · 073862ba
      Eric Dumazet committed
      When a new net namespace is created, we should attach to it a "struct
      net_generic" with enough slots (even empty), or we can hit the following
      BUG_ON() :
      
      [  200.752016] kernel BUG at include/net/netns/generic.h:40!
      ...
      [  200.752016]  [<ffffffff825c3cea>] ? get_cfcnfg+0x3a/0x180
      [  200.752016]  [<ffffffff821cf0b0>] ? lockdep_rtnl_is_held+0x10/0x20
      [  200.752016]  [<ffffffff825c41be>] caif_device_notify+0x2e/0x530
      [  200.752016]  [<ffffffff810d61b7>] notifier_call_chain+0x67/0x110
      [  200.752016]  [<ffffffff810d67c1>] raw_notifier_call_chain+0x11/0x20
      [  200.752016]  [<ffffffff821bae82>] call_netdevice_notifiers+0x32/0x60
      [  200.752016]  [<ffffffff821c2b26>] register_netdevice+0x196/0x300
      [  200.752016]  [<ffffffff821c2ca9>] register_netdev+0x19/0x30
      [  200.752016]  [<ffffffff81c1c67a>] loopback_net_init+0x4a/0xa0
      [  200.752016]  [<ffffffff821b5e62>] ops_init+0x42/0x180
      [  200.752016]  [<ffffffff821b600b>] setup_net+0x6b/0x100
      [  200.752016]  [<ffffffff821b6466>] copy_net_ns+0x86/0x110
      [  200.752016]  [<ffffffff810d5789>] create_new_namespaces+0xd9/0x190
      
      net_alloc_generic() should take into account the maximum index into the
      ptr array, as a subsystem might use net_generic() anytime.
      
      This also reduces number of reallocations in net_assign_generic()
      Reported-by: Sasha Levin <levinsasha928@gmail.com>
      Tested-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sjur Brændeland <sjur.brandeland@stericsson.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
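The sizing rule described in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `generic_table`, `register_subsys`, `alloc_generic` and `lookup_generic` are hypothetical stand-ins, not the kernel's definitions, and the `assert` in the lookup plays the role of the BUG_ON() from the trace.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a per-namespace table of subsystem pointers. */
struct generic_table {
	unsigned int len;	/* number of slots in ptr[] */
	void *ptr[];		/* slot for id N lives at ptr[N - 1] */
};

/* Highest id handed out so far (hypothetical global). */
static unsigned int max_ids;

/* Each subsystem gets the next id; ids start at 1. */
unsigned int register_subsys(void)
{
	return ++max_ids;
}

/*
 * Allocate a table with one (empty) slot for every id handed out so
 * far, so that a lookup by any registered id stays in bounds.
 */
struct generic_table *alloc_generic(void)
{
	size_t size = sizeof(struct generic_table) + max_ids * sizeof(void *);
	struct generic_table *t = calloc(1, size);

	if (t)
		t->len = max_ids;
	return t;
}

/* Bounds-checked lookup; the assert models the BUG_ON() in the report. */
void *lookup_generic(const struct generic_table *t, unsigned int id)
{
	assert(id != 0 && id <= t->len);
	return t->ptr[id - 1];
}
```

Sizing the table from the highest id at allocation time also means later assignments do not need to grow it, which matches the commit's remark about fewer reallocations.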
  23. 01 Nov, 2011 1 commit
  24. 02 Jul, 2011 1 commit
    • rtnl: provide link dump consistency info · 4e985ada
      Thomas Graf committed
      This patch adds a change sequence counter to each net namespace
      which is bumped whenever a netdevice is added or removed from
      the list. If such a change occurred while a link dump took place,
      the dump will have the NLM_F_DUMP_INTR flag set in the first
      message which has been interrupted and in all subsequent messages
      of the same dump.
      
      Note that links may still be modified or renamed while a dump is
      taking place, but we can guarantee that userspace receives a
      complete list of links and does not miss any.
      
      Testing:
      I have added 500 VLAN netdevices to make sure the dump is split
      over multiple messages. Then while continuously dumping links in
      one process I also continuously deleted and re-added a dummy
      netdevice in another process. Multiple dumps per second
      had the NLM_F_DUMP_INTR flag set.
      
      I guess we can wait for Johannes patch to hit net-next via the
      wireless tree.  I just wanted to give this some testing right away.
      Signed-off-by: Thomas Graf <tgraf@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
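On the receiving side, a userspace reader could check for the flag along these lines. This is a minimal sketch, not code from the patch: the function name `dump_was_interrupted` is hypothetical, and it assumes the buffer holds one or more complete netlink messages as returned by a single read of the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Hypothetical helper: walk the netlink messages in buf and return 1
 * if any of them carries NLM_F_DUMP_INTR, 0 otherwise.
 */
int dump_was_interrupted(const void *buf, size_t len)
{
	const struct nlmsghdr *nlh = buf;
	int remaining = (int)len;

	for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining)) {
		if (nlh->nlmsg_flags & NLM_F_DUMP_INTR)
			return 1;
	}
	return 0;
}
```

A caller that sees a nonzero result would typically treat the collected list as possibly inconsistent and repeat the dump; that retry policy is an assumption here, not something the commit mandates.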
  25. 13 Jun, 2011 1 commit
    • Delay struct net freeing while there's a sysfs instance referring to it · a685e089
      Al Viro committed
      	* new refcount in struct net, controlling actual freeing of the memory
      	* new method in kobj_ns_type_operations (->drop_ns())
      	* ->current_ns() semantics change - it's supposed to be followed by
      corresponding ->drop_ns().  For struct net in case of CONFIG_NET_NS it bumps
      the new refcount; net_drop_ns() decrements it and calls net_free() if the
      last reference has been dropped.  Method renamed to ->grab_current_ns().
      	* old net_free() callers call net_drop_ns() instead.
      	* sysfs_exit_ns() is gone, along with a large part of callchain
      leading to it; now that the references stored in ->ns[...] stay valid we
      do not need to hunt them down and replace them with NULL.  That fixes
      problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
      of sb->s_instances abuse.
      
      	Note that struct net *shutdown* logic has not changed - net_cleanup()
      is called exactly when it used to be called.  The only thing postponed by
      having a sysfs instance referring to that struct net is actual freeing of
      memory occupied by struct net.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
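The split between shutting a namespace down and freeing its memory can be modelled with two counters. The following is a toy userspace sketch, not the kernel code: the type `toy_net`, its field names and the `toy_*` functions are all hypothetical, and boolean flags stand in for the real shutdown and free steps so the ordering can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an object with separate "in use" and "memory held" counts. */
struct toy_net {
	int count;	/* active users; zero triggers shutdown */
	int passive;	/* holders of the memory; zero triggers freeing */
	bool shut_down;
	bool freed;
};

void toy_net_init(struct toy_net *n)
{
	n->count = 1;
	n->passive = 1;	/* reference owned by the namespace itself */
	n->shut_down = false;
	n->freed = false;
}

/* A sysfs-like holder pins the memory without keeping the object alive. */
void toy_grab_ns(struct toy_net *n)
{
	n->passive++;
}

/* Drop a memory reference; the last one frees the object. */
void toy_drop_ns(struct toy_net *n)
{
	assert(n->passive > 0);
	if (--n->passive == 0)
		n->freed = true;
}

/* Drop a user reference; the last one shuts down, then drops the memory ref. */
void toy_put_net(struct toy_net *n)
{
	assert(n->count > 0);
	if (--n->count == 0) {
		n->shut_down = true;	/* shutdown happens exactly as before */
		toy_drop_ns(n);		/* freeing waits for other holders */
	}
}
```

In this model a holder that called `toy_grab_ns()` delays only the `freed` transition; `shut_down` still flips as soon as the last user goes away, mirroring the note in the commit message.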
  26. 06 Jun, 2011 1 commit
  27. 25 May, 2011 1 commit
  28. 11 May, 2011 1 commit