1. 30 Oct, 2007 1 commit
    • [NET]: Fix race between poll_napi() and net_rx_action() · 0a7606c1
      David S. Miller committed
      netpoll_poll_lock() synchronizes the ->poll() invocation
      code paths, but once we have the lock we have to make
      sure that NAPI_STATE_SCHED is still set.  Otherwise we
      get:
      
      	cpu 0			cpu 1
      
      	net_rx_action()		poll_napi()
      	netpoll_poll_lock()	... spin on ->poll_lock
      	->poll()
      	  netif_rx_complete
      	netpoll_poll_unlock()	acquire ->poll_lock()
      				->poll()
      				 netif_rx_complete()
      				 CRASH
      
      Based upon a bug report from Tina Yang.
      Signed-off-by: David S. Miller <davem@davemloft.net>
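The re-check described in this commit message can be sketched in user-space C. This is a simplified model and not the kernel code: `struct napi_model`, `model_poll()` and `poll_if_scheduled()` are hypothetical names, and a pthread mutex stands in for `->poll_lock`. Only the idea of testing the SCHED bit again after taking the lock comes from the message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a NAPI context. */
struct napi_model {
	pthread_mutex_t poll_lock;  /* models ->poll_lock */
	bool sched;                 /* models NAPI_STATE_SCHED */
	int polls_run;              /* how many times the poll routine ran */
};

/* Models ->poll(): does the work and completes, clearing SCHED. */
static void model_poll(struct napi_model *n)
{
	n->polls_run++;
	n->sched = false;           /* models netif_rx_complete() */
}

/*
 * Take the lock, then check SCHED again before polling.
 * Returns true if the poll routine was actually invoked.
 */
static bool poll_if_scheduled(struct napi_model *n)
{
	bool ran = false;

	pthread_mutex_lock(&n->poll_lock);
	if (n->sched) {             /* the other CPU may have completed already */
		model_poll(n);
		ran = true;
	}
	pthread_mutex_unlock(&n->poll_lock);
	return ran;
}
```

In this model the second caller acquires the lock, sees that SCHED is no longer set, and skips the poll instead of completing the same context twice.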
  2. 27 Oct, 2007 2 commits
  3. 26 Oct, 2007 3 commits
  4. 24 Oct, 2007 5 commits
  5. 23 Oct, 2007 1 commit
  6. 22 Oct, 2007 3 commits
  7. 20 Oct, 2007 5 commits
    • Convert files to UTF-8 and some cleanups · 96de0e25
      Jan Engelhardt committed
      * Convert files to UTF-8.
      
        * Also correct some people's names
          (one example is Eißfeldt, which was found in a source file.
          That the author used an ß at all in a source file
          indicates that the real name does in fact have a 'ß' and not an 'ss',
          which is commonly used as a substitute for 'ß' when limited to
          7bit.)
      
        * Correct town names (Goettingen -> Göttingen)
      
        * Update Eberhard Mönkeberg's address (http://lkml.org/lkml/2007/1/8/313)
      Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
    • Use helpers to obtain task pid in printks · ba25f9dc
      Pavel Emelyanov committed
      The task_struct->pid member is going to be deprecated, so start
      using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
      the kernel.
      
      The first thing to start with is the pid printed to dmesg; in
      this case we may safely use task_pid_nr(). Besides, printks account
      for more (much more) than half of all the explicit pid usage.
      
      [akpm@linux-foundation.org: git-drm went and changed lots of stuff]
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Cc: Dave Airlie <airlied@linux.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • remove asm/bitops.h includes · 1977f032
      Jiri Slaby committed
      remove asm/bitops.h includes
      
      Including asm/bitops.h directly may cause compile errors. Don't include it;
      include linux/bitops.h instead. The next patch will deny including the asm
      header directly.
      
      Cc: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pid namespaces: changes to show virtual ids to user · b488893a
      Pavel Emelyanov committed
      This is the largest patch in the set. Make all (I hope) the places where
      the pid is shown to or obtained from the user operate on virtual pids.
      
      The idea is:
       - all in-kernel data structures must store either struct pid itself
         or the pid's global nr, obtained with pid_nr() call;
       - when seeking the task from kernel code with the stored id one
         should use find_task_by_pid() call that works with global pids;
       - when showing a pid's numerical value to the user the virtual one
         should be used; however, when one shows a task's pid outside this
         task's namespace the global one is to be used;
       - when getting the pid from userspace one needs to consider it as
         the virtual one and use the appropriate task/pid-searching functions.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: nuther build fix]
      [akpm@linux-foundation.org: yet nuther build fix]
      [akpm@linux-foundation.org: remove unneeded casts]
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul Menage <menage@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
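The split between global and virtual ids described in this commit message can be illustrated with a small user-space model. It is a sketch under stated assumptions and not the kernel's data structures: `struct ns_model`, `struct upid_model`, `struct pid_model` and both lookup functions are hypothetical, and returning 0 for a pid that is not visible in the given namespace is a choice made for this model.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVEL 4

/* Hypothetical namespace: level 0 is the initial (global) one. */
struct ns_model {
	int level;
};

/* One (number, namespace) pair per level in which the task is visible. */
struct upid_model {
	int nr;
	const struct ns_model *ns;
};

/* Hypothetical pid object: numbers[0] holds the global id. */
struct pid_model {
	int level;                              /* deepest level it lives in */
	struct upid_model numbers[MAX_LEVEL];
};

/* Global id: what in-kernel data structures would store. */
static int model_pid_nr(const struct pid_model *pid)
{
	return pid ? pid->numbers[0].nr : 0;
}

/* Virtual id as seen from namespace ns; 0 if not visible there. */
static int model_pid_nr_ns(const struct pid_model *pid,
			   const struct ns_model *ns)
{
	if (pid && ns->level <= pid->level &&
	    pid->numbers[ns->level].ns == ns)
		return pid->numbers[ns->level].nr;
	return 0;
}
```

A task created inside a child namespace then has two numbers: the global one used inside the kernel and the virtual one reported to processes in that namespace.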
    • Make access to task's nsproxy lighter · cf7b708c
      Pavel Emelyanov committed
      When someone wants to deal with some other task's namespaces, it has to lock
      the task and then get the desired namespace if it exists.  This is
      slow on read-only paths and may be impossible in some cases.
      
      E.g.  Oleg recently noticed a race between unshare() and the (sent for
      review in cgroups) pid namespaces - when the task notifies the parent it
      has to know the parent's namespace, but taking the task_lock() is
      impossible there - the code is under write locked tasklist lock.
      
      On the other hand, switching the namespace on a task (daemonize) and releasing
      the namespace (after the last task exits) are rather rare operations, and we
      can sacrifice their speed to solve the issues above.
      
      The access to other task namespaces is proposed to be performed
      like this:
      
           rcu_read_lock();
           nsproxy = task_nsproxy(tsk);
           if (nsproxy != NULL) {
                   /*
                    * work with the namespaces here
                    * e.g. get the reference on one of them
                    */
           } /*
              * NULL task_nsproxy() means that this task is
              * almost dead (zombie)
              */
           rcu_read_unlock();
      
      This patch has passed review by Eric and Oleg :) and,
      of course, has been tested.
      
      [clg@fr.ibm.com: fix unshare()]
      [ebiederm@xmission.com: Update get_net_ns_by_pid]
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 19 Oct, 2007 2 commits
    • [NET]: Fix bug in sk_filter race cures. · 9b013e05
      Olof Johansson committed
      Looks like this might be causing problems, at least for me on ppc. This
      happened during a normal boot, right around first interface config/dhcp
      run..
      
      cpu 0x0: Vector: 300 (Data Access) at [c00000000147b820]
          pc: c000000000435e5c: .sk_filter_delayed_uncharge+0x1c/0x60
          lr: c0000000004360d0: .sk_attach_filter+0x170/0x180
          sp: c00000000147baa0
         msr: 9000000000009032
         dar: 4
       dsisr: 40000000
        current = 0xc000000004780fa0
        paca    = 0xc000000000650480
          pid   = 1295, comm = dhclient3
      0:mon> t
      [c00000000147bb20] c0000000004360d0 .sk_attach_filter+0x170/0x180
      [c00000000147bbd0] c000000000418988 .sock_setsockopt+0x788/0x7f0
      [c00000000147bcb0] c000000000438a74 .compat_sys_setsockopt+0x4e4/0x5a0
      [c00000000147bd90] c00000000043955c .compat_sys_socketcall+0x25c/0x2b0
      [c00000000147be30] c000000000007508 syscall_exit+0x0/0x40
      --- Exception: c01 (System Call) at 000000000ff618d8
      SP (fffdf040) is in userspace
      0:mon> 
      
      I.e. null pointer deref at sk_filter_delayed_uncharge+0x1c:
      
      0:mon> di $.sk_filter_delayed_uncharge
      c000000000435e40  7c0802a6      mflr    r0
      c000000000435e44  fbc1fff0      std     r30,-16(r1)
      c000000000435e48  7c8b2378      mr      r11,r4
      c000000000435e4c  ebc2cdd0      ld      r30,-12848(r2)
      c000000000435e50  f8010010      std     r0,16(r1)
      c000000000435e54  f821ff81      stdu    r1,-128(r1)
      c000000000435e58  380300a4      addi    r0,r3,164
      c000000000435e5c  81240004      lwz     r9,4(r4)
      
      That's the deref of fp:
      
      static void sk_filter_delayed_uncharge(struct sock *sk, struct sk_filter *fp)
      {
              unsigned int size = sk_filter_len(fp);
      ...
      
      That is called from sk_attach_filter():
      
      ...
              rcu_read_lock_bh();
              old_fp = rcu_dereference(sk->sk_filter);
              rcu_assign_pointer(sk->sk_filter, fp);
              rcu_read_unlock_bh();
      
              sk_filter_delayed_uncharge(sk, old_fp);
              return 0;
      ...
      
      So, looks like rcu_dereference() returned NULL. I don't know the
      filter code at all, but it seems like it might be a valid case?
      sk_detach_filter() seems to handle a NULL sk_filter, at least.
      
      So, this needs review by someone who knows the filter, but it fixes the
      problem for me:
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
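The patch itself is not reproduced in this listing. A minimal user-space sketch of the kind of guard the message argues for, assuming the fix is simply to skip the uncharge when there was no previous filter, could look like this. `struct filter_model`, `struct sock_model`, `model_uncharge()` and `model_attach()` are hypothetical names, and the RCU calls of the real code are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sk_filter and struct sock. */
struct filter_model {
	unsigned int len;
};

struct sock_model {
	struct filter_model *filter;  /* currently attached filter, may be NULL */
	unsigned int charged;         /* memory accounted to the socket */
};

/* Dereferences fp, so it must never be called with NULL. */
static void model_uncharge(struct sock_model *sk, struct filter_model *fp)
{
	sk->charged -= fp->len;
}

/* Swap in the new filter and uncharge the old one only if it existed. */
static void model_attach(struct sock_model *sk, struct filter_model *fp)
{
	struct filter_model *old_fp = sk->filter;

	sk->filter = fp;
	sk->charged += fp->len;

	if (old_fp)                   /* first attach: nothing to uncharge */
		model_uncharge(sk, old_fp);
}
```

The first attach on a socket finds no old filter, which corresponds to the NULL dereference in the trace above.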
    • sysctl: fix neighbour table sysctls. · d12af679
      Eric W. Biederman committed
      - In ipv6, change ndisc_ifinfo_syctl_change so it doesn't depend on binary
        sysctl names for a function that works with proc.
      
      - In neighbour.c, reorder the table to put the possibly unused entries
        at the end so we can remove them by terminating the table early.
      
      - In neighbour.c, kill the entries with questionable binary sysctl
        handling behavior.
      
      - In neighbour.c, if we don't have a strategy routine, remove the
        binary path, so we don't use the default sysctl strategy routine
        on data that is not ready for it.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
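The "terminate the table early" idea from the second bullet of this message can be shown with a toy table. This is only a sketch: `struct entry_model`, `table_len()` and `table_terminate_at()` are hypothetical, and a NULL `procname` is assumed to mark the end of the table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a sysctl table entry. */
struct entry_model {
	const char *procname;   /* NULL marks the end of the table */
};

/* Count entries up to (not including) the terminator. */
static size_t table_len(const struct entry_model *t)
{
	size_t n = 0;

	while (t[n].procname)
		n++;
	return n;
}

/* Drop everything from index keep onward by terminating the table early. */
static void table_terminate_at(struct entry_model *t, size_t keep)
{
	if (keep < table_len(t))
		t[keep].procname = NULL;
}
```

Because the possibly unused entries sit at the end, cutting them off is a single write to one slot and needs no copying or compaction of the remaining entries.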
  9. 18 Oct, 2007 5 commits
  10. 16 Oct, 2007 6 commits
  11. 14 Oct, 2007 1 commit
    • net core: fix kernel-doc for new function parameters · c4ea43c5
      Randy Dunlap committed
      Fix networking code kernel-doc for newly added parameters.
      
      Warning(linux-2.6.23-git2//net/core/sock.c:879): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:570): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:594): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:617): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:641): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:667): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:722): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:959): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:1195): No description found for parameter 'dev'
      Warning(linux-2.6.23-git2//net/core/dev.c:2105): No description found for parameter 'n'
      Warning(linux-2.6.23-git2//net/core/dev.c:3272): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//net/core/dev.c:3445): No description found for parameter 'net'
      Warning(linux-2.6.23-git2//include/linux/netdevice.h:1301): No description found for parameter 'cpu'
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 13 Oct, 2007 1 commit
    • Driver core: change add_uevent_var to use a struct · 7eff2e7a
      Kay Sievers committed
      This changes the uevent buffer functions to use a struct instead of a
      long list of parameters. The caller is no longer required to do the
      proper buffer termination and size accounting, which is currently wrong
      in some places. It fixes a known bug where parts of the uevent
      environment are overwritten because of wrong index calculations.
      
      Many thanks to Mathieu Desnoyers for finding bugs and improving the
      error handling.
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
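A user-space sketch of the struct-based approach this message describes: the environment pointers, the backing buffer and both indices live in one struct, and a single function does the termination and size accounting. `struct env_model`, `env_add_var()`, the two size constants and the -1 error return are assumptions made for this illustration and not the kernel's definitions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define ENV_NUM_ENVP    4    /* hypothetical limits, kept small on purpose */
#define ENV_BUFFER_SIZE 64

/* All bookkeeping lives in one place instead of separate parameters. */
struct env_model {
	char *envp[ENV_NUM_ENVP];   /* pointers to each KEY=value string */
	int envp_idx;               /* next free slot in envp */
	char buf[ENV_BUFFER_SIZE];  /* backing storage for the strings */
	int buflen;                 /* bytes of buf already used */
};

/* Append one formatted variable; returns 0 on success, -1 if it won't fit. */
static int env_add_var(struct env_model *env, const char *fmt, ...)
{
	size_t avail = sizeof(env->buf) - (size_t)env->buflen;
	va_list args;
	int len;

	if (env->envp_idx >= ENV_NUM_ENVP)
		return -1;              /* no free pointer slot */

	va_start(args, fmt);
	len = vsnprintf(&env->buf[env->buflen], avail, fmt, args);
	va_end(args);

	if (len < 0 || (size_t)len >= avail)
		return -1;              /* string (plus NUL) would overflow buf */

	env->envp[env->envp_idx++] = &env->buf[env->buflen];
	env->buflen += len + 1;     /* account for the terminating NUL */
	return 0;
}
```

A caller only passes the struct and a format string, for example `env_add_var(&env, "ACTION=%s", "add")`, so the index arithmetic cannot drift between call sites.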
  13. 11 Oct, 2007 5 commits
    • [NET]: make netlink user -> kernel interface synchronious · cd40b7d3
      Denis V. Lunev committed
      This patch makes processing of netlink user -> kernel messages synchronous.
      The change was inspired by a talk with Alexey Kuznetsov about current
      netlink message processing. He says that he was badly wrong when he
      introduced asynchronous user -> kernel communication.
      
      The netlink_unicast call is the only path for sending a message to the
      kernel netlink socket. Unfortunately, it is also used to send data to
      the user.
      
      Before this change, the user message was attached to the socket queue
      and sk->sk_data_ready was called. The process was blocked until all
      pending messages were processed. The bad thing is that this processing
      may occur in an arbitrary process context.
      
      This patch changes the nlk->data_ready callback to take 1 skb and forces
      packet processing right in netlink_unicast.
      
      The kernel -> user path in netlink_unicast remains untouched.
      
      EINTR processing in netlink_run_queue was changed. It forces an rtnl_lock
      drop, but the process remains in the loop until the message is fully
      processed. So there is no need to use this kludge now.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: rtnl_unlock cleanups · 1536cc0d
      Denis V. Lunev committed
      There is no need to process outstanding netlink user->kernel packets
      during rtnl_unlock now. There is no rtnl_trylock in the rtnetlink_rcv
      anymore.
      
      Normal code path is the following:
      netlink_sendmsg
         netlink_unicast
             netlink_sendskb
                 skb_queue_tail
                 netlink_data_ready
                     rtnetlink_rcv
                         mutex_lock(&rtnl_mutex);
                         netlink_run_queue(sk, qlen, &rtnetlink_rcv_msg);
                         mutex_unlock(&rtnl_mutex);
      
      So, it is possible, that packets can be present in the rtnl->sk_receive_queue
      during rtnl_unlock, but there is no need to process them at that moment as
      rtnetlink_rcv for that packet is pending.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Remove double dev->flags checking when calling dev_close() · 9b772652
      Pavel Emelyanov committed
      The unregister_netdevice() and dev_change_net_namespace()
      both check for dev->flags to be IFF_UP before calling the
      dev_close(), but the dev_close() checks for IFF_UP itself,
      so remove those unneeded checks.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETNS]: Don't memset() netns to zero manually · 32f0c4cb
      Pavel Emelyanov committed
      The newly created net namespace is set to 0 with memset()
      in setup_net(). The setup_net() is also called for the
      init_net_ns(), which is zeroed naturally as a global var.
      
      So remove this memset and allocate new nets with the
      kmem_cache_zalloc().
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETNS]: Move some code into __init section when CONFIG_NET_NS=n · 4665079c
      Pavel Emelyanov committed
      With the net namespaces, much code has left the __init section,
      making the kernel occupy more memory than it did before.
      Since we have a config option that prohibits namespace
      creation, the functions that initialize/finalize some netns
      stuff are simply not needed and can be freed after boot.
      
      Currently this is hardly noticeable, since few calls
      are no longer in __init, but when the namespaces are
      merged it will be possible to free more code. I propose to
      use the __net_init, __net_exit and __net_initdata "attributes"
      for functions/variables that are not used if CONFIG_NET_NS
      is not set, to save more space in memory.
      
      The exiting functions cannot just reside in the __exit section,
      as noticed by David, since the init section will have
      references on it and the compilation will fail due to modpost
      checks. These references can exist, since the init namespace
      never dies and the exit callbacks are never called. So I
      introduce the __exit_refok attribute just like it is already
      done with the __init_refok.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>