1. 27 5月, 2010 1 次提交
    • E
      net: fix lock_sock_bh/unlock_sock_bh · 8a74ad60
      Eric Dumazet 提交于
      This new sock lock primitive was introduced to speedup some user context
      socket manipulation. But it is unsafe to protect two threads, one using
      regular lock_sock/release_sock, one using lock_sock_bh/unlock_sock_bh
      
      This patch changes lock_sock_bh to be careful against 'owned' state.
      If owned is found to be set, we must take the slow path.
      lock_sock_bh() now returns a boolean to say if the slow path was taken,
      and this boolean is used at unlock_sock_bh time to call the appropriate
      unlock function.
      
      After this change, BH are either disabled or enabled during the
      lock_sock_bh/unlock_sock_bh protected section. This might be misleading,
      so we rename these functions to lock_sock_fast()/unlock_sock_fast().
      Reported-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Tested-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8a74ad60
  2. 24 5月, 2010 4 次提交
    • H
      tun: Update classid on packet injection · 82862742
      Herbert Xu 提交于
      This patch makes tun update its socket classid every time we
      inject a packet into the network stack.  This is so that any
      updates made by the admin to the process writing packets to
      tun is effected.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      82862742
    • H
      cls_cgroup: Store classid in struct sock · f8451725
      Herbert Xu 提交于
      Up until now cls_cgroup has relied on fetching the classid out of
      the current executing thread.  This runs into trouble when a packet
      processing is delayed in which case it may execute out of another
      thread's context.
      
      Furthermore, even when a packet is not delayed we may fail to
      classify it if soft IRQs have been disabled, because this scenario
      is indistinguishable from one where a packet unrelated to the
      current thread is processed by a real soft IRQ.
      
      In fact, the current semantics is inherently broken, as a single
      skb may be constructed out of the writes of two different tasks.
      A different manifestation of this problem is when the TCP stack
      transmits in response of an incoming ACK.  This is currently
      unclassified.
      
      As we already have a concept of packet ownership for accounting
      purposes in the skb->sk pointer, this is a natural place to store
      the classid in a persistent manner.
      
      This patch adds the cls_cgroup classid in struct sock, filling up
      an existing hole on 64-bit :)
      
      The value is set at socket creation time.  So all sockets created
      via socket(2) automatically gains the ID of the thread creating it.
      Whenever another process touches the socket by either reading or
      writing to it, we will change the socket classid to that of the
      process if it has a valid (non-zero) classid.
      
      For sockets created on inbound connections through accept(2), we
      inherit the classid of the original listening socket through
      sk_clone, possibly preceding the actual accept(2) call.
      
      In order to minimise risks, I have not made this the authoritative
      classid.  For now it is only used as a backup when we execute
      with soft IRQs disabled.  Once we're completely happy with its
      semantics we can use it as the sole classid.
      
      Footnote: I have rearranged the error path on cls_group module
      creation.  If we didn't do this, then there is a window where
      someone could create a tc rule using cls_group before the cgroup
      subsystem has been registered.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f8451725
    • D
      net-2.6 : V2 - fix dev_get_valid_name · 8ce6cebc
      Daniel Lezcano 提交于
      the commit:
      
      commit d9031024
      Author: Octavian Purdila <opurdila@ixiacom.com>
      Date:   Wed Nov 18 02:36:59 2009 +0000
      
          net: device name allocation cleanups
      
      introduced a bug when there is a hash collision making impossible
      to rename a device with eth%d. This bug is very hard to reproduce
      and appears rarely.
      
      The problem is coming from we don't pass a temporary buffer to
      __dev_alloc_name but 'dev->name' which is modified by the function.
      
      A detailed explanation is here:
      
      http://marc.info/?l=linux-netdev&m=127417784011987&w=2
      
      Changelog:
       V2 : replaced strings comparison by pointers comparison
      Signed-off-by: NDaniel Lezcano <daniel.lezcano@free.fr>
      Reviewed-by: NOctavian Purdila <opurdila@ixiacom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8ce6cebc
    • D
      rtnetlink: Fix error handling in do_setlink() · 253683bb
      David Howells 提交于
      Commit c02db8c6:
      
      	Author:  Chris Wright <chrisw@sous-sol.org>
      	Date:    Sun May 16 01:05:45 2010 -0700
      	Subject: rtnetlink: make SR-IOV VF interface symmetric
      
      adds broken error handling to do_setlink() in net/core/rtnetlink.c.  The
      problem is the following chunk of code:
      
      	if (tb[IFLA_VFINFO_LIST]) {
      		struct nlattr *attr;
      		int rem;
      		nla_for_each_nested(attr, tb[IFLA_VFINFO_LIST], rem) {
      			if (nla_type(attr) != IFLA_VF_INFO)
        ---->				goto errout;
      			err = do_setvfinfo(dev, attr);
      			if (err < 0)
      				goto errout;
      			modified = 1;
      		}
      	}
      
      which can get to errout without setting err, resulting in the following error:
      
      net/core/rtnetlink.c: In function 'do_setlink':
      net/core/rtnetlink.c:904: warning: 'err' may be used uninitialized in this function
      
      Change the code to return -EINVAL in this case.  Note that this might not be
      the appropriate error though.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cc: Chris Wright <chrisw@sous-sol.org>
      cc: David S. Miller <davem@davemloft.net>
      Acked-by: NChris Wright <chrisw@sous-sol.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      253683bb
  3. 21 5月, 2010 2 次提交
  4. 18 5月, 2010 4 次提交
    • S
      net: Add netlink support for virtual port management (was iovnl) · 57b61080
      Scott Feldman 提交于
      Add new netdev ops ndo_{set|get}_vf_port to allow setting of
      port-profile on a netdev interface.  Extends netlink socket RTM_SETLINK/
      RTM_GETLINK with two new sub msgs called IFLA_VF_PORTS and IFLA_PORT_SELF
      (added to end of IFLA_cmd list).  These are both nested atrtibutes
      using this layout:
      
                    [IFLA_NUM_VF]
                    [IFLA_VF_PORTS]
                            [IFLA_VF_PORT]
                                    [IFLA_PORT_*], ...
                            [IFLA_VF_PORT]
                                    [IFLA_PORT_*], ...
                            ...
                    [IFLA_PORT_SELF]
                            [IFLA_PORT_*], ...
      
      These attributes are design to be set and get symmetrically.  VF_PORTS
      is a list of VF_PORTs, one for each VF, when dealing with an SR-IOV
      device.  PORT_SELF is for the PF of the SR-IOV device, in case it wants
      to also have a port-profile, or for the case where the VF==PF, like in
      enic patch 2/2 of this patch set.
      
      A port-profile is used to configure/enable the external switch virtual port
      backing the netdev interface, not to configure the host-facing side of the
      netdev.  A port-profile is an identifier known to the switch.  How port-
      profiles are installed on the switch or how available port-profiles are
      made know to the host is outside the scope of this patch.
      
      There are two types of port-profiles specs in the netlink msg.  The first spec
      is for 802.1Qbg (pre-)standard, VDP protocol.  The second spec is for devices
      that run a similar protocol as VDP but in firmware, thus hiding the protocol
      details.  In either case, the specs have much in common and makes sense to
      define the netlink msg as the union of the two specs.  For example, both specs
      have a notition of associating/deassociating a port-profile.  And both specs
      require some information from the hypervisor manager, such as client port
      instance ID.
      
      The general flow is the port-profile is applied to a host netdev interface
      using RTM_SETLINK, the receiver of the RTM_SETLINK msg communicates with the
      switch, and the switch virtual port backing the host netdev interface is
      configured/enabled based on the settings defined by the port-profile.  What
      those settings comprise, and how those settings are managed is again
      outside the scope of this patch, since this patch only deals with the
      first step in the flow.
      Signed-off-by: NScott Feldman <scofeldm@cisco.com>
      Signed-off-by: NRoopa Prabhu <roprabhu@cisco.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      57b61080
    • J
      net: Remove unnecessary semicolons after switch statements · ccbd6a5a
      Joe Perches 提交于
      Also added an explicit break; to avoid
      a fallthrough in net/ipv4/tcp_input.c
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ccbd6a5a
    • E
      net: add a noref bit on skb dst · 7fee226a
      Eric Dumazet 提交于
      Use low order bit of skb->_skb_dst to tell dst is not refcounted.
      
      Change _skb_dst to _skb_refdst to make sure all uses are catched.
      
      skb_dst() returns the dst, regardless of noref bit set or not, but
      with a lockdep check to make sure a noref dst is not given if current
      user is not rcu protected.
      
      New skb_dst_set_noref() helper to set an notrefcounted dst on a skb.
      (with lockdep check)
      
      skb_dst_drop() drops a reference only if skb dst was refcounted.
      
      skb_dst_force() helper is used to force a refcount on dst, when skb
      is queued and not anymore RCU protected.
      
      Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if
      !IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in
      sock_queue_rcv_skb(), in __nf_queue().
      
      Use skb_dst_force() in dev_requeue_skb().
      
      Note: dst_use_noref() still dirties dst, we might transform it
      later to do one dirtying per jiffies.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7fee226a
    • E
      rps: avoid one atomic in enqueue_to_backlog · ebda37c2
      Eric Dumazet 提交于
      If CONFIG_SMP=y, then we own a queue spinlock, we can avoid the atomic
      test_and_set_bit() from napi_schedule_prep().
      
      We now have same number of atomic ops per netif_rx() calls than with
      pre-RPS kernel.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ebda37c2
  5. 16 5月, 2010 4 次提交
    • C
      rtnetlink: make SR-IOV VF interface symmetric · c02db8c6
      Chris Wright 提交于
      Now we have a set of nested attributes:
      
        IFLA_VFINFO_LIST (NESTED)
          IFLA_VF_INFO (NESTED)
            IFLA_VF_MAC
            IFLA_VF_VLAN
            IFLA_VF_TX_RATE
      
      This allows a single set to operate on multiple attributes if desired.
      Among other things, it means a dump can be replayed to set state.
      
      The current interface has yet to be released, so this seems like
      something to consider for 2.6.34.
      Signed-off-by: NChris Wright <chrisw@sous-sol.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c02db8c6
    • E
      net: Introduce sk_route_nocaps · a465419b
      Eric Dumazet 提交于
      TCP-MD5 sessions have intermittent failures, when route cache is
      invalidated. ip_queue_xmit() has to find a new route, calls
      sk_setup_caps(sk, &rt->u.dst), destroying the 
      
      sk->sk_route_caps &= ~NETIF_F_GSO_MASK
      
      that MD5 desperately try to make all over its way (from
      tcp_transmit_skb() for example)
      
      So we send few bad packets, and everything is fine when
      tcp_transmit_skb() is called again for this socket.
      
      Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a
      socket field, sk_route_nocaps, containing bits to mask on sk_route_caps.
      Reported-by: NBhaskar Dutta <bhaskie@gmail.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a465419b
    • E
      net: Consistent skb timestamping · 3b098e2d
      Eric Dumazet 提交于
      With RPS inclusion, skb timestamping is not consistent in RX path.
      
      If netif_receive_skb() is used, its deferred after RPS dispatch.
      
      If netif_rx() is used, its done before RPS dispatch.
      
      This can give strange tcpdump timestamps results.
      
      I think timestamping should be done as soon as possible in the receive
      path, to get meaningful values (ie timestamps taken at the time packet
      was delivered by NIC driver to our stack), even if NAPI already can
      defer timestamping a bit (RPS can help to reduce the gap)
      
      Tom Herbert prefer to sample timestamps after RPS dispatch. In case
      sampling is expensive (HPET/acpi_pm on x86), this makes sense.
      
      Let admins switch from one mode to another, using a new
      sysctl, /proc/sys/net/core/netdev_tstamp_prequeue
      
      Its default value (1), means timestamps are taken as soon as possible,
      before backlog queueing, giving accurate timestamps.
      
      Setting a 0 value permits to sample timestamps when processing backlog,
      after RPS dispatch, to lower the load of the pre-RPS cpu.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b098e2d
    • J
      net: adjust handle_macvlan to pass port struct to hook · a14462f1
      Jiri Pirko 提交于
      Now there's null check here and also again in the hook. Looking at bridge bits
      which are simmilar, port structure is rcu_dereferenced right away in
      handle_bridge and passed to hook. Looks nicer.
      Signed-off-by: NJiri Pirko <jpirko@redhat.com>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a14462f1
  6. 07 5月, 2010 1 次提交
    • E
      rps: Various optimizations · eecfd7c4
      Eric Dumazet 提交于
      Introduce ____napi_schedule() helper for callers in irq disabled
      contexts. rps_trigger_softirq() becomes a leaf function.
      
      Use container_of() in process_backlog() instead of accessing per_cpu
      address.
      
      Use a custom inlined version of __napi_complete() in process_backlog()
      to avoid one locked instruction :
      
       only current cpu owns and manipulates this napi,
       and NAPI_STATE_SCHED is the only possible flag set on backlog.
       we can use a plain write instead of clear_bit(),
       and we dont need an smp_mb() memory barrier, since RPS is on,
       backlog is protected by a spinlock.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eecfd7c4
  7. 06 5月, 2010 2 次提交
    • E
      veth: Dont kfree_skb() after dev_forward_skb() · 6ec82562
      Eric Dumazet 提交于
      In case of congestion, netif_rx() frees the skb, so we must assume
      dev_forward_skb() also consume skb.
      
      Bug introduced by commit 44540960
      (veth: move loopback logic to common location)
      
      We must change dev_forward_skb() to always consume skb, and veth to not
      double free it.
      
      Bug report : http://marc.info/?l=linux-netdev&m=127310770900442&w=3Reported-by: NMartín Ferrari <martin.ferrari@gmail.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6ec82562
    • W
      netpoll: add generic support for bridge and bonding devices · 0e34e931
      WANG Cong 提交于
      This whole patchset is for adding netpoll support to bridge and bonding
      devices. I already tested it for bridge, bonding, bridge over bonding,
      and bonding over bridge. It looks fine now.
      
      To make bridge and bonding support netpoll, we need to adjust
      some netpoll generic code. This patch does the following things:
      
      1) introduce two new priv_flags for struct net_device:
         IFF_IN_NETPOLL which identifies we are processing a netpoll;
         IFF_DISABLE_NETPOLL is used to disable netpoll support for a device
         at run-time;
      
      2) introduce one new method for netdev_ops:
         ->ndo_netpoll_cleanup() is used to clean up netpoll when a device is
           removed.
      
      3) introduce netpoll_poll_dev() which takes a struct net_device * parameter;
         export netpoll_send_skb() and netpoll_poll_dev() which will be used later;
      
      4) hide a pointer to struct netpoll in struct netpoll_info, ditto.
      
      5) introduce ->real_dev for struct netpoll.
      
      6) introduce a new status NETDEV_BONDING_DESLAE, which is used to disable
         netconsole before releasing a slave, to avoid deadlocks.
      
      Cc: David Miller <davem@davemloft.net>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NWANG Cong <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0e34e931
  8. 05 5月, 2010 1 次提交
    • E
      net: __alloc_skb() speedup · ec7d2f2c
      Eric Dumazet 提交于
      With following patch I can reach maximum rate of my pktgen+udpsink
      simulator :
      - 'old' machine : dual quad core E5450  @3.00GHz
      - 64 UDP rx flows (only differ by destination port)
      - RPS enabled, NIC interrupts serviced on cpu0
      - rps dispatched on 7 other cores. (~130.000 IPI per second)
      - SLAB allocator (faster than SLUB in this workload)
      - tg3 NIC
      - 1.080.000 pps without a single drop at NIC level.
      
      Idea is to add two prefetchw() calls in __alloc_skb(), one to prefetch
      first sk_buff cache line, the second to prefetch the shinfo part.
      
      Also using one memset() to initialize all skb_shared_info fields instead
      of one by one to reduce number of instructions, using long word moves.
      
      All skb_shared_info fields before 'dataref' are cleared in 
      __alloc_skb().
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec7d2f2c
  9. 04 5月, 2010 1 次提交
  10. 03 5月, 2010 1 次提交
    • C
      net: fix softnet_stat · dee42870
      Changli Gao 提交于
      Per cpu variable softnet_data.total was shared between IRQ and SoftIRQ context
      without any protection. And enqueue_to_backlog should update the netdev_rx_stat
      of the target CPU.
      
      This patch renames softnet_data.total to softnet_data.processed: the number of
      packets processed in uppper levels(IP stacks).
      
      softnet_stat data is moved into softnet_data.
      Signed-off-by: NChangli Gao <xiaosuo@gmail.com>
      ----
       include/linux/netdevice.h |   17 +++++++----------
       net/core/dev.c            |   26 ++++++++++++--------------
       net/sched/sch_generic.c   |    2 +-
       3 files changed, 20 insertions(+), 25 deletions(-)
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dee42870
  11. 02 5月, 2010 2 次提交
    • D
      net: Inline skb_pull() in eth_type_trans(). · 47d29646
      David S. Miller 提交于
      In commit 6be8ac2f ("[NET]: uninline skb_pull, de-bloats a lot")
      we uninlined skb_pull.
      
      But in some critical paths it makes sense to inline this thing
      and it helps performance significantly.
      
      Create an skb_pull_inline() so that we can do this in a way that
      serves also as annotation.
      
      Based upon a patch by Eric Dumazet.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      47d29646
    • E
      net: sock_def_readable() and friends RCU conversion · 43815482
      Eric Dumazet 提交于
      sk_callback_lock rwlock actually protects sk->sk_sleep pointer, so we
      need two atomic operations (and associated dirtying) per incoming
      packet.
      
      RCU conversion is pretty much needed :
      
      1) Add a new structure, called "struct socket_wq" to hold all fields
      that will need rcu_read_lock() protection (currently: a
      wait_queue_head_t and a struct fasync_struct pointer).
      
      [Future patch will add a list anchor for wakeup coalescing]
      
      2) Attach one of such structure to each "struct socket" created in
      sock_alloc_inode().
      
      3) Respect RCU grace period when freeing a "struct socket_wq"
      
      4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct
      socket_wq"
      
      5) Change sk_sleep() function to use new sk->sk_wq instead of
      sk->sk_sleep
      
      6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
      a rcu_read_lock() section.
      
      7) Change all sk_has_sleeper() callers to :
        - Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock)
        - Use wq_has_sleeper() to eventually wakeup tasks.
        - Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock)
      
      8) sock_wake_async() is modified to use rcu protection as well.
      
      9) Exceptions :
        macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
      instead of dynamically allocated ones. They dont need rcu freeing.
      
      Some cleanups or followups are probably needed, (possible
      sk_callback_lock conversion to a spinlock for example...).
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      43815482
  12. 29 4月, 2010 1 次提交
    • E
      net: speedup udp receive path · 4b0b72f7
      Eric Dumazet 提交于
      Since commit 95766fff ([UDP]: Add memory accounting.), 
      each received packet needs one extra sock_lock()/sock_release() pair.
      
      This added latency because of possible backlog handling. Then later,
      ticket spinlocks added yet another latency source in case of DDOS.
      
      This patch introduces lock_sock_bh() and unlock_sock_bh()
      synchronization primitives, avoiding one atomic operation and backlog
      processing.
      
      skb_free_datagram_locked() uses them instead of full blown
      lock_sock()/release_sock(). skb is orphaned inside locked section for
      proper socket memory reclaim, and finally freed outside of it.
      
      UDP receive path now take the socket spinlock only once.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4b0b72f7
  13. 28 4月, 2010 4 次提交
  14. 26 4月, 2010 2 次提交
    • P
      net: rtnetlink: decouple rtnetlink address families from real address families · 25239cee
      Patrick McHardy 提交于
      Decouple rtnetlink address families from real address families in socket.h to
      be able to add rtnetlink interfaces to code that is not a real address family
      without increasing AF_MAX/NPROTO.
      
      This will be used to add support for multicast route dumping from all tables
      as the proc interface can't be extended to support anything but the main table
      without breaking compatibility.
      
      This partialy undoes the patch to introduce independant families for routing
      rules and converts ipmr routing rules to a new rtnetlink family. Similar to
      that patch, values up to 127 are reserved for real address families, values
      above that may be used arbitrarily.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      25239cee
    • P
      net: fib_rules: mark arguments to fib_rules_register const and __net_initdata · 3d0c9c4e
      Patrick McHardy 提交于
      fib_rules_register() duplicates the template passed to it without modification,
      mark the argument as const. Additionally the templates are only needed when
      instantiating a new namespace, so mark them as __net_initdata, which means
      they can be discarded when CONFIG_NET_NS=n.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      3d0c9c4e
  15. 25 4月, 2010 2 次提交
  16. 23 4月, 2010 2 次提交
  17. 22 4月, 2010 3 次提交
  18. 21 4月, 2010 3 次提交
    • D
      net: Fix an RCU warning in dev_pick_tx() · 05d17608
      David Howells 提交于
      Fix the following RCU warning in dev_pick_tx():
      
      ===================================================
      [ INFO: suspicious rcu_dereference_check() usage. ]
      ---------------------------------------------------
      net/core/dev.c:1993 invoked rcu_dereference_check() without protection!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 0
      2 locks held by swapper/0:
       #0:  (&idev->mc_ifc_timer){+.-...}, at: [<ffffffff81039e65>] run_timer_softirq+0x17b/0x278
       #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff812ea3eb>] dev_queue_xmit+0x14e/0x4dc
      
      stack backtrace:
      Pid: 0, comm: swapper Not tainted 2.6.34-rc5-cachefs #4
      Call Trace:
       <IRQ>  [<ffffffff810516c4>] lockdep_rcu_dereference+0xaa/0xb2
       [<ffffffff812ea4f6>] dev_queue_xmit+0x259/0x4dc
       [<ffffffff812ea3eb>] ? dev_queue_xmit+0x14e/0x4dc
       [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf
       [<ffffffff81035362>] ? local_bh_enable_ip+0xbc/0xc1
       [<ffffffff812f0954>] neigh_resolve_output+0x24b/0x27c
       [<ffffffff8134f673>] ip6_output_finish+0x7c/0xb4
       [<ffffffff81350c34>] ip6_output2+0x256/0x261
       [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf
       [<ffffffff813517fb>] ip6_output+0xbbc/0xbcb
       [<ffffffff8135bc5d>] ? fib6_force_start_gc+0x2b/0x2d
       [<ffffffff81368acb>] mld_sendpack+0x273/0x39d
       [<ffffffff81368858>] ? mld_sendpack+0x0/0x39d
       [<ffffffff81052099>] ? mark_held_locks+0x52/0x70
       [<ffffffff813692fc>] mld_ifc_timer_expire+0x24f/0x288
       [<ffffffff81039ed6>] run_timer_softirq+0x1ec/0x278
       [<ffffffff81039e65>] ? run_timer_softirq+0x17b/0x278
       [<ffffffff813690ad>] ? mld_ifc_timer_expire+0x0/0x288
       [<ffffffff81035531>] ? __do_softirq+0x69/0x140
       [<ffffffff8103556a>] __do_softirq+0xa2/0x140
       [<ffffffff81002e0c>] call_softirq+0x1c/0x28
       [<ffffffff81004b54>] do_softirq+0x38/0x80
       [<ffffffff81034f06>] irq_exit+0x45/0x47
       [<ffffffff810177c3>] smp_apic_timer_interrupt+0x88/0x96
       [<ffffffff810028d3>] apic_timer_interrupt+0x13/0x20
       <EOI>  [<ffffffff810488dd>] ? __atomic_notifier_call_chain+0x0/0x86
       [<ffffffff810096bf>] ? mwait_idle+0x6e/0x78
       [<ffffffff810096b6>] ? mwait_idle+0x65/0x78
       [<ffffffff810011cb>] cpu_idle+0x4d/0x83
       [<ffffffff81380b05>] rest_init+0xb9/0xc0
       [<ffffffff81380a4c>] ? rest_init+0x0/0xc0
       [<ffffffff8168dcf0>] start_kernel+0x392/0x39d
       [<ffffffff8168d2a3>] x86_64_start_reservations+0xb3/0xb7
       [<ffffffff8168d38b>] x86_64_start_kernel+0xe4/0xeb
      
      An rcu_dereference() should be an rcu_dereference_bh().
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      05d17608
    • R
      net: Remove two unnecessary exports (skbuff). · ccb7c773
      Rami Rosen 提交于
      There is no need to export skb_under_panic() and skb_over_panic() in
      skbuff.c, since these methods are used only in skbuff.c ; this patch
      removes these two exports. It also marks these functions as 'static'
      and removeS the extern declarations of them from
      include/linux/skbuff.h
      Signed-off-by: NRami Rosen <ramirose@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ccb7c773
    • E
      net: sk_sleep() helper · aa395145
      Eric Dumazet 提交于
      Define a new function to return the waitqueue of a "struct sock".
      
      static inline wait_queue_head_t *sk_sleep(struct sock *sk)
      {
      	return sk->sk_sleep;
      }
      
      Change all read occurrences of sk_sleep by a call to this function.
      
      Needed for a future RCU conversion. sk_sleep wont be a field directly
      available.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aa395145