1. 17 Mar, 2010 26 commits
    • A
      RDS: Do not call set_page_dirty() with irqs off · 561c7df6
      Andy Grover committed
      set_page_dirty() unconditionally re-enables interrupts, so
      if we call it with irqs off, they will be on after the call,
      and that's bad. This patch moves the call after we've re-enabled
      interrupts in send_drop_to(), so it's safe.
      
      Also, add BUG_ONs to let us know if we ever do call set_page_dirty
      with interrupts off.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      561c7df6
    • S
      RDS: Properly unmap when getting a remote access error · 450d06c0
      Sherman Pun committed
      If the RDMA op has aborted with a remote access error,
      in addition to what we already do (tell userspace it has
      completed with an error) also unmap it and put() the rm.
      
      Otherwise, hangs may occur on arches that track maps and
      will not exit without proper cleanup.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      450d06c0
    • A
      RDS: only put sockets that have seen congestion on the poll_waitq · b98ba52f
      Andy Grover committed
      rds_poll_waitq's listeners will be awoken if we receive a congestion
      notification. Bad performance may result because *all* polled sockets
      contend for this single lock. However, it should not be necessary to
      wake pollers when a congestion update arrives if they have never
      experienced congestion, and not putting these on the waitq will
      hopefully greatly reduce contention.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b98ba52f
    • T
      RDS: Fix locking in rds_send_drop_to() · 550a8002
      Tina Yang committed
      It seems rds_send_drop_to() called
      __rds_rdma_send_complete(rs, rm, RDS_RDMA_CANCELED)
      with only the rds_sock lock held, but not the rds_message lock. It raced with
      other threads that are attempting to modify the rds_message as well,
      such as from within rds_rdma_send_complete().
      Signed-off-by: Tina Yang <tina.yang@oracle.com>
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      550a8002
    • A
      RDS: Turn down alarming reconnect messages · 97069788
      Andy Grover committed
      RDS's error messages when a connection goes down are a little
      extreme. A connection may go down, and it will be re-established,
      and everything is fine. This patch links these messages through
      rdsdebug(), instead of to printk directly.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      97069788
    • A
      RDS: Workaround for in-use MRs on close causing crash · 571c02fa
      Andy Grover committed
      If a machine is shut down without closing sockets properly and
      freeing all MRs, then a BUG_ON will bring it down. This patch
      changes these to WARN_ONs -- leaking MRs is not fatal (although
      not ideal, and there is more work to do here for a proper fix).
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      571c02fa
    • T
      RDS: Fix send locking issue · 048c15e6
      Tina Yang committed
      Fix a deadlock between rds_rdma_send_complete() and
      rds_send_remove_from_sock() when the rds socket lock and
      the rds message lock are acquired out of order.
      Signed-off-by: Tina Yang <Tina.Yang@oracle.com>
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      048c15e6
    • A
      RDS: Fix congestion issues for loopback · 2e7b3b99
      Andy Grover committed
      We have two kinds of loopback: software (via loop transport)
      and hardware (via IB). sw is used for 127.0.0.1, and doesn't
      support rdma ops. hw is used for sends to local device IPs,
      and supports rdma. Both are used in different cases.
      
      For both of these, when there is a congestion map update, we
      want to call rds_cong_map_updated() but not actually send
      anything -- since loopback local and foreign congestion maps
      point to the same spot, they're already in sync.
      
      The old code never called sw loop's xmit_cong_map(), so
      rds_cong_map_updated() wasn't being called for it. sw loop
      ports would not work right with the congestion monitor.
      
      Fixing that meant that hw loopback now would send congestion maps
      to itself. This is also undesirable (racy), so we check for this
      case in the ib-specific xmit code.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2e7b3b99
    • A
      RDS/TCP: Wait to wake thread when write space available · 8e82376e
      Andy Grover committed
      Instead of waking the send thread whenever any send space is available,
      wait until it is at least half empty. This is modeled on how
      sock_def_write_space() does it, and may help to minimize context
      switches.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8e82376e
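The "at least half empty" rule described in the commit above can be sketched in userspace C. This is only an illustration, not the RDS/TCP code: `struct send_buf` and `should_wake_sender()` are hypothetical names, and the comparison simply mirrors the half-buffer idea the message attributes to sock_def_write_space().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket's send-buffer accounting. */
struct send_buf {
    size_t queued;   /* bytes currently queued for transmit */
    size_t capacity; /* total send-buffer size in bytes */
};

/* Wake the sender only once the queue has drained to half capacity or less,
 * instead of on every freed byte. */
static bool should_wake_sender(const struct send_buf *sb)
{
    return sb->queued * 2 <= sb->capacity;
}
```

With a 1000-byte buffer, the sender stays asleep while 600 bytes are queued and is woken once the queue falls to 500 bytes or fewer, which trades a little latency for fewer wakeups.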
    • A
      RDS: update copy_to_user state in tcp transport · b075cfdb
      Andy Grover committed
      Other transports use rds_page_copy_user, which updates our
      s_copy_to_user counter. TCP doesn't, so it needs to explicitly
      call rds_stats_add().
      Reported-by: Richard Frank <richard.frank@oracle.com>
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b075cfdb
    • A
      RDS: sendmsg() should check sndtimeo, not rcvtimeo · 1123fd73
      Andy Grover committed
      Most likely a cut-and-paste error: sendmsg() was checking sock_rcvtimeo.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1123fd73
    • A
      RDS: Do not BUG() on error returned from ib_post_send · 735f61e6
      Andy Grover committed
      BUGging on a runtime error code should be avoided. This
      patch also eliminates all other BUG()s that have no real
      reason to exist.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      735f61e6
    • D
      bridge: Make first arg to deliver_clone const. · 87faf3cc
      David S. Miller committed
      Otherwise we get a warning from the call in br_forward().
      Signed-off-by: David S. Miller <davem@davemloft.net>
      87faf3cc
    • Y
      bridge br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only without IGMP snooping. · 32dec5dd
      YOSHIFUJI Hideaki / 吉藤英明 committed
      Without CONFIG_BRIDGE_IGMP_SNOOPING,
      BR_INPUT_SKB_CB(skb)->mrouters_only is not appropriately
      initialized, so we can see garbage.
      
      A clear option to fix this is to set it even without that
      config, but we cannot optimize out the branch.
      
      Let's introduce a macro that returns value of mrouters_only
      and let it return 0 without CONFIG_BRIDGE_IGMP_SNOOPING.
      Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      32dec5dd
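The accessor-macro approach described in the commit above can be sketched as follows. The names here (`struct input_cb`, `CONFIG_SNOOPING`, `CB_MROUTERS_ONLY`) are hypothetical stand-ins chosen for a self-contained example and are not the identifiers used in the bridge code.

```c
/* Hypothetical stand-in for the per-packet control block. */
struct input_cb {
    int mrouters_only;
};

#ifdef CONFIG_SNOOPING   /* hypothetical name for the build option */
#define CB_MROUTERS_ONLY(cb) ((cb)->mrouters_only)
#else
/* Never reads the (possibly uninitialized) field; because the result is a
 * constant 0, the compiler can fold away branches that depend on it. */
#define CB_MROUTERS_ONLY(cb) ((void)(cb), 0)
#endif

/* Callers use the macro and never touch the field directly. */
static int should_flood_all_ports(const struct input_cb *cb)
{
    return !CB_MROUTERS_ONLY(cb);
}
```

When the option is not defined, a control block holding garbage in `mrouters_only` still yields 0 from the macro, so the caller's branch behaves deterministically.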
    • V
      route: Fix caught BUG_ON during rt_secret_rebuild_oneshot() · 858a18a6
      Vitaliy Gusev committed
      Calling rt_secret_rebuild() can trigger BUG_ON(timer_pending(&net->ipv4.rt_secret_timer)) in
      add_timer(), as there is no synchronization between calls to rt_secret_rebuild_oneshot()
      for the same net namespace.
      
      This issue also affects rt_secret_reschedule().
      
      Thus use mod_timer() instead.
      Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      858a18a6
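Why mod_timer() is the safer call here can be shown with a toy model. The sketch below is not the kernel timer API: `struct toy_timer`, `toy_add_timer()` and `toy_mod_timer()` are hypothetical, and the -1 return stands in for the point where the kernel would hit BUG_ON(timer_pending(...)).

```c
#include <stdbool.h>

/* Toy timer: an expiry value and a pending flag (not the kernel's timer_list). */
struct toy_timer {
    bool pending;
    unsigned long expires;
};

/* Models add_timer(): only legal on a timer that is not already pending.
 * Returns -1 where the kernel would trip its BUG_ON. */
static int toy_add_timer(struct toy_timer *t, unsigned long expires)
{
    if (t->pending)
        return -1;
    t->pending = true;
    t->expires = expires;
    return 0;
}

/* Models mod_timer(): (re)arms the timer whether or not it is pending.
 * Returns 1 if it was already pending, 0 otherwise. */
static int toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
    int was_pending = t->pending;

    t->pending = true;
    t->expires = expires;
    return was_pending;
}
```

Two unsynchronized callers arming the same timer make the second `toy_add_timer()` fail, while `toy_mod_timer()` simply moves the expiry, which is the property the commit relies on.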
    • Y
    • Y
    • J
      NET: netpoll, fix potential NULL ptr dereference · 21edbb22
      Jiri Slaby committed
      Stanse found that one error path in netpoll_setup dereferences npinfo
      even though it is NULL. Avoid that by adding a new label and jumping to it
      instead.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Daniel Borkmann <danborkmann@googlemail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Acked-by: chavey@google.com
      Acked-by: Matt Mackall <mpm@selenic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      21edbb22
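The "separate label for the early failure" pattern mentioned above looks roughly like this in plain C. Everything in the sketch is hypothetical (`struct info`, `struct dev`, `setup()`, and the `fail_*` flags used to simulate errors); it only illustrates why an error path taken before the pointer is valid needs its own jump target.

```c
#include <stdlib.h>

struct info { int refs; };                 /* hypothetical per-device state */
struct dev  { struct info *info; int up; };

/* Returns 0 on success, -1 on failure. The fail_* flags simulate errors. */
static int setup(struct dev *d, int fail_alloc, int fail_open)
{
    struct info *info;

    if (!d->info) {
        info = fail_alloc ? NULL : calloc(1, sizeof(*info));
        if (!info)
            goto out;           /* nothing allocated: must not touch info */
    } else {
        info = d->info;
    }

    if (fail_open)
        goto free_info;         /* info is valid here, safe to clean up */

    info->refs++;
    d->info = info;
    d->up = 1;
    return 0;

free_info:
    if (!d->info)               /* free only what this call allocated */
        free(info);
out:
    return -1;
}
```

If the allocation-failure path jumped to `free_info` instead of `out`, any cleanup code that dereferenced `info` there would run on a NULL pointer.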
    • N
      tipc: fix lockdep warning on address assignment · a2f46ee1
      Neil Horman committed
      So in the forward porting of various tipc packages, I was constantly
      getting this lockdep warning every time I used tipc-config to set a network
      address for the protocol:
      
      [ INFO: possible circular locking dependency detected ]
      2.6.33 #1
      tipc-config/1326 is trying to acquire lock:
      (ref_table_lock){+.-...}, at: [<ffffffffa0315148>] tipc_ref_discard+0x53/0xd4 [tipc]
      
      but task is already holding lock:
      (&(&entry->lock)->rlock#2){+.-...}, at: [<ffffffffa03150d5>] tipc_ref_lock+0x43/0x63 [tipc]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&(&entry->lock)->rlock#2){+.-...}:
      [<ffffffff8107b508>] __lock_acquire+0xb67/0xd0f
      [<ffffffff8107b78c>] lock_acquire+0xdc/0x102
      [<ffffffff8145471e>] _raw_spin_lock_bh+0x3b/0x6e
      [<ffffffffa03152b1>] tipc_ref_acquire+0xe8/0x11b [tipc]
      [<ffffffffa031433f>] tipc_createport_raw+0x78/0x1b9 [tipc]
      [<ffffffffa031450b>] tipc_createport+0x8b/0x125 [tipc]
      [<ffffffffa030f221>] tipc_subscr_start+0xce/0x126 [tipc]
      [<ffffffffa0308fb2>] process_signal_queue+0x47/0x7d [tipc]
      [<ffffffff81053e0c>] tasklet_action+0x8c/0xf4
      [<ffffffff81054bd8>] __do_softirq+0xf8/0x1cd
      [<ffffffff8100aadc>] call_softirq+0x1c/0x30
      [<ffffffff810549f4>] _local_bh_enable_ip+0xb8/0xd7
      [<ffffffff81054a21>] local_bh_enable_ip+0xe/0x10
      [<ffffffff81454d31>] _raw_spin_unlock_bh+0x34/0x39
      [<ffffffffa0308eb8>] spin_unlock_bh.clone.0+0x15/0x17 [tipc]
      [<ffffffffa0308f47>] tipc_k_signal+0x8d/0xb1 [tipc]
      [<ffffffffa0308dd9>] tipc_core_start+0x8a/0xad [tipc]
      [<ffffffffa01b1087>] 0xffffffffa01b1087
      [<ffffffff8100207d>] do_one_initcall+0x72/0x18a
      [<ffffffff810872fb>] sys_init_module+0xd8/0x23a
      [<ffffffff81009b42>] system_call_fastpath+0x16/0x1b
      
      -> #0 (ref_table_lock){+.-...}:
      [<ffffffff8107b3b2>] __lock_acquire+0xa11/0xd0f
      [<ffffffff8107b78c>] lock_acquire+0xdc/0x102
      [<ffffffff81454836>] _raw_write_lock_bh+0x3b/0x6e
      [<ffffffffa0315148>] tipc_ref_discard+0x53/0xd4 [tipc]
      [<ffffffffa03141ee>] tipc_deleteport+0x40/0x119 [tipc]
      [<ffffffffa0316e35>] release+0xeb/0x137 [tipc]
      [<ffffffff8139dbf4>] sock_release+0x1f/0x6f
      [<ffffffff8139dc6b>] sock_close+0x27/0x2b
      [<ffffffff811116f6>] __fput+0x12a/0x1df
      [<ffffffff811117c5>] fput+0x1a/0x1c
      [<ffffffff8110e49b>] filp_close+0x68/0x72
      [<ffffffff8110e552>] sys_close+0xad/0xe7
      [<ffffffff81009b42>] system_call_fastpath+0x16/0x1b
      
      Finally decided I should fix this.  It's a straightforward inversion:
      tipc_ref_acquire takes two locks in this order:
      ref_table_lock
      entry->lock
      
      while tipc_deleteport takes them in this order:
      entry->lock (via tipc_port_lock())
      ref_table_lock (via tipc_ref_discard())
      
      When the same entry is referenced, we get the above warning.  The fix is equally
      straightforward.  There's no real relation between the entry->lock and the
      ref_table_lock (they just are needed at the same time), so move the entry->lock
      acquisition in tipc_ref_acquire down, after we unlock ref_table_lock (this is
      safe since the ref_table_lock guards changes to the reference table, and we've
      already claimed a slot there).  I've tested the below fix and confirmed that it
      clears up the lockdep issue.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: Allan Stephens <allan.stephens@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2f46ee1
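The lock-ordering fix described above can be illustrated with pthread mutexes. This is a userspace sketch under assumptions, not the TIPC code: `table_lock`, `struct entry`, `acquire_slot()` and `release_slot()` are hypothetical, and unlike the real function the acquire path here does not return with the entry lock held.

```c
#include <pthread.h>

#define NSLOTS 4

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

struct entry {
    pthread_mutex_t lock;
    int in_use;
};

static struct entry table[NSLOTS] = {
    { PTHREAD_MUTEX_INITIALIZER, 0 },
    { PTHREAD_MUTEX_INITIALIZER, 0 },
    { PTHREAD_MUTEX_INITIALIZER, 0 },
    { PTHREAD_MUTEX_INITIALIZER, 0 },
};

/* Claim a free slot. The table lock is dropped before the entry lock is
 * taken, so this path never holds both and cannot invert with release_slot(). */
static int acquire_slot(void)
{
    int idx = -1;

    pthread_mutex_lock(&table_lock);
    for (int i = 0; i < NSLOTS; i++) {
        if (!table[i].in_use) {
            table[i].in_use = 1;    /* slot is reserved from here on */
            idx = i;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);

    if (idx >= 0) {
        pthread_mutex_lock(&table[idx].lock);
        /* ... initialise the entry under its own lock ... */
        pthread_mutex_unlock(&table[idx].lock);
    }
    return idx;
}

/* Release path nests entry lock -> table lock. */
static void release_slot(int idx)
{
    pthread_mutex_lock(&table[idx].lock);
    pthread_mutex_lock(&table_lock);
    table[idx].in_use = 0;
    pthread_mutex_unlock(&table_lock);
    pthread_mutex_unlock(&table[idx].lock);
}
```

Because only one path ever nests the two locks, there is a single nesting order in the program and no circular dependency for a lock validator to report.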
    • J
      l2tp: Fix UDP socket reference count bugs in the pppol2tp driver · c3259c8a
      James Chapman committed
      This patch fixes UDP socket refcnt bugs in the pppol2tp driver.
      
      A bug can cause a kernel stack trace when a tunnel socket is closed.
      
      A way to reproduce the issue is to prepare the UDP socket for L2TP (by
      opening a tunnel pppol2tp socket) and then close it before any L2TP
      sessions are added to it. The sequence is
      
      Create UDP socket
      Create tunnel pppol2tp socket to prepare UDP socket for L2TP
        pppol2tp_connect: session_id=0, peer_session_id=0
      L2TP SCCRP control frame received (tunnel_id==0)
        pppol2tp_recv_core: sock_hold()
        pppol2tp_recv_core: sock_put
      L2TP ZLB control frame received (tunnel_id=nnn)
        pppol2tp_recv_core: sock_hold()
        pppol2tp_recv_core: sock_put
      Close tunnel management socket
        pppol2tp_release: session_id=0, peer_session_id=0
      Close UDP socket
        udp_lib_close: BUG
      
      The addition of sock_hold() in pppol2tp_connect() solves the problem.
      
      For data frames, two sock_put() calls were added to plug a refcnt leak
      per received data frame. The ref that is grabbed at the top of
      pppol2tp_recv_core() must always be released, but this wasn't done for
      accepted data frames or data frames discarded because of bad UDP
      checksums. This leak meant that any UDP socket that had passed L2TP
      data traffic (i.e. L2TP data frames, not just L2TP control frames)
      using pppol2tp would not be released by the kernel.
      
      WARNING: at include/net/sock.h:435 udp_lib_unhash+0x117/0x120()
      Pid: 1086, comm: openl2tpd Not tainted 2.6.33-rc1 #8
      Call Trace:
       [<c119e9b7>] ? udp_lib_unhash+0x117/0x120
       [<c101b871>] ? warn_slowpath_common+0x71/0xd0
       [<c119e9b7>] ? udp_lib_unhash+0x117/0x120
       [<c101b8e3>] ? warn_slowpath_null+0x13/0x20
       [<c119e9b7>] ? udp_lib_unhash+0x117/0x120
       [<c11598a7>] ? sk_common_release+0x17/0x90
       [<c11a5e33>] ? inet_release+0x33/0x60
       [<c11577b0>] ? sock_release+0x10/0x60
       [<c115780f>] ? sock_close+0xf/0x30
       [<c106e542>] ? __fput+0x52/0x150
       [<c106b68e>] ? filp_close+0x3e/0x70
       [<c101d2e2>] ? put_files_struct+0x62/0xb0
       [<c101eaf7>] ? do_exit+0x5e7/0x650
       [<c1081623>] ? mntput_no_expire+0x13/0x70
       [<c106b68e>] ? filp_close+0x3e/0x70
       [<c101eb8a>] ? do_group_exit+0x2a/0x70
       [<c101ebe1>] ? sys_exit_group+0x11/0x20
       [<c10029b0>] ? sysenter_do_call+0x12/0x26
      Signed-off-by: James Chapman <jchapman@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3259c8a
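The per-frame leak described above comes down to a reference taken on entry that some exit paths forgot to drop. A minimal sketch of the balanced form, with hypothetical names throughout (`struct sock_ref`, `ref_hold()`, `ref_put()`, `recv_frame()`), might look like this:

```c
/* Hypothetical reference-counted object standing in for the socket. */
struct sock_ref { int refcnt; };

static void ref_hold(struct sock_ref *s) { s->refcnt++; }
static void ref_put(struct sock_ref *s)  { s->refcnt--; }

enum frame_kind { FRAME_CONTROL, FRAME_DATA_OK, FRAME_DATA_BAD_CSUM };

/* Every exit path funnels through the single ref_put() at the end, so the
 * reference taken on entry is always released. */
static int recv_frame(struct sock_ref *s, enum frame_kind kind)
{
    int ret;

    ref_hold(s);
    switch (kind) {
    case FRAME_DATA_BAD_CSUM:
        ret = -1;   /* discarded */
        break;
    case FRAME_DATA_OK:
        ret = 0;    /* accepted */
        break;
    default:
        ret = 1;    /* handed to the control path */
        break;
    }
    ref_put(s);
    return ret;
}
```

Routing all outcomes through one release point is a design choice for the sketch; the commit itself describes adding the missing sock_put() calls on the two affected paths.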
    • S
      smsc95xx: wait for PHY to complete reset during init · db443c44
      Steve Glendinning committed
      This patch ensures the PHY correctly completes its reset before
      setting register values.
      Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      db443c44
    • J
      l2tp: Fix oops in pppol2tp_xmit · 3feec909
      James Chapman committed
      When transmitting L2TP frames, we derive the outgoing interface's UDP
      checksum hardware assist capabilities from the tunnel dst dev. This
      can sometimes be NULL, especially when routing protocols are used and
      routing changes occur. This patch just checks for NULL dst or dev
      pointers when checking for netdev hardware assist features.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000c
      IP: [<f89d074c>] pppol2tp_xmit+0x341/0x4da [pppol2tp]
      *pde = 00000000
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/class/net/lo/operstate
      Modules linked in: pppol2tp pppox ppp_generic slhc ipv6 dummy loop snd_hda_codec_atihdmi snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc evdev psmouse serio_raw processor button i2c_piix4 i2c_core ati_agp agpgart pcspkr ext3 jbd mbcache sd_mod ide_pci_generic atiixp ide_core ahci ata_generic floppy ehci_hcd ohci_hcd libata e1000e scsi_mod usbcore nls_base thermal fan thermal_sys [last unloaded: scsi_wait_scan]
      
      Pid: 0, comm: swapper Not tainted (2.6.32.8 #1)
      EIP: 0060:[<f89d074c>] EFLAGS: 00010297 CPU: 3
      EIP is at pppol2tp_xmit+0x341/0x4da [pppol2tp]
      EAX: 00000000 EBX: f64d1680 ECX: 000005b9 EDX: 00000000
      ESI: f6b91850 EDI: f64d16ac EBP: f6a0c4c0 ESP: f70a9cac
       DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
      Process swapper (pid: 0, ti=f70a8000 task=f70a31c0 task.ti=f70a8000)
      Stack:
       000005a9 000005b9 f734c400 f66652c0 f7352e00 f67dc800 00000000 f6b91800
      <0> 000005a3 f70ef6c4 f67dcda9 000005a3 f89b192e 00000246 000005a3 f64d1680
      <0> f63633e0 f6363320 f64d1680 f65a7320 f65a7364 f65856c0 f64d1680 f679f02f
      Call Trace:
       [<f89b192e>] ? ppp_push+0x459/0x50e [ppp_generic]
       [<f89b217f>] ? ppp_xmit_process+0x3b6/0x430 [ppp_generic]
       [<f89b2306>] ? ppp_start_xmit+0x10d/0x120 [ppp_generic]
       [<c11c15cb>] ? dev_hard_start_xmit+0x21f/0x2b2
       [<c11d0947>] ? sch_direct_xmit+0x48/0x10e
       [<c11c19a0>] ? dev_queue_xmit+0x263/0x3a6
       [<c11e2a9f>] ? ip_finish_output+0x1f7/0x221
       [<c11df682>] ? ip_forward_finish+0x2e/0x30
       [<c11de645>] ? ip_rcv_finish+0x295/0x2a9
       [<c11c0b19>] ? netif_receive_skb+0x3e9/0x404
       [<f814b791>] ? e1000_clean_rx_irq+0x253/0x2fc [e1000e]
       [<f814cb7a>] ? e1000_clean+0x63/0x1fc [e1000e]
       [<c1047eff>] ? sched_clock_local+0x15/0x11b
       [<c11c1095>] ? net_rx_action+0x96/0x195
       [<c1035750>] ? __do_softirq+0xaa/0x151
       [<c1035828>] ? do_softirq+0x31/0x3c
       [<c10358fe>] ? irq_exit+0x26/0x58
       [<c1004b21>] ? do_IRQ+0x78/0x89
       [<c1003729>] ? common_interrupt+0x29/0x30
       [<c101ac28>] ? native_safe_halt+0x2/0x3
       [<c1008c54>] ? default_idle+0x55/0x75
       [<c1009045>] ? c1e_idle+0xd2/0xd5
       [<c100233c>] ? cpu_idle+0x46/0x62
      Code: 8d 45 08 f0 ff 45 08 89 6b 08 c7 43 68 7e fb 9c f8 8a 45 24 83 e0 0c 3c 04 75 09 80 63 64 f3 e9 b4 00 00 00 8b 43 18 8b 4c 24 04 <8b> 40 0c 8d 79 11 f6 40 44 0e 8a 43 64 75 51 6a 00 8b 4c 24 08
      EIP: [<f89d074c>] pppol2tp_xmit+0x341/0x4da [pppol2tp] SS:ESP 0068:f70a9cac
      CR2: 000000000000000c
      Signed-off-by: James Chapman <jchapman@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3feec909
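The guard described in the commit above (check both the route and its device before reading feature bits) can be sketched as below. The types and the feature bit are hypothetical simplifications, not the kernel's dst_entry or net_device definitions.

```c
#include <stdbool.h>
#include <stddef.h>

#define FEATURE_HW_CSUM 0x1u    /* hypothetical feature bit */

struct netdev { unsigned int features; };
struct route  { struct netdev *dev; };

/* Returns true only when a device is attached and advertises checksum
 * offload; a missing route or device falls back to software checksumming. */
static bool can_offload_csum(const struct route *dst)
{
    if (dst == NULL || dst->dev == NULL)
        return false;
    return (dst->dev->features & FEATURE_HW_CSUM) != 0;
}
```

Treating a missing route or device as "no offload" keeps the transmit path working during routing changes instead of faulting.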
    • S
      smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver · d0cad871
      Steve Glendinning committed
      This patch adds a driver for SMSC's LAN7500 family of USB 2.0
      to gigabit ethernet adapters.  It's loosely based on the smsc95xx
      driver but the device registers for LAN7500 are completely different.
      Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0cad871
    • A
      ne: Do not use slashes in irq name string · c5e49fb5
      Atsushi Nemoto committed
      This patch fixes the following warning introduced by commit
      12bac0d9 ("proc: warn on non-existing
      proc entries"):
      
      WARNING: at /work/mips-linux/make/linux/fs/proc/generic.c:316 __xlate_proc_name+0xe0/0xe8()
      name 'RBHMA4X00/RTL8019'
      Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c5e49fb5
    • J
      NET: ksz884x, fix lock imbalance · edee3932
      Jiri Slaby committed
      Stanse found that one error path (when alloc_skb fails) in netdev_tx
      fails to unlock hw_priv->hwlock. Fix that.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Tristram Ha <Tristram.Ha@micrel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      edee3932
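A lock imbalance of the kind described above is usually avoided by sending every exit through one unlock label. The sketch below uses a pthread mutex and hypothetical names (`hw_lock`, `transmit()`, and a flag that simulates the allocation failure); it is not the ksz884x driver code.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t hw_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 on success, -1 if the buffer could not be allocated.
 * simulate_alloc_failure forces the error path for demonstration. */
static int transmit(size_t len, int simulate_alloc_failure)
{
    int ret = 0;
    void *copy;

    pthread_mutex_lock(&hw_lock);

    copy = simulate_alloc_failure ? NULL : malloc(len);
    if (!copy) {
        ret = -1;
        goto unlock;            /* the error path still releases the lock */
    }
    /* ... hand the buffer to the hardware ... */
    free(copy);

unlock:
    pthread_mutex_unlock(&hw_lock);
    return ret;
}
```

After a failed call the mutex can be taken again immediately, which is exactly what a bare early `return` on the error path would have prevented.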
    • T
      gigaset: correct range checking off by one error · 6ad34145
      Tilman Schmidt committed
      Correct a potential array overrun due to an off-by-one error in the
      range check on the CAPI CONNECT_REQ CIPValue parameter.
      Found and reported by Dan Carpenter using smatch.
      
      Impact: bugfix
      Signed-off-by: Tilman Schmidt <tilman@imap.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ad34145
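An off-by-one range check of the kind fixed above typically differs by a single character. The following sketch uses a hypothetical lookup table and function (`cip_names`, `lookup_cip()`); the table contents are made up and are not the CAPI CIP value definitions.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table indexed by a caller-supplied value. */
static const char *const cip_names[] = { "none", "speech", "data" };

/* Returns the table entry, or NULL when the index is out of range. */
static const char *lookup_cip(unsigned int cip)
{
    /* '>=' is the correct bound: with '>' the value ARRAY_SIZE(cip_names)
     * itself would pass the check and read one element past the end. */
    if (cip >= ARRAY_SIZE(cip_names))
        return NULL;
    return cip_names[cip];
}
```

Valid indices for an array of N elements run from 0 to N - 1, so N must be rejected along with everything above it.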
  2. 16 Mar, 2010 14 commits