1. 27 May, 2011 6 commits
  2. 26 May, 2011 13 commits
    • bonding: documentation and code cleanup for resend_igmp · 94265cf5
      Flavio Leitner committed
      Improve the documentation about how the IGMP resend parameter
      works, fix two missing checks and coding style issues.
      Signed-off-by: Flavio Leitner <fbl@redhat.com>
      Acked-by: Rick Jones <rick.jones2@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: prevent deadlock on slave store with alb mode (v3) · 9fe0617d
      Neil Horman committed
      This soft lockup was recently reported:
      
      [root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
      [root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
      bonding: bond5: doing slave updates when interface is down.
      bonding bond5: master_dev is not up in bond_enslave
      [root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
      bonding: bond5: doing slave updates when interface is down.
      
      BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
      CPU 12:
      Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
      be2d
      Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
      RIP: 0010:[<ffffffff80064bf0>]  [<ffffffff80064bf0>]
      .text.lock.spinlock+0x26/00
      RSP: 0018:ffff810113167da8  EFLAGS: 00000286
      RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
      RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
      RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
      R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
      R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
      FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0
      
      Call Trace:
       [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
       [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
       [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
       [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
       [<ffffffff8006457b>] __down_write_nested+0x12/0x92
       [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
       [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
       [<ffffffff80016b87>] vfs_write+0xce/0x174
       [<ffffffff80017450>] sys_write+0x45/0x6e
       [<ffffffff8005d28d>] tracesys+0xd5/0xe0
      
      It occurs because we are able to change the slave configuration of a bond while
      the bond interface is down.  The bonding driver initializes some data structures
      only after its ndo_open routine is called.  Among them is the initialization of
      the alb tx and rx hash locks.  So if we add or remove a slave without first
      opening the bond master device, we run the risk of trying to lock/unlock a
      spinlock that has garbage for data in it, which results in the soft lockup above.

      Note that sometimes this works, because in many cases an unlocked spinlock has
      the raw_lock parameter initialized to zero (meaning that the kzalloc of the
      net_device private data is equivalent to calling spin_lock_init), but that's not
      true in all cases, and we aren't guaranteed that condition, so we need to pass
      the relevant spinlocks through the spin_lock_init function.
      
      Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
      the ndo_init path, so they are ready for use by the bond_store_slaves path.
      
      Change notes:
      v2) Based on conversation with Jay and Nicolas it seems that the ability to
      enslave devices while the bond master is down should be safe to do.  As such
      this is an outlier bug, and so instead we'll just initialize the errant spinlocks
      in the init path rather than the open path, solving the problem.  We'll also
      remove the warnings about the bond being down during enslave operations, since
      it should be safe.
      
      v3) Fix spelling error
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: jtluka@redhat.com
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: nicolas.2p.debian@gmail.com
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hold rtnl again in dump callbacks · 2907c35f
      Eric Dumazet committed
      Commit e67f88dd (dont hold rtnl mutex during netlink dump callbacks)
      missed the fact that rtnl_fill_ifinfo() must be called with rtnl held.

      Because of possible deadlocks between two mutexes (cb_mutex and rtnl),
      it's not easy to solve this problem, so revert this part of the patch.

      It also forgot one rcu_read_unlock() in FIB dump_rules().

      Add one ASSERT_RTNL() in rtnl_fill_ifinfo() to remind us of the rule.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Add Fujitsu 1000base-SX PCI ID to tg3 · 1dcb14d9
      Meelis Roos committed
      This patch adds the PCI ID of the Fujitsu 1000base-SX NIC to the tg3 driver.
      Tested to detect the card, MAC and serdes; not tested with link at the
      moment since I have no fiber switch here. I did not add new constants to
      the pci_ids.h header file since these constants are used only here.
      Signed-off-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • a5971d43
    • sch_sfq: fix peek() implementation · 07bd8df5
      Eric Dumazet committed
      Since commit eeaeb068 (sch_sfq: allow big packets and be fair),
      sfq_peek() can return a different skb than the one that would normally
      be dequeued by sfq_dequeue() [ if current slot->allot is negative ].

      Use the generic qdisc_peek_dequeued() instead of a custom implementation,
      to get a consistent result.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Jarek Poplawski <jarkao2@gmail.com>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Jesper Dangaard Brouer <hawk@diku.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
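The idea of building peek on top of dequeue, so that both always agree, can be sketched in user-space C. This is a simplified model and not the kernel's qdisc_peek_dequeued(); the `toy_queue` type and all `toy_*` helpers are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical toy queue: a small array of packet ids plus one stash slot. */
struct toy_queue {
    int items[8];
    size_t head, tail;   /* items[head..tail) are still queued */
    int stash;           /* packet already pulled out by a peek */
    int has_stash;       /* nonzero while stash holds a packet */
};

/* Stands in for the scheduler's real dequeue decision. */
static int toy_raw_dequeue(struct toy_queue *q, int *out)
{
    if (q->head == q->tail)
        return 0;
    *out = q->items[q->head++];
    return 1;
}

/* Peek by dequeuing once and remembering the result. */
static int toy_peek(struct toy_queue *q, int *out)
{
    if (!q->has_stash) {
        if (!toy_raw_dequeue(q, &q->stash))
            return 0;
        q->has_stash = 1;
    }
    *out = q->stash;
    return 1;
}

/* Dequeue returns the stashed packet first, so it always matches peek. */
static int toy_dequeue(struct toy_queue *q, int *out)
{
    if (q->has_stash) {
        *out = q->stash;
        q->has_stash = 0;
        return 1;
    }
    return toy_raw_dequeue(q, out);
}
```

Because the peeked packet is produced by the same code path as a normal dequeue, a later dequeue cannot pick a different packet, whatever per-slot state changes in between.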
    • isdn: netjet - blacklist Digium TDM400P · 367bbf2a
      Prarit Bhargava committed
      [2nd try ... 1st attempt didn't make it to netdev mailing list]
      
      A quick google search reveals that people with this card are blacklisting it
      in the initramfs and in the module blacklist based on a statement that it
      is unsupported. Since the older Digium is also unsupported I'm pretty
      confident that this newer card is also not supported.
      
      lspci -xxx -vv shows
      
      04:07.0 Communication controller: Tiger Jet Network Inc. Tiger3XX Modem/ISDN interface
              Subsystem: Device b100:0003
      P.
      
      ----8<----
      The Asterisk Voice Card, DIGIUM TDM400P is unsupported by the netjet driver.
      Blacklist it like the Digium X100P/X101P card.
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • via-velocity: don't annotate MAC registers as packed · d10358de
      Ulrich Hecht committed
      On ARM, memory accesses through packed pointers behave in unexpected
      ways in GCC releases 4.3 and higher; see https://lkml.org/lkml/2011/2/2/163
      for discussion.
      
      In this particular case, 32-bit I/O registers are accessed bytewise,
      causing incorrect setting of the DMA address registers which in turn
      leads to an error interrupt storm that brings the system to a halt.
      
      Since the mac_regs structure does not need any packing anyway, this patch
      simply removes the attribute to fix the issue.
      Signed-off-by: Ulrich Hecht <uli@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
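Why a packed annotation can change how registers are accessed can be seen from the alignment the compiler assumes. The sketch below uses two hypothetical register layouts (not the real mac_regs structure) and relies on C11 plus the GCC/Clang `__attribute__((packed))` syntax.

```c
#include <stdint.h>

/* Hypothetical register layouts, not the real via-velocity mac_regs. */
struct regs_plain {
    uint32_t dma_addr_lo;
    uint32_t dma_addr_hi;
};

struct regs_packed {
    uint32_t dma_addr_lo;
    uint32_t dma_addr_hi;
} __attribute__((packed));

/* Both layouts have the same size, so packing removes no padding here... */
_Static_assert(sizeof(struct regs_plain) == sizeof(struct regs_packed),
               "no padding to remove");

/* ...but the packed type only promises byte alignment to the compiler. */
_Static_assert(_Alignof(struct regs_packed) == 1,
               "packed implies alignment 1");
_Static_assert(_Alignof(struct regs_plain) == _Alignof(uint32_t),
               "plain struct keeps natural alignment");
```

With an assumed alignment of 1, a compiler targeting a strict-alignment architecture is free to split a 32-bit load or store into byte accesses, which is harmless for ordinary memory but not for device registers.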
    • xen: netfront: hold RTNL when updating features. · 1ba37c51
      Ian Campbell committed
      Konrad reports:
      [    0.930811] RTNL: assertion failed at /home/konrad/ssd/linux/net/core/dev.c (5258)
      [    0.930821] Pid: 22, comm: xenwatch Not tainted 2.6.39-05193-gd762f438 #1
      [    0.930825] Call Trace:
      [    0.930834]  [<ffffffff8143bd0e>] __netdev_update_features+0xae/0xe0
      [    0.930840]  [<ffffffff8143dd41>] netdev_update_features+0x11/0x30
      [    0.930847]  [<ffffffffa0037105>] netback_changed+0x4e5/0x800 [xen_netfront]
      [    0.930854]  [<ffffffff8132a838>] xenbus_otherend_changed+0xa8/0xb0
      [    0.930860]  [<ffffffff8157ca99>] ? _raw_spin_unlock_irqrestore+0x19/0x20
      [    0.930866]  [<ffffffff8132adfe>] backend_changed+0xe/0x10
      [    0.930871]  [<ffffffff8132875a>] xenwatch_thread+0xba/0x180
      [    0.930876]  [<ffffffff810a8ba0>] ? wake_up_bit+0x40/0x40
      [    0.930881]  [<ffffffff813286a0>] ? split+0xf0/0xf0
      [    0.930886]  [<ffffffff810a8646>] kthread+0x96/0xa0
      [    0.930891]  [<ffffffff815855a4>] kernel_thread_helper+0x4/0x10
      [    0.930896]  [<ffffffff815846b3>] ? int_ret_from_sys_call+0x7/0x1b
      [    0.930901]  [<ffffffff8157cf61>] ? retint_restore_args+0x5/0x6
      [    0.930906]  [<ffffffff815855a0>] ? gs_change+0x13/0x13
      
      This update happens in xenbus watch callback context and hence does not already
      hold the rtnl. Take the lock as necessary.
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: fix memory leak of the ASCONF queue when free asoc · 8b4472cc
      Wei Yongjun committed
      If an ASCONF chunk is outstanding, then the following ASCONF
      chunk will be queued for later transmission. But when we free
      the asoc, we forget to free the ASCONF queue at the same time,
      which causes a memory leak.
      Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: make dev_disable_lro use physical device if passed a vlan dev (v2) · f11970e3
      Neil Horman committed
      If the device passed into dev_disable_lro is a vlan, then repoint the dev
      pointer so that we actually modify the underlying physical device.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: davem@davemloft.net
      CC: bhutchings@solarflare.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
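The repointing described in this commit can be modelled in a few lines of user-space C. The `toy_dev` structure and `toy_disable_lro()` are hypothetical stand-ins, not the kernel's net_device or dev_disable_lro().

```c
#include <stddef.h>

/* Hypothetical device model: a vlan device points at its physical device. */
struct toy_dev {
    int is_vlan;               /* nonzero for a vlan device */
    struct toy_dev *real_dev;  /* underlying device, NULL if physical */
    int lro_enabled;
};

/* Disable LRO on the device that actually receives packets. */
static void toy_disable_lro(struct toy_dev *dev)
{
    /* For a vlan, act on the underlying physical device instead. */
    if (dev->is_vlan)
        dev = dev->real_dev;

    dev->lro_enabled = 0;
}
```

Without the repointing step, the flag would be cleared on the vlan device only and the physical device would keep aggregating packets.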
    • net: move is_vlan_dev into public header file (v2) · 6dcbbe25
      Neil Horman committed
      Migrate is_vlan_dev() to if_vlan.h so that core networking can use it.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: davem@davemloft.net
      CC: bhutchings@solarflare.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 22e95ac8
  3. 25 May, 2011 13 commits
  4. 24 May, 2011 8 commits
    • net: convert %p usage to %pK · 71338aa7
      Dan Rosenberg committed
      The %pK format specifier is designed to hide exposed kernel pointers,
      specifically via /proc interfaces.  Exposing these pointers provides an
      easy target for kernel write vulnerabilities, since they reveal the
      locations of writable structures containing easily triggerable function
      pointers.  The behavior of %pK depends on the kptr_restrict sysctl.
      
      If kptr_restrict is set to 0, no deviation from the standard %p behavior
      occurs.  If kptr_restrict is set to 1, the default, if the current user
      (intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
      (currently in the LSM tree), kernel pointers using %pK are printed as 0's.
      If kptr_restrict is set to 2, kernel pointers using %pK are printed as
      0's regardless of privileges.  Replacing with 0's was chosen over the
      default "(null)", which cannot be parsed by userland %p, which expects
      "(nil)".
      
      The supporting code for kptr_restrict and %pK are currently in the -mm
      tree.  This patch converts users of %p in net/ to %pK.  Cases of printing
      pointers to the syslog are not covered, since this would eliminate useful
      information for postmortem debugging and the reading of the syslog is
      already optionally protected by the dmesg_restrict sysctl.
      Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Thomas Graf <tgraf@infradead.org>
      Cc: Eugene Teo <eugeneteo@kernel.org>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Eric Paris <eparis@parisplace.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
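The three kptr_restrict cases described in this commit message can be written down as a small decision function. This is a user-space model only; `format_pk()` and its parameters are hypothetical, and the zero-padded output width is an assumption made for the illustration rather than a statement about the kernel's formatting code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats ptr into buf following the rules above. */
static void format_pk(char *buf, size_t len, const void *ptr,
                      int kptr_restrict, bool has_cap_syslog)
{
    /* 0: never hide; 1: hide unless privileged; 2: always hide. */
    bool hide = (kptr_restrict == 2) ||
                (kptr_restrict == 1 && !has_cap_syslog);
    uintptr_t shown = hide ? 0 : (uintptr_t)ptr;

    /* Hidden pointers come out as zeros at full width, not "(null)". */
    snprintf(buf, len, "%0*jx", (int)(2 * sizeof(void *)), (uintmax_t)shown);
}
```

Printing zeros keeps the field parseable as a pointer value by user-space tools, which is the reason the message gives for not using "(null)".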
    • net/irda: convert bfin_sir to common Blackfin UART header · 229de618
      Mike Frysinger committed
      No need to duplicate these defines now that the common Blackfin code has
      unified these for all UART devices.
      Signed-off-by: Mike Frysinger <vapier@gentoo.org>
      Cc: Samuel Ortiz <samuel@sortiz.org>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix return of xfrm6_tunnel_rcv() · 6ac3f664
      David S. Miller committed
      Like ipv4, just return xfrm6_rcv_spi()'s return value directly.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: filter: Use WARN_RATELIMIT · 6c4a5cb2
      Joe Perches committed
      A mis-configured filter can spam the logs with lots of stack traces.

      Rate-limit the warnings and add printout of the bogus filter information.
      Original-patch-by: Ben Greear <greearb@candelatech.com>
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bug.h: Add WARN_RATELIMIT · b3eec79b
      Joe Perches committed
      Add a generic mechanism to ratelimit WARN(foo, fmt, ...) messages
      using a hidden per call site static struct ratelimit_state.

      Also add a __WARN_RATELIMIT variant to be able to use a specific
      struct ratelimit_state.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
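The "hidden per call site static state" pattern can be sketched in user-space C. Everything here is a hypothetical stand-in (`toy_ratelimit`, `TOY_WARN_RATELIMIT`, and the interval and burst values of 5 and 10); it is not the kernel's bug.h or ratelimit code.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical user-space stand-in for a rate limit state. */
struct toy_ratelimit {
    int interval;   /* window length in seconds */
    int burst;      /* messages allowed per window */
    int printed;    /* messages emitted in the current window */
    time_t begin;   /* start of the current window, 0 if unset */
};

/* Returns 1 if the caller may print, 0 if the message is suppressed. */
static int toy_ratelimit_ok(struct toy_ratelimit *rs)
{
    time_t now = time(NULL);

    /* Open a new window when none is active or the old one expired. */
    if (rs->begin == 0 || now - rs->begin >= rs->interval) {
        rs->begin = now;
        rs->printed = 0;
    }
    if (rs->printed >= rs->burst)
        return 0;
    rs->printed++;
    return 1;
}

/* Variant taking an explicit state, in the spirit of __WARN_RATELIMIT. */
#define TOY_WARN_RATELIMIT_STATE(state, cond, ...)              \
    do {                                                        \
        if ((cond) && toy_ratelimit_ok(state))                  \
            fprintf(stderr, __VA_ARGS__);                       \
    } while (0)

/* Each expansion declares its own static state, one per call site. */
#define TOY_WARN_RATELIMIT(cond, ...)                           \
    do {                                                        \
        static struct toy_ratelimit toy_rs_ = { 5, 10, 0, 0 };  \
        TOY_WARN_RATELIMIT_STATE(&toy_rs_, cond, __VA_ARGS__);  \
    } while (0)
```

Because the static variable lives inside the macro's block, two different call sites never share a budget, while repeated hits on the same line do.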
    • sch_sfq: avoid giving spurious NET_XMIT_CN signals · 8efa8854
      Eric Dumazet committed
      While chasing a possible net_sched bug, I found that IP fragments have
      little chance to pass a congested SFQ qdisc:

      - Say SFQ qdisc is full because one flow is non responsive.
      - ip_fragment() wants to send two fragments belonging to an idle flow.
      - sfq_enqueue() queues the first packet, but sees the queue limit reached.
      - sfq_enqueue() drops one packet from the 'big consumer', and returns
      NET_XMIT_CN.
      - ip_fragment() cancels the remaining fragments.

      This patch restores fairness, making sure we return NET_XMIT_CN only if
      we dropped a packet from the same flow.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Jarek Poplawski <jarkao2@gmail.com>
      CC: Jamal Hadi Salim <hadi@cyberus.ca>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
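The fairness rule from this commit can be illustrated with a toy per-flow queue. This is a simplified model under stated assumptions: the `toy_*` names, the locally defined return codes, and the "drop from the longest flow" policy are all chosen for illustration and are not taken from the sch_sfq source.

```c
#define TOY_FLOWS 4

/* Hypothetical return codes, named after the kernel's but defined locally. */
enum toy_xmit { TOY_XMIT_SUCCESS, TOY_XMIT_CN };

struct toy_sfq {
    int qlen[TOY_FLOWS];  /* packets queued per flow */
    int total;            /* packets queued overall */
    int limit;            /* maximum packets allowed overall */
};

/* Pick the flow with the largest backlog (the "big consumer"). */
static int toy_longest_flow(const struct toy_sfq *q)
{
    int best = 0;

    for (int i = 1; i < TOY_FLOWS; i++)
        if (q->qlen[i] > q->qlen[best])
            best = i;
    return best;
}

static enum toy_xmit toy_enqueue(struct toy_sfq *q, int flow)
{
    q->qlen[flow]++;
    q->total++;
    if (q->total <= q->limit)
        return TOY_XMIT_SUCCESS;

    /* Over the limit: drop one packet from the longest flow. */
    int victim = toy_longest_flow(q);
    q->qlen[victim]--;
    q->total--;

    /* Report congestion only if the caller's own flow lost the packet. */
    return victim == flow ? TOY_XMIT_CN : TOY_XMIT_SUCCESS;
}
```

In this model an idle flow that enqueues into a full queue sees success, because the dropped packet belonged to the heavy flow, so a sender such as ip_fragment() has no reason to cancel its remaining fragments.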
    • ehea: Fix multicast registration on semi-promiscuous mode · a4910b74
      Breno Leitao committed
      Ehea will not register multicast groups in phyp if the physical
      interface is in promiscuous mode. It should, however, register them if
      the logical port is in promiscuous mode but the physical port is not.

      Ehea physical promiscuous mode is defined by ehea_port->promisc,
      while logical port promiscuous mode is defined by IFF_PROMISC.

      So currently, if the user sets the interface in promiscuous mode,
      IGMP will not be registered in PHYP, and PHYP will never pass
      the multicast packet to the logical port.

      This patch fixes it, ensuring that we register in phyp
      if the physical port is not in promiscuous mode.
      Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • snap: remove one synchronize_net() · 418f275e
      Eric Dumazet committed
      No need to wait for an RCU grace period after list insertion.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>