1. 30 1月, 2011 2 次提交
    • E
      net: Fix ip link add netns oops · 13ad1774
      Eric W. Biederman 提交于
      Ed Swierk <eswierk@bigswitch.com> writes:
      > On 2.6.35.7
      >  ip link add link eth0 netns 9999 type macvlan
      > where 9999 is a nonexistent PID triggers an oops and causes all network functions to hang:
      > [10663.821898] BUG: unable to handle kernel NULL pointer dereference at 000000000000006d
      >  [10663.821917] IP: [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.821933] PGD 1d3927067 PUD 22f5c5067 PMD 0
      >  [10663.821944] Oops: 0000 [#1] SMP
      >  [10663.821953] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
      >  [10663.821959] CPU 3
      >  [10663.821963] Modules linked in: macvlan ip6table_filter ip6_tables rfcomm ipt_MASQUERADE binfmt_misc iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sco ipt_REJECT bnep l2cap xt_tcpudp iptable_filter ip_tables x_tables bridge stp vboxnetadp vboxnetflt vboxdrv kvm_intel kvm parport_pc ppdev snd_hda_codec_intelhdmi snd_hda_codec_conexant arc4 iwlagn iwlcore mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi i915 snd_seq_midi_event snd_seq thinkpad_acpi drm_kms_helper btusb tpm_tis nvram uvcvideo snd_timer snd_seq_device bluetooth videodev v4l1_compat v4l2_compat_ioctl32 tpm drm tpm_bios snd cfg80211 psmouse serio_raw intel_ips soundcore snd_page_alloc intel_agp i2c_algo_bit video output netconsole configfs lp parport usbhid hid e1000e sdhci_pci ahci libahci sdhci led_class
      >  [10663.822155]
      >  [10663.822161] Pid: 6000, comm: ip Not tainted 2.6.35-23-generic #41-Ubuntu 2901CTO/2901CTO
      >  [10663.822167] RIP: 0010:[<ffffffff8149c2fa>] [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.822177] RSP: 0018:ffff88014aebf7b8 EFLAGS: 00010286
      >  [10663.822182] RAX: 00000000fffffff4 RBX: ffff8801ad900800 RCX: 0000000000000000
      >  [10663.822187] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffff88014ad63000
      >  [10663.822191] RBP: ffff88014aebf808 R08: 0000000000000041 R09: 0000000000000041
      >  [10663.822196] R10: 0000000000000000 R11: dead000000200200 R12: ffff88014aebf818
      >  [10663.822201] R13: fffffffffffffffd R14: ffff88014aebf918 R15: ffff88014ad62000
      >  [10663.822207] FS: 00007f00c487f700(0000) GS:ffff880001f80000(0000) knlGS:0000000000000000
      >  [10663.822212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      >  [10663.822216] CR2: 000000000000006d CR3: 0000000231f19000 CR4: 00000000000026e0
      >  [10663.822221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      >  [10663.822226] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      >  [10663.822231] Process ip (pid: 6000, threadinfo ffff88014aebe000, task ffff88014afb16e0)
      >  [10663.822236] Stack:
      >  [10663.822240] ffff88014aebf808 ffffffff814a2bb5 ffff88014aebf7e8 00000000a00ee8d6
      >  [10663.822251] <0> 0000000000000000 ffffffffa00ef940 ffff8801ad900800 ffff88014aebf818
      >  [10663.822265] <0> ffff88014aebf918 ffff8801ad900800 ffff88014aebf858 ffffffff8149c413
      >  [10663.822281] Call Trace:
      >  [10663.822290] [<ffffffff814a2bb5>] ? dev_addr_init+0x75/0xb0
      >  [10663.822298] [<ffffffff8149c413>] dev_alloc_name+0x43/0x90
      >  [10663.822307] [<ffffffff814a85ee>] rtnl_create_link+0xbe/0x1b0
      >  [10663.822314] [<ffffffff814ab2aa>] rtnl_newlink+0x48a/0x570
      >  [10663.822321] [<ffffffff814aafcc>] ? rtnl_newlink+0x1ac/0x570
      >  [10663.822332] [<ffffffff81030064>] ? native_x2apic_icr_read+0x4/0x20
      >  [10663.822339] [<ffffffff814a8c17>] rtnetlink_rcv_msg+0x177/0x290
      >  [10663.822346] [<ffffffff814a8aa0>] ? rtnetlink_rcv_msg+0x0/0x290
      >  [10663.822354] [<ffffffff814c25d9>] netlink_rcv_skb+0xa9/0xd0
      >  [10663.822360] [<ffffffff814a8a85>] rtnetlink_rcv+0x25/0x40
      >  [10663.822367] [<ffffffff814c223e>] netlink_unicast+0x2de/0x2f0
      >  [10663.822374] [<ffffffff814c303e>] netlink_sendmsg+0x1fe/0x2e0
      >  [10663.822383] [<ffffffff81488533>] sock_sendmsg+0xf3/0x120
      >  [10663.822391] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822400] [<ffffffff81168656>] ? __d_lookup+0x136/0x150
      >  [10663.822406] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822414] [<ffffffff812b7a0d>] ? _atomic_dec_and_lock+0x4d/0x80
      >  [10663.822422] [<ffffffff8116ea90>] ? mntput_no_expire+0x30/0x110
      >  [10663.822429] [<ffffffff81486ff5>] ? move_addr_to_kernel+0x65/0x70
      >  [10663.822435] [<ffffffff81493308>] ? verify_iovec+0x88/0xe0
      >  [10663.822442] [<ffffffff81489020>] sys_sendmsg+0x240/0x3a0
      > [10663.822450] [<ffffffff8111e2a9>] ? __do_fault+0x479/0x560
      >  [10663.822457] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822465] [<ffffffff8116cf4a>] ? alloc_fd+0x10a/0x150
      >  [10663.822473] [<ffffffff8158d76e>] ? do_page_fault+0x15e/0x350
      >  [10663.822482] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
      >  [10663.822487] Code: 90 48 8d 78 02 be 25 00 00 00 e8 92 1d e2 ff 48 85 c0 75 cf bf 20 00 00 00 e8 c3 b1 c6 ff 49 89 c7 b8 f4 ff ff ff 4d 85 ff 74 bd <4d> 8b 75 70 49 8d 45 70 48 89 45 b8 49 83 ee 58 eb 28 48 8d 55
      >  [10663.822618] RIP [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.822627] RSP <ffff88014aebf7b8>
      >  [10663.822631] CR2: 000000000000006d
      >  [10663.822636] ---[ end trace 3dfd6c3ad5327ca7 ]---
      
      This bug was introduced in:
      commit 81adee47
      Author: Eric W. Biederman <ebiederm@aristanetworks.com>
      Date:   Sun Nov 8 00:53:51 2009 -0800
      
          net: Support specifying the network namespace upon device creation.
      
          There is no good reason to not support userspace specifying the
          network namespace during device creation, and it makes it easier
          to create a network device and pass it to a child network namespace
          with a well known name.
      
          We have to be careful to ensure that the target network namespace
          for the new device exists through the life of the call.  To keep
          that logic clear I have factored out the network namespace grabbing
          logic into rtnl_link_get_net.
      
          In addtion we need to continue to pass the source network namespace
          to the rtnl_link_ops.newlink method so that we can find the base
          device source network namespace.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      
      Where apparently I forgot to add error handling to the path where we create
      a new network device in a new network namespace, and pass in an invalid pid.
      
      Cc: stable@kernel.org
      Reported-by: NEd Swierk <eswierk@bigswitch.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      13ad1774
    • H
      gro: Reset dev pointer on reuse · 66c46d74
      Herbert Xu 提交于
      On older kernels the VLAN code may zero skb->dev before dropping
      it and causing it to be reused by GRO.
      
      Unfortunately we didn't reset skb->dev in that case which causes
      the next GRO user to get a bogus skb->dev pointer.
      
      This particular problem no longer happens with the current upstream
      kernel due to changes in VLAN processing.
      
      However, for correctness we should still reset the skb->dev pointer
      in the GRO reuse function in case a future user does the same thing.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      66c46d74
  2. 28 1月, 2011 4 次提交
  3. 27 1月, 2011 1 次提交
  4. 26 1月, 2011 3 次提交
    • J
      TCP: fix a bug that triggers large number of TCP RST by mistake · 44f5324b
      Jerry Chu 提交于
      This patch fixes a bug that causes TCP RST packets to be generated
      on otherwise correctly behaved applications, e.g., no unread data
      on close,..., etc. To trigger the bug, at least two conditions must
      be met:
      
      1. The FIN flag is set on the last data packet, i.e., it's not on a
      separate, FIN only packet.
      2. The size of the last data chunk on the receive side matches
      exactly with the size of buffer posted by the receiver, and the
      receiver closes the socket without any further read attempt.
      
      This bug was first noticed on our netperf based testbed for our IW10
      proposal to IETF where a large number of RST packets were observed.
      netperf's read side code meets the condition 2 above 100%.
      
      Before the fix, tcp_data_queue() will queue the last skb that meets
      condition 1 to sk_receive_queue even though it has fully copied out
      (skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
      tcp_recvmsg() often returns all the copied out data successfully
      without actually consuming the skb, due to a check
      "if ((chunk = len - tp->ucopy.len) != 0) {"
      and
      "len -= chunk;"
      after tcp_prequeue_process() that causes "len" to become 0 and an
      early exit from the big while loop.
      
      I don't see any reason not to free the skb whose data have been fully
      consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
      get there if MSG_PEEK is on. Am I missing some arcane cases related
      to urgent data?
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44f5324b
    • F
      mac80211: fix a crash in ieee80211_beacon_get_tim on change_interface · eb3e554b
      Felix Fietkau 提交于
      Some drivers (e.g. ath9k) do not always disable beacons when they're
      supposed to. When an interface is changed using the change_interface op,
      the mode specific sdata part is in an undefined state and trying to
      get a beacon at this point can produce weird crashes.
      
      To fix this, add a check for ieee80211_sdata_running before using
      anything from the sdata.
      Signed-off-by: NFelix Fietkau <nbd@openwrt.org>
      Cc: stable@kernel.org
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      eb3e554b
    • D
      ipv6: Revert 'administrative down' address handling changes. · 73a8bd74
      David S. Miller 提交于
      This reverts the following set of commits:
      
      d1ed113f ("ipv6: remove duplicate neigh_ifdown")
      29ba5fed ("ipv6: don't flush routes when setting loopback down")
      9d82ca98 ("ipv6: fix missing in6_ifa_put in addrconf")
      2de79570 ("ipv6: addrconf: don't remove address state on ifdown if the address is being kept")
      8595805a ("IPv6: only notify protocols if address is compeletely gone")
      27bdb2ab ("IPv6: keep tentative addresses in hash table")
      93fa159a ("IPv6: keep route for tentative address")
      8f37ada5 ("IPv6: fix race between cleanup and add/delete address")
      84e8b803 ("IPv6: addrconf notify when address is unavailable")
      dc2b99f7 ("IPv6: keep permanent addresses on admin down")
      
      because the core semantic change to ipv6 address handling on ifdown
      has broken some things, in particular "disable_ipv6" sysctl handling.
      
      Stephen has made several attempts to get things back in working order,
      but nothing has restored disable_ipv6 fully yet.
      Reported-by: NEric W. Biederman <ebiederm@xmission.com>
      Tested-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      73a8bd74
  5. 25 1月, 2011 7 次提交
  6. 24 1月, 2011 1 次提交
  7. 21 1月, 2011 2 次提交
    • E
      net_sched: accurate bytes/packets stats/rates · 9190b3b3
      Eric Dumazet 提交于
      In commit 44b82883 (net_sched: pfifo_head_drop problem), we fixed
      a problem with pfifo_head drops that incorrectly decreased
      sch->bstats.bytes and sch->bstats.packets
      
      Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
      previously enqueued packet, and bstats cannot be changed, so
      bstats/rates are not accurate (over estimated)
      
      This patch changes the qdisc_bstats updates to be done at dequeue() time
      instead of enqueue() time. bstats counters no longer account for dropped
      frames, and rates are more correct, since enqueue() bursts dont have
      effect on dequeue() rate.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Acked-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9190b3b3
    • D
      kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT · 6a108a14
      David Rientjes 提交于
      The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
      is used to configure any non-standard kernel with a much larger scope than
      only small devices.
      
      This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
      references to the option throughout the kernel.  A new CONFIG_EMBEDDED
      option is added that automatically selects CONFIG_EXPERT when enabled and
      can be used in the future to isolate options that should only be
      considered for embedded systems (RISC architectures, SLOB, etc).
      
      Calling the option "EXPERT" more accurately represents its intention: only
      expert users who understand the impact of the configuration changes they
      are making should enable it.
      Reviewed-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NDavid Woodhouse <david.woodhouse@intel.com>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Cc: Greg KH <gregkh@suse.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Robin Holt <holt@sgi.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6a108a14
  8. 20 1月, 2011 11 次提交
  9. 19 1月, 2011 2 次提交
  10. 16 1月, 2011 3 次提交
  11. 15 1月, 2011 1 次提交
  12. 14 1月, 2011 3 次提交
    • E
      net: remove dev_txq_stats_fold() · 1ac9ad13
      Eric Dumazet 提交于
      After recent changes, (percpu stats on vlan/tunnels...), we dont need
      anymore per struct netdev_queue tx_bytes/tx_packets/tx_dropped counters.
      
      Only remaining users are ixgbe, sch_teql, gianfar & macvlan :
      
      1) ixgbe can be converted to use existing tx_ring counters.
      
      2) macvlan incremented txq->tx_dropped, it can use the
      dev->stats.tx_dropped counter.
      
      3) sch_teql : almost revert ab35cd4b (Use net_device internal stats)
          Now we have ndo_get_stats64(), use it, even for "unsigned long"
      fields (No need to bring back a struct net_device_stats)
      
      4) gianfar adds a stats structure per tx queue to hold
      tx_bytes/tx_packets
      
      This removes a lockdep warning (and possible lockup) in rndis gadget,
      calling dev_get_stats() from hard IRQ context.
      
      Ref: http://www.spinics.net/lists/netdev/msg149202.htmlReported-by: NNeil Jones <neiljay@gmail.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Jarek Poplawski <jarkao2@gmail.com>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: Sandeep Gopalpet <sandeep.kumar@freescale.com>
      CC: Michal Nazarewicz <mina86@mina86.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1ac9ad13
    • J
      batman-adv: Even Batman should not dereference NULL pointers · ed7809d9
      Jesper Juhl 提交于
      There's a problem in net/batman-adv/unicast.c::frag_send_skb().
      dev_alloc_skb() allocates memory and may fail, thus returning NULL. If
      this happens we'll pass a NULL pointer on to skb_split() which in turn
      hands it to skb_split_inside_header() from where it gets passed to
      skb_put() that lets skb_tail_pointer() play with it and that function
      dereferences it. And thus the bat dies.
      
      While I was at it I also moved the call to dev_alloc_skb() above the
      assignment to 'unicast_packet' since there's no reason to do that
      assignment if the memory allocation fails.
      Signed-off-by: NJesper Juhl <jj@chaosbits.net>
      Signed-off-by: NSven Eckelmann <sven@narfation.org>
      ed7809d9
    • L
      mac80211: use maximum number of AMPDU frames as default in BA RX · 82694f76
      Luciano Coelho 提交于
      When the buffer size is set to zero in the block ack parameter set
      field, we should use the maximum supported number of subframes.  The
      existing code was bogus and was doing some unnecessary calculations
      that lead to wrong values.
      
      Thanks Johannes for helping me figure this one out.
      
      Cc: stable@kernel.org
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NLuciano Coelho <coelho@ti.com>
      Reviewed-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      82694f76