1. 12 2月, 2011 1 次提交
  2. 11 2月, 2011 1 次提交
  3. 10 2月, 2011 1 次提交
  4. 09 2月, 2011 3 次提交
    • P
      netfilter: nf_conntrack: set conntrack templates again if we return NF_REPEAT · c3174286
      Pablo Neira Ayuso 提交于
      The TCP tracking code has a special case that allows to return
      NF_REPEAT if we receive a new SYN packet while in TIME_WAIT state.
      
      In this situation, the TCP tracking code destroys the existing
      conntrack to start a new clean session.
      
      [DESTROY] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=38925 dport=8000 src=192.168.1.2 dst=192.168.1.100 sport=8000 dport=38925 [ASSURED]
          [NEW] tcp      6 120 SYN_SENT src=192.168.0.2 dst=192.168.1.2 sport=38925 dport=8000 [UNREPLIED] src=192.168.1.2 dst=192.168.1.100 sport=8000 dport=38925
      
      However, this is a problem for the iptables' CT target event filtering
      which will not work in this case since the conntrack template will not
      be there for the new session. To fix this, we reassign the conntrack
      template to the packet if we return NF_REPEAT.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      c3174286
    • D
      net: Fix lockdep regression caused by initializing netdev queues too early. · 8d3bdbd5
      David S. Miller 提交于
      In commit aa942104 ("net: init ingress
      queue") we moved the allocation and lock initialization of the queues
      into alloc_netdev_mq() since register_netdevice() is way too late.
      
      The problem is that dev->type is not setup until the setup()
      callback is invoked by alloc_netdev_mq(), and the dev->type is
      what determines the lockdep class to use for the locks in the
      queues.
      
      Fix this by doing the queue allocation after the setup() callback
      runs.
      
      This is safe because the setup() callback is not allowed to make any
      state changes that need to be undone on error (memory allocations,
      etc.).  It may, however, make state changes that are undone by
      free_netdev() (such as netif_napi_add(), which is done by the
      ipoib driver's setup routine).
      
      The previous code also leaked a reference to the &init_net namespace
      object on RX/TX queue allocation failures.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d3bdbd5
    • D
      net/caif: Fix dangling list pointer in freed object on error. · b2df5a84
      David S. Miller 提交于
      rtnl_link_ops->setup(), and the "setup" callback passed to alloc_netdev*(),
      cannot make state changes which need to be undone on failure.  There is
      no cleanup mechanism available at this point.
      
      So we have to add the caif private instance to the global list once we
      are sure that register_netdev() has succedded in ->newlink().
      
      Otherwise, if register_netdev() fails, the caller will invoke free_netdev()
      and we will have a reference to freed up memory on the chnl_net_list.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b2df5a84
  5. 08 2月, 2011 3 次提交
  6. 05 2月, 2011 1 次提交
  7. 04 2月, 2011 3 次提交
  8. 03 2月, 2011 2 次提交
    • A
      gro: reset skb_iif on reuse · 6d152e23
      Andy Gospodarek 提交于
      Like Herbert's change from a few days ago:
      
      66c46d74 gro: Reset dev pointer on reuse
      
      this may not be necessary at this point, but we should still clean up
      the skb->skb_iif.  If not we may end up with an invalid valid for
      skb->skb_iif when the skb is reused and the check is done in
      __netif_receive_skb.
      Signed-off-by: NAndy Gospodarek <andy@greyhouse.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d152e23
    • J
      mac80211: fix TX status cookie in HW offload case · 4334ec85
      Johannes Berg 提交于
      When the off-channel TX is done with remain-on-channel
      offloaded to hardware, the reported cookie is wrong as
      in that case we shouldn't use the SKB as the cookie but
      need to instead use the corresponding r-o-c cookie
      (XOR'ed with 2 to prevent API mismatches).
      
      Fix this by keeping track of the hw_roc_skb pointer
      just for the status processing and use the correct
      cookie to report in this case. We can't use the
      hw_roc_skb pointer itself because it is NULL'ed when
      the frame is transmitted to prevent it being used
      twice.
      
      This fixes a bug where the P2P state machine in the
      supplicant gets stuck because it never gets a correct
      result for its transmitted frame.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      4334ec85
  9. 01 2月, 2011 5 次提交
    • P
      netfilter: ecache: always set events bits, filter them later · 3db7e93d
      Pablo Neira Ayuso 提交于
      For the following rule:
      
      iptables -I PREROUTING -t raw -j CT --ctevents assured
      
      The event delivered looks like the following:
      
       [UPDATE] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
      
      Note that the TCP protocol state is not included. For that reason
      the CT event filtering is not very useful for conntrackd.
      
      To resolve this issue, instead of conditionally setting the CT events
      bits based on the ctmask, we always set them and perform the filtering
      in the late stage, just before the delivery.
      
      Thus, the event delivered looks like the following:
      
       [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      3db7e93d
    • P
      netfilter: arpt_mangle: fix return values of checkentry · 9d0db8b6
      Pablo Neira Ayuso 提交于
      In 135367b8 "netfilter: xtables: change xt_target.checkentry return type",
      the type returned by checkentry was changed from boolean to int, but the
      return values where not adjusted.
      
      arptables: Input/output error
      
      This broke arptables with the mangle target since it returns true
      under success, which is interpreted by xtables as >0, thus
      returning EIO.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      9d0db8b6
    • E
      net: Fix ipv6 neighbour unregister_sysctl_table warning · bf36076a
      Eric W. Biederman 提交于
      In my testing of 2.6.37 I was occassionally getting a warning about
      sysctl table entries being unregistered in the wrong order.  Digging
      in it turns out this dates back to the last great sysctl reorg done
      where Al Viro introduced the requirement that sysctl directories
      needed to be created before and destroyed after the files in them.
      
      It turns out that in that great reorg /proc/sys/net/ipv6/neigh was
      overlooked.  So this patch fixes that oversight and makes an annoying
      warning message go away.
      
      >------------[ cut here ]------------
      >WARNING: at kernel/sysctl.c:1992 unregister_sysctl_table+0x134/0x164()
      >Pid: 23951, comm: kworker/u:3 Not tainted 2.6.37-350888.2010AroraKernelBeta.fc14.x86_64 #1
      >Call Trace:
      > [<ffffffff8103e034>] warn_slowpath_common+0x80/0x98
      > [<ffffffff8103e061>] warn_slowpath_null+0x15/0x17
      > [<ffffffff810452f8>] unregister_sysctl_table+0x134/0x164
      > [<ffffffff810e7834>] ? kfree+0xc4/0xd1
      > [<ffffffff813439b2>] neigh_sysctl_unregister+0x22/0x3a
      > [<ffffffffa02cd14e>] addrconf_ifdown+0x33f/0x37b [ipv6]
      > [<ffffffff81331ec2>] ? skb_dequeue+0x5f/0x6b
      > [<ffffffffa02ce4a5>] addrconf_notify+0x69b/0x75c [ipv6]
      > [<ffffffffa02eb953>] ? ip6mr_device_event+0x98/0xa9 [ipv6]
      > [<ffffffff813d2413>] notifier_call_chain+0x32/0x5e
      > [<ffffffff8105bdea>] raw_notifier_call_chain+0xf/0x11
      > [<ffffffff8133cdac>] call_netdevice_notifiers+0x45/0x4a
      > [<ffffffff8133d2b0>] rollback_registered_many+0x118/0x201
      > [<ffffffff8133d3af>] unregister_netdevice_many+0x16/0x6d
      > [<ffffffff8133d571>] default_device_exit_batch+0xa4/0xb8
      > [<ffffffff81337c42>] ? cleanup_net+0x0/0x194
      > [<ffffffff81337a2a>] ops_exit_list+0x4e/0x56
      > [<ffffffff81337d36>] cleanup_net+0xf4/0x194
      > [<ffffffff81053318>] process_one_work+0x187/0x280
      > [<ffffffff8105441b>] worker_thread+0xff/0x19f
      > [<ffffffff8105431c>] ? worker_thread+0x0/0x19f
      > [<ffffffff8105776d>] kthread+0x7d/0x85
      > [<ffffffff81003824>] kernel_thread_helper+0x4/0x10
      > [<ffffffff810576f0>] ? kthread+0x0/0x85
      > [<ffffffff81003820>] ? kernel_thread_helper+0x0/0x10
      >---[ end trace 8a7e9310b35e9486 ]---
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bf36076a
    • T
      net: Check rps_flow_table when RPS map length is 1 · 85875236
      Tom Herbert 提交于
      In get_rps_cpu, add check that the rps_flow_table for the device is
      NULL when trying to take fast path when RPS map length is one.
      Without this, RFS is effectively disabled if map length is one which
      is not correct.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      85875236
    • R
      net: Add default_mtu() methods to blackhole dst_ops · ec831ea7
      Roland Dreier 提交于
      When an IPSEC SA is still being set up, __xfrm_lookup() will return
      -EREMOTE and so ip_route_output_flow() will return a blackhole route.
      This can happen in a sndmsg call, and after d33e4553 ("net: Abstract
      default MTU metric calculation behind an accessor.") this leads to a
      crash in ip_append_data() because the blackhole dst_ops have no
      default_mtu() method and so dst_mtu() calls a NULL pointer.
      
      Fix this by adding default_mtu() methods (that simply return 0, matching
      the old behavior) to the blackhole dst_ops.
      
      The IPv4 part of this patch fixes a crash that I saw when using an IPSEC
      VPN; the IPv6 part is untested because I don't have an IPv6 VPN, but it
      looks to be needed as well.
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec831ea7
  10. 30 1月, 2011 6 次提交
    • S
      batman-adv: Make vis info stack traversal threadsafe · 1181e1da
      Sven Eckelmann 提交于
      The batman-adv vis server has to a stack which stores all information
      about packets which should be send later. This stack is protected
      with a spinlock that is used to prevent concurrent write access to it.
      
      The send_vis_packets function has to take all elements from the stack
      and send them to other hosts over the primary interface. The send will
      be initiated without the lock which protects the stack.
      
      The implementation using list_for_each_entry_safe has the problem that
      it stores the next element as "safe ptr" to allow the deletion of the
      current element in the list. The list may be modified during the
      unlock/lock pair in the loop body which may make the safe pointer
      not pointing to correct next element.
      
      It is safer to remove and use the first element from the stack until no
      elements are available. This does not need reduntant information which
      would have to be validated each time the lock was removed.
      Reported-by: NRussell Senior <russell@personaltelco.net>
      Signed-off-by: NSven Eckelmann <sven@narfation.org>
      1181e1da
    • S
      batman-adv: Remove vis info element in free_info · dda9fc6b
      Sven Eckelmann 提交于
      The free_info function will be called when no reference to the info
      object exists anymore. It must be ensured that the allocated memory
      gets freed and not only the elements which are managed by the info
      object.
      Signed-off-by: NSven Eckelmann <sven@narfation.org>
      dda9fc6b
    • S
      batman-adv: Remove vis info on hashing errors · 2674c158
      Sven Eckelmann 提交于
      A newly created vis info object must be removed when it couldn't be
      added to the hash. The old_info which has to be replaced was already
      removed and isn't related to the hash anymore.
      Signed-off-by: NSven Eckelmann <sven@narfation.org>
      2674c158
    • E
      net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT · 709b46e8
      Eric W. Biederman 提交于
      SIOCGETSGCNT is not a unique ioctl value as it it maps tio SIOCPROTOPRIVATE +1,
      which unfortunately means the existing infrastructure for compat networking
      ioctls is insufficient.  A trivial compact ioctl implementation would conflict
      with:
      
      SIOCAX25ADDUID
      SIOCAIPXPRISLT
      SIOCGETSGCNT_IN6
      SIOCGETSGCNT
      SIOCRSSCAUSE
      SIOCX25SSUBSCRIP
      SIOCX25SDTEFACILITIES
      
      To make this work I have updated the compat_ioctl decode path to mirror the
      the normal ioctl decode path.  I have added an ipv4 inet_compat_ioctl function
      so that I can have ipv4 specific compat ioctls.   I have added a compat_ioctl
      function into struct proto so I can break out ioctls by which kind of ip socket
      I am using.  I have added a compat_raw_ioctl function because SIOCGETSGCNT only
      works on raw sockets.  I have added a ipmr_compat_ioctl that mirrors the normal
      ipmr_ioctl.
      
      This was necessary because unfortunately the struct layout for the SIOCGETSGCNT
      has unsigned longs in it so changes between 32bit and 64bit kernels.
      
      This change was sufficient to run a 32bit ip multicast routing daemon on a
      64bit kernel.
      Reported-by: NBill Fenner <fenner@aristanetworks.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      709b46e8
    • E
      net: Fix ip link add netns oops · 13ad1774
      Eric W. Biederman 提交于
      Ed Swierk <eswierk@bigswitch.com> writes:
      > On 2.6.35.7
      >  ip link add link eth0 netns 9999 type macvlan
      > where 9999 is a nonexistent PID triggers an oops and causes all network functions to hang:
      > [10663.821898] BUG: unable to handle kernel NULL pointer dereference at 000000000000006d
      >  [10663.821917] IP: [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.821933] PGD 1d3927067 PUD 22f5c5067 PMD 0
      >  [10663.821944] Oops: 0000 [#1] SMP
      >  [10663.821953] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
      >  [10663.821959] CPU 3
      >  [10663.821963] Modules linked in: macvlan ip6table_filter ip6_tables rfcomm ipt_MASQUERADE binfmt_misc iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sco ipt_REJECT bnep l2cap xt_tcpudp iptable_filter ip_tables x_tables bridge stp vboxnetadp vboxnetflt vboxdrv kvm_intel kvm parport_pc ppdev snd_hda_codec_intelhdmi snd_hda_codec_conexant arc4 iwlagn iwlcore mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi i915 snd_seq_midi_event snd_seq thinkpad_acpi drm_kms_helper btusb tpm_tis nvram uvcvideo snd_timer snd_seq_device bluetooth videodev v4l1_compat v4l2_compat_ioctl32 tpm drm tpm_bios snd cfg80211 psmouse serio_raw intel_ips soundcore snd_page_alloc intel_agp i2c_algo_bit video output netconsole configfs lp parport usbhid hid e1000e sdhci_pci ahci libahci sdhci led_class
      >  [10663.822155]
      >  [10663.822161] Pid: 6000, comm: ip Not tainted 2.6.35-23-generic #41-Ubuntu 2901CTO/2901CTO
      >  [10663.822167] RIP: 0010:[<ffffffff8149c2fa>] [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.822177] RSP: 0018:ffff88014aebf7b8 EFLAGS: 00010286
      >  [10663.822182] RAX: 00000000fffffff4 RBX: ffff8801ad900800 RCX: 0000000000000000
      >  [10663.822187] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffff88014ad63000
      >  [10663.822191] RBP: ffff88014aebf808 R08: 0000000000000041 R09: 0000000000000041
      >  [10663.822196] R10: 0000000000000000 R11: dead000000200200 R12: ffff88014aebf818
      >  [10663.822201] R13: fffffffffffffffd R14: ffff88014aebf918 R15: ffff88014ad62000
      >  [10663.822207] FS: 00007f00c487f700(0000) GS:ffff880001f80000(0000) knlGS:0000000000000000
      >  [10663.822212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      >  [10663.822216] CR2: 000000000000006d CR3: 0000000231f19000 CR4: 00000000000026e0
      >  [10663.822221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      >  [10663.822226] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      >  [10663.822231] Process ip (pid: 6000, threadinfo ffff88014aebe000, task ffff88014afb16e0)
      >  [10663.822236] Stack:
      >  [10663.822240] ffff88014aebf808 ffffffff814a2bb5 ffff88014aebf7e8 00000000a00ee8d6
      >  [10663.822251] <0> 0000000000000000 ffffffffa00ef940 ffff8801ad900800 ffff88014aebf818
      >  [10663.822265] <0> ffff88014aebf918 ffff8801ad900800 ffff88014aebf858 ffffffff8149c413
      >  [10663.822281] Call Trace:
      >  [10663.822290] [<ffffffff814a2bb5>] ? dev_addr_init+0x75/0xb0
      >  [10663.822298] [<ffffffff8149c413>] dev_alloc_name+0x43/0x90
      >  [10663.822307] [<ffffffff814a85ee>] rtnl_create_link+0xbe/0x1b0
      >  [10663.822314] [<ffffffff814ab2aa>] rtnl_newlink+0x48a/0x570
      >  [10663.822321] [<ffffffff814aafcc>] ? rtnl_newlink+0x1ac/0x570
      >  [10663.822332] [<ffffffff81030064>] ? native_x2apic_icr_read+0x4/0x20
      >  [10663.822339] [<ffffffff814a8c17>] rtnetlink_rcv_msg+0x177/0x290
      >  [10663.822346] [<ffffffff814a8aa0>] ? rtnetlink_rcv_msg+0x0/0x290
      >  [10663.822354] [<ffffffff814c25d9>] netlink_rcv_skb+0xa9/0xd0
      >  [10663.822360] [<ffffffff814a8a85>] rtnetlink_rcv+0x25/0x40
      >  [10663.822367] [<ffffffff814c223e>] netlink_unicast+0x2de/0x2f0
      >  [10663.822374] [<ffffffff814c303e>] netlink_sendmsg+0x1fe/0x2e0
      >  [10663.822383] [<ffffffff81488533>] sock_sendmsg+0xf3/0x120
      >  [10663.822391] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822400] [<ffffffff81168656>] ? __d_lookup+0x136/0x150
      >  [10663.822406] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822414] [<ffffffff812b7a0d>] ? _atomic_dec_and_lock+0x4d/0x80
      >  [10663.822422] [<ffffffff8116ea90>] ? mntput_no_expire+0x30/0x110
      >  [10663.822429] [<ffffffff81486ff5>] ? move_addr_to_kernel+0x65/0x70
      >  [10663.822435] [<ffffffff81493308>] ? verify_iovec+0x88/0xe0
      >  [10663.822442] [<ffffffff81489020>] sys_sendmsg+0x240/0x3a0
      > [10663.822450] [<ffffffff8111e2a9>] ? __do_fault+0x479/0x560
      >  [10663.822457] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822465] [<ffffffff8116cf4a>] ? alloc_fd+0x10a/0x150
      >  [10663.822473] [<ffffffff8158d76e>] ? do_page_fault+0x15e/0x350
      >  [10663.822482] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
      >  [10663.822487] Code: 90 48 8d 78 02 be 25 00 00 00 e8 92 1d e2 ff 48 85 c0 75 cf bf 20 00 00 00 e8 c3 b1 c6 ff 49 89 c7 b8 f4 ff ff ff 4d 85 ff 74 bd <4d> 8b 75 70 49 8d 45 70 48 89 45 b8 49 83 ee 58 eb 28 48 8d 55
      >  [10663.822618] RIP [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.822627] RSP <ffff88014aebf7b8>
      >  [10663.822631] CR2: 000000000000006d
      >  [10663.822636] ---[ end trace 3dfd6c3ad5327ca7 ]---
      
      This bug was introduced in:
      commit 81adee47
      Author: Eric W. Biederman <ebiederm@aristanetworks.com>
      Date:   Sun Nov 8 00:53:51 2009 -0800
      
          net: Support specifying the network namespace upon device creation.
      
          There is no good reason to not support userspace specifying the
          network namespace during device creation, and it makes it easier
          to create a network device and pass it to a child network namespace
          with a well known name.
      
          We have to be careful to ensure that the target network namespace
          for the new device exists through the life of the call.  To keep
          that logic clear I have factored out the network namespace grabbing
          logic into rtnl_link_get_net.
      
          In addtion we need to continue to pass the source network namespace
          to the rtnl_link_ops.newlink method so that we can find the base
          device source network namespace.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      
      Where apparently I forgot to add error handling to the path where we create
      a new network device in a new network namespace, and pass in an invalid pid.
      
      Cc: stable@kernel.org
      Reported-by: NEd Swierk <eswierk@bigswitch.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      13ad1774
    • H
      gro: Reset dev pointer on reuse · 66c46d74
      Herbert Xu 提交于
      On older kernels the VLAN code may zero skb->dev before dropping
      it and causing it to be reused by GRO.
      
      Unfortunately we didn't reset skb->dev in that case which causes
      the next GRO user to get a bogus skb->dev pointer.
      
      This particular problem no longer happens with the current upstream
      kernel due to changes in VLAN processing.
      
      However, for correctness we should still reset the skb->dev pointer
      in the GRO reuse function in case a future user does the same thing.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      66c46d74
  11. 28 1月, 2011 4 次提交
  12. 27 1月, 2011 1 次提交
  13. 26 1月, 2011 4 次提交
    • L
      batman-adv: Fix kernel panic when fetching vis data on a vis server · dd58ddc6
      Linus Lüssing 提交于
      The hash_iterate removal introduced a bug leading to a kernel panic when
      fetching the vis data on a vis server. That commit forgot to rename one
      variable name, which this commit fixes now.
      Reported-by: NRussell Senior <russell@personaltelco.net>
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NSven Eckelmann <sven@narfation.org>
      dd58ddc6
    • J
      TCP: fix a bug that triggers large number of TCP RST by mistake · 44f5324b
      Jerry Chu 提交于
      This patch fixes a bug that causes TCP RST packets to be generated
      on otherwise correctly behaved applications, e.g., no unread data
      on close,..., etc. To trigger the bug, at least two conditions must
      be met:
      
      1. The FIN flag is set on the last data packet, i.e., it's not on a
      separate, FIN only packet.
      2. The size of the last data chunk on the receive side matches
      exactly with the size of buffer posted by the receiver, and the
      receiver closes the socket without any further read attempt.
      
      This bug was first noticed on our netperf based testbed for our IW10
      proposal to IETF where a large number of RST packets were observed.
      netperf's read side code meets the condition 2 above 100%.
      
      Before the fix, tcp_data_queue() will queue the last skb that meets
      condition 1 to sk_receive_queue even though it has fully copied out
      (skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
      tcp_recvmsg() often returns all the copied out data successfully
      without actually consuming the skb, due to a check
      "if ((chunk = len - tp->ucopy.len) != 0) {"
      and
      "len -= chunk;"
      after tcp_prequeue_process() that causes "len" to become 0 and an
      early exit from the big while loop.
      
      I don't see any reason not to free the skb whose data have been fully
      consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
      get there if MSG_PEEK is on. Am I missing some arcane cases related
      to urgent data?
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44f5324b
    • F
      mac80211: fix a crash in ieee80211_beacon_get_tim on change_interface · eb3e554b
      Felix Fietkau 提交于
      Some drivers (e.g. ath9k) do not always disable beacons when they're
      supposed to. When an interface is changed using the change_interface op,
      the mode specific sdata part is in an undefined state and trying to
      get a beacon at this point can produce weird crashes.
      
      To fix this, add a check for ieee80211_sdata_running before using
      anything from the sdata.
      Signed-off-by: NFelix Fietkau <nbd@openwrt.org>
      Cc: stable@kernel.org
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      eb3e554b
    • D
      ipv6: Revert 'administrative down' address handling changes. · 73a8bd74
      David S. Miller 提交于
      This reverts the following set of commits:
      
      d1ed113f ("ipv6: remove duplicate neigh_ifdown")
      29ba5fed ("ipv6: don't flush routes when setting loopback down")
      9d82ca98 ("ipv6: fix missing in6_ifa_put in addrconf")
      2de79570 ("ipv6: addrconf: don't remove address state on ifdown if the address is being kept")
      8595805a ("IPv6: only notify protocols if address is compeletely gone")
      27bdb2ab ("IPv6: keep tentative addresses in hash table")
      93fa159a ("IPv6: keep route for tentative address")
      8f37ada5 ("IPv6: fix race between cleanup and add/delete address")
      84e8b803 ("IPv6: addrconf notify when address is unavailable")
      dc2b99f7 ("IPv6: keep permanent addresses on admin down")
      
      because the core semantic change to ipv6 address handling on ifdown
      has broken some things, in particular "disable_ipv6" sysctl handling.
      
      Stephen has made several attempts to get things back in working order,
      but nothing has restored disable_ipv6 fully yet.
      Reported-by: NEric W. Biederman <ebiederm@xmission.com>
      Tested-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      73a8bd74
  14. 25 1月, 2011 5 次提交