1. 26 11月, 2013 1 次提交
    • B
      brcmsmac: Fix build dep on LEDS_CLASS · 0b3f5753
      Borislav Petkov 提交于
      When building randconfigs with CONFIG_BCMA_DRIVER_GPIO=y, I get
      
      drivers/built-in.o: In function `brcms_led_unregister':
      (.text+0x351aca): undefined reference to `led_classdev_unregister'
      drivers/built-in.o: In function `brcms_led_register':
      (.text+0x351c65): undefined reference to `led_classdev_register'
      
      during final linking stage because brcmsmac/led.c needs LEDS_CLASS for
      registering/deregistering the led device. Select the required symbols.
      
      Cc: Arend van Spriel <arend@broadcom.com>
      Cc: "Rafał Miłecki" <zajec5@gmail.com>
      Cc: <linux-wireless@vger.kernel.org>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      0b3f5753
  2. 22 11月, 2013 8 次提交
  3. 21 11月, 2013 15 次提交
    • J
      Merge branch 'master' of... · 7acd7187
      John W. Linville 提交于
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
      7acd7187
    • M
      net/phy: Add the autocross feature for forced links on VSC82x4 · 3fb69bca
      Madalin Bucur 提交于
      Add auto-MDI/MDI-X capability for forced (autonegotiation disabled)
      10/100 Mbps speeds on Vitesse VSC82x4 PHYs. Exported previously static
      function genphy_setup_forced() required by the new config_aneg handler
      in the Vitesse PHY module.
      Signed-off-by: NMadalin Bucur <madalin.bucur@freescale.com>
      Signed-off-by: NShruti Kanetkar <Shruti@freescale.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3fb69bca
    • S
      net/phy: Add VSC8662 support · 06ae4f84
      Sandeep Singh 提交于
      Vitesse VSC8662 is Dual Port 10/100/1000Base-T Phy
      Its register set and features are similar to other Vitesse Phys.
      Signed-off-by: NSandeep Singh <Sandeep@freescale.com>
      Signed-off-by: NAndy Fleming <afleming@gmail.com>
      Signed-off-by: NShruti Kanetkar <Shruti@Freescale.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      06ae4f84
    • S
      net/phy: Add VSC8574 support · c2efef74
      shaohui xie 提交于
      The VSC8574 is a quad-port Gigabit Ethernet transceiver with four SerDes
      interfaces for quad-port dual media capability.
      Signed-off-by: NShaohui Xie <Shaohui.Xie@freescale.com>
      Signed-off-by: NAndy Fleming <afleming@gmail.com>
      Signed-off-by: NShruti Kanetkar <Shruti@freescale.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c2efef74
    • A
      net/phy: Add VSC8234 support · 0508019c
      Andy Fleming 提交于
      Vitesse VSC8234 is quad port 10/100/1000BASE-T PHY
      with SGMII and SERDES MAC interfaces.
      Signed-off-by: NAndy Fleming <afleming@gmail.com>
      Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
      Signed-off-by: NShruti Kanetkar <Shruti@freescale.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0508019c
    • H
      net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage) · 68c6beb3
      Hannes Frederic Sowa 提交于
      In that case it is probable that kernel code overwrote part of the
      stack. So we should bail out loudly here.
      
      The BUG_ON may be removed in future if we are sure all protocols are
      conformant.
      Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68c6beb3
    • H
      net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Hannes Frederic Sowa 提交于
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case recvfrom is called with NULL as address. We don't
      need to copy the address at all, so set it to NULL before invoking the
      recvmsg handler. We can do so, because all the recvmsg handlers must
      cope with the case a plain read() is called on them. read() also sets
      msg_name to NULL.
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy-in the
      address. It also more naturally reflects the logic by the callers of
      verify_iovec.
      
      With this change in place I could remove "
      if (!uaddr || msg_sys->msg_namelen == 0)
      	msg->msg_name = NULL
      ".
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f3d33426
    • D
      bridge: flush br's address entry in fdb when remove the · f8730420
      Ding Tianhong 提交于
       bridge dev
      
      When the following commands are executed:
      
      brctl addbr br0
      ifconfig br0 hw ether <addr>
      rmmod bridge
      
      The calltrace will occur:
      
      [  563.312114] device eth1 left promiscuous mode
      [  563.312188] br0: port 1(eth1) entered disabled state
      [  563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects
      [  563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G           O 3.12.0-0.7-default+ #9
      [  563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  563.468200]  0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8
      [  563.468204]  ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8
      [  563.468206]  ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78
      [  563.468209] Call Trace:
      [  563.468218]  [<ffffffff814d1c92>] dump_stack+0x6a/0x78
      [  563.468234]  [<ffffffff81148efd>] kmem_cache_destroy+0xfd/0x100
      [  563.468242]  [<ffffffffa062a270>] br_fdb_fini+0x10/0x20 [bridge]
      [  563.468247]  [<ffffffffa063ac76>] br_deinit+0x4e/0x50 [bridge]
      [  563.468254]  [<ffffffff810c7dc9>] SyS_delete_module+0x199/0x2b0
      [  563.468259]  [<ffffffff814e0922>] system_call_fastpath+0x16/0x1b
      [  570.377958] Bridge firewalling registered
      
      --------------------------- cut here -------------------------------
      
      The reason is that when the bridge dev's address is changed, the
      br_fdb_change_mac_address() will add new address in fdb, but when
      the bridge was removed, the address entry in the fdb did not free,
      the bridge_fdb_cache still has objects when destroy the cache, Fix
      this by flushing the bridge address entry when removing the bridge.
      
      v2: according to the Toshiaki Makita and Vlad's suggestion, I only
          delete the vlan0 entry, it still have a leak here if the vlan id
          is other number, so I need to call fdb_delete_by_port(br, NULL, 1)
          to flush all entries whose dst is NULL for the bridge.
      Suggested-by: NToshiaki Makita <toshiaki.makita1@gmail.com>
      Suggested-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f8730420
    • V
      net: core: Always propagate flag changes to interfaces · d2615bf4
      Vlad Yasevich 提交于
      The following commit:
          b6c40d68
          net: only invoke dev->change_rx_flags when device is UP
      
      tried to fix a problem with VLAN devices and promiscuouse flag setting.
      The issue was that VLAN device was setting a flag on an interface that
      was down, thus resulting in bad promiscuity count.
      This commit blocked flag propagation to any device that is currently
      down.
      
      A later commit:
          deede2fa
          vlan: Don't propagate flag changes on down interfaces
      
      fixed VLAN code to only propagate flags when the VLAN interface is up,
      thus fixing the same issue as above, only localized to VLAN.
      
      The problem we have now is that if we have create a complex stack
      involving multiple software devices like bridges, bonds, and vlans,
      then it is possible that the flags would not propagate properly to
      the physical devices.  A simple examle of the scenario is the
      following:
      
        eth0----> bond0 ----> bridge0 ---> vlan50
      
      If bond0 or eth0 happen to be down at the time bond0 is added to
      the bridge, then eth0 will never have promisc mode set which is
      currently required for operation as part of the bridge.  As a
      result, packets with vlan50 will be dropped by the interface.
      
      The only 2 devices that implement the special flag handling are
      VLAN and DSA and they both have required code to prevent incorrect
      flag propagation.  As a result we can remove the generic solution
      introduced in b6c40d68 and leave
      it to the individual devices to decide whether they will block
      flag propagation or not.
      Reported-by: NStefan Priebe <s.priebe@profihost.ag>
      Suggested-by: NVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d2615bf4
    • A
      ipv4: fix race in concurrent ip_route_input_slow() · dcdfdf56
      Alexei Starovoitov 提交于
      CPUs can ask for local route via ip_route_input_noref() concurrently.
      if nh_rth_input is not cached yet, CPUs will proceed to allocate
      equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
      via rt_cache_route()
      Most of the time they succeed, but on occasion the following two lines:
      	orig = *p;
      	prev = cmpxchg(p, orig, rt);
      in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
      But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
      so dst is leaking. dst_destroy() is never called and 'lo' device
      refcnt doesn't go to zero, which can be seen in the logs as:
      	unregister_netdevice: waiting for lo to become free. Usage count = 1
      Adding mdelay() between above two lines makes it easily reproducible.
      Fix it similar to nh_pcpu_rth_output case.
      
      Fixes: d2d68ba9 ("ipv4: Cache input routes in fib_info nexthops.")
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dcdfdf56
    • D
      Merge branch 'r8152' · 4f837c3b
      David S. Miller 提交于
      Hayes Wang says:
      
      ====================
      r8152 bug fixes
      
      For the patch #3, I add netif_tx_lock() before checking the
      netif_queue_stopped(). Besides, I add checking the skb queue
      length before waking the tx queue.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f837c3b
    • H
      r8152: fix incorrect type in assignment · 500b6d7e
      hayeswang 提交于
      The data from the hardware should be little endian. Correct the
      declaration.
      Signed-off-by: NHayes Wang <hayeswang@realtek.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      500b6d7e
    • H
      r8152: support stopping/waking tx queue · dd1b119c
      hayeswang 提交于
      The maximum packet number which a tx aggregation buffer could contain
      is the tx_qlen.
      
      	tx_qlen = buffer size / (packet size + descriptor size).
      
      If the tx buffer is empty and the queued packets are more than the
      maximum value which is defined above, stop the tx queue. Wake the
      tx queue if tx queue is stopped and the queued packets are less than
      tx_qlen.
      Signed-off-by: NHayes Wang <hayeswang@realtek.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dd1b119c
    • H
      r8152: modify the tx flow · 61598788
      hayeswang 提交于
      Remove the code for sending the packet in the rtl8152_start_xmit().
      Let rtl8152_start_xmit() to queue the packet only, and schedule a
      tasklet to send the queued packets. This simplify the code and make
      sure all the packet would be sent by the original order.
      Signed-off-by: NHayes Wang <hayeswang@realtek.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      61598788
    • H
      r8152: fix tx/rx memory overflow · 7937f9e5
      hayeswang 提交于
      The tx/rx would access the memory which is out of the desired range.
      Modify the method of checking the end of the memory to avoid it.
      
      For r8152_tx_agg_fill(), the variable remain may become negative.
      However, the declaration is unsigned, so the while loop wouldn't
      break when reaching the end of the desied memory. Although to change
      the declaration from unsigned to signed is enough to fix it, I also
      modify the checking method for safe. Replace
      
      		remain = rx_buf_sz - sizeof(*tx_desc) -
      			 (u32)((void *)tx_data - agg->head);
      
      with
      
      		remain = rx_buf_sz - (int)(tx_agg_align(tx_data) - agg->head);
      
      to make sure the variable remain is always positive. Then, the
      overflow wouldn't happen.
      
      For rx_bottom(), the rx_desc should not be used to calculate the
      packet length before making sure the rx_desc is in the desired range.
      Change the checking to two parts. First, check the descriptor is in
      the memory. The other, using the descriptor to find out the packet
      length and check if the packet is in the memory.
      Signed-off-by: NHayes Wang <hayeswang@realtek.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7937f9e5
  4. 20 11月, 2013 16 次提交
    • M
      aacraid: prevent invalid pointer dereference · b4789b8e
      Mahesh Rajashekhara 提交于
      It appears that driver runs into a problem here if fibsize is too small
      because we allocate user_srbcmd with fibsize size only but later we
      access it until user_srbcmd->sg.count to copy it over to srbcmd.
      
      It is not correct to test (fibsize < sizeof(*user_srbcmd)) because this
      structure already includes one sg element and this is not needed for
      commands without data.  So, we would recommend to add the following
      (instead of test for fibsize == 0).
      Signed-off-by: NMahesh Rajashekhara <Mahesh.Rajashekhara@pmcs.com>
      Reported-by: NNico Golde <nico@ngolde.de>
      Reported-by: NFabian Yamaguchi <fabs@goesec.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b4789b8e
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 1ee2dcc2
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
       "Mostly these are fixes for fallout due to merge window changes, as
        well as cures for problems that have been with us for a much longer
        period of time"
      
       1) Johannes Berg noticed two major deficiencies in our genetlink
          registration.  Some genetlink protocols we passing in constant
          counts for their ops array rather than something like
          ARRAY_SIZE(ops) or similar.  Also, some genetlink protocols were
          using fixed IDs for their multicast groups.
      
          We have to retain these fixed IDs to keep existing userland tools
          working, but reserve them so that other multicast groups used by
          other protocols can not possibly conflict.
      
          In dealing with these two problems, we actually now use less state
          management for genetlink operations and multicast groups.
      
       2) When configuring interface hardware timestamping, fix several
          drivers that simply do not validate that the hwtstamp_config value
          is one the driver actually supports.  From Ben Hutchings.
      
       3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.
      
       4) In dev_forward_skb(), set the skb->protocol in the right order
          relative to skb_scrub_packet().  From Alexei Starovoitov.
      
       5) Bridge erroneously fails to use the proper wrapper functions to make
          calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid.  Fix from Toshiaki
          Makita.
      
       6) When detaching a bridge port, make sure to flush all VLAN IDs to
          prevent them from leaking, also from Toshiaki Makita.
      
       7) Put in a compromise for TCP Small Queues so that deep queued devices
          that delay TX reclaim non-trivially don't have such a performance
          decrease.  One particularly problematic area is 802.11 AMPDU in
          wireless.  From Eric Dumazet.
      
       8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
          here.  Fix from Eric Dumzaet, reported by Dave Jones.
      
       9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.
      
      10) When computing mergeable buffer sizes, virtio-net fails to take the
          virtio-net header into account.  From Michael Dalton.
      
      11) Fix seqlock deadlock in ip4_datagram_connect() wrt.  statistic
          bumping, this one has been with us for a while.  From Eric Dumazet.
      
      12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
          Hugne.
      
      13) 6lowpan bit used for traffic classification was wrong, from Jukka
          Rissanen.
      
      14) macvlan has the same issue as normal vlans did wrt.  propagating LRO
          disabling down to the real device, fix it the same way.  From Michal
          Kubecek.
      
      15) CPSW driver needs to soft reset all slaves during suspend, from
          Daniel Mack.
      
      16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.
      
      17) The xen-netfront RX buffer refill timer isn't properly scheduled on
          partial RX allocation success, from Ma JieYue.
      
      18) When ipv6 ping protocol support was added, the AF_INET6 protocol
          initialization cleanup path on failure was borked a little.  Fix
          from Vlad Yasevich.
      
      19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
          blocks we can do the wrong thing with the msg_name we write back to
          userspace.  From Hannes Frederic Sowa.  There is another fix in the
          works from Hannes which will prevent future problems of this nature.
      
      20) Fix route leak in VTI tunnel transmit, from Fan Du.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
        genetlink: make multicast groups const, prevent abuse
        genetlink: pass family to functions using groups
        genetlink: add and use genl_set_err()
        genetlink: remove family pointer from genl_multicast_group
        genetlink: remove genl_unregister_mc_group()
        hsr: don't call genl_unregister_mc_group()
        quota/genetlink: use proper genetlink multicast APIs
        drop_monitor/genetlink: use proper genetlink multicast APIs
        genetlink: only pass array to genl_register_family_with_ops()
        tcp: don't update snd_nxt, when a socket is switched from repair mode
        atm: idt77252: fix dev refcnt leak
        xfrm: Release dst if this dst is improper for vti tunnel
        netlink: fix documentation typo in netlink_set_err()
        be2net: Delete secondary unicast MAC addresses during be_close
        be2net: Fix unconditional enabling of Rx interface options
        net, virtio_net: replace the magic value
        ping: prevent NULL pointer dereference on write to msg_name
        bnx2x: Prevent "timeout waiting for state X"
        bnx2x: prevent CFC attention
        bnx2x: Prevent panic during DMAE timeout
        ...
      1ee2dcc2
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 4457e6f6
      Linus Torvalds 提交于
      Pull sparc fixes from David Miller:
       "Two merge window fallout build fixes"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: merge fix
        sparc64: fix build regession
      4457e6f6
    • L
      Merge tag 'please-pull-fixia64' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux · e87e7be9
      Linus Torvalds 提交于
      Pull ia64 fix from Tony Luck:
       "Unbreak ia64 build by avoiding circular dependency"
      
      * tag 'please-pull-fixia64' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
        kernel/bounds: avoid circular dependencies in generated headers
      e87e7be9
    • K
      kernel/bounds: avoid circular dependencies in generated headers · 24b9fdc5
      Kirill A. Shutemov 提交于
      <linux/spinlock.h> has heavy dependencies on other header files.
      It triggers circular dependencies in generated headers on IA64, at
      least:
      
        CC      kernel/bounds.s
      In file included from /home/space/kas/git/public/linux/arch/ia64/include/asm/thread_info.h:9:0,
                       from include/linux/thread_info.h:54,
                       from include/asm-generic/preempt.h:4,
                       from arch/ia64/include/generated/asm/preempt.h:1,
                       from include/linux/preempt.h:18,
                       from include/linux/spinlock.h:50,
                       from kernel/bounds.c:14:
      /home/space/kas/git/public/linux/arch/ia64/include/asm/asm-offsets.h:1:35: fatal error: generated/asm-offsets.h: No such file or directory
      compilation terminated.
      
      Let's replace <linux/spinlock.h> with <linux/spinlock_types.h>, it's
      enough to find out size of spinlock_t.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-and-Tested-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      24b9fdc5
    • D
      Merge branch 'genetlink_mcast' · 091e0662
      David S. Miller 提交于
      Johannes Berg says:
      
      ====================
      genetlink: clean up multicast group APIs
      
      The generic netlink multicast group registration doesn't have to
      be dynamic, and can thus be simplified just like I did with the
      ops. This removes some complexity in registration code.
      
      Additionally, two users of generic netlink already use multicast
      groups in a wrong way, add workarounds for those two to keep the
      userspace API working, but at the same time make them not clash
      with other users of multicast groups as might happen now.
      
      While making it all a bit easier, also prevent such abuse by adding
      checks to the APIs so each family can only use the groups it owns.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      091e0662
    • J
      genetlink: make multicast groups const, prevent abuse · 2a94fe48
      Johannes Berg 提交于
      Register generic netlink multicast groups as an array with
      the family and give them contiguous group IDs. Then instead
      of passing the global group ID to the various functions that
      send messages, pass the ID relative to the family - for most
      families that's just 0 because the only have one group.
      
      This avoids the list_head and ID in each group, adding a new
      field for the mcast group ID offset to the family.
      
      At the same time, this allows us to prevent abusing groups
      again like the quota and dropmon code did, since we can now
      check that a family only uses a group it owns.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2a94fe48
    • J
      genetlink: pass family to functions using groups · 68eb5503
      Johannes Berg 提交于
      This doesn't really change anything, but prepares for the
      next patch that will change the APIs to pass the group ID
      within the family, rather than the global group ID.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68eb5503
    • J
      genetlink: add and use genl_set_err() · 62b68e99
      Johannes Berg 提交于
      Add a static inline to generic netlink to wrap netlink_set_err()
      to make it easier to use here - use it in openvswitch (the only
      generic netlink user of netlink_set_err()).
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      62b68e99
    • J
      genetlink: remove family pointer from genl_multicast_group · c2ebb908
      Johannes Berg 提交于
      There's no reason to have the family pointer there since it
      can just be passed internally where needed, so remove it.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c2ebb908
    • J
      genetlink: remove genl_unregister_mc_group() · 06fb555a
      Johannes Berg 提交于
      There are no users of this API remaining, and we'll soon
      change group registration to be static (like ops are now)
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      06fb555a
    • J
      hsr: don't call genl_unregister_mc_group() · 03ed3827
      Johannes Berg 提交于
      There's no need to unregister the multicast group if the
      generic netlink family is registered immediately after.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      03ed3827
    • J
      quota/genetlink: use proper genetlink multicast APIs · 2ecf7536
      Johannes Berg 提交于
      The quota code is abusing the genetlink API and is using
      its family ID as the multicast group ID, which is invalid
      and may belong to somebody else (and likely will.)
      
      Make the quota code use the correct API, but since this
      is already used as-is by userspace, reserve a family ID
      for this code and also reserve that group ID to not break
      userspace assumptions.
      Acked-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ecf7536
    • J
      drop_monitor/genetlink: use proper genetlink multicast APIs · e5dcecba
      Johannes Berg 提交于
      The drop monitor code is abusing the genetlink API and is
      statically using the generic netlink multicast group 1, even
      if that group belongs to somebody else (which it invariably
      will, since it's not reserved.)
      
      Make the drop monitor code use the proper APIs to reserve a
      group ID, but also reserve the group id 1 in generic netlink
      code to preserve the userspace API. Since drop monitor can
      be a module, don't clear the bit for it on unregistration.
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e5dcecba
    • J
      genetlink: only pass array to genl_register_family_with_ops() · c53ed742
      Johannes Berg 提交于
      As suggested by David Miller, make genl_register_family_with_ops()
      a macro and pass only the array, evaluating ARRAY_SIZE() in the
      macro, this is a little safer.
      
      The openvswitch has some indirection, assing ops/n_ops directly in
      that code. This might ultimately just assign the pointers in the
      family initializations, saving the struct genl_family_and_ops and
      code (once mcast groups are handled differently.)
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c53ed742
    • A
      tcp: don't update snd_nxt, when a socket is switched from repair mode · dbde4979
      Andrey Vagin 提交于
      snd_nxt must be updated synchronously with sk_send_head.  Otherwise
      tp->packets_out may be updated incorrectly, what may bring a kernel panic.
      
      Here is a kernel panic from my host.
      [  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
      [  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
      ...
      [  146.301158] Call Trace:
      [  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0
      
      Before this panic a tcp socket was restored. This socket had sent and
      unsent data in the write queue. Sent data was restored in repair mode,
      then the socket was switched from reapair mode and unsent data was
      restored. After that the socket was switched back into repair mode.
      
      In that moment we had a socket where write queue looks like this:
      snd_una    snd_nxt   write_seq
         |_________|________|
                   |
      	  sk_send_head
      
      After a second switching from repair mode the state of socket was
      changed:
      
      snd_una          snd_nxt, write_seq
         |_________ ________|
                   |
      	  sk_send_head
      
      This state is inconsistent, because snd_nxt and sk_send_head are not
      synchronized.
      
      Bellow you can find a call trace, how packets_out can be incremented
      twice for one skb, if snd_nxt and sk_send_head are not synchronized.
      In this case packets_out will be always positive, even when
      sk_write_queue is empty.
      
      tcp_write_wakeup
      	skb = tcp_send_head(sk);
      	tcp_fragment
      		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
      			tcp_adjust_pcount(sk, skb, diff);
      	tcp_event_new_data_sent
      		tp->packets_out += tcp_skb_pcount(skb);
      
      I think update of snd_nxt isn't required, when a socket is switched from
      repair mode.  Because it's initialized in tcp_connect_init. Then when a
      write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
      so it's always is in consistent state.
      
      I have checked, that the bug is not reproduced with this patch and
      all tests about restoring tcp connections work fine.
      
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Acked-by: NPavel Emelyanov <xemul@parallels.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dbde4979