1. 18 Jul 2008, 11 commits
    • pkt_sched: Schedule qdiscs instead of netdev_queue. · 37437bb2
      Committed by David S. Miller
      When we have shared qdiscs, packets come out of the qdiscs
      for multiple transmit queues.
      
      Therefore it doesn't make any sense to schedule the transmit
      queue when logically we cannot know ahead of time the TX
      queue of the SKB that the qdisc->dequeue() will give us.
      
      Just for sanity I added a BUG check to make sure we never
      get into a state where the noop_qdisc is scheduled.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: Make QDISC_RUNNING a qdisc state. · e2627c8c
      Committed by David S. Miller
      Currently it is associated with a netdev_queue, but when we have
      qdisc sharing that no longer makes any sense.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: Move gso_skb into Qdisc. · d3b753db
      Committed by David S. Miller
      We liberate any dangling gso_skb during qdisc destruction.
      
      It really only matters for the root qdisc.  But when qdiscs
      can be shared by multiple netdev_queue objects, we can't
      have the gso_skb in the netdev_queue any more.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdev: Kill plain netif_schedule() · 92831bc3
      Committed by David S. Miller
      No more users.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdev: Add netdev->select_queue() method. · eae792b7
      Committed by David S. Miller
      Devices or device layers can set this to control the queue selection
      performed by dev_pick_tx().
      
      This function runs under RCU protection, which allows overriding
      functions to have some way of synchronizing with things like dynamic
      ->real_num_tx_queues adjustments.
      
      This makes the spinlock prefetch in dev_queue_xmit() a little bit
      less effective, but that's the price right now for correctness.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdev: netdev_priv() can now be sane again. · e3c50d5d
      Committed by David S. Miller
      The private area of a netdev is now at a fixed offset once more.
      
      Unfortunately, some assumptions that netdev_priv() == netdev->priv
      crept back into the tree.  In particular this happened in the
      loopback driver.  Make it use netdev->ml_priv.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 6b0fb126
    • net: Use queue aware tests throughout. · fd2ea0a7
      Committed by David S. Miller
      This effectively "flips the switch" by making the core networking
      and multiqueue-aware drivers use the new TX multiqueue structures.
      
      Non-multiqueue drivers need no changes.  The interfaces they use such
      as netif_stop_queue() degenerate into an operation on TX queue zero.
      So everything "just works" for them.
      
      Code that really wants to do "X" to all TX queues now invokes a
      routine that does so, such as netif_tx_wake_all_queues(),
      netif_tx_stop_all_queues(), etc.
      
      pktgen and netpoll required a little bit more surgery than the others.
      
      In particular the pktgen changes, whilst functional, could be largely
      improved.  The initial check in pktgen_xmit() will sometimes check the
      wrong queue, which is mostly harmless.  The thing to do is probably to
      invoke fill_packet() earlier.
      
      The bulk of the netpoll changes is to make the code operate solely on
      the TX queue indicated by the SKB queue mapping.
      
      Setting of the SKB queue mapping is entirely confined inside of
      net/core/dev.c:dev_pick_tx().  If we end up needing any kind of
      special semantics (drops, for example) it will be implemented here.
      
      Finally, we now have a "real_num_tx_queues" which is where the driver
      indicates how many TX queues are actually active.
      
      With IGB changes from Jeff Kirsher.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: Remove RR scheduler. · 1d8ae3fd
      Committed by David S. Miller
      This actually fixes a bug added by the RR scheduler changes.  The
      ->bands and ->prio2band parameters were being set outside of the
      sch_tree_lock() and thus could result in strange behavior and
      inconsistencies.
      
      It might be possible, in the new design (where there will be one qdisc
      per device TX queue) to allow similar functionality via a TX hash
      algorithm for RR but I really see no reason to export this aspect of
      how these multiqueue cards actually implement the scheduling of
      the individual DMA TX rings and the single physical MAC/PHY port.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdev: Kill NETIF_F_MULTI_QUEUE. · 09e83b5d
      Committed by David S. Miller
      There is no need for a feature bit for something that
      can be tested by simply checking the TX queue count.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdev: Allocate multiple queues for TX. · e8a0464c
      Committed by David S. Miller
      alloc_netdev_mq() now allocates an array of netdev_queue
      structures for TX, based upon the queue_count argument.
      
      Furthermore, all accesses to the TX queues are now vectored
      through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
      interfaces.  This makes it easy to grep the tree for all
      things that want to get to a TX queue of a net device.
      
      Problem spots which are not really multiqueue aware yet, and
      only work with one queue, can easily be spotted by grepping
      for all netdev_get_tx_queue() calls that pass in a zero index.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 15 Jul 2008, 9 commits
    • netdev: Add netdev->addr_list_lock protection. · e308a5d8
      Committed by David S. Miller
      Add netif_addr_{lock,unlock}{,_bh}() helpers.
      
      Use them to protect operations that operate on or read
      the network device unicast and multicast address lists.
      
      Also use them in cases where the code simply wants to
      block calls into the driver's ->set_rx_mode() and
      ->set_multicast_list() methods.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdev: Add addr_list_lock to struct net_device. · f1f28aa3
      Committed by David S. Miller
      This will be used to protect the per-device unicast and multicast
      address lists, as well as the callbacks into the drivers which
      configure such state such as ->set_rx_mode() and ->set_multicast_list().
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • packet: deliver VLAN TCI to userspace · 393e52e3
      Committed by Patrick McHardy
      Store the VLAN tag in the auxiliary data/tpacket2_hdr so userspace can
      properly deal with hardware VLAN tagging/stripping.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • packet: support extensible, 64 bit clean mmaped ring structure · bbd6ef87
      Committed by Patrick McHardy
      The tpacket_hdr is not 64 bit clean due to use of an unsigned long
      and can't be extended because the following struct sockaddr_ll needs
      to be at a fixed offset.
      
      Add support for a version 2 tpacket protocol that removes these
      limitations.
      
      Userspace can query the header size through a new getsockopt option
      and change the protocol version through a setsockopt option. The
      changes needed to switch to the new protocol version are:
      
      1. replace struct tpacket_hdr by struct tpacket2_hdr
      2. query header len and save
      3. set protocol version to 2
       - set up ring as usual
      4. for getting the sockaddr_ll, use (void *)hdr + TPACKET_ALIGN(hdrlen)
         instead of (void *)hdr + TPACKET_ALIGN(sizeof(struct tpacket_hdr))
      
      Steps 2 and 4 can be omitted if the struct sockaddr_ll isn't needed.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vlan: deliver packets received with VLAN acceleration to network taps · bc1d0411
      Committed by Patrick McHardy
      When VLAN header stripping is used, packets currently bypass packet
      sockets (and other network taps) completely. For locally existing
      VLANs, they appear directly on the VLAN device, for unknown VLANs
      they are silently dropped.
      
      Add a new function netif_nit_deliver() to deliver incoming packets
      to all network interface taps and use it in __vlan_hwaccel_rx() to
      make VLAN packets visible on the underlying device.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vlan: Don't store VLAN tag in cb · 6aa895b0
      Committed by Patrick McHardy
      Use a real skb member to store the VLAN tag to avoid clashes with qdiscs,
      which are allowed to use the cb area themselves. As currently only real
      devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit
      invalidation is necessary.
      
      The new member fills a hole on 64 bit; the skb layout changes from:
      
              __u32                      mark;                 /*   172     4 */
              sk_buff_data_t             transport_header;     /*   176     4 */
              sk_buff_data_t             network_header;       /*   180     4 */
              sk_buff_data_t             mac_header;           /*   184     4 */
              sk_buff_data_t             tail;                 /*   188     4 */
              /* --- cacheline 3 boundary (192 bytes) --- */
              sk_buff_data_t             end;                  /*   192     4 */
      
              /* XXX 4 bytes hole, try to pack */
      
      to
      
              __u32                      mark;                 /*   172     4 */
              __u16                      vlan_tci;             /*   176     2 */
      
              /* XXX 2 bytes hole, try to pack */
      
              sk_buff_data_t             transport_header;     /*   180     4 */
              sk_buff_data_t             network_header;       /*   184     4 */
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun: Fix/rewrite packet filtering logic · f271b2cc
      Committed by Max Krasnyansky
      Please see the following thread to get some context on this
      	http://marc.info/?l=linux-netdev&m=121564433018903&w=2
      
      Basically the issue is that the current multicast filtering stuff in
      the TUN/TAP driver is seriously broken.
      Original patch went in without proper review and ACK. It was broken and
      confusing to start with and subsequent patches broke it completely.
      To give you an idea of what's broken here are some of the issues:
      
      - Very confusing comments throughout the code that imply that the
      character device is a network interface in its own right, and that packets
      are passed between the two NICs, which is completely wrong.
      
      - Wrong set of ioctls is used for setting up filters. They look like
      shortcuts for manipulating state of the tun/tap network interface but
      in reality manipulate the state of the TX filter.
      
      - ioctls that were originally used for setting the address of the TX filter
      got "fixed" and now set the address of the network interface itself, which
      made the filter totally useless.
      
      - Filtering is done too late. Instead of filtering early on, to avoid
      unnecessary wakeups, filtering is done in the read() call.
      
      The list goes on and on :)
      
      So the patch cleans all that up. It introduces a simple and clean interface
      for setting up TX filters (TUNSETTXFILTER + tun_filter spec) and does
      filtering before enqueuing the packets.
      
      TX filtering is useful in the scenarios where TAP is part of a bridge, in
      which case it gets all broadcast, multicast and potentially other packets when
      the bridge is learning. So, for example, an Ethernet tunnelling app may want
      to set up TX filters to avoid tunnelling multicast traffic. QEMU and other
      hypervisors can push RX filtering that is currently done in the guest into
      the host context, thereby saving wakeups and unnecessary data transfer.
      Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 72d9794f
    • ssb: Include dma-mapping.h · 9c0c7a42
      Committed by Michael Buesch
      ssb.h implements DMA mapping functions, so it should
      include dma-mapping.h. This fixes compile failures on certain architectures.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Michael Buesch <mb@bu3sch.de>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
  3. 13 Jul 2008, 1 commit
  4. 11 Jul 2008, 2 commits
  5. 09 Jul 2008, 13 commits
  6. 08 Jul 2008, 4 commits