- 10 10月, 2013 16 次提交
-
-
由 Alexander Duyck 提交于
Allocate the queue pairs individually instead of as a group. This allows for much easier queue management as it is possible to dynamically resize the queues without having to free and allocate the entire block. Ease statistic collection by treating Tx/Rx queue pairs as a single unit. Each pair is allocated together and starts with a Tx queue and ends with an Rx queue. By ordering them this way it is possible to know the Rx offset based on a pointer to the Tx queue. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
This replaces the ring container array with a linked list. The idea is to make the logic much easier to deal with since this will allow us to call a simple helper function from the q_vectors to go through the entire list. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
Allocate the q_vectors individually. The advantage to this is that it allows for easier freeing and allocation. In addition it makes it so that we could do node specific allocations at some point in the future. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
This makes it so that the Tx and Rx byte and packet counts are separated from the rest of the statistics. This allows for better isolation of these stats when we move them into the 64 bit statistics. Simplify things by re-ordering how the stats display in ethtool. Instead of displaying all of the Tx queues as a block, followed by all the Rx queues, the new order is Tx[0], Rx[0], Tx[1], Rx[1], ..., Tx[n], Rx[n]. This reduces the loops and cleans up the display for testing purposes since it is very easy to verify if flow director is doing the right thing as the Tx and Rx queue pair are shown in pairs. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
Implement BQL (byte queue limit) support in i40e. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
Drop Tx flag and TXSW which is tested but never set. As a result of this change we can drop a complicated check that always resulted in the final result of i40e_tx_csum being equal to the CHECKSUM_PARTIAL value. As such we can replace the entire function call with just a check for skb->summed == CHECKSUM_PARTIAL. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
Sync the fast path for i40e_tx_map and i40e_clean_tx_irq so that they are similar to igb and ixgbe. - Only update the Tx descriptor ring in tx_map - Make skb mapping always on the first buffer in the chain - Drop the use of MAPPED_AS_PAGE Tx flag - Only store flags on the first buffer_info structure Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
Avoid directly incrementing next_to_use for multiple reasons. The main reason being that if we directly increment it then it can attain a state where it is equal to the ring count. Technically this is a state it should not be able to reach but the way this is written it now can. This patch pulls the value off into a register and then increments it and writes back either the value or 0 depending on if the value is equal to the ring count. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
- drop the mapped_as_page u8 from the Tx buffer info as it was unused - use the DMA unmap accessors for Tx DMA - replace checks of DMA with checks of the unmap length to verify if an unmap is needed - update the Tx buffer layout to make it consistent with igb, ixgbe Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Alexander Duyck 提交于
The Tx "completed" stat was part of the original rewrite for detecting Tx hangs. However some time ago in ixgbe I determined that we could just use the packets stat instead. Since then this stat was removed from ixgbe and it serves no purpose in i40e so it can be dropped. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Anjali Singhai 提交于
Link events should not print to the log until the device is administratively up. Signed-off-by: NAnjali Singhai <anjali.singhai@intel.com> Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Ajit Khaparde 提交于
Signed-off-by: NAjit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ajit Khaparde 提交于
SkyHawk-R can support RoCE. Add code to display RoCE specific counters maintained in hardware. Signed-off-by: NAjit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ajit Khaparde 提交于
Moving to version 2 of GET_STATS command as SkyHawk-R supports higher number of rings. Signed-off-by: NAjit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
Each network interface (either PF or VF) is identified by its port's MAC id. Signed-off-by: NYuval Mintz <yuvalmin@broadcom.com> Signed-off-by: NAriel Elior <ariele@broadcom.com> Signed-off-by: NEilon Greenstein <eilong@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
CONFIG_IPV6=n is still a valid choice ;) It appears we can remove dead code. Reported-by: NWu Fengguang <fengguang.wu@intel.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2013 24 次提交
-
-
由 Eric Dumazet 提交于
At this point sk might contain garbage. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP listener refactoring, part 4 : To speed up inet lookups, we moved IPv4 addresses from inet to struct sock_common Now is time to do the same for IPv6, because it permits us to have fast lookups for all kind of sockets, including upcoming SYN_RECV. Getting IPv6 addresses in TCP lookups currently requires two extra cache lines, plus a dereference (and memory stall). inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6 This patch is way bigger than its IPv4 counter part, because for IPv4, we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6, it's not doable easily. inet6_sk(sk)->daddr becomes sk->sk_v6_daddr inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr at the same offset. We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic macro. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP listener refactoring, part 3 : Our goal is to hash SYN_RECV sockets into main ehash for fast lookup, and parallel SYN processing. Current inet_ehash_bucket contains two chains, one for ESTABLISH (and friend states) sockets, another for TIME_WAIT sockets only. As the hash table is sized to get at most one socket per bucket, it makes little sense to have separate twchain, as it makes the lookup slightly more complicated, and doubles hash table memory usage. If we make sure all socket types have the lookup keys at the same offsets, we can use a generic and faster lookup. It turns out TIME_WAIT and ESTABLISHED sockets already have common lookup fields for IPv4. [ INET_TW_MATCH() is no longer needed ] I'll provide a follow-up to factorize IPv6 lookup as well, to remove INET6_TW_MATCH() This way, SYN_RECV pseudo sockets will be supported the same. A new sock_gen_put() helper is added, doing either a sock_put() or inet_twsk_put() [ and will support SYN_RECV later ]. Note this helper should only be called in real slow path, when rcu lookup found a socket that was moved to another identity (freed/reused immediately), but could eventually be used in other contexts, like sock_edemux() Before patch : dmesg | grep "TCP established" TCP established hash table entries: 524288 (order: 11, 8388608 bytes) After patch : TCP established hash table entries: 524288 (order: 10, 4194304 bytes) Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net由 David S. Miller 提交于
Conflicts: include/linux/netdevice.h net/core/sock.c Trivial merge issues. Removal of "extern" for functions declaration in netdevice.h at the same time "const" was added to an argument. Two parallel line additions in net/core/sock.c Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc由 David S. Miller 提交于
Ben Hutchings says: ==================== Some more fixes for EF10 support; hopefully the last lot: 1. Fixes for reading statistics, from Edward Cree and Jon Cooper. 2. Addition of ethtool statistics for packets dropped by the hardware before they were associated with a specific function, from Edward Cree. 3. Only bind to functions that are in control of their associated port, as the driver currently assumes this is the case. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Steinar reported FQ pacing was not working for UDP flows. It looks like the initial sk->sk_pacing_rate value of 0 was a wrong choice. We should init it to ~0U (unlimited) Then, TCA_FQ_FLOW_DEFAULT_RATE should be removed because it makes no real sense. The default rate is really unlimited, and we need to avoid a zero divide. Reported-by: NSteinar H. Gunderson <sesse@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This reverts commit 612c3373. As per Stephen Hemminger, the layout of the netlink attribute is not implemented correctly so revert this for now. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Yongjun 提交于
Add the missing destroy_workqueue() before return from qlcnic_probe() in the error handling case. Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Yongjun 提交于
This patch fix the error handling in moxart_mac_probe(): - return -ENOMEM in some memory alloc fail cases - add missing free_netdev() in the error handling case Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Marc Kleine-Budde 提交于
This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCA_FQ_INITIAL_QUANTUM should set q->initial_quantum Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Oussama Ghorbel 提交于
Unlike ipv4, the struct member hlen holds the length of the GRE and ipv6 headers. This length is also counted in dev->hard_header_len. Perhaps, it's more clean to modify the hlen to count only the GRE header without ipv6 header as the variable name suggest, but the simple way to fix this without regression risk is simply modify the calculation of the limit in ip6gre_tunnel_change_mtu function. Verified in kernel version v3.11. Signed-off-by: NOussama Ghorbel <ou.ghorbel@gmail.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gao feng 提交于
We can get classid through cgroup_subsys_state, this is directviewing and effective. Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gao feng 提交于
Since the tasks have been migrated to the cgroup, there is no need to call task_netprioidx to get task's cgroup id. Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shawn Bohrer 提交于
The since the removal of the routing cache computing fib_compute_spec_dst() does a fib_table lookup for each UDP multicast packet received. This has introduced a performance regression for some UDP workloads. This change skips populating the packet info for sockets that do not have IP_PKTINFO set. Benchmark results from a netperf UDP_RR test: Before 89789.68 transactions/s After 90587.62 transactions/s Benchmark results from a fio 1 byte UDP multicast pingpong test (Multicast one way unicast response): Before 12.63us RTT After 12.48us RTT Signed-off-by: NShawn Bohrer <sbohrer@rgmadvisors.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shawn Bohrer 提交于
The removal of the routing cache introduced a performance regression for some UDP workloads since a dst lookup must be done for each packet. This change caches the dst per socket in a similar manner to what we do for TCP by implementing early_demux. For UDP multicast we can only cache the dst if there is only one receiving socket on the host. Since caching only works when there is one receiving socket we do the multicast socket lookup using RCU. For UDP unicast we only demux sockets with an exact match in order to not break forwarding setups. Additionally since the hash chains may be long we only check the first socket to see if it is a match and not waste extra time searching the whole chain when we might not find an exact match. Benchmark results from a netperf UDP_RR test: Before 87961.22 transactions/s After 89789.68 transactions/s Benchmark results from a fio 1 byte UDP multicast pingpong test (Multicast one way unicast response): Before 12.97us RTT After 12.63us RTT Signed-off-by: NShawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shawn Bohrer 提交于
UDP sockets can receive packets from multiple endpoints and thus may be received on multiple receive queues. Since packets packets can arrive on multiple receive queues we should not mark the napi_id for all packets. This makes busy read/poll only work for connected UDP sockets. This additionally enables busy read/poll for UDP multicast packets as long as the socket is connected by moving the check into __udp_queue_rcv_skb(). Signed-off-by: NShawn Bohrer <sbohrer@rgmadvisors.com> Suggested-by: NEric Dumazet <edumazet@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
qdisc_tree_decrease_qlen() is called when some packets are dropped on a qdisc, and we want to notify parents of qlen changes. We also can increment parents qdisc qstats drop counters. This permits more accurate drop counters up to root qdisc. For example a graft operation typically resets a qdisc (drops all packets) and call qdisc_tree_decrease_qlen() Note that callers are responsible for their drop counters. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Vrabel 提交于
If a guest is destroyed without transitioning its frontend to CLOSED, the domain becomes a zombie as netback was not grant unmapping the shared rings. When removing a VIF, transition the backend to CLOSED so the VIF is disconnected if necessary (which will unmap the shared rings etc). This fixes a regression introduced by 279f438e (xen-netback: Don't destroy the netdev until the vif is shut down). Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Paul Durrant <Paul.Durrant@citrix.com> Acked-by: NWei Liu <wei.liu2@citrix.com> Reviewed-by: NPaul Durrant <paul.durrant@citrix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Amir Vadai says: ==================== net/mlx4_en: Fix pages never dma unmapped on rx This patchset fixes a bug introduced by commit 51151a16 (mlx4: allow order-0 memory allocations in RX path). Where dma_unmap_page wasn't called. Changes from V0: - Added "Rename name of mlx4_en_rx_alloc members". Old names were confusing. - Last frag in page calculation was wrong. Since all frags in page are of the same size, need to add this frag_stride to end of frag offset, and not the size of next frag in skb. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Amir Vadai 提交于
This patch fixes a bug introduced by commit 51151a16 (mlx4: allow order-0 memory allocations in RX path). dma_unmap_page never reached because condition to detect last fragment in page is wrong. offset+frag_stride can't be greater than size, need to make sure no additional frag will fit in page => compare offset + frag_stride + next_frag_size instead. next_frag_size is the same as the current one, since page is shared only with frags of the same size. CC: Eric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Amir Vadai 提交于
Add page prefix to page related members: @size and @offset into @page_size and @page_offset CC: Eric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Veaceslav Falico 提交于
Currently, in TLB mode we change mac addresses only by memcpy-ing the to net_device->dev_addr, without actually setting them via dev_set_mac_address(). This permits us to receive all the traffic always on one mac address. However, in case the interface flips, some drivers might enforce the mac filtering for its FW/HW based on current ->dev_addr, and thus we won't be able to receive traffic on that interface, in case it will be selected as active in TLB mode. Fix it by setting the mac address forcefully on every new active slave that we select in TLB mode. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Yuval Mintz <yuvalmin@broadcom.com> Reported-by: NYuval Mintz <yuvalmin@broadcom.com> Tested-by: NYuval Mintz <yuvalmin@broadcom.com> Signed-off-by: NVeaceslav Falico <vfalico@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nguyen Hong Ky 提交于
This patch will fix RX packets errors when receiving big size of data by set bit RNC = 1. RNC - Receive Enable Control 0: Upon completion of reception of one frame, the E-DMAC writes the receive status to the descriptor and clears the RR bit in EDRRR to 0. 1: Upon completion of reception of one frame, the E-DMAC writes (writes back) the receive status to the descriptor. In addition, the E-DMAC reads the next descriptor and prepares for reception of the next frame. In addition, for get more stable when receiving packets, I set maximum size for the transmit/receive FIFO and inserts padding in receive data. Signed-off-by: NNguyen Hong Ky <nh-ky@jinso.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-