- 03 Mar, 2016 40 commits
-
-
Committed by David S. Miller
Alexandre TORGUE says:

====================
stmmac: enhance driver performances and update the version

According to Giuseppe, I send the v3 series.

This is a subset of patches to rework the driver in order to improve its performance and make it more robust under stress conditions. All patches have been ported to the STi mainstream kernel branch and tested on ARM STiH4xx platforms and newer ones.

This series also updates the driver version and prepares it to include further development to support new chips.

In detail, these patches are:

  o to rework and improve the internal DMA bus settings.
    Fine tuning is mandatory on some platforms for both performance and stability issues.
  o to rework and optimize the descriptor management.
    This will help a lot on the performance side and prepares the inclusion of the GMAC4.x.
  o to add a set of optimizations for both xmit and rx functions.
    These will help a lot on the performance side and make the driver more robust in case of low memory conditions and under some stress tests, performed for example on IP-STB.

Below are some throughput figures obtained on some boxes before and after the patches.

             nuttcp (Mbps)            iperf (Mbps)
          tcp        udp           tcp        udp
        tx   rx    tx   rx       tx   rx    tx   rx
  old  680  800   480  506      760  800   600  700
  new  830  880   540  630      840  880   700  800

V2:
  - rx_copybreak is now managed by using ethtool.

V3:
  - improve comments on PCIe detailing that there are no regressions
  - rework some APIs to properly define some params as bool as expected
  - rework the formula to get the element inside the ring.
    Compared with V2, patches 4 and 13 have been merged because the same formula has been used. After this rework, no evident benefit has been noticed in terms of performance, so the table above is still valid.
    Disassembling the code for SH4 and ARM, the new formula saves just one instruction (depending on compiler flags), and this gives a not so relevant gain, for example, on SH4, where some instructions are executed in the same pipeline stage.
    Ring sizes are now fixed, and maybe they can be reworked to be tuned without using the stmmaceth= cmdline option. Indeed, nobody changes these sizes, and the numbers selected by default respect the budget and avoid passing an invalid setup. These are the best driver default sizes for ring and chain.
====================
-
Committed by Giuseppe Cavallaro
This patch just updates the driver to the version fully tested on STi platforms. This version is Oct_2015.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
A threshold is now also used to limit skb allocation when zero-copy is in use. This avoids incoherence in the ring caused by skb allocation failures under very aggressive testing and low memory conditions.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
This patch allows the driver to copy tiny frames during the reception process. This gives more stability while stressing the driver on STi embedded systems.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Fabrice Gasnier
phy_bus_name can be NULL when the "fixed-link" property isn't used. Since "stmmac: do not poll phy handler when attach a switch", the phy_bus_name pointer needs to be checked before strcmp is called.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
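The NULL check this commit describes can be sketched in userspace C as follows. This is only an illustration of guarding strcmp against a NULL pointer, not the driver's code: the helper name `is_fixed_bus` and the compared string "fixed" are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true only when a bus name is present
 * and equals "fixed". The pointer is tested first, so strcmp() is
 * never called with NULL (which would be undefined behaviour).
 */
static bool is_fixed_bus(const char *phy_bus_name)
{
    return phy_bus_name != NULL && strcmp(phy_bus_name, "fixed") == 0;
}
```

Because `&&` short-circuits, the string comparison is skipped entirely when the name is missing.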
-
Committed by Giuseppe Cavallaro
This patch avoids calling stmmac_adjust_link when the driver is connected to a switch through the FIXED_PHY support. Prior to this patch, phydev->irq was set to PHY_POLL, so the phy handler was invoked periodically, spending useless time because the link cannot actually change. Note that stmmac_adjust_link will be called just once, which guarantees that the ST glue logic is set up according to the fixed mode and speed.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
This patch fills the first descriptor just before granting the DMA engine, that is, at the end of the xmit. The patch takes care of the algorithm adopted to mitigate the interrupts, then it fixes the last segment in case of no fragments. Moreover, this new implementation does not pass any "ter" field when preparing the descriptors because this is not necessary. The patch also details the memory barrier in the xmit. As a final result, this patch guarantees the same performance while fixing a case where small datagrams are sent. In fact, this kind of test is impacted if no coalescing is done.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
The dirty index can be updated outside the loop where all the tx resources are claimed. This also helps performance. A useless debug printk has also been removed from the main loop.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Fabrice Gasnier
This patch inlines the get_tx_owner and get_ls routines. It results in a single read of tdes0, instead of three, to check the TX_OWN and LS bits and other status bits. It helps improve the driver TX path by removing two uncached read/writes inside the TX clean loop for enhanced descriptors, but not for normal ones, because des1 must be read in any case.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
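The idea of reading a descriptor word once and deriving every flag from that local copy can be sketched as below. This is a minimal userspace illustration, not the driver's implementation: the `DEMO_` bit positions, the enum, and the function name are all hypothetical and do not reflect the real tdes0 layout.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this illustration. */
#define DEMO_TDES0_OWN (1u << 31)  /* descriptor still owned by the DMA */
#define DEMO_TDES0_LS  (1u << 29)  /* last segment of the frame */
#define DEMO_TDES0_ERR (1u << 15)  /* error summary */

enum demo_tx_state {
    DEMO_TX_DMA_OWN,   /* DMA has not released the descriptor yet */
    DEMO_TX_NOT_LAST,  /* released, but not the last segment */
    DEMO_TX_ERROR,     /* last segment, completed with an error */
    DEMO_TX_DONE       /* last segment, completed without error */
};

/*
 * Read the (possibly uncached) descriptor word exactly once, then
 * test every bit of interest on the local copy.
 */
static enum demo_tx_state demo_tx_status(const volatile uint32_t *tdes0_ptr)
{
    uint32_t tdes0 = *tdes0_ptr;  /* the only memory access */

    if (tdes0 & DEMO_TDES0_OWN)
        return DEMO_TX_DMA_OWN;
    if (!(tdes0 & DEMO_TDES0_LS))
        return DEMO_TX_NOT_LAST;
    if (tdes0 & DEMO_TDES0_ERR)
        return DEMO_TX_ERROR;
    return DEMO_TX_DONE;
}
```

With three separate accessor functions, each one would dereference the descriptor again; folding them into one function makes the single read explicit.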
-
Committed by Giuseppe Cavallaro
This patch optimizes the way the TDES is managed inside the xmit function. When preparing the frame, some settings (e.g. the OWN bit) can be merged. This has been reworked to improve tx performance.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Fabrice Gasnier
The RDES0 register can be read several times while receiving a packet. This patch slightly improves RX path performance by reading rdes0 once for two operations: checking the rx owner and getting the rx status bits.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
Optimize tx_clean by avoiding a des3 read in stmmac_clean_desc3(). In ring mode, on TX, des3 seems to be used only when transmitting a jumbo frame. In case of normal descriptors, it may also be used for time stamping. Clean it in the above two cases, without reading it.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
The last_segment field is read twice from the dma descriptors in stmmac_clean(). Add last_segment to the dma data so that this flag is read from the priv structure in cache instead of from memory. This avoids reading twice from memory for each loop iteration in stmmac_clean().
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
Currently, the code pulls out the length field when unmapping a buffer directly from the descriptor. This will result in an uncached read to a dma_alloc_coherent() region. There is no need to do this, so this patch simply puts the value directly into a data structure which will hit the cache.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
This patch reworks the ring management, which is now optimized. Previously, the indexes into the ring buffer were always incremented, and the entry was accessed via a modulo to find the "real" position in the ring. This is inefficient because modulo is an expensive operation. The formula [(entry + 1) & (size - 1)] is now adopted on a ring whose size is a power of 2. As a consequence, the number of elements can no longer be set from the command line; it is fixed.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
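The formula [(entry + 1) & (size - 1)] can be sketched in plain C as follows. This is a minimal illustration, not the driver's code: the names `DEMO_RING_SIZE` and `ring_next` are hypothetical, and 512 is only an assumed power-of-two size.

```c
#include <assert.h>

/* Hypothetical ring size; the technique only requires a power of two. */
#define DEMO_RING_SIZE 512u

/*
 * Advance a ring index using a mask instead of a modulo.
 * For a power-of-two size, (size - 1) has all low bits set, so the
 * AND wraps the index back to 0 exactly where "% size" would.
 */
static inline unsigned int ring_next(unsigned int entry, unsigned int size)
{
    assert(size != 0 && (size & (size - 1)) == 0);  /* power of two only */
    return (entry + 1) & (size - 1);
}
```

For example, `ring_next(511, DEMO_RING_SIZE)` wraps to 0, matching `(511 + 1) % 512`. The mask form gives wrong results for sizes that are not powers of two, which is consistent with the commit fixing the ring size rather than leaving it configurable.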
-
Committed by Giuseppe Cavallaro
This patch completely changes the descriptor layout to improve overall performance, thanks to the single read of the descriptors in critical paths.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
This patch restructures the DMA bus settings by introducing a new platform structure used for programming the AXI Bus Mode Register inside the DMA module. This structure can be populated from the device tree as documented in the binding txt file. After initializing the DMA, the AXI register can optionally be tuned by platform drivers.

This patch also reworks some parameters to make the DMA configuration coherent now that the AXI register is introduced. For example, burst_len is managed through the AXI support mentioned above, so the snps,burst-len parameter has been removed. It makes sense to provide the AAL parameter from DT to set Address-Aligned Beats inside Register0 and to review the PBL settings when initializing the engine.

For the PCI glue, looking back at the history of this setting, it was added to align a configuration, not to fix a known problem. No issue has been raised after this patch. It is safe to use the default burst length instead of tuning it to the maximum value.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Giuseppe Cavallaro
This patch shares the same reset procedure between the dwmac100 and dwmac1000 chips. This will also help in enhancing the driver and supporting new chips.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by David S. Miller
Santosh Shilimkar says:

====================
RDS: Major clean-up with couple of new features for 4.6

v3: Re-generated the same series by omitting the "-D" option from the git format-patch command. Since the first patch has file removals, git apply/am can't deal with it when formatted with the '-D' option.

v2: Dropped the module parameter from [PATCH 11/13] as suggested by David Miller.

The series is generated against net-next but also applies cleanly against Linus's tip. The entire patchset is available at the git tree below:

git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux.git for_4.6/net-next/rds_v2

The diff-stat looks a bit scary since almost 4K lines of code are being removed.

Brief summary of the series:

  - Drop the stale iWARP support: the RDS iWarp support code has become stale and non-testable for some time. As discussed and agreed earlier on the list, I am dropping its support for good. If new iWarp user(s) show up in the future, the plan is to adapt the existing IB RDMA with a special sink case.
  - RDS gets SO_TIMESTAMP support
  - Long due RDS maintainer entry gets updated
  - Some RDS IB code refactoring towards new FastReg Memory registration (FRMR)
  - Lastly the initial support for FRMR

RDS IB RDMA performance with FRMR is not yet as good as with FMR, and I do have some patches in progress to address that. But they are not ready for 4.6, so I left them out of this series.

I am also keeping an eye on new CQ API adaptations like the other ULPs are doing and will try to adapt RDS for the same, most likely in the 4.7+ timeframe.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Avinash Repaka
Fastreg MR (FRMR) is another method with which one can register memory to the HCA. Some of the newer HCAs support only fastreg MR mode, so we need to add support for it to have RDS functional on them.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fastreg MR (FRMR) memory registration and invalidation make use of work request and completion queues for their operation. This patch allocates extra queue space for these operation(s).
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Discover Fast Memory Registration support using the IB device capability IB_DEVICE_MEM_MGT_EXTENSIONS. A given HCA might support just FRMR, just FMR, or both FMR and FRWR. If both MR types are supported, FMR is used by default.

The default MR is still kept as FMR, contrary to what everyone else is following. The default will be changed to FRMR once the RDS performance with FRMR is comparable with FMR. That work is in progress.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Add MR reuse statistics to RDS IB transport.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Drop the RDS connection on RDMA_CM_EVENT_TIMEWAIT_EXIT so that it can reconnect and resume. While testing fastreg, this error happened in a couple of tests but was going unnoticed.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Preparatory patch for FRMR support. From the connection info, we can retrieve the cm_id, which contains the QP handle needed for work request posting. We also need to drop the RDS connection on QP error states, where the connection handle becomes useful.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
No functional change.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Keep the FMR-related fields in their own struct. The fastreg MR structure will be added to the union.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
No functional changes. This is in preparation for adding fastreg memory registration support.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
This helps to combine the asynchronous fastreg MR completion handler with the send completion handler. No functional change.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Acked-by: Chien Yen <chien.yen@oracle.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
SO_TIMESTAMP generates a time stamp for each incoming RDS message. A user application can enable it with the SO_TIMESTAMP setsockopt() at the SOL_SOCKET level. The CMSG data of cmsg type SO_TIMESTAMP contains the time stamp in struct timeval format.
Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
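From the application side, enabling SO_TIMESTAMP and reading the resulting control message might look like the sketch below. It uses only the generic sockets API (setsockopt() and the CMSG_* macros); the helper names are hypothetical, and the socket `fd` is assumed to have been created and bound elsewhere.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Ask the kernel to attach a receive time stamp to each incoming message. */
static int enable_rx_timestamp(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &on, sizeof(on));
}

/*
 * Scan the control messages filled in by recvmsg() and copy out the
 * time stamp if one is present. Returns false when none is found.
 */
static bool find_rx_timestamp(struct msghdr *msg, struct timeval *tv)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SO_TIMESTAMP) {
            /* CMSG_DATA may be unaligned for the target type, so copy. */
            memcpy(tv, CMSG_DATA(cmsg), sizeof(*tv));
            return true;
        }
    }
    return false;
}
```

A caller would invoke `enable_rx_timestamp(fd)` once, pass a control buffer of at least `CMSG_SPACE(sizeof(struct timeval))` bytes in `msg_control` to recvmsg(), and then call `find_rx_timestamp()` on the returned `struct msghdr`.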
-
The RDS iWarp support code has become stale and non-testable. As indicated earlier, I am dropping the support for it. If new iWarp user(s) show up in the future, we can adapt the RDS IB transport for the special RDMA READ sink case. iWarp needs an MR for the RDMA READ sink.
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by David S. Miller
Yuval Mintz says:

====================
qed: update series

This patch series tries to improve general configuration by changing the configuration to better suit B0 boards and allow more available resources to each physical function. In addition, it contains some small fixes and semantic changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yuval Mintz
Remove 2 unused fields from driver code.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yuval Mintz
In case of problems when initializing the chip, the error flows aren't handled properly. Specifically, it's possible that the chip would be left in a configuration allowing it [internally] to access the host memory, causing fatal problems in the device that would require a power cycle to overcome.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yuval Mintz
Current statistics logic is meant for L2, not for all future protocols. Move this content to the proper designated file.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yuval Mintz
BB_A0 is a development model that will not reach actual clients. In fact, future firmware would simply fail to initialize such a chip. This changes the configuration to B0 instead of A0, and adds a safeguard against the slim chance that someone would actually try this with an A0 adapter, in which case probe would gracefully fail.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ram Amrani
The driver learns the inner BAR size from a register configured by the management firmware, but older firmware versions do not set this register. Since we know which values were configured back then, use them instead.
Signed-off-by: Ram Amrani <Ram.Amrani@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Arnd Bergmann
The driver calls cfg80211_get_station, which may be part of a module, so we must not enable BATMAN_ADV_BATMAN_V if BATMAN_ADV=y and CFG80211=m:

net/built-in.o: In function `batadv_v_elp_get_throughput':
(text+0x5c62c): undefined reference to `cfg80211_get_station'

This clarifies the dependency to cover all combinations.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: c833484e ("batman-adv: ELP - compute the metric based on the estimated throughput")
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Amitoj Kaur Chawla
Use managed resource functions devm_kzalloc and pcim_enable_device to simplify error handling. Subsequently, remove unnecessary kfree, pci_disable_device and pci_release_regions.

To be compatible with the change, various gotos are replaced with direct returns and unneeded labels are dropped.

Also, `sc` was only being freed in the probe function and not the remove function before the change. By using devm_kzalloc this patch also fixes this memory leak.
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-