- 06 2月, 2009 1 次提交
-
-
由 Herbert Xu 提交于
Unlike a normal socket path, the tuntap device send path does not have any accounting. This means that the user-space sender may be able to pin down arbitrary amounts of kernel memory by continuing to send data to an end-point that is congested. Even when this isn't an issue because of limited queueing at most end points, this can also be a problem because its only response to congestion is packet loss. That is, when those local queues at the end-point fills up, the tuntap device will start wasting system time because it will continue to send data there which simply gets dropped straight away. Of course one could argue that everybody should do congestion control end-to-end, unfortunately there are people in this world still hooked on UDP, and they don't appear to be going away anywhere fast. In fact, we've always helped them by performing accounting in our UDP code, the sole purpose of which is to provide congestion feedback other than through packet loss. This patch attempts to apply the same bandaid to the tuntap device. It creates a pseudo-socket object which is used to account our packets just as a normal socket does for UDP. Of course things are a little complex because we're actually reinjecting traffic back into the stack rather than out of the stack. The stack complexities however should have been resolved by preceding patches. So this one can simply start using skb_set_owner_w. For now the accounting is essentially disabled by default for backwards compatibility. In particular, we set the cap to INT_MAX. This is so that existing applications don't get confused by the sudden arrival EAGAIN errors. In future we may wish (or be forced to) do this by default. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2009 15 次提交
-
-
由 Herbert Xu 提交于
The function sock_alloc_send_pskb is completely useless if not exported since most of the code in it won't be used as is. In fact, this code has already been duplicated in the tun driver. Now that we need accounting in the tun driver, we can in fact use this function as is. So this patch marks it for export again. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
As it currently stands, skb destructors are forbidden on the receive path because the protocol end-points will overwrite any existing destructor with their own. This is the reason why we have to call skb_orphan in the loopback driver before we reinject the packet back into the stack, thus creating a period during which loopback traffic isn't charged to any socket. With virtualisation, we have a similar problem in that traffic is reinjected into the stack without being associated with any socket entity, thus providing no natural congestion push-back for those poor folks still stuck with UDP. Now had we been consistent in telling them that UDP simply has no congestion feedback, I could just fob them off. Unfortunately, we appear to have gone to some length in catering for this on the standard UDP path, with skb/socket accounting so that has created a very unhealthy dependency. Alas habits are difficult to break out of, so we may just have to allow skb destructors on the receive path. It turns out that making skb destructors useable on the receive path isn't as easy as it seems. For instance, simply adding skb_orphan to skb_set_owner_r isn't enough. This is because we assume all over the IP stack that skb->sk is an IP socket if present. The new transparent proxy code goes one step further and assumes that skb->sk is the receiving socket if present. Now all of this can be dealt with by adding simple checks such as only treating skb->sk as an IP socket if skb->sk->sk_family matches. However, it turns out that for bridging at least we don't need to do all of this work. This is of interest because most virtualisation setups use bridging so we don't actually go through the IP stack on the host (with the exception of our old nemesis the bridge netfilter, but that's easily taken care of). So this patch simply adds skb_orphan to the point just before we enter the IP stack, but after we've gone through the bridge on the receive path. It also adds an skb_orphan to the one place in netfilter that touches skb->sk/skb->destructor, that is, tproxy. One word of caution, because of the internal code structure, anyone wishing to deploy this must use skb_set_owner_w as opposed to skb_set_owner_r since many functions that create a new skb from an existing one will invoke skb_set_owner_w on the new skb. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
-
由 David S. Miller 提交于
-
由 Andy Fleming 提交于
Stashing is only supported on the 85xx (e500-based) SoCs. The 83xx and 86xx chips don't have a proper cache for this. U-Boot has been updated to add stashing properties to the device tree nodes of gianfar devices on 85xx. So now we modify Linux to keep stashing off unless those properties are there. Signed-off-by: NAndy Fleming <afleming@freescale.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Fleming 提交于
Signed-off-by: NAndy Fleming <afleming@freescale.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Fleming 提交于
The MDIO bus drivers for the UCC and gianfar ethernet controllers are essentially the same. There's no reason to duplicate that much code. Signed-off-by: NAndy Fleming <afleming@freescale.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Fleming 提交于
SOFT_RESET must be asserted for at least 3 TX clocks in order for it to work properly. The syncs in the gfar_write() commands have been hiding this, but we need to guarantee it. Signed-off-by: NAndy Fleming <afleming@freescale.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Fleming 提交于
BD_LENGTH_MASK is supposed to catch the low 16-bits of the status field, not the low byte. The old way, we would never be able to clean up tx packets with sizes divisible by 256. Signed-off-by: NAndy Fleming <afleming@freescale.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
Many physical NICs let the OS re-program the "hardware" MAC address. Virtual NICs should allow this too. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
VLAN filtering allows the hypervisor to drop packets from VLANs that we're not a part of, further reducing the number of extraneous packets recieved. This makes use of the VLAN virtqueue command class. The CTRL_VLAN feature bit tells us whether the backend supports VLAN filtering. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
Make use of the MAC control virtqueue class to support a MAC filter table. The filter table is managed by the hypervisor. We consider the table to be available if the CTRL_RX feature bit is set. We leave it to the hypervisor to manage the table and enable promiscuous or all-multi mode as necessary depending on the resources available to it. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
Make use of the RX_MODE control virtqueue class to enable the set_rx_mode netdev interface. This allows us to selectively enable/disable promiscuous and allmulti mode so we don't see packets we don't want. For now, we automatically enable these as needed if additional unicast or multicast addresses are requested. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
This will be used for RX mode, MAC filter table, VLAN filtering, etc... The control transaction consists of one or more "out" sg entries and one or more "in" sg entries. The first out entry contains a header defining the class and command. Additional out entries may provide data for the command. The last in entry provides a status response back from the command. Virtqueues typically run asynchronous, running a callback function when there's data in the channel. We can't readily make use of this in the command paths where we need to use this. Instead, we kick the virtqueue and spin. The kick causes an I/O write, triggering an immediate trap into the hypervisor. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Divy Le Ray 提交于
The LRO switch is always set to 1 in the rx processing loop. It breaks the accelerated iSCSI receive traffic. Fix its computation. Signed-off-by: NDivy Le Ray <divy@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2009 24 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (40 commits) Blackfin arch: Remove outdated code Blackfin arch: Fix udelay implementation Blackfin arch: Update Copyright information Blackfin arch: Add BF561 PPI POLS, POLC Masks Blackfin arch: Update CM-BF527 kernel config Blackfin arch: define bfin_memmap as static since it is only used here Blackfin arch: cplb mananger: use a do...while loop rather than a for loop Blackfin arch: fix bug - traps test case 19 for exception 0x2d fails Blackfin arch: add platform device bfin_mii-bus and KSZ8893M switch driver platform resources to board files Blackfin arch: build jtag tty driver as a module by default Blackfin arch: fix 2 bugs related to debug Blackfin arch: Add ANOMALY_05000380 to BF54x to kill the compile warning Blackfin arch: Fix bug - 561 SMP kernel can't boot from jffs2 Blackfin arch: base SIC_IWR# programming on whether the MMR exists Blackfin arch: read SYSCR on newer parts that mirror the bits of SWRST in it Blackfin arch: fixup board init function name Blackfin arch: drop CONFIG_I2C_BOARDINFO ifdefs Blackfin arch: bfin_reset->_bfin_reset redirection no longer needed Blackfin arch: sync reboot handler with version in u-boot Blackfin arch: Faster Implementation of csum_tcpudp_nofold() ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Kill bogus TPC/address truncation during 32-bit faults. sparc: fixup for sparseirq changes sparc64: Validate kernel generated fault addresses on sparc64. sparc64: On non-Niagara, need to touch NMI watchdog in NOHZ mode. sparc64: Implement NMI watchdog on capable cpus. sparc: Probe PMU type and record in sparc_pmu_type. sparc64: Move generic PCR support code to seperate file.
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: sunrpc: fix rdma dependencies e1000: Fix PCI enable to honor the need_ioport flag sgi-xp: link XPNET's net_device_ops to its net_device structure pcnet_cs: Fix misuse of the equality operator. hso: add new device id's dca: redesign locks to fix deadlocks cassini/sungem: limit reaches -1, but 0 tested net: variables reach -1, but 0 tested qlge: bugfix: Add missing netif_napi_del call. qlge: bugfix: Add flash offset for second port. qlge: bugfix: Fix endian issue when reading flash. udp: increments sk_drops in __udp_queue_rcv_skb() net: Fix userland breakage wrt. linux/if_tunnel.h net: packet socket packet_lookup_frame fix
-
git://git.o-hand.com/linux-mfd由 Linus Torvalds 提交于
* 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: Remove non exported references from pcf50633
-
由 Michael Hennerich 提交于
The removed version with the loop registers saved on the stack was originally intended to workaround the missing toolchain support for LoopReg Clobbers. Since our toolchain now supports these there is no point in keeping this workaround. And since we don't touch LoopRegs anymore we're no longer subject for ANOMALY_05000312. Signed-off-by: NMichael Hennerich <michael.hennerich@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Michael Hennerich 提交于
Avoid possible overflow during 32*32->32 multiplies. Reported-by: NMarco Reppenhagen <marco.reppenhagen@auerswald.de> Signed-off-by: NMichael Hennerich <michael.hennerich@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Michael Hennerich 提交于
Signed-off-by: NMichael Hennerich <michael.hennerich@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Michael Hennerich 提交于
Signed-off-by: NMichael Hennerich <michael.hennerich@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Michael Hennerich 提交于
Signed-off-by: NMichael Hennerich <michael.hennerich@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
use a do...while loop rather than a for loop to get slightly better optimization and to avoid gcc "may be used uninitialized" warnings ... we know that the [id]cplb_nr_bounds variables will never be 0, so this is OK Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Bernd Schmidt 提交于
Enable null pointer checking for ICPLBs. The code was there but for some reason I had commented it out at some stage during development. Should restrict this to 1K since atomic ops start there. Signed-off-by: NBernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Graf Yang 提交于
Blackfin arch: add platform device bfin_mii-bus and KSZ8893M switch driver platform resources to board files Signed-off-by: NGraf Yang <graf.yang@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Jie Zhang 提交于
- unable to single step over emuexcpt instruction - gdbproxy goes into infinite loop when doing gdb does "next" over "emuexcpt" Don't decrement PC after software breakpoint. Signed-off-by: NJie Zhang <jie.zhang@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Bryan Wu 提交于
Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Graf Yang 提交于
bss_l2 section is garbage when the data in this section is used by _bfin_relocate_l1_mem, so move the zero out function ahead. Signed-off-by: NGraf Yang <graf.yang@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
base SIC_IWR# programming on whether the MMR exists rather than having to maintain another list of processors Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
Drop CONFIG_I2C_BOARDINFO ifdefs as the common i2c header handles this already by stubbing things out Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Mike Frysinger 提交于
Signed-off-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-
由 Michael Hennerich 提交于
Avoid conditional branch instructions during carry bit additions. Special thanks to Bernd. Simplify: Use ((len + proto) << 8) like every other __LITTLE_ENDIAN__ machine Cc: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: NMichael Hennerich <michael.hennerich@analog.com> Signed-off-by: NBryan Wu <cooloney@kernel.org>
-