- 10 Dec, 2011 1 commit
-
Committed by Eric Dumazet
Now that RED uses a Q0.32 number to store max_p (max probability), allow RED/GRED/CHOKE to use and report full resolution at config/dump time.

Old tc binaries are not aware of the new attributes, and still set/get Plog. New tc binaries set/get both Plog and max_p for backward compatibility; they display the "probability" value if they get max_p from new kernels.

# tc -d qdisc show dev ...
...
qdisc red 10: parent 1:1 limit 360Kb min 30Kb max 90Kb ecn ewma 5 probability 0.09 Scell_log 15

Make sure we avoid potential divides by 0 in reciprocal_value() if (max_th - min_th) is big.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
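To illustrate why a Q0.32 max_p gives more resolution than Plog, here is a minimal Python sketch of the two encodings. The function names are hypothetical and this is not the kernel or tc code; it only assumes that Plog encodes max_p = 1 / 2^Plog and that Q0.32 stores the probability as an unsigned 32-bit fraction.

```python
Q32_ONE = 1 << 32  # 1.0 would be 2**32, one past the largest u32 value


def prob_to_q32(p: float) -> int:
    """Encode a probability in [0, 1] as an unsigned Q0.32 value (hypothetical helper)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    # 1.0 is not representable in Q0.32, so clamp it to 0xFFFFFFFF.
    return min(int(p * Q32_ONE), Q32_ONE - 1)


def q32_to_prob(value: int) -> float:
    """Decode an unsigned Q0.32 value back to a float probability."""
    return value / Q32_ONE


def plog_to_q32(plog: int) -> int:
    """Legacy encoding: max_p = 1 / 2**plog, expressed as Q0.32."""
    if plog < 0:
        raise ValueError("plog must be non-negative")
    return min(Q32_ONE >> plog, Q32_ONE - 1)


if __name__ == "__main__":
    # Plog can only express negative powers of two: 0.125, 0.0625, ...
    print(q32_to_prob(plog_to_q32(3)), q32_to_prob(plog_to_q32(4)))
    # Q0.32 can represent a value such as 0.09 to within 2**-32.
    print(q32_to_prob(prob_to_q32(0.09)))
```

With Plog alone, a requested probability of 0.09 would have to be rounded to a neighbouring power of two, which is the loss of resolution the commit removes.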
-
- 09 Dec, 2011 6 commits
-
Committed by Eric Dumazet
Adaptive RED AQM for Linux, based on the paper from Sally Floyd, Ramakrishna Gummadi, and Scott Shenker, August 2001: http://icir.org/floyd/papers/adaptiveRed.pdf

The goal of Adaptive RED is to make max_p a dynamic value between 1% and 50% to reach the target average queue: min_th + (max_th - min_th) / 2

Every 500 ms:
 if (avg > target and max_p <= 0.5)
  increase max_p : max_p += alpha;
 else if (avg < target and max_p >= 0.01)
  decrease max_p : max_p *= beta;

target : [min_th + 0.4*(max_th - min_th), min_th + 0.6*(max_th - min_th)]
alpha : min(0.01, max_p / 4)
beta : 0.9
max_P is a Q0.32 fixed point number (unsigned, with 32 bits mantissa)

Changes against our RED implementation are:

max_p is no longer a negative power of two (1/(2^Plog)), but a Q0.32 fixed point number, to allow the full range described in the Adaptive RED paper.

To deliver a random number, we now use a reciprocal divide (that's really a multiply), but this operation is done once per marked/dropped packet when in the RED_BETWEEN_TRESH window, so the added cost (compared to the previous AND operation) is near zero.

The dump operation gives the current max_p value in a new TCA_RED_MAX_P attribute.

Example on a 10Mbit link:

tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec red \
 limit 400000 min 30000 max 90000 avpkt 1000 \
 burst 55 ecn adaptative bandwidth 10Mbit

# tc -s -d qdisc show dev eth3
...
qdisc red 10: parent 1:1 limit 400000b min 30000b max 90000b ecn adaptative ewma 5 max_p=0.113335 Scell_log 15
 Sent 50414282 bytes 34504 pkt (dropped 35, overlimits 1392 requeues 0)
 rate 9749Kbit 831pps backlog 72056b 16p requeues 0
 marked 1357 early 35 pdrop 0 other 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
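The periodic max_p adjustment described in this commit can be sketched in a few lines of Python. This is an illustration using floats rather than the kernel's Q0.32 arithmetic, the function name is hypothetical, and it assumes the target band is 40% to 60% of the way from min_th to max_th, as in the Adaptive RED paper.

```python
def adapt_max_p(avg: float, min_th: float, max_th: float, max_p: float) -> float:
    """Return the new max_p after one adaptation interval (sketch, not kernel code)."""
    delta = max_th - min_th
    target_min = min_th + 0.4 * delta
    target_max = min_th + 0.6 * delta

    if avg > target_max and max_p <= 0.5:
        # Queue is too long: mark/drop more aggressively (additive increase).
        alpha = min(0.01, max_p / 4)
        return max_p + alpha
    if avg < target_min and max_p >= 0.01:
        # Queue is too short: back off (multiplicative decrease, beta = 0.9).
        return max_p * 0.9
    # Average queue is inside the target band: leave max_p alone.
    return max_p


if __name__ == "__main__":
    # Thresholds from the tc example: min 30000, max 90000 bytes.
    print(adapt_max_p(avg=80000, min_th=30000, max_th=90000, max_p=0.10))
    print(adapt_max_p(avg=40000, min_th=30000, max_th=90000, max_p=0.10))
    print(adapt_max_p(avg=60000, min_th=30000, max_th=90000, max_p=0.10))
```

Calling this once per interval (500 ms in the commit message) moves max_p up additively and down multiplicatively, so it settles where the average queue sits inside the target band.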
-
Committed by Jiri Pirko
Introduce helper functions to copy vlan ids from one driver's list to another.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
This makes it possible to keep track of the vids that need to be in the rx vlan filters of devices even if they are used in bond/team etc.

vlan_info, like vlan_group before it, is allocated when the first vid is added and deallocated when the last vid is deleted.

The vlan_group definition is moved to a private header.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
This patch adds wrappers for the ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid functions. The check for the NETIF_F_HW_VLAN_FILTER feature is done in these wrappers.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
Let the caller know the result of adding/removing a vlan id to/from the vlan filter.

In some drivers I make those functions just return 0. But in those where it is possible to see whether the hw setup went correctly, the return value is set appropriately.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
As this structure is priv, name it appropriately. Also use the name "vlan" for pointers to it.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Dec, 2011 4 commits
-
Committed by Pavel Emelyanov
This one coincides with the sock_diag_req in the beginning and contains only the used fields from its previous analogue. The existing code is patched to use the _compat version of it for now.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Pavel Emelyanov
When receiving the SOCK_DIAG_BY_FAMILY message we have to find the handler for the provided family and pass the nl message to it. This patch describes an infrastructure to work with such handlers and implements stubs for the AF_INET(6) ones.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Pavel Emelyanov
This type will run the family+protocol based socket dumping. Also prepare the stub function for it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Pavel Emelyanov
The ultimate goal is to get a sock_diag module that works in family+protocol terms. Currently this is best done on top of inet_diag, so rename parts of the code. It will be moved to sock_diag.c later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Dec, 2011 3 commits
-
Add EtherType 0x88b5. This EtherType value is available for public use for prototype and vendor-specific protocol development, as defined in Amendment 802a to IEEE Std 802.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Li Zefan
Though not all events have the field 'prev_pid', it used to be possible to do this:

# echo 'prev_pid == 100' > events/sched/filter

but commit 75b8e982 (tracing/filter: Swap entire filter of events) broke it without any reason.

Link: http://lkml.kernel.org/r/4EAF46CF.8040408@cn.fujitsu.com
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Andreas Herrmann
I've received complaints that the numa_node attribute for family 15h model 00-0fh (e.g. Interlagos) northbridge functions shows -1 instead of the proper node ID.

Correct this with the attached quirks (similar to quirks for other AMD CPU families used in multi-socket systems).

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Frank Arnold <frank.arnold@amd.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/20111202072143.GA31916@alberich.amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 05 Dec, 2011 2 commits
-
Committed by Peter Zijlstra
When you do:

$ perf record -e cycles,cycles,cycles noploop 10

You expect about 10,000 samples for each event, i.e., 10s at 1000 samples/sec. However, this is not what's happening. You get far fewer samples, maybe 3700 samples/event:

$ perf report -D | tail -15
Aggregated stats:
 TOTAL events: 10998
 MMAP events: 66
 COMM events: 2
 SAMPLE events: 10930
cycles stats:
 TOTAL events: 3644
 SAMPLE events: 3644
cycles stats:
 TOTAL events: 3642
 SAMPLE events: 3642
cycles stats:
 TOTAL events: 3644
 SAMPLE events: 3644

On an Intel Nehalem or even AMD64, there are 4 counters capable of measuring cycles, so there is plenty of space to measure those events without multiplexing (even with the NMI watchdog active). And even with multiplexing, we'd expect roughly the same number of samples per event.

The root of the problem was that when the event that caused the buffer to become full was not the first event passed on the cmdline, the user notification would get lost. The notification was sent to the file descriptor of the overflowed event but the perf tool was not polling on it. The perf tool aggregates all samples into a single buffer, i.e., the buffer of the first event. Consequently, it assumes notifications for any event will come via that descriptor.

The seemingly straightforward solution of moving the waitq into the ringbuffer object doesn't work because of lifetime issues. One could call perf_event_set_output() on an fd that is also being blocked on and cause the old rb object to be freed while its waitq would still be referenced by the blocked thread -> FAIL.

Therefore link all events to the ringbuffer and broadcast the wakeup from the ringbuffer object to all possible events that could be waited upon. This is rather ugly, and we're open to better solutions, but it works for now.

Reported-by: Stephane Eranian <eranian@google.com>
Finished-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111126014731.GA7030@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Eric Dumazet
We discovered that the TCP stack could retransmit misaligned skbs if a malicious peer acknowledged a sub-MSS frame. This currently can happen only if the output interface is not SG enabled: if SG is enabled, tcp builds headless skbs (all payload is included in fragments), so the tcp trimming process only removes parts of skb fragments and the header stays aligned.

Some arches can't handle misalignments, so force a head reallocation and shrink headroom to MAX_TCP_HEADER.

Don't care about misalignments on x86 and PPC (or other arches setting NET_IP_ALIGN to 0).

This patch introduces __pskb_copy(), which can specify the headroom of the new head, and pskb_copy() becomes a wrapper on top of __pskb_copy().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Dec, 2011 4 commits
-
Committed by Jesse Gross
Open vSwitch is a multilayer Ethernet switch targeted at virtualized environments. In addition to supporting a variety of features expected in a traditional hardware switch, it enables fine-grained programmatic extension and flow-based control of the network. This control is useful in a wide variety of applications but is particularly important in multi-server virtualization deployments, which are often characterized by highly dynamic endpoints and the need to maintain logical abstractions for multiple tenants.

The Open vSwitch datapath provides an in-kernel fast path for packet forwarding. It is complemented by a userspace daemon, ovs-vswitchd, which is able to accept configuration from a variety of sources and translate it into packet processing rules.

See http://openvswitch.org for more information and userspace utilities.

Signed-off-by: Jesse Gross <jesse@nicira.com>
-
Committed by Pravin B Shelar
Open vSwitch needs this function for vlan handling.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
-
Committed by Jesse Gross
This adds rcu_dereference_genl and genl_dereference, which are genl variants of the RTNL functions to enforce proper locking with lockdep and sparse.

Signed-off-by: Jesse Gross <jesse@nicira.com>
-
Committed by Pravin B Shelar
Open vSwitch uses genl_mutex locking to protect datapath data structures such as the flow table and flow actions. The following patch adds lockdep_genl_is_held(), which is used for rcu annotation to prove locking.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
-
- 02 Dec, 2011 2 commits
-
Committed by David S. Miller
The return value isn't used. Suggested by Ben Hutchings.

Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Wolfgang Grandegger
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 01 Dec, 2011 3 commits
-
Committed by Hagen Paul Pfeifer
Currently netem is not able to emulate channel bandwidth. Only static delay (and optional random jitter) can be configured.

To emulate the channel rate the token bucket filter (sch_tbf) can be used. But TBF has some major emulation flaws. The buffer (token bucket depth/rate) cannot be 0. Also, the idea behind TBF is that the credit (tokens in the bucket) fills up if no packet is transmitted, so that there is always a "positive" credit for new packets. In real life this behavior contradicts the law of nature that nothing can travel faster than the speed of light. E.g.: on an emulated 1000 byte/s link a small IPv4/TCP SYN packet of ~50 bytes requires ~0.05 seconds, not 0 seconds.

Netem is an excellent place to implement a rate limiting feature: static delay is already implemented, tfifo already has time information and the user can skip TBF configuration completely.

This patch implements a rate feature which can be configured via tc, e.g.:

 tc qdisc add dev eth0 root netem rate 10kbit

To emulate a link of 5000 byte/s and add an additional static delay of 10ms:

 tc qdisc add dev eth0 root netem delay 10ms rate 5KBps

Note: similar to TBF, the rate extension is bound to the kernel timing system. Depending on the architecture's timer granularity, higher rates (e.g. 10mbit/s and higher) tend toward transmission bursts. Also note: there are further queues living in network adaptors; see ethtool(8).

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
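The arithmetic behind the rate feature (a packet occupies the link for size / rate seconds, and later packets wait for earlier ones) can be sketched as follows. This is an illustrative Python model with hypothetical function names, not the netem implementation; it assumes packets are served in arrival order and that the static delay is simply added on top.

```python
def transmit_delay(packet_bytes: int, rate_bytes_per_s: float) -> float:
    """Seconds a packet occupies a link of the given rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return packet_bytes / rate_bytes_per_s


def departure_times(arrivals, rate_bytes_per_s, fixed_delay=0.0):
    """Compute when each packet leaves the emulated link.

    arrivals: list of (arrival_time_s, size_bytes), sorted by arrival time.
    """
    link_free_at = 0.0
    departures = []
    for arrival, size in arrivals:
        # A packet cannot start before it arrives or before the link is free.
        start = max(arrival, link_free_at)
        link_free_at = start + transmit_delay(size, rate_bytes_per_s)
        departures.append(link_free_at + fixed_delay)
    return departures


if __name__ == "__main__":
    # The ~50 byte SYN on a 1000 byte/s link from the commit message.
    print(transmit_delay(50, 1000))
    # Two back-to-back packets: the second waits for the first.
    print(departure_times([(0.0, 50), (0.0, 50)], 1000))
    # 5000 byte/s link with 10 ms static delay, as in the second tc example.
    print(departure_times([(0.0, 1000)], 5000, fixed_delay=0.010))
```

Unlike a token bucket, this model never lets a packet leave in zero time, which is the emulation flaw the commit message attributes to TBF.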
-
Committed by David Miller
If the neigh entry has device private state, it will need constructor/destructor ops.

Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by David Miller
netdev->neigh_priv_len records the private area length.

This will trigger for neigh_table objects which set tbl->entry_size to zero, and the first instances of this will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 30 Nov, 2011 4 commits
-
Committed by Tom Herbert
Networking stack support for byte queue limits, using the dynamic queue limits library. Byte queue limits are maintained per transmit queue, and a dql structure has been added to the netdev_queue structure for this purpose.

Configuration of bql is in the tx-<n> sysfs directory for the queue, under the byte_queue_limits directory. Configuration includes:
 limit_min, bql minimum limit
 limit_max, bql maximum limit
 hold_time, bql slack hold time

Also under the directory are:
 limit, current byte limit
 inflight, current number of bytes on the queue

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
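As an illustration of the sysfs layout described above, the following Python sketch reads the byte_queue_limits files for one transmit queue. The helper names are hypothetical, and the /sys/class/net/<dev>/queues prefix is an assumption about where the tx-<n> directory lives; the commit message itself only names the tx-<n> and byte_queue_limits directories and the five files.

```python
from pathlib import Path

# File names listed in the commit message.
BQL_FILES = ("limit_min", "limit_max", "hold_time", "limit", "inflight")


def bql_dir(dev: str, queue: int, sysfs_root: str = "/sys/class/net") -> Path:
    """Directory holding the BQL files for one tx queue (assumed layout)."""
    return Path(sysfs_root) / dev / "queues" / f"tx-{queue}" / "byte_queue_limits"


def read_bql(dev: str, queue: int, sysfs_root: str = "/sys/class/net") -> dict:
    """Return {file name: integer value} for the BQL files that exist."""
    values = {}
    directory = bql_dir(dev, queue, sysfs_root)
    for name in BQL_FILES:
        path = directory / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values


if __name__ == "__main__":
    # Prints an empty dict on systems without this device or without BQL.
    print(read_bql("eth0", 0))
```

Passing a different sysfs_root makes the helper easy to exercise against a temporary directory instead of a live system.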
-
Committed by Tom Herbert
Add interfaces for drivers to call for recording the number of packets and bytes at send time and at transmit completion. Also add a function to "reset" a queue. These will be used by Byte Queue Limits.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
Create separate queue state flags so that either the stack or drivers can turn on XOFF. Add a set of functions used in the stack to determine if a queue is really stopped (either by the stack or by the driver).

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
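The idea of separate XOFF flags can be pictured with a small Python sketch: the driver and the stack each own one bit, and the queue counts as stopped if either bit is set. The flag and function names below are hypothetical stand-ins, not the kernel's identifiers.

```python
from enum import IntFlag


class QueueState(IntFlag):
    """Hypothetical per-queue state bits, one owner per bit."""
    DRV_XOFF = 1    # set and cleared by the driver
    STACK_XOFF = 2  # set and cleared by the stack (e.g. a queue limit)


ANY_XOFF = QueueState.DRV_XOFF | QueueState.STACK_XOFF


def queue_stopped(state: QueueState) -> bool:
    """A queue is really stopped if the stack or the driver stopped it."""
    return bool(state & ANY_XOFF)


def driver_stopped(state: QueueState) -> bool:
    """True only if the driver itself turned on XOFF."""
    return bool(state & QueueState.DRV_XOFF)


if __name__ == "__main__":
    state = QueueState(0)
    state |= QueueState.STACK_XOFF          # stack pauses the queue
    print(queue_stopped(state), driver_stopped(state))
    state &= ~QueueState.STACK_XOFF         # stack resumes; driver bit untouched
    print(queue_stopped(state))
```

Keeping the bits separate means one side clearing its own flag cannot accidentally restart a queue that the other side still wants stopped.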
-
Committed by Tom Herbert
Implementation of dynamic queue limits (dql). This is a library which allows a queue limit to be dynamically managed. The goal of dql is to keep the queue limit (the number of objects allowed in the queue) as small as possible without allowing the queue to be starved.

dql would be used with a queue which has these properties:

1) Objects are queued up to some limit which can be expressed as a count of objects.
2) Periodically a completion process executes which retires consumed objects.
3) Starvation occurs when the limit has been reached and all queued data has actually been consumed, but completion processing has not yet run, so queuing new data is blocked.
4) Minimizing the amount of queued data is desirable.

A canonical example of such a queue would be a NIC HW transmit queue.

The queue limit is dynamic; it will increase or decrease over time depending on the workload. The queue limit is recalculated each time completion processing is done. Increases occur when the queue is starved and can grow exponentially over successive intervals. Decreases occur when more data is being maintained in the queue than needed to prevent starvation. The number of extra objects, or "slack", is measured over successive intervals, and to avoid hysteresis the limit is only reduced by the minimum slack seen over a configurable time period.

The dql API provides routines to manage the queue:
- dql_init is called to initialize the dql structure
- dql_reset is called to reset dynamic values
- dql_queued is called when objects are being enqueued
- dql_avail returns availability in the queue
- dql_completed is called when objects have been consumed in the queue

Configuration consists of:
- max_limit, maximum limit
- min_limit, minimum limit
- slack_hold_time, time to measure instances of slack before reducing queue limit

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
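The grow-when-starved, shrink-by-minimum-slack behaviour can be pictured with a deliberately simplified toy model. This Python sketch is not the dql algorithm from the patch: the class and its methods are hypothetical, starvation is reported by the caller instead of being detected, the limit simply doubles on starvation, and the hold time is counted in completion intervals rather than in clock time.

```python
class ToyQueueLimit:
    """Toy limit controller loosely modelled on the description above."""

    def __init__(self, min_limit: int, max_limit: int, hold_intervals: int):
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.hold_intervals = hold_intervals
        self.limit = min_limit
        self.inflight = 0
        self._slacks = []  # slack seen in recent non-starved intervals

    def avail(self) -> int:
        """Objects that may still be queued before the limit is hit."""
        return self.limit - self.inflight

    def queued(self, count: int) -> None:
        self.inflight += count

    def completed(self, count: int, starved: bool) -> None:
        """Retire `count` objects and recalculate the limit."""
        before = self.inflight
        self.inflight -= count
        if starved:
            # Queue ran dry while blocked at the limit: grow quickly.
            self.limit = min(self.max_limit, max(self.limit * 2, 1))
            self._slacks.clear()
            return
        # Slack: room under the limit that this interval did not use.
        self._slacks.append(max(self.limit - before, 0))
        if len(self._slacks) >= self.hold_intervals:
            # Shrink only by the smallest slack seen over the hold period.
            self.limit = max(self.min_limit, self.limit - min(self._slacks))
            self._slacks.clear()


if __name__ == "__main__":
    q = ToyQueueLimit(min_limit=2, max_limit=64, hold_intervals=2)
    q.queued(2)
    q.completed(2, starved=True)    # limit doubles: 2 -> 4
    q.queued(1)
    q.completed(1, starved=False)   # slack 3 recorded
    q.queued(2)
    q.completed(2, starved=False)   # slack 2 recorded, limit drops by 2
    print(q.limit)
```

Reducing by the minimum slack over several intervals, rather than by the latest slack, is what keeps the limit from oscillating when the workload is bursty.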
-
- 29 Nov, 2011 6 commits
-
Committed by Lars-Peter Clausen
Currently the SigmaDSP firmware loader only works correctly on little-endian systems. Fix this by using the proper endianness conversion functions.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
-
Committed by Lars-Peter Clausen
The SigmaDSP firmware loader currently does not perform enough boundary size checks when processing the firmware. As a result it is possible for a malformed firmware to cause an out-of-bounds memory access.

This patch adds checks which ensure that both the action header and the payload are completely inside the firmware data boundaries before processing them.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
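The kind of check described above (header and payload must both lie inside the firmware image before they are touched) can be sketched in Python. The record layout here is hypothetical and is not the SigmaDSP firmware format: it assumes each action is a 4-byte little-endian header (16-bit type, 16-bit payload length) followed by the payload.

```python
import struct

# Hypothetical action header: u16 type, u16 payload length, little-endian.
# The explicit "<" keeps parsing independent of the host byte order.
HEADER = struct.Struct("<HH")


def parse_actions(firmware: bytes):
    """Split a firmware blob into (type, payload) actions, validating bounds."""
    actions = []
    pos = 0
    while pos < len(firmware):
        # The whole header must fit inside the remaining data.
        if len(firmware) - pos < HEADER.size:
            raise ValueError("truncated action header")
        kind, length = HEADER.unpack_from(firmware, pos)
        pos += HEADER.size
        # The whole payload must fit as well.
        if length > len(firmware) - pos:
            raise ValueError("action payload exceeds firmware size")
        actions.append((kind, firmware[pos:pos + length]))
        pos += length
    return actions


if __name__ == "__main__":
    good = HEADER.pack(1, 2) + b"ab" + HEADER.pack(2, 0)
    print(parse_actions(good))
    try:
        parse_actions(HEADER.pack(1, 10) + b"ab")  # claims 10 bytes, has 2
    except ValueError as err:
        print("rejected:", err)
```

Both checks compare against the bytes that remain rather than computing an end offset first, which avoids overflow problems in languages with fixed-width integers.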
-
Committed by Anton Blanchard
I just hit this during my testing. Isn't there another bug lurking?

BUG kmalloc-8: Redzone overwritten

INFO: 0xc0000000de9dec48-0xc0000000de9dec4b. First byte 0x0 instead of 0xcc
INFO: Allocated in .__seq_open_private+0x30/0xa0 age=0 cpu=5 pid=3896
 .__kmalloc+0x1e0/0x2d0
 .__seq_open_private+0x30/0xa0
 .seq_open_net+0x60/0xe0
 .dev_mc_seq_open+0x4c/0x70
 .proc_reg_open+0xd8/0x260
 .__dentry_open.clone.11+0x2b8/0x400
 .do_last+0xf4/0x950
 .path_openat+0xf8/0x480
 .do_filp_open+0x48/0xc0
 .do_sys_open+0x140/0x250
 syscall_exit+0x0/0x40

dev_mc_seq_ops uses dev_seq_start/next/stop but only allocates sizeof(struct seq_net_private) of private data, whereas it expects sizeof(struct dev_iter_state):

struct dev_iter_state {
	struct seq_net_private p;
	unsigned int pos; /* bucket << BUCKET_SPACE + offset */
};

Create dev_seq_open_ops and use it so we don't have to expose struct dev_iter_state.

[ Problem added by commit f04565dd (dev: use name hash for dev_seq_ops) -Eric ]

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Rafael J. Wysocki
The comments describing device power management callbacks in include/pm.h are outdated and somewhat confusing, so make them reflect reality more accurately.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
-
Committed by Thomas Pedersen
As per 802.11mb 13.9.11.3

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
-
Committed by Simon Wunderlich
This patch contains the configuration changes in nl80211/cfg80211.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
-
- 28 Nov, 2011 3 commits
-
Committed by Amir Vadai
The device must be in promiscuous mode or the DMAC must be the same as the host MAC, or else the packet will be dropped by the HW rx filtering.

Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Oren Duer
There are 2 capability bits for WOL, one for each port. The WOL handlers were looking only at the second bit, regardless of the port.

Signed-off-by: Oren Duer <oren@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Or Gerlitz
Towards adding RSS support for IB drivers/applications that use the mlx4 HW, make the RSS related definitions global and change the mlx4_en driver to use them.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 Nov, 2011 2 commits
-
Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ben Dooks
Add support for writing data to EEPROM.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-