Commits · e9bce845c0cee1a492e5cee6a827ae71140fe8b3 · openeuler / raspberrypi-kernel

11 Mar, 2011 2 commits

net: add proper documentation for previously added net_device_ops for FCoE · e9bce845

Authored by Yi Zou on Mar 09, 2011

Add proper documentation for the previously added net_device_ops for FCoE.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e9bce845

ipv4: Remove redundant RCU locking in ip_check_mc(). · dbdd9a52

Authored by David S. Miller on Mar 10, 2011

All callers are under rcu_read_lock() protection already.

Rename to ip_check_mc_rcu() to make it even more clear.
Signed-off-by: David S. Miller <davem@davemloft.net>

dbdd9a52

10 Mar, 2011 7 commits

tg3: Add code to verify RODATA checksum of VPD · d4894f3e

Authored by Matt Carlson on Mar 09, 2011

This patch adds code to verify the checksum stored in the "RV" info
keyword of the RODATA VPD section.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4894f3e

sysctl: the include of rcupdate.h is only needed in the kernel · 991ac30d

Authored by Stephen Rothwell on Mar 10, 2011

Fixes this build error:

include/linux/sysctl.h:28: included file 'linux/rcupdate.h' is not exported
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

991ac30d

sysctl: the include of rcupdate.h is only needed in the kernel · 684adca4

Authored by Stephen Rothwell on Mar 10, 2011

Fixes this build-check error:

  include/linux/sysctl.h:28: included file 'linux/rcupdate.h' is not exported
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

684adca4

net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules · 8909c9ad

Authored by Vasiliy Kulikov on Mar 02, 2011

Since a8f80e8f any process with
CAP_NET_ADMIN may load any module from /lib/modules/.  This doesn't mean
that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are
limited to /lib/modules/**.  However, the CAP_NET_ADMIN capability shouldn't
allow anybody to load a module that is not related to networking.

This patch restricts the ability to autoload modules to netdev modules
with explicit aliases.  This fixes CVE-2011-1019.

Arnd Bergmann suggested leaving untouched the old pre-v2.6.32 behavior
of loading netdev modules by name (without any prefix) for processes
with CAP_SYS_MODULE, to maintain compatibility with network scripts
that autoload netdev modules by aliases like "eth0", "wlan0".

Currently there are only three users of the feature in the upstream
kernel: ipip, ip_gre and sit.

    root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
    root@albatros:~# grep Cap /proc/$$/status
    CapInh:	0000000000000000
    CapPrm:	fffffff800001000
    CapEff:	fffffff800001000
    CapBnd:	fffffff800001000
    root@albatros:~# modprobe xfs
    FATAL: Error inserting xfs
    (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
    root@albatros:~# lsmod | grep xfs
    root@albatros:~# ifconfig xfs
    xfs: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep xfs
    root@albatros:~# lsmod | grep sit
    root@albatros:~# ifconfig sit
    sit: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep sit
    root@albatros:~# ifconfig sit0
    sit0      Link encap:IPv6-in-IPv4
	      NOARP  MTU:1480  Metric:1

    root@albatros:~# lsmod | grep sit
    sit                    10457  0
    tunnel4                 2957  1 sit

For processes with CAP_SYS_MODULE, module loading is still relaxed:

    root@albatros:~# grep Cap /proc/$$/status
    CapInh:	0000000000000000
    CapPrm:	ffffffffffffffff
    CapEff:	ffffffffffffffff
    CapBnd:	ffffffffffffffff
    root@albatros:~# ifconfig xfs
    xfs: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep xfs
    xfs                   745319  0

Reference: https://lkml.org/lkml/2011/2/24/203
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>

8909c9ad
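
The capability masks in the transcript above can be decoded by hand. The following is a minimal Python sketch, not kernel code: it tests the CAP_NET_ADMIN (bit 12) and CAP_SYS_MODULE (bit 16) bits of a `CapEff` value and models which module names an interface lookup would try. The `netdev-` alias prefix and the helper names `has_cap` and `autoload_requests` are assumptions for illustration; the commit message itself only speaks of "explicit aliases".

```python
# Capability bit numbers as defined in <linux/capability.h>.
CAP_NET_ADMIN = 12
CAP_SYS_MODULE = 16


def has_cap(mask_hex, cap):
    """Return True if capability bit `cap` is set in a /proc/<pid>/status mask."""
    return bool(int(mask_hex, 16) >> cap & 1)


def autoload_requests(ifname, cap_eff_hex):
    """Model (assumption, not the kernel source) of the names tried in order.

    CAP_NET_ADMIN may only request a prefixed netdev alias; CAP_SYS_MODULE
    keeps the old behaviour of requesting the bare name as a fallback.
    """
    requests = []
    if has_cap(cap_eff_hex, CAP_NET_ADMIN):
        requests.append("netdev-" + ifname)  # assumed alias prefix
    if has_cap(cap_eff_hex, CAP_SYS_MODULE):
        requests.append(ifname)
    return requests


if __name__ == "__main__":
    # Restricted shell from the transcript: only the prefixed alias is tried.
    print(autoload_requests("xfs", "fffffff800001000"))
    # Full capability set: the bare module name is tried as well.
    print(autoload_requests("xfs", "ffffffffffffffff"))
```

Under this model, `ifconfig xfs` in the restricted shell can only ask for an alias that xfs.ko does not provide, which matches the transcript where xfs stays unloaded.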

tcp: ioctl type SIOCOUTQNSD returns amount of data not sent · 2f4e1b39

Authored by Mario Schuknecht on Mar 09, 2011

In contrast to SIOCOUTQ, which returns the amount of data sent
but not yet acknowledged plus the data not yet sent, this patch only
returns the data not yet sent.

For various methods of live streaming bitrate control it may
be helpful to know how much data in the TCP out-queue has
not been sent yet.
Signed-off-by: Mario Schuknecht <m.schuknecht@dresearch.de>
Signed-off-by: Steffen Sledz <sledz@dresearch.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f4e1b39
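
As a rough illustration of how user space might combine the two ioctls, here is a Python sketch. The request numbers are assumptions taken from the Linux UAPI headers on common architectures (SIOCOUTQ shares its value with TIOCOUTQ, which differs on some architectures), and the helper name `queue_stats` is hypothetical.

```python
import fcntl
import struct

# Assumed request numbers (x86/ARM); verify against <linux/sockios.h>.
SIOCOUTQ = 0x5411      # unsent + sent-but-unacknowledged bytes
SIOCOUTQNSD = 0x894B   # unsent bytes only


def _ioctl_int(sock, request):
    """Issue an ioctl that writes a C int and return that int."""
    buf = fcntl.ioctl(sock.fileno(), request, struct.pack("i", 0))
    return struct.unpack("i", buf)[0]


def queue_stats(sock):
    """Return out-queue figures for a connected TCP socket (Linux only)."""
    outq = _ioctl_int(sock, SIOCOUTQ)
    unsent = _ioctl_int(sock, SIOCOUTQNSD)
    return {"outq": outq, "unsent": unsent, "in_flight": outq - unsent}
```

A streaming sender could poll `queue_stats(sock)["unsent"]` and lower its bitrate when the value keeps growing; the threshold is application specific.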

Phonet: kill the ST-Ericsson pipe controller Kconfig · a015f6f4

Authored by Rémi Denis-Courmont on Mar 08, 2011

This is now a run-time choice so that a single kernel can support both
old and new generation ISI modems. Support for manually enabling the
pipe flow is removed as it did not work properly, does not fit well
with the socket API, and I am not aware of any use at the moment.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a015f6f4

Phonet: provide pipe socket option to retrieve the pipe identifier · acaf7df6

Authored by Rémi Denis-Courmont on Mar 08, 2011

User-space sometimes needs this information. In particular, the GPRS
context or the AT commands pipe setups may use the pipe handle as a
reference.

This removes the settable pipe handle with CONFIG_PHONET_PIPECTRLR.
It did not handle error cases correctly. Furthermore, the kernel
*could* implement a smart scheme for allocating handles (if ever
needed), but userspace really cannot.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

acaf7df6

08 Mar, 2011 3 commits

unfuck proc_sysctl ->d_compare() · dfef6dcd

Authored by Al Viro on Mar 08, 2011

a) struct inode is not going to be freed under ->d_compare();
however, the thing PROC_I(inode)->sysctl points to just might.
Fortunately, it's enough to make freeing that sucker delayed,
provided that we don't step on its ->unregistering, clear
the pointer to it in PROC_I(inode) before dropping the reference
and check if it's NULL in ->d_compare().

b) I'm not sure that we *can* walk into NULL inode here (we recheck
dentry->seq between verifying that it's still hashed / fetching
dentry->d_inode and passing it to ->d_compare() and there's no
negative hashed dentries in /proc/sys/*), but if we can walk into
that, we really should not have ->d_compare() return 0 on it!
Said that, I really suspect that this check can be simply killed.
Nick?
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dfef6dcd

net: add ndo_fcoe_ddp_target() to support FCoE DDP in target mode · 6247e086

Authored by Yi Zou on Feb 01, 2011

Fibre Channel over Ethernet (FCoE) Direct Data Placement (DDP) can also be
used for an FCoE target: the DDP used for read I/O on an initiator can be
used on an FCoE target to speed up write I/O from the initiator to the target.
The added ndo_fcoe_ddp_target() works in a similar way to the existing
ndo_fcoe_ddp_setup(): it lets the underlying hardware set up the DDP context
when it is called from the FCoE target implementation on top of
the existing Open-FCoE fcoe/libfc protocol stack. Without losing the ability
to provide DDP for read I/O as an initiator, the hardware can then also provide
DDP offload for write I/O coming from the initiator when acting as a target.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

6247e086

netdevice: Convert printk to pr_info in netif_tx_stop_queue · 256ee435

Authored by Joe Perches on Mar 01, 2011

This allows any caller to be prefaced by any specific
pr_fmt to better identify which device driver is using
this function inappropriately.

Add terminating newline.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

256ee435

05 Mar, 2011 5 commits

mm: add alloc_page_vma_node() · 236344d6

Authored by Andi Kleen on Mar 04, 2011

Add an alloc_page_vma_node() that allows passing the "local" node in.  Used
in a follow-on patch.
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

236344d6

mm: change alloc_pages_vma to pass down the policy node for local policy · 2f5f9486

Authored by Andi Kleen on Mar 04, 2011

Currently alloc_pages_vma() always uses the local node as policy node for
the LOCAL policy.  Pass this node down as an argument instead.

No behaviour change from this patch, but it will be needed for follow-on patches.
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2f5f9486

libceph: fix msgr keepalive flag · e76661d0

Authored by Sage Weil on Mar 03, 2011

There was some broken keepalive code using a dead variable.  Shift to using
the proper bit flag.
Signed-off-by: Sage Weil <sage@newdream.net>

e76661d0

libceph: fix msgr backoff · 60bf8bf8

Authored by Sage Weil on Mar 04, 2011

With commit f363e45f we replaced a bunch of hacky workqueue mutual
exclusion logic with the WQ_NON_REENTRANT flag.  One piece of fallout is
that the exponential backoff breaks in certain cases:

 * con_work attempts to connect.
 * we get an immediate failure, and the socket state change handler queues
   immediate work.
 * con_work calls con_fault, we decide to back off, but can't queue delayed
   work.

In this case, we add a BACKOFF bit to make con_work reschedule delayed work
next time it runs (which should be immediately).
Signed-off-by: Sage Weil <sage@newdream.net>

60bf8bf8

Mark ptrace_{traceme,attach,detach} static · e3e89cc5

Authored by Linus Torvalds on Mar 04, 2011

They are only used inside kernel/ptrace.c, and have been for a long
time. We don't want to go back to the bad-old-days when architectures
did things on their own, so make them static and private.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e3e89cc5

04 Mar, 2011 2 commits

netlink: kill eff_cap from struct netlink_skb_parms · 01a16b21

Authored by Patrick McHardy on Mar 03, 2011

Netlink message processing in the kernel is synchronous these days, so
capabilities can be checked directly in security_netlink_recv() from
the current process.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Reviewed-by: James Morris <jmorris@namei.org>
[chrisw: update to include pohmelfs and uvesafb]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

01a16b21

netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms · c53fa1ed

Authored by Patrick McHardy on Mar 03, 2011

Netlink message processing in the kernel is synchronous these days, so the
session information can be collected when needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

c53fa1ed

03 Mar, 2011 4 commits

blktrace: Remove blk_fill_rwbs_rq. · 2d3a8497

Authored by Tao Ma on Mar 03, 2011

If we enable trace events to trace block actions, we use
blk_fill_rwbs_rq to analyze the corresponding actions
in the request's cmd_flags, but we only take the low 2 bits
from it, so most of the other flags (e.g. REQ_SYNC) are missing.
For example, with a sync write we get:
write_test-2409  [001]   160.013869: block_rq_insert: 3,64 W 0 () 258135 + 8 [write_test]

Since we have now integrated the flags of both bio and request,
it is safe to pass rq->cmd_flags directly to blk_fill_rwbs, and
blk_fill_rwbs_rq isn't needed any more.

With this patch, after a sync write we get:
write_test-2417  [000]   226.603878: block_rq_insert: 3,64 WS 0 () 258135 + 8 [write_test]
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

2d3a8497
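
The difference between the "W" and "WS" trace output can be modelled in a few lines. This is an illustrative Python sketch only: the flag bit values below are hypothetical and are not the kernel's REQ_* definitions, and both function names are made up for the example.

```python
# Hypothetical bit positions, chosen only for illustration.
REQ_WRITE = 1 << 0
REQ_SYNC = 1 << 4


def fill_rwbs(cmd_flags):
    """Build the rwbs string from the full flag word (new behaviour)."""
    rwbs = "W" if cmd_flags & REQ_WRITE else "R"
    if cmd_flags & REQ_SYNC:
        rwbs += "S"
    return rwbs


def fill_rwbs_low_bits_only(cmd_flags):
    """Model of the removed helper: only the low 2 bits are considered."""
    return fill_rwbs(cmd_flags & 0x3)


if __name__ == "__main__":
    sync_write = REQ_WRITE | REQ_SYNC
    print(fill_rwbs_low_bits_only(sync_write))  # sync flag is lost
    print(fill_rwbs(sync_write))                # sync flag is reported
```

Masking before decoding drops every flag above the mask, which is why the sync write showed up as a plain write before the patch.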

dcbnl: add support for retrieving peer configuration - cee · dc6ed1df

Authored by Shmulik Ravid on Feb 27, 2011

This patch adds support for retrieving the remote or peer DCBX
configuration via dcbnl for embedded DCBX stacks supporting the CEE DCBX
standard.
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc6ed1df

dcbnl: add support for retrieving peer configuration - ieee · eed84713

Authored by Shmulik Ravid on Feb 27, 2011

These 2 patches add support for retrieving the remote or peer DCBX
configuration via dcbnl for embedded DCBX stacks. The peer configuration
is part of the DCBX MIB and is useful for debugging and diagnostics of
the overall DCB configuration. The first patch adds this support for the IEEE
802.1Qaz standard; the second patch adds the same support for the older
CEE standard. Diff for v2 - the peer-app-info is CEE specific.
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eed84713

netdevice: make initial group visible to userspace · 23b41168

Authored by Vlad Dogaru on Feb 26, 2011

INIT_NETDEV_GROUP is needed by userspace, so move it outside the __KERNEL__
guards.
Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

23b41168

02 Mar, 2011 4 commits

block: add @force_kblockd to __blk_run_queue() · 1654e741

Authored by Tejun Heo on Mar 02, 2011

__blk_run_queue() automatically either calls q->request_fn() directly
or schedules kblockd depending on whether the function is recursed.
blk-flush implementation needs to be able to explicitly choose
kblockd.  Add @force_kblockd.

All the current users are converted to specify %false for the
parameter and this patch doesn't introduce any behavior change.

stable: This is prerequisite for fixing ide oops caused by the new
        blk-flush implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

1654e741

mfd: Don't suspend WM8994 if the CODEC is not suspended · 77bd70e9

Authored by Mark Brown on Feb 04, 2011

ASoC supports keeping the audio subsystem active over suspend in order
to support use cases such as audio passthrough from a cellular modem
with the main CPU suspended. Ensure that we don't power down the CODEC
when this is happening by checking to see if VMID is up and skipping
suspend and resume when it is. If the CODEC has suspended then it'll
turn VMID off before the core suspend() gets called.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>

77bd70e9

cfg80211: add a field for the bitrate of the last rx data packet from a station · c8dcfd8a

Authored by Felix Fietkau on Feb 27, 2011

Also fix a typo in the STATION_INFO_TX_BITRATE description.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

c8dcfd8a

blk-throttle: Do not use kblockd workqueue for throtl work · 450adcbe

Authored by Vivek Goyal on Mar 01, 2011

o Dominik Klein reported a system hang issue while doing some blkio
  throttling testing.

  https://lkml.org/lkml/2011/2/24/173

o Some tracing revealed that CFQ was not dispatching any more jobs as
  queue unplug was not happening. And queue unplug was not happening
  because the unplug work was not being called, as there was one throttling
  work item on the same cpu which had not finished yet. And the throttling
  work had not finished as it was trying to dispatch a bio to CFQ, but all
  the request descriptors were consumed, so it was put to sleep.

o So basically it is a cyclic dependency between CFQ unplug work and
  throtl dispatch work. Tejun suggested using a separate workqueue for
  such cases.

o This patch uses a separate workqueue for throttle related work and
  does not rely on kblockd workqueue anymore.

Cc: stable@kernel.org
Reported-by: Dominik Klein <dk@in-telegence.net>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

450adcbe

01 Mar, 2011 1 commit

ACPI: Fix build for CONFIG_NET unset · af06216a

Authored by Rafael J. Wysocki on Mar 01, 2011

Several ACPI drivers fail to build if CONFIG_NET is unset, because
they refer to things depending on CONFIG_THERMAL that in turn depends
on CONFIG_NET.  However, CONFIG_THERMAL doesn't really need to depend
on CONFIG_NET, because the only part of it requiring CONFIG_NET is
the netlink interface in thermal_sys.c.

Put the netlink interface in thermal_sys.c under #ifdef CONFIG_NET
and remove the dependency of CONFIG_THERMAL on CONFIG_NET from
drivers/thermal/Kconfig.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Len Brown <lenb@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Luming Yu <luming.yu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

af06216a

28 Feb, 2011 1 commit

netpoll: remove IFF_IN_NETPOLL flag · 080e4130

Authored by Amerigo Wang on Feb 17, 2011

V4: rebase to net-next-2.6

This patch removes the IFF_IN_NETPOLL flag; we don't need it any more since
we have netpoll_tx_running() now.
Signed-off-by: WANG Cong <amwang@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

080e4130

26 Feb, 2011 1 commit

rapidio: fix sysfs config attribute to access 16MB of maint space · fe41947e

Authored by Alexandre Bounine on Feb 25, 2011

Fixes the sysfs config attribute to allow access to the entire 16MB maintenance
space of RapidIO devices.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fe41947e

25 Feb, 2011 3 commits

netem: revised correlated loss generator · 661b7972

Authored by stephen hemminger on Feb 23, 2011

This patch originated with Stefano Salsano and Fabio Ludovici.
It provides several alternative loss models for use with netem.
This patch adds two state machine based loss models.

See: http://netgroup.uniroma2.it/twiki/bin/view.cgi/Main/NetemCLG
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

661b7972

netem: define NETEM_DIST_MAX · df173bda

Authored by stephen hemminger on Feb 23, 2011

Rather than using a magic constant in code, expose the maximum size of the
packet distribution table in the API. In iproute2, q_netem defines
MAX_DIST as 16K already.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

df173bda

PM: Make ACPI wakeup from S5 work again when CONFIG_PM_SLEEP is unset · 805bdaec

Authored by Rafael J. Wysocki on Feb 24, 2011

Commit 074037ec (PM / Wakeup: Introduce wakeup source objects and
event statistics (v3)) caused ACPI wakeup to only work if
CONFIG_PM_SLEEP is set, but it also worked for CONFIG_PM_SLEEP unset
before.  This can be fixed by making device_set_wakeup_enable(),
device_init_wakeup() and device_may_wakeup() work in the same way
as before commit 074037ec when CONFIG_PM_SLEEP is unset.
Reported-and-tested-by: Justin Maggard <jmaggard10@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

805bdaec

24 Feb, 2011 6 commits

Fix over-zealous flush_disk when changing device size. · 93b270f7

Authored by NeilBrown on Feb 24, 2011

There are two cases when we call flush_disk.
In one, the device has disappeared (check_disk_change) so any
data we hold becomes irrelevant.
In the other, the device has changed size (check_disk_size_change)
so data we hold may be irrelevant.

In both cases it makes sense to discard any 'clean' buffers,
so they will be read back from the device if needed.

In the former case it makes sense to discard 'dirty' buffers
as there will never be anywhere safe to write the data.  In the
second case it *does*not* make sense to discard dirty buffers
as that will lead to file system corruption when you simply enlarge
the containing devices.

flush_disk calls __invalidate_device.
__invalidate_device calls both invalidate_inodes and invalidate_bdev.

invalidate_inodes *does* discard I_DIRTY inodes and this does lead
to fs corruption.

invalidate_bdev *does*not* discard dirty pages, but I don't really care
about that at present.

So this patch adds a flag to __invalidate_device (calling it
__invalidate_device2) to indicate whether dirty buffers should be
killed, and this is passed to invalidate_inodes which can choose to
skip dirty inodes.

flush_disk then passes true from check_disk_change and false from
check_disk_size_change.

dm avoids tripping over this problem by calling i_size_write directly
rather than using check_disk_size_change.

md does use check_disk_size_change and so is affected.

This regression was introduced by commit 608aeef1 which causes
check_disk_size_change to call flush_disk, so it is suitable for any
kernel since 2.6.27.

Cc: stable@kernel.org
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Andrew Patterson <andrew.patterson@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NeilBrown <neilb@suse.de>

93b270f7
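
The effect of the new flag can be shown with a toy model. The Python sketch below is not the kernel implementation: the `Inode` class and the list-based cache are assumptions for illustration, and only the function names mirror the ones mentioned in the commit message.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    name: str
    dirty: bool


def invalidate_inodes(inodes, kill_dirty):
    """Return the inodes that survive: dirty ones, unless kill_dirty is set."""
    return [inode for inode in inodes if inode.dirty and not kill_dirty]


def flush_disk(inodes, kill_dirty):
    return invalidate_inodes(inodes, kill_dirty)


def check_disk_change(inodes):
    # Device is gone: nothing can ever be written back, so drop everything.
    return flush_disk(inodes, kill_dirty=True)


def check_disk_size_change(inodes):
    # Device only changed size: keep dirty inodes so they can be written back.
    return flush_disk(inodes, kill_dirty=False)


if __name__ == "__main__":
    cache = [Inode("clean", dirty=False), Inode("pending", dirty=True)]
    print([i.name for i in check_disk_size_change(cache)])
    print([i.name for i in check_disk_change(cache)])
```

Clean inodes are dropped in both cases, since they can be read back from the device; only the treatment of dirty inodes differs.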

mm: prevent concurrent unmap_mapping_range() on the same inode · 2aa15890

Authored by Miklos Szeredi on Feb 23, 2011

Michael Leun reported that running parallel opens on a fuse filesystem
can trigger a "kernel BUG at mm/truncate.c:475"

Gurudas Pai reported the same bug on NFS.

The reason is, unmap_mapping_range() is not prepared for more than
one concurrent invocation per inode.  For example:

  thread1: going through a big range, stops in the middle of a vma and
     stores the restart address in vm_truncate_count.

  thread2: comes in with a small (e.g. single page) unmap request on
     the same vma, somewhere before restart_address, finds that the
     vma was already unmapped up to the restart address and happily
     returns without doing anything.

Another scenario would be two big unmap requests, both having to
restart the unmapping and each one setting vm_truncate_count to its
own value.  This could go on forever without any of them being able to
finish.

Truncate and hole punching already serialize with i_mutex.  Other
callers of unmap_mapping_range() do not, and it's difficult to get
i_mutex protection for all callers.  In particular ->d_revalidate(),
which calls invalidate_inode_pages2_range() in fuse, may be called
with or without i_mutex.

This patch adds a new mutex to 'struct address_space' to prevent
running multiple concurrent unmap_mapping_range() on the same mapping.

[ We'll hopefully get rid of all this with the upcoming mm
  preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex
  lockbreak" patch in particular.  But that is for 2.6.39 ]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reported-by: Michael Leun <lkml20101129@newton.leun.net>
Reported-by: Gurudas Pai <gurudas.pai@oracle.com>
Tested-by: Gurudas Pai <gurudas.pai@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2aa15890

tipc: Clean out all remaining instances of #if 0'd unused code · 77c81e0b

Authored by Allan Stephens on Jan 18, 2011

Remove all instances of legacy or proposed-but-not-implemented code
that lives within an #if 0 ... #endif block. If some of it is needed
in the future it can be recovered from history, but there is no need
for it to clutter up the active code base.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

77c81e0b

tipc: Improve handling of invalid link tolerance values · 5413b4c6

Authored by Allan Stephens on Jan 18, 2011

Enhances TIPC link code to ignore an invalid link tolerance value
contained in an incoming LINK_PROTOCOL message, rather than
processing the value and potentially causing a divide-by-zero error.

Also add a compile-time check that catches attempts to redefine
TIPC's minimum link tolerance value in a manner that might result
in the same divide-by-zero error at run-time.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

5413b4c6

net: Implement SFEATURES compatibility for not updated drivers · 39fc0ce5

Authored by Michał Mirosław on Feb 22, 2011

Use discrete setting ops for drivers that have not been updated. This will
not make them conform to full G/SFEATURES semantics, though.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

39fc0ce5

net_sched: SFB flow scheduler · e13e02a3

Authored by Eric Dumazet on Feb 23, 2011

This is the Stochastic Fair Blue scheduler, based on work from :

W. Feng, D. Kandlur, D. Saha, K. Shin. Blue: A New Class of Active Queue
Management Algorithms. U. Michigan CSE-TR-387-99, April 1999.

http://www.thefengs.com/wuchang/blue/CSE-TR-387-99.pdf

This implementation is based on work done by Juliusz Chroboczek

General SFB algorithm can be found in figure 14, page 15:

B[l][n] : L x N array of bins (L levels, N bins per level)
enqueue()
Calculate hash function values h{0}, h{1}, .. h{L-1}
Update bins at each level
for i = 0 to L - 1
   if (B[i][h{i}].qlen > bin_size)
      B[i][h{i}].p_mark += p_increment;
   else if (B[i][h{i}].qlen == 0)
      B[i][h{i}].p_mark -= p_decrement;
p_min = min(B[0][h{0}].p_mark ... B[L-1][h{L-1}].p_mark);
if (p_min == 1.0)
    ratelimit();
else
    mark/drop with probability p_min;

I did the adaptation of Juliusz code to meet current kernel standards,
and various changes to address previous comments :

http://thread.gmane.org/gmane.linux.network/90225
http://thread.gmane.org/gmane.linux.network/90375

Default flow classifier is the rxhash introduced by RPS in 2.6.35, but
we can use an external flow classifier if wanted.

tc qdisc add dev $DEV parent 1:11 handle 11:  \
        est 0.5sec 2sec sfb limit 128

tc filter add dev $DEV protocol ip parent 11: handle 3 \
        flow hash keys dst divisor 1024

Notes:

1) SFB default child qdisc is pfifo_fast. It can be changed by another
qdisc but a child qdisc MUST not drop a packet previously queued. This
is because SFB needs to handle a dequeued packet in order to maintain
its virtual queue states. pfifo_head_drop or CHOKe should not be used.

2) ECN is enabled by default, unlike RED/CHOKe/GRED

With help from Patrick McHardy & Andi Kleen
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Andi Kleen <andi@firstfloor.org>
CC: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e13e02a3
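
The enqueue pseudocode in this commit message can be turned into a small runnable model. The Python sketch below follows the pseudocode step by step but is not the kernel qdisc: the SHA-256 based per-level hash, the clamping of `p_mark` to [0, 1], the choice not to queue a marked packet, and all default parameter values are assumptions made for illustration.

```python
import hashlib
import random


class SFB:
    """Toy Stochastic Fair Blue: L levels of N bins, each with qlen and p_mark."""

    def __init__(self, levels=2, bins=16, bin_size=4,
                 p_increment=0.25, p_decrement=0.125, seed=0):
        self.levels, self.bins, self.bin_size = levels, bins, bin_size
        self.p_increment, self.p_decrement = p_increment, p_decrement
        self.qlen = [[0] * bins for _ in range(levels)]
        self.p_mark = [[0.0] * bins for _ in range(levels)]
        self.rng = random.Random(seed)

    def _bin(self, level, flow):
        # Assumed hash: one independent function per level via a salted digest.
        digest = hashlib.sha256(f"{level}:{flow}".encode()).digest()
        return int.from_bytes(digest[:4], "big") % self.bins

    def enqueue(self, flow):
        idx = [self._bin(i, flow) for i in range(self.levels)]
        for i, b in enumerate(idx):
            if self.qlen[i][b] > self.bin_size:
                self.p_mark[i][b] = min(1.0, self.p_mark[i][b] + self.p_increment)
            elif self.qlen[i][b] == 0:
                self.p_mark[i][b] = max(0.0, self.p_mark[i][b] - self.p_decrement)
        p_min = min(self.p_mark[i][b] for i, b in enumerate(idx))
        if p_min >= 1.0:
            return "ratelimit"   # flow is treated as non-responsive
        if self.rng.random() < p_min:
            return "mark"        # simplification: marked packets are not queued
        for i, b in enumerate(idx):
            self.qlen[i][b] += 1
        return "enqueue"

    def dequeue(self, flow):
        for i in range(self.levels):
            b = self._bin(i, flow)
            self.qlen[i][b] = max(0, self.qlen[i][b] - 1)


if __name__ == "__main__":
    sfb = SFB(bin_size=2, p_increment=0.5, p_decrement=0.25)
    print([sfb.enqueue("flow-a") for _ in range(6)])
```

A flow that keeps sending without its packets being dequeued pushes the marking probability of all its bins to 1.0 and ends up rate limited, while a flow that shares only some bins with it keeps a lower `p_min`.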

23 Feb, 2011 1 commit

xfrm: Mark flowi arg to security_xfrm_state_pol_flow_match() const. · e33f7704

Authored by David S. Miller on Feb 22, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

e33f7704