提交 · 6582c665d6b882dad8329e05749fbcf119f1ab88 · openanolis / cloud-kernel

18 12月, 2012 40 次提交

prandom: introduce prandom_bytes() and prandom_bytes_state() · 6582c665

由 Akinobu Mita 提交于 12月 17, 2012

Add functions to get the requested number of pseudo-random bytes.

The difference from get_random_bytes() is that it generates pseudo-random
numbers by prandom_u32().  It doesn't consume the entropy pool, and the
sequence is reproducible if the same rnd_state is used.  So it is suitable
for generating random bytes for testing.
Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: David Laight <david.laight@aculab.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6582c665

random32: rename random32 to prandom · 496f2f93

由 Akinobu Mita 提交于 12月 17, 2012

This renames all random32 functions to have 'prandom_' prefix as follows:

  void prandom_seed(u32 seed);	/* rename from srandom32() */
  u32 prandom_u32(void);		/* rename from random32() */
  void prandom_seed_state(struct rnd_state *state, u64 seed);
  				/* rename from prandom32_seed() */
  u32 prandom_u32_state(struct rnd_state *state);
  				/* rename from prandom32() */

The purpose of this renaming is to prevent some kernel developers from
assuming that prandom32() and random32() might imply that only
prandom32() was the one using a pseudo-random number generator by
prandom32's "p", and the result may be a very embarassing security
exposure.  This concern was expressed by Theodore Ts'o.

And furthermore, I'm going to introduce new functions for getting the
requested number of pseudo-random bytes.  If I continue to use both
prandom32 and random32 prefixes for these functions, the confusion
is getting worse.

As a result of this renaming, "prandom_" is the common prefix for
pseudo-random number library.

Currently, srandom32() and random32() are preserved because it is
difficult to rename too many users at once.
Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Cc: David Laight <david.laight@aculab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

496f2f93

aoe: fix use after free in aoedev_by_aoeaddr() · 31279b14

由 Dan Carpenter 提交于 12月 17, 2012

We should return NULL on failure instead of returning a freed pointer.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Cc: Ed Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

31279b14

aoe: update internal version number to 81 · 2b37c7d8

由 Ed Cashin 提交于 12月 17, 2012

This version number is printed to the console on module initialization
and is available in sysfs, which is where the userland aoe-version tool
looks for it.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2b37c7d8

aoe: identify source of runt AoE packets · bf29754a

由 Ed Cashin 提交于 12月 17, 2012

This change only affects experimental AoE storage networks.

It modifies the console message about runt packets detected so that the
AoE major and minor addresses of the AoE target that generated the runt
are mentioned.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bf29754a

aoe: allow comma separator in aoe_iflist value · 4a6c9ee9

由 Ed Cashin 提交于 12月 17, 2012

By default, the aoe driver uses any ethernet interface for AoE, but the
aoe_iflist module parameter provides a convenient way to limit AoE
traffic to a specific list of local network interfaces.

This change allows a list to be specified using the comma character as a
separator.  For example,

  modprobe aoe aoe_iflist=eth2,eth3

Before, it was inconvenient to get the quoting right in shell scripts
when setting aoe_iflist to have more than one network interface.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4a6c9ee9

aoe: allow user to disable target failure timeout · c450ba0f

由 Ed Cashin 提交于 12月 17, 2012

With this change, the aoe driver treats the value zero as special for
the aoe_deadsecs module parameter.  Normally, this value specifies the
number of seconds during which the driver will continue to attempt
retransmits to an unresponsive AoE target.  After aoe_deadsecs has
elapsed, the aoe driver marks the aoe device as "down" and fails all
I/O.

The new meaning of an aoe_deadsecs of zero is for the driver to
retransmit commands indefinitely.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c450ba0f

aoe: use dynamic number of remote ports for AoE storage target · 71114ec4

由 Ed Cashin 提交于 12月 17, 2012

Many AoE targets have four or fewer network ports, but some existing
storage devices have many, and the AoE protocol sets no limit.

This patch allows the use of more than eight remote MAC addresses per AoE
target, while reducing the amount of memory used by the aoe driver in
cases where there are many AoE targets with fewer than eight MAC addresses
each.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

71114ec4

aoe: avoid races between device destruction and discovery · e52a2932

由 Ed Cashin 提交于 12月 17, 2012

This change avoids a race that could result in a NULL pointer derference
following a WARNing from kobject_add_internal, "don't try to register
things with the same name in the same directory."

The problem was found with a test that forgets and discovers an
aoe device in a loop:

  while test ! -r /tmp/stop; do
	aoe-flush -a
	aoe-discover
  done

The race was between aoedev_flush taking aoedevs out of the devlist,
allowing a new discovery of the same AoE target to take place before the
driver gets around to calling sysfs_remove_group.  Fixing that one
revealed another race between do_open and add_disk, and this patch avoids
that, too.

The fix required some care, because for flushing (forgetting) an aoedev,
some of the steps must be performed under lock and some must be able to
sleep.  Also, for discovering a new aoedev, some steps might sleep.

The check for a bad aoedev pointer remains from a time when about half of
this patch was done, and it was possible for the
bdev->bd_disk->private_data to become corrupted.  The check should be
removed eventually, but it is not expected to add significant overhead,
occurring in the aoeblk_open routine.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e52a2932

aoe: improve handling of misbehaving network paths · bbb44e30

由 Ed Cashin 提交于 12月 17, 2012

An AoE target can have multiple network ports used for AoE, and in the
aoe driver, those are tracked by the aoetgt struct.  These changes allow
the aoe driver to handle network paths, or aoetgts, that are not working
well, compared to the others.

Paths that do not get responses despite the retransmission of AoE
commands are marked as "tainted", and non-tainted paths are preferred.

Meanwhile, the aoe driver attempts to "probe" the tainted path in the
background by issuing reads of LBA 0 that are padded out to full
(possibly jumbo-frame) size.  If the probes get responses, then the path
is "redeemed", and its taint is removed.

This mechanism has been shown to be helpful in transparently handling
and recovering from real-world network "brown outs" in ways that the
earlier "shoot the help-needing target in the head" mechanism could not.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bbb44e30

aoe: return real minor number for static minors · b91316f2

由 Ed Cashin 提交于 12月 17, 2012

The value returned by the static minor device number number allocator is
the real minor number, so it must be multiplied by the supported number
of partitions per aoedev.

Without this fix the support for systems without udev is incomplete, and
the few users of aoe on such systems will have surprising results when
device nodes names do not match the AoE target.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b91316f2

aoe: initialize sysminor to avoid compiler warning · 10935d05

由 Ed Cashin 提交于 12月 17, 2012

Because the minor_get and related functions use the return values for
errors, the compiler doesn't know that sysminor will always either 1) be
initialized in aoedev_by_aoeaddr by the call to minor_get, or 2) be
unused as the "goto out" is executed.

This patch avoids the compiler warning.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

10935d05

aoe: make error messages more specific in static minor allocation · e0b2bbab

由 Ed Cashin 提交于 12月 17, 2012

For some special-purpose systems where udev isn't present, static
allocation of minor numbers is desirable.  This update distinguishes
different failure scenarios, to help the user understand what went
wrong.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e0b2bbab

aoe: remove call to request handler from I/O completion · 60116cf7

由 Ed Cashin 提交于 12月 17, 2012

There is no need to call the request handler function in the I/O
completion routine.  The user impact of not doing it is a more "nice" aoe
driver that is less susceptible to causing soft lockups.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

60116cf7

aoe: cleanup: correct comment for aoetgt nout · 72837600

由 Ed Cashin 提交于 12月 17, 2012

A misplaced comment was attached to the nout member of the aoetgt.  This
change corrects the comment.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

72837600

aoe: increase default cap on outstanding AoE commands in the network · 7b6ccc5f

由 Ed Cashin 提交于 12月 17, 2012

The aoe driver will never be waiting for more than aoe_maxout AoE
commands from a given remote network port on an AoE target.  Increasing
the cap increases performance.  Users can tighten the setting to reduce
the amount of memory used for handling AoE traffic or the network
bandwidth used for AoE.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7b6ccc5f

aoe: remove vestigial request queue allocation · 0a41409c

由 Ed Cashin 提交于 12月 17, 2012

Before the aoe driver was an I/O request handler, it was a
make_request-style block driver.  Even so, there was a problem where
sysfs expected a request queue to exist, so one was provided in commit
7135a71b ("aoe: allocate unused request_queue for sysfs").

During the transition to the request-handler style, a patch was merged
that was based on a driver without the noop queue, and the noop queue
remained in place after the patch was merged, even though a new
functional queue was introduced by the patch, allocated through
blk_init_queue.

The user impact is a memory leak proportional to the number of AoE
targets discovered.  This patch removes the memory leak and cleans up
vestiges of the old do-nothing queue from the aoeblk_gdalloc function.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0a41409c

aoe: copy fallback timing information on destination failover · fe7252bf

由 Ed Cashin 提交于 12月 17, 2012

Commit f3b8e07af774 ("aoe: commands in retransmit queue use new
destination on failure") omits the copying of the coarse-grained time
when an AoE command was sent during the failover from one destination
MAC address on the AoE target to another.

The coarse-grained timing is only used when the system time changes or
an unlikely length of time has passed since the sending of the AoE
command.  Users will not be impacted unless their system clock is very
inaccurate or something unusual (e.g., 10 GbE link reset) happens during
the period when the aoe driver is handling the failure of a port on the
AoE target.  Being effected will mean that an AoE target could be
considered "down" too eagerly.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fe7252bf

aoe: update driver-internal version to 64+ · 519b77b0

由 Ed Cashin 提交于 12月 17, 2012

Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

519b77b0

aoe: commands in retransmit queue use new destination on failure · 3fc9b032

由 Ed Cashin 提交于 12月 17, 2012

When one remote MAC address isn't working as a destination for AoE
commands, the frames used to track information associated with the AoE
commands are moved to a new aoetgt (defined by the tuple of {AoE major,
AoE minor, target MAC address}).

This patch makes sure that the frames on the queue for retransmits that
need to be done are updated to use the new destination, so that
retransmits will be sent through a working network path.

Without this change, packets on the retransmit queue will be needlessly
retransmitted to the unresponsive destination MAC, possibly causing
premature target failure before there's time for the retransmit timer to
run again, decide to retransmit again, and finally update the destination
to a working MAC address on the AoE target.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3fc9b032

aoe: use high-resolution RTTs with fallback to low-res · 5f0c9c48

由 Ed Cashin 提交于 12月 17, 2012

These changes improve the accuracy of the decision about whether it's time
to retransmit an AoE command by using the microsecond-resolution
gettimeofday instead of jiffies.

Because the system time can jump suddenly, the decision reverts to using
jiffies if the high-resolution time difference is relatively large.
Otherwise the AoE targets could be considered failed inappropriately.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5f0c9c48

aoe: manipulate aoedev network stats under lock · 0d555ecf

由 Ed Cashin 提交于 12月 17, 2012

With this bugfix in place the calculation of the criterion for "lateness"
is performed under lock.  Without the lock, there is a chance that one of
the non-atomic operations performed on the round trip time statistics
could be incomplete, such that an incorrect lateness criterion would be
calculated.

Without this change, the effect of the bug would be rare unecessary but
benign retransmissions.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0d555ecf

aoe: err device: include MAC addresses for unexpected responses · 2292a7e1

由 Ed Cashin 提交于 12月 17, 2012

The /dev/etherd/err character device provides low-level information about
normal but sometimes interesting AoE command retransmits and "unexpected
responses", i.e., responses for packets that have already been
retransmitted.

This change adds MAC addresses to the messages about unexpected responses,
so that when they occur, it's more easy to determine the network paths to
which they belong.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2292a7e1

aoe: improve network congestion handling · 3a0c40d2

由 Ed Cashin 提交于 12月 17, 2012

The aoe driver already had some congestion handling, but it was limited in
its ability to cope with the kind of congestion that can arise on more
complex networks such as those involving paths through multiple ethernet
switches.

Some of the lessons from TCP's history of development can be applied to
improving the congestion control and avoidance on AoE storage networks.
These changes use familar concepts from Van Jacobson's "Congestion
Avoidance and Control" paper from '88, without adding significant
overhead.

This patch depends on an upcoming patch that covers the failover case when
AoE commands being retransmitted are transferred from one retransmit queue
to another.  Another upcoming patch increases the timing accuracy.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3a0c40d2

aoe: provide ATA identify device content to user on request · 667be1e7

由 Ed Cashin 提交于 12月 17, 2012

Make the aoe driver follow expected behavior when the user uses ioctl to
get the ATA device identify information, allowing access to model, serial
number, etc.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

667be1e7

aoe: update driver-internal version number to 60 · cd220bf5

由 Ed Cashin 提交于 12月 17, 2012

Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cd220bf5

aoe: whitespace cleanup · a04b41cd

由 Ed Cashin 提交于 12月 17, 2012

Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a04b41cd

aoe: cleanup: remove unused ata_scnt function · d4379625

由 Ed Cashin 提交于 12月 17, 2012

Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d4379625

aoe: "payload" sysfs file exports per-AoE-command data transfer size · 90a2508d

由 Ed Cashin 提交于 12月 17, 2012

The userland aoetools package includes an "aoe-stat" command that can
display a "payload size" column when the aoe driver exports this
information.  Users can quickly see what amount of user data is
transferred inside each AoE command on the network, network headers
excluded.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

90a2508d

aoe: support larger I/O requests via aoe_maxsectors module param · aa304fde

由 Ed Cashin 提交于 12月 17, 2012

The GPFS filesystem is an example of an aoe user that requires the aoe
driver to support I/O request sizes larger than the default.  Most users
will not need large I/O request sizes, because they would need to be split
up into multiple AoE commands anyway.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

aa304fde

aoe: support the forgetting (flushing) of a user-specified AoE target · 4ba9aa7f

由 Ed Cashin 提交于 12月 17, 2012

Users sometimes want to cause the aoe driver to forget a particular
previously discovered device when it is no longer online.  The aoetools
provide an "aoe-flush" command that users run to perform this
administrative task.  The changes below provide the support needed in the
driver.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4ba9aa7f

aoe: update cap on outstanding commands based on config query response · 1b8a1636

由 Ed Cashin 提交于 12月 17, 2012

The ATA over Ethernet config query response contains a "buffer count"
field reflecting the AoE target's capacity to buffer incoming AoE
commands.

By taking the current value of this field into accound, we increase
performance throughput or avoid network congestion, when the value
has increased or decreased, respectively.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1b8a1636

aoe: print warning regarding a common reason for dropped transmits · 4e78dd14

由 Ed Cashin 提交于 12月 17, 2012

Dropped transmits are not common, but when they do occur, increasing
the transmit queue length often helps.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4e78dd14

aoe: describe the behavior of the "err" character device · 662a8896

由 Ed Cashin 提交于 12月 17, 2012

Signed-off-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

662a8896

Documentation/sparse.txt: document context annotations for lock checking · 6e976631

由 Ed Cashin 提交于 12月 17, 2012

The context feature of sparse is used with the Linux kernel sources to
check for imbalanced uses of locks.  Document the annotations defined in
include/linux/compiler.h that tell sparse what to expect when a lock is
held on function entry, exit, or both.
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
Acked-by: NChristopher Li <sparse@chrisli.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6e976631

linux/compiler.h: add __must_hold macro for functions called with a lock held · 8529091e

由 Josh Triplett 提交于 12月 17, 2012

linux/compiler.h has macros to denote functions that acquire or release
locks, but not to denote functions called with a lock held that return
with the lock still held.  Add a __must_hold macro to cover that case.
Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
Reported-by: NEd Cashin <ecashin@coraid.com>
Tested-by: NEd Cashin <ecashin@coraid.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8529091e

pidns: remove unused is_container_init() · a5ba911e

由 Gao feng 提交于 12月 17, 2012

Since commit 1cdcbec1 ("CRED: Neuter sys_capset()")
is_container_init() has no callers.
Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a5ba911e

exec: use -ELOOP for max recursion depth · d7402698

由 Kees Cook 提交于 12月 17, 2012

To avoid an explosion of request_module calls on a chain of abusive
scripts, fail maximum recursion with -ELOOP instead of -ENOEXEC. As soon
as maximum recursion depth is hit, the error will fail all the way back
up the chain, aborting immediately.

This also has the side-effect of stopping the user's shell from attempting
to reexecute the top-level file as a shell script. As seen in the
dash source:

        if (cmd != path_bshell && errno == ENOEXEC) {
                *argv-- = cmd;
                *argv = cmd = path_bshell;
                goto repeat;
        }

The above logic was designed for running scripts automatically that lacked
the "#!" header, not to re-try failed recursion. On a legitimate -ENOEXEC,
things continue to behave as the shell expects.

Additionally, when tracking recursion, the binfmt handlers should not be
involved. The recursion being tracked is the depth of calls through
search_binary_handler(), so that function should be exclusively responsible
for tracking the depth.
Signed-off-by: NKees Cook <keescook@chromium.org>
Cc: halfdog <me@halfdog.net>
Cc: P J P <ppandit@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d7402698

proc: pid/status: show all supplementary groups · 8d238027

由 Artem Bityutskiy 提交于 12月 17, 2012

We display a list of supplementary group for each process in
/proc/<pid>/status.  However, we show only the first 32 groups, not all of
them.

Although this is rare, but sometimes processes do have more than 32
supplementary groups, and this kernel limitation breaks user-space apps
that rely on the group list in /proc/<pid>/status.

Number 32 comes from the internal NGROUPS_SMALL macro which defines the
length for the internal kernel "small" groups buffer.  There is no
apparent reason to limit to this value.

This patch removes the 32 groups printing limit.

The Linux kernel limits the amount of supplementary groups by NGROUPS_MAX,
which is currently set to 65536.  And this is the maximum count of groups
we may possibly print.
Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: NKees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8d238027

/proc/pid/status: add "Seccomp" field · 2f4b3bf6

由 Kees Cook 提交于 12月 17, 2012

It is currently impossible to examine the state of seccomp for a given
process.  While attaching with gdb and attempting "call
prctl(PR_GET_SECCOMP,...)" will work with some situations, it is not
reliable.  If the process is in seccomp mode 1, this query will kill the
process (prctl not allowed), if the process is in mode 2 with prctl not
allowed, it will similarly be killed, and in weird cases, if prctl is
filtered to return errno 0, it can look like seccomp is disabled.

When reviewing the state of running processes, there should be a way to
externally examine the seccomp mode.  ("Did this build of Chrome end up
using seccomp?" "Did my distro ship ssh with seccomp enabled?")

This adds the "Seccomp" line to /proc/$pid/status.
Signed-off-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: James Morris <jmorris@namei.org>
Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2f4b3bf6

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功