提交 · 644008df899ec252e78db28c1b6d6b86779aada8 · bug2833 / cloud-kernel

04 11月, 2013 4 次提交

random: initialize the last_time field in struct timer_rand_state · 644008df

由 Theodore Ts'o 提交于 11月 03, 2013

Since we initialize jiffies to wrap five minutes before boot (see
INITIAL_JIFFIES defined in include/linux/jiffies.h) it's important to
make sure the last_time field is initialized to INITIAL_JIFFIES.
Otherwise, the entropy estimator will overestimate the amount of
entropy resulting from the first call to add_timer_randomness(),
generally by about 8 bits.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

644008df

random: don't zap entropy count in rand_initialize() · ae9ecd92

由 Theodore Ts'o 提交于 11月 03, 2013

The rand_initialize() function was being run fairly late in the kernel
boot sequence.  This was unfortunate, since it zero'ed the entropy
counters, thus throwing away credit that was accumulated earlier in
the boot sequence, and it also meant that initcall functions run
before rand_initialize were using a minimally initialized pool.

To fix this, fix init_std_data() to no longer zap the entropy counter;
it wasn't necessary, and move rand_initialize() to be an early
initcall.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

ae9ecd92

random: printk notifications for urandom pool initialization · 301f0595

由 Theodore Ts'o 提交于 11月 03, 2013

Print a notification to the console when the nonblocking pool is
initialized.  Also printk a warning when a process tries reading from
/dev/urandom before it is fully initialized.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

301f0595

random: make add_timer_randomness() fill the nonblocking pool first · 40db23e5

由 Theodore Ts'o 提交于 11月 03, 2013

Change add_timer_randomness() so that it directs incoming entropy to
the nonblocking pool first if it hasn't been fully initialized yet.
This matches the strategy we use in add_interrupt_randomness(), which
allows us to push the randomness where we need it the most during when
the system is first booting up, so that get_random_bytes() and
/dev/urandom become safe to use as soon as possible.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

40db23e5

11 10月, 2013 14 次提交

random: convert DEBUG_ENT to tracepoints · f80bbd8b

由 Theodore Ts'o 提交于 10月 03, 2013

Instead of using the random driver's ad-hoc DEBUG_ENT() mechanism, use
tracepoints instead. This allows for a much more fine-grained control
of which debugging mechanism which a developer might need, and unifies
the debugging messages with all of the existing tracepoints.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f80bbd8b

random: push extra entropy to the output pools · 6265e169

由 Theodore Ts'o 提交于 10月 03, 2013

As the input pool gets filled, start transfering entropy to the output
pools until they get filled.  This allows us to use the output pools
to store more system entropy.  Waste not, want not....
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

6265e169

random: drop trickle mode · 95b709b6

由 Theodore Ts'o 提交于 10月 02, 2013

The add_timer_randomness() used to drop into trickle mode when entropy
pool was estimated to be 87.5% full.  This was important when
add_timer_randomness() was used to sample interrupts.  It's not used
for this any more --- add_interrupt_randomness() now uses fast_mix()
instead.  By elimitating trickle mode, it allows us to fully utilize
entropy provided by add_input_randomness() and add_disk_randomness()
even when the input pool is above the old trickle threshold of 87.5%.

This helps to answer the criticism in [1] in their hypothetical
scenario where our entropy estimator was inaccurate, even though the
measurements in [2] seem to indicate that our entropy estimator given
real-life entropy collection is actually pretty good, albeit on the
conservative side (which was as it was designed).

[1] http://eprint.iacr.org/2013/338.pdf
[2] http://eprint.iacr.org/2012/251.pdfSigned-off-by: N"Theodore Ts'o" <tytso@mit.edu>

95b709b6

random: adjust the generator polynomials in the mixing function slightly · 6e9fa2c8

由 Theodore Ts'o 提交于 9月 22, 2013

Our mixing functions were analyzed by Lacharme, Roeck, Strubel, and
Videau in their paper, "The Linux Pseudorandom Number Generator
Revisited" (see: http://eprint.iacr.org/2012/251.pdf).

They suggested a slight change to improve our mixing functions
slightly.  I also adjusted the comments to better explain what is
going on, and to document why the polynomials were changed.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

6e9fa2c8

random: speed up the fast_mix function by a factor of four · 655b2264

由 Theodore Ts'o 提交于 9月 22, 2013

By mixing the entropy in chunks of 32-bit words instead of byte by
byte, we can speed up the fast_mix function significantly.  Since it
is called on every single interrupt, on systems with a very heavy
interrupt load, this can make a noticeable difference.

Also fix a compilation warning in add_interrupt_randomness() and avoid
xor'ing cycles and jiffies together just in case we have an
architecture which tries to define random_get_entropy() by returning
jiffies.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reported-by: NJörn Engel <joern@logfs.org>

655b2264

random: cap the rate which the /dev/urandom pool gets reseeded · f5c2742c

由 Theodore Ts'o 提交于 9月 22, 2013

In order to avoid draining the input pool of its entropy at too high
of a rate, enforce a minimum time interval between reseedings of the
urandom pool.  This is set to 60 seconds by default.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f5c2742c

random: optimize the entropy_store structure · c59974ae

由 Theodore Ts'o 提交于 9月 21, 2013

Use smaller types to slightly shrink the size of the entropy store
structure.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

c59974ae

random: optimize spinlock use in add_device_randomness() · 3ef4cb2d

由 Theodore Ts'o 提交于 9月 12, 2013

The add_device_randomness() function calls mix_pool_bytes() twice for
the input pool and the non-blocking pool, for a total of four times.
By using _mix_pool_byte() and taking the spinlock in
add_device_randomness(), we can halve the number of times we need
take each pool's spinlock.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3ef4cb2d

random: fix the tracepoint for get_random_bytes(_arch) · 5910895f

由 Theodore Ts'o 提交于 9月 12, 2013

Fix a problem where get_random_bytes_arch() was calling the tracepoint
get_random_bytes().  So add a new tracepoint for
get_random_bytes_arch(), and make get_random_bytes() and
get_random_bytes_arch() call their correct tracepoint.

Also, add a new tracepoint for add_device_randomness()
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

5910895f

random: account for entropy loss due to overwrites · 30e37ec5

由 H. Peter Anvin 提交于 9月 10, 2013

When we write entropy into a non-empty pool, we currently don't
account at all for the fact that we will probabilistically overwrite
some of the entropy in that pool.  This means that unless the pool is
fully empty, we are currently *guaranteed* to overestimate the amount
of entropy in the pool!

Assuming Shannon entropy with zero correlations we end up with an
exponentally decaying value of new entropy added:

	entropy <- entropy + (pool_size - entropy) *
		(1 - exp(-add_entropy/pool_size))

However, calculations involving fractional exponentials are not
practical in the kernel, so apply a piecewise linearization:

	  For add_entropy <= pool_size/2 then

	  (1 - exp(-add_entropy/pool_size)) >= (add_entropy/pool_size)*0.7869...

	  ... so we can approximate the exponential with
	  3/4*add_entropy/pool_size and still be on the
	  safe side by adding at most pool_size/2 at a time.

In order for the loop not to take arbitrary amounts of time if a bad
ioctl is received, terminate if we are within one bit of full.  This
way the loop is guaranteed to terminate after no more than
log2(poolsize) iterations, no matter what the input value is.  The
vast majority of the time the loop will be executed exactly once.

The piecewise linearization is very conservative, approaching 3/4 of
the usable input value for small inputs, however, our entropy
estimation is pretty weak at best, especially for small values; we
have no handle on correlation; and the Shannon entropy measure (Rényi
entropy of order 1) is not the correct one to use in the first place,
but rather the correct entropy measure is the min-entropy, the Rényi
entropy of infinite order.

As such, this conservatism seems more than justified.

This does introduce fractional bit values.  I have left it to have 3
bits of fraction, so that with a pool of 2^12 bits the multiply in
credit_entropy_bits() can still fit into an int, as 2*(3+12) < 31.  It
is definitely possible to allow for more fractional accounting, but
that multiply then would have to be turned into a 32*32 -> 64 multiply.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: DJ Johnston <dj.johnston@intel.com>

30e37ec5

random: allow fractional bits to be tracked · a283b5c4

由 H. Peter Anvin 提交于 9月 10, 2013

Allow fractional bits of entropy to be tracked by scaling the entropy
counter (fixed point).  This will be used in a subsequent patch that
accounts for entropy lost due to overwrites.

[ Modified by tytso to fix up a few missing places where the
  entropy_count wasn't properly converted from fractional bits to
  bits. ]
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

a283b5c4

random: statically compute poolbitshift, poolbytes, poolbits · 9ed17b70

由 H. Peter Anvin 提交于 9月 10, 2013

Use a macro to statically compute poolbitshift (will be used in a
subsequent patch), poolbytes, and poolbits.  On virtually all
architectures the cost of a memory load with an offset is the same as
the one of a memory load.

It is still possible for this to generate worse code since the C
compiler doesn't know the fixed relationship between these fields, but
that is somewhat unlikely.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

9ed17b70

random: mix in architectural randomness earlier in extract_buf() · 85a1f777

由 Theodore Ts'o 提交于 9月 21, 2013

Previously if CPU chip had a built-in random number generator (i.e.,
RDRAND on newer x86 chips), we mixed it in at the very end of
extract_buf() using an XOR operation.

We now mix it in right after the calculate a hash across the entire
pool.  This has the advantage that any contribution of entropy from
the CPU's HWRNG will get mixed back into the pool.  In addition, it
means that if the HWRNG has any defects (either accidentally or
maliciously introduced), this will be mitigated via the non-linear
transform of the SHA-1 hash function before we hand out generated
output.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

85a1f777

random: allow architectures to optionally define random_get_entropy() · 61875f30

由 Theodore Ts'o 提交于 9月 21, 2013

Allow architectures which have a disabled get_cycles() function to
provide a random_get_entropy() function which provides a fine-grained,
rapidly changing counter that can be used by the /dev/random driver.

For example, an architecture might have a rapidly changing register
used to control random TLB cache eviction, or DRAM refresh that
doesn't meet the requirements of get_cycles(), but which is good
enough for the needs of the random driver.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

61875f30

23 9月, 2013 1 次提交

random: run random_int_secret_init() run after all late_initcalls · 47d06e53

由 Theodore Ts'o 提交于 9月 10, 2013

The some platforms (e.g., ARM) initializes their clocks as
late_initcalls for some unknown reason.  So make sure
random_int_secret_init() is run after all of the late_initcalls are
run.

Cc: stable@vger.kernel.org
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

47d06e53

03 9月, 2013 4 次提交

L

Linux 3.11 · 6e466452
由 Linus Torvalds 提交于 9月 02, 2013

6e466452

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 248d296d

由 Linus Torvalds 提交于 9月 02, 2013

Pull SCSI fix from James Bottomley:
 "This is a bug fix for the pm80xx driver.  It turns out that when the
  new hardware support was added in 3.10 the IO command size was kept at
  the old hard coded value.  This means that the driver attaches to some
  new cards and then simply hangs the system"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  [SCSI] pm80xx: fix Adaptec 71605H hang

248d296d

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e09a1fa9

由 Linus Torvalds 提交于 9月 02, 2013

Pull x86 boot fix from Peter Anvin:
 "A single very small boot fix for very large memory systems (> 0.5T)"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix boot crash with DEBUG_PAGE_ALLOC=y and more than 512G RAM

e09a1fa9

Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma · ac0bc789

由 Linus Torvalds 提交于 9月 02, 2013

Pull slave-dma fix from Vinod Koul:
 "A fix for resolving TI_EDMA driver's build error in allmodconfig to
  have filter function built in""

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dma/Kconfig: TI_EDMA needs to be boolean

ac0bc789

31 8月, 2013 15 次提交

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · a8787645

由 Linus Torvalds 提交于 8月 30, 2013

Pull networking fixes from David Miller:

 1) There was a simplification in the ipv6 ndisc packet sending
    attempted here, which avoided using memory accounting on the
    per-netns ndisc socket for sending NDISC packets.  It did fix some
    important issues, but it causes regressions so it gets reverted here
    too.  Specifically, the problem with this change is that the IPV6
    output path really depends upon there being a valid skb->sk
    attached.

    The reason we want to do this change in some form when we figure out
    how to do it right, is that if a device goes down the ndisc_sk
    socket send queue will fill up and block NDISC packets that we want
    to send to other devices too.  That's really bad behavior.

    Hopefully Thomas can come up with a better version of this change.

 2) Fix a severe TCP performance regression by reverting a change made
    to dev_pick_tx() quite some time ago.  From Eric Dumazet.

 3) TIPC returns wrongly signed error codes, fix from Erik Hugne.

 4) Fix OOPS when doing IPSEC over ipv4 tunnels due to orphaning the
    skb->sk too early.  Fix from Li Hongjun.

 5) RAW ipv4 sockets can use the wrong routing key during lookup, from
    Chris Clark.

 6) Similar to #1 revert an older change that tried to use plain
    alloc_skb() for SYN/ACK TCP packets, this broke the netfilter owner
    mark which needs to see the skb->sk for such frames.  From Phil
    Oester.

 7) BNX2x driver bug fixes from Ariel Elior and Yuval Mintz,
    specifically in the handling of virtual functions.

 8) IPSEC path error propagations to sockets is not done properly when
    we have v4 in v6, and v6 in v4 type rules.  Fix from Hannes Frederic
    Sowa.

 9) Fix missing channel context release in mac80211, from Johannes Berg.

10) Fix network namespace handing wrt.  SCM_RIGHTS, from Andy
    Lutomirski.

11) Fix usage of bogus NAPI weight in jme, netxen, and ps3_gelic
    drivers.  From Michal Schmidt.

12) Hopefully a complete and correct fix for the genetlink dump locking
    and module reference counting.  From Pravin B Shelar.

13) sk_busy_loop() must do a cpu_relax(), from Eliezer Tamir.

14) Fix handling of timestamp offset when restoring a snapshotted TCP
    socket.  From Andrew Vagin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  net: fec: fix time stamping logic after napi conversion
  net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delay
  mISDN: return -EINVAL on error in dsp_control_req()
  net: revert 8728c544 ("net: dev_pick_tx() fix")
  Revert "ipv6: Don't depend on per socket memory for neighbour discovery messages"
  ipv4 tunnels: fix an oops when using ipip/sit with IPsec
  tipc: set sk_err correctly when connection fails
  tcp: tcp_make_synack() should use sock_wmalloc
  bridge: separate querier and query timer into IGMP/IPv4 and MLD/IPv6 ones
  ipv6: Don't depend on per socket memory for neighbour discovery messages
  ipv4: sendto/hdrincl: don't use destination address found in header
  tcp: don't apply tsoffset if rcv_tsecr is zero
  tcp: initialize rcv_tstamp for restored sockets
  net: xilinx: fix memleak
  net: usb: Add HP hs2434 device to ZLP exception table
  net: add cpu_relax to busy poll loop
  net: stmmac: fixed the pbl setting with DT
  genl: Hold reference on correct module while netlink-dump.
  genl: Fix genl dumpit() locking.
  xfrm: Fix potential null pointer dereference in xdst_queue_output
  ...

a8787645

MAINTAINERS: change my DT related maintainer address · de80963e

由 Ian Campbell 提交于 8月 30, 2013

Filtering capabilities on my work email are pretty much non-existent and this
has turned out to be something of a firehose...

Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Acked-by: NPawel Moll <pawel.moll@arm.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

de80963e

Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 936dbcc3

由 Linus Torvalds 提交于 8月 30, 2013

Pull sound fixes from Takashi Iwai:
 "This contains two Oops fixes (opti9xx and HD-audio) and a simple fixup
  for an Acer laptop.  All marked as stable patches"

* tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: opti9xx: Fix conflicting driver object name
  ALSA: hda - Fix NULL dereference with CONFIG_SND_DYNAMIC_MINORS=n
  ALSA: hda - Add inverted digital mic fixup for Acer Aspire One

936dbcc3

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · d9eda0fa

由 Linus Torvalds 提交于 8月 30, 2013

Pull ARM SoC fixes from Olof Johansson:
 "Two straggling fixes that I had missed as they were posted a couple of
  weeks ago, causing problems with interrupts (breaking them completely)
  on the CSR SiRF platforms"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm: prima2: drop nr_irqs in mach as we moved to linear irqdomain
  irqchip: sirf: move from legacy mode to linear irqdomain

d9eda0fa

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 418a95bc

由 Linus Torvalds 提交于 8月 30, 2013

Pull drm fixes from Dave Airlie:
 "Since we are getting to the pointy end, one i915 black screen on some
  machines, and one vmwgfx stop userspace ability to nuke the VM,

  There might be one or two ati or nouveau fixes trickle in before
  final, but I think this should pretty much be it"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/vmwgfx: Split GMR2_REMAP commands if they are to large
  drm/i915: ivb: fix edp voltage swing reg val

418a95bc

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 155e3a35

由 Linus Torvalds 提交于 8月 30, 2013

Pull input layer updates from Dmitry Torokhov:
 "Just a couple of new IDs in Wacom and xpad drivers, i8042 is now
  disabled on ARC, and data checks in Elantech driver that were overly
  relaxed by the previous patch are now tightened"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: i8042 - disable the driver on ARC platforms
  Input: xpad - add signature for Razer Onza Classic Edition
  Input: elantech - fix packet check for v3 and v4 hardware
  Input: wacom - add support for 0x300 and 0x301

155e3a35

net: fec: fix time stamping logic after napi conversion · 0affdf34

由 Richard Cochran 提交于 8月 30, 2013

Commit dc975382 "net: fec: add napi support to improve proformance"
converted the fec driver to the napi model. However, that commit
forgot to remove the call to skb_defer_rx_timestamp which is only
needed in non-napi drivers.

(The function napi_gro_receive eventually calls netif_receive_skb,
which in turn calls skb_defer_rx_timestamp.)

This patch should also be applied to the 3.9 and 3.10 kernels.
Signed-off-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0affdf34

net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delay · 2d98c29b

由 Daniel Borkmann 提交于 8月 29, 2013

While looking into MLDv1/v2 code, I noticed that bridging code does
not convert it's max delay into jiffies for MLDv2 messages as we do
in core IPv6' multicast code.

RFC3810, 5.1.3. Maximum Response Code says:

  The Maximum Response Code field specifies the maximum time allowed
  before sending a responding Report. The actual time allowed, called
  the Maximum Response Delay, is represented in units of milliseconds,
  and is derived from the Maximum Response Code as follows: [...]

As we update timers that work with jiffies, we need to convert it.
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Cc: Linus Lüssing <linus.luessing@web.de>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d98c29b

mISDN: return -EINVAL on error in dsp_control_req() · 0d63c27d

由 Dan Carpenter 提交于 8月 29, 2013

If skb->len is too short then we should return an error.  Otherwise we
read beyond the end of skb->data for several bytes.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d63c27d

net: revert ("net: dev_pick_tx() fix") · 702821f4

由 Eric Dumazet 提交于 8月 28, 2013

commit 8728c544 ("net: dev_pick_tx() fix") and commit
b6fe83e9 ("bonding: refine IFF_XMIT_DST_RELEASE capability")
are quite incompatible : Queue selection is disabled because skb
dst was dropped before entering bonding device.

This causes major performance regression, mainly because TCP packets
for a given flow can be sent to multiple queues.

This is particularly visible when using the new FQ packet scheduler
with MQ + FQ setup on the slaves.

We can safely revert the first commit now that 416186fb
("net: Split core bits of netdev_pick_tx into __netdev_pick_tx")
properly caps the queue_index.
Reported-by: NXi Wang <xii@google.com>
Diagnosed-by: NXi Wang <xii@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Denys Fedorysychenko <nuclearcat@nuclearcat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

702821f4

Revert "ipv6: Don't depend on per socket memory for neighbour discovery messages" · 25ad6117

由 David S. Miller 提交于 8月 30, 2013

This reverts commit 1f324e38.

It seems to cause regressions, and in particular the output path
really depends upon there being a socket attached to skb->sk for
checks such as sk_mc_loop(skb->sk) for example.  See ip6_output_finish2().
Reported-by: NStephen Warren <swarren@wwwdotorg.org>
Reported-by: NFabio Estevam <festevam@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

25ad6117

ipv4 tunnels: fix an oops when using ipip/sit with IPsec · 737e828b

由 Li Hongjun 提交于 8月 28, 2013

Since commit 3d7b46cd (ip_tunnel: push generic protocol handling to
ip_tunnel module.), an Oops is triggered when an xfrm policy is configured on
an IPv4 over IPv4 tunnel.

xfrm4_policy_check() calls __xfrm_policy_check2(), which uses skb_dst(skb). But
this field is NULL because iptunnel_pull_header() calls skb_dst_drop(skb).
Signed-off-by: NLi Hongjun <hongjun.li@6wind.com>
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

737e828b

tipc: set sk_err correctly when connection fails · 2c8d8518

由 Erik Hugne 提交于 8月 28, 2013

Should a connect fail, if the publication/server is unavailable or
due to some other error, a positive value will be returned and errno
is never set. If the application code checks for an explicit zero
return from connect (success) or a negative return (failure), it
will not catch the error and subsequent send() calls will fail as
shown from the strace snippet below.

socket(0x1e /* PF_??? */, SOCK_SEQPACKET, 0) = 3
connect(3, {sa_family=0x1e /* AF_??? */, sa_data="\2\1\322\4\0\0\322\4\0\0\0\0\0\0"}, 16) = 111
sendto(3, "test", 4, 0, NULL, 0)        = -1 EPIPE (Broken pipe)

The reason for this behaviour is that TIPC wrongly inverts error
codes set in sk_err.
Signed-off-by: NErik Hugne <erik.hugne@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c8d8518

tcp: tcp_make_synack() should use sock_wmalloc · eb8895de

由 Phil Oester 提交于 8月 27, 2013

In commit 90ba9b19 (tcp: tcp_make_synack() can use alloc_skb()), Eric changed
the call to sock_wmalloc in tcp_make_synack to alloc_skb.  In doing so,
the netfilter owner match lost its ability to block the SYNACK packet on
outbound listening sockets.  Revert the change, restoring the owner match
functionality.

This closes netfilter bugzilla #847.
Signed-off-by: NPhil Oester <kernel@linuxace.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eb8895de

bridge: separate querier and query timer into IGMP/IPv4 and MLD/IPv6 ones · cc0fdd80

由 Linus Lüssing 提交于 8月 30, 2013

Currently we would still potentially suffer multicast packet loss if there
is just either an IGMP or an MLD querier: For the former case, we would
possibly drop IPv6 multicast packets, for the latter IPv4 ones. This is
because we are currently assuming that if either an IGMP or MLD querier
is present that the other one is present, too.

This patch makes the behaviour and fix added in
"bridge: disable snooping if there is no querier" (b00589af)
to also work if there is either just an IGMP or an MLD querier on the
link: It refines the deactivation of the snooping to be protocol
specific by using separate timers for the snooped IGMP and MLD queries
as well as separate timers for our internal IGMP and MLD queriers.
Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cc0fdd80

30 8月, 2013 2 次提交

Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 41615e81

由 Linus Torvalds 提交于 8月 29, 2013

Pull cgroup fix from Tejun Heo:
 "During the percpu reference counting update which was merged during
  v3.11-rc1, the cgroup destruction path was updated so that a cgroup in
  the process of dying may linger on the children list, which was
  necessary as the cgroup should still be included in child/descendant
  iteration while percpu ref is being killed.

  Unfortunately, I forgot to update cgroup destruction path accordingly
  and cgroup destruction may fail spuriously with -EBUSY due to
  lingering dying children even when there's no live child left - e.g.
  "rmdir parent/child parent" will usually fail.

  This can be easily fixed by iterating through the children list to
  verify that there's no live child left.  While this is very late in
  the release cycle, this bug is very visible to userland and I believe
  the fix is relatively safe.

  Thanks Hugh for spotting and providing fix for the issue"

* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix rmdir EBUSY regression in 3.11

41615e81

Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · ff497452

由 Linus Torvalds 提交于 8月 29, 2013

Pull workqueue fix from Tejun Heo:
 "This contains one fix which could lead to system-wide lockup on
  !PREEMPT kernels.  It's very late in the cycle but this definitely is
  a -stable material.

  The problem is that workqueue worker tasks may process unlimited
  number of work items back-to-back without every yielding inbetween.
  This usually isn't noticeable but a work item which re-queues itself
  waiting for someone else to do something can deadlock with
  stop_machine.  stop_machine will ensure nothing else happens on all
  other cpus and the requeueing work item will reqeueue itself
  indefinitely without ever yielding and thus preventing the CPU from
  entering stop_machine.

  Kudos to Jamie Liu for spotting and diagnosing the problem.  This can
  be trivially fixed by adding cond_resched() after processing each work
  item"

* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: cond_resched() after processing each work item

ff497452

bug2833 / cloud-kernel 与 Fork 源项目一致

bug2833 / cloud-kernel
与 Fork 源项目一致