Commits · 8339ee543ece6e2dcc1bbd97d5350163c198cf00 · openanolis / cloud-kernel

21 Oct, 2011 13 commits

GFS2: Make resource groups "append only" during life of fs · 8339ee54

Authored by Steven Whitehouse on Aug 31, 2011

Since we have ruled out supporting online filesystem shrink,
it is possible to make the resource group list append only
during the life of a super block. This gives several benefits:

Firstly, we only need to read new rindex elements as they are added
rather than needing to reread the whole rindex file each time one
element is added.

Secondly, the rindex glock can be held for much shorter periods of
time, and is completely removed from the fast path for allocations.
The lock is taken in shared mode only when updating the resource
groups when the first allocation occurs, and after a grow has
taken place.

Thirdly, this results in a reduction in code size, and everything
gets a lot simpler to understand in this area.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
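
The first benefit listed above can be illustrated with a small user-space sketch. This is not GFS2 code: `struct rindex_entry`, `struct rg_list` and `rg_list_update()` are hypothetical names, and an in-memory array stands in for the rindex file. It only shows that, with an append-only list, a reader can resume from the number of entries it has already seen rather than rescanning from the start.

```c
#include <stddef.h>
#include <string.h>

#define MAX_RGRPS 64

/* Hypothetical on-disk record; the real rindex entry has more fields. */
struct rindex_entry {
    unsigned long long addr;   /* start block of the resource group */
    unsigned int length;       /* length in blocks */
};

/* Hypothetical in-memory list of resource groups already read. */
struct rg_list {
    struct rindex_entry rg[MAX_RGRPS];
    size_t count;              /* entries read so far; only ever grows */
};

/*
 * Read only the entries added since the last call. "file" and
 * "file_entries" stand in for the rindex file and its current size.
 * Returns the number of new entries read.
 */
size_t rg_list_update(struct rg_list *list,
                      const struct rindex_entry *file, size_t file_entries)
{
    size_t added = 0;

    while (list->count < file_entries && list->count < MAX_RGRPS) {
        /* Resume at the first unseen entry; earlier ones never change. */
        memcpy(&list->rg[list->count], &file[list->count],
               sizeof(struct rindex_entry));
        list->count++;
        added++;
    }
    return added;
}
```

Calling `rg_list_update()` again after a grow picks up only the appended entries; a second call with an unchanged size does no work.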

8339ee54

GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme · 7c9ca621

Authored by Bob Peterson on Aug 31, 2011

Here is an update of Bob's original rbtree patch which, in addition, also
resolves the rather strange ref counting that was being done relating to
the bitmap blocks.

Originally we had a dual system for journaling resource groups. The metadata
blocks were journaled and also the rgrp itself was added to a list. The reason
for adding the rgrp to the list in the journal was so that the "repolish
clones" code could be run to update the free space, and potentially send any
discard requests when the log was flushed. This was done by comparing the
"cloned" bitmap with what had been written back on disk during the transaction
commit.

Due to this, there was a requirement to hang on to the rgrps' bitmap buffers
until the journal had been flushed. For that reason, there was a rather
complicated set up in the ->go_lock ->go_unlock functions for rgrps involving
both a mutex and a spinlock (the ->sd_rindex_spin) to maintain a reference
count on the buffers.

However, the journal maintains a reference count on the buffers anyway, since
they are being journaled as metadata buffers. So by moving the code which deals
with the post-journal accounting for bitmap blocks to the metadata journaling
code, we can entirely dispense with the rather strange buffer ref counting
scheme and also the requirement to journal the rgrps.

The net result of all this is that the ->sd_rindex_spin is left to do exactly
one job, and that is to look after the rbtree of rgrps.

This patch is designed to be a stepping stone towards using RCU for the rbtree
of resource groups, however the reduction in the number of uses of the
->sd_rindex_spin is likely to have benefits for multi-threaded workloads,
anyway.

The patch retains ->go_lock and ->go_unlock for rgrps, however these may also
be removed in future in favour of calling the functions directly where required
in the code. That will allow locking of resource groups without needing to
actually read them in - something that could be useful in speeding up statfs.

In the meantime though it is valid to dereference ->bi_bh only when the rgrp
is locked. This is basically the same rule as before, modulo the references not
being valid until the following journal flush.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Cc: Benjamin Marzinski <bmarzins@redhat.com>

7c9ca621

GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added · 9453615a

Authored by Steven Whitehouse on Aug 23, 2011

We need to take the inode's glock whenever the inode's size
is referenced, otherwise it might not be uptodate. Even
though generic_file_llseek_unlocked() doesn't implement
SEEK_DATA, SEEK_HOLE directly, it does reference the inode's
size in those cases, so we need to add them to the list
of origins which need the glock.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>

9453615a

GFS2: Clean up gfs2_create · 9a63edd1

Authored by Steven Whitehouse on Aug 18, 2011

If we pass through knowledge of whether the creation is intended to be
exclusive or not, then we can deal with that in gfs2_create_inode
and remove one set of locking. Also this removes the loop in
gfs2_create and simplifies the code a bit.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

9a63edd1

GFS2: Use ->dirty_inode() · ab9bbda0

Authored by Steven Whitehouse on Aug 15, 2011

The aim of this patch is to use the newly enhanced ->dirty_inode()
super block operation to deal with atime updates, rather than
piggybacking that code onto ->write_inode() as is currently
done.

The net result is a simplification of the code in various places
and a reduction of the number of gfs2_dinode_out() calls since
this is now implied by ->dirty_inode().

Some of the mark_inode_dirty() calls have been moved under glocks
in order to take advantage of then being able to avoid locking in
->dirty_inode() when we already have suitable locks.

One consequence is that generic_write_end() now correctly deals
with file size updates, so that we do not need a separate check
for that afterwards. This also, indirectly, means that fdatasync
should work correctly on GFS2 - the current code always syncs the
metadata whether it needs to or not.

Has survived testing with postmark (with and without atime) and
also fsx.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

ab9bbda0

GFS2: Fix bug trap and journaled data fsync · f1818529

Authored by Steven Whitehouse on Aug 05, 2011

Journaled data requires that a complete flush of all dirty data for
the file is done, in order that the ail flush which comes after
will succeed.

Also the recently enhanced bug trap can trigger falsely in case
an ail flush from fsync races with a page read. This updates the
bug trap such that it will ignore buffers which are locked and
only trigger on dirty and/or pinned buffers when the ail flush
is run from fsync. The original bug trap is retained when ail
flush is run from ->go_sync().
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

f1818529

GFS2: Fix inode allocation error path · 40ac218f

Authored by Steven Whitehouse on Aug 02, 2011

If we have got far enough through the inode allocation code
path that an inode has already been allocated, then we must
call iput to dispose of it, if an error occurs during a
later part of the process. This will always be the final iput
since there will be no other references to the inode.

Unlike when the inode has been unlinked, its block state will
be GFS2_BLKST_INODE rather than GFS2_BLKST_UNLINKED so we need
to skip the test in ->evict_inode() for this one case in order
to ensure that it will be deallocated correctly. This patch adds
a new flag in order to ensure that this will happen correctly.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

40ac218f

GFS2: Make atime checks more efficient · 1d4ec642

Authored by Steven Whitehouse on Aug 02, 2011

We do not need to start a transaction unless the atime
check has proved positive. Also if we are going to flush
the complete ail list anyway, we might as well skip the
writeback for this specific inode's metadata, since that
will be done as part of the ail writeback process in an
order offering potentially more efficient I/O.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

1d4ec642

GFS2: Fix bug-trap in ail flush code · 75549186

Authored by Steven Whitehouse on Aug 02, 2011

The assert was being tested under the wrong lock, a
legacy of the original code. Also, if it did trigger,
the resulting information was not always a lot of help.

This moves the assert under the correct lock and also
prints out more useful information for tracking down the
source of the problem.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

75549186

GFS2: Split data write & wait in fsync · 2f0264d5

Authored by Steven Whitehouse on Jul 27, 2011

Now that the data writing is part of fsync proper, we can split
the waiting part out and do it later on. This reduces the
number of waits that we do during fsync on average.

There is also no need to take the i_mutex unless we are flushing
metadata to disk, so we can move that to within the metadata
flushing code.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

2f0264d5

GFS2: Clean up dir hash table reading · 4c28d338

Authored by Steven Whitehouse on Jul 26, 2011

Since there is now only a single caller to gfs2_dir_read_data()
and it has a number of constant arguments, we can factor
those out. Also some tests relating to the inode size were
being done twice.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

4c28d338

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · fd11e153

Authored by Linus Torvalds on Oct 20, 2011

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Add alignment flag to PCI expansion resources
  sparc: Avoid calling sigprocmask()
  sparc: Use set_current_blocked()
  sparc32,leon: SRMMU MMU Table probe fix

fd11e153

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 505f48b5

Authored by Linus Torvalds on Oct 20, 2011

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  fib_rules: fix unresolved_rules counting
  r8169: fix wrong eee setting for rlt8111evl
  r8169: fix driver shutdown WoL regression.
  ehea: Change maintainer to me
  pptp: pptp_rcv_core() misses pskb_may_pull() call
  tproxy: copy transparent flag when creating a time wait
  pptp: fix skb leak in pptp_xmit()
  bonding: use local function pointer of bond->recv_probe in bond_handle_frame
  smsc911x: Add support for SMSC LAN89218
  tg3: negate USE_PHYLIB flag check
  netconsole: enable netconsole can make net_device refcnt incorrent
  bluetooth: Properly clone LSM attributes to newly created child connections
  l2tp: fix a potential skb leak in l2tp_xmit_skb()
  bridge: fix hang on removal of bridge via netlink
  x25: Prevent skb overreads when checking call user data
  x25: Handle undersized/fragmented skbs
  x25: Validate incoming call user data lengths
  udplite: fast-path computation of checksum coverage
  IPVS netns shutdown/startup dead-lock
  netfilter: nf_conntrack: fix event flooding in GRE protocol tracker

505f48b5

20 Oct, 2011 6 commits

mm: fix race between mremap and removing migration entry · 486cf46f

Authored by Hugh Dickins on Oct 19, 2011

I don't usually pay much attention to the stale "? " addresses in
stack backtraces, but this lucky report from Pawel Sikora hints that
mremap's move_ptes() has inadequate locking against page migration.

 3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page():
 kernel BUG at include/linux/swapops.h:105!
 RIP: 0010:[<ffffffff81127b76>]  [<ffffffff81127b76>]
                       migration_entry_wait+0x156/0x160
  [<ffffffff811016a1>] handle_pte_fault+0xae1/0xaf0
  [<ffffffff810feee2>] ? __pte_alloc+0x42/0x120
  [<ffffffff8112c26b>] ? do_huge_pmd_anonymous_page+0xab/0x310
  [<ffffffff81102a31>] handle_mm_fault+0x181/0x310
  [<ffffffff81106097>] ? vma_adjust+0x537/0x570
  [<ffffffff81424bed>] do_page_fault+0x11d/0x4e0
  [<ffffffff81109a05>] ? do_mremap+0x2d5/0x570
  [<ffffffff81421d5f>] page_fault+0x1f/0x30

mremap's down_write of mmap_sem, together with i_mmap_mutex or lock,
and pagetable locks, were good enough before page migration (with its
requirement that every migration entry be found) came in, and enough
while migration always held mmap_sem; but not enough nowadays, when
there's memory hotremove and compaction.

The danger is that move_ptes() lets a migration entry dodge around
behind remove_migration_pte()'s back, so it's in the old location when
looking at the new, then in the new location when looking at the old.

Either mremap's move_ptes() must additionally take anon_vma lock(), or
migration's remove_migration_pte() must stop peeking for is_swap_entry()
before it takes pagetable lock.

Consensus chooses the latter: we prefer to add overhead to migration
than to mremapping, which gets used by JVMs and by exec stack setup.
Reported-and-tested-by: Paweł Sikora <pluto@agmk.net>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

486cf46f

sparc: Add alignment flag to PCI expansion resources · aad45644

Authored by Kjetil Oftedal on Oct 19, 2011

Currently no type of alignment is specified for PCI expansion ROMs while
parsing the OpenFirmware tree. This causes calls to pci_map_rom() to fail.
IORESOURCE_SIZEALIGN is the default alignment used for ROM resources in
pci/probe.c, and has been verified to work with various cards on an Ultra 10.
Signed-off-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aad45644

fib_rules: fix unresolved_rules counting · afaef734

Authored by Yan, Zheng on Oct 17, 2011

We should decrement ops->unresolved_rules when deleting an unresolved rule.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

afaef734

r8169: fix wrong eee setting for rlt8111evl · 1b23a3e3

Authored by hayeswang on Oct 13, 2011

Correct the wrong parameter for setting EEE for RTL8111E-VL.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1b23a3e3

r8169: fix driver shutdown WoL regression. · 649b3b8c

Authored by françois romieu on Oct 14, 2011

Due to commit 92fc43b4 ("r8169: modify the
flow of the hw reset."), rtl8169_hw_reset stomps during driver shutdown on
RxConfig bits which are needed for WOL on some versions of the hardware.

As these bits were formerly set from the r81{0x, 68}_pll_power_down methods,
factor them out for use in the driver shutdown (rtl_shutdown) handler.

I favored __rtl8169_get_wol() -hardware state indication- over
RTL_FEATURE_WOL as the latter has become a good candidate for removal.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
Tested-by: Marc Ballarin <ballarin.marc@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

649b3b8c

ehea: Change maintainer to me · 34b1901a

Authored by Thadeu Lima de Souza Cascardo on Oct 13, 2011

Breno Leitao has passed the maintainership to me.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: Breno Leitao <leitao@linux.vnet.ibm.com>
Acked-by: Breno Leitão <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

34b1901a

19 Oct, 2011 14 commits

Merge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus · e4fcd69c

Authored by Linus Torvalds on Oct 19, 2011

* 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus:
  [media] videodev: fix a NULL pointer dereference in v4l2_device_release()

e4fcd69c

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · f91f6cfd

Authored by Linus Torvalds on Oct 19, 2011

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms/atom: fix handling of FB scratch indices
  drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)
  drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls
  drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping
  ttm: Fix error-path using an uninitialized value

f91f6cfd

[media] videodev: fix a NULL pointer dereference in v4l2_device_release() · e58fced2

Authored by Antonio Ospite on Oct 12, 2011

The change in 8280b662 does not cover the case where v4l2_dev is already
NULL; fix that.

With a Kinect sensor, seen as a USB camera using GSPCA in this context,
a NULL pointer dereference BUG can be triggered by just unplugging the
device after the camera driver has been loaded.
Signed-off-by: Antonio Ospite <ospite@studenti.unina.it>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

e58fced2

drm/radeon/kms/atom: fix handling of FB scratch indices · 5a6e8482

Authored by Alex Deucher on Oct 18, 2011

FB scratch indices are dword indices, but we were treating
them as byte indices.  As such, we were getting the wrong
FB scratch data for non-0 indices.  Fix the indices and
guard the indexing against indices larger than the scratch
allocation.

Fixes memory corruption on some boards if data was written
past the end of the FB scratch array.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
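
The dword-versus-byte confusion described above can be sketched in plain C. This is not the radeon code: `scratch_read()` and its parameters are hypothetical, and an ordinary array stands in for the FB scratch area. The sketch treats the index as a count of 32-bit words and rejects indices beyond the allocation, which are the two fixes the message describes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read one 32-bit word from a scratch area.
 * "scratch_bytes" is the size of the allocation in bytes and
 * "dword_idx" is an index in units of 32-bit words, not bytes.
 * Returns 0 on success, -1 if the index is outside the allocation.
 */
int scratch_read(const uint32_t *scratch, size_t scratch_bytes,
                 uint32_t dword_idx, uint32_t *out)
{
    size_t n_dwords = scratch_bytes / sizeof(uint32_t);

    /* Guard against indices larger than the scratch allocation. */
    if (dword_idx >= n_dwords)
        return -1;

    /* Indexing a uint32_t array scales the index by 4 bytes for us. */
    *out = scratch[dword_idx];
    return 0;
}
```

With a 16-byte scratch area, indices 0 to 3 are valid; using the same number as a byte offset would land on the wrong word for any non-zero index.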

5a6e8482

pptp: pptp_rcv_core() misses pskb_may_pull() call · 4ea2739e

Authored by Eric Dumazet on Oct 17, 2011

e1000e uses paged frags, so any layer incorrectly pulling bytes from skb
can trigger a BUG in skb_pull()

[951.142737]  [<ffffffff813d2f36>] skb_pull+0x15/0x17
[951.142737]  [<ffffffffa0286824>] pptp_rcv_core+0x126/0x19a [pptp]
[951.152725]  [<ffffffff813d17c4>] sk_receive_skb+0x69/0x105
[951.163558]  [<ffffffffa0286993>] pptp_rcv+0xc8/0xdc [pptp]
[951.165092]  [<ffffffffa02800a3>] gre_rcv+0x62/0x75 [gre]
[951.165092]  [<ffffffff81410784>] ip_local_deliver_finish+0x150/0x1c1
[951.177599]  [<ffffffff81410634>] ? ip_local_deliver_finish+0x0/0x1c1
[951.177599]  [<ffffffff81410846>] NF_HOOK.clone.7+0x51/0x58
[951.177599]  [<ffffffff81410996>] ip_local_deliver+0x51/0x55
[951.177599]  [<ffffffff814105b9>] ip_rcv_finish+0x31a/0x33e
[951.177599]  [<ffffffff8141029f>] ? ip_rcv_finish+0x0/0x33e
[951.204898]  [<ffffffff81410846>] NF_HOOK.clone.7+0x51/0x58
[951.214651]  [<ffffffff81410bb5>] ip_rcv+0x21b/0x246

pptp_rcv_core() is a nice example of a function assuming everything it
needs is available in skb head.
Reported-by: Bradley Peterson <despite@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4ea2739e

tproxy: copy transparent flag when creating a time wait · 58af19e3

Authored by KOVACS Krisztian on Oct 18, 2011

The transparent socket option setting was not copied to the time wait
socket when an inet socket was being replaced by a time wait socket. This
broke the --transparent option of the socket match and may have caused
FIN packets belonging to sockets in FIN_WAIT2 or TIME_WAIT state
to be dropped by the packet filter.
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>

58af19e3

pptp: fix skb leak in pptp_xmit() · 8bae8bd6

Authored by Eric Dumazet on Oct 17, 2011

If we can't transmit the skb, we must free it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

8bae8bd6

bonding: use local function pointer of bond->recv_probe in bond_handle_frame · 4d97480b

Authored by Mitsuo Hayasaka on Oct 12, 2011

The bond->recv_probe is called in bond_handle_frame() when
a packet is received, but bond_close() sets it to NULL. So,
a panic occurs when both functions work in parallel.

Why this happens:
After null pointer check of bond->recv_probe, an sk_buff is
duplicated and bond->recv_probe is called in bond_handle_frame.
So, a panic occurs when bond_close() is called between the
check and call of bond->recv_probe.

Patch:
This patch uses a local function pointer of bond->recv_probe
in bond_handle_frame(). So, it can avoid the null pointer
dereference.
Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
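
The check-then-call race and the local-pointer fix described above can be sketched in user-space C. The types and names here (`struct frame`, `struct bond_like`, `handle_frame()`) are hypothetical stand-ins, not the bonding driver's structures, and the volatile read is an assumption used to mimic a single load of the shared pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the received packet and the bond device. */
struct frame { int len; };
typedef int (*probe_fn)(struct frame *f);
struct bond_like {
    probe_fn recv_probe;   /* may be set to NULL concurrently on close */
};

/* Example probe used for illustration: reports the frame length. */
int length_probe(struct frame *f)
{
    return f->len;
}

/*
 * Racy shape (not shown as code): test b->recv_probe, do some work,
 * then call b->recv_probe(f). The second read may see NULL.
 *
 * Fixed shape: read the shared pointer once into a local variable,
 * then test and call only the local copy.
 */
int handle_frame(struct bond_like *b, struct frame *f)
{
    /* Single read; the volatile cast keeps the compiler from re-reading. */
    probe_fn probe = *(volatile probe_fn *)&b->recv_probe;

    if (probe)
        return probe(f);   /* cannot become NULL between check and call */
    return -1;             /* no probe installed */
}
```

Once the local copy is taken, a concurrent close that clears the shared pointer can no longer turn the call into a NULL dereference.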

4d97480b

smsc911x: Add support for SMSC LAN89218 · 28c21379

Authored by Phil Edworthy on Oct 12, 2011

LAN89218 is register compatible with LAN911x.
Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28c21379

tg3: negate USE_PHYLIB flag check · e730c823

Authored by Jiri Pirko on Oct 11, 2011

The USE_PHYLIB flag in tg3_remove_one() is being checked incorrectly. As a
result, tg3_phy_fini->phy_disconnect is never called when the tg3 module
is removed.

In my case this resulted in panics in phy_state_machine calling function
phydev->adjust_link.

So correct this check.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e730c823

netconsole: enable netconsole can make net_device refcnt incorrent · d5123480

Authored by Gao feng on Oct 11, 2011

There is no check whether netconsole is currently enabled,
so each time "echo 1 > enabled" is executed,
the reference count of the net_device is incremented again.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
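
The refcount problem above comes from enabling an already-enabled target. A minimal sketch of the missing guard follows; `struct net_dev_like`, `struct nc_target` and `nc_set_enabled()` are hypothetical names, and returning an error for a repeated request is an assumption about how such a guard might behave, not a statement of what the patch does.

```c
/* Hypothetical stand-in for a device with a reference count. */
struct net_dev_like { int refcnt; };

/* Hypothetical netconsole-style target bound to one device. */
struct nc_target {
    int enabled;
    struct net_dev_like *dev;
};

/*
 * Switch the target on or off. A request that matches the current
 * state is rejected, so the device reference is taken at most once
 * per enable and dropped at most once per disable.
 */
int nc_set_enabled(struct nc_target *t, int enable)
{
    enable = !!enable;                /* normalise to 0 or 1 */
    if (enable == t->enabled)
        return -1;                    /* already in the requested state */

    if (enable)
        t->dev->refcnt++;             /* take a reference on enable */
    else
        t->dev->refcnt--;             /* drop it again on disable */

    t->enabled = enable;
    return 0;
}
```

Without the state check, every repeated enable would bump the count and the device could never be released.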

d5123480

bluetooth: Properly clone LSM attributes to newly created child connections · 6230c9b4

Authored by Paul Moore on Oct 07, 2011

The Bluetooth stack has internal connection handlers for all of the various
Bluetooth protocols, and unfortunately, they are currently lacking the LSM
hooks found in the core network stack's connection handlers.  I say
unfortunately, because this can cause problems for users who have an
LSM enabled and are using certain Bluetooth devices.  See one problem
report below:

 * http://bugzilla.redhat.com/show_bug.cgi?id=741703

In order to keep things simple at this point in time, this patch fixes the
problem by cloning the parent socket's LSM attributes to the newly created
child socket.  If we decide we need a more elaborate LSM marking mechanism
for Bluetooth (I somewhat doubt this) we can always revisit this decision
in the future.
Reported-by: James M. Cape <jcape@ignore-your.tv>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

6230c9b4

l2tp: fix a potential skb leak in l2tp_xmit_skb() · 835acf5d

Authored by Eric Dumazet on Oct 07, 2011

l2tp_xmit_skb() can leak one skb if skb_cow_head() returns an error.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
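
The leak pattern above is an early return that forgets to release a buffer the function owns. A user-space sketch follows, with hypothetical names throughout (`struct buf`, `buf_make_headroom()`, `xmit()`); the headroom check is only a stand-in for a preparation step that can fail, and a counter of live buffers makes the leak observable.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packet buffer; "headroom" is space available for headers. */
struct buf { size_t headroom; };

static int live_bufs;                 /* buffers allocated and not yet freed */

struct buf *buf_alloc(size_t headroom)
{
    struct buf *b = malloc(sizeof(*b));
    if (b) {
        b->headroom = headroom;
        live_bufs++;
    }
    return b;
}

void buf_free(struct buf *b)
{
    free(b);
    live_bufs--;
}

int live_buf_count(void)
{
    return live_bufs;
}

/* Stand-in for a preparation step that may fail. */
int buf_make_headroom(struct buf *b, size_t need)
{
    return b->headroom >= need ? 0 : -1;
}

/*
 * Transmit takes ownership of "b". Every exit path, including the
 * error path, must therefore release it.
 */
int xmit(struct buf *b, size_t hdr_len)
{
    if (buf_make_headroom(b, hdr_len) < 0) {
        buf_free(b);                  /* the step a leaky version omits */
        return -1;
    }
    /* A real transmit would hand the buffer on; here it is just released. */
    buf_free(b);
    return 0;
}
```

After either outcome of `xmit()`, `live_buf_count()` returns to its previous value, which is the property the fix restores.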

835acf5d

bridge: fix hang on removal of bridge via netlink · 1ce5cce8

Authored by stephen hemminger on Oct 06, 2011

The bridge device's timers and ports need to be cleaned up when the
bridge device is being removed via netlink.

This fixes the problem observed when doing:
 ip link add br0 type bridge
 ip link set dev eth1 master br0
 ip link set br0 up
 ip link del br0

which would cause br0 to hang in unregister_netdev because
of leftover reference count.
Reported-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ce5cce8

18 Oct, 2011 7 commits

cputimer: Cure lock inversion · bcd5cff7

Authored by Peter Zijlstra on Oct 17, 2011

There's a lock inversion between the cputimer->lock and rq->lock;
notably the two callchains involved are:

 update_rlimit_cpu()
   sighand->siglock
   set_process_cpu_timer()
     cpu_timer_sample_group()
       thread_group_cputimer()
         cputimer->lock
         thread_group_cputime()
           task_sched_runtime()
             ->pi_lock
             rq->lock

 scheduler_tick()
   rq->lock
   task_tick_fair()
     update_curr()
       account_group_exec()
         cputimer->lock

The first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
the second one is keeping it up-to-date.

This problem was introduced by e8abccb7 ("posix-cpu-timers: Cure
SMP accounting oddities").

Cure the problem by removing the cputimer->lock and rq->lock nesting.
This leaves concurrent enablers doing duplicate work, but the time
wasted should be of the same order as the time otherwise wasted spinning
on the lock, and the greater-than assignment filter should ensure we
preserve monotonicity.
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/1318928713.21167.4.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
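
The "greater-than assignment filter" mentioned above can be sketched as follows. The struct layout and function names are hypothetical and no locking is shown; the sketch only illustrates why duplicate samples from concurrent enablers cannot move a stored value backwards.

```c
/* Hypothetical group CPU time totals. */
struct cputime_like {
    unsigned long long utime;
    unsigned long long stime;
    unsigned long long sum_exec_runtime;
};

/* Store "b" into "*a" only if it is larger, so "*a" never decreases. */
void update_gt(unsigned long long *a, unsigned long long b)
{
    if (b > *a)
        *a = b;
}

/*
 * Merge a freshly taken sample into the stored totals. A stale or
 * duplicate sample is harmless: each field keeps the larger value.
 */
void cputime_merge(struct cputime_like *stored,
                   const struct cputime_like *sample)
{
    update_gt(&stored->utime, sample->utime);
    update_gt(&stored->stime, sample->stime);
    update_gt(&stored->sum_exec_runtime, sample->sum_exec_runtime);
}
```

Merging an older sample after a newer one leaves the totals unchanged, which is the monotonicity property the message relies on.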

bcd5cff7

drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2) · a4863ca9

Authored by Alex Deucher on Oct 12, 2011

Settings in this table reflect the physical panel/connector rather
than the internal dig encoding.

v2: fix typo for DRM_MODE_CONNECTOR_VGA case.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

a4863ca9

drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls · 09cc6506

Authored by Alex Deucher on Oct 12, 2011

It's handled via external clock.  It should already be protected
by the external ss flag, but add an explicit check just in case.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

09cc6506

drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping · 3a6dea31

Authored by Alex Deucher on Oct 12, 2011

Llano has fully routable dig encoders similar to DCE3.2, while
Ontario has a hardcoded mapping similar to DCE4.0.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

3a6dea31

ttm: Fix error-path using an uninitialized value · e22469ca

Authored by Thomas Hellstrom on Oct 17, 2011

Pointed out by Michel Daenzer.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

e22469ca

Linux 3.1-rc10 · 899e3ee4
Authored by Linus Torvalds on Oct 17, 2011

899e3ee4
Merge branch 'nf' of git://1984.lsi.us.es/net · ae2a4583
Authored by David S. Miller on Oct 17, 2011

ae2a4583

openanolis / cloud-kernel synced successfully nearly 2 years ago