1. 26 May 2012 (1 commit)
    • arch/tile: allow building Linux with transparent huge pages enabled · 73636b1a
      Authored by Chris Metcalf
      The change adds some infrastructure for managing tile pmds more generally,
      using pte_pmd() and pmd_pte() methods to translate pmd values to and
      from ptes, since on TILEPro a pmd is really just a nested structure
      holding a pgd (aka pte).  Several existing pmd methods are moved into
      this framework, and a whole raft of additional pmd accessors are defined
      that are used by the transparent hugepage framework.
      
      The tile PTE now has a "client2" bit.  The bit is used to indicate that
      a transparent huge page is in the process of being split into subpages.
      
      This change also fixes a generic bug where the return value of the
      generic pmdp_splitting_flush() was incorrect.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
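
      To illustrate the pte_pmd()/pmd_pte() idiom described above, here is a
      minimal sketch assuming the TILEPro layout where a pmd is a one-member
      structure wrapping a pgd, which in turn wraps a pte; the typedefs are
      illustrative stand-ins, not the kernel's real HV_PTE-based types:

      	typedef unsigned long pte_t;            /* stand-in: a pte as one word */
      	typedef struct { pte_t pte; } pgd_t;    /* on TILEPro a pgd is just a pte */
      	typedef struct { pgd_t pgd; } pmd_t;    /* and a pmd wraps a pgd */

      	static inline pmd_t pte_pmd(pte_t pte)
      	{
      		pmd_t pmd = { { pte } };        /* rewrap the pte as a pmd */
      		return pmd;
      	}

      	static inline pte_t pmd_pte(pmd_t pmd)
      	{
      		return pmd.pgd.pte;             /* unwrap back to the pte */
      	}

      With these two helpers, a pmd accessor needed by THP can be built from
      the existing pte accessor, e.g. pmd_mkdirty(pmd) as
      pte_pmd(pte_mkdirty(pmd_pte(pmd))).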
  2. 01 May 2012 (1 commit)
  3. 24 April 2012 (1 commit)
  4. 03 April 2012 (1 commit)
    • asm-generic: add linux/types.h to cmpxchg.h · 80da6a4f
      Authored by Paul Gortmaker
      Builds of the openrisc or1ksim_defconfig show the following:
      
        In file included from arch/openrisc/include/generated/asm/cmpxchg.h:1:0,
                         from include/asm-generic/atomic.h:18,
                         from arch/openrisc/include/generated/asm/atomic.h:1,
                         from include/linux/atomic.h:4,
                         from include/linux/dcache.h:4,
                         from fs/notify/fsnotify.c:19:
        include/asm-generic/cmpxchg.h: In function '__xchg':
        include/asm-generic/cmpxchg.h:34:20: error: expected ')' before 'u8'
        include/asm-generic/cmpxchg.h:34:20: warning: type defaults to 'int' in type name
      
      and many more lines of similar errors.  It seems specific to the or32
      because most other platforms have an arch-specific component that would
      have already included types.h ahead of time, but the or32 does not.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jonas Bonn <jonas@southpole.se>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 29 March 2012 (8 commits)
  6. 28 March 2012 (1 commit)
    • compat: use sys_sendfile64() implementation for sendfile syscall · 1631fcea
      Authored by Chris Metcalf
      <asm-generic/unistd.h> was set up to use sys_sendfile() for the 32-bit
      compat API instead of sys_sendfile64(), but in fact the right thing to
      do is to use sys_sendfile64() in all cases.  The 32-bit sendfile64() API
      in glibc uses the sendfile64 syscall, so it has to be capable of doing
      full 64-bit operations.  But the sys_sendfile() kernel implementation
      has a MAX_NON_LFS test in it which explicitly limits the offset to 2^32.
      So, we need to use the sys_sendfile64() implementation in the kernel
      for this case.
      
      Cc: <stable@kernel.org>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
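
      A condensed sketch of the clamp at issue, with the constant as defined
      in <linux/fs.h> and the check reduced to its essence; the helper name
      is illustrative, not the kernel's:

      	#include <errno.h>

      	#define MAX_NON_LFS	((1UL << 31) - 1)	/* the ~2GB limit */

      	/* sys_sendfile() applies this check; sys_sendfile64() does not,
      	 * which is why the 32-bit compat entry must map to the latter. */
      	static long check_non_lfs_range(unsigned long pos, unsigned long count)
      	{
      		if (pos + count > MAX_NON_LFS)
      			return -EOVERFLOW;
      		return 0;
      	}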
  7. 26 March 2012 (1 commit)
    • params: <level>_initcall-like kernel parameters · 026cee00
      Authored by Pawel Moll
      This patch adds a set of macros that can be used to declare
      kernel parameters to be parsed _before_ initcalls at a chosen
      level are executed.  We rename the now-unused "flags" field of
      struct kernel_param to "level".  It's signed, so that the scheme can
      later cover early params as well.
      
      The linker macro collating init calls had to be modified in order
      to add additional symbols between levels; the init code later uses
      them to split the calls into blocks.
      Signed-off-by: Pawel Moll <pawel.moll@arm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
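
      A hypothetical sketch of the ordering this enables; the struct layout
      follows the description above, but the helper names are illustrative,
      not the kernel's:

      	struct kernel_param_sketch {
      		const char *name;
      		short level;	/* the renamed "flags" field; signed so the
      				 * scheme can later cover early params too */
      		/* ... ops, arg, perm ... */
      	};

      	void parse_params_at_level(int level);	/* hypothetical helpers */
      	void run_initcalls_at_level(int level);

      	static void do_one_initcall_level(int level)
      	{
      		parse_params_at_level(level);	/* parse before this level */
      		run_initcalls_at_level(level);	/* blocks delimited by the
      						 * symbols the linker macro
      						 * now emits between levels */
      	}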
  8. 24 March 2012 (2 commits)
    • coredump: add VM_NODUMP, MADV_NODUMP, MADV_CLEAR_NODUMP · accb61fe
      Authored by Jason Baron
      Since we no longer need the VM_ALWAYSDUMP flag, let's use the freed bit
      for a 'VM_NODUMP' flag.  The idea is to add a new madvise() flag:
      MADV_DONTDUMP, which can be set by applications to specifically request
      memory regions which should not dump core.
      
      The specific application I have in mind is qemu: we can add a flag there
      that wouldn't dump all of guest memory when qemu dumps core.  This flag
      might also be useful for security-sensitive apps that want to make
      absolutely sure that certain parts of memory are not dumped.  To clear
      the flag, use MADV_DODUMP.
      
      [akpm@linux-foundation.org: s/MADV_NODUMP/MADV_DONTDUMP/, s/MADV_CLEAR_NODUMP/MADV_DODUMP/, per Roland]
      [akpm@linux-foundation.org: fix up the architectures which broke]
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Roland McGrath <roland@hack.frob.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
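
      A minimal userspace usage sketch of the final flag names, assuming a
      kernel and headers that provide MADV_DONTDUMP/MADV_DODUMP:

      	#include <stddef.h>
      	#include <sys/mman.h>

      	int main(void)
      	{
      		size_t len = 16UL << 20;	/* e.g. a guest-memory region */
      		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
      			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

      		if (p == MAP_FAILED)
      			return 1;
      		madvise(p, len, MADV_DONTDUMP);	/* exclude from core dumps */
      		/* ... region will be skipped if we dump core here ... */
      		madvise(p, len, MADV_DODUMP);	/* include it again */
      		return 0;
      	}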
    • consolidate WARN_...ONCE() static variables · 7ccaba53
      Authored by Jan Beulich
      Due to the alignment of the following variables, these typically consume
      more than just the single byte that a 'bool' requires, and as there are
      a few hundred instances, the cache pollution (not so much the waste of
      memory) adds up.  Put these variables into their own section, outside of
      any halfway frequently used memory range.
      
      Do the same also to the __warned variable of rcu_lockdep_assert().
      (Don't, however, include the ones used by printk_once() and the like, as
      they can potentially be hot.)
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
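
      A condensed sketch of the pattern (not the verbatim kernel macro): the
      only change of substance is that the once-flag now lives in a dedicated,
      rarely touched .data.unlikely section instead of among hot data:

      	#define WARN_ONCE(condition, format...)	({			\
      		static bool __section(.data.unlikely) __warned;		\
      		int __ret_warn_once = !!(condition);			\
      									\
      		if (unlikely(__ret_warn_once) && !__warned) {		\
      			__warned = true;				\
      			WARN(1, format);				\
      		}							\
      		unlikely(__ret_warn_once);				\
      	})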
  9. 22 March 2012 (1 commit)
    • mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode · 1a5a9906
      Authored by Andrea Arcangeli
      In some cases it may happen that pmd_none_or_clear_bad() is called with
      the mmap_sem held in read mode.  In those cases the huge page faults can
      allocate hugepmds under pmd_none_or_clear_bad(), and that can trigger a
      false positive from pmd_bad(), which does not expect to see a pmd
      materializing as trans huge.
      
      It's not khugepaged causing the problem, khugepaged holds the mmap_sem
      in write mode (and all those sites must hold the mmap_sem in read mode
      to prevent pagetables from going away from under them; during code review
      it seems vm86 mode on 32-bit kernels requires that too, unless it's
      restricted to 1 thread per process or UP builds).  The race is only with
      the huge pagefaults that can convert a pmd_none() into a
      pmd_trans_huge().
      
      Effectively all these pmd_none_or_clear_bad() sites running with
      mmap_sem in read mode are somewhat speculative with the page faults, and
      the result is always undefined when they run simultaneously.  This is
      probably why it wasn't common to run into this.  For example if the
      madvise(MADV_DONTNEED) runs zap_page_range() shortly before the page
      fault, the hugepage will not be zapped, if the page fault runs first it
      will be zapped.
      
      Altering pmd_bad() not to error out if it finds hugepmds won't be enough
      to fix this, because zap_pmd_range would then proceed to call
      zap_pte_range (which would be incorrect if the pmd became a
      pmd_trans_huge()).
      
      The simplest way to fix this is to read the pmd into a local variable
      on the stack (regardless of what we read, no actual CPU barriers are
      needed, only a compiler barrier), and be sure it is not changing under
      the code
      that computes its value.  Even if the real pmd is changing under the
      value we hold on the stack, we don't care.  If we actually end up in
      zap_pte_range it means the pmd was not none already and it was not huge,
      and it can't become huge from under us (khugepaged locking explained
      above).
      
      All we need is to enforce that there is no way anymore that in a code
      path like below, pmd_trans_huge can be false, but pmd_none_or_clear_bad
      can run into a hugepmd.  The overhead of a barrier() is just a compiler
      tweak and should not be measurable (I only added it for THP builds).  I
      don't exclude that different compiler versions may have prevented the
      race too by caching the value of *pmd on the stack (that hasn't been
      verified, but it wouldn't be impossible considering
      pmd_none_or_clear_bad, pmd_bad, pmd_trans_huge, pmd_none are all inlines
      and there's no external function called in between pmd_trans_huge and
      pmd_none_or_clear_bad).
      
      		if (pmd_trans_huge(*pmd)) {
      			if (next-addr != HPAGE_PMD_SIZE) {
      				VM_BUG_ON(!rwsem_is_locked(&tlb->mm->mmap_sem));
      				split_huge_page_pmd(vma->vm_mm, pmd);
      			} else if (zap_huge_pmd(tlb, vma, pmd, addr))
      				continue;
      			/* fall through */
      		}
      		if (pmd_none_or_clear_bad(pmd))
      
      Because this race condition could be exercised without special
      privileges this was reported in CVE-2012-1179.
      
      The race was identified and fully explained by Ulrich who debugged it.
      I'm quoting his accurate explanation below, for reference.
      
      ====== start quote =======
            mapcount 0 page_mapcount 1
            kernel BUG at mm/huge_memory.c:1384!
      
          At some point prior to the panic, a "bad pmd ..." message similar to the
          following is logged on the console:
      
            mm/memory.c:145: bad pmd ffff8800376e1f98(80000000314000e7).
      
          The "bad pmd ..." message is logged by pmd_clear_bad() before it clears
          the page's PMD table entry.
      
              143 void pmd_clear_bad(pmd_t *pmd)
              144 {
          ->  145         pmd_ERROR(*pmd);
              146         pmd_clear(pmd);
              147 }
      
          After the PMD table entry has been cleared, there is an inconsistency
          between the actual number of PMD table entries that are mapping the page
          and the page's map count (_mapcount field in struct page). When the page
          is subsequently reclaimed, __split_huge_page() detects this inconsistency.
      
             1381         if (mapcount != page_mapcount(page))
             1382                 printk(KERN_ERR "mapcount %d page_mapcount %d\n",
             1383                        mapcount, page_mapcount(page));
          -> 1384         BUG_ON(mapcount != page_mapcount(page));
      
          The root cause of the problem is a race of two threads in a multithreaded
          process. Thread B incurs a page fault on a virtual address that has never
          been accessed (PMD entry is zero) while Thread A is executing an madvise()
          system call on a virtual address within the same 2 MB (huge page) range.
      
                     virtual address space
                    .---------------------.
                    |                     |
                    |                     |
                  .-|---------------------|
                  | |                     |
                  | |                     |<-- B(fault)
                  | |                     |
            2 MB  | |/////////////////////|-.
            huge <  |/////////////////////|  > A(range)
            page  | |/////////////////////|-'
                  | |                     |
                  | |                     |
                  '-|---------------------|
                    |                     |
                    |                     |
                    '---------------------'
      
          - Thread A is executing an madvise(..., MADV_DONTNEED) system call
            on the virtual address range "A(range)" shown in the picture.
      
          sys_madvise
            // Acquire the semaphore in shared mode.
            down_read(&current->mm->mmap_sem)
            ...
            madvise_vma
              switch (behavior)
              case MADV_DONTNEED:
                   madvise_dontneed
                     zap_page_range
                       unmap_vmas
                         unmap_page_range
                           zap_pud_range
                             zap_pmd_range
                               //
                               // Assume that this huge page has never been accessed.
                               // I.e. content of the PMD entry is zero (not mapped).
                               //
                               if (pmd_trans_huge(*pmd)) {
                                   // We don't get here due to the above assumption.
                               }
                               //
                               // Assume that Thread B incurred a page fault and
                   .---------> // sneaks in here as shown below.
                   |           //
                   |           if (pmd_none_or_clear_bad(pmd))
                   |               {
                   |                 if (unlikely(pmd_bad(*pmd)))
                   |                     pmd_clear_bad
                   |                     {
                   |                       pmd_ERROR
                   |                         // Log "bad pmd ..." message here.
                   |                       pmd_clear
                   |                         // Clear the page's PMD entry.
                   |                         // Thread B incremented the map count
                   |                         // in page_add_new_anon_rmap(), but
                   |                         // now the page is no longer mapped
                   |                         // by a PMD entry (-> inconsistency).
                   |                     }
                   |               }
                   |
                   v
          - Thread B is handling a page fault on virtual address "B(fault)" shown
            in the picture.
      
          ...
          do_page_fault
            __do_page_fault
              // Acquire the semaphore in shared mode.
              down_read_trylock(&mm->mmap_sem)
              ...
              handle_mm_fault
                if (pmd_none(*pmd) && transparent_hugepage_enabled(vma))
                    // We get here due to the above assumption (PMD entry is zero).
                    do_huge_pmd_anonymous_page
                      alloc_hugepage_vma
                        // Allocate a new transparent huge page here.
                      ...
                      __do_huge_pmd_anonymous_page
                        ...
                        spin_lock(&mm->page_table_lock)
                        ...
                        page_add_new_anon_rmap
                          // Here we increment the page's map count (starts at -1).
                          atomic_set(&page->_mapcount, 0)
                        set_pmd_at
                          // Here we set the page's PMD entry which will be cleared
                          // when Thread A calls pmd_clear_bad().
                        ...
                        spin_unlock(&mm->page_table_lock)
      
          The mmap_sem does not prevent the race because both threads are acquiring
          it in shared mode (down_read).  Thread B holds the page_table_lock while
          the page's map count and PMD table entry are updated.  However, Thread A
          does not synchronize on that lock.
      
      ====== end quote =======
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Jones <davej@redhat.com>
      Acked-by: Larry Woodman <lwoodman@redhat.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: <stable@vger.kernel.org>		[2.6.38+]
      Cc: Mark Salter <msalter@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
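
      A condensed sketch of the resulting helper (details hedged; see the
      actual commit for the exact code in <asm-generic/pgtable.h>):

      	static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd)
      	{
      		pmd_t pmdval = *pmd;	/* snapshot once, onto the stack */
      		barrier();		/* compiler barrier only; no CPU
      					 * barrier is needed, see above */
      		if (pmd_none(pmdval) || pmd_trans_huge(pmdval))
      			return 1;	/* treat a hugepmd like "none" */
      		if (unlikely(pmd_bad(pmdval))) {
      			pmd_clear_bad(pmd);
      			return 1;
      		}
      		return 0;
      	}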
  10. 05 March 2012 (1 commit)
    • BUG: headers with BUG/BUG_ON etc. need linux/bug.h · 187f1882
      Authored by Paul Gortmaker
      If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
      other BUG variant in a static inline (i.e. not in a #define) then
      that header really should be including <linux/bug.h> and not just
      expecting it to be implicitly present.
      
      We can make this change risk-free, since if the files using these
      headers didn't have exposure to linux/bug.h already, they would have
      been causing compile failures/warnings.
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
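
      For illustration, a hypothetical header following this rule (the names
      are made up):

      	/* example.h -- uses BUG_ON in a static inline, so it includes
      	 * <linux/bug.h> itself instead of relying on an indirect include. */
      	#ifndef _EXAMPLE_H
      	#define _EXAMPLE_H

      	#include <linux/bug.h>

      	static inline void example_check_index(unsigned int i, unsigned int max)
      	{
      		BUG_ON(i >= max);
      	}

      	#endif /* _EXAMPLE_H */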
  11. 03 March 2012 (1 commit)
  12. 27 February 2012 (1 commit)
    • [PARISC] fix compile break caused by "iomap: make IOPORT/PCI mapping functions conditional" · 97a29d59
      Authored by James Bottomley
      The problem in
      
      commit fea80311
      Author: Randy Dunlap <rdunlap@xenotime.net>
      Date:   Sun Jul 24 11:39:14 2011 -0700
      
          iomap: make IOPORT/PCI mapping functions conditional
      
      is that if your architecture supplies pci_iomap/pci_iounmap, it expects
      always to supply them.  Adding empty-body definitions in the !CONFIG_PCI
      case, which is what this patch does, breaks the parisc compile because
      the functions become doubly defined.  It took us a while to spot this,
      because we don't actually build !CONFIG_PCI very often (only if someone
      is brave enough to test the snake/asp machines).
      
      Since the note in the commit log says this is to fix a
      CONFIG_GENERIC_IOMAP issue (which it does because CONFIG_GENERIC_IOMAP
      supplies pci_iounmap only if CONFIG_PCI is set), there should actually
      have been a condition upon this.  This should make sure no other
      architecture's !CONFIG_PCI compile breaks in the same way as parisc.
      
      The fix had to be updated to take account of the GENERIC_PCI_IOMAP
      separation.
      Reported-by: Rolf Eike Beer <eike@sf-mail.de>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
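
      A sketch of the conditional structure described above (hedged; the
      exact guards in the tree may differ): the empty stub is provided only
      when the generic iomap code is in use, so architectures like parisc
      that always supply their own pci_iounmap() are left alone:

      	#if defined(CONFIG_GENERIC_IOMAP) && !defined(CONFIG_PCI)
      	static inline void pci_iounmap(struct pci_dev *dev, void __iomem *addr)
      	{
      		/* nothing to unmap without PCI */
      	}
      	#endif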
  13. 25 February 2012 (2 commits)
    • epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree() · d80e731e
      Authored by Oleg Nesterov
      This patch is intentionally incomplete to simplify the review.
      It ignores ep_unregister_pollwait() which plays with the same wqh.
      See the next change.
      
      epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
      f_op->poll() needs. In particular it assumes that the wait queue
      can't go away until eventpoll_release(). This is not true in the case
      of signalfd: the task which does EPOLL_CTL_ADD uses its ->sighand,
      which is not connected to the file.
      
      This patch adds the special event, POLLFREE, currently only for
      epoll. It expects that init_poll_funcptr()'ed hook should do the
      necessary cleanup. Perhaps it should be defined as EPOLLFREE in
      eventpoll.
      
      __cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
      ->signalfd_wqh is not empty, we add the new signalfd_cleanup()
      helper.
      
      ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
      This makes the poll entry inconsistent, but we don't care. If you
      share epoll fd which contains our sigfd with another process you
      should blame yourself. signalfd is "really special". I simply do
      not know how we can define the "right" semantics if it used with
      epoll.
      
      The main problem is, epoll calls signalfd_poll() once to establish
      the connection with the wait queue, after that signalfd_poll(NULL)
      returns different/inconsistent results depending on who does
      EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
      has nothing to do with the file, it works with the current thread.
      
      In short: this patch is the hack which tries to fix the symptoms.
      It also assumes that nobody can take tasklist_lock under epoll
      locks, which seems to be true.
      
      Note:
      
      	- we do not have wake_up_all_poll() but wake_up_poll()
      	  is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.
      
      	- signalfd_cleanup() uses POLLHUP along with POLLFREE,
      	  we need a couple of simple changes in eventpoll.c to
      	  make sure it can't be "lost".
      Reported-by: Maxime Bizon <mbizon@freebox.fr>
      Cc: <stable@kernel.org>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
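
      A condensed sketch of the two halves described above (signalfd flushing
      its waiters, epoll unhooking on POLLFREE); locking and error handling
      are elided:

      	/* signalfd side: called from __cleanup_sighand() before kfree() */
      	void signalfd_cleanup(struct sighand_struct *sighand)
      	{
      		wait_queue_head_t *wqh = &sighand->signalfd_wqh;

      		if (waitqueue_active(wqh))
      			wake_up_poll(wqh, POLLHUP | POLLFREE);
      	}

      	/* epoll side, inside ep_poll_callback(): on POLLFREE, detach from
      	 * the dying wait queue head so it can be freed safely. */
      	if ((unsigned long)key & POLLFREE)
      		list_del_init(&wait->task_list);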
    • bitops: Add missing parentheses to new get_order macro · b893485d
      Authored by Joerg Roedel
      The new get_order macro introduced in commit
      
      	d66acc39
      
      does not use parentheses around all uses of the parameter n.
      This causes new compile warnings, for example in
      drivers/iommu/amd_iommu_init.c:
      
      drivers/iommu/amd_iommu_init.c:561:6: warning: suggest parentheses around comparison in operand of ‘&’ [-Wparentheses]
      drivers/iommu/amd_iommu_init.c:561:6: warning: suggest parentheses around comparison in operand of ‘&’ [-Wparentheses]
      
      Fix those warnings by adding the missing parentheses.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Cc: David Howells <dhowells@redhat.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      Link: http://lkml.kernel.org/r/1330088295-28732-1-git-send-email-joerg.roedel@amd.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
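
      A small userspace illustration of the hazard class being fixed here:

      	#define BAD_ORDER(n)	(n == 0UL ? 64 : 0)	/* n unparenthesized */
      	#define GOOD_ORDER(n)	((n) == 0UL ? 64 : 0)	/* fixed */

      	/* BAD_ORDER(x & 3) expands to (x & 3 == 0UL ? 64 : 0).  Since '=='
      	 * binds tighter than '&', that groups as ((x & (3 == 0UL)) ? ...),
      	 * i.e. x & 0 -- and gcc warns "suggest parentheses around comparison
      	 * in operand of '&'".  GOOD_ORDER(x & 3) expands to
      	 * (((x & 3) == 0UL) ? 64 : 0), as intended. */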
  14. 24 February 2012 (4 commits)
  15. 22 February 2012 (2 commits)
    • asm-generic: architecture independent readq/writeq for 32bit environment · 797a796a
      Authored by Hitoshi Mitake
      This provides unified readq()/writeq() helper functions for 32-bit
      drivers.
      
      For some cases, readq/writeq without atomicity is harmful, and the
      order of I/O accesses has to be specified explicitly.  So this patch
      adds two new header files containing non-atomic readq/writeq:
      
       - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/
         writeq with the order of lower address -> higher address
      
       - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/
         writeq with reversed order
      
      This allows us to remove some readq()s that were added to drivers when
      default non-atomic ones were removed in commit dbee8a0a ("x86:
      remove 32-bit versions of readq()/writeq()")
      
      The drivers which need readq/writeq but can do with the non-atomic ones
      must add the line:
      
        #include <asm-generic/io-64-nonatomic-lo-hi.h> /* or hi-lo.h */
      
      But this will be a nop in 64-bit environments, and no other #ifdefs are
      required.  So I believe this patch can solve the problems of
       1. driver-specific readq/writeq
       2. atomicity and order of I/O access
      
      This patch was tested by building allyesconfig and allmodconfig with
      ARCH=x86 and ARCH=i386 on top of tip/master.
      
      Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Ravi Anand <ravi.anand@qlogic.com>
      Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
      Cc: Matthew Garrett <mjg@redhat.com>
      Cc: Jason Uhlenkott <juhlenko@akamai.com>
      Cc: James Bottomley <James.Bottomley@parallels.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Roland Dreier <roland@purestorage.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
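
      A condensed sketch of the lo-hi flavour, mirroring the new header: two
      32-bit accesses, lower address first, defined only when the arch lacks
      native 64-bit MMIO accessors:

      	#ifndef readq
      	static inline __u64 readq(const volatile void __iomem *addr)
      	{
      		const volatile u32 __iomem *p = addr;
      		u32 low, high;

      		low = readl(p);
      		high = readl(p + 1);

      		return low + ((u64)high << 32);
      	}
      	#endif

      	#ifndef writeq
      	static inline void writeq(__u64 val, volatile void __iomem *addr)
      	{
      		writel(val, addr);		/* low word first ... */
      		writel(val >> 32, addr + 4);	/* ... then high word */
      	}
      	#endif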
    • sock: Introduce the SO_PEEK_OFF sock option · ef64a54f
      Authored by Pavel Emelyanov
      This one specifies where to start MSG_PEEK-ing queue data from. When
      set to a negative value, MSG_PEEK works as usual -- it always peeks
      from the head of the queue.
      
      When some bytes are peeked from the queue and the peeking offset is
      non-negative, it is moved forward so that the next peek will return
      the next portion of data.
      
      When a non-peeking recvmsg occurs and the peeking offset is
      non-negative, it is moved backward so that the next peek will still
      peek the proper data (i.e. the data that would have been picked if
      there were no non-peeking recv in between).
      
      The offset is set using a per-protocol operation to let the protocol
      handle the locking issues and to check whether the peeking offset
      feature is supported by the protocol the socket belongs to.
      Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
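
      A minimal userspace usage sketch, assuming 'fd' is a connected socket
      whose protocol implements the peek-off operation:

      	#include <sys/socket.h>

      	static void peek_twice(int fd)
      	{
      		int off = 0;
      		char a[4], b[4];

      		setsockopt(fd, SOL_SOCKET, SO_PEEK_OFF, &off, sizeof(off));
      		recv(fd, a, sizeof(a), MSG_PEEK);   /* peeks bytes 0..3 */
      		recv(fd, b, sizeof(b), MSG_PEEK);   /* peeks 4..7, not 0..3 again */
      	}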
  16. 21 February 2012 (3 commits)
    • bitops: Optimise get_order() · d66acc39
      Authored by David Howells
      Optimise get_order() to use bit scanning instructions if such exist rather than
      a loop.  Also, make it possible to use get_order() in static initialisations
      too by building it on top of ilog2() in the constant parameter case.
      
      This has been tested for i386 and x86_64 using the following userspace program,
      and for FRV by making appropriate substitutions for fls() and fls64().  It will
      abort if the case for get_order() deviates from the original except for the
      order of 0, for which get_order() produces an undefined result.  This program
      tests both dynamic and static parameters.
      
      	#include <stdlib.h>
      	#include <stdio.h>
      
      	#ifdef __x86_64__
      	#define BITS_PER_LONG 64
      	#else
      	#define BITS_PER_LONG 32
      	#endif
      
      	#define PAGE_SHIFT 12
      
      	typedef unsigned long long __u64, u64;
      	typedef unsigned int __u32, u32;
      	#define noinline	__attribute__((noinline))
      
      	static inline int fls(int x)
      	{
      		int bitpos = -1;
      
      		asm("bsrl %1,%0"
      		    : "+r" (bitpos)
      		    : "rm" (x));
      		return bitpos + 1;
      	}
      
      	static __always_inline int fls64(__u64 x)
      	{
      	#if BITS_PER_LONG == 64
      		long bitpos = -1;
      
      		asm("bsrq %1,%0"
      		    : "+r" (bitpos)
      		    : "rm" (x));
      		return bitpos + 1;
      	#else
      		__u32 h = x >> 32, l = x;
      		int bitpos = -1;
      
      		asm("bsrl	%1,%0	\n"
      		    "subl	%2,%0	\n"
      		    "bsrl	%3,%0	\n"
      		    : "+r" (bitpos)
      		    : "rm" (l), "i"(32), "rm" (h));
      
      		return bitpos + 33;
      	#endif
      	}
      
      	static inline __attribute__((const))
      	int __ilog2_u32(u32 n)
      	{
      		return fls(n) - 1;
      	}
      
      	static inline __attribute__((const))
      	int __ilog2_u64(u64 n)
      	{
      		return fls64(n) - 1;
      	}
      
      	extern __attribute__((const, noreturn))
      	int ____ilog2_NaN(void);
      
      	#define ilog2(n)				\
      	(						\
      		__builtin_constant_p(n) ? (		\
      			(n) < 1 ? ____ilog2_NaN() :	\
      			(n) & (1ULL << 63) ? 63 :	\
      			(n) & (1ULL << 62) ? 62 :	\
      			(n) & (1ULL << 61) ? 61 :	\
      			(n) & (1ULL << 60) ? 60 :	\
      			(n) & (1ULL << 59) ? 59 :	\
      			(n) & (1ULL << 58) ? 58 :	\
      			(n) & (1ULL << 57) ? 57 :	\
      			(n) & (1ULL << 56) ? 56 :	\
      			(n) & (1ULL << 55) ? 55 :	\
      			(n) & (1ULL << 54) ? 54 :	\
      			(n) & (1ULL << 53) ? 53 :	\
      			(n) & (1ULL << 52) ? 52 :	\
      			(n) & (1ULL << 51) ? 51 :	\
      			(n) & (1ULL << 50) ? 50 :	\
      			(n) & (1ULL << 49) ? 49 :	\
      			(n) & (1ULL << 48) ? 48 :	\
      			(n) & (1ULL << 47) ? 47 :	\
      			(n) & (1ULL << 46) ? 46 :	\
      			(n) & (1ULL << 45) ? 45 :	\
      			(n) & (1ULL << 44) ? 44 :	\
      			(n) & (1ULL << 43) ? 43 :	\
      			(n) & (1ULL << 42) ? 42 :	\
      			(n) & (1ULL << 41) ? 41 :	\
      			(n) & (1ULL << 40) ? 40 :	\
      			(n) & (1ULL << 39) ? 39 :	\
      			(n) & (1ULL << 38) ? 38 :	\
      			(n) & (1ULL << 37) ? 37 :	\
      			(n) & (1ULL << 36) ? 36 :	\
      			(n) & (1ULL << 35) ? 35 :	\
      			(n) & (1ULL << 34) ? 34 :	\
      			(n) & (1ULL << 33) ? 33 :	\
      			(n) & (1ULL << 32) ? 32 :	\
      			(n) & (1ULL << 31) ? 31 :	\
      			(n) & (1ULL << 30) ? 30 :	\
      			(n) & (1ULL << 29) ? 29 :	\
      			(n) & (1ULL << 28) ? 28 :	\
      			(n) & (1ULL << 27) ? 27 :	\
      			(n) & (1ULL << 26) ? 26 :	\
      			(n) & (1ULL << 25) ? 25 :	\
      			(n) & (1ULL << 24) ? 24 :	\
      			(n) & (1ULL << 23) ? 23 :	\
      			(n) & (1ULL << 22) ? 22 :	\
      			(n) & (1ULL << 21) ? 21 :	\
      			(n) & (1ULL << 20) ? 20 :	\
      			(n) & (1ULL << 19) ? 19 :	\
      			(n) & (1ULL << 18) ? 18 :	\
      			(n) & (1ULL << 17) ? 17 :	\
      			(n) & (1ULL << 16) ? 16 :	\
      			(n) & (1ULL << 15) ? 15 :	\
      			(n) & (1ULL << 14) ? 14 :	\
      			(n) & (1ULL << 13) ? 13 :	\
      			(n) & (1ULL << 12) ? 12 :	\
      			(n) & (1ULL << 11) ? 11 :	\
      			(n) & (1ULL << 10) ? 10 :	\
      			(n) & (1ULL <<  9) ?  9 :	\
      			(n) & (1ULL <<  8) ?  8 :	\
      			(n) & (1ULL <<  7) ?  7 :	\
      			(n) & (1ULL <<  6) ?  6 :	\
      			(n) & (1ULL <<  5) ?  5 :	\
      			(n) & (1ULL <<  4) ?  4 :	\
      			(n) & (1ULL <<  3) ?  3 :	\
      			(n) & (1ULL <<  2) ?  2 :	\
      			(n) & (1ULL <<  1) ?  1 :	\
      			(n) & (1ULL <<  0) ?  0 :	\
      			____ilog2_NaN()			\
      					   ) :		\
      		(sizeof(n) <= 4) ?			\
      		__ilog2_u32(n) :			\
      		__ilog2_u64(n)				\
      	 )
      
      	static noinline __attribute__((const))
      	int old_get_order(unsigned long size)
      	{
      		int order;
      
      		size = (size - 1) >> (PAGE_SHIFT - 1);
      		order = -1;
      		do {
      			size >>= 1;
      			order++;
      		} while (size);
      		return order;
      	}
      
      	static noinline __attribute__((const))
      	int __get_order(unsigned long size)
      	{
      		int order;
      		size--;
      		size >>= PAGE_SHIFT;
      	#if BITS_PER_LONG == 32
      		order = fls(size);
      	#else
      		order = fls64(size);
      	#endif
      		return order;
      	}
      
      	#define get_order(n)						\
      	(								\
      		__builtin_constant_p(n) ? (				\
      			(n == 0UL) ? BITS_PER_LONG - PAGE_SHIFT :	\
      			((n < (1UL << PAGE_SHIFT)) ? 0 :		\
      			 ilog2((n) - 1) - PAGE_SHIFT + 1)		\
      		) :							\
      		__get_order(n)						\
      	)
      
      	#define order(N) \
      		{ (1UL << N) - 1,	get_order((1UL << N) - 1)	},	\
      		{ (1UL << N),		get_order((1UL << N))		},	\
      		{ (1UL << N) + 1,	get_order((1UL << N) + 1)	}
      
      	struct order {
      		unsigned long n, order;
      	};
      
      	static const struct order order_table[] = {
      		order(0),
      		order(1),
      		order(2),
      		order(3),
      		order(4),
      		order(5),
      		order(6),
      		order(7),
      		order(8),
      		order(9),
      		order(10),
      		order(11),
      		order(12),
      		order(13),
      		order(14),
      		order(15),
      		order(16),
      		order(17),
      		order(18),
      		order(19),
      		order(20),
      		order(21),
      		order(22),
      		order(23),
      		order(24),
      		order(25),
      		order(26),
      		order(27),
      		order(28),
      		order(29),
      		order(30),
      		order(31),
      	#if BITS_PER_LONG == 64
      		order(32),
      		order(33),
      		order(34),
      		order(35),
      	#endif
      		{ 0x2929 }
      	};
      
      	void check(int loop, unsigned long n)
      	{
      		unsigned long old, new;
      
      		printf("[%2d]: %09lx | ", loop, n);
      
      		old = old_get_order(n);
      		new = get_order(n);
      
      		printf("%3ld, %3ld\n", old, new);
      		if (n != 0 && old != new)
      			abort();
      	}
      
      	int main(int argc, char **argv)
      	{
      		const struct order *p;
      		unsigned long n;
      		int loop;
      
      		for (loop = 0; loop <= BITS_PER_LONG - 1; loop++) {
      			n = 1UL << loop;
      			check(loop, n - 1);
      			check(loop, n);
      			check(loop, n + 1);
      		}
      
      		for (p = order_table; p->n != 0x2929; p++) {
      			unsigned long old, new;
      
      			old = old_get_order(p->n);
      			new = p->order;
      			printf("%09lx\t%3ld, %3ld\n", p->n, old, new);
      			if (p->n != 0 && old != new)
      				abort();
      		}
      
      		return 0;
      	}
      
      Disassembling the x86_64 version of the above code shows:
      
      	0000000000400510 <old_get_order>:
      	  400510:       48 83 ef 01             sub    $0x1,%rdi
      	  400514:       b8 ff ff ff ff          mov    $0xffffffff,%eax
      	  400519:       48 c1 ef 0b             shr    $0xb,%rdi
      	  40051d:       0f 1f 00                nopl   (%rax)
      	  400520:       83 c0 01                add    $0x1,%eax
      	  400523:       48 d1 ef                shr    %rdi
      	  400526:       75 f8                   jne    400520 <old_get_order+0x10>
      	  400528:       f3 c3                   repz retq
      	  40052a:       66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
      
      	0000000000400530 <__get_order>:
      	  400530:       48 83 ef 01             sub    $0x1,%rdi
      	  400534:       48 c7 c0 ff ff ff ff    mov    $0xffffffffffffffff,%rax
      	  40053b:       48 c1 ef 0c             shr    $0xc,%rdi
      	  40053f:       48 0f bd c7             bsr    %rdi,%rax
      	  400543:       83 c0 01                add    $0x1,%eax
      	  400546:       c3                      retq
      	  400547:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
      	  40054e:       00 00
      
      As can be seen, the new __get_order() function is simpler than the
      old_get_order() function.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: http://lkml.kernel.org/r/20120220223928.16199.29548.stgit@warthog.procyon.org.uk
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • bitops: Adjust the comment on get_order() to describe the size==0 case · e0891a98
      Authored by David Howells
      Adjust the comment on get_order() to note that passing a size of 0
      results in an undefined value.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: http://lkml.kernel.org/r/20120220223917.16199.9416.stgit@warthog.procyon.org.uk
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • posix_types: Introduce __kernel_[u]long_t · afead38d
      Authored by H. Peter Anvin
      Introduce __kernel_[u]long_t, which allows an ABI to override all
      defaults of type [unsigned] long.
      
      This enables x32 and potentially other 32-bit userspace on 64-bit
      kernel ABIs.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
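
      The mechanism reduces to a pair of overridable typedefs; an ABI such as
      x32 supplies its own definitions (e.g. based on long long) before the
      generic defaults below are seen, and every __kernel_* type built on top
      of them follows automatically:

      	#ifndef __kernel_long_t
      	typedef long		__kernel_long_t;
      	typedef unsigned long	__kernel_ulong_t;
      	#endif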
  17. 15 February 2012 (3 commits)
  18. 01 February 2012 (1 commit)
  19. 13 January 2012 (1 commit)
    • thp: add tlb_remove_pmd_tlb_entry · f21760b1
      Authored by Shaohua Li
      We have tlb_remove_tlb_entry to indicate that a pte tlb flush entry
      should be flushed, but no corresponding API for a pmd entry.  This isn't
      a problem so far because THP is only for x86 currently and tlb_flush()
      under x86 will flush the entire TLB.  But this is confusing and could be
      missed if THP is ported to another arch.
      
      Also convert tlb->need_flush = 1 to a VM_BUG_ON(!tlb->need_flush) in
      __tlb_remove_page() as suggested by Andrea Arcangeli.  The
      __tlb_remove_page() function is supposed to be called after
      tlb_remove_xxx_tlb_entry() and we can catch any misuse.
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
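
      A condensed sketch of the new hook, parallel to the existing pte-level
      tlb_remove_tlb_entry(); the exact body is hedged from the description
      above:

      	#define tlb_remove_pmd_tlb_entry(tlb, pmdp, address)		\
      		do {							\
      			tlb->need_flush = 1;				\
      			__tlb_remove_pmd_tlb_entry(tlb, pmdp, address);	\
      		} while (0)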
  20. 05 January 2012 (1 commit)
  21. 04 January 2012 (1 commit)
  22. 30 December 2011 (1 commit)
    • procfs: do not confuse jiffies with cputime64_t · 34845636
      Authored by Andreas Schwab
      Commit 2a95ea6c ("procfs: do not overflow get_{idle,iowait}_time
      for nohz") did not take into account that one some architectures jiffies
      and cputime use different units.
      
      This causes get_idle_time() to return numbers in the wrong units, making
      the idle time fields in /proc/stat wrong.
      
      Instead of converting the usec value returned by
      get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
      usecs_to_cputime64 to convert it to the correct unit of cputime64_t.
      Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "Artem S. Tashkinov" <t.artem@mailcity.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
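
      A condensed sketch of the corrected conversion in fs/proc/stat.c
      (details hedged): the usec value goes straight to cputime64_t rather
      than via jiffies:

      	static cputime64_t get_idle_time(int cpu)
      	{
      		u64 idle_time_us = get_cpu_idle_time_us(cpu, NULL);

      		if (idle_time_us == -1ULL)
      			/* nohz data unavailable; this counter is already
      			 * in cputime units */
      			return kstat_cpu(cpu).cpustat.idle;

      		return usecs_to_cputime64(idle_time_us);
      	}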
  23. 15 December 2011 (1 commit)