1. 11 5月, 2007 17 次提交
    • D
      signal/timer/event: signalfd core · fba2afaa
      Davide Libenzi 提交于
      This patch series implements the new signalfd() system call.
      
      I took part of the original Linus code (and you know how badly it can be
      broken :), and I added even more breakage ;) Signals are fetched from the same
      signal queue used by the process, so signalfd will compete with standard
      kernel delivery in dequeue_signal().  If you want to reliably fetch signals on
      the signalfd file, you need to block them with sigprocmask(SIG_BLOCK).  This
      seems to be working fine on my Dual Opteron machine.  I made a quick test
      program for it:
      
      http://www.xmailserver.org/signafd-test.c
      
      The signalfd() system call implements signal delivery into a file descriptor
      receiver.  The signalfd file descriptor if created with the following API:
      
      int signalfd(int ufd, const sigset_t *mask, size_t masksize);
      
      The "ufd" parameter allows to change an existing signalfd sigmask, w/out going
      to close/create cycle (Linus idea).  Use "ufd" == -1 if you want a brand new
      signalfd file.
      
      The "mask" allows to specify the signal mask of signals that we are interested
      in.  The "masksize" parameter is the size of "mask".
      
      The signalfd fd supports the poll(2) and read(2) system calls.  The poll(2)
      will return POLLIN when signals are available to be dequeued.  As a direct
      consequence of supporting the Linux poll subsystem, the signalfd fd can use
      used together with epoll(2) too.
      
      The read(2) system call will return a "struct signalfd_siginfo" structure in
      the userspace supplied buffer.  The return value is the number of bytes copied
      in the supplied buffer, or -1 in case of error.  The read(2) call can also
      return 0, in case the sighand structure to which the signalfd was attached,
      has been orphaned.  The O_NONBLOCK flag is also supported, and read(2) will
      return -EAGAIN in case no signal is available.
      
      If the size of the buffer passed to read(2) is lower than sizeof(struct
      signalfd_siginfo), -EINVAL is returned.  A read from the signalfd can also
      return -ERESTARTSYS in case a signal hits the process.  The format of the
      struct signalfd_siginfo is, and the valid fields depends of the (->code &
      __SI_MASK) value, in the same way a struct siginfo would:
      
      struct signalfd_siginfo {
      	__u32 signo;	/* si_signo */
      	__s32 err;	/* si_errno */
      	__s32 code;	/* si_code */
      	__u32 pid;	/* si_pid */
      	__u32 uid;	/* si_uid */
      	__s32 fd;	/* si_fd */
      	__u32 tid;	/* si_fd */
      	__u32 band;	/* si_band */
      	__u32 overrun;	/* si_overrun */
      	__u32 trapno;	/* si_trapno */
      	__s32 status;	/* si_status */
      	__s32 svint;	/* si_int */
      	__u64 svptr;	/* si_ptr */
      	__u64 utime;	/* si_utime */
      	__u64 stime;	/* si_stime */
      	__u64 addr;	/* si_addr */
      };
      
      [akpm@linux-foundation.org: fix signalfd_copyinfo() on i386]
      Signed-off-by: NDavide Libenzi <davidel@xmailserver.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fba2afaa
    • D
      signal/timer/event fds: anonymous inode source · 5dc8bf81
      Davide Libenzi 提交于
      This patch add an anonymous inode source, to be used for files that need
      and inode only in order to create a file*. We do not care of having an
      inode for each file, and we do not even care of having different names in
      the associated dentries (dentry names will be same for classes of file*).
      This allow code reuse, and will be used by epoll, signalfd and timerfd
      (and whatever else there'll be).
      Signed-off-by: NDavide Libenzi <davidel@xmailserver.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5dc8bf81
    • S
      Don't init pgrp and __session in INIT_SIGNALS · 325aa33d
      Sukadev Bhattiprolu 提交于
      Remove initialization of pgrp and __session in INIT_SIGNALS, as these are
      later set by the call to __set_special_pids() in init/main.c by the patch:
      
      	explicitly-set-pgid-and-sid-of-init-process.patch
      Signed-off-by: NSukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      325aa33d
    • S
      statically initialize struct pid for swapper · 820e45db
      Sukadev Bhattiprolu 提交于
      Statically initialize a struct pid for the swapper process (pid_t == 0) and
      attach it to init_task.  This is needed so task_pid(), task_pgrp() and
      task_session() interfaces work on the swapper process also.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: <containers@lists.osdl.org>
      Acked-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      820e45db
    • S
      attach_pid() with struct pid parameter · e713d0da
      Sukadev Bhattiprolu 提交于
      attach_pid() currently takes a pid_t and then uses find_pid() to find the
      corresponding struct pid.  Sometimes we already have the struct pid.  We can
      then skip find_pid() if attach_pid() were to take a struct pid parameter.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Cc: <containers@lists.osdl.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e713d0da
    • M
      consolidate generic_writepages and mpage_writepages · 0ea97180
      Miklos Szeredi 提交于
      Clean up massive code duplication between mpage_writepages() and
      generic_writepages().
      
      The new generic function, write_cache_pages() takes a function pointer
      argument, which will be called for each page to be written.
      
      Maybe cifs_writepages() too can use this infrastructure, but I'm not
      touching that with a ten-foot pole.
      
      The upcoming page writeback support in fuse will also want this.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Acked-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ea97180
    • P
      tty: add compat_ioctl · e10cc1df
      Paul Fulghum 提交于
      Add compat_ioctl method for tty code to allow processing of 32 bit ioctl
      calls on 64 bit systems by tty core, tty drivers, and line disciplines.
      
      Based on patch by Arnd Bergmann:
      http://www.uwsg.iu.edu/hypermail/linux/kernel/0511.0/1732.html
      
      [akpm@linux-foundation.org: make things static]
      Signed-off-by: NPaul Fulghum <paulkf@microgate.com>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e10cc1df
    • R
      module_author: don't advise putting in an email address · 108f39a1
      Rene Herman 提交于
      module_author: don't advise putting in an email address
      
      It's information that's easily outdated and easily mistaken for a driver
      contact which is a problem especially for modules with multiple current and
      non-current authors as well as for modules with a maintainer who may not
      even be a module author.
      Signed-off-by: NRene Herman <rene.herman@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      108f39a1
    • B
      Add hard_irq_disable() · 2d3fbbb3
      Benjamin Herrenschmidt 提交于
      Some architectures, like powerpc, implement lazy disabling of interrupts.
      That means that on those, local_irq_disable() doesn't actually disable
      interrupts on the CPU, but only sets some per CPU flag which cause them to be
      disabled only if an interrupt actually occurs.
      
      However, in some cases, such as stop_machine, we really want interrupts to be
      fully disabled.  For example, I have code using stop machine to do ECC error
      injection, used to verify operations of the ECC hardware, that sort of thing.
      It really needs to make sure that nothing is actually writing to memory while
      the injection happens.  Similar examples can be found in other low level bits
      and pieces.
      
      This patch implements a generic hard_irq_disable() function which is meant to
      be called -after- local_irq_disable() and ensures that interrupts are fully
      disabled on that CPU.  The default implementation is a nop, though powerpc
      does already provide an appropriate one.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2d3fbbb3
    • P
      synclink_gt: add compat_ioctl · 2acdb169
      Paul Fulghum 提交于
      Add support for 32 bit ioctl on 64 bit systems for synclink_gt
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: NPaul Fulghum <paulkf@microgate.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2acdb169
    • R
      lib/hexdump · 99eaf3c4
      Randy Dunlap 提交于
      Based on ace_dump_mem() from Grant Likely for the Xilinx SystemACE
      CompactFlash interface.
      
      Add print_hex_dump() & hex_dumper() to lib/hexdump.c and linux/kernel.h.
      
      This patch adds the functions print_hex_dump() & hex_dumper().
      print_hex_dump() can be used to perform a hex + ASCII dump of data to
      syslog, in an easily viewable format, thus providing a common text hex dump
      format.
      
      hex_dumper() provides a dump-to-memory function.  It converts one "line" of
      output (16 bytes of input) at a time.
      
      Example usages:
      	print_hex_dump(KERN_DEBUG, DUMP_PREFIX_ADDRESS, frame->data, frame->len);
      	hex_dumper(frame->data, frame->len, linebuf, sizeof(linebuf));
      
      Example output using %DUMP_PREFIX_OFFSET:
      0009ab42: 40414243 44454647 48494a4b 4c4d4e4f-@ABCDEFG HIJKLMNO
      Example output using %DUMP_PREFIX_ADDRESS:
      ffffffff88089af0: 70717273 74757677 78797a7b 7c7d7e7f-pqrstuvw xyz{|}~.
      
      [akpm@linux-foundation.org: cleanups, add export]
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      99eaf3c4
    • E
      getrusage(): fill ru_inblock and ru_oublock fields if possible · 6eaeeaba
      Eric Dumazet 提交于
      If CONFIG_TASK_IO_ACCOUNTING is defined, we update io accounting counters for
      each task.
      
      This patch permits reporting of values using the well known getrusage()
      syscall, filling ru_inblock and ru_oublock instead of null values.
      
      As TASK_IO_ACCOUNTING currently counts bytes counts, we approximate blocks
      count doing : nr_blocks = nr_bytes / 512
      
      Example of use :
      ----------------------
      After patch is applied, /usr/bin/time command can now give a good
      approximation of IO that the process had to do.
      
      $ /usr/bin/time grep tototo /usr/include/*
      Command exited with non-zero status 1
      0.00user 0.02system 0:02.11elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
      24288inputs+0outputs (0major+259minor)pagefaults 0swaps
      
      $ /usr/bin/time dd if=/dev/zero of=/tmp/testfile count=1000
      1000+0 enregistrements lus
      1000+0 enregistrements écrits
      512000 octets (512 kB) copiés, 0,00326601 seconde, 157 MB/s
      0.00user 0.00system 0:00.00elapsed 80%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+3000outputs (0major+299minor)pagefaults 0swaps
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6eaeeaba
    • A
      add upper-32-bits macro · 218e180e
      Andrew Morton 提交于
      We keep on getting "right shift count >= width of type" warnings when doing
      things like
      
      	sector_t s;
      
      	x = s >> 56;
      
      because with CONFIG_LBD=n, s is only 32-bit.  Similar problems can occur with
      dma_addr_t's.
      
      So add a simple wrapper function which code can use to avoid this warning.
      The above example would become
      
      	x = upper_32_bits(s) >> 24;
      
      The first user is in fact AFS.
      
      Cc: James Bottomley <James.Bottomley@SteelEye.com>
      Cc: "Cameron, Steve" <Steve.Cameron@hp.com>
      Cc: "Miller, Mike (OS Dev)" <Mike.Miller@hp.com>
      Cc: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      218e180e
    • C
      slub: support concurrent local and remote frees and allocs on a slab · 894b8788
      Christoph Lameter 提交于
      Avoid atomic overhead in slab_alloc and slab_free
      
      SLUB needs to use the slab_lock for the per cpu slabs to synchronize with
      potential kfree operations.  This patch avoids that need by moving all free
      objects onto a lockless_freelist.  The regular freelist continues to exist
      and will be used to free objects.  So while we consume the
      lockless_freelist the regular freelist may build up objects.
      
      If we are out of objects on the lockless_freelist then we may check the
      regular freelist.  If it has objects then we move those over to the
      lockless_freelist and do this again.  There is a significant savings in
      terms of atomic operations that have to be performed.
      
      We can even free directly to the lockless_freelist if we know that we are
      running on the same processor.  So this speeds up short lived objects.
      They may be allocated and freed without taking the slab_lock.  This is
      particular good for netperf.
      
      In order to maximize the effect of the new faster hotpath we extract the
      hottest performance pieces into inlined functions.  These are then inlined
      into kmem_cache_alloc and kmem_cache_free.  So hotpath allocation and
      freeing no longer requires a subroutine call within SLUB.
      
      [I am not sure that it is worth doing this because it changes the easy to
      read structure of slub just to reduce atomic ops.  However, there is
      someone out there with a benchmark on 4 way and 8 way processor systems
      that seems to show a 5% regression vs.  Slab.  Seems that the regression is
      due to increased atomic operations use vs.  SLAB in SLUB).  I wonder if
      this is applicable or discernable at all in a real workload?]
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      894b8788
    • K
    • K
    • I
      CRC ITU-T V.41 · 3e7cbae7
      Ivo van Doorn 提交于
      This will add the CRC calculation according
      to the CRC ITU-T V.41 to the kernel lib/ folder.
      
      This code has been derived from the rt2x00 driver,
      currently found only in the wireless-dev tree, but
      this library is generic and could be used by more
      drivers who currently use their own implementation.
      Signed-off-by: NIvo van Doorn <IvDoorn@gmail.com>
      
      Also useful for the new firewire stack.
      Signed-off-by: NKristian Hoegsberg <krh@redhat.com>
      Signed-off-by: NStefan Richter <stefanr@s5r6.in-berlin.de>
      3e7cbae7
  2. 10 5月, 2007 23 次提交
    • S
      [POWERPC] pmu_sys_suspended is only defined for PPC32 · 49d687b6
      Stephen Rothwell 提交于
      thus we get a link error on ppc64 with CONFIG_PM=y.  This fixes it.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      49d687b6
    • L
      acpi,msi-laptop: Fall back to EC polling mode for MSI laptop specific EC commands · 00eb43a1
      Lennart Poettering 提交于
      The ACPI EC that is used in MSI laptops knows some non-standard
      commands for changing the screen brighntess and a few other things,
      which are used by the msi-laptop.c driver. Unfortunately for these
      commands no GPE events for IBF and OBF are triggered. Since nowadays
      the EC code uses the ec_intr=1 mode by default, this causes these
      operations to timeout, although they don't fail. In result, all
      operations that you can do with the msi-laptop.c driver take more or
      less 1s to complete, which is awfully slow.
      
      In one of the more recent kernels (2.6.20?) the EC subsystem has been
      revamped. With that change the EC timeout has been increased. before
      that increase the MSI EC accesses were slow -- but not *that* slow,
      hence I took notice of this limitation of the MSI EC hardware only very
      recently.
      
      The standard EC operations on the MSI EC as defined in the ACPI spec
      support GPE events properly.
      
      The following patch adds a new argument "force_poll" to the
      ec_transaction() function (and friends). If set to 1, the function
      will poll for IBF/OBF even if ec_intr=1 is enabled. If set to 0 the
      current behaviour is used. The msi-laptop driver is modified to make
      use of this new flag, so that OBF/IBF is polled for the special MSI EC
      transactions -- but only for them.
      Signed-off-by: NLennart Poettering <mzxreary@0pointer.de>
      Acked-by: NAlexey Starikovskiy <aystarik@gmail.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      00eb43a1
    • L
      Revert "md: improve partition detection in md array" · 44ce6294
      Linus Torvalds 提交于
      This reverts commit 5b479c91.
      
      Quoth Neil Brown:
      
        "It causes an oops when auto-detecting raid arrays, and it doesn't
         seem easy to fix.
      
         The array may not be 'open' when do_md_run is called, so
         bdev->bd_disk might be NULL, so bd_set_size can oops.
      
         This whole approach of opening an md device before it has been
         assembled just seems to get more and more painful.  I think I'm going
         to have to come up with something clever to provide both backward
         comparability with usage expectation, and sane integration into the
         rest of the kernel."
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      44ce6294
    • B
      ide: legacy PCI bus order probing fixes · 6d208b39
      Bartlomiej Zolnierkiewicz 提交于
      IDE PCI host drivers should register themselves with IDE core only when
      IDE driver is built-in, otherwise (IDE driver is modular and thus IDE PCI
      host drivers are also modular) the code has no effect and just complicates
      the probing.
      
      Fix it by adding new config option CONFIG_IDEPCI_PCIBUS (defined only when
      needed and invisible to the user) and covering by #ifdef/#endif the code
      in question.  It turned out that "ide=reverse" was silently accepted but did
      nothing in case when IDE driver was modular, this is fixed now.
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      6d208b39
    • B
      ide: add ide_proc_register_port() · 5cbf79cd
      Bartlomiej Zolnierkiewicz 提交于
      * create_proc_ide_interfaces() tries to add /proc entries for every probed
        and initialized IDE port, replace it by ide_proc_register_port() which does
        it only for the given port (also rename destroy_proc_ide_interface() to
        ide_proc_unregister_port() for consistency)
        
      * convert {create,destroy}_proc_ide_interface[s]() users to use new functions
      
      * pmac driver depended on proc_ide_create() to add /proc port entries, fix it
        
      * au1xxx-ide, swarm and cs5520 drivers depended indirectly on ide-generic
        driver (CONFIG_IDE_GENERIC=y) to add port /proc entries, fix them
      
      * there is now no need to add /proc entries for IDE ports in proc_ide_create()
        so don't do it
      
      * proc_ide_create() needs now to be called before drivers are probed - fix it,
        while at it make proc_ide_create() create /proc "ide" directory
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      5cbf79cd
    • B
      ide: add "initializing" argument to ide_register_hw() · 869c56ee
      Bartlomiej Zolnierkiewicz 提交于
      Add "initializing" argument to ide_register_hw() and use it instead of ide.c
      wide variable of the same name.  Update all users of ide_register_hw()
      accordingly.
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      869c56ee
    • B
      ide: cable detection fixes (take 2) · 7f8f48af
      Bartlomiej Zolnierkiewicz 提交于
      Tejun's recent eighty_ninty_three() fix has inspired me to do more thorough
      review of the cable detection code...
      
      * print user-friendly warning about limiting the maximum transfer speed
        to UDMA33 (and the reason behind it) when 80-wire cable is not detected,
        also while at it cleanup eighty_ninty_three() a bit
      
      * use eighty_ninty_three() in ide_ata66_check(), this actually fixes 3 bugs:
        - bit 14 (word 93 validity check) == 1 && bit 13 (80-wire cable test) == 1
          were used as 80-wire cable present test for CONFIG_IDEDMA_IVB=n case
          (please see FIXME comment in eighty_ninty_three() for more details)
        - CONFIG_IDEDMA_IVB=y/n cases were interchanged
        - check for SATA devices was missing
      
      * remove private cable warnings from pdc_202xx{old,new} drivers now that core
        code provides this functionality (plus, in pdc202xx_new case the test could
        give false warnings for ATAPI devices because pdc202xx_new driver doesn't
        even support ATAPI DMA)
      
      Cc: Tejun Heo <htejun@gmail.com>
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      7f8f48af
    • B
      ide: move IDE settings handling to ide-proc.c · 7662d046
      Bartlomiej Zolnierkiewicz 提交于
      * move
      	__ide_add_setting()
      	ide_add_setting()
      	__ide_remove_setting()
      	auto_remove_settings()
      	ide_find_setting_by_name()
      	ide_read_setting()
      	ide_write_setting()
      	set_xfer_rate()
      	ide_add_generic_settings()
      	ide_register_subdriver()
      	ide_unregister_subdriver()
      
        from ide.c to ide-proc.c
      
      * set_{io_32bit,pio_mode,using_dma}() cannot be marked static now, fix it
      
      * rename ide_[un]register_subdriver() to ide_proc_[un]register_driver(),
        update device drivers to use new names
      
      * add CONFIG_IDE_PROC_FS=n versions of ide_proc_[un]register_driver()
        and ide_add_generic_settings()
      
      * make ide_find_setting_by_name(), ide_{read,write}_setting()
        and ide_{add,remove}_proc_entries() static
      
      * cover IDE settings code in device drivers with CONFIG_IDE_PROC_FS #ifdef,
        also while at it cover with CONFIG_IDE_PROC_FS #ifdef ide_driver_t.proc
      
      * remove bogus comment from ide.h
      
      * cover with CONFIG_IDE_PROC_FS #ifdef .proc and .settings in ide_drive_t
      
      Besides saner code this patch results in the IDE core smaller by ~2 kB
      (on x86-32) and IDE disk driver by ~1 kB (ditto) when CONFIG_IDE_PROC_FS=n.
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      7662d046
    • B
      ide: split off ioctl handling from IDE settings (v2) · 1497943e
      Bartlomiej Zolnierkiewicz 提交于
      * do write permission and min/max checks in ide_procset_t functions
      
      * ide-disk.c: drive->id is always available so cleanup "multcount" setting
        accordingly
      
      * ide-disk.c: "address" setting was incorrectly defined as type TYPE_INTA,
        fix it by using type TYPE_BYTE and updating ide_drive_t->adressing field,
        the bug didn't trigger because this IDE setting uses custom ->set function
      
      * ide.c: add set_ksettings() for handling HDIO_SET_KEEPSETTINGS ioctl
      
      * ide.c: add set_unmaskirq() for handling HDIO_SET_UNMASKINTR ioctl
      
      * handle ioctls directly in generic_ide_ioclt() and idedisk_ioctl()
        instead of using IDE settings to deal with them
      
      * remove no longer needed ide_find_setting_by_ioctl() and {read,write}_ioctl
        fields from ide_settings_t, also remove now unused TYPE_INTA handling
      
      v2:
      * add missing EXPORT_SYMBOL_GPL(ide_setting_sem) needed now for ide-disk
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      1497943e
    • B
      ide: make /proc/ide/ optional · ecfd80e4
      Bartlomiej Zolnierkiewicz 提交于
      All important information/features should be already available through
      sysfs and ioctl interfaces.
      
      Add CONFIG_IDE_PROC_FS (CONFIG_SCSI_PROC_FS rip-off) config option,
      disabling it makes IDE driver ~5 kB smaller (on x86-32).
      
      While at it add CONFIG_PROC_FS=n versions of proc_ide_{create,destroy}()
      and remove no longer needed #ifdefs.
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      ecfd80e4
    • B
      ide: add ide_tune_dma() helper · 29e744d0
      Bartlomiej Zolnierkiewicz 提交于
      After reworking the code responsible for selecting the best DMA
      transfer mode it is now possible to add generic ide_tune_dma() helper.
      
      Convert some IDE PCI host drivers to use it (the ones left need more work).
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      29e744d0
    • B
      ide: rework the code for selecting the best DMA transfer mode (v3) · 2d5eaa6d
      Bartlomiej Zolnierkiewicz 提交于
      Depends on the "ide: fix UDMA/MWDMA/SWDMA masks" patch.
      
      * add ide_hwif_t.udma_filter hook for filtering UDMA mask
        (use it in alim15x3, hpt366, siimage and serverworks drivers)
      * add ide_max_dma_mode() for finding best DMA mode for the device
        (loosely based on some older libata-core.c code)
      * convert ide_dma_speed() users to use ide_max_dma_mode()
      * make ide_rate_filter() take "ide_drive_t *drive" as an argument instead
        of "u8 mode" and teach it to how to use UDMA mask to do filtering
      * use ide_rate_filter() in hpt366 driver
      * remove no longer needed ide_dma_speed() and *_ratemask()
      * unexport eighty_ninty_three()
      
      v2:
      * rename ->filter_udma_mask to ->udma_filter
        [ Suggested by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ]
      
      v3:
      * updated for scc_pata driver (fixes XFER_UDMA_6 filtering for user-space
        originated transfer mode change requests when 100MHz clock is used)
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      2d5eaa6d
    • B
      ide: fix UDMA/MWDMA/SWDMA masks (v3) · 18137207
      Bartlomiej Zolnierkiewicz 提交于
      * use 0x00 instead of 0x80 to disable ->{ultra,mwdma,swdma}_mask
      * add udma_mask field to ide_pci_device_t and use it to initialize
        ->ultra_mask in aec62xx, cmd64x, pdc202xx_{new,old} and piix drivers
      * fix UDMA masks to match with chipset specific *_ratemask()
        (alim15x3, hpt366, serverworks and siimage drivers need UDMA mask
         filtering method - done in the next patch)
      
      v2:
      * piix: fix cable detection for 82801AA_1 and 82372FB_1
        [ Noticed by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ]
      * cmd64x: use hwif->cds->udma_mask
        [ Suggested by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ]
      * aec62xx: fix newly introduced bug - check DMA status not command register
        [ Noticed by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ]
      
      v3:
      * piix: use hwif->cds->udma_mask
        [ Suggested by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ]
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      18137207
    • N
      md: improve partition detection in md array · 5b479c91
      NeilBrown 提交于
      md currently uses ->media_changed to make sure rescan_partitions
      is call on md array after they are assembled.
      
      However that doesn't happen until the array is opened, which is later
      than some people would like.
      
      So use blkdev_ioctl to do the rescan immediately that the
      array has been assembled.
      
      This means we can remove all the ->change infrastructure as it was only used
      to trigger a partition rescan.
      Signed-off-by: NNeil Brown <neilb@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5b479c91
    • H
      fbdev: add support for AVR32 · 880169dd
      Haavard Skinnemoen 提交于
      Provide framebuffer page protection flags and definitions of
      fb_readl/fb_writel for AVR32.
      Signed-off-by: NHaavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: "Antonino A. Daplas" <adaplas@pol.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      880169dd
    • A
      svgalib: move fb_get_caps to svgalib · 5a87ede9
      Antonino A. Daplas 提交于
      Move fb_get_caps() method to svgalib.c as svga_get_caps() so it can be used by
      s3fb, arkfb and vt8623fb.
      Signed-off-by: NAntonino Daplas <adaplas@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5a87ede9
    • D
      compiler: introduce __used and __maybe_unused · 0d7ebbbc
      David Rientjes 提交于
      __used is defined to be __attribute__((unused)) for all pre-3.3 gcc
      compilers to suppress warnings for unused functions because perhaps they
      are referenced only in inline assembly.  It is defined to be
      __attribute__((used)) for gcc 3.3 and later so that the code is still
      emitted for such functions.
      
      __maybe_unused is defined to be __attribute__((unused)) for both function
      and variable use if it could possibly be unreferenced due to the evaluation
      of preprocessor macros.  Function prototypes shall be marked with
      __maybe_unused if the actual definition of the function is dependant on
      preprocessor macros.
      
      No update to compiler-intel.h is necessary because ICC supports both
      __attribute__((used)) and __attribute__((unused)) as specified by the gcc
      manual.
      
      __attribute_used__ is deprecated and will be removed once all current
      code is converted to using __used.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0d7ebbbc
    • R
      rename thread_info to stack · f7e4217b
      Roman Zippel 提交于
      This finally renames the thread_info field in task structure to stack, so that
      the assumptions about this field are gone and archs have more freedom about
      placing the thread_info structure.
      
      Nonbroken archs which have a proper thread pointer can do the access to both
      current thread and task structure via a single pointer.
      
      It'll allow for a few more cleanups of the fork code, from which e.g.  ia64
      could benefit.
      Signed-off-by: NRoman Zippel <zippel@linux-m68k.org>
      [akpm@linux-foundation.org: build fix]
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Chris Zankel <chris@zankel.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f7e4217b
    • R
      Allow arch to initialize arch field of the module structure · e61a1c1c
      Roman Zippel 提交于
      This will later allow an arch to add module specific information via linker
      generated tables instead of poking directly in the module object structure.
      Signed-off-by: NRoman Zippel <zippel@linux-m68k.org>
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e61a1c1c
    • T
      clocksource: fix resume logic · b52f52a0
      Thomas Gleixner 提交于
      We need to make sure that the clocksources are resumed, when timekeeping is
      resumed.  The current resume logic does not guarantee this.
      
      Add a resume function pointer to the clocksource struct, so clocksource
      drivers which need to reinitialize the clocksource can provide a resume
      function.
      
      Add a resume function, which calls the maybe available clocksource resume
      functions and resets the watchdog function, so a stable TSC can be used
      accross suspend/resume.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b52f52a0
    • C
      Move remote node draining out of slab allocators · 4037d452
      Christoph Lameter 提交于
      Currently the slab allocators contain callbacks into the page allocator to
      perform the draining of pagesets on remote nodes.  This requires SLUB to have
      a whole subsystem in order to be compatible with SLAB.  Moving node draining
      out of the slab allocators avoids a section of code in SLUB.
      
      Move the node draining so that is is done when the vm statistics are updated.
      At that point we are already touching all the cachelines with the pagesets of
      a processor.
      
      Add a expire counter there.  If we have to update per zone or global vm
      statistics then assume that the pageset will require subsequent draining.
      
      The expire counter will be decremented on each vm stats update pass until it
      reaches zero.  Then we will drain one batch from the pageset.  The draining
      will cause vm counter updates which will then cause another expiration until
      the pcp is empty.  So we will drain a batch every 3 seconds.
      
      Note that remote node draining is a somewhat esoteric feature that is required
      on large NUMA systems because otherwise significant portions of system memory
      can become trapped in pcp queues.  The number of pcp is determined by the
      number of processors and nodes in a system.  A system with 4 processors and 2
      nodes has 8 pcps which is okay.  But a system with 1024 processors and 512
      nodes has 512k pcps with a high potential for large amount of memory being
      caught in them.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4037d452
    • C
      vmstat: use our own timer events · d1187ed2
      Christoph Lameter 提交于
      vmstat is currently using the cache reaper to periodically bring the
      statistics up to date.  The cache reaper does only exists in SLUB as a way to
      provide compatibility with SLAB.  This patch removes the vmstat calls from the
      slab allocators and provides its own handling.
      
      The advantage is also that we can use a different frequency for the updates.
      Refreshing vm stats is a pretty fast job so we can run this every second and
      stagger this by only one tick.  This will lead to some overlap in large
      systems.  F.e a system running at 250 HZ with 1024 processors will have 4 vm
      updates occurring at once.
      
      However, the vm stats update only accesses per node information.  It is only
      necessary to stagger the vm statistics updates per processor in each node.  Vm
      counter updates occurring on distant nodes will not cause cacheline
      contention.
      
      We could implement an alternate approach that runs the first processor on each
      node at the second and then each of the other processor on a node on a
      subsequent tick.  That may be useful to keep a large amount of the second free
      of timer activity.  Maybe the timer folks will have some feedback on this one?
      
      [jirislaby@gmail.com: add missing break]
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d1187ed2
    • R
      Add suspend-related notifications for CPU hotplug · 8bb78442
      Rafael J. Wysocki 提交于
      Since nonboot CPUs are now disabled after tasks and devices have been
      frozen and the CPU hotplug infrastructure is used for this purpose, we need
      special CPU hotplug notifications that will help the CPU-hotplug-aware
      subsystems distinguish normal CPU hotplug events from CPU hotplug events
      related to a system-wide suspend or resume operation in progress.  This
      patch introduces such notifications and causes them to be used during
      suspend and resume transitions.  It also changes all of the
      CPU-hotplug-aware subsystems to take these notifications into consideration
      (for now they are handled in the same way as the corresponding "normal"
      ones).
      
      [oleg@tv-sign.ru: cleanups]
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8bb78442