1. 10 11月, 2014 1 次提交
    • T
      /dev/mem: Use more consistent data types · 4707a341
      Thierry Reding 提交于
      The xlate_dev_{kmem,mem}_ptr() functions take either a physical address
      or a kernel virtual address, so data types should be phys_addr_t and
      void *. They both return a kernel virtual address which is only ever
      used in calls to copy_{from,to}_user(), so make variables that store it
      void * rather than char * for consistency.
      
      Also only define a weak unxlate_dev_mem_ptr() function if architectures
      haven't overridden them in the asm/io.h header file.
      Signed-off-by: NThierry Reding <treding@nvidia.com>
      4707a341
  2. 09 10月, 2014 1 次提交
  3. 16 2月, 2014 1 次提交
    • P
      /dev/mem: handle out-of-bounds read/write · 08d2d00b
      Petr Tesarik 提交于
      The loff_t type may be wider than phys_addr_t (e.g. on 32-bit systems).
      Consequently, the file offset may be truncated in the assignment.
      Currently, /dev/mem wraps around, which may cause applications to read
      or write incorrect regions of memory by accident.
      
      Let's follow POSIX file semantics here and return 0 when reading from
      and -EFBIG when writing to an offset that cannot be represented by a
      phys_addr_t.
      
      Note that the conditional is optimized out by the compiler if loff_t
      has the same size as phys_addr_t.
      Signed-off-by: NPetr Tesarik <ptesarik@suse.cz>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      08d2d00b
  4. 22 1月, 2014 1 次提交
  5. 04 7月, 2013 1 次提交
    • Z
      /dev/oldmem: Remove the interface · a11edb59
      Zhang Yanfei 提交于
      /dev/oldmem provides the interface for us to access the "old memory" in
      the dump-capture kernel.  Unfortunately, no one actually uses this
      interface.
      
      And this interface could actually cause some real problems if used on ia64
      where the cached/uncached accesses are mixed.  See the discussion from the
      link: https://lkml.org/lkml/2013/4/12/386.
      
      So Eric suggested that we should remove /dev/oldmem as an unused piece of
      code.
      
      [akpm@linux-foundation.org: mention /dev/oldmem obsolescence in devices.txt]
      Suggested-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a11edb59
  6. 07 6月, 2013 1 次提交
  7. 08 5月, 2013 2 次提交
  8. 23 2月, 2013 1 次提交
  9. 07 2月, 2013 1 次提交
  10. 25 10月, 2012 1 次提交
  11. 09 10月, 2012 1 次提交
    • K
      mm: kill vma flag VM_RESERVED and mm->reserved_vm counter · 314e51b9
      Konstantin Khlebnikov 提交于
      A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
      currently it lost original meaning but still has some effects:
      
       | effect                 | alternative flags
      -+------------------------+---------------------------------------------
      1| account as reserved_vm | VM_IO
      2| skip in core dump      | VM_IO, VM_DONTDUMP
      3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      
      This patch removes reserved_vm counter from mm_struct.  Seems like nobody
      cares about it, it does not exported into userspace directly, it only
      reduces total_vm showed in proc.
      
      Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.
      
      remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
      remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.
      
      [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      314e51b9
  12. 11 7月, 2012 1 次提交
  13. 10 5月, 2012 1 次提交
  14. 08 5月, 2012 2 次提交
    • K
      kmsg: export printk records to the /dev/kmsg interface · e11fea92
      Kay Sievers 提交于
      Support for multiple concurrent readers of /dev/kmsg, with read(),
      seek(), poll() support. Output of message sequence numbers, to allow
      userspace log consumers to reliably reconnect and reconstruct their
      state at any given time. After open("/dev/kmsg"), read() always
      returns *all* buffered records. If only future messages should be
      read, SEEK_END can be used. In case records get overwritten while
      /dev/kmsg is held open, or records get faster overwritten than they
      are read, the next read() will return -EPIPE and the current reading
      position gets updated to the next available record. The passed
      sequence numbers allow the log consumer to calculate the amount of
      lost messages.
      
        [root@mop ~]# cat /dev/kmsg
        5,0,0;Linux version 3.4.0-rc1+ (kay@mop) (gcc version 4.7.0 20120315 ...
        6,159,423091;ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
        7,160,424069;pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7] (ignored)
         SUBSYSTEM=acpi
         DEVICE=+acpi:PNP0A03:00
        6,339,5140900;NET: Registered protocol family 10
        30,340,5690716;udevd[80]: starting version 181
        6,341,6081421;FDC 0 is a S82078B
        6,345,6154686;microcode: CPU0 sig=0x623, pf=0x0, revision=0x0
        7,346,6156968;sr 1:0:0:0: Attached scsi CD-ROM sr0
         SUBSYSTEM=scsi
         DEVICE=+scsi:1:0:0:0
        6,347,6289375;microcode: CPU1 sig=0x623, pf=0x0, revision=0x0
      
      Cc: Karel Zak <kzak@redhat.com>
      Tested-by: NWilliam Douglas <william.douglas@intel.com>
      Signed-off-by: NKay Sievers <kay@vrfy.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e11fea92
    • K
      printk: convert byte-buffer to variable-length record buffer · 7ff9554b
      Kay Sievers 提交于
      - Record-based stream instead of the traditional byte stream
        buffer. All records carry a 64 bit timestamp, the syslog facility
        and priority in the record header.
      
      - Records consume almost the same amount, sometimes less memory than
        the traditional byte stream buffer (if printk_time is enabled). The record
        header is 16 bytes long, plus some padding bytes at the end if needed.
        The byte-stream buffer needed 3 chars for the syslog prefix, 15 char for
        the timestamp and a newline.
      
      - Buffer management is based on message sequence numbers. When records
        need to be discarded, the reading heads move on to the next full
        record. Unlike the byte-stream buffer, no old logged lines get
        truncated or partly overwritten by new ones. Sequence numbers also
        allow consumers of the log stream to get notified if any message in
        the stream they are about to read gets discarded during the time
        of reading.
      
      - Better buffered IO support for KERN_CONT continuation lines, when printk()
        is called multiple times for a single line. The use of KERN_CONT is now
        mandatory to use continuation; a few places in the kernel need trivial fixes
        here. The buffering could possibly be extended to per-cpu variables to allow
        better thread-safety for multiple printk() invocations for a single line.
      
      - Full-featured syslog facility value support. Different facilities
        can tag their messages. All userspace-injected messages enforce a
        facility value > 0 now, to be able to reliably distinguish them from
        the kernel-generated messages. Independent subsystems like a
        baseband processor running its own firmware, or a kernel-related
        userspace process can use their own unique facility values. Multiple
        independent log streams can co-exist that way in the same
        buffer. All share the same global sequence number counter to ensure
        proper ordering (and interleaving) and to allow the consumers of the
        log to reliably correlate the events from different facilities.
      Tested-by: NWilliam Douglas <william.douglas@intel.com>
      Signed-off-by: NKay Sievers <kay@vrfy.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ff9554b
  15. 04 1月, 2012 1 次提交
  16. 01 11月, 2011 1 次提交
  17. 20 4月, 2011 2 次提交
  18. 24 3月, 2011 1 次提交
  19. 26 10月, 2010 1 次提交
  20. 15 10月, 2010 1 次提交
    • A
      llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann 提交于
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      6038f373
  21. 22 9月, 2010 1 次提交
  22. 07 8月, 2010 1 次提交
    • D
      Fix init ordering of /dev/console vs callers of modprobe · 31d1d48e
      David Howells 提交于
      Make /dev/console get initialised before any initialisation routine that
      invokes modprobe because if modprobe fails, it's going to want to open
      /dev/console, presumably to write an error message to.
      
      The problem with that is that if the /dev/console driver is not yet
      initialised, the chardev handler will call request_module() to invoke
      modprobe, which will fail, because we never compile /dev/console as a
      module.
      
      This will lead to a modprobe loop, showing the following in the kernel
      log:
      
      	request_module: runaway loop modprobe char-major-5-1
      	request_module: runaway loop modprobe char-major-5-1
      	request_module: runaway loop modprobe char-major-5-1
      	request_module: runaway loop modprobe char-major-5-1
      	request_module: runaway loop modprobe char-major-5-1
      
      This can happen, for example, when the built in md5 module can't find
      the built in cryptomgr module (because the latter fails to initialise).
      The md5 module comes before the call to tty_init(), presumably because
      'crypto' comes before 'drivers' alphabetically.
      
      Fix this by calling tty_init() from chrdev_init().
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      31d1d48e
  23. 07 4月, 2010 3 次提交
  24. 13 3月, 2010 2 次提交
  25. 03 2月, 2010 2 次提交
  26. 16 12月, 2009 6 次提交
  27. 10 12月, 2009 1 次提交
    • C
      vfs: Implement proper O_SYNC semantics · 6b2f3d1f
      Christoph Hellwig 提交于
      While Linux provided an O_SYNC flag basically since day 1, it took until
      Linux 2.4.0-test12pre2 to actually get it implemented for filesystems,
      since that day we had generic_osync_around with only minor changes and the
      great "For now, when the user asks for O_SYNC, we'll actually give
      O_DSYNC" comment.  This patch intends to actually give us real O_SYNC
      semantics in addition to the O_DSYNC semantics.  After Jan's O_SYNC
      patches which are required before this patch it's actually surprisingly
      simple, we just need to figure out when to set the datasync flag to
      vfs_fsync_range and when not.
      
      This patch renames the existing O_SYNC flag to O_DSYNC while keeping it's
      numerical value to keep binary compatibility, and adds a new real O_SYNC
      flag.  To guarantee backwards compatiblity it is defined as expanding to
      both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
      sure we are backwards-compatible when compiled against the new headers.
      
      This also means that all places that don't care about the differences can
      just check O_DSYNC and get the right behaviour for O_SYNC, too - only
      places that actuall care need to check __O_SYNC in addition.  Drivers and
      network filesystems have been updated in a fail safe way to always do the
      full sync magic if O_DSYNC is set.  The few places setting O_SYNC for
      lower layers are kept that way for now to stay failsafe.
      
      We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
      to make sure we always get these sane options.
      
      Note that parisc really screwed up their headers as they already define a
      O_DSYNC that has always been a no-op.  We try to repair it by using it for
      the new O_DSYNC and redefinining O_SYNC to send both the traditional
      O_SYNC numerical value _and_ the O_DSYNC one.
      
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger@sun.com>
      Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: NKyle McMartin <kyle@mcmartin.ca>
      Acked-by: NUlrich Drepper <drepper@redhat.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      6b2f3d1f
  28. 04 12月, 2009 1 次提交