1. 08 6月, 2016 1 次提交
  2. 19 5月, 2016 1 次提交
  3. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  4. 16 3月, 2016 1 次提交
  5. 16 1月, 2016 1 次提交
    • D
      mm, dax, pmem: introduce pfn_t · 34c0fd54
      Dan Williams 提交于
      For the purpose of communicating the optional presence of a 'struct
      page' for the pfn returned from ->direct_access(), introduce a type that
      encapsulates a page-frame-number plus flags.  These flags contain the
      historical "page_link" encoding for a scatterlist entry, but can also
      denote "device memory".  Where "device memory" is a set of pfns that are
      not part of the kernel's linear mapping by default, but are accessed via
      the same memory controller as ram.
      
      The motivation for this new type is large capacity persistent memory
      that needs struct page entries in the 'memmap' to support 3rd party DMA
      (i.e.  O_DIRECT I/O with a persistent memory source/target).  However,
      we also need it in support of maintaining a list of mapped inodes which
      need to be unmapped at driver teardown or freeze_bdev() time.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      34c0fd54
  6. 12 11月, 2015 1 次提交
  7. 08 11月, 2015 1 次提交
  8. 28 8月, 2015 1 次提交
  9. 21 8月, 2015 1 次提交
  10. 29 7月, 2015 1 次提交
    • C
      block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig 提交于
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not beeing persistent
      when bios are queued up, and are not passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NHannes Reinecke <hare@suse.de>
      Reviewed-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      4246a0b6
  11. 17 7月, 2015 1 次提交
  12. 17 2月, 2015 1 次提交
    • M
      brd: rename XIP to DAX · a7a97fc9
      Matthew Wilcox 提交于
      Since this is relating to FS_XIP, not KERNEL_XIP, it should be called
      DAX instead of XIP.
      Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Andreas Dilger <andreas.dilger@intel.com>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7a97fc9
  13. 14 1月, 2015 3 次提交
    • B
      brd: Request from fdisk 4k alignment · c8fa3173
      Boaz Harrosh 提交于
      Because of the direct_access() API which returns a PFN. partitions
      better start on 4K boundary, else offset ZERO of a partition will
      not be aligned and blk_direct_access() will fail the call.
      
      By setting blk_queue_physical_block_size(PAGE_SIZE) we can communicate
      this to fdisk and friends.
      
      The call to blk_queue_physical_block_size() is harmless and will
      not affect the Kernel behavior in any way. It is only for
      communication to user-mode.
      
      before this patch running fdisk on a default size brd of 4M
      the first sector offered is 34 (BAD), but after this patch it
      will be 40, ie 8 sectors aligned. Also when entering some random
      partition sizes the next partition-start sector is offered 8 sectors
      aligned after this patch. (Please note that with fdisk the user
      can still enter bad values, only the offered default values will
      be correct)
      
      Note that with bdev-size > 4M fdisk will try to align on a 1M
      boundary (above first-sector will be 2048), in any case.
      
      CC: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NBoaz Harrosh <boaz@plexistor.com>
      Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      c8fa3173
    • B
      brd: Fix all partitions BUGs · 937af5ec
      Boaz Harrosh 提交于
      This patch fixes up brd's partitions scheme, now enjoying all worlds.
      
      The MAIN fix here is that currently, if one fdisks some partitions,
      a BAD bug will make all partitions point to the same start-end sector
      ie: 0 - brd_size And an mkfs of any partition would trash the partition
      table and the other partition.
      
      Another fix is that "mount -U uuid" will not work if show_part was not
      specified, because of the GENHD_FL_SUPPRESS_PARTITION_INFO flag.
      We now always load without it and remove the show_part parameter.
      
      [We remove Dmitry's new module-param part_show it is now always
       show]
      
      So NOW the logic goes like this:
      * max_part - Just says how many minors to reserve between ramX
        devices. In any way, there can be as many partition as requested.
        If minors between devices ends, then dynamic 259-major ids will
        be allocated on the fly.
        The default is now max_part=1, which means all partitions devt(s)
        will be from the dynamic (259) major-range.
        (If persistent partition minors is needed use max_part=X)
        For example with /dev/sdX max_part is hard coded 16.
      
      * Creation of new devices on the fly still/always work:
        mknod /path/devnod b 1 X
        fdisk -l /path/devnod
        Will create a new device if [X / max_part] was not already
        created before. (Just as before)
      
        partitions on the dynamically created device will work as well
        Same logic applies with minors as with the pre-created ones.
      
      TODO: dynamic grow of device size. So each device can have it's
            own size.
      
      CC: Dmitry Monakhov <dmonakhov@openvz.org>
      Tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NBoaz Harrosh <boaz@plexistor.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      937af5ec
    • M
      block: Change direct_access calling convention · dd22f551
      Matthew Wilcox 提交于
      In order to support accesses to larger chunks of memory, pass in a
      'size' parameter (counted in bytes), and return the amount available at
      that address.
      
      Add a new helper function, bdev_direct_access(), to handle common
      functionality including partition handling, checking the length requested
      is positive, checking for the sector being page-aligned, and checking
      the length of the request does not pass the end of the partition.
      Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NBoaz Harrosh <boaz@plexistor.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      dd22f551
  14. 22 8月, 2014 1 次提交
  15. 05 6月, 2014 2 次提交
  16. 24 11月, 2013 2 次提交
    • K
      block: Convert bio_for_each_segment() to bvec_iter · 7988613b
      Kent Overstreet 提交于
      More prep work for immutable biovecs - with immutable bvecs drivers
      won't be able to use the biovec directly, they'll need to use helpers
      that take into account bio->bi_iter.bi_bvec_done.
      
      This updates callers for the new usage without changing the
      implementation yet.
      Signed-off-by: NKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Cc: Jim Paris <jim@jtan.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@inktank.com>
      Cc: ceph-devel@vger.kernel.org
      Cc: Joshua Morris <josh.h.morris@us.ibm.com>
      Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
      Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
      Cc: support@lsi.com
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Guo Chao <yan@linux.vnet.ibm.com>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: drbd-user@lists.linbit.com
      Cc: nbd-general@lists.sourceforge.net
      Cc: cbe-oss-dev@lists.ozlabs.org
      Cc: xen-devel@lists.xensource.com
      Cc: virtualization@lists.linux-foundation.org
      Cc: linux-raid@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: DL-MPTFusionLinux@lsi.com
      Cc: linux-scsi@vger.kernel.org
      Cc: devel@driverdev.osuosl.org
      Cc: linux-fsdevel@vger.kernel.org
      Cc: cluster-devel@redhat.com
      Cc: linux-mm@kvack.org
      Acked-by: NGeoff Levand <geoff@infradead.org>
      7988613b
    • K
      block: Abstract out bvec iterator · 4f024f37
      Kent Overstreet 提交于
      Immutable biovecs are going to require an explicit iterator. To
      implement immutable bvecs, a later patch is going to add a bi_bvec_done
      member to this struct; for now, this patch effectively just renames
      things.
      Signed-off-by: NKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@inktank.com>
      Cc: ceph-devel@vger.kernel.org
      Cc: Joshua Morris <josh.h.morris@us.ibm.com>
      Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Benny Halevy <bhalevy@tonian.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: xfs@oss.sgi.com
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Guo Chao <yan@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: "Roger Pau Monné" <roger.pau@citrix.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Ian Campbell <Ian.Campbell@citrix.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchand@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Peng Tao <tao.peng@emc.com>
      Cc: Andy Adamson <andros@netapp.com>
      Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Pankaj Kumar <pankaj.km@samsung.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: Mel Gorman <mgorman@suse.de>6
      4f024f37
  17. 08 11月, 2013 1 次提交
    • M
      block: fix a probe argument to blk_register_region · a207f593
      Mikulas Patocka 提交于
      The probe function is supposed to return NULL on failure (as we can see in
      kobj_lookup: kobj = probe(dev, index, data); ... if (kobj) return kobj;
      
      However, in loop and brd, it returns negative error from ERR_PTR.
      
      This causes a crash if we simulate disk allocation failure and run
      less -f /dev/loop0 because the negative number is interpreted as a pointer:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000002b4
      IP: [<ffffffff8118b188>] __blkdev_get+0x28/0x450
      PGD 23c677067 PUD 23d6d1067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in: loop hpfs nvidia(PO) ip6table_filter ip6_tables uvesafb cfbcopyarea cfbimgblt cfbfillrect fbcon font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor fb fbdev msr ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc tun ipv6 cpufreq_stats cpufreq_ondemand cpufreq_userspace cpufreq_powersave cpufreq_conservative hid_generic spadfs usbhid hid fuse raid0 snd_usb_audio snd_pcm_oss snd_mixer_oss md_mod snd_pcm snd_timer snd_page_alloc snd_hwdep snd_usbmidi_lib dmi_sysfs snd_rawmidi nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack snd soundcore lm85 hwmon_vid ohci_hcd ehci_pci ehci_hcd serverworks sata_svw libata acpi_cpufreq freq_table mperf ide_core usbcore kvm_amd kvm tg3 i2c_piix4 libphy microcode e100 usb_common ptp skge i2c_core pcspkr k10temp evdev floppy hwmon pps_core mii rtc_cmos button processor unix [last unloaded: nvidia]
      CPU: 1 PID: 6831 Comm: less Tainted: P        W  O 3.10.15-devel #18
      Hardware name: empty empty/S3992-E, BIOS 'V1.06   ' 06/09/2009
      task: ffff880203cc6bc0 ti: ffff88023e47c000 task.ti: ffff88023e47c000
      RIP: 0010:[<ffffffff8118b188>]  [<ffffffff8118b188>] __blkdev_get+0x28/0x450
      RSP: 0018:ffff88023e47dbd8  EFLAGS: 00010286
      RAX: ffffffffffffff74 RBX: ffffffffffffff74 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
      RBP: ffff88023e47dc18 R08: 0000000000000002 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff88023f519658
      R13: ffffffff8118c300 R14: 0000000000000000 R15: ffff88023f519640
      FS:  00007f2070bf7700(0000) GS:ffff880247400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000002b4 CR3: 000000023da1d000 CR4: 00000000000007e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       0000000000000002 0000001d00000000 000000003e47dc50 ffff88023f519640
       ffff88043d5bb668 ffffffff8118c300 ffff88023d683550 ffff88023e47de60
       ffff88023e47dc98 ffffffff8118c10d 0000001d81605698 0000000000000292
      Call Trace:
       [<ffffffff8118c300>] ? blkdev_get_by_dev+0x60/0x60
       [<ffffffff8118c10d>] blkdev_get+0x1dd/0x370
       [<ffffffff8118c300>] ? blkdev_get_by_dev+0x60/0x60
       [<ffffffff813cea6c>] ? _raw_spin_unlock+0x2c/0x50
       [<ffffffff8118c300>] ? blkdev_get_by_dev+0x60/0x60
       [<ffffffff8118c365>] blkdev_open+0x65/0x80
       [<ffffffff8114d12e>] do_dentry_open.isra.18+0x23e/0x2f0
       [<ffffffff8114d214>] finish_open+0x34/0x50
       [<ffffffff8115e122>] do_last.isra.62+0x2d2/0xc50
       [<ffffffff8115eb58>] path_openat.isra.63+0xb8/0x4d0
       [<ffffffff81115a8e>] ? might_fault+0x4e/0xa0
       [<ffffffff8115f4f0>] do_filp_open+0x40/0x90
       [<ffffffff813cea6c>] ? _raw_spin_unlock+0x2c/0x50
       [<ffffffff8116db85>] ? __alloc_fd+0xa5/0x1f0
       [<ffffffff8114e45f>] do_sys_open+0xef/0x1d0
       [<ffffffff8114e559>] SyS_open+0x19/0x20
       [<ffffffff813cff16>] system_call_fastpath+0x1a/0x1f
      Code: 44 00 00 55 48 89 e5 41 57 49 89 ff 41 56 41 89 d6 41 55 41 54 4c 8d 67 18 53 48 83 ec 18 89 75 cc e9 f2 00 00 00 0f 1f 44 00 00 <48> 8b 80 40 03 00 00 48 89 df 4c 8b 68 58 e8 d5
      a4 07 00 44 89
      RIP  [<ffffffff8118b188>] __blkdev_get+0x28/0x450
       RSP <ffff88023e47dbd8>
      CR2: 00000000000002b4
      ---[ end trace bb7f32dbf02398dc ]---
      
      The brd change should be backported to stable kernels starting with 2.6.25.
      The loop change should be backported to stable kernels starting with 2.6.22.
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Cc: stable@kernel.org	# 2.6.22+
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      a207f593
  18. 25 5月, 2013 1 次提交
  19. 24 3月, 2013 1 次提交
    • K
      block: Add bio_end_sector() · f73a1c7d
      Kent Overstreet 提交于
      Just a little convenience macro - main reason to add it now is preparing
      for immutable bio vecs, it'll reduce the size of the patch that puts
      bi_sector/bi_size/bi_idx into a struct bvec_iter.
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      CC: Jens Axboe <axboe@kernel.dk>
      CC: Lars Ellenberg <drbd-dev@lists.linbit.com>
      CC: Jiri Kosina <jkosina@suse.cz>
      CC: Alasdair Kergon <agk@redhat.com>
      CC: dm-devel@redhat.com
      CC: Neil Brown <neilb@suse.de>
      CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
      CC: Heiko Carstens <heiko.carstens@de.ibm.com>
      CC: linux-s390@vger.kernel.org
      CC: Chris Mason <chris.mason@fusionio.com>
      CC: Steven Whitehouse <swhiteho@redhat.com>
      Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
      f73a1c7d
  20. 20 3月, 2012 1 次提交
  21. 04 1月, 2012 1 次提交
  22. 12 9月, 2011 1 次提交
  23. 27 5月, 2011 5 次提交
    • N
      brd: export module parameters · 8892cbaf
      Namhyung Kim 提交于
      Export 'rd_nr', 'rd_size' and 'max_part' parameters to sysfs so user can
      know that how many devices are allowed, how big each device is and how
      many partitions are supported. If 'max_part' is 0, it means simply the
      device doesn't support partitioning.
      
      Also note that 'max_part' can be adjusted to power of 2 minus 1 form if
      needed. User should check this value after the module loading if he/she
      want to use that number correctly (i.e. fdisk, mknod, etc.).
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      8892cbaf
    • N
      brd: fix comment on initial device creation · 13868b76
      Namhyung Kim 提交于
      If 'rd_nr' param was not specified, 16 (can be adjusted via
      CONFIG_BLK_DEV_RAM_COUNT) devices would be created by default
      but comment said 1. Fix it.
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      13868b76
    • N
      brd: handle on-demand devices correctly · af465668
      Namhyung Kim 提交于
      When finding or allocating a ram disk device, brd_probe() did not take
      partition numbers into account so that it can result to a different
      device. Consider following example (I set CONFIG_BLK_DEV_RAM_COUNT=4
      for simplicity) :
      
      $ sudo modprobe brd max_part=15
      $ ls -l /dev/ram*
      brw-rw---- 1 root disk 1,  0 2011-05-25 15:41 /dev/ram0
      brw-rw---- 1 root disk 1, 16 2011-05-25 15:41 /dev/ram1
      brw-rw---- 1 root disk 1, 32 2011-05-25 15:41 /dev/ram2
      brw-rw---- 1 root disk 1, 48 2011-05-25 15:41 /dev/ram3
      $ sudo mknod /dev/ram4 b 1 64
      $ sudo dd if=/dev/zero of=/dev/ram4 bs=4k count=256
      256+0 records in
      256+0 records out
      1048576 bytes (1.0 MB) copied, 0.00215578 s, 486 MB/s
      namhyung@leonhard:linux$ ls -l /dev/ram*
      brw-rw---- 1 root disk 1,    0 2011-05-25 15:41 /dev/ram0
      brw-rw---- 1 root disk 1,   16 2011-05-25 15:41 /dev/ram1
      brw-rw---- 1 root disk 1,   32 2011-05-25 15:41 /dev/ram2
      brw-rw---- 1 root disk 1,   48 2011-05-25 15:41 /dev/ram3
      brw-r--r-- 1 root root 1,   64 2011-05-25 15:45 /dev/ram4
      brw-rw---- 1 root disk 1, 1024 2011-05-25 15:44 /dev/ram64
      
      After this patch, /dev/ram4 - instead of /dev/ram64 - was
      accessed correctly.
      
      In addition, 'range' passed to blk_register_region() should
      include all range of dev_t that RAMDISK_MAJOR can address.
      It does not need to be limited by partition numbers unless
      'rd_nr' param was specified.
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: stable@kernel.org
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      af465668
    • N
      brd: limit 'max_part' module param to DISK_MAX_PARTS · 315980c8
      Namhyung Kim 提交于
      The 'max_part' parameter controls the number of maximum partition
      a brd device can have. However if a user specifies very large
      value it would exceed the limitation of device minor number and
      can cause a kernel panic (or, at least, produce invalid device
      nodes in some cases).
      
      On my desktop system, following command kills the kernel. On qemu,
      it triggers similar oops but the kernel was alive:
      
      $ sudo modprobe brd max_part=100000
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
       IP: [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
       PGD 7af1067 PUD 7b19067 PMD 0
       Oops: 0000 [#1] SMP
       last sysfs file:
       CPU 0
       Modules linked in: brd(+)
      
       Pid: 44, comm: insmod Tainted: G        W   2.6.39-qemu+ #158 Bochs Bochs
       RIP: 0010:[<ffffffff81110a9a>]  [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
       RSP: 0018:ffff880007b15d78  EFLAGS: 00000286
       RAX: ffff880007b05478 RBX: ffff880007a52760 RCX: ffff880007b15dc8
       RDX: ffff880007a4f900 RSI: ffff880007b15e48 RDI: ffff880007a52760
       RBP: ffff880007b15da8 R08: 0000000000000002 R09: 0000000000000000
       R10: ffff880007b15e48 R11: ffff880007b05478 R12: 0000000000000000
       R13: ffff880007b05478 R14: 0000000000400920 R15: 0000000000000063
       FS:  0000000002160880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000058 CR3: 0000000007b1c000 CR4: 00000000000006b0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
       Process insmod (pid: 44, threadinfo ffff880007b14000, task ffff880007acb980)
       Stack:
        ffff880007b15dc8 ffff880007b05478 ffff880007b15da8 00000000fffffffe
        ffff880007a52760 ffff880007b05478 ffff880007b15de8 ffffffff81143c0a
        0000000000400920 ffff880007a52760 ffff880007b05478 0000000000000000
       Call Trace:
        [<ffffffff81143c0a>] kobject_add_internal+0xdf/0x1a0
        [<ffffffff81143da1>] kobject_add_varg+0x41/0x50
        [<ffffffff81143e6b>] kobject_add+0x64/0x66
        [<ffffffff8113bbe7>] blk_register_queue+0x5f/0xb8
        [<ffffffff81140f72>] add_disk+0xdf/0x289
        [<ffffffffa00040df>] brd_init+0xdf/0x1aa [brd]
        [<ffffffffa0004000>] ? 0xffffffffa0003fff
        [<ffffffffa0004000>] ? 0xffffffffa0003fff
        [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
        [<ffffffff8108516c>] sys_init_module+0x9c/0x1dc
        [<ffffffff812ff4bb>] system_call_fastpath+0x16/0x1b
       Code: 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 18 48 85 ff 75 04 0f 0b eb fe 48 8b 47 18 49 c7 c4 70 1e 4d 81 48 85 c0 74 04 4c 8b 60 30
        8b 44 24 58 45 31 ed 0f b6 c4 85 c0 74 0d 48 8b 43 28 48 89
       RIP  [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
        RSP <ffff880007b15d78>
       CR2: 0000000000000058
       ---[ end trace aebb1175ce1f6739 ]---
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: stable@kernel.org
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      315980c8
    • N
      brd: get rid of unused members from struct brd_device · a2cba291
      Namhyung Kim 提交于
      brd_refcnt, brd_offset, brd_sizelimit and brd_blocksize in struct
      brd_device seem to be copied from struct loop_device but they're
      not used anywhere. Let get rid of them.
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      a2cba291
  24. 05 10月, 2010 1 次提交
    • A
      block: autoconvert trivial BKL users to private mutex · 2a48fc0a
      Arnd Bergmann 提交于
      The block device drivers have all gained new lock_kernel
      calls from a recent pushdown, and some of the drivers
      were already using the BKL before.
      
      This turns the BKL into a set of per-driver mutexes.
      Still need to check whether this is safe to do.
      
      file=$1
      name=$2
      if grep -q lock_kernel ${file} ; then
          if grep -q 'include.*linux.mutex.h' ${file} ; then
                  sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
          else
                  sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
          fi
          sed -i ${file} \
              -e "/^#include.*linux.mutex.h/,$ {
                      1,/^\(static\|int\|long\)/ {
                           /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);
      
      } }"  \
          -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
          -e '/[      ]*cycle_kernel_lock();/d'
      else
          sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                      -e '/cycle_kernel_lock()/d'
      fi
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      2a48fc0a
  25. 10 9月, 2010 2 次提交
    • T
      block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() · 4913efe4
      Tejun Heo 提交于
      Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
      requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
      -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
      blk_queue_flush().
      
      blk_queue_flush() takes combinations of REQ_FLUSH and FUA.  If a
      device has write cache and can flush it, it should set REQ_FLUSH.  If
      the device can handle FUA writes, it should also set REQ_FUA.
      
      All blk_queue_ordered() users are converted.
      
      * ORDERED_DRAIN is mapped to 0 which is the default value.
      * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
      * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NBoaz Harrosh <bharrosh@panasas.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      4913efe4
    • T
      block: kill QUEUE_ORDERED_BY_TAG · 6958f145
      Tejun Heo 提交于
      Nobody is making meaningful use of ORDERED_BY_TAG now and queue
      draining for barrier requests will be removed soon which will render
      the advantage of tag ordering moot.  Kill ORDERED_BY_TAG.  The
      following users are affected.
      
      * brd: converted to ORDERED_DRAIN.
      * virtio_blk: ORDERED_TAG path was already marked deprecated.  Removed.
      * xen-blkfront: ORDERED_TAG case dropped.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      6958f145
  26. 08 8月, 2010 3 次提交
  27. 01 6月, 2010 1 次提交
  28. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  29. 26 2月, 2010 1 次提交