1. 11 Feb, 2014 10 commits
    • E
      fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem · 96c7a2ff
      Eric W. Biederman committed
      Recently due to a spike in connections per second memcached on 3
      separate boxes triggered the OOM killer from accept.  At the time the
      OOM killer was triggered there was 4GB out of 36GB free in zone 1.  The
      problem was that alloc_fdtable was allocating an order 3 page (32KiB) to
      hold a bitmap, and there was sufficient fragmentation that the largest
      page available was 8KiB.
      
      I find the logic that PAGE_ALLOC_COSTLY_ORDER can't fail pretty dubious
      but I do agree that order 3 allocations are very likely to succeed.
      
      There are always pathologies where order > 0 allocations can fail when
      there are copious amounts of free memory available.  Using the pigeon
      hole principle it is easy to show that it requires 1 page more than 50%
      of the pages being free to guarantee an order 1 (8KiB) allocation will
      succeed, 1 page more than 75% of the pages being free to guarantee an
      order 2 (16KiB) allocation will succeed and 1 page more than 87.5% of
      the pages being free to guarantee an order 3 (32KiB) allocation will succeed.
      
      A server churning memory with a lot of small requests and replies, like
      memcached, is a common case that, if anything, will skew the odds
      against large pages being available.
      
      Therefore let's not give external applications a practical way to kill
      linux server applications, and specify __GFP_NORETRY to the kmalloc in
      alloc_fdmem.  Unless I am misreading the code, by the time the code
      reaches should_alloc_retry in __alloc_pages_slowpath (where
      __GFP_NORETRY becomes significant), we have already tried everything
      reasonable to allocate a page and the only thing left to do is wait.  So
      not waiting and falling back to vmalloc immediately seems like the
      reasonable thing to do even if there wasn't a chance of triggering the
      OOM killer.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Cong Wang <cwang@twopensource.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
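The pigeonhole argument in the commit message above can be checked with a small userspace model. This is only an illustrative sketch, not kernel code: it assumes 4KiB pages, treats memory as a flat list of page slots, and the helper names (`worst_case_free_fraction`, `worst_case_layout`, `has_free_block`) are made up for this example.

```python
def worst_case_free_fraction(order):
    """Largest free fraction that can still leave no free order-`order` block.

    Worst case: every aligned block of 2**order pages has exactly one page
    in use, so (2**order - 1) / 2**order of memory is free, yet no aligned
    block is entirely free. One more free page breaks this layout.
    """
    block = 2 ** order
    return (block - 1) / block


def worst_case_layout(total_pages, order):
    """Return a list of 'used' flags with one used page per aligned block."""
    block = 2 ** order
    return [i % block == 0 for i in range(total_pages)]


def has_free_block(used, order):
    """True if some aligned run of 2**order pages is entirely free."""
    block = 2 ** order
    starts = range(0, len(used) - block + 1, block)
    return any(not any(used[s:s + block]) for s in starts)


if __name__ == "__main__":
    for order in (1, 2, 3):
        layout = worst_case_layout(1024, order)
        free = layout.count(False) / len(layout)
        print(f"order {order}: {free:.1%} free, "
              f"allocation possible: {has_free_block(layout, order)}")
```

Freeing a single additional page in any of these layouts makes one aligned block entirely free, which is the "1 page more than" threshold the message refers to.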
    • M
      xen: properly account for _PAGE_NUMA during xen pte translations · a9c8e4be
      Mel Gorman committed
      Steven Noonan forwarded a user's report where they had a problem starting
      vsftpd on a Xen paravirtualized guest, with this in dmesg:
      
        BUG: Bad page map in process vsftpd  pte:8000000493b88165 pmd:e9cc01067
        page:ffffea00124ee200 count:0 mapcount:-1 mapping:     (null) index:0x0
        page flags: 0x2ffc0000000014(referenced|dirty)
        addr:00007f97eea74000 vm_flags:00100071 anon_vma:ffff880e98f80380 mapping:          (null) index:7f97eea74
        CPU: 4 PID: 587 Comm: vsftpd Not tainted 3.12.7-1-ec2 #1
        Call Trace:
          dump_stack+0x45/0x56
          print_bad_pte+0x22e/0x250
          unmap_single_vma+0x583/0x890
          unmap_vmas+0x65/0x90
          exit_mmap+0xc5/0x170
          mmput+0x65/0x100
          do_exit+0x393/0x9e0
          do_group_exit+0xcc/0x140
          SyS_exit_group+0x14/0x20
          system_call_fastpath+0x1a/0x1f
        Disabling lock debugging due to kernel taint
        BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:0 val:-1
        BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:1 val:1
      
      The issue could not be reproduced under an HVM instance with the same
      kernel, so it appears to be exclusive to paravirtual Xen guests.  He
      bisected the problem to commit 1667918b ("mm: numa: clear numa
      hinting information on mprotect") that was also included in 3.12-stable.
      
      The problem was related to how xen translates ptes because it was not
      accounting for the _PAGE_NUMA bit.  This patch splits pte_present to add
      a pteval_present helper for use by xen so both bare metal and xen use
      the same code when checking if a PTE is present.
      
      [mgorman@suse.de: wrote changelog, proposed minor modifications]
      [akpm@linux-foundation.org: fix typo in comment]
      Reported-by: Steven Noonan <steven@uplinklabs.net>
      Tested-by: Steven Noonan <steven@uplinklabs.net>
      Signed-off-by: Elena Ufimtseva <ufimtseva@gmail.com>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: <stable@vger.kernel.org>	[3.12+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
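To make the pte value in the dmesg output above easier to read, the sketch below decodes its flag bits using the architectural x86-64 page-table bit positions. The second half models why a "present" check that ignores a software bit can disagree with one that includes it. It is a hedged illustration only: `HYPOTHETICAL_SOFT_BIT` is a placeholder, since the bit the kernel uses for _PAGE_NUMA depends on the kernel version, and this `pteval_present` is a Python stand-in that merely borrows the name of the helper mentioned in the message.

```python
# Architectural x86-64 PTE flag bits (bit position -> conventional name).
X86_PTE_BITS = {
    0: "PRESENT", 1: "RW", 2: "USER", 3: "PWT", 4: "PCD",
    5: "ACCESSED", 6: "DIRTY", 7: "PAT", 8: "GLOBAL", 63: "NX",
}

PAGE_PRESENT = 1 << 0
# ASSUMPTION: placeholder for a software-defined bit such as _PAGE_NUMA.
HYPOTHETICAL_SOFT_BIT = 1 << 8


def decode_pte_flags(pte):
    """Return the names of the flag bits set in a raw 64-bit PTE value."""
    return [name for bit, name in sorted(X86_PTE_BITS.items())
            if (pte >> bit) & 1]


def pteval_present(pteval, present_mask):
    """Model of a shared presence check: any bit of the mask counts."""
    return (pteval & present_mask) != 0


if __name__ == "__main__":
    print(decode_pte_flags(0x8000000493B88165))
    hinting_pte = HYPOTHETICAL_SOFT_BIT  # hardware PRESENT bit is clear
    print(pteval_present(hinting_pte, PAGE_PRESENT))
    print(pteval_present(hinting_pte, PAGE_PRESENT | HYPOTHETICAL_SOFT_BIT))
```

Two code paths that use different masks will classify the same entry differently, which is the kind of mismatch a single shared helper avoids.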
    • D
      mm/slub.c: list_lock may not be held in some circumstances · 255d0884
      David Rientjes committed
      Commit c65c1877 ("slub: use lockdep_assert_held") incorrectly
      required that add_full() and remove_full() hold n->list_lock.  The lock
      is only taken when kmem_cache_debug(s), since that's the only time it
      actually does anything.
      
      Require that the lock only be taken under such a condition.
      Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
      Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
      Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • G
      drivers/md/bcache/extents.c: use %zi to format size_t · bd180b4e
      Geert Uytterhoeven committed
        drivers/md/bcache/extents.c: In function `btree_ptr_bad_expensive':
        drivers/md/bcache/extents.c:196: warning: format `%li' expects type `long int', but argument 4 has type `size_t'
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • G
      vmcore: prevent PT_NOTE p_memsz overflow during header update · 38dfac84
      Greg Pearson committed
      Currently, update_note_header_size_elf64() and
      update_note_header_size_elf32() will add the size of a PT_NOTE entry to
      real_sz even if that causes real_sz to exceed max_sz.  This patch
      corrects the while loop logic in those routines to ensure that does not
      happen and prints a warning if a PT_NOTE entry is dropped.  If zero
      PT_NOTE entries are found or this condition is encountered because the
      only entry was dropped, a warning is printed and an error is returned.
      
      One possible negative side effect of exceeding the max_sz limit is an
      allocation failure in merge_note_headers_elf64() or
      merge_note_headers_elf32() which would produce console output such as
      the following while booting the crash kernel.
      
        vmalloc: allocation failure: 14076997632 bytes
        swapper/0: page allocation failure: order:0, mode:0x80d2
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-gbp1 #7
        Call Trace:
          dump_stack+0x19/0x1b
          warn_alloc_failed+0xf0/0x160
          __vmalloc_node_range+0x19e/0x250
          vmalloc_user+0x4c/0x70
          merge_note_headers_elf64.constprop.9+0x116/0x24a
          vmcore_init+0x2d4/0x76c
          do_one_initcall+0xe2/0x190
          kernel_init_freeable+0x17c/0x207
          kernel_init+0xe/0x180
          ret_from_fork+0x7c/0xb0
      
        Kdump: vmcore not initialized
      
        kdump: dump target is /dev/sda4
        kdump: saving to /sysroot//var/crash/127.0.0.1-2014.01.28-13:58:52/
        kdump: saving vmcore-dmesg.txt
        Cannot open /proc/vmcore: No such file or directory
        kdump: saving vmcore-dmesg.txt failed
        kdump: saving vmcore
        kdump: saving vmcore failed
      
      This type of failure has been seen on a four socket prototype system
      with certain memory configurations.  Most PT_NOTE sections have a single
      entry similar to:
      
        n_namesz = 0x5
        n_descsz = 0x150
        n_type   = 0x1
      
      Occasionally, a second entry is encountered with very large n_namesz and
      n_descsz sizes:
      
        n_namesz = 0x80000008
        n_descsz = 0x510ae163
        n_type   = 0x80000008
      
      The source of these extra entries is not yet known; they seem bogus, but
      they shouldn't cause the crash dump to fail.
      Signed-off-by: Greg Pearson <greg.pearson@hp.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
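The bounded accumulation described in the commit message above can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel routine: each note is reduced to an `(n_namesz, n_descsz)` pair, name and descriptor sizes are padded to 4 bytes as is conventional for ELF notes, the 12-byte header size corresponds to three 32-bit fields, and the `max_sz` of 4096 in the demo is an arbitrary illustrative limit. The function names are hypothetical.

```python
NHDR_SIZE = 12  # three 32-bit fields: n_namesz, n_descsz, n_type


def align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def note_size(n_namesz, n_descsz):
    """Size in bytes of one note entry, including header and padding."""
    return NHDR_SIZE + align4(n_namesz) + align4(n_descsz)


def bounded_real_size(notes, max_sz):
    """Sum note sizes, stopping before the total would exceed max_sz.

    Returns (real_sz, dropped). Raises ValueError when nothing usable is
    left, modelling the error return described in the commit message.
    """
    real_sz = 0
    dropped = False
    for n_namesz, n_descsz in notes:
        if n_namesz == 0:  # assumed end-of-notes marker
            break
        sz = note_size(n_namesz, n_descsz)
        if real_sz + sz > max_sz:
            dropped = True  # the real code prints a warning here
            break
        real_sz += sz
    if real_sz == 0:
        raise ValueError("no usable PT_NOTE entries")
    return real_sz, dropped


if __name__ == "__main__":
    # The ordinary entry and the bogus entry quoted in the message.
    notes = [(0x5, 0x150), (0x80000008, 0x510AE163)]
    print(bounded_real_size(notes, max_sz=4096))
```

With the two entries quoted in the message, the ordinary entry is counted and the oversized one is dropped instead of inflating the total to several gigabytes.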
    • A
      drivers/message/i2o/i2o_config.c: fix deadlock in compat_ioctl(I2OGETIOPS) · a3eb7fbb
      Alexey Khoroshilov committed
      i2o_cfg_compat_ioctl(I2OGETIOPS) locks i2o_cfg_mutex and then calls
      i2o_cfg_ioctl(I2OGETIOPS) that locks i2o_cfg_mutex as well.  A deadlock
      is guaranteed.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
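The self-deadlock pattern described above (an outer handler takes a non-recursive mutex, then calls an inner handler that takes the same mutex) can be reproduced in a few lines of userspace Python. This is an analogy, not the driver code: `cfg_mutex`, `inner_ioctl`, `compat_ioctl_buggy` and `compat_ioctl_fixed` are hypothetical names, a timeout is used so the demo reports the problem instead of hanging, and the "fixed" variant shows only one possible shape of a fix, not necessarily what the commit does.

```python
import threading

cfg_mutex = threading.Lock()  # non-recursive, like a plain kernel mutex


def inner_ioctl(timeout):
    """Handler that takes cfg_mutex itself."""
    if not cfg_mutex.acquire(timeout=timeout):
        return "would deadlock"  # a real blocking lock would hang forever
    try:
        return "ok"
    finally:
        cfg_mutex.release()


def compat_ioctl_buggy(timeout=0.1):
    """Takes the mutex, then calls a handler that takes it again."""
    with cfg_mutex:
        return inner_ioctl(timeout)


def compat_ioctl_fixed(timeout=0.1):
    """Delegates without holding the mutex, so the inner handler can lock."""
    return inner_ioctl(timeout)


if __name__ == "__main__":
    print(compat_ioctl_buggy())
    print(compat_ioctl_fixed())
```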
    • H
      Documentation/: update 00-INDEX files · 3cf8ca1c
      Henrik Austad committed
      Some of the 00-INDEX files are somewhat outdated and some folders do
      not contain 00-INDEX at all.  Only outdated (with the notable exception
      of spi) indexes are touched here, the 169 folders without 00-INDEX have
      not been touched.
      
      New 00-INDEX
       - spi/* was added in a series of commits dating back to 2006
      
      Added files (missing in (*/)00-INDEX)
       - dmatest.txt was added by commit 851b7e16 ("dmatest: run test via
         debugfs")
       - this_cpu_ops.txt was added by commit a1b2a555 ("percpu: add
         documentation on this_cpu operations")
       - ww-mutex-design.txt was added by commit 040a0a37 ("mutex: Add
         support for wound/wait style locks")
       - bcache.txt was added by commit cafe5635 ("bcache: A block layer
         cache")
       - kernel-per-CPU-kthreads.txt was added by commit 49717cb4
         ("kthread: Document ways of reducing OS jitter due to per-CPU
         kthreads")
       - phy.txt was added by commit ff764963 ("drivers: phy: add generic
         PHY framework")
       - block/null_blk was added by commit 12f8f4fc ("null_blk:
         documentation")
       - module-signing.txt was added by commit 3cafea30 ("Add
         Documentation/module-signing.txt file")
       - assoc_array.txt was added by commit 3cb98950 ("Add a generic
         associative array implementation.")
       - arm/IXP4xx was part of the initial repo
       - arm/cluster-pm-race-avoidance.txt was added by commit 7fe31d28
         ("ARM: mcpm: introduce helpers for platform coherency exit/setup")
       - arm/firmware.txt was added by commit 7366b92a ("ARM: Add
         interface for registering and calling firmware-specific operations")
       - arm/kernel_mode_neon.txt was added by commit 2afd0a05 ("ARM:
         7825/1: document the use of NEON in kernel mode")
       - arm/tcm.txt was added by commit bc581770 ("ARM: 5580/2: ARM TCM
         (Tightly-Coupled Memory) support v3")
       - arm/vlocks.txt was added by commit 9762f12d ("ARM: mcpm: Add
         baremetal voting mutexes")
       - blackfin/gptimers-example.c, Makefile was added by commit
         4b60779d ("Blackfin: add an example showing how to use the
         gptimers API")
       - devicetree/usage-model.txt was added by commit 31134efc ("dt:
         Linux DT usage model documentation")
       - fb/api.txt was added by commit fb21c2f4 ("fbdev: Add FOURCC-based
         format configuration API")
       - fb/sm501.txt was added by commit e6a04980 ("video, sm501: add
         edid and commandline support")
       - fb/udlfb.txt was added by commit 96f8d864 ("fbdev: move udlfb out
         of staging.")
       - filesystems/Makefile was added by commit 1e0051ae
         ("Documentation/fs/: split txt and source files")
       - filesystems/nfs/nfsd-admin-interfaces.txt was added by commit
         8a4c6e19 ("nfsd: document kernel interfaces for nfsd
         configuration")
       - ide/warm-plug-howto.txt was added by commit f74c9141 ("ide: add
         warm-plug support for IDE devices (take 2)")
       - laptops/Makefile was added by commit d49129ac
         ("Documentation/laptop/: split txt and source files")
       - leds/leds-blinkm.txt was added by commit b54cf35a ("LEDS: add
         BlinkM RGB LED driver, documentation and update MAINTAINERS")
       - leds/ledtrig-oneshot.txt was added by commit 5e417281 ("leds: add
         oneshot trigger")
       - leds/ledtrig-transient.txt was added by commit 44e1e9f8 ("leds:
         add new transient trigger for one shot timer activation")
       - m68k/README.buddha was part of the initial repo
       - networking/LICENSE.(qla3xxx|qlcnic|qlge) was added by commits
         40839129, c4e84bde, 5a4faa87
       - networking/Makefile was added by commit 3794f3e8 ("docsrc: build
         Documentation/ sources")
       - networking/i40evf.txt was added by commit 105bf2fe ("i40evf: add
         driver to kernel build system")
       - networking/ipsec.txt was added by commit b3c6efbc ("xfrm: Add
         file to document IPsec corner case")
       - networking/mac80211-auth-assoc-deauth.txt was added by commit
         3cd7920a ("mac80211: add auth/assoc/deauth flow diagram")
       - networking/netlink_mmap.txt was added by commit 5683264c
         ("netlink: add documentation for memory mapped I/O")
       - networking/nf_conntrack-sysctl.txt was added by commit c9f9e0e1
         ("netfilter: doc: add nf_conntrack sysctl api documentation")
       - networking/team.txt was added by commit 3d249d4c ("net: introduce
         ethernet teaming device")
       - networking/vxlan.txt was added by commit d342894c ("vxlan:
         virtual extensible lan")
       - power/runtime_pm.txt was added by commit 5e928f77 ("PM: Introduce
         core framework for run-time PM of I/O devices (rev.  17)")
       - power/charger-manager.txt was added by commit 3bb3dbbd
         ("power_supply: Add initial Charger-Manager driver")
       - RCU/lockdep-splat.txt was added by commit d7bd2d68 ("rcu:
         Document interpretation of RCU-lockdep splats")
       - s390/kvm.txt was added by 5ecee4ba (KVM: s390: API documentation)
       - s390/qeth.txt was added by commit b4d72c08 ("qeth: bridgeport
         support - basic control")
       - scheduler/sched-bwc.txt was added by commit 88ebc08e ("sched: Add
         documentation for bandwidth control")
       - scsi/advansys.txt was added by commit 4bd6d7f3 ("[SCSI] advansys:
         Move documentation to Documentation/scsi")
       - scsi/bfa.txt was added by commit 1ec90174 ("[SCSI] bfa: add
         readme file")
       - scsi/bnx2fc.txt was added by commit 12b8fc10 ("[SCSI] bnx2fc: Add
         driver documentation")
       - scsi/cxgb3i.txt was added by commit c3673464 ("[SCSI] cxgb3i: Add
         cxgb3i iSCSI driver.")
       - scsi/hpsa.txt was added by commit 992ebcf1 ("[SCSI] hpsa: Add
         hpsa.txt to Documentation/scsi")
       - scsi/link_power_management_policy.txt was added by commit
         ca77329f ("[libata] Link power management infrastructure")
       - scsi/osd.txt was added by commit 78e0c621 ("[SCSI] osd:
         Documentation for OSD library")
       - scsi/scsi-parameter.txt was created/moved by commit 163475fb
         ("Documentation: move SCSI parameters to their own text file")
       - serial/driver was part of the initial repo
       - serial/n_gsm.txt was added by commit 323e8412 ("n_gsm: add a
         documentation")
       - timers/Makefile was added by commit 3794f3e8 ("docsrc: build
         Documentation/ sources")
       - virt/kvm/s390.txt was added by commit d9101fca ("KVM: s390:
         diagnose call documentation")
       - vm/split_page_table_lock was added by commit 49076ec2 ("mm:
         dynamically allocate page->ptl if it cannot be embedded to struct
         page")
       - w1/slaves/w1_ds28e04 was added by commit fbf7f7b4 ("w1: Add
         1-wire slave device driver for DS28E04-100")
       - w1/masters/omap-hdq was added by commit e0a29382 ("hdq:
         documentation for OMAP HDQ")
       - x86/early-microcode.txt was added by commit 0d91ea86 ("x86, doc:
         Documentation for early microcode loading")
       - x86/earlyprintk.txt was added by commit a1aade47 ("x86/doc:
         mini-howto for using earlyprintk=dbgp")
       - x86/entry_64.txt was added by commit 8b4777a4 ("x86-64: Document
         some of entry_64.S")
       - x86/pat.txt was added by commit d27554d8 ("x86: PAT
         documentation")
      
      Moved files
       - arm/kernel_user_helpers.txt was moved out of arch/arm/kernel by
         commit 37b83046 ("ARM: kuser: move interface documentation out of
         the source code")
       - efi-stub.txt was moved out of x86/ and down into Documentation/ in
         commit 4172fe2f ("EFI stub documentation updates")
       - laptops/hpfall.c was moved out of hwmon/ and into laptops/ in commit
         efcfed9b ("Move hp_accel to drivers/platform/x86")
       - commit 5616c23a ("x86: doc: move x86-generic documentation from
         Doc/x86/i386"):
         * x86/usb-legacy-support.txt
         * x86/boot.txt
         * x86/zero_page.txt
       - power/video_extension.txt was moved to acpi in commit 70e66e4d
         ("ACPI / video: move video_extension.txt to Documentation/acpi")
      
      Removed files (left in 00-INDEX)
       - memory.txt was removed by commit 00ea8990 ("memory.txt: remove
         stray information")
       - gpio.txt was moved to gpio/ in commit fd8e198c ("Documentation:
         gpiolib: document new interface")
       - networking/DLINK.txt was removed by commit 168e06ae
         ("drivers/net: delete old parallel port de600/de620 drivers")
       - serial/hayes-esp.txt was removed by commit f53a2ade ("tty: esp:
         remove broken driver")
       - s390/TAPE was removed by commit 9e280f66 ("[S390] remove tape
         block docu")
       - vm/locking was removed by commit 57ea8171 ("mm: documentation:
         remove hopelessly out-of-date locking doc")
       - laptops/acer-wmi.txt was removed by commit 02003667 ("acer-wmi:
         Delete out-of-date documentation")
      
      Typos/misc issues
       - rpc-server-gss.txt was added as knfsd-rpcgss.txt in commit
         030d794b ("SUNRPC: Use gssproxy upcall for server RPCGSS
         authentication.")
       - commit b88cf73d ("net: add missing entries to
         Documentation/networking/00-INDEX")
         * generic-hdlc.txt was added as generic_hdlc.txt
         * spider_net.txt was added as spider-net.txt
       - w1/master/mxc-w1 was added as mxc_w1 by commit a5fd9139 ("w1: add
         1-wire master driver for i.MX27 / i.MX31")
       - s390/zfcpdump.txt was added as zfcpdump by commit 6920c12a
         ("[S390] Add Documentation/s390/00-INDEX.")
      Signed-off-by: Henrik Austad <henrik@austad.us>
      Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	[rcu bits]
      Acked-by: Rob Landley <rob@landley.net>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: James Bottomley <JBottomley@parallels.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      checkpatch: fix detection of git repository · 3645e328
      Richard Genoud committed
      Since git v1.7.7, the .git directory can be a file when, for example,
      the kernel is a submodule of another git super project.  So, the check
      "-d .git" is not working anymore in this case.  Using a more generic
      check like "-e .git" corrects this behaviour.
      Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      get_maintainer: fix detection of git repository · ec83b616
      Richard Genoud committed
      Since git v1.7.7, the .git directory can be a file when, for example,
      the kernel is a submodule of another git super project.  So, the check
      "-d .git" is not working anymore in this case.  Using a more generic
      check like "-e .git" corrects this behaviour.
      Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
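The difference between the "-d .git" and "-e .git" checks described in the two commits above can be demonstrated outside Perl. The sketch below is a Python analogy, not the scripts' actual code: `os.path.isdir` plays the role of `-d`, `os.path.exists` plays the role of `-e`, and the `gitdir:` path written into the file is a made-up example of what a submodule checkout might contain.

```python
import os
import tempfile


def is_git_tree_dir_only(path):
    """Old-style check, analogous to Perl's `-d .git`."""
    return os.path.isdir(os.path.join(path, ".git"))


def is_git_tree(path):
    """Relaxed check, analogous to Perl's `-e .git`."""
    return os.path.exists(os.path.join(path, ".git"))


def make_submodule_like_tree(root):
    """Create a tree whose .git is a regular file pointing elsewhere."""
    with open(os.path.join(root, ".git"), "w") as f:
        # Hypothetical target path, for illustration only.
        f.write("gitdir: ../.git/modules/linux\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_submodule_like_tree(root)
        print("directory-only check:", is_git_tree_dir_only(root))
        print("existence check:", is_git_tree(root))
```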
    • D
      drivers/misc/sgi-gru/grukdump.c: unlocking should be conditional in gru_dump_context() · 49d3d6c3
      Dan Carpenter committed
      I was reviewing this and noticed that unlocking should be conditional on
      the error path.  I've changed it to unlock and return directly since we
      only do it once and it seems unlikely to change in the near future.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Dimitri Sivanich <sivanich@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 10 Feb, 2014 8 commits
  3. 09 Feb, 2014 10 commits
    • F
      Btrfs: fix data corruption when reading/updating compressed extents · a2aa75e1
      Filipe David Borba Manana committed
      When using a mix of compressed file extents and prealloc extents, it
      is possible to fill a page of a file with random, garbage data from
      some unrelated previous use of the page, instead of a sequence of zeroes.
      
      A simple sequence of steps to get into such case, taken from the test
      case I made for xfstests, is:
      
         _scratch_mkfs
         _scratch_mount "-o compress-force=lzo"
         $XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
      
      This results in the following file items in the fs tree:
      
         item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
             inode generation 6 transid 6 size 542872 block group 0 mode 100600
         item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16
             inode ref index 2 namelen 6 name: foobar
         item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
             extent data disk byte 0 nr 0 gen 6
             extent data offset 0 nr 24576 ram 266240
             extent compression 0
         item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53
             prealloc data disk byte 12849152 nr 241664 gen 6
             prealloc data offset 0 nr 241664
         item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53
             extent data disk byte 12845056 nr 4096 gen 6
             extent data offset 0 nr 20480 ram 20480
             extent compression 2
         item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53
             prealloc data disk byte 13090816 nr 405504 gen 6
             prealloc data offset 0 nr 258048
      
      The on disk extent at offset 266240 (which corresponds to 1 single disk block),
      contains 5 compressed chunks of file data. Each of the first 4 compress 4096
      bytes of file data, while the last one only compresses 3024 bytes of file data.
      Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 =
      1072 bytes) should always return zeroes (our next extent is a prealloc one).
      
      The solution here is for the compression code path to zero the remaining (untouched)
      bytes of the last page it uncompressed data into, as the information about how
      much space the file data consumes in the last page is not known in the upper layer
      fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing
      the remainder of the page but only if it corresponds to the last page of the inode
      and if the inode's size is not a multiple of the page size.
      
      This would cause not only returning random data on reads, but also permanently
      storing random data when updating parts of the region that should be zeroed.
      For the example above, it means updating a single byte in the region [285648 ; 286720[
      would store that byte correctly but also store random data on disk.
      
      A test case for xfstests follows soon.
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
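The byte ranges quoted in the commit message above follow from the offsets in the test sequence, and the fix amounts to zeroing the tail of the last decompressed page. The sketch below reproduces that arithmetic and models the zeroing on a plain buffer. It is an illustration only: the 4096-byte page size matches the message, but `tail_to_zero` and `zero_page_tail` are hypothetical helpers, not Btrfs functions.

```python
PAGE_SIZE = 4096

# Values taken from the commit message.
EXTENT_START = 266240                   # file offset of the compressed extent
WRITE_OFFSET, WRITE_LEN = 266978, 18670 # the pwrite in the test sequence


def data_end():
    """File offset just past the last byte written by the pwrite."""
    return WRITE_OFFSET + WRITE_LEN


def tail_to_zero(end, page_size=PAGE_SIZE):
    """Bytes between the end of valid data and the end of its page."""
    return (-end) % page_size


def zero_page_tail(page, valid_bytes):
    """Zero everything in `page` after the first `valid_bytes` bytes."""
    page[valid_bytes:] = bytes(len(page) - valid_bytes)


if __name__ == "__main__":
    end = data_end()
    in_extent = end - EXTENT_START
    print("data ends at", end)
    print("full pages:", in_extent // PAGE_SIZE,
          "bytes in last page:", in_extent % PAGE_SIZE)
    print("bytes that must read as zero:", tail_to_zero(end))

    page = bytearray(b"\xaa" * PAGE_SIZE)  # stale contents from earlier use
    zero_page_tail(page, in_extent % PAGE_SIZE)
    print("tail is all zero:", not any(page[in_extent % PAGE_SIZE:]))
```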
    • J
      Btrfs: don't loop forever if we can't run because of the tree mod log · 27a377db
      Josef Bacik committed
      A user reported a 100% cpu hang with my new delayed ref code.  Turns out I
      forgot to increase the count check when we can't run a delayed ref because of
      the tree mod log.  If we can't run any delayed refs during this there is no
      point in continuing to look, and we need to break out.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • D
      btrfs: reserve no transaction units in btrfs_ioctl_set_features · 8051aa1a
      David Sterba committed
      Added in patch "btrfs: add ioctls to query/change feature bits online".
      Modifications to the superblock don't need to reserve metadata blocks when
      starting a transaction.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Chris Mason <clm@fb.com>
    • J
      btrfs: commit transaction after setting label and features · d0270aca
      Jeff Mahoney committed
      The set_fslabel ioctl uses btrfs_end_transaction, which means it's
      possible that the change will be lost if the system crashes, same for
      the newly set features. Let's use btrfs_commit_transaction instead.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Chris Mason <clm@fb.com>
    • J
      Btrfs: fix assert screwup for the pending move stuff · 6cc98d90
      Josef Bacik committed
      Wang noticed that he was failing btrfs/030 even though Filipe and I couldn't
      reproduce.  Turns out this is because Wang didn't have CONFIG_BTRFS_ASSERT set,
      which meant that a key part of Filipe's original patch was not being built in.
      This appears to be a mess up with merging Filipe's patch as it does not exist in
      his original patch.  Fix this by changing how we make sure del_waiting_dir_move
      asserts that it did not error and take the function out of the ifdef check.
      This makes btrfs/030 pass with the assert on or off.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Reviewed-by: Filipe Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • L
      Merge tag 'pinctrl-v3.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 49447903
      Linus Torvalds committed
      Pull pinctrl fixes from Linus Walleij:
       "First round of pin control fixes for v3.14:
      
         - Protect pinctrl_list_add() with the proper mutex.  This was
           identified by Red Hat.  It caused nasty locking warnings and was
           root-caused by Stanislaw Gruszka.
      
         - Avoid adding dangerous debugfs files when either half of the
           subsystem is unused: pinmux or pinconf.
      
         - Various fixes to various drivers: locking, hardware particulars, DT
           parsing, error codes"
      
      * tag 'pinctrl-v3.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: tegra: return correct error type
        pinctrl: do not init debugfs entries for unimplemented functionalities
        pinctrl: protect pinctrl_list add
        pinctrl: sirf: correct the pin index of ac97_pins group
        pinctrl: imx27: fix offset calculation in imx_read_2bit
        pinctrl: vt8500: Change devicetree data parsing
        pinctrl: imx27: fix wrong offset to ICONFB
        pinctrl: at91: use locked variant of irq_set_handler
    • L
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c132adef
      Linus Torvalds committed
      Pull irq fix from Thomas Gleixner:
       "Add a missing Kconfig dependency"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Generic irq chip requires IRQ_DOMAIN
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c1ff8431
      Linus Torvalds committed
      Pull x86 fixes from Peter Anvin:
       "Quite a varied little collection of fixes.  Most of them are
        relatively small or isolated; the biggest one is Mel Gorman's fixes
        for TLB range flushing.
      
        A couple of AMD-related fixes (including not crashing when given an
        invalid microcode image) and fix a crash when compiled with gcov"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, microcode, AMD: Unify valid container checks
        x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y
        x86/efi: Allow mapping BGRT on x86-32
        x86: Fix the initialization of physnode_map
        x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable()
        x86/intel/mid: Fix X86_INTEL_MID dependencies
        arch/x86/mm/srat: Skip NUMA_NO_NODE while parsing SLIT
        mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge
        x86: mm: change tlb_flushall_shift for IvyBridge
        x86/mm: Eliminate redundant page table walk during TLB range flushing
        x86/mm: Clean up inconsistencies when flushing TLB ranges
        mm, x86: Account for TLB flushes only when debugging
        x86/AMD/NB: Fix amd_set_subcaches() parameter type
        x86/quirks: Add workaround for AMD F16h Erratum792
        x86, doc, kconfig: Fix dud URL for Microcode data
    • L
      Merge tag 'jfs-3.14-rc2' of git://github.com/kleikamp/linux-shaggy · ec2e6cb2
      Linus Torvalds committed
      Pull jfs fix from David Kleikamp:
       "Fix regression"
      
      * tag 'jfs-3.14-rc2' of git://github.com/kleikamp/linux-shaggy:
        jfs: fix generic posix ACL regression
      ec2e6cb2
    • D
      jfs: fix generic posix ACL regression · c18f7b51
      Dave Kleikamp committed
      I missed a couple errors in reviewing the patches converting jfs
      to use the generic posix ACL function. Setting ACL's currently
      fails with -EOPNOTSUPP.
      Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      Reported-by: Michael L. Semon <mlsemon35@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      c18f7b51
  4. 08 Feb, 2014 12 commits
    • R
      watchdog: dw_wdt: Add dependency on HAS_IOMEM · 1ccfe6f9
      Richard Weinberger committed
      On archs like S390 or um this driver cannot build nor work.
      Make it depend on HAS_IOMEM to bypass build failures.
      
      drivers/built-in.o: In function `dw_wdt_drv_probe':
      drivers/watchdog/dw_wdt.c:302: undefined reference to `devm_ioremap_resource'
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
      1ccfe6f9
    • L
      Merge tag 'driver-core-3.14-rc2' of... · 34a9bff4
      Linus Torvalds committed
      Merge tag 'driver-core-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fix from Greg KH:
       "Here is a single kernfs fix to resolve a much-reported lockdep issue
        with the removal of entries in sysfs"
      
      * tag 'driver-core-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag
      34a9bff4
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 41f76d8b
      Linus Torvalds committed
      Pull ceph fixes from Sage Weil:
       "There is an RBD fix for a crash due to the immutable bio changes, an
        error path fix, and a locking fix in the recent redirect support"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
        libceph: do not dereference a NULL bio pointer
        libceph: take map_sem for read in handle_reply()
        libceph: factor out logic from ceph_osdc_start_request()
        libceph: fix error handling in ceph_osdc_init()
      41f76d8b
    • L
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 42be3f35
      Linus Torvalds committed
      Pull arm64 fixes from Catalin Marinas:
       - Relax VDSO alignment requirements so that the kernel-picked one (4K)
         does not conflict with the dynamic linker's one (64K)
       - VDSO gettimeofday fix
       - Barrier fixes for atomic operations and cache flushing
       - TLB invalidation when overriding early page mappings during boot
       - Wired up new 32-bit arm (compat) syscalls
       - LSM_MMAP_MIN_ADDR when COMPAT is enabled
       - defconfig update
       - Clean-up (comments, pgd_alloc).
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: defconfig: Expand default enabled features
        arm64: asm: remove redundant "cc" clobbers
        arm64: atomics: fix use of acquire + release for full barrier semantics
        arm64: barriers: allow dsb macro to take option parameter
        security: select correct default LSM_MMAP_MIN_ADDR on arm on arm64
        arm64: compat: Wire up new AArch32 syscalls
        arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE
        arm64: vdso: fix coarse clock handling
        arm64: simplify pgd_alloc
        arm64: fix typo: s/SERRROR/SERROR/
        arm64: Invalidate the TLB when replacing pmd entries during boot
        arm64: Align CMA sizes to PAGE_SIZE
        arm64: add DSB after icache flush in __flush_icache_all()
        arm64: vdso: prevent ld from aligning PT_LOAD segments to 64k
      42be3f35
    • L
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · d94d0e27
      Linus Torvalds committed
      Pull MIPS updates from Ralf Baechle:
       "Three minor patches.  All have sat in -next for a few days"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: fpu.h: Fix build when CONFIG_BUG is not set
        MIPS: Wire up sched_setattr/sched_getattr syscalls
        MIPS: Alchemy: Fix DB1100 GPIO registration
      d94d0e27
    • L
      Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 3e382dd9
      Linus Torvalds committed
      Pull media fixes from Mauro Carvalho Chehab:
       "A series of small fixes.  Mostly driver ones.  There is one core
        regression fix on a patch that was meant to fix some race issues on
        vb2, but that actually caused more harm than good.  So, we're just
        reverting it for now"
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] adv7842: Composite free-run platfrom-data fix
        [media] v4l2-dv-timings: fix GTF calculation
        [media] hdpvr: Fix memory leak in debug
        [media] af9035: add ID [2040:f900] Hauppauge WinTV-MiniStick 2
        [media] mxl111sf: Fix compile when CONFIG_DVB_USB_MXL111SF is unset
        [media] mxl111sf: Fix unintentional garbage stack read
        [media] cx24117: use a valid dev pointer for dev_err printout
        [media] cx24117: remove dead code in always 'false' if statement
        [media] update Michael Krufky's email address
        [media] vb2: Check if there are buffers before streamon
        [media] Revert "[media] videobuf_vm_{open,close} race fixes"
        [media] go7007-loader: fix usb_dev leak
        [media] media: bt8xx: add missing put_device call
        [media] exynos4-is: Compile in fimc-lite runtime PM callbacks conditionally
        [media] exynos4-is: Compile in fimc runtime PM callbacks conditionally
        [media] exynos4-is: Fix error paths in probe() for !pm_runtime_enabled()
        [media] s5p-jpeg: Fix wrong NV12 format parameters
        [media] s5k5baf: allow to handle arbitrary long i2c sequences
      3e382dd9
    • L
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 2091f435
      Linus Torvalds committed
      Pull hwmon fixes from Guenter Roeck:
       "Fix PMBus driver problem with some multi-page voltage sensors and fix
        da9055 interrupt initialization"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (da9055) Remove use of regmap_irq_get_virq()
        hwmon: (pmbus) Support per-page exponent in linear mode
      2091f435
    • L
      Merge tag 'pm+acpi-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 22446d3f
      Linus Torvalds committed
      Pull ACPI and power management fixes from Rafael Wysocki:
       "These include a fix for a recent ACPI hotplug regression, four
        concurrency related fixes and one PCI device removal fix for
        ACPI-based PCI hotplug (ACPIPHP), intel_pstate fix that should go into
        stable, three simple ACPI cleanups and a new entry for the ACPI video
        blacklist.
      
        Specifics:
      
         - Fix for a recent ACPI hotplug regression causing a NULL pointer
           dereference to occur while handling ACPI eject notifications for
           already ejected devices.  From Toshi Kani.
      
         - Four concurrency-related fixes for ACPIPHP.  Two of them add
           missing locking and the other two fix race conditions related to
           reference counting.
      
         - ACPIPHP fix to avoid NULL pointer dereferences during device
           removal involving Virtual Functions.
      
         - intel_pstate fix to make it compute the percentage of time the CPU
           is busy properly.  From Dirk Brandewie.
      
         - Removal of two unnecessary NULL pointer checks in ACPI code and a
           fix for sscanf() format string from Dan Carpenter and Luis G.F.
      
         - New ACPI video blacklist entry for HP EliteBook Revolve 810 from
           Mika Westerberg"
      
      * tag 'pm+acpi-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / hotplug: Fix panic on eject to ejected device
        ACPI / battery: Fix incorrect sscanf() string in acpi_battery_init_alarm()
        ACPI / proc: remove unneeded NULL check
        ACPI / utils: remove a pointless NULL check
        ACPI / video: Add HP EliteBook Revolve 810 to the blacklist
        intel_pstate: Take core C0 time into account for core busy calculation
        ACPI / hotplug / PCI: Fix bridge removal race vs dock events
        ACPI / hotplug / PCI: Fix bridge removal race in handle_hotplug_event()
        ACPI / hotplug / PCI: Scan root bus under the PCI rescan-remove lock
        ACPI / hotplug / PCI: Move PCI rescan-remove locking to hotplug_event()
        ACPI / hotplug / PCI: Remove entries from bus->devices in reverse order
      22446d3f
    • I
      libceph: do not dereference a NULL bio pointer · 0ec1d15e
      Ilya Dryomov committed
      Commit f38a5181 ("ceph: Convert to immutable biovecs") introduced
      a NULL pointer dereference, which broke rbd in -rc1.  Fix it.
      
      Cc: Kent Overstreet <kmo@daterainc.com>
      Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
      0ec1d15e
    • H
      Merge tag 'efi-urgent' into x86/urgent · a3b072cd
      H. Peter Anvin committed
       * Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit).
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      a3b072cd
    • I
      libceph: take map_sem for read in handle_reply() · ff513ace
      Ilya Dryomov committed
      Handling redirect replies requires both map_sem and request_mutex.
      Taking map_sem unconditionally near the top of handle_reply() avoids
      possible race conditions that arise from releasing request_mutex to be
      able to acquire map_sem in redirect reply case.  (Lock ordering is:
      map_sem, request_mutex, crush_mutex.)
      Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
      ff513ace
    • I
      libceph: factor out logic from ceph_osdc_start_request() · 0bbfdfe8
      Ilya Dryomov committed
      Factor out logic from ceph_osdc_start_request() into a new helper,
      __ceph_osdc_start_request().  ceph_osdc_start_request() now amounts to
      taking locks and calling __ceph_osdc_start_request().
      Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
      0bbfdfe8