1. 05 Jan, 2016 2 commits
    • xfs: debug mode log record crc error injection · 609adfc2
      Authored by Brian Foster
      XFS now uses CRC verification over a limited section of the log to
      detect torn writes prior to a crash. This is difficult to test directly
      due to the timing and hardware requirements to cause a short write.
      
      Add a mechanism to inject CRC errors into log records to facilitate
      testing torn write detection during log recovery. This mechanism is
      dangerous and can result in filesystem corruption. Thus, it is only
      available in DEBUG mode for testing/development purposes. Set a non-zero
      value to the following sysfs entry to enable error injection:
      
      	/sys/fs/xfs/<dev>/log/log_badcrc_factor
      
      Once enabled, XFS intentionally writes an invalid CRC to a log record at
      some random point in the future based on the provided frequency. The
      filesystem immediately shuts down once the record has been written to
      the physical log to prevent metadata writeback (e.g., AIL insertion)
      once the log write completes. This helps reasonably simulate a torn
      write to the log as the affected record must be safe to discard. The
      next mount after the intentional shutdown requires log recovery and
      should detect and recover from the torn write.
      
      Note again that this _will_ result in data loss or worse. For testing
      and development purposes only!
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      609adfc2
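The injection policy described in 609adfc2 can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel code: `should_inject_bad_crc()`, `damage_crc()` and the XOR mask are hypothetical names and values, and the caller is assumed to supply a uniformly distributed random number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether this log record write should carry a bad CRC.
 * factor == 0 means injection is disabled; otherwise roughly one write
 * in `factor` is hit, assuming `rnd` is uniformly distributed.
 */
bool should_inject_bad_crc(uint32_t factor, uint32_t rnd)
{
    return factor != 0 && (rnd % factor) == 0;
}

/*
 * Return a CRC value that is guaranteed to differ from the valid one.
 * The mask is an arbitrary non-zero constant chosen for this sketch.
 */
uint32_t damage_crc(uint32_t crc)
{
    uint32_t bad = crc ^ 0x5A5A5A5Au;

    assert(bad != crc); /* XOR with a non-zero mask always changes the value */
    return bad;
}
```

A caller would check `should_inject_bad_crc()` just before writing a record and, when it returns true, store `damage_crc(crc)` in the header and shut the filesystem down once the write completes.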
    • xfs: detect and trim torn writes during log recovery · 7088c413
      Authored by Brian Foster
      Certain types of storage, such as persistent memory, do not provide
      sector atomicity for writes. This means that if a crash occurs while XFS
      is writing log records, only part of those records might make it to the
      storage. This is problematic because log recovery uses the cycle value
      packed at the top of each log block to locate the head/tail of the log.
      This can lead to CRC verification failures during log recovery and an
      unmountable fs for a filesystem that is otherwise consistent.
      
      Update log recovery to incorporate log record CRC verification as part
      of the head/tail discovery process. Once the head is located via the
      traditional algorithm, run a CRC-only pass over the records up to the
      head of the log. If CRC verification fails, assume that the records are
      torn as a matter of policy and trim the head block back to the start of
      the first bad record.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      7088c413
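The trimming policy in 7088c413 (run a CRC-only pass up to the head, then pull the head back to the first bad record) can be sketched as follows. The record layout, the function names and the choice of CRC32C over the payload only are assumptions made for this illustration; the real on-disk format and coverage are not described in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t crc32c(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical in-memory view of one log record. */
struct log_record {
    uint32_t start_block;   /* first block of the record */
    uint32_t stored_crc;    /* CRC read back from the record header */
    const uint8_t *payload; /* record data covered by the CRC */
    size_t payload_len;
};

/*
 * CRC-only pass over the records leading up to `head`, oldest first.
 * The first record whose CRC does not match is assumed to be torn, and
 * its start block becomes the new head. If every record verifies, the
 * original head is kept.
 */
uint32_t trim_torn_head(const struct log_record *recs, size_t nrecs,
                        uint32_t head)
{
    assert(recs != NULL || nrecs == 0);

    for (size_t i = 0; i < nrecs; i++) {
        uint32_t crc = crc32c(recs[i].payload, recs[i].payload_len);

        if (crc != recs[i].stored_crc)
            return recs[i].start_block;
    }
    return head;
}
```

Records after the first bad one are dropped even if their own CRCs verify, which matches the "assume torn as a matter of policy" wording above.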
  2. 04 Jan, 2016 9 commits
    • xfs: refactor log record start detection into a new helper · eed6b462
      Authored by Brian Foster
      As part of the head/tail discovery process, log recovery locates the
      head block and then reverse seeks to find the start of the last active
      record in the log. This is non-trivial as the record itself could have
      wrapped around the end of the physical log. Log recovery torn write
      detection potentially needs to walk further behind the last record in
      the log, as multiple log I/Os can be in-flight at one time during a
      crash event.
      
      Therefore, refactor the reverse log record header search mechanism into
      a new helper that supports the ability to seek past an arbitrary number
      of log records (or until the tail is hit). Update the head/tail search
      mechanism to call the new helper, but otherwise there is no change in
      log recovery behavior.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      eed6b462
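The reverse search described in eed6b462 (step backward from the head, wrap around the end of the physical log, stop after a given number of record headers or when the tail is hit) can be sketched over a simple in-memory model. The boolean `is_header` array and the function name are hypothetical; the real helper reads log blocks from disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Walk backward from `head` (exclusive) toward `tail` (inclusive) in a
 * circular log of `nblocks` blocks and look for record headers.
 * Stops after `count` headers or once the tail block has been examined.
 * Returns the number of headers found; *found_blk holds the block of the
 * last header found and is left untouched when none is found.
 */
int rseek_record_headers(const bool *is_header, size_t nblocks,
                         size_t head, size_t tail, int count,
                         size_t *found_blk)
{
    int found = 0;
    size_t blk = head;

    assert(nblocks > 0 && head < nblocks && tail < nblocks);

    while (found < count && blk != tail) {
        /* Step back one block, wrapping from block 0 to the last block. */
        blk = (blk == 0) ? nblocks - 1 : blk - 1;
        if (is_header[blk]) {
            found++;
            *found_blk = blk;
        }
    }
    return found;
}
```

With `count` set to 1 this degenerates to "find the start of the last record"; torn write detection would pass a larger count to cover several in-flight log I/Os.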
    • xfs: support a crc verification only log record pass · 6528250b
      Authored by Brian Foster
      Log recovery torn write detection uses CRC verification over a range of
      the active log to identify torn writes. Since the generic log recovery
      pass code implements a superset of the functionality required for CRC
      verification, it can be easily modified to support a CRC verification
      only pass.
      
      Create a new CRC pass type and update the log record processing helper
      to skip everything beyond CRC verification when in this mode. This pass
      will be invoked in subsequent patches to implement torn write detection.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      6528250b
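The idea in 6528250b, a pass type that stops right after CRC verification, can be sketched like this. The enum values, the result struct and the boolean `crc_matches` input are hypothetical stand-ins for the real record processing code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pass types: a CRC-only pass plus two full passes. */
enum pass_type { PASS_CRC, PASS_1, PASS_2 };

/* Which processing stages were reached for one record. */
struct record_result {
    bool crc_verified;
    bool unpacked;
    bool processed;
};

/*
 * Process one log record. Every pass verifies the CRC first; the
 * CRC-only pass returns immediately afterwards, while the other passes
 * go on to unpack and process the record data.
 * Returns 0 on success and -1 on a CRC mismatch.
 */
int process_record(enum pass_type pass, bool crc_matches,
                   struct record_result *res)
{
    assert(res != NULL);

    res->crc_verified = false;
    res->unpacked = false;
    res->processed = false;

    if (!crc_matches)
        return -1;
    res->crc_verified = true;

    if (pass == PASS_CRC)
        return 0; /* CRC-only pass: skip everything beyond verification */

    res->unpacked = true;
    res->processed = true;
    return 0;
}
```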
    • xfs: return start block of first bad log record during recovery · d7f37692
      Authored by Brian Foster
      Each log recovery pass walks from the tail block to the head block and
      processes records appropriately based on the associated log pass type.
      There are various failure conditions that can occur through this
      sequence, such as I/O errors, CRC errors, etc. Log torn write detection
      will perform CRC verification near the head of the log to detect torn
      writes and trim torn records from the log appropriately.
      
      As it is, xlog_do_recovery_pass() only returns an error code in the
      event of CRC failure, which isn't enough information to trim the head of
      the log. Update xlog_do_recovery_pass() to optionally return the start
      block of the associated record when an error occurs. This patch contains
      no functional changes.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      d7f37692
    • xfs: refactor and open code log record crc check · b94fb2d1
      Authored by Brian Foster
      Log record CRC verification currently occurs during active log recovery,
      immediately before a log record is unpacked. Therefore, the CRC
      calculation code is buried within the data unpack function. CRC
      verification pass support only needs to go as far as checking the CRC,
      but the code as currently organized does not easily allow this.
      
      Since we now have a new log record processing helper, pull the record
      CRC verification code out from the unpack helper and open-code it at the
      top of the new process helper. This facilitates the ability to modify
      how records are processed based on the type of the current pass. This
      patch contains no functional changes.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      b94fb2d1
    • xfs: refactor log record unpack and data processing · 9d94901f
      Authored by Brian Foster
      xlog_do_recovery_pass() duplicates a couple function calls related to
      processing log records because the function must handle wrapping around
      the end of the log if the head is behind the tail. This is implemented
      as separate loops. CRC verification pass support will modify how records
      are processed in both of these loops.
      
      Rather than continue to duplicate code, factor the calls that process a
      log record into a new helper and call that helper from both loops. This
      patch contains no functional changes.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      9d94901f
    • xfs: detect and handle invalid iclog size set by mkfs · a70f9fe5
      Authored by Brian Foster
      XFS log records have separate fields for the record size and the iclog
      size used to write the record. mkfs.xfs zeroes the log and writes an
      unmount record to generate a clean log for the subsequent mount. The
      userspace record logging code has a bug where the iclog size (h_size)
      field of the log record is hardcoded to 32k, even if a log stripe unit
      is specified. The log record length is correctly extended to the stripe
      unit. Since the kernel log recovery code uses the h_size field to
      determine the log buffer size, this means that the kernel can attempt to
      read/process records larger than the buffer size and overrun the buffer.
      
      This has historically not been a problem because the kernel doesn't
      actually run through log recovery in the clean unmount case. Instead,
      the kernel detects that a single unmount record exists between the head
      and tail and pushes the tail forward such that the log is viewed as
      clean (head == tail). Once CRC verification is enabled, however, all
      records at the head of the log are verified for CRC errors and thus we
      are susceptible to overrun problems if the iclog field is not correct.
      
      While the core problem must be fixed in userspace, this is historical
      behavior that must be detected in the kernel to avoid severe side
      effects such as memory corruption and crashes. Update the log buffer
      size calculation code to detect this condition, warn the user and resize
      the log buffer based on the log stripe unit. Return a corruption error
      in cases where this does not look like a clean filesystem (i.e., the log
      record header indicates more than one operation).
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      a70f9fe5
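The buffer sizing rule described in a70f9fe5 can be sketched as a small decision function. The parameter names mirror the fields mentioned in the commit message (`h_size`, record length, operation count, log stripe unit), but the function itself and the extra check that the record fits within the stripe unit are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Choose the log recovery buffer size for a record.
 *
 * Normally the iclog size recorded in the header (h_size) is trusted.
 * If the record is longer than h_size, accept it only when it looks
 * like the single-operation record written by mkfs and it fits within
 * the log stripe unit; in that case size the buffer from the stripe
 * unit. Anything else is treated as corruption.
 *
 * Returns 0 and stores the size in *bufsize, or -1 on corruption.
 */
int pick_log_buf_size(uint32_t h_size, uint32_t h_len, uint32_t num_logops,
                      uint32_t log_stripe_unit, uint32_t *bufsize)
{
    assert(bufsize != NULL);

    if (h_len <= h_size) {
        *bufsize = h_size;
        return 0;
    }
    if (num_logops == 1 && h_len <= log_stripe_unit) {
        /* Known mkfs pattern: warn the user and resize (warning omitted). */
        *bufsize = log_stripe_unit;
        return 0;
    }
    return -1;
}
```

For example, a 256k record that claims a 32k iclog size and carries one operation is accepted with a 256k buffer, while the same record with two operations is rejected.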
    • Linux 4.4-rc8 · 16830985
      Authored by Linus Torvalds
      16830985
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 42946160
      Authored by Linus Torvalds
      Pull MIPS build fix from Ralf Baechle:
       "Fix a makefile issue resulting in build breakage with older binutils.
      
        This has sat in -next for a few days, testers and buildbot are happy
        with it, too though if you are going for another -rc that'd certainly
        help ironing out a few more issues"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: VDSO: Fix build error with binutils 2.24 and earlier
      42946160
    • Merge tag 'drm-intel-fixes-2016-01-02' of git://anongit.freedesktop.org/drm-intel · 4e5e384c
      Authored by Linus Torvalds
      Pull i915 drm fixes from Jani Nikula:
       "Two display fixes still for v4.4.
      
        The new year's resolution is to start using signed tags per Linus'
        request.  This one is still unsigned; I want to fix this up in our
        maintainer scripts instead of doing it one-off"
      
      * tag 'drm-intel-fixes-2016-01-02' of git://anongit.freedesktop.org/drm-intel:
        drm/i915: increase the tries for HDMI hotplug live status checking
        drm/i915: Unbreak check_digital_port_conflicts()
      4e5e384c
  3. 01 Jan, 2016 5 commits
    • Merge tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 9c982e86
      Authored by Linus Torvalds
      Pull PCI bugfix from Bjorn Helgaas:
       "Here's another fix for v4.4.
      
        This fixes 32-bit config reads for the HiSilicon driver.  Obviously
        the driver is completely broken without this fix (apparently it
        actually was tested internally, but got broken somehow in the process
        of upstreaming it).
      
        Summary:
      
        HiSilicon host bridge driver
          Fix 32-bit config reads (Dongdong Liu)"
      
      * tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: hisi: Fix hisi_pcie_cfg_read() 32-bit reads
      9c982e86
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 7c672dd6
      Authored by Linus Torvalds
      Pull sparc fixes from David Miller:
       "Just some missing syscall wire ups"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Wire up mlock2 system call.
        sparc: Add all necessary direct socket system calls.
      7c672dd6
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 8f5daf2a
      Authored by Linus Torvalds
      Pull networking fixes from David Miller:
      
       1) Prevent XFRM per-cpu counter updates for one namespace from being
          applied to another namespace.  Fix from Dan Streetman.
      
       2) Fix RCU de-reference in iwl_mvm_get_key_sta_id(), from Johannes
          Berg.
      
       3) Remove ethernet header assumption in nft_do_chain_netdev(), from
          Pablo Neira Ayuso.
      
       4) Fix cpsw PHY ident with multiple slaves and fixed-phy, from Pascal
          Speck.
      
       5) Fix use after free in sixpack_close and mkiss_close.
      
       6) Fix VXLAN fw assertion on bnx2x, from Yuval Mintz.
      
       7) natsemi doesn't check for DMA mapping errors, from Alexey
          Khoroshilov.
      
       8) Fix inverted test in ip6addrlbl_get(), from Andrey Ryabinin.
      
       9) Missing initialization of needed_headroom in geneve tunnel driver,
          from Paolo Abeni.
      
      10) Fix conntrack template leak in openvswitch, from Joe Stringer.
      
      11) Missing initialization of wq->flags in sock_alloc_inode(), from
          Nicolai Stange.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
        sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close
        net, socket, socket_wq: fix missing initialization of flags
        drivers: net: cpsw: fix error return code
        openvswitch: Fix template leak in error cases.
        sctp: label accepted/peeled off sockets
        sctp: use GFP_USER for user-controlled kmalloc
        qlcnic: fix a loop exit condition better
        net: cdc_ncm: avoid changing RX/TX buffers on MTU changes
        geneve: initialize needed_headroom
        ipv6: honor ifindex in case we receive ll addresses in router advertisements
        addrconf: always initialize sysctl table data
        ipv6/addrlabel: fix ip6addrlbl_get()
        switchdev: bridge: Pass ageing time as clock_t instead of jiffies
        sh_eth: fix 16-bit descriptor field access endianness too
        veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
        net: usb: cdc_ncm: Adding Dell DW5813 LTE AT&T Mobile Broadband Card
        net: usb: cdc_ncm: Adding Dell DW5812 LTE Verizon Mobile Broadband Card
        natsemi: add checks for dma mapping errors
        rhashtable: Kill harmless RCU warning in rhashtable_walk_init
        openvswitch: correct encoding of set tunnel action attributes
        ...
      8f5daf2a
    • sparc: Wire up mlock2 system call. · 42d85c52
      Authored by David S. Miller
      Signed-off-by: David S. Miller <davem@davemloft.net>
      42d85c52
    • sparc: Add all necessary direct socket system calls. · 8b30ca73
      Authored by David S. Miller
      The GLIBC folks would like to eliminate socketcall support
      eventually, and this makes sense regardless so wire them
      all up.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8b30ca73
  4. 31 Dec, 2015 4 commits
    • sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close · 068d8bd3
      Authored by Xin Long
      In sctp_close, sctp_make_abort_user may return NULL because of memory
      allocation failure. If this happens, it will bypass any state change
      and never free the assoc. The assoc has no chance to be freed and it
      will be kept in memory with the state it had even after the socket is
      closed by sctp_close().
      
      So if sctp_make_abort_user fails to allocate memory, we should abort
      the asoc via sctp_primitive_ABORT as well. Just like the annotation in
      sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
      "Even if we can't send the ABORT due to low memory delete the TCB.
      This is a departure from our typical NOMEM handling".
      
      But when the chunk is NULL (low memory), the SCTP_CMD_REPLY cmd would
      dereference the NULL chunk pointer and crash the system. So we should
      add the SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like
      other places where the SCTP_CMD_REPLY cmd is added.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      068d8bd3
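The rule stated in 068d8bd3 (always tear the association down, but queue the reply only when an ABORT chunk was actually built) can be sketched with a toy command queue. The `cmd` enum, `cmd_seq` struct and `queue_abort()` are hypothetical and only model the ordering of commands, not the SCTP state machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical side-effect commands, in the spirit of SCTP_CMD_*. */
enum cmd { CMD_REPLY, CMD_DELETE_TCB };

struct cmd_seq {
    enum cmd cmds[4];
    size_t n;
};

/*
 * Queue the commands for a user-initiated abort.
 * `abort_chunk` is NULL when building the ABORT chunk failed because of
 * low memory. The reply is queued only when there is a chunk to send;
 * the association is deleted in both cases.
 */
void queue_abort(struct cmd_seq *seq, const void *abort_chunk)
{
    assert(seq != NULL);

    seq->n = 0;
    if (abort_chunk != NULL)
        seq->cmds[seq->n++] = CMD_REPLY;
    seq->cmds[seq->n++] = CMD_DELETE_TCB;
}
```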
    • Merge tag 'wireless-drivers-for-davem-2015-12-28' of... · a0ccc3f2
      Authored by David S. Miller
      Merge tag 'wireless-drivers-for-davem-2015-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      iwlwifi
      
      * don't load firmware that won't exist for 7260
      * fix RCU splat
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a0ccc3f2
    • net, socket, socket_wq: fix missing initialization of flags · 574aab1e
      Authored by Nicolai Stange
      Commit ceb5d58b ("net: fix sock_wake_async() rcu protection") from
      the current 4.4 release cycle introduced a new flags member in
      struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
      from struct socket's flags member into that new place.
      
      Unfortunately, the new flags field is never initialized properly, at least
      not for the struct socket_wq instance created in sock_alloc_inode().
      
      One particular issue I encountered because of this is that my GNU Emacs
      failed to draw anything on my desktop -- i.e. what I got is a transparent
      window, including the title bar. Bisection led to the commit mentioned
      above and further investigation by means of strace told me that Emacs
      is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
      reproducible 100% of the time and the fact that properly initializing the
      struct socket_wq ->flags fixes the issue leads me to the conclusion that
      somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
      preventing my Emacs from receiving any SIGIO's due to data becoming
      available and it got stuck.
      
      Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
      member to zero.
      
      Fixes: ceb5d58b ("net: fix sock_wake_async() rcu protection")
      Signed-off-by: Nicolai Stange <nicstange@gmail.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      574aab1e
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · c6169202
      Authored by Linus Torvalds
      Pull block fixes from Jens Axboe:
       "Make the block layer great again.
      
        Basically three amazing fixes in this pull request, split into 4
        patches.  Believe me, they should go into 4.4.  Two of them fix a
        regression, the third and last fixes an easy-to-trigger bug.
      
         - Fix a bad irq enable through null_blk, for queue_mode=1 and using
           timer completions.  Add a block helper to restart a queue
           asynchronously, and use that from null_blk.  From me.
      
         - Fix a performance issue in NVMe.  Some devices (Intel Pxxxx) expose
           a stripe boundary, and performance suffers if we cross it.  We took
           that into account for merging, but not for the newer splitting
           code.  Fix from Keith.
      
         - Fix a kernel oops in lightnvm with multiple channels.  From Matias"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        lightnvm: wrong offset in bad blk lun calculation
        null_blk: use async queue restart helper
        block: add blk_start_queue_async()
        block: Split bios on chunk boundaries
      c6169202
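The NVMe item in the pull request above says that I/O should not cross a device's stripe boundary, and that the splitting code has to honour this. The arithmetic can be sketched as follows; the function names are hypothetical, and the sketch assumes the chunk size is a power of two expressed in sectors.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of sectors from `sector` up to the next chunk boundary.
 * chunk_sectors must be a non-zero power of two.
 */
uint64_t sectors_to_boundary(uint64_t sector, uint64_t chunk_sectors)
{
    assert(chunk_sectors != 0 && (chunk_sectors & (chunk_sectors - 1)) == 0);
    return chunk_sectors - (sector & (chunk_sectors - 1));
}

/*
 * Length of the first piece when an I/O of `nr_sectors` starting at
 * `sector` is split so that no piece crosses a chunk boundary.
 */
uint64_t first_split_len(uint64_t sector, uint64_t nr_sectors,
                         uint64_t chunk_sectors)
{
    uint64_t room = sectors_to_boundary(sector, chunk_sectors);

    return nr_sectors < room ? nr_sectors : room;
}
```

With 256-sector chunks, a 16-sector I/O starting at sector 250 would be split into a 6-sector piece followed by a 10-sector piece.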
  5. 30 Dec, 2015 16 commits
    • drm/i915: increase the tries for HDMI hotplug live status checking · 3d8acd1f
      Authored by Gary Wang
      The total HDMI hotplug detection delay of 30ms is sometimes not enough
      for the HDMI live status to come up with specific HDMI monitors on the
      BSW platform.
      
      Experiments with the following monitors show that the worst cases need
      at least 80ms.
      
      Lenovo L246 1xwA (4 failed, necessary hot-plug delay: 58/40/60/40ms)
      Philips HH2AP (9 failed, necessary hot-plug delay: 80/50/50/60/46/40/58/58/39ms)
      BENQ ET-0035-N (6 failed, necessary hot-plug delay: 60/50/50/80/80/40ms)
      DELL U2713HM (2 failed, necessary hot-plug delay: 58/59ms)
      HP HP-LP2475w (5 failed, necessary hot-plug delay: 70/50/40/60/40ms)
      
      It looks like the BSW platform needs 70-80ms in the worst cases with
      these monitors (at most 8 delay iterations). Keep the total below 100ms
      so that an HDCP pulse of HPD low (at least 100ms) is still handled as a
      plug out.
      Reviewed-by: Cooper Chiou <cooper.chiou@intel.com>
      Tested-by: Gary Wang <gary.c.wang@intel.com>
      Cc: Gavin Hindman <gavin.hindman@intel.com>
      Cc: Sonika Jindal <sonika.jindal@intel.com>
      Cc: Shashank Sharma <shashank.sharma@intel.com>
      Cc: Shobhit Kumar <shobhit.kumar@intel.com>
      Signed-off-by: Gary Wang <gary.c.wang@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1450858295-12804-1-git-send-email-gary.c.wang@intel.com
      Tested-by: Shobhit Kumar <shobhit.kumar@intel.com>
      Cc: drm-intel-fixes@lists.freedesktop.org
      Fixes: 237ed86c ("drm/i915: Check live status before reading edid")
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      (cherry picked from commit f8d03ea0)
      [Jani: undo the file mode change of the original commit]
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      3d8acd1f
    • Merge branch 'akpm' (patches from Andrew) · 866be88a
      Authored by Linus Torvalds
      Merge misc fixes from Andrew Morton:
       "9 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/vmstat: fix overflow in mod_zone_page_state()
        ocfs2/dlm: clear migration_pending when migration target goes down
        mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()
        ocfs2: fix flock panic issue
        m32r: add io*_rep helpers
        m32r: fix build failure
        arch/x86/xen/suspend.c: include xen/xen.h
        mm: memcontrol: fix possible memcg leak due to interrupted reclaim
        ocfs2: fix BUG when calculate new backup super
      866be88a
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · e25bd6ca
      Authored by Linus Torvalds
      Pull vfs fix from Al Viro:
       "Fix for 3.15 breakage of fcntl64() in arm OABI compat.  -stable
        fodder"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        [PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64()
      e25bd6ca
    • mm/vmstat: fix overflow in mod_zone_page_state() · 6cdb18ad
      Authored by Heiko Carstens
      mod_zone_page_state() takes a "delta" integer argument.  delta contains
      the number of pages that should be added or subtracted from a struct
      zone's vm_stat field.
      
      If a zone is larger than 8TB this will cause overflows.  E.g.  for a
      zone with a size slightly larger than 8TB the line
      
          mod_zone_page_state(zone, NR_ALLOC_BATCH, zone->managed_pages);
      
      in mm/page_alloc.c:free_area_init_core() will produce a negative value
      for the NR_ALLOC_BATCH entry within the zone's vm_stat, since 8TB
      contains 0x8xxxxxxx pages, which will be sign extended to a negative
      value.
      
      Fix this by changing the delta argument to long type.
      
      This could fix an early boot problem seen on s390, where we have a 9TB
      system with only one node.  ZONE_DMA contains 2GB and ZONE_NORMAL the
      rest.  The system is trying to allocate a GFP_DMA page but ZONE_DMA is
      completely empty, so it tries to reclaim pages in an endless loop.
      
      This was seen on a heavily patched 3.10 kernel.  One possible
      explanation seems to be the overflows caused by mod_zone_page_state().
      Unfortunately I did not have the chance to verify that this patch
      actually fixes the problem, since I don't have access to the system
      right now.  However the overflow problem does exist anyway.
      
      Given the description that a system with slightly less than 8TB does
      work, this seems to be a candidate for the observed problem.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6cdb18ad
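The arithmetic behind 6cdb18ad can be checked with a short sketch. It assumes 4 KiB pages and models the old `int` argument as `int32_t` and the new `long` argument as `int64_t` (as on a 64-bit Linux target); the function names are hypothetical. Converting an out-of-range value to a signed 32-bit type is implementation-defined before C23, and wraps on the usual two's-complement compilers.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ull /* assumed 4 KiB pages */

/* Number of pages in a zone of the given size. */
uint64_t zone_pages(uint64_t zone_bytes)
{
    return zone_bytes / PAGE_SIZE_BYTES;
}

/* Old behaviour: the page count is squeezed into a 32-bit signed delta. */
int32_t delta_as_int(uint64_t pages)
{
    return (int32_t)pages;
}

/* Fixed behaviour: a 64-bit signed delta holds the full page count. */
int64_t delta_as_long(uint64_t pages)
{
    assert(pages <= (uint64_t)INT64_MAX);
    return (int64_t)pages;
}
```

An 8 TiB zone has 2^43 / 2^12 = 2^31 = 0x80000000 pages, one more than the largest positive 32-bit signed value, so the old delta turns negative while a zone one page smaller still fits.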
    • ocfs2/dlm: clear migration_pending when migration target goes down · cc28d6d8
      Authored by xuejiufei
      We have found a BUG on res->migration_pending when migrating lock
      resources.  The situation is as follows.
      
      dlm_mark_lockres_migrating
        res->migration_pending = 1;
        __dlm_lockres_reserve_ast
        dlm_lockres_release_ast returns with res->migration_pending still
            set because other threads reserve asts
        wait dlm_migration_can_proceed returns 1
        >>>>>>> o2hb found that the target went down and removed it
                from domain_map
        dlm_migration_can_proceed returns 1
        dlm_mark_lockres_migrating returns -EHOSTDOWN with
            res->migration_pending still set.
      
      On re-entering dlm_mark_lockres_migrating(), this triggers the BUG_ON
      on res->migration_pending.  So clear migration_pending when the target
      is down.
      Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cc28d6d8
    • mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone() · 5f0f2887
      Authored by Andrew Banman
      test_pages_in_a_zone() does not account for the possibility of missing
      sections in the given pfn range.  pfn_valid_within always returns 1 when
      CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
      sections to pass the test, leading to a kernel oops.
      
      Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
      for missing sections before proceeding into the zone-check code.
      
      This also prevents a crash from offlining memory devices with missing
      sections.  Despite this, it may be a good idea to keep the related patch
      '[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
      missing sections' because missing sections in a memory block may lead to
      other problems not covered by the scope of this fix.
      Signed-off-by: Andrew Banman <abanman@sgi.com>
      Acked-by: Alex Thorlton <athorlton@sgi.com>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5f0f2887
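The outer loop described in 5f0f2887 (walk the pfn range with section granularity and look at section presence before any per-page zone check) can be sketched over a toy presence table. The `PAGES_PER_SECTION` value, the `present` array and the function name are assumptions of this sketch, as is the choice to skip missing sections rather than fail the whole range.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGES_PER_SECTION 32768ul /* assumed value for this sketch */

/*
 * Count the pfns in [start_pfn, end_pfn) that belong to present
 * sections, i.e. the pfns that would be allowed to reach the per-page
 * zone check. Pfns in missing sections (or beyond the table) are
 * skipped one whole section at a time.
 */
unsigned long pfns_in_present_sections(const bool *present,
                                       unsigned long nsections,
                                       unsigned long start_pfn,
                                       unsigned long end_pfn)
{
    unsigned long checked = 0;
    unsigned long pfn = start_pfn;

    assert(start_pfn <= end_pfn);

    while (pfn < end_pfn) {
        unsigned long sec = pfn / PAGES_PER_SECTION;
        unsigned long sec_end = (sec + 1) * PAGES_PER_SECTION;

        if (sec_end > end_pfn)
            sec_end = end_pfn; /* last section may be partial */
        if (sec < nsections && present[sec])
            checked += sec_end - pfn;
        pfn = sec_end;
    }
    return checked;
}
```

Stepping by section boundary rather than by a fixed stride from `start_pfn` keeps the walk correct when the range does not start on a section boundary.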
    • ocfs2: fix flock panic issue · b5a8bc33
      Authored by Junxiao Bi
      Commit 4f656367 ("Move locks API users to locks_lock_inode_wait()")
      moved the flock/posix lock identification code to
      locks_lock_inode_wait(), but failed to set fl_flags to FL_FLOCK, which
      caused the following kernel panic on 4.4.0-rc5.
      
        kernel BUG at fs/locks.c:1895!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: ocfs2(O) ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront
        CPU: 0 PID: 20268 Comm: flock_unit_test Tainted: G           O    4.4.0-rc5-next-20151217 #1
        Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014
        task: ffff88007b3672c0 ti: ffff880028b58000 task.ti: ffff880028b58000
        RIP: locks_lock_inode_wait+0x2e/0x160
        Call Trace:
          ocfs2_do_flock+0x91/0x160 [ocfs2]
          ocfs2_flock+0x76/0xd0 [ocfs2]
          SyS_flock+0x10f/0x1a0
          entry_SYSCALL_64_fastpath+0x12/0x71
        Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 48 81 ec 88 00 00 00 8b 46 40 83 e0 03 83 f8 01 0f 84 ad 00 00 00 83 f8 02 74 04 <0f> 0b eb fe 4c 8d ad 60 ff ff ff 4c 8d 7b 58 e8 0e 8e 73 00 4d
        RIP  locks_lock_inode_wait+0x2e/0x160
         RSP <ffff880028b5bce8>
        ---[ end trace dfca74ec9b5b274c ]---
      
      Fixes: 4f656367 ("Move locks API users to locks_lock_inode_wait()")
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b5a8bc33
    • m32r: add io*_rep helpers · 92a8ed4c
      Authored by Sudip Mukherjee
      m32r allmodconfig was failing with the error:
      
        error: implicit declaration of function 'read'
      
      On checking io.h it turned out that 'read' is not defined but 'readb' is
      defined and 'ioread8' will then obviously mean 'readb'.
      
      At the same time some of the helper functions ioreadN_rep() and
      iowriteN_rep() were missing which also led to the build failure.
      Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      92a8ed4c
    • m32r: fix build failure · 6122192e
      Authored by Sudip Mukherjee
      m32r allmodconfig is failing with:
      
        In file included from ../include/linux/kvm_para.h:4:0,
                         from ../kernel/watchdog.c:26:
        ../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory
      
      kvm_para.h was not included in the build.
      Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6122192e
    • arch/x86/xen/suspend.c: include xen/xen.h · facca616
      Authored by Andrew Morton
      Fix the build warning:
      
        arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend':
        arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration]
                if (xen_pv_domain())
                    ^
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      facca616
    • V
      mm: memcontrol: fix possible memcg leak due to interrupted reclaim · 6df38689
      Vladimir Davydov committed
      Memory cgroup reclaim can be interrupted with mem_cgroup_iter_break()
      once enough pages have been reclaimed, in which case, in contrast to a
      full round-trip over a cgroup sub-tree, the current position stored in
      mem_cgroup_reclaim_iter of the target cgroup does not get invalidated
      and so is left holding the reference to the last scanned cgroup.  If the
      target cgroup does not get scanned again (we might have just reclaimed
      the last page or all processes might exit and free their memory
      voluntarily), we will leak it, because there is nobody to put the
      reference held by the iterator.
      
      The problem is easy to reproduce by running the following command
      sequence in a loop:
      
          mkdir /sys/fs/cgroup/memory/test
          echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
          echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs
          memhog 150M
          echo $$ > /sys/fs/cgroup/memory/cgroup.procs
          rmdir test
      
      The cgroups generated by it will never get freed.
      
      This patch fixes this issue by making mem_cgroup_iter avoid taking
      reference to the current position.  In order not to hit use-after-free
      bug while running reclaim in parallel with cgroup deletion, we make use
      of ->css_released cgroup callback to clear references to the dying
      cgroup in all reclaim iterators that might refer to it.  This callback
      is called right before scheduling rcu work which will free css, so if we
      access iter->position from rcu read section, we might be sure it won't
      go away under us.
      
      [hannes@cmpxchg.org: clean up css ref handling]
      Fixes: 5ac8fb31 ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@kernel.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>	[3.19+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6df38689
    • J
      ocfs2: fix BUG when calculate new backup super · 5c9ee4cb
      Joseph Qi committed
      When resizing, ocfs2 first extends the last group descriptor (gd).  If
      a backup super belongs in that gd, it calculates the new backup super
      and updates the corresponding values.
      
      But it currently doesn't consider the case where the backup super has
      already been set up.  In that case it still sets the bit in the gd
      bitmap and subtracts it from bg_free_bits_count again, which corrupts
      the gd and triggers the BUG in ocfs2_block_group_set_bits:
      
          BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);
      
      So check whether the backup super is done and then do the updates.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
      Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5c9ee4cb
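
The ocfs2 fix above amounts to "test the bit before you set it and charge the free count". The following is a minimal user-space sketch of that idea, not the ocfs2 code: `struct group_desc`, `gd_test_bit()` and `gd_claim_bit()` are hypothetical names invented for illustration, and the bitmap size is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define GD_BITS 64  /* assumed bitmap size, chosen only for this sketch */

/* Hypothetical, simplified stand-in for a group descriptor. */
struct group_desc {
    uint8_t  bitmap[GD_BITS / 8];  /* one bit per cluster in the group */
    uint16_t free_bits_count;      /* number of clear bits in bitmap */
};

/* Return true if the given bit is already set in the bitmap. */
static bool gd_test_bit(const struct group_desc *gd, unsigned int bit)
{
    return (gd->bitmap[bit / 8] >> (bit % 8)) & 1u;
}

/*
 * Claim a bit only if it is not already claimed. Returning early keeps
 * free_bits_count consistent with the bitmap when the caller asks for a
 * bit (for example a backup super location) that was set earlier.
 */
static bool gd_claim_bit(struct group_desc *gd, unsigned int bit)
{
    if (gd_test_bit(gd, bit))
        return false;  /* already done: do not decrement again */

    gd->bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
    gd->free_bits_count--;
    return true;
}
```

Without the early return, claiming the same bit twice would decrement the free count twice while only one bit is set, which is the kind of mismatch the BUG_ON in the commit message catches.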
    • G
      MIPS: VDSO: Fix build error with binutils 2.24 and earlier · 398c7500
      Guenter Roeck committed
      Commit 2a037f31 ("MIPS: VDSO: Fix build error") tries to fix a build
      error seen with binutils 2.24 and earlier. However, the fix does not work,
      and again results in the already known build errors if the kernel is built
      with an earlier version of binutils.
      
      CC      arch/mips/vdso/gettimeofday.o
      /tmp/ccnOVbHT.s: Assembler messages:
      /tmp/ccnOVbHT.s:50: Error: can't resolve `_start' {*UND* section} - `L0 {.text section}
      /tmp/ccnOVbHT.s:374: Error: can't resolve `_start' {*UND* section} - `L0 {.text section}
      scripts/Makefile.build:258: recipe for target 'arch/mips/vdso/gettimeofday.o' failed
      make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
      
      Fixes: 2a037f31 ("MIPS: VDSO: Fix build error")
      Cc: Qais Yousef <qais.yousef@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/11926/
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      398c7500
    • J
      drivers: net: cpsw: fix error return code · c1e3334f
      Julia Lawall committed
      Propagate the return value of platform_get_irq on failure.
      
      A simplified version of the semantic match that finds the two cases where
      no error code is returned at all is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      identifier ret; expression e1,e2;
      @@
      (
      if (\(ret < 0\|ret != 0\))
       { ... return ret; }
      |
      ret = 0
      )
      ... when != ret = e1
          when != &ret
      *if(...)
      {
        ... when != ret = e2
            when forall
       return ret;
      }
      // </smpl>
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1e3334f
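
The pattern the semantic match in the cpsw commit looks for can be illustrated with a small user-space sketch. It is not the cpsw driver: `fake_get_irq()` is a hypothetical stand-in for platform_get_irq() (assumed here to return an IRQ number or a negative errno), and `probe_buggy()` / `probe_fixed()` are invented names.

```c
#include <errno.h>

/* Hypothetical stand-in for platform_get_irq(): IRQ number or -errno. */
static int fake_get_irq(int index)
{
    return index < 2 ? 40 + index : -ENXIO;
}

/* Buggy shape: ret still holds 0 when the error path is taken. */
static int probe_buggy(int index)
{
    int ret = 0;
    int irq = fake_get_irq(index);

    if (irq < 0)
        goto out;  /* failure is reported to the caller as success */
out:
    return ret;
}

/* Fixed shape: propagate the callee's error code before bailing out. */
static int probe_fixed(int index)
{
    int ret = 0;
    int irq = fake_get_irq(index);

    if (irq < 0) {
        ret = irq;
        goto out;
    }
out:
    return ret;
}
```

With an index for which the stand-in fails, `probe_buggy()` returns 0 while `probe_fixed()` returns -ENXIO, so only the fixed variant lets the caller see the failure.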
    • J
      openvswitch: Fix template leak in error cases. · 90c7afc9
      Joe Stringer committed
      Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
      reference leak on helper objects, but inadvertently introduced a leak on
      the ct template.
      
      Previously, ct_info.ct->general.use was initialized to 0 by
      nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
      returned successfully. If an error occurred while adding the helper or
      adding the action to the actions buffer, the __ovs_ct_free_action()
      cleanup would use nf_ct_put() to free the entry; however, this relies on
      atomic_dec_and_test(ct_info.ct->general.use). This reference must be
      incremented first, or nf_ct_put() will never free it.
      
      Fix the issue by acquiring a reference to the template immediately after
      allocation.
      
      Fixes: cae3a262 ("openvswitch: Allow attaching helpers to ct action")
      Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak")
      Signed-off-by: Joe Stringer <joe@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      90c7afc9
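
Why a put on a zero reference count never frees the object, as described in the openvswitch commit above, can be seen in a single-threaded toy model. This is only a sketch under simplifying assumptions: `struct tmpl`, `tmpl_get()`, `tmpl_put()` and `dec_and_test()` are hypothetical names, a plain int replaces the kernel's atomic counter, and a flag stands in for the actual free.

```c
#include <stdbool.h>

/* Hypothetical template object; 'freed' stands in for a real free. */
struct tmpl {
    int  use;    /* reference count, starts at 0 after allocation */
    bool freed;
};

/* Models dec-and-test semantics: true only if the new value is 0. */
static bool dec_and_test(int *v)
{
    return --(*v) == 0;
}

static void tmpl_get(struct tmpl *t)
{
    t->use++;
}

static void tmpl_put(struct tmpl *t)
{
    if (dec_and_test(&t->use))
        t->freed = true;
}
```

If the error path calls `tmpl_put()` on an object whose count is still 0, the count drops to -1 and the zero test never fires, so the object leaks. Taking a reference right after allocation makes the same put reach 0 and release it.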
    • A
      [PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64() · 76cc404b
      Al Viro committed
      Cc: stable@vger.kernel.org # 3.15+
      Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      76cc404b
  6. 29 Dec, 2015 4 commits
    • M
      lightnvm: wrong offset in bad blk lun calculation · c3293a9a
      Matias Bjørling committed
      dev->nr_luns reports the total number of luns available in a device
      while dev->luns_per_chnl is the number of luns per channel.
      
      When multiple channels are available, the offset into a linear array is
      calculated from a channel id and a lun id. Because the channel id is
      multiplied by the total number of luns, we go out of bounds when channel
      id > 0, and the kernel panics when we read a protected kernel memory area.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      c3293a9a
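
The offset arithmetic behind the lightnvm fix above can be checked with a short sketch. The struct and function names are hypothetical and only mirror the fields named in the commit message (nr_luns, luns_per_chnl); the row-major layout of the linear array is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical geometry holder mirroring the fields in the message. */
struct dev_geometry {
    size_t nr_chnls;       /* number of channels */
    size_t luns_per_chnl;  /* luns in each channel */
    size_t nr_luns;        /* total luns = nr_chnls * luns_per_chnl */
};

/* Buggy: scales the channel id by the total number of luns. */
static size_t lun_index_buggy(const struct dev_geometry *g,
                              size_t ch, size_t lun)
{
    return ch * g->nr_luns + lun;
}

/* Fixed: scales the channel id by the number of luns per channel. */
static size_t lun_index(const struct dev_geometry *g,
                        size_t ch, size_t lun)
{
    assert(ch < g->nr_chnls && lun < g->luns_per_chnl);
    return ch * g->luns_per_chnl + lun;
}
```

For a device with 2 channels of 4 luns each, channel 1 / lun 3 maps to index 7 with the fixed formula, the last valid slot of an 8-entry array. The buggy formula yields 11, which lies past the end of that array.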
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 1e60508c
      Linus Torvalds committed
      Pull rdma fixes from Doug Ledford:
       "Three late 4.4-rc fixes.
      
        The first two were very small in terms of number of lines, the third
        is more lines of change than I like this late in the cycle, but there
        are positive test results from Avagotech and from my own test setup
        with the target hardware, and given the problem was a 100% failure
        case, I sent it through.
      
         - A previous patch updated the mlx4 driver to use vmalloc when there
           was not enough memory to get a contiguous region large enough for
           our needs, so we need kvfree() whenever we free that item.  We
           missed one place, so fix that now.
      
         - A previous patch added code to match incoming packets against a
           specific device, but failed to compensate for devices that have
           both InfiniBand and Ethernet ports.  Fix that.
      
         - Under certain vlan conditions, the ocrdma driver would fail to
           bring up any vlan interfaces and would print out a circular locking
           failure.  Fix that"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        RDMA/be2net: Remove open and close entry points
        RDMA/ocrdma: Depend on async link events from CNA
        RDMA/ocrdma: Dispatch only port event when port state changes
        RDMA/ocrdma: Fix vlan-id assignment in qp parameters
        IB/mlx4: Replace kfree with kvfree in mlx4_ib_destroy_srq
        IB/cma: cma_match_net_dev needs to take into account port_num
      1e60508c
    • J
      null_blk: use async queue restart helper · 48cc661e
      Jens Axboe committed
      If null_blk is run in NULL_IRQ_TIMER mode and with queue_mode NULL_Q_RQ,
      we need to restart the queue from the hrtimer interrupt. We can't
      directly invoke the request_fn from that context, so punt the queue run
      to async kblockd context.
      Tested-by: Rabin Vincent <rabin@rab.in>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      48cc661e
    • J
      block: add blk_start_queue_async() · 21491412
      Jens Axboe committed
      We currently only have an inline/sync helper to restart a stopped
      queue. If drivers need an async version, they have to roll their
      own. Add a generic helper instead.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      21491412