1. 07 8月, 2015 2 次提交
    • R
      audit: implement audit by executable · 34d99af5
      Richard Guy Briggs 提交于
      This adds the ability audit the actions of a not-yet-running process.
      
      This patch implements the ability to filter on the executable path.  Instead of
      just hard coding the ino and dev of the executable we care about at the moment
      the rule is inserted into the kernel, use the new audit_fsnotify
      infrastructure to manage this dynamically.  This means that if the filename
      does not yet exist but the containing directory does, or if the inode in
      question is unlinked and creat'd (aka updated) the rule will just continue to
      work.  If the containing directory is moved or deleted or the filesystem is
      unmounted, the rule is deleted automatically.  A future enhancement would be to
      have the rule survive across directory disruptions.
      
      This is a heavily modified version of a patch originally submitted by Eric
      Paris with some ideas from Peter Moody.
      
      Cc: Peter Moody <peter@hda3.com>
      Cc: Eric Paris <eparis@redhat.com>
      Signed-off-by: NRichard Guy Briggs <rgb@redhat.com>
      [PM: minor whitespace clean to satisfy ./scripts/checkpatch]
      Signed-off-by: NPaul Moore <pmoore@redhat.com>
      34d99af5
    • R
      audit: use macros for unset inode and device values · 84cb777e
      Richard Guy Briggs 提交于
      Clean up a number of places were casted magic numbers are used to represent
      unset inode and device numbers in preparation for the audit by executable path
      patch set.
      Signed-off-by: NRichard Guy Briggs <rgb@redhat.com>
      [PM: enclosed the _UNSET macros in parentheses for ./scripts/checkpatch]
      Signed-off-by: NPaul Moore <pmoore@redhat.com>
      84cb777e
  2. 09 4月, 2015 1 次提交
    • B
      Defer processing of REQ_PREEMPT requests for blocked devices · bba0bdd7
      Bart Van Assche 提交于
      SCSI transport drivers and SCSI LLDs block a SCSI device if the
      transport layer is not operational. This means that in this state
      no requests should be processed, even if the REQ_PREEMPT flag has
      been set. This patch avoids that a rescan shortly after a cable
      pull sporadically triggers the following kernel oops:
      
      BUG: unable to handle kernel paging request at ffffc9001a6bc084
      IP: [<ffffffffa04e08f2>] mlx4_ib_post_send+0xd2/0xb30 [mlx4_ib]
      Process rescan-scsi-bus (pid: 9241, threadinfo ffff88053484a000, task ffff880534aae100)
      Call Trace:
       [<ffffffffa0718135>] srp_post_send+0x65/0x70 [ib_srp]
       [<ffffffffa071b9df>] srp_queuecommand+0x1cf/0x3e0 [ib_srp]
       [<ffffffffa0001ff1>] scsi_dispatch_cmd+0x101/0x280 [scsi_mod]
       [<ffffffffa0009ad1>] scsi_request_fn+0x411/0x4d0 [scsi_mod]
       [<ffffffff81223b37>] __blk_run_queue+0x27/0x30
       [<ffffffff8122a8d2>] blk_execute_rq_nowait+0x82/0x110
       [<ffffffff8122a9c2>] blk_execute_rq+0x62/0xf0
       [<ffffffffa000b0e8>] scsi_execute+0xe8/0x190 [scsi_mod]
       [<ffffffffa000b2f3>] scsi_execute_req+0xa3/0x130 [scsi_mod]
       [<ffffffffa000c1aa>] scsi_probe_lun+0x17a/0x450 [scsi_mod]
       [<ffffffffa000ce86>] scsi_probe_and_add_lun+0x156/0x480 [scsi_mod]
       [<ffffffffa000dc2f>] __scsi_scan_target+0xdf/0x1f0 [scsi_mod]
       [<ffffffffa000dfa3>] scsi_scan_host_selected+0x183/0x1c0 [scsi_mod]
       [<ffffffffa000edfb>] scsi_scan+0xdb/0xe0 [scsi_mod]
       [<ffffffffa000ee13>] store_scan+0x13/0x20 [scsi_mod]
       [<ffffffff811c8d9b>] sysfs_write_file+0xcb/0x160
       [<ffffffff811589de>] vfs_write+0xce/0x140
       [<ffffffff81158b53>] sys_write+0x53/0xa0
       [<ffffffff81464592>] system_call_fastpath+0x16/0x1b
       [<00007f611c9d9300>] 0x7f611c9d92ff
      Reported-by: NMax Gurtuvoy <maxg@mellanox.com>
      Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: NMike Christie <michaelc@cs.wisc.edu>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NJames Bottomley <JBottomley@Odin.com>
      bba0bdd7
  3. 08 4月, 2015 2 次提交
    • M
      include/linux/dmapool.h: declare struct device · ce66b032
      Mark Brown 提交于
      dmapool uses struct device in function arguments but relies on an
      implicit inclusion to declare struct device causing warnings in some
      configurations:
      
        include/linux/dmapool.h:31:7: warning: 'struct device' declared inside parameter list
      
      Fix this by adding a struct device declaration to the file.
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ce66b032
    • M
      mm: move zone lock to a different cache line than order-0 free page lists · a368ab67
      Mel Gorman 提交于
      Huang Ying reported the following problem due to commit 3484b2de ("mm:
      rearrange zone fields into read-only, page alloc, statistics and page
      reclaim lines") from the Intel performance tests
      
          24b7e581  3484b2de
          ----------------  --------------------------
                   %stddev     %change         %stddev
                       \          |                \
              152288 \261  0%     -46.2%      81911 \261  0%  aim7.jobs-per-min
                 237 \261  0%     +85.6%        440 \261  0%  aim7.time.elapsed_time
                 237 \261  0%     +85.6%        440 \261  0%  aim7.time.elapsed_time.max
               25026 \261  0%     +70.7%      42712 \261  0%  aim7.time.system_time
             2186645 \261  5%     +32.0%    2885949 \261  4%  aim7.time.voluntary_context_switches
             4576561 \261  1%     +24.9%    5715773 \261  0%  aim7.time.involuntary_context_switches
      
      The problem is specific to very large machines under stress.  It was not
      reproducible with the machines I had used to justify the original patch
      because large numbers of CPUs are required.  When pressure is high enough,
      the cache line is bouncing between CPUs trying to acquire the lock and the
      holder of the lock adjusting free lists.  The intention was that the
      acquirer of the lock would automatically have the cache line holding the
      free lists but according to Huang, this is not a universal win.
      
      One possibility is to move the zone lock to its own cache line but it
      increases the size of the zone.  This patch moves the lock to the other
      end of the free lists where they do not contend under high pressure.  It
      does mean the page allocator paths now require more cache lines but Huang
      reports that it restores performance to previous levels on large machines
      
                   %stddev     %change         %stddev
                       \          |                \
               84568 \261  1%     +94.3%     164280 \261  1%  aim7.jobs-per-min
             2881944 \261  2%     -35.1%    1870386 \261  8%  aim7.time.voluntary_context_switches
                 681 \261  1%      -3.4%        658 \261  0%  aim7.time.user_time
             5538139 \261  0%     -12.1%    4867884 \261  0%  aim7.time.involuntary_context_switches
               44174 \261  1%     -46.0%      23848 \261  1%  aim7.time.system_time
                 426 \261  1%     -48.4%        219 \261  1%  aim7.time.elapsed_time
                 426 \261  1%     -48.4%        219 \261  1%  aim7.time.elapsed_time.max
                 468 \261  1%     -43.1%        266 \261  2%  uptime.boot
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Reported-by: NHuang Ying <ying.huang@intel.com>
      Tested-by: NHuang Ying <ying.huang@intel.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a368ab67
  4. 07 4月, 2015 2 次提交
    • A
      fix mremap() vs. ioctx_kill() race · b2edffdd
      Al Viro 提交于
      teach ->mremap() method to return an error and have it fail for
      aio mappings in process of being killed
      
      Note that in case of ->mremap() failure we need to undo move_page_tables()
      we'd already done; we could call ->mremap() first, but then the failure of
      move_page_tables() would require undoing whatever _successful_ ->mremap()
      has done, which would be a lot more headache in general.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b2edffdd
    • H
      ipv6: protect skb->sk accesses from recursive dereference inside the stack · f60e5990
      hannes@stressinduktion.org 提交于
      We should not consult skb->sk for output decisions in xmit recursion
      levels > 0 in the stack. Otherwise local socket settings could influence
      the result of e.g. tunnel encapsulation process.
      
      ipv6 does not conform with this in three places:
      
      1) ip6_fragment: we do consult ipv6_npinfo for frag_size
      
      2) sk_mc_loop in ipv6 uses skb->sk and checks if we should
         loop the packet back to the local socket
      
      3) ip6_skb_dst_mtu could query the settings from the user socket and
         force a wrong MTU
      
      Furthermore:
      In sk_mc_loop we could potentially land in WARN_ON(1) if we use a
      PF_PACKET socket ontop of an IPv6-backed vxlan device.
      
      Reuse xmit_recursion as we are currently only interested in protecting
      tunnel devices.
      
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f60e5990
  5. 03 4月, 2015 1 次提交
  6. 01 4月, 2015 1 次提交
    • J
      sunrpc: make debugfs file creation failure non-fatal · f9c72d10
      Jeff Layton 提交于
      We currently have a problem that SELinux policy is being enforced when
      creating debugfs files. If a debugfs file is created as a side effect of
      doing some syscall, then that creation can fail if the SELinux policy
      for that process prevents it.
      
      This seems wrong. We don't do that for files under /proc, for instance,
      so Bruce has proposed a patch to fix that.
      
      While discussing that patch however, Greg K.H. stated:
      
          "No kernel code should care / fail if a debugfs function fails, so
           please fix up the sunrpc code first."
      
      This patch converts all of the sunrpc debugfs setup code to be void
      return functins, and the callers to not look for errors from those
      functions.
      
      This should allow rpc_clnt and rpc_xprt creation to work, even if the
      kernel fails to create debugfs files for some reason.
      
      Symptoms were failing krb5 mounts on systems using gss-proxy and
      selinux.
      
      Fixes: 388f0c77 "sunrpc: add a debugfs rpc_xprt directory..."
      Cc: stable@vger.kernel.org
      Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f9c72d10
  7. 31 3月, 2015 2 次提交
    • M
      block: fix blk_stack_limits() regression due to lcm() change · e9637415
      Mike Snitzer 提交于
      Linux 3.19 commit 69c953c8 ("lib/lcm.c: lcm(n,0)=lcm(0,n) is 0, not n")
      caused blk_stack_limits() to not properly stack queue_limits for stacked
      devices (e.g. DM).
      
      Fix this regression by establishing lcm_not_zero() and switching
      blk_stack_limits() over to using it.
      
      DM uses blk_set_stacking_limits() to establish the initial top-level
      queue_limits that are then built up based on underlying devices' limits
      using blk_stack_limits().  In the case of optimal_io_size (io_opt)
      blk_set_stacking_limits() establishes a default value of 0.  With commit
      69c953c8, lcm(0, n) is no longer n, which compromises proper stacking of
      the underlying devices' io_opt.
      
      Test:
      $ modprobe scsi_debug dev_size_mb=10 num_tgts=1 opt_blks=1536
      $ cat /sys/block/sde/queue/optimal_io_size
      786432
      $ dmsetup create node --table "0 100 linear /dev/sde 0"
      
      Before this fix:
      $ cat /sys/block/dm-5/queue/optimal_io_size
      0
      
      After this fix:
      $ cat /sys/block/dm-5/queue/optimal_io_size
      786432
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org # 3.19+
      Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      e9637415
    • C
      nfsd: require an explicit option to enable pNFS · f3f03330
      Christoph Hellwig 提交于
      Turns out sending out layouts to any client is a bad idea if they
      can't get at the storage device, so require explicit admin action
      to enable pNFS.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f3f03330
  8. 30 3月, 2015 4 次提交
  9. 26 3月, 2015 1 次提交
    • M
      mm: numa: slow PTE scan rate if migration failures occur · 074c2381
      Mel Gorman 提交于
      Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226
      
        Across the board the 4.0-rc1 numbers are much slower, and the degradation
        is far worse when using the large memory footprint configs. Perf points
        straight at the cause - this is from 4.0-rc1 on the "-o bhash=101073" config:
      
         -   56.07%    56.07%  [kernel]            [k] default_send_IPI_mask_sequence_phys
            - default_send_IPI_mask_sequence_phys
               - 99.99% physflat_send_IPI_mask
                  - 99.37% native_send_call_func_ipi
                       smp_call_function_many
                     - native_flush_tlb_others
                        - 99.85% flush_tlb_page
                             ptep_clear_flush
                             try_to_unmap_one
                             rmap_walk
                             try_to_unmap
                             migrate_pages
                             migrate_misplaced_page
                           - handle_mm_fault
                              - 99.73% __do_page_fault
                                   trace_do_page_fault
                                   do_async_page_fault
                                 + async_page_fault
                    0.63% native_send_call_func_single_ipi
                       generic_exec_single
                       smp_call_function_single
      
      This is showing excessive migration activity even though excessive
      migrations are meant to get throttled.  Normally, the scan rate is tuned
      on a per-task basis depending on the locality of faults.  However, if
      migrations fail for any reason then the PTE scanner may scan faster if
      the faults continue to be remote.  This means there is higher system CPU
      overhead and fault trapping at exactly the time we know that migrations
      cannot happen.  This patch tracks when migration failures occur and
      slows the PTE scanner.
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Reported-by: NDave Chinner <david@fromorbit.com>
      Tested-by: NDave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      074c2381
  10. 21 3月, 2015 1 次提交
    • C
      Input: add MT_TOOL_PALM · a736775d
      Charlie Mooney 提交于
      Currently there are only two "tools" that can be specified by a multi-touch
      driver: MT_TOOL_FINGER and MT_TOOL_PEN. In working with Elan (The touch
      vendor) and discussing their next-gen devices it seems that it will be
      useful to have more tools so that their devices can give the upper layers
      of the stack hints as to what is touching the sensor.
      
      In particular they have new experimental firmware that can better
      differentiate between palms vs fingertips and would like to plumb a patch
      so that we can use their hints in higher-level gesture soft- ware.  The
      firmware on the device can reasonably do a better job of palm detection
      because it has access to all of the raw sensor readings as opposed to just
      the width/pressure/etc that are exposed by the driver.  As such, the
      firmware can characterize what a palm looks like in much finer-grained
      detail and this change would allow such a device to share its findings with
      the kernel.
      Signed-off-by: NCharlie Mooney <charliemooney@chromium.org>
      Acked-by: NPeter Hutterer <peter.hutterer@who-t.net>
      Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      a736775d
  11. 20 3月, 2015 3 次提交
    • C
      target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0 · 9bc6548f
      Christophe Vu-Brugier 提交于
      A check that rejects a CDB with FUA bit set if no write cache is
      emulated was added by the following commit:
      
        fde9f50f target: Add sanity checks for DPO/FUA bit usage
      
      The condition is as follows:
      
        if (!dev->dev_attrib.emulate_fua_write ||
            !dev->dev_attrib.emulate_write_cache)
      
      However, this check is wrong if the backend device supports WCE but
      "emulate_write_cache" is disabled.
      
      This patch uses se_dev_check_wce() (previously named
      spc_check_dev_wce) to invoke transport->get_write_cache() if the
      device has a write cache or check the "emulate_write_cache" attribute
      otherwise.
      Reported-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NChristophe Vu-Brugier <cvubrugier@fastmail.fm>
      Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
      9bc6548f
    • P
      regmap: introduce regmap_name to fix syscon regmap trace events · c6b570d9
      Philipp Zabel 提交于
      This patch fixes a NULL pointer dereference when enabling regmap event
      tracing in the presence of a syscon regmap, introduced by commit bdb0066d
      ("mfd: syscon: Decouple syscon interface from platform devices").
      That patch introduced syscon regmaps that have their dev field set to NULL.
      The regmap trace events expect it to point to a valid struct device and feed
      it to dev_name():
      
        $ echo 1 > /sys/kernel/debug/tracing/events/regmap/enable
      
        Unable to handle kernel NULL pointer dereference at virtual address 0000002c
        pgd = 80004000
        [0000002c] *pgd=00000000
        Internal error: Oops: 17 [#1] SMP ARM
        Modules linked in: coda videobuf2_vmalloc
        CPU: 0 PID: 304 Comm: kworker/0:2 Not tainted 4.0.0-rc2+ #9197
        Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
        Workqueue: events_freezable thermal_zone_device_check
        task: 9f25a200 ti: 9f1ee000 task.ti: 9f1ee000
        PC is at ftrace_raw_event_regmap_block+0x3c/0xe4
        LR is at _regmap_raw_read+0x1bc/0x1cc
        pc : [<803636e8>]    lr : [<80365f2c>]    psr: 600f0093
        sp : 9f1efd78  ip : 9f1efdb8  fp : 9f1efdb4
        r10: 00000004  r9 : 00000001  r8 : 00000001
        r7 : 00000180  r6 : 00000000  r5 : 9f00e3c0  r4 : 00000003
        r3 : 00000001  r2 : 00000180  r1 : 00000000  r0 : 9f00e3c0
        Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
        Control: 10c5387d  Table: 2d91004a  DAC: 00000015
        Process kworker/0:2 (pid: 304, stack limit = 0x9f1ee210)
        Stack: (0x9f1efd78 to 0x9f1f0000)
        fd60:                                                       9f1efda4 9f1efd88
        fd80: 800708c0 805f9510 80927140 800f0013 9f1fc800 9eb2f490 00000000 00000180
        fda0: 808e3840 00000001 9f1efdfc 9f1efdb8 80365f2c 803636b8 805f8958 800708e0
        fdc0: a00f0013 803636ac 9f16de00 00000180 80927140 9f1fc800 9f1fc800 9f1efe6c
        fde0: 9f1efe6c 9f732400 00000000 00000000 9f1efe1c 9f1efe00 80365f70 80365d7c
        fe00: 80365f3c 9f1fc800 9f1fc800 00000180 9f1efe44 9f1efe20 803656a4 80365f48
        fe20: 9f1fc800 00000180 9f1efe6c 9f1efe6c 9f732400 00000000 9f1efe64 9f1efe48
        fe40: 803657bc 80365634 00000001 9e95f910 9f1fc800 9f1efeb4 9f1efe8c 9f1efe68
        fe60: 80452ac0 80365778 9f1efe8c 9f1efe78 9e93d400 9e93d5e8 9f1efeb4 9f72ef40
        fe80: 9f1efeac 9f1efe90 8044e11c 80452998 8045298c 9e93d608 9e93d400 808e1978
        fea0: 9f1efecc 9f1efeb0 8044fd14 8044e0d0 ffffffff 9f25a200 9e93d608 9e481380
        fec0: 9f1efedc 9f1efed0 8044fde8 8044fcec 9f1eff1c 9f1efee0 80038d50 8044fdd8
        fee0: 9f1ee020 9f72ef40 9e481398 00000000 00000008 9f72ef54 9f1ee020 9f72ef40
        ff00: 9e481398 9e481380 00000008 9f72ef40 9f1eff5c 9f1eff20 80039754 80038bfc
        ff20: 00000000 9e481380 80894100 808e1662 00000000 9e4f2ec0 00000000 9e481380
        ff40: 800396f8 00000000 00000000 00000000 9f1effac 9f1eff60 8003e020 80039704
        ff60: ffffffff 00000000 ffffffff 9e481380 00000000 00000000 9f1eff78 9f1eff78
        ff80: 00000000 00000000 9f1eff88 9f1eff88 9e4f2ec0 8003df30 00000000 00000000
        ffa0: 00000000 9f1effb0 8000eb60 8003df3c 00000000 00000000 00000000 00000000
        ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff ffffffff
        Backtrace:
        [<803636ac>] (ftrace_raw_event_regmap_block) from [<80365f2c>] (_regmap_raw_read+0x1bc/0x1cc)
         r9:00000001 r8:808e3840 r7:00000180 r6:00000000 r5:9eb2f490 r4:9f1fc800
        [<80365d70>] (_regmap_raw_read) from [<80365f70>] (_regmap_bus_read+0x34/0x6c)
         r10:00000000 r9:00000000 r8:9f732400 r7:9f1efe6c r6:9f1efe6c r5:9f1fc800
         r4:9f1fc800
        [<80365f3c>] (_regmap_bus_read) from [<803656a4>] (_regmap_read+0x7c/0x144)
         r6:00000180 r5:9f1fc800 r4:9f1fc800 r3:80365f3c
        [<80365628>] (_regmap_read) from [<803657bc>] (regmap_read+0x50/0x70)
         r9:00000000 r8:9f732400 r7:9f1efe6c r6:9f1efe6c r5:00000180 r4:9f1fc800
        [<8036576c>] (regmap_read) from [<80452ac0>] (imx_get_temp+0x134/0x1a4)
         r6:9f1efeb4 r5:9f1fc800 r4:9e95f910 r3:00000001
        [<8045298c>] (imx_get_temp) from [<8044e11c>] (thermal_zone_get_temp+0x58/0x74)
         r7:9f72ef40 r6:9f1efeb4 r5:9e93d5e8 r4:9e93d400
        [<8044e0c4>] (thermal_zone_get_temp) from [<8044fd14>] (thermal_zone_device_update+0x34/0xec)
         r6:808e1978 r5:9e93d400 r4:9e93d608 r3:8045298c
        [<8044fce0>] (thermal_zone_device_update) from [<8044fde8>] (thermal_zone_device_check+0x1c/0x20)
         r5:9e481380 r4:9e93d608
        [<8044fdcc>] (thermal_zone_device_check) from [<80038d50>] (process_one_work+0x160/0x3d4)
        [<80038bf0>] (process_one_work) from [<80039754>] (worker_thread+0x5c/0x4f4)
         r10:9f72ef40 r9:00000008 r8:9e481380 r7:9e481398 r6:9f72ef40 r5:9f1ee020
         r4:9f72ef54
        [<800396f8>] (worker_thread) from [<8003e020>] (kthread+0xf0/0x108)
         r10:00000000 r9:00000000 r8:00000000 r7:800396f8 r6:9e481380 r5:00000000
         r4:9e4f2ec0
        [<8003df30>] (kthread) from [<8000eb60>] (ret_from_fork+0x14/0x34)
         r7:00000000 r6:00000000 r5:8003df30 r4:9e4f2ec0
        Code: e3140040 1a00001a e3140020 1a000016 (e596002c)
        ---[ end trace 193c15c2494ec960 ]---
      
      Fixes: bdb0066d (mfd: syscon: Decouple syscon interface from platform devices)
      Signed-off-by: NPhilipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      c6b570d9
    • S
      ata: Add a new flag to destinguish sas controller · 5067c046
      Shaohua Li 提交于
      SAS controller has its own tag allocation, which doesn't directly match to ATA
      tag, so SAS and SATA have different code path for ata tags. Originally we use
      port->scsi_host (98bd4be1) to destinguish SAS controller, but libsas set
      ->scsi_host too, so we can't use it for the destinguish, we add a new flag for
      this purpose.
      
      Without this patch, the following oops can happen because scsi-mq uses
      a host-wide tag map shared among all devices with some integer tag
      values >= ATA_MAX_QUEUE.  These unexpectedly high tag values cause
      __ata_qc_from_tag() to return NULL, which is then dereferenced in
      ata_qc_new_init().
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
        IP: [<ffffffff804fd46e>] ata_qc_new_init+0x3e/0x120
        PGD 32adf0067 PUD 32adf1067 PMD 0
        Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
        Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi igb
        i2c_algo_bit ptp pps_core pm80xx libsas scsi_transport_sas sg coretemp
        eeprom w83795 i2c_i801
        CPU: 4 PID: 1450 Comm: cydiskbench Not tainted 4.0.0-rc3 #1
        Hardware name: Supermicro X8DTH-i/6/iF/6F/X8DTH, BIOS 2.1b       05/04/12
        task: ffff8800ba86d500 ti: ffff88032a064000 task.ti: ffff88032a064000
        RIP: 0010:[<ffffffff804fd46e>]  [<ffffffff804fd46e>] ata_qc_new_init+0x3e/0x120
        RSP: 0018:ffff88032a067858  EFLAGS: 00010046
        RAX: 0000000000000000 RBX: ffff8800ba0d2230 RCX: 000000000000002a
        RDX: ffffffff80505ae0 RSI: 0000000000000020 RDI: ffff8800ba0d2230
        RBP: ffff88032a067868 R08: 0000000000000201 R09: 0000000000000001
        R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ba0d0000
        R13: ffff8800ba0d2230 R14: ffffffff80505ae0 R15: ffff8800ba0d0000
        FS:  0000000041223950(0063) GS:ffff88033e480000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 0000000000000058 CR3: 000000032a0a3000 CR4: 00000000000006e0
        Stack:
         ffff880329eee758 ffff880329eee758 ffff88032a0678a8 ffffffff80502dad
         ffff8800ba167978 ffff880329eee758 ffff88032bf9c520 ffff8800ba167978
         ffff88032bf9c520 ffff88032bf9a290 ffff88032a0678b8 ffffffff80506909
        Call Trace:
         [<ffffffff80502dad>] ata_scsi_translate+0x3d/0x1b0
         [<ffffffff80506909>] ata_sas_queuecmd+0x149/0x2a0
         [<ffffffffa0046650>] sas_queuecommand+0xa0/0x1f0 [libsas]
         [<ffffffff804ea544>] scsi_dispatch_cmd+0xd4/0x1a0
         [<ffffffff804eb50f>] scsi_queue_rq+0x66f/0x7f0
         [<ffffffff803e5098>] __blk_mq_run_hw_queue+0x208/0x3f0
         [<ffffffff803e54b8>] blk_mq_run_hw_queue+0x88/0xc0
         [<ffffffff803e5c74>] blk_mq_insert_request+0xc4/0x130
         [<ffffffff803e0b63>] blk_execute_rq_nowait+0x73/0x160
         [<ffffffffa0023fca>] sg_common_write+0x3da/0x720 [sg]
         [<ffffffffa0025100>] sg_new_write+0x250/0x360 [sg]
         [<ffffffffa0025feb>] sg_write+0x13b/0x450 [sg]
         [<ffffffff8032ec91>] vfs_write+0xd1/0x1b0
         [<ffffffff8032ee54>] SyS_write+0x54/0xc0
         [<ffffffff80689932>] system_call_fastpath+0x12/0x17
      
      tj: updated description.
      
      Fixes: 12cb5ce1 ("libata: use blk taging")
      Reported-and-tested-by: NTony Battersby <tonyb@cybernetics.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      5067c046
  12. 19 3月, 2015 1 次提交
    • P
      netfilter: restore rule tracing via nfnetlink_log · 4017a7ee
      Pablo Neira Ayuso 提交于
      Since fab4085f ("netfilter: log: nf_log_packet() as real unified
      interface"), the loginfo structure that is passed to nf_log_packet() is
      used to explicitly indicate the logger type you want to use.
      
      This is a problem for people tracing rules through nfnetlink_log since
      packets are always routed to the NF_LOG_TYPE logger after the
      aforementioned patch.
      
      We can fix this by removing the trace loginfo structures, but that still
      changes the log level from 4 to 5 for tracing messages and there may be
      someone relying on this outthere. So let's just introduce a new
      nf_log_trace() function that restores the former behaviour.
      Reported-by: NMarkus Kötter <koetter@rrzn.uni-hannover.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      4017a7ee
  13. 18 3月, 2015 4 次提交
  14. 17 3月, 2015 2 次提交
    • K
      regulator: palmas: Correct TPS659038 register definition for REGEN2 · e03826d5
      Keerthy 提交于
      The register offset for REGEN2_CTRL in different for TPS659038 chip as when
      compared with other Palmas family PMICs. In the case of TPS659038 the wrong
      offset pointed to PLLEN_CTRL and was causing a hang. Correcting the same.
      Signed-off-by: NKeerthy <j-keerthy@ti.com>
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      e03826d5
    • P
      livepatch: Fix subtle race with coming and going modules · 8cb2c2dc
      Petr Mladek 提交于
      There is a notifier that handles live patches for coming and going modules.
      It takes klp_mutex lock to avoid races with coming and going patches but
      it does not keep the lock all the time. Therefore the following races are
      possible:
      
        1. The notifier is called sometime in STATE_MODULE_COMING. The module
           is visible by find_module() in this state all the time. It means that
           new patch can be registered and enabled even before the notifier is
           called. It might create wrong order of stacked patches, see below
           for an example.
      
         2. New patch could still see the module in the GOING state even after
            the notifier has been called. It will try to initialize the related
            object structures but the module could disappear at any time. There
            will stay mess in the structures. It might even cause an invalid
            memory access.
      
      This patch solves the problem by adding a boolean variable into struct module.
      The value is true after the coming and before the going handler is called.
      New patches need to be applied when the value is true and they need to ignore
      the module when the value is false.
      
      Note that we need to know state of all modules on the system. The races are
      related to new patches. Therefore we do not know what modules will get
      patched.
      
      Also note that we could not simply ignore going modules. The code from the
      module could be called even in the GOING state until mod->exit() finishes.
      If we start supporting patches with semantic changes between function
      calls, we need to apply new patches to any still usable code.
      See below for an example.
      
      Finally note that the patch solves only the situation when a new patch is
      registered. There are no such problems when the patch is being removed.
      It does not matter who disable the patch first, whether the normal
      disable_patch() or the module notifier. There is nothing to do
      once the patch is disabled.
      
      Alternative solutions:
      ======================
      
      + reject new patches when a patched module is coming or going; this is ugly
      
      + wait with adding new patch until the module leaves the COMING and GOING
        states; this might be dangerous and complicated; we would need to release
        kgr_lock in the middle of the patch registration to avoid a deadlock
        with the coming and going handlers; also we might need a waitqueue for
        each module which seems to be even bigger overhead than the boolean
      
      + stop modules from entering COMING and GOING states; wait until modules
        leave these states when they are already there; looks complicated; we would
        need to ignore the module that asked to stop the others to avoid a deadlock;
        also it is unclear what to do when two modules asked to stop others and
        both are in COMING state (situation when two new patches are applied)
      
      + always register/enable new patches and fix up the potential mess (registered
        patches order) in klp_module_init(); this is nasty and prone to regressions
        in the future development
      
      + add another MODULE_STATE where the kallsyms are visible but the module is not
        used yet; this looks too complex; the module states are checked on "many"
        locations
      
      Example of patch stacking breakage:
      ===================================
      
      The notifier could _not_ _simply_ ignore already initialized module objects.
      For example, let's have three patches (P1, P2, P3) for functions a() and b()
      where a() is from vmcore and b() is from a module M. Something like:
      
      	a()	b()
      P1	a1()	b1()
      P2	a2()	b2()
      P3	a3()	b3(3)
      
      If you load the module M after all patches are registered and enabled.
      The ftrace ops for function a() and b() has listed the functions in this
      order:
      
      	ops_a->func_stack -> list(a3,a2,a1)
      	ops_b->func_stack -> list(b3,b2,b1)
      
      , so the pointer to b3() is the first and will be used.
      
      Then you might have the following scenario. Let's start with state when patches
      P1 and P2 are registered and enabled but the module M is not loaded. Then ftrace
      ops for b() does not exist. Then we get into the following race:
      
      CPU0					CPU1
      
      load_module(M)
      
        complete_formation()
      
        mod->state = MODULE_STATE_COMING;
        mutex_unlock(&module_mutex);
      
      					klp_register_patch(P3);
      					klp_enable_patch(P3);
      
      					# STATE 1
      
        klp_module_notify(M)
          klp_module_notify_coming(P1);
          klp_module_notify_coming(P2);
          klp_module_notify_coming(P3);
      
      					# STATE 2
      
      The ftrace ops for a() and b() then looks:
      
        STATE1:
      
      	ops_a->func_stack -> list(a3,a2,a1);
      	ops_b->func_stack -> list(b3);
      
        STATE2:
      	ops_a->func_stack -> list(a3,a2,a1);
      	ops_b->func_stack -> list(b2,b1,b3);
      
      therefore, b2() is used for the module but a3() is used for vmcore
      because they were the last added.
      
      Example of the race with going modules:
      =======================================
      
      CPU0					CPU1
      
      delete_module()  #SYSCALL
      
         try_stop_module()
           mod->state = MODULE_STATE_GOING;
      
         mutex_unlock(&module_mutex);
      
      					klp_register_patch()
      					klp_enable_patch()
      
      					#save place to switch universe
      
      					b()     # from module that is going
      					  a()   # from core (patched)
      
         mod->exit();
      
      Note that the function b() can be called until we call mod->exit().
      
      If we do not apply patch against b() because it is in MODULE_STATE_GOING,
      it will call patched a() with modified semantic and things might get wrong.
      
      [jpoimboe@redhat.com: use one boolean instead of two]
      Signed-off-by: NPetr Mladek <pmladek@suse.cz>
      Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      8cb2c2dc
  15. 14 3月, 2015 2 次提交
  16. 13 3月, 2015 4 次提交
  17. 12 3月, 2015 2 次提交
    • E
      xps: must clear sender_cpu before forwarding · c29390c6
      Eric Dumazet 提交于
      John reported that my previous commit added a regression
      on his router.
      
      This is because sender_cpu & napi_id share a common location,
      so get_xps_queue() can see garbage and perform an out of bound access.
      
      We need to make sure sender_cpu is cleared before doing the transmit,
      otherwise any NIC busy poll enabled (skb_mark_napi_id()) can trigger
      this bug.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NJohn <jw@nuclearfallout.net>
      Bisected-by: NJohn <jw@nuclearfallout.net>
      Fixes: 2bd82484 ("xps: fix xps for stacked devices")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c29390c6
    • M
      clk: introduce clk_is_match · 3d3801ef
      Michael Turquette 提交于
      Some drivers compare struct clk pointers as a means of knowing
      if the two pointers reference the same clock hardware. This behavior is
      dubious (drivers must not dereference struct clk), but did not cause any
      regressions until the per-user struct clk patch was merged. Now the test
      for matching clk's will always fail with per-user struct clk's.
      
      clk_is_match is introduced to fix the regression and prevent drivers
      from comparing the pointers manually.
      
      Fixes: 035a61c3 ("clk: Make clk API return per-user struct clk instances")
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
      Signed-off-by: NMichael Turquette <mturquette@linaro.org>
      [arnd@arndb.de: Fix COMMON_CLK=N && HAS_CLK=Y config]
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      [sboyd@codeaurora.org: const arguments to clk_is_match() and
      remove unnecessary ternary operation]
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      3d3801ef
  18. 10 3月, 2015 2 次提交
  19. 08 3月, 2015 2 次提交
  20. 07 3月, 2015 1 次提交