1. 18 3月, 2020 15 次提交
    • M
      blk-mq: not embed .mq_kobj and ctx->kobj into queue instance · 9ff28240
      Ming Lei 提交于
      commit 1db4909e76f64a85f4aaa187f0f683f5c85a471d upstream.
      
      Even though .mq_kobj, ctx->kobj and q->kobj share same lifetime
      from block layer's view, actually they don't because userspace may
      grab one kobject anytime via sysfs.
      
      This patch fixes the issue by the following approach:
      
      1) introduce 'struct blk_mq_ctxs' for holding .mq_kobj and managing
      all ctxs
      
      2) free all allocated ctxs and the 'blk_mq_ctxs' instance in release
      handler of .mq_kobj
      
      3) grab one ref of .mq_kobj before initializing each ctx->kobj, so that
      .mq_kobj is always released after all ctxs are freed.
      
      This patch fixes kernel panic issue during booting when DEBUG_KOBJECT_RELEASE
      is enabled.
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Cc: "jianchao.wang" <jianchao.w.wang@oracle.com>
      Tested-by: NGuenter Roeck <linux@roeck-us.net>
      Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NMing Lei <ming.lei@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      9ff28240
    • Q
      mm/memblock.c: skip kmemleak for kasan_init() · 9f093569
      Qian Cai 提交于
      commit fed84c78527009d4f799a3ed9a566502fa026d82 upstream.
      
      Kmemleak does not play well with KASAN (tested on both HPE Apollo 70 and
      Huawei TaiShan 2280 aarch64 servers).
      
      After calling start_kernel()->setup_arch()->kasan_init(), kmemleak early
      log buffer went from something like 280 to 260000 which caused kmemleak
      disabled and crash dump memory reservation failed.  The multitude of
      kmemleak_alloc() calls is from nested loops while KASAN is setting up full
      memory mappings, so let early kmemleak allocations skip those
      memblock_alloc_internal() calls came from kasan_init() given that those
      early KASAN memory mappings should not reference to other memory.  Hence,
      no kmemleak false positives.
      
      kasan_init
        kasan_map_populate [1]
          kasan_pgd_populate [2]
            kasan_pud_populate [3]
              kasan_pmd_populate [4]
                kasan_pte_populate [5]
                  kasan_alloc_zeroed_page
                    memblock_alloc_try_nid
                      memblock_alloc_internal
                        kmemleak_alloc
      
      [1] for_each_memblock(memory, reg)
      [2] while (pgdp++, addr = next, addr != end)
      [3] while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)))
      [4] while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)))
      [5] while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep)))
      
      Link: http://lkml.kernel.org/r/1543442925-17794-1-git-send-email-cai@gmx.usSigned-off-by: NQian Cai <cai@gmx.us>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      9f093569
    • X
      alinux: jbd2: track slow handle which is preventing transaction committing · 83cd9d23
      Xiaoguang Wang 提交于
      While transaction is going to commit, it first sets its state to be
      T_LOCKED and waits all outstanding handles to complete, and the
      committing transaction will always be in locked state so long as it
      has outstanding handles, also the whole fs will be locked and all later
      fs modification operations will be stucked in wait_transaction_locked().
      
      It's hard to tell why handles are that slow, so here we add a new staic
      tracepoint to track such slow handle, and show io wait time and sched
      wait time, output likes below:
        fsstress-20347 [024] ....  1570.305454: jbd2_slow_handle_stats: dev 254,17
      tid 15853 type 4 line_no 3101 interval 126 sync 0 requested_blocks 24
      dirtied_blocks 0 trans_wait 122 space_wait 0 sched_wait 0 io_wait 126
      
      "trans_wait 122" means that this current committing transaction has been
      locked for 122ms, due to this handle is not completed quickly.
      
      From "io_wait 126", we can see that io is the major reason.
      
      In this patch, we also add a per fs control file used to determine
      whether a handle can be considered to be slow.
          /proc/fs/jbd2/vdb1-8/stall_thresh
      default value is 100ms, users can set new threshold by echoing new value
      to this file.
      
      Later I also plan to add a proc file fs per fs to record these info.
      Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      83cd9d23
    • X
      alinux: fs: record page or bio info while process is waitting on it · 2864376f
      Xiaoguang Wang 提交于
      If one process context is stucked in wait_on_buffer(), lock_buffer(),
      lock_page() and wait_on_page_writeback() and wait_on_bit_io(), it's
      hard to tell ture reason, for example, whether this page is under io,
      or this page is just locked too long by other process context.
      
      Normally io request has multiple bios, and every bio contains multiple
      pages which will hold data to be read from or written to device, so here
      we record page info or bio info in task_struct while process calls
      lock_page(), lock_buffer(), wait_on_page_writeback(), wait_on_buffer()
      and wait_on_bit_io(), we add a new proce interface:
      [lege@localhost linux]$ cat /proc/4516/wait_res
      1 ffffd0969f95d3c0 4295369599 4295381596
      
      Above info means that thread 4516 is waitting on a page, address is
      ffffd0969f95d3c0, and has waited for 11997ms.
      
      First field denotes the page address process is waitting on.
      Second field denotes the wait moment and the third denotes current moment.
      
      In practice, if we found a process waitting on one page for too long time,
      we can get page's address by reading /proc/$pid/wait_page, and search this
      page address in all block devices' /sys/kernel/debug/block/${devname}/rq_hang,
      if search operation hits one, we can get the request and know why this io
      request hangs that long.
      Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      2864376f
    • X
      alinux: blk: add iohang check function · 80d6ee24
      Xiaoguang Wang 提交于
      Background:
        We do not have a dependable block layer interface to determine whether
      block device has io requests which have not been completed for somewhat
      long time. Currently we have 'in_flight' interface, it counts the number
      of I/O requests that have been issued to the device driver but have
      not yet completed, and it does not include I/O requests that are in the
      queue but not yet issued to the device driver, which means it will not
      count io requests that have been stucked in block layer.
        Also say that there are steady io requests issued to device driver,
      'in_flight' maybe always non-zero, but you could not determine whether
      there is one io request which has not been completed for too long.
      
      Solution:
        To find io requests which have not been completed for too long, here
      add 3 new inferfaces:
        /sys/block/vdb/queue/hang_threshold
      If one io request's running time has been greater than this value, count
      this io as hang.
      
        /sys/block/vdb/hang
      Show read/write io requests' hang counter.
      
        /sys/kernel/debug/block/vdb/rq_hang
      Show all hang io requests's detailed info, like below:
        ffff97db96301200 {.op=WRITE, .cmd_flags=SYNC, .rq_flags=STARTED|
      ELVPRIV|IO_STAT|STATS, .state=in_flight, .tag=30, .internal_tag=169,
      .start_time_ns=140634088407, .io_start_time_ns=140634102958,
      .current_time=146497371953, .bio = ffff97db91e8e000,
      .bio_pages = { ffffd096a0602540 }, .bio = ffff97db91e8ec00,
      .bio_pages = { ffffd096a070eec0 }, .bio = ffff97db91e8f600,
      .bio_pages = { ffffd096a0424cc0 }, .bio = ffff97db91e8f300,
      .bio_pages = { ffffd096a0600a80 }}
      
      With above info, we can easily see this request's latency distribution,
      and see next patch for bio_pages's usage.
      
      Note, /sys/kernel/debug/block/vdb/rq_hang only exists in blk-mq device driver
      and needs CONFIG_BLK_DEBUG_FS enabled.
      Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      80d6ee24
    • T
      tcp: Add snd_wnd to TCP_INFO · ecee8235
      Thomas Higdon 提交于
      commit 8f7baad7f03543451af27f5380fc816b008aa1f2 upstream
      
      Neal Cardwell mentioned that snd_wnd would be useful for diagnosing TCP
      performance problems --
      > (1) Usually when we're diagnosing TCP performance problems, we do so
      > from the sender, since the sender makes most of the
      > performance-critical decisions (cwnd, pacing, TSO size, TSQ, etc).
      > From the sender-side the thing that would be most useful is to see
      > tp->snd_wnd, the receive window that the receiver has advertised to
      > the sender.
      
      This serves the purpose of adding an additional __u32 to avoid the
      would-be hole caused by the addition of the tcpi_rcvi_ooopack field.
      Signed-off-by: NThomas Higdon <tph@fb.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NTony Lu <tonylu@linux.alibaba.com>
      Acked-by: NDust Li <dust.li@linux.alibaba.com>
      ecee8235
    • T
      tcp: Add TCP_INFO counter for packets received out-of-order · 0107d737
      Thomas Higdon 提交于
      commit f9af2dbbfe01def62765a58af7fbc488351893c3 upstream
      
      For receive-heavy cases on the server-side, we want to track the
      connection quality for individual client IPs. This counter, similar to
      the existing system-wide TCPOFOQueue counter in /proc/net/netstat,
      tracks out-of-order packet reception. By providing this counter in
      TCP_INFO, it will allow understanding to what degree receive-heavy
      sockets are experiencing out-of-order delivery and packet drops
      indicating congestion.
      
      Please note that this is similar to the counter in NetBSD TCP_INFO, and
      has the same name.
      
      Also note that we avoid increasing the size of the tcp_sock struct by
      taking advantage of a hole.
      Signed-off-by: NThomas Higdon <tph@fb.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NTony Lu <tonylu@linux.alibaba.com>
      Acked-by: NDust Li <dust.li@linux.alibaba.com>
      0107d737
    • S
      mm, memcg: introduce memory.events.local · de7ca746
      Shakeel Butt 提交于
      commit 1e577f970f66a53d429cbee37b36177c9712f488 upstream.
      
      The memory controller in cgroup v2 exposes memory.events file for each
      memcg which shows the number of times events like low, high, max, oom
      and oom_kill have happened for the whole tree rooted at that memcg.
      Users can also poll or register notification to monitor the changes in
      that file.  Any event at any level of the tree rooted at memcg will
      notify all the listeners along the path till root_mem_cgroup.  There are
      existing users which depend on this behavior.
      
      However there are users which are only interested in the events
      happening at a specific level of the memcg tree and not in the events in
      the underlying tree rooted at that memcg.  One such use-case is a
      centralized resource monitor which can dynamically adjust the limits of
      the jobs running on a system.  The jobs can create their sub-hierarchy
      for their own sub-tasks.  The centralized monitor is only interested in
      the events at the top level memcgs of the jobs as it can then act and
      adjust the limits of the jobs.  Using the current memory.events for such
      centralized monitor is very inconvenient.  The monitor will keep
      receiving events which it is not interested and to find if the received
      event is interesting, it has to read memory.event files of the next
      level and compare it with the top level one.  So, let's introduce
      memory.events.local to the memcg which shows and notify for the events
      at the memcg level.
      
      Now, does memory.stat and memory.pressure need their local versions.  IMHO
      no due to the no internal process contraint of the cgroup v2.  The
      memory.stat file of the top level memcg of a job shows the stats and
      vmevents of the whole tree.  The local stats or vmevents of the top level
      memcg will only change if there is a process running in that memcg but v2
      does not allow that.  Similarly for memory.pressure there will not be any
      process in the internal nodes and thus no chance of local pressure.
      
      Link: http://lkml.kernel.org/r/20190527174643.209172-1-shakeelb@google.comSigned-off-by: NShakeel Butt <shakeelb@google.com>
      Reviewed-by: NRoman Gushchin <guro@fb.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Chris Down <chris@chrisdown.name>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NXu Yu <xuyu@linux.alibaba.com>
      Reviewed-by: NXunlei Pang <xlpang@linux.alibaba.com>
      de7ca746
    • C
      mm, memcg: consider subtrees in memory.events · 0a198bb9
      Chris Down 提交于
      commit 9852ae3fe5293264f01c49f2571ef7688f7823ce upstream.
      
      memory.stat and other files already consider subtrees in their output, and
      we should too in order to not present an inconsistent interface.
      
      The current situation is fairly confusing, because people interacting with
      cgroups expect hierarchical behaviour in the vein of memory.stat,
      cgroup.events, and other files.  For example, this causes confusion when
      debugging reclaim events under low, as currently these always read "0" at
      non-leaf memcg nodes, which frequently causes people to misdiagnose breach
      behaviour.  The same confusion applies to other counters in this file when
      debugging issues.
      
      Aggregation is done at write time instead of at read-time since these
      counters aren't hot (unlike memory.stat which is per-page, so it does it
      at read time), and it makes sense to bundle this with the file
      notifications.
      
      After this patch, events are propagated up the hierarchy:
      
          [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
          low 0
          high 0
          max 0
          oom 0
          oom_kill 0
          [root@ktst ~]# systemd-run -p MemoryMax=1 true
          Running as unit: run-r251162a189fb4562b9dabfdc9b0422f5.service
          [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
          low 0
          high 0
          max 7
          oom 1
          oom_kill 1
      
      As this is a change in behaviour, this can be reverted to the old
      behaviour by mounting with the `memory_localevents' flag set.  However, we
      use the new behaviour by default as there's a lack of evidence that there
      are any current users of memory.events that would find this change
      undesirable.
      
      akpm: this is a behaviour change, so Cc:stable.  THis is so that
      forthcoming distros which use cgroup v2 are more likely to pick up the
      revised behaviour.
      
      [xuyu: remove the new memory_localevents mount option because it is
      rarely used]
      
      Link: http://lkml.kernel.org/r/20190208224419.GA24772@chrisdown.nameSigned-off-by: NChris Down <chris@chrisdown.name>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NShakeel Butt <shakeelb@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NXu Yu <xuyu@linux.alibaba.com>
      Reviewed-by: NXunlei Pang <xlpang@linux.alibaba.com>
      0a198bb9
    • R
      mm: introduce ARCH_HAS_PTE_DEVMAP · d7066a91
      Robin Murphy 提交于
      commit 175967318c3018d01931ac950c82adab5deb47ca upstream.
      
      ARCH_HAS_ZONE_DEVICE is somewhat meaningless in itself, and combined
      with the long-out-of-date comment can lead to the impression than an
      architecture may just enable it (since __add_pages() now "comprehends
      device memory" for itself) and expect things to work.
      
      In practice, however, ZONE_DEVICE users have little chance of
      functioning correctly without __HAVE_ARCH_PTE_DEVMAP, so let's clean
      that up the same way as ARCH_HAS_PTE_SPECIAL and make it the proper
      dependency so the real situation is clearer.
      
      Link: http://lkml.kernel.org/r/87554aa78478a02a63f2c4cf60a847279ae3eb3b.1558547956.git.robin.murphy@arm.comSigned-off-by: NRobin Murphy <robin.murphy@arm.com>
      Acked-by: NDan Williams <dan.j.williams@intel.com>
      Reviewed-by: NIra Weiny <ira.weiny@intel.com>
      Acked-by: NOliver O'Halloran <oohall@gmail.com>
      Reviewed-by: NAnshuman Khandual <anshuman.khandual@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NShannon Zhao <shannon.zhao@linux.alibaba.com>
      Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com>
      d7066a91
    • M
      linkage: add generic GLOBAL() macro · 37889fe9
      Mark Rutland 提交于
      commit ad697a1aecac19ec351063b5d8e6fc9d4bca7ee5 upstream.
      
      Declaring a global symbol in assembly is tedious, error-prone, and
      painful to read. While ENTRY() exists, this is supposed to be used for
      function entry points, and this affects alignment in a potentially
      undesireable manner.
      
      Instead, let's add a generic GLOBAL() macro for this, as x86 added
      locally in commit:
      
        95695547 ("x86: asm linkage - introduce GLOBAL macro")
      
      ... thus allowing us to use this more freely in the kernel.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Torsten Duwe <duwe@suse.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: Zou Cao<zoucao@linux.alibaba.com>
      Acked-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>
      37889fe9
    • S
      compiler.h: add CC_USING_PATCHABLE_FUNCTION_ENTRY · 7f887650
      Sven Schnelle 提交于
      commit 2809b392a62ae307da058a52d451b2fc3ce4de7e upstream.
      
      This can be used for architectures implementing dynamic
      ftrace via -fpatchable-function-entry.
      Signed-off-by: NSven Schnelle <svens@stackframe.org>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Signed-off-by: Zou Cao<zoucao@linux.alibaba.com>
      Acked-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>
      7f887650
    • M
      module/ftrace: handle patchable-function-entry · 0f596d7d
      Mark Rutland 提交于
      backport from a1326b17ac03a9012cb3d01e434aacb4d67a416c upstream
      
      When using patchable-function-entry, the compiler will record the
      callsites into a section named "__patchable_function_entries" rather
      than "__mcount_loc". Let's abstract this difference behind a new
      FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this
      explicitly (e.g. with custom module linker scripts).
      
      As parisc currently handles this explicitly, it is fixed up accordingly,
      with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is
      only defined when DYNAMIC_FTRACE is selected, the parisc module loading
      code is updated to only use the definition in that case. When
      DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so
      this removes some redundant work in that case.
      
      To make sure that this is keep up-to-date for modules and the main
      kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery
      simplified for legibility.
      
      I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled,
      and verified that the section made it into the .ko files for modules.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NHelge Deller <deller@gmx.de>
      Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NTorsten Duwe <duwe@suse.de>
      Tested-by: NAmit Daniel Kachhap <amit.kachhap@arm.com>
      Tested-by: NSven Schnelle <svens@stackframe.org>
      Tested-by: NTorsten Duwe <duwe@suse.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: linux-parisc@vger.kernel.org
      Signed-off-by: Zou Cao<zoucao@linux.alibaba.com>
      Acked-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>
      0f596d7d
    • M
      ftrace: add ftrace_init_nop() · 21aca133
      Mark Rutland 提交于
      commit fbf6c73c5b264c25484fa9f449b5546569fe11f0 upstream
      
      Architectures may need to perform special initialization of ftrace
      callsites, and today they do so by special-casing ftrace_make_nop() when
      the expected branch address is MCOUNT_ADDR. In some cases (e.g. for
      patchable-function-entry), we don't have an mcount-like symbol and don't
      want a synthetic MCOUNT_ADDR, but we may need to perform some
      initialization of callsites.
      
      To make it possible to separate initialization from runtime
      modification, and to handle cases without an mcount-like symbol, this
      patch adds an optional ftrace_init_nop() function that architectures can
      implement, which does not pass a branch address.
      
      Where an architecture does not provide ftrace_init_nop(), we will fall
      back to the existing behaviour of calling ftrace_make_nop() with
      MCOUNT_ADDR.
      
      At the same time, ftrace_code_disable() is renamed to
      ftrace_nop_initialize() to make it clearer that it is intended to
      intialize a callsite into a disabled state, and is not for disabling a
      callsite that has been runtime enabled. The kerneldoc description of rec
      arguments is updated to cover non-mcount callsites.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NAmit Daniel Kachhap <amit.kachhap@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NMiroslav Benes <mbenes@suse.cz>
      Reviewed-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Reviewed-by: NTorsten Duwe <duwe@suse.de>
      Tested-by: NAmit Daniel Kachhap <amit.kachhap@arm.com>
      Tested-by: NSven Schnelle <svens@stackframe.org>
      Tested-by: NTorsten Duwe <duwe@suse.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Zou Cao<zoucao@linux.alibaba.com>
      Acked-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>
      21aca133
    • L
      spi: Optionally use GPIO descriptors for CS GPIOs · 7bc17db3
      Linus Walleij 提交于
      commit f3186dd876697e696d07136623d5cf0a6fb0bc0f upstream
      
      This augments the SPI core to optionally use GPIO descriptors
      for chip select on a per-master-driver opt-in basis.
      
      Drivers using this will rely on the SPI core to look up
      GPIO descriptors associated with the device, such as
      when using device tree or board files with GPIO descriptor
      tables.
      
      When getting descriptors from the device tree, this will in
      turn activate the code in gpiolib that was
      added in commit 6953c57ab172
      ("gpio: of: Handle SPI chipselect legacy bindings")
      which means that these descriptors are aware of the active
      low semantics that is the default for SPI CS GPIO lines
      and we can assume that all of these are "active high" and
      thus assign SPI_CS_HIGH to all CS lines on the DT path.
      
      The previously used gpio_set_value() would call down into
      gpiod_set_raw_value() and ignore the polarity inversion
      semantics.
      
      It seems like many drivers go to great lengths to set up the
      CS GPIO line as non-asserted, respecting SPI_CS_HIGH. We pull
      this out of the SPI drivers and into the core, and by simply
      requesting the line as GPIOD_OUT_LOW when retrieveing it from
      the device and relying on the gpiolib to handle any inversion
      semantics. This way a lot of code can be simplified and
      removed in each converted driver.
      
      The end goal after dealing with each driver in turn, is to
      delete the non-descriptor path (of_spi_register_master() for
      example) and let the core deal with only descriptors.
      
      The different SPI drivers have complex interactions with the
      core so we cannot simply change them all over, we need to use
      a stepwise, bisectable approach so that each driver can be
      converted and fixed in isolation.
      
      This patch has the intended side effect of adding support for
      ACPI GPIOs as it starts relying on gpiod_get_*() to get
      the GPIO handle associated with the device.
      
      Cc: Linuxarm <linuxarm@huawei.com>
      Acked-by: NJonathan Cameron <jonathan.cameron@huawei.com>
      Tested-by: NFangjian (Turing) <f.fangjian@huawei.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
      Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>
      7bc17db3
  2. 17 1月, 2020 25 次提交