1. 30 Oct 2019, 3 commits
    • ext4: adjust reserved cluster count when removing extents · eb792dc6
      Eric Whitney committed
      commit 9fe671496b6c286f9033aedfc1718d67721da0ae upstream.
      
      Modify ext4_ext_remove_space() and the code it calls to correct the
      reserved cluster count for pending reservations (delayed allocated
      clusters shared with allocated blocks) when a block range is removed
      from the extent tree.  Pending reservations may be found for the clusters
      at the ends of written or unwritten extents when a block range is removed.
      If a physical cluster at the end of an extent is freed, it's necessary
      to increment the reserved cluster count to maintain correct accounting
      if the corresponding logical cluster is shared with at least one
      delayed and unwritten extent as found in the extents status tree.
      
      Add a new function, ext4_rereserve_cluster(), to reapply a reservation
      on a delayed allocated cluster sharing blocks with a freed allocated
      cluster.  To avoid ENOSPC on reservation, a flag is applied to
      ext4_free_blocks() to briefly defer updating the freeclusters counter
      when an allocated cluster is freed.  This prevents another thread
      from allocating the freed block before the reservation can be reapplied.
      
      Redefine the partial cluster object as a struct to carry more state
      information and to clarify the code using it.
      
      Adjust the conditional code structure in ext4_ext_remove_space to
      reduce the indentation level in the main body of the code to improve
      readability.
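
      The reworked partial cluster state can be pictured roughly as follows
      (a sketch based on the description above; the enum records whether the
      cluster is still a candidate for freeing):

      	/* sketch: state carried along while removing a block range */
      	struct partial_cluster {
      		ext4_fsblk_t pclu;	/* physical cluster number */
      		ext4_lblk_t lblk;	/* logical block in the cluster */
      		enum {initial, tofree, nofree} state;
      	};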
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
    • ext4: fix reserved cluster accounting at delayed write time · b834eb8a
      Eric Whitney committed
      commit 0b02f4c0d6d9e2c611dfbdd4317193e9dca740e6 upstream.
      
      The code in ext4_da_map_blocks sometimes reserves space for more
      delayed allocated clusters than it should, resulting in premature
      ENOSPC, exceeded quota, and inaccurate free space reporting.
      
      Fix this by checking for written and unwritten blocks shared in the
      same cluster with the newly delayed allocated block.  A cluster
      reservation should not be made for a cluster for which physical space
      has already been allocated.
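
      A simplified sketch of the added check (the helper name
      ext4_clu_mapped() follows the upstream patch; the surrounding code and
      labels are illustrative):

      	/* only reserve a cluster if none of its blocks are already
      	 * allocated, written or unwritten */
      	if (sbi->s_cluster_ratio > 1) {
      		ret = ext4_clu_mapped(inode, EXT4_B2C(sbi, map->m_lblk));
      		if (ret < 0)
      			goto errout;
      		if (ret > 0)	/* cluster already backed by storage */
      			goto skip_reservation;
      	}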
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
    • ext4: generalize extents status tree search functions · 05cc8939
      Eric Whitney committed
      commit ad431025aecda85d3ebef5e4a3aca5c1c681d0c7 upstream.
      
      Ext4 contains a few functions that are used to search for delayed
      extents or blocks in the extents status tree.  Rather than duplicate
      code to add new functions to search for extents with different status
      values, such as written or a combination of delayed and unwritten,
      generalize the existing code to search for caller-specified extents
      status values.  Also, move this code into extents_status.c where it
      is better associated with the data structures it operates upon, and
      where it can be more readily used to implement new extents status tree
      functions that might want a broader scope for i_es_lock.
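
      A sketch of the generalized interface: the caller passes a predicate
      selecting which extent status values match (signature per the upstream
      patch; the usage line is illustrative):

      	void ext4_es_find_extent_range(struct inode *inode,
      				       int (*matching_fn)(struct extent_status *es),
      				       ext4_lblk_t lblk, ext4_lblk_t end,
      				       struct extent_status *es);

      	/* e.g. search only for delayed extents in [lblk, end] */
      	ext4_es_find_extent_range(inode, ext4_es_is_delayed, lblk, end, &es);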
      
      Three missing static specifiers in RFC version of patch reported and
      fixed by Fengguang Wu <fengguang.wu@intel.com>.
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
  2. 29 Oct 2019, 1 commit
    • btrfs: tracepoints: Fix bad entry members of qgroup events · 0b95aaae
      Qu Wenruo committed
      commit 1b2442b4ae0f234daeadd90e153b466332c466d8 upstream.
      
      [BUG]
      For btrfs:qgroup_meta_reserve event, the trace event can output garbage:
      
        qgroup_meta_reserve: 9c7f6acc-b342-4037-bc47-7f6e4d2232d7: refroot=5(FS_TREE) type=DATA diff=2
        qgroup_meta_reserve: 9c7f6acc-b342-4037-bc47-7f6e4d2232d7: refroot=5(FS_TREE) type=0x258792 diff=2
      
      The @type can be completely garbage, as DATA type is not possible for
      trace_qgroup_meta_reserve() trace event.
      
      [CAUSE]
      There are several problems related to qgroup trace events:
      - Unassigned entry member
        Member entry::type of trace_qgroup_update_reserve() and
        trace_qgroup_meta_reserve() is not assigned (see the sketch after
        this list)
      
      - Redundant entry member
        Member entry::type is completely useless in
        trace_qgroup_meta_convert()
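
      A sketch of the fix for the unassigned member: TP_fast_assign() must
      fill in every member declared in TP_STRUCT__entry, otherwise the event
      prints leftover stack garbage like the @type value above (member names
      abridged from the qgroup events):

      	TP_fast_assign_btrfs(root->fs_info,
      		__entry->refroot	= root->root_key.objectid;
      		__entry->diff		= diff;
      		__entry->type		= type;	/* previously missing */
      	),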
      
      Fixes: 4ee0d883 ("btrfs: qgroup: Update trace events for metadata reservation")
      CC: stable@vger.kernel.org # 4.10+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 08 Oct 2019, 1 commit
  4. 29 Aug 2019, 1 commit
    • rxrpc: Fix read-after-free in rxrpc_queue_local() · a05354cb
      David Howells committed
      commit 06d9532fa6b34f12a6d75711162d47c17c1add72 upstream.
      
      rxrpc_queue_local() attempts to queue the local endpoint it is given and
      then, if successful, prints a trace line.  The trace line includes the
      current usage count - but we're not allowed to look at the local endpoint
      at this point as we passed our ref on it to the workqueue.
      
      Fix this by reading the usage count before queuing the work item.
      
      Also fix the reading of local->debug_id for trace lines, which must be done
      with the same consideration as reading the usage count.
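
      A sketch of the fixed ordering (per the description above: sample the
      usage count and debug_id before the ref is handed to the workqueue):

      	static void rxrpc_queue_local(struct rxrpc_local *local)
      	{
      		const void *here = __builtin_return_address(0);
      		unsigned int debug_id = local->debug_id;
      		int n = atomic_read(&local->usage);

      		/* after a successful queue, local may already be freed */
      		if (rxrpc_queue_work(&local->processor))
      			trace_rxrpc_local(debug_id, rxrpc_local_queued,
      					  n + 1, here);
      		else
      			rxrpc_put_local(local);
      	}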
      
      Fixes: 09d2bf59 ("rxrpc: Add a tracepoint to track rxrpc_local refcounting")
      Reported-by: syzbot+78e71c5bab4f76a6a719@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  5. 26 Jul 2019, 1 commit
    • rxrpc: Fix oops in tracepoint · 0f2f2ceb
      David Howells committed
      [ Upstream commit 99f0eae653b2db64917d0b58099eb51e300b311d ]
      
      If the rxrpc_eproto tracepoint is enabled, an oops will be caused by the
      trace line that rxrpc_extract_header() tries to emit when a protocol
      error occurs (typically because the packet is short), because the call
      argument is NULL.
      
      Fix this by using ?: to assume 0 as the debug_id if call is NULL.
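
      The change amounts to one line in the tracepoint's assignment, sketched
      from the description above:

      	__entry->call = call ? call->debug_id : 0;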
      
      This can then be induced by:
      
      	echo -e '\0\0\0\0\0\0\0\0' | ncat -4u --send-only <addr> 20001
      
      where addr has the following program running on it:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <unistd.h>
      	#include <sys/socket.h>
      	#include <arpa/inet.h>
      	#include <linux/rxrpc.h>
      	int main(void)
      	{
      		struct sockaddr_rxrpc srx;
      		int fd;
      		memset(&srx, 0, sizeof(srx));
      		srx.srx_family			= AF_RXRPC;
      		srx.srx_service			= 0;
      		srx.transport_type		= AF_INET;
      		srx.transport_len		= sizeof(srx.transport.sin);
      		srx.transport.sin.sin_family	= AF_INET;
      		srx.transport.sin.sin_port	= htons(0x4e21);
      		fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET6);
      		bind(fd, (struct sockaddr *)&srx, sizeof(srx));
      		sleep(20);
      		return 0;
      	}
      
      It results in the following oops.
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000340
      	#PF: supervisor read access in kernel mode
      	#PF: error_code(0x0000) - not-present page
      	...
      	RIP: 0010:trace_event_raw_event_rxrpc_rx_eproto+0x47/0xac
      	...
      	Call Trace:
      	 <IRQ>
      	 rxrpc_extract_header+0x86/0x171
      	 ? rcu_read_lock_sched_held+0x5d/0x63
      	 ? rxrpc_new_skb+0xd4/0x109
      	 rxrpc_input_packet+0xef/0x14fc
      	 ? rxrpc_input_data+0x986/0x986
      	 udp_queue_rcv_one_skb+0xbf/0x3d0
      	 udp_unicast_rcv_skb.isra.8+0x64/0x71
      	 ip_protocol_deliver_rcu+0xe4/0x1b4
      	 ip_local_deliver+0xf0/0x154
      	 __netif_receive_skb_one_core+0x50/0x6c
      	 netif_receive_skb_internal+0x26b/0x2e9
      	 napi_gro_receive+0xf8/0x1da
      	 rtl8169_poll+0x303/0x4c4
      	 net_rx_action+0x10e/0x333
      	 __do_softirq+0x1a5/0x38f
      	 irq_exit+0x54/0xc4
      	 do_IRQ+0xda/0xf8
      	 common_interrupt+0xf/0xf
      	 </IRQ>
      	 ...
      	 ? cpuidle_enter_state+0x23c/0x34d
      	 cpuidle_enter+0x2a/0x36
      	 do_idle+0x163/0x1ea
      	 cpu_startup_entry+0x1d/0x1f
      	 start_secondary+0x157/0x172
      	 secondary_startup_64+0xa4/0xb0
      
      Fixes: a25e21f0 ("rxrpc, afs: Use debug_ids rather than pointers in traces")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  6. 20 Apr 2019, 1 commit
    • rxrpc: Fix client call connect/disconnect race · 11582064
      David Howells committed
      [ Upstream commit 930c9f9125c85b5134b3e711bc252ecc094708e3 ]
      
      rxrpc_disconnect_client_call() reads the call's connection ID protocol
      value (call->cid) as part of that function's variable declarations.  This
      is bad because it's not inside the locked section and so may race with
      someone granting use of the channel to the call.
      
      This manifests as an assertion failure (see below) where the call in the
      presumed channel (0 because call->cid wasn't set when we read it) doesn't
      match the call attached to the channel we were actually granted (if 1, 2 or
      3).
      
      Fix this by moving the read and dependent calculations inside the
      channel_lock section.  Also, only set the channel number and pointer
      variables if cid is not zero (i.e. unset).
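
      A sketch of the reordered code, with the read now under channel_lock:

      	spin_lock(&conn->channel_lock);

      	cid = call->cid;
      	if (cid) {
      		channel = cid & RXRPC_CHANNELMASK;
      		chan = &conn->channels[channel];
      	}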
      
      This problem can be induced by injecting an occasional error in
      rxrpc_wait_for_channel() before the call to schedule().
      
      Make two further changes also:
      
       (1) Add a trace for wait failure in rxrpc_connect_call().
      
       (2) Drop channel_lock before BUG'ing in the case of the assertion failure.
      
      The failure causes a trace akin to the following:
      
      rxrpc: Assertion failed - 18446612685268945920(0xffff8880beab8c00) == 18446612685268621312(0xffff8880bea69800) is false
      ------------[ cut here ]------------
      kernel BUG at net/rxrpc/conn_client.c:824!
      ...
      RIP: 0010:rxrpc_disconnect_client_call+0x2bf/0x99d
      ...
      Call Trace:
       rxrpc_connect_call+0x902/0x9b3
       ? wake_up_q+0x54/0x54
       rxrpc_new_client_call+0x3a0/0x751
       ? rxrpc_kernel_begin_call+0x141/0x1bc
       ? afs_alloc_call+0x1b5/0x1b5
       rxrpc_kernel_begin_call+0x141/0x1bc
       afs_make_call+0x20c/0x525
       ? afs_alloc_call+0x1b5/0x1b5
       ? __lock_is_held+0x40/0x71
       ? lockdep_init_map+0xaf/0x193
       ? lockdep_init_map+0xaf/0x193
       ? __lock_is_held+0x40/0x71
       ? yfs_fs_fetch_data+0x33b/0x34a
       yfs_fs_fetch_data+0x33b/0x34a
       afs_fetch_data+0xdc/0x3b7
       afs_read_dir+0x52d/0x97f
       afs_dir_iterate+0xa0/0x661
       ? iterate_dir+0x63/0x141
       iterate_dir+0xa2/0x141
       ksys_getdents64+0x9f/0x11b
       ? filldir+0x111/0x111
       ? do_syscall_64+0x3e/0x1a0
       __x64_sys_getdents64+0x16/0x19
       do_syscall_64+0x7d/0x1a0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 45025bce ("rxrpc: Improve management and caching of client connection objects")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  7. 17 Jan 2019, 1 commit
    • sunrpc: use-after-free in svc_process_common() · 44e7bab3
      Vasily Averin committed
      commit d4b09acf924b84bae77cad090a9d108e70b43643 upstream.
      
      If a node has NFSv41+ mounts inside several net namespaces,
      it can lead to a use-after-free in svc_process_common():
      
      svc_process_common()
              /* Setup reply header */
              rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE
      
      svc_process_common() can use an incorrect rqstp->rq_xprt; its caller,
      bc_svc_process(), takes it from serv->sv_bc_xprt.  The problem is that
      serv is a global structure but sv_bc_xprt is assigned per net
      namespace.
      
      According to Trond, the whole "let's set up rqstp->rq_xprt
      for the back channel" is nothing but a giant hack in order
      to work around the fact that svc_process_common() uses it
      to find the xpt_ops, and perform a couple of (meaningless
      for the back channel) tests of xpt_flags.
      
      All we really need in svc_process_common() is to be able to run
      rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr()
      
      Bruce J Fields points out that this xpo_prep_reply_hdr() call
      is an awfully roundabout way just to do "svc_putnl(resv, 0);"
      in the tcp case.
      
      This patch does not initialize rqstp->rq_xprt in bc_svc_process();
      it now calls svc_process_common() with rqstp->rq_xprt = NULL.
      
      To adjust the reply header, svc_process_common() now just checks
      rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() in the tcp case.
      
      To handle the rqstp->rq_xprt = NULL case in functions called from
      svc_process_common(), the patch introduces a net namespace pointer,
      svc_rqst->rq_bc_net, and adjusts the SVC_NET() definition.  Some other
      functions were also adapted to properly handle the described case.
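
      The adjusted SVC_NET() definition falls back to the new back-channel
      net namespace pointer when rq_xprt is NULL (sketched from the
      description above):

      	#define SVC_NET(rqst) (rqst->rq_xprt ? rqst->rq_xprt->xpt_net \
      				       : rqst->rq_bc_net)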
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Cc: stable@vger.kernel.org
      Fixes: 23c20ecd ("NFS: callback up - users counting cleanup")
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      v2: added lost extern svc_tcp_prep_reply_hdr()
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. 10 Jan 2019, 1 commit
    • ext4: force inode writes when nfsd calls commit_metadata() · bf2fd1f9
      Theodore Ts'o committed
      commit fde872682e175743e0c3ef939c89e3c6008a1529 upstream.
      
      Some time back, nfsd switched from calling vfs_fsync() to using a new
      commit_metadata() hook in export_operations().  If the file system did
      not provide a commit_metadata() hook, it fell back to using
      sync_inode_metadata().  Unfortunately this doesn't work on all file
      systems.  In particular, it doesn't work on ext4 due to how the inode
      gets journalled --- the VFS writeback code will not always call
      ext4_write_inode().
      
      So we need to provide our own ext4_nfs_commit_metadata() method which
      calls ext4_write_inode() directly.
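
      The new hook is essentially a direct call into ext4_write_inode() with
      synchronous writeback (a sketch along the lines of the upstream fix):

      	static int ext4_nfs_commit_metadata(struct inode *inode)
      	{
      		struct writeback_control wbc = {
      			.sync_mode = WB_SYNC_ALL
      		};

      		return ext4_write_inode(inode, &wbc);
      	}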
      
      Google-Bug-Id: 121195940
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  9. 08 Dec 2018, 1 commit
  10. 09 Oct 2018, 1 commit
  11. 02 Oct 2018, 1 commit
    • mm, sched/numa: Remove rate-limiting of automatic NUMA balancing migration · efaffc5e
      Mel Gorman committed
      Rate limiting of page migrations due to automatic NUMA balancing was
      introduced to mitigate the worst-case scenario of migrating at high
      frequency due to false sharing or slowly ping-ponging between nodes.
      Since then, a lot of effort was spent on correctly identifying these
      pages and avoiding unnecessary migrations and the safety net may no longer
      be required.
      
      Jirka Hladky reported a regression in 4.17 due to a scheduler patch that
      avoids spreading STREAM tasks wide prematurely. However, once the task
      was properly placed, it delayed migrating the memory due to rate limiting.
      Increasing the limit fixed the problem for him.
      
      Currently, the limit is hard-coded and does not account for the real
      capabilities of the hardware. Even if an estimate was attempted, it would
      not properly account for the number of memory controllers and it could
      not account for the amount of bandwidth used for normal accesses. Rather
      than fudging, this patch simply eliminates the rate limiting.
      
      However, Jirka reports that a STREAM configuration using multiple
      processes achieved similar performance to 4.16. In local tests, this patch
      improved performance of STREAM relative to the baseline but it is somewhat
      machine-dependent. Most workloads show little or no performance
      difference, implying that there is not a heavy reliance on the
      throttling mechanism and it is safe to remove.
      
      STREAM on 2-socket machine
                               4.19.0-rc5             4.19.0-rc5
                               numab-v1r1       noratelimit-v1r1
      MB/sec copy     43298.52 (   0.00%)    44673.38 (   3.18%)
      MB/sec scale    30115.06 (   0.00%)    31293.06 (   3.91%)
      MB/sec add      32825.12 (   0.00%)    34883.62 (   6.27%)
      MB/sec triad    32549.52 (   0.00%)    34906.60 (   7.24%)
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Rik van Riel <riel@surriel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jirka Hladky <jhladky@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linux-MM <linux-mm@kvack.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20181001100525.29789-2-mgorman@techsingularity.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  12. 28 Sep 2018, 1 commit
    • rxrpc: Fix error distribution · f3344303
      David Howells committed
      Fix error distribution by immediately delivering the errors to all the
      affected calls rather than deferring them to a worker thread.  The
      problem with the latter is that retries and other activity can happen
      in the meantime, when we really want to stop that sooner.
      
      To this end:
      
       (1) Stop the error distributor from removing calls from the error_targets
           list so that peer->lock isn't needed to synchronise against other adds
           and removals.
      
       (2) Require the peer's error_targets list to be accessed with RCU,
           thereby avoiding the need to take peer->lock over distribution
           (sketched below).
      
       (3) Don't attempt to affect a call's state if it is already marked complete.
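
      A sketch of the resulting lock-free distribution (names follow the
      rxrpc code; details abridged):

      	rcu_read_lock();
      	hlist_for_each_entry_rcu(call, &peer->error_targets, error_link) {
      		rxrpc_see_call(call);
      		/* skip calls already marked complete */
      		if (call->state < RXRPC_CALL_COMPLETE &&
      		    rxrpc_set_call_completion(call, compl, 0, -error))
      			rxrpc_queue_call(call);
      	}
      	rcu_read_unlock();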
      Signed-off-by: David Howells <dhowells@redhat.com>
  13. 18 Aug 2018, 1 commit
    • bpf: fix redirect to map under tail calls · f6069b9a
      Daniel Borkmann committed
      Commits 109980b8 ("bpf: don't select potentially stale ri->map
      from buggy xdp progs") and 7c300131 ("bpf: fix ri->map_owner
      pointer on bpf_prog_realloc") tried to ensure that buggy programs
      using the bpf_redirect_map() helper call do not leave stale maps
      behind. The idea was to add a map_owner cookie into the per-CPU
      struct redirect_info, set to prog->aux by the prog making the helper
      call, as proof that the map is not stale since the prog is implicitly
      holding a reference to it. This owner cookie could later be compared
      with the program calling into BPF to check whether they match, and if
      so the redirect could proceed with processing the map safely.
      
      In (obvious) hindsight, this approach breaks down when tail calls are
      involved since the original caller's prog->aux pointer does not have
      to match the one from one of the progs out of the tail call chain,
      and therefore the xdp buffer will be dropped instead of redirected.
      A way around that is to fix the issue differently (which also
      allows removing related work in the fast path at the same time): once
      the lifetime of a redirect map has come to its end, we use its map
      free callback, where we need to wait on synchronize_rcu() for current
      outstanding xdp buffers and remove such a map pointer from the
      redirect info if found to be present. At that time no program is
      using this map anymore, so we simply invalidate the map pointers to
      NULL iff they previously pointed to that instance, while making sure
      that the redirect path only reads out the map once.
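
      A sketch of the invalidation run from the map free callback (after
      synchronize_rcu(), no program can still be executing with this map):

      	static void bpf_clear_redirect_map(struct bpf_map *map)
      	{
      		struct redirect_info *ri;
      		int cpu;

      		for_each_possible_cpu(cpu) {
      			ri = per_cpu_ptr(&redirect_info, cpu);
      			if (ri->map == map)
      				ri->map = NULL;
      		}
      	}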
      
      Fixes: 97f91a7c ("bpf: add bpf_redirect_map helper routine")
      Fixes: 109980b8 ("bpf: don't select potentially stale ri->map from buggy xdp progs")
      Reported-by: Sebastiano Miano <sebastiano.miano@polito.it>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  14. 14 Aug 2018, 1 commit
    • net: Change the layout of structure trace_event_raw_fib_table_lookup · 0192e7d4
      Zong Li committed
      There is an unaligned access to the structure
      'trace_event_raw_fib_table_lookup'.
      
      In include/trace/events/fib.h, there is a memory operation which casts
      the 'src' data member to a pointer and then stores a value through
      that pointer:
      
      p32 = (__be32 *) __entry->src;
      *p32 = flp->saddr;
      
      The offset of 'src' in struct trace_event_raw_fib_table_lookup is not
      four-byte aligned. Some architectures don't permit unaligned accesses
      and have to pay the price of handling this situation in an exception
      handler.
      
      Adjust the layout of structure to avoid this case.
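
      The idea, sketched with an abridged field list: keep the 32-bit
      members together so the 4-byte src/dst arrays start on a 4-byte
      boundary, and move the narrower __u8 members after them:

      	TP_STRUCT__entry(
      		__field(u32,	tb_id)
      		__field(int,	err)
      		__field(int,	oif)
      		__field(int,	iif)
      		__array(__u8,	src, 4)	/* offset now 4-byte aligned */
      		__array(__u8,	dst, 4)
      		__field(__u8,	tos)
      		__field(__u8,	scope)
      		__field(__u8,	flags)
      	)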
      
      Fixes: 9f323973 ("net/ipv4: Udate fib_table_lookup tracepoint")
      Signed-off-by: Zong Li <zong@andestech.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 07 Aug 2018, 1 commit
  16. 06 Aug 2018, 2 commits
  17. 01 Aug 2018, 3 commits
  18. 31 Jul 2018, 1 commit
    • tracing: Centralize preemptirq tracepoints and unify their usage · c3bc8fd6
      Joel Fernandes (Google) committed
      This patch detaches the preemptirq tracepoints from the tracers and
      keeps them separate.
      
      Advantages:
      * Lockdep and irqsoff event can now run in parallel since they no longer
      have their own calls.
      
      * This unifies the usecase of adding hooks to an irqsoff and irqson
      event, and a preemptoff and preempton event.
        3 users of the events exist:
        - Lockdep
        - irqsoff and preemptoff tracers
        - irqs and preempt trace events
      
      The unification cleans up several ifdefs and makes the code in the
      preempt tracer and irqsoff tracers simpler. It gets rid of all the
      horrific ifdeferry around PROVE_LOCKING and makes configuration of the
      different users of the tracepoints easier and more understandable. It
      also gets rid of the time_* function calls from the lockdep hooks used
      to call into the preemptirq tracer, which are not needed anymore. The
      negative delta in lines of code in this patch is quite large too.
      
      In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS
      as a single point for registering probes onto the tracepoints. With
      this, the web of config options for preempt/irq toggle tracepoints
      and its users becomes:
      
       PREEMPT_TRACER   PREEMPTIRQ_EVENTS  IRQSOFF_TRACER PROVE_LOCKING
             |                 |     \         |           |
             \    (selects)    /      \        \ (selects) /
            TRACE_PREEMPT_TOGGLE       ----> TRACE_IRQFLAGS
                            \                  /
                             \ (depends on)   /
                           PREEMPTIRQ_TRACEPOINTS
      
      Other than the performance tests mentioned in the previous patch, I also
      ran the locking API test suite. I verified that all tests cases are
      passing.
      
      I also injected issues by not registering lockdep probes onto the
      tracepoints and I see failures to confirm that the probes are indeed
      working.
      
      This series + lockdep probes not registered (just to inject errors):
      [    0.000000]      hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
      [    0.000000]      soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
      [    0.000000]        sirq-safe-A => hirqs-on/12:FAILED|FAILED|  ok  |
      [    0.000000]        sirq-safe-A => hirqs-on/21:FAILED|FAILED|  ok  |
      [    0.000000]          hard-safe-A + irqs-on/12:FAILED|FAILED|  ok  |
      [    0.000000]          soft-safe-A + irqs-on/12:FAILED|FAILED|  ok  |
      [    0.000000]          hard-safe-A + irqs-on/21:FAILED|FAILED|  ok  |
      [    0.000000]          soft-safe-A + irqs-on/21:FAILED|FAILED|  ok  |
      [    0.000000]     hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
      [    0.000000]     soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
      
      With this series + lockdep probes registered, all locking tests pass:
      
      [    0.000000]      hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
      [    0.000000]      soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
      [    0.000000]        sirq-safe-A => hirqs-on/12:  ok  |  ok  |  ok  |
      [    0.000000]        sirq-safe-A => hirqs-on/21:  ok  |  ok  |  ok  |
      [    0.000000]          hard-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
      [    0.000000]          soft-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
      [    0.000000]          hard-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
      [    0.000000]          soft-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
      [    0.000000]     hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
      [    0.000000]     soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
      
      Link: http://lkml.kernel.org/r/20180730222423.196630-4-joel@joelfernandes.org
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  19. 26 Jul 2018, 1 commit
  20. 23 Jul 2018, 1 commit
  21. 13 Jul 2018, 10 commits
  22. 12 Jul 2018, 2 commits
    • fsi: master-gpio: Add more tracepoints · 777fd524
      Benjamin Herrenschmidt committed
      This adds a few more tracepoints that have proven useful when
      debugging issues with the FSI bus.
      
      This also makes echo_delay() use clock_zeros() instead of open-coding
      it, in order to share the tracepoint.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
    • cgroup/tracing: Move taking of spin lock out of trace event handlers · e4f8d81c
      Steven Rostedt (VMware) committed
      It is unwise to take spin locks from the handlers of trace events,
      mainly because they can introduce lockups: they add locks in places
      that are normally not tested. Worse yet, because trace events are
      tucked away in the include/trace/events/ directory, locks that are
      taken there are forgotten about.
      
      As a general rule, I tell people never to take any locks in a trace
      event handler.
      
      Several cgroup trace event handlers call cgroup_path() which eventually
      takes the kernfs_rename_lock spinlock. This injects the spinlock in the
      code without people realizing it. It also can cause issues for the
      PREEMPT_RT patch, as the spinlock becomes a mutex, and the trace event
      handlers are called with preemption disabled.
      
      By moving the calculation of the cgroup_path() out of the trace event
      handlers and into a macro (surrounded by a
      trace_cgroup_##type##_enabled()), we can place the cgroup_path
      into a string and pass that to the trace event. Not only does this
      move the taking of the spinlock out of the trace event handler, but
      it also means that cgroup_path() only needs to be called once (it
      is currently called twice: once to get the length to reserve the
      buffer for, and once again to get the path itself). Now it only needs
      to be done once.
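
      The resulting helper macro looks roughly like this (sketched from the
      description; lock and buffer names are per the upstream patch):

      	#define TRACE_CGROUP_PATH(type, cgrp, ...)			\
      		do {							\
      			if (trace_cgroup_##type##_enabled()) {		\
      				spin_lock(&trace_cgroup_path_lock);	\
      				cgroup_path(cgrp, trace_cgroup_path,	\
      					    TRACE_CGROUP_PATH_LEN);	\
      				trace_cgroup_##type(cgrp,		\
      						    trace_cgroup_path,	\
      						    ##__VA_ARGS__);	\
      				spin_unlock(&trace_cgroup_path_lock);	\
      			}						\
      		} while (0)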
      Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  23. 04 Jul 2018, 1 commit
  24. 02 Jul 2018, 1 commit
  25. 20 Jun 2018, 1 commit
    • clk: add duty cycle support · 9fba738a
      Jerome Brunet committed
      Add the possibility to apply and query the clock signal duty cycle ratio.
      
      This is useful when the duty cycle of the clock signal depends on some
      other parameters controlled by the clock framework.
      
      For example, the duty cycle of a divider may depend on the raw divider
      setting (ratio = N / div), which is controlled by the CCF. In such a
      case, going through the pwm framework to control the duty cycle ratio
      of this clock would be a burden.
      
      A clock provider is not required to implement the operation to set and get
      the duty cycle. If it does not implement .get_duty_cycle(), the ratio is
      assumed to be 50%.
      
      This change also adds a new flag, CLK_DUTY_CYCLE_PARENT. This flag should
      be used to indicate that a clock, such as gates and muxes, may inherit
      the duty cycle ratio of its parent clock. If a clock does not provide a
      get_duty_cycle() callback and has CLK_DUTY_CYCLE_PARENT, then the call
      will be directly forwarded to its parent clock, if any. For
      set_duty_cycle(), the clock should also have CLK_SET_RATE_PARENT for the
      call to be forwarded.
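
      A sketch of the consumer-facing additions (per the upstream patch; the
      ratio is expressed as a numerator/denominator pair):

      	struct clk_duty {
      		unsigned int num;	/* duty cycle numerator */
      		unsigned int den;	/* duty cycle denominator */
      	};

      	int clk_set_duty_cycle(struct clk *clk, unsigned int num,
      			       unsigned int den);
      	/* ratio scaled to 'scale', e.g. scale = 100 gives a percentage */
      	int clk_get_scaled_duty_cycle(struct clk *clk, unsigned int scale);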
      Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: Michael Turquette <mturquette@baylibre.com>
      Link: lkml.kernel.org/r/20180619144141.8506-1-jbrunet@baylibre.com