1. 24 Aug, 2019 1 commit
  2. 06 Aug, 2019 1 commit
  3. 10 Jul, 2019 1 commit
    • C
      Rework image & texture management to use concurrent message queues. (#9486) · ad582b50
      Committed by Chinmay Garde
      This patch reworks image decompression and collection in the following ways
      because of misbehavior in the described edge cases.
      
      The current flow for realizing a texture on the GPU from a blob of compressed
      bytes is to first pass it to the IO thread for image decompression and then
      upload to the GPU. The handle to the texture on the GPU is then passed back to
      the UI thread so that it can be included in subsequent layer trees for
      rendering. The GPU contexts on the Render & IO threads are in the same
      sharegroup so the texture ends up being visible to the Render Thread context
      during rendering. This works fine and does not block the UI thread. All
      references to the image are owned on UI thread by Dart objects. When the final
      reference to the image is dropped, the texture cannot be collected on the UI
      thread (because it has no GPU context). Instead, it must be passed to either
      the GPU or IO threads. The GPU thread is usually in the middle of a frame
      workload so we redirect the same to the IO thread for eventual collection. While
      texture collections are usually (comparatively) fast, texture decompression and
      upload are slow (order of magnitude of frame intervals).
      
      For applications that end up creating (but not necessarily using) numerous large
      textures in straight-line execution, it could be the case that texture
      collection tasks are pending on the IO task runner after all the image
      decompressions (and upload) are done. Put simply, the collection of the first
      image could be waiting for the decompression and upload of the last image in the
      queue.
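
The ordering problem described above can be illustrated with a small model of a single-threaded FIFO task runner. This is only a sketch: `Task`, `CompletionTimes` and `StraightLineScenario` are hypothetical names, and the millisecond costs are made-up inputs, not measurements from the engine.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one task on a single-threaded FIFO task runner.
struct Task {
  std::string name;
  int cost_ms;  // Assumed cost, purely illustrative.
};

// Returns the time (in ms) at which each task finishes when all tasks run
// strictly in posting order on one thread.
std::vector<int> CompletionTimes(const std::vector<Task>& queue) {
  std::vector<int> done;
  done.reserve(queue.size());
  int now = 0;
  for (const Task& task : queue) {
    assert(task.cost_ms >= 0);
    now += task.cost_ms;
    done.push_back(now);
  }
  return done;
}

// Builds the straight-line scenario: every decode+upload task is posted
// before the first collection task reaches the queue.
std::vector<Task> StraightLineScenario(int image_count,
                                       int decode_ms,
                                       int collect_ms) {
  std::vector<Task> queue;
  for (int i = 0; i < image_count; ++i) {
    queue.push_back({"decode+upload", decode_ms});
  }
  for (int i = 0; i < image_count; ++i) {
    queue.push_back({"collect", collect_ms});
  }
  return queue;
}
```

In this model the first collection cannot finish before `image_count * decode_ms` has elapsed, even though the collection itself is cheap.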
      
      This is exacerbated by two other hacks added to work around unrelated issues.
      * First, creating a codec with a single image frame immediately kicks off
        decompression and upload of that frame image (even if the frame was never
        requested from the codec). This hack was added because we wanted to get rid of
        the compressed image allocation ASAP. The expectation was codecs would only be
        created with the sole purpose of getting the decompressed image bytes.
        However, for applications that only create codecs to get image sizes (but
        never actually decompress the same), we would end up replacing the compressed
        image allocation with a larger allocation (device resident no less) for no
        obvious use. This issue is particularly insidious when you consider that the
        codec is usually asked for the native image size first before the frame is
        requested at a smaller size (usually using a new codec with same data but new
        target size). This would cause the creation of a whole extra texture (at 1:1)
        when the caller was trying to “optimize” for memory use by requesting a
        texture of a smaller size.
      * Second, all image collections were delayed by the unref queue by 250ms
        because of observations that the calling thread (the UI thread) was being
        descheduled unnecessarily when a task with a timeout of zero was posted from
        the same (recall that a task has to be posted to the IO thread for the
        collection of that texture). 250ms is multiple frame intervals worth of
        potentially unnecessary textures.
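
The effect of the drain delay can be sketched with a minimal delayed-release queue. `DelayedReleaseQueue` is a hypothetical stand-in, not the engine's unref queue; it takes the current time as a parameter so the behavior can be followed without real threads or timers.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Millis = std::chrono::milliseconds;

// Hypothetical delayed-release queue: releases are batched and only run once
// the drain deadline has passed.
class DelayedReleaseQueue {
 public:
  explicit DelayedReleaseQueue(Millis drain_delay) : drain_delay_(drain_delay) {
    assert(drain_delay.count() >= 0);
  }

  // Queues a release callback at time `now`. The first item queued after a
  // drain arms the deadline; later items join the same batch.
  void Release(std::function<void()> releaser, Millis now) {
    if (pending_.empty()) {
      deadline_ = now + drain_delay_;
    }
    pending_.push_back(std::move(releaser));
  }

  // Runs all pending releasers if the deadline has passed and returns how
  // many were run.
  std::size_t Tick(Millis now) {
    if (pending_.empty() || now < deadline_) {
      return 0;
    }
    std::vector<std::function<void()>> batch;
    batch.swap(pending_);
    for (auto& releaser : batch) {
      releaser();
    }
    return batch.size();
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  Millis drain_delay_;
  Millis deadline_{0};
  std::vector<std::function<void()>> pending_;
};
```

With a 250ms delay, anything released during that window stays alive until the drain, which at common frame rates spans several frames; a shorter delay shrinks that window.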
      
      The net result of these issues is that we may end up creating textures when all
      that the application needs is to ask its codec for details about the same (but
      not necessarily access its bytes). Texture collection could also be delayed
      behind other jobs to decompress the textures on the IO thread. Also, all texture
      collections are delayed for an arbitrary amount of time.
      
      These issues cause applications to be susceptible to OOM situations. These
      situations manifest in various ways. Host memory exhaustion causes the usual OOM
      issues. Device memory exhaustion seems to manifest in different ways on iOS and
      Android. On Android, allocation of a new texture seems to be causing an
      assertion (in the driver). On iOS, the call hangs (presumably waiting for
      another thread to release textures which we won’t do because those tasks are
      blocked behind the current task completing).
      
      To address peak memory usage, the following changes have been made:
      * Image decompression and upload/collection no longer happen on the same thread.
        All image decompression will now be handled on a workqueue. The number of
        worker threads in this workqueue is equal to the number of processors on the
        device. These threads have a lower priority than either the UI or Render
        threads. These workers are shared between all Flutter applications in the
        process.
      * Both the images and their codec now report the correct allocation size to Dart
        for GC purposes. The Dart VM uses this to pick objects for collection. Earlier
        the image allocation was assumed to be 32bpp with no mipmapping overhead
        reported. Now, the correct image size is reported and the mipmapping overhead
        is accounted for. Image codec sizes were not reported to the VM earlier and
        now are. Expect “External” VM allocations to be higher than previously
        reported and the numbers in Observatory to line up more closely with actual
        memory usage (device and host).
      * Decoding images to a specific size used to decode to 1:1 before performing a
        resize to the correct dimensions before texture upload. This has now been
        reworked so that images are first decompressed to a smaller size supported
        natively by the codec before final resizing to the requested target size. The
        intermediate copy is now smaller and more promptly collected. Resizing also
        happens on the workqueue worker.
      * The drain interval of the unref queue is now sub-frame-interval. I am hesitant
        to remove the delay entirely because I have not been able to instrument the
        performance overhead of the same. That is next on my list. But now, multiple
        frame intervals worth of textures no longer stick around.
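
A workqueue with one worker per processor, as described in the first item above, might look roughly like the following. This is a minimal sketch using only the C++ standard library; `WorkQueue` and `RunIncrements` are hypothetical names, the engine's concurrent message loop is a separate implementation, and lowering thread priority is platform specific and omitted here.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical workqueue: a fixed set of workers pulling from one queue.
class WorkQueue {
 public:
  // Defaults to one worker per hardware thread (at least one).
  explicit WorkQueue(std::size_t worker_count = std::max<std::size_t>(
                         1, std::thread::hardware_concurrency())) {
    assert(worker_count > 0);
    for (std::size_t i = 0; i < worker_count; ++i) {
      workers_.emplace_back([this] { WorkerMain(); });
    }
  }

  // Drains the remaining tasks, then joins every worker.
  ~WorkQueue() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      shutting_down_ = true;
    }
    cv_.notify_all();
    for (std::thread& worker : workers_) {
      worker.join();
    }
  }

  void Post(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

  std::size_t worker_count() const { return workers_.size(); }

 private:
  void WorkerMain() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return shutting_down_ || !tasks_.empty(); });
        if (tasks_.empty()) {
          return;  // Shutting down and nothing left to run.
        }
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // Run outside the lock so other workers can proceed.
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool shutting_down_ = false;
  std::vector<std::thread> workers_;  // Declared last: started after the rest.
};

// Usage sketch: posts `task_count` increments and returns the total once the
// queue has been drained and destroyed.
int RunIncrements(int task_count, std::size_t workers) {
  std::atomic<int> total{0};
  {
    WorkQueue queue(workers);
    for (int i = 0; i < task_count; ++i) {
      queue.Post([&total] { total.fetch_add(1); });
    }
  }
  return total.load();
}
```

Because decode work runs on these workers, a cheap collection task posted to a different runner no longer waits behind a long decode.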
      
      The following issues have been addressed:
      * https://github.com/flutter/flutter/issues/34070 Since this was the first usage
        of the concurrent message loops, the number of idle wakes was determined to
        be too high and this component has been rewritten to be simpler and not use
        the existing task runner and MessageLoopImpl interface.
      * Image decoding had no tests. A new `ui_unittests` harness has been added
        that sets up a GPU test harness on the host using SwiftShader. Tests have been
        added for image decompression, upload and resizing.
      * The device memory exhaustion in this benchmark has been addressed. That
        benchmark is still not viable for inclusion in any harness however because it
        creates 9 million codecs in straight-line execution. Because these codecs are
        destroyed in the microtask callbacks, these are referenced till those
        callbacks are executed. So now, instead of device memory exhaustion, this will
        lead to (slower) exhaustion of host memory. This is expected and working as
        intended.
      
      This patch only addresses peak memory use and makes collection of unused images
      and textures more prompt. It does NOT address memory use by images referenced
      strongly by the application or framework.
  4. 21 Apr, 2019 1 commit
  5. 30 Mar, 2019 1 commit
  6. 08 Nov, 2018 1 commit
  7. 27 Jul, 2018 1 commit
  8. 14 Apr, 2018 1 commit
  9. 13 Apr, 2018 1 commit
  10. 12 Apr, 2018 1 commit
  11. 11 Apr, 2018 2 commits
    • C
      Revert "Support multiple shells in a single process. (#4932)" (#4964) · 9199b40f
      Committed by Chinmay Garde
      This reverts commit 6baff4c8.
    • C
      Support multiple shells in a single process. (#4932) · 6baff4c8
      Committed by Chinmay Garde
      * Support multiple shells in a single process.
      
      The Flutter Engine currently works by initializing a singleton shell
      instance. This shell has to be created on the platform thread. The shell
      is responsible for creating the 3 main threads used by Flutter (UI, IO,
      GPU) as well as initializing the Dart VM. The shell, references to task
      runners of the main threads as well as all snapshots used for VM
      initialization are stored in singleton objects. The Flutter shell only
      creates the threads, rasterizers, contexts, etc. to fully support a
      single Flutter application. Current support for multiple Flutter
      applications is achieved by making multiple applications share the same
      resources (via the platform views mechanism).
      
      This scheme has the following limitations:
      
      * The shell is a singleton and there is no way to tear it down. Once you
        run a Flutter application in a process, all resources managed by it
        will remain referenced till process termination.
      * The threads on which the shell performs its operations are all
        singletons. These threads are never torn down and multiple Flutter
        applications (if present) have to compete with one another on these
        threads.
      * Resources referenced by the Dart VM are leaked because the VM isn't
        shutdown even when there are no more Flutter views.
      * The shell as a target does not compile on Fuchsia. The Fuchsia content
        handler uses specific dependencies of the shell to rebuild all the
        shell dependencies on its own. This leads to differences in frame
        scheduling, VM setup, service protocol endpoint setup, tracing, etc.
        Fuchsia is very much a second class citizen in this world.
      * Since threads and message loops are managed by the engine, the engine
        has to know about threading and platform message loop interop on each
        supported platform.
      
      Specific updates in this patch:
      
      * The shell is no longer a singleton and the embedder holds the unique
        reference to the shell.
      * Shell setup and teardown is deterministic.
      * Threads are no longer managed by the shell. Instead, the shell is
        given a task runner configuration by the embedder.
      * Since the shell does not own its threads, the embedder can control
        threads and the message loops operating on these threads. The shell is
        only given references to the task runners that execute tasks on these
        threads.
      * The shell only needs task runner references. These references can be
        to the same task runner. So, if the embedder thinks that a particular
        Flutter application would not need all the threads, it can pass
        references to the same task runner. This effectively makes the Flutter
        application run in single-threaded mode. There are some places in the
        shell that make synchronous calls; these sites have been updated to
        ensure that they don’t deadlock.
      * The test runner and the headless Dart code runner are now Flutter
        applications that are effectively single threaded (since they don’t
        have the rendering concerns of a big-boy Flutter application).
      * The embedder has to guarantee that the threads outlive the shell.
        It is easy for the embedder to make that guarantee because shell
        termination is deterministic.
      * The embedder can create as many shells as it wants. Typically it
        creates a shell per Flutter application with its own task runner
        configuration. Most embedders obtain these task runners from threads
        dedicated to the shell. But, it is entirely possible that the embedder
        can obtain these task runners from a thread pool.
      * There can only be one Dart VM in the process. The numerous shells
        interact with one another to manage the VM lifecycle. Once the last
        shell goes away, the VM does as well and hence all resources
        associated with the VM are collected.
      * The shell as a target can now compile and run on Fuchsia. The current
        content handler has been removed from the Flutter engine source tree
        and a new implementation has been written that uses the new shell
        target.
      * Isolate management has been significantly overhauled. There are no
        owning references to Dart isolates within the shell. The VM owns the
        only strong reference to the Dart isolate. The isolate that has window
        bindings is now called the root isolate. Child isolates can now be
        created from the root isolate and their bindings and thread
        configurations are now inherited from the root isolate.
      * Terminating the shell terminates its root isolate as well as all the
        isolates spawned by this isolate. This is necessary because shell shutdown
        is deterministic and the embedder is free to collect the threads on
        which the isolates execute their tasks (and listen for microtask
        flushes on).
      * Launching the root isolate is now significantly overhauled. The shell
        side (non-owning) reference to an isolate is now a little state
        machine and illegal state transitions should be impossible (barring
        construction issues). This is the only way to manage Dart isolates in
        the shell (the shell does not use the C API in dart_api.h anymore).
      * Once an isolate is launched, it must be prepared (and hence move to
        the ready phase) by associating a snapshot with the same. This
        snapshot can either be a precompiled snapshot, kernel snapshot, script
        snapshot or source file. Depending on the kind of data specified as a
        snapshot as well as the capabilities of the VM running in the process,
        isolate preparation can fail with the right message.
      * Asset management has been significantly overhauled. All asset
        resolution goes through an abstract asset resolver interface. An asset
        manager implements this interface and manages one or more child asset
        resolvers. These asset resolvers typically resolve assets from
        directories, ZIP files (legacy FLX assets if provided), APK bundles,
        FDIO namespaces, etc.
      * Each launch of the shell requires a separate and fully configured
        asset resolver. This is necessary because launching isolates for the
        engine may require resolving snapshots as assets from the asset
        resolver. Asset resolvers can be shared by multiple launch instances
        in multiple shells and need to be thread safe.
      * References to the command line object have been removed from the
        shell. Instead, the shell only takes a settings object that may be
        configured from the command line. This makes it easy for embedders and
        platforms that don’t have a command line (Fuchsia) to configure the
        shell. Consequently, there is only one spot where the various switches
        are read from the command line (by the embedder and not the shell) to
        form the settings object.
      * All platforms now respect the log tag (this was done only by Android
        till now) and each shell instance has its own log tag. This makes
        logs from multiple Flutter applications in the same process (mainly
        Fuchsia) more easily decipherable.
      * The per shell IO task runner now has a new component that is
        unfortunately named the IOManager. This component manages the IO
        GrContext (used for asynchronous texture uploads) that cooperates with
        the GrContext on the GPU task runner associated with the shell. The
        IOManager is also responsible for flushing tasks that collect Skia
        objects that reference GPU resources during deterministic shell
        shutdown.
      * The embedder now has to be careful to only enable Blink on a single
        instance of the shell. Launching the legacy text layout and rendering
        engine multiple times will trip assertions. The entirety of this
        runtime has been separated out into a separate object and can be
        removed in one go when the migration to libtxt is complete.
      * There is a new test target for the various C++ objects that the shell
        uses to interact with the Dart VM (the shell no longer uses the C API
        in dart_api.h). This allows engine developers to test VM/Isolate
        initialization and teardown without having to set up a full shell
        instance.
      * There is a new test target for testing a single shell instance
        without having to configure and launch an entire VM and associated
        root isolate.
      * Mac, Linux & Windows used to have different targets that created the
        flutter_tester referenced by the tool. This has now been converted
        into a single target that compiles on all platforms.
      * WeakPointers vended by the fml::WeakPtrFactory (notice the difference
        between the same class in the fxl namespace) add threading checks on
        each use. This is enabled by getting rid of the “re-origination”
        feature of the WeakPtrFactory in the fxl namespace. The side effect of
        this is that all non-thread safe components have to be created, used
        and destroyed on the same thread. Numerous thread safety issues were
        caught by this extra assertion and have now been fixed.
        * Glossary of components that are only safe on a specific thread (and
          have the fml variants of the WeakPtrFactory):
          * Platform Thread: Shell
          * UI Thread: Engine, RuntimeDelegate, DartIsolate, Animator
          * GPU Thread: Rasterizer, Surface
          * IO Thread: IOManager
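
The task runner configuration from the list above, including the case where every role shares one runner, can be sketched as follows. All names here (`QueueTaskRunner`, `TaskRunners`, `MakeSingleThreadedConfig`, `MakeDedicatedConfig`) are hypothetical and chosen for illustration; they are not claimed to match the engine's classes or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical task runner that queues tasks until the embedder's own loop
// decides to run them.
class QueueTaskRunner {
 public:
  void PostTask(std::function<void()> task) {
    tasks_.push_back(std::move(task));
  }

  // Runs tasks in posting order, including tasks posted while draining, and
  // returns how many were run.
  std::size_t RunPending() {
    std::size_t ran = 0;
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
      ++ran;
    }
    return ran;
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Hypothetical configuration handed to a shell by its embedder. The shell
// only holds references; the embedder owns the underlying threads.
struct TaskRunners {
  std::shared_ptr<QueueTaskRunner> platform;
  std::shared_ptr<QueueTaskRunner> ui;
  std::shared_ptr<QueueTaskRunner> gpu;
  std::shared_ptr<QueueTaskRunner> io;

  // True when every role refers to the same runner.
  bool IsSingleThreaded() const {
    assert(platform && ui && gpu && io);
    return platform == ui && ui == gpu && gpu == io;
  }
};

// One runner shared by every role: the application effectively runs in
// single-threaded mode.
TaskRunners MakeSingleThreadedConfig() {
  auto runner = std::make_shared<QueueTaskRunner>();
  return TaskRunners{runner, runner, runner, runner};
}

// A separate runner per role, as an embedder with dedicated threads might
// provide.
TaskRunners MakeDedicatedConfig() {
  return TaskRunners{std::make_shared<QueueTaskRunner>(),
                     std::make_shared<QueueTaskRunner>(),
                     std::make_shared<QueueTaskRunner>(),
                     std::make_shared<QueueTaskRunner>()};
}
```

In the shared configuration, tasks posted through any role land on one queue and run in posting order, so a synchronous wait on another role from inside a task would never be satisfied. That is the kind of deadlock the updated call sites need to avoid.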
      
      This patch was reviewed in smaller chunks in the following pull
      requests. All comments from the pull requests have been incorporated
      into this patch:
      
      * flutter/assets: https://github.com/flutter/engine/pull/4829
      * flutter/common: https://github.com/flutter/engine/pull/4830
      * flutter/content_handler: https://github.com/flutter/engine/pull/4831
      * flutter/flow: https://github.com/flutter/engine/pull/4832
      * flutter/fml: https://github.com/flutter/engine/pull/4833
      * flutter/lib/snapshot: https://github.com/flutter/engine/pull/4834
      * flutter/lib/ui: https://github.com/flutter/engine/pull/4835
      * flutter/runtime: https://github.com/flutter/engine/pull/4836
      * flutter/shell: https://github.com/flutter/engine/pull/4837
      * flutter/synchronization: https://github.com/flutter/engine/pull/4838
      * flutter/testing: https://github.com/flutter/engine/pull/4839