1. 14 Apr, 2016 3 commits
  2. 13 Apr, 2016 17 commits
  3. 12 Apr, 2016 17 commits
  4. 11 Apr, 2016 3 commits
    •
      Clarifies some documentation in server_lib.py · 6a4b2502
      A. Unique TensorFlower committed
      Change: 119533248
    •
      tensorflow: support usage of eigen thread pool · 017498bc
      A. Unique TensorFlower committed
      Use the Eigen ThreadPool instead of the TensorFlow one if TENSORFLOW_USE_EIGEN_THREADPOOL is defined. This will allow switching to the new non-blocking ThreadPool.
      Change: 119512280
    •
      Fix RNN performance bug. + Additions to rnn benchmarks & benchmarks.py. · eb161ecd
      Eugene Brevdo committed
      The RNN performance bug:
      * When passing sequence_length to rnn(), calculations were being performed past
        max_sequence_length.
      
      This bug had one major side effect:
      * It slowed down the calculation past max_sequence_length (it *should*
      return zeros for outputs and copy state through)
      
      The calculations themselves were still correct: the state was still
      copied through and the output was still all zeros.  But that calculation
      was performed via a vector-conditional select() instead of a single
      scalar cond().  As a result, a lot of extra copying was happening in both
      the forward pass and backprop.
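
The difference between the two strategies can be sketched in plain Python. This is a toy model, not the TensorFlow implementation: `step_with_select`, `step_with_cond`, `run`, and the additive toy cell are hypothetical names introduced only for illustration. It assumes time-major inputs and a per-row `seq_lens` list, and counts how often the cell is evaluated.

```python
def step_with_select(cell, x_t, state, t, seq_lens):
    """Always run the cell, then choose per batch row (vector-style select)."""
    new_out, new_state = cell(x_t, state)
    # Rows that are already finished get a zero output and keep their old state.
    out = [o if t < n else 0.0 for o, n in zip(new_out, seq_lens)]
    kept = [s_new if t < n else s_old
            for s_new, s_old, n in zip(new_state, state, seq_lens)]
    return out, kept


def step_with_cond(cell, x_t, state, t, seq_lens):
    """Skip the cell entirely once every row is finished (single scalar test)."""
    if t >= max(seq_lens):
        # No cell evaluation and no per-row work: zeros out, state copied through.
        return [0.0] * len(state), state
    return step_with_select(cell, x_t, state, t, seq_lens)


def run(step_fn, inputs, seq_lens):
    """Unroll step_fn over time-major inputs; return outputs, state, cell calls."""
    calls = 0

    def cell(x_t, state):
        # Hypothetical toy cell: the state accumulates the input, output = state.
        nonlocal calls
        calls += 1
        new_state = [s + x for s, x in zip(state, x_t)]
        return list(new_state), new_state

    state = [0.0] * len(seq_lens)
    outputs = []
    for t, x_t in enumerate(inputs):
        out, state = step_fn(cell, x_t, state, t, seq_lens)
        outputs.append(out)
    return outputs, state, calls


inputs = [[1.0, 1.0] for _ in range(6)]   # 6 time steps, batch of 2
seq_lens = [2, 3]                         # longest sequence is 3 steps
select_result = run(step_with_select, inputs, seq_lens)
cond_result = run(step_with_cond, inputs, seq_lens)
```

Both variants return the same outputs and final state; only the number of cell evaluations differs, which mirrors the description above that the results were correct but more expensive than necessary.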
      
      Thanks to Nat Roth (natusroth@gmail) for unearthing this bug.
      
      **************
      Also:
      - updates to benchmarks.py (allow more specific benchmarks, added
        support for --benchmarks=all).
      - cleaned up RNN benchmarks code a bit.
      
      New and updated benchmarks:
      
      Calculation: Static Unroll with Halved Sequence Length vs. Half Static Unroll
      batch    full_t          units   gpu     dt(half_seq_len)        dt(unroll_half)         dt(half_seq_len)/dt(unroll_half)
      128      50              256     False   0.164351                0.155019                1.060204
      128      50              256     True    0.033295                0.028203                1.180550
      
      Calculation: Static Unroll with Dynamic Flow LSTM vs. Dynamic Unroll LSTM
      batch    max_t   units   gpu     dt(static)      dt(dynamic)     dt(dynamic)/dt(static)
      256      50      512     False   1.759111        1.692570        0.962173
      256      50      512     True    0.178953        0.190454        1.064269
      256      50      256     False   0.533132        0.567228        1.063955
      256      50      256     True    0.078298        0.085024        1.085905
      256      50      128     False   0.220362        0.215350        0.977255
      256      50      128     True    0.053379        0.059129        1.107723
      Change: 119495675