1. 04 Mar 2021, 3 commits
    • Fix comment (#31424) · c40b98e0
      Huihuang Zheng authored
      Fix wrong code comment
    • [Dy2stat] Fix Read-Only Attribute as while_loop Output (#31415) · 6bf02a12
      Huihuang Zheng authored
      Fix Read-Only Attribute as while_loop Output:
      
      Usually, our generated convert_while_loop call looks like:
      ```
          [a, b, c] = paddle.jit.dy2static.convert_while_loop(
                  condition_name, body_name, [a, b, c])
      ```
      where a, b, c are in loop_var_names.
      
      However, if loop_var_names contains a property such as foo.x, we cannot
      assign the attribute as an output of convert_while_loop, because a Python
      property is a kind of read-only attribute. To handle this case, we replace
      the attributes that are outputs of convert_while_loop with generated
      variables; then, if we know at runtime that the attribute is not read-only,
      we assign the attribute. The generated statements look like:
      ```
          [a, b, __attribute_variable_1] = paddle.jit.dy2static.convert_while_loop(
                  condition_name, body_name, [a, b, foo.x])
          if not isinstance(getattr(type(foo), 'x', None), property): foo.x = __attribute_variable_1
      ```
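      As an illustration of why the property check is needed, here is a minimal standalone sketch (the `Foo` class and its members are hypothetical, not code from this PR):
      ```
      class Foo:
          def __init__(self):
              self._x = 0

          @property
          def x(self):  # read-only: no setter is defined
              return self._x

      foo = Foo()
      __attribute_variable_1 = 42

      # Assigning foo.x directly would raise AttributeError because Foo.x is a
      # property without a setter, so the generated guard skips the assignment.
      if not isinstance(getattr(type(foo), 'x', None), property):
          foo.x = __attribute_variable_1
      ```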
    • Added LSTM BF16 and fixed GRU BF16 (#31234) · 5b4f8aac
      jakpiase authored
  2. 03 Mar 2021, 5 commits
  3. 02 Mar 2021, 2 commits
    • add n-d input support for trt scale converter (#31316) · 2e9e3fad
      Pei Yang authored
      * add n-d input support for trt scale converter
      
      * add flatten for ut
      
      * fix dims
    • lamb_op_xpu;test=kunlun (#31012) · d79fdc3d
      Gradie authored
      * lamb_op_xpu;test=kunlun
      
      * modify lamb_op_xpu.cc;test=kunlun
      
      * delete atol lamb_op_xpu; test=kunlun
      
      * update xpu.cmake;test=kunlun
      
      * test_error 1e-5,lamb_op_xpu;test=kunlun
      
      * error1e-5,lamb_op_xpu,test=kunlun
      
      * delete atol lamb_xpu;test=kunlun
      
      * modify atol,lamb_op_xpy;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu, XPUOptest;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu,modify xpu_cmake; test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu,modify xpucmake;test=kunlun
  4. 28 Feb 2021, 2 commits
  5. 27 Feb 2021, 1 commit
  6. 26 Feb 2021, 7 commits
  7. 25 Feb 2021, 4 commits
  8. 24 Feb 2021, 10 commits
  9. 23 Feb 2021, 4 commits
  10. 22 Feb 2021, 2 commits
    • support save multi sparse table in one path (#31108) · 565354f6
      Thunderbrook authored
      * save multi table one path
      
      * format
    • [Dy2stat] Refactoring tensor_shape_transformer.py to Fix Change after Assign Bug (#31082) · cf43a321
      Huihuang Zheng authored
      **Problem**
      In our old shape transformer logic, if the user writes:
      ```
      s = tensor.shape
      ...
      y = paddle.some_api(s)
      ```
      Dy2stat will change it to
      ```
      ...
      y = paddle.some_api(convert_var_shape(tensor))
      ```
      However, it causes a fatal bug if the user changes the shape of `tensor` after the assignment. For example:
      ```
      s = tensor.shape
      ...
      tensor = paddle.some_change_shape_api(tensor)
      ...
      y = paddle.some_api(s)
      ```
      Then Dy2stat gets a wrong result because the code is translated into:
      ```
      tensor = paddle.some_change_shape_api(tensor)
      ...
      y = paddle.some_api(convert_var_shape(tensor))  # tensor's shape has changed, so this is not the original `s` value
      ```
      
      **Solution Logic**
      
      It cannot be solved in the old logic, so I refactored the tensor_shape_transformer logic. Now we use `s` to store the shape attribute and generate a variable `s__STATIC_CONVERT_VAR_SHAPE_SUFFIX` to store the result of the static shape API `shape(tensor)`:
      ```
      s = tensor.shape
      ...
      y = paddle.some_api(s)
      ```
      Dy2stat will change it to
      ```
      s = tensor.shape
      s__STATIC_CONVERT_VAR_SHAPE_SUFFIX = shape(tensor)
      ...
      y = paddle.some_api(choose_shape_attr_or_api(s, s__STATIC_CONVERT_VAR_SHAPE_SUFFIX))
      ```
      In this case, the code is consistent with the original dygraph meaning, and it fixes the change-after-assign bug.
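      To make the chooser concrete, below is a minimal sketch of what a helper like `choose_shape_attr_or_api` could do; the selection logic here is an assumption for illustration, not the actual Paddle implementation:
      ```
      def choose_shape_attr_or_api(attr_shape, api_shape):
          # Sketch: if the Python-side shape attribute is fully known
          # (a list/tuple of non-negative ints), use it directly; otherwise
          # fall back to the stored result of the static shape API.
          if isinstance(attr_shape, (list, tuple)) and all(
                  isinstance(d, int) and d >= 0 for d in attr_shape):
              return attr_shape
          return api_shape
      ```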
      
      **Key Code Note**
      
      To help reviewers: the key change of this PR is changing `self.name_to_var_shape` from "mapping a name to its shape node" to "mapping a name to its STATIC_CONVERT_VAR_SHAPE_SUFFIX name". Then, if a variable name has the SUFFIX, we can choose between the attribute shape and the shape API. The other changes follow from this key change.
      
      **Consideration**
      The concern with this PR is that we store an extra static `shape` API result; will it harm the speed of Dy2stat? In some cases it will, but we argue that the benefit is greater than the cost.
      
      1. The extra calls to the static `shape` API happen when the coder assigns among shape variables. Take the following dygraph code as an instance:
      ```
      s1 = tensor.shape
      s2 = s1
      s3 = s2
      ...
      ```
      Here we call the extra static `shape` API again and again; however, users seldom write code like this.
      
      2. If the shape variable is used a lot, for example:
      ```
      s = tensor.shape
      y1 = paddle.some_api1(s)
      y2 = paddle.some_api2(s)
      y3 = paddle.some_api3(s)
      ```
      Our old logic creates 3 shape API calls, while the new logic creates just 1. This is the more common user code pattern. In fact, if reviewers take a look at the current unit tests in this PR, you can see that the op numbers decrease after this PR. So we argue that this PR can also improve speed in this code pattern.