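# Natural Language Processing Application

## Overview

Sentiment classification is a subset of text classification in natural language processing (NLP) and one of the most basic NLP applications. It is the process of analyzing and reasoning about subjective text that carries emotional coloring, that is, determining whether the speaker's attitude is positive or negative.

> Sentiment is usually divided into three classes: positive, negative, and neutral. Although neutral ("expressionless") reviews are not rare, in most cases only positive and negative examples are used for training. The dataset used below is a good example.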

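The typical reference dataset for traditional text topic classification is [20 Newsgroups](http://qwone.com/~jason/20Newsgroups/), which consists of 20 newsgroups and contains about 20,000 news documents.
Some of its topic categories are quite similar. For example, comp.sys.ibm.pc.hardware and comp.sys.mac.hardware both cover computer system hardware. Other categories are essentially unrelated, such as misc.forsale and soc.religion.christian.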

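As far as the network itself is concerned, the network structure for text topic classification is broadly similar to that for sentiment classification. Once you know how to build a sentiment classification network, you can easily build a similar one and, with some hyperparameter tuning, use it for text topic classification.

From a business perspective, however, the two tasks differ: text topic classification analyzes the objective content that a text discusses, whereas sentiment classification extracts whether the text supports a certain point of view. Take the sentence "Forrest Gump is wonderful. The theme is clear and the pacing is smooth." Topic classification would assign it to the topic "movie", while sentiment classification would determine whether the review's attitude is positive or negative.

Compared with traditional text topic classification, sentiment classification is relatively simple and highly practical. High-quality datasets can be collected from common shopping and movie websites, and the results readily bring value to a business. For example, combined with domain context, one can automatically analyze the opinions that specific types of customers hold about a current product, break the sentiment analysis down by topic and by user type for targeted handling, and even recommend products on that basis to raise the conversion rate and generate more revenue.

In specialized domains, some non-polar words also clearly express a user's sentiment. For example, when downloading and using an app, "it froze" and "the download is too slow" express negative sentiment; in the stock domain, "bullish" and "bull market" express positive sentiment. In essence, we therefore want the model to discover such domain-specific expressions in a vertical domain and supply them to the sentiment classification system as polarity words: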

$\text{vertical-domain polarity words} = \text{general polarity words} + \text{domain-specific polarity words}$
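
The relation above can be sketched as a dictionary merge. This is only an illustration: the word lists, the scores, and the function names `build_lexicon` and `polarity_score` are hypothetical and are not part of the tutorial's model, which learns sentiment from data rather than from a hand-written lexicon.

```python
# Hypothetical word lists, chosen only for illustration.
GENERAL_POLARITY = {"good": 1, "great": 1, "bad": -1, "terrible": -1}
APP_DOMAIN_POLARITY = {"freezes": -1, "slow": -1, "smooth": 1}


def build_lexicon(general, domain):
    """Merge the general lexicon with a domain lexicon (domain entries win)."""
    merged = dict(general)
    merged.update(domain)
    return merged


def polarity_score(text, lexicon):
    """Sum the polarity of every known word; unknown words contribute 0."""
    return sum(lexicon.get(word, 0) for word in text.lower().split())


review = "the download is slow and the app freezes"
app_lexicon = build_lexicon(GENERAL_POLARITY, APP_DOMAIN_POLARITY)
general_only = polarity_score(review, GENERAL_POLARITY)
with_domain = polarity_score(review, app_lexicon)
print(general_only, with_domain)
```

With the general lexicon alone the review looks neutral, while the merged lexicon picks up the domain-specific negative words.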

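Depending on the granularity of the text being processed, sentiment analysis can be studied at the word, phrase, sentence, paragraph, and document levels. This tutorial works at the paragraph level: the input is a paragraph, and the output indicates whether the movie review is positive or negative.

Next, we use sentiment classification of IMDB movie reviews as an example to experience MindSpore applied to natural language processing.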

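## Overall Workflow

1. Preparation.
2. Load the dataset and process the data.
3. Define the network.
4. Define the optimizer and the loss function.
5. Train the network on the data to generate a model.
6. Once the model is obtained, check its accuracy on the validation dataset.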

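## Preparation

### Downloading the Dataset

This tutorial uses the IMDB movie review dataset as experimental data.

1. Download the IMDB movie review dataset from <http://ai.stanford.edu/~amaas/data/sentiment/>.

    The following are examples of a negative review (Negative) and a positive review (Positive).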

| Review  | Label  | 
|:---|:---:|
| "Quitting" may be as much about exiting a pre-ordained identity as about drug withdrawal. As a rural guy coming to Beijing, class and success must have struck this young artist face on as an appeal to separate from his roots and far surpass his peasant parents' acting success. Troubles arise, however, when the new man is too new, when it demands too big a departure from family, history, nature, and personal identity. The ensuing splits, and confusion between the imaginary and the real and the dissonance between the ordinary and the heroic are the stuff of a gut check on the one hand or a complete escape from self on the other.  |  Negative |  
| This movie is amazing because the fact that the real people portray themselves and their real life experience and do such a good job it's like they're almost living the past over again. Jia Hongsheng plays himself an actor who quit everything except music and drugs struggling with depression and searching for the meaning of life while being angry at everyone especially the people who care for him most.  | Positive  |
    
Extract the downloaded dataset and place it in the current working directory.


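2. Download the GloVe files.

    Download the GloVe archive and extract it into the current working directory, then rename the extracted directory to `glove`. Because the files are later loaded in word2vec text format, each GloVe file needs a new first line giving the number of words and the vector dimension. For `glove.6B.300d.txt`, the file this tutorial uses, add the line below, which means that 400,000 words are read and each word is represented by a 300-dimensional vector. For the other files, the second number must match that file's dimension (50, 100, or 200).

    ```
    400000 300
    ```

    GloVe download address: <http://nlp.stanford.edu/data/glove.6B.zip>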


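3. In the current working directory, create an empty directory named `preprocess`. It will store the files produced when the IMDB dataset is converted to the MindRecord format during preprocessing.

    The current working directory now has the following structure.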
    
    ```shell
    $ tree -L 2 lstm
    lstm
    ├── aclImdb
    │   ├── imdbEr.txt
    │   ├── imdb.vocab
    │   ├── README
    │   ├── test
    │   └── train
    ├── glove
    │   ├── glove.6B.100d.txt
    │   ├── glove.6B.200d.txt
    │   ├── glove.6B.300d.txt
    │   └── glove.6B.50d.txt
    └── preprocess
    ```
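
The GloVe header line from step 2 can also be added by a small script instead of by hand. The sketch below is an assumption about how one might do it: the helper name `add_word2vec_header` is hypothetical, and the demonstration runs on a made-up two-word file rather than on the real GloVe data.

```python
import os
import tempfile


def add_word2vec_header(src_path, dst_path):
    """Copy a GloVe text file to dst_path, prefixed with '<word count> <dimension>'."""
    count = 0
    dim = 0
    # First pass: count the words and read the dimension from the first line.
    with open(src_path, encoding="utf8") as src:
        for line in src:
            if not line.strip():
                continue
            if count == 0:
                dim = len(line.rstrip().split(" ")) - 1
            count += 1
    # Second pass: write the header followed by the original content.
    with open(src_path, encoding="utf8") as src, \
            open(dst_path, "w", encoding="utf8") as dst:
        dst.write(f"{count} {dim}\n")
        for line in src:
            dst.write(line)
    return count, dim


# Toy file standing in for a GloVe file (two words, three dimensions).
with tempfile.TemporaryDirectory() as tmp:
    raw = os.path.join(tmp, "toy_glove.txt")
    fixed = os.path.join(tmp, "toy_glove_with_header.txt")
    with open(raw, "w", encoding="utf8") as f:
        f.write("the 0.1 0.2 0.3\nmovie 0.4 0.5 0.6\n")
    header = add_word2vec_header(raw, fixed)
    with open(fixed, encoding="utf8") as f:
        first_line = f.readline().strip()
print(header, first_line)
```

Writing to a separate output file keeps the downloaded file untouched, so the step can be repeated safely.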

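### Choosing the Evaluation Metrics

As a typical classification problem, sentiment classification can be evaluated like any ordinary classification task. Common metrics such as accuracy, precision, recall, and the $F_\beta$ score can all be used as references.

$\text{Accuracy} = \frac{\text{number of correctly classified samples}}{\text{total number of samples}}$

$\text{Precision} = \frac{\text{number of true positives}}{\text{number of samples predicted as positive}}$

$\text{Recall} = \frac{\text{number of true positives}}{\text{number of samples whose true class is positive}}$

$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$

In the IMDB dataset, the numbers of positive and negative samples are roughly balanced, so accuracy alone is a reasonable measure of the classifier.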
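
As a worked example of the four formulas, the following sketch computes them for a small set of made-up labels and predictions. The function name `binary_metrics` and the toy data are assumptions for illustration only.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Toy labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so all four metrics come out to 0.75.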

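### Choosing the Network

We use SentimentNet, a network built on LSTM, for this natural language processing task.

> LSTM (long short-term memory) is a recurrent neural network architecture suited to processing and predicting important events separated by very long intervals and delays in a time series.
>
> This tutorial targets the GPU and CPU hardware platforms.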
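
To make the "long interval" property concrete, here is a minimal sketch of one LSTM time step using the standard gate equations, reduced to scalar input and state. It is not the `nn.LSTM` implementation used later; the parameter names and values are hypothetical and chosen so that the forget gate stays open and the input gate stays nearly closed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step for a scalar input and scalar state."""
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate value
    c = f * c_prev + i * g          # long-term memory (cell state)
    h = o * math.tanh(c)            # short-term memory (hidden state)
    return h, c


# Hypothetical parameters: forget gate ~1 (bias +10), input gate ~0 (bias -10).
params = {
    "w_i": 0.0, "u_i": 0.0, "b_i": -10.0,
    "w_f": 0.0, "u_f": 0.0, "b_f": 10.0,
    "w_o": 1.0, "u_o": 0.0, "b_o": 0.0,
    "w_g": 1.0, "u_g": 0.0, "b_g": 0.0,
}

h, c = 0.0, 1.0
for _ in range(50):
    h, c = lstm_step(0.5, h, c, params)
print(c)
```

After 50 steps the cell state is still close to its initial value of 1.0, which is the mechanism that lets an LSTM carry information across long sequences such as a 500-token review.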

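### Configuring the Run Information

1. Use the `argparse` module to pass in the information required for the run.

    - `preprocess`: whether to preprocess the dataset. The default is `false`.
    - `aclimdb_path`: path where the dataset is stored.
    - `glove_path`: path where the GloVe files are stored.
    - `preprocess_path`: directory for the preprocessed dataset.
    - `ckpt_path`: path where checkpoint files are saved.
    - `pre_trained`: path of a pretrained checkpoint file to load.
    - `device_target`: target device, either GPU or CPU.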

In [1]:
import argparse


parser = argparse.ArgumentParser(description='MindSpore LSTM Example')
parser.add_argument('--preprocess', type=str, default='false', choices=['true', 'false'],
                    help='whether to preprocess data.')
parser.add_argument('--aclimdb_path', type=str, default="./aclImdb",
                    help='path where the dataset is stored.')
parser.add_argument('--glove_path', type=str, default="./glove",
                    help='path where the GloVe is stored.')
parser.add_argument('--preprocess_path', type=str, default="./preprocess",
                    help='path where the pre-process data is stored.')
parser.add_argument('--ckpt_path', type=str, default="./",
                    help='the path to save the checkpoint file.')
parser.add_argument('--pre_trained', type=str, default=None,
                    help='the pretrained checkpoint file path.')
parser.add_argument('--device_target', type=str, default="GPU", choices=['GPU', 'CPU'],
                    help='the target device to run, support "GPU", "CPU". Default: "GPU".')
args = parser.parse_args(['--device_target', 'GPU', '--preprocess', 'true'])

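2. Before training, configure the necessary information, including the environment, execution mode, backend, and hardware.

> For details about the configuration options, see the `context.set_context` API description on the MindSpore website.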

In [2]:
from mindspore import context


context.set_context(
        mode=context.GRAPH_MODE,
        save_graphs=False,
        device_target=args.device_target)

### Configuring the SentimentNet Parameters

The following code configures the parameters required by SentimentNet, the LSTM-based network.

In [3]:
from easydict import EasyDict as edict


# LSTM CONFIG
lstm_cfg = edict({
    'num_classes': 2,
    'learning_rate': 0.1,
    'momentum': 0.9,
    'num_epochs': 10,
    'batch_size': 64,
    'embed_size': 300,
    'num_hiddens': 100,
    'num_layers': 2,
    'bidirectional': True,
    'save_checkpoint_steps': 390,
    'keep_checkpoint_max': 10
})

cfg = lstm_cfg
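
The value `save_checkpoint_steps: 390` is easier to read once the number of steps per epoch is worked out. The arithmetic below assumes the 25,000-review IMDB training split and that the last incomplete batch is dropped, as the dataset function in this tutorial does with `drop_remainder=True`. The constant names are chosen for this sketch only.

```python
TRAIN_SAMPLES = 25000          # size of the IMDB training split (assumption)
BATCH_SIZE = 64                # cfg.batch_size
NUM_EPOCHS = 10                # cfg.num_epochs
SAVE_CHECKPOINT_STEPS = 390    # cfg.save_checkpoint_steps

# With the incomplete last batch dropped, integer division gives the step count.
steps_per_epoch = TRAIN_SAMPLES // BATCH_SIZE
dropped_samples = TRAIN_SAMPLES - steps_per_epoch * BATCH_SIZE
total_steps = steps_per_epoch * NUM_EPOCHS
periodic_checkpoints = total_steps // SAVE_CHECKPOINT_STEPS

print(steps_per_epoch, dropped_samples, total_steps, periodic_checkpoints)
```

Under these assumptions one epoch is 390 steps, so a checkpoint is written about once per epoch, and `keep_checkpoint_max: 10` is enough to retain one per epoch for the 10 epochs.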

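## Data Processing

### Preprocessing the Dataset

1. Define the `ImdbParser` class to parse the text dataset. It covers encoding, tokenization, alignment (padding), and processing of the raw GloVe data so that the data fits the network structure.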

In [4]:
import os
from itertools import chain
import numpy as np
import gensim


class ImdbParser():
    """
    parse aclImdb data to features and labels.
    sentence->tokenized->encoded->padding->features
    """

    def __init__(self, imdb_path, glove_path, embed_size=300):
        self.__segs = ['train', 'test']
        self.__label_dic = {'pos': 1, 'neg': 0}
        self.__imdb_path = imdb_path
        self.__glove_dim = embed_size
        self.__glove_file = os.path.join(glove_path, 'glove.6B.' + str(self.__glove_dim) + 'd.txt')

        # properties
        self.__imdb_datas = {}
        self.__features = {}
        self.__labels = {}
        self.__vacab = {}
        self.__word2idx = {}
        self.__weight_np = {}
        self.__wvmodel = None

    def parse(self):
        """
        parse imdb data to memory
        """
        self.__wvmodel = gensim.models.KeyedVectors.load_word2vec_format(self.__glove_file)

        for seg in self.__segs:
            self.__parse_imdb_datas(seg)
            self.__parse_features_and_labels(seg)
            self.__gen_weight_np(seg)

    def __parse_imdb_datas(self, seg):
        """
        load data from txt
        """
        data_lists = []
        for label_name, label_id in self.__label_dic.items():
            sentence_dir = os.path.join(self.__imdb_path, seg, label_name)
            for file in os.listdir(sentence_dir):
                with open(os.path.join(sentence_dir, file), mode='r', encoding='utf8') as f:
                    sentence = f.read().replace('\n', '')
                    data_lists.append([sentence, label_id])
        self.__imdb_datas[seg] = data_lists

    def __parse_features_and_labels(self, seg):
        """
        parse features and labels
        """
        features = []
        labels = []
        for sentence, label in self.__imdb_datas[seg]:
            features.append(sentence)
            labels.append(label)

        self.__features[seg] = features
        self.__labels[seg] = labels

        # update feature to tokenized
        self.__updata_features_to_tokenized(seg)
        # parse vacab
        self.__parse_vacab(seg)
        # encode feature
        self.__encode_features(seg)
        # padding feature
        self.__padding_features(seg)

    def __updata_features_to_tokenized(self, seg):
        tokenized_features = []
        for sentence in self.__features[seg]:
            tokenized_sentence = [word.lower() for word in sentence.split(" ")]
            tokenized_features.append(tokenized_sentence)
        self.__features[seg] = tokenized_features

    def __parse_vacab(self, seg):
        # vocab
        tokenized_features = self.__features[seg]
        vocab = set(chain(*tokenized_features))
        self.__vacab[seg] = vocab

        # word_to_idx: {'hello': 1, 'world':111, ... '<unk>': 0}
        word_to_idx = {word: i + 1 for i, word in enumerate(vocab)}
        word_to_idx['<unk>'] = 0
        self.__word2idx[seg] = word_to_idx

    def __encode_features(self, seg):
        """ encode word to index """
        word_to_idx = self.__word2idx['train']
        encoded_features = []
        for tokenized_sentence in self.__features[seg]:
            encoded_sentence = []
            for word in tokenized_sentence:
                encoded_sentence.append(word_to_idx.get(word, 0))
            encoded_features.append(encoded_sentence)
        self.__features[seg] = encoded_features

    def __padding_features(self, seg, maxlen=500, pad=0):
        """ pad all features to the same length """
        padded_features = []
        for feature in self.__features[seg]:
            if len(feature) >= maxlen:
                padded_feature = feature[:maxlen]
            else:
                padded_feature = feature
                while len(padded_feature) < maxlen:
                    padded_feature.append(pad)
            padded_features.append(padded_feature)
        self.__features[seg] = padded_features

    def __gen_weight_np(self, seg):
        """
        generate weight by gensim
        """
        weight_np = np.zeros((len(self.__word2idx[seg]), self.__glove_dim), dtype=np.float32)
        for word, idx in self.__word2idx[seg].items():
            if word not in self.__wvmodel:
                continue
            word_vector = self.__wvmodel.get_vector(word)
            weight_np[idx, :] = word_vector

        self.__weight_np[seg] = weight_np

    def get_datas(self, seg):
        """
        return features, labels, and weight
        """
        features = np.array(self.__features[seg]).astype(np.int32)
        labels = np.array(self.__labels[seg]).astype(np.int32)
        weight = np.array(self.__weight_np[seg])
        return features, labels, weight
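
The pipeline named in the class docstring (sentence, tokenized, encoded, padded) can be followed on a toy example. The sketch below is a simplified standalone version, not the class itself: the function names and sentences are made up, and the vocabulary is sorted only to make the indices reproducible, whereas `ImdbParser` enumerates a `set` directly.

```python
from itertools import chain


def tokenize(sentences):
    """Lower-case each sentence and split it on single spaces."""
    return [[word.lower() for word in s.split(" ")] for s in sentences]


def build_word_to_idx(tokenized):
    """Assign indices 1..N to the vocabulary and reserve 0 for '<unk>'."""
    vocab = sorted(set(chain(*tokenized)))
    word_to_idx = {word: i + 1 for i, word in enumerate(vocab)}
    word_to_idx["<unk>"] = 0
    return word_to_idx


def encode(tokenized, word_to_idx):
    """Replace every word by its index; unknown words map to 0."""
    return [[word_to_idx.get(w, 0) for w in s] for s in tokenized]


def pad(encoded, maxlen, pad_value=0):
    """Truncate or right-pad every sequence to exactly maxlen items."""
    return [s[:maxlen] + [pad_value] * max(0, maxlen - len(s)) for s in encoded]


train = ["A great movie", "A dull plot"]
test = ["A great surprise"]

# As in ImdbParser, the test set is encoded with the training vocabulary.
word_to_idx = build_word_to_idx(tokenize(train))
train_features = pad(encode(tokenize(train), word_to_idx), maxlen=5)
test_features = pad(encode(tokenize(test), word_to_idx), maxlen=5)
print(train_features, test_features)
```

Note that, as in the class above, the padding value and the `<unk>` index are both 0, so the network cannot tell padding apart from unknown words.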

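2. Define the `convert_to_mindrecord` function to convert the dataset to the MindRecord format, which MindSpore can read easily.

    The `weight.txt` file written by `_convert_to_mindrecord` is generated automatically during preprocessing and holds the weight parameters (the embedding table).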

In [5]:
import os
import numpy as np
from mindspore.mindrecord import FileWriter


def _convert_to_mindrecord(data_home, features, labels, weight_np=None, training=True):
    """
    convert imdb dataset to mindrecord dataset
    """
    if weight_np is not None:
        np.savetxt(os.path.join(data_home, 'weight.txt'), weight_np)

    # write mindrecord
    schema_json = {"id": {"type": "int32"},
                   "label": {"type": "int32"},
                   "feature": {"type": "int32", "shape": [-1]}}

    data_dir = os.path.join(data_home, "aclImdb_train.mindrecord")
    if not training:
        data_dir = os.path.join(data_home, "aclImdb_test.mindrecord")

    def get_imdb_data(features, labels):
        data_list = []
        for i, (label, feature) in enumerate(zip(labels, features)):
            data_json = {"id": i,
                         "label": int(label),
                         "feature": feature.reshape(-1)}
            data_list.append(data_json)
        return data_list

    writer = FileWriter(data_dir, shard_num=4)
    data = get_imdb_data(features, labels)
    writer.add_schema(schema_json, "nlp_schema")
    writer.add_index(["id", "label"])
    writer.write_raw_data(data)
    writer.commit()


def convert_to_mindrecord(embed_size, aclimdb_path, preprocess_path, glove_path):
    """
    convert imdb dataset to mindrecord dataset
    """
    parser = ImdbParser(aclimdb_path, glove_path, embed_size)
    parser.parse()

    if not os.path.exists(preprocess_path):
        print(f"preprocess path {preprocess_path} does not exist, creating it")
        os.makedirs(preprocess_path)

    train_features, train_labels, train_weight_np = parser.get_datas('train')
    _convert_to_mindrecord(preprocess_path, train_features, train_labels, train_weight_np)

    test_features, test_labels, _ = parser.get_datas('test')
    _convert_to_mindrecord(preprocess_path, test_features, test_labels, training=False)
    

3. Call the `convert_to_mindrecord` function to preprocess the dataset. This step takes about 3 minutes.

In [6]:
if args.preprocess == "true":
    print("============== Starting Data Pre-processing ==============")
    convert_to_mindrecord(cfg.embed_size, args.aclimdb_path, args.preprocess_path, args.glove_path)
    print("======================= Successful =======================")




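After the conversion succeeds, MindRecord files are generated in the `preprocess` directory. As long as the dataset does not change, this step does not need to be repeated for every training run. The `preprocess` directory then looks as follows: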

```shell
 $ tree preprocess
 ├── aclImdb_test.mindrecord0
 ├── aclImdb_test.mindrecord0.db
 ├── aclImdb_test.mindrecord1
 ├── aclImdb_test.mindrecord1.db
 ├── aclImdb_test.mindrecord2
 ├── aclImdb_test.mindrecord2.db
 ├── aclImdb_test.mindrecord3
 ├── aclImdb_test.mindrecord3.db
 ├── aclImdb_train.mindrecord0
 ├── aclImdb_train.mindrecord0.db
 ├── aclImdb_train.mindrecord1
 ├── aclImdb_train.mindrecord1.db
 ├── aclImdb_train.mindrecord2
 ├── aclImdb_train.mindrecord2.db
 ├── aclImdb_train.mindrecord3
 ├── aclImdb_train.mindrecord3.db
 └── weight.txt
```

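- In the files above:
    - Files whose names contain `aclImdb_train.mindrecord` are the training dataset converted to the MindRecord format.
    - Files whose names contain `aclImdb_test.mindrecord` are the test dataset converted to the MindRecord format.
    - `weight.txt` is the weight parameter file generated automatically during preprocessing.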
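
The content of `weight.txt` follows from `__gen_weight_np`: one row per word index, filled with the pretrained vector when the word is found and left at zero otherwise. The sketch below reproduces that layout with plain lists. The function name `build_embedding_table`, the 2-dimensional vectors, and the word `zzyzx` are made up for illustration.

```python
def build_embedding_table(word_to_idx, vectors, dim):
    """Return one row per word index; words without a vector keep a zero row."""
    table = [[0.0] * dim for _ in range(len(word_to_idx))]
    for word, idx in word_to_idx.items():
        if word in vectors:
            table[idx] = list(vectors[word])
    return table


# Toy vocabulary and toy 2-d "pretrained" vectors (not real GloVe values).
word_to_idx = {"<unk>": 0, "great": 1, "movie": 2, "zzyzx": 3}
toy_vectors = {"great": [0.1, 0.2], "movie": [0.3, 0.4]}

table = build_embedding_table(word_to_idx, toy_vectors, dim=2)
for row in table:
    print(row)
```

Row `i` of the table is what the embedding layer returns for word index `i`, which is why the table must be built from the same vocabulary that encoded the features.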


4. Define the `lstm_create_dataset` function for creating datasets, and create the training set `ds_train`.

In [7]:
import os
import mindspore.dataset as ds


def lstm_create_dataset(data_home, batch_size, repeat_num=1, training=True):
    """Data operations."""
    ds.config.set_seed(1)
    data_dir = os.path.join(data_home, "aclImdb_train.mindrecord0")
    if not training:
        data_dir = os.path.join(data_home, "aclImdb_test.mindrecord0")

    data_set = ds.MindDataset(data_dir, columns_list=["feature", "label"], num_parallel_workers=4)

    # shuffle the samples, group them into batches, and repeat the dataset
    data_set = data_set.shuffle(buffer_size=data_set.get_dataset_size())
    data_set = data_set.batch(batch_size=batch_size, drop_remainder=True)
    data_set = data_set.repeat(count=repeat_num)

    return data_set

ds_train = lstm_create_dataset(args.preprocess_path, cfg.batch_size)
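
The order of the three dataset operations (shuffle, then batch with `drop_remainder=True`, then repeat) can be illustrated without MindSpore. The following is a plain-Python approximation under stated assumptions: `make_batches` is a hypothetical helper, and it shuffles only once, whereas a real data pipeline may reshuffle on every pass.

```python
import random


def make_batches(samples, batch_size, repeat_num=1, seed=1):
    """Shuffle once, keep only full batches, then repeat the batch list."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    full_batches = len(shuffled) // batch_size   # incomplete last batch is dropped
    batches = [shuffled[i * batch_size:(i + 1) * batch_size]
               for i in range(full_batches)]
    return batches * repeat_num


batches = make_batches(range(10), batch_size=4, repeat_num=2)
print(len(batches), [len(b) for b in batches])
```

Ten samples with a batch size of 4 give two full batches per pass and leave two samples unused. Every batch therefore has the same shape, which matters here because the network's initial LSTM state is created for a fixed `batch_size`.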

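5. Create a dictionary iterator with the `create_dict_iterator` method and read data from the `ds_train` dataset created above.

    Run the following code to read the `label` list of the first `batch` and the `feature` data of the first element in that `batch`.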

In [8]:
iterator = ds_train.create_dict_iterator().get_next()
first_batch_label = iterator["label"]
first_batch_first_feature = iterator["feature"][0]
print(f"The first batch contains label below:\n{first_batch_label}\n")
print(f"The feature of the first item in the first batch is below vector:\n{first_batch_first_feature}")

The first batch contains label below:
[0 0 1 1 1 0 0 0 0 0 0 1 0 0 1 0 1 1 0 0 0 0 1 1 1 0 1 1 1 0 0 1 0 0 1 0 1
 0 0 0 0 1 0 0 1 1 1 0 0 0 1 1 1 1 0 1 0 0 1 1 0 1 1 0]

The feature of the first item in the first batch is below vector:
[210974 227370 167874 221440 205821 250308  57410 167874 157597 211314
 104140 154424 238018 167874 216357  23869 209921 187724 131973 144940
 177558 221440 205821 119691 149127 137330 212709 117415  61509  42345
 166849 155531 219231  64473 210974 103293 225985 181047  41304 210974
 132905  33755  96216   8987 210974 195260 117816  15665 241057   8987
  93501 155531 118935 110275 101659 181047 226216 133895 114115   6596
 189694 210974  56753   3426  29344 103100 131973  46391  25351  35080
  27231  69404 190304 212709 117415 157277 167874 210974 109102  92239
 101085 123273  64473 117415 176947  27231 168206 219146 167874 210974
 227370  18539 155531 219231  64473 210974 155781  93577 192315 157597
 213189  66091 216583 100381 158491 181047  15368 2214

## Defining the Network

1. Import the modules required to initialize the network.

In [9]:
import numpy as np
from mindspore import Tensor, nn, context
from mindspore.ops import operations as P
from mindspore.train.serialization import load_param_into_net, load_checkpoint

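2. Define the `lstm_default_state` function to initialize the LSTM state tensors, namely the hidden state `h` and the cell state `c`, to zeros.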

In [10]:
# Initialize short-term memory (h) and long-term memory (c) to 0
def lstm_default_state(batch_size, hidden_size, num_layers, bidirectional):
    """init default input."""
    num_directions = 1
    if bidirectional:
        num_directions = 2

    if context.get_context("device_target") == "CPU":
        h_list = []
        c_list = []
        i = 0
        while i < num_layers:
            hi = Tensor(np.zeros((num_directions, batch_size, hidden_size)).astype(np.float32))
            h_list.append(hi)
            ci = Tensor(np.zeros((num_directions, batch_size, hidden_size)).astype(np.float32))
            c_list.append(ci)
            i = i + 1
        h = tuple(h_list)
        c = tuple(c_list)
        return h, c

    h = Tensor(
        np.zeros((num_layers * num_directions, batch_size, hidden_size)).astype(np.float32))
    c = Tensor(
        np.zeros((num_layers * num_directions, batch_size, hidden_size)).astype(np.float32))
    return h, c
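
The two branches of `lstm_default_state` differ only in how the zero state is laid out. The sketch below computes just the shapes, without creating tensors, for the configuration used in this tutorial. The helper name `lstm_state_shapes` is hypothetical.

```python
def lstm_state_shapes(batch_size, hidden_size, num_layers, bidirectional, device_target):
    """Return the shape layout of h (c has the same layout)."""
    num_directions = 2 if bidirectional else 1
    if device_target == "CPU":
        # One (num_directions, batch, hidden) state per layer, packed in a tuple.
        per_layer = (num_directions, batch_size, hidden_size)
        return tuple(per_layer for _ in range(num_layers))
    # Otherwise a single tensor stacks all layers and directions.
    return (num_layers * num_directions, batch_size, hidden_size)


gpu_shape = lstm_state_shapes(64, 100, 2, True, "GPU")
cpu_shape = lstm_state_shapes(64, 100, 2, True, "CPU")
print(gpu_shape)
print(cpu_shape)
```

With 2 layers and a bidirectional LSTM, the stacked layout has a leading dimension of 4, while the CPU layout is a tuple of 2 states with a leading dimension of 2 each.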

3. Define the network structure (`SentimentNet`) by subclassing `nn.Cell`.

In [11]:
class SentimentNet(nn.Cell):
    """Sentiment network structure."""

    def __init__(self,
                 vocab_size,
                 embed_size,
                 num_hiddens,
                 num_layers,
                 bidirectional,
                 num_classes,
                 weight,
                 batch_size):
        super(SentimentNet, self).__init__()
        # Map words to vectors
        self.embedding = nn.Embedding(vocab_size,
                                      embed_size,
                                      embedding_table=weight)
        self.embedding.embedding_table.requires_grad = False
        self.trans = P.Transpose()
        self.perm = (1, 0, 2)
        self.encoder = nn.LSTM(input_size=embed_size,
                               hidden_size=num_hiddens,
                               num_layers=num_layers,
                               has_bias=True,
                               bidirectional=bidirectional,
                               dropout=0.0)

        self.h, self.c = lstm_default_state(batch_size, num_hiddens, num_layers, bidirectional)

        self.concat = P.Concat(1)
        if bidirectional:
            self.decoder = nn.Dense(num_hiddens * 4, num_classes)
        else:
            self.decoder = nn.Dense(num_hiddens * 2, num_classes)

    def construct(self, inputs):
        # inputs: (64, 500) word indices -> embeddings: (64, 500, 300)
        embeddings = self.embedding(inputs)
        embeddings = self.trans(embeddings, self.perm)
        output, _ = self.encoder(embeddings, (self.h, self.c))
        # output[t]: (64, 200) for each time step -> encoding: (64, 400)
        encoding = self.concat((output[0], output[499]))
        outputs = self.decoder(encoding)
        return outputs
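
Tracing the tensor shapes through `construct` explains why the decoder takes `num_hiddens * 4` inputs in the bidirectional case. The sketch below only does shape bookkeeping and runs no network. The function name `sentiment_net_shapes` is hypothetical, and the sequence length of 500 is taken from the padding length used in preprocessing.

```python
def sentiment_net_shapes(batch_size, seq_len, embed_size, num_hiddens,
                         bidirectional, num_classes):
    """Return the shape of each intermediate result in SentimentNet.construct."""
    num_directions = 2 if bidirectional else 1
    return {
        "inputs": (batch_size, seq_len),
        "embeddings": (batch_size, seq_len, embed_size),
        # Transpose with perm (1, 0, 2) puts the time axis first.
        "transposed": (seq_len, batch_size, embed_size),
        "lstm_output": (seq_len, batch_size, num_hiddens * num_directions),
        # First and last time steps are concatenated along axis 1.
        "encoding": (batch_size, 2 * num_hiddens * num_directions),
        "outputs": (batch_size, num_classes),
    }


shapes = sentiment_net_shapes(batch_size=64, seq_len=500, embed_size=300,
                              num_hiddens=100, bidirectional=True, num_classes=2)
for name, shape in shapes.items():
    print(name, shape)
```

Each time step of a bidirectional LSTM yields `2 * num_hiddens` features, and concatenating two time steps doubles that again, giving 400 features per review for the final dense layer.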

4. Instantiate `SentimentNet` to create the network. This step takes about 1 minute.

In [12]:
embedding_table = np.loadtxt(os.path.join(args.preprocess_path, "weight.txt")).astype(np.float32)
network = SentimentNet(vocab_size=embedding_table.shape[0],
                       embed_size=cfg.embed_size,
                       num_hiddens=cfg.num_hiddens,
                       num_layers=cfg.num_layers,
                       bidirectional=cfg.bidirectional,
                       num_classes=cfg.num_classes,
                       weight=Tensor(embedding_table),
                       batch_size=cfg.batch_size)

## Defining the Optimizer and Loss Function

Run the following code to create the optimizer and the loss function.

In [13]:
from mindspore import nn


loss = nn.SoftmaxCrossEntropyWithLogits(is_grad=False, sparse=True)
opt = nn.Momentum(network.trainable_params(), cfg.learning_rate, cfg.momentum)
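
What these two objects compute can be shown on scalars. The sketch below implements softmax cross-entropy with an integer (sparse) label and the common non-Nesterov momentum update. Both are textbook forms written for illustration; whether `nn.Momentum` uses exactly this update is an assumption, and the function names are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def momentum_step(param, grad, velocity, learning_rate, momentum):
    """One momentum update: accumulate the gradient, then move the parameter."""
    velocity = momentum * velocity + grad
    param = param - learning_rate * velocity
    return param, velocity


# An untrained two-class model that outputs equal logits has loss ln(2).
initial_loss = softmax_cross_entropy([0.0, 0.0], 1)
confident_loss = softmax_cross_entropy([-2.0, 2.0], 1)

# Two updates with the tutorial's learning rate 0.1 and momentum 0.9.
p, v = momentum_step(1.0, 0.5, 0.0, learning_rate=0.1, momentum=0.9)
p2, v2 = momentum_step(p, 0.5, v, learning_rate=0.1, momentum=0.9)
print(round(initial_loss, 4), round(confident_loss, 4), p, p2)
```

The value ln(2) ≈ 0.6931 is the loss of a two-class model that has not yet learned anything, which matches the first loss values printed in the training log of this tutorial.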

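## Training and Saving the Model

Load the training dataset (`ds_train`), configure the `CheckPoint` generation settings, and then train the model through the `model.train` interface. This step takes about 7 minutes. The output shows the loss value decreasing gradually as training proceeds, finally reaching about 0.262.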
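
Because the per-step loss is noisy, the downward trend is easier to see as an average per epoch. The sketch below parses log lines of the form printed during training. The helper `mean_loss_per_epoch` is hypothetical, the pattern assumes four decimal places as in this tutorial's log, and the sample lines are copied from that log, including one truncated line that is skipped.

```python
import re
from collections import defaultdict

# Matches complete lines such as "epoch: 1 step: 2, loss is 0.6922".
LOG_LINE = re.compile(r"epoch: (\d+) step: (\d+), loss is (\d+\.\d{4})")


def mean_loss_per_epoch(log_text):
    """Average the logged loss values per epoch, ignoring incomplete lines."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in log_text.splitlines():
        match = LOG_LINE.fullmatch(line.strip())
        if match is None:
            continue  # truncated or unrelated line
        epoch = int(match.group(1))
        sums[epoch] += float(match.group(3))
        counts[epoch] += 1
    return {epoch: sums[epoch] / counts[epoch] for epoch in sorted(sums)}


sample_log = """epoch: 1 step: 1, loss is 0.6938
epoch: 1 step: 2, loss is 0.6922
epoch: 1 step: 30, loss
epoch: 2 step: 54, loss is 0.5748
epoch: 2 step: 55, loss is 0.5293
"""
means = mean_loss_per_epoch(sample_log)
print(means)
```

Applied to a complete log, the same function gives one number per epoch that can be compared against the final loss quoted above.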

In [14]:
from mindspore import Model
from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, TimeMonitor, LossMonitor
from mindspore.nn import Accuracy


model = Model(network, loss, opt, {'acc': Accuracy()})
loss_cb = LossMonitor()
print("============== Starting Training ==============")
config_ck = CheckpointConfig(save_checkpoint_steps=cfg.save_checkpoint_steps,
                             keep_checkpoint_max=cfg.keep_checkpoint_max)
ckpoint_cb = ModelCheckpoint(prefix="lstm", directory=args.ckpt_path, config=config_ck)
time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())
if args.device_target == "CPU":
    model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb], dataset_sink_mode=False)
else:
    model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb])
print("============== Training Success ==============")

epoch: 1 step: 1, loss is 0.6938
epoch: 1 step: 2, loss is 0.6922
epoch: 1 step: 3, loss is 0.6917
epoch: 1 step: 4, loss is 0.6952
epoch: 1 step: 5, loss is 0.6868
epoch: 1 step: 6, loss is 0.6982
epoch: 1 step: 7, loss is 0.6856
epoch: 1 step: 8, loss is 0.6819
epoch: 1 step: 9, loss is 0.7372
epoch: 1 step: 10, loss is 0.6948
epoch: 1 step: 11, loss is 0.6961
epoch: 1 step: 12, loss is 0.6975
epoch: 1 step: 13, loss is 0.6931
epoch: 1 step: 14, loss is 0.6903
epoch: 1 step: 15, loss is 0.6720
epoch: 1 step: 16, loss is 0.7079
epoch: 1 step: 17, loss is 0.7125
epoch: 1 step: 18, loss is 0.7477
epoch: 1 step: 19, loss is 0.6924
epoch: 1 step: 20, loss is 0.7085
epoch: 1 step: 21, loss is 0.6958
epoch: 1 step: 22, loss is 0.6918
epoch: 1 step: 23, loss is 0.6985
epoch: 1 step: 24, loss is 0.6919
epoch: 1 step: 25, loss is 0.6858
epoch: 1 step: 26, loss is 0.6796
epoch: 1 step: 27, loss is 0.7113
epoch: 1 step: 28, loss is 0.7065
epoch: 1 step: 29, loss is 0.6910
epoch: 1 step: 30, loss

epoch: 1 step: 90, loss is 0.6860
epoch: 1 step: 91, loss is 0.6900
epoch: 1 step: 92, loss is 0.6846
epoch: 1 step: 93, loss is 0.6833
epoch: 1 step: 94, loss is 0.6901
epoch: 1 step: 95, loss is 0.6831
epoch: 1 step: 96, loss is 0.7010
epoch: 1 step: 97, loss is 0.6925
epoch: 1 step: 98, loss is 0.6768
epoch: 1 step: 99, loss is 0.6848
epoch: 1 step: 100, loss is 0.6925
epoch: 1 step: 101, loss is 0.7067
epoch: 1 step: 102, loss is 0.7053
epoch: 1 step: 103, loss is 0.6841
epoch: 1 step: 104, loss is 0.6882
epoch: 1 step: 105, loss is 0.6794
epoch: 1 step: 106, loss is 0.6754
epoch: 1 step: 107, loss is 0.6788
epoch: 1 step: 108, loss is 0.6930
epoch: 1 step: 109, loss is 0.6792
epoch: 1 step: 110, loss is 0.6889
epoch: 1 step: 111, loss is 0.6800
epoch: 1 step: 112, loss is 0.6881
epoch: 1 step: 113, loss is 0.6866
epoch: 1 step: 114, loss is 0.6963
epoch: 1 step: 115, loss is 0.6698
epoch: 1 step: 116, loss is 0.6795
epoch: 1 step: 117, loss is 0.7177
epoch: 1 step: 118, loss is 0.

epoch: 1 step: 179, loss is 0.6838
epoch: 1 step: 180, loss is 0.7194
epoch: 1 step: 181, loss is 0.5811
epoch: 1 step: 182, loss is 0.7140
epoch: 1 step: 183, loss is 0.7558
epoch: 1 step: 184, loss is 0.6419
epoch: 1 step: 185, loss is 0.5970
epoch: 1 step: 186, loss is 0.7137
epoch: 1 step: 187, loss is 0.6258
epoch: 1 step: 188, loss is 0.6423
epoch: 1 step: 189, loss is 0.6785
epoch: 1 step: 190, loss is 0.6613
epoch: 1 step: 191, loss is 0.6538
epoch: 1 step: 192, loss is 0.6377
epoch: 1 step: 193, loss is 0.7727
epoch: 1 step: 194, loss is 0.6539
epoch: 1 step: 195, loss is 0.6855
epoch: 1 step: 196, loss is 0.6523
epoch: 1 step: 197, loss is 0.6892
epoch: 1 step: 198, loss is 0.6495
epoch: 1 step: 199, loss is 0.6546
epoch: 1 step: 200, loss is 0.6856
epoch: 1 step: 201, loss is 0.6739
epoch: 1 step: 202, loss is 0.6894
epoch: 1 step: 203, loss is 0.6625
epoch: 1 step: 204, loss is 0.6656
epoch: 1 step: 205, loss is 0.6302
epoch: 1 step: 206, loss is 0.6459
epoch: 1 step: 207, 

epoch: 1 step: 268, loss is 0.6293
epoch: 1 step: 269, loss is 0.6679
epoch: 1 step: 270, loss is 0.6610
epoch: 1 step: 271, loss is 0.6144
epoch: 1 step: 272, loss is 0.6461
epoch: 1 step: 273, loss is 0.6446
epoch: 1 step: 274, loss is 0.7186
epoch: 1 step: 275, loss is 0.7003
epoch: 1 step: 276, loss is 0.6935
epoch: 1 step: 277, loss is 0.7605
epoch: 1 step: 278, loss is 0.6664
epoch: 1 step: 279, loss is 0.5582
epoch: 1 step: 280, loss is 0.6123
epoch: 1 step: 281, loss is 0.6410
epoch: 1 step: 282, loss is 0.6696
epoch: 1 step: 283, loss is 0.6637
epoch: 1 step: 284, loss is 0.6558
epoch: 1 step: 285, loss is 0.6364
epoch: 1 step: 286, loss is 0.6613
epoch: 1 step: 287, loss is 0.6815
epoch: 1 step: 288, loss is 0.6551
epoch: 1 step: 289, loss is 0.6071
epoch: 1 step: 290, loss is 0.6287
epoch: 1 step: 291, loss is 0.6090
epoch: 1 step: 292, loss is 0.6697
epoch: 1 step: 293, loss is 0.6100
epoch: 1 step: 294, loss is 0.6452
epoch: 1 step: 295, loss is 0.5721
epoch: 1 step: 296, 

epoch: 1 step: 357, loss is 0.6403
epoch: 1 step: 358, loss is 0.6679
epoch: 1 step: 359, loss is 0.6559
epoch: 1 step: 360, loss is 0.6298
epoch: 1 step: 361, loss is 0.6193
epoch: 1 step: 362, loss is 0.6649
epoch: 1 step: 363, loss is 0.6179
epoch: 1 step: 364, loss is 0.6771
epoch: 1 step: 365, loss is 0.6193
epoch: 1 step: 366, loss is 0.5615
epoch: 1 step: 367, loss is 0.6999
epoch: 1 step: 368, loss is 0.6330
epoch: 1 step: 369, loss is 0.6941
epoch: 1 step: 370, loss is 0.7298
epoch: 1 step: 371, loss is 0.7247
epoch: 1 step: 372, loss is 0.5866
epoch: 1 step: 373, loss is 0.6025
epoch: 1 step: 374, loss is 0.6047
epoch: 1 step: 375, loss is 0.5705
epoch: 1 step: 376, loss is 0.7009
epoch: 1 step: 377, loss is 0.6272
epoch: 1 step: 378, loss is 0.6697
epoch: 1 step: 379, loss is 0.6578
epoch: 1 step: 380, loss is 0.5431
epoch: 1 step: 381, loss is 0.7024
epoch: 1 step: 382, loss is 0.5866
epoch: 1 step: 383, loss is 0.6498
epoch: 1 step: 384, loss is 0.5926
epoch: 1 step: 385, 

epoch: 2 step: 54, loss is 0.5748
epoch: 2 step: 55, loss is 0.5293
epoch: 2 step: 56, loss is 0.5660
epoch: 2 step: 57, loss is 0.5283
epoch: 2 step: 58, loss is 0.5347
epoch: 2 step: 59, loss is 0.5154
epoch: 2 step: 60, loss is 0.6732
epoch: 2 step: 61, loss is 0.5197
epoch: 2 step: 62, loss is 0.7254
epoch: 2 step: 63, loss is 0.9070
epoch: 2 step: 64, loss is 0.5558
epoch: 2 step: 65, loss is 1.0045
epoch: 2 step: 66, loss is 0.8933
epoch: 2 step: 67, loss is 1.0105
epoch: 2 step: 68, loss is 0.6910
epoch: 2 step: 69, loss is 0.6372
epoch: 2 step: 70, loss is 0.6704
epoch: 2 step: 71, loss is 0.7066
epoch: 2 step: 72, loss is 0.7282
epoch: 2 step: 73, loss is 0.7256
epoch: 2 step: 74, loss is 0.7049
epoch: 2 step: 75, loss is 0.7688
epoch: 2 step: 76, loss is 0.6864
epoch: 2 step: 77, loss is 0.6767
epoch: 2 step: 78, loss is 0.6959
epoch: 2 step: 79, loss is 0.6960
epoch: 2 step: 80, loss is 0.6875
epoch: 2 step: 81, loss is 0.6882
epoch: 2 step: 82, loss is 0.6958
epoch: 2 step:

epoch: 2 step: 143, loss is 0.5716
epoch: 2 step: 144, loss is 0.6271
epoch: 2 step: 145, loss is 0.5050
epoch: 2 step: 146, loss is 0.5590
epoch: 2 step: 147, loss is 0.6321
epoch: 2 step: 148, loss is 0.6130
epoch: 2 step: 149, loss is 0.5702
epoch: 2 step: 150, loss is 0.5732
epoch: 2 step: 151, loss is 0.5903
epoch: 2 step: 152, loss is 0.5511
epoch: 2 step: 153, loss is 0.6821
epoch: 2 step: 154, loss is 0.4778
epoch: 2 step: 155, loss is 0.6927
epoch: 2 step: 156, loss is 0.5322
epoch: 2 step: 157, loss is 0.4992
epoch: 2 step: 158, loss is 0.5179
epoch: 2 step: 159, loss is 0.7331
epoch: 2 step: 160, loss is 0.6702
epoch: 2 step: 161, loss is 0.5674
epoch: 2 step: 162, loss is 0.6555
epoch: 2 step: 163, loss is 0.6740
epoch: 2 step: 164, loss is 0.6001
epoch: 2 step: 165, loss is 0.6950
epoch: 2 step: 166, loss is 0.6409
epoch: 2 step: 167, loss is 0.5637
epoch: 2 step: 168, loss is 0.5931
epoch: 2 step: 169, loss is 0.5834
epoch: 2 step: 170, loss is 0.6347
epoch: 2 step: 171, 

epoch: 2 step: 232, loss is 0.5745
epoch: 2 step: 233, loss is 0.5614
epoch: 2 step: 234, loss is 0.5357
epoch: 2 step: 235, loss is 0.5186
epoch: 2 step: 236, loss is 0.6700
epoch: 2 step: 237, loss is 0.5584
epoch: 2 step: 238, loss is 0.5589
epoch: 2 step: 239, loss is 0.5363
epoch: 2 step: 240, loss is 0.5776
epoch: 2 step: 241, loss is 0.7283
epoch: 2 step: 242, loss is 0.5002
epoch: 2 step: 243, loss is 0.5267
epoch: 2 step: 244, loss is 0.7191
epoch: 2 step: 245, loss is 0.5527
epoch: 2 step: 246, loss is 0.6456
epoch: 2 step: 247, loss is 0.4888
epoch: 2 step: 248, loss is 0.5648
epoch: 2 step: 249, loss is 0.5652
epoch: 2 step: 250, loss is 0.5415
epoch: 2 step: 251, loss is 0.5158
epoch: 2 step: 252, loss is 0.6121
epoch: 2 step: 253, loss is 0.4672
epoch: 2 step: 254, loss is 0.5177
epoch: 2 step: 255, loss is 0.5891
epoch: 2 step: 256, loss is 0.5838
epoch: 2 step: 257, loss is 0.5129
epoch: 2 step: 258, loss is 0.4615
epoch: 2 step: 259, loss is 0.4765
epoch: 2 step: 260, 

epoch: 2 step: 321, loss is 0.3705
epoch: 2 step: 322, loss is 0.4149
epoch: 2 step: 323, loss is 0.4527
epoch: 2 step: 324, loss is 0.3693
epoch: 2 step: 325, loss is 0.4761
epoch: 2 step: 326, loss is 0.3317
epoch: 2 step: 327, loss is 0.5316
epoch: 2 step: 328, loss is 0.4163
epoch: 2 step: 329, loss is 0.3904
epoch: 2 step: 330, loss is 0.6191
epoch: 2 step: 331, loss is 0.3622
epoch: 2 step: 332, loss is 0.4183
epoch: 2 step: 333, loss is 0.5975
epoch: 2 step: 334, loss is 0.3783
epoch: 2 step: 335, loss is 0.4401
epoch: 2 step: 336, loss is 0.3810
epoch: 2 step: 337, loss is 0.3814
epoch: 2 step: 338, loss is 0.4297
epoch: 2 step: 339, loss is 0.2906
epoch: 2 step: 340, loss is 0.3323
epoch: 2 step: 341, loss is 0.4465
epoch: 2 step: 342, loss is 0.4510
epoch: 2 step: 343, loss is 0.4552
epoch: 2 step: 344, loss is 0.3955
epoch: 2 step: 345, loss is 0.3395
epoch: 2 step: 346, loss is 0.5065
epoch: 2 step: 347, loss is 0.4705
epoch: 2 step: 348, loss is 0.4732
epoch: 2 step: 349, 

epoch: 3 step: 18, loss is 0.3993
epoch: 3 step: 19, loss is 0.4321
epoch: 3 step: 20, loss is 0.3459
epoch: 3 step: 21, loss is 0.3473
epoch: 3 step: 22, loss is 0.4423
epoch: 3 step: 23, loss is 0.5265
epoch: 3 step: 24, loss is 0.4170
epoch: 3 step: 25, loss is 0.4483
epoch: 3 step: 26, loss is 0.5304
epoch: 3 step: 27, loss is 0.4433
epoch: 3 step: 28, loss is 0.4486
epoch: 3 step: 29, loss is 0.3785
epoch: 3 step: 30, loss is 0.4524
epoch: 3 step: 31, loss is 0.4300
epoch: 3 step: 32, loss is 0.3490
epoch: 3 step: 33, loss is 0.4418
epoch: 3 step: 34, loss is 0.4400
epoch: 3 step: 35, loss is 0.4215
epoch: 3 step: 36, loss is 0.4959
epoch: 3 step: 37, loss is 0.4083
epoch: 3 step: 38, loss is 0.3641
epoch: 3 step: 39, loss is 0.4726
epoch: 3 step: 40, loss is 0.3642
epoch: 3 step: 41, loss is 0.4058
epoch: 3 step: 42, loss is 0.4929
epoch: 3 step: 43, loss is 0.3960
epoch: 3 step: 44, loss is 0.5293
epoch: 3 step: 45, loss is 0.4512
epoch: 3 step: 46, loss is 0.4348
epoch: 3 step:

epoch: 3 step: 107, loss is 0.4050
epoch: 3 step: 108, loss is 0.4224
epoch: 3 step: 109, loss is 0.3945
epoch: 3 step: 110, loss is 0.3166
epoch: 3 step: 111, loss is 0.4504
epoch: 3 step: 112, loss is 0.4167
epoch: 3 step: 113, loss is 0.4151
epoch: 3 step: 114, loss is 0.4592
epoch: 3 step: 115, loss is 0.4591
epoch: 3 step: 116, loss is 0.4377
epoch: 3 step: 117, loss is 0.3935
epoch: 3 step: 118, loss is 0.4603
epoch: 3 step: 119, loss is 0.4321
epoch: 3 step: 120, loss is 0.3649
epoch: 3 step: 121, loss is 0.2203
epoch: 3 step: 122, loss is 0.4187
epoch: 3 step: 123, loss is 0.4314
epoch: 3 step: 124, loss is 0.4402
epoch: 3 step: 125, loss is 0.4183
epoch: 3 step: 126, loss is 0.2995
epoch: 3 step: 127, loss is 0.5258
epoch: 3 step: 128, loss is 0.3425
epoch: 3 step: 129, loss is 0.4904
epoch: 3 step: 130, loss is 0.3656
epoch: 3 step: 131, loss is 0.2937
epoch: 3 step: 132, loss is 0.3514
epoch: 3 step: 133, loss is 0.4062
epoch: 3 step: 134, loss is 0.4585
epoch: 3 step: 135, 

epoch: 3 step: 196, loss is 0.3696
epoch: 3 step: 197, loss is 0.3521
epoch: 3 step: 198, loss is 0.3601
epoch: 3 step: 199, loss is 0.4757
epoch: 3 step: 200, loss is 0.4163
epoch: 3 step: 201, loss is 0.3398
epoch: 3 step: 202, loss is 0.4203
epoch: 3 step: 203, loss is 0.3198
epoch: 3 step: 204, loss is 0.3190
epoch: 3 step: 205, loss is 0.3116
epoch: 3 step: 206, loss is 0.3934
epoch: 3 step: 207, loss is 0.4535
epoch: 3 step: 208, loss is 0.4659
epoch: 3 step: 209, loss is 0.3414
epoch: 3 step: 210, loss is 0.4802
epoch: 3 step: 211, loss is 0.5756
epoch: 3 step: 212, loss is 0.3171
epoch: 3 step: 213, loss is 0.4107
epoch: 3 step: 214, loss is 0.3674
epoch: 3 step: 215, loss is 0.4184
epoch: 3 step: 216, loss is 0.3420
epoch: 3 step: 217, loss is 0.6002
epoch: 3 step: 218, loss is 0.2872
epoch: 3 step: 219, loss is 0.3229
epoch: 3 step: 220, loss is 0.4415
epoch: 3 step: 221, loss is 0.3746
epoch: 3 step: 222, loss is 0.2635
epoch: 3 step: 223, loss is 0.3991
...

epoch: 3 step: 285, loss is 0.4163
epoch: 3 step: 286, loss is 0.4400
epoch: 3 step: 287, loss is 0.5866
epoch: 3 step: 288, loss is 0.5641
epoch: 3 step: 289, loss is 0.4612
epoch: 3 step: 290, loss is 0.2980
epoch: 3 step: 291, loss is 0.4731
epoch: 3 step: 292, loss is 0.3319
epoch: 3 step: 293, loss is 0.2109
epoch: 3 step: 294, loss is 0.3556
epoch: 3 step: 295, loss is 0.5077
epoch: 3 step: 296, loss is 0.3730
epoch: 3 step: 297, loss is 0.3788
epoch: 3 step: 298, loss is 0.4189
epoch: 3 step: 299, loss is 0.4771
epoch: 3 step: 300, loss is 0.4764
epoch: 3 step: 301, loss is 0.2127
epoch: 3 step: 302, loss is 0.3632
epoch: 3 step: 303, loss is 0.4322
epoch: 3 step: 304, loss is 0.2149
epoch: 3 step: 305, loss is 0.3922
epoch: 3 step: 306, loss is 0.3648
epoch: 3 step: 307, loss is 0.4253
epoch: 3 step: 308, loss is 0.2997
epoch: 3 step: 309, loss is 0.4857
epoch: 3 step: 310, loss is 0.2400
epoch: 3 step: 311, loss is 0.3372
epoch: 3 step: 312, loss is 0.3999
...

epoch: 3 step: 374, loss is 0.3921
epoch: 3 step: 375, loss is 0.4149
epoch: 3 step: 376, loss is 0.4907
epoch: 3 step: 377, loss is 0.3688
epoch: 3 step: 378, loss is 0.3472
epoch: 3 step: 379, loss is 0.4601
epoch: 3 step: 380, loss is 0.3989
epoch: 3 step: 381, loss is 0.4383
epoch: 3 step: 382, loss is 0.4026
epoch: 3 step: 383, loss is 0.4012
epoch: 3 step: 384, loss is 0.3780
epoch: 3 step: 385, loss is 0.4996
epoch: 3 step: 386, loss is 0.4128
epoch: 3 step: 387, loss is 0.4403
epoch: 3 step: 388, loss is 0.3133
epoch: 3 step: 389, loss is 0.3768
epoch: 3 step: 390, loss is 0.3408
Epoch time: 41946.990, per step time: 107.556, avg loss: 0.403
************************************************************
epoch: 4 step: 1, loss is 0.4017
epoch: 4 step: 2, loss is 0.4795
epoch: 4 step: 3, loss is 0.2870
epoch: 4 step: 4, loss is 0.4298
epoch: 4 step: 5, loss is 0.3789
epoch: 4 step: 6, loss is 0.3850
epoch: 4 step: 7, loss is 0.5787
epoch: 4 step: 8, loss is 0.4739
...

epoch: 4 step: 71, loss is 0.3085
epoch: 4 step: 72, loss is 0.2767
epoch: 4 step: 73, loss is 0.3353
epoch: 4 step: 74, loss is 0.4800
epoch: 4 step: 75, loss is 0.2814
epoch: 4 step: 76, loss is 0.4233
epoch: 4 step: 77, loss is 0.2641
epoch: 4 step: 78, loss is 0.3865
epoch: 4 step: 79, loss is 0.2459
epoch: 4 step: 80, loss is 0.4205
epoch: 4 step: 81, loss is 0.4781
epoch: 4 step: 82, loss is 0.5155
epoch: 4 step: 83, loss is 0.3062
epoch: 4 step: 84, loss is 0.4246
epoch: 4 step: 85, loss is 0.4452
epoch: 4 step: 86, loss is 0.4439
epoch: 4 step: 87, loss is 0.3794
epoch: 4 step: 88, loss is 0.4272
epoch: 4 step: 89, loss is 0.3608
epoch: 4 step: 90, loss is 0.3053
epoch: 4 step: 91, loss is 0.3505
epoch: 4 step: 92, loss is 0.2630
epoch: 4 step: 93, loss is 0.4086
epoch: 4 step: 94, loss is 0.3074
epoch: 4 step: 95, loss is 0.2860
epoch: 4 step: 96, loss is 0.3472
epoch: 4 step: 97, loss is 0.4399
epoch: 4 step: 98, loss is 0.2984
epoch: 4 step: 99, loss is 0.5062
...

epoch: 4 step: 160, loss is 0.4387
epoch: 4 step: 161, loss is 0.3441
epoch: 4 step: 162, loss is 0.3684
epoch: 4 step: 163, loss is 0.3465
epoch: 4 step: 164, loss is 0.5299
epoch: 4 step: 165, loss is 0.5045
epoch: 4 step: 166, loss is 0.3958
epoch: 4 step: 167, loss is 0.3517
epoch: 4 step: 168, loss is 0.4668
epoch: 4 step: 169, loss is 0.2722
epoch: 4 step: 170, loss is 0.4252
epoch: 4 step: 171, loss is 0.4219
epoch: 4 step: 172, loss is 0.4034
epoch: 4 step: 173, loss is 0.4636
epoch: 4 step: 174, loss is 0.3881
epoch: 4 step: 175, loss is 0.3162
epoch: 4 step: 176, loss is 0.3936
epoch: 4 step: 177, loss is 0.3591
epoch: 4 step: 178, loss is 0.3104
epoch: 4 step: 179, loss is 0.2385
epoch: 4 step: 180, loss is 0.2899
epoch: 4 step: 181, loss is 0.3091
epoch: 4 step: 182, loss is 0.4573
epoch: 4 step: 183, loss is 0.4415
epoch: 4 step: 184, loss is 0.2995
epoch: 4 step: 185, loss is 0.2719
epoch: 4 step: 186, loss is 0.3571
epoch: 4 step: 187, loss is 0.3442
...

epoch: 4 step: 249, loss is 0.2710
epoch: 4 step: 250, loss is 0.3260
epoch: 4 step: 251, loss is 0.3744
epoch: 4 step: 252, loss is 0.2942
epoch: 4 step: 253, loss is 0.4133
epoch: 4 step: 254, loss is 0.2983
epoch: 4 step: 255, loss is 0.4217
epoch: 4 step: 256, loss is 0.3493
epoch: 4 step: 257, loss is 0.2805
epoch: 4 step: 258, loss is 0.3151
epoch: 4 step: 259, loss is 0.3350
epoch: 4 step: 260, loss is 0.5220
epoch: 4 step: 261, loss is 0.2808
epoch: 4 step: 262, loss is 0.2904
epoch: 4 step: 263, loss is 0.4144
epoch: 4 step: 264, loss is 0.3710
epoch: 4 step: 265, loss is 0.2993
epoch: 4 step: 266, loss is 0.3192
epoch: 4 step: 267, loss is 0.2591
epoch: 4 step: 268, loss is 0.4449
epoch: 4 step: 269, loss is 0.3405
epoch: 4 step: 270, loss is 0.3951
epoch: 4 step: 271, loss is 0.3147
epoch: 4 step: 272, loss is 0.3204
epoch: 4 step: 273, loss is 0.5377
epoch: 4 step: 274, loss is 0.3847
epoch: 4 step: 275, loss is 0.4134
epoch: 4 step: 276, loss is 0.3202
...

epoch: 4 step: 338, loss is 0.4460
epoch: 4 step: 339, loss is 0.3561
epoch: 4 step: 340, loss is 0.5193
epoch: 4 step: 341, loss is 0.4446
epoch: 4 step: 342, loss is 0.3434
epoch: 4 step: 343, loss is 0.3595
epoch: 4 step: 344, loss is 0.4241
epoch: 4 step: 345, loss is 0.2956
epoch: 4 step: 346, loss is 0.3377
epoch: 4 step: 347, loss is 0.3574
epoch: 4 step: 348, loss is 0.4708
epoch: 4 step: 349, loss is 0.4382
epoch: 4 step: 350, loss is 0.3674
epoch: 4 step: 351, loss is 0.5617
epoch: 4 step: 352, loss is 0.3479
epoch: 4 step: 353, loss is 0.4457
epoch: 4 step: 354, loss is 0.4470
epoch: 4 step: 355, loss is 0.3042
epoch: 4 step: 356, loss is 0.4274
epoch: 4 step: 357, loss is 0.3954
epoch: 4 step: 358, loss is 0.3816
epoch: 4 step: 359, loss is 0.3290
epoch: 4 step: 360, loss is 0.3382
epoch: 4 step: 361, loss is 0.4071
epoch: 4 step: 362, loss is 0.3767
epoch: 4 step: 363, loss is 0.4927
epoch: 4 step: 364, loss is 0.3349
epoch: 4 step: 365, loss is 0.3436
...

epoch: 5 step: 35, loss is 0.3577
epoch: 5 step: 36, loss is 0.4371
epoch: 5 step: 37, loss is 0.4086
epoch: 5 step: 38, loss is 0.1705
epoch: 5 step: 39, loss is 0.3365
epoch: 5 step: 40, loss is 0.3910
epoch: 5 step: 41, loss is 0.3509
epoch: 5 step: 42, loss is 0.4014
epoch: 5 step: 43, loss is 0.2674
epoch: 5 step: 44, loss is 0.3730
epoch: 5 step: 45, loss is 0.2710
epoch: 5 step: 46, loss is 0.2464
epoch: 5 step: 47, loss is 0.3998
epoch: 5 step: 48, loss is 0.2825
epoch: 5 step: 49, loss is 0.2899
epoch: 5 step: 50, loss is 0.2653
epoch: 5 step: 51, loss is 0.3137
epoch: 5 step: 52, loss is 0.2977
epoch: 5 step: 53, loss is 0.1626
epoch: 5 step: 54, loss is 0.3451
epoch: 5 step: 55, loss is 0.4533
epoch: 5 step: 56, loss is 0.3027
epoch: 5 step: 57, loss is 0.3573
epoch: 5 step: 58, loss is 0.2549
epoch: 5 step: 59, loss is 0.3431
epoch: 5 step: 60, loss is 0.3799
epoch: 5 step: 61, loss is 0.2788
epoch: 5 step: 62, loss is 0.2534
epoch: 5 step: 63, loss is 0.4903
...

epoch: 5 step: 124, loss is 0.3165
epoch: 5 step: 125, loss is 0.2910
epoch: 5 step: 126, loss is 0.4151
epoch: 5 step: 127, loss is 0.3650
epoch: 5 step: 128, loss is 0.4466
epoch: 5 step: 129, loss is 0.3491
epoch: 5 step: 130, loss is 0.3943
epoch: 5 step: 131, loss is 0.3831
epoch: 5 step: 132, loss is 0.3353
epoch: 5 step: 133, loss is 0.3608
epoch: 5 step: 134, loss is 0.3089
epoch: 5 step: 135, loss is 0.3661
epoch: 5 step: 136, loss is 0.2462
epoch: 5 step: 137, loss is 0.2555
epoch: 5 step: 138, loss is 0.3958
epoch: 5 step: 139, loss is 0.3909
epoch: 5 step: 140, loss is 0.4445
epoch: 5 step: 141, loss is 0.3978
epoch: 5 step: 142, loss is 0.4142
epoch: 5 step: 143, loss is 0.5226
epoch: 5 step: 144, loss is 0.4125
epoch: 5 step: 145, loss is 0.2795
epoch: 5 step: 146, loss is 0.3510
epoch: 5 step: 147, loss is 0.3275
epoch: 5 step: 148, loss is 0.5054
epoch: 5 step: 149, loss is 0.3694
epoch: 5 step: 150, loss is 0.5045
epoch: 5 step: 151, loss is 0.3543
...

epoch: 5 step: 213, loss is 0.4016
epoch: 5 step: 214, loss is 0.2758
epoch: 5 step: 215, loss is 0.4611
epoch: 5 step: 216, loss is 0.3102
epoch: 5 step: 217, loss is 0.3919
epoch: 5 step: 218, loss is 0.3644
epoch: 5 step: 219, loss is 0.3343
epoch: 5 step: 220, loss is 0.3409
epoch: 5 step: 221, loss is 0.3408
epoch: 5 step: 222, loss is 0.3310
epoch: 5 step: 223, loss is 0.3425
epoch: 5 step: 224, loss is 0.2430
epoch: 5 step: 225, loss is 0.2700
epoch: 5 step: 226, loss is 0.4033
epoch: 5 step: 227, loss is 0.3329
epoch: 5 step: 228, loss is 0.4596
epoch: 5 step: 229, loss is 0.3272
epoch: 5 step: 230, loss is 0.2274
epoch: 5 step: 231, loss is 0.4503
epoch: 5 step: 232, loss is 0.2505
epoch: 5 step: 233, loss is 0.3719
epoch: 5 step: 234, loss is 0.2949
epoch: 5 step: 235, loss is 0.3854
epoch: 5 step: 236, loss is 0.5405
epoch: 5 step: 237, loss is 0.3014
epoch: 5 step: 238, loss is 0.3945
epoch: 5 step: 239, loss is 0.3244
epoch: 5 step: 240, loss is 0.4346
...

epoch: 5 step: 302, loss is 0.3439
epoch: 5 step: 303, loss is 0.4070
epoch: 5 step: 304, loss is 0.4360
epoch: 5 step: 305, loss is 0.4695
epoch: 5 step: 306, loss is 0.2571
epoch: 5 step: 307, loss is 0.2597
epoch: 5 step: 308, loss is 0.3709
epoch: 5 step: 309, loss is 0.2729
epoch: 5 step: 310, loss is 0.3060
epoch: 5 step: 311, loss is 0.2724
epoch: 5 step: 312, loss is 0.4042
epoch: 5 step: 313, loss is 0.3170
epoch: 5 step: 314, loss is 0.2852
epoch: 5 step: 315, loss is 0.3810
epoch: 5 step: 316, loss is 0.4999
epoch: 5 step: 317, loss is 0.3802
epoch: 5 step: 318, loss is 0.4756
epoch: 5 step: 319, loss is 0.2718
epoch: 5 step: 320, loss is 0.4197
epoch: 5 step: 321, loss is 0.2601
epoch: 5 step: 322, loss is 0.2091
epoch: 5 step: 323, loss is 0.4082
epoch: 5 step: 324, loss is 0.2823
epoch: 5 step: 325, loss is 0.3926
epoch: 5 step: 326, loss is 0.2773
epoch: 5 step: 327, loss is 0.4278
epoch: 5 step: 328, loss is 0.2811
epoch: 5 step: 329, loss is 0.2949
...

Epoch time: 40547.118, per step time: 103.967, avg loss: 0.346
************************************************************
epoch: 6 step: 1, loss is 0.3137
epoch: 6 step: 2, loss is 0.3295
epoch: 6 step: 3, loss is 0.4285
epoch: 6 step: 4, loss is 0.2917
epoch: 6 step: 5, loss is 0.3357
epoch: 6 step: 6, loss is 0.3456
epoch: 6 step: 7, loss is 0.4375
epoch: 6 step: 8, loss is 0.3685
epoch: 6 step: 9, loss is 0.2734
epoch: 6 step: 10, loss is 0.2983
epoch: 6 step: 11, loss is 0.3373
epoch: 6 step: 12, loss is 0.3792
epoch: 6 step: 13, loss is 0.2534
epoch: 6 step: 14, loss is 0.2555
epoch: 6 step: 15, loss is 0.2536
epoch: 6 step: 16, loss is 0.2763
epoch: 6 step: 17, loss is 0.3496
epoch: 6 step: 18, loss is 0.2546
epoch: 6 step: 19, loss is 0.4003
epoch: 6 step: 20, loss is 0.4276
epoch: 6 step: 21, loss is 0.3958
epoch: 6 step: 22, loss is 0.2281
epoch: 6 step: 23, loss is 0.3480
epoch: 6 step: 24, loss is 0.3870
epoch: 6 step: 25, loss is 0.2697
epoch: 6 step: 26, loss is 0.2907
...

epoch: 6 step: 88, loss is 0.3815
epoch: 6 step: 89, loss is 0.3205
epoch: 6 step: 90, loss is 0.1674
epoch: 6 step: 91, loss is 0.3302
epoch: 6 step: 92, loss is 0.3680
epoch: 6 step: 93, loss is 0.3370
epoch: 6 step: 94, loss is 0.3272
epoch: 6 step: 95, loss is 0.3728
epoch: 6 step: 96, loss is 0.2415
epoch: 6 step: 97, loss is 0.3413
epoch: 6 step: 98, loss is 0.2772
epoch: 6 step: 99, loss is 0.3638
epoch: 6 step: 100, loss is 0.4868
epoch: 6 step: 101, loss is 0.2709
epoch: 6 step: 102, loss is 0.3050
epoch: 6 step: 103, loss is 0.3113
epoch: 6 step: 104, loss is 0.3130
epoch: 6 step: 105, loss is 0.2987
epoch: 6 step: 106, loss is 0.2144
epoch: 6 step: 107, loss is 0.4136
epoch: 6 step: 108, loss is 0.2410
epoch: 6 step: 109, loss is 0.3518
epoch: 6 step: 110, loss is 0.3474
epoch: 6 step: 111, loss is 0.2430
epoch: 6 step: 112, loss is 0.3468
epoch: 6 step: 113, loss is 0.3406
epoch: 6 step: 114, loss is 0.3484
epoch: 6 step: 115, loss is 0.3458
...

epoch: 6 step: 177, loss is 0.3473
epoch: 6 step: 178, loss is 0.4617
epoch: 6 step: 179, loss is 0.2574
epoch: 6 step: 180, loss is 0.2926
epoch: 6 step: 181, loss is 0.2689
epoch: 6 step: 182, loss is 0.2425
epoch: 6 step: 183, loss is 0.4197
epoch: 6 step: 184, loss is 0.3622
epoch: 6 step: 185, loss is 0.3172
epoch: 6 step: 186, loss is 0.2831
epoch: 6 step: 187, loss is 0.4395
epoch: 6 step: 188, loss is 0.3841
epoch: 6 step: 189, loss is 0.4334
epoch: 6 step: 190, loss is 0.5027
epoch: 6 step: 191, loss is 0.5141
epoch: 6 step: 192, loss is 0.3588
epoch: 6 step: 193, loss is 0.3650
epoch: 6 step: 194, loss is 0.3152
epoch: 6 step: 195, loss is 0.3063
epoch: 6 step: 196, loss is 0.3097
epoch: 6 step: 197, loss is 0.3507
epoch: 6 step: 198, loss is 0.2534
epoch: 6 step: 199, loss is 0.4216
epoch: 6 step: 200, loss is 0.4192
epoch: 6 step: 201, loss is 0.3980
epoch: 6 step: 202, loss is 0.3389
epoch: 6 step: 203, loss is 0.3186
epoch: 6 step: 204, loss is 0.5272
...

epoch: 6 step: 266, loss is 0.1851
epoch: 6 step: 267, loss is 0.3902
epoch: 6 step: 268, loss is 0.1962
epoch: 6 step: 269, loss is 0.2614
epoch: 6 step: 270, loss is 0.2919
epoch: 6 step: 271, loss is 0.4295
epoch: 6 step: 272, loss is 0.3681
epoch: 6 step: 273, loss is 0.2417
epoch: 6 step: 274, loss is 0.3749
epoch: 6 step: 275, loss is 0.3401
epoch: 6 step: 276, loss is 0.3363
epoch: 6 step: 277, loss is 0.3809
epoch: 6 step: 278, loss is 0.2851
epoch: 6 step: 279, loss is 0.3831
epoch: 6 step: 280, loss is 0.3269
epoch: 6 step: 281, loss is 0.2682
epoch: 6 step: 282, loss is 0.2464
epoch: 6 step: 283, loss is 0.3946
epoch: 6 step: 284, loss is 0.3671
epoch: 6 step: 285, loss is 0.2973
epoch: 6 step: 286, loss is 0.3856
epoch: 6 step: 287, loss is 0.4005
epoch: 6 step: 288, loss is 0.3100
epoch: 6 step: 289, loss is 0.4213
epoch: 6 step: 290, loss is 0.2163
epoch: 6 step: 291, loss is 0.2245
epoch: 6 step: 292, loss is 0.2426
epoch: 6 step: 293, loss is 0.3086
...

epoch: 6 step: 355, loss is 0.4533
epoch: 6 step: 356, loss is 0.2419
epoch: 6 step: 357, loss is 0.2371
epoch: 6 step: 358, loss is 0.3193
epoch: 6 step: 359, loss is 0.4685
epoch: 6 step: 360, loss is 0.3362
epoch: 6 step: 361, loss is 0.4437
epoch: 6 step: 362, loss is 0.3613
epoch: 6 step: 363, loss is 0.4118
epoch: 6 step: 364, loss is 0.3095
epoch: 6 step: 365, loss is 0.2669
epoch: 6 step: 366, loss is 0.2606
epoch: 6 step: 367, loss is 0.3994
epoch: 6 step: 368, loss is 0.2873
epoch: 6 step: 369, loss is 0.2830
epoch: 6 step: 370, loss is 0.2995
epoch: 6 step: 371, loss is 0.2545
epoch: 6 step: 372, loss is 0.2930
epoch: 6 step: 373, loss is 0.3777
epoch: 6 step: 374, loss is 0.5867
epoch: 6 step: 375, loss is 0.2580
epoch: 6 step: 376, loss is 0.1726
epoch: 6 step: 377, loss is 0.2685
epoch: 6 step: 378, loss is 0.2625
epoch: 6 step: 379, loss is 0.2591
epoch: 6 step: 380, loss is 0.3863
epoch: 6 step: 381, loss is 0.2968
epoch: 6 step: 382, loss is 0.3835
...

epoch: 7 step: 52, loss is 0.3062
epoch: 7 step: 53, loss is 0.3455
epoch: 7 step: 54, loss is 0.3581
epoch: 7 step: 55, loss is 0.2514
epoch: 7 step: 56, loss is 0.3478
epoch: 7 step: 57, loss is 0.2962
epoch: 7 step: 58, loss is 0.2631
epoch: 7 step: 59, loss is 0.2864
epoch: 7 step: 60, loss is 0.3093
epoch: 7 step: 61, loss is 0.2864
epoch: 7 step: 62, loss is 0.1889
epoch: 7 step: 63, loss is 0.3674
epoch: 7 step: 64, loss is 0.3365
epoch: 7 step: 65, loss is 0.3307
epoch: 7 step: 66, loss is 0.1550
epoch: 7 step: 67, loss is 0.2388
epoch: 7 step: 68, loss is 0.3041
epoch: 7 step: 69, loss is 0.3472
epoch: 7 step: 70, loss is 0.3063
epoch: 7 step: 71, loss is 0.2721
epoch: 7 step: 72, loss is 0.2984
epoch: 7 step: 73, loss is 0.2822
epoch: 7 step: 74, loss is 0.2518
epoch: 7 step: 75, loss is 0.3445
epoch: 7 step: 76, loss is 0.2901
epoch: 7 step: 77, loss is 0.3076
epoch: 7 step: 78, loss is 0.1980
epoch: 7 step: 79, loss is 0.1895
epoch: 7 step: 80, loss is 0.2033
...

epoch: 7 step: 141, loss is 0.4725
epoch: 7 step: 142, loss is 0.3928
epoch: 7 step: 143, loss is 0.3646
epoch: 7 step: 144, loss is 0.2601
epoch: 7 step: 145, loss is 0.4328
epoch: 7 step: 146, loss is 0.4251
epoch: 7 step: 147, loss is 0.2112
epoch: 7 step: 148, loss is 0.3383
epoch: 7 step: 149, loss is 0.3793
epoch: 7 step: 150, loss is 0.2300
epoch: 7 step: 151, loss is 0.3427
epoch: 7 step: 152, loss is 0.3089
epoch: 7 step: 153, loss is 0.3507
epoch: 7 step: 154, loss is 0.2947
epoch: 7 step: 155, loss is 0.2489
epoch: 7 step: 156, loss is 0.2677
epoch: 7 step: 157, loss is 0.3559
epoch: 7 step: 158, loss is 0.4911
epoch: 7 step: 159, loss is 0.1923
epoch: 7 step: 160, loss is 0.2644
epoch: 7 step: 161, loss is 0.2804
epoch: 7 step: 162, loss is 0.4733
epoch: 7 step: 163, loss is 0.3742
epoch: 7 step: 164, loss is 0.1808
epoch: 7 step: 165, loss is 0.3073
epoch: 7 step: 166, loss is 0.2948
epoch: 7 step: 167, loss is 0.2632
epoch: 7 step: 168, loss is 0.3022
...

epoch: 7 step: 230, loss is 0.4031
epoch: 7 step: 231, loss is 0.2659
epoch: 7 step: 232, loss is 0.4359
epoch: 7 step: 233, loss is 0.2296
epoch: 7 step: 234, loss is 0.3760
epoch: 7 step: 235, loss is 0.1930
epoch: 7 step: 236, loss is 0.4012
epoch: 7 step: 237, loss is 0.1525
epoch: 7 step: 238, loss is 0.4822
epoch: 7 step: 239, loss is 0.2978
epoch: 7 step: 240, loss is 0.2879
epoch: 7 step: 241, loss is 0.3184
epoch: 7 step: 242, loss is 0.3067
epoch: 7 step: 243, loss is 0.3059
epoch: 7 step: 244, loss is 0.3247
epoch: 7 step: 245, loss is 0.5435
epoch: 7 step: 246, loss is 0.3728
epoch: 7 step: 247, loss is 0.3015
epoch: 7 step: 248, loss is 0.2837
epoch: 7 step: 249, loss is 0.2077
epoch: 7 step: 250, loss is 0.1852
epoch: 7 step: 251, loss is 0.2704
epoch: 7 step: 252, loss is 0.3132
epoch: 7 step: 253, loss is 0.2244
epoch: 7 step: 254, loss is 0.2337
epoch: 7 step: 255, loss is 0.2662
epoch: 7 step: 256, loss is 0.1683
epoch: 7 step: 257, loss is 0.3610
...

epoch: 7 step: 319, loss is 0.2874
epoch: 7 step: 320, loss is 0.2773
epoch: 7 step: 321, loss is 0.3119
epoch: 7 step: 322, loss is 0.5180
epoch: 7 step: 323, loss is 0.2819
epoch: 7 step: 324, loss is 0.2582
epoch: 7 step: 325, loss is 0.3137
epoch: 7 step: 326, loss is 0.3719
epoch: 7 step: 327, loss is 0.2965
epoch: 7 step: 328, loss is 0.2923
epoch: 7 step: 329, loss is 0.2939
epoch: 7 step: 330, loss is 0.2711
epoch: 7 step: 331, loss is 0.2564
epoch: 7 step: 332, loss is 0.2319
epoch: 7 step: 333, loss is 0.2975
epoch: 7 step: 334, loss is 0.6099
epoch: 7 step: 335, loss is 0.3109
epoch: 7 step: 336, loss is 0.1355
epoch: 7 step: 337, loss is 0.4506
epoch: 7 step: 338, loss is 0.4515
epoch: 7 step: 339, loss is 0.3207
epoch: 7 step: 340, loss is 0.3045
epoch: 7 step: 341, loss is 0.2666
epoch: 7 step: 342, loss is 0.4119
epoch: 7 step: 343, loss is 0.2923
epoch: 7 step: 344, loss is 0.3069
epoch: 7 step: 345, loss is 0.2237
epoch: 7 step: 346, loss is 0.2427
...

epoch: 8 step: 16, loss is 0.2551
epoch: 8 step: 17, loss is 0.3402
epoch: 8 step: 18, loss is 0.2975
epoch: 8 step: 19, loss is 0.2487
epoch: 8 step: 20, loss is 0.2542
epoch: 8 step: 21, loss is 0.2751
epoch: 8 step: 22, loss is 0.3212
epoch: 8 step: 23, loss is 0.2760
epoch: 8 step: 24, loss is 0.1505
epoch: 8 step: 25, loss is 0.2349
epoch: 8 step: 26, loss is 0.1072
epoch: 8 step: 27, loss is 0.3493
epoch: 8 step: 28, loss is 0.1981
epoch: 8 step: 29, loss is 0.2218
epoch: 8 step: 30, loss is 0.2380
epoch: 8 step: 31, loss is 0.2702
epoch: 8 step: 32, loss is 0.2819
epoch: 8 step: 33, loss is 0.3173
epoch: 8 step: 34, loss is 0.2883
epoch: 8 step: 35, loss is 0.3038
epoch: 8 step: 36, loss is 0.3776
epoch: 8 step: 37, loss is 0.3619
epoch: 8 step: 38, loss is 0.3471
epoch: 8 step: 39, loss is 0.2261
epoch: 8 step: 40, loss is 0.2389
epoch: 8 step: 41, loss is 0.2973
epoch: 8 step: 42, loss is 0.3369
epoch: 8 step: 43, loss is 0.5723
epoch: 8 step: 44, loss is 0.3082
...

epoch: 8 step: 105, loss is 0.3339
epoch: 8 step: 106, loss is 0.3085
epoch: 8 step: 107, loss is 0.3561
epoch: 8 step: 108, loss is 0.3255
epoch: 8 step: 109, loss is 0.3709
epoch: 8 step: 110, loss is 0.2567
epoch: 8 step: 111, loss is 0.2285
epoch: 8 step: 112, loss is 0.1699
epoch: 8 step: 113, loss is 0.2693
epoch: 8 step: 114, loss is 0.4444
epoch: 8 step: 115, loss is 0.2116
epoch: 8 step: 116, loss is 0.3997
epoch: 8 step: 117, loss is 0.2387
epoch: 8 step: 118, loss is 0.2712
epoch: 8 step: 119, loss is 0.2482
epoch: 8 step: 120, loss is 0.2702
epoch: 8 step: 121, loss is 0.4016
epoch: 8 step: 122, loss is 0.3797
epoch: 8 step: 123, loss is 0.1121
epoch: 8 step: 124, loss is 0.2173
epoch: 8 step: 125, loss is 0.2104
epoch: 8 step: 126, loss is 0.2904
epoch: 8 step: 127, loss is 0.2524
epoch: 8 step: 128, loss is 0.2956
epoch: 8 step: 129, loss is 0.3088
epoch: 8 step: 130, loss is 0.2754
epoch: 8 step: 131, loss is 0.2397
epoch: 8 step: 132, loss is 0.3058
...

epoch: 8 step: 194, loss is 0.2501
epoch: 8 step: 195, loss is 0.1891
epoch: 8 step: 196, loss is 0.2274
epoch: 8 step: 197, loss is 0.3215
epoch: 8 step: 198, loss is 0.2382
epoch: 8 step: 199, loss is 0.3136
epoch: 8 step: 200, loss is 0.3687
epoch: 8 step: 201, loss is 0.1899
epoch: 8 step: 202, loss is 0.2513
epoch: 8 step: 203, loss is 0.2842
epoch: 8 step: 204, loss is 0.2917
epoch: 8 step: 205, loss is 0.2588
epoch: 8 step: 206, loss is 0.3324
epoch: 8 step: 207, loss is 0.3042
epoch: 8 step: 208, loss is 0.2606
epoch: 8 step: 209, loss is 0.3536
epoch: 8 step: 210, loss is 0.4595
epoch: 8 step: 211, loss is 0.2538
epoch: 8 step: 212, loss is 0.3812
epoch: 8 step: 213, loss is 0.1679
epoch: 8 step: 214, loss is 0.1868
epoch: 8 step: 215, loss is 0.4198
epoch: 8 step: 216, loss is 0.3415
epoch: 8 step: 217, loss is 0.2309
epoch: 8 step: 218, loss is 0.3316
epoch: 8 step: 219, loss is 0.3680
epoch: 8 step: 220, loss is 0.2453
epoch: 8 step: 221, loss is 0.4186
...

epoch: 8 step: 283, loss is 0.3052
epoch: 8 step: 284, loss is 0.3046
epoch: 8 step: 285, loss is 0.3282
epoch: 8 step: 286, loss is 0.2687
epoch: 8 step: 287, loss is 0.2085
epoch: 8 step: 288, loss is 0.2500
epoch: 8 step: 289, loss is 0.2477
epoch: 8 step: 290, loss is 0.1799
epoch: 8 step: 291, loss is 0.3890
epoch: 8 step: 292, loss is 0.2363
epoch: 8 step: 293, loss is 0.3996
epoch: 8 step: 294, loss is 0.3036
epoch: 8 step: 295, loss is 0.3625
epoch: 8 step: 296, loss is 0.3306
epoch: 8 step: 297, loss is 0.2989
epoch: 8 step: 298, loss is 0.3709
epoch: 8 step: 299, loss is 0.4077
epoch: 8 step: 300, loss is 0.3659
epoch: 8 step: 301, loss is 0.3173
epoch: 8 step: 302, loss is 0.2164
epoch: 8 step: 303, loss is 0.2811
epoch: 8 step: 304, loss is 0.2248
epoch: 8 step: 305, loss is 0.3226
epoch: 8 step: 306, loss is 0.4554
epoch: 8 step: 307, loss is 0.2045
epoch: 8 step: 308, loss is 0.2654
epoch: 8 step: 309, loss is 0.3877
epoch: 8 step: 310, loss is 0.3128
...

epoch: 8 step: 372, loss is 0.2237
epoch: 8 step: 373, loss is 0.1964
epoch: 8 step: 374, loss is 0.3240
epoch: 8 step: 375, loss is 0.4185
epoch: 8 step: 376, loss is 0.2762
epoch: 8 step: 377, loss is 0.2433
epoch: 8 step: 378, loss is 0.3024
epoch: 8 step: 379, loss is 0.3009
epoch: 8 step: 380, loss is 0.3313
epoch: 8 step: 381, loss is 0.2318
epoch: 8 step: 382, loss is 0.2963
epoch: 8 step: 383, loss is 0.3568
epoch: 8 step: 384, loss is 0.2718
epoch: 8 step: 385, loss is 0.3772
epoch: 8 step: 386, loss is 0.4922
epoch: 8 step: 387, loss is 0.4117
epoch: 8 step: 388, loss is 0.3131
epoch: 8 step: 389, loss is 0.3322
epoch: 8 step: 390, loss is 0.2457
Epoch time: 41559.509, per step time: 106.563, avg loss: 0.294
************************************************************
epoch: 9 step: 1, loss is 0.2256
epoch: 9 step: 2, loss is 0.3673
epoch: 9 step: 3, loss is 0.3487
epoch: 9 step: 4, loss is 0.2746
epoch: 9 step: 5, loss is 0.2949
epoch: 9 step: 6, loss is 0.2162
...

epoch: 9 step: 69, loss is 0.4718
epoch: 9 step: 70, loss is 0.4030
epoch: 9 step: 71, loss is 0.3980
epoch: 9 step: 72, loss is 0.2488
epoch: 9 step: 73, loss is 0.1879
epoch: 9 step: 74, loss is 0.3052
epoch: 9 step: 75, loss is 0.1858
epoch: 9 step: 76, loss is 0.1737
epoch: 9 step: 77, loss is 0.3333
epoch: 9 step: 78, loss is 0.1959
epoch: 9 step: 79, loss is 0.2411
epoch: 9 step: 80, loss is 0.2749
epoch: 9 step: 81, loss is 0.1702
epoch: 9 step: 82, loss is 0.1831
epoch: 9 step: 83, loss is 0.3682
epoch: 9 step: 84, loss is 0.1844
epoch: 9 step: 85, loss is 0.2799
epoch: 9 step: 86, loss is 0.2805
epoch: 9 step: 87, loss is 0.3685
epoch: 9 step: 88, loss is 0.2802
epoch: 9 step: 89, loss is 0.1326
epoch: 9 step: 90, loss is 0.1912
epoch: 9 step: 91, loss is 0.3006
epoch: 9 step: 92, loss is 0.1286
epoch: 9 step: 93, loss is 0.2179
epoch: 9 step: 94, loss is 0.1999
epoch: 9 step: 95, loss is 0.2278
epoch: 9 step: 96, loss is 0.1420
epoch: 9 step: 97, loss is 0.1676
...

epoch: 9 step: 158, loss is 0.2664
epoch: 9 step: 159, loss is 0.4234
epoch: 9 step: 160, loss is 0.2787
epoch: 9 step: 161, loss is 0.3272
epoch: 9 step: 162, loss is 0.3409
epoch: 9 step: 163, loss is 0.3722
epoch: 9 step: 164, loss is 0.2464
epoch: 9 step: 165, loss is 0.1451
epoch: 9 step: 166, loss is 0.3036
epoch: 9 step: 167, loss is 0.2150
epoch: 9 step: 168, loss is 0.2903
epoch: 9 step: 169, loss is 0.4836
epoch: 9 step: 170, loss is 0.2690
epoch: 9 step: 171, loss is 0.3030
epoch: 9 step: 172, loss is 0.2788
epoch: 9 step: 173, loss is 0.3095
epoch: 9 step: 174, loss is 0.3485
epoch: 9 step: 175, loss is 0.3854
epoch: 9 step: 176, loss is 0.2738
epoch: 9 step: 177, loss is 0.2012
epoch: 9 step: 178, loss is 0.1913
epoch: 9 step: 179, loss is 0.1811
epoch: 9 step: 180, loss is 0.2216
epoch: 9 step: 181, loss is 0.3418
epoch: 9 step: 182, loss is 0.4854
epoch: 9 step: 183, loss is 0.3358
epoch: 9 step: 184, loss is 0.1935
epoch: 9 step: 185, loss is 0.3501
...

epoch: 9 step: 247, loss is 0.3666
epoch: 9 step: 248, loss is 0.2445
epoch: 9 step: 249, loss is 0.2603
epoch: 9 step: 250, loss is 0.2571
epoch: 9 step: 251, loss is 0.4252
epoch: 9 step: 252, loss is 0.3173
epoch: 9 step: 253, loss is 0.2151
epoch: 9 step: 254, loss is 0.3287
epoch: 9 step: 255, loss is 0.2224
epoch: 9 step: 256, loss is 0.2287
epoch: 9 step: 257, loss is 0.2828
epoch: 9 step: 258, loss is 0.4278
epoch: 9 step: 259, loss is 0.2781
epoch: 9 step: 260, loss is 0.2918
epoch: 9 step: 261, loss is 0.2349
epoch: 9 step: 262, loss is 0.3005
epoch: 9 step: 263, loss is 0.2941
epoch: 9 step: 264, loss is 0.2351
epoch: 9 step: 265, loss is 0.3136
epoch: 9 step: 266, loss is 0.3938
epoch: 9 step: 267, loss is 0.1917
epoch: 9 step: 268, loss is 0.2223
epoch: 9 step: 269, loss is 0.1965
epoch: 9 step: 270, loss is 0.2173
epoch: 9 step: 271, loss is 0.3242
epoch: 9 step: 272, loss is 0.2942
epoch: 9 step: 273, loss is 0.3043
epoch: 9 step: 274, loss is 0.5046
...

epoch: 9 step: 336, loss is 0.2749
epoch: 9 step: 337, loss is 0.1938
epoch: 9 step: 338, loss is 0.2136
epoch: 9 step: 339, loss is 0.1703
epoch: 9 step: 340, loss is 0.1344
epoch: 9 step: 341, loss is 0.2446
epoch: 9 step: 342, loss is 0.2180
epoch: 9 step: 343, loss is 0.3273
epoch: 9 step: 344, loss is 0.3550
epoch: 9 step: 345, loss is 0.2465
epoch: 9 step: 346, loss is 0.2084
epoch: 9 step: 347, loss is 0.3962
epoch: 9 step: 348, loss is 0.2505
epoch: 9 step: 349, loss is 0.2329
epoch: 9 step: 350, loss is 0.3404
epoch: 9 step: 351, loss is 0.3228
epoch: 9 step: 352, loss is 0.2663
epoch: 9 step: 353, loss is 0.2314
epoch: 9 step: 354, loss is 0.4019
epoch: 9 step: 355, loss is 0.2190
epoch: 9 step: 356, loss is 0.2142
epoch: 9 step: 357, loss is 0.2802
epoch: 9 step: 358, loss is 0.2102
epoch: 9 step: 359, loss is 0.1795
epoch: 9 step: 360, loss is 0.2005
epoch: 9 step: 361, loss is 0.2372
epoch: 9 step: 362, loss is 0.1931
epoch: 9 step: 363, loss is 0.3196
...

epoch: 10 step: 33, loss is 0.2019
epoch: 10 step: 34, loss is 0.2363
epoch: 10 step: 35, loss is 0.1242
epoch: 10 step: 36, loss is 0.1880
epoch: 10 step: 37, loss is 0.2874
epoch: 10 step: 38, loss is 0.1517
epoch: 10 step: 39, loss is 0.2969
epoch: 10 step: 40, loss is 0.2387
epoch: 10 step: 41, loss is 0.1753
epoch: 10 step: 42, loss is 0.1604
epoch: 10 step: 43, loss is 0.2058
epoch: 10 step: 44, loss is 0.1899
epoch: 10 step: 45, loss is 0.1511
epoch: 10 step: 46, loss is 0.2173
epoch: 10 step: 47, loss is 0.1632
epoch: 10 step: 48, loss is 0.3122
epoch: 10 step: 49, loss is 0.3052
epoch: 10 step: 50, loss is 0.3136
epoch: 10 step: 51, loss is 0.3212
epoch: 10 step: 52, loss is 0.3128
epoch: 10 step: 53, loss is 0.2322
epoch: 10 step: 54, loss is 0.1590
epoch: 10 step: 55, loss is 0.2994
epoch: 10 step: 56, loss is 0.1690
epoch: 10 step: 57, loss is 0.2279
epoch: 10 step: 58, loss is 0.2540
epoch: 10 step: 59, loss is 0.3558
epoch: 10 step: 60, loss is 0.2341
...

epoch: 10 step: 122, loss is 0.3619
epoch: 10 step: 123, loss is 0.2152
epoch: 10 step: 124, loss is 0.3646
epoch: 10 step: 125, loss is 0.2300
epoch: 10 step: 126, loss is 0.2405
epoch: 10 step: 127, loss is 0.2607
epoch: 10 step: 128, loss is 0.3845
epoch: 10 step: 129, loss is 0.4600
epoch: 10 step: 130, loss is 0.3505
epoch: 10 step: 131, loss is 0.1911
epoch: 10 step: 132, loss is 0.1612
epoch: 10 step: 133, loss is 0.3517
epoch: 10 step: 134, loss is 0.2793
epoch: 10 step: 135, loss is 0.1697
epoch: 10 step: 136, loss is 0.1566
epoch: 10 step: 137, loss is 0.3282
epoch: 10 step: 138, loss is 0.3097
epoch: 10 step: 139, loss is 0.2631
epoch: 10 step: 140, loss is 0.3907
epoch: 10 step: 141, loss is 0.3358
epoch: 10 step: 142, loss is 0.3061
epoch: 10 step: 143, loss is 0.1727
epoch: 10 step: 144, loss is 0.2522
epoch: 10 step: 145, loss is 0.3008
epoch: 10 step: 146, loss is 0.3309
epoch: 10 step: 147, loss is 0.3308
epoch: 10 step: 148, loss is 0.2165
...

epoch: 10 step: 211, loss is 0.3747
epoch: 10 step: 212, loss is 0.1915
epoch: 10 step: 213, loss is 0.2435
epoch: 10 step: 214, loss is 0.1964
epoch: 10 step: 215, loss is 0.1412
epoch: 10 step: 216, loss is 0.3663
epoch: 10 step: 217, loss is 0.2127
epoch: 10 step: 218, loss is 0.3638
epoch: 10 step: 219, loss is 0.2969
epoch: 10 step: 220, loss is 0.2878
epoch: 10 step: 221, loss is 0.3518
epoch: 10 step: 222, loss is 0.2342
epoch: 10 step: 223, loss is 0.2159
epoch: 10 step: 224, loss is 0.3619
epoch: 10 step: 225, loss is 0.2785
epoch: 10 step: 226, loss is 0.2721
epoch: 10 step: 227, loss is 0.2554
epoch: 10 step: 228, loss is 0.3147
epoch: 10 step: 229, loss is 0.2355
epoch: 10 step: 230, loss is 0.2799
epoch: 10 step: 231, loss is 0.3037
epoch: 10 step: 232, loss is 0.3153
epoch: 10 step: 233, loss is 0.2251
epoch: 10 step: 234, loss is 0.3054
epoch: 10 step: 235, loss is 0.2202
epoch: 10 step: 236, loss is 0.3073
epoch: 10 step: 237, loss is 0.2066
epoch: 10 step: 238, loss is ... (output truncated)

epoch: 10 step: 300, loss is 0.2784
epoch: 10 step: 301, loss is 0.2806
epoch: 10 step: 302, loss is 0.2436
epoch: 10 step: 303, loss is 0.3769
epoch: 10 step: 304, loss is 0.3425
epoch: 10 step: 305, loss is 0.2269
epoch: 10 step: 306, loss is 0.4220
epoch: 10 step: 307, loss is 0.2467
epoch: 10 step: 308, loss is 0.1316
epoch: 10 step: 309, loss is 0.1762
epoch: 10 step: 310, loss is 0.3126
epoch: 10 step: 311, loss is 0.3991
epoch: 10 step: 312, loss is 0.1567
epoch: 10 step: 313, loss is 0.2893
epoch: 10 step: 314, loss is 0.1417
epoch: 10 step: 315, loss is 0.2252
epoch: 10 step: 316, loss is 0.2381
epoch: 10 step: 317, loss is 0.2423
epoch: 10 step: 318, loss is 0.2374
epoch: 10 step: 319, loss is 0.2307
epoch: 10 step: 320, loss is 0.0773
epoch: 10 step: 321, loss is 0.2638
epoch: 10 step: 322, loss is 0.2122
epoch: 10 step: 323, loss is 0.3638
epoch: 10 step: 324, loss is 0.2257
epoch: 10 step: 325, loss is 0.1227
epoch: 10 step: 326, loss is 0.2076
epoch: 10 step: 327, loss is ... (output truncated)

epoch: 10 step: 389, loss is 0.2334
epoch: 10 step: 390, loss is 0.1966
Epoch time: 43320.815, per step time: 111.079, avg loss: 0.262
************************************************************
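
The summary line at the end of the log can be sanity-checked with a little arithmetic: the per-step time should equal the epoch time divided by the number of steps. The sketch below assumes 390 steps per epoch (the last step number printed above); `parse_summary` is a hypothetical helper written for this illustration, not part of MindSpore.

```python
import re


def parse_summary(line):
    """Turn 'name: value, name: value' pairs into a dict of floats."""
    pairs = re.findall(r"([A-Za-z ]+): ([\d.]+)", line)
    return {name.strip(): float(value) for name, value in pairs}


summary = "Epoch time: 43320.815, per step time: 111.079, avg loss: 0.262"
steps_per_epoch = 390  # assumption: last step number shown in the log

fields = parse_summary(summary)
derived_step_time = fields["Epoch time"] / steps_per_epoch

# The derived value should agree with the logged one to three decimals.
print(round(derived_step_time, 3), fields["per step time"])
```

If the times are in milliseconds, as their magnitude suggests, the final epoch took roughly 43 seconds.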


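## Model Validation

Create and load the validation dataset (`ds_eval`), load the checkpoint file saved during **training**, and run the evaluation to check the quality of the model. This step takes about 30 seconds.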

In [15]:
from mindspore.train.serialization import load_checkpoint, load_param_into_net

# Checkpoint saved at the last step (390) of the final training epoch.
args.ckpt_path = f'./lstm-{cfg.num_epochs}_390.ckpt'
print("============== Starting Testing ==============")
# Build the evaluation dataset from the preprocessed data.
ds_eval = lstm_create_dataset(args.preprocess_path, cfg.batch_size, training=False)
# Load the trained weights into the network before evaluating.
param_dict = load_checkpoint(args.ckpt_path)
load_param_into_net(network, param_dict)
# Dataset sink mode is turned off when running on CPU.
if args.device_target == "CPU":
    acc = model.eval(ds_eval, dataset_sink_mode=False)
else:
    acc = model.eval(ds_eval)
print("============== {} ==============".format(acc))
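
The checkpoint path above hard-codes the step count (390), so it has to be edited by hand whenever the batch size or dataset size changes. As a minimal sketch, the latest checkpoint could instead be looked up on disk. This assumes the files follow the `<prefix>-<epoch>_<step>.ckpt` pattern seen in the path above; `find_latest_checkpoint` is a hypothetical helper, not a MindSpore API.

```python
import re
from pathlib import Path

# Assumed naming pattern, taken from the path used above:
# <prefix>-<epoch>_<step>.ckpt
CKPT_PATTERN = re.compile(r"^(?P<prefix>.+)-(?P<epoch>\d+)_(?P<step>\d+)\.ckpt$")


def find_latest_checkpoint(directory, prefix="lstm"):
    """Return the checkpoint with the highest (epoch, step), or None."""
    best_key, best_path = None, None
    for path in Path(directory).glob(f"{prefix}-*.ckpt"):
        match = CKPT_PATTERN.match(path.name)
        if match is None or match.group("prefix") != prefix:
            continue  # skip files that do not follow the assumed pattern
        # Compare numerically so that epoch 10 ranks above epoch 9.
        key = (int(match.group("epoch")), int(match.group("step")))
        if best_key is None or key > best_key:
            best_key, best_path = key, path
    return best_path
```

With such a helper, `args.ckpt_path = str(find_latest_checkpoint('.'))` could replace the hard-coded path, provided at least one matching checkpoint exists.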




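### Evaluating the Training Result

The output of the code above shows that, after 10 epochs of training, the model classifies the sentiment of texts in the validation dataset with an accuracy of about 85%, which is a reasonably satisfactory result.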
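
To make the accuracy figure concrete, here is a minimal pure-Python sketch of how such a number is typically computed. It assumes the reported metric is plain classification accuracy (correct predictions divided by total samples) and that the network outputs one score per class. The scores and labels are made up for illustration and are not taken from the IMDB data.

```python
def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(
        1 for scores, label in zip(logits, labels) if argmax(scores) == label
    )
    return correct / len(labels)


# Hypothetical scores for four reviews: [negative_score, positive_score]
example_logits = [[2.1, -0.3], [0.2, 1.7], [1.5, 0.9], [-0.4, 0.8]]
example_labels = [0, 1, 1, 1]  # 0 = negative, 1 = positive

print(accuracy(example_logits, example_labels))  # prints 0.75
```

Three of the four hypothetical reviews are classified correctly, giving 0.75. An accuracy of about 85% means roughly 17 out of every 20 validation reviews are labelled correctly.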

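## Summary

This completes the walkthrough of a natural language processing application with MindSpore. Along the way, we saw how to use MindSpore to tackle a sentiment classification problem, and how to define and initialize the LSTM-based `SentimentNet` network, train it, and validate its accuracy.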