Commit d0d7d887 authored on 2019/3/18 by easternDay

A few notes on the data:
1. To keep every poem the same length, poems longer than four clauses were trimmed to just their first two and last two clauses (see the sketch below);
2. Every poem is related to `春` (spring).
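For illustration, here is a minimal sketch of that trimming step. It is an assumption about the preprocessing, not the original script: the function name, the one-poem-per-line layout, and the clause-splitting rule are all hypothetical.

```python
# Hypothetical sketch of note 1: keep only the first two and last two
# clauses of any poem longer than four clauses.
import re

def trim_poem(poem: str) -> str:
    # Split on the Chinese comma and full stop, dropping empty pieces.
    clauses = [c for c in re.split(r"[,。]", poem.strip()) if c]
    if len(clauses) <= 4:
        return poem.strip()
    first_two, last_two = clauses[:2], clauses[-2:]
    return f"{first_two[0]},{first_two[1]}。{last_two[0]},{last_two[1]}。"
```

For example, applied to the full eight-clause 《春夜喜雨》, this returns exactly the four-clause line kept in the dataset: `好雨知时节,当春乃发生。晓看红湿处,花重锦官城。`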
按照惯例,虽然每星期一都要升国旗,但这个星期一是与往日截然不同的。没错,这一天是光荣的,因为我成为了10个升旗手中的一员。当然作为新手,在无数人的注视之下,我的心中还是无比的紧张与焦虑。但我不能浪费这个机会,这是个很好的机会来锻炼自己,提升自己,所以我必须面对大家。走到主席台旁,面向大家,唱着国歌,感觉自己就是个正式的护旗手,好像自己置身于天安门前,而外身披华丽外套,而内心波澜起伏,可能是军人的作风,感觉自己承担着中华民族伟大复兴的重要任务,而当我看见有些人边升国旗边笑时,又是无比地认为他们身在福中不知福,替他们感到悲哀。经历这样一次当护旗手的行列,我不仅体会到中华人民的使命,又想在以后好好规划自己,为祖国做出贡献。
今天做广播操时,发生一件突发事件。当我们做到扩胸运动时,我的余光扫射到右边同班女生,身体似乎有些摇摇晃晃。她平时都是做广播操最标准,最卖力的那个,而今天却这种情况,我感觉她随时有可能晕倒的可能,所以我关注着她。果然不出所料,她一下子脚一软,晕倒在地。我作为男生,不能随便碰女生;作为前后桌,关系不错。不管了,生命要紧。我当机立断背起她,送她到医务室。等了一会儿,医务室老师说是轻微晕倒,问题不大,多休息就行,我也为她深吸了一口气。
电,在我们的生活中无处不在,但它们如果都在一瞬间消失殆尽,世界将陷入一片黑暗之中。没有电,我们的生活将会一团糟:医生使用不了无影灯做手术,各种仪器也将无法使用;工厂的机器将会停止转动,夜班的工人无法工作;人们只能靠油灯或月光来照明,而且无法在夜间活动……如果没有电,发生各种自然灾害时,只能靠人力搜索伤员,伤亡人数会大大增加。灾后重建也是个大问题,破损的房屋、楼房不计其数,没有办法使用起重机等工具,房子就只能由人工建成。一栋大楼,想建成也要花个一年半载的。在这期间,那些灾民将如何安顿呢?最重要的是,大家要节约用电,让黑暗永远不要到来。
早饭,我决定做烧饼夹煎蛋,感觉自己应该会做。我早早起床,梳理完毕,便既兴奋又紧张的拿着钱,购买我要的食材去了,首先,买了三个烧饼,一棵生菜。然后最紧张最害怕的环节到了,那就是煎鸡蛋,我轻手轻脚的打开电磁炉,按着妈妈说的,等锅里的水都没有时,倒入少许的油,把电磁炉调成小火,我好怕油会烫着我,小心翼翼的打开鸡蛋,第一个鸡蛋掉入锅中,油有些溅出,鸡蛋离锅太高了。打第二个鸡蛋时,我把手放低了些,这次还好没有溅出油来。哎!真不容易啊,终于,鸡蛋熟了,我在鸡蛋上涂好一层辣酱。最后,我把生菜洗好,把生菜、鸡蛋夹在烧饼里。烧饼夹鸡蛋做好了,我把第一个给了妈妈,妈妈很开心呢,说我做得很好,这是我第一次做饭呢,很有成就感,但又一想如果让我天天做,顿顿做,我会不会烦呢?看着妈妈,心想:妈妈一日三餐为我做饭,还变着样的做我爱吃的真是辛苦啊!
获胜目标就是在班上和三楼分别连续待上3分钟,不被抓到,就算获胜。游戏开始了,抓的人是姜雨欣,我们想一匹匹脱缰的野马一样飞快的分散了。我跟着陈哲睿跑到了三楼,忽然我似乎看见了一只手在我们的面前,我立刻感觉不妙,连忙逃走,姜雨欣也跟上来了,我灵机一动,把头面向墙那里,伪装成人群。果然她没有管我,去追陈哲睿了,我暗自庆幸,悄悄的躲在了班上,时间一分一秒过去,外面抓捕的声音越来越小,似乎只剩我一人了。时间还差一分钟!我想着,躲在了桌子下面。外面传来一阵开门声,我忍不住了,看了一眼,居然是姜雨欣!她发现了我,时间还剩30秒,我在班上飞快的跑着。10、9、8、7、6、5、4、3、2、1。时间到!我成功赢了这局游戏,成为了最后的“幸存者”!
今天晚上我写完作业,吃过晚饭后,妈妈说:“你陪我去风格秀理发好吗?”我高兴的答到:“好啊!好啊!”我和妈妈一块很快的到达了风格秀,到了那里,妈妈先洗了洗头发,然后一位叔叔开始给妈妈理发,理过发后妈妈决定要烫头,于是叔叔拿了很多的夹子和纸片,把妈妈头发卷了起来,然后开始加热看着妈妈的样子感觉真好笑,加热了十几分钟之后,然后给妈妈洗头,吹干后一看,哇,妈妈的头发好漂亮啊!
晚上的公园可热闹了。五彩斑斓的灯光,照着公园的地上,金光闪闪像白天一样,公园的中心最热闹了,有的人溜冰、有的人跳舞、有的在抽鞭、还有的人在散步、还有人在打羽毛球,公园里还有5个小亭子,亭子里有的人在椅子上下象棋、有的人在听收音机、还有的人坐着休息,抽鞭的人看起来可威武了,他们手上拿着一个鞭在天上抡一下,就像鞭炮一样的声音,啪、啪、帕。
12月来了,全国天气逐渐变冷了,我们南方也不例外。一早上起床就有一股冷空气扑面而来,冷的我瑟瑟发抖,恨不得要再次往被窝里钻。我慌忙地找衣服,鼻子一阵发痒。“阿嚏”我仰天长啸,打起了一个喷嚏,我不由得感叹:天气冷了,新年也快到了,很快就能放鞭炮啦!
# textgenrnn Datasets
Here are a few datasets for experimenting with textgenrnn. All of them were obtained by running the given query in [BigQuery](https://cloud.google.com/bigquery/).
## Hacker News
Top 2000 Hacker News submissions by score. To reproduce:
```sql
#standardSQL
SELECT title
FROM `bigquery-public-data.hacker_news.full`
WHERE type = 'story'
ORDER BY score DESC
LIMIT 2000
```
Save the results to Google Sheets and download them as a .tsv (*not* a .csv: the CSV export wraps titles that contain commas in quotes, and those quote characters end up in the training text!).
Replace `2000` with a larger number if necessary (up to `10000`).
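Alternatively, you can skip the Google Sheets export and fetch the results directly with the official BigQuery Python client, writing one title per line (the plain-text layout textgenrnn's `train_from_file` expects). A minimal sketch, assuming Google Cloud credentials are configured; the output filename is arbitrary:

```python
# Sketch: run the Hacker News query with the BigQuery Python client
# (pip install google-cloud-bigquery) and save one title per line.
from google.cloud import bigquery

client = bigquery.Client()  # uses your default project and credentials
query = """
    SELECT title
    FROM `bigquery-public-data.hacker_news.full`
    WHERE type = 'story'
    ORDER BY score DESC
    LIMIT 2000
"""
rows = client.query(query).result()  # runs the job and waits for completion

with open("hacker_news_titles.txt", "w", encoding="utf-8") as f:
    for row in rows:
        if row.title:  # a few rows can have NULL titles
            f.write(row.title + "\n")
```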
## Reddit Subreddit Data
Top 1000 submissions by score for each subreddit included in the query, from January 2017 through June 2017. To reproduce in BigQuery:
```sql
#standardSQL
SELECT title FROM (
  SELECT
    title,
    ROW_NUMBER() OVER (PARTITION BY subreddit ORDER BY score DESC) AS score_rank
  FROM `fh-bigquery.reddit_posts.*`
  WHERE (_TABLE_SUFFIX BETWEEN '2017_01' AND '2017_06')
    AND LOWER(subreddit) IN ("legaladvice", "relationship_advice")
)
WHERE score_rank <= 1000
```
Save the results to Google Sheets and download them as a .tsv (*not* a .csv: the CSV export wraps titles that contain commas in quotes, and those quote characters end up in the training text!).
Change `1000` and the included subreddits as appropriate (make sure the total does not exceed 10000!).
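Once downloaded, a small cleanup pass helps before training: the Sheets export adds a `title` header row, and the union of subreddits can contain blank or duplicate titles. A minimal sketch; both filenames are placeholders:

```python
# Hypothetical post-download cleanup: drop the header row, blank lines,
# and duplicate titles from the exported .tsv.
seen = set()
with open("reddit_titles.tsv", encoding="utf-8") as src, \
     open("reddit_titles_clean.txt", "w", encoding="utf-8") as dst:
    next(src)  # skip the "title" header row from the Sheets export
    for line in src:
        title = line.rstrip("\n")
        if title and title not in seen:
            seen.add(title)
            dst.write(title + "\n")
```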
春眠不觉晓,处处闻啼鸟。夜来风雨声,花落知多少。
慈母手中线,游子身上衣。谁言寸草心,报得三春晖。
好雨知时节,当春乃发生。晓看红湿处,花重锦官城。
红豆生南国,春来发几枝。愿君多采撷,此物最相思。
国破山河在,城春草木深。白头搔更短,浑欲不胜簪。
{"rnn_layers": 2, "rnn_size": 64, "rnn_bidirectional": true, "max_length": 40, "max_words": 10000, "dim_embeddings": 300, "word_level": false, "single_text": false, "name": "shakespeare"}
{" ": 1, "e": 2, "t": 3, "o": 4, "a": 5, "n": 6, "h": 7, "s": 8, "r": 9, "i": 10, "l": 11, "d": 12, "u": 13, "m": 14, ",": 15, "y": 16, "c": 17, "w": 18, "f": 19, "g": 20, "I": 21, "p": 22, "b": 23, "A": 24, ".": 25, "v": 26, "E": 27, "T": 28, "S": 29, "k": 30, "O": 31, "N": 32, "'": 33, "R": 34, "L": 35, "C": 36, "H": 37, ";": 38, ":": 39, "W": 40, "U": 41, "M": 42, "B": 43, "D": 44, "P": 45, "F": 46, "?": 47, "G": 48, "-": 49, "!": 50, "Y": 51, "x": 52, "K": 53, "\f": 54, "|": 55, "V": 56, "j": 57, "q": 58, "J": 59, "z": 60, "1": 61, "2": 62, "Q": 63, "3": 64, "4": 65, "[": 66, "]": 67, "5": 68, "Z": 69, "X": 70, "6": 71, "7": 72, "&": 73, "8": 74, "9": 75, "0": 76, "(": 77, ")": 78, "<": 79, "@": 80, ">": 81, "$": 82, "<s>": 83}
from textgenrnn import textgenrnn
textgen = textgenrnn()
# Generate five texts:
# textgen.generate(5)

# Generate five texts that start with "Trump":
# generated_texts = textgen.generate(n=5, prefix="Trump", temperature=0.2, return_as_list=True)
# print(generated_texts)
texts = ['Never gonna give you up, never gonna let you down',
'Never gonna run around and desert you',
'Never gonna make you cry, never gonna say goodbye',
'Never gonna tell a lie and hurt you']
textgen.train_on_texts(texts, num_epochs=2, gen_epochs=2)
textgen.generate_to_file('textgenrnn_texts.txt', n=5)
# write the generated texts to a file
# Disabled alternative: train a new bidirectional character-level model
# on the Chinese poetry dataset.
'''
textgen = textgenrnn(name="chinese_poetry")
textgen.reset()
textgen.train_from_file('./datasets/chinese-poetry.txt',
new_model=True,
batch_size=4,
rnn_bidirectional=True,
rnn_size=64,
dim_embeddings=300,
num_epochs=20)
print(textgen.model.summary())
'''
# Disabled alternative: reload the trained Chinese poetry model from its
# saved weights/config/vocab and generate 20 samples.
'''
textgen = textgenrnn(
name="chinese_poetry",
weights_path='./chinese_poetry_weights.hdf5',
config_path='./chinese_poetry_config.json',
vocab_path='./chinese_poetry_vocab.json'
)
textgen.generate(20, temperature=1.0)
'''
# Disabled alternative: train the same architecture on the Shakespeare dataset.
'''
textgen = textgenrnn(name="shakespeare")
textgen.reset()
textgen.train_from_file('./datasets/shakespeare.txt',
new_model=True,
batch_size=4,
rnn_bidirectional=True,
rnn_size=64,
dim_embeddings=300,
num_epochs=20)
print(textgen.model.summary())
'''
Never don't let you
Never are you can gondly
Help and gonnaal desert
Intend you cut you
And churse let you can
from textgenrnn import textgenrnn
textgen = textgenrnn(
name="my.poem",
weights_path='./my.poem_weights.hdf5',
config_path='./my.poem_config.json',
vocab_path='./my.poem_vocab.json'
)
textgen.generate(20, temperature=1.0)
{"rnn_layers": 2, "rnn_size": 128, "rnn_bidirectional": true, "max_length": 25, "max_words": 10000, "dim_embeddings": 100, "word_level": false, "single_text": false, "name": "my.poem"}
from textgenrnn import textgenrnn
textgen = textgenrnn(name="my.poem")  # give the model a name, e.g. `poem`; all generated model files are prefixed with it
textgen.reset()  # reset the model to an untrained state
textgen.train_from_file(  # train the model from a data file
    file_path='../datasets/cn/5_chars_poem_2600.txt',  # path to the training data
    new_model=True,  # train a new model from scratch
    num_epochs=3,  # number of training epochs
    word_level=False,  # True: word-level model, False: character-level model
    rnn_bidirectional=True,  # use a bidirectional LSTM
    max_length=25,  # maximum number of preceding tokens used for prediction
)
{"rnn_layers": 2, "rnn_size": 128, "rnn_bidirectional": true, "max_length": 25, "max_words": 10000, "dim_embeddings": 100, "word_level": false, "single_text": false, "name": "expl"}
{",": 1, "我": 2, "的": 3, "了": 4, "一": 5, "。": 6, "妈": 7, "有": 8, "在": 9, "、": 10, "不": 11, "人": 12, "做": 13, "着": 14, "上": 15, "发": 16, "生": 17, "到": 18, "后": 19, "时": 20, "是": 21, "好": 22, "天": 23, "个": 24, "来": 25, "蛋": 26, "们": 27, "她": 28, "然": 29, "子": 30, "就": 31, "鸡": 32, "最": 33, "要": 34, "会": 35, "还": 36, "!": 37, "能": 38, "起": 39, "也": 40, "间": 41, "成": 42, "开": 43, "里": 44, "把": 45, "感": 46, "大": 47, "电": 48, "没": 49, "很": 50, "看": 51, "分": 52, "面": 53, "头": 54, "班": 55, "那": 56, "这": 57, "果": 58, "为": 59, "机": 60, "无": 61, "将": 62, ":": 63, "手": 64, "工": 65, "只": 66, "油": 67, "想": 68, "打": 69, "啊": 70, "样": 71, "快": 72, "鞭": 73, "动": 74, "光": 75, "些": 76, "觉": 77, "可": 78, "倒": 79, "下": 80, "作": 81, "说": 82, "气": 83, "中": 84, "如": 85, "用": 86, "法": 87, "楼": 88, "呢": 89, "早": 90, "饭": 91, "烧": 92, "饼": 93, "夹": 94, "理": 95, "去": 96, "三": 97, "小": 98, "心": 99, "第": 100, "洗": 101, "“": 102, "”": 103, "叔": 104, "热": 105, "公": 106, "园": 107, "冷": 108, "今": 109, "似": 110, "乎": 111, "都": 112, "而": 113, "种": 114, "晕": 115, "出": 116, "地": 117, ";": 118, "紧": 119, "医": 120, "等": 121, "轻": 122, "活": 123, "入": 124, "暗": 125, "使": 126, "灯": 127, "自": 128, "灾": 129, "加": 130, "重": 131, "建": 132, "房": 133, "顿": 134, "拿": 135, "菜": 136, "锅": 137, "次": 138, "真": 139, "给": 140, "得": 141, "和": 142, "3": 143, "钟": 144, "抓": 145, "始": 146, "姜": 147, "雨": 148, "欣": 149, "过": 150, "外": 151, "声": 152, "音": 153, "1": 154, "晚": 155, "广": 156, "播": 157, "操": 158, "件": 159, "当": 160, "女": 161, "摇": 162, "晃": 163, "标": 164, "力": 165, "随": 166, "所": 167, "关": 168, "脚": 169, "便": 170, "前": 171, "桌": 172, "管": 173, "立": 174, "务": 175, "室": 176, "问": 177, "题": 178, "多": 179, "休": 180, "息": 181, "但": 182, "片": 183, "黑": 184, "之": 185, "各": 186, "器": 187, "夜": 188, "靠": 189, "月": 190, "照": 191, "…": 192, "害": 193, "伤": 194, "数": 195, "由": 196, "年": 197, "节": 198, "让": 199, "决": 200, "定": 201, "煎": 202, "床": 203, "完": 204, "兴": 205, "又": 206, "张": 207, "买": 208, "先": 209, "怕": 210, "磁": 211, "炉": 212, "烫": 213, "翼": 214, "溅": 215, "高": 216, "放": 217, "!": 218, "于": 219, "?": 220, "变": 221, "吃": 222, "获": 223, "胜": 224, "连": 225, "被": 226, "游": 227, "戏": 228, "匹": 229, "飞": 230, "散": 231, "跟": 232, "陈": 233, "哲": 234, "睿": 235, "跑": 236, "忙": 237, "幸": 238, "悄": 239, "躲": 240, "秒": 241, "越": 242, "剩": 243, "阵": 244, "0": 245, "5": 246, "2": 247, "风": 248, "格": 249, "秀": 250, "闹": 251, "闪": 252, "像": 253, "抽": 254, "亭": 255, "炮": 256, "啪": 257, "瑟": 258, "嚏": 259, "突": 260, "事": 261, "扩": 262, "胸": 263, "运": 264, "余": 265, "扫": 266, "射": 267, "右": 268, "边": 269, "同": 270, "身": 271, "体": 272, "平": 273, "准": 274, "卖": 275, "却": 276, "情": 277, "况": 278, "以": 279, "注": 280, "料": 281, "软": 282, "男": 283, "碰": 284, "系": 285, "错": 286, "命": 287, "断": 288, "背": 289, "送": 290, "儿": 291, "老": 292, "师": 293, "微": 294, "行": 295, "深": 296, "吸": 297, "口": 298, "处": 299, "它": 300, "瞬": 301, "消": 302, "失": 303, "殆": 304, "尽": 305, "世": 306, "界": 307, "陷": 308, "团": 309, "糟": 310, "影": 311, "术": 312, "仪": 313, "厂": 314, "停": 315, "止": 316, "转": 317, "或": 318, "明": 319, "且": 320, "搜": 321, "索": 322, "员": 323, "亡": 324, "增": 325, "破": 326, "损": 327, "屋": 328, "计": 329, "其": 330, "办": 331, "具": 332, "栋": 333, "花": 334, "半": 335, "载": 336, "期": 337, "民": 338, "何": 339, "安": 340, "?": 341, "家": 342, "约": 343, "永": 344, "远": 345, "己": 346, "应": 347, "该": 348, "梳": 349, "毕": 350, "既": 351, "奋": 352, "钱": 353, "购": 354, "食": 355, "材": 356, "首": 357, "棵": 358, "环": 359, "按": 360, "水": 361, "少": 362, "许": 363, "调": 364, "火": 365, "掉": 366, 
"离": 367, "太": 368, "二": 369, "低": 370, "哎": 371, "容": 372, "易": 373, "终": 374, "熟": 375, "涂": 376, "层": 377, "辣": 378, "酱": 379, "烦": 380, "日": 381, "餐": 382, "爱": 383, "辛": 384, "苦": 385, "目": 386, "别": 387, "续": 388, "待": 389, "算": 390, "脱": 391, "缰": 392, "野": 393, "马": 394, "忽": 395, "见": 396, "刻": 397, "妙": 398, "逃": 399, "走": 400, "灵": 401, "向": 402, "墙": 403, "伪": 404, "装": 405, "群": 406, "追": 407, "庆": 408, "捕": 409, "差": 410, "传": 411, "门": 412, "忍": 413, "住": 414, "眼": 415, "居": 416, "现": 417, "9": 418, "8": 419, "7": 420, "6": 421, "4": 422, "功": 423, "赢": 424, "局": 425, "存": 426, "者": 427, "写": 428, "业": 429, "你": 430, "陪": 431, "吗": 432, "答": 433, "块": 434, "达": 435, "位": 436, "纸": 437, "卷": 438, "笑": 439, "十": 440, "几": 441, "吹": 442, "干": 443, "哇": 444, "漂": 445, "亮": 446, "五": 447, "彩": 448, "斑": 449, "斓": 450, "金": 451, "白": 452, "溜": 453, "冰": 454, "跳": 455, "舞": 456, "步": 457, "羽": 458, "毛": 459, "球": 460, "椅": 461, "象": 462, "棋": 463, "听": 464, "收": 465, "坐": 466, "威": 467, "武": 468, "他": 469, "抡": 470, "帕": 471, "全": 472, "国": 473, "逐": 474, "渐": 475, "南": 476, "方": 477, "例": 478, "股": 479, "空": 480, "扑": 481, "抖": 482, "恨": 483, "再": 484, "往": 485, "窝": 486, "钻": 487, "慌": 488, "找": 489, "衣": 490, "服": 491, "鼻": 492, "痒": 493, "阿": 494, "仰": 495, "长": 496, "啸": 497, "喷": 498, "叹": 499, "新": 500, "啦": 501, "<s>": 502}
from textgenrnn import textgenrnn
textgen = textgenrnn(
name="expl",
weights_path='./expl_weights.hdf5',
config_path='./expl_config.json',
vocab_path='./expl_vocab.json'
)
textgen.generate(20, temperature=1.0)
from textgenrnn import textgenrnn
textgen = textgenrnn(name="expl")  # give the model a name, e.g. `poem`; all generated model files are prefixed with it
textgen.reset()  # reset the model to an untrained state
textgen.train_from_file(  # train the model from a data file
    file_path='../datasets/cn/expl.txt',  # path to the training data
    new_model=True,  # train a new model from scratch
    num_epochs=3,  # number of training epochs
    word_level=False,  # True: word-level model, False: character-level model
    rnn_bidirectional=True,  # use a bidirectional LSTM
    max_length=25,  # maximum number of preceding tokens used for prediction
)