Commit ec307398
Authored Mar 09, 2018 by Hai Liang Wang

Fix invalid cosine value

Parent: e141ffe9
Showing 4 changed files with 20 additions and 13 deletions
demo.py                +7  -1
setup.py               +1  -1
synonyms/synonyms.py   +6  -11
synonyms/utils.py      +6  -0

demo.py
@@ -108,6 +108,12 @@ class Test(unittest.TestCase):
         r = synonyms.compare(sen1, sen2, seg=True)
         print("%s vs %s" % (sen1, sen2), r)
+        sen1 = "你们好呀"
+        sen2 = "大家好"
+        r = synonyms.compare(sen1, sen2, seg=False)
+        print("%s vs %s" % (sen1, sen2), r)

     def test_nearby(self):
         synonyms.display("奥运")  # synonyms.display calls synonyms.nearby
         synonyms.display("北新桥")  # synonyms.display calls synonyms.nearby
@@ -118,5 +124,5 @@ def test():
 if __name__ == '__main__':
-    FLAGS([__file__, '--verbosity', '-2'])
+    FLAGS([__file__, '--verbosity', '1'])
     test()

setup.py
@@ -13,7 +13,7 @@ Welcome
 setup(
     name='synonyms',
-    version='3.3.9',
+    version='3.3.10',
     description='Chinese Synonyms for Natural Language Processing and Understanding',
     long_description=LONGDOC,
     author='Hai Liang Wang, Hu Ying Xi',

synonyms/synonyms.py
@@ -20,7 +20,7 @@ from __future__ import division
 __copyright__ = "Copyright (c) 2017 . All Rights Reserved"
 __author__ = "Hu Ying Xi<>, Hai Liang Wang<hailiang.hl.wang@gmail.com>"
 __date__ = "2017-09-27"
-__version__ = "3.3.9"
+__version__ = "3.3.10"

 import os
 import sys
@@ -53,6 +53,7 @@ from .utils import any2utf8
 from .utils import any2unicode
 from .utils import sigmoid
 from .utils import cosine
+from .utils import is_digit
 import jieba
 from .jieba import posseg as _tokenizer
@@ -226,20 +227,14 @@ def _similarity_distance(s1, s2):
     '''
     compute similarity with distance measurement
     '''
-    # g = 0.0
+    g = 0.0
     try:
-        g = cosine(_flat_sum_array(_get_wv(s1)), _flat_sum_array(_get_wv(s2)))
+        g_ = cosine(_flat_sum_array(_get_wv(s1)), _flat_sum_array(_get_wv(s2)))
+        if is_digit(g_): g = g_
     except: pass
-    try:
-        g_nan_num = np.isnan(g).sum()
-        if g_nan_num == 100:
-            g = 0.0
-    except: pass

     u = _nearby_levenshtein_distance(s1, s2)
-    # print("g: %s, u: %s" % (g, u))
+    logging.debug("g: %s, u: %s" % (g, u))
     if u >= 0.99:
         r = 1.0
     elif u > 0.9:
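
Review note: the hunk above replaces the NaN-counting block with a single "assign only if the result is a number" guard. The sketch below reproduces that pattern in isolation so the fallback behaviour is easy to check. It is not the library's code: `cosine` here is a hypothetical pure-Python stand-in for the helper imported from `.utils`, `guarded_cosine` is an illustrative name, and the extra `math.isnan` test is an addition of this sketch that the commit itself does not contain.

```python
import math
import numbers


def cosine(a, b):
    """Hypothetical stand-in for the library's cosine helper (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Raises ZeroDivisionError when either vector is all zeros.
    return dot / norm


def is_digit(obj):
    """Same isinstance check as the helper added in synonyms/utils.py."""
    return isinstance(obj, (numbers.Integral, numbers.Complex, numbers.Real))


def guarded_cosine(a, b, default=0.0):
    """Return the cosine of a and b, or `default` when no usable number results."""
    g = default
    try:
        g_ = cosine(a, b)
        # The isnan test is an extra of this sketch, not part of the commit.
        if is_digit(g_) and not math.isnan(g_):
            g = g_
    except (ZeroDivisionError, TypeError, ValueError):
        pass
    return g


print(guarded_cosine([1.0, 0.0], [1.0, 0.0]))  # identical direction
print(guarded_cosine([0.0, 0.0], [1.0, 0.0]))  # zero vector falls back to default
```

Catching named exceptions rather than using a bare `except:` is a choice made for this sketch; the commit keeps the bare form.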

synonyms/utils.py
@@ -252,6 +252,12 @@ def call_on_class_only(*args, **kwargs):
     """Raise exception when load methods are called on instance"""
     raise AttributeError('This method should be called on a class object.')

+def is_digit(obj):
+    '''
+    Check if an object is Number
+    '''
+    return isinstance(obj, (numbers.Integral, numbers.Complex, numbers.Real))
+
 def is_zhs(str):
     '''
     Check if str is Chinese Word
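
Review note: to make the behaviour of the new `is_digit` helper concrete, the sketch below copies its `isinstance` check into a standalone script and probes it with a few values. The `checks` dictionary and the probe values are illustrative only. The hunk does not add `import numbers`, so the sketch assumes `synonyms/utils.py` already imports it elsewhere.

```python
import numbers


def is_digit(obj):
    """Copy of the check added in this commit: True for any numeric type."""
    return isinstance(obj, (numbers.Integral, numbers.Complex, numbers.Real))


# Probe values chosen for illustration; they are not taken from the commit.
checks = {
    "int": is_digit(3),
    "float": is_digit(0.5),
    "complex": is_digit(1 + 2j),
    "bool": is_digit(True),         # bool is a subclass of int
    "nan": is_digit(float("nan")),  # NaN is still a float, hence a Real
    "str": is_digit("0.5"),
    "None": is_digit(None),
    "list": is_digit([0.1, 0.2]),
}

for name, result in checks.items():
    print(name, result)
```

One point that may deserve a follow-up: a plain Python `float("nan")` passes this check, so the helper filters out non-numeric results such as `None`, strings, or lists, but does not by itself reject a NaN scalar.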