Commit 74e45e75 authored by xuchaoxin1375

- update the client layout with more available parameters for cross-validation (`kfold`/`shufflesplit`/`StratifiedShuffleSplit`)
- support `n_splits` as a slider element so users can adjust the number of folds
- update the `confusion matrix table` show button
Parent 2669bf69
*.wav *.wav
*.npy *.npy
*.nyc
*.csv *.csv
\ No newline at end of file
This diff is collapsed.
...@@ -4,5 +4,9 @@ ...@@ -4,5 +4,9 @@
"python.linting.pylintEnabled": true, "python.linting.pylintEnabled": true,
"git.ignoreLimitWarning": true, "git.ignoreLimitWarning": true,
"python.linting.lintOnSave": false, "python.linting.lintOnSave": false,
"python.analysis.typeCheckingMode": "off" "python.analysis.typeCheckingMode": "off",
"[python]": {
"editor.defaultFormatter": "ms-python.black-formatter"
},
"python.formatting.provider": "none"
} }
\ No newline at end of file
...@@ -56,7 +56,6 @@ ...@@ -56,7 +56,6 @@
#### Packages that need attention
- librosa
- librosa 0.9.2 is not the latest version, but the newest version doesn't work well with some matplotlib versions
- these problems appeared when I installed `matplotlib` with `conda install` and `librosa` with `pip install`
- The compatibility issues may be caused by mixing the two installation methods.
...@@ -64,15 +63,59 @@ ...@@ -64,15 +63,59 @@
- However, the newer version may become the preferred choice in the future, once the bugs or compatibility issues have been fixed.
- pluggy
- pluggy may or may not be installed automatically (I note it here because when I tested the `requirements.txt` in a brand-new conda environment, pip reported that pluggy was not installed)
- tensorflow
- if you just want to try the basic ML algorithms on the SER task, you don't need to install tensorflow
- in my case, I use tensorflow==2.10, but other versions of tensorflow above 2.6 may work well too
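### Other notes
- While debugging the matplotlib backend settings, I happened to find that `%matplotlib qt` failed in the notebook
- I then tried uninstalling and reinstalling matplotlib and pyside
- but that produced some `dll` problems and permission errors
- problems I had never run into before
- My guess is that the machine had gone too long without a shutdown (I normally just hibernate it), which left the system in a faulty state
- Machines do go wrong: for example, the face-recognition system at my school library normally shows a green screen when verification passes, but recently it has shown red even on a pass
- and on early Windows 7, resuming from hibernation would sometimes fail outright
- After I rebooted, an extra search box appeared on the taskbar, so a system update that had been installed without a reboot may also have played a part
- Other unexpected problems show up from time to time
- For example, after `KNeighborsClassifier` finishes training, using it to predict new samples, or to predict on the training set, raises an error: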
- ```python
File d:\condaPythonEnvs\tf210\lib\site-packages\sklearn\neighbors\_classification.py:237, in KNeighborsClassifier.predict(self, X)
235 neigh_dist = None
236 else:
--> 237 neigh_dist, neigh_ind = self.kneighbors(X)
239 classes_ = self.classes_
240 _y = self._y
643 get_config = getattr(self._dynlib, "openblas_get_config",
644 lambda: None)
645 get_config.restype = ctypes.c_char_p
--> 646 config = get_config().split()
647 if config[0] == b"OpenBLAS":
648 return config[1].decode("utf-8")
AttributeError: 'NoneType' object has no attribute 'split'
```
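- When I switched to another environment it worked again, so the cause is probably not the system but a broken environment
#### Fixes for unexpected errors
- Reboot your machine; this may clear some system-level errors
- Create or switch to another virtual environment; installing certain packages may have downgraded dependencies or polluted the environment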
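When comparing a working environment with a broken one, it can help to list the installed versions of the packages mentioned above in each of them. The following is a minimal sketch using only the standard library; the helper name `installed_versions` and the package list are illustrative assumptions, not part of this project.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


# Package names taken from the notes above; extend the tuple as needed.
watched = ("librosa", "matplotlib", "scikit-learn", "pluggy", "tensorflow")
for name, version in installed_versions(watched).items():
    print(f"{name:15} {version or 'not installed'}")
```

Running this in both environments and comparing the two listings is a quick way to spot a downgraded or missing dependency before reinstalling anything.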
#### About progress bar display: tqdm
...@@ -397,6 +440,104 @@ The initial letter(s) of the file name represents the emotion class, and the fol ...@@ -397,6 +440,104 @@ The initial letter(s) of the file name represents the emotion class, and the fol
- BaggingClassifier - BaggingClassifier
- Recurrent Neural Networks (Keras) - Recurrent Neural Networks (Keras)
### Choosing a suitable classifier (evaluated with k-fold)
- ```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
# Load the iris dataset
X, y = load_iris(return_X_y=True)
# Cross-validate a linear regression model
model = LinearRegression()
scores = cross_val_score(model, X, y, cv=5)
print("Scores:", scores)
print("Mean score:", scores.mean())
```
- ```bash
Scores: [0.96666667 0.96666667 0.9 0.96666667 1. ]
Mean score: 0.9600000000000002
```
- ```python
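from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Load the iris dataset
X, y = load_iris(return_X_y=True)
# model = RandomForestClassifier()
model = SVC()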
scores = cross_val_score(model, X, y, cv=5,verbose=3)
print("Scores:", scores)
print("Mean score:", scores.mean())
```
- ```bash
Scores: [0.96666667 0.96666667 0.96666667 0.93333333 1. ]
Mean score: 0.9666666666666666
```
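- Using a decision tree model for classification instead works fairly well: every score in the 5-fold validation is above 0.9
- Using SVC or RF works even better
#### Using shuffle
- Without shuffling the ordered iris dataset, the result is: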
```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Load the iris dataset
X, y = load_iris(return_X_y=True)
# Define 5-fold cross-validation
kf = KFold(
    n_splits=5,
    # shuffle=True,
    # random_state=42,
)
# Train and test with a linear regression model
model = LinearRegression()
# model=RandomForestClassifier()
scores_cv = []
for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)
    scores_cv.append(score)
    print("Score:", score)
mean_score = np.mean(scores_cv)
print(f"{mean_score=}")
```
- ```bash
Score: 0.0
Score: 0.8512492308414581
Score: 0.0
Score: 0.7615543936085848
Score: 0.0
mean_score=0.32256072489000853
```
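- When this happens, it usually points to a problem with how the dataset is read in: iris is sorted by class, so without shuffling three of the five test folds contain a single class, and `score` (R² for `LinearRegression`) is reported as 0.0 on a constant target
- Uncommenting `shuffle=True` gives normal prediction performance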
- ```bash
Score: 0.9468960016420045
Score: 0.9315787260143983
Score: 0.9177129838664249
Score: 0.9026578332122843
Score: 0.921073136533955
mean_score=0.9239837362538135
```
#### SVC
- SVC in Scikit-learn is a Support Vector Machine classifier for binary and multi-class classification problems. It is a very powerful model that can handle high-dimensional data and deals effectively with data that is not linearly separable.
...@@ -414,7 +555,7 @@ The initial letter(s) of the file name represents the emotion class, and the fol ...@@ -414,7 +555,7 @@ The initial letter(s) of the file name represents the emotion class, and the fol
- Using SVC in Scikit-learn is straightforward: create an SVC object, set a few hyperparameters, then call fit() to train the model. Use predict() to classify new data.
- In short, SVC is a powerful classifier suited to binary and multi-class problems, and it is especially good with high-dimensional and non-linearly-separable data.
##### sklearn.svm.svc
- [`sklearn.svm`](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.svm).SVC
......
## ##
import os
import inspect import inspect
import os
import constants.beauty as bt import constants.beauty as bt
import constants.uiconfig as ufg
import data_visualization as dv import data_visualization as dv
import fviewer import fviewer
import ipdb import ipdb
import numpy as np
import PySimpleGUI as sg import PySimpleGUI as sg
import query as q import query as q
import constants.uiconfig as ufg from constants.beauty import (ccser_theme, db_introduction, h2, logo,
import constants.beauty as bt option_frame, result_frame)
from constants.beauty import ( from constants.uiconfig import ML_KEY, __version__
ccser_theme, from demo_programs.Demo_Nice_Buttons import (image_file_to_bytes, red_pill64,
db_introduction, wcolor)
h2,
logo,
option_frame,
result_frame,
)
from demo_programs.Demo_Nice_Buttons import image_file_to_bytes, red_pill64, wcolor
from fviewer import audio_viewer_layout, fviewer_events, selected_files from fviewer import audio_viewer_layout, fviewer_events, selected_files
from joblib import load from joblib import load
from multilanguage import get_your_language_translator from multilanguage import get_your_language_translator
from constants.uiconfig import ML_KEY, __version__
from user import UserAuthenticatorGUI from user import UserAuthenticatorGUI
from config.algoparams import ava_cv_modes
lang = get_your_language_translator("English") lang = get_your_language_translator("English")
import sys import sys
...@@ -33,16 +30,9 @@ import sys ...@@ -33,16 +30,9 @@ import sys
from audio.core import get_used_keys from audio.core import get_used_keys
from audio.graph import showFreqGraph, showMelFreqGraph, showWaveForm from audio.graph import showFreqGraph, showMelFreqGraph, showWaveForm
from config.EF import ava_algorithms, ava_emotions, ava_features from config.EF import ava_algorithms, ava_emotions, ava_features
from config.MetaPath import ( from config.MetaPath import (ava_dbs, bclf, brgr, emodb,
ava_dbs, get_example_audio_file, ravdess, savee,
bclf, speech_dbs_dir)
brgr,
emodb,
get_example_audio_file,
ravdess,
savee,
speech_dbs_dir,
)
def import_config_bookmark(): def import_config_bookmark():
...@@ -58,7 +48,7 @@ def define_constants(): ...@@ -58,7 +48,7 @@ def define_constants():
size = (1500, 1000) size = (1500, 1000)
# size=None # size=None
# ava_cv_modes=("kfold","ss","sss")
train = "trian" train = "trian"
test = "test" test = "test"
algorithm = "" algorithm = ""
...@@ -66,7 +56,18 @@ audio_selected = "" ...@@ -66,7 +56,18 @@ audio_selected = ""
speech_folder = speech_dbs_dir speech_folder = speech_dbs_dir
no_result_yet = f"No Result Yet" no_result_yet = f"No Result Yet"
predict_res_key = "emotion_predict_res" predict_res_key = "emotion_predict_res"
kfold_radio_key = "kfold"
shuffle_split_radio_key = "ss"
show_confusion_matrix_key = "show_confusion_matrix"
stratified_shuffle_split_radio_key = "sss"
current_model_key = "current_model"
current_model_tip_key = "current_model_tip"
predict_proba_tips_key = "predict_proba" predict_proba_tips_key = "predict_proba"
cv_splits_slider_key = "cv_splits_slider"
train_cv_result_table_key = "train_cv_result_table"
# test_score&train_score view
train_result_table_key = "train_result_table" train_result_table_key = "train_result_table"
predict_proba_table_key = "predict_proba_table" predict_proba_table_key = "predict_proba_table"
predict_proba_frame_key = "predict_proba_frame" predict_proba_frame_key = "predict_proba_frame"
...@@ -76,7 +77,7 @@ userUI = UserAuthenticatorGUI() ...@@ -76,7 +77,7 @@ userUI = UserAuthenticatorGUI()
## ##
def get_algos_elements_list(): def get_algos_elements_list(ava_algorithms=ava_algorithms):
""" """
Radio elements need a group; here the group is set to: algorithm
Use the default parameter to set the default selection (the first one is selected by default)
...@@ -98,7 +99,6 @@ def get_algos_elements_list(): ...@@ -98,7 +99,6 @@ def get_algos_elements_list():
enable_events=True, enable_events=True,
) )
) )
return algos_radios return algos_radios
...@@ -238,8 +238,85 @@ def make_window(theme=None, size=None): ...@@ -238,8 +238,85 @@ def make_window(theme=None, size=None):
num_rows=1, # by default the table has some height; set it to 1 to avoid blank space
hide_vertical_scroll=True, hide_vertical_scroll=True,
) )
],
[
sg.Table(
values=[["pending"] * 2],
headings=["fold", "accu_score"],
justification="center",
font="Arial 16",
expand_x=True,
key=train_cv_result_table_key,
num_rows=1, # by default the table has some height; set it to 1 to avoid blank space
hide_vertical_scroll=True,
visible=False,
)
],
# [
# sg.Table(
# values=[["pending"] * 2],
# headings=["confusion_matrix", "accu_score"],
# justification="center",
# font="Arial 16",
# expand_x=True,
# key=train_cv_result_table_key,
# num_rows=1, # by default the table has some height; set it to 1 to avoid blank space
# hide_vertical_scroll=True,
# visible=False,
# )
# ]
]
confution_matrix_button_layout=[
[sg.B("show confusion matrix", key=show_confusion_matrix_key)],
]
cv_mode_layout = [
[
sg.T("cv mode:"),
sg.Radio(
"k-fold",
group_id="cv_mode",
key=kfold_radio_key,
default=False,
enable_events=True,
),
sg.Radio(
"shuffle-split",
group_id="cv_mode",
key=shuffle_split_radio_key,
default=False,
enable_events=True,
),
sg.Radio(
"stratified-shuffle-split",
group_id="cv_mode",
key=stratified_shuffle_split_radio_key,
default=True,
enable_events=True,
),
] ]
] ]
cv_param_settings_layout = [
[
sg.T("cv splits:"),
sg.Slider(
range=(1, 10),
key=cv_splits_slider_key,
orientation="h",
expand_x=True,
default_value=5,
enable_events=True,
),
],
*cv_mode_layout,
]
other_settings_frame_layout = [
[
bt.option_frame(
title="Other Parameter Settings", layout=cv_param_settings_layout
),
],
]
train_fit_button_layout = [ train_fit_button_layout = [
[ [
# sg.Button('start train'), # sg.Button('start train'),
...@@ -252,9 +329,12 @@ def make_window(theme=None, size=None): ...@@ -252,9 +329,12 @@ def make_window(theme=None, size=None):
pad=(0, 0), pad=(0, 0),
key="start train", key="start train",
), ),
sg.pin(sg.T("current model:", key=current_model_tip_key, visible=False)),
sg.T("", key=current_model_key),
] ]
] ]
train_result_layout = [
train_result_frame_layout = [
[ [
bt.result_frame( bt.result_frame(
title=lang["train_result_title"], title=lang["train_result_title"],
...@@ -268,11 +348,11 @@ def make_window(theme=None, size=None): ...@@ -268,11 +348,11 @@ def make_window(theme=None, size=None):
text=no_result_yet, justification="c", key=predict_res_key text=no_result_yet, justification="c", key=predict_res_key
) )
# predict_proba_tips_layout = [[sg.Text("pending", key=predict_proba_tips_key)]] # predict_proba_tips_layout = [[sg.Text("pending", key=predict_proba_tips_key)]]
# By default, hide the note that predict_proba is unavailable
predict_proba_tips_layout = bt.normal_content_layout( predict_proba_tips_layout = bt.normal_content_layout(
text="pending", key=predict_proba_tips_key,visible=False text="pending", key=predict_proba_tips_key, visible=False
) )
# By default, show the predict_proba table
predict_proba_table_layout = [ predict_proba_table_layout = [
[ [
sg.Table( sg.Table(
...@@ -289,10 +369,7 @@ def make_window(theme=None, size=None): ...@@ -289,10 +369,7 @@ def make_window(theme=None, size=None):
) )
] ]
] ]
predict_prob_layout=[ predict_prob_layout = [*predict_proba_tips_layout, *predict_proba_table_layout]
*predict_proba_tips_layout,
*predict_proba_table_layout
]
predict_res_frames_layout = [ predict_res_frames_layout = [
[result_frame(layout=predict_res_layout)], [result_frame(layout=predict_res_layout)],
[ [
...@@ -380,8 +457,10 @@ def make_window(theme=None, size=None): ...@@ -380,8 +457,10 @@ def make_window(theme=None, size=None):
+ e_config_layout + e_config_layout
+ f_config_layout + f_config_layout
+ algos_layout + algos_layout
+ other_settings_frame_layout
+ train_fit_button_layout + train_fit_button_layout
+ train_result_layout + train_result_frame_layout
+ confution_matrix_button_layout
+ file_choose_layout + file_choose_layout
+ predict_res_frames_layout + predict_res_frames_layout
+ draw_layout + draw_layout
...@@ -527,7 +606,7 @@ def initial(values=None, verbose=1): ...@@ -527,7 +606,7 @@ def initial(values=None, verbose=1):
train_db = values["train_db"] train_db = values["train_db"]
test_db = values["test_db"] test_db = values["test_db"]
e_config = scan_choosed_options(values) e_config = scan_choosed_options(values)
algorithm = selected_algo(values) algorithm = selected_radio_in(values)
f_config = selected_features(values) f_config = selected_features(values)
if verbose >= 2: if verbose >= 2:
# print(train_db, test_db, e_config, algorithm, f_config) # print(train_db, test_db, e_config, algorithm, f_config)
...@@ -551,14 +630,13 @@ def selected_features(values): ...@@ -551,14 +630,13 @@ def selected_features(values):
return f_config return f_config
def selected_algo(values): def selected_radio_in(values,ava_list=ava_algorithms):
global algorithm # global algorithm
for algo in ava_algorithms: # res=""
if values and values[algo]: for algo_name in ava_list:
# 获取选中的算法名称(key) if values and values[algo_name]:
algorithm = algo
break break
return algorithm return algo_name
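As written, the new `selected_radio_in` returns the last name in `ava_list` when no radio is selected, because the loop variable simply keeps its final value. The sketch below is one possible variant, not the author's implementation, that returns `None` in that case; the `"kfold"`/`"ss"`/`"sss"` keys are the ones defined in this file.

```python
def selected_radio_in(values, ava_list):
    """Return the first key in ava_list whose radio is selected, else None."""
    if not values:
        return None
    for name in ava_list:
        # values maps element keys to their state; a selected Radio is True.
        if values.get(name):
            return name
    return None


# Hypothetical snapshot of the window values for the cv-mode radio group.
cv_values = {"kfold": False, "ss": True, "sss": False}
print(selected_radio_in(cv_values, ("kfold", "ss", "sss")))
```

Returning `None` makes the "nothing selected" case explicit, so the caller can decide on a default instead of silently getting the last option.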
def scan_choosed_options(values): def scan_choosed_options(values):
...@@ -644,18 +722,18 @@ def recognize_auido( ...@@ -644,18 +722,18 @@ def recognize_auido(
data = list(predict_proba.items()) data = list(predict_proba.items())
# print(data,"@{data}") # print(data,"@{data}")
data = [[emo, proba] for emo, proba in data] data = [[emo, round(proba,bt.score_ndigits)] for emo, proba in data]
# Hide the proba_tip
window[predict_proba_tips_key].update(visible=False) window[predict_proba_tips_key].update(visible=False)
# Update the contents of the proba table
# window[predict_proba_tips_frame_key].update(visible=False) # window[predict_proba_tips_frame_key].update(visible=False)
ppt = window[predict_proba_table_key] ppt = window[predict_proba_table_key]
# inspect.getfullargspec(ppt.update) # inspect.getfullargspec(ppt.update)
ppt.update( ppt.update(
values=data, values=data,
num_rows=4, num_rows=4,
# display_row_numbers=True #display_row_numbers=True
visible=True visible=True,
) )
# window[] # window[]
# window[predict_proba_table_frame_key].update(visible=True) # window[predict_proba_table_frame_key].update(visible=True)
...@@ -667,7 +745,7 @@ def recognize_auido( ...@@ -667,7 +745,7 @@ def recognize_auido(
), ),
visible=True, visible=True,
) )
# Hide the table
window[predict_proba_table_key].update(visible=False) window[predict_proba_table_key].update(visible=False)
window.refresh() window.refresh()
...@@ -696,11 +774,7 @@ def start_train_model( ...@@ -696,11 +774,7 @@ def start_train_model(
from recognizer.basic import EmotionRecognizer from recognizer.basic import EmotionRecognizer
if verbose: if verbose:
print("train_db:", train_db) check_training_arguments(train_db, test_db, e_config, f_config, algorithm)
print("test_db:", test_db)
print("e_config:", e_config)
print("f_config:", f_config)
print("algorithm:", algorithm)
bclf_estimators = load(bclf) bclf_estimators = load(bclf)
...@@ -719,7 +793,6 @@ def start_train_model( ...@@ -719,7 +793,6 @@ def start_train_model(
if algorithm == "RNN": if algorithm == "RNN":
from recognizer.deep import DeepEmotionRecognizer from recognizer.deep import DeepEmotionRecognizer
der = DeepEmotionRecognizer( der = DeepEmotionRecognizer(
train_dbs=train_db, test_dbs=test_db, e_config=e_config, f_config=f_config train_dbs=train_db, test_dbs=test_db, e_config=e_config, f_config=f_config
) )
...@@ -752,6 +825,7 @@ def model_res(er, verbose=1): ...@@ -752,6 +825,7 @@ def model_res(er, verbose=1):
""" """
train_score = er.train_score() train_score = er.train_score()
test_score = er.test_score() test_score = er.test_score()
if verbose: if verbose:
print(f"{er.model=}") print(f"{er.model=}")
print(f"{test_score=}") print(f"{test_score=}")
...@@ -776,7 +850,7 @@ def main(verbose=1): ...@@ -776,7 +850,7 @@ def main(verbose=1):
while True: while True:
if verbose >= 2: if verbose >= 2:
check_training_arguments(e_config, f_config, train_db, test_db, algorithm) check_training_arguments(train_db, test_db, e_config, f_config, algorithm)
if event: # listen for any event
print(event, "@{event}", __file__) print(event, "@{event}", __file__)
...@@ -812,7 +886,7 @@ def main(verbose=1): ...@@ -812,7 +886,7 @@ def main(verbose=1):
print(f_config, "@{f_config}") print(f_config, "@{f_config}")
elif event in ava_algorithms: elif event in ava_algorithms:
algorithm = selected_algo(values) algorithm = selected_radio_in(values)
if verbose: if verbose:
print(algorithm, "@{algorithm}") print(algorithm, "@{algorithm}")
# print(event, "inside the algorithm-selection loop.")
...@@ -832,6 +906,7 @@ def main(verbose=1): ...@@ -832,6 +906,7 @@ def main(verbose=1):
# print("file selection finished")
# --emotion recognition stage--
elif event == "start train": elif event == "start train":
n_splits = values[cv_splits_slider_key]
er = start_train_model( er = start_train_model(
train_db=train_db, train_db=train_db,
test_db=test_db, test_db=test_db,
...@@ -839,15 +914,10 @@ def main(verbose=1): ...@@ -839,15 +914,10 @@ def main(verbose=1):
f_config=f_config, f_config=f_config,
algorithm=algorithm, algorithm=algorithm,
) )
# Post-training wrap-up: pass the result (the recognizer) to fviewer so that fviewer can run recognition with the recognizer object directly
fviewer.er = er # 是否为多余#TODO refresh_trained_view(verbose, window, er, values)
train_score, test_score = model_res(er, verbose=verbose)
# window["train_result"].update(f"{train_score=},{test_score=}")
res = [round(x, 4) for x in (train_score, test_score)]
window["train_result_table"].update(
values=[res]
) # values is a list[list[any]]; each inner list holds the data of one table row
elif event == "recognize it": elif event == "recognize it":
recognize_auido( recognize_auido(
...@@ -891,6 +961,11 @@ def main(verbose=1): ...@@ -891,6 +961,11 @@ def main(verbose=1):
content = [logo, db_introduction] content = [logo, db_introduction]
res = "\n".join(content) res = "\n".join(content)
sg.popup_scrolled(res, size=(150, 100), title="Introduction") sg.popup_scrolled(res, size=(150, 100), title="Introduction")
elif event==show_confusion_matrix_key:
from SG.demo_pandas_table import TablePandas
cm=er.confusion_matrix()
tp=TablePandas(df=cm)
tp.show_confution_matrix_window()
else: else:
# It has its own event loop, so just call it directly
userUI.run_module(event, values, window=window, verbose=1) userUI.run_module(event, values, window=window, verbose=1)
...@@ -908,6 +983,47 @@ def main(verbose=1): ...@@ -908,6 +983,47 @@ def main(verbose=1):
window.close() window.close()
def refresh_trained_view(verbose, window, er, values):
    """
    Refresh the training-result view after a model has been trained.
    These arguments are available when updating the table element:
    args=ArgSpec(args=['self', 'values', 'num_rows', 'visible', 'select_rows', 'alternating_row_color', 'row_colors'],
    varargs=None, keywords=None, defaults=(None, None, None, None, None, None))🎈
    Args:
        verbose (bool): Whether to print verbose output or not.
        window (sg.Window): The PySimpleGUI window object to update.
        er: The trained recognizer object whose scores are shown in the view.
        values (dict): The values read from the window (slider and radio states).
    """
    fviewer.er = er  # possibly redundant #TODO
    train_score, test_score = model_res(er, verbose=verbose)
    # window["train_result"].update(f"{train_score=},{test_score=}")
    res = [round(x, bt.score_ndigits) for x in (train_score, test_score)]
    window[train_result_table_key].update(
        values=[res]
    )  # values is a list[list[any]]; each inner list holds the data of one table row
    window[current_model_tip_key].update(visible=True)
    window[current_model_key].update(value=er.model)
    # The Slider reports a float, while the number of splits must be an integer
    n_splits = int(values[cv_splits_slider_key])
    # cv_mode=values[kfold_radio_key]
    cv_mode = selected_radio_in(values, ava_list=ava_cv_modes)
    # print(cv_mode,"@{cv_mode}🎈")
    fold_scores = er.model_cv_score(mean_only=False, n_splits=n_splits, cv_mode=cv_mode)
    folds = len(fold_scores)
    mean_score = np.mean(fold_scores)
    fold_scores_rows = [
        [str(f"{i+1}"), round(score, bt.score_ndigits)]
        for i, score in enumerate(fold_scores)
    ]
    fold_scores_rows.append(["mean_score", round(mean_score, bt.score_ndigits)])
    tcrt = window[train_cv_result_table_key].update
    # args=inspect.signature(tcrt)
    # print(f"{args=}🎈")
    tcrt(values=fold_scores_rows, num_rows=folds + 1, visible=True)
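The row-building part of `refresh_trained_view` can be tried without a GUI. The sketch below is a standalone approximation: `build_fold_rows` is a hypothetical helper, the scores are made-up numbers, and the slider value stands in for `values[cv_splits_slider_key]`. It also shows the slider's float being cast to `int` before it is used as a split count; note that scikit-learn's `KFold` rejects `n_splits` below 2, so a slider starting at 1 may need guarding for the k-fold mode.

```python
from statistics import fmean


def build_fold_rows(fold_scores, ndigits=4):
    """Turn per-fold scores into table rows and append a final mean row."""
    scores = [float(s) for s in fold_scores]
    rows = [[str(i), round(s, ndigits)] for i, s in enumerate(scores, start=1)]
    rows.append(["mean_score", round(fmean(scores), ndigits)])
    return rows


slider_value = 5.0  # stand-in for the float that a Slider element reports
n_splits = int(slider_value)  # splitters expect an integer number of splits

rows = build_fold_rows([0.9, 0.8, 1.0])  # illustrative scores, not real results
for row in rows:
    print(row)
```

The resulting list has one row per fold plus the mean, which matches the `num_rows=folds + 1` passed to the table update above.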
def open_folder_event(window): def open_folder_event(window):
print("[LOG] Clicked Open Folder!") print("[LOG] Clicked Open Folder!")
folder_or_file = sg.popup_get_folder( folder_or_file = sg.popup_get_folder(
...@@ -946,7 +1062,7 @@ def file_selected_record(verbose, event, values): ...@@ -946,7 +1062,7 @@ def file_selected_record(verbose, event, values):
print(event, values["-FILENAME-"]) print(event, values["-FILENAME-"])
def check_training_arguments(e_config, f_config, train_db, test_db, algorithm): def check_training_arguments(train_db, test_db, e_config, f_config, algorithm):
print(f"train_db = {train_db}") print(f"train_db = {train_db}")
print(f"test_db = {test_db}") print(f"test_db = {test_db}")
print(f"e_config = {e_config}") print(f"e_config = {e_config}")
......
...@@ -12,6 +12,7 @@ frame_size = (600, 50) ...@@ -12,6 +12,7 @@ frame_size = (600, 50)
lb_size = (60, 10) lb_size = (60, 10)
ml_size = (60, 20) ml_size = (60, 20)
seperator_color = "blue" seperator_color = "blue"
score_ndigits=4
welcom_title_size = (45, 1) welcom_title_size = (45, 1)
slider_size = (60, 10) slider_size = (60, 10)
ccser_theme = "Reddit" ccser_theme = "Reddit"
...@@ -329,6 +330,7 @@ if __name__ == "__main__": ...@@ -329,6 +330,7 @@ if __name__ == "__main__":
# layout_inner = [[sg.Text("demo")]] # layout_inner = [[sg.Text("demo")]]
# layout = [[result_frame(title="demo", layout=layout_inner)]] # layout = [[result_frame(title="demo", layout=layout_inner)]]
# layout=res_content_layout("demo", expand_x=True) # layout=res_content_layout("demo", expand_x=True)
layout=normal_content_layout("demo") layout=normal_content_layout("demo")
window = sg.Window("demo of beauty elements", layout,resizable=True) window = sg.Window("demo of beauty elements", layout,resizable=True)
window.read() window.read()
......
import PySimpleGUI as sg
import pandas as pd
import SG.constants.beauty as bt
sg.theme(bt.ccser_theme)

class TablePandas():
    def __init__(self,df=None) -> None:
        if df is None:
            # Create a demo pandas DataFrame
            demo_data = {
                "Name": ["Alice", "Bob", "Charlie", "David"],
                "Age": [25, 30, 35, 40],
                "Salary": [50000, 60000, 70000, 80000],
            }
            df = pd.DataFrame(demo_data)
        self.df=df

    # Build the PySimpleGUI window layout
    def create_table_window(self,df):
        layout = [
            [
                sg.Table(
                    values=df.values.tolist(),
                    headings=df.columns.tolist(),
                    max_col_width=25,
                    auto_size_columns=True,
                    justification="center",
                    num_rows=min(25, len(df)),
                )
            ],
            # [sg.Button("Exit")],
        ]
        return layout

    def get_confution_matrix_window(self,df=None):
        layout = self.create_table_window(df)
        window = sg.Window("Pandas Table Viewer", layout)
        return window

    # def show_confution_matrix_window(df):
    #     window=get_confution_matrix_window(df)
    def show_confution_matrix_window(self,df=None):
        # Create the PySimpleGUI window
        # window = sg.Window("Pandas Table Viewer", layout)
        # `if df` raises ValueError on a DataFrame, so test against None explicitly
        df=self.df if df is None else df
        window=self.get_confution_matrix_window(df=df)
        # Run the event loop
        while True:
            event, values = window.read()
            if event in (sg.WINDOW_CLOSED,):
                break
        # Close the PySimpleGUI window
        window.close()

if __name__=="__main__":
    tp=TablePandas()
    tp.show_confution_matrix_window()
    # show_confution_matrix_window(df)
\ No newline at end of file
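One thing to watch when showing a confusion matrix this way: `df.values.tolist()` drops the DataFrame index, so the row labels (the true classes) do not appear in the table. The sketch below shows one way to carry them over as a first column. It assumes that `er.confusion_matrix()` returns a DataFrame with a labeled index, which the diff does not confirm; `table_args` and the sample labels are hypothetical.

```python
import pandas as pd


def table_args(df, index_heading="actual"):
    """Return (headings, values) for a table, with the index as the first column."""
    headings = [index_heading] + [str(col) for col in df.columns]
    # reset_index() moves the row labels into an ordinary leading column.
    values = df.reset_index().values.tolist()
    return headings, values


# Made-up 2x2 confusion matrix, for illustration only.
cm = pd.DataFrame(
    [[8, 2], [1, 9]],
    index=["true_angry", "true_sad"],
    columns=["predicted_angry", "predicted_sad"],
)
headings, values = table_args(cm)
print(headings)
print(values)
```

The two results can be passed as the `headings=` and `values=` arguments of `sg.Table` in `create_table_window`.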
...@@ -445,7 +445,7 @@ def fviewer_events(window, event=None, values=None, verbose=1): ...@@ -445,7 +445,7 @@ def fviewer_events(window, event=None, values=None, verbose=1):
print(emo_res, "@{emo_res}") print(emo_res, "@{emo_res}")
print(abs_pathes, "@{abs_pathes}") print(abs_pathes, "@{abs_pathes}")
t = ts.TableShow(header=["emotion", "path"], lists=[emo_res, abs_pathes]) t = ts.TableShow(header=["emotion", "path"], data_lists=[emo_res, abs_pathes])
print(t.lists, "@{t.lists}") print(t.lists, "@{t.lists}")
t.run() t.run()
......
import humansize
size = 1073741824 # number of bytes in 1 GB
# Convert the byte count to a human-readable format
print(humansize.approximate_size(size, binary=True)) # output: 1.0 GiB
print(humansize.approximate_size(size, binary=False)) # output: 1.1 GB
* Python GUIs for Humans[PySimpleGUI User's Manual](https://www.pysimplegui.org/en/latest/#pysimplegui-users-manual)[Jump-Start](https://www.pysimplegui.org/en/latest/#jump-start)[2021 Updates....](https://www.pysimplegui.org/en/latest/#2021-updates)[About The PySimpleGUI Documentation System](https://www.pysimplegui.org/en/latest/#about-the-pysimplegui-documentation-system)[Platforms](https://www.pysimplegui.org/en/latest/#platforms)[The PySimpleGUI "Family"](https://www.pysimplegui.org/en/latest/#the-pysimplegui-family)[Support](https://www.pysimplegui.org/en/latest/#support)[Learning Resources](https://www.pysimplegui.org/en/latest/#learning-resources)[The Quick Tour](https://www.pysimplegui.org/en/latest/#the-quick-tour)[Some Examples](https://www.pysimplegui.org/en/latest/#some-examples)[Pi Windows](https://www.pysimplegui.org/en/latest/#pi-windows)[Games](https://www.pysimplegui.org/en/latest/#games)[Windows Programs That Look Like Windows Programs](https://www.pysimplegui.org/en/latest/#windows-programs-that-look-like-windows-programs)[Background - Why PySimpleGUI Came to Be](https://www.pysimplegui.org/en/latest/#background-why-pysimplegui-came-to-be)[Features](https://www.pysimplegui.org/en/latest/#features)[Getting Started with PySimpleGUI](https://www.pysimplegui.org/en/latest/#getting-started-with-pysimplegui)[PEP8 Bindings For Methods and Functions](https://www.pysimplegui.org/en/latest/#pep8-bindings-for-methods-and-functions)[High Level API Calls - Popup's](https://www.pysimplegui.org/en/latest/#high-level-api-calls-popups)[Progress Meters!](https://www.pysimplegui.org/en/latest/#progress-meters)[Debug Output (easy\_print = Print = eprint)](https://www.pysimplegui.org/en/latest/#debug-output-easy_print-print-eprint)[Custom window API Calls (Your First window)](https://www.pysimplegui.org/en/latest/#custom-window-api-calls-your-first-window)[Copy these design patterns!](https://www.pysimplegui.org/en/latest/#copy-these-design-patterns)[Building Custom 
Windows](https://www.pysimplegui.org/en/latest/#building-custom-windows)[Themes - Automatic Coloring of Your Windows](https://www.pysimplegui.org/en/latest/#themes-automatic-coloring-of-your-windows)[Window Object - Beginning a window](https://www.pysimplegui.org/en/latest/#window-object-beginning-a-window)[Layouts](https://www.pysimplegui.org/en/latest/#layouts)[Generated Layouts (For sure want to read if you have > 5 repeating elements/rows)](https://www.pysimplegui.org/en/latest/#generated-layouts-for-sure-want-to-read-if-you-have-5-repeating-elementsrows)[Elements](https://www.pysimplegui.org/en/latest/#elements)[SystemTray](https://www.pysimplegui.org/en/latest/#systemtray)[Global Settings](https://www.pysimplegui.org/en/latest/#global-settings)[Persistent windows (Window stays open after button click)](https://www.pysimplegui.org/en/latest/#persistent-windows-window-stays-open-after-button-click)[Updating Elements (changing element's values in an active window)](https://www.pysimplegui.org/en/latest/#updating-elements-changing-elements-values-in-an-active-window)[Cursors - Setting for Elements and Windows](https://www.pysimplegui.org/en/latest/#cursors-setting-for-elements-and-windows)[Keyboard & Mouse Capture](https://www.pysimplegui.org/en/latest/#keyboard-mouse-capture)[Menus](https://www.pysimplegui.org/en/latest/#menus)[TTK & TTK Scrollbars](https://www.pysimplegui.org/en/latest/#ttk-ttk-scrollbars)[Running Multiple Windows](https://www.pysimplegui.org/en/latest/#running-multiple-windows)[The PySimpleGUI Debugger](https://www.pysimplegui.org/en/latest/#the-pysimplegui-debugger)[User Settings API](https://www.pysimplegui.org/en/latest/#user-settings-api)[Timer API](https://www.pysimplegui.org/en/latest/#timer-api)[Extending PySimpleGUI](https://www.pysimplegui.org/en/latest/#extending-pysimplegui)[Troubleshooting](https://www.pysimplegui.org/en/latest/#troubleshooting)[Debug Output](https://www.pysimplegui.org/en/latest/#debug-output)["Demo Programs" 
Applications](https://www.pysimplegui.org/en/latest/#demo-programs-applications)[Creating a Windows .EXE File](https://www.pysimplegui.org/en/latest/#creating-a-windows-exe-file)[Creating a Mac App File](https://www.pysimplegui.org/en/latest/#creating-a-mac-app-file)[Known Issues](https://www.pysimplegui.org/en/latest/#known-issues)
\ No newline at end of file
...@@ -2,7 +2,7 @@ import PySimpleGUI as sg ...@@ -2,7 +2,7 @@ import PySimpleGUI as sg
import pandas as pd import pandas as pd
from config.MetaPath import recognize_result_dir from config.MetaPath import recognize_result_dir
class TableShow(): class TableShow():
def __init__(self,header=None,lists=None): def __init__(self,header=None,data_lists=None):
"""Display a two-dimensional list as table data
Parameters Parameters
...@@ -11,14 +11,15 @@ class TableShow(): ...@@ -11,14 +11,15 @@ class TableShow():
_description_ _description_
""" """
self.lists=lists self.lists=data_lists
self.length=len(lists[0]) self.length=len(data_lists[0])
# Build the table data
self.data_rows = [[l[i] for l in lists] for i in range(self.length)] self.data_rows = [[l[i] for l in data_lists] for i in range(self.length)]
# print(self.data_rows,"@{data}") # print(self.data_rows,"@{data}")
# 定义表头 # 定义表头
# header = ["c1","c2"] # header = ["c1","c2"]
self.header=header#columns self.header=header#columns
self.data_df=pd.DataFrame(self.data_rows,columns=self.header) self.data_df=pd.DataFrame(self.data_rows,columns=self.header)
# 创建表格布局 # 创建表格布局
warning="the save operation will comsume some time to complete!Be patient!" warning="the save operation will comsume some time to complete!Be patient!"
...@@ -42,7 +43,7 @@ class TableShow(): ...@@ -42,7 +43,7 @@ class TableShow():
] ]
def run(self): def run(self):
window = sg.Window("结果表格", self.layout,resizable=True,size=(500,400)) window = sg.Window("result table", self.layout,resizable=True,size=(500,400))
# event loop # event loop
while True: while True:
event, values = window.read() event, values = window.read()
...@@ -73,6 +74,6 @@ class TableShow(): ...@@ -73,6 +74,6 @@ class TableShow():
if __name__=="__main__": if __name__=="__main__":
# create the window # create the window
ts=TableShow(header=list("ABC"),lists=[[1,2,2],[3,4,4],[5,6,6]]) ts=TableShow(header=list("ABC"),data_lists=[[1,2,2],[3,4,4],[5,6,6]])
ts.run() ts.run()
# # a=1
# # b=a
# # print(f"{a=} {b=}; {id(a)=} {id(b)=}")
# # a=100
# # print(f"{a=} {b=}; {id(a)=} {id(b)=}")
# # test=100
# # print(f"{test=} {id(test)=}")
# ##
# print(f"{id(10)=}")
# a=10
# print(f"{id(a)=}")
# def f(x):
# print(f"{id(x)=}")
# x=20
# print(f"{id(20)=}")
# print(f"{id(x)=}")
# f(a)
# a
# print('a: ', a)#10
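# Rebinding a parameter inside a function does not affect the caller's object.
a = [1, 2, 3]
def f(x):
    x = 100  # rebinds the local name only
f(a)
print(f"{a=}")  # a=[1, 2, 3]
# Mutating the object through the parameter is visible to the caller.
b = [1, 2, 3]
def g(x):
    x[0] = 100  # mutates the shared list in place
g(b)
print(f"{b=}")  # b=[100, 2, 3]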
import PySimpleGUI as sg
# create the window and the Frame element
layout = [
    [sg.Frame('Frame 1', [[sg.Text('Frame content')]], key='-FRAME-')],
    [sg.Button('Update Frame')],
]
window = sg.Window('My Window', layout)
# event loop
while True:
    event, values = window.read()
    if event == sg.WIN_CLOSED:
        break
    if event == 'Update Frame':
        # Frame.update() only changes the title or visibility, so it cannot
        # swap in a new layout; append the new rows with extend_layout() instead
        new_layout = [[sg.Text('Updated content')]]
        window.extend_layout(window['-FRAME-'], new_layout)
# close the window
window.close()
\ No newline at end of file
ava_cv_modes=("kfold","ss","sss")
\ No newline at end of file
...@@ -53,21 +53,16 @@ gb_best=(GradientBoostingClassifier(learning_rate=0.3, loss='log_loss', max_dept ...@@ -53,21 +53,16 @@ gb_best=(GradientBoostingClassifier(learning_rate=0.3, loss='log_loss', max_dept
# bclf[1]=rf_best # bclf[1]=rf_best
# bclf[2]=gb_best # bclf[2]=gb_best
## ##
# dump(rf_best, "qq.joblib")
# bclf[-1] = bag_best # bclf[-1] = bag_best
## ##
bclf_res=load("bclf.joblib") bclf_res=load("bclf.joblib")
for item in bclf_res: for item in bclf_res:
print(item) print(item)
# ##
# brgr
# ##
# # modify the regression model # # modify the regression model
# bag_rgr_best = (BaggingRegressor(max_features=1, max_samples=0.1), # bag_rgr_best = (BaggingRegressor(max_features=1, max_samples=0.1),
# {'max_features': 1, 'max_samples': 0.1, 'n_estimators': 10}, # {'max_features': 1, 'max_samples': 0.1, 'n_estimators': 10},
# 0.6521001743540973) # 0.6521001743540973)
# # brgr[-1]=bag_rgr_best # # brgr[-1]=bag_rgr_best
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
# from typing_extensions import deprecated # from typing_extensions import deprecated
import random import random
from time import time from time import time
from config.algoparams import ava_cv_modes
import matplotlib.pyplot as pl import matplotlib.pyplot as pl
import numpy as np import numpy as np
import pandas as pd import pandas as pd
...@@ -11,7 +11,7 @@ from sklearn.ensemble import RandomForestClassifier ...@@ -11,7 +11,7 @@ from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, classification_report, from sklearn.metrics import (accuracy_score, classification_report,
confusion_matrix, fbeta_score, make_scorer, confusion_matrix, fbeta_score, make_scorer,
mean_absolute_error, mean_squared_error) mean_absolute_error, mean_squared_error)
from sklearn.model_selection import GridSearchCV from sklearn.model_selection import GridSearchCV, KFold, ShuffleSplit, StratifiedShuffleSplit, cross_val_score
from sklearn.neighbors import KNeighborsClassifier from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC from sklearn.svm import SVC
from tqdm import tqdm from tqdm import tqdm
...@@ -401,12 +401,14 @@ class EmotionRecognizer: ...@@ -401,12 +401,14 @@ class EmotionRecognizer:
y_test = self.y_test y_test = self.y_test
# use the trained model for prediction # use the trained model for prediction
model = self.model if self.model is not None else self.best_model() model = self.model if self.model is not None else self.best_model()
if len(X_test) == 0: self.validate_empty_array(X_test=X_test,y_test=y_test)
raise ValueError("X_test is empty") # if len(X_test) == 0:
if len(y_test) == 0: # raise ValueError("X_test is empty")
raise ValueError("y_test is empty") # if len(y_test) == 0:
# raise ValueError("y_test is empty")
# compute predictions # compute predictions
print(X_test.shape, y_test.shape,"🎈") if verbose:
print(X_test.shape, y_test.shape,"🎈")
y_pred = model.predict(X_test) # type: ignore y_pred = model.predict(X_test) # type: ignore
if choosing == False: if choosing == False:
self.y_pred = np.array(y_pred) self.y_pred = np.array(y_pred)
...@@ -419,7 +421,53 @@ class EmotionRecognizer: ...@@ -419,7 +421,53 @@ class EmotionRecognizer:
report = classification_report(y_true=y_test, y_pred=y_pred) report = classification_report(y_true=y_test, y_pred=y_pred)
print(report, self.model.__class__.__name__) print(report, self.model.__class__.__name__)
return res return res
    def model_cv_score(self, choosing=False, verbose=1, mean_only=True, n_splits=5, test_size=0.2, cv_mode="sss"):
        """
        Evaluate the model with cross-validation on the training data.
        For classification tasks, return the cross-validation scores (their mean if `mean_only` is True);
        for regression tasks, return the mean squared error of the predictions on the training data.
        """
        X_train = self.X_train
        y_train = self.y_train
        # use the trained model (fall back to the best model if none is set)
        model = self.model if self.model is not None else self.best_model()
        self.validate_empty_array(X_train, y_train)
        if verbose:
            print(X_train.shape, y_train.shape, "🎈")
            print(f"{n_splits=}")
        # n_splits may arrive as a non-integer value (for example from a slider)
        n_splits = int(n_splits)
        # predictions on the training data
        y_pred = model.predict(X_train)  # type: ignore
        if not choosing:
            self.y_pred = np.array(y_pred)
        # available cross-validation splitters, keyed by mode name
        cv_mode_dict = dict(
            sss=StratifiedShuffleSplit(n_splits=n_splits, test_size=test_size, random_state=0),
            ss=ShuffleSplit(n_splits=n_splits, test_size=test_size, random_state=0),
            kfold=KFold(n_splits=n_splits, shuffle=True, random_state=0),
        )
        cv_mode_selected = cv_mode_dict[cv_mode]
        if verbose:
            print(f"{cv_mode=}🎈")
        if self.classification_task:
            # res = accuracy_score(y_true=y_test, y_pred=y_pred)
            res = cross_val_score(model, X_train, y_train, cv=cv_mode_selected)
            if mean_only:
                res = res.mean()
        else:
            res = mean_squared_error(y_true=y_train, y_pred=y_pred)
        if self.verbose >= 2 or verbose >= 1:
            report = classification_report(y_true=y_train, y_pred=y_pred)
            print(report, self.model.__class__.__name__)
        return res

    def validate_empty_array(self, X_test=[], y_test=[]):
        if len(X_test) == 0:
            raise ValueError("X is empty")
        if len(y_test) == 0:
            raise ValueError("y is empty")
def meta_paths_of_db(self, db, partition="test"): def meta_paths_of_db(self, db, partition="test"):
res = meta_paths_of_db( res = meta_paths_of_db(
db=db, db=db,
...@@ -528,12 +576,12 @@ class EmotionRecognizer: ...@@ -528,12 +576,12 @@ class EmotionRecognizer:
# make it percentage # make it percentage
matrix *= 100 matrix *= 100
if labeled: if labeled:
matrix = pd.DataFrame( matrix_df = pd.DataFrame(
matrix, matrix,
index=[f"true_{e}" for e in self.e_config], index=[f"true_{e}" for e in self.e_config],
columns=[f"predicted_{e}" for e in self.e_config], columns=[f"predicted_{e}" for e in self.e_config],
) )
return matrix return matrix_df
def draw_confusion_matrix(self): def draw_confusion_matrix(self):
"""Calculates the confusion matrix and shows it""" """Calculates the confusion matrix and shows it"""
...@@ -624,7 +672,7 @@ def main(EmotionRecognizer, e_config): ...@@ -624,7 +672,7 @@ def main(EmotionRecognizer, e_config):
# my_model = RandomForestClassifier(max_depth=3, max_features=0.2) # my_model = RandomForestClassifier(max_depth=3, max_features=0.2)
my_model = SVC(C=0.001, gamma=0.001, kernel="poly",probability=True) my_model = SVC(C=0.001, gamma=0.001, kernel="poly",probability=True)
my_model=KNeighborsClassifier(n_neighbors=3, p=1, weights='distance') my_model=KNeighborsClassifier(n_neighbors=3, p=1, weights='distance')
my_model = None # my_model = None
# rec = EmotionRecognizer(model=my_model,e_config=AHNPS,f_config=f_config_def,test_dbs=[ravdess],train_dbs=[ravdess], verbose=1) # rec = EmotionRecognizer(model=my_model,e_config=AHNPS,f_config=f_config_def,test_dbs=[ravdess],train_dbs=[ravdess], verbose=1)
# rec = EmotionRecognizer(model=my_model,e_config=AHNPS,f_config=f_config_def,test_dbs=emodb,train_dbs=emodb, verbose=1) # rec = EmotionRecognizer(model=my_model,e_config=AHNPS,f_config=f_config_def,test_dbs=emodb,train_dbs=emodb, verbose=1)
...@@ -646,6 +694,8 @@ def main(EmotionRecognizer, e_config): ...@@ -646,6 +694,8 @@ def main(EmotionRecognizer, e_config):
print(f"{train_score=}") print(f"{train_score=}")
test_score = er.test_score() test_score = er.test_score()
print(f"{test_score=}") print(f"{test_score=}")
cv_score=er.model_cv_score()
print(f"{cv_score=}")
return er return er
......
##
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, ShuffleSplit, cross_val_predict, cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
##
# load the iris dataset
X, y = load_iris(return_X_y=True)
#! define 5-fold cross-validation
kf = KFold(
    n_splits=5,
    shuffle=True,
    random_state=42,
)
# train and test with a linear regression model
model = LinearRegression()
# model=RandomForestClassifier()
scores_cv = []
# split() accepts either X or y here, because it only needs to partition the sample indices
for train_index, test_index in kf.split(y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)
    scores_cv.append(score)
    print("Score:", score)
mean_score = np.mean(scores_cv)
print(f"{mean_score=}")
##
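#! use cross_val_score
# the CV splitter is constructed without passing in the dataset
ss_cv = ShuffleSplit(n_splits=3, test_size=0.2, random_state=42)
kf_cv = KFold(n_splits=3, shuffle=True, random_state=42)
scores = cross_val_score(
    model,
    X,
    y,
    # cv=5,
    # cv=ss_cv,
    cv=kf_cv,
    verbose=3,
)
# when cv is an integer, a non-shuffled KFold (StratifiedKFold for classifiers) is used, which can be unreliable when the samples are ordered, as they are in iris
# a shuffled splitter is recommended for cv (StratifiedShuffleSplit is the most sophisticated of these)
# when cv is a KFold object, we can set shuffle=True; every sample still lands in a test fold exactly once
print("Scores:", scores)
print("Mean score:", scores.mean())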
##
# cross-validate a decision tree model on randomly shuffled splits
model = DecisionTreeClassifier()
ss_cv = ShuffleSplit(n_splits=3, test_size=0.2, random_state=42)
print("cv: ", ss_cv)
# ssr=ss_cv.split(X,y)
# for train_index,test_index in ssr:
#     train_index,test_index=np.array(train_index),np.array(test_index)
#     print(train_index.shape,test_index.shape)
scores = cross_val_score(model, X, y, cv=ss_cv, verbose=True)
print("Scores:", scores)
print("Mean score:", scores.mean())
##
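import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
# test with randomly generated data
rng = np.random.default_rng()
# keep the generated features as X so that X and y both have 12 samples
X = rng.integers(20, size=(12, 2))
# 12 samples in total, binary labels 0 and 1, class ratio 1:2
n = 12
n0, n1 = 1 * n // 3, 2 * n // 3
# Assign the labels to the simulated samples at random. No training happens here, so random labels do no harm:
# when only splitting a dataset, the relationship between features and labels is irrelevant.
# For actual training, labels normally must not be assigned at random.
y = [0] * n0 + [1] * n1
y = np.array(y)
rng.shuffle(y)
# The alternative below draws labels by probability, but even when n is divisible by 3 the resulting counts are not guaranteed to be 1:2
# y = rng.choice([0, 1], size=12, replace=True, p=[1/3, 2/3])
# count=np.unique(y,return_counts=True)
# print(count)
# for easy checking, print the label array together with the sample indices
print(np.vstack([y, range(n)]))
# Build a stratified shuffle splitter: 5 independent splits, each with a test set of 20% of the n samples.
# StratifiedShuffleSplit also keeps the class proportions in both the training set and the test set.
# These are two independent constraints (e.g. if classes 0 and 1 are 1:2 in the dataset, the training and test sets keep, or stay close to, 1:2)
sss = StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
# print(sss)
# print the 5 splits
for i, (train_index, test_index) in enumerate(sss.split(X, y)):
    print(f"Fold {i}:")
    print(f" Train: index={train_index}")
    print(f" Test: index={test_index}")
    print(np.vstack((test_index, y[test_index])))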
##
X,y=load_iris(return_X_y=True)
# equivalent to
data=load_iris()
X=data.data
y=data.target
# Bunch objects also support dictionary-style access
X=data['data']
y=data['target']
#
# export as pandas objects (DataFrame / Series):
frame_data=load_iris(as_frame=True)
X_df=frame_data.data
y_df=frame_data.target
...@@ -5,8 +5,18 @@ ...@@ -5,8 +5,18 @@
- For convenience, I tag each version of the project (mainly the client) with a date-time stamp - For convenience, I tag each version of the project (mainly the client) with a date-time stamp
- the timestamp (version) is generated with PowerShell: - the timestamp (version) is generated with PowerShell:
- ` Get-Date -Format "yyyy-MM-dd@HH:mm:ss"` - ```powershell
Get-Date -Format "yyyy-MM-dd@HH:mm:ss"
```
- copy to the clipboard automatically (`scb` is the PowerShell alias of `Set-Clipboard`)
```powershell
Get-Date -Format "yyyy-MM-dd@HH:mm:ss"|scb
```
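For anyone without PowerShell at hand, the same tag can be produced from Python. This is only a minimal sketch of an equivalent, not part of the project: the helper name `version_tag` is hypothetical, and the format string is a direct translation of the `yyyy-MM-dd@HH:mm:ss` pattern above.

```python
from datetime import datetime


def version_tag(now=None):
    """Return a version tag such as 2023-04-27@22:04:52 (hypothetical helper)."""
    # Fall back to the current local time when no datetime is supplied.
    now = now or datetime.now()
    # %H is the 24-hour clock, matching "HH" in the PowerShell format string.
    return now.strftime("%Y-%m-%d@%H:%M:%S")


# A fixed datetime makes the output reproducible.
example_tag = version_tag(datetime(2023, 4, 27, 22, 4, 52))
print(example_tag)  # 2023-04-27@22:04:52
```

Calling `version_tag()` with no argument stamps the current local time.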
### versions with time ### versions with time
...@@ -19,4 +29,12 @@ ...@@ -19,4 +29,12 @@
- use the table element to show "test_score","train_score" and "predict_proba" if it is usable - use the table element to show "test_score","train_score" and "predict_proba" if it is usable
- update the file viewer's functions and operation logic for more flexibility. - update the file viewer's functions and operation logic for more flexibility.
- known problems: - known problems:
- when the predict_proba layout is toggled on or off, the window may need to be resized to make the last part of the left column visible! - when the predict_proba layout is toggled on or off, the window may need to be resized to make the last part of the left column visible!
\ No newline at end of file
- 2023-04-27@22:04:52
- update the client layout with more available cross-validation parameters (`kfold`/`shufflesplit`/`StratifiedShuffleSplit`)
- support `n_splits` as a slider element for the user to adjust the folds
- update the `confusion matrix table` show button
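The changelog entry above mentions three cross-validation modes and an adjustable `n_splits`. The sketch below shows one way such a mode name could be mapped to a scikit-learn splitter and passed to `cross_val_score`. It is an illustration under assumptions, not the client's actual code: the helper `make_splitter`, the iris data and the `KNeighborsClassifier` settings are stand-ins, and the `int()` cast assumes a GUI slider may deliver a float.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import (
    KFold,
    ShuffleSplit,
    StratifiedShuffleSplit,
    cross_val_score,
)
from sklearn.neighbors import KNeighborsClassifier


def make_splitter(cv_mode, n_splits=5, test_size=0.2, random_state=0):
    """Map a mode name to a splitter (hypothetical helper, not from the project)."""
    n_splits = int(n_splits)  # assumption: a slider may hand over a float such as 4.0
    if cv_mode == "kfold":
        return KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    if cv_mode == "ss":
        return ShuffleSplit(n_splits=n_splits, test_size=test_size, random_state=random_state)
    if cv_mode == "sss":
        return StratifiedShuffleSplit(n_splits=n_splits, test_size=test_size, random_state=random_state)
    raise ValueError(f"unknown cv_mode: {cv_mode!r}")


# Stand-in data and model, used only to make the sketch runnable.
X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=3)

mean_scores = {}
for mode in ("kfold", "ss", "sss"):
    scores = cross_val_score(model, X, y, cv=make_splitter(mode, n_splits=4.0))
    mean_scores[mode] = scores.mean()
    print(f"{mode}: {len(scores)} splits, mean accuracy {mean_scores[mode]:.3f}")
```

Raising `ValueError` for an unknown mode is a design choice of this sketch; a plain dictionary lookup would surface the same mistake as a `KeyError`.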
...@@ -21,5 +21,6 @@ ipdb ...@@ -21,5 +21,6 @@ ipdb
pydub pydub
pluggy pluggy
psgdemos #this contain psgdemos and PySimpleGUI framework psgdemos #this contain psgdemos and PySimpleGUI framework
black #the python code formatter of the project
# playsound # playsound
# tensorflow==2.10.0 # tensorflow==2.10.0
\ No newline at end of file