Commit a3f75315, authored by ms_yan

repair format and type problem in split

Parent 32a72c19
......@@ -613,7 +613,7 @@ class Dataset:
# if we still need more rows, give them to the first split.
# if we have too many rows, remove the extras from the first split that has
# enough rows.
- size_difference = dataset_size - absolute_sizes_sum
+ size_difference = int(dataset_size - absolute_sizes_sum)
if size_difference > 0:
absolute_sizes[0] += size_difference
else:
......@@ -647,10 +647,14 @@ class Dataset:
Datasets of size round(f1*K), round(f2*K), …, round(fn*K) where K is the size of the
original dataset.
If after rounding:
- -Any size equals 0, an error will occur.
- -The sum of split sizes < K, the difference will be added to the first split.
- -The sum of split sizes > K, the difference will be removed from the first large
- enough split such that it will have atleast 1 row after removing the difference.
+ - Any size equals 0, an error will occur.
+ - The sum of split sizes < K, the difference will be added to the first split.
+ - The sum of split sizes > K, the difference will be removed from the first large
+ enough split such that it will have atleast 1 row after removing the difference.
randomize (bool, optional): determines whether or not to split the data randomly (default=True).
If true, the data will be randomly split. Otherwise, each split will be created with
consecutive rows from the dataset.
......@@ -1282,10 +1286,14 @@ class MappableDataset(SourceDataset):
Datasets of size round(f1*K), round(f2*K), …, round(fn*K) where K is the size of the
original dataset.
If after rounding:
- -Any size equals 0, an error will occur.
- -The sum of split sizes < K, the difference will be added to the first split.
- -The sum of split sizes > K, the difference will be removed from the first large
- enough split such that it will have atleast 1 row after removing the difference.
+ - Any size equals 0, an error will occur.
+ - The sum of split sizes < K, the difference will be added to the first split.
+ - The sum of split sizes > K, the difference will be removed from the first large
+ enough split such that it will have atleast 1 row after removing the difference.
randomize (bool, optional): determines whether or not to split the data randomly (default=True).
If true, the data will be randomly split. Otherwise, each split will be created with
consecutive rows from the dataset.
......
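The rounding rules in the docstring can be illustrated with a small standalone sketch. It is not the code from this commit: the function name `absolute_split_sizes` and the `ValueError` are assumptions made for illustration, and only the adjustment logic follows the lines in the first hunk.

```python
def absolute_split_sizes(fractions, dataset_size):
    """Hypothetical helper: turn fractional split sizes into row counts."""
    # round(f * K) for every fraction f, where K is the dataset size.
    sizes = [int(round(f * dataset_size)) for f in fractions]

    # Rule 1: a split that rounds to 0 rows is an error.
    if any(size == 0 for size in sizes):
        raise ValueError("a split size rounded to 0 rows")

    # Cast to int so the adjustment below never introduces a non-integer size.
    size_difference = int(dataset_size - sum(sizes))

    if size_difference > 0:
        # Rule 2: too few rows assigned, give the remainder to the first split.
        sizes[0] += size_difference
    else:
        # Rule 3: too many rows assigned, take the excess from the first split
        # that still keeps at least 1 row afterwards.
        for i, size in enumerate(sizes):
            if size + size_difference > 0:
                sizes[i] += size_difference
                break
    return sizes
```

With Python 3's round-half-to-even behavior, `absolute_split_sizes([0.5, 0.5], 5)` gives `[3, 2]` (both halves round down to 2, and the missing row goes to the first split), while `absolute_split_sizes([0.5, 0.5], 7)` gives `[3, 4]` (both halves round up to 4, and the extra row is removed from the first split).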