Committed by Ning Yu
This method was introduced to improve data redistribution performance during gpexpand phase 2, but benchmark results show that it falls short of our expectations. For example, when expanding a table from 7 segments to 8 segments, the reshuffle method is only 30% faster than the traditional CTAS method; when expanding from 4 to 8 segments, reshuffle is even 10% slower than CTAS. When the table has indexes, reshuffle performance can be worse, and an extra VACUUM is needed to actually free the disk space.

According to our experiments, the bottleneck of the reshuffle method is tuple deletion, which is much slower than the insertion used by CTAS.

The reshuffle method does have some benefits: it requires less extra disk space and less network bandwidth (similar to the CTAS method with the new JCH reduce method, but less than CTAS + MOD). It can also be faster in some cases; however, because we cannot automatically determine when it is faster, it is hard to benefit from it in practice.

On the other hand, the reshuffle method is less tested and may have bugs in corner cases, so it is not production ready yet.

We have therefore decided to retire it entirely for now. We might add it back in the future if we can get rid of the slow deletion or find a reliable way to choose automatically between the reshuffle and CTAS methods.

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8xknWag-SkI/5OsIhZWdDgAJ

Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
1c262c6e