Unverified · Commit f7174966 · Author: xiaoxiao · Committer: GitHub

compatible gpload (#11103)

* refactor gpload test file TEST.py

1. migrate gpload test to pytest
2. add a new function that builds the config file with the yaml package, making it more reasonable
3. add a case to cover the gpload update_condition argument
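
As an illustration of items 2 and 3, here is a minimal sketch of the kind of nested mapping a YAML package would serialize into a gpload control file. The helper name `build_gpload_config`, the database name, and the condition string are hypothetical; this is not the function in TEST.py. The key names follow the documented gpload control file layout.

```python
def build_gpload_config(database, table, data_file, mode="insert",
                        match_columns=None, update_columns=None,
                        update_condition=None, port=8081):
    """Return a nested dict mirroring the layout of a gpload control file."""
    # OUTPUT is a list of single-key mappings, as in the control file.
    output = [{"TABLE": table}, {"MODE": mode}]
    if match_columns:
        output.append({"MATCH_COLUMNS": list(match_columns)})
    if update_columns:
        output.append({"UPDATE_COLUMNS": list(update_columns)})
    if update_condition:
        output.append({"UPDATE_CONDITION": update_condition})
    return {
        "VERSION": "1.0.0.1",
        "DATABASE": database,
        "GPLOAD": {
            "INPUT": [
                {"SOURCE": {"PORT": port, "FILE": [data_file]}},
                {"FORMAT": "text"},
                {"DELIMITER": "|"},
            ],
            "OUTPUT": output,
        },
    }


config = build_gpload_config(
    database="testdb",           # hypothetical database name
    table="texttable",
    data_file="data_file.txt",
    mode="update",
    match_columns=["n1"],
    update_columns=["n2"],
    update_condition="n3 = 32",  # hypothetical condition
)
```

Passing this mapping to a YAML serializer such as PyYAML's `yaml.safe_dump` or ruamel.yaml's `YAML().dump` would produce the control file text; key order and quoting depend on the serializer settings.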

* migrate gpload and TEST.py to python3.6
add new test case 43 to test gpload behavior when a column name has capital letters and no data type
change some ans files since psql reacts differently

* change the SQL that finds a reusable external table to make gpload compatible with gp7 and gp6
improve TEST.py to write the config file with the ruamel.yaml module
Co-authored-by: NXiaoxiaoHe <hxiaoxiao@vmware.com>
Parent 9265ea6a
This diff is collapsed.
......@@ -9,3 +9,4 @@ gpstringsubs.pl
gpdiff.pl
atmsort.pm
explain.pm
data/large_file.csv
import pytest
pytest.register_assert_rewrite('TEST')
from TEST import AnsFile
from TEST import read_diff
import os
def pytest_assertrepr_compare(config, op, left, right):
    # first: fname
    # second: output path
    if op == '==':
        diff = read_diff(os.path.splitext(left.path)[0], "")
        print(diff)
        output = ["query resulted in diff:"]
        return output
\ No newline at end of file
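
To show how a `pytest_assertrepr_compare` hook of this shape behaves, here is a self-contained sketch. The `AnsFile` class and the `_read` helper below are hypothetical stand-ins and do not reproduce the ones in TEST.py; the diff is computed with the standard `difflib` module rather than read from a diff file. The hook returns a list of lines for pytest to show on a failed `==` assertion, or `None` to fall back to the default report.

```python
import difflib


def _read(path):
    """Read a file as a list of lines (hypothetical helper)."""
    with open(path) as handle:
        return handle.readlines()


class AnsFile:
    """Hypothetical stand-in: wraps the path of an expected or actual output file."""

    def __init__(self, path):
        self.path = path

    def __eq__(self, other):
        # Two wrappers compare equal when their files have identical contents.
        return _read(self.path) == _read(other.path)


def pytest_assertrepr_compare(config, op, left, right):
    """Return custom failure lines for `AnsFile == AnsFile`, else None."""
    if op == "==" and isinstance(left, AnsFile) and isinstance(right, AnsFile):
        diff = difflib.unified_diff(
            _read(left.path), _read(right.path),
            fromfile=left.path, tofile=right.path,
        )
        return ["query resulted in diff:"] + [line.rstrip("\n") for line in diff]
    return None
```

Placed in a `conftest.py`, a hook like this is picked up by pytest automatically, so a failing `assert AnsFile(a) == AnsFile(b)` would report the diff lines instead of two object reprs.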
aaa'qwer'shjhjg'2012-06-01 15:30:30'1'732'834567'45.67'789.123'7.12345'123.456789
bbb'twob'shpits'2011-06-01 12:30:30'2'732'834567'45.67'789.123'7.12345'987.654321
fff'twof'shpits'2011-06-01 12:30:30'3'732'834567'45.67'789.123'7.12345'654.321987
fff'twoff'shpits'2011-06-01 12:30:30'3'732'834567'45.67'789.123'7.12345'654.321987
eee'twoe'shpits'2011-06-01 12:30:30'4'732'834567'45.67'789.123'7.12345'145.456789
ggg'twog'shpits'2011-06-01 12:30:30'5'732'834567'45.67'789.123'7.12345'123.222289
iii'twoi'shpits'2011-06-01 12:30:30'6'732'834567'45.67'789.123'7.12345'122.444789
......
2018-11-05 22:52:11|INFO|gpload session started 2018-11-05 22:52:11
2018-11-05 22:52:11|INFO|setting schema 'public' for table 'texttable'
2018-11-05 22:52:11|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt" -t 30
2018-11-05 22:52:11|INFO|did not find a staging table to reuse. creating staging_gpload_reusable_afbaac0da7ced19791c9ab9c537f41d3
2018-11-05 22:52:11|INFO|did not find an external table to reuse. creating ext_gpload_reusable_601e34fe_e10a_11e8_b2e8_00505698a2d7
2018-11-05 22:52:11|ERROR|could not run SQL "create external table ext_gpload_reusable_601e34fe_e10a_11e8_b2e8_00505698a2d7("s1" text,"s2" text,"s3" text,"dt" timestamp without time zone,"n1" smallint,"n2" integer,"n3" bigint,"n4" numeric,"n5" numeric,"n6" real,"n7" double precision)location('gpfdist://127.0.0.1:8081//home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt') format'text' (delimiter '|' null '\N' escape '\') encoding'xxxx' ": ERROR: xxxx is not a valid encoding name
2018-11-05 22:52:11|INFO|rows Inserted = 0
2018-11-05 22:52:11|INFO|rows Updated = 0
2018-11-05 22:52:11|INFO|data formatting errors = 0
2018-11-05 22:52:11|INFO|gpload failed
2018-11-05 22:52:11|INFO|gpload session started 2018-11-05 22:52:11
2018-11-05 22:52:11|INFO|setting schema 'public' for table 'texttable'
2018-11-05 22:52:11|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt" -t 30
2018-11-05 22:52:11|INFO|did not find a staging table to reuse. creating staging_gpload_reusable_afbaac0da7ced19791c9ab9c537f41d3
2018-11-05 22:52:12|INFO|did not find an external table to reuse. creating ext_gpload_reusable_6067ad3c_e10a_11e8_b378_00505698a2d7
2018-11-05 22:52:12|ERROR|could not run SQL "create external table ext_gpload_reusable_6067ad3c_e10a_11e8_b378_00505698a2d7("s1" text,"s2" text,"s3" text,"dt" timestamp without time zone,"n1" smallint,"n2" integer,"n3" bigint,"n4" numeric,"n5" numeric,"n6" real,"n7" double precision)location('gpfdist://127.0.0.1:8081//home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt') format'text' (delimiter '|' null '\N' escape '\') encoding'xxxx' ": ERROR: xxxx is not a valid encoding name
2018-11-05 22:52:12|INFO|rows Inserted = 0
2018-11-05 22:52:12|INFO|rows Updated = 0
2018-11-05 22:52:12|INFO|data formatting errors = 0
2018-11-05 22:52:12|INFO|gpload failed
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 's1' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
2020-10-19 11:09:21|INFO|gpload session started 2020-10-19 11:09:21
2020-10-19 11:09:21|INFO|setting schema 'public' for table 'texttable'
2020-10-19 11:09:21|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt" -t 30
2020-10-19 11:09:21|INFO|did not find a staging table to reuse. creating staging_gpload_reusable_afbaac0da7ced19791c9ab9c537f41d3
2020-10-19 11:09:21|INFO|did not find an external table to reuse. creating ext_gpload_reusable_7bf26200_11b8_11eb_a45c_00505698707d
2020-10-19 11:09:21|ERROR|unexpected error -- backtrace written to log file
2020-10-19 11:09:21|INFO|rows Inserted = 0
2020-10-19 11:09:21|INFO|rows Updated = 0
2020-10-19 11:09:21|INFO|data formatting errors = 0
2020-10-19 11:09:21|INFO|gpload failed
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 's1' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
2020-10-19 11:09:22|INFO|gpload session started 2020-10-19 11:09:22
2020-10-19 11:09:22|INFO|setting schema 'public' for table 'texttable'
2020-10-19 11:09:22|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt" -t 30
2020-10-19 11:09:22|INFO|did not find a staging table to reuse. creating staging_gpload_reusable_afbaac0da7ced19791c9ab9c537f41d3
2020-10-19 11:09:22|INFO|did not find an external table to reuse. creating ext_gpload_reusable_7c2d9dde_11b8_11eb_af8a_00505698707d
2020-10-19 11:09:22|ERROR|unexpected error -- backtrace written to log file
2020-10-19 11:09:22|INFO|rows Inserted = 0
2020-10-19 11:09:22|INFO|rows Updated = 0
2020-10-19 11:09:22|INFO|data formatting errors = 0
2020-10-19 11:09:22|INFO|gpload failed
2020-10-19 10:58:33|INFO|gpload session started 2020-10-19 10:58:33
2020-10-19 10:58:33|INFO|setting schema 'public' for table 'texttable'
2020-10-19 10:58:33|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt" -t 30
2020-10-19 10:58:33|INFO|did not find an external table to reuse. creating ext_gpload_reusable_f96a472c_11b6_11eb_9b79_00505698707d
2020-10-19 10:58:33|INFO|running time: 0.10 seconds
2020-10-19 10:58:33|INFO|rows Inserted = 16
2020-10-19 10:58:33|INFO|rows Updated = 0
2020-10-19 10:58:33|INFO|data formatting errors = 0
2020-10-19 10:58:33|INFO|gpload succeeded
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 's1' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
2020-10-19 10:58:33|INFO|gpload session started 2020-10-19 10:58:33
2020-10-19 10:58:33|INFO|setting schema 'public' for table 'texttable'
2020-10-19 10:58:33|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file2.txt" -t 30
2020-10-19 10:58:33|INFO|did not find a staging table to reuse. creating staging_gpload_reusable_bf2513bc4e5c7466f3cd5abecf21f8f4
2020-10-19 10:58:33|INFO|did not find an external table to reuse. creating ext_gpload_reusable_f99cbec8_11b6_11eb_9e57_00505698707d
2020-10-19 10:58:33|INFO|running time: 0.11 seconds
2020-10-19 10:58:33|INFO|rows Inserted = 0
2020-10-19 10:58:33|INFO|rows Updated = 15
2020-10-19 10:58:33|INFO|data formatting errors = 0
2020-10-19 10:58:33|INFO|gpload succeeded
2020-10-19 14:21:35|INFO|gpload session started 2020-10-19 14:21:35
2020-10-19 14:21:35|INFO|setting schema 'public' for table 'testspecialchar'
2020-10-19 14:21:35|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt" -t 30
2020-10-19 14:21:35|INFO|did not find an external table to reuse. creating ext_gpload_reusable_5682442a_11d3_11eb_912a_00505698707d
2020-10-19 14:21:35|INFO|running time: 0.10 seconds
2020-10-19 14:21:35|INFO|rows Inserted = 8
2020-10-19 14:21:35|INFO|rows Updated = 0
2020-10-19 14:21:35|INFO|data formatting errors = 0
2020-10-19 14:21:35|INFO|gpload succeeded
2020-10-19 14:21:35|INFO|gpload session started 2020-10-19 14:21:35
2020-10-19 14:21:35|INFO|setting schema 'public' for table 'testspecialchar'
2020-10-19 14:21:35|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/workspace/gpdb/gpMgmt/bin/gpload_test/gpload2/data_file.txt" -t 30
2020-10-19 14:21:35|INFO|reusing external table ext_gpload_reusable_5682442a_11d3_11eb_912a_00505698707d
2020-10-19 14:21:35|INFO|running time: 0.08 seconds
2020-10-19 14:21:35|INFO|rows Inserted = 8
2020-10-19 14:21:35|INFO|rows Updated = 0
2020-10-19 14:21:35|INFO|data formatting errors = 0
2020-10-19 14:21:35|INFO|gpload succeeded
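
The expected-output logs above share one line shape: `timestamp|LEVEL|message`, ending with three counters and a success or failure line. As a rough sketch (the function name `summarize_session` and the regular expressions are assumptions for illustration, not part of gpload or TEST.py), a session's summary could be extracted like this:

```python
import re

# timestamp|LEVEL|message, as in the log lines above
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\|(\w+)\|(.*)$")
COUNTER_RE = re.compile(
    r"^(rows Inserted|rows Updated|data formatting errors)\s*=\s*(\d+)$"
)


def summarize_session(lines):
    """Collect counters, errors, and final status from one gpload session log."""
    summary = {"succeeded": None, "errors": []}
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines without the timestamp prefix, e.g. NOTICE:/HINT:
        _, level, message = match.groups()
        counter = COUNTER_RE.match(message)
        if counter:
            summary[counter.group(1)] = int(counter.group(2))
        elif level == "ERROR":
            summary["errors"].append(message)
        elif message == "gpload succeeded":
            summary["succeeded"] = True
        elif message == "gpload failed":
            summary["succeeded"] = False
    return summary


sample = [
    "2020-10-19 10:58:33|INFO|gpload session started 2020-10-19 10:58:33",
    "2020-10-19 10:58:33|INFO|rows Inserted = 0",
    "2020-10-19 10:58:33|INFO|rows Updated = 15",
    "2020-10-19 10:58:33|INFO|data formatting errors = 0",
    "2020-10-19 10:58:33|INFO|gpload succeeded",
]
result = summarize_session(sample)
```

Comparing such summaries instead of raw text would ignore the timestamps and generated table names that change between runs; whether that is appropriate depends on what a given test is meant to check.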