---
id: tcph-test-report
sidebar_position: 7.612
---

# TPC-H Benchmark Report

## 1. Overview

To quickly gauge the gap between StoneDB and ClickHouse, a widely used product in the industry, and to understand StoneDB's performance level as thoroughly as possible, we ran a detailed stress test on both products with a TPC-H 100 GB dataset. The reference material is tpc-h_v3.0.0; both the test procedure and the calculation of the performance results follow the standards required by the official TPC-H technical document tpc-h_v3.0.0.

### 1. About the TPC-H benchmark

The TPC Benchmark™ H (TPC-H) is a decision support benchmark. It consists of a suite of business-oriented ad-hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance while remaining easy enough to implement. The benchmark illustrates decision support systems that:

- Examine large volumes of data
- Execute queries with a high degree of complexity
- Give answers to critical business questions

TPC-H evaluates the performance of various decision support systems by executing a set of queries against a standard database under controlled conditions. The TPC-H queries:

- Give answers to real-world business questions
- Simulate generated ad-hoc queries
- Are far more complex than most OLTP transactions
- Include a rich breadth of operators and selectivity constraints
- Generate intensive activity on the database server component of the system under test
- Are executed against a database complying with specific population and scaling requirements
- Are implemented with constraints derived from staying closely synchronized with an on-line production database

### 2. About StoneDB

StoneDB, independently designed and developed by StoneAtom, is the first open-source MySQL-based HTAP (Hybrid Transactional and Analytical Processing) database in China. It is a converged distributed database that supports both online transaction processing and online analytical processing, and allows seamless switching from MySQL. StoneDB offers ultra-high performance, financial-grade high availability, real-time analytics, a distributed architecture, and compatibility with the MySQL 5.6 and 5.7 protocols and the MySQL ecosystem. Its goal is to use a pluggable approach to give MySQL the analytical capability it lacks, providing users with a one-stop OLTP (Online Transactional Processing), OLAP (Online Analytical Processing), and HTAP solution.

### 3. About ClickHouse

ClickHouse is a fully column-oriented database management system. It allows tables and databases to be created, data to be loaded, and queries to be run at runtime without reconfiguring or restarting the server. It scales linearly, is simple to operate, highly reliable, and fault tolerant. It uses local attached storage, has been deployed at fairly large scale in production, exposes a direct SQL interface, and ships with a rich set of native clients.

## 2. Test environment

The following table lists the environment used for this performance test.

| **Product** | **Topology** | **CPU** | **Memory** | **Version** |
| --- | --- | --- | --- | --- |
| StoneDB | Single node | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz * 64 | 256 GB | 5.7_0.1 |
| ClickHouse | Single node | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz * 64 | 256 GB | 22.5.1.2079 |

## 3. Test conclusions

With the same 100 GB dataset, identical hardware configuration, and identical query statements, **the two products were compared using the 22 SQL statements of the TPC-H benchmark.** The aggregated test data leads to the following conclusions:

| **Dimension** | **StoneDB** | **ClickHouse** | **StoneDB advantage over ClickHouse (x)** | **ClickHouse advantage over StoneDB (x)** | **Note** |
| --- | --- | --- | --- | --- | --- |
| Total query time | 3387.86 s | 11537.41 s | **3.4** | - | Lower is better |
| CPU (max) | 5.27% | 50.24% | **9.5** | - | Lower is better |
| Memory (max) | 90.36 GB | 123.8 GB | **1.37** | - | Lower is better |
| Write IOPS (max) | 235.65 io/s | 11.38 io/s | **20.70** | - | Higher is better |
| Read IOPS (max) | 138.37 io/s | 29.3 io/s | **4.72** | - | Higher is better |
| System Load (1 m max) | 8.77 | 34.96 | **3.99** | - | Lower is better |
| System Load (5 m max) | 3.30 | 32.85 | **9.95** | - | Lower is better |
| System Load (15 m max) | 2.08 | 32.20 | **15.48** | - | Lower is better |
| Query success rate | 95% | 82% | **1.17** | - | Higher is better |
| Disk usage | 59 GB | 42 GB | **-** | **1.4** | Lower is better |

![](./stonedb-vs-clickhouse.png)

## 4. Detailed test results

**Note:** each SQL statement was executed three times; the average of the three runs is used for the comparison. The measured query times (in seconds) for StoneDB and ClickHouse are recorded below.

Note: under the same configuration, some queries timed out; they are excluded from the totals (Q1, Q9, Q16, Q19, Q22).

| SQL | StoneDB | ClickHouse |
| --- | --- | --- |
| Q2 | 33.09 | 6859.13 |
| Q3 | 88.22 | 178.56 |
| Q4 | 75.92 | 176.19 |
| Q5 | 168.75 | 342.97 |
| Q6 | 7.90 | 0.26 |
| Q7 | 390.29 | 290.51 |
| Q8 | 108.98 | 1687.23 |
| Q10 | 108.57 | 54.43 |
| Q11 | 26.92 | 35.98 |
| Q12 | 49.41 | 2.44 |
| Q13 | 249.84 | 107.73 |
| Q14 | 29.58 | 6.39 |
| Q15 | 40.90 | 0.73 |
| Q17 | 209.45 | 8.60 |
| Q18 | 1101.70 | 518.54 |
| Q20 | 130.90 | 27.51 |
| Q21 | 567.46 | 1240.18 |
| **Total (s)** | 3387.86 | 11537.41 |

## 5. Test procedure

1. Install the products under test and configure the environment
2. Build the schemas
   - StoneDB
   - ClickHouse
3. Initialize the data
   - Generate the data
   - Import the data
   - Verify the data
4. Write the test execution scripts
5. 
Run the TPC-H test suite
   - Execute the 22 queries once per round

The parameter configurations for StoneDB and ClickHouse are listed below.

- StoneDB settings

```shell
root@stoneatom-02:~# cat /stonedb/install/stonedb.cnf
[client]
port = 3306
socket = /stonedb/install/tmp/mysql.sock

[mysqld]
port = 3306
basedir = /stonedb/install/
character-sets-dir = /stonedb/install/share/charsets/
lc-messages-dir = /stonedb/install/share/
plugin_dir = /stonedb/install/lib/plugin/
tmpdir = /stonedb/install/tmp/
socket = /stonedb/install/tmp/mysql.sock
datadir = /stonedb/install/data/
pid-file = /stonedb/install/data/mysqld.pid
log-error = /stonedb/install/log/mysqld.log
lc-messages-dir = /stonedb/install/share/english/
local-infile
stonedb_force_hashjoin=1
stonedb_join_parallel=0
stonedb_parallel_mapjoin=1
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
init_connect='SET NAMES utf8mb4'
#log-basename=mysqld
#debug-no-sync

# Retry bind as this may fail on busy server
#port-open-timeout=10
#plugin-load-add=ha_connect.so
max_connections=1000
explicit_defaults_for_timestamp = true
sql_mode='STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'
tmp_table_size = 1024M
back_log = 130
#query_cache_size = 218M
#query_cache_type = 0
concurrent_insert=2
#deadlock_search_depth_short = 3
#deadlock_search_depth_long = 10
#deadlock_timeout_long = 10000000
#deadlock_timeout_short = 5000
slow-query-log = 1
slow_query_log_file=/stonedb/install/log/slow.log

#binlog config
long_query_time=5
max_binlog_size=1024M
sync_binlog=0
#log-bin=/stonedb/install/binlog/binlog
expire_logs_days=1
#log_bin_compress=1
#log_bin_compress_min_len=256
binlog_format=statement
binlog_cache_size = 524288

wait_timeout=3600
interactive_timeout=3600
connect_timeout=360
net_read_timeout=360
net_write_timeout=360
lock_wait_timeout=120
slave-net-timeout=120

skip-external-locking
skip-grant-tables
loose-key_buffer_size = 512M
max_allowed_packet = 512M
loose-sort_buffer_size = 4M
loose-read_buffer_size = 4M
loose-read_rnd_buffer_size = 16M
loose-thread_cache_size = 8
loose-query_cache_size = 0

# Try number of CPU's*2 for thread_concurrency
#thread_concurrency = 8
thread_stack = 512K
lower_case_table_names=0
group_concat_max_len=512
open_files_limit = 65535
server-id = 1

# Uncomment the following if you are using innodb tables
loose-innodb_data_home_dir = /stonedb/install/data/
loose-innodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextend
loose-innodb_buffer_pool_size= 512M
loose-innodb_lru_scan_depth= 100
loose-innodb_write_io_threads= 2
loose-innodb_read_io_threads= 2
loose-innodb_log_buffer_size= 1M
loose-innodb_log_file_size= 1024M
loose-innodb_log_files_in_group= 2
loose-innodb_log_group_home_dir = /stonedb/install/redolog/
loose-innodb-stats-persistent= OFF
loose-innodb_lock_wait_timeout = 50
loose-innodb_flush_method = O_DIRECT
loose-innodb_io_capacity = 500
loose-innodb_buffer_pool_dump_pct = 40
loose-innodb_print_all_deadlocks = 1
loose-innodb_undo_directory = /stonedb/install/undolog/
loose-innodb_undo_log_truncate = 1
loose-innodb_undo_tablespaces = 3
loose-innodb_undo_logs = 128

# MAINTAINER:
# the loose- syntax is to make sure the cnf file is also
# valid when building without the performance schema.
# Run tests with the performance schema instrumentation
loose-enable-performance-schema

# Run tests with a small number of instrumented objects
# to limit memory consumption with MTR
loose-performance-schema-accounts-size=100
loose-performance-schema-digests-size=200
loose-performance-schema-hosts-size=100
loose-performance-schema-users-size=100
loose-performance-schema-max-mutex-instances=5000
loose-performance-schema-max-rwlock-instances=5000
loose-performance-schema-max-cond-instances=1000
loose-performance-schema-max-file-instances=10000
loose-performance-schema-max-socket-instances=1000
loose-performance-schema-max-table-instances=500
loose-performance-schema-max-table-handles=1000
loose-performance-schema-events-waits-history-size=10
loose-performance-schema-events-waits-history-long-size=10000
loose-performance-schema-events-stages-history-size=10
loose-performance-schema-events-stages-history-long-size=1000
loose-performance-schema-events-statements-history-size=10
loose-performance-schema-events-statements-history-long-size=1000
loose-performance-schema-max-thread-instances=200
loose-performance-schema-session-connect-attrs-size=2048

# Enable everything, for maximum code exposure during testing
loose-performance-schema-instrument='%=ON'
loose-performance-schema-consumer-events-stages-current=ON
loose-performance-schema-consumer-events-stages-history=ON
loose-performance-schema-consumer-events-stages-history-long=ON
loose-performance-schema-consumer-events-statements-current=ON
loose-performance-schema-consumer-events-statements-history=ON
loose-performance-schema-consumer-events-statements-history-long=ON
loose-performance-schema-consumer-events-waits-current=ON
loose-performance-schema-consumer-events-waits-history=ON
loose-performance-schema-consumer-events-waits-history-long=ON
loose-performance-schema-consumer-global-instrumentation=ON
loose-performance-schema-consumer-thread-instrumentation=ON
binlog-direct-non-transactional-updates
default-storage-engine=MyISAM
#use_stat_tables=preferably

# here, at the end of [mysqld] group mtr will automatically disable
# all optional plugins.

[embedded]
# mtr automatically adds [embedded] group at the end and copies [mysqld]
# and [mysqld.1] groups into it.
# but we want [server] group to be after [mysqld] (and its copies).
# create a non-empty [embedded] group here, to force it before [server]
local-infile

[server]
# Aria is optional, but it must be enabled if it's used for temporary
# tables. Let's enable it in the [server] group, because this group
# is read after [mysqld] and [embedded]
#loose-loog

[connection]
default-character-set=utf8mb4

[mysqldump]
quick
max_allowed_packet = 512M

[mysql]
no-auto-rehash
# Remove the next comment character if you are not familiar with SQL
#safe-updates
no-auto-rehash
max_allowed_packet=128M
prompt='\u@\h [\d]> '
default_character_set=utf8

[isamchk]
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout
```

- ClickHouse settings in etc/clickhouse-server/config.xml

```xml
<!-- The original config.xml dump lost its XML markup during document
     conversion. The non-default values recoverable from the residue are
     reconstructed below; everything else in the dump (system log tables,
     remote server samples, ZooKeeper, graphite rollup, crash reporting)
     appears to match the settings shipped with ClickHouse 22.5 and is
     omitted here. -->
<clickhouse>
    <logger>
        <level>trace</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <mysql_port>9004</mysql_port>
    <postgresql_port>9005</postgresql_port>
    <interserver_http_port>9009</interserver_http_port>
    <listen_host>::</listen_host>
    <max_connections>4096</max_connections>
    <path>/data/clickhouse/</path>
    <tmp_path>/data/clickhouse/tmp/</tmp_path>
    <user_files_path>/data/clickhouse/user_files/</user_files_path>
    <users_config>users.xml</users_config>
    <access_control_path>/data/clickhouse/access/</access_control_path>
    <default_profile>default</default_profile>
    <default_database>default</default_database>
    <format_schema_path>/data/clickhouse/format_schemas/</format_schema_path>
</clickhouse>
```

### 1. Build the schemas

Create the table structures for StoneDB and ClickHouse. Because the two engines use different DDL syntax, the CREATE TABLE statements have to be adapted for each. The modified statements are listed in [Create Table](#zC4TU).

### 2. Initialize the data

**Generate the data**

The performance test uses 100 GB of TPC-H data as test data, generated with the standard DBGEN tool. Download the official TPC-H data generator DBGEN from the TPC website and compile it to produce the dbgen binary.

```shell
./dbgen -s $scale -C $chunks -S $i -f
# -s: scale factor
# -C: total number of chunks
# -S: which chunk this invocation generates
```

**Import the data**

Import these five tables first: part, region, nation, customer, and supplier; the import SQL is provided in the test attachments. The lineitem, orders, and partsupp tables are large, so it is recommended to split their data files with a script before importing; a splitting script is also included in the test attachments.

### 3. Generate the query SQL scripts

Generate the initial SQL with the following script:

```bash
#!/usr/bin/bash
db_type=$1
for i in {1..22}
do
  ./qgen -d $i -s 100 > ./$db_type/$db_type"$i".sql
done
```

Because the engines differ slightly in syntax, the 22 queries have to be modified, leaving their logic unchanged. The modified statements are listed in [Query](#fVQw3).

The attachments can be downloaded here if you want to reproduce the test:

[q.zip](https://stoneatom.yuque.com/attachments/yuque/0/2022/zip/483217/1655258123852-71189c0e-25ec-4a15-a1b0-305f56a60c30.zip)

### 4. Benchmark scripts

Before each test the database is restarted to clear cached data. Three rounds are executed back to back and the average of the three results is taken. The following scripts drive the test:

```bash
#!/bin/bash
# systemctl start clickhouse-server
for (( i=1; i<=3; i=i+1 ))
do
  ./tpch_benchmark.sh
  sleep 300
done
# systemctl stop clickhouse-server
```

```bash
#!/bin/bash
# StoneDB
# host=
# port=
# user=
# password=
# database=
# ClickHouse
host=
port=
database=
# Absolute path of the query files; change it to your own path
banchdir=/path/tpc-h/queries
db_type=ck  # ck, stone, mysql
resfile=$banchdir/$db_type/"TPCH_${db_type}_`date "+%Y%m%d%H%M%S"`"
#igqure=(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22)
# For ClickHouse, skip Q9, Q16, Q19 and Q22: they run for too long
igqure=(9 16 19 22)
echo "start test run at "`date "+%Y-%m-%d %H:%M:%S"`|tee -a ${resfile}.out
echo "log file: ${resfile}"
for (( i=1; i<=22; i=i+1 ))
do
  for j in ${igqure[@]}
  do
    if [[ $i -eq $j ]]; then
      echo "pass query$i.sql"
      continue 2
    fi
  done
  # if (( $i==13 || $i==15 )); then
  #   continue
  # fi
  queryfile=${db_type}${i}".sql"
  echo "run query ${i}"|tee -a ${resfile}.out
  # For ClickHouse use this
  echo " $database $banchdir/query$i.sql " #|tee -a ${resfile}.out
  start_time=`date "+%s.%N"`
  clickhouse-client -h $host --port $port -d $database < $banchdir/query$i.sql |tee -a ${resfile}.out
  # For MySQL and StoneDB use this instead
  # mysql -u$user -p$password -h$host -P$port $database -e "source $banchdir/query$i.sql" 2>&1 |tee -a ${resfile}.out
  end_time=`date "+%s.%N"`
  start_s=${start_time%.*}
  start_nanos=${start_time#*.}
  end_s=${end_time%.*}
  end_nanos=${end_time#*.}
  # Borrow one second when the nanosecond part underflows
  if [ "$end_nanos" -lt "$start_nanos" ]; then
    end_s=$(( 10#$end_s - 1 ))
    end_nanos=$(( 10#$end_nanos + 10**9 ))
  fi
  time=$(( 10#$end_s - 10#$start_s )).`printf "%03d\n" $(( (10#$end_nanos - 10#$start_nanos)/10**6 ))`
  echo ${queryfile} "the $i run cost "${time}" second start at "`date -d @$start_time "+%Y-%m-%d %H:%M:%S"`" stop at "`date -d @$end_time "+%Y-%m-%d %H:%M:%S"` >> ${resfile}.time
  # systemctl stop clickhouse-server
done
```

Run the scripts:

```shell
nohup ./start_benchmark.sh &
```

## 6. Data size

Total disk usage (StoneDB): 59 GB

Total disk usage (ClickHouse): 42 GB

## 7. Test method

- The TPC-H query set is executed in sequence using the TPC-H scripts.
- Three rounds are run back to back, and the average of the three results is taken.

## 8. Test strategy

1. Deploy StoneDB on the server
2. Download ClickHouse and deploy it on a server with the same configuration as StoneDB
3. Download and set up the TPC-H tooling, and generate 100 GB of test data with DBGEN
4. Adapt the Create and Query SQL scripts so that both engines can execute them
5. Run the test:
   - Connect to ClickHouse and StoneDB and execute the adapted TPC-H CREATE TABLE statements
   - Import the 100 GB dataset into ClickHouse and StoneDB
   - Run the 22 queries against ClickHouse and StoneDB and compute the average elapsed time over the 3 rounds
   - Compare the query times and hardware resource usage of ClickHouse and StoneDB across the 22 queries
6. From the recorded data, compute the core performance metrics to evaluate the performance difference between the products
7. 
Finish the test report and pass review

## 9. OS resource usage comparison

StoneDB clearly outperforms ClickHouse in I/O consumption, CPU usage, system load, and IOPS.

### CPU

**StoneDB**

![stonedb-cpu](./stonedb-cpu.png)

**ClickHouse**

![clickhouse-cpu.png](./clickhouse-cpu.png)

### Disk IOPS

**StoneDB**

![stonedb-disk-iops.png](./stonedb-disk-iops.png)

**ClickHouse**

![clickhouse-disk-iops.png](./clickhouse-disk-iops.png)

### Memory

**StoneDB**

![stonedb-memory.png](./stonedb-memory.png)

**ClickHouse**

![click-memory.png](./click-memory.png)

### System Load

**StoneDB**

![stonedb-system-load.png](./stonedb-system-load.png)

**ClickHouse**

![clickhouse-system-load.png](./clickhouse-system-load.png)

## 10. Test scripts

### Create Table

#### ClickHouse

```sql
create table nation (
    n_nationkey bigint,
    n_name      char(25),
    n_regionkey bigint,
    n_comment   varchar(152)
) engine=MergeTree order by (n_name, n_regionkey);

create table region (
    r_regionkey bigint,
    r_name      char(25),
    r_comment   varchar(152)
) engine=MergeTree order by (r_name);

create table part (
    p_partkey     bigint,
    p_name        varchar(55),
    p_mfgr        char(25),
    p_brand       char(10),
    p_type        varchar(25),
    p_size        bigint,
    p_container   char(10),
    p_retailprice decimal(15,2),
    p_comment     varchar(23)
) engine=MergeTree order by (p_name, p_mfgr);

create table supplier (
    s_suppkey   bigint,
    s_name      char(25),
    s_address   varchar(40),
    s_nationkey bigint,
    s_phone     char(15),
    s_acctbal   decimal(15,2),
    s_comment   varchar(101)
) engine=MergeTree order by (s_suppkey, s_name);

create table partsupp (
    ps_partkey    bigint,
    ps_suppkey    bigint,
    ps_availqty   bigint,
    ps_supplycost decimal(15,2),
    ps_comment    varchar(199)
) engine=MergeTree order by (ps_partkey, ps_suppkey);

create table customer (
    c_custkey    bigint,
    c_name       varchar(25),
    c_address    varchar(40),
    c_nationkey  bigint,
    c_phone      char(15),
    c_acctbal    decimal(15,2),
    c_mktsegment char(10),
    c_comment    varchar(117)
) engine=MergeTree order by (c_custkey, c_name);

create table orders (
    o_orderkey      bigint,
    o_custkey       bigint,
    o_orderstatus   char(1),
    o_totalprice    decimal(15,2),
    o_orderdate     date,
    o_orderpriority char(15),
    o_clerk         char(15),
    o_shippriority  bigint,
    o_comment       varchar(79)
) engine=MergeTree order by (o_orderkey, o_custkey);

create table lineitem (
    l_orderkey      bigint,
    l_partkey       bigint,
    l_suppkey       bigint,
    l_linenumber    bigint,
    l_quantity      decimal(15,2),
    l_extendedprice decimal(15,2),
    l_discount      decimal(15,2),
    l_tax           decimal(15,2),
    l_returnflag    char(1),
    l_linestatus    char(1),
    l_shipdate      date,
    l_commitdate    date,
    l_receiptdate   date,
    l_shipinstruct  char(25),
    l_shipmode      char(10),
    l_comment       varchar(44)
) engine=MergeTree order by (l_shipdate, l_returnflag, l_linestatus);
```

#### StoneDB

```sql
create table nation (
    n_nationkey integer not null,
    n_name      char(25) not null,
    n_regionkey integer not null,
    n_comment   varchar(152),
    primary key (n_nationkey)
) engine=STONEDB;

create table region (
    r_regionkey integer not null,
    r_name      char(25) not null,
    r_comment   varchar(152),
    primary key (r_regionkey)
) engine=STONEDB;

create table part (
    p_partkey     integer not null,
    p_name        varchar(55) not null,
    p_mfgr        char(25) not null,
    p_brand       char(10) not null,
    p_type        varchar(25) not null,
    p_size        integer not null,
    p_container   char(10) not null,
    p_retailprice decimal(15,2) not null,
    p_comment     varchar(23) not null,
    primary key (p_partkey)
) engine=STONEDB;

create table supplier (
    s_suppkey   integer not null,
    s_name      char(25) not null,
    s_address   varchar(40) not null,
    s_nationkey integer not null,
    s_phone     char(15) not null,
    s_acctbal   decimal(15,2) not null,
    s_comment   varchar(101) not null,
    primary key (s_suppkey)
) engine=STONEDB;

create table partsupp (
    ps_partkey    integer not null,
    ps_suppkey    integer not null,
    ps_availqty   integer not null,
    ps_supplycost decimal(15,2) not null,
    ps_comment    varchar(199) not null,
    primary key (ps_partkey, ps_suppkey)
) engine=STONEDB;

create table customer (
    c_custkey    integer not null,
    c_name       varchar(25) not null,
    c_address    varchar(40) not null,
    c_nationkey  integer not null,
    c_phone      char(15) not null,
    c_acctbal    decimal(15,2) not null,
    c_mktsegment char(10) not null,
    c_comment    varchar(117)
not null,
    primary key (c_custkey)
) engine=STONEDB;

create table orders (
    o_orderkey      integer not null,
    o_custkey       integer not null,
    o_orderstatus   char(1) not null,
    o_totalprice    decimal(15,2) not null,
    o_orderdate     date not null,
    o_orderpriority char(15) not null,
    o_clerk         char(15) not null,
    o_shippriority  integer not null,
    o_comment       varchar(79) not null,
    primary key (o_orderkey)
) engine=STONEDB;

create table lineitem (
    l_orderkey      integer not null,
    l_partkey       integer not null,
    l_suppkey       integer not null,
    l_linenumber    integer not null,
    l_quantity      decimal(15,2) not null,
    l_extendedprice decimal(15,2) not null,
    l_discount      decimal(15,2) not null,
    l_tax           decimal(15,2) not null,
    l_returnflag    char(1) not null,
    l_linestatus    char(1) not null,
    l_shipdate      date not null,
    l_commitdate    date not null,
    l_receiptdate   date not null,
    l_shipinstruct  char(25) not null,
    l_shipmode      char(10) not null,
    l_comment       varchar(44) not null,
    primary key (l_orderkey, l_linenumber)
) engine=STONEDB;
```

### Query

SQL1

```sql
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    -- l_shipdate <= date '1998-12-01' - interval '90' day (3)
    l_shipdate <= date '1998-12-01' - interval '90' day
group by
    l_returnflag,
    l_linestatus
order by
    l_returnflag,
    l_linestatus;
```

SQL2

```sql
select
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
from
    part,
    supplier,
    partsupp,
    nation,
    region
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = 15
    and p_type like '%BRASS'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'EUROPE'
    and ps_supplycost = (
        select
            min(ps_supplycost)
        from
            partsupp,
            supplier,
            nation,
            region,
            part
        where
            p_partkey = ps_partkey
            and
            s_suppkey = ps_suppkey
            and s_nationkey = n_nationkey
            and n_regionkey = r_regionkey
            and r_name = 'EUROPE'
    )
order by
    s_acctbal desc,
    n_name,
    s_name,
    p_partkey
limit 100;
```

SQL3

```sql
select
    l_orderkey,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    o_orderdate,
    o_shippriority
from
    customer,
    orders,
    lineitem
where
    c_mktsegment = 'BUILDING'
    and c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and o_orderdate < date '1995-03-15'
    and l_shipdate > date '1995-03-15'
group by
    l_orderkey,
    o_orderdate,
    o_shippriority
order by
    revenue desc,
    o_orderdate
limit 10;
```

SQL4

```sql
select
    o_orderpriority,
    count(*) as order_count
from
    orders,
    lineitem
where
    l_orderkey = o_orderkey
    and o_orderdate >= date '1993-07-01'
    and o_orderdate < date '1993-07-01' + interval '3' month
    and l_commitdate < l_receiptdate
    and o_orderkey <> l_orderkey
group by
    o_orderpriority
order by
    o_orderpriority;
```

SQL5

```sql
-- using default substitutions
select
    n_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue
from
    customer,
    orders,
    lineitem,
    supplier,
    nation,
    region
where
    c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and l_suppkey = s_suppkey
    and c_nationkey = s_nationkey
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'ASIA'
    and o_orderdate >= date '1994-01-01'
    and o_orderdate < date '1994-01-01' + interval '1' year
group by
    n_name
order by
    revenue desc;
```

SQL6

```sql
-- using default substitutions
select
    sum(l_extendedprice * l_discount) as revenue
from
    lineitem
where
    l_shipdate >= date '1994-01-01'
    and l_shipdate < date '1994-01-01' + interval '1' year
    and l_discount between 0.06 - 0.01 and 0.06 + 0.01
    and l_quantity < 24;
```

SQL7

```sql
-- using default substitutions
select
    supp_nation,
    cust_nation,
    l_year,
    sum(volume) as revenue
from
    (
        select
            n1.n_name as supp_nation,
            n2.n_name as cust_nation,
            extract(year from l_shipdate) as l_year,
            l_extendedprice * (1 - l_discount) as volume
        from
            supplier,
            lineitem,
            orders,
            customer,
            nation n1,
            nation n2
        where
            s_suppkey = l_suppkey
            and o_orderkey = l_orderkey
            and c_custkey = o_custkey
            and s_nationkey = n1.n_nationkey
            and c_nationkey = n2.n_nationkey
            and (
                (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY')
                or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE')
            )
            and l_shipdate between date '1995-01-01' and date '1996-12-31'
    ) as shipping
group by
    supp_nation,
    cust_nation,
    l_year
order by
    supp_nation,
    cust_nation,
    l_year;
```

SQL8

```sql
-- using default substitutions
select
    o_year,
    sum(case
        when nation = 'BRAZIL' then volume
        else 0
    end) / sum(volume) as mkt_share
from
    (
        select
            extract(year from o_orderdate) as o_year,
            l_extendedprice * (1 - l_discount) as volume,
            n2.n_name as nation
        from
            part,
            supplier,
            lineitem,
            orders,
            customer,
            nation n1,
            nation n2,
            region
        where
            p_partkey = l_partkey
            and s_suppkey = l_suppkey
            and l_orderkey = o_orderkey
            and o_custkey = c_custkey
            and c_nationkey = n1.n_nationkey
            and n1.n_regionkey = r_regionkey
            and r_name = 'AMERICA'
            and s_nationkey = n2.n_nationkey
            and o_orderdate between date '1995-01-01' and date '1996-12-31'
            and p_type = 'ECONOMY ANODIZED STEEL'
    ) as all_nations
group by
    o_year
order by
    o_year;
```

SQL9

```sql
-- using default substitutions
select
    nation,
    o_year,
    sum(amount) as sum_profit
from
    (
        select
            n_name as nation,
            extract(year from o_orderdate) as o_year,
            l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
        from
            part,
            supplier,
            lineitem,
            partsupp,
            orders,
            nation
        where
            s_suppkey = l_suppkey
            and ps_suppkey = l_suppkey
            and ps_partkey = l_partkey
            and p_partkey = l_partkey
            and o_orderkey = l_orderkey
            and s_nationkey = n_nationkey
            and p_name like '%green%'
    ) as profit
group by
    nation,
    o_year
order by
    nation,
    o_year desc;
```

SQL10

```sql
-- using default substitutions
select
    c_custkey,
    c_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    c_acctbal,
    n_name,
    c_address,
    c_phone,
    c_comment
from
    customer,
    orders,
    lineitem,
    nation
where
    c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and o_orderdate >= date
        '1993-10-01'
    and o_orderdate < date '1993-10-01' + interval '3' month
    and l_returnflag = 'R'
    and c_nationkey = n_nationkey
group by
    c_custkey,
    c_name,
    c_acctbal,
    c_phone,
    n_name,
    c_address,
    c_comment
order by
    revenue desc
limit 20;
```

SQL11

```sql
-- using default substitutions
select
    ps_partkey,
    sum(ps_supplycost * ps_availqty) as value
from
    partsupp,
    supplier,
    nation
where
    ps_suppkey = s_suppkey
    and s_nationkey = n_nationkey
    and n_name = 'GERMANY'
group by
    ps_partkey
having
    sum(ps_supplycost * ps_availqty) > (
        select
            sum(ps_supplycost * ps_availqty) * 0.0000010000
        from
            partsupp,
            supplier,
            nation
        where
            ps_suppkey = s_suppkey
            and s_nationkey = n_nationkey
            and n_name = 'GERMANY'
    )
order by
    value desc;
```

SQL12

```sql
-- using default substitutions
select
    l_shipmode,
    sum(case
        when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1
        else 0
    end) as high_line_count,
    sum(case
        when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1
        else 0
    end) as low_line_count
from
    orders,
    lineitem
where
    o_orderkey = l_orderkey
    and l_shipmode in ('MAIL', 'SHIP')
    and l_commitdate < l_receiptdate
    and l_shipdate < l_commitdate
    and l_receiptdate >= date '1994-01-01'
    and l_receiptdate < date '1994-01-01' + interval '1' year
group by
    l_shipmode
order by
    l_shipmode;
```

SQL13

```sql
-- using default substitutions
select
    c_count,
    count(*) as custdist
from
    (
        select
            c_custkey as c_custkey,
            count(o_orderkey) as c_count
        from
            customer left outer join orders
                on c_custkey = o_custkey
                and o_comment not like '%special%requests%'
        group by
            c_custkey
    ) as c_orders
group by
    c_count
order by
    custdist desc,
    c_count desc;
```

SQL14

```sql
-- using default substitutions
select
    100.00 * sum(case
        when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount)
        else 0
    end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
    lineitem,
    part
where
    l_partkey = p_partkey
    and l_shipdate >= date '1995-09-01'
    and l_shipdate < date '1995-09-01' + interval '1' month;
```

SQL15
```sql
-- using default substitutions
-- ClickHouse needs the view created in advance, otherwise the query fails with:
-- failed at position 518 (end of query) (line 23, col 11): ;
-- CREATE VIEW revenue0 AS SELECT l_suppkey AS supplier_no, sum(l_extendedprice * (1 - l_discount)) AS total_revenue FROM lineitem WHERE l_shipdate >= DATE '1996-01-01' AND l_shipdate < DATE '1996-04-01' GROUP BY l_suppkey;
select
    s_suppkey,
    s_name,
    s_address,
    s_phone,
    total_revenue
from
    supplier,
    revenue0
where
    s_suppkey = supplier_no
    and total_revenue = (
        select
            max(total_revenue)
        from
            revenue0
    )
order by
    s_suppkey;
```

SQL16

```sql
-- using default substitutions
select
    p_brand,
    p_type,
    p_size,
    count(distinct ps_suppkey) as supplier_cnt
from
    partsupp,
    part,
    supplier
where
    p_partkey = ps_partkey
    and p_brand <> 'Brand#45'
    and p_type not like 'MEDIUM POLISHED%'
    and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
    and ps_suppkey <> s_suppkey
    and s_comment like '%Customer%Complaints%'
group by
    p_brand,
    p_type,
    p_size
order by
    supplier_cnt desc,
    p_brand,
    p_type,
    p_size;
```

SQL17

```sql
-- using default substitutions
select
    sum(l_extendedprice) / 7.0 as avg_yearly
from
    lineitem,
    part
where
    p_partkey = l_partkey
    and p_brand = 'Brand#23'
    and p_container = 'MED BOX'
    and l_quantity < (
        select
            0.2 * avg(l_quantity)
        from
            lineitem,
            part
        where
            l_partkey = p_partkey
    );
```

SQL18

```sql
-- using default substitutions
select
    c_name,
    c_custkey,
    o_orderkey,
    o_orderdate,
    o_totalprice,
    sum(l_quantity)
from
    customer,
    orders,
    lineitem
where
    o_orderkey = l_orderkey
    and c_custkey = o_custkey
    and o_orderkey = l_orderkey
group by
    c_name,
    c_custkey,
    o_orderkey,
    o_orderdate,
    o_totalprice,
    l_orderkey
having
    sum(l_quantity) > 300
order by
    o_totalprice desc,
    o_orderdate
limit 100;
```

SQL19

```sql
-- using default substitutions
select
    sum(l_extendedprice * (1 - l_discount)) as revenue
from
    lineitem,
    part
where
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#12'
        and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
        and l_quantity >= 1 and l_quantity <= 1 + 10
        and p_size between 1 and 5
        and
        l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    )
    or
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#23'
        and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
        and l_quantity >= 10 and l_quantity <= 10 + 10
        and p_size between 1 and 10
        and l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    )
    or
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#34'
        and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
        and l_quantity >= 20 and l_quantity <= 20 + 10
        and p_size between 1 and 15
        and l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    )
limit 1;
```

SQL20

```sql
-- using default substitutions
select
    s_name,
    s_address
from
    supplier,
    nation
where
    s_suppkey in (
        select
            ps_suppkey
        from
            partsupp
        where
            ps_partkey in (
                select
                    p_partkey
                from
                    part
                where
                    p_name like 'forest%'
            )
            and ps_availqty > (
                select
                    0.5 * sum(l_quantity)
                from
                    lineitem,
                    partsupp
                where
                    l_partkey = ps_partkey
                    and l_suppkey = ps_suppkey
                    and l_shipdate >= date '1994-01-01'
                    and l_shipdate < date '1994-01-01' + interval '1' year
            )
    )
    and s_nationkey = n_nationkey
    and n_name = 'CANADA'
order by
    s_name;
```

SQL21

```sql
-- using default substitutions
select
    s_name,
    count(*) as numwait
from
    supplier,
    lineitem l1,
    orders,
    nation,
    lineitem l2,
    lineitem l3
where
    s_suppkey = l1.l_suppkey
    and o_orderkey = l1.l_orderkey
    and o_orderstatus = 'F'
    and l1.l_receiptdate > l1.l_commitdate
    and l1.l_orderkey = l2.l_orderkey
    and l2.l_orderkey = l1.l_orderkey
    and l2.l_suppkey <> l1.l_suppkey
    and l1.l_orderkey <> l3.l_orderkey
    and l3.l_orderkey = l1.l_orderkey
    and l3.l_suppkey <> l1.l_suppkey
    and l3.l_receiptdate > l3.l_commitdate
    and s_nationkey = n_nationkey
    and n_name = 'SAUDI ARABIA'
group by
    s_name
order by
    numwait desc,
    s_name
limit 100;
```

SQL22

```sql
select
    cntrycode,
    count(*) as numcust,
    sum(c_acctbal) as totacctbal
from
    (
        select
            substring(c_phone from 1 for 2) as cntrycode,
            c_acctbal
        from
            customer,
            orders
        where
            substring(c_phone from 1 for 2) in ('13',
                '31', '23', '29', '30', '18', '17')
            and c_acctbal > (
                select
                    avg(c_acctbal)
                from
                    customer
                where
                    c_acctbal > 0.00
                    and substring(c_phone from 1 for 2) in ('13', '31', '23', '29', '30', '18', '17')
            )
            and o_custkey = c_custkey
            and c_custkey <> o_custkey
    ) as custsale
group by
    cntrycode
order by
    cntrycode;
```
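The benchmark script in section five appends one timing line per run to `${resfile}.time`, and the per-query numbers in section four are the mean of three such runs. A minimal sketch of how those lines can be averaged per query; the `avg_times` helper, the `/tmp/demo.time` path, and the sample lines below are illustrative, not part of the original tooling:

```shell
# Hypothetical helper (not part of the original tooling): average the
# "<query>.sql the <n> run cost <seconds> second ..." lines that the
# benchmark script appends to ${resfile}.time, printing one mean per query.
avg_times() {
  # $1 is the query file name, $6 is the elapsed seconds on each line
  awk '{ sum[$1] += $6; n[$1]++ }
       END { for (q in sum) printf "%s %.2f\n", q, sum[q] / n[q] }' "$@"
}

# Example with three fabricated runs of one query:
printf '%s\n' \
  'ck3.sql the 1 run cost 80.000 second' \
  'ck3.sql the 2 run cost 90.000 second' \
  'ck3.sql the 3 run cost 94.000 second' > /tmp/demo.time
avg_times /tmp/demo.time   # ck3.sql 88.00
```

Keying the sums on the first field means the helper also works when the timing lines of several queries are interleaved in one file.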