Commit 8f184518 authored by Nicolae Vartolomei

Try fix pk in tuple performance

Possible approach for fixing #10574

The prepared sets themselves are built correctly. They are stored in a
hash map of key -> set, where the key is a hash of the AST and a list of
data types (when we have a list of tuples of literals).

However, when the key is built from the index to check whether a
matching prepared set exists, it uses the data types of the primary key
(see how data_types is populated). Because the primary key has only one
field (v in my example), the lookup cannot find the prepared set.
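
The miss described above can be reproduced with a small standalone model. This is only a sketch: ModelSetKey, ModelPreparedSets and the hash value 42 are hypothetical stand-ins, type names replace real data types, and std::map replaces the hash map for brevity. It is not the ClickHouse code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for PreparedSetKey: AST hash plus the tuple's element types.
struct ModelSetKey
{
    std::uint64_t ast_hash;
    std::vector<std::string> types; // type names stand in for real data types

    bool operator<(const ModelSetKey & rhs) const
    {
        return std::tie(ast_hash, types) < std::tie(rhs.ast_hash, rhs.types);
    }
};

// int stands in for SetPtr.
using ModelPreparedSets = std::map<ModelSetKey, int>;

// Exact-key lookup: succeeds only if the whole type list is identical.
bool exactLookupFinds(
    const ModelPreparedSets & sets,
    std::uint64_t ast_hash,
    const std::vector<std::string> & key_types)
{
    return sets.find(ModelSetKey{ast_hash, key_types}) != sets.end();
}

// The set for `(v, u) IN ((2, 10), (2, 20))` is stored under both tuple types.
ModelPreparedSets makeExampleSets()
{
    ModelPreparedSets sets;
    sets[ModelSetKey{42, {"UInt64", "UInt32"}}] = 1; // 42 is an arbitrary hash
    return sets;
}

void checkExactLookup()
{
    const ModelPreparedSets sets = makeExampleSets();
    assert(exactLookupFinds(sets, 42, {"UInt64", "UInt32"})); // full tuple types: found
    assert(!exactLookupFinds(sets, 42, {"UInt64"}));          // primary key types only: missed
}
```

Calling checkExactLookup() exercises both cases: the key built from the full tuple is found, while the key built from the primary key column alone is not.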

The patch looks for any prepared set whose data types match for the
subset of fields found in the primary key. We are not interested in the
other fields anyway for the purpose of primary key pruning.
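
A rough standalone sketch of that subset match follows. The names LiteralSetKey, LiteralSets and findSetForKeyColumns are hypothetical, type names stand in for real data types, and tuple_indexes plays the role of indexes_mapping[i].tuple_index from the diff. The bounds check is an extra safeguard in this sketch and is not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for PreparedSetKey.
struct LiteralSetKey
{
    std::uint64_t ast_hash;
    std::vector<std::string> types; // type names stand in for real data types
};

// int stands in for SetPtr.
using LiteralSets = std::vector<std::pair<LiteralSetKey, int>>;

// Returns the payload of the first set whose AST hash matches and whose types
// agree at the tuple positions used by the key columns, or -1 if none does.
int findSetForKeyColumns(
    const LiteralSets & sets,
    std::uint64_t ast_hash,
    const std::vector<std::string> & key_types,
    const std::vector<std::size_t> & tuple_indexes)
{
    assert(key_types.size() == tuple_indexes.size());
    auto it = std::find_if(sets.begin(), sets.end(), [&](const auto & e)
    {
        if (e.first.ast_hash != ast_hash)
            return false;
        for (std::size_t i = 0; i < key_types.size(); ++i)
        {
            const std::size_t pos = tuple_indexes[i];
            // Only the positions that belong to key columns are compared.
            if (pos >= e.first.types.size() || key_types[i] != e.first.types[pos])
                return false;
        }
        return true;
    });
    return it == sets.end() ? -1 : it->second;
}

// One set stored for a (UInt64, UInt32) tuple; 42 and 7 are arbitrary values.
LiteralSets makeLiteralSets()
{
    LiteralSets sets;
    sets.emplace_back(LiteralSetKey{42, {"UInt64", "UInt32"}}, 7);
    return sets;
}
```

With the example set, findSetForKeyColumns(makeLiteralSets(), 42, {"UInt64"}, {0}) finds the set even though only the first tuple element is part of the key, which is the case the exact lookup missed.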
Parent f4869eca
......@@ -618,16 +618,44 @@ bool KeyCondition::tryPrepareSetIndex(
     const ASTPtr & right_arg = args[1];
     PreparedSetKey set_key;
+    SetPtr prepared_set;
     if (right_arg->as<ASTSubquery>() || right_arg->as<ASTIdentifier>())
+    {
         set_key = PreparedSetKey::forSubquery(*right_arg);
+        auto set_it = prepared_sets.find(set_key);
+        if (set_it == prepared_sets.end())
+            return false;
+        prepared_set = set_it->second;
+    }
     else
-        set_key = PreparedSetKey::forLiteral(*right_arg, data_types);
+    {
+        auto set_it = std::find_if(
+            prepared_sets.begin(),
+            prepared_sets.end(),
+            [&](const auto &e)
+            {
+                if (e.first.ast_hash == right_arg->getTreeHash())
+                {
+                    for (size_t i = 0; i < data_types.size(); i++)
+                    {
+                        if (!recursiveRemoveLowCardinality(data_types[i])->equals(*e.first.types[indexes_mapping[i].tuple_index]))
+                        {
+                            return false;
+                        }
+                    }
+                    return true;
+                }
-    auto set_it = prepared_sets.find(set_key);
-    if (set_it == prepared_sets.end())
-        return false;
+                return false;
+            });
+        if (set_it == prepared_sets.end())
+            return false;
-    const SetPtr & prepared_set = set_it->second;
+        prepared_set = set_it->second;
+    }
     /// The index can be prepared if the elements of the set were saved in advance.
     if (!prepared_set->hasExplicitSetElements())
......
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. "$CURDIR"/../shell_config.sh
$CLICKHOUSE_CLIENT --multiquery <<EOF
CREATE TABLE pk_in_tuple_perf
(
v UInt64,
u UInt32
) ENGINE = MergeTree()
ORDER BY v
SETTINGS index_granularity = 1;
INSERT INTO pk_in_tuple_perf SELECT number, number * 10 FROM numbers(100);
EOF
# u is not part of the primary key, so only the v component of each tuple can be used for pruning.
query="SELECT count() FROM pk_in_tuple_perf WHERE (v, u) IN ((2, 10), (2, 20))"
$CLICKHOUSE_CLIENT --query "$query"
$CLICKHOUSE_CLIENT --query "$query FORMAT JSON" | grep "rows_read"
\ No newline at end of file