Unverified Commit a11644b1 authored by Yuting, committed by GitHub

docs(stonedb): update the latest docs(#391) (#392)

* docs(stonedb): update the latest docs(#391)

* docs(stonedb): update the docs of Architechure and Limits
Parent 8f60d0e5
@@ -7,62 +7,61 @@ sidebar_position: 1.2
![StoneDB_V1.0](./stonedb-architecture-V1.png)
:::info
In this topic, StoneDB refers to the database, unless otherwise specified.
:::
StoneDB is a hybrid transaction/analytical processing (HTAP) database. It provides a column-based storage engine named Tianmu to handle online analytical processing (OLAP) workloads. Tianmu features high performance and high-efficiency data compression, in addition to common features provided by other storage engines such as InnoDB and MyISAM. The logical architecture of StoneDB consists of three layers: applications, services, and storage engine. When an SQL query is processed by StoneDB, it passes through every module in the three layers one after another.
## Applications layer
### Connection management
When a client sends a connection request to a server, the server assigns a thread from the thread pool to process interactions with the client. If the client disconnects from the server, the thread pool reclaims the thread and assigns it to a new connection, instead of destroying it. This saves time in creating and releasing threads.
### Authentication
After receiving a connection request from a client, the server authenticates the user that initiates the connection based on the username, password, IP address, and port number. If the user fails the authentication, the server rejects the connection request.
### Access control
After the client is connected to the server, the server identifies what operations are allowed for the user based on the permissions granted to the user.
## Services layer
The services layer includes service components, such as the system manager, SQL interface, query cache, and SQL parser.
:::info
The optimizer and executor provided by MySQL are not described in this topic. The optimizer and executor described in this topic are StoneDB Optimizer and StoneDB Executor.
:::
### Management & Utilities
StoneDB provides various database management features, such as backup and recovery, user and permission management, and database metadata management.
### SQL Interface
SQL Interface is mainly used to receive and process SQL queries and return query results.
### Caches & Buffers
The query cache temporarily stores the hash values and result sets of executed SQL statements to enhance execution efficiency. When a query passes through this module, its hash value is used to check whether a matching record exists in the query cache. If not, the query is parsed, optimized, and executed, and then its hash value and result set are cached. If a match is found, the result set is read directly from the cache. However, if the query hits the cache but the schema or data of the queried table has been modified, the cached entry is invalid and the query still needs to be fully processed to obtain the result set. Therefore, we recommend that you disable the query cache in your production environment. The query cache is removed in MySQL 8.0.
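As a rough illustration of this lookup-then-invalidate flow (class and method names here are hypothetical, not MySQL or StoneDB internals):

```python
# Hypothetical sketch of a query cache: results are keyed by a hash of the
# SQL text and invalidated when a referenced table changes.
import hashlib

class QueryCache:
    def __init__(self):
        self.cache = {}  # sql hash -> (referenced tables, result set)

    def lookup(self, sql):
        key = hashlib.sha256(sql.encode()).hexdigest()
        entry = self.cache.get(key)
        return entry[1] if entry else None

    def store(self, sql, tables, result):
        key = hashlib.sha256(sql.encode()).hexdigest()
        self.cache[key] = (set(tables), result)

    def invalidate(self, table):
        # Any schema or data change on `table` drops every cached result
        # that touches it, forcing those queries to be re-executed.
        self.cache = {k: v for k, v in self.cache.items() if table not in v[0]}

cache = QueryCache()
cache.store("SELECT COUNT(*) FROM t1", ["t1"], 42)
print(cache.lookup("SELECT COUNT(*) FROM t1"))  # cache hit: 42
cache.invalidate("t1")                          # t1 modified
print(cache.lookup("SELECT COUNT(*) FROM t1"))  # miss: None
```

The invalidation step is exactly why the cache hurts write-heavy workloads: every update discards all cached results for the table.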
### Parser
The parser is used to parse SQL statements and generate parse trees. It performs lexical and syntax analysis to check whether SQL statements are written in correct syntax, and verifies that the referenced tables and columns exist. If any error is detected, relevant error information is returned.
### Optimizer
The optimizer chooses the execution plan with the lowest cost for each SQL query based on the tables, indexes, and other statistics information relevant to the SQL query.
### Executor
The executor verifies whether the user that initiates a query has permissions to operate on the relevant tables. If the user has sufficient permissions, the executor calls API operations to read data and returns the query result.
## Storage engine layer
When your data volume reaches tens of or even hundreds of billions of records, executing a statistical or aggregate query on MySQL or another relational database may take several to dozens of minutes. However, to process the same query, StoneDB requires only one tenth of the time or even less. This is because StoneDB uses column-based storage, data compression, and knowledge grid techniques to optimize query processing.
The storage engine layer of StoneDB consists of many modules, such as the data decompression module, StoneDB Optimizer, and Knowledge Grid Manager.
### StoneDB Optimizer
StoneDB Optimizer is a self-developed optimizer provided by StoneDB. It is used to optimize SQL statements by converting expressions or converting subqueries to joins, and then generates high-efficiency execution plans by using the Knowledge Grid technique.
### StoneDB Executor
StoneDB Executor is a self-developed executor provided by StoneDB. It reads data based on the execution plan.
### Knowledge Grid Manager
When your data volume reaches tens of or even hundreds of billions of records, executing a statistical or aggregate query on MySQL or another row-oriented relational database may take several to dozens of minutes. This is because a cost-based optimizer first generates execution plans based on statistics of tables or indexes and then reads data, performing I/O operations in the process. If the statistics are inaccurate and an improper execution plan is generated, a large number of I/O operations are performed. To process the same query, StoneDB with the Tianmu column-based storage engine is at least ten times faster than a row-oriented relational database. Tianmu features not only column-based storage and high-efficiency data compression, but also the Knowledge Grid technique. Following are some basic concepts about Knowledge Grid.
#### Data Pack
Data Packs are data storage units. Data in each column is sliced into Data Packs of 65,536 rows each. A Data Pack is smaller than a column and thus supports a higher data compression ratio, whereas it is larger than a row and thus supports higher query performance. Data Packs are also the units in which the Knowledge Grid decompresses data.
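The slicing and the per-pack metadata can be sketched as follows (a simplified illustration; the field names are assumptions, not Tianmu's on-disk format):

```python
# Illustrative sketch: slice one column into Data Packs of 65,536 values
# and record per-pack statistics (roughly what a Data Pack Node holds).
PACK_SIZE = 65_536

def make_packs(column):
    packs = []
    for i in range(0, len(column), PACK_SIZE):
        pack = column[i:i + PACK_SIZE]
        non_null = [v for v in pack if v is not None]
        packs.append({
            "rows": len(pack),
            "non_null": len(non_null),
            "min": min(non_null) if non_null else None,
            "max": max(non_null) if non_null else None,
            "sum": sum(non_null) if non_null else None,
        })
    return packs

packs = make_packs(list(range(200_000)))
print(len(packs))                        # 4 packs: 3 full + 1 with 3,392 rows
print(packs[0]["min"], packs[0]["max"])  # 0 65535
```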
The rough set theory can be used for classification to discover structural relationships within imprecise or noisy data. Based on this theory, Data Packs can be classified into the following three categories:
- Irrelevant Data Packs: with no data elements relevant for further execution
- Relevant Data Packs: with all data elements relevant for further execution
- Suspect Data Packs: with some data elements relevant for further execution
:::info
When a query is being processed, relevant Data Packs are decompressed only when the result set cannot be obtained from their Data Pack Nodes.
:::
This classification helps filter out irrelevant Data Packs. StoneDB only needs to read the metadata of relevant Data Packs, and to decompress suspect Data Packs and examine their records to filter out the matching ones. The process of handling relevant Data Packs does not consume I/O, since no data is decompressed.
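A minimal sketch of this three-way classification, assuming a simple range predicate and using only the `[min, max]` pair from a pack's Data Pack Node:

```python
# Rough-set classification of a Data Pack against a range predicate,
# using only the pack's min/max metadata (illustrative only).
def classify(pack_min, pack_max, lo, hi):
    if pack_max < lo or pack_min > hi:
        return "irrelevant"   # no element can match: skip entirely
    if lo <= pack_min and pack_max <= hi:
        return "relevant"     # every element matches: metadata may suffice
    return "suspect"          # must decompress and examine the rows

# Predicate: WHERE id BETWEEN 100 AND 200
print(classify(0, 99, 100, 200))     # irrelevant
print(classify(120, 180, 100, 200))  # relevant
print(classify(50, 150, 100, 200))   # suspect
```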
#### Data Pack Node
A Data Pack Node stores the following information about a Data Pack:
- The maximum, minimum, average, and sum of the values
- The number of values and the number of non-null values
- The compression method
- The length in bytes
Therefore, a Data Pack Node is also called a Metadata Node. One Data Pack Node corresponds to one Data Pack.
#### Knowledge Node
Knowledge Nodes are at the upper layer of Data Pack Nodes. Knowledge Nodes store a collection of metadata that shows the relations between Data Packs and columns, including the range of value occurrence, data characteristics, and certain statistics. Most data stored in a Knowledge Node is generated when data is being loaded and the rest is generated during queries.
Knowledge Nodes can be classified into the following types:
##### Histogram
Histograms are used to present statistics on columns whose data types are integer, date and time, or floating point. In a histogram, the range between the maximum value and minimum value of a data pack is evenly divided into 1,024 ranges, each of which occupies 1 bit. Ranges within which at least one value falls are marked with 1. Ranges within which no value falls are marked with 0. Histograms are automatically created when data is being loaded.
Suppose values in a Data Pack fall within two ranges: 0‒100 and 102301‒102400, as shown in the following histogram.
@@ -76,7 +75,7 @@ Execute the following SQL statement:
```sql
select * from table where id>199 and id<299;
```
No value in the Data Pack is hit. Therefore, the Data Pack is irrelevant and filtered out.
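The histogram check above can be sketched as follows (bucket arithmetic simplified; this is an illustration, not the actual Tianmu implementation):

```python
# Sketch of the 1,024-range histogram: a pack covering [0, 102400] with
# values only in 0-100 and 102301-102400. Buckets containing at least one
# value are marked 1; a query range that touches no marked bucket proves
# the pack irrelevant without decompressing it.
def build_histogram(pack_min, pack_max, values, buckets=1024):
    width = (pack_max - pack_min + 1) / buckets
    bitmap = [0] * buckets
    for v in values:
        bitmap[min(int((v - pack_min) / width), buckets - 1)] = 1
    return bitmap

def may_contain(bitmap, pack_min, pack_max, lo, hi, buckets=1024):
    width = (pack_max - pack_min + 1) / buckets
    first = max(0, int((lo - pack_min) / width))
    last = min(buckets - 1, int((hi - pack_min) / width))
    return any(bitmap[first:last + 1])

values = list(range(0, 101)) + list(range(102301, 102401))
hist = build_histogram(0, 102400, values)
print(may_contain(hist, 0, 102400, 199, 299))  # False: pack filtered out
```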
##### CMAP
Character Maps (CMAPs) are binary representations of the occurrence of ASCII characters within the first 64 row positions. If a character exists in a position, the position is marked with 1 for the character. Otherwise, the position is marked with 0 for the character. CMAPs are automatically created when data is being loaded.
In the following example, character A exists in position 1 and position 64.
@@ -88,27 +87,27 @@ In the following example, character A exists in position 1 and position 64.
| C | 1 | 1 | ... | 1 |
| ... | ... | ... | ... | ... |
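A rough sketch of building and probing such a character map (illustrative only; a predicate like `LIKE 'A%'` can skip a pack whose CMAP shows the character never occurs at that position):

```python
# Sketch of a CMAP: for each character, a bitmap over the first 64
# character positions of the strings in a Data Pack.
def build_cmap(strings, positions=64):
    cmap = {}
    for s in strings:
        for pos, ch in enumerate(s[:positions]):
            cmap.setdefault(ch, [0] * positions)[pos] = 1
    return cmap

def char_possible_at(cmap, ch, pos):
    bits = cmap.get(ch)
    return bool(bits and bits[pos])

pack = ["Alice", "Bob", "Carol"]
cmap = build_cmap(pack)
print(char_possible_at(cmap, "A", 0))  # True: "Alice" starts with A
print(char_possible_at(cmap, "Z", 0))  # False: no string starts with Z
```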
##### Pack-to-Pack
Pack-to-Packs describe equijoin relations between pairs of Data Packs from two joined tables. A Pack-to-Pack is a binary matrix: if a matching pair of values is found between two Data Packs, the corresponding bit is 1; otherwise, it is 0. Pack-to-Packs help quickly identify relevant Data Packs, improving join performance. They are automatically created when join queries are executed.
In the following example, the condition for joining tables is `A.C=B.D`. For Data Pack A.C1, only Data Packs B.D2 and B.D5 contain matching values.
| | B.D1 | B.D2 | B.D3 | B.D4 | B.D5 |
| --- | --- | --- | --- | --- | --- |
| A.C1 | 0 | 1 | 0 | 0 | 1 |
| A.C2 | 1 | 1 | 0 | 0 | 0 |
| A.C3 | 1 | 1 | 0 | 1 | 1 |
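The pruning this matrix enables can be sketched as follows (the matrix literal mirrors the table above; the function name is illustrative):

```python
# Sketch of Pack-to-Pack pruning for the join A.C = B.D: a binary matrix
# records, per pair of packs, whether any matching value can exist.
p2p = {
    "A.C1": {"B.D1": 0, "B.D2": 1, "B.D3": 0, "B.D4": 0, "B.D5": 1},
    "A.C2": {"B.D1": 1, "B.D2": 1, "B.D3": 0, "B.D4": 0, "B.D5": 0},
    "A.C3": {"B.D1": 1, "B.D2": 1, "B.D3": 0, "B.D4": 1, "B.D5": 1},
}

def join_candidates(left_pack):
    # Only pack pairs marked 1 need to be decompressed and joined.
    return [right for right, bit in p2p[left_pack].items() if bit]

print(join_candidates("A.C1"))  # ['B.D2', 'B.D5']
```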
#### Knowledge Grid
The Knowledge Grid consists of Data Pack Nodes and Knowledge Nodes. Data Packs are compressed for storage and the cost of decompressing them is high. Therefore, the key to improving read performance is to retrieve as few Data Packs as possible. Knowledge Grid can help filter out irrelevant data. With Knowledge Grid, the data retrieved can be reduced to less than 1% of the total data. In most cases, the retrieved data can be loaded into memory so that query processing efficiency is further improved.
For most statistical and aggregate queries, StoneDB can return query results by using only the Knowledge Grid. In this way, the number of Data Packs to be decompressed is greatly reduced, saving I/O resources, minimizing the response time, and improving the network utilization.
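For instance, an aggregate with no filter can be answered purely from Data Pack Node metadata, sketched below (the structures are illustrative, not StoneDB internals):

```python
# Sketch: when every pack touched by a query is fully relevant, COUNT and
# SUM come straight from the Data Pack Nodes; nothing is decompressed.
pack_nodes = [
    {"count": 65536, "sum": 1_000_000, "min": 1,  "max": 50},
    {"count": 65536, "sum": 2_500_000, "min": 10, "max": 90},
]

# SELECT COUNT(*), SUM(x) FROM t  -- no WHERE clause: all packs relevant
total_rows = sum(p["count"] for p in pack_nodes)
total_sum = sum(p["sum"] for p in pack_nodes)
print(total_rows, total_sum)  # 131072 3500000
```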
Following is an example showing how the Knowledge Grid works.
The following table shows the distribution of values recorded in Data Pack Nodes.
| | Min. | Max. |
| --- | --- | --- |
| t1.A1 | 1 | 9 |
| t1.A2 | 10 | 30 |
@@ -121,15 +120,20 @@ select min(t2.D) from t1,t2 where t1.B=t2.C and t1.A>15;
The working process of the Knowledge Grid is as follows:
1. Filter Data Packs based on Data Pack Nodes: Data Pack t1.A1 is irrelevant, t1.A2 is suspect, and t1.A3 is relevant. Therefore, t1.A1 is filtered out.
| | t2.C1 | t2.C2 | t2.C3 | t2.C4 | t2.C5 |
![Step1.png](./KnowledgeGrid-1.png)
| --- | --- | --- | --- | --- | --- |
| t1.B1 | 1 | 1 | 1 | 0 | 1 |
| t1.B2 | 0 | 1 | 0 | 0 | 0 |
| t1.B3 | 1 | 1 | 0 | 0 | 1 |
![Step2.png](./KnowledgeGrid-2.png)
2. Compare t1.B1 and t2.C1 to check whether matching pairs exist based on pack-to-packs. In this step, Data Packs t2.C2 and t2.C5 contain matching pairs while Data Packs t2.C3 and t2.C4 are filtered out.
| | Min. | Max. |
| --- | --- | --- |
| t2.D1 | 0 | 500 |
| t2.D2 | 101 | 440 |
@@ -137,11 +141,11 @@ The working process of the Knowledge Grid is as follows:
| t2.D4 | 1 | 432 |
| t2.D5 | 3 | 100 |
3. Examine Data Packs D2 and D5, after D1, D3, and D4 are filtered out in the previous two steps. Based on the Data Pack Nodes of column D, the maximum value in D5 is 100, which is smaller than the minimum value 101 in D2. Therefore, D2 is filtered out. Now, the system only needs to decompress data pack D5 to obtain the final result.
![Step3.png](./KnowledgeGrid-3.png)
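Step 3 can be sketched as follows, using the Data Pack Node values from the table above (illustrative only):

```python
# Sketch of step 3: once only packs D2 and D5 survive, their Data Pack
# Nodes alone decide which must be decompressed to compute MIN(t2.D).
d_nodes = {"t2.D2": (101, 440), "t2.D5": (3, 100)}  # (min, max) per pack

# A pack can be skipped for MIN if another pack's max is below its min:
# that other pack is guaranteed to contain a smaller value.
best_max = min(mx for _, mx in d_nodes.values())  # 100, from D5
to_decompress = [p for p, (mn, _) in d_nodes.items() if mn <= best_max]
print(to_decompress)  # ['t2.D5'] -- only one pack is read for the final MIN
```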
### StoneDB Loader Parser
StoneDB Loader Parser is a module responsible for data import and export. It processes `LOAD DATA INFILE` and `SELECT ... INTO OUTFILE` operations. StoneDB provides independent clients, written in different programming languages, to import data from various sources. Before data is imported, it is preprocessed, for example, by compressing the data and constructing Knowledge Nodes. In this way, operations such as parsing, verification, and transaction processing are eliminated when the data is processed by the storage engine.
### Replication Manager
The high-availability structure of StoneDB includes a replication engine called Replication Manager to ensure strong consistency between the primary and secondary databases. Different from binlog replication used by MySQL to replicate original data, Replication Manager can directly replicate compressed data since data stored in StoneDB is compressed, without the need for decompression. This greatly reduces the traffic required for transmitting data.
### Compress
@@ -6,26 +6,26 @@ sidebar_position: 1.1
StoneDB is an open-source hybrid transaction/analytical processing (HTAP) database designed and developed by StoneAtom based on the MySQL kernel. It is the first database of this type launched in China. StoneDB can be seamlessly switched from MySQL. It provides features such as optimal performance and real-time analytics, offering you a one-stop solution to process online transaction processing (OLTP), online analytical processing (OLAP), and HTAP workloads.
StoneDB is fully compatible with the MySQL 5.6 and 5.7 protocols, the MySQL ecosystem, and common MySQL features and syntaxes. Tools and clients in the MySQL ecosystem, such as Navicat, Workbench, mysqldump, and mydumper, can be directly used on StoneDB. In addition, all workloads on StoneDB can be run on MySQL.
StoneDB is optimized for OLAP applications. StoneDB that runs on a common server can process complex queries on tens of billions of data records, while ensuring high performance. Compared to databases that use MySQL Community Edition, StoneDB is at least 10 times faster in processing queries.
StoneDB uses the Knowledge Grid technology and a column-based storage engine. The column-based storage engine is designed for OLAP applications and uses techniques such as column-based storage, Knowledge Grid-based filtering, and high-efficiency data compression. With this storage engine, StoneDB ensures high performance for application systems and reduces the total cost of ownership (TCO).
## Advantages
### Real-time HTAP
StoneDB provides two engines: the row-based storage engine InnoDB and the column-based storage engine Tianmu. StoneDB uses binlogs to replicate data from the row-based storage engine to the column-based storage engine in real time. This ensures strong data consistency between the two storage engines.
### Full compatibility with MySQL
StoneDB is an HTAP database built on MySQL. You can use standard interfaces, such as Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC), or local connections to connect to StoneDB. StoneDB supports APIs for various programming languages, such as C, C++, C#, Java, PHP, and Perl. StoneDB is fully compatible with views and stored procedures that comply with the ANSI SQL-92 and SQL-99 standards. In this way, application systems that run on MySQL can run directly on StoneDB without code modifications, allowing you to seamlessly switch from MySQL to StoneDB.
### High query performance
When processing complex queries on tens of or even hundreds of billions of data records, StoneDB reduces the processing time to one tenth or even shorter, compared to MySQL or other row-oriented databases.
### Minimal storage cost
StoneDB supports a data compression ratio of up to 40:1. This greatly reduces the disk space required for storing data, cutting down the TCO.
## Key techniques
### Column-based storage engine
Tables created on StoneDB are stored to disks column by column. Because data in the same column is of the same data type, it can be densely compressed, which allows StoneDB to achieve a much higher compression ratio than row-oriented databases. When processing a query that requires data in certain fields, StoneDB retrieves only the required fields, while a row-oriented database reads all rows that contain values of these fields. Compared to a row-oriented database, StoneDB reduces memory bandwidth traffic and disk I/O. In addition, StoneDB does not require indexes on columns, freeing the database from maintaining them.
### High-efficiency data compression
StoneDB supports various compression algorithms, such as PPM, LZ4, B2, and Delta, and it uses different compression algorithms to compress data of different data types. After the data is compressed, the volume of the data becomes smaller, and thus less network bandwidth and disk I/O resources are required to retrieve the data. StoneDB saves storage space by using column-based storage. The data compression ratio of a column-oriented database is at least 10 times higher than that of a row-oriented database.
### Knowledge Grid
In StoneDB, Data Packs are classified into relevant Data Packs, irrelevant Data Packs, and suspect Data Packs. This classification helps filter out irrelevant Data Packs. StoneDB needs only to read metadata of relevant Data Packs, and decompress suspect Data Packs and then examine the data records to filter relevant data records. If the result set of the relevant Data Packs can be directly obtained through their Data Pack Nodes (also known as metadata nodes), relevant Data Packs will not be decompressed. The process of handling relevant Data Packs does not consume I/O, since no data is decompressed.
### High-performance import
StoneDB provides independent clients, written in different programming languages, to import data from various sources. Before data is imported, it is preprocessed, for example, by compressing the data and constructing Knowledge Nodes. In this way, operations such as parsing, verification, and transaction processing are eliminated when the data is processed by the storage engine.
### Push-based vectorized query execution
When processing a query, StoneDB pushes column-based Data Packs from one operator to another based on the execution plan. Compared to the execution model used by row-oriented databases, push-based execution prevents in-depth calls of stacks and saves resources.
\ No newline at end of file
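A toy sketch of the push model described above (operator names and structure are invented for illustration; real vectorized engines operate on columnar batches with far more machinery):

```python
# Push-based execution: each operator pushes a whole batch (standing in
# for a Data Pack) to its consumer, instead of consumers pulling rows
# one at a time through deep call stacks.
class Filter:
    def __init__(self, predicate, next_op):
        self.predicate, self.next_op = predicate, next_op

    def push(self, batch):
        # One call per batch, not one call per row.
        self.next_op.push([v for v in batch if self.predicate(v)])

class SumSink:
    def __init__(self):
        self.total = 0

    def push(self, batch):
        self.total += sum(batch)

sink = SumSink()
pipeline = Filter(lambda v: v > 10, sink)
for pack in ([1, 12, 7], [20, 3, 15]):  # batches flow producer -> consumer
    pipeline.push(pack)
print(sink.total)  # 12 + 20 + 15 = 47
```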
@@ -4,8 +4,8 @@ sidebar_position: 1.3
---
# Limits
Developed based on MySQL, StoneDB is compatible with the MySQL 5.6 and 5.7 protocols, and the ecosystem, common features, and common syntaxes of MySQL. However, due to characteristics of column-based storage, StoneDB is incompatible with certain MySQL operations and features.
## Unsupported DDL operations
StoneDB does not support the following DDL operations:
- Modify the data type of a field.
@@ -16,7 +16,7 @@ StoneDB does not support the following DDL operations:
- Analyze a table.
- Lock a table.
- Repair a table.
- Execute a `CREATE TABLE … AS SELECT` statement.
- Reorganize a table.
- Rename a field.
- Configure the default value for a field.
@@ -28,17 +28,17 @@ StoneDB does not support the following DDL operations:
- Remove an index.
- Modify a table comment.
The table attributes and column attributes are difficult to modify. The character sets, data types, constraints, and indexes must be properly defined when tables are being created.
## Unsupported DML operations
StoneDB does not support the following DML operations:
- Execute a `DELETE` statement.
- Use subqueries in an `UPDATE` statement.
- Execute an `UPDATE … JOIN` statement to update multiple tables.
- Execute a `REPLACE … INTO` statement.
StoneDB is not suitable for applications that are frequently updated. It supports only single-table update and insert operations. This is because a column-oriented database needs to find each corresponding column and update the value in the row when processing an update operation. However, a row-oriented database stores data by row. When processing an update operation, the row-oriented database only needs to find the corresponding page or block and update the data directly in the row.
## Unsupported objects
StoneDB does not support the following objects:
- Global indexes
@@ -47,34 +47,27 @@ StoneDB does not support the following objects:
- Temporary tables
- Stored procedures containing dynamic SQL statements
- User-defined functions containing nested SQL statements
## Unsupported data types
StoneDB does not support the following data types:
- bit
- enum
- set
- json
- decimal whose precision is higher than 18, for example, decimal(19,x)
- Data types that contain keyword **unsigned** or **zerofill**
## Unsupported binary log formats
StoneDB does not support the following binary log formats:
- row
- mixed
Column-based storage engines support only statement-based binary logs. Row-based binary logs and mixed binary logs are not supported.
## Join queries across storage engines not supported
By default, StoneDB does not support join queries across storage engines. If a join query involves tables both in StoneDB and in another storage engine, an error is reported. You can set the **tianmu_ini_allowmysqlquerypath** parameter to **1** in the **my.cnf** configuration file to remove this limit.
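For illustration only, the setting might look like this in **my.cnf** (the placement under `[mysqld]` is an assumption; check the StoneDB configuration reference for the authoritative location):

```ini
[mysqld]
# Assumed placement: allow join queries to cross storage engines
# by routing them through the MySQL query path.
tianmu_ini_allowmysqlquerypath = 1
```

Restart the server after editing **my.cnf** for the change to take effect.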
## Transactions not supported
Transactions must strictly comply with the ACID properties. However, StoneDB does not support redo or undo logs and thus does not support transactions.
## Partitions not supported
Column-based storage engines do not support partitioning.
## Row locks and table locks not supported
Column-based storage engines do not support row locks or table locks.
\ No newline at end of file
@@ -5,8 +5,10 @@ sidebar_position: 1.4
# Terms
The following table provides descriptions of common terms.
| **Term** | **Description** |
| --- | --- |
| **row** | A series of data that makes up a record. |
| **column** | Also referred to as field. In a relational database, a field must be associated with a data type when the field is being created. |
| **table** | Consists of rows and columns. Databases use tables to store data. Tables are essential objects in databases. |
@@ -5,32 +5,24 @@ sidebar_position: 2.2
# Server Configuration Requirements
This topic describes the configuration requirements for a development or test environment and a production environment.
## Configuration requirements for a development or test environment
The following table describes the configuration requirements for a development or test environment.
| **CPU** | **Memory** | **Storage** | **Network** |
| --- | --- | --- | --- |
| 2 cores+ | 2 GB+ | 10 GB+ | Megabit network card |
:::info
If the development or test environment is deployed on a virtual machine, the AVX instruction set must be enabled. Otherwise, StoneDB cannot be installed.
:::
## Configuration requirements for a production environment
The following table describes the configuration requirements for a production environment.
| **CPU** | **Memory** | **Storage** | **Network** |
| --- | --- | --- | --- |
| 8 cores+ | 8 GB+ | 100 GB+ | Gigabit network card |
:::info
We recommend using higher configurations in your production environment.
:::
......@@ -3,23 +3,20 @@ id: supported-servers-and-OSs
sidebar_position: 2.1
---
# Supported Servers and OSs
StoneDB is an open-source hybrid transaction/analytical processing (HTAP) database designed and developed by StoneAtom based on the MySQL kernel. It can be deployed and run on 64-bit x86 servers and supports most mainstream network hardware and Linux OSs.
## Supported servers
The following table lists the servers on which StoneDB can run.
| **Architecture** | **Supported server** |
| --- | --- |
| x86_64 | Common x86_64 servers with AVX instruction sets enabled |
:::info
Support for the ARM64 or Power architecture is under testing.
:::
## Supported OSs
The following table lists the OSs supported by StoneDB.
| **OS** | **Version** |
......
......@@ -4,10 +4,9 @@ sidebar_position: 3.4
---
# Basic Operations
Structured Query Language (SQL) is a programming language for communicating with databases. You can use it to manage relational databases by performing insert, query, update, and other operations.
StoneDB is fully compatible with MySQL. You can use clients supported by MySQL to connect to StoneDB. In addition, StoneDB supports most SQL syntaxes. This section describes the basic SQL operations supported by StoneDB.
SQL can be classified into the following four parts by usage:
......@@ -15,8 +14,6 @@ SQL can be classified into the following four parts by usage:
- Data Manipulation Language (DML): is used to manage data in tables, such as INSERT, DELETE, and UPDATE statements.
- Data Query Language (DQL): is used to query data, such as SELECT statements.
- Data Control Language (DCL): is used to control access to data, such as GRANT and REVOKE statements.
## Operations on databases
This section provides examples of performing basic operations on databases.
### Create a database
......@@ -45,20 +42,21 @@ This section provides examples of performing basic operations on tables.
Execute the following SQL statement to create a table which is named **student** and consists of the **id**, **name**, **age**, and **birthday** fields:
```sql
create table student(
id int(11) primary key,
name varchar(20),
age smallint,
birthday DATE
) engine=stonedb;
```
:::info
The column-based storage engine is named StoneDB in StoneDB-5.6, and is renamed Tianmu in StoneDB-5.7 to distinguish it from the StoneDB database.<br />If you do not specify **engine=stonedb** in the SQL statement, the storage engine on which the table is created is determined by the value of the **default_storage_engine** parameter. For more information, see [Configure parameters](../04-developer-guide/05-appendix/configuration-parameters.md).
:::
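To check which engine is used when the **ENGINE** clause is omitted, and to verify the engine of an existing table, you can use the following standard MySQL statements (illustrative):

```sql
-- Engine used when CREATE TABLE omits the ENGINE clause
SHOW VARIABLES LIKE 'default_storage_engine';
-- Verify the engine of the table created above
SHOW TABLE STATUS LIKE 'student';
```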
### Query the schema of a table
Execute the following SQL statement to query the schema of table **student**:
```sql
show create table student;
```
### Drop a table
Execute the following SQL statement to drop table **student**:
......@@ -83,8 +81,9 @@ Execute the following TRUNCATE statement to clear data in table **student**:
```sql
truncate table student ;
```
:::info
As a column-based storage engine, StoneDB does not support DELETE operations.
:::
### Query data from a table
Execute a SELECT statement to query data from a table.
......@@ -104,11 +103,11 @@ Execute the following SQL statement to create a user named **tiger** and set the
```sql
create user 'tiger'@'%' identified by '123456';
```
:::info
The username together with the hostname uniquely identifies a user, in the format '_username_'@'_host_'. Therefore, 'tiger'@'%' and 'tiger'@'localhost' are two different users.
:::
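For illustration, the following statements create a second user with the same name but a different host, and then list both entries (querying `mysql.user` requires sufficient privileges):

```sql
-- 'tiger'@'localhost' is distinct from 'tiger'@'%'
create user 'tiger'@'localhost' identified by '123456';
-- List both accounts named tiger
select user, host from mysql.user where user = 'tiger';
```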
### Grant a user permissions
Execute the following SQL statement to grant user **tiger** the permissions to query all tables in database **test_db**:
```sql
......@@ -123,4 +122,4 @@ show grants for 'tiger'@'%';
Execute the following SQL statement to drop user '**tiger'@'%'**:
```sql
drop user 'tiger'@'%';
```
\ No newline at end of file
......@@ -12,37 +12,28 @@ The image of StoneDB is downloaded from Docker Hub.
## Procedure
The username and password for login are **root** and **stonedb123**.
### **1. Pull the image**
Run the following command:
```bash
docker pull stoneatom/stonedb:v0.1
```
### **2. Run the image**
Run the following command:
```bash
docker run -p 13306:3306 -v $stonedb_volumn_dir/data/:/stonedb56/install/data/ -it -d stoneatom/stonedb:v0.1 /bin/bash
```
Alternatively, run the following command:
```bash
docker run -p 13306:3306 -it -d stoneatom/stonedb:v0.1 /bin/bash
```
Parameter description:
- **-p**: maps ports in the *Port of the host*:*Port of the container* format.
- **-v**: mounts directories in the *Path in the host*:*Path in the container* format. If no directories are mounted, the container will be initialized upon restart.
- **-i**: the interaction.
- **-t**: the terminal.
- **-d**: runs the container in the background without entering it. To enter the container, run the `docker exec` command.
### **3. Log in to StoneDB in the container**
```bash
# Obtain the Docker container ID.
docker ps
......@@ -50,11 +41,8 @@ docker ps
docker exec -it <Container ID> bash
<Container ID>$ /stonedb56/install/bin/mysql -uroot -pstonedb123
```
### **4. Log in to StoneDB using a third-party tool**
You can log in to StoneDB by using third-party tools such as mysql, Navicat, and DBeaver. The following code uses mysql as an example.
```shell
mysql -h<Host IP address> -uroot -pstonedb123 -P<Mapped port of the host>
```
\ No newline at end of file
......@@ -9,9 +9,10 @@ sidebar_position: 3.1
Click [here](https://static.stoneatom.com/stonedb-ce-5.6-v1.0.0.el7.x86_64.tar.gz) to download the latest installation package of StoneDB.
## **2. Upload and decompress the TAR package**
```shell
cd /
tar -zxvf stonedb-ce-5.6-v1.0.0.el7.x86_64.tar.gz
```
Upload the installation package to the server. The folder extracted from the package is **/stonedb56**.
## **3. Check dependencies**
```bash
cd /stonedb56/install/bin
......@@ -20,28 +21,53 @@ ldd mysql
```
If the command output contains keywords **not found**, some dependencies are missing and must be installed. <br />For example, `libsnappy.so.1 => not found` is returned:
- If your OS is Ubuntu, run the `sudo apt search libsnappy` command. The command output will inform you to install `libsnappy-dev`. For more details, see [Compile StoneDB on Ubuntu 20.04](../04-developer-guide/00-compiling-methods/compile-using-ubuntu2004.md).
- If your OS is RHEL or CentOS, run the `yum search all snappy` command. The command output will inform you to install **snappy-devel** and **snappy**. For more details, see [Compile StoneDB on CentOS 7](../04-developer-guide/00-compiling-methods/compile-using-centos7.md) or [Compile StoneDB on RHEL 7](../04-developer-guide/00-compiling-methods/compile-using-redhat7.md).
## **4. Start StoneDB**
You can start StoneDB in either of two ways: manual installation or automatic installation.
### 4.1 Create an account.
```bash
groupadd mysql
useradd -g mysql mysql
passwd mysql
```
### 4.2 Manually install StoneDB.
You need to manually create directories, and then initialize and start StoneDB.
```shell
### Create directories
mkdir -p /stonedb56/install/data/innodb
mkdir -p /stonedb56/install/binlog
mkdir -p /stonedb56/install/log
mkdir -p /stonedb56/install/tmp
chown -R mysql:mysql /stonedb56
### Configure parameters in stonedb.cnf
vim /stonedb56/install/stonedb.cnf
[mysqld]
port = 3306
socket = /stonedb56/install/tmp/mysql.sock
datadir = /stonedb56/install/data
pid-file = /stonedb56/install/data/mysqld.pid
log-error = /stonedb56/install/log/mysqld.log
chown -R mysql:mysql /stonedb56/install/stonedb.cnf
### Initialize StoneDB
/stonedb56/install/scripts/mysql_install_db --datadir=/stonedb56/install/data --basedir=/stonedb56/install --user=mysql
### Start StoneDB
/stonedb56/install/bin/mysqld_safe --defaults-file=/stonedb56/install/stonedb.cnf --user=mysql &
```
### 4.3 Automatically install StoneDB.
```bash
cd /stonedb56/install
./reinstall.sh
```
Executing the script initializes and starts StoneDB.<br />Differences between **reinstall.sh** and **install.sh**:
- **reinstall.sh** is the script for automatic installation. When the script is being executed, directories are created, and StoneDB is initialized and started. Therefore, do not execute the script unless for the initial startup of StoneDB. Otherwise, all directories will be deleted and StoneDB will be initialized again.
- **install.sh** is the script for manual installation. You can specify the installation directories based on your needs and then execute the script. Same as **reinstall.sh**, when the script is being executed, directories are created, and StoneDB is initialized and started. Therefore, do not execute the script unless for the initial startup. Otherwise, all directories will be deleted and StoneDB will be initialized again.
## **5. Log in to StoneDB**
```shell
/stonedb56/install/bin/mysql -uroot -p -S /stonedb56/install/tmp/mysql.sock
Enter password:
......@@ -68,7 +94,7 @@ mysql> show databases;
+--------------------+
7 rows in set (0.00 sec)
```
## **6. Stop StoneDB**
```shell
/stonedb56/install/bin/mysqladmin -uroot -p -S /stonedb56/install/tmp/mysql.sock shutdown
```
\ No newline at end of file
......@@ -5,34 +5,18 @@ sidebar_position: 3.3
# Quick Start
This topic presents some examples to show that StoneDB outperforms InnoDB in bulk data insertion, data compression, and analytical queries.
## **Step 1. Deploy a test environment**
Before using StoneDB, prepare your test environment according to instructions provided in [Quick Deployment](./quick-deployment.md) and start StoneDB.
## **Step 2. Prepare test data**
Perform the following steps to generate test data.
### **1. Create a database**
Create a database named **test**.
```sql
create database test DEFAULT CHARACTER SET utf8mb4;
```
### **2. Create a table**
In database **test**, create two tables named **t_user** and **t_user_innodb**.
```sql
use test
CREATE TABLE t_user(
......@@ -43,61 +27,98 @@ CREATE TABLE t_user(
score INT NOT NULL,
copy_id INT NOT NULL,
PRIMARY KEY (`id`)
) engine=stonedb;
CREATE TABLE t_user_innodb(
id INT NOT NULL AUTO_INCREMENT,
first_name VARCHAR(20) NOT NULL,
last_name VARCHAR(20) NOT NULL,
sex VARCHAR(5) NOT NULL,
score INT NOT NULL,
copy_id INT NOT NULL,
PRIMARY KEY (`id`)
) engine=innodb;
```
:::info
The column-based storage engine is named StoneDB in StoneDB-5.6, and is renamed Tianmu in StoneDB-5.7 to distinguish it from the StoneDB database.
:::
### **3. Create a stored procedure**
Create a stored procedure that is used to generate a table containing randomly generated names of persons.
```sql
DELIMITER //
create PROCEDURE add_user(in num INT)
BEGIN
DECLARE rowid INT DEFAULT 0;
DECLARE firstname VARCHAR(10);
DECLARE name1 VARCHAR(10);
DECLARE name2 VARCHAR(10);
DECLARE lastname VARCHAR(10) DEFAULT '';
DECLARE sex CHAR(1);
DECLARE score CHAR(2);
WHILE rowid < num DO
SET firstname = SUBSTRING(md5(rand()),1,4);
SET name1 = SUBSTRING(md5(rand()),1,4);
SET name2 = SUBSTRING(md5(rand()),1,4);
SET sex=FLOOR(0 + (RAND() * 2));
SET score= FLOOR(40 + (RAND() *60));
SET rowid = rowid + 1;
IF ROUND(RAND())=0 THEN
SET lastname =name1;
END IF;
IF ROUND(RAND())=1 THEN
SET lastname = CONCAT(name1,name2);
END IF;
insert INTO t_user(first_name,last_name,sex,score,copy_id) VALUES (firstname,lastname,sex,score,rowid);
END WHILE;
END //
DELIMITER ;
DELIMITER //
create PROCEDURE add_user_innodb(in num INT)
BEGIN
DECLARE rowid INT DEFAULT 0;
DECLARE firstname VARCHAR(10);
DECLARE name1 VARCHAR(10);
DECLARE name2 VARCHAR(10);
DECLARE lastname VARCHAR(10) DEFAULT '';
DECLARE sex CHAR(1);
DECLARE score CHAR(2);
WHILE rowid < num DO
SET firstname = SUBSTRING(md5(rand()),1,4);
SET name1 = SUBSTRING(md5(rand()),1,4);
SET name2 = SUBSTRING(md5(rand()),1,4);
SET sex=FLOOR(0 + (RAND() * 2));
SET score= FLOOR(40 + (RAND() *60));
SET rowid = rowid + 1;
IF ROUND(RAND())=0 THEN
SET lastname =name1;
END IF;
IF ROUND(RAND())=1 THEN
SET lastname = CONCAT(name1,name2);
END IF;
insert INTO t_user_innodb(first_name,last_name,sex,score,copy_id) VALUES (firstname,lastname,sex,score,rowid);
END WHILE;
END //
DELIMITER ;
```
## **Step 3. Test insert performance**
Call the stored procedure to insert 10,000,000 rows of data.
```sql
mysql> call add_user_innodb(10000000);
Query OK, 1 row affected (24 min 46.62 sec)
mysql> call add_user(10000000);
Query OK, 1 row affected (9 min 21.14 sec)
```
According to the returned result, StoneDB takes 9 minutes and 21 seconds, while InnoDB takes 24 minutes and 46 seconds.
:::info
The execution time is different for different hardware configurations. Here, two stored procedures are executed in the same environment and their execution times are compared.
:::
## **Step 4. Test data compression efficiency**
Compress the inserted data.
```sql
mysql> select count(*) from t_user_innodb;
+----------+
| count(*) |
+----------+
......@@ -105,7 +126,7 @@ Compress the inserted data.
+----------+
1 row in set (1.83 sec)
mysql> select count(*) from t_user;
+----------+
| count(*) |
+----------+
......@@ -121,13 +142,10 @@ Compress the inserted data.
+--------------+---------------+------------+-------------+--------------+------------+---------+
```
According to the returned result, the data size after compression in StoneDB is 120 MB while that in InnoDB is 455 MB.
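To compare the on-disk sizes yourself, you can query `information_schema` (standard MySQL metadata; the reported sizes are approximate):

```sql
-- Approximate per-table data size in MB for the two test tables
SELECT table_name, engine,
       ROUND(data_length / 1024 / 1024) AS data_mb
FROM information_schema.tables
WHERE table_schema = 'test'
  AND table_name IN ('t_user', 't_user_innodb');
```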
## **Step 5. Test performance on processing aggregate queries**
Execute an aggregate query.
```sql
mysql> select first_name,count(*) from t_user group by first_name order by 1;
+------------+----------+
| first_name | count(*) |
+------------+----------+
......@@ -140,7 +158,7 @@ Execute an aggregate query.
+------------+----------+
65536 rows in set (0.98 sec)
mysql> select first_name,count(*) from t_user_innodb group by first_name order by 1;
+------------+----------+
| first_name | count(*) |
+------------+----------+
......
......@@ -5,46 +5,46 @@ sidebar_position: 1.1
# Introduction to StoneDB
StoneDB is an open-source HTAP (Hybrid Transactional and Analytical Processing) database independently designed and developed by StoneAtom based on the MySQL kernel, the first of its kind in China. It supports seamless switching from MySQL and delivers ultra-high performance and real-time analytics, providing users with a one-stop HTAP solution.
StoneDB is 100% compatible with the MySQL 5.6 and 5.7 protocols and the MySQL ecosystem. It supports common MySQL features and syntax, as well as system tools and clients in the MySQL ecosystem, such as Navicat, Workbench, mysqldump, and mydumper. Because it is 100% compatible with MySQL, all StoneDB workloads can continue to run on the MySQL database system.
StoneDB is designed and optimized for OLAP applications. It supports high-performance complex queries on multi-dimensional field combinations over tens of billions of rows, and its query speed is more than ten times that of community MySQL.
StoneDB uses a column-based storage engine built on knowledge grid technology. The engine is designed for OLAP applications over massive data and provides low-cost, high-performance query support for application systems through columnar data storage, knowledge grid filtering, and efficient data compression.
## Advantages
- Fully compatible with MySQL
StoneDB supports standard database interfaces, including ODBC, JDBC, and native connections, as well as API interfaces in C, C++, C#, Java, PHP, Perl, and more. StoneDB fully supports views and stored procedures as defined in the ANSI SQL-92 standard and the SQL-99 extensions, so existing applications can use StoneDB without any code changes, enabling a seamless switch from MySQL.
- High-performance queries
When executing complex queries over tens of millions, hundreds of millions, or more rows, StoneDB improves query speed by more than ten times compared with relational databases that use row-based storage engines.
- Low storage cost
Data can be compressed by up to 40 times, greatly reducing storage space and enterprise costs.
## Core technologies
- Column-based storage
Tables created in StoneDB are stored on disk in column mode. Because all values in a column of a relational database share the same data type, this contiguous storage achieves a much higher compression ratio than row-based storage. For reads, if a query needs only one field, a row-based engine returns entire rows from the engine layer to the server layer, consuming more network bandwidth and I/O, whereas a column-based engine returns only that field, greatly reducing bandwidth and I/O consumption. In addition, column-based storage does not require creating or maintaining indexes on columns.
- Efficient data compression
StoneDB selects different compression algorithms for different data types; currently supported algorithms include PPM, LZ4, B2, and Delta. Compressed data is smaller, so reads put less pressure on network bandwidth and disk I/O. Because column-based storage achieves a compression ratio ten times or more that of row-based storage, StoneDB saves a large amount of storage space and reduces storage costs.
- Knowledge grid
In StoneDB, data packs are classified as irrelevant, suspect, or relevant based on rough set theory. StoneDB uses the knowledge grid to filter out irrelevant packs; suspect packs must be decompressed to obtain the qualifying data, while results that can be derived from the metadata nodes of relevant packs require no decompression at all. This eliminates unnecessary decompression, reduces I/O consumption, and improves query response time and network utilization.
- High-performance data import
StoneDB provides an independent data import client that supports different data source environments and multi-language architectures. Before import, data is preprocessed, for example, compressed and used to build knowledge nodes. Preprocessed data entering the storage engine requires no further parsing, data verification, or transaction processing.
- Push-based vectorized query processing
StoneDB processes a query by pushing vector blocks (slices of columnar data) from one operator to another according to the execution plan. Compared with the tuple-based processing model, the push-based execution model avoids deep call stacks and saves resources.
......@@ -3,94 +3,60 @@ id: limits
sidebar_position: 1.3
---
# Limits
StoneDB is 100% compatible with the MySQL 5.6 and 5.7 protocols and the MySQL ecosystem and supports common MySQL features and syntax. However, due to characteristics of StoneDB itself, some operations and features are not yet supported. The following lists the operations and features that are incompatible with MySQL.
## Unsupported DDL operations
1. Changing the data type of a column
2. Changing the length of a column
3. Changing the character set of a table or column
4. Converting the character set of a table
5. optimize table
6. analyze table
7. lock table
8. repair table
9. CTAS
10. Reorganizing a table
11. Renaming a column
12. Setting the default value of a column
13. Setting a column to nullable
14. Setting a column to NOT NULL
15. Adding a unique constraint
16. Dropping a unique constraint
17. Creating an index
18. Dropping an index
19. Modifying the comment of a table
Attributes of tables and columns are hard to modify, so define character sets, data types, constraints, and indexes as completely as possible at the table design stage.
## Unsupported DML operations
1. delete
2. update with a correlated subquery
3. multi-table update
4. replace into
StoneDB is not suitable for frequent updates, because for column-based storage an update must locate each corresponding column and apply multiple separate updates, whereas row-based storage finds the corresponding page or block and updates the row in place. Therefore, StoneDB supports only common single-table update and insert operations.
## Join queries across storage engines not supported
By default, StoneDB does not support join queries across storage engines; that is, a join query between tables of other storage engines and StoneDB tables reports an error. You can set tianmu_ini_allowmysqlquerypath=1 in the my.cnf parameter file to enable join queries across storage engines.
## Unsupported objects
1. Full-text indexes
2. Unique constraints
3. Triggers
4. Temporary tables
5. Stored procedures that contain user-defined functions
6. User-defined functions that contain SQL
## Unsupported data types
1. bit
2. enum
3. set
4. json
5. decimal with a precision greater than 18, for example, decimal(19,x)
6. The keywords unsigned and zerofill when creating a table
## Transactions not supported
Transactions can be truly supported only with strict compliance with the four ACID properties. StoneDB has no redo or undo logs and therefore does not support transactions.
## Partitioning not supported
Column-based storage does not support partitioning.
## Row locks and table locks not supported
Column-based storage does not support row locks or table locks.
## Only the statement binlog format supported
Column-based storage does not support the row or mixed binlog formats.
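To confirm the binlog format in effect, you can query the server variable (standard MySQL syntax; illustrative):

```sql
-- For StoneDB this should report STATEMENT
SHOW VARIABLES LIKE 'binlog_format';
```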
......@@ -5,38 +5,39 @@ sidebar_position: 1.4
# Common Terms
**Row**: a set of related data; rows of data make up a table.

**Column**: also called a field. In a relational database, each column is created with a defined data type and length.

**Table**: consists of rows and columns and is the object a database uses to store data; it is the foundation of the whole database system.

**View**: a virtual table that stores no actual data; its content is defined by a query.

**Stored procedure**: a set of SQL statements that accomplishes a specific function, compiled and stored in the database; users execute it by specifying its name and providing parameters.

**Database**: a collection of database objects, such as tables, views, and stored procedures.

**Instance**: a collection of databases.

**Data page**: the basic unit of database management; the default size is 16 KB.

**Data file**: stores the actual data. By default, each table corresponds to one data file. A data file is a physical concept.

**Tablespace**: a logical storage unit. By default, each table corresponds to one tablespace.

**Transaction**: a group of DML statements that strictly follows the four ACID properties and ends with a commit or a rollback; DDL statements involve implicit commits.

**Character set**: the encoding rules for characters.

**Collation**: the rules for comparing characters within a character set.

**Column-based storage**: data is stored on disk in column mode.

**Data compression**: reducing the size of data files; the compression ratio is determined by the data type, data redundancy, and the compression algorithm.

**OLTP**: On-Line Transaction Processing, characterized by fast interactive responses and highly concurrent small transactions; a typical business system is a bank trading system.

**OLAP**: On-Line Analytical Processing, characterized by complex analytical queries over massive data; a typical business system is a data warehouse.

**HTAP**: Hybrid Transaction/Analytical Processing, a new type of application architecture that aims to break the barrier between OLTP and OLAP.
......@@ -4,18 +4,26 @@ sidebar_position: 2.2
---
# Recommended Server Configurations
StoneDB has the following server configuration requirements and recommendations for development, test, and production environments:
## Development and test environments
| CPU | Memory | Storage | Network |
| --- | --- | --- | --- |
| 2 cores+ | 2 GB+ | 10 GB+ | 100 Mbit/s NIC |
:::info
If the development or test environment is deployed on a virtual machine, the AVX instruction set must be enabled; otherwise StoneDB cannot be installed.
:::
## Production environment
| CPU | Memory | Storage | Network |
| --- | --- | --- | --- |
| 8 cores+ | 8 GB+ | 100 GB+ | Gigabit NIC |
:::info
Production environments generally require higher configurations.
:::
......@@ -4,14 +4,15 @@ sidebar_position: 2.1
---
# Recommended Hardware and Software
StoneDB is an open-source HTAP (Hybrid Transactional and Analytical Processing) database independently designed and developed by StoneAtom based on the MySQL kernel, the first of its kind in China. It deploys and runs well on 64-bit general-purpose hardware server platforms with the Intel x86 architecture and supports most mainstream network hardware and mainstream Linux operating systems.
## Supported hardware platforms
| Architecture | Supported servers |
| --- | --- |
| x86_64 | General-purpose x86_64 hardware platforms with the AVX instruction set enabled |
:::info
Support for the ARM64 and Power architectures is still under testing and has not been fully verified.
:::
## Supported operating systems
| OS | Version |
| --- | --- |
......@@ -19,4 +20,6 @@ StoneDB在ARM64架构和Power架构的支持,目前还在测试中,还未得
| Red Hat Enterprise | 7.x |
| Ubuntu LTS | 20.04 and later |
:::info
StoneDB has been verified on common versions of the three operating systems above. Other operating systems, such as Debian and Fedora, may compile successfully but have not been fully verified.
:::
\ No newline at end of file
......@@ -6,106 +6,103 @@ sidebar_position: 3.4
# Basic Operations
SQL (Structured Query Language) is a special-purpose programming language and a database query and programming language used to access, query, update, and manage relational database management systems.
Because StoneDB is 100% compatible with MySQL, you can connect to StoneDB with any client supported by MySQL and, in most cases, directly execute SQL syntax supported by MySQL. This topic describes the basic SQL operations in StoneDB.
SQL is divided into the following four parts by function:
- **DDL** (Data Definition Language): defines objects in a database, such as create, alter, and drop.
- **DML** (Data Manipulation Language): manipulates table data, such as insert, delete, and update.
- **DQL** (Data Query Language): queries data, such as select.
- **DCL** (Data Control Language): defines access permissions and security levels, such as grant and revoke.
## 1. Database operations
### 1) Create a database
To create a database named test_db with the default character set utf8mb4, use the following SQL statement:
```sql
create database test_db DEFAULT CHARACTER SET utf8mb4;
```
### 2) View databases
To view databases, use the following SQL statement:
```sql
show databases;
```
### 3) Use a database
To use a created database, use the following SQL statement:
```sql
use test_db;
```
### 4) Drop a database
To drop the database test_db, use the following SQL statement:
```sql
drop database test_db;
```
## 2. Table operations
### 1) Create a table
To create a table named student with columns for ID, name, age, and birthday, use the following SQL statement:
```sql
create table student(
id int(11) primary key,
name varchar(20),
age smallint,
birthday DATE
) engine=stonedb;
```
Note: 1) The storage engine name is stonedb in StoneDB 5.6 and tianmu in StoneDB 5.7.<br />2) If engine=stonedb is not specified in the SQL statement, the storage engine of the created table is determined by the default_storage_engine parameter. For details, see [Configure parameters](../04-developer-guide/05-appendix/configuration-parameters.md).
### 2) View a table
To view the table schema, use the following SQL statement:
```sql
show create table student;
```
### 3) Drop a table
To drop the table student, use the following SQL statement:
```sql
drop table student;
```
## 3. Data operations
### 1) Insert data
Use insert to insert a record into the table:
```sql
insert into student values(1,'Jack',15,'20220506');
```
### 2) Update data
Use update to modify a record:
```sql
update student set age=25 where id=1;
```
### 3) Delete data
StoneDB does not support delete. To clear all data in a table, use truncate:
```sql
truncate table student ;
```
## 4. Query a table
Use select to query table records. For example:
1) Query the name and birthday of the student whose ID is 1 in the student table
```sql
select name,birthday from student where id=1;
```
2) Query the names and birthdays of students in the student table, sorted by birthday
```sql
select name,birthday from student order by birthday;
```
## 5. User operations
### 1) Create a user
For example, to create a user tiger with the password 123456, use the following SQL statement:
```sql
create user 'tiger'@'%' identified by '123456';
```
Note: The username together with the hostname ('username'@'host') uniquely identifies a user; 'tiger'@'%' and 'tiger'@'localhost' are two different users.
### 2) Grant permissions to a user
For example, to grant user tiger the permission to query all tables in database test_db, use the following SQL statement:
```sql
grant select on test_db.* to 'tiger'@'%';
```
### 3) View user permissions
For example, to view the permissions of the user tiger:
```sql
show grants for 'tiger'@'%';
```
### 4) Drop a user
For example, to drop the user tiger@%, use the following SQL statement:
```sql
drop user 'tiger'@'%';
```
\ No newline at end of file
......@@ -6,7 +6,7 @@ sidebar_position: 3.2
# Quickly Deploy StoneDB in Docker
## StoneDB Docker Hub address
[Docker Hub](https://hub.docker.com/r/stoneatom/stonedb)
## Usage
The default login username and password are root and stonedb123.
### 1. docker pull
......@@ -15,10 +15,13 @@ docker pull stoneatom/stonedb:v0.1
```
### 2. docker run
Parameter description:
-p: port mapping, which maps a container port to a host port. For example, in `-p 23306:3306`, the former is the host port and the latter is the container port
-v: directory mounting. If no directory is mounted, the container is reinitialized on restart. For example, in `-v $stonedb_volumn_dir/data/:/stonedb56/install/data/`, the former is the host path and the latter is the container path
-i: interactive operation
-t: terminal
-d: start without entering the container; to enter the container, use the docker exec command
```bash
docker run -p 13306:3306 -v $stonedb_volumn_dir/data/:/stonedb56/install/data/ -it -d stoneatom/stonedb:v0.1 /bin/bash
......@@ -39,4 +42,4 @@ docker exec -it 容器ID bash
Log in using the mysql client; the login procedure with other third-party tools (such as Navicat and DBeaver) is similar.
```shell
mysql -h<host IP> -uroot -pstonedb123 -P<mapped host port>
```
\ No newline at end of file
......@@ -6,53 +6,89 @@ sidebar_position: 3.1
# Quick Deployment
To help you get started quickly, the installation package is precompiled; you only need to check whether your environment is missing any dependencies.
## Download the installation package
Click [here](https://static.stoneatom.com/stonedb-ce-5.6-v1.0.0.el7.x86_64.tar.gz) to download the latest installation package.
## Upload and decompress the TAR package
```shell
cd /
tar -zxvf stonedb-ce-5.6-v1.0.0.el7.x86_64.tar.gz
```
Upload the installation package to the server according to your installation standards. The extracted directory is stonedb56; in this example, the installation path is /stonedb56.
## Check dependency files
```shell
cd /stonedb56/install/bin
ldd mysqld
ldd mysql
```
If the output contains the keyword "not found", files are missing and the corresponding dependency packages must be installed.
例如:
libsnappy.so.1 => not found
在 Ubuntu 上使用命令 "sudo apt search libsnappy" 检查,说明需要安装 libsnappy-dev。在 RedHat 或者 CentOS 上使用命令 "yum search all snappy" 检查,说明需要安装 snappy-devel、snappy。
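The ldd check above can be scripted across all shipped binaries. A minimal sketch follows; the `BIN_DIR` default is an assumption based on the example install path in this guide, so adjust it to your environment:

```shell
#!/bin/sh
# Report unresolved shared-library dependencies for a list of binaries.
# BIN_DIR is an assumption (the sample install path used in this guide).
BIN_DIR=${BIN_DIR:-/stonedb56/install/bin}

check_deps() {
  rc=0
  for bin in "$@"; do
    if [ ! -x "$bin" ]; then
      echo "SKIP: $bin (not found on this host)"
      continue
    fi
    # ldd prints "not found" for every library it cannot resolve
    if ldd "$bin" 2>/dev/null | grep "not found"; then
      echo "MISSING deps: $bin"
      rc=1
    else
      echo "OK: $bin"
    fi
  done
  return $rc
}

check_deps "$BIN_DIR/mysqld" "$BIN_DIR/mysql"
```

A nonzero exit status means at least one binary still has unresolved dependencies, so the script can gate an automated install.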
For the dependency packages required on Ubuntu, see [Compile StoneDB on Ubuntu 20.04](../04-developer-guide/00-compiling-methods/compile-using-ubuntu2004.md).
For the dependency packages required on CentOS, see [Compile StoneDB on CentOS 7](../04-developer-guide/00-compiling-methods/compile-using-centos7.md).
For the dependency packages required on RedHat, see [Compile StoneDB on RedHat 7](../04-developer-guide/00-compiling-methods/compile-using-redhat7.md).
## Start the Instance
You can start StoneDB through either manual or automatic installation.
### 1. Create a User
```shell
groupadd mysql
useradd -g mysql mysql
passwd mysql
```
### 2. Manual Installation
Manually create the directories, configure the parameter file, then initialize and start the instance.
```shell
### Create the directories
mkdir -p /stonedb56/install/data/innodb
mkdir -p /stonedb56/install/binlog
mkdir -p /stonedb56/install/log
mkdir -p /stonedb56/install/tmp
chown -R mysql:mysql /stonedb56
### Configure stonedb.cnf
vim /stonedb56/install/stonedb.cnf
[mysqld]
port = 3306
socket = /stonedb56/install/tmp/mysql.sock
datadir = /stonedb56/install/data
pid-file = /stonedb56/install/data/mysqld.pid
log-error = /stonedb56/install/log/mysqld.log
chown -R mysql:mysql /stonedb56/install/stonedb.cnf
### Initialize the instance
/stonedb56/install/scripts/mysql_install_db --datadir=/stonedb56/install/data --basedir=/stonedb56/install --user=mysql
### Start the instance
/stonedb56/install/bin/mysqld_safe --defaults-file=/stonedb56/install/stonedb.cnf --user=mysql &
```
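Before initializing, it can help to confirm that every path-valued setting in the parameter file resolves to an existing directory, so mysqld_safe does not fail at startup. A deliberately simple sketch, assuming `key = /path` lines like the sample stonedb.cnf above:

```shell
#!/bin/sh
# For every "key = /path" setting in a my.cnf-style file, check that the
# parent directory of the path exists and report any that are missing.
check_cnf_dirs() {
  cnf=$1
  rc=0
  # keep only path-valued settings, then strip spaces so the loop is safe
  for line in $(grep -E '^[a-z_-]+ *= */' "$cnf" 2>/dev/null | tr -d ' '); do
    key=${line%%=*}
    value=${line#*=}
    dir=$(dirname "$value")
    if [ -d "$dir" ]; then
      echo "OK $key -> $dir"
    else
      echo "MISSING $key -> $dir"
      rc=1
    fi
  done
  return $rc
}

check_cnf_dirs /stonedb56/install/stonedb.cnf
```

The parsing is intentionally naive (no section handling, no quoted values); it is a pre-flight sanity check, not a full config parser.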
### 3. Automatic Installation
Running reinstall.sh creates the directories, initializes the instance, and starts it.
```shell
cd /stonedb56/install
./reinstall.sh
```
:::info
Difference between reinstall.sh and install.sh:
- reinstall.sh is the automated installation script. Running it creates the directories, initializes the instance, and starts it. Use it only for the first installation; running it at any other time deletes the entire directory and re-initializes the database.
- install.sh is a sample script for manual installation. You can adjust the paths in it to match a custom installation directory and then run it; it likewise creates the directories, initializes the instance, and starts it. Both scripts must be used only for the first installation.
:::
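Whichever way you installed, you can check whether the instance looks started before attempting to log in. A minimal sketch based on runtime files, assuming the socket and pid-file paths from the sample stonedb.cnf above:

```shell
#!/bin/sh
# Decide whether an instance appears started from its runtime files:
# the UNIX socket must exist and the recorded pid must be alive.
instance_status() {
  sock=$1
  pidfile=$2
  if [ -S "$sock" ] && [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
    echo "running"
  else
    echo "not running"
  fi
}

instance_status /stonedb56/install/tmp/mysql.sock /stonedb56/install/data/mysqld.pid
```

This only inspects local files and process liveness; it does not confirm the server accepts connections, which the login step below does.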
### 4. Log In
```shell
/stonedb56/install/bin/mysql -uroot -p -S /stonedb56/install/tmp/mysql.sock
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.6.24-StoneDB-debug build-
Copyright (c) 2000, 2022 StoneAtom Group Holding Limited
No entry for terminal type "xterm";
using dumb terminal settings.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
...
```
# Quick Start
This topic uses a simple demonstration to show that StoneDB delivers higher performance than InnoDB in bulk data insertion, data compression, and analytical queries.
## Step 1. Deploy a Trial Environment
Before trying out StoneDB, prepare a test environment and start the instance by following the steps in [Quick Deployment](./quick-deployment.md).
## Step 2. Prepare Test Data
通过以下步骤,将生成一个测试数据集用于体验 StoneDB 功能。
### 1) 前提条件
在同一个测试环境,分别创建StoneDB和InnoDB存储引擎的表,并且以下三个参数也是相同的。
autocommit=1
innodb_flush_log_at_trx_commit=1
sync_binlog=0
### 2)创建数据库
创建名为test的数据库
```
通过以下步骤,将生成一个测试数据集用于体验 StoneDB。
### 1) 创建数据库
创建名为 test 的数据库。
```sql
create database test DEFAULT CHARACTER SET utf8mb4;
```
### 2) Create Tables
In the test database, create two tables named t_user and t_user_innodb.
```sql
use test
CREATE TABLE t_user(
id INT NOT NULL AUTO_INCREMENT,
first_name VARCHAR(20) NOT NULL,
last_name VARCHAR(20) NOT NULL,
sex VARCHAR(5) NOT NULL,
score INT NOT NULL,
copy_id INT NOT NULL,
PRIMARY KEY (`id`)
) engine=stonedb;
CREATE TABLE t_user_innodb(
id INT NOT NULL AUTO_INCREMENT,
first_name VARCHAR(20) NOT NULL,
last_name VARCHAR(20) NOT NULL,
sex VARCHAR(5) NOT NULL,
score INT NOT NULL,
copy_id INT NOT NULL,
PRIMARY KEY (`id`)
) engine=innodb;
```
:::info
The storage engine name is stonedb in StoneDB 5.6 and tianmu in StoneDB 5.7.
:::
### 3) Create Stored Procedures
Execute the following SQL statements to create the stored procedures.
```sql
DELIMITER //
create PROCEDURE add_user(in num INT)
BEGIN
DECLARE rowid INT DEFAULT 0;
DECLARE firstname VARCHAR(10);
DECLARE name1 VARCHAR(10);
DECLARE name2 VARCHAR(10);
DECLARE lastname VARCHAR(10) DEFAULT '';
DECLARE sex CHAR(1);
DECLARE score CHAR(2);
WHILE rowid < num DO
SET firstname = SUBSTRING(md5(rand()),1,4);
SET name1 = SUBSTRING(md5(rand()),1,4);
SET name2 = SUBSTRING(md5(rand()),1,4);
SET sex=FLOOR(0 + (RAND() * 2));
SET score= FLOOR(40 + (RAND() *60));
SET rowid = rowid + 1;
IF ROUND(RAND())=0 THEN
SET lastname =name1;
END IF;
IF ROUND(RAND())=1 THEN
SET lastname = CONCAT(name1,name2);
END IF;
insert INTO t_user(first_name,last_name,sex,score,copy_id) VALUES (firstname,lastname,sex,score,rowid);
END WHILE;
END //
DELIMITER ;
DELIMITER //
create PROCEDURE add_user_innodb(in num INT)
BEGIN
DECLARE rowid INT DEFAULT 0;
DECLARE firstname VARCHAR(10);
DECLARE name1 VARCHAR(10);
DECLARE name2 VARCHAR(10);
DECLARE lastname VARCHAR(10) DEFAULT '';
DECLARE sex CHAR(1);
DECLARE score CHAR(2);
WHILE rowid < num DO
SET firstname = SUBSTRING(md5(rand()),1,4);
SET name1 = SUBSTRING(md5(rand()),1,4);
SET name2 = SUBSTRING(md5(rand()),1,4);
SET sex=FLOOR(0 + (RAND() * 2));
SET score= FLOOR(40 + (RAND() *60));
SET rowid = rowid + 1;
IF ROUND(RAND())=0 THEN
SET lastname =name1;
END IF;
IF ROUND(RAND())=1 THEN
SET lastname = CONCAT(name1,name2);
END IF;
insert INTO t_user_innodb(first_name,last_name,sex,score,copy_id) VALUES (firstname,lastname,sex,score,rowid);
END WHILE;
END //
DELIMITER ;
```
These stored procedures randomly generate personnel records.
## Step 3. Test Insert Performance
Execute the following SQL statements to call the stored procedures.
```sql
mysql> call add_user_innodb(10000000);
Query OK, 1 row affected (24 min 46.62 sec)
mysql> call add_user(10000000);
Query OK, 1 row affected (9 min 21.14 sec)
```
The results show that inserting 10 million rows takes 9 minutes and 21 seconds on StoneDB versus 24 minutes and 46 seconds on InnoDB.
:::info
Execution times vary with hardware configuration; here, the two stored procedures are run in the same environment and their execution times are compared.
:::
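For reference, the timings above can be converted into rough insert throughput with plain shell arithmetic (row count and durations taken from the output above):

```shell
# Rough rows-per-second from the timings above (10,000,000 rows each).
rows=10000000
stonedb_secs=$((9 * 60 + 21))     # 9 min 21 sec
innodb_secs=$((24 * 60 + 46))     # 24 min 46 sec
echo "StoneDB: $((rows / stonedb_secs)) rows/sec"
echo "InnoDB:  $((rows / innodb_secs)) rows/sec"
awk -v a=$innodb_secs -v b=$stonedb_secs 'BEGIN { printf "Speedup: %.1fx\n", a / b }'
```

These are back-of-the-envelope figures for this one run, not a benchmark result.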
## Step 4. Test Data Compression
Verify data compression with the following SQL statements.
```sql
mysql> select count(*) from t_user_innodb;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (1.83 sec)
mysql> select count(*) from t_user;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
...
| test | t_user_innodb | 9995867 | 454.91M | 0.00M | 454.91M | InnoDB |
+--------------+---------------+------------+-------------+--------------+------------+---------+
```
The results show that the same rows occupy about 120 MB in StoneDB versus 455 MB in InnoDB.
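The space advantage implied by those sizes can be computed directly (numbers taken from the output above):

```shell
# Space ratio from the sizes reported above (same 10M rows in both tables).
stonedb_mb=120
innodb_mb=455
ratio=$(awk -v a=$innodb_mb -v b=$stonedb_mb 'BEGIN { printf "%.1f", a / b }')
echo "InnoDB uses ${ratio}x the space StoneDB needs for the same rows"
```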
## Step 5. Test Aggregate Queries
Execute the following statements to test aggregate query performance.
```sql
mysql> select first_name,count(*) from t_user group by first_name order by 1;
+------------+----------+
| first_name | count(*) |
+------------+----------+
...
+------------+----------+
65536 rows in set (0.98 sec)
mysql> select first_name,count(*) from t_user_innodb group by first_name order by 1;
+------------+----------+
| first_name | count(*) |
+------------+----------+
...
+------------+----------+
65536 rows in set (9.00 sec)
```
The results show that the same aggregate query takes 0.98 seconds on StoneDB versus 9 seconds on InnoDB.