StoneDB is an open-source hybrid transaction/analytical processing (HTAP) database designed and developed by StoneAtom based on the MySQL kernel. It is the first database of this type launched in China. You can seamlessly switch from MySQL to StoneDB, which provides high performance and real-time analytics, offering a one-stop solution for online transaction processing (OLTP), online analytical processing (OLAP), and HTAP workloads.
StoneDB is fully compatible with the MySQL 5.6 and 5.7 protocols, the MySQL ecosystem, and common MySQL features and syntaxes. You can use tools and clients from the MySQL ecosystem, such as Navicat, Workbench, mysqldump, and mydumper, on StoneDB. In addition, workloads that run on MySQL can run on StoneDB without modification.
StoneDB is optimized for OLAP applications. Running on a common server, StoneDB can process complex queries on tens of billions of data records while maintaining high performance. Compared with MySQL Community Edition, StoneDB is at least 10 times faster at processing queries.
StoneDB uses knowledge grid technology and a column-based storage engine. This storage engine is designed for OLAP applications and uses techniques such as column-based storage, knowledge grid-based filtering, and high-efficiency data compression. With this storage engine, StoneDB provides application systems with high performance and reduces the total cost of ownership (TCO).
## Advantages
### Integration with MySQL
StoneDB is an HTAP database built on MySQL. To enhance its analytics capabilities, it integrates a self-developed engine that is also named StoneDB. (In this topic, StoneDB refers to the database, unless otherwise specified.) For this reason, StoneDB is fully compatible with MySQL. You can use standard interfaces, such as Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC), to connect to StoneDB, or create local connections. StoneDB supports APIs for various programming languages, such as C, C++, C#, Java, PHP, and Perl. StoneDB is fully compatible with views and stored procedures that comply with the ANSI SQL-92 and SQL-99 standards. Application systems that run on MySQL can therefore run directly on StoneDB without code changes, allowing you to seamlessly switch from MySQL to StoneDB.
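For example, you can connect to StoneDB with the standard MySQL client, just as you would to MySQL (the host, port, and user below are placeholders):
```bash
# Connect to StoneDB exactly as you would to a MySQL server; values are placeholders.
mysql -h 192.168.0.100 -P 3306 -u app_user -p
```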
### Real-time HTAP
StoneDB provides two engines: the row-based storage engine InnoDB and the column-based storage engine StoneDB. StoneDB uses binlogs to replicate data from the row-based storage engine to the column-based storage engine in real time, which ensures strong data consistency between the two storage engines.
## Key techniques
### Column-based storage engine
A column-based storage engine stores data to disks column by column. When you query data, only the required fields are retrieved, which greatly reduces memory bandwidth traffic and disk I/O. In addition, in a column-based storage engine, columns do not need to be indexed, freeing the database from maintaining such indexes.
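For example, in a query such as the following (using the illustrative **student** table created later in this topic), only the two referenced columns are read from disk, regardless of how many columns the table has:
```sql
-- Only the name and age columns are scanned; all other columns are skipped.
select name, age from student where age > 18;
```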
### High-efficiency data compression
In a relational database, values in the same column are of the same data type. More duplicate values stored in a column indicate a higher data compression ratio and a smaller data volume. By virtue of this, less data is retrieved for queries, and thus memory bandwidth traffic and disk I/O are reduced.
StoneDB saves storage space by using column-based storage. The data compression ratio of a column-oriented database is at least 10 times higher than that of a row-oriented database.
### Knowledge grid
A knowledge grid can filter data packs based on metadata, and then decompress the data packs to obtain the data that meets the query conditions. This greatly reduces I/O, and improves response speed and network utilization.
### Push-based vectorized query execution
When processing a query, StoneDB pushes column-based data packs from one operator to another according to the execution plan. Compared with the execution model used by row-oriented databases, push-based execution avoids deep call stacks and saves resources.
StoneDB is built on MySQL by integrating a storage engine into MySQL. Therefore, StoneDB is highly compatible with the MySQL 5.6 and 5.7 protocols, and the ecosystem, common features, and common syntaxes of MySQL. However, due to characteristics of column-based storage, StoneDB is incompatible with certain MySQL operations and features.
## Unsupported DDL operations
StoneDB does not support the following DDL operations:
- Modify the data type of a field.
- Modify the length of a field.
- Change the character set of a table or a field.
- Convert the character set for a table.
- Optimize a table.
- Analyze a table.
- Lock a table.
- Repair a table.
- Execute a CREATE TABLE… AS SELECT statement.
- Reorganize a table.
- Rename a field.
- Configure the default value for a field.
- Specify the default value of a field to null.
- Specify the default value of a field to non-null.
- Add a unique constraint.
- Delete a unique constraint.
- Create an index.
- Delete an index.
- Modify a table comment.
StoneDB is a column-oriented database, and its data is highly compressed. For this reason, table and column attributes are difficult to modify: character sets, data types, constraints, and indexes must be properly defined when a table is created, as shown in the example below.
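The following is an illustrative definition (the table and column names are made up) that declares the character set, data types, constraints, and keys up front, because they cannot be changed later:
```sql
create table t_order(
  id int(11) primary key,
  order_no varchar(64) not null,
  amount decimal(18,2),
  created_at date
) engine=stonedb DEFAULT CHARACTER SET utf8mb4;
```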
## Unsupported DML operations
StoneDB does not support the following DML operations:
- Execute a DELETE statement.
- Use subqueries in an UPDATE statement.
- Execute an UPDATE… JOIN statement to update multiple tables.
- Execute a REPLACE… INTO statement.
StoneDB is not suitable for applications that update data frequently. It supports only single-table UPDATE and INSERT operations. This is because, to process an update, a column-oriented database must locate each corresponding column and update the value in each row, whereas a row-oriented database stores data by row and only needs to find the corresponding page or block and update the data directly in the row.
If you want to use user-defined functions that contain nested SQL statements, set the **stonedb_ini_allowmysqlquerypath** parameter to **1** in the **stonedb.cnf** configuration file.
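For example, the setting might look like the following in **stonedb.cnf** (a sketch that assumes the parameter lives in the standard **[mysqld]** section):
```shell
[mysqld]
# Allow SQL statements to fall back to the MySQL query path.
stonedb_ini_allowmysqlquerypath=1
```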
## Unsupported data types
StoneDB does not support the following data types:
- bit
- enum
- set
- decimal whose precision is higher than 18, for example, decimal(19,x)
- Data types that contain the keyword **unsigned** or **zerofill**
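For example (illustrative definitions), a DECIMAL column with precision 18 is accepted, while precision 19 is rejected:
```sql
-- Accepted: the precision does not exceed 18.
create table t_ok(amount decimal(18,2)) engine=stonedb;
-- Rejected: the precision 19 exceeds the supported maximum.
create table t_bad(amount decimal(19,2)) engine=stonedb;
```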
## Unsupported binary log formats
StoneDB does not support the following binary log formats:
- row
- mixed
StoneDB supports only statement-based binary logs. Row-based and mixed binary logs are not supported.
## Association queries across storage engines not supported
By default, StoneDB does not support association queries across storage engines. If an association query involves tables in both InnoDB and the StoneDB column-based storage engine, an error is reported. You can set the **stonedb_ini_allowmysqlquerypath** parameter to **1** in the **stonedb.cnf** configuration file to remove this limit.
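For example (the table names are made up), a join such as the following fails by default if **t_innodb** uses InnoDB and **t_stone** uses the StoneDB engine, and succeeds only after the parameter is set:
```sql
select a.id, b.score
from t_innodb a
join t_stone b on a.id = b.id;
```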
## Transactions not supported
Transactions must strictly comply with the ACID properties. However, StoneDB does not support redo or undo logs and therefore does not support transactions.
## Partitions not supported
Column-oriented databases do not support partitioning.
## Column locking and table locking not supported
Column-oriented databases do not support column locking or table locking.
| **Term** | **Description** |
| --- | --- |
| **row** | A series of data that makes up a record. |
| **column** | Also referred to as field. In a relational database, a field must be associated with a data type when the field is being created. |
| **table** | Consists of rows and columns. Databases use tables to store data. Tables are essential objects in databases. |
| **view** | A virtual table that does not store actual data. It is based on the result set of an SQL statement. |
| **stored procedure** | A collection of one or more SQL statements that are compiled and then stored in a database to execute a specific operation. To execute a stored procedure, you need to specify the name and required parameters of the stored procedure. |
| **database** | A collection of database objects such as tables, views, and stored procedures. |
| **instance** | A collection of databases. |
| **data page** | The basic unit for database management. The default size for a data page is 16 KB. |
| **data file** | Used for storing data. By default, one table corresponds to one data file. |
| **tablespace** | A logical storage unit. By default, one table corresponds to one tablespace. |
| **transaction** | A sequence of DML operations. This sequence satisfies the atomicity, consistency, isolation, and durability (ACID) properties. A transaction must end with a commit or a rollback. Implicit commits triggered by DDL statements are supported. |
| **character set** | A collection of symbols and encodings. |
| **collation** | A collation is a collection of rules for comparing and sorting character strings. |
| **column-based storage** | Stores data by column to disks. |
| **data compression** | A process performed to reduce the size of data files. The data compression ratio is determined by the data type, degree of duplication, and compression algorithm. |
| **OLTP** | The acronym of online transaction processing. OLTP features quick response for interactions and high concurrency of small transactions. Typical applications are transaction systems of banks. |
| **OLAP** | The acronym of online analytical processing. OLAP features complex analytical querying on a large amount of data. Typical applications are data warehouses. |
| **HTAP** | The acronym of hybrid transaction/analytical processing. HTAP is an emerging application architecture built to allow one system for both transactions and analytics. |
:::info
If the development or test environment is deployed on a virtual machine, the AVX instruction set must be enabled. Otherwise, StoneDB cannot be installed.
:::
# Configuration requirements for a production environment
The following table describes the configuration requirements for a production environment.
StoneDB is an open-source hybrid transaction/analytical processing (HTAP) database designed and developed by StoneAtom based on the MySQL kernel. It can be deployed and run on 64-bit x86 servers and supports most mainstream network hardware and Linux OSs.
# Supported servers
The following table lists the servers on which StoneDB can run.
| **Architecture** | **Supported server** |
| --- | --- |
| x86_64 architecture | Common x86_64 servers with AVX instruction sets enabled |
:::info
Support for the ARM64 or Power architecture is under testing.
:::
# Supported OSs
The following table lists the OSs supported by StoneDB.
| **OS** | **Version** |
| --- | --- |
| CentOS | 7.x |
| Red Hat Enterprise Linux | 7.x |
| Ubuntu LTS | 20.04 or higher |
:::info
Compatibilities with other OSs such as Debian and Fedora are under testing.
:::
Structured Query Language (SQL) is a programming language for communicating with databases. You can use it to manage relational databases by performing insert, query, update, and other operations.
StoneDB is compatible with MySQL. You can use clients supported by MySQL to connect to StoneDB. In addition, StoneDB supports most SQL syntaxes. This section describes the basic SQL operations supported by StoneDB.
SQL can be classified into the following four parts by usage:
- Data Definition Language (DDL): used to manage database objects, with statements such as CREATE, ALTER, and DROP.
- Data Manipulation Language (DML): used to manage data in tables, with statements such as INSERT, DELETE, and UPDATE.
- Data Query Language (DQL): used to query data, with statements such as SELECT.
- Data Control Language (DCL): used to control access to data, with statements such as GRANT and REVOKE.
## Operations on databases
This section provides examples of performing basic operations on databases.
### Create a database
Execute the following SQL statement to create a database named **test_db** and set the default character set to **utf8mb4**:
```sql
create database test_db DEFAULT CHARACTER SET utf8mb4;
```
### List databases
Execute the following SQL statement to list databases:
```sql
show databases;
```
### Use a database
Execute the following SQL statement to use database **test_db**:
```sql
use test_db;
```
### Drop a database
Execute the following SQL statement to drop database **test_db**:
```sql
drop database test_db;
```
## Operations on tables
This section provides examples of performing basic operations on tables.
### Create a StoneDB table
Execute the following SQL statement to create a table which is named **student** and consists of the **id**, **name**, **age**, and **birthday** fields:
```sql
create table student(
  id int(11) primary key,
  name varchar(255),
  age smallint,
  birthday DATE
) engine=stonedb;
```
:::info
If you do not specify **engine=stonedb** in the SQL statement, the storage engine on which the table is created is determined by the value of parameter **default_storage_engine**. For more information, see [Configure parameters](https://stoneatom.yuque.com/staff-ft8n1u/dghuxr/xg9czr).
:::
### Query the schema of a table
Execute the following SQL statement to query the schema of table **student**:
```sql
show create table student\G
```
### Drop a table
Execute the following SQL statement to drop table **student**:
```sql
drop table student;
```
## Operations on data
This section provides examples of performing basic operations on data.
### Insert data into a table
Execute the following SQL statement to insert a record into table **student**:
```sql
insert into student values(1,'Jack',15,'20220506');
```
### Modify data in a table
Execute the following UPDATE statement to modify data in table **student**:
```sql
update student set age=25 where id=1;
```
### Remove data from a table
#### Clear data in a table
Execute the following TRUNCATE statement to clear data in table **student**:
```sql
truncate table student;
```
#### Remove specific data from a table
As a column-based storage engine, StoneDB does not support DELETE operations.
### Query data from a table
Execute a SELECT statement to query data from a table.
Example 1: Query the name and birthday of the student whose **id** is **1** from table **student**.
```sql
select name,birthday from student where id=1;
```
Example 2: Query the name and birthday of each student from table **student** and sort the results by birthday.
```sql
select name,birthday from student order by birthday;
```
## Operations on users
This section provides examples of performing basic operations on users.
### Create a user
Execute the following SQL statement to create a user named **tiger** and set the password to **123456**:
```sql
create user 'tiger'@'%' identified by '123456';
```
:::info
The username together with the hostname uniquely identifies a user, in the format '_username_'@'_host_'. Therefore, 'tiger'@'%' and 'tiger'@'localhost' are two different users.
:::
### Grant a user permissions
Execute the following SQL statement to grant user **tiger** the permissions to query all tables in database **test_db**:
```sql
grant select on test_db.* to 'tiger'@'%';
```
### Query user permissions
Execute the following SQL statement to query permissions granted to user **tiger**:
```sql
show grants for 'tiger'@'%';
```
### Drop a user
Execute the following SQL statement to drop user **'tiger'@'%'**:
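```sql
drop user 'tiger'@'%';
```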
When you initialize StoneDB, add parameter **--initialize-insecure** to allow the admin to log in initially without entering a password. The admin is required to set a password after the initial login.
This topic presents some examples to show you that the StoneDB storage engine outperforms InnoDB in processing bulk inserts, compressing data, and executing analytical queries.
## Step 1. Deploy a test environment
Before using StoneDB, prepare your test environment according to instructions provided in [Quick Deployment](quick-deployment.md) and start StoneDB.
## Step 2. Prepare test data
Perform the following steps to generate test data.
### 1. Prepare for the test
In the test environment, create a StoneDB table and an InnoDB table. Ensure that the following parameter settings are the same for both tables (see the snippet after this list):
- autocommit=1
- innodb_flush_log_at_trx_commit=1
- sync_binlog=0
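The settings can be applied as follows (a sketch; they can also be made persistent in the configuration file):
```sql
set global autocommit = 1;
set global innodb_flush_log_at_trx_commit = 1;
set global sync_binlog = 0;
```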
### 2. Create a database
Create a database named **test**.
```sql
create database test DEFAULT CHARACTER SET utf8mb4;
```
### 3. Create a table
In database **test**, create a table named **t_user**.
```sql
use test;
CREATE TABLE t_user(
  id INT NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(20) NOT NULL,
  last_name VARCHAR(20) NOT NULL,
  sex VARCHAR(5) NOT NULL,
  score INT NOT NULL,
  copy_id INT NOT NULL,
  PRIMARY KEY(`id`)
) engine=STONEDB;
```
### 4. Create a stored procedure
Create a stored procedure that is used to generate a table containing randomly generated names of persons.
Mydumper is a logical backup tool for MySQL. It consists of two parts:
- mydumper: exports consistent backup files of MySQL databases.
- myloader: reads backups from mydumper, connects to destination databases, and imports backups.
Both parts are multi-threaded.
### Benefits
- Parallelism and performance: The tool provides a high backup rate. Expensive character set conversion routines are avoided, and the code is highly efficient overall.
- Simplified output management: Separate files are used for tables, and metadata is dumped to its own files, simplifying viewing and parsing of the data.
- High consistency: The tool maintains snapshots across all threads and records accurate binlog positions of the primary and replicas.
- Manageability: Perl Compatible Regular Expressions (PCRE) can be used to specify whether to include or exclude tables or databases.
### Features
- Multi-threaded backup, which generates multiple backup files
- Consistent snapshots for transactional and non-transactional tables
:::info
This feature is supported by versions later than 0.2.2.
:::
- Fast file compression
- Export of binlogs
- Multi-threaded recovery
:::info
This feature is supported by versions later than 0.2.1.
:::
- Function as a daemon to periodically perform snapshots and consistently record binlogs
:::info
This feature is supported by versions later than 0.5.0.
:::
- Open source (license: GNU GPLv3)
## Use Mydumper
### Parameters for mydumper
```bash
mydumper --help
Usage:
mydumper [OPTION…] multi-threaded MySQL dumping
Help Options:
-?, --help Show help options
Application Options:
-B, --database Database to dump
-o, --outputdir Directory to output files to
-s, --statement-size Attempted size of INSERT statement in bytes, default 1000000
-r, --rows Try to split tables into chunks of this many rows. This option turns off --chunk-filesize
-F, --chunk-filesize Split tables into chunks of this output file size. This value is in MB
--max-rows Limit the number of rows per block after the table is estimated, default 1000000
-c, --compress Compress output files
-e, --build-empty-files Build dump files even if no data available from table
-i, --ignore-engines Comma delimited list of storage engines to ignore
-N, --insert-ignore Dump rows with INSERT IGNORE
-m, --no-schemas Do not dump table schemas with the data and triggers
-M, --table-checksums Dump table checksums with the data
-d, --no-data Do not dump table data
--order-by-primary Sort the data by Primary Key or Unique key if no primary key exists
-G, --triggers Dump triggers. By default, it do not dump triggers
-E, --events Dump events. By default, it do not dump events
-R, --routines Dump stored procedures and functions. By default, it do not dump stored procedures nor functions
-W, --no-views Do not dump VIEWs
-k, --no-locks Do not execute the temporary shared read lock. WARNING: This will cause inconsistent backups
--no-backup-locks Do not use Percona backup locks
--less-locking Minimize locking time on InnoDB tables.
--long-query-retries Retry checking for long queries, default 0 (do not retry)
--long-query-retry-interval Time to wait before retrying the long query check in seconds, default 60
-l, --long-query-guard Set long query timer in seconds, default 60
-K, --kill-long-queries Kill long running queries (instead of aborting)
-D, --daemon Enable daemon mode
-X, --snapshot-count number of snapshots, default 2
-I, --snapshot-interval Interval between each dump snapshot (in minutes), requires --daemon, default 60
-L, --logfile Log file name to use, by default stdout is used
--tz-utc SET TIME_ZONE='+00:00' at top of dump to allow dumping of TIMESTAMP data when a server has data in different time zones or data is being moved between servers with different time zones, defaults to on use --skip-tz-utc to disable.
--skip-tz-utc
--use-savepoints Use savepoints to reduce metadata locking issues, needs SUPER privilege
--success-on-1146 Not increment error count and Warning instead of Critical in case of table doesn't exist
--lock-all-tables Use LOCK TABLE for all, instead of FTWRL
-U, --updated-since Use Update_time to dump only tables updated in the last U days
--trx-consistency-only Transactional consistency only
--complete-insert Use complete INSERT statements that include column names
--split-partitions Dump partitions into separate files. This options overrides the --rows option for partitioned tables.
--set-names Sets the names, use it at your own risk, default binary
-z, --tidb-snapshot Snapshot to use for TiDB
--load-data
--fields-terminated-by
--fields-enclosed-by
--fields-escaped-by Single character that is going to be used to escape characters in theLOAD DATA stament, default: '\'
--lines-starting-by Adds the string at the begining of each row. When --load-data is usedit is added to the LOAD DATA statement. Its affects INSERT INTO statementsalso when it is used.
--lines-terminated-by Adds the string at the end of each row. When --load-data is used it isadded to the LOAD DATA statement. Its affects INSERT INTO statementsalso when it is used.
--statement-terminated-by This might never be used, unless you know what are you doing
--sync-wait WSREP_SYNC_WAIT value to set at SESSION level
--where Dump only selected records.
--no-check-generated-fields Queries related to generated fields are not going to be executed.It will lead to restoration issues if you have generated columns
--disk-limits Set the limit to pause and resume if determines there is no enough disk space.Accepts values like: '<resume>:<pause>'in MB.For instance: 100:500 will pause when there is only 100MB free and willresume if 500MB are available
--csv Automatically enables --load-data and set variables to export in CSV format.
-t, --threads Number of threads to use, default 4
-C, --compress-protocol Use compression on the MySQL connection
--stream It will stream over STDOUT once the files has been written
--no-delete It will not delete the files after stream has been completed
-O, --omit-from-file File containing a list of database.table entries to skip, one per line (skips before applying regex option)
-T, --tables-list Comma delimited table list to dump (does not exclude regex option)
-h, --host The host to connect to
-u, --user Username with the necessary privileges
-p, --password User password
-a, --ask-password Prompt For User password
-P, --port TCP/IP port to connect to
-S, --socket UNIX domain socket file to use for connection
-x, --regex Regular expression for 'db.table' matching
--skip-definer Removes DEFINER from the CREATE statement. By default, statements are not modified
```
### Install and use Mydumper
```bash
# On GitHub, download the RPM package or the source code package that matches your machine. We recommend the RPM package because it can be used directly, while the source code package must be compiled. The OS in the following example is CentOS 7, so an el7 version is downloaded.
```
**metadata**: records the name and position of the binlog file of the backed-up database at the backup point in time.
:::info
If the backup is performed on a replica, this file also records the name and position of the binlog file that has been synchronized from the primary at the time of the backup.
:::
Each table has two backup files:
- **database-schema-create**: records the statements for creating the database.
- **database.table-schema.sql**: records the table schema.
- **database.table.00000.sql**: records the table data.
- **database.table-metadata**: records the table metadata.
### Backup principles
1. The main thread executes **FLUSH TABLES WITH READ LOCK** to add a global read-only lock to ensure data consistency.
2. The name and position of the binlog file at the current point in time are obtained and recorded to the **metadata** file to support recovery performed later.
3. Multiple dump threads (4 by default, customizable) change the transaction isolation level to REPEATABLE READ and enable consistent-read transactions.
4. Non-InnoDB tables are exported.
5. After the data of non-transactional engines is backed up, the main thread executes **UNLOCK TABLES** to release the global read-only lock.
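For example, a backup and restore might be invoked as follows (host, credentials, and paths are placeholders; the mydumper flags are taken from the help output above, and the myloader flags are assumed to be analogous):
```bash
# Dump database 'test' with 4 threads and compressed output files.
mydumper -h 127.0.0.1 -P 3306 -u root -p '******' -B test -o /backup/test -t 4 -c

# Restore the dump into the destination instance.
myloader -h 127.0.0.1 -P 3306 -u root -p '******' -B test -d /backup/test -t 4
```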
Compiling StoneDB on a physical server requires installation of third-party repositories, which is complicated. In addition, if the OS in your environment is Fedora or Ubuntu, you also need to install many dependencies. We recommend that you compile StoneDB in a Docker container. After StoneDB is compiled, you can directly run StoneDB in the container or copy the compilation files to your environment.
## Prerequisites
Docker has been installed. For information about how to install Docker, visit [https://docs.docker.com/engine/install/ubuntu/](https://docs.docker.com/engine/install/ubuntu/).
## Use a Dockerfile in a compilation environment
### Step 1. Download the source code of StoneDB and docker.zip
Download file **docker.zip**, save the file to the root directory of the source code of StoneDB, and then decompress the file.
```bash
# Use an FTP tool to upload 'docker.zip' to this directory, and then decompress it.
[root@testOS atomstore2022]# unzip docker.zip
[root@testOS atomstore2022]# tree docker
docker
├── cmake.tar.gz
├── docker_build.sh
├── Dockerfile
├── stonedb-boost1.66.tar.gz
├── stonedb-gcc-rocksdb.tar.gz
└── stonedb-marisa.tar.gz
0 directories, 6 files
```
### Step 2. Build a Docker image
```bash
[root@testOS atomstore2022]# cd docker
[root@testOS docker]# chmod u+x docker_build.sh
# If an image has been created in your environment before, the cache can be used. If this is the first image created in your environment, dependencies must be installed, which may take a longer time.
# Run the './docker_build.sh <tag>' command to call the script. <tag> specifies the tag of the image.
# Example './docker_build.sh 0.1'
[root@testOS docker]# ./docker_build.sh v0.1
/home/src
Sending build context to Docker daemon 99.41MB
Step 1/14 : FROM centos:7
---> eeb6ee3f44bd
Step 2/14 : ENV container docker
---> Using cache
---> dc33c0e29f61
Step 3/14 : RUN (cd /lib/systemd/system/sysinit.target.wants/; for i in *; do [ $i == systemd-tmpfiles-setup.service ] || rm -f $i; done); rm -f /lib/systemd/system/multi-user.target.wants/*; rm -f /etc/systemd/system/*.wants/*; rm -f /lib/systemd/system/local-fs.target.wants/*; rm -f /lib/systemd/system/sockets.target.wants/*udev*; rm -f /lib/systemd/system/sockets.target.wants/*initctl*; rm -f /lib/systemd/system/basic.target.wants/*; rm -f /lib/systemd/system/anaconda.target.wants/*;
# After the 'cmake' command completes, run the 'make' and 'make install' commands.
[root@06f1f385d3b3 build]# make
[root@06f1f385d3b3 build]# make install
```
## (Optional) Follow-up operations
After the `make` commands succeed, you can either compress the compilation output into a TAR file and copy it out of the container, or run StoneDB directly in the container.
### Compress compilation files to a TAR file
```bash
# Compress the 'home' folder to a TAR file and copy it to a directory outside the container.
```
You can refer to [Quick Deployment](https://stoneatom.yuque.com/staff-ft8n1u/dghuxr/pv8ath) or the following code to deploy and use StoneDB in the container.
```bash
[root@06f1f385d3b3 build]# cd /stonedb56/install/
[root@06f1f385d3b3 install]# groupadd mysql
[root@06f1f385d3b3 install]# useradd -g mysql mysql
[root@06f1f385d3b3 install]# ll
total 180
-rw-r--r--. 1 root root 17987 Jun 8 03:41 COPYING
-rw-r--r--. 1 root root 102986 Jun 8 03:41 INSTALL-BINARY
-rw-r--r--. 1 root root 2615 Jun 8 03:41 README
drwxr-xr-x. 2 root root 4096 Jun 8 06:16 bin
drwxr-xr-x. 3 root root 18 Jun 8 06:16 data
drwxr-xr-x. 2 root root 55 Jun 8 06:16 docs
drwxr-xr-x. 3 root root 4096 Jun 8 06:16 include
-rwxr-xr-x. 1 root root 267 Jun 8 03:41 install.sh
drwxr-xr-x. 3 root root 272 Jun 8 06:16 lib
drwxr-xr-x. 4 root root 30 Jun 8 06:16 man
drwxr-xr-x. 10 root root 4096 Jun 8 06:16 mysql-test
-rwxr-xr-x. 1 root root 12516 Jun 8 03:41 mysql_server
-rwxr-xr-x. 1 root root 764 Jun 8 03:41 reinstall.sh
drwxr-xr-x. 2 root root 57 Jun 8 06:16 scripts
drwxr-xr-x. 28 root root 4096 Jun 8 06:16 share
drwxr-xr-x. 4 root root 4096 Jun 8 06:16 sql-bench
-rw-r--r--. 1 root root 5526 Jun 8 03:41 stonedb.cnf
drwxr-xr-x. 2 root root 136 Jun 8 06:16 support-files
[root@06f1f385d3b3 install]# ./reinstall.sh
...
# If the following information is returned, StoneDB is started.
+ log_success_msg
+ /etc/redhat-lsb/lsb_log_message success
/etc/redhat-lsb/lsb_log_message: line 3: /etc/init.d/functions: No such file or directory
/etc/redhat-lsb/lsb_log_message: line 11: success: command not found
make install-shared INSTALL_PATH=/usr/local/stonedb-gcc-rocksdb
make static_lib
make install-static INSTALL_PATH=/usr/local/stonedb-gcc-rocksdb
```
The directories and files shown in the following figure are generated in directory **/usr/local/stonedb-gcc-rocksdb**.
*Here's a picture to add*
1. Switch the GCC version back to 7.3.0. Otherwise, errors will be reported.
```shell
sudo rm /usr/bin/gcc
sudo ln -s /gcc/bin/gcc /usr/bin/gcc
sudo rm /usr/bin/g++
sudo ln -s /gcc/bin/g++ /usr/bin/g++
```
6. Install Boost.
Boost can be automatically installed when you execute the **stonedb_build.sh** script stored in directory **/stonedb2022/scripts**. The following code shows how to manually install Boost.
```shell
tar -zxvf boost_1_66_0.tar.gz
cd boost_1_66_0
./bootstrap.sh --prefix=/usr/local/stonedb-boost
./b2 install --with=all
```
The files and directories shown in the following figure are generated in directory **/usr/local/stonedb-boost**.
After the compilation is complete, directory **/stonedb56** is generated.
:::info
- Because Boost in this example is manually installed, the value of **-DWITH_BOOST** must be set to **/usr/local/stonedb-boost/include**.
- For compatibility purposes, **-DCMAKE_CXX_FLAGS='-D_GLIBCXX_USE_CXX11_ABI=0'** must be included in the script. Otherwise, an error will be reported when the compilation progress reaches 82%.
:::
## Step 3. Start StoneDB
Perform the following steps to start StoneDB.
### 1. Create a user group, a user, and directories
Navicat is a database management tool that allows you to connect to databases. You can use Navicat to connect to StoneDB and other relational databases such as Oracle, MySQL, and PostgreSQL. After you connect to StoneDB by using Navicat, you can create, manage, and maintain StoneDB on the Navicat graphical user interface (GUI).
This topic shows you how to use Navicat to connect to StoneDB.
## Prerequisites
Navicat has been installed.
## Procedure
1. Open Navicat and choose **File** > **New Connection** > **MySQL**.
*Here's a picture to add*
2. In the dialog box that appears, click the **General** tab, and enter the connection name, server IP address, port, username, and password. The following figure provides an example.
*Here's a picture to add*
3. Click **Test Connection**. If message "Connection successful" appears, the connection to StoneDB is established.
*Here's a picture to add*
:::info
You cannot use Navicat to connect to StoneDB as a super administrator ('root'@'localhost').
:::
Create a database. For example, execute the following SQL statement to create a database named **test_db** that uses **utf8mb4** as the default character set:
```sql
create database test_db DEFAULT CHARACTER SET utf8mb4;
```
List databases by executing the following SQL statement:
```sql
show databases;
```
Use a database. For example, execute the following SQL statement to use database **test_db**:
```sql
use test_db;
```
Drop a database. For example, execute the following SQL statement to drop database **test_db**:
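```sql
drop database test_db;
```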
Create a stored procedure. For example, perform the following two steps to create a stored procedure named **add_user**, used to insert 1,000,000 random data records.
1. Execute the following SQL statement to create a table:
```sql
CREATE TABLE t_test(
  id INT NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(10) NOT NULL,
  last_name VARCHAR(10) NOT NULL,
  sex VARCHAR(5) NOT NULL,
  score INT NOT NULL,
  copy_id INT NOT NULL,
  PRIMARY KEY(`id`)
) engine=STONEDB;
```
2. Execute the following SQL statement to create the stored procedure:
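The procedure body is not shown in this topic. The following is a minimal sketch of what such a procedure could look like (the name **add_user** comes from the text above, but the random-data logic is an assumption, not the exact procedure used in the test):
```sql
delimiter //
create procedure add_user(in num int)
begin
  declare i int default 0;
  while i < num do
    -- Generate random 10-character names and a random score.
    insert into t_test(first_name, last_name, sex, score, copy_id)
    values (substring(md5(rand()), 1, 10),
            substring(md5(rand()), 1, 10),
            if(rand() > 0.5, 'man', 'woman'),
            floor(rand() * 100),
            i);
    set i = i + 1;
  end while;
end //
delimiter ;
-- Usage: call add_user(1000000);
```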
Create a table. For example, execute the following SQL statement to create a table which is named **student** and consists of the **id**, **name**, **age**, and **birthday** fields:
```sql
create table student(
  id int(11) primary key,
  name varchar(255),
  age smallint,
  birthday DATE
) engine=stonedb;
```
Query the schema of a table. For example, execute the following SQL statement to query the schema of table **student**:
```sql
show create table student\G
```
Drop a table. For example, execute the following SQL statement to drop table **student**:
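```sql
drop table student;
```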
By default, all parameters of the StoneDB storage engine are saved in **/stonedb/install/stonedb.cnf**. Parameters of other storage engines can also be saved in file **stonedb.cnf**. If you want to modify parameter settings of the StoneDB storage engine, you must modify them in file **stonedb.cnf**, and then restart the StoneDB instance to make the modification take effect. This is because the StoneDB storage engine supports only static modification of parameter settings, which is different from other storage engines.
You can configure parameters based on your environment requirements. The following examples show how to configure parameters respectively in a dynamic and static manner.
# Example 1: Change the storage engine type
Parameter **default_storage_engine** specifies the storage engine type. You can dynamically set this parameter at the session level or the global level. However, if the database is restarted, the value of this parameter is restored to the default value. To make the change permanent, modify the value of this parameter in file **stonedb.cnf** and restart the StoneDB instance.
Code example of changing the default storage engine:
```shell
# mysql -uroot -p -P3308
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 926
Server version: 5.7.36-StoneDB-log build-
Copyright (c) 2000, 2022 StoneAtom Group Holding Limited
No entry for terminal type "xterm";
using dumb terminal settings.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show variables like 'default_storage_engine';
+------------------------+--------+
| Variable_name | Value |
+------------------------+--------+
| default_storage_engine | MyISAM |
+------------------------+--------+
1 row in set (0.00 sec)
mysql> set global default_storage_engine=StoneDB;
Query OK, 0 rows affected (0.00 sec)
mysql> exit
Bye
# mysql -uroot -p -P3308
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 927
Server version: 5.7.36-StoneDB-log build-
Copyright (c) 2000, 2022 StoneAtom Group Holding Limited
No entry for terminal type "xterm";
using dumb terminal settings.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show variables like 'default_storage_engine';
+------------------------+---------+
| Variable_name | Value |
+------------------------+---------+
| default_storage_engine | STONEDB |
+------------------------+---------+
1 row in set (0.00 sec)
```
The default storage engine of the database is changed from **MyISAM** to **STONEDB** at the global level.
After the StoneDB instance is restarted, the value of **default_storage_engine** is restored to **MyISAM**. To make your change persistent, edit file **stonedb.cnf** to modify the parameter setting and then restart the StoneDB instance.
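For example, the persistent setting might look like the following in **stonedb.cnf** (an illustrative excerpt):
```shell
[mysqld]
default_storage_engine=stonedb
```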
# Example 2: Change the insert buffer size
The parameters of the StoneDB storage engine support only static modification. After the parameter settings are modified, restart the StoneDB instance to make the modification take effect.
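As a sketch only (the parameter name below is an assumption; check **stonedb.cnf** in your installation for the exact name and unit), the change is made in the configuration file and is followed by a restart of the instance:
```shell
[mysqld]
# Assumed parameter name; the size is assumed to be in MB.
stonedb_insert_buffer_size=1024
```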
| **Error code** | **Error message** | **Description** |
| --- | --- | --- |
| 2233 (HY000) | Be disgraceful to storage engine, operating is forbidden! | The error message is returned because the DDL operation is not supported. |
| 1031 (HY000) | Table storage engine for 'xxx' doesn't have this option | The error message is returned because the DML operation is not supported. |
| 1040 (HY000) | Too many connections | The error message is returned because the number of connections has reached the maximum. |
| 1045 (28000) | Access denied for user 'u_test'@'%' (using password: YES) | The error message is returned because the username or password is incorrect, or the permissions are insufficient. |
The logic of this syntax is to insert a row of data. The UPDATE statement is executed only if a primary key constraint or unique constraint conflict occurs.
In the SET clause of an UPDATE statement, the equal sign (`=`) functions as an assignment operator: the value on the right side of the operator is assigned to the column on the left side, provided that any WHERE conditions specified in the UPDATE statement are met.
This topic describes the bitwise operators supported by StoneDB.
| **Operator** | **Description** |
| --- | --- |
| `&` | Bitwise AND |
| `\|` | Bitwise OR |
| `^` | Bitwise XOR |
| `~` | Bitwise inversion |
| `<<` | Left shift |
| `>>` | Right shift |
Bitwise operators are used to operate on binary numbers. In a bitwise operation, the involved numbers are first converted to binary numbers to compute the result, and then the result is converted back to a decimal value.
The following code provides an example of using each operator.
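```sql
select 3 & 5;   -- returns 1: 011 AND 101 = 001
select 3 | 5;   -- returns 7: 011 OR 101 = 111
select 3 ^ 5;   -- returns 6: 011 XOR 101 = 110
select ~3;      -- returns 18446744073709551612: inversion of a 64-bit unsigned value
select 3 << 1;  -- returns 6: shift left by one bit
select 3 >> 1;  -- returns 1: shift right by one bit
```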
A character set is a collection of symbols and encodings, where the encodings determine how character strings are stored. A collation is a collection of rules for comparing and sorting character strings. A character set is associated with a collation. If you change either of them, the other also changes.
## Supported character sets
You can execute the `SHOW CHARACTER SET` statement to view the character sets supported by StoneDB.
- **character_set_client** specifies the character set used by the server to receive requests from the client.
- **character_set_connection** specifies the character set to which the server converts from the character set specified by **character_set_client**.
- **character_set_results** specifies the character set used by the server to respond to requests sent by the client.
The character set used by a client to send requests to a server and receive responses from the server is the character set used by the client OS, which can be specified by the **LC_ALL**, **LC_CTYPE**, or **LANG** variable. Among the three variables, **LC_ALL** has the highest priority, **LC_CTYPE** the second highest, and **LANG** the lowest.
If the client OS uses the UTF-8 character set and **default-character-set** is set to **gbk** at client startup, **character_set_client**, **character_set_connection**, and **character_set_results** are automatically set to **gbk**. If the client then requests a table that contains Chinese characters, the representation of each Chinese character on the server differs from that on the client. In a UNIX OS, the data received by the server is converted based on the character set specified by **character_set_connection**, so the data presented to the server is garbled text.
:::info
For StoneDB, if no character set is specified for a database when the database is created, the database uses the character set specified by **character_set_server** by default. If no character set is specified for a table when the table is created, the table uses the character set of the database. Once a table is created, you cannot change or convert its character set.
:::
On StoneDB, the precision of DECIMAL numbers cannot be higher than 18. For example, if you specify **decimal(19)** in your code, an error is reported. **DECIMAL(6,2)** indicates that a value can have at most 6 digits in total, of which at most 2 are to the right of the decimal point, so the value range is [-9999.99, 9999.99].
## String data types
The storage required for a string varies according to the character set in use. The length range also differs. The following table describes the length range of each string data type when character set latin1 is in use.
| **Data type** | **Size** |
| --- | --- |
| CHAR(M) | [0,255] |
| VARCHAR(M) | [0,65535] |
| TINYTEXT | [0,255] |
| TEXT | [0,65535] |
| MEDIUMTEXT | [0,16777215] |
| LONGTEXT | [0,4294967295] |
## Date and time data types
The following table describes the value range of each date and time data type.
| **Field** | **Description** |
| --- | --- |
| MTU | The maximum transmission unit. Default value: 1500. |
| OK | The number of error-free packets. |
| ERR | The number of damaged packets. |
| DRP | The number of dropped packets. |
| OVR | The number of packets that exceeds the threshold. |
| Flg | The flag set for the interface. The value can be:<br/>- **B**: A broadcast address is configured.<br/>- **L**: The interface is a loopback device.<br/>- **M**: All packets are received.<br/>- **R**: The interface is running.<br/>- **U**: The interface is active.<br/> |
Gravity is a data migration tool developed by Mobike and written in Golang. Though it is not frequently updated on GitHub, many developers respond to issues. Gravity supports full synchronization, incremental synchronization, and publishing data updates to message queues. It can be deployed on elastic cloud servers (ECSs), in Docker containers, and on Kubernetes.
It is designed to be a customizable data migration tool that:
- Supports multiple data sources and destinations.
- Supports Kubernetes-based replication clusters.
*TODO*
For more information about Gravity on GitHub, visit [https://github.com/moiot/gravity](https://github.com/moiot/gravity).
## Use cases
- Data Bus: uses change data capture (MySQL binlog, MongoDB Oplog) and batch table scan to publish data to Kafka for downstream consumption.
- Unidirectional data synchronization: fully or incrementally synchronizes data from one MySQL cluster to another MySQL cluster.
- Bidirectional data synchronization: fully or incrementally synchronizes data between two MySQL clusters.
- Synchronization of sharded tables into a merged table: synchronizes MySQL sharded tables into one merged table. You can specify the mapping between the source tables and the destination table.
- Online data mutation: supports data changes during the replication. For example, you can rename, encrypt, and decrypt columns.
## Features
- **Input support**
| **Input** | **Status** |
| --- | --- |
| MySQL Binlog | ✅ |
| MySQL Scan | ✅ |
| Mongo Oplog | ✅ |
| TiDB Binlog | Doing |
| PostgreSQL WAL | Doing |
- **Output support**
| **Output** | **Status** |
| --- | --- |
| Kafka | ✅ |
| MySQL/TiDB | ✅ |
| MongoDB | Doing |
- **Data mutation support**
| **Mutation** | **Status** |
| --- | --- |
| Filter data | ✅ |
| Rename columns | ✅ |
| Delete columns | ✅ |
For information about the architecture, visit: [https://github.com/moiot/gravity/blob/master/docs/2.0/00-arch.md](https://github.com/moiot/gravity/blob/master/docs/2.0/00-arch.md).
### Limits
The binlog format of the data source can only be **row**.
## **Configuration file example**
```bash
# 'name' specifies the cluster name. It is mandatory.
name ="mysql2mysqlDemo"
# Name of the database that stores information about binlog positions and heartbeats. The default value is '_gravity'. This database is automatically generated on the data source.
internal-db-name ="_gravity"
#
# Define the input plugin. The following uses 'mysql' as an example.
#
[input]
# Type of the databases used for synchronization.
type="mysql"
# Synchronization task type. Possible values are 'stream', 'batch', and 'replication'. 'stream' specifies incremental synchronization, 'batch' specifies full synchronization, and 'replication' specifies both full synchronization and incremental synchronization.
mode ="replication"
[input.config.source]
host ="192.168.30.183"
username ="zz"
password ="********"
port = 3307
#
# Define the output plugin. The following uses 'mysql' as an example.
#
[output]
type="mysql"
[output.config.target]
host ="192.168.30.101"
username ="root"
password ="********"
port = 3306
# Define routing rules.
[[output.config.routes]]
match-schema ="zg"
match-table ="test_source_table"
target-schema ="zg"
target-table ="test_target_table
```
## Deployment schemes
### Deploy Gravity on a Docker container
```shell
docker run -d -p 8080:8080 -v ${PWD}/config.toml:/etc/gravity/config.toml --net=host --name=innodb2stone moiot/gravity:latest
```
Then perform the following steps to create a synchronization task:
1. On the Kubernetes dashboard, check that Gravity is running properly and find the port corresponding to **admin web-server**.
1. Use the port to log in to Gravity.
1. Configure the template to create the synchronization task.
The parameters that you need to configure in the template are similar to those provided in the configuration file example.
### Deploy Gravity on an ECS
We do not recommend this scheme because it requires preparations of the Golang environment and the compilation is complex.
```shell
git clone https://github.com/moiot/gravity.git
cd gravity && make
bin/gravity -config mysql2mysql.toml
```
## Configure monitoring for synchronization tasks
Add Gravity to Prometheus to monitor synchronization tasks. The following code provides an example.
```bash
- job_name: "gravity_innodb2stone"
static_configs:
- targets: ["192.168.46.150:8080"]
labels:
instance: innodb2stone
```
The following are two screenshot examples of the Grafana monitoring dashboard. For details about display templates of Grafana, visit [https://github.com/moiot/gravity/tree/master/deploy/grafana](https://github.com/moiot/gravity/tree/master/deploy/grafana).
The data directory contains data files, binlogs, and error logs. If the data directory runs out of capacity, the database is suspended and cannot provide services. To prevent this issue, strengthen capacity monitoring as part of routine maintenance. This topic describes common causes of this issue.
## Big transactions
If big transactions exist, a large number of binlogs are generated. If the binlog cache is insufficient, excess binlogs are temporarily stored in temporary files on disks. Big transactions not only occupy too much disk space but also cause long primary/secondary replication latency. Therefore, we recommend that you split each big transaction into multiple small transactions in your production environment.
## CARTESIAN JOIN
When an SQL statement is not written correctly, for example, when no join condition is specified for a table association, a Cartesian product is generated. If the associated tables are large, the tablespace can be used up. Therefore, we recommend that you check the execution plan each time you finish writing an SQL statement. If "Using join buffer (Block Nested Loop)" appears in the execution plan, check whether the join condition on the driven table is unindexed or missing.
:::info
You can restart StoneDB to release the temporary table space.
:::
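For example (the table names are illustrative), a join written without a join condition can be spotted with EXPLAIN before the statement is run:
```sql
-- No join condition between t1 and t2: the plan produces a Cartesian product
-- and shows "Using join buffer (Block Nested Loop)" in the Extra column.
explain select * from t1, t2;
```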
## Subqueries and grouped orderings
Subqueries and grouped orderings use temporary tables to cache intermediate result sets. If the in-memory temporary tables are too small, the intermediate result sets are temporarily stored in temporary files on disks.
Many issues can cause StoneDB to fail to start. If StoneDB cannot be started, we recommend that you check whether any error information is recorded in **mysqld.log**. This topic describes common causes of a start failure.
## Improper parameter settings
If the failure is caused by improper parameter settings, check **mysqld.log** to see which parameters are improperly configured.
The following example indicates that parameter **datadir** is improperly configured.
```bash
[ERROR] failed to set datadir to /stonedb/install/dataxxx/
```
## Denied access to resources
If the port is occupied, the directory owner is incorrect, or the permissions on the directory are insufficient, StoneDB cannot access the resources it needs. The following is an example error:
```bash
Error: unable to create temporary file; errno: 13
```
## Damaged data pages
If a relevant data page is damaged, StoneDB cannot be started. In this case, you must restore the data page from a backup.