@@ -215,11 +215,11 @@ More detail for SPI usage, please search by yourself.
Other ShardingSphere [functionality implementation](https://shardingsphere.apache.org/document/current/en/features/spi/) will take effect in the same way.
## 17. How to solve that `DATA MASKING` can't work with JPA?
## 17. How to solve that `data encryption` can't work with JPA?
Answer:
Because DDL support for data masking is not yet complete, a JPA Entity cannot satisfy both the DDL and the DML when JPA's automatic DDL generation is used together with data masking.
Because DDL support for data encryption is not yet complete, a JPA Entity cannot satisfy both the DDL and the DML when JPA's automatic DDL generation is used together with data encryption.
Security control has always been a crucial link of orchestration; data masking falls into this category. For both Internet enterprises and traditional sectors, data security has always been a highly valued and sensitive topic. Data masking refers to transforming sensitive information through masking rules to protect private data. Data that involves client security or business sensitivity, such as ID numbers, phone numbers, card numbers, client numbers and other personal information, requires data masking according to relevant regulations.
Because of that, ShardingSphere provides data masking, which stores users' sensitive information in the database after encryption. When users query it, the information is decrypted and returned to them in its original form.
ShardingSphere makes the encryption and decryption processes totally transparent to users, who can store desensitized data and retrieve the original data without any awareness. In addition, ShardingSphere provides built-in masking algorithms that users can use directly. At the same time, it also provides masking algorithm interfaces that users can implement themselves. After simple configuration, ShardingSphere can use the algorithms provided by users to perform encryption, decryption and masking.
## Preface
The data encryption module is a sub-function module under the distributed governance core function of ShardingSphere. It parses the SQL entered by the user and rewrites it according to the encryption configuration the user provides, thereby encrypting the original data and storing the original data (optional) and the cipher data in the database at the same time. When the user queries the data, it fetches the cipher data from the database, decrypts it, and returns the decrypted original data to the user. Apache ShardingSphere distributed database middleware makes the encryption process automatic and transparent, so that users do not need to pay attention to the details of encryption and decryption and can use decrypted data like ordinary data. In addition, ShardingSphere provides a relatively complete set of solutions both for encrypting services that are already online and for adding encryption to new services.
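As a rough illustration of the write and read paths described above, the following Python sketch stores a cipher value (and optionally the plain value) on insert and decrypts on select. It is not ShardingSphere code: `EncryptingStore`, `toy_encrypt` and `toy_decrypt` are hypothetical names, the table and column names are illustrative, and base64 is only a stand-in for a real cipher.

```python
import base64
import sqlite3


def toy_encrypt(plain: str) -> str:
    # Stand-in for a real cipher such as AES; base64 is NOT encryption.
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def toy_decrypt(cipher: str) -> str:
    return base64.b64decode(cipher.encode("ascii")).decode("utf-8")


class EncryptingStore:
    """Hypothetical wrapper that hides the cipher column from callers."""

    def __init__(self, keep_plain: bool = False) -> None:
        self.keep_plain = keep_plain
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE t_user (name TEXT, pwd_plain TEXT, pwd_cipher TEXT)"
        )

    def insert_user(self, name: str, pwd: str) -> None:
        # Write path: the plain column is optional, the cipher column is always set.
        plain = pwd if self.keep_plain else None
        self.conn.execute(
            "INSERT INTO t_user (name, pwd_plain, pwd_cipher) VALUES (?, ?, ?)",
            (name, plain, toy_encrypt(pwd)),
        )

    def select_pwd(self, name: str) -> str:
        # Read path: fetch the cipher value and hand back the original data.
        row = self.conn.execute(
            "SELECT pwd_cipher FROM t_user WHERE name = ?", (name,)
        ).fetchone()
        return toy_decrypt(row[0])

    def raw_row(self, name: str):
        # What is physically stored, bypassing the wrapper's decryption.
        return self.conn.execute(
            "SELECT pwd_plain, pwd_cipher FROM t_user WHERE name = ?", (name,)
        ).fetchone()
```

The caller only ever sees the value it wrote, while `raw_row` shows that the database holds the transformed value.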
## Demand Analysis
Security control has always been a crucial link of data governance; data encryption falls into this category.
For both Internet enterprises and traditional sectors, data security has always been a highly valued and sensitive topic.
Data encryption refers to transforming sensitive information through encryption rules to protect private data.
Data that involves client security or business sensitivity,
such as ID numbers, phone numbers, card numbers, client numbers and other personal information, requires data encryption according to relevant regulations.
The demand for data encryption is generally divided into two situations in real business scenarios:
...
...
@@ -25,9 +19,16 @@ The demand for data encryption is generally divided into two situations in real
2. A service has already been launched and plaintext has been stored in the database. The relevant department suddenly requires the data of the online business to be encrypted. This scenario generally has to deal with the following three issues:
a) How to encrypt the historical data, a.k.a. data cleaning.
* How to encrypt the historical data, a.k.a. data cleaning.
* How to encrypt newly added data and store it in the database without changing the business SQL and logic, and then decrypt the retrieved data when it is used.
* How to securely, seamlessly and transparently migrate plaintext and ciphertext data between business systems.
## Challenges
b) How to encrypt newly added data and store it in the database without changing the business SQL and logic, and then decrypt the retrieved data when it is used.
In real business scenarios, the relevant business development team often needs to implement and maintain its own encryption and decryption system according to the needs of the company's security department.
When the encryption scenario changes, the encryption system often faces the risk of reconstruction or modification.
In addition, for the online business system, it is relatively complex to realize seamless encryption transformation with transparency, security and low risk without modifying the business logic and SQL.
c) How to securely, seamlessly and transparently migrate plaintext and ciphertext data between business systems
## Goal
**Providing a secure and transparent data encryption solution is the main design goal of the Apache ShardingSphere data encryption module.**
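For example, suppose the database has a table named t_user that actually contains two columns: pwd_plain, which stores plaintext data, and pwd_cipher, which stores ciphertext data, and that logicColumn is defined as pwd. Users should then write SQL against the logicColumn, i.e. INSERT INTO t_user SET pwd = '123'. When ShardingSphere receives this SQL, it finds from the encryption configuration provided by the user that pwd is a logicColumn, and so it encrypts the logic column and its corresponding plaintext data. As can be seen, **ShardingSphere converts the user-facing logic column into the plaintext and ciphertext columns of the underlying database, mapping both the column names and the data.** This is shown in the figure below: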
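In code terms, the logic-column mapping in the t_user example can be sketched roughly as follows. This is a simplified illustration, not ShardingSphere's rewrite engine: `ColumnRule` and `rewrite_insert` are hypothetical names, the insert is modelled as a dictionary instead of parsed SQL, and the `encrypt` callable is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ColumnRule:
    """Hypothetical rule: one logic column maps to physical columns."""
    cipher_column: str
    plain_column: Optional[str] = None


def rewrite_insert(
    table: str,
    values: Dict[str, str],
    rules: Dict[str, ColumnRule],
    encrypt: Callable[[str], str],
) -> Tuple[str, List[str]]:
    """Replace logic columns with their plain/cipher columns and values."""
    physical: Dict[str, str] = {}
    for column, value in values.items():
        rule = rules.get(column)
        if rule is None:
            # Not an encrypted logic column: pass through unchanged.
            physical[column] = value
            continue
        if rule.plain_column is not None:
            physical[rule.plain_column] = value
        physical[rule.cipher_column] = encrypt(value)
    columns = ", ".join(physical)
    placeholders = ", ".join("?" for _ in physical)
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, list(physical.values())
```

With a rule mapping `pwd` to `pwd_plain` and `pwd_cipher`, an insert written against `pwd` is rewritten into an insert against the two physical columns, and the business code never names them.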
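Although this approach does improve data confidentiality, it introduces another problem: identical data is stored as different content in the database, so when users run an equality query on the encrypted column (`SELECT * FROM table WHERE encryptedColumn = ?`), they cannot retrieve all rows that share the same original data. For this reason, we introduce the concept of an assisted query column. The assisted query column is generated by `queryAssistedEncrypt()`. Unlike `decrypt()`, this method encrypts the original data in a different way, but for identical original data it produces identical encrypted data. The output of `queryAssistedEncrypt()` is stored in the database to assist in querying the real data, so the table gains this additional assisted query column.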
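The idea of an assisted query column can be sketched as below. This is a toy model under stated assumptions, not the ShardingSphere implementation: `encrypt` is a deliberately insecure randomized XOR cipher used only to make equal inputs produce different stored values, `query_assisted_encrypt` uses SHA-256 as an assumed deterministic transform, and the "table" is a plain Python list.

```python
import hashlib
import os
from typing import Dict, List


def encrypt(plain: str) -> str:
    """Toy randomized cipher (XOR with a fresh salt); not secure."""
    data = plain.encode("utf-8")
    salt = os.urandom(8)
    body = bytes(b ^ salt[i % len(salt)] for i, b in enumerate(data))
    return salt.hex() + ":" + body.hex()


def decrypt(cipher: str) -> str:
    salt_hex, body_hex = cipher.split(":")
    salt, body = bytes.fromhex(salt_hex), bytes.fromhex(body_hex)
    plain = bytes(b ^ salt[i % len(salt)] for i, b in enumerate(body))
    return plain.decode("utf-8")


def query_assisted_encrypt(plain: str) -> str:
    """Deterministic: equal inputs always give equal outputs."""
    return hashlib.sha256(plain.encode("utf-8")).hexdigest()


def insert(table: List[Dict[str, str]], plain: str) -> None:
    # Each row keeps the cipher column plus the assisted query column.
    table.append(
        {"cipher": encrypt(plain), "assisted": query_assisted_encrypt(plain)}
    )


def select_equal(table: List[Dict[str, str]], plain: str) -> List[str]:
    # Equality is evaluated on the assisted column, results come from the cipher column.
    key = query_assisted_encrypt(plain)
    return [decrypt(row["cipher"]) for row in table if row["assisted"] == key]
```

Two rows holding the same original value differ in the cipher column but match in the assisted column, which is what lets an equality condition find both.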
* The back-end databases are MySQL, Oracle, PostgreSQL, and SQLServer;
* The user needs to encrypt one or more columns in the database table (data encryption & decryption);
* Compatible with all commonly used SQL.
## Unsupported Items
* Users need to process and clean the existing historical data in the database themselves;
* When the encryption function is used together with the sharding function, some special SQL is not supported; please refer to [SQL specification](https://shardingsphere.apache.org/document/current/en/features/sharding/use-norms/sql/);
* Encrypted fields do not support comparison operations, such as greater than, less than, ORDER BY, BETWEEN, LIKE, etc.;
* Encrypted fields do not support calculation operations, such as AVG, SUM, and calculation expressions.
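The reason comparison operations are listed as unsupported can be seen with a small sketch: once values are stored as ciphertext, the database compares the ciphertext, whose order has nothing to do with the order of the original values. The XOR cipher and the `KEY` constant below are hypothetical and insecure, chosen only so the effect is easy to reproduce.

```python
KEY = 0x5A  # hypothetical fixed key for the toy cipher


def toy_encrypt(value: int) -> str:
    """XOR each byte of the decimal text with KEY and return hex."""
    return bytes(b ^ KEY for b in str(value).encode("ascii")).hex()


def toy_decrypt(cipher: str) -> int:
    return int(bytes(b ^ KEY for b in bytes.fromhex(cipher)).decode("ascii"))


ages = [1, 2, 3]
stored = [toy_encrypt(a) for a in ages]

# ORDER BY on the stored column sorts ciphertext, not the original values.
ordered_by_cipher = [toy_decrypt(c) for c in sorted(stored)]

# A range condition such as BETWEEN 1 AND 2 is also evaluated on ciphertext.
low, high = toy_encrypt(1), toy_encrypt(2)
in_range_by_cipher = [toy_decrypt(c) for c in stored if low <= c <= high]
```

Here sorting by the stored column yields 2, 3, 1 and the range condition matches no rows at all, even though two of the original values lie in the range.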
@@ -27,13 +27,13 @@ The database protocol interface is used to regulate parse and adapter protocol o
Its main interface is `DatabaseProtocolFrontendEngine` and built-in implementation types are `MySQLProtocolFrontendEngine` and `PostgreSQLProtocolFrontendEngine`.
### Data Masking
### Data Encryption
The Data masking interface is used to regulate the encryption, decryption, access type, property configuration and other methods of the encryptor.
The data encryption interface is used to regulate the encryption, decryption, access type, property configuration and other methods of the encryptor.
There are mainly two interfaces, `ShardingEncryptor` and `ShardingQueryAssistedEncryptor`, and the built-in implementation types are `AESShardingEncryptor` and `MD5ShardingEncryptor`.
Please refer to [Data Masking](/en/features/orchestration/encrypt/) for the introduction.
Please refer to [data encryption](/en/features/orchestration/encrypt/) for the introduction.
@@ -346,7 +346,7 @@ spring.shardingsphere.props.executor.size= #Executing thread number; default val
spring.shardingsphere.props.check.table.metadata.enabled=#Whether to check meta-data consistency of sharding table when it initializes; default value: false
```
### Data Masking
### Data Encryption
```properties
#Omit data source configurations; keep it consistent with data sharding
#Omit data source, data sharding, read-write split and data masking configurations
#Omit data source, data sharding, read-write split and data encryption configurations
spring.shardingsphere.orchestration.spring_boot_ds_sharding.orchestration-type=#The type of orchestration center: config_center or registry_center or metadata_center
spring.shardingsphere.orchestration.spring_boot_ds_sharding.instance-type=#Center instance type. Example:zookeeper#Registry center type. Example:zookeeper