Merge pull request #87 from milvus-io/master

Merge from master Former-commit-id: df85d7d6895b385a9e367609872fc86edc379176

320e5af8 · Jin Hai · GitHub · 05fda8c7 · 796fdcf1 · 320e5af8
11 changed files
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,7 +11,7 @@ Please mark all change in change log and use the ticket from JIRA.
 ## Feature
 ## Task

-# Milvus 0.5.0 (TODO)
+# Milvus 0.5.0 (2019-10-21)

 ## Bug
 - MS-568 - Fix gpuresource free error

--- a/README.md
+++ b/README.md
-![Milvuslogo](https://github.com/milvus-io/docs/blob/branch-0.5.0/assets/milvus_logo.png)
+![Milvuslogo](https://github.com/milvus-io/docs/blob/master/assets/milvus_logo.png)
+

 ![LICENSE](https://img.shields.io/badge/license-Apache--2.0-brightgreen)
 ![Language](https://img.shields.io/badge/language-C%2B%2B-blue)
+[![codebeat badge](https://codebeat.co/badges/e030a4f6-b126-4475-a938-4723d54ec3a7?style=plastic)](https://codebeat.co/projects/github-com-jinhai-cn-milvus-master)

 - [Slack Community](https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk)
 - [Twitter](https://twitter.com/milvusio)
@@ -13,34 +15,49 @@

 # Welcome to Milvus

-Firstly, welcome, and thanks for your interest in [Milvus](https://milvus.io)! No matter who you are, what you do, we greatly appreciate your contribution to help us reinvent data science with Milvus. :beers:
-
 ## What is Milvus

-Milvus is an open source vector search engine which provides state-of-the-art similarity search and analysis for billion-scale feature vectors.
+Milvus is an open source similarity search engine for massive feature vectors. Designed with heterogeneous computing architecture for the best cost efficiency. Searches over billion-scale vectors take only milliseconds with minimum computing resources.

 Milvus provides stable Python, Java and C++ APIs.

-Keep up-to-date with newest releases and latest updates by reading Milvus [release notes](https://milvus.io/docs/en/Releases/v0.4.0/).
+Keep up-to-date with newest releases and latest updates by reading Milvus [release notes](https://milvus.io/docs/en/Releases/v0.5.0/).

- GPU-accelerated search engine
+- Heterogeneous computing

-  Milvus uses CPU/GPU heterogeneous computing architecture to process feature vectors, and are orders of magnitudes faster than traditional databases.
+  Milvus is designed with heterogeneous computing architecture for the best performance and cost efficiency. 

- Various indexes
+- Multiple indexes

-  Milvus supports quantization indexing, tree-based indexing, and graph indexing algorithms. 
+  Milvus supports a variety of indexing types that employs quantization, tree-based, and graph indexing techniques. 

- Intelligent scheduling
+- Intelligent resource management

-  Milvus optimizes the search computation and index building according to your data size and available resources. 
+  Milvus automatically adapts search computation and index building processes based on your datasets and available resources.

 - Horizontal scalability

-  Milvus expands computation and storage by adding nodes during runtime, which allows you to scale the data size without redesigning the system.
+  Milvus supports online / offline expansion to scale both storage and computation resources with simple commands.
+
+- High availability
+
+  Milvus is integrated with Kubernetes framework so that all single point of failures could be avoided.
+
+- High compatibility
+
+  Milvus is compatible with almost all deep learning models and major programming languages such as Python, Java and C++, etc.
+
+- Ease of use
+
+  Milvus can be easily installed in a few steps and enables you to exclusively focus on feature vectors. 
+
+- Visualized monitor
+
+  You can track system performance on Prometheus-based GUI monitor dashboards.

 ## Architecture
-![Milvus_arch](https://github.com/milvus-io/docs/blob/branch-0.5.0/assets/milvus_arch.jpg)
+
+![Milvus_arch](https://github.com/milvus-io/docs/blob/master/assets/milvus_arch.png)

 ## Get started

@@ -117,20 +134,20 @@ To edit Milvus settings in `conf/server_config.yaml` and `conf/log_config.conf`,

 #### Run Python example code

-Make sure [Python 3.4](https://www.python.org/downloads/) or higher is already installed and in use.
+Make sure [Python 3.5](https://www.python.org/downloads/) or higher is already installed and in use.

 Install Milvus Python SDK.

 ```shell
 # Install Milvus Python SDK
-$ pip install pymilvus==0.2.0
+$ pip install pymilvus==0.2.3
 ```

 Create a new file `example.py`, and add [Python example code](https://github.com/milvus-io/pymilvus/blob/master/examples/AdvancedExample.py) to it.

 Run the example code.

-```python
+```shell
 # Run Milvus Python example
 $ python3 example.py
 ```

--- a/codecov.yaml
+++ b/codecov.yaml
+#Configuration File for CodeCov
+coverage:
+  precision: 2
+  round: down
+  range: "70...100"
+
+  status:
+    project: on
+    patch: yes
+    changes: no
+
+comment:
+  layout: "header, diff, changes, tree"
+  behavior: default
--- a/core/src/scheduler/task/SearchTask.cpp
+++ b/core/src/scheduler/task/SearchTask.cpp
@@ -307,71 +307,71 @@ XSearchTask::MergeTopkToResultSet(const std::vector<int64_t>& input_ids, const s
    }
 }

-void
-XSearchTask::MergeTopkArray(std::vector<int64_t>& tar_ids, std::vector<float>& tar_distance, uint64_t& tar_input_k,
-                            const std::vector<int64_t>& src_ids, const std::vector<float>& src_distance,
-                            uint64_t src_input_k, uint64_t nq, uint64_t topk, bool ascending) {
-    if (src_ids.empty() || src_distance.empty()) {
-        return;
-    }
-
-    uint64_t output_k = std::min(topk, tar_input_k + src_input_k);
-    std::vector<int64_t> id_buf(nq * output_k, -1);
-    std::vector<float> dist_buf(nq * output_k, 0.0);
-
-    uint64_t buf_k, src_k, tar_k;
-    uint64_t src_idx, tar_idx, buf_idx;
-    uint64_t src_input_k_multi_i, tar_input_k_multi_i, buf_k_multi_i;
-
-    for (uint64_t i = 0; i < nq; i++) {
-        src_input_k_multi_i = src_input_k * i;
-        tar_input_k_multi_i = tar_input_k * i;
-        buf_k_multi_i = output_k * i;
-        buf_k = src_k = tar_k = 0;
-        while (buf_k < output_k && src_k < src_input_k && tar_k < tar_input_k) {
-            src_idx = src_input_k_multi_i + src_k;
-            tar_idx = tar_input_k_multi_i + tar_k;
-            buf_idx = buf_k_multi_i + buf_k;
-            if ((ascending && src_distance[src_idx] < tar_distance[tar_idx]) ||
-                (!ascending && src_distance[src_idx] > tar_distance[tar_idx])) {
-                id_buf[buf_idx] = src_ids[src_idx];
-                dist_buf[buf_idx] = src_distance[src_idx];
-                src_k++;
-            } else {
-                id_buf[buf_idx] = tar_ids[tar_idx];
-                dist_buf[buf_idx] = tar_distance[tar_idx];
-                tar_k++;
-            }
-            buf_k++;
-        }
-
-        if (buf_k < output_k) {
-            if (src_k < src_input_k) {
-                while (buf_k < output_k && src_k < src_input_k) {
-                    src_idx = src_input_k_multi_i + src_k;
-                    buf_idx = buf_k_multi_i + buf_k;
-                    id_buf[buf_idx] = src_ids[src_idx];
-                    dist_buf[buf_idx] = src_distance[src_idx];
-                    src_k++;
-                    buf_k++;
-                }
-            } else {
-                while (buf_k < output_k && tar_k < tar_input_k) {
-                    tar_idx = tar_input_k_multi_i + tar_k;
-                    buf_idx = buf_k_multi_i + buf_k;
-                    id_buf[buf_idx] = tar_ids[tar_idx];
-                    dist_buf[buf_idx] = tar_distance[tar_idx];
-                    tar_k++;
-                    buf_k++;
-                }
-            }
-        }
-    }
-
-    tar_ids.swap(id_buf);
-    tar_distance.swap(dist_buf);
-    tar_input_k = output_k;
-}
+// void
+// XSearchTask::MergeTopkArray(std::vector<int64_t>& tar_ids, std::vector<float>& tar_distance, uint64_t& tar_input_k,
+//                            const std::vector<int64_t>& src_ids, const std::vector<float>& src_distance,
+//                            uint64_t src_input_k, uint64_t nq, uint64_t topk, bool ascending) {
+//    if (src_ids.empty() || src_distance.empty()) {
+//        return;
+//    }
+//
+//    uint64_t output_k = std::min(topk, tar_input_k + src_input_k);
+//    std::vector<int64_t> id_buf(nq * output_k, -1);
+//    std::vector<float> dist_buf(nq * output_k, 0.0);
+//
+//    uint64_t buf_k, src_k, tar_k;
+//    uint64_t src_idx, tar_idx, buf_idx;
+//    uint64_t src_input_k_multi_i, tar_input_k_multi_i, buf_k_multi_i;
+//
+//    for (uint64_t i = 0; i < nq; i++) {
+//        src_input_k_multi_i = src_input_k * i;
+//        tar_input_k_multi_i = tar_input_k * i;
+//        buf_k_multi_i = output_k * i;
+//        buf_k = src_k = tar_k = 0;
+//        while (buf_k < output_k && src_k < src_input_k && tar_k < tar_input_k) {
+//            src_idx = src_input_k_multi_i + src_k;
+//            tar_idx = tar_input_k_multi_i + tar_k;
+//            buf_idx = buf_k_multi_i + buf_k;
+//            if ((ascending && src_distance[src_idx] < tar_distance[tar_idx]) ||
+//                (!ascending && src_distance[src_idx] > tar_distance[tar_idx])) {
+//                id_buf[buf_idx] = src_ids[src_idx];
+//                dist_buf[buf_idx] = src_distance[src_idx];
+//                src_k++;
+//            } else {
+//                id_buf[buf_idx] = tar_ids[tar_idx];
+//                dist_buf[buf_idx] = tar_distance[tar_idx];
+//                tar_k++;
+//            }
+//            buf_k++;
+//        }
+//
+//        if (buf_k < output_k) {
+//            if (src_k < src_input_k) {
+//                while (buf_k < output_k && src_k < src_input_k) {
+//                    src_idx = src_input_k_multi_i + src_k;
+//                    buf_idx = buf_k_multi_i + buf_k;
+//                    id_buf[buf_idx] = src_ids[src_idx];
+//                    dist_buf[buf_idx] = src_distance[src_idx];
+//                    src_k++;
+//                    buf_k++;
+//                }
+//            } else {
+//                while (buf_k < output_k && tar_k < tar_input_k) {
+//                    tar_idx = tar_input_k_multi_i + tar_k;
+//                    buf_idx = buf_k_multi_i + buf_k;
+//                    id_buf[buf_idx] = tar_ids[tar_idx];
+//                    dist_buf[buf_idx] = tar_distance[tar_idx];
+//                    tar_k++;
+//                    buf_k++;
+//                }
+//            }
+//        }
+//    }
+//
+//    tar_ids.swap(id_buf);
+//    tar_distance.swap(dist_buf);
+//    tar_input_k = output_k;
+//}

 }  // namespace scheduler
 }  // namespace milvus
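
Note on the hunk above: the commented-out `MergeTopkArray` performs a per-query two-way merge of two already-sorted top-k lists stored as flattened `nq * k` arrays. For readers who want to follow the logic without the C++ index bookkeeping, here is a minimal Python sketch of the same idea. The function name `merge_topk_array` and its return-a-tuple shape are assumptions made for illustration; the C++ version instead swaps the results into `tar_ids` and `tar_distance` in place.

```python
from typing import List, Tuple


def merge_topk_array(
    tar_ids: List[int], tar_dist: List[float], tar_k: int,
    src_ids: List[int], src_dist: List[float], src_k: int,
    nq: int, topk: int, ascending: bool,
) -> Tuple[List[int], List[float], int]:
    """Merge two flattened per-query top-k lists.

    Hypothetical Python counterpart of the commented-out C++ routine.
    Returns (ids, distances, output_k) instead of mutating the target.
    """
    # Mirror the early return in the C++ code: nothing to merge.
    if not src_ids or not src_dist:
        return tar_ids, tar_dist, tar_k

    out_k = min(topk, tar_k + src_k)
    out_ids = [-1] * (nq * out_k)
    out_dist = [0.0] * (nq * out_k)

    for i in range(nq):
        s = t = 0  # read cursors into the i-th source / target rows
        for b in range(out_k):
            s_idx = i * src_k + s
            t_idx = i * tar_k + t
            if t >= tar_k:        # target row exhausted
                take_src = True
            elif s >= src_k:      # source row exhausted
                take_src = False
            elif ascending:       # smaller distance wins (e.g. L2)
                take_src = src_dist[s_idx] < tar_dist[t_idx]
            else:                 # larger score wins (e.g. inner product)
                take_src = src_dist[s_idx] > tar_dist[t_idx]

            o_idx = i * out_k + b
            if take_src:
                out_ids[o_idx] = src_ids[s_idx]
                out_dist[o_idx] = src_dist[s_idx]
                s += 1
            else:  # ties go to the target, as in the C++ version
                out_ids[o_idx] = tar_ids[t_idx]
                out_dist[o_idx] = tar_dist[t_idx]
                t += 1

    return out_ids, out_dist, out_k
```

Because `out_k` never exceeds `tar_k + src_k`, at least one of the two rows still has an unread element at every output position, so the single loop covers both the main merge and the two drain loops of the original.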
--- a/core/src/scheduler/task/SearchTask.h
+++ b/core/src/scheduler/task/SearchTask.h
@@ -42,10 +42,10 @@ class XSearchTask : public Task {
    MergeTopkToResultSet(const std::vector<int64_t>& input_ids, const std::vector<float>& input_distance,
                         uint64_t input_k, uint64_t nq, uint64_t topk, bool ascending, scheduler::ResultSet& result);

-    static void
-    MergeTopkArray(std::vector<int64_t>& tar_ids, std::vector<float>& tar_distance, uint64_t& tar_input_k,
-                   const std::vector<int64_t>& src_ids, const std::vector<float>& src_distance, uint64_t src_input_k,
-                   uint64_t nq, uint64_t topk, bool ascending);
+    //    static void
+    //    MergeTopkArray(std::vector<int64_t>& tar_ids, std::vector<float>& tar_distance, uint64_t& tar_input_k,
+    //                   const std::vector<int64_t>& src_ids, const std::vector<float>& src_distance, uint64_t
+    //                   src_input_k, uint64_t nq, uint64_t topk, bool ascending);

 public:
    TableFileSchemaPtr file_;

--- a/core/unittest/db/test_search.cpp
+++ b/core/unittest/db/test_search.cpp
--- a/tests/milvus_benchmark/requirements.txt
+++ b/tests/milvus_benchmark/requirements.txt
 numpy==1.16.3
 pymilvus>=0.1.18
-pyyaml==3.12
+pyyaml==5.1
 docker==4.0.2
 tableprint==0.8.0
 ansicolors==1.1.8
\ No newline at end of file
--- a/tests/milvus_python_test/test_add_vectors.py
+++ b/tests/milvus_python_test/test_add_vectors.py
@@ -50,8 +50,7 @@ class TestAddBase:
        '''
        vector = gen_single_vector(dim)
        status, ids = connect.add_vectors(table, vector)
-        ret = connect.has_table(table)
-        assert ret == True
+        assert assert_has_table(connect, table)

    @pytest.mark.timeout(ADD_TIMEOUT)
    def test_delete_table_add_vector(self, connect, table):
@@ -618,8 +617,7 @@ class TestAddIP:
        '''
        vector = gen_single_vector(dim)
        status, ids = connect.add_vectors(ip_table, vector)
-        ret = connect.has_table(ip_table)
-        assert ret == True
+        assert assert_has_table(connect, ip_table)

    @pytest.mark.timeout(ADD_TIMEOUT)
    def test_delete_table_add_vector(self, connect, ip_table):

--- a/tests/milvus_python_test/test_delete_vectors.py
+++ b/tests/milvus_python_test/test_delete_vectors.py
--- a/tests/milvus_python_test/test_table.py
+++ b/tests/milvus_python_test/test_table.py
@@ -264,7 +264,7 @@ class TestTable:
        expected: status ok, and no table in tables
        '''
        status = connect.delete_table(table)
-        assert not connect.has_table(table)
+        assert not assert_has_table(connect, table)

    def test_delete_table_ip(self, connect, ip_table):
        '''
@@ -274,7 +274,7 @@ class TestTable:
        expected: status ok, and no table in tables
        '''
        status = connect.delete_table(ip_table)
-        assert not connect.has_table(ip_table)
+        assert not assert_has_table(connect, ip_table)

    @pytest.mark.level(2)
    def test_table_delete_without_connection(self, table, dis_connect):
@@ -314,7 +314,7 @@ class TestTable:
            connect.create_table(param)
            status = connect.delete_table(table_name)
            time.sleep(1)
-            assert not connect.has_table(table_name)
+            assert not assert_has_table(connect, table_name)

    def test_delete_create_table_repeatedly(self, connect):
        '''
@@ -371,7 +371,7 @@ class TestTable:
        def deletetable(milvus):
            status = milvus.delete_table(table)
            # assert not status.code==0
-            assert milvus.has_table(table)
+            assert assert_has_table(milvus, table)
            assert status.OK()

        for i in range(process_num):
@@ -411,11 +411,10 @@ class TestTable:
        def delete(connect,ids):
            i = 0
            while i < loop_num:
-                # assert connect.has_table(table[ids*8+i])
                status = connect.delete_table(table[ids*process_num+i])
                time.sleep(2)
                assert status.OK()
-                assert not connect.has_table(table[ids*process_num+i])
+                assert not assert_has_table(connect, table[ids*process_num+i])
                i = i + 1

        for i in range(process_num):
@@ -444,7 +443,7 @@ class TestTable:
                 'index_file_size': index_file_size,
                 'metric_type': MetricType.L2}
        connect.create_table(param)
-        assert connect.has_table(table_name)
+        assert assert_has_table(connect, table_name)

    def test_has_table_ip(self, connect):
        '''
@@ -458,7 +457,7 @@ class TestTable:
                 'index_file_size': index_file_size,
                 'metric_type': MetricType.IP}
        connect.create_table(param)
-        assert connect.has_table(table_name)
+        assert assert_has_table(connect, table_name)

    @pytest.mark.level(2)
    def test_has_table_without_connection(self, table, dis_connect):
@@ -468,7 +467,7 @@ class TestTable:
        expected: has table raise exception
        '''
        with pytest.raises(Exception) as e:
-            status = dis_connect.has_table(table)
+            assert_has_table(dis_connect, table)

    def test_has_table_not_existed(self, connect):
        '''
@@ -478,7 +477,7 @@ class TestTable:
        expected: False
        '''
        table_name = gen_unique_str("test_table")
-        assert not connect.has_table(table_name)
+        assert not assert_has_table(connect, table_name)

    """
    ******************************************************************
@@ -700,7 +699,7 @@ class TestCreateTableDimInvalid(object):
                 'dimension': dimension,
                 'index_file_size': index_file_size,
                 'metric_type': MetricType.L2}
-        if isinstance(dimension, int) and dimension > 0:
+        if isinstance(dimension, int):
            status = connect.create_table(param)
            assert not status.OK()
        else:
@@ -778,7 +777,7 @@ def preload_table(connect, **params):
    return status

 def has(connect, **params):
-    status = connect.has_table(params["table_name"])
+    status = assert_has_table(connect, params["table_name"])
    return status

 def show(connect, **params):

--- a/tests/milvus_python_test/utils.py
+++ b/tests/milvus_python_test/utils.py
@@ -462,6 +462,11 @@ def gen_simple_index_params():
    return gen_params(index_types, nlists)


+def assert_has_table(conn, table_name):
+    status, ok = conn.has_table(table_name)
+    return status.OK() and ok
+    
+
 if __name__ == "__main__":
    import numpy
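
Note on the `assert_has_table` helper added above: it unpacks a `(status, flag)` pair from `has_table` and returns a boolean that is true only when the call succeeded and the table exists. Despite its name, it does not assert anything itself, which is why the call sites wrap it in `assert` or `assert not`. The sketch below exercises that behaviour offline; `FakeStatus` and `FakeConnection` are hypothetical stand-ins written for this illustration and are not part of pymilvus.

```python
class FakeStatus:
    """Hypothetical stand-in for the SDK status object."""

    def __init__(self, code=0):
        self.code = code

    def OK(self):
        return self.code == 0


class FakeConnection:
    """Hypothetical stand-in for a connection; answers has_table from a set."""

    def __init__(self, tables, code=0):
        self._tables = set(tables)
        self._code = code

    def has_table(self, table_name):
        # Same return shape the helper in the diff unpacks: (status, bool).
        return FakeStatus(self._code), table_name in self._tables


def assert_has_table(conn, table_name):
    # Copied from the diff: true only if the call succeeded AND the table exists.
    status, ok = conn.has_table(table_name)
    return status.OK() and ok


healthy = FakeConnection({"demo_table"})
failing = FakeConnection({"demo_table"}, code=1)

print(assert_has_table(healthy, "demo_table"))
print(assert_has_table(healthy, "missing_table"))
print(assert_has_table(failing, "demo_table"))
```

One consequence worth keeping in mind when reading the tests: `assert not assert_has_table(...)` passes both when the table is absent and when the `has_table` call itself returns a non-OK status, so a failed call is indistinguishable from a missing table at those call sites.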