## 9.3.0

#### Project

* Bump up the embedded `swctl` version in OAP Docker image.

#### OAP Server

* Add component ID(133) for impala JDBC Java agent plugin and component ID(134) for impala server.
* Use `prepareStatement` in `H2SQLExecutor#getByIDs` (no functional change).
* Bump up snakeyaml to 1.32 for fixing CVE.
* Fix `DurationUtils.convertToTimeBucket` not verifying the date format.
* Enhance LAL to support converting LogData to DatabaseSlowStatement.
* [**Breaking Change**] Change the LAL script format(Add layer property).
* Adapt ElasticSearch 8.1+, migrate from removed APIs to recommended APIs.
* Support monitoring MySQL slow SQLs.
* Support analyzing cache-related spans to provide metrics and slow commands for cache services from the client side.
* Optimize virtual database, fix dynamic config watcher NPE when default value is null
* Remove physical index existing check and keep template existing check only to avoid meaningless `retry wait`
  in `no-init` mode.
* Make sure the instance list is ordered in the TTL processor to prevent the TTL timer from never running.
* Support monitoring PostgreSQL slow SQLs.
* [**Breaking Change**] Support sharding MySQL database instances and tables
  by [Shardingsphere-Proxy](https://shardingsphere.apache.org/document/current/en/overview/#shardingsphere-proxy).
  SQL-Database requires removing the tables `log_tag/segment_tag/zipkin_query` before OAP starts when upgrading from previous
  releases.
* Fix meter functions `avgHistogram`, `avgHistogramPercentile`, `avgLabeled`, `sumHistogram` having data conflicts when
  downsampling.
* Force sorting of the `readLabeledMetricsValues` result in case the storage(database) doesn't return data consistent
  with the parameter list.
* Fix the wrong watch semantics in Kubernetes watchers, which caused heavy traffic to the API server in some Kubernetes
  clusters. We should use the `Get State and Start at Most Recent` semantic instead of `Start at Exact`
  because we don't need the change history events;
  see https://kubernetes.io/docs/reference/using-api/api-concepts/#semantics-for-watch.
* Unify query services and DAOs codes time range condition to `Duration`.
* [**Breaking Change**] Remove the prometheus-fetcher plugin. Please use OpenTelemetry to scrape Prometheus metrics and
  set up the SkyWalking OpenTelemetry receiver instead.
* BugFix: histogram metrics sent to MAL should be treated as OpenTelemetry style, not Prometheus style:
  ```
  (-infinity, explicit_bounds[i]] for i == 0
  (explicit_bounds[i-1], explicit_bounds[i]] for 0 < i < size(explicit_bounds)
  (explicit_bounds[i-1], +infinity) for i == size(explicit_bounds)
  ```
* Support Golang runtime metrics analysis.
* Add APISIX metrics monitoring
* Support skywalking-client-js reporting an empty `service version` and `page path`; set the default version as `latest` and
  the default page path as `/`(root). Fix the
  error `fetching data (/browser_app_page_pv0) : Can't split endpoint id into 2 parts`.
* [**Breaking Change**] Limit the max length of trace/log/alarm tag's `key=value`, set the max length of column `tags`
  in tables `log_tag/segment_tag/alarm_record_tag` and column `query` in `zipkin_query` and column `tag_value` in `tag_autocomplete` to 256.
  SQL-Database requires altering these columns' length or removing these tables before OAP starts when upgrading from previous releases.
* Optimize the creation conditions of profiling task.
* Lazy load the Kubernetes metadata and switch from event-driven to polling.
  Previously we set up watchers to watch the Kubernetes metadata changes. This is perfect when there are deployment changes and
  SkyWalking can react to the changes in real time. However, when the cluster has many events (such as in a large cluster
  or some special Kubernetes engine like OpenShift), the requests sent from SkyWalking become unpredictable, i.e. SkyWalking might
  send massive requests to the Kubernetes API server, causing heavy load on the API server.
  This PR switches from the watcher mechanism to a polling mechanism. SkyWalking polls the metadata at a specified interval,
  so that the requests sent to the API server are predictable (~10 requests every `interval`, 3 minutes), and the request count is constant
  regardless of the cluster's changes. With this change SkyWalking can't react to cluster changes in real time, but the delay
  is acceptable in our case.
* Optimize the query time of tasks in ProfileTaskCache.
* Fix metrics being put into the wrong slot of the window in the alerting kernel.
* Support `sumPerMinLabeled` in `MAL`.
* Bump up jackson databind, snakeyaml, grpc dependencies.
* Support export `Trace` and `Log` through Kafka.
* Add new config initialization mechanism of module provider. This is a ModuleManager lib kernel level change.
* [**Breaking Change**] Support new records query protocol, rename the column named `service_id` to `entity_id` to support different entities.
  Please re-create the `top_n_database_statement` index/table.
* Remove improper self-obs metrics in JvmMetricsHandler(for Kafka channel).
* gRPC stream canceling code is not logged as an error when the client cancels the stream. The client
  cancels the stream when the pod is terminated.
* [**Breaking Change**] Change the way of loading MAL rules(support pattern).
* Move k8s-related MAL files into `/otel-rules/k8s`.
* [**Breaking Change**] Refactor service mesh protobuf definitions and split TCP-related metrics to individual definition.
* Add `TCP{Service,ServiceInstance,ServiceRelation,ServiceInstanceRelation}` sources and split TCP-related entities out from
  original `Service,ServiceInstance,ServiceRelation,ServiceInstanceRelation`.
* [**Breaking Change**] TCP-related source names are changed, fields of TCP-related sources are changed, please refer to the latest `oal/tcp.oal` file.
* Do not log errors when creating an ElasticSearch index fails because the index already exists.
* Add virtual MQ analysis for native traces.
* Support Python runtime metrics analysis.
* Support `sampledTrace` in LAL.
* Support multiple rules with different names under the same layer of LAL script.
* (Optimization) Reduce the buffer size(queue) of MAL(only) metric streams. Set L1 queue size as 1/20, L2 queue size as 1/2.
* Support monitoring MySQL/PostgreSQL in the cluster mode.
* [**Breaking Change**] Migrate to BanyanDB v0.2.0.
  * Adopt new OR logical operator for,
    1. `MeasureIDs` query
    2. `BanyanDBProfileThreadSnapshotQueryDAO` query
    3. Multiple `Event` conditions query
    4. Metrics query
  * Simplify Group check and creation
  * Partially apply `UITemplate` changes
  * Support `index_only`
  * Return `CompletableFuture<Void>` directly from BanyanDB client
  * Optimize data binary parse methods in *LogQueryDAO
  * Support different indexType
  * Support configuration for TTL and (block|segment) intervals
* Elasticsearch storage: Provide system environment variable(`SW_STORAGE_ES_SPECIFIC_INDEX_SETTINGS`) and support specifying the settings `(number_of_shards/number_of_replicas)` for each index individually.
* Elasticsearch storage: Support updating index settings `(number_of_shards/number_of_replicas)` for the index template after rebooting.
* Optimize MQ Topology analysis. Use entry span's peer from the consumer side as source service when no producer instrumentation(no cross-process reference).
* Refactor JDBC storage implementations to reuse logic.
* Fix `ClassCastException` in `LoggingConfigWatcher`.
* Support span attached event concept in Zipkin and SkyWalking trace query.
* Support span attached events on Zipkin lens UI.
* Force UTF-8 encoding in `JsonLogHandler` of `kafka-fetcher-plugin`.
* Fix max length to 512 of entity, instance and endpoint IDs in trace, log, profiling, topN tables(JDBC storages). The value was 200 by default.
* Add component IDs(135, 136, 137) for EventMesh server and client-side plugins.
* Bump up Kafka client to 2.8.1 to fix CVE-2021-38153.
* Remove `lengthEnvVariable` for `Column` as it never works as expected.
* Add `LongText` to support persisting longer logs as a text type in ElasticSearch, instead of a keyword, to avoid the length limitation.
* Fix wrong system variable name `SW_CORE_ENABLE_ENDPOINT_NAME_GROUPING_BY_OPENAPI`. It was **opaenapi**.
* Fix not-time-series model blocking OAP boots in no-init mode.
* Fix `ShardingTopologyQueryDAO.loadServiceRelationsDetectedAtServerSide` missing the `serviceIds` parameter when invoking the backend.
* Changed system variable `SW_SUPERDATASET_STORAGE_DAY_STEP` to `SW_STORAGE_ES_SUPER_DATASET_DAY_STEP` to be consistent with other ES storage related variables.
* Fix ESEventQueryDAO missing metric_table boolQuery criteria.
* Add default entity name(`_blank`) if absent to avoid NPE in the decoding. This caused `Can't split xxx id into 2 parts`.
* Support dynamically configuring the sampling strategy in network profiling.
* Zipkin module supports BanyanDB storage.
* Zipkin traces query API: sort the result set by start time by default.
* Enhance the cache mechanism in the metric persistent process.
  * This cache only worked when the metric is accessible(readable) from the database. Once the insert execution is delayed
    due to the scale, the cache loses efficacy. It only works for the last time update per minute, considering our
    25s period.
  * Fix ID conflicts for all JDBC storage implementations. Due to the insert delay, the JDBC storage implementation would
    still generate another new insert statement.
* [**Breaking Change**] Remove `core/default/enableDatabaseSession` config.
* [**Breaking Change**] Add `@BanyanDB.TimestampColumn` to identify which column in `Record` provides the timestamp(milliseconds) for BanyanDB,
  since BanyanDB stream requires a timestamp in milliseconds.
  For SQL-Database: add new column `timestamp` for tables `profile_task_log/top_n_database_statement`,
  requires altering this column or removing these tables before OAP starts when upgrading from previous releases.
* Fix Elasticsearch storage: In `No-Sharding Mode`, add specific analyzer to the template before index creation to avoid update index error.
* Internal API: remove undocumented ElasticSearch API usage and use documented one.
* Fix `BanyanDB.ShardingKey` annotation missing in the generated OAL metrics classes.
* Fix Elasticsearch storage: Query `sortMetrics` missing the transformation to the real index column name.
* Rename `BanyanDB.ShardingKey` to `BanyanDB.SeriesID`.
* Self-Observability: Add counters for metrics reading from DB or cached. Dashboard:`Metrics Persistent Cache Count`.
* Self-Observability: Fix `GC Time` calculation.
* Fix Elasticsearch storage: In `No-Sharding Mode`, column's property `indexOnly` not applied and cannot be updated.
* Update the `trace_id` field as storage only(cannot be queried) in `top_n_database_statement`, `top_n_cache_read_command`, `top_n_cache_write_command` index.
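
The "lazy load and poll" entry above for Kubernetes metadata can be pictured with a small sketch. This is not SkyWalking's implementation: the class name `PollingMetadataCache`, the `fetch` callback, and the refresh-on-read strategy are all assumptions made for illustration. Only the 3-minute interval comes from the entry itself. The point it shows is that the number of upstream requests depends on elapsed time, not on how many events the cluster produces.

```python
import time


class PollingMetadataCache:
    """Hypothetical lazily loaded cache that refreshes at most once per interval."""

    def __init__(self, fetch, interval_seconds=180.0, clock=time.monotonic):
        self._fetch = fetch                # callable that queries the API server
        self._interval = interval_seconds  # 3 minutes, as quoted in the entry
        self._clock = clock                # injectable clock, handy for tests
        self._snapshot = None
        self._loaded_at = None             # None means "never loaded" (lazy)

    def get(self):
        """Return the cached snapshot, refetching only when it is stale."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._interval:
            self._snapshot = self._fetch()
            self._loaded_at = now
        return self._snapshot
```

Nothing is fetched until the first `get()` call, and repeated calls inside one interval reuse the same snapshot, so the staleness is bounded by `interval_seconds`.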

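The OpenTelemetry-style bucket boundaries listed in the histogram BugFix entry above (lower bound exclusive, upper bound inclusive, with open-ended first and last buckets) can be sketched as follows. The function names `bucket_index` and `bucket_range` are illustrative assumptions, not part of MAL or the OpenTelemetry SDK.

```python
import math
from bisect import bisect_left


def bucket_index(value, explicit_bounds):
    """Index of the bucket that holds value.

    explicit_bounds must be sorted ascending. bisect_left returns the first
    index i with explicit_bounds[i] >= value, which makes each upper bound
    inclusive, matching (explicit_bounds[i-1], explicit_bounds[i]].
    """
    return bisect_left(explicit_bounds, value)


def bucket_range(i, explicit_bounds):
    """(lower, upper) boundaries of bucket i, using infinities at both ends."""
    lower = -math.inf if i == 0 else explicit_bounds[i - 1]
    upper = math.inf if i == len(explicit_bounds) else explicit_bounds[i]
    return lower, upper
```

With bounds `[10, 50, 100]` there are four buckets: a value equal to `10` lands in bucket 0, while anything above `100` lands in bucket 3, the overflow bucket.
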
#### UI

* Fix: tab activated incorrectly when clicking the tab space.
* Add impala icon for impala JDBC Java agent plugin.
* (Webapp)Bump up snakeyaml to 1.31 for fixing CVE-2022-25857
* [**Breaking Change**] Migrate from Spring Web to Armeria. Use the environment variable `SW_OAP_ADDRESS`
  to change the OAP backend service addresses, like `SW_OAP_ADDRESS=localhost:12800,localhost:12801`, and use the environment
  variable `SW_SERVER_PORT` to change the port. Other Spring-related configurations don't take effect anymore.
* Polish the endpoint list graph.
* Fix styles for an adaptive height.
* Fix setting up a new time range after clicking the refresh button.
* Enhance the process topology graph to support dragging nodes.
* UI-template: Fix metrics calculation in `general-service/mesh-service/faas-function` top-list dashboard.
* Update MySQL dashboard to visualize collected slow SQLs.
* Add virtual cache dashboard.
* Remove `responseCode` fields of all OAL sources, as well as examples to avoid user's confusion.
* Remove All from the endpoints selector.
* Enhance menu configurations to make it easier to change.
* Update PostgreSQL dashboard to visualize collected slow SQLs.
* Add Golang runtime metrics and cpu/memory used rate panels in General-Instance dashboard.
* Add gateway apisix menu.
* Query logs with the specific service ID.
* Bump d3-color from 3.0.1 to 3.1.0.
* Add Golang runtime metrics and cpu/memory used rate panels in FaaS-Instance dashboard.
* Revert logs on trace widget.
* Add a sub-menu for virtual mq.
* Add `readRecords` to metric types.
* Verify dashboard names for new dashboards.
* Associate metrics with the trace widget on dashboards.
* Fix configuration panel styles.
* Remove an unused icon.
* Support labeled value on the service/instance/endpoint list widgets.
* Add menu for virtual MQ.
* Set selector props and update configuration panel styles.
* Add Python runtime metrics and cpu/memory utilization panels to General-Instance and FaaS-Instance dashboards.
* Enhance the legend of metrics graph widget with the summary table.
* Add Apache EventMesh logo file.
* Fix conditions for trace profiling.
* Fix tag keys list and duration condition.
* Fix typo.
* Fix condition logic for trace tree data.
* Enhance tags component to search tags with the input value.
* Fix topology loading style.
* Fix update metric processor for the readRecords and remove readSampledRecords from metrics selector.
* Add trace association for FAAS dashboards.
* Visualize attached events on the trace widget.
* Add HTTP/1.x metrics and HTTP req/resp body collecting tabs on the network profiling widget.
* Implement the task creation UI for the network profiling widget.
* Fix entity types for ProcessRelation.
* Add trace association for general service dashboards.

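The `SW_OAP_ADDRESS` entry above takes a comma-separated list of backend addresses. As a rough illustration of that value format only, the sketch below splits such a string into host and port pairs. The helper name `parse_oap_addresses` is hypothetical, and it assumes the plain `host:port` form shown in the entry; the webapp's actual parsing may differ.

```python
def parse_oap_addresses(raw):
    """Split a value like 'localhost:12800,localhost:12801' into (host, port) pairs."""
    addresses = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and surrounding whitespace
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        addresses.append((host, int(port)))
    return addresses
```

For example, `parse_oap_addresses("localhost:12800,localhost:12801")` yields two entries, one per OAP backend.
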
#### Documentation

* Add `metadata-uid` setup doc about Kubernetes coordinator in the cluster management.
* Add a doc for adding menus to booster UI.
* Move general good read blogs from `Agent Introduction` to `Academy`.
* Add re-post for blog `Scaling with Apache SkyWalking` in the academy list.
* Add re-post for blog `Diagnose Service Mesh Network Performance with eBPF` in the academy list.
* Add **Security Notice** doc.
* Add new docs for `Report Span Attached Events` data collecting protocol.
* Add new docs for `Record` query protocol.
* Update `Server Agents` and `Compatibility` for PHP agent.
* Add docs for profiling.
* Update the network profiling documentation.

All issues and pull requests are [here](https://github.com/apache/skywalking/milestone/149?closed=1)