Unverified commit 16ebaade, authored by wu-sheng, committed by GitHub

Optimize IDs reading in the persistent worker. (#7193)

* Optimize IDs reading in the persistent worker.
Parent f5b7c3e3
......@@ -4,39 +4,45 @@ Release Notes.
8.7.0
------------------
#### Project
* Extract dependency management to a bom.
* Add JDK 16 to test matrix.
#### Java Agent
* Supports modifying span attributes in async mode.
* Agent supports the collection of JVM arguments and jar dependency information.
* [Temporary] Support authentication for log report channel. This feature and grpc channel is going to be removed after Satellite 0.2.0 release.
* Remove deprecated gRPC method, `io.grpc.ManagedChannelBuilder#nameResolverFactory`. See [gRPC-java 7133](https://github.com/grpc/grpc-java/issues/7133) for more details.
* [Temporary] Support authentication for log report channel. This feature and grpc channel is going to be removed after
Satellite 0.2.0 release.
* Remove deprecated gRPC method, `io.grpc.ManagedChannelBuilder#nameResolverFactory`.
See [gRPC-java 7133](https://github.com/grpc/grpc-java/issues/7133) for more details.
* Add `Neo4j-4.x` plugin.
* Correct `profile.duration` to `profile.max_duration` in the default `agent.config` file.
* Fix the reponse time of gRPC.
* Fix the response time of gRPC.
#### OAP-Backend
* Disable Spring sleuth meter analyzer by default.
* Only count 5xx as error in Envoy ALS receiver.
* Upgrade apollo core caused by CVE-2020-15170.
* Upgrade kubernetes client caused by CVE-2020-28052.
* Upgrade Elasticsearch 7 client caused by CVE-2020-7014.
* Upgrade jackson related libs caused by CVE-2018-11307, CVE-2018-14718 ~ CVE-2018-14721, CVE-2018-19360 ~ CVE-2018-19362,
CVE-2019-14379, CVE-2019-14540, CVE-2019-14892, CVE-2019-14893, CVE-2019-16335, CVE-2019-16942, CVE-2019-16943,
CVE-2019-17267, CVE-2019-17531, CVE-2019-20330, CVE-2020-8840, CVE-2020-9546, CVE-2020-9547, CVE-2020-9548,
CVE-2018-12022, CVE-2018-12023, CVE-2019-12086, CVE-2019-14439, CVE-2020-10672, CVE-2020-10673, CVE-2020-10968,
CVE-2020-10969, CVE-2020-11111, CVE-2020-11112, CVE-2020-11113, CVE-2020-11619, CVE-2020-11620, CVE-2020-14060,
CVE-2020-14061, CVE-2020-14062, CVE-2020-14195, CVE-2020-24616, CVE-2020-24750, CVE-2020-25649, CVE-2020-35490,
CVE-2020-35491, CVE-2020-35728 and CVE-2020-36179 ~ CVE-2020-36190.
* Upgrade jackson related libs caused by CVE-2018-11307, CVE-2018-14718 ~ CVE-2018-14721, CVE-2018-19360 ~
CVE-2018-19362, CVE-2019-14379, CVE-2019-14540, CVE-2019-14892, CVE-2019-14893, CVE-2019-16335, CVE-2019-16942,
CVE-2019-16943, CVE-2019-17267, CVE-2019-17531, CVE-2019-20330, CVE-2020-8840, CVE-2020-9546, CVE-2020-9547,
CVE-2020-9548, CVE-2018-12022, CVE-2018-12023, CVE-2019-12086, CVE-2019-14439, CVE-2020-10672, CVE-2020-10673,
CVE-2020-10968, CVE-2020-10969, CVE-2020-11111, CVE-2020-11112, CVE-2020-11113, CVE-2020-11619, CVE-2020-11620,
CVE-2020-14060, CVE-2020-14061, CVE-2020-14062, CVE-2020-14195, CVE-2020-24616, CVE-2020-24750, CVE-2020-25649,
CVE-2020-35490, CVE-2020-35491, CVE-2020-35728 and CVE-2020-36179 ~ CVE-2020-36190.
* Exclude log4j 1.x caused by CVE-2019-17571.
* Upgrade log4j 2.x caused by CVE-2020-9488.
* Upgrade nacos libs caused by CVE-2021-29441 and CVE-2021-29442.
* Upgrade netty caused by CVE-2019-20444, CVE-2019-20445, CVE-2019-16869, CVE-2020-11612, CVE-2021-21290, CVE-2021-21295
and CVE-2021-21409.
* Upgrade consul client caused by CVE-2018-1000844, CVE-2018-1000850.
* Upgrade zookeeper caused by CVE-2019-0201.
* Upgrade snake yaml caused by CVE-2017-18640.
* Upgrade embed tomcat caused by CVE-2020-13935.
* Upgrade commons-lang3 to avoid potential NPE in some JDK versions.
......@@ -49,8 +55,11 @@ Release Notes.
* Fix: slowDBAccessThreshold dynamic config error when not configured.
* Performance: cache regex pattern and result, optimize string concatenation in Envoy ALS analyzer.
* Performance: cache metrics id and entity id in `Metrics` and `ISource`.
* Performance: enhance the persistent session mechanism by differentiating the cache timeout for metrics of different
  dimensionality. The cache timeout for minute and hour level metrics has been prolonged to ~5 min.
#### UI
* Fix the date component for log conditions.
* Fix selector keys for duplicate options.
* Add Python celery plugin.
......@@ -59,7 +68,6 @@ Release Notes.
#### Documentation
All issues and pull requests are [here](https://github.com/apache/skywalking/milestone/90?closed=1)
------------------
......
......@@ -51,6 +51,11 @@ import org.apache.skywalking.oap.server.telemetry.api.MetricsTag;
*/
@Slf4j
public class MetricsPersistentWorker extends PersistenceWorker<Metrics> {
/**
* The counter of MetricsPersistentWorker instance, to calculate session timeout offset.
*/
private static long SESSION_TIMEOUT_OFFSITE_COUNTER = 0;
private final Model model;
private final Map<Metrics, Metrics> context;
private final IMetricsDAO metricsDAO;
......@@ -61,6 +66,7 @@ public class MetricsPersistentWorker extends PersistenceWorker<Metrics> {
private final boolean enableDatabaseSession;
private final boolean supportUpdate;
private CounterMetrics aggregationCounter;
private long sessionTimeout = 70_000; // Unit, ms. 70,000ms means more than one minute.
MetricsPersistentWorker(ModuleDefineHolder moduleDefineHolder, Model model, IMetricsDAO metricsDAO,
AbstractWorker<Metrics> nextAlarmWorker, AbstractWorker<ExportEvent> nextExportWorker,
......@@ -98,10 +104,11 @@ public class MetricsPersistentWorker extends PersistenceWorker<Metrics> {
new MetricsTag.Keys("metricName", "level", "dimensionality"),
new MetricsTag.Values(model.getName(), "2", model.getDownsampling().getName())
);
SESSION_TIMEOUT_OFFSITE_COUNTER++;
}
/**
* Create the leaf MetricsPersistentWorker, no next step.
* Create the leaf and down-sampling MetricsPersistentWorker, no next step.
*/
MetricsPersistentWorker(ModuleDefineHolder moduleDefineHolder, Model model, IMetricsDAO metricsDAO,
boolean enableDatabaseSession, boolean supportUpdate) {
......@@ -109,6 +116,10 @@ public class MetricsPersistentWorker extends PersistenceWorker<Metrics> {
null, null, null,
enableDatabaseSession, supportUpdate
);
// For down-sampling metrics, the session timeout is prolonged to 4 times the base value, nearly 5 minutes.
// An offset based on the worker creation sequence is added so that context clears do not overlap,
// which spreads out the load of IDs reading.
this.sessionTimeout = sessionTimeout * 4 + SESSION_TIMEOUT_OFFSITE_COUNTER * 200;
}
/**
......@@ -216,7 +227,7 @@ public class MetricsPersistentWorker extends PersistenceWorker<Metrics> {
if (noInCacheMetrics.isEmpty()) {
return;
}
final List<Metrics> dbMetrics = metricsDAO.multiGet(model, noInCacheMetrics);
if (!enableDatabaseSession) {
// Clear the cache only after results from DB are returned successfully.
......@@ -235,8 +246,8 @@ public class MetricsPersistentWorker extends PersistenceWorker<Metrics> {
while (iterator.hasNext()) {
Metrics metrics = iterator.next();
metrics.extendSurvivalTime(tookTime);
// 70,000ms means more than one minute.
if (metrics.getSurvivalTime() > 70000) {
if (metrics.getSurvivalTime() > sessionTimeout) {
iterator.remove();
}
}
......
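The timeout arithmetic and the eviction loop in the hunks above can be sketched in isolation. This is a minimal illustration, not the SkyWalking implementation: `SessionTimeoutSketch`, `CachedEntry`, `sessionTimeout(...)` and `evictExpired(...)` are hypothetical names, and the sketch assumes that `extendSurvivalTime` simply adds the elapsed time to the entry's survival time. Only the constants (70,000 ms base, factor 4, 200 ms step) and the `>` comparison come from the diff.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-alone sketch; names are illustrative and not part of SkyWalking.
final class SessionTimeoutSketch {
    static final long BASE_TIMEOUT_MS = 70_000; // base session timeout taken from the diff
    static final long OFFSET_STEP_MS = 200;     // per-worker offset step taken from the diff

    // Same arithmetic as the diff: down-sampling workers get base * 4 plus an
    // offset derived from the worker creation counter; other workers keep the base.
    static long sessionTimeout(boolean downSampling, long creationCounter) {
        return downSampling
            ? BASE_TIMEOUT_MS * 4 + creationCounter * OFFSET_STEP_MS
            : BASE_TIMEOUT_MS;
    }

    // Hypothetical stand-in for a cached Metrics object.
    static final class CachedEntry {
        final String id;
        long survivalTime;

        CachedEntry(String id, long survivalTime) {
            this.id = id;
            this.survivalTime = survivalTime;
        }
    }

    // Ages every entry by tookTime (assumed behavior of extendSurvivalTime) and
    // removes those that outlived the timeout. Returns the number of removed entries.
    static int evictExpired(List<CachedEntry> cache, long tookTime, long sessionTimeout) {
        int removed = 0;
        Iterator<CachedEntry> iterator = cache.iterator();
        while (iterator.hasNext()) {
            CachedEntry entry = iterator.next();
            entry.survivalTime += tookTime;
            if (entry.survivalTime > sessionTimeout) {
                iterator.remove(); // safe removal while iterating
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        System.out.println(sessionTimeout(false, 5));
        System.out.println(sessionTimeout(true, 1));
        System.out.println(sessionTimeout(true, 3));

        List<CachedEntry> cache = new ArrayList<>();
        cache.add(new CachedEntry("fresh", 0));
        cache.add(new CachedEntry("old", 280_500));
        System.out.println(evictExpired(cache, 1_000, sessionTimeout(true, 1)));
    }
}
```

Because each worker's timeout differs by a multiple of 200 ms, two down-sampling workers created one after the other would not expire their caches at exactly the same moment, which is the overlap the commit comment says it wants to avoid.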
......@@ -32,8 +32,8 @@ public interface IMetricsDAO extends DAO {
/**
* Read data from the storage by given IDs.
*
* @param model target entity of this query.
* @param metrics metrics list.
* @return the data of all given IDs. Only include existing data. Don't require to keep the same order of ids list.
* @throws IOException when error occurs in data query.
*/
......
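The Javadoc above states two properties of the read: only existing data is returned, and the result need not follow the order of the requested IDs. A caller therefore has to match results by ID rather than by position. The sketch below illustrates that contract with an in-memory map; `MultiGetSketch`, `Row`, `put` and `indexById` are hypothetical names, the signature is simplified to plain string IDs, and the `IOException` of the real interface is omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a storage DAO; not the SkyWalking API.
final class MultiGetSketch {
    // Hypothetical stored record.
    static final class Row {
        final String id;
        final long value;

        Row(String id, long value) {
            this.id = id;
            this.value = value;
        }
    }

    private final Map<String, Row> storage = new LinkedHashMap<>();

    void put(String id, long value) {
        storage.put(id, new Row(id, value));
    }

    // Returns only rows that exist, in storage order rather than request order.
    List<Row> multiGet(List<String> ids) {
        Set<String> wanted = new HashSet<>(ids);
        List<Row> found = new ArrayList<>();
        for (Row row : storage.values()) {
            if (wanted.contains(row.id)) {
                found.add(row);
            }
        }
        return found;
    }

    // Callers index the result by ID instead of relying on list positions.
    static Map<String, Row> indexById(List<Row> rows) {
        Map<String, Row> byId = new HashMap<>();
        for (Row row : rows) {
            byId.put(row.id, row);
        }
        return byId;
    }

    public static void main(String[] args) {
        MultiGetSketch dao = new MultiGetSketch();
        dao.put("b", 2);
        dao.put("a", 1);

        List<String> requested = new ArrayList<>();
        requested.add("a");
        requested.add("missing");
        requested.add("b");

        Map<String, Row> byId = indexById(dao.multiGet(requested));
        for (String id : requested) {
            Row row = byId.get(id);
            System.out.println(id + " -> " + (row == null ? "not in storage" : row.value));
        }
    }
}
```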