Commits · dd7cc890bc1a950f4200c55bcbbee657c44b0e28 · apache / pulsar

30 Jul, 2019 1 commit

[schema] KeyValue schema support using AUTO_CONSUME as key/value schema (#4839) · dd7cc890

Committed by Sijie Guo on Jul 30, 2019

*Motivation*

Currently KeyValue schema doesn't support using AUTO_CONSUME.

This PR is to add this support.

This PR is based on #4836

*Changes*

- refactor a bit on Schema interface to support fetching schema info for both AutoConsumeSchema and KeyValueSchema before subscribing
- add AUTO_CONSUME support to KeyValueSchema
- add tests

dd7cc890

29 Jul, 2019 1 commit

Improve SchemaInfoProvider to fetch schema info asynchronously (#4836) · 91c4254c

Committed by Sijie Guo on Jul 29, 2019

*Motivation*

Currently fetching schema information is done synchronously.
It is called in netty callback threads and will potentially block
async operations.

*Modifications*

Make most of the operations in SchemaInfoProvider asynchronous.

91c4254c

24 Jul, 2019 1 commit

Pulsar SQL supports pulsar's primitive schema (#4728) · 1ab35b01

Committed by congbo on Jul 24, 2019

### Motivation

Continue the PR of #4151

1ab35b01

21 Jul, 2019 1 commit

Allow to configure ack-timeout tick time (#4760) · f13af487

Committed by Matteo Merli on Jul 21, 2019

### Motivation

After the changes in #3118, there has been a sharp increase in memory utilization for the UnackedMessageTracker due to the time buckets being created.

This is especially true when the ack timeout is set to a larger value (e.g. 1h), where 3600 time buckets are created. This leads to using 20MB per partition even when no message is tracked.

This change allows configuring the tick time so that applications can tune it based on their needs.

Additionally, fixed the logic that kept creating hash maps and throwing them away at each tick time iteration, since that creates a lot of garbage and doesn't take advantage of the fact that the hash maps expand based on the required capacity (so next time they are already of the "right" size).

On a final note: the current default of 1sec seems very wasteful. Something like 10s should be more appropriate as default.
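
The bucket arithmetic in this message can be sketched as follows. This is a rough illustration only: `AckTimeoutBuckets` and `bucketCount` are hypothetical names that do not exist in the Pulsar client, and rounding up for a partial tick is an assumption.

```java
// Hypothetical helper, not part of the Pulsar client API.
final class AckTimeoutBuckets {

    /** Number of time buckets needed to cover the ack timeout at the given tick. */
    static long bucketCount(long ackTimeoutMillis, long tickMillis) {
        if (ackTimeoutMillis <= 0 || tickMillis <= 0) {
            throw new IllegalArgumentException("timeout and tick must be positive");
        }
        // Assumption: a partial tick still needs its own bucket, so round up.
        return (ackTimeoutMillis + tickMillis - 1) / tickMillis;
    }

    public static void main(String[] args) {
        long oneHourMillis = 3_600_000L;
        System.out.println(bucketCount(oneHourMillis, 1_000L));  // 1s tick
        System.out.println(bucketCount(oneHourMillis, 10_000L)); // 10s tick
    }
}
```

With a 1h ack timeout, a 1s tick gives the 3600 buckets mentioned above, while a 10s tick would need 360.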

f13af487

13 Jul, 2019 1 commit

Use classloaders to load Java functions (#4685) · 6ff1bbae

Committed by Boyang Jerry Peng on Jul 12, 2019

* Use classloading to load user code for functions

6ff1bbae

25 Jun, 2019 1 commit

[client] Provide a clock for generating publish timestamp for producers (#4562) · 7397b960

Committed by Sijie Guo on Jun 25, 2019

*Motivation*

Currently producers use `System.currentTimeMillis()` as the publish timestamp by default.
However, in some use cases, producers would like to use a different way of generating the publish timestamp.
E.g. in a database use case, a producer might use an HLC (Hybrid Logical Clock) as the publish timestamp;
in integration tests, it might require the producer to use a deterministic way to generate the publish timestamp.

*Changes*

This PR introduces a `clock` in building the client. This allows applications to override the system clock
with their own implementation.

*Verify the change*

Add unit test to test customized clock in both batch and non-batch cases.
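
A minimal sketch of the idea, assuming the clock is a `java.time.Clock`. `TimestampingProducer` is a hypothetical stand-in for a producer and is not a Pulsar class; it only shows how an injected clock makes publish timestamps deterministic in tests.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical stand-in for a producer; not part of the Pulsar client.
final class TimestampingProducer {
    private final Clock clock;

    TimestampingProducer(Clock clock) {
        this.clock = clock;
    }

    /** Publish timestamp taken from the injected clock instead of the system clock. */
    long nextPublishTime() {
        return clock.millis();
    }
}

final class ClockDemo {
    public static void main(String[] args) {
        // Production-like behaviour: wall-clock time.
        TimestampingProducer live = new TimestampingProducer(Clock.systemUTC());
        System.out.println(live.nextPublishTime() > 0);

        // Test-like behaviour: a fixed clock always yields the same timestamp.
        Clock fixed = Clock.fixed(Instant.ofEpochMilli(42L), ZoneOffset.UTC);
        TimestampingProducer deterministic = new TimestampingProducer(fixed);
        System.out.println(deterministic.nextPublishTime());
    }
}
```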

7397b960

20 Jun, 2019 2 commits

[schema] key/value schema enhancement (#4548) · 82d9e716

Committed by Sijie Guo on Jun 19, 2019

*Motivation*

- The code for encoding and decoding key/value schema is spread across multiple places.
- Make code changes to prepare supporting key/value schema in AUTO consumers
- Make schema tools display key/value schema in a pretty format

*Modifications*

- Move the common logic of encoding and decoding key/value schema to a common class KeyValueSchemaInfo
- Expose the common class in DefaultImplementation so that it can be available for public usage
- Fix the display problem on displaying key/value schema

*Verify this change*

- Add a number of unit tests for key/value schemas

82d9e716

Introduce batch message container framework and support key based batching container (#4435) · b45736ad

Committed by lipenghui on Jun 20, 2019

### Motivation

Introduce a batch message container framework to support multiple ways of batching messages.
Currently, Pulsar supports only a basic batch message container. With the framework, other types of batch message container can be implemented quickly, and users can even customize their own.

Add a new batch message container named BatchMessageKeyBasedContainer to support batching messages in Key_Shared subscription mode.
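
The key-based batching idea can be sketched roughly as below. `KeyBasedBatcher` is a hypothetical class written for illustration, not the actual BatchMessageKeyBasedContainer; it assumes string keys and payloads and ignores size limits and flush triggers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real container works on Pulsar messages, not strings.
final class KeyBasedBatcher {
    // One pending batch per key, kept in first-seen key order.
    private final Map<String, List<String>> batches = new LinkedHashMap<>();

    /** Append a payload to the batch that belongs to its key. */
    void add(String key, String payload) {
        batches.computeIfAbsent(key, k -> new ArrayList<>()).add(payload);
    }

    /** Hand out all pending batches and start over with empty state. */
    Map<String, List<String>> flush() {
        Map<String, List<String>> out = new LinkedHashMap<>(batches);
        batches.clear();
        return out;
    }
}
```

Because each batch holds messages of a single key, a whole batch can be routed to one consumer in Key_Shared mode without splitting it.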

b45736ad

18 Jun, 2019 1 commit

Do not strip ExecutionException from the stack trace (#4493) · 571b6846

Committed by Matteo Merli on Jun 18, 2019

* Do not strip ExecutionException from the stack trace

571b6846

15 Jun, 2019 1 commit

Feature - reset cursor on Reader to current position (#4331) · 1c51adc7

Committed by Ezequiel Lovelle on Jun 15, 2019

* Feature - reset cursor on Reader to current position

*Motivation*

In some cases it is useful to include the message at the current
position when the cursor is reset.

This was reported by `vvy` on Slack; no issue has been created to track it.

*Modifications*

  - Add startMessageIdInclusive() to ReaderBuilder to include the current
    position on reset.
  - Add resetIncludeHead field for Reader and Consumer Configuration Data
  - Fix position of cursor for non durable consumer.
  - Improve discard if statement for batch enable mode.
  - Add discard if statement for batch disable mode.
  - Improve test case for latest Reader seek.
  - Add test case to assert the start of specific message id at the expected
    position with data provider scenarios:
      A. Batch enable and start inclusive enable.
      B. Batch enable and start inclusive disable.
      C. Batch disable and start inclusive enable.
      D. Batch disable and start inclusive disable.
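
The inclusive versus exclusive start behaviour listed above can be sketched as a single discard check. This is a simplification under stated assumptions: `StartPositionFilter` is a hypothetical class, and message ids are modelled as a single `long` rather than Pulsar's real message id structure.

```java
// Hypothetical sketch; real message ids are not plain longs.
final class StartPositionFilter {

    /**
     * Decide whether a message should be dropped when reading from startId.
     * Inclusive start keeps the message at startId; exclusive start drops it.
     */
    static boolean shouldDiscard(long messageId, long startId, boolean startInclusive) {
        return startInclusive ? messageId < startId : messageId <= startId;
    }

    public static void main(String[] args) {
        System.out.println(shouldDiscard(5L, 5L, true));  // message at start, inclusive
        System.out.println(shouldDiscard(5L, 5L, false)); // message at start, exclusive
    }
}
```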

1c51adc7

14 Jun, 2019 1 commit

Expose `Replicated From` info in Message (#4524) · f33be02a

Committed by Jia Zhai on Jun 14, 2019

* expose replicated info in message

* add negative test case

f33be02a

30 May, 2019 1 commit

Delayed message delivery implementation (#4062) · ba24d73b

Committed by Matteo Merli on May 29, 2019

* Delayed message delivery implementation

* Fixed compilation

* Allow to configure the delayed tracker implementation

* Use int64 for timestamp

* Address comments

* More tests for TripleLongPriorityQueue

* Removing useless sync block that causes deadlock with consumer close

* Fixed merge conflict

* Avoid new list when passing entries to consumer

* Fixed test. Since entries are evicted from cache, they might be resent in diff order

* Fixed context message builder

* Fixed triggering writePromise when last entry was nullified

* Moved entries filtering from consumer to dispatcher

* Added Javadocs

* Reduced synchronized scope to minimum

ba24d73b

23 May, 2019 1 commit

[schema] AutoConsume should use the schema associated with messages as both... · bf06ef3e

Committed by Sijie Guo on May 23, 2019

[schema] AutoConsume should use the schema associated with messages as both writer and reader schema (#4325)

* [schema] AutoConsume should use the schema associated with messages for both writer and reader schema

*Motivation*

AutoConsume should use the schema associated with the messages for decoding the schemas.

*Modifications*

- provide a flag to enable or disable using the provided schema as the reader schema
- for AUTO_CONSUME schema, disable using the provided schema as the reader schema, so it can use the right
  schema version for decoding messages into the right generic records
- provide a few util methods for displaying schema data

* Handle 64 bytes schema version

* Addressed review comments

bf06ef3e

21 May, 2019 3 commits

[ISSUE #4101] Fix Javadoc for ClientBuilder.keepAliveInterval · 4d9351d2

Committed by wpl on May 21, 2019

Fixes #4101

### Motivation

Fix issue #4101.

### Modifications

Fix the Javadoc of the keepAliveInterval method.

4d9351d2

[pulsar-common] Support Snappy compression for Java (#4259) · 706a18e5

Committed by Fangbin Sun on May 21, 2019

* Support Snappy compression for java.

* Some minor fix to pass unit tests

* Format the cpp code

* Added support for c++ client

* Format the cpp code

706a18e5

Replicated subscriptions - Configuration and client changes (#4299) · 6e512375

Committed by Matteo Merli on May 20, 2019

* Replicated subscriptions - Configuration and client changes

* Added missing header

* Fixed mocked methods for tests

* Fixed typo

6e512375

19 May, 2019 1 commit

Feature / interceptor for ack timeout (#4300) · 6485ccf7

Committed by Ezequiel Lovelle on May 19, 2019

*Motivation*

Provide proper interceptor for messages being redelivered due to ack timeout.

*Modifications*

  - Add test case for onAckTimeoutSend interceptor.
  - Add onAckTimeoutSend() method in ConsumerInterceptor interface.
  - Add handler for onAckTimeoutSend() interceptor in ConsumerBase.
  - Add method call to onAckTimeoutSend() in UnAckedMessageTracker.

6485ccf7

08 May, 2019 1 commit

[pulsar-clients] Support nested struct for GenericRecord (#4177) · 78502a3c

Committed by tuteng on May 08, 2019

### Motivation

Currently, GenericRecordBuilder only supports primitive types, e.g. int, long, string. But it doesn’t support struct type. 

### Modifications

Support nested struct for GenericRecordBuilder, for example AVRO

### Verifying this change

Unit Test Pass

78502a3c

01 May, 2019 1 commit

[pulsar-clients]Store key part of a KeyValue schema into pulsar message keys (#4117) · 7f21501e

Committed by tuteng on May 01, 2019

### Motivation

The current implementation of KeyValue schema stores key and value together as part of message payload. Ideally the key should be stored as part of message key.

This can be done by introducing a property in the KeyValue schema that indicates whether to store the key in the payload or as the message key.

### Modifications

* Add keyIsStoredToMessage for encode and decode of KeyValueSchema

### Verifying this change
Unit test pass
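
The two storage modes described in this commit can be sketched as follows. This is only an illustration: `KeyValueEncodingSketch` and `Encoded` are hypothetical names, and the length-prefixed inline layout is an assumption made for the example rather than a statement about the real wire format.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; not the actual KeyValueSchema implementation.
final class KeyValueEncodingSketch {

    /** Result of encoding: what goes into the message key and what goes into the payload. */
    static final class Encoded {
        final byte[] messageKey; // null when the key travels inside the payload
        final byte[] payload;

        Encoded(byte[] messageKey, byte[] payload) {
            this.messageKey = messageKey;
            this.payload = payload;
        }
    }

    static Encoded encode(byte[] key, byte[] value, boolean keyIsStoredToMessage) {
        if (keyIsStoredToMessage) {
            // Key becomes the message key; the payload carries only the value.
            return new Encoded(key, value);
        }
        // Assumed inline layout: [key length][key][value length][value].
        ByteBuffer buf = ByteBuffer.allocate(4 + key.length + 4 + value.length);
        buf.putInt(key.length).put(key).putInt(value.length).put(value);
        return new Encoded(null, buf.array());
    }
}
```

Storing the key as the message key keeps the payload equal to the encoded value, so features that rely on message keys can see it without decoding the payload.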

7f21501e

28 Apr, 2019 1 commit

Add the multi version schema support (#3876) · d5ff0828

Committed by congbo on Apr 28, 2019

### Motivation

Fix #3742

To decode a message correctly with an AVRO schema, we need to know which schema the message was written with.

### Modification

- Introduced Schema Reader and Schema Writer for StructSchema.
   - Reader is used to decode message
   - Writer is used to encode message
- Implementations of StructSchema provide their own schema reader and writer implementations.
- Introduced a schema reader cache for caching the readers for different schema versions.
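
A reader cache keyed by schema version could look roughly like this. `SchemaReaderCache` is a hypothetical class, schema versions are modelled as `long` values, and the loader function stands in for whatever fetches the schema and builds a reader.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a per-version reader cache; R is the reader type.
final class SchemaReaderCache<R> {
    private final Map<Long, R> readers = new ConcurrentHashMap<>();
    private final Function<Long, R> loader;

    SchemaReaderCache(Function<Long, R> loader) {
        this.loader = loader;
    }

    /** Return the cached reader for this version, building it on first use. */
    R readerFor(long schemaVersion) {
        return readers.computeIfAbsent(schemaVersion, loader);
    }

    int size() {
        return readers.size();
    }
}
```

Each message carries its schema version, so decoding looks up the matching reader and only pays the construction cost once per version.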

d5ff0828

25 Apr, 2019 1 commit

Issue #3653: Kerberos authentication for web resource support (#4097) · 2777b0e4

Committed by Jia Zhai on Apr 25, 2019

Fixes #3653

Master Issue: #3491

*Motivation*
Add Kerberos support for web resources.
This mainly includes 2 parts:

- the HttpClient that works for HttpLookup.
- the BaseResource that works for the admin REST endpoint.

*Modifications*
For Kerberos authentication, several back-and-forth requests are needed for the negotiation between client and server.
This change adds a method authenticationStage in AuthenticationSasl, and a method authenticateHttpRequest in AuthenticationProviderSasl, to do the mutual negotiation.
A saslRoleToken is cached in AuthenticationSasl once authentication succeeds.
When doing SASL authentication, the client first uses the cached saslRoleToken; if the server fails to validate this token, it performs the real SASL authentication.
Changed the unit test SaslAuthenticateTest, which enables SASL authentication in admin and also uses HTTP lookup to verify the change.

2777b0e4

24 Apr, 2019 1 commit

Feature - support seek() on Reader (#4031) · c70438c6

Committed by Ezequiel Lovelle on Apr 23, 2019


*Motivation*

 fix #3976

According to what was discussed in pull #3983, an acceptable solution is
to add a seek() command to Reader in order to reset a non-durable cursor after
the Reader instance has been built.

*Modifications*

  - Bugfix reset() by timestamp on a non-durable consumer, previously the
    cached cursor was not present, therefore the state set by reset() was missed
    resulting in a reset() at the beginning of the cursor instead of a reset()
    at the expected position.
  - Copy seek() commands to Reader interface from Consumer interface.
  - Fix inconsistency with lastDequeuedMessage field after seek() command was
    performed successfully.
  - Fix consumer discarding messages on receive (after seek() command) due to
    messages being present on acknowledge grouping tracker.

c70438c6

23 Apr, 2019 1 commit

Feature / Interceptor for negative ack redelivery (#3962) · 78907501

Committed by Ezequiel Lovelle on Apr 22, 2019

* Feature / Interceptor for negative ack redelivery

*Motivation*

In some scenarios it is helpful to be able to set an interceptor for
redeliveries caused by negative acknowledgments.

*Modifications*

  - Add onNegativeAcksSend() method in ConsumerInterceptor interface.
  - Add handler for onNegativeAcksSend() interceptor in ConsumerBase.
  - Favor forEach on ConsumerInterceptor instead of classic for loop by index.
  - Optimize the for loop by index to avoid computing size() on every iteration.
  - Add call method to onNegativeAckRedelivery() from NegativeAcksTracker.

* Add test case for onNegativeAcksSend interceptor

78907501

22 Apr, 2019 1 commit

PIP-34 Key_Shared subscription core implementation. (#4079) · 2373ca36

Committed by lipenghui on Apr 22, 2019

## Motivation
This is a core implementation for PIP-34; the task tracker for this PIP is ISSUE-4077.

## Modifications
- Add a new subscription type named Key_Shared.
- Add PersistentStickyKeyDispatcherMultipleConsumers to handle message dispatch.
- Add a simple hash-range-based consumer selector.

## Verifying this change
Add new unit tests to verify the hash range selector and message consumption in Key_Shared mode.
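
A hash-range-based consumer selector can be sketched as below. This is a simplified illustration and not the selector added by this commit: `HashRangeSelector` is a hypothetical class, it splits the range evenly among a fixed consumer list, and it uses `String.hashCode()` where a real implementation would likely use a stronger hash.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; consumers are identified by name only.
final class HashRangeSelector {
    static final int RANGE_SIZE = 1 << 16; // assumed size of the hash range

    private final List<String> consumers;

    HashRangeSelector(List<String> consumers) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("at least one consumer is required");
        }
        this.consumers = new ArrayList<>(consumers);
    }

    /** Map the key into the hash range and return the consumer owning that slot. */
    String select(String key) {
        int slot = Math.floorMod(key.hashCode(), RANGE_SIZE);
        // Width of each consumer's contiguous range, rounded up so all slots are covered.
        int width = (RANGE_SIZE + consumers.size() - 1) / consumers.size();
        return consumers.get(slot / width);
    }
}
```

Since the slot depends only on the key, all messages with the same key go to the same consumer for as long as the consumer list stays unchanged.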


* PIP-34 Key_Shared subscription core implementation.
* PIP-34 Add more unit test.
1.test redelivery with Key_Shared subscription
2.test none key dispatch with Key_Shared subscription
3.test ordering key dispatch with Key_Shared subscription
* PIP-34 Fix alignment issue of Pulsar.proto
* PIP-34 Fix TODO: format
* PIP-34 Fix hash and ordering key issues
* PIP-34 documentation for Key_Shared subscription
* PIP-34 Fix cpp test issue.
* PIP-34 Fix cpp format issue.

2373ca36

11 Apr, 2019 1 commit

Allow to configure TypedMessageBuilder through a Map conf object (#4015) · 9c0937b8

Committed by Matteo Merli on Apr 10, 2019

* Allow to configure TypedMessageBuilder through a Map conf object

* Use constants for message confs

* Reverted previous change

* Use Long instead of Number

9c0937b8

03 Apr, 2019 1 commit

[Issue-2122] [pulsar-client] Adding configuration for backoff strategy (#3848) · bdfc0986

Committed by Richard Yu on Apr 02, 2019


Fixes #2122 

### Motivation

The current backoff strategy is fixed by default and is too aggressive. It should be configurable by the user.
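
As a rough sketch of what a configurable backoff means, the class below doubles the delay from a user-supplied initial value up to a user-supplied maximum. `ConfigurableBackoff` is a hypothetical class, not the client's real backoff implementation, and it omits jitter for clarity.

```java
// Hypothetical sketch; parameter names are illustrative only.
final class ConfigurableBackoff {
    private final long initialMillis;
    private final long maxMillis;
    private long next;

    ConfigurableBackoff(long initialMillis, long maxMillis) {
        if (initialMillis <= 0 || maxMillis < initialMillis) {
            throw new IllegalArgumentException("need 0 < initial <= max");
        }
        this.initialMillis = initialMillis;
        this.maxMillis = maxMillis;
        this.next = initialMillis;
    }

    /** Return the delay to wait now and double the following one, capped at the maximum. */
    long nextDelayMillis() {
        long current = next;
        // Compare against max / 2 so the doubling can never overflow.
        next = (next > maxMillis / 2) ? maxMillis : next * 2;
        return current;
    }

    /** Go back to the initial delay, e.g. after a successful reconnect. */
    void reset() {
        next = initialMillis;
    }
}
```

A less aggressive strategy is then just a matter of choosing a larger initial delay or a larger maximum.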

### Documentation

  - Does this pull request introduce a new feature? (yes)
  - If yes, how is the feature documented? (not sure)

bdfc0986

30 Mar, 2019 1 commit

[schema] store schema type correctly in schema registry (#3940) · 7e7175db

Committed by Sijie Guo on Mar 29, 2019

*Motivation*

Fixes #3925

Schema type enums are defined in 3 places. We kept adding
new schema types in pulsar-common, but did not update the schema type
in the wire protocol and schema storage.

This causes `SchemaType.NONE` to be stored in SchemaRegistry,
which makes the Debezium connector fail on restart.

*Modifications*

Make sure all 3 places have consistent schema type definitions.
Record the correct schema type.

7e7175db

28 Mar, 2019 1 commit

[pulsar-client] add Date/Time/Timestamp schema (#3856) · bda6a9cc

Committed by wpl on Mar 28, 2019

Fixes #3831

bda6a9cc

19 Mar, 2019 2 commits

revise the schema default type not null (#3752) · 1a1c557b

Committed by congbo on Mar 19, 2019

### Motivation
Fix #3741 

### Modifications
Support defining fields that do not allow null in a schema

### Verifying this change
Add a test that verifies a schema with fields that do not allow null

Does this pull request potentially affect one of the following parts:
If yes was chosen, please highlight the changes

Dependencies (does it add or upgrade a dependency): (no)
The public API: (no)
The schema: (yes)
The default values of configurations: (no)
The wire protocol: (no)
The rest endpoints: (no)
The admin cli options: (no)
Anything that affects deployment: (no)

1a1c557b

kerberos: authentication between client and broker (#3821) · 7064285c

Committed by Jia Zhai on Mar 19, 2019

Fixes #3652

**Motivation**
Currently both ZooKeeper and BookKeeper can be secured using Kerberos. It would be good to support Kerberos in the Pulsar broker and client as well.
This is the sub-task of #3491 to support Kerberos in the Pulsar broker and client.
Proxy and web resource support will be added in following PRs.
The Kerberos authentication is similar to that in ZooKeeper and BookKeeper, which leverages SASL and GSSAPI, so some of the code there is reused.
PR #3658 was the first version of this PR, before #3677.

**Changes**
provide both client and broker side support for authentication api;
add unit test.

7064285c

14 Mar, 2019 1 commit

Support passing schema definition for JSON and AVRO schemas (#3766) · da68b23c

Committed by Sijie Guo on Mar 14, 2019

* Support passing schema definition for JSON and AVRO schemas

*Motivation*

Currently AVRO and JSON schemas are generated from POJOs directly.
Sometimes people would like to use pre-generated/defined schemas,
so allowing schema definitions to be passed in would clear up the confusion
around parsing schemas from POJOs.

*Modifications*

- Abstract a common base class `StructSchema` for AVRO/PROTOBUF/JSON
- Standardize on using avro schema for defining schema (we already did that; this change only makes it clearer)
- Add methods to pass schema definition for JSON and AVRO schemas

*NOTES*

We don't support passing schema definition for PROTOBUF. since we only supported generated messages as POJO
class for protobuf schema, and we generate schema definition from the generated messages. it doesn't make sense
to pass in a different schema definition.

* Add missing license header

da68b23c

13 Mar, 2019 1 commit

PIP-30: interface for mutual authentication (#3677) · 09e3ed8a

Committed by Jia Zhai on Mar 13, 2019

This implements the mutual auth API discussed in "PIP-30: change authentication provider API to support mutual authentication".
It mainly adds 2 new commands, CommandAuthResponse and CommandAuthChallenge, to the proto to support it.

09e3ed8a

09 Mar, 2019 1 commit

NullPointerException at using BytesSchema.of() (#3754) · 85afd6e4

Committed by Sijie Guo on Mar 09, 2019

Fixes #3734

*Motivation*

Exception occurred when using `BytesSchema.of()`

```
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.pulsar.examples.simple.ProducerExample.main(ProducerExample.java:32)
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at org.apache.pulsar.client.internal.ReflectionUtils.catchExceptions(ReflectionUtils.java:36)
	at org.apache.pulsar.client.internal.DefaultImplementation.newKeyValueSchema(DefaultImplementation.java:158)
	at org.apache.pulsar.client.api.Schema.<clinit>(Schema.java:123)
	... 1 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.pulsar.client.internal.DefaultImplementation.lambda$newKeyValueSchema$16(DefaultImplementation.java:160)
	at org.apache.pulsar.client.internal.ReflectionUtils.catchExceptions(ReflectionUtils.java:34)
	... 3 more
Caused by: java.lang.NullPointerException
	at org.apache.pulsar.client.impl.schema.KeyValueSchema.<init>(KeyValueSchema.java:68)
	... 9 more
```

The problem was introduced by an unusual class loading and reflection sequence.

When accessing `BytesSchema`, `BytesSchema` will try to initialize `Schema`. When initializing `Schema`, it attempts
to initialize `KV_BYTES` using reflection, and initializing `KV_BYTES` requires `BytesSchema`. Hence `KV_BYTES` is not
initialized correctly.

The change is to avoid this recursive class loading.
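
The recursive initialization described here can be reproduced in a small, self-contained sketch. `SchemaLike` and `BytesLike` are hypothetical stand-ins for `Schema` and `BytesSchema`; the sketch uses plain class inheritance and no reflection, so it only shows the ordering problem, not the exact failure path in the stack trace.

```java
// Hypothetical stand-in for Schema: its static field needs BytesLike.
class SchemaLike {
    // Evaluated while SchemaLike is being initialized.
    static final boolean KV_BYTES_SAW_NULL = (BytesLike.of() == null);
}

// Hypothetical stand-in for BytesSchema: initializing it first initializes SchemaLike.
class BytesLike extends SchemaLike {
    private static final BytesLike INSTANCE = new BytesLike();

    static BytesLike of() {
        return INSTANCE;
    }
}

class InitCycleDemo {
    public static void main(String[] args) {
        // Touching BytesLike first starts its initialization, which initializes
        // SchemaLike, which calls BytesLike.of() before INSTANCE has been assigned.
        BytesLike first = BytesLike.of();
        System.out.println(first != null);                // prints "true"
        System.out.println(SchemaLike.KV_BYTES_SAW_NULL); // prints "true"
    }
}
```

In the sketch the null is only recorded; in the reported case the half-initialized value was passed to a constructor, which is where the NullPointerException surfaced.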

85afd6e4

08 Mar, 2019 2 commits

[schema] Introduce GenericRecordBuilder and the avro implementation (#3690) · 0e48029b

Committed by Sijie Guo on Mar 08, 2019

*Motivation*

In order to introduce `GenericRecordBuilder`, we need to know the fields in a `GenericSchema`.
Otherwise, there is no way for us to build a GenericRecord.

*Modifications*

This proposal refactors current generic schema by introducing a `GenericSchema`. This generic schema
provides interfaces to retrieve the fields of a `GenericRecordSchema`.

*Additionally*

This proposal also adds the primitive schemas to the `Schema` class, so people can program primitive schemas
using the Schema interface rather than specific implementations.

0e48029b

Added support for "negative acks" in Java client (#3703) · 1da21612

Committed by Matteo Merli on Mar 07, 2019

* Added support for "negative acks" in Java client

* Fixed redelivery delay to be >= than configured

* Fixed redelivery after timeout

* Fixed timeout interval calculation

* Removed the 1.1 nonsense

* Fixed test cleanup

* Avoid failure when passing empty set of msg ids

1da21612

06 Mar, 2019 1 commit

Broker considers fail-over consumer priority-level (#2954) · c39e7d19

Committed by Rajan Dhabalia on Mar 05, 2019

add java doc

fix partitioned-topic distribution

c39e7d19

05 Mar, 2019 1 commit

Exposing getSchemaVersion in the client by making it public. (#3744) · d847c353

Committed by Yuvaraj L on Mar 05, 2019

* Exposing getSchemaVersion in the client by making it public.

* Implemented getSchemaVersion in TopicMessageImpl.java

* Changed the release version

d847c353

28 Feb, 2019 2 commits

add reset cousor to a specific publish time (#3622) · 1f376e10

Committed by 冉小龙 on Feb 28, 2019

Signed-off-by: xiaolong.ran ranxiaolong716@gmail.com

Fixes #3446 #3565

Motivation
Reset the subscription associated with this consumer to a specific publish time.

1f376e10

[schema] Introduce `GenericSchema` interface (#3683) · f4d56624

Committed by Sijie Guo on Feb 28, 2019

*Motivation*

In order to introduce `GenericRecordBuilder`, we need to know the fields in a `GenericSchema`.
Otherwise, there is no way for us to build a GenericRecord.

*Modifications*

This proposal refactors current generic schema by introducing a `GenericSchema`. This generic schema
provides interfaces to retrieve the fields of a `GenericRecordSchema`.

*Additionally*

This proposal also adds the primitive schemas to the `Schema` class, so people can program primitive schemas
using the Schema interface rather than specific implementations.

f4d56624

26 Feb, 2019 1 commit

[schema] Introduce schema builder to build schema. (#3682) · d46474b2

Committed by Sijie Guo on Feb 26, 2019

*Motivation*

Currently we support POJO-based schemas in Java clients.
A POJO schema is only useful when the POJO is predefined. However,
in applications like a CDC pipeline, the POJO is not predefined, and there
is no other way to define a schema.

Since we are using avro schema for schema management, this PR
is proposing a simple schema builder wrapper on avro schema builder.

*Modifications*

Introduce schema builder to build a record schema.

*NOTES*

Currently we only support primitives in defining fields in a record schema in this PR.
We will add nested types in future PRs.

d46474b2