提交 · 47f85a1ddf27ca2b952ac8820537e9e0abd667b7 · apache / pulsar

02 4月, 2019 2 次提交

Issue #3211: Fix NPE when creating schema after deleting a schema (#3836) · 47f85a1d

由 Sijie Guo 提交于 3月 19, 2019

Fixes #3211
Fixes #2786 

*Motivation*

When a schema is deleted, the schema is not removed directly.
You can still fetch the latest schema but its state is marked as `deleted`.

So when we apply schema compatibility check, we should ignore deleted schema.

*Modifications*

Ignore deleted schema when doing schema compatibility check

47f85a1d

Fix topic name logic for partitioned topics (#3693) · 1c6563eb

由 Sanjeev Kulkarni 提交于 3月 15, 2019

* Since partitioned topics have a -partition-<partitionid> affixed to the topic name,
when doing explicit acking, check for the case to determine the right topic name

* added unittests

1c6563eb

30 3月, 2019 38 次提交

M
Upgrade third party libraries with security vulnerabilities (#3938) · cbd2f643
由 massakam 提交于 3月 30, 2019
```
* Upgrade jackson version to 2.9.8

* Upgrade commons-collections version to 1.10
```
cbd2f643
S
Added ability to specify consumer queue size for function input topics (#3608) · e077374d
由 Sanjeev Kulkarni 提交于 2月 17, 2019
```
* Added ability to specify consumer queue size for function input topics

* fix and update PR
```
e077374d
M

Fixed manual offloading when the last ledger is empty (#3913) · 3418de98
由 Matteo Merli 提交于 3月 29, 2019

3418de98

[schema] store schema type correctly in schema registry (#3940) · ea94ba0f

由 Sijie Guo 提交于 3月 29, 2019

*Motivation*

Fixes #3925

We have 3 places of defining schema type enums. We kept adding
new schema type in pulsar-common. However we don't update the schema type
in wire protocol and schema storage.

This causes `SchemaType.NONE` is stored in SchemaRegistry.
It fails debeizum connector on restarting.

*Modifications*

Make sure all 3 places have consistent schema type definitions.
Record the correct schema type.

ea94ba0f

W
[pulsar-client] add Date/Time/Timestamp schema (#3856) · 2fecbd6a
由 wpl 提交于 3月 28, 2019
```
Fixes #3831
```
2fecbd6a

Classloader choice for validating Source/Sink (#3865) · 4959f51f

由 Sanjeev Kulkarni 提交于 3月 28, 2019

* Try both regular classloader as well as nar class loader for validating source/sinks

* Fixed test

* Fix unittest

* Added more comments to the code

* rename variables

* Wait for the create to succeed before updating. Otherwise there might be some reamnant producers

4959f51f

S
Cleanup logic in JavaInstanceRunnable close method (#3932) · d1c17cd1
由 Sanjeev Kulkarni 提交于 3月 28, 2019
```
* Cleanup logic in JavaInstanceRunnable close method

* Added comments
```
d1c17cd1
F

Fix source/sink creation to use JSON config values. (#3922) · a4655302
由 Fangbin Sun 提交于 3月 29, 2019

a4655302

Include java-xmlbuilder in tiered-storage-jcloud NAR (#3928) (#3929) · ec3d6358

由 Steve M. Kim 提交于 3月 28, 2019

In the POM file for tiered-storage-jcloud, the dependency on
com.jamesmurty.utils/java-xmlbuilder was marked with test scope, but in
fact this dependency is needed during normal use. This commit changes
the POM file so that the java-xmlbuilder dependency is packaged in the
NAR file.

ec3d6358

R
Fix: unloading namespace logging (#3920) · ee760785
由 Rajan Dhabalia 提交于 3月 27, 2019
```
### Motivation
Fix: unloading namespace logging
```
ee760785

[cpp client] implement reference count for close() (#3863) · 5f3f5cfd

由 Ezequiel Lovelle 提交于 3月 27, 2019

Add reference count feature to keep track of reused instances of a consumer
instance, for more details please see commit ff4db8db.

*Modifications*

  - Add refCount instance variable on ConsumerImpl.
  - Use new safeDecrRefCount() on consumer close() in order to know whether
    effective close call should occur or not.
  - Increment reference count when a previous built consumer instance is being
    used by caller.

*Future considerations*

Thereafter when feature preventing duplicated consumer is made for
PartitionedConsumer, MultiTopicsConsumer and PatternMultiTopicsConsumer,
incrRefCount() member could be turned into a pure virtual method.

5f3f5cfd

Logical type use (#3900) · 0bc17d15

由 congbo 提交于 3月 27, 2019

## Motivation

To compromise avro‘s bug for avro filed logical type
https://issues.apache.org/jira/browse/AVRO-1891

## Modifications
add some initialize class

## Verifying this change
Add logical type test in AvroSchemaTest

0bc17d15

[schema] Batch message should propagate the schema version from single message (#3870) · e8002734

由 Sijie Guo 提交于 3月 26, 2019

*Motivation*

Schema version is set on single message's header. But it isn't propagated to
the batch message's header. So when the consumer receives the message batch,
it doesn't contain the schema version.

*Modifications*

When intializing a message batch, propagate the schema version from single message to batch message header.

*Verify this change*

Verified with an integration test. Need to write some unit tests for it.

e8002734

F
[Issue 3896] [pulsar-io] Fix NPE in ElasticSearchSink (#3899) · 3bfc264e
由 Fangbin Sun 提交于 3月 27, 2019
```
* Fix NPE in ElasticSearchSink.

* Init by an empty string
```
3bfc264e
S

Added sensitive annotations (#3866) · a9c47380
由 Sanjeev Kulkarni 提交于 3月 25, 2019

a9c47380
S

Add ability to specify runtime flags in functions/sources/sinks (#3872) · 0e378dc0
由 Sanjeev Kulkarni 提交于 3月 25, 2019

0e378dc0

Enable generation of swagger definitions for functions/sources/sinks (#3871) · fa3bc7e4

由 Sanjeev Kulkarni 提交于 3月 22, 2019

* Enable generation of swagger definitions for functions/sources/sinks

* Added more changes to make it backwards compatible

* Add logic to server functions rest api

* Deleted swagger file since its supposed to be generated.
Also added some missing python packages for building

* Added back deleted file

* Fix comments

* Seperated out sources and sinks

* More changes to collapse all rest apis into one dropdown

fa3bc7e4

M

Fixed reading from file with newlines (#3864) · c216dc0b
由 Matteo Merli 提交于 3月 24, 2019

c216dc0b

Fix read batching message by pulsar reader (#3830) · 8e3f0140

由 penghui 提交于 3月 20, 2019

### Motivation

Fix bug of reader.hasMessageAvailable().

```java
while (reader.hasMessageAvailable()) {
        Message msg = reader.readNext();
    // Do something
}
```

If lastDequeuedMessage with a batchMessageId, reader.hasMessageAvailable() will return false after read first message of batchMessage, because compared message id is a MessageIdImpl(lastMessageIdInBroker) 

### Modifications

Add batching message check.

### Verifying this change

- [x] This change added tests and can be verified as follows:
- *Added UT tests for read batch message and non-batch message*

8e3f0140

R
[pulsar-broker] Fix deadlock: add zk-operation timeout for blocking call on zk-cache (#3633) · a76222f9
由 Rajan Dhabalia 提交于 3月 19, 2019
```
fix: compilation

fix test

fix test
```
a76222f9

[Issue 3806] Fix NPE while call PartitionedProducerImpl.getStats() (#3829) · ede6f6e7

由 Fangbin Sun 提交于 3月 20, 2019

* Fix NPE while call PartitionedProducerImpl.getStats().

* Add unit tests and fix NPE while call MultiTopicsConsumerImpl.getStats()

ede6f6e7

S
Fix the loop of consumer poll, so the consumer can cache more than one record... · 95b44148
由 se7enkings 提交于 3月 20, 2019
```
Fix the loop of consumer poll, so the consumer can cache more than one record in signal poll. (#3852)
```
95b44148

[Issue #3712][python-client] exposing InitialPosition management in the... · c078e71a

由 Le Labourier Marc 提交于 3月 19, 2019

[Issue #3712][python-client] exposing InitialPosition management in the ConsumerConfiguration. (#3714)

### Motivation
PR #3567 introduced the SubscriptionInitialPosition option in ConsumerConfiguration in the CPP client but did not exposed any methods to be able to do it with the Python client.

This PR aims to expose the code introduced in the previous PR to allows Python user to choose the initial position of the consumer in python. 

### Modifications

- Implemented a boost object to expose InitialPosition Enum.
- Added to boost ConsumerConfiguration object the getter/setter of the InitialPosition attribute.
- Added a initial_position parameter to Client.subscribe in order to modify the ConsumerConfiguration instance created.

### Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

c078e71a

revise the schema default type not null (#3752) · 6e76af2f

由 congbo 提交于 3月 19, 2019

Fix #3741

Support define not not allow null field in schema

Add not allow null field schema verify

Does this pull request potentially affect one of the following parts:
If yes was chosen, please highlight the changes

Dependencies (does it add or upgrade a dependency): (no)
The public API: (no)
The schema: (yes)
The default values of configurations: (no)
The wire protocol: (no)
The rest endpoints: (no)
The admin cli options: (no)
Anything that affects deployment: (no)

6e76af2f

Support passing schema definition for JSON and AVRO schemas (#3766) · 41d09796

由 Sijie Guo 提交于 3月 14, 2019

* Support passing schema definition for JSON and AVRO schemas

*Motivation*

Currently AVRO and Schema generated schemas from POJO directly.
Sometime people would like to use pre-generated/defined schemas,
so allow passing in schema definitions would clear the confusions
on parsing schemas from POJO.

*Modifications*

- Abstract a common base class `StructSchema` for AVRO/PROTOBUF/JSON
- Standarize on using avro schema for defining schema (we already did that. this change only makes it clearer)
- Add methods to pass schema definition for JSON and AVRO schemas

*NOTES*

We don't support passing schema definition for PROTOBUF. since we only supported generated messages as POJO
class for protobuf schema, and we generate schema definition from the generated messages. it doesn't make sense
to pass in a different schema definition.

* Add missing license header

41d09796

[schema] use UTF_8 for storing schema information (#3666) · f1f781be

由 Sijie Guo 提交于 2月 22, 2019

*Motivation*

Currently we don't specify charset. It will be causing troubles when client
and broker are using different system charsets.

*Modifications*

Use UTF_8 for the schema information for AVRO/PROTOBUF/JSON

f1f781be

[schema] Introduce `GenericSchema` interface (#3683) · ad888f13

由 Sijie Guo 提交于 2月 28, 2019

*Motivation*

In order to introduce `GenericRecordBuilder`, we need to know the fields in a `GenericSchema`.
Otherwise, there is no way for us to build a GenericRecord.

*Modifications*

This proposal refactors current generic schema by introducing a `GenericSchema`. This generic schema
provides interfaces to retrieve the fields of a `GenericRecordSchema`.

*Additionally*

This proposal adding the primitive schemas into `Schema` class. So people can program primitive schemas
using Schema interface rather than specific implementations.

ad888f13

[schema] Introduce schema builder to build schema. (#3682) · f458ec34

由 Sijie Guo 提交于 2月 26, 2019

*Motivation*

Currently we are supporting POJO based schema in java clients.
POJO schema is only useful when the POJO is predefined. However
in applications like a CDC pipeline, POJO is no predefined, there
is no other way to define a schema.

Since we are using avro schema for schema management, this PR
is proposing a simple schema builder wrapper on avro schema builder.

*Modifications*

Introduce schema builder to build a record schema.

*NOTES*

Currently we only support primitives in defining fields in a record schema in this PR.
We will add nested types in future PRs.

f458ec34

[schema] Introduce multi version generic record schema (#3670) · 89e3410b

由 Sijie Guo 提交于 2月 26, 2019

*Motivation*

Currently AUTO_CONSUME only supports decoding records from latest schema.
All the schema versions are lost. It makes AUTO_CONSUME less useful in some use cases,
such as CDC. Because there is no way for the applications to know which version of schema
that a message is using.

In order to support multi-version schema, we need to propagate schema version from
message header through schema#decode method to the decoded record.

*Modifications*

- Introduce a new decode method `decode(byte[] data, byte[] schemaVersion)`. This allows the implementation
  to leverage the schema version.
- Introduce a method `supportSchemaVersioning` to tell which decode methods to use. Because most of the schema
  implementations such as primitive schemas and POJO based schema doesn't make any sense to use schema version.
- Introduce a SchemaProvider which returns a specific schema instance for a given schema version
- Implement a MultiVersionGenericRecordSchema which decode the messages based on schema version. All the records
  decoded by this schema will have schema version and its corresponding schema definitions.

*NOTES

This implementation only introduce the mechanism. But it doesn't wire the multi-version schema
with auto_consume schema. There will be a subsequent pull request on implementing a schema provider
that fetches and caches schemas from brokers.

89e3410b

J

fix s3 spport for s3 api (#3845) · eb96f937
由 Jia Zhai 提交于 3月 19, 2019

eb96f937
R

[pulsar-client] Fix pulsar-cpp crash on lookup timeout (#3862) · 709d49be
由 Rajan Dhabalia 提交于 3月 19, 2019

709d49be

Fix set-project-version.sh (#3847) · 9160c149

由 Sijie Guo 提交于 3月 19, 2019

*Motivation*

The script doesn't update the parent version for protobuf-shaded module

*Modifications*

use `versions:update-parent` to update parent version

9160c149

[stats] Expose namespace topics count when exposing topic level metrics (#3849) · 5e79dd48

由 Sijie Guo 提交于 3月 19, 2019

*Motivation*

Topics count is a namespace level metric. Currently if `exposeTopicLevelMetricsInPrometheus=true`
we don't expose `pulsar_topics_count`. However `pulsar_topics_count` is a very useful metric for
monitoring the healthy of the cluster.

*Modifications*

if `exposeTopicLevelMetricsInPrometheus=true`, we expose namespace level metrics that are not
exposed at topic level. e.g. `pulsar_topics_count`.

5e79dd48

Use `PULSAR_PREFIX_` for appending new keys (#3858) · 780ba31a

由 Sijie Guo 提交于 3月 19, 2019

*Motivation*

Currently integration tests are broken due to #3827

`PULSAR_` is used almost everywhere in our scripts. If we are using
`PULSAR_` for appending new keys, then we are potentially appending
a wrong key into the config file.

Currently this happens on presto. We are appending `PULSAR_ROOT_LOGGER`
into presto's config file and cause presto fail to load the config.

*Modifications*

Use `PULSAR_PREFIX_` instead of `PULSAR_`

780ba31a

S
Expand add env functionality to add variables if not present (#3827) · b6c767ef
由 Sanjeev Kulkarni 提交于 3月 15, 2019
```
* Add config variables if absent

* Took feedback into account
```
b6c767ef
M

Use correct number of messages in batch for publish rate stats during replication (#3834) · e88fed4a
由 Matteo Merli 提交于 3月 15, 2019

e88fed4a

[cpp client] Bugfix prevent dup consumer for same topic subscription (#3748) · c6aa56fe

由 Ezequiel Lovelle 提交于 3月 15, 2019

Same fix as #3746 but for cpp client.

  - Filter consumers for the same topic name.
  - Add test to verify that same subscription names with different topics are
    allowed to be different consumers subscription instead of reused.

c6aa56fe

T
fix message_id_serialize to empty slice (#3801) · d131a2b5
由 Tevic 提交于 3月 15, 2019
```
* fix message_id_serialize to empty slice

* test for serialize and deserialize
```
d131a2b5