Revert old version doc #324

Merged
merged 3 commits into from
Aug 16, 2024
13 changes: 2 additions & 11 deletions docusaurus.config.js
@@ -97,19 +97,10 @@ const config = {
           position: "right",
           label: "Document",
           items: [
-            ...versions.slice(0, versions.length - 2).map((version) => ({
+            ...versions.slice(0, 5).map((version) => ({
               label: version,
-              to:
-                version >= "2.3.0"
-                  ? `docs/${version}/about`
-                  : `docs/${version}/intro/about`,
+              to: `docs/${version}/about`
             })),
-            ...versions
-              .slice(versions.length - 2, versions.length)
-              .map((version) => ({
-                label: version === "1.x" ? "1.x(Not Apache Release)" : version,
-                to: `docs/${version}/introduction`,
-              })),
             {
               label: "Next",
               to: "/docs/about",
Expand Down
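For readers unfamiliar with the navbar config: the dropdown entries in the hunk above are plain objects derived from the `versions` array. The following is a minimal sketch of the `slice(0, 5)` form, assuming a hypothetical `versions` list ordered newest first (the real list is loaded elsewhere in `docusaurus.config.js` and is not shown in this diff). `buildVersionItems` is an illustrative name, not a function from the repository.

```javascript
// Hypothetical version list, newest first; not taken from the repository.
const versions = ["2.3.7", "2.3.6", "2.3.5", "2.3.4", "2.3.3", "2.3.2", "2.3.1"];

// Build dropdown entries for the first `limit` versions, all using the
// `docs/<version>/about` route, as in the slice(0, 5) form of the hunk.
function buildVersionItems(allVersions, limit = 5) {
  return allVersions.slice(0, limit).map((version) => ({
    label: version,
    to: `docs/${version}/about`,
  }));
}

const items = buildVersionItems(versions);
console.log(items.map((item) => item.to));

// The other form in the hunk compares version strings with >=, which in
// JavaScript is a lexicographic comparison, not a semantic-version one.
console.log("2.10.0" >= "2.3.0");
console.log("2.3.0-beta" >= "2.3.0");
```

Because strings are compared character by character, `"2.10.0" >= "2.3.0"` evaluates to `false` and `"2.3.0-beta" >= "2.3.0"` evaluates to `true`. A fixed route per entry does not depend on that ordering.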
79 changes: 79 additions & 0 deletions src/pages/versions/config.json
@@ -68,6 +68,42 @@
"docUrl": "/docs/2.3.0/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.3.0",
"sourceTag": "2.3.0"
},
{
"versionLabel": "2.3.0-beta",
"docUrl": "/docs/2.3.0-beta/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.3.0-beta",
"sourceTag": "2.3.0-beta"
},
{
"versionLabel": "2.2.0-beta",
"docUrl": "/docs/2.2.0-beta/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.2.0-beta",
"sourceTag": "2.2.0-beta"
},
{
"versionLabel": "2.1.3",
"docUrl": "/docs/2.1.3/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.3",
"sourceTag": "2.1.3"
},
{
"versionLabel": "2.1.2",
"docUrl": "/docs/2.1.2/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.2",
"sourceTag": "2.1.2"
},
{
"versionLabel": "2.1.1",
"docUrl": "/docs/2.1.1/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.1",
"sourceTag": "2.1.1"
},
{
"versionLabel": "2.1.0",
"docUrl": "/docs/2.1.0/introduction",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.0",
"sourceTag": "2.1.0"
}
]
}
@@ -141,6 +177,49 @@
"docUrl": "/docs/2.3.0/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.3.0",
"sourceTag": "2.3.0"
},
{
"versionLabel": "2.3.0-beta",
"docUrl": "/docs/2.3.0-beta/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.3.0-beta",
"sourceTag": "2.3.0-beta"
},
{
"versionLabel": "2.2.0-beta",
"docUrl": "/docs/2.2.0-beta/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.2.0-beta",
"sourceTag": "2.2.0-beta"
},
{
"versionLabel": "2.1.3",
"docUrl": "/docs/2.1.3/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.3",
"sourceTag": "2.1.3"
},
{
"versionLabel": "2.1.2",
"docUrl": "/docs/2.1.2/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.2",
"sourceTag": "2.1.2"
},
{
"versionLabel": "2.1.1",
"docUrl": "/docs/2.1.1/intro/about",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.1",
"sourceTag": "2.1.1"
},
{
"versionLabel": "2.1.0",
"docUrl": "/docs/2.1.0/introduction",
"downloadUrl": "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.0",
"sourceTag": "2.1.0"
}
],
"historyData1.x": [
{
"versionLabel": "1.x",
"docUrl": "/docs/1.x/introduction",
"sourceTag": "1.x"
}
]
}
4 changes: 2 additions & 2 deletions src/pages/versions/index.js
@@ -42,7 +42,7 @@ export default function () {
 <td>
   <a
     target="_blank"
-    href={ "https://github.com/apache/incubator-seatunnel/tree/" + item.sourceTag }
+    href={ "https://github.com/apache/seatunnel/tree/" + item.sourceTag }
   >
     { dataSource.table.source }
   </a>
@@ -100,7 +100,7 @@ export default function () {
 <td>
   <a
     target="_blank"
-    href={ "https://github.com/apache/incubator-seatunnel/tree/" + item.sourceTag }
+    href={ "https://github.com/apache/seatunnel/tree/" + item.sourceTag }
   >
     { dataSource.table.source }
   </a>
Expand Down
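As a small illustration of how the `href` in these hunks is composed, the sketch below concatenates the renamed repository base URL with the `sourceTag` of one `config.json` entry. `sourceUrl` and `SOURCE_BASE` are hypothetical names introduced here; the page itself builds the string inline in JSX.

```javascript
// Base URL as it appears in the hunk after the repository rename.
const SOURCE_BASE = "https://github.com/apache/seatunnel/tree/";

// Build the "source" link for one entry of config.json.
function sourceUrl(item) {
  return SOURCE_BASE + item.sourceTag;
}

// Entry copied from the config.json hunk in this PR.
const entry = {
  versionLabel: "2.1.0",
  docUrl: "/docs/2.1.0/introduction",
  downloadUrl: "https://github.com/apache/incubator-seatunnel/releases/tag/2.1.0",
  sourceTag: "2.1.0",
};

console.log(sourceUrl(entry));
```

Note that the `1.x` entry in `config.json` has a `sourceTag` but no `downloadUrl`, so only the source link can be derived for it in this way.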
114 changes: 114 additions & 0 deletions versioned_docs/version-1.x/configuration/base.md
@@ -0,0 +1,114 @@
# General configuration

## Core idea

* A Row is one logical record in seatunnel and the basic unit of data processing. When a Filter processes data, every record is mapped to a Row.

* A Field is a field of a Row. A Row can contain nested fields.

* `raw_message` is the field of a Row that holds the data as it was read from the input.

* `__root__` refers to the top level of a Row. It is often used to specify that new fields generated during data processing should be stored as top-level fields of the Row.


---

## Configuration file

A complete seatunnel configuration includes `spark`, `input`, `filter`, `output`, namely:

````
spark {
...
}

input {
...
}

filter {
...
}

output {
...
}

````

* `spark` holds the Spark-related configuration.

For the configurable Spark parameters, see
[Spark Configuration](https://spark.apache.org/docs/latest/configuration.html#available-properties).
The `master` and `deploy-mode` parameters cannot be configured here; they must be specified in the seatunnel startup script.

* `input` can configure any input plugin and its parameters, and the specific parameters vary with different input plugins.

* `filter` can configure any filter plugin and its parameters, and the specific parameters vary with different filter plugins.

Multiple plugins in the filter form a data processing pipeline in the configuration order, and the output of the previous filter is the input of the next filter.

* `output` can configure any output plugin and its parameters, and the specific parameters vary with different output plugins.

The data processed by `filter` will be sent to each plugin configured in `output`.
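The flow described above can be sketched in JavaScript. This is an illustration only: seatunnel itself runs on Spark, and every name below (`splitOnComma`, `upperCaseName`, `runFilters`, `runOutputs`) is a hypothetical stand-in rather than part of seatunnel.

```javascript
// Hypothetical filters: each takes an array of rows and returns a new array.
const splitOnComma = (rows) =>
  rows.map((row) => {
    const [msg, name] = row.raw_message.split(",");
    return { ...row, msg, name };
  });

const upperCaseName = (rows) =>
  rows.map((row) => ({ ...row, name: row.name.toUpperCase() }));

// Apply the filters in configuration order: each output feeds the next filter.
function runFilters(rows, filters) {
  return filters.reduce((current, filter) => filter(current), rows);
}

// Send the filtered rows to every configured output.
function runOutputs(rows, outputs) {
  outputs.forEach((output) => output(rows));
}

const inputRows = [{ raw_message: "hello,seatunnel" }];
const filtered = runFilters(inputRows, [splitOnComma, upperCaseName]);

// Two stand-in outputs: one collects rows, one prints them.
const collected = [];
runOutputs(filtered, [
  (rows) => collected.push(...rows),
  (rows) => console.log(rows),
]);
```

The second filter can read `name` only because the first filter has already produced it, which is why the order of the plugins in `filter` matters.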


---

## Configuration file example

An example is as follows:

> In the configuration, lines beginning with `#` are comments.

````
spark {
# You can set spark configuration here
# seatunnel defined streaming batch duration in seconds
spark.streaming.batchDuration = 5

# see available properties defined by spark: https://spark.apache.org/docs/latest/configuration.html#available-properties
spark.app.name = "seatunnel"
spark.executor.instances = 2
spark.executor.cores = 1
spark.executor.memory = "1g"
}

input {
# This is an example input plugin, intended **only for testing and demonstrating the input plugin feature**
fakestream {
content = ["Hello World, InterestingLab"]
rate = 1
}


# If you would like to get more information about how to configure seatunnel and see full list of input plugins,
# please go to https://interestinglab.github.io/seatunnel-docs/#/en-us/v1/configuration/base
}

filter {
split {
fields = ["msg", "name"]
delimiter = ","
}

# If you would like to get more information about how to configure seatunnel and see full list of filter plugins,
# please go to https://interestinglab.github.io/seatunnel-docs/#/en-us/v1/configuration/base
}

output {
stdout {}


# If you would like to get more information about how to configure seatunnel and see full list of output plugins,
# please go to https://interestinglab.github.io/seatunnel-docs/#/en-us/v1/configuration/base
}
````

For other configurations, please refer to:

[Configuration example 1: Streaming computation](https://github.com/InterestingLab/seatunnel/blob/master/config/streaming.conf.template)

[Configuration example 2: Offline batch processing](https://github.com/InterestingLab/seatunnel/blob/master/config/batch.conf.template)

[Configuration example 3: Flexible processing of multiple data flows](https://github.com/InterestingLab/seatunnel/blob/master/config/complex.conf.template)
45 changes: 45 additions & 0 deletions versioned_docs/version-1.x/configuration/filter-plugin.md
@@ -0,0 +1,45 @@
# Filter plugin

### Filter plugin general parameters

| name | type | required | default value |
| --- | --- | --- | --- |
| [source_table_name](#source_table_name-string) | string | no | - |
| [result_table_name](#result_table_name-string) | string | no | - |


##### source_table_name [string]

When `source_table_name` is not specified, the current plugin processes the dataset output by the previous plugin in the configuration file.

When `source_table_name` is specified, the current plugin processes the dataset registered under that name.

##### result_table_name [string]

When `result_table_name` is not specified, the data processed by this plugin is not registered as a dataset (also called a temporary table) that other plugins can access directly.

When `result_table_name` is specified, the data processed by this plugin is registered as a dataset (also called a temporary table) under that name. Other plugins can access it directly by setting their `source_table_name` to the same name.

### Usage example

````
split {
source_table_name = "view_table_1"
source_field = "message"
delimiter = "&"
fields = ["field1", "field2"]
result_table_name = "view_table_2"
}
````

> The `Split` plugin processes the data in the temporary table `view_table_1` and registers the result as a temporary table named `view_table_2`. Any subsequent `Filter` or `Output` plugin can use this table by setting its `source_table_name` to `view_table_2`.

````
split {
source_field = "message"
delimiter = "&"
fields = ["field1", "field2"]
}
````

> Without `source_table_name` configured, the `Split` plugin will read the dataset passed by the previous plugin and pass it to the next plugin.
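The lookup rules for `source_table_name` and `result_table_name` can be sketched as a small registry of named datasets. This is a simplified JavaScript model under stated assumptions, not seatunnel's implementation: `runPlugins`, the `process` callback, and the pass-through `register` step are all hypothetical.

```javascript
// Run plugins in order, tracking named datasets ("temporary tables").
function runPlugins(initialRows, plugins) {
  const tables = new Map(); // result_table_name -> rows
  let previous = initialRows; // dataset output by the previous plugin

  for (const plugin of plugins) {
    // With source_table_name: read the registered dataset.
    // Without it: read what the previous plugin produced.
    const source = plugin.source_table_name
      ? tables.get(plugin.source_table_name)
      : previous;
    if (source === undefined) {
      throw new Error(`unknown table: ${plugin.source_table_name}`);
    }
    const result = plugin.process(source);
    // With result_table_name: make the result addressable by name.
    if (plugin.result_table_name) {
      tables.set(plugin.result_table_name, result);
    }
    previous = result;
  }
  return { previous, tables };
}

// Hypothetical pass-through step that registers its data as view_table_1.
const register = { result_table_name: "view_table_1", process: (rows) => rows };

// Hypothetical split step mirroring the first example above.
const split = {
  source_table_name: "view_table_1",
  result_table_name: "view_table_2",
  process: (rows) =>
    rows.map((row) => {
      const [field1, field2] = row.message.split("&");
      return { ...row, field1, field2 };
    }),
};

const { tables } = runPlugins([{ message: "a&b" }], [register, split]);
console.log(tables.get("view_table_2"));
```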
34 changes: 34 additions & 0 deletions versioned_docs/version-1.x/configuration/filter-plugins/Add.md
@@ -0,0 +1,34 @@
## Filter plugin : Add

* Author: InterestingLab
* Homepage: https://interestinglab.github.io/seatunnel-docs
* Version: 1.0.0

### Description

Add a field with a fixed value to Rows.

### Options

| name | type | required | default value |
| --- | --- | --- | --- |
| [target_field](#target_field-string) | string | yes | - |
| [value](#value-string) | string | yes | - |

##### target_field [string]

New field name.

##### value [string]

New field value.

### Examples

```
add {
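# target_field is listed as required above; "new_field" is a hypothetical name
target_field = "new_field"
value = "1"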
}
```

> Add a field, the value is "1"
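To make the effect on a single Row concrete, here is a minimal JavaScript sketch of the same operation. It models a Row as a plain object; `addField` and the field name `new_field` are hypothetical and are not part of the plugin.

```javascript
// Return a copy of the row with one extra field set to a fixed value.
function addField(row, options) {
  const { target_field, value } = options;
  // Both options are marked as required in the table above.
  if (target_field === undefined || value === undefined) {
    throw new Error("target_field and value are both required");
  }
  return { ...row, [target_field]: value };
}

const row = { raw_message: "hello" };
const result = addField(row, { target_field: "new_field", value: "1" });
console.log(result);
```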
@@ -0,0 +1,41 @@
## Filter plugin : Checksum

* Author: InterestingLab
* Homepage: https://interestinglab.github.io/seatunnel-docs
* Version: 1.0.0

### Description

Calculate the checksum (SHA1 by default) of a specific field and add a new field holding the checksum value.

### Options

| name | type | required | default value |
| --- | --- | --- | --- |
| [method](#method-string) | string | no | SHA1 |
| [source_field](#source_field-string) | string | no | raw_message |
| [target_field](#target_field-string) | string | no | checksum |

##### method [string]

Checksum algorithm. Currently supports SHA1, MD5 and CRC32.

##### source_field [string]

Source field

##### target_field [string]

Target field

### Examples

```
checksum {
source_field = "deviceId"
target_field = "device_crc32"
method = "CRC32"
}
```

> Get CRC32 checksum from `deviceId`, and set it to `device_crc32`
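For reference, the CRC32 case can be sketched in plain JavaScript. This is a standalone illustration using the standard reflected CRC-32 polynomial (`0xEDB88320`); it covers only `CRC32`, assumes UTF-8 encoding of the field value, and returns the checksum as an unsigned number, which may differ from how the plugin formats its output. `checksumFilter` and `crc32` are hypothetical names.

```javascript
// Precompute the 256-entry lookup table for the reflected CRC-32 polynomial.
function makeCrcTable() {
  const table = new Uint32Array(256);
  for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
      c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
    }
    table[n] = c >>> 0;
  }
  return table;
}

const CRC_TABLE = makeCrcTable();

// CRC-32 of the UTF-8 bytes of `text`, as an unsigned 32-bit number.
function crc32(text) {
  const bytes = new TextEncoder().encode(text);
  let crc = 0xffffffff;
  for (const b of bytes) {
    crc = CRC_TABLE[(crc ^ b) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Copy the row and store the checksum of source_field in target_field.
// Defaults for the field names follow the options table above.
function checksumFilter(row, options) {
  const {
    source_field = "raw_message",
    target_field = "checksum",
    method = "SHA1",
  } = options;
  if (method !== "CRC32") {
    throw new Error("this sketch only implements CRC32");
  }
  return { ...row, [target_field]: crc32(String(row[source_field])) };
}

const checked = checksumFilter(
  { deviceId: "123456789" },
  { source_field: "deviceId", target_field: "device_crc32", method: "CRC32" }
);
console.log(checked.device_crc32.toString(16));
```

The input `"123456789"` is the conventional check string for CRC-32, whose checksum is `0xCBF43926`.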