
Make actualDataNodes expose SPI that can define expression with custom rules and add GraalVM Truffle implementation #22899

Closed
linghengqian opened this issue Dec 15, 2022 · 8 comments · Fixed by #28610 or #29309

Comments

@linghengqian
Member

linghengqian commented Dec 15, 2022

Feature Request


Is your feature request related to a problem?

Describe the feature you would like.

  • Since most people cannot be expected to understand Groovy syntax up front, issues like "The table generated by time fragments does not conform to the configuration logic" #22884 keep being raised: their initiators misunderstand the final results produced by certain Groovy expressions. Such results can actually be verified at https://groovyconsole.appspot.com/ .
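As a concrete illustration of the kind of misunderstanding referred to here (the exact scenario of #22884 is assumed, not quoted), a Groovy range such as `{20221129..20221201}` expands arithmetically, not calendar-aware, so date-like bounds that cross a month boundary generate many non-existent "dates". The snippet below mimics Groovy's `IntRange` expansion in plain Java; it is illustrative only and is not ShardingSphere code:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class NumericRangePitfall {

    // Expands a date-like numeric range the way a Groovy IntRange does:
    // purely arithmetically, with no calendar awareness.
    public static List<String> expand(String prefix, int lower, int upper) {
        return IntStream.rangeClosed(lower, upper)
                .mapToObj(i -> prefix + i)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Within one month the result matches intuition: 3 tables.
        System.out.println(expand("t_order_", 20221123, 20221125));
        // Crossing a month boundary yields 73 "tables", most of them
        // impossible dates such as t_order_20221131.
        System.out.println(expand("t_order_", 20221129, 20221201).size());
    }
}
```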

  • If we expose an SPI for actualDataNodes, users can define special rules for expressions by implementing it, which should help reduce such misunderstandings.

  • The significance of this issue for "Make ShardingSphere Proxy in GraalVM Native Image form available" #21347 is that using GroovyShell directly implies an additional class loader, which means that any unit test involving GroovyShell is bound to fail when executed directly under GraalVM Native Image. I've opened "[GR-43010] Using Groovy classes under native-image results in UnsupportedFeatureError" oracle/graal#5522 and provided common unit tests from ShardingSphere that use GroovyShell.

  • Once the SPI for actualDataNodes is exposed, we can introduce a GraalVM Truffle implementation in the master branch of ShardingSphere, which should allow us to use JavaScript, Python, R, Ruby, and LLVM languages in actualDataNodes. And because GraalVM Truffle is used, we can envision completing ShardingSphere's nativeTest under GraalVM Native Image.

  • For Groovy, I think it can be used unchanged, but we could also try to transfer Groovy method calls to Espresso, Truffle's Java implementation, which separates the host JVM process from the guest JVM process to ensure that nativeTest passes under GraalVM Native Image. It is worth mentioning, though, that Espresso is more limited than other Truffle language implementations.

  • I assume the YAML would then be configured as follows.

rules:
  - !SHARDING
    tables:
      t_order:
        actualDataNodes: 
          type: ORIGIN_GROOVY
          props:  
           expression: ds-0.t_order_$->{20221123..20221125}
        tableStrategy:
          standard:
            shardingColumn: create_time
            shardingAlgorithmName: lingh-interval
    shardingAlgorithms:
      lingh-interval:
        type: INTERVAL
        props:
          datetime-pattern: "yyyy-MM-dd HH:mm:ss.SSS"
          datetime-lower: "2022-11-23 00:00:00.000"
          datetime-upper: "2022-11-26 00:00:00.000"
          sharding-suffix-pattern: "_yyyyMMdd"
          datetime-interval-amount: 1
          datetime-interval-unit: "DAYS"
  • The interface corresponding to the SPI should be similar to the following.
import org.apache.shardingsphere.infra.util.spi.lifecycle.SPIPostProcessor;
import org.apache.shardingsphere.infra.util.spi.type.typed.TypedSPI;

import java.util.List;
import java.util.Properties;

public interface ShardingSphereExpressionParser extends TypedSPI, SPIPostProcessor {
    
    /**
     * Replace all inline expression placeholders.
     *
     * @return inline expression with placeholders replaced
     */
    String handlePlaceHolder();
    
    /**
     * Split and evaluate inline expression.
     *
     * @return result list
     */
    List<String> splitAndEvaluate();
    
    /**
     * Get properties.
     *
     * @return properties
     */
    Properties getProps();
}
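To make the idea concrete, here is a self-contained sketch of one possible implementation of such an SPI. `TypedSPI` and `SPIPostProcessor` are omitted so the example compiles on its own, and the `ExpressionParser`/`LiteralExpressionParser` names are hypothetical, not ShardingSphere API; the real implementation would be discovered through ShardingSphere's service loader by its `getType()` value:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, trimmed-down stand-in for the proposed SPI.
interface ExpressionParser {
    
    String getType();
    
    List<String> splitAndEvaluate(String expression);
}

// A "LITERAL" parser: no Groovy involved, the expression is just a
// comma-separated list of data nodes.
final class LiteralExpressionParser implements ExpressionParser {
    
    @Override
    public String getType() {
        return "LITERAL";
    }
    
    @Override
    public List<String> splitAndEvaluate(String expression) {
        return Arrays.stream(expression.split(","))
                .map(String::trim)
                .filter(each -> !each.isEmpty())
                .toList();
    }
}
```

A user could then configure `type: LITERAL` to bypass Groovy entirely, which matters under GraalVM Native Image where GroovyShell's extra class loader is a problem.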
@RaigorJiang
Contributor

When sharding by interval, it's really not easy for new users to understand the meaning of Groovy expressions.
Looking forward to further discussion!

@linghengqian
Member Author

  • For a long time no one has given an opinion, so let me expand on this issue a bit. Considering sharding by date, we should not limit the expression, but directly build an algorithmic SPI.
import org.apache.shardingsphere.infra.util.spi.lifecycle.SPIPostProcessor;
import org.apache.shardingsphere.infra.util.spi.type.typed.TypedSPI;

import java.util.List;
import java.util.Properties;

public interface AbstractActualdataNodes extends TypedSPI, SPIPostProcessor {
    
    /**
     * Get actual data nodes.
     *
     * @return result real table list
     */
    List<String> getActualDataNodes();
    
    /**
     * Get properties.
     *
     * @return properties
     */
    Properties getProps();
}
  • We should only consider the java.util.List of the final real tables. For the simplest case of sharding by date, the configuration of a minimal implementation class using JSR 310 should be similar to the following.
rules:
   - !SHARDING
     tables:
       t_order:
         actualDataNodes:
           type: SINGLE_TABLE
           props:
             table-prefix: t_order # prefix of the real tables
             datetime-lower: 2022-10-01 # lower bound of the time range
             datetime-upper: 2022-12-31 # upper bound of the time range
             datetime-pattern: yyyy-MM-dd # java.time.format.DateTimeFormatter pattern used to parse datetime-lower and datetime-upper
             table-suffix-pattern: _yyyyMM # suffix of the real tables, also a java.time.format.DateTimeFormatter pattern
             datetime-interval-amount: 1 # time interval amount
             datetime-interval-unit: MONTHS # a java.time.temporal.ChronoUnit constant
  • It should end up producing a List containing [t_order_202210, t_order_202211, t_order_202212].
  • Let's assume a simple function that does this conversion.
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAccessor;
import java.util.List;
import java.util.stream.LongStream;

public class DateTest {
    
    @Test
    void testDate() {
        DateTimeFormatter dateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd");
        TemporalAccessor start = dateTimeFormatter.parse("2022-10-01");
        TemporalAccessor end = dateTimeFormatter.parse("2022-12-31");
        if (!start.isSupported(ChronoField.NANO_OF_DAY) && start.isSupported(ChronoField.EPOCH_DAY)) {
            LocalDate startTime = start.query(LocalDate::from);
            LocalDate endTime = end.query(LocalDate::from);
            List<String> actualDataNodes = LongStream.range(0, ChronoUnit.MONTHS.between(startTime, endTime.plusMonths(1)))
                    .mapToObj(startTime::plusMonths)
                    .map(localDate -> "t_order" + localDate.format(DateTimeFormatter.ofPattern("_yyyyMM")))
                    .toList();
            // Use a JUnit assertion rather than the `assert` keyword, which is a no-op unless -ea is enabled.
            assertEquals(List.of("t_order_202210", "t_order_202211", "t_order_202212"), actualDataNodes);
        }
    }
}

@linghengqian
Member Author

  • I'm working on this issue. Since I am not familiar with ANTLR, I did not start with the DistSQL syntax, but modified the execution logic in the Java code instead.

@zhfeng
Contributor

zhfeng commented Jan 7, 2023

I think this could also be helpful for quarkus-shardingsphere-jdbc to work in native mode. @linghengqian, what DistSQL syntax do we need to modify? I can take a look, since I have some prior experience with ANTLR.

@linghengqian
Member Author

@zhfeng

CREATE SHARDING TABLE RULE t_order_item (
DATANODES("ds_${0..1}.t_order_item_${0..1}"),
DATABASE_STRATEGY(TYPE="standard",SHARDING_COLUMN=user_id,SHARDING_ALGORITHM(TYPE(NAME="inline",PROPERTIES("algorithm-expression"="ds_${user_id % 2}")))),
TABLE_STRATEGY(TYPE="standard",SHARDING_COLUMN=order_id,SHARDING_ALGORITHM(TYPE(NAME="inline",PROPERTIES("algorithm-expression"="t_order_item_${order_id % 2}")))),
KEY_GENERATE_STRATEGY(COLUMN=another_id,TYPE(NAME="snowflake")),
AUDIT_STRATEGY (TYPE(NAME="DML_SHARDING_CONDITIONS"),ALLOW_HINT_DISABLE=true)
);

CREATE SHARDING TABLE RULE IF NOT EXISTS t_order_item (
DATANODES("ds_${0..1}.t_order_item_${0..1}"),
DATABASE_STRATEGY(TYPE="standard",SHARDING_COLUMN=user_id,SHARDING_ALGORITHM(TYPE(NAME="inline",PROPERTIES("algorithm-expression"="ds_${user_id % 2}")))),
TABLE_STRATEGY(TYPE="standard",SHARDING_COLUMN=order_id,SHARDING_ALGORITHM(TYPE(NAME="inline",PROPERTIES("algorithm-expression"="t_order_item_${order_id % 2}")))),
KEY_GENERATE_STRATEGY(COLUMN=another_id,TYPE(NAME="snowflake")),
AUDIT_STRATEGY (TYPE(NAME="DML_SHARDING_CONDITIONS"),ALLOW_HINT_DISABLE=true)
);

ALTER SHARDING TABLE RULE t_order_item (
DATANODES("ds_${0..3}.t_order_item${0..3}"),
DATABASE_STRATEGY(TYPE="standard",SHARDING_COLUMN=user_id,SHARDING_ALGORITHM(TYPE(NAME="inline",PROPERTIES("algorithm-expression"="ds_${user_id % 4}")))),
TABLE_STRATEGY(TYPE="standard",SHARDING_COLUMN=order_id,SHARDING_ALGORITHM(TYPE(NAME="inline",PROPERTIES("algorithm-expression"="t_order_item_${order_id % 4}")))),
KEY_GENERATE_STRATEGY(COLUMN=another_id,TYPE(NAME="snowflake")),
AUDIT_STRATEGY(TYPE(NAME="dml_sharding_conditions"),ALLOW_HINT_DISABLE=true)
);

SHOW SHARDING TABLE RULES;

SHOW SHARDING TABLE RULES FROM sharding_db;

@zhfeng
Contributor

zhfeng commented Jan 7, 2023

Oh, I'm sorry to hear that. Take care yourself in this tough time and hope you recover soon!

@linghengqian
Member Author

linghengqian commented Aug 30, 2023

  • It may not be reasonable to make such changes in non-major versions. I now tend to introduce specific identifier symbols into the actualDataNodes expression, so that when parsing it, the implementation can look up an SPI implementation by Type. If resolution fails or no such identifier exists, the module should fall back to the default Groovy-based SPI implementation to resolve actualDataNodes. All the work should only affect shardingsphere-infra-expr and its submodules.

  • With GraalVM CE Dev 23.1.0 deprecating GraalVM Updater, I think the ShardingSphere master branch must drop Truffle, because its minimum JDK version has been raised to JDK 21, and Truffle's Espresso is under the GPL license; similar implementations will have to be provided on the user side.
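The identifier-plus-fallback scheme described above could be sketched as follows. The `<TYPE>` prefix syntax, the class name, and the returned pair are assumptions for illustration, not the final ShardingSphere design:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical router: extract an optional "<TYPE>" identifier from an
// actualDataNodes expression; when no identifier is present, fall back to
// the default Groovy-based parser.
public final class TypedExpressionRouter {
    
    private static final Pattern TYPE_PREFIX = Pattern.compile("^<(\\w+)>(.*)$", Pattern.DOTALL);
    
    /**
     * @return a two-element array: [parser type, remaining expression]
     */
    public static String[] route(String expression) {
        Matcher matcher = TYPE_PREFIX.matcher(expression);
        if (matcher.matches()) {
            return new String[]{matcher.group(1), matcher.group(2)};
        }
        // No identifier found: default to the Groovy SPI implementation.
        return new String[]{"GROOVY", expression};
    }
}
```

Under this sketch, `<LITERAL>t_order_0,t_order_1` would route to a literal parser, while an unprefixed `ds_${0..1}.t_order_${0..1}` keeps today's Groovy behavior, so existing configurations stay valid.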

@linghengqian
Member Author

linghengqian commented Sep 3, 2023
