Release Note 3.0.2 #41558

gavinchou · 2024-10-08T09:37:54Z

This version is a production-ready release. We strongly recommend using it instead of earlier 3.0.x versions (x < 2) for compute-storage decoupled mode.

Behavioral Changes

Storage

Limited the number of tablets in a single backup task to prevent FE memory overflow. #40518
The SHOW PARTITIONS command now displays the CommittedVersion of partitions. #28274

Other

The default printing mode (asynchronous) of fe.log now includes file line number information. If performance issues are encountered due to line number output, please switch to BRIEF mode. #39419
The default value of the session variable ENABLE_PREPARED_STMT_AUDIT_LOG has been changed from true to false, and the audit log of prepare statements will no longer be printed. #38865
The default value of the session variable max_allowed_packet has been adjusted from 1MB to 16MB to align with MySQL 8.4. #38697
The JVM of FE and BE defaults to using the UTF-8 character set. #39521

New Features

Storage

Backup and recovery now support clearing tables or partitions that are not in the backup. #39028

Compute-Storage Decoupled

Support for parallel recycling of expired data on multiple tablets. #37630
Support for changing storage vaults through ALTER statements. #38685 #37606
Support for importing a large number of tablets (5000+) in a single transaction (experimental feature). #38243
Support for automatically aborting pending transactions caused by reasons such as node restarts, solving the issue of pending transactions blocking decommission or schema change. #37669
A new session variable enable_segment_cache has been added to control whether to use segment cache during queries (default is true). #37141
Resolved the issue of not being able to import a large amount of data during schema changes in compute-storage decoupled mode. #39558
Support for adding multiple follower roles of FE in compute-storage decoupled mode. #38388
Support for using memory as file cache to accelerate queries in environments with no disks or low-performance HDDs. #38811

Lakehouse

A new Lakesoul Catalog has been added (see the Apache Doris docs).
A new system table catalog_meta_cache_statistics has been added to view the usage of various metadata caches in External Catalog. #40155

Asynchronous Materialized Views

Query Optimizer

Support for is [not] true/false expressions. #38623
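
The is [not] true/false expressions follow SQL three-valued logic, where NULL is neither true nor false. A minimal sketch of those semantics, modeling SQL NULL as Python None (the function names are illustrative and not part of Doris):

```python
from typing import Optional


def is_true(value: Optional[bool]) -> bool:
    """x IS TRUE: holds only for TRUE; NULL (None) gives FALSE, never NULL."""
    return value is True


def is_not_true(value: Optional[bool]) -> bool:
    """x IS NOT TRUE: holds for FALSE and for NULL."""
    return value is not True


def is_false(value: Optional[bool]) -> bool:
    """x IS FALSE: holds only for FALSE."""
    return value is False


def is_not_false(value: Optional[bool]) -> bool:
    """x IS NOT FALSE: holds for TRUE and for NULL."""
    return value is not False


if __name__ == "__main__":
    # Print the full truth table for the three possible inputs.
    for v in (True, False, None):
        print(v, is_true(v), is_not_true(v), is_false(v), is_not_false(v))
```

Unlike a plain comparison such as x = true, these predicates always return a definite boolean, which is why they are useful for filtering nullable columns.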

Query Execution

A new CRC32 function has been added. #38204
New aggregate functions skew and kurt have been added. #41277
Profiles are now persisted to the FE's disk to retain more profiles. #33690
A new system table workload_group_privileges has been added to view permission information related to workload groups. #38436
A new system table workload_group_resource_usage has been added to monitor resource statistics of workload groups. #39177
Workload groups now support limiting reads of local IO and remote IO. #39012
Workload groups now support cgroupv2 to limit CPU usage. #39374
A new system table information_schema.partitions has been added to view some table creation attributes. #40636
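
As a rough illustration of what the new CRC32, skew and kurt functions compute, here is a sketch in Python. It assumes CRC32 is the standard IEEE CRC-32 (as in zlib and MySQL) and that skew and kurt are the population skewness and excess kurtosis; this note does not state which estimator Doris uses, and the NULL result for zero variance is also an assumption.

```python
import zlib
from typing import Optional, Sequence


def crc32_of(text: str) -> int:
    """IEEE CRC-32 of the UTF-8 bytes of text, as an unsigned 32-bit integer."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def _central_moment(values: Sequence[float], order: int) -> float:
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skew(values: Sequence[float]) -> Optional[float]:
    """Population skewness m3 / m2^1.5 (assumed definition)."""
    m2 = _central_moment(values, 2)
    if m2 == 0:
        return None  # assumption: undefined for constant input
    return _central_moment(values, 3) / m2 ** 1.5


def kurt(values: Sequence[float]) -> Optional[float]:
    """Population excess kurtosis m4 / m2^2 - 3 (assumed definition)."""
    m2 = _central_moment(values, 2)
    if m2 == 0:
        return None  # assumption: undefined for constant input
    return _central_moment(values, 4) / m2 ** 2 - 3.0


if __name__ == "__main__":
    print(hex(crc32_of("123456789")))
    print(skew([1, 2, 3, 4, 5]), kurt([1, 2, 3, 4, 5]))
```

Results from Doris may differ in the last digits or in edge cases (empty groups, NULL inputs), so treat this only as a reference for the formulas.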

Semi-Structured Data Management

Other

Support for using the SHOW statement to display BE's configuration information, such as SHOW BACKEND CONFIG LIKE ${pattern}. #36525

Improvements

Load

Improved the import efficiency of routine load when encountering frequent EOFs from Kafka. #39975
The stream load result now includes the time taken to read HTTP data, ReceiveDataTimeMs, which can quickly determine slow stream load issues caused by network reasons. #40735
Optimized the routine load timeout logic to avoid frequent timeouts during inverted index and mow writes. #40818
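
One way to use ReceiveDataTimeMs when triaging a slow stream load is to compare it with the total load time. The sketch below parses a stream load result and reports the share of time spent receiving HTTP data. The sample values are invented for illustration, the 0.5 threshold is arbitrary, and it assumes LoadTimeMs covers the whole request.

```python
import json

# Invented sample; a real stream load result contains more fields.
SAMPLE_RESULT = """
{
  "Status": "Success",
  "LoadTimeMs": 1200,
  "ReceiveDataTimeMs": 950
}
"""


def receive_share(result_json: str) -> float:
    """Fraction of the total load time spent reading HTTP data."""
    result = json.loads(result_json)
    load_ms = result["LoadTimeMs"]
    receive_ms = result["ReceiveDataTimeMs"]
    return receive_ms / load_ms if load_ms else 0.0


def looks_network_bound(result_json: str, threshold: float = 0.5) -> bool:
    """Heuristic: flag the load when receiving data dominates the total time."""
    return receive_share(result_json) >= threshold


if __name__ == "__main__":
    print(f"receive share: {receive_share(SAMPLE_RESULT):.2f}")
    print("network bound:", looks_network_bound(SAMPLE_RESULT))
```

A high share points at the client or the network path rather than at the write or publish stages inside Doris.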

Storage

Support for batch addition of partitions. #37114

Compute-Storage Decoupled

Added the meta-service HTTP interface /MetaService/http/show_meta_ranges to facilitate the statistics of KV distribution in FDB. #39208
The meta-service/recycler stop script ensures that the process fully exits before returning. #40218
Support for using the session variable version_comment (Cloud Mode) to display the current deployment mode as compute-storage decoupled. #38269
Fixed the detailed message returned when transaction submission fails. #40584
Support for using one meta-service process to provide both metadata services and data recycling services. #40223
Optimized the default configuration of file_cache to avoid potential issues when not set. #41421 #41507
Improved query performance by batch retrieving the version of multiple partitions. #38949
Delayed the redistribution of tablets to avoid query performance issues caused by temporary network fluctuations. #40371
Optimized the read-write lock logic in the balance. #40633
Enhanced the robustness of file cache in handling TTL filenames during restarts/crashes. #40226
Added the BE HTTP interface /api/file_cache?op=hash to facilitate the calculation of the hash file names of segment files on disk. #40831
Unified naming so that "compute group" can be used to refer to BE groups (formerly "cloud cluster"). #40767
Optimized the waiting time for obtaining locks when calculating delete bitmaps in primary key tables. #40341
When there are many delete bitmaps in primary key tables, optimized the high CPU consumption during queries by pre-merging multiple delete bitmaps. #40204
Support for managing FE/BE nodes in compute-storage decoupled mode through SQL statements, hiding the logic of direct interaction with meta-service when deploying in compute-storage decoupled mode. #40264
Added a script for rapid deployment of FDB. #39803
Optimized the output of SHOW CACHE HOTSPOT to unify the column name style with other SHOW statements. #41322
When using a storage vault as the storage backend, disallowed the use of latest_fs() to avoid binding different storage backends to the same table. #40516
Optimized the timeout strategy for calculating delete bitmaps when importing mow tables. #40562 #40333
The enable_file_cache in be.conf is now enabled by default in compute-storage decoupled mode. #41502
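
The idea behind pre-merging delete bitmaps can be sketched as follows. This is a conceptual model only, not Doris's implementation: each delete bitmap is modeled as a Python set of row ids keyed by version, and all names are hypothetical. Without pre-merging, every query unions all bitmaps up to its version; with pre-merging, the cumulative union is built once and each query does a single lookup.

```python
from bisect import bisect_right
from typing import Dict, FrozenSet, List, Set, Tuple


def merge_at_query_time(bitmaps: Dict[int, Set[int]], query_version: int) -> Set[int]:
    """Baseline: union every bitmap with version <= query_version on each query."""
    deleted: Set[int] = set()
    for version, rows in bitmaps.items():
        if version <= query_version:
            deleted |= rows
    return deleted


def pre_merge(bitmaps: Dict[int, Set[int]]) -> Tuple[List[int], List[FrozenSet[int]]]:
    """Build the cumulative union once, ordered by version."""
    versions = sorted(bitmaps)
    merged: List[FrozenSet[int]] = []
    running: Set[int] = set()
    for version in versions:
        running |= bitmaps[version]
        merged.append(frozenset(running))
    return versions, merged


def lookup(versions: List[int], merged: List[FrozenSet[int]], query_version: int) -> FrozenSet[int]:
    """Find the pre-merged bitmap for the largest version <= query_version."""
    index = bisect_right(versions, query_version)
    return merged[index - 1] if index else frozenset()


if __name__ == "__main__":
    bitmaps = {2: {1, 5}, 3: {7}, 6: {1, 9}}
    versions, merged = pre_merge(bitmaps)
    print(sorted(lookup(versions, merged, 4)))
```

The trade-off is extra memory for the cumulative bitmaps in exchange for less CPU work per query.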

Lakehouse

When reading tables in CSV format, the session variable keep_carriage_return is now supported to control how the \r character is read. #39980
The default maximum memory of BE's JVM has been adjusted to 2GB (affecting only new deployments). #41403
Hive Catalog has added hive.recursive_directories_table and hive.ignore_absent_partitions properties to specify whether to recursively traverse data directories and whether to ignore missing partitions. #39494
Optimized the Catalog refresh logic to avoid generating a large number of connections during refresh. #39205
SHOW CREATE DATABASE and SHOW CREATE TABLE for external data sources now display location information. #39179
The new optimizer supports inserting data into JDBC external tables using the INSERT INTO statement. #41511
MaxCompute Catalog now supports complex data types. #39259
Optimized the logic for reading and merging data shards of external tables. #38311
Optimized some refresh strategies for metadata caches of external tables. #38506
Paimon tables now support pushing down IN/NOT IN predicates. #38390
Compatible with tables created in Parquet format by Paimon version 0.9. #41020

Asynchronous Materialized Views

Building asynchronous materialized views now supports the use of both immediate and starttime. #39573
Asynchronous materialized views based on external tables will refresh the metadata cache of the external tables before refreshing the materialized views, ensuring construction based on the latest external table data. #38212
Partition incremental construction now supports rolling up according to weekly and quarterly granularities. #39286
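
Rolling daily partitions up to weekly or quarterly granularity amounts to truncating each date to the start of its week or quarter. A minimal sketch of that mapping, assuming weeks start on Monday (this note does not say which day Doris uses) and using illustrative function names:

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, Iterable, List


def week_start(day: date) -> date:
    """Truncate to the Monday of the same week (assumed week start)."""
    return day - timedelta(days=day.weekday())


def quarter_start(day: date) -> date:
    """Truncate to the first day of the calendar quarter."""
    first_month = 3 * ((day.month - 1) // 3) + 1
    return date(day.year, first_month, 1)


def roll_up(days: Iterable[date], granularity: str) -> Dict[date, List[date]]:
    """Group daily partitions under their weekly or quarterly bucket."""
    truncate = {"week": week_start, "quarter": quarter_start}[granularity]
    buckets: Dict[date, List[date]] = defaultdict(list)
    for day in sorted(days):
        buckets[truncate(day)].append(day)
    return dict(buckets)


if __name__ == "__main__":
    days = [date(2024, 9, 30), date(2024, 10, 1), date(2024, 10, 8)]
    print(roll_up(days, "week"))
    print(roll_up(days, "quarter"))
```

When a base-table partition changes, only the materialized view partition whose bucket contains that day needs to be refreshed.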

MySQL Compatibility

Query Optimizer

The aggregate function GROUP_CONCAT now supports the use of both DISTINCT and ORDER BY. #38080
Optimized the collection and use of statistical information, as well as the logic for estimating row counts and cost calculations, to generate more efficient and stable execution plans.
Window function partition data pre-filtering now supports cases containing multiple window functions. #38393
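
Combining DISTINCT and ORDER BY in GROUP_CONCAT means duplicates are removed before the remaining values are sorted and joined. The sketch below models that for a single group, ordering by the concatenated value itself. The ", " separator, the handling of NULL (skipped) and the NULL result for an empty group are assumptions modeled on common SQL behavior, not taken from this note.

```python
from typing import Iterable, Optional


def group_concat_distinct_ordered(
    values: Iterable[Optional[str]],
    separator: str = ", ",
    descending: bool = False,
) -> Optional[str]:
    """Deduplicate, sort, then join the non-NULL values of one group."""
    distinct = {v for v in values if v is not None}  # DISTINCT, NULLs skipped
    if not distinct:
        return None  # assumption: an empty group yields NULL
    return separator.join(sorted(distinct, reverse=descending))  # ORDER BY


if __name__ == "__main__":
    print(group_concat_distinct_ordered(["b", "a", "b", None, "c"]))
    print(group_concat_distinct_ordered(["b", "a", "b"], separator="|", descending=True))
```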

Query Execution

Reduced query latency by running prepare pipeline tasks in parallel. #40874
Display Catalog information in Profile. #38283
Optimized the computational performance of IN filtering conditions. #40917
Supported cgroupv2 in K8S to limit Doris's memory usage. #39256
Optimized the performance of converting strings to datetime types. #38385
When a string is a decimal number, support casting it to an int, which will be more compatible with certain behaviors of MySQL. #38847
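
To make the last item concrete, the sketch below models casting a decimal-formatted string to an integer. It assumes truncation toward zero, as MySQL does for string-to-integer conversion, and a NULL (None) result for strings that are not numbers; this note does not spell out the exact rule Doris applies, so both are assumptions.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def cast_decimal_string_to_int(text: str) -> Optional[int]:
    """Parse a decimal string and drop its fractional part."""
    try:
        number = Decimal(text.strip())
    except InvalidOperation:
        return None  # assumption: non-numeric input casts to NULL
    if not number.is_finite():
        return None  # reject "NaN" and "Infinity"
    return int(number)  # int() on a Decimal truncates toward zero


if __name__ == "__main__":
    for raw in ("3.7", "-3.7", "42", "abc"):
        print(repr(raw), "->", cast_decimal_string_to_int(raw))
```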

Semi-Structured Data Management

Optimized the performance of inverted index matching. #41122
Temporarily prohibited the creation of inverted indexes with tokenization on arrays. #39062
explode_json_array now supports binary JSON types. #37278
IP data types now support bloomfilter indexes. #39253
IP data types now support row storage. #39258
Nested data types such as ARRAY, MAP, and STRUCT now support schema changes. #39210
When creating MTMV, automatically truncate KEYs encountered in VARIANT data types. #39988
Lazy loading of inverted indexes during queries to improve performance. #38979
Added the inverted index file size when opening index files. #37482
Reduced access to object storage interfaces during compaction to improve performance. #41079
Added three new Query Profile Metrics related to inverted indexes. #36696
Reduced cache overhead for non-PreparedStatement SQL to improve performance. #40910
Pre-warming cache now supports inverted indexes. #38986
Inverted indexes are now cached immediately after writing. #39076

Compatibility

Fixed the issue of Thrift ID incompatibility on the master with branch-2.1. #41057

Other

BE HTTP API now supports authentication; set config::enable_all_http_auth to true (default is false) when authentication is required. #39577
Optimized the user permissions required for the REFRESH operation. Permissions have been relaxed from ALTER to SHOW. #39008
Reduced the range of nextId when calling advanceNextId(). #40160
Optimized the caching mechanism for Java UDFs. #40404

Bug Fixes

Load

Fixed the issue where abortTransaction did not handle return codes. #41275
Fixed the issue where afterCommit/afterAbort was not called when committing or aborting a transaction failed in compute-storage decoupled mode. #41267
Fixed the issue where Routine Load could not work properly when modifying consumer offsets in compute-storage decoupled mode. #39159
Fixed the issue of repeatedly closing file handles when obtaining error log file paths. #41320
Fixed the issue of incorrect job progress caching for Routine Load in compute-storage decoupled mode. #39313
Fixed the issue where Routine Load could get stuck when failing to commit transactions in compute-storage decoupled mode. #40539
Fixed the issue where Routine Load kept reporting data quality check errors in compute-storage decoupled mode. #39790
Fixed the issue where Routine Load did not check transactions before committing in compute-storage decoupled mode. #39775
Fixed the issue where Routine Load did not check transactions before aborting in compute-storage decoupled mode. #40463
Fixed the issue where cluster keys did not support certain data types. #38966
Fixed the issue of transactions being repeatedly committed. #39786
Fixed the issue of use after free with WAL when BE exits. #33131
Fixed the issue where WAL playback did not skip completed import transactions in compute-storage decoupled mode. #41262
Fixed the logic for selecting BE in group commit in compute-storage decoupled mode. #39986 #38644
Fixed the issue where BE might crash when group commit was enabled for insert into. #39339
Fixed the issue where insert into with group commit enabled might get stuck. #39391
Fixed the issue where not enabling the group commit option during import might result in a table not found error. #39731
Fixed the issue of transaction submission timeouts due to too many tablets. #40031
Fixed the issue of concurrent opens with Auto Partition. #38605
Fixed the issue of import lock granularity being too large. #40134
Fixed the issue of coredumps caused by zero-length varchars. #40940
Fixed the issue of incorrect index Id values in log prints. #38790
Fixed the issue of memtable shifting not closing brpc streaming. #40105
Fixed the issue of inaccurate bvar statistics during memtable shifting. #39075
Fixed the issue of multi-replication fault tolerance during memtable shifting. #38003
Fixed the issue of incorrect message length calculations for Routine Load with multiple tables in one stream. #40367
Fixed the issue of inaccurate progress reporting for Broker Load. #40325
Fixed the issue of inaccurate data scan volume reporting for Broker Load. #40694
Fixed the issue of concurrency with Routine Load in compute-storage decoupled mode. #39242
Fixed the issue of Routine Load jobs being canceled in compute-storage decoupled mode. #39514
Fixed the issue of progress not being reset when deleting Kafka topics. #38474
Fixed the issue of updating progress during transaction state transitions in Routine Load. #39311
Fixed the issue of Routine Load switching from a paused state to a paused state. #40728
Fixed the issue of Stream Load records being missed due to database deletion. #39360

Storage

Fixed the issue of missing storage policies. #38700
Fixed the issue of errors during cross-version backup and recovery. #38370
Fixed the NPE issue with ccr binlog. #39909
Fixed potential issues with duplicate keys in mow. #41309 #39791 #39958 #38369 #38331
Fixed the issue of not being able to write after backup and recovery in high-frequency write scenarios. #40118 #38321
Fixed the issue of data errors potentially triggered by deleting empty strings and schema changes. #41064
Fixed the issue of incorrect statistics due to column updates. #40880
Limited the size of tablet meta pb to prevent BE crashes due to oversized meta. #39455
Fixed the potential column misalignment issue with the new optimizer in begin; insert into values; commit. #39295

Compute-Storage Decoupled

Fixed the issue where the tablet distribution might be inconsistent across multiple FEs in compute-storage decoupled mode. #41458
Fixed the issue where TVF might not work in multi-computing group environments. #39249
Fixed the issue where compaction used resources that had already been released when BE exited in compute-storage decoupled mode. #39302
Fixed the issue where automatic start-stop might cause FE replay to get stuck. #40027
Fixed the issue where the BE status and the stored status in meta-service were inconsistent. #40799
Fixed the issue where the FE->meta-service connection pool could not automatically expire and reconnect. #41202 #40661
Fixed the issue where some tablets might repeatedly undergo unexpected balance processes during rebalance. #39792
Fixed the issue where storage vault permissions were lost after FE restarted. #40260
Fixed the issue where tablet row counts and other statistical information might be incomplete due to FDB scan range pagination. #40494
Fixed the performance issue caused by a large number of aborted transactions associated with the same label. #40606
Fixed the issue where commit_txn did not automatically re-enter, maintaining consistent behavior between compute-storage decoupled and integrated modes. #39615
Fixed the issue where the number of projected columns increased when dropping columns. #40187
Fixed the issue where delete statements did not correctly handle return values, causing data to still be visible after deletion. #39428
Fixed the coredump issue caused by rowset metadata competition during file cache preheating. #39361
Fixed the issue where the entire cache space would be used up when TTL cache enabled LRU eviction. #39814
Fixed the issue where temporary files could not be recycled when importing commit rowset failed with HDFS storage backend. #40215

Lakehouse

Fixed some issues with predicate pushdown in JDBC Catalog. #39064
Fixed the issue of not being able to read when Struct type columns are missing in Parquet format. #38718
Fixed the issue of FileSystem leaks on the FE side in some cases. #38610
Fixed the issue of metadata cache information being inconsistent when Hive/Iceberg tables write back in some cases. #40729
Fixed the issue of unstable partition ID generation for external tables in some cases. #39325
Fixed the issue of external table queries selecting BE nodes in the blacklist in some cases. #39451
Optimized the timeout for batch retrieval of external table partition information to avoid occupying threads for a long time. #39346
Fixed the issue of memory leaks when querying Hudi tables in some cases. #41256
Fixed the issue of connection pool connection leaks in JDBC Catalog in some cases. #39582
Fixed the issue of BE memory leaks in JDBC Catalog in some cases. #41041
Fixed the issue of not being able to query Hudi data on Alibaba Cloud OSS. #41316
Fixed the issue of not being able to read empty partitions in MaxCompute. #40046
Fixed the issue of poor performance when querying Oracle through JDBC Catalog. #41513
Fixed the issue of BE crashes when querying Deletion Vector of Paimon tables after enabling file cache features. #39877
Fixed the issue of not being able to access Paimon tables on HDFS clusters with HA enabled. #39806
Temporarily disabled the Page Index filtering feature of Parquet to avoid potential issues. #38691
Fixed the issue of not being able to read unsigned types in Parquet files. #39926
Fixed the issue of potential infinite loops when reading Parquet files in some cases. #39523

MySQL Compatibility

Asynchronous Materialized Views

Fixed the issue where partition construction might select the wrong table to track partitions if both sides have the same column names. #40810
Fixed the issue where transparent rewrite partition compensation might result in incorrect results. #40803
Fixed the issue where transparent rewrite did not take effect on external tables. #38909
Fixed the issue where nested materialized views might not refresh properly. #40433

Synchronous Materialized Views

Fixed the issue where creating synchronous materialized views on MOW tables might result in incorrect query results. #39171

Query Optimizer

Fixed the issue where existing synchronous materialized views might not be usable after upgrading. #41283
Fixed the issue of not correctly handling milliseconds when comparing datetime literals. #40121
Fixed the issue of potential errors in conditional function partition pruning. #39298
Fixed the issue where MOW tables with synchronous materialized views could not perform delete operations. #39578
Fixed the issue where the nullable of slots in JDBC external table query predicates might be incorrectly planned, causing query errors. #41014

Query Execution

Fixed the memory leak issue caused by the use of runtime filters. #39155
Fixed the issue of excessive memory usage by window functions. #39581
Fixed a series of function compatibility issues during rolling upgrades. #41023 #40438 #39648
Fixed the issue of incorrect results with encryption_function when used with constants. #40201
Fixed the issue of errors when importing single-table materialized views. #39061
Fixed the issue of incorrect partition result calculations for window functions. #39100 #40761
Fixed the issue of incorrect calculations for topn when null values are present. #39497
Fixed the issue of incorrect results with the map_agg function. #39743
Fixed the issue of incorrect messages returned by cancel. #38982
Fixed the issue of BE core dumps caused by encrypt and decrypt functions. #40726
Fixed the issue of queries getting stuck due to too many scanners in high-concurrency scenarios. #40495
Supported time types in runtime filters. #38258
Fixed the issue of incorrect results with window funnel functions. #40960

Semi-Structured Data Management

Fixed the issue of match function errors when no indexes were present. #38989
Fixed the issue of crashes when ARRAY data types were used as parameters for array_min/array_max functions. #39492
Fixed the issue of nullable with the array_enumerate_uniq function. #38384
Fixed the issue of bloomfilter indexes not being updated when adding or deleting columns. #38431
Fixed the issue of es-catalog parsing exceptions with array data. #39104
Fixed the issue of improper predicate push-down in es-catalog. #40111
Fixed the issue of exceptions caused by modifying input data with map() and struct() functions. #39699
Fixed the issue of index compaction crashes in special cases. #40294
Fixed the issue of ARRAY type inverted indexes missing nullbitmaps. #38907
Fixed the issue of incorrect results with the count() function on inverted indexes. #41152
Fixed the issue of incorrect results with the explode_map function when using aliases. #39757
Fixed the issue of VARIANT type not being able to use row storage for exceptional JSON data. #39394
Fixed the issue of memory leaks when returning ARRAY results with VARIANT type. #41358
Fixed the issue of changing column names with VARIANT type. #40320
Fixed the issue of potential precision loss when converting VARIANT type to DECIMAL type. #39650
Fixed the issue of nullable handling with VARIANT type. #39732
Fixed the issue of sparse column reading with VARIANT type. #40295

Other

Fixed the compatibility issue between new and old audit log plugins. #41401
Fixed the issue where users could see processes of others in certain cases. #39747
Fixed the issue where users with permissions could not export. #38365
Fixed the issue where create table like required create permissions for the existing table. #37879
Fixed the issue where some features did not verify permissions. #39726
Fixed the issue of not correctly closing connections when using SSL. #38587
Fixed the issue where executing ALTER VIEW operations in some cases caused FE to fail to start. #40872

gavinchou · 2024-10-13T07:27:57Z

Thanks to all who contributed to this release:

924060929 BePPPower BiteTheDDDDt ByteYue CalvinKirs Ceng23333 ChenPeng2013 DarvenDuan Gabriel39 HappenLee Jibing-Li Johnnyssc Lchangliang LiBinfeng-01 Mryange SWJTU-ZhangLei TangSiyang2001 Toms1999 Vallishp WinkerDu Yukang-Lian Yulei-Yang airborne12 amorynan biohazard4321 bobhan1 caiconghui cambyzju catpineapple cjj2010 csun5285 dataroaring deardeng eldenmoon elon-X englefly feiniaofeiafei felixwluo freemandealer gavinchou glzhao89 hello-stephen htyoung hubgeter hust-hhb jacktengg justfortaste kaijchen kaka11chen liaoxin01 liutang123 lsy3993 luwei16 morningman morrySnow mrhhsg mymeiyi nextdreamblue platoneko qidaye qzsee seawinde smallx sollhui starocean999 superdiaodiao suxiaogang223 w41ter wangbo wangshuo128 wsjz wuwenchi wyxxxcat xiaokang xinyiZzz xzj7019 yagagagaga yiguolei yujun777 zclllyybb zddr zfr9527 zhangstar333 zhannngchen zhiqiang-hhhh zy-kkk zzzxl1993

gavinchou added the release notes label Oct 13, 2024
