- PartitionSpec: Added a new constructor `(String, boolean)` that uses a boolean parameter to specify whether to trim partition values. This caters to scenarios (such as using the char type as a partition field) where users may not want partition values trimmed.
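
The effect of the trim flag can be illustrated with a small stdlib-only sketch. `PartitionValueDemo` and `parseValue` are hypothetical names invented for this example and are not part of the SDK; the sketch only mimics what trimming does to a single `key=value` pair.

```java
// Hypothetical illustration; this is not the SDK's PartitionSpec implementation.
class PartitionValueDemo {
    /** Extracts the value from a "key=value" pair, optionally trimming whitespace. */
    static String parseValue(String pair, boolean trim) {
        int eq = pair.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("expected key=value: " + pair);
        }
        String value = pair.substring(eq + 1);
        // Strip one pair of surrounding single quotes, if present.
        if (value.length() >= 2 && value.startsWith("'") && value.endsWith("'")) {
            value = value.substring(1, value.length() - 1);
        }
        return trim ? value.trim() : value;
    }

    public static void main(String[] args) {
        // A partition value whose trailing spaces are significant.
        System.out.println("[" + parseValue("pt='ab  '", true) + "]");
        System.out.println("[" + parseValue("pt='ab  '", false) + "]");
    }
}
```

With trimming enabled the two trailing spaces are lost, so a lookup could target a different partition than the one that actually exists.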
- Instance: The `OdpsException` thrown when calling the `stop` method is no longer wrapped a second time.
- SQLExecutor
  - Fixed an issue in MCQA 1.0 mode where the user-specified `fallbackPolicy.isFallback4AttachError` did not take effect correctly.
  - Fixed an issue in MCQA 2.0 mode where the `cancel` method threw an exception when the job failed.
  - Fixed an issue in MCQA 2.0 mode where using instanceTunnel to fetch results resulted in an error when the isSelect check was incorrect.
- Table: Fixed an issue where the `getPartitionSpecs` method trimmed partition values, causing the retrieval of non-existing partitions.
- SQLExecutor: In MCQA 1.0 mode, custom fallback policies can now be added; the subclass `FallbackPolicy.UserDefinedFallbackPolicy` was added for this purpose.
- SQLExecutor: Enhanced MCQA 2.0 functionality:
  - `isActive` will return false, indicating that there are no active Sessions in MCQA 2.0 mode.
  - Added a `cancel` method to terminate ongoing jobs.
  - `getExecutionLog` now returns a deep copy of the current log and clears the current log, preventing duplicates.
  - New `quota` method in `SQLExecutorBuilder` allows reusing an already loaded `Quota`, reducing load times.
  - New `regionId` method in `SQLExecutorBuilder` allows specifying the region where the quota is located.
- Quotas: Added a `getWlmQuota` method with a `regionId` parameter to fetch the quota for a specified regionId.
- Quota: Introduced the `setMcqaConnHeader` method to allow users to override the quota using a custom McqaConnHeader, supporting MCQA 2.0.
- Instances: Added a `get` method applicable to MCQA 2.0 jobs, requiring additional parameters for QuotaName and RegionId.
- Instance: Further adapted for MCQA 2.0 jobs.
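
The "return a copy, then clear" behavior described for `getExecutionLog` can be sketched as follows. `ExecutionLogBuffer` and its methods are hypothetical names for illustration only; the SDK's internal log structure is not described here, and this sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a "copy then clear" log buffer.
class ExecutionLogBuffer {
    private final List<String> log = new ArrayList<>();

    void append(String line) {
        log.add(line);
    }

    /** Returns a copy of the buffered lines and clears the buffer. */
    List<String> drain() {
        // Strings are immutable, so copying the list is enough for an independent copy.
        List<String> copy = new ArrayList<>(log);
        log.clear();
        return copy;
    }
}
```

Because the buffer is emptied on every call, two consecutive calls never return the same line twice, and later appends do not change a list that was already handed out.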
- TableSchema: The `basicallyEquals` method will no longer strictly check for identical Class types.
- SQLExecutor: The `run` method's hints will now be deep-copied, preserving the user-provided Map and supporting immutable types (e.g., `ImmutableMap`).
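
Why copying the hints matters can be shown with a stdlib-only sketch. `HintsCopyDemo`, `withInternalHints`, and the hint keys are made up for this example; the sketch assumes the callee needs to add entries of its own, which would fail on an immutable map and would silently alter a mutable one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; method name and hint keys are invented.
class HintsCopyDemo {
    static Map<String, String> withInternalHints(Map<String, String> userHints) {
        // Copy first, so the caller's map is never modified and may be immutable.
        Map<String, String> copy = new HashMap<>();
        if (userHints != null) {
            copy.putAll(userHints);
        }
        copy.put("example.internal.hint", "true");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> user = Collections.unmodifiableMap(
            Collections.singletonMap("example.user.hint", "1"));
        Map<String, String> effective = withInternalHints(user);
        System.out.println("user=" + user.size() + ", effective=" + effective.size());
    }
}
```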
- Stream: Fixed potential SQL syntax errors in the `create` method.
- TableAPI: Fixed an issue where `ArrayRecord` could not correctly invoke `toString` when using `SplitRecordReaderImpl` to retrieve results.
- TableAPI: Fixed an issue where a `get` operation would throw an array index out of bounds exception when the number of `Records` corresponding to a `Split` is 0 while using `SplitRecordReaderImpl` to retrieve results.
- TableAPI: Fixed an issue with composite predicates (`CompositePredicate`) that could lead to an additional operator being added when encountering an empty predicate.
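
The composite predicate issue above is the kind of bug that appears when child predicates are joined without skipping empty ones. The following sketch shows one way to avoid a dangling operator; `PredicateJoinDemo` and `join` are hypothetical and do not reflect how `CompositePredicate` is actually implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; not the SDK's CompositePredicate.
class PredicateJoinDemo {
    /** Joins non-empty predicate strings with the given operator. */
    static String join(String operator, List<String> predicates) {
        return predicates.stream()
            .filter(p -> p != null && !p.trim().isEmpty()) // skip empty predicates
            .collect(Collectors.joining(" " + operator + " "));
    }

    public static void main(String[] args) {
        System.out.println(join("and", Arrays.asList("a > 1", "", "b < 2")));
    }
}
```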
- Added `SchemaMismatchException`: This exception will be thrown when using `StreamUploadSession` if the Record structure uploaded by the user does not match the table structure. The exception also carries the latest schema version to help users rebuild the Session and retry.
- Added the `allowSchemaMismatch` method in `StreamUploadSession.Builder`: This method specifies whether to tolerate mismatches between the user's uploaded Record structure and the table structure without throwing an exception. The default value is `true`.
- Fixed an issue where specifying `tunnelEndpoint` in Odps was ineffective when using `StreamUploadSession`.
- Fixed a potential NPE issue in `TunnelRetryHandler`.
- SQLExecutor added the `isUseInstanceTunnel` method:
  - Used to determine whether to use instanceTunnel to obtain results
- Fixed an issue where, when using SQLExecutor to execute MCQA 2.0 jobs, executing a CommandApi task would affect the next job, causing an NPE to be thrown when retrieving results.
- SQLExecutor supports submitting MCQA 2.0 jobs
  - SQLExecutorBuilder adds method `enableMcqaV2`
  - SQLExecutorBuilder adds getter methods for fields
- SQLExecutor adds `getQueryId` method:
  - For offline jobs and MCQA 2.0 jobs, it returns the currently executing job's InstanceId
  - For MCQA 1.0 jobs, it returns the InstanceId and SubQueryId
- TableAPI adds `SharingQuotaToken` parameter in `EnvironmentSettings` to support sharing quota resources during job submission
- Quotas introduces `getWlmQuota` method:
  - Allows retrieval of detailed quota information based on projectName and quotaNickName, including whether it belongs to interactive quotas
- Quota class adds `isInteractiveQuota` method to determine if a quota belongs to interactive quotas (suitable for MCQA 2.0)
- Adds `getResultByInstanceTunnel(Instance instance, String taskName, Long limit, boolean limitEnabled)` method:
  - Allows unlimited retrieval of results via instanceTunnel (lifting restrictions requires higher permissions)
- UpsertSession.Builder adds `setLifecycle` method to configure the session lifecycle
- Fixed the issue where specifying `limitEnabled` had no effect when using SQLExecutor to execute offline jobs
- Modified the SQLExecutor so that the `getQueryId` method returns the job's instanceID instead of null when executing offline jobs
- Fixed the issue where using instanceTunnel to retrieve results threw an exception on non-select statements; it now falls back to non-tunnel logic instead
- Fixed the problem of one data entry going missing when using DownloadSession to download data and an error occurred while the read count equaled the number of records to be read minus one
- The `clone` method of the Odps class now correctly clones other fields, including `tunnelEndpoint`
- The Instance's `getRawTaskResults` method no longer makes multiple requests when processing synchronous jobs
- OdpsRecordConverter Enhancement: Now supports converting data to SQL-compatible formats. For example, for the `LocalDate` type, data can be converted to `"DATE 'yyyy-mm-dd'"` format. Additionally, for the `Binary` type, hex representation format is now supported.
- Enhanced Predicate Pushdown for Storage Constants: Improved the behavior of the `Constant` class and added the `Constant.of(Object, TypeInfo)` method. Now, when setting or identifying types as time types, the conversion to SQL-compatible format can be done correctly (enabling correct pushdown of time types). Other type conversion issues have been fixed; an `IllegalArgumentException` will be thrown during session creation when conversion to SQL-compatible mode is not possible.
- UpsertSession Implements Closeable Interface: Notifies users to properly release local resources of the UpsertSession.
- SQLExecutorBuilder New Method `offlineJobPriority`: Allows setting the priority of offline jobs when a job falls back.
- New Method in Table Class `getLastMajorCompactTime`: Used to retrieve the last time the table underwent major compaction.
- New Method in Instance Class `create(Job job, boolean tryWait)`: When the `tryWait` parameter is true, the job will attempt to wait on the server for a period of time to obtain results more quickly.
- Resource Class Enhancement: Now able to determine if the corresponding resource is a temporary resource.
- CreateProjectParma Class Enhancement: Added `defaultCtrlService` parameter to specify the default control cluster of the project.
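
The SQL-compatible formats mentioned in the OdpsRecordConverter item can be illustrated with plain JDK code. This is only a sketch of the target formats: `SqlLiteralDemo` and its methods are hypothetical, and the exact literal syntax the SDK emits for binary values is not stated above, so the sketch produces just the hex digits.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration of the output formats; not OdpsRecordConverter itself.
class SqlLiteralDemo {
    /** Renders a date as DATE 'yyyy-mm-dd'. */
    static String dateLiteral(LocalDate date) {
        return "DATE '" + date.format(DateTimeFormatter.ISO_LOCAL_DATE) + "'";
    }

    /** Renders bytes as lowercase hex digits, two per byte. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dateLiteral(LocalDate.of(2024, 1, 31)));
        System.out.println(hex(new byte[] {0x0a, (byte) 0xff}));
    }
}
```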
- UpsertStream NPE Fix: Fixed an issue where an NPE was thrown during flush when a local error occurred, preventing a proper retry.
- Varchar/Char Type Fix: Fixed an issue where special characters such as Chinese symbols or emoticons were incorrectly counted twice when computing the length of a `Varchar`/`Char` value.
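
One common way a character ends up counted twice in Java is counting UTF-16 code units instead of code points. Whether this was the root cause of the Varchar/Char issue is an assumption; the sketch below (with the hypothetical class `CharLengthDemo`) only demonstrates the difference, which affects supplementary characters such as many emoji but not characters in the Basic Multilingual Plane.

```java
// Hypothetical illustration of code units versus code points.
class CharLengthDemo {
    /** Number of UTF-16 code units; a surrogate pair counts as 2. */
    static int codeUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points; a surrogate pair counts as 1. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String text = "a\uD83D\uDE00"; // 'a' followed by U+1F600 (an emoji)
        System.out.println(codeUnits(text) + " code units, " + codePoints(text) + " code points");
    }
}
```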
- Introduced internal validation of compound predicate expressions, fixed logic when handling invalid or always true/false predicates, enhanced test coverage, and ensured stability and accuracy in complex query optimization.
- TableTunnel Configuration Optimization: Introduced the `tags` attribute to `TableTunnel Configuration`, enabling users to attach custom tags to tunnel operations for enhanced logging and management. These tags are recorded in the tenant-level `information schema`.

      Odps odps; // an already initialized Odps client
      Configuration configuration =
          Configuration.builder(odps)
              .withTags(Arrays.asList("tag1", "tag2")) // tags to attach to tunnel operations
              .build();
      TableTunnel tableTunnel = odps.tableTunnel(configuration);
      // Proceed with tunnel operations

- Instance Enhancement: Added the `waitForTerminatedAndGetResult` method to the `Instance` class, integrating optimization strategies from versions 0.48.6 and 0.48.7 for the `SQLExecutor` interface, enhancing operational efficiency. Refer to `com.aliyun.odps.sqa.SQLExecutorImpl.getOfflineResultSet` for usage.
- SQLExecutor Offline Job Processing Optimization: Significantly reduced end-to-end latency by enabling immediate result retrieval after critical processing stages of offline jobs executed by `SQLExecutor`, without waiting for the job to fully complete, thus boosting response speed and resource utilization.
- TunnelRetryHandler NPE Fix: Fixed a potential null pointer exception in the `getRetryPolicy` method when the error code was `null`.
- Serializable Support:
  - Key data types like `ArrayRecord`, `Column`, `TableSchema`, and `TypeInfo` now support serialization and deserialization, enabling caching and inter-process communication.
- Predicate Pushdown:
  - Introduced `Attribute` type predicates to specify column names.
- Tunnel Interface Refactoring:
  - Refactored Tunnel-related interfaces to include seamless retry logic, greatly enhancing stability and robustness.
  - Removed `TunnelRetryStrategy` and `ConfigurationImpl` classes, which are now replaced by `TunnelRetryHandler` and `Configuration` respectively.
- SQLExecutor Optimization:
  - Improved performance when executing offline SQL jobs through the `SQLExecutor` interface, reducing one network request per job to fetch results, thereby decreasing end-to-end latency.
- Decimal Read in Table.read:
  - Fixed issue where trailing zeroes in the `decimal` type were not as expected in the `Table.read` interface.
- Added the `getPartitionSpecs` method to the `Table` interface. Compared to the `getPartitions` method, this method does not require fetching detailed partition information, resulting in faster execution.
- Removed the `isPrimaryKey` method from the `Column` class. This method was initially added to support users in specifying certain columns as primary keys when creating a table. However, it was found to be misleading in read scenarios, as it does not communicate with the server. Therefore, it is not suitable for determining whether a column is a primary key. Moreover, when using this method for table creation, primary keys should be table-level fields (since primary keys are ordered), and this method neglected the order of primary keys, leading to a flawed design. Hence, it has been removed in version 0.48.5. For read scenarios, users should use the `Table.getPrimaryKey()` method to retrieve primary keys. For table creation, users can now use the `withPrimaryKeys` method in the `TableCreator` to specify primary keys during table creation.
- Fixed an issue in the `RecordConverter` where formatting a `Record` of type `String` would throw an exception when the data type was `byte[]`.
- Writing MaxCompute tables through `table-api` now supports the `JSON` and `TIMESTAMP_NTZ` types
- `odps-sdk-udf` functions continue to be improved
- When the Table.read() interface encounters the Decimal type, it now removes trailing zeros by default (but does not use scientific notation)
- Fixed the problem that ArrayRecord does not support the getBytes method for JSON type
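
The Decimal behavior described above (no trailing zeros, no scientific notation) corresponds to a well-known `BigDecimal` idiom. The sketch below uses only the JDK; `DecimalFormatDemo` and `render` are hypothetical names, and it is an assumption that `Table.read()` does something equivalent internally.

```java
import java.math.BigDecimal;

// Hypothetical illustration of stripping trailing zeros without scientific notation.
class DecimalFormatDemo {
    static String render(BigDecimal value) {
        // stripTrailingZeros() alone may yield a negative scale (e.g. 1E+2);
        // toPlainString() prints the value without an exponent.
        return value.stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(render(new BigDecimal("1.500")));
        System.out.println(render(new BigDecimal("100")));
        // For comparison: toString() after stripping can use an exponent.
        System.out.println(new BigDecimal("100").stripTrailingZeros().toString());
    }
}
```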
- Support for passing `retryStrategy` when building `UpsertSession`.
- The `onFlushFail(String, int)` interface in `UpsertStream.Listener` has been marked as `@Deprecated` in favor of the `onFlushFail(Throwable, int)` interface. The deprecated interface will be removed in version 0.50.0.
- Default compression algorithm for Tunnel upsert has been changed to `ODPS_LZ4_FRAME`.
- Fixed an issue where data couldn't be written correctly in Tunnel upsert when the compression algorithm was set to something other than `ZLIB`.
- Fixed a resource leak in `UpsertSession` that could persist for a long time if `close` was not explicitly called by the user.
- Fixed an exception thrown by Tunnel data retrieval interfaces (`preview`, `download`) when encountering invalid `Decimal` types (such as `inf`, `nan`) in tables; will now return `null` to align with the `getResult` interface.
- Fixed the issue of relying on the user's local time zone when bucketing primary keys of DATE and DATETIME types during Tunnel upsert. This may lead to incorrect bucketing and abnormal data query. Users who rely on this feature are strongly recommended to upgrade to version 0.48.2.
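
Why a dependency on the local time zone can change a bucket assignment is easy to reproduce with `java.time`. The sketch below is not the SDK's bucketing function (which is not described here); `DateBucketDemo`, `epochDayIn`, and `bucket` are hypothetical and only show that the calendar day derived from an instant depends on the zone, while a `LocalDate` maps to the same epoch day everywhere.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical illustration; the modulo-based bucket function is an assumption.
class DateBucketDemo {
    /** Zone-dependent: the calendar day of an instant differs between zones. */
    static long epochDayIn(Instant instant, ZoneId zone) {
        return instant.atZone(zone).toLocalDate().toEpochDay();
    }

    /** Zone-independent: a LocalDate has the same epoch day on every machine. */
    static int bucket(LocalDate date, int bucketCount) {
        return (int) Math.floorMod(date.toEpochDay(), (long) bucketCount);
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2024-01-01T20:00:00Z");
        System.out.println(epochDayIn(instant, ZoneOffset.UTC));
        System.out.println(epochDayIn(instant, ZoneOffset.ofHours(8)));
    }
}
```

If two clients in different zones derive the day from the same instant, they can disagree by one day and therefore route the same key to different buckets.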
- `Table` adds a method `getTableLifecycleConfig()` to obtain the lifecycle configuration of hierarchical storage.
- `TableReadSession` now supports predicate pushdown
Arrow and ANTLR Libraries: Added new includes to the Maven Shade Plugin configuration for better handling and packaging of specific libraries. These includes ensure that certain essential libraries are correctly packaged into the final shaded artifact. The newly included libraries are:
- org.apache.arrow:arrow-format:jar
- org.apache.arrow:arrow-memory-core:jar
- org.apache.arrow:arrow-memory-netty:jar
- org.antlr:ST4:jar
- org.antlr:antlr-runtime:jar
- org.antlr:antlr4:jar
- org.antlr:antlr4-runtime:jar
Shaded Relocation for ANTLR and StringTemplate: The configuration now includes updated relocation rules for org.antlr and org.stringtemplate.v4 packages to prevent potential conflicts with other versions of these libraries that may exist in the classpath. The new shaded patterns are:
- org.stringtemplate.v4 relocated to com.aliyun.odps.thirdparty.org.stringtemplate.v4
- org.antlr relocated to com.aliyun.odps.thirdparty.antlr
- Introduced `odps-sdk-udf` module to allow batch data reading in UDFs for MaxCompute, significantly improving performance in high-volume data scenarios.
- `Table` now supports retrieving `ColumnMaskInfo`, aiding in data desensitization scenarios and relevant information acquisition.
- Support for setting proxies through the use of `odps.getRestClient().setProxy(Proxy)` method.
- Implementation of iterable `RecordReader` and `RecordReader.stream()` method, enabling conversion to a Stream of `Record` objects.
- Added new parameters `upsertConcurrentNum` and `upsertNetworkNum` in `TableAPI RestOptions` for more detailed control for users performing upsert operations via the TableAPI.
- Support for `Builder` pattern in constructing `TableSchema`.
- Support for `toString` method in `ArrayRecord`.
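
The iterable `RecordReader` and `stream()` item follows a familiar pattern: wrap a pull-style reader in an `Iterator` and build a `Stream` from it. The sketch below shows that pattern with JDK types only. `SimpleReader` is a hypothetical stand-in, and the convention that `read()` returns `null` at the end of the data is an assumption for this example.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical illustration; SimpleReader is not the SDK's RecordReader.
class IterableReaderDemo {
    interface SimpleReader<T> {
        T read(); // assumed to return null when no more elements exist
    }

    /** Adapts a reader to Iterable. The result is single-pass, like the reader. */
    static <T> Iterable<T> iterate(SimpleReader<T> reader) {
        return () -> new Iterator<T>() {
            private T next = reader.read(); // prefetch one element

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public T next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                T current = next;
                next = reader.read();
                return current;
            }
        };
    }

    /** Exposes the reader as a sequential Stream. */
    static <T> Stream<T> stream(SimpleReader<T> reader) {
        return StreamSupport.stream(iterate(reader).spliterator(), false);
    }
}
```

With such an adapter a caller can write a for-each loop over the reader or chain stream operations such as `filter` and `map` instead of looping on `read()` manually.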
- `UploadSession` now supports configuration of the `GET_BLOCK_ID` parameter to speed up session creation when the client does not need `blockId`.
- Enhanced table creation method using the `builder` pattern (`TableCreator`), making table creation simpler.
- Fixed a bug in `Upsert Session` where the timeout setting was configured incorrectly.
- Fixed the issue where `TimestampWritable` computed one second less when nanoseconds were negative.
- Support for new Stream type that enables incremental queries.
- Added `preview` method to the `TableTunnel` for data preview purposes.
- Added `OdpsRecordConverter` for parsing and formatting records.
- Enhancements to the `Projects` class with `create` and `delete` methods now available, and `update` method made public. Operations related to the `group-api` package are now marked as deprecated.
- Improved `Schemas` class to support filtering schemas with `SchemaFilter`, listing schemas, and retrieving detailed schema metadata.
- `DownloadSession` introduces new parameter `disableModifiedCheck` to bypass modification checks and `fetchBlockId` to skip block ID list retrieval.
- `TableWriteSession` supports writing `TIMESTAMP_NTZ` / `JSON` types and adds a new parameter `MaxFieldSize`.
- `TABLE_API` adds `predicate` related classes to support predicate pushdown in the future.
- The implementation of the `read` method in the `Table` class is now replaced with `TableTunnel.preview`, supporting new types in MaxCompute and time types switched to Java 8 time types without timezone.
- The default `MapWritable` implementation switched from `HashMap` to `LinkedHashMap` to ensure order.
- `Column` class now supports creation using the Builder pattern.
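
The ordering guarantee behind the `MapWritable` change comes from the JDK itself: `LinkedHashMap` iterates in insertion order, whereas `HashMap` makes no ordering promise. A minimal sketch follows; `MapOrderDemo` is a hypothetical class that uses plain JDK maps rather than `MapWritable`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration using JDK maps only.
class MapOrderDemo {
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> linked = new LinkedHashMap<>();
        linked.put("zeta", 1);
        linked.put("alpha", 2);
        linked.put("mid", 3);
        // Iteration follows insertion order: zeta, alpha, mid.
        System.out.println(keysInIterationOrder(linked));
    }
}
```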
- `TableReadSession` now introduces new parameters `maxBatchRawSize` and `splitMaxFileNum`.
- `UpsertSession` enhancements:
  - Supports writing partial columns.
  - Allows setting the number of Netty thread pools with the default changed to 1.
  - Enables setting maximum concurrency with the default value changed to 16.
- `TableTunnel` now supports setting `quotaName` option.