branch-1.1:Failed to get metadata for S3 object #461

Open
xingnailu opened this issue Dec 14, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@xingnailu

Bug description
I built Gluten with Velox from branch-1.1 and submitted a TPC-H query through spark-shell, with the data stored in S3. The following error occurred during execution:

Reason: Failed to get metadata for S3 object due to: 'Unknown error'. Path:'s3://xxxxxxx/user/hive/warehouse/tpch_orc.db/customer/part-00027-31ef1f3c-5b27-4f6c-aef4-7f77f7749873-c000.snappy.orc', SDK Error Type:100, HTTP Status Code:400, S3 Service:'AmazonS3', Message:'No response body.', RequestID:'KC5WQZ78QWKQ9BFX'

However, the same query runs normally when I use the Gluten v1.0.0 tag.

@majetideepak

System information
System info for the branch-1.1 build:

Velox System Info v0.0.2
Commit: facebookincubator@bbd65c4
CMake Version: 3.16.3
System: Linux-5.15.0-91-generic
Arch: x86_64
C++ Compiler: /bin/c++
C++ Compiler Version: 9.4.0
C Compiler: /bin/cc
C Compiler Version: 9.4.0
CMake Prefix Path: /usr/local;/usr;/;/usr;/usr/local;/usr/X11R6;/usr/pkg;/opt

Running on AWS EKS.

Relevant logs

"2023-12-05T07:12:37.689576121Z stdout F 23/12/05 07:12:37 ERROR TaskResources: Task 8 failed by error: ",
"2023-12-05T07:12:37.689606328Z stdout F io.glutenproject.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxRuntimeError",
"2023-12-05T07:12:37.689628682Z stdout F Error Source: RUNTIME",
"2023-12-05T07:12:37.689632451Z stdout F Error Code: INVALID_STATE",
"2023-12-05T07:12:37.689636372Z stdout F Reason: Failed to get metadata for S3 object due to: 'Unknown error'. Path:'s3://xxxxxx/user/hive/warehouse/tpch_orc.db/customer/part-00027-31ef1f3c-5b27-4f6c-aef4-7f77f7749873-c000.snappy.orc', SDK Error Type:100, HTTP Status Code:400, S3 Service:'AmazonS3', Message:'No response body.', RequestID:'KC5WQZ78QWKQ9BFH'",
"2023-12-05T07:12:37.689639435Z stdout F Retriable: False",
"2023-12-05T07:12:37.689643198Z stdout F Context: Split [Hive: s3a://xxxxxx/user/hive/warehouse/tpch_orc.db/customer/part-00027-31ef1f3c-5b27-4f6c-aef4-7f77f7749873-c000.snappy.orc 0 - 121746056] Task Gluten_Stage_0_TID_8",
"2023-12-05T07:12:37.689646437Z stdout F Top-Level Context: Same as context.",
"2023-12-05T07:12:37.689649292Z stdout F Function: initialize",
"2023-12-05T07:12:37.689652406Z stdout F File: ../../velox/connectors/hive/storage_adapters/s3fs/S3FileSystem.cpp",
"2023-12-05T07:12:37.689655045Z stdout F Line: 93", 
"2023-12-05T07:12:37.689657984Z stdout F Stack trace:",
"2023-12-05T07:12:37.689661375Z stdout F # 0  facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)",
"2023-12-05T07:12:37.689670744Z stdout F # 1  void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxRuntimeError, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(facebook::velox::detail::VeloxCheckFailArgs const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)", 
"2023-12-05T07:12:37.68967352Z stdout F # 2  facebook::velox::(anonymous namespace)::S3ReadFile::initialize()",
"2023-12-05T07:12:37.689677103Z stdout F # 3  facebook::velox::filesystems::S3FileSystem::openFileForRead(std::basic_string_view<char, std::char_traits<char> >, facebook::velox::filesystems::FileOptions const&)",
"2023-12-05T07:12:37.689680232Z stdout F # 4  facebook::velox::FileHandleGenerator::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)",
"2023-12-05T07:12:37.689682935Z stdout F # 5  facebook::velox::CachedFactory<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<facebook::velox::FileHandle>, facebook::velox::FileHandleGenerator>::generate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)", 
"2023-12-05T07:12:37.689686275Z stdout F # 6  facebook::velox::connector::hive::HiveDataSource::addSplit(std::shared_ptr<facebook::velox::connector::ConnectorSplit>)",
"2023-12-05T07:12:37.68970488Z stdout F # 7  facebook::velox::exec::TableScan::getOutput()",
"2023-12-05T07:12:37.689707926Z stdout F # 8  facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exec::Driver>&, std::shared_ptr<facebook::velox::exec::BlockingState>&, std::shared_ptr<facebook::velox::RowVector>&)",
"2023-12-05T07:12:37.689710953Z stdout F # 9  facebook::velox::exec::Driver::next(std::shared_ptr<facebook::velox::exec::BlockingState>&)",
"2023-12-05T07:12:37.689713812Z stdout F # 10 facebook::velox::exec::Task::next(folly::SemiFuture<folly::Unit>*)",
"2023-12-05T07:12:37.689716972Z stdout F # 11 gluten::WholeStageResultIterator::next()",
"2023-12-05T07:12:37.689719966Z stdout F # 12 Java_io_glutenproject_vectorized_ColumnarBatchOutIterator_nativeHasNext",
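The log lines above carry a container-runtime prefix (timestamp, stream, `F` tag) and are wrapped in quotes with a trailing comma. A minimal Python sketch for stripping that wrapper and collecting the Velox error fields could look like the following. The helper names `strip_cri_prefix` and `error_fields` are hypothetical, and the quote-and-comma wrapping is assumed to come from the log export tool rather than from Velox or Gluten.

```python
import re

# CRI-style container log line: "<timestamp> <stream> <tag> <message>".
# Assumption: the surrounding double quotes and the trailing comma were added
# by whatever tool exported the logs, so both are treated as optional.
CRI_LINE = re.compile(
    r'^\s*"?(?P<ts>\S+Z) (?P<stream>stdout|stderr) (?P<tag>[FP]) '
    r'(?P<msg>.*?)"?,?\s*$'
)

# Keys that appear in the Velox error block quoted in this issue.
ERROR_KEYS = {"Error Source", "Error Code", "Retriable", "Function", "File", "Line"}


def strip_cri_prefix(line: str) -> str:
    """Return the message part of a log line, or the stripped line if no prefix matches."""
    match = CRI_LINE.match(line)
    return match.group("msg") if match else line.strip()


def error_fields(lines):
    """Collect 'Key: value' pairs such as 'Error Code: INVALID_STATE' into a dict."""
    fields = {}
    for line in lines:
        key, sep, value = strip_cri_prefix(line).partition(": ")
        if sep and key in ERROR_KEYS:
            fields[key] = value
    return fields
```

Feeding the quoted log lines into `error_fields` gives a small dictionary (error code, function, file, line) that is easier to compare across runs than the raw export.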
xingnailu added the bug label Dec 14, 2023
@dcoliversun

I hit a similar exception, but my data is stored on Alibaba OSS. The S3 storage adapter supports the oss scheme [1].

The exception info is:

Caused by: io.glutenproject.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Failed to get metadata for S3 object due to: 'Resource not found'. Path:'s3://henghzhen-test-hangzhou/db/t1/b=1/c=10/part-00000-d4940ed1-7f70-44f5-bbb0-65ae29f325f1.c000.snappy.parquet', SDK Error Type:16, HTTP Status Code:404, S3 Service:'AmazonS3', Message:'No response body.', RequestID:'2VQQRSWNX8QQGNNY'
Retriable: False
Context: Split [Hive: s3a://henghzhen-test-hangzhou/db/t1/b=1/c=10/part-00000-d4940ed1-7f70-44f5-bbb0-65ae29f325f1.c000.snappy.parquet 0 - 443] Task Gluten_Stage_0_TID_0
Top-Level Context: Same as context.
Function: initialize
File: ../../velox/connectors/hive/storage_adapters/s3fs/S3FileSystem.cpp
Line: 93
Stack trace:
# 0  _ZN8facebook5velox7process10StackTraceC1Ei
# 1  _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
# 2  _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_
# 3  _ZN8facebook5velox12_GLOBAL__N_110S3ReadFile10initializeEv
# 4  _ZN8facebook5velox11filesystems12S3FileSystem15openFileForReadESt17basic_string_viewIcSt11char_traitsIcEERKNS1_11FileOptionsE
# 5  _ZN8facebook5velox19FileHandleGeneratorclERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
# 6  _ZN8facebook5velox13CachedFactoryINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt10shared_ptrINS0_10FileHandleEENS0_19FileHandleGeneratorEE8generateERKS7_
# 7  _ZN8facebook5velox9connector4hive14HiveDataSource8addSplitESt10shared_ptrINS1_14ConnectorSplitEE
# 8  _ZN8facebook5velox4exec9TableScan9getOutputEv
# 9  _ZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEE
# 10 _ZN8facebook5velox4exec6Driver4nextERSt10shared_ptrINS1_13BlockingStateEE
# 11 _ZN8facebook5velox4exec4Task4nextEPN5folly10SemiFutureINS3_4UnitEEE
# 12 _ZN6gluten24WholeStageResultIterator4nextEv
# 13 Java_io_glutenproject_vectorized_ColumnarBatchOutIterator_nativeHasNext
# 14 0x00007f8c75018427

[1] https://facebookincubator.github.io/velox/develop/connectors.html?highlight=oss
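The two reports in this thread differ in the `Reason:` line (HTTP 400 with SDK Error Type 100 in the original report, HTTP 404 with SDK Error Type 16 here). To compare such reports field by field, a rough Python sketch is shown below. The regular expression is derived only from the two messages quoted in this issue, not from a documented Velox format, and `parse_reason` is a hypothetical helper name.

```python
import re

# Pattern inferred from the "Reason:" lines quoted in this issue.
# Assumption: field order and quoting stay the same across Velox versions.
REASON = re.compile(
    r"due to: '(?P<cause>[^']*)'\. Path:'(?P<path>[^']*)', "
    r"SDK Error Type:(?P<sdk_type>\d+), HTTP Status Code:(?P<http_status>\d+), "
    r"S3 Service:'(?P<service>[^']*)', Message:'(?P<message>[^']*)', "
    r"RequestID:'(?P<request_id>[^']*)'"
)


def parse_reason(text: str):
    """Return the fields of a 'Failed to get metadata' reason as a dict, or None."""
    match = REASON.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be compared as integers.
    fields["sdk_type"] = int(fields["sdk_type"])
    fields["http_status"] = int(fields["http_status"])
    return fields
```

Running `parse_reason` on each report's `Reason:` line yields the cause, path, SDK error type, HTTP status, and request ID side by side.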
