1. Error message
Caused by: java.io.IOException: Failed to flush data to Doris. {"BeginTxnTimeMs":0,"Comment":"","CommitAndPublishTimeMs":0,"ErrorURL":"http://**.**.**.**:**/api/_load_error_log?file=__shard_892/error_log_insert_stmt_bf4eb57df3756f49-c5e794564529e398_bf4eb57df3756f49_c5e794564529e398","Label":"datax_doris_writer_c0b81541-2986-4e36-9daa-94531c7a1301","LoadBytes":10027,"LoadTimeMs":20,"Message":"[INTERNAL_ERROR]too many filtered rows\n\n\t0# std::_Function_handler<void (doris::RuntimeState*, doris::Status*), doris::StreamLoadExecutor::execute_plan_fragment(std::shared_ptr<doris::StreamLoadContext>)::$_0>::_M_invoke(std::_Any_data const&, doris::RuntimeState*&&, doris::Status*&&)\n\t1# doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::RuntimeState*, doris::Status*)> const&)\n\t2# std::_Function_handler<void (), doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0>::_M_invoke(std::_Any_data const&)\n\t3# doris::ThreadPool::dispatch_thread()\n\t4# doris::Thread::supervise_thread(void*)\n\t5# start_thread\n\t6# clone\n","NumberFilteredRows":3,"NumberLoadedRows":0,"NumberTotalRows":3,"NumberUnselectedRows":0,"ReadDataTimeMs":0,"Status":"Fail","StreamLoadPutTimeMs":4,"TwoPhaseCommit":"false","TxnId":4531902,"WriteDataTimeMs":15} at com.alibaba.datax.plugin.writer.doriswriter.DorisStreamLoadObserver.streamLoad(DorisStreamLoadObserver.java:65) at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterManager.asyncFlush(DorisWriterManager.java:165) ... 3 more
2. Troubleshooting steps
1. Get the error detail URL
The detail URL can be read from the ErrorURL field of the error message:
"ErrorURL":"http://**.**.**.**:**/api/_load_error_log?file=__shard_892/error_log_insert_stmt_bf4eb57df3756f49-c5e794564529e398_bf4eb57df3756f49_c5e794564529e398","Label":"datax_doris_writer_c0b81541-2986-4e36-9daa-94531c7a1301"
⚠️ If the error message does not contain a detail URL, it can be obtained with the following statement:
show stream load
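When the error comes out of a DataX log, the ErrorURL can also be extracted programmatically rather than copied by hand. The following is a minimal sketch, assuming the exception text has the same shape as the message in section 1, where the Stream Load JSON response follows "Failed to flush data to Doris." The function name `parse_stream_load_error` and the sample values are hypothetical, chosen for illustration only.

```python
import json


def parse_stream_load_error(message: str) -> dict:
    """Return the Stream Load JSON response embedded in a DataX exception message."""
    marker = "Failed to flush data to Doris."
    # Locate the first '{' after the marker; raises ValueError if either is missing.
    start = message.index("{", message.index(marker))
    # raw_decode parses one JSON object and ignores the trailing stack trace text.
    response, _end = json.JSONDecoder().raw_decode(message[start:])
    return response


# Hypothetical, shortened sample in the same shape as the real message.
sample = (
    "Caused by: java.io.IOException: Failed to flush data to Doris. "
    '{"ErrorURL":"http://example.invalid/api/_load_error_log?file=x",'
    '"Label":"datax_doris_writer_demo","NumberFilteredRows":3,'
    '"NumberTotalRows":3,"Status":"Fail"} at com.alibaba.datax.plugin...'
)

info = parse_stream_load_error(sample)
print(info["ErrorURL"])
print(info["NumberFilteredRows"], "of", info["NumberTotalRows"], "rows filtered")
```

Comparing NumberFilteredRows with NumberTotalRows shows at a glance whether every row was rejected, as in the case described here (3 of 3).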
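2. Get the load failure details
Log in to the Doris data management platform and run the following command to see the specific reason the load failed:
show load WARNINGS on 'http://**.**.**.**:**/api/_load_error_log?file=__shard_892/error_log_insert_stmt_bf4eb57df3756f49-c5e794564529e398_bf4eb57df3756f49_c5e794564529e398'
The details are as follows: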
Reason: no partition for this tuple. tuple=+-------------------------------
The error shows that the rows being written do not fall into any existing partition of the target table.
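When an error log contains many rejected rows, grouping them by reason makes the dominant cause easier to spot. The sketch below assumes the log text has already been retrieved (for example from the ErrorURL) and that each rejected row starts with "Reason:", as in the line shown above. The function name `summarize_reasons` is hypothetical, and the " src line" marker is an assumption about other log variants, not something taken from this log.

```python
from collections import Counter


def summarize_reasons(error_log: str) -> Counter:
    """Count how often each 'Reason:' text appears in a load error log."""
    reasons = Counter()
    for line in error_log.splitlines():
        if not line.startswith("Reason:"):
            continue
        reason = line[len("Reason:"):]
        # Cut off the row dump so identical causes group together.
        # " tuple=" matches the log above; " src line" is an assumed variant.
        for marker in (" tuple=", " src line"):
            reason = reason.split(marker, 1)[0]
        reasons[reason.strip()] += 1
    return reasons


# Hypothetical log text with two rejected rows.
sample_log = (
    "Reason: no partition for this tuple. tuple=+----\n"
    "Reason: no partition for this tuple. tuple=+----\n"
)

for reason, count in summarize_reasons(sample_log).most_common():
    print(count, reason)
```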
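3. Summary
Common errors when writing data are:
- Writing rows to a partitioned table when no existing partition covers them
- Source data types that do not match the Doris column types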
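- Importing CSV data whose source contains strings that may include tab (\t) or newline (\n) characters. When the data is split, rows and columns are cut in the wrong places and a data type mismatch error is reported, so custom separators need to be set:
PROPERTIES (
    "column_separator" = "&*&",  -- column separator of the exported file
    "line_delimiter" = "@@@@"    -- row delimiter of the exported file
)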
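The effect of embedded tabs and newlines can be reproduced with a simplified model of delimiter-based splitting. This is an illustrative sketch only, not Doris's actual parser. The helper names `encode_rows` and `split_rows` and the sample row are hypothetical, and the separators "&*&" and "@@@@" are the ones from the PROPERTIES example above.

```python
def encode_rows(rows, column_separator, line_delimiter):
    """Serialize rows into one text payload using the given separators."""
    return line_delimiter.join(column_separator.join(fields) for fields in rows)


def split_rows(payload, column_separator, line_delimiter):
    """Split a payload back into rows and columns, as a delimiter-based loader would."""
    lines = [line for line in payload.split(line_delimiter) if line]
    return [line.split(column_separator) for line in lines]


# One logical row whose string column contains both a tab and a newline.
rows = [["1", "note with\ttab and\nnewline", "2024-01-01"]]

# Default-style separators: the embedded characters are mistaken for delimiters.
default = split_rows(encode_rows(rows, "\t", "\n"), "\t", "\n")

# Custom separators that do not occur in the data: the row survives intact.
custom = split_rows(encode_rows(rows, "&*&", "@@@@"), "&*&", "@@@@")

print(len(default), "rows with default separators")
print(len(custom), "row with custom separators")
```

With the default separators the single row is split into two rows with the wrong number of columns, which is how a type mismatch can surface downstream. The custom separators only help as long as they never appear in the data itself.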
Doris official documentation: Stream Load - Apache Doris