ORA-01196: file 1 is inconsistent due to a failed media recovery session

官方解释:

ORA-01196: file string is inconsistent due to a failed media recovery session
Cause: The file was being recovered but the recovery did not terminate normally. This left the file in an inconsistent state. No more recovery was successfully completed on this file.
Action: Either apply more logs until the file is consistent or restore the backup again and repeat recovery.

======================================================================================

环境: 

rhel5u7+11.2.0.1.0+physical standby    
ADG保护模式:MAXIMUM PERFORMANCE
重做日志传输模式:LGWR ASYNC AFFIRM  
应用方法:redo apply


错误描述:

使用vmware搭建了一个物理ADG实验环境,保护模式为"maximum protection",为了验证standby 库down而primary也会down机,在standby 上执行了abort强制关闭数据库。

Standby上执行下面操作如下:

SQL> startup nomount;
SQL> alter database mount standby database;
SQL> alter database recover managed standby database using current logfile  disconnect from session;
SQL> alter database recover managed standby database cancel;
SQL> alter database open read only;      #执行到这步报错

ERROR at line 1:
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session

ORA-01110: data file 1: '/oradata/ora11g/system01.dbf'


Primary上执行的操作如下:

SQL> startup 
ORACLE instance started.

Total System Global Area  368263168 bytes
Fixed Size                  1336596 bytes
Variable Size             289409772 bytes
Database Buffers           71303168 bytes
Redo Buffers                6213632 bytes
Database mounted.
ORA-03113: end-of-file on communication channel
Process ID: 17795
Session ID: 1 Serial number: 5

=====================================================

Primary参数文件:

ora11g.__pga_aggregate_target=134217728
ora11g.__sga_target=234881024
ora11g.__shared_io_pool_size=0
ora11g.__shared_pool_size=138412032
ora11g.__streams_pool_size=8388608
*.audit_file_dest='/u01/oracle/admin/ora11g/adump'
*.audit_trail='db'
*.compatible='11.2.0.0.0'
*.control_files='/oradata/ora11g/control01.ctl','/u01/oracle/flash_recovery_area/ora11g/control02.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_recovery_file_dest='/u01/oracle/flash_recovery_area'
*.db_recovery_file_dest_size=4039114752
*.diagnostic_dest='/u01/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=ora11gXDB)'
*.memory_target=367001600
*.nls_language='SIMPLIFIED CHINESE'
*.nls_territory='CHINA'
*.open_cursors=300
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sessions=170
*.standby_file_management='auto'
*.undo_tablespace='UNDOTBS1'
*.db_name='ora11g'
*.db_unique_name='primary'
*.log_archive_config='DG_CONFIG=(primary,standby)'
*.log_archive_dest_1='location=/orachivelog/ valid_for=(all_logfiles,all_roles) db_unique_name=primary'
*.log_archive_dest_2='SERVICE=standby LGWR SYNC AFFIRM  VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQ
UE_NAME=standby'

*.db_file_name_convert='standby','primary'
*.log_file_name_convert='standby','priamry'
*.fal_client='primary'
*.fal_server='standby'
*.log_archive_dest_state_1='enable'
*.log_archive_dest_state_2='enable'
*.log_archive_format='%t_%s_%r.dbf'
*.LOG_ARCHIVE_MAX_PROCESSES=30
*.standby_file_management=auto

===========================================================

Standby参数文件:

ora11g.__db_cache_size=71303168
ora11g.__java_pool_size=4194304
ora11g.__large_pool_size=4194304
ora11g.__oracle_base='/u01/oracle'#ORACLE_BASE set from environment
ora11g.__pga_aggregate_target=134217728
ora11g.__sga_target=234881024
ora11g.__shared_io_pool_size=0
ora11g.__shared_pool_size=138412032
ora11g.__streams_pool_size=8388608
*.audit_file_dest='/u01/oracle/admin/ora11g/adump'
*.audit_trail='db'
*.compatible='11.2.0.0.0'
*.control_files='/oradata/ora11g/standby01.ctl','/oradata/ora11g/standby02.ctl','/u01/oracle/flash_recovery_area/ora11g/standby03.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='ora11g'
*.db_recovery_file_dest='/u01/oracle/flash_recovery_area'
*.db_recovery_file_dest_size=4039114752
*.diagnostic_dest='/u01/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=ora11gXDB)'
*.log_archive_dest_1='location=/orachivelog'
*.log_archive_format='%t_%s_%r.dbf'
*.memory_target=367001600
*.nls_language='SIMPLIFIED CHINESE'
*.nls_territory='CHINA'
*.open_cursors=300
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sessions=170
*.undo_tablespace='UNDOTBS1'
*.db_unique_name='standby'
*.log_archive_config='DG_CONFIG=(primary,standby)'
*.log_archive_dest_1='location=/orachivelog/ valid_for=(all_logfiles,all_roles) db_unique_name=standby'
*.log_archive_dest_2='SERVICE=primary LGWR SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=primary'
*.LOG_ARCHIVE_MAX_PROCESSES=30

*.log_archive_dest_state_1=enable
*.log_archive_dest_state_2=enable
*.fal_server=primary
*.fal_client=standby
*.db_file_name_convert='primary','standby'
*.log_file_name_convert='priamry','standby'
*.standby_file_management=auto


alterlog文件中的信息如下:

Fatal NI connect error 12514, connecting to:
 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.2.111)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=primary)(CID=(PROGRAM=oracle)(HOST=dba2.test.com)(USER=oracle))))


  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.1.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.1.0 - Production
  Time: 25-JAN-2013 12:14:15
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12564
    
TNS-12564: TNS:connection refused
    ns secondary err code: 0
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
***********************************************************************
Fatal NI connect error 12514, connecting to:
 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.2.111)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=primary)(CID=(PROGRAM=oracle)(HOST=dba2.test.com)(USER=oracle))))


  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.1.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.1.0 - Production
  Time: 25-JAN-2013 12:14:15
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12564
    
TNS-12564: TNS:connection refused
    ns secondary err code: 0
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
Error 12514 received logging on to the standby
FAL[client, ARC1]: Error 12514 connecting to primary for fetching gap sequence
Errors in file /u01/oracle/diag/rdbms/standby/ora11g/trace/ora11g_arc1_18114.trc:
ORA-12514: TNS: 监听程序当前无法识别连接描述符中请求的服务
Errors in file /u01/oracle/diag/rdbms/standby/ora11g/trace/ora11g_arc1_18114.trc:
ORA-12514: TNS: 监听程序当前无法识别连接描述符中请求的服务
Fri Jan 25 12:14:15 2013
ARCt started with pid=50, OS id=18174 
ARCt: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Fri Jan 25 12:14:18 2013
INSV started with pid=51, OS id=18177 
Fri Jan 25 12:28:30 2013
db_recovery_file_dest_size of 3852 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Fri Jan 25 12:55:13 2013
alter database recover managed standby database using current logfile parallel 2 disconnect from session
Attempt to start background Managed Standby Recovery process (ora11g)
Fri Jan 25 12:55:14 2013
MRP0 started with pid=52, OS id=18753 
MRP0: Background Managed Standby Recovery process started (ora11g)
 started logmerger process
Fri Jan 25 12:55:19 2013
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 2 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /orachivelog/1_169_777078497.dbf
Completed: alter database recover managed standby database using current logfile disconnect from session
Recovery of Online Redo Log: Thread 1 Group 5 Seq 170 Reading mem 0
  Mem# 0: /oradata/ora11g/dglog08.log
Standby crash recovery failed to bring standby database to a consistent
point because needed redo hasn't arrived yet.
MRP: Wait timeout: thread 1 sequence# 170
Standby crash recovery aborted due to error 16016.
Errors in file /u01/oracle/diag/rdbms/standby/ora11g/trace/ora11g_ora_18085.trc:
ORA-16016: archived log for thread 1 sequence# 170 unavailable
Recovery interrupted!
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Completed standby crash recovery.
Errors in file /u01/oracle/diag/rdbms/standby/ora11g/trace/ora11g_ora_18085.trc:
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '/oradata/ora11g/system01.dbf'
ORA-10458 signalled during: alter database open read only...

============================================================

ora11g_ora_18085.trc中的信息如下:


*** 2013-01-25 13:19:33.276
Media Recovery drop redo thread 1
KCBR: Number of read descriptors = 1024
KCBR: Media recovery blocks read (SYNC) = 71
KCBR: Influx buffers flushed = 5 times
KCBR: Redo cache copies/changes = 777/777

*** 2013-01-25 13:19:33.277
Completed Media Recovery
In-flux buffer recovery was not started because datafiles were fuzzy beyond in-flux recovery target.
Highest datafile fuzzy SCN: 0.1934411
In-flux buffer recovery target SCN: 0.1933999
*** 2013-01-25 13:19:33.307 1266 krsm.c
Managed Recovery: Not Active posted.
DDE rules only execution for: ORA 1110
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
Successfully dispatched
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (SUCCESS, 6 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 6 csec) -----
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '/oradata/ora11g/system01.dbf'


解决方法两种

第一种方法:

1,忽略这个问题,直接启备库到mount状态。

2,启动主库到open状态。

3,在备库上应用实时应用redo log。


第二种方法:

1,将主库启动到mount状态,修改redo log传输模式为异步。

SQL> alter system set log_archive_dest_2='service=primary async valid_for=(online_logfiles,primary_role) db_unique_name=primary' scope=spfile;

2,将ADG模式修改为最大性能模式

SQL> alter database set standby database to maximize performance;

SQL> shut immediate;

3,启动主库

SQL> startup

4,启动从库

SQL>startup nomount;

SQL>alter database mount standby database;

SQL>alter database recover managed standby database using current logfile disconnect from session;

SQL>alter database recover managed standby database cancel;

SQL>alter database open read only;


注意:

1, 保证监听都开启,主从库t网络都畅通。

2,正常情况下ADG的启动和关闭顺序是

启动  standby --> primary

关闭  primary --> standby


小结:

这次实验主要是验证一下,ADG在最大保护模式下如果standby  down掉时的情况,也就是从库down了,主库是不是也会随着也挂了。没想到还出了这么多问题,实验中在从库上执行了"shut abort"命令,导致从库非一致性停库,接着主库也确实挂了,但是再次启动从库到open(可以启动到mount状态)时报“ORA-10458”错误,说是需要从库需要恢复,则又执行recover database进行恢复,但没有成功。可能是redo log还没有传到standby库上所导致,突然想到在最大保护模式下,redo log必须要传到一台standby上才可以在主库上进行commit,也就是说备库上没有接收完主库上的redo log就down机了,而主库上的redo数据还在redo log里面并没有传到备库上的standby redo log中,所以备库不能open,而主库需要open的前提是备库有应用redo log才能启库,因此只能先将主库修改成最大性能模式再启库,因为最大性能模式是异步的并不要求从库必须应用redo log后主库才能提交。以上只是个人的理解,如有不对的地方希望大家更正或补充。

======================================================================================

版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任! 谢谢合作!

QQ: 164798858@qq.com
Sina: weibo.com/kaijunfeng
Yahoo: fffygapl@yahoo.com.cn

<think>我们正在解决Oracle数据泵导出时出现的UDE-31623和ORA-31623错误。根据用户提供的引用(特别是文档ID1907256.1),这个错误的具体描述是:“a jobis notattached tothis sessionvia thespecified handle”(作业没有通过指定的句柄附加到当前会话)。这通常发生在尝试控制(如停止、重启或监控)一个数据泵作业时,但当前会话并没有附加到这个作业。错误原因分析:1.会话未附加作业:可能因为之前的会话已经断开,而作业在后台继续运行,此时在新的会话中尝试重新附加该作业就会失败。2.作业句柄无效:在数据泵操作中,作业句柄(handle)是唯一标识作业的。如果句柄不正确或已失效,就会导致此错误。3.作业已完成或失败:如果作业已经完成(成功或失败)而用户仍尝试附加,也会出现此错误。 解决方案: 根据Oracle官方文档(文档ID1907256.1)[^1][^2],我们可以采取以下步骤:步骤1:检查数据泵作业的状态使用以下SQL查询当前数据库中的数据泵作业状态:```sqlSELECTowner_name, job_name,operation,job_mode, stateFROM dba_datapump_jobs; ```或者(如果没有DBA权限): ```sqlSELECT *FROM user_datapump_jobs; ```步骤2:根据作业状态采取相应措施-如果作业状态(STATE)为`NOTRUNNING`、`FAILED`或`COMPLETED`,则说明作业已经结束,无法重新附加。此时,如果需要重新运行,应该使用新的作业名重新启动导出。-如果作业状态为`EXECUTING`,说明作业正在运行,但当前会话无法附加。这时,我们可以尝试重新附加到作业(见步骤3)。 步骤3:重新附加到运行中的作业(如果作业仍在运行) 在数据泵导出命令中,使用`attach`参数附加到现有作业:```bashexpdp username/passwordATTACH=<job_name> ```其中`<job_name>`是步骤1中查询到的作业名。 步骤4:如果附加成功,则可以在交互模式下控制作业(如停止、重启等)。如果附加失败(例如作业名错误或作业已不存在),则考虑终止该作业并重新启动。 步骤5:终止无法附加的作业(如果作业已挂起且无法附加) 如果作业确实存在但无法附加,且需要停止它,可以使用以下PL/SQL命令(以SYS或拥有权限的用户执行):```sqlDECLAREhdlNUMBER;BEGINhdl :=DBMS_DATAPUMP.ATTACH('<job_name>', '<owner_name>'); --作业名和所有者(从步骤1中获取)DBMS_DATAPUMP.STOP_JOB(hdl,1,0); --立即停止作业END; /``` 注意:如果作业所有者不是当前用户,需要指定所有者。步骤6:重新启动导出作业在清理了无效作业后,使用新的作业名重新启动导出: ```bashexpdpusername/password DIRECTORY=dpump_dirDUMPFILE=export.dmp LOGFILE=export.logJOB_NAME=new_job_name ...``` 预防措施:1.在脚本中使用数据泵时,确保作业名唯一(例如包含时间戳)。 2.在会话中断(如网络断开)后,先检查作业状态再尝试重新附加。3.定期清理已完成的数据泵作业。注意事项: -如果作业已经完成(成功或失败),则不需要终止,直接重新运行即可。-如果作业状态为`DEFINING`,表示作业正在创建,可以稍等片刻再尝试附加。如果上述步骤无法解决问题,可能需要检查数据泵的日志文件(通常位于DIRECTORY参数指定的目录中)以获取更多信息。 相关问题: 1.如何查看Oracle数据泵作业的详细日志?2.如何永久删除Oracle数据泵作业? 3.在Oracle数据泵中,如何从网络故障中恢复导出作业?</think>### 解决 Oracle 数据泵导出错误 UDE-31623 ORA-31623 "job not attached to session handle" 根据您提供的引用文档 ID 1907256.1[^1][^2],这个错误的核心是数据泵作业未能正确附加到当前会话句柄。以下是针对此错误的完整解决方案: #### 错误原因分析 ```mermaid graph TD A[UDE-31623/ORA-31623] --> B[会话句柄丢失] B --> C1[网络中断] B --> C2[会话超时] B --> C3[客户端异常退出] B --> C4[并行操作冲突] B --> C5[作业名重复] A --> D[作业状态不一致] D --> E1[作业仍在运行] D --> E2[作业已停止但未清除] ``` 主要触发场景: 1. **会话中断**:导出过程中网络断开或客户端异常关闭 2. **作业状态残留**:作业仍在数据库后台运行,但会话句柄丢失[^1] 3. **句柄冲突**:多个会话尝试附加同名作业(常见于脚本重复执行) 4. **权限问题**:用户权限变更导致会话无法重新附加作业 #### 解决方案步骤 ##### 1. 检查作业状态(关键步骤) ```sql -- 查询所有数据泵作业状态 SELECT owner_name, job_name, operation, job_mode, state, attached_sessions FROM dba_datapump_jobs WHERE job_name = '<your_job_name>'; -- 替换为实际作业名 -- 若无作业名,查询所有运行中作业 SELECT * FROM dba_datapump_jobs WHERE state = 'EXECUTING'; ``` - ✅ **正常状态**:`STATE = 'EXECUTING'` 且 `ATTACHED_SESSIONS > 0` - ❌ **错误状态**:`STATE = 'NOT RUNNING'` 但作业实际在运行(需强制清理) ##### 2. 重新附加作业(首选方案) ```bash expdp username/password@db \ ATTACH=<job_name> # 从步骤1获取的作业名 ``` - 成功附加后进入交互模式,可选择: - `CONTINUE_CLIENT` 恢复作业 - `STOP_JOB` 安全停止 - 若附加失败,继续下一步 ##### 3. 强制清理残留作业 ```sql DECLARE hdl NUMBER; BEGIN -- 尝试附加作业 hdl := DBMS_DATAPUMP.ATTACH('<job_name>', '<owner_name>'); -- 立即停止并删除作业 DBMS_DATAPUMP.STOP_JOB(hdl, 1, 0); EXCEPTION WHEN OTHERS THEN -- 当附加失败时直接删除作业 DBMS_DATAPUMP.DETACH('<job_name>'); END; / ``` ##### 4. 重启导出作业(预防措施) ```bash expdp username/password@db \ DIRECTORY=dpump_dir \ DUMPFILE=export_%U.dmp \ LOGFILE=export.log \ JOB_NAME=exp_$(date +%s) \ # 使用时间戳确保唯一性 PARALLEL=4 \ CLUSTER=N # 禁用集群选项(规避常见Bug) ``` 关键参数优化: - `JOB_NAME`:添加时间戳保证唯一性(避免 ORA-31623) - `CLUSTER=N`:禁用集群模式(解决引用[2]提到的并发Bug) - `PARALLEL`:根据 CPU 核心数设置(公式:$ \text{parallel} \leq \text{CPU\_cores}/2 $) ##### 5. 网络中断处理 若错误由网络中断引起: ```bash # 使用mosh替代ssh(防止会话超时) mosh user@dbserver # 在tmux/screen中启动导出 tmux new-session -s expdp expdp ... # 按Ctrl+B D脱离会话 ``` #### 预防措施 1. **会话管理**: - 使用 `nohup expdp ... &` 或 `tmux` 启动作业 - 设置 SQLNET.EXPIRE_TIME=10(分钟级心跳检测) 2. **作业监控脚本**: ```bash #!/bin/bash # 自动检测并重新附加作业 if ! expdp attach=<job_name>; then echo "Reattaching failed, restarting job..." expdp ... JOB_NAME=exp_$(date +%s) fi ``` 3. **参数优化**(`expdp` 命令行): ```bash COMPRESSION=ALL # 减少网络传输量 NETWORK_LINK=db_link # 远程导出避免客户端中断 ENCRYPTION_PASSWORD=pass # 防止凭据失效 ``` > **重要提示**:如果问题反复出现,请检查 Oracle 集群配置。引用[2]指出该错误常与 Bug 5663241/4173696 相关(集群环境下的句柄同步问题),需应用补丁或设置 `CLUSTER=N`[^2]。 #### 故障排除流程图 ```mermaid graph LR A[出现 ORA-31623] --> B{作业是否存在?} B -->|Yes| C[尝试ATTACH] B -->|No| D[重启新作业] C -->|成功| E[交互控制] C -->|失败| F[强制清理] F --> G[重启新作业] G --> H[添加唯一JOB_NAME] ```
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值