zl程序教程

您现在的位置是:首页 >  其他

当前栏目

11.2.0.3 ASM实例出现ORA-4031错误导致数据库归档失败

错误实例数据库 出现 失败 导致 ASM 归档
2023-09-27 14:29:32 时间


环境:
平台:RedHat EnterPrise 5.8 X86_X64
数据库:Oracle EnterPrise 11.2.0.3
集群软件:Oracle grid 11.2.0.3


故障现象:
数据库出现了归档失败,其中有一个节点的实例出现HANG死的状况。

日志信息如下:

 Fri Feb 28 19:49:04 2014

 ARC1: Error 19504 Creating archive log file to +DATA02

 ARCH: Archival stopped, error occurred. Will continue retrying

 ORACLE Instance orcl1 - Archival Error

 ORA-16038: log 14 sequence# 68244 cannot be archived

 ORA-19504: failed to create file ""

 ORA-00312: online log 14 thread 1: +DATA02/orcl/onlinelog/group_14.264.792274883

 ORA-00312: online log 14 thread 1: +DATA02/orcl/onlinelog/group_14.265.792274889

 Archiver process freed from errors. No longer stopped

 Fri Feb 28 19:50:22 2014

 ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3

 ARCH: Archival stopped, error occurred. Will continue retrying

 ORACLE Instance orcl1 - Archival Error

 ORA-16014: log 14 sequence# 68244 not archived, no available destinations

 ORA-00312: online log 14 thread 1: +DATA02/orcl/onlinelog/group_14.264.792274883

 ORA-00312: online log 14 thread 1: +DATA02/orcl/onlinelog/group_14.265.792274889

 ARC0: Archive log rejected (thread 1 sequence 68240) at host orclsh

 FAL[server, ARC0]: FAL archive failed, see trace file.

 ARCH: FAL archive failed. Archiver continuing

 ORACLE Instance orcl1 - Archival Error. Archiver continuing.


分析:
   由于归档失败发生在ASM磁盘上,首先检查ASM磁盘空间以及DB_RECOVERY_FILE_DEST_SIZE,ASM磁盘空间是足够的,而且由于只有一个节点出现出现了无法归档的问题,也可以排除是空间不足造成的。确认两个节点的DB_RECOVERY_FILE_DEST_SIZE参数设置都是0,基本上可以判断问题和当前节点的ASM实例状态不正常有关。

检查ASM实例的错误信息:

 Fri Feb 28 19:41:23 2014

 Dumping diagnostic data in directory=[cdmp_20130702164115], requested by (instance=2, osid=2032294 (LMD0)), summary=[incident=165521].

 Fri Feb 28 19:49:19 2014

 Dumping diagnostic data in directory=[cdmp_20130702164845], requested by (instance=2, osid=2032294 (LMD0)), summary=[incident=165522].

 Fri Feb 28 19:55:56 2014

 Dumping diagnostic data in directory=[cdmp_20130702165517], requested by (instance=2, osid=2032294 (LMD0)), summary=[incident=165523].

当前节点ASM实例出现了的这个信息,说明报错发生在实例2上:

Fri Feb 28 18:34:25 2014

 Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_2032294.trc (incident=186256):

 ORA-04031: unable to allocate 3768 bytes of shared memory ("shared pool","unknown object","sga heap(1,0)","ges enqueues")

 Use ADRCI or Support Workbench to package the incident.

 See Note 411.1 at My Oracle Support for error and packaging details.

 Insufficient shared pool to allocate a GES object (ospid 2032294)

 Fri Feb 28 18:29:53 2014

 Sweep [inc][186256]: completed

 Fri Feb 28 18:36:49 2014

 Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_2032294.trc (incident=186257):

 ORA-04031: unable to allocate 3768 bytes of shared memory ("shared pool","unknown object","sga heap(1,0)","ges enqueues")

 Use ADRCI or Support Workbench to package the incident.

 See Note 411.1 at My Oracle Support for error and packaging details.

 Insufficient shared pool to allocate a GES object (ospid 2032294)


 果然实例2上的ASM出现了大量ORA-4031错误。检查ASM启动的参数配置:

Fri Feb 28 20:06:55 2012

 NOTE: No asm libraries found in the system

 ERROR: -5(Duplicate disk DATA_DG01:ASM_DISK1)

 ERROR: -5(Duplicate disk DATA_DG01:ASM_DISK2)

 MEMORY_TARGET defaulting to 411041792.

 * instance_number obtained from CSS = 2, checking for the existence of node 0... 

 * node 0 does not exist. instance_number = 2 

 Starting ORACLE instance (normal)

 LICENSE_MAX_SESSION = 0

 LICENSE_SESSIONS_WARNING = 0

 Private Interface en1 configured from GPnP for use as a private interconnect.

[name=en1, type=1, ip=169.254.78.6, mac=00-1a-64-bb-50-7d, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]

 Public Interface en0 configured from GPnP for use as a public interface.

[name=en0, type=1, ip=10.1.16.35, mac=00-1a-64-bb-50-7c, net=10.1.16.32/27, mask=255.255.255.224, use=public/1]

 Picked latch-free SCN scheme 3

 Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/11.2.0.3/grid/dbs/arch

 Autotune of undo retention is turned on. 

 LICENSE_MAX_USERS = 0

 SYS auditing is disabled

 NOTE: Volume support enabled

 Starting up:

 Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

 With the Real Application Clusters and Automatic Storage Management options.

 ORACLE_HOME = /u01/app/11.2.0.3/grid

 System name: AIX

 Node name: orcldb2

 Release: 1

 Version: 6

 Machine: 00C94E064C00

 Using parameter settings in server-side pfile /u01/app/11.2.0.3/grid/dbs/init+ASM2.ora

 System parameters with non-default values:

large_pool_size = 12M

instance_type = "asm"

remote_login_passwordfile= "EXCLUSIVE"

asm_diskstring = "/dev/ocr_*"

asm_diskstring = "/dev/voting_*"

asm_diskstring = "/dev/asm_*"

asm_diskgroups = "DATA"

asm_diskgroups = "DATA_DG01"

asm_diskgroups = "SPFILE_DG"

asm_power_limit = 1

diagnostic_dest = "/u01/app/grid"

 Cluster communication is configured to use the following interface(s) for this instance

169.254.78.6

 cluster interconnect IPC version:Oracle UDP/IP (generic)

 IPC Vendor 1 proto 2


调整及建议:
当前ASM实例使用默认的MEMORY_TARGET配置,分配大小大约是400M,根据Oracle的MOS文章:ASM Shared Pool (ORA-4031) [ID 437924.1],在11.2.0.3中,Oracle增加了ASM实例所允许的默认进程数PROCESSES,但是默认的MEMORY_TARGET参数没有增加。
根据Oracle的建议,11.2.0.3的MEMORY_TARGET至少应该设置到1536M,而MEMORY_MAX_TARGET设置为4096M。

SQL alter system set memory_max_target=4096m scope=spfile;

SQL alter system set memory_target=1536m scope=spfile;


对于当前的情况,如果短时间内无法重启DB和ASM实例,可以在问题节点配置一个本地归档路径,设置目标路径为本地磁盘,从而避免归档无法完成而导致的实例HANG死。


prudentwoo 10g/11g OCP 11g OCM,ITPUB和CSDN专家及专家讲师;有着多年数据库从业经验,资深Oracle数据库专家,现就职于北京海量数据技术股份有限公司担任高级dba职务,为央视,银行,电信等各行业及企业提供过技术支持服务