ASM diskgroup dismount with "Waited 15 secs for write IO to PST" (Doc ID 1581684.1)

Posted: 2023-09-27 14:29:32

APPLIES TO: Oracle Database - Enterprise Edition - Version 11.2.0.3 to 12.1.0.1 [Release 11.2 to 12.1]
Information in this document applies to any platform.

SYMPTOMS

A normal or high redundancy diskgroup is dismounted with WARNING messages like the following.

//ASM alert.log

Mon Jul 01 09:10:47 2013
WARNING: Waited 15 secs for write IO to PST disk 1 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 4 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 4 in group 6.
....
GMON dismounting group 6 at 72 for pid 44, osid 8782162
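
To see which disks are affected, the warnings can be tallied per diskgroup and disk number. The following is a minimal sketch, assuming the alert log lines have exactly the format shown in the excerpt above; the function name `count_pst_waits` and the regular expression are illustrative and not part of any Oracle tool.

```python
import re
from collections import Counter

# Pattern for the warning text shown in the alert log excerpt above.
# Assumption: real alert logs use exactly this wording.
PST_WAIT = re.compile(
    r"WARNING: Waited (\d+) secs for write IO to PST disk (\d+) in group (\d+)"
)


def count_pst_waits(lines):
    """Count PST write-IO wait warnings per (group, disk) pair."""
    counts = Counter()
    for line in lines:
        match = PST_WAIT.search(line)
        if match:
            _secs, disk, group = match.groups()
            counts[(int(group), int(disk))] += 1
    return counts


# Sample taken from the excerpt in this note.
sample = """\
Mon Jul 01 09:10:47 2013
WARNING: Waited 15 secs for write IO to PST disk 1 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 4 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 4 in group 6.
GMON dismounting group 6 at 72 for pid 44, osid 8782162
""".splitlines()

for (group, disk), n in sorted(count_pst_waits(sample).items()):
    print(f"group {group} disk {disk}: {n} warnings")
```

For the sample lines, this reports two warnings each for disks 1 and 4 of group 6, which points to the disks whose paths should be checked first.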

CAUSE

These messages generally appear in the ASM alert log in the following situation:

ASM PST heartbeats to the ASM disks of a normal or high redundancy diskgroup are delayed beyond the heartbeat timeout, so the ASM instance dismounts the diskgroup. The default timeout is 15 seconds.

For an external redundancy diskgroup, heartbeat delays are largely ignored. The ASM instance stops issuing further PST heartbeats until PST revalidation succeeds, but the heartbeat delays do not directly dismount an external redundancy diskgroup.

An ASM disk can become unresponsive, typically in the following scenarios:

+   Some of the physical paths of the multipath device are offline or lost
+   During path failover in a multipath setup
+   Server load, or any kind of storage/multipath/OS maintenance

Doc ID 10109915.8 describes Bug 10109915, whose fix introduced the underscore parameter _asm_hbeatiowait. The underlying issue is that there is no OS/storage tunable timeout mechanism in the case of a hung NFS server/filer; _asm_hbeatiowait allows that timeout to be set.

SOLUTION

1]   Check with the OS and storage administrators whether the disks have been unresponsive.

2]   If possible, keep disk unresponsiveness below 15 seconds.

This will depend on various factors, such as:
+   Operating system
+   Presence of multipath (and the multipath type)
+   Any kernel parameters

So you need to find out the maximum possible disk unresponsiveness for your setup.

For example, on AIX the rw_timeout setting affects this and defaults to 30 seconds.

Another example is Linux with native multipathing. In such a setup, the number of physical paths and the polling_interval value in the multipath.conf file dictate the maximum disk unresponsiveness.

So you need to determine this value for your particular setup (combination of OS / multipath / storage).
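
Once a worst-case figure is known, the comparison against the heartbeat wait is simple. The sketch below is illustrative only: the function name `exceeds_heartbeat_wait` is hypothetical, and treating the AIX rw_timeout default of 30 seconds as the worst-case stall is an assumption, since this note only says that rw_timeout affects the figure.

```python
# Default PST heartbeat wait named in this note, in seconds.
DEFAULT_HBEAT_WAIT_SECS = 15


def exceeds_heartbeat_wait(max_unresponsive_secs,
                           hbeat_wait_secs=DEFAULT_HBEAT_WAIT_SECS):
    """Return True if the worst-case disk stall is not below the heartbeat wait.

    The note asks for disk unresponsiveness to stay *below* the wait,
    so a stall equal to the wait is also treated as a problem.
    """
    return max_unresponsive_secs >= hbeat_wait_secs


# Assumption: the AIX rw_timeout default (30 s) is used as the worst-case stall.
print(exceeds_heartbeat_wait(30))        # compared with the 15 s default
print(exceeds_heartbeat_wait(30, 120))   # compared with a raised wait of 120 s
```

If the first check returns True and the stall cannot be reduced at the OS, multipath, or storage level, the heartbeat wait itself has to be raised.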

3]   If you cannot keep disk unresponsiveness below 15 seconds, then the parameter below can be set in the ASM instance (on all nodes of a RAC cluster):

   _asm_hbeatiowait

As per internal bug 17274537, internal testing indicates that the value should be increased to 120 seconds; the same is to be fixed in 12.2.

Run the following in the ASM instance to set the desired value for _asm_hbeatiowait:

alter system set "_asm_hbeatiowait"=<value> scope=spfile sid='*';

Then restart the ASM instance / CRS for the new parameter value to take effect.
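
When the statement is generated from a script, it helps to validate the value before it reaches the ASM instance. This is a minimal sketch: the helper name `build_hbeatiowait_statement` is hypothetical, it only builds the statement text and does not connect to ASM, and it writes the SID wildcard as a quoted '*', which is the form ALTER SYSTEM expects.

```python
def build_hbeatiowait_statement(value_secs):
    """Build the ALTER SYSTEM text for _asm_hbeatiowait from this note.

    Only positive integers are accepted; this check is an illustrative
    safeguard, not a limit documented by Oracle.
    """
    if isinstance(value_secs, bool) or not isinstance(value_secs, int):
        raise ValueError("value_secs must be an integer number of seconds")
    if value_secs <= 0:
        raise ValueError("value_secs must be positive")
    return (
        'alter system set "_asm_hbeatiowait"={} '
        "scope=spfile sid='*';".format(value_secs)
    )


# 120 seconds is the value suggested in this note (internal bug 17274537).
print(build_hbeatiowait_statement(120))
```

The printed statement still has to be run in the ASM instance, followed by the restart described above.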

REFERENCES

BUG:17043894 - DISKGROUP DISMOUNTS IF 2 OUT OF 8 PATHS LOST
BUG:10109915 - ASM HANGS IN HIGH REDUNDANCY CONFIG IF 1 OF 5 DISKS GOES OFFLINE
NOTE:1910315.1 - How to Create a Normal Redundancy Diskgroup Best Practices
