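After a fresh installation of Oracle 11g RAC, crsd failed to start on the second node for no obvious reason. The error reported was CRS-4535: Cannot communicate with Cluster Ready Services. The specific cause only becomes visible in the crsd.log log file.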
1. Environment

[root@linux2 ~]# cat /etc/issue
Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)
Kernel \r on an \m

[root@linux2 bin]# ./crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.1.0]

# Note: the steps below use the grid user and the root user for different objects.

2. Symptoms

[root@linux2 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services    #CRS-4535
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[root@linux2 bin]# ps -ef | grep d.bin    # crsd.bin is missing from the output below
root      3886     1  1 09:50 ?        00:00:11 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
grid      3938     1  0 09:51 ?        00:00:04 /u01/app/11.2.0/grid/bin/oraagent.bin
grid      4009     1  0 09:51 ?        00:00:00 /u01/app/11.2.0/grid/bin/gipcd.bin
grid      4014     1  0 09:51 ?        00:00:00 /u01/app/11.2.0/grid/bin/mdnsd.bin
grid      4028     1  0 09:51 ?        00:00:02 /u01/app/11.2.0/grid/bin/gpnpd.bin
root      4040     1  0 09:51 ?        00:00:03 /u01/app/11.2.0/grid/bin/cssdmonitor
root      4058     1  0 09:51 ?        00:00:04 /u01/app/11.2.0/grid/bin/cssdagent
root      4060     1  0 09:51 ?        00:00:00 /u01/app/11.2.0/grid/bin/orarootagent.bin
grid      4090     1  2 09:51 ?        00:00:15 /u01/app/11.2.0/grid/bin/ocssd.bin
grid      4094     1  0 09:51 ?        00:00:02 /u01/app/11.2.0/grid/bin/diskmon.bin -d -f
root      4928     1  0 09:51 ?        00:00:00 /u01/app/11.2.0/grid/bin/octssd.bin reboot
grid      4945     1  0 09:51 ?        00:00:02 /u01/app/11.2.0/grid/bin/evmd.bin
root      6514  5886  0 10:00 pts/1    00:00:00 grep d.bin

[root@linux2 bin]# ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       linux2                   Cluster Reconfiguration
ora.crsd
      1        ONLINE  OFFLINE                                                        # crsd is offline
ora.cssd
      1        ONLINE  ONLINE       linux2
ora.cssdmonitor
      1        ONLINE  ONLINE       linux2
ora.ctssd
      1        ONLINE  ONLINE       linux2                   OBSERVER
ora.diskmon
      1        ONLINE  ONLINE       linux2
ora.drivers.acfs
      1        ONLINE  OFFLINE                                                        # acfs is offline
ora.evmd
      1        ONLINE  ONLINE       linux2
ora.gipcd
      1        ONLINE  ONLINE       linux2
ora.gpnpd
      1        ONLINE  ONLINE       linux2
ora.mdnsd
      1        ONLINE  ONLINE       linux2

# Next, look at the crsd log file
[grid@linux2 ~]$ view $ORACLE_HOME/log/linux2/crsd/crsd.log
2013-01-05 10:28:27.107: [GIPCXCPT][1768145488] gipcShutdownF: skipping shutdown, count 1, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
2013-01-05 10:28:27.107: [ OCRASM][1768145488]proprasmo: Error in open/create file in dg [OCR_VOTE]    # failed to open the disk group
[ OCRASM][1768145488]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup    # an ORA error is raised
2013-01-05 10:28:27.107: [ OCRASM][1768145488]proprasmo: kgfoCheckMount returned [7]
2013-01-05 10:28:27.107: [ OCRASM][1768145488]proprasmo: The ASM instance is down    # the instance is down
2013-01-05 10:28:27.107: [ OCRRAW][1768145488]proprioo: Failed to open [+OCR_VOTE]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2013-01-05 10:28:27.107: [ OCRRAW][1768145488]proprioo: No OCR/OLR devices are usable    # no OCR/OLR device is usable
2013-01-05 10:28:27.107: [ OCRASM][1768145488]proprasmcl: asmhandle is NULL
2013-01-05 10:28:27.107: [ OCRRAW][1768145488]proprinit: Could not open raw device
2013-01-05 10:28:27.107: [ OCRASM][1768145488]proprasmcl: asmhandle is NULL
2013-01-05 10:28:27.107: [ OCRAPI][1768145488]a_init:16!: Backend init unsuccessful : [26]
2013-01-05 10:28:27.107: [ CRSOCR][1768145488] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup
] [7]
2013-01-05 10:28:27.107: [ CRSD][1768145488][PANIC] CRSD exiting: Could not init OCR, code: 26
2013-01-05 10:28:27.107: [ CRSD][1768145488] Done.

[root@linux2 bin]# ps -ef | grep pmon    # no pmon process, which also shows the ASM instance is not running
root      7447  7184  0 10:48 pts/2    00:00:00 grep pmon

# From the analysis above, crsd cannot start because the ASM instance is not running:
# the OCR is stored in the +OCR_VOTE disk group, and crsd cannot open it without ASM.

3. Resolution

[grid@linux2 ~]$ asmcmd
Connected to an idle instance.
ASMCMD> startup    # start the ASM instance
ASM instance started

Total System Global Area  283930624 bytes
Fixed Size                  2212656 bytes
Variable Size             256552144 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
ASMCMD> exit

#Author : Robinson
#Blog   : http://blog.csdn.net/robinson_0612

# Check the cluster resource status again
[root@linux2 bin]# ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       linux2                   Started
ora.crsd
      1        ONLINE  INTERMEDIATE linux2
ora.cssd
      1        ONLINE  ONLINE       linux2
ora.cssdmonitor
      1        ONLINE  ONLINE       linux2
ora.ctssd
      1        ONLINE  ONLINE       linux2                   OBSERVER
ora.diskmon
      1        ONLINE  ONLINE       linux2
ora.drivers.acfs
      1        ONLINE  OFFLINE
ora.evmd
      1        ONLINE  ONLINE       linux2
ora.gipcd
      1        ONLINE  ONLINE       linux2
ora.gpnpd
      1        ONLINE  ONLINE       linux2
ora.mdnsd
      1        ONLINE  ONLINE       linux2

# Start acfs
[root@linux2 bin]# ./crsctl start res ora.drivers.acfs -init
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'linux2'
CRS-2676: Start of 'ora.drivers.acfs' on 'linux2' succeeded

# After this, every resource is ONLINE
[root@linux2 bin]# ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       linux2                   Started
ora.crsd
      1        ONLINE  ONLINE       linux2
ora.cssd
      1        ONLINE  ONLINE       linux2
ora.cssdmonitor
      1        ONLINE  ONLINE       linux2
ora.ctssd
      1        ONLINE  ONLINE       linux2                   OBSERVER
ora.diskmon
      1        ONLINE  ONLINE       linux2
ora.drivers.acfs
      1        ONLINE  ONLINE       linux2
ora.evmd
      1        ONLINE  ONLINE       linux2
ora.gipcd
      1        ONLINE  ONLINE       linux2
ora.gpnpd
      1        ONLINE  ONLINE       linux2
ora.mdnsd
      1        ONLINE  ONLINE       linux2
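
The diagnosis in this case follows a simple pattern: CRS-4535 in the output of crsctl check crs, then ORA-15077 in crsd.log. The following is a minimal shell sketch of that check, assuming both outputs have already been saved to files. The function names (has_crs4535, asm_down_in_log, diagnose) and the messages they print are hypothetical helpers made up for illustration; they are not part of Oracle Clusterware, and the sketch only recognizes the one cause seen in this post.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not Oracle tooling.

# Reads saved `crsctl check crs` output on stdin.
# Succeeds (exit 0) if CRS-4535 is reported, i.e. crsd cannot be reached.
has_crs4535() {
    grep -q 'CRS-4535'
}

# Reads crsd.log text on stdin.
# Succeeds (exit 0) if the log contains ORA-15077, the error seen in this
# case when no ASM instance was serving the disk group that holds the OCR.
asm_down_in_log() {
    grep -q 'ORA-15077'
}

# diagnose CHECK_FILE LOG_FILE
#   CHECK_FILE: file holding the output of `crsctl check crs`
#   LOG_FILE:   path to crsd.log
# Prints a one-line summary (wording is illustrative).
diagnose() {
    check_file=$1
    log_file=$2
    if ! has_crs4535 < "$check_file"; then
        echo "crsd reachable"
    elif asm_down_in_log < "$log_file"; then
        echo "crsd down: ASM instance not running (ORA-15077)"
    else
        echo "crsd down: cause not identified in log"
    fi
}
```

One possible way to use it: as root, save the check output with ./crsctl check crs > /tmp/check.txt (the file name is arbitrary), then run diagnose /tmp/check.txt followed by the crsd.log path viewed in the symptoms section. If the summary points at ASM, starting the instance from asmcmd as the grid user is the fix that worked here; any other result needs a manual read of the log.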
Related My Oracle Support notes on Grid Infrastructure failures:
Troubleshooting CRSD Start up Issue [ID 1323698.1]
How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]