RAC 11g R2: CRS-5008: Invalid attribute value

Hi folks, I recently had to change the public IP addresses of a RAC cluster. All went well, except that the VIPs and the network resource stopped working. Here is what I got from crsctl status:

[root@rac2 ~]# /u01/app/11.2.0/grid/bin/crsctl status res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
 ONLINE ONLINE rac1
 ONLINE ONLINE rac2
 ONLINE ONLINE rac3
ora.DATA_01.dg
 ONLINE ONLINE rac1
 ONLINE ONLINE rac2
 ONLINE ONLINE rac3
ora.DATA_02.dg
 OFFLINE OFFLINE rac1
 OFFLINE OFFLINE rac2
 OFFLINE OFFLINE rac3
ora.LISTENER.lsnr
 ONLINE ONLINE rac1
 ONLINE ONLINE rac2
 ONLINE ONLINE rac3
ora.asm
 ONLINE ONLINE rac1 Started
 ONLINE ONLINE rac2 Started
 ONLINE ONLINE rac3 Started
ora.eons
 ONLINE ONLINE rac1
 ONLINE ONLINE rac2
 ONLINE ONLINE rac3
ora.gsd
 ONLINE ONLINE rac1
 ONLINE ONLINE rac2
 ONLINE ONLINE rac3
ora.net1.network
 ONLINE OFFLINE rac1
 ONLINE OFFLINE rac2
 ONLINE OFFLINE rac3
ora.ons
 ONLINE ONLINE rac1
 ONLINE ONLINE rac2
 ONLINE ONLINE rac3
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
 1 ONLINE ONLINE rac2
ora.LISTENER_SCAN2.lsnr
 1 ONLINE ONLINE rac2
ora.LISTENER_SCAN3.lsnr
 1 ONLINE ONLINE rac2
ora.rac1.vip
 1 ONLINE OFFLINE
ora.rac2.vip
 1 ONLINE OFFLINE
ora.rac3.vip
 1 ONLINE OFFLINE
ora.scan1.vip
 1 ONLINE OFFLINE
ora.scan2.vip
 1 ONLINE OFFLINE
ora.scan3.vip
 1 ONLINE OFFLINE
ora.tsp.db
 1 OFFLINE OFFLINE
 2 OFFLINE OFFLINE
 3 OFFLINE OFFLINE
ora.tsp.orderentryworkload.svc
 1 OFFLINE OFFLINE
 2 OFFLINE OFFLINE
 3 OFFLINE OFFLINE

Trying to start them manually failed…

[oracle@rac2 ~]$ srvctl start nodeapps
PRKO-2419 : GSD is already started on node(s): rac1,rac2,rac3
PRCR-1079 : Failed to start resource ora.net1.network
CRS-2674: Start of 'ora.net1.network' on 'rac3' failed
CRS-2674: Start of 'ora.net1.network' on 'rac1' failed
CRS-2674: Start of 'ora.net1.network' on 'rac2' failed
PRCR-1079 : Failed to start resource ora.rac1.vip
CRS-2674: Start of 'ora.net1.network' on 'rac1' failed
CRS-2674: Start of 'ora.net1.network' on 'rac2' failed
CRS-2674: Start of 'ora.net1.network' on 'rac3' failed
CRS-2632: There are no more servers to try to place resource 'ora.rac1.vip' on that would satisfy its placement policy
PRCR-1079 : Failed to start resource ora.rac2.vip
CRS-2674: Start of 'ora.net1.network' on 'rac2' failed
CRS-2674: Start of 'ora.net1.network' on 'rac3' failed
CRS-2674: Start of 'ora.net1.network' on 'rac1' failed
CRS-2632: There are no more servers to try to place resource 'ora.rac2.vip' on that would satisfy its placement policy
PRCR-1079 : Failed to start resource ora.rac3.vip
CRS-2674: Start of 'ora.net1.network' on 'rac3' failed
CRS-2674: Start of 'ora.net1.network' on 'rac2' failed
CRS-2674: Start of 'ora.net1.network' on 'rac1' failed
CRS-2632: There are no more servers to try to place resource 'ora.rac3.vip' on that would satisfy its placement policy
PRKO-2422 : ONS is already started on node(s): rac1,rac2,rac3
PRKO-2423 : eONS is already started on node(s): rac1,rac2,rac3

To troubleshoot, I first double-checked oifcfg…

[root@rac2 ~]# /u01/app/11.2.0/grid/bin/oifcfg iflist
eth0 144.180.76.0
eth1 9.11.3.0
[root@rac2 ~]# /u01/app/11.2.0/grid/bin/oifcfg getif
eth0 144.180.76.0 global public
eth1 9.11.3.0 global cluster_interconnect

Then I moved on to test the DNS for both name resolution and reverse lookups…

[root@rac2 ~]# nslookup rac-scan
Server: 144.180.76.91
Address: 144.180.76.91#53

Name: rac-scan.localdomain
Address: 144.180.76.121
Name: rac-scan.localdomain
Address: 144.180.76.122
Name: rac-scan.localdomain
Address: 144.180.76.123

[root@rac2 ~]# nslookup 144.180.76.111
Server: 144.180.76.91
Address: 144.180.76.91#53

111.76.180.144.in-addr.arpa name = rac1-vip.localdomain.

Time to look at the logs, where I finally found a hint of what the problem might be: “CRS-5008: Invalid attribute value: eth0”

[root@rac2 ~]# tail -n 15 /u01/app/11.2.0/grid/log/rac2/agent/crsd/orarootagent_root/orarootagent_root.log
2016-09-11 21:09:18.141: [ora.net1.network][1324865856] [check] NetworkAgent::checkInterface returned false
2016-09-11 21:09:18.141: [ora.net1.network][1324865856] [check] NetInterface::checkLinkStatus error 0
2016-09-11 21:09:18.141: [ora.net1.network][1324865856] [check] NetInterface::checkLinkStatus error 0
2016-09-11 21:09:18.145: [ora.net1.network][1324865856] [check] NetworkAgent::checkLink returned false
2016-09-11 21:09:18.183: [ AGFW][1324865856] check for resource: ora.net1.network rac2 1 completed with status: OFFLINE
2016-09-11 21:09:18.183: [ AGFW][1324865856] Executing command: check for resource: ora.net1.network rac2 1
2016-09-11 21:09:18.183: [ora.net1.network][1324865856] [check] NetworkAgent::init enter {
2016-09-11 21:09:18.183: [ora.net1.network][1324865856] [check] Checking if eth0 Interface is fine
2016-09-11 21:09:18.185: [ AGFW][1358424384] CHECK initiated by timer for: ora.net1.network rac2 1
2016-09-11 21:09:18.188: [ora.net1.network][1324865856] [check] ifname=eth0
2016-09-11 21:09:18.189: [ora.net1.network][1324865856] [check] subnetmask=255.255.255.0
2016-09-11 21:09:18.189: [ora.net1.network][1324865856] [check] subnetnumber=144.180.76.0
2016-09-11 21:09:18.189: [ora.net1.network][1324865856] [check] CRS-5008: Invalid attribute value: eth0 for the network interface
2016-09-11 21:09:18.189: [ora.net1.network][1324865856] [check] NetworkAgent::init exit }
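
Because the agent repeats its check over and over, the same error shows up many times in the log. The following is a minimal shell sketch for pulling the distinct CRS messages out of an agent log so the repeats collapse into one line each. The function name crs_errors is made up for illustration, and the pattern assumes the log line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read an agent log on stdin and print each distinct
# "CRS-nnnn: ..." message once, dropping the timestamp and the [tag] prefixes.
# Assumes each message follows the last "] " on its line, as in the excerpt above.
crs_errors() {
    grep -E 'CRS-[0-9]+:' \
        | sed -E 's/^.*\] (CRS-[0-9]+: .*)$/\1/' \
        | sort -u
}

# Example invocation (path taken from the post):
#   crs_errors < /u01/app/11.2.0/grid/log/rac2/agent/crsd/orarootagent_root/orarootagent_root.log
```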

Checking the profile of resource ora.net1.network, I found something very strange: the SUBNET was incorrect…

[root@rac2 ~]# /u01/app/11.2.0/grid/bin/crsctl status resource ora.net1.network -p
NAME=ora.net1.network
TYPE=ora.network.type
ACL=owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:oracle:r-x
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=restore
CHECK_INTERVAL=1
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=Oracle Network resource
ENABLED=1
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=60
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
START_DEPENDENCIES=
START_TIMEOUT=0
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=
STOP_TIMEOUT=0
UPTIME_THRESHOLD=1d
USR_ORA_AUTO=
USR_ORA_ENV=
USR_ORA_IF=eth0
USR_ORA_NETMASK=255.255.255.0
USR_ORA_SUBNET=192.168.1.0
VERSION=11.2.0.1.0
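
A mismatch like this can be spotted by comparing the public subnet reported by oifcfg getif with USR_ORA_SUBNET in the resource profile. The following is a minimal shell sketch, not something that was run during the original troubleshooting. The function names public_subnet, resource_subnet and subnets_match are made up for illustration, and the parsing assumes the output layouts shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: read `oifcfg getif` output on stdin and print the
# subnet of the first interface whose last column is "public".
public_subnet() {
    awk '$NF == "public" { print $2; exit }'
}

# Hypothetical helper: read `crsctl status resource <name> -p` output on
# stdin and print the value of USR_ORA_SUBNET.
resource_subnet() {
    awk -F= '$1 == "USR_ORA_SUBNET" { print $2; exit }'
}

# Succeeds only if both values are non-empty and identical.
subnets_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example invocation (GRID_HOME value taken from the post):
#   GRID_HOME=/u01/app/11.2.0/grid
#   want=$("$GRID_HOME/bin/oifcfg" getif | public_subnet)
#   have=$("$GRID_HOME/bin/crsctl" status resource ora.net1.network -p | resource_subnet)
#   subnets_match "$want" "$have" || echo "subnet mismatch: oifcfg=$want resource=$have"
```

With the outputs shown in this post, the first helper yields 144.180.76.0 and the second 192.168.1.0, so the comparison fails.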

I then modified the attribute USR_ORA_SUBNET to match the actual subnet and checked the profile again:

[root@rac2 ~]# /u01/app/11.2.0/grid/bin/crsctl modify resource ora.net1.network -attr "USR_ORA_SUBNET=144.180.76.0"
[root@rac2 ~]# /u01/app/11.2.0/grid/bin/crsctl status resource ora.net1.network -p
NAME=ora.net1.network
TYPE=ora.network.type
ACL=owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:oracle:r-x
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=restore
CHECK_INTERVAL=1
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=Oracle Network resource
ENABLED=1
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=60
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
START_DEPENDENCIES=
START_TIMEOUT=0
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=
STOP_TIMEOUT=0
UPTIME_THRESHOLD=1d
USR_ORA_AUTO=
USR_ORA_ENV=
USR_ORA_IF=eth0
USR_ORA_NETMASK=255.255.255.0
USR_ORA_SUBNET=144.180.76.0
VERSION=11.2.0.1.0
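
To double-check the value given to USR_ORA_SUBNET, the network address can be derived from any address on the public network and its netmask by ANDing them octet by octet. The following is a small sketch in plain POSIX shell. The function name network_of is made up for illustration, and it assumes well-formed dotted-quad IPv4 input.

```shell
#!/bin/sh
# Hypothetical helper: print the network address for an IPv4 address and
# netmask, both given as dotted quads, e.g. network_of 144.180.76.111 255.255.255.0
network_of() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both values on dots: $1..$4 are the address octets, $5..$8 the mask octets.
    set -- $ip $mask
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}
```

For the rac1-vip address from the nslookup output above and the netmask in the resource profile, network_of 144.180.76.111 255.255.255.0 gives 144.180.76.0, which is the value set here.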

To my surprise, when I went to bring the resources up, everything was already running fine. Looking more closely, there are two attributes that probably took care of it:
OFFLINE_CHECK_INTERVAL=60
RESTART_ATTEMPTS=5

My conclusion is that it reached the 60-second offline check interval, checked again whether the resource was offline, and tried to restart it (it would keep retrying in that loop until the problem was solved). I didn’t test it though, so it is just a guess… feel free to test it 🙂

In summary: the root cause was the incorrect value of the attribute USR_ORA_SUBNET on resource ora.net1.network. After adjusting it, everything got back to normal.

About Bruno Carvalho

Coffee addicted tech guy.

3 Responses to RAC 11g R2: CRS-5008: Invalid attribute value

  1. I don’t know if I understood. Did OIFCFG assume a default value for the subnet?
    Nice example, btw.
