Voting Disk in Oracle RAC


The voting disk is a crucial component of an Oracle RAC database system; without it, the cluster cannot come up. The voting disk holds heartbeat information, and the cluster relies on two types of heartbeat: the network heartbeat and the disk heartbeat.

Disk Heartbeat: Every node must write its heartbeat to the voting disk, which lets CSSD confirm that the node is still alive in the cluster.

Network Heartbeat: Every node must send its heartbeat to the other nodes in the cluster over the private interconnect to tell them that it is alive.

          When the CSSD agent decides to evict a node from the cluster

If a node cannot write its heartbeat to the voting disk, CSSD checks whether the node can still send its network heartbeat to the other nodes in the cluster. If the node cannot send its network heartbeat either, CSSD evicts it from the cluster to prevent a split-brain condition.

                    What information a heartbeat contains

A heartbeat contains only a timestamp, nothing more.

                                What is a split-brain condition

A split-brain condition arises when the culprit node tries to form an independent cluster within the same cluster. If that happens, the database's data can be left in an inconsistent state, which breaks the ACID properties of the Oracle database. To prevent a split-brain condition, CSSD evicts the culprit node from the cluster.

                      Check network heartbeat threshold value

[oracle@oratest2 ~]$ crsctl get css misscount

CRS-4678: Successful get misscount 30 for Cluster Synchronization Services.

               Check disk heartbeat threshold value

[oracle@oratest2 ~]$ crsctl get css disktimeout

CRS-4678: Successful get disktimeout 200 for Cluster Synchronization Services.
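
When these thresholds are needed in a script, the number can be pulled out of the CRS-4678 message. The following is a minimal sketch that assumes the message format shown above; `css_value` is a hypothetical helper name, not part of crsctl, and the sample lines are copied from the output above so the sketch runs without a cluster. Both values are in seconds.

```shell
# Extract the numeric value from a CRS-4678 message read on stdin.
# css_value is a hypothetical helper, not a crsctl feature.
css_value() {
    sed -n 's/^CRS-4678: Successful get [a-z]* \([0-9][0-9]*\) for .*/\1/p'
}

# Sample messages copied from the crsctl output above.
misscount=$(echo "CRS-4678: Successful get misscount 30 for Cluster Synchronization Services." | css_value)
disktimeout=$(echo "CRS-4678: Successful get disktimeout 200 for Cluster Synchronization Services." | css_value)

echo "misscount=${misscount}s disktimeout=${disktimeout}s"
```

On a cluster node, the same helper could be fed directly, for example `crsctl get css misscount | css_value`.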

                      Validate votedisk information

[oracle@oratest2 ~]$ crsctl query css votedisk

##  STATE    File Universal Id                File Name Disk group

--  -----    -----------------                --------- ---------

 1. ONLINE   e991f1aa016f4fb7bf5dba810330c3dc (/dev/oracleasm/disks/ASMDISK01) [DATA]

 2. ONLINE   132427f49bc64f63bffcb6bc089813ac (/dev/oracleasm/disks/ASMDISK02) [DATA]

 3. ONLINE   0ac91e3859064f04bfe11ea1b8c78e60 (/dev/oracleasm/disks/ASMDISK03) [DATA]

Located 3 voting disk(s).
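
For monitoring, the number of ONLINE voting disks can be counted from this output. The sketch below assumes the column layout shown above (an index such as `1.` followed by the state); `count_online` is a hypothetical helper name, and the sample text is copied from the output above.

```shell
# Count rows whose first field is an index like "1." and whose state is ONLINE.
# count_online is a hypothetical helper; it reads votedisk output on stdin.
count_online() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Sample rows copied from the crsctl output above.
sample=' 1. ONLINE   e991f1aa016f4fb7bf5dba810330c3dc (/dev/oracleasm/disks/ASMDISK01) [DATA]
 2. ONLINE   132427f49bc64f63bffcb6bc089813ac (/dev/oracleasm/disks/ASMDISK02) [DATA]
 3. ONLINE   0ac91e3859064f04bfe11ea1b8c78e60 (/dev/oracleasm/disks/ASMDISK03) [DATA]
Located 3 voting disk(s).'

online=$(printf '%s\n' "$sample" | count_online)
echo "ONLINE voting disks: $online"
```

On a cluster node, the input would come from `crsctl query css votedisk | count_online` instead of the sample text.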

           Reason for configuring an odd number of voting disks

In Oracle RAC, every node in the cluster must be able to access more than half of the voting disks, that is, more than 50% of the total. To tolerate the failure of n voting disks, the cluster therefore needs 2n+1 voting disks, where n is the number of failures to tolerate. For example, to tolerate the failure of 3 voting disks, the total should be 2*3+1=7. The formula 2n+1 always yields an odd number.

NOTE: The ideal total number of voting disks is 3.
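
The 2n+1 rule and the majority requirement can be checked with a few lines of shell arithmetic. This is a small sketch; the function names are hypothetical and exist only for illustration. It also shows why an even count adds nothing: 4 disks tolerate no more failures than 3.

```shell
# Voting disks needed to tolerate n failures: 2n + 1.
disks_needed() {
    echo $(( 2 * $1 + 1 ))
}

# Largest number of failed disks that still leaves a strict majority of the total.
failures_tolerated() {
    echo $(( ($1 - 1) / 2 ))
}

# Succeeds when the accessible disks ($1) are more than half of the total ($2).
has_majority() {
    [ $(( 2 * $1 )) -gt "$2" ]
}

for n in 1 2 3; do
    echo "tolerate $n failure(s): $(disks_needed "$n") voting disks"
done
echo "4 disks tolerate $(failures_tolerated 4) failure(s), same as 3 disks"

if has_majority 2 3; then
    echo "2 of 3 accessible: the node keeps its majority"
fi
```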

         Probability formula to calculate the ideal number of voting disks

Voting disks   P(one disk fails)   P(all disks fail at once)   P(at least one disk available)   Marginal improvement
------------   -----------------   -------------------------   ------------------------------   --------------------
1              0.05                0.05                        0.95                             -
2              0.05                0.0025                      0.9975                           0.0475
3              0.05                0.000125                    0.999875                         0.002375
4              0.05                0.000006250                 0.999993750                      0.000118750
5              0.05                0.000000313                 0.999999688                      0.000005937
6              0.05                0.000000016                 0.999999984                      0.000000297
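
The figures above can be recomputed with awk. This sketch assumes that disk failures are independent and that every disk has the same failure probability p, which is what the table implies. `voting_disk_table` is a hypothetical helper name; its arguments are p and the largest disk count to print. The last printed digit may differ from the table because of floating-point rounding.

```shell
# Print: disks, P(all disks fail), P(at least one available), marginal improvement.
# Assumes independent failures with the same probability p for every disk.
voting_disk_table() {
    awk -v p="$1" -v max="$2" 'BEGIN {
        prev = 0
        for (k = 1; k <= max; k++) {
            all_fail = p ^ k          # every one of the k disks fails
            avail = 1 - all_fail      # at least one disk is still available
            if (k == 1) gain = "-"; else gain = sprintf("%.9f", avail - prev)
            printf "%d  %.9f  %.9f  %s\n", k, all_fail, avail, gain
            prev = avail
        }
    }'
}

voting_disk_table 0.05 6
```

The marginal improvement shrinks by roughly a factor of 20 with each extra disk, which is the argument for stopping at 3.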

 


