Note A: Disk scrubbing works at the disk-sector level, not the database-block level. A database block spans
multiple sectors: the latest HDDs use 4K sectors, whereas earlier drives used 512-byte sectors.
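As a quick illustration of Note A, the sketch below counts how many sectors a single database block occupies. The 8 KB block size is an assumption (a common Oracle default, not something stated above), and sectors_per_block is a hypothetical helper name.

```python
def sectors_per_block(block_bytes, sector_bytes):
    """Number of whole sectors needed to hold one block (ceiling division)."""
    return -(-block_bytes // sector_bytes)

BLOCK_BYTES = 8192  # assumed database block size

print(sectors_per_block(BLOCK_BYTES, 512))   # 16 sectors on a 512-byte-sector drive
print(sectors_per_block(BLOCK_BYTES, 4096))  # 2 sectors on a 4K-sector drive
```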
Note B: Flash disks do not need scrubbing. Scrubbing is only necessary on cells with spinning disks
(High Capacity or older High Performance models); there is no need to
configure it on EF (Extreme Flash) disks or cells.
Types Of Corruption
Before looking at how disk scrubbing checks and repairs bad sectors,
we need to understand the types of corruption that can occur.
Corruption takes many forms, but it is broadly categorized into two types:
logical corruption and physical corruption.
Logical Corruption: Logical corruption is corruption in which the data
is technically valid but makes no sense in the real world. For example,
suppose an order table holds customer order details along with a
country name, and someone updates the country name to “BRAZIL”
without a WHERE clause and commits the change. The data is still technically
valid, but the order table no longer makes sense in the real
world. This type of corruption can be resolved with flashback technology
such as Flashback Database or Flashback Table.
Physical Corruption: Physical corruption is harder to catch
because it surfaces only when a query physically tries to read the corrupted block.
In a real-world environment, hard disk drives (HDDs) age over time, and
some data is accessed less and less frequently (depending on
its use and purpose), so a bad sector can sit undetected for a long time. To prevent this
scenario, Oracle Exadata provides disk scrubbing, which relies on the
error correction code (ECC) of the disks in the cell server.
How & When Disk Scrubbing Works
The Exadata system handles disk scrubbing automatically.
It runs only when disk utilization on the storage server is
below 25%, so that it does not impact database performance. The feature
relies on the error correction code (ECC) of the HDDs.
How Disk Scrubbing Differs from Backup in Checking for Corrupted Blocks
In a real-world setup
we use block change tracking and RMAN unused block compression, which means a backup
does not read blocks that have not been modified since the last backup and skips
empty blocks altogether. In addition, backup reads go only to the
primary copy. But what happens if the bad sector lies in the secondary or tertiary
copy when we use high redundancy, which Exadata recommends to support
Maximum Availability Architecture (MAA)? A bad sector can remain
undetected on a mirror disk unless all disks are scrubbed.
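The coverage gap described above can be illustrated with a small Python sketch. It is a toy model, not Oracle code: the block numbers, the changed and empty sets, and the function names backup_reads and scrub_reads are all assumptions made for illustration.

```python
# Toy model: which (copy, block) pairs are read by a backup vs. a full scrub.
# Names and numbers are illustrative assumptions, not Exadata internals.

def backup_reads(blocks, changed, empty):
    """An incremental backup reads only changed, non-empty blocks on the primary copy."""
    return {("primary", b) for b in blocks if b in changed and b not in empty}

def scrub_reads(blocks, copies):
    """A scrub reads every block on every mirror copy."""
    return {(c, b) for c in copies for b in blocks}

blocks = range(8)
copies = ["primary", "secondary", "tertiary"]  # high redundancy keeps three copies
changed = {1, 2, 5}                            # modified since the last backup
empty = {7}                                    # never used

bad_sector = ("secondary", 4)                  # latent error on a mirror copy

print(bad_sector in backup_reads(blocks, changed, empty))  # False: the backup never reads it
print(bad_sector in scrub_reads(blocks, copies))           # True: the scrub reads it
```

In this model the backup touches 3 of the 24 block copies, which is why a latent error on a mirror can go unnoticed until that mirror is actually needed.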
How Disk Scrubbing Repairs Corrupted Blocks
Disk scrubbing mainly targets sectors that have not been read
recently. Data that is accessed less and less frequently needs to be checked so
that corruption is found early and repaired proactively.
If disk scrubbing detects corruption in a disk block,
it asks ASM to repair the bad sector from one or more mirror copies
stored on other storage servers. That is why multiple mirrors are
essential.
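The detect-then-repair flow can be sketched in a few lines of Python. This is a simplified simulation under stated assumptions: CRC32 stands in for the drive's ECC check, the lists disk and mirror stand in for two copies of the same data, and scrub is a hypothetical function, not the actual Exadata or ASM implementation.

```python
import zlib

def checksum(data: bytes) -> int:
    # CRC32 is only a stand-in for the ECC check done by the disk.
    return zlib.crc32(data)

def scrub(disk, sums, mirror):
    """Check every sector; rewrite any sector that fails its check from the mirror."""
    repaired = []
    for i, data in enumerate(disk):
        if checksum(data) != sums[i]:
            disk[i] = mirror[i]   # repair from the good mirror copy
            repaired.append(i)
    return repaired

good = [b"sector-%d" % i for i in range(4)]
sums = [checksum(s) for s in good]
disk = list(good)
mirror = list(good)
disk[2] = b"garbage!"             # simulate a bad sector

print(scrub(disk, sums, mirror))  # [2]
print(disk == good)               # True
```

The sketch assumes the mirror copy is healthy. With only one copy there would be nothing to repair from, which is the point the text makes about multiple mirrors.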
Scrubbing on Exadata differs from the scrubbing performed by ASM in that the sector being scrubbed
(checked for errors) does not leave the Exadata storage server, which eliminates
unnecessary network traffic and avoids CPU consumption on the database servers.
How Disk Scrubbing Doesn’t Impact Database Performance
Disk scrubbing in Exadata runs only when storage server
disk utilization is below 25%. Suppose Exadata starts scrubbing because
utilization is below 25%, and then utilization suddenly
rises because the database workload increases. In that case Exadata
automatically stops the scrubbing activity to prevent any impact on database
performance.
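The start/stop behaviour can be pictured with a minimal Python sketch. The 25% threshold is the figure quoted above; the utilization samples and the function name scrub_schedule are made up for illustration and do not reflect how the cell software actually samples I/O.

```python
IDLE_THRESHOLD = 25.0  # percent, the figure quoted in the text

def scrub_schedule(utilization_samples, threshold=IDLE_THRESHOLD):
    """For each utilization sample, report whether scrubbing would be allowed to run."""
    return [u < threshold for u in utilization_samples]

# Hypothetical utilization readings over time (percent).
samples = [10.0, 18.5, 24.9, 40.0, 72.3, 20.0]

for u, running in zip(samples, scrub_schedule(samples)):
    print(f"{u:5.1f}% -> {'scrub' if running else 'paused'}")
```

Scrubbing pauses as soon as a sample reaches the threshold and resumes once utilization drops back below it.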
How To Check If Disk Scrubbing Is in Progress
- We can check the cell alert log
Information About Disk Scrub in Alert log
=>Scrubbing started:
“Begin scrubbing CellDisk: CD_07_exatestcel12” in cell alert.log
=>Scrubbing completed:
“Finished scrubbing CellDisk: CD_07_exatestcel12, scrubbed blocks (1MB): 1488, found bad blocks: 0” in cell alert.log
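If you want to pull these messages out of the alert log programmatically, a sketch along the following lines may help. The regular expressions are derived only from the two sample messages above, so treat them as an assumption about the format; parse_scrub_events is a hypothetical helper.

```python
import re

# Patterns inferred from the sample messages; real logs may differ in spacing.
BEGIN = re.compile(r"Begin scrubbing CellDisk:\s*(\S+)")
FINISH = re.compile(
    r"Finished scrubbing CellDisk:\s*(\S+?),\s*"
    r"scrubbed blocks \(1MB\):\s*(\d+),\s*found bad blocks:\s*(\d+)"
)

def parse_scrub_events(lines):
    """Return (event, cell_disk, scrubbed_blocks, bad_blocks) tuples."""
    events = []
    for line in lines:
        m = FINISH.search(line)
        if m:
            events.append(("finished", m.group(1), int(m.group(2)), int(m.group(3))))
            continue
        m = BEGIN.search(line)
        if m:
            events.append(("begin", m.group(1), None, None))
    return events

sample = [
    "Begin scrubbing CellDisk: CD_07_exatestcel12",
    "Finished scrubbing CellDisk: CD_07_exatestcel12, "
    "scrubbed blocks (1MB): 1488, found bad blocks: 0",
]
for event in parse_scrub_events(sample):
    print(event)
```

A nonzero bad-block count in a "finished" event is the case worth alerting on.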
- We can also see disk scrubbing in the AWR report (Top IO Reasons by Request); however, pulling an AWR report every hour when there is no issue to investigate is not a practical way to check
- We can use the cellcli utility to check if a disk scrubbing operation is in progress
[root@exatest ~]# ssh exatestcel12
[root@exatestcel12 ~]# cellcli
CellCLI: Release 23.1.8.0.0 - Production on Fri Jul 14 05:11:26 UTC 2023
Copyright (c) 2007, 2023, Oracle and/or its affiliates.
CellCLI> list metriccurrent where name = 'CD_IO_BY_R_SCRUB_SEC' and metricObjectName like 'CD.*'
         CD_IO_BY_R_SCRUB_SEC    CD_00_exatestcel12    115 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_01_exatestcel12    118 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_02_exatestcel12    117 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_03_exatestcel12    113 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_04_exatestcel12    114 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_05_exatestcel12    119 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_06_exatestcel12    112 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_07_exatestcel12    120 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_08_exatestcel12    116 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_09_exatestcel12    115 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_10_exatestcel12    116 MB/sec
         CD_IO_BY_R_SCRUB_SEC    CD_11_exatestcel12    113 MB/sec
Note: If the output shows a nonzero MB/sec value for any cell disk
(rather than 0 or no rows), disk scrubbing is in progress.
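The rule in the note can be applied to captured output with a short Python sketch. It assumes the output has been saved as text with one metric per line in the order metric name, cell disk, value, unit (for example from cellcli -e run over ssh); scrubbing_disks is a hypothetical helper.

```python
def scrubbing_disks(cellcli_output):
    """Return {cell_disk: MB/sec} for cell disks with a nonzero scrub read rate."""
    active = {}
    for line in cellcli_output.splitlines():
        parts = line.split()
        # Expected shape: METRIC_NAME  CELL_DISK  VALUE  MB/sec
        if len(parts) >= 4 and parts[0] == "CD_IO_BY_R_SCRUB_SEC":
            rate = float(parts[2].replace(",", ""))
            if rate > 0:
                active[parts[1]] = rate
    return active

sample = """\
CD_IO_BY_R_SCRUB_SEC  CD_00_exatestcel12  115 MB/sec
CD_IO_BY_R_SCRUB_SEC  CD_01_exatestcel12  0.000 MB/sec
"""

active = scrubbing_disks(sample)
print("scrubbing in progress" if active else "no scrubbing", active)
```

An empty result covers both the "0 MB/sec" and the "no rows" cases.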
Can We Disable Disk Scrubbing
Yes, although it is not recommended, since it can lead to
harmful scenarios in the long run. Keeping scrubbing active is good for
the health of the database and the system.
How To Check Disk Scrub Activity Is Enabled on Exadata
Log in to the compute node -> log in to the cell server -> start the cellcli utility
[root@exatest ~]# ssh exatestcel12
[root@exatestcel12 ~]# cellcli
CellCLI: Release 23.1.8.0.0 - Production on Fri Jul 14 05:11:35 UTC 2023
Copyright (c) 2007, 2023, Oracle and/or its affiliates.
CellCLI> list cell attributes name,hardDiskScrubInterval
         exatestcel12    biweekly
How To Check HardDisk Scrub Time on Exadata Cell Node
[root@exatest ~]# ssh exatestcel12
[root@exatestcel12 ~]# cellcli
CellCLI: Release 23.1.8.0.0 - Production on Fri Jul 14 05:12:05 UTC 2023
Copyright (c) 2007, 2023, Oracle and/or its affiliates.
CellCLI> list cell attributes name,hardDiskScrubInterval
         exatestcel12    biweekly
To disable disk scrubbing, set the interval to none:
[root@exatest ~]# ssh exatestcel12
[root@exatestcel12 ~]# cellcli
CellCLI: Release 23.1.8.0.0 - Production on Fri Jul 14 05:14:13 UTC 2023
Copyright (c) 2007, 2023, Oracle and/or its affiliates.
CellCLI> alter cell HardDiskScrubInterval=none
How To Change Hard Disk Scrub Frequency on Exadata Cell Node
[root@exatestcel12 ~]# cellcli
CellCLI: Release 23.1.8.0.0 - Production on Fri Jul 14 05:15:20 UTC 2023
Copyright (c) 2007, 2023, Oracle and/or its affiliates.
CellCLI> alter cell HardDiskScrubInterval=daily
CellCLI> alter cell HardDiskScrubInterval=weekly
CellCLI> alter cell HardDiskScrubInterval=biweekly
Note: biweekly is the default value.
How To Schedule Hard Disk Scrub Activity
[root@exatestcel12 ~]# cellcli
CellCLI: Release 23.1.8.0.0 - Production on Fri Jul 14 05:16:37 UTC 2023
Copyright (c) 2007, 2023, Oracle and/or its affiliates.
CellCLI> alter cell HardDiskScrubStartTime='2023-08-26T22:00:00-04:00'
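The start time is easy to get wrong by hand (year-month-day order, plus a UTC offset), so it can help to generate the string. The sketch below builds the command text with Python's datetime module; scrub_start_command is a hypothetical helper, and it only formats the string, it does not run anything on the cell.

```python
from datetime import datetime, timedelta, timezone

def scrub_start_command(when: datetime) -> str:
    """Format an 'alter cell' command with an ISO 8601 timestamp (YYYY-MM-DDThh:mm:ss+offset)."""
    if when.tzinfo is None:
        raise ValueError("timestamp needs a UTC offset")
    stamp = when.isoformat(timespec="seconds")
    return f"alter cell HardDiskScrubStartTime='{stamp}'"

tz = timezone(timedelta(hours=-4))  # assumed local offset, matching the -04:00 in the example
print(scrub_start_command(datetime(2023, 8, 26, 22, 0, 0, tzinfo=tz)))
# alter cell HardDiskScrubStartTime='2023-08-26T22:00:00-04:00'
```

Because datetime rejects impossible dates such as month 26, a swapped day and month is caught before the command ever reaches CellCLI.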