Standard Edition High Availability – First Impressions

2020-05-11 Off By Markus Flechtner

Last year Oracle caused a great outcry among the fans of the Standard Edition 2 (SE2) of the Oracle Database when they announced that Real Application Cluster (RAC)
is not available for SE2 in the release 19c anymore. The “announcement” was a little bit hidden in the licensing information and the information came at very short notice. No deprecation phase, as we are used to. After that, a lot of articles were written and presentation were held about the alternatives: moving to the Enterprise Edition of the Oracle Database, setting up a failover database or moving from the cluster database to a standby solution for the SE2. But about one year later, Oracle announced the availability of “Standard Edition High Availability”. Let’s have a look at his new feature.

“Standard Edition High Availability” is a cold-failover solution for the Oracle Database SE2. It was released with the Release Update 19.7 of the Oracle Database in April 2020 and is currently available for Linux, Microsoft Windows and Solaris.

Let’s test it:

My test configuration consists of a 2-node cluster:

oracle@bert:~/ [grinf19] olsnodes -c
sesamestreet

oracle@bert:~/ [grinf19] olsnodes
ernie
bert

Installed Software:

Grid Infrastructure:

grid@ernie:~/ [grinf19] $ORACLE_HOME/OPatch/opatch lspatches
30898856;TOMCAT RELEASE UPDATE 19.0.0.0.0 (30898856)
30894985;OCW RELEASE UPDATE 19.7.0.0.0 (30894985)
30869304;ACFS RELEASE UPDATE 19.7.0.0.0 (30869304)
30869156;Database Release Update : 19.7.0.0.200414 (30869156)
OPatch succeeded.

RDBMS Software (Oracle Database 19c Standard Edition 2 – 19.7.0.0.0):

oracle@ernie:~/ [rdbms19] $ORACLE_HOME/OPatch/opatch lspatches
30894985;OCW RELEASE UPDATE 19.7.0.0.0 (30894985)
30805684;OJVM RELEASE UPDATE: 19.7.0.0.200414 (30805684)
30869156;Database Release Update : 19.7.0.0.200414 (30869156)
OPatch succeeded.

Please make sure that patch 30894985 is applied to the RDBMS home, too.

In my lab I installed and patched Grid Infrastructure first.
Then I installed the RDBMS software and applied the database patch and the OJVM patch only.

So “OCW” still had the old patchlevel (patch 29585399 = 19.3) and this caused the commands which are required for SEHA to fail, because the old srvctl is not aware of the features of SEHA.
Or – as an alternative – you can use srvctl from your Grid Infrastructure Installation to manage the SEHA database. But I prefer a consistent setup.

Creating the database

Creating the database is pretty straightforward: there is no difference to creating a “normal” SE2 database and there is no screen, checkbox etc. in DBCA which allows you to enable SEHA for the database during creation.
And creating the database ends up with a single instance database running on one node only.

If you use DBCA for creating the database, please remember that DBCA creates an entry in the tnsnames.ora file which is used for the local listener:

SQL> show parameter local_listener

NAME				     TYPE	 VALUE
------------------------------------ ----------- ---------------------
local_listener			     string	 LISTENER_SEHACDB

listener.ora:

LISTENER_SEHACDB.MARKUSDBA.LOCAL =
(ADDRESS = (PROTOCOL = IPC)(HOST = ernie)(KEY = LISTENER))

Please make sure that a similar entry exists on the other nodes, e.g.

LISTENER_SEHACDB.MARKUSDBA.LOCAL =
  (ADDRESS = (PROTOCOL = IPC)(HOST = bert)(KEY = LISTENER))

Otherwise, the instance will not register with the listener on the other node.
If you are using the default port (1521) you can even remove the entry “LOCAL_LISTENER” from the spfile. The instance will then register itself with a listener running on port 1521

Enabling SEHA

Before you enable “Standard Edition 2 High Availability” for your database, please make sure that both spfile and password file for your database are stored in ASM.
dbca creates the spfile in ASM, but the passwordfile is stored in the local file system, so you have to copy it into ASM and to change the database configuration:

asmcmd pwcopy /u00/app/oracle/dbs/orapwSEHACDB +DATA/SEHACDB/orapwSEHACDB

srvctl modify database -db SEHACDB -pwfile +DATA/SEHACDB/orapwSEHACDB

After that you can enable SEHA:
You simply have to extend the list of available nodes for your database:

srvctl modify database -db SEHACDB -node ernie,bert

That’s it. There’s nothing to do on the second node. Oracle will create all the required directories automatically when the database is started on the second node.
(Just make sure, that the permissions on the local file system are OK).

Playing around with SEHA:

oracle@ernie:~/ [SEHACDB] srvctl config database -db SEHACDB
Database unique name: SEHACDB
Database name: SEHACDB
Oracle home: /u00/app/oracle/product/19
Oracle user: oracle
Spfile: +DATA/SEHACDB/PARAMETERFILE/spfile.269.1040061437
Password file: +DATA/SEHACDB/orapwSEHACDB
Domain: markusdba.local
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools:
Disk Groups: DATA,FRA
Mount point paths:
Services:
Type: SINGLE
OSDBA group: dba
OSOPER group: oper
Database instance: SEHACDB
Configured nodes: ernie,bert
CSS critical: no
CPU count: 0
Memory target: 0
Maximum memory: 0
Default network number for database services:
Database is administrator managed

The list of configured nodes is the only sign which shows you that this database is using SEHA.

When you start the database, you can specify the node:

srvctl start database -db SEHACDB -node ernie

If you don’t specify the node, I observed that the database will be started on the node which is mentioned first in the list of the “Configured nodes”.

Relocate the database

oracle@ernie:~/ [SEHACDB] srvctl status database -db SEHACDB
Instance SEHACDB is running on node bert
oracle@ernie:~/ [SEHACDB] srvctl relocate database -db SEHACDB -node ernie
oracle@ernie:~/ [SEHACDB] srvctl status database -db SEHACDB
Instance SEHACDB is running on node ernie

According to the online documentation “srvctl relocate” is for RAC One Node databases only, but it works fine with SEHA.
You can even specify the parameter “drain_timeout” which gives the running sessions some time to complete their transactions before they are killed.

During the relocation, the database instance on the one node ist stopped and after that, the instance is started on the other node:

alert.log on bert:

[..]
Archiving is disabled
2020-05-11T00:03:10.403446+02:00
Stopping background process VKTM
2020-05-11T00:03:10.424387+02:00
JIT: pid 11508 requesting stop
2020-05-11T00:03:10.542356+02:00
NOTE: Shutting down MARK background process
2020-05-11T00:03:10.758401+02:00
Stopping background process RBAL
2020-05-11T00:03:17.859281+02:00
Instance shutdown complete (OS id: 11508)

alert.log on Ernie:

2020-05-11T00:03:19.848525+02:00
Adjusting the requested value of parameter parallel_max_servers
from 0 to 1 due to running in CDB mode
Starting ORACLE instance (normal) (OS id: 16728)
2020-05-11T00:03:20.051059+02:00
****************************************************
 Sys-V shared memory will be used for creating SGA
 ****************************************************
2020-05-11T00:03:20.052086+02:00
**********************************************************************
[...]

So, 2 seconds after the instance is shut down on node 1, the instance is started on node 2. Of course, it takes some time for the instance to come up and clients will receive an error message. But this issue can be solved by configuring a Service with Transparent Application Failover and the appropriate reconnect parameters for the applications.

“Failover”:

If the node on which the database is running, the database is started on another node. That’s the expected behaviour and the reason why we want to use SEHA.

Summary

Standard Edition 2 High Availability (SEHA) is a cold-failover cluster solution the the Standard Edition 2 of the Oracle Database. It fully integrated in the Oracle Clusterware and very easy to configure. SEHA makes own scripts or 3rd party solutions for failover databases obsolete. From the licensing perspective it’s important that the “10-days-rule” applies, i.e. you can run the database on an unlicensed node in your cluster for up to 10 separate days in a calendar year.

Unfortunately, according to the documentation, it does not work for the Enterprise Edition of the Oracle Database.

Update 1:

My initial problem (as mentioned on Twitter) was that I had installed and patched Grid Infrastructure first and later on I installed the RDBMS software and applied the database and the OJVM patch only. But not the OCW patch. Shame on me.

oracle@ernie:~/ [rdbms19] echo $ORACLE_HOME
/u00/app/oracle/product/19
oracle@ernie:~/ [rdbms19] $ORACLE_HOME/OPatch/opatch lspatches
30805684;OJVM RELEASE UPDATE: 19.7.0.0.200414 (30805684)
30869156;Database Release Update : 19.7.0.0.200414 (30869156)
29585399;OCW RELEASE UPDATE 19.3.0.0.0 (29585399)

So srvctl did not accept the list of two nodes as a value value for the parameter “-node”.

Interestlingly, the version of srvctl in the GI home (patched) and the RDBMS home (unpatched) were identical, at least according to the version number.
Both do not show the minor version.

oracle@bert:~/ [rdbms19] echo $ORACLE_HOME
/u00/app/oracle/product/19
oracle@bert:~/ [rdbms19] srvctl -version
srvctl version: 19.0.0.0.0
oracle@bert:~/ [grinf19] echo $ORACLE_HOME
/u00/app/grid/product/19
oracle@bert:~/ [grinf19] srvctl -version
srvctl version: 19.0.0.0.0

Changing the minor version after applying a patch would be nice 🙂

Update 2:

As expected and as documented, the feature does not work with the Enterprise Edition of the Oracle Database. The one command which configures High Availability, i.e. the command which is used to add a second node to list of configured nodes, fails with the EE:

oracle@bert:/u00/app/oracle/dbs/ [rdbms19ee] srvctl modify database -db EEHACDB -node ernie,bert
PRCD-1302 : failed to retrieve the node hosting this single-instance database
PRCD-2088 : failed to configure the single instance database EEHACDB with multiple nodes because it is not a Standard Edition database

Update 3:

There are several issues with SEHA. Please check the current Oracle Database 19c Readme.

Update 4 (01-JUL-2020):

A fix is available for bug # 31128434 (Switchover hangs, if an additional service is defined). However, even with this patch, a switchover of the SEHA-CDB takes about 12 minutes (reproducable!). I think, that’s a little bit too long.