Oracle Standby Redo Transport Best Practices

Data Guard automatically synchronizes the primary database and all standby databases by transmitting primary database redo and applying it to the standby databases. There are two methods Oracle uses to transport redo: synchronous and asynchronous. This article looks at asynchronous redo transport and some tuning and performance best practices.

Data Guard operates in ‘Maximum Protection’, ‘Maximum Availability’ or ‘Maximum Performance’ mode. The primary database’s behaviour when a standby fails or is slow to receive redo depends on the mode of operation. In Maximum Performance mode with the ASYNC setting, the primary never waits for a standby acknowledgment before signalling commit success to the application, so there can be no guarantee of zero data loss.
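The mode currently in effect can be confirmed on the primary with a quick query against v$database:

```sql
-- PROTECTION_MODE is the configured mode; PROTECTION_LEVEL is the
-- level currently in effect (they can differ during a transport outage).
SELECT protection_mode, protection_level FROM v$database;
```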

Asynchronous redo transport (ASYNC) transmits redo data asynchronously with respect to transaction commitment. A transaction can commit without having to wait for an acknowledgement that the redo generated by that transaction has been successfully transmitted to a remote standby database.

Data Guard ASYNC uses the LNS (Log Network Service) process to transmit redo directly from the log buffer of the primary database. The Oracle wait event ‘LNS wait on SENDREQ’ tracks the amount of time spent by all network servers writing the received redo to disk on the remote side, as well as opening and closing the remote standby/archived redo logs. The RFS process on the standby side receives the redo data and sends an acknowledgment back to LNS on the primary, so ‘LNS wait on SENDREQ’ is the sum of network time and RFS I/O time.
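The cumulative time spent in this wait event can be checked on the primary from v$system_event, for example:

```sql
-- TIME_WAITED and AVERAGE_WAIT are reported in centiseconds.
SELECT event, total_waits, time_waited, average_wait
FROM   v$system_event
WHERE  event = 'LNS wait on SENDREQ';
```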

The following network settings help Data Guard asynchronous redo transport achieve near-zero data loss protection for a production database.

Setting TCP Socket Buffer Sizes

TCP socket buffer sizes can be set using the Oracle Net parameters RECV_BUF_SIZE and SEND_BUF_SIZE, so that the setting affects only Oracle TCP connections. These parameters are capped by the OS limits net.core.rmem_max and net.core.wmem_max; there is no use in setting RECV_BUF_SIZE and SEND_BUF_SIZE larger than those limits. Oracle suggests sizing the buffers from the Bandwidth Delay Product (BDP) of the network link between the primary and standby systems, where BDP is bandwidth multiplied by latency, and recommends setting RECV_BUF_SIZE and SEND_BUF_SIZE to 3 times the BDP.
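As a worked example, assume a 1 Gb/sec link with 25 ms round-trip latency (hypothetical figures; substitute your own measurements):

```shell
# Hypothetical link: 1 Gb/sec bandwidth, 25 ms round-trip latency.
BANDWIDTH_BPS=1000000000   # bits per second
LATENCY_MS=25              # round-trip time in milliseconds

# BDP in bytes = (bandwidth in bits/sec / 8) * (latency in seconds)
BDP=$(( BANDWIDTH_BPS / 8 * LATENCY_MS / 1000 ))
BUF=$(( 3 * BDP ))

echo "BDP              = ${BDP} bytes"   # 3125000
echo "RECV/SEND buffer = ${BUF} bytes"   # 9375000
```

A link with these characteristics would give buffer sizes of 9375000 bytes.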

The parameters can be set in the tnsnames.ora connect descriptor for the standby:

standby = (DESCRIPTION=
  (SEND_BUF_SIZE=9375000)
  (RECV_BUF_SIZE=9375000)
  (ADDRESS=(PROTOCOL=tcp)(HOST=snode1)(PORT=1521)))

At the listener level (listener.ora):

(ADDRESS=(PROTOCOL=tcp)(HOST=snode2)(PORT=1521)
  (SEND_BUF_SIZE=9375000)(RECV_BUF_SIZE=9375000))

Configure Standby Redo Logs

It is very important to configure standby redo logs: they reduce data loss, and real-time apply requires a standby database configured with standby redo logs for optimal performance. Online redo logs and standby redo logs should be sized based on peak redo rates; AWR reports generated during peak workload periods, such as batch processing or quarter/year-end processing, will show the peak redo generation rate. It is critical for performance that standby redo log groups contain only a single member. In addition, for each redo log thread (a thread is associated with an Oracle RAC database instance), the number of standby redo logs should equal the number of redo log groups + 1. The additional standby redo log eliminates the possibility of the standby database waiting on a standby redo log.
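For example, if the primary has three 1 GB online redo log groups in thread 1, four standby redo logs of the same size would be created for that thread (group numbers and sizes here are illustrative, and the statements assume OMF/ASM so no file names are given):

```sql
-- Standby redo logs must be the same size as the online redo logs.
-- Thread 1: number of online groups (3) + 1 = 4 standby redo logs.
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 11 SIZE 1G;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 12 SIZE 1G;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 13 SIZE 1G;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 14 SIZE 1G;
```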

Redo Log Size

Determining the network bandwidth required for optimal redo transport requires analysing redo generation at the primary site. The simplest way to determine application throughput in terms of redo volume is to collect AWR reports during normal and peak workload and determine the number of bytes per second of redo the production database is producing.
It is important to identify the peak rate: an AWR report covering a long duration often averages the redo rate, and the peak is missed.

To determine the peak redo rate for each archived log, use the following query:

select thread#, sequence#,
       blocks*block_size/1024/1024 MB,
       (next_time-first_time)*86400 sec,
       (blocks*block_size/1024/1024)/((next_time-first_time)*86400) "MB/s"
from v$archived_log
where ((next_time-first_time)*86400 <> 0)
  and first_time between to_date('2020/01/01 09:00:00','YYYY/MM/DD HH24:MI:SS')
                     and to_date('2020/01/01 11:00:00','YYYY/MM/DD HH24:MI:SS')
  and dest_id=2
order by first_time;
THREAD#  SEQUENCE#         MB        SEC           MB/s
    2      21423 10.8374023       7201  534.001504986
    1      22314 8.56689453       7197  435.001190342
    1      22315 4.54492188       3604  384.001261077
    2      21424 4.06201172       3599  273.00112865
    2      21425 .762207031        450  38.001693793
    1      22316 974.942383        426  32.28859714
    1      22317  909.48877         19  47.86783
    1      22318 743.726074       6753  231.110132693
    2      21426 8.29443359       6750  345.001228805

The query output above tells us that we must accommodate a peak redo rate of 534 MB/sec. In general, we recommend adding an additional 30% on top of the peak rate.
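Applying the 30% headroom to the observed peak gives the bandwidth to provision (a simple sketch):

```shell
# Peak redo rate observed from v$archived_log, in MB/sec, plus 30% headroom.
PEAK_MBPS=534
REQUIRED=$(( PEAK_MBPS * 130 / 100 ))
echo "Required transport bandwidth: ${REQUIRED} MB/sec"   # 694
```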

Transport Lag

A transport lag is the amount of redo data, measured in time, that the standby has not yet received from the primary. It can be determined on the standby database by querying the v$dataguard_stats view:

SQL> SELECT * FROM v$dataguard_stats WHERE name LIKE '%lag%';
NAME           VALUE          UNIT                           TIME_COMPUTED        DATUM_TIME
transport lag  +00 00:22:22   day(2) to second(0) interval   07/21/2020 13:32:34  07/21/2020 13:32:24
apply lag      +00 00:22:22   day(2) to second(0) interval   07/21/2020 13:32:34  07/21/2020 13:32:24

If a transport lag is reported, it has to be analyzed from two aspects: 1) is there frequent log switching at the primary, and 2) is the network able to transfer redo fast enough. Frequent log switching can delay log shipping; in ASYNC mode, LNS can skip over a sequence and advance to the next log, and a skipped log then has to be shipped after it is archived. Sizing the redo log files correctly is therefore very important in a Data Guard environment: Oracle's recommended log switch frequency is about once every 20 minutes.
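Log switch frequency on the primary can be checked from v$log_history; a sketch (the 24-hour window is illustrative):

```sql
-- Log switches per hour over the last 24 hours.
SELECT TO_CHAR(first_time, 'YYYY/MM/DD HH24') AS hour,
       COUNT(*)                               AS switches
FROM   v$log_history
WHERE  first_time > SYSDATE - 1
GROUP  BY TO_CHAR(first_time, 'YYYY/MM/DD HH24')
ORDER  BY 1;
```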

If the database is not switching logs frequently and has not exceeded the bandwidth of the primary or standby interfaces, the next thing to look at is the network bandwidth between the primary and standby. Networks are typically shared with other applications, and it could be that other applications have temporarily consumed the bandwidth the standby needs to keep pace with the primary.

Determine Available Network Bandwidth

Utilities such as iperf or qperf can measure the available bandwidth at any given point in time. Here is the bandwidth (556 MB/sec) of my standby network connection while it is idle:

[root@snode2 ~]# qperf -lp 5001 -m 10m -t 10 tcp_bw
bw = 556 MB/sec

After logging into the primary and generating some workload, you can see the available bandwidth has come down a little (448 MB/sec) due to the redo transfer:

SQL> create table sales
     (year number(7),
      product varchar2(27),
      amt number(10));

Table created.

SQL> declare
       i integer := 1;
       yr number := 1980;
     begin
       for x in 1..100000 loop
         insert into sales (year, product, amt)
         values (yr + i, 'My_' || i, i * 10);
         if mod(x, 100) = 0 then
           i := i + 1;
         end if;
       end loop;
       commit;
     end;
     /
[root@snode2 ~]# qperf -lp 5001 -m 10m -t 10 tcp_bw
bw = 448 MB/sec

Another way to observe bandwidth and diagnose transport lag is to enable event 16421. This dynamic event should be set on all primary and standby instances; it dumps statistics into the tt00 or nsa trace file once shipping for a given sequence has completed. Here are the details.

The active transport process (TT) can be found in the alert log.

TT02 (PID:104652): Transmitting activation ID 0xfcecc25e
TT02 (PID:104652): SRL selected for T-1.S-41 for LAD:2
TT02 (PID:104652): SRL selected at LAD:2 for T-1.S-41

DB19B1:snode2:oracle$ ps -ef | grep 104652
oracle 37248 36346 0 12:16 pts/1 00:00:00 grep --color=auto 104652
oracle 104652 1 0 Aug02 ? 00:01:28 ora_tt02_DB19B1

On primary

SQL> alter system set log_archive_trace=529 scope=memory;
SQL> alter system set "_log_archive_trace_pids"='104652' scope=memory;
SQL> alter system set events '16421 trace name context forever, level 1';

On standby

DB19B1SB:snode3:oracle$ ps -ef | grep 81911
oracle 81911 1 0 15:51 ? 00:00:02 oracleDB19B1SB (LOCAL=NO)
oracle 82019 80010 0 16:01 pts/1 00:00:00 grep --color=auto 81911

SQL> alter system set log_archive_trace=529 scope=memory;
SQL> alter system set "_log_archive_trace_pids"='81911' scope=memory;
SQL> alter system set events '16421 trace name context forever, level 1';

The TT or NSA trace file on the primary will show something like this:

krsb_end: Begin stats dump for thread 1 sequence 41
max number of buffers in use 10
Operation elapsed time (micro seconds) 18233070
Total I/Os queued 4356
Total I/Os completed 4356
Total bytes written 23180164
Average buffer turnaround time 61654
File transfer time (micro seconds) 182725140
Throughput (MB/s) 137
Total network reap stall time 12147116
Average network reap stall time 115147
Total network layer time 13265416
Percentage of time in network 72.60
Total disk reads 1502
Total disk read time 21431244
Total disk layer time 43214244
Percentage of time in disk layer 23.99
Total DG layer time 324552
Percentage of time in DG layer 3.41

Thank You!