Investigating the SCN intrinsic growth rate
Author: Bastiaan Bak
Category: IT development and operations
A while ago, we found a warning about the intrinsic growth rate of the SCN (system change number) in the alert file of our Oracle 12.2 database. The warning recurred several times within a few weeks.
"Warning: The SCN intrinsic growth rate has been consistently
higher than system default 16384 per sec. for last 60 mins.
Current SCN intrinsic growth rate is 25867 per sec., zas 200fffff!
The Current SCN value is 46747726691, SCN Compat value is 1"
My initial reaction was that SCNs are related to commits, so either the load on the database was very high or the application logic needed to change, for example because it committed after every update instead of committing in batches.
It turned out to be a little more complicated than I expected. Where do you look when you want to find the relation between SCNs and commits? And how serious is this warning anyway? This blog is going to be about the various ways I investigated this problem and identified the potential impact.
Oracle Support notes
The first place to look for information about warnings in the alert file is the Oracle Support website. I found several related notes:
- ORA-19706 and Related Alert Log Messages (Doc ID 1393360.1)
This note suggests that the actual message is specific to database version 12.2, but in older versions we might have similar warnings, like “Warning: The SCN headroom for this database is only NN days!”
If you encounter an alert log message like any of these entries, you are advised to follow the instructions in Doc ID 1388639.1 and log a Service Request with Oracle Support.
- Evidence to collect when reporting "high SCN rate" issues to Oracle Support (Doc ID 1388639.1)
This note describes the information you should supply when logging a service request.
- System Change Number (SCN), Headroom, Security and Patch Information (Doc ID 1376995.1)
This note gives more information about the usage of the SCN. The system change number (SCN) is a logical, internal timestamp used by the Oracle Database. SCNs order events that occur within the database. The database uses SCNs to query and track changes. When a transaction commits, the database records an SCN for this commit.
There is an upper limit to how many SCNs an Oracle Database can use. The limit is currently 281 trillion (2^48) SCN values.
Given that there is an upper limit, it’s important that any given Oracle Database does not run out of available SCNs.
The note also explains when the warning is raised. The Oracle Database calculates a "not to exceed" limit for the number of SCNs a database can currently use, based on the number of seconds since 1988 multiplied by 16384. Doing this ensures that Oracle Databases will ration SCNs over time.
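As a rough illustration of that rule, the sketch below computes the "not to exceed" value for a given point in time. It assumes the count starts at 1 January 1988, 00:00, and uses plain calendar seconds; the function name `scn_soft_limit` is made up for this example, and Oracle's internal calculation may differ in detail.

```python
from datetime import datetime

SCN_PER_SECOND = 16384          # 2**14, the system default named in the warning
EPOCH = datetime(1988, 1, 1)    # assumption: counting starts at 1 January 1988, 00:00


def scn_soft_limit(at: datetime) -> int:
    """Return seconds since 1988 multiplied by 16384 (hypothetical helper)."""
    seconds_since_epoch = int((at - EPOCH).total_seconds())
    return seconds_since_epoch * SCN_PER_SECOND


# Example: the limit for 9 July 2018, 13:23:39
print(scn_soft_limit(datetime(2018, 7, 9, 13, 23, 39)))
```

The limit grows by 16384 every second, whether or not the database consumes any SCNs in that second.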
How serious is this warning?
The warning is raised when the SCN growth rate has exceeded 2^14 = 16384 SCNs per second for the last 60 minutes.
The maximum SCN is 2^48 = 281.474.976.710.656.
At a rate of 16384 SCNs per second, it takes 2^(48-14) = 2^34 seconds, or about 544 years, to reach that maximum. That should be enough in a normal situation, but the upper limit of 2^48 is only the maximum absolute value the database can store.
The limit is also related to the number of seconds since 1988. The limit only reaches the absolute maximum of 2^48 in the year 2532 (1988+544). At the start of 2018 the limit is (2018-1988)*365*24*60*60*2^14 = 15.500.574.720.000.
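The arithmetic behind the 544 years can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; it assumes 365-day years, as the calculation in the text does.

```python
MAX_SCN = 2**48                        # largest SCN value the database can store
RATE = 2**14                           # 16384 SCNs per second
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: leap days ignored

seconds_to_max = MAX_SCN // RATE       # 2**34 seconds
years_to_max = seconds_to_max / SECONDS_PER_YEAR
year_reached = 1988 + int(years_to_max)

print(MAX_SCN)                 # 281474976710656
print(seconds_to_max)          # 17179869184
print(int(years_to_max))       # 544
print(year_reached)            # 2532
```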
The warning should not be ignored. When you reach the time-based limit you will get ORA-600 [2252] errors, and when you reach the absolute upper limit the database will just stop working.
The good news is that in our situation the warning reported an SCN growth rate of 25867 per second for that specific hour, so during that hour we moved closer to the limit by 25867-16384=9483 SCNs per second. We don’t move closer to the limit every hour; the normal growth rate is lower than 16384.
Oracle Support
We called Oracle Support, and they told us Oracle Development is currently working on this issue.
Oracle Support confirmed that the SCN headroom looks good. Based on the AWR report, they noticed a high number of commits and suggested checking with the application team whether the number of commits could be reduced by increasing the transaction size.
Investigation with AWR
The warning in the alert file told us that the SCN intrinsic growth rate has been consistently higher than system default: 16384 per second for last 60 mins. If we’re looking at a time frame of an hour, an AWR report might be a good place to start. We have AWR configured to make snapshots every hour.
In the AWR report, I noticed the number of user commits was 210 per second. Yes, that is a lot of commits, but it isn’t that different from the normal load of this database. And if every commit accounts for one SCN, 210 per second is far below 16384 per second.
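To put those numbers side by side, here is a small sketch. It assumes, purely for illustration, that each user commit accounts for exactly one SCN and that the AWR hour matches the hour of the warning.

```python
user_commits_per_sec = 210    # from the AWR report
warning_threshold = 16384     # system default named in the warning
reported_scn_rate = 25867     # growth rate reported in the alert file

# Assumption: one SCN per commit
share_of_threshold = user_commits_per_sec / warning_threshold
share_of_reported = user_commits_per_sec / reported_scn_rate

print(f"{share_of_threshold:.2%} of the warning threshold")
print(f"{share_of_reported:.2%} of the reported growth rate")
```

Under that assumption the commits explain only about one percent of the SCN growth, so something else must be consuming SCNs.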
The AWR report also contained an ADDM finding: Waits on event "log file sync" while performing COMMIT and ROLLBACK operations were consuming significant database time. Investigate application logic for possible reduction in the number of COMMIT operations by increasing the size of transactions.
Oracle Support had suggested the same reduction in commits. From my point of view, though, the commit rate wasn’t really that high.
Shorter timeframe
Because the AWR wasn’t helping me find the cause, I needed to investigate a shorter timeframe. I wanted to know a more specific timeframe so I could create an ASH report. The default for ASH is 15 minutes.
So the next challenge was to find the 15 minute timeframe in which the SCN growth rate was the highest.
Doc ID 1388639.1 suggests querying v$archived_log. That view has information about all the log switches on the database, including a timestamp and the SCN. Although you can map timestamps to SCNs this way, it’s not really better than the AWR report: we’re still stuck with arbitrary timestamps, in this case those of the log switches.
Using the timestamp_to_scn function
A better way is to use the function timestamp_to_scn. This function returns an SCN based on a timestamp, like the current timestamp:
    SQL> SELECT timestamp_to_scn(sysdate) FROM dual ;

    TIMESTAMP_TO_SCN(SYSDATE)
    -------------------------
                  91903104563

    SQL>
The next step was to make a list of timestamps together with the matching SCN and the matching SCN upper limit, based on the number of seconds since 1988 multiplied by 16,384.
This shows the timestamps and SCNs for the last day:
    SELECT sysdate - (rownum/24) datetimestamp
    ,      timestamp_to_scn(sysdate - (rownum/24)) SCN
    ,      ((sysdate - (rownum/24)) - to_date('01-01-1988','DD-MM-YYYY' )) * 24 * 60 * 60 * 16384 upper_lmt
    FROM   dual
    CONNECT BY rownum <= 24
    /

    DATETIMESTAMP                        SCN            UPPER_LMT
    ------------------- -------------------- --------------------
    09-07-2018-13:23:39          95423916508       15780233527296
    09-07-2018-12:23:39          95380086165       15780174544896
    09-07-2018-11:23:39          95338871931       15780115562496
    09-07-2018-10:23:39          95303437600       15780056580096
    09-07-2018-09:23:39          95265573942       15779997597696
    09-07-2018-08:23:39          95226645452       15779938615296
    09-07-2018-07:23:39          95186822906       15779879632896
    09-07-2018-06:23:39          95147382509       15779820650496
    09-07-2018-05:23:39          95115474008       15779761668096
    09-07-2018-04:23:39          95079712219       15779702685696
    09-07-2018-03:23:39          95041469231       15779643703296
    09-07-2018-02:23:39          95006499794       15779584720896
    09-07-2018-01:23:39          94975060529       15779525738496
    09-07-2018-00:23:39          94945771055       15779466756096
    08-07-2018-23:23:39          94907451372       15779407773696
    08-07-2018-22:23:39          94875158341       15779348791296
    08-07-2018-21:23:39          94838756696       15779289808896
    08-07-2018-20:23:39          94800190958       15779230826496
    08-07-2018-19:23:39          94757984611       15779171844096
    08-07-2018-18:23:39          94724548846       15779112861696
    08-07-2018-17:23:39          94685506947       15779053879296
    08-07-2018-16:23:39          94646644945       15778994896896
    08-07-2018-15:23:39          94605003069       15778935914496
    08-07-2018-14:23:39          94572205685       15778876932096

    24 rows selected.
The current SCN is about 0,6% of the current upper limit.
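Another way to read these numbers is as headroom: the distance between the current SCN and the upper limit, expressed in time. The sketch below takes the first row of the output above and the growth rate from the alert file warning. Dividing the headroom by 16384 per second is one common way to express it in days; treat it as an approximation, not as Oracle's exact formula.

```python
SCN_PER_SECOND = 16384
SECONDS_PER_DAY = 24 * 60 * 60

current_scn = 95423916508       # SCN column, first row of the output
upper_limit = 15780233527296    # UPPER_LMT column, first row of the output
observed_rate = 25867           # growth rate from the alert file warning

headroom_scn = upper_limit - current_scn
headroom_days = headroom_scn / SCN_PER_SECOND / SECONDS_PER_DAY

# The limit itself grows by 16384 per second, so the gap only closes
# at the difference between the observed rate and that default.
closing_rate = observed_rate - SCN_PER_SECOND
catch_up_years = headroom_scn / closing_rate / (365 * SECONDS_PER_DAY)

print(f"headroom: {headroom_days:.0f} days")
print(f"years to reach the limit at a sustained {observed_rate}/s: {catch_up_years:.1f}")
```

Even if the database sustained the rate from the warning around the clock, it would take decades to use up this headroom, which fits with Oracle Support's assessment that the headroom looks good.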
Finding the top SCN rate
Based on this idea I created a query that gives me the 15 minute timeframes with the highest growth in SCNs in the last 3 days.
Every minute a new timeframe starts, and because we have 1440 minutes in a day, we have 4320 timeframes to investigate. For each of them we have to calculate the growth of the SCN within that 15 minute timeframe.
We only want to show the top results, in this case only the timeframes with a rate of over 14000 per second.
    ALTER SESSION SET nls_date_format='MM/DD/YY HH24:MI' ;

    WITH datelist AS
    ( SELECT sysdate - (rownum/1440) - (15/1440) starttime -- 15 minute interval
      ,      sysdate - (rownum/1440) endtime
      FROM   dual
      CONNECT BY rownum <= (3*1440) -- 3 days history
    )
    SELECT starttime
    ,      endtime
    ,      timestamp_to_scn(endtime) - timestamp_to_scn(starttime) scngrowth
    ,      round((timestamp_to_scn(endtime) - timestamp_to_scn(starttime)) / (((24*60*60)*(endtime-starttime )))) scnrate
    FROM   datelist
    WHERE  round((timestamp_to_scn(endtime) - timestamp_to_scn(starttime)) / (((24*60*60)*(endtime-starttime )))) >= 14000
    ORDER BY 4 DESC
    /

    STARTTIME      ENDTIME                   SCNGROWTH              SCNRATE
    -------------- -------------- -------------------- --------------------
    07/06/18 18:09 07/06/18 18:24             12761928                14180
    07/07/18 05:20 07/07/18 05:35             12742537                14158
    07/09/18 13:59 07/09/18 14:14             12705077                14117
    07/09/18 12:57 07/09/18 13:12             12672507                14081
    07/09/18 07:06 07/09/18 07:21             12654287                14060
So, now we have found the (sometimes overlapping) 15 minute time frames with the highest SCN rate (SCN growth per second) for the last 3 days. And even in those time frames the SCN rate is still under 16384. No warnings in the alert file this week….
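The sliding-window logic of that query can also be sketched outside the database. The Python below is not the query itself but an illustration of the same idea on synthetic data: one (minute, SCN) sample per minute, a window that starts every minute, and a filter on the rate per second. The function name `top_windows` and the sample data are invented for this example.

```python
def top_windows(samples, window_minutes=15, min_rate=14000):
    """Return (start, end, growth, rate) per window, highest rate first.

    samples: list of (minute, scn) tuples, one per minute, in ascending order.
    """
    results = []
    for i in range(len(samples) - window_minutes):
        start_minute, start_scn = samples[i]
        end_minute, end_scn = samples[i + window_minutes]
        growth = end_scn - start_scn
        rate = round(growth / ((end_minute - start_minute) * 60))  # SCNs per second
        if rate >= min_rate:
            results.append((start_minute, end_minute, growth, rate))
    return sorted(results, key=lambda r: r[3], reverse=True)


# Synthetic data: 10000 SCNs/s normally, 15000 SCNs/s between minute 20 and 35
samples = [(0, 0)]
for minute in range(1, 61):
    per_second = 15000 if 20 < minute <= 35 else 10000
    samples.append((minute, samples[-1][1] + per_second * 60))

best = top_windows(samples)[0]
print(best)  # (20, 35, 13500000, 15000)
```

As in the query output, neighbouring windows that partly cover the burst also pass the filter, which is why the reported timeframes can overlap.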
Running the ASH report
The date format I used in the query above is the same as used by the ASH report, so you can just copy/paste the start time. For the duration we enter 15 minutes.
    SQL> @@$ORACLE_HOME/rdbms/admin/ashrpt.sql

    ASH Samples in this Workload Repository schema
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Oldest ASH sample available: 01-Jul-18 00:00:01 [ 12379 mins in the past]
    Latest ASH sample available: 09-Jul-18 14:18:58 [ 0 mins in the past]

    Specify the timeframe to generate the ASH report
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Enter begin time for report:

    -- Valid input formats:
    --   To specify absolute begin time:
    --     [MM/DD[/YY]] HH24:MI[:SS]
    --     Examples: 02/23/03 14:30:15
    --               02/23 14:30:15
    --               14:30:15
    --               14:30
    --   To specify relative begin time: (start with '-' sign)
    --     -[HH24:]MI
    --     Examples: -1:15 (SYSDATE - 1 Hr 15 Mins)
    --               -25 (SYSDATE - 25 Mins)

    Defaults to -15 mins
    Enter value for begin_time: 07/06/18 18:09
    Report begin time specified: 07/06/18 18:09

    Enter duration in minutes starting from begin time:
    Defaults to SYSDATE - begin_time
    Press Enter to analyze till current time
    Enter value for duration: 15
    Report duration specified: 15

    Using 06-Jul-18 18:09:00 as report begin time
    Using 06-Jul-18 18:24:00 as report end time

    Specify the Report Name
    ~~~~~~~~~~~~~~~~~~~~~~~
    The default report file name is ashrpt_1_0706_1824.html. To use this name,
    press <return> to continue, otherwise enter an alternative.
    Enter value for report_name:

    Using the report name ashrpt_1_0706_1824.html

    Summary of All User Input
    -------------------------
    Format         : HTML
    DB Id          : 2019395491
    Inst num       : 1
    Begin time     : 06-Jul-18 18:09:00
    End time       : 06-Jul-18 18:24:00
    Slot width     : Default
    Report targets : 0
    Report name    : ashrpt_1_0706_1824.html
Finding SCN in AWR
The AWR report didn’t show us much information about the current SCN, but it has some information about the growth rate, if you know where to find it.
Under “Instance Activity Stats” you can find the number of “calls to kcmgas”. In the Oracle documentation this is described as the “Number of calls to routine kcmgas to get a new SCN”.
The value of these calls per second in the AWR report is very close to the SCN rate as calculated with the timestamp_to_scn function.
V$SESSTAT view
The “calls to kcmgas” statistic can also be found in the views V$SESSTAT and V$SYSSTAT.
We can use V$SESSTAT to find the sessions that cause a high SCN rate. We can also test the impact of specific actions on the SCN.
For example, when I run a select on a big table that’s also in use by other sessions, my session makes another 7 calls to kcmgas, so my query contributes to SCN growth. This is caused by the read consistency mechanism of the database, which also uses SCNs.
    SQL> CONNECT <user>/<pass>@<service>
    Connected.

    SQL> SELECT ses.value
         FROM   v$sesstat ses
         ,      v$statname stat
         WHERE  stat.statistic#=ses.statistic#
         AND    ses.sid IN (SELECT sid FROM v$mystat)
         AND    stat.name = 'calls to kcmgas'
         /

                   VALUE
    --------------------
                       2

    SQL> SELECT COUNT(*) FROM mybigtable ;

                COUNT(*)
    --------------------
                12198814

    SQL> SELECT ses.value
         FROM   v$sesstat ses
         ,      v$statname stat
         WHERE  stat.statistic#=ses.statistic#
         AND    ses.sid IN (SELECT sid FROM v$mystat)
         AND    stat.name = 'calls to kcmgas'
         /

                   VALUE
    --------------------
                       9

    SQL>
Comparing the SCN and commit rate
With V$SESSTAT we can query the statistics for all sessions currently connected to the database. In this way we can find sessions that are responsible for a high SCN rate. We can compare this to the commit rate for that session.
The results of the query below showed us that on our database the high SCN rate was mainly caused by background processes. For most user sessions there is a relation between a high SCN rate and a high commit rate; for background sessions the commit rate is always zero.
    SELECT ses.sid
    ,      decode(ses.username ,NULL,'background','user' ) session_type
    ,      (sysdate - logon_time) * 24 * 60 * 60 connect_seconds
    ,      sstat1.value SCN#
    ,      sstat2.value COMMIT#
    ,      round(sstat1.value / ((sysdate - logon_time ) * 24 * 60 * 60),2) scn_rate
    ,      round(sstat2.value / ((sysdate - logon_time ) * 24 * 60 * 60),2) commit_rate
    FROM   v$sesstat sstat1
    ,      v$sesstat sstat2
    ,      v$statname sn1
    ,      v$statname sn2
    ,      v$session ses
    WHERE  sstat1.statistic# = sn1.statistic#
    AND    sstat2.statistic# = sn2.statistic#
    AND    sn1.name = 'calls to kcmgas'
    AND    sn2.name = 'user commits'
    AND    ses.sid = sstat1.sid
    AND    ses.sid = sstat2.sid
    ORDER BY 6 DESC
    /

           SID SESSION_TY CONNECT_SECONDS       SCN#    COMMIT#   SCN_RATE COMMIT_RATE
    ---------- ---------- --------------- ---------- ---------- ---------- -----------
          8478 background          459572  214506344          0     466.75           0
          7551 background          452395  209729934          0      463.6           0
          3776 background          290389  133863489          0     460.98           0
          8496 background          121201   55685740          0     459.45           0
          8729 background          286773  128180386          0     446.98           0
         12009 background          290392  128867329          0     443.77           0
         13173 background          196775   87268032          0     443.49           0
         12004 background          103166   45681480          0      442.8           0
          8735 background          275980  121563094          0     440.48           0
          3096 background          430810  185436599          0     430.44           0
          8027 background           95990   40912187          0     426.21           0
          7529 background          193218   81367643          0     421.12           0
          2370 background          527978  219521415          0     415.78           0
         14604 background          283216  117052382          0      413.3           0
         14132 background          113965   46586388          0     408.78           0
          7552 background          294009  119775077          0     407.39           0
         13172 background          182423   73865595          0     404.91           0
         14592 background           74414   29767705          0     400.03           0
          3802 background          268804  107486102          0     399.87           0
          9910 background          117582   46596720          0     396.29           0
         12021 background           49182   19321676          0     392.86           0
           974 background          160816   59996495          0     373.08           0
         12723 background           74450   25455559          0     341.91           0
          3310 background          193215   65915175          0     341.15           0
         12963 background           49179   15687084          0     318.98           0
          6111 background         3584090 1031139557          0      287.7           0
          6829 user                   303       1267       1123       4.18        3.71
          9665 user                   904       1845       1691       2.04        1.87
          8022 user                   898       1677       1520       1.87        1.69
          3323 user                   898       1406       1260       1.57         1.4
          2839 user                  7503      10822       9813       1.44        1.31
         11060 user                  3892       5334       4781       1.37        1.23
         13184 user                  1765       2359       2038       1.34        1.15
          9199 user                   898       1135        935       1.26        1.04
          2130 user                  8105       9548       8518       1.18        1.05
         11525 user                   898       1054        944       1.17        1.05
          6130 user                  3895       4453       4199       1.14        1.08
          8012 user                  7503       8576       7774       1.14        1.04
          4497 user                   898        962        882       1.07         .98
          5201 user                  7220       7551       6226       1.05         .86
         11317 user                 12906      13371      11997       1.04         .93
    [...]

    1979 rows selected.
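To make the difference between the two session types explicit, the sketch below recomputes the rates for four rows copied from the output above and adds them up per session type. It is only an illustration in Python; the helper name `rates` is made up, and the four rows are a small sample, not the full result set.

```python
from collections import defaultdict

# (sid, session_type, connect_seconds, calls_to_kcmgas, user_commits)
rows = [
    (8478, "background", 459572, 214506344, 0),
    (7551, "background", 452395, 209729934, 0),
    (6829, "user", 303, 1267, 1123),
    (9665, "user", 904, 1845, 1691),
]


def rates(row):
    """Return (scn_rate, commit_rate) per second, rounded like the query does."""
    _sid, _session_type, seconds, scn_calls, commits = row
    return round(scn_calls / seconds, 2), round(commits / seconds, 2)


scn_rate_by_type = defaultdict(float)
for row in rows:
    scn_rate, commit_rate = rates(row)
    scn_rate_by_type[row[1]] += scn_rate
    print(row[0], row[1], scn_rate, commit_rate)

for session_type, total in scn_rate_by_type.items():
    print(f"{session_type}: {total:.2f} calls to kcmgas per second")
```

In this sample a single background session requests more SCNs per second than a hundred of the busiest user sessions would, without a single user commit.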
Conclusion
Be aware that there are limits to the SCN, so when you find warnings in the alert file, you need to investigate the problem. If you find an issue you should work with Oracle Support. Based on the information you upload, they can check whether there is enough room between the current and maximum SCN.
Problems can be caused by a bug, such as bug 12371955: Hot Backup can cause increased SCN growth rate leading to ORA-600 [2252] errors (Doc ID 12371955.8).
If you want to find the exact moment of high SCN growth, you need to convert timestamps to SCNs. You get the best results using the functions SCN_TO_TIMESTAMP and TIMESTAMP_TO_SCN.
A high commit rate is always related to user processes, but SCNs are also related to background processes. Even sessions that don’t commit can have an impact on the SCN.