Hello everyone. I have a few questions that I think many people share:
1 - Why does the error "Recovery Manager: Log File is Full" occur?
2 - How can I find out the cause of the error?
3 - Are there any notes.ini parameters that help to figure out the cause?
There are only 2 useful articles on this issue:
1 - https://support.hcl-software.com/csm?id=kb_article&sys_id=7372601f874729905440c9d8cebb355e&spa=1
2 - https://ds_infolib.hcltechsw.com/ldd/dominowiki.nsf/dx/overview_of_causes_of_server_Crash_hangs_due_to_quoRecovery_Managercol_Log_file_is_fullquo_message
Both are good articles but, unfortunately, they do not explain how to determine the cause.
Here are 2 examples. A quick note first: there is enough disk space, the transaction logs are on separate 1 TB disks, and the servers run Domino 12.0.2 FP6.
Example #1:
The server uses Archived transaction logging. These logs have no size limit, but sometimes the server starts spamming errors:
Recovery Manager: Log File Full: Func=hlgWriteLogRecord File=rm/rmlogger.cpp:725 [/local/notesdata/log.nsf 26873]
Why does the error suddenly appear? The transaction logs are archived, so why is the log file suddenly full? How can I determine the reason?
Example #2:
The server uses Archived transaction logging. It is up and running and everything is fine. Then at some point it suddenly starts giving errors:
Recovery Manager: Log File is Full
Recovery Manager: Log File Full: Func=hlgWriteLogRecord File=rm/rmlogger.cpp:725 [/local/notesdata/mail/35/user_opt.nsf 43215]
err_invalid_btree ERROR: Recovery Manager: Log File is Full [/local/notesdata/mail/99/test_user.nsf]
BTExit: Unexpected error 'B-tree structure is invalid' (028E) on database mail/99/test_user.nsf
**** DbMarkCorrupt(Unable to close container), DB=/local/notesdata/mail/99/test_user.nsf TID=[010585:000006-00007AAE3E3FF700] File=index/dbbuf.c Line=398 ***
How can this be? Where do I look? The logs are archived and there is plenty of disk space, so why does the server suddenly say that the log file is full? Is it all log.nsf's fault, or is there another problem?
Do transaction logs have any additional limits? Could the problem be in the UBM (Unified Buffer Manager)? If so, in what way? If not, where else should I look? How can I debug this, find the cause, and get rid of the problem? Please don't pass this by. Let's work together to solve this problem at last.
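As a starting point for narrowing this down, the "Log File Full" lines can at least be tallied per database, so it is clear which databases the server was writing to when the errors started. This is only a rough sketch in Python: the regular expression is written against the console lines quoted in the examples above, the function names are my own, and I do not know what the trailing number in the brackets means, so it is captured but not interpreted. It shows where the failed writes happened, which is not necessarily the cause.

```python
import re
from collections import Counter

# Pattern for the "Log File Full" console lines quoted in this thread, e.g.
#   Recovery Manager: Log File Full: Func=hlgWriteLogRecord File=rm/rmlogger.cpp:725 [/local/notesdata/log.nsf 26873]
# The trailing number is captured as-is; its meaning is an open question here.
LOG_FULL_RE = re.compile(
    r"Recovery Manager: Log File Full: "
    r"Func=(?P<func>\S+) File=(?P<source>\S+) "
    r"\[(?P<db>.+?) (?P<number>\d+)\]"
)


def tally_log_full(lines):
    """Return a Counter mapping database path -> number of 'Log File Full' lines."""
    counts = Counter()
    for line in lines:
        match = LOG_FULL_RE.search(line)
        if match:
            counts[match.group("db")] += 1
    return counts


def first_log_full(lines):
    """Return the database named in the first 'Log File Full' line, or None."""
    for line in lines:
        match = LOG_FULL_RE.search(line)
        if match:
            return match.group("db")
    return None


# Sample input: console lines copied from the two examples above.
SAMPLE = [
    "Recovery Manager: Log File Full: Func=hlgWriteLogRecord File=rm/rmlogger.cpp:725 [/local/notesdata/log.nsf 26873]",
    "Recovery Manager: Log File is Full",
    "Recovery Manager: Log File Full: Func=hlgWriteLogRecord File=rm/rmlogger.cpp:725 [/local/notesdata/mail/35/user_opt.nsf 43215]",
    "err_invalid_btree ERROR: Recovery Manager: Log File is Full [/local/notesdata/mail/99/test_user.nsf]",
]

# Print the databases with the most failed writes first.
for db, count in tally_log_full(SAMPLE).most_common():
    print(count, db)
```

On a real server the same functions could be fed the lines of a saved console log instead of SAMPLE. The first database reported is often more interesting than the most frequent one, which is why first_log_full is kept separate.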
Hello Moises,
These error messages are informational and related to Transaction Logging. They indicate that Transaction Logging is unable to write or reuse the extents in a timely manner.
Several conditions can cause this error message, including:
1. The storage holding your Transaction Logging files is almost full.
2. There are Long Held Locks before the Log File is Full informational messages appear, which can happen due to corruption in databases.
3. There is an underlying performance issue on the server, which may delay writing information to Transaction Logging.
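The first cause is the easiest to rule out. A minimal sketch in Python, assuming it is run on the server itself: the function names and the 10% threshold are illustrative choices, not Domino settings, and the path passed in must be replaced with the directory that actually holds the transaction log extents.

```python
import shutil


def percent_free(path):
    """Return the free space on the volume containing `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def is_low_on_space(path, threshold_percent=10.0):
    """Return True if free space on the volume is below the given threshold."""
    return percent_free(path) < threshold_percent
```

For example, `is_low_on_space("/path/to/translog")` (a placeholder path) returns True once less than 10% of that volume is free. Free space alone does not exclude causes 2 and 3.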
The impact of these errors can be significant, as they can cause the server to become unresponsive and prevent users from accessing the databases.
Once Log File is Full messages start to appear, there is reason for concern: from that point databases can still be read, but the server cannot update any of them, which eventually leads to a hang.
Based on the shared logs, we can see that multiple "Lock manager" messages are appearing for multiple databases.
To resolve this issue, we recommend the following steps:
1. Restart the server at the OS level. Restarting at the Domino level resolves this issue in most cases, as the server cache and held transactions are cleared on restart.
2. Recreate the log.nsf file.
3. Run offline maintenance on the reported database during off-hours.
I hope the above information helps to address your concerns.
Please mark this question as answered and helpful if this answered your query.
Thanks for the answer. I will definitely mark it as correct if I get answers to these questions: how do I find out which database was the culprit, and what exactly triggered the error (disk space, locks, a server error, or something else)?
There were no locks in my examples. The server just starts throwing "Log File is Full" errors.
But even if we accept the fact that there is a lock:
1 - With archived transaction logs, shouldn't the server just create new extents and wait until the database is available, rather than locking up the whole server?
2 - Even if the server did hit a lock on a database, do all the transaction logs fill up just because that one database is unavailable? Hard to believe.
3 - What does the log.nsf database have to do with this? Why is recreating it always recommended if the errors are not related to it at all (my example #2)?
4 - If log.nsf really is the problem, wouldn't the solution be to disable transaction logging for this database?
Obviously, there are some limits here, and knowing them would give people answers. We're tired of the “log file is full” error; it's in every version of Domino. The issue is not that the error exists, it's that no one understands how to monitor for it. I accept that there has to be an error, but people should also understand what it means, when it will occur, and what causes it. In the end you need statistics that reflect the state of the transaction logs.
In my opinion, rebooting the server after the problem has happened is not a solution. The solution is to let people understand how to monitor the transaction logs and see that a problem is about to occur, so that they can take preventive measures.
Apparently, we will never understand the reasons for this error. That's unfortunate (((