WHOAMI
A bit of an introduction (for those of you who don’t know me):
I have been developing for and administering Domino servers pretty much continuously since 1995 (back when Domino was just called the “Notes Server” and ran on IBM’s OS/2). I have written several white papers and produced many videos on installing and configuring Domino on Linux. I was on the Leap for Domino (now called Volt for Domino) team and personally wrote the installation scripts. I maintain several Domino development servers for HCL Labs, upon which we run instances of DRAPI, Nomad, Verse, Volt, and some other interesting things. My point here is that I have been around a bit and generally know what the heck I am doing.
My Task
About a week ago I was asked to update the installed version of Nomad on one of our development servers to the latest release (specifically, 1.0.13 Interim Fix 1).
“No problem”, I thought to myself. As I have installed / upgraded Nomad on Domino (for Linux) many many times, I looked forward to this task and dove right in.
Documentation
Being a well-seasoned SME, the very first thing I did was review the Nomad server on Domino documentation, the What's new in HCL Nomad for web browsers page, and the relevant HCL Customer Support Knowledge Article.
You do read the documentation, right?
Environment & Update
Ok, my next step was to ensure I had my environment up to date. I was running Nomad 1.0.10 on Domino 14.0. I decided to start with updating Domino to the latest version (14.0 Fix Pack 2, Interim Fix 1), and proceeded to complete that update without any hiccups.
My next step was to follow the instructions in the documentation. As directed, I made an appropriate backup copy of my /nomad-files and proceeded with the update. I made special note of the two important changes for 1.0.13:
Change 1:
Change 2:
This particular server is not using SAML, and doesn’t have a nomad-config.yml file, so that issue didn’t apply. Because I am a bit pedantic, I went ahead and deleted the nwsp-linux file as part of the update.
Once I completed the install, I went ahead and restarted the Nomad task (load nomad).
Things go Wrong
Everything seemed to be working fine. That is, until I signed back into Nomad from a web browser and tried to open an NSF on the server.
I was able to open my local Notes log (log.nsf) within the Nomad environment, and checking the log revealed the following (server and organization intentionally redacted):
If you do (as did I) a web search for that error, you will learn it is related to a TCP/IP problem, specifically with IPv6 addresses.
Resolution Attempts
This Domino server sits behind an Nginx proxy (another thing with which I have some experience), so I immediately looked into that. I discovered that I did in fact have a misconfiguration for IPv6 addresses in my nginx Nomad configuration file. I made the correction (specifically, I disabled IPv6, as it is not being used on this server), restarted the nginx service, and then restarted the Domino server. This solved nothing.
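For a quick sanity check of which address families are actually in play for a given host name, a small Python sketch along these lines can help. It is not part of the Nomad or nginx tooling; the function names and the `localhost` placeholder are assumptions made purely for illustration, and it only reports what the local resolver returns.

```python
import ipaddress
import socket


def address_family(addr: str) -> str:
    """Return "IPv4" or "IPv6" for a literal IP address string."""
    return f"IPv{ipaddress.ip_address(addr).version}"


def resolved_families(host: str, port: int = 443) -> dict[str, list[str]]:
    """Group the addresses a host name resolves to by IP version."""
    families: dict[str, list[str]] = {"IPv4": [], "IPv6": []}
    for _family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        # sockaddr[0] is the address; drop any zone id such as "%eth0".
        addr = sockaddr[0].split("%", 1)[0]
        label = address_family(addr)
        if addr not in families[label]:
            families[label].append(addr)
    return families


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute the proxy or Domino host name.
    print(resolved_families("localhost"))
```

If the IPv6 list comes back non-empty for a host that is only supposed to be served over IPv4, that is a hint to look at the listener and resolver settings on the proxy.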
I then re-checked the documentation, searching for something I may have missed. There is specific documentation for Using the Nomad server on Domino behind a reverse proxy / load balancer, and I looked through that (verifying ports, server name, certificates, headers, etc.). Everything seemed correct.
- My next step was to revert back to 1.0.10, which worked just fine.
- “Perhaps”, I thought to myself, “I can’t jump directly from 1.0.10 to 1.0.13”. So I upgraded to 1.0.12. Which worked perfectly.
- Then I upgraded to 1.0.13. And it broke again.
Sigh.
Call for Help
At this point, I had spent many hours fighting with this upgrade, a process which normally takes at most 15 or so minutes. I was tired and frustrated and certain I was missing something. So I reached out for help from some of the other awesome people here (David Kennedy, Rick Gallaspy, and Yao Cheng).
One of the very first things they asked, which of course you already know, was “RTFM?”
And of course my answer was a resounding yes I did.
So the four of us spent the next several hours doing various screen shares, testing, configuration verifications, restarts, etc. All to no avail.
It was getting late, we were getting tired. Yao asked me to generate a Problem Report from Nomad and send it to him. This was functionality of which I was unaware, and let me tell you it is WAY FREAKING COOL!
He reviewed my report that evening, and did some testing on his development server.
Solution
The next morning Yao asked me to make a configuration change in my nginx configuration and test it out.
I did so, and it worked perfectly.
If you take a second look at that screenshot of our chat session, you will notice he also gently provided me with the URL to the Nomad documentation.
You know, the documentation I reviewed?
Well, in my “review”, I missed something extremely important.
Sigh. Boy did I ever feel like an idiot.
Lessons Learned
I shared this somewhat embarrassing tale to try and save you from doing the same kind of thing. There are three very important lessons here:
FAMILIARITY BREEDS CONTEMPT
Just because you know something extremely well doesn’t mean you can’t miss something important. Years of experience can be humbled by one single sentence.
READ THE MANUAL
Don’t just glance over the instructions, release notes, and what’s new information. There is a reason the documentation is provided. Take the time to actually read and comprehend what has been written.
ASK FOR HELP
Never ever be afraid to reach out and ask somebody else to help you out. The simple act of explaining the problem to somebody else can sometimes help you discover the solution.
I hope this helps!
-Spanky.