postgres process using excessive CPU
I'm basically going to answer my own question here, but I wanted to document this for any future inquiries on the same topic.
I received an RMA unit (LTM 1600) to replace a dead system. I booted it up, did the basic config, and rejoined it to my cluster. All seemed happy until I noticed one big problem... CPU usage was in excess of 70%, even though this was the passive node.
Looking at the "top" output, I saw "postgres" using 70% or more continuously. I had to research what that was and why it's there: it's an internal PostgreSQL service that I think may be used by ASM/AFM (neither of which we use).
After some digging I noticed the /var/log/ltm logfile had grown enormous... over 700M in a day. Peeking at the tail of it, I saw that the "pgadmind" service, which kicks off that PostgreSQL instance, was restarting continuously in a horrible loop.
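For anyone triaging the same symptom, those checks boil down to a few commands (standard BIG-IP paths; this is a sketch of the shell session, not a script):

```shell
# Which process is eating CPU? (batch mode, one snapshot)
top -b -n 1 | head -15

# How big has the LTM log grown?
ls -lh /var/log/ltm

# What keeps looping? Watch for repeated pgadmind/postgres restarts.
tail -n 50 /var/log/ltm | grep -iE 'pgadmind|postgres'
```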
It complained that the file "global/pg_filenode.map" was missing... I had to hunt around to figure out the full path it looks for: "/var/local/pgsql/data/global/pg_filenode.map"
Sure enough, that directory existed but that file was nowhere to be found.
First, I stopped the service with "tmsh stop /sys service pgadmind" and tried copying just that one file from another system: "scp root@othersystem:/var/local/pgsql/data/global/pg_filenode.map /var/local/pgsql/data/global"
Then I ran "chown postgres:root /var/local/pgsql/data/global/pg_filenode.map" to set the correct ownership and started the service back up with "tmsh start /sys service pgadmind".
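Consolidated, that first single-file attempt looked like this on the affected unit ("othersystem" being the healthy peer; as described below, this alone turned out not to be enough):

```shell
# Stop the daemon that keeps respawning postgres
tmsh stop /sys service pgadmind

# Pull just the missing map file from the healthy peer
scp root@othersystem:/var/local/pgsql/data/global/pg_filenode.map \
    /var/local/pgsql/data/global/

# Give it the same ownership as on the working unit
chown postgres:root /var/local/pgsql/data/global/pg_filenode.map

# Bring the service back up
tmsh start /sys service pgadmind
```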
Well, crumb: the service still failed, but at least with a different message now... it complained about other files missing from that directory. Those files have names like "12790" and "12788_fsm", so I figured the map file must point to specific files that don't exist on this unit, since I'd copied it from another system.
To fix that, I wiped the whole directory on the troubled unit, copied all the files over from the working node, and restarted the service.
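In command form, the full resync was roughly this (a sketch, not a polished script; the recursive chown is my assumption that everything in the directory should be owned the same way the map file was, since scp as root would otherwise leave the copies owned by root):

```shell
# Stop the failing service first
tmsh stop /sys service pgadmind

# Wipe the damaged directory and pull the whole thing from the healthy peer
rm -rf /var/local/pgsql/data/global
scp -rp root@othersystem:/var/local/pgsql/data/global /var/local/pgsql/data/

# Assumption: the same postgres:root ownership applies to all files, not just the map
chown -R postgres:root /var/local/pgsql/data/global

tmsh start /sys service pgadmind
```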
Voila, that did it: pgadmind was now running happily, there were no more errors in the ltm log, and CPU usage was back to normal.
But I did wonder whether copying that directory from another unit was really the best approach... I can't say, since I don't really know what uses that database instance.
To be as sure as possible, I installed a clean 12.1.2 ISO plus the latest hotfix onto a different volume and booted into it. Actually, before booting into it, I mounted the "var" filesystem from that volume onto a temporary mount point so I could peek at its data/global directory. That "global" directory doesn't exist at all on a fresh install... and I'm pretty sure it's not a symlink to a shared location, so I now wonder whether I could have just stopped the service, deleted the global directory, and started it back up, letting it recreate the files from some default. I just have no PostgreSQL experience, so I have no idea whether that would work.
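For reference, inspecting the fresh volume without booting into it went roughly like this (the logical-volume name below is a placeholder, not the real one from my box; BIG-IP keeps each boot location's filesystems as LVM volumes, and `lvs` will list the actual names on yours):

```shell
# List the logical volumes to find the new boot location's _var filesystem
lvs

# Placeholder device name -- substitute the one lvs actually shows
mkdir -p /tmp/newvar
mount /dev/vg-db-sda/set.2._var /tmp/newvar

# On the fresh install this directory wasn't there at all
ls /tmp/newvar/local/pgsql/data/global

umount /tmp/newvar
```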
Anyway, I chalk this up to the RMA unit having arrived with a damaged or missing map file for whatever random reason... maybe they did a dirty shutdown before packaging it up for shipping, or gamma rays hit that exact spot on the hard drive. Whatever the case, I'm glad I was able to remedy it with only a few extra hours of work, and glad it wasn't a hardware issue.
If anyone else sees something like this, hopefully this helps (and if anyone has a better idea of how to properly fix this type of issue, I'd be glad to hear it, since I'm still not sure I did it the correct way).