ExaCC dbaascli command with mysterious PILOT error when creating a PDB

This image has an empty alt attribute; its file name is image-16.png

Intro

I lately had a silly error while running dbaascli on an ExaCC cluster but the java output was not helpful at all.  This is a super short post that shows where to find dbaascli execution logs and also an example of what can break your command execution depending on who runs it.

My error

I  was just creating a PDB the other day after provisioning a CDB using dbaaspi, but when I run the command I had this odd error message.

  • Command output

[root@clvmd01 ~]# dbaascli pdb create --dbname MYCDB --pdbname UAT
DBAAS CLI version 21.4.1.1.0
Executing command pdb create --pdbname UAT
Job id: dbac1ebe-5a81-4bfa-b8da-a01177359abd
Loading PILOT...
[FATAL] [DBAAS-60022] Command '/var/opt/oracle/dbaastools/pilot/bin/pilot -plugin create_pdb_cloud 
PLUGIN_OPERATION_TYPE="create" DBNAME="MYCDB" PDBNAME="UAT" DBAASAPI_JOB_ID="dbac1ebe-5a81-4bfa-b8da-a01177359abd"  
-logLevel FINE  -logDir /var/opt/oracle/log/MYCDB/pdb/create  -silent   -checkpointDir /var/opt/oracle/log/pilot_checkpoints  
-jreLoc /usr/java/jdk1.8.0_291-amd64/jre ' execution has failed on nodes [localnode].
   ACTION: Refer application log file for more information.
*MORE DETAILS*     Result of node:localnode
ERRORS:
Exception in thread "main" java.lang.IllegalMonitorStateException  <------ what is that??
        at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:371)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131)
        at oracle.install.commons.pilot.JobManagementHelper.createSubmittedJobInfo(JobManagementHelper.java:179)
        at oracle.install.commons.pilot.PilotApplicationHandler.executeApp(PilotApplicationHandler.java:428)
        at oracle.install.commons.pilot.PilotApplicationHandler.performOperation(PilotApplicationHandler.java:253)
        at oracle.install.commons.pilot.PilotApplication.execute(PilotApplication.java:210)
        at oracle.install.commons.pilot.PilotApplication.startup(PilotApplication.java:142)
        at oracle.install.commons.pilot.PilotApplication.main(PilotApplication.java:428)
[Loading PILOT...]
Exit code of the operation:1

PILOT failing to load

  • IllegalMonitorStateException
    and the rest of the java stack didn’t look like anything to me and I kept getting the same error over and over.

Where to look for API execution logs in ExaCC?

When a database deployment is created on Oracle Database Exa@CC, log files from the creation operation are stored in subdirectories of /var/opt/oracle/log. I actually forgot the path when I had this problem but a colleague reminded me of it. This log directory isn’t only storing database creation logs but way more as shown in the dbaastools log layout.

dbaastools LOGs

Example: under the CDB that I have previously created in my vm cluster you can see the variety of subfolders

[oracle@clvmd01]$ ll /var/opt/oracle/log/MYCDB drwxrwx--- 2 oracle oinstall 4096 Mar 23 15:12 bkup drwxr-xr-x 2 oracle oinstall 4096 Mar 24 11:26 bkup_api_log drwxrwx--- 2 oracle oinstall 4096 Mar 24 00:01 cleandblogs –-> adrci cleanup drwxrwx--- 2 oracle oinstall 4096 Mar 23 15:12 creg --> cloud db registration drwxrwx--- 3 oracle oinstall   20 Mar 23 13:55 database  /create drwxrwx--- 3 oracle oinstall   16 Mar 23 13:55 dbaasapi /db /createdb  drwxrwx--- 2 oracle oinstall 4096 Mar 24 00:26 obkup drwxr-xr-x 5 oracle oinstall   46 Mar 24 11:29 pdb /create /open /delete /close drwxrwx--- 2 oracle oinstall    6 Mar 24 11:27 rman

  • As you can see, there are several logs related to other tools and operations in the log location . You will find log directories for dbaascli, dbaasapi, bkup_api,obkup, rman and ADR operations which is very handy when you have several databases to manage. 

Root cause

Now back to our small issue, since I know where to find my dbaascli command log. Let’s find out why the pilot was crashing.

[oracle@clvmd01]$ cat /var/opt/oracle/log/MYCDB/pdb/create/pilot_2022-03-23_04-26-45-PM

Refer associated stacktrace #oracle.install.commons.pilot.JobManagementHelper: ---# Begin Stacktrace #--------------------------- ID: oracle.install.commons.pilot.JobManagementHelper:78 java.io.FileNotFoundException: / var/opt/oracle/log/pilot_checkpoints/conf/jobs/lastJob.inf.lck (Permission denied)         at java.io.RandomAccessFile.open0(Native Method)         at java.io.RandomAccessFile.open(RandomAccessFile.java:316)         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)         at oracle.install.commons.pilot.util.ConcurrentFileLockUtil.acquireLock(ConcurrentFileLockUtil.java:152) at oracle.install.commons.pilot.JobManagementHelper.createSubmittedJobInfo(JobManagementHelper.java:129)         at oracle.install.commons.pilot.PilotApplicationHandler.executeApp(PilotApplicationHandler.java:428)         at oracle.install.commons.pilot.PilotApplicationHandler.performOperation(PilotApplicationHandler.java:253)         at oracle.install.commons.pilot.PilotApplication.execute(PilotApplication.java:210)         at oracle.install.commons.pilot.PilotApplication.startup(PilotApplication.java:142)         at oracle.install.commons.pilot.PilotApplication.main(PilotApplication.java:428)

---# End Stacktrace #----------------------------- INFO: [2022-03-23 16:26:45.670 CDT][Thread-2][PilotApplicationHelper$1.run:227] Shutting down the pilot application.

When I checked the job file that was behind the concurrent lock, I realized that the file was owned by root the first time and all subsequent runs as oracle couldn’t overwrite it as shown below.

[oracle@clvmd01]$ ll /var/opt/oracle/log/pilot_checkpoints/conf/jobs/ -rw-r----- 1 oracle oinstall 54 Mar 23 14:41 lastJob.inf -rw-rw---- 1 root   root      0 Mar 23 12:53 lastJob.inf.lck

Solution

I just changed the permission of that job file, so my next run of that create pdb job will be able to write on the lck which fixed my problem.

[root@clvmd01]# chown oracle:oinstall /var/opt/oracle/log/pilot_checkpoints/conf/jobs/lastJob.inf.lck

Conclusion

Bottom line here is, although new cli tools usage is rapidly growing, it is as important to know the log location for these api based executions (dbaastools) than their syntax. You will always find more in the log directory than the explicit command output they show.

        Thank you for reading