OVM series part 2: What to collect when opening an SR

Intro

In my last post, I described a few commands of the Oracle VM (OVM) Manager CLI. This time, the topic is how to identify the relevant information about your OVM environment, as well as the logs and their location for each component.
OVM Servers and OVM Manager have their own logs, and when issues occur it’s good to have them handy before reaching out to Oracle Support. Checking the details of your configuration up front also helps you quickly communicate the right release number of each component along with its state.

Check the configuration

Oracle VM Manager

The configuration can be checked with the ovm_admin command. The content is usually set by default.

[root@em1 ~]# /u01/app/oracle/ovm-manager-3/bin/ovm_admin --listconfig

Oracle VM Manager Release 3.4.4 Admin tool

Oracle VM Manager Configuration
Database type               : MySQL
Database Server hostname    : localhost
Database name               : ovs   <-- Database storing the ovmm metadata
Database Listener port      : 49500
Oracle VM Database user     : ovs
WebLogic Server admin       : weblogic
Oracle VM Manager admin     : admin

Or by checking the configuration file, where you can find the OVMM UUID and build number, which are useful in case of a reinstall/restore.

[root@em1 ~]# cat /u01/app/oracle/ovm-manager-3/.config
DBTYPE=MySQL
DBHOST=localhost
SID=ovs
LSNR=49500
OVSSCHEMA=ovs
APEX=8080
WLSADMIN=weblogic
OVSADMIN=admin
COREPORT=54321F
UUID=0004fb00000100007cc584fa7bf6e57f
BUILDID=3.4.4.1709   <--- exact build number
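
If you need the build number or UUID in a script (for example, to paste it into the SR), a small helper can pull a single value out of the KEY=VALUE file. This is only a sketch: the function name get_config_value is hypothetical, and it assumes one KEY=VALUE pair per line, as in the .config file above.

```shell
# Print the value of KEY from a KEY=VALUE file such as the OVM Manager .config.
# Usage: get_config_value KEY FILE   (hypothetical helper, not an OVM tool)
get_config_value() {
    local key="$1" file="$2"
    # Keep only lines starting with "KEY=", strip that prefix, print the first hit.
    sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Example (on the manager host):
#   get_config_value BUILDID /u01/app/oracle/ovm-manager-3/.config
#   get_config_value UUID    /u01/app/oracle/ovm-manager-3/.config
```

The helper prints nothing when the key is absent, so an empty result is a hint that the file is not the one you expected.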

Here you can have a look at the ovmm service and the MySQL backup configuration.

[root@em1 ~]# cat /etc/sysconfig/ovmm
JVM_MEMORY_MAX=4096m
JVM_MAX_PERM=512m
RUN_OVMM=YES
DBBACKUP=/u01/app/oracle/mysql/dbbackup
DBBACKUP_CMD=/opt/mysql/meb-3.12/bin/mysqlbackup
UUID=0004fb00000100007cc584fa7bf6e57f

Oracle VM server

The configuration can be checked with the xm commands, which interact with the Xen hypervisor on the OVM host.

[root@ovm-01 ~]# xm info
host                   : ovm-01
release                : 4.1.12-124.20.3.el6uek.x86_64
version                : #2 SMP Thu Oct 11 17:47:32 PDT 2018
machine                : x86_64
nr_cpus                : 24
nr_nodes               : 2
cores_per_socket       : 6
threads_per_core       : 2
cpu_mhz                : 3059
hw_caps                : bfebfbff:2c100800:00000000:01703f00:029ee3ff:00000000:xxx:000
virt_caps              : hvm hvm_directio
total_memory           : 147447
free_memory            : 75478
free_cpus              : 0
xen_major              : 4
xen_minor              : 4
xen_extra              : .4OVM
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
xen_commandline        : placeholder dom0_mem=max:3792M allowsuperpage dom0_vcpus_pin dom0_max_vcpus=20 crashkernel=512M@64M
cc_compiler            : gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18.0.7)
cc_compile_by          : mockbuild
cc_compile_domain      : us.oracle.com
cc_compile_date        : Thu Sep  6 08:24:27 PDT 2018
xend_config_format     : 4

-- list the vms in the ovm host

[root@ovm-server04 ~]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
0004fb00000600000f199f55466d4b84            14  8199     1     -b---- 100696.0
0004fb00000600001a440b79c4e6e194            20  4099     2     -b----  36437.3
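
When comparing this listing with total_memory and free_memory from xm info, it can help to total the memory assigned to the listed domains. A minimal sketch, assuming the column layout shown above (Mem is the third column, in MiB) and domain names without spaces; the helper name total_domain_mem is hypothetical.

```shell
# Sum the Mem column (MiB) of `xm list` output read from stdin.
# total_domain_mem is a hypothetical helper, not part of the xm toolset.
total_domain_mem() {
    # NR > 1 skips the header row; $3 is the Mem column in the layout above.
    awk 'NR > 1 { sum += $3 } END { print sum + 0 }'
}

# Example (on an OVM server):
#   xm list | total_domain_mem
```

Every listed row is counted, so the total covers whatever domains xm list reports on that host.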

Directory     Purpose
------------  -------------------------------------------------------------------------
/etc/xen      Contains OVM Server configuration files for the OVM Server daemon and virtualized guests.

OVM agent

You can check the version by running an rpm query on the agent package, or check the agent's service status.

[root@ovm-04 ~]# rpm -aq ovs-agent
ovs-agent-3.4.5-11.el6.x86_64

[root@ovm-04 ~]# service ovs-agent status

Available logs

Oracle VM Server directories
Log files are usually located under the /var/log directory; each of them records information for a specific module.

Directory          Purpose
-----------------  -----------------------------------------------------------------------
/var/log           Contains the OVM Agent log file, ovs-agent.log.
                   Contains the ovmwatch.log, which logs virtual machine life cycle events.
                   Contains the ovm-consoled.log, which logs remote VNC console access, and all communication with OVM Manager.
/var/log/xen       Contains OVM Server log files.
/var/log/messages  Contains OVM Server messages.

Log file                       Purpose
-----------------------------  ----------------------------------------------------------------
/var/log/xen/xend.log          All OVM Server daemon actions. Same output as the xm log command.
/var/log/xen/xend-debug.log    More detailed logs of the OVM Server daemon.
/var/log/xen/xen-hotplug.log   Log of hotplug events, written if a device or network script doesn’t start up or become available.
/var/log/xen/qemu-dm.pid.log   Log for each hardware virtualized guest. Replace pid in the file name with the actual process ID.
/var/log/ovs-agent.log         Log for the OVM Agent.
/var/log/osc.log               Log for the OVM Storage Connect plug-ins.
/var/log/ovm-consoled.log      Log for the OVM virtual machine console.
/var/log/ovmwatch.log          Log for the OVM watch daemon.
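
Not every file in this list exists on every server (osc.log, for instance, depends on the plug-ins in use), so before bundling logs by hand it helps to filter the list down to what is actually there. A small sketch; the helper name list_readable_logs is hypothetical, and the tar example assumes GNU tar.

```shell
# Print each given path that is a regular, readable file, one per line.
# list_readable_logs is a hypothetical helper for illustration.
list_readable_logs() {
    for f in "$@"; do
        if [ -f "$f" ] && [ -r "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Example (on an OVM server; -T - makes GNU tar read the file list from stdin):
#   list_readable_logs /var/log/ovs-agent.log /var/log/osc.log \
#       /var/log/ovm-consoled.log /var/log/ovmwatch.log /var/log/xen/xend.log \
#     | tar -czf /tmp/ovs-logs.tar.gz -T -
```

The archive name /tmp/ovs-logs.tar.gz is just a placeholder; use whatever location has enough free space.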

Oracle VM Server Command-Line Tools
Besides tailing ovs-agent.log, you can also use the xm diagnostic commands.

Command    Purpose
---------  ----------------------------------------------------------
xentop     Displays real-time CPU/Mem information about OVM Server and domains.
xm dmesg   Displays log information on the hypervisor.
xm log     Displays log information of the OVM Server daemon.

OVM manager

The main log file is AdminServer.log, which exists in multiple versions because of log rotation.

Directory                                                                   File
--------------------------------------------------------------------------  --------------------------
/u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/  AdminServer.log
                                                                            access.log
                                                                            AdminServer-diagnostic.log

You can also use an OVM tool called OvmLogTool.py to generate an error summary log based on all AdminServer.log versions.

[root@ovm-manager01 bin]# cd /u01/app/oracle/ovm-manager-3/ovm_tools/bin
[root@ovm-manager01 bin]# python OvmLogTool.py -s -o summary
processing input file: /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/AdminServer.log00001
processing input file: /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/AdminServer.log

[root@ovm-manager01 bin]# ll summary
-rw-r--r--. 1 root root 985903 Jul 25 06:25 summary


SR related diagnostic logs

Oracle Support will probably ask you for the following logs right after you open the SR.

SOSREPORT

This report collects diagnostic and configuration information from the OVM Manager host and all linked OVM servers,
e.g. rpm versions, syslog, network config, filesystems, disk partition details, loaded kernel modules, and the status of all services.

[root@em1 ~]# sosreport -v
sosreport (version 3.4)
Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [em1]:
Please enter the case id that you are generating this report for []: 3-236xxxxx

Setting up archive ...
Setting up plugins ...
Running plugins. Please wait ...
Running 94/94: yum...
Creating compressed archive...
Your sosreport has been generated and saved in:
/var/tmp/sosreport-em1.3-236xxx-YYYYMMDD.tar.xz

Note: sometimes the report fails to run on one of the OVM servers. In that case, just run it separately on each OVM server.

VMPINFO Diagnostic Tool For Oracle

VMPinfo is a script that collects diagnostic information for OVM, including the OVM Manager and all linked Oracle VM Servers. The script already runs sosreport, so it is enough to run VMPinfo alone, without a separate sosreport.

[root@em1]# cd /u01/app/oracle/ovm-manager-3/ovm_tools/support/
[root@em1 support]# ./vmpinfo3.sh --username=admin listservers
Enter OVM Manager Password:
The following server(s) are owned by this manager: ['ovm-01', 'ovm-02']

[root@ovm-manager01 support]# ./vmpinfo3.sh --username=admin

Enter OVM Manager Password:

Gathering files from all servers. This process may take some time.
Gathering OVM Model Dump files
Gathering sosreport from ovm-01
Gathering sosreport from ovm-02
Data collection complete
Gathering OVM Manager Logs
Clean up metrics
Copying model files
Copying DB backup log files
Running lightweight sosreport
Archiving vmpinfo3-20210725-073817

=======================================================================================
Please send /tmp/vmpinfo3-3.4.6.2424-20210725-073817.tar.gz to Oracle support
=======================================================================================


Check for Database Corruption in OVM Manager

OVM Manager operations and metadata are stored in a MySQL repository database. The database is automatically backed up daily, but if any corruption is detected the backups stop, and any changes made after the last good backup cannot be recovered in case of a crash.

[root@em1 mysql]# cd /u01/app/oracle/mysql/data
[root@em1 data]# cat my.cnf |grep log
log-error=/u01/app/oracle/mysql/data/mysqld.err
innodb_log_group_home_dir=/u01/app/oracle/mysql/data
innodb_log_buffer_size=256M
innodb_log_file_size=768M
innodb_flush_log_at_trx_commit=2
innodb_log_files_in_group=2
--
[root@em1 mysql]# tail mysqld.err


Consistency check

1. Look for ONF (Object Not Found) errors in OVMM DB.

[root@em1 ~]# /usr/bin/ovm_shell.sh -u admin
Password:
OVM Shell: 3.4.4.1709 Interactive Mode
--- Run below commands in this order in OVMM shell prompt.
>>> om = OvmClient.getOvmManager()
>>> f = om.getFoundryContext()
>>> f.fixupScan()
[11509]   <--- If the result is not empty then there is corruption and inconsistencies in OVMM DB.

2. Validate whether the daily AutoFullBackup DB backups have stopped working.

[root@em1 ~]# ls -ldt /u01/app/oracle/mysql/dbbackup/AutoFullBackup*
/u01/app/oracle/mysql/dbbackup/AutoFullBackup-20210722_222459   <--- must be <=24h old
/u01/app/oracle/mysql/dbbackup/AutoFullBackup-20210721_222433
…
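
The 24-hour rule can also be checked without reading the timestamps by eye. The sketch below assumes that the modification time of each AutoFullBackup entry reflects when the backup was taken (the same assumption ls -t makes); the helper name has_recent_backup is hypothetical, and -maxdepth/-mmin require a find that supports them, such as GNU find.

```shell
# Exit 0 if DIR holds an AutoFullBackup* entry modified within the last
# MINUTES minutes (default 1440 = 24h), non-zero otherwise.
# has_recent_backup is a hypothetical helper for illustration.
has_recent_backup() {
    local dir="$1" max_age_min="${2:-1440}"
    local newest
    # -mmin -N matches entries modified less than N minutes ago.
    newest=$(find "$dir" -maxdepth 1 -name 'AutoFullBackup*' \
                 -mmin "-${max_age_min}" 2>/dev/null | head -n 1)
    [ -n "$newest" ]
}

# Example (on the manager host):
#   has_recent_backup /u01/app/oracle/mysql/dbbackup || echo "No backup in the last 24h"
```

A non-zero result only says that no fresh backup was found; the reason still has to be looked up in the logs.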

3. Look for Object Not Found (ONF) and “Cluster is null” errors in the OVM Manager AdminServer logs.

[root@em1 ~]# egrep -iR "cluster is null|ObjectNotFound|inconsistencies" /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/AdminServer*
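
With several rotated AdminServer logs, a per-file count is often easier to read first than the full match output. A minimal sketch using the same patterns as the egrep above; the helper name count_corruption_hits is hypothetical, and -H assumes GNU grep.

```shell
# Print "file:count" for each given log, where count is the number of lines
# matching the patterns from the egrep above (case-insensitive).
# count_corruption_hits is a hypothetical helper for illustration.
count_corruption_hits() {
    # -c counts matching lines (not individual matches); -H always prints the file name.
    grep -H -icE 'cluster is null|ObjectNotFound|inconsistencies' "$@"
}

# Example (on the manager host):
#   count_corruption_hits /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/AdminServer*
```

As with plain grep, the exit status is non-zero when no file contains a match, which is the outcome you are hoping for here.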

Action: if corruption is confirmed, perform an OVM Manager DB rebuild (regeneration); see Doc ID 2038168.1.

Conclusion

This was just a preview of what you might look at when troubleshooting an OVM environment. In my experience, the most common issues are related to job locks or network stack hiccups. For the latter, Support will also ask you to run commands like “netstat -ltnp”, “nc -zv” or “tcpdump” between the manager and the OVM hosts. In the next post, we will cover the backup and recovery tools available for OVM.

Thank you for reading