5 Explore Diagnostic Insights

Autonomous Health Framework Insights (AHF Insights) provides deeper diagnostic insights into the Oracle diagnostic collections gathered by AHF diagnostic utilities, Oracle Exachk, Oracle Trace File Analyzer, Exawatcher, and Cluster Health Monitor.

AHF Insights is bundled with the AHF installer package. You do not need to run a web server to host the report pages.

Note:

The current version is supported on Linux only.

5.1 Introduction to AHF Insights

AHF Insights provides a bird's eye view of the entire system with the ability to further drill down for root cause analysis.

Note:

Starting in AHF 23.8, the plotly.js dependency on a CDN has been removed so that customers can use AHF Insights in restrictive environments.

Previously, results from different AHF components were not available in a single dashboard, which made them hard to combine and correlate. To address this, AHF Insights provides a web-based graphical user interface for all diagnostic data collectors and analyzers that are part of the AHF kit. The interface does not require a web server to host its pages.

AHF performs a contextual diagnostic collection for a given period to analyze the performance of database systems. The collection includes diagnostic data from various AHF features such as:
  • Configuration
  • Environment Topology
  • Metrics
  • Logs
This diagnostic data collected from the system passes through AHF Insights, which in turn produces an offline report with analysis in the following areas:
  • System Configuration
  • System State
  • Anomalies in the Operating System
  • Best Practices Compliance
  • System Traces
  • Root cause for issues and fixes in some of the anomalous cases
To get started, run the following command:
ahf analysis create --type insights

Example 5-1 ahf analysis create --type insights

[root@node02 ~]# tfactl print status

.-----------------------------------------------------------------------------------------------.
| Host   | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+--------+---------------+--------+------+------------+----------------------+------------------+
| node02 | RUNNING       | 134679 | 5000 | 22.3.0.0.0 | 22300020221031131221 | COMPLETE         |
| node01 | RUNNING       | 128438 | 5000 | 22.3.0.0.0 | 22300020221031131221 | COMPLETE         |
'--------+---------------+--------+------+------------+----------------------+------------------'
[root@node02 ~]# ahf analysis create --type insights --last 2h
Starting analysis and collecting data for insights
Collecting data for AHF Insights (This may take a few minutes per node)
AHF Insights report is being generated for the last 2h
From Date : 11/20/2022 01:16:41 UTC - To Date : 11/20/2022 03:17:15 UTC
Report is generated at : /opt/oracle.ahf/data/repository/collection_Sun_Nov_20_03_16_36_UTC_2022_node_all/cgexa-ogmn12_insights_2022_11_20_03_18_13.zip
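
Because the report is an offline bundle, viewing it only requires extracting the zip file and opening a page in a browser. The following Python sketch illustrates that step. The helper name extract_report is hypothetical, and the layout inside the archive is not described in this chapter, so the sketch only assumes that the zip contains at least one HTML page.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_report(zip_path, dest_dir=None):
    """Extract an AHF Insights report zip and return the HTML files in it.

    The archive layout is an assumption: this lists every .html file and
    leaves the choice of entry page to the caller.
    """
    if dest_dir:
        dest = Path(dest_dir)
    else:
        # No destination given: extract into a fresh temporary directory.
        dest = Path(tempfile.mkdtemp(prefix="ahf_insights_"))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(dest.rglob("*.html"))
```

Each returned path can then be opened locally, for example with webbrowser.open(page.resolve().as_uri()).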

5.2 AHF Insights - Home

The Home page summarizes the System State, System Types, and Time Range for which the report was generated. It also provides a brief overview of the system topology and insights into the diagnostics collected by the various AHF components.

Click an item on the Home page to view its details. You can have more than one item open at the same time. An underscore highlights the item in focus. Use the up/down arrow keys on your keyboard to scroll the page. To get to the top of the page, click the Scroll to top button. To close an open item, click the X mark.

Figure 5-1 Home

Provides a summary of system topology and insights.

System Topology

  • Cluster: Provides a summary of cluster and cluster resources, and ASM details.
  • Databases: Provides basic and detailed information about Oracle Databases running on the system.
  • Database Servers: Provides basic information about database servers.
  • Storage Servers: Provides basic information about storage servers.
  • Fabric Switches: Provides basic information about RDMA Network Fabric Switches.

Insights

  • Timeline: Provides Timeline visualization in a graph and provides a table with specific information about each timestamp.
  • Operating System Issues: Provides details about the metrics collected on the system and a detailed report on operating system anomalies.
  • Best Practice Issues: Provides the results of Best Practices Compliance checks run on the system, paginated.
  • System Change: Provides details on the changes applied to the system, paginated.
  • Recommended Software: Lists recommended software and links to supported versions.
  • Database Server: Provides details about the Management Server metrics and the alerts recorded in the Management Server.
  • RPM List: Lists RPMs and the differences between them across nodes, paginated.
  • Database Parameters: Lists normal and hidden Oracle Database parameters, paginated.
  • Kernel Parameters: Lists the kernel parameters, paginated.
  • Space Analysis: Renders Disk Utilization and Diagnostic Space Usage data in visual and tabular format.

5.2.1 System Topology

Note:

The Fabric Switches and Storage Servers sections are not displayed in a non-Exadata environment.

5.2.1.1 Cluster

AHF 23.8

Starting in AHF 23.8, you can copy data in text format to the clipboard and paste it into the SR body when raising a service request.

How does AHF Insights UI render this information

Provides a summary of cluster and cluster resources, and ASM details.

Cluster Summary

Figure 5-2 Cluster Summary

Provides a summary of the cluster.

Provides a brief overview of the System, Oracle Grid Infrastructure, Incident, Oracle Database, Database Server, Storage Server, and RDMA Network Fabric Switch.

  • Click the arrow located inside the Database Home section to get Database Home details.
  • Click Copy as text to copy the cluster summary into the clipboard.

Cluster Resources

Figure 5-3 Cluster Resources

Provides a summary of cluster resources.

Provides details on the cluster resources, Oracle Database, pluggable database (PDB), and listener, paginated. The details include CRS resource names, online or offline statuses of the targets, state of the resources, and the servers on which the resources are running.

  • Click the Expand All toggle button to view details of all cluster resources.

ASM Details

Figure 5-4 ASM Details

Provides a summary of ASM details.

Provides details on the nodes, instance names, online or offline statuses of the nodes, disk group names, online or offline statuses of the disk groups, and percentage of disk usage, paginated.

5.2.1.2 Databases

AHF 23.8

Starting in AHF 23.8, you can copy data in text format to the clipboard and paste it into the SR body when raising a service request.

How does AHF Insights UI render this information

Figure 5-5 Databases

Provides basic and detailed information about Oracle Databases running on the system.

  • Click Expand All to view detailed information on all items or click an arrow button to view detailed information on a specific item.
5.2.1.3 Database Servers

AHF 23.8

Starting in AHF 23.8, you can copy data in text format to the clipboard and paste it into the SR body when raising a service request.

How does AHF Insights UI render this information

Figure 5-6 Database Servers

Provides basic information about database servers.
  • Sort by Attribute, Target, and Value fields.
5.2.1.4 Storage Servers

AHF 23.8

Starting in AHF 23.8, you can copy data in text format to the clipboard and paste it into the SR body when raising a service request.

How does AHF Insights UI render this information

Figure 5-7 Storage Servers

Provides basic information about storage servers.
  • Sort by Attribute, Target, and Value fields.
5.2.1.5 Fabric Switches

AHF 23.8

Starting in AHF 23.8, you can copy data in text format to the clipboard and paste it into the SR body when raising a service request.

How does AHF Insights UI render this information

Figure 5-8 Fabric Switches

Provides basic information about RDMA Network Fabric Switches.
  • Sort by Attribute, Target, and Value fields.

5.2.2 Insights

Note:

The Database Server section is not displayed in a non-Exadata environment.

5.2.2.1 Timeline

How does AHF Insights UI render this information

Figure 5-9 Timeline - Host Faceted View

Provides a timeline visualization in a graph.

Figure 5-10 Timeline - Event Faceted View

Provides Timeline visualization in a table with specific information about each timestamp.

Provides a timeline visualization in a graph, together with a table that gives specific information about each timestamp.

  • Search values using the filter section.
  • Filter by a specific time range.
  • Click the arrow located at the left side of a specific date to view detailed information regarding a specific timestamp.
  • Hover over a specific data point in the graph to get detailed information about that specific point in time.
  • Zoom into the timeline.
5.2.2.2 Operating System Issues

AHF 23.8

AHF 23.8 includes the following enhancements to the user interface to make it more intuitive and easier to use.

You can:

  • Spot the disks that have anomalies. In the Operating System Issues tab, under Local IO, click Disk to view Disk Metrics. Disks that have anomalies are marked with an X mark.
  • Explore process aggregates from the operating system details in a more intuitive way.
    • Process aggregates are demarcated by instance group, such as Databases, ASM, APX (Apex), IOS, Clusterware, and so on.
    • Legends are specific to each category instead of a single legend for all categories.

AHF 23.10

AHF 23.10 includes the following enhancements to the user interface to make it more intuitive.

You can now view data in problematic time ranges in plots with more data points.

Problematic time ranges have the following reading intervals:
  • 5 seconds for ranges less than 1 minute
  • 30 seconds for ranges more than 1 minute

The number of data points in plots under the Operating System Issues section is dynamic, which keeps the time taken to generate the report down.

For longer time ranges, the number of data points is reduced. The reading intervals are:
  • 1 minute for ranges up to 4 hours
  • 3 minutes for ranges greater than 4 hours and less than 12 hours
  • 5 minutes for ranges greater than 12 hours
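
The two lists of reading intervals above amount to a lookup from the length of a time range to a sampling step. The following Python sketch restates them as a function to show how the number of plotted points scales. It is not AHF code: the function names are hypothetical, and the handling of ranges of exactly 1 minute or exactly 12 hours is an assumption, because the lists do not cover those boundaries.

```python
from datetime import timedelta


def reading_interval(range_length, problematic=False):
    """Return the reading interval for a time range, per the lists above.

    Boundary handling is an assumption: the text does not say which
    interval applies to exactly 1 minute or exactly 12 hours.
    """
    if problematic:
        if range_length < timedelta(minutes=1):
            return timedelta(seconds=5)
        return timedelta(seconds=30)  # assumed to include exactly 1 minute
    if range_length <= timedelta(hours=4):
        return timedelta(minutes=1)
    if range_length <= timedelta(hours=12):  # assumed to include exactly 12 hours
        return timedelta(minutes=3)
    return timedelta(minutes=5)


def approx_points(range_length, problematic=False):
    """Approximate number of data points a plot of this range contains."""
    return int(range_length / reading_interval(range_length, problematic))
```

For example, approx_points(timedelta(hours=6)) evaluates to 120, because a 6-hour range is read every 3 minutes.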

AHF 23.11

Starting in AHF 23.11, you will be able to view data coming from Exawatcher.

You can explore data coming from cell nodes in a visual format. You can switch between cell nodes, tagged as (S), and compute nodes, tagged as (D), from a drop-down list. You can examine Flash Disks, Flash Disk Aggregates, and Hard Disk Aggregates metrics.

How does AHF Insights UI render this information

Figure 5-11 Operating System Issues - Configuration

Provides details about the metrics collected on the system and a detailed report on operating system anomalies.

Figure 5-12 Operating System Issues - Metrics

Provides details about the metrics collected on the system and a detailed report on operating system anomalies.

Figure 5-13 Operating System Issues - Metrics - Disk Anomalies

Provides details about the metrics collected on the system and a detailed report on operating system anomalies.

Figure 5-14 Operating System Issues - Metrics - Process Aggregation

Provides details about the metrics collected on the system and a detailed report on operating system anomalies.

Figure 5-15 Operating System Issues - Metrics - Local I/O - Flash Disks

Provides details about the metrics collected from Exawatcher.

Figure 5-16 Operating System Issues - Metrics - Local I/O - Flash Disk Aggregation

Provides details about the metrics collected from Exawatcher.

Figure 5-17 Operating System Issues - Metrics - Local I/O - Hard Disk Aggregation

Provides details about the metrics collected from Exawatcher.

Figure 5-18 Operating System Issues - Report

Provides details about the metrics collected on the system and a detailed report on operating system anomalies.

Provides details about the metrics collected on the system and a detailed report on operating system anomalies. This page presents three views, Configuration, Metrics, and Report.

The Metrics view has tabs for CPU, Memory, System I/O, Process, Network interface, and Process aggregation metrics, with green and red icons indicating their statuses. From the drop-down list at the upper-right corner, select the node for which you want to view the metrics. Select a time range from the calendar widget to view metrics for that period.

Report view includes Summary Timeline and Observed Findings.

Configuration

This tab showcases the CPU, Memory, I/O, and Network configuration details of the systems from which operating system metrics were collected.

Metrics

System Overview

This tab showcases an overview of resources such as CPU, memory, processes, and I/O operations.

  • Hover over a specific point in time of the graph to get detailed information.
  • Zoom into the timeline.

CPU Metrics

This tab showcases CPU metrics.

  • Hover over a specific point in time of the graph to get detailed information.
  • Zoom into the timeline.

Memory Metrics

This tab showcases memory metrics.

  • Hover over a specific point in time of the graph to get detailed information.
  • Zoom into the timeline.

Local I/O Metrics

This tab showcases system I/O metrics.

  • Hover over a specific point in time of the graph to get detailed information.
  • Zoom into the timeline.

Process Metrics

This tab showcases process metrics.

  • Hover over a specific point in time of the graph to get detailed information.
  • Zoom into the timeline.

Network Metrics

This tab showcases network interface metrics.

  • Select different metrics related to network like interface, IP, UDP, and TCP.
  • Hover over a specific point in time of the graph to get detailed information.
  • Zoom into the timeline.

Process Aggregation Metrics

This tab showcases aggregation of process metrics.

  • Select a node from the Select Node drop-down list to view node-specific process aggregation metrics.
  • Hover over a specific point in time of the graph to get detailed information.
  • Zoom into the timeline.

Report

In the Report view, explore the findings in expandable drop-down sections with a full widescreen view.

You can:
  • View the event information in a subplot within the Summary Timeline Gantt chart.
  • Explore the top-ranked metrics in tables under a problem finding.
  • View the metrics associated with the problem finding in a visual format.
  • Drill down into the detailed state of the system at a specific problematic point in time under the Problematic Snapshots section. Problem-specific system snapshots are organized into drop-downs ordered by problem timestamp.

5.2.2.3 Best Practice Issues

AHF 23.8

Starting in AHF 23.8, AHF compliance checks from Oracle Orachk and Oracle Exachk are integrated into AHF Insights Best Practice Issues section.

AHF has thousands of Best Practice Compliance Checks, which are run automatically by AHF Oracle Orachk and Oracle Exachk. The results of these checks are viewable in HTML reports and output in JSON and XML for consumption into other tools. In addition, all Best Practice Compliance Checks are fully integrated into AHF Insights for running on-demand.

AHF Insights makes it easy to quickly see the Health Score, understand where systems are out of compliance and then take the necessary corrective action.

With this enhancement, you can:
  • Explore the best practice data in a visual format.
  • Filter best practices by status through the visualization and the Status drop-down list.
  • Search checks from all sections of best practice report.
  • View the best practice report in a vertical fashion.
  • See the health score with a visual distribution of checks that have failed.
Continue to use the Oracle Orachk / Oracle Exachk commands for automated scheduled runs, but for on-demand compliance investigation, generate an AHF Insights report:
ahf analysis create --type insights

How does AHF Insights UI render this information

Figure 5-19 Best Practice Issues

Provides the results of Best Practices Compliance checks run on the system, paginated.

Provides the results of Best Practices Compliance checks run on the system, paginated. The details include the list of Components on which the checks were run, the Best Practice Checks that were run on the components, and Statuses of those checks.

  • Hover the mouse pointer over the doughnut chart and the stacked bar chart to view details in a tooltip.
  • Filter the checks by severity - CRITICAL, FAIL, WARN, PASS, and INFO.
  • Use the Component drop-down list to navigate to different sections of the report.
  • Click the arrow on each entry in the table to view details of one specific issue.
5.2.2.4 System Change

How does AHF Insights UI render this information

Figure 5-20 System Change

Provides details on the changes applied to the system, paginated.

  • Search values using the filter section.
  • Filter by a specific time range.
5.2.2.5 Recommended Software

AHF 23.8

Starting in AHF 23.8, you can copy data in text format to the clipboard and paste it into the SR body when raising a service request.

How does AHF Insights UI render this information

Figure 5-21 Recommended Software

Lists recommended software and links to supported versions.

5.2.2.6 Database Server

How does AHF Insights UI render this information

Database Server includes two sections: Management Server Metrics, and Alerts recorded in the Management Server across Hardware, Software, and ADR.

Metrics

Figure 5-22 Database Server Metrics

This page showcases Management Server metrics.

The Metrics section provides details about Management Server metrics such as disk controller battery charge, CPU utilization, CPU time used by the Management Server, total space utilized on the file system, and so on.

  • From the drop-down list, select a host for which you want to view the metrics.

Alerts

Figure 5-23 Database Server Alerts

This page showcases Management Server alerts.

The Alerts section provides details about the alerts recorded in the Management Server across hardware, software, and ADR. The Alerts section has two views, Table and Graph.

Table view provides Alert Details such as description of the alert and the remedial action you can take, in tabular format.

  • Click the Expand All toggle button to view details of all alerts.
  • Click the arrow to view detailed information about an alert.

Graph view categorizes the alerts by severity, such as Critical and Warning, and by type, such as Stateful and Stateless.

  • Click the Show open alerts toggle button to view the list of open alerts. The button is turned on by default.
5.2.2.7 RPM List

How does AHF Insights UI render this information

Figure 5-24 List of RPMs

Lists RPMs and the differences between them across nodes, paginated.

  • Enter the name of the RPM in the filter field to filter the RPMs.
  • The RPM Name, Version, Release, and Arch columns remain fixed.
  • Click the Show RPM differences toggle button to view the differences between the RPMs across nodes.
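
To make concrete what a difference between RPMs across nodes means, the following Python sketch compares per-node package inventories and keeps only the packages whose version differs or that are missing on some node. It is an illustration only, not how AHF Insights computes its view: the function name, the input format, and the sample package data in the usage note are hypothetical.

```python
def rpm_differences(rpms_by_node):
    """Return RPMs whose version differs, or which are missing, across nodes.

    rpms_by_node maps a node name to a {rpm_name: version} dictionary.
    The result maps each differing RPM name to {node: version or None},
    where None means the RPM is not installed on that node.
    """
    all_names = set()
    for packages in rpms_by_node.values():
        all_names.update(packages)

    differences = {}
    for name in sorted(all_names):
        per_node = {node: pkgs.get(name) for node, pkgs in rpms_by_node.items()}
        # More than one distinct value means the nodes disagree on this RPM.
        if len(set(per_node.values())) > 1:
            differences[name] = per_node
    return differences
```

Called with {"node01": {"bash": "4.4.20"}, "node02": {"bash": "4.4.20", "tmux": "2.7"}}, the function reports only tmux, which is missing on node01.
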
5.2.2.8 Database Parameters

How does AHF Insights UI render this information

Figure 5-25 Oracle Database Parameters - Normal

Lists normal and hidden Oracle Database parameters, paginated.

Figure 5-26 Oracle Database Parameters - Hidden

Lists normal and hidden Oracle Database parameters, paginated.

Lists normal and hidden Oracle Database parameters, paginated. This page also provides two views, Normal and Hidden. Normal view is displayed by default.

  • Enter the name of the parameter in the filter field to filter the parameters.
  • The Parameter column remains fixed and you can view the properties of each parameter across multiple databases.
  • Click the Show different properties across databases toggle button to view different properties across databases.
  • Click Hidden to view the hidden parameters.
5.2.2.9 Kernel Parameters

How does AHF Insights UI render this information

Figure 5-27 Kernel Parameters

Lists the kernel parameters, paginated.

  • Enter the name of the parameter in the filter field to filter the parameters.
  • The Parameter column remains fixed and you can view the properties of each parameter across multiple hosts/nodes.
  • Click the Show different properties across hosts/nodes toggle button to view different properties across hosts/nodes.
5.2.2.10 Patch Information

How does AHF Insights UI render this information

Figure 5-28 Patch Information

Provides information about the patches applied.

Figure 5-29 Patch Information - Timeline

Provides information about the patches applied.

Figure 5-30 Patch Information - Components

Provides information about the components.

Provides a list of patches to keep track of which patch was applied to which hosts, where (Oracle Database home or Grid home), and when (timeline). Gaps or inconsistencies in patching are highlighted across nodes for the same home. Lists bugs that a particular patch provides the fixes for. Bugs and relevant patch information can be quickly searched and viewed via interactive reports.

Patches tab:

  • In the Timeline section, double-click a patch ID to view on which host and when the patch was applied.
  • Enter the patch ID in the filter field to filter and view the patch details.
  • Under Applied Date, click the right arrow to view the list of bugs that a particular patch addresses.
  • Click the X mark to clear the filter.

Components tab: Provides a paginated list of components and their versions affected by applying the patches.

5.2.2.11 Space Analysis

How does AHF Insights UI render this information

Figure 5-31 Disk Utilization

Provides information about disk utilization.

Figure 5-32 Diagnostic Space Usage

Provides information about diagnostic space usage.

Disk Utilization tab: Provides a paginated view of host-wise directory structure, space consumed by directories and files, and available space.

Diagnostic Space Usage tab: Provides a paginated view of disk space consumed by diagnostics. Use the drop-down lists to filter by nodes and diagnostics collected.

5.2.2.12 Database Anomalies Advisor

AHF detects database anomalies and identifies the cause and corrective action.

The Database Anomalies Advisor shows a summary timeline of anomalies for hosts and database instances. You can drill into a finding to understand its cause and the recommended action.

To view the Database Anomalies Advisor and its recommendations, run ahf analysis create --type insights, open the resulting report, and click Database Anomalies Advisor.

Figure 5-33 Database Anomalies Advisor

Shows database anomalies, their causes, and recommended actions.

5.3 ahf analysis

Use the ahf analysis command to generate AHF Insights and AHF Balance reports.

AHF 23.8

Starting in AHF 23.8, you can upload AHF Insights reports automatically if Object Store is configured as part of AHF. Uploading AHF Insights reports helps Oracle Cloud Operations to identify, investigate, track, and resolve system health issues and divergences in best practice configurations quickly and effectively.

Oracle Autonomous Database on Dedicated Exadata Infrastructure and Oracle SaaS

To set the Object Store REST endpoint, run:
ahfctl setupload -name oss -type https -user <user> -url <object_store> -password
To upload the AHF Insights report to Object Store, run:
ahf analysis create --type insights

ahf analysis create

ahf analysis create [-h] [--type {insights|impact}] [[--last n{m|h} [--refresh] | --for DATETIME | --from DATETIME --to DATETIME] [--tag TAGNAME] | [--scope SCOPE --name NAME --cluster CLUSTER --clusters CLUSTER_LIST]] [--output-file PATH] [--to-json]

Syntax: AHF Balance

ahf analysis create [-h] --type impact --scope [fleet|cluster|database] [--cluster CLUSTER_NAME] [--clusters space-delimited list of clusters] --name NAME

Parameters

Table 5-1 ahf analysis create --type impact Command Parameters

Parameter Description

-h, --help

Show this help and exit.

--type impact

Specify the type of report to generate.

--scope [fleet|cluster|database]

Specify the scope of the AHF Balance report to generate: Fleet Report, Cluster Report, or Database Report.

Specify the --scope and --name options to create an impact analysis.

The --cluster option is required for database impact analysis.

--output-file PATH

Specify to create output file in the specified location.

--clusters clu1 clu2 clu3

Specify a space-delimited list of clusters to include in the fleet scope.

--name NAME

Specify the name of the fleet, cluster, or database to report on.

--user-name USER_NAME

Specify the Oracle Enterprise Manager Repository user name.

--connect-string CONNECT_STRING

Specify the connect string for the Oracle Enterprise Manager Repository.

Syntax: AHF Insights

ahf analysis create [-h] --type insights [--last n{m|h} | --for DATETIME | --from DATETIME --to DATETIME] [--refresh] [--tag TAGNAME]

Parameters

Table 5-2 ahf analysis create --type insights Command Parameters

Parameter Description

-h, --help

Show this help and exit.

--type insights

Specify the type of report to generate.

--last n{m|h}

Specify the --last parameter to analyze data for the past number of minutes (m) or hours (h).

--last cannot be greater than 12 hours.

--for <DATETIME>

Specify the --for parameter to analyze data for a 2 hour period before and after the timestamp specified.

Supported time formats:
"YYYY-MM-DDTHH:MM:SS"
"YYYY-MM-DD HH:MM:SS"

--from <DATETIME>

--to <DATETIME>

Specify the --from and --to parameters (you must use these two parameters together) to analyze data for a specific time interval.

Supported time formats:
"YYYY-MM-DDTHH:MM:SS"
"YYYY-MM-DD HH:MM:SS"
"YYYY-MM-DD"

Time difference between from and to time should not be more than 4 hours.

--refresh

Provides fresh data from AHF Insights sources.

Specify --refresh alone or together with --last to provide fresh data from AHF Insights sources.

--include-cell-data

Specify to include data from cell nodes in the AHF Insights sources.

--tag TAGNAME

Specify to collect the files into the TAGNAME directory inside the repository.

Syntax: ahf analysis explore

ahf analysis explore [-h] [--with scope] [--from-file FILE]

Parameters

Table 5-3 ahf analysis explore Command Parameters

Parameter Description

-h, --help

Show this help and exit.

--from-file FILE

Specify to read from a file.

If you do not specify the file extension, then AHF Scope assumes .mdb as the file extension.

Example 5-2 AHF Insights Analysis Usage Examples

Specify [--last | --for | --from --to] to create an analysis for a given period of time. Maximum time interval allowed is 4 hrs.

Specify [--refresh] alone or together with [--last] to provide fresh data from AHF Insights sources.

  • Create analysis report from the data collected in the last 3 hours:
    ahf analysis create --type insights --last 3h
  • Create analysis for a 2-hour period centered at the specified timestamp:
    ahf analysis create --type insights --for 2022-10-10T14:00:00
  • Create analysis for a given time range:
    ahf analysis create --type insights --from 2022-10-10T14:00:00 --to 2022-10-10T15:30:00
  • Create analysis specifying a timezone:
    ahf analysis create --type insights --from 2022-10-10T14:00:00 --to 2022-10-11T13:30:00
  • Create analysis with most recent data:
    ahf analysis create --type insights --refresh
  • Create analysis with a tag:
    ahf analysis create --type insights --tag my_tag
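
The option rules above can be checked before the command is run, for example when report generation is scripted. The following Python sketch builds the argument list and rejects combinations that conflict with the parameter descriptions. It is a hedged illustration, not part of AHF: the function name is hypothetical, the 12-hour and 4-hour limits are copied from Table 5-2, and the rule that --refresh goes only with --last follows the --refresh description. Check ahf analysis create -h for the limits that apply to your version.

```python
from datetime import datetime, timedelta

# Limits as stated in Table 5-2 (assumed to apply to your AHF version).
MAX_LAST = timedelta(hours=12)
MAX_FROM_TO = timedelta(hours=4)
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def insights_command(last=None, for_time=None, from_time=None, to_time=None,
                     refresh=False, tag=None):
    """Build an `ahf analysis create --type insights` argument list.

    last is a timedelta; for_time, from_time, and to_time are datetimes.
    Raises ValueError when the options conflict with the documented rules.
    """
    if (from_time is None) != (to_time is None):
        raise ValueError("--from and --to must be used together")
    if sum(opt is not None for opt in (last, for_time, from_time)) > 1:
        raise ValueError("use only one of --last, --for, or --from/--to")
    if refresh and (for_time is not None or from_time is not None):
        raise ValueError("--refresh is used alone or together with --last")

    cmd = ["ahf", "analysis", "create", "--type", "insights"]
    if last is not None:
        if not timedelta(minutes=1) <= last <= MAX_LAST:
            raise ValueError("--last must be between 1 minute and 12 hours")
        minutes = int(last.total_seconds() // 60)
        # Use whole hours when possible, otherwise minutes (n{m|h}).
        value = f"{minutes // 60}h" if minutes % 60 == 0 else f"{minutes}m"
        cmd += ["--last", value]
    elif for_time is not None:
        cmd += ["--for", for_time.strftime(TIME_FORMAT)]
    elif from_time is not None:
        if not timedelta(0) < to_time - from_time <= MAX_FROM_TO:
            raise ValueError("--from/--to interval must be at most 4 hours")
        cmd += ["--from", from_time.strftime(TIME_FORMAT),
                "--to", to_time.strftime(TIME_FORMAT)]
    if refresh:
        cmd.append("--refresh")
    if tag:
        cmd += ["--tag", tag]
    return cmd
```

On a host where AHF is installed, the returned list can be passed to subprocess.run(cmd, check=True). Returning a list rather than a string avoids shell quoting issues with tag names.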

Example 5-3 AHF Balance Usage Examples

Specify [--scope] and [--name] options to create an impact analysis.

The [--cluster] option is required for database impact analysis.

  • Create analysis for a fleet (all clusters):
    ahf analysis create --type impact --scope fleet --name fleet1
  • Create analysis for a fleet (cluster list):
    ahf analysis create --type impact --scope fleet --name fleet1 --clusters clu1 clu2 clu3
  • Create analysis for a cluster:
    ahf analysis create --type impact --scope cluster --name cluster1
  • Create analysis for a database:
    ahf analysis create --type impact --scope database --cluster cluster1 --name database1
  • Create analysis specifying the output file:
    ahf analysis create --type impact --scope fleet --name fleet1 --output-file /custom_path/custom_name.html
  • Create analysis specifying EM repository user name and connect string:
    ahf analysis create --type impact --scope fleet --name fleet1 --user-name oracle --connect-string <cs>
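
The AHF Balance rules stated above (--scope and --name are always needed, and --cluster is required for the database scope) can be enforced in the same way when the command is generated by a script. The following Python sketch is a hedged illustration with a hypothetical function name. Restricting --clusters to the fleet scope is an assumption drawn from the --clusters description in Table 5-1.

```python
def balance_command(scope, name, cluster=None, clusters=None, output_file=None):
    """Build an `ahf analysis create --type impact` argument list.

    Raises ValueError when required options for the chosen scope are missing.
    """
    if scope not in ("fleet", "cluster", "database"):
        raise ValueError("scope must be fleet, cluster, or database")
    if not name:
        raise ValueError("--name is required")
    if scope == "database" and not cluster:
        raise ValueError("--cluster is required for database impact analysis")
    if clusters and scope != "fleet":
        # Assumption: the cluster list only makes sense for a fleet report.
        raise ValueError("--clusters applies to the fleet scope only")

    cmd = ["ahf", "analysis", "create", "--type", "impact", "--scope", scope]
    if cluster:
        cmd += ["--cluster", cluster]
    cmd += ["--name", name]
    if clusters:
        cmd += ["--clusters", *clusters]
    if output_file:
        cmd += ["--output-file", output_file]
    return cmd
```

For example, balance_command("database", "database1", cluster="cluster1") produces the same arguments as the database example above.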