6 Analyze Issue Root Cause

Autonomous Health Framework Scope (AHF Scope) is a standalone, interactive, real-time capable front-end to Cluster Health Advisor (CHA). AHF Scope requires a very small footprint on the monitored system.

CHA continuously monitors cluster nodes and Oracle RAC databases for performance and availability issue precursors to provide early warning of problems before they become critical.

Note:

GIMR is optionally supported in Oracle Database 19c. However, it's desupported in Oracle Database 23ai. For more information, see Removing Grid Infrastructure Management Repository.

6.1 Introduction to AHF Scope

AHF Scope is a standalone, interactive, real-time capable front-end to Oracle Cluster Health Advisor (CHA). AHF Scope requires a very small footprint on the monitored system.

AHF Scope is invoked using the ahfscope script available in the /opt/oracle.ahf/ahfscope/bin/ directory. AHF Scope is designed primarily for cluster or database experts. It is capable of handling large amounts of data efficiently. Its layout and mode of operation are designed for functional efficiency. Most operations can be executed using a positional pointer and Hot Keys, or a floating menu available at the cursor position.

If Grid Infrastructure Management Repository (GIMR) is configured, AHF Scope connects directly to GIMR using a JDBC connection and reads the current data in real-time. AHF Scope can also operate locally with no connection to GIMR by using a data archive extracted from GIMR.

6.2 Cluster View: Connecting and Basics of Monitoring

Start AHF Scope using the script ahfscope on Linux/UNIX or ahfscope.bat on Microsoft Windows. This script is located in the /opt/oracle.ahf/ahfscope/bin/ directory.

To connect to GIMR, the script obtains connection parameters using Oracle Wallet services. AHF Scope displays a top-level Cloud View upon starting and connecting to GIMR. AHF Scope will terminate upon closing the main window.

6.2.1 Basics of Navigation Through Entity Panels

This is the default window that will appear when AHF Scope is started:

Figure 6-1 AHF Scope default window

This image illustrates AHF Scope default window

The window contains a navigation tree on the left and an analysis panel on the right. The navigation tree always starts in a symbolic top-level system "OurCloud" and displays all components within its connected Cloud. AHF Scope currently connects to only one GIMR, and thus contains data for all monitored nodes and Oracle RAC databases within a single cluster.

The analysis panel contains three components:

  1. A system timeline at the top showing a condensed long term system history.
  2. A status line.
  3. An entity panel associated with the entity selected in the navigation tree.

The currently selected navigation tree entity is marked with a yellow background.

The analysis panel initially appears with a dark background called "Night Theme". A "Day Theme" displaying dark text on a light background is also available. You can toggle the themes by moving the cursor onto the entity panel and pressing lowercase "w", or by right-clicking to access the floating menu and selecting "Night/Day Mode".

Figure 6-2 Night/Day mode

This image illustrates setting Night/Day mode

This is the first example of how to use either a Hot Key or a floating menu to perform the same function. For more information about Hot Keys, see List of Hot Keys.

The main panel in Day Theme looks as follows:

Figure 6-3 Main Panel in Day Theme

This image illustrates main panel in Day Theme

This change is persistent. With every consecutive start, the most recent selected theme is used. For more information about the settings that AHF Scope persists, see Set of Persistent Settings.

Note:

The persistent storage for these settings is associated with a specific user on a specific host. These settings need to be repeated when using AHF Scope on multiple hosts.

The status timelines in the entity panel display for each timestamp the state of its entity marked either green, red, or gray.

  1. Green means that at that point in time, none of its sub-entities indicate any problem flagged as "abnormal".
  2. Red means that at least one of its sub-entities is in abnormal state. Mathematically, this means that AHF Scope calculates a union of the sets of abnormal states of all sub-entities and propagates the result upward to the topmost entity, which here is OurCloud.
  3. Gray marks periods in time in which no data was received for any of the sub-entities.
  4. Gray over green marks periods in time in which some sub-entities received no data.
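
The roll-up rule described in this list can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not AHF Scope code: the `State` names and the `roll_up` function are hypothetical, and the precedence of red over gray is an assumption, because the text does not say how a mix of abnormal and missing sub-entities is drawn.

```python
from enum import Enum


class State(Enum):
    GREEN = "green"
    RED = "red"
    GRAY = "gray"                        # no data for any sub-entity
    GRAY_OVER_GREEN = "gray over green"  # data missing for some sub-entities


def roll_up(children):
    """Combine the states of all sub-entities at one timestamp."""
    children = list(children)
    # Assumption: one abnormal sub-entity is enough to mark the parent red.
    if any(c is State.RED for c in children):
        return State.RED
    # No sub-entity reported anything at this timestamp.
    if not children or all(c is State.GRAY for c in children):
        return State.GRAY
    # Some, but not all, sub-entities are missing data.
    if any(c in (State.GRAY, State.GRAY_OVER_GREEN) for c in children):
        return State.GRAY_OVER_GREEN
    return State.GREEN
```

Applying `roll_up` level by level, from instances to databases to the cluster, would propagate a single red state up to the top-level entity.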

Click the cluster entity under OurCloud. AHF Scope now marks the cluster name with a yellow background, and the entity panel changes to display the cluster details.

Note:

When a cluster entity is selected, the status line under the timeline displays the number of hosts and databases in this cluster.

Figure 6-4 Number of hosts and databases in this cluster

This image illustrates number of hosts and databases in this cluster

In this hierarchy, the cluster entity contains two sub-entities "Hosts" and "Databases". The above display indicates that all red-marked abnormal states come from one of the databases only.

Any entity in the navigation tree may be expanded to show its sub-entities. Note that expanding the navigation tree does not change the entity panel on the right. This panel changes only when a different entity is selected in the tree.

Note the position of the cursor on the status timeline of the cluster. The entity name will change color to indicate that it has focus. Moving the cursor over the "Databases" status timeline switches the highlighting and focus accordingly. Focusing on a specific entity allows you to enter its panel without moving the mouse pointer to the navigation tree. To step down to the panel of any sub-entity, for example, "Databases" in the image above, you can choose between these methods:

  1. Select the entity in the navigation tree on the left. This requires a movement of the mouse pointer to the tree.
  2. Place the cursor on the entity name and double-click it.
  3. Use a Hot Key without moving the mouse pointer. Press Enter on your keyboard. AHF Scope will select the focused entity and enter its panel. The selected entity will be automatically expanded and highlighted in the navigation tree.

This is the second example of how to use Hot Keys. Instead of moving the pointer across distances to expand an entity in the navigation tree, focus the entity using one hand, and then hit the Enter key on your keyboard. This is by far the fastest way to traverse through the entities.

In this example, the entity "Databases" contains two databases, one of which indicates an abnormal state. When you select this database and step down to its panel, you will see timelines of all its instances.

Figure 6-5 Entity 'Databases' containing two databases

This image illustrates Entity 'Databases' contain two databases

From this example, you can clearly see that the state of the database is a union of states of its instances. This example also shows gray areas in the status bars of instances. They indicate absence of monitored values for some periods in time. The entity might have been stopped, terminated, evicted, or non-responsive. The only known fact is that not a single value was recorded for this entity at this point in time.

With the instances, we have reached one of the final sub-entities. These are the target entities of CHA. Every type of target entity has its own specialized panel in AHF Scope.

6.2.2 Target Entities

CHA currently supports two types of target entities: "Host" and "Instance". Using the same example as above, let's enter the panel of the instance cdb012, the second from the top:

Figure 6-6 Target Entities

This image illustrates two kinds of target entities: Host and Instance

On the panel of an instance (or a host), the top timeline represents the state of the target entity. Any status timeline below represents a time series of individual metrics collected for the target entity, which are referred to as signals or probes. The state of a target entity is represented by four colors:

  1. Green: The state is normal. None of the probes are in abnormal condition.
  2. Yellow: Entity state is still normal, but some of the probes are in abnormal condition.
  3. Red: Entity is in abnormal state. At least one problem (also called "decision") indicates the reason for the abnormal state. The height of the red bar changes with the number of problems at that point in time.
  4. Gray: Data was not received.

Every single point in the timeline of a target entity holds a set of values:

  1. State of the entity, which may be green, yellow, red, or gray (missing).
  2. Set of problems (if indicating red state).
  3. Set of probes in abnormal state (if indicating yellow state).
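
The relationship between the color of a time point and the sets it carries can be summarized in a small sketch. The function name and parameters below are hypothetical and only restate the rules listed above.

```python
def target_state(data_received, problems, abnormal_probes):
    """Return the color of one time point on a target entity's timeline.

    data_received   -- False when no sample arrived for this time point
    problems        -- set of diagnosed problems at this time point
    abnormal_probes -- set of probes flagged abnormal at this time point
    """
    if not data_received:
        return "gray"
    if problems:
        return "red"      # at least one problem explains the abnormal state
    if abnormal_probes:
        return "yellow"   # entity still normal, some probes abnormal
    return "green"
```

For example, a sample with no problems but with the probe "Database time" flagged abnormal would be drawn yellow.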

The set of indicated probes may change with every time point. By default, only probes that are in an abnormal state are shown. When probes are abnormal but no problem is indicated, the target entity is yellow on the timeline. A collection of options controls the selection of displayed probes; these are described later.

In the above illustration, at 18:35:30 the state of the instance is indicated by a yellow bar, and below it we see that the probe "Database time" is in an abnormal state. Its value at this point in time is 4.328 ms/call, whereas the predicted (or expected) value in the active calibrated model was 494.719 ms/call. The set of probes in abnormal state and their values may change at every individual point in time. When a probe value is reported in a sample, the data set contains:

  1. State of the probe (normal/high/low).
  2. Observed value, which may be missing at some points in time.
  3. Predicted value, which may also be missing at some points in time, or when no prediction for this probe is available, this value is missing at all points in time.

Not every probe is reported in every sample. Some probes might be reported only at certain points in time. DB wait events are a typical example of such probes. When a probe value is not reported, an empty space is displayed in the timeline, shown as gray. Below is another example for the same target showing a sample at a different point in time. At this point in time, two probes are indicated as abnormal.

Figure 6-7 DB wait events

This image illustrates DB wait events

Notice the numerous gaps in the indication of "DB FG Wait Ratio". This is an example of a probe without expected values and whose observed value is not being reported with every sample. This differs from the indication in timestamps in which no values were reported for an entity. These periods of time are marked by the gray bars.

Every type of target entity has a unique panel with specialized tabs. For example, the panel of an instance has a tab "Host", which shows the host where the instance is running. Similarly, a panel of a host has a tab called "Instances", which shows a set of running database instances.

Generally, this panel is used to visualize dynamic dependencies between target entities. The navigation tree depicts only the hierarchy of the logical structure of the installation. The logical structure contains a mix of hardware and software target components in their hierarchical dependency. For example, an entity "Cluster" points to "Hosts", which contains a list of individual hosts. These relationships do not change with time. The dynamic dependencies describe how the target entities relate to each other. These relations may change with time. They are defined by pairs of target entities and ranges of time in which they are valid. Some of these relations may be exclusive. For example, at a specific point in time, an instance is said to run on one host. However, a host can have a number of instances running on it at the same point in time.
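
One way to picture these dynamic dependencies is as a list of (instance, host, time range) records. The sketch below is not how AHF Scope or GIMR stores them; the `RunsOn` record, the helper functions, and the instance and host names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunsOn:
    instance: str
    host: str
    start: int  # inclusive, e.g. seconds since some reference time
    end: int    # exclusive


def host_of(relations, instance, t):
    """Exclusive relation: an instance runs on at most one host at time t."""
    for r in relations:
        if r.instance == instance and r.start <= t < r.end:
            return r.host
    return None  # instance was not running (or no data) at time t


def instances_on(relations, host, t):
    """1:N relation: a host may run several instances at time t."""
    return sorted(r.instance for r in relations
                  if r.host == host and r.start <= t < r.end)
```

With records such as `RunsOn("inst_a", "host1", 0, 100)` and `RunsOn("inst_a", "host2", 120, 200)`, `host_of` returns a different host depending on the time point, which is the kind of change over time that the "Host" and "Instances" tabs visualize.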

6.2.2.1 A Host Panel

A Host panel has two tabs: "Host Detail" and "Instances".

Figure 6-8 Host Panel

This image illustrates a host panel

This host is running three different instances between 9 and 10 o'clock. Something happened around 9:11 and 9:46, which caused the instance to shut down, or change its host. As more than one instance from different databases may run on a host, there is a 1:N relationship. After approximately 9:48, two instances are running on this host.

Figure 6-9 Panel of a Host

This image illustrates panel of a host

To switch from a host panel to the desired instance panel, move focus to any of the instances using down or up arrow key and hit the Enter key on your keyboard, or double-click any of the instance names. The display will switch to the panel of the chosen instance, select its node in the navigation tree, and expand the navigation tree to make this node visible.

6.2.2.2 An Instance Panel

From the above example, let us switch to the instance otlpacdb_3. It is the customary status panel.

Figure 6-10 Instance Panel

This image illustrates an instance panel

The panel displays the two interruptions in service mentioned in the previous section. This display does not indicate that the instance was in fact running on more than one host at different points in time. Click the "Host" tab to visualize where the instance was running. The relationship is 1:1. An instance always runs on only one host at a particular point in time.

Figure 6-11 Instance Panel

This image illustrates an instance panel

From this display, switching to any of the hosts is performed by either double-clicking its name, or by selecting the timeline and hitting the Enter key on your keyboard. Using this method, you can quickly switch between host and instances. As the image illustrates, this panel displays a form of cascaded time ranges from most recent on top to the oldest on the bottom. When resizing the window, the display may change dynamically causing timelines to be added or removed.

For example:

Figure 6-12 Instance Panel

This image illustrates an instance panel

The run time of the instance on host rwss03client04 is no longer indicated because it ended prior to 9:04 and this time is not shown in the narrower window.

6.2.3 Browsing Through Time and Pin Operation

The AHF Scope display is dynamic when connected to a GIMR and receiving live data in real-time. AHF Scope continuously receives new samples and advances all timelines. With every new sample or with a change of the cursor position, the corresponding time point is automatically displayed with a set of abnormal probes together with their values. This time point is indicated by a vertical marker. Once a time point of interest is identified, it can be "pinned" using any of the following methods:

  1. Double-click any time at the target status timeline.
  2. Press the Hot Key "p" while hovering over the target status timeline.
  3. Press the left or right arrow key while hovering over target status timeline.

The pinned time point may be fine-tuned by pressing the left or right arrow keys and observing the displayed timestamp. Once the time point is pinned, the position marker will no longer follow the cursor. Its color changes from orange to blue to indicate this mode of operation. Moving the cursor to any other time and pressing "p" changes the pin point to the new location. Pressing "f" will unpin the time point causing it to again follow the cursor.
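
The pin behavior can be read as a small two-state machine: the marker either follows the cursor or stays pinned. The sketch below models only what the text describes; the `TimeMarker` class and its method names are hypothetical, and positions are simplified to indexes into a list of sample times.

```python
class TimeMarker:
    """Position marker that either follows the cursor or stays pinned."""

    def __init__(self, sample_times):
        self.times = sorted(sample_times)
        self.index = len(self.times) - 1  # start at the most recent sample
        self.pinned = False

    def move_cursor(self, index):
        # A pinned marker no longer follows the cursor.
        if not self.pinned:
            self.index = index

    def key(self, name, cursor_index=None):
        if name == "p" and cursor_index is not None:
            # Pin (or re-pin) at the time point under the cursor.
            self.index, self.pinned = cursor_index, True
        elif name == "f":
            self.pinned = False  # follow the cursor again
        elif name in ("left", "right"):
            # Arrow keys pin the marker and fine-tune it one sample at a time.
            step = -1 if name == "left" else 1
            self.index = min(max(self.index + step, 0), len(self.times) - 1)
            self.pinned = True

    @property
    def color(self):
        return "blue" if self.pinned else "orange"
```

Pressing "p" twice at different cursor positions simply moves the pin, which matches the description of changing the pin point to a new location.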

Note that pinning to a specific time point selects its corresponding set of probes in abnormal state. When browsing through the visible values of this set, the pin point can be preserved. To achieve this, either hover the cursor over the timelines of the probes, or use the down arrow key to move focus to these timelines. Press and hold the Control key and move the timestamp marker sideways using the mouse or the left or right arrow keys. Besides the light blue pin marker, a second marker is displayed following the cursor position. The bottom information line shows the time point associated with this marker. The floating label at the bottom of each bar graph shows the values at the position of the cursor.

Figure 6-13 Browsing Through Time and Pin Operation

This image illustrates browsing through time and pin operation

The image shows a set of two probes in abnormal state at the pinned time 18:37:05. Press the Control key and move the cursor along the timeline. Each position corresponds to a different point in time. In this example, the values sampled at 18:34:00 of a set of probes selected at 18:37:05 are shown. With the Control key remaining pressed, use the left or right arrow keys to step through time points one sample at a time.

To activate this feature permanently, select the checkbox "Follow Cursor" in the floating menu.

The pin point is automatically preserved in every entity sharing the same cluster. This feature is used to move between different panels and explore values of probes at the same time point. When viewing live data, the pin is released when its timestamp reaches the left or right end of the viewport.

6.2.4 Changing the Set of Visible Probes

By default, only probes in abnormal state are displayed in yellow for every time point. When the selected time point is marked red, a set of problems is active (see Problems or Anomalies), each with its own set of abnormal probes. At this point, only the description of the problem is displayed. Once a problem is selected, its associated set of abnormal probes are identified and displayed. This will be discussed further in Problems or Anomalies.

You may prefer to display all high probes at any time point, or perhaps explore visually the time series of every probe regardless of their state.

The following are the custom settings for the selection of probes:

  1. Display only abnormal probes (the "yellow" marked zone) or probes of a selected problem. This is the default behavior.
  2. Display always all abnormal probes (on "yellow" or "red").
  3. Display every existing probe regardless of its state.
  4. Display all probes belonging to the same category (called "correlated probes").
  5. Display every abnormal probe in a specific time range. For more information, see Selecting Abnormal Probes in any Time Range.
  6. Display only a subset of existing probes (3) by their category. For more information, see Selecting Custom Set of Probes.
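
The first three settings amount to different filters over the probes reported at a time point. The following sketch is a hypothetical illustration of that filtering; the mode names and the `visible_probes` function are not part of AHF Scope, and any probe state other than "normal" is treated as abnormal.

```python
def visible_probes(states, mode="default", target_is_red=False, problem_probes=()):
    """Pick the probes to display at one time point.

    states         -- dict mapping probe name to "normal", "high", or "low"
    mode           -- "default" (1), "all_abnormal" (2), or "every" (3)
    target_is_red  -- True when the time point has at least one problem
    problem_probes -- probes of the currently selected problem, if any
    """
    if mode == "every":
        return sorted(states)
    abnormal = sorted(name for name, s in states.items() if s != "normal")
    if mode == "all_abnormal" or not target_is_red:
        return abnormal
    # Default mode on a red time point: only probes of the selected problem.
    return [name for name in abnormal if name in problem_probes]
```

In the default mode, a red time point with no selected problem yields an empty list, which corresponds to the panel showing only problem descriptions until a problem is selected.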

The following examples show how to select options (2), (3) and (4) from the above list. Use either floating menu or a Hot Key to change the mode in which probes are being displayed.

For example:

Figure 6-14 Changing the Set of Visible Probes

This image illustrates changing the set of visible probes.

Either select "Show all High Probes" or press "a" to display every high probe at any point in time. Select "Show Every Probe" or press "A" to view every probe regardless of its state.

Figure 6-15 Changing the Set of Visible Probes

This image illustrates changing the set of visible probes.

The number of probes may exceed the size of the display window. In this example, only four probes out of a total of 126 can be displayed. The Status label under the time line indicates which subset of probes is being shown and provides up or down arrows to navigate. In this example, probes 18, 19, 20 and 21 are displayed. Both up/down icons are highlighted indicating that scrolling both directions is available using cursor up/down or page up/down keys or using the mouse wheel.
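
The scrolling state in this example can be expressed with a little arithmetic. The helper below is a hypothetical sketch: the function name is invented and the label wording is only an assumption modeled on the information bar, not text taken from AHF Scope.

```python
def probe_window(total, first, page_size):
    """Describe which probes are visible and which scroll directions exist.

    total     -- number of probes selected for display
    first     -- 1-based number of the first visible probe
    page_size -- how many probe timelines fit into the window
    """
    last = min(first + page_size - 1, total)
    label = f"Shows {last - first + 1} of {total} probes [{first}..{last}]"
    can_scroll_up = first > 1
    can_scroll_down = last < total
    return label, can_scroll_up, can_scroll_down
```

For the example above, `probe_window(126, 18, 4)` describes probes 18 to 21 and reports that scrolling is possible in both directions.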

It is also possible to display all probes belonging to the same category. Categories of probes will be explained in the section devoted to Expert Mode. Every probe may belong to one or more pre-defined categories. For example, "Buffer Cache," or "Global Cache Exceptions" are categories of probes associated with database instances.

Note that when the cursor hovers over any probe's timeline, its boundaries are highlighted as it gets focus. Alternatively, use the up or down arrow keys to move focus to a specific probe without moving the mouse pointer. In the following figure, the focus is on "Log file sync":

Figure 6-16 Changing the Set of Visible Probes

This image illustrates changing the set of visible probes.

Use the right-click popup menu, or press "c" to display all correlated probes belonging to all "Log file sync" member categories.

Figure 6-17 Changing the Set of Visible Probes

This image illustrates changing the set of visible probes.

Should the number of probes not fit into the display, the up/down cursor icons appear and the status line shows details about visible subset of probes. See Expert Mode for the steps to select and create subsets of probes.

6.2.5 Selecting Abnormal Probes in Any Time Range

Tracking probes in abnormal state between two time points is supported. Move the cursor to the first time point, press and hold the Shift key, press the left mouse button and move the cursor to the second time point, release the mouse button, and then release the Shift key. The selected time range is indicated by a gray bar under the timeline. The information strip indicates that a "Time Filter" is active. In the example, a total of 7 probes are indicated high between 18:27 and 18:40, with probes 1 to 5 visible in the display:

Figure 6-18 Selecting Abnormal Probes in any Time Range

This image illustrates selecting abnormal probes in any time range.

The selection of time range is preserved cluster-wide similar to a selection of pin point. You can change target entities and see the sets of abnormal probes in the selected time range. For example, in the same period in time, only four high probes for another instance of the database are detected.

Figure 6-19 Selecting Abnormal Probes in any Time Range

This image illustrates selecting abnormal probes in any time range.

To deactivate the filter, either select "Reset Time Selection" on the right-click pop-up menu, or hit the Escape key on your keyboard.
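
Conceptually, the time filter collects every probe that was abnormal at least once inside the selected range. A minimal sketch of that selection follows; the `abnormal_in_range` function and the shape of the `history` data are hypothetical.

```python
def abnormal_in_range(history, start, end):
    """Return probes that were abnormal at any sample within [start, end].

    history -- dict mapping probe name to a list of (time, state) samples,
               where time is a zero-padded "HH:MM" string and state is
               "normal", "high", or "low"
    """
    return sorted(
        name
        for name, samples in history.items()
        if any(start <= t <= end and state != "normal" for t, state in samples)
    )
```

Because the range is stored independently of any entity, the same `start` and `end` could be applied to the history of another instance, which mirrors the cluster-wide behavior of the filter.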

6.2.6 Problems or Anomalies

Whenever the timeline of the entity displays red, one or more problems exist at that timestamp. Every problem contains the following information:

  1. Name: This is an internal identifier.
  2. Description: An explanation of the nature of the diagnosed problem (displayed by default on the panel).
  3. Confidence probability in percentage (may differ at every time point).
  4. Root cause and diagnosis of the problem.
  5. Suggested corrective actions.
  6. Set of probes associated with the problem (may differ at every time point).
  7. Set of inference chains in Bayesian Network. One chain per probe associated with the problem.
  8. Set of tables with detailed information (optional, calculated per time point).

Portions of this information will be displayed only in the Expert Mode, described in Expert Mode. Below is an example of two problems indicated at 18:31:00:

Figure 6-20 Problems or Anomalies

This image illustrates problems or anomalies.

Problems are listed in the order of their confidence probability. Problems that have probes that are Key Performance Indicator (KPI) will be displayed at the top of the list regardless of their confidence probability. The order and set of problems may differ at each time point, and a set of probes raised for each problem may vary with every time point.
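
The ordering rule can be sketched as a two-part sort key. The dictionary fields below (`confidence`, `has_kpi_probe`) are hypothetical names chosen for illustration; they are not CHA or AHF Scope identifiers.

```python
def order_problems(problems):
    """Sort problems as listed on the panel.

    Problems with a KPI probe come first; within each group, higher
    confidence probability comes first.
    """
    return sorted(
        problems,
        key=lambda p: (not p["has_kpi_probe"], -p["confidence"]),
    )
```

With this key, a KPI-related problem at 60 percent confidence would be listed above a non-KPI problem at 95 percent.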

The problem with a focus is indicated by a color change and a right arrow to the left of its number. Focus may also be changed using the up or down arrow keys.

Each problem has a set of probes in abnormal state. As long as a specific problem is not selected, AHF Scope will not display any probes. This default preference for the display of probes may be changed by using the right-click pop-up menu, or by using a Hot Key. For example, when selecting the radio button "Show all High Probes" or pressing "a", all abnormal probes across all problems at 18:31 are displayed.

Figure 6-21 Problems or Anomalies

This image illustrates problems or anomalies.

The highlighted up/down icons in the upper-left side of the panel indicate that not all probes can be displayed. In this example, the information bar displays "Shows 3 of 4 probes [1..3]". Use the down/up arrow keys or the page up/down keys to scroll through the probes while keeping the same time position.

Probes specific to a problem are still available by returning to the default mode, and by selecting the problem of interest. Press "a" again to remove the display of all abnormal probes and return to the default mode.

To step through the analysis of a problem in detail, highlight the problem and either click it, or hit the Enter key on your keyboard. This action selects the highlighted problem and displays the root cause diagnosis and a recommended corrective action. In addition, on the status timeline every occurrence of the same problem is displayed in magenta. In the example below, the top problem was indicated between 18:28 and 18:34:

Figure 6-22 Problems or Anomalies

This image illustrates problems or anomalies.

To switch to another problem, either click it, or use the up/down arrow to highlight it, and then hit the Enter key on your keyboard. Note how the magenta marked time period has changed on target's status timeline.

Figure 6-23 Problems or Anomalies

This image illustrates problems or anomalies.

Click once more on the selected problem, or hit the Enter key on your keyboard to display the set of the probes providing evidence supporting the problem determination. In this example, the problem has only one probe, "Gc cr request":

Figure 6-24 Problems or Anomalies

This image illustrates problems or anomalies.

Hit the Enter key again, and note that the display returns to its list of problems to facilitate analysis of the additional ones. There are the following three problem displays which cycle upon clicking or pressing Enter.

  1. Problem is highlighted (empty right arrow points to the problem name).
  2. Problem is selected, shows textual descriptions of cause, corrective action (empty down arrow points to the descriptions).
  3. Problem is selected, shows status timelines of probes associated with the problem (filled down arrow points to the probe timelines).

Note that when selecting a different problem by a mouse click or using Enter, the display mode stays the same allowing stepping through each problem in the same mode.
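
The cycle of displays can be modeled as a tiny state machine. This is one possible reading of the behavior described above, not a description of the actual implementation; the mode names and the `press_enter` function are hypothetical.

```python
def press_enter(selected, mode, focused):
    """Return the new (selected problem, display mode) after Enter or a click.

    selected -- currently selected problem, or None if one is only highlighted
    mode     -- "highlighted", "description", or "probes"
    focused  -- problem that currently has focus
    """
    if selected is None:
        return focused, "description"   # display 1 -> display 2
    if focused != selected:
        return focused, mode            # other problem, same display mode
    if mode == "description":
        return selected, "probes"       # display 2 -> display 3
    return None, "highlighted"          # display 3 -> back to the list
```

Repeatedly pressing Enter on the same problem therefore walks through the three displays and returns to the list of problems.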

6.2.7 Browsing through active time of a Problem

Once a problem is selected, the time periods in which it was active are marked in magenta on the status timeline. Press and hold the Shift key and use the left or right arrow key to jump to the previous or next active time point at which this problem was indicated. As shown in Figure 6-25, for example, at 22:17:55 a "DB Writer checkpoint" problem is displayed. The magenta color shows several time ranges in which this condition was also diagnosed.

Figure 6-25 Browsing through active time of a Problem

This image illustrates browsing through active time of a problem.

The selected timestamp is at the end of the magenta colored time range. Press Shift+Right Arrow and the cursor moves to the next timestamp at which the same problem was reported. In this case, it was 22:20:30, approximately two minutes later.

Figure 6-26 Browsing through active time of a Problem

This image illustrates browsing through active time of a problem.
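
Jumping between active time points is essentially a search in a sorted list of timestamps at which the problem was reported. The sketch below uses the Python standard library `bisect` module; the function names are hypothetical, and apart from 22:17:55 and 22:20:30 the sample times used with it would be illustrative.

```python
from bisect import bisect_left, bisect_right


def next_active(active_times, current):
    """Next timestamp after `current` at which the problem was active."""
    i = bisect_right(active_times, current)
    return active_times[i] if i < len(active_times) else None


def previous_active(active_times, current):
    """Previous timestamp before `current` at which the problem was active."""
    i = bisect_left(active_times, current)
    return active_times[i - 1] if i > 0 else None
```

Both functions expect `active_times` to be sorted. When the current position is the last sample of one magenta range, `next_active` returns the first sample of the following range, skipping the gap in between.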

6.3 Expert Mode

Expert mode facilitates an advanced analysis of the metrics and their observed values against the predicted ones.

The preceding section covers the Standard mode of operation. However, since this diagnosis is based on an applied machine learning model of predicted metric values, there is always a probability that an abnormal condition will not be diagnosed correctly, either by raising a warning too late or by raising a false one. To facilitate an advanced analysis of the metrics and their observed values against the predicted ones, an Expert mode is provided.

6.3.1 Activating the Expert Mode

Press "e" to toggle the Expert mode, or use the right-click pop-up menu, and then select "Expert". The probe status timelines change their appearance, and an additional "Expert" tab appears. Only the target timeline stays unchanged.

In the Expert mode, timelines of probes now contain an overlapping display of three values:

  1. Time series for the expected (or predicted) value displayed as light blue lines from 0 to its value.
  2. Time series for the observed value. Plotted on top of the predicted values in green or in red to indicate the state.
  3. State of the probe marked in green or red of the observed value plot.

Figure 6-27 Activating Expert Mode

This image illustrates activating Expert Mode.

This display helps to evaluate how well the existing model aligns the actual observed values with the predicted values. If the observed values are consistently and significantly different from predicted values, then it is likely that the model is not well-calibrated to a particular workload. Performing a CHA calibration based on this workload should be considered.

The histograms of the probes adapt dynamically to the range of predicted and observed values of each metric. This might cause parts of the visible time series to appear "flat". The predicted and observed values might be so similar to each other that the differences between them are barely visible. See "Open file descriptors" in the above image. In such cases, you might prefer to see a plot of the difference between the observed and predicted values, called a "residual". Select "Display Residuals" on the floating menu, or press "r" to toggle between the display of residuals and the display of the predicted/observed pair. The time series shown originally in Figure 6-27 changes to what is shown in Figure 6-28.

Figure 6-28 Activating Expert Mode

This image illustrates activating Expert Mode.

Note how the differences in values of "Open file descriptors" become visible. Positive residuals are displayed as "green hills" (observed value is greater than the predicted), and negative residuals are displayed as "blue pools" (observed value less than predicted).
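
The residual is simply the observed value minus the predicted value. The sketch below restates that definition and the color convention; the function names are hypothetical, and treating a missing observed or predicted value as "no residual" is an assumption about how an empty plot comes about.

```python
def residual(observed, predicted):
    """Observed minus predicted, or None when either value is missing."""
    if observed is None or predicted is None:
        return None
    return observed - predicted


def residual_style(value):
    """Map a residual to how it is drawn in the residual display."""
    if value is None:
        return "empty"        # nothing to plot for this sample
    if value > 0:
        return "green hill"   # observed greater than predicted
    if value < 0:
        return "blue pool"    # observed less than predicted
    return "flat"
```

For example, an observed value of 120 against a predicted value of 100 gives a residual of 20, drawn as a green hill.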

This display will be meaningful only for probes for which CHA provides both expected and observed values. Otherwise, they will appear empty. Consider the following example, in which the metric "Network used bandwidth" does not have expected values.

Figure 6-29 Activating Expert Mode

This image illustrates activating Expert Mode.

The display of residuals for this metric is empty. The displays for "Network utilization" and "Number of processes" provide a good visualization of the differences between their expected and observed values.

Figure 6-30 Activating Expert Mode

This image illustrates activating Expert Mode.

The time series of "OCSSD process CPU utilization" suggest a good similarity between the observed and predicted values.

6.3.2 Resizing Expert Diagrams

Due to auto-scaling, important details in observed and predicted values may not be easily visible. A vertical zoom function is available by placing the cursor over the area of interest and pressing Control+Left mouse button to drag the cursor up or down. The maximum height of each graph is 128 pixels. The example below shows the previous image enlarged to display more detail in graphs.

Figure 6-31 Resizing Expert Diagrams

This image illustrates resizing Expert diagrams.

6.3.3 Selecting Custom Set of Probes

Probes monitored by CHA are grouped into categories. A custom set of probes may be specified based on their categories, or on individual probes from different categories. Click the "Expert" tab, or use the tab and left/right arrow key to navigate to the Expert panel.

Figure 6-32 Selecting Custom Set of Probes

This image illustrates selecting custom set of probes.

This panel provides access to a hierarchical tree with probe categories. The number of probe categories may vary between CHA models. To create a custom set of displayed probes, select either a category or expand its tree, and then select individual probes from any category. For example, select the category "Global Cache Congested". Its name is highlighted in yellow and its checkbox is selected. Note, two additional categories became highlighted without their checkboxes being selected. This means that one or more of the probes from "Global Cache Congested" are also their members.

Figure 6-33 Selecting Custom Set of Probes

This image illustrates selecting custom set of probes.

Expand the selected category and one of the other yellow highlighted categories to see which probe they share. In this case, it is the "cpu_used_pct":

Figure 6-34 Selecting Custom Set of Probes

This image illustrates selecting custom set of probes.

By unchecking the probe "cpu_used_pct", only one category will remain highlighted. Switch to the "Instance Detail" tab to see that now only the three signals of the category "Global Cache Congested" are being displayed on the Instance tab. Selections may be stored for reuse. On the Expert tab use the button "Save As" to save the selection under a specified name.

Figure 6-35 Selecting Custom Set of Probes

This image illustrates selecting custom set of probes.

These saved selections will be available after a restart of AHF Scope on the same host. Use the drop-down menu "Saved selections" to retrieve any of the saved sets of probes. Press "Load" to activate it, or press "Delete" to remove it from the persistent storage. Note that selecting a saved set without pressing "Load" does not activate it.
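The category highlighting described in this section behaves like a set intersection: a category is highlighted when it shares at least one probe with the current selection. The following is a minimal sketch of that idea, not AHF Scope's implementation. Only "Global Cache Congested" and "cpu_used_pct" come from the text; every other category and probe name is hypothetical.

```python
# Hypothetical probe categories. Only "Global Cache Congested" and
# "cpu_used_pct" are taken from the text; all other names are invented.
CATEGORIES = {
    "Global Cache Congested": {"cpu_used_pct", "gc_probe_a", "gc_probe_b", "gc_probe_c"},
    "Hypothetical Category X": {"cpu_used_pct", "x_probe"},
    "Hypothetical Category Y": {"cpu_used_pct", "y_probe"},
    "Hypothetical Category Z": {"z_probe"},
}


def highlighted_categories(selected_probes, categories=CATEGORIES):
    """Return the sorted names of categories that contain a selected probe."""
    return sorted(
        name for name, probes in categories.items() if probes & selected_probes
    )


# Selecting a whole category selects all of its probes, so every
# category that shares a probe with it is highlighted as well.
selection = set(CATEGORIES["Global Cache Congested"])
print(highlighted_categories(selection))

# Unchecking the shared probe leaves three probes selected and only
# one category highlighted.
selection.discard("cpu_used_pct")
print(highlighted_categories(selection))
```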

6.4 Live and Passive Sessions

AHF Scope maintains two separate sessions in parallel, Live Session (Primary) to receive current metrics in real-time and Past (Replay) Session to display statically a situation encountered at an earlier time.

AHF Scope can be run locally on one of the Oracle RAC cluster nodes and connected to the Grid Infrastructure Management Repository (GIMR), or it can read exported GIMR data from a file.

Note:

Remote GIMR connections are not supported because the SQL connection is not encrypted.

When connected to a GIMR database and actively receiving samples in real-time from a live system, AHF Scope maintains in parallel two separate sessions:

  1. Live Session (Primary): Receives current metrics in real time.
  2. Past (Replay) Session: Displays a static view of a situation encountered at an earlier time.

Access to a replay session is available via the System Timeline (Ticker Tape) only when AHF Scope is connected to a GIMR. When AHF Scope is started with -f file as its feed, Ticker Tape is not active. In such a case, the primary session is passive and AHF Scope does not have any possibility to retrieve data from the past.

Use Ticker Tape to locate the time period of interest. Place the cursor over the Time Selector and press the Shift key. Note that Ticker Tape now displays information about the time range corresponding to the location of the time selector.

Figure 6-36 Live and Passive Sessions

This image illustrates live and passive sessions.

In this example, the selected time period is the hour from 12:27:12 to 13:27:12. The time selector width (time period) is customizable by using the -q minutes command-line parameter.

While holding down the Shift key, press the left mouse button, and then slide the time selector to an older time point. The time selector may also be moved without using a mouse. Press and hold the Shift key, and then press the left or right arrow key to shift the time selector by 30 minutes. In this example, the time selector was moved to approximately 9:00.

Figure 6-37 Live and Passive Sessions

This image illustrates live and passive sessions.

Release the Shift key. AHF Scope will issue a query to GIMR requesting a set of samples for the selected period in time. The time to retrieve this data can be substantial, especially when CHA monitors many databases in a large cluster. While the query and the parsing of the data are in progress, a clock "Wait-Cursor" is displayed. In the background, the current timelines of the live session continue to receive data and advance accordingly.

Figure 6-38 Live and Passive Sessions

This image illustrates live and passive sessions.

Once the query and parsing of the data are finished, AHF Scope displays the data of a "Replay Session". In this example, this session covers one hour, from approximately 9:00 to 10:00.

Figure 6-39 Live and Passive Sessions

This image illustrates live and passive sessions.

This display never changes. You can investigate the data in the customary way without any time restriction. However, the "Live Session" is still active and in the background the data is constantly being collected. To restore the live session, either slide the time selector to the right end, or with the cursor hovering over System Timeline, press "=". Note that the past session data is discarded from memory once the session is switched to Live.

6.5 ahfscope Console Commands

Use the ahfscope -i command option to activate an interactive command-line interface (CLI). Enter a question mark (?) to see the list of available commands.

Syntax

cha> ?
    list item
        entities
        inputs
        kinds
            verbose
        metrics
            details
        probes
            diff
            nounit
            noobserved
            nopredicted
            missing
            flagged
        units
            diagnose
    trace item
        f: (feed)
        d: (db)
        i: (input)
        p: (probes)
        r: (rootcause)
        t: (topology)
    version
    zoom (in|out)
    quit

Parameters

Table 6-1 ahfscope Console Command Parameters

Parameter Description

list item

  • entities: List of entities and time ranges of their relations.
  • inputs: Lists input feeds
  • kinds: Kinds of Entities and number of their metrics
    • verbose: Lists metrics with every kind
  • metrics: List of all known metrics and their units of measure.
    • details: Lists metrics with full name and description of value
  • probes: List of all probes (signals) for all entities.
    • diff: Lists probes where number of predicted and observed values differ
    • nounit: Lists probes without unit of measure
    • noobserved: Lists probes without observed values
    • nopredicted: Lists probes without predicted values
    • missing: Lists probes with missing values in some samples
    • flagged: Lists probes with flags set
  • units: List units of measure.
    • diagnose: Shows conversion rules and values at conversion thresholds.

trace item

Switch tracing on/off, indicated by '+' or '-'.
  • f: (feed) Live feed activity.
  • d: (db) Toggle alter session for trace event 10046 (ORA DB).
  • i: (input) Copy incoming data to the log file.
  • p: (probes) Displays internal or descriptive probe names.
  • r: (rootcause) Prints CLOBs containing the root cause of a problem to the console.
  • t: (topology) Displays changes in set of entities (topology).

version

Version of the (1) user interface, (2) data stream, and (3) Java Virtual Machine.

zoom (in|out)

Sets or resets magnify option.

quit

Exits AHF Scope GUI.

For convenience, AHF Scope's CLI provides an abbreviation grammar. For example, instead of typing version, you can simply type v:
cha> version
   CHA UI version:     V1.00.000
   Data version:       V0.17
   PL/SQL package:     V0.10.11.2
   Java version:       1.8.0_77 on Linux
cha> v
   CHA UI version:     V1.00.000
   Data version:       V0.17
   PL/SQL package:     V0.10.11.2
   Java version:       1.8.0_77 on Linux

When more than one command starts with the same prefix, they need to be disambiguated. For example, debug versus device would require typing at least 3 letters to correctly identify the desired command. These commands provide summaries not available in graphical form.
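The disambiguation rule above can be illustrated with a short prefix-matching sketch. This is an assumption about how such an abbreviation grammar typically works, not AHF Scope's actual parser. The function name resolve is hypothetical, and debug and device are only the illustrative names used in the text.

```python
def resolve(prefix, commands):
    """Return the one command matching prefix, or raise ValueError."""
    if prefix in commands:
        # An exact match always wins, even if it is a prefix of another command.
        return prefix
    matches = [c for c in commands if c.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {prefix!r}")
    raise ValueError(f"ambiguous prefix {prefix!r}: {', '.join(sorted(matches))}")


# Top-level commands from the syntax listing above.
TOP_LEVEL = ["list", "trace", "version", "zoom", "quit"]

print(resolve("v", TOP_LEVEL))              # version
print(resolve("deb", ["debug", "device"]))  # debug
```

With this rule, the prefix de would be rejected as ambiguous for debug and device, while deb and dev each identify exactly one command.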

6.6 List of Hot Keys

Hot Keys are keyboard shortcuts that provide an alternate way to do something you would typically do with a mouse.

List of Hot Keys

Table 6-2 List of Hot Keys - on any place

Key Description

^+ (control +)

Enlarge (Zoom In). On many keyboards, '+' shares a key with '='; use Shift to reach '+'.

^- (control -)

Shrink (Zoom Out)

w (lower case w)

Toggle theme (Day/Night)

Table 6-3 List of Hot Keys - on a target panel (Host or Instance)

Key Description

a (lower case a)

Show all probes in high state.

A (upper case A)

Show all existing probes.

Note:

The set may be filtered in the Expert tab.

c (lower case c)

Show all correlated probes belonging to the same Probe category.

e (lower case e)

Toggle expert mode.

r (lower case r)

Toggle values predicted (expected)/residuals.

p (lower case p)

On state histogram: pin the cursor line at the current position.

f (lower case f)

On state histogram: follow (unpin).

-> (shift right-arrow)

Jump to the next occurrence of the selected problem.

<- (shift left-arrow)

Jump to the previous occurrence of the selected problem.

^q (control q)

Print the values of the focused probe before and after the position marker.

<escape>

Remove time filter, unpin, and deselect a problem

Table 6-4 List of Hot Keys - zoom Signal Histograms in Host/Instance panel between 24..128 pixels. The cursor has to be in the histogram area.

Key Description

control+mouse left-button drag up

Makes the histograms smaller.

control+mouse left-button drag down

Makes the histograms larger.

control+arrow up

Makes the histograms smaller.

control+arrow down

Makes the histograms larger.

Table 6-5 List of Hot Keys - Signal navigation (active when not all signals visible on a panel)

Key Description

Arrow Up/Down

Navigate between signals, scroll visible signals by one.

Page-Up/Page-Down

Navigate between signals, scroll entire page of signals.

Table 6-6 List of Hot Keys - on the system timeline (Ticker Tape)

Key Description

= (equal sign)

Jump to the current timestamp from a "Past Session", resume "Live Session".

Shift + Left/Right Cursor

Shift time selector by 30 minutes to the left or right.

Table 6-7 List of Hot Keys - entity navigation on all non-target panels

Key Description

<Enter>

Selects the focused entity and displays the panel associated with the Entity.

6.7 Set of Persistent Settings

Review the list of persistent settings that you can reuse.

  1. Theme dark/light
  2. Last time of start
  3. Width of the details panel, corresponding to number of minutes on this display
  4. Window size and position
  5. Last selected set of probes
  6. Every set of Named Custom Selections of probes. See Selecting Custom Set of Probes.

6.8 Accessibility Aspects

AHF Scope can be used without the help of the mouse.

Every operation can be performed using the Tab key, the cursor keys, Enter, and several Hot Keys described in List of Hot Keys. Some of the operations are available in combination with the Shift and Control keys. No timeouts exist on any of these operations, so they can be used in conjunction with Sticky Keys and Slow Keys. The navigation combines components from standard Java "Swing" with a custom implementation of navigation in components designed for the unique, specialized displays in AHF Scope.

A magnification operation allows you to enlarge text and components on panels. AHF Scope's displays are always high contrast, without use of images. They do not change with OS High Contrast mode, and AHF Scope's display or mode of operation does not affect the remaining desktop.

To enable accessible technology in Java on Microsoft Windows, follow the instructions outlined in Enabling and Testing Java Access Bridge on Microsoft Windows.

AHF Scope can operate with assistive technology software JAWS from Freedom Scientific, V17 and later.

6.9 Customizing Java Run Time System

As AHF Scope is written in Java, it is platform independent.

The script ahfscope invokes the Java Virtual Machine (JVM) with Oracle classes. Knowledgeable users may consider customizing this script, or use the environment variable _JAVA_OPTIONS to determine the way the JVM executes code.

The JVM is the run-time process that interprets Java classes. All contemporary JVMs incorporate some method of on-the-fly translation of bytecode into native code; the dominant implementation is HotSpot. Apart from the first invocations of a class, Java methods in most cases run as native code. Consequently, they perform at speeds comparable to programs written in native languages, such as C.

Furthermore, on many platforms Java supports native 2D and 3D graphics with hardware acceleration through the OpenGL libraries, which significantly improves display performance. It is highly recommended that OpenGL be configured for default use with the JVM. Information about the OpenGL library is available at: http://www.opengl.org/. Most manufacturers of rendering hardware (graphics cards) provide a version of this library for their video cards. It is important to obtain a current version of this library in addition to the current drivers for the graphics card. See the following sites for detailed information about Java rendering using OpenGL:
  • https://docs.oracle.com/javase/8/docs/technotes/guides/2d/flags.html#opengl
  • https://docs.oracle.com/javase/8/docs/technotes/guides/2d/new_features.html
JVM activation flags for OpenGL are:
  • -Dsun.java2d.opengl=true: Use the OpenGL pipeline
  • -Dsun.java2d.d3d=true: Use the Direct3D accelerator for Microsoft Windows

AHF Scope does not render in 3D, but benefits greatly from the accelerated region repaint available through Direct3D.

JVM options may be used with the java command, or declared as an environment variable, _JAVA_OPTIONS.

Linux Example:
_JAVA_OPTIONS="-Dsun.java2d.opengl=True -Dsun.java2d.d3d=true"
export _JAVA_OPTIONS
Note the capital "T" in "True": when the value is written with a capital letter, Java prints to the standard output whether the OpenGL pipeline is available. The following is an example of a warning that OpenGL is not available:
Picked up _JAVA_OPTIONS: -Dsun.java2d.opengl=True -Dsun.java2d.d3d=true
Could not enable OpenGL pipeline for default config on screen 0
In this case, the system does not have a graphics card supporting OpenGL. The following is an example of a system with a graphics card supporting OpenGL:
OpenGL pipeline enabled for default config on screen 0
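If you capture the JVM startup output in a script, the two messages shown above can be told apart with a simple check. The following is a minimal sketch that relies only on the message text quoted in this section. The function name opengl_status is hypothetical, and other JVM versions may word these messages differently.

```python
def opengl_status(jvm_output):
    """Return True if the OpenGL pipeline was enabled, False if it could
    not be enabled, and None if the output does not report either."""
    for line in jvm_output.splitlines():
        if line.startswith("OpenGL pipeline enabled"):
            return True
        if line.startswith("Could not enable OpenGL pipeline"):
            return False
    return None


# Output quoted from the example without OpenGL support.
failed = (
    "Picked up _JAVA_OPTIONS: -Dsun.java2d.opengl=True -Dsun.java2d.d3d=true\n"
    "Could not enable OpenGL pipeline for default config on screen 0\n"
)
print(opengl_status(failed))  # False
print(opengl_status("OpenGL pipeline enabled for default config on screen 0"))  # True
```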

In the case of some graphics cards, OpenGL requires the option: sun.java2d.opengl.fbobject=false. See section 3.1.5.5 Diagnosing Rendering and Performance Issues in the following document: http://www.oracle.com/technetwork/java/javase/index-142560.html.

This link is the current and comprehensive description of potential issues with OpenGL and Java 2D drawing package in conjunction with specific hardware/driver versions.

6.10 Setting Proper Character Encoding Page on Microsoft Windows

Specify the encoding to interpret the characters displayed on the HTML page correctly.

Should you see strange characters in the text from AHF Scope console, for example:
278 waitclass_userio                     Ás/s

Verify the active code page using the chcp command. For example, code page 437 (US default) does not display the Greek "micro" character properly. Change to code page 850 - Multilingual (Latin I) to display the Unicode character Greek 'µ' properly.

c:\rac\crf>chcp
Active code page: 437

c:\rac\crf>chcp 850
Active code page: 850

...

278  waitclass_userio                    µs/s

6.11 ahfscope

Use the ahfscope command to manage AHF Scope.

Syntax

ahfscope [flags] [parameters]
-f name
-i
-q value[,value]
	minutes
	clob
	psec
-C
-D item[,item]
	feed
	db
	input
	probes
	rootcause
	topology
	unit

Parameters

Table 6-8 ahfscope Command Parameters

Parameter Description

-f name

Specify to read from a file.

If you do not specify the file extension, then AHF Scope assumes .mdb as the file extension.

-i

Specify to run ahfscope in interactive mode (recommended). This option permits entering additional commands that are not available from the GUI.

When AHF Scope is started with ahfscope -i, a cha> command-line prompt appears in the operating system terminal. Use it to enter terminal commands that are not available on the graphical panels. These commands are enumerated by entering help at the prompt.

-q value[,value]

Specify to configure the connection and queries executed in the GIMR. Do not use this option with the -f option.

Specify a comma-delimited list for optional parameters with no spaces.
  • minutes: Specify to query time period in minutes. Default: 60 minutes, Minimum: 2 minutes.
  • clob: Specify to use sql.clob.
  • psec: Specify to postpone the query in seconds. Default: sampling period. Minimum: 1 second

Option -q minutes sets the amount of data, in minutes, for the initial query. Because the sampling interval is 5 seconds, the default data set of 60 minutes contains 720 data points. Note that when longer times are selected via the -q option, substantial time might be added to the startup process, especially when the monitored configuration has many nodes and databases. The maximum number of minutes in this option is determined by the width of the screen (number of pixels) divided by the number of samples per minute, which is 12 for the 5-second sample interval. For example, on a standard FHD monitor with 1920 horizontal pixels, the number of minutes is limited to 160, or 2 hours and 40 minutes.
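The arithmetic in the preceding paragraph can be checked with a few lines of code. This is only a sketch of the stated formula. The function names are hypothetical, and rounding down for screen widths that are not a multiple of 12 is an assumption.

```python
SAMPLE_INTERVAL_SECONDS = 5  # sampling interval stated in the text


def samples_per_minute(interval=SAMPLE_INTERVAL_SECONDS):
    """Number of samples produced per minute at the given interval."""
    return 60 // interval


def data_points(minutes, interval=SAMPLE_INTERVAL_SECONDS):
    """Number of data points in a data set covering the given minutes."""
    return minutes * samples_per_minute(interval)


def max_minutes(screen_width_pixels, interval=SAMPLE_INTERVAL_SECONDS):
    """Upper limit for -q minutes: one sample per horizontal pixel.
    Rounding down is an assumption for widths not divisible by 12."""
    return screen_width_pixels // samples_per_minute(interval)


print(data_points(60))    # 720
print(max_minutes(1920))  # 160
```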

Option -q psec adjusts the delay between the "time in the query" and the "time of the query". The time of the query must trail the time in the query. The smaller the delay, the closer the display is to real time. The default delay is one sampling period: 5 seconds. Regardless of the delay, a sample always provides 5 seconds of data. For example, with -qp10, a query for the 5 seconds of data in the period 10:20 to 10:25 is invoked at 10:35 or later. Use this option when you observe that CHA is too slow and cannot commit transactions in time for AHF Scope; in such a case, random gaps in the data might be indicated. On a fast system, even -qp1 can be used without any adverse effects.
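The relationship between the sample window and the query time can be sketched as follows. The example assumes that the times in the preceding paragraph are minutes and seconds (mm:ss), which the text does not state explicitly. The function names are hypothetical.

```python
from datetime import timedelta

SAMPLE_PERIOD = timedelta(seconds=5)  # each sample covers 5 seconds of data


def earliest_query_time(window_end, delay_seconds=5):
    """Earliest time at which the query for a sample window may run:
    the end of the window plus the configured delay."""
    return window_end + timedelta(seconds=delay_seconds)


def fmt(offset):
    """Format a time offset as mm:ss (assumed reading of the example)."""
    minutes, seconds = divmod(int(offset.total_seconds()), 60)
    return f"{minutes:02d}:{seconds:02d}"


window_start = timedelta(minutes=10, seconds=20)
window_end = window_start + SAMPLE_PERIOD

print(fmt(window_end))                           # 10:25
print(fmt(earliest_query_time(window_end, 10)))  # 10:35
```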

Option -q clob directs AHF Scope to use an alternative path for retrieving CLOBs from the database. In some versions of Oracle Database, retrieving a CLOB directly from a SQL query leads to fragmentation in the database. When you specify this option (-qc), AHF Scope uses a 2-step process to obtain CLOBs, with an explicit disposal command. The elapsed time for every query will increase.

-C

Specify to extract the selected data from the .mdb file in JSON format. Use this option only with the -f option.

-D item[,item]

Specify to set the debug mode.

Use this option to obtain a complete copy of the data received by AHF Scope, stored in a file called an .mdb file (after Management Database). This file can be used as an argument with the -f file option.

Specify a comma-delimited list for optional parameters with no spaces.
  • feed: Specify to view timings of all data queries.
  • db: Specify to activate alter session set event 10046.
  • input: Specify to copy input data (CLOB) to a log file.
  • probes: Specify to use internal probe names.
  • rootcause: Specify to report the start and stop of any rootcause.
  • topology: Specify to view changes in the set of entities ('topology').
  • unit: Specify to view warnings about implicit settings for units of measure.

Note:

On a Microsoft Windows system, enclose all comma-separated arguments with double-quotes.

For example: "-Dprobes,input", or shorter "-Dp,i".

AHF Scope Modes

AHF Scope can operate in several modes:
  • Connect to the GIMR database using the default connection.
  • Read in a text file with monitoring data (option -f).
  • Parse a text file with data and generate a JSON object with information similar to the query "diagnosis" (option -C).

Default connections initiate a live session and provide real-time monitoring. The connection to the GIMR database is established via JDBC using Oracle JDBC thin driver.

Using an MDB file as a parameter (Option -f) directs AHF Scope to analyze textual data extracted from a GIMR or data collected during a live session. This data is held in a *.mdb file. A *.mdb file can be generated from GIMR using command chactl export repository. An example of obtaining one hour worth of data:
host:/dir> chactl export repository -format mdb -start '2018-11-22 09:30:00' -end '2018-11-22 10:30:00'
successfully dumped the CHA statistics to location "/hostname/trc/chad/cha_dump_20181122_093000_20181122_103000.mdb"
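When exports are scripted, the start and end timestamps can be generated rather than typed. The following sketch builds the command line shown above for a given start time and duration. It uses only the chactl options that appear in this example. The function name export_command is hypothetical, and the sketch prints the command instead of running it.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the -start and -end options in the example above.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def export_command(start, duration=timedelta(hours=1)):
    """Build a chactl export command covering start to start + duration."""
    end = start + duration
    return (
        "chactl export repository -format mdb "
        f"-start '{start.strftime(TIME_FORMAT)}' "
        f"-end '{end.strftime(TIME_FORMAT)}'"
    )


# Reproduces the one-hour export from the example above.
print(export_command(datetime(2018, 11, 22, 9, 30)))
```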

Using option -C will start AHF Scope without the GUI front end. AHF Scope will only parse the mdb file and generate a JSON file, similar to the file generated by chadiag. This data can be used by other tools to indicate periods of time in which CHA diagnosed problems.

When AHF Scope is invoked without any command line options, AHF Scope uses JDBC to connect to the GIMR database and operates in a real-time mode as an active monitor. Connection credentials will be obtained from Oracle Wallet or from the manual input in the login console. After the connection is established, AHF Scope retrieves a data set with the most recent N-minutes of data. In a first invocation of AHF Scope the data set contains 60 minutes, unless option -q is used. In any subsequent invocation the number of minutes in the data set corresponds to the width of the window selected by the user.