Using Caché Monitor

Caché Monitor scans the Caché instance’s console log for errors and traps reported by Caché daemons and user processes and generates corresponding notifications, including email if configured. This chapter discusses the following topics:

  • Caché System Monitoring Tools

  • Caché Monitor Overview

  • Using the ^MONMGR Utility

  • Caché Monitor Errors and Traps

Caché System Monitoring Tools

Caché provides three sets of tools for general monitoring of Caché instances, as follows:

  • The Management Portal provides several pages and log files that let you monitor a variety of system indicators, system performance, Caché locks, and errors and traps, as described in the “Monitoring Caché Using the Management Portal” chapter of this guide. Of these, the console log (install-dir\mgr\cconsole.log by default) is the most comprehensive, containing general messages, startup/shutdown, license, and network errors, certain operating system errors, and indicators of the success or failure of jobs started remotely from other systems, as well as alerts, warnings and messages from Caché System Monitor.

  • Caché Monitor, as described in this chapter, generates notifications for console log entries of a configured minimum severity and either writes them to the alerts log or emails them to specified recipients. This allows console log alerts of all types to be extracted and brought to the attention of system operators.

  • Caché System Monitor generates alerts and warnings related to important system status and resource usage indicators and also incorporates Caché Application Monitor and Caché System Health Monitor, which monitor system and user-defined metrics and generate alerts and warnings when abnormal values are encountered. System Monitor and Health Monitor alerts and warnings are written to the console log; Application Monitor alerts can be sent by email or passed to a specified notification method. System Monitor (including Application Monitor and Health Monitor) is managed using the ^%SYSMONMGR utility. See the “Caché System Monitor” chapter of this guide for detailed information about using System Monitor, Application Monitor and Health Monitor.

Caché Monitor Overview

Caché Monitor scans the console log at regular intervals for entries of the configured severity level and generates corresponding notifications. These notifications are either written to the alerts log or sent by email to specified recipients.

Caché writes general messages, errors and traps, and the success or failure of jobs started remotely from other systems to the console log; see Monitoring Log Files in the “Monitoring Caché Using the Management Portal” chapter of this guide for more information. In addition, Caché System Monitor writes alerts and warnings to the console log. By generating notifications based on console log contents, Caché Monitor brings these alerts to the attention of system operators.

Note:

Caché Monitor does not generate a notification for every console log entry of the configured severity. When a given process writes a series of entries less than about an hour apart, a notification is generated for the first entry only. For this reason, you should immediately consult the console log (and view System Monitor alerts, if applicable) on receiving even a single notification from Caché Monitor. However, the console log entries listed in Caché Monitor Errors and Traps always generate notifications.
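
The suppression rule in this note can be illustrated with a short Python sketch. It is not Caché Monitor's implementation: the class name NotificationFilter, the exact one-hour value, and the decision to match the always-notify entries by substring are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# "About an hour" in the text; the exact value used here is an assumption.
QUIET_PERIOD = timedelta(hours=1)


class NotificationFilter:
    """Hypothetical model of the one-notification-per-series rule."""

    def __init__(self, always_notify=()):
        self._last_seen = {}                 # pid -> time of its latest entry
        self._always = tuple(always_notify)  # texts that bypass suppression

    def should_notify(self, pid, when, message):
        previous = self._last_seen.get(pid)
        # Every entry extends the series for its process (an assumption).
        self._last_seen[pid] = when
        if any(text in message for text in self._always):
            return True
        # Notify only for the first entry of a series: either the process
        # has not been seen before, or it has been quiet for the period.
        return previous is None or when - previous >= QUIET_PERIOD
```

With this model, a second entry from the same process 15 minutes after the first is suppressed, while an entry from a different process, or one whose text is in the always-notify list, still produces a notification.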

Caché Monitor operates with the following settings by default:

  • Caché Monitor is continuously running when the instance is running.

  • The console log is scanned every 10 seconds.

  • Notifications are generated for console log entries of severity 2 (severe) and 3 (fatal).

  • Notifications are written to the alerts log.

    Note:

    You can view the alerts log in the Management Portal by navigating to the System Logs page (System Operation > System Logs) and selecting System Monitor Log, then using the Browse button to select the alerts.log file.

    The alerts log is not created until Caché Monitor writes its first notification to the log.
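
To make the default severity filter concrete, the following Python sketch reads console log lines and keeps those of severity 2 or higher. The line layout it assumes (timestamp, process ID in parentheses, a single severity digit, then the message) is an assumption about the console log format, and the names parse_line and entries_at_or_above are hypothetical; check a real cconsole.log before relying on the pattern.

```python
import re
from typing import NamedTuple, Optional

# Assumed line layout: "MM/DD/YY-HH:MM:SS:mmm (pid) severity message".
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}:\d{3}) "
    r"\((?P<pid>\d+)\) (?P<severity>\d) (?P<message>.*)$"
)


class Entry(NamedTuple):
    stamp: str
    pid: int
    severity: int
    message: str


def parse_line(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None if the line does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    return Entry(m["stamp"], int(m["pid"]), int(m["severity"]), m["message"])


def entries_at_or_above(lines, min_severity=2):
    """Yield entries whose severity is at least min_severity (default 2)."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.severity >= min_severity:
            yield entry
```

Passing an open file object for the console log as lines gives the entries that would qualify under the default setting, before the once-per-process suppression described in the note above is applied.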

You can configure and manage Caché Monitor, including changing its default settings and configuring email notifications, using the interactive ^MONMGR utility.

Using the ^MONMGR Utility

The Caché Monitor Manager (^MONMGR) utility must be executed in the %SYS namespace (the name is case-sensitive).

  1. To start the Caché Monitor Manager, enter the following command in the Terminal:

    %SYS>do ^MONMGR
    
  2. The main menu appears. Enter the number of your choice or press Enter to exit the Caché Monitor Manager:

    1) Start/Stop/Update MONITOR
    2) Manage MONITOR Options
    3) Exit
    
    Option? 
    

The options in the main menu let you manage Caché Monitor as described in the following table:

Option Description
1) Start/Stop/Update MONITOR Displays the Start/Stop/Update Monitor submenu, which lets you manage Caché Monitor and the alerts log.
2) Manage MONITOR Options Displays the Manage Monitor Options submenu, which lets you manage Caché Monitor notification options (sampling interval, severity level, email).
3) Exit Exits from the Caché Monitor Manager.

Start/Stop/Update Monitor

This submenu lets you manage the operation of Caché Monitor. Enter the number of your choice or press Enter to return to the main menu:

Option? 1

1) Update MONITOR
2) Halt MONITOR
3) Start MONITOR
4) Reset Alerts
5) Exit

Option?

The options in this submenu let you manage the operation of Caché Monitor as described in the following table:

Option Description
1) Update MONITOR Dynamically restarts Caché Monitor based on the current settings (interval, severity level, email) in Manage Monitor Options.
2) Halt MONITOR Stops Caché Monitor. The console log is not scanned until Caché Monitor is started.
3) Start MONITOR Starts Caché Monitor. The console log is monitored based on the current settings (interval, severity level, email) in Manage Monitor Options.
4) Reset Alerts Deletes the alerts log (if it exists).
5) Exit Returns to the main menu.

Manage Monitor Options

This submenu lets you manage Caché Monitor’s scanning and notification options. Enter the number of your choice or press Enter to return to the main menu:

Option? 2

1) Set Monitor Interval
2) Set Alert Level
3) Manage Email Options
4) Exit

Option?

The options in this submenu let you manage the scanning and notification options of Caché Monitor as described in the following table:

Option Description
1) Set Monitor Interval Lets you change the interval at which the console log is scanned. InterSystems recommends an interval no longer than the default of 10 seconds.
2) Set Alert Level Lets you set the severity level of console log entries generating notifications, as follows:
  • 1 – warning, severe and fatal
  • 2 – severe and fatal
  • 3 – fatal only
3) Manage Email Options Lets you configure Caché Monitor email notifications using the Manage Email Options submenu.
4) Exit Returns to the main menu.
Note:

Because Caché Monitor generates a notification only for the first in a series of console log entries from a given process within about an hour, setting the alert level to 1 could mean that when a warning has generated an alerts log entry or email message, a subsequent severity 2 alert from the same process does not generate a notification. For example, a license expiration warning from Caché System Monitor could prevent a more serious shadow server disconnection alert 15 minutes later from generating an alerts log entry or email message.
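
The alert level acts as a minimum severity threshold. As a small illustration, the Python sketch below maps an alert level to the severities that generate notifications; the function name severities_for_alert_level is hypothetical, and the mapping simply restates the table above.

```python
from typing import List

# Severity numbers and names as listed for the Set Alert Level option.
SEVERITY_NAMES = {1: "warning", 2: "severe", 3: "fatal"}


def severities_for_alert_level(level: int) -> List[str]:
    """Return the severity names that generate notifications at this level."""
    if level not in SEVERITY_NAMES:
        raise ValueError("alert level must be 1, 2, or 3")
    return [name for sev, name in sorted(SEVERITY_NAMES.items()) if sev >= level]
```

Lowering the level to 1 widens the set of qualifying entries, which is why a warning can start a series and mask a later, more severe entry from the same process.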

Manage Email Options

The options in this submenu let you configure and enable/disable email. When email is enabled, Caché Monitor sends notifications by email; when it is disabled, notifications are written to the alerts log. Enter the number of your choice or press Enter to return to the Manage Monitor Options submenu:

Option? 3

1) Enable/Disable Email
2) Set Sender
3) Set Server
4) Manage Recipients
5) Set Authentication
6) Test Email
7) Exit

Option? 

The options in this submenu let you manage the email notifications for Caché Monitor as described in the following table:

Option Description
1) Enable/Disable Email Enabling email causes Caché Monitor to:
  • send an email notification for each item currently in the alerts log, if any
  • delete the alerts.log file (if it exists)
  • send an email notification for each console log entry of the configured severity from that point forward
Disabling email causes Caché Monitor to write entries to the alerts log.
Note:
Enabling/disabling email does not affect other email settings; that is, it is not necessary to reconfigure email options when you enable/disable email.
2) Set Sender Select this option to enter text indicating the sender of the email, for example Cache Monitor. The text you enter does not have to represent a valid email account. You can set this field to NULL by entering - (dash).
3) Set Server Select this menu item to enter the name and port number (default 25) of the email server that handles email for your site. Consult your IT staff to obtain this information. You can set this field to NULL by entering - (dash).
4) Manage Recipients
This option displays a submenu that lets you list, add, or remove the email addresses to which each notification is sent.
Note:
Each valid email address must be added individually; when you select 2) Add Recipient, do not enter more than one address when responding to the Email Address? prompt.
5) Set Authentication Lets you specify the authentication username and password if required by your email server. Consult your IT staff to obtain this information. If you do not provide entries, the authentication username and password are set to NULL. You can set the User field to NULL by entering - (dash).
6) Test Email Sends a test message to the specified recipients using the specified email server.
7) Exit Returns to the Manage Monitor Options submenu.
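
Before entering the server, port, and authentication values in ^MONMGR, it can be useful to confirm them independently of the instance. The Python sketch below uses the standard smtplib and email.message modules for that purpose. It is not the code Caché Monitor uses to send mail, and the function names, subject line, and addresses are hypothetical. Unlike the Set Sender text, the sender here is given as an address.

```python
import smtplib
from email.message import EmailMessage


def build_notification(sender, recipients, entry_text):
    """Build a plain-text message; recipients is a list of single addresses."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = "Console log notification"  # hypothetical subject line
    msg.set_content(entry_text)
    return msg


def send_notification(msg, server, port=25, user=None, password=None):
    """Send msg through the given SMTP server (port 25 is the stated default)."""
    with smtplib.SMTP(server, port, timeout=10) as smtp:
        if user:
            # Log in only when the server requires authentication.
            smtp.login(user, password)
        smtp.send_message(msg)
```

Calling build_notification and then send_notification with your site's values plays a role similar to the Test Email option: if the message arrives, the same server name, port, and credentials should be suitable for the Manage Email Options settings.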

Caché Monitor Errors and Traps

The following console log errors always generate Caché Monitor notifications:

  • Process halt due to segment violation (access violation).

  • <FILEFULL>in database %

  • AUDIT: ERROR: FAILED to change audit database to '%. Still auditing to '%.

  • AUDIT: ERROR: FAILED to set audit database to '%.

  • Sync failed during expansion of sfn #, new map not added

  • Sync failed during expansion of sfn #, not all blocks added

  • WRTDMN failed to allocate wdqlist...freezing system

  • WRTDMN: CP has exited - freezing system

  • Write Daemon encountered serious error - System Frozen

  • Insufficient global buffers - WRTDMN in panic mode

  • WRTDMN Panic: SFN x Block y written directly to database

  • Unexpected Write Error: dkvolblk returned %d for block #%d in %

  • Unexpected Write Error: dkswrite returned %d for block #%d in %

  • Unexpected Write Error: %d for block #%d in %.

  • Cluster crash - All Cache systems are suspended

  • System is shutting down poorly, because there are open transactions, or ECP failed to preserve its state

  • SERIOUS JOURNALING ERROR: JRNSTOP cannot open %.* Stopping journaling as cleanly as possible, but you should assume that some journaling data has been lost.

  • Unable to allocate memory for journal translation table

  • Journal file has reached its maximum size of %u bytes and automatic rollover has failed

  • Write to journal file has failed

  • Failed to open the latest journal file

  • Sync of journal file failed

  • Journaling will be disabled in %d seconds OR when journal buffers are completely filled, whichever comes first. To avoid potential loss of journal data, resolve the cause of the error (consult the Caché system error log, as described in Caché System Error Log in the “Monitoring Caché Using the Management Portal” chapter) or switch journaling to a new device.

  • Error logging in journal

  • Journaling Error x reading attributes after expansion

  • ECP client daemon/connection is hung

  • Cluster Failsoft failed, couldn't determine locksysid for failed system - all cluster systems are suspended

  • enqpijstop failed, declaring a cluster crash

  • enqpijchange failed, declaring a cluster crash

  • Failure during WIJ processing - Declaring a crash

  • Failure during PIJ processing - Declaring a crash

  • Error reading block – recovery read error

  • Error writing block – recovery write error

  • WIJ expansion failure: System Frozen - The system has been frozen because WIJ expansion has failed for too long. If space is created for the WIJ, the system will resume otherwise you need to shut it down with cforce

  • CP: Failed to create monitor for daemon termination

  • CP: WRTDMN has been on pass %d for %d seconds - freezing system. System will resume if WRTDMN completes a pass

  • WRTDMN: CP has died before we opened its handle - Freezing system

  • WRTDMN: Error code %d getting handle for CP monitor - CP not being monitored

  • WRTDMN: Control Process died with exit code %d - Freezing system

  • CP: Daemon died with exit code %d - Freezing system

  • Performing emergency Cache shutdown due to Operating System shutdown

  • CP: All processes have died - freezing system

  • cforce failed to terminate all processes

  • Failed to start slave write daemon

  • ENQDMN exiting due to reason #

  • Becoming primary mirror server
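
Several entries in this list contain placeholders such as % and %d that are filled in when the message is written. As a rough illustration of how a script could recognize these messages in a console log, the Python sketch below converts a template into a regular expression. The interpretation of the placeholders (%d and %u as integers, a bare % as arbitrary text) is an assumption, other placeholders such as #, x, and y are treated as literal text, and the function names are hypothetical.

```python
import re


def template_to_regex(template):
    """Compile a message template into a regex (placeholder meaning assumed)."""
    pattern = ""
    # Split on the placeholders while keeping them in the result.
    for part in re.split(r"(%d|%u|%)", template):
        if part in ("%d", "%u"):
            pattern += r"-?\d+"        # assumed: an integer value
        elif part == "%":
            pattern += r".+?"          # assumed: arbitrary non-empty text
        else:
            pattern += re.escape(part)  # literal message text
    return re.compile(pattern)


def is_always_notify(message, patterns):
    """Return True if the message matches any compiled template."""
    return any(p.search(message) for p in patterns)
```

For example, compiling the templates "Unexpected Write Error: dkswrite returned %d for block #%d in %" and "<FILEFULL>in database %" and passing them to is_always_notify lets a script flag the corresponding console log messages regardless of the block number or database path they contain.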