The Extended Health Check module provides extensible, configurable endpoints for evaluating the "health" of a Magnolia instance. You can use the endpoint for monitoring a Magnolia instance, either manually or automatically, for example, for autoscaling.
You configure both the HTTP status values returned by the health check and the conditions that are checked for each HTTP status.
The Extended Health Check module also provides a store for "health events". Health events are significant events that indicate something about the health of the Magnolia instance and can be checked in the extended health check.
You can collect health events from the Magnolia log with log4j configuration, and you can collect health events relating to Magnolia publication failures.
Installation
Maven is the easiest way to install the module. Add the following dependency to your bundle:
<dependency>
  <groupId>info.magnolia</groupId>
  <artifactId>healthcheck</artifactId>
  <version>${version}</version>
</dependency>
Versions
1.0 | Magnolia 5.7.8 and later |
Health outcomes
Health outcomes define the conditions for which a specific HTTP status is returned by the extended health check.
A health outcome defines:
- A voter set including one or more health voters or boolean voter sets checking Magnolia health conditions.
- Details returned if the conditions for the health outcome are met (HTTP status and description).
Health outcomes can be disabled (or enabled). A disabled health outcome won't be examined when an extended health check is requested.
Health outcomes are defined through the module configuration at /modules/healthcheck/config/outcomes. You can add or modify the health outcomes defined there.
Node name | Value |
---|---|
modules | |
healthcheck | |
config | |
outcomes | |
<health outcome 1> | |
<health outcome 2> | |
<health outcome N> | |
Health outcomes are checked in the order they are defined; the first health outcome whose health voters return true is returned as the result of a health check, and any remaining health outcomes are ignored.
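The first-match rule can be sketched in a few lines of Python. This is only an illustrative model, not the module's implementation: the `Outcome` class, the `evaluate` function, and the lambda conditions standing in for voter sets are all hypothetical, and the ordering of the sample outcomes is assumed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Outcome:
    """Hypothetical stand-in for a configured health outcome."""
    name: str
    status: int
    description: str
    condition: Callable[[], bool]  # stands in for the outcome's voter set
    enabled: bool = True


def evaluate(outcomes: List[Outcome]) -> Optional[Outcome]:
    """Return the first enabled outcome whose condition holds, or None if healthy."""
    for outcome in outcomes:
        if not outcome.enabled:
            continue  # disabled outcomes are not examined
        if outcome.condition():
            return outcome  # remaining outcomes are ignored
    return None


# Sample data: names and statuses mirror the provided outcomes,
# the order and the condition results are made up for illustration.
outcomes = [
    Outcome("errorTest", 502, "Test health error (Magnolia is really OK)",
            lambda: True, enabled=False),
    Outcome("error500", 500, "Magnolia has internal errors", lambda: False),
    Outcome("error503", 503, "Magnolia public instance has publishing failures",
            lambda: True),
    Outcome("error402", 402, "Magnolia license has expired!", lambda: True),
]
result = evaluate(outcomes)
```

In this sample, `error402` would also match, but it is never reported because `error503` is defined earlier. Ordering outcomes from most to least severe is therefore a reasonable convention.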
Here are the configurable properties of a health outcome:
Node name | Value |
---|---|
modules | |
healthcheck | |
config | |
outcomes | |
<health outcome name> | A unique name identifying the health outcome |
class | Should be info.magnolia.health.HealthOutcome |
enabled |
If |
conditions | |
class | The class name for a boolean voter set If not specified, it will be |
voters | |
<health voter or boolean voter set> | Configuration of health voters is described below. Note: you can also define further boolean voter sets, along with boolean operations, to build up complex conditions. |
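Nested voter sets behave like nested boolean expressions. The sketch below models that idea in Python under stated assumptions: the helper names `all_of`, `any_of`, and `negate` are hypothetical and do not correspond to class or property names in the module, and the sample voters return fixed values.

```python
from typing import Callable, Sequence

Voter = Callable[[], bool]


def all_of(voters: Sequence[Voter]) -> Voter:
    """Voter set that is true only if every member is true (logical AND)."""
    return lambda: all(v() for v in voters)


def any_of(voters: Sequence[Voter]) -> Voter:
    """Voter set that is true if at least one member is true (logical OR)."""
    return lambda: any(v() for v in voters)


def negate(voter: Voter) -> Voter:
    """Invert the result of a single voter."""
    return lambda: not voter()


# Hypothetical leaf voters with fixed results.
license_message_in_workspace: Voter = lambda: False
license_warning_in_health_log: Voter = lambda: True
context_available: Voter = lambda: True

# "License expired" if either source reports it.
license_expired = any_of([license_message_in_workspace,
                          license_warning_in_health_log])
# "Context missing" is the inverse of "context available".
context_missing = negate(context_available)
```

Because each set is itself a voter, sets can be nested to any depth, for example `all_of([license_expired, negate(context_missing)])`.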
Health voters
Health voters check a single, specific condition about the health of a Magnolia instance. They can be combined with other health voters and boolean voter sets to form complicated logical expressions for a particular health outcome.
The Extended Health Check module includes several health voters:
- To check if a Magnolia context is available
- To check whether specific health events exist
- To check if Magnolia needs to be updated
- To check whether certain nodes or properties exist in Magnolia's JCR repository
ContextAvailableVoter - check if a Magnolia context is available
The Magnolia context is fundamental to Magnolia operation, so the absence of an available context indicates a serious problem with the instance.
ContextAvailableVoter has the following configuration:
Node name | Value |
---|---|
<voter name> | Name of the voter |
class | Should be |
enabled |
If |
not |
If |
HealthEventPropertyVoter - checks for specified health events
The HealthEventPropertyVoter checks whether specific health events exist meeting the configured criteria. You can also specify a threshold for the number of health events found, as well as the expected value of a health event property.
HealthEventPropertyVoter has the following configuration:
Node name | Value |
---|---|
<voter name> | Name of the voter |
class | Should be |
enabled |
If |
not |
If |
identifier | The Health events have the following identifiers:
If not specified, the identifier will be |
propertyName | (required) The name of the health event property whose value will be checked |
propertyValue | (required) The expected value of the health event property |
predicate | Specifies how the value of The following comparisons are available:
|
threshold | The number of health events matching the If not specified, |
interval | Defines an Health events outside of the interval will not be checked. Use interval to limit the health events considered (e.g. publication errors within the last 30 minutes). If interval is less than |
MagnoliaUpdatedNeededVoter - checks Magnolia modules needing updating
The MagnoliaUpdatedNeededVoter checks whether one or more Magnolia modules need updating.
MagnoliaUpdatedNeededVoter has the following configuration:
Node name | Value |
---|---|
<voter name> | Name of the voter |
class | Should be |
enabled |
If |
not |
If |
PublicationFailureVoter - checks for Magnolia publication failures
The PublicationFailureVoter checks whether a publication failure has occurred.
PublicationFailureVoter has the following configuration:
Node name | Value |
---|---|
<voter name> | Name of the voter |
class | Should be |
enabled |
If |
not |
If |
interval | Defines an Publication failures outside of the Use the If interval is less than |
threshold | The number of publication failures within the specified interval counted. If more publication failures are found, the voter will return true, otherwise false. If not specified, |
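The interplay of interval and threshold can be illustrated with a small Python sketch. It is a model of the behaviour described in the table, not the voter's actual code: the function name and parameters are hypothetical, the interval is represented as a `timedelta` because the configured unit is not stated here, and the threshold must be passed explicitly because its default is not stated either.

```python
from datetime import datetime, timedelta
from typing import Iterable


def publication_failure_vote(failure_times: Iterable[datetime],
                             now: datetime,
                             interval: timedelta,
                             threshold: int) -> bool:
    """Return True if more than `threshold` failures fall within `interval` of `now`."""
    # Failures outside the interval are not considered.
    recent = [t for t in failure_times if now - t <= interval]
    # The voter fires only when the count exceeds the threshold.
    return len(recent) > threshold


now = datetime(2024, 1, 1, 12, 0)
failures = [
    datetime(2024, 1, 1, 10, 0),   # outside a 30-minute interval
    datetime(2024, 1, 1, 11, 50),  # inside
    datetime(2024, 1, 1, 11, 55),  # inside
]
fires = publication_failure_vote(failures, now, timedelta(minutes=30), threshold=1)
```

With these sample values two failures fall inside the interval, so a threshold of 1 makes the voter return true while a threshold of 2 does not.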
QueryVoter - checks for nodes defined in the JCR repository
The QueryVoter checks whether nodes in the JCR repository are defined. This voter is useful for checking the messages workspace for system errors like the expiration of the Magnolia license.
QueryVoter has the following configuration:
Node name | Value |
---|---|
<voter name> | Name of the voter |
class | Should be |
enabled |
If |
not |
If |
workspace | (required) The workspace that will be searched |
query | A valid JCR-SQL2 query that will be evaluated in the workspace |
threshold | The number of nodes expected to be found for the health voter to return If not specified, |
Health events
Health events are collected while Magnolia is running and provide a record that can be checked by health voters. Two health voters, PublicationFailureVoter and HealthEventPropertyVoter, use health events; the other voters (ContextAvailableVoter, MagnoliaUpdatedNeededVoter and QueryVoter) all check the state of Magnolia at the time of execution.
Health events are collected from two sources:
- The Magnolia log
- The results of Magnolia publications
Both sources can provide valuable insight into what has happened in a Magnolia instance outside of the time Magnolia's health is being checked.
Health events have:
- an identifier to indicate where the health event came from: "loggedMessage" for health events from Magnolia logging and "publicationError" from errors occurring during a Magnolia publication
- name / value properties that depend on where the health event was collected
Health Log
Health events are stored in a health log, and health voters can check the health log for events matching their configuration to assess Magnolia's health.
The health log can store a limited number of health events:
- up to 10,000 total health events
- health events older than 6 hours are discarded
Your health voters should not use intervals longer than 6 hours.
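To see why longer intervals cannot work, it helps to model the two limits. The Python sketch below is a simplified, hypothetical model of such a bounded log, not the module's data structure. In particular, it assumes that the oldest event is dropped when the log is full, which the limits above do not specify.

```python
from collections import deque
from datetime import datetime, timedelta

MAX_EVENTS = 10_000            # size limit stated above
MAX_AGE = timedelta(hours=6)   # age limit stated above


class HealthLog:
    """Hypothetical bounded store for health events."""

    def __init__(self, max_events: int = MAX_EVENTS, max_age: timedelta = MAX_AGE):
        # Assumption: a full log silently drops its oldest entry.
        self._events = deque(maxlen=max_events)
        self._max_age = max_age

    def add(self, identifier: str, properties: dict, now: datetime) -> None:
        self._events.append((now, identifier, dict(properties)))
        self._discard_expired(now)

    def _discard_expired(self, now: datetime) -> None:
        # Events are appended in time order, so expired ones sit at the front.
        while self._events and now - self._events[0][0] > self._max_age:
            self._events.popleft()

    def events(self, now: datetime) -> list:
        self._discard_expired(now)
        return list(self._events)


log = HealthLog()
start = datetime(2024, 1, 1, 0, 0)
log.add("publicationError", {"reason": "example"}, now=start)
log.add("loggedMessage", {"logLevel": "WARN"}, now=start + timedelta(hours=7))
remaining = log.events(now=start + timedelta(hours=7))
```

In this model a voter asking for events from the last 7 hours would never see the first event, because it has already been discarded.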
Collecting health events from Magnolia logs
You can collect health events from Magnolia logs and save them in the health log through Magnolia's log4j configuration.
You will need to set up two log4j elements:
- A health log "Appender" to store any matching messages into the health log
- One or more "Loggers" to select log messages to be saved by the health log appender
Note that you can filter events by both the health log appender (using the "Filters" attribute) and the loggers (using the "level" attribute).
The health log appender is declared in the Extended Health Check module, so you can use it in your log4j configuration without further declarations.
Here's a sample health log appender:
<HealthMonitor name="license-monitor" messagePattern=".+">
  <PatternLayout pattern="%-5p %c %d{dd.MM.yyyy HH:mm:ss} -- %m%n"/>
</HealthMonitor>
This HealthMonitor appender will save any log message directed toward it (messagePattern will match any non-empty message) with the specified layout pattern.
HealthMonitor will save any matching log message to the health log with the following name / value properties:
- logLevel: the log level of the message
- logMessage: the log message
- logThread: the thread where the message was logged
- logName: the name of the Logger
- logCallerFQCN: the fully qualified class name where the message was logged
Here are some sample loggers that select log messages and send them to the HealthMonitor appender above:
<Logger name="info.magnolia.multisite.sites.MultiSiteManager" level="WARN">
  <AppenderRef ref="license-monitor"/>
</Logger>
<Logger name="info.magnolia.sitemesh.config.MagnoliaConfigurableSiteMeshFilter" level="WARN">
  <AppenderRef ref="license-monitor"/>
</Logger>
These loggers will select WARN level messages from the Magnolia Multi-Site module (specifically info.magnolia.multisite.sites.MultiSiteManager) and the Magnolia SiteMesh caching module (specifically info.magnolia.sitemesh.config.MagnoliaConfigurableSiteMeshFilter) and send them to the HealthMonitor appender named "license-monitor". MultiSiteManager and MagnoliaConfigurableSiteMeshFilter both report expired licenses at WARN level.
Collecting health events from publications
Errors during a Magnolia publication are not completely captured in the Magnolia logs; the specific error message returned by a Magnolia public instance to the Magnolia author is not recorded in the log of the public instance. Knowing why a publication failed is an important indication of the health of a Magnolia public instance: if the publication failed because of some failure of the JCR repository, the JCR repository of the Magnolia public instance may be corrupted and the instance should be replaced or repaired. On the other hand, some publication errors are recoverable. For example, publishing a child node whose parent has not been published causes a publication error that can be remedied by publishing the parent node and republishing the child node.
Publication errors can be collected by a filter. The filter detects publication requests and saves the results of the publication into the health log.
The Extended Health Check module will install a filter "publishingMonitor" before the publication filter "publishing" to collect the result of publications.
If you change either the publishingMonitor filter or the publishing filter, please note:
- the publishingMonitor filter must be located before the publishing filter in the filter chain to collect publication results
- the publishingMonitor filter should have the same bypasses configuration as the publishing filter to identify publication requests
If you don't want to collect publication results in the health log, you can disable the publishingMonitor filter (set its enabled property to false) or delete the publishingMonitor filter.
Health outcomes provided
The Extended Health Check module includes a number of predefined health outcomes:
Name | HTTP status returned | Description returned | Condition |
---|---|---|---|
error500 | 500 | Magnolia has internal errors | Couldn't get a Magnolia context |
error501 | 501 | Magnolia has internal errors | One or more Magnolia modules needs to be updated |
error503 | 503 | Magnolia public instance has publishing failures | One or more publication errors was found in the health log |
error402 | 402 | Magnolia license has expired! | One or more license expired messages were found in the messages workspace or one or more license expired log messages were found in the health log |
errorTest | 502 | Test health error (Magnolia is really OK) | A test outcome (will always be returned) for testing the health check endpoint. NOTE: this outcome is disabled on installation of the Extended Health Check module. |
REST API
The Extended Health Check module comes with a REST endpoint:
GET
Health Check
Returns the current health of a Magnolia instance according to its configured health outcomes.
Request URL
/.rest/health/v1/check
Returns the health check results with:
- healthy - true if the instance considers itself healthy (HTTP 200) or false if the conditions of some health outcome were met (HTTP status of the health outcome)
- description - the description of the health outcome or "Magnolia is healthy" if healthy
Returns an HTTP status of:
- 200 - the instance is healthy
- the HTTP status of the health outcome whose conditions were met
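For automated monitoring, a small client can poll the check endpoint and act on the status. The following Python sketch uses only the standard library. It assumes the response body is JSON carrying the healthy and description fields, which is not confirmed here, and the function names and the base URL in the usage note are hypothetical.

```python
import json
import urllib.error
import urllib.request

CHECK_PATH = "/.rest/health/v1/check"


def interpret(status: int, body: str) -> dict:
    """Turn an HTTP status and response body into a simple result dict."""
    try:
        payload = json.loads(body) if body else {}
    except ValueError:
        payload = {}  # body was not JSON (assumption about the format failed)
    if not isinstance(payload, dict):
        payload = {}
    return {
        "status": status,
        "healthy": status == 200,  # 200 is the only healthy status
        "description": payload.get("description", ""),
    }


def check_health(base_url: str, timeout: float = 5.0) -> dict:
    """Call the check endpoint; connection errors (URLError) propagate to the caller."""
    url = base_url.rstrip("/") + CHECK_PATH
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return interpret(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Non-2xx statuses (e.g. a health outcome's 503) arrive as HTTPError.
        return interpret(err.code, err.read().decode("utf-8", errors="replace"))
```

A call such as `check_health("http://localhost:8080")` would return a dict that a scheduler or autoscaling hook could inspect. Depending on how the instance is secured, the request may also need credentials, which this sketch omits.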
GET
Reset Magnolia health events
Removes all Magnolia health events.
Note: reset may not change Magnolia health status if a health outcome uses health voters like ContextAvailableVoter, MagnoliaUpdateNeededVoter or QueryVoter that do not use health events.
Request URL
/.rest/health/v1/reset
Returns an HTTP status of:
- 200 - all health events removed
GET
Retrieve Magnolia health events
Retrieves all health events currently in the health log.
Request URL
/.rest/health/v1/dump
Returns an HTTP status of:
- 200 - all health events returned in the response body
Warnings
- This module is at INCUBATOR level.
Changelog
- Version 1.0 - Initial release of the extensions version of the module.