Python Nagios extensions
PluginHelper takes away some of the tedious work of writing Nagios plugins. Primary features include:
Usage:
p = PluginHelper()
p.status(warning)
p.add_summary('Example Plugin with warning status')
p.add_metric('cpu load', '90')
p.exit()
Appends message to the end of the plugin's long_output. The message does not need a trailing newline; entries are joined with \n when the output is retrieved.
>>> p = PluginHelper()
>>> p.add_long_output('Status of sensor 1')
>>> p.add_long_output('* Temperature: OK')
>>> p.add_long_output('* Humidity: OK')
>>> p.get_long_output()
'Status of sensor 1\n* Temperature: OK\n* Humidity: OK'
Adds a numerical metric (it will be output as Nagios performance data).
>>> p = PluginHelper()
>>> p.add_metric(label="load1", value="7")
>>> p.add_metric(label="load5", value="5")
>>> p.add_metric(label="load15",value="2")
>>> p.get_perfdata()
"'load1'=7;;;; 'load5'=5;;;; 'load15'=2;;;;"
>>> p = PluginHelper()
>>> p.add_metric(perfdatastring="load1=6;;;;")
>>> p.add_metric(perfdatastring="load5=4;;;;")
>>> p.add_metric(perfdatastring="load15=1;;;;")
>>> p.get_perfdata()
"'load1'=6;;;; 'load5'=4;;;; 'load15'=1;;;;"
Same as self.parser.add_option()
Update exit status of the nagios plugin. This function will keep history of the worst status added
Examples:
>>> p = PluginHelper()
>>> p.add_status(0) # ok
>>> p.add_status(2) # critical
>>> p.add_status(1) # warning
>>> p.get_status()
2
>>> p = PluginHelper()
>>> p.add_status('warning')
>>> p.add_status('ok')
>>> p.get_status()
1
>>> p.add_status('okay')
Traceback (most recent call last):
...
Exception: Invalid status supplied "okay"
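The "worst status wins" behaviour shown above can be sketched without pynag. This is only an illustration: the class name StatusTracker is hypothetical, it raises ValueError rather than the plain Exception shown in the traceback, and treating UNKNOWN (3) as outranking CRITICAL (2) is an assumption of the sketch.

```python
# Assumed mapping of status names to Nagios exit codes.
STATUS_CODES = {"ok": 0, "warning": 1, "critical": 2, "unknown": 3}


class StatusTracker:
    """Hypothetical helper that remembers the worst status added so far."""

    def __init__(self):
        self._worst = None  # nothing recorded yet

    def add_status(self, status):
        # Accept either a name such as 'warning' or an integer 0..3.
        code = STATUS_CODES.get(status, status) if isinstance(status, str) else status
        if code not in STATUS_CODES.values():
            raise ValueError('Invalid status supplied "%s"' % status)
        if self._worst is None or code > self._worst:
            self._worst = code

    def get_status(self):
        # With no status recorded, report UNKNOWN (3).
        return 3 if self._worst is None else self._worst
```

Keeping only the running maximum is enough here, because the individual statuses are never read back.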
Adds message to Plugin Summary
Checks all metrics (added with add_metric()) against any thresholds set in self.options.thresholds or with --threshold on the command line.
Checks one specific metric against a list of thresholds. Updates self.status() and writes to the summary or long_output as appropriate.
Examples:
>>> p = PluginHelper()
>>> thresholds = [(warning,'2..5'), (critical,'5..inf')]
>>> p.get_plugin_output()
'Unknown -'
>>> p.add_metric('load15', '3')
>>> p.check_metric('load15',thresholds)
>>> p.get_plugin_output()
"Warning - Warning on load15 | 'load15'=3;@2:5;~:5;;"
>>> p = PluginHelper()
>>> thresholds = [(warning,'2..5'), (critical,'5..inf')]
>>> p.add_metric('load15', '3')
>>> p.verbose = True
>>> p.check_metric('load15',thresholds)
>>> p.get_plugin_output()
"Warning - Warning on load15 | 'load15'=3;@2:5;~:5;;\nWarning on load15"
Invalid metric:
>>> p = PluginHelper()
>>> p.add_status(ok)
>>> p.add_summary('Everythings fine!')
>>> p.get_plugin_output()
'OK - Everythings fine!'
>>> thresholds = [(warning,'2..5'), (critical,'5..inf')]
>>> p.check_metric('never_added_metric', thresholds)
>>> p.get_plugin_output()
'Unknown - Everythings fine!. Metric never_added_metric not found'
Invalid threshold:
>>> p = PluginHelper()
>>> thresholds = [(warning, 'invalid'), (critical,'5..inf')]
>>> p.add_metric('load1', '10')
>>> p.check_metric('load1', thresholds)
Traceback (most recent call last):
...
SystemExit: 3
Converts new threshold range format to old one. Returns None.
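The perfdata strings in the check_metric and get_perfdata examples pair new-style ranges with legacy ones: 2..5 appears as @2:5, 5..inf as ~:5, and -inf..10 as 10:. A minimal sketch that reproduces just those mappings follows. The function name new_range_to_legacy is hypothetical, it returns the converted string instead of updating anything in place, and pynag's own conversion may handle more cases.

```python
def new_range_to_legacy(new_range):
    """Translate 'start..end' (alert when inside) into a legacy range string."""
    start, separator, end = new_range.partition("..")
    if not separator:
        raise ValueError("expected 'start..end', got %r" % new_range)
    if start == "-inf" and end == "inf":
        # Not covered by the examples, so this sketch refuses to guess.
        raise ValueError("fully open ranges are outside the scope of this sketch")
    if start == "-inf":
        return "%s:" % end        # legacy: alert when the value is below `end`
    if end == "inf":
        return "~:%s" % start     # legacy: alert when the value is above `start`
    return "@%s:%s" % (start, end)  # legacy: alert when inside start..end
```

Note that the two notations may treat the boundary value itself differently, so the result is an approximation at the exact edges of a range.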
Prints all collected output to the screen and exits Nagios style. No arguments are needed unless you want to override the default behavior.
Returns an optionParser.Values instance of all defaults after parsing extra opts config file
The Nagios extra-opts spec we use is the same as described here: http://nagiosplugins.org/extra-opts
Arguments
Returns all long_output that has been added via add_long_output
Return one specific metric (PerfdataMetric object) with the specified label. Returns None if not found.
Example:
>>> p = PluginHelper()
>>> p.add_metric(label="load1", value="7")
>>> p.add_metric(label="load15",value="2")
>>> p.get_metric("load1")
'load1'=7;;;;
>>> p.get_metric("unknown") # Returns None
Get perfdatastring for all valid perfdatametrics collected via add_perfdata
Examples:
>>> p = PluginHelper()
>>> p.add_metric(label="load1", value="7", warn="-inf..10", crit="10..inf")
>>> p.add_metric(label="load5", value="5", warn="-inf..7", crit="7..inf")
>>> p.add_metric(label="load15",value="2", warn="-inf..5", crit="5..inf")
>>> p.get_perfdata()
"'load1'=7;10:;~:10;; 'load5'=5;7:;~:7;; 'load15'=2;5:;~:5;;"
Example with legacy output (show_legacy should be set with a cmdline option):
>>> p.show_legacy = True
>>> p.get_perfdata()
"'load1'=7;10:;~:10;; 'load5'=5;7:;~:7;; 'load15'=2;5:;~:5;;"
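Each metric in these strings follows the Nagios perfdata layout 'label'=value[UOM];warn;crit;min;max, with metrics separated by spaces. A minimal sketch of that formatting, independent of pynag (the function name format_perfdata and its parameter names are hypothetical):

```python
def format_perfdata(label, value, warn="", crit="", minimum="", maximum="", uom=""):
    """Render one metric as 'label'=value[uom];warn;crit;min;max."""
    fields = ["'%s'=%s%s" % (label, value, uom), warn, crit, minimum, maximum]
    return ";".join(str(field) for field in fields)


# Join several metrics with single spaces, as in the examples above.
metrics = [("load1", "7"), ("load5", "5"), ("load15", "2")]
perfdata = " ".join(format_perfdata(label, value) for label, value in metrics)
print(perfdata)
```

Empty fields are kept as empty strings so that the four semicolons are always present, which matches the strings shown in the doctests.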
Get all plugin output as it would be printed to screen with self.exit()
Examples of functionality:
>>> p = PluginHelper()
>>> p.get_plugin_output()
'Unknown -'
>>> p = PluginHelper()
>>> p.add_summary('Testing')
>>> p.add_long_output('Long testing output')
>>> p.add_long_output('More output')
>>> p.get_plugin_output(exit_code=0)
'OK - Testing\nLong testing output\nMore output'
>>> p = PluginHelper()
>>> p.add_summary('Testing')
>>> p.add_status(0)
>>> p.get_plugin_output()
'OK - Testing'
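The examples in this section suggest a simple layout: an optional status prefix followed by the summary, then " | " and the perfdata if there is any, then the long output on the following lines. A minimal sketch under those assumptions (format_plugin_output is a hypothetical name, and the "Critical" label is assumed by analogy with the labels that do appear in the examples):

```python
# Status labels; 'OK', 'Warning' and 'Unknown' appear in the examples,
# 'Critical' is an assumption.
STATUS_TEXT = {0: "OK", 1: "Warning", 2: "Critical", 3: "Unknown"}


def format_plugin_output(status, summary="", perfdata="", long_output=(), show_status=True):
    """Assemble the text a plugin would print, first line plus long output."""
    first_line = summary
    if show_status:
        first_line = "%s - %s" % (STATUS_TEXT[status], summary)
    if perfdata:
        first_line = "%s | %s" % (first_line, perfdata)
    # Strip so that an empty summary yields 'Unknown -' rather than 'Unknown - '.
    lines = [first_line.strip()] + list(long_output)
    return "\n".join(lines)
```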
>>> p = PluginHelper()
>>> p.show_status_in_summary = False
>>> p.add_summary('Testing')
>>> p.add_metric(label="load1", value="7")
>>> p.add_metric(label="load5", value="5")
>>> p.add_metric(label="load15",value="2")
>>> p.get_plugin_output(exit_code=0)
"Testing | 'load1'=7;;;; 'load5'=5;;;; 'load15'=2;;;;"
>>> p = PluginHelper()
>>> p.show_status_in_summary = False
>>> p.add_summary('Testing')
>>> p.add_long_output('Long testing output')
>>> p.add_long_output('More output')
>>> p.add_metric(label="load1", value="7")
>>> p.add_metric(label="load5", value="5")
>>> p.add_metric(label="load15",value="2")
>>> p.get_plugin_output(exit_code=0)
"Testing | 'load1'=7;;;; 'load5'=5;;;; 'load15'=2;;;;\nLong testing output\nMore output"
Returns the worst Nagios status (integer 0, 1, 2 or 3) that has been added with add_status().
If status has never been added, returns 3 for UNKNOWN
Parses command-line arguments and prints an error if there is a syntax error.
Executes "function" and exits Nagios style with status "unknown" if there are any exceptions. The stacktrace will be in long_output.
Example:
>>> p = PluginHelper()
>>> p.add_status('ok')
>>> p.get_status()
0
>>> p.add_status('okay')
Traceback (most recent call last):
...
Exception: Invalid status supplied "okay"
>>> p.run_function( p.add_status, 'warning' )
>>> p.get_status()
1
>>> p.run_function( p.add_status, 'okay' )
Traceback (most recent call last):
...
SystemExit: 3
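The pattern behind run_function, calling something and turning any exception into an UNKNOWN exit, can be sketched with the standard library alone. The name run_guarded and the exact wording of the printed first line are assumptions of this sketch, not pynag's output.

```python
import sys
import traceback


def run_guarded(function, *args, **kwargs):
    """Call function; on any exception print the traceback and exit with 3."""
    try:
        return function(*args, **kwargs)
    except Exception:
        # First line is the summary Nagios displays; the traceback that
        # follows plays the role of the long output.
        print("Unknown - unhandled exception in plugin")
        print(traceback.format_exc())
        sys.exit(3)
```

Catching Exception rather than BaseException lets a deliberate sys.exit() or a keyboard interrupt inside the wrapped function pass through unchanged.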
Overwrite current long_output with message
Example:
>>> s = PluginHelper()
>>> s.add_long_output('first long output')
>>> s.set_long_output('Fatal error')
>>> s.get_long_output()
'Fatal error'
Overwrite current summary with message
Example:
>>> s = PluginHelper()
>>> s.add_summary('first summary')
>>> s.set_summary('Fatal error')
>>> s.get_summary()
'Fatal error'
Configures the plugin to time out after the number of seconds given in the seconds argument.
Same as get_status() if new_status is None; otherwise calls add_status(new_status).
Returns True if value is within range_threshold.
Format of range_threshold is according to: http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT
Range definition   Generate an alert if x...
10                 < 0 or > 10, (outside the range of {0 .. 10})
10:                < 10, (outside {10 .. ∞})
~:10               > 10, (outside the range of {-∞ .. 10})
10:20              < 10 or > 20, (outside the range of {10 .. 20})
@10:20             ≥ 10 and ≤ 20, (inside the range of {10 .. 20})
# Example runs for doctest, False should mean alert
>>> check_range(78, "90") # Example disk is 78% full, threshold is 90
True
>>> check_range(5, 10) # Everything between 0 and 10 is True
True
>>> check_range(0, 10) # Everything between 0 and 10 is True
True
>>> check_range(10, 10) # Everything between 0 and 10 is True
True
>>> check_range(11, 10) # Everything between 0 and 10 is True
False
>>> check_range(-1, 10) # Everything between 0 and 10 is True
False
>>> check_range(-1, "~:10") # Everything Below 10
True
>>> check_range(11, "10:") # Everything above 10 is True
True
>>> check_range(1, "10:") # Everything above 10 is True
False
>>> check_range(0, "5:10") # Everything between 5 and 10 is True
False
>>> check_range(0, "@5:10") # Everything outside 5:10 is True
True
>>> check_range(None) # Return False if value is not a number
False
>>> check_range("10000000 PX") # What happens on invalid input
False
>>> check_range("10000000", "invalid:invalid") # What happens on invalid range
Traceback (most recent call last):
...
PynagError: Invalid threshold format: invalid:invalid
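To make the legacy range rules concrete, here is a minimal stand-alone sketch of a parser for them. It is not pynag's check_range: the name in_legacy_range is hypothetical, it assumes the value is already a number (so the None and "10000000 PX" cases above are not handled), and an unparsable range raises ValueError rather than PynagError.

```python
import math


def in_legacy_range(value, range_threshold):
    """Return True when value does not trigger an alert for a legacy range."""
    spec = str(range_threshold)
    alert_inside = spec.startswith("@")  # '@' flips the rule: alert when inside
    if alert_inside:
        spec = spec[1:]
    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
    else:
        start_text, end_text = "", spec  # a bare '10' means 0..10
    if start_text == "~":
        start = -math.inf                # '~' stands for negative infinity
    elif start_text == "":
        start = 0.0                      # an omitted start defaults to 0
    else:
        start = float(start_text)
    end = math.inf if end_text == "" else float(end_text)  # omitted end is infinity
    inside = start <= value <= end
    return not inside if alert_inside else inside
```

As in the doctests, True means "no alert" and False means "alert".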
Checks value against warning/critical and returns Nagios exit code.
Format of range_threshold is according to: http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT
# Example Usage:
>>> check_threshold(88, warning="0:90", critical="0:95")
0
>>> check_threshold(92, warning=":90", critical=":95")
1
>>> check_threshold(96, warning=":90", critical=":95")
2
Nagios plugin helper library based on Nagios::Plugin
Sample usage
from pynag.Plugins import WARNING, CRITICAL, OK, UNKNOWN, simple as Plugin
# Create plugin object
np = Plugin()
# Add arguments
np.add_arg("d", "disk")
# Do activate plugin
np.activate()
# ... check stuff, np['disk'] to address variable assigned above ...
# Add a status message and severity
np.add_message( WARNING, "Disk nearing capacity" )
# Get parsed code and messages
(code, message) = np.check_messages()
# Return information and exit
nagios_exit(code, message)
Parse out all command line options and get ready to process the plugin. This should be run after argument preps
Add an argument to be handled by the option parser. By default, the arg is not required.
required = optional parameter
action = [store, append, store_true]
Add a message with code to the object. May be called multiple times. The messages added are checked by check_messages, following.
Only CRITICAL, WARNING, OK and UNKNOWN are accepted as valid codes.
Append perfdata string to the end of the message
Check the current set of messages and return an appropriate nagios return code and/or a result message. In scalar context, returns only a return code; in list context returns both a return code and an output message, suitable for passing directly to nagios_exit()
A string used to join the relevant array to generate the message string returned in list context i.e. if the ‘critical’ array is non-empty, check_messages would return:
joinstr.join(critical)
By default, only one set of messages are joined and returned in the result message i.e. if the result is CRITICAL, only the ‘critical’ messages are included in the result; if WARNING, only the ‘warning’ messages are included; if OK, the ‘ok’ messages are included (if supplied) i.e. the default is to return an ‘errors-only’ type message.
If joinallstr is supplied, however, it will be used as a string to join the resultant critical, warning, and ok messages together i.e. all messages are joined and returned.
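The joining rules described above can be illustrated with a small stand-alone function. This is a sketch, not the library's method: it takes the message lists as a plain dict (an assumption made to keep the example self-contained) and always returns a (code, message) tuple, since Python has no scalar or list context.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def check_messages(messages, joinstr=" ", joinallstr=None):
    """messages maps 'critical', 'warning' and 'ok' to lists of strings."""
    critical = messages.get("critical", [])
    warning = messages.get("warning", [])
    ok = messages.get("ok", [])

    # The worst non-empty list decides the return code.
    if critical:
        code, relevant = CRITICAL, critical
    elif warning:
        code, relevant = WARNING, warning
    else:
        code, relevant = OK, ok

    if joinallstr is not None:
        # Join each non-empty group with joinstr, then the groups with joinallstr.
        groups = [joinstr.join(group) for group in (critical, warning, ok) if group]
        return code, joinallstr.join(groups)
    # Default: an 'errors-only' message built from the deciding list.
    return code, joinstr.join(relevant)
```

For example, with one critical and two warning messages the default call returns only the critical text, while passing joinallstr="; " returns all three messages.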
Check if a value is within a given range. This should replace change_threshold eventually. Exits with appropriate exit code given the range.
Taken from: http://nagiosplug.sourceforge.net/developer-guidelines.html
Range definition   Generate an alert if x...
10                 < 0 or > 10, (outside the range of {0 .. 10})
10:                < 10, (outside {10 .. ∞})
~:10               > 10, (outside the range of {-∞ .. 10})
10:20              < 10 or > 20, (outside the range of {10 .. 20})
@10:20             ≥ 10 and ≤ 20, (inside the range of {10 .. 20})
Changes CRITICAL, WARNING, OK and UNKNOWN code_text to integer representation for use within add_message() and nagios_exit()
Exit with exit_code, message, and optionally perfdata
Wrapper around pynag.Utils.send_nsca - here for backwards compatibility
These are helper functions and implementation of proposed new threshold format for nagios plugins according to: http://nagiosplugins.org/rfc/new_threshold_syntax
Returns True if value is within range, else False
Examples:
>>> check_range(5, "0..10")
True
>>> check_range(11, "0..10")
False
Checks value against warning/critical and returns Nagios exit code.
Format of range_threshold is according to: http://nagiosplugins.org/rfc/new_threshold_syntax
# Example Usage:
>>> check_threshold(88, warning="90..95", critical="95..100")
0
>>> check_threshold(92, warning="90..95", critical="95..100")
1
>>> check_threshold(96, warning="90..95", critical="95..100")
2
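In the new syntax a range names the values that trigger a state, so the check reduces to "is the value inside start..end". A minimal sketch of that logic follows. The names in_new_range and threshold_state are hypothetical, only the plain start..end form with optional -inf and inf bounds is handled, and checking critical before warning is an assumption about how overlapping bounds should resolve.

```python
def in_new_range(value, new_range):
    """Return True when value lies inside 'start..end' (bounds included)."""
    start_text, end_text = new_range.split("..")
    # float() already understands 'inf' and '-inf'.
    return float(start_text) <= value <= float(end_text)


def threshold_state(value, warning=None, critical=None):
    """Return 2, 1 or 0 depending on which range, if any, contains value."""
    if critical is not None and in_new_range(value, critical):
        return 2
    if warning is not None and in_new_range(value, warning):
        return 1
    return 0
```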
Takes a threshold string as input and returns a hash map of options and values.
>>> parse_threshold('metric=disk_usage,ok=0..90,warning=90..95,critical=95.100')
{'thresholds': [(0, '0..90'), (1, '90..95'), (2, '95.100')], 'metric': 'disk_usage'}
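The shape of that result can be reproduced with a few lines of string handling. This is a sketch rather than pynag's parser: the function name parse_threshold_string is hypothetical, and range values are passed through without validation, which is consistent with '95.100' appearing unchanged in the output above.

```python
# Assumed mapping of state names to Nagios exit codes.
STATES = {"ok": 0, "warning": 1, "critical": 2, "unknown": 3}


def parse_threshold_string(threshold):
    """Split 'key=value,key=value' into state thresholds and other options."""
    parsed = {"thresholds": []}
    for item in threshold.split(","):
        key, separator, value = item.partition("=")
        if not separator:
            raise ValueError("expected key=value, got %r" % item)
        key = key.strip()
        if key in STATES:
            # State names become (exit_code, range) tuples, in input order.
            parsed["thresholds"].append((STATES[key], value))
        else:
            # Anything else, such as 'metric', is kept as a plain option.
            parsed[key] = value
    return parsed
```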