URL monitor plugin for cachethq.io

mtakaki, updated 🕥 2023-02-10 23:09:55


cachet-url-monitor

Python plugin for cachet that monitors a URL, verifying its response status and latency. The frequency at which the URL is tested is configurable, along with the assertions applied to the response.

This project is available at PyPI: https://pypi.python.org/pypi/cachet-url-monitor

Configuration

```yaml
endpoints:
  - name: Google
    url: http://www.google.com
    method: GET
    header:
      SOME-HEADER: SOME-VALUE
    timeout: 1 # seconds
    expectation:
      - type: HTTP_STATUS
        status_range: 200-205
      - type: LATENCY
        threshold: 1
      - type: REGEX
        regex: ".*<body>.*"
    allowed_fails: 0
    component_id: 1
    metric_id: 1
    action:
      - UPDATE_STATUS
    public_incidents: true
    latency_unit: ms
    frequency: 5
  - name: Amazon
    url: http://www.amazon.com
    method: GET
    header:
      SOME-HEADER: SOME-VALUE
    timeout: 1 # seconds
    expectation:
      - type: HTTP_STATUS
        status_range: 200-205
        incident: MAJOR
      - type: LATENCY
        threshold: 1
      - type: REGEX
        regex: ".*<body>.*"
        threshold: 10
    allowed_fails: 0
    component_id: 2
    action:
      - CREATE_INCIDENT
    public_incidents: true
    latency_unit: ms
    frequency: 5
  - name: Insecure-site
    url: https://www.Insecure-site-internal.com
    method: GET
    header:
      SOME-HEADER: SOME-VALUE
    insecure: true
    timeout: 1 # seconds
    expectation:
      - type: HTTP_STATUS
        status_range: 200-205
    allowed_fails: 0
    component_id: 2
    action:
      - CREATE_INCIDENT
    public_incidents: true
    frequency: 5
cachet:
  api_url: http://status.cachethq.io/api/v1
  token:
    - type: ENVIRONMENT_VARIABLE
      value: CACHET_TOKEN
    - type: AWS_SECRETS_MANAGER
      secret_name: cachethq
      secret_key: token
      region: us-west-2
    - type: TOKEN
      value: my_token
webhooks:
  - url: "https://push.example.com/message?token=<apptoken>"
    params:
      title: "{title}"
      message: "{message}"
      priority: 5
messages:
  incident_outage: "{name} is unavailable"
  incident_operational: "{name} is operational"
  incident_performance: "{name} has degraded performance"
```

  • endpoints, the configuration for the URL(s) that will be monitored.
    • name, the name of the component. This is mandatory (since 0.6.0) so we can distinguish the logs for each URL being monitored.
    • url, the URL that is going to be monitored. mandatory
    • method, the HTTP method that will be used by the monitor. mandatory
    • header, client header passed to the request. Remove if you do not want to pass a header.
    • insecure, set this key to true for URLs that have self-signed or invalid SSL certificates, or when you wish to disable the SSL check. It defaults to false, so SSL certificates are validated by default.
    • timeout, how long we'll wait before considering the request failed, in seconds. mandatory
    • expectation, the list of expectations set for the URL. mandatory
      • HTTP_STATUS, we will verify if the response status code falls into the expected range. Please keep in mind the range is inclusive on the first number and exclusive on the second number. If just one value is specified, it will default to only the given value, for example 200 will be converted to 200-201.
      • LATENCY, we measure how long the request took to get a response and fail if it's above the threshold. The unit is seconds.
      • REGEX, we verify if the response body matches the given regex.
    • allowed_fails, the incident is created or the component status is updated only after the specified number of failed connection attempts.
    • component_id, the id of the component we're monitoring. This will be used to update the status of the component. mandatory
    • metric_id, this will be used to store the latency of the API. If this is not set, it will be ignored.
    • action, the action to be taken when one of the expectations fails. This is optional and, if left blank, nothing will be done to the component.
      • CREATE_INCIDENT, we will create an incident when the expectation fails.
      • UPDATE_STATUS, updates the component status.
      • PUSH_METRICS, uploads response latency metrics.
    • public_incidents, boolean to decide if created incidents should be visible to everyone or only to logged in users. Important only if CREATE_INCIDENT or UPDATE_STATUS are set.
    • latency_unit, the latency unit used when reporting the metrics. It will automatically convert to the specified unit. It's not mandatory and it will default to seconds. Available units: ms, s, m, h.
    • frequency, how often we'll send a request to the given URL. The unit is in seconds.
  • cachet, the settings for our cachet server.
    • api_url, the cachet API endpoint. mandatory
    • token, the API token. It can either be a string (backwards compatible with old configuration) or a list of token providers. The providers are read in the specified order, falling back to the next one if no token could be found. (since 0.6.10) mandatory
      • ENVIRONMENT_VARIABLE, it will read the token from the specified environment variable.
      • TOKEN, it's a string and it will be read directly from the configuration.
      • AWS_SECRETS_MANAGER, it will attempt to read the token from AWS Secrets Manager. It requires setting up the AWS credentials in the docker container. More instructions below. It takes these parameters:
        • secret_name, the name of the secret.
        • secret_key, the key under which the token is stored.
        • region, the AWS region.
  • webhooks, generic webhooks to be notified about incident updates.
    • url, the webhook URL. It will be interpolated.
    • params, the POST parameters. They will be interpolated.
  • messages, customizes the text of generated events. Any endpoint parameter can be used in the interpolation.
    • incident_outage, title of the incident in case of an outage.
    • incident_performance, title of the incident in case of performance issues.
    • incident_operational, title of the incident in case the service is operational.
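
The HTTP_STATUS range semantics described in the list above (inclusive start, exclusive end, a single value expanded to a one-code range) can be sketched in Python as follows. This is a minimal illustration under those stated rules, not the project's actual parser; the function names `parse_status_range` and `status_matches` are hypothetical.

```python
from typing import Tuple


def parse_status_range(status_range) -> Tuple[int, int]:
    """Turn "200-205" into (200, 205); a single value such as 200 becomes (200, 201)."""
    # YAML may load a bare `200` as an int, so normalise to str first.
    parts = str(status_range).split("-")
    if len(parts) == 1:
        start = int(parts[0])
        return start, start + 1
    if len(parts) != 2:
        raise ValueError(f"invalid status range: {status_range!r}")
    return int(parts[0]), int(parts[1])


def status_matches(status_code: int, status_range) -> bool:
    """Inclusive on the first number, exclusive on the second."""
    start, end = parse_status_range(status_range)
    return start <= status_code < end


if __name__ == "__main__":
    print(status_matches(204, "200-205"))
    print(status_matches(205, "200-205"))
    print(status_matches(200, 200))
```

With the `200-205` range from the sample configuration, a 204 response passes while a 205 response fails, because the upper bound is exclusive.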

Each expectation has its own default incident status. It can be overridden by setting the incident property to any of the following values:

  • PARTIAL
  • MAJOR
  • PERFORMANCE

Choosing one of these statuses lets you control what kind of incident a failed expectation is reported as. These are the default incident statuses for each expectation type:

| Expectation | Incident status |
| ----------- | --------------- |
| HTTP_STATUS | PARTIAL |
| LATENCY | PERFORMANCE |
| REGEX | PARTIAL |
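
The defaults in this table, combined with the optional incident override, could be resolved along these lines. This is only a sketch: the names `DEFAULT_INCIDENT`, `ALLOWED_OVERRIDES` and `incident_status` are hypothetical, and the expectation is assumed to be the plain dictionary loaded from the YAML file.

```python
# Default incident status per expectation type, as listed in the table.
DEFAULT_INCIDENT = {
    "HTTP_STATUS": "PARTIAL",
    "LATENCY": "PERFORMANCE",
    "REGEX": "PARTIAL",
}

# Values accepted by the `incident` property.
ALLOWED_OVERRIDES = {"PARTIAL", "MAJOR", "PERFORMANCE"}


def incident_status(expectation: dict) -> str:
    """Return the override if one is set, otherwise the default for the type."""
    override = expectation.get("incident")
    if override is None:
        return DEFAULT_INCIDENT[expectation["type"]]
    if override not in ALLOWED_OVERRIDES:
        raise ValueError(f"unsupported incident status: {override!r}")
    return override


if __name__ == "__main__":
    # The Amazon endpoint in the sample configuration overrides HTTP_STATUS.
    print(incident_status({"type": "HTTP_STATUS", "status_range": "200-205", "incident": "MAJOR"}))
    print(incident_status({"type": "LATENCY", "threshold": 1}))
```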

The following parameters are available in webhook interpolation:

| Parameter | Description |
| --------- | ----------- |
| {title} | Event title, includes endpoint name and short status |
| {message} | Event message, same as sent to Cachet |
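
To illustrate how the `{title}` and `{message}` placeholders might be filled in, here is a small sketch that assumes plain `str.format` substitution on the URL and on every string parameter. The helper `interpolate_webhook` is hypothetical and is not taken from the project's code; it does not send any request.

```python
from typing import Dict, Tuple


def interpolate_webhook(url: str, params: Dict[str, object], title: str, message: str) -> Tuple[str, Dict[str, object]]:
    """Substitute {title} and {message} in the URL and in string parameters."""
    values = {"title": title, "message": message}
    rendered = {
        # Non-string values (such as `priority: 5`) are passed through untouched.
        key: value.format(**values) if isinstance(value, str) else value
        for key, value in params.items()
    }
    return url.format(**values), rendered


if __name__ == "__main__":
    url, params = interpolate_webhook(
        "https://push.example.com/message?token=<apptoken>",
        {"title": "{title}", "message": "{message}", "priority": 5},
        title="Google is unavailable",
        message="Request timed out",
    )
    print(url)
    print(params)
```

Note that with `str.format`, a placeholder other than the two listed in the table would raise a `KeyError`, so a real implementation may want to handle that case explicitly.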

AWS Secrets Manager

This tool can integrate with AWS Secrets Manager, fetching the token directly from the service. To get this functionality working, you will need to set up the AWS credentials in the container. The easiest way is to set the environment variables:

```bash
$ docker run --rm -it -e AWS_ACCESS_KEY_ID=xyz -e AWS_SECRET_ACCESS_KEY=aaa -v "$PWD"/my_config.yml:/usr/src/app/config/config.yml:ro mtakaki/cachet-url-monitor
```
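
The provider fallback described for the `token` setting (try each provider in order, move on when no token is found) can be sketched like this. It is an illustration only: the function names are hypothetical, and the AWS reader assumes the secret is stored as a JSON object of key/value pairs, which is an assumption rather than something this README states. `boto3` is imported lazily so the other providers work without it.

```python
import json
import os
from typing import Optional, Union


def read_env_token(provider: dict) -> Optional[str]:
    # `value` holds the NAME of the environment variable, e.g. CACHET_TOKEN.
    return os.environ.get(provider["value"])


def read_static_token(provider: dict) -> Optional[str]:
    return provider.get("value")


def read_aws_token(provider: dict) -> Optional[str]:
    import boto3  # lazy import: only needed when this provider is configured

    client = boto3.client("secretsmanager", region_name=provider["region"])
    response = client.get_secret_value(SecretId=provider["secret_name"])
    # Assumption: SecretString is a JSON object such as {"token": "..."}.
    return json.loads(response["SecretString"]).get(provider["secret_key"])


READERS = {
    "ENVIRONMENT_VARIABLE": read_env_token,
    "TOKEN": read_static_token,
    "AWS_SECRETS_MANAGER": read_aws_token,
}


def resolve_token(token_config: Union[str, list]) -> str:
    """Accept the legacy plain string or a list of providers tried in order."""
    if isinstance(token_config, str):
        return token_config
    for provider in token_config:
        reader = READERS.get(provider["type"])
        if reader is None:
            raise ValueError(f"unknown token provider: {provider['type']!r}")
        try:
            token = reader(provider)
        except Exception:
            # A failing provider counts as "no token", so the next one is tried.
            token = None
        if token:
            return token
    raise RuntimeError("no token provider returned a token")


if __name__ == "__main__":
    config = [
        {"type": "ENVIRONMENT_VARIABLE", "value": "CACHET_TOKEN"},
        {"type": "TOKEN", "value": "my_token"},
    ]
    print(resolve_token(config))
```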

Setting up

The application should be installed using virtualenv, through the following commands:

```bash
$ git clone https://github.com/mtakaki/cachet-url-monitor.git
$ cd cachet-url-monitor
$ virtualenv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ python3 setup.py install
```

To start the agent:

```bash
$ python3 cachet_url_monitor/scheduler.py config.yml
```

Docker

You can run the agent in docker, so you won't need to worry about installing python, virtualenv, or any other dependency into your OS. The Dockerfile is already checked in and it's ready to be used.

You have two choices: check this repo out and build the docker image yourself, or pull the image directly from dockerhub. Either way, you will need to create your own config.yml file and run (it will pull latest):

```bash
$ docker pull mtakaki/cachet-url-monitor
$ docker run --rm -it -v "$PWD":/usr/src/app/config/ mtakaki/cachet-url-monitor
```

If you're going to use a file with a name other than config.yml, you will need to map the local file, like this:

```bash
$ docker run --rm -it -v "$PWD"/my_config.yml:/usr/src/app/config/config.yml:ro mtakaki/cachet-url-monitor
```

Docker compose

Docker compose has been removed from this repo, as it had a dependency on PostgreSQL and slightly complicated the setup. It is now kindly maintained at https://github.com/boonisz/cachet-url-monitor-dc, which makes it easy to spawn CachetHQ with its dependencies and cachet-url-monitor alongside it.

Generating configuration from existing CachetHQ instance (since 0.6.2)

In order to expedite the creation of your configuration file, you can use the client to automatically scrape the CachetHQ instance and spit out a YAML file. It can be used like this:

```bash
$ python cachet_url_monitor/client.py http://localhost/api/v1 my-token test.yml
```

Or from docker (you will end up with a test.yml in your $PWD/tmp folder):

```bash
$ docker run --rm -it -v $PWD/tmp:/home/tmp/ mtakaki/cachet-url-monitor python3.7 ./cachet_url_monitor/client.py http://localhost/api/v1 my-token /home/tmp/test.yml
```

The arguments are:

  • URL, the CachetHQ API URL, so that means appending /api/v1 to your hostname.
  • token, the token that has access to your CachetHQ instance.
  • filename, the file where it should write the configuration.

Caveats

Because we can't predict what expectations will be needed, the generated configuration defaults to the following behavior:

  • Verify a [200-300[ HTTP status range.
  • If the status check fails, make the incident major and public.
  • Frequency of 30 seconds.
  • GET request.
  • Timeout of 1s.
  • We'll read the link field from the components and use it as the URL.
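
As a rough picture of what one generated entry could look like, the sketch below builds an endpoint dictionary from a single component using only the defaults listed above. The function `default_endpoint` is hypothetical, and it assumes the component payload exposes `id`, `name` and `link` fields; settings the caveats do not mention (such as action) are left out rather than guessed.

```python
def default_endpoint(component: dict) -> dict:
    """Build an endpoint entry from a Cachet component using the listed defaults."""
    return {
        "name": component["name"],
        # The component's link field is used as the monitored URL.
        "url": component["link"],
        "method": "GET",
        "timeout": 1,  # seconds
        "expectation": [
            # [200-300[ range; a failure is reported as a major incident.
            {"type": "HTTP_STATUS", "status_range": "200-300", "incident": "MAJOR"},
        ],
        "component_id": component["id"],
        "public_incidents": True,
        "frequency": 30,  # seconds
    }


if __name__ == "__main__":
    print(default_endpoint({"id": 3, "name": "API", "link": "https://example.com/health"}))
```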

Troubleshooting

SSLERROR

If it's throwing the following exception:

```python
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='redacted', port=443): Max retries exceeded with url: /api/v1/components/19 (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)'),))
```

It can be resolved by pointing the CA bundle environment variable REQUESTS_CA_BUNDLE at your certificate file. It can be set either in your python environment, before running this tool, or in your docker container.
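
If you start the tool from your own Python wrapper, the variable can also be set programmatically before any request is made. The helper below is a hypothetical sketch (`use_ca_bundle` is not part of this project); it only checks that the file exists and exports the variable for the current process.

```python
import os


def use_ca_bundle(path: str) -> None:
    """Export REQUESTS_CA_BUNDLE for this process after checking the file exists."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"CA bundle not found: {path}")
    # The requests library consults this variable when verifying certificates.
    os.environ["REQUESTS_CA_BUNDLE"] = path


if __name__ == "__main__":
    # Hypothetical path; replace it with the location of your certificate file.
    bundle = "/etc/ssl/certs/my-internal-ca.pem"
    try:
        use_ca_bundle(bundle)
    except FileNotFoundError as error:
        print(error)
```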

Development

If you want to contribute to this project, feel free to fork this repo and post PRs with any improvements or bug fixes. This is highly appreciated, as it's been hard to deal with the numerous requests coming my way.

This repo is set up with pre-commit hooks, which should ensure the code style is consistent. The steps to start development on this repo are similar to the setup described above:

```bash
$ git clone https://github.com/mtakaki/cachet-url-monitor.git
$ cd cachet-url-monitor
$ pre-commit install
$ virtualenv venv
$ source venv/bin/activate
$ tox
```

Issues

Bump ipython from 7.16.3 to 8.10.0

opened on 2023-02-10 23:09:55 by dependabot[bot]

Bumps ipython from 7.16.3 to 8.10.0.

Release notes

Sourced from ipython's releases.

See https://pypi.org/project/ipython/

We do not use GitHub release anymore. Please see PyPI https://pypi.org/project/ipython/

Commits
  • 15ea1ed release 8.10.0
  • 560ad10 DOC: Update what's new for 8.10 (#13939)
  • 7557ade DOC: Update what's new for 8.10
  • 385d693 Merge pull request from GHSA-29gw-9793-fvw7
  • e548ee2 Swallow potential exceptions from showtraceback() (#13934)
  • 0694b08 MAINT: mock slowest test. (#13885)
  • 8655912 MAINT: mock slowest test.
  • a011765 Isolate the attack tests with setUp and tearDown methods
  • c7a9470 Add some regression tests for this change
  • fd34cf5 Swallow potential exceptions from showtraceback()
  • Additional commits viewable in compare view



avoid random endpoints header to activate insecure flag

opened on 2022-11-09 14:42:59 by marctanguy

If header is not present in an endpoint configuration, the insecure flag is not taken into account.

Sorting the endpoints based on status

opened on 2022-03-21 15:31:31 by monu56

Hi, I am looking for a way to make our Cachet status page sort the components by status (for example, a URL with an outage should be listed first in the status page UI). I followed the documentation, but it is not clear where exactly to make the changes. Can anyone help, please?

Ref: https://docs.cachethq.io/docs/advanced-api-usage

I think I should make the changes in the api_url section of the config.yml file, but that gives me an error:

Config_File: capture

Error:
Traceback (most recent call last):
  File "cachet_url_monitor/scheduler.py", line 160, in <module>
    configuration = Configuration(config_data, endpoint_index, client, webhooks)
  File "/home/cachet/py3/lib64/python3.7/site-packages/cachet_url_monitor-0.6.11-py3.7.egg/cachet_url_monitor/configuration.py", line 105, in __init__
    self.status = self.client.get_component_status(self.component_id)
  File "/home/cachet/py3/lib64/python3.7/site-packages/cachet_url_monitor-0.6.11-py3.7.egg/cachet_url_monitor/client.py", line 82, in get_component_status
    return status.ComponentStatus(int(get_status_request.json()["data"]["status"]))
TypeError: list indices must be integers or slices, not str

Do not change previous_status if self.trigger_update is false (fix #126)

opened on 2022-02-26 17:15:57 by htamg

Add data for method POST

opened on 2021-12-05 09:35:10 by minhng99

Supervising Database Status

opened on 2021-10-23 20:18:32 by schroedt

Hi there,

Is there any example for supervising databases, such as MariaDB or PostgreSQL? I think this would be useful for the documentation (quick-start).

Further, some recommendations for supervising systems would be nice, e.g.:

  • HTTP 200-300 check for general availability, with incident: MAJOR
  • HTTP 200-300 check using latency, with incident: PERFORMANCE
  • etc.

Is something like that available?

Releases

Support to different token providers 2020-05-19 08:46:00

As part of #99, cachet-url-monitor now supports different token providers. It included refactoring, so it can be easily extended to support new sources.

Still backwards compatible with old configuration files.

Improvements 2020-05-01 15:48:31

Thank you @nijel for the great new additions!

  • Removing unused token from Configuration.
  • Adding support to webhooks.
  • Improving incident handling with customizable incident message.

Numerous bug fixes 2020-04-29 15:31:58

Thanks to @nijel, we got numerous bug fixes:

  • Removing usage of scheduler and relying on thread sleep.
  • Incorrect location of latency unit.
  • Fixing the metrics push that was pushing it twice.

Major bug fix 2020-01-29 10:05:14

Accidentally introduced a bug when moving code from configuration.py to scheduler.py (#81). Kindly fixed by @chris-str-cst through #82.

Major bug fix 2020-01-28 09:43:32

Fixing bug introduced in 0.6.2 when a CachetClient class was created and it was preventing it from updating component status. It wasn't calling ComponentStatus.value.

Upgrading dependencies 2020-01-19 22:29:28

Upgrading dependencies:

  • PyYAML 5.3

Dev requirements:

  • coverage 5.0.3
  • coveralls 1.10.0
  • mock 3.0.5
  • pytest 5.3.3

And finally, fixing the code coverage metrics, which were broken because pytest was running against the installed package rather than the source code.
