A command line tool for Looker instance cleanup

looker-open-source, updated 2023-02-11 01:41:03


Henry: A Looker Cleanup Tool

Henry is a command line tool that helps determine model bloat in your Looker instance and identify unused content in models and explores. It is meant to help developers clean up models by removing unused explores, clean up explores by removing unused joins and fields, and maintain a healthy and user-friendly instance.

Status and Support

Henry is NOT supported or warranted by Looker in any way. Please do not contact Looker support for issues with Henry. Issues can be logged via https://github.com/looker-open-source/henry/issues

Installation

Henry requires Python 3.7+. It is published on PyPI and can be installed using pip:

$ pip install henry

For development setup, follow the Development setup below.

Usage

In order to display usage information, use:

$ henry --help

Global Options that apply to many commands

Authentication

Henry makes use of the Looker SDK to issue API calls and requires API3 credentials. These can be provided either using an .ini file or environment variables as documented here. By default, the tool looks for a "looker.ini" file in the working directory. If the configuration file is named differently or located elsewhere, it must be specified using the --config-file argument.

Example .ini file:

```
[Looker]
# Base URL for API. Do not include /api/* in the url
base_url=https://self-signed.looker.com:19999
# API 3 client id
client_id=YourClientID
# API 3 client secret
client_secret=YourClientSecret
# Set to false if testing locally against self-signed certs. Otherwise leave True
verify_ssl=True

[Production]
base_url=https://production.looker.com:19999
client_id=YourClientID
client_secret=YourClientSecret
verify_ssl=True
```

Assuming the above ini file contents, Henry can be run as follows:

$ henry pulse --config-file=looker.ini --section=Looker

which, due to defaults, is equivalent to

$ henry pulse
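The equivalence of the two invocations follows from argument defaults. The sketch below is a hypothetical parser that only mirrors the defaults documented in this README (looker.ini, the Looker section, a 120 second timeout); it is not Henry's actual argument handling, and the program name henry-sketch is made up for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser that mirrors the documented defaults.
    # This is an illustration, not Henry's real CLI code.
    parser = argparse.ArgumentParser(prog="henry-sketch")
    parser.add_argument("command", choices=["pulse", "analyze", "vacuum"])
    parser.add_argument("--config-file", default="looker.ini")
    parser.add_argument("--section", default="Looker")
    # Default documented under "API timeout settings".
    parser.add_argument("--timeout", type=int, default=120)
    return parser


parser = build_parser()
explicit = parser.parse_args(["pulse", "--config-file=looker.ini", "--section=Looker"])
implicit = parser.parse_args(["pulse"])

# Both namespaces hold the same values, so the two calls behave the same.
print(explicit == implicit)
```

Passing --section=Production to the same parser would change only the section value and leave the other defaults in place.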

Running it using the details under the Production section can be done as follows:

$ henry pulse --section=Production
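To see what selecting a section amounts to, here is a minimal sketch that reads the example file with Python's standard configparser module. The helper name load_section is hypothetical, and the Looker SDK performs its own parsing, so this only illustrates the idea of picking one named block of credentials.

```python
import configparser

INI_TEXT = """
[Looker]
base_url=https://self-signed.looker.com:19999
client_id=YourClientID
client_secret=YourClientSecret
verify_ssl=True

[Production]
base_url=https://production.looker.com:19999
client_id=YourClientID
client_secret=YourClientSecret
verify_ssl=True
"""


def load_section(ini_text, section="Looker"):
    """Return the settings of one section as a dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError(f"section {section!r} not found")
    settings = dict(parser[section])
    # Convert the textual flag into a real boolean.
    settings["verify_ssl"] = parser.getboolean(section, "verify_ssl")
    return settings


print(load_section(INI_TEXT, "Production")["base_url"])
```

Raising on a missing section makes a mistyped --section value fail early instead of silently falling back to other credentials.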

API timeout settings

By default, API calls have a timeout of 120 seconds. This can be overridden using the --timeout argument.

Output to File

If the --save flag is used, the tool saves the results to your current working directory. Example usage:

$ henry vacuum models --save

saves the results in vacuum_models_{date}_{time}.csv in the current working directory.
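A saved file can then be picked up by a script. The sketch below finds the most recently modified file matching the documented vacuum_models_{date}_{time}.csv pattern and reads it with the standard csv module. It assumes the file starts with a header row, which this README does not state, and both helper names are hypothetical.

```python
import csv
from pathlib import Path


def latest_saved_result(directory=".", prefix="vacuum_models"):
    """Return the newest file matching <prefix>_*.csv, or None if there is none."""
    candidates = sorted(
        Path(directory).glob(f"{prefix}_*.csv"),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


def read_rows(path):
    """Read a saved result into a list of dicts (assumes a header row)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


newest = latest_saved_result()
if newest is not None:
    print(newest.name, len(read_rows(newest)))
```

Matching on a wildcard avoids guessing the exact date and time format used in the file name.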

Pulse Command

The command henry pulse runs a number of tests that help determine the overall instance health.

Analyze Command

The analyze command helps identify models and explores that have become bloated, so that vacuum can then be used to trim them.

analyze projects

The analyze projects command scans projects for their content and checks the status of features that are essential for success, such as the git connection status and validation requirements.

+-------------------+----------+--------------+-----------------------+----------+------------------------+
| Project           | # Models | # View Files | Git Connection Status | PR Mode  | Is Validation Required |
|-------------------+----------+--------------+-----------------------+----------+------------------------|
| marketing         | 1        | 13           | OK                    | links    | True                   |
| admin             | 2        | 74           | OK                    | off      | True                   |
| powered_by_looker | 1        | 14           | OK                    | links    | True                   |
| salesforce        | 1        | 36           | OK                    | required | False                  |
| thelook_event     | 1        | 17           | OK                    | required | True                   |
+-------------------+----------+--------------+-----------------------+----------+------------------------+

analyze models

Shows the number of explores in each model as well as the number of queries against that model.

+-------------------+------------------+------------+-------------------+-------------+
| Project           | Model            | # Explores | # Unused Explores | Query Count |
|-------------------+------------------+------------+-------------------+-------------|
| salesforce        | salesforce       | 8          | 0                 | 39923       |
| thelook_event     | thelook          | 10         | 0                 | 166307      |
| powered_by_looker | powered_by       | 5          | 0                 | 49122       |
| marketing         | thelook_adwords  | 3          | 0                 | 40869       |
| admin             | looker_base      | 0          | 0                 | 0           |
| admin             | looker_on_looker | 10         | 9                 | 28          |
+-------------------+------------------+------------+-------------------+-------------+

analyze explores

Shows explores and their usage. If the --min-queries argument is passed, joins and fields that have been queried fewer times than the specified threshold are considered unused.

+---------+-----------------------------------------+-----------+-----------------+---------+----------------+----------+-----------------+-------------+
| Model   | Explore                                 | Is Hidden | Has Description | # Joins | # Unused Joins | # Fields | # Unused Fields | Query Count |
|---------+-----------------------------------------+-----------+-----------------+---------+----------------+----------+-----------------+-------------|
| thelook | cohorts                                 | True      | False           | 3       | 0              | 19       | 4               | 333         |
| thelook | data_tool                               | True      | False           | 3       | 0              | 111      | 90              | 736         |
| thelook | order_items                             | False     | True            | 7       | 0              | 153      | 16              | 126898      |
| thelook | events                                  | False     | True            | 6       | 0              | 167      | 68              | 19372       |
| thelook | sessions                                | False     | False           | 6       | 0              | 167      | 83              | 12205       |
| thelook | affinity                                | False     | False           | 2       | 0              | 34       | 13              | 3179        |
| thelook | orders_with_share_of_wallet_application | False     | True            | 9       | 0              | 161      | 140             | 1586        |
| thelook | journey_mapping                         | False     | False           | 11      | 2              | 238      | 228             | 14          |
| thelook | inventory_snapshot                      | False     | False           | 3       | 0              | 25       | 15              | 33          |
| thelook | kitten_order_items                      | True      | False           | 8       | 0              | 154      | 138             | 39          |
+---------+-----------------------------------------+-----------+-----------------+---------+----------------+----------+-----------------+-------------+
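Output in this tabular form can be post-processed to decide where to start trimming. The following is a minimal sketch, not part of Henry, that parses such a table and ranks explores by their share of unused fields. The function names are hypothetical, and the sample is a trimmed copy of three rows from the table above.

```python
def parse_table(text):
    """Parse a pipe-delimited ASCII table into a list of dicts."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        # Skip border lines (+---+) and the header separator (|---+---|).
        if not line.startswith("|") or line.startswith("|-"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unused_field_ratio(row):
    """Share of an explore's fields that are reported as unused."""
    total = int(row["# Fields"])
    return int(row["# Unused Fields"]) / total if total else 0.0


SAMPLE = """
+---------+-----------------+----------+-----------------+
| Model   | Explore         | # Fields | # Unused Fields |
|---------+-----------------+----------+-----------------|
| thelook | cohorts         | 19       | 4               |
| thelook | journey_mapping | 238      | 228             |
| thelook | order_items     | 153      | 16              |
+---------+-----------------+----------+-----------------+
"""

ranked = sorted(parse_table(SAMPLE), key=unused_field_ratio, reverse=True)
for row in ranked:
    print(row["Explore"], f"{unused_field_ratio(row):.0%}")
```

Ranking by ratio rather than by raw count keeps small explores with mostly dead fields from being hidden behind large ones.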

Vacuum Information

The vacuum command outputs a list of unused content, based on predefined criteria, that a developer can then use to clean up models and explores.

vacuum models

The vacuum models command exposes models and the number of queries against them over a predefined period of time. Explores that are listed here have not had the minimum number of queries against them in the timeframe specified. As a result, it is safe to hide them and later delete them.

+------------------+---------------------------------------------+-------------------+
| Model            | Explore                                     | Model Query Count |
|------------------+---------------------------------------------+-------------------|
| salesforce       |                                             | 39450             |
| thelook          |                                             | 164930            |
| powered_by       |                                             | 49453             |
| thelook_adwords  |                                             | 38108             |
| looker_on_looker | user_full                                   | 27                |
|                  | history_full                                |                   |
|                  | content_view                                |                   |
|                  | project_status                              |                   |
|                  | field_usage_full                            |                   |
|                  | dashboard_performance_full                  |                   |
|                  | user_weekly_app_activity_period_over_period |                   |
|                  | pdt_state                                   |                   |
|                  | user_daily_query_activity                   |                   |
+------------------+---------------------------------------------+-------------------+

vacuum explores

The vacuum explores command exposes joins and fields whose query count is below or equal to the minimum number of queries threshold (default: 0, can be changed using the --min-queries argument) over the specified timeframe (default: 90 days, can be changed using the --timeframe argument).
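The threshold rule can be sketched as follows. This is an illustration of the "below or equal to" comparison described above, not Henry's implementation: the function name and the query counts are made up, and the timeframe is assumed to have been applied already when the counts were gathered.

```python
def find_unused(query_counts, min_queries=0):
    """Return names whose query count is at or below the threshold."""
    return sorted(
        name for name, count in query_counts.items() if count <= min_queries
    )


# Hypothetical query counts gathered over the chosen timeframe.
counts = {
    "users.id": 0,
    "order_items.id": 0,
    "order_items.total_sale_price": 3,
    "users.age": 12,
}

print(find_unused(counts))                 # default threshold of 0
print(find_unused(counts, min_queries=5))  # stricter threshold
```

With the default of 0, only fields that were never queried are reported; raising the threshold also flags rarely used fields.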

Example: from the analyze function run above, we know that the cohorts explore has 4 fields that haven't been queried once in the past 90 days. Running the following vacuum command:

$ henry vacuum explores --model thelook --explore cohorts

provides the name of the unused fields:

+---------+---------+--------------+------------------------------+
| Model   | Explore | Unused Joins | Unused Fields                |
|---------+---------+--------------+------------------------------|
| thelook | cohorts | users        | users.id                     |
|         |         |              | order_items.id               |
|         |         |              | order_items.id               |
|         |         |              | order_items.total_sale_price |
+---------+---------+--------------+------------------------------+

If a join is unused, this implies that the fields introduced by that join haven't been used during the defined timeframe. For this reason, fields exposed as a result of that join are not explicitly listed as unused fields.

It is very important to note that fields listed as unused in one explore are not meant to be completely removed from view files altogether because they might be used in other explores (via extensions), or filters. Instead, one should either hide those fields (if they're not used anywhere else) or exclude them from the explore using the fields LookML parameter.
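As an illustration of the second option, the sketch below turns a list of unused field names into an explore-level fields parameter that keeps everything except those fields. The helper name is hypothetical, and the generated line still has to be reviewed and placed in the right explore by hand.

```python
def exclusion_parameter(unused_fields):
    """Build a LookML fields parameter that excludes the given fields."""
    # Remove duplicates while keeping the original order.
    unique = list(dict.fromkeys(unused_fields))
    if not unique:
        return "fields: [ALL_FIELDS*]"
    excluded = ", ".join(f"-{name}" for name in unique)
    return f"fields: [ALL_FIELDS*, {excluded}]"


unused = [
    "users.id",
    "order_items.id",
    "order_items.id",
    "order_items.total_sale_price",
]
print(exclusion_parameter(unused))
```

Excluding at the explore level leaves the view files untouched, so other explores that rely on the same fields keep working.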

Contributing

Please refer to the CONTRIBUTING file. Bug reports and pull requests are welcome on GitHub at https://github.com/looker-open-source/henry/issues.

Code of Conduct

Everyone interacting in the Henry project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

Copyright

Copyright (c) 2018 Joseph Axisa for Looker Data Sciences. See MIT License for further details.

Issues

Bump ipython from 8.2.0 to 8.10.0

opened on 2023-02-11 01:41:03 by dependabot[bot]

Bumps ipython from 8.2.0 to 8.10.0.

Commits
  • 15ea1ed release 8.10.0
  • 560ad10 DOC: Update what's new for 8.10 (#13939)
  • 7557ade DOC: Update what's new for 8.10
  • 385d693 Merge pull request from GHSA-29gw-9793-fvw7
  • e548ee2 Swallow potential exceptions from showtraceback() (#13934)
  • 0694b08 MAINT: mock slowest test. (#13885)
  • 8655912 MAINT: mock slowest test.
  • a011765 Isolate the attack tests with setUp and tearDown methods
  • c7a9470 Add some regression tests for this change
  • fd34cf5 Swallow potential exceptions from showtraceback()
  • Additional commits viewable in compare view


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
- `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
- `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
- `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
- `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/looker-open-source/henry/network/alerts).

Bump cryptography from 37.0.1 to 39.0.1

opened on 2023-02-08 03:22:24 by dependabot[bot]

Bumps cryptography from 37.0.1 to 39.0.1.

Changelog

Sourced from cryptography's changelog.

39.0.1 - 2023-02-07


  • SECURITY ISSUE - Fixed a bug where Cipher.update_into accepted Python buffer protocol objects, but allowed immutable buffers. CVE-2023-23931
  • Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL 3.0.8.

39.0.0 - 2023-01-01

  • BACKWARDS INCOMPATIBLE: Support for OpenSSL 1.1.0 has been removed. Users on older version of OpenSSL will need to upgrade.
  • BACKWARDS INCOMPATIBLE: Dropped support for LibreSSL < 3.5. The new minimum LibreSSL version is 3.5.0. Going forward our policy is to support versions of LibreSSL that are available in versions of OpenBSD that are still receiving security support.
  • BACKWARDS INCOMPATIBLE: Removed the encode_point and from_encoded_point methods on cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicNumbers, which had been deprecated for several years. cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicKey.public_bytes and cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicKey.from_encoded_point should be used instead.
  • BACKWARDS INCOMPATIBLE: Support for using MD5 or SHA1 in cryptography.x509.CertificateBuilder, other X.509 builders, and PKCS7 has been removed.
  • BACKWARDS INCOMPATIBLE: Dropped support for macOS 10.10 and 10.11, macOS users must upgrade to 10.12 or newer.
  • ANNOUNCEMENT: The next version of cryptography (40.0) will change the way we link OpenSSL. This will only impact users who build cryptography from source (i.e., not from a wheel), and specify their own version of OpenSSL. For those users, the CFLAGS, LDFLAGS, INCLUDE, LIB, and CRYPTOGRAPHY_SUPPRESS_LINK_FLAGS environment variables will no longer be respected. Instead, users will need to configure their builds as documented here.
  • Added support for disabling the legacy provider in OpenSSL 3.0.x.
  • Added support for disabling RSA key validation checks when loading RSA keys via cryptography.hazmat.primitives.serialization.load_pem_private_key, cryptography.hazmat.primitives.serialization.load_der_private_key, and cryptography.hazmat.primitives.asymmetric.rsa.RSAPrivateNumbers.private_key. This speeds up key loading but is unsafe if you are loading potentially attacker supplied keys.
  • Significantly improved performance for cryptography.hazmat.primitives.ciphers.aead.ChaCha20Poly1305

... (truncated)


Bump certifi from 2021.10.8 to 2022.12.7

opened on 2022-12-09 04:37:32 by dependabot[bot]

Bumps certifi from 2021.10.8 to 2022.12.7.


Fix Pulse and update Henry to work with recent Looker system_activity model changes

opened on 2022-10-19 14:12:55 by JorickvdHoeven

Problem Statement

Henry has been having two large issues making it difficult to use:

  1. The Pulse command is failing because of an issue in the Looker SDK with the try_connection command.
  2. The queries were running against the i__looker explore which seems to be getting deprecated or seems to be inaccessible to users who aren't admins.

These two problems impact Henry's ability to run and make it difficult to use. We need to fix these so that people can continue to use Henry to administer their Looker instances.

Proposed solution

  1. Pulse command not working:

    • The pulse command fails because the i__looker explore is deprecated and the SDK's try_connection call is broken. We need to catch the exception generated by the SDK and update the queries to use the system_activity explore instead of the i__looker explore.
  2. Analyze queries not working:

    • i__looker is an explore that is either deprecated or limited to admins. Since not all users are admins, we should switch to the system_activity explore and update the fields used accordingly. An added benefit is that the system_activity explore (on a sample of 1 instance) runs much faster than i__looker.

Error whilst henry pulse - other henry commands work fine

opened on 2022-07-21 14:40:20 by haengeunc

henry analyze and vacuum work fine, but henry pulse errors with the following message:

raise error.SDKError(response.value.decode(encoding=encoding))
looker_sdk.error.SDKError: 502 Server Error

Error: Server Error

The server encountered a temporary error and could not complete your request.

Please try again in 30 seconds.

'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

opened on 2022-06-22 09:55:58 by moseleyi

I created a looker.ini file in the working directory:

[main]
base_url=https://<c>.looker.com/:19999
client_id=<client_id>
client_secret=<client_secret>
verify_ssl=True

I tried the simple command henry pulse --section=main but got the following error:

Traceback (most recent call last):
  File "C:\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Python38\Scripts\henry.exe\__main__.py", line 4, in <module>
  File "C:\Python38\lib\site-packages\henry\cli.py", line 33, in <module>
    logging.config.fileConfig(LOGGING_CONFIG_PATH,
  File "C:\Python38\lib\logging\config.py", line 79, in fileConfig
    handlers = _install_handlers(cp, formatters)
  File "C:\Python38\lib\logging\config.py", line 142, in _install_handlers
    args = eval(args, vars(logging))
  File "<string>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
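The traceback shows the error being raised while logging.config evaluates a handler's args string. One plausible cause, offered here as an assumption rather than a confirmed diagnosis, is a Windows path in that string whose \U sequence (as in C:\Users) is read as the start of a Unicode escape. The sketch below reproduces that mechanism with a hypothetical log path and shows two spellings that avoid it.

```python
def evaluate_args(args_text):
    """Evaluate a handler args string, similar to the eval call in the traceback."""
    try:
        return eval(args_text)
    except SyntaxError as exc:
        return exc


# Hypothetical args strings as they might appear in a logging config file.
broken = r'("C:\Users\me\henry.log",)'    # \U starts a Unicode escape
raw = r'(r"C:\Users\me\henry.log",)'      # raw string literal keeps backslashes
slashes = '("C:/Users/me/henry.log",)'    # forward slashes need no escaping

for label, text in [("broken", broken), ("raw", raw), ("slashes", slashes)]:
    print(label, repr(evaluate_args(text)))
```

If this is indeed the cause, the fix would belong in how the log file path is written into the logging configuration rather than in looker.ini.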

Releases

2019-04-18 16:52:39

2019-04-05 18:18:29

Looker Open Source

A collection of open source tools built on Looker's platform.
