Henry is a command line tool that helps determine model bloat in your Looker instance and identify unused content in models and explores. It is meant to help developers clean up models by removing unused explores, and explores by removing unused joins and fields, as well as maintain a healthy and user-friendly instance.
Henry is NOT supported or warranted by Looker in any way. Please do not contact Looker support for issues with Henry. Issues can be logged via https://github.com/looker-open-source/henry/issues
Henry requires Python 3.7+. It is published on PyPI and can be installed using pip:
$ pip install henry
For development setup, follow the Development setup below.
In order to display usage information, use:
$ henry --help
Henry makes use of the Looker SDK to issue API calls and requires API3 credentials. These can be provided using either an .ini file or environment variables, as documented in the Looker SDK documentation. By default, the tool looks for a "looker.ini" file in the working directory. If the configuration file is named differently or located elsewhere, it must be specified using the --config-file argument.
Example .ini file:
```
[Looker]
base_url=https://self-signed.looker.com:19999
client_id=YourClientID
client_secret=YourClientSecret
verify_ssl=True

[Production]
base_url=https://production.looker.com:19999
client_id=YourClientID
client_secret=YourClientSecret
verify_ssl=True
```
Assuming the above ini file contents, Henry can be run as follows:
$ henry pulse --config-file=looker.ini --section=Looker
which due to defaults, is equivalent to
$ henry pulse
Running it using the details under the Production
section can be done as follows:
$ henry pulse --section=Production
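As a rough illustration of how a section name selects one set of credentials, the following sketch parses the example file with Python's standard configparser module. It is not Henry's or the Looker SDK's actual loading code; the helper name load_section is hypothetical.

```python
import configparser

# Same contents as the example .ini file above.
EXAMPLE_INI = """
[Looker]
base_url=https://self-signed.looker.com:19999
client_id=YourClientID
client_secret=YourClientSecret
verify_ssl=True

[Production]
base_url=https://production.looker.com:19999
client_id=YourClientID
client_secret=YourClientSecret
verify_ssl=True
"""


def load_section(ini_text: str, section: str = "Looker") -> dict:
    """Return the settings of one section as a dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError(f"section {section!r} not found in {parser.sections()}")
    settings = dict(parser[section])
    # Values are stored as text, so convert verify_ssl to a real boolean.
    settings["verify_ssl"] = parser.getboolean(section, "verify_ssl")
    return settings


default_settings = load_section(EXAMPLE_INI)  # like --section=Looker
production_settings = load_section(EXAMPLE_INI, "Production")
print(default_settings["base_url"])
print(production_settings["base_url"])
```

Asking for a section that does not exist raises a KeyError listing the sections that were found, which is usually the quickest way to spot a misspelled --section value.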
By default, API calls have a timeout of 120 seconds. This can be overridden using the --timeout argument.
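The defaults described in this section can be summarized with a small argparse sketch. This is only an illustration of the documented behaviour, not Henry's real argument parser: the program name henry-sketch and the flat command list are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the defaults documented above."""
    parser = argparse.ArgumentParser(prog="henry-sketch")
    parser.add_argument("command", choices=["pulse", "analyze", "vacuum"])
    # Defaults taken from the README text: looker.ini, section Looker, 120 s.
    parser.add_argument("--config-file", default="looker.ini")
    parser.add_argument("--section", default="Looker")
    parser.add_argument("--timeout", type=int, default=120)
    return parser


# No options given: every value falls back to its default.
plain = build_parser().parse_args(["pulse"])
# Options given: only the named values change.
custom = build_parser().parse_args(["pulse", "--section=Production", "--timeout=300"])
print(plain.config_file, plain.section, plain.timeout)
print(custom.config_file, custom.section, custom.timeout)
```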
If the --save flag is used, the tool saves the results to your current working directory. Example usage:
$ henry vacuum models --save
saves the results in vacuum_models_{date}_{time}.csv in the current working directory.
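To make the naming pattern concrete, here is a minimal sketch of writing a result table to a timestamped CSV file. The function save_results is hypothetical, and the exact date and time layout Henry uses is not stated in this README, so the YYYYMMDD and HHMMSS formats below are assumptions.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path


def save_results(command, subcommand, header, rows, directory=".", now=None):
    """Write rows to {command}_{subcommand}_{date}_{time}.csv (hypothetical helper)."""
    now = now or datetime.now()
    # Assumed timestamp layout; Henry's actual format may differ.
    filename = f"{command}_{subcommand}_{now:%Y%m%d}_{now:%H%M%S}.csv"
    path = Path(directory) / filename
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return path


header = ["Model", "Explore", "Model Query Count"]
rows = [["looker_on_looker", "user_full", 27]]

# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_results("vacuum", "models", header, rows,
                         directory=tmp, now=datetime(2023, 2, 7, 9, 30, 0))
    saved_name = saved.name
    saved_text = saved.read_text()

print(saved_name)
```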
The command henry pulse
runs a number of tests that help determine the overall instance health.
The analyze command is meant to help identify models and explores that have become bloated, so that vacuum can then be used to trim them.
The analyze projects command scans projects for their content and checks the status of features that are essential for success, such as the Git connection status and validation requirements.
+-------------------+---------------+--------------+-------------------------+---------------------+------------------------+
| Project | # Models | # View Files | Git Connection Status | PR Mode | Is Validation Required |
|-------------------+---------------+--------------+-------------------------+---------------------+------------------------|
| marketing | 1 | 13 | OK | links | True |
| admin | 2 | 74 | OK | off | True |
| powered_by_looker | 1 | 14 | OK | links | True |
| salesforce | 1 | 36 | OK | required | False |
| thelook_event | 1 | 17 | OK | required | True |
+-------------------+---------------+--------------+-------------------------+---------------------+------------------------+
The analyze models command shows the number of explores in each model as well as the number of queries against that model.
+-------------------+------------------+-----------------+-------------------+-------------------+
| Project | Model | # Explores | # Unused Explores | Query Count |
|-------------------+------------------+-----------------+-------------------+-------------------|
| salesforce | salesforce | 8 | 0 | 39923 |
| thelook_event | thelook | 10 | 0 | 166307 |
| powered_by_looker | powered_by | 5 | 0 | 49122 |
| marketing | thelook_adwords | 3 | 0 | 40869 |
| admin | looker_base | 0 | 0 | 0 |
| admin | looker_on_looker | 10 | 9 | 28 |
+-------------------+------------------+-----------------+-------------------+-------------------+
The analyze explores command shows explores and their usage. If the --min-queries argument is passed, joins and fields whose query count is at or below the specified threshold are considered unused.
+---------+-----------------------------------------+-------------+-------------------+--------------+----------------+---------------+-----------------+---------------+
| Model | Explore | Is Hidden | Has Description | # Joins | # Unused Joins | # Fields | # Unused Fields | Query Count |
|---------+-----------------------------------------+-------------+-------------------+--------------+----------------+---------------+-----------------+---------------|
| thelook | cohorts | True | False | 3 | 0 | 19 | 4 | 333 |
| thelook | data_tool | True | False | 3 | 0 | 111 | 90 | 736 |
| thelook | order_items | False | True | 7 | 0 | 153 | 16 | 126898 |
| thelook | events | False | True | 6 | 0 | 167 | 68 | 19372 |
| thelook | sessions | False | False | 6 | 0 | 167 | 83 | 12205 |
| thelook | affinity | False | False | 2 | 0 | 34 | 13 | 3179 |
| thelook | orders_with_share_of_wallet_application | False | True | 9 | 0 | 161 | 140 | 1586 |
| thelook | journey_mapping | False | False | 11 | 2 | 238 | 228 | 14 |
| thelook | inventory_snapshot | False | False | 3 | 0 | 25 | 15 | 33 |
| thelook | kitten_order_items | True | False | 8 | 0 | 154 | 138 | 39 |
+---------+-----------------------------------------+-------------+-------------------+--------------+----------------+---------------+-----------------+---------------+
The vacuum command outputs a list of unused content based on predefined criteria that a developer can then use to clean up models and explores.
The vacuum models command exposes models and the number of queries against them over a predefined period of time. Explores that are listed here have not had the minimum number of queries against them in the timeframe specified. As a result, it is safe to hide them and later delete them.
+------------------+---------------------------------------------+-------------------------+
| Model | Explore | Model Query Count |
|------------------+---------------------------------------------+-------------------------|
| salesforce | | 39450 |
| thelook | | 164930 |
| powered_by | | 49453 |
| thelook_adwords | | 38108 |
| looker_on_looker | user_full | 27 |
| | history_full | |
| | content_view | |
| | project_status | |
| | field_usage_full | |
| | dashboard_performance_full | |
| | user_weekly_app_activity_period_over_period | |
| | pdt_state | |
| | user_daily_query_activity | |
+------------------+---------------------------------------------+-------------------------+
The vacuum explores command exposes joins and fields whose query count is at or below the minimum-queries threshold (default: 0, can be changed using the --min-queries argument) over the specified timeframe (default: 90 days, can be changed using the --timeframe argument).
Example: from the analyze function run above, we know that the cohorts explore has 4 fields that haven't been queried once in the past 90 days. Running the following vacuum command:
$ henry vacuum explores --model thelook --explore cohorts
provides the name of the unused fields:
+---------+-----------+----------------+------------------------------+
| Model | Explore | Unused Joins | Unused Fields |
|---------+-----------+----------------+------------------------------|
| thelook | cohorts | users | users.id |
| | | | order_items.id |
| | | | order_items.id |
| | | | order_items.total_sale_price |
+---------+-----------+----------------+------------------------------+
If a join is unused, it implies that the fields introduced by that join haven't been used during the defined timeframe. For this reason, fields exposed as a result of that join are not explicitly listed as unused fields.
It is very important to note that fields listed as unused in one explore are not meant to be completely removed from view files altogether because they might be used in other explores (via extensions), or filters. Instead, one should either hide those fields (if they're not used anywhere else) or exclude them from the explore using the fields LookML parameter.
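The threshold and unused-join rules described above can be sketched in a few lines of Python. This is a simplified illustration based only on the description in this README, not Henry's implementation: the function find_unused is hypothetical, the query counts are made up, and every view (including the explore's base view) is treated as a join.

```python
from collections import defaultdict


def find_unused(field_query_counts: dict, min_queries: int = 0):
    """Return (unused_joins, unused_fields) for one explore (hypothetical helper)."""
    # Group fully scoped field names ("view.field") by their view.
    by_view = defaultdict(dict)
    for field, count in field_query_counts.items():
        view, _, _ = field.partition(".")
        by_view[view][field] = count

    unused_joins, unused_fields = [], []
    for view, fields in by_view.items():
        if all(count <= min_queries for count in fields.values()):
            # Every field of the view is at or below the threshold, so the
            # join itself is reported and its fields are not listed again.
            unused_joins.append(view)
        else:
            unused_fields.extend(
                field for field, count in fields.items() if count <= min_queries
            )
    return unused_joins, unused_fields


# Made-up query counts for illustration only.
counts = {
    "users.id": 0,
    "users.age": 0,
    "order_items.id": 0,
    "order_items.total_sale_price": 0,
    "order_items.created_date": 12,
}
joins, fields = find_unused(counts, min_queries=0)
print(joins)
print(fields)
```

Raising min_queries widens the net: with a threshold of 12 in this made-up data, order_items.created_date would also count as unused and order_items would be reported as an unused join.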
Please refer to the CONTRIBUTING file. Bug reports and pull requests are welcome on GitHub at https://github.com/looker-open-source/henry/issues.
Everyone interacting in the Henry project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
Copyright (c) 2018 Joseph Axisa for Looker Data Sciences. See MIT License for further details.
Bumps ipython from 8.2.0 to 8.10.0.
- 15ea1ed release 8.10.0
- 560ad10 DOC: Update what's new for 8.10 (#13939)
- 7557ade DOC: Update what's new for 8.10
- 385d693 Merge pull request from GHSA-29gw-9793-fvw7
- e548ee2 Swallow potential exceptions from showtraceback() (#13934)
- 0694b08 MAINT: mock slowest test. (#13885)
- 8655912 MAINT: mock slowest test.
- a011765 Isolate the attack tests with setUp and tearDown methods
- c7a9470 Add some regression tests for this change
- fd34cf5 Swallow potential exceptions from showtraceback()

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Bumps cryptography from 37.0.1 to 39.0.1.
Sourced from cryptography's changelog.
39.0.1 - 2023-02-07
- **SECURITY ISSUE** - Fixed a bug where ``Cipher.update_into`` accepted Python buffer protocol objects, but allowed immutable buffers. **CVE-2023-23931**
- Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL 3.0.8.

39.0.0 - 2023-01-01
- BACKWARDS INCOMPATIBLE: Support for OpenSSL 1.1.0 has been removed. Users on older version of OpenSSL will need to upgrade.
- BACKWARDS INCOMPATIBLE: Dropped support for LibreSSL < 3.5. The new minimum LibreSSL version is 3.5.0. Going forward our policy is to support versions of LibreSSL that are available in versions of OpenBSD that are still receiving security support.
- BACKWARDS INCOMPATIBLE: Removed the ``encode_point`` and ``from_encoded_point`` methods on :class:`~cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicNumbers`, which had been deprecated for several years. :meth:`~cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicKey.public_bytes` and :meth:`~cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicKey.from_encoded_point` should be used instead.
- BACKWARDS INCOMPATIBLE: Support for using MD5 or SHA1 in :class:`~cryptography.x509.CertificateBuilder`, other X.509 builders, and PKCS7 has been removed.
- BACKWARDS INCOMPATIBLE: Dropped support for macOS 10.10 and 10.11, macOS users must upgrade to 10.12 or newer.
- ANNOUNCEMENT: The next version of ``cryptography`` (40.0) will change the way we link OpenSSL. This will only impact users who build ``cryptography`` from source (i.e., not from a ``wheel``), and specify their own version of OpenSSL. For those users, the ``CFLAGS``, ``LDFLAGS``, ``INCLUDE``, ``LIB``, and ``CRYPTOGRAPHY_SUPPRESS_LINK_FLAGS`` environment variables will no longer be respected. Instead, users will need to configure their builds `as documented here`_.
- Added support for :ref:`disabling the legacy provider in OpenSSL 3.0.x <legacy-provider>`.
- Added support for disabling RSA key validation checks when loading RSA keys via :func:`~cryptography.hazmat.primitives.serialization.load_pem_private_key`, :func:`~cryptography.hazmat.primitives.serialization.load_der_private_key`, and :meth:`~cryptography.hazmat.primitives.asymmetric.rsa.RSAPrivateNumbers.private_key`. This speeds up key loading but is :term:`unsafe` if you are loading potentially attacker supplied keys.
- Significantly improved performance for :class:`~cryptography.hazmat.primitives.ciphers.aead.ChaCha20Poly1305`
... (truncated)
- d6951dc changelog + security fix backport (#8231)
- 138da90 workaround scapy bug in downstream tests (#8218) (#8228)
- 69527bc bookworm is py311 now (#8200)
- 111deef backport main branch CI to 39.0.x (#8153)
- 338a65a 39.0.0 version bump (#7954)
- 84a3cd7 automatically download and upload circleci wheels (#7949)
- 525c0b3 Type annotate release.py (#7951)
- 46d2a94 Use the latest 3.10 release when wheel building (#7953)
- f150dc1 fix CI to work with ubuntu 22.04 (#7950)
- 8867724 fix README for python3 (#7947)
Bumps certifi from 2021.10.8 to 2022.12.7.
- 9e9e840 2022.12.07
- b81bdb2 2022.09.24
- 939a28f 2022.09.14
- aca828a 2022.06.15.2
- de0eae1 Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ...
- b8eb5e9 2022.06.15.1
- 47fb7ab Fix deprecation warning on Python 3.11 (#199)
- b0b48e0 fixes #198 -- update link in license
- 9d514b4 2022.06.15
- 4151e88 Add py.typed to MANIFEST.in to package in sdist (#196)
Henry has been having two large issues making it difficult to use:
1. The Pulse command is failing because of an issue in the Looker SDK with the try_connection command.
2. The queries were running against the i__looker explore, which seems to be getting deprecated or seems to be inaccessible to users who aren't admins.
These two problems impact Henry's ability to run and make it difficult to use. We need to fix these so that people can continue to use Henry to administer their Looker instances.
Pulse command not working:
Analyze queries not working:
henry analyze and vacuum work fine, but henry pulse errors with the following message:
raise error.SDKError(response.value.decode(encoding=encoding))
looker_sdk.error.SDKError:
Please try again in 30 seconds.
I created a looker.ini file in the working directory:
[main]
base_url=https://<c>.looker.com/:19999
client_id=<client_id>
client_secret=<client_secret>
verify_ssl=True
Trying the simple command henry pulse --section=main but getting the following error:
Traceback (most recent call last):
File "C:\Python38\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Python38\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Python38\Scripts\henry.exe\__main__.py", line 4, in <module>
File "C:\Python38\lib\site-packages\henry\cli.py", line 33, in <module>
logging.config.fileConfig(LOGGING_CONFIG_PATH,
File "C:\Python38\lib\logging\config.py", line 79, in fileConfig
handlers = _install_handlers(cp, formatters)
File "C:\Python38\lib\logging\config.py", line 142, in _install_handlers
args = eval(args, vars(logging))
File "<string>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
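One plausible reading of this traceback is that a Windows path such as C:\Users\... ends up unescaped inside the handler args string that logging.config.fileConfig passes to eval, so \U is parsed as the start of a Unicode escape. The sketch below reproduces that failure in isolation and shows two possible ways around it. The path and the args strings are hypothetical; the actual contents of Henry's logging configuration are not shown in this report.

```python
from pathlib import PureWindowsPath

# Hypothetical Windows log path; note the "\U" right after the drive letter.
log_path = r"C:\Users\me\henry.log"

# Assumed shape of the problem: the raw path is pasted into a string that is
# later evaluated as Python source, as fileConfig does with handler args.
naive_args = "('%s',)" % log_path

try:
    eval(naive_args)
    naive_error = None
except SyntaxError as exc:
    naive_error = str(exc)

# Option 1: let repr() escape the backslashes before the string is evaluated.
escaped_args = "(%r,)" % log_path
# Option 2: use forward slashes, which Windows also accepts in file paths.
posix_args = "('%s',)" % PureWindowsPath(log_path).as_posix()

print(naive_error)
print(eval(escaped_args))
print(eval(posix_args))
```

If this is indeed the cause, running Henry from a directory whose path does not contain a backslash followed by U, N, x or u would be a quick way to confirm it before changing any configuration.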