Kubernetes LocalPVs on Steroids
helm install -n kube-system rawfile-csi ./deploy/charts/rawfile-csi/
Create a StorageClass with your desired options:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-sc
provisioner: rawfile.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
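A hedged usage sketch: a PersistentVolumeClaim that consumes the StorageClass above. The claim name and requested size are illustrative; only `storageClassName: my-sc` comes from the manifest above.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc        # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: my-sc  # the StorageClass defined above
  resources:
    requests:
      storage: 8Gi         # illustrative size
```

Because the StorageClass uses volumeBindingMode: WaitForFirstConsumer, the volume is only provisioned once a pod that mounts this claim is scheduled to a node.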
- Volume modes: Filesystem mode, Block mode
- Filesystems: ext4, btrfs, xfs
One might have a couple of reasons to consider using node-based (rather than network-based) storage solutions:

- Performance: Almost no network-based storage solution can keep up with bare-metal disk performance in terms of IOPS/latency/throughput combined. And you'd like to get the best out of the SSD you've got!
- On-premise environment: You might not be able to afford the cost of upgrading all your networking infrastructure to get the best out of your network-based storage solution.
- Complexity: Network-based solutions are distributed systems, and distributed systems are not easy! You might want a system that is easier to understand and to reason about. Also, with less complexity, you can fix unpredicted issues more easily.
Using node-based storage has come a long way since k8s was born. Right now, OpenEBS's hostPath provisioner makes it pretty easy to automatically provision hostPath PVs and use them in your workloads. There are known limitations though:

- You can't monitor volume usage: There are hacky workarounds that run "du" regularly, but that could prove to be a performance killer, since it could put a lot of burden on your CPU and cause your filesystem cache to fill up. Not really good for a production workload.
- You can't enforce hard limits on your volume's size: Again, you can hack your way around it, with the same caveats.
- You are stuck with whatever filesystem your kubelet node is offering.
- You can't customize your filesystem.
All these issues stem from the same root cause: hostPath/LocalPVs are simple bind-mounts from the host filesystem into the pod.
The idea here is to use a single file as the block device, using Linux's loop device, and create a volume based on it. That way:

- You can monitor volume usage by running df in O(1), since devices are mounted separately.
- The size limit is enforced by the operating system, based on the backing file size.
- Since volumes are backed by different files, each file could be formatted using different filesystems, and/or customized with different filesystem options.
Some disk backends support the discard feature to release unused blocks on the underlying storage. When a large "rawfile" is created on such storage, mkfs takes a long time discarding blocks while formatting. This patch adds the -E nodiscard / -K parameters to mkfs.xxx to accelerate the "rawfile" formatting phase.
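For illustration, these are the standard mkfs flags that skip the discard pass (the device path is illustrative; the exact flags the patch wires up per filesystem may differ):

```sh
mkfs.ext4  -E nodiscard /dev/loop0   # ext4: skip discarding blocks at format time
mkfs.xfs   -K           /dev/loop0   # xfs: -K disables discard
mkfs.btrfs -K           /dev/loop0   # btrfs: -K (--nodiscard) skips the trim
```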
The write_text function used to write the config file is not atomic:
https://stackoverflow.com/questions/73883435/is-python-3-path-write-text-from-pathlib-atomic
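A hedged sketch of one common fix, not the driver's actual code: write to a temporary file in the same directory and rename it over the target, so readers never see a partially written config (the helper name is made up):

```python
import os
import tempfile
from pathlib import Path

def atomic_write_text(path: Path, data: str) -> None:
    """Illustrative atomic replacement for Path.write_text()."""
    # Create the temp file next to the target so os.replace() stays on one filesystem.
    with tempfile.NamedTemporaryFile("w", dir=path.parent, delete=False) as tmp:
        tmp.write(data)
        tmp.flush()
        os.fsync(tmp.fileno())   # make sure the bytes hit the disk
    os.replace(tmp.name, path)   # atomic rename on POSIX filesystems
```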
Looks like XFS added reflink support; this can be leveraged to support snapshots for XFS:
https://blogs.oracle.com/linux/post/xfs-data-block-sharing-reflink
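A quick sketch of the underlying mechanism (paths are illustrative; this is not a claim about how the driver would implement snapshots):

```sh
# XFS with reflink enabled (the default on recent xfsprogs)
mkfs.xfs -m reflink=1 /dev/loop0
mount /dev/loop0 /mnt/vol-1

# A reflinked copy shares data blocks with the original until either side is modified
cp --reflink=always /mnt/vol-1/data.img /mnt/vol-1/data.snapshot.img
```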
To reclaim space on the host one should regularly issue fstrim on the filesystem. Looks like recent kernels pass this through correctly for loopback devices as well.
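For example (the mount point is illustrative), either a one-off trim or a periodic one via systemd's fstrim.timer:

```sh
fstrim -v /mnt/vol-1                    # report how many bytes were trimmed
systemctl enable --now fstrim.timer     # periodic trim of eligible mounted filesystems
```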
Bumps certifi from 2021.5.30 to 2022.12.7.
Looks like the device path is passed instead of the mountpoint.