Simplified diarization pipeline using some pretrained models.
Made to be as simple as possible to go from an input audio file to diarized segments.
```python
import soundfile as sf
import matplotlib.pyplot as plt

from simple_diarizer.diarizer import Diarizer
from simple_diarizer.utils import combined_waveplot

diar = Diarizer(
    embed_model='xvec',   # 'xvec' and 'ecapa' supported
    cluster_method='sc'   # 'ahc' and 'sc' supported
)

segments = diar.diarize(WAV_FILE, num_speakers=NUM_SPEAKERS)

signal, fs = sf.read(WAV_FILE)
combined_waveplot(signal, fs, segments)
plt.show()
```
simple-diarizer is available on PyPI:

```
pip install simple-diarizer
```
"Some Quick Advice from Barack Obama!"
The following pretrained models are used:
It can be checked out at the above link, where it will try to diarize any input YouTube URL.
Hi there!
Nice job! I've just tested this on Windows with Python 3.10 and it works fine (once soundfile is installed). Just one query: it would be great if the tool could cluster the voice embeddings and guess how many speakers there are, since my intended use is to diarize conversations where the number of speakers is unknown beforehand.
Is such a feature in your road map?
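In the meantime, one rough way to guess the speaker count is to sweep candidate cluster counts over the voice embeddings and keep the count with the best silhouette score. This is only a sketch: the `embeds` array and the `guess_num_speakers` helper are hypothetical, not part of simple-diarizer.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score

def guess_num_speakers(embeds, max_speakers=8):
    """Pick the cluster count (>= 2) that maximizes the silhouette score.

    `embeds` is assumed to be an (n_segments, embed_dim) array of
    per-segment voice embeddings.
    """
    best_k, best_score = 2, -1.0
    for k in range(2, min(max_speakers, len(embeds) - 1) + 1):
        labels = AgglomerativeClustering(n_clusters=k).fit_predict(embeds)
        score = silhouette_score(embeds, labels)
        if score > best_score:
            best_k, best_score = k, score
    return best_k
```

This brute-force sweep is slow for long recordings, but it needs no threshold tuning, which makes it a reasonable default when the speaker count is truly unknown.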
I did pip install simple-diarizer
and then ran the demo code sample you have with my own audio file.
But then I got this error: `ModuleNotFoundError: No module named 'soundfile'`
The soundfile Python library isn't listed in your requirements.txt file. Please add it there.
From what I can see, only CUDA and CPU are supported. It would be nice to also include support for MPS (Apple M1 and M2 chips).
I tried it with a 2-minute-long audio file, but the output covered only 1 minute.
Hi, thanks for the great work! When I was running the tutorial notebook on Google Colab, the notebook kept crashing without any notice. Could someone help me figure out why?
I have exposed the kwargs for all of the sklearn-based clustering algorithms so that they can be set from `cluster_SC()`, `cluster_AHC()`, `Diarizer.diarize()`, and the command line.
All kwargs available in the sklearn algorithms should be available. I noted that you have some default values for kwargs and have retained those.
I haven't done comprehensive testing. I won't be offended if you want to change the way it is implemented.
FYI, the reason I did this is that the `'arpack'` eigen solver in `sklearn.cluster.SpectralClustering` falls over when attempting to cluster a large number (>2k) of embeddings. Using the `'lobpcg'` eigen solver appears to address this problem, but the `eigen_solver` kwarg could not previously be set from `Diarizer.diarize()`; now it can.
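For illustration, here is roughly what the sklearn side of this looks like. The embedding array is random stand-in data; only the `eigen_solver` kwarg on `SpectralClustering` reflects the actual change this PR exposes.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Stand-in for a batch of speaker embeddings (n_segments, embed_dim).
embeds = np.random.rand(200, 16)

# 'lobpcg' sidesteps the 'arpack' failures seen with >2k embeddings;
# previously this kwarg could not be passed through Diarizer.diarize().
sc = SpectralClustering(n_clusters=2, eigen_solver='lobpcg', random_state=0)
labels = sc.fit_predict(embeds)
```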
Setting `extra_info` to `True` will now return an additional dict containing cluster labels.
Removed YouTube-related dependencies, keeping the repository slim. There are no longer YouTube helper functions, but the core functionality should now work for Python >= 3.7.
Allowed a newer version of speechbrain, which should fix the issues with pulling from huggingface_hub.