Official implementation of Neural Bellman-Ford Networks (NeurIPS 2021)

DeepGraphLearning, updated 2022-11-04 11:40:03

NBFNet: Neural Bellman-Ford Networks

This is the official codebase of the paper

Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction

Zhaocheng Zhu, Zuobai Zhang, Louis-Pascal Xhonneux, Jian Tang

A PyG re-implementation of NBFNet can be found here.

Overview

NBFNet is a graph neural network framework inspired by traditional path-based methods. It enjoys the advantages of both traditional path-based methods and modern graph neural networks, including generalization in the inductive setting, interpretability, high model capacity and scalability. NBFNet can be applied to solve link prediction on both homogeneous graphs and knowledge graphs.


This codebase is based on PyTorch and TorchDrug. It supports training and inference with multiple GPUs or multiple machines.

Installation

You may install the dependencies via either conda or pip. Generally, NBFNet works with Python 3.7/3.8 and PyTorch version >= 1.8.0.

From Conda

```bash
conda install torchdrug pytorch=1.8.2 cudatoolkit=11.1 -c milagraph -c pytorch-lts -c pyg -c conda-forge
conda install ogb easydict pyyaml -c conda-forge
```

From Pip

```bash
pip install torch==1.8.2+cu111 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
pip install torchdrug
pip install ogb easydict pyyaml
```
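After installing, it can help to confirm that the interpreter and PyTorch fall inside the range stated above (Python 3.7/3.8, PyTorch >= 1.8.0). The following is a minimal sketch, not part of the codebase; the helper names `parse_version` and `check_environment` are hypothetical.

```python
import re
import sys


def parse_version(text):
    """Turn a string such as '1.8.2+cu111' into the tuple (1, 8, 2)."""
    core = text.split("+")[0]  # drop a local build suffix like '+cu111'
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '1.8' to (1, 8, 0)
        numbers.append(0)
    return tuple(numbers)


def check_environment(python_version, torch_version):
    """Return a list of problems; an empty list means the versions fit."""
    problems = []
    major, minor = python_version[0], python_version[1]
    if (major, minor) not in {(3, 7), (3, 8)}:
        problems.append("Python %d.%d is outside 3.7/3.8" % (major, minor))
    if parse_version(torch_version) < (1, 8, 0):
        problems.append("PyTorch %s is older than 1.8.0" % torch_version)
    return problems


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        report = check_environment(sys.version_info, torch.__version__)
        print("\n".join(report) if report else "versions look compatible")
```

The check only compares version numbers; it does not verify that the CUDA toolkit matches the PyTorch build.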

Reproduction

To reproduce the results of NBFNet, use the following command. To run NBFNet on a CPU instead, pass --gpus null. All datasets are downloaded automatically by the code.

```bash
python script/run.py -c config/inductive/wn18rr.yaml --gpus [0] --version v1
```

We provide the hyperparameters for each experiment in configuration files. All the configuration files can be found in config/*/*.yaml.

For experiments on inductive relation prediction, you need to additionally specify the split version with --version v1.
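To run every inductive split in turn, the command line can be assembled programmatically. This is a small sketch that uses only the flags shown in this README (`-c`, `--gpus`, `--version`); the helper `build_command` is hypothetical, and the split names v1 to v4 are taken from the inductive results table.

```python
def build_command(config, gpus=None, version=None):
    """Assemble the argument list for script/run.py.

    gpus=None maps to '--gpus null' (CPU); a list such as [0] maps to '[0]'.
    """
    if gpus is None:
        gpu_arg = "null"
    else:
        gpu_arg = "[" + ",".join(str(g) for g in gpus) + "]"
    args = ["python", "script/run.py", "-c", config, "--gpus", gpu_arg]
    if version is not None:
        args += ["--version", version]
    return args


# Print one command per inductive split of WN18RR.
for version in ["v1", "v2", "v3", "v4"]:
    command = build_command("config/inductive/wn18rr.yaml", gpus=[0], version=version)
    print(" ".join(command))
```

Passing each list to `subprocess.run(command, check=True)` from the repository root would launch the runs one after another.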

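To run NBFNet with multiple GPUs or multiple machines, use the following commands. The first launches 4 processes on a single machine; the second launches 4 processes on each of 4 machines.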

```bash
python -m torch.distributed.launch --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3]
```

```bash
python -m torch.distributed.launch --nnodes=4 --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
```
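Judging from the two commands above, the `--gpus` list appears to hold one local device id per process, with the ids 0..N-1 repeated once per machine. The sketch below generates that string under this assumption; `gpus_argument` is a hypothetical helper, not part of the codebase.

```python
def gpus_argument(num_nodes, gpus_per_node):
    """Repeat the local device ids 0..gpus_per_node-1 once per node."""
    ids = list(range(gpus_per_node)) * num_nodes
    return "[" + ",".join(str(i) for i in ids) + "]"


print(gpus_argument(1, 4))  # single machine, 4 GPUs
print(gpus_argument(4, 4))  # 4 machines, 4 GPUs each
```

The two printed values reproduce the `--gpus` arguments of the single-machine and 4-machine commands respectively.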

Visualize Interpretations on FB15k-237

Once you have models trained on FB15k-237, you can visualize the path interpretations with the following line. Please replace the checkpoint with your own path.

```bash
python script/visualize.py -c config/knowledge_graph/fb15k237_visualize.yaml --checkpoint /path/to/nbfnet/experiment/model_epoch_20.pth
```

Evaluate ogbl-biokg

Due to the large size of ogbl-biokg, we only evaluate on a small portion of the validation set during training. The following line evaluates a model on the full validation / test sets of ogbl-biokg. Please replace the checkpoint with your own path.

```bash
python script/run.py -c config/knowledge_graph/ogbl-biokg_test.yaml --checkpoint /path/to/nbfnet/experiment/model_epoch_10.pth
```

Results

Here are the results of NBFNet on standard benchmark datasets. All the results are obtained with 4 V100 GPUs (32GB). Note that results may differ slightly if the model is trained with 1 GPU and/or a smaller batch size.

Knowledge Graph Completion

| Dataset | MR | MRR | HITS@1 | HITS@3 | HITS@10 |
| --- | --- | --- | --- | --- | --- |
| FB15k-237 | 114 | 0.415 | 0.321 | 0.454 | 0.599 |
| WN18RR | 636 | 0.551 | 0.497 | 0.573 | 0.666 |
| ogbl-biokg | - | 0.829 | 0.768 | 0.870 | 0.946 |
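The columns above are standard ranking metrics: mean rank (MR), mean reciprocal rank (MRR), and HITS@K, the fraction of test triplets whose true entity is ranked within the top K. The sketch below computes them from a list of 1-based ranks using the textbook definitions; it is not the repository's evaluation code, and `ranking_metrics` and the toy ranks are made up for illustration.

```python
def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute MR, MRR and HITS@K from 1-based ranks of the true entities."""
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                    # lower is better
        "MRR": sum(1.0 / r for r in ranks) / n,  # higher is better
    }
    for k in ks:
        metrics["HITS@%d" % k] = sum(1 for r in ranks if r <= k) / n
    return metrics


# Toy example: four queries whose true answers were ranked 1st, 2nd, 4th and 10th.
for name, value in ranking_metrics([1, 2, 4, 10]).items():
    print(name, round(value, 4))
```

For the toy ranks, MR is 4.25 and HITS@3 is 0.5, since two of the four answers are ranked in the top 3.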

Homogeneous Graph Link Prediction

| Dataset | AUROC | AP |
| --- | --- | --- |
| Cora | 0.956 | 0.962 |
| CiteSeer | 0.923 | 0.936 |
| PubMed | 0.983 | 0.982 |

Inductive Relation Prediction

HITS@10 (50 sample)

| Dataset | v1 | v2 | v3 | v4 |
| --- | --- | --- | --- | --- |
| FB15k-237 | 0.834 | 0.949 | 0.951 | 0.960 |
| WN18RR | 0.948 | 0.905 | 0.893 | 0.890 |

Frequently Asked Questions

  1. The code is stuck at the beginning of epoch 0.

This is probably because the JIT cache is broken. Try rm -r ~/.cache/torch_extensions/* and run the code again.
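The same cleanup can be done from Python, for example on systems where a shell is inconvenient. This is a minimal sketch assuming the cache lives at the default path named above; `clear_extension_cache` is a hypothetical helper. Unlike the shell glob, it also removes hidden entries inside the directory.

```python
import os
import shutil


def clear_extension_cache(cache_dir="~/.cache/torch_extensions"):
    """Delete every entry inside cache_dir and return the removed names."""
    cache_dir = os.path.expanduser(cache_dir)
    if not os.path.isdir(cache_dir):
        return []  # nothing to clean
    removed = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)  # a compiled extension's build directory
        else:
            os.remove(path)      # a stray file or symlink
        removed.append(name)
    return removed


# Usage (deletes files, so it is left commented out here):
# print(clear_extension_cache())
```

The extensions are rebuilt on the next run, so the first epoch may take longer to start after the cache is cleared.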

Citation

If you find this codebase useful in your research, please cite the following paper.

```bibtex
@article{zhu2021neural,
  title={Neural bellman-ford networks: A general graph neural network framework for link prediction},
  author={Zhu, Zhaocheng and Zhang, Zuobai and Xhonneux, Louis-Pascal and Tang, Jian},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}
```

Issues

Merge pull request #1 from DeepGraphLearning/master

opened on 2022-11-04 11:40:02 by moguizhizi

First update

Seems unable to utilize multiple GPUs

opened on 2022-06-05 08:06:55 by jerermyyoung

Hi there.

I tried running this code on one of my machines with four RTX 3090 GPUs (24GB of GPU memory each):

python -m torch.distributed.launch --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3]

I did not change any other part of this repo. However, I hit a CUDA error saying that more GPU memory is needed. I then changed the command to:

python script/run.py -c config/inductive/wn18rr.yaml --gpus [0]

and ran it on a machine with one A100 GPU with 40GB of GPU memory. The code ran successfully and used roughly 32GB of GPU memory. I am really puzzled by this: why does the code not make use of the total 24GB*4=96GB of GPU memory, and instead still report a memory issue? Is there something wrong with my setup?

Problems about ninja

opened on 2022-04-25 08:33:58 by Robot-2020

Hi, Doctor. I ran into some problems when running the code on Linux and I really need your help. Could you help me? It has been troubling me a lot.

``` 15:43:32 Preprocess training set 15:43:36 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 15:43:36 Epoch 0 begin Traceback (most recent call last): File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1666, in _run_ninja_build subprocess.run( File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/subprocess.py", line 516, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "script/run.py", line 62, in train_and_validate(cfg, solver) File "script/run.py", line 27, in train_and_validate solver.train(kwargs) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/core/engine.py", line 143, in train loss, metric = model(batch) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(*input, kwargs) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/tasks/reasoning.py", line 85, in forward pred = self.predict(batch, all_loss, metric) File "/data1/home/wza/nbfnet/nbfnet/task.py", line 288, in predict pred = self.model(graph, h_index, t_index, r_index, all_loss=all_loss, metric=metric) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(input, kwargs) File "/data1/home/wza/nbfnet/nbfnet/model.py", line 149, in forward output = self.bellmanford(graph, h_index[:, 0], r_index[:, 0]) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/decorator.py", line 232, in fun return caller(func, (extras + args), kw) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 56, in wrapper return forward(self, *args, kwargs) File "/data1/home/wza/nbfnet/nbfnet/model.py", line 115, in bellmanford hidden = layer(step_graph, layer_input) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(input, *kwargs) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/conv.py", line 91, in forward update = self.message_and_aggregate(graph, input) File "/data1/home/wza/nbfnet/nbfnet/layer.py", line 140, in message_and_aggregate sum = functional.generalized_rspmm(adjacency, relation_input, input, sum="add", mul=mul) File 
"/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/spmm.py", line 378, in generalized_rspmm return Function.apply(sparse.coalesce(), relation, input) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/spmm.py", line 172, in forward forward = spmm.rspmm_add_mul_forward_cuda File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 27, in getattr return getattr(self.module, key) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 21, in get result = self.func(obj) File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 31, in module return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags, File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1080, in load return _jit_compile( File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1293, in _jit_compile _write_ninja_file_and_build_library( File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1405, in _write_ninja_file_and_build_library _run_ninja_build( File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1682, in _run_ninja_build raise RuntimeError(message) from e RuntimeError: Error building extension 'spmm': [1/3] /usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem 
/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/TH -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.2/include -isystem /data1/home/wza/.conda/envs/linkp/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu -o rspmm.cuda.o FAILED: rspmm.cuda.o /usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/TH -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.2/include -isystem /data1/home/wza/.conda/envs/linkp/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu -o rspmm.cuda.o /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu: In instantiation of ‘at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const 
at::Tensor&):::: [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]’: /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:246:600: required from ‘struct at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&):: [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]::’ /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:246:608: required from ‘at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&):: [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]’ /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:246:607: required from ‘struct at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&) [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul; at::sparse::SparseTensor = at::Tensor]::’ /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:246:28: required from ‘at::Tensor at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&) [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul; at::sparse::SparseTensor = at::Tensor]’ /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:356:193: required from here /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:244:37: internal compiler error: in tsubst_copy, at cp/pt.c:13189 const int num_row_block = (num_row + row_per_block - 1) / row_per_block; ^ Please submit a full bug report, with preprocessed source if appropriate. See for instructions. 
[2/3] /usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/TH -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.2/include -isystem /data1/home/wza/.conda/envs/linkp/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu -o spmm.cuda.o FAILED: spmm.cuda.o /usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/TH -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.2/include -isystem /data1/home/wza/.conda/envs/linkp/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 
-gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu -o spmm.cuda.o /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu: In instantiation of ‘at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&):::: [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]’: /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:219:506: required from ‘struct at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&):: [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]::’ /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:219:514: required from ‘at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&):: [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]’ /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:219:512: required from ‘struct at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&) [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul; at::sparse::SparseTensor = at::Tensor]::’ /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:219:28: required from ‘at::Tensor at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&) [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul; at::sparse::SparseTensor = at::Tensor]’ /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:315:157: required from here /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:217:37: internal compiler error: in tsubst_copy, at cp/pt.c:13189 const int num_row_block = (num_row + row_per_block - 1) / row_per_block; ^ Please submit a full bug report, with preprocessed 
source if appropriate. See for instructions. ninja: build stopped: subcommand failed. ```

Unable to run the code with error importing 'spmm'

opened on 2022-03-31 17:04:49 by PengfeiHePower

Hi, I followed the instructions to reproduce the results but ran into a problem with the module 'spmm'. My torch version is 1.8.2 and torchdrug is 0.1.2. Any ideas on how to fix it?

12:53:15 Epoch 0 begin Traceback (most recent call last): File "script/run.py", line 78, in File "script/run.py", line 30, in train_and_validate File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\core\engine.py", line 143, in train loss, metric = model(batch) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl result = self.forward(input, kwargs) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\tasks\reasoning.py", line 85, in forward pred = self.predict(batch, all_loss, metric) File "C:\Users\Pengfei\Documents\cse research\NBFNet-master\nbfnet\task.py", line 288, in predict pred = self.model(graph, h_index, t_index, r_index, all_loss=all_loss, metric=metric) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl result = self.forward(input, kwargs) File "C:\Users\Pengfei\Documents\cse research\NBFNet-master\nbfnet\model.py", line 149, in forward output = self.bellmanford(graph, h_index[:, 0], r_index[:, 0]) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\decorator.py", line 232, in fun return caller(func, *(extras + args), kw) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\utils\decorator.py", line 56, in wrapper return forward(self, args, kwargs) File "C:\Users\Pengfei\Documents\cse research\NBFNet-master\nbfnet\model.py", line 115, in bellmanford hidden = layer(step_graph, layer_input) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl result = self.forward(input, **kwargs) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\layers\conv.py", line 91, in forward update = self.message_and_aggregate(graph, input) File "C:\Users\Pengfei\Documents\cse research\NBFNet-master\nbfnet\layer.py", line 140, in message_and_aggregate sum = functional.generalized_rspmm(adjacency, relation_input, 
input, sum="add", mul=mul) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\layers\functional\spmm.py", line 378, in generalized_rspmm return Function.apply(sparse.coalesce(), relation, input) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\layers\functional\spmm.py", line 172, in forward forward = spmm.rspmm_add_mul_forward_cuda File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\utils\torch.py", line 27, in getattr return getattr(self.module, key) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\utils\decorator.py", line 21, in get result = self.func(obj) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\utils\torch.py", line 31, in module return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags, File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\utils\cpp_extension.py", line 1079, in load return _jit_compile( File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\utils\cpp_extension.py", line 1317, in _jit_compile return _import_module_from_library(name, build_directory, is_python_module) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\utils\cpp_extension.py", line 1700, in _import_module_from_library file, path, description = imp.find_module(module_name, [path]) File "C:\Users\Pengfei\anaconda3\envs\py38\lib\imp.py", line 296, in find_module raise ImportError(_ERR_MSG.format(name), name=name) ImportError: No module named 'spmm'

Unable to run the code with error regarding 'mpiicpc'

opened on 2022-02-12 06:26:55 by VeritasYin

Hello,

I followed the instructions to install the torchdrug-related packages and the matching PyTorch/CUDA version. However, I got the following error when initializing the code. Any ideas on how to fix this? The system has intel/19.0.3.199 loaded.

01:24:15 Epoch 0 begin Traceback (most recent call last): File "script/run.py", line 62, in <module> train_and_validate(cfg, solver) File "script/run.py", line 27, in train_and_validate solver.train(**kwargs) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/core/engine.py", line 143, in train loss, metric = model(batch) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/tasks/reasoning.py", line 85, in forward pred = self.predict(batch, all_loss, metric) File "~/Workspace/Python/NBFNet/nbfnet/task.py", line 288, in predict pred = self.model(graph, h_index, t_index, r_index, all_loss=all_loss, metric=metric) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "~/Workspace/Python/NBFNet/nbfnet/model.py", line 149, in forward output = self.bellmanford(graph, h_index[:, 0], r_index[:, 0]) File "<decorator-gen-888>", line 2, in bellmanford File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 56, in wrapper return forward(self, *args, **kwargs) File "~/Workspace/Python/NBFNet/nbfnet/model.py", line 115, in bellmanford hidden = layer(step_graph, layer_input) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/layers/conv.py", line 91, in forward update = self.message_and_aggregate(graph, input) File "~/Workspace/Python/NBFNet/nbfnet/layer.py", line 124, in message_and_aggregate adjacency = graph.adjacency.transpose(0, 1) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 21, in __get__ result = self.func(obj) File 
"~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/data/graph.py", line 658, in adjacency return utils.sparse_coo_tensor(self.edge_list.t(), self.edge_weight, self.shape) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 182, in sparse_coo_tensor return torch_ext.sparse_coo_tensor_unsafe(indices, values, size) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 27, in __getattr__ return getattr(self.module, key) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 21, in __get__ result = self.func(obj) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 31, in module return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags, File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1079, in load return _jit_compile( File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1292, in _jit_compile _write_ninja_file_and_build_library( File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1378, in _write_ninja_file_and_build_library check_compiler_abi_compatibility(compiler) File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 282, in check_compiler_abi_compatibility if not check_compiler_ok_for_platform(compiler): File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 249, in check_compiler_ok_for_platform version_string = subprocess.check_output([compiler, '-v'], stderr=subprocess.STDOUT).decode() File "~/anaconda3/envs/dlg_env/lib/python3.8/subprocess.py", line 415, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, File "~/anaconda3/envs/dlg_env/lib/python3.8/subprocess.py", line 516, in run raise CalledProcessError(retcode, process.args, 
subprocess.CalledProcessError: Command '['icpc', '-v']' returned non-zero exit status 1.
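The traceback ends in PyTorch's `check_compiler_ok_for_platform`, which runs the configured C++ compiler with `-v`. On Linux, PyTorch's JIT extension builder is understood to take that compiler from the `CXX` environment variable and fall back to `c++`; this is an assumption worth verifying against the installed PyTorch version. The sketch below only reports which compiler would be picked up; `resolve_cxx` is a hypothetical helper.

```python
import os
import shutil


def resolve_cxx(env=None):
    """Return (compiler name, resolved path or None) for the JIT build.

    Assumes the compiler comes from $CXX with 'c++' as the fallback.
    """
    env = os.environ if env is None else env
    compiler = env.get("CXX", "c++")
    return compiler, shutil.which(compiler, path=env.get("PATH"))


name, path = resolve_cxx()
print("compiler:", name)
print("found at:", path if path else "not on PATH")
```

If the reported compiler is `icpc`, setting `CXX` to a GCC-compatible compiler before launching the script may be one thing to try.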

MilaGraph

Research group led by Prof. Jian Tang at Mila-Quebec AI Institute (https://mila.quebec/) focusing on graph representation learning and graph neural networks.
