Knowledge graph generation from unstructured text.

rossanez, updated 2022-07-16 13:43:28

KGen


Running instructions:

1. Start CoreNLP server:

    $ python3 common/stanfordcorenlp/server.py

(syntax: python3 common/stanfordcorenlp/server.py -h)

2. With the server started, run the pipeline in another shell, e.g.:

    $ python3 pipeline.py text.txt -p senna -s -k cso -ng

(syntax: python3 pipeline.py -h)

Alternatively, each stage may be executed outside the pipeline, e.g.:

2.1. Preprocessing:

    $ cd preprocessor
    $ python3 preprocessor.py text.txt

(syntax: python3 preprocessor.py -h)

2.2. Facts extractor:

    $ cd facts_extractor
    $ python3 extractor.py text_preprocessed.txt -p senna -s

(syntax: python3 extractor.py -h)

2.3. Ontology linker (Optional stage, used to obtain ontology links):

    $ cd kb_linker
    $ python3 linker.py text_preprocessed.txt -k cso

(syntax: python3 linker.py -h)

2.4. RDF maker:

    $ cd rdf_maker
    $ python3 maker.py text_preprocessed_triples.txt -l text_preprocessed_links.txt

(syntax: python3 maker.py -h)

2.5. PNG generator (Optional stage, used to obtain a PNG image representing the KG):

    $ cd graph_generator
    $ python3 generator.py text_preprocessed_kg.ttl

(syntax: python3 generator.py -h)

3. When done, stop the server:

    $ python3 common/stanfordcorenlp/server.py -k

(or simply press Ctrl+C in its shell)
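
The file names used in the stage-by-stage commands above suggest a naming chain: `text.txt` becomes `text_preprocessed.txt`, which in turn yields `text_preprocessed_triples.txt`, `text_preprocessed_links.txt`, and `text_preprocessed_kg.ttl`. The Python sketch below derives those names and assembles the per-stage commands. The helper name `stage_commands` is hypothetical, and the naming rule is an assumption inferred from the examples rather than documented behavior. It also ignores where each stage actually writes its output, so paths may need adjusting. The flags are copied verbatim from the commands above.

```python
from pathlib import Path


def stage_commands(text_file):
    """Return (working_dir, argv) pairs for stages 2.1 to 2.5.

    Assumption: intermediate file names follow the pattern seen in the
    example commands (text.txt -> text_preprocessed.txt -> ...). The
    actual scripts may name or place their outputs differently.
    """
    stem = Path(text_file).stem
    preprocessed = f"{stem}_preprocessed.txt"
    triples = f"{stem}_preprocessed_triples.txt"
    links = f"{stem}_preprocessed_links.txt"
    graph = f"{stem}_preprocessed_kg.ttl"
    return [
        ("preprocessor", ["python3", "preprocessor.py", text_file]),
        ("facts_extractor",
         ["python3", "extractor.py", preprocessed, "-p", "senna", "-s"]),
        ("kb_linker", ["python3", "linker.py", preprocessed, "-k", "cso"]),
        ("rdf_maker", ["python3", "maker.py", triples, "-l", links]),
        ("graph_generator", ["python3", "generator.py", graph]),
    ]


if __name__ == "__main__":
    # Print each stage as a shell one-liner instead of executing it.
    for directory, argv in stage_commands("text.txt"):
        print(f"(cd {directory} && {' '.join(argv)})")
```

The sketch only prints the commands. To execute them, each `argv` could be passed to `subprocess.run(argv, cwd=directory, check=True)` once the CoreNLP server from step 1 is running.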

Citing KGen

Issues

Repo file size excessive

opened on 2022-01-21 19:36:41 by davidshumway

```bash
$ du -h /Users/b/KGen/ | sort -hr | head -n 10

3.1G /Users/b/KGen/
2.9G /Users/b/KGen/.git/objects/pack
2.9G /Users/b/KGen/.git/objects
2.9G /Users/b/KGen/.git
183M /Users/b/KGen/examples
152M /Users/b/KGen/examples/cs
146M /Users/b/KGen/examples/cs/ISWC
31M /Users/b/KGen/examples/biomedical
19M /Users/b/KGen/examples/biomedical/tentative
12M /Users/b/KGen/examples/biomedical/reduced
```

AssertionError: ERROR: Stanford CoreNLP Server exited with a non-zero code status.

opened on 2022-01-20 00:12:12 by davidshumway

Environment: macOS Monterey 12.1 (21C52)

```bash
$ python3 common/stanfordcorenlp/server.py

Starting Stanford CoreNLP Server from /Users/b/kgen/common/stanfordcorenlp
Stanford CoreNLP Server startup command: java -Djava.io.tmpdir="/tmp/" -mx5g -cp "/Users/b/kgen/common/stanfordcorenlp/stanford-corenlp.jar:/Users/b/kgen/common/stanfordcorenlp/stanford-corenlp-models.jar:/Users/b/kgen/common/stanfordcorenlp/slf4j-api.jar:/Users/b/kgen/common/stanfordcorenlp/slf4j-simple.jar:/Users/b/kgen/common/stanfordcorenlp/ejml.jar" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
Error: Could not find or load main class edu.stanford.nlp.pipeline.StanfordCoreNLPServer
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.pipeline.StanfordCoreNLPServer
Traceback (most recent call last):
  File "/Users/b/kgen/common/stanfordcorenlp/server.py", line 119, in <module>
    exit(main(argv))
  File "/Users/b/kgen/common/stanfordcorenlp/server.py", line 116, in main
    server.startServer(verbose=True, wait_for_subprocess=True)
  File "/Users/b/kgen/common/stanfordcorenlp/server.py", line 81, in startServer
    assert not java_process.returncode, 'ERROR: Stanford CoreNLP Server exited with a non-zero code status.'
AssertionError: ERROR: Stanford CoreNLP Server exited with a non-zero code status.
```
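
A `ClassNotFoundException` for the main class generally means the JVM could not load that class from any entry on the classpath, for instance because a jar is missing or is not a readable archive. Whether that is the cause here is not established by the log. As a hedged diagnostic sketch in Python, the hypothetical helper `check_classpath` below takes the `-cp` string from the startup command and reports, per entry, whether the file exists, whether it is a valid zip/jar, and whether it contains the server class. It uses only the standard library.

```python
import zipfile
from pathlib import Path

# Path of the server class inside a jar, derived from the class name in the log.
SERVER_CLASS = "edu/stanford/nlp/pipeline/StanfordCoreNLPServer.class"


def check_classpath(classpath, sep=":"):
    """Return (entry, status) pairs for each classpath entry.

    status is one of: "missing", "not a valid jar",
    "has server class", or "ok" (valid jar without the server class).
    """
    report = []
    for entry in classpath.split(sep):
        path = Path(entry)
        if not path.is_file():
            report.append((entry, "missing"))
        elif not zipfile.is_zipfile(path):
            report.append((entry, "not a valid jar"))
        else:
            with zipfile.ZipFile(path) as jar:
                has_server = SERVER_CLASS in jar.namelist()
            report.append((entry, "has server class" if has_server else "ok"))
    return report
```

If no entry reports "has server class", the jar expected to provide `StanfordCoreNLPServer` is the first thing to inspect.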

How to solve this error?

opened on 2022-01-19 09:37:31 by zhangweizhenGitHub

[image]

License

opened on 2021-12-01 19:19:42 by emerson-h

Hi @rossanez, I came across this repo while doing some research for a hackathon in the EU, and there is no license specified.

Here's a guide from GitHub on adding an open source license using their templates:

https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-license-to-a-repository

Provide requirements file

opened on 2021-11-18 14:36:17 by akhil-bot

Hi @rossanez, I came across your repo while doing some research on KG extraction from unstructured text. It seems there is no requirements file included in this repo. It would be a great help if you could include one, along with a few instructions on installing the Stanford CoreNLP server. Thanks in advance... :)

Anderson Rossanez