This repository contains code and models for the paper: SpanBERT: Improving Pre-training by Representing and Predicting Spans. If you prefer to use Huggingface, please check out this link -- https://huggingface.co/SpanBERT
Please use an earlier commit of Apex - NVIDIA/[email protected]
We release both base and large cased models for SpanBERT. The base & large models have the same model configuration as BERT but they differ in both the masking scheme and the training objectives (see our paper for more details).
These models have the same format as the HuggingFace BERT models, so you can easily replace them with our SpanBERT models. If you would like to use our fine-tuning code, the model paths are already hard-coded in the code :)
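For instance, something like the following should load the base model through the `transformers` library. This is a minimal sketch, assuming the `SpanBERT/spanbert-base-cased` checkpoint from the Hugging Face link above; a path to a locally downloaded model directory can be used instead.

```python
# Minimal sketch: load SpanBERT with the Hugging Face transformers library.
# "SpanBERT/spanbert-base-cased" is assumed from the Hugging Face link above;
# a locally downloaded model directory also works here.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SpanBERT/spanbert-base-cased")
model = AutoModel.from_pretrained("SpanBERT/spanbert-base-cased")

inputs = tokenizer("SpanBERT predicts contiguous spans of text.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence length, hidden size)
```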
|                  | SQuAD 1.1 | SQuAD 2.0 | Coref   | TACRED |
| ---------------- | --------- | --------- | ------- | ------ |
|                  | F1        | F1        | avg. F1 | F1     |
| BERT (base)      | 88.5*     | 76.5*     | 73.1    | 67.7   |
| SpanBERT (base)  | 92.4*     | 83.6*     | 77.4    | 68.2   |
| BERT (large)     | 91.3      | 83.3      | 77.1    | 66.4   |
| SpanBERT (large) | 94.6      | 88.7      | 79.6    | 70.8   |
Note: The numbers marked as * are evaluated on the development sets because we didn't submit those models to the official SQuAD leaderboard. All the other numbers are test numbers.
SQuAD 1.1:
```bash
python code/run_squad.py \
  --do_train \
  --do_eval \
  --model spanbert-base-cased \
  --train_file train-v1.1.json \
  --dev_file dev-v1.1.json \
  --train_batch_size 32 \
  --eval_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 4 \
  --max_seq_length 512 \
  --doc_stride 128 \
  --eval_metric f1 \
  --output_dir squad_output \
  --fp16
```
SQuAD 2.0:
```bash
python code/run_squad.py \
  --do_train \
  --do_eval \
  --model spanbert-base-cased \
  --train_file train-v2.0.json \
  --dev_file dev-v2.0.json \
  --train_batch_size 32 \
  --eval_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 4 \
  --max_seq_length 512 \
  --doc_stride 128 \
  --eval_metric best_f1 \
  --output_dir squad2_output \
  --version_2_with_negative \
  --fp16
```
TACRED:
```bash
python code/run_tacred.py \
  --do_train \
  --do_eval \
  --data_dir <TACRED_DATA_DIR> \
  --model spanbert-base-cased \
  --train_batch_size 32 \
  --eval_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 10 \
  --max_seq_length 128 \
  --output_dir tacred_dir \
  --fp16
```
MRQA (here TriviaQA):
```bash
python code/run_mrqa.py \
  --do_train \
  --do_eval \
  --model spanbert-base-cased \
  --train_file TriviaQA-train.jsonl.gz \
  --dev_file TriviaQA-dev.jsonl.gz \
  --train_batch_size 32 \
  --eval_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 4 \
  --max_seq_length 512 \
  --doc_stride 128 \
  --eval_per_epoch 5 \
  --output_dir triviaqa_dir \
  --fp16
```
GLUE (here RTE):
```bash
python code/run_glue.py \
  --task_name RTE \
  --model spanbert-base-cased \
  --do_train \
  --do_eval \
  --data_dir <RTE_DATA_DIR> \
  --train_batch_size 32 \
  --eval_batch_size 32 \
  --num_train_epochs 10 \
  --max_seq_length 128 \
  --learning_rate 2e-5 \
  --output_dir RTE_DIR \
  --fp16
```
Our coreference resolution fine-tuning code is implemented in TensorFlow. Please see https://github.com/mandarjoshi90/coref for more details.
If you are interested in using our fine-tuned models for downstream tasks directly, please use the following script:
```bash
./code/download_finetuned.sh <model_dir> <task>
```
where `<task>` is one of `[squad1, squad2, tacred]`. You can evaluate the models by setting `--do_train` to false, `--do_eval` to true, and `--output_dir` to `<model_dir>/<task>` in `python code/run_<task>.py`.
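For example, an evaluation-only run on SQuAD 1.1 might look like the sketch below. The flags are taken from the fine-tuning command above; the exact set of required arguments is defined in code/run_squad.py, so treat this as a template rather than a verified invocation.

```bash
# Sketch: evaluate the downloaded fine-tuned SQuAD 1.1 model (no training).
# Assumes ./code/download_finetuned.sh <model_dir> squad1 has been run
# and dev-v1.1.json is available in the working directory.
python code/run_squad.py \
  --do_eval \
  --model spanbert-base-cased \
  --dev_file dev-v1.1.json \
  --eval_batch_size 32 \
  --max_seq_length 512 \
  --doc_stride 128 \
  --eval_metric f1 \
  --output_dir <model_dir>/squad1
```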
For coreference resolution, please refer to this repository -- https://github.com/mandarjoshi90/coref
```
@article{joshi2019spanbert,
  title={{SpanBERT}: Improving Pre-training by Representing and Predicting Spans},
  author={Mandar Joshi and Danqi Chen and Yinhan Liu and Daniel S. Weld and Luke Zettlemoyer and Omer Levy},
  journal={arXiv preprint arXiv:1907.10529},
  year={2019}
}
```
SpanBERT is CC-BY-NC 4.0. The license applies to the pre-trained models as well.
If you have any questions, please contact Mandar Joshi <[email protected]> or Danqi Chen <[email protected]>, or create a GitHub issue.
Bumps ipython from 7.10.2 to 8.10.0.
Sourced from ipython's releases.
See https://pypi.org/project/ipython/
We do not use GitHub release anymore. Please see PyPI https://pypi.org/project/ipython/
- `15ea1ed` release 8.10.0
- `560ad10` DOC: Update what's new for 8.10 (#13939)
- `7557ade` DOC: Update what's new for 8.10
- `385d693` Merge pull request from GHSA-29gw-9793-fvw7
- `e548ee2` Swallow potential exceptions from showtraceback() (#13934)
- `0694b08` MAINT: mock slowest test. (#13885)
- `8655912` MAINT: mock slowest test.
- `a011765` Isolate the attack tests with setUp and tearDown methods
- `c7a9470` Add some regression tests for this change
- `fd34cf5` Swallow potential exceptions from showtraceback()

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Bumps certifi from 2019.9.11 to 2022.12.7.
- `9e9e840` 2022.12.07
- `b81bdb2` 2022.09.24
- `939a28f` 2022.09.14
- `aca828a` 2022.06.15.2
- `de0eae1` Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ...
- `b8eb5e9` 2022.06.15.1
- `47fb7ab` Fix deprecation warning on Python 3.11 (#199)
- `b0b48e0` fixes #198 -- update link in license
- `9d514b4` 2022.06.15
- `4151e88` Add py.typed to MANIFEST.in to package in sdist (#196)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
I am using the SpanBERT model (without fine-tuning it), downloaded from https://github.com/facebookresearch/SpanBERT, to test it on some text samples. Using HF, I build the pipeline as `token_classifier = pipeline("token-classification", model=model, aggregation_strategy="max", tokenizer=tokenizer, grouped_entities=True)`, and the results I get are labeled `Label_0`, `Label_1`. How can I map these labels to entity names? Is there a function or documentation that describes the entity type mapping?
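One way to see where those generic names come from is to inspect the model config: when no task-specific label mapping has been set, transformers falls back to placeholder names such as LABEL_0 / LABEL_1. A minimal sketch, assuming the `SpanBERT/spanbert-base-cased` checkpoint (the hub ID is an assumption, not something stated in the question):

```python
# Sketch: inspect the id2label mapping that a transformers pipeline uses.
# Generic names such as LABEL_0 / LABEL_1 indicate that the config carries no
# task-specific mapping, which is expected for a model that was never
# fine-tuned for token classification.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("SpanBERT/spanbert-base-cased")  # assumed checkpoint
print(config.id2label)  # e.g. {0: "LABEL_0", 1: "LABEL_1"} when no mapping is set
```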
Bumps pillow from 6.1.0 to 9.3.0.
Sourced from pillow's releases.
9.3.0
https://pillow.readthedocs.io/en/stable/releasenotes/9.3.0.html
Changes
- Initialize libtiff buffer when saving #6699 [@radarhere]
- Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [@wiredfool]
- Inline fname2char to fix memory leak #6329 [@nulano]
- Fix memory leaks related to text features #6330 [@nulano]
- Use double quotes for version check on old CPython on Windows #6695 [@hugovk]
- GHA: replace deprecated set-output command with GITHUB_OUTPUT file #6697 [@nulano]
- Remove backup implementation of Round for Windows platforms #6693 [@cgohlke]
- Upload fribidi.dll to GitHub Actions #6532 [@nulano]
- Fixed set_variation_by_name offset #6445 [@radarhere]
- Windows build improvements #6562 [@nulano]
- Fix malloc in _imagingft.c:font_setvaraxes #6690 [@cgohlke]
- Only use ASCII characters in C source file #6691 [@cgohlke]
- Release Python GIL when converting images using matrix operations #6418 [@hmaarrfk]
- Added ExifTags enums #6630 [@radarhere]
- Do not modify previous frame when calculating delta in PNG #6683 [@radarhere]
- Added support for reading BMP images with RLE4 compression #6674 [@npjg]
- Decode JPEG compressed BLP1 data in original mode #6678 [@radarhere]
- pylint warnings #6659 [@marksmayo]
- Added GPS TIFF tag info #6661 [@radarhere]
- Added conversion between RGB/RGBA/RGBX and LAB #6647 [@radarhere]
- Do not attempt normalization if mode is already normal #6644 [@radarhere]
- Fixed seeking to an L frame in a GIF #6576 [@radarhere]
- Consider all frames when selecting mode for PNG save_all #6610 [@radarhere]
- Don't reassign crc on ChunkStream close #6627 [@radarhere]
- Raise a warning if NumPy failed to raise an error during conversion #6594 [@radarhere]
- Only read a maximum of 100 bytes at a time in IMT header #6623 [@radarhere]
- Show all frames in ImageShow #6611 [@radarhere]
- Allow FLI palette chunk to not be first #6626 [@radarhere]
- If first GIF frame has transparency for RGB_ALWAYS loading strategy, use RGBA mode #6592 [@radarhere]
- Round box position to integer when pasting embedded color #6517 [@radarhere]
- Removed EXIF prefix when saving WebP #6582 [@radarhere]
- Pad IM palette to 768 bytes when saving #6579 [@radarhere]
- Added DDS BC6H reading #6449 [@ShadelessFox]
- Added support for opening WhiteIsZero 16-bit integer TIFF images #6642 [@JayWiz]
- Raise an error when allocating translucent color to RGB palette #6654 [@jsbueno]
- Moved mode check outside of loops #6650 [@radarhere]
- Added reading of TIFF child images #6569 [@radarhere]
- Improved ImageOps palette handling #6596 [@PososikTeam]
- Defer parsing of palette into colors #6567 [@radarhere]
- Apply transparency to P images in ImageTk.PhotoImage #6559 [@radarhere]
- Use rounding in ImageOps contain() and pad() #6522 [@bibinhashley]
- Fixed GIF remapping to palette with duplicate entries #6548 [@radarhere]
- Allow remap_palette() to return an image with less than 256 palette entries #6543 [@radarhere]
- Corrected BMP and TGA palette size when saving #6500 [@radarhere]
... (truncated)
Sourced from pillow's changelog.
9.3.0 (2022-10-29)
Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [wiredfool]
Initialize libtiff buffer when saving #6699 [radarhere]
Inline fname2char to fix memory leak #6329 [nulano]
Fix memory leaks related to text features #6330 [nulano]
Use double quotes for version check on old CPython on Windows #6695 [hugovk]
Remove backup implementation of Round for Windows platforms #6693 [cgohlke]
Fixed set_variation_by_name offset #6445 [radarhere]
Fix malloc in _imagingft.c:font_setvaraxes #6690 [cgohlke]
Release Python GIL when converting images using matrix operations #6418 [hmaarrfk]
Added ExifTags enums #6630 [radarhere]
Do not modify previous frame when calculating delta in PNG #6683 [radarhere]
Added support for reading BMP images with RLE4 compression #6674 [npjg, radarhere]
Decode JPEG compressed BLP1 data in original mode #6678 [radarhere]
Added GPS TIFF tag info #6661 [radarhere]
Added conversion between RGB/RGBA/RGBX and LAB #6647 [radarhere]
Do not attempt normalization if mode is already normal #6644 [radarhere]
... (truncated)
- `d594f4c` Update CHANGES.rst [ci skip]
- `909dc64` 9.3.0 version bump
- `1a51ce7` Merge pull request #6699 from hugovk/security-libtiff_buffer
- `2444cdd` Merge pull request #6700 from hugovk/security-samples_per_pixel-sec
- `744f455` Added release notes
- `0846bfa` Add to release notes
- `799a6a0` Fix linting
- `00b25fd` Hide UserWarning in logs
- `05b175e` Tighter test case
- `13f2c5a` Prevent DOS with large SAMPLESPERPIXEL in Tiff IFD

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
I have trained a pretrained BERT model, but I feel that many of the weights are not well trained. As a result, the outputs for different input texts are almost identical; the differences only appear around the seventh decimal place. What is the cause?
Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks that all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.
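As a rough illustration of the kind of check described above, the sketch below validates member paths before extraction. It is a generic example of the mitigation, not necessarily the exact patch submitted in the pull request.

```python
# Sketch of a path-traversal check before tarfile extraction (CVE-2007-4559).
# Generic illustration of the described mitigation, not the exact submitted patch.
import os
import tarfile

def safe_extractall(tar: tarfile.TarFile, path: str = ".") -> None:
    base = os.path.abspath(path)
    for member in tar.getmembers():
        target = os.path.abspath(os.path.join(path, member.name))
        # Reject any member whose resolved path escapes the destination directory.
        if not target.startswith(base + os.sep) and target != base:
            raise Exception("Attempted path traversal in tar file: " + member.name)
    tar.extractall(path)

# Usage:
# with tarfile.open("archive.tar.gz") as tar:
#     safe_extractall(tar, "output_dir")
```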