Published August 25, 2021 | Version v1
Dataset
Open
OHSUMED
Creators
Description
The OHSUMED collection contains medical documents collected in 1991 and related to 23 cardiovascular disease categories. The version we used has 18,302 documents, distributed very unevenly among the categories, ranging from 56 to 2,876 documents per category.
The files:
texts.txt: Document set (text), one document per line.
score.txt: Document classes, one per line; each line gives the class of the document at the same line index in texts.txt.
split_<k>.pkl: pandas DataFrame with the k-fold cross-validation partition.
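The file layout above can be read with a few lines of Python. The following is a minimal sketch rather than the creators' own loader: the helper names (`read_lines`, `load_corpus`, `class_counts`, `load_split`) are hypothetical, UTF-8 encoding is an assumption, and the column layout of the pickled DataFrame is not documented here, so the sketch only loads it without interpreting it.

```python
from collections import Counter
from pathlib import Path


def read_lines(path, encoding="utf-8"):
    """Return the lines of a text file without their trailing newlines.

    The encoding is an assumption; the record does not state it.
    """
    with open(path, encoding=encoding) as handle:
        return [line.rstrip("\n") for line in handle]


def load_corpus(folder):
    """Load texts.txt and score.txt from `folder` as two aligned lists."""
    folder = Path(folder)
    texts = read_lines(folder / "texts.txt")
    labels = read_lines(folder / "score.txt")
    # The two files are aligned by line index, so their lengths must match.
    if len(texts) != len(labels):
        raise ValueError(
            f"texts.txt has {len(texts)} lines but score.txt has {len(labels)}"
        )
    return texts, labels


def class_counts(labels):
    """Count documents per class, useful for inspecting the class imbalance."""
    return Counter(labels)


def load_split(folder, k):
    """Load split_<k>.pkl as a pandas DataFrame (requires pandas).

    `k` is whatever value appears in the file name. The DataFrame's columns
    are not described in the record, so they are left for the reader to inspect.
    """
    import pandas as pd  # imported here so the text loaders work without pandas

    return pd.read_pickle(Path(folder) / f"split_{k}.pkl")
```

After extracting the archive, `texts, labels = load_corpus("path/to/extracted/folder")` returns the aligned lists, and `class_counts(labels)` shows how documents are spread over the categories. Because unpickling can execute arbitrary code, `load_split` should only be used on files from a trusted source.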
Files
ohsumed.zip
Files (177.3 MB)
Checksum | Size |
---|---|
md5:2d839d39886c4f8064ea4ad9bdf0d743 | 153.0 MB |
md5:b2a9c4958e140aa0a226415ad42cfbe7 | 47.7 kB |
md5:461fe78886904dde1f50f8fedaa820d8 | 547.9 kB |
md5:f886f6ea270ad2544f5dc0fa336c9ca5 | 274.4 kB |
md5:8090204908da6930f96a13959ea12346 | 23.5 MB |
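A downloaded file can be checked against the md5 values listed for the files. The sketch below is one way to do that with the Python standard library; the function names `file_md5` and `matches_checksum` are hypothetical, and since this listing does not show which file name belongs to which checksum, the expected value has to be supplied by the reader.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return file_md5(path) == expected_hex
```

For example, `matches_checksum("ohsumed.zip", "md5:<value from the listing>")` returns `True` only when the local copy is byte-identical to the published file. MD5 is adequate here for detecting corrupted downloads, though it is not a safeguard against deliberate tampering.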