Document Components Ontology (DoCO)

URL
http://purl.org/spar/doco
Documentation
http://www.sparontologies.net/ontologies/doco/source.html
Source
http://www.sparontologies.net/ontologies/doco/source.rdf (RDF/XML)
http://www.sparontologies.net/ontologies/doco/source.ttl (Turtle)
http://www.sparontologies.net/ontologies/doco/source.json (JSON-LD)
Repository
http://sourceforge.net/p/sempublishing/code/HEAD/tree/DoCO/
Reference
Constantin, A., Peroni, S., Pettifer, S., Shotton, D., Vitali, F. (in press). The Document Components Ontology (DoCO). To appear in Semantic Web – Interoperability, Usability, Applicability. Amsterdam, The Netherlands: IOS Press. http://dx.doi.org/10.3233/SW-150177

**DoCO**, the **Document Components Ontology**, is an OWL 2 DL ontology that provides a general-purpose structured vocabulary of document elements. It was designed as a unifying ontological framework for describing different aspects of the content of scientific and other scholarly texts. Its primary goal is to improve the interoperability and shareability of academic documents (and related services) when multiple formats are used to store them.

DoCO was created by studying different corpora of documents (mainly scientific literature and web documents on different topics) and publishers' guidelines from two perspectives: the structural and the rhetorical. In addition, informal interviews were conducted with researchers in different fields and with academic publishers, in order to gather as much information as possible about document components and their use.

DoCO imports the [Pattern Ontology](http://www.essepuntato.it/2008/12/pattern), which describes structural patterns (introduced in the paper "[Dealing with structural patterns of XML documents](http://dx.doi.org/10.1002/asi.23088)"), and the [Discourse Element Ontology (DEO)](/ontologies/deo), which was developed alongside DoCO and describes rhetorical components. It also defines hybrid classes describing elements that are both structural and rhetorical in nature, such as paragraph (``doco:Paragraph``), section (``doco:Section``) or list (``doco:List``). DoCO is also aligned with the [SALT Rhetorical Ontology](http://lov.okfn.org/dataset/lov/vocabs/sro) and the [Ontology of Rhetorical Blocks (ORB)](http://www.w3.org/2001/sw/hcls/notes/orb/).

A concise summary of the main DoCO classes and its imported ontologies is shown in the following figure. <img class="img-responsive center-block" src="/static/img/spar/doco-architecture.png" alt="A summary of the main classes defined in DoCO and its related imported ontologies." />

Examples of use of DoCO

  1. Describing the structure of an article
  2. Sentences containing references to bibliographic items

Describing the structure of an article

DoCO can be used to describe the parts of a document such as a journal article (defined through the [FaBiO](/ontologies/fabio) class ``fabio:JournalArticle``), connecting them by means of the object property ``po:contains``. It can also be used in combination with [C4O](/ontologies/c4o) and the [Collections Ontology (CO)](http://purl.org/co) to describe the document's textual content and the order in which its components appear. In particular, the textual content of each component can be specified through the property ``c4o:hasContent``, while the order can be described by using the entities related to the class ``co:List``.

@prefix : <http://www.sparontologies.net/example/> .
@prefix doco: <http://purl.org/spar/doco/> .
@prefix deo: <http://purl.org/spar/deo/> .
@prefix po: <http://www.essepuntato.it/2008/12/pattern#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix fabio: <http://purl.org/spar/fabio/> .
@prefix co: <http://purl.org/co/> .
@prefix c4o: <http://purl.org/spar/c4o/> .

:paper a fabio:JournalArticle ;
    po:contains
        :front-matter ,
        :body-matter ,
        :back-matter ;
    co:firstItem [
        co:itemContent :front-matter ;
        co:nextItem [
            co:itemContent :body-matter ;
            co:nextItem [
                co:itemContent :back-matter ] ] ] .

:body-matter a doco:BodyMatter ;
    po:contains
        :section-introduction ,
        :section-related-work ,
        :section-document-components ,
        :section-adoption ,
        :section-conclusions ;
    co:firstItem [
        co:itemContent :section-introduction ;
        co:nextItem [
            co:itemContent :section-related-work ;
            co:nextItem [
                co:itemContent :section-document-components ;
                co:nextItem [
                    co:itemContent :section-adoption ;
                    co:nextItem [
                        co:itemContent :section-conclusions ] ] ] ] ] .

# Note that, in this example, the composition in paragraphs
# has been defined only for this section.
:section-introduction a doco:Section , deo:Introduction ;
    po:containsAsHeader :section-introduction-title ;
    po:contains
        :paragraph-1 ,
        :paragraph-2 ,
        :paragraph-3 ,
        :paragraph-4 ;
    co:firstItem [
        co:itemContent :section-introduction-title ;
        co:nextItem [
            co:itemContent :paragraph-1 ;
            co:nextItem [
                co:itemContent :paragraph-2 ;
                co:nextItem [
                    co:itemContent :paragraph-3 ;
                    co:nextItem [
                        co:itemContent :paragraph-4 ] ] ] ] ] .

:section-introduction-title a doco:SectionTitle ;
    c4o:hasContent "Introduction" .

# Note that, in this example, the composition in sentences
# has been defined only for this paragraph.
:paragraph-1 a doco:Paragraph ;
    po:contains
        :sentence-1 ,
        :sentence-2 ,
        :sentence-3 ,
        :sentence-4 ,
        :sentence-5 ,
        :sentence-6 ,
        :sentence-7 ;
    co:firstItem [
        co:itemContent :sentence-1 ;
        co:nextItem [
            co:itemContent :sentence-2 ;
            co:nextItem [
                co:itemContent :sentence-3 ;
                co:nextItem [
                    co:itemContent :sentence-4 ;
                    co:nextItem [
                        co:itemContent :sentence-5 ;
                        co:nextItem [
                            co:itemContent :sentence-6 ;
                            co:nextItem [
                                co:itemContent :sentence-7 ] ] ] ] ] ] ] .

:sentence-1 a doco:Sentence ;
    c4o:hasContent "One of the most important criteria for the evaluation of a scientific contribution is the coherent organisation of the textual narrative that describes it, most often published as a scientific article or book." .

# ...

Please cite the source above with the following reference:

Peroni, Silvio (2015): Example of use of DoCO #1. figshare. http://dx.doi.org/10.6084/m9.figshare.1513725
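
The ordering encoded with ``co:firstItem``, ``co:nextItem`` and ``co:itemContent`` in the example above can be read back by walking the chain of list items. The following Python snippet is a minimal sketch of that traversal, not part of DoCO or its tooling. It assumes the triples are already available as plain Python tuples (here hand-written from a fragment of ``:section-introduction``); the labels ``_:i1``, ``_:i2`` and ``_:i3`` and the helper names ``objects`` and ``ordered_components`` are hypothetical. A real pipeline would more likely load the Turtle with an RDF library.

```python
# Sketch: recover the reading order of a container's components by
# following co:firstItem -> co:itemContent / co:nextItem.
CO = "http://purl.org/co/"

# Hand-written triples mirroring part of :section-introduction.
# "_:i1" etc. are hypothetical labels standing in for the blank nodes.
TRIPLES = [
    (":section-introduction", CO + "firstItem", "_:i1"),
    ("_:i1", CO + "itemContent", ":section-introduction-title"),
    ("_:i1", CO + "nextItem", "_:i2"),
    ("_:i2", CO + "itemContent", ":paragraph-1"),
    ("_:i2", CO + "nextItem", "_:i3"),
    ("_:i3", CO + "itemContent", ":paragraph-2"),
]


def objects(triples, subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def ordered_components(triples, container):
    """List the components of `container` in the order given by its co:List items."""
    order = []
    seen = set()
    items = objects(triples, container, CO + "firstItem")
    while items:
        item = items[0]
        if item in seen:  # stop if malformed data makes the chain loop
            break
        seen.add(item)
        order.extend(objects(triples, item, CO + "itemContent"))
        items = objects(triples, item, CO + "nextItem")
    return order


print(ordered_components(TRIPLES, ":section-introduction"))
```

Applying the same function to ``:paper`` or ``:body-matter`` would return their components in document order, provided the corresponding triples are included in the input.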


Sentences containing references to bibliographic items

Among the various parts of a paper, in-text references to other objects, such as bibliographic references, are of particular interest: describing them makes it possible, for instance, to keep track of the number of times a particular publication is cited within the paper. DoCO allows one to describe all these parts and to link them together.

@prefix : <http://www.sparontologies.net/example/> .
@prefix doco: <http://purl.org/spar/doco/> .
@prefix deo: <http://purl.org/spar/deo/> .
@prefix po: <http://www.essepuntato.it/2008/12/pattern#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix c4o: <http://purl.org/spar/c4o/> .

:sentence a doco:Sentence ;
    c4o:hasContent "For instance, a recent report by Beck [3] explains the requirements for an XML vocabulary of scientific journals to be acceptable for inclusion in PubMed Central." ;
    po:contains :reference-to-3 .

:reference-to-3 a deo:Reference ;
    c4o:hasContent "[3]" ;
    dcterms:references :bibliographic-reference-3 .

:bibliographic-reference-3 a deo:BibliographicReference ;
    c4o:hasContent "[3] Beck, J. (2010). Report from the Field: PubMed Central, an XML-based Archive of Life Sciences Journal Articles. In Proceedings of the International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML. OA at http://dx.doi.org/10.4242/BalisageVol6.Beck01." .

Please cite the source above with the following reference:

Peroni, Silvio (2015): Example of use of DoCO #2. figshare. http://dx.doi.org/10.6084/m9.figshare.1513733
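
Once in-text pointers are typed as ``deo:Reference`` and linked to their targets with ``dcterms:references``, counting how often each publication is cited reduces to grouping the pointers by target. The Python snippet below is a minimal sketch of that idea under stated assumptions: the triples are hand-written tuples rather than parsed Turtle, and ``:reference-to-3-again``, ``:reference-to-7`` and ``:bibliographic-reference-7`` are hypothetical resources added only to make the counts non-trivial. The function name ``citation_counts`` is likewise illustrative.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DEO_REFERENCE = "http://purl.org/spar/deo/Reference"
DCTERMS_REFERENCES = "http://purl.org/dc/terms/references"

TRIPLES = [
    # Pointer taken from the example above.
    (":reference-to-3", RDF_TYPE, DEO_REFERENCE),
    (":reference-to-3", DCTERMS_REFERENCES, ":bibliographic-reference-3"),
    # Hypothetical additional pointers, not in the original example.
    (":reference-to-3-again", RDF_TYPE, DEO_REFERENCE),
    (":reference-to-3-again", DCTERMS_REFERENCES, ":bibliographic-reference-3"),
    (":reference-to-7", RDF_TYPE, DEO_REFERENCE),
    (":reference-to-7", DCTERMS_REFERENCES, ":bibliographic-reference-7"),
]


def citation_counts(triples):
    """Count in-text deo:Reference pointers per referenced bibliographic entry."""
    pointers = {s for s, p, o in triples if p == RDF_TYPE and o == DEO_REFERENCE}
    return Counter(
        o for s, p, o in triples
        if p == DCTERMS_REFERENCES and s in pointers
    )


for target, count in citation_counts(TRIPLES).items():
    print(target, count)
```

Restricting the count to resources typed as ``deo:Reference`` keeps other uses of ``dcterms:references`` in the same dataset from inflating the totals.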