meta {tm}    R Documentation

Meta Data Management

Description

Methods to access and modify meta data of documents, corpora, and repositories. In addition to tm's internal meta data structures, Simple Dublin Core meta data mappings are available.

Usage

## S3 method for class 'Corpus'
meta(x, tag, type = c("indexed", "corpus", "local"))
## S3 method for class 'TextDocument'
meta(x, tag, type = NULL)
## S3 method for class 'TextRepository'
meta(x, tag, type = NULL)
content_meta(x, tag) <- value
DublinCore(x, tag = NULL)

Arguments

x

Either a text document, a corpus, or a text repository.

tag

A character identifying the name of the meta datum.

type

A character specifying which meta data of a corpus should be considered: "indexed", "corpus", or "local".

value

Replacement value.

Details

In general, this function can be used both to display meta information and to modify individual meta data:

x = "TextDocument", tag = NULL

If no tag is given, this method pretty-prints all of x's meta data. If tag is provided, its value in the meta data is returned.

x = "Corpus", tag = NULL, type = "indexed"

This method inspects the type argument, which must be one of indexed (the default), local, or corpus. indexed is a shortcut for accessing document-level meta data (DMetaData) stored at the collection level, either because it forms an entity of its own or for performance reasons (i.e., as a form of indexing, hence the name indexed). local accesses the meta data local to each text document (i.e., meta data stored in the text documents' attributes). corpus is a shortcut for corpus-specific meta data (CMetaData). Depending on whether tag is set, either all meta data or only the meta datum identified by tag is displayed or modified.
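As a rough illustration of the three type values, the following sketch queries the crude corpus shipped with tm. It assumes that tm is attached and that the documents carry a "Topics" meta datum (as used in the Examples section); the variable name local_topics is purely illustrative.

```r
library(tm)
data("crude")

## "indexed" (the default): document-level meta data kept at the collection level
meta(crude)

## "corpus": meta data describing the corpus as a whole
meta(crude, type = "corpus")

## "local": meta data stored with each individual text document;
## here the "Topics" meta datum is collected from every document
local_topics <- meta(crude, tag = "Topics", type = "local")
length(local_topics)
```

Because local meta data lives in each document, a query with type = "local" should yield one entry per document in the corpus.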

x = "TextRepository", tag = NULL

If no tag is given, this method pretty-prints all of x's meta data. If tag is provided, its value in the meta data is returned.

Simple Dublin Core meta data is only available locally at each document:

x = "TextDocument", tag = NULL

Returns or sets the Simple Dublin Core meta datum named tag for x. tag must be either NULL or a valid Simple Dublin Core element name (i.e., title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, or rights). If tag is NULL, all Dublin Core meta data are printed.
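A minimal sketch of setting and then reading back a single Simple Dublin Core element might look as follows. It assumes that tm is attached and works on a copy of the first document of the crude corpus; the names doc and creator are illustrative only.

```r
library(tm)
data("crude")

## Work on a copy of the first document
doc <- crude[[1]]

## Set the Dublin Core element "creator" for this document
DublinCore(doc, tag = "creator") <- "Ano Nymous"

## Read the single element back
creator <- DublinCore(doc, tag = "creator")

## With tag = NULL (the default), all Dublin Core meta data are printed
DublinCore(doc)
```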

content_meta is a convenience wrapper which calls Content if tag = "Content" and meta otherwise.

References

Dublin Core Metadata Initiative. http://dublincore.org/

Examples

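data("crude")
## Meta data of a single document, and its Simple Dublin Core view
meta(crude[[1]])
DublinCore(crude[[1]])
## Read a single meta datum
meta(crude[[1]], tag = "Topics")
## Modify individual meta data
meta(crude[[1]], tag = "Comment") <- "A short comment."
meta(crude[[1]], tag = "Topics") <- NULL
## Set Simple Dublin Core elements
DublinCore(crude[[1]], tag = "creator") <- "Ano Nymous"
DublinCore(crude[[1]], tag = "Format") <- "XML"
DublinCore(crude[[1]])
meta(crude[[1]])
## Corpus: indexed (default) and corpus-specific meta data
meta(crude)
meta(crude, type = "corpus")
## Add an indexed meta datum with one value per document
meta(crude, "labels") <- 21:40
meta(crude)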

[Package tm version 0.5-10 Index]