tm_map {tm}    R Documentation

Transformations on Corpora

Description

Interface to apply transformation functions (also denoted as mappings) to corpora.

Usage

## S3 method for class 'PCorpus'
tm_map(x, FUN, ..., useMeta = FALSE, lazy = FALSE)
## S3 method for class 'VCorpus'
tm_map(x, FUN, ..., useMeta = FALSE, lazy = FALSE)

Arguments

x

A corpus.

FUN

A transformation function returning a text document.

...

Arguments to FUN.

useMeta

Logical. Should DMetaData be passed to FUN as an argument?

lazy

Logical. Lazy mappings are mappings that are delayed until a document's content is accessed. Lazy mapping is useful when working with a large corpus of which only a few documents will be accessed, because it avoids the computationally expensive application of the mapping to every element of the corpus.

Value

A corpus with FUN applied to each document in x. For lazy mappings, only annotations are stored; they are evaluated when an individual document is accessed, which triggers execution of the corresponding transformation function.
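
The idea behind lazy mappings can be illustrated with a minimal base R sketch. This is not how tm implements them: the helpers lazy_map and get_doc and the "pending" attribute are hypothetical names invented for this illustration. The transformation is only recorded when the mapping is requested, and it runs only for the document that is actually accessed.

```r
## Illustration only: base R, hypothetical helper names, not tm internals.
lazy_map <- function(docs, FUN, ...) {
  force(FUN)
  ## Record the pending transformation instead of applying it now.
  pending <- c(attr(docs, "pending"), list(function(d) FUN(d, ...)))
  attr(docs, "pending") <- pending
  docs
}

get_doc <- function(docs, i) {
  ## Apply all pending transformations, in order, to document i only.
  doc <- docs[[i]]
  for (f in attr(docs, "pending")) doc <- f(doc)
  doc
}

docs <- list("Crude Oil Prices", "OPEC Meeting")

## Count how often the transformation is actually executed.
calls <- 0L
counting_lower <- function(d) {
  calls <<- calls + 1L
  tolower(d)
}

mapped <- lazy_map(docs, counting_lower)  # nothing is transformed yet
first <- get_doc(mapped, 1)               # transformation runs for document 1 only
```

After the single access, the counter shows one execution of the transformation and the second document is still stored in its original form.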

Note

Please be aware that lazy transformations are an experimental feature and change R's standard evaluation semantics.

See Also

getTransformations for available transformations, and materialize for manually triggering the materialization of documents with pending lazy transformations.

Examples

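## Load the example corpus shipped with tm
data("crude")
## Apply a built-in transformation (stemming) to every document
tm_map(crude, stemDocument)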
## Generate a custom transformation function which takes the heading
## as new content
headings <- function(x)
    PlainTextDocument(Heading(x), id = ID(x), language = Language(x))
inspect(tm_map(crude, headings))
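
A further sketch, showing how arguments given in ... are handed on to FUN. It assumes the transformation removeWords and the word list returned by stopwords("english"), both provided by tm but not described on this page; the variable name cleaned is chosen only for illustration.

```r
library(tm)
data("crude")
## The third argument is matched by ... and passed on to removeWords
## as its list of words to delete from each document.
cleaned <- tm_map(crude, removeWords, stopwords("english"))
```

The result is again a corpus with one transformed document for each document in crude.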

[Package tm version 0.5-10 Index]