readRCV1 {tm}    R Documentation

Read In a Reuters Corpus Volume 1 Document

Description

Read in a Reuters Corpus Volume 1 XML document.

Usage

readRCV1(elem, language, id)
readRCV1asPlain(elem, language, id)

Arguments

elem

a list with a named component content, which must hold the document to be read in.

language

a string giving the text's language.

id

a unique identification string for the returned text document.

Value

An RCV1Document for readRCV1, or a PlainTextDocument for readRCV1asPlain.

Author(s)

Ingo Feinerer

References

Lewis, D. D.; Yang, Y.; Rose, T.; and Li, F. (2004). RCV1: A New Benchmark Collection for Text Categorization Research. Journal of Machine Learning Research, 5, 361–397. http://www.jmlr.org/papers/volume5/lewis04a/lewis04a.pdf

See Also

getReaders to list available reader functions.

Examples

f <- system.file("texts", "rcv1_2330.xml", package = "tm")
rcv1 <- readRCV1(elem = list(content = readLines(f)),
                 language = "en", id = "id1")
meta(rcv1)
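
The same sample file can presumably be read with readRCV1asPlain, which takes the same three arguments. The following is a minimal sketch, assuming the call pattern of the readRCV1 example carries over unchanged in this version of tm; the variable name plain and the id "id1" are illustrative only.

```r
# Sketch only: mirrors the readRCV1 example, but calls readRCV1asPlain
# with the same documented arguments (elem, language, id).
library(tm)

f <- system.file("texts", "rcv1_2330.xml", package = "tm")
plain <- readRCV1asPlain(elem = list(content = readLines(f)),
                         language = "en", id = "id1")

# Per the Value section, the result should be a PlainTextDocument.
inherits(plain, "PlainTextDocument")
meta(plain)
```

Use readRCV1asPlain when only the text and basic metadata are needed and the RCV1-specific document class is not.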

[Package tm version 0.5-10 Index]