Imports the datasets currently available from the subs2vec project. Use `data("subsData")` to view the list of available models.

Usage

import_subs(language, what)

Arguments

language

The two-letter code of the language you wish to download. Note: some of the `_vec` files are very large and may take a while to download and import. Check the tokens column for an idea of how big the data is.

what

The dataset to download. Options are `subs_vec`, `subs_count`, `wiki_vec`, and `wiki_count`. The `subs` models are based on subtitles and the `wiki` models on Wikipedia data; `_vec` gives the fastText model as a words-by-dimensions table of dimension scores, and `_count` gives the frequency counts for that data.
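Because a mistyped argument may only surface after a long download has started, it can help to check both arguments up front. The sketch below is one way to do that in base R. `check_subs_args()` is a hypothetical helper, not part of the package, and the rule that a language code is exactly two lowercase letters is an assumption based on the description above.

```r
# Hypothetical helper (not part of the package): validates the arguments
# before they are passed on to import_subs().
check_subs_args <- function(language, what) {
  # The four options listed in the `what` argument description
  valid_what <- c("subs_vec", "subs_count", "wiki_vec", "wiki_count")

  # Assumption: a language code is exactly two lowercase letters, e.g. "af"
  if (!is.character(language) || length(language) != 1 ||
      !grepl("^[a-z]{2}$", language)) {
    stop("`language` must be a single two-letter code such as \"af\".")
  }

  # Require an exact match against the documented options
  if (!is.character(what) || length(what) != 1 || !(what %in% valid_what)) {
    stop("`what` must be one of: ", paste(valid_what, collapse = ", "))
  }

  list(language = language, what = what)
}

# Passes silently and returns the checked arguments
args <- check_subs_args("af", "subs_vec")
```

The returned list could then be passed along, for example with `do.call(import_subs, args)`. The check only confirms the shape of the code, not that a model exists for that language; `data("subsData")` remains the place to confirm availability.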

Value

A dataset of either words by dimensions or tokens with their frequency counts.
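As an illustration of what a words-by-dimensions dataset can be used for, the sketch below computes the cosine similarity between two word vectors. It runs on a small toy table rather than real downloaded data. The layout (word in the first column, numeric dimension scores in the remaining columns), the column names, and the values are all assumptions made for the example, and `cosine_similarity()` is a hypothetical helper rather than a package function.

```r
# Toy stand-in for a `_vec` dataset. Assumption: one row per word, with the
# word in the first column and numeric dimension scores in the rest.
# The values are made up for illustration.
toy_vec <- data.frame(
  word = c("kat", "hond", "huis"),
  V1 = c(1, 2, 0),
  V2 = c(0, 0, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: cosine similarity between the vectors of two words
cosine_similarity <- function(vec_data, word_a, word_b) {
  # Drop the word column and index the dimension scores by word
  dims <- as.matrix(vec_data[, -1, drop = FALSE])
  rownames(dims) <- vec_data[[1]]

  a <- dims[word_a, ]
  b <- dims[word_b, ]

  # Dot product divided by the product of the vector lengths
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

sim_same_direction <- cosine_similarity(toy_vec, "kat", "hond")
sim_orthogonal <- cosine_similarity(toy_vec, "kat", "huis")
```

In the toy table, `kat` and `hond` point in the same direction and `kat` and `huis` are orthogonal, so the two similarities sit at the extremes of the range for non-negative scores. If the real dataset is laid out differently, the column selection in the helper would need adjusting.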

Examples

# af_dims <- import_subs(
#   language = "af",
#   what = "subs_vec"
# )