
This function takes a list of tokens and returns a list of candidate pseudowords generated with the Wuggy algorithm. Note: check the returned list, because random generation can produce a real word. Note: this function becomes fairly slow as the word list grows. For one-syllable words, bigrams are used as syllables.

Usage

fake_Wuggy(wordlist, language_hyp, lang, replacewords)

Arguments

wordlist

A list of valid words from which to calculate the frequencies of syllables and transition n-grams.

language_hyp

The language hyphenation file you want to use. Hyphenation files can be downloaded from https://hyphenation.org/, or you can use the ones included in our package under /inst/latex.

lang

The two-letter language code for the hyphenation file you imported.

replacewords

A list of tokens for which you want to create pseudowords.

Value

A dataset of original tokens and suggested pseudowords, with the following columns:

word_id

Numeric ID for each unique word.

first

First syllable in pairs of syllables.

original_pair

Pair of syllables together.

second

Second syllable in the pairs of syllables.

syll

Number of syllables in the token.

original_freq

Frequency of the syllable pair.

replacement_pair

Replacement option wherein one of the syllables has been changed.

replacement_syll

The replacement syllable.

replacement_freq

The frequency of the replacement syllable pair.

freq_diff

The difference in frequency between the original and replacement syllable pairs.

char_diff

Difference in the number of characters between the original pair and the replacement pair.

letter_diff

Number of letters that differ between the original pair and the replacement pair. If the replacement contains the same letters as the original, the difference is zero. Replacements with a difference of zero are excluded as options.

original_word

The original token.

replacement_word

The final replacement token.
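
As an illustration of how the returned columns might be used, the sketch below picks, for each word, the candidate whose syllable-pair frequency is closest to the original. The data frame is made-up toy data that reuses only a few of the documented column names, and choosing by the smallest absolute freq_diff is an assumption about what a user might want, not something the function does for you.

```r
# Toy rows using documented column names; the values are invented for illustration.
candidates <- data.frame(
  word_id = c(1, 1, 2, 2),
  original_word = c("cat", "cat", "tree", "tree"),
  replacement_word = c("cit", "cag", "trel", "bree"),
  freq_diff = c(12, -4, 3, -1),
  stringsAsFactors = FALSE
)

# Sort by word, then by how close the replacement frequency is to the original.
ordered <- candidates[order(candidates$word_id, abs(candidates$freq_diff)), , drop = FALSE]

# Keep the first (closest) candidate per word.
best <- ordered[!duplicated(ordered$word_id), , drop = FALSE]

best[, c("original_word", "replacement_word", "freq_diff")]
```

The same pattern works with char_diff if matching length matters more than matching frequency.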

Examples

# af_wuggy <- fake_Wuggy(
#   wordlist = af_final$sentence, # full set of valid words in the language
#   language_hyp = "../inst/latex/hyph-af.tex", # path to the hyphenation .tex file
#   lang = "af", # two-letter language code
#   replacewords = unique(af_top_sim$cue[1:20]) # words you want to create pseudowords for
# )
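
Because random generation can produce a real word, it may help to screen the result against the valid word list before using it. The following is a minimal sketch with toy data: `pseudo` stands in for the returned dataset (only two of its documented columns are shown) and `valid_words` is a hypothetical word list, so neither reflects real output from the function.

```r
# Toy stand-in for the function output; values are invented for illustration.
pseudo <- data.frame(
  original_word = c("cat", "dog", "tree"),
  replacement_word = c("cot", "dag", "trel"),
  stringsAsFactors = FALSE
)

# Hypothetical list of known real words in the language.
valid_words <- c("cat", "dog", "tree", "cot")

# Flag replacements that match a known word, ignoring case.
is_real <- tolower(pseudo$replacement_word) %in% tolower(valid_words)

flagged <- pseudo[is_real, , drop = FALSE]       # replacements to review or drop
pseudo_clean <- pseudo[!is_real, , drop = FALSE] # replacements not found in the word list
```

A check like this only catches words present in the list you supply, so a manual review of the remaining items is still worthwhile.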