Calculate pseudowords with the Wuggy Algorithm
fake_Wuggy.Rd
This function takes a list of tokens and returns a list of candidate pseudowords generated with the Wuggy algorithm. Note: check the returned list, because random generation can produce a real token. Note: this function becomes slow as the word list grows. For one-syllable words, the function uses bigrams as syllables.
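The bigram treatment of one-syllable words can be sketched in base R. This is only an illustration: `to_bigrams` is a hypothetical helper, not a function from this package, and it assumes overlapping bigrams, which the description does not confirm.

```r
# Hypothetical helper (not part of the package): split a word into
# overlapping two-character chunks. Overlap is an assumption.
to_bigrams <- function(word) {
  n <- nchar(word)
  if (n < 2) return(word)          # too short to split
  substring(word, 1:(n - 1), 2:n)  # characters 1-2, 2-3, ...
}

to_bigrams("kat")

# Count how often each bigram occurs in a small invented word list.
words <- c("kat", "kas", "mat")
bigram_counts <- table(unlist(lapply(words, to_bigrams)))
bigram_counts
```

Counts like these are the kind of frequency information that a larger word list makes more reliable, which is also why run time grows with the list.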
Arguments
- wordlist
A list of valid words from which to calculate the frequencies of syllables and transition n-grams.
- language_hyp
The hyphenation pattern file for the language you want to use. Pattern files can be downloaded from https://hyphenation.org/, or you can use the ones included with our package in /inst/latex.
- lang
The two-letter language code for the language of the imported hyphenation file.
- replacewords
A list of tokens you want to use to create your pseudowords.
Value
A dataset of original tokens and suggested pseudowords.
- word_id
Numeric ID for each unique word.
- first
First syllable in pairs of syllables.
- original_pair
Pair of syllables together.
- second
Second syllable in the pairs of syllables.
- syll
Number of syllables in the token.
- original_freq
Frequency of the syllable pair.
- replacement_pair
Replacement option wherein one of the syllables has been changed.
- replacement_syll
The replacement syllable.
- replacement_freq
The frequency of the replacement syllable pair.
- freq_diff
The difference in frequency of the transition pair.
- char_diff
Difference in the number of characters between the original pair and the replacement pair.
- letter_diff
Number of letters that differ between the original pair and the replacement pair. If the replacement contains the same letters, the difference is zero; such replacements are excluded as options.
- original_word
The original token.
- replacement_word
The final replacement token.
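One possible way to use these columns is to keep, for each original token, the suggestion whose transition frequency is closest to the original. The sketch below uses a toy data frame with invented values and only three of the columns listed above; the selection rule itself is an assumption, not something the function does for you.

```r
# Toy stand-in for the returned dataset; all values are invented.
toy <- data.frame(
  original_word    = c("huis", "huis", "boom", "boom"),
  replacement_word = c("huit", "haus", "boop", "bolm"),
  freq_diff        = c(3, -1, 0, 5),
  stringsAsFactors = FALSE
)

# Sort by absolute frequency difference, then keep the first
# (closest) row for each original token.
ordered <- toy[order(abs(toy$freq_diff)), ]
best <- ordered[!duplicated(ordered$original_word), ]
best
```

Other criteria, such as char_diff or letter_diff, could be added to the ordering in the same way.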
Examples
# af_wuggy <- fake_Wuggy(
# wordlist = af_final$sentence, # full valid options in language
# language_hyp = "../inst/latex/hyph-af.tex", # path to hyphenation.tex
# lang = "af", # two-letter language code
# replacewords = unique(af_top_sim$cue[1:20]) # words you want to create pseudowords for
# )
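Because random generation can produce a real token, it may help to screen the suggestions against the valid word list afterwards. The following is a minimal sketch with invented inputs: `known_words` stands in for the word list and `candidates` for the replacement_word column, and both names are hypothetical.

```r
# Hypothetical inputs, invented for illustration.
known_words <- c("huis", "boom", "boon")
candidates  <- c("haus", "boon", "bolm")

# Compare in lower case and drop any candidate that is a known word.
is_real <- tolower(candidates) %in% tolower(known_words)
pseudowords <- candidates[!is_real]
pseudowords
```

A check like this only catches words present in the list you supply, so a manual review of the remaining candidates is still advisable.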