frequency dictionary created by "Language Reactor"

Where can I find the entire list of words from the frequency dictionary created by “Language Reactor”?

Hi @Grander_Gard,

If you click the settings gear (:gear:) and then “Set vocabulary level”, you’ll get a list of words that may be helpful to you.

Other than that, I can't think of any other word lists. You would need to ask the LR team for the frequency info for a specific language, assuming they are willing to share it, since it goes into their algorithm.

I do know that they pull frequency lists from opensubtitles[.org]:

I hope this is somewhat helpful!


There’s also this page, which lists the top 8000 words by frequency, as lemmatised by the NLP code.

The lemmatisation is done by software libraries and is not perfect. Beyond the top 8000 words, the lists get noisier with incorrect lemmatisations, so we don’t show them.

Word frequency is based roughly 40% on movie subtitles, 40% on YouTube subtitles, and 20% on other web content, something like that. The idea was to focus on spoken language, because, well, speaking with people is fun.
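For anyone who wants to build a similar list themselves, here is a minimal sketch of how a weighted blend like that could work. It is not the actual Language Reactor code: the function names, the toy corpora, and the choice to blend per-corpus relative frequencies are all assumptions for illustration, and the weights are just the approximate figures mentioned above.

```python
from collections import Counter

# Approximate weights quoted above ("something like that"); not exact values.
WEIGHTS = {"movies": 0.4, "youtube": 0.4, "web": 0.2}


def relative_frequencies(tokens):
    """Map each word to its share of the corpus (count divided by total)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def blended_ranking(corpora, weights=WEIGHTS, top_n=8000):
    """Rank words by the weighted sum of their per-corpus relative frequencies.

    Normalising each corpus first keeps a very large corpus from
    drowning out a smaller one before the weights are applied.
    """
    scores = Counter()
    for name, tokens in corpora.items():
        for word, freq in relative_frequencies(tokens).items():
            scores[word] += weights[name] * freq
    return [word for word, _ in scores.most_common(top_n)]


# Toy, made-up corpora (assumed to be lemmatised already), only to show the call.
corpora = {
    "movies": ["go", "go", "see", "friend"],
    "youtube": ["go", "see", "see", "video"],
    "web": ["article", "see", "go", "go"],
}
ranking = blended_ranking(corpora, top_n=3)
```

With real data you would feed in lemmatised tokens from each source and keep the default `top_n=8000` cutoff, since the tail gets noisy as described above.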

The NLP code and the lists should be open-sourced soon.