Agglutinative languages like Japanese and Korean have to be parsed with a morphological analyzer (MeCab, kuromoji, and suika are some existing ones for Japanese). It has already been pointed out that this isn't working properly (most people have noted it for Japanese, but keep in mind it will likely be true for any agglutinative language). I wanted to add that even if the parsing is done correctly, there still needs to be an option to save the base form of a word rather than the conjugation or grammar attached to it.
For an example that works in both Japanese and Korean, 食べなければいけない・먹어야해 means "have to eat", but I don't want to learn "have to eat" as a single word. That would be really inefficient given the number of possible verb and conjugation combinations. Instead, I want to learn the verb "to eat", which is 食べる/먹다, and the grammar for having to do something, ~なければいけない・~아/어야하다, as separate flashcard entries. I'm going through a Korean drama right now, and in just one episode I have more than five entries that are all just different forms of 그만하다. It's not very useful to learn that way.
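To make the idea concrete, here is a rough sketch in Python of the split I have in mind. The Token class, the is_content flag, and the split_cards function are names I made up for illustration, and the token list is written by hand rather than produced by a real analyzer, so the actual segmentation from MeCab or similar may well look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    surface: str      # the text as it appears in the subtitle
    lemma: str        # dictionary (base) form
    is_content: bool  # True for verbs/nouns/adjectives, False for grammar


def split_cards(tokens):
    """Return (vocabulary cards, grammar cards) for one subtitle phrase."""
    vocab, grammar = [], []
    pending = ""  # grammar morphemes collected since the last content word
    for tok in tokens:
        if tok.is_content:
            if pending:
                grammar.append("~" + pending)
                pending = ""
            vocab.append(tok.lemma)  # save the base form, not the conjugation
        else:
            pending += tok.surface
    if pending:
        grammar.append("~" + pending)
    return vocab, grammar


# Hand-written, simplified analysis of 食べなければいけない (an assumption,
# not real analyzer output).
tokens = [
    Token("食べ", "食べる", True),
    Token("なければいけない", "なければいけない", False),
]

vocab, grammar = split_cards(tokens)
print(vocab)    # ['食べる']
print(grammar)  # ['~なければいけない']
```

With something like this, the verb and the grammar pattern could each be offered as their own flashcard, next to the existing option of saving the subtitle as it is.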
I believe this also affects word frequency. Because every conjugated or grammatical form of a word is treated as a separate word, the app has no idea how frequent the underlying word actually is. I've seen this in my coloring, which doesn't match words I know are very common.
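Here is a small Python sketch of the difference, assuming the analyzer can give a base form for each word. The (surface, lemma) pairs are written by hand for illustration and don't come from the app or from any real analyzer.

```python
from collections import Counter

# Hand-written (surface form, base form) pairs, as they might be seen in
# one episode. The lemmas are assumed to come from a morphological analyzer.
observed = [
    ("그만해", "그만하다"),
    ("그만하자", "그만하다"),
    ("그만했어", "그만하다"),
    ("그만하세요", "그만하다"),
    ("먹어야", "먹다"),
]

# Counting surface forms: every conjugation looks like a rare, separate word.
surface_counts = Counter(surface for surface, _ in observed)

# Counting base forms: all conjugations add up under one entry.
lemma_counts = Counter(lemma for _, lemma in observed)

print(surface_counts["그만해"])   # 1
print(lemma_counts["그만하다"])   # 4
```

If the coloring looked up the base-form count instead of the surface-form count, all four forms of 그만하다 would be treated as one common word.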
Overall, in addition to the current behavior of adding the subtitle text as-is to a flashcard, I would suggest that every morpheme have separate options for adding the base word and for adding the grammar or conjugation. The app also needs some way to recognize those different forms as the same word in its coloring and frequency algorithms.