Small QoL recommendations

Fantastic extension! Really glad I subbed.

It’s great out of the box, but there are just a few things that might improve the experience a bit.

It’d be really nice to have a shortcut to replay the current subtitle, instead of just forward and back. (I guess a workaround is just hitting both arrows, so this isn’t a showstopper.)

I’d like to be able to re-hide subtitles after my mouse accidentally drifts over them (maybe they are blurred again after the mouse moves away? Or an option to make them visible only on a specific keypress and not on mouseover?).

I’d also like some way to import lists to mark words different colors. And an ability to export just the words that have been marked. (Currently if I go to saved items > green words > export, I get all the subs those words appear in, which is great usually, but sometimes I really just want the wordlist.)

With Korean and other highly inflected languages, being able to identify roots that take different endings would be really useful. I’m sure it would be complicated though. If you know “run” maybe that includes knowing “running” and “ran,” but Korean has some long compounds built off of words with endings and I’m not sure where the line would be. German is another tough one, less stacking of particles and more just massive compound words.

Agree with the other comments: subs2srs + morphman support or emulation would be a god feature. You might not be able to let people grab AV from Netflix for legal reasons (definitely be careful of that so the extension stays up!). But if the exports included timestamps for the start and end of saved subs, it might be a good halfway step (people might be able to get the rest of the way with some other tools or methods, I don’t know). Might be easier to implement in the short run too.
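To make the timestamp idea concrete, here is a rough sketch of what such an export could look like. Everything in it is an assumption for illustration: the field names (`text`, `translation`, `start_ms`, `end_ms`), the sample timings, and the tab-separated layout are made up and are not the extension's actual export format.

```python
import csv
import io

# Hypothetical saved subtitle; the field names and timings are invented.
SAVED = [
    {"text": "먹고 싶어요", "translation": "I want to eat",
     "start_ms": 61500, "end_ms": 63250},
]

def to_timestamp(ms):
    """Format milliseconds as H:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"

def export_tsv(items):
    """Write one tab-separated row per saved subtitle, with start/end times."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["text", "translation", "start", "end"])
    for item in items:
        writer.writerow([
            item["text"],
            item["translation"],
            to_timestamp(item["start_ms"]),
            to_timestamp(item["end_ms"]),
        ])
    return buf.getvalue()

print(export_tsv(SAVED))
```

With start and end times in each row, an outside tool could in principle line the saved subs up with audio the user already has, without the extension touching the video itself.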

Thanks for all you’re doing, great extension.

Oh, also, the ability to add alternate or suggested translations for words. Maybe one of the dictionaries could be “User suggestions.”

검사님 comes out as “check +” on hover. The show’s about prosecutors, they clearly mean prosecutor here, I’d like to just add that translation over the top for this show. Names too, a lot of names get translated literally, I’d like to just write in the romanization for them so it’s easy not to confuse them with other words.

Not sure where you pull the machine translated subtitles from. But the community might even help you gradually improve the machine translated subtitles this way.

Don’t you just hit the down arrow key to repeat the current sub?

Yeah, I also realized ‘s’ can repeat the current one, but it doesn’t re-hide the subtitle, so I still end up mashing back and forward together. That one’s not a showstopper really.

Importing/exporting lists of words to mark,
Identifying roots of words or word families,
Adding suggested translations,
and more ways to create decks for Anki would be incredible.

Hey, sorry for the late reply. Glad you like the extension.

Or an option to make them visible only on a specific keypress and not on mouseover?).

You can use the ‘e’ key, that should work, but there’s no way to deactivate the mouse expose thing…

import lists to mark words different colors

We’re working on something that I think will solve your problem, more or less.

sometimes I really just want the wordlist

There are lots of ways to export data. I’ll make a little user guide, since there’s not much info currently.

subs2srs + morphman support

Yeah, bit of a dilemma, don’t want to poke the lion too much. Maybe we can see if there’s some really good TTS. Hoping to add some i+1 type feature soon.

Oh, also, the ability to add alternate or suggested translations for words. Maybe one of the dictionaries could be “User suggestions.”

That could be nice. We considered it for translations, just didn’t have time to implement it.

“check +” on hover

I made a quick fix, will still need to find time to work on Korean dict a bit more (with compound stuff).

Thanks for feedback.

If you’re working on Korean dictionary support, specifically separating out roots from endings, I’m looking to see if I can integrate this in one of my projects: https://konlpy.org/en/latest/

Seems really promising.

Actually, we already have words broken down into roots… but I’m not sure how to handle them. Korean is a bit unique here, it’s problematic for the dictionary and for the word frequency feature.

필요했다

필요


Korean has so many endings, and blurs the line between “grammatical ending” and “compound word.” Could be a nightmare to process. One popular Korean learning resource, “Korean Grammar in Use” is a three book series all about grammatical endings.

So, after thinking about it a hot minute, probably way less than your team, I can imagine a couple approaches:

The “simplify everything” approach: just take the roots and ignore all the endings.

So you’d take this subtitle:

나는 네가 먹고 있는 것을 알았어 (“I knew you were eating”)

and just process it like this:

나 - 너 - 먹다 - 있다 - 것 - 알다

Meaning, if I’ve marked 먹다 (“to eat”) as highlighted, it will mark this form, 먹, as well. Maybe show the root on hover so I understand what’s going on as a user.
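A minimal sketch of that root-only matching, just to show the idea. The (surface, root) pairs are hand-written for this one subtitle; a real system would get them from a morphological analyser, and the function name `highlight_by_root` is hypothetical.

```python
# Hypothetical pre-analysed subtitle: (surface form, dictionary form) pairs.
# Hand-written here so the sketch runs on its own, not analyser output.
SUBTITLE = [
    ("나는", "나"),
    ("네가", "너"),
    ("먹고", "먹다"),
    ("있는", "있다"),
    ("것을", "것"),
    ("알았어", "알다"),
]

def highlight_by_root(tokens, marked_roots):
    """Return (surface, root, is_marked) per token, ignoring endings entirely."""
    return [(surface, root, root in marked_roots) for surface, root in tokens]

marked = {"먹다", "알다"}  # roots the user has marked
result = highlight_by_root(SUBTITLE, marked)
for surface, root, is_marked in result:
    flag = "*" if is_marked else " "
    print(f"{flag} {surface} (root: {root})")
```

The root is kept next to each surface form so the UI could show it on hover.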

Pros:

  • simplifies processing

  • this is how most people consciously parse sentences anyway, with grammar processing mostly happening instinctively (MIA and AJATT are big proponents of this philosophy)

Cons:

  • Controversial, a lot of people like treating grammar as a first class citizen in language learning

  • False positives: The system could insist a learner knows words that look very strange, that have changed dramatically by the addition of many stacked particles

  • Inconsistent with how you probably handle other languages, and for scalability you probably don’t want too many special cases.

The “completionist” approach: let people highlight grammatical principles too, treat those as separate words.

So if the system encounters 먹고 싶어요 (“I want to eat”)

It breaks it down to 먹다 (“to eat”) + -고 ("-to X") + 싶다 (“to want”) + -아/어요 (“present tense, polite style”).

IFF I’ve tagged both 먹다 AND -고, then the system would highlight “먹고” fully.

If I’ve tagged 먹다 BUT NOT -고, it will only highlight half the word.

If I’ve tagged -고 only, then it will highlight that part only.

In general it will treat roots and endings as separate words.
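Here is a rough sketch of the three cases above, assuming the engine already hands back each word as a list of (surface piece, tag key) pairs. That breakdown is hand-written here for 먹고 싶어요, and `highlight_pieces` is a hypothetical name. Brackets stand in for highlighting.

```python
from itertools import groupby

# Hypothetical analysis of 먹고 싶어요: one list of
# (surface piece, tag key) pairs per word. Hand-written for illustration.
WORDS = [
    [("먹", "먹다"), ("고", "-고")],
    [("싶", "싶다"), ("어요", "-아/어요")],
]

def highlight_pieces(words, tagged):
    """Bracket each run of tagged pieces; leave untagged pieces bare."""
    rendered = []
    for word in words:
        parts = []
        # Merge neighbouring pieces that share the same tagged/untagged state.
        for is_tagged, group in groupby(word, key=lambda p: p[1] in tagged):
            text = "".join(surface for surface, _ in group)
            parts.append(f"[{text}]" if is_tagged else text)
        rendered.append("".join(parts))
    return " ".join(rendered)

print(highlight_pieces(WORDS, {"먹다", "-고"}))  # root and ending both tagged
print(highlight_pieces(WORDS, {"먹다"}))         # root only
print(highlight_pieces(WORDS, {"-고"}))          # ending only
```

Roots and endings go through the same lookup, which is the "treat them as separate words" part.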

Pros:

  • Accuracy

  • Going back and looking at the subtitles, it seems like your system is already trying to break down words into their components, so you may have a head start on this.

Cons:

  • Complexity. The engine for breaking words down will need a lot of tuning, including maybe just hard coding a lot of corner cases.*

There might be a third way? You guys have moved crazy fast tackling a lot of languages, when a developer could lose years figuring out any one of these, so I’m impressed and sure you’ll find a good path.

Good luck, thanks for building this, definitely worth the subscription. If you need any help testing anything let me know.

Thomas

* Here’s a tough case:

물어보다 is the commonly used word for “to ask.” It’s really 묻다 (“to ask”) plus the -아/어보다 ending, which means “to try to do.”

Nobody says 묻다 (“ask”) because it sounds way too much like 물다 (“to bite”) when conjugated, and you really don’t want people to misunderstand when you say you want to ask them something.

So… do you tag 물어보다 as 묻다 + some ending? I think you just hard code that one as one word, because that’s how people think of it, and how it’s listed in common vocab lists.

좋아하다 (“to like”) is another really common one of these. Technically 좋다 plus an ending, but nobody seems to think of it like that.
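The hard-coding could be as small as an exception list that gets checked before any breakdown happens. A sketch under made-up names: `LEXICALIZED`, `DECOMPOSED`, and `taggable_units` are all hypothetical, and the breakdowns are typed in by hand rather than produced by a real analyser.

```python
# Hypothetical exception list: compounds learners treat as a single word.
LEXICALIZED = {"물어보다", "좋아하다"}

# Hypothetical breakdowns a generic analyser might produce (hand-written).
DECOMPOSED = {
    "물어보다": ["묻다", "-아/어보다"],
    "먹어보다": ["먹다", "-아/어보다"],
}

def taggable_units(dictionary_form):
    """Return the units a user can tag, checking the exception list first."""
    if dictionary_form in LEXICALIZED:
        return [dictionary_form]  # keep as one word, skip the breakdown
    return DECOMPOSED.get(dictionary_form, [dictionary_form])

print(taggable_units("물어보다"))  # stays whole
print(taggable_units("먹어보다"))  # root + ending
```

The same ending gets split off 먹어보다 but not 물어보다, so the exception list is what decides where the line is.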