Youtube Extension - Now Online!

Hi! Thank you so much for your work.
I make videos in which I speak Japanese.
Is it possible for me to fix the English translation?

Hi! I am a great fan of LLN. Thank you so much for offering this to the world!

I had been waiting for years for LLN to exist, ever since discovering the movie theaters in Bern (Switzerland), which, for a decade or two, have been displaying movies with three subtitles at once: the original version (usually English), the German translation, and the French translation.
I had always dreamed that my DVDs could do the same.
And there you come, with your double subtitles! Thanks, guys!

So, now, I am thrilled: I am testing LLY on this 2015 TEDx video (Bill Gates talking about a possible viral outbreak):

TEDx videos are great because they usually come with several sets of human-made subtitles. So here I am, watching most of the video with English human-made subtitles, and a few parts with “Français” subtitles, and then one more time with “Français Canada” subtitles (especially a fragment where Bill Gates was using the word “equity” in too puzzling a way to me, poor random Frenchman).
I wish I could display its English human-made subtitles AND its “Français Canada” human-made subtitles at the same time, but I have not figured out how to do this yet. Did I miss anything?

This was discussed here just a moment ago. It’s on the TODO list.
Only machine translation is available at the moment. I’m not affiliated with the extension.

Workaround in the meantime: Install the “DualSub” extension and select English there and French in LLY, or the reverse order if you study English. It will get you what you want.

LLN works ok with Filipino but it doesn’t work in LLY (pretty much like the Arabic situation I described before).

Suggestion: Sometimes the subtitle language is mislabeled on Youtube (or 2 languages might be in the same subtitle). It would be useful to allow the user to personally set the correct language of the transcript in such a case.

Consider also that Youtube’s automatic speech-language detection fails if the speaker starts the video by saying a few greeting words in one language, for the first few seconds, and then switches to another language for the rest of the video.

Another reason why that might be useful.

EDIT: I just saw it’s in the todo list. Looking forward to it!

Thanks a LOT for this extension! I’m a Spanish language tutor and I create videos in Spanish with carefully synced subtitles for my students, so this is just what I needed.

Question: Besides using machine translation, can LLY load a second subtitle track already in the video? For example, for this video about why we should give up shaking hands, I carefully crafted both the Spanish subs and English subs (which are a rather literal translation so they match the Spanish subs closely), so it would be great to use the English subs I uploaded instead of the machine translation:

Hey. So we think we have fixed the issues with Arabic; the new code should be live in 24hrs or so. I looked into automatically excluding ‘named entities’ from the coloring system. It’s possible, but the lib I was looking at (stanza) is computationally extremely heavy, so I will have to see if there’s an alternative. Og is looking at some of the issues you mentioned around punctuation and saving words, but it might have to wait till the next time we revisit that code. Adding stats, drag subtitles (or other solution), and dict hover/pause behaviour to the TODO list. Phrase detection (phrasal verbs, compound nouns etc.; will check out Reverso) and word frequency were already on my mental todo list. :slight_smile:

“you can mark the same exact word with two different colors at the same time”

Example: ‘Can I open this can of soup?’ - the first ‘can’ is recognised as a verb, the second as a noun, so they should be saved separately. Also, the lemma form is saved, not the form in the text. Of course, it could be that you have found more broken code. :open_mouth:
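
A minimal sketch of why the same surface word can end up with two colors, assuming saved words are keyed by lemma plus part of speech. The `SavedWords` class, the color names and the hand-supplied tags are all hypothetical; the extension's actual storage is not shown in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class SavedWords:
    """Hypothetical store: one color per (lemma, part-of-speech) pair."""

    colors: dict = field(default_factory=dict)

    def mark(self, lemma: str, pos: str, color: str) -> None:
        # The lemma is stored, not the form seen in the subtitle.
        self.colors[(lemma.lower(), pos)] = color

    def color_of(self, lemma: str, pos: str) -> str:
        # Assumption: anything not saved yet is shown as white.
        return self.colors.get((lemma.lower(), pos), "white")


# 'Can I open this can of soup?' with tags supplied by hand for illustration.
store = SavedWords()
store.mark("can", "VERB", "green")   # first 'can'
store.mark("can", "NOUN", "yellow")  # second 'can'
```

Because the key includes the part of speech, marking the verb does not touch the noun, which would look like "two colors for the same word" in the subtitles.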

Hi. :slight_smile: I’m glad you like it. We were hoping teachers on Youtube would find it useful. Using a second Youtube track as a translation is possible (we do this on Netflix with our other extension), we’ll add this feature back, but it might take some weeks before it’s ready. btw, I tried your video, setting the Youtube subtitles to Spanish, and using machine translation from the extension, the results were good. They are nice videos.

Our extension tries to set the native-language subtitles for the video you are watching on Youtube. The way it does this is to check whether ‘ASR’ (automatic speech recognition/auto captions) is available for that video and then, using that language, look for ‘human’ subtitles. This is not always possible though. Youtube doesn’t make it easy to detect the ‘native’ subs. For your video, Youtube set English subtitles. As no ASR track was available, LLY couldn’t detect the native language. You could include some small data in your video description which the extension could read, or we could keep some data in our database about your videos and which caption language to set. If we keep some data in our database, we could add special features for your video (subtitles with highlighted words… links for pdf downloads… etc.). If you have some feature requests, I’m interested to hear them and help. We even thought a little about making a kind of ‘Netflix for language classes’, hosting videos (without ads) ourselves.
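
The detection described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Track` type and `pick_native_track` are hypothetical names, and falling back to the ASR track when no human track matches is a guess, not something confirmed in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Track:
    lang: str     # language code, e.g. "es"
    is_asr: bool  # True for auto-generated captions


def pick_native_track(tracks: List[Track]) -> Optional[Track]:
    """Use the ASR track's language as the 'native' language, then prefer
    a human-made track in that language."""
    asr = next((t for t in tracks if t.is_asr), None)
    if asr is None:
        # No ASR track published: the native language cannot be inferred.
        return None
    human = next(
        (t for t in tracks if not t.is_asr and t.lang == asr.lang), None
    )
    # Assumption: fall back to the ASR track if no human track matches.
    return human if human is not None else asr
```

A video with only human-made English and Spanish tracks returns `None` here, which matches the situation described for the Spanish video.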

Hello. :slight_smile: We only support using machine translation for the second subtitles at the moment, but I’ll soon add a feature to use a different Youtube subtitle track as the second subtitle.

@Jerome_Poirrier Glad you like the extension, I think this is your request too.

Thank you very much for your reply! Yes, if I could fix the machine translation a bit, it would be perfect! The machine translation is already quite accurate!

Hi. Here is an example of a Tagalog/Filipino language channel with dual caption support. However, LLWY seems to be bugged and/or not functional for Filipino! I know Filipino is a low-support language on most platforms, so I have linked an example channel in order to make things easier for LLWN staff or volunteers! Thank you!

This is going to be SOOOO great. How exciting!
In advance, thank you so much !

Hi, David. YouTube did create an ASR track for my video, and it’s in Spanish. I chose to unpublish those tracks because they’re low quality, and unfortunately after unpublishing them there’s no way to re-enable them in the new YouTube Studio. (I guess I could delete my manually created subs and then YT would re-enable the automatic ones?). Old YouTube had more language options, such as changing the language of the original video description. Those are gone now.

Maybe I could add the hashtag #Español or #Spanish to the videos that are in Spanish?

The only really useful “feature request” I can think of right now is… A mobile app. But I guess you first need to make sure there’s a market for the desktop extension.

I place emphasis on ear training, so I believe transcription is a good exercise. Until recently I used to recommend downloading either WorkAudioBook or LAMP (Lingual Media Player), but they’re Windows only, LAMP is no longer being developed, and only the most tenacious of my students could go through the trouble of installing them. So again, a big THANK YOU for this extension.

I plan to create videos in 10 different levels:
Level 1 would be for students who know the top 1,000 word forms in Spanish. I’ll carefully write the videos so that the student doesn’t need to use the dictionary for more than 2% of the words. That is, I’ll keep unique new words at 2%. I’ll probably aim at a 120 WPM speech rate for these videos, as this is the lower limit of normal speech in Spanish.

The video you just saw is a Level 10 video: It assumes you know the top 10,000 word forms, and the speech rate is 190 WPM, the upper limit of “normal” speech (I checked a video by a famous Chilean YouTuber and it’s 249 WPM, so 190 WPM is still way below the speed of “fast” speech).

Of course, somebody has to take you from zero to level 1, then to level 2, etc., so I’m also writing the first episode of a Spanish course based on word frequency. Season 1 will teach you the top 1,000 words in Spanish. I find frequency lists both useful and frustrating. These ARE the words that give you the most “bang for your buck”, but they’re often presented without any meaningful context. Or even worse, they’re sometimes presented as lemmas and you’re in the dark about the many different forms. So my plan is to fix this.

I am working on my own word lists. I combined the per-million frequency of a subtitle-based list and a text-based list, but I’m not blindly following the resulting data. I am applying common sense to it: For example, most plurals are formed by just adding -S or -ES, so there’s no need to treat them as separate words. Verbs, however, are a different story. Many times I’ve seen intermediate students not able to recognize a new form of a verb they already “know”.

Oh, now I remember a useful feature I’ve been thinking about for a long time:

Give the student a place to type what he hears, and after revealing the original subtitles, show the student where his mistakes are. Punctuation and even diacritics could be omitted to make it less difficult. Or maybe have a switch to turn diacritics checking on/off.

After the student finishes with the video (or after he decides he’s done), he could be shown his error rate (or success rate?) as a percentage. If the student is signed in, maybe he could compare today’s error rate with the previous days. This could keep the student motivated, and could also serve as a way to know if a certain level (or a certain show, or a certain YouTuber) is right for you (You should fully understand around 98% of what you hear for it to be “extensive listening”, or 90% if you feel you can go for “intensive listening”. Anything below 90% is just painful).
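
A rough sketch of how such a dictation check might work, using Python's standard `difflib` to align the typed words with the subtitle words. The function names and the word-level error rate are assumptions made for illustration; note that with diacritics checking off, this sketch also folds `ñ` to `n`.

```python
import difflib
import unicodedata


def normalize(text, check_diacritics=False):
    """Lowercase, drop punctuation, optionally drop diacritics; return words."""
    text = text.lower()
    if check_diacritics:
        text = unicodedata.normalize("NFC", text)
    else:
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Every Unicode punctuation character (category P*) becomes a space.
    text = "".join(
        " " if unicodedata.category(c).startswith("P") else c for c in text
    )
    return text.split()


def check_dictation(typed, reference, check_diacritics=False):
    """Return (mistakes, error_rate) for one transcription attempt."""
    want = normalize(reference, check_diacritics)
    got = normalize(typed, check_diacritics)
    matcher = difflib.SequenceMatcher(a=want, b=got, autojunk=False)
    # Each mistake is (kind, expected words, typed words).
    mistakes = [
        (tag, want[i1:i2], got[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    correct = sum(block.size for block in matcher.get_matching_blocks())
    error_rate = 1 - correct / len(want) if want else 0.0
    return mistakes, error_rate


mistakes, rate = check_dictation("donde esta la bano", "¿Dónde está el baño?")
```

Here the error rate is the share of subtitle words the student did not reproduce, so it could be stored per session and compared across days.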

This is exactly what I did for 3 months, 2 hours each day, with Mandarin as preparation for the HSK 5 test. But I did this all manually. I painstakingly downloaded videos, subtitles, mp3s with texts and used LAMP and WorkAudioBook to transcribe with paper and pencil. I used squared paper I designed myself so I could easily count characters and mistakes, and I kept a record of my progress in a spreadsheet.

EDIT: Keeping a database of the words the student has been able to type without mistakes could be a fantastic way to predict which shows, episodes, or videos are “comprehensible” for you. For example, if a video has 1,000 words and the system knows you have successfully transcribed most of them, and there are only 20 unique words you haven’t transcribed yet, it could be labelled as “Comprehensible”, “Recommended”, or “Extensive listening”. If there are 100 unique words you haven’t transcribed yet, the video or episode could be labelled as “Intensive listening”. Anything above 10% unique new words could be labelled as “Not recommended”, or just don’t add a label to it.
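
The labelling idea could be sketched like this. The cut-offs at 2% and 10% are taken from the figures above (20 and 100 unique new words in a 1,000-word video); treating everything between them as "Intensive listening" is an assumption, and `label_video` is a hypothetical name.

```python
def label_video(words, transcribed, extensive_max=0.02, intensive_max=0.10):
    """Label a video by the share of unique words not yet transcribed.

    words:       every word in the video's subtitles, in order
    transcribed: words the student has already typed without mistakes
    """
    if not words:
        return None
    known = {w.lower() for w in transcribed}
    unknown = {w.lower() for w in words} - known
    # Unique new words relative to the total word count of the video.
    ratio = len(unknown) / len(words)
    if ratio <= extensive_max:
        return "Extensive listening"
    if ratio <= intensive_max:
        return "Intensive listening"
    return "Not recommended"
```

The thresholds are keyword arguments so that they could be tuned per student rather than fixed.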

Building a database based on words the student has actually been able to hear and transcribe successfully is a lot more honest, fun, engaging and useful than having the user play whack-a-mole by clicking on words he thinks (often mistakenly) he “knows”. (Looking at you, LingQ)

Yet another EDIT: Something like what I described above would be fantastic not just for language learners, but also for people looking to improve their orthography, or for parents looking to improve their kids’ orthography. I know I would use it with my niece, because she has terrible spelling.

Eagerly waiting. Thanks a lot for taking care of that so quickly! :slight_smile:

Edit: Yup, it looks like the update just got pushed a few moments ago. It seems to be working. So far so good! :smile:

Sorry, not a full response (long day today): if you add #es to the end of the title (‘es’ is google translate language code for Spanish), I’ll add the code to automatically choose the Spanish track in the next couple of days. :slight_smile:
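
For illustration, reading such a tag from the end of a title might look like the sketch below. The accepted tag shape (two or three letters, optionally followed by a region such as `zh-CN`) and the name `language_from_title` are assumptions; the extension's real parsing code is not shown in this thread.

```python
import re
from typing import Optional

# Assumed convention: a "#xx" language tag as the very last thing in the title.
_LANG_TAG = re.compile(r"#([A-Za-z]{2,3}(?:-[A-Za-z]{2,4})?)\s*$")


def language_from_title(title: str) -> Optional[str]:
    """Return the trailing language tag in lowercase, or None if absent."""
    match = _LANG_TAG.search(title)
    return match.group(1).lower() if match else None
```

A longer hashtag such as `#Spanish` does not match this pattern, so ordinary hashtags in titles would be ignored.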

Thanks! Done already.

An issue that happens often. It actually happens a lot more quickly without my screen capture.
It happens with words that have an apostrophe.

Error that pops quite frequently (even without saving any word) and which is quite bothersome.
You have to click the button to continue each time, and it can appear every few seconds even (frequency changes).
If you save words rapidly, it would probably be easier to reproduce, but it happens even without that. Clicking on a line to jump/replay from the sidebar might also produce the error.
It seems to happen kinda randomly though, generally speaking, without doing any particular action (pausing and playing might also be linked to that).

I have a problem with my payment information but I’m unable to update it:

I think I might have reached the max database size. I’m unable to save new words to the database with the color coding system, as the change doesn’t persist. You are able to change the color, but once you hit F5 the color change is lost and the word goes back to white (this only applies to new white words). A color change to an already existing value persists after refresh.

Here is a short demo demonstrating both behaviors:

Starting position: ‘son’ (green) | ‘hymne’ (white)

Color change persists only for ‘son’.
F5 is pressed after every change.

Issue seems to happen across all languages I study.
If I unmark ‘son’, I would be able to mark ‘hymne’.

Whatever the limit is, it needs to be raised a lot higher.

This is another reason why stats (word counts) need to be implemented, as I could easily have been unaware that my progress isn’t being saved. Counters that you can easily monitor would increase confidence in the system and alert you in case there is an issue such as this.

Suggested dictionaries to add for Hebrew: