Transcribing/Community subs

Good catch! It might be the model the ASR is trained on. I've noticed something similar for Korean: the word the subtitles pick is sometimes different from the word that was actually said, either because it makes more sense to the model or because it sounds similar to the word used. The ASR subs are not perfect. It looks like they show whatever the underlying model believes fits best, at least from my understanding.
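
Roughly what I mean, as a toy sketch (this is not their actual pipeline; the candidate words, the scores, and the 0.5 LM weight are all made up): a typical ASR decoder ranks acoustically similar candidates by a mix of "how it sounded" and "how likely the word is in context", so a common word can beat the word that was really said.

```python
# Toy illustration of shallow-fusion-style scoring in ASR decoding.
# All numbers and words are hypothetical, not from any real system.

LM_WEIGHT = 0.5  # assumed language-model weight; real systems tune this

# Sound-alike candidates for one audio segment:
# (word, acoustic log-prob, LM log-prob given the previous words)
candidates = [
    ("존대", -1.2, -6.0),    # what was actually said, but rare in context
    ("존댓말", -1.4, -2.0),  # sounds similar and is far more common
]

def combined_score(acoustic: float, lm: float) -> float:
    """Rank by acoustic evidence plus a weighted LM prior."""
    return acoustic + LM_WEIGHT * lm

best = max(candidates, key=lambda c: combined_score(c[1], c[2]))
print(best[0])  # -> "존댓말": the LM prior outweighs the small acoustic edge
```

With those made-up numbers, the more common word wins even though the other one matched the audio slightly better, which is exactly the "it makes sense to the model" effect.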

This shows the amount of data (the datasets) their ASR model has access to in each language, I believe. Someone please correct me if I'm wrong here (and add to this if you have more info! :wink:)