Thanks for the reply,
This is correct for most situations, but as I understand it, they offer a premium feature that creates subtitles for videos that have none.
That is, a computer algorithm works out what is being said and turns it into text. So with this live speech recognition they are, in effect, generating subtitles and distributing them to users for profit, for shows they don’t own and probably don’t have permission to subtitle.