18 Sep 2013

Bidirectional Text in WebVTT

Recently we had to implement the Unicode Bidrectional Algorithm's Paragraph Level steps in the JS WebVTT parser. What this basically boils down to is being able to detect the text directionality of the first paragraph in a WebVTT cue so that we can set the appropriate CSS and do the appropriate positioning for a cue's CSS box on the video.

The reason for this is that in the current spec text directionality is tied to the anchor position for other values such as the position cue-setting. For example:

  00:01.000 --> 00:02.000 position:25% align:end

Would have text aligned to the left (because the end of text flowing right to left is the left) and positioned 25% of the width of the video over from the right side of the video (because our anchor point is the right side of the video since the text flows right to left).


  00:01.000 --> 00:02.000 position:25% align:end

Would have text aligned to the right side of the video and be positioned 25% of the width of the video from the left edge.

Confusing thing is that this doesn't apply to vertically positioned cues in the current spec. Which doesn't quite make sense to me.

This might be removed in the coming changes to the spec. Following along on the bug to redesign the positioning mechanism there is a push to separate the text alignment from text directionality and I'm hoping there will be one to separate the position cue-setting from text directionality as well since in my opinion it's confusing to have to flip the reference point for it when dealing with different text directionalities. Position should be about setting the offset of the box, not of the text.

We went through a number of possible solutions to solve this problem the first of which was using Twitter's JS implementation of the Unicode Bidrectional Algorithm. Which seemed to be the canonical version of the JS implementation. However, we had a problem with this. The Twitter implementation requires you to know the locale beforehand as you need to generate a JS file, with all the necessary code, for a particular locale. You can't separate the code from the locale or have a general version for detecting all text directionality in all languages. The other problem was that this is a complete implementation of the algorithm and spec. We only need a very small part.

Going though other options—using Gecko's was not an option as it's not exposed to chrome JS yet and we don't really see a need to do it—it seemed like the best thing would be to implement it ourselves since it's so small.

To do this I compiled a list of all strong RTL characters by modifying the Unicode Data Table and just made a small function to loop through the characters in the cue's first paragraph and if we find a strong RTL character then we know the text is RTL. It's interesting since this isn't actually what the spec wants us to do. The spec wants us to return on the first strong directional character whether it is LTR or RTL, but :smontagu commented that this might actually be better for WebVTT's implementation which surprised me. He also said another surprising thing to me which was that algorithms like this are not always 100% correct—a thought which had never occured to me before... *ponders*.

If you'd like to check out the code that does this you can check it out here in vtt.js.

- Tagged in mozilla, seneca college, open-source, and cdot.

comments powered by Disqus

Wow coding Copyleft Richard Eyre 2013

Code for this site can be found on GitHub.