One of the largest structural changes in the broadcast market in recent years has been the increase in premium live content offered by all sectors of the industry, whether traditional linear broadcast or, more recently, streaming services. Sports has been the biggest beneficiary of this trend, with a combination of decreased production costs and increased audience interest initially piqued by the pandemic, but also sustained and growing since then.
Critically, there is a pronounced global element to this. Where existing deals allow, rights holders and broadcasters have been able to target genuinely global audiences, both tracking diaspora audiences and activating national interest, along with the accompanying advertising revenue, on local stages. The massive audiences for cricket in the USA and the way fans can follow popular players in soccer leagues around the world are both good examples.
This has enabled an expansion into Tier 2 and Tier 3 sports, helped drive the expansion of women’s sports around the world, and opened the door to increased coverage of niche sports that would once have struggled to find an audience large enough to make broadcasting them economically viable.
However, a large amount of content is still being left unmonetised because of a lack of localisation, mainly when it comes to subtitles. Sports commentary has proved almost impossible to translate in real time, as it can be as fast as the action, with words-per-minute rates much higher than most other forms of content. Lag in the subtitles is not an option either. Latency between primary media and social platforms has already proven to be a huge issue with consumers, and any subtitles that reference action that took place five seconds or so ago on the same screen are going to create a jarring disconnect for audiences.
The key to engaging audiences and fully monetising sports content is therefore to apply some of the new advances in machine translation to streaming services. This combination gives audiences the content they want to watch and the commentary and context they need to fully engage with it. However, there are notable challenges that need to be overcome.
At XL8 we are working with some of the world’s major sports leagues to crack the problems of real-time machine translation with regard to sports commentary, and we have identified the following areas that are essential to its successful operation.
Accuracy is, of course, the prime metric for any translation service, whether driven by machine learning or not. Our technology follows the Large Language Model route, training the service on existing datasets, and we find that the larger the dataset, the more accurate the end results. Typically that means the best translations are from English to other languages, with Latin American Spanish being particularly strong, as well as Brazilian Portuguese, French, and German. Pairs of closely related languages are also reliable, so we see good performance between Asian languages, between Scandinavian languages, between Finno-Ugric languages, and so on. Here we can see translation accuracy of up to 96%, which rivals the best of human translation.
Other languages are more challenging. English to Korean, for example, currently comes in at only around 75% accuracy, but even that comparatively poor performance is as good as the best language pair was only a few years ago. Progress is rapid and occurring in many areas. We’ve recently implemented the ability to go from Turkish directly to Spanish without pivoting through English first, for instance.
Sport-specific vocabulary
A key challenge is implementing sports-specific lexicons and ensuring the models understand some of the words they are going to hear or, to be more accurate, can successfully predict the word that they expect to occur in that place in a sentence. For example, we have seen UK-produced subtitles of an NFL game substitute ‘pond’ for ‘punt’; the translator was not aware of the term and made a best guess. Cricket has ‘googly’, basketball has ‘layup’, cycling has ‘chapeau’, rugby has its ‘hookers’. The list is a long one and needs to be accommodated. Naturally, the same consideration has to be extended to player and team names as well, which can often have an international dimension and pose pronunciation challenges for commentary teams.
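To make the ‘pond’ versus ‘punt’ problem concrete, here is a minimal sketch of one common way a lexicon can influence word choice: boosting the score of candidate words that appear in a sport's glossary. It is an illustration only, not a description of XL8's system; the lexicon contents, the `pick_word` function, the candidate scores, and the `boost` value are all assumptions made for the example.

```python
# Hypothetical per-sport lexicons. A real deployment would hold far larger
# lists, including player and team names.
SPORT_LEXICONS = {
    "american_football": {"punt", "touchdown", "fumble"},
    "cricket": {"googly", "wicket", "yorker"},
    "basketball": {"layup", "rebound"},
}


def pick_word(candidates, sport, boost=0.2):
    """Choose among (word, score) candidates, favouring in-lexicon terms.

    `candidates` stands in for the alternative words a recognition or
    translation model might propose for one position in a sentence.
    """
    lexicon = SPORT_LEXICONS.get(sport, set())

    def biased_score(item):
        word, score = item
        # Add a fixed bonus when the word belongs to the sport's lexicon.
        return score + (boost if word.lower() in lexicon else 0.0)

    return max(candidates, key=biased_score)[0]


# The model slightly prefers the everyday word 'pond' on its own.
candidates = [("pond", 0.55), ("punt", 0.45)]
print(pick_word(candidates, "american_football"))  # punt
print(pick_word(candidates, "cricket"))            # pond
```

The same candidates resolve differently depending on which sport's lexicon is active, which is why the lexicon has to be matched to the event being covered.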
As already discussed, subtitles that lag behind the action are not an option. Machine translation systems work by analyzing a sentence as it is spoken, so sometimes you will see that the algorithm has made a best guess at a word and then corrected it as more of the sentence is spoken and provides additional context (this is especially true for systems such as ours that also process idiom and aren’t simply providing word substitutions). We estimate that a machine translation solution targeting maximum accuracy will need a video delay of approximately three to four seconds, adding around 10% to typical current streaming latency. Even so, with some of the fastest CMAF-based systems currently in development, total glass-to-glass latency would still be in the sub-10-second region.
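The guess-then-correct behaviour described above is the reason a short delay buys accuracy. One simple policy for deciding when a word is safe to display, sketched here as an assumption rather than as XL8's actual method, is to show only the words on which two consecutive partial hypotheses agree. The `StablePrefixEmitter` class and the sample hypotheses are hypothetical.

```python
def common_prefix(a, b):
    """Return the longest shared leading run of tokens in two lists."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


class StablePrefixEmitter:
    """Commit only the tokens that two successive hypotheses agree on."""

    def __init__(self):
        self.previous = []   # the last partial hypothesis received
        self.committed = []  # tokens already shown to viewers

    def update(self, hypothesis):
        """Feed a new partial hypothesis; return newly committed tokens."""
        agreed = common_prefix(self.previous, hypothesis)
        self.previous = list(hypothesis)
        # Tokens already on screen are never retracted in this sketch.
        new_tokens = agreed[len(self.committed):]
        self.committed.extend(new_tokens)
        return new_tokens


emitter = StablePrefixEmitter()
print(emitter.update(["he", "kicks", "a", "pond"]))          # []
print(emitter.update(["he", "kicks", "a", "punt", "deep"]))  # ['he', 'kicks', 'a']
print(emitter.update(["he", "kicks", "a", "punt", "deep", "downfield"]))  # ['punt', 'deep']
```

The early guess ‘pond’ never reaches the screen because the next hypothesis disagrees with it. The cost is that every word waits for one further update before it is shown, and a delay budget of a few seconds exists to absorb exactly that kind of wait.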
Ease of deployment
Obviously, a solution needs to drop seamlessly into an existing broadcast workflow. For instance, we support Zixi, HLS, and more. The key is to work with existing technologies and deployments rather than add more complexity to an already complex chain.
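As an illustration of what fitting into an existing chain can look like, HLS subtitle renditions are typically carried as WebVTT text, so translated commentary ultimately has to be expressed as timed cues. The sketch below formats cues as a WebVTT document. The function names and sample cues are assumptions for the example, and it deliberately leaves out what a real HLS deployment would also require, such as segmenting the cues and mapping their timestamps to the media timeline.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues):
    """Build a WebVTT document from (start_seconds, end_seconds, text) cues."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)


# Hypothetical translated commentary cues.
cues = [
    (0.0, 2.5, "Despeja de volea."),
    (2.5, 5.0, "El balón cae en la yarda veinte."),
]
print(build_webvtt(cues))
```

Because the output is a standard subtitle format, existing packagers and players can treat it like any other subtitle track.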
Unlocking a growing market
Even in an age of globalization, the rise of sport as a genuinely worldwide phenomenon over the past few years has been impressive. A genuinely global market, once the sole province of the very biggest multi-week events, has emerged for an increasing number of sports (the popularity of NBA basketball in Hungary, for instance), opening up a wide range of individual opportunities for broadcasters and rights holders.
Engaging effectively with the audiences in these markets is going to be key to making the most of the monetisation opportunities they represent. Providing cost-effective and accurate subtitled commentary, even for sports with complex jargon and terminology, is one of the most effective ways of ensuring that happens.