24 hour plots

The introduction of ASR subtitles for BBC News on the 13th November prompted me to produce plots covering a 24 hour period, from midnight to midnight. The main reason was that, following the change, I spent a couple of weeks looking at one hour plots to check for glitches, and this became very boring. Grouping the data into 24 hour plots made the task more manageable, as I could spot any problems and then drill down to the one hour segments where the faults occurred. The most dramatic of these one day plots is that of the 13th November, where the wildly fluctuating subtitle delay, typical of BBC News up to that point, becomes an almost flat line.

To make the plots readable on this scale I have taken the one minute average of the subtitle delay. This does not show the maximum extent of any delay (or advance), but it gives a good indication of when any problems occurred.
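As a rough illustration of the one minute averaging, here is a minimal Python sketch. It assumes the delay measurements are available as pairs of (seconds since midnight, delay in seconds), with a negative delay meaning the subtitle was in advance of the speech. The function name, the data layout and the sample values are all made up for the example and are not taken from my actual processing.

```python
from collections import defaultdict


def minute_averages(samples):
    """Average (seconds_since_midnight, delay_seconds) pairs into one minute bins.

    Returns a dict mapping minute-of-day (0..1439) to the mean delay.
    The input layout is an assumption made for this sketch.
    """
    bins = defaultdict(list)
    for t, delay in samples:
        bins[int(t // 60)].append(delay)
    return {minute: sum(d) / len(d) for minute, d in sorted(bins.items())}


# Made-up samples: three fall in minute 540 (09:00), one in minute 541 (09:01).
samples = [(32400.0, 2.0), (32420.0, 10.0), (32450.0, -3.0), (32465.0, 4.0)]
averages = minute_averages(samples)
```

In this made-up data the 09:00 minute contains a 10 second delay and a 3 second advance, but its average is only 3 seconds, which is why the averaged plot indicates when problems occurred rather than how bad they were.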

I have also plotted the different word rates for the 24 hour period, comparing the subtitle word rate with the speech to text transcript word rate to expose times when the subtitles omit a significant proportion of the speech content. This proved more difficult, as the word rates fluctuate quite significantly from minute to minute. Eventually I settled on five minute averages for the word rates, which show the difference reasonably well. Here is the plot of word rates for the same BBC News day.
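The five minute word rate averaging could be sketched along the following lines. This is only an illustration: it assumes each word (from the subtitles or from the speech to text transcript) has a timestamp in seconds since midnight, and the function name and sample data are hypothetical.

```python
def word_rates(word_times, bin_seconds=300, day_seconds=86400):
    """Return the word rate in words per minute for each bin across one day.

    word_times: seconds since midnight at which each word occurred
    (an assumed input layout for this sketch).
    """
    n_bins = day_seconds // bin_seconds
    counts = [0] * n_bins
    for t in word_times:
        if 0 <= t < day_seconds:
            counts[int(t // bin_seconds)] += 1
    # Convert words per bin into words per minute.
    return [c * 60 / bin_seconds for c in counts]


# Made-up data: 600 spoken words and 450 subtitle words in the first five minutes.
speech_times = [i * 0.5 for i in range(600)]
subtitle_times = [i * (300 / 450) for i in range(450)]

speech_rate = word_rates(speech_times)
subtitle_rate = word_rates(subtitle_times)
```

With five minute bins a day reduces to 288 points per curve, and plotting the two lists against each other shows where the subtitle rate drops below the speech rate. A wider bin smooths out the minute to minute fluctuation at the cost of blurring short gaps.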

The issue with plotting a 24 hour news channel is that it has very little variation of content across the day. By contrast, a mainstream channel like BBC One has a varied output with both live and pre-recorded content, as can be seen from these day plots from the 16th December.

The delay plot shows the channel going over to BBC News on ASR at about 01:30, and then the start of BBC Breakfast at 06:00, where the human subtitlers take over until about 10:45, when the channel goes over to pre-recorded programmes. The news programmes at 1pm and 6pm both have live subtitling. The peak just before 7pm is caused by the regional weather forecast, and live subtitles continue until 7:30pm with The One Show. Live subtitles then reappear for the 10pm news.

The word rate plot shows how the BBC News channel ASR tracks the spoken word rate, but the respoken subtitles from 6am onwards fall behind. The most notable problems with word loss are shown between about 9:30 and 10:30am, where there is around a 30% word loss in the subtitles. The word loss between 7 and 7:30pm is also notable but less severe. These are indicative of the problems with subtitling live magazine programmes, where the speech rates are quite high and there is a great deal of interchange between the speakers.
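For clarity, a word loss figure like the 30% above can be read as the shortfall of the subtitle word rate relative to the speech word rate over the same period. The small sketch below shows that arithmetic; the function name and the example rates are made up for illustration and are not measured values.

```python
def word_loss_percent(speech_rate, subtitle_rate):
    """Percentage of spoken words missing from the subtitles.

    Both rates are in words per minute over the same period.
    Returns None when there is no speech to compare against.
    """
    if speech_rate <= 0:
        return None
    return 100.0 * (speech_rate - subtitle_rate) / speech_rate


# Hypothetical rates: 120 spoken words per minute, 84 subtitle words per minute.
loss = word_loss_percent(120.0, 84.0)
```

A negative result would mean the subtitles contain more words than the transcript for that period, which may point to transcript errors rather than to anything in the subtitles.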

The one thing that is lacking in these plots is any indication of the programme schedule. That is my next task.
