RSC MASHe

The Higher Education blog from the JISC RSC Scotland North & East

A conversation with Brian Kelly led to a quick refresh of the Twitter Subtitle Generator (iTitle) [I’ve got tired of linking to all of mine and Tony’s previous posts on this so started the Twitter subtitling wikipedia entry].

Brian was interested in using iTitle to create twitter captioned versions of videos from IWMW10. Their plan was to use Vimeo to host the videos as it allows upload of videos greater than 10 minutes. This led me to update the iTitle code to include the timeline jump navigation which I originally developed for YouTube videos.

Whilst doing this it occurred to me that I should really get around to providing a way for users to direct link to the results page (something I had been meaning to do from the very beginning). What this means is if you are using iTitle for in-browser playback of subtitled YouTube or Vimeo videos you can share the results with a direct link. So for example you can see Brian’s opening address for IWMW10 at http://www.rsc-ne-scotland.org.uk/mashe/ititle/v/id/13314385/ or the Google I/O 2010 Android Demo at http://www.rsc-ne-scotland.org.uk/mashe/ititle/u/id/IY3U2GXhz44/

More importantly I thought it would also be useful to include the ability to embed the results in other websites. With the introduction of the timeline jump navigation, using the typical <embed> code you see with YouTube video isn’t possible (I’m also using the HTML5 version of Vimeo videos, which doesn’t <embed> either).

I’ve instead opted to automatically generate some <iframe> code which is included in the display/result page. So using Brian’s speech again as an example the resulting code generated to embed the video in your own website is:

<iframe style="border-bottom: medium none; border-left: medium none; width: 470px; height: 570px; overflow: hidden; border-top: medium none; border-right: medium none" src="http://www.rsc-ne-scotland.org.uk/mashe/ititle/v/id/13314385/" frameborder="0" allowtransparency="allowtransparency" scrolling="no"></iframe>

which gives:

To display just the video player with twitter subtitles I was able to use <embed> code for the YouTube videos as they are Flash based. The JW Player which I use for playback has a ‘viral plugin’ which can generate the embed code (and send email links). A big plus point is that it preserves the link to the Twitter subtitle file. The player only version of Vimeo uses <iframe> again. With all these embed options I leave it to the author to decide if they link back to the original.
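As a rough illustration of how an <iframe> snippet like the one shown earlier could be generated from a results page URL, here is a minimal JavaScript sketch. The function names, the simplified border style and the default dimensions are my assumptions for the example; this is not the actual iTitle code.

```javascript
// Hypothetical helper: escape a value so it is safe inside an HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build an <iframe> embed snippet for a results page.
// The 470x570 defaults are taken from the example snippet above.
function buildIframeEmbed(pageUrl, width, height) {
  var w = width || 470;
  var h = height || 570;
  return '<iframe style="border: none; width: ' + w + 'px; height: ' + h +
    'px; overflow: hidden" src="' + escapeAttribute(pageUrl) +
    '" frameborder="0" scrolling="no"></iframe>';
}

console.log(buildIframeEmbed("http://www.rsc-ne-scotland.org.uk/mashe/ititle/v/id/13314385/"));
```

Escaping the URL before dropping it into the src attribute means a link containing & or quotes can't break the generated markup.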

An update on YouTube/Twitter commenting (uTitle) coming soon …


Previously when looking at twitter subtitling of videos the focus has been on replaying the backchannel discussion with the archive video from live events. The resulting ‘Twitter Subtitle Generator’ has now been used to generate and replay the twitter stream for programmes on the BBC iPlayer (some iPlayer examples), the JISC 2010 Conference (See Searching the backchannel with Twitter subtitles) and more recently as a way to enhance lecture capture. The founding premise behind all these examples and the question originally posed by Tony Hirst was how to allow a user to experience and replay the synchronous channels of what was said from the stage, and what was said about what was said from the audience (physical and virtual). Having looked at synchronous communication I was interested to extend the question and look at asynchronous communication (i.e. what was said about what was said after it was said).

My first step has been to experiment with the use of Twitter for timeline commenting on YouTube videos. The idea of timeline commenting of media isn’t entirely new and has been a feature of audio services like SoundCloud for a while. Likewise I’m sure the idea of integrating with the Twitter service as a method of capturing comments has also been used (but for the life of me I can’t find an example; another project perhaps).

The result is a prototype tool I’m calling uTitle. How it works is best explained in the video below:

As can be seen in the video uTitle allows a user to make comments at any point in the video timeline. These comments are also captured and can be replayed and added to at a later point. The link below lets you try out uTitle for yourself (the paint is still wet so if you come across any problems or have any feedback this is greatly appreciated – use comments below).

Click here to try uTitle (or here for an existing example)

Some notable points

A couple of features of uTitle are worth highlighting. Firstly, as demonstrated by the example link above, it is possible to link directly to a timeline commented video, making sharing resources easier. Another important point is that because twitter comments for YouTube videos are aggregated using the video id, it is possible to use this data with other services (at one point I was considering short-coding the ids to make less of an impact on the Twitter 140 character limit, but I wanted to make generated tweets as human readable as possible).

How it was done

For those of you interested here are a couple of the key development milestones:

Step 1 Identify way to integrate with Twitter
I already knew Twitter had an API to allow developers to integrate with the Twitter service so it was a case of finding a head start which I could build on. As I do most of my coding in PHP I went straight to this section of the Twitter Libraries. Having tried a couple out I went for TwitterOAuth by Abraham Williams (mainly because it used OAuth and when I looked at the code I could understand what it was doing).

Step 2 Submit a form without page refresh
Something I’ve known is possible for a while but never needed. I knew I wanted to allow users to make comments via uTitle without refreshing the page and losing their place in the video. This post on Ask About PHP was perfect for my needs.

Step 3 Jot down the pseudo code
This is what I wanted uTitle to do:

  • Get YouTube video id
  • If video id doesn’t exist as a notebook on Twapper Keeper make one
  • Else get results from Twapper Keeper for video id
  • Get results from Twitter Search
  • Merge data and remove duplicates
  • Generate XML subtitle file from results
  • Display interface page
    • On comment submit to twitter
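
The ‘merge data and remove duplicates’ step in the list above could look something like the following. This is only an illustrative JavaScript sketch (uTitle itself was built around a PHP library), and the tweet fields id, created_at and text are my assumptions about the data shape rather than the real Twapper Keeper or Twitter Search output.

```javascript
// Sketch: merge two lists of tweets, drop duplicates and sort by time.
// Assumes each tweet looks like { id, created_at, text } (hypothetical shape).
function mergeTweets(archiveTweets, searchTweets) {
  var seen = Object.create(null); // tweet ids already added
  var merged = [];
  archiveTweets.concat(searchTweets).forEach(function (tweet) {
    var key = String(tweet.id);
    if (!seen[key]) {
      seen[key] = true;
      merged.push(tweet);
    }
  });
  // Oldest first, so subtitles can be generated in timeline order
  merged.sort(function (a, b) {
    return new Date(a.created_at) - new Date(b.created_at);
  });
  return merged;
}

var fromArchive = [
  { id: 2, created_at: "2010-06-01T10:00:05Z", text: "second comment" },
  { id: 1, created_at: "2010-06-01T10:00:00Z", text: "first comment" }
];
var fromSearch = [
  { id: 2, created_at: "2010-06-01T10:00:05Z", text: "second comment" },
  { id: 3, created_at: "2010-06-01T10:00:09Z", text: "third comment" }
];
console.log(mergeTweets(fromArchive, fromSearch).map(function (t) { return t.id; }));
```

Keying on the tweet id rather than the text means two people tweeting the same words are both kept.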

Step 4 Put it all together
Some late nights pushing bytes across the screen …

These examples demonstrate how it is relatively straight forward to extract part of the Twitter timeline

Future development areas

Some quick notes on areas for further research/development:

Comment curation/admin – currently anything on the public timeline with a YouTube video is pulled in. A similar problem exists for the Twitter Subtitle Generator and it is something Tony and I have identified as a feature … but just haven’t had a chance to implement a solution. Part of the reason for developing the prototype is to start finding use cases (ie find out where the ship is leaking)

Merging synchronous with asynchronous – basically how can Twitter Subtitle Generator and uTitle be merged so comments can be collected post event (the issue here is there are two ways the subtitle timestamps would have to be generated and distinguishing what was said from what was said about what was said).

Other video sources – I’m primarily interested in how uTitle might work with BBC iPlayer (particularly as the latest developments are exploring social networks – as highlighted by Tony).

Spamming your followers with comments – Interested to see if users are willing to use their main twitter account for generating comments.

Hmm I think I may have bitten off more than I can chew …


Update 1: This post is a bit techie. If you want to jump straight to the action click here to see the full 45 minute Google I/O 2010 – Keynote Day 2 Android Demo with Twitter subtitles

Update 2: When I pushed the twitter timeline through the generator I noticed there were a number of tweets which weren’t in English so I’ve now passed the results through the Google Translate API. Click the same link above to see the result.

Update 3: Some inline updates: extra tweet filtering and inclusion of .csv upload to twitter subtitle generator.

Originally I was more interested in mashing the Google I/O Android Keynote with Twitter subtitles because I could, but the process was useful in highlighting some areas for further development. The first, something Tony and I have discussed before, is a way to curate the twitter timeline to sort the wheat from the chaff. For the Google I/O presentation I downloaded the archive in .csv format from Twapper Keeper and ‘tweaked it’ in Excel, filtering for tweets meta tagged as EN (English), which took it down from 5420 –> 4638 tweets in 45 minutes (not surprisingly the majority of Twitter users ignore the language setting, leaving it as the default despite the language they tweet in). Then I filtered ‘retweets’ by removing ‘RT’s, which took it from 4638 to a more manageable 3124 tweets. Update: I also noticed that a number of tweets had exactly the same timestamp so I filtered these out leaving 1790 tweets.
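
For anyone who would rather script this than tweak it in Excel, the three filters described above could be sketched roughly as follows. The field names (iso_language_code, text, created_at) are assumptions about the archive columns, and keeping the first tweet for each timestamp is my guess at one reasonable way to handle tweets sharing a timestamp.

```javascript
// Sketch of the filtering described above, under the stated assumptions:
// 1. keep tweets tagged as English, 2. drop retweets, 3. keep one tweet per timestamp.
function filterTweets(tweets) {
  var seenTimestamps = Object.create(null);
  return tweets.filter(function (t) {
    if (t.iso_language_code !== "en") return false;  // language tag filter
    if (/(^|\s)RT\b/.test(t.text)) return false;      // 'RT' as a separate word
    if (seenTimestamps[t.created_at]) return false;   // timestamp already used
    seenTimestamps[t.created_at] = true;
    return true;
  });
}

var sample = [
  { iso_language_code: "en", created_at: "2010-05-20 16:00:01", text: "Froyo looks fast" },
  { iso_language_code: "ja", created_at: "2010-05-20 16:00:02", text: "konnichiwa" },
  { iso_language_code: "en", created_at: "2010-05-20 16:00:03", text: "RT @someone: Froyo looks fast" },
  { iso_language_code: "en", created_at: "2010-05-20 16:00:01", text: "Same second as the first tweet" }
];
console.log(filterTweets(sample).length);
```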

Having got this far it then highlighted the next issue, converting the truncated csv file into a timed text XML format. Previously I’ve shown how you can Convert time stamped data to timed-text (XML) subtitle format using Google Spreadsheet Script and could have easily gone down that route again but wanted to try something new. As the Twitter Subtitle Generator already integrates with the Twapper Keeper service it seemed a small step to get the tool to read a csv file rather than the Twapper Keeper feed. This was made so much easier by a PHP function which returns a multi-dimensional array from a CSV file optionally using the first row as a header to create the underlying data as associative arrays – sweet!
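
The PHP function itself isn’t reproduced here, but the idea (first row as header, every other row as an associative array) can be sketched in JavaScript. The function names are hypothetical, and this simple parser assumes quoted fields don’t contain line breaks.

```javascript
// Split one CSV line into fields, honouring double-quoted fields
// and "" as an escaped quote inside them.
function parseCsvLine(line) {
  var fields = [];
  var current = "";
  var inQuotes = false;
  for (var i = 0; i < line.length; i++) {
    var ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        current += '"'; // escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Turn CSV text into an array of objects keyed by the header row.
function csvToObjects(csvText) {
  var lines = csvText.split(/\r?\n/).filter(function (l) { return l.length > 0; });
  var header = parseCsvLine(lines[0]);
  return lines.slice(1).map(function (line) {
    var values = parseCsvLine(line);
    var row = {};
    header.forEach(function (name, idx) { row[name] = values[idx]; });
    return row;
  });
}

// Made-up sample data for illustration
var rows = csvToObjects('created_at,from_user,text\n2010-05-20 16:00:01,someone,"Fast, very ""fast"""');
console.log(rows[0].text);
```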

For once my code was clean enough that I could drop this function in and point it at the csv file I created. I haven’t worked this functionality into the ‘generator’ yet but at least it is another piece for the jigsaw. Update: couldn’t resist – added functionality to upload csv for subtitling.

So below is a short demo of the output. Click here to see the full 45 minute presentation with Twitter subtitles


Screenshot of BBC iPlayer with twitter subtitles

Since February my post on Twitter powered subtitles for BBC iPlayer has remained in my top 5 posts and I’ve been meaning to revisit the iPlayer platform with another twitter subtitle example. The general election ‘Leaders Debates’ seemed like an ideal event to experiment with the format. It became very apparent after the first debate that simply pulling the public timeline into the twitter subtitle generator wouldn’t work as with an average of ~30 tweets per second the public discussion would just be a blur.

With the increasing popularity of twitter (and other social platforms) to comment on live events there is probably a separate research strand looking at intelligent filters; instead I’ve gone for a more basic approach. Fortunately the good people at tweetminster.co.uk, who have been closely monitoring the election using twitter sentiment, were able to provide me with a data file of twitter comments made by MPs and party prospective parliamentary candidates being tracked on twitter during the debate. Using the Convert time stamped data to timed-text (XML) subtitle format using Google Spreadsheet Script I was able to generate a subtitle file compatible with BBC iPlayer. Below is a short demonstration of the twitter subtitles in action followed by instructions to see the entire debate.

So if you would like to see what some of the parliamentary candidates were tweeting during the last leaders’ debate follow these steps:

  1. Download The Prime Ministerial Debates from BBC iPlayer
  2. The broadcast you download from iPlayer will be stored in a folder (something like [My Documents] > [My Videos] > [BBC iPlayer] > [repository] > [b00s6lf7]), locate this folder and replace the file ‘b00s6lf7_live.xml’ with this one (keeping the obscure file name ie b00s6lf7_live.xml).
  3. When you replay the broadcast turn on subtitles by clicking the ‘S’ button to see the tweets.

Today I presented some of my work on twitter voting to the Engaging Students Through In-Class Technology (ESTICT) special interest group. This group “is a UK network of education practitioners and learning technologists interested in promoting good practice with classroom technologies that can enhance face-to-face teaching.”

I used this slot as an opportunity to try out some presentation techniques. The first was using Timo Elliott’s PowerPoint auto-tweet plugin which allows you to automatically tweet notes as you work through the slide deck. The plan was that this would provide ready made links and snippets for re-tweeting, favouriting or just copying into a user’s personal notes. I also did this to generate information to twitter subtitle my presentation. An unforeseen benefit was that the tweets provided a stimulus for further discussion after the presentation.

The other technique I picked up was from a presentation by Tony Hirst in which he included links to secondary resources by only displaying the end of a shortened url. This is demonstrated in the presentation (with twitter subtitles of course ;-) (the link also contains a recipe for lecture capture enhancement):

ESTiCT Presentation link


Wage dislikes spreadsheets
Originally uploaded by Dyanna

My post titles just get better and better. As part of my research into twitter subtitling I’ve focused on integrating the twitter search and Twapper Keeper archive into the twitter subtitle generator tool, but I’m aware there is a wider world of timed data for subtitlizing. When Tony contacted me on Friday with some timed data he had as part of his F1 data junkie series it seemed like the ideal opportunity to see what I could do.

The data provided by Tony was in a *.csv spreadsheet format, with the first couple of lines included below:

timestamp,name,text,initials
2010-04-18 08:01:54,PIT,Lewis last car's coming into position now.,PW
2010-04-18 08:02:05,PIT,All cars in position.,PW
2010-04-18 08:02:59,COM,0802: The race has started,CM

My first thought was to just format it in Excel but quickly got frustrated with the way it handles dates/time, so instead uploaded it to Google Spreadsheet. Shown below is how the same data appears:

Google Spreadsheet of csv

Having played around with the timed-text XML format I knew the goal was to convert each row into something like (of course wrapping with the obligatory XML header and footer):

<p style="s1" begin="00:00:00" id="p1" end="00:00:11">PIT: Lewis last car's coming into position now.</p>

Previously I’ve played with Google Apps Script to produce an event booking system, which uses various components of Google Apps (spreadsheet, calendar, contacts and site), so it made sense to use the power of Scripts for timed text. A couple of hours later I came up with this spreadsheet (once you open it click File –> Make a copy to allow you to edit).

On the first sheet you can import your timed data (it doesn’t have to be *.csv, it only has to be readable by Google Spreadsheet), and then by clicking ‘Subtitle Gen –> Timed Data to XML’ the timed text XML is generated on the XMLOut sheet.

Below is the main function which is doing most of the work, the comments indicating what’s going on:

function writeTTXML() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var dataSheet = ss.getSheets()[0];
  var data = getRowsData(dataSheet); // read data from first sheet into a JavaScript object (getRowsData is a helper defined elsewhere in the script)
  var sheet = ss.getSheetByName("XMLOut") || ss.insertSheet("XMLOut"); // if there isn't an XMLOut sheet create one
  sheet.clear(); // make sure it is blank
  // Start the XMLOut sheet with tt-XML doc header
  sheet.getRange(1, 1).setValue("<?xml version=\"1.0\" encoding=\"utf-8\"?><tt xmlns=\"http://www.w3.org/2006/10/ttaf1\" xmlns:ttp=\"http://www.w3.org/2006/10/ttaf1#parameter\" ttp:timeBase=\"media\" xmlns:tts=\"http://www.w3.org/2006/10/ttaf1#style\" xml:lang=\"en\" xmlns:ttm=\"http://www.w3.org/2006/10/ttaf1#metadata\"><head><metadata><ttm:title>Twitter Subtitles</ttm:title></metadata><styling><style id=\"s0\" tts:backgroundColor=\"black\" tts:fontStyle=\"normal\" tts:fontSize=\"16\" tts:fontFamily=\"sansSerif\" tts:color=\"white\" /></styling></head><body tts:textAlign=\"center\" style=\"s0\"><div>");
  var startTime = data[0].timestamp; // collect start time from first data row, all subsequent times are relative to this
  for (var i = 0; i < (data.length-1); ++i) { // loop through the data one row at a time except the last row (excluded because it has no end date/time)
    var row = data[i];
    var nextRow = data[i+1];
    row.rowNumber = i + 1;
    // calc begin and end for an entry, converting to HH:mm:ss format
    var begin = Utilities.formatDate(new Date(row.timestamp-startTime), "GMT", "HH:mm:ss");
    var end = Utilities.formatDate(new Date(nextRow.timestamp-startTime), "GMT", "HH:mm:ss");
    // prepare string in tt-XML format. Content is pulled by referencing the column header in normalised format (e.g. if col headed 'Twitter status' normalised = 'twitterStatus')
    var str = "<p style=\"s1\" begin=\""+begin+"\" id=\"p"+row.rowNumber+"\" end=\""+end+"\">"+row.name+": "+row.text+"</p>";
    // add line to XMLOut sheet
    sheet.getRange(row.rowNumber+1, 1).setValue(str);
  }
  var lastRow = sheet.getLastRow()+1;
  // write tt-XML doc footer
  sheet.getRange(lastRow, 1).setValue("</div></body></tt>");
}

If your timed data has different headers you can tweak this by clicking ‘Tools –> Script –> Script editor …’ and changing how the str on line 18 is constructed.

I’m the first one to admit that this spreadsheet isn’t the most user friendly and it only includes the tt-XML format, but hopefully there is enough structure for you to go, play and expand (if you do please use the post comments to share your findings)
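
One thing worth knowing if you do expand it: writeTTXML above drops the text straight into the XML, so a row containing & or < would produce an invalid file. Below is a minimal plain JavaScript sketch (runnable outside Google Apps Script) of the same row to <p> conversion with escaping added. The helper names are hypothetical, and treating the timestamps as UTC is an assumption made only so the arithmetic is repeatable.

```javascript
// Escape the characters that are not allowed in XML text or attribute values
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// "2010-04-18 08:01:54" -> milliseconds (assumes UTC, see note above)
function parseTimestamp(stamp) {
  return new Date(stamp.replace(" ", "T") + "Z").getTime();
}

// Milliseconds offset -> "HH:mm:ss"
function toTimecode(ms) {
  var total = Math.floor(ms / 1000);
  function pad(n) { return (n < 10 ? "0" : "") + n; }
  return pad(Math.floor(total / 3600)) + ":" +
    pad(Math.floor((total % 3600) / 60)) + ":" + pad(total % 60);
}

// Build one tt-XML paragraph; the next row's time is used as the end time
function rowToParagraph(row, nextRow, startMs, rowNumber) {
  var begin = toTimecode(parseTimestamp(row.timestamp) - startMs);
  var end = toTimecode(parseTimestamp(nextRow.timestamp) - startMs);
  return '<p style="s1" begin="' + begin + '" id="p' + rowNumber +
    '" end="' + end + '">' + escapeXml(row.name) + ": " + escapeXml(row.text) + "</p>";
}

// First two rows of the sample data shown earlier
var first = { timestamp: "2010-04-18 08:01:54", name: "PIT", text: "Lewis last car's coming into position now." };
var second = { timestamp: "2010-04-18 08:02:05", name: "PIT", text: "All cars in position." };
console.log(rowToParagraph(first, second, parseTimestamp(first.timestamp), 1));
```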


Pair programming is an agile software development technique in which two programmers work together at one work station. One types in code while the other reviews each line of code as it is typed in. The person typing is called the driver. The person reviewing the code is called the observer (or navigator). The two programmers switch roles frequently (possibly every 30 minutes or less). From Wikipedia

Regular followers of the twitter subtitle story will be aware that this idea has been bouncing back and forth between myself and Tony (here are some of his posts). While we don’t have a true ‘pair programming’ relationship the dynamic is very similar. So when Tony posted a method for deep search linking a twitter caption file using Yahoo Pipes it was time to hit the driving seat for some evening coding.

Using the other Martin’s presentation again I’ve put together this page which demonstrates twitter caption search and timecode jump (I should point out that limitations of the JWPlayer mean jumps can only be made to portions of the video which have already been buffered).

Twitter subtitle - search and timecode jump

How it was done

Taking the JWPlayer used in the previous post I dropped it onto a page also pasting the subtitles from the XML file. With a bit of CSS styling and using A K Chauhan’s JavaScript List Search using jQuery the pasted xml can be filtered, and using the JWPlayer JavaScript API you can jump to the related part of the video. When I get a chance I’ll integrate this functionality into the twitter subtitle generator. Update: Breaking my ‘no coding in office hours’ rule this feature is now enabled for the ‘YouTube with Tweets’ option of the twitter subtitle generator.
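
Stripped of jQuery and the player, the search and jump logic boils down to something like the sketch below. The caption objects ({ begin, text }) and function names are assumptions for illustration; the resulting number of seconds is what would be handed to the player’s seek call, which I’ve left out because the exact call depends on the JW Player version.

```javascript
// "HH:mm:ss" -> seconds from the start of the video
function timecodeToSeconds(timecode) {
  var parts = timecode.split(":").map(Number);
  return parts[0] * 3600 + parts[1] * 60 + parts[2];
}

// Case-insensitive search over captions; each hit carries the position to jump to
function searchCaptions(captions, query) {
  var needle = query.toLowerCase();
  return captions
    .filter(function (c) { return c.text.toLowerCase().indexOf(needle) !== -1; })
    .map(function (c) { return { text: c.text, seconds: timecodeToSeconds(c.begin) }; });
}

// Made-up captions for illustration
var captions = [
  { begin: "00:00:12", text: "Great opening slide" },
  { begin: "00:05:30", text: "Interesting point on open content" },
  { begin: "01:02:03", text: "Questions from the floor" }
];
console.log(searchCaptions(captions, "OPEN"));
```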

Some thoughts

Historically one of the issues with audio/video content is the ability to search and deep link to content. This is changing most notably with Google/YouTube’s auto captioning of videos, but as Tony pointed out in his last post there is still some way to go. Providing a contextualised and searchable replay of the backchannel with what was actually said potentially opens up some interesting uses. With a number of universities exploring the use of lecture capture there is potentially an opportunity to enrich this resource with the backchannel discussion. In particular I’m thinking of the opportunity for students to learn vicariously through the experiences and dialogue of others. Before I go all misty eyed the reality check is twitter isn’t that widely used by students (yet), but surely this is a growth area.


Last week saw the return of the JISC conference. As with other similar events the organisers explored a number of ways to allow delegates to experience the conference virtually as well as in person. The main avenues were video streaming some of the sessions live across the web; the inclusion of a Ning social network (I’m guessing they won’t be doing this again next year. See Mashable’s Ning: Failures, Lessons and Six Alternatives); and advertising the #jisc10 hashtag for use on twitter, blogs etc. I would recommend Brian Kelly’s Privatisation and Centralisation Themes at JISC 10 Conference post which presents some analysis and discussion on the effectiveness of each of these channels.

It is apparent that the JISC conference mirrors a wider emerging trend to allow dispersed audiences to view, comment and contribute to live events. A recent example is that of the #leadersdebate broadcast on ITV, which as well as having over 9.7 million views generated over 184,000 tweets (from tweetminster.com) and numerous other real-time comments on blogs and other social network sites.

I didn’t have a chance to attend the conference myself and other things meant I was unable to see the live video streams, although I was able to keep an eye on the twitter stream. Fortunately the conference organisers have made the videos of the keynote speeches by Martin Bean and Bill St. Arnaud available. It is however difficult to replay the video with the real-time backchannel discussion. Cue the twitter subtitle generator, which I’ve been exploring through various posts. So if you would like to experience the live video/twitter experience I’ve embedded the videos below.

Opening Keynote: Martin Bean, Vice Chancellor, The Open University

Subtitle content provided by twitter | Download the XML subtitle file

Closing Keynote: Bill St. Arnaud, P. Eng. President, St. Arnaud-Walker and Associates Inc.

Subtitle content provided by twitter | Download the XML subtitle file

Here are Martin Bean’s and Bill St. Arnaud’s biographies and keynote slides. Both of the videos were produced by JISC and distributed under Creative Commons.

Just a quick couple of words on the subtitle file generation. I had planned to use the archive of tweets provided by Twapper Keeper for both keynotes, but there was a 45 minute hole in the archive between 08:44 and 09:27GMT for the first session, which is being investigated, so I used the Twitter Search instead. As the session was early in the morning and twitter limits searches to 1500 tweets I had to modify the query to ‘#jisc10 -RT’, which removes retweets, to get results for all of Martin Bean’s presentation (he still has a healthy 372 original tweets during the course of his presentation). [There is perhaps an interesting way to visualise RT's in the subtitle file to indicate consensus tweets - for another day]

If you are planning to run your own event and would like to create a twitter video archive here are some basic tips:

  1. Make sure you advertise a hashtag for your event
  2. Before the event create a hashtag notebook on twitter archive service Twapper Keeper – there are other archive services but currently the subtitle tool only integrates with this one
  3. Make sure video is captured in a reusable format. The video above is played back with the JW Flash Video Player which supports FLV, H.264/MPEG-4, MP3 and YouTube Videos. Generated subtitle files can also be used directly in YouTube (if you own the video). I’ve also experimented with Vimeo for longer videos.

If you would also like an ‘at the scene’ report of the keynotes and some of the plenary sessions you should read this post by my colleague Lis Parcell at RSC Wales - Technology at the heart of education and research: JISC10 conference report



One of my frustrations when putting together the Gordon Brown Building Britain’s Digital Future announcement with twitter subtitles example was that the video was embedded within a custom flash player which meant I couldn’t overlay subtitles within the player. Since I made the original post Downing Street have put Gordon Brown’s speech on YouTube which makes it possible to use the JW Player to embed the original video with your own caption file (via the captions plugin). You can see the new version of Gordon Brown’s speech with twitter subtitles here

hmm it just occurs to me that I haven’t put a post together on how the twitter subtitle generator was extended to allow a selection of the twitter timeline to be overlaid on any embeddable YouTube video. It is very easy as the JW Player caption plugin uses the same xml format as BBC iPlayer. Simples

After my first post on Twitter powered subtitles for BBC iPlayer Tony Hirst followed up with this post, which highlighted some of his own research and linked to Accessible HTML5 Video with JavaScripted captions on the Dev.Opera site.

HTML5 Video is currently in the spotlight as an alternative to the widely used flash based video players used by YouTube and others for embedding and playing video on Apple’s iPad. HTML5 is still a draft specification so it doesn’t have full browser support and there is also the usual platform divergence on what video formats are supported. This hasn’t stopped the main video hosting sites like YouTube and Vimeo from experimenting with HTML5 video.

One of the other reasons HTML5 video is gaining ground, despite the video format issues, is that it makes it easier for developers and users to embed and interact with video content. So when I got a nudge to look at supporting twitter subtitles with Vimeo videos, exploring the HTML5 option was at the top of my list (and the fact that Vimeo allow longer videos greater than 10 minutes also makes it a viable solution for conference/lecture capture!).

As it turns out a bit like the YouTube/JW Player solution most of the hard work is already done by tweaking the XML format used in BBC iPlayer and thanks to Bruce Lawson’s Accessible HTML5 Video example mentioned earlier.

So if you go to the twitter subtitle generator page you’ll see there is a new option to use a Vimeo HTML5 video and specify the video url e.g. http://vimeo.com/8570975. Below is a screen capture I took of the process (the video I used for this demonstration was JISC’s Digital Content Quarterly Issue 1: Video 1 (Long Version)).

One of the issues I had was finding a clip and extracting a useful part of the twitter timeline. Either there weren’t enough tweets which made it look like the subtitles had stalled or there were too many (particularly when I tried using the #askthechancellors hashtag). The ability to filter a timeline, by for example displaying tweets from a selected group, has been raised by a couple of people, most recently in Brian Kelly’s Issues In Crowd-sourced Twitter Captioning of Videos post. Using the HTML5 Video solution and Daniel Davis’ multilingual example (also from the Dev.Opera site) it is easy to demonstrate how this feature could be implemented. I haven’t had time to work this feature into the ‘twitter subtitle generator’, but this page gives a live demo of a possible output.
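
The group filter idea is simple enough to sketch. Assuming each caption records who tweeted it (the { author, begin, text } shape and the account names below are made up for illustration), limiting the timeline to a selected group might look like this:

```javascript
// Sketch: keep only captions tweeted by a chosen group of accounts.
// Account names are compared case-insensitively.
function filterByAuthors(captions, allowedAuthors) {
  var allowed = Object.create(null);
  allowedAuthors.forEach(function (name) {
    allowed[name.toLowerCase()] = true;
  });
  return captions.filter(function (c) {
    return allowed[c.author.toLowerCase()] === true;
  });
}

// Hypothetical data
var timeline = [
  { author: "speaker_fan", begin: "00:00:05", text: "Here we go" },
  { author: "EventTeam", begin: "00:00:09", text: "Slides are online" },
  { author: "passerby", begin: "00:00:15", text: "What is this hashtag?" }
];
console.log(filterByAuthors(timeline, ["eventteam", "speaker_fan"]).length);
```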

So it looks like there is still some more interesting work to be done in this area …


I’ve been quietly snaffling twitter timelines using the subtitle generator I created. The latest one was prompted by Brian Kelly’s post on The “Building Britain’s Digital Future” Announcement. In this post he mentioned that twitter was abuzz with the #bbdf hashtag.

Clicking on the image below will take you to a page I created from the subtitle generator. Because Number 10 use a commercial company to host some of their content, using a bespoke flash player, I created a page, embedded the video and used the Javascript SMIL Browser.

Twitter subtitles for the Digital Future Speech

Update: As well as being ‘Fun, intriguing..‘ Brian Kelly has a nice post on the ‘Issues In Crowd-sourced Twitter Captioning of Videos‘ (which despite the title I read as a positive use of tweets as they are contextualised with an event). Tony (who inspired me to look at this area) and I are both keen to take some of the twitter captioning ideas forward so if any developers or funders want to get involved we’d like to hear from you.

Update: If you want to see how I combined Gordon Brown’s speech and tweets here is the ‘making of’ video

Update: Number 10 have put Gordon Brown’s speech on YouTube which makes embedding subtitles a lot easier. Click here to see the same example but using the YouTube clip