Tag Archives: bbc

A future of Politics and News Online

Warning – personal opinions, not official BBC policy or product announcements, within.

What with the Local and EU Parliament Elections, the Scottish Referendum, and a General Election next May, 2014-15 is a momentous year for UK politics. The BBC has, and will have, a big role to play in bringing coverage and results from these elections to the nation and the world.

Today, I talked to the team at the Government Digital Service about some of the work I, and others I work with, have been doing in the political arena of BBC News Online. Here’s a write-up of said talk.

The BBC has a mandate, as part of the Royal Charter, to “encourage citizenship and civil society…by promoting understanding of the UK’s political system.” For ‘Vote 2014’, the name the BBC gave to the local and EU elections, we used semantic tagging to bring together relevant content from across BBC journalism (traditionally siloed as TV, radio and online), centred around the things that were most important to our users: the councils and constituencies where they live.

There’s a lot more about the work I did, including making sure our tags were linked up to open data sources, in the blog post I wrote for the BBC Internet Blog, back in May. But there’s another side to the BBC’s coverage of Politics, online, aside from day-to-day reporting and election coverage.

Democracy Live is a part of the BBC News website which seeks to directly fulfil the requirement to ‘promote understanding of the UK’s political system’. In essence, it is the equivalent of the BBC Parliament channel on TV. Yet it has some added features – transcripts of the proceedings of most, if not all, of the representative institutions, are made searchable, and there is a page for every representative in each major House or Assembly.

Unfortunately, it is a part of the BBC News website which is also rather siloed. And whilst it has an important role to play, and an appreciative audience within the political class, it runs the risk of super-serving that audience – when surely the point of the clause in the Royal Charter is to bring the political understanding to a much wider audience.

As a result, in my role as Data Architect, working with both BBC News Online and with BBC News Labs (part of the BBC’s R&D division), I’ve recently created a prototype which explores how we might integrate the content and concepts present within Democracy Live with the rest of BBC News Online. It’s also been a great opportunity for me to get back into the coding game. Having had a taster of Python at the beginning of the year, I’ve used this prototype as a chance to learn one of the web frameworks for Python, Flask.

the homepage of my Politics Prototype


As part of our tagging effort, we have tags for every MP, every political party and every Government Department. So the first thing I did was build a page for each of these. The page brings together tagged content from across BBC News.
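As a rough illustration of the tag-page idea, here is a minimal Python sketch that groups content items by tag, so that each tag can back a page regardless of which medium the content came from. The item fields, the tag identifiers and the `build_tag_pages` helper are all assumptions made for illustration; this is not the prototype's actual code or the BBC's tagging schema.

```python
from collections import defaultdict

# Hypothetical content items; field names and tag identifiers are invented.
CONTENT = [
    {"title": "Budget statement: key points", "medium": "online",
     "tags": ["dept/treasury", "party/example-party"]},
    {"title": "Radio discussion of the budget", "medium": "radio",
     "tags": ["party/example-party", "party/other-party"]},
    {"title": "TV interview with a minister", "medium": "tv",
     "tags": ["dept/treasury"]},
]


def build_tag_pages(items):
    """Group content titles by tag, ignoring which medium they came from."""
    pages = defaultdict(list)
    for item in items:
        for tag in item["tags"]:
            pages[tag].append(item["title"])
    return dict(pages)


pages = build_tag_pages(CONTENT)
for tag in sorted(pages):
    print(tag, "->", pages[tag])
```

In a web framework, each key of `pages` would correspond to one URL, with the list of items rendered on that page.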

A party page

Importantly, though, I’ve tried to go to authoritative sources wherever possible. The BBC is not the canonical store of knowledge about Parliament or Government. Those institutions are. So, bearing in mind the principle of modular design, and of stitching into the wider Web, I’ve sought to include information from Parliament and Government APIs wherever possible.

For instance, for every Political Party in the prototype, if they have MPs in the House of Commons, I get a list of those MPs from Parliament. For each person, if they’re a member of the Cabinet, I ask GOV.UK for the role they play, and the department that role belongs to. And likewise, for every government department, I ask GOV.UK for the roles, and current role holders, in that department. There are lots of other things that I could link up, too – GOV.UK has pages for each country the Government has dealings with, and ‘topic’ pages too – just as we’ve been trialling internally at the BBC. Linking these up would give our audiences both the latest news, and engagement with their own Government’s activities in those areas.

Using Parliament’s APIs to display the seat for an MP

Department and role information using the GOV.UK proxy API

That’s the idea, anyway. In truth, I’ve only been able to get this far because of the limited, but at least slightly useful (and currently being improved upon, I believe), Members Data Platform from Parliament, and a proxy API built by a kind developer at GDS, Camille Baldock. And of course, I’ve had to stitch the two together, maintaining a local store of mappings between IDs and things like that.
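The "local store of mappings between IDs" might look something like the following Python sketch. Everything here is an assumption for illustration: the identifiers, the `IdMapping` record, and the two dictionaries standing in for responses from Parliament and from the GOV.UK proxy. It makes no claim about what either API actually returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdMapping:
    """One row of a hypothetical local mapping store."""
    bbc_tag: str
    parliament_member_id: Optional[int]
    govuk_person_slug: Optional[str]


# Invented identifiers, purely for illustration.
MAPPINGS = [
    IdMapping("person/jane-example", 1001, "jane-example"),
    IdMapping("person/john-sample", 1002, None),  # an MP with no government role
]

# Stand-ins for data that would come from the two external sources.
PARLIAMENT_SEATS = {1001: "Exampleton North", 1002: "Sampleford"}
GOVUK_ROLES = {
    "jane-example": ("Secretary of State for Examples", "Department for Examples"),
}


def person_page(bbc_tag):
    """Stitch both sources together for one BBC tag, or return None if unmapped."""
    mapping = next((m for m in MAPPINGS if m.bbc_tag == bbc_tag), None)
    if mapping is None:
        return None
    page = {"tag": bbc_tag, "seat": None, "role": None, "department": None}
    if mapping.parliament_member_id is not None:
        page["seat"] = PARLIAMENT_SEATS.get(mapping.parliament_member_id)
    if mapping.govuk_person_slug is not None:
        role = GOVUK_ROLES.get(mapping.govuk_person_slug)
        if role is not None:
            page["role"], page["department"] = role
    return page


print(person_page("person/jane-example"))
```

The mapping table is exactly the piece that shared, web-scale identifiers would make unnecessary: with a common identifier for each person, the two lookups could be keyed directly.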

This shouldn’t have to be the way it is. Obviously, there is important distance to be kept between the BBC, the Government and Parliament – too close a partnership, and the political independence of all three is brought into question, also blunting the role of a free press in holding the Government or Parliament to account. But releasing well maintained and structured data about the things each institution is an authority on, and using shared, web-scale identifiers? That should be a given.

As much as each institution will devote time to serving the needs of its own specific user base, time and again we have to remember that the concepts are separate from the providers – users don’t see the difference between a politician on GOV.UK, in Parliament or on the BBC – they are one and the same person, and it’s time the Web reflected that.

Of course, joining up information about MPs is only the start. The work that GDS do is brilliant, and quite rightly, they have focused on delivering services to users. But personally, I believe that the part of GOV.UK formerly known as ‘Inside Government’ is an untapped goldmine. It’s not just reference data – it is the cornerstone of the actual material of what Government consists of.

The fact that every Government policy, every statement, every minister, has a publicly addressable URL, and has structured information, is a potentially massive statement about our democracy. No longer does it have to be the case that politicians (and, indeed, the media) can get away with making announcements or press releases that are soon forgotten within the 24-hour rolling cycle of news. The commitments that a Government makes to its population are written down and made available to anyone. It’s the closest thing we have to a constitution. And we should be using it to do so much more – holding the Government to account, for instance via linking the latest developments in news back to the policies, is just one way.

Directly linking the massive audience the BBC has, into GOV.UK, actually showing people that politics isn’t just about spin and publicity (or at least shouldn’t be, if we truly want to at least have a go at being some form of accountable government), is as much about serving users as any of the services the GDS team are working their way through improving right now. Those are important too – they nail people’s necessary interactions with Government – but for civic engagement and democracy? That’s what together, the BBC, Parliament, GDS, and all the other representative institutions (some of which are further ahead in this game – see, for instance, the good work being done with the Northern Ireland Assembly) in the UK, could build, if we work together.

BBC Stories, or ‘What I Did on my Summer Holidays’

A post on web narrative to follow, I promise, but beforehand, it’s worth pointing people to an excellent summary of work done by a couple of graduate students (with my assistance/hindrance) here – especially worth reading is the report, linked to at the bottom of the page. I’d urge anyone interested in this area to have a look, as it summarises the current lay of the land, the opportunities and risks involved, including some practical experiments, in great clarity and detail.

Meanwhile, before I finally get myself together to write a coherent piece on web narrative, prepare yourselves by reading two posts I’ll be referencing and responding to:

Kat Sommers on Web Narrative

Dan Biddle – ‘Hyperlinks don’t split narrative, they streamline it’

In a comment on the former, the author of the latter sums things up in a way that I think represents my own position:

“I would never try to disown the linear narrative, but I don’t hold that the link is its undoing.”

Although, I’d also argue that hyperlinks offer the opportunity not only to streamline narrative, but to open up a whole new toy box of tricks for both authors and audience to play with – and certainly not at the expense of ‘traditional’ linear storytelling. Not every Web narrative has to be branching, after all…

Better Recommendations Through (Linked) Data

Recommendations. Everyone’s talking about them, to paraphrase the old EastEnders slogan. I’m currently working on a pilot project looking at ways to expose the BBC’s archive content, help people find programmes they might be interested in, and clearly show when the programme was made/broadcast. Part of this work includes examining the ways we can improve episode to episode recommendations. I’ve been doing lots of thinking around this, and here’s the latest.

When it comes to recommendations, there seem to be four approaches. Each has its advantages and disadvantages, but I would argue that, until now, only three of the options have been tried in earnest.

Firstly, there’s the traditional method of hand-picked, manual ‘editorial’ recommendations. This means that staff consider each programme they’re responsible for, look around at what else is on offer, and pick out other programmes that could sensibly be recommended. The advantage of this method is that it’s often highly targeted, and good quality, basically because it’s been sense checked. The disadvantage is that it doesn’t scale well. It requires a great deal of human effort, and equally, a potentially vast knowledge of the programming output of a broadcaster in order to reap the maximum benefit. However, until recently, it’s been the safest, if not the only option on the cards.

The next three approaches are more to do with the reasons for recommendation. They’re often the reasons behind the manual recommendations, but as we turn to data-driven systems more and more, these reasons can inform automatic recommendations.

Production-based information – By this, I mean using production data, such as programme structure, categorisation, classification and cast/crew details, to power recommendations. In its simplest form, this can be seen on bbc.co.uk/programmes for almost any episode, where you can see the previous and next episodes in a series. Essentially, this is a recommendation as to what episodes it would make sense to consume before & after the one you’re looking at. Similarly, the genres, format and channel aggregations offer recommendations based on traditional broadcast classification structures. On the plus side, these are (relatively) easily sourced from the existing programme making workflow. They can also provide pretty useful recommendations. However, they tend to be very general. For instance, just because something is on the same channel, or in the same genre, or indeed, has the same actor in it, doesn’t automatically make it a relevant recommendation. I would even argue that just showing other episodes in the same series or brand, as is done on things like iPlayer, isn’t really the best form of recommendation, and probably shouldn’t be sold as such.

Social-based information – Here, I’m talking about probably the most prevalent form of recommendation at the moment – or at least the one that everyone seems to be advocating. Here, we would collect data on a person’s viewing/listening habits, and use this data to provide other programmes that they might want to see, based on a combination of the frequency/range of their consumption, and the already established production-based recommendations. In addition, this can then popularly be combined with social networking information, so that recommendations can be provided based on what other people you are linked to have been consuming. Again, the advantages are that you can build up a fairly accurate picture of the type of audience you have, based on what they’re consuming, and this can then be used to influence both what you provide to them, and what you commission. However, there are major downsides to this, as well. Firstly, speaking personally, although I accept that recommendations from friends can be helpful, I don’t believe it’s the correct primary source for recommendations. Certainly, I’m not really interested in just knowing that other people have watched a particular programme – just because they watched it, doesn’t mean they would recommend it. Indeed, just because they liked it, also doesn’t mean they would recommend it. A recommendation, in this form, at least, has to be pro-active. That is, I’d much rather a friend actively recommended a programme to me, rather than a computer spying on their habits and then telling me. Which brings us to the second problem – the slightly dubious ethical/moral question of whether it’s right for companies to collect detailed information about audience habits. A really thorny question, which I’m not going to delve into now.

Which brings us on to the final form of recommendation, the one I believe gives the greatest benefit. And surprise, surprise, yes, it’s Content-based recommendations. Here, I mean something deeper than ‘this episode is in the same brand’, something more specific than ‘this programme has something to do with the same topic’, and something less, well, creepy than ‘twelve of your friends watched the Inbetweeners, so you must too!’. I’m also not talking about just tagging content. Tagging is probably the simplest and crudest way of doing this – it’s a start, but it really isn’t the end game. I mean that it’s necessary to, as far as possible, represent the actual content of the programme as data, and then link to other programmes which utilise the same data. This provides the most accurate recommendations, because we know that the exact same thing (or at least things with meaningful links between them) is being recommended. The downside, unfortunately, for the time being, is that it would have to be a fairly manual process. In this way, yes, it’s similar to the hand-picked, curated recommendations I mentioned earlier. The difference here, though, is twofold. Firstly, we’re capturing the reasons behind the recommendation as data itself, which leads to automatic re-use rather than constantly having to manually pick things (there would, of course, probably still need to be some form of editorial oversight to at least pick out highlights from the potential mass of auto-generated recommendations). Secondly, it can be folded into the production workflow from the very beginning, by engaging with writers & production staff, so that a separate team is not required, and the recommendations can be captured and compiled at the very source, rather than after the fact. Commonly, the people who will know the content (and therefore the links) the best, will be the people who made the content in the first place.
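To make the content-based idea a little more concrete, here is a minimal Python sketch. It assumes each programme has already been described (by hand, at production time) as a set of identifiers for the things its content is actually about, and it recommends other programmes that share those things, keeping the shared things as the recorded reason. The programme IDs, the thing identifiers and the `recommend` function are all invented for illustration.

```python
# Hypothetical descriptions: programme ID -> the things its content is about.
PROGRAMMES = {
    "prog-a": {"event/moon-landing", "place/moon", "person/astronaut-x"},
    "prog-b": {"place/moon", "topic/tides"},
    "prog-c": {"person/astronaut-x", "event/moon-landing"},
    "prog-d": {"topic/gardening"},
}


def recommend(programme_id, programmes):
    """Return (other programme, shared things) pairs, strongest overlap first.

    The shared things are the 'reason' for the recommendation, captured as data.
    """
    source = programmes[programme_id]
    results = []
    for other_id, things in programmes.items():
        if other_id == programme_id:
            continue
        shared = source & things
        if shared:
            results.append((other_id, sorted(shared)))
    # Most shared things first; ties broken by programme ID for a stable order.
    results.sort(key=lambda pair: (-len(pair[1]), pair[0]))
    return results


for other_id, reasons in recommend("prog-a", PROGRAMMES):
    print(other_id, "because of", reasons)
```

Because the reason travels with each recommendation, an editor could still review or highlight the results without having to pick every link by hand.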

This really shouldn’t be news to anyone, and yet it seems that this approach, until now, hasn’t been tried, in the main. I really can’t understand why, although the problems with, and reluctance towards, even providing enough accurate data to power the production-based recommendations perhaps provide a clue. But I don’t think I’m alone in advocating this. In the oft-quoted (but perhaps not often enough!) words of Nicholas Negroponte, in 1995’s Being Digital:

“We need those bits that describe the narrative with key words, data about the content, and forward and backward references….The bits about the bits change broadcasting totally. They give you a handle by which to grab what interests you and provide the network with a means to ship them into any nook or cranny that wants them. The networks will finally learn what networking is about.”

So that’s not just tags, but data to actually represent the content.

With all this in mind, I’ve begun to compile a mixture of production-based and content-based recommendations for traversing through the BBC’s archive. The next post will provide some examples of this, and lead you through the format and choices I’ve made in representing these links in the n3 format of RDF.

Narratives and the Semantic Web

Super Bowl Sunday Crystal Ball, by Circulating, from Flickr, Creative Commons license

“People assume that time is a strict progression of cause to effect – but actually, from a non-linear, non-subjective viewpoint, it’s more like a big ball of wibbly-wobbly…timey-wimey…stuff.”

The websites that we create around the narratives we tell currently focus on the objects within those narratives, like the playing pieces in a set of toys. More often than not, these are hand-crafted, static pages about certain editorially defined objects. Although we can record the links between objects implicitly as things that the audience can travel along, we tend not to expose them as things that the audience can explore and see in context. This, however, is what we are really interested in when describing narratives or telling stories. We define the objects within the world of the narrative, and then describe the interactions and changes between the objects. The intriguing thing is not necessarily the objects themselves, but the ways in which they change, or otherwise. A truly engaging website would therefore allow the audience to explore the world of the narrative not only by navigating between the objects, but by exposing and analysing the links between them, in order to derive more satisfaction.

Outside of the web, when we focus on an object, our minds give it context, and naturally establish the links between related objects. For instance, when on a train journey, if I look out the window, I can see that branch of that tree which is placed there. We are instantly aware of both the object and its context, the thing and its links. Do the same thing with a computer, and it could identify and create a URL for a branch of a tree, but this would exist in a vacuum. It is up to us to give it the context. Using the principles and technologies underlying the Semantic Web, however, we can start to embed the context, the links, the meaning, so that, when using the web, we do not have to define these things every time. Instead, we can concentrate on uncovering and analysing those links, so that we can derive greater understanding and enjoyment from them.

Currently, websites such as www.bbc.co.uk/programmes define the objects, their contexts and links in a semantic web fashion, so that we can uniquely identify a particular object. Essentially, it provides the building blocks upon which we can establish the type of website I have described above. Unfortunately, as far as I am aware, these building blocks are the limits of what we can currently, reliably, achieve. Emerging technologies such as SPARQL and RDF/graph visualisations will help us to build upon these blocks, but I do not think we currently have an established, reliable ‘toolkit’ or process that we can use to do this. However, this does not mean it cannot be done – it needs further experimentation. In the meantime, we can set about ensuring that the websites we build now will allow us to achieve the ideas mentioned above.

In the context of the BBC, there are two areas in which I can imagine the benefits of such an approach. The first, I will only give an overview of, as I have only thought briefly about the possibilities. The other, regarding fictional narrative, has been the focus of my previous blog posts, and I will continue the discussion here.

The first area is sport, particularly football. The BBC Football website contains a wealth of information, covering what is, in effect, the (almost) closed-off world of football. Fans essentially are following a narrative which spans matches, clubs, leagues, seasons, cup competitions etc. There is, obviously, some organisation taking place on the website – organising the clubs into their leagues, for instance. However, the links between these things – and here I mean not just the clubs, but the players, the action – are rarely revealed. We know that a team is relegated from a division because on one day their page exists within the ‘Premier League’ section, whereas the next, they are in the ‘Championship’ section. Their history may be recorded on the team’s page, or preserved in the numbers of a league table for a particular season, but there is no way of effectively (and, most importantly, engagingly) charting their fortunes. Of course, we can present these things in the numbers and bar charts and graphs, but they do not take advantage of the existence of the narrative behind them – which is really what people are interested in. Similarly with players. When two players go in for a tackle, we may know that they have a history of confrontation, or perhaps an embarrassing own goal incident. What if we could provide the context around that tackle as and when, and after, it happens – filling in the back story, and getting the audience excited and engaged?

Similarly, by identifying and putting objects and events in context, we can give the audience something to latch on to. Take, for instance, a penalty incident. Say that the match was being covered on 5 Live with a commentary, it was shown and discussed on Match of the Day by pundits, and then also talked about on forums and 606 by fans. If we had an identifiable ‘hook’ for the incident, then potentially we could build a page which brought together all these different interpretations and discussions of the same event. That way, the audience would have an effective overview of the incident, with informed (and perhaps ill-informed!) opinions – their understanding and enjoyment would be enhanced, and of course, they could make their own contribution.

Back to the fiction – in my last post, I linked to a couple of images in which I tried to explain what I aim to achieve, and where the benefits could be found. The first diagram establishes the episodes as a whole, regardless of series – and then drills down to a particular series, and a particular episode. A website that deals with a fictional narrative needs to remember that the episodes are merely a window onto the universe for the audience. If we intend to allow the audience to fully explore the universe, then apart from pointers leading them from/to episodes, as a form of ‘way-in’ (which, incidentally, should probably be through /programmes) the episodes themselves should (probably) not be included – all that exists are the objects (the places, the times, the characters) and the events.

The first diagram, once an episode has been specified, identifies the characters and events within the episode that are crucial to the narrative. For this, I limited myself to a handful of events and characters, which meant that I did not fully get the richness of the narrative across. However, potentially, we could identify as many events etc. as we require. Below the timeline of events (as presented to the viewer) there are coloured blobs, representing the characters in the events. This view shows us how the characters come and go throughout the episode (for instance, the Doctor only really appearing at certain points in the beginning, middle and end).

The second diagram gets closer to the value of this kind of site. Here, we see that the way in which each character experiences the events of the episode is quite different. This is crucial both to the plot and to the audience’s understanding and enjoyment of the episode. If, for instance, you wondered exactly how things tied together, then exploring this kind of site would allow you to piece together the parts of the puzzle. Perhaps on each character’s page, we would show their timeline, and how things happened to them. From the Doctor’s perspective, for instance, the event at the end of the episode is the first thing that happens to him – and the last from Sally’s point of view. Also, by showing these different timelines in the context of each other, we see the intricate way in which Steven Moffat (the writer) is able to weave the story together – giving the audience a greater appreciation of the story as a whole.
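The per-character timelines could be sketched in Python as follows. This is a toy model under stated assumptions: the `Event` record, its field names and the three placeholder events are invented, with positions chosen only to mirror the point above that the final scene is the first thing to happen to the Doctor and the last for Sally.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: str
    broadcast_position: int  # order in which the viewer sees it
    # character name -> position in that character's own experience
    experienced_at: dict = field(default_factory=dict)


# Placeholder events, not a real breakdown of the episode.
EVENTS = [
    Event("opening-scene", 1, {"Sally": 1}),
    Event("middle-scene", 2, {"Sally": 2, "Doctor": 2}),
    Event("final-scene", 3, {"Sally": 3, "Doctor": 1}),
]


def broadcast_order(events):
    """The timeline as presented to the viewer."""
    return [e.event_id for e in sorted(events, key=lambda e: e.broadcast_position)]


def timeline(character, events):
    """The same events, ordered as one character experiences them."""
    involved = [e for e in events if character in e.experienced_at]
    involved.sort(key=lambda e: e.experienced_at[character])
    return [e.event_id for e in involved]


print("Viewer:", broadcast_order(EVENTS))
print("Sally: ", timeline("Sally", EVENTS))
print("Doctor:", timeline("Doctor", EVENTS))
```

A character page could render `timeline(...)` for that character, while a comparison view could lay several timelines side by side against the broadcast order.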

Obviously, Blink (so far) is an atypical episode of Doctor Who. By and large, the stories do not tend to concern themselves with the ‘timey-wimey’ stuff. However, over the course of a series, or indeed several series, characters, events etc re-appear – for instance the ‘Bad Wolf’ motif – the reason that the cliffhanger to ‘Turn Left’ works so well, is because it draws together elements of continuity established throughout several series. The audience gets maximum enjoyment out of such a moment because they are aware of the links and the context.

So what of the original series, whereby both ‘timey-wimey’ stuff and ‘story-arcs’ were at a minimum? Well, there are still instances of recurring themes, but overall, stories are self contained. That’s fine – they could be slotted into this kind of website just like everything else, because it essentially gives us a great pool of narrative to draw upon – if and when needed. Crucially, though, they represent a pool of ideas that future writers can draw upon if they wish. Continuity should not restrict the writing of future stories – the previous stories merely open out the fictional universe, creating more richness for authors. As such, when feeding the ‘classic’ stories into the website, the site becomes a form of ‘official’ wiki. Users can and should be encouraged to contribute, as a form of writing their own stories, but a distinction can be drawn between the events depicted on screen (it is, after all, and should not be forgotten, a television show…) and those where people ‘fill in the gaps’. The series itself has touched upon this, with the idea of certain events being ‘fixed points’ and others being ‘in flux’. As long as the narrative is not disrupted (i.e. breaks down so that it no longer makes sense to the audience) and does not become too insular (i.e. relying too heavily on continuity, so that new audiences are driven away), then continuity can enhance the fictional narrative universe as a whole.

Finally, a new diagram which, on a very basic level, tries to illustrate the idea that the website could be explored and presented through the model of, as quoted above, “a big ball of wibbly-wobbly…timey-wimey…stuff.” The diagram is quite obviously incomplete, but the idea is that the objects and the links between them are visualised, and the audience can then choose to look at a particular object, and see how it ties in to everything else – seeing both the object and its changing context and perspectives at the same time.

Phew. That’s enough for now. Till next time…

Tuning Fork

Tuning Fork, by Toby Esterhase, via Flickr – Creative Commons

Part three of my investigation into fictional content modelling. See the previous two posts for the background to the project. Thanks to those who’ve been discussing the ideas – I think it’s coming along nicely. I’ve been playing around with writing some RDF, trying to link up various ontologies, and explaining what I’m trying to do as I go along. Here’s a plain text file of quasi-RDF within comments – see what you think…(UPDATE: Now here in beautiful RDF format :-) )

One thing that has come up in the discussions, though, is that there are perhaps two elements to what I’m trying to achieve. The first is to link existing ontologies and, if needed, build a new one, to help describe the narrative content of ‘stories’ within the context of television and radio programmes. The second is to experiment (and for me to learn) with existing ontologies, again, linking them up, to build dynamic and interesting webpages that work on linked data principles.

So I’m interested in the ontology *and* what kind of cool stuff we could build on top of it (which includes ideas around remixing narrative, and audience story-telling). I haven’t got any definite plans on top of that at the moment, but I think the key is to see where it takes us. Well, I have an image in my mind of the types of things we could do, but again, it will be easier to describe them by prototypes. Something that might help is if I was to link to this diagram, from the aforementioned Tristan Ferne’s Radio Labs blog, describing similar things to do with the Archers – except linking that up with linked data/ontology work…

Which would lead to something like the diagram below. Again, it isn’t a complete set of what I want to do, but it shows the types of objects we’re talking about, the relationships between them, and where they link to ontologies:

Contextual Data Model

Actors – Using FOAF, with possible extensions, this would be a URL for each actor who appears in a BBC show. This page could pull in a biography from Wikipedia, for instance, but mainly it will show the audience all the programmes that the actor has appeared in. Linking Actors to Characters, all the way through to Episodes, would allow us to auto-generate the cast lists for the /programmes episode pages. However, one problem in an early implementation might be that if we only record ‘significant’ events within an episode, the cast lists won’t represent everyone – but over time, this could be improved (the rest of the cast could possibly be listed manually against the episode, greyed-out, until they have their own URL).

Portrayal – This would allow an Actor to play many Characters, and a Character to be played by many Actors. Here I’m thinking more of ‘flashback’ scenes where you see a character as a child, but as Tom pointed out in the comments, this could be used to handle the different actors playing the Doctor. But how then would you deal with the different ‘characterisations’ of the same character?

This is where the recursive relationship around ‘Character’ comes in – I haven’t worked out exactly what to call this yet, but it would allow both the foaf:knows relationship, and potentially use the owl:sameAs to link different Doctors? (Perhaps not – but something along those lines).

Again, a many-to-many resolver is needed between Characters and Events, which I’ve called ‘Action’ – I’m not sure whether these many-to-many objects would need to be made explicit and have their own URLs, but the main objects certainly would, as they could have useful pages for the audience to explore.

Events would be pages that would describe a significant event in the episode, something that would be worth describing, for instance an event which is part of a wider story arc – we would then need a URL to link these together, so you could say that ‘Someone points out that Donna has something on her back’ is part of the ‘Donna/Time-Beetle’ story arc (apologies for the random example!). This is, though, where the main value of the project would be for the audience. By giving an event a URL, the user could trace storylines throughout the episodes, outside of the confines of the episode structure – making the fictional universe more cohesive, rather than restricting our view to the episodes, which are like ‘windows’ onto the fictional universe.
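Tracing a storyline across episodes could be sketched in Python with plain (subject, predicate, object) tuples standing in for RDF triples. The URLs, the `partOfArc` and `shownIn` predicate names and the second event are all invented for this sketch; they are not terms from the Event or Programmes ontologies.

```python
# Toy triples; every URL and predicate name here is a placeholder.
TRIPLES = [
    ("/events/something-on-her-back", "partOfArc", "/arcs/donna-time-beetle"),
    ("/events/something-on-her-back", "shownIn", "/programmes/episode-2"),
    ("/events/second-arc-event", "partOfArc", "/arcs/donna-time-beetle"),
    ("/events/second-arc-event", "shownIn", "/programmes/episode-5"),
    ("/events/unrelated-event", "shownIn", "/programmes/episode-2"),
]


def trace_arc(arc, triples):
    """Return (episode, event) pairs for every event that belongs to the arc."""
    in_arc = {s for s, p, o in triples if p == "partOfArc" and o == arc}
    return sorted((o, s) for s, p, o in triples if p == "shownIn" and s in in_arc)


for episode, event in trace_arc("/arcs/donna-time-beetle", TRIPLES):
    print(episode, "contains", event)
```

The unrelated event shares an episode with an arc event but is left out of the trace, which is the sense in which the storyline is followed outside of the episode structure.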

Similarly, if a user then wanted to write a story featuring some of the characters, they could refer to the character’s URL (which would then allow us to have something on the character’s page to say ‘others have written stories using this character’ – linking out onto the web, and promoting new writers and stories). The users could equally refer to events, perhaps building events into their own stories, taking them as cues for new stories etc. Again, it all fits in with the idea of giving our audience the tools to be creative, whilst using the advantages of the BBC website’s exposure to promote audience creativity.

There’s one many-to-many resolver which I’m not sure about at the moment – between Events and Episodes – what if the same event was shown, or even just referred to, in more than one episode? We would need some way of defining this – but I’m not sure of the correct term for it yet, hence the ‘???’ object.

So – events could be described using the Event Ontology. Actors and Characters would use the FOAF ontology. Episodes would use the Programmes Ontology. We therefore just need a way of tying them together, and then once we have some examples, it would be good to start thinking about what new things we might need from a new ontology.
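Pulling the objects above together, the model could be sketched in Python like this. It is only an illustration under assumptions: the class and field names are mine, `EventInEpisode` is a placeholder name for the unnamed ‘???’ object, and the URLs and actor names are invented rather than taken from /programmes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Portrayal:  # many-to-many resolver: Actor <-> Character
    actor: str
    character: str


@dataclass(frozen=True)
class Action:  # many-to-many resolver: Character <-> Event
    character: str
    event: str


@dataclass(frozen=True)
class EventInEpisode:  # placeholder for the '???' resolver: Event <-> Episode
    event: str
    episode: str


PORTRAYALS = [
    Portrayal("/actors/actor-one", "/characters/donna"),
    Portrayal("/actors/actor-two", "/characters/the-doctor"),
    Portrayal("/actors/actor-three", "/characters/the-doctor"),
]
ACTIONS = [
    Action("/characters/donna", "/events/something-on-her-back"),
]
EVENT_EPISODES = [
    # The same event shown (or referred to) in more than one episode.
    EventInEpisode("/events/something-on-her-back", "/programmes/episode-x"),
    EventInEpisode("/events/something-on-her-back", "/programmes/episode-y"),
]


def cast_list(episode):
    """Derive (character, actor) pairs for an episode from its recorded events."""
    events = {link.event for link in EVENT_EPISODES if link.episode == episode}
    characters = {a.character for a in ACTIONS if a.event in events}
    return sorted((p.character, p.actor) for p in PORTRAYALS if p.character in characters)


print(cast_list("/programmes/episode-x"))
```

Two limits of the sketch are worth noting. The Doctor is missing from the derived cast list because no recorded event involves him, which is the ‘significant events only’ problem described under Actors. And a character with several actors would list all of them for every episode, so a fuller model would probably need to attach the portrayal to the action or the episode.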

On the subject raised in the comments about expressing a person in FOAF as fictional or real – I’d side with Tom in saying that it would be better to label the individual people as fictional, so that it was explicit which FOAF people were characters or not – and then you’d also have the issue of characters being used to represent, for instance, historical figures such as Charles Dickens…

Anyway, that’s enough for this entry. I hope I’ve got a little further in both clarifying the two strands of the idea, and exploring the breadth and potential of it. Comments, discussion, etc. encouraged! I’m hoping to present the idea in a meeting this coming Tuesday as a possible 10% time project, so I will keep you posted…