Baby Steps

Photo by strollerdos, via Flickr, Creative Commons

This is the second post in a series covering my exploration, experimentation and musings in the area of fictional modelling. In short: can we use the recent developments in semantic web technologies to represent elements of fictional content, and what does this allow us to do? For my introduction to the topic, see my previous post here. In this entry, I’ll talk about my first practical steps, and their implications. Thanks also go to Tom Scott, Dan Brickley and Anthony Green, amongst others, who responded to the first post with helpful comments.

Before I go any further, as pointed out by Chris Sizemore, it’s worth noting that work has been done in a similar area before. Previous IAs at the BBC, including Celia Romaniuk, worked on an ontology to describe the content of soap operas, known as SUDS. From what I have seen, it was an extension to FOAF in order to describe further relationships between people, the nature of people ‘playing’ characters, and various events that could take place between the characters in a show. This was done to tie in with an EastEnders website relaunch. I won’t go into much more detail here, but if you’re interested in seeing the original work, there’s a short article here and a great presentation here. Unfortunately, apart from a few example XML fragments, I have so far been unable to find a document that defines the SUDS ontology. This is a shame, because it would have been an extremely useful starting point for my experiments. One option might be to gather the examples together and try to reverse-engineer a schema, but for the moment, and partly as a way for me to learn as much as possible, I’ve decided to start from scratch. Hopefully at some point we can find the SUDS ontology and see how it compares to what I come up with.

So, where to start? Well, as the title suggests, I’m going to start small. Sort of. Readers of the blog, and others who know me, will probably have guessed that I’m a bit of a, shall we say, ‘fan’ of the BBC’s Doctor Who (currently in the news for apparently appointing a 12-year-old as the Eleventh Doctor). So much so, that in my sad little way, most things that I’m presented with in the course of my BBC IA work make me think “How could/would this apply to Doctor Who?”. As a programme that originally ran for 26 years, and has been enjoying an overdue renaissance, its rich history, and sheer refusal to ever completely conform to most IA domain models, make it both a source of frustration and inspiration. So when I read Tristan Ferne’s blog post over at BBC Radio Labs, shortly before joining the Beeb, I began to wonder. Have a read, it’s a good example of a similar idea.

Tristan’s article concerns fictional modelling for another hugely successful BBC show, The Archers. He talks about being able to break an episode down into scenes, characters, plots etc. and, for instance, potentially being able to build pages that allow the user to follow a story through multiple episodes, rather than being tied to the traditional episode format. Of course, to paraphrase Jack Bauer, events within The Archers occur in linear time. If we were able to build dynamic and interesting websites from a show like that, centred around a small English village, how about a show that goes forward, back and sideways in time and space? Harking back to my ‘toy box’ analogy from last time, with the imagination of the writers of a show like Doctor Who, and the imagination of our audiences, the potential to create some fantastic websites would be huge.

Sorry, where was I? Oh yes, starting small. So, yes, obviously I couldn’t hope to cover the whole scope of the show in one go. However, to show the potential of the semantic web and linked data approach, I’d want to start off by experimenting not only with characters who are linked together, but with a plot that is threaded through several episodes. I still haven’t quite decided what I’m going to choose for this, but I’m thinking that the story arc from either the first or fourth series of the current show would be good to try. But before all that, I had to learn how to create some linked data.

So I went even smaller, even simpler. I chose the first ever episode of the show, from 1963. This featured four main characters, and thanks to the workshop from Yves and the others, I had an inkling of an understanding of how to create FOAF profiles. The results can be seen here (best viewed if you use a Firefox plugin like Tabulator). So far so good. I then linked each character to each of the others, using the simple ‘knows’ relationship (foaf:knows). Finally, to get my linked open data brownie points, I linked each character to its DBpedia equivalent, using the OWL ‘same as’ relationship (owl:sameAs). And that’s basically it. Except…
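As a rough illustration of the three steps just described (a FOAF profile per character, ‘knows’ links between them, and a ‘same as’ link out to DBpedia), here is a minimal sketch in plain Python that writes the data out as N-Triples. It is not the RDF file used in the experiment. The `example.org` base URI, the slugs and the helper names are all made up for the example, and the DBpedia resource names are assumptions that would need checking against DBpedia itself. The FOAF, OWL and RDF namespace URIs are the real ones.

```python
from itertools import permutations

# Real vocabulary namespaces.
FOAF = "http://xmlns.com/foaf/0.1/"
OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical base URI for the character resources (an assumption).
BASE = "http://example.org/doctorwho/characters/"

# slug -> (display name, ASSUMED DBpedia resource name, not verified)
CHARACTERS = {
    "the-doctor": ("The Doctor", "Doctor_(Doctor_Who)"),
    "susan-foreman": ("Susan Foreman", "Susan_Foreman"),
    "ian-chesterton": ("Ian Chesterton", "Ian_Chesterton"),
    "barbara-wright": ("Barbara Wright", "Barbara_Wright"),
}


def build_triples(characters, base=BASE):
    """Return a list of (subject, predicate, object) terms in N-Triples form."""
    triples = []
    for slug, (name, dbpedia_name) in characters.items():
        subject = f"<{base}{slug}>"
        # Step 1: a basic FOAF profile for each character.
        triples.append((subject, f"<{RDF}type>", f"<{FOAF}Person>"))
        triples.append((subject, f"<{FOAF}name>", f'"{name}"'))
        # Step 3: link out to the (assumed) DBpedia equivalent.
        triples.append(
            (subject, f"<{OWL}sameAs>", f"<http://dbpedia.org/resource/{dbpedia_name}>")
        )
    # Step 2: every character 'knows' every other character, in both directions.
    for a, b in permutations(characters, 2):
        triples.append((f"<{base}{a}>", f"<{FOAF}knows>", f"<{base}{b}>"))
    return triples


def to_ntriples(triples):
    """Serialise the triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(build_triples(CHARACTERS)))
```

With four characters this gives three statements per profile plus twelve ‘knows’ links. Dropping the owl:sameAs line from `build_triples` would give something like the lighter, DBpedia-free version of the data.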

Except even this small experiment (which I eventually got working after help from Yves!) raises some interesting points. Firstly, the pernickety part of my brain is saying that we’re mixing two distinct things here: we’re using FOAF to model fictional things when, I guess (and I’m happy to be corrected), it is primarily intended to represent real people. Crucially, nowhere, at the moment, are we explicitly stating that these resources are fictional characters. So I’m wondering whether FOAF is the correct ontology to use. Of course, like SUDS, the ontology that results from these experiments will probably be an extension of FOAF, as it is true to say that we’re still modelling the same sort of ‘thing’, the relationship between ‘people’. But the point still stands – we need some way of indicating the ‘fictional’ nature of a FOAF person, if applicable.
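To make the ‘fictional’ point a little more concrete, here is a minimal sketch, again in plain Python, of one possible approach: an extra class that extends foaf:Person, applied as a second type on each character. The `fic` namespace and the `FictionalCharacter` class are entirely hypothetical. They are not part of FOAF, SUDS or any published ontology, and this is only one of several ways the idea could be modelled.

```python
# Real vocabulary namespaces.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical namespace and class, invented purely for this sketch.
FIC = "http://example.org/fiction#"
FICTIONAL_CHARACTER = FIC + "FictionalCharacter"


def mark_fictional(triples, character_uri):
    """Return a new set of triples in which character_uri is also typed as fictional.

    The original collection is left untouched.
    """
    return set(triples) | {
        # The hypothetical class is declared as a specialisation of foaf:Person,
        # so existing FOAF tools can still treat the character as a person.
        (FICTIONAL_CHARACTER, RDFS + "subClassOf", FOAF + "Person"),
        (character_uri, RDF + "type", FICTIONAL_CHARACTER),
    }


def is_fictional(triples, character_uri):
    """True if the character has been explicitly typed as fictional."""
    return (character_uri, RDF + "type", FICTIONAL_CHARACTER) in triples


if __name__ == "__main__":
    # Hypothetical character URI, used only for the demonstration.
    ian = "http://example.org/doctorwho/characters/ian-chesterton"
    profile = {(ian, RDF + "type", FOAF + "Person")}
    print(is_fictional(profile, ian))
    print(is_fictional(mark_fictional(profile, ian), ian))
```

Because the marker sits on the individual resource, a real person appearing in a fictional story could simply be left without it.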

Secondly, and perhaps more importantly, as Anthony Green pointed out, and as I discovered when I linked the characters to their DBpedia equivalents, there’s a lot of detailed information out there already. When I linked each character to DBpedia, I got back information which was extremely detailed and fairly well structured. Which, to be honest, depressed me a little bit. Was it worth me continuing? It’s clear that others had done a lot of similar work already, and I knew that ultimately it would be silly to reinvent the wheel.

However, then I remembered what data I was trying to link. Of course I should still link to the DBpedia equivalents, but the linked data I am thinking of is more to do with linking between characters, plots etc within my own domain. I’m still slightly uneasy with this, because I know that obviously the main thrust of the whole linked data movement is to link external sources together, and that creating silos of data is not good. However, I’m still definitely in favour of linking to DBpedia – if we were to make our ‘internal’ linked data semantically rich, and then link to external sources, then everyone would benefit, and in a way, we would be regarded as the ‘master’ source in the same way that, in my small experiment, I used DBpedia as my ‘master’ source.

So that’s it. A long, rambling blog post, and a small, simple experiment. Baby steps. Apologies for the rambling, and I’m not sure that I *quite* explained myself properly in that last part – but there are definitely some interesting issues coming up already, and I’m hoping that the advantages of my position will be borne out in future experiments. Finally, I’ve adapted the RDF file that I used to create the FOAF profiles to temporarily remove the OWL ‘same as’ relationship – just to ease the page loading time and, for the moment, to give me a cleaner space to work in. The adapted version is here, the original version here. Linking back *in* to DBpedia will be a task for later…

Again, comments, queries and advice are more than welcome – comment, tweet or email me.

13 thoughts on “Baby Steps”

  1. Tom Scott

    This is looking really interesting – and of course picking Doctor Who will present some interesting ‘issues’ for example when the Doctor regenerates… same character different err instantiation (and different actors).

    Modeling different Doctors to different baddies and different sidekicks with FOAF will be very cool. Especially when also linked to :pids… would be able to ask questions like “which incarnations have battled the Cybermen?”

  2. Dan Brickley

    Thanks for keeping us posted on all this 🙂

    Re SUDS, I hope someone can dig out the details. I’ll ping Celia too. Post-SUDS, many things have moved on so revisiting would make sense anyway: we now have Wikipedia/DBpedia, Semantic Media Wiki, as well as a query language (SPARQL) that allows different sets of claims to be grouped into separate ‘graphs’, organized eg. by date or source. This could perhaps be used to address use cases like “who did so-and-so think was blah-blah’s father, in 200x” (although I’m not sure how far in that direction makes sense).

    On the question of FOAF for fictional characters, this was (tada!) anticipated in the FOAF spec, xmlns.com/foaf/0.1/#term_Person says

    “The foaf:Person class represents people. Something is a foaf:Person if it is a person. We don’t nitpic about whether they’re alive, dead, real, or imaginary. The foaf:Person class is a sub-class of the foaf:Agent class, since all people are considered ‘agents’ in FOAF.”

    I’m not sure whether it is better to represent ‘fictional-ness’ at the person level, or the dataset/document level. Rather than saying that some John Doe is a fictional character, perhaps describe the entire dataset as a work of fiction. After all there are other use cases for describing non-existent situations, such as describing the content represented in a painting – eg. http://www.w3.org/2001/sw/Europe/200206/imagemeta/smilanno/ via http://www.w3.org/2001/sw/Europe/200207/imgann/possible-annotations.html

    Anyway do please keep blogging this stuff. Is there a planet-style aggregator somewhere of all the opendata/semweb efforts coming out of the BBC? It’s hard to keep track lately…

  3. Tom Scott

    @Dan I was thinking it would make more sense to represent the fictional-ness at the person level… only because you sometimes have real people represented in fictional works. So although the story and other characters are fictional, some agents might be real… or does that not make sense?

    (Very happy to hear you can’t keep track of the BBC’s semweb efforts – I try to blog about, or link to, what I know about over on my blog, although I’m sure it’s not complete.)

  4. Paul Post author

    Thanks for the comments guys. I’m starting to think there are essentially two elements to what I’m trying to achieve – the ontology work, and the ‘flexible website for fiction built on top of RDF and SPARQL’. I’ve been playing around with writing some RDF with lots of comments, so expect one of the posts in the near future to essentially be a load of quasi-RDF code with my thoughts interspersed.

  5. Dan Brickley

    @Tom … interesting problem. I guess to be really picky, one would want to know which of the claims in the graph are considered true. Which goes a bit beyond just listing the people who don’t exist. However I take your point re the non-fictional people. No idea what’s best, except to build things and see what works 🙂

    Some old notes – http://web.archive.org/web/20061008023851/http://rdfweb.org/people/danbri/2001/12/puzzle/unicorny.html

    …not sure I go by everything I wrote there, but certainly this bit: “Many RDF documents will be both false and useful”

  6. Pingback: Interesting stuff from around the web 2009-01-12 « Derivadow.com

  7. Kim

    Hello

    When I was in iD&E we did a piece of work to make a data model that supported drama and comedy shows: Locations, fictional characters, ‘Universes’ (ie, Torchwood + Dr Who are in the same ‘Universe’)

    IIRC, it supported Dr Who, in that the same character could be played by many actors, appear with themselves etc. Dr Who is a very good model, as there are moments when, for instance, the Dr appears in Albert Square with the cast of EastEnders. (I also recall that the BBC’s postcoder system supports the idea of fictional postcodes, so Walford and Ambridge can take their own postcodes…)

    I think I have all the documentation on a hard drive somewhere… it was based on SMEF, and was essentially an extension of it. We used SUDS as a kicking off point, and I am pretty sure I have loads of the SUDS documentation too.

    Drop me an email to remind me to dig it all out for you???

  8. Kim

    Ooh, I’ve also got all the XML for the old Dr Who site episode guide somewhere, too… but you didn’t hear that, right? 🙂

  9. Paul Post author

    @Kim, that would be brilliant, cheers. I’ve seen some SMEF based stuff from ‘Project Dorothy’ – not sure if that’s the same thing – but the SUDS documentation could be very useful (and the DW XML as well :D)

    The only thing which I’m wary of in using SMEF is its complexity – and its basis in more ‘traditional’ computer system based projects – obviously I’d like to make whatever happens as a result of this flexible and integrated, but I wouldn’t want to drive it into the ground by making it rigid and too tightly bound to SMEF. But hey, it’s worth a look 😀

  10. Paul Post author

    Hi – yes, Kim sent me the SUDS/SMEF files – I haven’t had much chance to do further work on this recently (apart from the latest blog post), but I’m optimistic that something more ‘official’ may be on its way soon…

    I’ll find out more about making the SUDS/SMEF files public soon.

    Thanks for the continued interest in this!
