I was in Cambridge a few weeks ago for an event called Ways of Machine Seeing, organised by the Cambridge Digital Humanities Network, the Cultures of the Digital Economy Research Institute (CoDE), and Cambridge Big Data.

The workshop aimed to reframe the influential BBC TV series Ways of Seeing —first broadcast 45 years ago— under the logic of (relatively) new computing technologies and their concomitant theories of the visual as a form of information processing: machine vision or computer vision, as it is now commonly known.

I presented a short paper called “anomalous inputs: artificial spectatorship in film”, which I hope will grow into a full chapter of my PhD project, which looks more widely at artificial intelligence and cinema.

Right before my talk, Geoff Cox and Nicholas Malevé presented their own experiments applying computer vision techniques to comment on the footage of the TV series. Among other things, Nicholas demonstrated how automatic object recognition could be mistrained, or rather, perversely trained, to “see” a world comprised of different, more idiosyncratic, categories of objects and people.

This experiment tweaks the You Only Look Once (YOLO) system, which deploys an artificial neural network optimised for object recognition in real-time video footage. The network, when trained using the PASCAL VOC dataset, can recognise categories such as persons, dogs, cows, bicycles, boats, bottles, etc. Nicholas showed a clip of YOLO trained using a different dataset (COCO, sponsored by Microsoft) and, finally, a version of it trained by his hacker/amateur friends.
Put side by side, the results are funny and revealing at the same time:

Click here to see more about this project.
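The dataset-swapping idea can be sketched in a few lines: a classifier's raw output is just an index into whatever vocabulary of labels it was trained with, so retraining on a different dataset literally changes what the machine can “see”. This is a toy illustration, not YOLO's actual code; the label lists and score vector below are invented for the example.

```python
# Toy illustration: a detector outputs a vector of class scores; the
# category it "sees" is simply an index into the training vocabulary.

voc_labels = ["person", "dog", "cow", "bicycle", "boat", "bottle"]  # PASCAL VOC-style
custom_labels = ["commuter", "companion", "livestock", "contraption",
                 "vessel", "commodity"]  # an invented, idiosyncratic vocabulary

def classify(scores, labels):
    """Return the label whose score is highest (a stand-in for detection)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]

scores = [0.1, 0.7, 0.05, 0.05, 0.05, 0.05]  # mock network output
print(classify(scores, voc_labels))     # the same output vector...
print(classify(scores, custom_labels))  # ...named by a different vocabulary
```

The same score vector yields “dog” under one vocabulary and “companion” under the other: nothing about the network's arithmetic changes, only the world of categories it was trained to report.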

It becomes clear from this neat little experiment that machine vision is both very different and very similar to human vision. Different in the sense that the computer doesn’t really “see”, as much as it associates and outputs tokens of similarity with varying degrees of accuracy. And similar because these engines are trained to encode human subjectivities, and thus replicate collective biases and deeply ingrained cultural assumptions.

The question of training machines —how this training occurs, where the data sets come from in the first place and who has access to them— came up as a politically charged question in other presentations and in informal discussions throughout the event.

The other highlight for me was to have artists and curators alongside technologists and engineers, all genuinely trying to understand and work with each other. This was an event where there were presentations with sentences such as “the purposed evisceration of phenomenological apparatuses” but also “bullet points make me happy and comfortable”. There is a lesson here, I think, for humanists and engineers: it is true that technology does not occur ex nihilo, but nor should we take it talis qualis.

Thanks to Professor Alan Blackwell for inviting me to this event, for his warm welcome at Darwin College, and for his tour of the Computer Lab. Also thanks to Leo for showing us his robot…