Curate Board Game

Today we played Curate, the digital curation board game, brought to us by Laura Molloy and our MCPDM lecturer Yunhyong Kim.

The game seemed to be based on Monopoly, with each player taking turns to roll the dice and advance their token clockwise around the squares. If a player landed on a square marked ‘danger’, ‘caution’ or ‘take a DigCurV card’, they chose the corresponding card and had to answer a question or take action following the card’s instructions. The objective of the game is to prompt discussion about digital preservation by presenting various problematic scenarios requiring solutions. As a group of 6, we each contributed ideas for solutions and concluded that collaborative discussion was useful. We also realised that in a real-world environment, although knowledge of digital curation is necessary, management and people skills are fundamental, and it helps to surround yourself with subject specialists.

Sean also concluded that the game has a finite ending whereas digital curation continues.

See Laura Molloy’s blog post on the HATII blog here;

Further information on the game can be found here;

Week 9 Post-it notes!!


Here are two photos of the last group’s work from our afternoon session. We weren’t really able to explain them at the time, so I thought I would post them here in case you want to see what we made of the task!

Following on from what another person from the previous group said about bending the model to suit your needs, we also said you could/should pick and choose aspects of different models/standards to suit your repository’s needs.

We thought that there still needs to be a lot of discussion on things like significant properties, and that there isn’t just one ‘perfect’ method, format, etc. for digital preservation. A lot of work still needs to be done in this area, but a lot of good work has already been done to get us all started!

We all agreed that metadata is very important for preserving digital materials. Digital preservation isn’t just a job for archival institutions; it’s a job that everyone has to be involved in. But we thought we need to educate people more on the subject, as most people think their files, family snaps, etc. are safe and secure forever… but they are not!

The second photo is a little clearer, so hopefully you can make out for yourselves what we thought about ‘actors’, ‘objects’ and the seven functional components!

This was a good afternoon session which was very useful for revision and for seeing how all the different aspects of the course are connected. It was interesting to actually break down parts of the OAIS model for ourselves, instead of listening to a seminar on it, as it made us think about what happens in each of the components and made us realise just how linked they are and how common themes kept popping up, like authenticity, reliability and integrity.

It would have been more interesting if we had been allowed to go around the class and look at each other’s work on the task, as I think we missed a lot of what was written down; it was just skimmed over at breakneck speed in the mini-presentations. We could have had a Q & A afterwards on any parts we weren’t sure about, or each said which statement we liked best or found the most interesting or useful. It would have been good to see others’ ideas on the topics, as we all generally had the same ideas but approached them in slightly different ways.

Reference: Emulation

Following on from today’s discussion in class about emulation and migration, here is the link to the YouTube video from the Emory University team who were given Salman Rushdie’s old computer and were involved in the task of emulating the software, etc. in order to extract his work and archive it. They thought it was important not only to save the content of his work but also to demonstrate how it looked to Rushdie when he used his computer. Some surprising things popped up for them that they thought they might not have known or understood had they simply copied the content.

Obviously a lot of work went into this process but the team believe it will have on-going benefit to their archive, university and beyond.

There are apparently a few videos but I could only find/access two, so if you can find any more please feel free to post them!

MCPDM Lab Task Part 2

For our next task, Julie and I played ‘Metadatagames’.



We chose the British Library Collection ‘Book tag’ game, where we were presented with 4 books which we had to tag. We scored points for each tag we assigned to each book, and scored 58 points for the first game.

Next we played the ‘Zen tag’ game. We were again presented with 4 images and had to create tags, and were scored based on the number of tags applied to each image. We did a bit better here with 82 points.

Next we tried ‘Stupid robot’, where we had a 2-minute countdown to tag the presented image with as many descriptive words as possible. The words had to start off short and get longer incrementally, so we used basic words like ‘tree’ or ‘branch’ to describe the offered woodland scene. The time element made this a bit more difficult than the previous games.

We also tried ‘Portrait tag’, which was a variation on the previous two games, again assigning tags to the 4 images presented. By now we had become much better and scored 98 this time!

Once we completed these 4 games, we realised that by playing, we were contributing more useful information to each image which would hopefully allow people in future to find them using this new attached metadata that we provided.

We tried to adhere to the 15 Dublin Core metadata elements; however, this was very difficult in many ways. Some images of books, for example, had the publisher and date; most showed books in English, although one was in Italian and another in Latin; others had little or no information, so it was difficult to fill in the gaps. Another issue we came across was trying to find terms to describe the images that were easy for other people to understand. As there was no list of standard tags to use as descriptors, we used our own words, which may not be useful if someone did a search: something we describe as a coat, someone else may call a jacket, and so on. Also, if we made a mistake or an assumption, these tags are stored with the object and may well be incorrect. It was not clear whether the information we provided would be checked or verified by the site, or whether it could be changed at a later date by someone else.
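The fifteen elements of simple Dublin Core, and the gaps we kept hitting, can be pictured as a record with mostly empty slots. A minimal Python sketch (the record values below are invented for illustration, not taken from the actual games):

```python
# The fifteen elements of simple Dublin Core (DCMES).
DUBLIN_CORE_ELEMENTS = [
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
]

# A partial record for one tagged image; slots stay None when the
# image itself gives no information to fill them.
record = {element: None for element in DUBLIN_CORE_ELEMENTS}
record.update({
    "Title": "Woodland scene",   # hypothetical example values
    "Type": "Image",
    "Language": "it",            # e.g. one of the books was in Italian
})

missing = [e for e, v in record.items() if v is None]
print(f"{len(missing)} of {len(DUBLIN_CORE_ELEMENTS)} elements still missing")
```

Even this toy record leaves most slots empty, which matches our experience: the hard part is not the element list but finding trustworthy values for it.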

It was a useful task which made us really consider how important additional information is when storing digital images or other digital objects. Overall, this is a very useful way of crowdsourcing information especially if there is a huge volume of digital data that requires it, which may be impossible for an individual heritage centre to undertake without additional help.

Values for the fifteen Dublin Core metadata elements can be found here;

MCPDM Lab Task 18th Feb

Today we were given 10 sample files and our task was to explore the metadata for each file.

The two suggested ways of doing this were by looking at the properties of each file and then checking further by using the online extract metadata tool.

The 10 sample files are as follows:

Adobe Acrobat document (EDRM-TalentTaskMatrix-v1.pdf)

By checking the properties, it confirmed the file type as an Adobe Acrobat document, modified on 04/02/2013 at 15:29. It gave the file location, the size as 97KB (compressed size 93KB), the method: deflated, the cyclic redundancy checksum (used to validate the integrity of the data) CRC-32: 8ABE110F, and Index: 6. However, once I checked the same file using the online extract metadata tool, it showed that it was originally created in Excel (hence the title showing the .xlsl extension) on a Mac, by an author called George Socha. None of this information would have been obvious otherwise.
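That CRC-32 value and the ‘method: deflated’ line can both be reproduced with Python’s standard zlib module. A small sketch; the byte string is a stand-in, since the real 8ABE110F can only be recomputed from the original file’s bytes:

```python
import zlib

data = b"123456789"  # stand-in payload; a real check would read the file's bytes

# CRC-32 is a checksum: the same input always yields the same 32-bit
# value, so any change to the file shows up as a changed CRC.
checksum = zlib.crc32(data)
print(f"CRC-32: {checksum:08X}")  # "123456789" gives the standard check value CBF43926

# 'Method: deflated' refers to the DEFLATE algorithm used inside zip
# archives; zlib exposes it directly, and decompression is lossless:
compressed = zlib.compress(data)
assert zlib.decompress(compressed) == data
```

This is why the properties box can show both a size and a smaller compressed size: DEFLATE shrinks the stored bytes, and the CRC lets the archive verify that decompression restored them exactly.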


DWG file (civil_example-imperial.dwg)

By checking the properties, it confirmed the file type as a DWG file (usually a graphic file associated with AutoCAD), modified on 21/05/2011 at 20:23. It gave me the file location, a size of 166KB, a compressed size of 67KB, method: deflated, CRC-32: 9F5AFC3E and Index: 1. The online extract metadata tool yielded no results, saying “Max. filesize: 5MB !”


GIF image (Wrinkled_Paper.gif)

I started this task on Campus using a PC in a computer lab. I am not sure what version of Windows it was running, but as I did not finish, I completed the task at home on my own PC running Windows 7 Home Premium. This made a difference to the available metadata if you compare the properties as shown in the image below;


The first properties box (illustrated in the grey box) gave three pieces of information that were not available when I checked the same file later on my home PC: CRC-32: 3FF3001B, Index: 4 and method: deflated.

There are two values in the second properties box, ‘size’ and ‘size on disk’, which differ very slightly, so I assume these correspond to the ‘size’ and ‘compressed size’ values in the grey box, although the second box gives exact values (15,063 and 15,360 bytes) where the first appears to be rounded up. Obviously the location of the file is different as it has been accessed from two different machines. The second properties box offers three dates (when the file was created, modified and accessed), whereas the first only gives the modified date. In addition, the second box gives information about the owner, attributes, bit depth and even the dimensions of the image in pixels; none of this metadata is present in the first.

The file has not changed (apart from its location), so I can conclude that the newer system exposes more information to the user, which must already have existed but was not available to view in the older system. The second properties box also gives the user the option to ‘remove properties and personal information’ attached to this file, which is not an option when checking the properties on the Campus system. Once I checked this file using the online extract metadata tool, the format was shown as MPEG-1 and the mimetype described as audio/mpeg with a duration of 0m00; presumably because a GIF can be an animation, the tool treated the file as a video/audio file as well as an image.
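The dates and the exact byte size shown in the properties box can also be read programmatically with the standard library alone. A minimal sketch; the sample file is generated on the fly, since Wrinkled_Paper.gif itself isn’t to hand. (‘Size on disk’, incidentally, is the size rounded up to whole allocation blocks, which is why 15,063 bytes becomes 15,360, i.e. exactly 15 × 1,024, on disk.)

```python
import os
import tempfile
from datetime import datetime

# Create a small stand-in file so the script is self-contained
# (the real Wrinkled_Paper.gif is not available here).
with tempfile.NamedTemporaryFile(suffix=".gif", delete=False) as f:
    f.write(b"GIF89a" + b"\x00" * 100)
    path = f.name

info = os.stat(path)
print("Size (bytes):", info.st_size)                        # exact size, like the 15,063 bytes above
print("Modified:", datetime.fromtimestamp(info.st_mtime))   # the one date both systems showed
print("Accessed:", datetime.fromtimestamp(info.st_atime))

os.remove(path)  # tidy up the stand-in file
```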

JPEG image (chimp at typewriter.jpg)

The properties shown here are interesting: apart from the automatically generated properties such as the file type, name, size, dates of creation etc., location, owner, resolution, bit depth, attributes and dimensions, there is (as this is an image file) the option to add your own metadata too. You can add information such as the author, date taken, copyright, camera make and model, lens make and model, camera serial number, light source, contrast and so on. There is even the option to rate the image, give it a title and add tags, which would be a useful finding aid. The only additional information yielded by the online extract metadata tool was that the thumbnail was binary and 16019 bytes.


JPEG image (huh.jpg)

As with the other jpeg file, there is a lot of useful metadata. The image below shows the option to remove all personal information relating to the file, or to pick specific properties to remove.


In addition to this, the properties of this file also include GPS information which (if accurate) should pinpoint where the photograph was taken.

Latitude 38; 53; 51.669499999989057

Longitude 77; 2; 11.309899999992918

Altitude 62

The JPEG image is of the Eiffel Tower shown from the Seine. As this is a recognisable landmark, I would have expected the GPS location to be Paris; however, when I put the co-ordinates into one online tool, the result shown was in China, and when I tried again with another, the result was instead Washington in the USA. Either way, the attached metadata was clearly incorrect, as shown in these maps below:
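The degrees; minutes; seconds values above convert to decimal degrees as degrees + minutes/60 + seconds/3600, and the conversion suggests why the two lookups disagreed: if the hemisphere (N/S, E/W) reference fields are dropped, 77° of longitude read as East lands in western China, while 77° West is central Washington, DC. A quick sketch:

```python
def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600

lat = dms_to_decimal(38, 53, 51.6695)
lon = dms_to_decimal(77, 2, 11.3099)

print(f"{lat:.4f}, {lon:.4f}")  # 38.8977, 77.0365
# Read as 77 degrees EAST, the point falls in western China;
# read as 77 degrees WEST (i.e. -77.0365), it is central Washington, DC,
# matching the two map results above; neither is anywhere near the Eiffel Tower.
```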


JPEG image (PICT0460)

The metadata attached here shows the file was originally created as an Adobe Photoshop CS3 Windows file. However, once the file is checked using the online extract metadata tool, this information is not included. The other attached metadata appears to be consistent.


MS Excel worksheet (97-03 version) ARMA-Speakers_list

The metadata shows that the file is indeed an Excel file, and the online metadata tool confirms that it was created with Excel software. However, the mimetype refers to it as “” which I understood to be associated with DirectX. I am a little confused by the seeming contradiction of the creation and modification dates, which are 2009 and 2011, as seen below:


MS Word document (97-03 version) Proposed_ED_Rules_and_Standards_2004.doc

This Word document gives lots of additional information, such as the word, character, line and page counts (which are verified in both versions of the metadata). The properties tell us the company which created the document, whereas the online tool gives us the individual’s name. The dates are consistent, but the time differs by one hour, which I have seen happen often. The properties also tell us when the file was last printed, so there is more metadata associated with the properties box in this instance.


Text document (20100501-0721 The SEC v. Goldman Sachs the case in a nutshell.txt)

This plain text document yields very little information from the properties other than the title, file type, size, date created and modified. The online metadata extraction tool does not generate any metadata whatsoever.

Wave sound file (bonds.wav)

The metadata is slightly contradictory again: the properties say it is 12 seconds in duration, whereas the online metadata extraction tool says it is 10 seconds. They both agree that it is an audio WAV file, but the properties give the bitrate as 20 kbps while the online tool says 21 kbps. Overall, there is very little information for this type of file.
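Disagreements like this often come down to how each tool derives the figure: one may trust the header’s declared values while another computes from the audio frames actually present. The duration and bitrate of a WAV file can be computed from first principles with Python’s standard wave module. A minimal sketch using a generated silent file, since bonds.wav itself isn’t available here (the 12-second figure is just mirrored, and the sample rate is an arbitrary choice):

```python
import io
import wave

# Build a small silent WAV in memory so the example is self-contained.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as w:
    w.setnchannels(1)        # mono
    w.setsampwidth(2)        # 16-bit samples (2 bytes)
    w.setframerate(8000)     # 8 kHz sample rate
    w.writeframes(b"\x00\x00" * 8000 * 12)   # 12 seconds of silence

buffer.seek(0)
with wave.open(buffer, "rb") as w:
    # duration = number of frames / frames per second
    duration = w.getnframes() / w.getframerate()
    # bitrate = sample rate x bytes per sample x 8 bits x channels
    bitrate_kbps = w.getframerate() * w.getsampwidth() * 8 * w.getnchannels() // 1000

print(f"Duration: {duration:.0f} s, bitrate: {bitrate_kbps} kbps")
```

If a tool derives duration from a wrong or stale frame count in the header, the arithmetic above gives a wrong answer with full confidence, which may be exactly what happened with the 12 s / 10 s split.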


In conclusion, it was difficult to find all 15 metadata elements required for Dublin Core, and even more so to find the selected elements required for PREMIS. It was interesting to see the variety of metadata included for the different file types; depending on which operating system you used, some information was available and some was not. Having the option to add or remove metadata also made me question the validity of some of the information: how do we know if the attached attributes are correct or false? There were also some inconsistencies between the two versions of the metadata which, in a couple of instances, I could not understand. Although the two ways of extracting metadata were useful, neither was entirely consistent or gave complete values, still leaving gaps in the information.

MCPDM week 6: Metadata

This week in the lab we had to look at metadata. We were given a zip file of sample files and had to look for the metadata of these files, either by clicking on ‘properties’ or by using the online extraction tool. We had to see how many of the Dublin Core metadata elements, PREMIS elements and NISO MIX elements we could find.

First, I tried to look at the metadata of the different files just by clicking on ‘properties’. I found only limited information: usually just ‘creator’, ‘date’, ‘format’ and ‘title’ for the Dublin Core elements; ‘object characteristics’, ‘object environment’ and ‘event date’ for the PREMIS elements; and ‘file size’, ‘width’ and ‘height’ for the NISO MIX elements.

Using the extract metadata tool I could find more elements, including the GPS co-ordinates for a JPEG photo. I thought this was a really useful tool as it would be a great time saver: instead of trying to look up all of the elements yourself, it would give you more time to find the information for the other missing elements.

That said, the next part of the task was to take the GPS co-ordinates and, using Google Maps, try to find the location where the JPEG photo was taken. This gave me the result of Canada first, and then, using the Google Maps satellite converter, an address in Washington, Maryland, USA, when clearly the photo was taken in neither place! So there is still a lot of time-consuming ‘digging’ to do even when you are given the metadata!

All in all, inputting metadata is a very time-consuming (and, I would presume, therefore costly) task, and I can see the great benefit of and need for automation. Even if only some of the metadata could be automated, this would help greatly!

Task 2 was fun but also very useful! By ‘tagging’ images in a game you can help libraries, museums and universities across the world apply metadata to their images. Lorraine and I played four of the games: Book tag, Zen tag, Stupid Robot and Portrait tag. I think we did quite well, but it threw up a few things that we hadn’t really thought about. We applied the Dublin Core elements to the tagging in our games, and it really made us think about language. Some of the images were of books written in Italian and Latin. With one of us able to speak Italian we could have given the English translations easily in our tags, but with the Latin images we would have had to use an online translator or bilingual dictionary (again, very time-consuming). Not only that, the language we used to describe things in English posed a few questions. For example, we asked ourselves “What’s the name for that style of cravat?” (for a portrait photo) and “Should we say ‘formal attire’ rather than ‘formal dress’?” So we kept the vocabulary wide-ranging so that the images would be more ‘discoverable’ to as many people as possible.

I liked these games and I thought they were very useful for tagging these images so that future users could find them easily; however, I’m not sure how accurate our tags were! Anyway, anyone fancy challenging me to a game of One Up?