iPhoto, Flickr and EXIF munging using Perl

EXIF/IPTC/XMP tagging of GPS coordinates, folksonomy tags, and other goodness is a nice idea, but unfortunately, iPhoto and Flickr don't play too well together. Worse, whatever decent support those products now have doesn't extend to photos already imported into them. So, here are some notes resulting from some experimentation, along with the Perl code I wrote along the way.

(Note: for the rest of this post, I'll refer to all metadata stored in the image file itself such as EXIF, IPTC, XMP, etc. as "EXIF" data. This isn't strictly correct. However, using the term "metadata" could be confused with such data held in other files, such as iPhoto's own database.)

For Christmas, my dad bought my mum a decent photo printer: the Canon PIXMA iP5200R. It's a very nice piece of kit, with good printouts and, most importantly, easy setup, with no CUPS or Windows-versus-Mac issues. Anyway, as an upshot of this, my dad wants some of my photos to build a Picasa library for her to print stuff from. Specifically, he wants the photos from the Caribbean holiday we had earlier in the year.

I've been playing with Picasa a bit, and I must say I love it. It's what iPhoto should be. For a start, it's a good deal faster, and that's comparing iPhoto on my 1GHz iBook G4 against Picasa on my mum's Compaq Deskpro (Pentium III, 733MHz), which is a few years older. Secondly, it doesn't use a horrible non-scalable monolithic database like iPhoto does. Thirdly, it seems to deal with metadata a bit more sensibly.

On the downside, neither iPhoto nor Picasa handles collaboration very well. Both have publishing functions, and iPhoto has a read-only "sharing" function, but I mean the true sharing of photo libraries. My mum and dad have separate computers, but they share a camera, so it would be incredibly useful to have a shared photo library which either could use. On this count, Picasa wins slightly: although it has no true sharing ability, there doesn't seem to be anything wrong with two copies of Picasa working on the same directory on a shared drive. I'm not sure how it would handle conflicts though, due to the lack of locking. For now, I'll advise my parents not to run both copies at the same time.

I usually manage all of my photos in iPhoto. However, due to the lack of decent keywording and other metadata in iPhoto, I ended up adding a lot of metadata in Flickr instead. As a result, my iPhoto library is woefully out of date compared with my Flickr library, even though it's far more complete, as I only upload a subset of my photos to Flickr. I specifically want to extract Flickr tags back into the files themselves, and ideally rip the geotagging information I added within Flickr.

While iPhoto is drastically improved by Keyword Assistant, it stores keywords (read: tags) in its own database rather than in EXIF form. During this investigation I discovered that iPhoto does import keywords from EXIF, and geotagging info too. The fact that it doesn't otherwise manage EXIF is a fairly annoying flaw, and one that I hope they'll fix. I've heard that Windows Vista now does sensible things with EXIF. iPhoto is, quite frankly, falling far behind other solutions at this point, and I hope next month will bring a new iLife with an improved iPhoto.

So, I had a good look around the 'net for some utilities to try to clear up this mess. I've had a play with things like Cal Henderson's Flickr::API and Dmytro Kovalov's Mac::iPhoto before, but they're not particularly mature. Incidentally, Mac::iPhoto also seems to be broken at the moment.

Others have attempted this using AppleScript. The scripts I found around the 'net are far too slow to cope with my collection of ~8000 photos. This is probably due to the sucky design and/or implementation of Apple Events in Mac OS.

Instead, I've had a play with PerlObjCBridge, or whatever Apple's calling it. In Perl, it's "use Foundation;". This gives access to the Cocoa API within OS X, which I figure is probably the fastest way to manipulate Property Lists. Since iPhoto stores at least a backup of the database in XML form (specifically Property List form) as "AlbumData.xml", we can use this to cross-reference the data.
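As a rough illustration of the property-list idea, here is a minimal sketch in Python rather than the Perl/Cocoa bridge described above. It assumes the per-photo records live under a top-level "Master Image List" key with "Caption" and "ImagePath" fields; check those names against your own AlbumData.xml. The sample data and the helper name load_master_images are made up for the example.

```python
import plistlib


def load_master_images(plist_bytes):
    """Return {photo_id: record} from AlbumData.xml-style plist bytes."""
    album_data = plistlib.loads(plist_bytes)
    # Assumption: per-photo records sit under the "Master Image List" key.
    return album_data.get("Master Image List", {})


# Tiny stand-in for a real AlbumData.xml, serialised as an XML plist.
sample = plistlib.dumps({
    "Master Image List": {
        "101": {"Caption": "Beach", "ImagePath": "/Photos/IMG_0101.JPG"},
        "102": {"Caption": "Harbour", "ImagePath": "/Photos/IMG_0102.JPG"},
    }
})

images = load_master_images(sample)
for photo_id, record in sorted(images.items()):
    print(photo_id, record["ImagePath"])
```

For a real library you would read the file's bytes from disk and pass them in, ideally working on a copy.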

The most critical link between my Flickr and iPhoto collection is missing. This would be a link between each individual photo and its alter ego in the two different collections. Since filenames aren't necessarily complete or preserved by either iPhoto or Flickr, this is relatively difficult to establish.

I've found that the image creation timestamp makes a reasonably good link. By matching the EXIF timestamp within the file stored by iPhoto against the EXIF data extracted from Flickr, you can get a reasonably good correlation. At this point, it would be a nice idea to shove the Flickr URL for the image into the local EXIF data and get iPhoto to read it. From then on, there would be an established link.
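The timestamp matching can be sketched roughly as follows, again in Python rather than the Perl of the actual script. It assumes local timestamps are in the EXIF "YYYY:MM:DD HH:MM:SS" form and Flickr's taken dates are in "YYYY-MM-DD HH:MM:SS" form; the function names and sample data are hypothetical. Timestamps shared by several photos (continuous-drive bursts, say) are reported as ambiguous rather than guessed at.

```python
from collections import defaultdict
from datetime import datetime

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"    # e.g. EXIF DateTimeOriginal
FLICKR_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format of Flickr's taken date


def match_by_timestamp(local_photos, flickr_photos):
    """Pair local files with Flickr photos that share a capture time.

    local_photos:  iterable of (path, exif_timestamp_string)
    flickr_photos: iterable of (flickr_id, flickr_timestamp_string)
    Returns (matches, ambiguous), where matches maps path -> flickr_id and
    ambiguous lists timestamps found on both sides but not one-to-one.
    """
    local_by_time = defaultdict(list)
    for path, taken in local_photos:
        local_by_time[datetime.strptime(taken, EXIF_FORMAT)].append(path)

    flickr_by_time = defaultdict(list)
    for flickr_id, taken in flickr_photos:
        flickr_by_time[datetime.strptime(taken, FLICKR_FORMAT)].append(flickr_id)

    matches, ambiguous = {}, []
    for taken, flickr_ids in flickr_by_time.items():
        paths = local_by_time.get(taken, [])
        if len(paths) == 1 and len(flickr_ids) == 1:
            matches[paths[0]] = flickr_ids[0]
        elif paths:
            ambiguous.append(taken)  # needs a human, or a second criterion
    return matches, sorted(ambiguous)


local = [
    ("a.jpg", "2006:03:14 10:22:05"),
    ("b.jpg", "2006:03:14 10:30:00"),
    ("c.jpg", "2006:03:14 10:30:00"),  # burst shot, same second as b.jpg
]
flickr = [
    ("f1", "2006-03-14 10:22:05"),
    ("f2", "2006-03-14 10:30:00"),
    ("f3", "2006-03-15 09:00:00"),     # no local counterpart
]
matches, ambiguous = match_by_timestamp(local, flickr)
```

Exact equality is the simplest rule; if one side's clock or timezone handling differs, you would need to allow an offset instead.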

Well, I got about as far as adding the URL, tags and geocoding to the image, but I gave up when it came to getting iPhoto to read it. There is a relatively obscure "rebuild" function in iPhoto, triggered by holding down Cmd and Opt while starting iPhoto. This allows rebuilding of thumbnails and the binary databases (presumably) from the XML. Unfortunately, it doesn't seem to reread the images for new metadata. I think this would require actual modification of the AlbumData.xml file to work, which sounds like a far bigger job.
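For the tagging step itself, one option is the exiftool command-line utility. That is an assumption for illustration: the notes above don't say which tool the original script used. The sketch below only builds the argument list and runs nothing. Storing the Flickr URL in XMP-dc:Source is likewise just one plausible choice, and the URL shown is a placeholder.

```python
def exiftool_args(path, tags, latitude=None, longitude=None, source_url=None):
    """Build an exiftool command line that adds tags, GPS and a source URL."""
    args = ["exiftool"]
    for tag in tags:
        # "+=" appends to the keyword list instead of replacing it.
        args.append(f"-IPTC:Keywords+={tag}")
    if latitude is not None and longitude is not None:
        # GPS values are stored unsigned, with the hemisphere in a Ref tag.
        args += [
            f"-GPSLatitude={abs(latitude)}",
            f"-GPSLatitudeRef={'N' if latitude >= 0 else 'S'}",
            f"-GPSLongitude={abs(longitude)}",
            f"-GPSLongitudeRef={'E' if longitude >= 0 else 'W'}",
        ]
    if source_url:
        # Assumption: XMP-dc:Source is used to hold the Flickr photo URL.
        args.append(f"-XMP-dc:Source={source_url}")
    args.append(path)
    return args


args = exiftool_args(
    "IMG_0101.JPG",
    ["beach", "caribbean"],
    latitude=13.1,
    longitude=-59.6,
    source_url="https://example.com/photos/1",  # placeholder URL
)
print(" ".join(args))
```

If you do run something like this, try it on copies of the photos first.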

In the meantime, this proves that such a script is feasible, albeit painful. At the moment, this is a run-once script, so it's not much use for regular usage.

SOURCE CODE: I am supplying this code freely with the understanding that there is no warranty or guarantee. Also, I'll go as far as to say that incorrect usage could quite easily delete your photos, destroy your Flickr account, upset Flickr, and kill your pets. DO NOT USE THIS CODE UNLESS YOU UNDERSTAND IT COMPLETELY AND ARE WILLING TO TAKE RESPONSIBILITY FOR ANYTHING GOING WRONG. For more details, read the README file.

The most recent version can be obtained from my Subversion repository: iphoto_flickr_exif_munge/.

I also apologise for the quality of some of the code. It's far more experimental than I'd usually want to release, but I think it might be of use to someone like me in a similar situation. Rather than just throwing it into the depths of my hard drive and forgetting it, I figure it's probably worth blogging.

My conclusion to all of this is that I really need to get my iBook replaced with something good enough to run Aperture or Lightroom, and then reimport the whole bloody lot. Painful, to say the least. I'm also hoping that Google get around to writing Picasa for the Mac. I think I'd prefer it in the long term.

Anyway, please let me know if you have any other ideas or leads on how to do this kind of stuff nicely...

Comments

  1. Very cool. I'd love to see someone take this and run with it. I think there's a big market for a photo tool that works on the Mac or any platform that can clean up mistaken EXIF data, bad filenames, and correlate metadata between services like Flickr and Picasa Web.

  2. [...] you crawl the Internet for a solution to a problem, like getting iPhoto to understand exif data [...]

  3. this is an absurd problem. same thing happens using a program like adobe elements. it's got nifty features to allow you to tag and filter your photos but it's in a proprietary database as well. i wasted hundreds of hours before i realized this was even an issue! after that i played with aperture, but it ran like a pig on my rig which at the time was fairly state of the art. i hear they fixed it in version 1.1 but they had lost me by then. i now use adobe bridge to manage my meta-data. i can edit all iptc core including setting up templates for personal information i want to include in each. i can view all camera generated exif and raw info and set up smart collections. it's the most robust implementation i have found yet that doesn't leave you in a meta-data black hole.

  4. re: "Secondly, it doesn’t use a horrible non-scalable monolithic database like iPhoto does."

    Get iPhoto Library Manager! Free download by Brian Webster. I don't know why Apple doesn't just buy him out and incorporate this feature? It makes a world of difference.

    As far as iPhoto EXIF info goes- Apple should have addressed this issue instead of creating Aperture. iPhoto PRO would have been a much simpler, faster product but they dropped the ball.

  5. @artist:

    Yeah, I've got an iPhoto Library Manager license and I've also donated a little to Rick Neil for the freeware iPhoto Buddy, which does a similar thing. They're both good products, and have been very useful when merging libraries.

    However, I'm one of those slightly obsessive types who wants all my photos tagged, searchable and accessible from a single window :-)

    Rather than breaking up the library into sub-libraries, I've gone through and deleted some of the more egregiously useless shots. I've got a bad habit of taking a number of shots on continuous drive, picking the best, and then completely failing to ditch the old shots.

    Thing is, this stuff is unnecessary. Apple have a bad habit of failing to optimise these apps for scalability, and of targeting the higher end of their hardware range rather than catering for us poor bozos still using G4 iBooks. Admittedly, that's not surprising for a hardware company like Apple. iTunes is pretty crappy at handling large libraries, and as I understand it, starts slowing down non-linearly when you get to the tens and hundreds of thousands of tracks. I'm only on about 8000 tracks and it's noticeably slow in starting up.

    I think the great hope is that the Core Data API -- which iTunes and iPhoto predate -- will give them scalability for these kinds of app. It can use SQLite databases, along with XML and binary formats. Tiger's Mail.app already uses SQLite for the main mail index, although I'm not sure whether it's using Core Data or using SQLite directly. They haven't had many qualms about totally replacing the data/filing mechanisms for iPhoto and iTunes in the past, so I'm hoping they'll reengineer around Core Data as well. If they'd then add a MySQL or PostgreSQL backend for Core Data, you could then run iPhoto and iTunes on a database cluster... mmm...

    With the EXIF munging, it's fairly annoying, as I know it wouldn't be hard to implement. It might even be practical to do as a plugin. Unfortunately, iPhoto is completely undocumented as far as I can tell and my attention span just doesn't go far enough to figure it out!

  6. I want to add my Vista-tagged photos to iPhoto and have all the metadata saved there, but currently iPhoto 08 does not read them. I'm hoping this could change when Leopard is released, but there is no official word on that.

    do you know how to get iphoto to read vista tagged files?

    thanks (please email me if so)
