You know how it is: you want something that no program or service provides, so you end up making it yourself.
In my case, what I wanted was a metadata source for my books in calibre.
I use calibre not only for tracking ebooks (which is what it is made for), but also for normal, printed books.
I have no problem with the English and French books in my library, since the metadata sources provided by calibre are more than enough.
But I also own a large number of Greek books, and there the results I get through calibre are quite poor: metadata (provided by Google) is available for very few Greek books. And what is available is transliterated into Latin characters, which is odd to read.
Now, the most comprehensive source for Greek books, especially those published in the last 20 years, is the Biblionet database. But it has no web API, and its web pages carry little semantic markup, if any. This means that, in order to get book metadata, one has to scrape those pages.
The simplest (and most reliable) query to perform in Biblionet is a search by ISBN, as this returns only one record (or none, if the ISBN is not found).
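Since everything hinges on the ISBN, it can be worth checking it before sending any query. Here is a minimal sketch in Python of the standard ISBN-10 and ISBN-13 check-digit rules. The function names are my own for illustration and are not part of bookmeta.

```python
DIGITS = "0123456789"


def normalize_isbn(raw: str) -> str:
    """Drop hyphens and spaces and upper-case a trailing 'x'."""
    return "".join(ch for ch in raw if ch not in "- ").upper()


def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10: weights 10..1, the weighted sum must be divisible by 11."""
    if len(isbn) != 10:
        return False
    if any(ch not in DIGITS for ch in isbn[:9]) or isbn[9] not in DIGITS + "X":
        return False
    values = [int(ch) for ch in isbn[:9]]
    values.append(10 if isbn[9] == "X" else int(isbn[9]))
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """ISBN-13: alternating weights 1 and 3, the sum must be divisible by 10."""
    if len(isbn) != 13 or any(ch not in DIGITS for ch in isbn):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(isbn))
    return total % 10 == 0


def is_valid_isbn(raw: str) -> bool:
    """Normalize the input, then apply the rule that matches its length."""
    isbn = normalize_isbn(raw)
    return is_valid_isbn10(isbn) or is_valid_isbn13(isbn)
```

A check like this only catches typos. An ISBN with a correct check digit may still be missing from Biblionet, in which case the search simply returns no record.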
So what I did was make a simple web app (?) that searches for a book by ISBN in Biblionet, parses the result page and returns JSON or HTML output. I call it bookmeta.
Here is a sample output:
{"biblionetid" : "44201", "cover_url" : "http://biblionet.gr/images/logo_gr.jpg", "title" : "Από το Βυζάντιο στην Αναγέννηση", "authors" : "N. G. Wilson", "translators" : "Φωτεινή Πρεβεδούρου - Γεωργίνη", "publisher" : "Εκδοτικός Οίκος Α. Α. Λιβάνη", "yr_published" : "1994.", "original_language" : " αγγλικά", "original_title" : " From Byzantium to Italy", "categories" : "Ευρώπη - Ιστορία - Αναγέννηση [DDC: 940.21]"}
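The sample record has a few rough edges: the year carries a trailing dot, some values start with a space, and the DDC number is embedded in the categories string. If you consume the JSON from another program, a small clean-up pass might look like the sketch below. The `clean_record` function and the extra `ddc` key are assumptions of mine, not something bookmeta provides, and I am assuming the other records follow the same shape as this one.

```python
import json
import re

# The sample output shown above, copied verbatim.
SAMPLE = '''{"biblionetid" : "44201", "cover_url" : "http://biblionet.gr/images/logo_gr.jpg", "title" : "Από το Βυζάντιο στην Αναγέννηση", "authors" : "N. G. Wilson", "translators" : "Φωτεινή Πρεβεδούρου - Γεωργίνη", "publisher" : "Εκδοτικός Οίκος Α. Α. Λιβάνη", "yr_published" : "1994.", "original_language" : " αγγλικά", "original_title" : " From Byzantium to Italy", "categories" : "Ευρώπη - Ιστορία - Αναγέννηση [DDC: 940.21]"}'''


def clean_record(record: dict) -> dict:
    """Return a tidied copy of a bookmeta record (hypothetical helper)."""
    # Strip stray whitespace from every string value.
    cleaned = {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }

    # "1994." -> 1994; None if no four-digit year is present.
    year = re.search(r"\d{4}", str(cleaned.get("yr_published", "")))
    cleaned["yr_published"] = int(year.group()) if year else None

    # Pull the Dewey number out of "... [DDC: 940.21]" into its own key.
    ddc = re.search(r"\[DDC:\s*([^\]]+)\]", str(cleaned.get("categories", "")))
    cleaned["ddc"] = ddc.group(1).strip() if ddc else None

    return cleaned


book = clean_record(json.loads(SAMPLE))
print(book["title"], book["yr_published"], book["ddc"])
```

The original keys are kept as they are, so anything that already reads the raw output keeps working.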
You can find the code on GitHub.
This script isn’t meant for end users. It is for people who might want to use it in another product or service. Help yourselves, folks!
Next steps?
I am now developing a calibre plugin that does exactly the same thing, and since my Python knowledge is close to zero, I am going to use this hack to get book metadata into the plugin with almost no further parsing in Python (for v.1 at least).
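To give an idea of how little Python this needs, here is a rough sketch of calling a bookmeta instance and getting a dictionary back, using only the standard library. The address in `BOOKMETA_URL` and the `isbn` query parameter name are placeholders I made up for illustration, so adjust them to wherever and however your copy of the script is served. Treating an empty response as "not found" is also an assumption.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: placeholder address of a bookmeta installation, not a real endpoint.
BOOKMETA_URL = "http://localhost/bookmeta/"


def fetch_bookmeta(isbn, base_url=BOOKMETA_URL, opener=urllib.request.urlopen, timeout=30):
    """Ask a bookmeta instance for one book and return it as a dict, or None.

    `opener` can be swapped for a stand-in, which makes the function easy
    to try out without touching the network.
    """
    # ASSUMPTION: the query parameter is called "isbn".
    query = urllib.parse.urlencode({"isbn": isbn})
    with opener(f"{base_url}?{query}", timeout=timeout) as response:
        payload = response.read().decode("utf-8")

    # ASSUMPTION: an empty body or an empty JSON object means "no such book".
    if not payload.strip():
        return None
    return json.loads(payload) or None
```

Inside the plugin, the returned dictionary would then only need to be mapped onto calibre's own metadata fields.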