Thursday, 13 March 2014

My Cinema Knowledge: "my movies 2" aka kiss open refine

In this episode of "My Cinema Knowledge" I will try to link the video files I have on my PC with Freebase. I tried to keep the process as simple as possible, and this time the only requirement is a normal laptop with OpenRefine and its extensions.
I will use this data in the next post!

Step By Step How-to

  • Build a CSV with all the film names. I used the folder names on my disk, extracted with a Linux command ( find . -type d > myMovies.csv )
  • Import it into OpenRefine (I used version 2.6)
  • Extract the movie name from the folder name, taking only the last part of the location. For me it was something like this in GREL:
    Here is a guide:
  • Reconciliation: there is now a cloud-based reconciliation service for Freebase, which now also works with the Italian language. It should be included in OpenRefine, but it does not work because of this bug. You can make it work by creating a new standard service using this address:  

  • Run reconciliation, selecting the "film" type
  • Select the rows with a "high" match using the facet on best match score
  • Match all cells to their highest-scoring candidate (from Reconcile -> Actions)
  • Manually find a match for the other items
  • Add a new column based on the reconciled one with expression:
  • In order to have the Freebase RDF id, add a new column based on the last one with expression: "" +  "m." +
  • Download and compile rdf-extension:
  • Edit the RDF skeleton like this: (preview does not work because of

In order to make a URI out of the row index I just used a custom value: "http://www.mycinemaknowledge/video/" + value
  • Export as RDF
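
If you want to check the string handling outside OpenRefine, here is a minimal Python sketch of three of the steps above: taking the last part of a folder path, turning a Freebase id into its dotted RDF form, and building a URI from the row index. It is not the GREL used in this post. The function names are hypothetical, the sample id is made up, and it assumes the reconciled Freebase id looks like /m/... and that folder paths come from the find command above.

```python
from pathlib import PurePosixPath

# Base address taken from the custom value used in the RDF skeleton step.
VIDEO_BASE = "http://www.mycinemaknowledge/video/"


def movie_name(folder_path: str) -> str:
    """Return the last component of a path as printed by `find . -type d`."""
    return PurePosixPath(folder_path.strip()).name


def freebase_rdf_id(freebase_id: str) -> str:
    """Turn an id such as '/m/0xyz123' into the dotted form 'm.0xyz123'.

    Assumption: the reconciled id uses the slash-separated '/m/...' form.
    """
    return freebase_id.strip("/").replace("/", ".")


def video_uri(row_index: int) -> str:
    """Build a URI for a row by appending its index to the base address."""
    return VIDEO_BASE + str(row_index)


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the real data set.
    print(movie_name("./Movies/Blade Runner (1982)"))
    print(freebase_rdf_id("/m/0xyz123"))
    print(video_uri(7))
```

Doing the same thing directly in OpenRefine keeps everything in one tool, so treat this only as a way to test the logic on a few sample rows.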