View Issue Details

ID: 0006935
Project: MMW v4
Category: Properties/Auto-Tools
View Status: public
Last Update: 2011-01-09 23:30
Reporter: rusty
Assigned To:
Priority: urgent
Severity: feature
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 4.0
Target Version: 4.0
Fixed in Version: 4.0
Summary: 0006935: Video metadata lookup
Description: MM's auto-tagging / freedb lookup tools are currently only designed for audio. We need to add basic support for looking up video metadata (for both Auto-tag from Web and Get Info/Buy).

To discuss...
Tags: No tags attached.
Fixed in build: 1340

Relationships

related to 0006940 (closed, Ludek): Video columns are missing for Auto-tag from filename
related to 0006776 (closed, Ludek): Change 'Scenarist' to 'Screenwriter'
related to 0007052 (closed, petr): Automatically generated/retrieved metadata shouldn't be stored to a temp folder

Activities

michal

2010-12-09 07:22

developer   ~0021697

Last edited: 2010-12-09 13:07

I think the best source for video metadata is http://www.imdb.com/. They used to offer some kind of API, but it seems they no longer do. I can't tell whether the only way to access the data is to download the whole, very large database (http://www.imdb.com/interfaces) or whether there is another way (parsing the HTML is probably not very legal?).
There are some proprietary IMDb APIs, but I don't think they are very legal either (http://imdbapi.poromenos.org/, http://www.deanclatworthy.com/imdb/).

Maybe the best way is to contact them and ask; maybe they will offer some acceptable solution.

rusty

2010-12-09 15:52

administrator   ~0021701

Last edited: 2010-12-09 15:54

IMDb is great, but it does have issues,
e.g. a minimum licensing fee of 15k annually: http://www.imdb.com/licensing/
Users can screenscrape if they wish, but we shouldn't include that as the default method in MM.

Here are some other ideas, based on what MythTV is doing:
http://www.mythtv.org/wiki/MythVideo#Metadata_Lookup

Other available DBs:
http://www.tagchimp.com/ (doesn't seem to be actively developed)
http://www.allmovie.com (screenscraping only)
http://www.rottentomatoes.org (best for getting review information about movies)
http://en.wikipedia.org (contains huge DB and can be scraped without a problem--don't know if there's an API)
http://www.themoviedb.org/ (seems to be the best option, except for the fact that metadata is limited for TV).
http://www.tv.com (only screenscraping would work)
http://thetvdb.com (seems to be the best option for TV)

My suggestion is that we use the Auto-tag from Web functionality for both TV and Video collections, and that:
- MM by default use themoviedb.org for Type=Video
- MM by default use thetvdb.com for Type=TV
- MM allow for the use of Amazon.com for each of Type=Video and Type=TV, assuming the API service allows for it
- MM give users the ability to change the default lookup provider on a per Type basis
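
A minimal sketch of the per-Type defaults proposed above, in Python. The names `DEFAULT_PROVIDERS`, `ALLOWED_PROVIDERS`, and `provider_for` are hypothetical and not part of MM; Amazon is listed only on the assumption that its API service permits this use.

```python
# Hypothetical names throughout; this illustrates the proposal, not MM's code.
DEFAULT_PROVIDERS = {"Video": "themoviedb.org", "TV": "thetvdb.com"}

# Amazon is offered for both types, assuming the API service allows it.
ALLOWED_PROVIDERS = {
    "Video": {"themoviedb.org", "amazon.com"},
    "TV": {"thetvdb.com", "amazon.com"},
}


def provider_for(track_type, user_overrides=None):
    """Return the lookup provider for a Type, honoring a per-Type override."""
    overrides = user_overrides or {}
    choice = overrides.get(track_type, DEFAULT_PROVIDERS[track_type])
    if choice not in ALLOWED_PROVIDERS[track_type]:
        raise ValueError(f"{choice!r} is not available for Type={track_type}")
    return choice
```

With no overrides, `provider_for("Video")` returns the default; passing `{"TV": "amazon.com"}` switches only the TV provider.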

Note: the Get Info/Buy function should also be updated to search in the appropriate Amazon DB.

rusty

2010-12-09 16:56

administrator   ~0021704

fyi, the structured data from wikipedia infoboxes can also be queried as described here: http://wiki.dbpedia.org/Datasets#h18-11

jiri

2010-12-16 15:55

administrator   ~0021869

As discussed offline, the best implementation for now will be to use Amazon. A few things need to be modified in order to support movie tagging in our Auto-tag window:

1. If the source file(s) are Type=Video, the type of Amazon query should be changed from 'Music' to 'DVD'. Note that this also applies to the quick links directly next to the track in the main tracklist.
2. In this case, the columns in the Auto-tag from Web dialog should be modified: audio-only columns removed and video-only columns added. See below for the columns that need to be added.
3. The displayed HTML needs to be slightly modified in this case, e.g. Tracks will be removed.
4. The fields to be mapped from Amazon XML results to MM DB fields are:

<Actor>Elijah Wood</Actor>
<Actor>Ian McKellen</Actor>
  -> Actor(s): Elijah Wood; Ian McKellen

<AudienceRating>PG-13 (Parental Guidance Suggested)</AudienceRating>
  -> Parental rating: PG-13

<Creator Role="Writer">Peter Jackson</Creator>
<Creator Role="Writer">J.R.R. Tolkien</Creator>
  -> Involved people: Writer: Peter Jackson; Writer: J.R.R. Tolkien

<Creator Role="Producer">Barrie M. Osborne</Creator>
  -> Producer: Barrie M. Osborne

<Director>Peter Jackson</Director>
  -> Director: Peter Jackson

<Publisher>New Line Home Video</Publisher>
  -> Publisher: New Line Home Video

<ReleaseDate>2010-09-14</ReleaseDate>
  -> Date: 2010-09-14 (full date, not just year)

<Title>The Lord of the Rings: The Fellowship of the Ring [Blu-ray]</Title>
  -> Title: The Lord of the Rings: The Fellowship of the Ring

<EditorialReview>
  <Source>Product Description</Source>
  <Content>Assisted by a Fellowship of heroes, Frodo Baggins plunges into a perilous trek to take the mystical One Ring to Mount Doom so that it and its magical powers can be destroyed and never possessed by evil Lord Sauron. The astonishing journey begins in the first film of director/co-writer Peter Jackson's epic trilogy that redefined fantasy filmmaking. This imaginative foray into J.R.R. Tolkien's Middle-earth won 4 Academy Awards® and earned 13 total nominations including Best Picture.</Content>
</EditorialReview>
  -> Comment: {text above}
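
The mapping above could be sketched roughly as follows in Python. This is only an illustration: the `<Item>` wrapper element, the function name `map_amazon_item`, and the dict keys are assumptions, a real Amazon response is namespaced and nested more deeply, and MM's actual implementation is not shown in this issue. Sending the Producer role to its own field and all other roles to 'Involved people' is inferred from the examples above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified wrapper around the elements quoted above.
SAMPLE = """<Item>
  <Actor>Elijah Wood</Actor>
  <Actor>Ian McKellen</Actor>
  <AudienceRating>PG-13 (Parental Guidance Suggested)</AudienceRating>
  <Creator Role="Writer">Peter Jackson</Creator>
  <Creator Role="Writer">J.R.R. Tolkien</Creator>
  <Creator Role="Producer">Barrie M. Osborne</Creator>
  <Director>Peter Jackson</Director>
  <Publisher>New Line Home Video</Publisher>
  <ReleaseDate>2010-09-14</ReleaseDate>
  <Title>The Lord of the Rings: The Fellowship of the Ring [Blu-ray]</Title>
</Item>"""


def map_amazon_item(xml_text):
    """Map one item's XML to a dict of MM-style field values (sketch)."""
    root = ET.fromstring(xml_text)

    def texts(tag):
        return [e.text.strip() for e in root.iter(tag) if e.text]

    def first(tag):
        values = texts(tag)
        return values[0] if values else ""

    creators = [(e.get("Role", ""), e.text.strip())
                for e in root.iter("Creator") if e.text]
    comment = root.findtext(".//EditorialReview/Content", default="")
    return {
        "Actor(s)": "; ".join(texts("Actor")),
        # Keep only the rating code, dropping the parenthesised explanation.
        "Parental rating": first("AudienceRating").split(" (")[0],
        "Involved people": "; ".join(
            f"{role}: {name}" for role, name in creators if role != "Producer"),
        "Producer": "; ".join(
            name for role, name in creators if role == "Producer"),
        "Director": "; ".join(texts("Director")),
        "Publisher": first("Publisher"),
        "Date": first("ReleaseDate"),  # full date, not just the year
        # Drop a trailing bracketed format marker such as "[Blu-ray]".
        "Title": re.sub(r"\s*\[[^\]]*\]\s*$", "", first("Title")),
        "Comment": comment.strip(),
    }
```

Calling `map_amazon_item(SAMPLE)` yields the values listed above, for example `PG-13` for the parental rating and the title without `[Blu-ray]`.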

Ludek

2010-12-16 23:10

developer   ~0021874

Last edited: 2010-12-16 23:17

Implemented in build 1336

jiri

2010-12-17 10:10

administrator   ~0021893

Generally works great, just:
5. We shouldn't trim the Title for Video (as we do for Audio); it results in 'The curious case of Benjamin Button' being trimmed to 'The curious case', which doesn't return the correct result.
5b. Actually, I realized that the search isn't initiated by the Title field but by Album (Series); this should be corrected.
6. The sorting of results should be modified, since the Amazon engine returns even unrelated things (for Benjamin Button I got 'Slumdog millionaire' and 'Revolutionary Road' before the correct result). I'd suggest sorting by:
 a. StringSimilarity() - so that the best title matches are sorted first
 b. Date then?
7. Some strings could be removed from Title, namely:
 (single-disc..., (two-disc..., [Blu-ray], (ws), (widescreen..., [VHS] {and anything after these strings}
8. Some of the tagged fields don't have their columns (Actors, for example).
9. We need to review the storing of album art: 'to tag' doesn't work in all formats, and 'to album folder' doesn't make much sense for video. Maybe a third option, 'to global folder', could be useful?
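
Points 6 and 7 could be sketched as below, in Python. This is a rough illustration under stated assumptions: `difflib.SequenceMatcher` stands in for MM's internal StringSimilarity(), whose behavior is not described here; the names `clean_title`, `similarity`, and `rank_results` are hypothetical; and using the release date as an ascending tie-breaker is only a guess at what "Date then?" might mean.

```python
import re
from difflib import SequenceMatcher

# Matches the first marker from point 7 and everything after it.
_MARKERS = re.compile(
    r"\s*(?:\((?:single-disc|two-disc|widescreen|ws\))|\[(?:blu-ray|vhs)\]).*$",
    re.IGNORECASE,
)


def clean_title(title):
    """Remove edition/format markers and anything that follows them."""
    return _MARKERS.sub("", title).strip()


def similarity(a, b):
    """Stand-in for StringSimilarity(): ratio in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_results(query, results):
    """Sort (title, iso_date) pairs: best title match first, then by date.

    Titles are cleaned before comparison so that a suffix such as
    '(widescreen)' does not push a good match down the list.
    """
    return sorted(
        results,
        key=lambda r: (-similarity(query, clean_title(r[0])), r[1]),
    )
```

Cleaning the titles before scoring them means the marker removal applies to the ordering as well as to the stored Title.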

Ludek

2010-12-17 10:18

developer   ~0021894

10. Involved people are in the form "Name1; Name2; ...", but should be "Role1: Name1; Role2: Name2; ..."

Re 8: Also 'Involved People' and 'Parental Rating' should be added

michal

2010-12-17 10:31

developer   ~0021895

Note that we have a "Lyricist/Written by" field for writer(s) (IWRI tag in AVI, WM/Writer in WMV); shouldn't it be used for Writer(s)?

Ludek

2010-12-20 15:12

developer   ~0021959

Last edited: 2010-12-20 21:12

5 - not reproduced

5b, 6, 7, 8, 10 - fixed in build 1338

9 - as I remember, we deprecated 'to global folder' in the past for several reasons, so it is a general problem and is (or should be) covered by other issue(s)

Re: Michal's note:
I added a 'Writer(s):' field, and writers are written to the 'Screenwriter' field, but the problem is that 'J.R.R. Tolkien' is not a Screenwriter but a Writer, so we should probably rename 'Screenwriter' to 'Writer' for video? - covered by 0006776

jiri

2010-12-21 17:40

administrator   ~0022016

7. The string removal should also be applied when ordering results. Currently it isn't, which causes some good results to be sorted lower just because they have something like '(widescreen)' appended.

10. The only remaining issue is that the quick links in the tracklist should be corrected: for video files, clicking on Title or Series should search for the given string in the DVD collection.

Ludek

2010-12-22 15:09

developer   ~0022043

Fixed in build 1340,
and also fixed for the links on the 'Properties' -> 'Basic' page

peke

2011-01-09 23:30

developer   ~0022252

Verified 1343