View Issue Details

ID: 0013047
Project: MMW v4
Category: Framework: Tagging
View Status: public
Last Update: 2016-06-09 11:14
Reporter: rusty
Assigned To:
Priority: immediate
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 4.1.1
Target Version: 4.1.11
Fixed in Version: 4.1.11
Summary: 0013047: Incorrect lyrics / artwork downloaded for many tracks
Description: There are several complaints in the Google Play store that 1.1.3 is either:
- not finding lyrics for 1/4 of tracks
- finding incorrect artwork for 1/4 of tracks

Peke, can you please investigate this? Also, can you confirm whether the issue is common to MMW?
Tags: No tags attached.
Attached Files
Fixed in build: 1782

Relationships

related to 0011991 | closed | michal | MMW v4 | Auto Artwork lookup stopped working
parent of 0013186 | new | Ludek | MMW v4 | Classical Music: Searches use incorrect data which end in empty or invalid results
parent of 0013344 | closed | Ludek | MMW v4 | Album Artwork lookup fails terribly when Album Artist is missing
related to 0013097 | closed | martin | MMA | Metadata lookup is often inaccurate
Not all the children of this issue are yet resolved or closed.

Activities

peke

2016-01-14 01:44

developer   ~0043859

Last edited: 2016-01-14 01:53

1. Looks like it is common to MMW as well.

I uploaded one track that shows the incorrect artwork fetch in both MMW and MMA: "East 17 - It's Alright.mp3"

When album art is searched from track properties, it always finds incorrect artwork due to the bad Album tag.

But if you run Auto Tag from Web (US Amazon), there is a CD single with the track name and the correct artwork.

2. I would adjust MMA/MMW to do an additional search and increase matches by searching for album art with the criteria "<Artist> - <Title>", not just dropping to <Artist>.

3. In the case of "Coolio - C U when you get there.mp3", the Album data is partially incorrect: instead of "Now37 Disc 1", MMW/MMA should try to search for "Now37" or "Now 37", which would find the correct album art.

4. When the result is inconclusive, I would add a drop-down like in Auto Tag from Web to let the user choose the correct album art from the search results.

4a. Repeating the artwork search always returns wrong artwork in sub-searches for both files.
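
Point 3 could be approximated by stripping a trailing disc marker from the album name and trying spacing variants. The following is only a minimal sketch: the function name and the regular expression are hypothetical guesses at common "Disc N" / "CD N" suffixes, not MMW/MMA code.

```python
import re

# Hypothetical pattern: trailing "Disc 1", "CD2", "(Disc 1)" and similar markers.
_DISC_SUFFIX = re.compile(r"[\s\-\(\[]*\b(?:disc|cd)\s*\d+[\)\]]*\s*$", re.IGNORECASE)


def album_search_variants(album):
    """Return candidate album strings to search, most specific first."""
    variants = [album.strip()]
    # Drop the disc marker: "Now37 Disc 1" -> "Now37".
    base = _DISC_SUFFIX.sub("", album).strip()
    if base and base not in variants:
        variants.append(base)
    # Also try separating letters from digits: "Now37" -> "Now 37".
    spaced = re.sub(r"(?<=[A-Za-z])(?=\d)", " ", base)
    if spaced and spaced not in variants:
        variants.append(spaced)
    return variants


print(album_search_variants("Now37 Disc 1"))
# -> ['Now37 Disc 1', 'Now37', 'Now 37']
```

Each variant would then be tried in turn as the <Album> part of the query.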

rusty

2016-01-14 02:54

administrator   ~0043860

Re. point 4.: since the process is completely automatic, I would suggest that if the result is inconclusive, then it's preferable to _not_ save the lyrics / artwork.

Ludek

2016-01-14 11:38

developer   ~0043862

Last edited: 2016-01-14 11:40

The problem is that we cannot know whether the result is inconclusive.
In the case of the "East 17 - It's Alright.mp3" track we are using this query to look up the artwork via Google:
https://www.google.com/search?safe=active&tbs=iar:s,ift:jpg&q=%22East%2017%22+%22Incoming%22&tbm=isch

The problem is that the track has incorrect album tag field "Incoming" (which actually doesn't seem to be an album from East 17 -- searching for their discography).
The album should be "Walthamstow", so the result is incorrect because of the incorrect album tag!
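
For reference, a query of the shape quoted above can be assembled as follows. This is a sketch that only reproduces the quoted URL; the function name is hypothetical, and the `safe`, `tbs` and `tbm` values are copied verbatim from the query rather than taken from any documented API.

```python
from urllib.parse import quote

GOOGLE_SEARCH = "https://www.google.com/search"


def build_artwork_query(*terms):
    """Build an image-search URL with each term quoted as an exact phrase."""
    # Wrap each term in double quotes, percent-encode it, and join with '+'.
    q = "+".join(quote('"%s"' % term) for term in terms)
    # Fixed parameters copied from the query quoted in this comment.
    return f"{GOOGLE_SEARCH}?safe=active&tbs=iar:s,ift:jpg&q={q}&tbm=isch"


print(build_artwork_query("East 17", "Incoming"))
```

With the tags of the uploaded track, the wrong "Incoming" album value goes straight into the query, which is why the results cannot be better than the tag.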

In MM5 we are using MusicBrainz, which (similarly to the Amazon search) should give better results.

peke

2016-01-14 19:00

developer   ~0043866

NOTE: The Album is deliberately wrong so that the bug is shown.

I was suggesting that, in the case where the Album is not found, MMW tries to search for album art by adding the Title to the search.

e.g. https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&q="East+17"+"It's+Alright"&tbm=isch

The same goes for the Amazon web search if a single track is selected and the Album is not found, so that the search drops to <Artist> only, e.g. http://www.amazon.com/s/ref=nb_sb_ss_i_1_8?url=search-alias%3Dpopular&field-keywords=east+17+it's+alright&sprefix=east+17+it's+alright

michal

2016-01-15 07:38

developer   ~0043877

Peke: searching with the title returns some artwork for the "It's Alright" song, but that is because this song was released on a single with the same name, and the result is the cover of that single... So the artwork might be incorrect; the album "Walthamstow" has a different cover.
But it is hard to say; maybe it could really improve relevance sometimes, and sometimes not, depending e.g. on the specific title of the song.
Anyway, we cannot guess mistakes in user tags, and these mistakes really could make correct artwork and lyrics searching nearly impossible. Correct tags are the basis.

peke

2016-01-26 03:10

developer   ~0043973

Last edited: 2016-01-26 03:11

As discussed on IM, here is a direct example in response to 0013047:0043877

Current = https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&q=%22East+17%22+%22Incoming%22&tbm=isch which ends in results that do not have both <Artist> + <Album> in the image links with 100% certainty

If that is the case and we do not want to present an incorrect result to the user, MMW should try to search <Artist> + <Title>, e.g. https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&q=%22East+17%22+%22It%27s+Alright%22&tbm=isch which gets 100% correct album art for the track, ignoring the Album, which can't be confirmed as correct and/or may belong to a compilation or a user's custom album.

The result in Auto Tag is that MMW shows the correct track album art in more than 70% of cases, whereas the currently found album art is correct in less than 10%.

To test this I Copied Title to Album in approximately 2000 tracks without and used Auto Tag to find art.

Not knowing whether the tags are correct can be worked around by checking the image URL and not showing/downloading album art if none of the search criteria is found in it.

peke

2016-01-26 04:01

developer   ~0043974

Last edited: 2016-01-26 04:03

This approach can be applied to any track, but here are a few examples that clearly show how the filtering and the new search should work:
https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Bon+Jovi%22+%22Backup%22
https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Bon+Jovi%22+%22Always%22

https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Queen%22+%22Download%22
https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q="Queen"+"invisible+man"


https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Abba%22+%22Tribune%22
https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Abba%22+%22sos%22

https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Roxette%22+%22Chart+Hits%22
https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Roxette%22+%22Sleaping+Single%22

michal

2016-01-26 12:25

developer   ~0043976

Last edited: 2016-01-26 12:26

peke: but how do we detect that an album title inferred from the filename or file path is not correct, and that we should use the track title for artwork searching? If we don't use inferred metadata for searching, it could significantly decrease artwork search accuracy for users who have tracks in folders with names that match real album titles (in case the tracks have no album tags)...

peke

2016-01-26 16:47

developer   ~0043978

I also thought that it would return less accurate results, but actual test results showed the opposite, and there is no need for big changes in the search engine, just an additional search layer.

I'll use the original East 17 example, as it is more visible for older tracks.
Current https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22East+17%22+%22Incoming%22
Img Results are:
- https://i1.sndcdn.com/artworks-000135561298-11fxz5-large.jpg (from http://www.mp3-songs.host/free/new/mp3/east-17.html )
- https://traveladventures20142015.files.wordpress.com/2015/02/travelling-east17.jpg (from https://traveladventures20142015.wordpress.com/page/2 )
Neither contains the Album name in the img link; the second one contains <Artist> but fails <Artist> - <Album> and even <Artist> - <Title>, so it can be flagged as less than 30% accurate.

After such results MMW can assume with more than 70% accuracy that either the Artist or the Album is wrong, so to confirm that the Artist is correct it should initiate another search, this time using <Artist>+<Title>.

<Artist>+<Title> search results using https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22East+17%22+%22It%27s+Alright%22 img are:
- http://www.12inch.de/data/350/7467.jpg (from http://www.cdandlp.com/en/east-17/it-s-alright/album/ )
- http://cdn.discogs.com/_WJym9mfd0BAPoYruL88g_H9nl4=/fit-in/300x300/filters:strip_icc%28%29:format%28jpeg%29:mode_rgb%28%29/discogs-images/R-565311-1141660950.jpeg.jpg (from http://www.discogs.com/East-17-Its-Alright/release/565311 )

And if the results are analyzed, both <Artist> and <Title> are found in the img links, and they even reference discogs, which increases the chance of accuracy.

By searching within the links, inconclusive Google results are narrowed down to more accurate ones, and we can even add additional control/support for looser results.

The change I'm proposing is to not force <Artist>+<Album> in the auto search, due to the high risk of false results, and to add a per-track search, which is what is actually happening.

Yes, the result may be incorrect album art, but it increases the rate of correct per-track album art, and if not satisfied, the user can correct the tags (e.g. Album name) and search for new album art, e.g. https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22East+17%22+%22Walthamstow%22

@Michal
Select 200 tracks and copy Title to Album TAG and try Art Searches from Artwork Properties

Ludek

2016-01-26 17:39

developer   ~0043981

Last edited: 2016-01-26 18:00

We can hardly predict which metadata are incorrect (regardless of whether they come from the filename or the tag), so for incorrect artist/album/title metadata the user cannot expect to get correct lyrics/artwork.

I guess the only thing we can do now is to look up only known metadata, i.e. not look for "Unknown" albums or tracks like "Track01".

But as Michal pointed out over IM, the title/album/artist metadata are mostly inferred from tags and filenames, so they are far more often incorrect than unknown :-/

[Artist - Title] searching has the disadvantage that it would result in inconsistent artwork for various albums (e.g. for each "The Best" album or compilation).

Moving target to MM5, where we use the MusicBrainz database, which contains only real albums (Google can return any image).

peke

2016-01-27 02:49

developer   ~0043986

Last edited: 2016-01-27 02:59

I really doubt that even the changes in MM5 would solve bad Album tags, where:

If the track does not have ALBUM entered, then the result is https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Marky+Mark%22

E.g. http://musicbrainz.org/taglookup?tag-lookup.artist=Marky+Mark&tag-lookup.release=&tag-lookup.tracknum=&tag-lookup.track=&tag-lookup.duration=&tag-lookup.filename= making Auto Tag album art useless

and if the Title is used, then the result is https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&tbm=isch&q=%22Marky+Mark%22+%22Good+Vibrations%22

e.g. http://musicbrainz.org/taglookup?tag-lookup.artist=Marky+Mark&tag-lookup.release=&tag-lookup.tracknum=&tag-lookup.track=Good+Vibrations&tag-lookup.duration=&tag-lookup.filename= contains the result http://musicbrainz.org/release/952c1dd4-ec59-4074-89fb-7779f848b282 which is again the correct album art. Alternatively, adding the Title as Release (ALBUM) ends up with http://musicbrainz.org/taglookup?tag-lookup.artist=Marky+Mark&tag-lookup.release=Good+Vibrations&tag-lookup.tracknum=&tag-lookup.track=Good+Vibrations&tag-lookup.duration=&tag-lookup.filename= which contains the original album on which the track was released: http://musicbrainz.org/release/4712e342-de22-4ca1-80dd-18c72fe4680c
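
The three MusicBrainz lookups above differ only in which fields are filled in. A small sketch that reproduces those URLs (the function name is hypothetical; the field names and their order are copied from the quoted links, and whether the endpoint still accepts them has not been checked here):

```python
from urllib.parse import urlencode


def musicbrainz_taglookup_url(artist="", release="", track=""):
    """Rebuild a taglookup URL of the shape quoted in this comment."""
    # Fields this sketch does not fill are sent empty, as in the quoted URLs.
    params = [
        ("tag-lookup.artist", artist),
        ("tag-lookup.release", release),
        ("tag-lookup.tracknum", ""),
        ("tag-lookup.track", track),
        ("tag-lookup.duration", ""),
        ("tag-lookup.filename", ""),
    ]
    return "http://musicbrainz.org/taglookup?" + urlencode(params)


# Artist only, then artist + track, then title reused as release.
print(musicbrainz_taglookup_url("Marky Mark"))
print(musicbrainz_taglookup_url("Marky Mark", track="Good Vibrations"))
print(musicbrainz_taglookup_url("Marky Mark", "Good Vibrations", "Good Vibrations"))
```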

peke

2016-01-27 03:04

developer   ~0043987

Last edited: 2016-01-27 03:06

To summarize the test results:
- MMW returns possibly correct album art only when the artist/album/title metadata is 100% correct

- MMW always returns false album art when any of the artist/album/title metadata is wrong or missing (usually more than 80% of users' databases)

- Running an additional <Artist>+<Title> search after inconclusive results of the current searches yields possibly correct album art with a success rate of more than 90%, and in the case of missing metadata it can provide info to fill them in

Ludek

2016-01-27 11:12

developer   ~0043990

Last edited: 2016-01-27 12:19

OK, so if I understand correctly you are suggesting the following fix for 4.1.11:

Change current search from <Artist>+<Album> to <Artist>+<Title>

- pros: gives more accurate results (according to your tests)
- cons: mostly makes album artwork inconsistent (various album tracks get various artwork)

I guess we can try to change this for 4.1.11 and collect further feedback about the change, but...

I just tested this on a random album, "A Rush Of Blood To The Head" by "Coldplay", and for every song (tested "Clocks", "Green Eyes", "In my Place", "The Scientists", ...) it gets different artwork. Not sure whether users would be satisfied with different artwork for every single track on an album?

peke

2016-01-27 12:30

developer   ~0043991

Last edited: 2016-01-27 12:31

Close. I do not want to change the existing search, just to add an additional check of whether the album art results are correct, i.e. whether real album art for that album is returned or just some random images thrown up by Google.

Steps when Auto Search is initiated:
1. Search <Artist>+<Album>
2. Check whether the image URLs and REFURLs contain both <Artist> and <Album>

3a. If yes, then use the results to get album art (as currently)

3b. If only the Artist is located, then search for <Artist> - <Title>

3b/1. Check whether the image URLs and REFURLs contain both <Artist> and <Title>, and tag the track, assuming that the Album is either a compilation or its metadata is incorrect.

3b/2. If only the Artist is located, then search for <Artist>; then we should decide whether to add/present the album art or skip.
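
The steps above can be sketched as a two-stage lookup. This is a simplified illustration, not MMW code: `search` and `accept` are hypothetical callables supplied by the caller, the fallback runs whenever nothing was accepted (rather than only when the Artist alone was located), and step 3b/2 is resolved as "skip", in line with the earlier suggestion in ~0043860. The example.com URLs are made-up placeholder data.

```python
def find_artwork(track, search, accept):
    """Return (image_url, basis), or (None, None) if nothing is confirmed.

    search(terms) returns a list of (image_url, ref_url) pairs;
    accept(result, terms) decides whether one result is trustworthy.
    """
    attempts = [
        ("album", (track["artist"], track["album"])),  # steps 1, 2, 3a
        ("title", (track["artist"], track["title"])),  # steps 3b, 3b/1
    ]
    for basis, terms in attempts:
        for result in search(terms):
            if accept(result, terms):
                return result[0], basis
    # Step 3b/2: nothing confirmed, so skip instead of saving a doubtful image.
    return None, None


# Usage with made-up data standing in for real search results.
FAKE_RESULTS = {
    ("East 17", "Incoming"): [
        ("http://example.com/a.jpg", "http://example.com/east-17.html"),
    ],
    ("East 17", "It's Alright"): [
        ("http://example.com/b.jpg", "http://example.com/East-17-Its-Alright"),
    ],
}


def _squash(text):
    # Lowercase and keep only letters and digits.
    return "".join(ch for ch in text.lower() if ch.isalnum())


def fake_search(terms):
    return FAKE_RESULTS.get(terms, [])


def contains_all(result, terms):
    haystack = _squash(result[0] + " " + result[1])
    return all(_squash(term) in haystack for term in terms)


track = {"artist": "East 17", "album": "Incoming", "title": "It's Alright"}
print(find_artwork(track, fake_search, contains_all))
# -> ('http://example.com/b.jpg', 'title')
```

Returning the basis ("album" or "title") lets the caller know that the artwork was matched per track, so the Album tag can be treated as unconfirmed.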

Put better: I do not want to replace the existing behavior, just to eliminate false results as far as possible.

In future we can compare results if needed for increased accuracy.

Auto Tag from Web has a human factor: you can edit the search line, search again, and also choose from the existing results. Auto Search needs to work that out by itself, and it can't assume that the Album metadata is correct, because the search is for individual tracks and data regarding the album is missing (such as the number of tracks on the album, the Album Artist, whether it is a compilation, ...). So logically we should search for album art as if the track were a single or 12" release.

BTW, you have not stated whether your search/test on "Coldplay" ended in correct album art for each track or missed them.

Ludek

2016-01-27 12:38

developer   ~0043992

Last edited: 2016-01-27 14:52

Testing it now; you are probably right that the image URLs mostly contain both the album and the artist name, so I guess that your suggestion would work to eliminate the false matches as much as possible (e.g. http://misterbulger.files.wordpress.com/2014/05/the-joshua-tree-u2.jpg -- the first image URL for the U2 - The Joshua Tree album).

This should probably improve the accuracy a lot in the case of the Google lookup.

EDIT: Another solution could be to accept only img URLs that include one of the words "cover", "albumart", "album", "artwork".

Targeted to 4.1.11 again.
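
The alternative from the EDIT above is easy to try out. A minimal sketch, assuming a plain case-insensitive substring check (the names are hypothetical):

```python
# Keywords from the EDIT above; "albumart" is already covered by "album".
COVER_HINTS = ("cover", "albumart", "album", "artwork")


def looks_like_cover(image_url):
    """True if the image URL contains one of the cover-related keywords."""
    url = image_url.lower()
    return any(hint in url for hint in COVER_HINTS)


print(looks_like_cover("http://misterbulger.files.wordpress.com/2014/05/the-joshua-tree-u2.jpg"))
print(looks_like_cover("https://i1.sndcdn.com/artworks-000135561298-11fxz5-large.jpg"))
```

With the URLs quoted in this thread, the keyword check on its own rejects the U2 image and accepts the i1.sndcdn.com image from the "Incoming" search, so it would probably need to be combined with the artist/album check rather than replace it.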

peke

2016-01-27 13:24

developer   ~0043993

Last edited: 2016-01-27 15:19

If possible, also check the REFURL for each image. Sometimes the image URLs do not contain the info but the REFURLs do, as shown in my East 17 example in 0013047:0043978.
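
To illustrate with the first <Artist>+<Title> result from that comment, the sketch below reports where each search term shows up. The helper names are hypothetical, and the normalization (lowercase, ASCII letters and digits only) is an assumption that would drop accented characters.

```python
import re


def _norm(text):
    # Keep only lowercase ASCII letters and digits, so "It's Alright"
    # matches "Its-Alright" or "it-s-alright" inside a URL.
    return re.sub(r"[^a-z0-9]", "", text.lower())


def where_found(terms, image_url, ref_url):
    """Report, per search term, whether it appears in the image URL or REFURL."""
    img, ref = _norm(image_url), _norm(ref_url)
    return {t: {"image": _norm(t) in img, "ref": _norm(t) in ref} for t in terms}


report = where_found(
    ["East 17", "It's Alright"],
    "http://www.12inch.de/data/350/7467.jpg",
    "http://www.cdandlp.com/en/east-17/it-s-alright/album/",
)
print(report)
# -> {'East 17': {'image': False, 'ref': True}, "It's Alright": {'image': False, 'ref': True}}
```

Here both terms are found only in the REFURL, so a check limited to the image URL would reject this result.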

rusty

2016-01-27 15:13

administrator   ~0043995

fyi, I was just playing around with searches and found two other things that may help (not sure if you're already including this):

1) if you add: album cover
... to the search, it seems to improve the quality of the images (at least in English and Hebrew--I don't know about other foreign languages).

Note I also briefly tested "Album cover" but that didn't yield as good results. Others that I tried but weren't as good: "Album Art", Album, Cover, Album Art.

2) if you reject any images that aren't square it further improves results

peke

2016-01-27 15:25

developer   ~0043996

@Ludek
So if I got it correctly, after the fix MMW will search for <Artist>+<Album>, and if both metadata are not found in the IMG URL and REFURL of the results, it will restart the search as <Artist>+<Title> to see if there is track album art. And if none is found again, it will return no album art found?

@Rusty
1. Not for German/EU compilations.

2. Good point. A round image can represent a CD/LP scan on some very old tracks. But requiring square images (e.g. 300x300) can also narrow the chance of getting album art, while eliminating banners, ...

michal

2016-01-27 15:33

developer   ~0043998

ad 2) we already limit searching only for square images (parameter "tbs=iar:s")

rusty

2016-01-27 21:17

administrator   ~0044001

3) another possible addition to the logic might be to give preference to images from certain sites (e.g. discogs.com, allmusic.com)

michal

2016-01-27 21:37

developer   ~0044002

Last edited: 2016-01-27 21:38

ad 3) we were already trying to use a whitelist to prefer some results, but it was not working very well, so it is disabled now; quite often it preferred a wrong result over the good one.

Ludek

2016-01-28 11:45

developer   ~0044009

Last edited: 2016-01-29 13:20

Fixed in build 1781.

I used Peke's approach from 0013047:0043991, which seems to improve the accuracy/relevancy a lot.

1) At first it searches for <Album Artist>+<Album>, looking only within the first 30 items in the results; if alphanumeric variants of _all_ words from <artist>+<album> are included within the image link URL, then the link is accepted.

2) If all links from 1 are refused (e.g. in the case of 'East 17'+'Incoming'), then it searches again with <Artist>+<Title>, and if alphanumeric variants of _all_ words from <Artist>+<Title> are included within the image link URL, then the link is accepted.

Based on my tests it improves accuracy a lot, so once tested in 1781 we should consider implementing the same for MMA/MM5.
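
The acceptance rule in 1) and 2) can be sketched roughly as below. This is one possible reading of "alphanumeric variants" (each word lowercased and stripped of everything except ASCII letters and digits), not the actual build 1781 code, and the names are hypothetical.

```python
import re

MAX_RESULTS = 30  # "only within the first 30 items in the results"


def _alnum(text):
    return re.sub(r"[^0-9a-z]", "", text.lower())


def accepted_links(image_urls, *fields):
    """Keep image links whose URL contains every word of every field."""
    words = [_alnum(w) for field in fields for w in field.split()]
    words = [w for w in words if w]
    if not words:
        return []  # nothing to verify against, so accept nothing
    kept = []
    for url in image_urls[:MAX_RESULTS]:
        haystack = _alnum(url)
        if all(w in haystack for w in words):
            kept.append(url)
    return kept


u2 = ["http://misterbulger.files.wordpress.com/2014/05/the-joshua-tree-u2.jpg"]
east17 = [
    "https://i1.sndcdn.com/artworks-000135561298-11fxz5-large.jpg",
    "https://traveladventures20142015.files.wordpress.com/2015/02/travelling-east17.jpg",
]
print(accepted_links(u2, "U2", "The Joshua Tree"))   # the U2 link is kept
print(accepted_links(east17, "East 17", "Incoming"))  # -> []
```

An empty list from the first pass is what would trigger the second search with <Artist>+<Title>.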

peke

2016-01-28 16:50

developer   ~0044013

1. Great, as that will also cover your "U2" example where REFURL contained <Album>+<Artist>

peke

2016-01-28 20:53

developer   ~0044015

Last edited: 2016-01-29 13:57

Reopen.

A literal step-by-step description for a few more corrections, as we missed a few things.
1. Initial search for <Album Artist>+<Album>, URL: https://www.google.com/search?safe=active&tbs=iar:s,ift:jpg&q=%22Various%20Artists%22+%22Best%20of%2090's%20dance%20music%22&tbm=isch
2. It fails to find artwork, so it searches for <Album Artist>+<Title>, URL: https://www.google.com/search?safe=active&tbs=iar:s,ift:jpg&q=%22Various%20Artists%22+%22We're%20going%20to%20Ibiza%22&tbm=isch
3. This fails because MMW used <Album Artist>+<Title> instead of <Artist>+<Title>
4. I MANUALLY REMOVED Album Artist in TRACK PROPERTIES, and then MMW searched for <Artist>+<Album>, URL: https://www.google.com/search?safe=active&tbs=iar:s,ift:jpg&q=%22Vengaboys%22+%22Best%20of%2090's%20dance%20music%22&tbm=isch
5. As it failed to find album art, it started a search for <Artist>+<Title>, URL: https://www.google.com/search?safe=active&tbs=iar:s,ift:jpg&q=%22Vengaboys%22+%22We're%20going%20to%20Ibiza%22&tbm=isch
6. NOW album art is found: aaSearch.js: AddResult https://is5-ssl.mzstatic.com/image/thumb/Music/v4/d7/07/e3/d707e3a1-5fb8-5756-a843-c26213dadc56/source/1600x1600sr.jpg

NOTE: I tested on 200+ files that do not have Album Artist, and album art was found correctly in 100% of them.

peke

2016-01-28 23:12

developer   ~0044016

Last edited: 2016-01-28 23:32

a) Add a few more exclusions from the search:

"ft", "ft.", "Feat", "feat.", "featuring", "&"

For future tweaks we can try to exclude any text between parentheses.

b) Exception BUG on a 404 error: it should proceed with the next result (LOG uploaded)
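
For item b), the intended behavior could look like the sketch below. `fetch` is a hypothetical downloader passed in by the caller; the only assumption is that it raises an `OSError` on failure, which is what `urllib.request.urlopen` does for a 404 (`urllib.error.HTTPError` is a subclass of `OSError`).

```python
def first_downloadable(image_urls, fetch):
    """Return (url, data) for the first link that downloads, else (None, None)."""
    for url in image_urls:
        try:
            return url, fetch(url)
        except OSError:
            # Dead link (e.g. HTTP 404): move on to the next result
            # instead of aborting the whole search.
            continue
    return None, None


# Usage with a fake downloader, where the first link is dead.
def fake_fetch(url):
    if url.endswith("gone.jpg"):
        raise OSError("HTTP Error 404: Not Found")
    return b"image-bytes"


print(first_downloadable(["http://example.com/gone.jpg", "http://example.com/ok.jpg"], fake_fetch))
# -> ('http://example.com/ok.jpg', b'image-bytes')
```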

Ludek

2016-01-29 11:42

developer   ~0044019

Last edited: 2016-01-29 12:24

Not sure exactly what you mean by items 1-6, but I guess it would make sense to modify the current workflow slightly. As you pointed out, we currently prefer <Album Artist> whenever it is present. I guess that at first we should search for <Album Artist>+<Album> and then for <Artist>+<Title>; this will probably result in inconsistent artwork for compilation albums, but that looks like a better choice than incorrect artwork.

a) yes, it makes sense to exclude any text between parentheses

b) yes, I have already noticed that sometimes the found image link is no longer available; I will try to fix it, and hopefully it won't degrade performance a lot
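
The modified workflow from the first paragraph of this comment amounts to a fixed query order. A minimal sketch, assuming a track is a plain dict with hypothetical keys `artist`, `album_artist`, `album` and `title`:

```python
def artwork_queries(track):
    """Yield (first_term, second_term) pairs in the suggested search order."""
    artist = track.get("artist", "")
    # Fall back to the track artist when no Album Artist is set.
    album_artist = track.get("album_artist") or artist
    if album_artist and track.get("album"):
        yield (album_artist, track["album"])
    # The second search uses the track artist, never the album artist,
    # so "Various Artists" is not combined with a track title.
    if artist and track.get("title"):
        yield (artist, track["title"])


track = {
    "artist": "Vengaboys",
    "album_artist": "Various Artists",
    "album": "Best of 90's dance music",
    "title": "We're going to Ibiza",
}
print(list(artwork_queries(track)))
# -> [('Various Artists', "Best of 90's dance music"), ('Vengaboys', "We're going to Ibiza")]
```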

Ludek

2016-01-29 12:58

developer   ~0044020

Last edited: 2016-01-29 13:12

re b) the reason for the 404 was that MM tried
http://vignette1.wikia.nocookie.net/katyperry/images/a/a2/TeenageDreamSingle.jpg/revision/latest%3Fcb%3D20120412150520
instead of
http://vignette1.wikia.nocookie.net/katyperry/images/a/a2/TeenageDreamSingle.jpg/revision/latest?cb=20120412150520

i.e. it is a URI decoding issue: '?cb=' versus '%3Fcb%3D'.
The link is double encoded for some reason, so it also needs to be double decoded; the original double-encoded string was http://vignette1.wikia.nocookie.net/katyperry/images/a/a2/TeenageDreamSingle.jpg/revision/latest%253Fcb%253D20120412150520
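
The double decoding can be reproduced with the link above. This is a sketch of the idea, not the actual fix in MM; the function name and the two-round limit are assumptions, and decoding more than once could alter a URL that legitimately contains an encoded percent sign.

```python
from urllib.parse import unquote


def decode_link(url, max_rounds=2):
    """Percent-decode up to max_rounds times, stopping once nothing changes."""
    for _ in range(max_rounds):
        decoded = unquote(url)
        if decoded == url:
            break
        url = decoded
    return url


raw = ("http://vignette1.wikia.nocookie.net/katyperry/images/a/a2/"
       "TeenageDreamSingle.jpg/revision/latest%253Fcb%253D20120412150520")
print(decode_link(raw, max_rounds=1))  # ends in latest%3Fcb%3D20120412150520 (the 404 case)
print(decode_link(raw))                # ends in latest?cb=20120412150520
```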

Ludek

2016-01-29 13:18

developer   ~0044021

Last edited: 2016-01-29 13:21

Fixed in build 1782

i.e. <Album artist>+<album> used in 0013047:0044009
- text in parentheses is ignored
- URI decoding fixed
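
A rough sketch of the term clean-up follows. Only the parentheses part is confirmed by the notes above; dropping the "featuring" markers is the suggestion from item a) in ~0044016 and may not be part of build 1782. The names are hypothetical, matching is case-insensitive by assumption, and nested parentheses are not handled.

```python
import re

# Tokens from item a) in ~0044016 (compared in lowercase here).
FEAT_TOKENS = {"ft", "ft.", "feat", "feat.", "featuring", "&"}


def search_words(text):
    """Drop parenthesised text and 'featuring' markers; return remaining words."""
    without_parens = re.sub(r"\([^)]*\)", " ", text)
    return [w for w in without_parens.split() if w.lower() not in FEAT_TOKENS]


# Made-up tag values for illustration.
print(search_words("Good Vibrations (Radio Edit)"))  # -> ['Good', 'Vibrations']
print(search_words("Artist A feat. Artist B"))       # -> ['Artist', 'A', 'Artist', 'B']
```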

peke

2016-01-29 14:07

developer   ~0044022

Last edited: 2016-01-29 17:09

To clarify 0013047:0044019: In my tests, for compilations where the Album Artist is "Various Artist" or one of its variations, correct album art is found as long as <Album> is correct/existing and the album is not a chart compilation. Otherwise, in the case of chart compilations, each track should have its own album art, due to the fact that they are actually chart singles. We will see which is better based on feedback.

re a): This needs to be tracked, but there is still only a low number of tracks that actually have parentheses in the Title. There are many more tracks that use parentheses to state duets or multiple artists, so this approach is valid.

peke

2016-01-29 17:20

developer   ~0044024

Verified 1782

Search Results are 99.9% correct.

The only track for which it fails to find any album art is one where MMW searched Artist "D.K. Dance", Title "It's a lot" using https://www.google.rs/search?safe=active&tbs=iar:s,ift:jpg&q="D.K.+Dance"+"It's+a+lot"&tbm=isch

With further investigation I've found that MMW decided correctly not to add any album art: even when I do a manual search in Google Images using https://www.google.rs/search?q=%22D.K.+Dance%22%2B%22It%27s+a+lot%22&source=lnms&tbm=isch&sa=X&ved=0ahUKEwi2rKX2ws_KAhWF_SwKHW5GA4QQ_AUICCgC&biw=1352&bih=717 there is no square result, and the only result that is correct can only be chosen/confirmed visually.