I made an API request for person 37430 (Gordon Tootoosis). The request, made in November, looked like this: http://api.themoviedb.org/3/person/37430?api_key=xxxx&append_to_response=external_ids,images,tagged_images
The returned JSON looked like this:
{"birthday":"1941-10-25","tagged_images":{"results":[],"page":1,"total_results":0,"id":37430,"total_pages":0},"deathday":"2011-07-05","id":37430,"external_ids":{"freebase_id":"\/en\/gordon_tootoosis","instagram_id":null,"tvrage_id":6424,"twitter_id":null,"freebase_mid":"\/m\/08h8lr","imdb_id":"nm0867588","facebook_id":null},"name":"Gordon Tootoosis","images":{"profiles":[{"iso_639_1":null,"aspect_ratio":0.667107001321,"vote_count":0,"height":757,"vote_average":0,"file_path":"\/l10aLDp4D8ZwWdmfRdlj6FedUYk.jpg","width":505},{"iso_639_1":null,"aspect_ratio":0.6675567423231,"vote_count":0,"height":749,"vote_average":0,"file_path":"\/924xXlQnyah5S7IXy89UMdzXXsh.jpg","width":500}]},"also_known_as":[],"gender":2,"biography":"","popularity":0.098449,"place_of_birth":"Poundmaker Reserve, Saskatchewan, Canada","profile_path":"\/924xXlQnyah5S7IXy89UMdzXXsh.jpg","adult":false,"imdb_id":"nm0867588","homepage":null}
The problem is that 924xXlQnyah5S7IXy89UMdzXXsh.jpg, which seems to be the default image (it is the profile_path), is actually the image for Fergal Reilly (52701). I know this because it caused a Duplicate Entry violation. Judging by the IDs, the two records should have been fetched within 24 hours of each other.
Any idea how this happened?
The erroneous image now seems to be gone, but I find it hard to believe that it got there through user error. It looks like something more fundamental: if someone had uploaded the wrong image, it would have had a unique filename, not the filename of someone else's image.
I ask because I am wondering whether there is going to be a rash of these in the data collected from November.
Reply from Adi
on February 23, 2018 at 12:53 PM
Why is there no unique constraint on the image field? Surely there is no valid reason for two people to have the same photo?
So, for James Heath, and this is still the case, we have these TMDB IDs:
63256
63246
63244
Those are old IDs.
All with the same profile image of: /azRn7U2RKTkB9cHBO4GwJZm2jxy.jpg
They are all basically the same, but the last of the three has an IMDb ID.
Reply from Adi
on February 23, 2018 at 1:00 PM
So here are the duplicates (number of entries, then the TMDB person IDs that share one profile image). This feels highly avoidable, to be honest:
4 33781, 231784, 262075, 1104340
4 25348, 139567, 161310, 1153503
3 976019, 990654, 1301102
3 63244, 63246, 63256
2 94938, 572394
2 129814, 1884930
2 222548, 222549
2 1033001, 1866767
2 1791571, 1791647
2 251202, 1405869
2 1178411, 1880465
2 1624370, 1624372
2 555778, 1070406
2 1020725, 1070133
2 19828, 1462120
2 237775, 1483240
2 46391, 127279
2 1499236, 1886106
2 932097, 1234191
2 1418435, 1418437
2 179942, 1216483
2 1561979, 1561980
2 1833297, 1833299
2 1405685, 1612728
2 1479456, 1905843
2 588716, 1575014
2 239025, 1091885
2 565339, 1911757
2 1172683, 1676265
2 1405687, 1612729
2 1517607, 1523644
2 131208, 137626
2 148084, 260050
2 572045, 572046
2 230712, 932081
2 1747946, 1747947
2 564053, 1404608
2 1165435, 1165436
2 148108, 148109
2 145086, 145087
2 560243, 560244
2 1024232, 1883402
2 1553273, 1572416
2 18906, 1403158
2 37986, 1489580
2 1816564, 1849793
2 74296, 1144947
2 1120110, 1908406
2 1157333, 1157335
2 1405690, 1612727
2 56900, 143817
2 1813034, 1813041
2 189884, 930147
2 88471, 107221
2 224886, 994322
2 131605, 131606
2 228802, 586259
2 16609, 975287
2 1157303, 1222106
2 1155607, 1155608
2 224462, 1342697
2 1067293, 1067296
2 1130836, 1259905
2 580219, 1463264
2 33608, 1127849
2 144103, 1213844
2 260627, 1339312
2 1405691, 1612726
2 113387, 1216756
So yeah, on a few occasions, four separate person entries have the same image. Is this moderation gone wrong?
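For anyone who wants to reproduce a list like the one above from their own copy of the data, here is a minimal sketch in Python. It assumes you already have person records as dicts with `id` and `profile_path` keys (the field names from the API response earlier in the thread); the function name `find_shared_profile_images` and the record with ID 1 are hypothetical, made up for illustration.

```python
from collections import defaultdict


def find_shared_profile_images(people):
    """Group person IDs by profile_path and keep paths used by more than one ID."""
    ids_by_path = defaultdict(list)
    for person in people:
        path = person.get("profile_path")
        if path:  # skip people without a profile image
            ids_by_path[path].append(person["id"])
    return {
        path: sorted(ids)
        for path, ids in ids_by_path.items()
        if len(ids) > 1
    }


# The three James Heath IDs come from this thread; the last record is a
# hypothetical entry with its own image, included only for contrast.
people = [
    {"id": 63256, "profile_path": "/azRn7U2RKTkB9cHBO4GwJZm2jxy.jpg"},
    {"id": 63246, "profile_path": "/azRn7U2RKTkB9cHBO4GwJZm2jxy.jpg"},
    {"id": 63244, "profile_path": "/azRn7U2RKTkB9cHBO4GwJZm2jxy.jpg"},
    {"id": 1, "profile_path": "/hypothetical-example.jpg"},
]

shared = find_shared_profile_images(people)
for path, ids in shared.items():
    print(len(ids), ", ".join(str(i) for i in ids), path)
```

Grouping in application code like this reports the duplicates without needing a unique constraint that would reject the second insert.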
Reply from Travis Bell
on February 23, 2018 at 2:21 PM
Hey Adi,
The image service is only keyed by the file's SHA and has no link to the asset it belongs to. That link only exists in the media database, so yes, it would be possible to have the same SHA belong to more than one media record. It makes sense you're seeing this in and around duplicate records.
Once an image is uploaded to S3, it is never removed. Since it's keyed by the SHA, uploading that file again is essentially a no-op: nothing happens. Same SHA means same image, which means many records could theoretically be tagged with it.
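As a rough illustration of the behaviour described above (not TMDB's actual code), here is a small Python sketch of a content-addressed store. The class `ContentAddressedStore`, the in-memory dicts standing in for S3 and the media database, and the choice of SHA-256 are all assumptions; the thread only says "SHA", and real TMDB file names are clearly not raw hex digests.

```python
import hashlib


class ContentAddressedStore:
    """Toy image store keyed only by the hash of the file contents."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes; stands in for the S3 bucket

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()  # assumption: SHA-256
        # Re-uploading identical bytes is a no-op: the key already exists.
        self.blobs.setdefault(key, data)
        return key


store = ContentAddressedStore()
media_db = {}  # person ID -> image key; stands in for the media database

photo = b"placeholder bytes for one profile photo"
media_db[63244] = store.put(photo)
media_db[63246] = store.put(photo)  # same file uploaded for a duplicate record

print(media_db[63244] == media_db[63246])  # both records point at one key
print(len(store.blobs))                    # only one blob is stored
```

The store itself has no idea which record an image belongs to, so only the separate media table can say how many records reference a given key.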
Reply from Adi
on February 23, 2018 at 2:31 PM
Makes you wonder what happened with Gordon Tootoosis / Fergal Reilly, they don't look similar :P
Nice move with the SHA.
Worth adding something which indicates where it is already in use? (Stopping people from using it doesn't help, since they will just upload a different image of the same person for their duplicate entry, which isn't helpful.)
Reply from Travis Bell
on February 25, 2018 at 12:00 PM
Haha, I have stopped trying to figure out what users do sometimes.
It could be, but it might not change a user's behaviour much since, like you said, they'd probably ignore it anyway. And I would prefer to keep the merging/editing of profiles easy by not restricting duplicates (i.e. so a mod or user can just re-add the image), and then the mod can just click delete without worrying about anything else.
Reply from Adi
on February 26, 2018 at 12:33 AM
Yeah, a no-dupes rule would only hinder the mods without actually helping with the problem when it comes to user behaviour.