Has anyone started work on a resolver for the IDs between TMDB and Wikidata? I'm potentially interested in testing an ID match and ID sync tool design, but haven't started coding it yet.
Reply by Jonathan Brier
on July 11, 2023 at 11:17 AM
@leafstrat said:
Where can I keep up with your developments on this particular TMDB project out of curiosity?
Ok, so it looks like the Mix-n-Match catalogs could then be updated with IDs from the daily dumps: https://developer.themoviedb.org/docs/daily-id-exports I messaged the two current Mix-n-Match library creators to ask if they would mind importing the dump for matching, before I need to tackle that myself.
This leaves TV Networks without an existing matching catalog, but it also looks as if these don't have external IDs, so there isn't a place to return a Wikidata ID for the network: https://www.themoviedb.org/network/6-nbc/edit. Not a high priority, but something to note as a gap for Wikidata ID linking. I don't know if keyword linking for semantic interpretation is desired (it could be interesting from a linked data standpoint to find movies related to a term when reading about the keyword elsewhere) or whether collections should be linked as well.
For now, I'll post share-worthy updates here, such as requests for comment on the test cases to cover before coding. Project to-do items and tool design comments might be better filed as suggestions in the repo after I upload a first draft of the use cases, edge cases, and architecture: https://github.com/users/brierjon/projects/3 & https://github.com/brierjon/Id-Workflow-and-Sync This being a hobby project, I'm not sure how fast it will develop.
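For anyone who wants to experiment with the daily ID exports mentioned above, here is a minimal Python sketch of reading one. It assumes the export is a gzipped file with one JSON object per line, and that each object has an "id" plus either "original_title" (movies) or "name" (other lists); please check those field names against the export documentation before relying on them. The function name iter_export_ids and the two sample rows are only for illustration.

```python
import gzip
import io
import json
from typing import Iterator, Tuple


def iter_export_ids(gz_bytes: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (tmdb_id, label) pairs from a gzipped, line-delimited JSON export."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            # Assumption: movie rows carry "original_title", other lists use "name".
            label = record.get("original_title") or record.get("name") or ""
            yield int(record["id"]), label


# Tiny in-memory sample standing in for a downloaded export file.
sample_records = [
    {"id": 603, "original_title": "The Matrix"},
    {"id": 6, "name": "NBC"},
]
sample_gz = gzip.compress(
    "\n".join(json.dumps(r) for r in sample_records).encode("utf-8")
)
rows = list(iter_export_ids(sample_gz))
print(rows)
```

The resulting (id, label) pairs could then be reshaped into whatever format a matching catalog import expects.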
Reply by rohfle
on July 12, 2023 at 4:16 AM
For TMDB company ID to Wikidata ID mapping, you might want to have a look at https://github.com/rohfle/tmdb-wikidata-company-matching. I used a combination of fuzzy string matching on company names and a comparison of the TV shows and movies listed under each company on TMDB and on Wikidata to give a basic similarity score.
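To make the idea of a combined similarity score concrete, here is a rough Python sketch. It is not the code from the linked repository: it swaps in the standard library's difflib for the fuzzy name comparison and a Jaccard overlap for the shared titles, and the function names and the 50/50 weighting are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher
from typing import Iterable


def _norm(text: str) -> str:
    """Case-fold and trim so trivial differences do not affect the score."""
    return text.casefold().strip()


def name_similarity(a: str, b: str) -> float:
    """Fuzzy similarity of two company names, from 0.0 to 1.0."""
    return SequenceMatcher(None, _norm(a), _norm(b)).ratio()


def title_overlap(tmdb_titles: Iterable[str], wikidata_titles: Iterable[str]) -> float:
    """Jaccard overlap of the titles credited to the company on each site."""
    left = {_norm(t) for t in tmdb_titles}
    right = {_norm(t) for t in wikidata_titles}
    if not left or not right:
        return 0.0
    return len(left & right) / len(left | right)


def company_score(
    tmdb_name: str,
    wikidata_name: str,
    tmdb_titles: Iterable[str],
    wikidata_titles: Iterable[str],
    name_weight: float = 0.5,  # hypothetical weighting, tune on real data
) -> float:
    """Weighted mix of name similarity and shared-title overlap."""
    return (
        name_weight * name_similarity(tmdb_name, wikidata_name)
        + (1.0 - name_weight) * title_overlap(tmdb_titles, wikidata_titles)
    )


print(company_score("Pixar", "pixar", ["Up", "Coco"], ["up", "coco"]))
```

A candidate pair scoring above some threshold could be proposed as a match, with anything below it left for manual review.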
Reply by Wiccio
on May 13, 2024 at 5:29 AM
I have created a Python script that, for each TMDb page (films and TV series, for now), searches for the corresponding Wikidata item and synchronises the IDs there via the API (this is the bot that executes the script: https://www.wikidata.org/wiki/User:Wicci%27o%27Bot). I could also modify the code for reverse synchronisation: inserting missing IDs on TMDb when they are listed on the corresponding Wikidata item. Is there an option in TMDb to fill in the External IDs via a POST API?
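For readers wondering what the Wikidata side of such a sync involves, here is a small Python sketch of the decision logic only. It is not the bot's actual script and makes no network calls or edits. It assumes the item is located through its IMDb ID, and it uses the Wikidata property IDs P345 (IMDb ID), P4947 (TMDB movie ID) and P4983 (TMDB TV series ID), which are worth verifying on Wikidata before use. The function names and the add/skip/conflict labels are made up for the example.

```python
from typing import Dict, List, Tuple

# Wikidata properties for the identifiers involved (verify before use).
IMDB_ID = "P345"
TMDB_PROPERTY = {"movie": "P4947", "tv": "P4983"}


def build_lookup_query(imdb_id: str) -> str:
    """Build a SPARQL query that finds the Wikidata item carrying this IMDb ID."""
    if not imdb_id.isalnum():
        raise ValueError(f"unexpected IMDb ID: {imdb_id!r}")
    # LIMIT 2 so that a caller can detect ambiguous (duplicate) items.
    return f'SELECT ?item WHERE {{ ?item wdt:{IMDB_ID} "{imdb_id}" . }} LIMIT 2'


def plan_tmdb_claim(
    media_type: str, tmdb_id: int, existing_claims: Dict[str, List[str]]
) -> Tuple[str, str, str]:
    """Decide what to do with a TMDB ID for an item: add, skip, or conflict."""
    if media_type not in TMDB_PROPERTY:
        raise ValueError(f"unsupported media type: {media_type!r}")
    prop = TMDB_PROPERTY[media_type]
    value = str(tmdb_id)
    current = existing_claims.get(prop, [])
    if value in current:
        return ("skip", prop, value)  # already synchronised
    if current:
        return ("conflict", prop, value)  # a different ID is present, needs review
    return ("add", prop, value)


print(build_lookup_query("tt0133093"))
print(plan_tmdb_claim("movie", 603, {}))
```

Keeping the "conflict" case separate means a bot would never overwrite an existing ID without a human looking at it first.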
Reply by talestalker
on May 13, 2024 at 5:46 AM
All of our data must be entered manually and there are no plans to support automated data entry.
Regarding Wikidata IDs, the easiest way would be for TMDB to synchronize its data directly with Wikidata via IMDb ID and TMDB ID (as far as I know, the daily synchronization currently works only via IMDb ID on the Wikidata end). However, I don't know if such a sync is even planned.