Hello there,
I'm new to TMDB. I'm building an app as a project in college and I need a movie database.
Is there a way to download the entire dataset to be used locally? Is it allowed? If not, is the only way to use the TMDB API for every call my app gets? Is there a way to get only the ID of each item in the database?
If my application is to become commercial one day, will it be possible to get more than 30 requests per 10 seconds?
Thanks
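For anyone scripting against the limit mentioned above (30 requests per 10 seconds), here is a minimal client-side sketch of a sliding-window throttle. The class name `SlidingWindowThrottle` and its parameters are made up for illustration, and this only paces your own requests; it says nothing about how the server actually counts them.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls=30, period=10.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._stamps = deque()   # times of the calls still inside the window

    def wait(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have already left the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Window is full: sleep until the oldest call drops out.
            self._sleep(self.period - (now - self._stamps[0]))
```

Call `throttle.wait()` right before each API request; with the defaults it lets 30 calls through and then pauses until the oldest one is more than 10 seconds old.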
Reply by DJ_Lectr0
on April 23, 2017 at 4:43 AM
@Tokubetsu How large is the current json dataset? Planning on downloading it all and wanted to make sure I have enough space.
@mateinone @Tokubetsu Do you guys already have a database structure, and would you mind sharing it? I want to add movie recommendations as well as IMDB list support to our application and the whole db would make that a lot easier.
@travisbell If I create a database, would I be able to share it? (I think that would be very useful for others, and would save you bandwidth, since they don't have to query each id?) Also thank you so much for this API. It is really a pleasure to use :)
Reply by DJ_Lectr0
on April 23, 2017 at 4:47 AM
@mateinone Also do you have a script you could share with me for downloading all the json stuff? I can write that up myself, but if someone has already done it, I could save some time.
Reply by Adi
on April 23, 2017 at 10:23 AM
Over 2 gigs and around 1 million files.
Reply by DJ_Lectr0
on April 23, 2017 at 10:45 AM
@Tokubetsu And this was with all append_responses? Because that does seem bearable. I could even save that on my SSD. Thanks for your response.
Reply by Adi
on April 23, 2017 at 11:13 AM
Yeah, appended to the max. Just remember that some of the files will be quite small and, due to the cluster size of your hard drive, will take up more space than they advertise.
I am guessing, at this point, it will be closer to 2.5 gigs, as my snapshot was a while back.
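To put a number on the cluster-size point: each file occupies a whole number of clusters, so its size on disk is its real size rounded up to the next cluster boundary. A rough sketch, assuming 4096-byte clusters (a common default, but check your own filesystem) and made-up file sizes:

```python
def size_on_disk(file_sizes, cluster_size=4096):
    """Total bytes occupied when every file is rounded up to whole clusters."""
    total = 0
    for size in file_sizes:
        clusters = (size + cluster_size - 1) // cluster_size  # ceiling division
        total += clusters * cluster_size
    return total


# Three hypothetical JSON files: 100 B, exactly one cluster, and 5000 B.
sizes = [100, 4096, 5000]
print(sum(sizes))           # nominal size: 9196
print(size_on_disk(sizes))  # occupied size: 16384
```

Each file wastes less than one cluster, so with around 1 million files and 4096-byte clusters the overhead is capped at roughly 4 GB, and in practice it depends on how the file sizes are distributed.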
Reply by DJ_Lectr0
on April 23, 2017 at 2:43 PM
@Tokubetsu Yeah. I have more than enough storage space for that :) About keeping the data "latest": shouldn't you be able to query /movie/changes every 24 hours or so and only re-query those movies to get a pretty up-to-date database? Or have I misunderstood something?
Btw, I created a simple Python script here. It currently works well and I should have a complete dump in a day or two: https://gist.github.com/galli-leo/6398f9128ffc20af70c6c7eedfeb0a65
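The daily /movie/changes idea could look something like the sketch below. This is not the script from the gist. The base URL, the query parameter names (`api_key`, `start_date`, `end_date`, `page`) and the response shape (`results` holding objects with an `id`, plus `total_pages`) are assumptions to check against the API documentation.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.themoviedb.org/3"  # assumed base URL


def fetch_json(url):
    """Download one URL and decode its JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def changed_movie_ids(api_key, start_date, end_date, fetch=fetch_json):
    """Collect the IDs of movies changed between two YYYY-MM-DD dates."""
    ids = []
    page = 1
    while True:
        query = urllib.parse.urlencode({
            "api_key": api_key,
            "start_date": start_date,
            "end_date": end_date,
            "page": page,
        })
        data = fetch(f"{API_BASE}/movie/changes?{query}")
        # Assumed response shape: {"results": [{"id": ...}, ...], "total_pages": N}
        ids.extend(item["id"] for item in data.get("results", []))
        if page >= data.get("total_pages", 1):
            return ids
        page += 1
```

Running this once a day and re-downloading only the returned IDs would refresh the local copy without walking the whole database again.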
Reply by Adi
on April 23, 2017 at 4:36 PM
This gives you an indication of how different everyone's needs are. For me, I don't need everything to be up to the minute; I just do a snapshot every now and then. It also keeps the workload down for the TMDB servers.
Reply by Travis Bell
on April 27, 2017 at 7:22 PM
You guys can grab a list of ids from the following file:
http://files.tmdb.org/p/exports/movie_ids_04_27_2017.json.gz
This is a daily export. I'll be posting information about these files tomorrow. Keep an eye on the blog.
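A small sketch for reading such an export. The filename pattern (`movie_ids_MM_DD_YYYY.json.gz`) is inferred from the single link above, and the file layout (gzip-compressed, one JSON object per line, each with an `id` field) is an assumption until the details are posted; the function names are made up for illustration.

```python
import gzip
import io
import json
from datetime import date


def export_url(day):
    """Build the export URL for a date, following the pattern of the link above."""
    return f"http://files.tmdb.org/p/exports/movie_ids_{day:%m_%d_%Y}.json.gz"


def iter_export_ids(raw_gz_bytes):
    """Yield the "id" of each record in a gzipped, line-delimited JSON export."""
    with gzip.open(io.BytesIO(raw_gz_bytes), "rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)["id"]
```

The raw bytes would come from downloading `export_url(...)` for a date whose file actually exists, for example `export_url(date(2017, 4, 27))` for the file linked above.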
Reply by DJ_Lectr0
on April 28, 2017 at 8:32 AM
@travisbell Very cool! One question: if I query the /movie/changes endpoint, will movies whose popularity changed also appear? Additionally, is it possible to calculate the popularity ourselves, and how often is the popularity updated?
Reply by Travis Bell
on April 28, 2017 at 9:35 AM
No.
There's no way to get our popularity calculation, no. Popularity is a value that is calculated, and changes every day.
Now that I'm thinking about this, I'll probably add the popularity value to this export. I'm trying to push data that will help in the daily scrape scenario and popularity is definitely one of these.
Reply by DJ_Lectr0
on April 28, 2017 at 9:38 AM
@travisbell Gotcha. Thanks for the quick reply. If you add the popularity to the daily id export, that would be wonderful and would allow our application to recommend movies much better :)
Reply by Travis Bell
on April 28, 2017 at 11:09 AM
Popularity has been added to the 04_28_2017 export. More details coming out soon.
Reply by bhatfield
on July 14, 2017 at 6:23 PM
Travis, perhaps a solution is hosting a database dump as a public AWS S3 bucket where the requester pays for the bandwidth? It might save you some money with API bandwidth and provide the community with a place to get a dump. If you like, I'm happy to help set it up.
Reply by Travis Bell
on July 14, 2017 at 6:53 PM
Hi @bhatfield, I don't have any plans to host full database dumps. For the actual data, the only official way is the web service. It's not really about the cost. There are a number of reasons, but one of the largest has to do with legal stuff.
Reply by nbmoviedb
on January 20, 2019 at 10:27 PM
Travis - Isn't this a little silly? IMDb - 5 minute download of current database, then upload into a SQL database or Excel sheet, 2 minutes. TMDb - 48 hour download, parse all JSON files and upload to DB or Excel: 4 hours. So IMDb 7 minutes vs TMDb 52 hours. Really? IMDb has NO legal issues, but you do?