The Movie Database Support

Hello there,

I'm new to TMDB. I'm building an app as a college project and I need a movie database.

Is there a way to download the entire dataset for local use? Is that allowed? If not, is the only option to call the TMDB API for every request my app receives? Is there a way to get just the ID of each item in the database?

If my application is to become commercial one day, will it be possible to get more than 30 requests per 10 seconds?

Thanks

49 replies (on page 1 of 4)


Hi David,

We do not provide data dumps at this time. The only way to get the data would be to iterate over each ID, yes.

There is also no way to circumvent the rate limits. Keep in mind, these rate limits are per IP address, not API key. We have tens of thousands of developers using our platform and this is rarely an issue.

Cheers.

Hi Travis, Thank you for the detailed response.

Is it allowed to download and keep the data locally on my machine? Can you show or explain how to iterate over the IDs?

Thanks again

No problem. Yes, you can cache the data locally.

There's nothing to really show: just start at ID 1 and loop up to the last ID you get from /movie/latest. Keep in mind, there will be gaps (404 responses) in this ID range from movies that have been deleted.

Cheers.
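The loop described above can be sketched roughly as follows. This is a minimal illustration, not an official client: the helper names `fetch_json` and `crawl_movies` are hypothetical, the sketch assumes the `/movie/latest` response carries an `id` field and that individual movies live at `/movie/{id}`, and it does no throttling, so a real crawl would also have to stay under the rate limit mentioned earlier in the thread.

```python
import json
import urllib.error
import urllib.request

API_ROOT = "https://api.themoviedb.org/3"


def fetch_json(path, api_key):
    """GET one API path; return parsed JSON, or None on a 404 (a gap in the IDs)."""
    url = f"{API_ROOT}{path}?api_key={api_key}"
    try:
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise  # anything else (rate limit, server error) is left to the caller


def crawl_movies(api_key, fetch=fetch_json, start_id=1):
    """Yield (id, movie) for every existing ID from start_id up to the latest ID."""
    # Assumption: the /movie/latest response includes the newest movie's "id".
    latest = fetch("/movie/latest", api_key)["id"]
    for movie_id in range(start_id, latest + 1):
        movie = fetch(f"/movie/{movie_id}", api_key)
        if movie is not None:  # skip deleted movies (404 responses)
            yield movie_id, movie
```

Passing `fetch` as a parameter keeps the loop testable without network access, and `start_id` lets an interrupted crawl pick up where it stopped.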

So at three requests per second, that's a day and a half to get all the films + the same again to get the cast/crew + six days to get the people? Is there a way to get multiple results at once (in some structured way/without searching)?

Thanks.
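The estimate above is easy to check with a little arithmetic. A minimal sketch, assuming a hypothetical helper `crawl_days`; the 388,800 figure is simply back-calculated from "a day and a half at three requests per second" and is not a number published by TMDB.

```python
def crawl_days(num_requests, limit=30, window_seconds=10):
    """Days needed to send num_requests at `limit` requests per window."""
    seconds = num_requests * window_seconds / limit
    return seconds / 86_400  # seconds in a day


# 1.5 days * 86,400 s/day * 3 requests/s = 388,800 requests
print(crawl_days(388_800))
```

Since one request is needed per ID, the crawl time scales linearly with the highest ID, whatever the gaps.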

Hi trengot,

Since the first post above, the limits have actually been increased to 40 every 10 seconds, but other than that there still isn't much more we offer in the sense of bulk data downloading. We do not have any kind of multi fetch API.

Cheers.
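One way to stay under a limit like "40 every 10 seconds" is to throttle on the client side. The sketch below is only one possible approach, a sliding-window limiter; the class name `Throttle` and its parameters are hypothetical and nothing here is provided by TMDB.

```python
import time
from collections import deque


class Throttle:
    """Allow at most `limit` calls in any `window` seconds (sliding window)."""

    def __init__(self, limit=40, window=10.0, clock=time.monotonic, sleep=time.sleep):
        self.limit, self.window = limit, window
        self.clock, self.sleep = clock, sleep
        self.stamps = deque()  # times of the most recent requests

    def wait(self):
        """Block until another request may be sent, then record it."""
        now = self.clock()
        # Forget requests that have already left the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) >= self.limit:
            # Sleep until the oldest recorded request leaves the window.
            self.sleep(self.window - (now - self.stamps[0]))
            now = self.clock()
            self.stamps.popleft()
        self.stamps.append(now)
```

Calling `throttle.wait()` immediately before each request keeps a single process within the limit. Because the limits are per IP address, several processes behind the same address would need to share one limiter.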

Hi there, I'm probably off-topic (and somewhat inappropriate; I hope the admins do not mind me mentioning the competition), but if this is for college work (non-commercial), I believe the IMDb dataset can be used too, as their license allows non-commercial use. You can download the entire IMDb database (in text format, a little awkward to manage, but doable) from: ftp://ftp.fu-berlin.de/pub/misc/movies/database/ It is more than 4 GB, without any pictures. Maybe worth a try. Have fun.

I don't mind at all! If the licenses around the IMDb data dump work for you, then by all means ;) The link to IMDb's terms around the data is located on their alternative interfaces page.

Cheers.

Hi Travis,

Could you please tell me the latest movie ID at this moment? I'm at 424996 and still going...

Thank You

If anyone has looped over the IDs in the API and produced a local cache, would they be able to share that database? I was thinking it would be nice to have a Postgres database for offline work if possible.

A point of note, for anyone who is interested: actually importing the IMDb dump into a working, indexed DB is a bit of a ball ache. You have been warned. It does make for a fascinating tool though, and is great for testing theories or answering friends' strange questions. And yeah, you don't want to be creating a site off it, not just because of the licensing issues but because it has an insane number of short films you don't really need; e.g. it has over 1 million items, of which I think 70% are shorts.

I would recommend just iterating through the IDs via the API rather than getting a dump from someone. I am guessing that if you need it now, you will most likely need an updated one later, and the best way to get that is to do it yourself. It also lets you arrange the data in the way that is most useful to you; how one person represents the data may well be different from how you want it.
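As an illustration of keeping your own copy, here is a minimal sketch of a local cache. It uses SQLite from the Python standard library as a stand-in for the Postgres database mentioned above, and it stores each response as raw JSON so the data can be rearranged later. The table layout and the function names `open_cache`, `save_movie` and `load_movie` are assumptions made for this example.

```python
import json
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the cache database with a single movie table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS movie (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return db


def save_movie(db, movie_id, movie):
    """Store one movie; a later re-crawl overwrites the older copy."""
    db.execute(
        "INSERT OR REPLACE INTO movie (id, body) VALUES (?, ?)",
        (movie_id, json.dumps(movie)),
    )
    db.commit()


def load_movie(db, movie_id):
    """Return the cached movie as a dict, or None if it was never stored."""
    row = db.execute("SELECT body FROM movie WHERE id = ?", (movie_id,)).fetchone()
    return json.loads(row[0]) if row else None
```

Keeping the raw JSON defers schema decisions: normalized tables can be built from it later without crawling again.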

If I did have one request, it would be to know what the field sizes were for various things.

@alifrezser As Travis Bell stated, just use this: /movie/latest

It gives you the highest ID available.

Thanks @david.tzoor for asking this question, as I was looking for a dataset for my college work as well. Using the API to iterate over all movie IDs is pretty inconvenient, but still a passable way to get the data... I have also worked with the IMDb dataset and can confirm that it's a hell of a lot of work to get the data into a normalized database. The format of the database is just a mess. I was hoping to get a 'nice' dump here, on a less 'ancient' platform; unfortunately I was proved wrong :)

I think getting the data is pretty straightforward; one question here was what the latest ID is.

This gets you the latest movie ID: https://api.themoviedb.org/3/movie/latest?api_key={api_key}

At the time of typing: 432788

With regard to the requests for a data dump, the biggest problem is the empty values. When I ran through, I found the following (the latest ID when I ran it was 432,420), but I was using a fairly rudimentary test and it is possible some were missed.

VALID: 303,923
EMPTY: 118,148
ADULT: 10,349
TOTAL: 432,420
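The three counts above add up exactly to the total (303,923 + 118,148 + 10,349 = 432,420), which works out to roughly 70% valid, 27% empty and 2% adult. The poster's actual test is not shown, so the following is only a guess at what such a tally could look like: it assumes an empty ID is represented as `None` (for example a 404) and that each movie object carries a boolean `adult` field. The function name `tally` is hypothetical.

```python
from collections import Counter


def tally(results):
    """Count crawl results as EMPTY (None), ADULT, or VALID."""
    counts = Counter()
    for movie in results:
        if movie is None:            # assumption: a missing ID is stored as None
            counts["EMPTY"] += 1
        elif movie.get("adult"):     # assumption: adult titles are flagged
            counts["ADULT"] += 1
        else:
            counts["VALID"] += 1
    return counts


sample = [None, {"adult": True}, {"adult": False}, {"title": "x"}]
print(tally(sample))
```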

I could provide a list of the corresponding IDs, but anyone using it would need to remember that movies can still be deleted and that I could have missed something. Generally speaking, I am just stating that I can't be responsible for its accuracy going forward (or even now) :)

Getting a data dump of anything is rare.

IMDb has to do it, I believe, due to its origins. That said, they try to make it as difficult as possible to do anything with.

TMDB doesn't offer it. Neither should it have to. That said, they don't actively prevent people from crawling the API. Offering a dump just gives them more headaches and eats lots of bandwidth as people download it, thinking it will be handy and then never use it.

I will say one thing, you know the data pretty well by the time you are done getting it. I have made over 1000 data changes on this site because of it.

So it is there if you want it, you just have to get it ahead of time, as it takes a while to get.

If you are desperate for a dump (no pun intended), then OMDBAPI offers one if you are willing to pay for it. It only has the same data you can already get from the API, but it comes in CSV form or some such.

Ultimately, the way I have my data set up and indexed is probably very different to how mateinone has his or how the TMDB is originally set up. I know I have a lot more validation on mine, which is what often results in me fixing data on here.

I think what is offered here is a fair compromise.

It just comes down to how much you want it. It certainly keeps the headaches down for an already busy workforce.

Adi hits two big points for me:

Offering a dump just gives them more headaches and eats lots of bandwidth as people download it, thinking it will be handy and then never use it.

Ultimately, the way I have my data set up and indexed is probably very different to how mateinone has his or how the TMDB is originally set up.

That second point in particular is going to be the biggest one for sure. I've struggled at times over the years to keep up with our growth and feature demands, so the DB is very likely not set up in a way that a lot of people would find very useful. Keep in mind, this all started as a website to share zip files of images way back in the day. Things are just a little bit different nowadays 😉

Now, I do have plans to offer a downloadable file of invalid IDs; that is something I can do to help some people out.
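If such a file of invalid IDs does become available, a crawler could use it to skip requests that would only return a 404. The file is only planned at this point, so the sketch below assumes a purely hypothetical format of one numeric ID per line. The helper names `parse_id_lines` and `ids_to_fetch` are also invented for illustration.

```python
def parse_id_lines(text):
    """Parse a text blob with one numeric ID per line (hypothetical format)."""
    return {int(line) for line in text.splitlines() if line.strip()}


def ids_to_fetch(latest_id, invalid_ids, start_id=1):
    """List every ID from start_id to latest_id that is not known to be invalid."""
    skip = set(invalid_ids)
    return [i for i in range(start_id, latest_id + 1) if i not in skip]


print(ids_to_fetch(5, parse_id_lines("2\n4\n")))
```

Skipping known gaps would shorten a full crawl in proportion to the share of empty IDs.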
