The Movie Database Support

Hi, had some issues getting all the Ocean’s x films to scrape via MrMC using the TMDb scraper. I have 4 of them in total - Clooney’s 3 and the newest version “8” from last year. But only Thirteen scanned automatically.

I’ve had issues with these films in the past but always fixed them manually - I figured it was to do with the numerals 11/12/13/8 vs the written forms Eleven etc. (I matched the way they’re written on TVDb but no luck). It looks like it’s actually an issue with curly vs straight apostrophes, and with inconsistency between different apps/websites on the iPhone.

For example if I type an apostrophe in the Safari address bar - it’s straight. If I type the same key on the Duck Duck Go search engine it’s curly. If I type it in TMDb’s search bar it’s curly.

If you search for “ocean’s” (curly apostrophe) on TMDb it returns 1 film, Ocean’s Thirteen:

https://www.themoviedb.org/search?query=Ocean’s&language=en-US

If you search with a straight apostrophe “Ocean's”, you get 12 movies returned, including all the Ocean’s films as well as Ocean’s Thirteen again:

https://www.themoviedb.org/search?query=Ocean%27s&language=en-US
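To make the difference between the two queries concrete, here is a minimal Python sketch showing how each apostrophe is percent-encoded in a URL. It only uses the standard library's `urllib.parse.quote`; it says nothing about how TMDb itself processes the query, and the variable names are just for illustration.

```python
from urllib.parse import quote

# Straight apostrophe, U+0027: a single ASCII byte.
straight = quote("Ocean's")

# Curly apostrophe, U+2019 (RIGHT SINGLE QUOTATION MARK): three UTF-8 bytes.
curly = quote("Ocean\u2019s")

print(straight)  # Ocean%27s
print(curly)     # Ocean%E2%80%99s
```

The two strings look nearly identical on screen but are different characters, so a search that compares them literally will treat them as different titles.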

I checked my local files, and Ocean’s Thirteen was the only one named with a straight apostrophe, which is why it scanned OK. The rest were named with curly apostrophes. I renamed them all with straight apostrophes and now MrMC scrapes them fine.

Is this a bug of some sort? Can it be fixed so TMDb search returns results for either curly or straight apostrophes?

5 replies (on page 1 of 1)

I've been noticing for some time that all instances of curly apostrophes on TMDb seem to have been copy-and-pasted in from texts found elsewhere. I'm not clear why they aren't automatically converted to straight apostrophes (which are TMDb's norm) when pasted in and saved.

This damn apostrophe has made my research much more difficult. :imp:
What I found http://snowball.tartarus.org/texts/apostrophe.html
There are 4 of them:
U+0027 Unicode Character 'APOSTROPHE' https://www.fileformat.info/info/unicode/char/27/index.htm
U+2019 Unicode Character 'RIGHT SINGLE QUOTATION MARK' https://www.fileformat.info/info/unicode/char/2019/index.htm
U+2018 Unicode Character 'LEFT SINGLE QUOTATION MARK' https://www.fileformat.info/info/unicode/char/2018/index.htm
U+201B Unicode Character 'SINGLE HIGH-REVERSED-9 QUOTATION MARK' https://www.fileformat.info/info/unicode/char/201b/index.htm

And I think there are others.
U+2032 Unicode Character 'PRIME' https://www.fileformat.info/info/unicode/char/2032/index.htm
U+2035 Unicode Character 'REVERSED PRIME' https://www.fileformat.info/info/unicode/char/2035/index.htm
U+0060 Unicode Character 'GRAVE ACCENT' https://www.fileformat.info/info/unicode/char/0060/index.htm

and some more https://www.fileformat.info/info/unicode/block/combining_diacritical_marks/utf8test.htm
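As a rough illustration of what handling these could look like, here is a minimal Python sketch that folds the look-alike characters listed above into the plain U+0027 apostrophe before comparing or searching. The names `APOSTROPHE_VARIANTS` and `normalize_apostrophes` are made up for this example, the character list is only the one from this post (not exhaustive), and this is not how TMDb actually does it.

```python
# Look-alike characters from the list above, all mapped to U+0027.
# This set is an assumption taken from this thread, not a complete list.
APOSTROPHE_VARIANTS = (
    "\u2019"  # RIGHT SINGLE QUOTATION MARK
    "\u2018"  # LEFT SINGLE QUOTATION MARK
    "\u201B"  # SINGLE HIGH-REVERSED-9 QUOTATION MARK
    "\u2032"  # PRIME
    "\u2035"  # REVERSED PRIME
    "\u0060"  # GRAVE ACCENT
)

# Translation table: every variant becomes a straight apostrophe.
_TABLE = str.maketrans({ch: "'" for ch in APOSTROPHE_VARIANTS})


def normalize_apostrophes(text: str) -> str:
    """Return text with apostrophe look-alikes replaced by U+0027."""
    return text.translate(_TABLE)


print(normalize_apostrophes("Ocean\u2019s Eleven"))  # Ocean's Eleven
```

Applying the same function to both the stored titles and the incoming query would make curly and straight apostrophes match each other.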

I should be handling these properly, it's something that I remember looking at years and years ago. Here's the relevant ticket. Thanks.

@travisbell Is it okay to suggest new tickets? I don't know if you remember, but there are two somewhat similar search issues with a) all diacritics and b) the Turkish capital letter "İ" (regular i works fine, but the Turkish letter always returns no result). :construction:
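For illustration only, here is a minimal Python sketch of one common way to make a search insensitive to diacritics and to the dotted capital "İ": decompose the text (Unicode NFD), drop the combining marks, then case-fold. The function name `fold_for_search` is hypothetical, and this is an assumption about a possible approach, not a description of TMDb's search.

```python
import unicodedata


def fold_for_search(text: str) -> str:
    """Strip combining marks and case-fold so accented forms compare equal."""
    # NFD splits e.g. "é" into "e" + U+0301, and "İ" into "I" + U+0307.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() intended for comparisons.
    return stripped.casefold()


print(fold_for_search("İstanbul"))  # istanbul
print(fold_for_search("Amélie"))    # amelie
```

One reason "İ" tends to be a special case: lowercasing it directly in Python gives two code points ("i" followed by a combining dot above), so a naive lowercase comparison against plain "i" fails. Note that this sketch leaves the Turkish dotless "ı" (U+0131) unchanged, since it has no decomposition.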

This issue has been fixed and pushed live. Curly and regular apostrophes should be returning the same results now.

@banana_girl I haven't looked at the Turkish issue yet.
