tw-video-scraper

For download and source code, see the GitHub page.

tw-video-scraper is a small Python script that retrieves a fan art thumbnail for a video file. It is designed to run with TwonkyMedia server, but may be of use elsewhere as well.

Background

TwonkyMedia server is a DLNA server that runs on many different devices. Using ffmpeg, Twonky can generate thumbnails for video files. A guide to setting up Twonky can be found here.

One of the problems of using ffmpeg is that it takes a snapshot of the video file at a fixed time, for example at 30 seconds. For both movies and series this can be a problem: if the opening credits are running at that point, the thumbnail shows them, which is rarely a meaningful image for the video file.

For many movies and series there is fan art on the internet, which makes a more meaningful thumbnail for a specific movie or episode. tw-video-scraper tries to get the fan art thumbnail for a video file. If no thumbnail can be found, ffmpeg is used to generate one.

How to use

tw-video-scraper is a Python script that takes two arguments. The first is the name of the video file, the second is the path of the output image file.

[code]python tw-video-scraper.py /path/to/cool.serie.s01e04.htdv.avi /images/abc.jpg[/code]

To use it with Twonky, save tw-video-scraper.py, python-tw-video-scraper.desc and python.location in Twonky’s cgi-bin directory. Make all files executable:

[code]chmod 700 tw-video-scraper.py python-tw-video-scraper.desc python.location[/code]

If needed, change the values in python.location, tw-video-scraper.py, and python-tw-video-scraper.desc. The file ffmpeg-video-thumb.desc is no longer needed: either comment out all its lines or delete the file. If you want to regenerate all thumbnails, delete everything from the cache folder in the Twonky directory (e.g. /var/twonkymedia/twonkymedia/db/cache). Now restart Twonky, and the script will be used.

At least jpeg-scale.desc and the cgi-jpegscale plugin must also be enabled.

How it works

Thumbnails for series are retrieved from TheTVDB.com. For movies, TheMovieDB.org is used. For each file, the script tries to guess the name and year of the movie, or the name of the series, the season, and the episode number. This is done with regular expression patterns.

Series

First, it tries to determine if the file is an episode of a series. It is treated as one if the file name matches one of the following conditions:

  • The file name contains S01E01 (or any other two-digit numbers)
  • The file name contains 1×01 (or any other numbers in this form)
  • The file name contains ‘season 1 – episode 01’ (or any other numbers in this form; the hyphen can be omitted)

Anything immediately before the S01E01 or 1×01 is treated as the name of the series. If the last character is a hyphen (-), it is removed. Periods (.) and underscores (_) are replaced by spaces. With this name, the script uses the TheTVDB.com API to find the correct series. A series is only found if the name matches the series name exactly. A colon (:) in the series name is removed. Names are compared case-insensitively.
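The detection and name clean-up described above can be sketched roughly as follows. This is only an illustration: the regular expressions and the function names `clean_name` and `parse_series` are assumptions made for this example, not the expressions used in tw-video-scraper.py, and the order of the clean-up steps is a guess.

```python
import os
import re

# Hypothetical patterns for the three formats listed above; the script's
# own expressions may differ.
SERIES_PATTERNS = [
    re.compile(r"^(?P<name>.*?)s(?P<season>\d{1,2})e(?P<episode>\d{1,2})",
               re.IGNORECASE),
    re.compile(r"^(?P<name>.*?)(?P<season>\d{1,2})[x×](?P<episode>\d{1,2})",
               re.IGNORECASE),
    re.compile(r"^(?P<name>.*?)season\s*(?P<season>\d{1,2})\s*[-–]?\s*"
               r"episode\s*(?P<episode>\d{1,2})", re.IGNORECASE),
]


def clean_name(raw):
    """Turn the text before the episode marker into a comparable name."""
    name = raw.replace(".", " ").replace("_", " ").strip()
    if name.endswith("-"):          # drop a trailing hyphen
        name = name[:-1]
    # remove colons and lower-case so names compare case-insensitively
    return name.replace(":", "").strip().lower()


def parse_series(path):
    """Return (name, season, episode), or None if no pattern matches."""
    stem = os.path.splitext(os.path.basename(path))[0]
    for pattern in SERIES_PATTERNS:
        match = pattern.search(stem)
        if match:
            return (clean_name(match.group("name")),
                    int(match.group("season")),
                    int(match.group("episode")))
    return None


print(parse_series("/path/to/cool.serie.s01e04.htdv.avi"))
```

A file for which `parse_series` returns None would then be handled as a movie.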

For each series, the script downloads a zip file from TheTVDB.com and extracts it. This file contains information about each episode. The script tries to match the season and episode numbers taken from the file name against the information in the file downloaded from TheTVDB.com. When the correct episode has been found, its thumbnail image is saved to the output path given as an argument when starting the script.
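As a rough sketch of that matching step, the snippet below looks up one episode in a small XML document. The element names (`Episode`, `SeasonNumber`, `EpisodeNumber`, `filename`) are assumptions about the layout of the downloaded file, the sample data is made up, and `find_episode_image` is a hypothetical helper, not a function from the script.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; the real file from TheTVDB.com is much larger and
# its exact layout is assumed here.
SAMPLE_XML = """<Data>
  <Episode>
    <SeasonNumber>1</SeasonNumber>
    <EpisodeNumber>3</EpisodeNumber>
    <filename>episodes/0000/3.jpg</filename>
  </Episode>
  <Episode>
    <SeasonNumber>1</SeasonNumber>
    <EpisodeNumber>4</EpisodeNumber>
    <filename>episodes/0000/4.jpg</filename>
  </Episode>
</Data>"""


def _to_int(text):
    """Convert element text to int, returning None for missing/invalid text."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


def find_episode_image(xml_text, season, episode):
    """Return the image path of the matching episode, or None."""
    root = ET.fromstring(xml_text)
    for ep in root.findall("Episode"):
        if (_to_int(ep.findtext("SeasonNumber")) == season
                and _to_int(ep.findtext("EpisodeNumber")) == episode):
            return ep.findtext("filename") or None
    return None


print(find_episode_image(SAMPLE_XML, 1, 4))
```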

For each series, the script has to contact TheTVDB.com to get the series ID. Once found, the ID is stored in a database. The database is checked for the series ID before TheTVDB.com is contacted. This improves speed and reduces the number of API calls.
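The "check the database first" idea can be illustrated with a minimal sketch. It assumes a `video` table with the columns `id`, `type` and `name` (the columns used in the sqlite3 example further down this page); the functions `open_db` and `get_series_id`, and the `lookup` callback standing in for the API call, are hypothetical.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the (assumed) video table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS video (id INTEGER, type TEXT, name TEXT)")
    return conn


def get_series_id(conn, name, lookup):
    """Return the cached ID for name, calling lookup(name) only on a miss."""
    row = conn.execute(
        "SELECT id FROM video WHERE type = 'serie' AND name = ?",
        (name,)).fetchone()
    if row:
        return row[0]
    series_id = lookup(name)        # stands in for the TheTVDB.com request
    if series_id is not None:
        conn.execute(
            "INSERT INTO video (id, type, name) VALUES (?, 'serie', ?)",
            (series_id, name))
        conn.commit()
    return series_id
```

With this shape, a second call for the same name is answered from the database and `lookup` is not called again.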

The files downloaded from TheTVDB.com are saved. If a file already exists, it is not downloaded again. This also improves speed and reduces the load on the TheTVDB.com server. If the file is more than 6 days old, it is always retrieved again from TheTVDB.com, to make sure new episodes are included.
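A freshness check of this kind could look like the sketch below. It assumes the age is judged by the file's modification time, which may not be how the script does it; `needs_download` and `MAX_AGE_SECONDS` are names invented for this example.

```python
import os
import time

MAX_AGE_SECONDS = 6 * 24 * 60 * 60   # six days, as described above


def needs_download(path, now=None):
    """True if the cached file is missing or older than six days."""
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > MAX_AGE_SECONDS
```

The optional `now` argument only exists to make the function easy to try out with a pretend current time.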

Movies

If the script determines that a file is not a series, it tries to download the movie poster from TheMovieDB.org. Movies are also matched using regular expressions. The following patterns are recognised:

  • The file name has a year in it, either surrounded by (), [], or {}, or not surrounded at all
  • The whole file name (when no year is present)

First, the name of the parent directory is tried. If no movie can be found, the file name itself is used. Both the directory and the file name use the same patterns.
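To make the movie patterns more concrete, here is a small sketch that pulls a name and an optional year out of a directory or file name, trying the parent directory first. The regular expression and the helper names `parse_movie_name` and `candidates` are assumptions for illustration; the patterns in tw-video-scraper.py may be different, and the sketch returns the year separately rather than as part of the name.

```python
import os
import re

# Hypothetical pattern: a name, then a four-digit year (19xx or 20xx),
# optionally wrapped in (), [] or {}.
YEAR_PATTERN = re.compile(
    r"^(?P<name>.+?)[\s._]*[\(\[\{]?(?P<year>(?:19|20)\d{2})[\)\]\}]?"
    r"(?:[\s._]|$)")


def parse_movie_name(text):
    """Return (name, year); year is None when no year is found."""
    match = YEAR_PATTERN.search(text)
    if match:
        raw, year = match.group("name"), int(match.group("year"))
    else:
        raw, year = text, None      # fall back to the whole name
    return re.sub(r"[._]+", " ", raw).strip(), year


def candidates(path):
    """Names to try in order: parent directory first, then the file name."""
    parent = os.path.basename(os.path.dirname(path))
    stem = os.path.splitext(os.path.basename(path))[0]
    return [parse_movie_name(c) for c in (parent, stem) if c]


print(candidates("/movies/Foo.Bar.(2011)/part1.avi"))
```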

Configuration settings

A number of settings can be configured. Open the file tw-video-scraper.py, and change the settings at the top of the file. Documentation is included above each setting.

Series/movie cannot be found OR wrong series/movie is retrieved

If the series/movie cannot be found, or the wrong series/movie information is retrieved, there are some ways to deal with this. The starting point is the name of each video file. The script must be able to guess the name of the series or movie from just the file name. This is done with regular expressions. New regular expression patterns can be added by adjusting the settings.

Sometimes the file name is in a different format, e.g. foo.bar.2011.s01e01.aaa.avi, which includes the year of the series. The script sees the name as ‘foo bar 2011’ and will try to match that. If this cannot be found, no thumbnail will be retrieved. The same goes for movies: a folder name or file name like ‘foo.bar.hdtv.avi’ cannot be matched. One option is to rename the files. Another option is to make (mis)use of the database: add a new row to the database with the series name and the correct ID, and the script will then always use this ID.

[code]$ sqlite3 tw-video-scraper.db
>> insert into video (id,type,name) values (TVDB_ID,'serie','foo bar 2011');[/code]
or

[code]>> insert into video (id,type,name) values (MOVIEDB_ID,'movie','foo bar');
>> .exit[/code]

Windows users

This script was originally written for Linux operating systems, but it can work on Windows too. You have to make some modifications to the files and install some extra software.

  1. Install Python. The script has been tested with the 2.7.x and 3.2.x releases.
  2. If you also want thumbnails to be generated when no fan art can be found, install ffmpeg too.
  3. Download the script files, and extract the files python-tw-video-scraper.desc, python.location, and tw-video-scraper.py to a temporary location, e.g. your desktop.
  4. Open the file python.location and change it to the path where Python is installed, by default C:\Python27 or C:\Python32. This should be just the directory, NOT the whole path name including python.exe.
  5. Open the file python-tw-video-scraper.desc and change /usr/local/twonkymedia/cgi-bin/tw-video-scraper.py to "C:\Program Files (x86)\TwonkyMedia\cgi-bin\tw-video-scraper.py" (with quotes around it, because the path contains a space; replace the path if Twonky is installed somewhere else).
  6. Open the file tw-video-scraper.py and make some changes there:
    • Change the location of the database to c:\tmp\tw-video-scraper\tw-video-scraper.db. You can choose any directory you want, just make sure the script has write access.
    • Change the location of the tmpdir to c:\tmp\tw-video-scraper. Make sure not to end this path with \. Again, you can choose any directory.
    • Change the location of the generatecommand. Replace ffmpeg with the full path to ffmpeg, e.g. "C:\Program Files (x86)\ffmpeg\bin\ffmpeg.exe" (add quotes around it if the path contains spaces).
  7. Make sure Twonky is stopped.
  8. Copy the three files to C:\Program Files (x86)\TwonkyMedia\cgi-bin
  9. Restart Twonky.
  10. Browse using the Flash browser, or using a television that supports thumbnails.
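As a rough illustration of step 6, the two path settings might end up looking something like the fragment below. The dictionary name `settings` and the exact layout are assumptions; only the keys `database` and `tmpdir` and the example paths come from the steps above, so check the documentation above each setting in your own copy of tw-video-scraper.py.

```python
# Hypothetical excerpt of the settings block after editing for Windows.
settings = {
    # raw strings (r'...') keep the backslashes literal
    'database': r'c:\tmp\tw-video-scraper\tw-video-scraper.db',
    # no trailing backslash on this path
    'tmpdir': r'c:\tmp\tw-video-scraper',
}
```

Note that a Python raw string literal cannot end in a single backslash, so a value such as r'c:\tmp\' would be a syntax error.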

Future developments

If you can, please contribute by submitting bug reports, fixes, or new code through GitHub.

Licence & legal stuff

This code may be distributed and adapted under a BSD licence. See the file LICENSE for more information.

This product is not associated with TwonkyMedia or PacketVideo in any way.
TwonkyMedia server (c)2011 PacketVideo Corporation

This product uses the TheTVDB.com API. You can help support TheTVDB.com by contributing information and artwork.

This product uses the TMDb API but is not endorsed or certified by TMDb.

68 thoughts on “tw-video-scraper”

  1. Hi, I’m trying to get this working on Win7. I installed Python and ffmpeg and tried to run the following command, but got an error. Do you have a solution for this?

    C:\Program Files (x86)\Twonky\TwonkyServer\cgi-bin>python tw-video-scraper.py Y:\Serien\Serien\Chuck\chu-s01e01.avi C:\Temp\tw-video-scraper\asd.jpg
    File "tw-video-scraper.py", line 179
    print "Cachedir: "+match.group(1)
    ^
    SyntaxError: invalid syntax

      1. My mistake. There was some debug code in there which only worked on Python 2. Either get the new version from GitHub, or remove the three lines starting with 'print "' at the line where the error occurs.

        1. I’m running python version 3.2.2 but found another way to solve it. Just had to put brackets around the print argument 🙂 But thanks for the fast reply, never used python.

  2. For RHEL5 / CentOS5 I had to patch as follows

    $ cat 0001-Python-2.4.3-works-better-with-python-lxml.patch
    From 0d1b537bf8b4a2463788aa94ad282425832a737e Mon Sep 17 00:00:00 2001
    From: Patrick C. F. Ernzer
    Date: Sun, 29 Apr 2012 21:19:43 +0300
    Subject: [PATCH] - Python 2.4.3 works better with python-lxml
    - ideally do like https://github.com/tjfontaine/airprint-generate/blob/master/airprint-generate.py

    ---
    tw-video-scraper.py | 9 ++++++---
    1 files changed, 6 insertions(+), 3 deletions(-)

    diff --git a/tw-video-scraper.py b/tw-video-scraper.py
    index ccf2db0..ae81afb 100755
    --- a/tw-video-scraper.py
    +++ b/tw-video-scraper.py
    @@ -251,7 +251,8 @@ class Serie:
    if not self.id:
    apicall = URL('http://www.thetvdb.com/api/GetSeries.php?language='+Config['tvdblang']+'&seriesname='+self.name).open()
    if apicall:
    - from xml.etree.ElementTree import ElementTree
    + import lxml.etree as etree
    + from lxml.etree import ElementTree
    tree = ElementTree()
    tree.parse(apicall)
    for series in tree.findall('Series'):
    @@ -293,7 +294,8 @@ class Serie:
    def _getTVDBThumbnail(self):
    if self.id:
    if self._getTVDBzipfile():
    - from xml.etree.ElementTree import ElementTree
    + import lxml.etree as etree
    + from lxml.etree import ElementTree
    tree = ElementTree()
    tree.parse(Config['tmpdir']+self.id+'/'+Config['tvdblang']+'.xml')
    for episode in tree.findall('Episode'):
    @@ -346,7 +348,8 @@ class Movie:
    return self.thumbnail

    def _getMovieDBThumbnail(self, name, year = None):
    - from xml.etree.ElementTree import ElementTree
    + import lxml.etree as etree
    + from lxml.etree import ElementTree
    tree = ElementTree()

    match = False
    --
    1.7.4.1

  3. Hello
    For the file names Star.Wars.Episode.VI.Return.Of.The.Jedi.1988.1080p.avi and Zatoichi.2003.1080p.BDRip.x264.AC3.DTS.avi, the correct fan art is not retrieved. Do you have a solution for this?
    Python 2.7 on Synology NAS.

    1. The Zatoichi file works for me, and retrieves the correct movie poster. E.g. try on the command line “python tw-video-scraper.py Zatoichi.2003.1080p.BDRip.x264.AC3.DTS.avi test.jpg”

      Star Wars indeed does not work. First problem: the movie is from 1983, not 1988. But even then it does not work, because TheMovieDB lists it as “Star Wars: Episode VI – Return of the Jedi”. Just adding the – does not work either, because then it also expects a colon (:).
      So either change the file name, or manually add the correct id (1892) to the database as detailed above.

      There isn’t that much I can do to fix it. It depends a lot on the search functionality offered by TheMovieDB, and in this case it is just a bit weird.

    2. Does using this in an Arch Linux terminal require any extra settings? On Arch I get:
      ** (process:1432): WARNING **: Trying to regsiter gtype 'GMountMountFlags' as enum when in fact it is of type 'GFlags'
      ** (process:1432): WARNING **: Trying to regsiter gtype 'GDriveStartFlags' as enum when in fact it is of type 'GFlags'
      ** (process:1432): WARNING **: Trying to regsiter gtype 'GSocketMsgFlags' as enum when in fact it is of type 'GFlags'
      /usr/lib/python2.7/site-packages/gtk-2.0/gtk/__init__.py:57: GtkWarning: could not open display
      warnings.warn(str(e), _gtk.Warning)
      /usr/share/deepin-scrot/src/window.py:27: GtkWarning: IA__gdk_screen_get_root_window: assertion `GDK_IS_SCREEN (screen)' failed
      (screenWidth, screenHeight) = gtk.gdk.get_default_root_window().get_size()
      Traceback (most recent call last):
      File "./deepinScrot.py", line 25, in <module>
      from mainscrot import MainScrot
      File "/usr/share/deepin-scrot/src/mainscrot.py", line 28, in <module>
      from action import *
      File "/usr/share/deepin-scrot/src/action.py", line 25, in <module>
      from draw import *
      File "/usr/share/deepin-scrot/src/draw.py", line 32, in <module>
      from window import screenWidth, screenHeight
      File "/usr/share/deepin-scrot/src/window.py", line 27, in <module>
      (screenWidth, screenHeight) = gtk.gdk.get_default_root_window().get_size()
      AttributeError: 'NoneType' object has no attribute 'get_size'

  4. The poster for Zatoichi is indeed created, but it is the poster for a different film from 1989, not 2003. There are some more films with a similar problem, where the year is ignored.
    For Star Wars, the problem affects all films in the series.

    Could you give a more concrete example of adding entries to the database, using the files mentioned above?

    I am not a programmer and cannot make the necessary corrections myself, but I am nevertheless very grateful to you for such a convenient tool.

    1. Matching file names to movies is sometimes a bit tricky. Especially if there are more movies with the same name, or when the movie name contains characters that are not allowed in computer file names (such as / \ : )

      The basic procedure for extracting the name from a file is explained above. The script first tries to detect whether it is a series, by looking for s01e01 patterns. If it cannot find that, it assumes it is a movie.
      First the name of the parent directory is tried, and if that fails, the name of the file.
      If there is a year in the title, everything after the year is ignored. Otherwise the whole name is taken (excluding the file extension).
      Then the name is sanitised, so . and _ are replaced by a space.

      The resulting name is fed to the API, in the case of movies, TheMovieDB. This returns an XML file. It then tries to find the movie with the exact name.

      An example might be best.
      File is “Zatoichi.2003.1080p.BDRip.x264.AC3.DTS.avi”. A year is found, so everything after 2003 is ignored. ‘.’ is replaced by a space. This leaves “Zatoichi 2003”. Now the API is called. This is just a URL, in this example http://api.themoviedb.org/2.1/Movie.search/en/xml/a8b9f96dde091408a03cb4c78477bd14/Zatoichi%202003

      Because the result is just one movie, it assumes this is the correct movie. It then retrieves the image with the type ‘poster’ and size ‘cover’.

      Try the same API URL, but now omit the year. Now there are many results. The script must now go through all the results, and see if the name, original name, or alternative name matches the name that was used as input. In this case, the 1989 movie is the correct result.

      You can try the same thing with the Star Wars file: it finds two results, but none match the file name exactly. Hence, no image is retrieved. That http://api.themoviedb.org/2.1/Movie.search/en/xml/a8b9f96dde091408a03cb4c78477bd14/Star%20Wars%20Episode%20VI%20-%20Return%20of%20the%20Jedi does not give any results might just be a bug in the TheMovieDB API.

      In the end, everything depends on the TheMovieDB API. This is an open project, to which everyone can contribute, so sometimes things are not correct. When I saw your first comment, I tried the Zatoichi link, and I remember seeing the 1989 poster. Doing it again now gives the 2003 poster, so some user must have seen the same mistake and updated the information. Just delete the image, and it will be downloaded again.

  5. How can I use tw-video-scraper.db to work around this? One more problem: the film Romeo+Julieta. Searching the web for “Romeo Juliet 1996” gives the right film, but the scraper does not load a poster.

  6. Hi,

    I tried to make it work on a qnap turbonas without success.
    I first installed the Python QPKG, but I think I have not made the right changes in the files:
    in python-tw-video-scraper.desc:
    exe: python /mnt/HDA_ROOT/twonkymedia/cgi-bin/tw-video-scraper.py $infile $outfile
    in tw-video-scraper.py:
    'database': r'/var/twonkymedia/twonkymedia/tw-video-scraper.db',

    don’t see the db file when reboot twonkyserver 🙁
    Any help ?
    Thank’s

    1. Creating directories depends on permissions. I don’t know under which user twonky runs on a qnap, maybe not as root.

      First try to run the script without twonky. For example, set the path to the database to your home directory, and run the script directly from the terminal (python tw-video-scraper.py game.of.thrones.s01e01.avi test.jpg). This allows you to see if the script works, and if the database is being created.

      1. It’s finally working. The thumbnails are generated when I open the directories with my TV.
        I was too impatient.
        Thx

        1. Is the database working now? Although the script can run without a database, I would recommend using it. It makes the script faster, and limits the number of requests to thetvdb and moviedb. Besides the permissions, you also need the sqlite3 libraries for Python.

  7. Hi,

    I have successfully set up tw-video-scraper to download covers into a sqlite3 database using TwonkyServer Premium 7.0.9 on my QNAP TS-219P II (FW# 3.8.1).
    I have my movies organized in a folder structure like this:
    ./Multimedia/Movies/MovieTitle/MovieTitle.mpg
    Question is, how do I modify tw-video-scraper to save the covers not only in the database, but also as “Folder.jpg” in each movie folder?

    /Koks

    1. A new version is available on GitHub that can save the image locally. Look at the parameter ‘savelocal’ to configure the setting. By default, it is turned off.

  8. thanks for this, mini bugfix. the cmd to

    chmod 700 tw-video-scraper.py python-python-tw-video-scraper.desc python.location

    has “python-python-tw-video-scraper.desc” but should be “python-tw-video-scraper.desc”

  9. Hi,
    I have successfully installed this on a MyBookLive, but when I test the script in the cgi-bin folder I get an error ‘Error creating symlink folder’ – although I do get the correct test.jpg downloaded into the local folder.
    Can you please advise on what might be going wrong?
    On putting a trace, it looks like the script gets an error on line 116 of tw-video-scraper.py
    Regards.

  10. Your script is great, but after I updated my My Book Live to the latest firmware, 02.43.03, the script no longer works properly. No thumbnail is generated, only an empty file with the same name as the movie and the extension jpg.unsuported. There is nothing from the script in the Twonky log file. This is my output when I run the script manually:
    MyBookLive:/usr/local/twonkymedia-5/cgi-bin# python tw-video-scraper.py /nfs/Public/Shared Videos/300.mkv /images/300.jpg
    Error in making symbolic link folders
    Traceback (most recent call last):
    File "tw-video-scraper.py", line 797, in <module>
    main()
    File "tw-video-scraper.py", line 155, in main
    movie = Movie(sys.argv[1])
    File "tw-video-scraper.py", line 408, in __init__
    self._checkMovieDBConfiguration()
    File "tw-video-scraper.py", line 426, in _checkMovieDBConfiguration
    self._getMovieDBConfiguration()
    File "tw-video-scraper.py", line 429, in _getMovieDBConfiguration
    import json
    ImportError: No module named json

    When I disable the symbolic link option, the output is the same, except that this line disappears:
    Error in making symbolic link folders
    Thanks.

    1. The problem is not related to the script itself, but to the Python module ‘json’ missing. The current implementation of the script uses JSON, because TheTVDB started using that for their API.
      After a quick search, it appears that the My Book Live uses Python 2.5 by default. The json module was added in Python 2.6. So you need to upgrade to 2.6 or higher.

  11. Thank you for your reply. I upgraded Python to 2.7.3, but I still have a problem with the symbolic link. My actual output:
    MyBookLive:/usr/local/twonkymedia-5/cgi-bin# python2 tw-video-scraper.py "/nfs/Public/Shared Videos/300.mkv" /images/300.jpg
    Error in making symbolic link folders
    Retrieving JSON for http://api.themoviedb.org/3/search/movie?api_key=a8b9f96dde091408a03cb4c78477bd14&query=shared%20videos
    Retrieving JSON for http://api.themoviedb.org/3/search/movie?api_key=a8b9f96dde091408a03cb4c78477bd14&query=300
    Downloading file http://d3gtl9l2a4fn1j.cloudfront.net/t/p/w342/4AmPMxTs1zSdCK0eCacj0kBgOMV.jpg

    When I disable the symbolic link option in tw-video-scraper.py, thumbnails are generated successfully.

    1. Could you try making a small change to the script? This change will remove the try-catch error handling, showing the actual error.

      Open the script, and find the line "if Config['scaleoption'] == 'symlink':"
      Two lines below that is "try:". Replace this by "if True:"
      Some lines below that is "except:
      Console.error("Error in making symbolic link folders")"

      Comment out these two lines by placing a # in front of each line. Now run the script again to see the actual line that results in the problem.

  12. I changed the script to this:

    # make symlinks
    if True:
    .
    .
    .
    #except:
    #Console.error("Error in making symbolic link folders")

    Result:

    MyBookLive:/usr/local/twonkymedia-5/cgi-bin# python2 tw-video-scraper.py “/nfs/Public/Shared Videos/300.mkv” /images/300.jpg
    Traceback (most recent call last):
    File "tw-video-scraper.py", line 797, in <module>
    main()
    File "tw-video-scraper.py", line 135, in main
    cachedir = match.group(1)
    AttributeError: 'NoneType' object has no attribute 'group'

  13. This is incorrect:

    >> insert into video (id,type,name) values (MOVIEDB_ID,'movie','foo bar');

    It should be:

    >> insert into video (id,type,name,year) values (MOVIEDB_ID,'movie','foo bar',YYYY);

    Cheers,

  14. Is it possible to add two types of processing lists:

    1) Ignore folders for Online image get (process directly with ffmpeg)
    2) Ignore folders – always(No online get or ffmpeg generation)

    Because I have some folders with various videos which I want to be ignored for thumbnail generation, and some other folders for which I want thumbnails to be generated by ffmpeg.

    1. Done, check the new version on GitHub. The configuration is alwaysignore and alwaysgenerate. Both are lists. Items should be enclosed with single quotes and separated by commas. See the parentdir config for examples.

  15. Just want to thank you for your work, great tool, worked from the first try on a Buffalo LinkStation LS-XL!

    Cheers!

  16. A suggestion, the temp folder should be automatically created if not present.

    Another problem for me seems to be the ffmpeg, if I manually run the script the screenshot is generated, but on the TV no image is shown if no poster was found online..

  17. Hey, I have noticed that on occasion this website shows a 403 server error message. I figured you would be keen to know. Best wishes

  18. Neat piece of code – thanks, good work.
    However, I’m curious if it’s possible to trigger this or even other stuff.
    I saw in the plugins folder that the iTunes plugin has a command “launch on change”.

    I thought of triggering a sort of playlist showing me the last played movies from date xy, which is updated every time a movie or episode is watched, and I would like to trigger the thumbnail creation.
    Any Ideas?
    Greetings

    1. I cannot reproduce this. Do you have an example or a TV show/episode that is not working?

  19. Hi Daniel,

    I have downloaded your script.
    I use Python 2.7.
    When I try to test it I receive this output (debug enabled):

    python tw-video-scraper.py Unthinkable test.jpg
    Could not open/create database. Running without database…
    Could not parse image size from image folder name
    Retrieving JSON for https://api.themoviedb.org/3/configuration?api_key=a8b9f96dde091408a03cb4c78477bd14
    Could not connect to TheMovieDB server to retrieve configuration
    Retrieving JSON for http://api.themoviedb.org/3/search/movie?api_key=a8b9f96dde091408a03cb4c78477bd14&query=unthinkable
    Traceback (most recent call last):
    File "tw-video-scraper.py", line 840, in <module>
    main()
    File "tw-video-scraper.py", line 183, in main
    movie = Movie(sys.argv[1])
    File "tw-video-scraper.py", line 447, in __init__
    self._parseFileName()
    File "tw-video-scraper.py", line 605, in _parseFileName
    if self._getMovieDBThumbnail(self.name, self.year):
    File "tw-video-scraper.py", line 556, in _getMovieDBThumbnail
    self.thumbnail = self.base_url + self.poster_size + movie['poster_path']
    TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'

    Can you help me?
    Thank you

  20. I have a problem

    root@ubuntu:/usr/local/twonky/cgi-bin# python tw-video-scraper.py "/mnt/Raid/media/video/Serien/Unsere Mütter, unsere Väter/Season 1/Unsere Mütter, unsere Väter – S01E01 Folge 1.mp4" test.jpg
    Could not parse image size from image folder name
    Retrieving JSON for http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=unsere%20mütter,%20unsere%20väter%20Series%20Info%20site:thetvdb.com
    Traceback (most recent call last):
    File "tw-video-scraper.py", line 840, in <module>
    main()
    File "tw-video-scraper.py", line 178, in main
    serie = Serie(sys.argv[1])
    File "tw-video-scraper.py", line 303, in __init__
    self._parseFileName()
    File "tw-video-scraper.py", line 342, in _parseFileName
    if self._retrieveID():
    File "tw-video-scraper.py", line 376, in _retrieveID
    match = pattern.match(data['responseData']['results'][searchresult]['url'])
    IndexError: list index out of range
    root@ubuntu:/usr/local/twonky/cgi-bin# python --version
    Python 2.7.6

    Can you help me?
    Thank you

    1. Thanks for submitting this bug. The script didn’t handle unicode characters correctly. This has now been fixed. You can get the latest version from GitHub.

  22. Hi,

    Since Twonky 7.2.7 I have trouble making the script work.

    For some reason Twonky refuses to generate thumbnails.

    Any idea what is going on?

  23. Hi, I have installed Twonky server 8.0.3 on my D-Link DNS-320 and I am trying to use this script.
    When I run it from the shell it works fine, but it seems that Twonky server doesn’t call it when it generates thumbnails; in fact there are no files in my tmp dir.
    Is it possible to see whether Twonky runs this script and whether it generates any errors?
    Here are my files:

    python-tw-video-scraper.desc
    # change path below if needed
    exec: python3 /mnt/HD/HD_a2/ffp/opt/twonky/cgi-bin/tw-video-scraper.py $infile $outfile
    # capabilities
    from=video/*
    to=image/jpeg
    videothumb
    synchronous

    python.location
    /mnt/HD/HD_a2/ffp/bin

    tw-video-scraper.py
    settings = {
    # path and name of the sqlite3 database to use
    # use only a file name to use the database in
    # the current working directory
    # Leave empty to disable the database
    'database': r'',

    # Path where to download and extract the files
    # downloaded from TVDB. The script will check
    # if a file already exists, so it will not be
    # downloaded again.
    'tmpdir': r'/mnt/HD/HD_a2/ffp/opt/twonky/miniature/',

    # Log level
    # 1: ERROR only
    # 2: ERROR and WARNING
    # 3: ERROR, WARNING, and INFO
    # 4: ERROR, WARNING, INFO, and DEBUG
    ‘loglevel’: 4,
    # Different kind of messages are coloured
    ‘colouredoutput’: True,

    # List of name of the parent directories
    # Used for matching the directory name of the name
    # of a movie/serie. By knowing the parent dir,the
    # script knows if it should analyse the name
    # E.g. ~/download/my.movie.2011.avi vs. ~/download/My.Movie.2011/part1.avi
    'parentdir': ['/HD/HD_b2/Film','/HD/HD_a2/film'] (I also tried typing 'film','Film')

    Any suggestions?
    Thank you.
    Jacopo
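The comment on the 'parentdir' setting quoted above can be read as follows: when a video sits directly inside one of the listed parent directories, its file name is analysed; otherwise the name of its containing folder is analysed. A minimal sketch of that reading is shown here. The function name `name_to_analyse` is hypothetical, and matching on the bare folder name (rather than a full path) is an assumption, not the script's confirmed behaviour.

```python
import posixpath  # NAS paths in this thread are POSIX-style


def name_to_analyse(video_path, parent_dirs):
    """Pick the string to analyse for a video file (hypothetical helper).

    parent_dirs is assumed to hold bare folder names such as 'download'.
    """
    directory, filename = posixpath.split(video_path)
    folder = posixpath.basename(directory)
    if folder in parent_dirs:
        # File lies directly in a parent dir: analyse the file name itself.
        return posixpath.splitext(filename)[0]
    # File lies in its own sub-folder: analyse the folder name instead.
    return folder
```

With `parent_dirs = ['download']`, the two paths from the settings comment would give `my.movie.2011` and `My.Movie.2011` respectively.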

    1. Hi,
      Delete ttu-video-thumb.desc, ffmpeg-video-thumb.desc and jpeg-scale.desc from the /cgi-bin folder, and it will work again. Tested with Twonky 8.1.2. You can verify this in Twonky’s log file if you change the Twonky log level to 0 in twonkyserver.ini: search for the line “upnp_bgtrans_get_descfile : Added transcoder/scaler: /usr/bin/python”.

      1. Thank you for your reply 🙂
        I deleted the 3 files (ttu-video-thumb.desc, ffmpeg-video-thumb.desc and jpeg-scale.desc) from the /cgi-bin folder and changed the Twonky log level to 0 in twonkyserver.ini, but I don’t see any thumbnails (I deleted the Twonky cache and restarted it).
        Where is the Twonky log file? I can’t find it.
        Thank you.

        1. Go to the Twonky web page (port 9000), then “Settings” –> Advanced –> Logging, enable logging, and use the “View log file” button (Twonky 8.1.2). Otherwise the log is in the “tmp” or maybe “ffp/tmp/” folder of your NAS. Sorry, I had a DNS-320 but I sold it, so I don’t know the correct path to tmp.

          1. Thank you again 🙂
            Now I can see the log file, but I can’t find the “upnp_bgtrans_get_descfile : Added transcoder/scaler: /usr/bin/python” line. What does that mean? Does the script not run at all?
            I updated my Twonky to 8.1.2.
            Thank you.
            Jacopo
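For anyone checking this from a shell instead of the web page, a minimal sketch that scans a log file for the transcoder line follows. Only the search string comes from the reply above; the function name `find_transcoder_lines` is hypothetical, and the location of the log file depends on the device and is not known here.

```python
import io

# Text quoted in the reply above; the part after the colon is the interpreter path.
MARKER = "upnp_bgtrans_get_descfile : Added transcoder/scaler:"


def find_transcoder_lines(log_path):
    """Return every line of the log file that contains MARKER."""
    matches = []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with io.open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if MARKER in line:
                matches.append(line.rstrip("\n"))
    return matches
```

An empty result would suggest that Twonky never registered the .desc file, or that debug logging is not active.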

  25. Finally I have managed to borrow a DNS-320, and I tried the script.
    First: unfortunately, if you use “-appdata” there will be no debug log, and that’s why you cannot find the “upnp_bgtrans_get_descfile : Added transcoder/scaler: /usr/bin/python” line.
    Second: I tried to run the script from the shell, but it does not work for me. It always fails with the error “AttributeError: ‘module’ object has no attribute ‘quote_plus’”. Do you have Python 3 on the DNS-320?
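One common cause of this kind of AttributeError is a Python 2 / Python 3 mismatch: `quote_plus` is a function of `urllib` in Python 2 but lives in `urllib.parse` in Python 3. Whether that is what happens inside tw-video-scraper on this device is an assumption. A version-independent import could look like this sketch:

```python
try:
    # Python 3: quote_plus lives in urllib.parse
    from urllib.parse import quote_plus
except ImportError:
    # Python 2: quote_plus is a top-level function of urllib
    from urllib import quote_plus

# Spaces become '+', which is what a search query string needs.
print(quote_plus("Person Of Interest"))  # prints Person+Of+Interest
```

Running the script explicitly with `python` and then with `python3` from the shell would show which interpreter it expects.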

  26. Good evening, I am trying to get thumbnails for TV series but can’t get the script to match the series name with the TVDB website.

    I tried many different file names:

    Suits season 1 episode 01.mkv
    Suits s01e01.mkv
    Suits 1x01.mkv

    But I always get the following error:
    python /usr/local/twonky/cgi-bin/tw-video-scraper.py /srv/fileserver/video/TV\ Serien/Suits/Suits\ season\ 1\ episode\ 01.mkv /tmp/abc.jpg

    Could not parse image size from image folder name
    Could not get enough information from file name ‘Suits season 1 episode 01.mkv’

    I checked the scraper database in /var/twonky/ and the series ID was correctly identifed and stored in the DB.

    What am I doing wrong here?

    The script works like a charm for movies.

    1. Sorry for the (very) late reply. There was probably a problem creating certain directories. I’ve pushed a new version to GitHub that should provide better handling and logging if something goes wrong.
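To illustrate what "getting enough information from the file name" involves for the three naming styles tried above, here is a minimal regex sketch. It is not the parser used by tw-video-scraper; the patterns and the function name `parse_episode` are hypothetical.

```python
import re

SEP = r"[ ._-]+"  # separators commonly found in release names

# Hypothetical patterns, tried in order; not taken from tw-video-scraper.
PATTERNS = [
    # "Suits s01e01"
    re.compile(r"^(?P<name>.+?)" + SEP + r"s(?P<season>\d{1,2})e(?P<episode>\d{1,3})", re.I),
    # "Suits 1x01"
    re.compile(r"^(?P<name>.+?)" + SEP + r"(?P<season>\d{1,2})x(?P<episode>\d{1,3})", re.I),
    # "Suits season 1 episode 01"
    re.compile(r"^(?P<name>.+?)" + SEP + r"season[ ._-]*(?P<season>\d{1,2})"
               r"[ ._-]*episode[ ._-]*(?P<episode>\d{1,3})", re.I),
]


def parse_episode(filename):
    """Return (series name, season, episode), or None if nothing matches."""
    for pattern in PATTERNS:
        match = pattern.match(filename)
        if match:
            # Dots and underscores in the name part stand in for spaces.
            name = re.sub(r"[._]+", " ", match.group("name")).strip()
            return name, int(match.group("season")), int(match.group("episode"))
    return None
```

All three file names from the comment above yield `('Suits', 1, 1)` with these patterns, while a plain movie name such as `my.movie.2011.avi` yields `None`.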

  27. Hi, I need to understand what is going wrong. I am trying to install this on a QNAP TS-212. Where does the scraper put its logs?

    1. I mean, I am getting a whole bunch of “unsupported” in cache/images/ subdirs

      For example:

      /share/MD0_DATA/.@twonkymediadb/twonkymedia/db/cache/images/00510d7bdc5c59e374cad542113a6172/160x160:
      Person.Of.Interest.4x17.Karma.ENGLiSH.DLRip.XviD-Pir8.avi.jpg.unsupported

      1. The script itself does not output logs to a file. However, you can see the logs in the console if you run the command manually. You can change the loglevel setting at the top of the script to see even more.

        The script itself does not put .unsupported in the file name. It must be the program or script calling this script that requests a file name with .unsupported. I do not use Twonky (or this script) anymore, so I don’t know why this is happening. If I can find the time, I’ll investigate with the newest version of Twonky.

  28. Google has shut down ajax.googleapis.com.

    I think this is why the thetvdb part is not working for me; themoviedb is working OK.

    Is there an update or workaround, Daniel?

    1. A new version is now live. It uses a custom Google search engine configuration, which is limited to 100 queries per day (shared by all users of the script). If you use this script a lot, it might be a good idea to create your own custom search engine.

      First, create your own custom search engine at cse.google.com. It must be a search engine that only searches the domain thetvdb.com. Note the search engine ID, then edit the settings of the script and set this ID as ‘cse_search_id’. Next, go to console.developers.google.com and create a new API browser key credential.
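As a rough illustration of how a search engine ID and an API key are combined, the sketch below builds a request URL for Google's Custom Search JSON API. The endpoint and the `key`, `cx` and `q` parameters are part of that API; the function name `build_cse_url` and the `api_key` argument name are hypothetical, and this is not necessarily how the script issues its query.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_url(query, cse_search_id, api_key):
    """Build a Custom Search JSON API URL (hypothetical helper)."""
    params = [
        ("key", api_key),       # API key created in the developer console
        ("cx", cse_search_id),  # search engine ID noted at cse.google.com
        ("q", query),           # search terms, e.g. a series name
    ]
    return CSE_ENDPOINT + "?" + urlencode(params)
```

Opening the resulting URL in a browser is a quick way to check that the ID and the key work before putting them into the script's settings.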
