For download and source code, see the GitHub page.

tw-video-scraper is a small Python script that retrieves a fan art thumbnail for a video file. It is designed to run with TwonkyMedia server, but may be of use to others as well.


TwonkyMedia server is a DLNA server that runs on many different devices. Using ffmpeg, Twonky can generate thumbnails for video files. A guide to setting up Twonky can be found here.

One of the problems of using ffmpeg is that it takes a snapshot of the video file at a fixed time, for example at 30 seconds. For both movies and series this can be a problem. If there are opening credits, the thumbnail will display that. This might not be a very meaningful thumbnail for the video file.

Fan art is available online for many movies and series. Such an image is usually a more meaningful thumbnail for a specific movie or episode. tw-video-scraper tries to get the fan art thumbnail for a video file. If none can be found, ffmpeg is used to generate a thumbnail instead.

How to use

tw-video-scraper is a Python script that takes two arguments: the name of the video file and the path of the output image file.

python /path/to/cool.serie.s01e04.htdv.avi /images/abc.jpg

To use it with Twonky, save the script, python-tw-video-scraper.desc, and python.location in Twonky’s cgi-bin directory. Make all files executable:

chmod 700 python-tw-video-scraper.desc python.location

If needed, change the values in python.location, the script, and python-tw-video-scraper.desc. The file ffmpeg-video-thumb.desc is no longer needed: either comment out all its lines or delete the file. If you want to regenerate all thumbnails, delete everything from the cache folder in the Twonky directory (e.g. /var/twonkymedia/twonkymedia/db/cache). Then restart Twonky, and the script will be used.

At a minimum, jpeg-scale.desc and the cgi-jpegscale plugin must also be enabled.

How it works

Thumbnails for series are retrieved from TheTVDB. For movies, TheMovieDB is used. For each file, the script tries to guess the name and year of the movie, or the name of the series, the season, and the episode number. This is done with regular expression patterns.


First, it tries to determine if the file is a series. It is a series if the file name matches one of the following conditions:

  • The file name contains S01E01 (or any other one- or two-digit numbers)
  • The file name contains 1×01 (or any other one- or two-digit numbers)
  • The file name contains ‘season 1 – episode 01’ (or any other one- or two-digit numbers; the hyphen can be omitted)
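
As a rough illustration of the three conditions above, the detection step could be sketched as follows. The pattern list and the function name `find_episode` are assumptions made for this example; the expressions actually used by the script may differ.

```python
import re

# Hypothetical patterns, one per condition listed above.
SERIES_PATTERNS = [
    re.compile(r"s(\d{1,2})e(\d{1,2})", re.IGNORECASE),        # S01E01
    re.compile(r"(\d{1,2})[x×](\d{1,2})", re.IGNORECASE),      # 1x01 or 1×01
    re.compile(r"season\s*(\d{1,2})\s*[-–]?\s*episode\s*(\d{1,2})",
               re.IGNORECASE),                                 # season 1 - episode 01
]


def find_episode(file_name):
    """Return (season, episode, match_start), or None if no pattern matches."""
    for pattern in SERIES_PATTERNS:
        match = pattern.search(file_name)
        if match:
            return int(match.group(1)), int(match.group(2)), match.start()
    return None


print(find_episode("cool.serie.s01e04.htdv.avi"))
print(find_episode("Zatoichi.2003.1080p.BDRip.x264.AC3.DTS.avi"))
```

The returned start position marks where the series name ends. Note that a simple pattern such as the second one can also match unrelated text (a resolution like 1920x1080, for instance), so real patterns likely need to be stricter.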

Anything immediately before the S01E01 or 1×01 is treated as the name of the series. If the last character is a hyphen (-), it is removed first. Periods (.) and underscores (_) are replaced by spaces. With this name, the script uses the TheTVDB API to find the correct series. A series is only found if the name matches the series name exactly. A colon (:) in the series name is removed, and names are compared case-insensitively.
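
The clean-up and comparison rules in this paragraph can be sketched like this. Both function names are hypothetical, the order of the clean-up steps is one possible reading of the text, and collapsing repeated spaces is an extra assumption that the text does not mention.

```python
def series_name_from_prefix(prefix):
    """Clean the part of the file name that precedes the episode marker."""
    # Periods and underscores become spaces.
    name = prefix.replace(".", " ").replace("_", " ").strip()
    # A trailing hyphen (as in "Foo Bar - s01e01") is dropped.
    if name.endswith("-"):
        name = name[:-1].strip()
    return name


def names_match(candidate, wanted):
    """Exact comparison that ignores case and colons."""
    def normalise(value):
        # Collapsing whitespace is an assumption, not taken from the text.
        return " ".join(value.replace(":", "").lower().split())
    return normalise(candidate) == normalise(wanted)


print(series_name_from_prefix("cool.serie."))
print(names_match("Star Trek: Voyager", "star trek voyager"))
```

Because the comparison is exact, a name such as ‘foo bar 2011’ does not match a series listed as ‘foo bar’.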

For each series, the script downloads a zip file from TheTVDB and extracts it. This file contains information about each episode. The script tries to match the season and episode numbers taken from the file name against the information in the downloaded file. When the correct episode has been found, the thumbnail image is saved to the output path given as the second argument when starting the script.

For each series, the script has to contact TheTVDB to get the series ID. Once found, the ID is stored in a database. The database is checked for the series ID before TheTVDB is contacted. This improves speed and reduces the number of API calls.
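
A minimal sketch of this "database first, API second" lookup is shown below. The table name `video` and its columns `id`, `type`, and `name` follow the insert statements shown further down this page; the column types, the function name `get_series_id`, and the `lookup_remote` callable that stands in for the API request are assumptions.

```python
import sqlite3


def get_series_id(conn, name, lookup_remote):
    """Return the stored ID for `name`; call `lookup_remote` only on a miss."""
    row = conn.execute(
        "SELECT id FROM video WHERE type = 'serie' AND name = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    # Hypothetical callable that wraps the actual API request.
    series_id = lookup_remote(name)
    if series_id is not None:
        conn.execute(
            "INSERT INTO video (id, type, name) VALUES (?, 'serie', ?)",
            (series_id, name),
        )
        conn.commit()
    return series_id


# Demonstration with an in-memory database and a fake remote lookup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE video (id INTEGER, type TEXT, name TEXT)")
calls = []


def fake_lookup(name):
    calls.append(name)
    return 12345


first = get_series_id(conn, "foo bar", fake_lookup)
second = get_series_id(conn, "foo bar", fake_lookup)
print(first, second, len(calls))
```

The second call is answered from the database, so the fake remote lookup runs only once.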

The files downloaded from TheTVDB are saved. If a file already exists, it is not downloaded again. This also improves speed and reduces the load on the server. If the file is more than 6 days old, it is always retrieved again, to make sure new episodes are included.
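
The six-day rule can be expressed as a small freshness check. This is only a sketch: the function name `needs_download` is made up for the example, and it assumes the age of a cached file is judged by its modification time.

```python
import os
import time

MAX_AGE_SECONDS = 6 * 24 * 60 * 60  # six days, as described above


def needs_download(path, now=None):
    """True if the cached file is missing or more than six days old."""
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > MAX_AGE_SECONDS
```

The optional `now` argument exists only to make the check easy to try out with a simulated date.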


If the script determines that a file is not a series, it tries to download the movie poster from TheMovieDB. Movies are also matched using regular expressions. The following patterns are recognised:

  • The file name has a year in it, surrounded by (), [], or {}, or not surrounded at all
  • The whole file name

First, the name of the parent directory is tried. If no movie can be found, the file name itself is used. Both the directory and the file name use the same patterns.
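
The movie matching described above could be sketched as follows. The year pattern (limited here to 1900-2099) and both function names are assumptions for illustration, not the script's actual code.

```python
import os
import re

# A year, optionally wrapped in (), [] or {}.
YEAR_PATTERN = re.compile(r"[(\[{]?\b((?:19|20)\d{2})\b[)\]}]?")


def parse_movie_name(text):
    """Return (name, year); year is None when the text contains no year."""
    cleaned = text.replace(".", " ").replace("_", " ")
    match = YEAR_PATTERN.search(cleaned)
    if match:
        # Everything from the year onwards is ignored.
        return cleaned[:match.start()].strip(), int(match.group(1))
    return cleaned.strip(), None


def movie_candidates(path):
    """Yield (name, year) for the parent directory first, then the file name."""
    parent = os.path.basename(os.path.dirname(path))
    stem = os.path.splitext(os.path.basename(path))[0]
    for text in (parent, stem):
        if text:
            yield parse_movie_name(text)


print(parse_movie_name("Zatoichi.2003.1080p.BDRip.x264.AC3.DTS"))
print(list(movie_candidates("/videos/Foo Bar (2011)/foo.bar.avi")))
```

A caller would try each candidate in turn and stop at the first one for which a movie is found. A title that itself starts with a year would defeat this simple pattern.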

Configuration settings

A number of settings can be configured. Open the script file and change the settings at the top of the file. Documentation is included above each setting.

Series/movie cannot be found, or the wrong series/movie is retrieved

If the series/movie cannot be found, or the wrong series/movie information is retrieved, there are some ways to deal with this. The starting point is the name of each video file. The script must be able to guess the name of the series or movie from just the file name. This is done with regular expressions. New regular expression patterns can be added by adjusting the settings.

Sometimes the file name is in a different format, e.g. one that includes the year of the series. The script then sees the name as ‘foo bar 2011’ and tries to match that. If this cannot be found, no thumbnail is retrieved. The same goes for movies whose folder name or file name cannot be matched. One option is to rename the files. Another option is to make (mis)use of the database: add a new line to the database with the series name and the correct ID, and the script will then always use this ID.

$ sqlite3 tw-video-scraper.db
>> insert into video (id,type,name) values (TVDB_ID,'serie','foo bar 2011');


>> insert into video (id,type,name) values (MOVIEDB_ID,'movie','foo bar');
>> .exit

Windows users

This script was originally written for Linux operating systems, but it can work on Windows too. You have to make some modifications to the files and install some extra software.

  1. Install Python. The script is tested with the 2.7.x and 3.2.x releases.
  2. If you also want thumbnails to be generated when no fan art can be found, install ffmpeg too.
  3. Download the script files, and extract python-tw-video-scraper.desc, python.location, and the script itself to a temporary location, e.g. your desktop.
  4. Open the file python.location and change it to the path where Python is installed, by default C:\Python27 or C:\Python32. This should be just the directory, NOT the full path including python.exe.
  5. Open the file python-tw-video-scraper.desc and change /usr/local/twonkymedia/cgi-bin/ to "C:\Program Files (x86)\TwonkyMedia\cgi-bin\" (with quotes around it, because the path contains a space; replace the path if Twonky is installed somewhere else).
  6. Open the script file and make some changes there:
    • Change the location of the database to c:\tmp\tw-video-scraper\tw-video-scraper.db. You can choose any directory you want, just make sure the script has write access.
    • Change the location of the tmpdir to c:\tmp\tw-video-scraper. Make sure not to end this path with \. Again, you can choose any directory.
    • Change the location of the generatecommand. Replace ffmpeg with the full path to ffmpeg, e.g. "C:\Program Files (x86)\ffmpeg\bin\ffmpeg.exe" (add quotes around it if the path contains spaces).
  7. Make sure Twonky is stopped.
  8. Copy the three files to C:\Program Files (x86)\TwonkyMedia\cgi-bin
  9. Restart Twonky.
  10. Browse using the Flash browser, or using a television that supports thumbnails.
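
For step 6, the changed settings might look roughly like the excerpt below. This is a hypothetical fragment: only the key names `database` and `tmpdir` and the raw-string style are taken from this page, and the real settings block contains more entries (including generatecommand, whose exact format is not shown here).

```python
# Hypothetical excerpt of the settings at the top of the script.
Config = {
    # Any directory works, as long as the script has write access to it.
    'database': r'c:\tmp\tw-video-scraper\tw-video-scraper.db',
    # No trailing backslash at the end of the path.
    'tmpdir': r'c:\tmp\tw-video-scraper',
}
```

One practical reason to avoid a trailing backslash: a Python raw string literal cannot end with a single backslash, so a value such as r'c:\tmp\' is a syntax error.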

Future developments

If you can, please contribute by submitting bug reports, fixes, or new code through GitHub.

Licence & legal stuff

This code may be distributed and adapted under a BSD licence. See the file LICENSE for more information.

This product is not associated with TwonkyMedia or PacketVideo in any way.
TwonkyMedia server (c)2011 PacketVideo Corporation

This product uses the TheTVDB API. You can help support TheTVDB by contributing information and artwork.

This product uses the TMDb API but is not endorsed or certified by TMDb.

34 thoughts on “tw-video-scraper”

  1. Hi, I’m trying to get this working on Win7. I installed Python and ffmpeg and tried to run the following command, but got an error. Do you have a solution for this?

    C:\Program Files (x86)\Twonky\TwonkyServer\cgi-bin>python Y:\Serien\Serien\Chuck\chu-s01e01.avi C:\Temp\tw-video-scraper\asd.jpg
    File “”, line 179
    print “Cachedir: “
    SyntaxError: invalid syntax

      1. My mistake. There was some debug code in there which only worked on Python 2. Either get the new version from GitHub, or remove the three lines starting with ‘print “‘ at the line where the error occurs.

        1. I’m running python version 3.2.2 but found another way to solve it. Just had to put brackets around the print argument :) But thanks for the fast reply, never used python.

  2. For RHEL5 / CentOS5 I had to patch as follows

    $ cat 0001-Python-2.4.3-works-better-with-python-lxml.patch 
    From 0d1b537bf8b4a2463788aa94ad282425832a737e Mon Sep 17 00:00:00 2001
    From: Patrick C. F. Ernzer 
    Date: Sun, 29 Apr 2012 21:19:43 +0300
    Subject: [PATCH] - Python 2.4.3 works better with python-lxml
     - ideally do like
    --- |    9 ++++++---
     1 files changed, 6 insertions(+), 3 deletions(-)
    diff --git a/ b/
    index ccf2db0..ae81afb 100755
    --- a/
    +++ b/
    @@ -251,7 +251,8 @@ class Serie:
     		if not
     			apicall = URL(''+Config['tvdblang']+'&seriesname='
     			if apicall:
    -				from xml.etree.ElementTree import ElementTree
    +				import lxml.etree as etree
    +				from lxml.etree import ElementTree
     				tree = ElementTree()
     				for series in tree.findall('Series'):
    @@ -293,7 +294,8 @@ class Serie:
     	def _getTVDBThumbnail(self):
     			if self._getTVDBzipfile():
    -				from xml.etree.ElementTree import ElementTree
    +				import lxml.etree as etree
    +				from lxml.etree import ElementTree
     				tree = ElementTree()
     				for episode in tree.findall('Episode'):
    @@ -346,7 +348,8 @@ class Movie:
     		return self.thumbnail
     	def _getMovieDBThumbnail(self, name, year = None):
    -		from xml.etree.ElementTree import ElementTree
    +		import lxml.etree as etree
    +		from lxml.etree import ElementTree
     		tree = ElementTree()
     		match = False
  3. Hello
    For the file names Star.Wars.Episode.VI.Return.Of.The.Jedi.1988.1080p.avi and Zatoichi.2003.1080p.BDRip.x264.AC3.DTS.avi the correct fan art is not retrieved. Do you have a solution for this?
    Python 2.7 on Synology NAS.

    1. The Zatoichi file works for me, and retrieves the correct movie poster. E.g. try on the command line “python Zatoichi.2003.1080p.BDRip.x264.AC3.DTS.avi test.jpg”

      Star Wars indeed does not work. The first problem is that the movie is from 1983, not 1988. But even then it does not work, because TheMovieDB lists it as “Star Wars: Episode VI – Return of the Jedi”. Just adding the – does not work either, because then it also expects a :
      So either change the file name, or manually add the correct id (1892) to the database as detailed above.

      There isn’t that much I can do to fix it. It depends a lot on the search functionality offered by TheMovieDB, and in this case it is just a bit weird.

  4. A poster for Zatoichi is indeed created, but it is the poster for a different film from 1989, not 2003. Some other films have a similar problem, where the year is ignored.
    With Star Wars the problem affects all films of the series.

    Could you give a more concrete example of creating the mappings in the db, using the files mentioned above?

    I am not a programmer and cannot make the necessary corrections myself, but I am nevertheless very grateful to you for such a convenient tool.

    1. Matching file names to movies is sometimes a bit tricky. Especially if there are more movies with the same name, or when the movie name contains characters that are not allowed in computer file names (such as / \ : )

      The basic procedure for extracting the name from a file is explained above. The script first tries to detect whether it is a series, by looking for s01e01 patterns. If it cannot find one, it assumes the file is a movie.
      First the name of the parent directory is tried, and if that fails, the name of the file.
      If there is a year in the title, everything after the year is ignored. Otherwise the whole name is taken (excluding the file extension).
      Then the name is sanitised, so . and _ are replaced by a space.

      The resulting name is fed to the API, in the case of movies, TheMovieDB. This returns an XML file. It then tries to find the movie with the exact name.

      An example might be best.
      File is “Zatoichi.2003.1080p.BDRip.x264.AC3.DTS.avi”. A year is found, so everything after 2003 is ignored. ‘.’ is replaced by a space. This leaves “Zatoichi 2003”. Now the API is called. This is just a URL, in this example

      Because the result is just one movie, it assumes this is the correct movie. It then retrieves the image with the type ‘poster’ and size ‘cover’.

      Try the same API URL, but now omit the year. Now there are many results. The script must go through all the results and see whether the name, original name, or alternative name matches the name that was used as input. In this case, the 1989 movie is the correct result.

      You can try the same thing with the Star Wars file. It finds two results, but none match the file name exactly. Hence, no image is retrieved. That does not give any results, which might just be a bug in the TheMovieDB API.

      In the end, everything depends on the TheMovieDB API. This is an open project, to which everyone can contribute, so sometimes things are not correct. When I saw your first comment, I tried the Zatoichi link, and I remember seeing the 1989 poster. Doing it again now gives the 2003 poster, so some user must have seen the same mistake and updated the information. Just delete the image, and it will be downloaded again.

  5. How can I use tw-video-scraper.db to work around it? One more problem: the film Romeo+Julieta. Searching the web for “Romeo Juliet 1996” gives the right film, but the scraper does not load a poster.

  6. Hi,

    I tried to make it work on a qnap turbonas without success.
    I first installed the Python QPKG, but I think I haven’t made the right changes in the files:
    in python-tw-video-scraper.desc :
    exe: python /mnt/HDA_ROOT/twonkymedia/cgi-bin/ $infile $outfile
    in :
    ‘database’: r’/var/twonkymedia/twonkymedia/tw-video-scraper.db’,

    don’t see the db file when reboot twonkyserver :(
    Any help ?

    1. Creating directories depends on permissions. I don’t know under which user twonky runs on a qnap, maybe not as root.

      First try to run the script without twonky. For example, set the path to the database to your home directory, and run the script directly from the terminal (python game.of.thrones.s01e01.avi test.jpg). This allows you to see if the script works, and if the database is being created.

        1. Is the database working now? Although the script can run without a database, I would recommend using it. It makes the script faster, and limits the number of requests to thetvdb and moviedb. Besides the permissions, you also need the sqlite3 libraries for Python.

  7. Hi,

    I have successfully set up tw-video-scraper to download covers into a sqlite3 database using TwonkyServer Premium 7.0.9 on my QNAP TS-219P II (FW# 3.8.1).
    I have my movies organized in a folder structure like this:
    Question is, how do I modify tw-video-scraper to save the covers not only in the database, but also as “Folder.jpg” in each movie folder?


    1. A new version is available on GitHub that can save the image locally. Look at the parameter ‘savelocal’ to configure the setting. By default, it is turned off.

  8. thanks for this, mini bugfix. the cmd to

    chmod 700 python-python-tw-video-scraper.desc python.location

    has “python-python-tw-video-scraper.desc” but should be “python-tw-video-scraper.desc”

  9. Hi,
    I have successfully installed this on a MyBookLive but when I test the script in the cgi-bin folder I get an error ‘Error creating symlink folder’ – although I do get the correct test.jpg downloaded into the local folder.
    Can you please advise on what might be going wrong?
    On putting a trace, it looks like the script gets an error on line 116 of

  10. Your script is great, but after I updated my My Book Live to the latest firmware 02.43.03 the script is not working properly. No thumbnail is generated, only an empty file with the same name as the movie and the extension jpg.unsuported. The Twonky log file contains nothing from the script. This is the output when I run the script manually:
    MyBookLive:/usr/local/twonkymedia-5/cgi-bin# python /nfs/Public/Shared Videos/300.mkv /images/300.jpg
    Error in making symbolic link folders
    Traceback (most recent call last):
    File “”, line 797, in
    File “”, line 155, in main
    movie = Movie(sys.argv[1])
    File “”, line 408, in __init__
    File “”, line 426, in _checkMovieDBConfiguration
    File “”, line 429, in _getMovieDBConfiguration
    import json
    ImportError: No module named json

    When I disable symbolic links, the output is the same, except that this line disappears:
    Error in making symbolic link folders

    1. The problem is not related to the script itself, but to the Python module ‘json’ missing. The current implementation of the script uses JSON, because TheTVDB started using that for their API.
      After a quick search, it appears that the My Book Live uses Python 2.5 by default. The json module was added in Python 2.6. So you need to upgrade to 2.6 or higher.

  11. Thank you for your reply. I upgraded Python to 2.7.3, but I still have problem with symbolic link. My actual output:
    MyBookLive:/usr/local/twonkymedia-5/cgi-bin# python2 “/nfs/Public/Shared Videos/300.mkv” /images/300.jpg
    Error in making symbolic link folders
    Retrieving JSON for
    Retrieving JSON for
    Downloading file

    When I disable symbolic links in the script, thumbnails are generated successfully.

    1. Could you try making a small change to the script? This change will remove the try-catch error handling, showing the actual error.

      Open the script, and find the line “if Config['scaleoption'] == ‘symlink’:”
      Two lines below that is “try:”. Replace this by “if True:”
      Some lines below that is “except:
      Console.error(“Error in making symbolic link folders”)”

      Comment out these two lines by placing a # in front of each line. Now run the script again to see the actual line that results in the problem.

  12. I changed the script to this:

    # make symlinks
    if True:
    #Console.error(“Error in making symbolic link folders”)


    MyBookLive:/usr/local/twonkymedia-5/cgi-bin# python2 “/nfs/Public/Shared Videos/300.mkv” /images/300.jpg
    Traceback (most recent call last):
    File “”, line 797, in
    File “”, line 135, in main
    cachedir =
    AttributeError: ‘NoneType’ object has no attribute ‘group’

  13. This is incorrect:

    >> insert into video (id,type,name) values (MOVIEDB_ID,'movie','foo bar');

    It should be:

    >> insert into video (id,type,name,year) values (MOVIEDB_ID,'movie','foo bar',YYYY);


  14. Is it possible to add two types of processing lists:

    1) Ignore folders for Online image get (process directly with ffmpeg)
    2) Ignore folders – always(No online get or ffmpeg generation)

    Because i have some folders with various videos which i want to be ignored for thumbnail generation. And some other folders which i want to be generated by ffmpeg.

    1. Done, check the new version on GitHub. The configuration is alwaysignore and alwaysgenerate. Both are lists. Items should be enclosed with single quotes and separated by commas. See the parentdir config for examples.

  15. Just want to thank you for your work, great tool, worked from the first try on a Buffalo LinkStation LS-XL!


  16. A suggestion, the temp folder should be automatically created if not present.

    Another problem for me seems to be ffmpeg: if I run the script manually the screenshot is generated, but on the TV no image is shown if no poster was found online.

  17. Neat piece of code, thanks, good work.
    However, I’m curious whether it’s possible to trigger this or even other stuff.
    I saw in the plugins folder the itunes plugin has a command “launch on change”.

    I thought of triggering a sort of playlist showing me the last played movies from date xy, which is updated every time a movie or episode is watched, and I would also like to trigger the thumbnail creation.
    Any Ideas?

  18. Hi Daniel,

    I have downloaded your script.
    I use python 2.7
    When i try to test it i receive this output (debug enabled):

    python Unthinkable test.jpg
    Could not open/create database. Running without database…
    Could not parse image size from image folder name
    Retrieving JSON for
    Could not connect to TheMovieDB server to retrieve configuration
    Retrieving JSON for
    Traceback (most recent call last):
    File “”, line 840, in
    File “”, line 183, in main
    movie = Movie(sys.argv[1])
    File “”, line 447, in __init__
    File “”, line 605, in _parseFileName
    if self._getMovieDBThumbnail(, self.year):
    File “”, line 556, in _getMovieDBThumbnail
    self.thumbnail = self.base_url + self.poster_size + movie['poster_path']
    TypeError: unsupported operand type(s) for +: ‘NoneType’ and ‘NoneType’

    Can you help me?
    Thank you
