My Samsung TV is pretty good at playing videos that sit on my NAS, but it is not very good at recognizing various subtitle formats. Luckily, there are free .SRT files available from various places (like http://www.suby.com, or http://www.opensubtitles.org, or http://www.ondertitel.com) and usually it is enough to put one in the same directory as the movie file, give it the same name as the movie file (with the extension .SRT), and play.
But a while ago I copied a DVD straight to my hard drive. It was in English, and I wanted to add some Dutch subtitles, but the DVD consists of a VIDEO_TS directory containing more than one .VOB file:
VIDEO_TS.BUP VIDEO_TS.IFO VTS_01_0.BUP VTS_01_0.IFO VTS_01_1.VOB VTS_01_2.VOB VTS_01_3.VOB VTS_01_4.VOB VTS_01_5.VOB
Playing it was no problem: I just pointed my Samsung TV to the first .VOB file, VTS_01_1.VOB, and it started playing; and it was even nice enough to automatically start the next movie files, in order, so I could watch the entire movie without touching anything. But what name should the SRT file have?
I wasn’t sure how smart my Samsung TV was, so I tried the following:
- Call the file VIDEO_TS.SRT and put it in the directory that contains the VIDEO_TS directory. This didn’t work: no subtitles were shown.
- Call the file VIDEO_TS.SRT and put it in the VIDEO_TS directory. No subtitles.
- Call the file VTS_01_1.SRT and put it in the VIDEO_TS directory. This worked great for the first file, but no subtitles were played when the second file (VTS_01_2.VOB) started to play.
- Could it be that they were smart enough to support this? I copied the SRT file 5 times, and named them VTS_01_1.SRT, VTS_01_2.SRT, VTS_01_3.SRT, VTS_01_4.SRT and VTS_01_5.SRT. After all, the TV knew the files should play one after another. Alas. The subtitles played, but started anew when the second VOB file began to play.
This last experiment gave me an idea how to do it: if I copied the file five times and resynced the subtitles for the last four, it should work. But of course, that is not something you want to do manually.
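The resync itself is plain arithmetic: subtract the start time of a video part from every subtitle timestamp. Here is a minimal sketch of that step in Python, assuming the usual HH:MM:SS,mmm timestamp layout of SRT files; the function names parse_timestamp, format_timestamp and shift_timestamp are made up for this illustration.

```python
import re
from datetime import timedelta

# SRT timestamps look like 00:25:10,500 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def parse_timestamp(text):
    """Turn an SRT timestamp such as '00:25:10,500' into a timedelta."""
    match = TIMESTAMP.fullmatch(text.strip())
    if match is None:
        raise ValueError("not an SRT timestamp: %r" % text)
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds,
                     milliseconds=millis)


def format_timestamp(delta):
    """Format a non-negative timedelta as an SRT timestamp."""
    total_ms = delta // timedelta(milliseconds=1)
    hours, rest = divmod(total_ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    seconds, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, seconds, millis)


def shift_timestamp(text, part_start):
    """Shift one timestamp so it is relative to the start of a video part.

    Timestamps that would fall before the part starts are clamped to zero.
    """
    shifted = parse_timestamp(text) - part_start
    return format_timestamp(max(shifted, timedelta(0)))


if __name__ == "__main__":
    # A subtitle 25m10.5s into the movie, for a part that starts at 20m.
    print(shift_timestamp("00:25:10,500", timedelta(minutes=20)))
```

So a subtitle that appears 25 minutes and 10.5 seconds into the movie lands at 5 minutes and 10.5 seconds in a part that starts at the 20 minute mark.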
Just because it was possible, I wrote the following Python program. It reads the start time of each VOB file (using FFmpeg) and creates one copy of the original SRT file per VOB file, with the timestamps offset by the start time of that piece of the video. Of course, this took longer than converting the VIDEO_TS directory into, say, an MKV file, but the next time it will not…
Share and enjoy.
This program needs FFmpeg to read the video information (more precisely, it needs ffprobe) and assumes it is available in your path. FFmpeg is available at https://www.ffmpeg.org/
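If you want to check the ffprobe requirement up front, something like the following sketch could do it. The helper names find_ffprobe, ffprobe_command and parse_start_time are hypothetical; the ffprobe options are the ones the program itself uses, and the sketch assumes ffprobe prints the start time as a plain number of seconds on the first line.

```python
import shutil
from datetime import timedelta


def find_ffprobe(name="ffprobe"):
    """Return the full path of ffprobe, or None when it is not in the path."""
    return shutil.which(name)


def ffprobe_command(ffprobe, vob_path):
    """Build the argument list that asks ffprobe for one file's start time.

    Passing a list (rather than one string that is split on spaces) keeps
    file names with spaces intact.
    """
    return [ffprobe, "-v", "error",
            "-show_entries", "format=start_time",
            "-of", "default=noprint_wrappers=1:nokey=1",
            vob_path]


def parse_start_time(ffprobe_output):
    """Convert ffprobe's output (seconds, as text) into a timedelta."""
    # First line only, in case ffprobe prints more than one.
    first_line = ffprobe_output.strip().splitlines()[0]
    return timedelta(seconds=float(first_line))


if __name__ == "__main__":
    path = find_ffprobe()
    if path is None:
        print("ffprobe not found; install FFmpeg and add it to your path")
    else:
        print("Would run:", " ".join(ffprobe_command(path, "VTS_01_2.VOB")))
```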
I learned some things from:
- https://somethingididnotknow.wordpress.com/2012/05/02/fix-subtitles-offset-with-python/
- https://github.com/wting/srt-resync/blob/master/srt-resync
Thanks for that.
"""
NAME
    srt2video_ts.py - split an SRT file for use with VTS_01_*.VOB files

SYNOPSIS
    srt2video_ts.py [-h] [--version] -s SRT_FILE [video_ts]

DESCRIPTION
    srt2video_ts splits an SRT file according to the start times of a set
    of VOB files. You can use it to see subtitles with a copied DVD on a
    Samsung TV. It will probably be useful with other video applications
    as well.

    This program needs FFmpeg to read the video information (more
    precisely, it needs ffprobe) and assumes it is available in your path.
    FFmpeg is available at https://www.ffmpeg.org/

    I learned some things from:
    https://somethingididnotknow.wordpress.com/2012/05/02/fix-subtitles-offset-with-python/
    https://github.com/wting/srt-resync/blob/master/srt-resync
    Thanks.

ARGUMENTS
    video_ts
        The VIDEO_TS directory containing the VTS_01_*.VOB files.
        Default is the current directory.

OPTIONS
    -h, --help
        Show this help message and exit.
    --version
        Show version information and quit.
    -s SRT_FILE, --srt-file SRT_FILE
        The SRT file to split. The SRT file will be split into a set of
        SRT files with the same names as the VOB files, but with the
        extension SRT. Many video players use this convention to show
        subtitles automatically.

AUTHOR
    Dion Nicolaas

LICENSE
    zlib License:

    Copyright (c) 2016 Dion Nicolaas

    This software is provided 'as-is', without any express or implied
    warranty. In no event will the authors be held liable for any damages
    arising from the use of this software.

    Permission is granted to anyone to use this software for any purpose,
    including commercial applications, and to alter it and redistribute it
    freely, subject to the following restrictions:

    1. The origin of this software must not be misrepresented; you must not
       claim that you wrote the original software. If you use this software
       in a product, an acknowledgement in the product documentation would
       be appreciated but is not required.
    2. Altered source versions must be plainly marked as such, and must not
       be misrepresented as being the original software.
    3. This notice may not be removed or altered from any source
       distribution.
"""
import argparse
import datetime
import os
import re
import subprocess

VERSION = "1.0"
DESCRIPTION = "Split an SRT file for use with VTS_01_*.VOB files"

# The ffprobe command to retrieve a VOB file's start time; the file name is
# appended as the last argument. This assumes ffprobe is in your path (or
# your current dir).
FFPROBE = ["ffprobe", "-v", "error",
           "-show_entries", "format=start_time",
           "-of", "default=noprint_wrappers=1:nokey=1"]

# Arbitrary date, we need to use datetimes with timedeltas.
BASE_DATE = datetime.datetime(2016, 1, 1)


def vob_list(dirname):
    """Return a sorted list of VOB files in dirname. Called from parse_options."""
    return sorted(os.path.join(dirname, fname)
                  for fname in os.listdir(dirname)
                  if re.search(r"\.vob$", fname, re.IGNORECASE))


def parse_options():
    """Parse the command line options and return them as a namespace."""
    parser = argparse.ArgumentParser(description=DESCRIPTION)
    parser.add_argument('--version', action="version",
                        version="%(prog)s " + VERSION,
                        help="show version information and quit")
    parser.add_argument('-s', '--srt-file', type=argparse.FileType('r'),
                        required=True, help="the SRT file to split")
    parser.add_argument('video_ts', type=vob_list, nargs='?',
                        default=vob_list("."),
                        help="the VIDEO_TS directory containing the "
                             "VTS_01_*.VOB files (default: current directory)")
    return parser.parse_args()


def timestamp(time_string):
    """Turn an SRT file timestring into a datetime (with arbitrary date)."""
    ts = time_string.replace(':', ',')
    tlist = [int(num) for num in ts.split(',')]
    return BASE_DATE + datetime.timedelta(hours=tlist[0], minutes=tlist[1],
                                          seconds=tlist[2],
                                          milliseconds=tlist[3])


def read_srt(srt_file):
    """Read an SRT file and return it as a list of [(start, end), text]."""
    print("Reading SRT file...")
    rec_idx_expected = 0
    rec = []
    records = []
    for line in srt_file:
        if re.match(r"^\s*$", line):
            # Empty line: end of record. Stray blank lines are skipped, so
            # we never store an empty record.
            if rec:
                records.append(rec)
            rec = []
            rec_idx_expected = 0
        elif rec_idx_expected == 0:
            # Check for record number. We don't use the number itself, as we
            # don't expect out of order records (which is hardly supported
            # anyway).
            if re.match(r'\d+', line):
                rec_idx_expected = 1
            else:
                print("Error, record number expected (%s)" % line)
        elif rec_idx_expected == 1:
            # Check for the time stamps.
            m = re.match(r'^(\d+:\d+:\d+,\d+)\s+-->\s+(\d+:\d+:\d+,\d+)', line)
            if m:
                rec = [(timestamp(m.group(1)), timestamp(m.group(2))), ""]
                rec_idx_expected = 2
            else:
                print("Error, time expected (%s)" % line)
        elif rec_idx_expected == 2:
            # Anything next is the subtitle text. Just store it.
            rec[1] += line
    # The file may end without a trailing blank line.
    if rec:
        records.append(rec)
    return records


def get_vob_starts(files):
    """Use ffprobe to make a list of the vobfiles' start times.

    Return a dictionary vobfilename -> starttime.
    """
    print("Reading VOB file offsets...")
    vob_starts = {}
    for fname in files:
        print("  Probing %s..." % fname)
        # No shell and an argument list, so file names with spaces work.
        output = subprocess.check_output(FFPROBE + [fname],
                                         stderr=subprocess.STDOUT,
                                         universal_newlines=True)
        # first line only, in case ffprobe spits out more
        time = output.splitlines()[0]
        vob_starts[fname] = datetime.timedelta(seconds=float(time))
    return vob_starts


def format_time(dt):
    """Format a datetime in SRT file format."""
    formatted = dt.strftime('%H:%M:%S,%f')
    # %f gives microseconds; SRT wants milliseconds.
    return formatted[:-3]


def write_record(f, index, record, start):
    """Write out one SRT record, times offset with start."""
    f.write("%d\n" % index)
    # The end time belongs to this video part, the start time maybe started
    # earlier. Set it to 0 in that case.
    start_time = record[0][0] - start
    if start_time < BASE_DATE:
        start_time = BASE_DATE
    f.write("%s --> %s\n" % (format_time(start_time),
                             format_time(record[0][1] - start)))
    f.write(record[1])
    f.write("\n")


def write_srt(srtname, start, srt):
    """Write an SRT file starting after time start.

    Subtract start from timestamps. Leave everything after end, as it does
    no harm.
    """
    print("  Writing %s..." % srtname)
    with open(srtname, "w") as f:
        index = 0
        for record in srt:
            # Check if the end-time belongs to this video
            offset_time = record[0][1] - start
            if offset_time < BASE_DATE:
                continue
            index += 1
            write_record(f, index, record, start)


def write_srts(vob_starts, srt):
    """Write SRT files for all VOB files."""
    print("Writing SRT files...")
    for (fname, start) in vob_starts.items():
        # flags must be passed by keyword: the fourth positional argument
        # of re.sub is count, not flags.
        srtname = re.sub(r'\.VOB$', '.SRT', fname, flags=re.IGNORECASE)
        write_srt(srtname, start, srt)


if __name__ == "__main__":
    options = parse_options()
    srt = read_srt(options.srt_file)
    vob_starts = get_vob_starts(options.video_ts)
    write_srts(vob_starts, srt)
    print("Done.")
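The heart of write_srt and write_record is the rule that decides which subtitles end up in which part. A standalone sketch of that rule may make it easier to follow. It is an illustration only: it uses plain timedeltas instead of the BASE_DATE trick, and the function name records_for_part and the (start, end, text) tuple layout are assumptions of this example, not part of the program.

```python
from datetime import timedelta


def records_for_part(records, part_start):
    """Select and re-time the subtitles for one part of the video.

    records is a list of (start, end, text) tuples with times relative to
    the whole movie; part_start is where this part begins in the movie.
    """
    zero = timedelta(0)
    result = []
    for start, end, text in records:
        if end - part_start < zero:
            # This subtitle was over before the part began: skip it.
            continue
        # A subtitle that straddles the boundary starts at zero here.
        new_start = max(start - part_start, zero)
        result.append((new_start, end - part_start, text))
    return result


if __name__ == "__main__":
    seconds = lambda n: timedelta(seconds=n)
    movie = [
        (seconds(10), seconds(12), "over before part two"),
        (seconds(58), seconds(62), "straddles the boundary"),
        (seconds(90), seconds(95), "well inside part two"),
    ]
    for record in records_for_part(movie, seconds(60)):
        print(record)
```

Subtitles that belong to later parts are kept too, just like in the program: they simply never get shown, because that part of the video has ended by then.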