I have a basic sab/sickbeard/couchpotato setup running on Ubuntu Server 12.04.
I use my Samsung smart TV with Plex to watch my downloads. The Samsung Plex app can't display MKV subs unless you transcode them, and I don't like doing that because I'm running my system on an aged computer. I ran across this script, which I think is the answer to my sorrows:
Code:
#!/bin/bash
# Extract subtitles from each MKV file in the given directory
# If no directory is given, work in local dir
if [ "$1" = "" ]; then
    DIR="."
else
    DIR="$1"
fi
# Get all the MKV files in this dir and its subdirs
find "$DIR" -type f -name '*.mkv' | while read filename
do
    # Find out which tracks contain the subtitles
    mkvmerge -i "$filename" | grep 'subtitles' | while read subline
    do
        # Grep the number of the subtitle track
        tracknumber=`echo $subline | egrep -o "[0-9]{1,2}" | head -1`
        # Get base name for subtitle
        subtitlename=${filename%.*}
        # Extract the track to a .tmp file
        `mkvextract tracks "$filename" $tracknumber:"$subtitlename.srt.tmp" > /dev/null 2>&1`
        `chmod g+rw "$subtitlename.srt.tmp"`
        # Do a super-primitive language guess: DUTCH
        langtest=`egrep -ic ' ik | je | een ' "$subtitlename".srt.tmp`
        trimregex="vertaling &\|vertaling:\|vertaald door\|bierdopje"
        # Do a super-primitive language guess: ENGLISH
        #langtest=`egrep -ic ' you | to | the ' "$subtitlename".srt.tmp`
        #trimregex=""
        # Do a super-primitive language guess: GERMAN
        #langtest=`egrep -ic ' ich | ist | sie ' "$subtitlename".srt.tmp`
        #trimregex=""
        # Do a super-primitive language guess: SPANISH
        #langtest=`egrep -ic ' el | es | por ' "$subtitlename".srt.tmp`
        #trimregex=""
        # Check if subtitle passes our language filter (10 or more matches)
        if [ $langtest -ge 10 ]; then
            # Regex to remove credits at the end of subtitles (read my reason why!)
            `sed 's/\r//g' < "$subtitlename.srt.tmp" \
                | sed 's/%/%%/g' \
                | awk '{if (a){printf("\t")};printf $0; a=1; } /^$/{print ""; a=0;}' \
                | grep -iv "$trimregex" \
                | sed 's/\t/\r\n/g' > "$subtitlename.srt"`
            `rm "$subtitlename.srt.tmp"`
            `chmod g+rw "$subtitlename.srt"`
        else
            # Not our desired language: add a number to the filename and keep anyway, just in case
            `mv "$subtitlename.srt.tmp" "$subtitlename.$tracknumber.srt" > /dev/null 2>&1`
        fi
    done
done
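For anyone who wants to try the track-number step on its own, here is a minimal sketch. It assumes `mkvmerge -i` prints lines of the form `Track ID 2: subtitles (SubRip/SRT)`. The helper name `track_id` is made up for illustration and is not part of the script above.

```shell
#!/bin/bash
# track_id LINE: print the track ID from one line of `mkvmerge -i` output.
# Assumed line format: "Track ID 2: subtitles (SubRip/SRT)".
# Prints nothing if the line does not start with "Track ID <digits>:".
track_id() {
    printf '%s\n' "$1" | sed -n 's/^Track ID \([0-9][0-9]*\):.*/\1/p'
}

# Hard-coded sample line, not real mkvmerge output
track_id "Track ID 2: subtitles (SubRip/SRT)"
```

Anchoring on the `Track ID` prefix avoids picking up a stray number from elsewhere in the line, which the looser `[0-9]{1,2}` match could do.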
I have installed mkvtoolnix, put the script in a folder, and set the post-processing scripts folder in sab to that directory. That's where my mad skillz end.
Thanks in advance!
And just want to give props to Computer Nerd From Hell for providing this script to the online world!
EDIT: I also have another request! I would like the file name to be *.en.srt or *.fi.srt (the latter for Finnish subs if found; not essential, but if it's easy to implement I would be very much obliged).
I assume changing these few lines of code as follows would take care of the *.en.srt issue:
| sed 's/\t/\r\n/g' > "$subtitlename.en.srt"`
`rm "$subtitlename.srt.tmp"`
`chmod g+rw "$subtitlename.en.srt"`
EDIT 2: After a little playing around I found that the script has to be activated for each category under Categories. Will run a test download to see if this works now.
I ran the script manually and found that it seems to prefer Dutch subtitles. If anyone can help with this I would be grateful.
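The preference for Dutch most likely comes from the Dutch `langtest` line being the only one that is not commented out. Below is a rough sketch of checking English and Finnish together, keeping the same 10-match threshold. The function name `guess_lang` is hypothetical, the English words are the ones from the script's commented-out line, and the three Finnish words are an assumption picked for illustration.

```shell
#!/bin/bash
# guess_lang FILE: print "en" or "fi" if FILE looks like English or Finnish
# subtitles, print nothing otherwise. Hypothetical helper, crude word lists.
guess_lang() {
    local file=$1 en_hits fi_hits
    # grep -c counts matching lines; for a missing file it prints nothing,
    # so fall back to 0 to keep the numeric tests below well-formed
    en_hits=$(grep -Eic ' you | to | the ' "$file" 2>/dev/null)
    fi_hits=$(grep -Eic ' tämä | hän | kyllä ' "$file" 2>/dev/null)
    en_hits=${en_hits:-0}
    fi_hits=${fi_hits:-0}
    if [ "$en_hits" -ge 10 ] && [ "$en_hits" -ge "$fi_hits" ]; then
        echo en
    elif [ "$fi_hits" -ge 10 ]; then
        echo fi
    fi
}
```

Inside the loop this could be used as `lang=$(guess_lang "$subtitlename.srt.tmp")`, writing to `"$subtitlename.$lang.srt"` when `lang` is non-empty and falling back to the numbered name otherwise.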
EDIT 3: I modified the script slightly for the test to see how it performs as follows:
Code:
#!/bin/bash
# Extract subtitles from each MKV file in the given directory
# If no directory is given, work in local dir
if [ "$1" = "" ]; then
    DIR="."
else
    DIR="$1"
fi
# Get all the MKV files in this dir and its subdirs
find "$DIR" -type f -name '*.mkv' | while read filename
do
    # Find out which tracks contain the subtitles
    mkvmerge -i "$filename" | grep 'subtitles' | while read subline
    do
        # Grep the number of the subtitle track
        tracknumber=`echo $subline | egrep -o "[0-9]{1,2}" | head -1`
        # Get base name for subtitle
        subtitlename=${filename%.*}
        # Extract the track to a .tmp file
        `mkvextract tracks "$filename" $tracknumber:"$subtitlename.srt.tmp" > /dev/null 2>&1`
        `chmod g+rw "$subtitlename.srt.tmp"`
        # Do a super-primitive language guess: DUTCH
        #langtest=`egrep -ic ' ik | je | een ' "$subtitlename".srt.tmp`
        #trimregex="vertaling &\|vertaling:\|vertaald door\|bierdopje"
        # Do a super-primitive language guess: ENGLISH
        langtest=`egrep -ic ' you | to | the ' "$subtitlename".en.srt.tmp`
        trimregex=""
        # Do a super-primitive language guess: FINNISH
        #langtest=`egrep -ic ' tämä | hän | kyllä ' "$subtitlename".fi.srt.tmp`
        #trimregex=""
        # Do a super-primitive language guess: GERMAN
        #langtest=`egrep -ic ' ich | ist | sie ' "$subtitlename".srt.tmp`
        #trimregex=""
        # Do a super-primitive language guess: SPANISH
        #langtest=`egrep -ic ' el | es | por ' "$subtitlename".srt.tmp`
        #trimregex=""
        # Check if subtitle passes our language filter (10 or more matches)
        if [ $langtest -ge 10 ]; then
            # Regex to remove credits at the end of subtitles (read my reason why!)
            `sed 's/\r//g' < "$subtitlename.srt.tmp" \
                | sed 's/%/%%/g' \
                | awk '{if (a){printf("\t")};printf $0; a=1; } /^$/{print ""; a=0;}' \
                | grep -iv "$trimregex" \
                | sed 's/\t/\r\n/g' > "$subtitlename.en.srt"`
            `rm "$subtitlename.srt.tmp"`
            `chmod g+rw "$subtitlename.en.srt"`
        else
            # Not our desired language: add a number to the filename and keep anyway, just in case
            `mv "$subtitlename.srt.tmp" "$subtitlename.$tracknumber.srt" > /dev/null 2>&1`
        fi
    done
done
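One thing to watch in the variant above: with `trimregex=""`, the `grep -iv "$trimregex"` step receives an empty pattern. An empty pattern matches every line, so `-v` drops every line and the resulting .srt would come out empty even when the language test passes. A minimal sketch of a guard follows; the helper name `trim_credits` is hypothetical.

```shell
#!/bin/bash
# trim_credits REGEX: copy stdin to stdout, dropping lines that match REGEX
# (case-insensitive). An empty REGEX means "drop nothing".
trim_credits() {
    if [ -n "$1" ]; then
        grep -iv -- "$1"
    else
        cat
    fi
}

# With a pattern, the matching credit line is removed
printf 'first line\nvertaald door someone\n' | trim_credits 'vertaald door'
# With an empty pattern, everything passes through unchanged
printf 'first line\nsecond line\n' | trim_credits ''
```

In the pipeline this would take the place of `grep -iv "$trimregex"`, called as `trim_credits "$trimregex"`.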
I can now get it running in sab and it seems to work partly. It named the srt file:
RANDOMMEDIATHINGY.3.srt and gave the following errors:
Code:
chmod: changing permissions of `/media/plex/1TB/Downloads/Movies/Lorem ipsum dolor sit amet.cp(tt0903624)/orem ipsum dolor sit amet.srt.tmp': Operation not permitted
egrep: /media/plex/1TB/Downloads/Movies/orem ipsum dolor sit amet.cp(tt0903624)/orem ipsum dolor sit amet.en.srt.tmp: No such file or directory
/media/plex/1TB/Downloads/scripts/ripsubtitles.sh: line 48: [: -ge: unary operator expected
chmod: changing permissions of `/media/plex/1TB/Downloads/Movies/orem ipsum dolor sit amet.cp(tt0903624)/orem ipsum dolor sit amet.srt.tmp': Operation not permitted
egrep: /media/plex/1TB/Downloads/Movies/orem ipsum dolor sit amet.cp(tt0903624)/orem ipsum dolor sit amet.en.srt.tmp: No such file or directory
/media/plex/1TB/Downloads/scripts/ripsubtitles.sh: line 48: [: -ge: unary operator expected
- If someone can help with the errors it is spewing above
- If anyone can help with the filter so that it keeps just the English and Finnish subs and names them *.en.srt / *.fi.srt
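A possible reading of the errors above, offered as a guess and not a confirmed diagnosis: the extraction step writes `$subtitlename.srt.tmp`, but the ENGLISH `egrep` line reads `$subtitlename.en.srt.tmp`, which never exists. `langtest` then ends up empty, and the unquoted `[ $langtest -ge 10 ]` collapses to `[ -ge 10 ]`, hence "unary operator expected". The chmod messages usually mean that the user running the script does not own the file, or that the filesystem does not support Unix permissions. Here is a small sketch of a count that cannot come back empty; `count_hits` is a made-up helper name and the demo path is hypothetical.

```shell
#!/bin/bash
# count_hits FILE PATTERN: print how many lines of FILE match PATTERN
# (case-insensitive, extended regex). Prints 0 if FILE cannot be read.
count_hits() {
    local hits
    hits=$(grep -Eic -- "$2" "$1" 2>/dev/null)
    echo "${hits:-0}"
}

# Demo with a path that does not exist (hypothetical name)
langtest=$(count_hits "/no/such/dir/movie.srt.tmp" ' you | to | the ')
# Quoting "$langtest" keeps the test well-formed
if [ "$langtest" -ge 10 ]; then
    echo "looks English"
else
    echo "not enough matches ($langtest)"
fi
```

In the script itself the file passed in would be the one mkvextract actually wrote, that is `"$subtitlename.srt.tmp"`.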
I'm puzzled.. will get back to it later I think...