SageTV Community  

  #1  
Old 03-11-2010, 04:46 AM
doc
Sage Fanatic
 
Join Date: Jun 2006
Location: Leicester, England
Posts: 918
Extracting Subtitles from TS files

I expect I already know the answer to this.

Does anyone know if there is a way to extract the subtitles out of a TS file?

I've found lots of utilities that should do it with mpg files. I know the subtitles are included in the TS file, as I can see them in VLC, but I'd really like to find a way to extract them.

I've tried converting the recordings to mpg but they then lose the subtitle stream.

Any ideas, anyone?
  #2  
Old 03-13-2010, 09:42 AM
emveepee
Sage Aficionado
 
Join Date: Nov 2006
Posts: 417
ProjectX is best for extracting Teletext and graphic subtitles. CCExtractor is easier for extracting North American closed captions.

Martin
  #3  
Old 03-14-2010, 01:59 AM
doc
Sage Fanatic
 
Join Date: Jun 2006
Location: Leicester, England
Posts: 918
I was a bit vague in my post: I'm in New Zealand, using DVB-S cards.

I've tried ProjectX but couldn't get it to work. I don't know if it will extract the streams or not. I'll play some more over the next few days and see how I get on.
  #4  
Old 03-16-2010, 07:38 AM
doc
Sage Fanatic
 
Join Date: Jun 2006
Location: Leicester, England
Posts: 918
I'm getting somewhere slowly....

The subtitles are stored as teletext pages on page 801.

I can extract them to a .sub file in ProjectX, and they are in the format below:

Quote:
1
00:00:00,000 --> 00:00:00,000
Now to the most viewed stories
on our website today.

2
00:00:00,000 --> 00:00:00,000
Labour MP Charles Chauvel says
a story about him telling children
on a plane to shut up

The log file has the following:

Quote:
--> using faked PTS for following data:

++> Teletext: PID 0x0243 / PesID 0xBD / SubID 0x10 :

-> export format: srt
-> Unicode (UTF-16 BE) on text export
-> decode hidden rows of teletext (id 0xFF, means offline pages)
-> temp. file: EleventhHour-7938023-0{0x0243-0xBD-0x10}.tt (122956758 bytes)
-> looking for page number 801
!> teletext stream doesn't use PTS's, sync impossible
-> take only first Audio PTS (to sync the starttime)
-> Teletext PTS: first packet 00:00:00.000, last packet 00:00:00.000
-> adjusting teletext at its own timeline
887 pages of No. 801 written...
---> new File: C:\\EleventhHour-7938023-0{0x0243-0xBD-0x10}[801].srt

As you can see, the stop/start times are all set to 00:00:00,000.
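
The log identifies the teletext stream by its PID (0x0243). As a rough way to check whether a recording carries packets on that PID before running ProjectX on it, here is a minimal sketch. It assumes plain 188-byte transport stream packets starting at the beginning of the file and does no resynchronisation; the function names are made up for illustration.

```python
from collections import Counter

TS_PACKET_SIZE = 188   # standard MPEG transport stream packet length
SYNC_BYTE = 0x47       # every TS packet starts with this byte


def count_pids(data):
    """Count packets per PID in a buffer of back-to-back 188-byte TS packets.

    Assumes the buffer starts on a packet boundary. Packets that do not
    start with the sync byte are skipped, not resynchronised.
    """
    counts = Counter()
    for offset in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            continue
        # The 13-bit PID is the low 5 bits of byte 1 plus all of byte 2.
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        counts[pid] += 1
    return counts


def count_pids_in_file(path, packets_per_read=4096):
    """Count PIDs in a TS file, reading whole packets a chunk at a time."""
    counts = Counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(TS_PACKET_SIZE * packets_per_read)
            if not chunk:
                break
            counts.update(count_pids(chunk))
    return counts
```

Something like `count_pids_in_file("recording.ts")[0x0243]` (the file name is a placeholder) returning a non-zero count would suggest the teletext PID from the log is present in that recording.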

Does anyone know if it's possible to get the times out of this too? Or is what I've extracted the best I can manage?

Thanks.
  #5  
Old 03-16-2010, 09:02 AM
StephaneM
Sage Icon
 
Join Date: Mar 2006
Location: France
Posts: 1,463
Quote:
Originally Posted by doc
As you can see the stop/start times are all set to 00:00:00,000. Does anyone know if it's possible to get the times out of this too? Or is what I've extracted the best I can manage?

The ProjectX log states that the TTX stream isn't using a PTS (presentation time stamp), and as such ProjectX is not able to deduce the display time of each subtitle relative to the video or audio stream.

Actually, this is not an error in the stream, because TTX subtitles need not carry a PTS (TTX pages are supposed to be displayed as they arrive).
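
For reference, an MPEG PTS is a tick count on a 90 kHz clock, so when a stream does carry one, turning it into an SRT-style time is simple arithmetic. A minimal sketch (the function name is made up for illustration, and it ignores PTS wrap-around):

```python
PTS_CLOCK_HZ = 90_000  # MPEG PTS values count ticks of a 90 kHz clock


def pts_to_srt_time(pts):
    """Format a PTS tick count as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = pts * 1000 // PTS_CLOCK_HZ
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"
```

With no PTS in the teletext stream there is nothing to feed into a conversion like this, which fits the all-zero times in the log.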

I think that my recorder would also fail to extract these subtitles correctly...

Regards,
Stéphane.
  #6  
Old 03-16-2010, 03:29 PM
doc
Sage Fanatic
 
Join Date: Jun 2006
Location: Leicester, England
Posts: 918
Thanks for looking at that Stephane.

I'll put this idea at the bottom of the pile and start on a few others....
  #7  
Old 03-16-2010, 09:03 PM
doc
Sage Fanatic
 
Join Date: Jun 2006
Location: Leicester, England
Posts: 918
OK, weather's been rubbish today so I got bored....

After playing around a bit more, I got ProjectX to extract a timestamp with the subtitles by unchecking the option "Re-build TTX-PTS from 1st MpgAudio stream".

The timestamps were out by 73 seconds for some reason, so I've edited the SRT file with vbscript to correct it.
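
The script itself isn't posted here, so the following is only a sketch of the same idea in Python, not the VBScript that was used: it shifts every timing line in an SRT file by a fixed number of seconds. The function names are made up, and whether the 73 seconds needs adding or subtracting is an assumption (a negative offset is used in the example).

```python
import re

# Matches SRT timestamps such as 00:01:13,000
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_ms):
    """Return one matched timestamp moved by offset_ms milliseconds."""
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    total = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis + offset_ms
    total = max(total, 0)  # clamp so no cue ends up before 00:00:00,000
    hours, rest = divmod(total, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt(text, offset_seconds):
    """Shift every 'start --> end' line in SRT text by offset_seconds."""
    offset_ms = round(offset_seconds * 1000)
    shifted = []
    for line in text.splitlines():
        # Only touch timing lines, so times quoted in subtitle text survive.
        if "-->" in line:
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, offset_ms), line)
        shifted.append(line)
    return "\n".join(shifted)
```

For example, `shift_srt(srt_text, -73)` would pull every cue 73 seconds earlier. The ProjectX log above reports a UTF-16 export, so the file would need to be read and written with a matching encoding.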

I'll test on a few more files to see if the 73 seconds is consistent or not - once the TV becomes free!

If it is, I'll use SJQ to automatically create them after each recording finishes.

If it's not consistent, I'll put this back to the bottom of the pile again!

Last edited by doc; 03-16-2010 at 09:05 PM.
  #8  
Old 03-17-2010, 08:54 AM
StephaneM
Sage Icon
 
Join Date: Mar 2006
Location: France
Posts: 1,463
Quote:
Originally Posted by doc
The timestamps were out by 73 seconds for some reason, so I've edited the SRT file with vbscript to correct it.

73 seconds seems huge (usually DVB elementary stream timestamps are synced to the main stream clock, and usually this clock is the one carried by the video stream).

Just out of curiosity, why do you want to extract the subtitles?
  #9  
Old 03-17-2010, 01:46 PM
doc
Sage Fanatic
 
Join Date: Jun 2006
Location: Leicester, England
Posts: 918
Quote:
Originally Posted by StephaneM
73 seconds seems huge (usually DVB elementary stream timestamps are synced to the main stream clock, and usually this clock is the one carried by the video stream).

Just out of curiosity, why do you want to extract the subtitles?

Well... the first ttx page is timed at 00:00:00 and all other timestamps seem to be synced to that page, so the amount they are off by varies each time. I managed somehow to get one to start at 09:something - not sure why, as the show started at 12:00 noon. Will have a closer look another time.

The 'customer' here wants a transcript of all the subtitles for archiving purposes. To be honest, the extracts I have will do; I was just hoping I could get the timelines correct so that in future they can be used for playback - at some point they may ask the question, but I hope they don't!
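
For the transcript side, the timing lines can simply be thrown away. A minimal sketch, assuming the SRT layout shown earlier in the thread (cue number, timing line, then text, with a blank line between cues); the function name is made up for illustration:

```python
def srt_to_transcript(text):
    """Return only the subtitle text from SRT content, one cue per line."""
    # Drop a leading byte-order mark and normalise Windows line endings.
    normalised = text.lstrip("\ufeff").replace("\r\n", "\n")
    cues = []
    for block in normalised.split("\n\n"):
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        # Drop the numeric cue index, then the "start --> end" timing line.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        if lines and "-->" in lines[0]:
            lines = lines[1:]
        if lines:
            cues.append(" ".join(lines))
    return "\n".join(cues)
```

Because it never reads the timestamps, this works on the all-zero exports as well as on corrected ones.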

Last edited by doc; 03-17-2010 at 01:52 PM.
  #10  
Old 03-17-2010, 08:37 PM
StephaneM
Sage Icon
 
Join Date: Mar 2006
Location: France
Posts: 1,463
Quote:
The 'customer' here wants a transcript of all the subtitles for archiving purposes. To be honest, the extracts I have will do; I was just hoping I could get the timelines correct so that in future they can be used for playback - at some point they may ask the question, but I hope they don't!

Then try my DVBViewer Recorder if you can: it will extract all the subtitles with the correct time stamps (during the recording), and as those are in SMI format they can be played back even with Windows Media Player (and of course with SageTV). Time stamps are pretty accurate: you will still have some inaccuracy because of the video playback pipeline (usually with WMP, subtitles will come on a little earlier than they should; it depends on the video decoder used).
  #11  
Old 03-21-2010, 04:06 AM
Doubletop
Sage Advanced User
 
Join Date: Apr 2005
Location: New Zealand
Posts: 126
Can somebody provide a simple explanation of what would be required for SageTV to play closed captions when they already exist in the stream? I'm not "having a go", StephaneM, you've done some great stuff, but I cannot see why we should need to use your DVBViewer recorder to control the tuner and have SageTV then use it as a network encoder when SageTV is perfectly capable of doing the tuning and saving the video file.

If DVBViewer can find the captions, why can't a SageTV plugin be created that extracts the captions during playback and displays them? I just don't understand why we have to have these round-the-houses solutions. Especially when apps like DVBViewer apparently have a very high CPU overhead.

Doc: I'm in NZ as well, so if you get this sorted you would earn countless WAF points. It's the only thing missing from SageTV as far as I am concerned. We have to resort to real time for shows like "Coro", as the American in our household has understandable issues with the Manchester accent.
  #12  
Old 03-21-2010, 07:25 AM
doc
Sage Fanatic
 
Join Date: Jun 2006
Location: Leicester, England
Posts: 918
I think it's on the Sage 'to do' list; it's just that the list is quite long, I'd guess.

The main reason I haven't looked into Stephane's software is because all the freeview channels are on 2 muxes, so with 2 tuner cards and umpteen dvb enhancer virtual tuners you can record all of the freeview channels simultaneously.

I was looking at ProjectX as I was hoping to run a standalone exe against each TS file after it has recorded. ProjectX does a good job once you get the ini file correct, but the timestamps are way off for the first ttx page, which makes it pretty useless unless you manually correct them. If you just want the odd file, that's fairly easy as long as you can count the seconds and then knock up a script to amend it, but it's not really a viable solution.

I'm afraid my programming skills got left behind in the 90's when the old mainframe got sent to the landfill and the MOD decided to buy COTS (commercial off the shelf) applications.

I'm hoping to get back into programming in the next few months now that I've got time on my hands so if my skills ever develop I'll look into this again. I've done a lot of vbscript'ing lately but I don't call that proper programming.

My question for the programmers out there is - which language? I'd hate to spend a few months and a few courses (and pounds) learning one to find out I'd have been better off learning another. I also have to consider which would be best for work if I decide to go in that direction again - I'm sick of servers and support work.

DT - I live in England - just visiting friends here and go back in 5 weeks. Heading off in the car on my own in a couple of days with a tent. If you don't hear back, I've probably been attacked by kiwis... or a 15cm tsunami!
  #13  
Old 03-21-2010, 08:01 AM
StephaneM
Sage Icon
 
Join Date: Mar 2006
Location: France
Posts: 1,463
Quote:
Originally Posted by Doubletop
Can somebody provide a simple explanation of what would be required for SageTV to play closed captions when they already exist in the stream?

Time to develop the feature...


Quote:
I cannot see why we should need to use your DVBViewer recorder to control the tuner and have SageTV then use it as a network encoder when SageTV is perfectly capable of doing the tuning and saving the video file.

1) SageTV doesn't (by default) save the subtitle streams, only audio and video.
2) When you ask SageTV to save the subtitle streams, it can mess up the recording and SageTV will not play it (I was having terrible issues when doing so).

So basically, when SageTV makes the recording, you have no subtitles in the recording.

Moreover, each channel will carry multiple subtitle streams, so you have to choose which of them you want to display. By using a network recorder you can specify, for each channel, which subtitles to extract. This is not the same when you only have the recording file: which subtitle stream should be extracted? You don't know the channel that was used to make the recording (unless you use, for instance, the web server plugin to ask for it...).

Quote:
Especially when apps like DVBViewer apparently have a very high CPU overhead.

You are wrong: DVBViewer doesn't have any CPU overhead. With my recorder and DVBViewer you'll not see your CPU working at all... I can record and extract subs from three HDTV recordings on my old machine while playing one HDTV recording and displaying DVB subs (that are overlaid bitmap images).

So if you really need live subtitles displayed, you should really try it...

Regards,
Stéphane.
  #14  
Old 03-21-2010, 08:06 AM
StephaneM
Sage Icon
 
Join Date: Mar 2006
Location: France
Posts: 1,463
Quote:
Originally Posted by doc
The main reason I haven't looked into Stephane's software is because all the freeview channels are on 2 muxes, so with 2 tuner cards and umpteen dvb enhancer virtual tuners you can record all of the freeview channels simultaneously.

You can also do it with DVBViewer + DVBViewer Recorder... The only thing you can't do is, with one tuner, tune to encrypted channels on the same mux at the same time (from the DVBViewer authors: when it is possible - it depends on the CAM used - you may lose packets or corrupt one or both recordings. That's why it is not allowed).

Regards,
Stéphane.
  #15  
Old 03-21-2010, 09:55 PM
doc
Sage Fanatic
 
Join Date: Jun 2006
Location: Leicester, England
Posts: 918
Quote:
Originally Posted by StephaneM
You can also do it with DVBViewer + DVBViewer Recorder...

Cool, I hadn't realised that. I should have guessed, as the XML importer of yours I use is always at least one step ahead of what I want to do.

I'll see if I can find time to look into this next week when I get back.
All times are GMT -6.


Copyright ©2000 - 2023, vBulletin Solutions Inc.
Copyright 2003-2005 SageTV, LLC. All rights reserved.