SageTV Community  

  #1  
Old 01-04-2009, 06:30 PM
evilpenguin evilpenguin is offline
SageTVaholic
 
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Looking for Title/Season/Episode naming formats to support for metadata scraper

I'm working on a scraping tool and I'm hoping to accommodate as many naming conventions as I can. Basically, I just need to know how people's files are named so I can figure out how to capture the show title and season/episode # so I can track down the metadata on the internet.

Right now I've got 3 working...

Title-Episode-######-0.mpg
- SageTV format, impossible to parse directly, but I can use the webserver to grab the real name using the airing number.
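For the SageTV format, a minimal sketch of pulling out just the airing number so it can be handed to a lookup. It assumes the layout is exactly Title-Episode-digits-digits.mpg as described above; the function name and the sample file name are made up for illustration.

```python
import re

# Assumed layout: <title and episode text>-<airing number>-<numeric suffix>.mpg
SAGE_NAME = re.compile(r"^(?P<label>.+)-(?P<airing>\d+)-(?P<suffix>\d+)\.mpg$", re.IGNORECASE)


def airing_id(filename):
    """Return the airing number as an int, or None if the name does not fit the layout."""
    match = SAGE_NAME.match(filename)
    return int(match.group("airing")) if match else None


# Hypothetical file name, not taken from a real recording
print(airing_id("Lost-TheBeginningoftheEnd-1234567-0.mpg"))
```

The title and episode text are left unparsed on purpose, since the post notes they cannot be split reliably from the file name alone.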

title S##E## - episode.avi

title ##x## - episode.avi
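The two non-SageTV formats above could be handled by one pattern. This is only a sketch in Python: the function name, the returned keys, and the sample names are assumptions, and it expects whitespace between the title and the season/episode marker.

```python
import re

# title S##E## - episode.ext   or   title ##x## - episode.ext
EPISODE_NAME = re.compile(
    r"^(?P<title>.+?)\s+"
    r"(?:[Ss](?P<s1>\d+)[Ee](?P<e1>\d+)|(?P<s2>\d+)[Xx](?P<e2>\d+))"
    r"(?:\s*-\s*(?P<name>.+?))?"   # the episode name is optional
    r"\.[^.]+$"                    # file extension
)


def parse_episode_name(filename):
    """Return title, season, episode, and episode name, or None if nothing matches."""
    match = EPISODE_NAME.match(filename)
    if match is None:
        return None
    season = match.group("s1") or match.group("s2")
    episode = match.group("e1") or match.group("e2")
    return {
        "title": match.group("title"),
        "season": int(season),
        "episode": int(episode),
        "name": match.group("name"),
    }


print(parse_episode_name("Lost S04E01 - The Beginning of the End.avi"))
print(parse_episode_name("Lost 4x01 - The Beginning of the End.avi"))
```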

If you have one that looks different, if you use the folder structure for naming, or anything else that is a little crazy, please post it so I can do my best to support it.

Thanks!

Last edited by evilpenguin; 01-04-2009 at 06:38 PM.
  #2  
Old 01-04-2009, 06:59 PM
stuckless stuckless is offline
SageTVaholic
 
Join Date: Oct 2007
Location: London, Ontario, Canada
Posts: 9,713
Which site(s) are you scraping? I'm currently working on TV scraping support in the batch metadata tools. Maybe we can share code/algorithms? Currently, for SxE type variants, I'm using the following regexp:
Code:
".*[Ss]*0*([0-9]+)\\.*[EeXx]\\s*0*([0-9]+).*"
Reply With Quote
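A quick way to sanity-check the regexp from the post above is to run it against a few sample names. The sketch below translates it to a Python raw string (the doubled backslashes in the post look like string-literal escaping, which is an assumption). In Python's `re`, the greedy leading `.*` swallows the first digit of a two-digit season, so S12 comes back as 2. `BOUNDED` is one possible variation, not the poster's pattern: a lazy prefix plus a "no digit before" check.

```python
import re

# The posted pattern, translated to a Python raw string
POSTED = re.compile(r".*[Ss]*0*([0-9]+)\.*[EeXx]\s*0*([0-9]+).*")

# Hypothetical variation: lazy prefix, and the season may not start mid-number
BOUNDED = re.compile(r".*?(?<![0-9])[Ss]?0*([0-9]+)\.*[EeXx]\s*0*([0-9]+).*")


def season_episode(pattern, name):
    """Return (season, episode) as ints, or None when the pattern does not match."""
    match = pattern.match(name)
    return (int(match.group(1)), int(match.group(2))) if match else None


# Made-up sample names
for sample in ("Lost S04E01 - Example.avi", "Lost 4x01 - Example.avi", "Lost S12E03 - Example.avi"):
    print(sample, season_episode(POSTED, sample), season_episode(BOUNDED, sample))
```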
  #3  
Old 01-04-2009, 07:03 PM
dvd_maniac dvd_maniac is offline
Sage Icon
 
Join Date: Mar 2004
Location: New England
Posts: 1,899
I have a folder for each show I archive, then a season folder for each season, and finally the episode.
Ex:
TV\CSI\09\S09E08 Say Uncle.mp4
__________________
If this doesn't work right, Then:
"I'm going to blow up the Earth!"
  #4  
Old 01-04-2009, 07:22 PM
keefmyster keefmyster is offline
Sage User
 
Join Date: Jan 2005
Location: VA
Posts: 28
Mine are in a format similar to dvd_maniac's, but I spelled out season and don't have the season number on the final file name:

TV\Lost\Season 04\01 The Beginning of the End.avi
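Both folder layouts posted so far (show\season\file) could be read from the path itself. A rough sketch, assuming the show is always two levels above the file and the season folder is either a bare number or "Season NN"; the function name and returned keys are made up for illustration.

```python
import re
from pathlib import PureWindowsPath

# "09" or "Season 04"
SEASON_DIR = re.compile(r"^(?:Season\s*)?(\d+)$", re.IGNORECASE)
# "01 Episode name" or "S09E08 Episode name"
LEADING_EPISODE = re.compile(r"^(?:[Ss]\d+[Ee])?(\d+)\s+(.+)$")


def parse_from_folders(path):
    """Read show, season, and episode from a show\\season\\file layout."""
    p = PureWindowsPath(path)
    if len(p.parts) < 3:
        return None
    show, season_dir = p.parts[-3], p.parts[-2]
    season = SEASON_DIR.match(season_dir)
    episode = LEADING_EPISODE.match(p.stem)
    if not (season and episode):
        return None
    return {
        "title": show,
        "season": int(season.group(1)),
        "episode": int(episode.group(1)),
        "name": episode.group(2),
    }


print(parse_from_folders(r"TV\Lost\Season 04\01 The Beginning of the End.avi"))
print(parse_from_folders(r"TV\CSI\09\S09E08 Say Uncle.mp4"))
```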
  #5  
Old 01-04-2009, 07:23 PM
evilpenguin evilpenguin is offline
SageTVaholic
 
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Quote:
Originally Posted by dvd_maniac View Post
I have a folder for each show I archive, then a season folder for each season, and finally the episode.
Ex:
TV\CSI\09\S09E08 Say Uncle.mp4
That's very doable; it's what I use too. I think I'll have to restrict it to files under a \TV\ subfolder for sanity's sake.
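The \TV\ restriction could be as simple as looking for a folder named TV anywhere in the path and only interpreting what sits below it. A sketch only; the function name, the `root_name` parameter, and the drive letters in the samples are assumptions.

```python
from pathlib import PureWindowsPath


def parts_below_tv(path, root_name="TV"):
    """Return the path components below the first folder named root_name, or None."""
    parts = PureWindowsPath(path).parts
    lowered = [part.lower() for part in parts]
    if root_name.lower() not in lowered:
        return None  # not under a TV folder, so do not try folder-based parsing
    return parts[lowered.index(root_name.lower()) + 1:]


print(parts_below_tv(r"D:\Media\TV\CSI\09\S09E08 Say Uncle.mp4"))
print(parts_below_tv(r"D:\Movies\Alien.avi"))
```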

Quote:
Originally Posted by stuckless View Post
Which site(s) are you scraping? I'm currently working on TV scraping support in the batch metadata tools. Maybe we can share code/algorithms? Currently, for SxE type variants, I'm using the following regexp:
Code:
".*[Ss]*0*([0-9]+)\\.*[EeXx]\\s*0*([0-9]+).*"
Sorry, didn't mean to step on your toes! I'm working more on a tool you can use to create scrapers than on a scraper for a specific site. I'm shooting for something like XBMC's system, where all of the logic lives in profile files so it can be expanded indefinitely without recompiling. I also have the following extra goals...

* Be able to capture all episode data for entire series in one call into a single file (Think Fandata).
* Have the output format be controlled by profile files so it can be configured to write into any format
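To make the profile idea concrete, here is a purely hypothetical sketch of an output format driven by data instead of code. The profile layout (a JSON object with "separator" and "fields") is invented for illustration and is not the tool's actual profile format.

```python
import json

# Hypothetical profile; in a real tool this text would live in its own file
PROFILE_TEXT = """
{
  "separator": "~~~~~",
  "fields": [
    ["Episode", "{season} - {episode}"],
    ["Title", "{title}"]
  ]
}
"""


def render(record, profile):
    """Format one episode record using the label/template pairs from the profile."""
    lines = [f"{label}\t: {template.format(**record)}" for label, template in profile["fields"]]
    return "\n".join([profile["separator"], *lines])


profile = json.loads(PROFILE_TEXT)
print(render({"season": 1, "episode": 1, "title": "Pilot (1)"}, profile))
```

Changing the output then means editing the profile text, not the program.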

For starters, to prove out my system, I'm working on "theTVDB" and I've pretty much nailed it...

Code:
Show		: Lost
theTVDB ID	: 73739
Description	: After their plane, Oceanic Air flight 815, tore apart whilst thousands of miles off course, the survivors find themselves on a mysterious deserted island where they soon find out they are not alone.
First Aired	: September 22, 2004
Airs On		: Thursday
Runtime		: 45 minutes
Network		: ABC
Genre		: Action and Adventure
Actors		: Matthew Fox(http://www.theTVDB.com/banners/_cache/actors/27747.jpg)||Evangeline Lilly(http://www.theTVDB.com/banners/_cache/actors/27741.jpg)||Naveen Andrews(http://www.theTVDB.com/banners/_cache/actors/27749.jpg)||
Banners		: http://www.theTVDB.com/banners/graphical/73739-g4.jpg||http://www.theTVDB.com/banners/graphical/73739-g.jpg||http://www.theTVDB.com/banners/graphical/73739-g6.jpg||http://www.theTVDB.com/banners/graphical/24313-g2.jpg||http://www.theTVDB.com/banners/graphical/73739-g3.jpg||http://www.theTVDB.com/banners/graphical/73739-g8.jpg||http://www.theTVDB.com/banners/graphical/73739-g7.jpg||http://www.theTVDB.com/banners/graphical/73739-g5.jpg||http://www.theTVDB.com/banners/graphical/24313-g.jpg||http://www.theTVDB.com/banners/blank/73739.jpg||http://www.theTVDB.com/banners/graphical/73739-g2.jpg||http://www.theTVDB.com/banners/graphical/73739-g9.jpg||
Posters		: http://www.theTVDB.com/banners/posters/73739-2.jpg||http://www.theTVDB.com/banners/posters/73739-1.jpg||http://www.theTVDB.com/banners/posters/73739-5.jpg||http://www.theTVDB.com/banners/posters/73739-6.jpg||http://www.theTVDB.com/banners/posters/73739-4.jpg||http://www.theTVDB.com/banners/posters/73739-3.jpg||
Fanart		: http://www.theTVDB.com/banners/fanart/original/73739-27.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-27.jpg||http://www.theTVDB.com/banners/fanart/original/73739-15.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-15.jpg||http://www.theTVDB.com/banners/fanart/original/73739-26.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-26.jpg||http://www.theTVDB.com/banners/fanart/original/73739-1.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-1.jpg||http://www.theTVDB.com/banners/fanart/original/73739-22.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-22.jpg||http://www.theTVDB.com/banners/fanart/original/73739-17.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-17.jpg||http://www.theTVDB.com/banners/fanart/original/73739-19.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-19.jpg||http://www.theTVDB.com/banners/fanart/original/73739-3.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-3.jpg||http://www.theTVDB.com/banners/fanart/original/73739-12.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-12.jpg||http://www.theTVDB.com/banners/fanart/original/73739-9.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-9.jpg||http://www.theTVDB.com/banners/fanart/original/73739-16.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-16.jpg||http://www.theTVDB.com/banners/fanart/original/73739-7.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-7.jpg||http://www.theTVDB.com/banners/fanart/original/73739-2.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-2.jpg||http://www.theTVDB.com/banners/fanart/original/73739-23.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-23.jpg||http://www.theTVDB.com/banners/fanart/original/73739-4.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-4.jpg||http://www.theTVDB.com/banners/fanart/original/73739-25.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-25.jpg||http://www.theTVDB.com/banners/fanart/original/73739-14.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-14.jpg||http://www.theTVDB.com/banners/fanart/original/73739-11.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-11.jpg||http://www.theTVDB.com/banners/fanart/original/73739-6.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-6.jpg||http://www.theTVDB.com/banners/fanart/original/73739-8.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-8.jpg||http://www.theTVDB.com/banners/fanart/original/73739-10.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-10.jpg||http://www.theTVDB.com/banners/fanart/original/73739-18.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-18.jpg||http://www.theTVDB.com/banners/fanart/original/73739-5.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-5.jpg||http://www.theTVDB.com/banners/fanart/original/73739-24.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-24.jpg||http://www.theTVDB.com/banners/fanart/original/73739-29.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-29.jpg||http://www.theTVDB.com/banners/fanart/original/73739-20.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-20.jpg||http://www.theTVDB.com/banners/fanart/original/73739-28.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-28.jpg||http://www.theTVDB.com/banners/fanart/original/73739-21.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-21.jpg||http://www.theTVDB.com/banners/fanart/original/73739-30.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-30.jpg||
~~~~~~~~~~~~~~~~~~~~~~~~~
Episode		: 1 - 1
Title		: Pilot (1)
First Aired	: 2004-09-22
Guest Stars	: Greg Grunberg| John Dixon| Michelle Arthur|
Directors	: J.J. Abrams
Writers		: J.J. Abrams, Damon Lindelof
Description	: Stripped of everything, the 48 survivors scavenge what they can from the plane for their survival. Some panic. Some pin their hopes on rescue. A few find inner strength they never knew they had-like Kate who, with no medical training, suddenly finds herself suturing the doctor's wounds. The band of friends, family, enemies and strangers must work together against the cruel weather and harsh terrain. But the intense howls of  mysterious creatures stalking the jungle fill them all with fear. Fortunately, thanks to the calm leadership of quick-thinking Jack and level-headed Kate, they have hope. But even heroes have secrets, as the survivors will come to learn.
~~~~~~~~~~~~~~~~~~~~~~~~~
Episode		: 1 - 2
Title		: Pilot (2)
First Aired	: 2004-09-29
Guest Stars	: John Dixon| Michelle Arthur| George Alan|
Directors	: J.J. Abrams
...
...
So we should definitely talk, because I think we can work together. Care to pop into a new thread in the customization forum so I can throw out some gory technical details?
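For anyone who wants to consume a dump like the one above, a small reader sketch. It assumes what the sample shows: one "Label : value" pair per line, records separated by a line of tildes, and list values delimited with "|" or "||". The function names are made up, and lines without a ": " separator are skipped.

```python
def parse_dump(text):
    """Split the dump into one dict per record (series block first, then episodes)."""
    records, current = [], {}
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and set(stripped) == {"~"}:
            # A line made only of tildes closes the current record
            records.append(current)
            current = {}
            continue
        key, sep, value = line.partition(": ")
        if sep:
            current[key.strip()] = value.strip()
    records.append(current)
    return records


def split_list(value):
    """Turn 'a| b| c|' or 'a||b||' into a clean list."""
    return [item.strip() for item in value.replace("||", "|").split("|") if item.strip()]


sample = "Show\t\t: Lost\ntheTVDB ID\t: 73739\n~~~~~\nEpisode\t\t: 1 - 1\nGuest Stars\t: Greg Grunberg| John Dixon|\n"
for record in parse_dump(sample):
    print(record)
```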
  #6  
Old 01-04-2009, 08:19 PM
stuckless stuckless is offline
SageTVaholic
 
Join Date: Oct 2007
Location: London, Ontario, Canada
Posts: 9,713
Actually, you are not stepping on my toes. I was just doing it out of necessity, and I haven't done much along those lines yet. I'll probably just wait and see what you produce, since you are way further ahead than I am.
  #7  
Old 01-05-2009, 06:48 PM
evilpenguin evilpenguin is offline
SageTVaholic
 
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
How about some music naming conventions? I don't have any MP3s, so people will have to help me out on this one.
  #8  
Old 01-05-2009, 06:57 PM
QueOnda QueOnda is offline
Sage Icon
 
Join Date: Jan 2008
Posts: 1,093
Quote:
Originally Posted by evilpenguin View Post
How about some music naming conventions? I don't have any MP3s, so people will have to help me out on this one.
I keep my MP3s in a central location. My file names look like this:

Artist - Album - Track# - song name.mp3

The hyphens and spaces are part of the name.

Sometimes they are in their own folder: Artist folder -> Album folder

Some of my songs don't have a track# or album, so those are just Artist - song name.mp3
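A sketch of how the two layouts above could be told apart by counting the " - " separated fields. The function name and returned keys are assumptions, and any artist, album, or song name that itself contains " - " would be misread.

```python
from pathlib import PurePath


def parse_music_name(filename):
    """Split 'Artist - Album - Track# - song name.mp3' or 'Artist - song name.mp3'."""
    fields = [field.strip() for field in PurePath(filename).stem.split(" - ")]
    if len(fields) == 4:
        artist, album, track, title = fields
        return {"artist": artist, "album": album, "track": track, "title": title}
    if len(fields) == 2:
        # Short form: no album or track number available from the name
        return {"artist": fields[0], "album": None, "track": None, "title": fields[1]}
    return None  # ambiguous, better to fall back to the tags


print(parse_music_name("Artist - Album - 03 - Song Name.mp3"))
print(parse_music_name("Artist - Song Name.mp3"))
```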

There are other naming schemes I use, but they are minor. Most programs let you rename files to a particular naming convention. I don't know whether MediaMonkey can.
  #9  
Old 01-05-2009, 07:03 PM
evilpenguin evilpenguin is offline
SageTVaholic
 
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Quote:
Originally Posted by QueOnda View Post
Some of my songs don't have a track# or album, so those are just Artist - song name.mp3
This is the only one that would be a problem if I were doing it strictly by file name. Are most of your songs ID3 tagged with at least artist/album?
  #10  
Old 01-06-2009, 09:38 AM
TallMomof2 TallMomof2 is offline
Sage Aficionado
 
Join Date: Aug 2008
Posts: 258
My mp3s have the following format for the file name:

track#-track name

May or may not have spaces between track# "-" track name. Depends on the source. I rip mp3s without spaces and Amazon mp3s use spaces. All are ID3 tagged with:
Track Name
Artist
Album Title
Year
Track Number
Duration
Type
Bit Rate
Protected (Y/N)
Size
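The track#-track name file format above, with or without spaces around the hyphen, fits a short pattern. A sketch assuming an optional .mp3 extension; the function name and sample names are made up.

```python
import re

# "01-Track Name.mp3" and "01 - Track Name.mp3" should both match
TRACK_NAME = re.compile(r"^(\d+)\s*-\s*(.+?)(?:\.mp3)?$", re.IGNORECASE)


def parse_track_name(filename):
    """Return (track number, track name), or None if the name has no leading number."""
    match = TRACK_NAME.match(filename)
    return (int(match.group(1)), match.group(2)) if match else None


print(parse_track_name("01-Track Name.mp3"))
print(parse_track_name("01 - Track Name.mp3"))
```

Artist and album are not in these names at all, so they would still have to come from the tags or the folder structure.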
__________________
Server: Windows 8x64, ASUS P8 H67-M Pro Micro ATX, Core i5, 8 RAM, 14TB running latest Sage

Clients: HD200, HD300, Win7 Desktop

Capture Devices: Hauppauge Colossus & 1 HDHR, TV Service: Verizon FIOS
  #11  
Old 01-06-2009, 03:23 PM
popechild popechild is offline
Sage Expert
 
Join Date: May 2006
Posts: 725
Quote:
Originally Posted by evilpenguin View Post
How about some music naming conventions? I don't have any MP3s, so people will have to help me out on this one.
I'd suggest looking to see how iTunes organizes and names songs when you use it to organize your music. If there's going to be a method used by more people than most, that's bound to be it, simply because so many people use iTunes to organize their music. (Though maybe iTunes lets you change the way it organizes it, I'm not sure about that.) I let iTunes do all the naming and organizing into folders, then I just point Sage at the parent folder and let it find it all.
  #12  
Old 01-06-2009, 03:26 PM
evilpenguin evilpenguin is offline
SageTVaholic
 
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Is storing your songs without the artist/album in the file name pretty common? It's sounding like my best bet is to just read the ID3 tags for music.
  #13  
Old 01-06-2009, 04:13 PM
QueOnda QueOnda is offline
Sage Icon
 
Join Date: Jan 2008
Posts: 1,093
Quote:
Originally Posted by evilpenguin View Post
This is the only one that would be a problem if I were doing it strictly by file name. Are most of your songs ID3 tagged with at least artist/album?
Yes, almost all of mine are ID3 tagged. Can you read the ID3 tag info? The other issue would be music files that are NOT MP3s. I don't know what else Sage is able to play. Do OGG, FLAC, and WAV files have ID3 tag info?

I label my music files this way so I can sort and browse what I have in a file browser.

I confirmed MediaMonkey can rename files. The feature is called auto-organize files (CTRL+R). It's just an option if people have to name their files a particular way.
  #14  
Old 01-06-2009, 04:35 PM
Brent Brent is offline
SageTVaholic
 
Join Date: May 2006
Location: KC, Missouri
Posts: 3,695
Your best bet will be using the ID3 tags for music. There are so many different combinations of naming schemes for music files.
  #15  
Old 01-06-2009, 04:41 PM
evilpenguin evilpenguin is offline
SageTVaholic
 
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Quote:
Originally Posted by Brent View Post
Your best bet will be using the ID3 tags for music. There are so many different combinations of naming schemes for music files.
Yeah, it's sure sounding like that. Oh well, not a big deal; Perl has an ID3 tag parser I should be able to use.
  #16  
Old 01-08-2009, 12:41 PM
MattHelm MattHelm is offline
Sage Icon
 
Join Date: Jun 2005
Location: Chicago, IL
Posts: 1,209
Quote:
Originally Posted by QueOnda View Post
Do OGG, FLAC, and WAV files have ID3 tag info?
I know FLAC does NOT use ID3 but its own format; SageTV does understand whatever FLAC uses.
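Since the tag format depends on the container, a scraper could sniff the first bytes of a file before deciding which tag reader to try. A minimal sketch based on the well-known file signatures ("fLaC" for FLAC, "OggS" for Ogg, "RIFF" plus "WAVE" for WAV, "ID3" for a leading ID3v2 tag); the function name and return labels are made up. FLAC and Ogg Vorbis normally carry their metadata as Vorbis comments, not ID3.

```python
def sniff_audio_container(header: bytes) -> str:
    """Guess the container from the first bytes of a file (12 bytes is enough here)."""
    if header.startswith(b"fLaC"):
        return "flac"
    if header.startswith(b"OggS"):
        return "ogg"
    if header.startswith(b"RIFF") and header[8:12] == b"WAVE":
        return "wav"
    if header.startswith(b"ID3"):
        return "id3v2"  # an ID3v2 tag leads the file, typically an MP3
    return "unknown"


# Usage sketch: pass the first 12 bytes read from the file opened in binary mode
print(sniff_audio_container(b"fLaC\x00\x00\x00\x22"))
```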
__________________
Server #1= AMD A10-5800, 8G RAM, F2A85-M PRO, 12TB, HDHomerun Prime, HDHR, Colossus (Playback - HD-200)
Server #2= AMD X2 3800+, 2G RAM, M2NPV-VM, 2TB, 3x HDHR OTA (Playback - HD-200)
  #17  
Old 01-08-2009, 12:42 PM
Slipshod Slipshod is offline
Sage Aficionado
 
Join Date: Feb 2008
Location: San Francisco Bay Area
Posts: 474
Quote:
Originally Posted by evilpenguin View Post
Yeah, it's sure sounding like that. Oh well, not a big deal; Perl has an ID3 tag parser I should be able to use.
Yeah, definitely parse ID3 tags - I do artist and album as directories, with track# and title as filenames. You'll need to think about which versions to support and which takes precedence. Some of my old stuff has both v1 and v2 tags on it, or just v1.
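To illustrate the v1/v2 question, a sketch that reads an ID3v1 tag by hand (the fixed 128-byte block at the end of the file that starts with "TAG") and merges it with v2 values. Preferring v2 over v1 is an assumption, not a rule from this thread, and the v2 values are expected to come from a real parser; only the "ID3" signature check is shown here.

```python
def read_id3v1(data: bytes):
    """Parse the 128-byte ID3v1 block at the end of the file, or return None."""
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    tag = data[-128:]

    def text(raw: bytes) -> str:
        # Fields are fixed width and padded with NUL bytes or spaces
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    info = {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }
    # ID3v1.1: a zero byte at offset 125 followed by the track number
    if tag[125] == 0 and tag[126] != 0:
        info["track"] = tag[126]
    return info


def has_id3v2(data: bytes) -> bool:
    """An ID3v2 tag, when present, starts the file with 'ID3'."""
    return data[:3] == b"ID3"


def merge_tags(v1, v2):
    """Assumed precedence: non-empty v2 values win, v1 fills the gaps."""
    merged = dict(v1 or {})
    merged.update({key: value for key, value in (v2 or {}).items() if value})
    return merged
```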
__________________
SageTV V7 (WHS), Diamond UI
Server: WHS with Xeon X3350, 4GB ECC, ASUS P5BV-C/4L, recording into a 6.6TB Drive pool
Tuners: 4 (2x HDHR)
Clients: 2x HD300, 1x HD200 Extenders, 1x Placeshifter
2x Roku XD
  #18  
Old 01-08-2009, 04:43 PM
Skirge01 Skirge01 is offline
SageTVaholic
 
Join Date: Jun 2007
Location: New Jersey
Posts: 2,599
My ID3 tags are 99% accurate and complete, so that's definitely the best method. But, I also organize with a folder structure as follows:

Artist\Album\[Artist] - [Title].mp3

or

Genre\Artist\Album\[Artist] - [Title].mp3

Sometimes the track # is part of the filename, but never part of the folder structure. It's primarily live albums where I worry about track sequence at all.

The issue with parsing folder names versus ID3 tags is that folder names disallow some characters that are used in actual album, artist, or song names.
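The folder-name problem can be shown in a few lines. On Windows the characters < > : " / \ | ? * cannot appear in a file or folder name, so a name taken from a tag has to be altered before it becomes a folder, and the original cannot be recovered from the folder alone. A sketch; the function name and the underscore replacement are assumptions.

```python
# Characters Windows does not allow in file or folder names
INVALID_CHARS = '<>:"/\\|?*'


def folder_safe(name, replacement="_"):
    """Replace disallowed characters and drop trailing spaces and dots."""
    cleaned = "".join(replacement if ch in INVALID_CHARS else ch for ch in name)
    return cleaned.rstrip(" .")


# The folder form no longer matches the name stored in the tag
print(folder_safe("AC/DC"))
print(folder_safe("Who Are You?"))
```

Two different tag values can also collapse to the same folder name, which is another reason to trust the tags first.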
All times are GMT -6.