#1
Looking for Title/Season/Episode naming formats to support for metadata scraper
I'm working on a scraping tool and I'm hoping to accommodate as many naming conventions as I can. Basically, I just need to know how people's files are named so I can figure out how to capture the show title and season/episode # so I can track down the metadata on the internet.
Right now I've got 3 working...
Title-Episode-######-0.mpg - SageTV format, impossible to parse directly, but I can use the webserver to grab the real name using the airing number.
title S##E## - episode.avi
title ##x## - episode.avi
If you have one that looks different, if you use the folder structure for naming, or anything else that is a little crazy, please post it so I can do my best to support it. Thanks!
Last edited by evilpenguin; 01-04-2009 at 06:38 PM.
#2
Which site(s) are you scraping? I'm currently working on tv scraping support in the batch metadata tools. Maybe we can share code/algorithms? Currently, for SxE type variants, I'm using the following regexp
Code:
".*[Ss]*0*([0-9]+)\\.*[EeXx]\\s*0*([0-9]+).*"
__________________
Batch Metadata Tools (User Guides) - SageTV App (Android) - SageTV Plex Channel - My Other Android Apps - sagex-api wrappers - Google+ - Phoenix Renamer Downloads SageTV V9 | Android MiniClient |
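One thing worth checking with the SxE pattern quoted in post #2: its leading `.*` is greedy, so for a two-digit season such as S10E01 the season group ends up holding only the last digit (verified here with Python's `re` module after removing the Java string escaping). The sketch below is one possible alternative, not the Batch Metadata Tools code; the name `parse_sxe` and the title clean-up rules are assumptions made for illustration.

```python
import re

# Pattern from post #2 with the Java string escaping removed.
ORIGINAL = re.compile(r".*[Ss]*0*([0-9]+)\.*[EeXx]\s*0*([0-9]+).*")

# Alternative sketch: a lazy title group, then either S##E## or ##x##.
SXE = re.compile(
    r"""^(?P<title>.*?)[\s._-]*                               # show title, lazy
        (?:[Ss](?P<s1>\d{1,2})[\s._-]*[Ee](?P<e1>\d{1,3})     # S04E01 style
          |(?<!\d)(?P<s2>\d{1,2})[Xx](?P<e2>\d{1,3}))         # 4x01 style
        (?!\d)""",
    re.VERBOSE,
)


def parse_sxe(filename):
    """Return (title, season, episode), or None if no marker is found."""
    m = SXE.search(filename)
    if m is None:
        return None
    season = m.group("s1") or m.group("s2")
    episode = m.group("e1") or m.group("e2")
    # Treat dots and underscores in the title as spaces (an assumption).
    title = re.sub(r"[._]+", " ", m.group("title")).strip(" -")
    return title, int(season), int(episode)


if __name__ == "__main__":
    name = "Lost S10E01 - Pilot.avi"
    print(ORIGINAL.match(name).groups())  # season group holds only "0"
    print(parse_sxe(name))
```

Converting the captured digits with `int()` removes leading zeros, so the pattern does not need to strip them itself.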
#3
I have a folder for each show I archive, then a season folder for each season, and finally the episode.
Ex: TV\CSI\09\S09E08 Say Uncle.mp4
__________________
If this doesn't work right, Then: "I'm going to blow up the Earth!" |
#4
Mine are in a format similar to dvd_maniac's, but I spelled out season and don't have the season number on the final file name:
TV\Lost\Season 04\01 The Beginning of the End.avi
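For folder-based layouts like the two above, the show and season can be read from the path components instead of the file name. This is only a sketch under the assumption that the layout is always Show\Season\file; `parse_show_path` and both patterns are hypothetical names, not part of any existing tool.

```python
import re
from pathlib import PureWindowsPath

# Season folder: "09" or "Season 04".
SEASON_DIR = re.compile(r"^(?:season\s*)?(\d{1,2})$", re.IGNORECASE)
# Episode file: "S09E08 Say Uncle" or "01 The Beginning of the End".
EPISODE_FILE = re.compile(r"^(?:[Ss]\d{1,2}[Ee])?(\d{1,3})\s+(.*)$")


def parse_show_path(path):
    """Read (show, season, episode, title) from a Show/Season/file layout."""
    p = PureWindowsPath(path)
    if len(p.parts) < 3:
        return None
    show, season_dir = p.parts[-3], p.parts[-2]
    season = SEASON_DIR.match(season_dir)
    episode = EPISODE_FILE.match(p.stem)
    if not (season and episode):
        return None
    return show, int(season.group(1)), int(episode.group(1)), episode.group(2)


if __name__ == "__main__":
    print(parse_show_path(r"TV\CSI\09\S09E08 Say Uncle.mp4"))
    print(parse_show_path(r"TV\Lost\Season 04\01 The Beginning of the End.avi"))
```

`PureWindowsPath` splits on backslashes on any operating system, which is why it is used here rather than `os.path`.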
#5
* Be able to capture all episode data for entire series in one call into a single file (Think Fandata).
* Have the output format be controlled by profile files so it can be configured to write into any format
For starters, to prove out my system, I'm working on "theTVDB" and I've pretty much nailed it...
Code:
Show : Lost
theTVDB ID : 73739
Description : After their plane, Oceanic Air flight 815, tore apart whilst thousands of miles off course, the survivors find themselves on a mysterious deserted island where they soon find out they are not alone.
First Aired : September 22, 2004
Airs On : Thursday
Runtime : 45 minutes
Network : ABC
Genre : Action and Adventure
Actors : Matthew Fox(http://www.theTVDB.com/banners/_cache/actors/27747.jpg)||Evangeline Lilly(http://www.theTVDB.com/banners/_cache/actors/27741.jpg)||Naveen Andrews(http://www.theTVDB.com/banners/_cache/actors/27749.jpg)||
Banners : http://www.theTVDB.com/banners/graphical/73739-g4.jpg||http://www.theTVDB.com/banners/graphical/73739-g.jpg||http://www.theTVDB.com/banners/graphical/73739-g6.jpg||http://www.theTVDB.com/banners/graphical/24313-g2.jpg||http://www.theTVDB.com/banners/graphical/73739-g3.jpg||http://www.theTVDB.com/banners/graphical/73739-g8.jpg||http://www.theTVDB.com/banners/graphical/73739-g7.jpg||http://www.theTVDB.com/banners/graphical/73739-g5.jpg||http://www.theTVDB.com/banners/graphical/24313-g.jpg||http://www.theTVDB.com/banners/blank/73739.jpg||http://www.theTVDB.com/banners/graphical/73739-g2.jpg||http://www.theTVDB.com/banners/graphical/73739-g9.jpg||
Posters : http://www.theTVDB.com/banners/posters/73739-2.jpg||http://www.theTVDB.com/banners/posters/73739-1.jpg||http://www.theTVDB.com/banners/posters/73739-5.jpg||http://www.theTVDB.com/banners/posters/73739-6.jpg||http://www.theTVDB.com/banners/posters/73739-4.jpg||http://www.theTVDB.com/banners/posters/73739-3.jpg||
Fanart : http://www.theTVDB.com/banners/fanart/original/73739-27.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-27.jpg||http://www.theTVDB.com/banners/fanart/original/73739-15.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-15.jpg||http://www.theTVDB.com/banners/fanart/original/73739-26.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-26.jpg||http://www.theTVDB.com/banners/fanart/original/73739-1.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-1.jpg||http://www.theTVDB.com/banners/fanart/original/73739-22.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-22.jpg||http://www.theTVDB.com/banners/fanart/original/73739-17.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-17.jpg||http://www.theTVDB.com/banners/fanart/original/73739-19.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-19.jpg||http://www.theTVDB.com/banners/fanart/original/73739-3.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-3.jpg||http://www.theTVDB.com/banners/fanart/original/73739-12.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-12.jpg||http://www.theTVDB.com/banners/fanart/original/73739-9.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-9.jpg||http://www.theTVDB.com/banners/fanart/original/73739-16.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-16.jpg||http://www.theTVDB.com/banners/fanart/original/73739-7.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-7.jpg||http://www.theTVDB.com/banners/fanart/original/73739-2.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-2.jpg||http://www.theTVDB.com/banners/fanart/original/73739-23.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-23.jpg||http://www.theTVDB.com/banners/fanart/original/73739-4.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-4.jpg||http://www.theTVDB.com/banners/fanart/original/73739-25.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-25.jpg||http://www.theTVDB.com/banners/fanart/original/73739-14.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-14.jpg||http://www.theTVDB.com/banners/fanart/original/73739-11.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-11.jpg||http://www.theTVDB.com/banners/fanart/original/73739-6.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-6.jpg||http://www.theTVDB.com/banners/fanart/original/73739-8.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-8.jpg||http://www.theTVDB.com/banners/fanart/original/73739-10.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-10.jpg||http://www.theTVDB.com/banners/fanart/original/73739-18.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-18.jpg||http://www.theTVDB.com/banners/fanart/original/73739-5.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-5.jpg||http://www.theTVDB.com/banners/fanart/original/73739-24.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-24.jpg||http://www.theTVDB.com/banners/fanart/original/73739-29.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-29.jpg||http://www.theTVDB.com/banners/fanart/original/73739-20.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-20.jpg||http://www.theTVDB.com/banners/fanart/original/73739-28.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-28.jpg||http://www.theTVDB.com/banners/fanart/original/73739-21.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-21.jpg||http://www.theTVDB.com/banners/fanart/original/73739-30.jpg||http://www.theTVDB.com/banners/fanart/vignette/73739-30.jpg||
~~~~~~~~~~~~~~~~~~~~~~~~~
Episode : 1 - 1
Title : Pilot (1)
First Aired : 2004-09-22
Guest Stars : Greg Grunberg| John Dixon| Michelle Arthur|
Directors : J.J. Abrams
Writers : J.J. Abrams, Damon Lindelof
Description : Stripped of everything, the 48 survivors scavenge what they can from the plane for their survival. Some panic. Some pin their hopes on rescue. A few find inner strength they never knew they had-like Kate who, with no medical training, suddenly finds herself suturing the doctor's wounds. The band of friends, family, enemies and strangers must work together against the cruel weather and harsh terrain. But the intense howls of mysterious creatures stalking the jungle fill them all with fear. Fortunately, thanks to the calm leadership of quick-thinking Jack and level-headed Kate, they have hope. But even heroes have secrets, as the survivors will come to learn.
~~~~~~~~~~~~~~~~~~~~~~~~~
Episode : 1 - 2
Title : Pilot (2)
First Aired : 2004-09-29
Guest Stars : John Dixon| Michelle Arthur| George Alan|
Directors : J.J. Abrams
...
...
#6
Actually you are not stepping on my toes. I was just doing it out of necessity, and I haven't started much yet along those lines. I'll probably just wait and see what you produce, since you are way further ahead than I am.
#7
How about some music naming conventions? I don't have any mp3s, so people will have to help me out on this one.
#8
Artist - Album - Track# - song name.mp3
The hyphens and spaces are part of the name. Sometimes the files are in their own folders: Artist folder -> Album folder. Some songs I have don't have a track# or album, so then it's only:
Artist - song name.mp3
There are other naming configurations I use, but they are minor. Most programs allow you to rename to a certain naming convention. I don't know whether MediaMonkey does.
#9
This is the only one that would be a problem if I were doing it strictly by file name. Are most of your songs ID3 tagged with at least artist/album?
#10
My mp3s have the following format for the file name:
track#-track name
May or may not have spaces between track# "-" track name. Depends on the source. I rip mp3s without spaces and Amazon mp3s use spaces. All are ID3 tagged with:
Track Name
Artist
Album Title
Year
Track Number
Duration
Type
Bit Rate
Protected (Y/N)
Size
__________________
Server: Windows 8x64, ASUS P8 H67-M Pro Micro ATX, Core i5, 8 RAM, 14TB running latest Sage Clients: HD200, HD300, Win7 Desktop Capture Devices: Hauppauge Colossus & 1 HDHR, TV Service: Verizon FIOS |
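The music file names described so far in the thread ("Artist - Album - Track# - song name", "Artist - song name", and "track#-track name" with or without spaces) could be guessed at roughly as below. This is a hedged sketch: `parse_music_filename` is a hypothetical helper, and it is deliberately naive, since an artist whose name is all digits would be mistaken for a track number. That ambiguity is one reason reading tags may be more reliable than parsing names.

```python
import re
from pathlib import PureWindowsPath


def parse_music_filename(filename):
    """Guess artist/album/track/title from a file name; unknown parts are None."""
    stem = PureWindowsPath(filename).stem

    # "07-Song" or "07 - Song": leading track number, spaces optional.
    m = re.match(r"^(\d{1,3})\s*-\s*(.+)$", stem)
    if m:
        return {"artist": None, "album": None,
                "track": int(m.group(1)), "title": m.group(2)}

    parts = [part.strip() for part in stem.split(" - ")]
    # "Artist - Album - 07 - Song"
    if len(parts) == 4 and parts[2].isdigit():
        artist, album, track, title = parts
        return {"artist": artist, "album": album,
                "track": int(track), "title": title}
    # "Artist - Song"
    if len(parts) == 2:
        return {"artist": parts[0], "album": None,
                "track": None, "title": parts[1]}
    return None


if __name__ == "__main__":
    for name in ("Artist - Album - 07 - Song Name.mp3",
                 "Artist - Song Name.mp3",
                 "07-Song Name.mp3"):
        print(parse_music_filename(name))
```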
#11
I'd suggest looking to see how iTunes organizes and names songs when you use it to organize your music. If there's going to be a method used by more people than most, that's bound to be it, simply because so many people use iTunes to organize their music. (Though maybe iTunes lets you change the way it organizes it, I'm not sure about that.) I let iTunes do all the naming and organizing into folders, then I just point Sage at the parent folder and let it find it all.
#12
Is storing your songs without the artist/album in the file name pretty common? It's sounding like my best bet is to just read the ID3 tags for music.
#13
The reason I have my music files labeled this way is so I can sort and look at what I have through a file browser. I confirmed MediaMonkey is able to rename files; the feature is called auto-organize files (CTRL+R). Just an option if people have to name the files a particular way.
#14
Your best bet will be using the ID3 tags for music. There are so many different combinations of naming schemes for music files.
#15
Yeah, it's sure sounding like that. Oh well, not a big deal, Perl has an ID3 tag parser I should be able to use.
#16
I know FLAC does NOT use ID3 but its own tag format; SageTV does understand whatever FLAC uses.
__________________
Server #1= AMD A10-5800, 8G RAM, F2A85-M PRO, 12TB, HDHomerun Prime, HDHR, Colossus (Playback - HD-200) Server #2= AMD X2 3800+, 2G RAM, M2NPV-VM, 2TB, 3x HDHR OTA (Playback - HD-200) |
#17
Yeah, definitely parse ID3 tags - I do artist and album as directories, with track# and title as filenames. You'll need to think about what version to support, and what has precedence. Some of my old stuff has both v1 and v2 tagging on it, or just v1.
__________________
SageTV V7 (WHS), Diamond UI Server: WHS with Xeon X3350, 4GB ECC, ASUS P5BV-C/4L, recording into a 6.6TB Drive pool Tuners: 4 (2x HDHR) Clients: 2x HD300, 1x HD200 Extenders, 1x Placeshifter 2x Roku XD |
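To make the v1 versus v2 question concrete: an ID3v1 tag is the last 128 bytes of the file and starts with "TAG", while an ID3v2 tag starts the file with "ID3" followed by a version byte. The sketch below only detects which tags are present and reads the fixed-width v1 fields; it does not parse v2 frames. Listing v2 ahead of v1 is an assumption about precedence (v1 text fields are cut off at 30 bytes), not something the thread settled, and the helper names are hypothetical.

```python
import struct


def _text(raw):
    # ID3v1 fields are fixed width and padded with NULs (sometimes spaces).
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def read_id3v1(data):
    """Parse an ID3v1/v1.1 tag from the last 128 bytes, or return None."""
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    title, artist, album, year, comment, _genre = struct.unpack(
        "<30s30s30s4s30sB", data[-125:])
    # ID3v1.1 stores the track number in the last byte of the comment field.
    track = comment[29] if comment[28] == 0 and comment[29] != 0 else None
    return {"title": _text(title), "artist": _text(artist),
            "album": _text(album), "year": _text(year), "track": track}


def id3v2_version(data):
    """Return the ID3v2 major version byte if a v2 header is present."""
    if len(data) >= 10 and data[:3] == b"ID3":
        return data[3]
    return None


def tag_sources(data):
    """List the tag versions found, v2 first (assumed precedence)."""
    found = []
    if id3v2_version(data) is not None:
        found.append("v2")
    if read_id3v1(data) is not None:
        found.append("v1")
    return found


if __name__ == "__main__":
    v1 = (b"TAG" + b"Pilot".ljust(30, b"\x00") + b"Artist".ljust(30, b"\x00")
          + b"Album".ljust(30, b"\x00") + b"2004"
          + b"".ljust(28, b"\x00") + b"\x00\x07" + bytes([12]))
    print(tag_sources(b"ID3\x03\x00\x00\x00\x00\x00\x00" + b"audio" + v1))
    print(read_id3v1(v1))
```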
#18
My ID3 tags are 99% accurate and complete, so that's definitely the best method. But, I also organize with a folder structure as follows:
Artist\Album\[Artist] - [Title].mp3
or
Genre\Artist\Album\[Artist] - [Title].mp3
Sometimes the track # is part of the filename, but never part of the folder structure. It's primarily live albums where I worry about track sequence at all. The issue with parsing folder names versus ID3 tags is that folder names disallow some characters that are used in actual album, artist, or song names.
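The character problem can be shown with a short sketch. Windows does not allow `< > : " / \ | ? *` in file or folder names, so an organizer has to replace them, and the original spelling cannot be recovered from the folder name afterwards. `to_folder_name` and the underscore replacement are assumptions for illustration; real organizers may substitute differently.

```python
import re

# Characters Windows does not allow in file or folder names.
ILLEGAL = re.compile(r'[<>:"/\\|?*]')


def to_folder_name(tag_value, replacement="_"):
    """Turn a tag value into a name usable as a Windows folder name."""
    cleaned = ILLEGAL.sub(replacement, tag_value)
    # Names should also not end in a dot or a space on Windows.
    return cleaned.rstrip(". ")


if __name__ == "__main__":
    for artist in ("AC/DC", "Who Are You?", "Lost"):
        print(artist, "->", to_folder_name(artist))
```

Because "AC/DC" and "AC_DC" both map to the same folder name, a scraper that reads only folder names would search for the wrong artist string, whereas the tag keeps the original.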