SageTV Community  

#1
01-04-2009, 11:16 PM
evilpenguin
SageTVaholic
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Utility: mediaScraper - metadata made easy!

Release History:
What it is:
mediaScraper is my answer to XBMC's excellent, but very confusing, scraping engine. You can use it to track down all the metadata and fanart for any TV/Movies you may have ripped/bought/downloaded/whatever.

Usage Instructions: See 2nd post

How to use metadata files:

* To make SageMC use the .my files, go into SageMC Properties -> Enhancments -> Use .my Files -> Enabled.
* To make the default STV use .properties files, read this thread.


Submitting Issues:

If you have an issue, I'll need to see the EXACT PATH/FILE name that is giving you trouble and/or (preferably) the scraper.log file that gets generated next to the .exe after every run.

Example Outputs:

.properties file for default STV:
Code:
MediaType=TV
Title=Wonderfalls 1x05 - Crime Dog
MediaTitle=Wonderfalls
Actor=Caroline Dhavernas;Tracie Thoms;Tyron Leitso;Diana Scarwid;Lee Pace;Katie Finneran;William Sadler;
ActorRoles=Jaye Tyler;Mahandra;Eric;Karen Tyler;Aaron Tyler;Sharon Tyler;Darrin Tyler;
Guest=Audrey Wasilewski;Bill Lake;Kimberly Scott;
Director=Allan Kroeker;
Writer=Krista Vernoff;
Genres=Drama;
Description=When Jaye's brother sees her talking to inanimate objects, he begins to question her state of mind; the animals help Jaye reunite her family with the housekeeper.
Rated=
UserRating=6.0
ReleaseDate=2004-07-23
MediaProviderDataID=theTVDB:78845
SeasonNumber=1
EpisodeNumber=5
EpisodeTitle=Crime Dog
.my file for SageMC:
Code:
Title=Burn Notice
overview=A spy, Michael Weston, receives a burn notice for an unstated reason, effectively firing him. He has spent the previous decade working for the government in Eastern Europe and returns to his hometown of Miami to get his life in order. Michael will stop at nothing to find out why. Shut out from his normal contacts, but still driven to right wrongs, Michael needs to stay under the radar in order to stay in the game.
TVOriginalAiringDate=2007-07-26
actors=Jeffrey Donovan, Gabrielle Anwar, Bruce Campbell, Paul Gutrecht, Guri Weinberg, Audrey Landers, Scott Michael Campbell, Joel Swetow, Hannia Guillen, Jessica Baldwin, Rangel Hernandez Martinez,
TVEpisode=Family Business
TVOverview=Michael infiltrates a family of gunrunners who are pressuring an airport supervisor and his pregnant wife. The FBI begin to put pressure on Sam to gather more information on Michael.
TVDuration=60 minutes
TVGenre=Action and Adventure
TVChannelName=USA
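For anyone who wants to read these files from a script, here is a minimal Python sketch that loads the key=value lines shown above into a dictionary. The function names `parse_metadata` and `split_list` are made up for this example, and the sketch ignores the escaping rules of real Java-style .properties files, so treat it as an illustration rather than how Sage or mediaScraper reads them.

```python
def parse_metadata(text):
    """Collect key=value lines into a dict; lines without '=' are ignored."""
    data = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        data[key.strip()] = value.strip()
    return data


def split_list(value, sep=";"):
    """Split a separator-terminated list such as 'A;B;' into ['A', 'B']."""
    return [item.strip() for item in value.split(sep) if item.strip()]


sample = """MediaType=TV
Title=Wonderfalls 1x05 - Crime Dog
Actor=Caroline Dhavernas;Tracie Thoms;
Rated=
SeasonNumber=1
EpisodeNumber=5"""

meta = parse_metadata(sample)
actors = split_list(meta["Actor"])
```

The .my example uses ", " between actor names, so `split_list(value, sep=",")` would be the matching call there.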
List of Non-Working names that I know about and intend to Fix:

Last edited by evilpenguin; 04-09-2009 at 01:49 AM.
#2
01-04-2009, 11:16 PM
evilpenguin
SageTVaholic
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Supported Naming Conventions:
mediaScraper works by looking at your file names and attempting to pull out all of the information it needs to track down the metadata. Chances are, if mediaScraper can't find the metadata for your video, the file name is too confusing for it to figure out. Below are the supported naming conventions:

Television: Requires Show Title and Season/Episode Number
Code:
Show Title (S##E##|#x##|###) randomJunk.avi
BaseFolder\Show Title\(S##E##|#x##|###) randomJunk.avi
BaseFolder\Show Title\Season #\(S##E##|#x##|###) randomJunk.avi
Movies: Requires Movie Title
Code:
Movie Title.avi
Movie Title (YEAR).avi
Movie Title (YEAR).randomJunk.avi

NOTE: This will not work on shows in Sage Recording format (ShowTitle-EpisodeName-#####-#.mpg)!!!
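To make the TV conventions above concrete, here is a rough Python sketch of how a name like `Show Title (S##E##|#x##|###)` could be picked apart. The pattern and the `parse_tv_name` helper are assumptions written for this example, not mediaScraper's actual matching rules, so real results may differ.

```python
import re

# Hypothetical pattern for illustration only.
TV_PATTERN = re.compile(
    r"""^(?P<title>.*?)[\s._-]*                          # show title, then separators
        (?:
            [Ss](?P<s1>\d{1,2})[Ee](?P<e1>\d{1,2})       # S02E01
          | (?<!\d)(?P<s2>\d{1,2})x(?P<e2>\d{1,2})       # 2x01
          | (?<!\d)(?P<s3>\d)(?P<e3>\d{2})(?!\d)         # 201, exactly three digits
        )""",
    re.VERBOSE,
)


def parse_tv_name(file_name):
    """Return (title, season, episode), or None if no TV pattern is found."""
    match = TV_PATTERN.match(file_name)
    if match is None:
        return None
    season = match.group("s1") or match.group("s2") or match.group("s3")
    episode = match.group("e1") or match.group("e2") or match.group("e3")
    # Treat periods and underscores in the title as spaces.
    title = re.sub(r"[._]+", " ", match.group("title")).strip()
    return title, int(season), int(episode)
```

Because the three-digit form must not touch other digits, a four-digit year such as `(2008)` is not read as a season/episode pair in this sketch.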



Basic Usage:

For general usage all you need to do is drag and drop videos and/or folders containing videos onto mediaScraper.exe and, by default, it will download the metadata and fanart and place them right next to the original videos.

TV:
Code:
TV\Scrubs\Scrubs 2x01 - My Overkill.avi
TV\Scrubs\Scrubs 2x01 - My Overkill.avi.properties (metadata)
TV\Scrubs\background.jpg (Fanart)
TV\Scrubs\folder.jpg (Thumbnail/Poster)
TV\Scrubs\banner.jpg (Banner)
Movies:
Code:
Movies\300.avi
Movies\300.avi.properties (metadata)
Movies\300_background.jpg (Fanart)
Movies\300.jpg (Thumbnail/Poster)
Movies (VIDEO_TS Folder):
Code:
Movies\300\VIDEO_TS
Movies\300.properties (metadata)
Movies\300\background.jpg (Fanart)
Movies\300\folder.jpg  (Thumbnail/Poster)
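As a quick way to check where output should land, the following Python sketch reproduces the TV and single-file movie layouts listed above. The helper names `tv_outputs` and `movie_outputs` are invented for this example; the paths are derived only from the listings in this post, and the VIDEO_TS case is left out.

```python
from pathlib import PureWindowsPath


def tv_outputs(video):
    """Expected files for a TV episode, following the TV listing above."""
    video = PureWindowsPath(video)
    folder = video.parent
    return {
        "metadata": str(folder / (video.name + ".properties")),
        "fanart": str(folder / "background.jpg"),
        "poster": str(folder / "folder.jpg"),
        "banner": str(folder / "banner.jpg"),
    }


def movie_outputs(video):
    """Expected files for a single-file movie, following the Movies listing above."""
    video = PureWindowsPath(video)
    folder = video.parent
    return {
        "metadata": str(folder / (video.name + ".properties")),
        "fanart": str(folder / (video.stem + "_background.jpg")),
        "poster": str(folder / (video.stem + ".jpg")),
    }
```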

folder.override:

There will be times when mediaScraper just won't be able to find a match for your file. Common problems include:
  • Actual show name contains illegal Windows characters. ex: Terminator: The Sarah Connor Chronicles.
  • Name conflicts with other TV shows. ex: The Office (US) vs. The Office (UK) or Battlestar Galactica (2003) vs. Battlestar Galactica.
In these cases, rather than going through and renaming every file to get it to match, you can create a folder.override file next to the original video(s). It is just a text file that contains the exact show title that theTVDB or IMDB is expecting for every show in that folder.

Example:
Let's say that you try to run this file through mediaScraper:
Code:
TV\Terminator The Sarah Connor Chronicles\Terminator The Sarah Connor Chronicles 1x01.avi
It would decide that it's a TV show with...
Code:
Show Title = Terminator The Sarah Connor Chronicles
Season = 1
Episode = 1
Which all looks good, but when you search for it on theTVDB it returns 0 matches.

With a little manual investigating you'll find that theTVDB will only recognize the title if it contains the ':'.
Code:
Series Title = Terminator: The Sarah Connor Chronicles
However, you can't use a ':' in a Windows file name. This is where folder.override comes in.

To fix this you'd create a folder.override in the folder with all the episodes.
Code:
TV\Terminator The Sarah Connor Chronicles\Terminator The Sarah Connor Chronicles 1x01.avi
TV\Terminator The Sarah Connor Chronicles\folder.override
and use notepad to make its contents be the exact show title that theTVDB or IMDB is expecting....
Code:
Terminator: The Sarah Connor Chronicles
Then next time mediaScraper gets a video from that folder, it'll see the folder.override file and instead of trying to figure out the Show Title, it will just read it out of that file and use that to search.

You can also place a folder.override in the parent directory if you want it to apply to all folders directly below it.
Code:
TV\Terminator The Sarah Connor Chronicles\Season 1\Terminator The Sarah Connor Chronicles 1x01.avi
TV\Terminator The Sarah Connor Chronicles\folder.override
In addition, mostly for movies, you can create a whatEverYourFileNameIs.override so that it applies to only one file.
Code:
The Dark Knight.avi
The Dark Knight.override
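The three override locations described above can be sketched as a simple lookup. This Python snippet is only an illustration: `find_override` is a made-up name, and the order it checks in (per-file override first, then the video's folder, then the parent folder) is an assumption, since the post does not say which one wins when several exist.

```python
from pathlib import Path


def find_override(video_path):
    """Return the override title for a video, or None if no override file exists."""
    video = Path(video_path)
    candidates = [
        video.with_suffix(".override"),            # e.g. The Dark Knight.override
        video.parent / "folder.override",          # next to the video
        video.parent.parent / "folder.override",   # one folder up (e.g. above "Season 1")
    ]
    for candidate in candidates:
        if candidate.is_file():
            # The file holds nothing but the exact title to search for.
            return candidate.read_text(encoding="utf-8").strip()
    return None
```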
mediaScraper.skip:

If you have a folder full of videos you know have no metadata (ex. home movies, clips, etc.), you can place a file named mediaScraper.skip in the folder and mediaScraper will ignore it and all its subfolders.
Code:
TV\Home Movies\mediaScraper.skip
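The skip behaviour amounts to pruning a directory walk. Here is a small Python sketch of that idea; `videos_to_scrape` and the extension list are assumptions made for this example, not a description of how mediaScraper itself walks folders.

```python
import os

# Assumed set of video extensions, for illustration only.
VIDEO_EXTENSIONS = {".avi", ".mpg", ".mkv"}


def videos_to_scrape(root):
    """List video files under root, ignoring folders marked with mediaScraper.skip."""
    found = []
    for folder, subfolders, files in os.walk(root):
        if "mediaScraper.skip" in files:
            subfolders[:] = []  # prune in place so os.walk does not descend further
            continue
        for name in sorted(files):
            if os.path.splitext(name)[1].lower() in VIDEO_EXTENSIONS:
                found.append(os.path.join(folder, name))
    return found
```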
Advanced Usage:

mediaScraper reads all of its options from defaults.txt, which sits right next to mediaScraper.exe. It is just a list of command line switches that will always be used.

These are the default options:
Code:
/genPropertyFile /downloadFanArt /baseFolder "TV"
In addition, you can add any of the available switches below to further customize the output.

Available Switches:
  • /genPropertyFile - Generate a .properties file.
  • /genMyFile - Generate a .my file, contains extra data for SageMC
  • /genInfoFile - Generate a .info file. These don't do anything useful; they are just a dump of all of the available metadata, some of which doesn't fit into the .my or .properties files.
  • /downloadFanArt - Download fanart/thumbnails/banners if available.
  • /baseFolder "C:\Example\Folder Name" - If you keep all your TV organized in a common base folder you can specify it here to help with scraping accuracy
  • /organizeFiles - If the mediaScraper is able to find metadata for your file it can also rename your file to match. (Will not work with VIDEO_TS folders)
    Code:
    Scrubs.201.lol.hdtv.avi -> Scrubs 2x01 - My Overkill.avi
    Code:
    The.Dark.Knight.2008.DVDrip.xor.avi -> The Dark Knight (2008).avi
  • /tvSE - This will have mediaScraper use the TV naming format S02E01 rather than 2x01. This will apply to both organizing and metadata files.
  • /organizeBaseFolder "D:\Videos" - If you specify a base folder when you have /organizeFiles set this will move the videos into a folder structure as well.
    Code:
    Scrubs.201.lol.hdtv.avi -> D:\Videos\TV\Scrubs\Season 2\Scrubs 2x01 - My Overkill.avi
    Code:
    The.Dark.Knight.2008.DVDrip.xor.avi -> D:\Videos\Movies\The Dark Knight (2008).avi
  • /4digitTV - Allow TV shows that don't use a separator between the season/episode (S07E22 -> 722) to be 4 digits long (S11E22 -> 1122). By default this is disabled to prevent a conflict with movies that have the year in the file name (2008 season/episode vs 2008 year).
  • /updateInfo - Normally mediaScraper will ignore videos that already have existing properties files. You can set this to have it run anyway.
  • (NEW) /useOriginalName - Use this if you want to use your original file name in the .properties files rather than the official name scraped from the show info.
  • (NEW) /genXMLFile - Generate an XML file that can be imported in to SageTV web server
  • (NEW) /userName "User" - User name for logging onto a SageTV web server
  • (NEW) /password "Password" - Password for logging onto a SageTV web server
  • (NEW) /sageTVServer "localhost:8080" - Host name and port number of web server
  • (NEW) /addToSageDB - When used with /genXMLFile and the web server settings, this will automatically add a show to Sage's DB using the web server's XML import function. You can use this to have your imported TV show up with your recorded TV.
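If you generate or check a defaults.txt from a script, a line of switches like the ones above can be split into options along these lines. This is a hedged Python sketch: `parse_switches` is a hypothetical helper, and the set of switches that take a value is read off the list above rather than from mediaScraper's own parser.

```python
import re

# Switches that are followed by a value, according to the list above.
VALUE_SWITCHES = {"/baseFolder", "/organizeBaseFolder", "/userName",
                  "/password", "/sageTVServer"}


def parse_switches(line):
    """Turn a line of switches into a dict; quoted values may contain spaces."""
    tokens = [quoted or bare
              for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', line)]
    options = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in VALUE_SWITCHES and i + 1 < len(tokens):
            options[token] = tokens[i + 1]  # switch with a value
            i += 2
        else:
            options[token] = True           # plain on/off switch
            i += 1
    return options
```

For the default line, `parse_switches('/genPropertyFile /downloadFanArt /baseFolder "TV"')` gives both flags set to `True` and `/baseFolder` mapped to `TV`.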

/addToSageDB:
If you pair this switch with /userName, /password, /sageTVServer, and /genXMLFile, then mediaScraper will pack the metadata into a SageTV web server XML file and use an experimental feature that will trick Sage into treating the show as if it were a SageTV recording. I've been playing around with it and it seems to be working rock solid, but I make no guarantees that this won't blow out your entire wiz.bin: Use at your own risk!

Last edited by evilpenguin; 03-03-2009 at 01:40 PM.
#3
01-05-2009, 03:52 AM
mickp
Sage Aficionado
Join Date: Oct 2006
Posts: 468
Ooooh. If Mike or Dirk support this for fan art I might just give the feature a go.

Downloading now. Thanks E.P.

Mick.
#4
01-05-2009, 04:28 AM
mickp
Sage Aficionado
Join Date: Oct 2006
Posts: 468
Initial ignorant impression #1

Hey E.P.

Just gave it a burl on a few files

Filenames that didn't work;

Code:
"c:\shares\movies\Atlantis\Season 3\Stargate.Atlantis.S03E01.WS.DSR.XviD-DIMENSION.avi"

"c:\shares\movies\Atlantis\Season 3\Stargate.Atlantis.S03E01.avi"
Filename that did work;

Code:
"c:\shares\movies\Atlantis\Season 3\Stargate Atlantis S03E01.avi"
Unfortunately I'm utter rubbish at regex so can't be of much help with detailed suggestions.

Can I suggest replacing the (.) period with a space and also, if a positive result/hit hasn't yet been found, try removing the last word of the file name (assuming . replaced with space) and giving the search another go. Then removing another, and another?

I realise that I could rename the files so that they work but it would be nice to have it just work with typical download file names.
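The suggestion above (periods to spaces, then drop the last word until something matches) can be sketched in a few lines of Python. Both `candidate_titles` and `first_hit` are hypothetical names, and `search` stands in for whatever lookup the tool performs; nothing here is taken from mediaScraper.

```python
def candidate_titles(file_stem):
    """Replace periods with spaces, then drop trailing words one at a time."""
    words = file_stem.replace(".", " ").split()
    return [" ".join(words[:count]) for count in range(len(words), 0, -1)]


def first_hit(file_stem, search):
    """Return the first candidate for which search(candidate) is truthy, else None."""
    for candidate in candidate_titles(file_stem):
        if search(candidate):
            return candidate
    return None


candidates = candidate_titles("Stargate.Atlantis.S03E01.WS.DSR.XviD-DIMENSION")
# The list runs from the full name down to just "Stargate".
```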

I'll go have more of a play now

Mick.

Edit: Also, would it be possible to not create the metadata files if no result was found?

Last edited by mickp; 01-05-2009 at 04:32 AM.
#5
01-05-2009, 04:52 AM
evilpenguin
SageTVaholic
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Yeah, good catch, looks like the '.' in the series name is causing some trouble. I'll get that (and a whole bunch of other bugs I found) all sorted out tomorrow.
#6
01-05-2009, 05:04 AM
mickp
Sage Aficionado
Join Date: Oct 2006
Posts: 468
Cool. Thanks.

Mick.
#7
01-06-2009, 01:38 AM
evilpenguin
SageTVaholic
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Just updated the download with a ton of fixes, the best of which is it won't create output files if it doesn't actually get any data.

Last edited by evilpenguin; 01-09-2009 at 12:39 PM.
#8
01-06-2009, 03:02 AM
mickp
Sage Aficionado
Join Date: Oct 2006
Posts: 468
Awesome!

I've adapted my old comskip batch file to process all files in a directory. The log files from this should be quite interesting.

I'll give the new version a run over one of my download directories and pm the logs.

In the meantime I've attached the two batch files required for bulk scraping. Hopefully someone will find them useful.

Usage is scrape [unc path]

Mick.

Edit: Fixed a bug which would cause all shows to be processed every time. Now only files without a .properties file are scraped on subsequent runs.

Edit: Updated version should cope correctly with video_ts directories

NB: You don't need to use this batch file(s) any more as mediascraper.exe will now process all files in a directory.
Attached Files
File Type: zip scrape.zip (1.2 KB, 662 views)

Last edited by mickp; 02-14-2009 at 08:44 PM.
#9
01-06-2009, 03:07 AM
mickp
Sage Aficionado
Join Date: Oct 2006
Posts: 468
Um. File appears to be corrupted. One zip program I tried says "error in zip file. Garbage at end of file"

Mick.
#10
01-06-2009, 07:25 AM
Opus4
Administrator
Join Date: Sep 2003
Location: NJ
Posts: 19,624
BTW: .properties files are read by the core, so that data is put into SageTV's database when importing files. I'm adding this comment because after a quick glance while adding this to the customizations index, it looked like the notes were saying that was for the default STV, but it will work for anything.

- Andy
#11
01-06-2009, 07:55 AM
jaminben
Sage Icon
Join Date: Sep 2007
Location: Norwich, UK
Posts: 1,754
Quote:
Originally Posted by evilpenguin
Just updated the download with a ton of fixes, the best of which is it won't create output files if it doesn't actually get any data.
Great stuff. However, it appears that the zip file is damaged and won't extract (C:\Users\jaminben\Desktop\mediaScraperBeta.zip: The archive is corrupt).

Is this me being stupid or is there something wrong with the zip file?

Cheers

Ben
#12
01-06-2009, 11:47 AM
deanm
Sage Aficionado
Join Date: Apr 2005
Location: North East, UK
Posts: 309
Quote:
Originally Posted by jaminben
Great stuff. However, it appears that the zip file is damaged and won't extract (C:\Users\jaminben\Desktop\mediaScraperBeta.zip: The archive is corrupt).

Is this me being stupid or is there something wrong with the zip file?

Cheers

Ben
Try clicking on the link (PERL source code) and downloading it that way.

Dean
#13
01-06-2009, 11:52 AM
joe123
Sage Fanatic
Join Date: Jan 2006
Posts: 954
For the rest of us, what does this module do?

In basic terms please
#14
01-06-2009, 12:17 PM
deanm
Sage Aficionado
Join Date: Apr 2005
Location: North East, UK
Posts: 309
Nice tool, this will save me hours of work. I did try to do this manually once, but gave up after about an hour and only 5 shows.

It looks like the speech marks (“”) cannot be displayed properly. Not sure if this is a SageMC or Sage problem. If I run this on The Simpsons Season 18 Episode 1, then the text at the end, “family business.”, will not display correctly.

Works fine if you remove the Speech marks (“”)

Dean,
#15
01-06-2009, 12:24 PM
deanm
Sage Aficionado
Join Date: Apr 2005
Location: North East, UK
Posts: 309
Quote:
Originally Posted by deanm
Nice tool, this will save me hours of work. I did try to do this manually once, but gave up after about an hour and only 5 shows.

It looks like the speech marks (“”) cannot be displayed properly. Not sure if this is a SageMC or Sage problem. If I run this on The Simpsons Season 18 Episode 1, then the text at the end, “family business.”, will not display correctly.

Works fine if you remove the Speech marks (“”)

Dean,
Strange one, this. It works OK if I put the speech marks ("") back into the text. I just used Notepad to edit the file.
#16
01-06-2009, 12:55 PM
evilpenguin
SageTVaholic
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Quote:
Originally Posted by mickp
Um. File appears to be corrupted. One zip program I tried says "error in zip file. Garbage at end of file"

Mick.
Quote:
Originally Posted by jaminben
Great stuff. However, it appears that the zip file is damaged and won't extract (C:\Users\jaminben\Desktop\mediaScraperBeta.zip: The archive is corrupt).

Is this me being stupid or is there something wrong with the zip file?

Cheers

Ben
Not sure what's up with this. I just tried downloading it and extracting it with 7zip (highly recommended, btw) and it works fine. It could be that one of SourceForge's servers got a corrupt file, so it's working for some but not others. I've updated the link; try downloading it again.

Sorry for the confusion.
#17
01-06-2009, 12:58 PM
evilpenguin
SageTVaholic
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Quote:
Originally Posted by joe123
For the rest of us, what does this module do?

In basic terms please
Quote:
What it is:
mediaScraper is my answer to XBMC's excellent, but very confusing, scraping engine. You can use it to track down all the metadata for media (currently only TV) you may have ripped/bought/downloaded/whatever.

* For Users: Drag and drop your TV files onto mediaScraper.exe and it'll track down the metadata from TV.com and drop it into a .info, .my, and .properties file right next to it, filled with all the metadata you crave.
* For Developers: Want to bring the rich world of metadata into your plug-in/STV/whatever? See this space for a link to detailed info for how you can incorporate this tool into your own software!

How to use metadata files:

* To make SageMC use the .my files, go into SageMC Properties -> Enhancments -> Use .my Files -> Enabled.
* To make the default STV use .properties files, read this thread.
Basically, the .properties/.my files this generates allow Sage to show extra information about the imported video rather than just its file name like it normally does. I'll post some screen shots tonight to show what I mean.

Last edited by evilpenguin; 01-06-2009 at 02:30 PM.
#18
01-06-2009, 01:04 PM
evilpenguin
SageTVaholic
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Quote:
Originally Posted by deanm
Strange one, this. It works OK if I put the speech marks ("") back into the text. I just used Notepad to edit the file.
I've noticed a few weird things like this. I think the issue is that the quotes (") in the webpage are unicode characters, but when I read them in they lose their unicode-ness and just display as junk. I'll see what I can do about fixing that.
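One common way curly quotes turn into junk is when UTF-8 bytes are decoded with a single-byte encoding. The Python sketch below demonstrates that failure mode and two possible workarounds. It is only a guess at the cause here: the function names are made up, and the encodings involved in mediaScraper's case are an assumption.

```python
def garble(text):
    """Simulate the bug: UTF-8 bytes wrongly decoded as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text):
    """Undo the bad decode, assuming the junk came from the mix-up above."""
    return text.encode("latin-1").decode("utf-8")


def straighten_quotes(text):
    """Alternative workaround: swap curly double quotes for plain ASCII ones."""
    return text.replace("\u201c", '"').replace("\u201d", '"')


original = "\u201cfamily business.\u201d"
junk = garble(original)  # each curly quote becomes three unrelated characters
```

This matches the observation above that the text displays fine once the quotes are retyped as plain `"` in Notepad.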

Last edited by evilpenguin; 01-06-2009 at 01:06 PM.
#19
01-06-2009, 01:18 PM
evilpenguin
SageTVaholic
Join Date: Aug 2003
Location: Seattle, WA
Posts: 3,696
Quote:
Originally Posted by mickp
Can I suggest replacing the (.) period with a space and also, if a positive result/hit hasn't yet been found, try removing the last word of the file name (assuming . replaced with space) and giving the search another go. Then removing another, and another?
Something like that will have to happen eventually if I ever want to solve that "Terminator: The Sarah Connor Chronicles" problem. Right now I'm looking at two possible solutions:
  1. Do something like you mention and start removing words until I get a hit. I think there's a good way to do this and still maintain a certain level of confidence that the hit I'm getting is actually the right show. It'd prolly be best to start removing words based on how long they are (i.e. "Terminator The Sarah Connor Chronicles" (0 hits) -> "Terminator Sarah Connor Chronicles" (0 hits) -> ... -> "Chronicles" (lots of hits)), then check the number of characters I used to make the hit and determine how confident I am based on that.
  2. Generate a list of possible hits, have the front end give the choice to the user, and then rerun the tool using their selection.
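Option 1 can be sketched roughly as follows in Python. Everything here is an assumption made for illustration: the names `shrinking_queries` and `search_with_confidence`, the choice to drop the shortest word first (earliest word on ties), the confidence measure (share of the original characters still in the query), and the `search` callable that stands in for a theTVDB lookup.

```python
def shrinking_queries(title):
    """Yield (query, confidence) pairs, dropping the shortest word each round."""
    words = title.split()
    total = sum(len(word) for word in words)
    while words:
        used = sum(len(word) for word in words)
        yield " ".join(words), used / total
        # min() returns the first shortest word, so ties drop the earlier one.
        shortest = min(range(len(words)), key=lambda i: len(words[i]))
        del words[shortest]


def search_with_confidence(title, search, min_confidence=0.5):
    """Return (query, hits, confidence) for the first hit that is confident enough."""
    for query, confidence in shrinking_queries(title):
        if confidence < min_confidence:
            return None  # too few characters left to trust a match
        hits = search(query)
        if hits:
            return query, hits, confidence
    return None


queries = [q for q, _ in shrinking_queries("Terminator The Sarah Connor Chronicles")]
# Runs from the full title down to the single word "Chronicles".
```

The cut-off of 0.5 is arbitrary; option 2 (letting the user choose) would avoid having to pick one.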

Last edited by evilpenguin; 01-06-2009 at 01:22 PM.
#20
01-06-2009, 02:50 PM
mickp
Sage Aficionado
Join Date: Oct 2006
Posts: 468
Re: the zip file;

I found that if I renamed the zip to be .7z rather than .zip, winrar would open it and allow me to extract the one file "mediaScraperBeta". Extracted and added a .zip extension and voila...

Not sure what's going on with 7zip.

Mick.
Tags
fanart, imdb, metadata, themoviedb, thetvdb

All times are GMT -6.