SageTV Community

Plugin: Sage Recording Extender 4 (SREv4) (http://forums.sagetv.com/forums/showthread.php?t=57364)

skiingwiz 06-03-2015 07:55 PM

Quote:

Originally Posted by Slugger (Post 572751)
The easiest and most standard way is to require a specific, standard user agent header in the requests. Google Appengine doesn't allow spoofing of this header and so that knocks out a lot of contenders. The ones that remain seem to employ similar tactics as Donald Duck's site.

I know you've solved this for now, but I'm curious. Have you considered moving off of appengine to some other platform that would allow user agent spoofing? I know that if you tell Donald's Site that you are IE 8, it gives you a nice static HTML scoreboard.

Slugger 06-03-2015 08:46 PM

Quote:

Originally Posted by skiingwiz (Post 572849)
I know you've solved this for now, but I'm curious. Have you considered moving off of appengine to some other platform that would allow user agent spoofing? I know that if you tell Donald's Site that you are IE 8, it gives you a nice static HTML scoreboard.

When one of the big 4 breaks and there are no other options, then a move off of appengine might be considered. To do that, I'd have to basically rewrite the server app (the source I inherited a few years back is vendor locked into Appengine's APIs for things like data storage).

But most importantly, appengine is free. Small enough user base that we've never exceeded appengine's daily free quotas (with room still for substantial growth, were that ever to happen). I've thought about just biting the bullet, rewriting it and moving it to AWS on a few occasions over the years, but it's tough to beat free especially when the service doesn't generate any revenue nor do I have any such plans to start trying to.

But if user-agent spoofing were the only way to get data at some point in the future then I might consider it (but then are people willing to pay a nominal fee to offset an average AWS bill, for example?). Hopefully we don't have to worry about that for some time to come.

skiingwiz 06-04-2015 07:29 PM

Quote:

Originally Posted by Slugger (Post 572850)
But if user-agent spoofing were the only way to get data at some point in the future then I might consider it (but then are people willing to pay a nominal fee to offset an average AWS bill, for example?). Hopefully we don't have to worry about that for some time to come.

I'd be willing to put up a few bucks for the service. But, like you say, hopefully that doesn't happen. It's hard to argue with free.

Slugger 06-12-2015 12:36 PM

Grrrr... this is starting to turn into more of a chore than I'd prefer. Suddenly out of the blue, mlb.com has started blocking all requests from appengine.

I've moved mlb over to the same source as nba, but this also means we now lose future look aheads for the schedule (i.e. only current day's games are available).

I could see why these sites were blocking me if I were hitting them every 2 seconds, but I don't. I actually hit the sites once every 2 mins and cache the result in the app for 2 mins and only while there are active requests for the status of the game. When there are no games in progress, I do not hit the urls. If you load MLB's scoreboard page and just leave it sitting in your browser all day, your browser makes the same ajax request I make, except it makes it much, much more frequently than this app does.
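The polling behaviour described above (fetch at most once every 2 minutes, and only when someone actually asks) can be sketched as a small on-demand cache. This is only an illustration, not the service's actual code: the class name `ScoreboardCache`, the `fetch` callable and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 120  # the post says results are cached for 2 minutes


class ScoreboardCache:
    """Caches one upstream response per URL and refetches only on demand."""

    def __init__(self, fetch: Callable[[str], str],
                 clock: Callable[[], float] = time.monotonic,
                 ttl: float = CACHE_TTL_SECONDS) -> None:
        self._fetch = fetch  # hypothetical function that performs the HTTP GET
        self._clock = clock  # injectable so the timing can be tested
        self._ttl = ttl
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        """Return the cached body, hitting upstream only if it is stale."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no upstream request
        body = self._fetch(url)
        self._entries[url] = (now, body)
        return body
```

Because nothing runs on a timer, a URL with no active status requests is never fetched at all, which matches the "when there are no games in progress, I do not hit the urls" behaviour.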

I'm now down to one source for everything. If it starts blocking appengine then we're screwed. Fingers crossed...

skiingwiz 06-13-2015 07:08 PM

Could pulling the data from the various sources be crowd-sourced? I'm thinking of a small client that asks livepvrdata.com what url it needs, pulls the data, and then sends it back to livepvrdata.com.

It wouldn't take very many of these little clients to make it much more difficult for the sites to block, since they'll be coming from all over the internet.

I know I'd be willing to run such a client. I'd even be willing to write a client or a sageplugin to do it. (Of course, that's not much work. There would still be a chunk of work to do on the backend to support this.)

KarylFStein 06-13-2015 08:47 PM

What about a proxy? If the filtering is really just the user agent, some PHP code on a web host could be a solution.

KryptoNyte 06-13-2015 08:55 PM

1 Attachment(s)
Any time there is a live sporting event on the EPG, I just head for Google and type in the name of one team followed by a "vs" and it gives the status of the event. What feeds that and/or could something be grabbed right from that search result?

Slugger 06-14-2015 10:40 PM

Quote:

Originally Posted by skiingwiz (Post 573254)
Could pulling the data from the various sources be crowd-sourced? I'm thinking of a small client that asks livepvrdata.com what url it needs, pulls the data, and then sends it back to livepvrdata.com.

It wouldn't take very many of these little clients to make it much more difficult for the sites to block, since they'll be coming from all over the internet.

I know I'd be willing to run such a client. I'd even be willing to write a client or a sageplugin to do it. (Of course, that's not much work. There would still be a chunk of work to do on the backend to support this.)

An interesting idea, but then the service is reliant on users being online. Is your Sage server on 24x7 or do you put it to sleep between recordings? Mine is 24x7, but are enough people's to make it viable? The other issue is that each of these clients now polls the web service constantly looking for work to do, which greatly increases the number of requests per day, which starts to push us closer to exhausting the service's free quota. I'd have to work out the numbers, but I actually expect this kind of thing would easily make us exceed the free quota (because clients have to poll all the time, 24x7, because you never know when there's work to be done).

Quote:

Originally Posted by KarylFStein (Post 573258)
What about a proxy? If the filtering is really just the user agent, some PHP code on a web host could be a solution.

Yeah, this I've thought about, but 1) it's not free. I realize your run of the mill php hosting site is dirt cheap so I'd probably be willing to eat the buck or two a month for it except 2) you get what you pay for. $1/mth hosts are just that. You're not getting much of an SLA (if any) for $1/mth. So we now have a single point of failure that is very hard to overcome when something goes wrong (i.e. I change the web service to route all requests thru this proxy and if the proxy goes belly up then the service is dead). At least today, when a site blocks appengine only that one monitor is affected, the others continue to work. And 3) the architecture of this web service just got a whole lot more complicated (IOW, it requires a decent chunk of the app to be modified/rewritten).

Quote:

Originally Posted by KryptoNyte (Post 573259)
Any time there is a live sporting event on the EPG, I just head for Google and type in the name of one team followed by a "vs" and it gives the status of the event. What feeds that and/or could something be grabbed right from that search result?

Yeah, I've seen this, but I have no idea where Google is getting its data from -- my guess is they pay for it from somewhere. I just spent a quick 10 mins looking at the html. There seem to be some patterns in there, meaning it might be possible to scrape, but the html is so cryptic it would take a while to properly write & test something. Though if push comes to shove, it might be something to start looking into as an alternative.

However, good news!! I decided to revisit appengine's api docs just to see if there was something I could do, and it seems that sometime in the last 3 or 4 years they've modified their http client api to allow modification of the user-agent header. The only caveat is that they still append an appengine identifier onto any custom value you set it to, but it seems to be good enough to fool MLB's site. So I've started sending an IE8 user agent string on all requests from the app, and even though appengine mangles it to append its stuff to the end of the string, mlb still seems to think it's IE8 as the requests aren't blocked anymore. A few other web sites I tested with were convinced I was still IE8 as well. I need to try it with Donald Duck's site still as well to see if they're also convinced, but that will have to wait for another weekend. But it seems there's some light once again at the end of the tunnel, they're just making me work for it these days after (literally) years of next to no work on this thing. :)
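A rough sketch of why the appended identifier may not matter: if a site only looks for a browser token somewhere in the header, text added at the end does not change the result. Everything here is an assumption for illustration. The IE8 string is a typical one for Windows 7, `PLATFORM_SUFFIX` is a stand-in for whatever the platform really appends, and `looks_like_ie8` is a hypothetical check, not what MLB's site is known to do.

```python
import urllib.request

# A typical IE8 user agent string (Windows 7); an assumed value, not taken from the app.
IE8_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)"

# Illustrative stand-in for the identifier the platform appends (exact text is an assumption).
PLATFORM_SUFFIX = "AppEngine-Google; (+http://code.google.com/appengine; appid: example-app)"


def build_request(url: str, user_agent: str = IE8_USER_AGENT) -> urllib.request.Request:
    """Build (but do not send) a request carrying a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def as_seen_by_site(user_agent: str) -> str:
    """Simulate the platform appending its identifier to the custom value."""
    return user_agent + " " + PLATFORM_SUFFIX


def looks_like_ie8(user_agent: str) -> bool:
    """Hypothetical server-side check: a naive substring match."""
    return "MSIE 8.0" in user_agent
```

With a check like this, `looks_like_ie8(as_seen_by_site(IE8_USER_AGENT))` is still true, while a site that compares the whole header against a list of known browsers would not be fooled.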

skiingwiz 06-16-2015 01:45 PM

Quote:

Originally Posted by Slugger (Post 573289)
The other issue is that each of these clients now polls the web service constantly looking for work to do, which greatly increases the number of requests per day, which starts to push us closer to exhausting the service's free quota. I'd have to work out the numbers, but I actually expect this kind of thing would easily make us exceed the free quota (because clients have to poll all the time, 24x7, because you never know when there's work to be done).

Good point. I had not considered the effect of my idea on the server quota. That pretty much negates the idea.

Quote:

Originally Posted by Slugger (Post 573289)
However, good news!! I decided to revisit appengine's api docs just to see if there was something I could do, and it seems that sometime in the last 3 or 4 years they've modified their http client api to allow modification of the user-agent header. The only caveat is that they still append an appengine identifier onto any custom value you set it to, but it seems to be good enough to fool MLB's site. So I've started sending an IE8 user agent string on all requests from the app, and even though appengine mangles it to append its stuff to the end of the string, mlb still seems to think it's IE8 as the requests aren't blocked anymore. A few other web sites I tested with were convinced I was still IE8 as well. I need to try it with Donald Duck's site still as well to see if they're also convinced, but that will have to wait for another weekend. But it seems there's some light once again at the end of the tunnel, they're just making me work for it these days after (literally) years of next to no work on this thing. :)

Awesome.

Slugger 06-16-2015 08:44 PM

Well my one Canadian site seems to be happy to serve me data again from appengine since I now claim to be IE8. Our friends at MLB are also serving up data again. We're now fully diversified among three different sites again.

I still need to rework some of Donald's html pages to regain full schedule look ahead capabilities for some sports but since those sports are about to end, I'll deal with that later this summer or early fall. After a bit of a scare, I'd say we're back to full functionality at least thru the summer. :)

jvonbokel 07-06-2015 02:06 PM

Quote:

Originally Posted by Slugger (Post 573289)
An interesting idea, but then the service is reliant on users being online. Is your Sage server on 24x7 or do you put it to sleep between recordings? Mine is 24x7, but are enough people's to make it viable? The other issue is that each of these clients now polls the web service constantly looking for work to do, which greatly increases the number of requests per day, which starts to push us closer to exhausting the service's free quota. I'd have to work out the numbers, but I actually expect this kind of thing would easily make us exceed the free quota (because clients have to poll all the time, 24x7, because you never know when there's work to be done).

This may be a moot point now that you've had success spoofing IE8, but the traffic could be limited by having the server tell each client when to connect again. If there are no active games, the response would say to wait until the next game starts. Also, the server could effectively manage the number of clients by telling some to "sleep" for a day or two when it gets too many active clients. You'd have to decide how many is a safe number to rely on (2? 5? 10?), but that could be ironed out over time as well.
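The scheduling idea in the paragraph above could look something like the following on the server side. It is a sketch under stated assumptions, not a proposal for the real service: the function name, the 2 minute active interval, the 6 hour fallback and the limit of 5 clients are all made-up values.

```python
from datetime import datetime, timedelta
from typing import List

ACTIVE_POLL = timedelta(minutes=2)    # assumed interval while a game is in progress
IDLE_FALLBACK = timedelta(hours=6)    # assumed wait when nothing is scheduled
SURPLUS_SLEEP = timedelta(days=1)     # "sleep for a day or two" when over capacity
MAX_ACTIVE_CLIENTS = 5                # assumed safe number of active clients


def next_poll_delay(now: datetime, game_starts: List[datetime],
                    games_in_progress: bool, active_clients: int) -> timedelta:
    """Tell a client how long to wait before contacting the server again.

    active_clients counts the clients already doing work, not the caller.
    """
    if active_clients >= MAX_ACTIVE_CLIENTS:
        return SURPLUS_SLEEP              # enough helpers already; park this one
    if games_in_progress:
        return ACTIVE_POLL                # work is available right now
    upcoming = [start for start in game_starts if start > now]
    if upcoming:
        return min(upcoming) - now        # wait until the next game starts
    return IDLE_FALLBACK
```

The point of returning a delay in every response is that the number of requests per day is bounded by the schedule and the client cap rather than by how many people install the client.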

On an unrelated note, I've got two MLB games scheduled tomorrow (Cardinals @ Cubs double header), and I'm getting a "Monitor status is invalid" message on them. Is there an easy way to correct that on my end, or is that just a limitation of SRE that I have to live with? If so I'm prepared to do that, as a success rate of 160 out of 162 is very acceptable. :)

Slugger 07-06-2015 06:17 PM

Quote:

Originally Posted by jvonbokel (Post 573874)
On an unrelated note, I've got two MLB games scheduled tomorrow (Cardinals @ Cubs double header), and I'm getting a "Monitor status is invalid" message on them. Is there an easy way to correct that on my end, or is that just a limitation of SRE that I have to live with? If so I'm prepared to do that, as a success rate of 160 out of 162 is very acceptable. :)

So my once Sage box turned desktop/dev box finally called it quits last week - motherboard died. 6.5 years of 24x7 power on then another good 1.5 years as my desktop, it doesn't owe me anything I suppose. Unfortunately, my Appengine dev env for all things livepvrdata.com/SRE was on the box. I've got the source code elsewhere, but no way to debug/redeploy it (until I take the time to rebuild the env elsewhere, and as long as sitting on my deck bbq'ing is an option, rebuilding dev envs isn't :)). So if it's broken, it's not going to get fixed any time soon, but I fixed doubleheader support for MLB a few weeks back when the Jays played one of their very rare DHs and I noticed that DH support was broken.

The key to DH monitors working correctly is the time the recording starts. When the same teams play multiple games on the same day, I try to figure out which game to attach to the recording based on the time the recording starts and try to match it to the proper game based on the start time in the data feed. I know it worked for the Jays DH so it should still work. Status unknown happens (quite frequently for MLB) as the data feed I use seems to go wonky more often than it probably should. If it was wonky during SRE's daily checkup on your system then you get status unknown. In that case, the games will just monitor properly tomorrow. Anyway, let me know if it does or doesn't work because if it is a problem with DH support then I'd like to try and fix it sometime in the relatively near future.
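One plausible way to do the matching described above is to pick the game whose start time in the feed is closest to the recording's start time. This is a guess at the approach, not SRE's actual logic; the function name and the "closest start wins" rule are assumptions.

```python
from datetime import datetime
from typing import List, Optional


def pick_game(recording_start: datetime, game_starts: List[datetime]) -> Optional[int]:
    """Return the index of the game that starts closest to the recording.

    game_starts holds the feed's start times for the same two teams on one day.
    Returns None when the feed has no games for that matchup.
    """
    if not game_starts:
        return None
    return min(range(len(game_starts)),
               key=lambda i: abs(game_starts[i] - recording_start))
```

For a doubleheader with games at, say, 1:20 PM and 7:05 PM, a recording starting at 1:15 PM would attach to the first game and one starting at 7:00 PM to the second. If both recordings started at the same time, they would attach to the same game, which is why the recording start time matters.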

Denali 08-14-2015 02:22 AM

Adding an FYI. SRE fails under SAGE.JAR version 9. Appears to be a CORE dependency that is coded for 7.1.9.256.

egeller 08-14-2015 04:27 PM

Quote:

Originally Posted by Denali (Post 575361)
Adding an FYI. SRE fails under SAGE.JAR version 9. Appears to be a CORE dependency that is coded for 7.1.9.256.

I am seeing this too, and I think the assessment is correct.

Is there a hack we can do ourselves to re-enable the plug-in or do we need to wait for it to be updated? Just wondering.

wayner 08-14-2015 06:00 PM

Quote:

Originally Posted by egeller (Post 575434)
I am seeing this too, and I think the assessment is correct.

Is there a hack we can do ourselves to re-enable the plug-in or do we need to wait for it to be updated? Just wondering.

Give Slugger a break, we have short summers here in Canada and he is enjoying it. Plus the Blue Jays are the hottest team in baseball and are in contention for the first time in a generation.

skiingwiz 08-14-2015 06:44 PM

1 Attachment(s)
Quote:

Originally Posted by egeller (Post 575434)
Is there a hack we can do ourselves to re-enable the plug-in or do we need to wait for it to be updated? Just wondering.

Yes. Assuming SRE will work fine in Sage v9...

Put the attached file in your SageTV/SageTV folder (in the same folder as the SageTVPlugins.xml file) and remove the .txt extension. This will override the SRE plugin manifest to remove the max version dependency.

Note 1: I have not tested this, but the concept will work (this is how devs test plugins without having to submit them).

Note 2: If you already have a SageTVPluginsDev.xml file, don't overwrite it... but if you have one already, I'll pretty much assume you know what you're doing with it.

I think I'm still able to submit updates to this plugin, so once I find time to get the Open Source Sage running, I'll verify this will work with it and release a proper update.

skiingwiz 08-15-2015 01:32 PM

I can now confirm that using the method above will allow SRE to be installed using Sage 9. Since I was running this on a test install, I don't actually have any recordings to test functionality.

Slugger 08-15-2015 06:21 PM

I can't think of a reason why I set a max version for this plugin. The hack proposed here should be fine until someone actually updates the manifest. The earliest I would update the plugin myself would be mid October, but I don't even think I have ownership of this plugin in the repo anymore so it may not even be up to me to fix it. Either way I'm enjoying summer and have no env to even address these kinds of issues until I rebuild it and that won't start before October.

Fuzzy 08-15-2015 07:06 PM

Message andy, he might be able to alter the manifest to simply remove the max version dependency.

skiingwiz 08-15-2015 07:51 PM

Since I technically control the plugin, I just submitted an updated manifest. It should show up in a few minutes. It points to the same release, just removes the max version constraint.

Slugger, feel free to take control of this back whenever you want it :)


All times are GMT -6.