Any details around the Database Improvements?

dvd_maniac · #1 09-28-2015, 06:25 PM

From the release notes:
21. Significant performance improvements to overall database system

My wiz.bin hovers between 80 and 90MB lately, and I am noticing some performance issues, especially on the extenders. I want to try the Open Source version specifically for this update, but I am afraid that some of my installed plugins might then have issues.
It would be nice to know the details of this update to see if it would even be worth trying it out... Are there any specifics regarding this?

tmiranda · #2 09-29-2015, 06:46 AM

Quote:

Originally Posted by dvd_maniac

From the release notes:
21. Significant performance improvements to overall database system

My wiz.bin hovers between 80 and 90MB lately, and I am noticing some performance issues, especially on the extenders. I want to try the Open Source version specifically for this update, but I am afraid that some of my installed plugins might then have issues.
It would be nice to know the details of this update to see if it would even be worth trying it out... Are there any specifics regarding this?

I think Jeff mentioned the improvements were made in the areas of locking and concurrency.

will · #3 09-29-2015, 11:44 AM

Quote:

Originally Posted by tmiranda

I think Jeff mentioned the improvements were made in the areas of locking and concurrency.

What is locking and concurrency exactly when it comes to STV?

moothekow · #4 09-29-2015, 11:52 AM

Quote:

Originally Posted by will

What is locking and concurrency exactly when it comes to STV?

Basically, it means the database got better at letting multiple things access it at once. Locking = preventing other things from accessing the database while something is working on it. Concurrency = multiple things accessing it at once.

So - before it was possible that if the server was updating something in one part of the database - it could have made any clients wanting to read stuff wait until it was done even if there wasn't any overlap in the areas the two things were looking at. I'm assuming he's improved the code so if the server is writing something related to item A and the client wants to read information about item B it won't have to wait until the server is done. I don't know the specifics of what was slow before or what was improved - but this is my best guess :-).

wnjj · #5 09-29-2015, 12:25 PM

There are also indexes that contain IDs sorted by something else. So when looking up something like a show by its title, the program doesn't have to scan all of the shows (which are in the order they were initially added to the DB). It instead can quickly locate the title by searching the index and checking only a few IDs to find the right one. If you want more detail I can make an illustration.
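As a rough illustration of the idea above, here is a minimal sketch. The class and method names are made up for this example and are not SageTV's actual code; it assumes a show's ID is simply its position in the order it was added, and it keeps a second list of IDs sorted by title so a lookup can use a binary search instead of scanning everything.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not SageTV's real classes.
final class TitleIndexSketch {
    // Titles in the order shows were added; a show's ID is its position here.
    private final List<String> titlesById = new ArrayList<>();
    // The "index": show IDs kept sorted by their title.
    private final List<Integer> idsByTitle = new ArrayList<>();

    // Adds a show and returns its ID.
    int add(String title) {
        int id = titlesById.size();
        int pos = lowerBound(title);   // where the new ID belongs in the index
        titlesById.add(title);
        idsByTitle.add(pos, id);
        return id;
    }

    // Returns the ID of a show with this title, or -1 if there is none.
    int findByTitle(String title) {
        int pos = lowerBound(title);
        if (pos < idsByTitle.size()) {
            int id = idsByTitle.get(pos);
            if (titlesById.get(id).equals(title)) {
                return id;
            }
        }
        return -1;
    }

    // Binary search: first index position whose title is >= the given title.
    private int lowerBound(String title) {
        int lo = 0;
        int hi = idsByTitle.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            String midTitle = titlesById.get(idsByTitle.get(mid));
            if (midTitle.compareTo(title) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

With this layout a title lookup only touches about log2(N) entries of the index rather than all N shows.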

Narflex · #6 09-29-2015, 12:32 PM

The improvements were pretty much all concurrency related. Previously, it had locks per-table. So if something was reading/writing the Airing table...then nothing else could access it at the same time. Now each table has separate locks for reading and writing...so many things are allowed to read a table in the DB simultaneously; and they are only blocked out of that table if something has a write lock (which we only grab now when something is actually being modified). A main point of slowdown previously was after an EPG update, when SageTV was saving out the database to disk and had to read the whole thing from memory to write it out, locking each table as it was written. That took a fair amount of time, and you'd usually see the spinning circle during this. Now it should not cause the spinning circle at all, since it will only be grabbing read locks.
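For anyone wondering what separate read and write locks per table look like in Java, here is a minimal sketch using the standard `java.util.concurrent.locks.ReentrantReadWriteLock`. The `TableSketch` class is invented for illustration and is an assumption about the general shape, not the actual Wizard code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; not the real Wizard table implementation.
final class TableSketch<T> {
    // One lock pair per table: many readers OR one writer at a time.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<T> rows = new ArrayList<>();

    // Modifying the table takes the write lock, which excludes everyone else.
    void add(T row) {
        lock.writeLock().lock();
        try {
            rows.add(row);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reading takes only the read lock, so other readers are not blocked.
    // A save-to-disk pass could copy rows out this way without stalling the UI.
    List<T> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(rows);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Any number of threads can be inside `snapshot()` at once; they only wait while some thread is inside `add()`.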

Sorted indexes were always part of the database.

The other improvement was adding Lucene; so that when you do full text searches, they will execute much faster than before (although they already worked surprisingly fast for doing a linear search of the DB).

wnjj · #7 09-29-2015, 12:40 PM

Quote:

Originally Posted by Narflex

Sorted indexes were always part of the database.

That's odd because when I disassemble my version 7 DB, there aren't any.

Narflex · #8 09-29-2015, 04:50 PM

Quote:

Originally Posted by wnjj

That's odd because when I disassemble my version 7 DB, there aren't any.

Then your disassembler doesn't work right.

Indexes have been part of the DB since it was first created back in 2002.

wnjj · #9 09-29-2015, 06:03 PM

Quote:

Originally Posted by Narflex

Then your disassembler doesn't work right.

Indexes have been part of the DB since it was first created back in 2002.

Maybe indexes were always there, but it seems they are only recently written to wiz.bin (ver 83):

Code:

  // 83 added support for writing out index orders; this is backwards compatible since older versions will skip this section and create their own indices

So the improvement is reading them in rather than re-creating them every time.

Narflex · #10 09-30-2015, 12:32 PM

Quote:

Originally Posted by wnjj

Maybe indexes were always there, but it seems they are only recently written to wiz.bin (ver 83):

Code:

  // 83 added support for writing out index orders; this is backwards compatible since older versions will skip this section and create their own indices

So the improvement is reading them in rather than re-creating them every time.

Lol...yes, that is correct.

I had totally forgotten that it used to not store them on disk.

dvd_maniac · #11 10-01-2015, 08:41 PM

I just cleared the watched flag of about 12,000 recordings using the WebUI and thought that would shrink my wiz.bin down a bit, but it did not. In fact it went from 81,302KB to 86,273KB. Is there some sort of process that I need to do to purge old recordings without starting from a completely fresh wiz.bin?

JREkiwi · #12 10-02-2015, 12:35 AM

Whatever you do with the wiz.bin, it won't have an immediate effect on the size. You need to wait for the cleanup to occur. It normally happens on a SageTV restart or daily when doing an EPG update.

Something I have seen is UserRecords created by plugins taking up wiz.bin space as there is not an automated process to delete the UserRecords associated with a media file when the file is deleted.

John

Fuzzy · #13 10-02-2015, 01:13 AM

Quote:

Originally Posted by dvd_maniac

I just cleared the watched flag of about 12,000 recordings using the WebUI and thought that would shrink my wiz.bin down a bit, but it did not. In fact it went from 81,302KB to 86,273KB. Is there some sort of process that I need to do to purge old recordings without starting from a completely fresh wiz.bin?

Changing the state of a flag is not going to remove that entry from the wiz.bin. Essentially, any recording that has been made will stay in the wiz.bin forever. There is no process to 'purge' this old metadata.

tmiranda · #14 10-02-2015, 06:26 AM

Quote:

Originally Posted by JREkiwi

Something I have seen is UserRecords created by plugins taking up wiz.bin space as there is not an automated process to delete the UserRecords associated with a media file when the file is deleted.

Some of my plugins are guilty of this. I think we need a PluginUninstalled event so plugins can clean up after themselves when uninstalled.

stuckless · #15 10-02-2015, 06:55 AM

Quote:

Originally Posted by tmiranda

Some of my plugins are guilty of this. I think we need a PluginUninstalled event so plugins can clean up after themselves when uninstalled.

Or... a UserRecord could have the concept of an owner (i.e., a plugin), and SageTV could just clean them up automatically once a plugin is uninstalled. Having plugin authors do it is good, but I have a feeling not many would actually handle the event. It'll be like the Windows registry.
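To make the owner idea concrete, here is a minimal sketch. Everything in it is hypothetical: the `OwnedRecordStoreSketch` class, the `ownerPluginId` field, and the `removeOwnedBy` method are invented for this example, and UserRecords have no owner concept today. It only shows that once each record is tagged with the plugin that created it, the core could drop them all in one pass at uninstall time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the SageTV UserRecord API.
final class OwnedRecordStoreSketch {
    // A record tagged with the ID of the plugin that created it (assumed field).
    static final class Record {
        final String ownerPluginId;
        final String key;
        final String value;

        Record(String ownerPluginId, String key, String value) {
            this.ownerPluginId = ownerPluginId;
            this.key = key;
            this.value = value;
        }
    }

    private final List<Record> records = new ArrayList<>();

    void put(String ownerPluginId, String key, String value) {
        records.add(new Record(ownerPluginId, key, value));
    }

    // What the core could call when a plugin is uninstalled.
    // Returns how many records were removed.
    int removeOwnedBy(String pluginId) {
        int before = records.size();
        records.removeIf(r -> r.ownerPluginId.equals(pluginId));
        return before - records.size();
    }

    int size() {
        return records.size();
    }
}
```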

In addition to user records, mediafiles have custom metadata fields. Phoenix stuffs LOTS of information in these fields.

dvd_maniac · #16 10-02-2015, 07:07 AM

So whenever I record something or set something to watched it is going to stay in the wiz.bin forever?

I give up...
I'm ditching my wiz.bin and starting fresh.
The only thing that sucks is that I have a couple of old classic favorite TV shows that I'm trying to get, and now it will end up recording the ones I already have again.
No biggie...

Taddeusz · #17 10-02-2015, 08:03 AM

On the face of it I guess it's not bad that SageTV loads the entire db into memory. However, I can see how this can be a problem as one's database grows larger. An 80MB database shouldn't necessarily be a problem. A more pragmatic approach using a cache might be a better approach. Why should the whole database always be kept in memory when the majority of that database won't necessarily need to be accessed?

Yes, accessing data from disk would take more time than from memory but smart cache management could make any user impact negligible. For example, as someone is scrolling through the recordings list or through the videos list it could pre-cache the items forward and backward on the list as the user is scrolling. That way the user should never notice any interface impact from a cache miss.
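A minimal sketch of the kind of cache described above, assuming an LRU cache built on the standard `LinkedHashMap` in access order, plus a small prefetch window around whatever item the user is looking at. The `PrefetchCacheSketch` class and its `loader` function (standing in for a slow disk read) are hypothetical and have nothing to do with how SageTV works today.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical illustration only; SageTV keeps everything in memory instead.
final class PrefetchCacheSketch<V> {
    private final int window;             // how many neighbours to prefetch each way
    private final int itemCount;          // total items in the list being scrolled
    private final IntFunction<V> loader;  // stands in for a slow read from disk
    private final LinkedHashMap<Integer, V> cache;
    private int loads = 0;                // number of times the loader was called

    // maxEntries should be larger than 2 * window + 1 so a prefetch pass
    // does not evict the item that was just requested.
    PrefetchCacheSketch(final int maxEntries, int window, int itemCount,
                        IntFunction<V> loader) {
        this.window = window;
        this.itemCount = itemCount;
        this.loader = loader;
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<Integer, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the item and warms the cache for its neighbours in the list.
    V get(int index) {
        V value = load(index);
        for (int d = 1; d <= window; d++) {
            if (index - d >= 0) {
                load(index - d);
            }
            if (index + d < itemCount) {
                load(index + d);
            }
        }
        return value;
    }

    int loadCount() {
        return loads;
    }

    private V load(int index) {
        V value = cache.get(index);
        if (value == null) {
            value = loader.apply(index);
            loads++;
            cache.put(index, value);
        }
        return value;
    }
}
```

Scrolling one step at a time then costs at most one or two new loads per step, since the rest of the window is already cached.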

Even if/when SageTV can run in a 64-bit JVM in Windows that 80MB database is still taking up a significant part of its overall memory footprint.

Another school of thought I just had is the current database format is kind of akin to the old legacy Office document formats. For those that don't know a .doc file is actually a dump of the memory representation of that document to disk. So when Word loads a .doc file it essentially plays back that file into memory. This is the reason that the legacy Office files can grow so huge compared to the newer format based on XML.

Not saying the current wiz.bin is necessarily a bad thing. It has most definitely served its purpose. Just thinking that, as difficult as it would be to change, a more pragmatic approach might be necessary to continue to move SageTV into the future.

I'm not saying let's replace the Wizard just because. From an efficiency and maintainability perspective, how much of the code is written specifically to deal with maintenance of the Wizard? Garbage collection, concurrency, locking? How often does this code run? How much more efficient could the core of SageTV be if we were to use a standard database, even something like SQLite or Mongo? I'm sure there are libraries already that can convert database objects to the Java objects that SageTV uses with little to no fuss. Yes, this would introduce necessary complexity. But it would also offload some of the maintenance of the database itself to a system specifically designed to store and manage data.

Not saying "let's do this". I know the Wizard is Jeff's baby. Just providing some perspective and food for thought.

Fuzzy · #18 10-02-2015, 08:35 AM

Quote:

Originally Posted by Taddeusz

On the face of it I guess it's not bad that SageTV loads the entire db into memory. However, I can see how this can be a problem as one's database grows larger. An 80MB database shouldn't necessarily be a problem. A more pragmatic approach using a cache might be a better approach. Why should the whole database always be kept in memory when the majority of that database won't necessarily need to be accessed?

Yes, accessing data from disk would take more time than from memory but smart cache management could make any user impact negligible. For example, as someone is scrolling through the recordings list or through the videos list it could pre-cache the items forward and backward on the list as the user is scrolling. That way the user should never notice any interface impact from a cache miss.

Even if/when SageTV can run in a 64-bit JVM in Windows that 80MB database is still taking up a significant part of its overall memory footprint.

Another school of thought I just had is the current database format is kind of akin to the old legacy Office document formats. For those that don't know a .doc file is actually a dump of the memory representation of that document to disk. So when Word loads a .doc file it essentially plays back that file into memory. This is the reason that the legacy Office files can grow so huge compared to the newer format based on XML.

Not saying the current wiz.bin is necessarily a bad thing. It has most definitely served its purpose. Just thinking that, as difficult as it would be to change, a more pragmatic approach might be necessary to continue to move SageTV into the future.

I'm not saying let's replace the Wizard just because. From an efficiency and maintainability perspective, how much of the code is written specifically to deal with maintenance of the Wizard? Garbage collection, concurrency, locking? How often does this code run? How much more efficient could the core of SageTV be if we were to use a standard database, even something like SQLite or Mongo? I'm sure there are libraries already that can convert database objects to the Java objects that SageTV uses with little to no fuss. Yes, this would introduce necessary complexity. But it would also offload some of the maintenance of the database itself to a system specifically designed to store and manage data.

Not saying "let's do this". I know the Wizard is Jeff's baby. Just providing some perspective and food for thought.

I think the largest chunk of Wizard.java is likely detecting and repairing errors. In any case, I think that separating out 'live' and 'archived' data would make everything else far more complicated. You'd have to make dummy copies of every array that the wiz.bin contains, and have some way to move items in and out of them, while still providing the full contents on demand. At current memory prices, I can't see that ever really being worth it. The real advantage of keeping everything in Java objects in the JVM is speed. Database lookups are very fast because of this, and because database lookups are so fast, many of the operations Sage does with the database are coded for simplicity and robustness, and NOT for efficiency (repetitive iterations through an array, for instance). Slowing down those accesses is likely to cause a very noticeable slowdown in the end result. I don't believe a 90MB database, or even a 256MB database, should be a problem. The only real issue is that many people are now using a majority of miniclients instead of 'fat clients', so the JVM is easily maxed out in normal use, which I don't think was ever the design basis. This is an issue with the overall system architecture, not the database, so it should be fixed by changing the architecture. That means either going to a 64-bit JVM, or breaking out the MiniClientRenderers into their own miniserver JVMs.

wnjj · #19 10-02-2015, 09:07 AM

Quote:

Originally Posted by dvd_maniac

So whenever I record something or set something to watched it is going to stay in the wiz.bin forever?

I give up...
I'm ditching my wiz.bin and starting fresh.
The only thing that sucks is that I have a couple of old classic favorite TV shows that I'm trying to get, and now it will end up recording the ones I already have again.
No biggie...

Before you do that, have you checked out my SQL DB converter? I could get to work on the exporter back to wiz.bin if someone really needs it. Then you could clean whatever you want.

http://forums.sagetv.com/forums/showthread.php?t=62512

stanger89 · #20 10-02-2015, 09:12 AM

Quote:

Originally Posted by dvd_maniac

So whenever I record something or set something to watched it is going to stay in the wiz.bin forever?

I give up...
I'm ditching my wiz.bin and starting fresh.

Why? Are you having a problem? The only problems I've ever had with Sage were not related to the database; they were always related to orphan files that no longer existed.

FWIW, since the beginning I've seen lots of people wanting to "improve" the Sage database, but rarely, if ever, have I seen anyone provide evidence of an actual problem with Sage's current database, performance or otherwise. It all seems largely academic/philosophical, people assume because it's not a standard format it's not good or something.

FWIW2, 80MB is nothing these days, I've got 80MB pictures.
