Video Conversions on Sage Server now trigger ESXi reboot
A little more than a month ago, I built a new ESXi 5.0 white box host to run multiple VMs, including my Sage Server. It had been working fine until the beginning of this week. On Monday, we had a power outage, but the host is attached to a UPS, so as far as I know, it was fine. After the power came back, I checked the host and it was still up, so I assume it ran on battery during the 45-50 minute outage. (My automatic shutdown from the UPS apparently did not work, reason unknown.)
A day or two later, I was watching a show on a client, and the server (host) rebooted with no warning. Since then, every time I start up Sage (whether as a service or an application), the ESXi host reboots about 2 minutes later. All of my other VMs are shut down, so the reboot is not being caused by activity in another VM.
Starting the Sage VM (Windows 7) alone does not cause the issue, only actually starting Sage. I've turned on debug logging in Sage to get a sense of what's going on.
My recording drives (4 physical drives) are attached to an M1015 controller, and I wondered if there was a problem with the drives or the controller. I have run chkdsk on each of the recording drives, and no errors were reported.
I did boot the host to the BIOS settings and confirmed the CPU was not overheating. But of course, that was not under load.
I played another hunch, and as soon as Sage started, I checked for queued video conversions. (I automatically convert any show recorded with my HD PVR so that I can watch it on clients that can't handle .ts files.) I found that a show was awaiting conversion, and quickly cancelled the conversion. Once I did this, Sage (and the ESXi host) stayed up. I then tried a conversion of another .ts (different show), and the ESXi host rebooted again. Video conversions had been working fine for the month or so since the new server was built.
Any suggestions about further diagnostics I can do to isolate why video conversion activity from Sage brings down the ESXi host? I'm certainly not blaming Sage, but since a video conversion consistently precedes the reboot, I know that something the Sage conversion does (disk I/O, traffic, CPU) must be involved.
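One thing I could try is loading CPU and disk separately inside the Sage VM, without Sage running, to see whether either one alone triggers the reboot. Below is a rough sketch of that idea in Python, assuming Python is installed in the Windows 7 VM. The function names, sizes, and the temp file name are all made up for illustration; none of this comes from Sage or ESXi.

```python
import hashlib
import os
import tempfile
import time


def cpu_load(seconds):
    """Keep one core busy hashing for roughly `seconds`; return the iteration count."""
    deadline = time.monotonic() + seconds
    data = b"x" * 65536
    count = 0
    while time.monotonic() < deadline:
        hashlib.sha256(data).digest()
        count += 1
    return count


def disk_load(directory=None, total_mb=1024, chunk_mb=4):
    """Write, then read back, about `total_mb` MiB in `directory`; return bytes read."""
    if directory is None:
        directory = tempfile.gettempdir()
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    path = os.path.join(directory, "load_test.tmp")  # hypothetical scratch file
    try:
        with open(path, "wb") as f:
            for _ in range(total_mb // chunk_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # push the data out to the drive
        bytes_read = 0
        with open(path, "rb") as f:
            while True:
                block = f.read(len(chunk))
                if not block:
                    break
                bytes_read += len(block)
        return bytes_read
    finally:
        # Clean up the scratch file even if the write or read fails.
        if os.path.exists(path):
            os.remove(path)
```

The plan would be to run cpu_load for several minutes with no disk activity, then run disk_load pointed at a folder on one of the recording drives. If the host goes down during one but not the other, that might at least narrow down which subsystem to look at.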