
[Updated] FleepGrid Server is UP; Move to New Hardware Completed

January 28, 2014 in Downtime

FleepGrid Website:  UP

FleepGrid Web Store:  UP

Minecraft Server:  UP

OpenSim Grid:  UP (Limited basis – FleepGrid Plaza only)

UPDATE:  The server is up and running!  All services have been restored, though at this time only the FleepGrid Plaza region is available.  I intend to do some reconfiguring of the map and region layout, among other things, so most other regions will likely be unavailable for some time to come.  Please do let me know if you see anything amiss, either with the web services or in-world.  Otherwise, I hope this upgrade will result in a much improved user experience all around!  Thanks!  - Fleep

Original Post:

After many years of reusing old hardware to host the FleepGrid server, I finally broke down and bought a brand new machine!  This is a big yay, but of course means it will take some time to get everything reinstalled.

I’ll try to keep status posts here (if you can even see the post!) to track my progress.

 

FleepGrid Online – But Not Particularly Stable

January 18, 2014 in Downtime

So… the move to the new old hardware didn’t go as well as expected.  At this point the website, web store, and the FleepGrid Plaza region are all provisionally online, but stability is pretty questionable and I essentially can’t fix the issues until I get better hardware to run the grid.  I don’t have an ETA yet, but I’m working on a solution, and in the meantime, I’ll just have to apologize in advance for any inconvenience or technical issues you may experience.

FleepGrid Down – Moving to New Server

October 22, 2013 in Downtime

As many of you know, the original FleepGrid server was quite literally an old box I’d dragged up from the basement.  My original goal was simply to learn how to run Opensim, and as a test box, it served that purpose very well. Over time, however, the hardware began to fail in very annoying ways: a bad hard drive, a power supply that went out, RAM I had to replace, a video card fan that died... and on and on.  I’m not even sure what went wrong this last time except that a BSOD appeared any time the server tried to access a particular file.  All attempts to check and repair the disk were to no avail and I got tired of throwing money at that machine, and so at last, I decided to retire the poor old FLEEP-SAM server altogether.

The new hardware isn’t all that new either, it’s another spare box, but hopefully it will be more reliable than the last.  Since you’re seeing this post, it’s evidence that I’m making some progress setting up the new server and I hope to have FleepGrid up soon running the latest 0.7.6 release.  More to come soon and thanks for your patience!

 

FleepGrid Upgraded to Opensim 0.7.5

March 26, 2013 in Downtime

Hi all, just wanted to let you know FleepGrid has been upgraded to 0.7.5. See the Opensimulator 0.7.5 release notes for more information and if you see anything amiss or not working, please let me know! Thanks as always!

FleepGrid Downtime – Hardware Issue Resolved

February 17, 2013 in Downtime

FleepGrid was down for some time due to what I thought was a problem with the power supply.  It turned out to be a more involved problem than that, but I think everything is fixed now and should be running normally.

Since it ended up taking longer to resolve the hardware problems than I anticipated, I’m going to hold off on upgrading to Opensim 0.7.5 for another day when I have more time.  So for now, the grid is still on 0.7.4 and all services should be working normally.

Let me know if you see anything amiss and apologies for the lengthy downtime!

FleepGrid Upgraded to Opensim 0.7.4

September 19, 2012 in Downtime, General News

FleepGrid has been upgraded to Opensim version 0.7.4, including the Diva Wifi for ROBUST module and Mimetic Core’s awesome OS Services modules to provide groups, offline IMs, profiles, and search. As always, detailed notes on the upgrade are available on the FleepGrid wiki page.

FleepGrid Back Online After Extended Downtime

July 14, 2012 in Downtime

Due to hardware and networking issues that I was unable to resolve until this weekend, FleepGrid has been down for some time. I’m happy to report that all issues should be fixed and the grid should now be back to operating normally. Apologies for any inconvenience during the downtime!

FleepGrid Griefed! And How to Get HyperGrid User’s UUID from an Object

April 13, 2012 in Downtime, Griefers, Security

Over the past few days, a griefer has stopped by to drop a bunch of colored spheres all over FleepGrid and it’s taken a little time to figure out how to clean everything up since the griefer objects cause the sims to crash after a few minutes.  If you’ve tried to visit and couldn’t HG teleport or log in, that’s probably why.  My first inclination was to NOT post because I know griefers love the attention, but on the off chance he hits someone else’s grid, I thought I’d share what happened.

The first attack came from the account “Jack Marioline”, which was created locally on FleepGrid.  With many thanks to Gudule Lapointe from Speculoos grid for alerting me to the issue, cleaning up that mess was relatively easy.  I use phpMyAdmin to poke around in the Opensim database, so it was easy to go to the “useraccounts” table and look up the UUID for that user and then delete the items from the prims table.

SELECT * FROM `useraccounts` WHERE `LastName` LIKE 'marioline'

This returns a row that shows the user’s “PrincipalID” or UUID that you can then use to search the prims table for items created by that user.

SELECT * FROM `prims` WHERE `CreatorID` LIKE '120aa461-0bd1-4c3a-a759-31fe9d73a328'

After checking that the results seemed about right (897 rows), I just deleted the prims created by that user. Easy peasy and only took a few minutes. After that, I changed the password on that account so the griefer couldn’t log in and figured it was done.
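For anyone who would rather script this cleanup than click through phpMyAdmin, here is a minimal sketch of the same look-up, count, then delete sequence. It is not what was run on FleepGrid: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the MySQL server, the rows and IDs are made up, and the function names and the max_rows safety limit are hypothetical. Only the table and column names (useraccounts, prims, LastName, PrincipalID, CreatorID, OwnerID) come from the queries above.

```python
import sqlite3


def find_principal_ids(conn, last_name):
    """Return the PrincipalID of every local account with this last name."""
    rows = conn.execute(
        "SELECT PrincipalID FROM useraccounts WHERE LOWER(LastName) = LOWER(?)",
        (last_name,),
    ).fetchall()
    return [row[0] for row in rows]


def delete_prims_created_by(conn, creator_id, max_rows):
    """Count matching prims first; delete only if the count looks plausible."""
    count = conn.execute(
        "SELECT COUNT(*) FROM prims WHERE CreatorID = ?", (creator_id,)
    ).fetchone()[0]
    if count > max_rows:
        raise ValueError(f"{count} rows match, more than the {max_rows} expected")
    with conn:  # commits on success, rolls back if the DELETE raises
        conn.execute("DELETE FROM prims WHERE CreatorID = ?", (creator_id,))
    return count


def build_demo_db():
    """In-memory stand-in with invented rows (not real FleepGrid data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE useraccounts (PrincipalID TEXT, FirstName TEXT, LastName TEXT)"
    )
    conn.execute("CREATE TABLE prims (UUID TEXT, CreatorID TEXT, OwnerID TEXT)")
    conn.executemany(
        "INSERT INTO useraccounts VALUES (?, ?, ?)",
        [("aaaa-1111", "Jack", "Marioline"), ("bbbb-2222", "Test", "Resident")],
    )
    conn.executemany(
        "INSERT INTO prims VALUES (?, ?, ?)",
        [
            ("p1", "aaaa-1111", "aaaa-1111"),
            ("p2", "aaaa-1111", "aaaa-1111"),
            ("p3", "bbbb-2222", "bbbb-2222"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for principal_id in find_principal_ids(demo, "marioline"):
        removed = delete_prims_created_by(demo, principal_id, max_rows=1000)
        print(f"removed {removed} prims created by {principal_id}")
```

The count-before-delete step is the scripted version of eyeballing the row count: if far more rows match than expected, nothing is deleted.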

However, a couple days later, shortly after the HyperGrid Adventurers Club stopped by on their way to the final destination of Devokan Tao (an awesome Myst themed area on OSGrid), I flipped back over to the FleepGrid window to see that Jack had struck again – this time from a HyperGrid account “Jack.Marioline @login.cyberwrld.net:8002”.

I was momentarily confused about how to proceed, since hypergrid users don’t show up in the database “useraccounts” table, so I couldn’t just look the user up to get the UUID to search by in the prims table. But then I remembered that Imprudence allows you to copy a user’s UUID from their profile, so I took one of the griefer objects and profiled the creator and voila, I had a UUID to search for.

Now strangely, when I searched the prims table by CreatorID for that UUID, I got no results, but I might have done something dumb and just didn’t realize it, since my second try, searching by OwnerID, did work and I deleted the prims. But either way, getting the Hypergrid user account’s UUID was the trickiest part and the “Copy Key” button on the user’s profile in Imprudence was the solution.
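Why the CreatorID search came back empty is not known. A rough way to investigate a case like this is to count matches for the copied key in both columns, once as an exact match and once as a prefix match. The sketch below does that against an in-memory sqlite3 stand-in. The demo row, where CreatorID carries extra text after the UUID, is purely an assumption made up to show how an exact match could miss a row; it is not a statement about what the FleepGrid database contained. The function names and the key value are hypothetical too.

```python
import sqlite3


def count_prims_for_key(conn, key):
    """Count prims matching a copied key in CreatorID and in OwnerID."""
    counts = {}
    for column in ("CreatorID", "OwnerID"):  # fixed column names, never user input
        exact = conn.execute(
            f"SELECT COUNT(*) FROM prims WHERE {column} = ?", (key,)
        ).fetchone()[0]
        prefix = conn.execute(
            f"SELECT COUNT(*) FROM prims WHERE {column} LIKE ?", (key + "%",)
        ).fetchone()[0]
        counts[column] = {"exact": exact, "prefix": prefix}
    return counts


def build_demo_prims():
    """In-memory stand-in; the row contents are invented for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prims (UUID TEXT, CreatorID TEXT, OwnerID TEXT)")
    conn.execute(
        "INSERT INTO prims VALUES (?, ?, ?)",
        # Hypothetical: creator field holds more than the bare key.
        ("p1", "cccc-3333;http://example.invalid:8002/;Jack Marioline", "cccc-3333"),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    print(count_prims_for_key(build_demo_prims(), "cccc-3333"))
```

If the prefix count is higher than the exact count for a column, that column is storing more than the bare UUID, and an exact search will miss those rows.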

Hope that helps someone else!

FleepGrid Provisionally Back Up Following Hard Disk Failure

March 19, 2012 in Downtime

It’s been a long few days.  (This post is being updated as services are restored.)

The TL;DR Version

I’ll try to recap what happened in detail, but the long and short of it is that the FleepGrid server had a total hard drive failure and I had difficulty restoring things from backup.  Here’s the quick status of each service:

  • FleepGrid is ONLINE – but has time-warped back to October 2011 since that was the last complete database backup I was able to restore.  This means if you created an account on FleepGrid between October and now, your account, your inventory, and anything else is gone.  =(  I’m really sorry.  Really REALLY sorry.
  • The FleepGrid website is ONLINE - should be fine, no data loss.
  • The FleepGrid Web Shop is ONLINE - has been upgraded to a newer version and hopefully everything is completely restored.
  • The FleepGrid Portal region on OSGrid is ONLINE.  

The Tell Me Everything Version

Indications of Trouble

The first hint that something was wrong actually wasn’t obvious.  On Thursday last week, a few people notified me that they were having trouble importing IAR files from the FleepGrid Web Shop that they’d just downloaded.  This seemed quite strange, since everything had been working fine the day before (I’d imported an IAR file into OSGrid from my own website with no trouble).  I sent a message to the listserv asking if anyone had a hint about the error message:

20:26:41 - Command error: System.IO.InvalidDataException: The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.
at System.IO.Compression.GZipDecoder.ReadGzipHeader()
at System.IO.Compression.Inflater.Decode()
at System.IO.Compression.Inflater.Inflate(Byte[] bytes, Int32 offset, Int32 length)

No one replied, and while I was trying to figure out what was wrong with the IAR files, I decided to further separate some of the region processes so I could shut off a few regions that weren’t really being used to improve performance on the grid.  Except when I tried to copy over my opensim config files, I got an ominous Windows error:

Cannot copy opensim_0.7.2: Data error (cyclic redundancy check)

One quick Google search for “cyclic redundancy check” and I was suddenly (painfully) aware that the hard drive was probably failing and I needed to get any data I could ASAP.  Which I did.  And I had backups!  So, while I was very bummed, and dreading re-installing Windows and everything, I wasn’t terribly worried about data loss since I regularly take database dump backups and OAR backups and save them out to an external drive.

It will be a pain, I thought, but doable over the weekend.
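A note on the GZip error quoted earlier in this section: an IAR file is a gzip-compressed archive, and a gzip stream starts with the two bytes 0x1f 0x8b, which is the "magic number" the exception complains about. Here is a minimal sketch of how a downloaded file could be checked before importing it. The function names are made up for illustration, and this is not part of Opensim; it only uses Python's standard gzip and zlib modules.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Cheap check: does the file start with the gzip magic number?"""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def gzip_reads_cleanly(path, chunk_size=1 << 20):
    """Decompress the whole file; False if the header, data, or checksum is bad."""
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass  # discard the data, we only care whether reading succeeds
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

The first function catches a file that is not gzip at all, such as an error page saved under an .iar name. The second also catches a truncated or damaged download, because it reads through to the checksum at the end of the stream.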

Installing New Hard Drive, Installing Windows, Installing installing installing…

I hopped in the car and picked up a new drive and got started with the Windows install straight away.  By the next morning, Windows was done and I got going on installing all the other software I use.  I won’t bore you with too many of the details, but I tried to keep super good notes in case anyone else is prepping a Windows XP box for Opensim and wants to follow what I did.  (It’s the 3/17/2012 entry if you’re reading this post later.)

Note:  I also run the http://fleepgrid.com WordPress site and the FleepGrid Shop Opencart site from the same server, so had to install a bunch of stuff for those services that you would NOT need to install for plain Opensim.

O NOES – Problems with my Backups

Everything was going swimmingly until I got to the point of restoring the database mysql dumps.  All of the databases restored beautifully… except the one that was most important!  My Opensim database!  For some unknown reason (I still have to figure this out), all of my opensim database dumps only contained ONE TABLE – the asset table.  All the other tables, poof.  Not there.

Now, I do have OAR backups of all the regions, but I really did panic when I realized everything in my inventory, all of the user accounts, all of the other stuff that makes up FleepGrid besides the stuff out on the sims was gone.  :(

The only saving grace was that I’d done a full NTBackup of the FleepGrid system back in October 2011 as a test, and I still had that file, which I was able to restore and from there extract the mysql data files to restore a snapshot of the opensim database as it existed in October 2011.  Again, all the gory details of all the stupid things I tried in the interim are on the 3/17/12 change log entry.

What I Did Wrong

My primary mistake was not a lack of backups.  I had backups.  I was backing things up regularly, scheduled, automated even!

My mistake was that I never tested the backups.  If I’d tried it even once, I would have realized all the tables weren’t being dumped, and I would have saved myself a huge huge headache and lots of stress.

As an IT person, I know this.  You know this.  We all know this.  Still, when it comes to hobby projects like FleepGrid, we sometimes get lazy doing all the checklist things we know we should do, and then we pay for it.  Or at least I did, and any poor peeps whose accounts I just lost.  =(
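One cheap first check, short of a full test restore, is to scan each dump file for its CREATE TABLE statements and compare them against the tables you expect to be there. The sketch below assumes a plain-text SQL dump in which each table definition starts a line with CREATE TABLE, as mysqldump writes it. The function names and the expected-table list are placeholders, not the full Opensim schema.

```python
import re

# Matches lines such as: CREATE TABLE `assets` (
CREATE_TABLE = re.compile(r"CREATE TABLE (?:IF NOT EXISTS )?`?([A-Za-z0-9_]+)`?")


def tables_in_dump(lines):
    """Collect the table names defined in an iterable of dump lines."""
    found = set()
    for line in lines:
        match = CREATE_TABLE.match(line)
        if match:
            found.add(match.group(1))
    return found


def missing_tables(lines, expected):
    """Return the expected tables that the dump does not define, sorted."""
    return sorted(set(expected) - tables_in_dump(lines))


if __name__ == "__main__":
    # Placeholder list; fill in the tables your own database should contain.
    expected = ["assets", "prims", "useraccounts"]
    sample_dump = [
        "-- MySQL dump (shortened sample)\n",
        "DROP TABLE IF EXISTS `assets`;\n",
        "CREATE TABLE `assets` (\n",
        "  `id` char(36) NOT NULL\n",
        ");\n",
    ]
    print("missing:", missing_tables(sample_dump, expected))
```

To check a real file, pass an open file handle instead of the sample list, for example open("opensim.sql", encoding="utf-8", errors="replace"), where the file name is hypothetical. This only proves the table definitions are present; it does not replace actually restoring the dump somewhere and looking at the result.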

—-

So that’s the update for now.  More to come, and I’m sure lots of things still aren’t working quite right, but I wanted to give an update as soon as I had a good idea of what the situation was.  Please be on notice that FleepGrid might be going up and down a bit while I get everything repaired.  I probably won’t make a blog post each time; I’ll just leave this here for reference.

And my sincere apologies again to anyone who was/is inconvenienced by the outage.  I’m really glad FleepGrid is just a test grid and hopefully if you run your own grid you will go off right this minute and test your backups to make sure they work so you don’t have to make a post like this.  ;)

 

FleepGrid Down for Maintenance – Completed

November 19, 2011 in Downtime

UPDATE 11/20/11: Maintenance on the FleepGrid server has been completed.

All new modules installed successfully so I’m happy to say FleepGrid users can now create groups, edit their profile and have it persist, send offline IMs to friends and others, and use the search function! Super thanks to Mimetic Core for this wonderful module package!  

The HttpServer_OpenSim.dll, .pdb, and .xml files have also been updated with the latest versions, so there will hopefully be less random crashing of the simulators, too.  Thanks to JustinCC and Oren from Kitely for their assistance.  :)

As always, the step-by-step breakdown of how these modules/changes were installed is located at the FleepGrid Change Log at: http://fleep.wikispaces.com/FleepGrid (11/19/11 and 11/20/11 entries).

- – - -

I’ll be taking FleepGrid down for a bit of maintenance today, which may intermittently affect the website and FleepGrid Shop as well.

First, I’m super excited to give Mimetic Core’s new module package a try.  If all goes well, it will include:

  • Groups
  • Profiles
  • Offline IMs
  • Search

In addition to trying the module package, I also want to deal with a bug I’ve been running into that’s causing intermittent crashing on the simulator, with an error message that looks like this:


Region (root) #
Unhandled Exception:Unhandled Exception: System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values.
Parameter name: offset
at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
at HttpServer.HttpClientContext.OnReceive(IAsyncResult ar) in C:\Users\Crista\dev\opensim-HttpServer\trunk\HttpServer\HttpClientContext.cs:line 300
at System.Net.LazyAsyncResult.Complete(IntPtr userToken)
at System.Net.ContextAwareResult.CompleteCallback(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Net.ContextAwareResult.Complete(IntPtr userToken)
at System.Net.LazyAsyncResult.ProtectedInvokeCallback(Object result, IntPtr userToken)
at System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)

According to Oren Hurvitz from Kitely, this is a known issue with the HTTP Server .dll and he submitted a patch, but I didn’t feel up to the challenge of trying to rebuild it myself.  Thankfully, JustinCC rebuilt the DLL and added it to git master f72c4bd, so hopefully replacing the bugged version with the new one will fix those nasty crashing problems.