October 23, 2024 Unplanned Outage and Recovery

Hello SFBA.social! 🌉

We want to be transparent with you about the recent downtime on SFBA. We’ve been working behind the scenes on important infrastructure changes, and along the way we ran into problems.

🔧 What Happened?

  1. We migrated both media storage and database backups from AWS S3 to Linode. This migration is helping us save a significant amount of money on infrastructure.
  2. After completing the media migration, we started preparing for the big upgrade to Mastodon 4.3 by first updating to 4.2.13.
  3. In the midst of all this work, we did not keep track of disk space on our database server. Unfortunately, the database ran out of space and crashed. 😓
  4. We brought the database back online yesterday morning, but it went into recovery mode, which is read-only. This prevented logins, posts and feed updates, and essentially turned SFBA into a fediverse archive (temporarily).
  5. As the morning went on, we noticed the issue with logins and posts and began troubleshooting. Given the number of changes we had made recently, it took us a while to pinpoint the exact cause.
  6. Once we realized that the database kept re-entering recovery mode because it lacked the disk space to finish recovering from the crash, we resized the database server, which resolved the issue within about an hour.
  7. It then took about three hours for our SFBA instance to fully catch up with the rest of the fediverse.
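
For fellow admins wondering about the disk space gap in step 3, here is a minimal sketch of the kind of check that could flag a filling disk before the database crashes. This is not our actual monitoring setup; the function names, the path, and the 80% and 90% thresholds are all assumptions chosen for illustration.

```python
import shutil


def disk_status(used_bytes, total_bytes, warn=0.80, crit=0.90):
    """Classify disk usage as 'ok', 'warning', or 'critical'.

    The warn/crit thresholds are illustrative assumptions, not tuned values.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    fraction = used_bytes / total_bytes
    if fraction >= crit:
        return "critical"
    if fraction >= warn:
        return "warning"
    return "ok"


def check_path(path="/", warn=0.80, crit=0.90):
    """Return (status, used_fraction) for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return disk_status(usage.used, usage.total, warn, crit), usage.used / usage.total


if __name__ == "__main__":
    # "/" is a placeholder; point this at the database's data volume.
    status, fraction = check_path("/")
    print(f"{status}: {fraction:.0%} of disk used")
```

Run on a schedule (cron, for example) and wired to an alert, a check like this would give a warning well before the disk is completely full.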

☀️ The good news: Even with the database crash, we didn’t lose any data. Your posts, follows, and everything else are all safe and sound.

Quick Note: This outage was not related to our Mastodon 4.3 upgrade, which we’ll be continuing soon now that everything’s stabilized.

You may also have noticed that character limits are back to the Mastodon defaults instead of the longer posts you’re used to. This is temporary; we’ll merge that code change back in once we finish the upgrade to version 4.3.

We deeply appreciate your patience and understanding during this time. These infrastructure upgrades are part of our commitment to making SFBA.social more sustainable, efficient, and reliable. If you have any questions, feel free to reach out!

Thank you for being part of this community,

Your SFBA team:
@seb @cd24 @moritz @ingurido @EverydayMoggie @jeridansky @neuralgraffiti

