Do not ever push to production without testing things first. I went and moved us to the beta branch because this commit caught my eye and I wanted us to have the emoji picker fixed: https://github.com/LemmyNet/lemmy-ui/commit/ae4c37ed4450b194719859413983d6ec651e9609

The beta branch was on dockerhub so I thought that it was at least minimally tested. I was sorely mistaken. Lemmy (the backend) itself would load, but lemmy-ui (the actual website that renders everything) would keep crashing when attempting to load. I couldn’t roll back because the database was migrated to this beta version and it couldn’t migrate back to the old version when attempting to launch said old version.

I had no choice but to restore from backup. We’ve lost a whole day’s worth of posts (anything after 3 AM CST). I’m really, really sorry… :blobcatsadpleading:

I was just so excited to be able to unveil this, I didn’t take my time with actually testing it.

  • Spiffy Diffy · 9 months ago

    It’s time to speedrun “repost everything and re-setup new community any% glitchless”! Here’s hoping for a PB! :ablobcathyper:

  • @Burger (OP) · 9 months ago

    Also, to add: Thank fuck for Borg. It’s one of the few backup solutions that hasn’t corrupted itself just from doing incremental backups.

    • @Mousepad · 9 months ago

      Their reverse-order list empowers them!

    • @Burger (OP) · 9 months ago

      Right. I thought I was restoring the most recent backup, but it turned out the archive list was sorted in ascending order, and I picked the top one thinking it was the latest. So that’s why you saw 8-day-old posts when Burggit was back up for a split second.

      My server automatically takes a backup once daily in the early AM, and keeps (I think? I’d need to check) 7 on hand. Any older than that, and they’re pruned.

      • @CookieJarObserver · 9 months ago

        You should pick one day a week whose backup is stored longer; if there’s a recurring issue, it might corrupt more than a few days’ worth.

        Man, I thought I did something wrong when the posts from the last week were gone.

        • @Burger (OP) · 9 months ago

          Added a 1 per month backup retention. Thanks for the suggestion.
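For reference, the retention discussed in this subthread (7 dailies plus the newly added monthly) maps directly onto Borg’s prune flags. A minimal sketch — the repository path is made up, since the actual Burggit setup isn’t shown in the thread:

```shell
#!/bin/sh
# Hypothetical pruning step matching the retention discussed above:
# keep the last 7 daily archives plus 1 monthly archive, delete the rest.
set -eu

BORG_REPO=/backups/burggit   # Borg repository path (assumed, not from the thread)

prune_archives() {
    borg prune --list \
        --keep-daily 7 \
        --keep-monthly 1 \
        "$BORG_REPO"
    # Pruning only marks segments unused; compact actually reclaims the space
    # (separate step since Borg 1.2).
    borg compact "$BORG_REPO"
}

# Run only when invoked explicitly, e.g. from the nightly cron job:
if [ "${1:-}" = "run" ]; then prune_archives; fi
```

Running `borg prune` without a matching `borg compact` is a common gotcha: the archives disappear from the list, but the disk usage stays the same.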

          • @CookieJarObserver · 9 months ago

            Yeah, I had a problem once where a backup (I made one a day and kept 14) was overwritten by a corrupted data set, and I basically lost most of my data. It would be sad if that happened here when it’s avoidable 👍

  • @Nazrin · 9 months ago

    Daily backups aren’t too bad, though.

    I would recommend a backup right before upgrades, too. Make a sticky note for it if you always forget.

    • @Burger (OP) · 9 months ago

      See, I deliberately didn’t do one before the upgrade because it’d cause 10–15 minutes of downtime: I stop the server so Postgres can be in a consistent state, and then the backup starts. And of course we all know how that turned out, rofl. I clearly wasn’t thinking.

        • @Burger (OP) · 9 months ago

          There’s a whole separate server that’s in charge of storing images (pict-rs); it uses its own database that isn’t anything *SQL (sled). I just think it’s easier to use this solution. Everything’s intact, and the backup task runs in the wee hours of the morning. Besides, I couldn’t get cron to call Docker to execute a pg_dump if my life depended on it. Some shell environment fuckery is my guess. And I just don’t want to mess with troubleshooting it, because it’s a hassle testing why something doesn’t work with cron. This works; I’d rather not change it. It backs up everything, including all the images.
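A stop-the-stack-then-Borg job like the one described could look roughly like this. Every path, repository name, and schedule below is invented for illustration — the thread doesn’t show the real script:

```shell
#!/bin/sh
# Hypothetical nightly backup job, modeled on the workflow described above:
# stop the stack so Postgres (and pict-rs' sled DB) are consistent on disk,
# run an incremental Borg backup, then bring everything back up.
set -eu

APP_DIR=/srv/lemmy           # directory holding docker-compose.yml (assumed)
BORG_REPO=/backups/burggit   # Borg repository path (assumed)

nightly_backup() {
    cd "$APP_DIR"
    docker compose stop                 # quiesce the databases
    borg create --stats --compression zstd \
        "$BORG_REPO::burggit-$(date +%Y-%m-%d)" \
        "$APP_DIR"                      # includes volumes and images
    docker compose start                # bring the site back up
}

# Run only when invoked explicitly, e.g. from cron:
#   30 3 * * * /usr/local/bin/nightly-backup.sh run
if [ "${1:-}" = "run" ]; then nightly_backup; fi
```

Stopping the containers trades 10–15 minutes of downtime for a guaranteed-consistent copy of the Postgres and sled data directories, which sidesteps the pg_dump-from-cron problem entirely.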

          For a time when I was running this on my home box, Proxmox had a nifty backup tool that freezes the filesystem in place, takes a snapshot, then backs up said snapshot as a compressed tarball. It’s deduplicated too if you run Proxmox Backup Server. This is a VPS, though, not a dedi. There’s, of course, LVM for taking snapshots, but I don’t want to rip everything up and start over, since this is running raw ext4 with no volume management whatsoever.

          • nickwitha_k (he/him) · 9 months ago

            Makes sense. I just don’t like folks having to do tedious work, if it isn’t needed (also my job). For the cron, I might suspect path or permissions (most often these, in my experience). I find the easiest way to diagnose is to wrap the intended command in a bash script that writes stdout and stderr to files, acting like basic logs.
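A minimal sketch of that wrapper idea, with made-up paths (cron’s stripped-down `PATH` is the usual culprit):

```shell
#!/bin/sh
# Illustrative cron wrapper: run the real command with a sane PATH and
# capture stdout/stderr to log files so cron failures become visible.
# The log location and the example cron line are invented for this sketch.
set -u

export PATH=/usr/local/bin:/usr/bin:/bin      # cron usually gives a minimal PATH

LOG_DIR="${LOG_DIR:-${TMPDIR:-/tmp}/cron-debug}"
mkdir -p "$LOG_DIR"

# Run whatever was passed to the wrapper, logging both streams, e.g.:
#   0 3 * * * /usr/local/bin/cron-wrap.sh pg_dump -U lemmy lemmy
"$@" >>"$LOG_DIR/stdout.log" 2>>"$LOG_DIR/stderr.log"
```

Once the wrapped command runs from cron, `stderr.log` typically shows exactly which binary wasn’t found or which permission was missing.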

            Glad you’re back up and running!

            • @Burger (OP) · 9 months ago

              I guess I didn’t clarify. I have a cron job that automatically turns the server off via docker compose and runs the Borg backup which seems to work perfectly. There’s no manual intervention at all. I’m not manually turning it off and manually doing the backup.

              I really appreciate your suggestions, though. I just don’t want to touch something that already saved my bacon and risk some workflow somewhere screwing up without my being aware of it when I need to restore from backup. This will probably all be moot anyway, since moving to a dedi is a possibility in the future; then I’d be able to use a hypervisor and just back up full VM images.

              • nickwitha_k (he/him) · 9 months ago

                Yeah. Makes sense - best to have something that you absolutely know works. Having the dedi will be really nice - having control of the hypervisor should let you avoid a lot of issues and make testing new updates easier (clone prod, update the clone, test on the clone, swap LB backend to point to the clone and drain old backend, hold old prod VM for a bit to make rollback quick, if needed).

  • @marisa1 · 9 months ago

    Well… I came back after this was solved. Very convenient! lol