A week of downtime, and all the servers were recovered only because the customer had a proper disaster recovery protocol and held backups somewhere else. Otherwise everything would have been lost, because Google deleted the backups too.

Google Cloud’s CEO says “it won’t happen anymore”. It’s insane that “instant delete everything” is even a possibility.

  • Mossy Feathers (She/They)@pawb.social
    arrow-up
    144
    ·
    edit-2
    8 months ago

    They said the outage was caused by a misconfiguration that resulted in UniSuper’s cloud account being deleted, something that had never happened to Google Cloud before.

    Bullshit. I’ve heard of people having their Google accounts randomly banned or even deleted before. Remember when the Terraria devs cancelled the Stadia port of Terraria because Google randomly banned their account and then took weeks to acknowledge it? The only reason why Google responded so quickly to this is because the super fund manages over $100b and could sue the absolute fuck out of Google.

    • Pechente@feddit.de
      English
      arrow-up
      53
      ·
      8 months ago

      This happened to me years ago. Suddenly got a random community guidelines violation on YouTube for a 3 second VFX shot that was not pornographic or violent and that I owned all the rights to. After that my whole Google account was locked down. I never found out what triggered this response and I could never resolve the issue with them since I only ever got automated responses. Fuck Google.

      • NaN@lemmy.sdf.org
        English
        arrow-up
        16
        ·
        8 months ago

        This sort of story is what made me switch away from Google Fi and ultimately mostly degoogling. Privacy was a big part later on, but initially it was realizing that a YouTube comment or a file in my drive could get my cell service turned off.

    • umbrella@lemmy.ml
      arrow-up
      3
      ·
      edit-2
      8 months ago

      one of my accounts was locked for no reason once. i apparently did well to not trust important data to them anymore.

    • RegalPotoo@lemmy.world
      English
      arrow-up
      44
      ·
      8 months ago

      Because accountants mostly.

      For large businesses, you essentially have two ways to spend money:

      • OPEX: “operational expenditure” - this is money that you spend on an ongoing basis, things like rent, wages, the 3rd party cleaning company, cloud services etc. The expectation is that when you use OPEX, the money disappears off the books and you don’t get a tangible thing back in return. Most departments will have an OPEX budget to spend for the year.
      • CAPEX: “capital expenditure” - buying physical stuff, things like buildings, stock, machinery and servers. When you buy a physical thing, it gets listed as an asset on the company accounts, usually being “worth” whatever you paid for it. The problem is that things tend to lose value over time (with the exception of property), so when you buy a thing the accountants will want to know a depreciation rate - how much value it will lose per year. For computer equipment, this is typically ~20%, being “worthless” in 5 years. Departments typically don’t have a big CAPEX budget, and big purchases typically need to be approved by the company board.

      This leaves companies in a slightly odd spot where, from an accounting standpoint, it might look better on the books to spend $3 million/year on cloud stuff than $10 million every 5 years on servers.
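
      To put rough numbers on that comparison, here is a minimal sketch using only the figures from the comment above ($10 million of servers written off over 5 years, versus $3 million/year of cloud spend). It assumes simple straight-line depreciation and ignores staff, power, hosting and tax effects, so treat it as an illustration rather than real accounting. The function and variable names are made up for this example.

```python
def straight_line_book_value(cost: float, useful_life_years: int, year: int) -> float:
    """Book value of an asset after `year` years of straight-line depreciation."""
    annual_depreciation = cost / useful_life_years
    return max(cost - annual_depreciation * year, 0.0)


# Figures taken from the comment; everything else is an assumption.
SERVER_COST = 10_000_000      # CAPEX: one purchase every 5 years
USEFUL_LIFE_YEARS = 5         # ~20% per year, "worthless" after 5 years
CLOUD_PER_YEAR = 3_000_000    # OPEX: recurring cloud bill

annual_depreciation = SERVER_COST / USEFUL_LIFE_YEARS
cloud_total = CLOUD_PER_YEAR * USEFUL_LIFE_YEARS

# How the servers would appear on the books year by year.
for year in range(USEFUL_LIFE_YEARS + 1):
    value = straight_line_book_value(SERVER_COST, USEFUL_LIFE_YEARS, year)
    print(f"year {year}: servers carried at ${value:,.0f}")

print(f"annual depreciation charge: ${annual_depreciation:,.0f}")
print(f"cloud spend over the same period: ${cloud_total:,.0f}")
print(f"extra cash paid for cloud: ${cloud_total - SERVER_COST:,.0f}")
```

      Under these assumptions the cloud route costs more cash over the five years, but it never shows up as a large asset purchase that needs board approval and a depreciation schedule.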

      • TCB13@lemmy.world
        English
        arrow-up
        25
        arrow-down
        1
        ·
        8 months ago

        Excellent explanation, however, technically it does not constitute an “odd spot.” Rather, it represents a “100% acceptable and evident position” as it brings benefits to all stakeholders, from accounting to the CEO. Moreover, it is noteworthy that investing in services or leasing arrangements increases expenditure, resulting in reduced tax liabilities due to lower reported profits. Compounding this, the prevailing high turnover rate among CEOs diminishes incentives for making significant long-term investments.

        In certain instances, there is also plain corruption. This occurs when a supplier offering services such as computer and server leasing or software, as well as company car rentals, is owned by a friend or family member of a C-level executive.

      • kcuf@lemmy.world
        arrow-up
        3
        ·
        8 months ago

        I read OP’s comment as being a question about using a company with a reputation like Google’s rather than about using a cloud service at all, but I could be wrong.

    • Chozo@fedia.io
      arrow-up
      38
      arrow-down
      4
      ·
      8 months ago

      Money. It’s a lot cheaper to let somebody else maintain your systems than to pay somebody to create and maintain your own, directly.

      • Tryptaminev@lemm.ee
        arrow-up
        15
        arrow-down
        1
        ·
        8 months ago

        If you are a small company then yes. But I would argue that for larger companies this doesn’t hold true. If you have 200 employees you’ll need an IT department either way. You need IT expertise either way. So having some people who know how to plan, implement and maintain physical hardware makes sense too.

        There is a break-even point between economies of scale and the added effort of coordinating between your company and the service provider, plus paying that service provider’s overhead and profits.

        • matti@sopuli.xyz
          arrow-up
          4
          arrow-down
          3
          ·
          8 months ago

          If coordinating with service providers is hard for a firm, I would argue the cost-effective answer isn’t “let’s do all this in house”. Many big finance firms fall into the trap of thinking it’s cheaper to build than to buy, and that’s how you get everyone building their own worse versions of everything. Whether your firm is good at the markets or kitchens or travel bookings, thinking you can efficiently in-source tech is a huge fallacy.

          • Tryptaminev@lemm.ee
            arrow-up
            3
            ·
            8 months ago

            It is not about it being hard. It simply takes effort to coordinate, and this effort needs to be considered. If you do things externally, that means there are two PMs to pay, you need QMs on both sides, you need two legal/contract teams, you need to pay someone in procurement and someone in sales…

            I agree with you that doing software in-house when there are good options on the market is usually not a good idea. But for infrastructure I don’t see as much of an efficiency loss. Especially as you very much need experts on how to set things up in a cloud environment, and you had better look carefully at how many resources you need so you don’t overpay by huge amounts.

      • nehal3m@sh.itjust.works
        arrow-up
        4
        ·
        edit-2
        8 months ago

        Except for the larger companies you still need a bunch of trained experts in house to manage everything.

        • Optional@lemmy.world
          arrow-up
          3
          ·
          8 months ago

          Yes, and they’re the company’s resources, so they theoretically do what’s best for the company, as opposed to hoping Google (or, god forbid, Microsoft) does it.

          The money gets paid either way, and if you have good people it’s often the right call to keep it in house, but inevitably somebody read a business book last year and wants to lay off all the IT people and let Google handle it “for savings”. Later, directors are amazed at how much money they’re spending just to host and use the data they used to have in-house, because they don’t own anything anymore.

          There are still benefits - cloud DevOps tools are usually pretty slick, and unless your company has built a bunch of those already or is good about doing it, it might still be worth it in terms of being able to change quickly. But it’s still a version of the age old IT maxim to never own or build it yourself when you can pay someone a huge subscription and then sue them if you have to. I don’t like it, but it’s pretty much iron in the executive suite.

          As a result, IT departments or companies spend much more than half of their time - totalling years or decades - moving from whatever they were using to whatever is supposed to be better. Almost all of that effort is barely break-even if not wasted. That’s just the nature of the beast.

      • PowerCrazy@lemmy.ml
        English
        arrow-up
        3
        ·
        8 months ago

        It’s absolutely not. If you are at any kind of scale whatsoever, your yearly spend at a cloud provider will be a minimum of 2x what it costs to create and operate the same system locally, including all the employees, contracts, etc.

        • allywilson@lemmy.ml
          arrow-up
          8
          arrow-down
          1
          ·
          8 months ago

          Why do you think it’s invasive? How do you quantify which providers are less invasive?

          • GolfNovemberUniform@lemmy.ml
            arrow-up
            10
            arrow-down
            11
            ·
            8 months ago

            Google is one of the most privacy invasive companies in the world. And judging by encryption standards, terms of service and privacy policies

            • settoloki@lemmy.one
              arrow-up
              9
              arrow-down
              3
              ·
              8 months ago

              Are you sure you’ve not just read bad stuff without verification on the internet and feel the need to chime in on something you don’t fully understand?

                • settoloki@lemmy.one
                  arrow-up
                  12
                  arrow-down
                  2
                  ·
                  8 months ago

                  Me too, as a programmer who uses Google Cloud to store government information. Which bit of the policy says they are going to access your data? It shouldn’t take you long to link it to me if you read them as much as you say. Unless what you’re actually doing is spreading misinformation and bullshit.

            • Pup Biru@aussie.zone
              arrow-up
              4
              arrow-down
              1
              ·
              8 months ago

              and you know the security standards that are achievable on google cloud entirely negate your point right? their cloud offering is a totally different beast

    • Kit@lemmy.blahaj.zone
      arrow-up
      10
      arrow-down
      1
      ·
      8 months ago

      G Suite is a legitimate option for small-medium businesses. It’s seen as the cheaper, simpler option versus Azure. I usually recommend it for nonprofits as they have a decent free option for 501c3 orgs.

    • Karna@lemmy.ml
      arrow-up
      4
      ·
      8 months ago

      Money and Time – It’s rather easier/cheaper for Organizations nowadays to outsource a part of infra to Cloud service providers.

  • TCB13@lemmy.world
    English
    arrow-up
    47
    ·
    edit-2
    8 months ago

    “This is an isolated, ‘one-of-a-kind occurrence’ that has never before occurred with any of Google Cloud’s clients globally. This should not have happened.”

    I don’t believe this is that rare. What I believe is that this was the first time it happened to a company with enough exposure to actually have an impact and reach the media.

    Either way Google’s image won’t ever recover from this and they just lost what small credibility they had on the cloud space and won’t be even considered again by any institution in the financial market - you know the people with the big money - and there’s no coming back from this.

    • TeoTwawki@lemmy.world
      English
      arrow-up
      13
      ·
      edit-2
      8 months ago

      It has 100% happened before and just never been admitted to. I have both dealt with the aftermath first-hand and heard about it from other, smaller companies. I work at a medium-sized MSP and disaster recovery is in my wheelhouse.

    • Hirom@beehaw.org
      arrow-up
      5
      ·
      8 months ago

      They had backups at multiple locations, and lost data at multiple (Google Cloud) locations because of the account deletion.

      They restored from backups stored at another provider. It may have been more devastating if they relied exclusively on google for backups. So having an “offsite backup” isn’t enough in some cases; that offsite location needs to be at a different provider.

      • heluecht@pirati.ca
        arrow-up
        6
        ·
        8 months ago

        @Hirom With “offsite” I mean either a different cloud provider or own hardware (if you hold your regular data at some cloud provider, like in this case).

        • Hirom@beehaw.org
          arrow-up
          1
          ·
          8 months ago

          That would indeed be a good backup strategy, but better be specific. “Offsite” may be interpreted in different ways.

      • Tangentism@lemmy.ml
        arrow-up
        2
        ·
        8 months ago

        It may have been more devastating if they relied exclusively on google for backups.

        Which is why having any data, despite the number of backups, on a cloud provider shouldn’t be seen as off-site.

        Only when it is truly outside their ecosphere and cannot be touched by them should it be viewed as such.

        If that company didn’t have such resilience built into their backup plan, they would be toast with a derisory amount of compensation from Google.

        • Hirom@beehaw.org
          arrow-up
          2
          ·
          8 months ago

          Having a backup at a cloud provider is fine, as long as there is at least one other backup that isn’t with this provider.

          Cloud providers seem to do a good job protecting against hardware failure, but can do poorly with arbitrary account bans, and sometimes have mishaps due to configuration problems.

          Whereas a DIY backup solution is often more subject to hardware problems (disk failure, fire, flooding, theft, …), but there’s no risk of account problems.

          A mix is fine to protect against different kinds of issues.
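
          The rule discussed here (copies in several regions of one provider still count as a single point of failure) can be expressed as a small check. This is only a sketch: the `BackupCopy` type, the provider labels and the helper names are hypothetical, and it models nothing beyond "who controls the account".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    provider: str  # whoever controls the account, e.g. "cloud-a" or "self-hosted"
    location: str  # region or physical site; irrelevant if the account is deleted


def independent_providers(copies: list[BackupCopy]) -> set[str]:
    """Distinct parties that would each have to fail to lose every copy."""
    return {copy.provider for copy in copies}


def survives_account_deletion(copies: list[BackupCopy], deleted_provider: str) -> bool:
    """True if at least one copy remains after one provider wipes the account."""
    return any(copy.provider != deleted_provider for copy in copies)


# Two regions, one provider: geo-redundant, but one account deletion removes both.
same_provider = [
    BackupCopy("cloud-a", "region-1"),
    BackupCopy("cloud-a", "region-2"),
]

# Same setup plus one copy held by a different party.
mixed = same_provider + [BackupCopy("cloud-b", "region-1")]

print(survives_account_deletion(same_provider, "cloud-a"))  # False
print(survives_account_deletion(mixed, "cloud-a"))          # True
print(len(independent_providers(mixed)))                    # 2
```

          Counting providers rather than locations is the whole point: the first setup has two locations but only one party that can delete it.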

          • Tangentism@lemmy.ml
            arrow-up
            2
            ·
            edit-2
            8 months ago

            as long as there is at least one other backup that isn’t with this provider.

            Which is exactly what I was saying.

            Any services used with a cloud provider should be treated as one entity, no matter how many geo-locations they claim your data is backed up to, because the provider is a single point from which all of those copies can be deleted.

            When I was last involved in a company’s backups, we had a fire safe in the basement, we had an off-site location with another fire safe, and third copies would go off to another company that provided a backup storage solution. For all backups to be deleted, someone had to go right out of their way to do so; a simple deletion of our account would not wipe all the backups.

            That company had the foresight to do something similar & it’s saved them. [edited - was on the tube when I wrote this and didn’t see the autocorrect had put ‘comment’, not ‘company’]

  • Optional@lemmy.world
    arrow-up
    21
    ·
    8 months ago

    While UniSuper normally has duplication in place in two geographies, to ensure that if one service goes down or is lost then it can be easily restored, because the fund’s cloud subscription was deleted, it caused the deletion across both geographies.

    TFW your BCDR gets disastered.

    Also, “massive misconfiguration” is the “spontaneous disassembly” of cloud computing. I’m sure it’s multiple misconfigured systems causing chaos, but it sounds hilarious.

  • Simon@lemmy.dbzer0.com
    arrow-up
    12
    ·
    8 months ago

    Just an FYI in case you don’t follow cloud news: Google has deleted customers’ accounts on multiple occasions and has been doing so for literal years. This time they just did it to someone large enough to make the news. I work in SRE and no longer recommend GCP to anyone.

  • AutoTL;DR@lemmings.worldB
    English
    arrow-up
    2
    ·
    8 months ago

    This is the best summary I could come up with:


    More than half a million UniSuper fund members went a week with no access to their superannuation accounts after a “one-of-a-kind” Google Cloud “misconfiguration” led to the financial services provider’s private cloud account being deleted, Google and UniSuper have revealed.

    Services began being restored for UniSuper customers on Thursday, more than a week after the system went offline.

    Investment account balances would reflect last week’s figures and UniSuper said those would be updated as quickly as possible.

    In an extraordinary joint statement from Chun and the global CEO for Google Cloud, Thomas Kurian, the pair apologised to members for the outage, and said it had been “extremely frustrating and disappointing”.

    “These backups have minimised data loss, and significantly improved the ability of UniSuper and Google Cloud to complete the restoration,” the pair said.

    “Restoring UniSuper’s Private Cloud instance has called for an incredible amount of focus, effort, and partnership between our teams to enable an extensive recovery of all the core systems.


    The original article contains 412 words, the summary contains 162 words. Saved 61%. I’m a bot and I’m open source!