When I first typed up my comment I said "their current business approach" and then corrected it to technical since - yeah, in the short term it probably isn't hurting their pocketbooks too much. The issue is that it seems like a lot more folks are seriously considering switching off Windows - we'll see if this actually is the year of the Linux desktop (it never seems to be in the end), but it certainly seems to be souring their brand reputation in a major way.
Honestly, AI management would probably be better. "You're a competent manager; you're not allowed to break or circumvent workers' rights laws, you must comply with our CSR and HR policies, provide realistic estimates, and deliver stable and reliable products to our customers." Then just watch half the tech sector break down due to a lack of resources, or watch as profit is cut in half.
All the cool kids move fast and break things. Why not the same for core infrastructure providers? Let's replace our engineers with markdown files named after them.
I'm happy that they're being transparent about it. There's no good way to take downtime, but at least they don't try to cover it up. We can adjust and they'll make it better. I'm sure a retro is on its way; it's been quite the bumpy month.
Copilot is shown as having policy issues in the latest reports. Oh my, the irony. Satya is like "look ma, our stock is dropping..." Gee, I wonder why, mister!
GitHub has had customer visible incidents large enough to warrant status page updates almost every day this year (https://www.githubstatus.com/history).
This should not be normal for any service, even at GitHub's size. There's a joke that your workday usually stops around 4pm, because that's when GitHub Actions goes down every day.
I wish someone inside the house cared to comment on why the services barely stay up, and what actions they're planning to take to fix this issue - one that's been going on for years but has definitely accelerated in the past year or so.
It's 100% because the number of operations happening on GitHub has likely 100x'd since the introduction of coding agents. They built GitHub for one kind of scale, and the problem is that they've suddenly found themselves facing a new kind of scale.
That doesn't normally happen to platforms of this size.
ISTR that the lift-n-shift started like ... 3 years ago? That much of it was already shifted to Azure ... 2 years ago?
The only thing that changed in the last 1 year (if my above two assertions are correct (which they may not be)) is a much-publicised switch to AI-assisted coding.
We've migrated to Forgejo over the last couple of weeks. We position ourselves[0] as an alternative to the big cloud providers, so it seemed very silly that a critical piece of our own infrastructure could be taken out by a GitHub or Azure outage.
It has been a pretty smooth process. Although we have done a couple of pieces of custom development:
1) We've created a Firecracker-based runner, which runs CI jobs in Firecracker VMs. This brings the Forgejo Actions running experience much more closely into line with GitHub's environment (a VM, rather than a container) - a rough sketch of the idea is included after this list. We hope to contribute this back shortly, but also drop me a message if this is of interest.
2) We're working up a proposal[1] to add environments and variable groups to Forgejo Actions. This is something we expect to need for some upcoming compliance requirements.
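For the curious, here is a minimal sketch of what item 1) involves under the hood: driving Firecracker's HTTP-over-Unix-socket API to boot a one-shot job VM. This is not the poster's actual runner code - the socket path, kernel/rootfs paths, and VM size are all assumptions:

    package main

    import (
        "bytes"
        "context"
        "net"
        "net/http"
    )

    // put sends one JSON config document to the Firecracker API socket.
    func put(c *http.Client, path, body string) {
        req, _ := http.NewRequest(http.MethodPut, "http://localhost"+path,
            bytes.NewBufferString(body))
        req.Header.Set("Content-Type", "application/json")
        if resp, err := c.Do(req); err == nil {
            resp.Body.Close()
        }
    }

    func main() {
        // Assumes a VMM was already started with:
        //   firecracker --api-sock /tmp/fc.sock
        client := &http.Client{Transport: &http.Transport{
            DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
                return net.Dial("unix", "/tmp/fc.sock")
            },
        }}
        // One throwaway VM per CI job: kernel, rootfs, machine size, then start.
        put(client, "/boot-source", `{"kernel_image_path":"/srv/runner/vmlinux","boot_args":"console=ttyS0 reboot=k panic=1"}`)
        put(client, "/drives/rootfs", `{"drive_id":"rootfs","path_on_host":"/srv/runner/job.ext4","is_root_device":true,"is_read_only":false}`)
        put(client, "/machine-config", `{"vcpu_count":2,"mem_size_mib":4096}`)
        put(client, "/actions", `{"action_type":"InstanceStart"}`)
    }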
I really like Forgejo as a project, and I've found the community to be very welcoming. I'm really hoping to see it grow and flourish :D
Screw GitHub, seriously. This unreliability is not acceptable. If I’m in a position where I can influence what code forge we use in future I will do everything in my power to steer away from GitHub.
Every company I’ve worked in the last 10 years used GH for the internal codebase hosting , PRs and sometimes CI. Discoverability doesn’t really come into picture for those users and you can still fork things from GitHub even if you don’t host your core code infra on it
They're in the process of moving from "legacy" infra to Azure, so there's a ton of churn happening behind the scenes. That's probably why things keep exploding.
I don't know jack about shit here, but genuinely: why migrate a live production system piecewise? Wouldn't it be far more sane to start building a shadow copy on Azure and let that blow up in isolation while real users keep using the real service on """legacy""" systems that still work?
Because it's significantly harder to isolate problems and you'll end up in this loop
* Deploy everything
* It explodes
* Rollback everything
* Spend two weeks finding problem in one system and then fix it
* Deploy everything
* It explodes
* Rollback everything
* Spend two weeks finding a new problem that was created while you were fixing the last problem
* Repeat ad nauseum
Migrating iteratively gives you a foundation to build upon with each component
Of course, you need some way of producing test loads similar to those found in production. One way would be to take a snapshot of production, tap incoming requests for a few weeks, log everything, then replay it at "as fast as we can" speed for testing; another way would be to just mirror production live, running the same operations in test as run in production.
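A sketch of the live-mirroring option, assuming an HTTP-shaped service: a tee proxy that answers from the legacy backend and asynchronously replays each request against the new stack, discarding the shadow response. Hostnames are placeholders, and note this does nothing about the duplicate-side-effects problem raised below:

    package main

    import (
        "bytes"
        "io"
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        legacy, _ := url.Parse("http://legacy.internal:8080") // system of record
        shadow := "http://new-stack.internal:8080"            // migration target
        proxy := httputil.NewSingleHostReverseProxy(legacy)

        handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            body, _ := io.ReadAll(r.Body)
            r.Body = io.NopCloser(bytes.NewReader(body)) // restore for the real proxy
            // Replay a copy out of band; users only ever see the legacy answer.
            go func(method, uri string, payload []byte) {
                req, err := http.NewRequest(method, shadow+uri, bytes.NewReader(payload))
                if err != nil {
                    return
                }
                resp, err := http.DefaultClient.Do(req)
                if err != nil {
                    log.Printf("shadow failed: %v", err) // diff status/latency here
                    return
                }
                resp.Body.Close()
            }(r.Method, r.URL.RequestURI(), body)
            proxy.ServeHTTP(w, r)
        })
        log.Fatal(http.ListenAndServe(":8080", handler))
    }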
Alternatively, you could take the "chaos monkey" approach (https://www.folklore.org/Monkey_Lives.html), do away with all notions of realism, and just fuzz the heck out of your test system. I'd go with that, first, because it's easy, and tends to catch the more obvious bugs.
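If fuzzing appeals, Go's built-in fuzzer makes the "throw garbage at it" approach nearly free. `ParseRef` here is a hypothetical stand-in for whichever migrated component you want to shake:

    package vcs

    import "testing"

    // Run with: go test -fuzz=FuzzParseRef
    // The only invariant asserted is "never panic"; add real oracles
    // (e.g. compare against the legacy implementation) as you go.
    func FuzzParseRef(f *testing.F) {
        f.Add("refs/heads/main") // seed corpus
        f.Add("refs/tags/v1.0.0")
        f.Fuzz(func(t *testing.T, ref string) {
            _, _ = ParseRef(ref) // hypothetical function under test
        })
    }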
So just double your cloud bill for several weeks, costing a site like GitHub millions of dollars?
How do you handle duplicate requests to external services? Are you going to run credit cards twice? Send emails twice? If not, how do you know it's working with fidelity?
If you make it work, migrating piecewise should be less change/risk at each junction than a big jump between here and there of everything at once.
But you need to have pieces that are independent enough to run some here and some there, and ideally pieces that can fail without taking down the whole system.
That’s a safer approach but will cause teams to need to test in two infrastructures (old world and new) til the entire new environment is ready for prime time. They’re hopefully moving fast and definitely breaking things.
1. Stateful systems (databases, message brokers) are hard to switch back-and-forth; you often want to migrate each one as few times as possible.
2. If something goes sideways -- especially performance-wise -- it can be hard to tell the reason if everything changed.
3. It takes a long time (months/years) to complete the migration. By doing it incrementally, you can reap the advantages of the new infra, and avoid maintaining two things.
I think it's more likely the introduction of the ability to say "fix this for me" to your LLM + "lgtm" PR reviews. That or MS doing their usual thing to acquired products.
Rumors I've heard were that GitHub is mostly run by contractors? That might explain the chaos more than simple vibe coding (which probably aggravates it).
Definitely. The devil is in the details though since it's so damn hard to quantify the $$$ lost when you have a large opinionated customer base that holds tremendous grudges. Doubly so when it's a subscription service with effectively unlimited lifetime for happy accounts.
Business by spreadsheet is super hard for this reason - if you try to charge the maximum you can before people get angry and leave then you're a tiny outage/issue/controversy/breach from tipping over the wrong side of that line.
Yeah, but who cares about the long term? In the long term we are all dead. A CEO only needs to be good for 5-10 years max: pump up the stock price, get applause everywhere, and be called the smartest guy in the world.
I think the last major outage wasn't even two weeks ago. We've got about another 2 weeks to finish our MVP and get it launched, and... this really isn't helpful. I'm getting pretty fed up with the unreliability.
I can help you restore from backups if you will tell me where you backed it up.
You did back it up, right? Right before you ran me with `--allow-dangerously-skip-permissions` and gave me full access to your databases and S3 buckets?
More like the Tay.ai and Zoe.ai AIs are still arguing amongst themselves, unable to keep the service online for Microsoft after they replaced their human counterparts.
It probably depends on your scale, but I'd suggest self-hosting a Forgejo instance, if it's within your domain expertise to run a service like that. It's not hard to operate, it will be blazing fast, it provides most of the same capabilities, and you'll be in complete control over the costs and reliability.
A few people have replied to you mentioning Codeberg, but that service is intended for open source projects, not private commercial work.
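For scale reference, a single-node Forgejo deployment is roughly this much configuration - a hedged sketch; the image tag, ports, and volume layout are assumptions, so check Forgejo's install docs for current values:

    services:
      forgejo:
        image: codeberg.org/forgejo/forgejo:11   # pick a current release tag
        environment:
          - USER_UID=1000
          - USER_GID=1000
        volumes:
          - ./forgejo-data:/data                 # repos, config, SQLite DB
        ports:
          - "3000:3000"                          # web UI
          - "2222:22"                            # SSH clone/push
        restart: always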
I would imagine that's what everyone is doing instead of sitting on their hands. Set up a different remote and have your team push/pull to/from it until GitHub comes back up. I mean, you could probably use ngrok and set up a remote on your laptop in a pinch. You shouldn't be totally blocked except for things like automated deployments or builds tied specifically to github.com
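The git side of that is only a few commands (remote name and host are placeholders):

    # any box you control becomes the stand-in remote during the outage
    git remote add fallback git@backup.example.com:team/repo.git
    git push fallback --all
    git push fallback --tags
    # teammates add the same remote, then:
    git pull fallback main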
I've been using https://radicle.xyz/ + https://radicle-ci.liw.fi/ (in combination with my own ci adapter for nix flakes) for about half a year now for (almost) all my public and private repos and so far I really like it.
> What are good alternatives to GitHub for private repos + actions? I'm considering moving my company off of it because of reliability issues.
Dunno about actions[1], but I've been using a $5/m DO droplet for the last 5 years for my private repo. If it ever runs out of disk space, an additional 100GB of mounted storage is an extra $10/m
I've put something on it (Gitea, I think) that has the web interface for submitting PRs, reviewing them, merging them, etc.
I don't think there is any extra value in paying a git-hosting SaaS more for a single user than I pay for a DO droplet that served (at peak) 20 users.
----------------------
[1] Tried using Jenkins, but alas, a $5/m DO droplet is insufficient to run Jenkins. I mashed up shell scripts + Makefiles in a loop, with a `sleep 60` between iterations.
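That loop is roughly this much machinery (a sketch with assumed paths, not the poster's actual script):

    while true; do
        git -C /srv/repo fetch -q origin
        if [ "$(git -C /srv/repo rev-parse HEAD)" != "$(git -C /srv/repo rev-parse origin/main)" ]; then
            git -C /srv/repo merge --ff-only origin/main && make -C /srv/repo ci
        fi
        sleep 60
    done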
For me it is their history of high-impact easily avoidable security bugs. I have no idea why "send a reset password link to an address from an unauthenticated source" was possible at all.
Nah at a small scale it's totally fine, and IME pretty pain-free after you've got it running. The biggest pain points are A) It's slow, B) between auth, storage, and CI runners, you have a lot of unavoidable configuration to do, and C) it has a lot of different features so the docs are MASSIVE.
Not really. About average in terms of infrastructure maintenance. Have been running our orgs instance for 5 years or so, half that time with premium and half the time with just the open source version, running on kubernetes... ran it in AWS at first, then migrated to our own infrastructure.
At my last job I ran a GitLab instance on a tiny AWS server and ran workers on old desktop PCs in the corner of the office.
It's pretty nice if you don't mind it being some of the heaviest software you've ever seen.
I also tried gitea, but uninstalled it when I encountered nonsense restrictions with the rationale "that's how GitHub does it". It was okay, pretty lightweight, but locking out features purely because "that's what GitHub does" was just utterly unacceptable to me.
One thing that always bothered me about Gitea is that they wouldn't even dogfood for a long time. GitLab has been developing on GitLab since forever, basically.
Ad hominem isn't a very convincing argument, and as someone who also enjoys Forgejo, it doesn't make me feel good to see it used as the justification by another recommender.
From [1] "Forgejo was created in October 2022 after a for profit company took over the Gitea project."
Forgejo became a hard fork in 2024, with both projects diverging. If you're using it for local hosting I don't personally see much of a difference between them, although that may change as the two projects evolve.
I'm not OP, but: Forgejo is much lighter-weight than GitLab for my use case, and it was cited as a better-maintained fork of Gitea - but that's just anecdote from my brain and I don't have sources, so take it with a truckload of salt.
I'd had a Gitea instance before, and it was appealing insofar as having the ability to mirror from or to a public repo; it had Docker container registry capability, it ties into OAuth, etc. I'm sure GitLab has much/all of that too, but Forgejo's tiny, tiny footprint was very appealing for my resource-constrained self-hosted environment.
Pretty clear that companies like Microsoft are actually terrible at engineering; their core products were built 30 years ago. Any changes now are generally extremely incremental and quickly rolled back when issues appear. Trying to innovate at GitHub shows just how bad they are.
It's not just MSFT, it's all of big tech. They basically run as a cartel, destroy competition through illegal means, engage in regulatory capture, and ensure their fiefdoms reign supreme.
All the more reason why they should be sliced and diced into oblivion.
Yeah, I have worked at a few FAANGs; honestly it's stunning how entrenched and bad some of the products are. Internally, they are completely incapable of making any meaningful product changes - the whole thing will break.
It's a general curse of anything that becomes successful at a BigCorp.
The engineers who built the early versions were folks at the top of their field, and compensated accordingly. Those folks have long since moved on, and the whole thing is maintained by a mix of newcomers and whichever old hands didn't manage to promote out, while the PMs shuffle the UX to justify everyone's salary...
I'm not even sure I'd say they were "top"; I'd more just say it's a different type of engineer, one that either doesn't get promoted to a big-impact role at a place like Microsoft, or leaves on their own.
Yes, for personal projects I just self-host an instance of forgejo with dokploy. Everything else I deploy on codeberg, which is also an instance of forgejo.
I wonder what the value is of having a dedicated X (formerly Twitter) status account post-2023, when people without an account will see a mix of entries from 2018, 2024, and 2020 in no particular order upon opening it.
Is it just there so everyone can quickly share their post announcing they're back?
At their core, antitrust cases are about monopolies and how companies use anti-competitive conduct to maintain them.
GitHub isn't the only source control offering in the market. Unless they're doing something obvious and nefarious, it's doubtful the Justice Department will step in when you can simply choose one of many others like Bitbucket, SourceForge, GitLab, SVN, CVS, Fossil, Darcs, or Bazaar.
There's just too much competition in the market right now for the govt to do anything.
Minimal changes have occurred to the concept of “antitrust” since its inception as a form of societal justice against corporations, at least per my understanding.
I doubt policymakers in the early 1900s could have predicted the impact of technology and globalization on the corporate landscape, especially vis-à-vis “vertical integration”.
Personally, I think vertical integration is a pretty big blind spot in laws and policies that are meant to ensure that consumers are not negatively impacted by anticompetitive corporate practices. Sure, “competition” may exist, but market activity often shifts meaningfully in a direction that is harmful to consumers once the biggest players swallow another piece of the supply chain (or product concept), and not just their competitors.
There was a change in the enforcement of antitrust law in the 1970s. Consumer welfare, which came to mean lower prices, became the standard. Effectively, normal competition is fine, and it takes egregious behavior to constitute a violation. It even assumes that big companies are more efficient, which makes up for the lack of competition.
The other change is reluctance to break up companies. The AT&T breakup was a big deal. Microsoft survived its antitrust trial without being broken up. Tech companies can only be broken up vertically, but maybe the forced competition would be enough.
Not really. It's a network effect, like Facebook. Value scales quadratically with the number of users, because nobody wants to "have to check two apps".
We should buy out monopolies like the Chinese government does. If you corner the market, then you get a little payout and a "You beat capitalism! Play again?" prize. Other companies can still compete but the customers will get a nice state-funded high-quality option forever.
Not sure how having downtime is an anti-competition issue. I'm also not sure how you think you can take things away from people. Do you think someone just gave them GitHub and can then take it away? Who are you expecting to take it away? Also, does your system have 100% uptime?
Companies used to be forced to sell parts of their business when antitrust was involved. The issue isn't the downtime, they should never have been allowed to own this in the first place.
There was just a recent case with Google to decide if they would have to sell Chrome. Of course the Judge ruled no. Nowadays you can have a monopoly in 20 adjacent industries and the courts will say it's fine.
You've been banging on about this for a while, I think this is my third time responding to one of your accounts. There is no antitrust issue, how are they messing with other competitors? You never back up your reasoning. How many accounts do you have active since I bet all the downvotes are from you?
I've had two accounts. I changed because I don't like the history (maybe one other person has the same opinion I did?). Anyways it's pretty obvious why this is an issue. Microsoft has a historical issue with being brutal to competition. There is no oversight as to what they do with the private data on GitHub. It's absolutely an antitrust issue. Do you need more reasoning?
Didn't you just privately tell me it was 4 accounts? Maybe that was someone else hating on Windows 95. But you need an active reason not what they did 20 years ago.
The more stable/secure a monopoly is in its position the less incentive it has to deliver high quality services.
If a company can build a monopoly (or oligopoly) in multiple markets, it can then use these monopolies to build stability for them all. For example, Google uses ads on the Google Search homepage to build a browser near-monopoly and uses Chrome to push people to use Google Search homepage. Both markets have to be attacked simultaneously by competitors to have a fighting chance.
It's a funny coincidence - I pushed a commit adding a link to an image in the README.md, opened the repo page, clicked on said image, and got the unicorn page. The site did not load anymore after that.
Is anyone else having an issue where a workflow cannot be canceled? It keeps displaying "Failed to cancel workflow." when I try to cancel it, and the flow has been stuck in "In Progress" for the last 5 hours and counting.
It feels like GitHub's shift to these "AI writes code for you while you sleep!" features will appeal to a less technical crowd who lack awareness of the overall source code hosting and CI ecosystem. Combined with their operational incompetence of late (calling it how I see it), that will see their dominance as the default source code solution fade away among folks using it to maintain production software projects.
Hopefully the hobbyists are willing to shell out for tokens as much as they expect.
In the age of Claude Code et al, my honest biggest bottleneck is GH downtime.
I've got a dozen PRs I'm working on, but it's all frozen up, daily, with GH outages.
Are the other providers (GitLab, CircleCI, Harness) offering much better uptime?
Saying this as someone who's been GH-exclusive since 2010.
The biggest thing tying my team to GitHub right now is that we use Graphite to manage stacked diffs, and as far as I can tell, Graphite doesn't support anything but GitHub. What other tools are people using for stacked-diff workflows (especially code review)?
Gerrit is the other option I'm aware of but it seems like it might require significant work to administer.
List of company-friendly managed-host alternatives? SSO, auditing, user management, billing controls, etc?
I would love to pay Codeberg for managed hosting + support. GitLab is an ugly overcomplicated behemoth... Gitea offers "enterprise" plans but do they have all the needed corporate features? Bitbucket is a joke, never going back to that.
The saddest part to me is that their status update page and twitter are both out of date. I get a full 500 on github.com and yet all I see on their status page is an "incident with pull requests" and "copilot policy propagation delays."
It looks like one of my employees got her whole account deleted or banned without warning during this outage. Hopefully this is resolved as service returns.
On the plus side, it's git, so developers can at least get back to work without too much hassle as long as they don't need the CI/CD side of things immediately.
Anyone have alternatives to recommend? We will be switching after this. Already moved to self-hosted action runners and we are early-stage so switching cost is fairly low.
So what's the moneyline on all these outages being the result of vibe-coded LLM-as-software-engineer/LLM-as-platform-engineer executive cost cutting mandates?
Issues, CI, and downloads for built binaries aren't part of vanilla Git. CI in particular can be hard if you make a multi-platform project and don't want to have to buy a new mac every few years.
I made this joke 10 hours ago:
"I wonder if you opened https://github.com/claude in like 1000's of browsers / unique ips would it bring down github since it does seem to try until timeout"
the incident has now expanded to include webhooks, git operations, actions, general page load + API requests, issues, and pull requests. they're effectively down hard.
Hopefully it's down all day. We need more incidents like this to happen for people to get a glimpse of the future.
I think this is an indicator of a broader trend where tech companies put less value on quality and stability and more value on shipping new features. It’s basically the enshittification of tech
Github's two biggest selling points were its feature set (Pull Requests, Actions) and its reliability.
With the latter no longer a thing, and with so many other people building on Github's innovations, I'm starting to seriously consider alternatives. Not something I would have said in the past, but when Github's outages start to seriously affect my ability to do my own work, I can no longer justify continuing to use them.
Github needs to get its shit together. You can draw a pretty clear line between Microsoft deciding it was all in on AI and the decline in Github's service quality. So I would argue that for Github to gets its shit back together, it needs to ditch the AI and focus on high quality engineering.
GitHub is the new Internet Explorer 6. A Microsoft product so dominant in its category that it's going to hold everyone back for years to come.
Just when open source development has to deal with the biggest shift in years and maintainers need a tool that will help them fight the AI slop and maintain the software quality, GitHub not only can't keep up with the new requirements, they struggle to keep their product running reliably.
Paying customers will start moving off to GitLab and other alternatives, but GitHub is so dominant in open source that maintainers won't move anywhere, they'll just keep burning out more than before.
GitHub has a long history of being extremely unstable. They were down all the time, much like recently, several years ago. They seemed to stabilize quite a bit around the MS acquisition era, and now seem to be returning to their old instability patterns.
They should have just scaled a proper Rails monolith instead of this React, Java whatever mixed mess.
But hey probably Microslop is vibecoding everything to Rust now!
Can we please demand that Github provide mirror APIs to competitors? We're just asking for an extinction-level event. "Oops, our AI deleted the world's open source."
Any public source code hosting service should be able to subscribe to public repo changes. It belongs to the authors, not to Microsoft.
The history of tickets and PRs would be a major loss - but a beauty of git is that if at least one dev has the repo checked out then you can easily rehost the code history.
It would be nice to have some sort of widespread standard for doing issue tracking, reviews, and CI in the repo, synced with the repo to all its clones (and driven entirely by version-managed text files and scripts) rather than in external, centralized web tools.
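No such standard exists, but the shape usually sketched is plain committed files, so every clone carries the whole tracker (an entirely hypothetical layout):

    .issues/0042-flaky-trigger.md    # front matter: status, labels, assignee
    .reviews/0042/round-1.md         # review comments as committed text
    .ci/pipeline.yaml                # CI definitions already live in-repo today

Tools like git-bug experiment with storing issues as git objects, but nothing has become a cross-forge standard.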
It's really pathetic for however many trillions MSFT is valued.
If we had a government worth anything, they ought to pass a law that other competitors be provided mirror APIs, so that the entire world isn't shut off from source code for a day. We're just asking for a worldwide disaster.
I get the feeling that most of these GitHub downtimes are during US working hours, since I don't remember being impacted by them during work. I only noticed this one because I was looking up a repo in my free time.
Good thing we have LLM agents now. Before, this kind of behavior was tolerable. Now it's pretty easy to switch over to other providers. The threat of "but it will take them a lot of effort to switch to someone else" is getting smaller every day.
GitHub stability is worse than ever. Windows 11 and Office stability is worse than ever. Features that were useful for decades on computers with low resources are now "too hard" to implement.
That pink "Unicorn!" joke is something that should be reconsidered. When your services are down you're probably causing a lot of people a lot of stress ; I don't think it's the time to be cute and funny about it.
One of Reddit's cutesy error pages (presumably for Internal Server Errors or similar) is an illustration that says "You broke reddit". I know it's a joke, but I have wondered what effect that might have on a particularly anxiety-prone person who takes it literally and thinks they've done something that's taken the site down and inconvenienced millions of other people. It seems a bit dodgy for a mainstream site to assume all of its users have the dev knowledge to identify a joking accusation.
Even if it is their server name, I completely agree with your point. The image is not appropriate when your multi-billion revenue service is yet again failing to meet even a basic level of reliability, preventing people from doing their jobs and generally causing stress and bad feeling all round.
I am personally totally fine with it, but I see your point. GitHub is a bit too big to be breaking this often with a cutesy error message, even if it is a reference to their web server.
GitHub no longer publishes aggregate numbers so here they are parsed out. It looks like they are down to a single 9 at this point across all services:
https://mrshu.github.io/github-statuses/
Nice! I still remember the old GitHub status page which showed and reported on their uptime. No surprises they took it offline and replaced it with the current one when it started reporting the truth.
EDIT: You mention this with archive.org links! Love it! https://mrshu.github.io/github-statuses/#about
> It looks like they are down to a single 9 at this point across all services
That's not at all how you measure uptime. The per area measures are cool but the top bar measuring across all services is silly.
I'm unsure what they are targeting; it seems across the board it's mostly 99.5+ with the exception of Copilot. Just doing the math, three (independent, which I'm aware they aren't fully) 99.5% services bring you down to an overall "single 9" ~98.5% healthy status, but it's not meaningful to anyone.
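For the record, the arithmetic behind that estimate:

    0.995^3 \approx 0.9851    % independent outages: ~1.5% combined downtime
    1 - 3(0.005) = 0.985      % fully disjoint outages: the worst case
    0.995                     % fully overlapping outages: the best case

Which of these bounds applies is exactly the overlap question raised in the next comment.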
It depends whether the outages are overlapped or not. If the outages are not overlapped then that is indeed how you do it since some of your services being unavailable means your service is not fully available.
They are overlapped. You can hover over the bars and some bars have multiple issues.
I mean, there's a big difference between primary Git operations being down and Copilot being down. Any SLAs are probably per-service, not as a whole, and I highly doubt that someone just using a subset of services cares that one of the other services is down.
Copilot seems to be the worst offender, and 99% of people using Github likely couldn't care less.
It's interesting to see that copilot has the worst overall. I use copilot completions constantly and rarely notice issues with it. I suspect incidents aren't added until after they resolve.
Completions run using a much simpler model that Github hosts and runs themselves. I think most of the issues with Copilot that I've seen are upstream issues with one or two individual models (e.g. the Twitter API goes down so the Grok model is unavailable, etc)
Do I misunderstand, or does your page count today's downtime as minor? I would not count the web UI being mostly unusable as minor. Does this mean GitHub understates how bad incidents are? Or has your page just not yet been updated to include it?
Today's isn't accounted for on that page yet
Great project, thanks for building and sharing!
If you'd have asked me a few years ago if anything could be an existential threat to github's dominance in the tech community I'd have quickly said no.
If they don't get their ops house in order, this will go down as an all-time own goal in our industry.
Github lost at least one 9, if not two, since last year's "existential" migration to Azure.
I'm pretty sure they don't GAF about GH uptime as long as they can keep training models on it (0.5 /s), but Azure is revenue friction so might be a real problem.
Something this week about "oops we need a quality czar": https://news.ycombinator.com/item?id=46903802
> (0.5 /s),
Does this mean you are only half-sarcastic/half-joking? Or did I interpret that wrong?
I’m Planck’s constant serious about this
Yes that's it.
I'm sympathetic to ops issues, and particularly sympathetic to ops issues that are caused by brain-dead corporate mandates, but you don't get to be an infrastructure company and have this uptime record.
It's extra galling that they advertise all the new buzzword laden AI pipeline features while the regular website and actions fail constantly. Academically I know that it's not the same people building those as fixing bugs and running infra, but the leadership is just clearly failing to properly steer the ship here.
They didn't migrate yet.
Fucking REALLY?!
Migrations of Actions and Copilot to Azure completed in 2024.
Pages and Packages completed in 2025.
Core platform and databases began in October 2025 and are in progress, with traffic split between the legacy Github data center and Azure.
That's probably partly why things have got increasingly flaky - until they finish, there'll be constant background cognitive load and surface area for bugs from the fact that everything (especially the data) is half-migrated.
You'd think so, and we don't know about today's incident yet, but recent Github incidents have been attributed specifically to Azure, and Azure itself has had a lot of downtime recently that lasts for many hours.
True, the even simpler explanation is what they've migrated to is itself just unreliable
This has those Hotmail migration vibes of the early 2000s.
And yet, somehow my wife still has a hotmail.com address 25 years later.
Is there any reason why Github needs 99.99% uptime? You can continue working with your local repo.
Many teams work exclusively in GitHub (ticketing, boards, workflows, dev builds). People also have entire production build systems on GitHub. There's a lot more than git repo hosting.
It's especially painful for anyone who uses Github actions for CI/CD - maybe the release you just cut never actually got deployed to prod because their internal trigger didn't fire... you need to watch it like a hawk.
I waited 2.5 hours for a webhook from the registry_packages endpoint today.
I'm grateful it arrived, but two and a half hours feels less than ideal.
I'm a firm believer that almost nothing except public services needs that kind of uptime... We've introduced ridiculous amounts of complexity to our infra to achieve this and we've contributed to the increasing costs of both services and development itself (the barrier of entry for current juniors is insane compared to what I've had to deal with in my early 20s).
What do you mean by public services?
All kinds of companies lose millions of dollars of revenue per day if not hour if their sites are not stable.... apple, amazon, google, Shopify, uber, etc etc.
Those companies have decided the extra complexity is worth the reliability.
Even if you're operating a tech company that doesn't need to have that kind of uptime, your developers probably need those services to be productive, and you don't want them just sitting there either.
By public services I mean only important things like healthcare, law enforcement, fire department. Definitely not stores and food delivery. You can wait an hour or even a couple of hours for that.
> Those companies have decided the extra complexity is worth the reliability.
Companies always want more money and yes it makes sense economically. I'm not disagreeing with that. I'm just saying that nobody needs this. I grew up in a world where this wasn't a thing and no, life wasn't worse at all.
Eh, if I'm paying someone to host my git webui, and they are as shitty about it as github has been recently, I'd rather pay someone else to host it or go back to hosting it myself. It is not absolutely required, but it's a differentiating feature I'm happy to pay for
As an example, a Go build could fail anywhere if a dependency module from GitHub is not available.
Any module that is properly tagged and contains an OSS license gets stored in Google's module cache indefinitely. As long as it was go-get-ed once before, you can pull it again without going to GitHub (or any other VCS host).
Does go build not support mirrors so you can define a fallback repository? If not, why?
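It does, via the proxy list in GOPROXY. A comma falls through only on 404/410 responses, while `|` falls through on any error, including the proxy being unreachable - so an internal mirror (e.g. a self-hosted Athens; the hostname here is an assumption) can ride out a forge outage:

    # default: public module proxy first, then direct VCS access
    GOPROXY=https://proxy.golang.org,direct

    # '|' also falls through on network errors, not just 404/410
    GOPROXY=https://athens.internal|https://proxy.golang.org,direct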
Lots of teams embraced actions to run their CI/CD, and GitHub reviews as part of their merge process. And copilot. Basically their SOC2 (or whatever) says they have to use GitHub.
I’m guessing they’re regretting it.
> Basically their SOC2 (or whatever) says they have to use GitHub
Our SOC2 doesn't specify GitHub by name, but it does require we maintain a record of each PR having been reviewed.
I guess in extremis we could email each other patch diffs, and CC the guy responsible for the audit process with the approval...
Every product vendor, especially those within shouting distance of security, has a wet dream: to have their product explicitly named in corporate policies.
I have cleaned up more than enough of them.
The Linux kernel uses an email based workflow. You can digitally sign email and add it to an immutable store that can be reviewed.
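In commands, that kernel-style flow is roughly (addresses and tag name are placeholders):

    git format-patch -1 HEAD --to=reviewer@example.com
    git send-email 0001-*.patch        # reviewer replies with a Reviewed-by: line
    git tag -s audit/fix-123 -m "Reviewed-by: reviewer@example.com"

The signed tag (or signed mail) is the immutable, auditable record.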
Does SOC2 itself require that, or just yours? I'm not too familiar with SOC2, but I know ISO 27001 quite well, and there are no PR-specific "requirements" to speak of. But it is something that could be included in your secure development policy.
Yeah, it’s what you write in the policy.
And it's pretty common to write it into the policy, because it's pretty much a gimme, and it lets you avoid writing a whole bunch of other equivalent quality measures into the policy.
The money i pay them is the reason
What if you need to deploy to production urgently...
I think this is being downvoted unfairly. I mean, sure, as a company accepting payment for services, being down for a few hours every few months is notably bad by modern standards.
But the inward-looking point is correct: git itself is a distributed technology, and development using it is distributed and almost always latency-tolerant. To the extent that github's customers have processes that are dependent on services like bug tracking and reporting and CI to keep their teams productive, that's a bug with the customer's processes. It doesn't have to be that way and we as a community can recognize that even if the service provider kinda sucks.
There are still some processes that require a waterfall method for development, though. One example would be if you have a designer, and also have a front-end developer that is waiting for a feature to be complete to come in and start their development. I know on HN it's common for people to be full-stack developers, or for front-end developers to be able to work with a mockup and write the code before a designer gets involved, but there are plenty of companies that don't work that way. Even if a company is working in an agile manner, there still may come a time where work stalls until some part of a system is finished by another team/team-member, especially in a monorepo. Of course they could change the organization of their project, but the time suck of doing that (like going with microservices) is probably going to waste quite a bit more time than how often GitHub is down.
> There are still some processes that require a waterfall method for development, though
Not on the 2-4 hour latency scale of a GitHub outage though. I mean, sure, if you have a process that requires the engineering talent to work completely independently on day-plus timescales and/or do all their coordination offline, then you're going to have a ton of trouble staffing[1] that team.
But if your folks can't handle talking with the designers over chat or whatnot to backfill the loss of the issue tracker for an afternoon, then that's on you.
[1] It can obviously be done! But it's isomorphic to "put together a Linux-style development culture", very non-trivial.
Being snapshot-based. Git has some issues being distributed in practice: patch order matters, and since the hash is used in so many contexts, you basically need some centralized authoritative server to resolve the order of patches for meaningful use in most cases with more than 2 folks.
That's... literally what a merge collision is. The tooling for that predates git by decades. The solutions are all varying levels of non-trivial and involve tradeoffs, but none of them require 24/7 cloud service availability.
Are you kidding? I need my code to pass CI, and get reviewed, so I can move on, otherwise the PRs just keep piling. You might as well say the lights could go out, you can do paperwork.
> otherwise the PRs just keep piling
Good news! You can't create new PRs right now anyway, so they won't pile.
When in doubt - schedule a meeting about how you're unable to do work to keep doing work!
Yeah, I'm literally looking at GitLab's "Migrate from GitHub" page on their docs site right now. If there's a way to import issues and projects I could be sold.
If you're considering moving away from github due to problems with reliability/outages, then any migration to gitlab will not make you happy.
> If there's a way to import issues and projects I could be sold.
That is what that feature does. It imports issues and code and more (not sure about "projects", don't use that feature on Github).
Maybe it'd be reasonable to script it using the glab and gh CLIs? I've never tried anything like that, but I regularly use the glab CLI and it's pretty comprehensive.
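A hedged sketch of what that script might look like - the flags are from memory, and this loses authors, labels, and timestamps:

    gh issue list --repo oldorg/repo --state all --limit 500 \
        --json title,body | jq -c '.[]' | while read -r issue; do
        glab issue create --repo newgroup/repo \
            --title "$(jq -r .title <<<"$issue")" \
            --description "$(jq -r .body <<<"$issue")"
    done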
No need – it imports pretty much anything you can reliably import from GitHub, including issues and PRs (with comments): https://docs.gitlab.com/user/project/import/github/#imported...
It's not so much an ops issue as an architecture and code quality issue. If you have ever dug into the GitHub Enterprise self-hosted product, you get an idea of the mess.
This is obviously empty speculation, but I wonder if the mindless rush to AI has anything to do with the increase in outages we've seen recently.
Or maybe the mindless rush to host it in azure?
Or both!
It does. I work at Amazon and I can see the increase in outages or major issues since AI has been pushed.
This is Microsoft. They forced a move to Azure, and then prioritized AI workloads higher. I'm sure the training read workloads on GH are nontrivial.
They literally have the golden goose, the training stream of all software development, dependencies, trending tool usage.
In an age of model providers trying to train their models and keep them current, the value of GitHub should easily be in the high tens of billions or more. The CEO of Microsoft should be directly involved at this point; their franchise is at risk on multiple fronts now. Windows 11 is extremely bad. GitHub is going to lose its foundational role in modern development shortly, and early indications are that they hitched their wagon to the wrong foundational model provider.
I viscerally dislike GitHub so much at this point. I don't know how they come back from this. Major opportunity here for a competitor to come around with AI-native features like context versioning.
Of course they're down while I'm trying to address a "High severity" security bug in Caddy but all I'm getting is a unicorn when loading the report.
(Actually there are 3 I'm currently working on, but 2 are patched already; still closing the feedback loop though.)
I have a 2-hour window right now that is toddler free. I'm worried that the outage will delay the feedback loop with the reporter(s) into tomorrow and ultimately delay the patches.
I can't complain though -- GitHub sustains most of my livelihood so I can provide for my family through its Sponsors program, and I'm not a paying customer. (And yet, paying would not prevent the outage.) Overall I'm very grateful for GitHub.
Which security bug(s) are you referring to?
Presumably bugs that may still be under embargo
have you considered moving or having at least an alternative? asking as someone using caddy for personal hosting who likes to have their website secure. :)
We can of course host our code elsewhere, the problem is the community is kind of locked-in. It would be very "expensive" to move, and would have to be very worthwhile. So far the math doesn't support that kind of change.
Usually an outage is not a big deal, I can still work locally. Today I just happen to be in a very GH-centric workflow with the security reports and such.
I'm curious how other maintainers maintain productivity during GH outages.
Yep, I get you about the community.
As an alternative, I was thinking mainly of a secondary repo and CI in case GitHub stops being reliable - not only given the current instability, but as a provider overall. I'm from the EU, and I recently catch myself evaluating every US company I interact with; I'm starting to realize that mine might not be the only risk vector to consider. Wondering how other people think about it.
> have you considered moving or having at least an alternative
Not who you're responding to, but my 2 cents: for a popular open-source project reliant on community contributions there is really no alternative. It's similar to social media - we all know it's trash and noxious, but if you're any kind of public figure you have to be there.
Several quite big projects have moved to Codeberg. I have no idea how it has worked out for them.
I would have said Codeberg’s reliability was a problem for them but… gestures vaguely at the submission
Zig has been doing fine since switching to Codeberg
LOL Codeberg's 'Explore' link is 503 for me!
N.I.N.A. (Nighttime Imaging 'N' Astronomy) is on bitbucket and it seems to be doing really well.
Edit: Nevermind, looks like they migrated to github since the last time I contributed
I get that, but if we all rely on the defaults, there couldn't be any alternatives.
You are talking to the maintainer of caddy :)
Edit- oh you probably meant an alternative to GitHub perhaps..
no worries, misunderstandings happen.
You can literally watch GitHub explode bit by bit. Take a look at the GitHub Status History; it's hilarious: https://www.githubstatus.com/history.
14 incidents in February! It's February 9th! Glad to see the latest great savior phase of the AI industrial complex [1] is going just as well as all the others!
[1] https://www.theverge.com/tech/865689/microsoft-claude-code-a...
An interesting thing I notice now is that people don't like companies that only post about outages when half the world has them ... and they also don't like companies that post about "minor issues", e.g.:
> During this time, workflows experienced an average delay of 49 seconds, and 4.7% of workflow runs failed to start within 5 minutes.
That's for sure not perfect, but it also means there was a 95% chance that if you re-ran the job, it would run and not fail to start. Another one is about notifications being late. I'm sure all the others have similar issues that people notice, but nobody writes about them. So a simple "too many incidents" count does not make the stats bad - only an unstable service does.
Dude, that's just a reason to scream at the clouds... Literally...
At this point they are probably going to crash their status system. "No one ever expected more than 50 incidents in a month!"
You know what I think would reverse the trend? More vibe coding!
I know you are joking, but I'm sure that there is at least one director or VP inside GitHub pushing a new salvation project that must use AI to solve all the problems, when actually the most likely reason is that engineers are drowning in tech debt.
> I'm sure that there is at least one director or VP inside GitHub pushing a new salvation project that must use AI to solve all the problems
GitHub is under Microsoft’s CoreAI division, so that’s a pretty sure bet.
https://www.geekwire.com/2025/github-will-join-microsofts-co...
Upper management in Microsoft has been bragging about their high percentage of AI generated code lately - and in the meantime we've had several disastrous Windows 11 updates with the potential to brick your machine and a slew of outages at github. I'm sure it might be something else but it's clear part of their current technical approach is utterly broken.
CoPilot has done more for Linux than anyone expected. I switched. I'm switching my elderly parents away next before they fall victim.
Says everything... https://www.youtube.com/shorts/Dj_f2ANBfas
Utterly broken - perhaps, but apparently that's not exclusive with being highly profitable, so why should they care?
When I first typed up my comment I said "their current business approach" and then corrected it to technical since - yea, in the short term it probably isn't hurting their pocket books too much. The issue is that it seems like a lot more folks are seriously considering switching off Windows - we'll see if this actually is the year of the linux desktop (it never seems to be in the end) but it certainly seems to be souring their brand reputation in a major way.
For the time being. Does anyone want Windows 11 for real?
The inertia is not permanent.
Cause it's finally the year of Linux on desktop.
It’s not a joke. This is funny because it is true.
Better to replace management by AI.
Computers can produce spreadsheets even better and they can warm the air around you even faster.
I mean, the strengths of LLMs were always a much better match for the management than for technical work:
* writing endless reports and executive summaries
* pretending to know things that they don't
* not complaining if you present their ideas as yours
* sycophancy and fawning behavior towards superiors
Plus they don't take stock options!
Honestly, AI management would probably be better. "You're a competent manager; you're not allowed to break or circumvent workers' rights laws, you must comply with our CSR and HR policies, provide realistic estimates, and deliver stable and reliable products to our customers." Then just watch half the tech sector break down due to a lack of resources, or watch as profit is just cut in half.
All the cool kids move fast and break things. Why not the same for core infrastructure providers? Let's replace our engineers with markdown files named after them.
This kind of thing never happened before LLMs!
No, the reason it's happening is because they must be vibe coding! :P
That's not good enough. You need SKILLS!
I'm happy that they're being transparent about it. There's no good way to take downtime, but at least they don't try to cover it up. We can adjust and they'll make it better. I'm sure a retro is on its way; it's been quite the bumpy month.
I think this will continue to happen until they finish migrating to Azure
The main root cause of the incident on their actions was actually due to Azure: https://www.githubstatus.com/incidents/xwn6hjps36ty points to https://azure.status.microsoft/en-us/status/history/?trackin...
Haven't they been shown the front door?
wut
https://learn.microsoft.com/en-us/azure/frontdoor/front-door...
Probably referring to the fact that they no longer are independent, do not have a CEO and are a division of a division within Microsoft.
I was sort of hoping this would be a year-to-date visualization similar to Github profile contribution graphs...
Someone should make a timeline chart from that, lol.
https://updog.ai/status/github
Here it is. It looks like they are down to a single 9 at this point across all services:
https://mrshu.github.io/github-statuses/
Can you add a line graph with incidents per month? Would be useful to see if the number of incidents are going up or down over time.
I threw together <https://mkantor.github.io/github-incident-timeline/>. It's by day rather than month, and only shows the last 50 incidents since that's all their API returns.
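If anyone just wants rough per-month counts, githubstatus.com is a standard Statuspage instance, so a sketch like this works against its public JSON (same caveat: it only returns recent incidents):

    # tally GitHub status incidents per month (YYYY-MM) from the public API
    curl -s https://www.githubstatus.com/api/v2/incidents.json \
      | jq -r '.incidents[].created_at[:7]' \
      | sort | uniq -c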
Haha, that would be awesome!
Light work for an LLM
But not Copilot.
Copilot is shown as having policy issues in the latest reports. Oh my, the irony. Satya is like "look ma, our stock is dropping...", Gee I wonder why Mr!!
GitHub has had customer visible incidents large enough to warrant status page updates almost every day this year (https://www.githubstatus.com/history).
This should not be normal for any service, even at GitHub's size. There's a joke that your workday usually stops around 4pm, because that's when GitHub Actions goes down every day.
I wish someone inside the house cared to comment on why the services barely stay up and what actions they're planning to take to fix this issue that's been going on for years, but has definitely accelerated in the past year or so.
It's 100% because the number of operations happening on Github has likely 100x'd since the introduction of coding agents. They built Github for one kind of scale, and the problem is that they've all of a sudden found themselves with a new kind of scale.
That doesn't normally happen to platforms of this size.
A major platform lift and shift does not help. They are always incredibly difficult.
There are probably tons of baked in URLs or platform assumptions that are very easy to break during their core migration to Azure.
> A major platform lift and shift does not help.
ISTR that the lift-n-shift started like ... 3 years ago? That much of it was already shifted to Azure ... 2 years ago?
The only thing that changed in the last 1 year (if my above two assertions are correct (which they may not be)) is a much-publicised switch to AI-assisted coding.
Status page currently says the only issue is notification delays, but I have been getting a lot of Unicorn pages while trying to access PRs.
Edit: Looks like they've got a status page up now for PRs, separate from the earlier notifications one: https://www.githubstatus.com/incidents/smf24rvl67v9
Edit: Now acknowledging issues across GitHub as a whole, not just PRs.
They added the following entry:
Investigating - We are investigating reports of impacted performance for some GitHub services. Feb 09, 2026 - 15:54 UTC
But I saw it appear just a few minutes ago, it wasn't there at 16:10 UTC.
And just now:
Investigating - We are investigating reports of degraded performance for Pull Requests Feb 09, 2026 - 16:19 UTC
Yeah I've been seeing a lot of 500 errors myself, latency seems to have spiked too: https://github.onlineornot.com/
I cannot approve PRs because the JSON API is returning HTML error pages. Something is really hosed over there.
Yep, trying to access commit details is just returning the unicorn page for me
git operations are down too.
We've migrated to Forgejo over the last couple of weeks. We position ourselves[0] as an alternative to the big cloud providers, so it seemed very silly that a critical piece of our own infrastructure could be taken out by a GitHub or Azure outage.
It has been a pretty smooth process. Although we have done a couple of pieces of custom development:
1) We've created a Firecracker-based runner, which will run CI jobs in Firecracker VMs. This brings the Forgejo Actions running experience much more closely into line with GitHub's environment (VM, rather than container). We hope to contribute this back shortly, but also drop me a message if this is of interest.
2) We're working up a proposal[1] to add environments and variable groups to Forgejo Actions. This is something we expect to need for some upcoming compliance requirements.
I really like Forgejo as a project, and I've found the community to be very welcoming. I'm really hoping to see it grow and flourish :D
[0]: https://lithus.eu, adam@
[1]: https://codeberg.org/forgejo/discussions/issues/440
PS. We are also looking at offering this as a managed service to our clients.
Why .eu if you're in London? Where are your servers located and who hosts them?
Screw GitHub, seriously. This unreliability is not acceptable. If I’m in a position where I can influence what code forge we use in future I will do everything in my power to steer away from GitHub.
Forge feature parity is easy to find. But GH has that discoverability feature and the social cues from stars/forks.
One solution I see is (e.g.) an internal forge (GitLab/Gitea/etc) that's then mirrored to GH for those secondary features.
Which is funny. If GH was better we'd just buy their better plan. But as it stands we buy from elsewhere and just use GH free plans.
Every company I've worked at in the last 10 years used GH for internal codebase hosting, PRs, and sometimes CI. Discoverability doesn't really come into the picture for those users, and you can still fork things from GitHub even if you don't host your core code infra on it.
Stars are just noise. All they tell you is how online the demographics of that ecosystem are.
Mirroring is probably the way forward.
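A sketch of the one-way mirror, with made-up URLs and the internal forge as the source of truth:

    # one-off: make a bare mirror clone of the internal repo
    git clone --mirror https://git.internal.example/org/repo.git
    cd repo.git
    git remote add github git@github.com:org/repo.git

    # periodic sync (cron works fine): refresh all refs, then mirror-push
    git fetch --prune origin
    git push --mirror github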
3 outages in 3 months straight according to their own status history. https://www.githubstatus.com/history
I wonder who left the team recently. Must be someone who carried a lot of shadow knowledge. Or maybe they sent devops/dev work to another continent.
They're in the process of moving from "legacy" infra to Azure, so there's a ton of churn happening behind the scenes. That's probably why things keep exploding.
I don't know jack about shit here, but genuinely: why migrate a live production system piecewise? Wouldn't it be far more sane to start building a shadow copy on Azure and let that blow up in isolation while real users keep using the real service on """legacy""" systems that still work?
Because it's significantly harder to isolate problems, and you'll end up in this loop:
* Deploy everything
* It explodes
* Rollback everything
* Spend two weeks finding the problem in one system and then fix it
* Deploy everything
* It explodes
* Rollback everything
* Spend two weeks finding a new problem that was created while you were fixing the last problem
* Repeat ad nauseam
Migrating iteratively gives you a foundation to build upon with each component
So… create your shadow system piecewise? There is no reason to have "explode production" in your workflow, unless you are truly starved for resources.
Does this shadow system have usage?
Does it handle queries, trigger CI actions, run jobs?
If you test it, yes.
Of course, you need some way of producing test loads similar to those found in production. One way would be to take a snapshot of production, tap incoming requests for a few weeks, log everything, then replay it at "as fast as we can" speed for testing; another way would be to just mirror production live, running the same operations in test as run in production.
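For the tap-and-replay part, one hedged option is GoReplay (a sketch, assuming plain HTTP on a known port; flags as in its docs):

    # on a production host: record incoming HTTP traffic to a file
    gor --input-raw :8080 --output-file requests.gor

    # later, against the shadow stack: replay the capture at 2x speed
    gor --input-file "requests.gor|200%" --output-http "https://shadow.example.com"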
Alternatively, you could take the "chaos monkey" approach (https://www.folklore.org/Monkey_Lives.html), do away with all notions of realism, and just fuzz the heck out of your test system. I'd go with that, first, because it's easy, and tends to catch the more obvious bugs.
So just double your cloud bill for several weeks, costing a site like GitHub millions of dollars?
How do you handle duplicate requests to external services? Are you going to run credit cards twice? Send emails twice? If not, how do you know it's working with fidelity?
> several weeks
*many months
Why would you avoid a perfect opportunity to test a bunch of stuff on your customers?
If you make it work, migrating piecewise should be less change/risk at each junction than a big jump between here and there of everything at once.
But you need to have pieces that are independent enough to run some here and some there, and ideally pieces that can fail without taking down the whole system.
That's a safer approach but will cause teams to need to test in two infrastructures (old world and new) until the entire new environment is ready for prime time. They're hopefully moving fast and definitely breaking things.
A few reasons:
1. Stateful systems (databases, message brokers) are hard to switch back-and-forth; you often want to migrate each one as few times as possible.
2. If something goes sideways -- especially performance-wise -- it can be hard to tell the reason if everything changed.
3. It takes a long time (months/years) to complete the migration. By doing it incrementally, you can reap the advantages of the new infra, and avoid maintaining two things.
---
All that said, GitHub is doing something wrong.
It took me a second to realize this wasn't sarcasm.
Are they just going to tough it out through the process and whatever...
I think it's more likely the introduction of the ability to say "fix this for me" to your LLM + "lgtm" PR reviews. That or MS doing their usual thing to acquired products.
rumors I've heard were that GitHub is mostly run by contractors? That might explain the chaos more than simple vibe coding (which probably aggravates this)
nah, they're just showing us how to vibecode your way to success
If the $$$ they save > the $$$ they lose, then yeah, it is a success. Business only cares about $$$.
Definitely. The devil is in the details though since it's so damn hard to quantify the $$$ lost when you have a large opinionated customer base that holds tremendous grudges. Doubly so when it's a subscription service with effectively unlimited lifetime for happy accounts.
Business by spreadsheet is super hard for this reason - if you try to charge the maximum you can before people get angry and leave then you're a tiny outage/issue/controversy/breach from tipping over the wrong side of that line.
Yeah, but who cares about the long term? In the long term we are all dead. A CEO only needs to look good for 5-10 years max, pump up the stock price, get applause everywhere, and be called the smartest guy in the world.
I think the last major outage wasn't even two weeks ago. We've got about another 2 weeks to finish our MVP and get it launched and... this really isn't helpful. I'm getting pretty fed up with the unreliability.
Sure it is not vibe coding related
Looks like AI replacement of engineering force in action.
You're absolutely right! Sorry I deleted your database.
I can help you restore from backups if you will tell me where you backed it up.
You did back it up, right? Right before you ran me with `--allow-dangerously-skip-permissions` and gave me full access to your databases and S3 buckets?
You're right! Let's just quickly promote your only read replica to the new primar---oops!
I was laughing really hard until I remembered it happened to me a few months ago and I wasn't having fun at that time.
Good news: I optimized your infrastructure costs to zero. Bad news: I did it by deleting everything. You're welcome.
> I can help you restore from backups if you will tell me where you backed it up.
"Whoops, now that one is nuked too. You have any more backups I can practice my shell commands on?"
I'm very sorry I deleted your `backups` bucket, despite being specifically instructed not to touch the `backups` bucket.
Github is moving to Microsoft Azure which is causing all of this downtime AFAIK
That's cover. They've been doing that since microsoft bought them
Yeah but that's exactly the issue - that whole time dev time will have been getting chewed up on the migration when it could have been spent elsewhere
More like the Tay.ai and Zoe.ai AIs are still arguing amongst themselves, unable to keep the service online for Microsoft after they replaced their human counterparts.
They're overwhelmed with all the vibecoded apps people are pushing after watching the Super Bowl.
Their network stack is run by OpenAI and is now advertising cool new ways for us to stay connected in a fun way with Mobile Co (TM).
What are good alternatives to GitHub for private repos + actions? I'm considering moving my company off of it because of reliability issues.
It probably depends on your scale, but I'd suggest self-hosting a Forgejo instance, if it's within your domain expertise to run a service like that. It's not hard to operate, it will be blazing fast, it provides most of the same capabilities, and you'll be in complete control over the costs and reliability.
A few people have replied to you mentioning Codeberg, but that service is intended for open source projects, not private commercial work.
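For a small team a single container goes a long way; a minimal sketch, assuming the image path is current and with an illustrative version tag:

    # data persists in /srv/forgejo on the host; git-over-ssh remapped to 2222
    docker run -d --name forgejo \
      -p 3000:3000 -p 2222:22 \
      -v /srv/forgejo:/data \
      codeberg.org/forgejo/forgejo:12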
This. I have been using Codeberg and self-hosting Forgejo runners and I'm happy. For personal projects though, I don't know for a company.
Also very happy with SourceHut, though it is quite different (Forgejo looks like a clone of GitHub, really). The SourceHut CI is really cool, too.
If you want to go really minimal you can do raw git+ssh and hooks (pre/post commit, etc).
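A minimal sketch of that, with made-up host and paths:

    # on any box you can ssh into: create a bare repo, point a remote at it
    ssh git@build.example.com 'git init --bare /srv/git/project.git'
    git remote add fallback git@build.example.com:/srv/git/project.git
    git push fallback --all

    # /srv/git/project.git/hooks/post-receive (chmod +x) runs on every push:
    #!/bin/sh
    git --work-tree=/srv/deploy/project --git-dir=/srv/git/project.git checkout -f main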
I would imagine that's what everyone is doing instead of sitting on their hands. Set up a different remote and have your team push/pull to/from it until GitHub comes back up. I mean, you could probably use ngrok and set up a remote on your laptop in a pinch. You shouldn't be totally blocked except for things like automated deployments or builds tied specifically to github.com
Distributed source control is distributable.
It's also fun when a Jr. on the team distributes the .env file via Git...
Couldn't you avoid that with .gitignore and pre-commit hooks? A determined Jr. can still mess it up, but you can minimize the risk.
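Something like this minimal hook catches the common case (standard hook path; `git commit --no-verify` still bypasses it, hence "determined"):

    #!/bin/sh
    # .git/hooks/pre-commit (chmod +x): refuse to commit any .env file
    if git diff --cached --name-only | grep -qE '(^|/)\.env$'; then
      echo "refusing to commit a .env file; add it to .gitignore instead" >&2
      exit 1
    fi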
We self-host Gitlab at work and it's amazing. CI/CD is great and it has never once gone down.
I've been using https://radicle.xyz/ + https://radicle-ci.liw.fi/ (in combination with my own ci adapter for nix flakes) for about half a year now for (almost) all my public and private repos and so far I really like it.
+1, I like the idea of a peer-distributed code forge. I've been using it as well.
> What are good alternatives to GitHub for private repos + actions? I'm considering moving my company off of it because of reliablity issues.
Dunno about actions[1], but I've been using a $5/m DO droplet for the last 5 years for my private repo. If it ever runs out of disk space, an additional 100GB of mounted storage is an extra $10/m
I've put something on it (Gitea, I think) that has the web interface for submitting PRs, reviewing them, merging them, etc.
I don't think there is any extra value in paying more to a git hosting SaaS for a single user, than I pay for a DO droplet for (at peak) 20 users.
----------------------
[1] Tried using Jenkins, but alas, a $5/m DO droplet is insufficient to run Jenkins. I mashed up shell scripts + Makefiles in a loop, with a `sleep 60` between iterations.
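The loop in question is roughly this shape (paths and branch are illustrative):

    #!/bin/sh
    # poor man's CI: poll once a minute, run tests when the branch moves
    while true; do
      git -C /srv/ci/repo fetch origin
      local_rev=$(git -C /srv/ci/repo rev-parse HEAD)
      remote_rev=$(git -C /srv/ci/repo rev-parse origin/main)
      if [ "$local_rev" != "$remote_rev" ]; then
        git -C /srv/ci/repo merge --ff-only origin/main && make -C /srv/ci/repo test
      fi
      sleep 60
    done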
Gitlab.com. CI is super nice and easily self hostable.
And their status history isn't much better. It's just that they are so much smaller it's not Big News.
For me it is their history of high-impact easily avoidable security bugs. I have no idea why "send a reset password link to an address from an unauthenticated source" was possible at all.
I heard that it's hard to maintain self-hosted Gitlab instances
Nah at a small scale it's totally fine, and IME pretty pain-free after you've got it running. The biggest pain points are A) It's slow, B) between auth, storage, and CI runners, you have a lot of unavoidable configuration to do, and C) it has a lot of different features so the docs are MASSIVE.
Not really. About average in terms of infrastructure maintenance. We've been running our org's instance for 5 years or so, half that time with Premium and half with just the open source version, running on Kubernetes... ran it in AWS at first, then migrated to our own infrastructure.
I type docker pull like once a month and that's it.
Uhm no? We have been self-hosting Gitlab for 6 years now with monthly updates and almost zero issues, just apt update && apt upgrade.
I left for codeberg.org and my own ci runner with woodpecker. Soooo much faster than github
Codeberg is close to what i need
At my last job I ran a GitLab instance on a tiny AWS server and ran workers on old desktop PCs in the corner of the office.
It's pretty nice if you don't mind it being some of the heaviest software you've ever seen.
I also tried gitea, but uninstalled it when I encountered nonsense restrictions with the rationale "that's how GitHub does it". It was okay, pretty lightweight, but locking out features purely because "that's what GitHub does" was just utterly unacceptable to me.
One thing that always bothered me about gitea is they wouldn't even dog food for a long time. GitLab has been developing on GitLab since forever, basically.
Gitlab.com is the obvious rec.
Gitea is great.
SourceHut.
gitea
Don't listen to the clueless suggesting Gitlab. It's forgejo (not gitea) or tangled, that's it.
> clueless suggesting Gitlab
ad hominem isn't a very convincing argument, and as someone who also enjoys Forgejo it doesn't make me feel good to see it used as the justification for another recommendation.
Can you offer some explanation as to why Forgejo and Tangled over Gitlab or Gitea?
I personally use Gitea, so I'd appreciate some additional information.
From [1] "Forgejo was created in October 2022 after a for profit company took over the Gitea project."
Forgejo became a hard fork in 2024, with both projects diverging. If you're using it for local hosting I don't personally see much of a difference between them, although that may change as the two projects evolve.
[1] https://forgejo.org/compare-to-gitea/
I'm not OP, but: Forgejo is much more lightweight than GitLab for my use case, and it was cited as a more maintained version of Gitea. But that's just anecdote from my brain and I don't have sources, so take that with a truckload of salt.
I'd had a gitea instance before and it was appealing insofar as having the ability to mirror from or to a public repo, it had docker container registry capability, it ties into oauth, etc; I'm sure gitlab has much/all of that too, but forgejo's tiny, tiny footprint was very appealing for my resource-constrained selfhosted environment.
GitLab is slow as fuck and the UI is cluttered with corporate nonsense.
Seems Microsoft is going downhill after all, even in AI.
It's already there - most CS students have second-hand experience with MS products.
It's all been downhill at Microsoft since windows 3.1
I'm fine with that!
pretty clear that companies like microsoft are actually terrible at engineering, their core products were built 30 years ago. any changes now are generally extremely incremental and quickly rolled back with issue. trying to innovate at github shows just how bad they are.
It's not just MSFT, it's all of big tech. They basically run as a cartel, destroy competition through illegal means, engage in regulatory capture, and ensure their fiefdoms reign supreme.
All the more reason why they should be sliced and diced into oblivion.
yeah i have worked at a few FAANG, honestly stunning how entrenched and bad some of the products are. internally, they are completely incapable of making any meaningful product changes, the whole thing will break
to be fair, git is one of the most easily replaced pieces of tech.
just add a new git remote and push. less so for issues and pulls, but at least your dev team/CI doesn't end up blocked.
It's a general curse of anything that becomes successful at a BigCorp.
The engineers who build the early versions were folks at the top of their field, and compensated accordingly. Those folks have long since moved on, and the whole thing is maintained by a mix of newcomers and whichever old hands didn't manage to promote out, while the PMs shuffle the UX to justify everyones salary...
I'm not even sure I'd say they were "top"; I'd more just say it's a different type of engineer, one that either doesn't get promoted to a big-impact role at a place like Microsoft, or leaves on their own.
Sorry, my fault. I tried to download a couple of CppCon presentations from their stash. Should have known better than to touch anything C++. ducks
There are new slides? Here goes the rest of my work day.
Seems like MS Copilot is vibe-ing it again! Some other major cloud provider outages come to mind that never happened before the "vibe" era.
Well, it's a day that ends in Y.
Github is down so often now, especially Actions, that I am not sure how so many companies are still relying on them.
Migration costs are a thing
So are the costs of downtime.
Is it really that much better than alternatives to justify these constant outages?
We're starting to have that convo in our org. This is just getting worse and worse for Github.
Hosting .git is not that complicated of a problem in isolation.
No, but it has momentum left over from when it was much better. The Microsoft downslide will continue until there's no one left.
Not any longer. It used to but the outages have become very common. I am thinking about moving all my personal stuff to Codeberg.
Yes, for personal projects I just self-host an instance of forgejo with dokploy. Everything else I deploy on codeberg, which is also an instance of forgejo.
I've been using Bitbucket for years with no issues.
The great advantage of Bitbucket is that it's so painfully slow you can't tell if it's down or not.
I love its UI (apart from its slowness, of course). I find it much cleaner than Gitlab's.
You can self-host GitHub enterprise.
Ooh - got a source?
https://docs.github.com/en/enterprise-server@3.14/admin/over...
self-host your own services. There are a lot of alternatives to GitHub.
It always has been to just self host. Predicted GitHub's outage streak as far back as half a decade ago [0].
"A better way is to self host". [0]
[0] https://news.ycombinator.com/item?id=22867803
GitHub is slowly turning into the Deutsche Bahn of git providers.
I wonder what the value is of having a dedicated X (formerly Twitter) status account post-2023, when people without an account will see a mix of entries from 2018, 2024, and 2020 in no particular order upon opening it. Is it just there so everyone can quickly share their post announcing they're back?
Just remove all that copilot nonsense and focus on uptime... I would like to push some code.
Take it away from Microsoft. Not sure how this isn't an antitrust issue anyway.
At its core antitrust cases are about monopolies and how companies use anti-competitive conduct to maintain their monopoly.
Github isn't the only source control software in the market. Unless they're doing something obvious and nefarious, its doubtful the justice department will step in when you can simply choose one of many others like Bitbucket, Sourcetree, Gitlab, SVN, CVS, Fossil, DARCS, or Bazaar.
There's just too much competition in the market right now for the govt to do anything.
Minimal changes have occurred to the concept of “antitrust” since its inception as a form of societal justice against corporations, at least per my understanding.
I doubt policymakers in the early 1900s could have predicted the impact of technology and globalization on the corporate landscape, especially vis a vis “vertical integration”.
Personally, I think vertical integration is a pretty big blind spot in laws and policies that are meant to ensure that consumers are not negatively impacted by anticompetitive corporate practices. Sure, "competition" may exist, but the market activity often shifts meaningfully in a direction that is harmful to consumers once the biggest players swallow another piece of the supply chain (or product concept), and not just their competitors.
There was a change in the enforcement of antitrust law in the 1970s. Consumer welfare, which came to mean lower prices, is the standard. Effectively, normal competition is fine, and it takes egregious behavior to be a violation. It even assumes that big companies are more efficient, which makes up for the lack of competition.
The other change is reluctance to break up companies. AT&T break up was big deal. Microsoft survived being broken up in its antitrust trial. Tech companies can only be broken up vertically, but maybe the forced competition would be enough.
Picking something other than Github may also have the positive effect that you're less of a target for drive by AI patches.
Can they use Github to their advantage to maintain a monopoly if they are nefarious? Think about it.
Unfortunately the question is "have they", not "can they".
> you can simply choose one of many others
Not really. It's a network effect, like Facebook. Value scales quadratically with the number of users, because nobody wants to "have to check two apps".
We should buy out monopolies like the Chinese government does. If you corner the market, then you get a little payout and a "You beat capitalism! Play again?" prize. Other companies can still compete but the customers will get a nice state-funded high-quality option forever.
Forever, for sure, definitely. State sponsored projects are never subject to the whims of uninformed outsiders.
> Not sure how this isn't an antitrust issue anyway.
Simple: the US stopped caring about antitrust decades ago.
It's not an antitrust issue because antitrust laws aren't enforced in the U.S.
That's on every individual that decided to "give it" to Microsoft. Git was made precisely to make this problem go away.
Git is like 10% of building software.
If GitHub is doing 90% more than Git does, "GitHub" is a terrible name for it.
Not sure how having downtime is an anti-competition issue. I'm also not sure how you think you can take things away from people? Do you think someone just gave them GitHub and then take it away? Who are you expecting to take it away? Also, does your system have 100% uptime?
Companies used to be forced to sell parts of their business when antitrust was involved. The issue isn't the downtime, they should never have been allowed to own this in the first place.
There was just a recent case with Google to decide if they would have to sell Chrome. Of course the Judge ruled no. Nowadays you can have a monopoly in 20 adjacent industries and the courts will say it's fine.
You've been banging on about this for a while, I think this is my third time responding to one of your accounts. There is no antitrust issue, how are they messing with other competitors? You never back up your reasoning. How many accounts do you have active since I bet all the downvotes are from you?
I've had two accounts. I changed because I don't like the history (maybe one other person has the same opinion I did?). Anyways it's pretty obvious why this is an issue. Microsoft has a historical issue with being brutal to competition. There is no oversight as to what they do with the private data on GitHub. It's absolutely an antitrust issue. Do you need more reasoning?
Didn't you just privately tell me it was 4 accounts? Maybe that was someone else hating on Windows 95. But you need an active reason not what they did 20 years ago.
Nope. If someone did that it should be reported if it's against the rules here.
Do you also post "Take it away from $OWNER" every time your open source software breaks?
If he posted every time GitHub broke, he would certainly have posted a bunch of times.
What antitrust issue does my open source software have?
What does antitrust have to do with the GitHub services downtime?
The more stable/secure a monopoly is in its position the less incentive it has to deliver high quality services.
If a company can build a monopoly (or oligopoly) in multiple markets, it can then use these monopolies to build stability for them all. For example, Google uses ads on the Google Search homepage to build a browser near-monopoly and uses Chrome to push people to use Google Search homepage. Both markets have to be attacked simultaneously by competitors to have a fighting chance.
It regularly breaks the workflow for thousands of FLOSS projects.
It's a funny coincidence - I pushed a commit adding a link to an image in the README.md, opened the repo page, clicked on the said image, and got the unicorn page. The site did not load anymore after that.
Does anyone have an issue where a workflow cannot be canceled? It keeps displaying "Failed to cancel workflow." when I try to cancel it, and the run has been stuck in status "In Progress" for the last 5 hours and still counting.
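When the UI refuses, the CLI sometimes still gets through; a sketch with a placeholder run ID (the REST API has a force-cancel endpoint for runs stuck in a non-cancelable state):

    # try the ordinary cancel first
    gh run cancel 1234567890 --repo OWNER/REPO

    # if that keeps failing, hit the force-cancel endpoint directly
    gh api --method POST repos/OWNER/REPO/actions/runs/1234567890/force-cancel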
It feels like GitHub's shift to these "AI writes code for you while you sleep!" features will appeal to a less technical crowd who lack awareness of the overall source code hosting and CI ecosystem. Combined with their operational incompetence of late (calling it how I see it), that will see their dominance as the default source code solution for folks maintaining production software projects fade away.
Hopefully the hobbyists are willing to shell out for tokens as much as they expect.
In the age of Claude Code et al, my honest biggest bottleneck is GH downtime. I've got a dozen PRs I'm working on, but it's all frozen up, daily, with GH outages.
Are the other providers (GitLab, CircleCI, Harness) offering much better uptime? Saying this as someone that's been GH-exclusive since 2010.
When I was a summer intern 10 years ago, I remember there always being, without fail, a day when GitHub was down, every summer. Good times.
I am able to access github.com at 140.82.112.3 no problem
I am able to access api.github.com at 20.205.243.168 no problem
No problem with githubusercontent.com either
to be fair, I think usage has increased a lot because of coding agents, and some things that worked well until now can't scale to the next 10x level.
Maybe they need to sort things out for people who pay through the nose for it cause I ain't comforted by vibe coders slowing us down.
The biggest thing tying my team to GitHub right now is that we use Graphite to manage stacked diffs, and as far as I can tell, Graphite doesn't support anything but GitHub. What other tools are people using for stacked-diff workflows (especially code review)?
Gerrit is the other option I'm aware of but it seems like it might require significant work to administer.
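In a pinch, recent plain git covers the mechanical half of stacking (needs Git 2.38+; branch names made up):

    # stack: main <- part-1 <- part-2 <- part-3
    git checkout part-3
    git rebase main --update-refs    # also moves part-1 and part-2 along
    git push --force-with-lease origin part-1 part-2 part-3

What it doesn't give you is the per-branch review UX, which is the part Graphite and Gerrit actually add.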
I use git town. Fits my brain a lot better.
I wonder if GH charges for the runners during their downtime. Last week a lot of them would retry multiple times and then fail.
List of company-friendly managed-host alternatives? SSO, auditing, user management, billing controls, etc?
I would love to pay Codeberg for managed hosting + support. GitLab is an ugly overcomplicated behemoth... Gitea offers "enterprise" plans but do they have all the needed corporate features? Bitbucket is a joke, never going back to that.
Oh! It's not my GitLab@Hetzner that's not working, it's GitHub. Just when I decided to opensource my project.
Well done for self-hosting.
https://www.githubstatus.com/
i was right ... https://medium.com/@patrick.szymkowiak/github-is-falling-apa...
I'm always fascinated by these growth charts. Isn't everyone who needs GitHub already on GitHub? Are people migrating from GitLab? I don't get it!
Azure Screen of Death?
Kids don't even know this. Lucky them.
They will soon given MS direction
Fortunately, git is quite resilient and you can work offline and even do pull requests with your peers without GitHub.
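A sketch of the no-forge flow, with illustrative names and URL (`git bundle` and `git request-pull` are both stock git):

    # package your branch as a single file and send it to a peer however you like
    git bundle create feature.bundle main..feature

    # peer side: fetch straight from the bundle into a review branch
    git fetch /path/to/feature.bundle feature:review/feature

    # or generate a classic pull-request summary against a plain git URL
    git request-pull main ssh://peer.example.com/~/project.git feature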
The saddest part to me is that their status update page and twitter are both out of date. I get a full 500 on github.com and yet all I see on their status page is an "incident with pull requests" and "copilot policy propagation delays."
https://www.githubstatus.com/
I don't know if it's related, but for the past week I've been getting pages cut off at some point, as if something closed the connection mid-transfer.
Today, when I was trying to see the contribution timeline of one project, it didn't render.
Meanwhile, Codeberg and Worktree are both online and humming along.
Codeberg gets hit by a fair few attacks every year, but they're doing pretty well, given their resources.
I am _really_ enjoying Worktree so far.
For anyone else having trouble finding Worktree's site because you keep getting "how to use git-worktree" results, it's https://worktree.ca/
Sorry! I should have included a link, since it's relatively unknown.
Yeap, getting this for the last 20 minutes. Everything green on their status pages.
It looks like one of my employees got her whole account deleted or banned without warning during this outage. Hopefully this is resolved as service returns.
On the plus side, it's git, so developers can at least get back to work without too much hassle as long as they don't need the CI/CD side of things immediately.
Anyone have alternatives to recommend? We will be switching after this. Already moved to self-hosted action runners and we are early-stage so switching cost is fairly low.
Codeberg, if your product/project is open source, otherwise try out Tangled.org and Radicle!!
Radicle is the most exciting out of these, imo!
Microslop strikes again!
So what's the moneyline on all these outages being the result of vibe-coded LLM-as-software-engineer/LLM-as-platform-engineer executive cost cutting mandates?
So, what're people's alt stack for replacing GitHub?
We're mirroring to Gitea + Jenkins.
It's definitely some extra devops time, but claude code makes it easy to get over the config hurdles.
Codeberg, Tangled, Radicle!
Wait a minute, isn't Git supposed to be... distributed?
Yeah, but things with "Hub" in their name don't tend to be very distributed
Thanks for underscoring the beautiful oxymoron.
Issues, CI, and downloads for built binaries aren't part of vanilla Git. CI in particular can be hard if you make a multi-platform project and don't want to have to buy a new mac every few years.
Probably Worth taking an honest look at whether your CI could just be an SQS queue and a Mac mini running under your desk though
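A hedged sketch of that shape, with a made-up queue URL and assuming the message body is just a commit SHA (bash/zsh for the <<< herestring):

    # runs on the Mac mini: long-poll SQS, build whatever commit arrives
    QUEUE=https://sqs.us-east-1.amazonaws.com/123456789012/ci-jobs
    while true; do
      msg=$(aws sqs receive-message --queue-url "$QUEUE" \
        --wait-time-seconds 20 --max-number-of-messages 1)
      [ -z "$msg" ] && continue
      rev=$(jq -r '.Messages[0].Body' <<<"$msg")
      handle=$(jq -r '.Messages[0].ReceiptHandle' <<<"$msg")
      git -C ~/ci/repo fetch origin && \
        git -C ~/ci/repo checkout "$rev" && \
        make -C ~/ci/repo test
      aws sqs delete-message --queue-url "$QUEUE" --receipt-handle "$handle"
    done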
For my OSS work that is about $699 over my budget
Yeah, fair enough (though you can often pick up an M1 Mini for <$300 these days)
Has anyone noticed that in the past year we have seen a LOT of outages?
Yes. Feels like every other week.
That goes against all the gushing posts about how AI is great. I use all the frontier models and sure, they're a bit helpful.
But I don't understand if they're that good why are we getting an outage every other week? AWS had an outage unsolved for about 9+ hrs!
I made this joke 10 hours ago: "I wonder if you opened https://github.com/claude in like 1000's of browsers / unique ips would it bring down github since it does seem to try until timeout"
Coincidence? I think not!
the incident has now expanded to include webhooks, git operations, actions, general page load + API requests, issues, and pull requests. they're effectively down hard.
hopefully it's down all day. we need more incidents like this to happen for people to get a glimpse of the future.
And hey, it's about the best excuse for not getting work done I can think of
I guess Bill Gates has a virus.
Maybe we should post when it's up
Azure infra rock solid as always.
Damn, I was also trying to push and deploy a critical bug fix that was needed within minutes.
Well unfortunately, you have to wait for GitHub to get back online to push that critical bug fix. If that were me, I would find that unacceptable.
Self hosting would be a better alternative, as I said 5 years ago. [0]
[0] https://news.ycombinator.com/item?id=22867803
I wonder if the incident root cause analysis will point to vibe coding?
I think this is an indicator of a broader trend where tech companies put less value on quality and stability and more value on shipping new features. It’s basically the enshittification of tech
We replaced everything except the git part because of reliability issues. Pages… gone. Actions… gone. KB… gone. Tickets… gone.
Maybe they need to get more humans involved because GitHub is down at least once a week for a while now.
Do they publish proper post-mortems? I feel like that's gotta be the bare minimum nowadays for such critical digital infrastructure.
The new-fangled copilot/agentic stuff I do read about on HN is meaningless to me if the core competency is lost here.
I was wondering why my AUR packages won’t update, just my luck.
If you were looking for a signal to leave github, then this is it.
They put too much AI in it and not enough engineering rigor
I look forward to the day that jjhub becomes available...
sorry all, i took a month off and then opened github.com
> Monday
Beyond a meme at this point
Is it just me, or are critical services like GitHub, AWS, Google, etc., down more often than they used to be these days?
Fix this or I will send my droid army. #greenpurplelifesmatter #Imcocoforcocoapuffs #ihatejedi
1 engineer, 1 month, 1 million lines of code.
Github's two biggest selling points were its feature set (Pull Requests, Actions) and its reliability.
With the latter no longer a thing, and with so many other people building on Github's innovations, I'm starting to seriously consider alternatives. Not something I would have said in the past, but when Github's outages start to seriously affect my ability to do my own work, I can no longer justify continuing to use them.
Github needs to get its shit together. You can draw a pretty clear line between Microsoft deciding it was all in on AI and the decline in Github's service quality. So I would argue that for Github to get its shit back together, it needs to ditch the AI and focus on high quality engineering.
Monoliths looking like a good idea now?
it's Monday therefore Github is down.
GitHub is the new Internet Explorer 6. A Microsoft product so dominant in its category that it's going to hold everyone back for years to come.
Just when open source development has to deal with the biggest shift in years and maintainers need a tool that will help them fight the AI slop and maintain the software quality, GitHub not only can't keep up with the new requirements, they struggle to keep their product running reliably.
Paying customers will start moving off to GitLab and other alternatives, but GitHub is so dominant in open source that maintainers won't move anywhere, they'll just keep burning out more than before.
Copilot, what have you done again?
Churlish of me to say, but wasn't GH so much more reliable before the acquisition by Microsoft?
GitHub has a long history of being extremely unstable. They were down all the time, much like recently, several years ago. They seemed to stabilize quite a bit around the MS acquisition era, and now seem to be returning to their old instability patterns.
They should have just scaled a proper Rails monolith instead of this React, Java whatever mixed mess. But hey probably Microslop is vibecoding everything to Rust now!
Team is doing resume driven development
3 incidents in feb already lmao
presumably slophub's now dogfooding GitHub Agentic Workflows?
One reason for the reduction in overall uptime could be that with time they add more and more services that can go down and affect the stats.
Just saying.
Now it seems Actions has broken - https://www.githubstatus.com/incidents/lcw3tg2f6zsd
Can we please demand that Github provide mirror APIs to competitors? We're just asking for an extinction-level event. "Oops, our AI deleted the world's open source."
Any public source code hosting service should be able to subscribe to public repo changes. It belongs to the authors, not to Microsoft.
The history of tickets and PRs would be a major loss - but a beauty of git is that if at least one dev has the repo checked out then you can easily rehost the code history.
It would be nice to have some sort of widespread standard for doing issue tracking, reviews, and CI in the repo, synced with the repo to all its clones (and fully from version-managed text-files and scripts) rather than in external, centralized, web tools.
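Tools like git-bug aim at exactly that; even without one, an ad-hoc files-in-repo convention gets you surprisingly far (the layout here is just one made-up option):

    # issues as version-controlled text files that travel with every clone
    mkdir -p .issues
    cat > .issues/0001-flaky-macos-ci.md <<'EOF'
    status: open
    title: CI is flaky on macOS runners
    EOF
    git add .issues
    git commit -m "issues: open 0001 (flaky macOS CI)"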
Every repo usually has at least one local copy somewhere; worst case, a few old repos disappear.
Making it even easier to snipe accidentally committed credentials?
No, we can't. Hence Git. Use it the right way, or prepare for the fallout. Anyone looking for a good way to prepare for that, I suggest Git.
It's really pathetic for however many trillions MSFT is valued.
If we had a government worth anything, they ought to pass a law that other competitors be provided mirror APIs so that the entire world isn't shut off from source code for a day. We're just asking for a world wide disaster.
when I'm on W2 this is good, but when I'm contracting this is bad
vibe coding too much?
I bet Microsoft did this...
Related incidents:
Incident with Pull Requests https://www.githubstatus.com/incidents/smf24rvl67v9
Copilot Policy Propagation Delays https://www.githubstatus.com/incidents/t5qmhtg29933
Incident with Actions https://www.githubstatus.com/incidents/tkz0ptx49rl0
Degraded performance for Copilot Coding Agent https://www.githubstatus.com/incidents/qrlc0jjgw517
Degraded Performance in Webhooks API and UI, Pull Requests https://www.githubstatus.com/incidents/ffz2k716tlhx
In addition:
Notifications are delayed https://www.githubstatus.com/incidents/54hndjxft5bx
Incident with Issues, Actions and Git Operations https://www.githubstatus.com/incidents/lcw3tg2f6zsd
https://en.wiktionary.org/wiki/Microsuck
I get the feeling that most of these GitHub downtimes are during US working hours, since I don't remember being impacted by them during work. Only noticed it now as I was looking up a repo in my free time.
Good thing we have LLM agents now. Before this kind of behavior was tolerable. Now it's pretty easy to switch over to using other providers. The threat of "but it will take them a lot of effort to switch to someone else" is getting less and less every day.
Are we sure LLM agents aren't the cause of these increasing outages?
tangled is up B]
fix it or I will send robot to your house blud #greenpurplelifesmade #Imcocoforcocoapuffs
MS is now all in on agentic coding.
Github stability worse than ever. Windows 11 and Office stability worse than ever. Features that were useful for decades on computers with low resources are now "too hard" to implement.
Coincidence?
migrating to azure kills businesses
Welcome to Microsoft Github
And now actions are down... great. https://www.githubstatus.com/incidents/lcw3tg2f6zsd
Thank god it's only Actions, Copilot, Issues, Git, Pages, Packages and Pull Requests....
Now Github pages are down
GitHub downtime is going from once a month (unacceptable) to twice a month (what the fuck?)
The next name after Cloudflare
That pink "Unicorn!" joke is something that should be reconsidered. When your services are down you're probably causing a lot of people a lot of stress ; I don't think it's the time to be cute and funny about it.
EDIT: my bad, seems to be their server's name.
I don't know if it's meant to be a joke, per se. They use (or used) the Unicorn server once upon a time:
https://github.blog/news-insights/unicorn/
https://news.ycombinator.com/item?id=4957986
I don't think it's a joke, it's the server that github runs on
https://en.wikipedia.org/wiki/Unicorn_(web_server)
One of Reddit's cutesy error pages (presumably for Internal Server Error or similar) is an illustration that says "You broke reddit". I know it's a joke, but I have wondered what effect that might have on a particularly anxiety-prone person who takes it literally and thinks they've done something that's taken the site down and inconvenienced millions of other people. Seems a bit dodgy for a mainstream site to assume all of its users have the dev knowledge to identify a joking accusation.
Even if it is their server name, I completely agree with your point. The image is not appropriate when your multi-billion revenue service is yet again failing to meet even a basic level of reliability, preventing people from doing their jobs and generally causing stress and bad feeling all round.
I am personally totally fine with it but I see your point. GitHub is a bit too big to be breaking this often with a cutesy error message, even if it is a reference to their web server.
That stupid "Aww, Snap!" message I think it's one of the browsers does.