Show HN: Red Squares – GitHub outages as contributions

(red-squares.cian.lol)

333 points | by cianmm 3 hours ago

68 comments

u_fucking_dork 15 minutes ago

Every time one of these vibe-coded meme sites gets posted, there are endless comments about how it’s not actually because of load: the GitHub team is shit, their tech stack is shit, Microsoft is shit, Azure is shit, etc.
Just compare the GitHub status page for public GitHub vs the enterprise cloud pages.
Enterprise has much better numbers, and I personally can’t remember the last time there was an outage that prevented me from doing work.
If the problems didn’t revolve around load, I’d expect to see the same uptime problems reflected on the enterprise offering.
collinmanderson 3 minutes ago

> Disruption with Gemini 2.5 Pro model
> Disruption with Grok Code Fast 1 in Copilot
> Incident with Copilot Grok Code Fast 1
> Claude Opus 4 is experiencing degraded performance
It doesn't seem fair to blame GitHub for this? There's nothing they can do about it?
__natty__ 26 minutes ago

The contrast between the official [0] and third-party [1] status pages is huge. How are their SLA terms of service legal if they are so different from real-world usage of their product? I really like GitHub and their services, but every time it’s broken and their status page is green, something screams inside me.
[0] https://www.githubstatus.com/ [1] https://mrshu.github.io/github-statuses/

- philipwhiuk 9 minutes ago
  
  > How are their SLA terms of service legal if they are so different from real-world usage of their product?
  Because the SLA likely doesn't cover some GitHub features, whereas an outage or issue with a single model counts as a problem on the third-party page.
sd9 2 hours ago

Weekends are the untapped frontier. Still room to scale.

- skor an hour ago
  
  change is the biggest cause then?
  
  - sifex an hour ago
    
    Or usage
- kshahkshah 41 minutes ago
  
  Wait until they go 996!
jve an hour ago

I have to question whether this graph is even accurate.
> Across 170 days with at least one incident · worst day Thu, Nov 20, 2025 (1.1 days)
1.1 days total: how is that possible? Hovering over that day doesn't show the math behind the scenes, just a single 1.3-hour bullet point.
Also, Nov 19 has a bullet point for a 1.3-day outage, but its total is 8.1 hours.

- hxtk an hour ago
  
  The missing status page [1] treats any time that any component of the system is down as downtime. It calculates overall uptime from the time that doesn't overlap with any individual category outage, and overall downtime as any time overlapping with at least one individual category outage, to avoid double-counting. They show 24h of minor outage on that date.
  I'm guessing that this site is taking the downtime in a given day across all services and adding it up, which would mean the worst possible day has 10 days of downtime (a day of downtime for each major category).
  1: https://mrshu.github.io/github-statuses/
- thenewnewguy an hour ago
  
  I see a bullet point for "1.0 days of 1.3 days", and when I mouse over the previous day (Wednesday 2025-11-19), I see "7.8 hours of 1.3 days".
  I haven't actually checked any sources to confirm there really was downtime on those days, but if we assume those numbers are true, 7.8 hours + 1 day is about 1.3 days.
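
The two counting methods described in the replies above can be compared with a short sketch. It is only an illustration under stated assumptions: the incident intervals are made up (a full-day minor incident plus a 1.3-hour one), and the function names `summed_downtime` and `merged_downtime` are hypothetical, not taken from either site's code.

```python
from datetime import datetime, timedelta


def summed_downtime(intervals):
    """Add up every incident's duration, counting overlapping time more than once."""
    return sum((end - start for start, end in intervals), timedelta())


def merged_downtime(intervals):
    """Measure the time covered by at least one incident (overlaps counted once)."""
    total = timedelta()
    span_start = span_end = None
    for start, end in sorted(intervals):
        if span_end is None or start > span_end:
            # Gap before this incident: close the previous merged span.
            if span_end is not None:
                total += span_end - span_start
            span_start, span_end = start, end
        else:
            # Overlaps the current span: extend it if needed.
            span_end = max(span_end, end)
    if span_end is not None:
        total += span_end - span_start
    return total


# Hypothetical incidents on one calendar day (not real GitHub data).
day = datetime(2025, 11, 20)
incidents = [
    (day, day + timedelta(hours=24)),  # minor incident lasting the whole day
    (day + timedelta(hours=9), day + timedelta(hours=10, minutes=18)),  # 1.3 h incident
]

print(summed_downtime(incidents))  # 1 day, 1:18:00
print(merged_downtime(incidents))  # 1 day, 0:00:00
```

With these made-up inputs the summed figure comes to roughly 1.05 days, while the merged figure can never exceed 24 hours for a single day. That is one way a per-day total above one day could arise, though whether the site actually computes it this way is a guess.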
dvh 34 minutes ago

I didn't know Azure was this bad. This completely changed my opinion of their cloud offerings.

- vaylian 8 minutes ago
  
  I know that there was a plan to move GitHub to Azure, but I don't know what the status is.
  It could very well be that GitHub is not running on Azure yet.
- dijit 11 minutes ago
  
  Really? A 10-minute interaction with the platform was enough to inform me that no serious engineer is in charge, and no serious engineer chooses this platform.
  It is a platform for CFOs to avoid having another vendor relationship.
anant-singhal 2 minutes ago

Interesting, no outages on weekends
keyle an hour ago

This is one of the most creative ideas I've seen this year. Tasteful and clever. Bravo!
figmert 2 hours ago

Far fewer outages during the weekends. Perfect, wasn't gonna do any work then anyway.
elAhmo 2 hours ago

Funny to see this closely match contribution graphs with effectively no downtime on weekends.
tealpod 17 minutes ago

If anyone wants an aggregated status page for GitHub, cloud, and AI services:
https://status-page.org/

- freakynit 14 minutes ago
  
  You could let people organize/filter them into groups based on the stack they use, and provide email/Discord/Slack notifications if any of the services in their groups change status.
debarshri an hour ago

Would be funny if you hosted it on GitHub Pages.
revolution88 an hour ago

For 30th of April, 2026 it shows it was down 1.0 days of 2.6 days (minor incident) :)
jpb0104 an hour ago

Set up my self-hosted Forgejo last night. Very pleased so far.

- hosteur an hour ago
  
  Yeah, me too. I moved all my public projects to Codeberg and my internal repos to self-hosted Forgejo.
  Hosting Forgejo is really easy as well. It being a single binary makes it really easy to handle with almost zero maintenance.
danfritz 2 hours ago

I wonder how well this correlates with Azure incidents. Especially for the US regions.

- ngruhn 2 hours ago
  
  I live in Europe. I've not noticed these constant outages. But I only use GitHub after work.
  
  - progbits a minute ago
    
    Interesting. I'm in EU and see these constantly but usually in the afternoon so it bothers me less as I'm already wrapping up, but my US coworkers are getting hit much worse.
- p2detar 2 hours ago
  
  I also bet my money on Azure. Someone who allegedly worked there recently posted an article here on the numerous problems with Azure. Sadly I didn’t bookmark it.
  
  - hosteur an hour ago
    
    The article you are thinking of was likely written by Axel Rietschin, who worked on the Azure core compute team.
    https://isolveproblems.substack.com/p/how-microsoft-vaporize...
    HN thread: https://news.ycombinator.com/item?id=47616242
    
    - chrisweekly 38 minutes ago
      
      Wow. Yikes. I never liked Azure, but this level of dysfunction is just astonishing.
korrectional 2 hours ago

I don't really understand why this is happening at this scale, it's not like they just became broke and can't afford a proper server... can someone explain?

- fareesh an hour ago
  
  Agents are shipping code faster all over the world and in some cases 24 hours a day. Additionally, a significant number of non-developers are now developers, i.e. they are also shipping to GitHub regularly.
  This is not limited to pushing code: all the bells and whistles that GitHub added as features, under the assumption of some predictable growth, are now seeing usage that exceeds the original plans.
  I suspect a lot of their existing systems have to be re-architected for unanticipated scale, and it won't happen overnight for sure.
  
  - prepend an hour ago
    
    They were sucking 5 years ago before agents existed. I don’t think this has anything to do with recent changes.
    https://damrnelson.github.io/github-historical-uptime/
    
    - Octoth0rpe an hour ago
      
      Pretty damning. It would also be interesting to see the number of commits overlaid. The graph tells a great story about the correlation with MS's takeover, but I wonder if, at the same time that uptime went to shit, MS was shifting large numbers of enterprise contracts over to GitHub. That would be a more complete story IMO.
      None of which excuses this. Can you imagine someone's reaction in 2017 if you told them that GitHub would be below 90% uptime in 2026? It would be unimaginable.
    - sarchertech 37 minutes ago
      
      That’s nonsense. GitHub didn’t have 100% uptime before 2020. I remember downtime back then. And Microsoft didn’t make changes that fast. The only thing that changed is the accuracy of their status page.
      Also, go back and look at the unofficial status page from 3 years ago. It’s regularly above 99% and has been dropping steadily since then. Then in the last 3 months it has dropped to below 85%.
    - p-e-w an hour ago
      
      Whoa, if that is even remotely accurate then the talk about agents is a complete red herring.
      
      - theolivenbaum an hour ago
        
        If I remember correctly, the status page was not precise before the acquisition, so take the 100% pre-acquisition values with a big grain of salt.
- baq an hour ago
  
  They’re on track to 30x volume yoy by their own words
- philipwhiuk 6 minutes ago
  
  Most of the outages are actually the unavailability of single AI models, not the core service.
- plufz 2 hours ago
  
  See previous days' articles. Agentic coding. Going from 1b annual commits to an estimated 14b or more from one year to the next.
- embedding-shape 2 hours ago
  
  The faster you move, the more you screw up, and almost no company producing software has figured out how to move fast and not screw up. It's so hard that companies even used to boast about how little they cared about screwing up, as long as they moved fast.
  Add in new "productivity" tools that help you move even faster, with even less regard for how much you screw up (even though the tools could be used to move at the same speed with fewer screw-ups), and an engineering culture that boils down to "Why not?", and you get platforms run by Microsoft that are unable to achieve two nines of reliability.
- prepend an hour ago
  
  I suspect it’s because Microsoft is using buggy Microsoft tech instead of the original stack.
  They’re making political decisions based on what they sell vs what’s actually useful for their use case.
  It’s kind of impossible to find out if this is true though.
  
  - u_fucking_dork 24 minutes ago
    
    That doesn’t track, because GitHub Enterprise Cloud has great uptime. This is all load-based: vibe-coded AI slop shipped in record numbers by users who will never convert to paid. The real question is what are they doing about that?
- dicksent an hour ago
  
  ai
bharxhav 2 hours ago

Would be interesting to see if this correlates with their release cycles.

- rufo 14 minutes ago
  
  At least as of when I left the company, GitHub was being deployed to fairly close to once every 60-90 minutes (the frequency of a deploy train/merge queue batch going out) 24 hours a day, at least during weekdays… there are a fair number of international engineers and deploy trains get crowded during main US business hours, so while there are fewer PRs going out at odd hours US time, there were typically still some. There aren’t dedicated releases as such for GitHub-hosted instances - everything you release needs to be gated behind a feature flag or other mechanism if it’s not going live immediately, and your code either needs to handle the database in both its pre- and post-migrated state, or you need to run the migration in advance of your code shipping out.
  Fun fact: it used to be the case that GitHub was actually _less_ reliable if nobody deployed to it… there used to be various resource leaks that we didn’t see when people were deploying all day, since the app was then getting restarted constantly. After GitHub went down during a holiday break, we had volunteers deploy GitHub once a day during holiday breaks until the underlying issues were eventually fixed.
- hosteur 2 hours ago
  
  Well, outages seem to be distributed across all days except weekends. So this seems like people fucking around with stuff being a major factor.
  
  - samlinnfer 2 hours ago
    
    Surely it just means more people working, resulting in more load, resulting in more outages?
    
    - pwagland an hour ago
      
      Or even both. In any kind of continuous deployment, you'd expect outages at the point of deployment, or shortly thereafter as the unintended consequences ripple.
      Then the load during the working days makes those ripples larger and into outages.
  - embedding-shape an hour ago
    
    Most outages are caused by changes made by humans ("actors"?). Very rarely is it "People just dig our stuff so much we can't keep up"; more often it's "We didn't think about this performance drawback when we built thing X, now it's hurting us", and of course there are more outages when you try to fix those issues without fully considering the scope and impact.
nautilus12 18 minutes ago

Guess where AI Coding entered the picture
pards 2 hours ago

This design is perfect irony. I love it.
FaithMB 43 minutes ago

I like this more than I expected. The intensity gradient is a nice touch too.
lnenad 2 hours ago

The memes are really painful now. I feel for the team that's trying to survive underwater.

- renegade-otter an hour ago
  
  With management screaming down their necks:
  YOU NEED TO USE MOAR AI!
- nojonestownpls 4 minutes ago
  
  $250k can do a good job of easing meme-induced pain.
  "Survive underwater", what a joke. Yes, there will be good engineers there who will be sad to see it go this way, but they choose to be there and get paid better than 99% of humanity for it.
Gigacore an hour ago

It is funny how weekends are almost always up!
faangguyindia an hour ago

All these companies brag about being hyperscalers and cannot scale GitHub.
Similarly, I see Google releasing advancement after advancement in LLMs, yet I see the Antigravity sub where people are crying all the time.
cyanydeez 2 hours ago

Double entendre: is it load-based or GitHub-employee-based that weekends are sparser?
Or just a combination of both.

- globular-toast 2 hours ago
  
  Didn't they blame "AI" for the increased load? I'm not sure why AI usage would be more during the week than the weekend, but it could be.
  It does look like Friday outages were a bit rarer, which could be due to having a "no deployments on Friday" rule.
  
  - mirekrusin 2 hours ago
    
    From the chart, it seems they should have a policy to deploy on weekends only.
- Shoetp 2 hours ago
  
  Yes
airstrike 2 hours ago

Can you correlate this with data on the number of commits, actions, etc.?
WesSouza an hour ago

Well done.
ramon156 2 hours ago

Please tell me this makes sense
This website has no overused ai-generated animations and... I quite enjoy it. The original website[1] has a fade-in animation, big round cards, shadows, all the jazz you can think of, it's there.
This site is very readable, very honest and sober. I don't need to sift through buzzwords to figure out tiny details.
Thank you, OP!
1: https://mrshu.github.io/github-statuses/
philprx 2 hours ago

"Good job, Microsoft, amazing uptime."
rvz an hour ago

Another reminder that a self-hosted git repository would have more uptime than GitHub, and that centralizing everything on GitHub was a very bad idea. [0]
[0] https://news.ycombinator.com/item?id=22867803
Fokamul 2 hours ago

Clearly their team needs more LLM usage.