Jepsen: MariaDB Galera Cluster 12.1.2

(jepsen.io)

117 points | by aphyr 3 days ago

18 comments

fluxcorethread 2 days ago

I don't understand why, if you are creating a distributed DB, you wouldn't at least try using e.g. aphyr's Jepsen library (1).
The story seems to repeat itself for distributed databases: the documentation reads more like an advertisement. It promises a lot but contains multiple errors, and there are failures that can corrupt the data. It's great that Jepsen is doing the work they do!
1. https://github.com/jepsen-io/jepsen

- aphyr 2 days ago
  
  I was kind of surprised by this one--I know the MariaDB folks and have worked with some of them before. They made significant changes to fix the Repeatable Read issues we found in the last report, so I know the team cares about safety.
  There wasn't much reaction on the mailing list to the lost-write problem back in January, or to the Jira tickets. I actually tried calling MariaDB on the phone to see if they'd like to talk about it, but no dice. I assume they're probably busy with other projects at the moment (hi, it's me too) and haven't had a chance to switch gears.
- jwr 2 days ago
  
  It also surprises me. Every company that creates a distributed database should pay for Jepsen testing. First, it is a great chance to improve their software, and second, if there are problems, they will eventually come to light anyway.
  
  - Sesse__ 2 hours ago
    
    > if there are problems, they will eventually come to light anyway
    Not necessarily; before Kyle started this one-man crusade against data loss, database vendors would claim more or less whatever they wanted, and it would go unchallenged for decades. (You might get the occasional bug report, which you could handwave away as “hardware” or “you're holding it wrong”, or just ignore.) Now you're _slightly_ less likely to get away with it, and only for as long as e.g. your product is sufficiently uninteresting, or hard enough to set up, that he doesn't test it. :-)
gebalamariusz 2 days ago

It's the "healthy cluster" aspect that makes this scary. Errors under partition are expected; that's what Jepsen is testing. However, stale reads during normal operation mean that most Galera deployments behind a round-robin load balancer can silently hit this problem. The classic scenario: we create a user on node A, the next request goes to node B, and the user doesn't exist there yet. The solution is wsrep_sync_wait or pinning reads to the writer node, but most setups use neither because they assume a healthy cluster means consistent reads.
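
To make the scenario concrete, here is a toy Python model of it. This is only a sketch, not Galera's implementation: the `Node` and `Cluster` classes and the `sync_wait` flag are made up for illustration, and "sync wait" is modeled simply as applying every pending replicated write before the read, which is roughly the effect wsrep_sync_wait is meant to have on reads.

```python
class Node:
    """One cluster member with its own copy of the data (hypothetical model)."""

    def __init__(self):
        self.rows = {}     # data visible to reads on this node
        self.pending = []  # replicated writes received but not yet applied

    def apply_pending(self):
        # Apply queued writes in the order they were received.
        for key, value in self.pending:
            self.rows[key] = value
        self.pending.clear()


class Cluster:
    """Toy multi-writer cluster where remote writes are applied lazily."""

    def __init__(self, size):
        self.nodes = [Node() for _ in range(size)]

    def write(self, node_idx, key, value):
        origin = self.nodes[node_idx]
        origin.apply_pending()     # catch up before committing locally
        origin.rows[key] = value   # visible on the origin right away
        for i, node in enumerate(self.nodes):
            if i != node_idx:
                # Other nodes have the write queued but not yet applied.
                node.pending.append((key, value))

    def read(self, node_idx, key, sync_wait=False):
        node = self.nodes[node_idx]
        if sync_wait:
            # Assumption: a causality check is modeled as "apply everything
            # already received, then read".
            node.apply_pending()
        return node.rows.get(key)


cluster = Cluster(2)
cluster.write(0, "alice", "created")               # create user on node A
stale = cluster.read(1, "alice")                   # next request hits node B
synced = cluster.read(1, "alice", sync_wait=True)  # same read with the check
print(stale, synced)  # prints: None created
```

Pinning reads to the writer corresponds to always calling `read` with the index of the node that took the write, which returns the row without needing the extra wait.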

- Diggsey 2 days ago
  
  Yes this stood out to me as well...
taneliv 3 days ago

While Jepsen (and this article) is focused on behavior under node failure and network partitions, this caught my eye:
> It also exhibits Stale Read, Lost Update, and other forms of G-single in healthy clusters
This looks like quite a fundamental issue.
mono442 2 days ago

I would kinda expect that. MySQL wasn't designed to be a distributed database from the beginning, and it's usually hard to retrofit that later on.
linsomniac 3 days ago

I really like Galera for low-volume clustering, because of its true multi-master nature. I've been using it for over a decade on a clustered mail server for storing account information, and more recently I've pumped the log information in there so each user can see their related log messages, for a user base of around 6,000 users, and it's been a real workhorse.

- ffsm8 2 days ago
  
  Uh, that scale doesn't even need clustering beyond high availability.
  And as Jepsen showed, if you actually do increase volume, it loses consistency... Invalidating the use case for multi master entirely. So, ymmv I guess
  
  - linsomniac 2 days ago
    
    Correct, this is entirely about high availability.
  - nchmy 2 days ago
    
    What options are there in this situation for HA without Galera?
    
    - ffsm8 2 days ago
      
      Honestly, I haven't paid attention to it since I put my last devops role behind me sometime around 2015.
      Back then one of my jobs was doing HA via Percona MySQL (and later MariaDB) master/slave. I expected that to still exist today, but maybe it's no longer an option? I'm not up to date.
linsomniac 3 days ago

I realize that we like to use the page title here on HN, but this really should be something like "Data loss cases with MariaDB Galera Cluster 12.1.2".

- deepsun a day ago
  
  No, anyone who has ever read a Jepsen article immediately understands what it's about, and in what format. No need to edit the original title here.
- hu3 3 days ago
  
  One of the reasons we don't is that this kind of title editorialization fosters generic commentary in reaction to titles.