To be honest, the way compound indexes are implemented in mongo is problematic (index intersection never worked for me). You sometimes have to tinker a lot to get the query params in the right order, so that the query planner doesn't decide to pick another index, or no index at all.
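(For illustration, a minimal sketch with pymongo of what that tinkering looks like; the connection string, collection, and field names are made up, and `hint()` is the usual escape hatch when the planner keeps choosing the wrong index.)

```python
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]

# Compound index: field order matters. Queries should filter on the
# leading field(s) for the planner to consider this index at all.
orders.create_index([("customer_id", ASCENDING), ("created_at", ASCENDING)])

# If the planner keeps picking a different index (or none), you can pin it:
cursor = orders.find(
    {"customer_id": 42, "created_at": {"$gte": "2024-01-01"}}
).hint([("customer_id", ASCENDING), ("created_at", ASCENDING)])

# explain() shows which plan/index the planner actually chose.
print(orders.find({"customer_id": 42}).explain()["queryPlanner"]["winningPlan"])
```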
Not sure if this was fixed in recent versions, but if you have an index on an array of strings/ints and then use the `addToSet` operator, the index is not used for the `addToSet` operation, and the disk gets hammered on each update to verify that the array's contents are unique. With pretty high volume you can easily saturate the disk queue, and you're left wondering what is happening and how to fix it.
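(For context, the update pattern being described, sketched with pymongo; the collection and `tags` field are hypothetical. Whether the multikey index gets used for the membership check is exactly the issue, and may depend on server version.)

```python
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
docs = client["app"]["documents"]

# Multikey index over the array field.
docs.create_index([("tags", ASCENDING)])

# $addToSet appends only if the value isn't already in the array, so every
# update implies a uniqueness check against the document's current contents.
docs.update_one({"_id": 1}, {"$addToSet": {"tags": "new-tag"}})
```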
After a few years of working with multi-TB collections (tables) in mongo, I can say this DB is good if you want to insert data but don't need to read it all that often. If you need a JSON, read-heavy DB, consider something else, e.g. Elastic (with all its drawbacks), because at some point your limiting factor will be Mongo's indexes, which are the weakest part of this DB (even weaker than the infamous data loss).
Here is my recipe for even faster query performance:
1: export the data out of MongoDB
2: import the data into PostgreSQL, ClickHouse, DuckDB, whatever
3: create the indexes there
4: uninstall MongoDB
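(Tongue in cheek, but a rough sketch of steps 1–3 using pymongo, pandas, and DuckDB could look like the following; the collection, table, and column names are invented, and a real migration would need batching and explicit type mapping.)

```python
import duckdb
import pandas as pd
from pymongo import MongoClient

events = MongoClient("mongodb://localhost:27017")["app"]["events"]

# 1: export the data out of MongoDB (dropping the ObjectId for simplicity)
df = pd.DataFrame(list(events.find({}, {"_id": 0})))

# 2: import the data into DuckDB (it can read the local DataFrame directly)
con = duckdb.connect("events.duckdb")
con.execute("CREATE TABLE events AS SELECT * FROM df")

# 3: create the indexes there
con.execute("CREATE INDEX idx_events_user ON events(user_id)")

# 4: uninstalling MongoDB is left as an exercise for the reader
```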
But then they wouldn’t be able to sell their dashboard view. That’s the point of the tutorial/sales pitch. /s
Does anyone even use MongoDB anymore? Is this article from 2019?
Data infrastructure in the real world, for non-trivial systems, moves slowly. Migrations are costly and risky. If you need to optimize query performance, you are likely working on one of those non-trivial systems. We have some MongoDB clusters at work (one in my team). Would we have picked it today? Maybe not, because the mindshare isn't there. Does it work fine right now though? Yes, definitely, we have no issues really (the suck in that system is elsewhere).