PlanetScale... is an interesting company. They ended their free tier, and someone (I am not sure where) pointed out that they are essentially just a B2B company now, in some sense (and they lost quite a bit of reputation with indie hackers).
After that, they released their NVMe drive innovation, which I admit I am a little ignorant of.
One of the reasons I hated PlanetScale was that it was exclusively MySQL; PostgreSQL is good, tbh. But can their Postgres offering run Postgres extensions?
Also, regarding Convex using them: isn't Convex itself a database, or a reactive database? I didn't know that underneath, Convex used some other database like Postgres. Correct me if I am wrong, but from what I recall, they can also use SQLite etc. too.
Another point I'd like to raise is that AlloyDB is the cheapest option in their benchmark apart from their own product.
And I wonder whether some part of the results has been omitted to make their product look better; I'd like to see third-party results too, tbh.
I'd also love to see it be open source, tbh. Neon/Supabase are open source, fwiw. The closest thing to open source I could find from PlanetScale is https://github.com/planetscale/migration-scripts, which is a shell script to migrate from Postgres to PlanetScale (at the time of writing, the most recent commit landed just 36 minutes ago). But I guess I'd like to be able to genuinely tweak and self-host whatever it is that makes their Postgres better, IDK.
Seeing latency figures measured in hundreds of milliseconds in the best cases really drives home for me how big of a deal solutions like SQLite can be.
If you took their exact same hardware and put the application+SQLite on the same box, you could literally chop 4 zeroes off these p99 latency figures. NVMe storage is unbelievably fast when it's utilized in the same machine that the application runs on.
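As a rough illustration of that point (my own sketch, not anything from the post): the loop below measures p99 latency for point reads against a local SQLite file, with made-up table names and sizes. On a laptop-class NVMe drive this usually lands in the microsecond range, several orders of magnitude below a cross-AZ network round trip.

```python
# Illustrative microbenchmark (not the article's TPC-C setup): measure p99 latency
# of simple point lookups against a local SQLite file.
import random
import sqlite3
import time

conn = sqlite3.connect("bench.db")  # hypothetical local file on the NVMe drive
conn.execute("CREATE TABLE IF NOT EXISTS kv (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("DELETE FROM kv")
conn.executemany(
    "INSERT INTO kv (id, val) VALUES (?, ?)",
    ((i, f"value-{i}") for i in range(100_000)),
)
conn.commit()

samples = []
for _ in range(10_000):
    key = random.randrange(100_000)
    start = time.perf_counter()
    conn.execute("SELECT val FROM kv WHERE id = ?", (key,)).fetchone()
    samples.append(time.perf_counter() - start)

samples.sort()
p99 = samples[int(len(samples) * 0.99)]
print(f"p99 read latency: {p99 * 1e6:.1f} µs")
```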
That's fine if you don't super care about the data. I expect these latency figures in particular would look better if there wasn't any replication (Q/s might not change much though, would be my guess).
> At PlanetScale, we give you a primary and two replicas spread across 3 availability zones (AZs) by default. Multi-AZ configurations are critical to have a highly-available database. The replicas can also be used to handle significant read load.
You don't even need SQLite for the speedup, although SQLite is fast. I routinely get blazing-fast p99 latency on local hardware from Postgres itself.
Folks, for the love of god, please please stop running TPC-C without the “keying time” and calling it “the industry-standard TPCC benchmark”.
I understand there are practical reasons why you might want to just choose a concurrency and let it rip at a fixed warehouse size and say, “I ran TPC-C”, but you didn’t!
TPC-C, when run properly, is effectively an open-loop benchmark where the load scales with the dataset size: there is a fixed number of workers per warehouse (2?) that each issue transactions at some rate. It's designed to have a low level of built-in contention that occurs based on the frequency of cross-warehouse transactions; I don't remember the exact rate, but I think it's something like 10%.
The benchmark has an interesting property: if the system can keep up with the transaction load by processing transactions quickly, it remains a low-contention workload, but if it falls behind and transactions start to pile up, then the number of contending transactions in flight will increase. This leads to a non-linear degradation mode even beyond what normally happens with an open-loop benchmark: you hit some limit and performance falls off a cliff, because now you have to do even more work than just catching up on the query backlog.
When you run without think time, you make the benchmark closed-loop. Also, because you're varying the number of workers without changing the dataset size (because you have to vary something to make your pretty charts), you're changing the rate at which any given transaction will land on the same warehouse as another in-flight transaction. So you've got more contending transactions generally, but worse than that, because of Amdahl's law, the uncontended transactions will fly through, so most of the time for most workers will be spent sitting waiting on contended keys.
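To make the open-loop vs. closed-loop distinction concrete, here is a minimal sketch; the pacing values and function names are placeholders I made up, not the actual TPC-C keying/think-time figures.

```python
# Simplified sketch of the distinction (pacing values are placeholders, not the
# real TPC-C keying/think times). A spec-style terminal spends most of its time
# in keying/think delays, so the offered load is set by the warehouse count and
# is roughly independent of how fast the database responds ("effectively open
# loop"). Strip the delays and each worker hammers the database as fast as it
# can (closed loop), so worker count, not the spec, determines the load.
import random
import time

def run_txn():
    """Placeholder for one TPC-C-style transaction (e.g. New-Order)."""
    time.sleep(random.uniform(0.001, 0.005))  # stand-in for database work

def spec_style_terminal(duration_s=10.0, keying_s=1.0, think_mean_s=2.0):
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        time.sleep(keying_s)                                  # keying time before submitting
        run_txn()                                             # wait for the response
        time.sleep(random.expovariate(1.0 / think_mean_s))    # think time afterwards

def no_think_time_worker(duration_s=10.0):
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        run_txn()  # next transaction immediately: throughput = whatever the DB sustains
```

In the paced version, throughput per warehouse is capped by the keying and think delays, which is why the official benchmark has to grow the warehouse count to grow the load; dropping the delays and cranking the worker count instead is what turns it into the contention test described above.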
I'm not sure why you would benchmark AlloyDB/Postgres running on Google Cloud against PlanetScale running on AWS. Why not test it against something running on Google Cloud compute? Typically, there is a good reason why people are using a Google Cloud service in the first place.