Week 10

We ran the benchmarking test this week. We ran the performance test for Vega against PostgreSQL and DuckDB. For the test, we used multiple generated datasets of differing sizes, from 50,000 to 1 million rows. Just a side note, the input dataset for the dataset generator had only 300 rows, so it’s really interesting that it produced a dataset as high as 400 times larger than the original dataset. Back to the point, we were very pleased to find out that ours outperformed others in general. While DuckDB was the clear victor for the Extent operator across all size, and while DuckDB had the tendency to perform very well as the data size increased, Vega was noticeably performing better in all the other cases (across all other operator types and dataset sizes). Because we were able to empirically prove that Vega is very strong through the benchmarking process, I think our team achieved one of the main milestones. If you are interested in seeing a comparison graph, please check the final report.

Written on August 3, 2021