
Join us for an exciting collaborative talk night with [Seattle Postgres](https://www.meetup.com/seattle-postgres/?eventOrigin=your_groups) at the Fred Hutch Steam Plant!

**Agenda:**

- 5:30-6pm: Doors open/networking
- 6-6:10pm: Opening remarks
- 6:10-6:35pm: Talk #1
- 6:35-6:50pm: Intermission
- 6:50-7:05pm: Lightning talk #1
- 7:05-7:20pm: Lightning talk #2
- 7:20-7:35pm: Lightning talk #3
- 7:35-8pm: Networking

After party to follow at Tapster (1011 Valley St, Seattle, WA 98109).

**What we'll do:** This event will feature the following presentations:

**Talk #1:** TBD

**Speaker:** Andrew Beyer

**Description:** A survey of some interesting tools, systems, and practices at the intersection of databases and WebAssembly.

**Bio:** TBD

**Lightning talk #1:** Benchmarking Database Systems on the NYC Taxi Database

**Speaker:** Junaid Hasan

**Description:** Data science workflows on local hardware often face a “Mid-Size Data” problem: datasets between 1GB and 100GB that are too large for spreadsheets but inconvenient for distributed clusters. This study benchmarks five data management systems (PostgreSQL, SQLite, Pandas, DuckDB, Polars) on a 41-million-row NYC Taxi dataset using a standard Apple M1 laptop. Our results reveal an 18,000x difference in ingestion latency between row-stores and zero-copy columnar engines. Furthermore, forensic analysis of query plans demonstrates that execution architecture (Vectorized vs. Volcano) dominates optimizer intelligence for analytical workloads. Finally, a sensitivity analysis over 20 iterations exposes significant volatility in SQLite’s query planning (σ > 400s) compared to the stability of DuckDB and PostgreSQL.

**Bio:** I am Junaid Hasan, a Math and Data Science PhD student at UW Seattle, graduating this June. Most of my day-to-day research focuses on AI interpretability and computational number theory, and managing all the data got me interested in benchmarking databases. Before this, I spent some time working on cryptography and algebraic ge
Hosted by Spotbo