Show HN: Sqlite3-dump - a fast SQLite to CSV and parquet

(github.com)

16 points | by Gave4655 2 days ago

3 comments

  • skalkin 2 days ago

    Thanks for sharing!

    If converting SQLite to CSV/Parquet is a one-off task for you, you can also import SQLite files (one of 50+ supported tabular formats) and export to several formats, CSV and Parquet included, using Datagrok (https://datagrok.ai/). Just drag and drop the SQLite file, then use the "export" button at the top. Everything happens inside the browser, so there is a limit on the maximum dataset size, probably around 1GB.

    Disclaimer - I'm one of the developers of Datagrok :)

  • vdm 15 hours ago

    $ duckdb f.db -c 'COPY table1 TO table1.csv;COPY table1 TO table1.parquet;'

    • Gave4655 11 hours ago

      On my machine, in a basic run, the tool in the link is much faster.

      ```
      $ time ./duckdb_cli-linux-amd64 ./basic_batched.db -c "COPY user TO 'user.csv'"
      100% (00:00:20.55 elapsed)

      real 0m24.162s
      user 0m22.505s
      sys 0m1.988s
      ```

      ```
      $ time ./duckdb_cli-linux-amd64 ./basic_batched.db -c "COPY user TO 'user.parquet'"
      100% (00:00:17.11 elapsed)

      real 0m20.970s
      user 0m19.347s
      sys 0m1.841s
      ```

      ```
      $ time cargo run --bin parquet --release -- basic_batched.db user -o out.parquet
      Finished `release` profile [optimized] target(s) in 0.11s
      Running `target/release/parquet basic_batched.db user -o out.parquet`
      Database opened in 14.828µs

      SQLite to Parquet Exporter
      ==========================
      Database: basic_batched.db
      Page size: 4096 bytes
      Text encoding: Utf8
      Output: out.parquet
      Batch size: 10000

      Exporting table: user
      Output file: out.parquet

         user: 100000000 rows (310.01 MB) - 5.85s (17095636 rows/sec)

      Export completed successfully!
      ==========================
      Table: user
      Rows exported: 100000000
      Time taken: 5.85s
      Output file: out.parquet
      Throughput: 17095564 rows/sec
      File size: 310.01 MB

      real 0m6.052s
      user 0m10.455s
      sys 0m0.537s
      ```

      ```
      $ time cargo run --bin csv --release -- basic_batched.db -t user -o out.csv
      Finished `release` profile [optimized] target(s) in 0.03s
      Running `target/release/csv basic_batched.db -t user -o out.csv`

      real 0m6.453s
      user 0m5.252s
      sys 0m1.196s
      ```
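
For anyone comparing the numbers in this thread, the wall-clock (`real`) times can be turned into rough speedup ratios. This is a minimal Python sketch that uses only the figures quoted above. The names `REAL_SECONDS`, `speedup`, and `summarize` are illustrative and not part of either tool. The timings come from a single run on one machine, and the `cargo run` figures include Cargo's own startup step, so treat the ratios as approximate.

```python
# Wall-clock ("real") seconds copied from the timings quoted in the thread.
# Keys and layout are illustrative, not taken from either tool.
REAL_SECONDS = {
    "csv": {"duckdb": 24.162, "sqlite3-dump": 6.453},
    "parquet": {"duckdb": 20.970, "sqlite3-dump": 6.052},
}


def speedup(baseline_s: float, candidate_s: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate time must be positive")
    return baseline_s / candidate_s


def summarize(timings: dict) -> dict:
    """Map each output format to the DuckDB / sqlite3-dump wall-clock ratio."""
    return {
        fmt: speedup(t["duckdb"], t["sqlite3-dump"])
        for fmt, t in timings.items()
    }


if __name__ == "__main__":
    for fmt, ratio in summarize(REAL_SECONDS).items():
        print(f"{fmt}: about {ratio:.1f}x faster (single run, wall-clock)")
```

On these figures both exports come out roughly three and a half to four times faster than the DuckDB CLI runs. A fairer benchmark would repeat each run several times and invoke the release binary directly rather than through `cargo run`.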