Data Engineering

A community for discussion about data engineering

Are there best practices for time series database designs? (programming.dev)

submitted 2 weeks ago by jupyter@programming.dev to c/data_engineering@programming.dev

I am creating a couple of larger database tables, each with at least hundreds of millions of observations and growing. Some tables are sampled by minute, some by millisecond. Timestamps are not necessarily unique.

Should I create separate year, month, or date and time columns? Is a single datetime column enough? At what size would you partition the tables?

Raw data is in CSV.

Currently I am aiming for PostgreSQL and DuckDB. Does TimescaleDB make a significant difference?
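To make the column and partitioning question concrete, here is a minimal sketch, not a recommendation. It assumes a single timestamp column and PostgreSQL declarative range partitioning by calendar month. The table name `observations`, the columns `sensor_id` and `value`, and the monthly granularity are all hypothetical. It shows that year and month can be derived from one timestamp, so separate columns are not strictly needed to define partitions.

```python
from datetime import datetime, timezone

# Hypothetical parent table: one timestamp column, no uniqueness constraint
# on it, since timestamps in the question are not necessarily unique.
PARENT_DDL = """
CREATE TABLE observations (
    ts        timestamptz NOT NULL,
    sensor_id integer     NOT NULL,
    value     double precision
) PARTITION BY RANGE (ts);
"""


def month_bounds(ts: datetime) -> tuple[datetime, datetime]:
    """Return the [start, end) bounds of the calendar month containing ts."""
    start = ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end


def partition_ddl(parent: str, ts: datetime) -> str:
    """Build the DDL for the monthly partition that would hold ts."""
    start, end = month_bounds(ts)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start:%Y-%m-%d}') TO ('{end:%Y-%m-%d}');"
    )


# Year and month come straight from the timestamp; no extra columns needed.
sample = datetime(2024, 12, 15, 10, 30, 0, 123000, tzinfo=timezone.utc)
print(sample.year, sample.month)
print(partition_ddl("observations", sample))
```

Whether monthly partitions are the right size depends on row volume and query patterns. The millisecond tables might call for finer ranges, which this sketch does not attempt to decide.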

[–] plumbus 5 points 2 weeks ago

I thought InfluxDB was the usual choice for such use cases. But I’m not an expert…