time series - InfluxDB performance

January 15, 2013

For my case, I need to capture 15 performance metrics for devices and save them to InfluxDB. Each device has a unique device id.

Metrics are written into InfluxDB in the following way. Here I show only one as an example:

new Serie.Builder("perfmetric1")
    .columns("time", "value", "id", "type")
    .values(getTime(), getPerf1(), getId(), getType())
    .build()

Writing data is fast and easy, but I saw bad performance when I ran a query. I'm trying to get all 15 metric values for the last hour.

select value from perfmetric1, perfmetric2, ..., perfmetric15 where id='testdeviceid' and time > now() - 1h

For one hour, each metric has 120 data points, so in total that is 1800 data points. The query takes about 5 seconds on a c4.4xlarge EC2 instance when it is idle.

I believe InfluxDB can do better. Is this a problem with my schema design, or is it something else? Would splitting the query into 15 parallel calls go faster?
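For anyone who wants to measure the parallel-call idea, here is a minimal sketch of how the 15 calls could be issued concurrently. It assumes a hypothetical `runQuery` function standing in for whatever client call actually sends a query to InfluxDB; the class and method names are made up for illustration. Whether this is actually faster is not established here, since each call still filters on the id column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class ParallelMetricQueries {

    // Builds the query text for one metric, mirroring the combined query in the question.
    static String queryFor(int metric, String deviceId) {
        return "select value from perfmetric" + metric
                + " where id='" + deviceId + "' and time > now() - 1h";
    }

    // Runs one query per metric concurrently and returns the results in metric order.
    // `runQuery` is a hypothetical stand-in for the real client call.
    static <T> List<T> queryAll(String deviceId, int metricCount,
                                Function<String, T> runQuery) {
        ExecutorService pool = Executors.newFixedThreadPool(metricCount);
        try {
            List<CompletableFuture<T>> futures = new ArrayList<>();
            for (int i = 1; i <= metricCount; i++) {
                String q = queryFor(i, deviceId);
                futures.add(CompletableFuture.supplyAsync(() -> runQuery.apply(q), pool));
            }
            List<T> results = new ArrayList<>();
            for (CompletableFuture<T> f : futures) {
                results.add(f.join()); // blocks until this metric's query has finished
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Echo the query text instead of contacting a server.
        List<String> sent = queryAll("testdeviceid", 15, q -> q);
        System.out.println(sent.size() + " queries, first: " + sent.get(0));
    }
}
```

Timing `queryAll` against the single combined query on the same data would show whether the split helps in a given setup.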

As @valentin's answer says, you need to build an index for the id column so that InfluxDB can perform these queries efficiently.

In 0.8 stable you can do this "indexing" using continuous fanout queries. For example, the following continuous query will expand your perfmetric1 series into multiple series of the form perfmetric1.id:

select * from perfmetric1 into perfmetric1.[id];
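The same fanout would be needed for each of the 15 metric series. As a small sketch, the query strings could be generated rather than typed by hand. The class and method names are hypothetical, and the code only builds the text; how the queries are submitted to InfluxDB is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class FanoutQueries {

    // Returns one continuous fanout query per metric series, in the form
    // "select * from <series> into <series>.[id];".
    static List<String> fanoutQueries(String prefix, int metricCount) {
        List<String> queries = new ArrayList<>();
        for (int i = 1; i <= metricCount; i++) {
            String series = prefix + i;
            queries.add("select * from " + series + " into " + series + ".[id];");
        }
        return queries;
    }

    public static void main(String[] args) {
        // Print the 15 queries for perfmetric1 through perfmetric15.
        for (String q : fanoutQueries("perfmetric", 15)) {
            System.out.println(q);
        }
    }
}
```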

Later you would do:

select value from perfmetric1.testdeviceid, perfmetric2.testdeviceid, ..., perfmetric15.testdeviceid where time > now() - 1h

This query will take less time to complete, since InfluxDB won't have to perform a full scan of the time series to get the points for each testdeviceid.
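Since the device id is now part of the series name, the query text has to be assembled per device. A minimal sketch of that string building follows; the helper name is hypothetical, and it assumes device ids are plain identifiers that need no quoting inside a series name.

```java
import java.util.StringJoiner;

public class FanoutSelect {

    // Builds a last-hour query over the per-device series produced by the fanout,
    // e.g. perfmetric1.<deviceId>, perfmetric2.<deviceId>, and so on.
    static String lastHourQuery(String prefix, int metricCount, String deviceId) {
        StringJoiner series = new StringJoiner(", ");
        for (int i = 1; i <= metricCount; i++) {
            series.add(prefix + i + "." + deviceId);
        }
        return "select value from " + series + " where time > now() - 1h";
    }

    public static void main(String[] args) {
        // Three metrics are used here only to keep the printed line short.
        System.out.println(lastHourQuery("perfmetric", 3, "testdeviceid"));
    }
}
```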
