cql - Cassandra sorting is not working -
i having cassandra table want data sorted according action_time
field.
please find below script details.
cassandra@cqlsh:activitylog> desc keyspace;cassandra@cqlsh:activitylog> desc keyspace; create keyspace activitylog replication = {'class': 'simplestrategy', 'replication_factor': '3'} , durable_writes = true; create table activitylog.activity_log ( id timeuuid, action_time timestamp, vendor_id bigint, primary key (id, action_time) ) clustering order (action_time asc) , bloom_filter_fp_chance = 0.01 , caching = '{"keys":"all", "rows_per_partition":"none"}' , comment = '' , compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.sizetieredcompactionstrategy', 'max_threshold': '32'} , compression = {'sstable_compression': 'org.apache.cassandra.io.compress.lz4compressor'} , dclocal_read_repair_chance = 0.1 , default_time_to_live = 0 , gc_grace_seconds = 864000 , max_index_interval = 2048 , memtable_flush_period_in_ms = 0 , min_index_interval = 128 , read_repair_chance = 0.0 , speculative_retry = '99.0percentile'; cassandra@cqlsh:activitylog> select id,action_time,vendor_id activity_log; id | action_time | vendor_id --------------------------------------+--------------------------+----------- 4ce8ea50-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:11:18+0530 | 1234 4ce8ea53-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:11:18+0530 | 1234 86aedb31-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:13:34+0530 | 1234 09ec5340-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:09:25+0530 | 1234 65e85a71-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:12:39+0530 | 1234 65e85a72-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:12:39+0530 | 1234 86aedb33-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:13:34+0530 | 1234 4ce8ea52-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:11:18+0530 | 1234 86aedb32-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:13:34+0530 | 1234 09ec5341-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:09:25+0530 | 1234 20935240-ea70-11e4-8fd7-4943c7dd5536 | 2015-04-24 16:23:20+0530 | 1234 65e85a73-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:12:39+0530 | 1234 20935243-ea70-11e4-8fd7-4943c7dd5536 | 2015-04-24 16:23:20+0530 | 1234 86aedb30-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:13:34+0530 | 1234 09ec5343-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:09:25+0530 | 1234 20935241-ea70-11e4-8fd7-4943c7dd5536 | 2015-04-24 16:23:20+0530 | 1234 65e85a70-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:12:39+0530 | 1234 20935242-ea70-11e4-8fd7-4943c7dd5536 | 2015-04-24 16:23:20+0530 | 1234 09ec5342-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:09:25+0530 | 1234 4ce8ea51-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:11:18+0530 | 1234 create keyspace activitylog replication = {'class': 'simplestrategy', 'replication_factor': '3'} , durable_writes = true; create table activitylog.activity_log ( id timeuuid, action_time timestamp, vendor_id bigint, primary key (id, action_time) ) clustering order (action_time asc) , bloom_filter_fp_chance = 0.01 , caching = '{"keys":"all", "rows_per_partition":"none"}' , comment = '' , compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.sizetieredcompactionstrategy', 'max_threshold': '32'} , compression = {'sstable_compression': 'org.apache.cassandra.io.compress.lz4compressor'} , dclocal_read_repair_chance = 0.1 , default_time_to_live = 0 , gc_grace_seconds = 864000 , max_index_interval = 2048 , memtable_flush_period_in_ms = 0 , min_index_interval = 128 , read_repair_chance = 0.0 , speculative_retry = '99.0percentile'; cassandra@cqlsh:activitylog> select id,action_time,vendor_id activity_log; id | action_time | vendor_id --------------------------------------+--------------------------+----------- 4ce8ea50-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:11:18+0530 | 1234 4ce8ea53-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:11:18+0530 | 1234 86aedb31-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:13:34+0530 | 1234 09ec5340-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:09:25+0530 | 1234 65e85a71-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:12:39+0530 | 1234 65e85a72-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:12:39+0530 | 1234 86aedb33-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:13:34+0530 | 1234 4ce8ea52-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:11:18+0530 | 1234 86aedb32-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:13:34+0530 | 1234 09ec5341-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:09:25+0530 | 1234 20935240-ea70-11e4-8fd7-4943c7dd5536 | 2015-04-24 16:23:20+0530 | 1234 65e85a73-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:12:39+0530 | 1234 20935243-ea70-11e4-8fd7-4943c7dd5536 | 2015-04-24 16:23:20+0530 | 1234 86aedb30-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:13:34+0530 | 1234 09ec5343-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:09:25+0530 | 1234 20935241-ea70-11e4-8fd7-4943c7dd5536 | 2015-04-24 16:23:20+0530 | 1234 65e85a70-ea7f-11e4-aa45-4943c7dd5536 | 2015-04-24 18:12:39+0530 | 1234 20935242-ea70-11e4-8fd7-4943c7dd5536 | 2015-04-24 16:23:20+0530 | 1234 09ec5342-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:09:25+0530 | 1234 4ce8ea51-ea55-11e4-a220-4943c7dd5536 | 2015-04-24 13:11:18+0530 | 1234
as can see select statement not sorted per action_time column. while getting description of table showing data sorted action_time in asc order.
please me out.
it important note cassandra enforce sorting order only within partition key. in case, partition key id
, , table definition indicates rows similar partition keys ordered clustering key action_time
.
your problem not sort order not work. cardinality of partition key high cassandra doesn't have sort.
this becomes clear when model data lower-cardinality partition key. watch happens when re-create table activity_log_by_vendor
re-ordered primary key:
create table activity_log_by_vendor ( id timeuuid, action_time timestamp, vendor_id bigint, primary key (vendor_id, action_time, id)); aploetz@cqlsh:stackoverflow2> select * activity_log_by_vendor vendor_id = 1234; vendor_id | action_time | id -----------+--------------------------+-------------------------------------- 1234 | 2015-04-24 02:39:25-0500 | 09ec5340-ea55-11e4-a220-4943c7dd5536 1234 | 2015-04-24 02:39:25-0500 | 09ec5341-ea55-11e4-a220-4943c7dd5536 1234 | 2015-04-24 02:39:25-0500 | 09ec5342-ea55-11e4-a220-4943c7dd5536 1234 | 2015-04-24 02:39:25-0500 | 09ec5343-ea55-11e4-a220-4943c7dd5536 1234 | 2015-04-24 02:41:18-0500 | 4ce8ea50-ea55-11e4-a220-4943c7dd5536 1234 | 2015-04-24 02:41:18-0500 | 4ce8ea51-ea55-11e4-a220-4943c7dd5536 1234 | 2015-04-24 02:41:18-0500 | 4ce8ea52-ea55-11e4-a220-4943c7dd5536 1234 | 2015-04-24 02:41:18-0500 | 4ce8ea53-ea55-11e4-a220-4943c7dd5536 1234 | 2015-04-24 05:53:20-0500 | 20935240-ea70-11e4-8fd7-4943c7dd5536 1234 | 2015-04-24 05:53:20-0500 | 20935241-ea70-11e4-8fd7-4943c7dd5536 1234 | 2015-04-24 05:53:20-0500 | 20935242-ea70-11e4-8fd7-4943c7dd5536 1234 | 2015-04-24 05:53:20-0500 | 20935243-ea70-11e4-8fd7-4943c7dd5536 1234 | 2015-04-24 07:42:39-0500 | 65e85a70-ea7f-11e4-aa45-4943c7dd5536 1234 | 2015-04-24 07:42:39-0500 | 65e85a71-ea7f-11e4-aa45-4943c7dd5536 1234 | 2015-04-24 07:42:39-0500 | 65e85a72-ea7f-11e4-aa45-4943c7dd5536 1234 | 2015-04-24 07:42:39-0500 | 65e85a73-ea7f-11e4-aa45-4943c7dd5536 1234 | 2015-04-24 07:43:34-0500 | 86aedb30-ea7f-11e4-aa45-4943c7dd5536 1234 | 2015-04-24 07:43:34-0500 | 86aedb31-ea7f-11e4-aa45-4943c7dd5536 1234 | 2015-04-24 07:43:34-0500 | 86aedb32-ea7f-11e4-aa45-4943c7dd5536 1234 | 2015-04-24 07:43:34-0500 | 86aedb33-ea7f-11e4-aa45-4943c7dd5536 (20 rows)
by partitioning on vendor_id
, cassandra has 20 cql rows sort. , since have same vendor_id
, cassandra enforce sort order on action_time
, id
, descending (note, added id
primary key ensure uniqueness).
last month wrote article planetcassandra describing behavior: we shall have order!
datastax's patrick mcfadin has article great job discussing how model these types of problems: getting started time series data modeling.
give 2 articles read, , should better understanding of how leverage cassandra clustering order.
Comments
Post a Comment