database - How is cache conscious B+tree stored? -

February 15, 2013

i new database , wish implement cache conscious b+tree. lot of reading suggest storing nodes , leaves continuous memory. assuming when b+tree created, nodes , leaves stored in heap, , copied disk read write operation? cache conscious b+tree tell os give set of continuous physical pages? think answer no b/c applications shouldn't know how physical pages allocated, , continuous memory referring primary memory page?

the 'cache-consciousness' bit refers special discipline of page layout tries maximise utilisation of first-level data cache of cpu, optimised specific cache line size (e.g. 64 bytes).

one standard technique (independent of cache line size) use offset-value coding in indirection vector, combined 'poor man's normalised keys' (e.g. typically 2 or 4 bytes of key material starting first byte not shared predecessor). reduces necessity of accessing data outside indirection vector - i.e. actual key data kept in heap elsewhere on page, , quite few queries (failed seeks) can done using data contained in beefed-up indirection vector. maximises cache utilisation , minimises thrashing.

other schemes form elements of indirection vector mini-btree 'page size' equal cache line size.

yet scheme divides indirection vector sub-blocks of 1 (or few) cache lines in size, prefix truncation (called 'front compression' in papers) used within these sub blocks not across different blocks. binary search among block 'leaders' used identify target block, scanned linearly in manner typical prefix-truncated key sequences.

a variation of scheme stores block leaders in mini-index , keeps sequential sub blocks elsewhere, improve cache utilisation further. needless say, intra-page space management absolute nightmare.

many other variations possible publications seem limited academic papers trying prove academic points, , rare peeks page layouts used important database systems.

even basic key comparison in connection prefix truncation, serious reference on web find dates 1990:

ddj december 1990 - supercharging sequential searches

papers overviews of cpu cache issues btrees:

b-tree indexes , cpu caches (goetz graefe , per-Åke larson)
cache-conscious index structures main-memory databases (vilho raatikka)
making b+-trees cache conscious in main memory (jun rao , kenneth ross)

consciousness of paged nature of underlying storage - , differing characteristics of seek, page read/write , bulk read/write - different beast important. in results in surprising, innovative designs. 1 example berkeley db's java version, persists log file disk , reconstructs in-memory btree log @ startup.

Search This Blog

UV code

database - How is cache conscious B+tree stored? -

Comments

Post a Comment

Popular posts from this blog

jquery - How do you format the date used in the popover widget title of FullCalendar? -

Bubble Sort Manually a Linked List in Java -

asp.net mvc - SSO between MVCForum and Umbraco7 -