database - How is cache conscious B+tree stored? -


i new database , wish implement cache conscious b+tree. lot of reading suggest storing nodes , leaves continuous memory. assuming when b+tree created, nodes , leaves stored in heap, , copied disk read write operation? cache conscious b+tree tell os give set of continuous physical pages? think answer no b/c applications shouldn't know how physical pages allocated, , continuous memory referring primary memory page?

the 'cache-consciousness' bit refers special discipline of page layout tries maximise utilisation of first-level data cache of cpu, optimised specific cache line size (e.g. 64 bytes).

one standard technique (independent of cache line size) use offset-value coding in indirection vector, combined 'poor man's normalised keys' (e.g. typically 2 or 4 bytes of key material starting first byte not shared predecessor). reduces necessity of accessing data outside indirection vector - i.e. actual key data kept in heap elsewhere on page, , quite few queries (failed seeks) can done using data contained in beefed-up indirection vector. maximises cache utilisation , minimises thrashing.

other schemes form elements of indirection vector mini-btree 'page size' equal cache line size.

yet scheme divides indirection vector sub-blocks of 1 (or few) cache lines in size, prefix truncation (called 'front compression' in papers) used within these sub blocks not across different blocks. binary search among block 'leaders' used identify target block, scanned linearly in manner typical prefix-truncated key sequences.

a variation of scheme stores block leaders in mini-index , keeps sequential sub blocks elsewhere, improve cache utilisation further. needless say, intra-page space management absolute nightmare.

many other variations possible publications seem limited academic papers trying prove academic points, , rare peeks page layouts used important database systems.

even basic key comparison in connection prefix truncation, serious reference on web find dates 1990:

ddj december 1990 - supercharging sequential searches

papers overviews of cpu cache issues btrees:

consciousness of paged nature of underlying storage - , differing characteristics of seek, page read/write , bulk read/write - different beast important. in results in surprising, innovative designs. 1 example berkeley db's java version, persists log file disk , reconstructs in-memory btree log @ startup.


Comments

Popular posts from this blog

asp.net mvc - SSO between MVCForum and Umbraco7 -

Python Tkinter keyboard using bind -

ubuntu - Selenium Node Not Connecting to Hub, Not Opening Port -