c# - How much record can linq load when result stored in array? -


i have 600 milion record database,and want load query:

var query=(from p in behzad.test           select p).toarray(); 

can load that?

highly improbable... there multiple "walls" you'll encounter.

first... let's each record id... store in best way 600 million of ids, 600 million * 4 bytes = 2.4gb. don't think objects small. there reference objects... each referece 8 bytes @ 64 bits...

http://codeblog.jonskeet.uk/2011/04/05/of-memory-and-strings/ here skeet calculated minimum memory used object @ 64 bits 24 bytes... 14.4gb (it doesn't includes reference object, new object[size] before gc.gettotalmemory(true)). in space can put 2 int without making object bigger (the id , int field, example) (in same page, search table twoint32s).

then there problem linq-to-sql duplicates records when loaded (one copy goes object tracker). can disable https://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.objecttrackingenabled.aspx

then there limit of maximum 2gb of memory each array... being array of references of 8 bytes (64 bits), around 260 million records array. removed .net 4.0 @ 64 bits: https://msdn.microsoft.com/en-us/library/hh285054.aspx

then there problem toarray(): build array goes doubling, using powers of 2 (4 elements, 8 elements, 16, 32, 64, 128... 1024... 1024*1024 elements, ...) before building array of 600 million references (that alone 4.8 gb, included in 24 bytes/element, don't count it), build temporary array of 512 million elements (4 gb), 1 of 1024 million elements (8 gb, of 3.2 useless), , resize 1 of 1024 million elements 600 million.

so before end of toarray() have:

  • 14.4 gb of rows
  • 8.0 gb 1024 million elements array of references
  • 5.4 gb 600 million elements array of references

so @ minimum you'll need 14.4 + 8.0 + 5.4 = 27.8 gb... 2 ints... , "theoretical minimum". increase @ least 50-100% make "practical minimum". , still handling 2 int each record :-)

now... alexei suggested using struct. struct have advantage don't require reference, , size exact size use.

you change query to:

var query = (from p in behzad.test select p)                .select(p => new mystruct { id = p.id, foo = p.foo })                .toarray(); 

now each element 8 bytes big, need

  • 8.0 gb 1024 million elements array of references
  • 5.4 gb 600 million elements array of references

total 13.4 gb.

note using reference type string inside element still big no no (it eat memory).


Comments

Popular posts from this blog

jquery - How do you format the date used in the popover widget title of FullCalendar? -

asp.net mvc - SSO between MVCForum and Umbraco7 -

Python Tkinter keyboard using bind -