c# - How much record can linq load when result stored in array? -


i have 600 milion record database,and want load query:

var query=(from p in behzad.test           select p).toarray(); 

can load that?

highly improbable... there multiple "walls" you'll encounter.

first... let's each record id... store in best way 600 million of ids, 600 million * 4 bytes = 2.4gb. don't think objects small. there reference objects... each referece 8 bytes @ 64 bits...

http://codeblog.jonskeet.uk/2011/04/05/of-memory-and-strings/ here skeet calculated minimum memory used object @ 64 bits 24 bytes... 14.4gb (it doesn't includes reference object, new object[size] before gc.gettotalmemory(true)). in space can put 2 int without making object bigger (the id , int field, example) (in same page, search table twoint32s).

then there problem linq-to-sql duplicates records when loaded (one copy goes object tracker). can disable https://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.objecttrackingenabled.aspx

then there limit of maximum 2gb of memory each array... being array of references of 8 bytes (64 bits), around 260 million records array. removed .net 4.0 @ 64 bits: https://msdn.microsoft.com/en-us/library/hh285054.aspx

then there problem toarray(): build array goes doubling, using powers of 2 (4 elements, 8 elements, 16, 32, 64, 128... 1024... 1024*1024 elements, ...) before building array of 600 million references (that alone 4.8 gb, included in 24 bytes/element, don't count it), build temporary array of 512 million elements (4 gb), 1 of 1024 million elements (8 gb, of 3.2 useless), , resize 1 of 1024 million elements 600 million.

so before end of toarray() have:

  • 14.4 gb of rows
  • 8.0 gb 1024 million elements array of references
  • 5.4 gb 600 million elements array of references

so @ minimum you'll need 14.4 + 8.0 + 5.4 = 27.8 gb... 2 ints... , "theoretical minimum". increase @ least 50-100% make "practical minimum". , still handling 2 int each record :-)

now... alexei suggested using struct. struct have advantage don't require reference, , size exact size use.

you change query to:

var query = (from p in behzad.test select p)                .select(p => new mystruct { id = p.id, foo = p.foo })                .toarray(); 

now each element 8 bytes big, need

  • 8.0 gb 1024 million elements array of references
  • 5.4 gb 600 million elements array of references

total 13.4 gb.

note using reference type string inside element still big no no (it eat memory).


Comments

Popular posts from this blog

asp.net mvc - SSO between MVCForum and Umbraco7 -

Python Tkinter keyboard using bind -

ubuntu - Selenium Node Not Connecting to Hub, Not Opening Port -