c# - How much record can linq load when result stored in array? -
i have 600 milion record database,and want load query:
var query=(from p in behzad.test select p).toarray();
can load that?
highly improbable... there multiple "walls" you'll encounter.
first... let's each record id... store in best way 600 million of ids, 600 million * 4 bytes = 2.4gb. don't think objects small. there reference objects... each referece 8 bytes @ 64 bits...
http://codeblog.jonskeet.uk/2011/04/05/of-memory-and-strings/ here skeet calculated minimum memory used object @ 64 bits 24 bytes... 14.4gb (it doesn't includes reference object, new object[size]
before gc.gettotalmemory(true)
). in space can put 2 int
without making object bigger (the id
, int
field, example) (in same page, search table twoint32s
).
then there problem linq-to-sql duplicates records when loaded (one copy goes object tracker). can disable https://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.objecttrackingenabled.aspx
then there limit of maximum 2gb of memory each array... being array of references of 8 bytes (64 bits), around 260 million records array. removed .net 4.0 @ 64 bits: https://msdn.microsoft.com/en-us/library/hh285054.aspx
then there problem toarray()
: build array goes doubling, using powers of 2 (4 elements, 8 elements, 16, 32, 64, 128... 1024... 1024*1024 elements, ...) before building array of 600 million references (that alone 4.8 gb, included in 24 bytes/element, don't count it), build temporary array of 512 million elements (4 gb), 1 of 1024 million elements (8 gb, of 3.2 useless), , resize 1 of 1024 million elements 600 million.
so before end of toarray()
have:
- 14.4 gb of rows
- 8.0 gb 1024 million elements array of references
- 5.4 gb 600 million elements array of references
so @ minimum you'll need 14.4 + 8.0 + 5.4 = 27.8 gb... 2 int
s... , "theoretical minimum". increase @ least 50-100% make "practical minimum". , still handling 2 int
each record :-)
now... alexei suggested using struct
. struct
have advantage don't require reference, , size exact size use.
you change query to:
var query = (from p in behzad.test select p) .select(p => new mystruct { id = p.id, foo = p.foo }) .toarray();
now each element 8 bytes big, need
- 8.0 gb 1024 million elements array of references
- 5.4 gb 600 million elements array of references
total 13.4 gb.
note using reference type string
inside element still big no no (it eat memory).
Comments
Post a Comment