Timeline for Efficiently Visualising Very Large Data Sets (without running out of memory)

Current License: CC BY-SA 3.0

28 events

when	what		by	license	comment
Apr 13, 2017 at 12:55	history	edited	CommunityBot		replaced http://mathematica.stackexchange.com/ with https://mathematica.stackexchange.com/
Mar 7, 2012 at 17:43	vote	accept	Sinistar
Mar 7, 2012 at 17:37	answer	added	Sinistar		timeline score: 0
Mar 6, 2012 at 22:02	comment	added	Verbeia		@Sjoerd I fixed up the question title and inserted the input data and function.
Mar 6, 2012 at 22:00	history	edited	Verbeia	CC BY-SA 3.0	added content from OP's comments and made title more explicit
Mar 6, 2012 at 21:02	answer	added	Sjoerd C. de Vries		timeline score: 14
Mar 6, 2012 at 20:57	comment	added	Sjoerd C. de Vries		Sinistar, could you please come up with a more descriptive title for your question? This will help future users searching for an answer to similar questions.
Mar 6, 2012 at 15:52	answer	added	rcollyer		timeline score: 3
Mar 6, 2012 at 15:45	comment	added	Sinistar		Okay, that's interesting. I still think M! developers need to address this. If you can't visualize the data as well as spot check in a systematic way, I think this is a serious wall to hit. The problems one tries to solve using a tool such as this are only compounded by such difficulties. I suppose, if I was at Los Alamos, I'd be fine, but then only sell it to those that have those resources. Sorry if this sounds a bit frustrated, it's because I am.
Mar 6, 2012 at 15:44	answer	added	canadian_scholar		timeline score: 1
Mar 6, 2012 at 15:42	comment	added	Andy Ross		let us continue this discussion in chat
Mar 6, 2012 at 15:41	comment	added	Andy Ross		I would suggest editing your question so that it specifically addresses what you are hoping to accomplish with this data. Maybe responses will give you a sense of how to handle similar issues in the future.
Mar 6, 2012 at 15:38	comment	added	Andy Ross		But your output isn't packed and `ByteCount[ts]` verifies the size. If you apply Developer`ToPackedArray@ts it will consume far less memory.
Mar 6, 2012 at 15:35	comment	added	Sinistar		But I have 16GB of RAM, a 24 GB unbounded swap file, and a 2GB flash card assisting the OS. I'm not sure I agree it's 2GB either, I'm guessing more around 300MB in 16-bit Unicode. My machine should be able to handle this. Plus, the finished data already exists in memory. You can use RandomChoice[ts,10] on it and get result after result. But if you want the whole list, fuggetaboutit.
Mar 6, 2012 at 15:34	history	edited	rcollyer	CC BY-SA 3.0	added link to related question
Mar 6, 2012 at 15:34	review	Suggested edits
Mar 6, 2012 at 15:33	comment	added	Andy Ross		It "gags" because you have about 2GB of results and are trying to render them in the front end. I suppose I don't see how you hope to gain anything by seeing the full output. Try a small problem and verify that it works. Look at a subset of individual results. Check that the dimensions are what you expect, etc.
Mar 6, 2012 at 15:31	comment	added	Sinistar		I just want the text output of this processed data. Seems simple enough? I had already asked this question, and realized the solution was not quite what I was needing, but I had to discover this by accident, because I couldn't simply render the data. I had another problem where I was summing a list of 22 million elements and wanted to ListPlot the results. I waited for 3 days for the output, and when it finally rendered, it was useless. I couldn't export to SVG or PDF and had to settle for a low-res JPG. M! just cannot seem to render what the kernel is capable of processing.
Mar 6, 2012 at 15:29	comment	added	Sinistar		mathematica.stackexchange.com/questions/2593/advanced-tupling
Mar 6, 2012 at 15:25	comment	added	tkott		Would you edit the original question to include the data and solution? What exactly do you imagine a good visualization of the tuples would look like? A histogram? Some word on what you're looking for would be helpful. As for "data it has already processed", there is a large difference, I think, between having it in memory and printing it out in some fashion, where you need not only all of the data, but also all the commands to print / visualize it.
Mar 6, 2012 at 15:20	comment	added	Sinistar		So if you try to use Print, or Show All, the machine loses it. If I am patient, like for 3 or 4 days sometimes, it will finish. I have seen this happen on many problems of sufficient complexity. Graphical and plain text output. I fail to understand why M! gags on data it has already processed?
Mar 6, 2012 at 15:19	comment	added	Sinistar		This was a solution: newTuples[t_, x_] := Flatten[ ParallelTable[Append[s, #] & /@ Complement[x, s], {s, t}], 1]; Timing[ts = Fold[newTuples, {{}}, {a, b, c, d, e, f}];]
Mar 6, 2012 at 15:18	comment	added	halirutan♦		@ruebenko Fixed it.
Mar 6, 2012 at 15:18	comment	added	Sinistar		Here is the data: a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 21, 22, 23, 25, 26, 28} b = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 25, 26, 27, 28, 30, 31, 33, 37, 41} c = {6, 10, 13, 14, 15, 16, 18, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 32, 35, 37, 39} d = {17, 19, 25, 30, 31, 33, 34, 35, 36, 38, 44} e = {31, 41, 45, 47} f = {23, 26, 31, 32, 33, 34, 35, 36, 38, 39, 40, 41, 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53}
Mar 6, 2012 at 15:14	history	edited	halirutan♦	CC BY-SA 3.0	added 2 characters in body
Mar 6, 2012 at 15:13	history	edited	Sinistar	CC BY-SA 3.0	added 1 characters in body
Mar 6, 2012 at 15:10	comment	added	tkott		@Sinistar It seems that you are asking how to visualize the data from a separate question on this site. If so, could you update the question here to include the original data set so that users could more easily play around with it?
Mar 6, 2012 at 14:58	history	asked	Sinistar	CC BY-SA 3.0
