Shared Memory with Python

As for rewriting: if you already have a decent amount of code written in something, I'm pretty sure there is a way of combining languages. I thought it was called inlining, but apparently that's not the right word.
 
Quote from 6yaNYCjm5m:

- I don't get why you have to reload it so often. If you are still developing your logic and testing it, then it's unnecessary to load everything; use just a sample until you have it right. If you are running back-tests for many cases and different parameters, then structure your script so that it first loads the data and then loops through the cases, varying the parameters. Again, I might be missing something here, but you did not elaborate much on the logic.
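A rough sketch of that "load once, then loop over the cases" structure, in case it helps. Everything here is made up for illustration: `load_rows`, `run_case`, the `price` column and the thresholds are hypothetical stand-ins for whatever the real script does.

```python
import csv
import io


def load_rows(source):
    """Parse CSV text into a list of dicts; stands in for the expensive load step."""
    return list(csv.DictReader(source))


def run_case(rows, threshold):
    """Hypothetical back-test: count rows whose 'price' exceeds the threshold."""
    return sum(1 for row in rows if float(row["price"]) > threshold)


# A tiny inline sample is used here instead of a large file on disk.
sample = io.StringIO("price\n10.0\n12.5\n9.0\n15.0\n")

# Load once...
rows = load_rows(sample)

# ...then loop over the parameter values without touching the data source again.
results = {threshold: run_case(rows, threshold) for threshold in (9.5, 12.0, 14.0)}
print(results)
```

The point is only that the expensive load sits outside the loop, so adding more cases costs nothing extra in loading time.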

Good post. To answer, I have a dirty piece of code that I like to tweak a lot. Sometimes the tweaks aren't obvious to me and the script goes through several iterations, edit, run, edit, run, edit, run. But since the runs happen on a large dataset, the "quick and dirty code" ends up loading stuff over and over. I figured the part that does the loading should be left in memory and when a program loads, it just finds the loaded stuff, runs on it, finishes up, lets me edit/re-run.

But otherwise, the suggestions in this thread are good. (Very good thread!) I've never evaluated the in-memory database option. That sounds like an interesting alternative as well.
 
You might want to consider moving your Python code over to IronPython so you can open up anything that's available in .Net for your future solutions.
 
I have a similar problem parsing huge csv/txt files with Python.

My solution is to create a "pickled" file of the parsed csv/txt data on the first run. On later runs, if the Python program finds the pickled file, it loads that instead of parsing the original again.

It's not perfect, but it speeds up loading my csv files by a factor of 50-100.
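Something along these lines, as a minimal sketch rather than my exact code. The function name `load_csv_cached` and the `.pkl` suffix are just illustrative choices, and the modification-time check is an extra assumption so that a stale cache gets rebuilt when the csv changes.

```python
import csv
import os
import pickle


def load_csv_cached(csv_path, cache_path=None):
    """Return the parsed rows of csv_path, using a pickle cache when one is available."""
    if cache_path is None:
        cache_path = csv_path + ".pkl"  # illustrative naming convention

    # Use the cache only if it exists and is at least as new as the csv file.
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(csv_path)):
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    # Slow path: parse the csv, then save the parsed result for next time.
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    with open(cache_path, "wb") as f:
        pickle.dump(rows, f, protocol=pickle.HIGHEST_PROTOCOL)
    return rows
```

The first call parses the csv and writes the cache; later calls just unpickle it. Only unpickle files you wrote yourself, since loading a pickle from an untrusted source can execute arbitrary code.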
 
Quote from SteveH:

You might want to consider moving your Python code over to IronPython so you can open up anything that's available in .Net for your future solutions.

Admitting that I'm biased against anything Microsoft is involved with, my question is really: why?
To OPEN myself to a CLOSED, PROPRIETARY environment that's mostly subpar?
 
Quote from byteme:

FYI: As an aside, I suspect PyTables might be a good fit for your use case, though I can only guess what kind of analysis you are trying to perform.

This is the solution I went with. Thank you very much; it has made a big difference for me in my ability to process large datasets.
 