I am inexperienced with databases; this will be my first large database build. I built a tick database in MySQL last year using tick data from Integral's TrueFX. It grew to about 250 GB uncompressed for the 5 majors over about 3 years of data, so I was operating under the assumption that this database would grow much bigger. I've spoken to some people who have built similar systems professionally, and they've pretty unanimously told me that this is a large-scale endeavor. Some of the companies I've spoken to have petabytes of tick data, so I was led to believe, anecdotally, that this would be a large database. Admittedly, I didn't partition my tables or anything, but my MySQL implementation was unusably slow.
Again, I appreciate the comment, albeit a slightly condescending one, but no one here has really answered the original question of how best to design the database. I've since come up with a schema that I believe works, but I'm sure you'll agree that it's rather quixotic to respond to my question by saying "it's basic", then not answer it, and instead add to the debate over what tech stack I should use. I know my tech stack already: I use what I believe will make me marketable for jobs in this field. I've agreed to run some of the experiments suggested above, but otherwise I'm really not interested in discussing what the optimal language would be here. I don't really care if this database takes me longer to build; I care about not looking like a total idiot in prop trading job interviews...which, believe me, is a way bigger task than building a database.
I don't mean to sound like I'm personally attacking you, but I'm getting kinda tired of the pervasive motifs of "who's smarter", "who's the better trader", and of course "language X is better than language Y, and no one on this site has any business using language Z."
Again, not a personal insult; you've actually offered more constructive feedback than most of the answers I've gotten. But let's be honest...if I messaged you on Quora/QuantNet/LinkedIn/NuclearPhynance, asked you the same question, and offered to buy you a coffee to discuss it, you'd treat me differently. I'm not sure where that gets lost here, but maybe it's me being quixotic this time. Anyway, thanks for your feedback.
To anyone still following this thread regarding my tests: I'm still working on this project, and so far 32-bit kdb+ has been substantially faster. I've partitioned my tables, so the 4 GB limit hasn't really been a major limitation yet. I'll post more empirical findings later.