I agree that out-of-sample testing is only a partial solution. The more statistically savvy may correct me, but what you are describing is a mass-univariate approach to data mining: every candidate signal is tested on its own, one at a time. That is limited by the assumptions of your model, since each test ignores how the signals relate to one another. The fundamental correlations proposition mentioned above is one step towards a multivariate approach, which can capture joint variability that a mass-univariate approach misses. You could try Bonferroni or a similar multiple-comparison correction to filter out many of the chance signals, but the bottom line remains:
You cannot get more out of the data than the tools and assumptions of your model allow. The reverse approach might offer some edge: first you notice a pattern yourself (wild example: the price of oil correlated with the number of topless females on beach A, and how that behaves in winter versus summer), then you data-mine that pattern together with other patterns in a multivariate approach.
That is why I think computers will never take the human factor out of the equation. Computers can compute models, but they cannot develop the models themselves. That is up to us (thankfully).
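To make the Bonferroni point concrete, here is a rough sketch rather than a recipe. It assumes pure-noise signals, a made-up sample size of 250 observations, and a normal approximation (Fisher z-transform) for the correlation p-value; the names `pearson_r`, `approx_p_value` and `count_hits` are my own for illustration. It tests 1000 random signals against a random target one at a time and counts how many look "significant" with and without the correction.

```python
import math
import random


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for r via the Fisher z-transform.

    This is a normal approximation (an assumption of this sketch),
    not an exact t-test.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def count_hits(n_signals=1000, n_obs=250, alpha=0.05, seed=0):
    """Mass-univariate test of pure-noise signals against a noise target.

    Returns (hits without correction, hits after Bonferroni).
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    p_values = []
    for _ in range(n_signals):
        signal = [rng.gauss(0, 1) for _ in range(n_obs)]
        p_values.append(approx_p_value(pearson_r(signal, target), n_obs))
    uncorrected = sum(p < alpha for p in p_values)
    # Bonferroni: divide the threshold by the number of tests performed.
    bonferroni = sum(p < alpha / n_signals for p in p_values)
    return uncorrected, bonferroni


if __name__ == "__main__":
    uncorrected, bonferroni = count_hits()
    print("chance hits without correction:", uncorrected)
    print("chance hits after Bonferroni:  ", bonferroni)
```

Since every signal here is noise, any hit is a false positive. Without correction you should expect roughly alpha times the number of tests to pass by chance alone, and Bonferroni removes almost all of them. What it cannot do is tell you anything about relationships between the signals, which is the limitation described above.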
my 2cts
George