Are two samples from same population?

abattia · Nov 17, 2011

Quote from kut2k2:
My point is that price is constantly changing, so how can you assume the only significant difference between t1 and t2 is the presence or absence of some event? Are p1 and p2 different? Apparently so, or you would have said otherwise. Did the price arrive at p1 and p2 from the same velocity, or even from the same direction? There are so many unknowns here I don't know how you can begin to treat them the same.

Even your title says as much: you can't be sure the two samples are from the same population. Price series are not stationary, which is what makes analyzing them so much "fun". *tonguefirmlyincheek*

Sorry, your serious point deserves a serious answer...

I would assume the same criticism might be applied to testing two groups of patients for the effectiveness of a new treatment (apparently that is, or at least has been, a field of application of the two-sample K-S test). After all, you could say that the two groups of patients will also be different in so many other ways (i.e. not just in whether they get the new treatment or just the placebo) that you won't be able to isolate the difference you are looking to focus on. No?

Intuitively, I think the response is that if your two groups of patients (or samples of price) are large enough, then you get to the point that both samples contain examples of "all" the things that could be different, and (if you have a big enough sample), the only difference that you end up with is the thing you know about (because you set it up that way). No?
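
The intuition above can be checked with a small simulation. This is only a sketch under a stated assumption: subjects are assigned to the two groups at random, and each carries one hidden factor (a standard normal number, invented here for illustration). The function name `covariate_gap` is hypothetical. It measures how far apart the two groups end up on that hidden factor as the sample grows.

```python
import random

def covariate_gap(n, rng):
    """Randomly split n subjects into two groups and return the absolute
    difference between the group means of a hidden covariate."""
    hidden = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed hidden factor
    rng.shuffle(hidden)                               # random assignment
    half = n // 2
    treated, control = hidden[:half], hidden[half:]
    return abs(sum(treated) / len(treated) - sum(control) / len(control))

rng = random.Random(0)
average_gap = {}
for n in (20, 200, 2000, 20000):
    gaps = [covariate_gap(n, rng) for _ in range(50)]
    average_gap[n] = sum(gaps) / len(gaps)
    print(n, round(average_gap[n], 4))
```

The average gap shrinks roughly like 1/sqrt(n), which is the sense in which large groups "contain examples of all the things that could be different". Note that this relies on the random assignment, which is available with patients but not automatically with price samples taken at different times.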

intradaybill · Nov 17, 2011

Quote from abattia:

Sorry, your serious point deserves a serious answer...

I would assume the same criticism might be applied to testing two groups of patients for the effectiveness of a new treatment (apparently that is, or at least has been, a field of application of the two-sample K-S test). After all, you could say that the two groups of patients will also be different in so many other ways (i.e. not just in whether they get the new treatment or just the placebo) that you won't be able to isolate the difference you are looking to focus on. No?

Intuitively, I think the response is that if your two groups of patients (or samples of price) are large enough, then you get to the point that both samples contain examples of "all" the things that could be different, and (if you have a big enough sample), the only difference that you end up with is the thing you know about (because you set it up that way). No?

I suggest you put this joker on ignore. As you see, he did not even try to answer your question. Instead he tried to stir up controversy for the purpose of instigating ad hominem attacks later and turning threads into insults.

He should know that when a police radar catches someone speeding, the ticket is the same regardless of whether the driver is on a vacation trip, going to work, living nearby, or travelling with his daughter and son. He belongs to the population of those who exceed the speed limit, and that is all that counts. It does not matter whether he started the trip an hour ago or 10 minutes ago.

So basically, whether two samples belong to the same population depends on the measurement objectives and not on their specific properties that may be irrelevant to those objectives.

"The word big is small". It is sampled from the set of small words. You don't care what it says, you only care how many characters it contains.

In other words, we do not look at the river but at things caused by the river which we can measure. A "river" is an abstract concept. Rivers are not measured dynamically, except geometrically. It is the motion of particles in the river that can be measured. These particles obey the same laws. Their motion is governed by the Navier-Stokes equations. If the boundary conditions are the same, the motions will on average be the same, resulting in the same pressure, temperature, etc.

Relating this to the markets, there are periods when the same conditions prevail. If we know that something happened after a given set of conditions, let us say a correction, then the next time we measure the same conditions we can have some level of confidence that a correction will happen.

Matching distributions is essentially another way of pattern recognition. It is like matching signatures in signal processing.

DontMissTheBus · Nov 17, 2011

I would argue it's more difficult, since you have to match ALL the moments of the distributions rather than just some parameters?

For example, can you really distinguish between the realizations of a t-distribution with large degrees of freedom from that of a normal?
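
To make the question concrete, here is a rough sketch of the two-sample K-S statistic written from its definition (the largest gap between the two empirical CDFs), applied to simulated draws from a normal and from a t-distribution with 30 degrees of freedom. The helper names (`ks_statistic`, `ks_critical_value`, `student_t`), the sample size, and the choice of 30 degrees of freedom are assumptions made for illustration. The critical value uses the standard large-sample approximation at the 5% level.

```python
import bisect
import math
import random

def ks_statistic(a, b):
    """Two-sample K-S statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)  # empirical CDF of a at x
        fb = bisect.bisect_right(b, x) / len(b)  # empirical CDF of b at x
        d = max(d, abs(fa - fb))
    return d

def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation to the rejection threshold at level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))

def student_t(df, rng):
    """One draw from a t-distribution: normal / sqrt(chi-square / df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) as a gamma variate
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)

rng = random.Random(42)
n = 2000
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
t_sample = [student_t(30, rng) for _ in range(n)]

d = ks_statistic(normal_sample, t_sample)
print("D =", round(d, 4), "critical value =", round(ks_critical_value(n, n), 4))
```

With samples of this size, the statistic will usually sit below the critical value, so the test typically fails to tell the two apart. The differences are in the tails, where the K-S test is least sensitive.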

Quote from intradaybill:

Matching distributions is essentially another way of pattern recognition. It is like matching signatures in signal processing.

intradaybill · Nov 17, 2011

Quote from DontMissTheBus:

I would argue it's more difficult, since you have to match ALL the moments of the distributions rather than just some parameters?

For example, can you really distinguish between the realizations of a t-distribution with large degrees of freedom from that of a normal?

You stated the big problem.

kut2k2 · Nov 17, 2011

Quote from abattia:

Sorry, your serious point deserves a serious answer...

I would assume the same criticism might be applied to testing two groups of patients for the effectiveness of a new treatment (apparently that is, or at least has been, a field of application of the two-sample K-S test). After all, you could say that the two groups of patients will also be different in so many other ways (i.e. not just in whether they get the new treatment or just the placebo) that you won't be able to isolate the difference you are looking to focus on. No?

Intuitively, I think the response is that if your two groups of patients (or samples of price) are large enough, then you get to the point that both samples contain examples of "all" the things that could be different, and (if you have a big enough sample), the only difference that you end up with is the thing you know about (because you set it up that way). No?

I think the key difference is that a population of individuals is sampled simultaneously, while a time series can only be sampled sequentially. True, the patients are all different from each other, but with large enough samples in both the treated group and the control group, it's reasonable to assume that the main difference in the course of their illness is due to the presence or absence of the treatment. In contrast, with the price series you're really looking at a single test specimen that changes over time. The assumption that this specimen can be modeled by a single fixed probability distribution is a huge unproven one. Every "moment" of the "distribution" is a variable over time.

Perhaps a better approach would be to look at a group of similar securities simultaneously and try to see if they exhibit similar behaviors in the presence or absence of the event of interest. A lot more work I know but that may be the only way to validly answer your question. Sorry.
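
The point about moments drifting over time can be illustrated with a toy series. This is a sketch on made-up data, not real prices: the returns below are simulated with a volatility that is assumed to jump from 1% to 3% halfway through, and `windowed_moments` is a hypothetical helper that computes the mean and standard deviation over consecutive non-overlapping windows.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical returns: volatility jumps from 1% to 3% halfway through.
returns = ([rng.gauss(0.0, 0.01) for _ in range(1000)]
           + [rng.gauss(0.0, 0.03) for _ in range(1000)])

def windowed_moments(xs, window):
    """Mean and sample standard deviation over consecutive non-overlapping windows."""
    out = []
    for start in range(0, len(xs) - window + 1, window):
        chunk = xs[start:start + window]
        out.append((statistics.mean(chunk), statistics.stdev(chunk)))
    return out

moments = windowed_moments(returns, 250)
for i, (mean, stdev) in enumerate(moments):
    print(i, round(mean, 4), round(stdev, 4))
```

Two samples drawn from the early and late windows would differ in distribution whether or not any event of interest occurred, which is why a rejection from a two-sample test on sequential price data is hard to attribute to the event alone.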

DontMissTheBus · Nov 17, 2011

... this is.... um.... nonsense rooted in ignorance of what a probability actually is.

In any case, much like his earlier comment about water, it's irrelevant.

Back to the OP - I know we had a discussion about .NET a few days ago. If you are going down the line of these statistical inquiries, might I suggest that MATLAB (if you can afford a license), R (it's free), or Python (also free) might give you more tools out of the box?

Quote from kut2k2:

The assumption that this specimen can be modeled by a probability distribution is a huge unproven one. Every "moment" of the "distribution" is a variable over time.

backblackdaddy · Nov 17, 2011

10,000 hours in front of live charts, no shortcuts.. then leave the math behind

Algo_Design_Kid · Nov 17, 2011

Quote from DontMissTheBus:

I would argue it's more difficult, since you have to match ALL the moments of the distributions rather than just some parameters?

For example, can you really distinguish between the realizations of a t-distribution with large degrees of freedom from that of a normal?

Is this not a 2 sample z-test?

R. Raskolnikov · Nov 17, 2011

I agree with you here Trader28

Quote from backblackdaddy:

10,000 hours in front of live charts, no shortcuts.. then leave the math behind

backblackdaddy · Nov 17, 2011

The real question is: are these two samples from the same population?