There is no general math section on ET, so I am posting this here, since this seems to be where a lot of the math talk happens.
Say you have a finite number n of events, x1, x2, ..., xn, that can occur on any given day. On any given day, any combination of these events can occur: x1 & x3 & x7, or x1 & x3 with no x7, or x2 & x3 & x6, or just x10. By the way, each x is a structure that contains statistical data, and every x has identical fields.
Say you are pretty certain that the variable you are trying to predict, call it Y, is influenced by these events, but you don't know in what measure. What is the right way to analyse this situation statistically? Factor analysis, PCA, bootstrap, etc.? The real question, imo, is how to aggregate the events.
If I just measure/analyse the xs individually, I am ignoring the fact that other xs occurred on the same day (conditional probability), and I may be assigning too high or too low a probability to each event. So it seems that the xs should be joined into one overall x per day, where the overall x, OAX = the sum of all x events that occurred on that day, and then I would run the statistics (whatever hypotheses I am testing) on this aggregate. The problem is, who is to say that means, standard deviations, or any other statistic remain valid under this kind of aggregation? I could just sum the fields, I suppose, but what if the scales are different in each?
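To make the aggregation idea concrete, here is a minimal Python sketch of the plain field-by-field sum per day. The record layout and the field names `count` and `volume` are made up for illustration; they are assumptions, not my real data.

```python
from collections import defaultdict

# Hypothetical records: (day, event name, fields). Every event carries the
# same fields; the field names and numbers are invented for illustration.
records = [
    ("day1", "x1", {"count": 2.0, "volume": 1000.0}),
    ("day1", "x3", {"count": 1.0, "volume": 4000.0}),
    ("day1", "x7", {"count": 3.0, "volume": 500.0}),
    ("day2", "x1", {"count": 1.0, "volume": 2000.0}),
    ("day2", "x3", {"count": 4.0, "volume": 1500.0}),
    ("day3", "x10", {"count": 2.0, "volume": 3000.0}),
]


def aggregate_by_day(records):
    """Sum each field over all events that occurred on the same day."""
    totals = defaultdict(lambda: defaultdict(float))
    for day, _event, fields in records:
        for name, value in fields.items():
            totals[day][name] += value
    return {day: dict(fields) for day, fields in totals.items()}


oax = aggregate_by_day(records)
for day in sorted(oax):
    print(day, oax[day])
```

In this toy data the `volume` totals are in the thousands while the `count` totals are single digits, which is exactly the scale problem: any statistic that combines the two fields would be dominated by `volume`.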
I guess I could take the moments of each, and then divide the key numbers by the standard deviation (the square root of the second central moment), giving me a dimensionless number that I can then add up across the events within the day. I am just not sure whether this is GIGO...
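A rough sketch of that scaling step in Python, under a loud assumption: the standard deviation is computed per field, pooled over all recorded events, which is only one possible reading of "the moments of each". The record layout and the field names `count` and `volume` are again hypothetical.

```python
from collections import defaultdict
from statistics import pstdev

# Hypothetical records: (day, event name, fields); invented for illustration.
records = [
    ("day1", "x1", {"count": 2.0, "volume": 1000.0}),
    ("day1", "x3", {"count": 1.0, "volume": 4000.0}),
    ("day1", "x7", {"count": 3.0, "volume": 500.0}),
    ("day2", "x1", {"count": 1.0, "volume": 2000.0}),
    ("day2", "x3", {"count": 4.0, "volume": 1500.0}),
    ("day3", "x10", {"count": 2.0, "volume": 3000.0}),
]


def field_stds(records):
    """Population standard deviation of each field, pooled over all events.

    Assumption: pooling across events is acceptable. A per-event standard
    deviation would need several observations of each event.
    """
    values = defaultdict(list)
    for _day, _event, fields in records:
        for name, value in fields.items():
            values[name].append(value)
    return {name: pstdev(vals) for name, vals in values.items()}


def scaled_daily_score(records):
    """Divide each field by its standard deviation, then sum within each day."""
    stds = field_stds(records)
    scores = defaultdict(float)
    for day, _event, fields in records:
        for name, value in fields.items():
            if stds[name] > 0:  # skip constant fields to avoid dividing by zero
                scores[day] += value / stds[name]
    return dict(scores)


scores = scaled_daily_score(records)
for day in sorted(scores):
    print(day, round(scores[day], 3))
```

The resulting score no longer depends on the units of any field, so changing `volume` from units to thousands leaves it unchanged. It still assumes that the fields and events contribute additively and with equal weight, which is the part I am unsure about.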
I hope this is clear. I realize this may require more explanation...