Standard deviation for irregularly spaced timeseries

kroxobor · Jan 11, 2021

Hi fellows,

Suppose I have a collection of irregularly spaced time series (tick data with time_obs(i+1) - time_obs(i) ranging from 2ms to 20ms). I want to calculate 1-minute standard deviation of returns.

The question is: should the number of observations be the same in every 1-minute sample? For example, the first minute contains 50 observations and the second minute has 100 observations. After calculating the standard deviation for both, is it legitimate to claim that price volatility in the first minute is higher than in the second because std1 > std2?

Thanks in advance.
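
To make the 50-vs-100 comparison concrete, here is a minimal sketch in Python. It assumes prices follow a driftless Brownian motion with the same volatility in both minutes and that ticks are equally spaced within each minute; both are illustration-only assumptions, and the function names are hypothetical. Under those assumptions the standard deviation of tick returns shrinks as the tick count grows, while the square root of the summed squared returns (realized volatility) stays on the same per-minute scale.

```python
import math
import random
import statistics

def simulate_minute(n_ticks, sigma_per_minute, rng):
    """Log returns for n_ticks equally spaced ticks within one minute.

    Assumption: driftless Brownian motion, so each tick return has
    standard deviation sigma_per_minute * sqrt(1 / n_ticks).
    """
    dt = 1.0 / n_ticks  # fraction of the minute covered by one tick
    return [rng.gauss(0.0, sigma_per_minute * math.sqrt(dt)) for _ in range(n_ticks)]

def tick_std(returns):
    """Sample standard deviation of the individual tick returns."""
    return statistics.stdev(returns)

def realized_vol(returns):
    """Square root of the sum of squared returns over the window."""
    return math.sqrt(sum(r * r for r in returns))

rng = random.Random(0)
sigma = 0.001  # same true per-minute volatility in both minutes
minute_1 = simulate_minute(50, sigma, rng)
minute_2 = simulate_minute(100, sigma, rng)

for name, rets in (("minute 1", minute_1), ("minute 2", minute_2)):
    print(name, len(rets), "ticks",
          "tick std:", tick_std(rets),
          "realized vol:", realized_vol(rets))
```

With only 50 or 100 ticks both estimates are noisy, so a single pair of minutes will not line up exactly; the point is only that std1 > std2 on raw tick returns can come from the tick count rather than from volatility.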

tradeking007yahoo · Jan 11, 2021

You don't have a time series. Garbage in, garbage out.

globalarbtrader · Jan 11, 2021

kroxobor said:
Hi fellows,

Suppose I have a collection of irregularly spaced time series (tick data with time_obs(i+1) - time_obs(i) ranging from 2ms to 20ms). I want to calculate 1-minute standard deviation of returns.

The question is: should the number of observations be the same in every 1-minute sample? For example, the first minute contains 50 observations and the second minute has 100 observations. After calculating the standard deviation for both, is it legitimate to claim that price volatility in the first minute is higher than in the second because std1 > std2?

Thanks in advance.

https://quant.stackexchange.com/que...of-a-sample-when-points-are-irregularly-space

GAT

kroxobor · Jan 11, 2021

tradeking007yahoo said:
You don't have a time series. Garbage in, garbage out.

Yes, and I want to make a time series from this garbage. Any thoughts?

MarkBrown · Jan 11, 2021

Use fill data.

Repeat the exact same last bar until new data replaces it. The purpose is to fill every time slot with a value, so you make one up. But you don't want the made-up value to influence anything, so make it the same as the last one. Logical?
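
One way to read the fill suggestion above, as a minimal sketch: sample the last observed price on a fixed grid and take returns on that grid. The function names, the millisecond timestamps, and the 5 ms grid are illustrative assumptions, not something specified in the post.

```python
import math
from bisect import bisect_right

def forward_fill(ticks, start_ms, end_ms, step_ms):
    """Last observed price at each grid time in [start_ms, end_ms).

    ticks: list of (time_ms, price) pairs sorted by time.
    Grid points before the first tick have no price yet and get None.
    """
    times = [t for t, _ in ticks]
    filled = []
    for g in range(start_ms, end_ms, step_ms):
        i = bisect_right(times, g)  # number of ticks with time <= g
        filled.append(ticks[i - 1][1] if i > 0 else None)
    return filled

def grid_returns(prices):
    """Log returns between consecutive grid prices, skipping unfilled slots."""
    return [math.log(b / a)
            for a, b in zip(prices, prices[1:])
            if a is not None and b is not None]

ticks = [(0, 100.0), (3, 100.5), (15, 100.2)]  # made-up example ticks
grid = forward_fill(ticks, 0, 20, 5)           # grid times 0, 5, 10, 15 ms
print(grid)
print(grid_returns(grid))
```

Every slot that was filled by repetition contributes a return of exactly zero, so the finer the grid relative to the tick spacing, the more zeros enter the standard deviation.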

stochastix · Jan 11, 2021

kroxobor said:
Hi fellows,

Suppose I have a collection of irregularly spaced time series (tick data with time_obs(i+1) - time_obs(i) ranging from 2ms to 20ms). I want to calculate 1-minute standard deviation of returns.

The question is: should the number of observations be the same in every 1-minute sample? For example, the first minute contains 50 observations and the second minute has 100 observations. After calculating the standard deviation for both, is it legitimate to claim that price volatility in the first minute is higher than in the second because std1 > std2?

Thanks in advance.

You should not be using time series methods at all. Use the point process paradigm, where points are described by their inter-event durations.
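
As a rough sketch of the data preparation this implies (not a point-process model in itself), each tick can be turned into a pair of inter-event duration and return. The helper names and the duration-weighted variance rate are assumptions added for illustration.

```python
import math

def to_marked_events(ticks):
    """Convert (time_ms, price) ticks into (duration_ms, log_return) pairs."""
    events = []
    for (t0, p0), (t1, p1) in zip(ticks, ticks[1:]):
        events.append((t1 - t0, math.log(p1 / p0)))
    return events

def variance_rate_per_ms(events):
    """Sum of squared returns divided by total elapsed time.

    Normalizing by elapsed time rather than by the number of ticks
    keeps windows with different tick counts on the same scale.
    """
    total_time = sum(d for d, _ in events)
    return sum(r * r for _, r in events) / total_time

ticks = [(0, 100.0), (4, 101.0), (10, 100.0)]  # made-up example ticks
events = to_marked_events(ticks)
print([d for d, _ in events])        # inter-event durations in ms
print(variance_rate_per_ms(events))  # squared return per ms
```

The durations themselves are the object a point-process model would describe; the variance rate here is just one summary that does not require a fixed sampling grid.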

stochastix · Jan 11, 2021

MarkBrown said:
Use fill data.

Repeat the exact same last bar until new data replaces it. The purpose is to fill every time slot with a value, so you make one up. But you don't want the made-up value to influence anything, so make it the same as the last one. Logical?

You might be a good trader, Mark, but that's not good advice. The term for this in statistics is censoring.

MarkBrown · Jan 11, 2021

stochastix said:
You might be a good trader, Mark, but that's not good advice. The term for this in statistics is censoring.

Let's say you have a series that updates every minute and a series that updates every five minutes. If you use the synthetic data fill method I described to avoid the gaps, there is nothing "impacting" the price to censor it.

stochastix · Jan 11, 2021

I would be very concerned about a feed that updated at a fixed frequency. See https://vixra.org/abs/1211.0094
The problem is this: 1) any selection of frequency throws away information.

MarkBrown · Jan 11, 2021

stochastix said:
I would be very concerned about a feed that updated at a fixed frequency. See https://vixra.org/abs/1211.0094
The problem is this: 1) any selection of frequency throws away information.

Well, that is true, but when you are trying to decipher data, smoothed is sometimes better than raw for observation.