I'll break this up into 3 separate issues we're addressing:
1) Am I only concerned with processing time (internal latency) and disregarding external latency (me to exchange and exchange to me)?
No, I measure external latency as well. But instead of breaking it up into two separate components (meToExchange and exchangeToMe), I lump it into one measurement (basically the time between OrderSent and OrderAckReceived) that's taken at a single source (me), so there's no clock-sync problem. This way, there are steps I know I can take to reduce both internal latency (writing more efficient code) and external latency (optimizing my interaction with the exchange).
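To make the single-source measurement concrete, here's a rough sketch in Python. The class name, the method names and the idea of keying on an order id are all just assumptions for illustration, not how any particular system does it; the only point is that both timestamps come from the same local clock.

```python
import time


class RoundTripTimer:
    """Hypothetical helper: times OrderSent -> OrderAckReceived on one local clock."""

    def __init__(self, clock=time.perf_counter_ns):
        self._clock = clock      # monotonic local clock, in nanoseconds
        self._sent = {}          # order_id -> local send timestamp
        self.samples_ns = []     # completed round-trip measurements

    def on_order_sent(self, order_id):
        # Stamp the order as it leaves our process.
        self._sent[order_id] = self._clock()

    def on_ack_received(self, order_id):
        # Stamp the ack on the same clock; ignore acks we never timed.
        sent = self._sent.pop(order_id, None)
        if sent is None:
            return None
        rtt = self._clock() - sent
        self.samples_ns.append(rtt)
        return rtt
```

Watching the distribution of `samples_ns` over the day is also a cheap way to spot network trouble, since a jump in round-trip time with unchanged processing time points at the external leg.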
2) How do I know/guess at when events happened in 'real' time at the exchange? Or in other words, how do I know the staleness of data I'm receiving and processing?
One close guess, which might be good enough for you, is to look at the lag between the time you receive a fill and the time it's reflected in the pricefeed. Add half a round-trip time to that and that gives you a pretty decent approximation of how long ago events happened that you're just seeing now. You're making a few assumptions (namely that the lag is consistent, and FillNotification-->PublicLastTrade receipt lag is indicative of the lags in other market data as well), but it's certainly a start.
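As a sketch of that arithmetic (function and parameter names are made up for illustration, and all times are assumed to be in microseconds on your own local clock):

```python
from statistics import median


def estimate_staleness_us(fill_recv_us, last_trade_recv_us, round_trip_us):
    """Rough age of public market data at the moment you receive it.

    Assumes the one-way exchange-to-you delay is about half the round trip,
    and that the fill -> public last-trade lag is typical of the whole feed.
    """
    feed_lag_us = last_trade_recv_us - fill_recv_us
    return feed_lag_us + round_trip_us / 2


def typical_staleness_us(samples):
    """Median estimate over (fill_recv, last_trade_recv, round_trip) tuples."""
    return median(estimate_staleness_us(*s) for s in samples)
```

For example, a fill received at t=1000 that shows up in the pricefeed at t=1180, with a 400 micro round trip, gives an estimated staleness of 180 + 200 = 380 micros.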
3) How do I decide if a signal is good any more if I can't judge its true age?
Simply put, I take the time I receive the signal as the starting time for my signal analysis. So if I think I have a good signal that tells me to hit the market whenever I receive signal X, and I can effectively model whether my order will get filled, I analyze the PNL/predictivity of signal X that way, without caring about the actual time at which X occurred on the exchange.
In your example, if you know you have X micros after Event1 happened on the exchange where it's predictive, can't you change your 'happened on the exchange' to 'received the information'? If your basis is that the time from an event happening to market data receipt is inconsistent, and you need a good model of that inconsistency, then I'd understand where you're coming from, and there are exchanges where that applies. If that's the case, I'd try to model the lags of the signals you're interested in by simulating them yourself. For example, if your signal is based on LastTrades, look at the Fill-->LastTrade lag that I mentioned earlier. If your signal is based on some big size entering the book, look at the lag between the time you send an order and the time it's reflected in the pricefeed. If you give me a specific example we can throw some ideas around; it could be a learning experience for me too.
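If you do go down the road of modeling the inconsistency, a first pass could be as simple as collecting your own send-to-pricefeed lags and looking at the spread between the typical and the worst case. This is only a sketch: the function names and the choice of nearest-rank percentiles are my assumptions, and it expects a non-empty list of samples.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def lag_profile(send_times_us, feed_times_us):
    """Summarize how long your own orders take to appear in the pricefeed.

    Both lists are local-clock timestamps in microseconds, paired by order.
    """
    lags = [feed - sent for sent, feed in zip(send_times_us, feed_times_us)]
    p50 = percentile(lags, 50)
    p99 = percentile(lags, 99)
    # A wide p99 - p50 gap suggests the lag is too inconsistent to ignore.
    return {"p50": p50, "p99": p99, "spread": p99 - p50}
```

If the spread is small relative to your X micros window, treating receipt time as the signal time costs you little; if it's large, that's the case where the lag itself needs modeling.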
So if TOTAL_LATENCY = EXCHANGE_TO_YOU + PROCESSING_TIME + YOU_TO_EXCHANGE, you only bother measuring PROCESSING_TIME (if I'm understanding correctly). I'm sure much of what you do is aimed at reducing the processing figure, but without constant monitoring of the other two (especially EXCHANGE_TO_YOU), how do you know if you're having network problems or the like? Perhaps a better question: how do you know that a signal is still valid and not stale? For example, say I trade a simple model where after Event1 on the exchange I have X micros to get my order on the book; after this point, positive expectancy is gone. If you don't know how long it's been since Event1 occurred, how can you determine whether sending the order makes sense?
I realize that you can get TOTAL_LATENCY figures by receiving a timestamped exchange event, sending an order in response, and then taking the difference in exchange timestamps for your order hitting the book and the original event. However, that is still after the fact and doesn't isolate each element of non-processing time.
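A rough sketch of that after-the-fact calculation, with made-up names and assuming both exchange timestamps come from the same exchange clock while PROCESSING_TIME is measured locally:

```python
def split_total_latency_us(event_exch_ts_us, order_exch_ts_us, processing_us):
    """Return (total, network) latency in microseconds.

    total   = exchange timestamp of the order hitting the book
              minus exchange timestamp of the original event
    network = EXCHANGE_TO_YOU + YOU_TO_EXCHANGE combined; the two legs
              cannot be separated from these inputs alone.
    """
    total = order_exch_ts_us - event_exch_ts_us
    network = total - processing_us
    return total, network
```

With an event stamped at 1,000,000, the order on the book at 1,000,450 and 50 micros of processing, this gives a total of 450 and a combined network figure of 400, with no way to tell how the 400 divides between the inbound and outbound legs.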
Clearly, I'm asking because these are things I contend with in my trading (albeit on a much slower scale). What's surprising to me is that I would have thought they were much more critical in your business...and yet you seem to ignore them entirely, ha.