I use the term data augmentation in a slightly different way, but I get what you mean. I did test feeding in time series (and other input data) in reverse order, and the output and prediction rate turned out to be almost identical. That confirmed for me that the learning algorithm can build associations and assign weights regardless of the direction in which the input series is presented. Obviously, a complete re-ordering of the input data is an entirely different story.
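To make the distinction between reversing and fully re-ordering concrete, here is a minimal NumPy sketch. It is not the test I ran; the array shape (series along rows, time steps along columns) and the helper names `reverse_time`, `shuffle_time` and `neighbour_pairs` are assumptions made up for illustration. It shows that reversal keeps every pair of neighbouring samples next to each other, while a random permutation of time steps generally does not.

```python
import numpy as np

def reverse_time(batch):
    """Reverse every series along the time axis (axis 1)."""
    return batch[:, ::-1]

def shuffle_time(batch, seed=0):
    """Apply one random permutation of the time steps to every series."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(batch.shape[1])
    return batch[:, perm]

def neighbour_pairs(series):
    """Unordered pairs of values that sit next to each other in the series."""
    return {frozenset(pair) for pair in zip(series[:-1], series[1:])}

# Toy data: 2 series with 6 time steps each (hypothetical values).
batch = np.arange(12).reshape(2, 6)

reversed_batch = reverse_time(batch)
shuffled_batch = shuffle_time(batch)

# Reversal only flips direction, so the neighbour relationships survive.
print(neighbour_pairs(batch[0]) == neighbour_pairs(reversed_batch[0]))

# A full re-ordering keeps the same values but usually breaks the neighbourhoods.
print(neighbour_pairs(batch[0]) == neighbour_pairs(shuffled_batch[0]))
```

Whether a model trained on the reversed data scores the same as on the original still depends on the data and the model; the sketch only illustrates why the two transformations are not comparable.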
I am still new to this, but my understanding is that a CNN is in fact a process that starts with data filtering, feature selection and data reduction in the convolution and pooling stages. Then the NN part kicks in and does its magic. Is the first part of the CNN really responsible for its better performance compared with other types of NN? Why would it need extra data augmentation before that, and what would be the advantage of doing it?
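For reference, this is roughly what I mean by the first part, written as a minimal 1D NumPy sketch. The functions `conv1d_valid` and `max_pool1d`, the toy signal and the hand-picked kernel are all assumptions for illustration; in a real CNN the kernel values are learned during training rather than chosen by hand, and there are usually many kernels and several layers.

```python
import numpy as np

def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal (no padding) and take dot products."""
    k = len(kernel)
    n_out = len(signal) - k + 1
    return np.array([np.dot(signal[i:i + k], kernel) for i in range(n_out)])

def max_pool1d(x, size):
    """Keep the maximum of each non-overlapping window of `size` samples."""
    usable = len(x) - len(x) % size  # drop a trailing partial window
    return x[:usable].reshape(-1, size).max(axis=1)

# Toy input with two rising edges (hypothetical data).
signal = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0])

# Hand-picked filter that responds to a rise from one sample to the next.
kernel = np.array([-1.0, 1.0])

# Filtering step: convolution followed by ReLU keeps only positive responses.
features = np.maximum(conv1d_valid(signal, kernel), 0.0)

# Reduction step: pooling halves the length while keeping the strongest responses.
pooled = max_pool1d(features, size=2)

print(features)  # 8 values, nonzero where the signal rises
print(pooled)    # 4 values after pooling
```

The pooled vector is what would then be handed to the ordinary NN layers. Note that this pipeline only transforms whatever input it is given; it does not create new training examples.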
