Suppose we have 1000 models. Two classrooms in two different rooms get the same models and the same data.
The first classroom is full of geniuses who know what's up and will test the "clever way": they test all the models on seg1, then test the good-looking ones on seg2 and keep the ones that look good on...
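
Here is a rough sketch of the two-stage filtering described so far, and only that much: test everything on seg1, then retest the survivors on seg2. Everything specific in it is an assumption made for illustration and not from the original setup: the function name `sequential_filter`, the pass threshold of 0, and the choice to model every score as pure Gaussian noise (that is, models with no real skill at all).

```python
import random


def sequential_filter(n_models=1000, threshold=0.0, seed=0):
    """Test every model on seg1, then retest only the survivors on seg2.

    Hypothetical setup: scores are independent noise, so no model has
    any real skill. Returns the indices that pass seg1 and the indices
    that pass both seg1 and seg2.
    """
    rng = random.Random(seed)
    # Assumption: one noisy score per model per segment, mean 0, std 1.
    seg1 = [rng.gauss(0.0, 1.0) for _ in range(n_models)]
    seg2 = [rng.gauss(0.0, 1.0) for _ in range(n_models)]

    # Stage 1: every model is tested on seg1.
    passed_seg1 = [i for i in range(n_models) if seg1[i] > threshold]
    # Stage 2: only the models that looked good on seg1 are tested on seg2.
    passed_both = [i for i in passed_seg1 if seg2[i] > threshold]
    return passed_seg1, passed_both


if __name__ == "__main__":
    passed_seg1, passed_both = sequential_filter()
    print("passed seg1:", len(passed_seg1))
    print("passed seg1 and seg2:", len(passed_both))
```

Under these made-up assumptions, a model clears a stage by luck alone about half the time, so the count of survivors is worth comparing against what chance would give you before reading anything into it.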