We've gone through a few decades of optimizing disk access, so there's actually quite a bit of complexity here.
1) You are right that somewhere along the chain, disk accesses will be reordered to reduce overall seek time. The O/S is, however, not the best place to do it, since it doesn't know where the head is.
Tagged Command Queuing (http://en.wikipedia.org/wiki/Tagged_Command_Queuing) is the mechanism that lets the disk itself reorder the requests. Native Command Queuing is the SATA equivalent.
This, however, does not mean that you should throw multiple reads at the disk. Doing so guarantees that you will need to read from two different places in the disk, creating seek times. The best thing you can do for your disk access is to completely eliminate all seek times by ensuring that you read everything in order, from the beginning of the disk to the end. As volpunter has said, defrag will help here. Reading from only one file that is contiguous on the disk means that your disk never has to seek. It just reads from the beginning to the end.
If you MUST have random access, then multiple threads are better, since you'll stack up a large number of outstanding requests at the disk. The disk can then service them in whatever order minimizes head movement, so the average seek time per request goes down.
There is also some caching here that you've experienced. The O/S will use all its spare (and sometimes not so spare) memory to cache files that you've recently accessed. This is why coming back to the same file was so much faster: the O/S didn't need to go to disk at all, it just grabbed the data from memory. Postgres actually leverages this cache a lot. Unlike MSSQL, which manages a large buffer pool of its own, Postgres keeps only a comparatively small buffer cache (shared_buffers) and counts on the O/S to cache the rest.
2) The technology behind SSDs is very different, but it's not magic. If you look through any SSD's specs, you'll notice two interesting numbers for reads: MB/s and IOPS. The IOPS figure is usually for 4K blocks in random order. You'll see that IOPS * 4K generally falls short of the sequential MB/s figure. That's because random access is more expensive than sequential access. In reality, ALL SSD reads pay a fixed access cost (the SSD's version of a seek), including sequential reads. It's just that one line (page) of flash is large, so a single access returns a large amount of data (more than 4K). To get the maximum throughput out of your SSD, you again want to read sequentially.
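To make the IOPS vs. MB/s comparison concrete, here is a small sketch of the arithmetic. The two spec numbers in `main` are made up for illustration and don't come from any real drive; plug in the values from your own SSD's spec sheet.

```java
class SsdSpecMath {
    /** Throughput implied by an IOPS figure at a given block size, in MB/s (1 MB = 10^6 bytes). */
    static double impliedMBps(long iops, int blockBytes) {
        return iops * (double) blockBytes / 1_000_000.0;
    }

    public static void main(String[] args) {
        long randomReadIops = 90_000;   // hypothetical spec-sheet value
        double sequentialMBps = 550.0;  // hypothetical spec-sheet value

        double randomMBps = impliedMBps(randomReadIops, 4096);
        System.out.printf("random 4K reads: %.1f MB/s, sequential: %.1f MB/s%n",
                randomMBps, sequentialMBps);
    }
}
```

With these assumed numbers, the random 4K figure works out to noticeably less than the sequential one, which is the gap described above.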
There's a little twist here. If you MUST have random access anyway, then you DO want multiple threads, just like with spinning disks. The reason, though, is slightly different. SSDs are more often than not arranged as multiple chips, and each chip can handle a number of simultaneous requests. If you can distribute your requests over all the chips, then you get some speedup. In fact, the IOPS figure you see on spec sheets is often measured at high queue depths (a large number of outstanding requests). If you want to achieve the high IOPS the manufacturer has promised, you must issue a lot of random requests, spread all over the disk, in parallel. Larger SSDs usually have more chips, and hence better IOPS.
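One way to keep several random reads outstanding at once is sketched below. It's only an illustration: the class name `ParallelRandomReads`, the 4K block size, and the thread count are my own assumptions, and `main` reads a temp file of zeros rather than real data. It relies on `FileChannel`'s positional `read(ByteBuffer, long)`, which can safely be called from several threads on the same channel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelRandomReads {
    /** Reads blockSize bytes at each offset using a pool of worker threads; returns total bytes read. */
    static long readBlocks(Path file, long[] offsets, int blockSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            List<Future<Integer>> pending = new ArrayList<>();
            for (long offset : offsets) {
                // Each task is one outstanding request; many tasks keep the device queue full.
                pending.add(pool.submit(() -> {
                    ByteBuffer buf = ByteBuffer.allocate(blockSize);
                    int total = 0;
                    while (buf.hasRemaining()) {
                        int n = ch.read(buf, offset + total); // positional read, no shared file pointer
                        if (n < 0) break;                     // hit end of file
                        total += n;
                    }
                    return total;
                }));
            }
            long sum = 0;
            for (Future<Integer> f : pending) {
                sum += f.get(); // wait for every read before the channel is closed
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("random-read-demo", ".bin");
        Files.write(tmp, new byte[1 << 20]);      // 1 MiB of zeros as stand-in data
        Random rnd = new Random();
        long[] offsets = new long[64];
        for (int i = 0; i < offsets.length; i++) {
            offsets[i] = 4096L * rnd.nextInt(256); // random 4K-aligned offsets inside the file
        }
        System.out.println("bytes read: " + readBlocks(tmp, offsets, 4096, 8));
        Files.delete(tmp);
    }
}
```

Whether this beats a single thread depends on the device and on how much of the file is already sitting in the O/S cache, so measure it on a cold cache before trusting the numbers.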
Binary File) Binary files significantly reduce the work that the CPU has to do to read files. Parsing a string means doing a few operations for each character, often more than 100 instructions to read a single number. If the number is already in the machine's native format, the CPU can simply copy it into memory and call it a day: one load instruction. A binary file also has the added benefit of being much denser than a string representation, so you get more numbers out of every byte you read.
From your tests, it looks like your bottleneck is more in the parsing than in the disk seeks, with the exception of the multiple "A" files, where you're really thrashing your disk. You will seriously benefit from a binary file representation. Lower-level languages like C/C++ are really good at this, as you can simply copy memory straight to disk and call it a day. If you use a higher-level language or want more portability, you'll unfortunately need a few more instructions to serialize into a network representation before storing to disk. Still, this is probably tens of instructions as opposed to hundreds.
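As a rough sketch of what a binary representation can look like in Java, the following writes an array of doubles with `DataOutputStream` and reads it back with `DataInputStream`. Both use big-endian (network) byte order. The class name `BinaryPrices` and the file layout (an int count followed by the values) are assumptions made up for this example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class BinaryPrices {
    /** Writes a count followed by each value as 8 raw bytes; no text formatting involved. */
    static void write(Path file, double[] prices) {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(prices.length);
            for (double p : prices) {
                out.writeDouble(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the values back; each one is a fixed-size read instead of a string parse. */
    static double[] read(Path file) {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            double[] prices = new double[in.readInt()];
            for (int i = 0; i < prices.length; i++) {
                prices[i] = in.readDouble();
            }
            return prices;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("prices", ".bin");
        write(tmp, new double[] {1.25, 2.5, 3.75});
        System.out.println(Arrays.toString(read(tmp)) + " in " + Files.size(tmp) + " bytes");
        Files.delete(tmp);
    }
}
```

Every value takes exactly 8 bytes on disk, so the file size is predictable, and you could seek straight to the n-th value if you ever needed to.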
I ran some more tests and it's safe to say that I'm just making an ass out of myself now. I'm afraid to declare anything at this point, but in direct contradiction to my prior posts I think my issues in order are string parsing/object creation, file read time, then file seek time. Here are file read time results which I think demonstrate that assertion (3 runs for each test - completion time in milliseconds):
Symbol "A"
Method 1 [Read one large file from SSD and parse text -> create objects at each line]: 1899, 1899, 1875
Method 2 [Read one large file from HDD and parse text -> create objects at each line]: 1860, 1870, 1895
Method 3 [Read many small files from HDD and parse text -> create objects at each line]: 146590, 2259, 2061
Symbol "WLP"
Method 1 [Read one large file from SSD and parse text -> create objects at each line]: 2148, 2175, 2218
Method 2 [Read one large file from HDD and parse text -> create objects at each line]: 2098, 2120, 2184
Method 3 [Read many small files from HDD and parse text -> create objects at each line]: 2329, 2350, 2295
Symbol "WLP"
Method 1 [Read one large file from SSD and no object creation]: 247, 247, 248
Method 2 [Read one large file from HDD and no object creation]: 269, 272, 270
Method 3 [Read many small files from HDD and no object creation]: 456, 462, 461
Like I said, I'm wary of drawing any more conclusions here, but would it be possible to avoid the text parsing and object creation by putting my OptionQuote objects into an ArrayList and then storing that ArrayList to file as a serialized object?
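For reference, a minimal sketch of that idea using Java's built-in serialization. The `OptionQuote` class here is a hypothetical stand-in with made-up fields (`symbol`, `bid`, `ask`), since the real class isn't shown, and `QuoteStore` is just a name for the example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;

class QuoteStore {
    /** Hypothetical stand-in for the real OptionQuote class; the fields are assumptions. */
    static class OptionQuote implements Serializable {
        private static final long serialVersionUID = 1L;
        final String symbol;
        final double bid;
        final double ask;

        OptionQuote(String symbol, double bid, double ask) {
            this.symbol = symbol;
            this.bid = bid;
            this.ask = ask;
        }
    }

    /** Writes the whole list as one serialized object graph. */
    static void save(Path file, ArrayList<OptionQuote> quotes) {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeObject(quotes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the list back; no text parsing, but the objects are still re-created. */
    @SuppressWarnings("unchecked")
    static ArrayList<OptionQuote> load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return (ArrayList<OptionQuote>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("quotes", ".ser");
        ArrayList<OptionQuote> quotes = new ArrayList<>();
        quotes.add(new OptionQuote("WLP", 1.10, 1.15));
        save(tmp, quotes);
        System.out.println("loaded " + load(tmp).size() + " quote(s)");
        Files.delete(tmp);
    }
}
```

This removes the text parsing but not the object creation, and Java serialization has overhead of its own, so it would need to be timed against the text version before assuming it's faster.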
Also, I have no idea why it takes forever to read many small files off the HDD (Symbol A, Method 3) the first time around, but that's been true any time I've looked at this. Not a big problem I guess, but strange.