Archive | August 2013

To stripe or not to stripe

We often get asked why Drive Bender does not have a “read striping” feature (as in reading the primary and duplicate files in parallel)… so I thought it was time to lay down the facts!

OK, the very first thing that needs to be understood is that RAID (the very technology the word “striping” comes from) is an entirely different beast to NTFS and file duplication. RAID configured in a striping mode does a very specific thing (for the sake of this article we will assume 2 drives), and that is to write to the RAID device in split data chunks. So in layman's terms, if we had a file that was made up of “AAbbCCdd”, RAID would write “AACC” to drive 1 and “bbdd” to drive 2. Now the big advantage here is that when RAID needs to read this data, only 4 characters need to be read from each drive, and if they are read in parallel, there is most certainly a raw speed increase. Another important aspect of this is that the data is read in a continuous manner, in that neither drive has to skip over any data… however I will get to this in a minute.
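To make the “AAbbCCdd” example concrete, here is a minimal sketch of the chunk split described above. It is a toy model only: the function names `stripe` and `unstripe` and the 2-character chunk size are assumptions made for illustration, not how any real RAID controller (or Drive Bender) is implemented.

```python
def stripe(data: str, chunk: int = 2, drives: int = 2) -> list[str]:
    """Hand out fixed-size chunks to the drives in round-robin order."""
    parts = [""] * drives
    for start in range(0, len(data), chunk):
        drive = (start // chunk) % drives
        parts[drive] += data[start:start + chunk]
    return parts


def unstripe(parts: list[str], chunk: int = 2) -> str:
    """Rebuild the file by taking the next chunk from each drive in turn."""
    offsets = [0] * len(parts)
    out = []
    drive = 0
    while offsets[drive] < len(parts[drive]):
        out.append(parts[drive][offsets[drive]:offsets[drive] + chunk])
        offsets[drive] += chunk
        drive = (drive + 1) % len(parts)
    return "".join(out)


print(stripe("AAbbCCdd"))          # ['AACC', 'bbdd']
print(unstripe(["AACC", "bbdd"]))  # AAbbCCdd
```

Notice that each drive ends up holding its half of the file as one unbroken run, which is what lets it read straight through without skipping anything.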

Now this is possible to do with file duplication, simply by reading half the requested data from one drive, and the other half from the other drive… however there are some serious “gotchas” with this idea. Let's consider the RAID example of the “AAbbCCdd” file. As mentioned, there are two key benefits with this:
1) Each drive only needs to read half of the amount of data required.
2) The data being read is done so in a streaming manner.

Performing striping with file duplication gets us the advantage of point one, however it does not give us the performance advantage of point two. Because each drive holds a full copy of the file, we end up with drive one performing a read like “AA” -> seek to “CC”, and drive two reading like “bb” -> seek to “dd”. The big killer here is the “seek” to the next block of data. This seek is very expensive in terms of performance, and ultimately lowers the data throughput.
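The difference in access pattern can be sketched by listing which byte ranges each drive would be asked for in the two layouts. Again this is only a toy model under stated assumptions: the helper names (`duplicate_read_plan`, `striped_read_plan`, `count_seeks`) are made up for this example, a “seek” is counted simply as any read that does not start where the previous one ended, and real-world effects such as drive caches and read-ahead are ignored.

```python
def duplicate_read_plan(file_len, chunk=2, drives=2):
    """Each drive holds a full copy; alternate chunks are read from each copy."""
    plan = [[] for _ in range(drives)]
    for i, start in enumerate(range(0, file_len, chunk)):
        plan[i % drives].append((start, min(start + chunk, file_len)))
    return plan


def striped_read_plan(file_len, chunk=2, drives=2):
    """Each drive holds only its own chunks, stored back to back."""
    plan = [[] for _ in range(drives)]
    pos = [0] * drives  # next on-drive offset for each drive
    for i, start in enumerate(range(0, file_len, chunk)):
        d = i % drives
        size = min(chunk, file_len - start)
        plan[d].append((pos[d], pos[d] + size))
        pos[d] += size
    return plan


def count_seeks(ranges):
    """Count reads that do not begin where the previous read ended."""
    return sum(1 for (_, prev_end), (start, _) in zip(ranges, ranges[1:])
               if start != prev_end)


for name, plan in [("duplicated", duplicate_read_plan(8)),
                   ("striped", striped_read_plan(8))]:
    for drive, ranges in enumerate(plan, start=1):
        print(name, "drive", drive, ranges, "seeks:", count_seeks(ranges))
```

For the 8-character file, each duplicated drive reads two ranges with a gap between them (one seek each), while each striped drive reads two adjacent ranges (no seeks). On a larger file the duplicated layout picks up one such gap for every chunk after the first.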

As you can see, this type of reading effectively fragments the data, which as we all know is a performance killer. Bottom line, to get the best performance, find the data on the fastest drive and pull straight from that!