Data Storage May Become “Crash-Tolerant”

Know that heart-wrenching feeling you get when years of digital memories or business information disappear before your eyes? MIT researchers may have found a way to confine that awful feeling to the past.

At the ACM Symposium on Operating Systems Principles in October 2015, MIT researchers presented the first file system that comes with a mathematically sound guarantee that it will protect data in the event of a computer crash.

In any operating system, the file system is responsible for writing data to disk and for tracking where that data is stored. If the computer crashes while a write is underway, the file system's records can become corrupted.
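To make that failure mode concrete, here is a deliberately simplified Python sketch (a hypothetical toy model, not the behavior of any real file system): appending to a file takes more than one disk operation, and a crash between the operations leaves the file system's records out of sync with the data.

```python
# Hypothetical toy model (not any real file system): a "disk" holding data
# blocks plus a record of how many blocks the file owns. Appending takes
# two separate disk operations, so a crash between them corrupts the
# bookkeeping.
disk = {"blocks": {}, "file_length": 0}

def append_block(data, crash_between=False):
    disk["file_length"] += 1                   # step 1: record the new length
    if crash_between:
        raise RuntimeError("power failure")    # crash: data is never written
    disk["blocks"][disk["file_length"] - 1] = data  # step 2: write the data

append_block("hello")
try:
    append_block("world", crash_between=True)  # simulate a mid-write crash
except RuntimeError:
    pass

# The records now claim two blocks exist, but only one was ever written:
# reading block 1 would return garbage (here, a missing key).
print(disk)  # {'blocks': {0: 'hello'}, 'file_length': 2}
```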

MIT's file system is admittedly slower than the alternative file systems on the market today, but the innovation remains a crucial step for data protection: the techniques the researchers used to verify the file system's behavior can be extended to more sophisticated designs. In general, formal verification could hold the key to developing more reliable and efficient file systems.

“What many people worry about is building these file systems to be reliable, both when they’re operating normally but also in the case of crashes, power failure, software bugs, hardware errors, what have you,” explained Nickolai Zeldovich, an associate professor of computer science and engineering and one of the MIT computer scientists behind the work.

“Making sure that the file system can recover from a crash at any point is tricky because there are so many different places that you could crash. You literally have to consider every instruction or every disk operation and think, ‘Well what if I crash now? What now? What now?’ And so empirically, people have found lots of bugs in file systems that have to do with crash recovery, and they keep finding them, even in very well tested file systems, because it’s just so hard to do.”
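That "what if I crash now?" exercise can be mechanized, at least empirically: replay the same sequence of disk operations, stop after every prefix as though the power had failed, and check whether the exposed state still satisfies a consistency invariant. Here is a minimal sketch of the idea, with hypothetical operation names and a hypothetical invariant:

```python
# Hypothetical sketch of crash-point enumeration: replay every prefix of a
# write sequence, as though the power failed right after it, and check a
# consistency invariant on each state a crash could expose.

def apply_ops(ops):
    """Replay a list of disk writes onto a fresh disk."""
    disk = {"file_length": 0, "blocks": {}}
    for key, value in ops:
        if key == "file_length":
            disk["file_length"] = value
        else:
            disk["blocks"][key] = value
    return disk

def consistent(disk):
    """Invariant: every block the length record claims must be present."""
    return all(i in disk["blocks"] for i in range(disk["file_length"]))

# The write sequence for appending one block, metadata first.
ops = [("file_length", 1), (0, "hello")]

for crash_point in range(len(ops) + 1):
    state = apply_ops(ops[:crash_point])
    print(f"crash after {crash_point} op(s): consistent={consistent(state)}")
# "crash after 1 op(s)" reports consistent=False: the length was recorded
# before the data existed, exactly the kind of bug crash testing hunts for.
```

Testing of this kind finds crash bugs one crash point at a time; the formal approach described next aims to prove that no crash point can violate the invariant at all.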

Among Zeldovich’s colleagues are Frans Kaashoek, the Charles A. Piper Professor in MIT’s Department of Electrical Engineering and Computer Science (EECS); associate professor of computer science Adam Chlipala; Haogang Chen, a graduate student in EECS; and Daniel Ziegler, an undergraduate in EECS.

The researchers used a method called formal verification to prove the reliability of their system. Formal verification means mathematically describing the acceptable bounds of operation for a computer program and then proving that the program will never exceed them. The process is so complicated that it has generally been applied only to high-level schematic representations of a program's functionality; translating those high-level schemas into working code can then introduce complications that the proofs don't address.
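For a flavor of what a machine-checked bound looks like, here is a minimal sketch in the Lean proof assistant (purely illustrative, not MIT's code; the team worked in Coq, a similar tool): a one-line "program," a statement of its acceptable bounds, and a proof the tool refuses to accept unless it covers every possible input.

```lean
-- Illustrative sketch only. `clampToBlock` is a tiny "program"; the theorem
-- states its acceptable bounds; the proof checker rejects the file unless
-- the proof covers every input.
def clampToBlock (n : Nat) : Nat := min n 511

-- Machine-checked guarantee: the result never exceeds the bound.
theorem clampToBlock_inBounds (n : Nat) : clampToBlock n ≤ 511 :=
  Nat.min_le_right n 511
```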

“All these paper proofs about other file systems may actually be correct, but there’s no file system that we can be sure represents what the proof is about,” admitted Ziegler.

The MIT researchers, however, managed to prove properties of their file system's final code rather than of a high-level schema. They did this using a tool called a proof assistant, which provides a formal language for describing aspects of a computer system and the relationships between them.

“This formal proving environment includes a programming language,” Chlipala stated. “So we implement the file system in the same language where we’re writing the proofs. And the proofs are checked against the actual file system, not some whiteboard idealization that has no formal connection to the code.”
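A minimal Lean sketch of that arrangement (again illustrative, with hypothetical names; the real system is vastly more elaborate): the `write` below is the implementation itself, and the theorem about it is checked against that exact definition rather than a whiteboard idealization.

```lean
-- Illustrative sketch: the implementation and its proof live in the same
-- language, so the proof is about the real `write`, not an idealization.
abbrev Disk := Nat → Nat  -- a disk maps block addresses to stored values

def write (d : Disk) (addr v : Nat) : Disk :=
  fun a => if a = addr then v else d a

def read (d : Disk) (addr : Nat) : Nat := d addr

-- Machine-checked: reading an address you just wrote returns the value
-- you wrote, for every disk, address, and value.
theorem read_after_write (d : Disk) (addr v : Nat) :
    read (write d addr v) addr = v := by
  simp [read, write]
```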

Zeldovich said the team rewrote the file system “probably 10 times,” but at the end of the day, their hard work stands to genuinely improve and protect our digital world.

