ZFS? ZIL? L2ARC? What’s that about?

ZFS is the free copy-on-write filesystem from Oracle Corp (formerly Sun Microsystems). I’m currently using ZFS and Solaris 11 Express as the base of my home file server, and I thought I’d explain how it works. One of the most interesting things about ZFS is its ability to use fast SSDs to speed things up. I’m currently using two mirrored 60GB SSDs as ZIL (ZFS Intent Log) and one 60GB SSD as L2ARC (Level 2 Adaptive Replacement Cache). The ZIL acts as the write cache for synchronous writes and the L2ARC is the read cache. Why do I need those? Well, I don’t. But it’s interesting to learn about the file system and how it works, so I just HAD to try it out.

The old model

The old model of file servers uses the machine’s RAM as a read cache, and the rest of the data is on normal, spinning disks.

The old model with RAM and disks

The new model

Now, how can we be more efficient? The problem with disks is that they are slooooow, really slow. The problem with RAM is that you never have enough of it. The solution is to insert another layer in the storage hierarchy, an SSD layer. The fast SSDs will act as a cache, much faster than spinning disks and with a lot more storage capacity than RAM.

The new model with SSDs
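
To get a feel for why the extra layer helps, here is a rough back-of-the-envelope sketch. The latencies and hit rates are round numbers assumed purely for illustration, not measurements from my server, and `average_read_latency` is a made-up helper name.

```python
# Illustrative latencies in seconds. These are assumed round numbers,
# not measurements from any real system.
RAM_LATENCY = 100e-9    # roughly 100 nanoseconds
SSD_LATENCY = 100e-6    # roughly 100 microseconds
DISK_LATENCY = 10e-3    # roughly 10 milliseconds


def average_read_latency(ram_hit_rate, ssd_hit_rate=0.0):
    """Expected latency when a read tries RAM, then SSD, then disk.

    ssd_hit_rate is the fraction of RAM misses that the SSD layer serves.
    With ssd_hit_rate=0.0 this is the old model (RAM and disks only).
    """
    ram_miss = 1.0 - ram_hit_rate
    ssd_miss = 1.0 - ssd_hit_rate
    return (ram_hit_rate * RAM_LATENCY
            + ram_miss * ssd_hit_rate * SSD_LATENCY
            + ram_miss * ssd_miss * DISK_LATENCY)


# Hypothetical hit rates: 80% of reads served from RAM,
# and 75% of the remainder served from SSD in the new model.
old_model = average_read_latency(ram_hit_rate=0.80)
new_model = average_read_latency(ram_hit_rate=0.80, ssd_hit_rate=0.75)
print(f"old model: {old_model * 1e3:.3f} ms per read")
print(f"new model: {new_model * 1e3:.3f} ms per read")
```

Under these assumptions the disk dominates the average in both cases, so every read the SSD layer catches before it reaches the disks pays off directly.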

ZFS

ZFS uses the new model, but with a “twist”. As I mentioned before, there are two kinds of SSD cache in ZFS: ZIL and L2ARC.

The ZIL, or ZFS Intent Log, is the ZFS write cache. Many applications, like databases, need to do synchronous writes to disk to ensure that the data is safely down in stable storage. This tends to be a problem since sync writes are really slow. Normally ZFS collects writes into transaction groups, and these are pushed out to disk every couple of seconds. Does the database want to wait that long? Probably not, so the intent log record that says “I’m about to write baladibla to block bla bla” is written to disk instead, painfully slow but at least the data won’t be gone in case of a power failure. This pretty much works like the logs in a normal database. So what a dedicated ZIL device does is take over these log records: instead of being written to slow spinning disks they are stored on fast SSDs, and the sync writes can be handled much faster.
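
The sync write path described above can be sketched as a toy model. This is not how ZFS is implemented internally. `ToyPool` and its methods are hypothetical names, and the sketch only shows the idea: log first, acknowledge, flush later, and replay the log after a crash.

```python
class ToyPool:
    """Toy model of sync writes with an intent log. Not real ZFS code."""

    def __init__(self):
        self.disk = {}        # data committed to the main pool (slow disks)
        self.pending = {}     # open transaction group, held in RAM
        self.intent_log = []  # records on the fast log device, survives a crash

    def sync_write(self, block, data):
        # Record the intent on the log device first, then acknowledge.
        # The caller does not wait for the next transaction group.
        self.intent_log.append((block, data))
        self.pending[block] = data
        return "ack"

    def commit_txg(self):
        # Periodic flush: the transaction group reaches the main pool,
        # so the log records covering it are no longer needed.
        self.disk.update(self.pending)
        self.pending.clear()
        self.intent_log.clear()

    def crash(self):
        # Power failure: whatever was only in RAM is lost.
        self.pending = {}

    def replay(self):
        # After a crash, re-apply the logged writes and commit them.
        for block, data in self.intent_log:
            self.pending[block] = data
        self.commit_txg()


pool = ToyPool()
pool.sync_write(7, b"row 1")  # acknowledged before any flush to the pool
pool.crash()                  # the transaction group in RAM is gone
pool.replay()                 # the log still has the record
print(pool.disk)              # {7: b'row 1'}
```

Note that in this model the log is only ever read back after a crash. During normal operation it is written and then discarded once the transaction group is safely on disk.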

The L2ARC, on the other hand, is totally different: this is the ZFS read cache. In the old model the requested data would first be read from the cache in RAM, and if it was missing there it would have to be read from disk. Disk reads are slow, can we please avoid them? Yes, we can. We’re inserting another layer between RAM and the spinning disks, consisting of much faster SSDs. They work as an extension of the normal cache in RAM (called ARC, hence L2ARC). The cache is filled according to some basic rules like “most frequently used”, “most recently used” and so on. When you read data the system first checks the ARC, then the L2ARC and last the spinning disks. This means a lot faster reads, especially random reads, which tend to be extremely slow on spinning disks.

ZFS uses ZIL and L2ARC on SSD
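
The read order described above (ARC, then L2ARC, then disk) can be sketched like this. It is a heavily simplified model: both tiers use plain least-recently-used eviction rather than the real adaptive algorithm, and a block is copied to the SSD tier at the moment RAM evicts it, which is only a stand-in for how the L2ARC actually gets populated. `LRU` and `TieredReader` are hypothetical names.

```python
from collections import OrderedDict


class LRU:
    """Small LRU cache. put() returns the evicted (key, value) or None."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            return self.items.popitem(last=False)  # evict least recently used
        return None


class TieredReader:
    """Reads check RAM first, then SSD, then the spinning disks."""

    def __init__(self, disk, arc_size, l2arc_size):
        self.disk = disk              # dict standing in for the spinning disks
        self.arc = LRU(arc_size)      # cache in RAM
        self.l2arc = LRU(l2arc_size)  # cache on SSD

    def read(self, block):
        data = self.arc.get(block)
        if data is not None:
            return data, "arc"
        data = self.l2arc.get(block)
        source = "l2arc"
        if data is None:
            data = self.disk[block]
            source = "disk"
        evicted = self.arc.put(block, data)
        if evicted is not None:
            self.l2arc.put(*evicted)  # block pushed out of RAM lands on SSD
        return data, source


disk = {n: f"block-{n}" for n in range(10)}
reader = TieredReader(disk, arc_size=2, l2arc_size=4)
sources = [reader.read(n)[1] for n in (1, 2, 3, 1, 3)]
print(sources)  # ['disk', 'disk', 'disk', 'l2arc', 'arc']
```

Block 1 falls out of the tiny RAM cache when block 3 is read, so the second read of block 1 is served from the SSD tier instead of going back to disk.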

Final thoughts

In my own testing I get about three times the write performance with this setup and twice the read performance. Quite a nice addition to my home server indeed. I should also mention that I use VMware ESXi for virtualization and the backend storage runs over NFS against the file server. NFS uses sync writes, and this is probably where I saw the biggest difference when I added the ZIL devices. Earlier the VMs would pretty much be unresponsive whenever I copied a big file, because they couldn’t get their sync writes through to the disks. This is gone now, thankfully.

7 thoughts on “ZFS? ZIL? L2ARC? What’s that about?”

  1. Thomas Wong

    Thanks A LOT for the effort you put in describing this.
    Everything is so clear now. Great article!!!!!!!!

    1. marcus Post author

      The recommendations are to use SLC for ZIL and MLC for L2ARC, but I’m using consumer grade Corsair F60 for both ZIL and L2ARC. So, I guess my recommendations depend on the load on the server. :)

      1. Martin

        Ah ok, SLC is more for enterprise users. My home server is a Solaris 11 storage server
        with 4x 2 TB Samsung HDDs in RAIDZ1 and 16 GB of RAM, so I will only pay for MLC SSDs :)

        Thanks for your fast reply

  2. Oskar

    Hi there!

    Very well written!

    Is there any golden rule when it comes to the size of ZIL/L2ARC? I’ve heard that you should have a ZIL that is 1/2 of the available RAM, is that right? What size should the L2ARC be, then? Can performance get worse if the ZIL/L2ARC is too large?

    1. marcus Post author

      I can only tell you what my own experience with ZIL is, and that is that it never uses more than a couple of GB. 4-8 GB of really fast SLC is probably the best choice.

      You should make the L2ARC as large as you can; performance doesn’t get worse with a larger L2ARC.
