Calgary RHCE

A linux and open source technology blog.

Filesystem benchmarking

September 11, 2016 By Andrew Ludwar

Once in a while I need to do some I/O benchmarking, either to create a baseline for a before-and-after performance tuning comparison, or just because I want to see how fast my new SSD/M.2 really is. The tool I've found myself reaching for over the years is iozone. I like it because it easily runs a myriad of tests that I don't need to create or think about how to simulate myself, and it's easily installed on any linux distro. It also outputs the test data into a format that is easily imported into a spreadsheet. I recently added a couple of SSDs to my workstation, and finally got around to benchmarking them.

IOzone has a slew of pre-built tests you can run, but I typically just do a handful: write/rewrite, read/re-read, random read/write, stride read, and random mix. From the iozone help:

          -i #  Test to run (0=write/rewrite, 1=read/re-read, 2=random-read/write
                 3=Read-backwards, 4=Re-write-record, 5=stride-read, 6=fwrite/re-fwrite
                 7=fread/Re-fread, 8=random_mix, 9=pwrite/Re-pwrite, 10=pread/Re-pread
                 11=pwritev/Re-pwritev, 12=preadv/Re-preadv)

Within each test you can also specify the record size and file size to be written (see the sketch after the output below). This is my usual test syntax:

$ /opt/iozone/bin/iozone  -a -b benchmark.xls -R -i 0 -i 1 -i 2 -i 5 -i 8 -f /export/testfile
    Iozone: Performance Test of File I/O
            Version $Revision: 3.452 $
        Compiled for 32 bit mode.
        Build: linux
 
    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                 Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
                 Vangel Bojaxhi, Ben England, Vikentsi Lapa,
                 Alexey Skidanov.
 
    Run began: Sun Sep 11 20:17:18 2016
 
    Auto Mode
    Excel chart generation enabled
    Command line used: /opt/iozone/bin/iozone -a -b benchmark.xls -R -i 0 -i 1 -i 2 -i 5 -i 8 -f /export/testfile
    Output is in kBytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 kBytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
                                                              random    random     bkwd    record    stride                                    
              kB  reclen    write  rewrite    read    reread    read     write     read   rewrite      read   fwrite frewrite    fread  freread
              64       4   888413   953673  4297180  3033971  2786676  1120480                      2463305                                   
              64       8   744494  1232721  3551002  4261142  3047037  1422167                      2553108                                   
              64      16   864466  1560183  4958872  5328791  3356712  1778661                      2277099                                   
              64      32  1182756  1335389  8051483  6329941  7985638  1783120                      2914882                                   
              64      64  1453111  1598043  7075540 12653171  4552275  1887599                      2207552                                   
             128       4   999800  1195499  3375208  4131854  3648013  1855621                      2783700                                   
             128       8  1164348  1487891  4404899  5328896  3660474  1706333                      5548323                                   
             128      16  1348377  1704441  4738496  4003638  5355501  2459229                      3883374                                   
             128      32  1437115  1704867  5318585  4721384  5144080  2243177                      4142611                                   
             128      64  1049847  1523170  7979556  6401923  4269068  2067294                      4409651                                   
..

… and so on as it goes through all of the test scenarios. At the end, I’m given a nicely formatted spreadsheet that’s easy to make graphs from:
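For a single, targeted run you can also pin the record and file size yourself with -r and -s rather than letting auto mode sweep every combination. A minimal sketch, with sizes that are purely illustrative:

# write/rewrite and read/re-read only, 4 kB records against a 1 GB test file
$ /opt/iozone/bin/iozone -i 0 -i 1 -r 4k -s 1g -f /export/testfile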

[iozone graph output]

Looking at my read stats, at the peak I'm getting just over 20 GB/sec, though the average seems to be about 12-14 GB/sec. Not too shabby! (I've got a few SSDs in a RAID 1 array.) Full test details are in a spreadsheet on GitHub.

 

Filed Under: open source, performance tuning, storage Tagged With: I/O, performance tuning, SSD, storage

Performance Tweaking – SSDs, I/O Scheduler, Virtual Memory, Browser Cache

July 13, 2014 By Andrew Ludwar

Having recently been exposed to some SSD tweaking at work, I thought I'd do the same with my home PCs.  Prior to this weekend, I'd just had four 1TB SATA drives in a RAID 5 configuration for a disk setup.  Performance has always been satisfactory, but with recent SSD prices coming down quite a bit and my late 2008 MacBook Pro feeling its age, it was time for an upgrade.  Also, it's been a while since I've invested in some new gear for myself, so why not now? ;)

For the laptop, a 256 GB Samsung 840 Pro SSD.  For the tower, the 128 GB version.

SSDs

These are higher grade SSDs, with quite a bit higher IOPS ratings than average.  Generally speaking, the larger the drive, the more IOPS you will get as well.  The laptop would use the entire drive for all data, while the tower would use the SSD for just the OS.  My laptop runs OS X Mavericks, and the tower CentOS 6.5.  For now, we'll just talk about the CentOS box.

Tuning the filesystems – optimizing data structure for SSD use, eliminating journal access time writes, temp filesystems to RAM, enabling TRIM

I've split the OS into five filesystems that will sit on the SSD: root, var, usr, boot, and ssd. The /ssd filesystem is just set aside for anything that might need fast disk.

The main advantage of SSDs is read performance, and they do wear with use, so ideally you want the frequently read files on the SSD while reducing writes and file changes as much as you can to limit wear and extend the SSD's lifespan. For this reason I've used the "noatime" and "nodiratime" mount options on the SSD ext4 filesystems. By default, linux records information about when each file was created and last modified, as well as when it was last accessed. Turning access-time updates off not only reduces writes to the SSD, it can also improve performance on frequently accessed and frequently changing files. Since no important application logs sit on these filesystems, and /var/log/messages includes timestamps within the file, I can safely turn this off. My tower /etc/fstab looks like this:

UUID=ebc6b3e8-48e5-4d1d-b7c1-0cd5f8738bca /boot ext4 defaults 1 2
/dev/mapper/vg_os-lv_root / ext4 noatime,nodiratime 1 1
/dev/mapper/vg_os-lv_ssd /ssd ext4 noatime,nodiratime 1 2
/dev/mapper/vg_os-lv_usr /usr ext4 noatime,nodiratime 1 2
/dev/mapper/vg_os-lv_var /var ext4 noatime,nodiratime 1 2
/dev/mapper/vg_os-lv_swap swap swap defaults 0 0
/dev/mapper/vg_os-lv_export /export ext4 defaults 1 2
/dev/mapper/vg_os-lv_home /home ext4 defaults 1 2
tmpfs /tmp tmpfs defaults,noatime,nodiratime 0 0
tmpfs /var/tmp tmpfs defaults,noatime,nodiratime 0 0
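
To pick up the noatime/nodiratime options without a reboot, the filesystems can also be remounted in place. A quick sketch (run as root, using the mount points from the fstab above):

mount -o remount,noatime,nodiratime /
mount -o remount,noatime,nodiratime /usr
mount -o remount,noatime,nodiratime /var
mount -o remount,noatime,nodiratime /ssd
mount | grep noatime    # verify the options took effect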

Since swap can be frequently accessed, I've opted not to put it on the SSD. Also, /export and /home contain important data that needs to be recoverable if I ever lose a couple of disks. SSDs are not good for archival use; once they wear out, you won't be able to scrape the drive for any usable data. Once it goes, your data goes with it.

I've also made some other tweaks to the filesystems, making /tmp and /var/tmp use RAM instead of disk for file storage. This speeds up temp file access quite a bit. I have 32GB of RAM on the system, so I'm not crunched for memory.

I'll also need to enable TRIM support. There's a good article here that talks about enabling it properly. I've enabled TRIM for LVM, and have placed a simple script in the weekly cron to handle discarding unused blocks on the SSD.
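
Roughly, that setup looks like the sketch below: turning on issue_discards in /etc/lvm/lvm.conf so LVM passes discards down when logical volumes are removed or shrunk, plus a weekly cron script that runs fstrim against the SSD filesystems. The script name and mount point list are just illustrative, and this assumes fstrim from util-linux is available on the system.

# /etc/lvm/lvm.conf (devices section): pass discards to the SSD on lvremove/lvreduce
issue_discards = 1

# /etc/cron.weekly/fstrim-ssd.sh (hypothetical name): discard unused blocks weekly
#!/bin/bash
for fs in / /boot /usr /var /ssd; do
    fstrim -v "$fs"
done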

Change the kernel I/O scheduler

The I/O scheduler optimizes the order of disk operations for speed. The default scheduler is CFQ (Completely Fair Queuing), which is optimized around the mechanical rotational latency of regular HDDs. Since that doesn't apply to SSDs, consider using the deadline scheduler instead, which prioritizes reads over writes. This change will help take advantage of the SSD's performance. You can read more about the I/O scheduler options here. Further to that, Red Hat-based systems also have the tuned-adm command, which can apply pre-defined performance profiles based on your hardware and optimization goal (see man tuned-adm for more).
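
To see which scheduler a disk is currently using, and to try deadline on the fly before committing to a boot-time change, something like the following works (sda is just a placeholder for the SSD's device name, and the tuned profile name is only an example):

# the scheduler shown in [brackets] is the active one
cat /sys/block/sda/queue/scheduler
# switch to deadline at runtime (as root); reverts on reboot
echo deadline > /sys/block/sda/queue/scheduler
# or let tuned apply a pre-defined profile; profile names vary by release
tuned-adm list
tuned-adm profile throughput-performance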

To make this scheduler change permanent, you'll need to modify GRUB to pass the "elevator=deadline" parameter on the kernel boot line.

vi /etc/grub.conf
...
title CentOS (2.6.32-431.20.3.el6.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-431.20.3.el6.x86_64 ro root=/dev/mapper/vg_os-lv_root rd_NO_LUKS KEYBOARDTYPE=pc KEYTABLE=us rd_LVM_LV=vg_os/lv_swap LANG=en_US.UTF-8 rd_LVM_LV=vg_os/lv_root rd_MD_UUID=0b5dd8e0:26c65ad9:72ad5b48:a8447401 rd_MD_UUID=adfcdb87:7b4dcd6e:683df3bf:f6056163 rhgb crashkernel=auto quiet SYSFONT=latarcyrheb-sun16 rd_NO_DM elevator=deadline
initrd /initramfs-2.6.32-431.20.3.el6.x86_64.img
...

Tuning virtual memory – Reduce swapping

Although I've placed my swap partition on the HDDs, swappiness is still an important tunable to look at. Swap space is hard drive space that is used when RAM is exhausted and the running processes need more memory. The system will also move data from RAM to swap if it hasn't been accessed in a while, preemptively freeing up RAM for other processes. Swap is incredibly slow compared to RAM and really should only be used as a fallback when your system is running out of RAM. If your system is frequently swapping, go buy more RAM and use swap as a temporary workaround.

In the kernel, there's a setting that controls the degree to which the system swaps. Since I have a lot of RAM to use, I only want the system to swap when it's absolutely necessary. A high swappiness value makes the kernel aggressively swap inactive processes out of physical memory; a low value avoids swapping processes for as long as possible. So I've set the vm.swappiness parameter to zero. You can read more about this tunable here.

I've added this parameter to /etc/sysctl.conf so it persists through a reboot, and made it active immediately with sysctl -p.

# Only swap if absolutely necessary
vm.swappiness = 0
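
A quick sketch of loading the new setting and confirming it took effect:

# reload settings from /etc/sysctl.conf and verify the running value
sysctl -p
cat /proc/sys/vm/swappiness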

Moving Firefox cache to RAM

This is a nice tweak that tells Firefox to use RAM instead of disk for its cache, which should speed up browsing frequently visited sites. To enable it, go into about:config, set browser.cache.disk.enable to false, set browser.cache.memory.enable to true, and set browser.cache.memory.capacity to the number of KB you want to assign. I've used -1 to let Firefox size the cache dynamically depending on how much RAM the system has.

about:config
browser.cache.disk.enable = false
browser.cache.memory.enable = true
browser.cache.memory.capacity = -1

You can visit about:cache to see your newly changed cache settings!

Results

Now for the results. I've used iozone as my benchmarking tool because it performs a series of tests quite easily and graphs the output nicely for me in Excel. Attached are the results of the read report statistics. As you can see, while OS caching and a few other things blur the true speed differences a little bit, the SSD has roughly triple the read performance of the four HDDs in RAID 5.

Filed Under: open source, performance tuning, storage Tagged With: I/O, performance tuning, SSD, storage
