wiki:SSDtests1
Last modified on 07/06/2011 03:38:01 AM

Tests with Solid State Disks on dashio (SDSC)

LSST Database SSD Tests

These results were all obtained with MySQL; they are not general-purpose benchmarks of SSD vs. spinning disk for arbitrary applications, or even for other DBMSs.

tests with RAM-based data

During all the tests described in this section the data was cached in RAM, and no disk access was observed (via iostat) while the queries ran.

Test01 (concurrency and scaling)

  • Hardware used: H1
  • During each test, CPU utilization is 100% x (number of queries run concurrently)
  • All data read from RAM, no disk access
  • Q1 (full table scan):
    • Query executed: select count(*) from Object_200m ignore index(bMag) where bMag>0.2
    • The table has 200 million rows, MYD size = 14GB.
  • Q2 (full index scan):
    • Query used: index scan (select count(*) from ObjectXX where bMag>0.2)
    • Used 16 x 7,916,559KB (~8GB) MYI index files
  • Q3 (join, predicate):
    • select count(*) from o01, o02 where o01.ra=o02.ra and o01.ra<3 and o02.bMag>0.2
    • select count(*) from o02, o03 where o02.ra=o03.ra and o02.ra<3 and o03.bMag>0.2
    • select count(*) from o03, o04 where o03.ra=o04.ra and o03.ra<3 and o04.bMag>0.2
    • ...
    • 30 million rows per table (MYD 630,000,000 bytes, MYI 1,040,399,360 bytes)
    • very narrow table:
CREATE TABLE o01 (
  objectId  BIGINT NOT NULL PRIMARY KEY AUTO_INCREMENT,
  ra        FLOAT NOT NULL,
  decl      FLOAT NOT NULL,
  bMag      FLOAT NOT NULL,
  KEY(bMag)
 )
#queries  Q1 avg [sec]  Q2 avg [sec]  Q3 avg [sec]  CPU [%]
 1        19.8           19            16            100
 2        21.8           24            29            200
 4        21.2           33            52            400
 6        -              51            63            600
 8        22.1           74           116            800
10        25.2           98           156           1000
12        28.2          126           203           1200
14        32.0          164           264           1400
16        37.0          213           341           1600
18        39.2           -             -            1600
20        41.4           -             -            1600

Test02 (multithreading in mysql)

  • Hardware used: H1
  • During each test, CPU utilization is 100% x (number of queries run concurrently)
  • Query used: index scan (select count(*) from ObjectXX where bMag>0.2)
  • Used 16 x 7,916,559KB (~8GB) MYI index files
  • All data read from RAM, no disk access
  • Run multiple mysql servers on the same machine
# mysqld servers  queries per server  avg time per query [sec]  slowdown vs 1 query (no concurrency)
1                 16                  213                       11x
2                  8                   87                        5x
4                  4                   44                        2x

Conclusion: mysql multithreading is the bottleneck!

ssd disks configured in RAID0

Test03 (Ingest)

  • hardware used: H1a
  • Ingesting the USNOB catalog (69 GB CSV file, 60 GB MyISAM table, 1 billion rows) took 32 min 9 sec (~31 MB/sec, ~520K rows/sec); see the LOAD DATA sketch after this list
  • Ingesting the same amount of data into a table with 3 billion rows took only 35 sec longer
  • Loading 3 USNOB catalogs simultaneously into 3 different tables took ~44 min
  • "INSERT INTO ... SELECT FROM" for a 1 billion row table took ~18 min (~55 MB/sec)
  • Adding objectId and building a primary index on a 3.1 billion row table took 40 min 18 sec
  • "alter table Object add key(bMag), add key (b2Mag)" makes a full copy of the MYD file - we ran out of disk space

Test04 (Spatial join)

  • hardware used: H1a
  • This test didn't really stress the SSDs, but it was useful to see how fast we can do a spatial join on a high-end machine like dashio. Related page: dbSpatialJoinPerf.

Query executed:

SELECT count(*)
FROM   X o1, X o2 
WHERE  ABS(o1.ra - o2.ra) < 0.00083 / o2.cosRadDecl 
  AND  ABS(o1.decl - o2.decl) < 0.00083 
  AND  o1.objectId <> o2.objectId

Timing (includes timing achieved on the lsst-dev machine)

nRows [K]  lsst-dev [sec]  dashio [sec]
 1           1               0.20
 2           5               0.74
 3          11               1.57
 4          18               2.65
 5          28               3.97
10         101              13.86
15         215              29.12
20         368              49.97
25         566              76.31
30           -             107.76
50           -             288.67
60           -             420.10
70           -             555.30
80           -             736.35

Best fit for lsst-dev: 0.92*(nRows)^2. Best fit for dashio: 0.115*(nRows)^2 (nRows in thousands of rows). For example, at nRows = 25 the fits give 0.92*25^2 = ~575 sec and 0.115*25^2 = ~72 sec, vs. the measured 566 sec and 76 sec. See ssd-dashio-tests.ods

The USNOB data set was used, with the schema as described at dbSpatialJoinPerf

Test05 (Full table scans)

Hardware used: H1a

Tests run with USNOB catalog replicated 3x, total size: 3,135,527,289 rows (214GB MYD, 118GB MYI), primary index on objectId, index on bMag and b2Mag.

Simple full-table-scan queries take ~6-7 min (~575 MB/sec, or ~72 MB/sec/disk):

  • select sum(muRA) from Object took 6 min 1 sec
  • select count(*) from Object where bMag > 0.2 took 6 min 48 sec
  • select count(*) from Object where b2Mag > 0.2 took 6 min 58 sec

More complex queries take longer (CPU bound), for example:

  • select min(ra), max(ra), min(decl), max(decl) from Object took 11 min 14 sec (that is ~307 MB/sec)

Note that the above queries were run before we built the indexes.

Test06 (Full index scan)

Hardware used: H1a

These tests were run with key_buffer_size = 8,384,512 (~8 MB).

  • Key_read_requests = 61,191,923,097
  • Key_reads = 530,890,936
  • so the ratio is about 1:115, i.e., roughly one physical key read per 115 key read requests (a relatively high hit rate); see the sketch below for how these counters are read
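
The counters above are standard MySQL status variables; a minimal sketch of how they and the configured key cache size can be checked (nothing here is taken from the actual test scripts):

-- MyISAM key cache size and index read counters
SHOW VARIABLES LIKE 'key_buffer_size';
SHOW GLOBAL STATUS LIKE 'Key_read_requests';
SHOW GLOBAL STATUS LIKE 'Key_reads';
-- here: 530,890,936 physical reads / 61,191,923,097 requests = ~1:115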

Test06a (using indexes)

select count(*) from Object where bMag>1

returns 1,118,411,265 rows (36% of all rows). Query plan: using where, using index.
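
The query plan quoted here is presumably read off EXPLAIN; a minimal way to check it (same query as above):

-- Confirm the count is answered from the bMag index alone
EXPLAIN SELECT COUNT(*) FROM Object WHERE bMag > 1;
-- the Extra column should show "Using where; Using index"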

CPU ~25-35% busy. Based on iostat: ~14 MB/sec (~1.75 MB/sec/ssd), ~3,300-3,400 reads/sec (~418 reads/sec/ssd). The query took 36 min 21 sec to complete. Sanity check: (36*60+21) sec * ~14 MB/sec = ~30 GB read, and 30 GB / (3,350 reads/sec * (36*60+21) sec) = ~4,106 bytes per read; MySQL's default index block size is 4,096 bytes, so indeed it was a full index scan, and it all fits :-)

select count(*) from Object where b2Mag>10

returns 2,502,471,192 rows (80% of all rows). Query plan: using where, using index.

It took 1 hour 7 min 48 sec (based on the observed ~12 MB/sec that is ~48 GB read; not quite sure why it took so long).

select count(*) from Object where bMag> 25

returns 533K rows and finishes in 0.7 sec.

select count(*) from Object where b2Mag> 25

returns 4.5M rows and finishes in 1.2 sec

select count(*) from Object where bMag> 20

returns 391M rows, and finishes in 11 min 52 sec

Test06b (not using index)

The following tests were run after disabling the keys (alter table Object disable keys).
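
For reference, a sketch of the statement pair involved (the ENABLE KEYS step is an assumption here, added to show how the indexes are rebuilt after the test):

-- Mark the non-unique indexes as disabled; the optimizer then ignores them,
-- so the count(*) queries below fall back to full table scans
ALTER TABLE Object DISABLE KEYS;

-- Rebuild / re-enable the non-unique indexes when done
ALTER TABLE Object ENABLE KEYS;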

select count(*) from Object where bMag> 25

which used to take 0.7 sec using full index scan, now takes 6 min 26 sec (it is a full table scan).

iostat reports ~9,000 reads/sec and ~575MB/sec

select count(*) from Object where b2Mag>10

which used to take over an hour using full index scan, now takes 7 min 10 sec.

Test07 (Concurrency)

Hardware used: H1a

The test was run with 8 identical tables, 0.5 billion rows per table (34 GB MYD, 19 GB MYI per table). The tables were created as follows: two tables (Object1 and Object2) were created simultaneously, and then their files were copied to create the remaining six.

Test07a (speed of non-concurrent queries)

Hardware used: H1a

Running one query at a time on a quiet system:

  • select count(*) from Object1 ignore index(bMag) where bMag>0.2 - 1 min 4.41 sec
  • select count(*) from Object2 ignore index(bMag) where bMag>0.2 - 1 min 5.48 sec
  • select count(*) from Object5 ignore index(bMag) where bMag>0.2 - 1 min 5.89 sec

close to 100% CPU, iostat reports ~9.2K reads/sec, ~1.1M rsectors/sec

Test07b (concurrent queries)

Hardware used: H1a

#queries  avg time [sec]  ~MB/sec  ~r/sec  CPU [%]
1          66              523      9000    99
2         267              260      4400    80
3         359              266      4500    78
4         510              300      5200    78
5         616              276      4700    75
6         707              290      4900    73
7         845              280      4700    72
8         816              280      4700    74

MB/sec and r/sec are as reported by iostat (for the entire array). %CPU: 100% corresponds to 1 busy core (16 cores available).

See ssd-dashio-tests.ods (concurrency sheet)

Timing:

  • 1: 1 min 5.77 sec
  • 2: 4 min 27.29 sec, 4 min 26.61 sec
  • 3: 5 min 52.24 sec, 6 min 0.09 sec, 6 min 4.39 sec
  • 4: 8 min 33.52 sec, 8 min 27.44 sec, 8 min 32.98 sec, 8 min 29.79 sec
  • 5: 10 min 18.10 sec, 10 min 11.49 sec, 10 min 18.44 sec, 10 min 16.36 sec, 10 min 18.25 sec
  • 6: 11 min 53.91 sec, 12 min 0.02 sec, 11 min 38.89 sec, 11 min 20.97 sec, 11 min 47.58 sec, 11 min 58.81 sec
  • 7: 14 min 15.29 sec, 14 min 15.74 sec, 14 min 13.27 sec, 14 min 1.62 sec, 14 min 16.18 sec, 14 min 8.81 sec, 13 min 29.01 sec
  • 8: 13 min 55.76 sec, 12 min 17.83 sec, 13 min 39.59 sec, 14 min 6.87 sec, 14 min 11.26 sec, 13 min 6.17 sec, 13 min 49.09 sec, 13 min 47.93 sec

ssd mounted as separate disks, no raid

Test08 (sequential scan)

  • query used: select count(*) from Object0 ignore index (bMag) where bMag>0.2
  • sequential full table scan (no index)

Test08a_ssd (no concurrency)

  • hardware used: H1b
  • data read from one disk
  • ~2 min 15 sec
  • 250MB/sec
  • so we were able to saturate disk bandwidth (per spec this drive can do 250 MB/sec sequential reading)

Test08a_spinning (spinning disk, no concurrency)

  • hardware used: H2
  • data read from one disk
  • 32 MB/sec (3 min 39 sec) or 42 MB/sec (2 min 47 sec), depending on the table
  • BTW, in both cases the MYD files are not fragmented (according to the filefrag tool)

Test08b (concurrent)

  • hardware used: H1b
  • 4 queries run, each query reads table from different disk
  • timing: 2 min 51.46 sec, 2 min 23.06 sec, 2 min 23.53 sec, 2 min 23.67 sec
  • observed ~250MB/sec *per disk*, 1 GB/sec total

Test9 (index scan / random access)

  • According to the spec, the disk can deliver 35K IOPS (using 32 streams), or ~15.4K IOPS (single stream, see H1b)
  • query used: select count(*) from Object0 where bMag>0.2 (it is a pure index scan; we removed the MYD file and the query still worked)
  • query selectivity is 33% (166,453,323 rows out of 500 million)
  • MYI size = 18,902,020 KB (~18 GB)

Test9a_ssd (no concurrency)

  • Hardware used: H1b
  • 1 query, no concurrency, data read from one SSD disk
  • 3.8K reads per sec (which is ~25% of the max single-stream speed of ~15.4K)
  • ~13 MB/sec
  • cpu very lightly loaded (~20%)
  • takes ~7-8 min to run the query
  • so we see only 25% of theoretical limit

Test9a_spinning (no concurrency)

  • Hardware used: H2
  • 1 query, no concurrency, data read from one spinning disk
  • ~0.18K reads per sec
  • ~ 0.8 MB/sec
  • cpu almost not used (<2%)

Test9b2_ssd (concurrent)

  • Hardware used: H1b

2 queries running concurrently against the same disk (each query reads from different table):

  • 6.3K reads per sec
  • ~25 MB/sec
  • cpu ~40%
  • Takes ~9-11 min per query

Test9b4_ssd (concurrent)

  • Hardware used: H1b

4 queries running concurrently against the same disk (each query reads from different table):

  • 11.1K reads per sec, gradually going down to 6K
  • 44 MB/sec
  • cpu ~86%
  • takes ~15 min (14 min 41 sec fastest, 16 min 32 sec slowest)

Test9b8_ssd (concurrent)

  • Hardware used: H1b

8 queries

  • 14.2K IOPS, 56 MB/sec, ~250% CPU
  • 1 min 55 sec +/- 1 sec
  • we suspect that some data was cached in RAM

Test9b16_ssd (concurrent)

  • Hardware used: H1b

16 queries:

  • used smaller data set (7916559KB MYI), all 16 tables on the same disk
  • run 2 mysqld servers, each server served 8 tables
  • 3 min 12 sec fastest, 3 min 53 sec slowest
  • 31K IOPS (~90% of the max speed of 35K IOPS), 124 MB/sec

Test9b16_RAM

  • Hardware used: H1b
  • used smaller data set (7,916,559 KB MYI), 16 tables on the same disk
  • 1 query reading the RAM-cached index takes 35 sec, see [SSDtests#Test07a_RAMnoconcurrency]
  • 16 queries reading the RAM-cached index = 1600% CPU, 3 min 32 sec +/- 1 sec per query - ~6x slower!

Test9c_ssd (effect of key_buffer)

  • Hardware used: H1b

The same 4-query test as in Test9b4_ssd, but with a tiny key_buffer_size (4 MB instead of 4 GB); see the sketch after this list:

  • 13.1K reads
  • 52 MB/sec
  • cpu 96%
  • takes ~13 min (12 min 52 sec fastest, 13 min 46 sec slowest)
  • ratio about 10:1 key_read_request:key_read in both cases
  • so key_buffer does not play important role here
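
A minimal sketch of how the key cache can be shrunk for such a comparison (the 4 MB value is from the text above; the exact statement used in the test is an assumption):

-- Shrink the MyISAM key cache to 4 MB for the comparison run (it was 4 GB)
SET GLOBAL key_buffer_size = 4 * 1024 * 1024;
-- the key_read_request : key_read ratio is then checked as in Test06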

Test10 (fts vs index scan)

  • Hardware used: H1b

Test10a (100m row table)

  • 100m row table
  • MYD size: ~7GB (7300000000), MYI size: 3878952960
  • A sequential (full table) scan of this table takes 28.35 sec (the query pair compared at each cut is sketched at the end of this subsection)
bMag cut  #rows selected  selectivity [%]  index scan (ssd) [sec]
 1.0      44,417,626      44               114.5 (4.0x slower)
20.0      19,320,034      19                54.6 (1.9x slower)
20.55      9,939,990      10.0              29.9 (comparable)
20.75      6,593,188       6.5              20.16 (1.4x faster)
21.0       2,687,302       2.7               8.63 (3.3x faster)
21.5         520,595       0.5               1.79 (15.8x faster)
24.0          79,571       0.08              0.24 (118x faster)
  • For selectivity above ~10% full table scan becomes more efficient.
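
The two timings compared at each cut are presumably a forced full table scan (the 28.35 sec quoted above) vs. the normal index scan. A sketch of the query pair for one cut, using the same ignore-index idiom as elsewhere on this page; the table name Object100m is hypothetical (the actual 100m-row table name is not given):

-- Full table scan baseline, ~28.35 sec regardless of the cut
SELECT COUNT(*) FROM Object100m IGNORE INDEX(bMag) WHERE bMag > 21.0;
-- Index scan for the same cut; 8.63 sec at ~2.7% selectivity (~3.3x faster)
SELECT COUNT(*) FROM Object100m WHERE bMag > 21.0;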

Test10b (30m row table)

  • 30m row table
  • MYD size: ~0.6GB (630000000), MYI size: 1040399360
  • Sequential scan there takes 3.96 sec
bMag cut  #rows selected  selectivity  index scan (ssd) [sec]
 1.0      13,331,269      44%          3.46 (1.1x faster)
19.0       8,941,268      30%          2.38 (1.7x faster)
20.0       6,058,366      20%          1.76 (2.2x faster)
20.3       6,058,366      15%          1.46 (2.7x faster)
20.5       3,487,034      12%          1.28 (3.1x faster)
20.6       2,968,015      10%          1.04 (3.8x faster)
20.7       2,325,448       8%          0.89 (4.4x faster)
20.8       1,644,652       5%          0.79 (5.0x faster)
20.9       1,120,081       4%          0.58 (6.8x faster)
21.0         719,542       2.4%        0.58 (6.8x faster)
  • Even for selectivity ~45% full index scan is faster.

Hardware used

H1

  • dashio-1-2 (provided by SDSC)
  • 16 core Intel(R) Xeon(R) CPU E5530 @ 2.40GHz
  • 64 GB RAM
  • XFS filesystem

H1a

  • 8 Intel X25E 64GB SLC flash SSDs
  • locally attached as a single drive
    • 4 SSDs go into an IO controller plugged into a PCIe slot; 2 out of 4 PCIe slots are used
    • RAIDed together: software RAID, RAID-0, stripe size = 64K
  • raw read speed measured for the entire array (dd if=/mnt/ssd0/lsst/dummyFile.txt of=/dev/null) = 437 MB/s

H1b

  • 4 Intel X25M 120GB MLC flash SSDs
  • locally attached as 4 independent drives

H2

  • lsst-db1 (provided by SLAC)
  • Dual Core AMD Opteron(tm) Processor 270
  • 4 GB RAM
  • 24 locally attached drives
  • write speed (dd if=/dev/zero of=/u12/dummyFile.txt bs=1048576 count=4000), from iostat: ~10K IOPS, 41 MB/sec
  • read speed (dd if=/u12/dummyFile.txt of=/dev/null): 61.3 MB/sec