Last modified 7 years ago Last modified on 04/09/2012 11:55:55 PM

Database Performance for Forced Photometry

Tested performance of database using data produced based on SDSS stripe 82 DR8. Database location: lsst10. Test involved running different number of concurrent queries and measuring response time. Each query was selecting sources for a randomly picked bounding box, and returned ~1200-1300 rows. Executed queries looked like:

SELECT scisql_s2CPolyToBin(ra0, decl0, ra1, decl1, ra2, decl2, ra3, decl3) INTO @poly
  FROM Field 
  WHERE uniqueId = <uniqueId>;

CALL scisql.scisql_s2CPolyRegion(@poly, 20);

SELECT sourceId, ra, decl, htmId20 
  FROM Source AS s 
  INNER JOIN scisql.Region AS r ON (s.htmId20 BETWEEN r.htmMin AND r.htmMax) 
  WHERE scisql_s2PtInCPoly(ra, decl, @poly) = 1;

First test involved running 10,000 queries total, all on a single node (lsst9) and varying the level of parallelism. During each test all queries were started at exactly the same time, and there was no time delay introduced between them. Results:

queries per thread threads slowest elapsed time [sec] queries/sec
10,000 1 466.2 21.5
5,000 2 247.7 20.2
2,500 4 144.4 17.3
2,000 5 124.2 16.1
1,250 8 96.6 12.9
1,000 10 91.8 10.9
400 25 94.6 4.2
200 50 95.6 1.0
100 100 100.0 1.0

Based on the timer in the test program, the database response time was ~the same for each test (~0.2 sec), the limiting factor was threading in the python test program that run all threads.

Second test involved querying the database from two different nodes (lsst9 and lsst5), 100 threads per node. Measured elapsed time was comparable to the single-node test:

queries per thread threads slowest elapsed time [sec] queries/sec
100 50 x 2 machines 109.0 0.9

During this test, average query response time was up (approaching ~0.4 sec) - less time was spent in the python client, but the server response was slower. The limiting factor was server CPU. The server was completely cpu bound - all 24 cpus were 100% busy. "show processlist" showed 200 simultaneous queries processed by the server.

During all these tests there was almost no disk I/O activity. This is not unexpected, as the entire data set including indexes is < 8 GB in size, much smaller than RAM available on lsst10.

Related code: see git repo: contrib/forcedPhotDbDemo.git

About Data Ingest

Counts: ~200 million sources (~6GB + 2GB index), ~200K fields (~15 MB).

Loading the data took: ~35 min, sorting by htmId another 35 min.


The server server seems to be handling well 100-200 requests coming in simultaneously, and deliver results in well under a second, even under heavy load, provided there are no other activities competing for mysqld server time and memory. Ideally, it'd be useful to try querying from many more clients, but given in practice forced photometry code will crunch on the data for ~10 sec, thus there there will be ~10 sec delay between queries incoming from the same process, the load seen by the server should be much lower than that tested.