wiki:db/mtgs/mysql2013
Last modified 6 years ago Last modified on 04/26/2013 10:11:49 AM

Percona Live MySQL conference 2013

highlights

  • conference organized by Percona. Smaller crowd than used to be before Sun/Oracle purchase (~800 people maybe?). Oracle has its own mysql conference ("mysql connect"), Sept 21-23, allegedly last year it was very small
  • mysql software development and community looks very healthy
  • 3 competing camps: oracle, mariaDB+SkySQL (original mysql founders behind), percona
  • oracle investing heavily in mysql, but only in innodb engine
  • mariaDB and skySQL merging (the 2 mysql cofounders are behind: Monty behind former, David behind later)
  • mariaDB rapidly gaining traction
  • percona strengthen their presence through organizing the conference
  • drizzle more or less dead (no real users, no new effort)
  • tokutek made their engine open source (it is primarily for rapid concurrent writes, not very useful for us)

oracle's mysql

  • 5.5 big success, 5.6 looks very good too
  • new feature: read only innodb tables (no transactional overhead). Storage overhead still there
  • new feature: direct access from memcached to innodb
  • multi-core scalability fixed (up to 48 cores linear), 3x faster, but only innodb
  • wants to match performance of myisam
  • no plans to improve myisam (I asked explicitly)

some new interesting features in 5.6:

  • performance schema: captures statistics about how data used etc
  • added mysql hadoop applier. The idea is that the "applier" reads the binary log (produced by mysql for replication purposes) and insert the data into hadoop in real time as if you were replicating data into a slave mysql replica.

mariaDb

  • many major linux distros ship with mariaDb as default (fedora, opensuse, ...)
  • wikipedia just switched to mariaDb distribution
  • big new effort at google to consolidate database work into mariadb (based on a lunch conversation, not confirmed, it wasn't announced)
  • multi-core scalability fixed in the mariaDb-distibution of myisam, (oracle didn't bother to pick the fix yet). Solution: use multiple key-cache-segments, newly introduced variable.
    • not ported to Aria yet, so no point in using Aria
    • Aria is a new storage engine that is supposed to be a better myisam, but work stalled when oracle purchased Sun. It has two modes: row-based and page-based. The former is like myisam, the later is transactional, crash safe, 10% storage overhead, implementation not finished
  • much better optimizer in mariadb-distribution of 5.3 - 10 man-years effort (3 people x 3+ years). Oracle didn't port that in yet.
  • new feature: can limit memory usage per thread
  • investing in storage layer, wants to add more engine-independent features (as opposed to oracle strategy: invest purely in innodb)
  • dynamic columns: different rows have different set of columns
    • but be warned: issue with interoperability with other storage engines or dbmses
  • new storage engine "CONNECT" (only in MariaDB 10.0 for now). Looks very interesting
    • The CONNECT storage engine enables MariaDB to directly use external data as if it were tables in the server without actually being obliged to load them into the physical database
    • can talk csv, xml, dbf, odbc
    • You can also build a table based on a list of tables (equivalent of the MERGE storage engine). These table list (TBL) can be built on any type of table including any CONNECT table. This means that you can build a table whose data is distributed across multiple remote servers. Can join data between mariaDB / cassandra / oracle...
    • new, very compact indexes (block based, bit map used to determine if given block have given value(s)). If many predicates in where, such index can speed up query a lot without looking at data
    • more at:
  • I am scheduling a meeting with Monty at slac 2nd part of July
  • we can hire a "virtual developer" from MariaDB. They'd assign best expert for each task we need them to implement. Interesting idea to consider

innodb vs myisam?

  • all web companies need transactions, they use innodb
  • facebook, google and oracle strongly push development of innodb
  • 5-10% of community uses myisam, the rest is on innodb
  • myisam still very useful for non-transactional data loads
  • Monty very much interested in improving myisam (and working with us, I think he sees us and the place that could push the limits of myisam the way facebook/google push innodb)

mysql gateway (per Monty)

views?

  • potential issues with views all fixed in 5.3 (per Monty)

preventing full-disk problems?

  • one other important customer asked, so will implement now (per Monty)

configuring myisamchk (per Monty)

  • set key_buffer_size to something very small, not used at all
  • read and write buffers should be small, few MB
  • sort buffer size should be huge, eg 50GB on 128 GB machine
    • I showed him error message and we debugged it, identified 2 bugs that limited mysql to using 4G RAM, Monty will send me a patched version of myisam (mariadb distribution) to test next week

merging large myisam tables into one?

  • Use merge engine. Very stable, used in production by many places (don't confuse with federated engine)
    • the only issues (if you delete table and replace with different table with the same name, that has been recently patched).
  • Or try the new connect engine, but it is not production yet, mariaDb 10.0.1 has it, it is in alpha

parelastic

  • new storage engine (commercial)
  • architecture: data nodes, processing nodes,
  • need to move data to processing nodes to do things like join (not that great...)
  • also, no hooks for shared scans

interesting approach to key/value in database:

  • serialize - use binary encoding, like msgpack or protobuf
  • can even implement flexible schema (arbitrary attributes), and own schema versioning
  • mix this with "secondary" indexes for rapid access to selected attributes
  • generating 64-bit unique ids in sharded systems?
    • see twitter's snowflake project
  • see slides

mysql and cassandra

our extensions (eg in scisql)

  • Monty interested in discussing them with the GIS team (GIS for mysql was recently revamped), and plugging them into mariaDB and supporting

See also conference website