Last modified on 02/04/2010 09:58:24 PM

Proposal to add new packages required by database servers

To achieve the required database scalability, we expect to use a set of open-source, off-the-shelf DBMS servers (such as MySQL), integrated by a custom distribution layer to form a shared-nothing architecture. We are developing the custom distribution layer ourselves; it lives in svn at DMS/qserv. It relies on several external packages, and using them saves us many man-months of coding. We propose to add these packages to the list of external packages required by LSST DM and to integrate them with the rest of the LSST DM software as appropriate.

Note that our layer, and the packages it depends on, will run on database servers only. Pipelines and end users will never see it; they will use regular MySQL tools to talk to our layer. It will be needed at the Main Archive Center and at every (large) Data Access Center.

xrootd / Scalla

The xrootd / Scalla software enables efficient access to data distributed across large clusters. We use xrootd as a distributed, fault-tolerant dispatch layer. It allows us to dynamically dispatch queries to workers, and retrieve results from them, based on partition availability. This shields our custom layer from tasks such as load balancing, worker state management, processing topology changes, and replica selection.

xrootd is written in C++ (more than 100k lines) by SLAC in collaboration with a few other labs, including CERN, INFN/Italy and BNL, and has been used in production by several large projects since 2001. It is released under the open-source BSD license. It does not depend on any external packages.

Website: http://xrootd.slac.stanford.edu/

MySQLProxy

MySQLProxy is a simple proxy that sits between the MySQL client and our layer, which in turn talks to multiple MySQL servers. We use it to intercept incoming queries and pass them to our layer, which optimizes them. The optimizations include rewriting each query as appropriate to take advantage of our custom partitioning, and executing the generated sub-queries in parallel.
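The rewrite-and-fan-out step described above can be illustrated with a minimal Python sketch. It is not the qserv implementation. The function names, the `Object` table, and the `<table>_<chunk>` naming scheme are assumptions made only for this example, and `fake_execute` stands in for sending a sub-query to a worker MySQL server.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def rewrite_for_chunks(query, table, chunk_ids):
    """Return one sub-query per chunk, replacing `table` with `table_<chunk>`.

    The `<table>_<chunk>` naming scheme is hypothetical.
    """
    pattern = re.compile(r"\b%s\b" % re.escape(table))
    return [pattern.sub("%s_%d" % (table, cid), query) for cid in chunk_ids]


def run_parallel(sub_queries, execute, max_workers=4):
    """Run each sub-query with `execute` in a thread pool.

    Results are returned in the same order as `sub_queries`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(execute, sub_queries))


def fake_execute(sql):
    # Stand-in for dispatching the sub-query to a worker and reading the result.
    return "ran: " + sql


if __name__ == "__main__":
    subs = rewrite_for_chunks(
        "SELECT COUNT(*) FROM Object WHERE ra > 10", "Object", [3, 7]
    )
    for line in run_parallel(subs, fake_execute):
        print(line)
```

A real rewriter would need a proper SQL parser rather than a regular expression, and would also have to merge the partial results (for example, summing the per-chunk counts).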

MySQLProxy is released under the GPLv2 license. Building it requires libevent 1.x or higher, lua 5.1.x or higher, glib2 2.6.0 or higher, pkg-config, libtool 1.5 or higher, and the MySQL 5.0.x or higher developer files.

Lua

MySQLProxy ships with an embedded Lua interpreter (version 5.1). Lua is used to define what to do with intercepted queries. Our Lua script uses an XML-RPC interface to communicate with our custom layer. For that reason we need the luaXMLRPC package, which depends on the Lua socket module. On Ubuntu 9.10 we ended up installing these three packages: liblua5.1-socket2, liblua5.1-xmlrpc0 and lua.
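The shape of the XML-RPC exchange between the proxy script and our layer can be sketched as follows. The real client is a Lua script running inside MySQLProxy; this stand-in uses Python's standard library for both ends so that it runs on its own. The `submit_query` method name and its reply format are hypothetical and do not reflect the actual qserv interface.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def submit_query(sql):
    """Hypothetical frontend method: accept a query and return a receipt."""
    return {"accepted": True, "length": len(sql)}


def start_server():
    """Start a local XML-RPC server on an ephemeral port in a background thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(submit_query, "submit_query")
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def call_frontend(port, sql):
    """Play the role of the proxy script: forward an intercepted query."""
    with ServerProxy("http://127.0.0.1:%d/" % port) as proxy:
        return proxy.submit_query(sql)


if __name__ == "__main__":
    server = start_server()
    try:
        print(call_frontend(server.server_address[1], "SELECT 1"))
    finally:
        server.shutdown()
        server.server_close()
```

In the deployed system the server side would be the query processing frontend rather than `SimpleXMLRPCServer`, and the client call would be made from Lua through luaXMLRPC.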

Twisted

"Twisted is an event-driven networking engine written in Python and licensed under the MIT license." Twisted provides a framework for building a REST-like HTTP and XML-RPC interface to our custom query processing frontend. Twisted's event model is said to perform better and scale further than direct use of Python's built-in socket daemons. It also takes care of a lot of TCP service-related boilerplate code that we would otherwise need to re-invent, re-debug, re-tune, and re-tweak. It is well maintained and under active development, partly because it appears to be one of the most popular frameworks for networked applications in Python.
