Last modified on 06/01/2011 09:22:33 AM

Subject: Proposal for new external threading package

Comments are in chronological order.

From: "Robyn Allsman" Date: Wed, 25 May 2011 09:27:25 -0700

Please comment on this tentative proposition for a new external package from Tim Axelrod:

"I'm about to make a multicore version of the selfCal solver, and would prefer to use something a bit higher level than raw threads to do so. Looks like the best choice is http://threadingbuildingblocks.org/, which is an open source package supplied by Intel. It works with both gcc and icc. This is just an early notice that I'm likely to propose it as an external package at the next TCT meeting."

Our next TCT meeting is next week on 1 June at 9am.

From: "Robyn Allsman" Date: Fri, 27 May 2011 10:40:49 -0700

Is there anyone with a preference for a specific multi-threading support library?

If there are no alternatives to be reviewed, then I will suggest that Tim move forward using the package at:

http://threadingbuildingblocks.org

in an exploratory manner to determine if the package can be integrated into the LSST stack without great trauma to the rest of the stack.

From: Kian-Tat Lim Date: Fri, 27 May 2011 10:58:55 -0700

If there are no alternatives to be reviewed

From a cursory inspection, I see the following as possible alternatives:

 1) Ban threads.  Already untenable, I think.
 2) Use POSIX/C pthreads interface only.
 3) Use Boost::Thread.
 4) Use TBB.
 5) Use OpenMP.
 6) Use Grand Central Dispatch.

From: "Robyn Allsman" Date: Fri, 27 May 2011 11:08:06 -0700

But do you have a preference? Tim has scanned the literature for options, hence his proposal. Have you, or a friend of a friend, used a particular library with which you or they were satisfied?

From: Jim Bosch Date: Fri, 27 May 2011 11:21:45 -0700

On 05/27/2011 10:58 AM, Kian-Tat Lim wrote:

If there are no alternatives to be reviewed

From a cursory inspection, I see the following as possible alternatives:

 1) Ban threads.  Already untenable, I think.
 2) Use POSIX/C pthreads interface only.
 3) Use Boost::Thread.
 4) Use TBB.
 5) Use OpenMP.
 6) Use Grand Central Dispatch.

I don't have a preference for any of these other than maybe the first, but it might be worth pointing out that making Python work happily with any thread interface is far from automatic.

I hope the intent is to only use threading in very specific cases that are highly localized - it'd be a lot of work to make the entire stack thread-safe.

From: Stephen Pietrowicz Date: Fri, 27 May 2011 13:35:52 -0500

I haven't used it on a project, but I'm friends with one of the Senior developers at Intel. He lives in my neighborhood.

I talked to him about it a couple of years ago, and even then it seemed like a package worth pursuing.

From: Stephen Pietrowicz Date: Fri, 27 May 2011 13:39:39 -0500

I wasn't clear enough: that friend is a Senior developer of TBB at Intel.

From: Paul Price Date: Fri, 27 May 2011 15:21:20 -0400

On Fri, 2011-05-27 at 10:58 -0700, Kian-Tat Lim wrote:

 1) Ban threads.  Already untenable, I think.
 2) Use POSIX/C pthreads interface only.
 3) Use Boost::Thread.
 4) Use TBB.
 5) Use OpenMP.
 6) Use Grand Central Dispatch.

At what level do we want to thread? Python or C++? If C++, do we want to thread tight loops, or larger pieces?

If Python, then I suggest we use the usual Python packages. If C++ tight loops, OMP. For larger pieces in C++, I dunno.

Having used it, I suggest avoiding option 2.

From: Serge Monkewitz Date: Fri, 27 May 2011 12:39:18 -0700

On May 27, 2011, at 10:58 AM, Kian-Tat Lim wrote:

If there are no alternatives to be reviewed

From a cursory inspection, I see the following as possible alternatives:

     1) Ban threads.  Already untenable, I think.

I think the only way we could ban threads is if we do still allow them implicitly (i.e. use something like OpenCL or Intel Array Building Blocks). Both of those are still in beta, and it's not clear if the latter is going to be open source.

     2) Use POSIX/C pthreads interface only.
     3) Use Boost::Thread.

I'd prefer 3 to 2, but it's still pretty low-level.

     4) Use TBB.

TBB looks nice. It supports high-level parallelization constructs (parallel_for, parallel_scan, parallel_reduce, pipeline) and containers that allow concurrent access (vector, queue, bounded queue, hash map, and unordered map). It appears to ship with a scalable memory allocator as well as a task scheduler that implements work stealing. From what I understand, Intel's OpenCL and Array Building Blocks implementations use it underneath their respective interfaces, so it's likely to be very well supported and tested.

     5) Use OpenMP.

I've only ever used this to parallelize simple things (which is really easy to do with OpenMP). I don't have the experience to say much else about it.

     6) Use Grand Central Dispatch.

This looks like it depends on language extensions to C++ which would require using an Apple version of GCC, or LLVM.

My vote would be for either 4 or 5. Tim - are there reasons you prefer TBB to OpenMP?

From: Mike Jarvis Date: Tue, 31 May 2011 11:59:27 -0400

My two cents are that OpenMP is really easy to drop in at only the one or two places in your code where you want to parallelize things. You can do more, obviously, but it's easy to get started by doing just the key places where threading will really help. It's also easy to have the code work either with the parallel stuff or as single-threaded code by simply putting #ifdef _OPENMP guards around the #pragma lines.
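As an illustration of the guard pattern described above, here is a minimal C++ sketch. The function `sumOfSquares` and its loop are hypothetical and are not taken from the LSST stack; the only OpenMP features assumed are the standard `parallel for` directive with a `reduction` clause and the `_OPENMP` macro that conforming compilers define when OpenMP is enabled. Built without OpenMP support, the guarded pragma is skipped and the loop runs serially.

```cpp
#include <vector>

// Hypothetical example: sum the squares of a vector's elements.
// The same source builds as parallel or single-threaded code,
// depending on whether the compiler defines _OPENMP.
double sumOfSquares(const std::vector<double>& values) {
    double total = 0.0;
    // A signed loop index keeps the loop acceptable to older
    // OpenMP versions, which require a signed integer counter.
    const long n = static_cast<long>(values.size());
#ifdef _OPENMP
    // Each thread accumulates a private partial sum; the partial
    // sums are combined into 'total' when the loop finishes.
#pragma omp parallel for reduction(+:total)
#endif
    for (long i = 0; i < n; ++i) {
        total += values[i] * values[i];
    }
    return total;
}
```

With gcc, passing `-fopenmp` enables the parallel path; omitting the flag produces the single-threaded build from the same source.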

If you want to do really complicated threading, then TBB is undoubtedly more powerful, but it looks like you have to rewrite a fair amount of code whenever you want to use it. So if you're designing something to be parallel from the get go, that's probably fine, but it looks a bit harder to convert existing code over to using threads.

From: Tim Axelrod Date: Tue, 31 May 2011 10:32:03 -0700

I agree with K-T's list of possibilities, though I wonder if OpenCL should be on it as well. My assessment of them:

On Fri, May 27, 2011 at 10:58 AM, Kian-Tat Lim <ktl@…> wrote:

If there are no alternatives to be reviewed

From a cursory inspection, I see the following as possible alternatives:


1) Ban threads. Already untenable, I think.

We have to use threads, I think. The only issue is our API for doing so.

2) Use POSIX/C pthreads interface only.

I have done this enough to hate it. It's really easy to make mistakes, and the resulting code is hard to read.

3) Use Boost::Thread.

I'm not very familiar with this one. A cursory glance suggests that it will be easier to use than pthreads, but still pretty low level.

4) Use TBB.

I like TBB for many of the same reasons that I like Eigen. It's well integrated into C++, and the code is easy to read and understand - it looks like C++. The only reason the designers were able to achieve this is that TBB is C++ specific. It couldn't look as nice as it does if they had set out to support Fortran and C as well. I found it easy to install on my Ubuntu box and get the examples going.



5) Use OpenMP.

I find all the #pragma stuff ugly, and not much easier to read than straight thread code.



6) Use Grand Central Dispatch.

Not being much of a Mac developer, I hadn't even heard of this before. It seems to be at a similar abstraction level as TBB, which I like. Though I see it's open source now, it's not clear to me how easily it moves out of the Apple world.



7) OpenCL

This would be an advantage mostly if we want to support GPUs down the road. I'm not really familiar with it, but I frequently hear skepticism about where it's going. The code is really tough to read (for me), and it would have the steepest learning curve.

My preference is still for TBB. I could live with OpenMP if there's a strategic reason to do so. GCD comes third for me, just because of my uncertainty about remaining Apple dependencies.

From: Robert Lupton the Good Date: Wed, 1 Jun 2011 16:48:14 +0100

I'm OK with Tim using TBB; it seems like a reasonable choice, and he'll be a great guinea pig. It seems more general than OpenMP (which we may well use for other purposes).