2008-05-29

Concurrent programming: It’s not about the language, it’s the framework


There’s a huge discussion on the web about concurrent programming. We now have 4-core processors, and that number will double every few years. The problem is that programmers don’t know how to use multiple CPUs.

There are several approaches that address this issue:

  • Intel is developing a compiler that’s going to automatically parallelize software
  • people in the Python community are developing an extension that lets Python use multiple CPUs through a threads-like API (normal Python threads use only one CPU at a time - see the GIL)
  • there are many extensions to C that enable easy writing of parallel software
  • Java has built-in threading support
  • everyone is admiring Haskell’s support for multiple CPUs
  • some people believe that Software Transactional Memory is the parallel processing silver bullet

I wonder if we need parallelization at this level at all. Maybe the next level over “one processor” is not “multiple processors” but rather “multiple machines”. Here are the strategies that are popular nowadays:

Erlang as a language is horrible. It’s a language for really determined programmers, because the learning curve is so steep. But Erlang’s framework is excellent. You can easily scale over many machines, and with Erlang message passing you can accomplish more than with a thousand lines in other languages.

I believe that the Erlang framework ideas aren’t tied to the Erlang language. I’d love to have such a powerful framework for other languages.
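
To make the idea concrete, here is a rough sketch of the message-passing style in Python. It is not how Erlang works internally, and the names spawn, send, square_worker and squares are made up for this illustration. It uses threads and in-process queues, so it shows only the programming model: workers share no data and talk only through mailboxes. It gives no real multi-CPU or multi-machine scaling by itself.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel that tells a worker to exit


def spawn(handler):
    """Start a worker with its own mailbox and return that mailbox."""
    mailbox = queue.Queue()

    def loop():
        while True:
            message = mailbox.get()  # "receive": block until a message arrives
            if message is STOP:
                break
            handler(message)

    threading.Thread(target=loop, daemon=True).start()
    return mailbox


def send(mailbox, message):
    """Deliver a message; the sender never touches the worker's state."""
    mailbox.put(message)


def square_worker(message):
    """Handle one request of the form (reply_mailbox, number)."""
    reply_to, number = message
    send(reply_to, number * number)


def squares(numbers):
    """Ask a single worker to square each number and collect the replies."""
    worker = spawn(square_worker)
    replies = queue.Queue()
    for n in numbers:
        send(worker, (replies, n))
    # One worker handles its mailbox in FIFO order, so replies keep the order.
    results = [replies.get(timeout=5) for _ in numbers]
    send(worker, STOP)
    return results


if __name__ == "__main__":
    print(squares([1, 2, 3]))
```

Because nothing is shared, there are no locks in the calling code. In a real framework the mailbox could sit in another process or on another machine without the callers of send changing, which is the property that makes the Erlang approach attractive.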

Maybe we should skip the “multiple processors” phase and learn to use “multiple machines” technologies right now.


3 comments:

Anonymous said...

Majek,
Pervasive Software, out of Austin, is taking another approach and has created a Java framework called DataRush that developers can use to create highly parallel applications w/o any concurrent programming knowledge. I think before skipping the “multiple processors” phase we should see how projects like the Intel App Dev. Toolkit that you mentioned and DataRush framework help us solve the concurrent programming problems.

See here for more on DataRush:
http://www.pervasivedatarush.com/product/faq

majek said...

Thanks for the marketing.

I doubt that any kind of automatic, magic threading solution is going to be widely adopted. Maybe I'm wrong, we'll see soon.

I think that the problem is with the data. When the shared data gets big enough you run into known problems: blocking locks, synchronization issues, etc. But wait! Locks were supposed to be the solution for writing concurrent software!

I want to say that without a paradigm shift we still have our current problems. You can try to minimize the impact of these issues on the programmer: write a nicer threading interface, create easier locks. But in my opinion you won't solve the problem this way.

Justin George said...

I agree, I'd like to see concurrency primitives built into a lot of languages from here on out. spawn, receive, and ! should be in every language.