Monday, 4 May 2015

Challenges of parallel/concurrent programming

With this article, I'll talk about some of the issues facing programmers who develop concurrent or parallel programs. These days, multi-core machines are everywhere. Almost every computer shipped with a multi-core CPU, so I think it is an important topic for most programmers. I'll be using the word "parallelism" to refer to parallel/concurrent development.

Most of our known algorithms are sequential, but that is not useful for our parallel machines. One of the basic challenges is that Parallelism needs parallel algorithms and data structures which are fundamentally different from the sequential ones. Simply, There is no successful automated tool to convert a sequential algorithm to a parallel one. Human creativity is the only way to make such conversion.

Fortunately for us, some algorithms have what is called inherent parallelism where the algorithm can be broken into completely independent computations that can run in parallel. For example, a lot of image processing algorithms have this feature because each pixel is independent of the other, so those are easy to parallelize. On the other hand, we still don't have efficient parallel algorithms for a lot of problems, either because these problems are hard to break into tasks without causing an overhead of communication between them, or they have a complex data structure that we can't distribute across the available threads. {Programmers who use these algorithms, perhaps through libraries should expect them to run perfectly on the current devices, that's why it's an important challenge for the industry as well.}

Another challenge is the faulty non-deterministic results of the program. Parallelism is about speeding up on multi-core machines, but we must make sure that we still maintain the same right results every time we run the program; for example, multiplying two matrices should give us the same result every time. Some bugs that programmers are doomed to make has a non-deterministic feature, which means a program might run as the programmer expected but sometimes it will do otherwise. A death accidents caused by radiation therapy machine, Therac-25, was caused by such bugs. One example is race condition which happens when a wrong order of events cause incorrect results (Sometimes that wouldn't happennon-deterministic). Another example is data race which happens when two or more threads access the same memory location at the same time and at least one of them is writing without any synchronization between them (Sometimes that wouldn't happennon-deterministic). They're hard to debug because their occurrence is arbitrary, sometimes they happen, sometimes not, so debugging them is really hard and almost all released products have these hidden flaws.

Programming languages that encourage a purely functional paradigm, like Haskell, offer some help in this area [Before proceeding, I do have to say I am not an expert in any functional language]. Regular imperative languages and OOP (e.g., C++, C#, Java, etc) in which your program might has objects that you can call methods on them that would change their state; On the other hand, purely functional programming give us functions that simply has no state. It simply guarantees that calling the same function on the same variables will always return the same results (deterministic). This means no side effects, so there are no data races between different functions running in parallel. Of course, at some point, you need to create effects and change the state, but functional programming makes that a hard choice for programmers.

Moreover, Functional languages like Haskell makes a clear difference between concurrency and parallelism. Concurrency is not about harnessing the power of parallel machines, but it's part of the program's design to run on multiple threads. For example, a server responding to clients. The server must run on multiple threads to serve multiple clients, it's part of the program's design. Popular imperative languages fail to make such difference, and they use the same programming approach for concurrency and parallelism. Haskell seems to do a good job in this area.

Still, I think OOP and imperative programming paradigm would appeal to a wider variety of people and would do a simpler job in most areas. For example, video game development is a simpler task when a language like C++ is used, due to the extensive real-time state change of the different objects in a video game. There's another programming paradigm called Data Parallelism that distributes or maps data across different computational processes, a famous example is Map-Reduce model which is a good solution when it comes to big data. Apparently, there are different programming approaches that should be used for different problems, and these different approaches are not available in a single programming language.

Another issue is the lack of a clear cost model, which is a way to give the programmer an idea of how much an operation could cost in resources (time and space), or sometimes the cost model is too simple and ignore a lot of important details. Cost models are important because they inform the programmer which parts of their program worth parallelizing and which is not in order to reach an optimal use of the machine. Cost models are hard to form theoretically. Thus, usually, a dynamic program analysis on the targeted machine is needed. In other words, costs are better understood by executing your program on the targeted machine.

Last put not least, problems related to different machine architectures. Abstracting the programming languages from the hardware provides a lot of benefits and ease to programmers, but sometimes programmers should be exposed to the hardware. For example, different CPUs need a different number of threads. having more threads than needed may degrade performance or just increase power consumption. That might be caused by problems like over switching between threads on the same core, saturated off-chip bandwidth, or competing for shared resources where a lot of threads ends up waiting for a single process to finish from a critical section. Solutions to such problems could be achieved by providing a feedback while the program running to control the number of threads or how we map data to the available threads. Something that would be of interest is granularity of your software, which is related to the ratio of computation to the amount of communication (read more here). Granularity is an important topic that programmers porting programs for completely different architectures (MIMD, SIMD ...etc) should worry about. Furthermore, both parallel and serial programmers should care about locality of data by keeping accessed data close to each other in memory in order to optimise cache use.

All in all, A single solution for all these challenges is not available yet, but different solutions are scattered between different technologies. Maybe someday we will have a language that could tackle all of these different challenges. In the meanwhile, let's embrace diversity.

No comments:

Post a Comment