
Why Concurrency Is Hard in Golang


Concurrent code is notoriously difficult to get right. It usually takes a few iterations to get it working as expected, and even then it’s not uncommon for bugs to exist in code for years before some change in timing (heavier disk utilization, more users logged into the system, etc.) causes a previously undiscovered bug to rear its head.

Fortunately, everyone runs into the same issues when working with concurrent code. Because of this, computer scientists have been able to label the common ones, which allows us to discuss how they arise, why, and how to solve them.

So let’s get started. Following are some of the most common issues that make working with concurrent code both frustrating and interesting.

Race Conditions

A race condition occurs when two or more operations must execute in the correct order, but the program has not been written so that this order is guaranteed to be maintained.

Most of the time, this shows up in what’s called a data race, where one concurrent operation attempts to read a variable while at some undetermined time another concurrent operation is attempting to write to the same variable.

Here is a basic example of a race condition, specifically a data race:

package main

import (
    "fmt"
)

func main() {
    var edata int
    go func() {
        edata++
    }()

    if edata == 0 {
        fmt.Printf("the value of edata is %d.\n", edata)
    }
}

In Go, you can use the go keyword to run a function concurrently. Doing so creates what’s called a goroutine.

Here, the goroutine’s increment (edata++) and the if statement are both trying to access the variable edata, but there is no guarantee what order this might happen in. There are three possible outcomes to running this code:

  • Nothing is printed. In this case, the increment was executed before the if check.
  • “the value of edata is 0.” is printed. In this case, the if check and the Printf were both executed before the increment.
  • “the value of edata is 1.” is printed. In this case, the if check was executed before the increment, but the increment was executed before the Printf.

As you can see, just a few lines of incorrect code can introduce tremendous variability into your program.

Most of the time, data races are introduced because the developers are thinking about the problem sequentially. They assume that because a line of code falls before another, it will run first. They assume the goroutine above will be scheduled and execute before the edata variable is read in the if statement.

When writing concurrent code, you have to meticulously iterate through the possible scenarios. You have no guarantees that your code will run in the order it’s listed in the source code. I sometimes find it helpful to imagine a large period of time passing between operations. Imagine an hour passes between the time when the goroutine is invoked, and when it is run. How would the rest of the program behave? What if it took an hour between the goroutine executing successfully and the program reaching the if statement? Thinking in this manner helps me because to a computer, the scale may be different, but the relative time differentials are more or less the same.

Indeed, some developers fall into the trap of sprinkling sleeps throughout their code exactly because it seems to solve their concurrency problems. Let’s try that in the preceding program:

package main

import (
    "fmt"
    "time"
)

func main() {
    var edata int
    go func() {
        edata++
    }()
    time.Sleep(1 * time.Second) // this is bad!
    if edata == 0 {
        fmt.Printf("the value of edata is %d.\n", edata)
    }
}

Have we solved our data race? No. In fact, it’s still possible for all three outcomes to arise from this program, just increasingly unlikely. The longer we sleep between invoking our goroutine and checking the value of edata, the closer our program gets to correctness, but it only asymptotically approaches logical correctness; it will never be logically correct.

In addition, we’ve now introduced an inefficiency into our algorithm: we have to sleep for a full second just to make it more likely we won’t see the data race. If we used the correct tools, we might not have to wait at all, or the wait could be only microseconds.

The takeaway here is that you should always target logical correctness. Introducing sleeps into your code can be a handy way to debug concurrent programs, but they are not a solution.

Race conditions are one of the most insidious types of concurrency bugs because they may not show up until years after the code has been placed into production. They are usually precipitated by a change in the environment the code is executing in, or an unprecedented occurrence. In these cases, the code seems to be behaving correctly, but in reality, there’s just a very high chance that the operations will be executed in order. Sooner or later, the program will have an unintended consequence.

Atomicity

When something is considered atomic, or to have the property of atomicity, this means that within the context in which it is operating, it is indivisible, or uninterruptible.

So what does that really mean, and why is it important to know when working with concurrent code?

The first thing that’s very important is the word ‘context’. Something may be atomic in one context, but not another. Operations that are atomic within the context of your process may not be atomic in the context of the operating system; operations that are atomic within the context of the operating system may not be atomic within the context of your machine; and operations that are atomic within the context of your machine may not be atomic within the context of your application. In other words, the atomicity of an operation can change depending on the currently defined scope.

When thinking about atomicity, very often the first thing you need to do is to define the context, or scope, the operation will be considered to be atomic in.

Memory Access Synchronization

Let’s say we have a data race: two concurrent processes are attempting to access the same area of memory, and the way they are accessing the memory is not atomic. Our previous example of a simple data race will do nicely with a few modifications:

package main

import (
    "fmt"
)

func main() {
    var edata int
    go func() {
        edata++
    }()
    if edata == 0 {
        fmt.Printf("the value of edata is %d.\n", edata)
    } else {
        fmt.Printf("the value of edata is %d.\n", edata)
    }
}

We’ve added an else clause here so that regardless of the value of edata we’ll always get some output. Remember that as it is written, there is a data race and the output of the program will be completely nondeterministic.

In fact, there’s a name for a section of your program that needs exclusive access to a shared resource: a critical section. In this example, we have three critical sections:

  • Our goroutine, which is incrementing the edata variable.
  • Our if statement, which checks whether the value of edata is 0.
  • Our fmt.Printf statement, which retrieves the value of edata for output.

There are various ways to guard your program’s critical sections, and Go has some better ideas on how to deal with this, but one way to solve this problem is to synchronize access to the memory between your critical sections.

The following code is not idiomatic Go (and I don’t suggest you attempt to solve your data race problems like this), but it very simply demonstrates memory access synchronization. If any of the types, functions, or methods in this example are foreign to you, that’s okay. Focus on the concept of synchronizing access to the memory by following the numbered notes after the code:

package main

import (
    "fmt"
    "sync"
)

func main() {
    var mc sync.Mutex
    var edata int
    go func() {
        mc.Lock()
        edata++
        mc.Unlock()
    }()
    mc.Lock()
    if edata == 0 {
        fmt.Printf("the value of edata is %d.\n", edata)
    } else {
        fmt.Printf("the value of edata is %d.\n", edata)
    }
    mc.Unlock()
}
  1. With var mc sync.Mutex, we add a variable that will allow our code to synchronize access to the edata variable’s memory.
  2. With the first mc.Lock() call, we declare that until we declare otherwise, our goroutine should have exclusive access to this memory.
  3. With the following mc.Unlock(), we declare that the goroutine is done with this memory.
  4. With the second mc.Lock() call, we once again declare that the following conditional statement should have exclusive access to the edata variable’s memory.
  5. With the final mc.Unlock(), we declare we’re once again done with this memory.

In this example we’ve created a convention for developers to follow. Anytime developers want to access the edata variable’s memory, they must first call Lock, and when they’re finished they must call Unlock. Code between those two statements can then assume it has exclusive access to edata; we have successfully synchronized access to the memory. Also note that if developers don’t follow this convention, we have no guarantee of exclusive access.

You may have noticed that while we have solved our data race, we haven’t actually solved our race condition! The order of operations in this program is still nondeterministic; we’ve just narrowed the scope of the nondeterminism a bit. In this example, either the goroutine will execute first, or the conditional (and whichever branch it selects) will. We still don’t know which will occur first in any given execution of this program.

Deadlocks, Livelocks and Starvation

The previous sections have all been about program correctness: if these issues are managed correctly, your program will never give an incorrect answer. Unfortunately, even if you successfully handle these classes of issues, there is another class to contend with: deadlocks, livelocks, and starvation. These issues all concern ensuring your program has something useful to do at all times. If not handled properly, your program could enter a state in which it stops functioning altogether.

Deadlock

A deadlocked program is one in which all concurrent processes are waiting on one another. In this state, the program will never recover without outside intervention.

If that sounds grim, it’s because it is! The Go runtime attempts to do its part and will detect some deadlocks (when all goroutines are blocked, or ‘asleep’), but this doesn’t do much to help you prevent deadlocks.

Livelock

Livelocks are programs that are actively performing concurrent operations, but these operations do nothing to move the state of the program forward.

Have you ever been in a hallway walking toward another person? She moves to one side to let you pass, but you’ve just done the same. So you move to the other side, but she’s also done the same. Imagine this going on forever, and you understand livelocks.

Starvation

Starvation is any situation where a concurrent process cannot get all the resources it needs to perform work.

When we discussed livelocks, the resource each goroutine was starved of was a shared lock. Livelocks warrant discussion separate from starvation because in a livelock, all the concurrent processes are starved equally, and no work is accomplished. More broadly, starvation usually implies that there are one or more greedy concurrent processes that are unfairly preventing one or more other concurrent processes from accomplishing work as efficiently as possible, or maybe at all.

Conclusion

In this article we covered why concurrency is hard to handle and the aspects of this, such as data races, atomicity, deadlocks, livelocks, and starvation.

Your comments are appreciated, and if you want to see your articles on this platform, please shoot a mail to kusingh@programmingeeksclub.com.

Thanks for reading 🙂

