Go: Multi-threaded writing to a CSV file

As part of a Go script I’ve been working on, I wanted to write to a CSV file from multiple goroutines, but realised that the built-in CSV writer (csv.Writer from encoding/csv) isn’t safe for concurrent use.

My first attempt at writing to the CSV file looked like this:

package main

import (
	"encoding/csv"
	"log"
	"os"
	"strconv"
)

func main() {
	csvFile, err := os.Create("/tmp/foo.csv")
	if err != nil {
		log.Panic(err)
	}
	defer csvFile.Close()

	w := csv.NewWriter(csvFile)
	w.Write([]string{"id1", "id2", "id3"})

	count := 100
	done := make(chan bool, count)

	for i := 0; i < count; i++ {
		go func(i int) {
			// Unsynchronised: every goroutine writes to the same csv.Writer.
			w.Write([]string{strconv.Itoa(i), strconv.Itoa(i), strconv.Itoa(i)})
			done <- true
		}(i)
	}

	// Wait for all goroutines to finish before flushing.
	for i := 0; i < count; i++ {
		<-done
	}
	w.Flush()
}

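This script should write a header followed by one row for each number from 0 to 99, with the number repeated three times on its row. Some rows in the file are written correctly but, as we can see below, others are corrupted because the goroutines write into the writer’s shared buffer at the same time: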

40,40,40
37,37,37
38,38,38
18,18,39
^@,39,39
...
67,67,70,^@70,70
65,65,65
73,73,73
66,66,66
72,72,72
75,74,75,74,75
74
7779^@,79,77
...

One way that we can make our script safe is to use a mutex whenever we’re calling any methods on the CSV writer. I wrote the following code to do this:

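package main

// log and strconv are used by main, shown further down.
import (
	"encoding/csv"
	"log"
	"os"
	"strconv"
	"sync"
)

// CsvWriter wraps csv.Writer with a mutex so that it can be shared
// between goroutines.
type CsvWriter struct {
	mutex     *sync.Mutex
	csvWriter *csv.Writer
}

func NewCsvWriter(fileName string) (*CsvWriter, error) {
	csvFile, err := os.Create(fileName)
	if err != nil {
		return nil, err
	}
	w := csv.NewWriter(csvFile)
	return &CsvWriter{csvWriter: w, mutex: &sync.Mutex{}}, nil
}

// Write appends one row; the lock keeps rows from interleaving.
func (w *CsvWriter) Write(row []string) error {
	w.mutex.Lock()
	defer w.mutex.Unlock()
	return w.csvWriter.Write(row)
}

// Flush pushes buffered rows to the file and reports any write error.
func (w *CsvWriter) Flush() error {
	w.mutex.Lock()
	defer w.mutex.Unlock()
	w.csvWriter.Flush()
	return w.csvWriter.Error()
}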

We create a mutex when NewCsvWriter instantiates CsvWriter and then lock it in the Write and Flush methods so that only one goroutine at a time can access the underlying csv.Writer. We then tweak the initial script to use this type instead of calling csv.Writer directly:

func main() {
	w, err := NewCsvWriter("/tmp/foo-safe.csv")
	if err != nil {
		log.Panic(err)
	}

	w.Write([]string{"id1", "id2", "id3"})

	count := 100
	done := make(chan bool, count)

	for i := 0; i < count; i++ {
		go func(i int) {
			w.Write([]string{strconv.Itoa(i), strconv.Itoa(i), strconv.Itoa(i)})
			done <- true
		}(i)
	}

	for i := 0; i < count; i++ {
		<-done
	}
	w.Flush()
}

And now if we inspect the CSV file, every row has been written correctly (the order still varies, since the goroutines run concurrently):

...
25,25,25
13,13,13
29,29,29
32,32,32
26,26,26
30,30,30
27,27,27
31,31,31
28,28,28
34,34,34
35,35,35
33,33,33
37,37,37
36,36,36
...
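
Eyeballing the file works for 100 rows, but the same check can be done in code. The following is only a sketch under a few assumptions: safeWriter and countIntactRows are hypothetical names, the wrapper writes to an in-memory buffer rather than a file, and sync.WaitGroup stands in for the done channel purely for brevity. It writes the rows concurrently, parses the result back with csv.Reader, and counts the rows whose three fields match.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strconv"
	"sync"
)

// safeWriter mirrors the mutex-guarded wrapper from the post, but wraps a
// csv.Writer over any destination so output can be captured in memory.
// (Hypothetical type, introduced only for this check.)
type safeWriter struct {
	mu sync.Mutex
	w  *csv.Writer
}

func (s *safeWriter) Write(row []string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.w.Write(row)
}

func (s *safeWriter) Flush() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.w.Flush()
	return s.w.Error()
}

// countIntactRows writes count rows from count goroutines, reads the CSV
// back, and returns how many rows have three identical fields.
func countIntactRows(count int) (int, error) {
	var buf bytes.Buffer
	sw := &safeWriter{w: csv.NewWriter(&buf)}

	var wg sync.WaitGroup
	for i := 0; i < count; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s := strconv.Itoa(i)
			sw.Write([]string{s, s, s}) // any write error is reported by Flush
		}(i)
	}
	wg.Wait()
	if err := sw.Flush(); err != nil {
		return 0, err
	}

	rows, err := csv.NewReader(&buf).ReadAll()
	if err != nil {
		return 0, err // e.g. a row with the wrong number of fields
	}
	intact := 0
	for _, r := range rows {
		if len(r) == 3 && r[0] == r[1] && r[1] == r[2] {
			intact++
		}
	}
	return intact, nil
}

func main() {
	intact, err := countIntactRows(100)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d of 100 rows intact\n", intact)
}
```

Running the original, unsynchronised version under Go’s race detector (go run -race) is another way to surface the problem without inspecting the output at all.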

That’s all for now. If you have any suggestions for a better way to do this, do let me know in the comments or on Twitter – I’m @markhneedham

Reference: Go: Multi-threaded writing to a CSV file from our JCG partner Mark Needham at the Mark Needham Blog blog.

2 Comments
Ivan Babanin
7 years ago

Article about Go on *Java* Code Geeks… Isn’t it strange?

hage
7 years ago

Why not use a channel to synchronize your messages? Create a channel of string slices (the type you want to write). Then you have two functions that need access to that channel: one that creates data and sends it to the channel, and another that reads from the channel and writes the data to the file.

Now you can spawn multiple “instances” of your create function and only one instance of the function that reads from the channel. This way you don’t need a mutex, which I suppose will be very slow.
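
The channel-based approach suggested in this comment could be sketched roughly as follows. This is an illustration rather than code from the post: writeRows and renderCSV are hypothetical helpers, and the example prints to standard output instead of a file. Many goroutines send rows on a channel, and only one goroutine ever touches the csv.Writer, so no mutex is needed.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"io"
	"log"
	"os"
	"strconv"
	"sync"
)

// writeRows starts one producer goroutine per row. Producers only send on
// the channel; the calling goroutine is the sole user of the csv.Writer.
// (Hypothetical helper, not part of the original script.)
func writeRows(out io.Writer, producers int) error {
	rows := make(chan []string)

	var wg sync.WaitGroup
	for i := 0; i < producers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s := strconv.Itoa(i)
			rows <- []string{s, s, s}
		}(i)
	}

	// Close the channel once every producer has sent its row,
	// so that the range loop below terminates.
	go func() {
		wg.Wait()
		close(rows)
	}()

	w := csv.NewWriter(out)
	firstErr := w.Write([]string{"id1", "id2", "id3"})
	for row := range rows {
		if firstErr != nil {
			continue // keep draining so producers are never left blocked
		}
		firstErr = w.Write(row)
	}
	if firstErr != nil {
		return firstErr
	}
	w.Flush()
	return w.Error()
}

// renderCSV returns the generated CSV as a string, which is handy in tests.
func renderCSV(producers int) (string, error) {
	var buf bytes.Buffer
	err := writeRows(&buf, producers)
	return buf.String(), err
}

func main() {
	if err := writeRows(os.Stdout, 5); err != nil {
		log.Fatal(err)
	}
}
```

As with the mutex version, the order of the rows depends on how the goroutines are scheduled. Whether this is faster than the mutex would need measuring for a given workload.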
