Home » Software Development » Go: Multi-threaded writing to a CSV file

About Mark Needham

Mark Needham

Go: Multi-threaded writing to a CSV file

As part of a Go script I’ve been working on I wanted to write to a CSV file from multiple Go routines, but realised that the built in CSV Writer isn’t thread safe.

My first attempt at writing to the CSV file looked like this:

package main
 
 
import (
	"encoding/csv"
	"os"
	"log"
	"strconv"
)
 
func main() {
 
	csvFile, err := os.Create("/tmp/foo.csv")
	if err != nil {
		log.Panic(err)
	}
 
	w := csv.NewWriter(csvFile)
	w.Write([]string{"id1","id2","id3"})
 
	count := 100
	done := make(chan bool, count)
 
	for i := 0; i < count; i++ {
		go func(i int) {
			w.Write([]string {strconv.Itoa(i), strconv.Itoa(i), strconv.Itoa(i)})
			done <- true
		}(i)
	}
 
	for i:=0; i < count; i++ {
		<- done
	}
	w.Flush()
}

package main import ( “encoding/csv” “os” “log” “strconv” ) func main() { csvFile, err := os.Create(“/tmp/foo.csv”) if err != nil { log.Panic(err) } w := csv.NewWriter(csvFile) w.Write([]string{“id1″,”id2″,”id3”}) count := 100 done := make(chan bool, count) for i := 0; i < count; i++ { go func(i int) { w.Write([]string {strconv.Itoa(i), strconv.Itoa(i), strconv.Itoa(i)}) done <- true }(i) } for i:=0; i < count; i++ { <- done } w.Flush() }

This script should output the numbers from 0-99 three times on each line. Some rows in the file are written correctly, but as we can see below, some aren’t:

40,40,40
37,37,37
38,38,38
18,18,39
^@,39,39
...
67,67,70,^@70,70
65,65,65
73,73,73
66,66,66
72,72,72
75,74,75,74,75
74
7779^@,79,77
...

One way that we can make our script safe is to use a mutex whenever we’re calling any methods on the CSV writer. I wrote the following code to do this:

type CsvWriter struct {
	mutex *sync.Mutex
	csvWriter *csv.Writer
}
 
func NewCsvWriter(fileName string) (*CsvWriter, error) {
	csvFile, err := os.Create(fileName)
	if err != nil {
		return nil, err
	}
	w := csv.NewWriter(csvFile)
	return &CsvWriter{csvWriter:w, mutex: &sync.Mutex{}}, nil
}
 
func (w *CsvWriter) Write(row []string) {
	w.mutex.Lock()
	w.csvWriter.Write(row)
	w.mutex.Unlock()
}
 
func (w *CsvWriter) Flush() {
	w.mutex.Lock()
	w.csvWriter.Flush()
	w.mutex.Unlock()
}

We create a mutex when NewCsvWriter instantiates CsvWriter and then use it in the Write and Flush functions so that only one go routine at a time can access the underlying CsvWriter. We then tweak the initial script to call this class instead of calling CsvWriter directly:

func main() {
	w, err := NewCsvWriter("/tmp/foo-safe.csv")
	if err != nil {
		log.Panic(err)
	}
 
	w.Write([]string{"id1","id2","id3"})
 
	count := 100
	done := make(chan bool, count)
 
	for i := 0; i < count; i++ {
		go func(i int) {
			w.Write([]string {strconv.Itoa(i), strconv.Itoa(i), strconv.Itoa(i)})
			done <- true
		}(i)
	}
 
	for i:=0; i < count; i++ {
		<- done
	}
	w.Flush()
}

And now if we inspect the CSV file all lines have been written successfully:

...
25,25,25
13,13,13
29,29,29
32,32,32
26,26,26
30,30,30
27,27,27
31,31,31
28,28,28
34,34,34
35,35,35
33,33,33
37,37,37
36,36,36
...

That’s all for now. If you have any suggestions for a better way to do this do let me know in the comments or on twitter – I’m @markhneedham

Be Sociable, Share!

Reference: Go: Multi-threaded writing to a CSV file from our JCG partner Mark Needham at the Mark Needham Blog blog.
(0 rating, 0 votes)
You need to be a registered member to rate this.
2 Comments Views Tweet it!
Do you want to know how to develop your skillset to become a Java Rockstar?
Subscribe to our newsletter to start Rocking right now!
To get you started we give you our best selling eBooks for FREE!
1. JPA Mini Book
2. JVM Troubleshooting Guide
3. JUnit Tutorial for Unit Testing
4. Java Annotations Tutorial
5. Java Interview Questions
6. Spring Interview Questions
7. Android UI Design
and many more ....
I agree to the Terms and Privacy Policy

2
Leave a Reply

avatar
2 Comment threads
0 Thread replies
0 Followers
 
Most reacted comment
Hottest comment thread
2 Comment authors
hageIvan Babanin Recent comment authors

This site uses Akismet to reduce spam. Learn how your comment data is processed.

  Subscribe  
newest oldest most voted
Notify of
Ivan Babanin
Guest
Ivan Babanin

Article about Go on *Java* Code Geeks… Isn’t it strange?

hage
Guest
hage

Why not using the channel to synchronize your messages? Create a channel of string array (the type you want to write). And then you have to functions which need access to that channel: one that creates data and writes it to the channel and another one that reads from that channel and writes the data to file.

Now you can spawn multiple “instances” of your create method and only one instance of the functions that reads from that channel. This way you don’t need a mutex,which I suppose will be very slow.