Scala’s Parallel Collections are useful even in the simplest cases

One of the main features that Scala version 2.9.x brought was Parallel Collections. At first glance, you might consider using such a feature only in complex, processing-intensive scenarios.

Although those are certainly the main places where you would consider using parallel collections, there are simpler ways you can leverage them. The code to use this feature is so simple that it’s easy to come up with reasons to use it everywhere:

myCol.par foreach(println)

Now how could that be any easier?
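One thing to keep in mind with a one-liner like this: a side effect such as println runs on several threads, so the output order is not guaranteed, while a transformation such as map on a parallel sequence still returns its results in the original order. Here is a minimal sketch of that difference. The object and method names are made up for illustration, and the code assumes a Scala version where .par is built in (2.9 through 2.12); on 2.13 and later it lives in the separate scala-parallel-collections module and needs the import shown in the comment.

```scala
// On Scala 2.13+, add the scala-parallel-collections module and this import:
// import scala.collection.parallel.CollectionConverters._

object ParOrderSketch {
  // map on a parallel sequence keeps the element order of the source list
  def doubledInParallel(xs: List[Int]): List[Int] =
    xs.par.map(_ * 2).toList

  def main(args: Array[String]): Unit = {
    val myCol = (1 to 10).toList

    // Side effects run concurrently: the numbers may print in any order
    myCol.par.foreach(println)

    // The transformed result is still ordered like the input
    println(doubledInParallel(myCol))
  }
}
```

So foreach with .par is a good fit when the order of the work does not matter, which is exactly the case for independent tasks like printing or uploading.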

So where did I use that for real, you might ask. The company I currently work for stores files in Amazon S3. Before that, those files were stored on the local disk. So, when we migrated to S3, I wrote a script, in Scala, to upload all the files… thousands of them. My first solution looked something like this:

listOfFiles foreach(file => sendToS3(file))

And then it suddenly occurred to me that it’s ridiculously simple to parallelize the upload of the files:

listOfFiles.par foreach(file => sendToS3(file))

Result: the upload took half the time, and I had to change almost nothing! Pretty neat, ain’t it?
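If you want to see the effect without an S3 account, here is a rough sketch that compares the two versions. The sendToS3 below is only a hypothetical stand-in that sleeps for 50 ms to simulate network latency (the real one from my script is not shown here), and the file names are invented. The speedup you get depends on how many cores and threads are available, so treat the numbers as illustrative. As above, this assumes .par is built in (Scala 2.10 through 2.12, since the sketch also uses string interpolation); on 2.13 and later, add the module and the import from the comment.

```scala
import java.util.concurrent.atomic.AtomicInteger
// On Scala 2.13+, add the scala-parallel-collections module and this import:
// import scala.collection.parallel.CollectionConverters._

object ParallelUploadSketch {
  // Thread-safe counter, because sendToS3 is called from several threads
  val uploaded = new AtomicInteger(0)

  // Hypothetical stand-in for the real upload: sleeps to simulate latency
  def sendToS3(file: String): Unit = {
    Thread.sleep(50)
    uploaded.incrementAndGet()
  }

  // Runs the block and returns the elapsed time in milliseconds
  def timeMs(block: => Unit): Long = {
    val start = System.nanoTime()
    block
    (System.nanoTime() - start) / 1000000
  }

  def main(args: Array[String]): Unit = {
    // Invented file names, just to have something to iterate over
    val listOfFiles = (1 to 40).map(i => s"file-$i.dat").toList

    val sequentialMs = timeMs(listOfFiles.foreach(sendToS3))
    val parallelMs = timeMs(listOfFiles.par.foreach(sendToS3))

    println(s"sequential: $sequentialMs ms, parallel: $parallelMs ms")
    println(s"uploads performed: ${uploaded.get}")
  }
}
```

Note the AtomicInteger: once the function passed to foreach runs in parallel, any shared state it touches has to be thread-safe. An upload that only reads a file and sends it needs no such care, which is why the one-word change worked for me.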
