Scala 2.11: AnyRefMap – benchmarking a high performance map implementation

In a previous blog post I benchmarked different Scala map implementations alongside java.util.HashMap. One of the clear conclusions was that the mutable map implementation in Scala lagged j.u.HashMap in terms of performance. This limitation was easily worked around by simply using the Scala conversion support to wrap the Java implementation. Fast-forward 18 months to 2014 and we see the emergence of two new high performance map implementations in Scala 2.11. The first is a new specialized map implementation for using Long values as keys, the other general purpose map is
AnyRefMap. The specialized collection offers very high performance for keys of type Long. The focus of this blog post is on AnyRefMap, which is likely to see very widespread use in Scala codebases.

I’ve revisited the benchmark I used in 2012 and incorporated AnyRefMap, benchmarking it alongside java.util.HashMap and collection.mutable.HashMap. I benchmarked the insertion of 3 million random (pre-generated) strings into each map, followed by looking up every one of the 3 million strings 3 times. I ran each test repeatedly to “warm up” the JVM, only taking the last 10 runs for each map. The tests were run on an Intel i3 system using a recent JDK release and Scala 2.11 M8.

The results were nothing short of impressive for AnyRefMap. The new map implementation significantly out-performed mutable.HashMap for both insertions and lookups. AnyRefMap also outperformed java.util.HashMap for insertions, and matching it for lookup performance. I took the median value of the last 10 runs for each map implementation. The results are shown in the figure below.

MapBenchmark

import collection.mutable
import collection.mutable._
import java.lang.System.currentTimeMillis
import java.util

/**
 * Code rough, but serves its purpose.
 */
object SBenchmarkMutable3 extends App {
  val NumElements = 3000000
  val RandStringLength = 10
  val OuterIterations = 40
  val WarmUpIterations = 30

  private var strings: Array[String] = _
  private var map: mutable.Map[String, String] = _
  private var jmap: util.HashMap[String, String] = _

  loadStrings

  println("collection.mutable.HashMap")
  for (i <- 1 to OuterIterations) {

    val mutInsert = benchInsertion(false, false, false)
    val mutLookup = benchLookup

    Thread.sleep(1000)

    if (i > WarmUpIterations)
      println(s"$mutInsert,$mutLookup")
  }

  println("Java HashMap")
  for (i <- 1 to OuterIterations) {

    val jInsert = benchInsertion(true, false)
    val jLookup = benchLookup

    Thread.sleep(1000)
    if (i > WarmUpIterations)
      println(s"$jInsert,$jLookup")
  }

  println("collection.mutable.AnyRefMap:")
  for (i <- 1 to OuterIterations) {

    val anyRefInsert = benchInsertion(false, false, true)
    val anyRefLookup = benchLookup

    Thread.sleep(1000)

    if (i > WarmUpIterations)
      println(s"$anyRefInsert,$anyRefLookup")
  }

  def benchInsertion(useJavaConversion: Boolean, recycle: Boolean, useAnyRefMap: Boolean = false): Long = {
    insertion(false, useJavaConversion, recycle, useAnyRefMap)
  }

  def mapPresize = (RandStringLength / 0.74).toInt

  def benchLookup = {
    val timing = lookup(false)
    map = null
    jmap = null
    timing
  }

  def insertion(warmup: Boolean, useJava: Boolean, recycle: Boolean, useAnyRefMap: Boolean): Long = {
    val start = currentTimeMillis
    if (recycle) {
      if (map != null) {
        map.clear()
      } else {
        jmap.clear()
      }
    }
    else {
      if (useJava) {
        jmap = new util.HashMap[String, String](mapPresize)
      }
      else {
        if (useAnyRefMap) {
          map = new mutable.AnyRefMap[String, String](mapPresize)
        } else {
          map = new mutable.HashMap[String, String]()
        }
      }
    }

    if (jmap == null) {
      var i = 0
      while (i < strings.size) {
        val s = strings(i)
        map.put(s, s)
        i += 1
      }
    } else {
      var i = 0
      while (i < strings.size) {
        val s = strings(i)
        jmap.put(s, s)
        i += 1
      }
    }
    currentTimeMillis - start
  }

  def lookup(warmup: Boolean) = {
    var empty = false
    val start = currentTimeMillis

    var j = 0
    while (j < 3) {
      if (jmap == null) {
        var i = 0
        while (i < strings.length) {
          empty = map(strings(i)).length != 0
          i += 1
        }
      } else {
        var i = 0
        while (i < strings.length) {
          empty = jmap.get(strings(i)).length != 0
          i += 1
        }
      }
      j += 1
    }
    currentTimeMillis - start
  }

  // Generate a bunch of random strings
  def loadStrings {
    val lines = new ArrayBuffer[String]
    val random = new scala.util.Random()

    var i = 0
    while (i < NumElements) {
      lines += random.nextString(RandStringLength)
      i += 1
    }
    strings = lines.toArray
  }
}
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s