Set

Set是另一种集合,和序列写法相似:

1
2
val fruit = Set("apple", "banana", "pear")
val s = (1 to 6).toSet

大部分序列上的操作在Set上也能用:

1
2
3
s map (_ + 2)
fruit filter (_.startWith == "app")
s.nonEmpty

Set和序列的区别是Set中不包含相同元素。

例子:n皇后问题

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
object nqueens {
def show(queens: List[Int]) = {
val lines =
for (col <- queens.reverse)
yield Vector.fill(queens.length)("* ").updated(col, "X ").mkString
"\n" + (lines mkString "\n")
}
def isSafe(col: Int, queens: List[Int]): Boolean = {
val row = queens.length
val queensWithRow = (row - 1 to 0 by -1) zip queens
queensWithRow forall {
case (r, c) => col != c && math.abs(col - c) != row - r
}
}
def queens(n: Int) = {
def placeQueens(k: Int): Set[List[Int]] = {
if (k == 0) Set(List())
else
for {
queens <- placeQueens(k - 1)
col <- 0 until n
if isSafe(col, queens)
} yield col :: queens
}
placeQueens(n)
}
(queens(8) take 3 map show) mkString "\n"
}
// > res0: String =
"
* * * * * X * *
* * * X * * * *
* X * * * * * *
* * * * * * * X
* * * * X * * *
* * * * * * X *
X * * * * * * *
* * X * * * * *
* * * * X * * *
* * * * * * X *
* X * * * * * *
* * * X * * * *
* * * * * * * X
X * * * * * * *
* * X * * * * *
* * * * * X * *
* * * * * X * *
* * X * * * * *
* * * * * * X *
* * * X * * * *
X * * * * * * *
* * * * * * * X
* X * * * * * *
* * * * X * * * "

for的翻译

for表达式和mapflatMapfilter这三个高阶函数紧密相关。这三个函数可以写成:

1
2
3
4
5
6
def mapFun[T, U](xs: List[T], f: T => U): List[U] =
for (x <- xs) yield f(x)
def flatMap[T, U](xs: List[T], f: T => Iterable[U]): List[U] =
for (x <- xs; y <- f(x)) yield y
def filter[T](xs: List[T], p: T => Boolean): List[T] =
for(x <- xs if p(x)) yield x

事实上,Scala编译器将for表达式翻译成mapflatMap和懒求值的filter变体。

Scala编译器用下面的方式翻译:

1. 简单的for表达式

1
for (x <- e1) yield e2

被翻译成

1
e1.map(x => e2)

2. 带过滤器的for表达式

1
for (x <- e1 if f; s) yield e2

其中f是过滤器,s是剩下的生成器过滤器序列。这会被翻译成:

1
for (x <- e1.withFilter(x => f); s) yield e2

可以认为withFilter是不生成临时中间列表的filter

3. 多个生成器的for表达式

1
for (x <- e1; y <- e2; s) yield e3

其中s是剩下的生成器过滤器序列,这会被翻译成:

1
e1.flatMap(x => for (y <- e2; s) yield e3)

可以看到上面三种情况概括了所有的for表达式形式。

例子:

1
2
3
4
5
for {
i <- 1 until n
j <- 1 until i
if isPrime(i + j)
} yield (i, j)

用上面的方法会被翻译为:

1
2
(1 until n).flatMap =>
(1 until i).withFilter(j => isPrime(i + j)).map(j => (i, j))

for的推广

有趣的是,for不仅仅只能应用在列表或序列上,它只要求mapflatMapwithFilter这三个方法存在即可。

所以可以将其应用在数组、遍历器、数据库、XML数据、可选值、分析器等等中。

for是Scala数据库框架ScalaQuery、Slick的基础。微软的LINQ也有同样的思想。

Map

Map[Key, Value]是一种将键Key和值Value结合起来的数据结构。

例如:

1
2
val romanNumbers = Map("I" -> 1, "V" -> 5, "X" -> 10)
val capitalOfCountry = Map("US" -> "Washington", "Switzerland" -> "Bern")

Map[Key, Value]继承自Iterable[(Key, Value)],因此大部分Iterable的方法能在Map上用:

1
2
3
val countryOfCapital = capitalOfCounty map {
case(x, y) => (y, x)
}

事实上,->语法就相当于写二元组(key, value)

Map[Key, Value]也继承了函数类型Key => Value,所以Map可以像函数一样使用:

1
2
3
4
5
6
7
8
capitalOfCountry("Andorra") // java.util.NoSuchElementException: key not found: Andorra
``` scala
想要在不知道是否包含键的情况下访问`Map`,可以使用`get`操作:
``` scala
capitalOfCountry get "US" // Some("Washington")
capitalOfCountry get "Andorra" // None

get的返回值是一个Option值。

Option

Option是这样定义的:

1
2
3
trait Option[+A]
case class Some[+A](value A) extends Option[A]
object None extends Option[Nothing]

map get key表达式返回:

  • None:如果map不包含键key
  • Some(x):如果map包含键key

因为Option被定义为case类,所以用模式匹配分解它:

1
2
3
4
5
6
def showCapital(country: String) = capitalOfCountry.get(country) match {
case Some(capital) => capital
case None => "missing data"
}
showCapital("US") // "Washington"
showCapital("Andorra") // "missing data"

sortedgroupBy

除了for表达式外另两个有用的SQL查询是groupByorderBy

orderBy在集合上可以用sortWithsorted来表达:

1
2
3
val fruit = List("apple", "pear", "orange", "pineapple")
fruit sortWith (_.length // List("pear", "apple", "orange", "pineapple")
fruit.sorted // List("apple", "orange", "pear", "pineapple")

groupBy在Scala集合上也能用,它通过鉴别器函数f将一个集合分解成一个包含多个集合的map。

1
2
3
fruit groupBy (_.head) // > Map(p -> List(pear, pineapple),
// | a -> List(apple),
// | o -> List(orange))

例子:多项式

假设x^3 - 2x + 5表示成Map(0 -> 5, 1 -> -2, 3 -> 1),则可以设计一个类Poly来表示多项式。

到目前为止,map还只是部分函数,如果键不存在,map(key)就会抛出异常。可以用withDefaultValue操作将map转变为完全函数:

1
2
val cap1 = capitalOfCountry withDefaultValue ""
cap1("Andorra") // ""

多项式可以先写成这样:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
object polynomials {
class Poly(terms0: Map[Int, Double]) {
val terms = terms0 withDefaultValue 0.0
def +(other: Poly) = new Poly(terms ++ (other.terms map adjust))
def adjust(term: (Int, Double)): (Int, Double) = {
val (exp, coeff) = term
terms get exp match {
case Some(coeff1) => exp -> (coeff + coeff1)
case None => exp -> coeff
}
}
override def toString =
(for ((exp, coeff) <- terms.toList.sorted.reverse) yield coeff + "x^" + exp) mkString " + "
}
val p1 = new Poly(Map(1 -> 2.0, 3 -> 4.0, 5 -> 6.2)) // > p1: Poly = 6.2x^5 + 4.0x^3 + 2.0x^1
val p2 = new Poly(Map(0 -> 3.0, 3 -> 7.0)) // > p2: Poly = 7.0x^3 + 3.0x^0
p1 + p2 // > res0: Poly = 6.2x^5 + 11.0x^3 + 2.0x^1 + 3.0x^0
}

但是每次都要用Map来包裹参数很不方便。可以用多重参数来改进它:

1
def this(bindings: (Int, Double)*) = this(bindings.toMap)

其中bindings的类型为Seq[(Int, Double)]

加法操作也可以用foldLeft来改写:

1
2
3
4
5
6
7
8
9
10
11
class Poly(terms0: Map[Int, Double]) {
def this(bindings: (Int, Double)*) = this(bindings.toMap)
val terms = terms0 withDefaultValue 0.0
def +(other: Poly) = new Poly((other.terms foldLeft terms)(addTerm))
def addTerm(terms: Map[Int, Double], term: (Int, Double)): Map[Int, Double] = {
val (exp, coeff) = term
terms + (exp -> (coeff + terms(exp)))
}
override def toString =
(for ((exp, coeff) <- terms.toList.sorted.reverse) yield coeff + "x^" + exp) mkString " + "
}

实际上foldLeft版本比++版本的更有效率,原因是foldLeft直接将系数加到terms中得到了新的map,而++版本中创建了一个临时map后再将他们连接起来。