8.39. Scala regular expressions

Scala supports regular expressions through the Regex class in the scala.util.matching package. The following example uses a regular expression to find the word Scala:

8.39.1. Example

import scala.util.matching.Regex

object Test {
  def main(args: Array[String]): Unit = {
    val pattern = "Scala".r
    val str = "Scala is Scalable and cool"

    println(pattern findFirstIn str)
  }
}

Execute the above code, and the output is as follows:

$ scalac Test.scala
$ scala Test
Some(Scala)

The example calls the r method on a String to construct a Regex object.

It then uses the findFirstIn method to find the first match.

If you need to see all matches, you can use the findAllIn method.

You can use the mkString() method to join the matches into a single string, and you can use the pipe character (|) to specify alternative patterns:

8.39.2. Example

import scala.util.matching.Regex

object Test {
  def main(args: Array[String]): Unit = {
    val pattern = new Regex("(S|s)cala") // The first letter can be uppercase S or lowercase s
    val str = "Scala is scalable and cool"

    println((pattern findAllIn str).mkString(",")) // Join the matches with commas
  }
}

Execute the above code, and the output is as follows:

$ scalac Test.scala
$ scala Test
Scala,scala
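
As a minimal sketch of what these search methods return, the following reuses the pattern and input from the example above. The object name FindAllDemo and its field names are illustrative and not part of the original example. findAllIn yields an iterator over the matched strings, findAllMatchIn yields Match objects that also carry positions, and findFirstIn returns None when nothing matches:

```scala
import scala.util.matching.Regex

object FindAllDemo {
  val pattern: Regex = "(S|s)cala".r
  val str = "Scala is scalable and cool"

  // findAllIn returns an iterator over the matched strings
  val matches: List[String] = pattern.findAllIn(str).toList

  // findAllMatchIn returns Match objects, which expose the start index
  val starts: List[Int] = pattern.findAllMatchIn(str).map(_.start).toList

  // findFirstIn returns an Option, so a missing match is None
  val missing: Option[String] = pattern.findFirstIn("no match here")

  def main(args: Array[String]): Unit = {
    println(matches) // List(Scala, scala)
    println(starts)  // List(0, 9)
    println(missing) // None
  }
}
```

Converting the iterator to a List is convenient for small inputs; for large texts it may be preferable to consume the iterator directly.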

To replace matched text with a given string, use the replaceFirstIn() method to replace the first match, or the replaceAllIn() method to replace all matches. For example:

8.39.3. Example

object Test {
  def main(args: Array[String]): Unit = {
    val pattern = "(S|s)cala".r
    val str = "Scala is scalable and cool"

    println(pattern.replaceFirstIn(str, "Java"))
  }
}

Execute the above code, and the output is as follows:

$ scalac Test.scala
$ scala Test
Java is scalable and cool
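
The example above replaces only the first match. As a short sketch of the other method, the following applies replaceAllIn to the same pattern and input. The object name ReplaceAllDemo and the field replaced are illustrative names introduced here:

```scala
object ReplaceAllDemo {
  val pattern = "(S|s)cala".r
  val str = "Scala is scalable and cool"

  // replaceAllIn replaces every match, not just the first one
  val replaced: String = pattern.replaceAllIn(str, "Java")

  def main(args: Array[String]): Unit = {
    println(replaced) // Java is Javable and cool
  }
}
```

Note that the second match is the "scala" inside "scalable", so only those five letters are replaced.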

Regular expression syntax

Scala’s regular expressions inherit the syntax rules of Java, while Java mostly uses the rules of the Perl language.
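
One way to see this relationship is to compile the same pattern text with Scala's Regex and with Java's java.util.regex.Pattern directly. The sketch below is illustrative only; the object name JavaSyntaxDemo and its field names are assumptions introduced for this example:

```scala
import java.util.regex.Pattern

object JavaSyntaxDemo {
  val source = "(S|s)cala"
  val str = "Scala is scalable and cool"

  // First match found through Scala's Regex
  val viaScala: Option[String] = source.r.findFirstIn(str)

  // The same pattern text compiled directly with Java's Pattern
  val viaJava: Option[String] = {
    val m = Pattern.compile(source).matcher(str)
    if (m.find()) Some(m.group()) else None
  }

  // A Scala Regex exposes the Java Pattern it wraps
  val wrapped: Pattern = source.r.pattern

  def main(args: Array[String]): Unit = {
    println(viaScala)          // Some(Scala)
    println(viaJava)           // Some(Scala)
    println(wrapped.pattern()) // (S|s)cala
  }
}
```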

In the following table, we give some common regular expression rules:

Expression	Matching rule
`^`	Matches the position at the start of the input string.
`$`	Matches the position at the end of the input string.
`.`	Matches any single character except a line terminator such as “\n” or “\r”.
`[...]`	Character set. Matches any one of the characters it contains. For example, `[abc]` matches the “a” in “plain”.
`[^...]`	Negated character set. Matches any one character that is not listed. For example, `[^abc]` matches “p”, “l”, “i”, and “n” in “plain”.
`\\A`	Start of the input string (not affected by the multiline option)
`\\z`	End of the input string (similar to $, but not affected by the multiline option)
`\\Z`	End of the input string, or just before a final line terminator (not affected by the multiline option)
`re*`	Repeat zero or more times
`re+`	Repeat one or more times
`re?`	Repeat zero or once
`re{n}`	Repeat exactly n times
`re{n,}`	Repeat n or more times
`re{n,m}`	Repeat n to m times
`a|b`	Match a or b
`(re)`	Match re and capture the matched text in an automatically numbered group
`(?:re)`	Match re without capturing the matched text or assigning a group number
`(?>re)`	Atomic (independent, non-backtracking) non-capturing group
`\\w`	Match a letter, digit, or underscore
`\\W`	Match any character that is not a letter, digit, or underscore
`\\s`	Match any whitespace character, equivalent to `[ \t\n\x0B\f\r]`
`\\S`	Match any character that is not whitespace
`\\d`	Match a digit, equivalent to `[0-9]`
`\\D`	Match any non-digit character
`\\G`	The position where the current search starts (the end of the previous match)
`\\n`	Newline character
`\\b`	Word boundary: the position between a word character and a non-word character
`\\B`	A position that is not a word boundary
`\\t`	Tab character
`\\Q`	Begin quoting: `\\Q(a+b)*3\\E` matches the literal text “(a+b)*3”.
`\\E`	End quoting: `\\Q(a+b)*3\\E` matches the literal text “(a+b)*3”.
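
To make a few rows of the table concrete, here is a small sketch that combines capturing groups, `\\d`, the `{n}` quantifier, and `\\Q...\\E` quoting. The object name GroupDemo, the date format, and the helper describe are hypothetical and chosen only for illustration:

```scala
object GroupDemo {
  // Three capturing groups built from \d and the {n} quantifier
  val date = "(\\d{4})-(\\d{2})-(\\d{2})".r

  // A Regex used as an extractor must match the whole input;
  // each capturing group is bound to one variable
  def describe(s: String): String = s match {
    case date(year, month, day) => s"year=$year month=$month day=$day"
    case _                      => "no match"
  }

  // \Q...\E quotes its contents, so ( + ) * are matched literally
  val quoted = "\\Q(a+b)*3\\E".r

  def main(args: Array[String]): Unit = {
    println(describe("2024-01-15"))            // year=2024 month=01 day=15
    println(describe("15 Jan 2024"))           // no match
    println(quoted.findFirstIn("x = (a+b)*3")) // Some((a+b)*3)
  }
}
```

Using the Regex in a pattern match keeps group extraction and the no-match case in one place, which is often easier to read than indexing into groups by number.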

Regular expression examples

Example	Description
`.`	Matches any single character except a line terminator such as “\n” or “\r”.
`[Rr]uby`	Match “Ruby” or “ruby”
`rub[ye]`	Match “ruby” or “rube”
`[aeiou]`	Match any one lowercase vowel: a, e, i, o, or u
`[0-9]`	Match any digit, same as [0123456789]
`[a-z]`	Match any ASCII lowercase letter
`[A-Z]`	Match any ASCII uppercase letter
`[a-zA-Z0-9]`	Match numbers, upper and lowercase letters
`[^aeiou]`	Match other characters except aeiou
`[^0-9]`	Match characters other than numbers
`\\d`	Match numbers, similar to: [0-9]
`\\D`	Match non-numeric, similar to: `[^0-9]`
`\\s`	Match whitespace, similar to: `[ \t\r\n\f]`
`\\S`	Match non-whitespace, similar to: `[^ \t\r\n\f]`
`\\w`	Match letters, numbers, underscores, similar to: `[A-Za-z0-9_]`
`\\W`	Match non-letters, numbers, underscores, similar to: `[^A-Za-z0-9_]`
`ruby?`	Matching “rub” or “ruby”: y is optional
`ruby*`	Matches “rub” plus 0 or more y.
`ruby+`	Matches “rub” plus one or more y.
`\\d{3}`	It matches exactly three numbers.
`\\d{3,}`	Match 3 or more digits.
`\\d{3,5}`	Match 3, 4, or 5 digits.
`\\D\\d+`	No grouping: + repeats only `\d`
`(\\D\\d)+`	Grouping: + repeats the `\D\d` pair
`([Rr]uby(, )?)+`	Match “Ruby”, “Ruby, ruby, ruby”, etc.
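
Several rows of the table can be checked quickly with String.matches, which succeeds only when the whole string matches the pattern. This is a rough sketch; the object name TableExamplesDemo and the helper fullMatch are illustrative names, not part of the original text:

```scala
object TableExamplesDemo {
  // Wraps java.lang.String.matches: true only if the entire
  // input matches the regular expression
  def fullMatch(regex: String, input: String): Boolean =
    input.matches(regex)

  def main(args: Array[String]): Unit = {
    println(fullMatch("[Rr]uby", "ruby"))      // true
    println(fullMatch("rub[ye]", "rube"))      // true
    println(fullMatch("ruby?", "rub"))         // true: y is optional
    println(fullMatch("ruby+", "rub"))         // false: at least one y required
    println(fullMatch("\\d{3,5}", "12"))       // false: fewer than 3 digits
    println(fullMatch("\\D\\d+", "a12"))       // true: + repeats only \d
    println(fullMatch("(\\D\\d)+", "a1b2"))    // true: + repeats the pair
    println(fullMatch("([Rr]uby(, )?)+", "Ruby, ruby, ruby")) // true
  }
}
```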

Note that each escape sequence in the tables above is written with two backslashes. This is because the backslash is itself an escape character in Java and Scala string literals, so to get a single backslash (\) in the string you must write \\ in the source code. See the following example:

8.39.4. Example

import scala.util.matching.Regex

object Test {
  def main(args: Array[String]): Unit = {
    val pattern = new Regex("abl[ae]\\d+") // \\d in the source is \d in the regular expression
    val str = "ablaw is able1 and cool"

    println((pattern findAllIn str).mkString(","))
  }
}

Execute the above code, and the output is as follows:

$ scalac Test.scala
$ scala Test
able1
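
Scala also has triple-quoted string literals, in which a backslash is not treated as an escape character, so a pattern can be written with single backslashes. The sketch below shows that both spellings produce the same pattern text; the object name RawStringDemo and its field names are illustrative:

```scala
object RawStringDemo {
  // Ordinary string literal: each backslash must be doubled
  val escaped: String = "abl[ae]\\d+"

  // Triple-quoted literal: backslashes are taken as written
  val raw: String = """abl[ae]\d+"""

  val str = "ablaw is able1 and cool"

  // Search with the pattern built from the triple-quoted literal
  val found: String = raw.r.findAllIn(str).mkString(",")

  def main(args: Array[String]): Unit = {
    println(escaped == raw) // true
    println(found)          // able1
  }
}
```

Triple-quoted literals tend to be easier to read for patterns with many escape sequences, since the source text matches the regular expression character for character.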