1. Raku Overview
1.1. Hello, World
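The Raku compiler can read a program either from a file or from the content of the -e command-line switch. The simplest “Hello, World!” program looks like this:
say "Hello, Raku";
Save it to a file and run it: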
$ raku hello.pl
Hello, Raku
Alternatively, you can use the -e option:
$ raku -e'say "Hello, Raku"'
Hello, Raku
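1.2. Variables
1.2.1. Sigils
Raku uses sigils to mark variables. These sigils are partially compatible with the Perl 5 syntax. For example, scalars, lists, and hashes use the $, @, and % sigils, respectively.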
my $scalar = 42;
say $scalar;
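It is no surprise that this code prints 42.
Consider the following fragment, which also gives a predictable result (the square brackets indicate an array):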
my @array = (10, 20, 30);
say @array; # [10 20 30]
Now, let us use the benefits of Raku and rewrite the above code with less typing, fewer characters, and less punctuation:
my @list1 = <10 20 30>;
Or even like this:
my @list2 = 10, 20, 30;
Similarly, we can omit the parentheses when initializing a hash, leaving the bare content:
my %hash =
'Language' => 'Perl',
'Version' => '6';
say %hash;
This small program prints the following (the order of the hash keys in the output may differ, and you should not rely on it):
{Language => Perl, Version => 6}
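To access the elements of a list or a hash, Raku uses different types of brackets. It is important to remember that the sigil always stays the same. In the following examples, we extract a scalar from a list and from a hash:
my @squares = 0, 1, 4, 9, 16, 25;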
say @squares[3]; # This prints the 4th element, thus 9
my %capitals =
'France' => 'Paris',
'Germany' => 'Berlin';
say %capitals{'Germany'};
There is an alternative syntax for creating a hash and accessing its elements. To see how it works, examine the next piece of code:
my %month-abbrs =
:jan('January'),
:feb('February'),
:mar('March');
say %month-abbrs<mar>; # prints March
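As a small sketch of what the alternative syntax does, the colon form creates an ordinary Pair object, and the angle-bracket subscript is a shortcut for a quoted key in curly braces. The variable name $pair is only illustrative.

```raku
# A colon pair is just another way to write a Pair.
my $pair = :mar('March');
say $pair.WHAT;    # (Pair)
say $pair.key;     # mar
say $pair.value;   # March

# The same hash, built with the fat-arrow syntax instead.
my %month-abbrs = jan => 'January', feb => 'February', mar => 'March';
say %month-abbrs<mar>;      # March
say %month-abbrs{'mar'};    # March
```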
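Naming variables is a rather interesting thing, because Raku allows not only ASCII letters, digits, and the underscore character but also many Unicode characters, as well as hyphens and apostrophes:
my $hello-world = "Hello, World";
say $hello-world;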
my $don't = "Isn’t it a Hello?";
say $don't;
my $привет = "A Cyrillic Hi ";
say $привет;
Do you prefer non-Latin characters in the names of variables? Although it may slow down your typing, because it requires switching the keyboard layout, using non-Latin characters in variable names has no performance impact. But if you do, always think of those developers who may need to read your code in the future.
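1.3. Introspection
Thanks to the introspection mechanism, it is easy to tell the type of the data living in a variable (a variable in Raku is often referred to as a container). To do so, call the predefined WHAT method on the variable. Even if it is a bare scalar, Raku treats it internally as an object; thus, you can call some methods on it. For scalars, the result depends on the actual type of the data residing in the variable. Here is an example (the parentheses are part of the output):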
my $scalar = 42;
my $hello-world = "Hello, World";
say $scalar.WHAT; # (Int)
say $hello-world.WHAT; # (Str)
For variables whose names start with the @ and % sigils, the WHAT method returns (Array) and (Hash).
Calling WHAT on arrays:
my @list = 10, 20, 30;
my @squares = 0, 1, 4, 9, 16, 25;
say @list.WHAT; #(Array)
say @squares.WHAT; #(Array)
Calling WHAT on hashes:
my %hash = 'Language' => 'Perl';
my %capitals = 'France' => 'Paris';
say %hash.WHAT; # (Hash)
say %capitals.WHAT; # (Hash)
The thing returned by a WHAT call is a so-called type object. In Raku, you should use the === operator to compare these objects.
For example:
my $value = 42;
say "OK" if $value.WHAT === Int;
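There is another way to check the type of an object residing in a container: the isa method. Call it on an object, pass the type name as an argument, and get the answer: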
my $value = 42;
say "OK" if $value.isa(Int);
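The two checks are not quite interchangeable. The following sketch illustrates the difference under the assumption that you want to distinguish an exact type match from a match that also accepts parent types: comparing WHAT with === is exact, while isa follows the inheritance chain (Int inherits from Cool).

```raku
my $value = 42;

# Exact comparison of type objects.
say $value.WHAT === Int;    # True
say $value.WHAT === Cool;   # False

# isa also accepts the parent classes of Int.
say $value.isa(Int);        # True
say $value.isa(Cool);       # True
say $value.isa(Str);        # False
```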
1.4. Twigils
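In Raku, a variable name may be preceded either by a single-character sigil, such as $, @, or %, or by a two-character sequence. In the latter case, it is called a twigil. Its first character means the same thing as a bare sigil, while the second one extends the description.
For example, the second character of a twigil can describe the scope of the variable. Consider *, which denotes dynamic scope (more on this in Chapter 3). The following call prints the command-line arguments one by one: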
.say for @*ARGS;
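Here, the @*ARGS array is a global array containing the arguments received from the command line (note that it is called ARGS, not ARGV as in Perl 5). The .say construct calls the say method on the loop variable. If you want to be more verbose, you would write it like this: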
for @*ARGS {
$_.say;
}
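Let us list a few other useful predefined dynamic variables with the star in their names. The first character of the twigil denotes the type of the container (thus a scalar, an array, or a hash):
- $*PERL - contains the Perl version (Raku)
- $*PID - the process identifier
- $*PROGRAM-NAME - the file name of the currently executing program (for a one-liner, its value is set to -e)
- $*EXECUTABLE - the path to the interpreter
- $*VM - the name of the virtual machine that your Raku was compiled with
- $*DISTRO - the name and the version of the operating system distribution
- $*KERNEL - similar, but for the kernel
- $*CWD - the current working directory
- $*TZ - the current time zone
- %*ENV - the environment variables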
On my machine, the variables above print the following:
Raku (6.c)
90177
twigil-vars.pl
"/usr/bin/raku".IO
moar (2016.11)
macosx (10.10.5)
darwin (14.5.0)
"/Users/ash/Books/Raku/code".IO
{Apple_PubSub_Socket_Render => /private/tmp/com.apple...., DISPLAY => /private/tmp/com.apple..., HISTCONTROL => ignorespace, HOME => /Users/ash, LC_CTYPE => UTF-8, LOGNAME => ash ...
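The next group of predefined variables includes those with ? as their twigil. These are “constants”, or so-called compile-time constants, which contain information about the current position of the program flow.
- $?FILE - the name of the file (without a path; contains the string -e for one-liners)
- $?LINE - the line number (set to 1 in one-liners)
- $?PACKAGE - the name of the current module; at the top level, this is (GLOBAL)
- $?TABSTOP - the number of spaces in a tab (may be used in heredocs)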
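1.5. Frequently used special variables
The $_ variable is similar to the one in Perl 5: in some cases, it is the default variable containing the current context argument. Like any other variable, $_ is an object in Raku, even in the simplest use cases. For example, the recent example .say for @*ARGS implicitly contains the $_.say call. The same effect is achieved by $_.say(), .say(), or just .say.
This variable is used as the default one in other cases too, for example, when matching against a regular expression: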
for @*ARGS {
.say if /\d/;
}
This short code is equivalent to the following, which uses the smartmatch (~~) operator:
for @*ARGS {
.say if $_ ~~ /\d/;
}
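The result of matching against a regular expression is available in the $/ variable. To get the matched string, you can call the $/.Str method. To get the substrings captured during the match, use indices: $/[2] or, in a simpler form, $2.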
"Perl's Birthday: 18 December 1987" ~~
/ (\d+) \s (\D+) \s (\d+) /;
say $/.Str;
say $/[$_] for 0..2;
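Here, we are looking for a date. In this case, the date is defined as a sequence of digits \d+, a space \s, a word with no digits \D+, another space \s, and some more digits \d+. If the match succeeds, $/.Str contains the whole date, while $/[0], $/[1], and $/[2] keep its parts (the small corner brackets are part of the output and indicate a Match object; see Chapter 6):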
18 December 1987
「18」
「December」
「1987」
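The shorter capture variables can be sketched as follows. The sample string is made up for illustration; $0, $1, and $2 are shortcuts for $/[0], $/[1], and $/[2], and the ~ and + prefixes turn a Match object into a plain string or a number.

```raku
my $text = "Released on 25 December 2015";

if $text ~~ / (\d+) \s (\w+) \s (\d+) / {
    say ~$/;        # 25 December 2015
    say ~$0;        # 25
    say ~$1;        # December
    say +$2 + 1;    # 2016
}
```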
Finally, the $! variable contains an error message, for example, the one that occurred within a try block or while opening a file:
try {
say 42/0;
}
say $! if $!;
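If you remove the last line in this program, nothing will be printed. This is because the try block masks any error output. Remove the try, and the error message reappears (and the program itself terminates).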
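Because $! holds an exception object rather than a bare string, it can be inspected after the try block. A minimal sketch, with a made-up error text:

```raku
try {
    die "Something went wrong";
}

if $! {
    say $!.^name;      # X::AdHoc
    say $!.message;    # Something went wrong
}

say "Still running";   # the program continues after the try block
```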
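1.6. Built-in types
Raku allows using typed variables. To tell the compiler that a variable is typed, simply name the type when declaring the variable.
Some of the types available in Raku are obvious and need no comments:
Bool, Int, Str, Array, Hash, Complex
Some may need a small comment: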
Num, Pair, Rat
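The Num type is used for handling floating-point values, and a Pair is a “key/value” pair. The Rat type introduces rational numbers with a numerator and a denominator.
1.6.1. Typed variables
This is how you declare a typed variable: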
my Int $x;
Here, the scalar container $x may only hold an integer value. An attempt to assign a non-integer value to it leads to an error:
my Int $x;
$x = "abc"; # Error: Type check failed in assignment to $x;
# expected Int but got Str
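For type conversion, the corresponding method calls are quite handy. Remember that although $x holds an integer, it is treated as a container object as a whole, which is why you can use some predefined methods on it. You can do the same directly on a string. For example: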
my Int $x;
$x = "123".Int; # Now this is OK
say $x; # 123
1.6.2. Bool
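Using Bool variables is straightforward, although there are a few details you might want to know. The Bool type is a built-in enumeration and provides two values: True and False (or, in the full form, Bool::True and Bool::False). It is allowed to increment or decrement Boolean variables: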
my $b = Bool::True;
$b--;
say $b; # prints False
$b = Bool::False;
$b++;
say $b; # True
Raku objects (that is, all variables) have the Bool method, which converts the value of the variable to one of the two Boolean values:
say 42.Bool; # True
my $pi = 3.14;
say $pi.Bool; # True
say 0.Bool; # False
say "00".Bool; # True
Similarly, you can call the Int method on a variable and get the integer representation of a Boolean value (or of a value of any other type):
say Bool::True.Int; # 1
1.6.3. Int
The Int type holds integer values of arbitrary size. For example, no digits are lost in the following assignment:
my Int $x = 12389147319583948275874801735817503285431532;
say $x;
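There is a special syntax for defining integers with a radix other than 10:
say :16<D0CF11E0>;
Also, it is allowed to use the underscore character to separate digits so that big numbers are easier to read: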
my Int $x = 735_817_503_285_431_532;
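Of course, when you print the value, all the underscores are gone. On an Int object, you can call some other handy methods, for example, to convert a number to a character or to check whether the integer at hand is prime (yes, is-prime is a built-in method!).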
my Int $a = 65;
say $a.chr; # A
my Int $i = 17;
say $i.is-prime; # True
say 42.is-prime; # False
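The radix syntax mentioned above works for other bases as well, and the base method goes in the opposite direction. A short sketch with arbitrarily chosen values:

```raku
# From a given radix to a decimal integer.
say :16<FF>;        # 255
say :2<1010>;       # 10
say :8<777>;        # 511

# From an integer to a string in a given radix.
say 255.base(16);   # FF
say 10.base(2);     # 1010
```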
1.6.4. Str
Str is, no doubt, a string. In Raku, there are methods to manipulate strings. Again, you call them as methods on an object.
my $str = "My string";
say $str.lc; # my string
say $str.uc; # MY STRING
say $str.index('t'); # 4
Now let us get the length of a string. A naive attempt to write $str.length produces an error message. However, a hint is provided too:
No such method 'length' for invocant of type 'Str'. Did you mean any of these?
codes
chars
Thus, we have a simple and unambiguous method for getting the length of a Unicode string.
say "περλ 6".chars; # 6
It may take some time to get used to the new way of working with strings as objects. For example, this is how you call printf as a method on a string:
"Today is %02i %s %i\n".printf($day, $month, $year);
1.6.5. Array
Array variables (that is, all variables starting with the @ sigil) come equipped with a few simple but rather useful methods.
my @a = 1, 2, 3, 5, 7, 11;
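say @a.Int; # array length
say @a.Str; # space-separated values
If you print an array, you get its values as a space-separated list in square brackets. Alternatively, you can interpolate it into a string.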
my @a = 1, 2, 3, 5, 7, 11;
say @a; # [1 2 3 5 7 11]
say "This is @a: @a[]"; # This is @a: 1 2 3 5 7 11
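A few more everyday array operations can be sketched in the same style. The methods elems and join are standard; which of them you need is, of course, an assumption about your task.

```raku
my @a = 1, 2, 3, 5, 7, 11;

say @a.elems;        # 6  (same as @a.Int)
say +@a;             # 6  (numeric context gives the length, too)
say @a[0];           # 1
say @a[*-1];         # 11 (the last element)
say @a.join(', ');   # 1, 2, 3, 5, 7, 11
```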
1.6.6. Hash
Hashes provide a few methods with clear semantics, for example:
my %hash = Language => 'Perl', Version => 6;
say %hash.elems; # number of pairs in the hash
say %hash.keys; # the list of the keys
say %hash.values; # the list of the values
Here is the output:
2
(Version Language)
(6 Perl)
It is possible to iterate not only over the hash keys or values but also over whole pairs:
for %hash.pairs {
say $_.key;
say $_.value;
}
say %hash.kv; # (Version 6 Language Perl)
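Since the order of hash keys is not guaranteed, a loop that needs stable output can sort the pairs first. The sketch below assumes that sorting by key is what you want; the second loop shows how the flat list returned by kv can be consumed two items at a time.

```raku
my %hash = Language => 'Perl', Version => 6;

# Sorted by key, so the output order is predictable.
for %hash.sort(*.key) -> $pair {
    say "{$pair.key}: {$pair.value}";
}
# Language: Perl
# Version: 6

# Two values per iteration; the order of the pairs may vary here.
for %hash.kv -> $key, $value {
    say "$key => $value";
}
```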
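2. Operators
The meaning of many operators in Raku is quite obvious even for those who are not familiar with Perl 5. On the other hand, the behaviour of an operator sometimes contains small details that you may not expect. In this chapter, we list some operators and give a few comments where necessary. Operators can be divided into a few groups according to their syntactic properties. These groups are prefix, infix, and postfix operators, plus some other types not covered here (for example, circumfix, or “hamburger”, operators, such as a pair of parentheses).
2.1. Prefix operators
Prefix operators are those that come in front of their operands. Obviously, a prefix operator requires only one operand. In some cases, the same symbol can be used as an infix operator when it stands between two operands.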
2.1.1. !, not
! is the Boolean negation operator.
say !True; # False
say !(1 == 2) # True
The not operator does the same but has lower precedence.
say not False; # True
2.1.2. +
+ is the unary plus operator, which casts its operand to the numeric context. The action is equivalent to calling the Numeric method.
my Str $price = '4' ~ '2';
my Int $amount = +$price;
say $amount; # 42
say $price.Numeric; # 42
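We will see one of the important use cases of the unary plus in Chapter 6: +$/. That construct converts an object of the Match class, which contains information about the matched part of a regular expression, into a number.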
2.1.3. -
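- is the unary minus, which changes the sign of its operand. Because this operator silently calls the Numeric method, it can also cast the context, as the unary plus operator does.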
my Str $price = '4' ~ '2';
say -$price; # -42
2.1.4. ?, so
? is a unary operator that casts the context to a Boolean one by calling the Bool method on the object.
say ?42; # True
The second form, so, is a unary operator with lower precedence.
say so 42; # True
say so True; # True
say so 0.0; # False
2.1.5. ~
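~ casts an object to a string. Note that we are now talking about the prefix, or unary, operator. If the tilde is used as an infix (see later in this chapter about what infixes are), it works as the string concatenation operator, but it still deals with strings.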
my Str $a = ~42;
say $a.WHAT; #(Str)
In some cases, the string context can be created implicitly, for example, when you interpolate a variable inside double quotes.
2.1.6. ++
++ is the prefix increment operator. First, the increment is done, and then the new value is returned.
my $x = 41;
say ++$x; # 42
The increment operation is not limited to numbers. It can also handle strings.
my $a = 'a';
say ++$a; # b
A practical example is incrementing file names that contain numbers. The file extension survives, and only the numeric part is incremented.
my $f = "file001.txt";
++$f;
say $f; # file002.txt
++$f;
say $f; # file003.txt
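2.1.7. --
-- is the prefix form of decrement. It works exactly like the ++ prefix but, of course, makes the operand smaller (whether it is a string or a number).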
my $x = 42;
say --$x; # 41
2.1.8. +^
+^ is the bitwise negation operator with two’s complement.
my $x = 10;
my $y = +^$x;
say $y; # -11 (but not -10)
Compare this operator with the following one.
2.1.9. ?^
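?^ is the logical negation operator. Note that this is not a bitwise negation. First, the argument is converted to a Boolean value, and then the result is negated.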
my $x = 10;
my $y = ?^$x;
say $y; # False
say $y.WHAT; # (Bool)
2.1.10. ^
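^ is the range-creating operator, or the so-called upto operator. It creates a range (an object of the Range type) from 0 up to the given value (not including it).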
.print for ^5; # 01234
This code is equivalent to the following, where both ends of the range are specified explicitly:
.print for 0..4; # 01234
2.1.11. |
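| flattens compound objects into a list. For example, this operator should be used when you pass a list to a subroutine that expects a list of scalars: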
sub sum($a, $b) {
$a + $b
}
my @data = (10, 20);
say sum(|@data); # 30
Without the | operator, the compiler reports an error, because the subroutine expects two scalars and cannot accept an array as an argument:
Calling sum(Positional) will never work with declared signature ($a, $b)
2.1.12. temp
temp creates a temporary variable and restores its value at the end of the scope (as the local built-in does in Perl 5).
my $x = 'x';
{
temp $x = 'y';
say $x; # y
}
say $x; # x
Compare it with the following operator, let.
2.1.13. let
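let is a prefix operator that is similar to temp but works correctly with exceptions. The previous value of the variable is restored if the scope was left because of an exception.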
my $var = 'a';
try {
let $var = 'b';
die;
}
say $var; # a
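With die in place, this example code prints the initial value a. If you comment out the call of die, the assignment to b stays in effect, and the variable contains the value b after the try block. The let keyword looks similar to the my and our declarators, but it is a prefix operator.
2.2. Postfix operators
Postfix operators are unary operators placed after their single operand.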
2.2.1. ++
++ is the postfix increment. The value is changed after the current value has been used in the expression.
my $x = 42;
say $x++; # 42
say $x; # 43
2.2.2. --
-- is the postfix decrement.
Both postfix and prefix operators magically know how to deal with numbers in file names.
my $filename = 'file01.txt';
for 1..10 {
say $filename++;
}
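This example prints a list of file names with incrementing numbers: file01.txt, file02.txt, ... file10.txt.
2.3. Method postfix operators
There are some syntactic elements in Raku that start with a dot. These operators may look like postfix operators, but they are all forms of calling a method on an object. Unlike in Perl 5, the dot operator does not do any string concatenation.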
2.3.1. .
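.method calls a method on a variable. This works both with real objects and with those variables that are not instances of any class, for example, built-in types like integers.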
say "0.0".Numeric; # 0
say 42.Bool; # True
class C {
method m() {say "m()"}
}
my $c = C.new;
$c.m(); # m()
2.3.2. .=
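.=method is a mutating call of a method on an object. The call $x.=method is the same as the more verbose assignment $x = $x.method.
In the following example, the $o container initially holds an object of class C, but after $o.=m() the value is replaced with an instance of class D.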
class D { }
class C {
method m() {
return D.new;
}
}
my $o = C.new;
say $o.WHAT; # (C)
$o.=m();
say $o.WHAT; # (D)
2.3.3. .^
my Int $i;
say $i.^methods();
say $i.HOW.methods($i);
2.3.4. .?
class C {
method m() {'m'}
}
my $c = C.new();
say $c.?m(); # m
say $c.?n(); # Nil
2.3.5. .+
class A {
method m($x) {"A::m($x)"}
}
class B is A {
method m($x) {"B::m($x)"}
}
my $o = B.new;
my @a = $o.+m(7);
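say @a; # prints [B::m(7) A::m(7)]
Here, the $o object has the m method both in its own class B and in its parent class A. The $o.+m(7) call invokes both methods and puts their results into a list. If the method is not defined, an exception is thrown.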
2.3.6. .*
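2.4. Infix operators
Infix operators are placed in a program between two operands. Most infix operators are binary, and there is a single ternary operator, which expects three operands. The simplest example of a binary operator is the addition operator +. It expects two values on its left and right sides, for example, two variables: $a + $b. It is important to understand that the same symbol or the same sequence of characters can be either an infix or a prefix operator, depending on the context. In the example with the plus sign, the unary counterpart is the unary plus operator, which casts the operand to a number: +$str.
2.4.1. Arithmetic operators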
+, -, *, /
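+, -, *, and / are the operators that perform the corresponding arithmetic operations and need no comments. When working with Raku, remember that before the operation is executed, the operands are automatically converted to the Numeric type if necessary.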
%
% is the modulo operator, which returns the remainder of a division. The operands are converted to numeric values first if necessary.
div, mod
div is the integer division operator. The fractional part is discarded, and the result is rounded down to the preceding lower integer.
say 10 div 3; # 3
say -10 div 3; # -4
mod is another form of modulo:
say 10 % 3; # 1
say 10 mod 3; # 1
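Unlike the / and % operators, the div and mod forms do not cast their operands to the Numeric value. Compare the following two examples.
say 10 % "3"; # 1
With the mod operator, an error occurs: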
say 10 mod "3";
Calling 'infix:<mod>' will never work with argument types (Int, Str)
Expected any of: :(Real $a, Real $b)
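To satisfy the requirements, you can make the type conversion explicitly, either with the + prefix operator:
say 10 mod +"3"; # 1
or by calling the .Int method: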
say 10 mod "3".Int; # 1
%%
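%% is the so-called divisibility operator: it tells whether the integer division of the given pair of operands is possible with no remainder.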
say 10 %% 3; # False
say 12 %% 3; # True
+&, +|, +^
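+&, +|, and +^ are the bitwise AND, OR, and XOR operators. The plus sign in the operators indicates that the operands are converted to the integer type if necessary.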
?|, ?&, ?^
?|, ?&, and ?^ cast their operands to the Boolean type (hence the ? in the operator names) and perform the logical operations OR, AND, and XOR.
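The two families of operators can be compared side by side. In this sketch, the operands are written as binary literals only to make the bit patterns visible; the values themselves are arbitrary.

```raku
# Bitwise operators work on the integer bit patterns.
say 0b1100 +& 0b1010;   # 8   (0b1000)
say 0b1100 +| 0b1010;   # 14  (0b1110)
say 0b1100 +^ 0b1010;   # 6   (0b0110)

# Boolean operators first convert both operands to Bool.
say 5 ?| 0;             # True
say 5 ?& 0;             # False
say 5 ?^ 3;             # False (both operands are true)
```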
+<, +>
+< and +> are the left and right shift operators.
say 8 +< 2; # 32
say 1024 +> 8; # 4
gcd
lcm
== !=
<, >, <=, >=
<=>
2.4.2. String operators
~
x
eq, ne
lt, gt, le, ge
leg
2.4.3. Universal comparison operators
cmp
before, after
eqv
===
=:=
~~
2.5. List operators
2.5.1. xx
2.5.2. Z
2.5.3. X
2.6. Junction operators
2.6.1. |, &, ^
2.7. Short-circuit operators
2.7.1. &&
2.7.2. ||
2.7.3. ^^
2.7.4. //
2.8. Other infix operators
2.8.1. min, max
2.8.2. ?? !!
2.8.3. =
2.8.4. =>
2.8.5. ,
2.8.6. :
2.9. Meta-operators
2.9.1. Assignment
2.9.2. Negation
2.9.3. Reverse operator
2.9.4. Reduction
2.9.5. Cross operators
2.9.6. Zip meta-operator
2.10. Hyper-operators
2.10.1. >>>> <<<< <<>> >><<
Subroutines, or subs
For a sub that takes no arguments, its definition and its call are very straightforward.
sub call-me {
    say "I’m called"
}
call-me;
The syntax for declaring a sub’s parameters is similar to what other languages (including Perl 5.20 and higher) provide.
sub cube($x) {
    return $x ** 3;
}
say cube(3); # 27
The required parameters are a comma-separated list in the parentheses immediately after the sub name. No other keywords, such as my, are required to declare them.
sub min($x, $y) {
    return $x < $y ?? $x !! $y;
}
say min(-2, 2); # -2
say min(42, 24); # 24
(?? ... !! is the ternary operator in Raku. Also, there’s a built-in operator min; see the details in Chapter 2.)
The above-declared arguments are required; thus, if a sub is called with a different number of actual arguments, an error will occur.
Non-value argument passing
By default, arguments are passed by value, and the parameters are read-only: it is not possible to modify them inside the sub. To pass a variable by reference, add the is rw trait. (Note that formally this is not a reference but a mutable argument.)
sub inc($x is rw) {
    $x++;
}
my $value = 42;
inc($value);
say $value; # 43
Typed arguments
Similarly to the above-described typed variables, it is possible to indicate that the sub’s parameters are typed. To do so, add a type name before the name of the parameter.
sub say-hi(Str $name) {
    say "Hi, $name";
}
If the types of the expected and the actual parameters do not match, a compile-time error will occur.
say-hi("Mr. X"); # OK
# say-hi(123); # Error: Calling say-hi(Int) will never work
               # with declared signature (Str $name)
Optional parameters
Optional parameters are marked with a question mark after their names. The defined built-in function helps to tell if the parameter was really passed:
sub send-mail(Str $to, Str $bcc?) {
    if defined $bcc {
        # . . .
        say "Sent to $to with a bcc to $bcc.";
    }
    else {
        # . . .
        say "Sent to $to.";
    }
}
send-mail('mail@example.com');
send-mail('mail@example.com', 'copy@example.com');
Default values
Raku also allows specifying the default values of the sub’s arguments. Syntactically, this looks like an assignment.
sub i-live-in(Str $city = "Moscow") {
    say "I live in $city.";
}
i-live-in('Saint Petersburg');
i-live-in(); # The default city
It is also possible to pass values that are not known at the compile phase. When the default value is not a constant, it will be calculated at runtime.
sub to-pay($salary, $bonus = 100.rand) {
    return ($salary + $bonus).floor;
}
say to-pay(500, 50); # Always 550 net.
say to-pay(500); # Any number between 500 and 600.
say to-pay(500); # Same call but probably different output.
The “default” value will be calculated whenever it is required. Please also note that both rand and floor are called as methods, not as functions.
It is also possible to use previously passed parameters as default values:
sub f($a, $b = $a) {
    say $a + $b;
}
f(42); # 84
f(42, -1); # 41
Optional parameters or parameters with default values must be listed after all the required ones because otherwise, the compiler will not be able to understand which is which.
Named arguments
Apart from the positional parameters (those that have to go in the same order both in the sub definition and in the sub call), Raku allows named variables, somewhat similar to how you pass a hash to a Perl 5 subroutine. To declare a named parameter, a colon is used:
sub power(:$base, :$exponent) {
    return $base ** $exponent;
}
Now, the name of the variable is the name of the parameter, and the order is not important anymore.
say power(:base(2), :exponent(3)); # 8
say power(:exponent(3), :base(2)); # 8
It is also possible to have different names for the named arguments and for the variables that are used inside the sub. To give a different name, put it after the colon:
sub power(:val($base), :pow($exponent)) {
    return $base ** $exponent;
}
Now the sub expects new names of the arguments.
say power(:val(5), :pow(2)); # 25
say power(:pow(2), :val(5)); # 25
Alternatively, you can use the fat-arrow syntax to pass named parameters, as it is done in the following example:
say power(val => 5, pow => 2); # 25
Slurpy parameters and flattening
Raku allows passing scalars, arrays, hashes, or objects of any other type as the arguments to a sub. There are no restrictions regarding the combination and its order in a sub declaration. For example, the first argument may be an array, and the second one may be a scalar. Raku will pass the array as a whole. Thus the following scalar will not be eaten by the array.
In the following example, the @text variable is used inside the sub, and it contains only the values from the array passed in the sub call.
sub cute-output(@text, $before, $after) {
    say $before ~ $_ ~ $after for @text;
}
my @text = <C C++ Perl Go>;
cute-output(@text, '{', '}');
The output looks quite predictable.
{C}
{C++}
{Perl}
{Go}
The language expects that the sub receives the arguments of the same types that were listed in the sub declaration. That also means, for example, that if the sub is declared with only one list argument, then it cannot accept a few scalars.
sub get-array(@a) {
    say @a;
}
get-array(1, 2, 3); # Error: Calling get-array(Int, Int, Int)
                    # will never work with declared signature (@a)
To let an array accept a list of separate scalar values, you need to say that explicitly by placing an asterisk before the argument name. Such an argument is called slurpy.
sub get-array(*@a) {
    say @a;
}
get-array(1, 2, 3); # Good: [1 2 3]
Similarly, it will not work in the opposite direction, that is to say, when the sub expects to get a few scalars but receives an array when called.
sub get-scalars($a, $b, $c) {
    say "$a and $b and $c";
}
my @a = <3 4 5>;
get-scalars(@a); # Error: Calling get-scalars(Positional)
                 # will never work with declared
                 # signature ($a, $b, $c)
A vertical bar is used to unpack an array to a list of scalars.
get-scalars(|@a); # 3 and 4 and 5
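The two directions described above, slurping separate values into an array and flattening an array into separate values, can be put next to each other. This is only a sketch; the sub names total and describe are made up for the illustration.

```raku
# A slurpy parameter collects all positional arguments into one array.
sub total(*@numbers) {
    return [+] @numbers;
}

# This sub insists on exactly three separate arguments.
sub describe($first, $second, $third) {
    return "$first, $second, $third";
}

my @values = 10, 20, 30;

say total(1, 2, 3);        # 6
say total(@values);        # 60 (the slurpy parameter flattens the array)
say describe(|@values);    # 10, 20, 30 (the | unpacks the array)
```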
Nested subs
Nested subs are allowed in Raku.
sub cube($x) {
    sub square($x) {
        return $x * $x;
    }
    return $x * square($x);
}
say cube(3); # 27
The name of the inner sub square is only visible within the body of the outer sub cube.
Anonymous subs
Let’s look at the creation of anonymous subs. One of the options (there is more than one) is to use syntax similar to what you often see in JavaScript.
say sub ($x, $y) {$x ~ ' ' ~ $y}("Perl", 6);
The first pair of parentheses contains the list of formal arguments of the anonymous sub; the second, the list of the arguments passed. The body of the sub is located between the braces. (The tilde denotes a string concatenation operator in Raku.)
By the way, it is important that there be no spaces before the parentheses with the actual values of the sub parameters.
Another way of creating an anonymous sub is to use the arrow operator (->). We will discuss it later in the section dedicated to anonymous blocks.
Variables and signatures
Lexical variables
Lexical variables in Raku are those declared with the my keyword. These variables are only visible within the block where they were declared. If you tried accessing them outside the scope, you’d get the error: Variable '$x' is not declared.
{
    my $x = 42;
    say $x; # This is fine
}
# say $x; # This is not
To “extend” the scope, lexical variables can be used in closures. In the following example, the seq sub returns a block, which uses a variable defined inside the sub.
sub seq($init) {
    my $c = $init;
    return {$c++};
}
The sub returns a code block containing the variable $c. After the sub’s execution, the variable will not only still exist but also will keep its value, which you can easily see by calling a function by its reference a few times more.
my $a = seq(1);
say $a(); # 1
say $a(); # 2
say $a(); # 3
It is possible to create two independent copies of the local variable.
my $a = seq(1);
my $b = seq(42);
To see how it works, call the subs a few times:
say $a(); # 1
say $a(); # 2
say $b(); # 42
say $a(); # 3
say $b(); # 43
state variables
State variables (declared with the keyword state) appeared in Perl 5.10 and work in Raku. Such variables are initialized during the first call and keep their values in subsequent sub calls.
It is important to keep in mind that a single instance of the variable is created. Let us return to the example with a counter and replace the my declaration with the state one. The closure will now contain a reference to the same variable.
sub seq($init) {
    state $c = $init;
    return {$c++};
}
What happens when you create more than one closure?
my $a = seq(1);
my $b = seq(42);
All of them will reference the same variable, which will increase after calling either $a() or $b().
say $a(); # 1
say $a(); # 2
say $b(); # 3
say $a(); # 4
say $b(); # 5
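A state variable is also handy without any closure. The sketch below uses a hypothetical sub next-id that remembers its counter between calls; the initialization with 0 happens only once.

```raku
sub next-id {
    state $id = 0;    # initialized during the first call only
    return ++$id;
}

say next-id();   # 1
say next-id();   # 2
say next-id();   # 3
```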
Dynamic variables
The scope of dynamic variables is calculated at the moment when a variable is accessed. Thus, two or more calls of the same code may produce different results.
Dynamic variables are marked with the * twigil (a character clearly referencing a wildcard). In the following example, the echo() function prints a dynamic variable $*var, which is not declared in the function, nor is it a global variable. It, nevertheless, can be resolved when used in other functions, even if they have their own instances of the variable with the same name.
sub alpha {
    my $*var = 'Alpha';
    echo();
}
sub beta {
    my $*var = 'Beta';
    echo();
}
sub echo() {
    say $*var;
}
alpha(); # Alpha
beta(); # Beta
Anonymous code blocks
Raku introduces the concept of so-called pointy blocks (or pointy arrow blocks). These are anonymous closure blocks, which return a reference to the function and can take arguments.
The syntax of defining pointy blocks is an arrow -> followed by the argument list and a block of code.
my $cube = -> $x {$x ** 3};
say $cube(3); # 27
Here, the block {$x ** 3}, which takes one argument $x, is created first. Then, it is called using a variable $cube as a reference to the function: $cube(3).
Pointy blocks are quite handy in loops.
for 1..10 -> $c {
    say $c;
}
The for loop takes two arguments: the range 1..10 and the block of code with the argument $c. The whole construction looks like syntactic sugar for loops.
There can be more than one argument. In that case, list them all after the arrow.
my $pow = -> $x, $p {$x ** $p};
say $pow(2, 15); # 32768
The same works with loops and with other Raku elements where you need to pass anonymous code blocks.
for 0..9 -> $i, $j {
    say $i + $j;
}
In a loop iteration, two values from the list are consumed each time. So, the loop iterates five times and prints the sum of the pairs of numbers: 1, 5, 9, 13 and 17.
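Pointy blocks can be passed to methods such as map in the same way. The following sketch uses made-up data; the second loop takes two items per iteration, as described above, to fill a hash.

```raku
# A pointy block as the argument of map.
my @squares = (1..5).map(-> $n { $n ** 2 });
say @squares;    # [1 4 9 16 25]

# Two loop variables consume the list in pairs.
my %ages;
for <Ann 31 Bob 27> -> $name, $age {
    %ages{$name} = $age;
}
say %ages<Ann>;  # 31
say %ages<Bob>;  # 27
```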
Placeholders
When an anonymous code block is created, declaring a list of arguments is not mandatory even when a block takes an argument. To let this happen, Raku uses special variable containers, which come with the ^ twigil. This is similar to the predefined variables $a and $b in Perl 5.
In the case of more than one argument, their actual order corresponds to the alphabetical order of the names of the ^-ed variables.
my $pow = {$^x ** $^y};
say $pow(3, 4); # 81
The values 3 and 4, which were passed in the function call, will land in its variables $^x and $^y, respectively.
Now, let us go back to the loop example from the previous section and rewrite it in the form with no arguments (and thus, no arrow).
for 0..9 {
    say "$^n2, $^n1";
}
Note that the code block starts immediately after the list, and there is no arrow. There are two loop variables, $^n1 and $^n2, and they are not in alphabetical order in the code. Still, they get the values as though they were mentioned in the function signature as ($n1, $n2).
Finally, the placeholders may be named parameters. The difference is in the twigil. To make the placeholder named, use the colon :.
my $pow = {$:base ** $:exp};
say $pow(:base(25), :exp(2)); # 625
With the named placeholders, the alphabetical order is of no importance anymore. The following call gives us the same result.
say $pow(:exp(2), :base(25)); # 625
Keep in mind that using named placeholders is just a different way of specifying a signature to the block, and you cannot have both. The following example demonstrates that you cannot use a placeholder with the name of the already existing parameter:
sub f($a) {
    # say $^a; # Error: Redeclaration of symbol '$^a'
               # as a placeholder parameter
}
Nor can you use any other placeholder names if the signature of the sub is already defined:
sub f($a) {
    say $^b; # Error: Placeholder variable '$^b' cannot
             # override existing signature
}
Function overloading
The multi keyword allows defining more than one function (or subroutine, or simply sub) with the same name. The only restriction is that those functions should have different signatures. In Raku, the signature of the sub is defined together with its name, and the arguments may be typed. In the case of multi subs, typed arguments make even more sense because they help to distinguish between different versions of the function with a single name and make a correct choice when the compiler needs to call one of them.
multi sub twice(Int $x) {
    return $x * 2;
}
multi sub twice(Str $s) {
    return "$s, $s";
}
say twice(42); # 84
say twice("hi"); # hi, hi
As we have two functions here, one taking an integer argument and another expecting a string, the compiler can easily decide which one it should use.
Sub overloading with subtypes
Multi subs can be made even more specific by using subtypes. In Raku, subtypes are created with the subset keyword. A subtype definition takes one of the existing types and adds a restriction to select the values to be included in the subtype range.
The following lines give a clear view of how subtypes are defined. From the same integer type, Int, the Odd subtype selects only the odd numbers, while the Even subtype picks only the even numbers.
subset Odd of Int where {$^n % 2 == 1};
subset Even of Int where {$^n % 2 == 0};
Now, the subtypes can be used in the signatures of the multi subs. The testnum function has two versions, one for odd and one for even numbers.
multi sub testnum(Odd $x) {
    say "$x is odd";
}
multi sub testnum(Even $x) {
    say "$x is even";
}
Which function will be used in a call, testnum($x), depends on the actual value of the variable $x. Here is an example with the loop, calling either testnum(Even) for even numbers or testnum(Odd) for odd numbers.
for 1..4 -> $x {
    testnum($x);
}
The loop prints a sequence of alternating function call results, which tells us that Raku made a correct choice by using the rules provided in the subtype definitions.
1 is odd
2 is even
3 is odd
4 is even
Modules
Basically, the Raku modules are the files on disk containing the Raku code. Modules are kept in files with the .pm extension. The disk hierarchy reflects the namespace enclosure, which means that a module named X::Y corresponds to the file X/Y.pm, which will be searched for in one of the predefined catalogues or in the location specified by the -I command line option. Raku has more sophisticated rules for where and how to search for the real files (e. g., it can distinguish between different versions of the same module), but let us skip that for now.
module
The keyword module declares a module. The name of the module is given after the keyword. There are two methods of scoping the module. Either it can be a bare directive in the beginning of a file, or the whole module can be scoped in the code block within the pair of braces.
In the first option, the rest of the file is the module definition (note the presence of the unit keyword).
unit module X;
sub x() {
    say "X::x()";
}
In the second option, the code looks similar to the way you declare classes (more on classes in Chapter 4).
module X {
    sub x() {
        say "X::x()";
    }
}
export
The my and our variables, as well as subs, which are defined in the module, are not visible outside of its scope by default. To export a name, the is export trait is required.
unit module X;
sub x() is export {
    say "X::x()";
}
This is all you need to do to be able to call the x() sub in the programme using your module.
use
To use a module in your code, use the keyword use.
An example. Let us first create the module Greet and save it in the file named Greet.pm.
unit module Greet;
sub hey($name) is export {
    say "Hey, $name";
}
Then, let us use this module in our programme by saying use Greet.
use Greet;
hey("you"); # Hey, you
Module names can be more complicated. With is export, all the exported names will be available in the current scope after the module is used.
In the following example, the module Greet::Polite sits in the Greet/Polite.pm file.
module Greet::Polite {
    sub hello($name) is export {
        say "Hello, $name";
    }
}
The programme uses both of these modules and can access all the exported subs.
use Greet;
use Greet::Polite;
hey("you"); # a sub from Greet
hello("Mr. X"); # from Greet::Polite
import
The use keyword automatically imports the names from modules. When a module is defined in the current file in the lexical scope (please note that the module can be declared as local with my module), no import will be done by default. In this case, importing the names should be done explicitly with the import keyword.
my module M {
    sub f($x) is export {
        return $x;
    }
}
import M;
say f(42);
The f name will only be available for use after it is imported. Again, only the names marked as is export are exported.
As import happens at compile-time, the import instruction itself can be located even after some names from the module are used.
my module M {
    sub f($x) is export {
        return $x;
    }
}
say f(1); # 1
import M;
say f(2); # 2
need
To just load a module and do no exports, use the need keyword.
Let us create a module named N, which contains the sub n(). This time, the sub is declared as our but with no is export.
unit module N;
our sub n() {
    say "N::n()";
}
Then you need a module and may use its methods using the fully qualified names.
need N;
N::n();
The sequence of the two instructions: need M; import M; (now import should always come after the need) is equivalent to a single use M; statement.
require
The require keyword loads a module at runtime, unlike use, which loads it at compile-time.
For example, here is a module with a single sub, which returns the sum of its arguments.
unit module Math;
our sub sum(*@a) {
    return [+] @a;
}
(The star in *@a is required to tell Raku to pack all the arguments into a single array so that we can call the sub as sum(1, 2, 3). With no *, a syntax error will occur, as the sub expects an array but not three scalars.)
Now, require the module and use its sub.
require Math;
say Math::sum(24..42); # 627
Before the require Math instruction, the programme will not be able to call Math::sum() because the name is not yet known. A single import Math; will not help as the import happens at compile-time when the module is not loaded yet.
Import summary
Here is a concise list of the keywords for working with modules.
- use loads and imports a module at compile time
- need loads a module at compile time but does not import anything from it
- import imports the names from the loaded module at compile time
- require loads a module at runtime without importing the names
We have already seen elements of the object-oriented programming in Raku. Methods may be called on those variables, which do not look like real objects from the first view. Even more, methods may be called on constants. The types that were used earlier (like Int or Str) are container types. Variables of a container type can contain values corresponding to some native representation. The compiler does all the conversion it needs for executing a programme. For example, when it sees 42.say, it calls the say method, which the Int object inherits from the top of the type hierarchy in Raku. Raku also supports object-oriented programming in its general under- standing. If you are familiar with how to use classes in other modern programming languages, it will be easy for you to work with classes in Raku. This is how the class is declared:
class Cafe { }
Class attributes
Class data variables are called attributes. They are declared with the has keyword. An attribute’s scope is defined via its twigil. As usual, the first character of the twigil indicates the type of the container (thus, a scalar, an array, or a hash). The second character is either . if a variable is public or ! for the private ones. An accessor will be generated by the compiler for the public attributes.
class Cafe {
    has $.name;
    has @!orders;
}
To create or instantiate an object of the class X, the constructor is called: X.new(). This is basically a method derived from the Any class (this is one of the classes on the top of the object system in Raku).
my $cafe = Cafe.new( name => "Paris" );
At this point, you can read public attributes.
say $cafe.name;
Reading from $.name is possible because, by default, all public fields are readable and a corresponding access method for them is created. However, that does not allow changing the attribute. To make a field writable, indicate it explicitly by adding the is rw trait.
class Cafe {
has $.name is rw;
has @!orders;
}
my $cafe = Cafe.new( name => "Paris" );
Now, read and write actions are available.
$cafe.name = "Berlin";
say $cafe.name;
Class methods
The method keyword defines a method, similarly to how we define subroutines with sub. A method has access to all attributes of the class, both public and private.
The method itself can be private. We will return to this later after talking about inheritance. In the following short example, two methods are created, and each of them manipulates the private @!orders array.
class Cafe {
    has $.name;
    has @!orders;
    method order($what) {
        @!orders.push($what);
    }
    method list-orders {
        @!orders.sort.join(', ').say;
    }
}
my $cafe = Cafe.new( name => "Paris" );
$cafe.order('meat');
$cafe.order('fish');
$cafe.list-orders; # fish, meat
The code should be quite readable for people familiar with OOP. Just keep in mind that “everything is an object” and you may chain method calls.
@!orders.sort.join(', ').say;
Instance methods receive a special variable, self (having no sigil), which points to the current object. It can be used to access instance data or the class methods.
method order($what) {
    @!orders.push($what);
    self.list-orders;
}
method list-orders {
    say self.name;
    @!orders.sort.join(', ').say;
}
Inheritance
Inheritance is easy. Just say is Baseclass when declaring a class. Having said that, your class will be derived from the base class.
class A {
    method x { say "A.x" }
    method y { say "A.y" }
}
class B is A {
    method x { say "B.x" }
}
The further usage of the inherited classes is straightforward.
my $a = A.new;
$a.x; # A.x
$a.y; # A.y
my $b = B.new;
$b.x; # B.x
$b.y; # A.y
It is important that the result of the method search does not depend on which type was used to declare a variable. Raku always will first use the methods belonging to the class of the object that is currently stored in the variable container. For example, return to the previous example and declare the variable $b to be one of type A, but still create an instance of B with B.new. Even in that case, calling $b.x will still lead to the method defined in the derived class.
my A $b = B.new;
$b.x; # B.x
$b.y; # A.y
Meta-methods (which also are available for every object without writing any code) provide information about the class details. In particular, to see the exact order in which method resolution will be executed, call the .^mro metamethod.
say $b.^mro;
In our example, the following order will be printed.
((B) (A) (Any) (Mu))
Of course, you may call the .^mro method on any other variable or object in a programme, regardless of whether it is an instance of the user-defined class, a simple variable, or a constant. Just get an idea of how this is implemented internally.
$ raku -e'42.^mro.say'
((Int) (Cool) (Any) (Mu))
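The rule that dispatch follows the stored object rather than the declared container type can be checked with a short sketch. The class names Base and Derived are hypothetical, and the methods return strings instead of printing so that the results are easy to inspect.

```raku
class Base {
    method x { 'Base.x' }
    method y { 'Base.y' }
}
class Derived is Base {
    method x { 'Derived.x' }
}

# The container is typed as Base, but it holds a Derived instance.
my Base $obj = Derived.new;

say $obj.^name;      # Derived
say $obj.x;          # Derived.x
say $obj.y;          # Base.y
say $obj ~~ Base;    # True
```

The type constraint on the container only limits what may be stored in it; it plays no part in method lookup.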
Multiple inheritance
When more than one class is mentioned in the list of base classes, we have multiple inheritance.
class A {
    method a { say "A.a" }
}
class B {
    method b { say "B.b"; }
}
class C is A is B { }
my $c = C.new;
$c.a;
$c.b;
With multiple inheritance, method resolution order is more important, as different base classes may have methods with the same name, or, for example, the two base classes have another common parent. This is why the order in which the base classes are listed matters.
class A {
    method meth { say "A.meth" }
}
class B {
    method meth { say "B.meth"; }
}
class C is A is B { }
class D is B is A { }
Here, the method named meth exists in both parent classes A and B, thus calling it on variables of the types C and D will be resolved differently.
my $c = C.new;
$c.meth; # A.meth
my $d = D.new;
$d.meth; # B.meth
This behaviour is confirmed by the method resolution order list, which is actually used by the compiler.
$c.^mro.say; # ((C) (A) (B) (Any) (Mu))
$d.^mro.say; # ((D) (B) (A) (Any) (Mu))
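When the default resolution order picks the wrong parent, a class-qualified method call can name the parent explicitly. This is a small sketch, assuming the same A, B and C layout as above but with methods that return strings.

```raku
class A { method meth { 'A.meth' } }
class B { method meth { 'B.meth' } }
class C is A is B { }

my $c = C.new;

# Default lookup follows the method resolution order: C, A, B.
my $default = $c.meth;

# The qualified form asks for the method as defined in B.
my $qualified = $c.B::meth;

say $default;      # A.meth
say $qualified;    # B.meth
```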
Private (closed) methods
Now, after we have discussed inheritance, let us return to the private (or closed) methods. These methods may only be used within the class itself. Thus, you cannot call them from the programme that uses an instance of the class. Nor are they accessible in the derived classes. An exclamation mark is used to denote a private method. The following example demonstrates the usage of a private method of a class. The comments in the code will help you to understand how it works.
class A {
    # Method is only available within A
    method !private {
        say "A.private";
    }
    # Public method calling a private method
    method public {
        # You cannot avoid self here.
        # Consider the '!' as a separator like '.'
        self!private;
    }
}
class B is A {
    method method {
        # Again self, but with the '.' this time.
        # This is a public method.
        self.public;
        # This would be a compile-time error.
        # self!private;
    }
}
my $b = B.new;
$b.method; # A.private
The exclamation mark is actually part of the method name. So you can have both method meth and method !meth in the same class. To access them, use self.meth and self!meth, respectively:
class C {
    method meth { say 'meth' }
    method !meth { say '!meth' }
    method demo {
        self.meth;
        self!meth;
    }
}
my $c = C.new;
$c.demo; # Prints both meth and !meth
Submethods
Raku defines the so-called submethods for classes. These are methods that do not propagate to the subclass's definition. Submethods may be either private or public, but they will not be inherited by the children.
class A {
    submethod submeth { say "A.submeth" }
}
class B is A {
}
my A $a;
my B $b;
$a.submeth; # OK
# $b.submeth; # Error: No such method 'submeth' for invocant of type 'B'
Constructors
You may have noticed in the previous examples that two different approaches to creating a typed variable were used. The first was via an explicit call of the new constructor. In this case, a new instance was created.
my $a = A.new;
In the second, a variable was declared as a typed variable. Here, a container was created.
my A $a;
Creating a container means that the variable is allowed to host an object of that class, but the object itself still has to be created.
my A $a = A.new;
Let us consider an example of a class which involves one public method and one public data field.
class A {
    has $.x = 42;
    method m { say "A.m"; }
}
The internal public variable $.x is initialized with the constant value. Now, let us create a scalar container for the variable of the A class.
my A $a;
The container is here, and we know its type, but there are no data yet. At this moment, the class method may be called. It will work, as it is a class method and does not require any instance with real data.
$a.m; # Prints "A.m"
Meanwhile, the $.x field is not available yet.
say $a.x; # Error: Cannot look up attributes in a A type object
We need to create an instance object by calling a constructor first.
my A $b = A.new; say $b.x; # Prints 42
Please note that the initialization (= 42) only happens when a constructor is called. Prior to this, there is no object, and thus no value can be assigned to an attribute. The new method is inherited from the Mu class. It accepts a list of the named arguments. So, this method can be used on any object with any reasonable arguments. For instance:
my A $c = A.new(x => 14);
say $c.x; # 14, not 42
Note that the name of the field (x) must not be quoted. An attempt of A.new('x' => 14) will fail because it will be interpreted as a Pair being passed as a positional parameter. Alternatively, you can use the :named(value) format for specifying named parameters:
my A $c = A.new :x(14); # Or A.new(:x(14)) if you wish
say $c.x; # 14
For the more sophisticated constructors, the class's own BUILD submethod may be defined. This method expects to get a list of the named arguments.
class A {
    # Two fields in an object.
    # One of them will be calculated in the constructor.
    has $.str;
    has $!len;
    # The constructor expects its argument named 'str'.
    submethod BUILD(:$str) {
        # This field is being copied as is:
        $!str = $str;
        # And this field is calculated:
        $!len = $str.chars;
    }
    method dump {
        # Here, we print the current values.
        # The variables are interpolated as usual
        # but to escape an apostrophe character from
        # the variable name, a pair of braces is added.
        "{$.str}'s length is $!len.".say;
    }
}
my $a = A.new(str => "Perl");
$a.dump;
This programme prints the following output:
Perl's length is 4.
Roles
Apart from the bare classes, the Raku language allows roles. These are what are sometimes called interfaces in other object-oriented languages. Both the methods and the data, which are defined in a role, are available for "addition" (or mixing-in) to a new class with the help of the does keyword.
A role looks like a base class that appends its methods and data to generate a new type. The difference between prepending a role and deriving a class from a base class is that with a role, you do not create any inheritance. Instead, all the fields from the role become the fields of an existing class. In other words, classes are the is a characteristic of an object, while roles are the does traits. With roles, name conflicts will be found at compile time; there is no need to traverse the method resolution order paths.
The following example defines a role, which is later used to create two classes; we could achieve the same with bare inheritance, though:
# The role of the catering place is to take orders
# (via the order method), to count the total amount
# of the order (method calc) and to issue a bill (method bill).
role FoodService {
    has @!orders;
    method order($price) {
        @!orders.push($price);
    }
    method calc {
        # [+] is a reduction metaoperator that places +
        # between all the elements of an array.
        # It means that [+] @a is equivalent to
        # @a[0] + @a[1] + ... + @a[N].
        return [+] @!orders;
    }
    method bill {
        # Order's total is still a sum of the orders.
        return self.calc;
    }
}
# Launching a cafe. A cafe is a catering place.
class Cafe does FoodService {
    method bill {
        # But with a small surcharge.
        return self.calc * 1.1;
    }
}
# And now a restaurant.
class Restaurant does FoodService {
    method bill {
        # First, let the customer wait some time.
        sleep 10.rand;
        # Second, increase the prices even more.
        return self.calc * 1.3;
    }
}
Let us try that in action. First, the cafe.
my $cafe = Cafe.new;
$cafe.order(10);
$cafe.order(20);
say $cafe.bill; # Immediate 33
Then, the restaurant. (Note that this code will have a delay because of the sleep call in the bill method of the Restaurant class.)
my $restaurant = Restaurant.new;
$restaurant.order(100);
$restaurant.order(200);
say $restaurant.bill; # 390 after some unpredictable delay
Roles can be used for defining an API and forcing the presence of a method in a class that uses a role. For example, let's create a role named Liquid, which requires that the flows method must be implemented.
role Liquid { method flows {...}
}
class Water does Liquid { }
It is not possible to run this programme: it fails with a compile-time error because Water does not implement the flows method that the Liquid role requires.
Note that the ellipsis ... is a valid Raku construction that is used to create forward declarations.
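For comparison, here is a sketch of a class that satisfies the role. The body of flows is made up for illustration; any implementation of the method would do.

```raku
role Liquid {
    # The stub body {...} marks the method as required.
    method flows {...}
}

class Water does Liquid {
    # Supplying the method satisfies the role's requirement.
    method flows { 'Water flows downhill' }
}

my $water = Water.new;
say $water.flows;        # Water flows downhill
say $water ~~ Liquid;    # True
```

The smartmatch against Liquid shows that the class can be tested for the role in the same way as for a base class.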
Channels
Raku includes a number of solutions for parallel and concurrent calculations. The great thing is that this is already built into the language and no external libraries are required. The idea of the channels is simple. You create a channel through which you can read and write. It is a kind of a pipe that can also easily transfer Raku objects. If you are familiar with channels in, for example, Go, you will find Raku's channels easy to understand.
Read and write
In Raku, there is a predefined class Channel, which includes, among the others, the send and the receive methods. Here is the simplest example, where an integer number is first sent to the channel $c and is then immediately read from it.
my $c = Channel.new; $c.send(42); say $c.receive; # 42
A channel can be passed to a sub as any other variable. Should you do that, you will be able to read from that channel in the sub.
my $ch = Channel.new;
$ch.send(2017);
func($ch);
sub func($ch) {
    say $ch.receive; # 2017
}
It is possible to send more than one value to a channel. Of course, you can later read them all one by one in the same order as they were sent.
my $channel = Channel.new;
# A few odd numbers are sent to the channel.
for <1 3 5 7 9> {
    $channel.send($_);
}
# Now, we read the numbers while the channel has them.
# "while @a -> $x" creates a loop with the $x as a loop variable.
while $channel.poll -> $x {
    say $x;
}
# After the last available number, Nil is returned.
$channel.poll.say; # Nil
In the last example, instead of the previously used receive method, another one is used: $channel.poll. The difference lies in how they handle the end of the queue. When there are no more data in the channel, the receive will block the execution of the programme until new data arrive. Instead, the poll method returns Nil when no data are left. To prevent the programme from hanging after the channel data is consumed, close the channel by calling the close method.
$channel.close;
while $channel.receive -> $x {
say $x;
}
Now, you only read data, which are already in the channel, but after the queue is over, an exception will occur: Cannot receive a message on a closed channel. Thus either put a try block around it or use poll.
$channel.close;
try {
    while $channel.receive -> $x {
        say $x;
    }
}
Here, closing a channel is required to quit after the last data piece from the channel arrives.
The list method
The list method accompanies the previously seen methods and returns everything that is left unread in the channel.
my $c = Channel.new;
$c.send(5); $c.send(6);
$c.close;
say $c.list; # (5 6)
The method blocks the programme for as long as the channel is open, thus it is wise to close it before calling the list method.
Beyond scalars
Channels may also transfer both arrays and hashes and do it as easily as they work with scalars. Unlike Perl 5, an array will not be unfolded to a list of scalars but will be passed as a single unit. Thus, you may write the following code.
my $c = Channel.new; my @a = (2, 4, 6, 8); $c.send(@a);
say $c.receive; # [2 4 6 8]
The @a array is sent to the channel as a whole and later is consumed as a whole with a single receive call. What’s more, if you save the received value into a scalar variable, that variable will contain an array.
my $x = $c.receive; say $x.WHAT; # (Array)
The same discussions apply to hashes.
my $c = Channel.new; my %h = (alpha => 1, beta => 2); $c.send(%h);
say $c.receive; # {alpha => 1, beta => 2}
Instead of calling the list method, you can use the channel in the list context (but do not forget to close it first).
$c.close;
my @v = @$c;
say @v; # [{alpha => 1, beta => 2}]
Note that if you send a list, you will receive it as a list element of the @v array. Here is another example of “dereferencing” a channel:
$c.close;
for @$c -> $x {
    say $x;
} # {alpha => 1, beta => 2}
The closed method
The Channel class also defines a method that checks on whether the channel is closed. This method is called closed.
my $c = Channel.new;
say "open" if !$c.closed; # is open
$c.close;
say "closed" if $c.closed; # closed
Despite the simplicity of using the method, it in fact returns not a simple Boolean value but a promise object (a variable of the Promise class). A promise (we will talk about this later) can be either kept or broken. Thus, if the channel is open, the closed promise is not yet kept; it is only given (or planned).
Promise.new(status => PromiseStatus::Planned, ...)
After the channel is closed, the promise is kept.
Promise.new(status => PromiseStatus::Kept, ...)
You can see the state of the promise above in its status field.
In this section, we discussed the simplest applications of channels, where things happen in the same thread. The big thing about channels is that they transparently do the right thing if you’re sending in one or more threads, and receiving in another one or more threads. No value will be received by more than one thread, and no value shall be lost because of race conditions when sending them from more than one thread.
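The cross-thread behaviour described above can be sketched with a single producer that runs in its own thread (via start, which is covered in the next section) while the main thread consumes the values. The variable names are illustrative only.

```raku
my $channel = Channel.new;

# The producer runs in parallel: it sends five values
# and then closes the channel to signal the end of data.
my $producer = start {
    $channel.send($_) for 1..5;
    $channel.close;
}

# list blocks until the channel is closed and then
# yields everything that was sent, in sending order.
my @received = $channel.list;
await $producer;

say @received;    # [1 2 3 4 5]
```

With one producer the order of the values is preserved; with several producers only the relative order within each producer is meaningful.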
Promises
Promises are objects aimed to help synchronize parallel processes. The simplest use case involving them is to notify if the earlier given promise is kept or broken or if its status is not yet known.
Basics
The Promise.new constructor builds a new promise. Its status can be read using the status method. Before any other actions are done with the promise, its status remains Planned.
my $p = Promise.new; say $p.status; # Planned
When the promise is kept, call the keep method to update the status to the value of Kept.
my $p = Promise.new;
$p.keep;
say $p.status; # Kept
Alternatively, to break the promise, call the break method and set the status of the promise to Broken.
my $p = Promise.new;
say $p.status; # Planned
$p.break; say $p.status; # Broken
Instead of asking for a status, the whole promise object can be converted to a Boolean value. There is the Bool method for that; alternatively, the unary operator ? can be used instead.
say $p.Bool;
say ?$p;
Keep in mind that as a Boolean value can only take one of the two possible states, the result of the Boolean typecast is not a full replacement for the status method. There is another method for getting a result called result. It returns True if the promise has been kept.
my $p = Promise.new; $p.keep; say $p.result; # True
Be careful. If the promise is still in the Planned status at the moment result is called, the programme will be blocked until the promise is either kept or broken. In the case of the broken promise, the call of result throws an exception.
my $p = Promise.new; $p.break; say $p.result;
Run this programme and get the exception details in the console.
Tried to get the result of a broken Promise
To avoid quitting the programme under an exception, surround the code with the try block (but be ready to lose the result of say; it will not appear on the screen).
my $p = Promise.new; $p.break;
try { say $p.result; }
The cause method, when called instead of the result, will explain the details for the broken promise. The method cannot be called on the kept promise; doing so throws an exception.
Like with exceptions, both kept and broken promises can carry a message or an object. In this case, the result will return that message instead of a bare True or False. This is how a message is passed for the kept promise:
my $p = Promise.new;
$p.keep('All done');
say $p.status; # Kept
say $p.result; # All done
This is how it works with the broken promise:
my $p = Promise.new;
$p.break('Timeout');
say $p.status; # Broken
say $p.cause; # Timeout
Factory methods
There are a few factory methods defined in the Promise class.
start
The start method creates a promise containing a block of code. There is an alternative way to create a promise by calling Promise.start via the start keyword.
my $p = start {
}
(Note that in Raku, a semicolon is assumed after a closing brace at the end of a line.)
The start method returns a promise. It will be broken if the code block throws an exception. If there are no exceptions, the promise will be kept.
my $p = start {
    42
}
say $p.result; # 42
say $p.status; # Kept
Please note that the start instruction itself just creates a promise and the code from the code block will be executed on its own. The start method immediately returns, and the code block runs in parallel. A test of the promise status will depend on whether the code has been executed or not. Again, remember that result will block the execution until the promise is not in the Planned status anymore. In the given example, the result method returns the value calculated in the code block. After that, the status call will print Kept. If you swap the last two lines in the example, the result may be different. To make the test more robust, add a delay within the code block.
my $p = start {
    sleep 1;
    42
}
say $p.status; # Planned
say $p.result; # 42
say $p.status; # Kept
Now, it can be clearly seen that the first call of $p.status is happening immediately after the promise has been created and informs us that the promise is Planned. Later, after the result unblocked the programme flow in about a second, the second call of $p.status prints Kept, which means that the execution of the code block is completed and no exceptions were thrown.
Should the code block throw an exception, the promise becomes broken.
my $p = start {
die;
}
try {
say $p.result;
}
say $p.status; # This line will be executed
# and will print 'Broken'
The second thing you have to know when working with start is to understand what exactly causes an exception. For example, an attempt to divide by zero will only throw an exception when you try using the result of that division. The division itself is harmless. In Raku, this behaviour is called soft failure. Before the result is actually used, Raku assumes that the result is of the Rat (rational) type.
# $p1 is Kept
my $p1 = start {
my $inf = 1 / 0;
}
# $p2 is Broken
my $p2 = start {
my $inf = 1 / 0;
say $inf;
}
sleep 1; # Wait to make sure the code blocks are done
say $p1.status; # Kept
say $p2.status; # Broken
in and at
The other two factory methods, Promise.in and Promise.at, create a promise, which will be kept after a given number of seconds or by a given time. For example:
my $p = Promise.in(3);
for 1..5 {
say $p.status;
sleep 1;
}
The programme prints the following lines.
Planned
Planned
Planned
Kept
Kept
That means that the promise was kept after three seconds.
anyof and allof
Another pair of factory methods, Promise.anyof and Promise.allof, creates new promises, which will be only kept when at least one of the promises (in the case of anyof) is kept or, in the case of allof, all of the promises listed at the moment of creation are kept. One of the useful examples found in the documentation is a timeout keeper to prevent long calculations from hanging the programme. Create the promise $timeout, which must be kept after a few seconds, and the code block, which will be running for a longer time. Then, list them both in the constructor of Promise.anyof.
my $code = start {
sleep 5
}
my $timeout = Promise.in(3);
my $done = Promise.anyof($code, $timeout);
say $done.result;
The code should be terminated after three seconds. At this moment, the $timeout promise is kept, and that makes the $done promise be kept, too.
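After anyof returns, it is often useful to know which of the two promises won. The following sketch, with shortened and arbitrary delays, checks the status of the code promise to tell a normal finish from a timeout; the $outcome variable is introduced only for illustration.

```raku
# A long-running block and a shorter timeout.
my $code    = start { sleep 2; 'finished' }
my $timeout = Promise.in(0.5);

# Wait until at least one of the two promises is kept.
await Promise.anyof($code, $timeout);

# If the code promise is not kept yet, the timeout fired first.
my $outcome = $code.status ~~ Kept ?? $code.result !! 'timed out';
say $outcome;    # timed out
```

Note that the code block itself is not cancelled by the timeout; it simply keeps running in the background until the programme exits.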
then
The then method, when called on an already existing promise, creates another promise, whose code will be called after the "parent" promise is either kept or broken.
my $p = Promise.in(2);
my $t = $p.then({say "OK"}); # Prints this in two seconds
say "promised"; # Prints immediately
sleep 3;
say "done";
The code above produces the following output:
promised
OK
done
In another example, the promise is broken.
Promise.start({  # A new promise
    say 1 / 0    # generates an exception
                 # (the result of the division is used in say).
}).then({        # The code executed after the broken line.
    say "oops"
}).result        # This is required so that we wait until
                 # the result is known.
The only output here is the following:
oops
An example
Finally, a funny example of how promises can be used for implementing the sleep sort algorithm. In sleep sort, every integer number, consumed from the input, creates a delay proportional to its value. As the sleep is over, the number is printed out. Promises are exactly the things that will execute the code and tell the result after they are done. Here, a list of promises is created, and then the programme waits until all of them are done (this time, we do it using the await keyword).
my @promises;
for @*ARGS -> $a {
    @promises.push(start {
        sleep $a;
        say $a;
    })
}
await(|@promises);
Provide the programme with a list of integers:
$ raku sleep-sort.pl 3 7 4 9 1 6 2 5
For each value, a separate promise will be created with a respective delay in seconds. You may experiment and make smaller delays such as sleep $a / 10 instead. The presence of await ensures that the programme is not finished until all the promises are kept. As an exercise, let's simplify the code and get rid of an explicit array that collects the promises.
await do for @*ARGS {
    start {
        sleep $_;
        say $_;
    }
}
First, we use the $_ variable here and thus don’t have to declare $a. Second, notice the do for combination, which returns the result of each loop iteration. The following code will help you to understand how that works:
my @a = do for 1..5 {$_ * 2}; say @a; # [2 4 6 8 10]
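Besides waiting, await also hands back the values produced by the promises. A minimal sketch, assuming three small computations whose results are collected in the order in which the promises were listed:

```raku
# Each start block computes a square in parallel.
my @promises = (1..3).map: -> $n { start { $n * $n } };

# await returns the results in the order of the promises,
# not in the order in which they happened to finish.
my @squares = await @promises;

say @squares;    # [1 4 9]
```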
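3. Regular expressions and grammars
Grammars in Raku are the "next level" of the well-known regular expressions. Grammars let you create much more sophisticated text parsers. Using only the grammar facilities that Raku provides, you can create a new domain-specific language (DSL), a language translator, or an interpreter without any external help.
3.1. Regular expressions
In fact, Raku simply calls regular expressions regexes. The basic syntax differs slightly from Perl 5, but most elements (such as the quantifiers * or +) still look familiar. The regex keyword is used to build a regular expression. Let us create a regex for the short names of the days of the week.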
my regex weekday
{[Mon | Tue | Wed | Thu | Fri | Sat | Sun]};
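Square brackets enclose a group of alternatives.
You can use a named regex inside other regexes by referring to its name in a pair of angle brackets. To match a string against a regex, use the smartmatch operator (~~).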
say 'Thu' ~~ m/<weekday>/;
say 'Thy' ~~ m/<weekday>/;
These two matches print the following output:
「Thu」
weekday => 「Thu」
False
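The result of a match is an object of the Match type. When you print it, you see the matched substring inside small corner brackets 「…」.
A regex is the simplest named construct. Apart from it, there are rules and tokens (declared with the rule and token keywords, respectively).
Tokens differ from rules in how they handle whitespace. In a rule, whitespace is part of the regex. In a token, whitespace is only a visual separator. We will see more of this in the examples below.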
my token number_token { <[\d]> <[\d]> }
my rule number_rule { <[\d]> <[\d]> }
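(Note that no semicolon is needed after the closing curly brace.)
The <[…]> construct creates a character class. In the example above, the two-character string 42 matches the number_token token but does not match the number_rule rule.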
say 1 if "42" ~~ /<number_token>/;
say 1 if "42" ~~ /<number_rule>/;
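The difference becomes visible when a space is added between the digits. The sketch below repeats the two declarations and stores the four outcomes in an array named @results, which is introduced here only to make them easy to inspect.

```raku
my token number_token { <[\d]> <[\d]> }
my rule  number_rule  { <[\d]> <[\d]> }

# In the token, the space in the pattern is only layout,
# so the two digits must be adjacent in the input.
# In the rule, the space stands for whitespace between the digits.
my @results =
    so("42"  ~~ /<number_token>/),
    so("42"  ~~ /<number_rule>/),
    so("4 2" ~~ /<number_token>/),
    so("4 2" ~~ /<number_rule>/);

say @results;    # [True False False True]
```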
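3.2. The $/ object
As we have just seen, smartmatching a string against a regex returns an object of the Match type. That object is stored in the $/ variable. It also holds all the matched substrings. To keep (capture) a substring, use a pair of parentheses. The first capture has index 0, and you can access it as an array element with either the full syntax $/[0] or the shortened $0.
Keep in mind that even individual elements such as $0 or $1 still contain objects of the Match type. To convert them to strings or numbers, use the coercion syntax. For example, ~$0 converts the object to a string, and +$0 converts it to a number.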
'Wed 15' ~~ /(\w+) \s (\d+)/;
say ~$0; # Wed
say ~$1; # 15
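Captures can also be given names, which is often easier to read than positional indices. A small sketch on the same input; the names day and date are chosen for illustration:

```raku
# $<name>=( ... ) stores the capture under a name instead of an index.
my $match = 'Wed 15' ~~ / $<day>=(\w+) \s $<date>=(\d+) /;

my $day  = ~$match<day>;     # coerced to a string
my $date = +$match<date>;    # coerced to a number

say $day;     # Wed
say $date;    # 15
```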
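3.3. Grammars
Grammars are a development of regular expressions. Syntactically, a grammar is defined like a class but with the grammar keyword. Inside, it contains tokens and rules. In the next sections, we will explore grammars through examples.
3.3.1. A simple parser
The first example of a grammar application is a grammar for a tiny language that defines an assignment operation and contains a print instruction. Here is an example program in this language.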
x = 42;
y = x;
print x;
print y;
print 7;
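Let us start writing the grammar for the language. First, we have to express the fact that a program is a sequence of statements separated by semicolons. So, at the top level the grammar looks like this: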
grammar Lang {
rule TOP {
^ <statements> $
}
rule statements {
<statement>+ %% ';'
}
}
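Here, Lang is the name of the grammar, and TOP is the initial rule from which parsing starts. The body of the rule is a regex surrounded by the ^ and $ anchors, which tie the rule to the beginning and the end of the text. In other words, the whole program must match the TOP rule. The central part of the rule, <statements>, refers to another rule. In a rule, whitespace between the parts of the pattern stands for optional whitespace in the input, so you can freely add spaces to the grammar definition to make it easier to read.
The second rule explains the meaning of <statements>. The <statements> block is a sequence of separate statements. It must contain at least one statement, as the + quantifier demands, and the statements are delimited by semicolons. The separator follows the %% symbol. In a grammar, this means that there must be a separator between statements, and one more separator is allowed (but not required) after the last statement. With a single percent sign instead of two, the separator may only appear between statements, so a trailing separator after the last statement is not accepted.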
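The behaviour of the two separator forms is easy to check with a plain regex. This sketch uses single word characters as the repeated items and stores the four outcomes in hypothetical variables:

```raku
# % allows the separator only between items.
my $single-plain    = so 'a;b;c'  ~~ /^ \w+ % ';' $/;
my $single-trailing = so 'a;b;c;' ~~ /^ \w+ % ';' $/;

# %% additionally accepts one trailing separator.
my $double-plain    = so 'a;b;c'  ~~ /^ \w+ %% ';' $/;
my $double-trailing = so 'a;b;c;' ~~ /^ \w+ %% ';' $/;

say $single-plain;       # True
say $single-trailing;    # False
say $double-plain;       # True
say $double-trailing;    # True
```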
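The next step is to describe a statement. At the moment, our language has only two operations: assignment and printing. Each of them accepts either a value or a variable name.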
rule statement {
| <assignment>
| <printout>
}
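The vertical bar separates alternative branches, just as it does in Perl 5 regexes. To make the code look better and to simplify maintenance, an extra vertical bar can be added before the first subrule. The following two descriptions are identical: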
rule statement {
<assignment>
| <printout>
}
rule statement {
| <assignment>
| <printout>
}
Then, let us define what assignment and printout mean.
rule assignment {
<identifier> '=' <expression>
}
rule printout {
'print' <expression>
}
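Here, we see string literals, namely '=' and 'print'. Again, the spaces around them do not affect the rule.
An expression matches either an identifier (which is a variable name in our case) or a constant value. Thus, an expression is either an identifier or a value, with no additional strings.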
rule expression {
| <identifier>
| <value>
}
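At this point, we should write the rules for the identifier and the value. For these parts of the grammar, it is better to use another kind of method, the token. Whitespace matters for a token: unlike a rule, it does not skip whitespace in the input (the spaces adjacent to the braces of the definition are not significant).
An identifier is a sequence of letters: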
token identifier {
<:alpha>+
}
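Here, <:alpha> is a predefined character class containing all alphabetic characters.
A value in our example is a sequence of digits, so we limit ourselves to integers here.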
token value {
\d+
}
Our first grammar is complete. It can now be used to parse a text file.
my $parsed = Lang.parsefile('test.lang');
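If the contents of the file are already stored in a variable, you can parse them with the Lang.parse($str) method. (There is more about reading from files in the Appendix.)
If parsing succeeds, that is, if the file contains a valid program, the $parsed variable contains an object of the Match type. You can dump it (say $parsed) and see what is inside.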
「x = 42;
y = x;
print x;
print y;
print 7;」
statements => 「x = 42;
y = x;
print x;
print y;
print 7;」
statement => 「x = 42」
assignment => 「x = 42」
identifier => 「x」
expression => 「42」
value => 「42」
statement => 「y = x」
assignment => 「y = x」
identifier => 「y」
expression => 「x」
identifier => 「x」
statement => 「print x」
printout => 「print x」
expression => 「x」
identifier => 「x」
statement => 「print y」
printout => 「print y」
expression => 「y」
identifier => 「y」
statement => 「print 7」
printout => 「print 7」
expression => 「7」
value => 「7」
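This output corresponds to the example program at the beginning of this section. It contains the structure of the parsed program. The captured parts are shown inside the 「…」 brackets. First, the whole matched text is printed. Indeed, because the TOP rule uses the ^ … $ pair of anchors, the whole text must match the rule.
Then, the parse tree is printed. It starts with <statements>, and the other parts of the grammar follow exactly as the program in the file dictates. At the next level, you can see the contents of the identifier and value tokens.
If the program is grammatically incorrect, the parse method returns an empty value (Nil in current Rakudo releases). The same happens if only the initial part of the program matches the rules.
For convenience, here is the complete grammar: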
grammar Lang {
rule TOP {
^ <statements> $
}
rule statements {
<statement>+ %% ';'
}
rule statement {
| <assignment>
| <printout>
}
rule assignment {
<identifier> '=' <expression>
}
rule printout {
'print' <expression>
}
rule expression {
| <identifier>
| <value>
}
token identifier {
<:alpha>+
}
token value {
\d+
}
}
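To see the success and failure cases side by side, the grammar can be applied to strings with the parse method. The sketch below repeats the grammar so that it runs on its own; the three test strings and the variable names are made up for illustration.

```raku
grammar Lang {
    rule TOP        { ^ <statements> $ }
    rule statements { <statement>+ %% ';' }
    rule statement {
        | <assignment>
        | <printout>
    }
    rule assignment { <identifier> '=' <expression> }
    rule printout   { 'print' <expression> }
    rule expression {
        | <identifier>
        | <value>
    }
    token identifier { <:alpha>+ }
    token value      { \d+ }
}

# A valid program: parse returns a Match object, which is true.
my $valid = so Lang.parse("x = 42; print x;");

# The right-hand side of the assignment is missing.
my $broken = so Lang.parse("x = ;");

# Only the beginning matches, so the whole parse fails.
my $partial = so Lang.parse("x = 42; print");

say $valid;      # True
say $broken;     # False
say $partial;    # False
```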
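3.3.2. An interpreter
So far, the grammar sees the structure of a program and can tell whether it is grammatically correct, but it does not execute any of the instructions contained in the program. In this section, we will extend the parser so that it can actually execute the program.
Our example language uses variables and integer values. Values are constants and describe themselves. For variables, we need to create storage. In the simplest case, all variables are global, and a single hash is enough: my %var;.
The first action that we are going to implement is assignment. It takes a value and saves it in the variable storage. In the assignment rule of the grammar, an expression is expected on the right side of the equals sign. An expression can be either a variable or a number. To simplify looking up variable names, let us make the grammar a bit more complicated and split the assignment and printout rules into two alternatives each.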
rule assignment {
| <identifier> '=' <value>
| <identifier> '=' <identifier>
}
rule printout {
| 'print' <value>
| 'print' <identifier>
}
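3.4. Actions
Grammars in Raku allow actions in response to a rule or token match. An action is a block of code that is executed when the corresponding rule or token is found in the parsed text. An action receives the $/ object, in which you can see the details of the match. For example, the value of $<identifier> is an object of the Match type with information about the substring that the grammar actually consumed.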
rule assignment {
| <identifier> '=' <value>
{ say "$<identifier>=$<value>" }
| <identifier> '=' <identifier>
}
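If you update the grammar with the action above and run the program against the same example file, you will see the substring x=42 in the output.
Match objects are converted to strings when they are interpolated in double quotes, as in the given example: "$<identifier>=$<value>". To use the text value outside a quoted string, you should make an explicit type conversion: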
rule assignment {
| <identifier> '=' <value>
{%var{~$<identifier>} = +$<value> }
| <identifier> '=' <identifier>
}
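So far, we have an action that assigns a value to a variable and can handle the first line of the file. The variable storage will contain the pair {x => 42}.
In the second alternative of the assignment rule, the <identifier> name is mentioned twice; that is why you can refer to the two matches as array elements of $<identifier>.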
rule assignment {
| <identifier> '=' <value>
{
%var{~$<identifier>} = +$<value>
}
| <identifier> '=' <identifier>
{
%var{~$<identifier>[0]} =
%var{~$<identifier>[1]}
}
}
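This addition to the code makes it possible to parse an assignment with two variables: y = x. The %var hash will contain both values: {x => 42, y => 42}.
Alternatively, capturing parentheses can be used. In that case, to access the captured substrings, use special variables such as $0: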
rule assignment {
| (<identifier>) '=' (<value>)
{
%var{$0} = +$1
}
| (<identifier>) '=' (<identifier>)
{
%var{$0} = %var{$1}
}
}
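Here, the unary ~ operator is no longer needed when the variable is used as a hash key, but the unary + before $1 is still required to convert the Match object to a number.
Similarly, create the actions for printing.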
rule printout {
| 'print' <value>
{
say +$<value>
}
| 'print' <identifier>
{
say %var{$<identifier>}
}
}
Now, the grammar is able to carry out all the actions that the language design requires, and it prints the requested values:
42
42
7
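As long as we use capturing parentheses in the rules, the parse tree contains entries named 0 and 1 alongside the named ones such as identifier. You can see this clearly when parsing the y = x string: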
statement => 「y = x」
assignment => 「y = x」
0 => 「y」
identifier => 「y」
1 => 「x」
identifier => 「x」
The updated parser looks like this:
my %var;
grammar Lang {
rule TOP {
^ <statements> $
}
rule statements {
<statement>+ %% ';'
}
rule statement {
| <assignment>
| <printout>
}
rule assignment {
| (<identifier>) '=' (<value>)
{
%var{$0} = +$1
}
| (<identifier>) '=' (<identifier>)
{
%var{$0} = %var{$1}
}
}
rule printout {
| 'print' <value>
{
say +$<value>
}
| 'print' <identifier>
{
say %var{$<identifier>}
}
}
token identifier {
<:alpha>+
}
token value {
\d+
}
}
Lang.parsefile('data/test.lang');
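For convenience, the action code can be placed in a separate class. This helps a lot when actions are more complex and contain more than one or two lines of code.
To create external actions, create a class, which will later be referenced via the :actions parameter when calling the parse or parsefile method of the grammar. As with built-in actions, the actions in the external class receive the $/ object of the Match type.
First, we will practise on a small isolated example and then return to our custom language parser.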
grammar G {
rule TOP {^ \d+ $}
}
class A {
method TOP($/) {say ~$/}
}
G.parse("42", :actions(A));
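Both the grammar G and the action class A have a method called TOP. The common name connects the action with the corresponding rule. When the grammar parses the provided test string and consumes the value 42 with the ^ \d+ $ rule, the A::TOP action is triggered; the $/ argument is passed to it and is printed immediately.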
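The name-based link between rules and actions can be traced with a slightly larger sketch. The KeyValue grammar and the Trace class below are hypothetical and exist only to record which action runs and in what order.

```raku
grammar KeyValue {
    rule  TOP    { <key> '=' <number> }
    token key    { <:alpha>+ }
    token number { \d+ }
}

class Trace {
    has @.log;   # records which action ran, in order

    method key($/)    { @!log.push: "key $/" }
    method number($/) { @!log.push: "number $/" }
    method TOP($/)    { @!log.push: "TOP $/" }
}

my $trace = Trace.new;
KeyValue.parse('x = 1', :actions($trace));
.say for $trace.log;
# key x
# number 1
# TOP x = 1
```

The actions of the inner tokens fire first, because a rule only succeeds after all of its subrules have matched.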
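3.5. AST and attributes
Now we are ready to simplify the grammar again, after the assignment and printout rules were each split into two alternatives. The difficulty was that without the split it was impossible to tell which branch had been triggered: you either needed to read the value from the value token, or to take the variable name from the identifier token and look it up in the variable storage.
Raku grammars offer a nice mechanism that is common in language parsing theory: the abstract syntax tree, abbreviated as AST.
First, update the rules and remove some of the alternatives from them. The only rule that contains two branches is the expression rule.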
rule assignment {
<identifier> '=' <expression>
}
rule printout {
'print' <expression>
}
rule expression {
| <identifier>
| <value>
}
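The syntax tree built during the parsing phase can contain the results computed in previous steps. The Match object has an ast field, dedicated to keeping the computed value of each node. You can simply read the value to get the result of actions completed earlier. The tree is called abstract because how the value was calculated is not important. What matters is that when an action is triggered, there is a single place where you get the result needed to complete it.
An action can save its own result (and thus pass it further up the tree) by calling the $/.make method. The data you save there is accessible through the made field, which has the synonym ast.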
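The interplay of make, made, and ast can be tried on a tiny example before it is applied to the language parser. The Addition grammar and its action class are hypothetical names used only in this sketch.

```raku
grammar Addition {
    rule  TOP    { <number> '+' <number> }
    token number { \d+ }
}

class AdditionActions {
    # Each number node stores its numeric value in the tree.
    method number($/) {
        $/.make(+$/);
    }
    # TOP reads the values stored by its children; made and ast are synonyms.
    method TOP($/) {
        $/.make($<number>[0].made + $<number>[1].ast);
    }
}

my $result = Addition.parse('40 + 2', :actions(AdditionActions));
say $result.made;   # 42
say $result.ast;    # 42
```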
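Let us fill in the attributes of the syntax tree for the identifier and value tokens. A match with an identifier produces the variable name; when a value is found, the action generates a number. Here are the methods of the action class. (They read $0, so the tokens wrap their patterns in capturing parentheses, as in the complete listing at the end of this section.)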
method identifier($/) {
$/.make(~$0);
}
method value($/) {
$/.make(+$0);
}
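Move one step forward, to the place where we build the value of an expression. It can be either the value of a variable or an integer.
Because the expression rule has two alternatives, the first task is to understand which one has matched. To do that, check whether the corresponding field is present in the $/ object.
(If you use the recommended variable name $/ in the signature of the action method, you can access its fields in different ways. The full syntax is $/<identifier>, but there is a shorter version, $<identifier>.)
The two branches of the expression method behave differently. For a number, the value is extracted directly from the captured substring. For a variable, the value is taken from the %var hash. In both cases, the result is stored in the AST with the make method.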
method expression($/) {
if $<identifier> {
$/.make(%var{$<identifier>});
}
else {
$/.make(+$<value>);
}
}
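To allow the use of variables that have not been defined yet, we can add the defined-or operator, which makes such a variable evaluate to zero.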
$/.make(%var{$<identifier>} // 0);
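The behaviour of the defined-or operator on a hash lookup is easy to check on its own. This is a short sketch with a hypothetical %var hash that holds a single variable.

```raku
my %var = x => 42;

my $known   = %var<x> // 0;   # the key exists, so its value is used
my $unknown = %var<y> // 0;   # the key is missing, so the fallback is used

say $known;            # 42
say $unknown;          # 0
say %var<y>:exists;    # False
```

Note that the lookup does not create the missing key; the fallback only affects the value of the expression.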
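Now the expression has a value attributed to it, but it no longer knows where the value came from. It may be the value of a variable or a constant from the file. This makes the assignment and printout actions simpler: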
method printout($/) {
say $<expression>.ast;
}
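All that is needed to print the value is to take it from the ast field.
For the assignment, it is a bit more complex, but it can still be written in a single line.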
method assignment($/) {
%var{$<identifier>} = $<expression>.made;
}
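The method takes the $/ object and uses the values of its identifier and expression elements. The first one is converted to a string and becomes the key of the %var hash. From the second one, we get the value by reading its made attribute.
Finally, let us stop using the global variable storage and move the hash into the action class (we do not need it in the grammar itself). It is therefore declared as has %!var; and used in the action bodies as a private attribute: %!var{…}.
After this change, it is important to create an instance of the actions class before parsing with the grammar: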
Lang.parsefile(
'test.lang',
:actions(LangActions.new())
);
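The reason an instance is required can be shown with a class that has nothing to do with grammars. The VarStore class below is a hypothetical stand-in for the actions class: its private hash exists only inside an instance, so calling a method on the bare type object fails.

```raku
class VarStore {
    has %!var;   # private attribute, one copy per instance

    method set($name, $value) { %!var{$name} = $value }
    method get($name)         { %!var{$name} // 0 }
}

my $store = VarStore.new;
$store.set('x', 42);
say $store.get('x');   # 42

# On the type object there is no instance to hold %!var, so the call dies.
my $died = False;
try {
    VarStore.set('x', 42);
    CATCH { default { $died = True } }
}
say $died;   # True
```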
Here is the complete code of the parser with actions.
grammar Lang {
rule TOP {
^ <statements> $
}
rule statements {
<statement>+ %% ';'
}
rule statement {
| <assignment>
| <printout>
}
rule assignment {
<identifier> '=' <expression>
}
rule printout {
'print' <expression>
}
rule expression {
| <identifier>
| <value>
}
token identifier {
(<:alpha>+)
}
token value {
(\d+)
}
}
class LangActions {
has %!var;
method assignment($/) {
%!var{$<identifier>} = $<expression>.made;
}
method printout($/) {
say $<expression>.ast;
}
method expression($/) {
if $<identifier> {
$/.make(%!var{$<identifier>} // 0);
}
else {
$/.make(+$<value>);
}
}
method identifier($/) {
$/.make(~$0);
}
method value($/) {
$/.make(+$0)
}
}
Lang.parsefile(
'data/test.lang',
:actions(LangActions.new())
);
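3.6. Calculator
Among language parsers, implementing a calculator is like writing a "Hello, World!" program. In this section, we will create a grammar for a calculator that handles the four arithmetic operators and parentheses. The hidden advantage of the calculator example is that you have to teach it to respect operator precedence and nested expressions.
Our calculator grammar expects a single expression at the top level. Operator precedence is implemented automatically by the traditional approach to building grammars, in which an expression consists of terms and factors.
Terms are the parts separated by plus and minus signs: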
<term>+ %% ['+'|'-']
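Here, Raku's %% construct is used. You can rewrite this rule with more traditional quantifiers (note that %% additionally tolerates a trailing separator, so the single % form is the exact equivalent):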
<term> [['+'|'-'] <term>]*
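The two separator forms can be compared directly. In this sketch, single digits stand in for the terms; the test strings are assumptions chosen to show the one case where % and %% differ.

```raku
my $single = / ^ \d+ %  ['+'|'-'] $ /;   # separators only between items
my $double = / ^ \d+ %% ['+'|'-'] $ /;   # a trailing separator is also allowed

my $plain-single    = so '1+2-3'  ~~ $single;
my $trailing-single = so '1+2-3+' ~~ $single;
my $trailing-double = so '1+2-3+' ~~ $double;

say $plain-single;      # True
say $trailing-single;   # False
say $trailing-double;   # True
```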
In turn, each term is a list of factors separated by multiplication or division signs:
<factor>+ %% ['*'|'/']
Both terms and factors can contain either a value or a group in parentheses. A group is basically another expression.
rule group {
'(' <expression> ')'
}
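This rule refers to the expression rule, and thus can start another recursion loop.
It is time to introduce an enhanced value token so that it accepts floating-point values. The task is easy; it only requires a regex that matches as many formats as needed. I will skip negative numbers and numbers in scientific notation.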
token value {
| \d+ ['.' \d+]?
| '.' \d+
}
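A quick way to check which inputs such a token accepts is to run it against a few strings. This sketch declares a lexical copy of the token and uses the ? quantifier for the fractional part, which is an assumption on my side: with it, a string such as 1.2.3 is rejected instead of failing later during numeric conversion.

```raku
# Lexical copy of the token, with an optional (not repeated) fractional part.
my token value {
    | \d+ ['.' \d+]?
    | '.' \d+
}

my %parsed;
for '42', '3.14', '.5', '1.2.3' -> $input {
    # Store the number on success, or a marker string on failure.
    %parsed{$input} = $input ~~ / ^ <value> $ / ?? +$<value> !! 'no match';
    say "$input => %parsed{$input}";
}
# 42 => 42
# 3.14 => 3.14
# .5 => 0.5
# 1.2.3 => no match
```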
Here is the complete grammar of the calculator:
grammar Calc {
rule TOP {
^ <expression> $
}
rule expression {
| <term>+ %% $<op>=(['+'|'-'])
| <group>
}
rule term {
<factor>+ %% $<op>=(['*'|'/'])
}
rule factor {
| <value>
| <group>
}
rule group {
'(' <expression> ')'
}
token value {
| \d+ ['.' \d+]?
| '.' \d+
}
}
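Notice the $<op>=(…) construct in some of the rules. This is a named capture. The name simplifies access to the value through the $/ variable: you reach it as $<op> and do not have to worry about the name changing after the rule is updated, as happens with the numbered variables $0, $1, and so on.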
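The stability of a named capture can be illustrated by editing a regex and watching which variables move. Both regexes below are hypothetical; the second one merely adds an extra group in front of the first.

```raku
# Original form: one numbered capture and one named capture.
'3*4' ~~ / (\d+) $<op>=(['*'|'/']) /;
my $number-before = ~$0;       # the number is in $0
my $op-before     = ~$<op>;

# Updated form: a new group in front shifts the numbered captures.
'3*4' ~~ / (\s*) (\d+) $<op>=(['*'|'/']) /;
my $number-after = ~$1;        # the same number is now in $1
my $op-after     = ~$<op>;     # the named capture is unaffected

say "$number-before $op-before";   # 3 *
say "$number-after $op-after";     # 3 *
```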
Now, create the actions for the compiler. At the TOP level, the rule returns the calculated value, which it takes from the ast field of the expression.
class CalcActions {
method TOP($/) {
$/.make: $<expression>.ast
}
...
}
The actions for the basic group and value rules are as simple as the one we have just seen.
method group($/) {
$/.make: $<expression>.ast
}
method value($/) {
$/.make: +$/
}
The remaining actions are a bit more complex. The factor action contains two alternative branches, just as the factor rule does.
method factor($/) {
if $<value> {
$/.make: +$<value>
}
else {
$/.make: $<group>.ast
}
}
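Move on to the term action. Here, we have to handle a list of variable length. The rule's regex has the + quantifier, which means that it can capture one or more elements. Also, since the rule handles both the multiplication and the division operators, the two cases must be distinguished. The $<op> variable contains either the * or the / character.
This is what the syntax tree looks like for a string with three factors, 3*4*5: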
expression => 「3*4*5」
term => 「3*4*5」
factor => 「3」
value => 「3」
op => 「*」
factor => 「4」
value => 「4」
op => 「*」
factor => 「5」
value => 「5」
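As you can see, there are factor and op entries at the top level. In the action, you see them as the values of $<factor> and $<op>. At least one $<factor> is always available. The values of the nodes are already known and available in the ast attribute. So all you need to do is traverse the elements of both arrays and perform the multiplications or divisions.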
method term($/) {
my $result = $<factor>[0].ast;
if $<op> {
my @ops = $<op>.map(~*);
my @vals = $<factor>[1..*].map(*.ast);
for 0..@ops.elems - 1 -> $c {
if @ops[$c] eq '*' {
$result *= @vals[$c];
}
else {
$result /= @vals[$c];
}
}
}
$/.make: $result;
}
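In this code fragment, the star character appears in a new role as a placeholder that tells Raku to process the data it gets at that point. It sounds odd, but it works perfectly and intuitively.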
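A few standalone lines may make the placeholder role of the star more tangible. The array and the &double name below are made up for this sketch.

```raku
my @numbers = 10, 20, 30, 40;

# In an expression, * builds a small function of one argument.
my @plus-one = @numbers.map(* + 1);
say @plus-one;                        # [11 21 31 41]

# ~* stringifies each element, as in the term action.
say @numbers.map(~*).join('|');       # 10|20|30|40

# In a subscript, * stands for the number of elements.
say @numbers[1..*];                   # (20 30 40)
say @numbers[*-1];                    # 40

# The same construct can be stored and called like a routine.
my &double = * * 2;
say double(21);                       # 42
```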
The @ops array with the list of operator symbols consists of the elements that we get after stringifying the values of $<op>:
my @ops = $<op>.map(~*);
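The values themselves land in the @vals array. To make sure that the values of the two arrays, @vals and @ops, correspond to each other, take a slice of $<factor> that starts from the second element: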
my @vals = $<factor>[1..*].map(*.ast);
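Finally, the expression action either takes the calculated value of the group or performs a sequence of additions and subtractions. The algorithm is close to the one in the term action.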
method expression($/) {
if $<group> {
$/.make: $<group>.ast
}
else {
my $result = $<term>[0].ast;
if $<op> {
my @ops = $<op>.map(~*);
my @vals = $<term>[1..*].map(*.ast);
for 0..@ops.elems -1 -> $c {
if @ops[$c] eq '+' {
$result += @vals[$c];
}
else {
$result -= @vals[$c];
}
}
}
$/.make: $result;
}
}
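Since the term and expression actions repeat the same loop, one possible refactoring is to move it into a helper. This is only a sketch under my own assumptions: the fold-left routine and the two operator tables are hypothetical names that do not appear in the original code.

```raku
# Hypothetical helper: applies the operators from left to right.
sub fold-left($first, @ops, @vals, %table) {
    my $result = $first;
    # Z pairs each operator with the value that follows it.
    for @ops Z @vals -> ($op, $val) {
        $result = %table{$op}($result, $val);
    }
    return $result;
}

# Operator tables that map a symbol to the routine performing it.
my %mul-div = '*' => sub ($a, $b) { $a * $b },
              '/' => sub ($a, $b) { $a / $b };
my %add-sub = '+' => sub ($a, $b) { $a + $b },
              '-' => sub ($a, $b) { $a - $b };

say fold-left(3,  ['*', '*'], [4, 5], %mul-div);   # 60
say fold-left(39, ['+', '-'], [3, 2], %add-sub);   # 40
```

With such a helper, each action would only need to collect the first value, the operator symbols, and the remaining values, and then choose the appropriate table.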
Most of the calculator's code is ready. Now we need to read a string from the user, pass it to the parser, and print the result.
my $calc = Calc.parse(
@*ARGS[0],
:actions(CalcActions)
);
say $calc.ast;
Let us see whether it works.
$ raku calc.pl '39 + 3.14 * (7 - 18 / (505 - 502)) - .14'
42
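It does.
At github.com/ash/lang, you can find a continuation of the code demonstrated in this chapter. It combines the language translator and the calculator and lets the user write arithmetic expressions in variable assignments and print instructions. Here is an example that the interpreter can process: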
x = 40 + 2;
print x;
y = x - (5/2);
print y;
z = 1 + y * x;
print z;
print 14 - 16/3 + x;