Lexical Elements
Lexical elements
Comments
There two forms of comments:
Line based comments that start with either
#
or//
and continue until the end of the current lineand block comments that start with
/*
and continue until the sequence*/
is found.
Comments cannot start inside string-ish literals (including characters and regexes).
// this is a line not processed # this line too /* And here we have an whole region of the file that's not processed! */
Doc-Comments
Doc comments are a special sub-variant of comments: they are intended for documentation of all declarative elements and are supported by all offical lapyst tooling (when applicable, such as documentation generators).
Doc comments come in two variants:
Block doc-comments are like start like a regular block comment, but have an extra
*
in their opening tag:/**
.Line doc-comments start with
///
instead of the normal//
and in addition group together, as long as there is no non-doc comment line inbetween two doc comment lines.
Tokens
explain this |
Semicolons
Lapyst uses semicolons ;
as terminators in a number of productions in the languages grammar. There are no rules or cases where you can omit them.
Identifiers
A identifier is used to name (or 'identify') entities from each other inside a program.
identifier = letter { letter | unicode_digit } ;
Keywords
The following keywords are reserved and cannot be used as identifiers.
arguments else in redo then as elsif include retry throw break end instanceof return to case ensure macro role true cast enum module self try catch export namespace shape unit const false next shapeof unless dec for new static use def from nil step var default if of super while do import prop switch
Operators and punctuation
+ & += &= && &&= == === ( ) - | -= |= || ||= != !== [ ] + ^ *= ^= ?? ??= < <= { } ** << **= <<= ++ > >= , ; / >> /= >>= -- = ... . : % ~ %= ! =~
Integer literals
Integer literals are a sequence of digits representing an integer. Optional prefixes sets non-decimal bases: 0b
or 0B
for binary, 0o
or 0O
for octal, 0x
or 0X
for hexadecimal. A single 0
is considered a decimal zero. In hexadecimal literals, letters a
through f
and A
through F
represent values 10 through 15.
For readability, underscore characters _ may appear after a base prefix or between digits; these underscores do not change the value the integer literal represent.
int_lit = dec_lit | bin_lit | oct_lit | hex_lit ; dec_lit = "0" | ( "1" ... "9" ) [ [ "_" ] dec_digits ] ; bin_lit = "0" ( "b" | "B" ) { "_" } bin_digits ; oct_lit = "0" ( "o" | "O" ) { "_" } oct_digits ; hex_lit = "0" ( "x" | "X" ) { "_" } hex_digits ; dec_digits = dec_digit { { "_" } dec_digit } ; bin_digits = bin_digit { { "_" } bin_digit } ; oct_digits = oct_digit { { "_" } oct_digit } ; hex_digits = hex_digit { { "_" } hex_digit } ;
Floating-point literals
Floating point literals consists of an integer part (decimal digits), a decimal point, a fractional part (decimal digits), and an exponent part (e or E followed by an optional sign and decimal digits). One of the integer part or the fractional part may be omitted; one of the decimal point or the exponent part may be omitted. An exponent value exp scales the mantissa (integer and fractional part) by 10exp.
For readability, underscore characters _ may appear between digits; these underscores do not change the value the float literal represent.
float_lit = dec_digits "." [ dec_digits ] [ decimal_exponent ] | dec_digits decimal_exponent | "." dec_digits [ decimal_exponent ] ; decimal_exponent = ( "e" | "E" ) [ "+" | "-" ] dec_digits ;
Character literals
A character literal (sometimes also refered to as a rune literal), is used to represent a single character / rune. They are at most one unicode character long in the source, except they are a escaped char.
Escaped chars are a special series of characters in the source to represent an single character, which most of the time are unprintable in the source.
explain escaped chars more |
\a U+0007 alert or bell \b U+0008 backspace \f U+000C form feed \n U+000A line feed or newline \r U+000D carriage return \t U+0009 horizontal tab \v U+000B vertical tab \\ U+005C backslash \' U+0027 single quote \" U+0022 double quote
Any unrecognized character after a backslash in a character literal is considered a error.
char_lit = "'" ( unicode_value | byte_value ) "'" ; unicode_value = unicode_char | little_u_value | escaped_char ; byte_value = hex_byte_value; hex_byte_value = `\` "x" hex_digit hex_digit ; little_u_value = `\` "u" ( "{" { hex_digit } "}" | hex_digit hex_digit hex_digit hex_digit ) ; escaped_char = `\` ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | `\` | "'" | `"` ) ;
String literals
TODO: document this |
string_lit = `"` { unicode_value | byte_value } `"` ;
Boolean literals
The keywords true
and false
are used to express the builtin boolean type.
The nil literal
The keyword nil
is specially used in mutliple places to represent the absence of an normal value.
TODO: add more documentation maybe? |