Change doc pages for syntax and indentation #11197

Merged: 2 commits, Jan 24, 2021
Changes from 1 commit
68 changes: 39 additions & 29 deletions docs/docs/internals/syntax.md
@@ -3,10 +3,10 @@ layout: doc-page
title: "Scala 3 Syntax Summary"
---

The following descriptions of Scala tokens uses literal characters `‘c’` when
The following description of Scala tokens uses literal characters `‘c’` when
referring to the ASCII fragment `\u0000` – `\u007F`.

_Unicode escapes_ are used to represent the Unicode character with the given
_Unicode escapes_ are used to represent the [Unicode character](https://www.w3.org/International/articles/definitions-characters/) with the given
hexadecimal code:

```ebnf
@@ -17,6 +17,7 @@ hexDigit ::= ‘0’ | … | ‘9’ | ‘A’ | … | ‘F’ | ‘a’ |
Informal descriptions are typeset as `“some comment”`.

### Lexical Syntax

The lexical syntax of Scala is given by the following grammar in EBNF
form.

@@ -28,9 +29,9 @@ letter ::= upper | lower “… and Unicode categories Lo, Lt, Nl”
digit ::= ‘0’ | … | ‘9’
paren ::= ‘(’ | ‘)’ | ‘[’ | ‘]’ | ‘{’ | ‘}’ | ‘'(’ | ‘'[’ | ‘'{’
delim ::= ‘`’ | ‘'’ | ‘"’ | ‘.’ | ‘;’ | ‘,’
opchar ::= “printableChar not matched by (whiteSpace | upper | lower |
letter | digit | paren | delim | opchar | Unicode_Sm |
Unicode_So)”
opchar ::= “printableChar not matched by (whiteSpace | upper |
lower | letter | digit | paren | delim | opchar |
Unicode_Sm | Unicode_So)”
printableChar ::= “all characters in [\u0020, \u007F] inclusive”
charEscapeSeq ::= ‘\’ (‘b’ | ‘t’ | ‘n’ | ‘f’ | ‘r’ | ‘"’ | ‘'’ | ‘\’)

@@ -42,7 +43,6 @@ plainid ::= alphaid
| op
id ::= plainid
| ‘`’ { charNoBackQuoteOrNewline | UnicodeEscape | charEscapeSeq } ‘`’
| INT // interpolation id, only for quasi-quotes
idrest ::= {letter | digit} [‘_’ op]
quoteId ::= ‘'’ alphaid

@@ -85,26 +85,36 @@ nl ::= “new line character”
semi ::= ‘;’ | nl {nl}
```

The lexical analyzer also inserts `indent` and `outdent` tokens that represent regions of indented code [at certain points](../reference/other-new-features-indentation.html).

In the context-free productions below we use the notation `<<< ts >>>`
to indicate a token sequence `ts` that is either enclosed in a pair of braces `{ ts }` or that constitutes an indented region `indent ts outdent`.
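
As an illustration of the `<<< ts >>>` notation, the following minimal sketch writes the same `match` expression once with braces and once as an indented region. The function names `describeBraces` and `describeIndented` are hypothetical and chosen only for this example; it assumes a Scala 3 compiler with default settings.

```scala
// The case clauses form the token sequence `ts` in both definitions.

// Form 1: `ts` enclosed in a pair of braces, `{ ts }`.
def describeBraces(n: Int): String =
  n match {
    case 0 => "zero"
    case _ => "nonzero"
  }

// Form 2: `ts` as an indented region, `indent ts outdent`.
def describeIndented(n: Int): String =
  n match
    case 0 => "zero"
    case _ => "nonzero"
```

Both definitions should behave identically, since only the delimiters of the case clauses differ.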


## Keywords

### Regular keywords

```
abstract case catch class def do else enum
export extends false final finally for given if
implicit import lazy match new null object package
private protected override return super sealed then throw
trait true try type val var while with
yield
: = <- => <: :> # @
=>> ?=>
abstract case catch class def do else
enum export extends false final finally for
given if implicit import lazy match new
null object override package private protected return
sealed super then throw trait true try
type val var while with yield
: = <- => <: :> #
@ =>> ?=>
```

### Soft keywords

```
derives end extension inline infix opaque open transparent using | * + -
derives end extension infix inline opaque open transparent using | * + -
```

See the [separate section on soft keywords](./soft-modifier.md) for additional
details on where a soft keyword is recognized.
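
A small sketch of what "soft" means in practice, assuming Scala 3 default settings: a soft keyword such as `open` or `infix` acts as a modifier only in modifier position and can otherwise be used as an ordinary identifier. The names `Shape`, `Circle`, and `SoftKeywordDemo` are hypothetical.

```scala
// `open` in modifier position: marks the class as open for extension.
open class Shape
class Circle extends Shape

object SoftKeywordDemo:
  // The same words used as plain identifiers after `val`.
  val open: String = "just a value"
  val infix: Int = 42

  // Reads both values back to show they are ordinary members.
  def summary: String = s"$open / $infix"
```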

## Context-free Syntax

The context-free syntax of Scala is given by the following EBNF
@@ -146,7 +156,7 @@ FunArgTypes ::= InfixType
| FunParamClause
FunParamClause ::= ‘(’ TypedFunParam {‘,’ TypedFunParam } ‘)’
TypedFunParam ::= id ‘:’ Type
MatchType ::= InfixType `match` ‘{’ TypeCaseClauses ‘}’
MatchType ::= InfixType `match` <<< TypeCaseClauses >>>
InfixType ::= RefinedType {id [nl] RefinedType} InfixOp(t1, op, t2)
RefinedType ::= WithType {[nl | ‘with’] Refinement} RefinedTypeTree(t, ds)
WithType ::= AnnotType {‘with’ AnnotType} (deprecated)
@@ -172,7 +182,7 @@ FunArgType ::= Type
ParamType ::= [‘=>’] ParamValueType
ParamValueType ::= Type [‘*’] PostfixOp(t, "*")
TypeArgs ::= ‘[’ Types ‘]’ ts
Refinement ::= ‘{’ [RefineDcl] {semi [RefineDcl]} ‘}’ ds
Refinement ::= <<< [RefineDcl] {semi [RefineDcl]} >>> ds
TypeBounds ::= [‘>:’ Type] [‘<:’ Type] TypeBoundsTree(lo, hi)
TypeParamBounds ::= TypeBounds {‘:’ Type} ContextBounds(typeBounds, tps)
Types ::= Type {‘,’ Type}
@@ -208,7 +218,7 @@ PostfixExpr ::= InfixExpr [id]
InfixExpr ::= PrefixExpr
| InfixExpr id [nl] InfixExpr InfixOp(expr, op, expr)
| InfixExpr MatchClause
MatchClause ::= ‘match’ ‘{’ CaseClauses ‘}’ Match(expr, cases)
MatchClause ::= ‘match’ <<< CaseClauses >>> Match(expr, cases)
PrefixExpr ::= [‘-’ | ‘+’ | ‘~’ | ‘!’] SimpleExpr PrefixOp(expr, op)
SimpleExpr ::= SimpleRef
| Literal
@@ -235,7 +245,7 @@ ParArgumentExprs ::= ‘(’ [‘using’] ExprsInParens ‘)’
| ‘(’ [ExprsInParens ‘,’] PostfixExpr ‘:’ ‘_’ ‘*’ ‘)’ exprs :+ Typed(expr, Ident(wildcardStar))
ArgumentExprs ::= ParArgumentExprs
| BlockExpr
BlockExpr ::= ‘{’ (CaseClauses | Block) ‘}’
BlockExpr ::= <<< (CaseClauses | Block) >>>
Block ::= {BlockStat semi} [BlockResult] Block(stats, expr?)
BlockStat ::= Import
| {Annotation {nl}} [‘implicit’ | ‘lazy’] Def
@@ -354,7 +364,6 @@ EndMarkerTag ::= id | ‘if’ | ‘while’ | ‘for’ | ‘match’ |
RefineDcl ::= ‘val’ ValDcl
| ‘def’ DefDcl
| ‘type’ {nl} TypeDcl
| INT
Dcl ::= RefineDcl
| ‘var’ VarDcl
ValDcl ::= ids ‘:’ Type PatDef(_, ids, tpe, EmptyTree)
@@ -370,7 +379,7 @@ Def ::= ‘val’ PatDef
| ‘type’ {nl} TypeDcl
| TmplDef
PatDef ::= ids [‘:’ Type] ‘=’ Expr
| Pattern2 [‘:’ Type] ‘=’ Expr PatDef(_, pats, tpe?, expr)
| Pattern2 [‘:’ Type] ‘=’ Expr PatDef(_, pats, tpe?, expr)
VarDef ::= PatDef
| ids ‘:’ Type ‘=’ ‘_’
DefDef ::= DefSig [‘:’ Type] ‘=’ Expr DefDef(_, name, tparams, vparamss, tpe, expr)
@@ -389,18 +398,19 @@ GivenDef ::= [GivenSig] (AnnotType [‘=’ Expr] | ConstrApps Templat
GivenSig ::= [id] [DefTypeParamClause] {UsingParamClause} ‘:’ -- one of `id`, `DefParamClause`, `UsingParamClause` must be present
Extension ::= ‘extension’ [DefTypeParamClause] ‘(’ DefParam ‘)’
{UsingParamClause}] ExtMethods
ExtMethods ::= ExtMethod | [nl] ‘{’ ExtMethod {semi ExtMethod ‘}’
ExtMethods ::= ExtMethod | [nl] <<< ExtMethod {semi ExtMethod >>>
ExtMethod ::= {Annotation [nl]} {Modifier} ‘def’ DefDef
Template ::= InheritClauses [TemplateBody] Template(constr, parents, self, stats)
InheritClauses ::= [‘extends’ ConstrApps] [‘derives’ QualId {‘,’ QualId}]
ConstrApps ::= ConstrApp ({‘,’ ConstrApp} | {‘with’ ConstrApp})
ConstrApp ::= SimpleType1 {Annotation} {ParArgumentExprs} Apply(tp, args)
ConstrExpr ::= SelfInvocation
| ‘{’ SelfInvocation {semi BlockStat} ‘}’
| <<< SelfInvocation {semi BlockStat} >>>
SelfInvocation ::= ‘this’ ArgumentExprs {ArgumentExprs}

TemplateBody ::= [nl | ‘with’] ‘{’ [SelfType] TemplateStat {semi TemplateStat} ‘}’
| ‘with’ [SelfType] indent TemplateStats outdent
TemplateBody ::= [nl | ‘with’] <<< [SelfType] TemplateStats >>>
| ‘with’ SelfType indent TemplateStats outdent
TemplateStats ::= TemplateStat {semi TemplateStat}
TemplateStat ::= Import
| Export
| {Annotation [nl]} {Modifier} Def
@@ -412,14 +422,14 @@ TemplateStat ::= Import
SelfType ::= id [‘:’ InfixType] ‘=>’ ValDef(_, name, tpt, _)
| ‘this’ ‘:’ InfixType ‘=>’

EnumBody ::= [nl | ‘with’] ‘{’ [SelfType] EnumStats ‘}’
EnumBody ::= [nl | ‘with’] <<< [SelfType] EnumStats >>>
| ‘with’ [SelfType] indent EnumStats outdent
EnumStats ::= EnumStat {semi EnumStat}
EnumStat ::= TemplateStat
| {Annotation [nl]} {Modifier} EnumCase
EnumCase ::= ‘case’ (id ClassConstr [‘extends’ ConstrApps]] | ids)

TopStatSeq ::= TopStat {semi TopStat}
TopStats ::= TopStat {semi TopStat}
TopStat ::= Import
| Export
| {Annotation [nl]} {Modifier} Def
@@ -428,8 +438,8 @@ TopStat ::= Import
| PackageObject
| EndMarker
|
Packaging ::= ‘package’ QualId [nl| ‘with’] ‘{’ TopStatSeq ‘}’ Package(qid, stats)
Packaging ::= ‘package’ QualId [nl| ‘with’] <<< TopStats >>> Package(qid, stats)
PackageObject ::= ‘package’ ‘object’ ObjectDef object with package in mods.

CompilationUnit ::= {‘package’ QualId semi} TopStatSeq Package(qid, stats)
CompilationUnit ::= {‘package’ QualId semi} TopStats Package(qid, stats)
```
17 changes: 8 additions & 9 deletions docs/docs/reference/other-new-features/indentation.md
@@ -58,23 +58,20 @@ There are two rules:

An indentation region can start

- after the leading parameters of an `extension`, or
- after a `with` in a given instance, or
- after a ": at end of line" token (see below)
- after one of the following tokens:

```
= => ?=> <- catch do else finally for
if match return then throw try while yield
= => ?=> <- catch do else finally for if
match return then throw try with while yield
```
- after the leading parameters of an `extension`.

If an `<indent>` is inserted, the indentation width of the token on the next line
is pushed onto `IW`, which makes it the new current indentation width.

2. An `<outdent>` is inserted at a line break, if

- the first token on the next line has an indentation width strictly less
than the current indentation width, and
than the current indentation width, and
- the last token on the previous line is not one of the following tokens
which indicate that the previous statement continues:
```
@@ -87,9 +84,11 @@ There are two rules:
If the indentation width of the token on the next line is still less than the new current indentation width, step (2) repeats. Therefore, several `<outdent>` tokens
may be inserted in a row.

An `<outdent>` is also inserted if the next token following a statement sequence starting with an `<indent>` closes an indentation region, i.e. is one of `then`, `else`, `do`, `catch`, `finally`, `yield`, `}`, `)`, `]` or `case`.
The following two additional rules support parsing of legacy code with ad-hoc layout. They might be withdrawn in future language versions:

- An `<outdent>` is also inserted if the next token following a statement sequence starting with an `<indent>` closes an indentation region, i.e. is one of `then`, `else`, `do`, `catch`, `finally`, `yield`, `}`, `)`, `]` or `case`.

An `<outdent>` is finally inserted in front of a comma that follows a statement sequence starting with an `<indent>` if the indented region is itself enclosed in parentheses
- An `<outdent>` is finally inserted in front of a comma that follows a statement sequence starting with an `<indent>` if the indented region is itself enclosed in parentheses
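
The interaction of the two main rules with the stack `IW` can be sketched with a deliberately simplified model. This is not the compiler's implementation: the hypothetical function `layoutTokens` looks only at the indentation width of each line, so it ignores the token conditions (which tokens may start a region, which tokens signal a continuation), the two legacy rules, and the error check for widths that match no enclosing line.

```scala
// Simplified model of <indent>/<outdent> insertion, driven only by line widths.
// `widths` holds the indentation width of the first token on each line.
def layoutTokens(widths: List[Int]): List[String] =
  var iw = List(0)                      // stack of indentation widths, innermost first
  val out = List.newBuilder[String]
  for w <- widths do
    if w > iw.head then
      iw = w :: iw                      // rule 1: push the new width onto IW
      out += "<indent>"
    else
      while iw.head > w do              // rule 2: repeat until the width fits
        iw = iw.tail
        out += "<outdent>"
  while iw.tail.nonEmpty do             // close regions still open at end of input
    iw = iw.tail
    out += "<outdent>"
  out.result()
```

For example, `layoutTokens(List(0, 2, 4, 0))` yields two `<indent>` tokens followed by two `<outdent>` tokens, showing how a single line break can close several regions in a row.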

It is an error if the indentation width of the token following an `<outdent>` does not match the indentation of some previous line in the enclosing indentation region. For instance, the following would be rejected.
