Created Application overview (markdown)

2020-03-29 23:05:01 +02:00
parent 575667b16a
commit 2e34af38a0
1 changed files with 42 additions and 0 deletions
--- a/Application-overview.md
+++ b/Application-overview.md
@@ -0,0 +1,42 @@
+# Architecture overview
+[[schemas/architecture.svg]]
+
+The system is composed of the following components (technically, Gradle subprojects):
+* `core` - the SMNP language engine, consisting of the interpreter (which is actually a facade for the tokenizer, parser and evaluator) and the module management system
+* `app` - the command-line frontend for the `core` component
+* modules (`smnp.lang`, `smnp.io`, `smnp.audio.synth` etc.) - a set of external modules that extend the functionality of SMNP scripts
+* `api` - a component that provides the interfaces and abstract classes shared by `core` and every module component.
+
+# Interpreter
+The SMNP language interpreter is a facade over three parts composed into a pipeline:
+* tokenizer (or lexer)
+* parser
+* evaluator
+
+All of these components take part in processing and executing the passed code, each producing output that can be consumed by the next component in the pipeline.
+
+## Tokenizer
+The tokenizer is the first component in the code processing pipeline. Input code is passed directly to the tokenizer, which splits it into pieces called tokens. Each token consists of a few main properties, such as its value and its token type, for example:
+* `"Hello, world!"` is a token with value `Hello, world!` and token type `STRING`
+* `abc123` is a token with value `abc123` and token type `IDENTIFIER`
+
+Apart from this data, each token also carries some metadata, such as its location, which includes the column, line and source name (file name or module name).
+
+You can check which tokens are produced for arbitrary input code using the `--tokens` flag, for example:
+```
+smnp --tokens --dry-run -c "[1, 2, 3] as i ^ println(\"Current: \" + i.toString());"
+size: 21
+current: 0 -> (open_square, »[«, 1:1)
+all: [(open_square, »[«, 1:1), (integer, »1«, 1:2), (comma, »,«, 1:3), (integer, »2«, 1:5), (comma, »,«, 1:6), (integer, »3«, 1:8), (close_square, »]«, 1:9), (as, »as«, 1:11), (identifier, »i«, 1:14), (caret, »^«, 1:16), (identifier, »println«, 1:18), (open_paren, »(«, 1:25), (string, »"Current: "«, 1:26), (plus, »+«, 1:38), (identifier, »i«, 1:40), (dot, ».«, 1:41), (identifier, »toString«, 1:42), (open_paren, »(«, 1:50), (close_paren, »)«, 1:51), (close_paren, »)«, 1:52), (semicolon, »;«, 1:53)]
+```
+
+The tokenizer tries to match the input against all available patterns, following a first-match rule: if more than one pattern matches the input, only the first one is applied. This is why you can't, for example, use keywords as names of variables, functions or methods. Take a look at the output of the following command:
+```
+smnp --tokens --dry-run -c "function = 14;"
+size: 4
+current: 0 -> (function, »function«, 1:1)
+all: [(function, »function«, 1:1), (assign, »=«, 1:10), (integer, »14«, 1:12), (semicolon, »;«, 1:14)]
+```
+The first token has type `FUNCTION`, not `IDENTIFIER`, which is what an assignment operation expects on its left-hand side.
+
+All tokenizer-related code is located in the `io.smnp.dsl.token` module.
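
The first-match rule described in the Tokenizer section can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the code from `io.smnp.dsl.token`: the `Token` class, the `PATTERNS` list, the regular expressions and the `tokenize` function are all assumptions made for this example. The only ideas taken from the text are that patterns are tried in a fixed order, that the first matching pattern wins, and that each token records its location.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    type: str    # token type, e.g. "function" or "identifier"
    value: str   # the matched piece of input
    line: int    # 1-based line of the first character
    column: int  # 1-based column of the first character


# Hypothetical pattern list. Order matters: keyword patterns are listed
# before the identifier pattern, so a keyword is never seen as an identifier.
PATTERNS = [
    ("function", re.compile(r"function\b")),
    ("as", re.compile(r"as\b")),
    ("identifier", re.compile(r"[A-Za-z_]\w*")),
    ("integer", re.compile(r"\d+")),
    ("string", re.compile(r'"[^"\n]*"')),
    ("assign", re.compile(r"=")),
    ("semicolon", re.compile(r";")),
]


def tokenize(code):
    """Split code into tokens, applying only the first pattern that matches."""
    tokens = []
    line, column, pos = 1, 1, 0
    while pos < len(code):
        ch = code[pos]
        if ch == "\n":
            # A newline starts a new line and resets the column.
            line, column, pos = line + 1, 1, pos + 1
            continue
        if ch.isspace():
            # Other whitespace is skipped but still advances the column.
            column, pos = column + 1, pos + 1
            continue
        for token_type, pattern in PATTERNS:
            match = pattern.match(code, pos)
            if match:
                # First match wins: later patterns are not tried at all.
                tokens.append(Token(token_type, match.group(), line, column))
                column += match.end() - pos
                pos = match.end()
                break
        else:
            raise ValueError(f"Unknown symbol at {line}:{column}")
    return tokens
```

With this ordering, `tokenize("function = 14;")` returns a first token of type `function` at location 1:1, followed by `assign`, `integer` and `semicolon` tokens, which mirrors the CLI output shown earlier. Moving the `identifier` entry to the top of `PATTERNS` would turn `function` into an ordinary identifier, so in this sketch the pattern order alone is what reserves keywords.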