Motivation
This post is an experiment in using a Lisp-like syntax, with hints of JSON, for storing structured data as text. The main motivation was the need for a less verbose alternative to XML that is still readable and editable by hand (a weak point of JSON) and that supports pattern matching for transforming the data.
The current specification has errors and is incomplete. The programming elements are currently in the toy stage.
Specification
(@tuples content: (##

Info
  An experimental data format using an s-expression-like syntax for storing tree-based data as text.
  A tuple is a named or unnamed pair of parentheses that encloses a series of expressions separated by whitespace,
  where each expression is either another tuple, a primitive, or a pair bound by a binary operator, where
    :  gives a key-value pair used for named attributes
    == gives an assertion-rule pair used for pattern matching
    -> gives a substitution-rule pair used for pattern matching
    => gives a combination-rule pair used for pattern matching
  The format also specifies primitives for common data like integers, floats, bytearrays, booleans and @tuples itself.
  An unpaired string can be regarded as a comment.

Syntax
  ()      - encloses an (un)named tuple
  ##      - begins a text block (nested text blocks are allowed but unbalanced text blocks must use quotation instead)
  ##      - ends a text block
  ""      - encloses a series of quotation-escaped characters
  @string - gives one or more strings (explicit form of text block or quoted characters)
  :       - a key-value pair
  =>      - rule that matches the left side and then combines it with the right side
  ->      - rule that matches the left side and then substitutes it with the right side
  ==      - rule that matches the left side and then asserts that the right side is present
  ''      - encloses a variable that either specifies a type or gives a named reference
  @tuples - gives the special tuple that specifies content and rules
  @bytes  - gives a base64-encoded text containing unsigned bytes
  @int    - gives one or more signed integers
  @float  - gives one or more floating point numbers
  @bool   - gives one or more booleans using true and false

Pattern matching
  - Explicit application of pattern matching is handled at program level
  - The @tuples rules attribute stores pattern matching rules
  - The @tuples content attribute stores data
  - Pattern matching variables can be used
      (Data value: 'PI')            - as a shared reference within the @tuples content attribute
      ('PI' -> (@float 3.14))       - for substitution within the @tuples rules attribute
      ((Data) == (value: '@float')) - to match type within the @tuples rules attribute

Implicit primitives
  "some characters" - implicit typed string
  -4 1.05 3e-1      - implicit untyped number
  true false        - implicit typed boolean

Syntax examples
  - Primitives: "string" 1.05 true (@int 3) (@float 1.05) (@bytes "MTIz")
  - Named tuples: (Document title: "just another format" Document)
  - Unnamed tuples: (title: "just another format")

Parsing syntax
  start           ::= tuple
  tuple           ::= unnamed<tuple-body> | named<name,tuple-body> | primitive
  tuple-body      ::= {pair | tuple, pad}
  unnamed<body>   ::= '(' body ')'
  named<tag,body> ::= '(' tag (pad body)? (pad tag)? ')'
  name            ::= [a-zA-Z][-A-Za-z0-9]*
  pair            ::= keyvalue-pair | combine-rule | substitute-rule | assert-rule
  keyvalue-pair   ::= name ':' pad tuple
  combine-rule    ::= tuple pad '=>' pad tuple
  substitute-rule ::= tuple pad '->' pad tuple
  assert-rule     ::= tuple pad '==' pad tuple
  primitive       ::= tuples | string | text | bytes | int | float | bool
  tuples          ::= named<'@tuples',tuple-body>
  bytes           ::= named<'@bytes',base64-body>
  base64-body     ::= '"' [a-zA-Z0-9+/=]* '"'
  int             ::= named<'@int',int-body>
  int-body        ::=
  float           ::= named<'@float',float-body>
  float-body      ::=
  bool            ::= bool-body | named<'@bool',bool-body>
  bool-body       ::= 'true' | 'false'
  string          ::= '"' string-body '"' | named<'@string',string-body>
  string-body     ::= <any characters until unescaped quote character>
  text            ::= text-begin text-body text-end
  text-begin      ::= '(##'
  text-end        ::= '##)'
  text-body       ::= <any characters until balanced text-end>
  pad             ::= [\s]+

Examples of invalid expressions
  (Data name: "A" name: "B") - duplicate key-value pair
  Data                       - neither a primitive nor a named tuple
  name: "A"                  - pair not enclosed in parentheses
  @tuples                    - primitive not enclosed in parentheses
  '@byte'                    - variable does not contain a valid type

Examples of valid expressions
  (name: "Kyrre")    - unnamed tuple
  "Some characters!" - implicit (@string "Some characters!")
  1.04               - implicit (@string "1.04")
  true               - implicit (@bool true)

##) @tuples)
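To make the grammar above more concrete, here is a minimal parser sketch in Python. It is not a reference implementation and covers only a subset: named and unnamed tuples, key-value pairs, quoted strings, implicit numbers and booleans. Text blocks, rule pairs and variables are left out. The function names (tokenize, parse_expr, parse_tuple, parse), the dict layout used for a tuple, and the handling of backslash escapes are all assumptions made for this sketch.

```python
import re

# Tokens: parentheses, quoted strings, "key:" markers, and bare atoms.
TOKEN = re.compile(r'''\s*(?:
      (?P<open>\()
    | (?P<close>\))
    | (?P<string>"(?:\\.|[^"\\])*")
    | (?P<key>[A-Za-z][-A-Za-z0-9]*:)
    | (?P<atom>[^\s()"]+)
)''', re.VERBOSE)
# Assumption: tags may start with '@' so that (@int 3) parses as a named tuple.
NAME = re.compile(r'@?[A-Za-z][-A-Za-z0-9]*$')
NUMBER = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?$')


def tokenize(text):
    """Split the input into (kind, text) pairs, ending with an 'eof' marker."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    tokens.append(("eof", ""))
    return tokens


def parse_expr(tokens, i):
    """Parse one tuple or primitive starting at index i; return (value, next index)."""
    kind, text = tokens[i]
    if kind == "open":
        return parse_tuple(tokens, i + 1)
    if kind == "string":
        # Assumption: a backslash escapes exactly the next character.
        return re.sub(r'\\(.)', r'\1', text[1:-1]), i + 1
    if kind == "atom":
        if text in ("true", "false"):
            return text == "true", i + 1
        if NUMBER.match(text):
            number = float(text) if any(c in text for c in ".eE") else int(text)
            return number, i + 1
    # Bare names, unenclosed pairs and stray parentheses end up here.
    raise SyntaxError(f"unexpected token {text!r}")


def parse_tuple(tokens, i):
    """Parse the body of a tuple whose opening parenthesis was already consumed."""
    node = {"name": None, "attrs": {}, "items": []}
    kind, text = tokens[i]
    if kind == "atom" and NAME.match(text) and text not in ("true", "false"):
        node["name"] = text
        i += 1
    while tokens[i][0] != "close":
        kind, text = tokens[i]
        if kind == "key":
            key = text[:-1]
            if key in node["attrs"]:
                raise SyntaxError(f"duplicate key {key!r}")
            value, i = parse_expr(tokens, i + 1)
            node["attrs"][key] = value
        elif kind == "atom" and text == node["name"] and tokens[i + 1][0] == "close":
            i += 1  # optional closing tag that repeats the tuple name
        else:
            item, i = parse_expr(tokens, i)
            node["items"].append(item)
    return node, i + 1


def parse(text):
    """Parse exactly one expression and reject anything left over."""
    tokens = tokenize(text)
    value, i = parse_expr(tokens, 0)
    if tokens[i][0] != "eof":
        raise SyntaxError("trailing input after the first expression")
    return value


print(parse('(Document title: "just another format" Document)'))
```

With this sketch, the invalid expressions listed in the specification that fall inside the subset (a duplicate key, a bare name, a pair outside parentheses) are rejected with a SyntaxError.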
Example
Data with redundant information removed:
(@tuples
  (##
    At the program level the @tuples content will be transformed by pattern matching against the rules.
    (@tuples content: (@map content rules))
  ##)
  (##
    Use substitution and combination rules to update missing information.
    This can be used for instancing and (re)naming tuples.
  ##)
  rules: (
    (type: '') => (Node)
    (from: '' to: '') => (Connector)
    'test-data' -> (@bytes "QUI9PQ==")
    'instructions' -> (## some data in text format ##)
  )
  content: (
    name: "Group1"
    type: "Group"
    nodes: (
      (name: "Source" type: "Value" value: 'test-data'
        inputs: ((name: "In" datatype: "bytedata"))
        outputs: ((name: "Out" datatype: "bytedata")))
      (name: "Transform" type: "Process" data: 'instructions'
        inputs: ((name: "In" datatype: "bytedata"))
        outputs: ((name: "Out" datatype: "bytedata")))
      (name: "Target" type: "Value" value: ""
        inputs: ((name: "In" datatype: "bytedata"))
        outputs: ((name: "Out" datatype: "bytedata")))
    )
    connectors: (
      (from: (node: "Source" socket: "Out") to: (node: "Transform" socket: "In"))
      (from: (node: "Transform" socket: "Out") to: (node: "Target" socket: "In"))
    )
  )
@tuples)
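The specification leaves the exact rule semantics to the program level, so the following Python sketch is only one possible reading of the combination and substitution rules used in this example. It assumes that a combination rule such as (type: '') => (Node) gives a name to every unnamed tuple that has the listed keys (the '' wildcard is treated as "any value"), and that a substitution rule replaces a variable by its bound value. The classes Tup and Var and the function transform are hypothetical names introduced for illustration, and only a trimmed-down part of the example data is used.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Tup:
    """In-memory stand-in for a tuple: optional name, keyed attributes, plain items."""
    name: Optional[str] = None
    attrs: dict = field(default_factory=dict)
    items: list = field(default_factory=list)


@dataclass(frozen=True)
class Var:
    """A pattern matching variable such as 'test-data'."""
    name: str


def transform(value, combine_rules, substitutions):
    """Return a transformed copy of value; the input is left untouched."""
    if isinstance(value, Var):
        # Substitution rule: replace a bound variable, keep unknown ones as-is.
        return substitutions.get(value.name, value)
    if isinstance(value, Tup):
        attrs = {key: transform(item, combine_rules, substitutions)
                 for key, item in value.attrs.items()}
        items = [transform(item, combine_rules, substitutions)
                 for item in value.items]
        name = value.name
        if name is None:
            # Combination rule (assumed reading): the first rule whose keys
            # are all present gives the unnamed tuple its name.
            for required_keys, new_name in combine_rules:
                if required_keys.issubset(attrs):
                    name = new_name
                    break
        return Tup(name, attrs, items)
    return value  # primitives pass through unchanged


# (type: '') => (Node) and (from: '' to: '') => (Connector)
combine_rules = [
    (frozenset({"type"}), "Node"),
    (frozenset({"from", "to"}), "Connector"),
]
# 'test-data' -> (@bytes "QUI9PQ==")
substitutions = {"test-data": Tup("@bytes", items=["QUI9PQ=="])}

content = Tup(attrs={
    "name": "Group1",
    "type": "Group",
    "nodes": Tup(items=[
        Tup(attrs={"name": "Source", "type": "Value", "value": Var("test-data")}),
    ]),
    "connectors": Tup(items=[
        Tup(attrs={
            "from": Tup(attrs={"node": "Source", "socket": "Out"}),
            "to": Tup(attrs={"node": "Transform", "socket": "In"}),
        }),
    ]),
})

result = transform(content, combine_rules, substitutions)
print(result.name)
print(result.attrs["nodes"].items[0].attrs["value"])
print(result.attrs["connectors"].items[0].name)
```

Matching only on key presence keeps the sketch short; a fuller version would also compare pattern values and support the == assertion rule.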
The resulting data after the content has been transformed by the rules:
(@tuples
  content: (Node
    name: "Group1"
    type: "Group"
    nodes: (
      (Node name: "Source" type: "Value" value: (@bytes "QUI9PQ==")
        inputs: ((Socket name: "In" datatype: "bytedata"))
        outputs: ((Socket name: "Out" datatype: "bytedata")))
      (Node name: "Transform" type: "Process" data: (## some data in text format ##)
        inputs: ((Socket name: "In" datatype: "bytedata"))
        outputs: ((Socket name: "Out" datatype: "bytedata")))
      (Node name: "Target" type: "Value" value: "output"
        inputs: ((Socket name: "In" datatype: "bytedata"))
        outputs: ((Socket name: "Out" datatype: "bytedata")))
    )
    connectors: (
      (Connector from: (node: "Source" socket: "Out") to: (node: "Transform" socket: "In"))
      (Connector from: (node: "Transform" socket: "Out") to: (node: "Target" socket: "In"))
    )
  )
@tuples)
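The @bytes values in these examples are base64 text, so a program reading the data has to decode them before use. A small sketch of that step with Python's standard base64 module follows; the helper names decode_bytes and encode_bytes are made up for this post, and strict validation of the alphabet is an assumption rather than something the specification requires.

```python
import base64


def decode_bytes(payload: str) -> bytes:
    """Decode the quoted body of an @bytes primitive into raw bytes."""
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(payload, validate=True)


def encode_bytes(data: bytes) -> str:
    """Encode raw bytes into the text stored inside an @bytes primitive."""
    return base64.b64encode(data).decode("ascii")


print(decode_bytes("QUI9PQ=="))  # the value used in the example above
print(decode_bytes("MTIz"))      # the value used in the syntax examples
print(encode_bytes(b"123"))
```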
Ideas for extensions
- Chaining together separate files
- Nesting separate files with pattern matching variables - like referencing large bytedata and instantiating tuples
- Use separate files for validation, update and typing of data





