Blog o' Weston: How to Make View-Independent Program Models

Sunday, June 7, 2015

In Part 1 of this two-part series, I argued that the standard way of structuring program authoring tools involves a peculiar and unnecessary model/view coupling that makes a number of problems in the programming tools domain more difficult than they need to be. Here I'll describe a simple, general way of constructing generic 'program models,' which have much in common with ASTs but improve on them in a couple of critical ways. I'll refer to this method as the 'path formulation' of program models.

Why this structure?

Short answer: because it's simple and general, and bears a very close relation to the essential activity in high-level program construction.

The 'path formulation,' a particular way of creating program models, comes from asking the question, "what are we really doing when writing source code?" and finding the answer, "selecting and configuring abstractions provided by a programming language." In that case, writing source code is just one way of doing a more general and essential activity: selecting and configuring abstract 'language constructs.' Here's a short dialogue illustrating the point a bit more.

So, let's say your programming language provides an abstract language construct called 'function_declaration'; it is composed of a few parts: a 'name,' a 'return_type,' and an 'argument_list.' A configuration for this construct would be a particular 'name,' 'return_type,' and 'argument_list.' What makes it 'abstract' is that none of these entities is tied to a representation. In other words, while it has become reflex to think of these sorts of constructs in terms of character sequences, here we intentionally leave open the question of how they should look.
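To make 'not tied to a representation' concrete, here is a minimal sketch in Python. The class and the two render functions are hypothetical names invented for illustration, as are the 'int' return type and the 'count' argument; the point is only that one configuration can be handed to more than one view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionDeclaration:
    """A configured 'function_declaration': values only, no syntax."""
    name: str
    return_type: str
    argument_list: tuple  # of (type, name) pairs


def render_c_style(fn):
    """One possible view: a C-like signature."""
    args = ", ".join(f"{t} {n}" for t, n in fn.argument_list)
    return f"{fn.return_type} {fn.name}({args})"


def render_outline(fn):
    """Another possible view: an indented outline."""
    lines = [f"function {fn.name}", f"  returns: {fn.return_type}"]
    lines += [f"  argument: {n} ({t})" for t, n in fn.argument_list]
    return "\n".join(lines)


# Hypothetical configuration; only the three part names come from the text.
decl = FunctionDeclaration("testFunction", "int", (("int", "count"),))
```

Both `render_c_style(decl)` and `render_outline(decl)` read the same `decl`; neither view is privileged, and swapping one for the other changes nothing about the configuration itself.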

Making Program Models using the Path Formulation

The idea behind the path formulation is: if you have a model of a programming language in the form of a graph, then individual program models are just paths through the graph.

Here's a high-level, three-step recipe for making these language/program models:

(1) Represent your programming language 'abstractly,' using a formal grammar that has no lexical section (now called an 'abstract grammar'): its fundamental units are the fundamental units from your language in the abstract, rather than character sequences. Only consider which abstractions your language should include and the rules for composing them; worry about how to represent them visually elsewhere. For example, we can say that a 'class' is made up of a 'name,' a set of 'variables,' and a set of 'methods,' without any assumptions about how these things are going to look. (I talk about this in the first part, too.)

(2) Convert your abstract grammar into a graph, which will serve as the 'model' for your language. (I have some Java code that will do this for ANTLR grammars, btw, which I can clean up and share on Github if there's interest.)

(3) Represent individual programs as specific paths through the language graph.

So, if your grammar contains a subsection like this ('class_reference' is for inheritance, indicating a parent class):

program
: class+
  interface*
;

class
: name
  class_reference?
  function_declaration*
;

function_declaration
: etc. etc. etc.

Your graph-based language model will have a subsection like this (black nodes are 'language constructs,' orange nodes are 'language atoms'):
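In data form, a sketch of that conversion might look like the following. This is not the ANTLR-based Java code mentioned above; it's a small Python illustration under a few assumptions: the grammar is written by hand as a dictionary, edges are numbered in the order the parts appear in each rule (not necessarily the numbering used in the figures), and any symbol without a rule of its own is treated as an atom.

```python
# The grammar fragment above, written as plain data (hand-transcribed).
# Each rule maps to an ordered list of (part, multiplicity) pairs, where
# multiplicity is "?", "*" or "+" as in the grammar, and "1" is used here
# for a part that carries no marker.
GRAMMAR = {
    "program": [("class", "+"), ("interface", "*")],
    "class": [
        ("name", "1"),
        ("class_reference", "?"),
        ("function_declaration", "*"),
    ],
}


def build_language_graph(grammar):
    """Return {node: [(edge_number, target, multiplicity), ...]}."""
    graph = {}
    for construct, parts in grammar.items():
        graph[construct] = [
            (number, target, mult)
            for number, (target, mult) in enumerate(parts)
        ]
        for target, _ in parts:
            # Symbols with no rule of their own get no outgoing edges.
            graph.setdefault(target, [])
    return graph


def is_atom(graph, node):
    """Treat a node with no outgoing edges as a 'language atom'.

    Caveat: 'interface' and 'function_declaration' are elided in this
    fragment, so they also look like atoms here; a full grammar would
    give them rules and make them constructs.
    """
    return not graph[node]


LANGUAGE_GRAPH = build_language_graph(GRAMMAR)
```

With this layout, `LANGUAGE_GRAPH["class"]` lists the numbered edges leaving the 'class' node, which is the information an editor needs in order to offer next steps.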

And a particular program in your language, consisting of one simple class might look like this:

These paths could be represented by just listing the edges taken—though of course you have to number the edges:

Taking that approach, the model of our simple program looks like this:

(2 2 0 1 3 0 0 3)
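A sketch of how such an edge list could be followed and checked, in Python. The graph and its edge numbering below are made up for illustration (they are not the numbering from the figure, so the path differs from the one above), and the convention that each atom has an edge 0 leading back to its parent construct is an assumption.

```python
# Hypothetical language graph: node -> list of targets, where the list
# index is the edge number. Edge 3 of 'class' returns to 'program'.
LANGUAGE_GRAPH = {
    "program": ["class", "interface"],
    "class": ["name", "class_reference", "function_declaration", "program"],
    "name": ["class"],
    "class_reference": ["class"],
    "function_declaration": ["class"],
    "interface": ["program"],
}


def follow(path, start="program"):
    """Walk the numbered edges in `path`, returning the nodes visited.

    Raises ValueError if the path takes an edge that does not exist,
    which is the sense in which a program model can be 'invalid'.
    """
    node = start
    visited = [node]
    for edge in path:
        targets = LANGUAGE_GRAPH[node]
        if not 0 <= edge < len(targets):
            raise ValueError(f"node {node!r} has no edge {edge}")
        node = targets[edge]
        visited.append(node)
    return visited


# A class with a name and one function, under the numbering above.
program_model = [0, 0, 0, 2, 0, 3]
```

Calling `follow(program_model)` walks program, class, name, class, function_declaration, class, and back to program; a path that asks for a nonexistent edge is rejected instead of being interpreted.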

If we would like a little structure in our model representation, we can distinguish between 'language constructs' and 'language atoms' (black nodes and orange nodes), by using opening and closing parentheses to mark the start and end of language constructs.

(2 (2 0 1 (0... 1... 2... 3) 0 0 3))

Note: the ellipses are where we followed some hypothetical edges in 'function declaration.'

Language constructs are just composite abstractions, made up of more than one part; language atoms cannot be broken into smaller units. Actually, 'language constructs' and 'atoms' have a relationship that mirrors S-expressions in Lisp, so it's no surprise that the notation for program models resembles Lisp (here, however, we aren't tied to using this as the visual interface for the programmer).
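Reading the parenthesized notation back into a nested structure takes only a small parser. The Python sketch below is one way to do it; the function names are hypothetical, and the nested example string is invented rather than taken from the figures.

```python
def parse_model(text):
    """Parse '(1 (0 2) 3)' into nested lists of ints: [1, [0, 2], 3]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])          # a construct starts
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()        # a construct ends
            stack[-1].append(done)
        else:
            stack[-1].append(int(tok))  # an edge number
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    if len(stack[0]) != 1 or not isinstance(stack[0][0], list):
        raise ValueError("expected a single parenthesized model")
    return stack[0][0]


def flatten(model):
    """Drop the construct boundaries, recovering a flat edge list."""
    flat = []
    for item in model:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


# Invented example with two levels of nesting.
nested = parse_model("(1 (0 2 (0 1) 3))")
```

Here `flatten(nested)` gives back the plain edge list, so the parentheses add structure without changing which edges were taken.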

Using Insight from the Language Model

A big advantage of the path formulation of program models is that it maintains a strong connection between elements of a user's program, and the programming language itself (since programs are paths in the language graph). This connection can be taken advantage of by programming tools to guide programmers in using the language.

As an example, let's say a programmer has just selected a 'class' construct (maybe by typing out the keyword 'class'—or in some new way); the program model now contains a node for that 'class' construct, and the editor, just by examining the 'class' node in the language model, knows all the legal options for proceeding, because each option corresponds to an edge going out of the 'class' node:

To make this as concrete as possible, I'll show one possible UI that takes advantage of this connection to the language graph. Keep in mind, though: you could still render it like a traditional text editor, using the UI contemporary IDEs use for 'auto complete' to display alternatives. It's like automatic auto complete for all aspects of the language—at the least, this would be tremendously useful to new users of a language.
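The lookup itself is small. Here is a Python sketch for the 'class' construct, assuming the editor tracks which parts have already been chosen and that the order of parts is not enforced (both assumptions, along with the function and table names). The '?', '*' and '+' markers from the grammar limit how often each edge may be taken; '1' is used here for a part with no marker.

```python
from collections import Counter

# Outgoing edges of 'class', taken from the grammar fragment.
CLASS_EDGES = [
    ("name", "1"),
    ("class_reference", "?"),
    ("function_declaration", "*"),
]

# Maximum number of times an edge may be taken; None means unbounded.
MAX_USES = {"1": 1, "?": 1, "*": None, "+": None}


def legal_options(edges, chosen):
    """Parts the editor may still offer, given the parts chosen so far."""
    counts = Counter(chosen)
    return [
        part
        for part, mult in edges
        if MAX_USES[mult] is None or counts[part] < MAX_USES[mult]
    ]
```

Right after selecting 'class', `legal_options(CLASS_EDGES, [])` offers all three parts; once a name has been given, 'name' drops out of the list, which is the behaviour an auto-complete popup would need.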

In this hypothetical editor, the UI is split into two main sections: the top is our document, which is just a rendering of the program model; and the bottom contains controls for selecting and configuring language constructs.

Let's say we've just instantiated a class definition by supplying all the necessary parameters; it has the name 'InputHandler,' and our editor has 'collapsed' it, so we just see the name and type:

Since our editor is following along in the language graph, it knows we're back in the 'program' node, from which point we can begin specifying either a class or an interface. Let's say we select 'class.' Our editor now looks like this:

Notice the whole 'class' construct and the 'name' section have red borders; this is to indicate that 'class' hasn't been fully instantiated: it still has 'free' parameters that must be bound to something (in this case 'name' must be bound to something). Also notice that in the bottom section, the options appearing in the grid are just the neighbors of the 'class' node in the language graph.
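Deciding which parts get a red border amounts to asking which required parts are still unbound. A minimal Python sketch, with hypothetical names and with '1' standing in for a part that carries no '?', '*' or '+' marker:

```python
# Minimum number of values each multiplicity requires.
MIN_USES = {"1": 1, "?": 0, "*": 0, "+": 1}

CLASS_PARTS = [
    ("name", "1"),
    ("class_reference", "?"),
    ("function_declaration", "*"),
]


def free_parameters(parts, bound):
    """Parts that must still be bound before the construct is complete.

    `bound` maps a part name to the list of values given so far.
    """
    return [
        part
        for part, mult in parts
        if len(bound.get(part, [])) < MIN_USES[mult]
    ]


def is_instantiated(parts, bound):
    """True when no required part is left free."""
    return not free_parameters(parts, bound)
```

For a freshly selected class, `free_parameters(CLASS_PARTS, {})` returns only 'name', matching the red border in the description above; binding a name such as 'InputHandler' clears it.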

I imagine that in using a system like this, the cells in the bottom area would map to keys on your keyboard: this way you could accomplish a task like creating the skeleton of a new method declaration with a single keystroke. Something along these lines would also be much better than text editors for programming with virtual/augmented reality systems and mobile devices. Anyway, this UI is just a quick sketch of one possible approach. The document region could also be something like this (in that video, I'm rendering the AST in a manner identical to how I'm suggesting we render program models).

Conclusion

It's been a long time since we laid down the character sequence and parsing-based architecture of program authoring tools, and contemporary work on programming languages is deeply invested in that established approach. It seems like the program model approach could be an improvement, but who knows what lethal oversights might still be lurking. What's especially needed at this point is a concrete implementation. I'm working on it, slowly, in my free time, but my hope is that others will read the ideas here and, if they prove to be generally interesting after all, expand and solidify them into serious tools that will improve the experience of programming for the upcoming years. If you'd like to hire me to work on something related, I can be contacted at 'westoncb[at google's mail service]' (or even if you just want to talk about it, though the comment section is probably best for that).

----------------------------------------

Appendix:

Identifiers etc.

Let's take a look at the abstract program graph one last time:

Notice that 'class reference' and 'name' are language atoms, but these things need structure of their own, so where does that come from? First, I'll point out that the reason we don't include identifiers in the language definition is that they are mnemonics for humans, not part of the abstract structure of a language; all the language needs is a unique identifier, so we just generate a random one. As for the mnemonic, maybe it should be a string, maybe something else; accordingly, we leave it outside of the language spec. In the program model, we just attach the random ID (discussed earlier, too), which can be associated with some representation specified by the programmer (probably totally without their notice, by just typing in their editor as always). A program model with IDs might look like:

(2 (3 (#49843345) 0 0 (#95728745) 0 1 (... ... ... 3) 4))

There's an external file that maps these IDs to representations (often it's just a string) for editors to use.
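As a sketch of that separation in Python: the JSON layout, the function names, the label 'handleInput', and the pairing of IDs with labels are assumptions made for illustration (the two IDs and the name 'InputHandler' appear in the examples above, but nothing there says which ID belongs to which name). The mapping is loaded from a string here to keep the example self-contained; a real tool would read it from the external file.

```python
import json

# Stand-in for the contents of the external file (hypothetical layout).
NAMES_FILE = '{"49843345": "InputHandler", "95728745": "handleInput"}'


def load_names(text):
    """Parse the ID -> representation mapping."""
    return json.loads(text)


def label_for(names, ident):
    """What an editor would display for an ID.

    Falls back to the raw ID if no representation has been assigned.
    """
    return names.get(ident, f"#{ident}")


def rename(names, ident, new_label):
    """Change what the programmer sees; the program model is untouched."""
    updated = dict(names)
    updated[ident] = new_label
    return updated


names = load_names(NAMES_FILE)
```

Because the program model stores only the ID, a rename is a one-entry change to this mapping and never touches the model or any references to the renamed construct.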

The following section discusses how to handle the 'class reference' node and others like it.

Referencing the Program Model

There is an aspect of programming languages that isn't captured by the 'abstract grammar' that I've described so far. The abstract grammar only allows us to describe 'free' language constructs which, when supplied with specific parameters, are 'instantiated'; program models contain only instantiated language constructs. However, the 'abstract grammar' should describe the full capabilities of the language, and programming languages always contain mechanisms for referencing already instantiated language constructs: e.g., I have instantiated a 'function_declaration,' which had its 'name' parameter bound to the value 'testFunction'; other parts of my program should be able to reference this specific, instantiated 'function_declaration' by using its 'name,' 'testFunction,' as a reference.

To be honest, I'm very curious to hear other people's ideas on how to go about doing this, though I do have an approach that seems like it would work well: extend the notation of our 'abstract grammar' (which is just some variation of BNF at the moment) to express 'queries' on program models: i.e. "select all the nodes from the program model of type 'class'." More concretely, let's say our 'class' construct is defined as follows:

class
: name
  class_reference?
  function_declaration*
  visibility?
;

(The 'class_reference?' component is used to reference a parent class.)

'class_reference' would be defined in our grammar as follows (except using some appropriate notation, not English):

class_reference
: "select nodes on program model of type 'class' in same 'package'"
;

So, in order to instantiate a 'class_reference' the programmer would have to select a node from the program model that meets the criteria in the query. Ideally, the programmer's IDE would parse the query, run it on the program model, and offer up a selection of valid nodes. Present IDEs do this sort of thing of course, but including the necessary information in a unified, abstract language specification would be beneficial.
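Pending a real query notation, the English query above can be approximated in ordinary code. In this Python sketch the `Node` fields, the function name, and the sample nodes are all hypothetical; it only shows the kind of filtering an IDE would run before offering choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A node in the program model (fields are assumptions)."""
    ident: str
    type: str
    package: str


def class_reference_candidates(program_model, referrer):
    """Select nodes of type 'class' in the same package as `referrer`."""
    return [
        node
        for node in program_model
        if node.type == "class" and node.package == referrer.package
    ]


# Invented program model with two packages.
program_model = [
    Node("n1", "class", "ui"),
    Node("n2", "class", "net"),
    Node("n3", "interface", "ui"),
    Node("n4", "class", "ui"),
]

# The class being configured; it is not yet part of the model.
new_class = Node("n5", "class", "ui")
candidates = class_reference_candidates(program_model, new_class)
```

The editor would present `candidates` (here the two 'ui' classes) as the only valid bindings for 'class_reference', so an invalid parent can't be selected in the first place.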

14 comments:

scottnelsonsmith, August 22, 2015 at 3:34 PM
Fascinating appealing ideas. I've programmed for decades (my original background was electronics engineering), but I'm "grokking" what you're saying here.

I wonder if "blowing open" the notion of having to use text editors like vim and emacs to being able to express "programming" in more abstract levels would attract more smart people to programming.

Lastly, it's interesting to see attempts at metaprogramming applied to create "domain specific language" extensions (Ruby pushes this hard) as a substitute for what you're advocating here.

Unknown, August 23, 2015 at 2:06 PM
Very interesting ideas. A large part of the reason that people like the textual representation is that it is incredibly information dense, and universally easy to read and display, so I think it would be hard to move away from that entirely.

It would be cool to apply this as a sort of preprocessing step - generate real code to be parsed by a compiler, but utilize some kind of highly efficient graph editor. Due to the fact that the editor would fully understand the abstract representation of the language, it could offer very high level tools fairly easily.

For example, meta-programming could be accomplished via a generative graph algorithm, without having any intermediate serialization steps. Querying the codebase would be very easy programmatically, and developers could perform very powerful graph matching searches without the horrible pain of regex.

A special purpose editor that could facilitate that kind of thing would be fantastic, but is no small endeavour and isn't going to become usable overnight. Many existing editors could benefit from some of these ideas through plugins and extensions. We humans have gotten pretty good at understanding and manipulating textual representations, so why not just augment that with amazing tooling that makes writing code happen language-construct by language-construct, rather than letter by letter.

First off, you could take the text in the editor, compute its graph representation and find what point in the graph contained the cursor. From there, many very useful developer aids could be implemented. As a basic example, you could expose the corresponding graph traversals as a set of keyboard commands that move the cursor between nodes, i.e. go to parent, cycle through siblings, cycle through descendants.

Going one step further... Every time the cursor moves in the graph, recalculate the set of options that the user has and display them as tab completion options (same idea as the bottom menu in your UI). When the user tab completes something like an empty class, the editor could guide them through all of the parameters of that empty class via keyboard shortcuts. A very similar concept is referred to as "snippets" in Sublime Text land. This could extend to both generating code and referencing code. When the user is at the point in the graph where a reference could be made, you could query the graph for all applicable references and offer them as tab completion options.

I think it would be really interesting to see some of this functionality implemented, at least in a simple case. This option could effectively utilize the highly information dense and easy to transfer format of text, while still offering the tooling of the graph approach. This could significantly speed up the editing process by taking most of the formatting and text creating work out of the hands of the developer. Spelling errors and syntax mistakes could be reduced, while still allowing developers to do the text manipulations themselves if that is faster than the graph approach.

Of course, this sort of behavior is highly dependent on the language being written. For example, some languages have type systems or scope conventions that would allow you to significantly reduce the number of tab completion options for references. Ideally, the plugin/extension itself would be as language agnostic as possible, relying on language grammars and semantics defined in an easily modifiable way.

Brett Douglas Williams, January 30, 2017 at 2:04 PM
The problem is that necessarily only the models that the humans habitually observe and maintain the simplicity of will remain simple. If you have another way of grabbing the code, working naturally with things that are simple from that perspective will introduce what appears like complexities from the textual view. Unless you continually return to the textual view and maintain its simplicity, the textual simplicity will be traded for simplicity in the other view.

A slightly more concrete example: We're often viewing things graphically, so we add a control where you can change a color associated with each function (or whatever). The coloring is natural and useful in the graphical view-- but now if you view the program as text it has noisy notes everywhere about what colors things are.

I think we should experiment with new ways of viewing programs, but I don't think we can have our cake and eat it too, I think that changing how we view programs also changes what those programs are and we have to abandon old views to find new ones.

Weston Beecroft, January 30, 2017 at 5:38 PM
Hey Brett—thanks for reading!

When you say "... I think that changing how we view programs also changes what those programs are and we have to abandon old views to find new ones." —that's definitely true of classical programming language architectures, but the reason for it is exactly what I'm attempting to address with this idea of 'program models'.

The trick is to contain the meaning of programs inside 'program models' which are more abstract than particular visualizations of programs. When I say abstract there, there's a very specific thing I mean: whenever you have two representations of the same thing where one contains less information than the other, that one is more abstract. For example, let's say we have two models of a Person:

Person(name, height, birthdate, weight, voice, eye_color, hometown)

and

Person(name, height)

—the second model is more abstract.

So, what I'm proposing here is an abstract model of computer programs:

ComputerProgram(types, semantics)

When it comes time to visualize a particular program, we just make that model more concrete—we just add information:

ComputerProgram(types, semantics, appearance)

'appearance' is interchangeable; you can keep everything meaningful about how the program operates while replacing the appearance. Additionally, the structures of separate appearance models are totally independent; so, for your example where we add color to each function in one view, that information would disappear completely once you swapped in a text view.

There's more information on this aspect in the first post (this is the second in a two-part series): http://westoncb.blogspot.com/2015/06/how-to-make-view-independent-program.html

Lemme know if I can clarify more—I'm still trying to figure out the best way of explaining this stuff...

Shalabh, September 9, 2019 at 1:49 PM
Interesting stuff! Curious why a program is a 'path' and not just another graph? Specifically, a path isn't sufficient since you need additional information (a value for the class name, for instance) attached to the nodes as you traverse the path. Also, order isn't relevant to the program. Whether you encode a program as a path [class -> name -> class -> function] or another path [class -> function -> class -> name], you're defining the same program - a class with a name and a function.

I really like the idea of the high level 'graph grammar' instead of the typical text grammar. However, I see the model as one graph which defines the grammar and provides the 'schema' for another graph: the program. IOW, I see the program as an instance of the grammar graph (the latter defines what kinds of nodes and edges are possible), where each node in the program graph is an instance of a node in the grammar graph. Serializing either of these graphs is a separate concern.

Shalabh, September 9, 2019 at 11:08 PM
> But consider enforcing the schema: the validity of a path-program can be verified by simply checking if the path is valid in the language graph.

I see. Optimizing a representation for easy/quick validation seems reasonable, though a separate concern from the core idea above. Might be a bit trickier than just a path check (e.g. you don't want a class to have two names.)

One big benefit of something like this seems to be that designing a language doesn't require me to produce a character level syntax specification but rather just make a higher level model grammar. Then a library could just read a serialized form and return an AST for my language - so there's a bunch of design and implementation I don't have to do. Further, the surface syntax becomes automatically skinnable, and is perhaps outside the purview of language design and now moved within the purview of editor design.

From another angle, this idea looks like I'd provide class definitions for my AST nodes (~ the graph grammar) and the editor would automatically provide a UI to build a valid AST, libraries would automatically parse serialized forms, etc. The graph grammar reminds me of PEG and recursive descent parsers, BTW.

> I am curious why you see program models as graphs rather than trees. What kind of relationship would exist that would require a graph to capture?

Well, trees are also graphs so trees are subsumed by that statement :D. Depends on what phases of the program you want to represent here. Consider a function A that calls a function B, defined previously in the same program. Does the 'function call node' to `B()` get parsed into a node that contains the symbol 'B' or into a node that holds a reference to the B function node directly? Typically, parsers that produce ASTs do the former, and the latter binding is then done somewhere deeper in the compiler or runtime after 'name resolution', which basically converts symbols to links. In the end, if we examine the program as run and squint a bit, we are dealing with a graph. It doesn't mean that the initial phase must produce a graph - it could stop at producing a tree. But once we're talking about validation against a model, we may want to ensure that the call to `B()` isn't a call to a nonexistent function - and the name resolution and binding could happen in the editor itself. The issue gets considerably more complex once we have cross-file references in the picture.

Weston Beecroft, September 10, 2019 at 12:41 AM
> Might be a bit trickier than just a path check (e.g. you don't want a class to have two names.)

If I understand correctly, this case would be handled: the language graph includes associating *, +, ? to edges, so it would be an invalid path to choose two names.

> the surface syntax becomes automatically skinnable, and is perhaps outside the purview of language design and now moved within the purview of editor design

Yes, exactly. That is a core goal. Could potentially do it with two layers: 1) the library like you describe 2) a 'sublime-text-like' (or Howl-like ;) minimal/language-agnostic editor (built on the library). The editor could then be scripted to define new editing/insertion modes, in addition to using something like CSS to decorate program renderings.

> and the editor would automatically provide a UI to build a valid AST, libraries would automatically parse serialized forms

Yes to the first, but I didn't understand the second one.

> The graph grammar reminds me of PEG and recursive descent parsers

I'll look into PEG tonight. I'm not familiar.

Oookay—I think I see where you're going with program graphs. That's a really neat idea. A big part of the appeal of all this to me is incorporating into a unified language-source-definition aspects like call graph relationships that are typically in a separate phase from language front-end.

Not sure if you already saw it, but there is an appendix in the article that describes a mechanism I had in mind for capturing these kinds of relationships in the language definition. Title is "Referencing the Program Model".

That said, what you are describing is really complementary to the mechanism I was laying out there. Maybe even a necessary second half. I need to think a bit about the possibilities with program models as graphs...

How optimistic are you that an effective UI can be put together for doing AST construction/editing? My sense is that the common opinion that people "in the know" have about AST editors is that they seem like a nice idea, but when you try to write one you run into insurmountable UX problems and conclude text is ultimate.

Personally I remain very optimistic, and have been working on a concrete design for one approach (and have rough ideas for others)—but it's just a bunch of text in a .txt file right now.

Shalabh, September 11, 2019 at 10:06 AM
> If I understand correctly, this case would be handled: the language graph includes associating *, +, ? to edges, so it would be an invalid path to choose two names.

Yes, you're right. I was just being pedantic about the meaning of 'path' wrt graphs.

> Yes, exactly. That is a core goal. Could potentially do it with two layers: 1) the library like you describe 2) a 'sublime-text-like' (or Howl-like ;) minimal/language-agnostic editor (built on the library). The editor could then be scripted to define new editing/insertion modes, in addition to using something like CSS to decorate program renderings.

What API would this library provide? Most editors assume you're editing a plain text file and the 'helper mechanisms' (e.g. language server) are then overlaid on it. That said, Sublime, VSCode and Howl-like editors seem extensible enough that, in a special mode, they could just use a library to read the 'document' and display it in different ways. How much of the view layout would be specified by the library vs. how much custom built in the editor?

> > and the editor would automatically provide a UI to build a valid AST, libraries would automatically parse serialized forms

> Yes to the first, but I didn't understand the second one.

I just meant that the file format is automatically handled by libraries.

> "Referencing the Program Model".
Ah yes, this does cover cross references between nodes.

> How optimistic are you that an effective UI can be put together for doing AST construction/editing? My sense is that the common opinion that people "in the know" have about AST editors is that they seem like a nice idea, but when you try to write one you run into insurmountable UX problems and conclude text is ultimate.
> Personally I remain very optimistic, and have been working on a concrete design for one approach (and have rough ideas for others)—but it's just a bunch of text in a .txt file right now.

That's great. I'm very optimistic, otherwise I wouldn't be excited about this stuff :D. I do think it's hard to get the fluidity and sense of control and fallback you get when working with text files. Integration with existing tooling (git, text editors, etc.) is another big issue, though unrelated to UI.

Regarding UI specifically, one issue that seems to come up is writing code 'in the small'. Take, e.g., an expression `a + b - c`: doing this with tree editing is very cumbersome. PEGs for this get clunky too (it would be interesting to see how the graph grammar can represent this). Yet there are non-plain-text-file UIs that seem fairly fluid - spreadsheets, outliners etc. And even in text editors, there's a lot of 'non plain text editing' work we do - switching between buffers, inspecting history, outline views, searching and navigating, etc. - which seems fairly fluid.
BTW, if you haven't already, worth checking out https://futureofcoding.org/catalog/ which covers structured editors and such.

Shalabh, September 14, 2019 at 3:46 PM
The reason I was discussing the API is that if it's a 'data API', all the layout logic is in the editor, which requires a custom editor. If some layout is exposed in the API (e.g. 'a visual tree', 'a visual list', etc.) then the editors have to do less (perhaps) - there seem to be important implications of this API, but I'm not sure how to identify them.

If you're looking for a community, there's an interesting and active one growing at futureofcoding.org/slack. A related idea that was recently posted there is http://jtree.treenotation.org/faq.html , which tries to create a minimal syntax that encodes language-agnostic structure and is friendly to existing toolchains.

Re value - it's hard to say, what is valuable depends a lot on goals and motivations.

Re 'editing in the small' - yeah I think bottom up is a good way to describe it. Pratt parser like ideas might be useful here.

Weston Beecroft, September 14, 2019 at 5:25 PM
> futureofcoding.org/slack

Thanks—I think I will join. I'd been thinking about checking it out a while ago but forgot completely.

> http://jtree.treenotation.org/faq.html

That seems very much in line with what I'm seeking 0_0

RE value+goals/motivation: My top level goal is to get structure editing for software dev into the mainstream. (Or to understand how its value is not what it appears to be.)

The primary thing I'm trying to do with this graph grammar/program model idea is see whether its interestingness can be invalidated (trying to balance my ratio of maybe-promising ideas/time). I've actually made much more headway on that in talking with you than I've otherwise been able to in the > 4 years since I wrote it. So thanks for that, I appreciate it.

I'm starting to think now about ways in which program models might be formed in terms of language semantics concepts, or at least type information (I'm guessing my ideas about abstract syntax are just poor approximations to some notions in type theory anyway). So maybe the abstract grammar just ends up being some Agda or Haskell program... I'm curious how source autocomplete might look if it were also always informed by type relations/constraints.

I'm already overcommitted on coding right now, so I'll keep exploring on the conceptual level until I find something blatantly valuable. I'm guessing now that what I was describing in the article is on the level of "might make a nice architecture," but not necessarily game-changing.