Reactive programming

Updated on Dec 11, 2024

Edit

Comment

In computing, reactive programming is an asynchronous programming paradigm oriented around data streams and the propagation of change. This means that it should be possible to express static (e.g. arrays) or dynamic (e.g. event emitters) data streams with ease in the programming languages used, and that the underlying execution model will automatically propagate the changes through the data flow.

For example, in an imperative programming setting, a := b + c would mean that a is being assigned the result of b + c in the instant the expression is evaluated, and later, the values of b and c can be changed with no effect on the value of a . However, in reactive programming, the value of a would be automatically updated whenever the values of b and c change, without the program executing the sentence a := b + c again.

Another example is a hardware description language such as Verilog. In this case, reactive programming allows changes to be modeled as they propagate through a circuit.

Reactive programming has foremost been proposed as a way to simplify the creation of interactive user interfaces, animations in real time systems, but is essentially a general programming paradigm.

For example, in a model–view–controller architecture, reactive programming can allow changes in the underlying model to automatically be reflected in the view, and vice versa.

Definition of Reactive Programming

Quoting Gérard Berry:

It is convenient to distinguish roughly between three kinds of computer programs. Transformational programs compute results from a given set of inputs; typical examples are compilers or numerical computation programs. Interactive programs interact at their own speed with users or with other programs; from a user point of view, a time-sharing system is interactive. Reactive programs also maintain a continuous interaction with their environment, but at a speed which is determined by the environment, not the program itself. Interactive programs work at their own pace and mostly deal with communication, while reactive programs only work in respond to external demands and mostly deal with accurate interrupt handling. Real-time programs are usually reactive. However, there are reactive programs that are not usually considered as being real-time, such as protocols, system drivers, or man-machine interface handlers.

Approaches to Creating Reactive Programming Languages

There are several popular approaches to creating reactive programming languages. Some are dedicated languages that are specific to some domain constraints (such as real-time or embedded computing or hardware description). Some are general-purpose languages that support reactivity. Finally, some are libraries or embedded domain-specific languages that enable reactivity alongside or on top of an existing general-purpose programming language. These different approaches result in trade-offs in the languages; in general, the more restricted the language, the more compilers and analysis tools can inform programmers (e.g., in performing analysis for whether programs can be executed in real time), while trading off general applicability.

Programming Models and Semantics

A variety of models and semantics govern the family of reactive programming. We can loosely split them along the following dimensions:

Synchrony: is the underlying model of time synchronous versus asynchronous?

Determinism: Deterministic versus non-deterministic in both evaluation process and results (the former does not necessarily imply the latter)

Update process: callbacks versus dataflow versus actors

Essence of Implementations

The runtime of reactive programming languages usually relies on a graph that captures the dependencies among the reactive values. In the graph, nodes represent computations and edges model dependency relationships. The language runtime uses the graph to keep track of which computations must be executed again when one of the inputs changes.

Change Propagation Algorithms

There are numerous implementation techniques used by reactive programming systems that represent the data flow graph explicitly. The most common algorithms are:

pull

push

hybrid push-pull

What to push?

At the implementation level, reacting to an event consists of propagating across the graph the information that a change has happened. As a consequence, computations that are affected by the change and may be outdated are re-executed. These computations are usually in the transitive closure of the changed source. Change propagation may lead to an update of the sinks of the graph.

The information propagated in the graph can consist of the complete state of a node, i.e., the result of the computation of that node. In this case the previous output of the node is ignored. Another option is that changes are propagated incrementally. In this case, the information propagated along edges consists only of a delta that describes how the previous node has changed. The latter approach is especially important when nodes hold a large amount of state which would be expensive to recompute from scratch. Propagating deltas is essentially an optimization and has been extensively studied in incremental computing. This approach requires a solution to the view-update problem, which is well-known from databases maintaining views of changing data. Another common optimization is to accumulate changes and propagate a batch of them instead of a single one. This solution can be faster because it reduces communication among nodes and optimization strategies can reason about a batch of changes - for example two changes in the batch can cancel each other and can be simply ignored. Finally, it is also possible to propagate notifications of invalidity, causing nodes with invalid inputs to pull updates in order to update their own outputs.

There are two principal ways in which the dependency graph is built:

The graph of dependencies is maintained implicitly by an event loop. In this case, the registration of explicit callbacks creates implicit dependencies. This means that the inversion of control induced by callbacks is left in place; however, by making the callbacks functional (returning a state value instead of a unit value) callbacks become compositional.
The graph of dependencies is program-specific and given by the programmer. This approach enables addressing the inversion of control of callbacks in two ways: either the graph is specified explicitly (typically using a DSL which may be embedded), or the graph is implicitly defined by expressions and generated by "the language".

Glitches

When propagating changes, it is possible to pick propagation orders such that the value of an expression is not a natural consequence of the source program. We can illustrate this easily with an example. Suppose seconds is a reactive value that changes every second to represent the current time (in seconds). Consider this expression:

t = seconds + 1g = (t > seconds)

Because t should always be greater than seconds, this expression should always evaluate to a true value. Unfortunately, this can depend on the order of evaluation. When seconds changes, two expressions have to update: seconds + 1 and the conditional. If the first evaluates before the second, then this invariant will hold. If, however, the conditional updates first, using the old value of t and the new value of seconds, then the expression will evaluate to a false value. This is called a glitch.

Some reactive languages are glitch-free, and prove this property. This is usually achieved by topologically sorting expressions and updating values in topological order. This can, however, have performance implications, such as delaying the delivery of values (due to the order of propagation). In some cases, therefore, reactive languages permit glitches, and developers must be aware of the possibility that values may temporarily fail to correspond to the program source, and that some expressions may evaluate multiple times (for instance, t > seconds may evaluate twice: once when the new value of seconds arrives, and once more when t updates).

Cyclic Dependencies

Topological sorting of dependencies depends on the dependency graph being a directed acyclic graph (DAG). In practice, a program may define a dependency graph that has cycles. Usually, reactive programming languages expect such cycles to be "broken" by placing some element along a "back edge" to permit reactive updating to terminate. Typically, languages provide an operator like delay that is used by the update mechanism for this purpose, since a delay implies that what follows must be evaluated in the "next time step" (allowing the current evaluation to terminate).

Interaction with Mutable State

Reactive languages typically assume that their expressions are purely functional. This allows an update mechanism to choose different orders in which to perform updates, and leave the specific order unspecified (thereby enabling optimizations). When a reactive language is embedded in a programming language with state, however, it may be possible for programmers to perform mutable operations. How to make this interaction smooth remains an open problem.

In some cases, it is possible to have principled partial solutions. Two such solutions include:

A language might offer a notion of "mutable cell". A mutable cell is one that the reactive update system is aware of, so that changes made to the cell propagate to the rest of the reactive program. This enables the non-reactive part of the program to perform a traditional mutation while enabling reactive code to be aware of and respond to this update, thus maintaining the consistency of the relationship between values in the program. An example of a reactive language that provides such a cell is FrTime.

Properly encapsulated object-oriented libraries offer an encapsulated notion of state. In principle, it is therefore possible for such a library to interact smoothly with the reactive portion of a language. For instance, callbacks can be installed in the getters of the object-oriented library to notify the reactive update engine about state changes, and changes in the reactive component can be pushed to the object-oriented library through getters. FrTime employs such a strategy.

Dynamic Updating of the Graph of Dependencies

In some reactive languages, the graph of dependencies is static, i.e., the graph is fixed throughout the program's execution. In other languages, the graph can be "dynamic", i.e., it can change as the program executes. For a simple example, consider this illustrative example (where seconds is a reactive value):

t = if ((seconds mod 2) == 0): seconds + 1 else: seconds - 1 endt + 1

Every second, the value of this expression changes to a different reactive expression, which t + 1 then depends on. Therefore, the graph of dependencies updates every second.

Permitting dynamic updating of dependencies provides significant expressive power (for instance, dynamic dependencies routinely occur in graphical user interface (GUI) programs). However, the reactive update engine must decide whether to reconstruct expressions each time, or to keep an expression's node constructed but inactive; in the latter case, ensure that they do not participate in the computation when they are not supposed to be active.

Degrees of explicitness

Reactive programming languages can range from very explicit ones where data flows are set up by using arrows, to implicit where the data flows are derived from language constructs that look similar to those of imperative or functional programming. For example, in implicitly lifted functional reactive programming (FRP) a function call might implicitly cause a node in a data flow graph to be constructed. Reactive programming libraries for dynamic languages (such as the Lisp "Cells" and Python "Trellis" libraries) can construct a dependency graph from runtime analysis of the values read during a function's execution, allowing data flow specifications to be both implicit and dynamic.

Sometimes the term reactive programming refers to the architectural level of software engineering, where individual nodes in the data flow graph are ordinary programs that communicate with each other.

Static or Dynamic

Reactive programming can be purely static where the data flows are set up statically, or be dynamic where the data flows can change during the execution of a program.

The use of data switches in the data flow graph could to some extent make a static data flow graph appear as dynamic, and blur the distinction slightly. True dynamic reactive programming however could use imperative programming to reconstruct the data flow graph.

Higher-order reactive programming

Reactive programming could be said to be of higher order if it supports the idea that data flows could be used to construct other data flows. That is, the resulting value out of a data flow is another data flow graph that is executed using the same evaluation model as the first.

Data flow differentiation

Ideally all data changes are propagated instantly, but this cannot be assured in practice. Instead it might be necessary to give different parts of the data flow graph different evaluation priorities. This can be called differentiated reactive programming.

For example, in a word processor the marking of spelling errors need not be totally in sync with the inserting of characters. Here differentiated reactive programming could potentially be used to give the spell checker lower priority, allowing it to be delayed while keeping other data-flows instantaneous.

However, such differentiation introduces additional design complexity. For example, deciding how to define the different data flow areas, and how to handle event passing between different data flow areas.

Evaluation models of reactive programming

Evaluation of reactive programs is not necessarily based on how stack based programming languages are evaluated. Instead, when some data is changed, the change is propagated to all data that is derived partially or completely from the data that was changed. This change propagation could be achieved in a number of ways, where perhaps the most natural way is an invalidate/lazy-revalidate scheme.

It could be problematic simply to naively propagate a change using a stack, because of potential exponential update complexity if the data structure has a certain shape. One such shape can be described as "repeated diamonds shape", and has the following structure: A_n→B_n→A_n+1, A_n→C_n→A_n+1, where n=1,2... This problem could be overcome by propagating invalidation only when some data is not already invalidated, and later re-validate the data when needed using lazy evaluation.

One inherent problem for reactive programming is that most computations that would be evaluated and forgotten in a normal programming language, needs to be represented in the memory as data-structures. This could potentially make RP highly memory consuming. However, research on what is called lowering could potentially overcome this problem.

On the other side, reactive programming is a form of what could be described as "explicit parallelism", and could therefore be beneficial for utilizing the power of parallel hardware.

Similarities with observer pattern

Reactive programming has principal similarities with the observer pattern commonly used in object-oriented programming. However, integrating the data flow concepts into the programming language would make it easier to express them and could therefore increase the granularity of the data flow graph. For example, the observer pattern commonly describes data-flows between whole objects/classes, whereas object-oriented reactive programming could target the members of objects/classes.

The stack-based evaluation model of common object orientation is also not entirely suitable for data-flow propagation, as occurrences of "tree feedback edges" in the data structures could make the program face exponential complexities. But because of its relatively limited use and low granularity, this is rarely a problem for the observer pattern in practice.

Imperative

It is possible to fuse reactive programming with ordinary imperative programming. In such a paradigm, imperative programs operate upon reactive data structures. Such a set-up is analogous to constraint imperative programming; however, while constraint imperative programming manages bidirectional constraints, reactive imperative programming manages one-way dataflow constraints.

Object-oriented

Object-oriented reactive programming (OORP) is a combination of object oriented programming and reactive programming. Perhaps the most natural way to make such a combination is as follows: Instead of methods and fields, objects have reactions that automatically re-evaluate when the other reactions they depend on have been modified.

Below is an illustration of the A=X+Y introductory example using JavaScript and jQuery:

If an OORP programming language maintains its imperative methods, it would also fall under the category of imperative reactive programming.

Functional

Functional reactive programming (FRP) is a programming paradigm for reactive programming on functional programming.

Spreadsheets

A modern spreadsheet is often cited as an example of reactive programming. This is problematic because the unqualified term "spreadsheet" may refer to either:

The underlying collection of cells, where each cell contains either a literal value or a formula that refers to other cells such as "=B1+C1". This table of cells is effectively a computer program that determines how a set of output cells are computed from a set of input cells. This may be saved to a file, and this is often referred to as a "spreadsheet" (e.g. "the budget spreadsheet").
The interactive application program with a graphical user interface that is used to edit and evaluate the underlying table of cells from (1). In virtually all spreadsheet applications, interactively changing any one cell on the sheet will result in immediately re-evaluating all formulas that directly or indirectly depend on that cell and updating the display to reflect these re-evaluations.

Confusion arises because the spreadsheet application (2) is an example of a reactive program, while the program effectively defined by the underlying spreadsheet (1) is typically not itself a reactive program. Semantically, the underlying spreadsheet (1) simply determines a calculation from a set of input cells to a set of output cells, and thus could be directly translated to a simple transformational calculation (i.e. function) in a traditional programming language.

References

Reactive programming Wikipedia

(Text) CC BY-SA

Contents